Deploy Machine Learning Projects in Production with Open Standard Models
Use PMML, PFA, or ONNX to Make Your Models More Manageable and Tool/Language Independent
As practitioners adopt more open source tools for machine learning, businesses struggle to deploy the resulting models in a uniform way that doesn’t tie them to specific technologies. For example, raw Apache Spark models work only in Spark, which is not an ideal technology for deployment. TensorFlow Serving, part of TensorFlow’s TFX platform, serves only TensorFlow models and won’t help with your R models. And scikit-learn models don’t really have a “format” at all: you have to serialize the model pipeline itself as a pickled Python object.
Aside from leading to complex dependencies and hard-to-maintain heterogeneous environments, this pattern often violates separation of concerns. Just as email and the web let us deploy content anywhere (and search it, version it, diff/compare, etc.) without requiring us to deploy our specific operating systems or word processors, we would like open, industry-standard formats for ML models (and feature pipelines) that represent just the essential logic and math.
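To make the contrast concrete, here is a minimal sketch using only the Python standard library. The `TinyLogistic` class, its coefficients, and the JSON layout are hypothetical illustrations; the JSON is not PMML, PFA, or ONNX. The pickled bytes refer to the defining class by name, so they load only where that Python code is available, while the plain document carries just the math and can be scored by independent code.

```python
import json
import math
import pickle


class TinyLogistic:
    """Hypothetical stand-in for a trained model object."""

    def __init__(self, intercept, coefficients):
        self.intercept = intercept
        self.coefficients = coefficients  # feature name -> weight

    def predict_proba(self, row):
        z = self.intercept + sum(w * row[name] for name, w in self.coefficients.items())
        return 1.0 / (1.0 + math.exp(-z))


model = TinyLogistic(intercept=-1.0, coefficients={"age": 0.02, "balance": 0.001})

# Approach 1: pickle the object. The payload names the defining class,
# so whoever loads it needs the same Python code available.
pickled = pickle.dumps(model)

# Approach 2: export only the essential logic and math as plain data.
neutral_doc = json.dumps(
    {
        "type": "logistic_regression",
        "intercept": model.intercept,
        "coefficients": model.coefficients,
    },
    sort_keys=True,
)


def score_from_document(doc_text, row):
    """Score a row from the document alone, without touching TinyLogistic."""
    doc = json.loads(doc_text)
    z = doc["intercept"] + sum(w * row[name] for name, w in doc["coefficients"].items())
    return 1.0 / (1.0 + math.exp(-z))


row = {"age": 40, "balance": 500.0}
print(b"TinyLogistic" in pickled)            # the pickle depends on the class
print(score_from_document(neutral_doc, row))  # the document does not
```

The open standards covered in this course play the role of `neutral_doc`, with far richer and formally specified vocabularies.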
In this course, we will review PMML, PFA, and ONNX, which represent industry collaborations to create open, standard language-and-platform-neutral models that can be produced by many tools and deployed in many other tools. They also contribute to manageability by allowing easy versioning, diffing, and updating without complex and expensive-to-maintain dependencies.
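For the text-based formats (PMML is XML and PFA is JSON), ordinary text tooling is enough to compare model versions. The following is a minimal sketch, assuming two hypothetical versions of a small JSON model document; the field names are illustrative and are not taken from any of the standards.

```python
import difflib
import json

# Two hypothetical versions of the same model; only the x1 weight changed.
v1 = {"type": "linear_regression", "intercept": 1.5,
      "coefficients": {"x1": 0.40, "x2": -2.0}}
v2 = {"type": "linear_regression", "intercept": 1.5,
      "coefficients": {"x1": 0.45, "x2": -2.0}}


def canonical_lines(model_doc):
    """Serialize with sorted keys and fixed indentation so diffs are stable."""
    return json.dumps(model_doc, indent=2, sort_keys=True).splitlines()


diff = list(difflib.unified_diff(
    canonical_lines(v1),
    canonical_lines(v2),
    fromfile="model_v1.json",
    tofile="model_v2.json",
    lineterm="",
))
print("\n".join(diff))

# Keep only the added/removed lines, skipping the two file header lines.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
```

Serializing canonically before diffing is a design choice: it keeps key ordering or whitespace noise from hiding the one coefficient that actually changed.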
What you'll learn and how you can apply it
By the end of this live, hands-on, online course, you’ll understand:
- The difficulties in productionizing multiple proprietary ML model “flavors”
- Tools and options for exporting models and feature pipelines as PMML, PFA, or ONNX
- How to deploy models in platform-agnostic standard runtimes
And you’ll be able to:
- Evaluate the benefits of the different open standard formats for your projects
- Select an appropriate open standard format and open-source runtime environment for your deployment
- Create and run a service that performs inference (makes predictions) using these industry-standard models
This training course is for you because...
- You are a data scientist, data engineer, or MLOps engineer
- You work with ML models that need to be put into production and managed
- You want to become an engineer, architect, or leader who can deploy and operate ML models in production
Prerequisites
- Familiarity with basic ideas of machine learning, such as continuous vs. categorical variables, and basic models like linear and logistic regression
- Basic familiarity with at least one tool for training ML models and making predictions
- Familiarity with principles of operations, especially in enterprise settings, is helpful but not required
Practitioners need to learn to separate the wheat from the chaff on their specific platforms by sorting through platform documentation on their own. Many open platforms address just parts of the challenge, including Kubeflow, MLflow (Databricks/open source), Seldon Core, and cloud-specific offerings such as AWS SageMaker and Azure MLOps.
- Hidden costs and gotchas from deploying development ML code (instead of a proper model artefact) into production
- Risks of bundling complex software stacks into a Docker image in lieu of separating content (models) from the systems which produce it
- Enterprise risks of maintaining heterogeneous, opaque, and proprietary model formats
About your instructor
Adam Breindel consults and teaches widely on Apache Spark and other technologies. Adam's experience includes work with banks on neural-net fraud detection, streaming analytics, cluster management code, and web apps, as well as development at a variety of startup and established companies in the travel, productivity, and entertainment industries. He is excited by the way that Spark and other modern big-data tech remove so many old obstacles to system design and make it possible to explore new categories of interesting, fun, hard problems.
The timeframes are only estimates and may vary according to how the class is progressing.
Understanding the Model Deployment Challenge (30 minutes)
- Related goals and challenges
- Breaking the Dependency Between Model Building and Model Deployment
- Exercise: What can go wrong with model deployments today?
Common, But (Sometimes) Less Desirable Approaches (20 minutes)
- Amalgamation and Single-Product Model Formats
- Exercise: Deploy a model with TensorFlow Serving
- 5 min break
An Open Standard Approach: Predictive Model Markup Language (35 minutes)
- PMML origin
- Which products create and/or consume PMML
- Exercise: Create a PMML model and inspect it
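As a preview of what "create and inspect" can look like, here is a minimal sketch that hand-writes a small PMML regression document with Python's standard library and then reads the coefficients back. The model, field names, and values are hypothetical, and the document has not been validated against the PMML schema. In practice a converter or the training tool itself would produce the file, and a conformant scoring engine would evaluate it. The `score_linear` helper handles only plain linear terms.

```python
import xml.etree.ElementTree as ET

PMML_NS = "http://www.dmg.org/PMML-4_4"
ET.register_namespace("", PMML_NS)


def q(tag):
    """Qualify a tag name with the PMML namespace."""
    return f"{{{PMML_NS}}}{tag}"


def build_pmml():
    """Build a tiny linear regression, y = 1.5 + 0.4*x1 - 2.0*x2, as PMML text."""
    root = ET.Element(q("PMML"), {"version": "4.4"})
    ET.SubElement(root, q("Header"), {"description": "Hand-written illustrative model"})

    data_dict = ET.SubElement(root, q("DataDictionary"), {"numberOfFields": "3"})
    for name in ("x1", "x2", "y"):
        ET.SubElement(data_dict, q("DataField"),
                      {"name": name, "optype": "continuous", "dataType": "double"})

    model = ET.SubElement(root, q("RegressionModel"), {"functionName": "regression"})
    schema = ET.SubElement(model, q("MiningSchema"))
    ET.SubElement(schema, q("MiningField"), {"name": "x1"})
    ET.SubElement(schema, q("MiningField"), {"name": "x2"})
    ET.SubElement(schema, q("MiningField"), {"name": "y", "usageType": "target"})

    table = ET.SubElement(model, q("RegressionTable"), {"intercept": "1.5"})
    ET.SubElement(table, q("NumericPredictor"), {"name": "x1", "coefficient": "0.4"})
    ET.SubElement(table, q("NumericPredictor"), {"name": "x2", "coefficient": "-2.0"})
    return ET.tostring(root, encoding="unicode")


def score_linear(pmml_text, row):
    """Inspect the RegressionTable and apply it to one input row."""
    root = ET.fromstring(pmml_text)
    table = root.find(f".//{q('RegressionTable')}")
    result = float(table.get("intercept"))
    for predictor in table.findall(q("NumericPredictor")):
        result += float(predictor.get("coefficient")) * row[predictor.get("name")]
    return result


pmml_text = build_pmml()
print(pmml_text)
print(score_linear(pmml_text, {"x1": 2.0, "x2": 0.5}))
```

Everything the model does is visible in the printed XML, which is the property that makes PMML files easy to review and version.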
Newer, Better Approach: Portable Format for Analytics (35 minutes)
- PFA Design Goals
- Where Does It Work?
- Using the PFA Scoring Open-Source Reference Implementations
- Exercise: Create a PFA Scoring Service Using “Hadrian” (the Java scoring implementation)
- 5 min break
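For orientation before working with Hadrian, the sketch below shows the general shape of a PFA document: JSON with `input` and `output` types and an `action` made of expressions, where a function call is written as a one-key object. The toy evaluator is an assumption for illustration only. It supports just the `+` and `*` functions on a numeric input and is not Hadrian or any other reference implementation.

```python
import json

# A small PFA-style document computing 2 * input + 1.
pfa_text = json.dumps({
    "input": "double",
    "output": "double",
    "action": [{"+": [{"*": ["input", 2.0]}, 1.0]}],
})


def evaluate(expr, input_value):
    """Toy evaluator: numeric literals, the `input` symbol, and + / * calls."""
    if isinstance(expr, (int, float)):
        return float(expr)
    if expr == "input":
        return input_value
    if isinstance(expr, dict) and len(expr) == 1:
        (fn, args), = expr.items()
        values = [evaluate(arg, input_value) for arg in args]
        if fn == "+":
            return values[0] + values[1]
        if fn == "*":
            return values[0] * values[1]
    raise ValueError(f"unsupported expression: {expr!r}")


def run_engine(doc_text, input_value):
    """Evaluate each action expression in order; the last value is the output."""
    doc = json.loads(doc_text)
    result = None
    for expr in doc["action"]:
        result = evaluate(expr, input_value)
    return result


print(run_engine(pfa_text, 3.0))
```

A real scoring engine additionally type-checks the document against the declared `input` and `output` schemas and supports the full function library, which is the work the exercise delegates to Hadrian.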
Latest Technology: Open Neural Network Exchange Format (45 minutes)
- Origin of ONNX
- Extending ONNX beyond neural nets to ML
- ONNX Runtimes available today
- Discussion: Creating an ONNX representation of a model
- Exercise: Serving predictions using Python and Microsoft’s open-source onnxruntime