MLOps is a relatively new concept in the AI (Artificial Intelligence) world and stands for “machine learning operations.” It’s about how best to coordinate data scientists and operations people so that models can be developed, deployed and monitored effectively.
“MLOps is the natural progression of DevOps in the context of AI,” said Samir Tout, who is a Professor of Cybersecurity at the Eastern Michigan University’s School of Information Security & Applied Computing (SISAC). “While it leverages DevOps’ focus on security, compliance, and management of IT resources, MLOps’ real emphasis is on the consistent and smooth development of models and their scalability.”
The origins of MLOps go back to a 2015 paper entitled “Hidden Technical Debt in Machine Learning Systems.” Since then, growth has been particularly strong: the market for MLOps solutions is expected to reach $4 billion by 2025.
“Putting ML models in production, operating models, and scaling use cases has been challenging for companies due to technology sprawl and siloing,” said Santiago Giraldo, who is the Senior Product Marketing Manager and Data Engineer at Cloudera. “In fact, 87% of projects don’t get past the experiment phase and therefore, never make it into production.”
Then how can MLOps help? Well, the handling of data is a big part of it.
“Some key best practices are having a reproducible pipeline for data preparation and training, having a centralized experiment tracking system with well-defined metrics, and implementing a model management solution that makes it easy to compare alternative models across various metrics and roll back to an old model if there is a problem in production,” said Matei Zaharia, who is the chief technologist at Databricks. “These tools make it easy for ML teams to understand the performance of new models and catch and repair errors in production.”
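To make those practices concrete, here is a minimal sketch of experiment tracking, model comparison and rollback. It is an illustration only: the `ModelRegistry` and `ModelVersion` classes and their method names are hypothetical, everything is held in memory, and it is not the API of Databricks or any other product.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ModelVersion:
    """One tracked training run: its parameters and the metrics it achieved."""
    version: int
    params: Dict[str, float]
    metrics: Dict[str, float]


@dataclass
class ModelRegistry:
    """Hypothetical in-memory registry; a real one would persist its state."""
    versions: List[ModelVersion] = field(default_factory=list)
    production: Optional[int] = None               # version currently serving
    history: List[int] = field(default_factory=list)  # earlier production versions

    def log_run(self, params: Dict[str, float], metrics: Dict[str, float]) -> ModelVersion:
        """Record a run centrally so it can be compared with the others later."""
        run = ModelVersion(len(self.versions) + 1, dict(params), dict(metrics))
        self.versions.append(run)
        return run

    def best(self, metric: str, higher_is_better: bool = True) -> ModelVersion:
        """Compare all runs that reported the given metric and return the best."""
        candidates = [v for v in self.versions if metric in v.metrics]
        if not candidates:
            raise ValueError(f"no run reported metric {metric!r}")
        chooser = max if higher_is_better else min
        return chooser(candidates, key=lambda v: v.metrics[metric])

    def promote(self, version: int) -> None:
        """Put a version into production, remembering what it replaced."""
        if all(v.version != version for v in self.versions):
            raise ValueError(f"unknown version {version}")
        if self.production is not None:
            self.history.append(self.production)
        self.production = version

    def rollback(self) -> int:
        """Return to the previous production version after a problem."""
        if not self.history:
            raise RuntimeError("no earlier production version to roll back to")
        self.production = self.history.pop()
        return self.production


registry = ModelRegistry()
registry.log_run({"learning_rate": 0.10}, {"auc": 0.81})
registry.log_run({"learning_rate": 0.05}, {"auc": 0.88})
registry.log_run({"learning_rate": 0.01}, {"auc": 0.84})

registry.promote(1)                             # first model goes live
registry.promote(registry.best("auc").version)  # later replaced by the best run
restored = registry.rollback()                  # a production problem: go back
```

The point of the design is that promotion and rollback are cheap, recorded operations, so reverting to an old model does not require retraining anything.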
Something else to consider is that the conditions a model was built for are subject to change. This has certainly been apparent with the COVID-19 pandemic: many AI models have essentially gone haywire because the datasets they were trained on are no longer relevant.
“People often think a given model can be deployed and continue operating forever, but this is not accurate,” said Randy LeBlanc, who is the VP of Customer Success at RapidMiner. “Like a machine, models must be continuously monitored and maintained over time to see how they’re performing and shifting with new data–ensuring that they’re delivering real, ongoing business impact. MLOps also allows for faster intervention when models degrade, meaning greater data security and accuracy, and allows businesses to develop and deploy models at a faster rate. For example, if you discovered an algorithm that will save you a million dollars per month, every month this model isn’t in production or deployment costs you $1 million.”
MLOps also requires rigorous tracking that is based on tangible metrics. Without it, a project can easily go off the rails. “When monitoring models, you want to have standard performance KPIs as well as those that are specific to the business problem,” said Sarah Gates, who is an Analytics Strategist at SAS. “This should be through a central location regardless of where the model is deployed or what language it was written in. That tracking should be automated, so you immediately know, and are alerted, when performance degrades. Performance monitoring should be multifaceted, so you are looking at your models from different perspectives.”
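Automated tracking of this kind can be sketched in a few lines. The example below is a simplified assumption of how such a check might look, not a description of SAS tooling: the `KpiRule` class, the `check_model` function, the metric names and the thresholds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class KpiRule:
    """A threshold on one KPI; direction depends on the metric."""
    name: str
    threshold: float
    higher_is_better: bool = True

    def is_breached(self, value: float) -> bool:
        if self.higher_is_better:
            return value < self.threshold
        return value > self.threshold


def check_model(
    model_id: str,
    observed: Dict[str, float],
    rules: List[KpiRule],
    alert: Callable[[str], None],
) -> List[str]:
    """Compare observed KPIs with the rules and alert on every breach.

    Returns the names of the breached KPIs. Metrics that were not
    reported are skipped here; a stricter setup might alert on them too.
    """
    breached = []
    for rule in rules:
        if rule.name not in observed:
            continue
        value = observed[rule.name]
        if rule.is_breached(value):
            breached.append(rule.name)
            alert(f"{model_id}: {rule.name}={value} breached threshold {rule.threshold}")
    return breached


# One standard KPI, one operational KPI and one business-specific KPI.
rules = [
    KpiRule("accuracy", 0.90),
    KpiRule("latency_ms", 200.0, higher_is_better=False),
    KpiRule("monthly_savings_usd", 1_000_000.0),
]

alerts: List[str] = []  # stand-in for email, chat or paging
observed = {"accuracy": 0.85, "latency_ms": 120.0, "monthly_savings_usd": 1_200_000.0}
breached = check_model("churn-model", observed, rules, alerts.append)
```

Because the check only needs a model identifier and a dictionary of numbers, the same central routine can be used wherever the model is deployed and whatever language it was written in.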
While MLOps tools can be a huge help, there still needs to be discipline within the organization. Success is about more than just technology.
“Monitoring/testing of models requires a clear understanding of the data biases,” said Michael Berthold, who is the CEO and co-founder of KNIME. “Scientific research on event, model change, and drift detection has most of the answers, but they are generally ignored in real life. You need to test on independent data, use challenger models and have frequent recalibration. Most data science toolboxes today totally ignore this aspect and have a very limited view on ‘end-to-end’ data science.”
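Two of the ideas in that quote, drift detection and challenger models, can be illustrated with a short sketch. The choices here are assumptions made for the example rather than Berthold’s or KNIME’s method: drift is measured with the Population Stability Index (one common statistic among many), the challenger is judged on plain accuracy, and the function names and the `min_gain` margin are hypothetical.

```python
import math
from typing import Callable, List, Sequence, Tuple


def population_stability_index(
    expected: Sequence[float],
    actual: Sequence[float],
    bins: int = 10,
    eps: float = 1e-6,
) -> float:
    """PSI between a reference sample and a recent sample of one feature.

    Bins are equal-width over the range of the reference sample; values
    outside that range fall into the first or last bin. Empty bins are
    floored at eps so the logarithm stays defined.
    """
    low, high = min(expected), max(expected)
    width = (high - low) / bins or 1.0  # guard against a constant feature

    def proportions(values: Sequence[float]) -> List[float]:
        counts = [0] * bins
        for value in values:
            index = int((value - low) / width)
            counts[min(max(index, 0), bins - 1)] += 1
        return [max(count / len(values), eps) for count in counts]

    reference = proportions(expected)
    recent = proportions(actual)
    return sum((a - e) * math.log(a / e) for e, a in zip(reference, recent))


def accuracy(predict: Callable[[float], bool], holdout: List[Tuple[float, bool]]) -> float:
    """Share of correct predictions on independent, labeled data."""
    return sum(1 for x, y in holdout if predict(x) == y) / len(holdout)


def pick_winner(
    champion: Callable[[float], bool],
    challenger: Callable[[float], bool],
    holdout: List[Tuple[float, bool]],
    min_gain: float = 0.01,
) -> str:
    """Keep the champion unless the challenger beats it by a clear margin."""
    gain = accuracy(challenger, holdout) - accuracy(champion, holdout)
    return "challenger" if gain >= min_gain else "champion"


# Drift check: the recent data sits in the upper half of the training range.
training_sample = [i / 100 for i in range(100)]
recent_sample = [0.5 + i / 200 for i in range(100)]
drift = population_stability_index(training_sample, recent_sample)

# Challenger test on data neither model was trained on.
holdout = [(float(x), x > 5) for x in range(10)]
winner = pick_winner(lambda x: x > 7, lambda x: x > 5, holdout)
```

A PSI of zero means the two samples fall into the bins in identical proportions, and larger values mean a bigger shift. Where to set the alert threshold is a judgment call that depends on the feature and the business problem.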