The evolution of technology, much like the evolution of life on our planet, has been characterized by steady progress interspersed with occasional mass extinctions and bursts of innovative new life. We are fortunate to be experiencing one of those evolutionary inflection points. Innovations in data science and AI abound and are already changing the very nature of business and life. With this technological boom come real risks for organizations that do not acquire the survival traits of the new era.
Many executives and analytical professionals lack vision as to where this process is taking us. Their definition of the current era we live in, what we are calling the MLOps era, is constrained. This causes them to entrench technologies and adopt processes (traits, in the evolutionary sense) that will limit their ability to compete in the new marketplace. And what of the next evolutionary era for enterprise AI? Those who arrive first will find themselves atop the food chain. In this post we provide a definition of MLOps and discuss how top analytical enterprises are already evolving beyond our current era. We give insight into what the next era will look like, and, importantly, what kind of organizations will survive and thrive.
As we begin 2021, the data science, ML, and AI industry is in the early days of the MLOps era. This is an exciting evolution built on the innovations of the past. It seeks to solve the last-mile problem of getting more data science products into production, that is, to operationalize ML and AI. It promises modularized and reusable components to that end. It adopts principles from DevOps and software engineering, such as CI/CD methods, modified to fit the needs of data science work. It emphasizes clean data pipelines to support the operationalization process. Importantly, each of these aspects of the MLOps era is transitioning to cloud workloads. The stages of the data science lifecycle emphasized in this era are Validate, Deploy, and Monitor.
This is the prevailing view of the MLOps era, but we are early in this era and have work to do before we transition to the next big thing. In particular, we will see a focus in academia and in industry on rounding out the validation and monitoring aspects of the data science lifecycle.
Validation is currently ahead of monitoring in its progress. Academia has published a large body of research on explainability, ethics, and bias in models. Industry is translating those ideas into tools and processes that leading organizations use today. There is still a lot of work to do, but model validation will be much more mature by the time we leave this era.
Academia is later to the game when it comes to model monitoring. This is partly because much of it is a solved problem, at least in an academic sense. However, we have plenty of unanswered questions about the best way to apply known monitoring principles to models in production. Even when best practices are established, implementation will not be trivial: to be effective in an MLOps sense, monitoring must complete the feedback loop by triggering a retrain or a complete rebuild of the model. Both academia and industry have work to do before model monitoring is on a solid foundation, but they will get there by the end of this era.
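To make the feedback loop concrete, here is a minimal sketch of one way monitoring can feed a retraining decision. It is an illustration under stated assumptions, not a recommended standard: the drift metric (Population Stability Index), the function names `psi` and `needs_retraining`, the bin count, and the 0.2 threshold are all choices made for this example, and both samples are assumed to be non-empty.

```python
import math
from typing import Sequence


def psi(baseline: Sequence[float], live: Sequence[float],
        bins: int = 10, eps: float = 1e-6) -> float:
    """Population Stability Index between a training-time sample and a live sample.

    Bin edges come from the baseline range; live values outside that range
    are clamped into the first or last bin.
    """
    lo, hi = min(baseline), max(baseline)
    width = (hi - lo) / bins or 1.0  # fall back to 1.0 if the baseline is constant

    def fractions(values: Sequence[float]) -> list:
        counts = [0] * bins
        for v in values:
            idx = int((v - lo) / width)
            counts[min(max(idx, 0), bins - 1)] += 1
        # Floor each share at eps so the logarithm below is always defined.
        return [max(c / len(values), eps) for c in counts]

    base_frac = fractions(baseline)
    live_frac = fractions(live)
    return sum((lv - bv) * math.log(lv / bv)
               for bv, lv in zip(base_frac, live_frac))


def needs_retraining(baseline: Sequence[float], live: Sequence[float],
                     threshold: float = 0.2) -> bool:
    """Flag a model for retraining when input drift exceeds the threshold.

    The 0.2 cutoff is an assumed rule of thumb; tune it per feature and model.
    """
    return psi(baseline, live) > threshold


if __name__ == "__main__":
    training_values = [i / 100 for i in range(1000)]
    production_values = [v + 5 for v in training_values]  # simulated shift
    if needs_retraining(training_values, production_values):
        print("Drift detected: schedule a retrain or rebuild.")
```

In practice the `print` call would be replaced by whatever kicks off a retraining pipeline, and that hand-off is where most of the implementation effort described above tends to go.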
For all the energy and talk around it, MLOps is still about the “Production” half of the lifecycle. It requires us to figure out complicated APIs and stitch together services and technologies. MLOps is about the how. The next, as yet unnamed, era will apply some of the same principles to the “R&D” half of the lifecycle, and it will be characterized by efficiencies of scale.
As one forward-looking data science leader who is already building for this future era put it,
“Ten years ago, data was our competitive advantage. Then it was our models. Today it is our process.”
With the tools of the MLOps era in place, the next era will be about process. That doesn’t mean technology will not play a part. Innovations will help leaders and teams operationalize the rest of the data science lifecycle. We’re not talking about auto-ML or an easy button for data science. It is about standardization. Operationalizing the Ideate, Analyze, and Develop stages of the lifecycle means providing tools to capture and share institutional analytical knowledge. It means providing a way to automatically track research and reproduce it with the click of a button. It depends heavily on data science portfolio management and establishing a hierarchy of needs. It will see a focus on data science project management. It will be characterized by holistic asset management, from models to datasets to images and all things in between. It involves tracking the business value of data products. In short, it’s about emphasizing the science in data science: providing structure so teams can operate like a group of collaborative research scientists.
Based on this vision of what MLOps will look like by the end of the current era and what the next era will bring, we see three strong traits organizations and companies must develop in order to survive and thrive in the next era. Some are already developing them.
Lifecycle management will be the end state of organizations that effectively adapt during our current era, the MLOps era. It leverages the principles and tools of MLOps as a means of optimizing the process of getting models into production. It sets an organization up to extend those principles to research teams across the entire lifecycle.
Knowledge management relies on a well-defined system-of-record strategy for data science work. This will be made possible by technology (data science platforms, to be specific) but will also require leadership to standardize work without stifling creativity. Knowledge management systems will be the spark of inspiration for analytical breakthroughs. They will provide a compounding effect in the value created by data science teams.
Lastly, those who adopt an effective portfolio management strategy, with the tools to support it, will finally realize their hierarchy of needs, in contrast to today’s bottom-up approach to data science project work. All leaders, from team leads to the C-suite, will have visibility into data science projects and research. Tracking business value will become a reality. The analytical engine of organizations will finally begin to fire on all cylinders.
Organizations should embrace the current MLOps era while simultaneously laying the foundation for the next. Embrace the operationalization mindset of today. Build the pipelines. Invest in the right talent. At the same time, begin to experiment with standardization of analytical research. Begin to think about and test knowledge management principles and tools. Be cognizant of your analytical portfolio, where it came from, and how to manage it. Read about how to structure your organizations for success based on the latest research from top analytical leaders. Taking these steps now will be the key to winning in the coming decade of analytical evolution.