With roots tracing back more than a century, Bayer has always seen innovation as key to its mission of improving farmers’ harvests to balance the needs of humanity with our planet’s limited resources. Bayer, a world-leading provider of agricultural products, relies on data science at its core, supporting use cases such as maximizing crop yields, improving customer experience, and optimizing supply chain operations. The output of data science is a model. With models built by Bayer’s 500-plus-strong data science community helping to improve more than 100 decisions, the company exemplifies what it means to be model-driven.
Bayer has adopted Domino as part of its “science@scale” data science platform to further enhance visibility and collaboration, accelerating the pace of research across hundreds of simultaneous projects and multiple business units. The platform is making a big impact on the business. Coupled with investments in an enterprise-wide data strategy and digital platforms, Bayer has realized significant cost savings by reducing cost of goods and increasing operational efficiencies.
Bayer has a multi-year research pipeline to develop new products, including seeds that maximize crop productivity and provide protection from insect pests, as well as herbicides that are needed to combat yield-robbing weeds in the field. The process is expensive and time-consuming; there is little margin for error.
“Each year, we have one chance in each hemisphere’s growing season to collect data on the seeds we develop,” explained Naveen Singla, Data Science Center of Excellence lead at Bayer. “We manage an incredible amount of data to help produce high-quality results, but we also know there’s always opportunity to improve how we manage and leverage the data.”
The company applies highly complex models at each stage of the agricultural process, from early breeding to in-field testing, to increase the probability and pace of breakthroughs that will maximize output while conserving environmental resources.
Bayer realized early in its data science journey that managing the development, production, and ongoing improvement of models requires a different approach from the established disciplines of software engineering and data management. Bayer started developing an internal, cloud-based data science platform called “science@scale” to ingest data and provide access to widely used data science tools. While the platform sped up data analysis, the unique characteristics of models required additional collaboration.
Unlike software engineering or data management, building models (like Bayer’s business itself) requires a research-based approach of constant exploration, iteration, and agility. Models are intended to be probabilistic, not deterministic. The nature of data scientists’ work is experimental and collaborative; models must constantly be tracked, retrained, and iterated on to reflect changing data and other factors that lead to model drift.
Bayer had the opportunity to augment and amplify its research-based approach for even greater success across its global data science community.
The landscape of data science tools and technologies (that is, the “ingredients” that go into models) is very heterogeneous and constantly evolving. A data science platform must provide flexibility, agility, and scalability to support a dynamic tooling environment and diverse skill sets and preferences. The ability to quickly retrain, validate, and deploy models was a “must” for Bayer.
The science@scale solution included RStudio, Jupyter, Flask, and other tools, catering to data scientists comfortable with modern software programming paradigms. Domino has also given the broader data science community at Bayer easier access to the big data technology stack, which has had a positive impact on Bayer’s diverse research team while also delivering business value.
“We needed a platform that could abstract away complexities and allow all users to do analysis at scale, utilizing the modern tech stack and getting better insights from data,” said Singla.
Bayer leadership recognized an opportunity to enhance science@scale with Domino. Domino is a purpose-built data science platform that supports diverse tools, automates hardware infrastructure provisioning (so data scientists can run experiments in parallel and at scale), and facilitates rapid iteration and deployment of models. The critical features provided by Domino include:
Bayer’s large data science community works as a cohesive, high-performing team. They build models that both drive agricultural breakthroughs and optimize efficiencies of everyday business operations.
Digital innovations across the company, enabled via a combination of investments in data, platforms, and people, have led the company to realize value and efficiencies in delivering agricultural products to farmers around the world.
“Domino has made it easier for users across the global enterprise, using different tools and with varied backgrounds and skill sets, to work with each other, leverage past work, and collaborate quickly. This ultimately results in more models being delivered and deployed in a shorter window of time, which is empowering Bayer to be a model-driven company that’s at the forefront of farming,” Singla said.