On October 6, 2020, data scientist Peter Shen from Janssen Pharmaceuticals, a division of Johnson & Johnson, presented “Multi-GPU Machines for Computer Vision-based Deep Learning Models in Histopathology” at the NVIDIA GPU Technology Conference (GTC) for developers. Peter was joined by Katherine Shakman, Domino Data Lab Field Data Scientist. This post provides highlights from their talk, along with a link to the full session recording.
Janssen is the pharmaceutical arm of Johnson & Johnson, a multinational healthcare leader. The company uses computational data science research across immunology, compositional chemistry, and biology to develop new drugs, optimize clinical trials, and automate diagnosis techniques.
Working with Domino and NVIDIA, Janssen has accelerated the training of deep learning models, in some cases making it as much as 10 times faster, to diagnose and characterize cancer cells more quickly and accurately through whole-slide image analysis. This is a crucial step in its effort to deliver precision medicine. Based on early results, Peter anticipates that once deployed in the clinical setting, one model will deliver a fourfold increase in the number of patients who can be screened as positive for eligibility in clinical trials.
Artificial intelligence can transform healthcare, giving researchers new insights to discover novel treatments and deliver precision medicine to patients. But doing so requires the ability to analyze enormous data sets. In his talk, Peter dove into the specifics of how Janssen is using deep learning to analyze whole-slide images of biopsy and surgical specimens (called histopathology images). Each image typically ranges from two to five gigabytes, and most clinical trials generate thousands of these images. Large clinical trials, Peter said, can generate up to 100,000 images. At two to five gigabytes each, that is on the order of 200 to 500 terabytes of image data for a single trial.
By training deep learning models to distinguish differences among patients at a cellular level in these images, researchers can better identify patients who are viable candidates for therapeutic targets and clinical trials, or predict a patient’s potential response to a given therapy.
“If we’re able to deploy this model into a clinic, we’re able to get a 4X increase in the number of patients that we can screen as positive for eligibility in our clinical trials.”
—Peter Shen, Data Scientist, Janssen Pharmaceuticals
To support this work, Janssen built a unified framework for deep learning and distributed training. The framework uses the Domino data science platform to give data scientists self-service access to diverse tools, languages, data sets, and scalable compute, including NVIDIA GPUs, which are critical for training deep learning models on large data sets. In his discussion, Peter shared how Domino is helping the team develop deep learning models more rapidly, in some cases as much as 10 times faster. (Of course, getting these models to production will require strong partnership among data science, IT, and business leaders. Peter joined data science leaders from easyJet and PointRight to discuss challenges and best practices in this area during the webinar “Reaching Across the Aisle.”)
Peter emphasized four benefits of the unified framework in his talk.
“We built a flexible platform that really allows us to iterate through different model training, and do that also in a distributed fashion.”
—Peter Shen, Data Scientist, Janssen Pharmaceuticals
Peter also presented three examples of how this approach is helping accelerate research.
In his role at Janssen Pharmaceuticals, Peter Shen supports the research and development of new pharmaceuticals through data-driven decision-making. Prior to Janssen, Peter was a graduate student researcher at Dana-Farber Cancer Institute, served as a product manager at both Aimsio and Billion Health, and worked as a bioinformatics co-op at both the BC Cancer Agency and the Public Health Agency of Canada.
Katherine Shakman empowers and supports data science teams across a variety of industries. Katie’s background is in health data science and neuroscience, and she believes computational tools will transform the way we interact with our world and each other, particularly in the healthcare and life sciences landscape. She is working to help make that transformation benefit society. In her doctoral research, Katie used neural imaging and behavioral analysis to study interactions between neural circuits regulating attention and memory in insects. She employs her skills in experimental design, problem solving, project management, analytics, machine learning, data visualization, and technical communication to impact the future of technology.
Watch the webinar “Multi-GPU Machines for Computer Vision-based Deep Learning Models in Histopathology” to learn more about the key technical challenges Janssen faced and how the team addressed them.