More than 2,000 teams compete in 2018 Data Science Bowl.
When it comes to developing drugs to fight disease, a billion dollars doesn’t buy what it used to.
Not so long ago, a fortune that size might have covered the cost of creating 30 new medicines. Today, it won’t even buy you one.
This year’s Data Science Bowl could help cure what ails drug discovery. In the world’s largest AI competition for social good, more than 2,000 teams are vying to reduce the soaring costs and testing times for new drugs.
Their challenge is to use deep learning to make a crucial step in the drug-discovery pipeline, identifying the nucleus of each cell, faster and more accurate.
“The 2018 Data Science Bowl is driven by a very real need to develop new treatments faster and more accurately,” said Anne Carpenter, director of the imaging platform at the Broad Institute of MIT and Harvard, the nonprofit partner for the competition.
Intrigued? It’s not too late to get in on the action: The entry deadline is April 9. Final submissions are due on April 16.
$100,000 in Cash and an NVIDIA Deep Learning Supercomputer
The fourth annual Data Science Bowl calls on participants from around the world to train deep learning models to examine cell images and identify nuclei.
“If we can improve that process, new cures and new treatments could come out at a much faster pace,” said Ray Hensberger, director of data solutions and machine intelligence at the consulting firm Booz Allen Hamilton.
Booz Allen and the Kaggle platform for data science competitions are co-presenting the contest, with additional sponsorship from NVIDIA, the medical diagnostics company PerkinElmer and others. In addition to potentially advancing drug discovery, top teams will split $170,000 in cash and prizes, including an NVIDIA DGX Station personal AI supercomputer.
The Opposite of Moore’s Law
Finding new drugs is a complex and laborious task that can cost billions and take a decade or more per treatment. Despite improvements in technology, the cost of developing a new drug roughly doubles every nine years, according to an observation known as Eroom’s law (that’s “Moore” spelled backwards).
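To make that doubling concrete, here is a minimal Python sketch. It assumes cost per drug grows as a clean exponential with a nine-year doubling period, which is a simplification of the observation rather than a precise model, and the function names are illustrative only. It also estimates how long a 30-fold rise in cost per drug, like the one described at the top of this story, would take under that assumption.

```python
import math

# Assumption: a steady nine-year doubling period, as in the observation above.
DOUBLING_PERIOD_YEARS = 9.0


def cost_multiplier(years: float) -> float:
    """Factor by which cost per drug grows after `years` of steady doubling."""
    return 2.0 ** (years / DOUBLING_PERIOD_YEARS)


def years_for_multiplier(factor: float) -> float:
    """Years needed for cost per drug to grow by `factor` under the same assumption."""
    return DOUBLING_PERIOD_YEARS * math.log2(factor)


if __name__ == "__main__":
    print(f"After 18 years: {cost_multiplier(18):.1f}x")
    print(f"A 30x rise takes about {years_for_multiplier(30):.0f} years")
```

Under this simplified model, costs quadruple in 18 years, and a 30-fold rise takes roughly 44 years.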
Biochemists try thousands of chemical compounds to figure out which, if any, are effective against a particular virus or bacterium, or which cause a desired reaction in the human body. They do that by measuring how diseased and healthy cells respond to various treatments.
Because nearly all human cells contain a nucleus, the most direct route to identifying each cell is to spot the nucleus, Carpenter said. Today’s image-processing algorithms can find nuclei and measure the disease status of cells, but they work best when nuclei are fairly round and not too crowded, she said.
The algorithms fall short when nuclei have unusual shapes or are crowded together, as often happens in complicated experiments involving tissue samples.
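A toy example can illustrate the crowding problem. The sketch below is not the Broad Institute's software. It is a minimal, assumed stand-in for a classical pipeline: threshold the image, then count connected bright regions. The tiny hand-made grids and the function name `count_nuclei` are hypothetical. Well-separated blobs are counted correctly, but two touching blobs merge into one.

```python
from collections import deque


def count_nuclei(image, threshold=128):
    """Count connected regions (4-connectivity) of pixels at or above `threshold`."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if image[r][c] < threshold or seen[r][c]:
                continue
            # New region found: flood-fill it so it is counted only once.
            count += 1
            seen[r][c] = True
            queue = deque([(r, c)])
            while queue:
                y, x = queue.popleft()
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
    return count


# Two well-separated "nuclei" (bright pixels = 200, background = 0).
separated = [
    [0, 200, 200, 0, 0, 0, 0],
    [0, 200, 200, 0, 0, 200, 200],
    [0, 0, 0, 0, 0, 200, 200],
]

# The same two nuclei pushed together so that they touch.
crowded = [
    [0, 200, 200, 0, 0, 0, 0],
    [0, 200, 200, 200, 200, 0, 0],
    [0, 0, 0, 200, 200, 0, 0],
]

print(count_nuclei(separated))  # 2
print(count_nuclei(crowded))    # 1: the touching nuclei merge into one region
```

A simple rule like this cannot tell where one nucleus ends and the next begins once they touch, which is the kind of case where, as Carpenter notes, today's algorithms struggle.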
“Sometimes biologists have no choice but to personally examine thousands of images to complete their experiments,” Carpenter said.
Deep Learning Drug Discovery
Existing methods also require scientists to repeatedly revise algorithms to fit different types of images and cells. Carpenter wants deep learning software to do that without biologists’ intervention, saving hundreds of thousands of hours a year and opening up faster channels for new discoveries.
She hopes to use a winning algorithm to build deep learning software for drug discovery.
“Deep learning will help find relevant details within an image that match or surpass the power of human observation,” said Kyle Karhohs, a postdoctoral researcher working in Carpenter’s lab. “This could increase the scale and speed of drug discovery and enhance our ability to characterize the biology in images with unprecedented precision and accuracy.”