In the U.S., cancer will strike two in every five people in their lifetimes. But it affects all of us.
That’s why, in 2015, the office of the Vice President announced the Cancer Moonshot. It’s an audacious effort to make a decade’s worth of progress in cancer prevention, diagnosis, and treatment in just five years.
Beginning today, the 2017 Data Science Bowl will pursue one of the Cancer Moonshot’s key goals: unleashing the power of data against this deadly disease. Presented by Booz Allen and Kaggle, the competition will convene the data science and medical communities to develop cancer detection algorithms, and help end the disease as we know it.
The Lung Cancer Detection Challenge
Lung cancer is one of the most common types of cancer, with nearly 225,000 new cases of the disease expected in the U.S. in 2016.
Early detection is critical, as it opens a range of treatment options not available when cancer is detected at later, more advanced stages. Low-dose computed tomography (CT) is a potential breakthrough technology for early detection, with the ability to reduce lung cancer deaths by 20%.¹ Often, however, suspicious lesions identified in screening are initially assessed as high risk of cancer, but after additional follow-up tests they turn out to be non-cancerous (false positives from the initial screening).² Can machine learning reduce the number of radiology exams flagged for potentially unnecessary follow-up and spare patients needless anxiety?
Using a data set of high-resolution lung scans provided by the National Cancer Institute, participants will develop artificial intelligence algorithms to accurately determine when lesions in the lungs are cancerous. The goal is to dramatically reduce the false positive rate that currently prevents low-dose CT scans from being widely used for lung cancer detection.
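For participants who want to see what "reducing the false positive rate" means in practice, here is a minimal Python sketch. It assumes the common definition of the rate as false positives divided by all non-cancerous cases, which may not match how the figures cited in this announcement were computed. The labels and flags are hypothetical values made up for illustration, not competition data, and `false_positive_rate` is a helper name introduced here.

```python
def false_positive_rate(y_true, y_flagged):
    """Share of non-cancerous cases that were flagged: FP / (FP + TN).

    Assumption: 1 means cancerous (or flagged), 0 means non-cancerous (or not flagged).
    """
    pairs = list(zip(y_true, y_flagged))
    fp = sum(1 for truth, flag in pairs if truth == 0 and flag == 1)
    tn = sum(1 for truth, flag in pairs if truth == 0 and flag == 0)
    if fp + tn == 0:
        raise ValueError("no non-cancerous cases to evaluate")
    return fp / (fp + tn)


# Hypothetical ground truth for ten scans (illustrative only).
y_true = [0, 0, 0, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical flags from an initial screening read and from a candidate model.
screening_flags = [1, 0, 1, 0, 1, 1, 0, 1, 1, 0]
model_flags = [0, 0, 1, 0, 1, 1, 0, 0, 1, 0]

print(f"screening: {false_positive_rate(y_true, screening_flags):.2f}")
print(f"model:     {false_positive_rate(y_true, model_flags):.2f}")
```

In this toy example both sets of flags catch every cancerous case, but the model flags fewer non-cancerous scans, which is the kind of improvement the challenge is after.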
Competition results have the potential to advance our understanding of how all types of cancer develop and spread in the body. They’ll also free radiologists to spend more time with patients.
¹ Aberle DR, Adams AM, Berg CD, et al. Reduced lung-cancer mortality with low-dose computed tomographic screening. N Engl J Med 365 (5): 395-409, 2011.
² Low-dose CT has historically resulted in high false positive rates of around 25% (Aberle DR, et al., N Engl J Med 365 (5): 395-409, 2011).
This year, the Data Science Bowl will award a total prize purse of $1 million, provided by the Laura and John Arnold Foundation, to those who observe the right patterns, ask the right questions, and in turn, create unprecedented impact around this high-priority issue.
In addition, $5,000 will be awarded to each of the three most highly voted Kernels (a total of $15,000), and $10,000 in prizes will be awarded for sharing your Data Science Bowl journey on social media. More details will be announced on February 1, 2017.
Start Your Submission Today
The Data Science Bowl is your opportunity to learn new skills, forge connections with a global community of problem solvers, and be part of something bigger than any one of us: making cancer a thing of the past.
Data sets are available to download beginning today, January 12, through the end of the competition on April 12. Visit Kaggle.com for more details and begin working on your submission today.