Data Science Bowl Reddit AMA with Dr. Anne Carpenter

March 1, 2018 | Data Science

On February 15, 2018, Dr. Anne Carpenter hosted a Reddit Ask Me Anything (AMA) to help data scientists tackle the mission of this year’s Data Science Bowl: creating an algorithm to automate nucleus detection. She invited participants to ask about the challenge, her team’s open-source CellProfiler software, what it’s like being a scientist, or anything else on their minds.

One participant asked why images of nuclei are so important to cell research. Dr. Carpenter explained, “Identifying the nucleus is useful for almost all cell experiments because it gives us a single, clear landmark within the cell that is much easier to find than if we tried to find cell edges directly.”

She continued, “The biggest hurdle in identifying nuclei is when cells are clumpy and close together … we suspect deep learning should be able to surpass classical algorithms at this.” She referred readers to the Data Science Bowl video for more details.

Another participant asked whether automated image analysis is widespread in cell biology. “Biologists are definitely getting savvier about quantifying images, even if they only have a handful,” Dr. Carpenter responded. She said those who use CellProfiler fall into two groups: those who are processing just a few images and those who are processing millions of images using cloud computing.

Other questioners took the discussion to topics such as the difficulty of curing cancer (Dr. Carpenter likened the quest to “chipping away at a large iceberg”) and how new artificial intelligence companies can advance computational biology. After noting that she expects a lot of competition in AI startups, Dr. Carpenter commented: “I would say what will differentiate the winners is careful design and collection of clinical data to design tools that work not just on test sets but in real life.”

Interested in seeing all the questions and asking your own? Visit the Reddit AMA now.