Practitioners and data scientists have each developed their own jargon, so communication and collaboration across domains can prove difficult. For example, doctors might find it difficult to communicate to data scientists why some data (e.g., shape and structural organization of a tumor in a Magnetic Resonance Imaging scan) are especially important for a given diagnosis (metastatic potential of the tumor), and how this can be reflected in the data structure. Likewise, data scientists might struggle to explain to physicians how or why a given analytical tool (e.g., Bayesian networks) might be effective for uncovering useful information in patient records (changes in prescription medicine use over time as a predictor of future illness). The problem is only compounded when insurance companies, patient advocates, regulatory agencies, and other stakeholders weigh in.
How do we bridge the divide? One simple solution is to employ the universal “language” that helps us close communication gaps in everyday life: analogies. For two decades I (George) was a professor of cell and molecular biology, and my students and I routinely used analogies to discuss concepts that exceeded their limited command of cell biology jargon. We compared cells to busy cities teeming with “workers” (proteins) that worked in specific locations in the city; we discussed what kinds of communication networks regulated how teams of workers functioned together to complete complex tasks. Analogies helped cut through jargon at multiple levels: Instead of discussing the impact of guanine nucleotide dephosphorylation on the activation state of a protein, we’d talk about how cutting the leg off a stool would impact one’s ability to sit on it.
I employed the same strategy for discussing my academic research with colleagues in other fields. To explain how sets of genes are deployed in a specific temporal order to build a brain, my mathematician (a.k.a. data scientist) colleagues and I conceived of a “virtual house building project” in which genes are represented by construction workers (e.g., “Bob’s been working a lot of overtime, looks like he’s found a new friend on the nightshift.”). Similarly, a computer scientist friend of mine explained graph theory to me by using an analogy of a faculty committee meeting, mapping the administrative/managerial relationships between committee members to help explain who sits next to whom. We proposed that similar hierarchical rules applied to where cells are positioned in a healthy tissue, and looked for instances where the “seating arrangement” was unexpected, to predict whether injured human tissue structures were healing (informal discussion at the end of a meeting) or becoming cancerous (an argument).
My favorite colleague was a faculty member in electrical engineering, who was so adept at translating jargon into analogies we called him The Universal Adaptor; he spawned a lot of productive collaborations by simply planting the analogy seeds that linked us all together. In each case, he was able to tap into our intuitive understanding of familiar situations and provide us with a built-in set of relationships between the important elements.
For the Data Science Bowl, Booz Allen and Kaggle must serve as Universal Adaptors. Will Cukierski, Head of Kaggle Competitions, is particularly adept at using analogies. To communicate tradeoffs in scoring statistics for the Data Science Bowl competition, he uses golf, comparing the relative merits of focusing on “getting on the green” versus “shortening the distance from the hole.” To convey that the competition leaderboard has plenty of room for improvement, Will dons his virtual chef’s hat and says, “there’s a lot of meat left on this bone.”
We use analogies to translate across domains every day. What are your Universal Translator analogies? Tweet #DataSciBowl!
—Written by George Plopper