“In silico labeling” aims to decode the terabytes of data per day generated in bio research labs
April 18, 2018
Original link: http://www.kurzweilai.net/how-deep-learning-is-about-to-transform-biomedical-science
Deep learning is a type of machine learning that can analyze data, recognize patterns, and make predictions. A new deep-learning approach to biological images, which the researchers call “in silico labeling” (ISL), can automatically find and predict features in images of “unlabeled” cells (cells that have not been manually identified by using fluorescent chemicals).
The new deep-learning network can identify whether a cell is alive or dead, and get the answer right 98 percent of the time (humans can typically only identify a dead cell with 80 percent accuracy) — without requiring invasive fluorescent chemicals, which make it difficult to track tissues over time. The deep-learning network can also predict detailed features such as nuclei and cell type (such as neural or breast cancer tissue).
The deep-learning algorithms are expected to make it possible to handle the enormous 3–5 terabytes of data per day generated by Gladstone Institutes’ fully automated robotic microscope, which can track individual cells for up to several months.
The research was published in the April 12, 2018 issue of the journal Cell.
How to train a deep-learning neural network to predict the identity of cell features in microscope images
To explore the new deep-learning approach, Steven Finkbeiner, MD, PhD, the director of the Center for Systems and Therapeutics at Gladstone Institutes in San Francisco, teamed up with computer scientists at Google.
“We trained the [deep learning] neural network by showing it two sets of matching images of the same cells: one unlabeled [such as the black and white "phase contrast"microscope image shown in the illustration] and one with fluorescent labels [such as the three colored images shown above],” explained Eric Christiansen, a software engineer at Google Accelerated Science and the study’s first author. “We repeated this process millions of times. Then, when we presented the network with an unlabeled image it had never seen, it could accurately predict where the fluorescent labels belong.” (Fluorescent labels are created by adding chemicals to tissue samples to help visualize details.)
The study used three cell types: human motor neurons derived from induced pluripotent stem cells, rat cortical cultures, and human breast cancer cells. For instance, the deep-learning neural network can identify a physical neuron within a mix of cells in a dish. It can go one step further and predict whether an extension of that neuron is an axon or dendrite (two different but similar-looking elements of the neural cell).
For this study, Google used TensorFlow, an open-source machine learning framework for deep learning originally developed by Google AI engineers. The code for this study, which is open-source on Github, is the result of a collaboration between Google Accelerated Science and two external labs: the Lee Rubin lab at Harvard and the Steven Finkbeiner lab at Gladstone.
Transforming biomedical research
“This is going to be transformative,” said Finkbeiner, who is also a professor of neurology and physiology at UC San Francisco. “Deep learning is going to fundamentally change the way we conduct biomedical science in the future, not only by accelerating discovery, but also by helping find treatments to address major unmet medical needs.”
In his laboratory, Finkbeiner is trying to find new ways to diagnose and treat neurodegenerative disorders, such as Alzheimer’s disease, Parkinson’s disease, and amyotrophic lateral sclerosis (ALS). “We still don’t understand the exact cause of the disease for 90 percent of these patients,” said Finkbeiner. “What’s more, we don’t even know if all patients have the same cause, or if we could classify the diseases into different types. Deep learning tools could help us find answers to these questions, which have huge implications on everything from how we study the disease to the way we conduct clinical trials.”
Without knowing the classifications of a disease, a drug could be tested on the wrong group of patients and seem ineffective, when it could actually work for different patients. With induced pluripotent stem cell technology, scientists could match patients’ own cells with their clinical information, and the deep network could find relationships between the two datasets to predict connections. This could help identify a subgroup of patients with similar cell features and match them to the appropriate therapy, Finkbeiner suggests.
The research was funded by Google, the National Institute of Neurological Disorders and Stroke of the National Institutes of Health, the Taube/Koret Center for Neurodegenerative Disease Research at Gladstone, the ALS Association’s Neuro Collaborative, and The Michael J. Fox Foundation for Parkinson’s Research.