How’s Google Dabbling in Life, Immortality, and More?

Source: Thinkstock

In one of the latest science-fiction-esque developments to come out of Google X, the company’s research arm that’s made headlines with the smart contact lens and driverless car, Google recently announced that its newest ambition is to map the genetics of health, and very likely advance modern methods of genetic data analysis in the process.

The idea is to define a baseline for what it means to be healthy beyond the observable factors that your doctor checks on at a routine appointment. Mapping out everything in its normal state — if it’s possible to broadly define normal or healthy — will make it much easier to spot when something’s changed so that doctors can detect, prevent, and treat disease early on.

The Wall Street Journal reported that a project called the Baseline Study will collect genetic and molecular data from an initial 175 volunteers to work toward an understanding of the genetics of health. The project, run by molecular biologist Andrew Conrad, aims to collect a larger, broader set of data than other groups’ genomics studies in the hope of uncovering biomarkers that can be used to detect and prevent disease.

The study began with an unspecified clinical testing firm this summer, with 175 healthy subjects providing blood, saliva, urine, tears, and other samples in addition to undergoing full genomic sequencing and other tests. The study will also create a repository of tissue samples. After all of these samples are collected, Google will then apply its computing power to finding biomarkers within the information.

Conrad, who is known for developing inexpensive, high-volume tests for HIV in blood plasma donations, joined Google X in March 2013. Since then, he has reportedly assembled a team of 70 to 100 experts in fields like physiology, biochemistry, optics, imaging, and molecular biology. Sam Gambhir, chair of the Stanford University medical school’s Department of Radiology, who has worked with Conrad on Baseline for more than a year, says that most biomarkers discovered so far are related to late-stage diseases, because most studies focus on sick patients instead of healthy ones.

Gambhir and Conrad tell The Wall Street Journal that Baseline is “a giant leap into the unknown,” since relatively little is known about the interplay of DNA, enzymes, and protein with environmental factors and diet. But genetics alone don’t provide a full picture of a person’s health, as other significant influences also factor in.

An article in The New England Journal of Medicine by Steven A. Schroeder noted that health is influenced not only by genetics, but by “social circumstances, environmental exposures, behavioral patterns, and health care.” Schroeder gives the example of how those factors contribute to premature death, and says that genetic predisposition accounts for only 30 percent of outcomes, with social circumstances accounting for 15 percent, environmental exposure accounting for 5 percent, healthcare accounting for 10 percent, and behavioral patterns accounting for a full 40 percent. The interplay of the factors varies with the range of diseases and conditions that medical researchers encounter, but serves to illustrate the fact that looking at genetics only would be a limited approach.

Though Google says that Baseline will seek to connect genetic information with clinical observations of diet and other habits in some way, it’s so far unclear how the study intends to account for the numerous health factors that are separate from genetics. But Google is uniquely poised to combine the large amounts of genetic data that Baseline collects with information about participants’ behavior, perhaps accounting for what they do at home, at work, and in their free time. (Though again, it’s unclear how or even whether Google intends to do so.)

While critics question whether Baseline is really a “moonshot” at all, pointing out that advances in technology make it straightforward for any group to collect 175 DNA samples in the way genomic researchers have done for years, it’s Google’s computational prowess that could lead to innovation and change.

However, the project realistically doesn’t expect to make huge advances in the space of a couple of years, and it’s possible that the incremental progress of the initiative will see researchers unveiling biomarkers that reveal little about diseases. It seems likely that the initial study won’t uncover any particularly valuable insight about health itself, but will help Google to establish a system for collecting and analyzing large sets of health-related data. In that way, Baseline seems to be an almost logical extension of a project that Google unveiled in February, when it announced in a post on its Research Blog that it had joined the Global Alliance for Genomics and Health.

“The Alliance is an international effort to develop harmonized approaches to enable responsible, secure, and effective sharing of genomic and clinical information in the cloud with the research and healthcare communities, meeting the highest standards of ethics and privacy,” the post explained. At the same time that Google announced its partnership with the alliance, the company launched a “Preview Release” version of a genomics API that allows researchers to import, process, store, and search genomic data. It’s a tool to help the genomic research community take advantage of the vast amount of medical information that new technology enables them to quickly and economically collect.

Until recently, the cost of collecting genetic and molecular information was very high. But Google noted that it now takes only about a day and $1,000 to sequence an entire genome. That opens up entirely new possibilities for using larger and larger amounts of data to look for patterns, search for biomarkers, and generally figure out what contributes to health and what causes disease. The post laid out Google’s vision for the potential of the global research community for which the company built the new API:

“Imagine the impact if researchers everywhere had larger sample sizes to distinguish between people who become sick and those who remain healthy, between patients who respond to treatment and those whose condition worsens, between pathogens that cause outbreaks and those that are harmless. Imagine if they could test biological hypotheses in seconds instead of days, without owning a supercomputer.”

Another recent post on Google’s Research Blog detailed some of the work that’s currently being done to facilitate genomics research with the company’s cloud platform. Researchers are working to determine the most accurate algorithms to detect genomic mutations that are associated with cancer. Sequencing the full genome of one person produces over 100 gigabytes of data, and Google is providing computing resources for the researchers who are looking to optimize standard methods for identifying cancer-related mutations.

The post highlights exactly what Google can do well with its forays into health and DNA: leveraging huge computational ability to enable researchers to look for answers among sets of data that are significantly larger than the sets of medical data that scientists have had access to in the past. “We believe we are at the beginning of a transformation in medicine and basic research, driven by advances in genome sequencing and computing at scale,” read the post. Google is looking to accelerate that transformation not only to facilitate cancer research and to map out what a healthy human looks like (as if that weren’t enough), but also to tackle the science of things like aging and death.

Google announced Calico, a new venture with a focus on aging and associated diseases, in September of 2013. At the time, Larry Page said that the company would primarily research conditions like Alzheimer’s, cancer, and heart disease. Though Google provided little concrete information on the exact approach that Calico would take, it seemed clear that Calico would apply Google’s big data and computational abilities to the problems at hand.

It seemed at the time that Google rather ambitiously aimed to “solve” aging itself, since advances in understanding the aging process could eventually lead to broader gains in longevity than research into individual diseases would. Speculation at the time noted that Google could be pursuing a number of anti-aging technologies.

A CNN article listed a few common suspects, like cryonics (a process where the body is preserved in liquid nitrogen), cryotherapy (which exposes injured patients to very low temperatures for short periods of time), cloning and body part replacement, nanotechnology (deploying small robots to overcome the problem of incorrect DNA replication, one cause of aging), and even research into telomeres, the ends of a chromosome that protect cells against degradation. Researchers have hypothesized that figuring out a way to preserve telomeres could lead to a way to protect against aging. Of the very science-fiction-esque anti-aging theories, telomere research or even nanotechnology seems the most likely area of focus for Calico, given Google’s new explorations of DNA study and its place in a broader understanding of health.

Though Google seems uniquely well-equipped to develop methods for the collection, handling, and analysis of vast amounts of medical data, any conversation about putting personal data in Google’s hands always circles around to concerns with privacy, and questions of medical data especially prove no exception. While Google notes that the information will be anonymous, and used only for health purposes, many are quick to point out that they wouldn’t want Google to have access to such deep medical data.

Baseline will be monitored by institutional review boards, and when the “full study” commences, boards at Duke University and Stanford University will control how all of the information collected is used. With researchers at the Duke and Stanford medical schools, Google will analyze results from the pilot and design a study that will involve thousands of people.

Google and other researchers will have access to a huge amount of anonymized information on each volunteer. The Wall Street Journal explains that, “The information will include participants’ entire genomes, their parents’ genetic history, as well as information on how they metabolize food, nutrients and drugs, how fast their hearts beat under stress and how chemical reactions change the behavior of their genes.” The Google X Life Sciences group is developing wearable devices in addition to Google’s smart contact lens to continuously collect data from participants.

Re/Code’s James Temple, evaluating the privacy risks and whether he, personally, would choose to participate in the study, spoke with Hank Greely, a Stanford law professor and an expert in the ethical and legal issues that are associated with biomedical technologies. Greely noted that since researchers will have access to every participant’s entire genome, even with the name and social security number removed, the full genome sequence can only be considered anonymous under a very narrow definition of anonymity. Once someone has access to the sequence, they could cross-reference it with other sources that have it, such as sites like 23andMe or Family Tree DNA.

As DNA testing becomes cheaper and more popular, there will be sources where you could potentially find someone’s genetic information. But the risks with Baseline seem low, since Google isn’t hosting the information publicly, and only plans to share it with researchers involved in formal studies down the road. Participants are also opting in, and Temple says that he’s learned that Baseline’s consent form “explicitly describes the possible risks associated with sharing genomic data.”

Baseline is not aiming to deliver a specific commercial product or service. In that respect, and in its deep exploration of the healthcare sector, it deviates from the company’s past projects. It may represent a new direction for the company’s research efforts, as Baseline and Google’s other health-related projects seem intended as long-term explorations. The company stands to make significant contributions to the field of medical research by advancing methods of analyzing and working with the huge amounts of data that modern technology makes it possible to collect. Even if Baseline doesn’t live up to its lofty ambitions right away, it’s hard to fault Google for looking to apply advances in technology to answering important questions about health and life.
