Cryptographers and Geneticists Unite to Analyze Genomes They Can’t See – Scientific American

Posted: August 22, 2017 at 11:31 pm

A cryptographer and a geneticist walk into a seminar room. An hour later, after a talk by the cryptographer, the geneticist approaches him with a napkin covered in scrawls. The cryptographer furrows his brow, then nods. Nearly two years later, they reveal the product of their combined prowess: an algorithm that finds harmful mutations without actually seeing anyone's genes.

The goal of the scientists, Stanford University cryptographer Dan Boneh and geneticist Gill Bejerano, along with their students, is to protect the privacy of patients who have shared their genetic data. Rapid and affordable genome sequencing has launched a revolution in personalized medicine, allowing doctors to zero in on the causes of a disease and propose tailor-made solutions. The challenge is that such comparisons typically rely on inspecting the genes of many different patients, including patients from unrelated institutions and studies. The simplest means to do this is for the caregiver or scientist to obtain patient consent, then post every letter of every gene in an anonymized database. The data is usually protected by licensing agreements and restricted registration, but ultimately the only thing keeping it from being shared, de-anonymized or misused is the good behavior of users. Ideally, it should be not just illegal but impossible for a researcher (say, one who is hacked or who joins an insurance company) to leak the data.

When patients share their genomes, researchers managing the databases face a tough choice. If the whole genome is made available to the community, the patient risks future discrimination. For example, Stephen Kingsmore, CEO of Rady Children's Institute for Genomic Medicine, encounters many parents in the military who refuse to compare their genomes with those of their sick children, fearing they will be discharged if the military learns of harmful mutations. On the other hand, if the scientists share only summaries or limited segments of the genome, other researchers may struggle to discover critical patterns in a disease's genetics or to pinpoint the genetic causes of individual patients' health problems.

Boneh and Bejerano promise the best of both worlds using a cryptographic concept called secure multiparty computation (SMC). This is, in effect, an approach to the "millionaires' problem," a hypothetical situation in which two individuals want to determine who is richer without revealing their net worth. SMC techniques work beautifully for such conjectural examples, but with the exception of one Danish sugar beet auction, they have almost never been put into practice. The Stanford group's work, published last week in Science, is among the first to apply this mind-bending technology to genomics. The new algorithm lets patients or hospitals keep genomic data private while still joining forces with faraway researchers and clinicians to find disease-linked mutations, or at least that is the hope. For widespread adoption, the new method will need to overcome the same pragmatic barriers that often leave cryptographic innovations gathering dust.

Intuitively, Boneh and Bejerano's plan seems preposterous. If someone can see the data, they can leak it. And how could they infer anything from a genome they can't see? But cryptographers have been grappling with just such problems for years. "Cryptography lets you do a lot of things like [SMC]: keep data hidden and still operate on that data," Boneh says. When Bejerano attended Boneh's talk on recent developments in cryptography, he realized SMC was a perfect fit for genomic privacy.

The particular SMC technique that the Stanford team wedded to genomics is known as Yao's protocol. Say, for instance, that Alice and Bob (the ever-present denizens of cryptographers' imaginations) want to check whether they share a mutation in gene X. Under Yao's protocol, Alice (who knows only her own genome) writes down the answer for every possible combination of her and Bob's genes. She then encrypts each one twice (analogous to locking it behind two layers of doors) and works with Bob to find the correct answer by strategically arranging a cryptographic "garden of forking paths" for him to navigate.

She sets up outer doors to correspond to the possibilities for her gene. Call them "Alice doors": If Bob enters door 3, any answers he finds inside will assume that Alice has genetic variant 3. Behind each Alice door, Alice adds a second layer of doors, the "Bob doors," corresponding to the options for Bob's gene. Each combination of doors leads to the answer for the corresponding pair of Alice's and Bob's genes. Bob then simply has to get the right pair of keys (essentially passwords) to unlock the doors. By scrambling the order of the doors and carefully choosing who gets to see which keys and labels, Alice can ensure that the only answer Bob will be able to unlock is the correct one, while still preventing herself from learning Bob's gene, or vice versa.
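The door analogy above can be sketched as a toy lookup table in Python. This is only an illustration under stated assumptions, not the Stanford team's software and not a full implementation of Yao's protocol, which garbles a circuit gate by gate. The function names (build_doors, open_doors, shares_variant), the four-variant gene and the SHA-256-derived pad are all hypothetical choices made for the example. In particular, the step where Bob obtains his key would use oblivious transfer in a real protocol; here it is a plain lookup and therefore not private.

```python
import hashlib
import secrets

TAG = bytes(16)  # all-zero marker that lets Bob recognise a successful decryption


def _pad(alice_key: bytes, bob_key: bytes) -> bytes:
    """Derive a one-time pad from one Alice-door key and one Bob-door key."""
    return hashlib.sha256(alice_key + bob_key).digest()[: len(TAG) + 1]


def _xor(data: bytes, pad: bytes) -> bytes:
    return bytes(d ^ p for d, p in zip(data, pad))


def build_doors(num_variants, answer):
    """Alice's side: one doubly locked answer per (Alice variant, Bob variant)."""
    alice_keys = [secrets.token_bytes(16) for _ in range(num_variants)]
    bob_keys = [secrets.token_bytes(16) for _ in range(num_variants)]
    rows = []
    for i in range(num_variants):
        for j in range(num_variants):
            plaintext = TAG + bytes([answer(i, j)])
            rows.append(_xor(plaintext, _pad(alice_keys[i], bob_keys[j])))
    secrets.SystemRandom().shuffle(rows)  # scramble the order of the doors
    return alice_keys, bob_keys, rows


def open_doors(alice_key, bob_key, rows):
    """Bob's side: only the row locked with both of his keys decrypts cleanly."""
    pad = _pad(alice_key, bob_key)
    for row in rows:
        plaintext = _xor(row, pad)
        if plaintext[: len(TAG)] == TAG:
            return plaintext[-1]
    raise ValueError("no row matched the supplied keys")


def shares_variant(alice_variant, bob_variant, num_variants=4):
    """Return 1 if both parties hold the same variant of the gene, else 0."""
    alice_keys, bob_keys, rows = build_doors(num_variants, lambda i, j: int(i == j))
    # Assumption: a real protocol hands Bob bob_keys[bob_variant] through
    # oblivious transfer, so Alice never learns which key he asked for.
    # The direct lookup below is NOT private and is for illustration only.
    return open_doors(alice_keys[alice_variant], bob_keys[bob_variant], rows)
```

Calling shares_variant(2, 2) walks Bob through the single pair of doors his keys fit and yields 1, while shares_variant(1, 3) yields 0. The key Alice hands over for her own variant is a random string, so on its own it tells Bob nothing about which variant she holds.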

Using a digital equivalent of this process, the Stanford team demonstrated three different kinds of privacy-preserving genomic analyses. They searched for the most common mutations in patients with four rare diseases, in all cases finding the known causal gene. They also diagnosed a baby's illness by comparing his genome with those of his parents. Perhaps the researchers' biggest triumph was discovering a previously unknown disease gene by having two hospitals search their genome databases for patients with identical mutations. In all cases the patients' full genomes never left the hands of their care providers.

In addition to patient benefits, keeping genomes under wraps would do much to soothe the minds of the custodians of those genome databases, who fear the trust implications of a breach, says Giske Ursin, director of the Cancer Registry of Norway. "We [must] always be slightly more neurotic," she says. Genomic privacy likewise offers help for "second- and third-degree relatives, [who] share a significant fraction of the genome," notes Bejerano's student Karthik Jagadeesh, one of the paper's first authors. Bejerano further points to the conundrums genomicists face when they spot harmful mutations unrelated to their work. The ethical question of what mutations a genomicist must scan for or discuss with the patient does not arise if most genes stay concealed.

Bejerano argues the SMC technique makes genomic privacy a practical option. "It's a policy statement, in some sense. It says, 'If you want to both keep your genome private and use it for your own good and the good of others, you can. You should just demand that this opportunity is given to you.'"

Other researchers and clinicians, although agreeing the technique is technically sound, worry that it faces an uphill battle on the practical side. Yaniv Erlich, a Columbia University assistant professor of computer science and computational biology, predicts the technology could end up like PGP (Pretty Good Privacy) encryption. Despite its technical strengths as a tool for encrypting e-mails, PGP is used by almost no one, largely because cryptography is typically so hard to use. And usability is of particular concern to medical practitioners: Several echo Erlich's sentiment that their priority is diagnosing and treating a condition as quickly as possible, making any friction in the process intolerable. "It's great to have it as a tool in the toolbox," Erlich says, "but my sense is that the field is not going in this direction."

Kingsmore, Erlich and others are also skeptical that the paper's approach would solve some of the real-world problems that concern the research and clinical communities. For example, they feel it would be hard to apply it directly to oncology, where genomes are useful primarily in conjunction with detailed medical and symptomatic records.

Still, Kingsmore and Erlich do see some potential for replacing today's clunky data-management mechanisms with more widespread genome sharing. In any case, the takeaway for Bejerano is not that genome hiding is destined to happen, but that it is a technological possibility. "You would think we have no choice: If we want to use the data, it must be revealed. Now that we know that is not true, it is up to society to decide what to do next."
