Privacy protections: The genome hacker

Posted: May 8, 2013 at 2:45 pm


Late at night, a video camera captures a man striding up to the locked door of the information-technology department of a major Israeli bank. At this hour, access can be granted only by a fingerprint reader, but instead of using the machine, the man pushes a button on the intercom to ring the receptionist's phone. As it rings, he holds his mobile phone up to the intercom and presses the number 8. The sound of the keypad tone is enough to unlock the door. As he opens it, the man looks back to the camera with a shrug: that was easy.

Yaniv Erlich, the star of this 2006 video, considers this one of his favourite hacks. Technically a penetration exercise conducted to expose the bank's vulnerabilities, it was one of several projects that Erlich worked on during a two-year stint with a security firm based near Tel Aviv. Since then, the 33-year-old computational biologist has been bringing his hacker ethos to biology. Now at the Whitehead Institute for Biomedical Research in Cambridge, Massachusetts, he is using genome data in new ways, and in the process exposing vulnerabilities in databases that hold sensitive information on thousands of individuals around the world.

In a study published in January [1], Erlich's lab showed that it is possible to discover the identities of people who participate in genetic research studies by cross-referencing their data with publicly available information. Previous studies had shown that people listed in anonymous genetic data stores could be unmasked by matching their data to a sample of their DNA. But Erlich showed that all it requires is an Internet connection.

Erlich's work has exposed a pressing ethical quandary. As researchers increasingly combine patient data with other types of information (everything from social-media posts to entries on genealogy websites), protecting anonymity becomes next to impossible. Studying these linked data has its benefits, but it may also reveal genetic and medical information that researchers had promised to keep private and that, if made public, might hurt people's employability, insurability or even personal relationships.

Such revelations may make the scientific community uncomfortable and undermine the public's trust in medical research. But Erlich and his colleagues see their work as a way to alert the world about flawed systems, keep researchers honest and ultimately strengthen science. In March, for instance, the European Molecular Biology Laboratory (EMBL) in Heidelberg, Germany, claimed that the genome sequence that it had published for the HeLa cell line would not reveal anything about Henrietta Lacks, the source of the cells, or her descendants. Erlich issued a tart response: "Nice lie EMBL!" he tweeted. The sequence was later pulled from public databases, and the EMBL admitted that it would indeed be possible to glean information about the Lacks family from it, even though much of the HeLa genetic data had already been published as part of other studies.

"Most scientists would not go anywhere close to these questions, out of a sense of what it might mean for the field, or for them personally," says David Page, director of the Whitehead Institute, who has advised Erlich about his research. "But this is not about publicity-seeking; this is about fearlessness, and a kind of interest in how all the parts of the Universe fit together that mark all of Yaniv's work."

Erlich was inspired to teach himself programming as a child in Israel after seeing the 1983 film WarGames, in which a teenager accidentally hacks into government computer systems and nearly launches global thermonuclear war. Erlich thought that he would study maths and physics at university, but after a friend told him that there was a lot of maths in biology, he decided to major in computational neuroscience. In 2006, following his graduation, Erlich moved to the United States to earn his PhD in genetics at Cold Spring Harbor Laboratory in New York.

Under his adviser, molecular biologist Greg Hannon, Erlich devised what he called "DNA Sudoku": a sequencing method that could be used on tens of thousands of specimens analysed simultaneously. It allowed scientists to use computational techniques to find a gene carrying a rare mutation from this mixed batch of DNA and assign it to the right specimen [2]. Erlich is now using the technique to find disease-causing mutations in young Ashkenazi Jews to inform their decisions about potential marriage partners.
