Scientists are on a path to sequencing 1 million human genomes and use big data to unlock genetic secrets – GCN.com

Posted: April 19, 2021 at 7:18 am


The first draft of the human genome, published 20 years ago in 2001, took nearly three years and cost between US$500 million and $1 billion. The Human Genome Project has allowed scientists to read, almost end to end, the 3 billion pairs of DNA bases or letters that biologically define a human being.

That project has allowed a new generation of researchers like me, currently a postdoctoral fellow at the National Cancer Institute, to identify novel targets for cancer treatments, engineer mice with human immune systems and even build a webpage where anyone can navigate the entire human genome with the same ease with which you use Google Maps.

The first complete genome was generated from a handful of anonymous donors in an attempt to produce a reference genome that represented more than a single individual. But this fell far short of encompassing the wide diversity of human populations in the world. No two people are the same and no two genomes are the same, either. If researchers wanted to understand humanity in all its diversity, it would take sequencing thousands or millions of complete genomes. Now, a project like that is underway.

Understanding genetic diversity

The wealth of genetic variation among people is what makes each person unique. But genetic changes also cause many disorders and make some groups of people more susceptible to certain diseases than others.

Around the time of the Human Genome Project, researchers were also sequencing the complete genomes of organisms such as mice, fruit flies, yeasts and some plants. The huge effort made to generate these first genomes led to a revolution in the technology required to read genomes. Thanks to these advances, instead of taking years and costing hundreds of millions of dollars to sequence a whole human genome, it now takes a few days and costs merely a thousand dollars. Genome sequencing is very different from genotyping services like 23andMe or Ancestry, which look at only a tiny fraction of locations in a person's genome.

Advances in technology have allowed scientists to sequence the complete genomes of thousands of individuals from around the world. Initiatives such as the Genome Aggregation Consortia are currently making efforts to collect and organize this scattered data. So far, that group has been able to gather nearly 150,000 genomes that show an incredible amount of human genetic diversity. Within that set, researchers have found more than 241 million differences in people's genomes, with an average of one variant for every eight base pairs.

Most of these variations are very rare and will have no effect on a person. However, hidden among them are variants with important physiological and medical consequences. For example, certain variants in the BRCA1 gene predispose some groups of women, like Ashkenazi Jews, to ovarian and breast cancer. Other variants in that gene lead some Nigerian women to experience higher-than-normal mortality from breast cancer.

The best way researchers can identify these types of population-level variants is through genomewide association studies that compare the genomes of large groups of people with a control group. But diseases are complicated. An individual's lifestyle, symptoms and time of onset can vary greatly, and the effect of genetics on many diseases is hard to distinguish. The predictive power of current genomic research is too low to tease out many of these effects because there isn't enough genomic data.
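To make the sample-size point concrete, here is a minimal sketch of the kind of comparison a genomewide association study runs at a single variant: count how often a variant appears in people with a disease versus a control group, then ask whether the difference is larger than chance would explain. The allele frequencies (22% in cases, 20% in controls) and the group sizes are hypothetical numbers chosen for illustration, not figures from any study, and the helper names are made up for this example. The threshold is the chi-square value that roughly corresponds to the conventional genomewide significance level of p = 5e-8.

```python
def chi_square_2x2(case_alt, case_ref, control_alt, control_ref):
    """Pearson chi-square statistic for a 2x2 table of allele counts."""
    table = [[case_alt, case_ref], [control_alt, control_ref]]
    total = case_alt + case_ref + control_alt + control_ref
    row_sums = [sum(row) for row in table]
    col_sums = [case_alt + control_alt, case_ref + control_ref]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            stat += (table[i][j] - expected) ** 2 / expected
    return stat


def allele_counts(n_people, alt_freq):
    """Variant and reference allele counts for n_people (two alleles each)."""
    n_alleles = 2 * n_people
    alt = round(n_alleles * alt_freq)
    return alt, n_alleles - alt


# Approximate chi-square value (1 degree of freedom) for p = 5e-8.
GENOME_WIDE_CHI2 = 29.7

# Hypothetical variant: slightly more common in cases than in controls.
CASE_FREQ, CONTROL_FREQ = 0.22, 0.20

for n_per_group in (500, 5_000, 50_000):
    case_alt, case_ref = allele_counts(n_per_group, CASE_FREQ)
    ctrl_alt, ctrl_ref = allele_counts(n_per_group, CONTROL_FREQ)
    stat = chi_square_2x2(case_alt, case_ref, ctrl_alt, ctrl_ref)
    verdict = "detectable" if stat > GENOME_WIDE_CHI2 else "not detectable"
    print(f"{n_per_group:>6} people per group: chi-square = {stat:.1f} ({verdict})")
```

The underlying difference in frequency is identical in every run; only the number of genomes changes. With small groups the signal cannot be separated from noise, which is why modest genetic effects only become visible once very large numbers of genomes are available.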

Understanding the genetics of complex diseases, especially those related to the genetic differences among ethnic groups, is essentially a big data problem. And researchers need more data.

1,000,000 genomes

To address the need for more data, the National Institutes of Health has started a program called All of Us. The project aims to collect genetic information, medical records and health habits from surveys and wearables of more than a million people in the U.S. over the course of 10 years. It also has a goal of gathering more data from underrepresented minority groups to facilitate the study of health disparities. The All of Us project opened to public enrollment in 2018, and more than 270,000 people have contributed samples since. The project is continuing to recruit participants from all 50 states. Participating in this effort are many academic laboratories and private companies.
