Human genetic clustering refers to patterns of relative genetic similarity among human individuals and populations, as well as the wide range of scientific and statistical methods used to study this aspect of human genetic variation.
Clustering studies are thought to be valuable for characterizing the general structure of genetic variation among human populations and for contributing to the study of ancestral origins, evolutionary history, and precision medicine. Since the mapping of the human genome, and with the availability of increasingly powerful analytic tools, cluster analyses have revealed a range of ancestral and migratory trends among human populations and individuals.[1] Human genetic clusters tend to be organized by geographic ancestry, with divisions between clusters aligning largely with geographic barriers such as oceans or mountain ranges.[2][3] Clustering studies have been applied to global populations,[4] as well as to population subsets like post-colonial North America.[5][6] Notably, the practice of defining clusters among modern human populations is largely arbitrary and variable due to the continuous nature of human genotypes; although individual genetic markers can be used to produce smaller groups, there are no models that produce completely distinct subgroups when larger numbers of genetic markers are used.[2][7][8]
Many studies of human genetic clustering have been implicated in discussions of race, ethnicity, and scientific racism, as some have controversially suggested that genetically derived clusters may be understood as proof of genetically determined races.[9][10] Although cluster analyses invariably organize humans (or groups of humans) into subgroups, debate is ongoing on how to interpret these genetic clusters with respect to race and its social and phenotypic features. Moreover, because only a small fraction of genetic variation lies between human genotypes overall, genetic clustering approaches are highly dependent on the sampled data, genetic markers, and statistical methods applied to their construction.
A wide range of methods have been developed to assess the structure of human populations with the use of genetic data. Early studies of within- and between-group genetic variation used physical phenotypes and blood groups, whereas modern genetic studies use genetic markers such as Alu sequences, short tandem repeat polymorphisms, and single nucleotide polymorphisms (SNPs), among others.[11] Models for genetic clustering also vary by the algorithms and programs used to process the data. Most sophisticated methods for determining clusters can be categorized as model-based clustering methods (such as the algorithm STRUCTURE[12]) or multidimensional summaries (typically through principal component analysis).[1][13] By processing a large number of SNPs (or other genetic marker data) in different ways, both approaches to genetic clustering tend to converge on similar patterns by identifying similarities among SNPs and/or haplotype tracts to reveal ancestral genetic similarities.[13]
Common model-based clustering algorithms include STRUCTURE, ADMIXTURE, and HAPMIX. These algorithms operate by finding the best fit for genetic data among an arbitrary or mathematically derived number of clusters, such that differences within clusters are minimized and differences between clusters are maximized. This clustering method is also referred to as "admixture inference," as individual genomes (or individuals within populations) can be characterized by the proportions of alleles linked to each cluster.[1] In other words, algorithms like STRUCTURE generate results that assume the existence of discrete ancestral populations, operationalized through unique genetic markers, which have combined over time to form the admixed populations of the modern day.
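As a rough illustration of what "proportions of alleles linked to each cluster" can mean, the sketch below estimates one simulated individual's ancestry proportions by expectation-maximization, with the allele frequencies of two clusters treated as known. This is a simplified teaching example and not the procedure implemented by STRUCTURE, ADMIXTURE, or HAPMIX, which also have to estimate the cluster frequencies themselves; the function name `estimate_ancestry`, the simulated frequencies, and the 70/30 mixture are all assumptions made for the example.

```python
import numpy as np

def estimate_ancestry(genotype, cluster_freqs, n_iter=200):
    """EM estimate of ancestry proportions, cluster frequencies held fixed.

    genotype:      (n_snps,) array of 0, 1, or 2 reference-allele copies
    cluster_freqs: (K, n_snps) reference-allele frequency in each cluster
    """
    k = cluster_freqs.shape[0]
    q = np.full(k, 1.0 / k)  # start from equal proportions
    for _ in range(n_iter):
        # Probability that a reference (or alternate) allele copy at each
        # SNP came from each cluster, given the current proportions.
        ref = q[:, None] * cluster_freqs
        alt = q[:, None] * (1.0 - cluster_freqs)
        ref /= ref.sum(axis=0)
        alt /= alt.sum(axis=0)
        # New proportions: average cluster membership over all allele copies.
        q = (genotype * ref + (2 - genotype) * alt).sum(axis=1) / (2 * genotype.size)
    return q

rng = np.random.default_rng(1)
n_snps = 2000

# Hypothetical allele frequencies for two presupposed ancestral clusters.
cluster_freqs = rng.uniform(0.05, 0.95, size=(2, n_snps))

# Simulate one admixed individual drawing 70% / 30% of alleles from them.
true_q = np.array([0.7, 0.3])
genotype = rng.binomial(2, true_q @ cluster_freqs)

q_hat = estimate_ancestry(genotype, cluster_freqs)
print(q_hat)
```

The estimate depends entirely on the clusters supplied to it: a different number of clusters, or different assumed frequencies, would yield different proportions for the same genome.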
Where model-based clustering characterizes populations using proportions of presupposed ancestral clusters, multidimensional summary statistics characterize populations on a continuous spectrum. The most common multidimensional statistical method used for genetic clustering is principal component analysis (PCA), which plots individuals by two or more axes (their "principal components") that represent aggregations of genetic markers that account for the highest variance. Clusters can then be identified by visually assessing the distribution of data; with larger samples of human genotypes, data tends to cluster in distinct groups as well as admixed positions between groups.[1][13]
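The PCA step described above can be sketched with NumPy on simulated data. The two populations, their allele frequencies, and the sample sizes are invented for illustration; published analyses use real genotype panels and typically apply additional normalization and quality filtering that this sketch omits.

```python
import numpy as np

rng = np.random.default_rng(0)
n_per_pop, n_snps = 50, 500

# Hypothetical allele frequencies: population B drifts slightly from A.
freq_a = rng.uniform(0.1, 0.9, size=n_snps)
freq_b = np.clip(freq_a + rng.normal(0.0, 0.15, size=n_snps), 0.05, 0.95)

# Genotypes coded as 0, 1, or 2 copies of the reference allele.
geno_a = rng.binomial(2, freq_a, size=(n_per_pop, n_snps))
geno_b = rng.binomial(2, freq_b, size=(n_per_pop, n_snps))
genotypes = np.vstack([geno_a, geno_b]).astype(float)

# Center each SNP column, then use the SVD to find the axes of
# greatest variance (the principal components).
centered = genotypes - genotypes.mean(axis=0)
u, s, vt = np.linalg.svd(centered, full_matrices=False)

# Coordinates of each individual on the first two components.
pcs = u[:, :2] * s[:2]

# Share of total variance carried by each component.
explained = s**2 / np.sum(s**2)

print(pcs[:n_per_pop, 0].mean(), pcs[n_per_pop:, 0].mean())
```

Plotting the two columns of `pcs` against each other gives the kind of scatter plot that is then inspected visually for groupings.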
There are caveats and limitations to genetic clustering methods of any type, given the degree of admixture and relative similarity within the human population. All genetic cluster findings are biased by the sampling process used to gather data, and by the quality and quantity of that data. For example, many clustering studies use data derived from populations that are geographically distinct and far apart from one another, which may present an illusion of discrete clusters where, in reality, populations are much more blended with one another when intermediary groups are included.[1] Sample size also plays an important moderating role on cluster findings, as different sample size inputs can influence cluster assignment, and more subtle relationships between genotypes may only emerge with larger sample sizes.[1][8] In particular, the use of STRUCTURE has been widely criticized as being potentially misleading through requiring data to be sorted into a predetermined number of clusters which may or may not reflect the actual population's distribution.[8][14] The creators of STRUCTURE originally described the algorithm as an "exploratory" method to be interpreted with caution and not as a test with statistically significant power.[12][15]
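The sampling caveat can be illustrated with a small simulation. In the sketch below, allele frequencies change smoothly along a single hypothetical geographic cline; the same PCA is run once on individuals sampled only from the two ends and once on individuals sampled along its whole length. The linear cline, the helper name `pc1_largest_gap`, and all parameter values are assumptions chosen for the example, not a model of any real population.

```python
import numpy as np

def pc1_largest_gap(positions, freq_lo, freq_hi, rng):
    """Largest gap between neighbouring PC1 scores, as a share of PC1's range."""
    # Allele frequencies change linearly with position along the cline.
    freqs = freq_lo + np.outer(positions, freq_hi - freq_lo)
    genotypes = rng.binomial(2, freqs).astype(float)
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    pc1 = np.sort(u[:, 0] * s[0])
    return np.diff(pc1).max() / (pc1[-1] - pc1[0])

rng = np.random.default_rng(2)
n_snps = 1000

# Hypothetical allele frequencies at the two ends of the cline.
freq_lo, freq_hi = rng.uniform(0.1, 0.9, size=(2, n_snps))

# Sampling only the geographic extremes versus sampling the whole cline.
ends_only = np.concatenate([rng.uniform(0.0, 0.1, 50), rng.uniform(0.9, 1.0, 50)])
along_cline = rng.uniform(0.0, 1.0, 100)

gap_ends = pc1_largest_gap(ends_only, freq_lo, freq_hi, rng)
gap_cline = pc1_largest_gap(along_cline, freq_lo, freq_hi, rng)
print(gap_ends, gap_cline)
```

A large gap on the first component suggests two apparently discrete groups, while a small gap indicates a continuous spread. Here the underlying variation is identical in both runs; only the sampling scheme differs.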
Modern applications of genetic clustering methods to global-scale genetic data were first marked by studies associated with the Human Genome Diversity Project (HGDP) data.[1] These early HGDP studies, such as those by Rosenberg et al. (2002),[4][16] contributed to theories of the serial founder effect and early human migration out of Africa, and clustering methods have been notably applied to describe admixed continental populations.[5][6][17] Genetic clustering and HGDP studies have also contributed to methods for, and criticisms of, the genetic ancestry consumer testing industry.[18]
A number of landmark genetic cluster studies have been conducted on global human populations since 2002, including the following:
Clusters of individuals are often geographically structured. For example, when clustering a population of East Asians and Europeans, each group will likely form its own respective cluster based on similar allele frequencies. In this way, clusters can have a correlation with traditional concepts of race and self-identified ancestry; in some cases, such as medical questionnaires, the latter variables can be used as a proxy for genetic ancestry where genetic data is unavailable.[9][4] However, genetic variation is distributed in a complex, continuous, and overlapping manner, so this correlation is imperfect and the use of racial categories in medicine can introduce additional hazards.[9]
Some scholars have challenged the idea that race can be inferred by genetic clusters, drawing distinctions between arbitrarily assigned genetic clusters, ancestry, and race. One recurring caution against thinking of human populations in terms of clusters is the notion that genotypic variation and traits are distributed continuously across populations, along gradual clines rather than along discrete population boundaries; so although genetic similarities are usually organized geographically, their underlying populations have never been completely separated from one another. Due to migration, gene flow, and baseline homogeneity, features between groups are extensively overlapping and intermixed.[2][9] Moreover, genetic clusters do not typically match socially defined racial groups; many commonly understood races may not be sorted into the same genetic cluster, and many genetic clusters are made up of individuals who would have distinct racial identities.[7] In general, clusters may most simply be understood as products of the methods used to sample and analyze genetic data; they are not without meaning for understanding ancestry and genetic characteristics, but they are inadequate for fully explaining the concept of race, which is more often described in terms of social and cultural forces.
In the related context of personalized medicine, race is currently listed as a risk factor for a wide range of medical conditions with genetic and non-genetic causes. Questions have emerged regarding whether or not genetic clusters support the idea of race as a valid construct to apply to medical research and treatment of disease, because there are many diseases that correspond with specific genetic markers and/or with specific populations, as seen with Tay-Sachs disease or sickle cell disease.[3][25] Researchers are careful to emphasize that ancestry, revealed in part through cluster analyses, plays an important role in understanding risk of disease. But racial or ethnic identity does not perfectly align with genetic ancestry, and so race and ethnicity do not reveal enough information to make a medical diagnosis.[25] Race as a variable in medicine is more likely to reflect social factors, whereas ancestry information is more likely to be meaningful when considering genetic ancestry.[2][25]
Read more:
Human genetic clustering - Wikipedia