Looking for relatedness in the HapMap Gujaratis | Gene Expression

Recently I was looking at a 3-D PCA animation which Zack generated from the Harappa Ancestry Project data set. Click the link and come back. Notice the outlier clusters? The Burusho are straightforward: they seem to have low levels of Tibetan admixture. But what about the Gujarati cluster? Again, we see what we’ve seen before, the Gujaratis splitting in PCA into two groups, one a tight cluster and the other relatively widely distributed. This prompted me to look more closely at the HapMap Gujarati sample. Today I was exploring the question with Plink’s identity-by-descent feature. First I’ll start out with a smaller data set: my family (father, mother, sibling 1, sibling 2, and myself), plus an Indian (from Uttar Pradesh) and a Pakistani as unrelated individuals. I merged our 23andMe-derived genotypes, and with ~900,000 markers calculated pairwise IBD:

./plink --bfile IBDControl --genome

Here are the relevant results:

Individual 1  Individual 2  Z0     Z1     Z2     PI_HAT  DST    PPC    RATIO
Indian        Father        0.768  0.027  0.205  0.218   0.760  0.160  1.940
Indian        Mother        0.782  0.010  0.209  0.214   0.759  0.026  1.886
Indian        Razib         0.767  0.032  0.202  0.218   0.759  0.500  2.000
Indian        Sibling1      0.769  0.025  0.206  0.219   0.760  0.198  1.949
Indian        Sibling2      0.766  0.032  0.203  0.219   0.760  0.685  2.030
Indian        Pakistani     0.781  0.017  0.203  0.211   0.758  0.533  2.005
Father        Mother        0.776  0.018  0.207  0.215   0.759  0.284  1.965
Father        Razib         0.002  0.777  0.221  0.610   0.851  1.000  450.800
Father        Sibling1      0.001  0.785  0.214  0.606   0.850  1.000  898.800
Father        Sibling2      0.002  0.779  0.220  0.609   0.851  1.000  643.143
Father        Pakistani     0.778  0.019  0.203  0.213   0.758  0.201  1.950
Mother        Razib         0.002  0.788  0.211  0.605   0.849  1.000  639.429
Mother        Sibling1      0.002  0.781  0.218  0.608   0.850  1.000  639.857
Mother        Sibling2      0.002  0.782  0.216  0.607   0.850  1.000  447.900
Mother        Pakistani     0.779  0.020  0.201  0.211   0.758  0.052  1.904
Razib         Sibling1      0.183  0.408  0.409  0.613   0.866  1.000  11.386
Razib         Sibling2      0.194  0.432  0.374  0.590   0.858  1.000  11.491
Razib         Pakistani     0.781  0.016  0.203  0.211   0.758  0.933  2.095
Sibling1      Sibling2      0.236  0.412  0.351  0.557   0.849  1.000  9.413
Sibling1      Pakistani     0.777  0.024  0.199  0.211   0.758  0.327  1.973
Sibling2      Pakistani     0.774  0.024  0.202  0.214   0.758  0.443  1.991

You can infer some things without even knowing what the columns mean. Notice that the parent-child, sibling-sibling, and unrelated comparisons each look different. The distance measure, DST, is essentially identical to the genome-wide comparison in 23andMe. Either the web app is running Plink, or it’s using the ...
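To make the pattern concrete, here is a minimal Python sketch that sorts pairs into those three groups using only Z0 and Z1. It is an illustration, not anything Plink or 23andMe runs. The three rows are copied from the results above; the `Pair` type, the `classify` function, and the cutoffs (0.05, 0.5) are my own assumptions, picked by eye to fit this small table, and would need rethinking on any other data set. The one non-arbitrary piece is the relationship PI_HAT = Z2 + 0.5 × Z1, which is how Plink defines the proportion of the genome shared IBD.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """One pairwise comparison (hypothetical container, not a Plink type)."""
    ind1: str
    ind2: str
    z0: float      # estimated P(IBD = 0)
    z1: float      # estimated P(IBD = 1)
    z2: float      # estimated P(IBD = 2)
    pi_hat: float  # reported proportion IBD


# Three rows copied from the results in this post.
PAIRS = [
    Pair("Indian", "Father", 0.768, 0.027, 0.205, 0.218),
    Pair("Father", "Razib", 0.002, 0.777, 0.221, 0.610),
    Pair("Razib", "Sibling1", 0.183, 0.408, 0.409, 0.613),
]


def pi_hat_from_z(z1: float, z2: float) -> float:
    """Proportion IBD: P(IBD = 2) + 0.5 * P(IBD = 1)."""
    return z2 + 0.5 * z1


def classify(pair: Pair) -> str:
    """Rough labels; the cutoffs are assumptions fitted to this table only."""
    if pair.z0 < 0.05 and pair.z1 > 0.5:
        # Parent and child always share one allele IBD, so Z0 is near zero.
        return "parent-child"
    if pair.z0 < 0.5:
        # Full siblings spread their sharing across IBD = 0, 1, and 2.
        return "sibling"
    return "unrelated"


for p in PAIRS:
    recomputed = pi_hat_from_z(p.z1, p.z2)
    print(f"{p.ind1}-{p.ind2}: {classify(p)}, "
          f"PI_HAT reported {p.pi_hat:.3f}, recomputed {recomputed:.4f}")
```

Note that PI_HAT alone would not separate the parent-child pairs from the sibling pairs here, since both sit around 0.6; it is the Z0 column that tells them apart.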
