Sampling and sequencing
The cultivated and wild rice accessions were all from large collections of rice accessions preserved at the China National Rice Research Institute in Hangzhou, China, and the National Institute of Genetics in Mishima, Japan. The accessions were selected on the basis of the germplasm database records of phenotypic data and sampling localities to maximize genetic and geographic diversity. The collection was maintained by selfing in the laboratories. For each accession, genomic DNA from a single plant was used for sequencing, and seeds derived from the same plant were used for following field trials. In total, the genomes of 1,083 O. sativa accessions, 446 O. rufipogon accessions and 15 accessions of outgroup species were sequenced on the Illumina Genome Analyzer IIx generating 73-bp (or 117-bp) paired-end reads, each to approximately onefold (for O. sativa accessions), twofold (for O. rufipogon accessions) or threefold (for outgroup species) coverage. The detailed information, including geographic origin and sequencing coverage of the rice accessions, was listed in Supplementary Tables 2, 7 and 8. Library construction and sequencing of these accessions were performed as described21. One representative accession of O. rufipogon, W1943, was sequenced on the Illumina HiSeq2000, generating 100-bp paired-end reads with 100-fold genome coverage. An amplification-free method of library preparation49 was used in deep sequencing of the rice accession, which reduced the incidence of duplicate sequences, thus facilitating genome assembly and variation analysis.
The paired-end reads of all the rice accessions were aligned against the rice reference genome (IRGSP 4.0) using the software Smalt (version 0.4) with the parameters of -pair 50, 700 and -mthresh 50. SNPs were called using the Ssaha Pileup package (version 0.5) with detailed procedure described previously21. Genotypes of the rice accessions, including 1,083 O. sativa accessions, 446 O. rufipogon accessions and 15 accessions of outgroup species, were further called at the SNP sites from the Ssaha Pileup outputs. The genotype calls in 15 accessions of outgroup species were used to determine the ancestral states of SNPs in O. sativa and O. rufipogon. SNPs in coding regions, which were defined based on the gene models in the RAP-DB (release 2), were then annotated to be synonymous or non-synonymous for calculating the non-synonymous/synonymous ratio and dN/dS ratio. The genotype data set of the 1,529 rice accessions (1,083 O. sativa accessions and 446 O. rufipogon accessions) was generated on the basis of the calls in each rice accession. Seven sets of genome sequences, which included bacterial-artificial-chromosome-based Sanger sequences and high-coverage resequencing data, were used to assess the accuracy of the genotype data sets (Supplementary Table 6). The wild rice accessions with sequencing coverage >9 were selected to investigate the heterozygosity based on the overlapped reads that were aligned onto the reference sequence. For each accession, the proportion of heterozygosity genotypes was calculated at the polymorphic sites (Supplementary Table 3).
The software Haploview was used to calculate linkage disequilibrium with default settings, using SNPs with information in 446 O. rufipogon accessions50. Pairwise r2 was calculated for all the SNPs and then averaged across the whole genome. The matrix of pairwise genetic distance derived from simple SNP-matching coefficients was used to construct phylogenetic trees using the software PHYLIP51 (version 3.66). The software TreeView and MEGA5 were used for visualizing the phylogenetic trees. Principal component analysis of the SNPs was performed using the software EIGENSOFT52. The sequence diversity statistics () and the population-differentiation statistics (FST) were computed using a 100-kb window. The value of was calculated for each group in O. rufipogon and O. sativa, respectively, and the ratio of in the full population (or each clade) of O. rufipogon to that in the full population (or corresponding subspecies) of O. sativa was used to detect selective sweeps. The genomic regions where both O. rufipogon and O. sativa show a low level of genetic diversity were excluded for further analysis. To adopt appropriate thresholds to reduce the false-positive rate but also retain true selection signals, thresholds were chosen on the basis of both whole-genome permutation tests and signals at known loci. Permutation tests were performed to estimate the genome-wide type I error rate and determine the threshold to call selective sweeps (see Supplementary Information section 2 for details)53. The method cross-population extended haplotype homozygosity (XP-EHH) was also tested for detecting selective sweeps using the software xpehh54 (http://hgdp.uchicago.edu/Software/) (Supplementary Fig. 20). The genetic distance between two clades was computed based on the matrix of pairwise genetic distance, where the distance of all pairs of accessions from the two clades were retrieved and averaged. A custom Perl script was developed to plot all O. rufipogon accessions, using the public geographic information of world borders from the Thematic Mapping data set (version 0.3). The computational simulations under different demographic scenarios were performed using the program SFS_CODE40.
For the O. rufipogon population, approximately five seeds for each accession from the collection of wild rice were germinated and planted in the experimental field (in Sanya, China at N 18.65, E 109.80) from March 2011. The leaf sheath colour was observed and scored directly and the tiller angle was measured for each plant. The mapping population of 210 backcross inbred lines (BILs) and 61 chromosome segment substitution lines (CSSLs) was derived from a cross between O. sativa ssp. indica cv. Guangluai-4 and O. rufipogon accession W1943. The BILs were developed by one generation of backcross to Guangluai-4 followed with six generations of self-fertilization. The CSSLs were developed by five generations of backcross to Guangluai-4 followed with three generations of self-fertilization. Phenotyping was conducted in the experimental field (in Shanghai, China at N 31.13, E 121.28) from May to October, 2011. The fifteen traits that we phenotyped for this study include germination rate, tiller angle, heading date, stigma colour, the degree of stigma exsertion, plant height, panicle length, the degree of shattering, awn length, grain number per panicle, grain length, grain width, grain weight per 1,000 grains, hull colour and pericarp colour. The degree of stigma exsertion was scored based on the observation of ~20 randomly sampled spikelets of each line, on a scale of 13 (no, incomplete or complete exertion). Seed germination rate was measured by using mature seeds which were placed in a plastic Petri dishes kept at 30C in the dark for 48h5. Other traits were phenotyped and scored as described previously21, 22, 29.
For the genotype data set in O. rufipogon, genotypes of 446 O. rufipogon accessions were called specifically at the ~5 million SNP sites that were polymorphic in the O. rufipogon population. In the panel for GWAS, only the SNPs that have a minor allele frequency (MAF) of more than 5% and contain genotype calls of more than 100 accessions were left for subsequent imputation. The k-nearest neighbour algorithm-based imputation method was used for inferring missing calls21. The specificity of the genotype data set before and after imputation was assessed using three sets of genome sequences (Supplementary Table 6). Association analysis was conducted using the compressed mixed linear model55. The top five principle components were used as fixed effects and the matrix of genetic distance was used to model the variancecovariance matrix of the random effect. Permutation tests were used to define the threshold of association signals of the GWAS in the wild rice population. A total of 20 permutation analyses were performed (10 independent permutation tests for each of the two traits, sheath colour and tiller angle), which resulted in two association signals with the thresholds we set53. Hence, there were an average of 0.1 false positives (that is, totally two false positives in 20 permutation tests) in a single whole-genome scanning analysis. Simulation tests were used to compare the performance of GWAS between the populations of cultivated and wild rice.
Genomic DNA of each line in the mapping population was sequenced on the Illumina Genome Analyzer IIx, each to approximately 0.5 coverage. Both parents of the population, Guangluai-4 and W1943, were sequenced with at least 20 genome coverage, in a previous work21 and in this study, respectively. SNP identification between parents was conducted as described previously41. Genotype calling, recombination breakpoint determination and bin map construction was performed using the software SEG-Map (http://www.ncgr.ac.cn/software/SEG/). QTL analysis of the fifteen traits was conducted with the composite interval mapping (CIM) method implemented in the software Windows QTL Cartographer56 (version 2.5) with a window size of 10cM and a step size of 2cM. QTL with LOD value higher than 3.5 were called, of which the location was described according to its LOD peak location. The phenotypic effect (r2) of each QTL was computed using Windows QTL Cartographer. QTLs located within selective sweep regions were further used to associate the selected regions with their biological functions. It needs to be noted that we adopted a stringent threshold in the QTL calling (LOD>3.5), and the genomic regions with LOD ranging from 2.5 to 3.5 may include many minor QTLs (the threshold was set to 2.5 in most studies).
The genome of W1943 was assembled by using a custom pipeline integrating Phusion2 (clustering the raw reads into different groups)57 and Phrap (then assembling all the reads in each group to generate contigs)58. The N50 length of the entire assembly was calculated for the initial contigs with small contigs of <200bp excluded. All the full-length complementary DNA sequences46 of W1943 were aligned with the final assembly of W1943 genome sequence using the software GMAP59 (version 6) with the parameters -K 15000 and -k 0.97. The resulting contigs from whole-genome de novo assembly were anchored to the rice reference genome sequence (IRGSP4.0) using the software MUMmer60 (version 3).
Gene models of the genome of the wild rice W1943 were predicted using the software Fgenesh that was set for a monocot model61 (version 2.0). The resulting proteome of W1943 was compared with protein sequences in Rice Genome Annotation Project (version 7.0) using BLASTP with a cutoff of a minimum of 95% identity. Sequence variants, including SNPs, indels and imbalanced substitutions, were called using the diffseq program in the EMBOSS package62. Indels of large size were called from the alignment results of MUMmer. Effects of the sequence variants were predicted according to the gene models of Nipponbare in the RAP-DB (release 2) across the rice genome. For indels in genic regions and SNPs with large effect around the domestication loci, the effects were mainly based on the reference gene models.
The sequence reads of 1,083 O. sativa accessions, 446 O. rufipogon accessions and 15 outgroup accessions were then aligned against assembled genome sequences of W1943 using the same parameters with those against the reference Nipponbare genome sequences. Genotypes of each accession were called at all sequence variant sites (including SNPs, indels and imbalanced substitutions that were detected from assembled sequences), based on the alignment outputs against the two genome sequences. The allele frequencies at the sequence variant sites were calculated for each clade of O. sativa and O. rufipogon. In each clade, variant sites with information of less than 10 accessions (less than 2 for the outgroups) were then excluded for computing allele frequencies, namely no data available.
See more here:
A map of rice genome variation reveals the origin of cultivated rice
- ENCODE: Encyclopedia Of DNA Elements - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- 07.05.2010 - The Human Genome [ Coast To Coast AM ] - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- NOVA scienceNOW : 51 - Public Genomes, Algae Fuel, Mystery of the Gakkel Ridge, Yoky Matsuoka - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Vincent T. - Genome (Club Remix) - [Preview] - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Comparing The Human And Chimpanzee Genomes - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Whole Genome Sequencing and Its Impact on Clinical Care - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Carlos Bustamante -- "Reconstructing the Great Human Diasporas from Genome Variation Data" - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- 3 Sad Surprises: The Human Genome Project - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- The RFW interviews Genome - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Science Bulletins: Scientists Peer Inside "Superbug" Genome - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Genome : Live @ Smu's : June 3 2012 - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Inoki Genome Federation - Genome 19 - 04 02 2012 - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- THE HUMAN GENOME MUSIC PROJECT - CHROMOSOME 1 - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Genomic Medicine - Bruce Korf (2012) - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Human Genome's 'Blockbuster' Potential Undervalued in Bid GSK vs HGSI - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Announcing the Completion of the First Survey of the Entire Human Genome at the White House - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- DNA analysis Part I. Genomic Sequencing - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- The Genome Question: Moore vs. Jevons with Bud Mishra - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- Genome-Wide Association Studies - Karen Mohlke (2012) - Video [Last Updated On: September 7th, 2012] [Originally Added On: September 7th, 2012]
- New human genome research aids understanding of disease [Last Updated On: September 8th, 2012] [Originally Added On: September 8th, 2012]
- UNC Lineberger scientists lead definition of key lung cancer genome [Last Updated On: September 10th, 2012] [Originally Added On: September 10th, 2012]
- Illumina Announces Expedited Individual Genome Sequencing Service (IGS) [Last Updated On: September 11th, 2012] [Originally Added On: September 11th, 2012]
- Genome research given a boost with opening of bioscience facility [Last Updated On: September 11th, 2012] [Originally Added On: September 11th, 2012]
- Re-Imagining Our Genes: ENCODE Project Reveals Genome as an Information Processing System [Last Updated On: September 11th, 2012] [Originally Added On: September 11th, 2012]
- Illumina unveils upgraded genome sequence service [Last Updated On: September 12th, 2012] [Originally Added On: September 12th, 2012]
- US Personalized Cancer Genome Sequencing Market [Last Updated On: September 18th, 2012] [Originally Added On: September 18th, 2012]
- Yale maps “uncharted” genome regions [Last Updated On: September 18th, 2012] [Originally Added On: September 18th, 2012]
- Research and Markets: US Personalized Cancer Genome Sequencing Market [Last Updated On: September 19th, 2012] [Originally Added On: September 19th, 2012]
- 3Qs: New clues to unlocking the genome [Last Updated On: September 19th, 2012] [Originally Added On: September 19th, 2012]
- Oyster Genome Pries Open Mollusk Evolutionary Shell [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- Bangladeshi scientist decodes genome of deadly fungus [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- Oyster genome uncover the stress adaptation and complexity of shell formation [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- The oyster genome reveals stress adaptation and complexity of shell formation [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- Diseases of aging map to a few 'hotspots' on the human genome [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- GnuBIO Awarded $4.5 Million in Funding from the National Human Genome Research Institute to Develop Lower Cost Genome ... [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- Oyster genome mystery unravelled [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- Devangshu Datta: What's in a genome [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- Pacific Oyster Genome Shows Stress Adaptation And Complexity Of Shell Formation [Last Updated On: September 20th, 2012] [Originally Added On: September 20th, 2012]
- UNC Lineberger scientists lead cancer genome analysis of breast cancer [Last Updated On: September 24th, 2012] [Originally Added On: September 24th, 2012]
- Encoding the human genome [Last Updated On: September 24th, 2012] [Originally Added On: September 24th, 2012]
- Cancer genome analysis of breast cancer: Team identifies genetic causes and similarity to ovarian cancer [Last Updated On: September 24th, 2012] [Originally Added On: September 24th, 2012]
- Fungus genome map paves way for 'Snow White' jute variety [Last Updated On: September 24th, 2012] [Originally Added On: September 24th, 2012]
- New online, open access journal focuses on microbial genome announcements [Last Updated On: September 25th, 2012] [Originally Added On: September 25th, 2012]
- By Simply Sharing, Doctors Could Unlock the Genome's Potential [Last Updated On: September 25th, 2012] [Originally Added On: September 25th, 2012]
- Forget the Cloud—Knome Offers Genome Analysis in a Box [Last Updated On: September 28th, 2012] [Originally Added On: September 28th, 2012]
- BGI@CHOP Joint Genome Center to Offer Clinical Next-Generation Sequencing Services [Last Updated On: September 28th, 2012] [Originally Added On: September 28th, 2012]
- Holy Bat Virus! Genome Hints At Origin Of SARS-Like Virus [Last Updated On: September 29th, 2012] [Originally Added On: September 29th, 2012]
- Community Fundraising Effort Helps Researchers Sequence Parrot Genome [Last Updated On: September 29th, 2012] [Originally Added On: September 29th, 2012]
- UMass Med professors are sleuths of the genome [Last Updated On: September 30th, 2012] [Originally Added On: September 30th, 2012]
- Knome Introduces the knoSYS™100; First Plug-and-Play Human Genome Interpretation System [Last Updated On: September 30th, 2012] [Originally Added On: September 30th, 2012]
- First large scale trial of whole-genome cancer testing for clinical decision-making reported [Last Updated On: October 1st, 2012] [Originally Added On: October 1st, 2012]
- Should You Get Your Genome Mapped? [Last Updated On: October 1st, 2012] [Originally Added On: October 1st, 2012]
- Surprising differences between apples and pears [Last Updated On: October 2nd, 2012] [Originally Added On: October 2nd, 2012]
- 50-Hour Whole Genome Sequencing Provides Rapid Diagnosis for Children With Genetic Disorders [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- Genome analysis promises hope for breast cancer patients [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- Genome Alberta Welcomes Alberta Minister of Enterprise and Advanced Education, Stephen Khan and Federal Minister of ... [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- Fifty-hour whole genome sequencing provides rapid diagnosis for children with genetic disorders [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- Will Low-Cost Genome Sequencing Open 'Pandora's Box'? [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- Genome testing could help individualize treatments [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- Would you get your genome tested? [Last Updated On: October 3rd, 2012] [Originally Added On: October 3rd, 2012]
- The Genome — a Pandora's Box? [Last Updated On: October 4th, 2012] [Originally Added On: October 4th, 2012]
- Fast genome test could help sick newborns [Last Updated On: October 4th, 2012] [Originally Added On: October 4th, 2012]
- In-Depth Genome Analysis Moves Toward The Hospital Bed [Last Updated On: October 5th, 2012] [Originally Added On: October 5th, 2012]
- Your Verdict On Getting A Genome Test? Bring It On [Last Updated On: October 6th, 2012] [Originally Added On: October 6th, 2012]
- Genome-wide study identifies 8 new susceptibility loci for atopic dermatitis [Last Updated On: October 7th, 2012] [Originally Added On: October 7th, 2012]
- Genome-wide study identifies eight new susceptibility loci for atopic dermatitis [Last Updated On: October 7th, 2012] [Originally Added On: October 7th, 2012]
- Genome interpreter vies for place in clinical market [Last Updated On: October 10th, 2012] [Originally Added On: October 10th, 2012]
- The $1,000 Genome: A Bait and Switch? [Last Updated On: October 10th, 2012] [Originally Added On: October 10th, 2012]
- Mount Sinai School of Medicine Offers First-Ever Course with Whole Genome Sequencing [Last Updated On: October 10th, 2012] [Originally Added On: October 10th, 2012]
- First whole genome sequencing of multiple pancreatic cancer patients has been outlined [Last Updated On: October 11th, 2012] [Originally Added On: October 11th, 2012]
- Cheap genome sequences demand new rules on privacy [Last Updated On: October 11th, 2012] [Originally Added On: October 11th, 2012]
- UConn Gets Grant For Genome Research [Last Updated On: October 11th, 2012] [Originally Added On: October 11th, 2012]
- Inconsistent Genome Privacy Laws Need Toughening, Panel Says [Last Updated On: October 12th, 2012] [Originally Added On: October 12th, 2012]
- US panel calls for stronger privacy for genome data [Last Updated On: October 12th, 2012] [Originally Added On: October 12th, 2012]
- Genome Canada Board Appoints New Chair [Last Updated On: October 12th, 2012] [Originally Added On: October 12th, 2012]
- The $1,000 Genome Is Almost Here- Are We Ready? [Last Updated On: October 15th, 2012] [Originally Added On: October 15th, 2012]
- Global genome effort seeks genetic roots of disease [Last Updated On: October 31st, 2012] [Originally Added On: October 31st, 2012]
- Massive encyclopedia helps explain how the human genome works [Last Updated On: October 31st, 2012] [Originally Added On: October 31st, 2012]
- Genome evolution and carbon dioxide dynamics [Last Updated On: October 31st, 2012] [Originally Added On: October 31st, 2012]
- This Week in Genome Research [Last Updated On: October 31st, 2012] [Originally Added On: October 31st, 2012]