
Category Archives: Genome

University Hospitals Leuven in Belgium Outlines their Menu Expansion Plans for Optical Genome Mapping as One of their Primary Analyses in Leukemias…

Posted: December 23, 2021 at 10:18 pm

SAN DIEGO, Dec. 23, 2021 (GLOBE NEWSWIRE) -- Bionano Genomics, Inc. (BNGO), pioneer of optical genome mapping (OGM) solutions on the Saphyr system and provider of the leading software solutions for visualization, interpretation and reporting of genomic data, today announced that University Hospitals Leuven in Belgium, after previously receiving its accreditation from the Belgian Accreditation Body (BELAC) for using OGM in analysis of acute lymphoblastic leukemia (ALL), is expanding its BELAC-accredited menu to include acute myeloid leukemia (AML), chronic lymphocytic leukemia (CLL) and facioscapulohumeral muscular dystrophy (FSHD).

"With the flexibility we now have as an accredited laboratory by BELAC, our teams can develop OGM-based assays addressing hematological malignancies without the need for a new audit," said Barbara Dewaele, PhD, supervisor of the Laboratory for Genetics of Malignant Disorders at University Hospitals Leuven. "We are excited to move forward using this valuable tool to analyze the genomes of patients with cancer and rare diseases."

At the European Cytogenomics Conference in July 2021, Dr. Dewaele shared the results of implementing an OGM-based assay for ALL patients that her team developed with Bionano's Saphyr system. As presented by Dr. Dewaele and her team, compared to their existing workflow, the new workflow including OGM as a primary analysis method reduced the number of fluorescence in-situ hybridization probes used by 90% and eliminated the need for multiplexed ligation polymorphism assays. In the new workflow, OGM is complemented with karyotyping to detect ploidy changes and the presence of small subclones. This transformation resulted in a turnaround time that was 14 days faster, a cost savings of approximately 50% and higher overall success rates in finding pathogenic variants in samples.

In parallel, as part of their menu expansion efforts, and under the direction of Dr. Valérie Race, Center for Human Genetics at University Hospitals Leuven, a validation of Bionano's EnFocus FSHD tool will be conducted on a prospective cohort of FSHD samples to confirm OGM's capability to accurately measure the length of D4Z4 repeat arrays and to assess the reproducibility and repeatability of the workflow. Preliminary results were presented at the European Society of Human Genetics conference in August 2021 and indicated that OGM can be a powerful and robust technique for FSHD testing in genetic diagnostic laboratories, providing results that are concordant with the current gold standard, Southern blot analysis, in a substantially simpler workflow that does not use radioactivity.


Dr. Dewaele reported that she and her colleagues have doubled their weekly sample volume relative to when they first started using their Saphyr system and believe they are on track to reach their goal of 500 samples per year with this instrument. The teams at University Hospitals Leuven believe that the time and cost savings from using OGM-based assays could be a competitive advantage relative to traditional techniques. OGM is also complementary to many of the tools used in typical molecular pathology and cytogenomics labs and, as a result, it can aid the interpretation of results from assays such as karyotyping, which can be used to confirm OGM findings.

Erik Holmlin, PhD, President and CEO of Bionano Genomics, commented, "We are impressed at the drive and persistence of Dr. Dewaele and all of the teams at Leuven, which has enabled the hospital to expand its lab testing portfolio. We are thrilled that University Hospitals Leuven has determined its plans for menu expansion, which are facilitated by the accreditation and formal confirmation letter received from BELAC." "We believe that the path followed by Dr. Dewaele is indicative of what other labs can follow along the way to making OGM an essential and widely used method in clinical genomics research," said Dr. Holmlin. "OGM can allow new workflows that are faster and provide answers to questions quickly, which may allow for treatment decisions to be taken sooner. Since OGM has been shown to find clinically relevant variants that other techniques may miss, it may also provide answers to questions researchers may not know they had about these specific cancers and genetic diseases."

Dr. Barbara Dewaele will be presenting at Bionano's Symposium on January 11, 2022. At the Symposium, more than 25 esteemed speakers from around the world will present their latest scientific findings using Bionano's Saphyr system for OGM in constitutional cytogenomics, hematologic malignancies, solid tumors, and in combination with next-generation sequencing. A link to register for the Bionano Genomics 2022 Symposium is available at https://www.labroots.com/ms/virtual-event/bngo2022

About Bionano Genomics

Bionano is a provider of genome analysis solutions that can enable researchers and clinicians to reveal answers to challenging questions in biology and medicine. The Company's mission is to transform the way the world sees the genome through OGM solutions, diagnostic services and software. The Company offers OGM solutions for applications across basic, translational and clinical research. Through its Lineagen business, the Company also provides diagnostic testing for patients with clinical presentations consistent with autism spectrum disorder and other neurodevelopmental disabilities. Through its BioDiscovery business, the Company also offers an industry-leading, platform-agnostic software solution, which integrates next-generation sequencing and microarray data designed to provide analysis, visualization, interpretation and reporting of copy number variants, single-nucleotide variants and absence of heterozygosity across the genome in one consolidated view. For more information, visit http://www.bionanogenomics.com, http://www.lineagen.com or http://www.biodiscovery.com.

Forward-Looking Statements of Bionano Genomics

This press release contains forward-looking statements within the meaning of the Private Securities Litigation Reform Act of 1995. Words such as "may," "will," "expect," "plan," "anticipate," "estimate," "intend" and similar expressions (as well as other words or expressions referencing future events, conditions or circumstances) convey uncertainty of future events or outcomes and are intended to identify these forward-looking statements. Forward-looking statements include statements regarding our intentions, beliefs, projections, outlook, analyses or current expectations concerning, among other things: the inability or delays in the University Hospitals Leuven to expand its menu; the inability for other labs to utilize the steps taken by University Hospitals Leuven to make OGM a widely used method; the ability for University Hospitals Leuven to continue processing the increased volume of samples; OGM's ability to provide new, faster workflows; OGM's ability to find clinically relevant variants that other techniques miss and to provide answers to questions not yet asked; Dr. Dewaele's ability to present at Bionano's Symposium; and the impact of the expansion of our commercial leadership team, including our expectations regarding the growth of Saphyr and our ability to bolster customer support and experience globally. Each of these forward-looking statements involves risks and uncertainties. Actual results or developments may differ materially from those projected or implied in these forward-looking statements. 
Factors that may cause such a difference include the risks and uncertainties associated with: the impact of the COVID-19 pandemic on our business and the global economy; general market conditions; changes in the competitive landscape and the introduction of competitive products, technologies or improvements in existing technologies; failure of OGM to accurately and consistently perform as observed by University Hospitals Leuven or others; subsequent results could negate the results observed by University Hospitals Leuven or others; changes in our strategic and commercial plans; our ability to obtain sufficient financing to fund our strategic plans and commercialization efforts; the ability of medical and research institutions to obtain funding to support adoption or continued use of our technologies; and the risks and uncertainties associated with our business and financial condition in general, including the risks and uncertainties described in our filings with the Securities and Exchange Commission, including, without limitation, our Annual Report on Form 10-K for the year ended December 31, 2020 and in other filings subsequently made by us with the Securities and Exchange Commission. All forward-looking statements contained in this press release speak only as of the date on which they were made and are based on management's assumptions and estimates as of such date. We do not undertake any obligation to publicly update any forward-looking statements, whether as a result of the receipt of new information, the occurrence of future events or otherwise.

CONTACTS

Company Contact:
Erik Holmlin, CEO
Bionano Genomics, Inc.
+1 (858) 888-7610
eholmlin@bionanogenomics.com

Investor Relations:
Amy Conrad
Juniper Point
+1 (858) 366-3243
amy@juniper-point.com

Media Relations:
Michael Sullivan
Seismic
+1 (503) 799-7520
michael@teamseismic.com


SARS-CoV-2 and Omicron: the need to optimise genome surveillance and tracing – The BMJ

Posted: at 10:18 pm

Dear Editor

The advent of the B.1.1.529 variant of SARS-CoV-2, now called Omicron, has significant implications for the course of the COVID-19 pandemic.[1] While the questions on transmissibility, severity of infection and vaccine effectiveness are being answered, the testing strategy for Omicron holds a pivotal role in the pandemic response, requiring urgent attention and optimization.

Whole genome sequencing (WGS) has been crucial in studying the evolution and genetic diversity of SARS-CoV-2 during the pandemic.[2] WGS also played an important role in identifying the new variant Omicron, which was categorized as a variant of concern (VOC) by WHO. Although WGS is the gold standard for genomic surveillance, it is not feasible to sequence every suspected case or contact of Omicron.[3]

Earlier, the Alpha variant was found to produce S gene target failure (SGTF) in RT-PCR, a signal that proved to have considerable diagnostic value.[4,5] The recent South African investigations that led to the announcement of the new VOC Omicron also reported that SGTF was observed in more than 50% of all tested specimens, reviving the use of SGTF in PCR assays as a proxy for the variant.[6] Notably, for early detection of the Omicron variant, WHO recommends using diagnostic test kits containing two confirmatory genes, at least one of which is the 'S' gene. As an internal control gene, the kits should ideally include RNaseP, Beta Actin, or any other human housekeeping gene.[1] Earlier, Thermo Fisher Scientific confirmed that its TaqPath Covid-19 test kits can detect Omicron variants with high accuracy. The TaqPath Covid-19 assays identify three gene targets from the orf1a/b, S and N regions of the virus to confirm SARS-CoV-2 infections.[7]

Therefore, SGTF during RT-PCR with kits that detect the S-gene has been used as a proxy test for the Omicron variant pending sequencing confirmation. Moreover, because several nations currently lack sufficient sequencing capacity, SGTF has been employed to screen suspected Omicron cases for WGS. The SGTF growth rate, which was used with the Alpha variant [5], can serve as a suitable surrogate for the level of Omicron community transmission.
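
The screening logic described above can be sketched in a few lines of Python. This is only an illustration, not the interpretation rule of any commercial kit: the function names and the Ct cutoff of 37 are assumptions made for the example. A sample is flagged as SGTF when the ORF1ab and N targets amplify but the S target does not, and the SGTF share among positive samples is the quantity one would follow over time as a surrogate for community transmission.

```python
from typing import Iterable, Optional

# Hypothetical cutoff: a target counts as detected when its Ct value is at or
# below this threshold. Real assays define their own interpretation rules.
CT_CUTOFF = 37.0


def detected(ct: Optional[float], cutoff: float = CT_CUTOFF) -> bool:
    """Return True when a target amplified (Ct present and within the cutoff)."""
    return ct is not None and ct <= cutoff


def classify(orf1ab_ct: Optional[float],
             s_ct: Optional[float],
             n_ct: Optional[float]) -> str:
    """Classify one three-target RT-PCR result (ORF1ab, S, N)."""
    orf, s, n = detected(orf1ab_ct), detected(s_ct), detected(n_ct)
    if orf and n and not s:
        return "SGTF"          # S dropout: candidate for sequencing
    if orf and n and s:
        return "S-positive"    # all three targets detected
    if not (orf or s or n):
        return "negative"      # nothing amplified
    return "inconclusive"      # partial pattern, needs repeat testing


def sgtf_fraction(results: Iterable[str]) -> float:
    """Share of SGTF calls among samples positive on both ORF1ab and N."""
    positives = [r for r in results if r in ("SGTF", "S-positive")]
    if not positives:
        return 0.0
    return sum(r == "SGTF" for r in positives) / len(positives)


calls = [
    classify(22.0, None, 23.5),   # S target missing
    classify(21.0, 20.5, 22.0),   # all targets present
    classify(None, None, None),   # no amplification
    classify(24.0, None, 25.0),   # S target missing
]
print(calls, sgtf_fraction(calls))
```

An "S-positive" call in this sketch does not rule out Omicron; it only means the sample would not be picked up by an SGTF-based screen.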

However, the SARS-CoV-2 Omicron (B.1.1.529) lineage is now being proposed to be split into two sub-lineages: BA.1 and BA.2.[8]

While both lineages share a number of common defining mutations and appear to be co-circulating, the newly recognised BA.2 sub-lineage does not carry the Spike del69-70 mutation, which may hinder the use of commercially available PCR tests to diagnose Omicron based on S-gene target failure.[8,9]

In fact, recently sequenced cases belonging to the BA.2 sub-lineage have not been flagged by the aforementioned SGTF approach.[9]

Therefore, apart from the WHO's recommendation that a subset of SARS-CoV-2 confirmed cases be sampled for WGS, cases from unique transmission episodes, unexpected disease presentation or severity, vaccination breakthrough, critically ill patients, and overseas travellers should all be included, subject to local sequencing capacity.[1]

More importantly, governments across the world will need to optimize the RT-PCR kits and their supply chain and adopt a balanced sampling strategy for WGS to confirm the B.1.1.529 variant.

In brief, although SGTF represents an effective testing strategy to contain Omicron through targeted contact tracing and isolation, the rapid evolution of the variants and the unfolding data regarding their genetic profile need to be fully incorporated into the diagnostic tools if we are to succeed in our quest to conquer the idiosyncrasies of SARS-CoV-2.

All authors have contributed equally

Conflict of Interest: None

Funding: None

References

1. WHO. Enhancing Readiness for Omicron (B.1.1.529): Technical Brief and Priority Actions for Member States. Nov 28, 2021. https://www.who.int/publications/m/item/enhancing-readiness-for-omicron-(b.1.1.529)-technical-brief-and-priority-actions-for-member-states (accessed on 03/12/2021)
2. Umair M, Ikram A, Salman M, Khurshid A, Alam M, Badar N, Suleman R, Tahir F, Sharif S, Montgomery J, Whitmer S, Klena J. Whole-genome sequencing of SARS-CoV-2 reveals the detection of G614 variant in Pakistan. PLoS One. 2021 Mar 23;16(3):e0248371. doi: 10.1371/journal.pone.0248371.
3. Liu T, Chen Z, Chen W, Chen X, Hosseini M, Yang Z, Li J, Ho D, Turay D, Gheorghe CP, Jones W, Wang C. A benchmarking study of SARS-CoV-2 whole-genome sequencing protocols using COVID-19 patient samples. iScience. 2021 Aug 20;24(8):102892. doi: 10.1016/j.isci.2021.102892.
4. Migueres M, Lhomme S, Trémeaux P, Dimeglio C, Ranger N, Latour J, Dubois M, Nicot F, Miedouge M, Mansuy JM, Izopet J. Evaluation of two RT-PCR screening assays for identifying SARS-CoV-2 variants. J Clin Virol. 2021 Oct;143:104969. doi: 10.1016/j.jcv.2021.104969.
5. Brown KA, Gubbay J, Hopkins J, Patel S, Buchan SA, Daneman N, et al. S-Gene Target Failure as a Marker of Variant B.1.1.7 Among SARS-CoV-2 Isolates in the Greater Toronto Area, December 2020 to March 2021. JAMA. 2021 May 25;325(20):2115-2116. doi: 10.1001/jama.2021.5607.
6. European Centre for Disease Prevention and Control. Implications of the emergence and spread of the SARS-CoV-2 B.1.1.529 variant of concern (Omicron), for the EU/EEA. 26 November 2021. ECDC: Stockholm; 2021.
7. Medical Device Network. Thermo Fisher's Covid-19 tests can detect Omicron variant. Nov 30, 2021. https://www.medicaldevice-network.com/news/thermo-fishers-covid-19-tests... (accessed on 03/12/2021)
8. https://www.gisaid.org, accessed on December 22nd
9. European Centre for Disease Prevention and Control/World Health Organization Regional Office for Europe. Methods for the detection and characterisation of SARS-CoV-2 variants - first update. 20 December 2021. Stockholm/Copenhagen; ECDC/WHO Regional Office for Europe: 2021


Genomic sequencing: Here’s how researchers identify omicron and other COVID-19 variants – The Conversation AU

Posted: December 22, 2021 at 12:44 am

How do scientists detect new variants of the virus that causes COVID-19? The answer is a process called DNA sequencing.

Researchers sequence DNA to determine the order of the four chemical building blocks, or nucleotides, that make it up: adenine, thymine, cytosine and guanine. The millions to billions of these building blocks paired up together collectively make up a genome that contains all the genetic information an organism needs to survive.

When an organism replicates, it makes a copy of its entire genome to pass on to its offspring. Sometimes errors in the copying process can lead to mutations in which one or more building blocks are swapped, deleted or inserted. This may alter genes, the instruction sheets for the proteins that allow an organism to function, and can ultimately affect the physical characteristics of that organism. In humans, for example, eye and hair color are the result of genetic variations that can arise from mutations. In the case of the virus that causes COVID-19, SARS-CoV-2, mutations can change its ability to spread, cause infection or even evade the immune system.

We are both biochemists and microbiologists who teach about and study the genomes of bacteria. We both use DNA sequencing in our research to understand how mutations affect antibiotic resistance. The tools we use to sequence DNA in our work are the same ones scientists are using right now to study the SARS-CoV-2 virus.

One of the earliest methods scientists used in the 1970s and 1980s was Sanger sequencing, which involves cutting up DNA into short fragments and adding radioactive or fluorescent tags to identify each nucleotide. The fragments are then put through an electric sieve that sorts them by size. Compared with newer methods, Sanger sequencing is slow and can process only relatively short stretches of DNA. Despite these limitations, it provides highly accurate data, and some researchers are still actively using this method to sequence SARS-CoV-2 samples.

Since the late 1990s, next-generation sequencing has revolutionized how researchers collect data on and understand genomes. Known as NGS, these technologies are able to process much higher volumes of DNA at the same time, significantly reducing the amount of time it takes to sequence a genome.

There are two main types of NGS platforms: second-generation and third-generation sequencers.

Second-generation technologies are able to read DNA directly. After DNA is cut up into fragments, short stretches of genetic material called adapters are added to give each nucleotide a different color. For example, adenine is colored blue and cytosine is colored red. Finally, these DNA fragments are fed into a computer and reassembled into the entire genomic sequence.

Third-generation technologies like the Nanopore MinION directly sequence DNA by passing the entire DNA molecule through an electrical pore in the sequencer. Because each pair of nucleotides disrupts the electrical current in a particular way, the sequencer can read these changes and upload them directly to a computer. This allows clinicians to sequence samples at point-of-care clinical and treatment facilities. However, Nanopore sequences smaller volumes of DNA compared with other NGS platforms.

Though each class of sequencer processes DNA in a different way, they can all report the millions or billions of building blocks that make up genomes in a short time from a few hours to a few days. For example, the Illumina NovaSeq can sequence roughly 150 billion nucleotides, the equivalent of 48 human genomes, in just three days.
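
The "48 human genomes" figure can be checked with simple arithmetic. The sketch below assumes a human genome length of about 3.1 billion nucleotides, a value not stated in the article; the output and run time are the ones quoted above.

```python
# Assumption: one human genome is about 3.1 billion nucleotides long.
HUMAN_GENOME_NT = 3.1e9

run_output_nt = 150e9  # roughly 150 billion nucleotides per run (from the text)
run_days = 3           # "just three days" (from the text)

# How many genome-lengths of sequence one run produces.
genome_equivalents = run_output_nt / HUMAN_GENOME_NT

# Average throughput per day over the run.
nt_per_day = run_output_nt / run_days

print(f"{genome_equivalents:.0f} genome equivalents per run")
print(f"{nt_per_day:.1e} nucleotides per day")
```

Note that "equivalent" here means total bases divided by genome length, which is a single pass over each genome. In practice every position is read many times over, so the number of genomes fully sequenced in one run is smaller.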

So why is genomic sequencing such an important tool in combating the spread of SARS-CoV-2?

Rapid public health responses to SARS-CoV-2 require intimate knowledge of how the virus is changing over time. Scientists have been using genome sequencing to track SARS-CoV-2 almost in real time since the start of the pandemic. Millions of individual SARS-CoV-2 genomes have been sequenced and housed in various public repositories like the Global Initiative on Sharing Avian Influenza Data and the National Center for Biotechnology Information.

Genomic surveillance has guided public health decisions as each new variant has emerged. For example, sequencing the genome of the omicron variant allowed researchers to detect over 30 mutations in the spike protein that allows the virus to bind to cells in the human body. This makes omicron a variant of concern, as these mutations are known to contribute to the virus's ability to spread. Researchers are still learning about how these mutations might affect the severity of the infections omicron causes, and how well it's able to evade current vaccines.

Sequencing also has helped researchers identify variants that spread to new regions. Upon receiving a SARS-CoV-2 sample collected from a traveler who returned from South Africa on Nov. 22, 2021, researchers at the University of California, San Francisco, were able to detect omicron's presence in five hours and had nearly the entire genome sequenced in eight. Since then, the Centers for Disease Control and Prevention has been monitoring omicron's spread and advising the government on ways to prevent widespread community transmission.

The rapid detection of omicron worldwide emphasizes the power of robust genomic surveillance and the value of sharing genomic data across the globe. Understanding the genetic makeup of the virus and its variants gives researchers and public health officials insights into how to best update public health guidelines and maximize resource allocation for vaccine and drug development. By providing essential information on how to curb the spread of new variants, genomic sequencing has saved and will continue to save countless lives over the course of the pandemic.


The genomes of 204 Vitis vinifera accessions reveal the origin of European wine grapes – Nature.com

Posted: at 12:44 am

Sequence variation in 204 V. vinifera genomes and ten outgroup species

We resequenced 122 accessions of V. vinifera (including sativa, sylvestris and feral) and two Vitis species at a genome coverage ranging from 8-fold to 90-fold (average 26-fold) and obtained archived sequence reads for another 82 vinifera accessions and 8 other grape species (Supplementary Data1 and Supplementary Note1). Using uniquely mapped paired-end reads, 7,364,288 SNPs were identified in the non-repetitive regions of the cultivated grape genomes (sativa, Supplementary Fig.1a), of which 596,150 were private to single accessions. Forty-eight bona fide wild and 33 feral vinifera added 492,256 additional SNPs (Supplementary Note2). Validation of SNP calls showed low error rates in genotyping for homozygous and heterozygous sites (0.00013% and 0.01019%, respectively) (Supplementary Methods12, Supplementary Note3, and Supplementary Figs.2–3).

We used a subset of 5,925,766 polymorphic sites that were informative in the outgroup Muscadinia rotundifolia as well as in a set of eight American and Asian Vitis species to determine the mutation direction, the unfolded site frequency spectrum, and the strength and direction of selective pressure in cultivated varieties by mutation age and mutation type (Supplementary Note4 and Supplementary Fig.1b, c). A relatively large proportion of the SNPs (11.9%) that are polymorphic in vinifera predate the speciation event that led to the creation of vinifera (trans-specific SNPs), suggesting a largely incomplete lineage sorting in the genus Vitis (Supplementary Fig.4). Only 8.2% of the SNPs found in sylvestris are not present in sativa, consistent with a high level of shared variation between wild and cultivated grapes that is expected under a scenario of extensive gene flow and/or with a moderate bottleneck experienced during the domestication process. The cultivated varieties have a nucleotide diversity of π = 7.29 × 10⁻³ and highly heterozygous genomes (Supplementary Fig.5), with a maximum of 97.1% of total genome length and 96.8% of genes in heterozygous condition in Sauvignon Blanc (Supplementary Fig.6), despite a mating system dominated by cleistogamy and self-compatibility. The nucleotide diversity in the wild accessions was equal to 3.80 × 10⁻³. While this diversity value may be underestimated due to an incomplete sampling of all the diversity available in sylvestris, it still clearly shows that the domestication events that led to the creation of the cultivated varieties, unlike in other fruit crops19,20,21, did not lead to significant genome-wide losses of genetic diversity as a consequence of a major genetic bottleneck, confirming and complementing previous estimates based on haplotype diversity1.
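
For readers unfamiliar with the statistic, nucleotide diversity is the average number of differences per site between two sequences drawn from the sample. The Python sketch below implements that textbook definition on a toy alignment. It is not the estimator or pipeline used in the study, which worked from genome-wide genotype calls; the function name and the toy sequences are made up for illustration.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all pairs of aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")

    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(x, y))
        for x, y in pairs
    )
    return total_diffs / (len(pairs) * length)


# Toy alignment: four haplotypes, ten sites, two variable positions.
toy = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTACGTAC",
]
pi = nucleotide_diversity(toy)
print(pi)
```

In this toy case the six pairs differ at 6 positions in total across 60 compared sites, giving a value of about 0.1. The values reported above are on the order of 10⁻³, meaning that two randomly chosen sequences differ at only a few sites per thousand.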

We used two different data sets and two different approaches to derive inferences on the history, population structure and geographic differentiation in cultivated grapes. We first used a model-based clustering approach22 (implemented in the software ADMIXTURE) using whole genome sequence data of 203 accessions of vinifera (after removing accession KE06 from this specific analysis, following the classification of this individual as a feral escapee done by Liang and coworkers18) to infer their genetic ancestry and a statistical model developed by Pickrell and Pritchard23 (implemented in the software TreeMix) to infer splits and gene flow between cultivated and wild grapes (Fig.1a). We then extended the ancestry and gene flow analyses to an additional set of 1241 accessions (hereafter referred to as diversity panel), using a set of 6357 SNPs in common between the whole genome sequenced accessions and the publicly available SNP profiles of the additional accessions (Fig.1b).

Fig.1. a Maximum likelihood (ML) tree with four groups of cultivated varieties (Supplementary Fig.10) and four groups of wild accessions (Supplementary Fig.7). Ancestry composition and group sizes are illustrated in Supplementary Fig.10. b ML tree with nine groups of cultivated varieties and seven populations of sylvestris. Ancestry composition, group sizes, explained variance and the description of sylvestris–sylvestris admixture are given in Supplementary Fig.22. a, b Migration events are indicated by colored arrows. The color scale shows the migration weight. The scale bar shows ten times the average standard error of the estimated entries in the sample covariance matrix. Bold lines indicate the sylvestris branches of the tree. Trees represent random trees and numbers represent bootstrap support values above 70% (100 iterations) before adding migrations. Support for the migration events and the resulting predictive model is given in Supplementary Figs.20 and 22c, Supplementary Table1, and Supplementary Data2.

When we applied the model-based clustering to the species germplasm WGS data (n=203, Supplementary Fig.7), with K=2 we separated eastern and western ancestry. With >0.85 membership proportion, the western ancestry component defined one group that includes exclusively accessions of western sylvestris and western feral grapes. The eastern ancestry component defined the other group that includes eastern wild and feral grapes, Caucasian wine grapes, table grapes, and cultivated varieties from across Europe's three great southern peninsulas (Iberian, Italian, and Balkan). The rest of the cultivated germplasm, represented by varieties that today are grown in Alpine countries, appears to be the result of admixture. The statistics to estimate the number of ancestral populations suggested that eastern and western ancestry is the main divide, according to Evanno's test (Supplementary Fig.8). According to the cross-validation error, K=3 and 4 provide the best predictive model, with K=3 showing only slightly higher cv-error than K=4 (Supplementary Fig.9). The existence of up to four ancestry components in V. vinifera was also considered plausible by Liang and coworkers in a broader context of the genus Vitis18. With both K=3 and 4, we confirmed the two main components that are dominant in wild grapes, one (yellow, hereafter referred to as W1), which is dominant in western sylvestris, and one (blue, hereafter referred to as W2) that is dominant in eastern sylvestris (Supplementary Fig.7). In consideration of the fact that the aberrans forms of sylvestris should have gone extinct, both these components, with different proportions, should correspond to the sylvestris typica ancestry in the East and in the West. Unlike the W1 component, which is found in both eastern and western wine grapes but is predominant only in wild and feral accessions, the W2 component is predominant in Caucasian wine grapes, in table grapes and in European cultivated varieties east of 40°E longitude. 
With K=4, two additional components (orange and gray) are predicted, with one (orange, hereafter referred to as C1) predominantly found with the W2 component in table grapes, and the other (gray, hereafter referred to as C2) almost exclusively found in wine grapes. These two components, which are detectable in some eastern feral grapes but not in wild grape samples, could be derived from extinct forms of sylvestris (i.e., aberrans) and deserve special attention to better understand the structure of genetic diversity in the cultivated compartment. The C2 component is detectable in cultivated varieties of the Muscat family as well as in European wine grapes. The C1 component is most frequently associated with table and wine grapes from around the Mediterranean Basin.

In order to test which scenario of ancestral populations is more consistent with the taxonomic treatment of the cultivated compartment, we applied the model-based clustering to the cultivated germplasm alone (n=123, Supplementary Fig.10). With K=2, we separated one population containing only wine grape varieties from the Alpine countries, which includes 29.3% of all accessions and corresponds to Negrul's occidentalis, and one population that consists of 31.7% of all accessions and includes Caucasian wine grapes, table grapes and European varieties from the Southern Balkans and Iberian Peninsula. With K=2, varieties that are typical of the ecogeographical groups pontica and orientalis clustered in the same ancestral population. K=3 generated one population corresponding to occidentalis, including 20.3% of all accessions, and a divide between one population of table grapes and Caucasian wine grapes, including 14.6% of all accessions, and one population that includes varieties from the entire Balkans (including insular Greece), from Southern Italy and from the Iberian Peninsula, representing 13.0% of all accessions. Only K=4 generated a divide between table grapes and Caucasian wine grapes into two ancestral populations that correspond to Negrul's orientalis and pontica georgica, respectively. The other two ancestral populations with K=4 were represented by wine grapes from the Alpine countries (occidentalis) and by varieties from the Balkans, Greece and Southern Italy, largely corresponding to Negrul's pontica balcanica. The adoption of K=4 (Supplementary Figs.10–12) allowed us to obtain groups that reflect, both in terms of number and composition, the divide and the stratification postulated by Negrul (orientalis, pontica georgica, pontica balcanica, occidentalis) and widely accepted in grapevine taxonomy.

TreeMix provided strong evidence for a single eastern origin for the entirety of the cultivated germplasm as well as for an origin of European wine grapes from introgression of western sylvestris individuals into the domesticated lineage of orientalis grapes (Fig.1a). The inclusion of this single event of admixture (Supplementary Fig.13 and Supplementary Note5) in the model allowed 98.7% of the variance in relatedness among populations to be explained. TreeMix also suggested the occurrence of gene flow in the opposite direction, going from cultivated accessions into wild populations as a consequence of the migration of intermediate forms between wine and table grapes into western wild populations. With all the events of admixture shown in Fig.1a, which were confirmed by a 3-population test (Supplementary Table1), the proportion of the variance in the predicted relatedness among populations explained by the model increased to 99.8% and was resilient to different data treatments (Supplementary Figs.14–20 and Supplementary Note6). According to TreeMix analysis, Mediterranean wine grapes from the Balkans and Magna Graecia, largely corresponding to pontica balcanica, appear to be genetically more similar to orientalis ancestors (table grapes) than to pontica georgica ancestors (Caucasian wine grapes), in partial disagreement with Negrul's hypothesis (Fig.1a). Principal component analysis, haplotype-based pairwise genetic distance matrices and pedigree networks (see below) lend further support in favor of this statement.

The analysis of the extended set of accessions in the diversity panel provided historical and geographical resolution to this reconstruction. With four ancestry components (Supplementary Fig. 21), we defined eight groups of well differentiated accessions in eight broad geographic areas (Supplementary Fig. 22), excluding varieties with highly admixed ancestry. Caucasian wine grapes were confirmed to be distinct from all other cultivated varieties and closely related to the local wild accessions. Table grapes as well as wine grape varieties from the Black Sea Basin, the Middle East and the Mediterranean Basin have prevalently W2 and C1 ancestry, with an increase of the C1 component going from east to west (Supplementary Fig. 22). Cultivated varieties across Europe are characterized by the increasing presence of W1 and C2 ancestry components going from south to north (Supplementary Fig. 22). TreeMix analysis and three-population test (Supplementary Data 2) suggested that the presence of W1 ancestry is to be attributed to admixture events between Mediterranean lineages of sylvestris, either extinct or not captured by our sample, and introduced varieties most similar to those grown today in the Balkans and Magna Graecia (Fig. 1b). We found the highest W1 western sylvestris ancestry proportion in old local varieties considered today as autochthonous in the central and northern Italian peninsula, such as Enantio, Lambrusco di Sorbara, Raboso Piave, Fumat, Greco di Tufo, Aglianico, Verduzzo, Welschriesling, and in the widely grown variety Cabernet Franc that is similar to wild forms still present in the Atlantic Pyrenees [24] (Supplementary Fig. 7). We collectively refer to this germplasm as well as to hybrid forms classified in other papers under the designation of vigne sauvage faux (false sylvestris) as primitive European varieties, which represented the ninth group included in Fig. 1b. Admixture events may have started in Southern Europe as early as in Greek and Roman times.
This scenario agrees with previous estimates that western wine grapes and table grapes have diverged for 2.6K years [17] and with our demographic model that predicts the nadir of effective population size 2K years before the present (Supplementary Fig. 23) and suggests resumption of sexual reproduction in domesticated grapevines since Roman times. Further admixture may have later involved other sylvestris lineages more similar to those found today around the Alpine region (Fig. 1b and Supplementary Data 2).

In order to understand the consequences of post-domestication sylvestris-sativa hybridization, we used an ABBA-BABA test [25] to identify genomic regions in wine grapes from the Alpine countries that received introgression from western wild populations before spreading worldwide. We observed widespread rather than localized signals of introgression (Fig. 2a), suggesting that hybridizations occurred multiple times and left pervasive sylvestris ancestry across the genome rather than limited to specific loci under adaptive evolution. Conversely, we identified some chromosomal regions that appear to be under negative selection against the introgression of wild alleles. These regions could correspond to loci that are particularly important for quality traits. An analysis using DA distances [26] and phylogenetic trees built separately in 2368 genomic windows across the genome provided additional evidence for widespread effects of introgression, with western sylvestris contributions being detected in 37.7% of the genome (Fig. 2b).
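The logic of an ABBA-BABA scan can be sketched with the two statistics usually computed per window, Patterson's D and the fd estimator. The sketch below is an assumption-laden illustration rather than the procedure used in the study: it takes derived-allele frequencies for three populations (P1, P2, P3), assumes the outgroup is fixed for the ancestral allele, and does not reproduce the adjustment applied to the fd values plotted in Fig. 2a. The assignment of grape groups to P1, P2 and P3 and all frequencies are hypothetical.

```python
def abba_baba_sums(p1, p2, p3):
    """Expected ABBA and BABA pattern counts summed over sites.

    p1, p2, p3: per-site derived-allele frequencies; the outgroup is
    assumed to carry only the ancestral allele.
    """
    abba = sum((1 - a) * b * c for a, b, c in zip(p1, p2, p3))
    baba = sum(a * (1 - b) * c for a, b, c in zip(p1, p2, p3))
    return abba, baba


def patterson_d(p1, p2, p3):
    """D > 0 indicates excess allele sharing between P2 and P3."""
    abba, baba = abba_baba_sums(p1, p2, p3)
    if abba + baba == 0:
        raise ValueError("no informative sites in this window")
    return (abba - baba) / (abba + baba)


def f_d(p1, p2, p3):
    """Observed ABBA-BABA excess relative to the excess expected under
    complete introgression, using the higher of the P2 and P3
    frequencies as the donor frequency at each site."""
    abba, baba = abba_baba_sums(p1, p2, p3)
    donor = [max(b, c) for b, c in zip(p2, p3)]
    abba_max, baba_max = abba_baba_sums(p1, donor, donor)
    if abba_max - baba_max == 0:
        raise ValueError("no informative sites in this window")
    return (abba - baba) / (abba_max - baba_max)


# Hypothetical window: P1 = a cultivated reference group,
# P2 = Alpine wine grapes, P3 = western sylvestris.
d_value = patterson_d([0.0, 0.1], [0.5, 0.6], [0.8, 0.9])
fd_value = f_d([0.0, 0.1], [0.5, 0.6], [0.8, 0.9])
```

In a genome scan, both functions would be applied window by window, which is how localized and pervasive introgression signals can be told apart.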

a Dots represent adjusted fd values in 100Kb windows of non-repetitive DNA. Lines represent cubic smoothing splines of the values. b Diagram of 100Kb chromosomal windows (in red) that show phylogenetic tree topologies with shorter genetic distance between Alpine wine grapes and western sylvestris than between Alpine wine grapes and any other cultivated group. Red triangles in a and constricted regions in b indicate the location of centromeric repeats. Source data are provided as a Source Data file.

Despite the scale of the dispersal and admixture events that have reshaped the continental diversity for millennia, as shown by multiple lines of evidence presented so far, European wine grapes remain connected with table grapes of the Central Asian oases through a highly interconnected network of first-degree or second-degree relationships (Supplementary Fig. 24) that includes all of the 123 cultivated varieties of the WGS panel. We detected 24 parent-offspring and 4 full-sibling relationships, providing conclusive evidence for previously conflicting inferences (Supplementary Figs. 25–29 and Supplementary Note 7). In the diversity panel, 492 varieties spanning the same geographic range were interconnected by 576 parent-offspring relationships and another 122 varieties had parent-offspring relationships outside of this network, also including parent-offspring pairs between cultivated varieties and feral grapes (Supplementary Fig. 30).

A principal component and coordinate analysis (PCA and PCoA, respectively, in Fig. 3 and Supplementary Fig. 31 and Supplementary Note 8) largely supported the conclusions based on the ancestry analysis about the origin of European wine grapes. The PCA in Fig. 3 shows that the unbiased set of SNPs obtained by WGS provided higher resolution for the separation of varieties in two-dimensional space than pre-ascertained SNPs used in hybridization-based genotyping. The set of sequenced varieties (Fig. 3a) captured most of the genetic diversity present in the cultivated germplasm as represented in the extended set of accessions (Fig. 3b), including the diversity present in the Iberian peninsula (Supplementary Fig. 32), a center of supposed independent domestication in the West [12]. The PCA in Supplementary Fig. 32 shows that the Iberian cultivated germplasm is more similar to table grapes and Eastern populations of sylvestris than to Western populations of sylvestris, not providing support to the hypothesized event of neodomestication. These results are consistent with those obtained by Freitas and coworkers [27] from low coverage resequencing of a larger set of locally grown Iberian cultivars and local wild accessions. The PCA highlights individual accessions that may serve as illustrations of the blurred boundaries between wild and cultivated compartments. Enantio and Lambrusco di Sorbara that are cultivated south of the Alps in the Po valley provide an example of western wine grapes that are situated midway on the PCA plane between western sylvestris and cultivated varieties from the Balkans and Magna Graecia (Fig. 3a) and are contiguous to sylvestris accessions from the Italian peninsula (Fig. 3b), as observed previously [11].
WGS data show that ADMIXTURE membership proportions in Enantio and Lambrusco di Sorbara (Supplementary Fig. 7) and the level of haplotype sharing with accessions of Western sylvestris (Supplementary Fig. 33) are fully compatible with these varieties representing sylvestris-sativa first generation hybrids or very early backcross generations. The feral accession KE06 from the Ketsch island on the Rhine river (Germany) shows a similar genetic constitution, resulting from a possible cross between an escapee from the vineyards and a genuine autochthonous sylvestris, as suggested in [18], but an opposite case of classification (lambrusque métis), presumably because the accession was found outside of a vineyard. There is no evidence of parent-offspring relationships between KE06 and cultivated varieties of the WGS panel, but we detected the highest level of haplotype sharing with Savagnin Blanc and Pinot Noir (across 40.6 and 39.8% of the diploid genome length, respectively, Supplementary Fig. 34). The Manseng family, which was represented in the diversity panel by Gros Manseng, Petit Manseng and Riesling Bleu and is located midway in the PCA plane, as recently observed in [28], between the Pinot/Savagnin Blanc parent-offspring pair and French/German populations of sylvestris, is in close proximity to an accession classified by Laucou and coworkers [15] as a French sylvestris (B00ERBY). The pairs Pinot/B00ERBY, Savagnin Blanc/Petit Manseng, Savagnin Blanc/Riesling Bleu (collection Oberlin), and Petit Manseng/Gros Manseng share a parent-offspring relationship (Supplementary Fig. 30), indicating a possible origin of Petit Manseng, Riesling Bleu and B00ERBY from a cross between a cultivated and a wild accession.
Similar hybridization events between cultivated germplasm that was introduced from the center of domestication in the East and local sylvestris may also have occurred elsewhere in Southern Europe, generating intermediate forms that in some places thrive as seedlings in the wild (e.g., feral forms on the Adriatic coast of Croatia, Supplementary Fig. 33) and in others are vegetatively propagated for cultivation (e.g., some cultivars in the Iberian peninsula as shown in Supplementary Fig. 32 and proto-varieties in Montenegro [29]). Our analysis of both the WGS panel and the diversity panel did not reveal any instance of cultivated varieties carrying pure Western sylvestris ancestry that would be expected in the scenario of an independent domestication event.

a PCA of 204 V. vinifera whole genome resequenced genotypes based on 7.9M SNPs. b PCA of 1445 V. vinifera genotypes based on a subset of 6357 pre-ascertained SNPs in the diversity panel and in common with the WGS panel. Sequenced samples are indicated as open (cultivated varieties) and solid (wild accessions) squares. Additional cultivated varieties are indicated as gray crosses in b. Samples with uncertain assignment in their literature reports are reported as faux sauvage: 1, sylvestris FR B00ERBY [15]; 2, KE06 [18]; 3, Vigne sauvage faux Mouchouses 1; 4, Tighzirt 1; 5, Fethiye 58 6415 and collectively indicated as solid circles in b. The 2-letter codes indicate countries of origin: CH Switzerland, DE Germany, DZ Algeria, ES Spain, FR France, GE Georgia, GR Greece, HU Hungary, IT Italy, MA Morocco, SK Slovakia, TN Tunisia, TR Turkey. Source data are provided as a Source Data file.

The history of grape cultivation combines local adaptation with widespread vegetative propagation and movement, with varieties that have achieved broad or worldwide distribution and others that have largely remained confined in narrow geographic areas. Using a set of 605 cultivated varieties that provided a nearly proportional representation of those in cultivation in each country (Supplementary Table 2), we associated the individual accessions with a precise geographic location represented by either the most ancient known area of cultivation (for widely spread and so-called international varieties) or the most typical or renowned growing region at the present time (for locally grown varieties). Figure 4 shows the geographical distribution of genetic ancestry components for the cultivated compartment (Fig. 4a) and for wild populations (Fig. 4b), respectively. The top two wine-producing countries, France and Italy, exploited most of the diversity of western wine grapes (Fig. 4c, d). The Italian viticulture showed the highest within-country variation both in the intensity of the local major ancestry component (Fig. 4c) and in the assortment of all four ancestry components (Fig. 4d). This was already apparent from the very high proportion of admixed ancestry varieties observed among those originating from the Italian peninsula (Supplementary Fig. 22), which therefore seems to be home not only to varieties that differ in their ancestry but also to crosses that generated highly admixed ancestries. This is likely due both to the historical presence of hubs for maritime and land trade routes between the East and the West and to the ample latitudinal and climatic range of wine growing regions (from 36° to 46°N) that encompass USDA hardiness zones from 7 to 10.
Spain and Portugal, the third and fifth wine-producing countries, respectively, instead rely on a national germplasm largely based on high C1 ancestry that is only admixed with C2 ancestry in northern Portugal, Galicia, and southern Pyrenees, presumably as a consequence of massive natural crossing with descendants of Savagnin Blanc (Supplementary Fig. 30). Germany is the fourth wine-producing country in Europe with several wine regions located at the northern limits of grape cultivation and has, therefore, more limiting growing conditions and a reduced variation in proportional ancestry components across the country. Although there is a clear pattern of ancestry component proportions that is dictated by latitude across the top wine-producing countries, most notably across Italy (Supplementary Fig. 35), which seems to result from environmental limitations preventing large-scale, within-country geographical displacement, there are notable exceptions of cultivated varieties with typical southern ancestry that are traditionally in use at northern latitudes. For instance, the variety Garganega, once extensively grown in warm climates of Sicily under the synonym of Grecanico Dorato, rose to fame for quality wines only after its long-range movement to Alpine growing regions.

Continental patterns of ancestry components in cultivated (a) and wild (b) grapevines and nationwide patterns of wine grape ancestry in the top five wine-producing countries in Europe (c, d). Colors represent W2 ancestry (blue), C1 ancestry (orange), C2 ancestry (gray), and W1 ancestry (yellow). Each ancestry component is plotted separately (a, b). Intensity of the main ancestry component is plotted (c). Overlay of all ancestry components is plotted (d). The collection site of wild accessions is indicated by black dots (b). The most representative site of cultivation of each variety is indicated by black dots (c, d). Abbreviations of top wine-producing countries: Italy IT, France FR, Spain ES, Germany DE, Portugal PT. Source data are provided as a Source Data file.

We therefore tested whether local adaptation to climate conditions may have contributed to shaping the geographic distribution of genetic diversity by using a generalized linear model (GLM). For each cultivated variety and the corresponding geographical location, we associated the genetic ancestry coefficients with 29 bioclimatic variables (Supplementary Table 3) of the location using a spatial resolution of 1 km², under the assumption that each variety that has been retained in cultivation in a specific site, where many others may have been discarded, may recapitulate genotypes suitable for the local climate conditions. Seven climatic variables showing <0.70 Spearman correlation with one another explained from 41 to 52% of the variance in the geographic distribution of each ancestry component (Supplementary Table 3). The W2 ancestry showed positive association with annual temperature range and negative association with seasonal precipitation. The C1 and C2 ancestry components showed associations with annual mean temperature in opposite directions (positive and negative, respectively). The W1 ancestry was most significantly and positively associated with seasonal precipitation. The associations between ancestry components and local climate variables are so tightly related to the geographical location that they rapidly decay under simulations that systematically displaced each genotype outside of the most traditional site of cultivation by 20, 50, and 100 km in all latitudinal and longitudinal directions (Supplementary Table 4).
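The step of retaining only climatic variables with pairwise Spearman correlation below 0.70 can be sketched as a simple greedy filter. How the study chose which member of a correlated pair to keep is not stated here, so the first-come order used below is an assumption, and the variable names and site values are hypothetical.

```python
def average_ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; inputs must not be constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation = Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def select_uncorrelated(variables, threshold=0.70):
    """Keep a variable only if |rho| with every kept variable is below
    the threshold (greedy, in insertion order)."""
    kept = []
    for name, values in variables.items():
        if all(abs(spearman(values, variables[k])) < threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical values of three bioclimatic variables at five sites.
climate = {
    "annual_mean_temp": [10, 12, 14, 16, 18],
    "max_temp_warmest_month": [22, 25, 27, 30, 33],
    "seasonal_precip": [80, 40, 95, 60, 70],
}
selected = select_uncorrelated(climate)
```

With these toy values the second temperature variable is dropped because it is perfectly rank-correlated with the first, while seasonal precipitation is retained.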

Artificial selection for specific desired traits during the domestication process results in selective sweeps that lead to local reductions of genetic variation. Loss of nucleotide and haplotype diversity (Supplementary Fig. 36a) as well as runs-of-homozygosity (Supplementary Fig. 5) were detected in cultivated varieties across three loci on chromosomes 2, 15, and 17 when they were compared to sylvestris (Supplementary Fig. 36b, c). These strong signals of selective sweeps presumably originate from strong positive selection of favorable alleles (Supplementary Fig. 36d, f, h) and result in persistent linkage disequilibrium (r²) and extended hitchhiking (Supplementary Fig. 36e).
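A local loss of haplotype diversity of this kind can be detected by scanning phased haplotypes in short blocks. The sketch below is a hedged illustration: it uses Nei's unbiased estimator, which is an assumption because the estimator is not named in the text, and it borrows the block size (five consecutive variant sites) and the smoothing (mean of 50 consecutive blocks) from the chromosome 17 figure legend, treating the 50-block averages as non-overlapping. The input haplotypes are hypothetical.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's estimator: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two haplotypes are required")
    counts = Counter(haplotypes)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1 - homozygosity)


def block_diversity(phased, block_size=5):
    """Diversity in consecutive, non-overlapping blocks of variant sites.

    phased: equal-length strings, one per haplotype, one character per
    variant site. Trailing sites that do not fill a block are ignored.
    """
    n_sites = len(phased[0])
    values = []
    for start in range(0, n_sites - block_size + 1, block_size):
        block = [h[start:start + block_size] for h in phased]
        values.append(haplotype_diversity(block))
    return values


def block_means(values, window=50):
    """Mean of each run of `window` consecutive blocks (non-overlapping)."""
    return [sum(values[i:i + window]) / window
            for i in range(0, len(values) - window + 1, window)]


# Hypothetical data: four haplotypes, ten variant sites (two blocks).
phased = ["0000011111", "0000000000", "0000011111", "0000000000"]
per_block = block_diversity(phased)
smoothed = block_means([0.0, 1.0, 0.5, 0.5], window=2)
```

In the toy data the first block is monomorphic (diversity 0), as expected inside a sweep, while the second block carries two haplotypes at equal frequency.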

The reduction of diversity in the lower arm of chromosome 2 is a breeding sweep known to result from positive selection for two nearby loss-of-function mutations causing loss of anthocyanin pigmentation in berry skin [30]. Homozygous recessive genotypes are so-called white varieties, which were the only option for the production of white wines before the advent of modern technologies to limit skin contact of crushed berries with their juice. The quest for this trait brought about a severe loss of diversity at nearby distal loci because, while LD dropped rapidly to background values on the proximal side of the locus, it persisted for 4 Mb on the distal side (Supplementary Fig. 36e).

The reduction of diversity on chromosome 15 resides in a pericentromeric region (Supplementary Fig. 37). Contrary to breeding and domestication sweeps, which are characterized by both low haplotype diversity and a high frequency of homozygous varieties as a result of the positive selection for one favorable mutation, we observed in this case only a marked reduction of haplotype diversity, upstream of the centromere. We also observed an extreme segregation distortion immediately downstream of the centromere in the selfed progeny of Pinot Noir with a complete lack of one class of homozygous seedlings, compatible with a lethal recessive variant that was masked in Pinot Noir by the presence of one copy of the reference haplotype. It is thus possible that favorable and unfavorable variants are in strong linkage and in repulsion across the centromere in this region and are maintained in heterozygous state in the population of cultivated varieties.

The sweep on chromosome 17 has been proposed by Myles and coworkers [1] as a footprint of domestication. Within a 2 Mb valley of haplotype diversity in the cultivated germplasm (Fig. 5a), we identified the nadir of haplotype diversity in a 100 kb interval carrying a total of 13 predicted genes in the most common haplotype, five of which form a cluster of tandemly arranged isopiperitenol/carveol dehydrogenases (Fig. 5b). In addition to the most common haplotype (H1-A) that corresponds to the PN40024 reference sequence, we identified 18 other haplotypes with minor frequencies in the population (Supplementary Fig. 38 and Fig. 5c). The phenotypic traits that were subject to selection during domestication [31] were presumably related to flower sex determination, with nearby mutations within a sex locus in the upper arm of chromosome 2 involved in the transition from dioecious plants in sylvestris to hermaphrodites in sativa [32], and to berry morphology, with an increase in berry size and flesh-to-seed ratio going from sylvestris to sativa that made the grapes more attractive to human consumption and more amenable to wine making, with their genetic determinants still unknown. Quantitative trait loci (QTLs) controlling a series of berry traits in wine as well as in table grapes have been found overlapping with the sweep region on chromosome 17 [33,34].

a Chromosomal plot of haplotype diversity. Haplotype diversity was calculated in blocks of five consecutive variant sites and plotted as the average of 50 consecutive blocks (blue dots) and a cubic smoothing spline (black line). The scale indicates Mb. The yellow background indicates the interval magnified in b. b V2.1 gene models (exons in blue), manually curated gene predictions (green) in the isopiperitenol/carveol dehydrogenase gene cluster (gene IDs 7–11), annotated transposable elements (light gray). c Frequency of 19 haplotypes shown in Supplementary Fig. 38 in 196 grapevine accessions. d Genotype frequency in 121 cultivated varieties. e VIT_217s0000g05570 (gene 6 in b) gene phylogeny. Numbers indicate the proportion of bootstrap trees supporting that clade. f ASE of the LRR-receptor kinase VIT_217s0000g05570 alleles in representative varieties of 15 haplotypic combinations, in softening berries (lower panel) and leaves (upper panel). The asterisks indicate statistically significant ASE levels (p-value <0.05) according to a Stouffer's meta-analysis with weight and direction effect using n=2 biologically independent samples. Cumulative expression is reported for each haplotypic combination lacking exonic SNPs in VIT_217s0000g05570 (H1-A/H1-G, H1-A/H10, H1-A/AX) and for a control variety homozygous for the H1-A haplotype. Gene expression for three haplotype combinations (H1-A/H10, H1-A/H6, H1-A/H4) was quantified in leaves of three different representative varieties (Tschvediansis Tetra, Picolit, Lambrusco Grasparossa) with the same genotype with respect to those used for berry gene expression. Source data of gene expression are provided as a Source Data file.
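The Stouffer's meta-analysis with weight and direction effect mentioned in the legend can be sketched as a weighted, signed Z-score combination. This is a generic formulation and not necessarily the exact procedure applied to the allele-specific expression data: the weights (for example read depth per SNP or per replicate), the two-sided conversion and the input values are all assumptions made for illustration.

```python
from statistics import NormalDist


def stouffer(p_values, directions, weights):
    """Weighted Stouffer combination of two-sided p-values.

    p_values:   two-sided p-values, each strictly between 0 and 1
    directions: +1 or -1, the sign of each observed effect
    weights:    non-negative weights, not all zero
    Returns the combined Z-score and its two-sided p-value.
    """
    normal = NormalDist()
    z_scores = [d * normal.inv_cdf(1 - p / 2)
                for p, d in zip(p_values, directions)]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = sum(w * w for w in weights) ** 0.5
    z_combined = numerator / denominator
    p_combined = 2 * (1 - normal.cdf(abs(z_combined)))
    return z_combined, p_combined


# Hypothetical: two replicates, same allele favored in both.
z_same, p_same = stouffer([0.05, 0.05], [1, 1], [1, 1])
# Hypothetical: the two replicates point in opposite directions.
z_opposite, p_opposite = stouffer([0.05, 0.05], [1, -1], [1, 1])
```

Including the direction means that replicates favoring opposite alleles cancel out instead of reinforcing each other, which matters when only two biological replicates are available.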

Two nearby genes in the sweep region captured our attention because they show the lowest level of diversity (Fig. 5e and Supplementary Fig. 39a) and because they show a hugely increased transcript abundance in the berry in the haplotypes found in the cultivated forms in comparison to those found in the wild ones (Fig. 5f and Supplementary Fig. 39b). These genes (VIT_217s0000g05570 and VIT_217s0000g05580, corresponding to gene numbers 6 and 7 in the diagram of Fig. 5b) are arranged in a head-to-tail orientation with less than 100 bp separating their transcriptional units (Supplementary Fig. 40) and encode a leucine-rich-repeat receptor-like kinase (LRR-RLK) and the first isopiperitenol/carveol dehydrogenase in the tandemly repeated cluster, respectively. We used allele-specific analysis of gene expression to determine the steady-state transcript abundance of the two genes in leaves and berries for a large subset of the 19 haplotypes identified in the region. While no major differences in expression between haplotypes are detected in leaves (Fig. 5f and Supplementary Fig. 39), only one haplogroup, including the most common haplotype and other highly similar haplotypes that are present only in cultivated varieties, seems to produce detectable levels of transcripts in the berries at least for the kinase gene (Fig. 5f). Haplotypes that are found in the wild accessions on the contrary all show transcript levels that are very close to zero. The most frequent (76%) haplotype H1-A is present in 95.8% of cultivated varieties in either homozygous (55.4%) or heterozygous (40.4%) condition, possibly indicating a dominant or semi-dominant mode of action of the selected allele (Fig. 5d). The haplogroup comprising H1-A is represented in 98.3% of cultivated varieties.
The only exceptions among cultivated varieties are represented by Berzamino, an almost abandoned wine grape once grown in Northeastern Italy [35,36], and Gordin Verde, a wine grape from Moldova, unrelated to Berzamino (Supplementary Figs. 24 and 30). Despite both varieties having domesticated traits, Berzamino is homozygous for the H7 haplotype that is predominant in wild accessions and consequently has extremely low transcript levels for both genes in the berry (Fig. 5d–f and Supplementary Fig. 39). Gordin Verde is heterozygous for two haplotypes (H1-F/H2-A) that are normally found in other cultivated varieties in combination with H1-A and provide low transcript levels for the kinase in the berry (Fig. 5d–f). The haplotypes found in all other domesticated varieties that do not have at least one H1-A copy all share a region of sequence identity that comprises the 5′ intergenic region of the kinase gene, the kinase and the dehydrogenase genes, forming the H1-A haplogroup, and have high levels of expression of the kinase gene in the berry (Fig. 5e). The difference in organ-specific and allele-specific expression is even more dramatic for the isopiperitenol/carveol dehydrogenase with extremely high levels of transcript being detected for the cultivated haplotypes in the berry (Supplementary Fig. 39). While there is in general a good correlation between transcript levels of the kinase and the dehydrogenase genes, as if there were a common regulatory element capable of affecting the expression of both genes, there are a few haplotypes identified in cultivated varieties (H2-A, H3, and especially H1-F) that show very low levels for the kinase transcript (Fig. 5f) but detectable levels of the dehydrogenase transcript (Supplementary Fig. 39). The expression pattern of the two genes in the berry provided by the selected haplotypes appears to be tightly developmentally regulated (Supplementary Figs. 41–42).
Expression is low during the initial phase of berry growth, which occurs mostly by cell division and partly by cell enlargement [37], and sharply increases at berry softening, which marks the inception of ripening about one week before color transition (véraison) and resumption of berry growth [38]. This second phase of increase in berry size, unlike the first one, occurs exclusively by cell expansion [39,40]. A genome-wide association study (GWAS, Supplementary Fig. 43) and an association analysis performed with one of the SNPs that recapitulates the expression differences among haplotypes for the kinase (Fig. 6) reveal a significant association between SNPs in the locus and the seed-to-berry ratio at the inception of ripening, with all cultivated varieties showing lower ratios than the wild ones, and Berzamino and Gordin Verde showing high ratios among the cultivated ones. Seed development is the chief factor promoting berry growth [41]. Berry weight, which is commonly measured as a proxy for berry size, is positively correlated with seed content (seed fresh weight, SFW) and QTLs for extreme variation in berry size and seed content colocalize [42] on chromosome 18 with a seed morphogenesis regulator MADS-Box gene [43]. Doligez and coworkers [34] showed, additionally, that the QTL overlapping with the sweep region on chromosome 17 explains the residual variation in berry weight not explained by seed content and it is therefore possible that factors in this region promote pericarp growth at a rate that is more than proportional to the increase in SFW, which is reflected by lower seed-to-berry ratio. The selected haplotypes are associated with a change in berry morphology towards a larger pericarp per unit of SFW. This leads to an increase in size of the fleshy and edible part of the berry, making it more attractive for fresh consumption.
It also decreases more than proportionally the seed content released from crushed berries into the must, which greatly improves tannin chemistry and textural sensory attributes in wines. This effect is due to a reduction of the leakage during maceration of astringent and bitter condensed tannins with low degree of polymerization from seeds in favor of the extraction of more palatable condensed tannins with higher degree of polymerization from skins. The kinase gene encodes an LRR-RLK that is orthologous to a kinase in Arabidopsis (At5G62710) that is expressed in ovaries and in vascular tissues [44] and that shows high homology with the FEI2 kinase. RLKs play a pivotal role in sensing external stimuli, activating downstream signaling pathways and regulating cell behavior involved in response to pathogens, growth, and developmental processes in plants. The FEI2 kinase has been shown in Arabidopsis [45] to promote anisotropic cell expansion through a modulation of cell wall function, a role that FEI2 fulfils by interacting directly with 1-aminocyclopropane-1-carboxylic acid (ACC) synthase, a key enzyme for ethylene biosynthesis. The grape berry is considered a non-climacteric fruit, lacking a concomitant increase in respiration rate and ethylene biosynthesis at the onset of ripening, but the rise in endogenous ethylene production that is consistently observed a few days before the inception of the second phase of berry growth regulates several aspects of ripening [46], including an increase of berry diameter that can be further augmented by the application of exogenous ethylene at véraison [47]. In light of the specific function of the LRR-receptor kinase ortholog in other plants, it is possible that the haplotypes selected during grape domestication may have provided cultivated varieties with new opportunities for ethylene-related cell expansion during berry ripening thanks to the greatly increased expression of the LRR-receptor kinase gene.

a Association between an A→T mutation in the VIT_217s0000g05570 gene, which recapitulates the increase in berry-specific expression of the kinase, and seed-to-berry ratio in hard berries prior to softening, soft berries collected over the same bunch and their average (as a proxy for the end of the first phase of berry growth). Box-plots show 88 accessions (green dots) sorted by their genotype at SNP_chr17:6,079,793. Accessions with missing AA, AT, TT genotypes were classified based on their alternate/alternate, alternate/reference and reference/reference genotypes, respectively, at the variant sites chr17:6,080,166; 6,079,793; 6,080,193; 6,080,258; 6,080,447; 6,080,449, which are all in LD with chr17:6,079,793 in the H1-A haplotype. b Variation in soluble solids concentration in the same berries and accessions as in a. Red dots indicate values in hard berries of sylvestris V395. Yellow dots indicate values in eastern feral grapes. Cyan dots indicate values in Berzamino and Gordin Verde. Boxes indicate the first and third quartiles, the horizontal line within the boxes indicates the median and the whiskers indicate 1.5× the interquartile range. Source data are provided as a Source Data file.

View original post here:
The genomes of 204 Vitis vinifera accessions reveal the origin of European wine grapes - Nature.com


Genome instability drives epistatic adaptation in the human pathogen Leishmania – pnas.org

Posted: at 12:44 am

Significance

Chromosome and gene copy number variations often correlate with the evolution of microbial and cancer drug resistance, thus causing important human mortality. How genome instability is harnessed to generate beneficial phenotypes and how deleterious gene dosage effects are compensated remain open questions. The protist pathogen Leishmania exploits genome instability to regulate expression via gene dosage changes. Using these parasites as a unique model system, we uncover complex epistatic interactions between gene copy number variations and compensatory transcriptomic responses as key processes that harness genome instability for adaptive evolution in Leishmania. Our results propose a model of eukaryotic fitness gain that may be broadly applicable to pathogenic fungi or tumor cells known to exploit genome instability for adaptation.

How genome instability is harnessed for fitness gain despite its potential deleterious effects is largely elusive. An ideal system to address this important open question is provided by the protozoan pathogen Leishmania, which exploits frequent variations in chromosome and gene copy number to regulate expression levels. Using ecological genomics and experimental evolution approaches, we provide evidence that Leishmania adaptation relies on epistatic interactions between functionally associated gene copy number variations in pathways driving fitness gain in a given environment. We further uncover posttranscriptional regulation as a key mechanism that compensates for deleterious gene dosage effects and provides phenotypic robustness to genetically heterogeneous parasite populations. Finally, we correlate dynamic variations in small nucleolar RNA (snoRNA) gene dosage with changes in ribosomal RNA 2′-O-methylation and pseudouridylation, suggesting translational control as an additional layer of parasite adaptation. Leishmania genome instability is thus harnessed for fitness gain by genome-dependent variations in gene expression and genome-independent compensatory mechanisms. This allows for polyclonal adaptation and maintenance of genetic heterogeneity despite strong selective pressure. The epistatic adaptation described here needs to be considered in Leishmania epidemiology and biomarker discovery and may be relevant to other fast-evolving eukaryotic cells that exploit genome instability for adaptation, such as fungal pathogens or cancer.

Darwinian evolution plays a central, yet poorly understood, role in human disease. Iterative rounds of genetic mutation and environmental selection drive tumor development, microbial fitness, and therapeutic failure. Genome instability is a key source for genetic and phenotypic diversity, often defining disease outcome (1–4). However, the mechanism(s) by which genome instability is harnessed for fitness gain despite its potential deleterious effects remain largely elusive. Here we investigate this question in the protozoan parasite Leishmania, which causes a spectrum of severe diseases known as leishmaniases that generate substantial human morbidity worldwide (5). Visceral leishmaniasis (also known as kala azar) is caused by Leishmania donovani or Leishmania infantum and is fatal if left untreated. Most Leishmania species show a digenic life cycle comprising two major developmental stages that infect two distinct hosts. The motile, extracellular promastigote form of Leishmania proliferates inside the digestive tract of phlebotomine sand flies, which transmit the highly virulent metacyclic form of the parasite into mammalian hosts during a blood meal. There, following uptake by macrophages, Leishmania develops into nonmotile, intracellular amastigotes that proliferate inside fully acidified phagolysosomes and subvert host cell immuno-metabolomic functions, thus causing the severe immuno-pathologies underlying leishmaniasis.

Genome instability is a hallmark of Leishmania biology since these parasites lack promoter-dependent gene regulation (6, 7) but exploit chromosome and gene copy number variations (CNVs) to regulate messenger RNA (mRNA) abundance by gene dosage (8–12). In the absence of confounding transcriptional control, Leishmania thus represents an ideal system to investigate the role of genome instability in fast-evolving eukaryotic cells. We applied an experimental evolution (EE) approach and assessed changes at genomic and transcriptomic levels during adaptation of animal-derived parasites to in vitro culture. As demonstrated in our previous study (11), fitness gain in this in vitro system is driven largely by the rate of parasite proliferation, which depends on accelerated cell cycle progression and increased transcription and translation efficiencies. Here, we uncover complex epistatic interactions between gene CNVs and compensatory transcriptomic responses as key processes that harness genome instability for fitness gain in Leishmania. Our data may be broadly applicable to pathogenic fungi or cancer cells known to exploit genome instability for adaptation.

We first assessed the level of CNV across 204 L. donovani clinical isolates from the Indian subcontinent (13). This collection includes a core group of 191 strains that are genetically highly homogenous as judged by the small number of single nucleotide variants (SNVs) (<2,500 total), which provided us with a useful benchmark to study the dynamics of CNVs across a large number of quasiclonal populations. DNA read depth analysis of these isolates revealed important CNVs in both coding and noncoding (nc) regions, with amplifications and deletions affecting respectively 14% and 4% of the genome (Fig. 1 A and B and Datasets S1 and S2). Analyzing the statistical association of observed CNVs with repetitive sequence elements uncovered 11 DNA repeats, including the previously described SIDER (14) and LDRP1 elements (15). In addition, we describe simple/low complexity repeats (16) and eight repetitive sequence elements that may drive Leishmania genome instability through microhomology-mediated, break-induced replication as observed for human CNVs (17) (Fig. 1C, SI Appendix, Fig. S1, and Dataset S3). Gene dosage changes were not random but clearly under selection as judged by the reproducibility of genetic interactions across independent isolates and the enrichment of amplified genes in biological functions associated with proliferation and thus fitness gain in culture. Statistically significant interactions were observed between positive (correlating) and negative (anticorrelating) read depth variations (Fig. 1D, SI Appendix, Figs. S2 and S3, and Datasets S4 and S5), including a highly connected network cluster (NC) containing 60 coamplified transfer RNA (tRNA) genes encoded on 16 different chromosomes (NC1, Fig. 1D, SI Appendix, Fig. S4, and Datasets S6 and S7). 
Natural selection of gene CNVs is further supported by 1) their independent emergence across phylogenetically distinct strains providing evidence for evolutionary convergence, 2) the very high copy number observed for certain genes (up to 21-fold) suggesting strong, positive selection, and 3) the global enrichment of read depth variations in phenotypically silent, intergenic regions, suggesting purifying selection against deleterious effects caused by gene CNVs (Fig. 1 E–G, SI Appendix, Fig. S5, and Datasets S8–S10). Together, our data provide evidence that Leishmania genomic adaptation is governed by gene CNVs through highly dynamic, functional interactions that are under natural selection. These interactions define a form of epistasis at the gene (rather than nucleotide) level, with the phenotypic effect of a given gene amplification being dependent on coamplification of functionally related genes.
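The idea of positive (correlating) and negative (anticorrelating) read depth variations can be illustrated with a minimal sketch. It assumes that an interaction is scored by the Pearson correlation of normalized gene coverage across isolates; the gene names, coverage values, and the 0.9 cutoff are hypothetical, and the statistical procedure actually used in the study is described in the SI Appendix rather than reproduced here.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return 0.0  # constant coverage carries no correlation signal
    return cov / (sx * sy)


def cnv_edges(coverage, threshold=0.9):
    """Return (gene_a, gene_b, r, sign) for gene pairs with |r| >= threshold.

    `threshold` is an illustrative cutoff, not a value taken from the paper.
    """
    edges = []
    for a, b in combinations(sorted(coverage), 2):
        r = pearson(coverage[a], coverage[b])
        if abs(r) >= threshold:
            edges.append((a, b, r, "positive" if r > 0 else "negative"))
    return edges


# Toy normalized coverage per gene across five hypothetical isolates.
toy = {
    "tRNA_1": [1.0, 1.5, 2.0, 2.5, 3.0],
    "tRNA_2": [1.1, 1.6, 2.1, 2.6, 3.1],
    "geneX": [3.0, 2.5, 2.0, 1.5, 1.0],
    "geneY": [1.0, 1.2, 0.9, 1.1, 1.0],
}
edges = cnv_edges(toy)
for gene_a, gene_b, r, sign in edges:
    print(f"{gene_a} - {gene_b}: r = {r:.2f} ({sign})")
```

In this toy data set the two tRNA genes are coamplified and form a positive edge, while geneX is anticorrelated with both; geneY varies independently and stays unconnected. A real analysis would also need multiple-testing correction before calling an edge significant.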

Genome-wide mapping of CNVs, their environmental selection, and epistatic interactions. (A) Genome-wide normalized coverage values in natural logarithm scale (y axis) across the 36 chromosomes (x axis) for 204 L. donovani field isolates from the Indian subcontinent (ISC) (13). The x axis reports the position of genomic windows along the chromosomes. The smoothed blue color represents the two-dimensional kernel density estimate of genomic bins. A sample of 50,000 genomic bins with normalized coverage ≥1.5 or ≤0.5 are materialized as black dots. The black horizontal line and the two red lines indicate normalized coverage values of 1.5, 1, and 0.5. (B) Heatmap generated for the 204 clinical isolates (columns) showing their phylogenetic relationship as a function of CNV regions (rows). The color scale reflects the deviation from the minimum bin coverage observed across all genomes. The two-colored columns on the right report the presence of annotated genes (orange) and intragenic regions (blue), and the chromosomal location of the CNVs is defined by the legend below the plot. (C) Association between CNVs and repetitive elements. The bar plots show the number of observed overlap instances between the boundaries of the CNV regions and repetitive elements (Left), and the log2 ratio between the observed and expected overlap events over 10,000 randomizations (Right). (D) Gene CNV network analysis. The nodes represent gene CNVs while the edges indicate statistically significant positive (red) and negative (blue) correlations observed in the 204 field isolates. The nodes are colored according to the predicted network clusters (NC). 
The indicated gene identifiers refer to key nodes of the network, i.e., genes that connect positive or negative correlating clusters, such as a putative protein kinase (LdBPK_160014000), a member of the 4F5 protein family (LdBPK_010007900), the eukaryotic translation initiation factor 2 subunit alpha (LdBPK_030014900), or the rRNA gene LdBPK_210010700. (E) Phylogenetic tree based on SNVs (>90% frequency) (Upper) for the ISC core population comprising 191 isolates (13) [not including the distant ISC1 strains (48); see also cladogram in SI Appendix, Fig. S5]. The heatmap (Lower) shows the normalized gene sequencing coverage across genomes (rows). To ease the visualization, gene amplifications with normalized coverage >2 are indicated as 2. (F) Violin plot showing the distributions of the normalized genomic coverage values of the collapsed CNV positions (dots) matching genic (orange) and intergenic (blue) regions. (G) Log2 ratio distributions of observed and expected nucleotide overlap between collapsed CNV regions and gene/intergenic annotations.
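The log2 ratio of observed to expected overlap mentioned for panel C can be sketched as a simple randomization test. The version below is an assumption-laden illustration: it treats the genome as a single linear coordinate space, places random boundaries uniformly, and uses 1,000 randomizations with made-up coordinates. The function names are hypothetical and this is not the authors' pipeline.

```python
import math
import random


def count_hits(positions, repeats):
    """Count positions that fall inside any (start, end) repeat interval."""
    return sum(any(s <= p <= e for s, e in repeats) for p in positions)


def overlap_log2_ratio(boundaries, repeats, genome_length, n_random=1000, seed=0):
    """Compare observed boundary/repeat overlaps with a uniform random expectation."""
    rng = random.Random(seed)
    observed = count_hits(boundaries, repeats)
    total = 0
    for _ in range(n_random):
        # Draw the same number of boundaries at uniformly random positions.
        shuffled = [rng.randrange(genome_length) for _ in boundaries]
        total += count_hits(shuffled, repeats)
    expected = total / n_random
    if observed == 0 or expected == 0:
        return observed, expected, None  # log2 ratio undefined
    return observed, expected, math.log2(observed / expected)


# Toy example: repeats cover 10% of a 10 kb "genome"; 5 of 6 boundaries hit them.
toy_repeats = [(1000, 1499), (5000, 5499)]
toy_boundaries = [1010, 1200, 1450, 5100, 5300, 7000]
observed, expected, log2_ratio = overlap_log2_ratio(toy_boundaries, toy_repeats, 10_000)
print(observed, round(expected, 2), round(log2_ratio, 2))
```

A positive log2 ratio indicates that CNV boundaries coincide with the repeat class more often than chance placement would predict. A more faithful null model would preserve chromosome structure and CNV sizes, which this sketch ignores.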

We next used an EE approach to directly assess the link between epistatic interactions and fitness gain in hamster-derived L. donovani parasites during adaptation to in vitro culture (11) (SI Appendix, Fig. S6A; EE1). Following normalization for karyotypic variations (SI Appendix, Fig. S6B and Dataset S11), changes in read depth were monitored between passages 2 (P2, 2 wk in culture) and 135 (P135, 36 wk in culture), corresponding to 20 and 3,800 generations. Our analysis revealed coamplification of coding and nc genes and gene clusters that are functionally linked to fitness gain in culture (i.e., accelerated cell proliferation), including genes encoding ribosomal RNAs (rRNAs), tRNAs, small nuclear RNAs (snRNAs), small nucleolar RNAs (snoRNAs), spliced leader RNAs (SLRNAs), and ribosomal proteins (RPs; Fig. 2A, SI Appendix, Fig. S6C, and Datasets S12–S14). Epistatic adaptation to culture thus affects pathways linked to translation in both the LDBPK (Fig. 1) and Ld1S isolates, with amplification observed for homologous genes encoding various tRNAs, ribosomal RNAs (5SMS, 5.8S, 18S, 28S), or 60S acidic RPs (Dataset S10). Indeed, translation efficiency is one of the major rate-limiting steps for proliferation (18), with enhanced protein synthesis fulfilling the increased need for various molecular machines (DNA and RNA polymerases, ribosomes) in biosynthetically highly active, fast-growing cells. This functional link between a given fitness phenotype and its underlying epistatic network opens unexplored avenues for the discovery of Leishmania pathways that govern adaptation to a given environment.

Longitudinal analysis of gene CNVs during fitness gain in culture. (A) Heatmap generated by plotting gene read depth values (columns) across L. donovani amastigotes isolated from infected hamster spleen (AMA) and derived promastigotes evolving in continuous culture for 2, 10, 20, and 135 passages (P2, P10, P20, P135) (rows). The gray level reflects the scaled normalized gene coverage as indicated in the figure. The colored ribbon indicates the simplified gene annotation as shown in the legend to the right. (B) Screenshot of the IGV genome browser showing gradual loss of the NIMA-like kinase gene Ld1S_360735700 during culture adaptation between splenic amastigotes (AMA) and derived promastigotes at passages P2 and P135. The right panel shows the gel-electrophoretic analysis of PCR fragments obtained from lesion amastigotes (AMA) and derived promastigotes at the indicated culture passages that are diagnostic for the WT (11 kb) and the deleted NIMA-like kinase locus (2 kb) (see SI Appendix, Fig. S7A for a schematic overview of the PCR strategy). (C) Heatmap generated by plotting gene read depth variation (columns) across eight clones isolated from the P20 population (rows). The color code is defined in the legend and corresponds to the deviation from the minimum sequencing coverage measured for that gene in all clones. The colored ribbon indicates the simplified gene annotation as shown in the legend of A. The deletion of the NIMA-like kinase Ld1S_360735700 is indicated by the arrow. (D) Analysis of the phylogenetic relation of the P20 clones. The red dots and numbers indicate the node bootstrap support (1 = 100%). The genetic distance is indicated by the branch length and scaled as indicated in the figure.

Unexpectedly, we discovered that gene depletion, as well as gene amplification, is also a major driver of environmental adaptation and fitness gain. We identified a genomic deletion of 11 kb containing a single gene encoding a NIMA-related kinase (Ld1S_360735700), which is gradually selected in the adapting promastigote population from a preexisting mutant that was detected in splenic amastigotes (Fig. 2B and SI Appendix, Fig. S7 A and B). Clonal analysis of the P20 population revealed the presence of the spontaneous knockout (spo-KO) in six out of eight individual clones (Fig. 2C). The only partial penetration of spo-KO cells after several months in culture indicates that the fitness gain provided by the absence of the NIMA kinase may be very small, a fact we confirmed by establishing growth curves for wild-type (WT) clone CL1 and spo-KO clone CL6, which indeed revealed no statistically significant difference in proliferation (SI Appendix, Fig. S7C). The different spo-KO clones were clearly not the descendants of a single founder cell but were of independent, polyclonal origin as judged by their distinct gene CNV profiles (Fig. 2C and Dataset S15), and their polyphyletic clustering based on SNVs compared to WT clones (Fig. 2D and Dataset S16). This evolutionary convergence strongly supports natural selection of the deletion during culture adaptation and suggests a potential role of the deleted NIMA kinase in growth restriction, which we further assessed by gene editing.

Unlike the spo-KO clones, CRISPR/Cas9-generated, NIMA knockout mutants (cri-KO) (SI Appendix, Fig. S7 C and D) were not viable, while heterozygous mutants showed a strong growth defect, which was partially rescued by episomal overexpression of the NIMA kinase gene (Fig. 3A). This paradoxical result suggests that spo-KO cells must have evolved mechanisms that can compensate for the loss of this essential gene. Read depth analyses of the spo-KO and WT clones ruled out genetic compensation (Dataset S15). In contrast, RNA sequencing (RNAseq) analysis showed highly reproducible, compensatory transcript profiles in the six independent spo-KO clones (Fig. 3B and Dataset S17). Our analysis revealed reduced stability in spo-KO clones of 23 transcripts implicated in flagellar biogenesis (SI Appendix, Fig. S8), which correlated with reduced flagellar activity (Videos S1 and S2). Loss of motility may represent a fitness tradeoff, providing energy required for accelerated in vitro growth. Likewise, the spontaneous loss of the NIMA-like kinase gene may be the result of an evolutionary process that selects against motility, given the implication of the NIMA-related kinase Cnk2p in regulating flagellar length in the protist Chlamydomonas (19) and the localization of the Trypanosoma brucei NIMA-related kinase TbNRKC to the flagellum basal body (20). However, selection for the spontaneous NIMA deletion mutant was not observed in three independent evolutionary experiments conducted with different amastigote isolates (SI Appendix, Fig. S6A; EE1-3, SI Appendix, Fig. S9A), suggesting that compensation for the essential NIMA functions may be complex and thus rather a rare event.

Genetic and transcriptomic analyses of the NIMA-like kinase null mutant. (A) Growth analysis. WT and heterozygous NIMA+/− mutants generated by CRISPR/Cas9 gene editing (Left; see SI Appendix, Fig. S7 D and E for gene editing strategy and PCR validation of the mutant). Transgenic NIMA+/− parasites transfected with empty vector (NIMA+/−/m) or vector encoding for the NIMA-like kinase gene (NIMA+/−/+) (Right). The doubling time of parasites in logarithmic culture phase is shown. (B) Heatmap of the scaled normalized RNAseq counts of the genes differentially expressed in the P20 clones 1 and 8 (WT) with respect to the spontaneous NIMA null mutant (spo-KO) clones 3, 6, 4, 7, 9, and 10. A log2 fold change > 0.5 with adjusted P value < 0.01 was considered significant. Darker levels of blue reflect higher expression levels as indicated by the legend. (C) Double-ratio scatter plot. The plot represents the ratio of the mean DNA (x axis) and RNA (y axis) sequencing read counts between the clones that lost the NIMA kinase gene (CL3-4-6-7-9-10) and the NIMA kinase WT clones (CL1 and CL8). Each dot represents an individual gene. The marginal distributions for DNA and RNA ratio values are displayed along the x and y axes. The color indicates the statistical significance level of the genes' double-ratio scores (i.e., RNA ratio divided by DNA ratio) as indicated in the legend. The NIMA-like kinase homolog Ld1S_360735800 is labeled in red. The vertical and horizontal dotted lines indicate DNA and RNA ratio values of 1, while the diagonal dashed line specifies the bisector. The blue line represents a linear regression model built on the DNA and RNA ratio values and measuring a Pearson correlation value of 0.65. (D) Functional enrichment analysis of the biological process Gene Ontology terms for all genes in C showing a statistically significant double-ratio score (Dataset S18). 
The node color mapping is ranging from yellow to dark orange to represent increasing significance levels, or lower adjusted P values. White nodes are not significant. 1: organic substance metabolic process, 2: nitrogen compound metabolic process, 3: cellular metabolic process, 4: organic cyclic compound metabolic process, 5: primary metabolic process, 6: cellular nitrogen compound metabolic process, 7: heterocycle metabolic process, 8: cellular aromatic compound metabolic process, 9: macromolecule metabolic process, 10: nucleobase-containing compound metabolic process, 11: nucleic acid metabolic process.

On the other hand, we identified 350 transcripts with increased abundance (Dataset S17, second sheet), which result from either increased gene dosage or posttranscriptional mRNA stabilization. Direct comparison of DNA and RNA read depth variations allowed us to distinguish between these two possibilities and identified a set of transcripts whose expression changes between WT and spo-KO did not correlate with gene dosage. In the absence of transcriptional regulation in Leishmania (6), the abundance of these transcripts is likely regulated by differential RNA turnover. Increased abundance was observed in spo-KO clones for the mRNA of another NIMA-related kinase (Ld1S_360735800) encoded adjacent to the deleted region, suggesting a direct, posttranscriptional compensation of kinase functions (Fig. 3C and Dataset S18). Likewise, we observed increased stability for functionally related small ncRNAs, including 43 snoRNAs, several rRNAs, and tRNAs but also various metabolic enzymes (e.g., calpain-like cysteine peptidases, a glycosomal phosphoenolpyruvate carboxykinase or a histone methylation protein DOT1) (Fig. 3D and Dataset S18), suggesting ribosomal biogenesis, translational regulation, and various metabolic processes as yet additional levels of nongenomic adaptation. Together, our data identify gene deletion and compensatory, posttranscriptional responses as drivers of Leishmania fitness gain.
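The comparison of DNA and RNA read depth can be made concrete with the double-ratio score defined in the legend of Fig. 3C (RNA ratio divided by DNA ratio). The following is a minimal sketch with invented read counts and hypothetical gene labels; it computes the score only and does not reproduce the significance testing used in the study.

```python
from statistics import mean


def double_ratio(dna_ko, dna_wt, rna_ko, rna_wt):
    """Return (DNA ratio, RNA ratio, double ratio) for one gene.

    Each argument is a list of normalized read counts, one value per clone.
    Ratios are mutant (spo-KO) over wild type.
    """
    dna_ratio = mean(dna_ko) / mean(dna_wt)
    rna_ratio = mean(rna_ko) / mean(rna_wt)
    return dna_ratio, rna_ratio, rna_ratio / dna_ratio


# Hypothetical gene whose transcript increase is explained by gene dosage.
dosage_driven = double_ratio(
    dna_ko=[2.0, 2.0], dna_wt=[1.0, 1.0],
    rna_ko=[200.0, 200.0], rna_wt=[100.0, 100.0],
)

# Hypothetical gene whose transcript increases without any copy number change.
stability_driven = double_ratio(
    dna_ko=[1.0, 1.0], dna_wt=[1.0, 1.0],
    rna_ko=[300.0, 300.0], rna_wt=[100.0, 100.0],
)

print(dosage_driven)     # double ratio close to 1: RNA follows DNA
print(stability_driven)  # double ratio well above 1: dosage-independent change
```

A double ratio near 1 means the transcript change tracks gene copy number, whereas a value far from 1 points to a dosage-independent mechanism such as altered RNA turnover.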

The selective stabilization of snoRNAs during early culture adaptation (P2-P20) suggests these ncRNAs as key drivers in Leishmania fitness gain. We confirmed this possibility in long-term adapted parasites that were continuously cultured for 3,800 generations (P135). Read depth analysis of P2, its derived P135 population, and an independent population evolved until P125 (SI Appendix, Fig. S6A; EE4) revealed selective amplification of snoRNAs, as judged by the shift observed in the shown ternary plot (Fig. 4A, SI Appendix, Fig. S9B, and Dataset S19). Surprisingly, rather than amplification of individual snoRNA genes, increased read depth was caused by the recovery of a single locus on chromosome 33 containing a cluster of 15 snoRNA genes in the P135 population, which was depleted in the original amastigote population (Fig. 4B). The restoration of this locus between P20 and P135 is further proof that snoRNAs are under positive selection during culture adaptation. Rather than intrachromosomal expansion of gene copies, the recovery of this locus is more likely due to the selection of a small subpopulation that may preexist in the original amastigote isolate and penetrate the culture between P20 and P135, similar to what we observed for the NIMA kinase spo-KO. This scenario is indeed supported by analyzing CNVs in the snoRNA cluster across independent amastigotes isolates, which revealed significant differences in snoRNA gene copy number, as indicated by the observed variations in read depth (Fig. 4C and SI Appendix, Fig. S10). Thus, while short-term adaptation in Leishmania is governed by control of RNA stability, long-term adaptation occurs through more cost-efficient gene dosage regulation. Increased snoRNA abundance likely satisfies a quantitative need for ribosomal biogenesis in our fast-growing cultures but could also affect the nature and quality of ribosomes (21, 22).

snoRNA genes are amplified in long-term adapted parasites and promote rRNA modification. (A) Ternary plot showing for each gene the relative abundance in culture passage P2, P125, and P135. The axes report the fraction of the normalized gene coverage in each sample, with each given point adding up to 100. Dots with color ranging from pink to black indicate significant gene CNVs (P value < 0.001). (B) Recovery of the full snoRNA gene cluster during culture adaptation. The panel illustrates a genome browser representation of the sequencing depth measured in the samples at the indicated passage. Gene annotations and the predicted repetitive elements are indicated. (C) Line plot showing the normalized sequencing coverage (y axis) over the snoRNA cluster region on chromosome 33 (x axis) for the original amastigote sample used to derive the P2 strain (AMA, green) and three independent amastigote isolates (AMAH154, red; AMA07142, yellow; AMA1992, blue) obtained from different hamster infections. (D) Hyper-pseudouridylated (Ψ) sites are present in the PTC functional domain of the ribosome. The relative change in Ψ levels was measured using Ψ-seq and is presented in Dataset S21. Representative line graph of the fold change in pseudouridylation levels (Ψ-fc, log2) along the rRNA nucleotides (x axis) is presented for P2 (green line) and P135 (blue line) for the PTC domain in LSU-b rRNA. The positions where the Ψ level is increased in four replicates are indicated in red. (E) The location of Ψ sites in the rRNA is depicted on the L. donovani Ld1S secondary structure, which is identical to Leishmania major and similar to T. brucei (25, 49). Hypermodified sites are highlighted in boxes. The snoRNA guiding on each Ψ is indicated. The color code for each Ψ site is indicative of the organism where it was already reported. (F) Hypermethylated (Nm) sites are located around the functional domains of the ribosome. 
The complete stoichiometry of each Nm site was measured by RiboMeth-seq in P2 and P135 L. donovani strains as presented in Dataset S20. The hypermodified Nm sites are highlighted in green, and their identity is indicated on the three-dimensional structure of the L. donovani large subunit ribosome based on the previously deposited cryo-EM coordinates [Protein Data Bank (50) accession 6AZ3] (26). (G) Model of Leishmania polyclonal adaptation. 1) Leishmania intrinsic genome instability generates constant genetic variability. 2) Epistatic interactions between gene CNVs and compensatory responses at the level of RNA stability and translation efficiency eliminate toxic gene dosage effects and harness genome instability. 3) This mechanism generates phenotypic robustness while at the same time maintaining genetic variability, thus allowing for polyclonal adaptation. 4) The genetic mosaic structure allows for distinct adaptive trajectories (T) inside and in-between adapting populations, a process that conserves genetic heterogeneity and thus evolvability of the population despite constant selection.

In the following, we assessed the possibility of such fitness-adapted ribosomes in Leishmania, especially since snoRNA genes guiding 2′-O-methylation (Nm) and pseudouridine (Ψ) modifications were extensively amplified. Amplification of snoRNAs should lead to increased modifications on sites that are accessible for modification. The mapping of Nm via RiboMeth-seq (23) revealed an increased level of modification for 18 sites by at least twofold (Dataset S20), while the mapping of pseudouridylation by Ψ-seq (24, 25) showed increase for 5 sites during adaptation from P2 to P135 (Fig. 4D and Dataset S21). Interestingly, mapping these sites on the resolved cryogenic electron microscopy (cryo-EM) structure of the L. donovani large ribosomal subunit (26) reveals their localization around the peptidyl-transferase center (PTC) and mRNA entrance tunnel, whereas the hyper-modified Ψ sites are located in the PTC itself (Fig. 4 E and F and SI Appendix, Fig. S11). Together, our data infer a complex model of Leishmania fitness gain where epistatic interactions between gene amplifications and compensatory responses at posttranscriptional levels harness Leishmania genome instability for polyclonal adaptation (see model Fig. 4G and details in the legend).
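The "at least twofold" criterion for hypermodified sites amounts to a fold-change filter between the two passages. The sketch below assumes that each site has a single modification level per sample; the site names and values are hypothetical, and the scoring that RiboMeth-seq or Ψ-seq pipelines apply upstream of this step is not reproduced.

```python
def hypermodified_sites(p2, p135, min_fold=2.0):
    """Return {site: fold_change} for sites whose level rose by >= min_fold."""
    result = {}
    for site, early in p2.items():
        late = p135.get(site)
        if late is None or early <= 0:
            continue  # skip sites missing in one sample or without baseline signal
        fold = late / early
        if fold >= min_fold:
            result[site] = fold
    return result


# Hypothetical modification levels (fraction of modified rRNA molecules).
toy_p2 = {"site_A": 0.20, "site_B": 0.45, "site_C": 0.90, "site_D": 0.0}
toy_p135 = {"site_A": 0.50, "site_B": 0.60, "site_C": 0.95, "site_D": 0.3}

hits = hypermodified_sites(toy_p2, toy_p135)
print(hits)
```

Sites that are already close to fully modified at P2 (such as site_C here) cannot double, which is consistent with the statement that amplification only increases modification at sites still accessible for it.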

A common strategy in microbial evolutionary adaptation is known as bet-hedging, where the fitness of a population in changing environments is increased by stochastic fluctuations in gene expression that are regulated at epigenetic levels (27–31). The protozoan pathogen Leishmania largely lacks transcriptional regulation, raising the question of how these parasites generate variability in transcript abundance and phenotype required for adaptation (6, 7). Our data uncover an alternative mechanism of bet-hedging that has evolved based on the unique biology of Leishmania.

First, we provide evidence that Leishmania genome instability may be driven by 11 repetitive DNA elements associated with genome-wide amplifications and deletions, both of which are under positive and purifying selection. Leishmania thus compensates for the absence of stochastic gene regulation by the generation of stochastic gene CNVs, which can cause dosage-dependent changes in transcript abundance (8–12). Second, we reveal an important role for posttranscriptional regulation in Leishmania fitness gain, which 1) compensates for the deleterious deletion of a NIMA kinase by selectively stabilizing the transcripts of an orthologous kinase, 2) allows for gene dosage-independent increase in the abundance of ncRNAs (SLRNAs, snoRNAs) required for proliferation, and 3) provides phenotypic robustness to genetically heterogeneous populations as documented by the converging transcript profiles of independent NIMA spo-KO mutants. This adaptation process guards against toxic gene dosage effects and simultaneously increases the phenotypic landscape available to Leishmania for adaptation via gene deletions and compensatory transcriptional responses. Significantly, the NIMA ortholog as well as the SLRNA and snoRNA loci are amplified during long-term culture (Fig. 4 A and B and SI Appendix, Figs. S6C and S9B), revealing a two-step adaptation process reminiscent of yeast (32), implicating first a posttranscriptional mechanism via transient changes in RNA stability, followed by a second, genomic mechanism via selection for stable CNVs.

The dynamic changes in snoRNA stability and gene copy number observed in our EE system identify this class of ncRNAs as an unexpected driver of Leishmania fitness gain. snoRNAs guide rRNA modification and processing as well as modification of snRNAs and additional ncRNAs (33, 34). There is currently no information on the effect of individual rRNA Ψ and Nm modifications on translation. It has been shown that ribosomes with different RP composition are specifically affected in translation of only a subset of proteins (35). Conceivably, changes in rRNA conformation and interaction due to modification may have similar effects. Indeed, we recently demonstrated that individual Ψ modifications affect protein binding (33, 36) and similarly may impact RP binding, generating specialized ribosomes with different translation properties. In addition, modification also affects rRNA structure, which was shown to change translation efficiency and fidelity (22, 37). Based on our results, it is interesting to speculate that differential snoRNA expression and subsequent differential rRNA modification generate fitness-adapted ribosomes with unique translation properties. Such specialized ribosomes may represent an additional layer of regulation capable of counteracting toxic gene dosage effects, providing phenotypic robustness, and adapting the proteome profile to a given environment, much as was observed in cancer cells or during differentiation of stem cells (38, 39).

Our findings have important clinical implications for Leishmania infection. Leishmania adapts to various environmental cues, notably the presence of antileishmanial drugs. In contrast to high frequency amplifications observed during experimental drug treatment in culture (40–42), treatment failure and drug resistance observed during natural infection may evolve through multilocus epistatic interactions such as those described here, which can balance the fitness tradeoff between drug resistance and infectivity (43). Therefore, our data define biological networks, rather than individual genes, as biomarkers with potential diagnostic or prognostic value. Conceivably, the epistatic mechanisms we uncover in Leishmania can be of broader relevance to other human pathologies caused by fast-evolving eukaryotic cells exploiting genome instability for polyclonal adaptation, such as cancer cells. While single nucleotide, epistatic interactions are recognized as important drivers of tumor development (44, 45), the role of epistatic interactions between structural mutations and between the genome and transcriptome in drug-resistant cancer cells remains to be elucidated (46).

In conclusion, our results propose a model of Leishmania fitness gain (Fig. 4G), where polyclonal adaptation of mosaic populations is driven by epistatic interactions that 1) buffer the detrimental effects of genome instability, 2) coordinate expression of functionally related genes, and 3) generate beneficial phenotypes for adaptation. This mechanism of fitness gain avoids genetic death by maintaining heterogeneity in competing parasite populations under environmental selection and may be generally applicable to other eukaryotic systems that adapt through genome instability.

The extended supporting materials and methods provide detailed information on the 1) L. donovani strains, culture conditions, and cell cloning, 2) nucleic acid extraction and deep sequencing, 3) whole genome sequencing data analysis, 4) reference genomes, 5) repeat analysis, 6) CNV association analyses, 7) network analysis, 8) phylogenetic analysis, 9) RNAseq analysis, 10) Gene Ontology term enrichment analysis, 11) PCR analyses, 12) null mutant analysis, and 13) methylation (Nm) and pseudouridine (Ψ) rRNA modification analyses.

Reads were deposited in the Sequence Read Archive database (47) and are publicly available under accession no. PRJNA605972. All data are available in the main text or the SI Appendix. Previously published data were used for this work (11, 13).

This study was supported by a seeding grant from the Institut Pasteur International Department to the LeiSHield Consortium, the EU H2020 project LeiSHield-MATI-REP-778298-1, the Fondation pour la Recherche Médicale (Grant FDT201805005619), the Flemish Ministry of Science and Innovation (MADLEI, SOFI Grant 754204), and a grant from CAMPUS France and the Israeli Ministry of Science and Technology PHC MAIMONIDE 2018-2019-Projet 41131ZD. We thank Cedric Notredame and Jean-Claude Dujardin for critical reading of the manuscript.

Author contributions: G.B., P.P., M.A.D., K.S.R., and G.F.S. designed research; G.B., L.P., P.P., and K.S.R. performed research; G.B., L.P., P.P., K.S.R., S.C.-C., T.D., D.-G.H., P.J.M., R.U., S.M., and G.F.S. analyzed data; and G.B., P.P., K.S.R., and G.F.S. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2113744118/-/DCSupplemental.

Excerpt from:
Genome instability drives epistatic adaptation in the human pathogen Leishmania - pnas.org

Ireland returnee tests Covid positive in Bengal; genome sequencing report awaited – India Today

Posted: at 12:44 am

A man, who had returned to Kolkata from Ireland last week, tested positive for Covid-19 following which he was hospitalised on Tuesday, a health department official said.

The department is waiting for the genome sequencing report of the 27-year-old man to check if he was afflicted with the Omicron variant of coronavirus, he added.

"The man who has been working in Dublin, Ireland, for the last five years, arrived in the city flying from Manchester via Abu Dhabi and New Delhi. He was admitted to a hospital on Tuesday morning," the official said.

His genome sequencing report from the National Institute of Biomedical Genomics (NIBMG) at Kalyani is expected within 72 hours, he added.

"He had tested negative for Covid on December 16 at a lab at Dublin, a day before his flight, and then again upon arrival at Delhi airport. He reached Kolkata on December 18 evening and went straight to his home. From Monday morning, however, he started running a high-grade fever (103 degrees Fahrenheit or above) along with body ache, malaise and headache, after which a test showed that he was Covid positive," the official said.

The patient has said he had been indoors since his arrival in Kolkata, as per the guideline for international travellers which says they have to remain in home isolation for eight days, and venture out only after testing negative through an RT-PCR test.

Meanwhile, on Tuesday, West Bengal's Covid-19 death toll rose to 19,688 after 12 more fatalities were reported in the last 24 hours, a health department bulletin said.

Three deaths each were registered in Kolkata, North 24 Parganas and Nadia districts while Hooghly and Birbhum districts recorded two and one fatalities respectively.

Altogether 440 new cases pushed the state's Covid tally to 16,27,930, the bulletin stated.

In the last 24 hours, 451 patients recovered from the disease taking the total number of cured people to 16,00,791. The discharge rate remained at 98.33 per cent.

The number of active cases is 7,451.

A total of 32,871 samples were tested for Covid-19 in West Bengal on Tuesday, taking the total number of such clinical examinations to 2,10,59,843.

Read more:
Ireland returnee tests Covid positive in Bengal; genome sequencing report awaited - India Today

Posted in Genome | Comments Off on Ireland returnee tests Covid positive in Bengal; genome sequencing report awaited – India Today

Genomic basis of fishing-associated selection varies with population density – pnas.org

Posted: at 12:44 am

Significance

Fisheries-associated selection is recognized as one of the strongest potential human drivers of contemporary evolution in natural populations. The results of this study show that while simulated commercial fishing techniques consistently remove fish with traits associated with growth, metabolism, and social behavior, the specific genes under fishing selection differ depending on the density of the targeted population. This finding suggests that different fish populations of varying sizes will respond differently to fishing selection at the genetic level. Furthermore, as a population is fished over time, the genes under selection may change as the population diminishes. This could have repercussions on population resilience. This study highlights the importance of selection but also environmental and density effects on harvested fish populations.

Fisheries induce one of the strongest anthropogenic selective pressures on natural populations, but the genetic effects of fishing remain unclear. Crucially, we lack knowledge of how capture-associated selection and its interaction with reductions in population density caused by fishing can potentially shift which genes are under selection. Using experimental fish reared at two densities and repeatedly harvested by simulated trawling, we show consistent phenotypic selection on growth, metabolism, and social behavior regardless of density. However, the specific genes under selection, mainly related to brain function and neurogenesis, varied with the population density. This interaction between direct fishing selection and density could fundamentally alter the genomic responses to harvest. The evolutionary consequences of fishing are therefore likely context dependent, possibly varying as exploited populations decline. These results highlight the need to consider environmental factors when predicting effects of human-induced selection and evolution.

The selective harvest of animals by humans is one of the most important contemporary pressures on natural populations (1, 2). Intensive commercial fishing has been demonstrated to alter life history traits (e.g., reduced body size and/or age and size at maturation) in ecologically and economically important populations (3–7). However, a major question persists about whether the observed changes stem from Darwinian evolution via selection on phenotypic traits and associated genotypes or result from human-induced environmental changes generating phenotypic plasticity (8–11). In addition, harvest-associated phenotypic plasticity may interact with the fishing selection on genotypes (Gene by Environment interaction, G × E) to alter evolutionary outcomes, but this possibility has been overlooked.

For selection by fishing to occur, there must be phenotypic variation among individuals with respect to their vulnerability to capture. Vulnerability likely comprises a suite of life history, morphological, physiological, and behavioral traits that interact to determine whether a fish will ultimately escape or be captured by a fishing gear. While size and maturation are well established to be selected by fishing (12, 13), emerging evidence shows that fishing may also drive selection on traits related to bioenergetics or social behavior that also vary widely within species (14–16). This is especially likely given that commercial fishing methods such as trawling directly exploit aspects of fish foraging, schooling, and escape behaviors to facilitate capture (14). If the traits under fishing selection possess a genetic basis, fishing could lead to direct evolution (8, 17). Recent research suggests that fisheries can induce a shift in the genomic variants of targeted populations (18–20). While an analysis of wild populations would require more detailed time series and genomic data to securely infer the genomic responses to fishing (11), previous experimental work has mainly examined responses to size-based selection with no attempt to determine whether vulnerability to capture as an integrated trait can indeed select on specific genotypes and underlying genomic variants. Such genomic information is potentially valuable for predicting the consequences of selective harvest on targeted populations with the benefit of understanding which molecular changes might be involved and fuel evolution.

Harvest-associated plasticity could also occur in response to environmental effects because fishing causes other confounding environmental changes that could influence phenotypic expression (9, 21), such as the reduction of population density over time. Indeed, intense harvesting can remove so much biomass from the environment that the density for the remaining population is altered. A reduced population density may then decrease interindividual competition and correspondingly increase resource availability or alter among-individual variation in resource acquisition (22). Such conditions may not only modify the average phenotype of the remaining population and reduce phenotypic variation within the population (23) because of more homogenous food allocation among individuals, but also affect which individuals have a selective advantage in that new context. Different phenotypes and genotypes may therefore be selected by fishing pressures depending on population density, creating G × E effects to produce new selective landscapes and evolutionary trajectories for the remaining population. Previous modeling studies (24, 25) highlighted the importance of considering the contribution of population density in harvest-induced evolution, but, so far, empirical studies examining how population density reduction can affect selection and evolutionary potential in a fisheries context are lacking. Increased knowledge of the independent and interactive roles of direct selection by fishing and density-dependent effects is critical for understanding the integrated possible evolutionary consequences of fisheries on natural populations and for devising well-informed and sustainable strategies for harvest management.

To address these issues, which are intractable in wild populations, we used an experimental approach using scaled-down gears and a surrogate species under varying population densities. The use of surrogate species has already been advocated by several authors (17, 26). Zebrafish (Danio rerio) have similar behavior (such as exploration, sociability, and shoaling) as larger fish species targeted by fisheries, reproduce easily in captivity, and thus are a suitable surrogate species for experimental studies of fisheries-induced evolution (26). We created 36 families of semiwild zebrafish, with each family equally split at hatching into either a population of baseline density [i.e., the density recommended for zebrafish rearing (27)] or a population of reduced density (half that of the baseline). After 6 mo, 10 fish per family per density were used to create our experimental populations (360 fish per density), with the baseline density fish being housed in two 55-L tanks and the reduced density fish being housed within four 55-L tanks. Each tank was further subdivided into four sections using transparent dividers. Each fish was then screened for a range of life history, physiological, and behavioral phenotypic traits, including size, growth rate, aerobic metabolic rate, aggression, and sociability. The fish were then submitted to scaled-down trawling simulations repeatedly over six fishing trials that comprised a total of 75 individual fishing events to mimic commercial fisheries that gradually harvest fish over time (SI Appendix, Fig. S1). A key advance of our study is that we quantify vulnerability to capture as an integrated trait as opposed to previous work that has mainly selected only on body size during experimental harvest (18, 19). 
The 20% most vulnerable fish (based on the shortest time to be caught in the first trawling event, i.e., captured fish) and the 20% least vulnerable fish (those that escaped the last trawling event and were never captured over the course of the trawling events, i.e., escaped fish) were identified at the end of trawling simulations (n = 72 fish per vulnerability and population density).
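
The group sizes follow from simple counts: 36 families × 10 fish gives 360 fish per density, and 20% of 360 is 72. A minimal Python sketch of such a split is given here. The function name, the data layout (a mapping from fish ID to time of first capture, with `None` for fish that were never captured), and the rule for choosing among never-captured fish are illustrative assumptions, not the authors' procedure.

```python
def split_by_vulnerability(capture_times, fraction=0.2):
    """Return (captured_group, escaped_group) as lists of fish IDs.

    capture_times: dict mapping fish ID -> time to first capture,
    or None if the fish was never captured (hypothetical layout).
    """
    n_select = round(len(capture_times) * fraction)

    # Most vulnerable: shortest time to capture comes first.
    caught = sorted(
        (t, fish_id) for fish_id, t in capture_times.items() if t is not None
    )
    captured_group = [fish_id for _, fish_id in caught[:n_select]]

    # Least vulnerable: never captured. Taking them in input order is an
    # arbitrary choice made only for this sketch.
    never_caught = [fid for fid, t in capture_times.items() if t is None]
    escaped_group = never_caught[:n_select]

    return captured_group, escaped_group


# Toy data: 360 fish, of which the last 100 were never captured.
toy_times = {i: (float(i + 1) if i < 260 else None) for i in range(360)}
captured, escaped = split_by_vulnerability(toy_times)
print(len(captured), len(escaped))
```

With 360 fish and `fraction=0.2`, each group holds 72 individuals, matching the n reported per vulnerability class and density.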

We observed that the trawling simulation selected for similar phenotypic differences between the captured and escaped fish regardless of population density (Fig. 1). A general linear model (GLM) multivariate analysis including all life history, physiological, and behavioral traits measured revealed a significant difference between the captured and escaped fish (multivariate GLM: F4,238 = 9.09, P < 0.0001). The phenotypic differences between captured and escaped fish were similar across the population densities; no interaction between vulnerability and density (multivariate GLM: F4,238 = 0.91, P = 0.46) and no main effects of density were observed (multivariate GLM: F4,238 = 1.60, P = 0.17). Further analysis of the differences using individual traits revealed that the escaped fish had a higher specific growth rate (150% faster growth, GLM: F1,285 = 10.31, P = 0.001, Fig. 1A), higher aerobic metabolic scope (16% higher, GLM: F1,286 = 7.55, P = 0.006, Fig. 1B), lower aggression (37% fewer bites toward mirror reflection, GLM: F1,262 = 4.57, P = 0.034, Fig. 1C), and lower sociability (21% further from conspecifics in sociability assay, GLM: F1,263 = 3.94, P = 0.048, Fig. 1D) than the captured fish, regardless of density.

Physiological and behavioral phenotypic selection in the two density populations. The distribution of the specific growth rate (A), aerobic scope (adjusted to the mean mass of the fish, i.e., 0.30 g) (B), level of aggressiveness (C), and sociability (D) of captured (dark gray) and escaped fish (light gray) after a series of trawling simulations reared either under a baseline or reduced density (n = 75 per group). Different letters indicate significant difference among the conditions (GLM: P < 0.05).

To determine the potential evolutionary effects of this fisheries-induced selection on phenotype and investigate the broad range of traits that could be under selection, we screened for differential selection on genetic variants using low-coverage (2× coverage per individual), whole-genome sequencing. We sequenced 24 fish (siblings from the same family origin across the groups) from high and low vulnerability to capture in the baseline and reduced density populations (n = 4 × 24 = 96 in total). We examined the differential allele frequency of the genomic variants using genotype likelihoods (28).

Our analysis of about 5.67 million reference genome-mapped single-nucleotide polymorphisms (SNPs) indicated that the trawling simulation had a selective effect at the genomic level that was consistent across families within each density. We observed an expected heterozygosity (He) of 0.25 for all samples combined, which is in the range of natural zebrafish populations (29). Using a restrictive threshold based on random permutation (upper and lower global 0.5% quantiles [z-transformed differences in allele frequencies, i.e., zdAF, = 4.65 and −4.77, respectively] of 0.05% Bonferroni-corrected z-transformed allele frequency differences from each SNP), we detected the outlier SNPs with allele frequencies that significantly differed between the captured and escaped fish across the families in each population density (Fig. 2A). We identified 239 annotated SNPs in 220 genes or in noncoding regions and 241 unannotated SNPs that significantly differed between the captured and escaped fish in the baseline density. In the reduced density, 268 annotated SNPs in 239 genes or in noncoding regions and 249 unannotated SNPs were significantly different between captured and escaped fish. By targeting several hundred genes, this fisheries-induced selection follows the classic quantitative genetic prediction of selection on complex traits (30). The outlier genes identified were mainly involved in brain function and neurogenesis (Gene Ontology [GO]; Fig. 3 and SI Appendix, Table S1). Similarly, the gene set enrichment analysis based on the z-transformed allele frequency differences of all SNPs (not only the outliers) also indicated trends of enrichment for nervous system processes (SI Appendix, Fig. S2). The trawling simulation seems therefore to induce additional selection on the neurological functions of the fish.
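
The thresholding idea can be illustrated with a short Python sketch that standardizes a list of allele frequency differences and flags values beyond fixed cutoffs. This is a simplification under stated assumptions: the permutation procedure and Bonferroni correction used in the study are not reproduced, the function names are hypothetical, and the lower cutoff is assumed to be negative (−4.77).

```python
from statistics import mean, pstdev


def z_transform(values):
    """Standardize values to zero mean and unit (population) standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        raise ValueError("all values are identical; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


def flag_outliers(daf, upper=4.65, lower=-4.77):
    """Return indices of SNPs whose z-transformed dAF lies beyond a cutoff.

    daf: allele frequency differences (captured minus escaped), one per SNP.
    upper/lower: cutoffs on the z scale; the sign of `lower` is an assumption.
    """
    z_scores = z_transform(daf)
    return [i for i, z in enumerate(z_scores) if z >= upper or z <= lower]


# Toy data: 100 SNPs with no difference and one with a large difference.
toy_daf = [0.0] * 100 + [1.0]
print(flag_outliers(toy_daf))
```

In the toy data the single shifted SNP has a z-score of 10 and is the only index flagged; real data would contain millions of SNPs with a continuous distribution of differences.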

Genomic variation and selection in the two density populations. (A) Distribution of the allele frequency difference between captured and escaped fish after a series of trawl simulations, with the complete set of SNPs (5 666 304 SNPs) and the outlier SNPs (480 and 517 SNPs in the baseline and reduced density respectively) in the fish reared under baseline (dark blue) or reduced (light blue) density. The dark line represents the outlier threshold based on the 0.5% quantiles of the Bonferroni-corrected zdAF (4.65 and −4.77, upper and lower cutoff, respectively). (B) Distribution of the genomic PC score of the outliers of captured (dark gray) and escaped fish (light gray) after a series of trawling simulations, reared either under a baseline or reduced density (n = 24 per group). Different letters indicate significant difference among the conditions (GLM: P < 0.05).

Outlier GO terms from the genes selected by the fishing simulation in the two density populations. The significance and fold enrichment of the 15 most significant GO terms are represented in the outliers of the fish reared under a baseline (A) or reduced (B) density. The numbers shown in parentheses are the GO identities of each biological process and GO term. Dark blue GO terms represent GO terms only present in the baseline density population, while clear blue GO terms represent GO terms only present in the reduced density population. Orange GO terms represent the GO terms present in the two populations. GO terms with asterisks are the GO terms with significant enrichment (FDR < 0.05). The complete list of the outlier GO terms from each population is available in SI Appendix, Table S1.

Notably, however, the genes and particular biological functions potentially selected by the trawling simulation differed depending on population density (Fig. 3 and SI Appendix, Table S1). From the outlier SNPs identified as different between the captured and escaped fish in each population density (480 in the baseline density and 517 in the reduced density), only two overlapped between the baseline and reduced density populations (one annotated and one unannotated, SI Appendix, Table S2), which is significantly fewer than expected by chance (Fisher's exact test, P < 0.0001). From the genes identified with the outlier SNPs of each population density, only eight (mainly involved in brain and eye development) were shared between the baseline and reduced density (SI Appendix, Table S2), which is similar to that expected by chance (Fisher's exact test, P = 0.10), and only one of those genes involved an overlapping SNP. Further confirmation of the density effects on genomic response, based on replication and reanalysis within experimental groups, also found that between-density differences were greater than within-density differences (SI Appendix, Tables S3 and S4) both in number of outlier genes and in gene functions. The proportions of the different genomic variants (e.g., missense variant, synonymous variant, etc.) represented in the outliers of each population density were similar across density (SI Appendix, Table S5). A mere nine GO terms overlapped between the baseline and reduced density populations, with only six in the 15 most significant GO terms (Fig. 3 and SI Appendix, Table S1). 
In addition, only six GO terms in the reduced density were significantly enriched in alleles different between the captured and escaped fish (false discovery rate [FDR] corrected: nervous system development, neuron development, multicellular organismal process, multicellular organism development, plasma membrane bounded cell projection organization, and cell projection organization), while none were significantly enriched in the baseline density (Fig. 4). These results are a strong indication of an interaction between the population density and the allele selection induced by the trawling simulation.

Summary of the phenotypic and genomic selection by the fishing simulation according to population density. Shown are differences in captured fish relative to those that were never captured, illustrating the selection by fishing on the phenotypes (physiological and behavioral traits) and the genomes (outlier SNPs, annotated genes, GO terms, and enriched GO terms) of fish reared at a baseline (dark blue) or reduced (light blue) density. The overlapping section represents the selection that is shared between the baseline and reduced density. The complete list of the GO terms from each population is available in SI Appendix, Table S1.

Genomic multivariate analysis revealed strong differential selection by trawling at the genome level, which differed between population densities. Using the allele frequency difference of all outlier SNPs from the two population densities, we ran a multivariate analysis (principal component analysis [PCA]) separating escaped from captured individuals (31% of variance explained) and extracted a genomic principal component (PC) score for each fish. The genomic PC score was then used to represent the overall genome of each individual. A significant interaction between vulnerability and density was observed on the genomic PC score obtained (GLM: F1,88 = 241.8, P < 0.0001). The difference in the genomic PC score between captured and escaped fish was more extreme at baseline density, while escaped fish had a lower genomic PC score in both density conditions (Fig. 2B). In addition, the genomic PC score of the escaped fish in the baseline density was significantly lower than the genomic PC score of the escaped fish from the reduced density (Fig. 2B). An additional analysis conducted replicating within density (SI Appendix) also revealed that no difference was observed between groups of the same density (SI Appendix, Figs. S3 and S4). These results again point to the conclusion that trawling-induced selection at the genomic level differs depending on the prevailing population density. The analysis also highlighted a stronger similarity within each density/vulnerability group than across. The same families were represented in each group, suggesting stronger effects of vulnerability and density compared to family on the presence of outlier genomic variants, meaning that the trawling selection on the genomic variants was also consistent across the different families.

The genomic PC score correlated with some of the measured phenotypic traits (Table 1 and Fig. 5), revealing genotype-phenotype associations. Significant correlations were observed between the genomic PC score and body mass (Pearson's correlation: baseline density r = 0.59, n = 48; reduced density r = 0.56, n = 48; P < 0.001 for both), specific growth rate (Pearson's correlation: baseline density r = 0.47, n = 48; reduced density r = 0.54, n = 48; P < 0.001 for both), and aggression (Pearson's correlation: combined densities r = 0.26, n = 96, P = 0.015). Even though a significant density effect was observed on the correlation between the genomic PC score and the mass or specific growth rate (Table 1), the direction and strength of the correlation were similar between densities. These correlations suggest that these phenotypic traits, which are under fishing selection, possess a genomic basis also under selection by the fishing process. Evolution in response to harvesting could thus be expected for these traits.
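
For readers unfamiliar with the statistic, Pearson's correlation coefficient between a per-fish genomic score and a trait can be computed as in the following Python sketch. The function name and the toy numbers are illustrative assumptions; no study data are reproduced, and the significance test (P value) is not included.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sums of cross-products and squared deviations from the means.
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return s_xy / sqrt(s_xx * s_yy)


# Toy example: hypothetical PC scores and body masses for five fish.
pc_scores = [-1.2, -0.4, 0.1, 0.8, 1.5]
masses = [0.24, 0.27, 0.31, 0.30, 0.36]
print(round(pearson_r(pc_scores, masses), 3))
```

Values near 1 or −1 indicate a strong linear association, while values near 0 indicate little or none.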

Correlations between genomic and phenotypic variance

Significant correlations between genomic and phenotypic variance. The correlations between the genomic PC score of the outliers and mass (A), specific growth rate (B), and aggressiveness (C) of captured (round shape) and escaped (triangle shape) fish after a series of small-scale trawling simulations reared either under a baseline (dark blue) or reduced (light blue) density (n = 24 per group). The black line in C represents the main effect of genomic PC score (there was no interaction with density). The shaded areas around the lines correspond to 95% intervals.

Our results show the potential for an important, but to date overlooked, interaction between harvest-associated selection and environmental harvest-associated effects, measured here as a reduction in population density, on the selection of genotypes. Specifically, even though selection at the phenotypic level imposed by trawling was similar in the two density conditions, selection at the genomic level acted on different underlying genes depending on the population density (Fig. 4). Harvest-associated reduction in population density could thus fundamentally shift the evolutionary trajectory of targeted populations at the genomic level. It is therefore imperative to consider that the harvest of wild individuals has the potential to not only alter the phenotypes and genotypes present within a population through direct selection but also through associated density-dependent effects on the underlying genes under selection. This combination of effects could further limit our ability to predict the evolutionary consequences of fisheries, especially as the density of targeted populations will tend to decrease through time, potentially regularly shifting the selection on genomic variants present even if selection on phenotypes remains constant.

Fish with a lower specific growth rate were more vulnerable to our simulated capture. The predicted responses of fishing selection on growth rate are complex and depend on a number of factors, including any thresholds for size-based selection (e.g., length limits at which fish can be retained because of management-based size restrictions) and energy investment before versus after maturity (3, 5, 19, 24). Importantly, our study contained no a priori assumption of size-based selectivity and instead considered vulnerability to capture as an integrative trait unto itself. Therefore, faster growing fish may have been better able to escape the trawl because of increased swimming endurance or higher absolute swimming speeds. This suggests that trawling may produce selection on growth rates, perhaps because of correlations among growth rate and components of locomotor ability and bioenergetics, which can be separate from the selective pressures on growth stemming from size-selective mortality.

Our experimental trawling simulation selected not only on life history traits (specific growth rate) but also on physiological (aerobic metabolic rate) and behavioral (aggression and sociability) traits similarly in both density conditions. This finding provides further evidence that important physiological and behavioral traits in addition to body size can be directly targeted by fishing and could determine the capacity of an individual to be captured by a particular fishing gear or technique (14, 16, 31, 32). For example, higher aerobic metabolic scope probably allows a fish to reach a faster swimming speed or have greater swimming endurance, enabling it to outswim a trawl (14). Similarly, more social fish could be more likely to follow conspecifics into the net when groupmates tire and get captured rather than leaving the group to find a way to escape (14). Selection on particular phenotypes could thus lead to a shift in the phenotypic composition of the remaining population. Because a number of these phenotypic traits also seem to possess some heritability, this could lead to differential evolution for the targeted population (8).

From a genomic perspective, our trawling simulations affected hundreds of genes, mainly associated with brain functioning and neurogenesis, in both density conditions. Any traits or other important biological functions associated with these genes could therefore also be targeted by the fisheries process in a manner that not only depends on selection by the fishing gear but also on the density of the harvested population. Our SNP analyses thus highlight that further attention should be given to the involvement of brain structure or individual cognition in the capacity of fish to escape fishing gears. Nonetheless, fishing has the potential to induce a genomic change by selection on specific SNPs, which can ultimately lead to the evolution of the population. However, our experiments do not allow conclusions about how the trajectory of selection might change in subsequent generations.

Even though the fishing simulations selected similar functions and phenotypic traits in both density conditions, specific genes and genomic regions were differentially selected in the different conditions. The possible evolutionary consequences of fishing may thus not be predicted from phenotypic observations alone (18), as changes in some phenotypic traits, such as those associated with neurological functions, and their ecological and evolutionary implications may be difficult to assess. The low genomic repeatability between the populations of different density and the subtle allele frequency shifts of many loci also highlight the high genetic redundancy and polygenic basis of the escaped phenotype (33), suggesting many different combinations of genetic changes can lead to a higher chance of escaping (34, 35). These results are concordant with the analysis of Pinsky et al. (11) that did not find strong signals of fishing-selective genomic trace in overfished cod populations. They explained that these results were either because of density-dependent phenotypic plasticity or polygenic selection with subtle allele frequency changes, both of which we show to be of major importance for fishing-induced selection. Population density may affect how fish experience intraspecific competition and other among-individual interactions (22), potentially affecting the level and quality of external sensory stimuli received by each individual and, therefore, their brain development (36). Therefore, depending on the density of a targeted population, distinct genomic pathways may be under selection by fishing, potentially leading to divergent evolutionary trajectories over time. 
As intense harvesting may be accompanied by a pronounced reduction of population density, the interaction between harvest-associated genomic selection and density-dependent environmental effects (G × E) is likely to occur in targeted populations and potentially shift the evolutionary outcome of the fishing process. The presence of G × E could thus limit the strength of the fishing selection through time, selecting different genomic regions when population density is reduced, potentially maintaining genetic diversity as previously reported (11). Alternatively, such G × E could also threaten the resilience of populations to further harvesting pressure or environmental challenges because of the selection of potentially maladaptive genomic variants or unexpected correlated changes on other phenotypic traits initially believed to be unrelated to fishing. The greater the density reduction in the targeted population, the greater the probability of density-dependent G × E interactions. At low densities, populations are also at increased risk of experiencing Allee effects, which occur when the per capita population growth rate (and average fitness of individuals within the population) declines as abundance decreases (37, 38). Such Allee effects could additionally limit the rate of recovery of the targeted populations and have nonlinear consequences that are challenging to predict (25, 37).

The genes under selection that were involved in brain function and cognition might underlie the differences observed in growth and aggression between captured and escaped fish. For example, an enhancement of cognition could lead to improved food finding, foraging, and competitive ability (5, 39), which in turn could lead to faster growth. The absence of correlations between genomic PC score and either aerobic metabolic scope or sociability, despite these phenotypic traits being under selection in the trawling simulations, suggests that these traits may not possess a clear genomic basis that would be targeted by fishing selection or, alternatively, are highly polygenic with fitness effects spread across a large number of genomic variants. Other factors (probably more environmental than genotypic, such as the presence of conspecifics in the net or training effects on aerobic capacity) could have also influenced the contribution of these traits to fish vulnerability to capture and may also play a role in fishing in general.

It is important to consider the similarities and differences between our experimental setup and actual trawl selection. Swim flumes have previously been used as a tool to study fish vulnerability to trawling (40, 41) and mimic the tendency of fish to hold station at the mouth of an approaching trawl net (42). A notable difference is that real trawls can target hundreds or thousands of fish simultaneously, while we were limited to the number of fish we could test within a given trial. Increased numbers of individuals could enhance the importance of social interactions for vulnerability to capture. Fish in real trawls may have additional opportunities for escape either above or around the net or beneath the ground gear. While we simulated these escape routes, it is possible that differences in escape mechanics may alter the traits under selection in addition to traits associated with swimming performance. It is notable, however, that fish in our trawl simulation used escape routes in a manner similar to that observed in real trawls (43). Finally, the current study focused on the critical final stage when the fish have encountered the gear and attempt to escape. Actual harvest-induced selection may integrate various additional steps which will determine an individual's overall capture vulnerability (14), including habitat use by individual fish and gear encounter rate (14, 44). Additional work is required to understand how the various stages of the capture process may further affect which traits and genes are under selection.

The present findings suggest that a one-size-fits-all evolutionary approach would not be appropriate for the management of wild populations that are subjected to harvest by humans (7). Instead, a more integrative approach that considers both direct human-induced selection and other environmental effects such as population density is necessary. Both genomic and ecological factors must be considered to fully understand and predict the consequences of human-induced selection on the resilience of natural populations. This is because the outcome of selection in one environment will most likely not be representative of the outcome of selection in another environmental context (45). Our results also have wide implications for studying the interplay of genetic and ecological factors in determining possible evolutionary outcomes in a broader ecological context, for example, in the case of predation or the interaction between sexual selection and population fluctuations.

In 2017, a semiwild zebrafish (Danio rerio) population sourced from rearing ponds in Malaysia (JMC Aquatics) was transferred to the University of Glasgow. A total of 24 adults were used to produce 36 families in a controlled factorial (North Carolina II) breeding design, where four groups of three males were reciprocally crossed to three females. After hatching (at 4 d postfertilization), each family was separated equally into two densities: a baseline density (60 larvae/L) and a reduced density (30 larvae/L). The families were then transferred and kept separated in 2-L tanks held under a 13-h light : 11-h darkness photoperiod and supplied with recirculating dechlorinated filtered freshwater maintained at 28 °C. The larvae were fed four times daily with a combination of commercial food (TetraMin baby, ZM fry food, Zebrafeed, Novo Tom) and live Artemia nauplii. After 2 mo, we estimated about 10% mortality within families for both density conditions and readjusted the number of fish per density (baseline density 40 juveniles/L and reduced density 20 juveniles/L). When the fish reached 6 mo, 10 fish per family and density (i.e., 360 fish per density) were randomly chosen within the tanks and tagged using a visual implant elastomer (Northwest Marine Technology) with a unique code identifier of four colors on the dorsal region (46). The families were then mixed and transferred into 55-L tanks divided into four equally sized sections (each being around 13.5 L) supplied with recirculating dechlorinated filtered freshwater maintained at 28 °C (SI Appendix, Fig. S1). The density conditions were then adjusted to 6 fish/L [baseline density (27)] and 3 fish/L (reduced density). The fish were fed twice daily with a combination of commercial food (TetraMin Tropical Flakes, ZM small granula) and live Artemia nauplii. Less than 1% mortality was observed within families and tanks during this period of rearing for both densities. The rearing conditions were thus unlikely to have represented an initial source of genomic selection. Before every manipulation, the fish were fasted for 24 h.

During the tagging (6 mo old) and at 9 mo old, the fish (n = 360 per density) were measured for their body mass (to the nearest milligram) and fork length (to the nearest 0.01 mm), and their sex was determined. The specific growth rate (SGR) of each fish was calculated according to the formula SGR = Ln (mf − mi)/T, in which mf is the mass (g) of the fish at 9 mo, mi is the mass (g) of the fish at 6 mo, and T is the time in days between the two measurements.

Individual fish oxygen uptake (MO2) was measured using intermittent flow respirometry as previously described (47, 48). Briefly, the setup was immersed into a 40-L tank filled with fully aerated freshwater thermoregulated at 28 °C and shielded from surrounding disturbances. The setup comprised 16 glass chambers (22 mL) connected to oxygen probe holders in a closed recirculating loop using a peristaltic pump. The closed circuit ensured good mixing of the water and allowed the monitoring of the oxygen level in the chambers using FirestingO2 optical oxygen meters and probe sensors (PyroScience GmbH), which were calibrated daily and inserted in the probe holders. Submersible pumps (Eheim GmbH) supplied fresh fully aerated water into the chambers for 2 min every 10 min, creating measuring cycles.

Individual fish were placed in a 30-L swimming tunnel (Loligo Systems) and forced to swim until exhaustion (i.e., when no longer able to swim against the flow) for 2 min. The fish were then rapidly placed into a respirometry chamber to measure their maximum metabolic rate (MMR) postexercise. The fish were maintained in the chambers overnight (i.e., about 15 h) to estimate their standard metabolic rate (SMR). The fish were then removed from their chambers, measured for their mass and length, and returned to their rearing tanks. Blank oxygen consumption was measured in the empty chambers before and after the measurements of the fish to estimate bacterial respiration.

Fish MO2 (mg O2 h−1) was calculated using the slopes of decline in oxygen in the chambers measured in LabChart multiplied by the volume of the chambers minus the volume of the fish and corrected by the background bacterial respiration. Fish SMR was determined as the 0.2 quantile of the MO2 measurements (49). The fish MMR was determined as the maximum MO2 obtained during the 30 min after the swimming exercise. The aerobic scope (AS) was determined as the difference between MMR and SMR.
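The calculation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' LabChart workflow: the function names, the sign convention for the slope, the simple subtraction of background respiration, and the choice of quantile interpolation method are all assumptions made here.

```python
import statistics


def mo2_from_slope(slope_mg_per_l_per_h, chamber_volume_l, fish_volume_l,
                   background_mg_per_h):
    """Convert an oxygen decline slope into fish MO2 (mg O2 per hour).

    Assumes the slope is negative while oxygen declines and that background
    (bacterial) respiration is removed by simple subtraction.
    """
    water_volume_l = chamber_volume_l - fish_volume_l
    return -slope_mg_per_l_per_h * water_volume_l - background_mg_per_h


def metabolic_summary(overnight_mo2, post_exercise_mo2):
    """Return (SMR, MMR, AS) from lists of MO2 values."""
    # SMR: 0.2 quantile of the overnight measurements (first of four cut points).
    smr = statistics.quantiles(overnight_mo2, n=5, method="inclusive")[0]
    # MMR: highest MO2 recorded in the post-exercise window.
    mmr = max(post_exercise_mo2)
    # AS: difference between MMR and SMR.
    return smr, mmr, mmr - smr
```

Using a low quantile rather than the minimum for SMR makes the estimate less sensitive to a single unusually low reading; other quantile definitions would give slightly different values on small samples.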

The aggressiveness of each fish was measured using a mirror assay. The setup comprised 16 individual square tanks (17 × 17 cm) filled to a 5-cm depth with aerated freshwater thermoregulated at 28 °C and shielded from surrounding disturbances. The fish were acclimated in the empty tanks for 10 min. The mirror (8.5 × 30 cm) was then introduced on one side of the tank, and the fish behavior was recorded for 10 min using four webcams (Logitech HD Pro C920) and iSpy software (iSpyConnect). The videos were then analyzed using Ethovision XT 11 (Noldus, 2001), and the total number of bites against the mirror was quantified and used as a proxy for the fish's aggressiveness.

The sociability of each fish was measured using four rectangular glass tanks subdivided into three sections with a central focal section (32 × 19 cm) and two side sections (13 × 19 cm) separated by transparent acrylic. The tanks were filled to a 10-cm depth with aerated 28 °C freshwater and shielded from surrounding disturbance. Between each trial, 50% of the water was changed to maintain the water temperature and oxygenation level. At the beginning of the trial, a group of stimulus fish (three males and three females unfamiliar to the focal fish) were placed randomly in one of the side sections and left to acclimate for 5 min. The other side section remained empty. The focal fish was then placed in the central focal section within a transparent cylinder placed in the middle of the section to acclimate for 5 min. The cylinder was then removed, and the fish behavior was recorded for 20 min using two webcams (Logitech HD Pro C920) and iSpy software (iSpyConnect). At the end of the trial, the fish were removed, their mass and length measured, and returned to their holding tanks. The videos were then analyzed using Ethovision XT 11 (Noldus, 2001), and the average distance of the focal fish to the stimulus group of fish was determined and used as a proxy for fish sociability.

The trawling simulations took place after the phenotypic characterization in a 90-L swimming tunnel (Loligo Systems) at 28 °C shielded from surrounding disturbance. A 30-cm-long small-scale custom-designed model trawl net (designed by the Fisheries and Marine Institute of Memorial University of Newfoundland) with escape routes on the upper side areas of the net mouth was used for the simulations (SI Appendix, Fig. S1). For each trawling event, fish from either the baseline or reduced density were acclimated in groups of 16 in front of the trawl hidden by a separator at a water velocity of 4 cm s−1 for 20 min. After the acclimation, the separator was removed, and the water velocity was rapidly increased (over 30 s) to 50 cm s−1, the lowest velocity at which individuals shift to anaerobic swimming (i.e., upper limit of sustainable swimming as is the case in an actual trawling event) (50). The event lasted for 10 min, during which the time at which each fish was captured by falling back into the trawl was recorded to determine vulnerability to trawling capture. Fish that reached the end of the net passed through a tube and into an acrylic compartment where they were shielded from the oncoming flow. This simulated being captured in the codend but allowed the fish to be retained without being compressed against the net. At the end of the event, the position of the fish (captured in the net or acrylic compartment, or escaped either in front of or behind the net) was also recorded. Once the entire population had passed through the first trial (n = 360 fish per density, 22 events in the first fishing trial), the 20% of fish that were most vulnerable in each density (n = 72 per density) were identified according to their time of capture and were removed from the experimental populations. The experimental fish were then returned to their rearing tank randomly.
The trawling simulations with the new populations were repeated every week for 6 wk in total (six fishing trials consisting of 75 fishing events in total), identifying and removing each time the 20% of fish that were most vulnerable to capture. At the end of the 6-wk period, the 20% of fish least vulnerable to capture in each density (n = 72 per density) were those that had escaped every trawling simulation. These escaped fish together with the 20% most vulnerable fish to trawling capture (i.e., those captured in the first trawling simulation) were then considered to be our vulnerability groups.

After the fisheries simulation, a fin clip was taken from 24 individuals from each vulnerability group under each density (total n = 96), with the 24 individuals balanced from the families present in all the groups (SI Appendix, Table S6). The MagMax DNA Multi Sample Kit (Applied Biosystems) was used to extract high molecular weight DNA from tissue. The concentration, purity, and integrity of the DNA extractions were assessed using the Qubit double-stranded DNA broad range (dsDNA BR) assay (Thermo Fisher), the Nanodrop (Thermo Scientific), and electrophoresis on 1% agarose gel. A barcoded library for each individual was prepared using NEBNext Ultra II FS DNA Library Prep Kit for Illumina (BioLabs.) reagents and protocols. Briefly, 100 ng input DNA samples were digested over 20 min followed by adapter ligation. The products were then cleaned and size selected (250 to 500 base pair [bp]) using Agencourt AMPure XP beads. PCR was used to add the unique dual index barcodes and amplify the libraries over seven cycles. The libraries were then combined equally and purified using Agencourt AMPure XP beads. The final concentration of the library was quantified (9 ng/µL) using the Qubit dsDNA BR assay (Thermo Fisher), and the fragment size distribution was assessed using high-sensitivity DNA assay on an Agilent Bioanalyzer instrument (average size 325 bp). The library was sequenced on four lanes of Illumina HiSeq X Ten (BGI) with paired-end 150-bp reads.

The raw reads were filtered to remove potential lower quality reads and artifacts using Trimmomatic v0.36 (51) and cutadapt v1.16 (52). The reads were aligned and mapped to the zebrafish reference genome (GRCz11) using the mem algorithm of Burrows-Wheeler Aligner software (BWA v0.7.17) (53). Sequence duplicates were removed with MarkDuplicates in Picard v2.18.14 (54). The coverage per individual was 2 ± 0.5× in the final dataset. ANGSD v0.928 (55) was used to calculate genotype likelihoods for each individual and to estimate allele frequencies in each vulnerability group for both densities. The following site filtering options were used in ANGSD: -SNP_pval 1e-6 -remove_bads 1 (removal of bad mapped reads), -setMinDepth 48 (minimum sum of depth across individuals), -setMaxDepth 600 (maximum sum of depth across individuals), -minInd 48 (minimum number of individuals), -minQ 20 (minimum read quality), -minMapQ 20 (minimum mapping quality), and -minMaf 0.05 (minimum minor allele frequency). We used the group allele frequencies from ANGSD to calculate z-transformed differences in allele frequencies (zdAF) between the captured and escaped fish in each density [i.e., zdAF = (dAF − mean(dAF))/sd(dAF)] using R v3.5.1 (56). For additional analyses, we also subdivided the fish from each density and vulnerability group into two post hoc groups with family shared evenly, to create two replicated genomic analyses within each density. A global maximum likelihood estimate of expected heterozygosity (He) was calculated in ANGSD using the real site frequency spectrum (SFS) function on all the allele frequencies.
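As an illustration of the z-transformation of allele-frequency differences described above, here is a minimal Python sketch. The function name and the direction of the difference (captured minus escaped) are assumptions made for this example; the authors performed the calculation in R, and the sample standard deviation used here matches the behavior of R's sd().

```python
import statistics


def zdaf(af_captured, af_escaped):
    """Z-transform per-SNP allele-frequency differences between two groups.

    Both inputs are equal-length lists of group allele frequencies, one
    value per SNP. The sign convention (captured - escaped) is assumed.
    """
    # Raw allele-frequency difference (dAF) at each SNP.
    daf = [c - e for c, e in zip(af_captured, af_escaped)]
    mean_daf = statistics.fmean(daf)
    # Sample standard deviation (n - 1 denominator), as in R's sd().
    sd_daf = statistics.stdev(daf)
    return [(d - mean_daf) / sd_daf for d in daf]
```

After the transformation the values are centered on zero with unit standard deviation, so a given zdAF expresses how far a SNP's frequency shift lies from the genome-wide average shift.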

To determine the outlier threshold, we created 25 permutation groups of 24 randomly chosen individuals, inferred group allele frequencies in ANGSD using the same data filtering options as described in the previous paragraph, and recalculated zdAF between the 625 possible pairings of 25 permutation groups. We thus estimated the random zdAF distribution for each SNP position derived from the 625 zdAF values obtained from the different pairings. From this random zdAF distribution per position, we first considered the upper and lower 0.05% quantiles after Bonferroni correction for the total number of SNPs (P < 8.8e-11, i.e., 0.0005/5.67M; equivalent to Bonferroni-corrected empirical P values, using function (x) quantile (x, p-value) in R) to be the significant zdAF threshold at each SNP. This analysis generated a distribution of SNP thresholds across the genome that we then compiled and used to calculate a global 0.5% quantile threshold. Outlier SNPs between captured and escaped fish in each density were then defined as those with zdAF values exceeding the upper or lower global 0.5% quantiles of the 0.05% Bonferroni-corrected quantile thresholds.
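The arithmetic behind the Bonferroni-corrected tail probability, and the idea of reading thresholds off a per-SNP null distribution, can be sketched as follows. This is a simplified illustration with hypothetical function names, not the authors' R code; the interpolation rule is chosen here to mirror R's default quantile type, which is an assumption about what was used.

```python
def bonferroni_tail(tail_probability, n_tests):
    """Divide a tail probability by the number of tests (here, SNPs)."""
    return tail_probability / n_tests


def empirical_thresholds(null_values, tail_probability):
    """Lower and upper empirical quantiles of one SNP's null zdAF values.

    Uses linear interpolation between order statistics, mirroring R's
    default quantile type 7 (an assumption about the original analysis).
    """
    ordered = sorted(null_values)
    n = len(ordered)

    def quantile(p):
        position = p * (n - 1)
        lower_index = int(position)
        upper_index = min(lower_index + 1, n - 1)
        fraction = position - lower_index
        return ordered[lower_index] + fraction * (
            ordered[upper_index] - ordered[lower_index])

    return quantile(tail_probability), quantile(1.0 - tail_probability)


# Figures stated in the text: a 0.05% tail, about 5.67 million SNPs,
# and 25 x 25 pairings of permutation groups.
corrected_tail = bonferroni_tail(0.0005, 5.67e6)
n_pairings = 25 * 25
```

With only 625 null values per SNP, such a small tail probability interpolates to a point that is practically the minimum or maximum of the null values, which is worth keeping in mind when interpreting the per-SNP thresholds.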

Outlier SNPs were annotated using the annotations contained in the zebrafish reference genome (GRCz11) RefSeq annotation file available on National Center for Biotechnology Information (NCBI). Using the annotated outlier SNPs, we performed a PANTHER-based GO term enrichment analysis (57) on the Gene Ontology webpage (http://geneontology.org/) (58), applying a significance threshold of FDR < 0.05. In addition, using the total set of SNPs, we averaged the zdAF values of each gene in each density population, and performed a gene set enrichment analysis in WebGestalt (59–61), applying again a significance threshold of FDR < 0.05. The total set of SNPs and the outlier SNPs from each density were also used for an analysis of SNP types using SnpEff v4.4 (62). We first created a SnpEff database based on the GCA_000002035.4_GRCz11_genomic.fna sequence file and the GCF_000002035.6_GRCz11_genomic.gff annotation file available on NCBI. The chromosome name format in the annotation file was changed to the format in the sequence file. Then, we analyzed SNP types using SnpEff on each set of SNPs.

PCAngsd v0.98 (63) was used to obtain PC scores based on the BEAGLE genotype likelihood files from ANGSD using only sites of zdAF outlier SNPs. Based on the separation of the captured and escaped individuals in the PCA plots, the PC2 scores were used for subsequent statistical analysis, as PC1 mainly clustered the variability within the escaped fish from the reduced density.

Data normality and homogeneity of variance were tested according to the analysis of the distribution of model residuals and Levene tests, respectively. The level of aggressiveness (number of bites against a mirror) and the PC score based on allele frequency difference of the outlier SNPs were not normally distributed, so data were ranked, and statistical procedures were applied on ranks (64). A general linear model multivariate analysis of covariance was used to analyze the global shift of the fish phenotypes (including fish SGR, AS, level of aggressiveness, and sociability) with sex, density, and vulnerability as well as their interaction as fixed effects and mass as a covariate. Subsequently, a general linear model was used to analyze the fish individual phenotypes SGR, AS, level of aggressiveness, and sociability with sex, density, and vulnerability as well as their interaction, fitted as fixed effects, and mass (or length in the case of sociability) as a covariate. Tank effects on the phenotypic variables were not significant (GLM: growth rate, F3,282 = 2.09, P = 0.11; aerobic scope, F3,282 = 0.55, P = 0.65; aggression, F3,259 = 2.07, P = 0.11; sociability, F3,256 = 0.05, P = 0.98). The PC score was also analyzed using a similar general linear model but without covariate. A posteriori Tukey tests were used for mean comparisons. The correlation between the PC score and mass, SGR, AS, level of aggressiveness, and sociability was evaluated using the Pearson correlation coefficient. Statistical analyses were performed using Statistica 7 (Statsoft), and all visuals were created using ggplot2 in R v3.5.3. A significance level of α = 0.05 was used in all statistical tests.

We thank G. Law, R. Phillips, A. Kirk, and B. Allan for help in the maintenance of the fish populations throughout the project time and L. Chauvel for help during the trawling simulations. We also thank D. Thambithurai and J. Hollins for assistance in assembling the model trawl system. The experiment was carried out under License 60/4461 from the Home Office. This research was supported by a Marie Curie Fellowship (708762-DIFIE) to A.C., a Fisheries Society of the British Isles Studentship to K.S., the Wellcome Trust (105614/Z/14/Z) to K.R.E., and a Natural Environment Research Council Advanced Fellowship (NE/J019100/1) and European Research Council Starting Grant (640004) to S.S.K.

Author contributions: A.C., J.L., and S.S.K. designed research; A.C., T.M., and A.R. performed research; A.C., K.R.E., and S.S.K. contributed new reagents/analytic tools; A.C., K.S., A.J., K.R.E., and S.S.K. analyzed data; and A.C., K.S., T.M., A.R., A.J., J.L., K.R.E., and S.S.K. wrote the paper.

The authors declare no competing interest.

This article is a PNAS Direct Submission.

This article contains supporting information online at https://www.pnas.org/lookup/suppl/doi:10.1073/pnas.2020833118/-/DCSupplemental.

Go here to see the original:
Genomic basis of fishing-associated selection varies with population density - pnas.org


Second Genome Granted US Patent for Proteins for the Treatment or Prevention of Epithelial Barrier Function Disorders – PRNewswire

Posted: at 12:44 am

BRISBANE, Calif., Dec. 20, 2021 /PRNewswire/ -- Second Genome, a biotechnology company that leverages its proprietary platform sg-4sight to discover and develop precision therapies and biomarkers from public and proprietary microbiome data, today announced that the U.S. Patent and Trademark Office issued a U.S. patent, No. 11,174,293 B2, entitled, "Proteins for the Treatment of Epithelial Barrier Function Disorders." The patent covers a potential first-in-class therapeutic that improves mucosal healing, and other novel proteins and pharmaceutical compositions comprising those proteins that have application in the treatment or prevention of inflammatory bowel diseases (IBD), including Crohn's disease and ulcerative colitis, and other epithelial barrier function disorders.

"The issuance of Second Genome's U.S. patent is another milestone in the advancement of our lead mucosal healing candidate and strengthens our overall intellectual property portfolio," said Karim Dabbagh, Ph.D., President and Chief Executive Officer of Second Genome. "There is an opportunity to improve current IBD treatments beyond suppression of disease associated inflammation by targeting the integrity of the epithelial barrier. Second Genome's novel proteins demonstrate the potential to act directly at the epithelial barrier to influence mucosal healing in a variety of gastrointestinal and epithelial barrier function disorders."

Mucosal healing is a key therapeutic goal for IBD. Second Genome's lead protein candidate targeting mucosal healing is delivered directly to the gut via an engineered L. lactis probiotic system and has the potential to address a broad patient population due to anticipated combinability with standards of care across lines of therapy in IBD. Second Genome expects to file an Investigational New Drug (IND) application with the U.S. Food and Drug Administration (FDA) in 2022.

About Second Genome

Second Genome is a biotechnology company that leverages its proprietary tech platform sg-4sight to discover and develop transformational precision therapies and biomarkers through clinical development and commercialization based on novel microbial genetic insights. We built a proprietary microbiome-based drug discovery and development platform with machine-learning analytics, customized protein engineering techniques, phage library screening, mass spec analysis and CRISPR, that we couple with traditional drug development approaches to progress the development of therapies and diagnostics for wide-ranging diseases. Second Genome is advancing a deep drug discovery and biomarker pipeline with precision therapeutics and biomarker programs in inflammatory bowel disease (IBD) and cancer, with the lead programs in IBD and cancer expected to enter clinical development in 2022. We also collaborate with industry, academic and governmental partners to leverage our microbiome platform and data science. We hold a strategic collaboration with Gilead Sciences, Inc., utilizing our proprietary platform and comprehensive data sets to identify novel biomarkers associated with clinical response to Gilead's investigational medicines. We also hold a strategic collaboration with Arena Pharmaceuticals to identify microbiome biomarkers associated with clinical response for their lead program in gastroenterology, etrasimod. For more information, please visit http://www.secondgenome.com.

Investor Contact: Argot Partners, 212-600-1902

Media Contact: Argot Partners, 212-600-1902

SOURCE Second Genome

View post:
Second Genome Granted US Patent for Proteins for the Treatment or Prevention of Epithelial Barrier Function Disorders - PRNewswire


How HaystackAnalytics, powered by Intel Startup Program, is democratising genomics with its automated plug-n-p – YourStory

Posted: at 12:44 am

Among the numerous startups that have been working alongside the government in the fight against the COVID-19 pandemic has been HaystackAnalytics. The Mumbai-based healthtech startup is using genomics to trace the transmission of coronavirus in India and understand the transmission patterns of the virus to help the government plan a better strategy to fight the pandemic.

Genome is the data centre for every living being. Sequencing this genome opens up these data points to be processed for understanding the diseases anyone is facing or will face. Sequencing the genome helps researchers to get five million data points for a pathogen to three billion data points for a human sample.

"Unlike the RT-PCR (reverse transcription polymerase chain reaction) tests, which look at a very small piece of the information, genome sequencing provides near-complete information about the biological system stored in the DNA/RNA of a pathogen or a human. Sequencing is almost a gold standard of comprehensive information," shares HaystackAnalytics's co-founder and CEO, Dr Anirvan Chatterjee.

"HaystackAnalytics is envisioning a world where several million diagnostic tests would be done daily based on genome sequencing technology and is building the technology for data handling and data interpretation at that scale. The HaystackAnalytics genomic SaaS solution replaces the current time-consuming processes of genomic data analysis (bioinformatics) and reduces the analysis time from several days to a mere few minutes," explains the co-founder.

Dr Anirvan points out that while genomics is not a new interdisciplinary field of biology, its usage hasn't percolated much outside academics into mainstream diagnostics. "While everyone in the healthcare industry understands the impact of genomics - right from enabling faster and accurate diagnosis to inferring the right medication to creation of new biomolecules and even new medicines - the adoption of genomics has been slow," he says.

But other factors make it difficult to accept this technology. "The time required to analyse the genomic data is long and requires a high level of expertise. Further, the tremendous advances in modern computing are only beginning to be imbibed into genomics, necessitating a rapid churn in the analytical processes. The lack of such agile applications, combined with general inertia to onboard a new tech, severely limits the ability of legacy healthcare providers to scale and adapt rapidly. And, in cases where sequencing platforms have been developed, the deployment of such technology at a mass scale has been limited, especially in resource-limited settings," shares Dr Anirvan.

Today, HaystackAnalytics is addressing these challenges to scale this technology and make it mainstream by bringing automation to genomics with the HaystackAnalytics genomic SaaS solution. A comprehensive one-test solution, HaystackAnalytics -series is a genomic analysis software that analyses genomic data and can be deployed through the hardware that has been co-developed with Intel as a plug-and-play solution for diagnostic labs and hospitals. This will get genomics, which was otherwise mainly used for research, as a scalable solution for mass deployment for patients. This will also remove the need for current molecular diagnostic labs to update infrastructure or reskill. HaystackAnalytics is an enabler of genomics-driven comprehensive diagnostics that reduces uncertainty for doctors and encourages better decision-making leading to better patient outcomes, explains Dr Anirvan.

Having worked for a decade in genomics, Dr Anirvan's expertise lies in the deployment of Next Generation Sequencing (NGS)-based analysis in healthcare at the University of Oxford and DTU Denmark. At HaystackAnalytics, Dr Anirvan is joined by the startup's two co-founders - Prof Kiran Kondabagil and Gaurav Srivastava. While Kiran is at the Indian Institute of Technology, with significant expertise in molecular biology, infectious disease, genomics and evolutionary biology, Gaurav is an alumnus of IIT, Kharagpur with expertise in navigating operations of tech startups across Singapore and India.

The three co-founders and the team spent the first six months in stealth building the genomic analytic platform for a rapid tuberculosis antibiotic test. In early 2021, the platform was being tested in private and government healthcare setups. The startup received a government grant to build the capability for COVID-19 genome analysis. "Because of the platform's default software capabilities, we were able to quickly migrate the application to suit the needs of COVID-19 genome sequencing," says Dr Anirvan. In June 2021, the startup launched the rapid tuberculosis antibiotic test. The solutions were deployed in JJ Hospital in Mumbai and were also leveraged in the National TB Elimination Programme. Now HaystackAnalytics is gearing up for the launch of their next clinical application on their platform, internally dubbed ID, which has been validated for detecting several hundred pathogens in a single test.

"The back-end AI-Machine Learning driven inferencing engine enables us to rapidly empanel new use cases for genomic diagnostics, and also enable drug-discovery. Today, we are creating several applications wherein each application is focused on one disease or one scenario outcome," he says.

A key factor that has enabled HaystackAnalytics to accelerate its product development journey in the early days has been its association with Intel. "Intel has been a key player in developing computing technology for genomics with globally leading universities such as MIT and Harvard. Intel chips have been game changers in the field. Intel India also has deep expertise in core memory and computing optimisation specifically for handling genomic data processing algorithms, enabling us to take the analysis on-site for the client more easily and build scale at a greater level," shares Dr Anirvan. In 2020, getting selected for the Intel Startup Program brought the opportunity to leverage this expertise and co-develop a computing device to enable last mile access to the Haystack genome analytic platform. He says, "The EDGE computing device has been built taking advantage of numerous consumer products already available in the market. By customising the computing device, we have been able to bridge the last mile gap in building scale by enabling direct deployment of the software by the diagnostics or hospitals."

In a span of a year since the launch of genomic SaaS solutions, the impact that HaystackAnalytics has been able to drive speaks for the need and effectiveness of its platform. Today, the startup's plug-and-play genome sequencing solution allows diagnostic labs and providers to move to a technology-centric approach of diagnosis for general patient consumption, which promises better accuracy, faster turnaround time, all at a significantly lower cost.

"By bringing automation to genomics, we are able to drastically reduce the test turnaround time by anywhere between 33 to 85 percent depending on the disease/infection," Dr Anirvan says. For instance, by introducing a single test for 18 antibiotics for tuberculosis, HaystackAnalytics reduces the turnaround time to four days instead of the four to six weeks incurred by the traditional testing methodologies. This also eliminates the need for multiple diagnostic tests, and the end-user, which can be the patient, the clinician or the doctor, can get the required information on day one of the test results. The other advantage that a platform like HaystackAnalytics brings, especially when used in a public health setting, is that, unlike current diagnostic tests which help to diagnose disease outbreaks, the platform will be able to use the genomic data to preempt disease outbreaks.

The software also has a versatile use case application depending on who the end user is. While in the case of clinicians or healthcare practitioners, they are able to derive answers to questions such as which drug will work better, public healthcare policymakers or decision-makers are able to use the data to find intelligent ways of applying public health interventions where they are most required.

The startup believes that the shift from the current day diagnostic protocols towards molecular and genomic diagnostics has only just begun. "We expect in the next five to seven years, 30 percent of the current day pathological diagnostic tests that are being done will actually transition into genomic tests on the clinical side," says the CEO.

From the first test launched by HaystackAnalytics, they have scaled rapidly and multiple partners have come on board to utilise the whole genome sequencing solution provided. "We have launched the TB WGS test as our first test and we have grown the total market 20x in the last four months and now own 90 percent market share. Just in this product, we expect the market to increase 25x in the next 12 months working with our partners and expect a similar growth trajectory for all our products that we are rolling out," says Gaurav Srivastava.

"Almost all of the top 10 diagnostic labs of the country today are partnering with HaystackAnalytics: Metropolis, Dr. Lal Path Labs, Anderson and Thyrocare, to name a few top players. We now have all of the major hospitals as our partners, with hospitals like AIIMS and CMC Vellore all sending us their samples," shares Dr Anirvan.

HaystackAnalytics follows a model of partnering with diagnostic labs and hospitals, enabling them to get started with their genomic product offerings. With multiple partners coming on board in the last five months, the industry is showing clear intent to upgrade to the cutting-edge solutions provided by HaystackAnalytics. Several products catering to chronic and infectious disease diagnostics are in the pipeline, and some of them are being developed in collaboration with partners.

"Genomics as a market is itself nascent and set to grow 100x in the next 3-5 years. For the industry to handle that much data and volume, technology and automation are the only way forward. Our being an innovator in this technology opens up a tremendous market ahead of us to scale, but what drives us is the opportunity to make a lasting change in the way healthcare is perceived by all stakeholders," concludes Kiran.

The Intel Startup Program is Intel India's flagship program to engage with technology startups that have IP or innovative solutions with the potential to create impact on customers and that align with Intel's focus areas. The program is at the forefront of engaging with India's startup ecosystem through high-impact collaborations with industry, academia and government, and runs multiple initiatives that are either vertically aligned or focused on emerging technologies.

It engages with startups that have a unique global or local value proposition to solve genuine customer problems, enabling them with domain and business expertise from the industry and the best mentorship from Intel.

For more details visit: https://www.intel.in/startup-program

Go here to read the rest:
How HaystackAnalytics, powered by Intel Startup Program, is democratising genomics with its automated plug-n-p - YourStory

Eye on Omicron: SRL approaches INSACOG to participate in the genomic surveillance – BusinessLine

Posted: at 12:44 am

SRL Diagnostics, which runs the country's largest path-lab network, has approached INSACOG to be part of the national genomic surveillance efforts.

The Indian SARS-CoV-2 Genomics Consortium, or INSACOG, is a network of 38 Government-owned labs established to monitor genomic variations of the virus that causes Covid-19. SRL's application, if approved, could open the door for more private laboratories to participate in genome sequencing efforts to catch virus variants, especially with the emergence of the highly transmissible Omicron variant.

More samples from Covid-positive people are being sent for genome sequencing in India to identify whether any individual has the Omicron variant, explains Anand K, Chief Executive, SRL Diagnostics, giving the rationale behind their application to participate in INSACOG's efforts.

The application is under review, he told Business Line, adding that their reference laboratory in Mumbai was already doing genome sequencing work to identify markers to guide cancer treatment. The lab already has validated protocols for genome sequencing, he added.

As an Omicron-induced surge in Covid-19 cases is witnessed across multiple countries, Anand says diagnostic labs in India are better equipped now to handle a possible surge than they were during the second wave earlier this year.

SRL, for instance, has 426 labs, distributed equitably across the country, he said, with about 2,100 collection centres. About 22 of their labs are ICMR-accredited to handle Covid-19 samples, he said, adding that their capacity to handle tests had increased to 80,000 per day. The peak load during the second wave was about 40,000 tests per day, he added.

Under the Fortis Healthcare umbrella (now owned by Malaysia's IHH Healthcare), SRL had posted revenues of ₹1,030 crore for the year ended March 2021. Revenues from non-Covid tests were beginning to return, he said, accounting for ₹331 crore of the total revenues of ₹403 crore posted in the second quarter, the three months ended September 2021.

The pandemic had put the spotlight on diagnostics, a service that was earlier on the sidelines of healthcare, he said. Commenting on the recent entry of Lupin and digital companies like PharmEasy (which bought out Thyrocare) into the estimated ₹60,000 crore diagnostics segment, he said it would help regularise the industry, where organised service providers presently cover only 16 per cent.

Diagnostic services presently also face much criticism on prices, especially for Covid-19 tests, besides allegations of faulty reports, as States and countries increasingly seek Covid-negative RT-PCR reports to allow incoming travellers. The SRL chief said many States were imposing price caps on Covid tests. On fraudulent test reports, he clarified that SRL test reports had double indicators (QR codes) that carried details of the individual and the test. He urged authorities to check both indicators to make sure they were seeing an authentic report.

Continued here:
Eye on Omicron: SRL approaches INSACOG to participate in the genomic surveillance - BusinessLine
