
Ancient genomic time transect from the Central Asian Steppe unravels the history of the Scythians – Science Advances

Posted: March 31, 2021 at 3:27 am

Abstract

The Scythians were a multitude of horse-warrior nomad cultures dwelling in the Eurasian steppe during the first millennium BCE. Because of the lack of first-hand written records, little is known about the origins and relations among the different cultures. To address these questions, we produced genome-wide data for 111 ancient individuals retrieved from 39 archaeological sites from the first millennia BCE and CE across the Central Asian Steppe. We uncovered major admixture events in the Late Bronze Age forming the genetic substratum for two main Iron Age gene pools emerging around the Altai and the Urals, respectively. Their demise was mirrored by new genetic turnovers, linked to the spread of the eastern nomad empires in the first centuries CE. Compared to the high genetic heterogeneity of the past, the homogenization of the present-day Kazakh gene pool is notable, likely a result of 400 years of strict exogamous social rules.

The transition to the Iron Age (IA) marks one of the most important events in the history of Eurasia. At the turn of the first millennium BCE, changes in the archeological record attest to the rise of several nomad cultures across the steppe, from the Altai to the western fringe of the Pontic-Caspian region (1). These cultures are often collectively referred to as Scythians based on the common features found in their mortuary contexts (2). Compared to the preceding Bronze Age (BA) populations, the Scythians went through a transition from a sedentary to a nomadic cattle-breeding lifestyle, showed an increase in warfare and advancements in military technologies (e.g., new types of iron weapons and horseback riding techniques, such as introducing the use of a saddle), and the establishment of hierarchical elite-based societies (3).

Previous genomic studies have detected large-scale genetic turnovers (and therefore substantial human migrations) in the BA steppe, which eventually resulted in the formation of a homogeneous and widespread Middle and Late BA (LBA) gene pool that characterized the sedentary herders of the western and central steppe (steppe_MLBA) (4–7). The reasons that prompted the rapid decline of these MLBA cultures and the rise of the Scythians are still poorly understood. Scholars have pointed out, among the most relevant factors, the climatic humidification (8) and socioeconomic pressures from the neighboring farming civilizations, i.e., the ones linked to the Bactria Margiana Archaeological Complex (BMAC) (3). Three competing hypotheses have been debated regarding the origins of the Scythians: a Pontic-Caspian origin, supported by their assumed Iranian languages, a Kazakh Steppe origin supported by the archaeological findings, and a multiple independent origin from genetically distinct groups that adopted common cultural traits (2). The limited number of genomes so far retrieved from the IA steppe nomads provides a glimpse of their genetic diversity but is far from being sufficient to characterize complex patterns of admixture between various eastern and western Eurasian gene pools (9–12).

From an archaeological perspective, the earliest IA burials associated with nomad-warrior cultures were identified in the eastern fringes of the Kazakh Steppe, in Tuva and the Altai region (ninth century BCE) (13). Following this early evidence, the Tasmola culture in central and north Kazakhstan is among the earliest major IA nomad warrior cultures emerging (eighth to sixth century BCE) (13). These earlier groups were followed by the iconic Saka cultures located in southeastern Kazakhstan and the Tian Shan mountains (sixth to second century BCE), the Pazyryk culture centered in the Altai mountains (fifth to first century BCE-CE), and the Sarmatians that first appeared in the southern Ural region (sixth to second century BCE) and later are found westward as far as the northern Caucasus and eastern Europe (fourth BCE to fourth CE) (1, 14–17). The nomad groups also influenced their sedentary neighbors, such as the ones associated with the Sargat cultural horizon (fifth to first century BCE) located in the northern forest-steppe zone between the Tobol and Irtysh rivers (3, 18).

After the IA, the Kazakh Steppe served as a center for the expansion of multiple empires, such as the Xiongnu and Xianbei chiefdoms from the east (19) and the Persian-related kingdoms from the south (e.g., Kangju) (20). These events brought the demise of the eastern Scythian cultures, but the demographic turnovers associated with this cultural transition remain poorly understood (20). Furthermore, forms of nomadic lifestyle persisted in the Kazakh Steppe throughout the centuries. A key event in the recent history of the nomad populations happened in the 15th to 16th century CE when all the tribes living in the territory of present-day Kazakhstan were organized and grouped into three main hordes (Zhuzs): Elder Zhuz, Middle Zhuz, and Junior Zhuz located in southeast, central/northeast, and west Kazakhstan, respectively (21, 22). This division was a political and religious compromise between different nomadic tribes, which were spread across Central Asia and had to protect themselves from external threats after the collapse of the Golden Horde. This set the basis for the foundation of the Kazakh Khanate (1465 to 1847 CE). Today, Kazakh groups in Kazakhstan still maintain their tribal affiliations and revere their nomadic history preserving some aspects of its culture (21). One of these traditions is the Zheti-ata, which consists of keeping track of the family tree up to seven generations by paternal line to avoid marriage between kins (23).

To understand the genetic structure of the different IA nomadic cultures as well as the demographic events associated with their origins and decline, we successfully generated genome-wide data from 111 ancient human individuals retrieved from 39 different archaeological sites across the Kazakh Steppe (Kazakhstan, Kyrgyzstan, and Russia) and one individual retrieved from a Hun elite burial located in present-day Hungary (text S1). Our dataset densely covers a time span ranging from the eighth century BCE to the fourth century CE and also includes three individuals from the medieval period (Fig. 1 and text S1). We also produced new genome-wide data from 96 modern-day Kazakh individuals belonging to several tribes affiliated with the three major Kazakh hordes (Zhuzs) covering the entire territory of present-day Kazakhstan to better understand how recent historical events have shaped the genetic structure of present-day nomads.

Fig. 1. (A) Map showing the locations of the 39 archaeological sites where the 117 individuals were retrieved and (B) their respective dates in years BCE/CE. The dates reported are 14C-calibrated (2-sigma) ranges for the sites comprising at least one directly radiocarbon-dated individual; if more individuals are dated, we report the lowest and the highest values across all of them. If no individuals are dated for a site, we report the date ranges based on the archaeological context (data file S1). The sites are colored according to their cultural affiliation. This same culture-based color code (top right) is maintained for all the figures in the main text and the Supplementary Materials.

Genome-wide data for 117 ancient individuals were obtained using an in-solution DNA capture technique designed to enrich for 1,233,013 single-nucleotide polymorphisms (SNPs), commonly referred to as 1240K capture (Materials and Methods). Genome-wide data for 96 present-day Kazakh individuals were generated with the Affymetrix Axiom Genome-wide HumanOrigins SNP-chip (HO) (Materials and Methods). After performing quality controls, we retained all 96 modern Kazakh individuals and the 111 ancient individuals with more than 20,000 SNPs covered, obtaining a median of 793,636 successfully recovered SNPs and 1.5× autosomal coverage on the 1240K panel across all individuals (Materials and Methods and data file S1).
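The coverage filter described above can be illustrated with a minimal sketch. The individual IDs and SNP counts below are hypothetical and are not taken from data file S1; only the threshold of more than 20,000 covered SNPs comes from the text.

```python
from statistics import median

# Hypothetical per-individual counts of 1240K SNPs covered at least once.
# These IDs and counts are illustrative only, not values from data file S1.
snps_covered = {
    "ind_A": 812_340,
    "ind_B": 15_200,
    "ind_C": 455_901,
    "ind_D": 20_000,
    "ind_E": 1_020_455,
}

MIN_SNPS = 20_000  # threshold stated in the text: more than 20,000 SNPs


def filter_by_coverage(counts, min_snps=MIN_SNPS):
    """Keep individuals with strictly more than `min_snps` covered SNPs."""
    return {ind: n for ind, n in counts.items() if n > min_snps}


retained = filter_by_coverage(snps_covered)
median_snps = median(retained.values())
print(sorted(retained), median_snps)
```

In this toy input, the individual with exactly 20,000 SNPs is dropped because the comparison is strict; whether the published pipeline treats the boundary the same way is an assumption.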

We then merged the new data with a reference dataset of previously published modern and ancient individuals compiling a 1240KHO dataset consisting of 586,594 SNPs overlapping with the modern genotype data that we used for performing global population structure analyses [i.e., PCA (principal components analysis) and ADMIXTURE]. We also produced a 1240K-only dataset consisting of 1240K capture or whole-genome shotgun data pooled down to include 1240K sites only that we used for the rest of the analyses (Materials and Methods and tables S2 and S3). For population-based analyses, we grouped individuals according to their archaeological culture affiliation, spanning a defined time range after excluding genetic outliers shifted more than 2 SD from the median PCs of their respective group (Materials and Methods and table S1).
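The outlier rule mentioned above (individuals shifted more than 2 SD from the median PCs of their group) could be sketched as follows. This is an illustration under stated assumptions: the coordinates are hypothetical, only two PCs are used, the sample standard deviation is taken as the SD, and an individual is flagged if it exceeds the threshold on any single PC. The text does not specify these details.

```python
from statistics import median, stdev

# Hypothetical (PC1, PC3) coordinates for one archaeological group.
group_pcs = {
    "ind_1": (0.010, 0.020),
    "ind_2": (0.012, 0.021),
    "ind_3": (0.011, 0.019),
    "ind_4": (0.009, 0.020),
    "ind_5": (0.010, 0.022),
    "ind_6": (0.011, 0.018),
    "ind_7": (0.012, 0.020),
    "ind_8": (0.060, 0.021),  # strongly shifted along PC1
}


def flag_outliers(pcs, n_sd=2.0):
    """Return IDs lying more than `n_sd` SDs from the group median on any PC."""
    ids = list(pcs)
    n_axes = len(next(iter(pcs.values())))
    outliers = set()
    for axis in range(n_axes):
        values = [pcs[i][axis] for i in ids]
        centre = median(values)   # group median on this PC
        spread = stdev(values)    # sample SD on this PC (an assumption)
        for i in ids:
            if abs(pcs[i][axis] - centre) > n_sd * spread:
                outliers.add(i)
    return outliers


print(flag_outliers(group_pcs))
```

Because the SD is computed with the candidate outlier included, a single extreme individual inflates the spread; very small groups may therefore never yield a flagged individual under this rule.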

Overall, PCA and ADMIXTURE suggest that a substantial demographic shift occurred during the transition from the BA to the IA in the Kazakh Steppe (Fig. 2 and figs. S1 and S2). In contrast to the highly homogeneous steppe_MLBA cluster found across the Kazakh Steppe until the end of the second millennium BCE, the IA individuals are scattered across the PC space, most notably along PC1 and PC3. Their spread along these PCs suggests a varying degree of extra eastern Eurasian affinity compared to the MLBA population and extra affinity to southern populations ultimately related to the Neolithic Iranians and the Mesolithic Caucasus hunter-gatherers (from here on referred to as Iranian-related ancestry), respectively. Despite the high genetic variability, it is possible to appreciate homogeneous clusters of ancient individuals belonging to the same archaeological culture and/or geographic area (Fig. 2 and fig. S1). Following a chronological order, most of the individuals from the sites associated with the Early IA Tasmola culture (Tasmola_650BCE) and the published Saka_Kazakhstan_600BCE of central-north Kazakhstan cluster together in the middle of the PCA plot and show a uniform pattern of genetic components in ADMIXTURE analyses (Fig. 2, A and D, and figs. S1 and S2). The two previously published individuals from the Aldy Bel site in Tuva (Aldy_Bel_700BCE) also fall within this genetic cloud (Fig. 2A). This genetic profile persists in the later Middle and Late IA, shown by most individuals from the Pazyryk site of Berel (Pazyryk_Berel_50BCE) (Fig. 2B). This IA cluster is distinct from the previous steppe_MLBA groups inhabiting the same regions, most notably because of its substantial shift toward eastern Eurasians along PC1. 
In addition, we find outliers showing an even stronger shift to eastern Eurasians than the main cluster: two outliers from Pazyryk Berel time (Pazyryk_Berel_50BCE_o), three outliers from the Tasmola site of Birlik (Tasmola_Birlik_640BCE), and three of four individuals from the Korgantas phase of central-north Kazakhstan (24) (Fig. 2B and table S2). One female individual from Birlik (BIR013.A0101) with an eastern Eurasian genetic profile was unearthed with grave goods (a bronze mirror) that presented typical Eastern Steppe features (text S1).

(A to C) PC1 versus PC3 (outer plot) and PC1 versus PC2 (inner plot in the bottom right box) including all the IA, new and previously published individuals (filled symbols), relevant published temporally preceding groups (empty symbols), and present-day Kazakh individuals (small black points). The gray labels in this and the following panel indicate broad geographical groupings of the modern individuals used to calculate PCA that in the plots are shown as small gray points. The ancient samples are distributed in (A) to (C) sliced in three different time intervals as reported in the top right corner. (D) Histograms of ADMIXTURE analysis (K = 12; fig. S2) for the new IA and post-IA individuals and selected subset of temporally preceding groups maximizing key genetic components and a randomly selected subset of present-day Kazakh from the three main Zhuzs.

The classical IA Sakas from the Tian Shan region to the south (Saka_TianShan_600BCE, Saka_TianShan_400BCE, and previously published Pub_Saka_TianShan_200BCE) are distributed along a cline between the Tasmola/Pazyryk cluster and the Iranian-related gene pool, along PC3 (Fig. 2, A and B). A stronger affinity to the Neolithic Iranians is also found in ADMIXTURE analyses (Fig. 2D and fig. S2). The shift toward the Iranian-related gene pool is found as early as ~650 BCE in one Eleke_Sazy_650BCE individual (ESZ002) retrieved from an elite Saka burial, while three of four individuals from one of the earliest Tian Shan Saka site of Caspan_700BCE fall within the Tasmola/Pazyryk cloud.

The individuals associated with the sedentary Sargat culture in the forest-steppe zone north of the Kazakh Steppe (Sargat_300BCE) partially overlap with the Tasmola/Pazyryk cluster although forming a cloud in PCA that is shifted toward western Eurasians and toward the uppermost cline of northern Inner Eurasians (PC1 and PC2, respectively; Fig. 2B). In line with PCA, Sargat individuals carry a small proportion of a different type of northeast Asian ancestry not detected in the nomad groups further to the south (Fig. 2D).

With the exception of one outlier falling in the Tasmola/Pazyryk cloud, the individuals associated with the Sarmatian culture are highly homogeneous despite being spread over a wide geographic area and time period (i.e., early Sarmatians_450BCE, late Sarmatians_150BCE, and western Sarmatians_CaspianSteppe_350BCE; Fig. 2, A and B). Our new data from seven early Sarmatian sites in central and western Kazakhstan (Sarmatians_450BCE) document that this gene pool was already widespread in this region during the early phases of the Sarmatian culture. Furthermore, Sarmatians show a sharp discontinuity from the other IA groups by forming a cluster shifted toward west Eurasians (Fig. 2 and table S2).

Genetic ancestry modeling of the IA groups performed with qpWave and qpAdm confirmed that the steppe_MLBA groups adequately approximate the western Eurasian ancestry source in IA Scythians while the preceding steppe_EBA (e.g., Yamnaya and Afanasievo) do not (data file S4). As an eastern Eurasian proxy, we chose LBA herders from Khovsgol in northern Mongolia based on their geographic and temporal proximity. Other eastern proxies fail the model because of a lack or an excess of affinity toward the Ancient North Eurasian (ANE) lineage (25). However, this two-way admixture model of Khovsgol + steppe_MLBA does not fully explain the genetic compositions of the Scythian gene pools (data file S4). We find that the missing piece matches well with a small contribution from a source related to ancient populations living in the southern regions of the Caucasus/Iran or Turan [we use the term Turan for consistency with (7), only in its geographical meaning, designating the southern part of Central Asia; Fig. 3A]. The proportions of this ancestry increase through time and space: a negligible amount in the most northeastern Aldy_Bel_700BCE group, ~6% in the early Tasmola_650BCE, ~12% in Pazyryk_Berel_50BCE, ~10% in Sargat_300BCE, ~13% in Saka_TianShan_600BCE, and ~20% in Saka_TianShan_400BCE (Fig. 3A), in line with f4-statistics (table S2). Sarmatians also require 15 to 20% Iranian ancestry while carrying substantially less Khovsgol and more steppe_MLBA-related ancestry than the eastern Scythian groups.
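For readers unfamiliar with the f4-statistics referred to above, the following minimal sketch computes the standard quantity f4(A, B; C, D) as the average over SNPs of (pA - pB)(pC - pD), where p denotes a population allele frequency. The frequencies are hypothetical, and the sketch omits the block-jackknife standard errors and missing-data handling that dedicated software provides; it is not the procedure used to produce table S2.

```python
def f4(freq_a, freq_b, freq_c, freq_d):
    """Mean over SNPs of (pA - pB) * (pC - pD) for four populations."""
    n = len(freq_a)
    if not (len(freq_b) == len(freq_c) == len(freq_d) == n) or n == 0:
        raise ValueError("all populations need the same, non-zero number of SNPs")
    total = sum(
        (a - b) * (c - d)
        for a, b, c, d in zip(freq_a, freq_b, freq_c, freq_d)
    )
    return total / n


# Hypothetical allele frequencies at three SNPs for four populations.
pop_a = [0.1, 0.5, 0.9]
pop_b = [0.2, 0.4, 0.7]
pop_c = [0.3, 0.6, 0.8]
pop_d = [0.3, 0.2, 0.5]

value = f4(pop_a, pop_b, pop_c, pop_d)
print(value)
```

A value near zero is expected when the pair (A, B) and the pair (C, D) share no drift beyond what a simple tree predicts; swapping A and B flips the sign.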

Fig. 3. (A) Fitting models for the main IA groups using LBA sources, and the major genetic shift with the new East Asian influx (DevilsCave_N-like) observed in the Middle IA outliers and Korgantas. (B) Fitting models for the post-IA groups using IA groups as sources. A transparency factor is added to the models presenting poor fits (P < 0.05; only Konyr_Tobe_300CE). On the top is shown the color legend for the sources tested. (C) Summary of the admixture dates obtained with DATES for the main groups studied. The y axis is the temporal scale from BCE (negative) to CE (positive) dates. The x axis represents the results for the different target groups reported in the legends in each box using the two-way sources reported at the bottom of the three panels formed along the x axis (e.g., source1 + source2). The colored bars represent the date ranges of the culture, while the filled symbols show the admixture dates ± SEs obtained from DATES and converted into dates considering 29 years per generation starting from the median point of the culture's age. The three sets of sources reported correspond to the summary of the main admixture events described in the text from left to right: the LBA formation of the Scythian gene pools; the BMAC-related influx increasing through time in the Tian Shan Sakas; and the new eastern influx starting in the IA and continuing throughout the centuries. A number-based key (the white numbers from 1 to 6 inside the black circles) connects different tests and analyses shown in the figure with the corresponding arrow in Fig. 4.

For Sarmatians and later Tian Shan Sakas, only the groups from Turan (i.e., Turan_ChL, BMAC, and postBMAC) match as sources, while groups from Iran and Caucasus fail; we chose BMAC and postBMAC as the representative proxies (Fig. 3A and data file S4). The extra eastern Eurasian influx in the outliers (Tasmola_Birlik_640BCE, Korgantas_300BCE, and Pazyryk_Berel_50BCE_o) is not sourced from the same eastern proxies as the previous groups (i.e., Khovsgol); instead, it can only be modeled with an ancient northeast Asian (ANA) lineage, represented by the early Neolithic groups from the Devil's Gate Cave site in the Russian Far East (DevilsCave_N) (Fig. 3A and data file S4).

We observe an intensification of the new eastern Eurasian influx described above among the individuals from the early first millennium CE (Xianbei_Hun_Berel_300CE) as well as the later 7th- to 11th-century CE individuals (Karakaba_830CE and Kayalyk_950CE). They are scattered along PC1 from the main IA Tasmola/Pazyryk cluster toward the ANA groups (Fig. 2C). The two individuals associated with Hun elite burials dated from the third century CE, one from the site of Kurayly in the Aktobe region in western Kazakhstan and the other from Budapest, Hungary (Hun_elite_350CE), cluster closely together along this cline (Fig. 2C and figs. S1 to S3).

The individuals from the ancient city of Otyrar Oasis in southern Kazakhstan show a quite distinct genetic profile. Three of five individuals (Konyr_Tobe_300CE) fall close to the published Kangju_250CE individuals from a similar time period and region (11), between Sarmatians and BMAC (Fig. 2C). KNT005 is shifted toward BMAC in PCA (Fig. 2C and fig. S1). Furthermore, KNT005 is the only one carrying a South Asian Y haplogroup, L1a2 (data file S1), and showing a South Asian genetic component in ADMIXTURE (Fig. 2D and fig. S2). KNT004 is shifted in PC1 toward East Asians (figs. S1 to S3). Admixture models including ~10% South Asian and ~50% eastern Eurasian influx adequately explain KNT005 and KNT004, respectively (data file S4). In contrast, the individuals from the site of Alai Nura (Alai_Nura_300CE) in the Tian Shan mountains (~200 km east from the Konyr Tobe site) still lay along the IA cline of the Tian Shan Saka, with four individuals falling closer to Konyr_Tobe_300CE and four closer to the Tasmola/Pazyryk cloud (Fig. 2C and figs. S1 to S3).

Admixture dating with the DATES program reveals an early formation of the main Scythian gene pools during 1000 to 1500 BCE (Fig. 3C and fig. S4). DATES is designed to model only two-way admixture, so to account for the estimated three-way models obtained with qpWave and qpAdm, we independently tested the three pairwise comparisons (steppe_MLBA, BMAC, and Khovsgol). DATES was successful in fitting exponential decays for the two western + eastern Eurasian pairs, steppe_MLBA + Khovsgol and BMAC + Khovsgol, while failing in the western + western Eurasian pair (steppe_MLBA + BMAC) (fig. S4 and table S3). For each target, steppe_MLBA + Khovsgol and BMAC + Khovsgol yielded nearly identical admixture date estimates (table S3). We believe that our estimates mostly reflect an average date between the genetically distinguishable eastern (Khovsgol) and western (steppe_MLBA + BMAC) ancestries, weighted by the relative contribution from the two western sources, rather than reflecting a true simultaneous three-way admixture. It is noteworthy that DATES found increasingly younger admixture dates in the Tian Shan Saka groups as the BMAC-related ancestry increases: from Saka_TianShan_600BCE to Saka_TianShan_400BCE and especially in the later Alai_Nura_300CE, as well as for Pazyryk_Berel_50BCE and Sargat_300BCE with respect to the date of Tasmola_650BCE (~1100 to 900 BCE with respect to ~1300 to 1400 BCE; Fig. 3C). A small-scale gene flow from a BMAC-related source continuing over the IA may explain both the increase in the BMAC-related ancestry proportion and the increasingly younger admixture dates (Fig. 3A). Again, the inferred dates reflect an average over the IA admixture with a BMAC-related source and the LBA one with steppe_MLBA; therefore, they are likely shifted toward older time periods than the actual time of the IA gene flow.
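DATES reports admixture times in generations before the sampled individuals lived. The legend of the admixture-date summary states that these were converted to calendar dates using 29 years per generation, counted back from the median point of each culture's age. A minimal sketch of that conversion follows; the function name and the input of 25 ± 3 generations are hypothetical, and BCE years are written as negative numbers as on the figure's y axis.

```python
YEARS_PER_GENERATION = 29  # conversion factor stated in the figure legend


def admixture_date(culture_midpoint, generations, se_generations,
                   years_per_generation=YEARS_PER_GENERATION):
    """Convert a DATES-style estimate (generations +/- SE) into calendar years.

    `culture_midpoint` is the median date of the culture (negative = BCE).
    Returns (older bound, point estimate, younger bound) at +/- 1 SE.
    """
    point = culture_midpoint - generations * years_per_generation
    se_years = se_generations * years_per_generation
    return point - se_years, point, point + se_years


# Hypothetical example: a group centered on 650 BCE with an estimated
# admixture 25 +/- 3 generations before that point.
older, point, younger = admixture_date(-650, 25, 3)
print(older, point, younger)
```

With these illustrative inputs, the point estimate falls at 1375 BCE with a one-SE range of 1462 to 1288 BCE. The sketch ignores the absence of a year zero, which shifts results by at most one year.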

Confirming the results from qpAdm, the admixed individuals from Tasmola_Birlik_640BCE and Korgantas_300BCE (admixed_Eastern_out_IA) show very recent admixture dates (Fig. 3C, fig. S4, and table S3). The later groups of Xianbei_Hun_Berel_300CE, Hun_elite_350CE, and Karakaba_830CE further corroborate this trend of recent dates of admixture, revealing that this new eastern influx likely started in the IA and continued at least during the first centuries of the first millennium CE (Fig. 3C, fig. S4, and table S3).

PCA, ADMIXTURE, and CHROMOPAINTER/fineSTRUCTURE fine-scale haplotype-based analyses performed on present-day Kazakhs reveal a tight clustering and absence of detectable substructure among Kazakhs regardless of the geographic location or Zhuz affiliation (Fig. 2 and fig. S5). We still grouped the Kazakh individuals according to their Zhuz affiliations (which roughly reflects their geographic origin) and ran Globetrotter analyses following the pipeline in (26) as independent replicates to identify the different ancestry sources contributing to the gene pool of Kazakhs and date admixture events. Globetrotter analyses confirmed that the three groups have the same source composition and admixture dates and are a result of a complex mixture of different western, southern, and eastern Eurasian ancestries (table S4). The dates of admixture identified by Globetrotter highlight a narrow and recent time range for the formation of the present-day Kazakh gene pool, between 1341 and 1544 CE (table S5).

Our analysis of more than 100 ancient individuals from Central Asia shows that IA nomad populations of the Kazakh Steppe formed through extensive admixture, resulting from complex interactions between preceding MLBA populations from the steppe and the neighboring regions (Figs. 2A, 3, A and C, and 4A). Our findings shed new light onto the debate about the origins of the Scythian cultures. We do not find support for a western Pontic-Caspian steppe origin, which is, in fact, highly questioned by more recent historical/archeological work (1, 2). The Kazakh Steppe origin hypothesis finds instead a better correspondence with our results, but rather than finding support for one of the two extreme hypotheses, i.e., single origin with population diffusion versus multiple independent origins with only cultural transmission, we found evidence for at least two independent origins as well as population diffusion and admixture (Fig. 4B). In particular, the eastern groups are consistent with descending from a gene pool that formed as a result of a mixture between preceding local steppe_MLBA sources (which could be associated with different cultures such as Sintashta, Srubnaya, and Andronovo that are genetically homogeneous) and a specific eastern Eurasian source that was already present during the LBA in the neighboring northern Mongolia region (27). The genetic structure of the Early IA Tasmola culture of central and northern Kazakhstan is mostly composed of an equal mixture of these two ancestries, although smaller amounts of gene flow from an Iranian-related source are also required (Figs. 3A and 4A and data file S4). We found that overall BMAC-related populations from Turan provide the best fit to our models while Iranian-related sources further to the west, such as the BA groups from the northern Caucasus, fail (data file S4). 
These results corroborate the historical/archaeological hypotheses of a cultural connection between the southern civilizations and the northern steppe people (3). This BMAC influx continues in the later fourth- to first-century BCE-CE Scythian groups from the northeastern Pazyryk site of Berel and becomes increasingly higher and nonuniformly distributed in the southeastern Saka individuals from the Tian Shan mountains (Figs. 2, B and D, 3, A and C, and 4B; fig. S4; table S3; and data file S4).

Fig. 4. (A) Formation of a three-way LBA admixture cline from which (B) eastern Scythian and western Sarmatian gene pools arose and spread throughout the Steppe and (C) a new source of eastern Eurasian ancestry influx admixing with the Scythian gene pools, starting in the IA and becoming predominant and widespread at northern latitudes during the Xianbei-Hun period. On the very southern tips of the Steppe, a very different ancestry shift occurred, likely linked with the expansion of the Persian world. The arrows represent the demographic processes analyzed in the present study and are numbered from 1 to 6 to connect them to the main results shown in Fig. 3 from which these inferences have been drawn.

The two previously published individuals from the Aldy-Bel culture of the Arzhan 2 site in the Tuva region fall within the main eastern Scythian genetic cluster, confirming that it was present also in the same site where the earliest Scythian burials are found (Fig. 2A). These data, coupled with recent findings from the IA transition in Mongolia (28), seem to point to an origin in the Altai area of a main genetic substratum that formed all the eastern Scythians (Fig. 4B). The western Sarmatians from the southern Ural region also formed as a result of admixture between the same three ancestral sources as the eastern Scythians (Fig. 3A). Nevertheless, the eastern Eurasian ancestry is present only in a small amount in Sarmatians (Fig. 3A). In addition, their early admixture dates (Fig. 3C) and the absence of an admixture cline between the Sarmatians and the eastern groups (Fig. 2, A, B, and D) suggest that the Sarmatians descend from a related but different LBA gene pool compared with the one that contributed to the eastern Scythians (likely differently located along an LBA admixture cline). Given the geographic location of the earliest Sarmatian sites found so far, we hypothesize that this gene pool originated in the LBA southern Ural area (Fig. 4B). More data from the later and westernmost Scythian cultures of the Caucasus and eastern Europe will provide a better understanding of their genetic affinities with the earlier Scythians from the Kazakh Steppe analyzed in the current study. Furthermore, our results show that the northern sedentary Sargat-related cultures show a close genetic proximity with the Scythians especially with the eastern nomad groups (Fig. 2B). The Sargats show additional affinity not found in the Scythian groups ultimately related to a northern Siberian lineage (Figs. 2D and 3A). 
This is consistent with the historical hypothesis that the Sargat people formed as a result of admixture between incoming Scythian groups and an unsampled local or neighboring population that possibly carried this extra Siberian ancestry (3, 18).

From the second half of the first millennium BCE, we detect a major genetic shift in a number of outliers that are interestingly linked with the emergence of the Korgantas culture that replaced the Tasmola in central Kazakhstan. In particular, we observe an influx from an eastern Eurasian source that is different from the one that contributed to the shift in the LBA (Figs. 3A and 4C and table S2). At the turn of the first millennium CE, this mixed genetic profile became widespread among the northeastern individuals associated with the Xianbei-Hun cultures and the later medieval individuals (Figs. 2, C and D, 3B, and 4C, and table S2). The highly variable admixture proportions and dates obtained for those individuals suggest that this was an ongoing process that characterized the first centuries CE (first to fifth century at least; Fig. 3C, fig. S4, and table S3). Additional genetic data from the first millennium CE will allow a more comprehensive understanding of the nature and the extent of this heterogeneity. Instead, in the southern Kazakhstan region, the individuals from the Konyr Tobe site located in the ancient city of Otyrar Oasis show a different genetic turnover mostly characterized by an increase in Iranian-related genetic ancestry, most likely reflecting the influence of the Persian empires (Fig. 4C) (20, 29). Outliers, with high eastern Eurasian admixture or with gene flow from South Asia, suggest that the population of this city at that time was heterogeneous (Fig. 2C and data file S4). During this period, Otyrar was a main center of the Kangju kingdom and a crossroad along the Silk Road (29). In the neighboring region of the Tian Shan mountains, in the third century CE site of Alai Nura, a genetic profile typical of the much earlier IA Tian Shan Sakas can still be found (Fig. 3B and data file S4).

The heterogeneity and geographic structuring observed during the IA, the Xianbei-Hun, and the medieval periods in Kazakhstan stand in strong contrast to the genetic homogeneity observed among present-day Kazakhs (fig. S5). Fine-scale haplotype-based analyses confirmed this homogeneity and showed, in line with previous findings (26), that the Kazakh gene pool is a mixture of different western and eastern Eurasian sources (table S4). Our results on the ancient populations revealed that this was a result of a very complex demographic history, with multiple layers of western and eastern Eurasian ancestries mixing through time. The admixture dates obtained for present-day Kazakhs overlap with the period when the Kazakh Khanate was established (~15th century CE; table S5). Furthermore, the gene pool of present-day Kazakhs cannot be fully modeled as a mixture of post-IA northern Xianbei-Hun and southern Kangju-related gene pools (data file S4). These findings suggest that recent events, likely unfolding during the second millennium CE, were associated with more demographic turnovers in this region that ultimately led to the homogenization of the Kazakh gene pool as a consequence of the establishment of the Kazakh Khanate with its strict exogamic rules (21).

We selected 48 samples for radiocarbon dating. They were chosen to be representative of the different cultures/genetic clusters observed or key genetic outliers. An additional 14 samples were already 14C-dated in previous studies (text S1), summing up to a total of 62 individuals directly 14C-dated (data file S1). For the new dates, the analyses were done at the Curt-Engelhorn-Zentrum Archaeometry gGmbH, Mannheim, Germany. Collagen was extracted from the bones, purified by ultrafiltration (>30 kDa), and freeze-dried. Then, the samples were combusted to CO2 in an elemental analyzer, and CO2 was converted to graphite via catalysis. The 14C/12C ratio was obtained using mini radiocarbon dating system–accelerator mass spectrometry. The resulting 14C ages were normalized to δ13C = −25 per mil (30). The 14C ages were then calibrated (we considered the Cal 2σ range for downstream analyses) using the dataset INTCAL13 (31) and the software SwissCal 1.0 (30).

DNA from the ancient individuals analyzed in this study was obtained following strict sampling and extraction protocols performed in an ancient DNA clean room at the facilities of the Max Planck Institute for the Science of Human History, Jena, Germany. In brief, 40 to 70 mg of bone or tooth powder was used for DNA extraction following a previously published protocol, optimized for the retrieval of short DNA fragments (32). For the initial lysis step, the powder was incubated for 12 to 16 hours (37°C) in 1 ml of extraction buffer containing 0.45 M EDTA (pH 8.0) and proteinase K (0.25 mg/ml) and subsequently purified using a binding buffer containing guanidine hydrochloride, sodium acetate (pH 5.2), and isopropanol (32), in combination with the High Pure Viral Nucleic Acid Large Volume Kit (Roche). Last, DNA extracts were eluted in 100 µl of TET [10 mM tris-HCl, 1 mM EDTA (pH 8.0), and 0.05% Tween 20]. Following DNA extraction, 25 µl of extract from each sample was used to produce double-stranded DNA libraries using a published protocol (33) with an initial treatment using the enzymes uracil-DNA glycosylase (UDG) and endonuclease VIII following a previously described procedure (34). This step allows for the partial removal of uracils resulting from postmortem DNA damage (cytosine deamination), retaining enough damage at the terminal nucleotides of the fragments to permit ancient DNA authentication. The resulting NGS libraries were quantified on a quantitative polymerase chain reaction (qPCR) instrument (LightCycler 96 System, Roche) using the IS7/IS8 primer set and DyNAmo SYBR Green qPCR Kit (Thermo Fisher Scientific) (33). Subsequently, libraries were double-indexed using a combination of indexing primers containing unique 8-base pair (bp) identifiers (35). Ten-cycle indexing PCR reactions were carried out using Pfu Turbo Cx Hotstart DNA Polymerase (Agilent). 
PCR products were purified using the MinElute DNA purification kit (QIAGEN) and were subsequently qPCR-quantified using the IS5/IS6 primer set (35). Indexed libraries were then amplified with the IS5/IS6 primer set using the Herculase II Fusion DNA Polymerase (Agilent) to achieve a maximum of 10 copies per reaction, and amplification products were purified using the MinElute DNA purification kit (QIAGEN). The concentration (nanograms per microliter) of amplified libraries was then measured on an Agilent 4200 TapeStation instrument (Agilent) using the D1000 ScreenTape system (Agilent). Last, an equimolar pool of 69 of 117 UDG-half libraries was prepared for shotgun sequencing within the Max Planck Institute for the Science of Human History facilities on an Illumina HiSeq 4000 platform using a single-end 76-cycle sequencing kit. All the sequenced libraries showed high human endogenous DNA proportions (between 1 and 85% with only one library showing 0.8%) and ancient DNA characteristic damage patterns at the end of the fragments (3′ end at least ~0.05%; data file S1). Therefore, all the sample libraries were enriched using DNA probes spanning 1,237,207 genome-wide SNPs known to be variable in human populations. For this, all libraries were reamplified using the Herculase II Fusion DNA Polymerase (Agilent) to achieve 1 to 2 µg of total DNA in 5.2 µl (200 to 400 ng/µl); they were then purified using the MinElute DNA purification kit (QIAGEN), and their concentrations were measured on a NanoDrop spectrophotometer (Thermo Fisher Scientific). All amplified libraries were subsequently captured following an established in-solution DNA capture protocol (5, 36, 37).

Genomic DNA from the 96 Kazakh individuals was extracted using the QIAamp DNA Mini Kit (QIAGEN, Germany) according to the manufacturer's protocol. The DNA was quantified spectrophotometrically (Eppendorf BioPhotometer Plus) and fluorometrically (Qubit 2.0). The DNA samples were then genotyped for ~600,000 genome-wide SNPs with the Affymetrix Axiom Genome-wide Human Origins 1 (HO) array platform at ATLAS Biolabs GmbH in Berlin (Germany). Quality controls were performed with PLINK v.1.9 (38). All 96 individuals had a genotype success rate higher than 95%, and all SNPs had a success rate higher than 95%; all were therefore kept for downstream analyses. We then merged our 96 individuals with 18 previously published Kazakh individuals also genotyped on a HumanOrigins array (26). On this dataset (N = 114), we estimated recent relatedness values among each pair of individuals with the --genome function, restricting the analysis to 73,076 SNPs with low linkage disequilibrium (LD) (r2 < 0.1), which were selected by setting the --indep-pairwise 50 100 0.1 parameters. We found only two pairs with PI-HAT values [i.e., coefficient of relatedness (38)] compatible with a third- to second-degree relation (0.25 > PI-HAT > 0.125): one pair of previously published Kazakh individuals (KZH-1611 and KZH-1750, PI-HAT = 0.23) and one pair formed by a new and a previously published individual (KZH-1650 and E01; PI-HAT = 0.19).
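The relatedness screen described above amounts to a range filter on pairwise PI-HAT values. A minimal Python sketch follows; the function name and the third toy pair are hypothetical, and only the two reported pairs and the 0.125 to 0.25 interval come from the text.

```python
from typing import List, Tuple

Pair = Tuple[str, str, float]  # (individual 1, individual 2, PI-HAT)

def flag_related_pairs(pairs: List[Pair], lower: float = 0.125,
                       upper: float = 0.25) -> List[Pair]:
    """Keep pairs whose PI-HAT lies strictly between lower and upper,
    i.e. the third- to second-degree interval quoted in the text."""
    return [(a, b, pi_hat) for a, b, pi_hat in pairs if lower < pi_hat < upper]

pairs = [
    ("KZH-1611", "KZH-1750", 0.23),  # reported in the text
    ("KZH-1650", "E01", 0.19),       # reported in the text
    ("toy_A", "toy_B", 0.02),        # hypothetical unrelated pair
]
related = flag_related_pairs(pairs)
```

With these inputs, only the two reported pairs are retained.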

Raw data. Demultiplexing of the sequenced reads was done allowing only one mismatch in the indexes. Adaptor removal, mapping to the reference genome, and duplicate removal were done through the EAGER 1.92.32 workflow (39). We used AdapterRemoval v2.2.0 to remove adaptors, discarding reads shorter than 30 bp (40). We then mapped the reads to Human Reference Genome Hs37d5 using the bwa v0.7.12 aln/samse alignment algorithm (41) with an edit distance parameter (-n) of 0.01 and a seed length (-l) of 32, keeping only high-quality reads (phred mapping quality of ≥30) using Samtools v1.3 (42). We then used DeDup v0.12.2 to remove PCR duplicates (39).

Authentication and contamination estimates. We used mapDamage v2.0 (43) to assess the amount of deamination at the ends of the fragments on a subset of 100,000 high-quality reads using default parameters. We assessed exogenous human DNA contamination levels using ANGSD v0.910 (44) for nuclear contamination (based on X chromosome heterozygosity levels in males) and Schmutzi (45) for mitochondrial DNA contamination. Among the males with enough coverage (i.e., >200 SNPs on the X chromosome), none of the individuals had a contamination level of >7%, and only one had >4%. Although their estimates are less reliable (i.e., <200 SNPs), we also removed from further analyses two males that showed signs of moderate nuclear contamination (>10%) and were also PCA outliers. Furthermore, none of the individuals (males and females) showed levels of mitochondrial contamination of >3% (data file S1).

Genotyping. We used pileupCaller (https://github.com/stschiff/sequenceTools) with the --randomHaploid mode to call haploid genotypes for each position captured on the 1240K panel by randomly choosing one high-quality base (phred base quality score of ≥30). To call transitions, we first clipped 2 bp from each end of the high-quality reads using the trimBam module of bamUtil v.1.0.13 (46) to reduce the number of wrong calls due to high deamination at the last two bases, while we used the full high-quality reads to call transversions. At this stage, we excluded from the analyses the four lowest-coverage individuals, which had <20,000 SNPs typed on the 1240K panel.
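The idea behind random haploid calling can be illustrated with a short Python sketch. This is not pileupCaller's implementation; the function name and the pileup representation (a list of base and quality pairs per site) are assumptions made for illustration.

```python
import random
from typing import List, Optional, Tuple

MIN_BASE_QUALITY = 30  # phred cutoff quoted in the text

def random_haploid_call(pileup: List[Tuple[str, int]],
                        rng: random.Random) -> Optional[str]:
    """Pick one base at random among the reads passing the quality cutoff.

    Returns None when no read passes, so the site stays missing."""
    passing = [base for base, qual in pileup if qual >= MIN_BASE_QUALITY]
    if not passing:
        return None
    return rng.choice(passing)

rng = random.Random(42)  # fixed seed only to make the toy example repeatable
# Hypothetical pileup: the low-quality G is filtered out before sampling.
call = random_haploid_call([("A", 37), ("G", 12), ("A", 33)], rng)
missing = random_haploid_call([("C", 10)], rng)
```

Sampling a single read per site avoids a bias toward the reference allele that diploid calling can introduce at low coverage, at the cost of discarding heterozygosity information.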

We then merged the newly produced genotype data of 111 ancient individuals with the 96 modern Kazakh individuals and a reference panel composed of 2280 modern individuals genotyped with the HumanOrigins array (26, 47, 48) and haploid genotypes of 959 ancient individuals obtained from a mix of 1240K capture and shotgun sequencing data (4, 6, 7, 9–11, 25–27, 36, 48–52) (data files S2 and S3). This 1240KHO dataset, consisting of 586,594 SNPs overall, was used for exploratory analyses of global population structure. For fine-scale ancestry deconvolution and admixture dating analyses, we compiled a higher-coverage 1240K dataset, merging only data obtained with the 1240K capture technique or whole-genome shotgun data restricted to 1240K sites (1,233,013 SNPs overall; data files S2 and S3).

Sex determination. Genetic sex was determined by calculating the ratios of the coverage on the X and Y chromosomes to the coverage on the autosomes. We found highly consistent ratios, allowing us to confidently infer the sex of all the individuals. One individual showed a Y/autosomes proportion of 0.96 (data file S1). Since the X-based contamination estimates are extremely low and the X/autosomes ratio is within the normal range for males, the most likely explanation is that this individual carries an XYY karyotype. This condition is known as the XYY syndrome and is relatively rare (1 in 1000 births) and largely asymptomatic (53). It commonly affects stature (i.e., increased height) and can slightly influence cognitive or behavioral functions (53).
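The coverage-ratio logic can be sketched in Python as follows. The cutoffs are hypothetical and are not the values used in the study; they assume coverage is normalized so that a female shows X/autosomes near 1 and Y/autosomes near 0, while a typical male shows both ratios near 0.5.

```python
from typing import Tuple

def coverage_ratios(x_cov: float, y_cov: float,
                    autosomal_cov: float) -> Tuple[float, float]:
    """Return (X/autosomes, Y/autosomes) coverage ratios."""
    return x_cov / autosomal_cov, y_cov / autosomal_cov

def infer_sex(x_ratio: float, y_ratio: float) -> str:
    """Classify an individual from its coverage ratios.

    All cutoffs below are illustrative assumptions."""
    if y_ratio < 0.1 and x_ratio > 0.8:
        return "female"
    if x_ratio < 0.7 and y_ratio > 0.8:
        return "male, elevated Y (possible XYY)"
    if x_ratio < 0.7 and y_ratio > 0.3:
        return "male"
    return "undetermined"

# Toy coverages giving X/autosomes = 0.5 and Y/autosomes = 0.96.
x_ratio, y_ratio = coverage_ratios(1.0, 1.92, 2.0)
label = infer_sex(x_ratio, y_ratio)
```

A Y/autosomes ratio near 1 combined with a male-range X/autosomes ratio is the pattern flagged here as compatible with two Y copies.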

Genetic relatedness estimation. We assessed relatedness between individuals by calculating the rate of mismatching alleles between every pair of individuals (pairwise mismatch rate) among the overlapping positions as described in (27, 54). The pairwise mismatch rate provides good evidence of close relationships such as identical individuals/twins and first- and second-degree relatives (55). We detected one pair of first-degree relatives: a male and a female (ESZ001 and ESZ003) from the same site (Eleke_Sazy_650BCE) and the same burial (Mound 4). Another first-degree pair was found at the Taldy_7cBCE site (TAL003 and TAL004). We then identified a pair of second-degree relatives from the Karashoky_7cBCE site (KSH001 and KSH003) and a pair of possible second- to third-degree relatives retrieved from two different Tasmola sites, Akbeit_7cBCE and Nurken_8cBCE (AKB001 and NUR002). We removed one individual from each related pair for downstream population-based analyses (data file S1).
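As an illustration of the statistic, the pairwise mismatch rate on pseudo-haploid data can be computed as below. This is a minimal sketch, not the published implementation: genotypes are represented as lists of single alleles with None marking a missing call, and the function name is hypothetical.

```python
from typing import List, Optional, Tuple

def pairwise_mismatch_rate(g1: List[Optional[str]],
                           g2: List[Optional[str]]) -> Tuple[Optional[float], int]:
    """Return (mismatch rate, number of overlapping sites).

    Only sites called in both individuals are compared; the rate is None
    when the two individuals share no called site."""
    overlap = [(a, b) for a, b in zip(g1, g2) if a is not None and b is not None]
    if not overlap:
        return None, 0
    mismatches = sum(1 for a, b in overlap if a != b)
    return mismatches / len(overlap), len(overlap)

# Toy genotypes over five sites; three sites overlap, one of them mismatches.
ind1 = ["A", "G", None, "T", "C"]
ind2 = ["A", "A", "C", None, "C"]
rate, n_sites = pairwise_mismatch_rate(ind1, ind2)
```

In practice the rate is interpreted relative to a baseline of unrelated pairs from the same population, with related pairs showing lower values, so the raw number alone does not assign a degree of relatedness.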

We used Schmutzi to obtain the consensus sequence of the mitochondrial DNA with a q10 quality cutoff, and we used HaploGrep2 (56) to assign haplogroups. We used yHaplo (57) to assign Y chromosome haplogroups of the male individuals. To obtain Y chromosome genotypes, we used pileupCaller in the --majorityCall mode to call the allele supported by most reads for each Y chromosome SNP included in the 1240K panel (data file S1).

We applied the smartpca v16000 function in the EIGENSOFT v6.0.1 package (58) to the 1240KHO dataset to run PCA. We used the lsqproject option to project the ancient individuals onto the PCA calculated on the set of modern populations; this bypasses the high number of missing genotypes in the ancient data, which would otherwise artificially shift the eigenvectors toward the origin of the axes. We used a set of 150 present-day Eurasian populations on which we projected our newly produced 111 ancient unrelated individuals that passed the quality controls together with other relevant published ancient genomes. We also ran a PCA only on the genotypes of the present-day Kazakh individuals. We then applied ADMIXTURE v.1.3.0 (59) unsupervised cluster analyses testing K = 2 to K = 16 on a set of worldwide ancient and modern individuals. For each K value tested, we performed 10 independent ADMIXTURE runs with different random seeds to check the convergence of log-likelihoods across the different runs. For each K value, we selected for consideration the run with the highest log-likelihood. We also estimated the cross-validation error (CV-err) for each K value to identify the most parsimonious models (i.e., those beyond which increasing K does not produce a visible decrease in CV-err) and thus avoid overfitting. For ADMIXTURE analyses, we removed the variants with a minor allele frequency of <0.01, and we pruned the dataset, removing all the SNPs in LD with an r2 > 0.4, setting a 200-SNP sliding window with a 25-SNP step using the dedicated commands in PLINK. The pruned dataset consists of 206,728 SNPs, and we further removed from the analysis all the ancient individuals with a missingness rate higher than 95%, which corresponds to including individuals with at least ~10,000 nonmissing variants on this thinned dataset.
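Two small bookkeeping steps in this paragraph can be checked with a short Python sketch: the number of nonmissing SNPs implied by the 95% missingness cutoff, and the choice of the highest-likelihood run for each K. The function names and the toy log-likelihoods are hypothetical, and the toy example uses three runs per K rather than the 10 described.

```python
from typing import Dict, List

PRUNED_SNPS = 206_728    # SNPs left after LD pruning (from the text)
MAX_MISSINGNESS = 0.95   # missingness cutoff for ancient individuals (from the text)

def min_nonmissing_snps(n_snps: int = PRUNED_SNPS,
                        max_missing: float = MAX_MISSINGNESS) -> int:
    """Approximate number of genotyped SNPs at the missingness cutoff."""
    return round(n_snps * (1.0 - max_missing))

def best_run_per_k(logliks: Dict[int, List[float]]) -> Dict[int, int]:
    """For each K, return the index of the run with the highest log-likelihood."""
    return {k: max(range(len(runs)), key=runs.__getitem__)
            for k, runs in logliks.items()}

# Hypothetical log-likelihoods for three runs at K = 2 and K = 3.
toy_logliks = {2: [-1050.2, -1049.7, -1051.0], 3: [-990.4, -992.1, -990.9]}
best = best_run_per_k(toy_logliks)
threshold = min_nonmissing_snps()
```

The computed threshold of 10,336 SNPs agrees with the "~10,000 nonmissing variants" quoted above.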

For the new ancient individuals produced in the current study, we first considered a site-based labeling system consisting of the site name plus the age expressed in centuries BCE/CE, referring to the median of the archeological time range of the site (e.g., Site_Name_400BCE). In the few cases where a single site presented burials from different and discontinuous time periods, multiple unique names for the same site exist (e.g., Berel_50BCE and Berel_300CE). Following the recommendations in (60) for grouping the individuals into populations, we used a mixed system consisting of the archaeological culture's name and the archaeological age of the sites included in the group in centuries BCE/CE (e.g., Tasmola_650BCE). To respect the high levels of admixture and genetic variability observed within most of the ancient cultures studied, we identified and excluded as outliers only the individuals that deviated by more than 2 SD from the median of at least one of the first three PCs within their respective culture group (table S1).
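The outlier rule can be sketched in Python as follows. The sketch assumes that the SD is the sample standard deviation of each PC within the group, which the text does not specify; the function name and all coordinates are hypothetical.

```python
from statistics import median, stdev
from typing import Dict, List, Tuple

def pca_outliers(coords: Dict[str, Tuple[float, float, float]],
                 n_sd: float = 2.0) -> List[str]:
    """Flag individuals lying more than n_sd standard deviations from the
    group median on at least one of the first three PCs."""
    ids = list(coords)
    outliers = set()
    for pc in range(3):
        values = [coords[i][pc] for i in ids]
        center = median(values)
        spread = stdev(values)  # assumption: sample SD within the group
        for i in ids:
            if abs(coords[i][pc] - center) > n_sd * spread:
                outliers.add(i)
    return sorted(outliers)

# Toy group of nine individuals; the last one is displaced on PC1 only.
pc1 = [0.00, 0.01, -0.01, 0.02, -0.02, 0.01, -0.01, 0.00, 0.20]
pc2 = [0.01, -0.01, 0.02, -0.02, 0.00, 0.01, -0.01, 0.02, -0.02]
pc3 = [-0.02, 0.02, -0.01, 0.01, 0.00, -0.02, 0.02, -0.01, 0.01]
coords = {f"toy{i + 1}": (pc1[i], pc2[i], pc3[i]) for i in range(9)}
outliers = pca_outliers(coords)
```

Centering on the median rather than the mean keeps the reference point from being pulled toward the outlier itself, although the SD is still inflated by it in small groups.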

In the cases of cultures represented by one site only, or by few individuals with highly admixed genetic profiles, we retained the site-based labels or used different grouping combinations depending on specific hypotheses and analyses tested as detailed in Results. We extended this labeling system also to previously published ancient individuals belonging to closely related cultures. Nevertheless, to limit potential batch effects due to different laboratory techniques and sequencing methods (i.e., shotgun or 1240K capture), we avoided grouping together our newly generated individuals with previously published ones even if belonging to the same age or culture, opting for using them as independent control groups for validation (data file S3). For consistency with the literature, the rest of the reference ancient and modern population labels and groupings were kept the same as in the original publications unless stated otherwise (data file S3).

All the f-statistic-based analyses were run using the dedicated programs in the ADMIXTOOLS package (47) on the 1240K dataset. We ran outgroup-f3 analyses with qp3Pop (v400). We tested the form f3(Test, X; Mbuti) for the Kazakh ancient individuals as Test against every other X individual/population included in the dataset. For the newly reported individuals, outgroup-f3 was run on a site-based grouping, treating the PCA outliers separately. To test specific hypotheses detailed in the results, we also computed f4-statistics with qpDstat (v711), setting the f4mode: YES option. For both f3- and f4-statistics, we considered only the tests that had a number of overlapping SNPs of >30,000, and we considered |Z| > 3 as a threshold for significance.
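For orientation, the f4-statistic is the average over SNPs of the product of allele frequency differences, f4(A, B; C, D) = mean[(pA − pB)(pC − pD)]. The Python sketch below computes this naive estimator and applies the two reporting filters stated above. It is not the ADMIXTOOLS implementation, which also handles missing data and finite sample sizes; the function names and toy frequencies are hypothetical.

```python
from typing import List

def f4(pa: List[float], pb: List[float],
       pc: List[float], pd: List[float]) -> float:
    """Naive f4(A, B; C, D) from per-SNP allele frequencies."""
    terms = [(a - b) * (c - d) for a, b, c, d in zip(pa, pb, pc, pd)]
    return sum(terms) / len(terms)

def passes_filters(z: float, n_snps: int,
                   z_cutoff: float = 3.0, min_snps: int = 30_000) -> bool:
    """Apply the SNP-overlap and significance thresholds quoted in the text."""
    return n_snps > min_snps and abs(z) > z_cutoff

# Toy frequencies at two SNPs for four populations.
value = f4([0.1, 0.5], [0.3, 0.5], [0.2, 0.4], [0.6, 0.4])
```

A value significantly different from zero indicates that the tree ((A, B), (C, D)) does not fit, for example because of gene flow between one population of each pair.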

We then ran f4-statistic-based ancestry decomposition analyses on the 1240K dataset using the qpWave and qpAdm (v632) pipeline (5, 61). SEs for the computed f-statistics were estimated using a block jackknife with a 5-centimorgan block. We used the following set of eight outgroups (OG1), which includes representatives of western and eastern Eurasian and relevant non-Eurasian ancient lineages, using either ancient individuals directly or present-day proxies: Mbuti, Natufian, Anatolia_N, Ganj_Dareh_N, Villabruna, Onge, Ami, and Mixe. As sources, we used, when available, ancient populations from the closest available preceding time periods. We started by selecting proximal source populations for the IA groups by choosing representatives of the three main genetic ancestries found in the MLBA in the Kazakh Steppe and the surrounding regions. Specifically, we used as western ancestry sources a chosen set of steppe_MLBA groups as well as the earlier steppe_EBA (i.e., Yamnaya and Afanasievo) for completeness of analysis. We selected two groups each from the west and central clusters described in (7): Sintashta_MLBA and Srubnaya as representative of the western cultures from the southern Ural area (steppe_MLBA_west) and Dali_MLBA and Krasnoyarsk_MLBA from northeastern Kazakhstan and the Minusinsk Basin in Russia, respectively, as representative of the eastern cluster showing higher affinity to preceding local hunter-gatherer populations and ultimately to the ANE-related ancestry (steppe_MLBA_central). As sources of East Asian ancestries, we used previously published Eneolithic and Early BA individuals from the Baikal region (Baikal_EN and Baikal_EBA) and from the Minusinsk Basin (Okunevo) and LBA individuals from the Khovsgol site in northern Mongolia (Khovsgol) as well as Neolithic individuals from the Amur River Basin representative of a deep north East Asian lineage (DevilsCave_N) presenting a high genetic continuity with modern individuals from the same region (52, 62). 
As third Iranian-related sources, we used the available Eneolithic groups from Iran (Iran_ChL and Hajji_Firuz_C), the MLBA groups from the Caucasus (Caucasus_MBA_North_Caucasus, Caucasus_Late_Maykop, and Caucasus_Kura_Araxes), Armenia_LBA, Eneolithic groups from Turan (Geoksiur_EN and Tepe_Hissar_C), and BA groups from Turan associated with the Bactria-Margiana complex or BMAC (Gonur1_BA) and the later MLBA post-BMAC (Sappali_Tepe_BA, Bustan_BA, and Dzharkutan1_BA). We first performed 1648 independent qpWave/qpAdm tests for each target group, permuting all combinations of two-way (N = 16) and three-way (N = 96) sources (data file S4). For the IA outliers and the later CE groups, we used the preceding IA groups as first sources and tested different second and, when two-way tests failed, third sources to narrow down closer proxies that could explain the nature of the observed genetic turnovers. For the modeling of the present-day Kazakh Zhuz, we used the 1240KHO dataset with the same set of outgroups used for the ancient groups (although the modern populations are represented by more individuals in the 1240KHO than in the 1240K dataset; data file S2). We tested two-way models (data file S4) with all the later CE individuals from the northern latitude, showing the eastern Eurasian influx, as the first source (Xiambei_Hun_Berel_300CE, Hun_elite_350CE, Karakaba_830CE, and Kayalyk_950CE) and the southern CE individuals with Iranian-related influx as the second source (Konyr_Tobe_300CE and Alai_Nura_300CE).
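The block jackknife mentioned for the SEs can be illustrated with a simplified Python sketch. It uses a delete-one-block jackknife with equal block weights, which is a simplifying assumption rather than the weighted scheme used by ADMIXTOOLS, and it treats the statistic as a plain mean of per-SNP terms grouped into genomic blocks. The function name and toy values are hypothetical.

```python
from math import sqrt
from typing import List

def block_jackknife_se(blocks: List[List[float]]) -> float:
    """Standard error of the mean of per-SNP terms, estimated by leaving
    out one block at a time (unweighted delete-one jackknife)."""
    total_sum = sum(sum(b) for b in blocks)
    total_n = sum(len(b) for b in blocks)
    n_blocks = len(blocks)
    # Estimate recomputed with each block removed in turn.
    leave_one_out = [(total_sum - sum(b)) / (total_n - len(b)) for b in blocks]
    mean_loo = sum(leave_one_out) / n_blocks
    variance = (n_blocks - 1) / n_blocks * sum(
        (t - mean_loo) ** 2 for t in leave_one_out)
    return sqrt(variance)

# Three toy blocks of one term each; the result equals s / sqrt(n) here.
se = block_jackknife_se([[1.0], [2.0], [3.0]])
```

Resampling whole blocks rather than single SNPs accounts for the correlation between neighboring SNPs caused by LD, and the Z score is then the estimate divided by this SE.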

We used DATES (7) on the 1240K dataset to date the admixture events identified from previous analyses in the IA and post-IA Kazakh individuals. The method is conceptually similar to commonly used admixture-dating methods based on LD such as ALDER (63), although instead of calculating the weighted LD decay, which would require high coverage with virtually no missing data, DATES uses the decay of ancestry covariance (AC) coefficients between pairs of overlapping SNPs over increasing genetic distance. As for the LD-based methods, an exponential function can be fitted to the decay of weighted AC as genetic distance increases to infer admixture parameters such as the number of generations since admixture (63). We then assumed a standard 29 years per generation to convert the number of generations into years since admixture (7). DATES assumes a two-way admixture; therefore, we used as sources pairwise combinations of the best proxies resulting from the qpAdm modeling analyses. In choosing the source populations, we also considered that the method is sensitive to sample sizes and coverage, preferring proxies with a higher number of individuals or, when possible, pooling together genetically homogeneous populations to reduce statistical noise (table S3). In choosing the target individuals, for sites with more complex chronology (i.e., containing burials belonging to different time periods), we included only individuals directly 14C-dated to reduce the errors due to incorrect context dating.
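The dating step can be illustrated with a minimal Python sketch that fits an exponential decay of the form y = A·exp(−n·d), with d the genetic distance in Morgans and n the number of generations since admixture, and then converts n into years. This is not the DATES implementation: it assumes noise-free, strictly positive values and no constant offset, uses a simple log-linear least-squares fit, and runs on a synthetic curve.

```python
from math import exp, log
from typing import List

YEARS_PER_GENERATION = 29  # generation time quoted in the text

def fit_generations(distances_morgans: List[float],
                    covariances: List[float]) -> float:
    """Least-squares fit of log(y) = log(A) - n * d; returns n."""
    xs = distances_morgans
    ys = [log(y) for y in covariances]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return -slope

# Synthetic decay curve generated with n = 40 generations (hypothetical).
distances = [0.005 * i for i in range(1, 21)]
toy_curve = [0.1 * exp(-40 * d) for d in distances]
generations = fit_generations(distances, toy_curve)
years_since_admixture = generations * YEARS_PER_GENERATION
```

Under these assumptions, 40 generations correspond to 1160 years before the date of the sampled individuals.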

We reconstructed the phase of haplotypes for the modern Kazakh individuals together with the set of worldwide modern populations present in the 1240KHO dataset using SHAPEIT2 v2.r790 (64) with default parameters and using HapMap phase 3 recombination maps. To explore the fine-scale population structure among present-day Kazakh individuals, we applied the haplotype-based CHROMOPAINTERv2/fineSTRUCTURE pipeline (65) on the Central Asian Southern Steppe populations present in the dataset plus all the Kazakh individuals (96 new and 18 previously published). We first estimated the mutation/emission and the switch rate parameters with 10 steps of the Expectation-Maximization algorithm on a subset of chromosomes {4, 10, 15, 22} using each individual as donor and recipient. Then, we averaged the values across chromosomes (normalized by the number of SNPs per chromosome) and individuals, and we used these mutation/emission and switch rate parameters to run CHROMOPAINTER again on all chromosomes, considering a parameter k = 50 to specify the number of expected chunks to define a region. The obtained matrix of haplotype-sharing chunk counts was summed up across all the 22 autosomes and submitted to the fineSTRUCTURE clustering algorithm version fs2.1 (65). We ran fineSTRUCTURE pipeline by setting 1,000,000 burn-in Markov chain Monte Carlo iterations, followed by additional 2,000,000 iterations and sampling the inferred clustering patterns every 10,000 runs. Last, we set 1,000,000 additional hill-climbing steps to improve posterior probability and merge clusters in a stepwise fashion. Individuals were hierarchically assembled into clusters until reaching the final configuration tree. We then applied the GLOBETROTTER algorithm (66) using the Kazakhs, grouped according to their Zhuz affiliation, as targets and a set of 85 non-Inner Eurasian populations as reference groups following the same pipeline detailed in (26) to date admixture and identify the main contributing sources. 
All GLOBETROTTER runs were conducted according to the guidelines reported in (66) and performing a first run standardizing over a null individual.
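The parameter-averaging step described for CHROMOPAINTER can be sketched in Python as a SNP-count-weighted mean across the four chromosomes used for estimation. The function name, the per-chromosome estimates, and the SNP counts are all hypothetical, and this reading of "normalized by the number of SNPs per chromosome" is an assumption.

```python
from typing import Dict

def snp_weighted_mean(estimates: Dict[int, float],
                      snp_counts: Dict[int, int]) -> float:
    """Average per-chromosome estimates, weighting each chromosome by its
    number of SNPs."""
    total_snps = sum(snp_counts[c] for c in estimates)
    return sum(estimates[c] * snp_counts[c] for c in estimates) / total_snps

# Hypothetical switch-rate estimates and SNP counts for chromosomes 4, 10, 15, 22.
toy_estimates = {4: 100.0, 10: 200.0, 15: 300.0, 22: 400.0}
toy_snp_counts = {4: 4000, 10: 3000, 15: 2000, 22: 1000}
switch_rate = snp_weighted_mean(toy_estimates, toy_snp_counts)
```

The same averaging would be applied to the mutation/emission parameter and then across individuals before the final run on all chromosomes.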

L. Koryakova, Europe to Asia, in The Oxford Handbook of the European Iron Age, C. Haselgrove, K. Rebay-Salisbury, P. S. Wells, Eds. (Oxford Univ. Press, 2018), pp. 1–41.

J. Davis-Kimball, L. Koryakova, E. M. Murphy, L. T. Yablonsky, Kurgans, Ritual Sites, and Settlements: Eurasian Bronze and Iron Age (British Archaeological Reports Limited, 2000), 324 pp.

V. Mordvinceva, S. Reinhold, The northern Black Sea and North Caucasus, in The Oxford Handbook of the European Iron Age, C. Haselgrove, K. Rebay-Salisbury, P. S. Wells, Eds. (Oxford Univ. Press, 2018), pp. 1–50.

Deutsches Archäologisches Institut, Museum für Vor- und Frühgeschichte (Berlin, Germany), G. K. Hypo-Kulturstiftung (Munich, G. Martin-Gropius-Bau, Berlin), Im Zeichen des goldenen Greifen: Königsgräber der Skythen (Prestel, Munich, 2007).

A. M. K. Lerner, Iron Age Nomads of the Urals: Interpreting Sauro-Sarmatian and Sargat Identities (UMI ProQuest, Ann Arbor, 2006).

S. M. Akimbekov, History of the Steppes: The Phenomenon of the State of Genghis Khan in the History of Eurasia (Center of Asia, 2011).

S. Kenzheakhmetuli, in Seven Treasures: Collection of 2 Books, A. A. Tili, Ed. (2002), pp. 134–135.

M. Nagy, A Hun-Age burial with male skeleton and horse bones found in Budapest, in Neglected Barbarians, F. Curta, Ed. (Studies in the Early Middle Ages, Brepols Publishers, 2010), vol. 32, pp. 137–175.

A. Z. Beisenov, Burial and ritual complex Kurgan 37 Warriors, in Bulletin of SUSU (Series Social and Human Sciences, South Ural State University, Chelyabinsk, 2015) (in Russian), vol. 15, pp. 6–12.

A. Z. Beisenov, Earrings of the Saka epoch. Bulletin of Tomsk State University. History 6 (2014) (in Russian), pp. 121–128.

A. Z. Beisenov, A. O. Ismagulova, E. P. Kitov, A. O. Kitova, The Population of Central Kazakhstan in the 1st Millennium BC (Almaty Institute of Archeology Named after A.Kh. Margulan, 2015) (in Russian), p. 188.

A. Z. Beisenov, E. P. Kitov, The Taldy II burial ground of Tasmola Culture in the Central Kazakhstan (craniological analysis). Science Journal of Volgograd State University History. Area Studies. International Relations, 4, 71–85 (2014).

Z. Samashev, E. M. Kariev, S. E. Erbolatov, Hun-Syanbian cultural and chronological horizon of Berel, in Materials of the International Archaeological Conferences, B.A. Baytanayev, Ed. (L.N. Gumilyov Eurasian National University, 2019) (in Russian), pp. 385–394.

V. E. Stoyanov, A. G. Degtyarev, Report on the Kurgansk detachment of the Ural university expedition of 1963, Archive of IA RAS (Institute of Archaeology of the Russian Academy of Sciences), P-1, No. 2749 (in Russian).

Z. S. Samashev, G. S. Zhumabekova, A. S. Ermolaeva, G. Omarov, Military archeology. Weapon and military affair in a historical and social perspective. Early Saka arrowheads from Kazakhstan Altai - SPb (State Hermitage, 1998) (in Russian), pp. 155–160.

Z. S. Samashev, A. S. Ermolaeva, G. S. Zhumabekova, Kazakh Altai in the 1st millennium BC. Kazakhstan in the Saka era, in Collective Monograph. Almaty (Institute of Archeology Named after A.Kh. Margulan, 2017) (in Russian), pp. 101–156.

A. Z. Beisenov, V. V. Varfolomeev, V. K. Merz, I. V. Merz, Excavations of the Karaoba burial ground in 2013 (preliminary report). The dialogue of cultures of Eurasia in the archeology of Kazakhstan, in A Collection of Scientific Articles Dedicated to the 90th Birthday of the Outstanding Archaeologist K. A. Akishev, T.S. Sadykov, Ed. (Saryarka Publishing House, 2014) (in Russian), p. 736.

A. Onggaruly, G. Kiyasbek, M. Kyzirkhanov, A. Kairmagambetov, The sanctuary of Aigyrly 2 in Mangystau (preliminary results), in Religion and the Worldview System of Ancient and Medieval Nomads of Eurasia. Collection of scientific articles, A. Onggaruly, Ed. (Institute of Archeology named after A.Kh. Margulan, 2016) (in Russian), pp. 93–106.

A. Onggaruly, The image of the Sarmatian batyr by the materials of the sacred Aral-Caspian streams. Heritage of the Great Steppe: masterpieces of jewelry art. IV. The world of art images of nomads. The exhibition catalogue (The National Museum of the Republic of Kazakhstan, 2018), pp. 47–51.

A. Onggaruly, V. S. Olkhovsky, A. Astafyev, D. Darmenov, The Ancient Sanctuaries of Ustyurt and Eastern Aral (Institute of Archaeology Named after A.Kh. Margulan, 2017) (in Russian), p. 320.

N. P. Petrov, V. V. Rodionov, Report on archaeological exploration in the Aktobe region in the summer of 1977, in Funds of AOIKM. Aktyubinsk (Aktobe Regional Museum of Local History, 1978).

S. Yu Gutsalov, G. V. Makarevich, Report on archaeological explorations and excavations in the Aktobe region in the summer and fall of 1986, in Funds of AOIKM. Aktyubinsk (Aktobe Regional Museum of Local History, 1987).

M. G. Moshkova, G. V. Kushaev, Report on the work of the West Kazakhstan expedition of 1969. Archive of IA RAS (Institute of Archaeology of the Russian Academy of Sciences), P-1, Nos. 4381 and 4381a (in Russian).

A. A. Bisembaev, A. I. Havansky, Excavations of the burial ground Kajynbulak II in Aktyubinsk region of the Republic of Kazakhstan in 2017. International scientific conference "New in the research of the early Iron Age of Eurasia: problems, discoveries, methods": abstracts. Ans. ed. A.A. Malyshev (MAX Press, 2018) (in Russian), p. 174.

K. M. Baipakov, J. K. Taimagambetov, Archeology of Kazakhstan: Textbook for University Students (Kazakh University, 2006) (in Russian), p. 355.

B. Ayagan, Kazakhstan. National encyclopedia. 1 (LLP, Kazakh Encyclopedia, 2004) (in Russian), p. 560.

E. Maanaev, V. Ploskikh, On the Roof of the World: Historical Essays on the Pamir-Alai Kirghiz (Mektep, 1983) (in Russian), p. 144.

V. A. Mogilnikov, Report on the work of the Irtysh detachment of the West Siberian expedition of 1967. Archive of IA RAS (Institute of Archaeology of the Russian Academy of Sciences), P-1, No. 3464 (in Russian).

V. A. Mogilnikov, Report on the work of the Irtysh detachment of the West Siberian expedition of 1968. Archive of IA RAS (Institute of Archaeology of the Russian Academy of Sciences), P-1, No. 3716 (in Russian).

V. A. Mogilnikov, To the question of Sargat culture, in Problems of Archeology and the Ancient History of the Ugrians, A.P. Smirnov, V.N. Chernetsov, I.F. Erdeli, Eds. (Nauka, 1972) (in Russian), pp. 66–87.

V. F. Gening, Ural archaeological expedition of 1961. Archive of IA RAS (Institute of Archaeology of the Russian Academy of Sciences), P-1, No. 2362 (in Russian).

V. A. Buldashev, Funeral rites of the Gorokhov culture. Abstract. Cand. Diss., Novosibirsk (Ph.D. thesis, Institute of History and Archaeology, Ural Branch, Russian Academy of Sciences, 1998) (in Russian).

Genome analysis for reinfection cases in capital – Hindustan Times

Posted: at 3:27 am

The decision comes a day after the Centre said it found the presence of a novel variant of Sars-Cov-2 in Delhi in nine samples, while 65 others had the UK variant B.1.1.7.

PUBLISHED ON MAR 26, 2021 04:45 AM IST

Samples of anyone with a past history of Covid-19 who tests positive again, or those who catch the disease after getting two doses of a vaccine, will be mandatorily sent for whole genome sequencing, the Delhi government ordered on Thursday. The decision is aimed at augmenting surveillance to look for any concerning variants.

The decision comes a day after the Centre said it found the presence of a novel variant of Sars-Cov-2 in Delhi in nine samples, while 65 others had the UK variant B.1.1.7. It is yet to be established how the novel variant changes the nature of the virus, but it contains two mutations (E484Q and L452R) that could make it spread more readily or evade the immunity conferred by a past infection or a vaccine. The directorate general of health services (DGHS) directive also said that each district has to send 12 samples (three each of mild, moderate, severe and critical cases) of Covid-19 positive cases per week to the National Centre for Disease Control (NCDC) for whole genome sequencing. "This will help us in detecting which strain is causing most of the infections here, whether it is the new variant, other variants such as UK, South Africa or Brazil, or something else," said a senior official from Delhi's health department.

Go here to read the rest:
Genome analysis for reinfection cases in capital - Hindustan Times

Precision BioSciences to Participate in the Guggenheim Healthcare Talks 2021 Genomic Medicines & Rare Disease Day – Yahoo Finance

Posted: at 3:27 am

Bloomberg

(Bloomberg) -- From his perch high above Midtown Manhattan, just across from Carnegie Hall, Bill Hwang was quietly building one of the world's greatest fortunes.

Even on Wall Street, few ever noticed him -- until suddenly, everyone did.

Hwang and his private investment firm, Archegos Capital Management, are now at the center of one of the biggest margin calls of all time -- a multibillion-dollar fiasco involving secretive market bets that were dangerously leveraged and unwound in a blink.

Hwang's most recent ascent can be pieced together from stocks dumped by banks in recent days -- ViacomCBS Inc., Discovery Inc., GSX Techedu Inc., Baidu Inc. -- all of which had soared this year, sometimes confounding traders who couldn't fathom why.

One part of Hwang's portfolio, which has been traded in blocks since Friday by Goldman Sachs Group Inc., Morgan Stanley and Wells Fargo & Co., was worth almost $40 billion last week. Bankers reckon that Archegos's net capital -- essentially Hwang's wealth -- had reached north of $10 billion. And as disposals keep emerging, estimates of his firm's total positions keep climbing: tens of billions, $50 billion, even more than $100 billion.

It evaporated in mere days.

"I've never seen anything like this -- how quiet it was, how concentrated, and how fast it disappeared," said Mike Novogratz, a career macro investor and former partner at Goldman Sachs who's been trading since 1994. "This has to be one of the single greatest losses of personal wealth in history."

Late Monday in New York, Archegos broke days of silence on the episode.

"This is a challenging time for the family office of Archegos Capital Management, our partners and employees," Karen Kessler, a spokesperson for the firm, said in an emailed statement. "All plans are being discussed as Mr. Hwang and the team determine the best path forward."

The cascade of trading losses has reverberated from New York to Zurich to Tokyo and beyond, and leaves myriad unanswered questions, including the big one: How could someone take such big risks, facilitated by so many banks, under the noses of regulators the world over?

One part of the answer is that Hwang set up as a family office with limited oversight and then employed financial derivatives to amass big stakes in companies without ever having to disclose them. Another part is that global banks embraced him as a lucrative customer, despite a record of insider trading and attempted market manipulation that drove him out of the hedge fund business a decade ago.

A disciple of hedge-fund legend Julian Robertson, Sung Kook "Bill" Hwang shuttered Tiger Asia Management and Tiger Asia Partners after settling an SEC civil lawsuit in 2012 accusing them of insider trading and manipulating Chinese banks' stocks. Hwang and the firms paid $44 million, and he agreed to be barred from the investment advisory industry.

He soon opened Archegos -- Greek for "one who leads the way" -- and structured it as a family office.

Family offices that exclusively manage one fortune are generally exempt from registering as investment advisers with the U.S. Securities and Exchange Commission. So they don't have to disclose their owners, executives or how much they manage -- rules designed to protect outsiders who invest in a fund. That approach makes sense for small family offices, but if they swell to the size of a hedge fund whale they can still pose risks, this time to outsiders in the broader market.

"This does raise questions about the regulation of family offices once again," said Tyler Gellasch, a former SEC aide who now runs the Healthy Markets trade group. "The question is if it's just friends and family why do we care? The answer is that they can have significant market impacts, and the SEC's regulatory regime even after Dodd-Frank doesn't clearly reflect that."

Valuable Customer

Archegos established trading partnerships with firms including Nomura Holdings Inc., Morgan Stanley, Deutsche Bank AG and Credit Suisse Group AG. For a time after the SEC case, Goldman refused to do business with him on compliance grounds, but relented as rivals profited by meeting his needs.

The full picture of his holdings is still emerging, and it's not clear what positions derailed, or what hedges he had set up.

One reason is that Hwang never filed a 13F report of his holdings, which every investment manager holding more than $100 million in U.S. equities must fill out at the end of each quarter. That's because he appears to have structured his trades using total return swaps, essentially putting the positions on the banks' balance sheets. Swaps also enable investors to add a lot of leverage to a portfolio.

Morgan Stanley and Goldman Sachs, for instance, are listed as the largest holders of GSX Techedu, a Chinese online tutoring company that's been repeatedly targeted by short sellers. Banks may own shares for a variety of reasons that include hedging swap exposures from trades with their customers.

Unhappy Investors

Goldman increased its position 54% in January, according to regulatory filings. Overall, banks reported holding at least 68% of GSX's outstanding shares, according to a Bloomberg analysis of filings. Banks held at least 40% of IQIYI Inc, a Chinese video entertainment company, and 29% of ViacomCBS -- all of which Archegos had bet on big.

"I'm sure there are a number of really unhappy investors who have bought those names over the last couple of weeks, and now regret it," Doug Cifu, chief executive officer of electronic-trading firm Virtu Financial Inc., said Monday in an interview on Bloomberg TV. He predicted regulators will examine whether there should be more transparency and disclosure by a family office.

Without the need to market his fund to external investors, Hwang's strategies and performance remained secret from the outside world. Even as his fortune swelled, the 50-something kept a low profile. Despite once working for Robertson's Tiger Management, he wasn't well-known on Wall Street or in New York social circles.

Hwang is a trustee of the Fuller Theology Seminary, and co-founder of the Grace and Mercy Foundation, whose mission is to serve the poor and oppressed. The foundation had assets approaching $500 million at the end of 2018, according to its latest filing.

"It's not all about the money, you know," he said in a rare interview with a Fuller Institute executive in 2018, in which he spoke about his calling as an investor and his Christian faith. "It's about the long term, and God certainly has a long-term view."

His extraordinary run of fortune turned early last week as ViacomCBS Inc. announced a secondary offering of its shares. Its stock price plunged 9% the next day.

The value of other securities believed to be in Archegos' portfolio based on the positions that were block traded followed.

By Thursday's close, the value of the portfolio fell 27% -- more than enough to wipe out the equity of an investor who market participants estimate was six to eight times levered.

It's also hurt some of the banks that served Hwang. Nomura and Credit Suisse warned of significant losses in the wake of the selloff and Mitsubishi UFJ Financial Group Inc. has flagged a potential $300 million loss.

"You have to wonder who else is out there with one of these invisible fortunes," said Novogratz. "The psychology of all that leverage with no risk management, it's almost nihilism."

(Updates with latest bank to detail exposure in penultimate paragraph.)

©2021 Bloomberg L.P.
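The leverage arithmetic behind the "27% fall" remark can be checked with a short sketch. It assumes simple, constant leverage with no hedges or margin top-ups, which is a simplification and not a description of Archegos's actual book; the function names are illustrative only.

```python
def wipeout_drop(leverage: float) -> float:
    """Fractional fall in asset value that erases all equity at a given leverage."""
    if leverage <= 0:
        raise ValueError("leverage must be positive")
    return 1.0 / leverage


def equity_after_drop(leverage: float, drop: float) -> float:
    """Remaining equity per unit of starting equity, floored at zero."""
    return max(0.0, 1.0 - leverage * drop)


# The article cites estimates of six to eight times leverage and a 27% fall.
for lev in (6, 8):
    print(f"{lev}x levered: equity is gone after a {wipeout_drop(lev):.1%} fall; "
          f"equity left after a 27% fall: {equity_after_drop(lev, 0.27):.2f}")
```

Under these assumptions, equity is exhausted by a fall of roughly 12.5% to 16.7%, which is consistent with the article's statement that a 27% decline was "more than enough."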

See the original post here:
Precision BioSciences to Participate in the Guggenheim Healthcare Talks 2021 Genomic Medicines & Rare Disease Day - Yahoo Finance

Posted in Genome | Comments Off

PrecisionLife Continues Growth and Expansion With Acquisition of Danish Genomic Analytics Innovator GenoKey – Business Wire

Posted: at 3:27 am

OXFORD, England & AALBORG, Denmark--(BUSINESS WIRE)--PrecisionLife today announces that it has acquired its long-term Danish technology development partner GenoKey ApS, bringing together the leaders in combinatorial analytics and large-scale genomic analysis, and enabling PrecisionLife to continue its expansion as an AI-enabled precision medicine company. Financial details of the paper-based transaction were not disclosed.

PrecisionLife's platform, which includes technology developed with GenoKey, enables the company to gain unique insights into genes associated with disease, as biomarkers and as targets for drug discovery. PrecisionLife's business model maximizes the impact of its platform by partnering with others as well as building a pipeline of proprietary assets in chronic diseases.

In addition to its expertise and IP, PrecisionLife will benefit from GenoKey's strong relationships with the Danish health system and leading academic clinical research centers including Aalborg, Aarhus and Copenhagen. Recently, PrecisionLife joined the pan-European FEMaLe consortium led by researchers from Aarhus University, a €5.3M international EU Horizon 2020 project that aims to develop precision medicine approaches to improve the diagnosis, treatment and quality of life of patients with endometriosis.

PrecisionLife will maintain its core platform development operations at GenoKey's site in Denmark, with further team expansion in the region planned. GenoKey's Chairman and co-founder, Hans-Christian Brahe Møller, joins the board of the wholly-owned subsidiary, PrecisionLife ApS. GenoKey's scientific advisors will become part of the PrecisionLife advisory group.

"The acquisition of GenoKey solidifies a long-term, highly productive collaboration around core IP, and positions PrecisionLife for its next round of investment and growth as a leader in the delivery of precision medicine beyond cancer and rare disease," said Dr Steve Gardner, CEO of PrecisionLife.

The accuracy and additional insights generated by PrecisionLife's combinatorial analytics platform have been validated in multiple chronic disease areas such as ALS, schizophrenia, asthma, type-II diabetes and endometriosis as well as severe COVID-19. During the pandemic, PrecisionLife was able to find significantly more signals in severe COVID-19 patient datasets than traditional Genome Wide Association Study (GWAS) methods used by international consortia with access to much larger data sets, uncovering unique avenues for therapeutic intervention (1). These achievements are complemented by GenoKey's collaboration with Professor Erling Mellerup and his team at Copenhagen University on bipolar and other neuropsychiatric disorders, initially sponsored by the Lundbeck Foundation.

Welcoming the transaction, Hans-Christian Brahe Møller, Chairman of GenoKey, said, "This acquisition presents an exciting opportunity to ensure that GenoKey's 10 years of pioneering analytics development can contribute to the global challenge of delivering new solutions for patients with unmet medical needs in chronic diseases, which represent a huge economic and social burden to healthcare systems and millions of patients around the world."

1. COVID-19 studies, see https://www.medrxiv.org/content/10.1101/2020.06.17.20134015v2.full.pdf and https://www.medrxiv.org/content/10.1101/2021.02.08.21250899v1.full.pdf

About PrecisionLife

PrecisionLife is headquartered near Oxford, UK and has operations in Aalborg and Copenhagen, Denmark, Warsaw, Poland and Cambridge, MA, USA. The company's unique combinatorial analytic platform generates more insights into the complex biology of chronic diseases, driving the next wave of precision medicine applications and finding new treatment opportunities for patients' unmet medical needs. PrecisionLife partners with disease charities, clinical research groups, CROs, best of breed technology providers and pharma, biotech and healthcare companies to improve our knowledge of chronic disease biology. PrecisionLife operates an innovation engine that translates proprietary disease biology insights into new drug discovery programs, more successful and cost-effective clinical trials and more personalized clinical decision support tools.

For more information see https://precisionlife.com/

About GenoKey ApS

GenoKey was founded by Dr Gert Lykke Møller (now Chief Analytics Officer of PrecisionLife), Hans-Christian Brahe Møller and two colleagues. GenoKey pioneered the underlying mathematical approach that enables deep combinatorial analysis of genomic and other clinical and epidemiological patient data. Gert was the first to reduce this innovative approach to computational practice, and this has been developed in collaboration with PrecisionLife into a powerful analytical platform that enables the largest and most detailed precision medicine studies.

View post:
PrecisionLife Continues Growth and Expansion With Acquisition of Danish Genomic Analytics Innovator GenoKey - Business Wire

Posted in Genome | Comments Off

Hong Kong Baptist University-led research unlocks the genomic secrets of organisms that thrive in extreme deep-sea environments – Taiwan News

Posted: at 3:27 am

HONG KONG SAR - Media OutReach - 29 March 2021 - A study led by scientists at Hong Kong Baptist University (HKBU) has decoded the genomes of the deep-sea clam (Archivesica marissinica) and the chemoautotrophic bacteria (Candidatus Vesicomyosocius marissinica) that live in its gill epithelium cells. Through analysis of their genomic structures and profiling of their gene expression patterns, the research team revealed that symbiosis between the two partners enables the clams to thrive in extreme deep-sea environments.

The research findings have been published in the academic journal Molecular Biology and Evolution.

Due to the general lack of photosynthesis-derived organic matter, the deep-sea was once considered a vast "desert" with very little biomass. Yet, clams often form large populations in the high-temperature hydrothermal vents and freezing cold seeps in the deep oceans around the globe where sunlight cannot penetrate but toxic molecules, such as hydrogen sulfide, are available below the seabed. The clams are known to have a reduced gut and digestive system, and they rely on endosymbiotic bacteria to generate energy in a process called chemosynthesis. However, when this symbiotic relationship developed, and how the clams and chemoautotrophic bacteria interact, remain largely unclear.

Horizontal gene transfer between bacteria and clams discovered for the first time

A research team led by Professor Qiu Jianwen, Associate Head and Professor of the Department of Biology at HKBU, collected the clam specimens at 1,360 metres below sea level from a cold seep in the South China Sea. The genomes of the clam and its symbiotic bacteria were then sequenced to shed light on the genomic signatures of their successful symbiotic relationship.

The team found that the ancestor of the clam split from its shallow-water relatives 128 million years ago, when dinosaurs roamed the earth. The study revealed that 28 genes have been transferred from the ancestral chemoautotrophic bacteria to the clam, the first discovery of horizontal gene transfer (a process that transmits genetic material between distantly related organisms) from bacteria to a bivalve mollusc.

The following genomic features of the clam were discovered, and combined, they have enabled it to adapt to the extreme deep-sea environment:

(1) Adaptations for chemosynthesis

The clam relies on its symbiotic chemoautotrophic bacteria to produce the biological materials essential for its survival. In their symbiotic relationship, the clam absorbs hydrogen sulfide from the sediment, and oxygen and carbon dioxide from seawater, and it transfers them to the bacteria living in its gill epithelium cells to produce the energy and nutrients in a process called chemosynthesis. The process is illustrated in Figure 1.

The research team also discovered that the clam's genome exhibits gene family expansion in cellular processes such as respiration and diffusion that likely facilitate chemoautotrophy, including gas delivery to support energy and carbon production, the transfer of small molecules and proteins within the symbiont, and the regulation of the endosymbiont population. These expansions help the host to obtain sufficient nutrients from the symbiotic bacteria.

(2) Shift from phytoplankton-based food

Cellulase is an enzyme that facilitates the decomposition of the cellulose found in phytoplankton, a major primary food source in the marine food chain. It was discovered that the clam's cellulase genes have undergone significant contraction, which is likely an adaptation to the shift from phytoplankton-derived to bacteria-based food.

(3) Adaptation to sulfur metabolic pathways

The genome of the symbiont also holds the secrets of this mutually beneficial relationship. The team discovered that the symbiont has a reduced genome, only about 40% of the size of that of its free-living relatives. Nevertheless, the symbiont genome encodes complete and flexible sulfur metabolic pathways, and it retains the ability to synthesise 20 common amino acids and other essential nutrients, highlighting the importance of the symbiont in generating energy and providing nutrients to support the symbiotic relationship.

(4) Improvement in oxygen-binding capacity

Unlike in vertebrates, haemoglobin, a metalloprotein found in the blood and tissues of many organisms, is not commonly used as an oxygen carrier in molluscs. However, the team discovered several kinds of highly expressed haemoglobin genes in the clam, suggesting an improvement in its oxygen-binding capacity, which can enhance the ability of the clam to survive in deep-sea low-oxygen habitats.

Professor Qiu said: "Most of the previous studies on deep-sea symbiosis have focused only on the bacteria. This first coupled clam-symbiont genome assembly will facilitate comparative studies that aim to elucidate the diversity and evolutionary mechanisms of symbiosis, which allows many invertebrates to thrive in 'extreme' deep-sea ecosystems."

The research was jointly conducted by scientists from HKBU and the HKBU Institute for Research and Continuing Education, the Hong Kong Branch of the Southern Marine Science and Engineering Guangdong Laboratory (Guangzhou), The Hong Kong University of Science and Technology, City University of Hong Kong, the Japan Agency for Marine-Earth Science and Technology, the Sanya Institute of Deep-Sea Science and Engineering, and the Guangzhou Marine Geological Survey.

Posted in Genome | Comments Off

A towering genome: Experimentally validated adaptations to high blood pressure and extreme stature in the giraffe – Science Advances

Posted: March 20, 2021 at 2:57 am

INTRODUCTION

Giraffes are immediately recognizable due to their exceptionally long necks and legs, making them the tallest terrestrial animals. Giraffes played a central role in different evolutionary schools of thought, including those of Lamarck and Darwin. Their unusual anatomy is thought to provide various selective advantages. In addition to allowing access to otherwise inaccessible food resources (1), their elevated head position provides an excellent vantage point for scanning the horizon and thus detecting predators or competitors, both of which are crucial for their survival (2). However, their exceptional anatomy is also accompanied by considerable physiological challenges. Most notably, the cardiovascular system has to tolerate twofold higher systemic blood pressure than most other mammals to supply the brain with blood (3). This elevated hydrostatic pressure has resulted in hypertrophy of their cardiac and arteriole walls (3) and adaptations of the circulation system that prevent sudden changes in blood pressure when a giraffe elevates or lowers its head (4). Giraffes also have neuromotor delays due to their long neural networks (5) and face difficulties in rising due to their long legs, which increases the danger associated with resting and drinking. They also require greatly enlarged and strengthened nuchal ligaments to support their long, heavy neck (6). Hence, the giraffe provides a unique case for studying co-adaptation or evolution in several different traits that are causally linked to an extreme body plan.

The okapi (Okapia johnstoni) is the only other extant member of the Giraffidae family and provides a useful point of genomic comparison. A study published in 2016 provided the first giraffe and okapi draft genomes and identified candidate genes and pathways involved in neck elongation and cardiovascular adaptations (7). However, these initial draft genomes were relatively fragmented, which can both introduce certain biases and limit the interpretation of some analyses (8). Furthermore, the comparative analyses carried out using these draft genomes were restricted to 17,210 genes, which were annotated by aligning with cattle (Bos taurus) reference transcripts, thus limiting the resolution power to explore genomic features unique to the giraffe, not least given the paucity of other ruminant genomes available at that time. Hence, the availability of a higher-quality giraffe genome assembly together with our recently published whole-genome dataset for ~50 ruminant species (9) opens up the possibility of identifying giraffe-specific mutations with a much higher accuracy and robustness. This, in turn, provides a better resource for inferring the true genomic changes that account for the unique body plan of the giraffe.

Here, we report an improved, chromosome-level genome assembly of a Rothschild's giraffe (Giraffa camelopardalis rothschildi), the results of comparative analysis against the recently available ruminant genomes, and, crucially, functional validation of one key cardiovascular and skeletal gene in gene-edited mice. These results provide insights into the genetic basis of the giraffe anatomy and associated adaptations, with particular implications concerning the cardiovascular system, which may be helpful for treating human cardiovascular disease and hypertension.

We sequenced the genome of a male Rothschild's giraffe with a combination of single-molecule real-time sequencing (using an Oxford Nanopore platform), paired-end sequencing (with an Illumina HiSeq 2000 system), and Hi-C sequencing (figs. S1 and S2 and table S1). First, we used Nanopore data to generate initial contigs, and after polishing with Illumina reads, we obtained an assembly with contig N50 of 35.9 Mb (table S2). Next, Hi-C data were used to anchor the contigs into chromosomes, which yielded a final assembly of 2.44 Gb with ~97.95% of the bases successfully anchored to 15 chromosomes (2n = 30) (figs. S3 and S4 and table S3). A series of evaluations show that the genome assembly is of high quality (see the Supplementary Materials, figs. S2 to S6, and tables S4 to S11).
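For readers unfamiliar with the contig N50 statistic quoted above, the following minimal sketch shows how it is conventionally computed: N50 is the length L such that contigs of length L or longer together cover at least half of the assembly. The function name and the toy contig lengths are made up for illustration and are not taken from the giraffe assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are sorted from longest to shortest and accumulated until
    the running total reaches half of the summed length; the length of
    the contig that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("need at least one contig length")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


# Toy contig lengths in Mb (hypothetical values, total 300 Mb).
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))  # 80 + 70 = 150 reaches half of 300, so N50 is 70
```

A larger N50 means that fewer, longer contigs make up most of the assembly, which is why it is commonly reported as a contiguity measure.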

Chromosome evolution is related to genome size, gene family evolution, and speciation (10). The giraffe has many fewer chromosomes (2n = 30) than the putative ancestral karyotype of even-toed ungulates (2n = 58 to 60), suggesting the occurrence of multiple chromosome fusion events in its evolution (11). Using the genomes of cattle, goat, giraffe, and okapi, with sperm whale as outgroup, we reconstructed the ancestral karyotype of the Giraffidae and Bovidae families (2n = 60) (table S12), which corresponds to the ancestor of the Pecora suborder (11). The results indicate that just three fissions and three fusions occurred in the cattle lineage since the pecoran ancestor. Hence, most of the ancestral chromosome structure is retained in cattle, including the complement of 30 chromosomes. In contrast, a minimum of four fissions and 17 fusions occurred between the pecoran ancestor and the giraffe, resulting in a substantial decrease (to 15) in haploid chromosome number (Fig. 1). The functional significance, if any, of such prolific chromosome fusions in giraffes requires further research.

(A) The figure displays the distribution of ancestral chromosome segments in cattle and giraffe genomes, including interchromosome rearrangements and fission and fusion events in cattle and giraffe. Blue asterisks in the cattle chromosome diagram indicate chromosome fission events in cattle. Blue asterisks in the giraffe chromosome diagram indicate sites of chromosome rearrangements. (B) Circos plot showing syntenic relationships of chromosomes between giraffe (left) and cattle (right). Chromosomes are colored on the basis of the cattle homologies. (C) Two types of collinear relationship between giraffe and cattle. The top and bottom horizontal lines represent giraffe and cattle chromosomes, respectively, and the lines between them link the alignment blocks.

We next evaluated the adaptive divergence between giraffe and other mammals in coding regions, using both the branch and the branch-site models implemented in PAML (12). We detected 101 positively selected genes (PSGs) and 359 rapidly evolving genes (REGs) in the giraffe (P < 0.05 according to χ² tests in both cases) (fig. S7 and tables S13 and S14) (13). This is a large increase compared to those found in the previous giraffe genome study, which identified 17 PSGs and 53 genes with adaptive divergence (high divergence compared with other mammals or unique substitutions) in giraffe (7). Notably, while 7 of the 17 PSGs from the previous study overlapped with our findings, the remaining 10 PSGs showed no positive selection signal in our analyses, which is primarily caused by the inclusion of many more ruminant branches as background. We show two examples of how the inclusion of a larger background panel or better genome quality refined our ability to identify giraffe-specific selection signals (fig. S8). Similarly, only 15 of the 53 previously identified adaptive divergence genes (7) were identified as PSGs or REGs in our analysis. Together, the improved genome assembly (better genome completeness, accuracy, and annotation) and higher number of accessible ruminant reference genomes allow us to substantially decrease both false positive and false negative signals of genes undergoing adaptive evolution in the giraffe. A Gene Ontology (GO) enrichment analysis showed that the 460 PSGs and REGs identified in the present study are primarily related to growth and development, nervous and visual systems, circadian rhythm, and blood pressure regulation (table S15). The KEGG (Kyoto Encyclopedia of Genes and Genomes) pathway-based analysis suggested that the rapidly evolving pathways in giraffe compared to okapi are related to metabolic, circulatory, and immune systems (table S16).
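The χ² tests mentioned above are typically likelihood ratio tests between nested codon models. The sketch below illustrates the general idea only: it assumes one degree of freedom between the null and alternative models, and the log-likelihood values are hypothetical rather than taken from the study. The function name is likewise illustrative.

```python
import math


def lrt_pvalue_df1(lnl_null: float, lnl_alt: float) -> float:
    """P value of a likelihood ratio test with one degree of freedom.

    The statistic 2 * (lnL_alt - lnL_null) is compared with a chi-squared
    distribution with df = 1, whose upper tail equals erfc(sqrt(x / 2)).
    """
    stat = max(2.0 * (lnl_alt - lnl_null), 0.0)  # guard against tiny negative values
    return math.erfc(math.sqrt(stat / 2.0))


# Hypothetical log-likelihoods for one gene under the null and alternative models.
p = lrt_pvalue_df1(lnl_null=-10234.6, lnl_alt=-10230.1)
print(f"LRT statistic = {2 * (-10230.1 + 10234.6):.2f}, P = {p:.4f}")
print("significant at P < 0.05" if p < 0.05 else "not significant at P < 0.05")
```

A gene would be counted as a candidate under this scheme only if the alternative model fits significantly better than the null; any multiple-testing handling used in the study is not shown here.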

The giraffe fibroblast growth factor (FGF) receptor-like protein 1 (FGFRL1) gene has previously been identified as one of the most conspicuous targets of selection in the giraffe (7). FGFRL1 contains a cluster of seven nonsynonymous mutations in its key FGF binding domain when compared against sequences in other ruminants and outgroup mammals (fig. S9). Using our substantially expanded set of background genomes, we confirmed that these mutations are unique to the giraffe and that FGFRL1 contains more unique substitutions than any other giraffe gene (table S17). Mutations in FGFRL1 cause severe cardiovascular and skeletal defects in humans and mice (14, 15), and hence we follow Agaba et al. (7) in hypothesizing that FGFRL1 may be associated with the extreme cardiovascular and skeletal adaptations in the giraffe. To investigate the in vivo consequences of these substitutions, we introduced these seven mutations into the FGFRL1 gene of mice to obtain giraffe-type FGFRL1 mice, using CRISPR-Cas9 technology (fig. S10). In contrast to mice with targeted complete deletion of the gene (14), which die with multiple congenital malformations, giraffe-type FGFRL1 mice were viable and fertile.

Cardiovascular hemodynamics in the giraffe are characterized by exceptionally high blood pressure without related organ damage, in contrast to the typical detrimental effects of hypertension observed in other animals/humans (3). FGFRL1 is known to be involved in the cardiovascular system (14), and we hypothesized that some vascular adaptations in the giraffe may only be apparent in a hypertensive physiological setting. To test this, we induced high blood pressure in wild-type (WT) and mutant FGFRL1 mice. The mice were infused with angiotensin II (Ang II; 900 ng/kg per minute; fig. S11), which induces hypertension by vasoconstriction and sodium retention. Our giraffe-type FGFRL1 mice showed no signs of congenital heart defects (fig. S12) or any obvious alterations in heart rate compared to WT controls (fig. S13). Although the basal blood pressure was slightly higher in giraffe-type than in WT FGFRL1 mice, no significant difference was observed (fig. S13). After Ang II infusion for 28 days, the average systolic and diastolic blood pressure in WT controls were significantly increased to 158.97 ± 5.01 and 94.54 ± 8.60 mmHg (Fig. 2A), respectively, confirming that hypertension was successfully induced in them. Unexpectedly, the Ang II-induced hypertension was absent in giraffe-type FGFRL1 mice, which showed average systolic and diastolic pressures of 125.30 ± 5.97 and 83.43 ± 11.77 mmHg, respectively, after Ang II infusion for 28 days (Fig. 2A), not significantly different from giraffe-type FGFRL1 controls. Moreover, significantly less myocardial and renal fibrosis was observed in giraffe-type FGFRL1 mice, as manifested by significantly lower proportion of collagen fibers in their heart and kidney than in WT controls, which obviously resulted from the absence of Ang II-induced hypertension in giraffe-type FGFRL1 mice (Fig. 2B and fig. S14). In addition, the impaired heart function caused by hypertension in Ang II-treated WT mice was also significantly alleviated in giraffe-type FGFRL1 mice, as evidenced by improved left ventricular ejection fractions (LVEFs) and fractional shortening (LVFS) (fig. S15). Our findings collectively suggest that the giraffe-type FGFRL1 has little effect on cardiac development but can prevent Ang II-induced hypertension and thus avoid or at least alleviate a range of detrimental effects of hypertension. In addition, our molecular dynamics (MD) simulations suggested that the unique variants in giraffe-type FGFRL1 could affect its binding affinity with FGF ligands (fig. S16), potentially interfering with their cross-talk with the renin-angiotensin-aldosterone system to modulate blood pressure and providing a possible mechanism by which giraffe-type FGFRL1 modulates blood pressure (16). Despite the differences in cardiovascular structure and physiology between mice and humans and the possibility that other genes may have contributed to the observed systemic co-adaptation to hypertension, it is intriguing to speculate that FGFRL1 might hold promise as a therapeutic target for prevention or treatment of hypertension or cardiovascular diseases in humans. Nevertheless, we acknowledge that this perspective is tentative and awaits a thorough investigation of the mechanisms behind the observed cardiovascular effect of giraffe-type FGFRL1.

(A) Giraffe-type FGFRL1 mice showed significantly lower systolic, diastolic, and mean arterial pressures (mmHg) than WT FGFRL1 mice after Ang II infusion for 28 days. *P < 0.05, **P < 0.01, ***P < 0.001, one-way ANOVA followed by Tukey's post hoc test. (B) Giraffe-type FGFRL1 mice had significantly lower proportions of fibrotic areas in heart than WT FGFRL1 mice after 28 days of Ang II infusion. ***P < 0.001, one-way ANOVA followed by Tukey's post hoc test. Error bars indicate SD. (C) Whole-mount skeletons of P0 mice showed hypoplasia of skeletal elements in giraffe-type FGFRL1 mice. **P < 0.01 by t test. (D) Adult giraffe-type FGFRL1 mice show no discernible body size and skeletal phenotype difference to WT mice. (E) Giraffe-type FGFRL1 mice showed significantly higher BMD, BV/TV, and average trabeculae thickness than WT mice. *P < 0.05, **P < 0.01 by t test. Photo credit: Jianbo Gao, The Fourth Military Medical University.

In addition to the observed cardiovascular effect, we noticed that postnatal day 0 (P0) giraffe-type FGFRL1 mice showed prenatal hypoplasia of skeletal elements, with a smaller body size, delayed craniofacial development, shortened axial/appendicular skeletons, and smaller vertebral lengths than the P0 WT mice (Fig. 2C and fig. S17). In contrast, adult giraffe-type FGFRL1 mice (24 to 26 g, 16 weeks) showed no discernible skeletal phenotype compared with WT mice or any significant deviation in body size and weight, limb length, or vertebral height (Fig. 2D and fig. S18). This suggests that mutations in this gene are not in themselves sufficient for neck elongation in the giraffe, refuting a previous hypothesis (7), although again we must recognize the limitations of introducing a gene into a different genetic background. However, it also shows that giraffe-type FGFRL1-associated postnatal bone growth can compensate for the observed prenatal effects such that FGFRL1 may play an indirect role in the exceptional bone growth of giraffe, e.g., by accelerating bone formation to maintain bone mineral density (BMD), as in humans (17). Therefore, we next examined bone ultrastructure by micro-computed tomography (microCT). Giraffe-type FGFRL1 mice achieved significantly higher BMD, bone volume/total volume (BV/TV) ratio, and average trabeculae thickness in vertebrae (C3) and distal femur (Fig. 2E and fig. S19). Skeletal growth rate tends to be inversely related to bone strength in animals (18), but despite having the highest skeletal growth rate among mammals, giraffes maintain normal BMD (19). In summary, we find indications for a pleiotropic adaptive effect of the highly unique giraffe-type FGFRL1 by not only significantly enhancing hypertension resistance but also achieving normal bone strength, despite the accelerated rate of bone growth in the giraffe.

Previous anatomical and physiological analyses suggest that multiple giraffe organs are involved in associated adaptations of the cardiovascular system, including hypertrophy of the left ventricle and interventricular walls (3), thickening of blood vessels of the lower extremities, and low glomerular filtration rates (20). Our results revealed that several pathways involving tissues influenced by high blood pressure, such as blood vessels, heart, and kidney, were significantly diverged between giraffe and other ruminants (table S16). The platelet activation pathway plays an important role in hypertension-associated thrombosis (Fig. 3A) (21). Three REGs (COL1A2, LYN, and PLCB1) and a number of genes with giraffe-specific amino acid variations are involved in the two major platelet activation, shape change, and platelet aggregation paths. A further set of PSGs and REGs that participate in phosphatidylinositol metabolism (PIP4K2A, ISYNA1, MTMR3, CDS1, and INPP1) may also be involved in the regulation of platelet activation (22). Another giraffe-divergent pathway is the adrenergic signaling pathway in cardiomyocytes, which is related to cardiac contraction and possibly the morphological remodeling of the giraffe heart (Fig. 3B) (23). Highly divergent genes in this pathway are mainly involved in ion transport (SCN7A, SLC9A1, ATP1A4, and CACNA2D4), which is important for myocardial function (24). We also found strong signals of adaptation in two major adrenergic receptors (ADRA1A and ADRA2B), as previously reported (7). Although ADRA2B is mainly expressed in the nervous system, both of these receptors are strongly related to blood pressure regulation (25, 26). Last, we detected strong giraffe-specific divergence in genes related to the proximal tubule bicarbonate reclamation and endocrine and other factor-regulated calcium reabsorption pathways. 
Changes in these pathways may reduce the pressure gradient across membranes in the giraffe kidney and protect it from hypertensive damage (Fig. 3C) (20). The REG AQP1 encodes a water-transporting protein in cell membranes of kidney proximal tubules and is involved in kidney development and injury responses (27). Two REGs (PLCB1 and ATP1A4) that are reportedly involved in hypertension or related organ damage participate in more than one of the mentioned pathways (28, 29), in accordance with expectations of co-adaptation of the blood vessel-heart-kidney axis in giraffe. In addition to genes in the mentioned pathways, we also detected other PSGs and REGs that may help to avoid hypertensive damage, including ANGPTL1, which is associated with the integrity of vascular endothelium (30), and TGFB1, which is strongly implicated in multiorgan fibrosis associated with hypertension (31). The finding of multiple genes involved in several phenotypic traits that share evolutionary constraints due to the extreme stature of the giraffe suggests that pleiotropy may play an important role in evolving such an extreme body plan.

(A) Modifications of genes in the platelet activation pathway may help to prevent damage to giraffe blood vessels. (B) Genes in the adrenergic signaling in cardiomyocytes that show high divergence in giraffe. (C) The proximal tubule bicarbonate reclamation (top) and endocrine and other factor-regulated calcium reabsorption (bottom) pathways may help to prevent kidney damage.

For herbivorous ungulates subject to predation, vigilance is crucial for survival and has two components: gathering information and instigating muscular action after signal transduction through the nervous system (32). Giraffes are thought to have a distinctive retinal cone topography that provides the best visual acuity in the Artiodactyla, which, together with the elevated head, enhances the capacity for horizon scanning (33). Accordingly, we found not only a number of PSGs and REGs that contribute to eye development and vision but also a number of genes that are related to Usher syndrome in humans (CDH23, PCDH15, USH2A, NINL, and UBR3), which affects vision, hearing, and balance (34), suggesting a related suite of sensory co-adaptations in giraffe (Fig. 4A). Similar to all other ruminants, we found only two opsin genes; thus, we could not verify that giraffes see color, at least not trichromatic color as has been hypothesized before (35). We found indications that the sense of smell in the giraffe may be degenerated. Compared to okapi, giraffe has lost at least 53 olfactory-related genes, including 50 encoding olfactory receptors, two encoding vomeronasal receptors, and one encoding an odorant binding protein (table S18). Further analysis shows that most of these olfactory receptors are spatially clustered and were lost because of a segmental deletion (Fig. 4B and figs. S20 and S21). Moreover, the contracted gene families in giraffe were also enriched in olfactory receptor activity (fig. S22 and tables S19 and S20). This may be an evolutionary consequence of enhanced vision, consistent with the hypothesized trade-off in sensory acuity found in many taxa (36) and/or with reduction in competition for food with other browsers.

(A) PSGs and REGs associated with the giraffe's visual, auditory, and balance systems. (B) Giraffes have lost several olfactory receptors (for example, on chromosome 10 of goat) compared to okapi. The location of genes on the goat chromosome is shown in the rectangle, and the collinear relationships of giraffe-goat and okapi-goat are shown in the top and bottom panels, respectively. (C) Genetic changes involved in light-mediated regulation of the molecular clock in giraffe suprachiasmatic nucleus (SCN) neurons.

Moreover, the extreme morphology of the giraffe increases its vulnerability when asleep by increasing the time required to become upright. Expectedly, therefore, given their needs for vigilance and high food intakes, giraffe sleep durations are among the lowest recorded (37). Concordantly, we found evidence of rapid evolution of PER1 in giraffe, a period family gene critical for the maintenance of circadian rhythm (38) and the emergence of a stop codon in the first exon of PER2 (Fig. 4C and fig. S23), possibly altering the transcript of this important circadian rhythm gene. HCRT, which plays a role in the regulation of sleep and arousal (39), also shows accelerated evolution in giraffe. Together, there is evidence that adaptive modifications of circadian rhythm and sleep arousal systems in giraffe have promoted short and fragmented sleep patterns. Overall, the comparative genomic analysis highlights that the unique stature of the giraffe has led to a series of necessary behavioral co-adaptations.

Procedures applied in sample collection and animal experiments were reviewed and approved by the Institutional Ethics Committee of the Northwestern Polytechnical University and Fourth Military Medical University. Fresh blood samples of a male Rothschild's giraffe used for genome sequencing were acquired during a routine physical examination at the Guangzhou Zoo in China. High-quality genomic DNA was extracted using a Qiagen DNA purification kit, then used to construct libraries, and sequenced with Illumina HiSeq and Oxford Nanopore GridION platforms. After filtering, 199.64 and 140.56 Gb of data, respectively, were obtained from these platforms. In addition, lymphocytes collected from the same blood sample were used for Hi-C library construction, and 138.71 Gb of data were obtained using the Illumina HiSeq X Ten platform.

Contigs were assembled by Wtdbg software (v1.2.8) (41), and the assembled contig-level genomes were polished by Racon (v1.2.1) (42) and Pilon (v1.22) (43). Last, the contigs were anchored into chromosomes by Hi-C sequencing reads through the Juicer (version 1.5) (44) and 3D-DNA (version 180922) (45) software workflow. To further improve the chromosome-scale assembly, it was subjected to manual review and refinement using Juicebox Assembly Tools (https://github.com/theaidenlab/juicebox). Last, genome quality was estimated with BUSCO (version 3.0.2) (46), whole-genome synteny with cattle (UMD3.1) genome, and k-mer analysis and by mapping back the initial reads to the assembly.

Given the good synteny with the cattle genome (Fig. 1B), we assigned chromosome numbers to our assembly as indicated by previous research (11). Our assembly agrees with the previously reported giraffe karyotype: 13 biarmed autosomal pairs and an acrocentric autosomal pair plus the sex chromosomes (47). We then mapped both the Nanopore reads and the Illumina reads used for the assembly back onto it. More than 98% of the Nanopore raw reads mapped properly to the assembly with an average depth of 54×, and 99.99% of the genome has a read depth of more than 50×, with chromosome X excluded (fig. S5 and table S4). Furthermore, 97.14% of the Illumina reads mapped properly to the genome with an average depth of 79× (fig. S5 and table S5). Last, the assembly also recovered 96.15% of the expected single-copy orthologous genes according to BUSCO analysis (table S6), the highest coverage yet among the reported Giraffidae genomes (table S7).

Tandem repeats were predicted by Tandem Repeats Finder software (v4.04) (48). RepeatMasker (open-4.0.7) (49), RepeatModeler (v1.0.8) (49), and RepeatProteinMask (v1.0.8) were used together to predict transposable elements. Gene structures were determined by combining ab initio and homology methods. For ab initio annotation, we used Augustus (v3.2.1) (50) and GENSCAN (v1.0) (51) to analyze the repeat-masked genome. For homology-based annotation, protein sequences of cattle (B. taurus; Ensembl release 87), sheep (Ovis aries; Ensembl release 87), and human (Homo sapiens; Ensembl release 87) genomes were aligned to giraffe sequences using BLAST software (v2.3.0) (52) and GeneWise (v2.4.1) (53). Then, results from the three methods were integrated by EVidenceModeler software (v1.1.1) (54). To annotate the gene functions, the integrated gene set was aligned against public databases, including KEGG, Swiss-Prot, TrEMBL, COG, and NR with BLAST (v2.3.0) (52), and merged with annotations by InterProScan (v4.8) (55) software. The completeness of the annotation was estimated by comparison with reference genome annotations and BUSCO (version 3.0.2) (46). On the basis of homology and ab initio gene prediction, we annotated 21,580 protein-coding genes in the genome (fig. S6 and tables S8 to S11), with 96.81% completeness according to BUSCO analysis, suggesting that our annotation is also of high quality (table S6).

The complete mitochondrial cytochrome b (Cytb) gene (1140 base pairs) was used to investigate the phylogenetic status of our sample. In addition, previously published Cytb sequences of 160 giraffes and outgroups (okapi and pronghorn) were retrieved from the National Center for Biotechnology Information (NCBI) using the accession numbers provided by a previous study (56). These sequences were aligned with our data using ClustalW in MEGA7 (57) with default parameters and subsequently adjusted manually to maximize positional homology. Last, the remaining sequences were used to infer the phylogenetic tree using IQ-TREE (58) under parameters -nt AUTO -m MFP -bb 1000 -bnni -o Pronghorn. As a result, the specimen used for genome sequencing clustered with the giraffe subspecies Rothschild's giraffe (G. camelopardalis rothschildi) with high support (ultrafast bootstrap value = 93).

We reconstructed the ancestral karyotype of the Giraffidae and Bovidae families using the genomes of cattle, goat, giraffe, okapi, and sperm whale (as outgroup). With giraffe as the reference genome, we carried out pairwise alignments with each of the other species as target using LASTZ (v1.1) with parameters T=2 C=2 H=2000 Y=3400 L=6000 K=2200 --format=axt. Then, axtChain, chainMergeSort, chainPreNet, and chainNet were used to generate chain and net files as input for DESCHRAMBLER (59). Last, we identified 1502 conserved segments by DESCHRAMBLER at a 300-kb resolution and reconstructed 30 predicted ancestral chromosomes (2n = 60) with a total length of ~2.25 Gb.

To minimize effects of annotation, pseudogenes, and genome quality, we used conserved genome synteny methodology to establish a high-confidence orthologous gene set that included four nonruminants (human, dog, horse, and pig) and six ruminants (pronghorn, giraffe, okapi, forest musk deer, reindeer, and cattle). Using the goat genome sequence (ARS1) as a reference, we performed synteny alignment for these ten species with Last (version 894) (60) and generated pairwise whole-genome alignments with Multiz (version 11.2) (61) using the default parameters. A total of 13,776 genes were extracted from the synteny alignments. We used the Codeml program in the PAML package (version 4.8) (12) to estimate the lineage-specific evolutionary rate for each branch, with the phylogenetic tree taken from a previous ruminant study (9). First, the branch-site model was used for detecting PSGs. The giraffe lineage was specified as the foreground branch, and a likelihood ratio test (LRT) was conducted to examine whether the branch-site model containing positively selected codons (omega > 1) fits better than the null model, which only includes neutral evolution or negative selection (omega ≤ 1). The P values for model comparison were computed based on chi-square statistics. In addition, the potential positive selection of codon sites was assessed by their posterior probabilities calculated with the Bayes empirical Bayes (BEB) method. Genes with an LRT P < 0.05 and with sites having a posterior probability of positive selection over 0.95 from the BEB method were treated as PSGs. Then, the branch model was used for detecting REGs, with the same orthologous genes as above. We tested whether the foreground branch (the giraffe lineage) exhibited a significantly higher omega (regardless of whether it is greater than 1) than the background branches (the other lineages) using the LRT. Genes with an LRT P < 0.05 were treated as REGs in giraffe. 
The combined set of PSGs and REGs was subjected to KEGG and GO enrichment analysis (P < 0.05) with the online tool Metascape (v1.0) (62).
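
The decision rule for PSGs described above can be illustrated with a minimal sketch. It assumes the LRT statistic is compared against a chi-square distribution with one degree of freedom (a common convention that the text does not state explicitly), and the function names and log-likelihood inputs are hypothetical placeholders rather than Codeml output parsing.

```python
import math

def lrt_p_value(lnl_null: float, lnl_alt: float) -> float:
    """P value of a likelihood ratio test, assuming 1 degree of freedom."""
    stat = 2.0 * (lnl_alt - lnl_null)
    if stat <= 0:
        return 1.0
    # Chi-square survival function for df = 1: P(X > x) = erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(stat / 2.0))

def is_psg(lnl_null, lnl_alt, beb_posteriors, alpha=0.05, beb_cutoff=0.95):
    """Apply the two criteria from the text: LRT P < 0.05 and at least one
    site whose BEB posterior probability of positive selection exceeds 0.95."""
    p = lrt_p_value(lnl_null, lnl_alt)
    return p < alpha and any(pp > beb_cutoff for pp in beb_posteriors)

# Hypothetical log-likelihoods and per-site BEB posteriors for one gene
print(is_psg(-1000.0, -995.0, [0.50, 0.97]))
```

The same `lrt_p_value` helper would apply to the branch-model comparison used for REGs, where only the P < 0.05 criterion is needed.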

We used 12 species to construct gene families: human, horse, dog, pig, killer whale, camel, pronghorn, giraffe, okapi, white-lipped deer, forest musk deer, and cattle. Proteins with premature stop codons, nontriplet codon lengths, or fewer than 30 amino acids were removed. We then used OrthoMCL (v2.0.9) (63) for protein clustering with a dataset of 256,596 protein sequences. Family expansion or contraction analysis was performed by CAFE (v3.1) (64), and the phylogenetic tree was taken from the previous ruminant study (9). Gene expansion and contraction results for each branch of the phylogenetic tree were estimated, and enrichment analysis of the gene families expanded or contracted in giraffe was performed with KOBAS (v3.0) (65).
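
As a minimal sketch of the sequence filter described above, the following function checks a coding sequence against the three removal criteria. It assumes the standard stop codons and that a single terminal stop codon is allowed; the function name and inputs are illustrative, not part of the original pipeline.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # assumes the standard genetic code

def keep_cds(cds: str, min_aa: int = 30) -> bool:
    """Return True if a coding sequence passes all three filters."""
    seq = cds.upper()
    # Filter 1: length must be a multiple of three (no nontriplet lengths)
    if len(seq) % 3 != 0:
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    # A terminal stop codon is expected, so drop it before further checks
    if codons and codons[-1] in STOP_CODONS:
        codons = codons[:-1]
    # Filter 2: no premature (internal) stop codons
    if any(codon in STOP_CODONS for codon in codons):
        return False
    # Filter 3: at least min_aa amino acids
    return len(codons) >= min_aa

print(keep_cds("ATG" + "GCT" * 30 + "TAA"))
```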

For each KEGG pathway with more than 20 genes, we counted the numbers of nonsynonymous and synonymous mutations between giraffe and the most recent common ancestor (MRCA) (Nh and Sh, respectively). We also counted the numbers of nonsynonymous and synonymous mutations between okapi and the MRCA (Nt and St, respectively). We formulated a null hypothesis that the probabilities of nonsynonymous mutations compared to synonymous mutations in giraffe and okapi are similar and then applied a one-sided binomial test to identify rapidly evolving pathways with significantly more nonsynonymous mutations than expected. The binomial test included three parameters for each KEGG pathway: the number of successes (Nh), the number of trials (Nh + Sh), and the hypothetical probability of success, given by

$$p = \frac{N_t \cdot \mathrm{All}_{N_h} / \mathrm{All}_{N_t}}{N_t \cdot \mathrm{All}_{N_h} / \mathrm{All}_{N_t} + S_t \cdot \mathrm{All}_{S_h} / \mathrm{All}_{S_t}},$$

where the prefix All denotes the genome-wide value of the corresponding count. Last, the rapidly evolving KEGG pathways were identified using a threshold of P < 0.05 (one-sided binomial test).
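
The pathway test can be sketched as follows, reading the expected probability of success as (Nt × All_Nh / All_Nt) divided by (Nt × All_Nh / All_Nt + St × All_Sh / All_St). The function names and the example counts are hypothetical, and the exact tail sum below is only practical for modest counts; a library routine such as `scipy.stats.binomtest` with `alternative="greater"` would be more robust for large ones.

```python
from math import comb

def expected_nonsyn_prob(nt, st, all_nh, all_nt, all_sh, all_st):
    """Null probability that a giraffe-branch mutation is nonsynonymous,
    scaling the okapi pathway counts by genome-wide giraffe/okapi ratios."""
    n_scaled = nt * all_nh / all_nt
    s_scaled = st * all_sh / all_st
    return n_scaled / (n_scaled + s_scaled)

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p), i.e. a one-sided 'greater' test."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k, n + 1))

def pathway_p_value(nh, sh, nt, st, all_nh, all_nt, all_sh, all_st):
    """One-sided binomial P value with Nh successes in Nh + Sh trials."""
    p = expected_nonsyn_prob(nt, st, all_nh, all_nt, all_sh, all_st)
    return binomial_upper_tail(nh, nh + sh, p)

# Hypothetical counts for a single pathway and for the whole genome
print(pathway_p_value(30, 30, 10, 30, 1000, 1000, 3000, 3000))
```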

Through analysis of conserved genome synteny with goat, we obtained a highly confident set of orthologous genes of mammals (including mouse, human, cat, cheetah, dog, rhinoceros, horse, camel, pig, dolphin, killer whale, and sperm whale) and 51 ruminant species. Domain regions of the encoded proteins were predicted with Pfam (67). Then, we scanned the domain regions in the syntenic alignments and identified the giraffe-specific amino acid substitutions compared to all other species. The substitutions that were not fixed in all published giraffe genomes were further filtered. Last, we identified 414 giraffe genes that have unique substitutions in domain regions, of which 33 genes have more than three unique substitutions (table S17).

The Illumina short reads of giraffe and okapi were mapped onto the cattle genome (UMD3.1). For every gene, the read depth was counted with SAMtools (68) along the coding sequence (CDS). If more than 50% of the CDS sites had no mapped reads in giraffe, while more than 50% of the sites were covered by more than 10 reads in okapi, the gene was assumed to be specifically lost in giraffe. By this criterion, giraffe uniquely lost 83 genes compared to okapi. To rule out sequencing artifacts specific to one dataset and to validate the result, we repeated the same analysis with previously published genomic short reads of another giraffe (9), which showed that giraffe uniquely lost 78 genes, 63 of which overlapped with the first result. In contrast, okapi uniquely lost only 13 genes under the same analysis. We noticed that 53 of the 63 genes lost in giraffe were related to the sense of smell and that they were spatially clustered on chromosomes 10 and 15 of the cattle genome (table S18). Furthermore, to validate the result at the genome level and to avoid any bias from using the cattle genome, we checked the synteny alignments of giraffe-goat (ARS1) and okapi-goat (ARS1), which again confirmed that giraffe lost more olfactory receptor genes on chromosomes 10 and 15 of goat (Fig. 4B and fig. S20). We further checked the deletions on giraffe chromosome 7 (chromosome 10 of goat) with the long Nanopore reads mapped back to the giraffe and goat genomes; the deletion region is cleanly spanned by reads in giraffe (fig. S21).
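
The depth-based loss criterion can be expressed compactly. The sketch below assumes per-site depths for one gene's CDS are already available as lists of integers (for example, parsed from SAMtools depth output); the function names and example values are hypothetical.

```python
def fraction(depths, predicate):
    """Fraction of CDS sites whose read depth satisfies the predicate."""
    if not depths:
        raise ValueError("empty CDS depth list")
    return sum(1 for d in depths if predicate(d)) / len(depths)

def lost_in_giraffe(giraffe_depths, okapi_depths):
    """Apply the rule from the text: more than 50% of CDS sites unmapped in
    giraffe, and more than 50% of sites covered by more than 10 reads in okapi."""
    unmapped_giraffe = fraction(giraffe_depths, lambda d: d == 0)
    covered_okapi = fraction(okapi_depths, lambda d: d > 10)
    return unmapped_giraffe > 0.5 and covered_okapi > 0.5

# Hypothetical per-site depths along a four-site CDS
print(lost_in_giraffe([0, 0, 0, 5], [12, 15, 20, 3]))
```

Requiring good coverage in okapi guards against calling a gene lost when the region is simply hard to map in both species.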

Because the 3D structure of the complex of FGFR1 (a major FGF receptor thought to be involved in FGFRL1 signaling) and FGF23 (the ligand) has been revealed (69), we built an in silico 3D structure model of the giraffe FGFRL1 (from the IG-II to the IG-III domain) by homology modeling and docked the model with FGF23 to assess possible effects of the mutations on the protein interaction. The 3D structure models of mtFGFRL1 (seven sites in giraffe type) and WT FGFRL1 (seven sites changed to common type) were generated separately with homology modeling methods by PROMALS3D (70) with several FGFR structures as templates [Protein Data Bank (PDB) nos. 1E0O, 1EV2, 1II4, 1IIL, 1NUN, 1RY7, 2FDB, 3GRW, 3OJ2, 3OJV, 4J23, and 5W59]. The FGF23 structure was obtained from PDB no. 5W21. The structure of the FGFRL1-FGF23 complexes was produced by the Rosetta (71) protein docking program, using the docking conformation of FGFR1 and FGF23 in PDB no. 5W21 as the initial docking pose.

MD simulations were performed using the Amber (version 18) software (72) in combination with the ff14SB (version 1.0) force field (73). Protein systems were solvated in the TIP3P water model with an edge distance of 12 Å, and systems were neutralized (pH 7) by adding suitable counterions (Na+ or Cl-). Before performing MD simulations, each system was minimized by means of the steepest descent and conjugate gradient methods through 2000 steps. NPT (constant number of atoms, pressure, and temperature) simulations were then carried out to heat the system from 0 to 300 K using Langevin dynamics for temperature control and the SHAKE algorithm to constrain hydrogen atoms. MD simulations were run for 100 ns with the time step set to 2 fs. Last, determination of the relative binding free energy was performed using the molecular mechanics generalized Born surface area method in the Amber package (version 18) (72).

To elucidate the giraffe-type FGFRL1 gene's role in skeletogenesis and the cardiovascular system, the seven unique substitutions in giraffe-type FGFRL1 were introduced into the FGFRL1 gene in mice (giraffe-type FGFRL1 mice) by CRISPR-Cas9-mediated genome editing as follows. First, single-guide RNA (sgRNA) expression constructs were prepared, based on the pUC57-sgRNA expression vector (no. 51132; Addgene), using oligonucleotide sequences listed in table S21. Next, the sgRNA expression plasmids were linearized and prepared as templates for in vitro transcription using a MEGAshortscript kit (Ambion, AM1354). The sgRNA was purified using a MEGAclear kit (Ambion, AM1908). Fertilized eggs were injected with a mixture of Cas9 protein, sgRNAs, and homologous DNA template. Genomic DNA was then extracted from the tails of 7-day-old mice (new pups) using phenol-chloroform and recovered by alcohol precipitation to detect the mutations. Polymerase chain reaction primers for targeting sites are listed in table S22. Last, mice with the expected mutations were mated with WT mice to obtain enough heterozygous mutant mice, and homozygous mutant mice were then produced by crossing and prepared for subsequent experiments.

Neonates (P0) were subjected to whole-mount skeletal staining. Briefly, both P0 WT FGFRL1 (n = 5) and giraffe-type FGFRL1 (n = 5) mice were fixed in 90% ethanol for 12 hours at 4°C. Next, specimens were transferred into acetone for 12 hours at room temperature and then into a cartilage staining solution containing 0.03% Alcian blue (w/v; Sigma-Aldrich, USA), 80% ethanol, and 20% acetic acid overnight. The samples were washed with 20% acetic acid, and the ossified tissues were stained in a solution with 0.005% Alizarin red (w/v) overnight at 4°C. The specimens were transferred into 1% KOH (w/v) until the muscle tissue was transparent and then stored in 50% glycerol solution containing 1% KOH (w/v). Whole skeleton images were then obtained with an M205 FA stereoscopic microscope (Leica, Germany), and ImageJ software (version 1.46; National Institutes of Health, USA) was used to obtain the following measurements: head length (distance from the frontal tip of the maxilla to the caudal tip of the occipital bone in lateral view), spine length (distance from the annular vertebrae to the tail root), and length of limbs (distance between the two tips of limbs). The measurements were repeated three times for each sample, and average values were obtained.

Adult (16 weeks, 24 to 26 g) WT FGFRL1 mice (n = 8) and giraffe-type FGFRL1 mice (n = 8) were randomly selected and anesthetized by intraperitoneal 1% (w/v) sodium pentobarbital solution (40 mg/kg). The body weight and length (from nose to tail root) of each mouse was measured. Then, x-ray images of the head, lumbar vertebra, and limbs of both sets of mice (n = 3) were acquired using a SkyScan 1276 high-resolution in vivo x-ray microtomography (Bruker, Germany). Digital images were obtained under identical imaging conditions using the same acquisition parameters, and ImageJ software was used to obtain the following measurements: head length (as defined above), height of the L1 lumbar vertebra (distance between the upper and lower endplates of the vertebral body), and length of limbs (as defined above). The measurements were repeated three times for each sample, and average values were obtained.

After x-ray imaging, the adult WT FGFRL1 mice (n = 10) and giraffe-type FGFRL1 mice (n = 10) were sacrificed by an intraperitoneal pentobarbital (Sigma-Aldrich, USA) overdose. The skeleton of each mouse was harvested and fixed in 4.0% formalin. The formalin-fixed femurs and cervical vertebrae were scanned, reconstructed, and analyzed using a GE-LSP industrial microCT system (GE Healthcare, Chicago, IL, USA) with the following parameters: 80 kV, 80 μA, and 3.0-s exposure time per projection. The BMD, average trabeculae thickness, and BV/TV of the distal femur (n = 6) and C3 vertebra (n = 10) were measured. In addition, the maximum transverse diameter, average thickness of cortical bone, and both inner and outer perimeters of their femurs (at mid-diaphysis) were measured.

Hypertension was induced using Ang II (Sigma-Aldrich, USA) delivered using Alzet-1004 osmotic mini-pumps (Cupertino, CA). Briefly, WT FGFRL1 and giraffe-type FGFRL1 mice (16 weeks old, 24 to 26 g) were anesthetized with isoflurane (1% at 1.5 liters/min oxygen). A 1-cm incision was then made on the back, and an osmotic mini-pump containing Ang II (n = 10) or an equivalent volume of vehicle (saline, n = 10) was implanted. Ang II (900 ng/kg per minute) was infused at a rate of 10 μl/hour for 28 days. At the end of the infusion, the systolic, diastolic, and mean arterial blood pressures were measured using a tail-cuff sphygmomanometer. In addition, cardiac function was evaluated by echocardiography, and hypertension-related cardiac remodeling was examined histologically.

Blood pressure was measured using a BP2010A intelligent noninvasive sphygmomanometer for mice (Softron, Japan), which was calibrated and validated before recording. The reliability of tail-cuff determination of mouse blood pressure was independently validated by radiotelemetry before making critical assessments in mice. Before measurement, mice were acclimated to a restraint box and tail-cuff inflation in a quiet area at a controlled temperature (22 ± 2°C) for 5 days. On the day of testing, mice typically remained relatively calm and still in the restrainer after the acclimation period. The tail cuff was positioned at the base of the tail, and a heating pad, supplied as an accessory for the tail-cuff sphygmomanometer, was preheated to 35°C. Blood pressure recordings were acquired after the mice had been prewarmed for 10 min. Briefly, the cuff was inflated to 250 mmHg and deflated over 20 s. Ten inflation and deflation cycles were included for each recording. The first three cycles were regarded as acclimation cycles and not included in the analysis. The highest and lowest values in the remaining seven cycles were discarded, and the remaining five readings were averaged to give a single session value for further analysis. Changes in tail volume were detected by the pressurized receptor when the blood returned to the tail during cuff deflation. Measurements were obtained for 3 consecutive days before the Ang II or control treatment to establish baseline blood pressure.
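
The per-session averaging rule described above (ten cycles, the first three dropped, then the highest and lowest of the remaining seven discarded) amounts to a simple trimmed mean. The sketch below uses a hypothetical function name and made-up readings in mmHg.

```python
def session_value(readings):
    """Reduce ten tail-cuff cycle readings to a single session value."""
    if len(readings) != 10:
        raise ValueError("expected exactly ten inflation/deflation cycles")
    # Cycles 1 to 3 are acclimation cycles and are excluded
    remaining = sorted(readings[3:])
    # Discard the lowest and highest of the remaining seven readings
    trimmed = remaining[1:-1]
    # Average the five readings that are left
    return sum(trimmed) / len(trimmed)

# Hypothetical systolic readings (mmHg) from one recording
cycles = [150, 148, 147, 120, 118, 122, 119, 121, 130, 110]
print(session_value(cycles))  # 120.0
```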

Transthoracic echocardiography was performed to evaluate cardiac function using a Vevo 2100 instrument (VisualSonics, Canada) equipped with an 18- to 38-MHz MS-400 imaging sensor. Briefly, the mice were anesthetized with 1% isoflurane via an anesthetic gas machine and maintained in a supine position with limbs fixed, and body temperature was kept stable with a heat pad, while respiration and heartbeat were continuously monitored. M-mode images were analyzed to obtain estimates of LVEF, LVFS, left ventricular posterior wall thickness at end diastole, and left ventricular internal diameter at end diastole. For this analysis, dedicated software (Vevo 2100 version 1.4, VisualSonics, Canada) was used.

At the end of Ang II infusion, the heart (n = 6) and kidney (n = 8) of each vehicle- and Ang II-treated mouse were harvested and fixed with 4.0% formalin (Sigma-Aldrich, USA). Histological sections, 5 μm thick, were prepared following standard fixation, clearing, dehydration, waxing, and paraffin-embedding procedures. Representative histological slides were used for histological staining, as follows.

Hematoxylin and eosin staining. The heart and kidney sections were processed by routine dewaxing in xylene followed by hydration with an ethanol concentration gradient. Thereafter, nuclei and cytoplasm in the sections were stained by hematoxylin and eosin (G1004; Servicebio, China), respectively. The sections were then dehydrated, cleared, and mounted. Staining was observed, and images were captured using a BX53+R6 light microscope (Olympus, Tokyo, Japan).

Masson trichrome staining. Heart and kidney tissues were subjected to Masson trichrome staining using a kit and protocols provided by the manufacturer (Sigma-Aldrich, USA). Heart and kidney fibrosis were then measured in terms of the proportion of collagen using ImageJ software. Three randomly selected regions of identical size in each heart or kidney slice were examined, and the average values obtained from them were recorded.

Sirius red staining. The heart sections were stained by incubation in a 0.1% (w/v) solution of Sirius red (G1018; Servicebio, China) in saturated aqueous picric acid for 1 hour. The slides were then washed, dehydrated, and mounted. This treatment stains collagen and noncollagen components red and orange, respectively. Heart fibrosis was measured in terms of the proportion of red-colored collagen using ImageJ software. Three randomly selected regions of identical size in each heart slice were examined, and the average values obtained from them were recorded.

Measurements of continuous variables were expressed as means ± SD. All statistical analysis was performed with SPSS software (version 19.0; Chicago, USA). Independent Student's t tests were used to compare baseline values of the WT FGFRL1 and giraffe-type FGFRL1 mouse groups. One-way analysis of variance (ANOVA) was used to compare mean values. If there was a significant overall difference among groups, then Tukey's post hoc test was used for multiple comparisons between groups. A value of P < 0.05 was considered statistically significant.

Acknowledgments: We thank D. Wu for providing the giraffe sample. Funding: This study is supported by the Talents Team Construction Fund of Northwestern Polytechnical University (NWPU) to Q.Q. and W.W., the National Program for Support of Top-notch Young Professionals to Q.Q., the Research Funds for Interdisciplinary Subject, NWPU (19SH030408) to Q.Q., the 1000 Talent Project of Shaanxi Province to Q.Q. and W.W., the National Natural Science Foundation of China (81972052, 81672148, and 81802143) to J.H., and the Independent Research Fund Denmark (8049-00098B) to R.H. Author contributions: Q.Q., J.H., W.W., and R.H. designed this project and research aspects. Zhipeng Li, H.S., G.L., and Q.L. performed sample collection, and D.C. performed sequencing library construction. C.L., L.C., Yuan Yuan, Y.Z., T.Q., M.H., B.Z., Chenglong Zhu, C.Z., and K.W. performed data analysis including genome assembly, annotation, gene family, gene loss, and chromosome evolution. J.G., L.M., and X.C. conducted the experiments for mice. Zihe Li and Y.X. built 3D modeling of proteins. L.Z., Zeshan Lin, Yuan Yin, and W.X. contributed to figure designing. C.L., Q.Q., and J.H. wrote the manuscript. W.W., R.H., and M.T.P.G. performed manuscript amending. Competing interests: J.H., Q.Q., J.G., C.L., and W.W. are inventors on a patent application related to this work filed by the Fourth Military Medical University and the Northwestern Polytechnical University (no. 2020110969712, filed on 14 October 2020). The authors declare no other competing interests. Data and materials availability: All sequencing data and assembled genome have been deposited on the NCBI database with accession ID PRJNA627604. All other data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

Originally posted here:
A towering genome: Experimentally validated adaptations to high blood pressure and extreme stature in the giraffe - Science Advances

USask Scientists Help Find the Key to Decoding Rye Genome – Seed World

Posted: at 2:56 am

An international team led by the IPK Leibniz Institute in Germany and including University of Saskatchewan (USask) researchers has succeeded in completely decoding the genome of rye, despite its large size and complexity.

Rye is a distinctly climate-resistant cereal plant that is of considerable importance for Germany and northeastern Europe. In Canada, most rye is grown in Saskatchewan and Manitoba.

At USask, the research team includes professor Curtis Pozniak, director of USask's Crop Development Centre and Ministry of Agriculture Strategic Research Program Chair in Durum and High-Yield Wheat Breeding and Genetics, plant molecular geneticist Andrew Sharpe, director of Genomics and Bioinformatics at USask's Global Institute for Food Security (GIFS), Sean Walkowiak (Pozniak's former research officer, now research scientist for Canadian Grain Commission), bioinformatics analyst Brook Byrns, and plant sciences emeritus professor Brian Fowler.

"Rye is one of the most cold-tolerant cereal crops and can survive the harshest winters typical of the Canadian Prairies," says Pozniak. "The genome sequence of rye points to important genes that could be used to enhance the cold tolerance of other important winter crops, including wheat."

The results published today in the journal Nature Genetics are promising for both science and breeding. Rye offers access to a diverse gene pool, not only for rye breeding but also for wheat breeding.

"The delivery of the rye genome represents the work of a large and dedicated group of partners across the world," says Sharpe. "These results are significant, as they provide a complete genome that is closely related to other grass crop species such as wheat and barley, thus allowing a deeper insight into the evolutionary relationships between them."

All the research data is available to the general public, meaning the extensive genetic diversity of rye can be systematically discovered and used by breeders in a more targeted approach.

"The comparatively low economic importance on a global scale, combined with the great complexity of the genome, interfered with rye getting into the focus of the international research community and thus its genome sequence has been revealed only recently," explains professor Nils Stein, lead of the research group Genomics of Genetic Resources at IPK and holder of a joint professorship at the University of Göttingen.

Rye shares a close and long evolutionary history with barley and wheat. However, its role as an important crop is much shorter. While barley and wheat were domesticated about 10,000 years ago in the so-called Fertile Crescent of the Near East, rye initially spread to Northern Europe as a weed growing in barley and wheat fields. Gradually, rye adopted the characteristics of its two big brothers before becoming a purely cultivated species 5,000-6,000 years ago.

There are important biological differences between rye and its two relatives: rye is fertilized through cross-pollination, thus individual genetic traits cannot be fixed as easily as in a self-fertile plant species, and the rye genome is highly complex, which is mainly due to the large number of highly repeated DNA segments.

Knowing the reference sequence makes it easier to transfer positive properties of rye, such as resistances, to wheat without negatively affecting baking properties, for example.

"For example, resistance genes from rye can be transferred to wheat through classical cross-breeding, which has already been used repeatedly in the past," says Stein. "So the significance of our research extends far beyond rye."

"The technical prerequisites for sequencing such a complex genome are available today," Stein emphasizes.

The research used homozygous seeds from the plant breeding company KWS SAAT SE & Co. KGaA.

"The new genome sequence of our inbred line Lo7 is a great technological achievement and an important step forward towards a more comprehensive genetic characterisation of this crop," says Andres Gordillo, lead of rye breeding at KWS.

"It will considerably enhance breeding progress and, therefore, the attractiveness of rye. More specifically, it will substantially improve our ability to link resistance traits observed in the field with their underlying genes and their location on the rye genome."

Parallel to the work of the international research team led by Stein, Chinese researchers created a reference sequence of a Chinese landrace.

"We worked very well with our Chinese colleagues, which ultimately brought great added value for rye breeding and research. We were able to use two different methods to study two very different rye varieties, of which the complete reference sequences are now available," says Stein. "With these two studies, rye has caught up with barley and wheat and is in the middle of the genome research era."

Source: University of Saskatchewan

Read the rest here:
USask Scientists Help Find the Key to Decoding Rye Genome - Seed World


The global genome sequencing market by revenue is expected to grow at a CAGR of over 9% during the period 2021-2026 – GlobeNewswire

Posted: at 2:56 am

New York, March 18, 2021 (GLOBE NEWSWIRE) -- Reportlinker.com announces the release of the report "Genome Sequencing Market - Global Outlook and Forecast 2021-2026" - https://www.reportlinker.com/p06036817/?utm_source=GNW

The global market is expected to grow due to the growing number of rare, terminal, and complex diseases, especially cancer. The constant increase in cancer cases is proportionately increasing the number of sequencing-based diagnostics and treatment options in the market. Single-cell sequencing technology enables advanced sequencing that helps map tumor cells. It is widely used in tumor research and has been significantly beneficial for developing new diagnostic and anti-tumor treatment methods. Single-cell analysis has become a standard application in both basic and translational research. The technology is also widely used in reproductive and embryonic medicine, where it can sequence and quantify the whole genome of germ cells and embryonic cells at the single-cell level, thereby helping researchers to understand the occurrence of germ cells.

The following factors are likely to contribute to the growth of the genome sequencing market during the forecast period: an increase in demand for single-cell sequencing; the introduction of portable genome sequencing devices; and the emergence of nanopore, a third-generation genome sequencing platform.

The study considers the genome sequencing market's present scenario and its market dynamics for the period 2020-2026. It covers a detailed overview of several market growth enablers, restraints, and trends. The report offers both the demand and supply aspects of the market. It profiles and examines leading companies and other prominent ones operating in the market.
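As a rough illustration of what the headline growth rate implies, the sketch below compounds a 9% annual rate over the forecast window. The 9% value is taken as a floor from the report's "over 9%" wording, and treating 2021-2026 as five compounding periods is an assumption, as is the hypothetical function name; the report itself does not publish this calculation.

```python
def compound_growth_factor(cagr: float, years: int) -> float:
    """Return the total growth multiple implied by a constant annual growth rate."""
    return (1.0 + cagr) ** years


# Assumption: 9% CAGR (the report says "over 9%") across five annual periods.
factor = compound_growth_factor(0.09, 5)
print(f"Implied revenue multiple: {factor:.3f}")
```

Under these assumptions, market revenue at the end of the period would be roughly one and a half times its starting value.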

Global Genome Sequencing Market Segmentation

The global genome sequencing market research report includes a detailed segmentation by product, application, end-user, and geography. The steady rise in the sale of high-end consumables in commercial laboratories, research institutes, academic institutes, and large pharma and biotech companies performing a high volume of sequencing-based processes is a significant factor responsible for the growth of consumables. In 2020, the consumables segment accounted for the largest share in the market with 81%. The recurring application of consumables to perform a wide range of sequencing-based studies and diagnostics is another critical factor for high sales of consumables. Moreover, increased preference for array-based genotyping consumables for a wide range of analysis, disease-related mutations, and genetic characteristics associated with cancer research is further expected to increase the demand for consumables during the forecast period. High innovations and the introduction of high-throughput advanced technologies are likely to drive the application of sequencing devices. These devices are capable of sequencing millions to billions of reads in a single run in less time.

New cancer cases are expected to reach 24 million by 2030, which is likely to augur well for oncology genome sequencing growth. As cancer prevalence is growing, the need for effective patient stratification is driving research efforts to identify biomarkers and develop companion diagnostics. Genome sequencing has opened new ways of studying cancer-related conditions. Cancer sequencing using next-generation sequencing (NGS) methods provides more information in less time compared to traditional single-gene and array-based approaches. Hence, NGS technology has the potential to change the future of oncology and deliver personalized medicine. NGS methods have revolutionized the diagnosis and treatment of acute myeloid leukemia (AML) with accurate testing, classification, and the ability to take advantage of precision medicine.

The presence of several research institutes and stand-alone genomic laboratories in the US, the UK, Germany, France, and China is a major factor responsible for the growth of genome sequencing devices. To develop personalized and effective new therapies that restore mobility, enhance the quality of life, and improve surgical outcomes for patients with multiple disorders, these centers perform extensive research on sequence structural levels of genomics. Hence, the increased focus on unraveling genetic components of common and complex diseases, including cancer diagnostics, neurological disorders, infectious diseases, and rare childhood disorders, influences the market.

Product: Consumables; Sequencers & Software
Application: Oncology; Reproductive Health; Complex Disease Research; Microbial Research; Others
End-user: Academic & Research Institutes; Pharma & Biotech Companies; Consumer Genomic Service Providers; Government & Commercial Laboratories; Others

INSIGHTS BY GEOGRAPHY

North America and Europe are the largest genome sequencing markets across the globe. They are the leading regions in increasing the usage of genome sequencing-based healthcare and diagnostics. The US is the largest revenue contributor to the North American market. The advanced healthcare infrastructure and the increased awareness have slowly increased the penetration of genome sequencing and cell and gene therapy technologies. Multiple initiatives for human genome projects in the US have improved the flow of patients seeking treatment for several terminal and genetic diseases. With advances in technology and the increased demand for personalized treatment, the US genomic sequencing market is poised for growth. The increased awareness among European patients drives the application of personal genome sequencing testing, especially for reproductive health. There is an increased number of consumer genomic service providers in the market.

Geography
North America: US, Canada
Europe: UK, Germany, France, Italy, Spain
APAC: China, India, Japan, South Korea, Australia
Latin America: Mexico, Brazil, Argentina
Middle East & Africa: Saudi Arabia, Turkey, South Africa, UAE

INSIGHTS BY VENDORS

Illumina, Thermo Fisher Scientific, F. Hoffmann-La Roche, BGI, Pacific Biosciences, and Oxford Nanopore Technology are the major vendors in the market. The market is competitive and is evolving with the introduction of new technologies. Several companies are developing or commercializing products, expanding their manufacturing facilities, or partnering with others in the market. For instance, in 2020, Illumina introduced software for whole-genome analysis to examine rare diseases. Similarly, Thermo Fisher Scientific has formed a strategic partnership with First Genetics JCS to promote NGS in Russia. Nanopore-based sequencing from Oxford Nanopore Technology and PacBio's SMRT technology-based sequencing are revolutionizing genome sequencing by reducing cost and increasing throughput, attracting end-users to shift from conventional Sanger methods to advanced methods.

Prominent Vendors: Illumina, Thermo Fisher Scientific, Oxford Nanopore Technology, Pacific Biosciences, F. Hoffmann-La Roche, BGI

Other Prominent Vendors: PerkinElmer, Siemens Healthineers, Qiagen, Macrogen, Myriad, Intrexon Bioinformatics, Biomatters, Cytiva, 10x Genomics, MGI Tech, New England Biolabs, DNASTAR, Beckman Coulter, VEROGEN, Bio-Rad

KEY QUESTIONS ANSWERED
1. What technological advances is the genome sequencing market observing?
2. What is the growth rate of the genome sequencing market during the forecast period?
3. How does the outbreak of the COVID-19 pandemic affect the genome sequencing market?
4. Which regions are likely to hold the largest revenue share during the forecast period?
5. Which end-user segment accounted for the largest market share in 2021?

Read the full report: https://www.reportlinker.com/p06036817/?utm_source=GNW

About Reportlinker

ReportLinker is an award-winning market research solution. Reportlinker finds and organizes the latest industry data so you get all the market research you need - instantly, in one place.


Originally posted here:
The global genome sequencing market by revenue is expected to grow at a CAGR of over 9% during the period 2021-2026 - GlobeNewswire


WANdisco grants industry leading LiveData Platform to fast-track high-volume genome analysis and Covid-19 research in South Korea – PRNewswire

Posted: at 2:56 am

SAN RAMON, Calif., March 17, 2021 /PRNewswire/ -- WANdisco, the LiveData company, announced today that it donated its LiveData Platform to help Korea Research Institute of Bioscience & Biotechnology conduct faster analysis in its efforts towards Covid-19 research. Using the automated data migration and replication platform, the institute has been able to replicate files between Hadoop-based big data clusters and Linux-based analysis clusters 13 times faster than before, and reduce analysis time by over 30 percent.

In early 2020, WANdisco announced free access to their suite of cloud migration and big data tools for teams involved in research and developing potential treatments and cures for the Covid-19 pandemic. WANdisco provided its LiveData Platform along with technical resources to Korean Bioinformation Center (KOBIC) to assist the organization in enhancing its architecture, developing products, and introducing WANdisco's automated replication technology into KOBIC's workflow.

"Donating our LiveData platform to the Covid-19 research was absolutely the right thing to do," said David Richards, Chairman and CEO, WANdisco. "Every minute, new data is being generated about people suffering from Covid-19. Velocity has become more important than anything else in the development of vaccines and treatments to overcome infectious diseases. We were grateful for the opportunity to assist Korea accelerate its Covid-19 research and analysis."

KOBIC provides Bio-Express, a cloud service free to bio-engineering researchers at Korean hospitals, businesses, universities, and research institutes for large-capacity genome analysis and storage. Since March 2020, the platform's Covid-19 research information portal has provided Covid-19-related genomes and proteomic data from around the world. As the amount of data and users skyrocketed last year, so did the time to replicate terabytes of data between the Hadoop Distributed File System (HDFS) and the Linux/Unix-based Lustre file system to support the analysis tools within different operating environments. More than 40 percent of Bio-Express's total processing time was dedicated solely to replicating an average of 20TB of data per day.

Upon hearing about Korea's data replication challenges, WANdisco donated its LiveData Platform to reduce large-scale data replication time while ensuring data availability and consistency to researchers. KOBIC administrators could easily and immediately move HDFS data to Bio-Express with automated migration and replication capabilities. No changes were required to applications, cluster or node configuration or operation while ensuring data changes were replicated completely and consistently.

With this new capacity, KOBIC expects to significantly increase the efficiency of the next generation of Bio-Express in performing large-scale genome analysis in 2021. WANdisco has since provided KOBIC with an ongoing license to LiveData Platform alongside technical support to help enhance its architecture, develop products, and apply automated replication in the workflow.

"KOBIC uses the WANdisco live data platform to automate file transfer 13 times faster in both directions between Hadoop-based Big Data Analysis Program Execution Cluster (HDFS) and Linux-based Genomic Analysis Program Execution Cluster (Lustre)," said Kun-Hwan Ko, Researcher at KOBIC Computational Development Team. "We were able to reduce the overall average time to analyze user genomic data of Bio-Express service by more than 30 percent."
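The figures quoted in this release can be cross-checked with a simple Amdahl's-law style estimate. The sketch below assumes that replication made up 40 percent of total processing time (the release says "more than 40 percent"), that only replication was accelerated, and that the 13-fold speedup applied to all of it; the function name is hypothetical and this is not KOBIC's own measurement.

```python
def remaining_time_fraction(part_fraction: float, speedup: float) -> float:
    """Fraction of the original runtime left after speeding up one part of the work."""
    untouched = 1.0 - part_fraction          # work that is not accelerated
    accelerated = part_fraction / speedup    # accelerated part, now shorter
    return untouched + accelerated


# Assumptions: replication is 40% of total time and becomes 13x faster.
remaining = remaining_time_fraction(0.40, 13.0)
reduction = 1.0 - remaining
print(f"Estimated overall time reduction: {reduction:.1%}")
```

Under these assumptions the estimated reduction is a little under 37 percent, which is in line with the "more than 30 percent" reported for Bio-Express.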

About WANdisco

WANdisco is the LiveData company. WANdisco solutions enable enterprises to create an environment where data is always available, accurate and protected, creating a strong backbone for their IT infrastructure and a bedrock for running consistent, accurate machine learning applications. With zero downtime and zero data loss, WANdisco LiveData Platform keeps geographically dispersed data at any scale consistent between on-premises and cloud environments allowing businesses to operate seamlessly in a hybrid or multi-cloud environment. For more information on WANdisco, visit http://www.wandisco.com.

About KOBIC

The Korea Bioinformation Center (KOBIC) is a national bio-resource information center for general management of domestic bio-resource information and research in the field of bioinformation. KOBIC helps domestic research institutes, hospitals, companies, and universities to research genomic data and Covid-19 for free. One of KOBIC's main missions is to develop and operate a system that can analyze large-scale genomic data using the state-of-the-art information technology.

Media Contact:

Josh Turner
Silicon Valley Communications
+1 (917) 231-0550

SOURCE WANdisco

Originally posted here:
WANdisco grants industry leading LiveData Platform to fast-track high-volume genome analysis and Covid-19 research in South Korea - PRNewswire


Centre asks states to focus on genome testing to track mutated virus – Mint

Posted: at 2:56 am

NEW DELHI: As the total number of covid-19 cases caused by mutant variants of coronavirus rose to 400, the Centre on Friday asked states to follow up on sending samples for genome testing to track circulating virus variants of concern.

All states and Union territories have been tagged to 10 national laboratories under the Indian SARS-CoV-2 Genomics (INSACOG) consortium with National Centre for Disease Control (NCDC) as the nodal institute, the Union health ministry said. The Indian government has recently confirmed circulation of UK, South Africa and Brazil mutant variants of coronavirus in the country.


Public health experts have said that as the mutant strains continue to increase the disease burden, the government needs to look at the diagnostics more closely. "Analysing power as well as the density of sequencing ability will be important for India. We need to update our regulatory processes as well for the same and these modifications can be quickly brought to fruition. The US FDA guidelines have provided a relatively easy guideline for achieving the same. Countries need to be alert and create diagnostic kits for the future depending on the presence of mutations," said professor N.K. Ganguly, president, Jawaharlal Institute of Post Graduate Medical Education and Research, and former director general of the Indian Council of Medical Research (ICMR). Ganguly suggested that booster shots in the existing vaccines can be given to neutralise the effects of mutations alongside a few other vaccines.

The Centre has also advised states/UTs to improve testing in districts reporting reduction in testing and increase the overall share of RT-PCR tests (more than 70%), especially in districts dependent on high levels of antigen testing in line with the Test Track & Treat strategy of the government.

While covid-19 vaccination is progressing in the country, with over 40 million doses administered to immunise people against the highly infectious disease, India's covid-19 burden continues to increase. According to the Union health ministry data, some states in the country are reporting a surge in the daily new covid-19 cases. Maharashtra, Punjab, Karnataka, Gujarat and Chhattisgarh together account for 80.63% of the daily new cases.

Over 39,726 new daily cases were reported in the last 24 hours, the highest this year. Maharashtra continues to report the highest daily new cases at 25,833, 65% of the daily cases. It is followed by Punjab with 2,369 cases, while Kerala reported 1,899 new cases. The country also recorded over 156 deaths.
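The state shares quoted above can be reproduced directly from the reported counts. The sketch below is only an arithmetic check, assuming the national total is exactly 39,726 (the article says "over 39,726"); the variable and function names are illustrative.

```python
# Daily new cases as reported in the article for the last 24 hours.
total_new_cases = 39_726  # assumption: treated as an exact figure
state_new_cases = {"Maharashtra": 25_833, "Punjab": 2_369, "Kerala": 1_899}


def share_percent(part: int, whole: int) -> float:
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


shares = {
    state: share_percent(count, total_new_cases)
    for state, count in state_new_cases.items()
}
for state, pct in shares.items():
    print(f"{state}: {pct:.1f}% of daily new cases")
```

Maharashtra's share works out to about 65%, matching the figure given in the article.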

India's total active caseload stood at 2.71 lakh (2,71,282) on Friday, 2.82% of the total positive cases in the country. This is a net increase of 18,918 cases in the total active caseload over the last 24 hours, the government data showed. Three states, Maharashtra, Kerala and Punjab, account for 76.48% of India's total active cases.

Meanwhile, the Central government has also advised states and UTs to carry out an average close contact tracing of a minimum of 20 persons per positive case (in the first 72 hours) along with isolation and early treatment of the serious cases as per clinical protocol. "It is also advised to focus on surveillance and stringent containment of those areas in selected districts which are seeing a cluster of cases and focus on clinical management in districts reporting higher deaths," the Union health ministry said in a statement.

States/UTs have been asked to limit the gathering in public places along with promoting covid-appropriate behaviour through communication and enforcement, and accelerate vaccination for priority population groups in districts reporting higher cases. Accelerating the pace of vaccination has also been stressed upon, the Central government said.

Recently, the Centre had deputed high-level public health teams to Maharashtra and Punjab to assist in covid-19 control and containment measures in view of the recent spike in cases in these states.

The Centre had earlier deputed high-level teams to Maharashtra, Kerala, Chhattisgarh, Madhya Pradesh, Gujarat, Punjab, Karnataka, Tamil Nadu, West Bengal, as well as Jammu and Kashmir to support them in their fight against the recent spike in covid-19 cases.


See the original post:
Centre asks states to focus on genome testing to track mutated virus - Mint

