ProMIS Neurosciences adds Dr. David Wishart to its Scientific Advisory Board – GlobeNewswire

TORONTO and CAMBRIDGE, Mass., Oct. 29, 2020 (GLOBE NEWSWIRE) -- ProMIS Neurosciences, Inc. (TSX: PMN) (OTCQB: ARFXF), a biotechnology company focused on the discovery and development of antibody therapeutics targeting toxic oligomers implicated in the development of neurodegenerative diseases, welcomes Dr. David Wishart, Distinguished University Professor in the Departments of Biological Sciences and Computing Science at the University of Alberta, to its Scientific Advisory Board (SAB). Identified as one of the world's most highly cited scientists for each of the past seven years, Dr. Wishart brings more than three decades of protein folding and misfolding research to ProMIS, creating industry-leading depth in this area of therapeutic development for neurodegenerative and other diseases.

"The commitment and talent of our advisory board has been instrumental to the ongoing development of our broad portfolio of highly specific therapeutic, vaccine and diagnostic candidates," said Eugene Williams, Executive Chairman of ProMIS Neurosciences. "Dr. Wishart's world-recognized expertise in protein folding and misfolding, combined with Dr. Neil Cashman's complementary leadership, will place ProMIS among the most accomplished within this arena. Their combined expertise will advance our platform's application to an even broader scope of diseases caused by protein misfolding."

Dr. Wishart will play a pivotal role in advising ProMIS on the application and further development of its drug discovery and development platform, which is uniquely capable of identifying the sequence and shape (conformation) of novel binding targets, called peptide antigens, on misfolded proteins implicated in the development of neurodegenerative diseases such as Alzheimer's, Parkinson's and ALS. ProMIS has leveraged its novel platform to create a portfolio of antibody, intrabody and vaccine candidates that are highly selective for the misfolded protein aggregates driving pathogenesis. With Dr. Wishart's support, ProMIS will continue to expand the application of its platform to the biology of additional misfolded protein diseases.

"Never before has there been a more urgent need for therapy, diagnostic and vaccine candidates that are highly specific for their intended target," said Dr. Wishart. "I look forward to working with Dr. Neil Cashman and his team and such an accomplished SAB as we continue to seek new opportunities to apply ProMIS' unique platform technology to misfolded protein diseases with high unmet need."

ProMIS' SAB includes distinguished, highly published and cited contributors to the current scientific understanding of Alzheimer's, Parkinson's, ALS, protein misfolding diseases in general, vaccines and diagnostics. Dr. Wishart joins the Board's current members.

About Dr. David Wishart
Dr. Wishart has been studying protein folding and misfolding for more than 30 years using a combination of computational and experimental approaches. These experimental approaches include NMR spectroscopy, circular dichroism, fluorescence spectroscopy, electron microscopy, protein engineering and molecular biology. The computational methods include molecular dynamics, agent-based modeling, bioinformatics and machine learning. Over the course of his career, Dr. Wishart has published more than 430 scientific papers, cited more than 78,000 times, covering many areas of protein science including structural biology, protein metabolism and computational biochemistry. He has been with the University of Alberta since 1995 and is currently a Distinguished University Professor in the Departments of Biological Sciences and Computing Science. He also holds adjunct appointments with the Faculty of Pharmaceutical Sciences and the Department of Pathology and Laboratory Medicine.

Dr. Wishart has been awarded research grants totaling more than $130 million from a number of funding agencies. He has also led or directed a number of core facilities and centers and currently co-directs The Metabolomics Innovation Centre (TMIC), Canada's national metabolomics laboratory. Dr. Wishart held the Bristol-Myers Squibb Research Chair in Pharmaceutical Sciences from 1995 to 2005, received the Astra-Zeneca-CFPS Young Investigator Prize in 2001, was awarded a Lifetime Honorary Fellowship by the Metabolomics Society in 2014, and was elected a Fellow of the Royal Society of Canada in 2017.

About ProMIS Neurosciences
ProMIS Neurosciences, Inc. is a development stage biotechnology company whose unique core technology is the ability to rationally predict the site and shape (conformation) of novel targets known as Disease Specific Epitopes (DSEs) on the molecular surface of proteins. In neurodegenerative diseases, such as Alzheimer's, ALS and Parkinson's disease, the DSEs are misfolded regions on toxic forms of otherwise normal proteins. In the infectious disease setting, these DSEs represent peptide antigens that can be used as an essential component to create accurate and sensitive serological assays to detect the presence of antibodies that arise in response to a specific infection, such as COVID-19. ProMIS' proprietary peptide antigens can also be used to create potential therapeutic antibodies, as well as serve as the basis for development of vaccines. ProMIS is headquartered in Toronto, Ontario, with offices in Cambridge, Massachusetts. ProMIS is listed on the Toronto Stock Exchange under the symbol PMN, and on the OTCQB Venture Market under the symbol ARFXF. Visit us at www.promisneurosciences.com, and follow us on Twitter and LinkedIn. To learn more about protein misfolding diseases, listen to Episodes 11 and 24 of Saving Minds, a podcast available at iTunes or Spotify.

For media inquiries, please contact:
Shanti Skiffington
shanti.skiffington@gmail.com
Tel. 617 921-0808

The TSX has not reviewed and does not accept responsibility for the adequacy or accuracy of this release. This information release contains certain forward-looking information. Such information involves known and unknown risks, uncertainties and other factors that may cause actual results, performance or achievements to be materially different from those implied by statements herein, and therefore these statements should not be read as guarantees of future performance or results. All forward-looking statements are based on the Company's current beliefs as well as assumptions made by and information currently available to it as well as other factors. Readers are cautioned not to place undue reliance on these forward-looking statements, which speak only as of the date of this press release. Due to risks and uncertainties, including the risks and uncertainties identified by the Company in its public securities filings, actual events may differ materially from current expectations. The Company disclaims any intention or obligation to update or revise any forward-looking statements, whether as a result of new information, future events or otherwise.


Silent Mutations Identified That Give the COVID-19 Coronavirus an Evolutionary Edge – SciTechDaily

RNA folding may help explain how the coronavirus became so hard to stop after it spilled over from wildlife to humans.

We know that the coronavirus behind the COVID-19 crisis lived harmlessly in bats and other wildlife before it jumped the species barrier and spilled over to humans.

Now, researchers at Duke University have identified a number of silent mutations in the roughly 30,000 letters of the virus's genetic code that helped it thrive once it made the leap, and possibly helped set the stage for the global pandemic. The subtle changes involved how the virus folded its RNA molecules within human cells.

For the study, published October 16, 2020, in the journal PeerJ, the researchers used statistical methods they developed to identify adaptive changes that arose in the SARS-CoV-2 genome in humans, but not in closely related coronaviruses found in bats and pangolins.

"We're trying to figure out what made this virus so unique," said lead author Alejandro Berrio, a postdoctoral associate in biologist Greg Wray's lab at Duke.

Previous research detected fingerprints of positive selection within a gene that encodes the spike proteins studding the coronavirus's surface, which play a key role in its ability to infect new cells.

The new study likewise flagged mutations that altered the spike proteins, suggesting that viral strains carrying these mutations were more likely to thrive. But with their approach, study authors Berrio, Wray and Duke Ph.D. student Valerie Gartner also identified additional culprits that previous studies failed to detect.

The researchers report that so-called silent mutations in two other regions of the SARS-CoV-2 genome, dubbed Nsp4 and Nsp16, appear to have given the virus a biological edge over previous strains without altering the proteins they encode.
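Whether a single-nucleotide change is "silent" can be checked directly against the standard genetic code: the mutated codon must translate to the same amino acid as the original. The sketch below is a generic illustration of that check; the codons shown are arbitrary examples, not positions from the Nsp4 or Nsp16 genes.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering of bases.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}

def is_silent(codon: str, position: int, new_base: str) -> bool:
    """Return True if substituting new_base at position (0-2) leaves the
    encoded amino acid unchanged, i.e. the mutation is synonymous."""
    mutated = codon[:position] + new_base + codon[position + 1:]
    return CODON_TABLE[codon] == CODON_TABLE[mutated]

# CTA -> CTG: both encode leucine, so the change is silent.
print(is_silent("CTA", 2, "G"))   # True
# ATG (methionine) -> CTG (leucine): the protein sequence changes.
print(is_silent("ATG", 0, "C"))   # False
```

Silent mutations such as these leave every protein untouched, which is why, as the study argues, their selective advantage must come from something other than protein function, such as RNA structure.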

Instead of affecting proteins, Berrio said, the changes likely affected how the virus's genetic material, which is made of RNA, folds up into 3-D shapes and functions inside human cells.

What these changes in RNA structure might have done to set the SARS-CoV-2 virus in humans apart from other coronaviruses is still unknown, Berrio said. But they may have contributed to the virus's ability to spread before people even know they have it, a crucial difference that made the current situation so much more difficult to control than the SARS coronavirus outbreak of 2003.

The research could lead to new molecular targets for treating or preventing COVID-19, Berrio said.

"Nsp4 and Nsp16 are among the first RNA molecules that are produced when the virus infects a new person," Berrio said. "The spike protein doesn't get expressed until later. So they could make a better therapeutic target because they appear earlier in the viral life cycle."

More generally, by pinpointing the genetic changes that enabled the new coronavirus to thrive in human hosts, scientists hope to better predict future zoonotic disease outbreaks before they happen.

"Viruses are constantly mutating and evolving," Berrio said. "So it's possible that a new strain of coronavirus capable of infecting other animals may come along that also has the potential to spread to people, like SARS-CoV-2 did. We'll need to be able to recognize it and make efforts to contain it early."

Reference: "Positive selection within the genomes of SARS-CoV-2 and other Coronaviruses independent of impact on protein function" by Alejandro Berrio, Valerie Gartner and Gregory A. Wray, 16 October 2020, PeerJ. DOI: 10.7717/peerj.10234


Scientists discover new organic compounds that could have helped form the first cells – Science Codex

Chemists studying how life started often focus on how modern biopolymers like peptides and nucleic acids contributed, but modern biopolymers don't form easily without help from living organisms. A possible solution to this paradox is that life started using different components, and many non-biological chemicals were likely abundant in the environment. A new survey conducted by an international team of chemists from the Earth-Life Science Institute (ELSI) at Tokyo Institute of Technology and other institutes from Malaysia, the Czech Republic, the US and India, has found that a diverse set of such compounds easily form polymers under primitive environmental conditions, and some even spontaneously form cell-like structures.

Understanding how life started on Earth is one of the most challenging questions modern science attempts to explain. Scientists presently study modern organisms and try to see what aspects of their biochemistry are universal, and thus were probably present in the organisms from which they descended. The best guess is that life has thrived on Earth for at least 3.5 billion years of Earth's 4.5-billion-year history since the planet formed, and most scientists would say life likely began before the earliest good evidence for its existence. Problematically, since Earth's surface is dynamic, the earliest traces of life on Earth have not been preserved in the geological record. However, the earliest evidence for life on Earth tells us little about what the earliest organisms were made of, or what was going on inside their cells. "There is clearly a lot left to learn from prebiotic chemistry about how life may have arisen," says the study's co-author Jim Cleaves.

A hallmark of life is evolution, and the mechanisms of evolution suggest that common traits can suddenly be displaced by rare and novel mutations which allow mutant organisms to survive better and proliferate, often replacing previously common organisms very rapidly. Paleontological, ecological and laboratory evidence suggests this occurs commonly and quickly. One example is an invasive organism like the dandelion, which was introduced to the Americas from Europe and is now a common weed that lawn-conscious homeowners spend countless hours and dollars trying to eradicate. Another less whimsical example is COVID-19, a virus (technically not living, but technically an organism) which was probably confined to a small population of bats for years, but suddenly spread among humans around the world. Organisms which reproduce faster than their competitors, even only slightly faster, quickly send their competitors to what Leon Trotsky termed the "ash heap of history." As most organisms which have ever existed are extinct, co-author Tony Z. Jia suggests that "to understand how modern biology emerged, it is important to study plausible non-biological chemistries or structures not currently present in modern biology which potentially went extinct as life complexified."

This idea of evolutionary replacement is pushed to an extreme when scientists try to understand the origins of life. All modern organisms have a few core commonalities: all life is cellular, life uses DNA as an information storage molecule, and uses DNA to make ribonucleic acid (RNA) as an intermediary way to make proteins. Proteins perform most of the catalysis in modern biochemistry, and they are created using a very nearly universal "code" to make them from RNA. How this code came to be is in itself enigmatic, but these deep questions point to there possibly having been a very murky period in early biological evolution ~4 billion years ago during which almost none of the molecular features observed in modern biochemistry were present, and few if any of the ones that were present have been carried forward.

Proteins are linear polymers of amino acids. These floppy strings of polymerised amino acids fold into unique three-dimensional shapes, forming extremely efficient catalysts which foster precise chemical reactions. In principle, many types of polymerised molecules could form similar strings and fold to form similar catalytic shapes, and synthetic chemists have already discovered many examples. "The point of this kind of study is finding functional polymers in plausibly prebiotic systems without the assistance of biology, including grad students," says co-author Irena Mamajanov.

Scientists have found many ways to make biological organic compounds without the intervention of biology, and these mechanisms help explain these compounds' presence in samples like carbonaceous meteorites, which are relics of the early solar system, and which scientists don't think ever hosted life. These primordial meteorite samples also contain many other types of molecules which could have formed complex folded polymers like proteins, which could have helped steer primitive chemistry. Proteins, by virtue of their folding and catalysis, mediate much of the complex biochemical evolution observed in living systems. The ELSI team reasoned that alternative polymers could have helped this occur before the coding between DNA and protein evolved. "Perhaps we cannot reverse-engineer the origin of life; it may be more productive to try and build it from scratch, and not necessarily using modern biomolecules. There were large reservoirs of non-biological chemicals that existed on the primeval Earth. How they helped in the formation of life-as-we-know-it is what we are interested in," says co-author Kuhan Chandru.

The ELSI team did something simple yet profound: they took a large set of structurally diverse small organic molecules which could plausibly be made by prebiotic processes and tried to see if they could form polymers when evaporated from dilute solution. To their surprise, they found many of the primitive compounds could, though they also found some of them decomposed rapidly. This simple criterion, whether a compound is able to be dried without decomposing, may have been one of the earliest evolutionary selection pressures for primordial molecules.

The team conducted one further simple test. They took these dried reactions, added water and looked at them under a microscope. To their surprise, some of the products of these reactions formed cell-sized compartments. That simple starting materials containing 10 to 20 atoms can be converted to self-organised cell-like aggregates containing millions of atoms provides startling insight into how simple chemistry may have led to complex chemistry bordering on the kind of complexity associated with living systems, while not using modern biochemicals.

"We didn't test every possible compound, but we tested a lot of possible compounds. The diversity of chemical behaviors we found was surprising, and suggests this kind of small-molecule to functional-aggregate behavior is a common feature of organic chemistry, which may make the origin of life a more common phenomenon than previously thought," concludes co-author Niraja Bapat.


The structural basis for Z α1-antitrypsin polymerization in the liver – Science Advances

Abstract

The serpinopathies are among a diverse set of conformational diseases that involve the aberrant self-association of proteins into ordered aggregates. α1-Antitrypsin deficiency is the archetypal serpinopathy and results from the formation and deposition of mutant forms of α1-antitrypsin as polymer chains in liver tissue. No detailed structural analysis has been performed of this material. Moreover, there is little information on the relevance of well-studied artificially induced polymers to these disease-associated molecules. We have isolated polymers from the liver tissue of Z α1-antitrypsin homozygotes (E342K) who have undergone transplantation, labeled them using a Fab fragment, and performed single-particle analysis of negative-stain electron micrographs. The data show structural equivalence between heat-induced and ex vivo polymers and that the intersubunit linkage is best explained by a carboxyl-terminal domain swap between molecules of α1-antitrypsin.

The misfolding of proteins and their spontaneous ordered aggregation underlie the pathology of Alzheimer's, Huntington's, and Parkinson's diseases; amyloidoses; and serpinopathies, the latter involving self-association of mutant members of the serine protease inhibitor (serpin) superfamily. α1-Antitrypsin is a 52-kDa serpin expressed and secreted predominantly by hepatocytes and is the most abundant circulating protease inhibitor. The primary physiological role of α1-antitrypsin is the inhibition of neutrophil elastase, a protease whose production is increased during the acute phase inflammatory response (fig. S1, A and B). However, genetic variants such as the severe Z (E342K) allele of α1-antitrypsin promote proteasomal degradation and the formation of ordered linear polymers (1, 2). Despite the pronounced retention in the endoplasmic reticulum (ER), α1-antitrypsin polymers do not typically initiate the unfolded protein response. Instead, these ordered aggregates can be sequestered into ER-derived inclusion bodies that are associated with liver disease. The lack of circulating α1-antitrypsin results in dysregulation of neutrophil elastase and hence tissue destruction and emphysema (2).

The structure of the pathological polymers that accumulate in patients has not been demonstrated. The observation that α1-antitrypsin polymers show a similar degree of stabilization to the cleaved form (3) (fig. S1B, EI) and that peptide analogs of the inserted portion of the reactive center loop (RCL) could similarly stabilize the protein (4) and prevent polymerization (1, 3) suggested that polymers were the product of an interaction between the RCL of one molecule and β-sheet A of the next (1). This loop-sheet model (Fig. 1A, hypotheses H1 and H2) is consistent with nuclear magnetic resonance and H/D (hydrogen-deuterium) exchange data showing that polymerization proceeds via a compact, rather than an expanded, intermediate (5, 6). The subsequently proposed β-hairpin hypothesis (Fig. 1A, H3) was based on the crystal structure of a self-terminating dimer of a homologous protein, generated artificially at low pH, and extrapolated to α1-antitrypsin using limited proteolysis and recombinant mutants with stabilizing disulfide bonds (7). The C-terminal model (Fig. 1A, hypothesis H4) posits that the C terminus fails to form properly in the donor molecule and is instead incorporated into an acceptor molecule, with latent-like self-insertion of the RCL providing the extreme stability found in polymers (8). This model is based on a crystal structure of a denaturant-induced circular trimer of recombinant disulfide-bonded α1-antitrypsin. The circular arrangement of subunits provides a rigid structure that is tractable for crystallography but reflects a minor component of the source sample that is not generally enriched in polymer preparations (1), although there is an absence of the latent conformation in humans that would be predicted to be a by-product of this mechanism (9).

(A) Different linkages hypothesized for the pathological polymer, H1 to H4, with the intermolecular interface proposed between one monomeric subunit and the next shown in black. (B) (i) Analysis of polymers isolated from intrahepatic inclusion bodies (denoted as ZZ) by 4–12% (w/v) acrylamide SDS-PAGE in comparison with the monomeric wild-type (M) variant purified from human plasma and visualized by Coomassie blue R stain. (ii, iv, and v) Western blots of ex vivo polymers (ZZ), polymers of the M variant induced by heating (H), and monomeric M variant (M) separated by denaturing SDS-PAGE (top) and nondenaturing native PAGE (bottom) and probed with a conformation-insensitive rabbit polyclonal antibody (pAb AAT, left) or a mouse monoclonal selective for polymeric α1-antitrypsin (mAb 2C1, right). No monomer is visible by native PAGE in the heat or ZZ preparation. (iii) Sensitivity of ex vivo Z α1-antitrypsin to PNGase F (+P) or EndoH (+E), the latter preferentially cleaving high-mannose glycans. (C) Representative micrograph of polymers isolated from ex vivo liver tissue, visualized by 2% (w/v) uranyl acetate negative stain using a Tecnai 120-keV transmission electron microscope at a magnification of ×92,000. The image has been low-pass filtered to 30 Å. Black scale bar, 50 nm. Details of some polymers are shown at the right. (D) Same material, labeled with the Fab fragment of the 4B12 monoclonal antibody (Fab4B12), and visualized under the same conditions. Scale bar, 50 nm. Details from micrographs are shown at the right; readily discernible Fab protrusions are highlighted by arrows.

The question remains unresolved as to which polymerization model, if any, describes a realistic organization of the pathological polymer. To address this issue, we have performed a structural characterization of polymers from explant liver tissue of individuals homozygous for the Z allele who had undergone orthotopic transplantation. This has allowed us to define structural limits on the pathological polymer and to critically evaluate the proposed models in this pathological context.

Tissue samples were obtained from the explanted livers of individuals homozygous for the Z allele of α1-antitrypsin. After isolation of inclusion bodies, polymers released by sonication were found to contain a major component that resolved at ~50 kDa when dissociated and visualized by denaturing SDS-polyacrylamide gel electrophoresis (SDS-PAGE) (Fig. 1B, i). It was confirmed to be α1-antitrypsin by Western blot analysis (Fig. 1B, ii). The difference in migration with respect to monomeric material purified from human plasma (Fig. 1B, i and ii) was no longer observed following treatment with PNGase F or EndoH (Fig. 1B, iii). This is diagnostic for glycosylated material that has not undergone maturation in the trans-Golgi network and therefore has been retained by the cell. When visualized by nondenaturing PAGE, the protein migrated with a broad size profile with some discrete bands visible, it was reactive with the polymer-specific (10) monoclonal antibody mAb2C1, and it was free of detectable monomer (Fig. 1B, iv and v).

The liver-derived polymers were applied to carbon-coated copper grids and negatively stained with 2% (w/v) uranyl acetate; polymers could easily be distinguished in the resultant electron microscopy (EM) images by a beads-on-a-string appearance (1), with a curvature of the chain and an absence of branching (Fig. 1C). While some circular forms were present, in contrast to a small-angle x-ray scattering (SAXS) analysis of polymeric material produced in the cytoplasm of Pichia pastoris (11), most (~80%) were non-self-terminating, with clearly separated termini.

Polymer subunits are ~50 kDa in size, their ellipsoidal shape has few distinct features that would aid orientation, and they are connected by linkages that appear flexible. These properties are confounding factors for processing by single-particle analysis. To facilitate subsequent image processing, we doubled the effective size of the polymer subunits and introduced an orienting feature by labeling polymers with the antigen-binding fragment of the 4B12 monoclonal antibody (Fab4B12) (12). This antibody was selected as it recognizes all folded forms of α1-antitrypsin including the polymer, and the location of its epitope is well established (12–14).

Following the addition of Fab4B12 at a stoichiometric excess to the α1-antitrypsin subunits and removal of unbound material, the polymer sample was visualized using negative-stain EM (NS-EM) (Fig. 1D). Fab4B12-labeled polymer subunits demonstrated additional density visible as tooth-like protrusions (Fig. 1D, insets). On consecutive subunits, Fabs were, in general, present on the same side of the polymer chain, potentially indicating a preference of the angular relationship around the polymer axis. Conversely, opposing α1-antitrypsin–Fab4B12 orientations, which would report substantial orientational freedom around the intersubunit linkage, were observed only infrequently.

The heterogeneity and flexibility of ex vivo polymers make them unsuitable for crystallography. Modern protocols for single-particle reconstruction of three-dimensional (3D) objects using EM images enable us to explicitly address heterogeneity in samples, and we therefore sought to structurally characterize the pathological polymers using this technique. A NS-EM image dataset of Fab4B12-labeled polymers was compiled from 100 30-frame movies that had been collected using a DE-20 direct detector and a Tecnai 200-keV transmission electron microscope. Preliminary experiments indicated that polymer flexibility would represent a challenge for a single-particle reconstruction approach. Thus, a minimal segment required to investigate the linkage between monomers, a dimer of adjacent subunits, was chosen for the subsequent structural analysis.

The processing pathway for single-particle reconstruction is described in more detail in the Supplementary Materials and in fig. S2 and is summarized here. Initially, images of dimer particles were manually selected from regions of polymers that appeared by eye to be side views with relatively little curvature (fig. S2b) and divided into classes using the Class2D function of RELION (15). The class sums included dimers in which the subunits appeared as adjacent ellipses, and many subunits exhibited a protuberance with the characteristic narrow midriff present in Fab structures (fig. S2d). In some classes, these Fab4B12 subunits were poorly resolved, suggesting variability in rotation between adjacent subunits. Seven classes with well-defined Fab4B12 components were used as references for autopicking from the same set of micrographs; after removal of poorly defined components, this yielded ~100,000 230 × 230 pixel particle images. This dataset, DA,100K, was found by 2D classification to be more diverse and less dominated by long-axis dimer views (fig. S2f). Later in the course of processing, a subset of 69,000 dimer images (DB,69K) was extracted from a 2D reclassification of the same dataset (fig. S2k).

One class in particular showed two well-resolved Fab subunits (fig. S2h). To generate an initial model-agnostic reference map for 3D classification, we converted this 2D image to a 3D surface representation with the height (along z in both directions) at each x,y coordinate proportional to the grayscale value of the corresponding pixel in the image (fig. S2h, right). This map was used as a reference for 3D classification of the DA,100K dataset (fig. S2i). In two of eight resulting maps, both α1-antitrypsin molecules exhibited Fab4B12 protrusions. The best-defined map was divided in half, and one subunit was used as a monomer input reference in a reclassification of DA,100K (fig. S2j). Following several iterations of 3D classification, five of eight classes exhibited either one or two well-defined α1-antitrypsin–Fab4B12 subunits (fig. S2n). These maps were divided in half, and the monomer subunits were individually superimposed and averaged together, providing a consensus density for the α1-antitrypsin–Fab4B12 monomer subunit, Monav (fig. S2o, left). Monav was used as the reference map in successive rounds of 3D classification. Eventually, two classes were identified that showed connected α1-antitrypsin molecules with clear Fab4B12 subunits, comprising 9200 and 6200 particle images, respectively (fig. S2, p and q).
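The conversion of a 2D class average into a model-agnostic 3D reference, with density extended along z in proportion to pixel intensity, can be sketched roughly as follows. This is a simplified NumPy illustration under assumed parameters (map depth, linear scaling), not the authors' actual implementation.

```python
import numpy as np

def image_to_reference_map(img: np.ndarray, depth: int = 32) -> np.ndarray:
    """Extrude a 2D grayscale image into a crude 3D density map: at each
    (x, y) position, voxels are filled symmetrically about the z mid-plane
    to a half-depth proportional to the normalized pixel intensity."""
    norm = (img - img.min()) / (np.ptp(img) + 1e-12)  # scale intensities to [0, 1]
    half_extent = norm * (depth / 2)                  # per-pixel half-depth along z
    z = np.arange(depth) - (depth - 1) / 2            # z coordinates about mid-plane
    # Broadcast: a voxel is filled where |z| is within the local half-depth.
    volume = (np.abs(z[None, None, :]) < half_extent[:, :, None]).astype(np.float32)
    return volume

# A single bright pixel produces a full-depth column; dark pixels stay empty.
img = np.zeros((4, 4)); img[1, 2] = 1.0
vol = image_to_reference_map(img, depth=8)
print(vol.shape)             # (4, 4, 8)
print(int(vol[1, 2].sum()))  # 8  (fully extruded column)
print(int(vol[0, 0].sum()))  # 0  (background)
```

The resulting map carries no prior structural model, only the information in the chosen 2D class, which is why it serves as an unbiased starting reference for 3D classification.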

These 3D classes differed in the angles between the two α1-antitrypsin–Fab4B12 subunits, approximately 60° and 90°, and were accordingly termed Dim60 and Dim90 (Fig. 2A). Both showed clear Fab4B12 protuberances and connectivity between the volumes representing the α1-antitrypsin molecules. 3D refinement using gold-standard FSC (Fourier shell correlation) analysis provided estimated resolutions of 19.1 and 24.8 Å, respectively (at an FSC threshold of 0.33) (fig. S3). Other attempts to obtain dimer reconstructions using variations of the processing pathway described above also converged on these two forms and no others.
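Fourier shell correlation measures the agreement between two independently refined half-maps as a function of spatial frequency: the normalized cross-correlation of their Fourier coefficients within concentric shells. A minimal NumPy sketch of the calculation (the shell count and map size here are illustrative choices, not the values used in the study):

```python
import numpy as np

def fourier_shell_correlation(map1, map2, n_shells=8):
    """Correlate the Fourier coefficients of two equally sized cubic 3D maps
    within concentric frequency shells; identical maps give FSC = 1."""
    f1, f2 = np.fft.fftn(map1), np.fft.fftn(map2)
    freq = np.fft.fftfreq(map1.shape[0])
    fx, fy, fz = np.meshgrid(freq, freq, freq, indexing="ij")
    radius = np.sqrt(fx**2 + fy**2 + fz**2)
    # Assign each voxel to a shell; the Nyquist frequency is 0.5 cycles/voxel.
    shell = np.minimum((radius / 0.5 * n_shells).astype(int), n_shells - 1)
    fsc = np.zeros(n_shells)
    for s in range(n_shells):
        m = shell == s
        num = np.abs(np.sum(f1[m] * np.conj(f2[m])))
        den = np.sqrt(np.sum(np.abs(f1[m]) ** 2) * np.sum(np.abs(f2[m]) ** 2))
        fsc[s] = num / den if den > 0 else 0.0
    return fsc

rng = np.random.default_rng(0)
a, b = rng.normal(size=(32,) * 3), rng.normal(size=(32,) * 3)
print(np.allclose(fourier_shell_correlation(a, a), 1.0))  # True: self-comparison
print(fourier_shell_correlation(a, b)[1:].max() < 0.5)    # independent maps decorrelate
```

The reported resolution corresponds to the spatial frequency at which this curve falls below a chosen threshold (0.33 in the passage above).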

(A) Orthogonal views of the reconstruction of Dim60 (left) and Dim90 (right) contoured at 3.9 × 10⁻⁵ Å⁻³. In this orientation, the connected α1-antitrypsin density is situated at the bottom, and the Fab domains are at the top. Calculated resolutions (using FSC = 0.33) are 19.1 and 24.8 Å, respectively (fig. S3). (B) Particle images, clustered by view and averaged, that are the basis for the reconstructions. The relative support for each cluster, calculated from the sum of the weights of the constituent images, is shown as circles colored according to a heatmap, highlighting the enrichment of views orthogonal to the dimer axis.

A summary of the constituent particle images, clustered by orientation relative to the 3D reconstructions, can be seen in Fig. 2B. In both cases, the assigned views show that the datasets contain a larger number of side-on views of the dimers, consistent with the observed alignment of most polymers in the micrographs.

Polymers artificially induced at an elevated temperature have often been used to study the process of polymerization (3, 6, 12, 16–18). It has been shown that this form shares a common epitope with ex vivo polymers in the vicinity of helices E and F (fig. S1A) (14); the epitope is not recognized when polymerization is artificially induced using a denaturant (10, 16). The lack of discrimination between heat and liver polymers does not, however, demonstrate structural equivalence, and a means of direct comparison between the two has been lacking.

Polymers of the plasma-purified M variant were heat-induced, purified, labeled with Fab4B12, and visualized by NS-EM using 2% (w/v) uranyl acetate stain. The resulting images showed the same flexible beads-on-a-string appearance (Fig. 3A), with a greater proportion exhibiting a circularized morphology. The Fab domains once again appeared as teeth-like protuberances with a general preferred orientation on the same side of the polymer axis in adjacent subunits and an occasional apparent ~90° to 180° inversion (Fig. 3B). A new dataset comprising 169 micrograph images was obtained, compiled from 30-frame movies collected using the DE-20 direct detector and the Tecnai 200-keV transmission electron microscope.

(A) Representative micrograph of polymers of M α1-antitrypsin induced at 55°C for 48 hours, visualized by 2% (w/v) uranyl acetate negative stain using the Tecnai 120-keV transmission electron microscope at a magnification of ×92,000. The image has been low-pass filtered to 30 Å. Black scale bar, 50 nm. Details of selected polymers are shown at the right. (B) Heat-induced polymers labeled with Fab4B12 and visualized in the same manner. Details from micrographs are shown at the right; discernible Fab protrusions are highlighted by arrows. (C) Orthogonal views of the reconstruction of a Dim60-like structure, with a calculated resolution of 26.4 Å (FSC = 0.33) (fig. S3). (D) Particles upon which the reconstruction is based, clustered by imputed orientation and with the relative sum of their weights shown as a spectrum. (E) Orthogonal projections of the aligned and contoured Dim60 (blue) and Dim60H (red) structures, with axes shown; overlapping regions appear as magenta. (F) 2D class sums from the liver-derived and heat-induced polymer particle datasets arranged in pairs, with columns denoted L and H, respectively. For each liver polymer class, the most similar heat-induced polymer class by cross-correlation coefficient is shown; gray vertical lines through the images denote identified intensity peaks. (G) Distribution of the interpeak distances for the liver-derived (blue) and heat-induced (red) polymers. Dashed lines indicate the means of both sets of data.

We performed autopicking in RELION from the new micrographs using the same 2D references as with the ex vivo dataset (fig. S2d, right) because the heat-induced polymer subunits were of a similar size. Following rounds of 2D classification and cleaning of the image dataset, 25,000 dimer particles were extracted for further image analysis. In 3D classification, the monomeric subunit Monav (fig. S2o, left), obtained from the ex vivo dataset, was used as the reference; a monomer rather than a dimer was chosen to avoid introducing bias in the relative rotation and translation between subunits. At the final step of classification, a Dim60-type class was identified (Dim60H; Fig. 3C), comprising 6750 particles and with a nominal resolution of 26.4 Å (at FSC = 0.33; fig. S3). Clustering of particles by their orientation relative to the 3D volume again showed a preference for side views (Fig. 3D). Attempts at reclassification of the residual 18,000 particles failed to reveal further well-defined 3D classes.

In a preliminary model-free analysis, the α1-antitrypsin–Fab4B12 dimer structure identified from the heat polymer data exhibited a somewhat different intersubunit distance and Fab4B12 orientation to that seen with the liver-derived dataset (Fig. 3, C and E): Translations and rotations of 64 Å/57° and 69 Å/65°, respectively, were required to superimpose a subunit volume onto the adjacent one. The correspondence more generally between the two datasets was therefore investigated. A comparison was made between all 2D classes obtained from the liver-derived polymer dataset against those calculated from the heat-induced polymer dataset by optimally aligning every possible pair and recording those with the highest correlation coefficient. Most pairs showed good visual correspondence (representative comparisons of class averages are shown in Fig. 3F). Positions of subunits were identified from peaks in the intensity profile of each image. The distribution of distances between these peaks in the aligned classes was almost identical, with means of 65 ± 12 Å and 64 ± 11 Å (SD) for liver-derived and heat-induced polymer 2D classes, respectively (Fig. 3G). The putative distinction between the dimer volumes is therefore likely accommodated within the observed geometric relationships between subunits in both samples rather than supporting separate linkage mechanisms.
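The class-pairing and interpeak analysis reduces to three operations on 1D intensity profiles: find the alignment shift that maximizes cross-correlation, locate intensity peaks, and convert peak spacings to Å. A minimal numpy sketch under those assumptions (the actual study aligned full 2D images; the 1D profiles, names, and simple three-point peak test here are illustrative):

```python
import numpy as np

def best_shift(profile_a, profile_b):
    """Lag (in pixels) of b relative to a that maximizes cross-correlation.
    A negative value means b is shifted right with respect to a."""
    a = profile_a - profile_a.mean()
    b = profile_b - profile_b.mean()
    cc = np.correlate(a, b, mode="full")
    return np.argmax(cc) - (len(b) - 1)

def peak_positions(profile, min_height):
    """Indices of local maxima above min_height (simple 3-point test)."""
    p = np.asarray(profile, dtype=float)
    return [i for i in range(1, len(p) - 1)
            if p[i] >= min_height and p[i] > p[i - 1] and p[i] >= p[i + 1]]

def interpeak_distances(profile, pixel_size, min_height):
    """Centre-to-centre subunit spacings, in Å, from adjacent intensity peaks."""
    peaks = peak_positions(profile, min_height)
    return [(j - i) * pixel_size for i, j in zip(peaks, peaks[1:])]
```

With a pixel size of 16 Å, peaks four pixels apart give the ~64 Å spacings reported for both datasets.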

The 3D reconstructions of adjacent subunits reflect the asymmetric character of the Fab4B12-bound subunits and the polarity of α1-antitrypsin within the polymer, and they embody shape, intersubunit distance, and rotational information. Accordingly, they could be used to challenge the different hypotheses regarding the structure of the pathological α1-antitrypsin polymer (Fig. 1A). As the foundation of this analysis, an atomic model of the Fab–antigen complex was required. Protein crystallization trials of Fab4B12 were successful and yielded a 1.9 Å structure, with the crystallographic parameters summarized in table S1. The asymmetric unit contained two molecules, one of which exhibited fully defined variable loop regions. Despite extensive efforts, it was not possible to obtain a crystal structure of the α1-antitrypsin–Fab4B12 complex; SAXS data were collected instead. The atomic model of the α1-antitrypsin–Fab4B12 subunit was then constructed using five sets of experimental data:

1) a consensus density map of the monomer generated by aligning and averaging the individual subunits of the Dim60 and Dim90 reconstructions from the liver polymer dataset (Mon60,90; shown in Fig. 4A, left);

(A) Left: Density for an α1-antitrypsin–Fab4B12 subunit calculated as the average of the Dim60 and Dim90 subunits, contoured at 1.9 × 10⁻⁵ Å⁻³ with a nominal resolution (at FSC = 0.33) of 15.2 Å (fig. S3). Middle: Result of modeling trials in which complexes between α1-antitrypsin and Fab4B12 molecules with random starting orientations were optimized with respect to the antibody epitope and the subunit density. The resulting structures were evaluated according to their correspondence with the experimental SAXS profile recorded for the complex. A cluster of structures maximizing both parameters is highlighted in red and circled. Right: Superposition on the α1-antitrypsin chain of these five structures showing a consistent relationship between the two components, with the heavy chain in blue and the light chain in red. (B) Left: Final model of the subunit shown in the context of the experimental density, with the heavy chain in blue, the light chain in dark green, and α1-antitrypsin β-sheets A, B, and C in red, pink, and yellow, respectively. The orientations are according to the axes shown in Fig. 3E. Right: Correspondence between the observed SAXS data (black) and the profile calculated from the coordinates of the final subunit model (red). (C) Top: Various polymer images extracted from NS-EM micrographs are shown in red, and 2D projections of polymer models that have been refined against these images are shown in black. Bottom: Mean relative correlations (±SD) between each model and the experimental density. Values were calculated for each oligomer relative to the best score observed for that oligomer. Significance was determined by one-way analysis of variance (ANOVA) and Tukey's multiple comparisons test (n = 18); ***P < 0.001 and ****P < 0.0001.

2) the experimentally determined epitope of Fab4B12 (13, 14) at α1-antitrypsin residues 32, 36, 43, 266, and 306, incorporated as a collection of distance constraints on the crystal structures of the individual components;

3) the Fab4B12 crystal structure;

4) the SAXS profile of the complex (Fig. 4B, right); and

5) the structure of cleaved α1-antitrypsin [Protein Data Bank (PDB): 1EZX (19)], as all extant polymer models propose a six-stranded β-sheet A configuration (Fig. 1A).

Integration of these data during modeling was performed using PyRosetta (20). One thousand randomized starting orientations for α1-antitrypsin and Fab4B12 were subjected to rigid-body energy optimization with reference to these constraints and the Mon60,90 subunit map and scored according to both the cross-correlation coefficient (CCC) with the density and their correspondence with the SAXS profile (Fig. 4A, middle). Backbone and side-chain flexibility was conferred on regions of the Fab likely to contribute to the interface (heavy chain: 27 to 33, 51 to 57, 71 to 76, and 94 to 102; light chain: 27 to 32, 49 to 54, 66 to 70, and 91 to 94) and on α1-antitrypsin side chains within the boundaries of the epitope.
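The selection step, scoring each candidate pose against both the EM density and the SAXS profile, can be illustrated schematically. The study used PyRosetta with full rigid-body minimization; the sketch below shows only how a combined ranking might be computed, and the additive score, the chi-square form, and all names are assumptions.

```python
import numpy as np

def cross_correlation(map_model, map_exp):
    """Cross-correlation coefficient (CCC) between a model-derived density
    map and the experimental map, both treated as flat vectors."""
    a = map_model.ravel() - map_model.mean()
    b = map_exp.ravel() - map_exp.mean()
    return float(a @ b / np.sqrt((a @ a) * (b @ b)))

def saxs_chi2(i_calc, i_obs, sigma):
    """Reduced chi-square agreement between calculated and observed SAXS profiles."""
    return float(np.mean(((i_calc - i_obs) / sigma) ** 2))

def rank_models(models, map_exp, i_obs, sigma):
    """Order candidate models (best first) by a combined score favoring a
    high density CCC and a low SAXS chi-square. Each model is a
    (density_map, saxs_profile) pair; returns model indices."""
    scored = [(cross_correlation(m, map_exp) - saxs_chi2(p, i_obs, sigma), i)
              for i, (m, p) in enumerate(models)]
    return [i for s, i in sorted(scored, reverse=True)]
```

In the actual analysis, the two metrics were inspected jointly (Fig. 4A, middle) rather than collapsed into a single number; a cluster of poses maximizing both was carried forward.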

The five models that maximized these metrics showed an unambiguous polarity (Fig. 4A, right). One model was selected that best represented this cluster by root mean square distance comparison with the others. This showed the heavy-light chain partition to be oriented off-center along helix A, with the variable-constant domain axis perpendicular to the long axis of the serpin [Fig. 4, A (right) and B (left)]. The cleft between the variable and constant domains of Fab4B12 aligned closely with a central dimple exhibited by the monomer density (denoted by an asterisk in the figure), and the complex corresponded well with the experimentally determined SAXS profile (Fig. 4B, right).

Initial models of the C-terminal (8), loop-sheet (1), and β-hairpin (7) polymer configurations (Fig. 1A) were built using the α1-antitrypsin–Fab4B12 subunit structure (representations of these can be seen in the left column of Fig. 5), differing most substantially in the linker regions connecting adjacent subunits in the polymer chain (detailed in Materials and Methods).

(Top) Different polymer configurations were randomly perturbed by rotation of the subunits with respect to one another, and their conformations were optimized against the Dim60, Dim90, and Dim60H reconstructions. The correlation coefficient after perturbation and before optimization is shown on the x axis, while that after optimization is shown on the y axis. Values are expressed relative to subunits optimized into the density without restriction by a connecting linker. Flexible regions encompassed residues 357 to 368 in all models as well as 340 to 349 (H1), 340 to 352 (H2), and 309 to 328 (H3). (Bottom) The best-fitting model for each polymer configuration and for each of the three dimer EM structures is shown (α1-antitrypsin in blue and Fab4B12 in dark green) with respect to the fit of unconstrained subunits (shown in pink). Regions treated as flexible linkers during the optimization are highlighted in light green. For all three reconstructions, the C-terminal model corresponds with the optimum arrangement of subunits.

From an examination of the representative micrographs shown in Fig. 1 (C and D), the intersubunit angular relationships along the polymer chains are not solely accounted for by the Dim60 and Dim90 configurations. Instead, these structures likely correspond to more highly populated species along a continuum of intermediate states. To investigate the compatibility of the loop-sheet, C-terminal, and β-hairpin linkages with the arrangement of polymers seen in the micrographs, we used a method that optimized the 3D models to maximize their correspondence with the 2D polymer images. Stretches of residues connecting the dimer subunits were treated as flexible (as specified in Materials and Methods), while the α1-antitrypsin–Fab4B12 cores behaved as rigid bodies. A selection of 20 oligomers was chosen with different degrees of curvature and subunit orientation (Fig. 4C). Despite a lack of information along the z axis, this approach was able to discriminate between the models on the basis of their ability to adopt the shapes seen in the 2D polymer images: The highly constrained loop-sheet eight-residue insertion model (H1) performed significantly worse than the others (P < 0.0001). The flexibility of the C-terminal domain swap (H4) provided a better fit than the loop-sheet four-residue insertion model (H2) (P < 0.001), and the β-hairpin (H3) and C-terminal (H4) models were not distinguishable by this analysis (Fig. 4D).

Next, the compatibility of the loop-sheet, C-terminal, and β-hairpin configurations with the 3D Dim60, Dim90, and Dim60H reconstructions was evaluated. Each model was repeatedly randomly perturbed by rotation around the dimer long axis (through the α1-antitrypsin subunits) and energy minimized with respect to the EM structures and default stereochemical restraints using PyRosetta (20). This process was undertaken 1000 times for each combination of model and map. As before, the α1-antitrypsin–Fab4B12 subunits were treated as rigid bodies connected by a flexible linker region. The correspondence between each model and the target map was assessed by the cross-correlation function. These CCC values were denoted ccperturbed and ccrefined for each perturbed model before and after energy minimization, respectively. Benchmark maximum CCC values were obtained by performing model-free alignments of α1-antitrypsin–Fab4B12 subunits into each map in the absence of a linker region and are reported as ccoptimal, denoted by red shaded models in the bottom panels of Fig. 5.
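The perturb-then-minimize protocol can be caricatured with point coordinates in place of density maps: rotate a subunit around the dimer axis by a set angle, record a score before refinement (the analog of ccperturbed) and after realignment (the analog of ccrefined). The toy grid search below stands in for the PyRosetta energy minimization; the scoring function and all names are illustrative assumptions.

```python
import numpy as np

def rotate_z(coords, angle_deg):
    """Rotate an (N, 3) coordinate array around the z (dimer long) axis."""
    t = np.deg2rad(angle_deg)
    rot = np.array([[np.cos(t), -np.sin(t), 0.0],
                    [np.sin(t),  np.cos(t), 0.0],
                    [0.0,        0.0,       1.0]])
    return coords @ rot.T

def overlap_score(coords, target):
    """Toy stand-in for a map CCC: 1 for a perfect match, smaller otherwise."""
    return 1.0 / (1.0 + np.mean((coords - target) ** 2))

def perturb_and_refine(coords, perturb_deg, step=1.0):
    """Score a rotationally perturbed model before (cc_perturbed) and after
    (cc_refined) a grid-search realignment around the dimer axis."""
    perturbed = rotate_z(coords, perturb_deg)
    cc_perturbed = overlap_score(perturbed, coords)
    cc_refined = max(overlap_score(rotate_z(perturbed, a), coords)
                     for a in np.arange(-180.0, 180.0, step))
    return cc_perturbed, cc_refined
```

A model free to rotate (here, an unhindered rigid body) recovers a near-perfect score after refinement, mirroring how the linker-free benchmark defines ccoptimal; a restrictive linker would cap the recoverable score below that benchmark.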

The result of this analysis is shown in Fig. 5 (top, color-coded by hypothesis). The random rotational perturbations applied to each model resulted in a spread of preminimization CCC values along the horizontal axis, and minimization of these models generally showed convergence over a narrow range of CCC values on the vertical axis. The minimized structure giving the highest ccrefined/ccoptimal score for each polymer configuration (in rows) with respect to each map (in columns) is shown in Fig. 5 (bottom). By this analysis, the best-scoring C-terminal polymers (H4) exhibited a value close to one, indicating that the linkage-restrained models were essentially indistinguishable from the unrestrained ones; this was reflected by an almost direct superimposition of the model over the aligned linker-free subunits (top row). In contrast, the translational and rotational restrictions imposed by the linkers of the other models (H1–H3) prevented them, to varying degrees, from adopting the orientation preferred by the data (bottom three rows).

All models entail a connection between strand 4A of one α1-antitrypsin subunit and strand 1C of the next. A distinguishing characteristic of hypotheses H1–H3, with respect to the C-terminal model (H4), is that they involve a second unique intermolecular linkage. Having dual intermolecular constraints might be expected to reduce conformational flexibility, and this may contribute to their lesser compatibility with the density. To explore this, we performed a variation on the experiment in Fig. 5 in which the dual-linkage models were converted to single linkage by breaking the peptide bond between residues 358 and 359 of the strand 4A–1C connection, leaving the unique second linker that each model embodies intact. During iterative rounds of optimization, displacement between residues adjacent to the site of cleavage confirmed that this modification allowed additional freedom of movement of the subunits. At the conclusion of the experiment, the scores obtained were very similar to those obtained with the intact models (fig. S4, top). We also performed the converse experiment, in which the strand 4A–1C connection was kept intact, and the second unique linker of each model was broken (between residues 344 and 345 for H1–H2 and 324 and 325 for H3). This provided comparable results to the single-linkage C-terminal model (H4) (fig. S4, middle).

These results demonstrate that the head-to-tail orientation of α1-antitrypsin subunits, with the base of β-sheet A and the top of β-sheet C in proximity to one another, is an intrinsic feature of the dimer density. Therefore, for the dual-linker models, it is not the reduced flexibility that distinguishes them but the inconsistency of their second linkage with this subunit orientation.

Thus, the orientation provided by the C-terminal model is most compatible with the Dim60 and Dim90 structures present in liver-derived polymers. In the final structure, there are translations of 71 and 73 Å between the centers of mass of the α1-antitrypsin molecules and a final calculated rotation around the dimer axis of 65° and 81°, respectively (Fig. 6A, top and middle). The same analysis, performed using the Dim60H model derived from the heat-induced dataset, gave the same conclusion: The C-terminal model (H4) provided a fit consistent with the model-free aligned subunits (Fig. 6A, bottom). While there was a relative improvement in the fit of the loop-sheet four-residue insertion (H2) dimer, this model remained unable to adopt an optimal alignment to the experimental data (Fig. 5, right, and fig. S4, top right).

(A) Best-fitting C-terminal model (H4) displayed against the Dim60 (top), Dim90 (middle), and Dim60H (bottom) densities, annotated with intersubunit translations and rotations. Dashed lines represent vectors passing through the centers of mass of the α1-antitrypsin and Fab molecules. (B) Electrophoretic mobility shift assay comparing the affinity of the polymer-specific mAb2C1 for polymers of different origin. Binding of the antibody results in a cathodal shift of α1-antitrypsin polymers. Arrows highlight that cleavage-induced polymers, which are structurally analogous to C-terminal polymers, are readily recognized by the antibody with respect to denaturant-induced polymers. A schematic representation of P9 cleavage–induced polymers is shown at the left, with the domain-swapped peptide in black, based on PDB 1D5S (21). (C) Results of sandwich ELISA experiments showing the relative affinity of mAb2C1 for liver-derived, cleavage-induced, and denaturant-induced polymers, normalized to the half-maximal effective concentration (EC50) of the interaction with heat-induced polymers. The affinity of monomeric M and Z, denoted by open circles, was outside the maximum antigen concentration used in the experiment and, correspondingly, not less than two orders of magnitude worse than that of heat-induced polymers. Independent experiments are denoted by the markers, and the means ± SD are indicated by the bars (liver-derived and denaturant-induced, n = 3; cleaved, n = 6); heat-induced is by definition 1, represented by the dotted line; w.r.t., with respect to.

A neoepitope recognized by the mAb2C1 antibody is present in liver-derived and heat-induced polymers but not in those induced in the presence of a denaturant. Thus, the latter conditions produce a polymer structure not representative of pathological material (14, 16). Cleavage of the RCL of α1-antitrypsin at a noncognate position can also induce polymerization (3), and the atomic details of the resulting polymer linkage, defined by crystallography (21, 22), show that it produces a molecule that mimics a noncircular form of the C-terminal trimer (8). To determine whether mAb2C1 recognizes the open C-terminal configuration identified from the EM analyses, polymers mimicking this structure were produced by limited proteolysis of a recombinant Ala350Arg α1-antitrypsin mutant by thrombin. This material was readily recognized by mAb2C1, as demonstrated in a mobility shift experiment (Fig. 6B). The relative affinity of mAb2C1 for the different forms was then determined by enzyme-linked immunosorbent assay (ELISA). These experiments exhibited comparable recognition of liver-derived, heat-induced, and C-terminal–mimicking cleaved polymers by the antibody, with a markedly lower affinity for denaturant-induced polymers and monomer (Fig. 6C).
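The relative-affinity readout in the ELISA reduces to estimating an EC50 for each polymer form and normalizing to the heat-induced value. A minimal sketch using log-linear interpolation at the half-maximal response in place of a full four-parameter logistic fit; the method and the function names are assumptions for illustration.

```python
import numpy as np

def ec50_from_curve(conc, response):
    """EC50 estimated by log-linear interpolation at the half-maximal
    response. Assumes a monotonically increasing dose-response curve
    sampled over a sufficient concentration range."""
    half = (response.min() + response.max()) / 2.0
    # first concentration index where the response crosses half-maximal
    idx = int(np.argmax(response >= half))
    if idx == 0:
        return float(conc[0])
    x0, x1 = np.log10(conc[idx - 1]), np.log10(conc[idx])
    y0, y1 = response[idx - 1], response[idx]
    return float(10 ** (x0 + (half - y0) * (x1 - x0) / (y1 - y0)))

def relative_affinity(ec50_sample, ec50_reference):
    """Affinity of a polymer form relative to a reference (heat-induced in
    the text): values > 1 indicate tighter binding (lower EC50)."""
    return ec50_reference / ec50_sample
```

On this normalization, the liver-derived and cleavage-induced polymers in Fig. 6C cluster near 1, while denaturant-induced polymers and monomer fall far below it.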

α1-Antitrypsin deficiency is characterized by the accumulation of mutant protein as inclusions within hepatocytes. Extraction and disruption of these inclusions release chains of unbranched polymers, which, when isolated, exhibit pronounced flexibility and apparently lack higher-order interactions. Several models for the molecular basis of the formation and properties of these polymers have been proposed from in vitro experiments. On the basis of the observations that polymers are extremely stable and that artificially induced polymerization can be prevented by peptide mimics of the RCL, the first proposed loop-sheet molecular mechanism posited that the RCL of one molecule would incorporate into β-sheet A of the adjacent molecule (H1 and H2 in Figs. 1A and 5) (1). Since that time, while biophysical studies have attempted to address the question of mechanism, the only crystal structures that have been obtained of α1-antitrypsin oligomers are of forms produced artificially from recombinant nonglycosylated material: a chain of molecules spontaneously assembled following fortuitous cleavage by a contaminating protease (21, 22) and a circular trimer of a disulfide mutant produced by heating (H4) (8). Hence, there has been no direct evidence of the structure of the pathological polymers that deposit in the livers of patients with α1-antitrypsin deficiency.

The in vivo mechanism of α1-antitrypsin polymerization and accumulation in the liver has important consequences for the development of therapeutics that interfere with this process. The loop-sheet hypothesis (H1 and H2) involves relatively minor and reversible perturbations of the native conformation to adopt a polymerization-prone state (1), the C-terminal model (H4) predicates a preceding substantial and irreversible conformational change (8), and the β-hairpin model (H3) lies somewhere between the two (7). This has implications for the nature of the site and mode of ligand binding capable of blocking polymerization and, indeed, for the question of whether the process can be reversed at all.

Polymer material obtained from liver tissue is heterogeneous in size, glycosylated, and difficult to obtain in substantial quantity, making it unsuitable for crystallography. Without the requirement to form a crystal lattice, single-particle reconstruction using EM images represents an excellent option to obtain structural information. The negative-stain approach used here for the analysis of small protein complexes provided a strong contrast between protein and background and, in conjunction with decoration by Fab moieties, made angular information easier to retrieve, revealing the interactions between the components of the flexible polymer chains present in explant liver tissue.

Interrogation of the extant models of polymerization revealed that the loop-sheet dimer model (H1), despite its general compatibility with many biophysical observations, was unable to adopt the intersubunit translation or rotation observed in the 2D and 3D data (Figs. 4 and 5). A less stringent test of this model, a four-residue insertion loop-sheet configuration (H2) with an interchain interface analogous to one binding site of a tetrameric peptide blocker of polymerization (23), still provided an incomplete fit to the data. The β-hairpin domain swap model (H3), based on the structure of a self-terminating dimer of antithrombin, has been proposed to extend to α1-antitrypsin on the basis of polymerization induced by limited proteolysis and the resistance of a disulfide mutant to polymerization (7), a conclusion that has been questioned (16, 24) and is not supported by peptide fragment folding data (25). Owing to its longer predicted linking regions, its fit to the Dim60 and Dim90 data was better than that seen with the loop-sheet models (Fig. 5), but it required 20 residues to lose their native structure with respect to the antithrombin crystal structure from which this model is derived. While the crystal structure unequivocally demonstrates the ability of this form to adopt a 180° inversion orthogonal to the dimer axis, there was no evidence in the micrographs, either Fab-bound or unbound, of a chain inversion of this magnitude.

In contrast, the NS-EM data were best explained by the location, length, and flexibility of the C-terminal linkage (H4). The C-terminal mechanism involves displacement (or delayed formation) of the C-terminal 4-kDa fragment of α1-antitrypsin comprising strands 1C, 4B, and 5B (fig. S1) and self-insertion of the RCL, which results in a monomeric latent-like intermediate conformation (8). The open, non-self-terminating arrangement of the subunits (Fig. 6A) contrasts with the observation that oligomeric components of recombinant material purified from P. pastoris were circular (11).

The data obtained, including the intersubunit orientation and distance (Figs. 3, F and G, 5, and 6A) and the presence of the mAb2C1 epitope (Fig. 6B), support a structural equivalence of heat-induced and liver-derived polymers. Hence, it follows that there will be components shared between their respective polymerization pathways; it should accordingly be possible to extend mechanistic observations made in vitro to the mechanisms that produce polymers in vivo, and here, we draw on observations made in the literature regarding the role of strands 5A, 1C, 4B, and 5B and the breach region (Fig. 7). The ability to induce polymers from folded native α1-antitrypsin by displacement of the C-terminal region at modestly elevated temperatures in the Z variant implies that core packing interactions are readily destabilized when the molecule is in a five-stranded β-sheet A configuration. In the native conformation (Fig. 7, i), the Z variant has been noted to increase the mobility of strand 5A (26) and the solvation and rotational freedom (27) of the solvent-accessible (28) Trp194 residue that is situated in the breach region (Fig. 7, ii, bottom). The breach is bounded by a hydrophobic cluster of residues, including some contributed by strand 5A as well as the C-terminal strands 4B and 5B, on which solvation (as reported by Trp194) would be expected to exert destabilizing effects. This is supported by sequential polypeptide folding experiments, suggesting that engagement of ~36 residues at the C terminus is predicated on a properly formed strand 5A (25). A related process likely occurs on the opposing side of the molecule: Helices A, G, and H form a trihelix clamp over this region, and disruption of stabilizing interactions by the S (Glu264Val) and I (Arg39Cys) mutations (Fig. 7, ii, top) also leads to an increased tendency to polymerize upon the application of heat.
Moreover, the fact that S, I, and Z are able to copolymerize (29, 30) indicates that this occurs by a common mechanism and supports the mutual destabilization of the C-terminal region that is situated between them (Fig. 7, iii). This process is consistent with the site of polymerization-prone latch mutations clustered near the end of the polypeptide chain (31).

From the native state (i), the evidence suggests that during heating, decreased affinity for the C terminus can be induced by destabilization of the adjacent breach region with increased solvation of the hydrophobic core (26, 27), destabilization in the adjacent trihelix region (as in the S and I variants), and associated loss of strand 1C native interactions (ii and iii) (6, 24, 32). Upon dissociation of the C terminus, the molecule is equivalent to a final stage of folding of the nascent polypeptide chain (iv) (25). This (reversible) displacement is unable to immediately lead to self-insertion and generate the hyperstable six-stranded β-sheet A (25) despite delayed folding (34) (v), but such a change is able to proceed rapidly and irreversibly upon incorporation of the C terminus of another molecule (vi) (25, 33). Under appropriate conditions, the latent conformation is generated as an off-pathway species (vii) that is expected to be inaccessible once full RCL insertion has taken place (v) (17, 36). Asterisks denote Trp194 (blue) and Glu264/Arg39 (red), black and yellow arrows highlight structural changes, and symbols indicate the application of heat (triangle) or a hypothesized point of convergence with the nascent chain folding pathway (R).

The early (6) and necessary (24, 32) loss of native strand 1C contacts is consistent with the displacement of the C-terminal region (Fig. 7, iv). In this state, current evidence indicates that the molecule is equivalent to a final stage of the folding pathway (25). While the displaced C terminus (Fig. 7, iv) is relatively hydrophobic, in isolation, the equivalent C36 peptide has been found to be soluble, albeit fibrillogenic over a period of hours, and readily incorporated into native α1-antitrypsin at room temperature, inducing an increase in thermal stability consistent with transition to a self-inserted form (33). This suggests that displacement of this region is possible even at ambient temperature. While, by analogy with release of the RCL by proteolytic cleavage (fig. S1), it might be expected that release of the C terminus would immediately give rise to self-insertion of the untethered RCL as strand 4A, there is evidence that the absence of an engaged C terminus prevents this from occurring (25). This is congruent with the preferential folding of the protein to the kinetically stabilized five-stranded β-sheet A conformation rather than the loop-inserted six-stranded thermodynamically favored state (25), despite the adoption of the hyperstable form upon administration of exogenous C-terminal peptide (33) and the fact that some material does fold correctly to the active form even with the delayed folding of the Z variant (34).

Upon incorporation of the C terminus of another molecule (Fig. 7, v), self-insertion of strand 4A would be expected to follow (Fig. 7, vi) (33). The RCL of α1-antitrypsin is shorter than those of serpins known to undergo latency as a competing process to polymerization (35); once insertion has proceeded beyond a molecular decision point near the center of β-sheet A (17, 36), the molecule would no longer be able to (re-)incorporate its own C-terminal fragment (Fig. 7, vii), and it would effectively become irreversibly activated for oligomerization (Fig. 7, v). This mechanism is consistent with the suppression of polymerization in cells by a single-chain antibody fragment that alters the behavior of β-sheet A in the vicinity of helix F (12, 13) and with mutations that inhibit loop self-insertion (17).

Thus, of the proposed polymerization linkage models, our data most strongly support the C-terminal domain swap as the structural basis for pathological polymers of Z α1-antitrypsin. It remains to be determined how common exceptions to this mechanism are among other members of the serpin family. Serpins share a highly conserved core structure and exhibit common folding behaviors, and mutations that are associated with instability and deficiency tend to cluster within defined structural regions (37, 38). These factors likely place constraints on the mechanism by which mutations can induce polymerization. It is difficult to overlook the central role of the C terminus in both latency and the C-terminal domain swap, with the former essentially a monomeric self-terminating form of the latter (Fig. 7, v to vii). While a shorter RCL likely renders these two states mutually exclusive in α1-antitrypsin, it has been suggested that the greater tendency of plasminogen activator inhibitor-1 (PAI-1) to adopt the latent conformation is due to a common origin in the polymerogenic intermediate (35). In support of this, PAI-1 and the neuroserpin L49P variant can form polymers from the latent state (35, 39), a notable observation given the high stability of this conformation and one inconsistent with both the loop-sheet polymerization mechanism (which is predicated on a five-stranded native-like molecule) and the intermolecular strand 5A/4A linkage of the β-hairpin model.

On the other hand, it has been shown that distinct alternative polymerization pathways are accessible in vitro depending on the nature of the destabilizing conditions used. The crystal structure of a β-hairpin-swapped self-terminating dimer of antithrombin (7), produced by incubation of this protein in vitro at low pH, provides evidence of this. Similarly, induction of polymerization at acidic pH or with denaturants causes α1-antitrypsin to adopt a polymer form inconsistent with that seen upon heating or with pathological specimens from ZZ homozygotes (16). Biochemical evidence indicates that this may reflect the conformation of the rare α1-antitrypsin Trento variant (14).

From the data presented here, we expect the C-terminal domain swap to form the basis of pathological polymers in carriers of the Z α1-antitrypsin allele (and, by extension, the S and I variants) and therefore to account for more than 95% of cases of severe α1-antitrypsin deficiency. Because of its intimate association with the folding pathway and its relationship with the latent structure more readily adopted by other serpins, it is probable that this form will be relevant to other serpin pathologies. Whether the same linkage underlies the shutter-region mutants of α1-antitrypsin [such as Siiyama, Mmalton, and King's (2, 10)] that also cause polymer formation and severe plasma deficiency remains to be determined.

Human M and Z α1-antitrypsin were purified from donor plasma, and recombinant α1-antitrypsin was purified from Escherichia coli, as previously described (24, 40). Monoclonal antibodies were purified from hybridomas according to published methods (12) and stored in phosphate-buffered saline (PBS) with 0.02% (w/v) sodium azide. Fab fragments were generated by limited proteolysis using ficin or papain, as appropriate, with commercial kits according to the manufacturer's instructions (Thermo Fisher Scientific), with the subsequent addition of 1 mM E-64 inhibitor.

Explanted liver tissue (5 to 10 g) from individuals homozygous for the Z allele was homogenized and incubated at 37°C for 1 hour in 10 ml of Hanks' modified balanced salt solution with 5 mg of Clostridium histolyticum collagenase, and fibrous tissue was removed from the resultant suspension by filtration through BioPrep nylon synthetic cheesecloth with a 50-µm pore size (Biodesign). The filtrate was centrifuged at 3000g at 4°C for 15 min, the pellet was resuspended in 3 ml of 0.25 M sucrose in buffer E [5 mM EDTA, 50 mM NaCl, and 50 mM tris (pH 7.4)], and the sample was layered onto the top of two 14-ml centrifuge tubes (Beckman Coulter) containing a preformed 0.3 to 1.3 M sucrose gradient in buffer E and centrifuged at 25,000g for 2 hours at 4°C. The supernatant was discarded, and the pelleted inclusion bodies were washed with buffer E. Previous approaches to polymer extraction (41) have made use of detergents and denaturants, compounds that have been shown, under certain conditions, to induce conformational change in α1-antitrypsin (3); we therefore omitted their use. Soluble polymers were extracted by sonication on ice using a SoniPrep 150 with a nominal amplitude of 2.5 µm (giving a probe displacement of 17.5 µm) in bursts of 15 s with 15-s rests for a total of 6 min. The solution was repeatedly centrifuged for 5 min at 13,000g in a benchtop centrifuge to remove insoluble material. Purity of the soluble component was assessed by SDS- and nondenaturing PAGE.

For heat-induced polymers, purified plasma M α1-antitrypsin was buffer-exchanged into PBS at 0.2 mg/ml, and polymerization was induced by heating at 55°C for 48 hours. Denaturant-induced polymers were formed by incubation at 0.4 mg/ml and 25°C for 48 hours in 3 M guanidine hydrochloride and 40 mM tris-HCl (pH 8) buffer. Following dialysis, anion exchange chromatography using a HiTrap Q Sepharose column with a 0 to 0.5 M NaCl gradient in 20 mM tris (pH 8.0) was used to remove residual monomer, as confirmed by native PAGE.

An arginine residue was introduced at the P9 position (residue 350) of α1-antitrypsin in a pQE-30-based (Qiagen) expression system (17) using the QuikChange mutagenesis kit according to the manufacturer's instructions (Agilent). Following purification from E. coli, the protein was subjected to limited proteolysis overnight at 37°C by a 50-fold substoichiometric concentration of bovine thrombin (Merck), and polymer was isolated by anion exchange chromatography using a HiTrap Q Sepharose column with a 0 to 0.5 M NaCl gradient in 20 mM tris (pH 8.0).

Polymers were incubated with a threefold molar excess (with respect to subunit concentration) of Fab4B12 (12) for 2.5 hours at room temperature and repurified by anion exchange chromatography as described above or dialyzed overnight at 4°C into buffer E using a 300-kDa molecular weight cutoff membrane (Spectrum). Copper grids (300 mesh, Electron Microscopy Sciences) were covered with a continuous carbon film ~50 nm thick and glow discharged for 30 s. Three microliters of the prepared sample at ~0.05 to 0.1 mg/ml was applied to the prepared grids for 1 min before blotting. Samples were negatively stained for 1 min using 5 µl of 2% (w/v) uranyl acetate and blotted, and the staining step was repeated. For single-frame high-contrast micrographs, grids were visualized using an FEI Tecnai T12 BioTWIN LaB6 microscope operating at 120 keV, and images were recorded on an FEI Eagle 4K × 4K charge-coupled device camera under low-dose conditions (25 electrons/Å²) at an effective magnification of ×91,463 (1.64 Å per pixel) and a defocus range of 0.8 to 3.5 µm. Micrographs for single-particle reconstruction were recorded as averages of 30-frame, 30-frames/s movies using a Tecnai F20 field emission gun transmission electron microscope at 200 keV with a Direct Electron DE-20 direct detector at a calibrated magnification of ×41,470 (1.54 Å per pixel) under low-dose conditions (~1 electron/Å² per frame). Frames were motion-corrected using MotionCorr (42). The resulting images were corrected for the effects of the contrast transfer function of the microscope using CTFFIND3 (43). Micrographs with greater than 5% astigmatism were discarded. Manual particle picking was undertaken using EMAN (44). General processing scripts in Python made use of the EMAN2 (44), NumPy, SciPy, OpenCV, and Matplotlib libraries.
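The micrograph quality-control criterion above (discarding micrographs with greater than 5% astigmatism) can be sketched as a simple filter over per-micrograph CTF estimates. The defocus values and micrograph names below are made up for illustration, and astigmatism is computed as the defocus difference relative to the mean defocus, which is one common convention rather than necessarily the exact metric the authors used.

```python
# Hypothetical sketch of the micrograph QC step: defocus pairs would normally
# come from CTFFIND3 output; the values here are illustrative only.

def astigmatism_percent(defocus_u, defocus_v):
    """Astigmatism as a percentage of the mean defocus (both in angstroms)."""
    mean_defocus = (defocus_u + defocus_v) / 2.0
    return abs(defocus_u - defocus_v) / mean_defocus * 100.0

def keep_micrograph(defocus_u, defocus_v, threshold=5.0):
    """Retain micrographs whose astigmatism is at or below the threshold."""
    return astigmatism_percent(defocus_u, defocus_v) <= threshold

# Made-up CTF estimates (Å) for two micrographs
estimates = {"mic_001": (15000, 15300), "mic_002": (20000, 22500)}
kept = [name for name, (du, dv) in estimates.items() if keep_micrograph(du, dv)]
```

Here "mic_002" is rejected (its astigmatism is ~11.8%), while "mic_001" (~2.0%) is retained.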

RELION v2.1 and v3.0.6 (15) were used for single-particle reconstruction, including automated particle picking, 2D and 3D classification, and 3D refinement, with the final processing path described in detail in Results and fig. S2. In general, classification in RELION used a regularization parameter T = 2 and 25 iterations, or 50 iterations where convergence of statistics was not observed to have occurred. Image boxes were 230 × 230 pixels in size; for 2D processing, a mask diameter of 180 Å was used, and alignment was performed using an initial 7.5° angular interval with an offset search range of five pixels; for 3D processing, the mask diameter was 195 Å with a sampling of 15° and eight pixels; and 3D refinement used 195 Å, 7.5°, and five pixels, respectively. Masks were generated for 3D dimer references by contouring at ~3.8 × 10⁵ Å³ (or at noise), for monomer references at ~1.9 × 10⁵ Å³, both with the addition of a 7-voxel hard and 7-voxel soft edge. A 30-Å low-pass filter was applied to the resulting masked volumes before classification or refinement. After obtaining the Dim60 and Dim90 structures, the subsets of particle images on which they were based were subjected to a reference-free stochastic gradient-driven de novo reconstruction in RELION (sampling of 15° and two-pixel increments; 50 initial, 200 in-between, and 50 final iterations from 40 Å down to 20 Å). An equivalent model was returned in each case. Similarly, combining the two particle sets together and performing a 3D reclassification using the monomeric Monav reference (fig. S2o, left) effectively returned the same two models.
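The mask-generation recipe above (binarize the reference at a contour level, extend by a hard edge of several voxels, then add a smoothly decaying soft edge) can be approximated outside RELION with NumPy and SciPy. This is a hedged sketch of the idea, not RELION's exact implementation; the toy volume and the sigma chosen for the soft edge are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def make_soft_mask(volume, threshold, hard_voxels=7, soft_voxels=7):
    """Binarize at a density threshold, dilate to add a hard edge, then
    smooth so values fall off over roughly `soft_voxels` (soft edge)."""
    binary = volume > threshold
    hard = ndimage.binary_dilation(binary, iterations=hard_voxels)
    soft = ndimage.gaussian_filter(hard.astype(float), sigma=soft_voxels / 2.0)
    return np.clip(soft / soft.max(), 0.0, 1.0)  # normalize to [0, 1]

# Toy volume: a dense blob in the center of a 32^3 grid
vol = np.zeros((32, 32, 32))
vol[12:20, 12:20, 12:20] = 1.0
mask = make_soft_mask(vol, threshold=0.5)
```

A low-pass filter (the 30-Å step in the text) would then be applied to the masked volume; in a real map that requires the pixel size to convert Å to voxels.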

Proteins were resolved under denaturing conditions on NuPAGE 4 to 12% (w/v) acrylamide bis-tris SDS-PAGE gels and under nondenaturing conditions using NativePAGE 3 to 12% (w/v) acrylamide bis-tris gels (Thermo Fisher Scientific). Typical loading was 1 to 4 µg for visualization by Coomassie dye and 0.1 to 0.4 µg for Western blot. Western blot transfer to a polyvinylidene difluoride membrane was undertaken using the iBlot system (Thermo Fisher Scientific) or by wet transfer (Bio-Rad), followed by these steps: soaking in PBS for 10 min; blocking for 1 hour at room temperature with 5% (w/v) nonfat milk powder in PBS; incubation with primary antibody (rabbit polyclonal at 0.8 µg/ml or mouse monoclonal at 0.2 µg/ml) overnight at 4°C in PBS with 0.1% Tween (PBST), 5% (w/v) bovine serum albumin, and 0.1% sodium azide; washing with PBST; incubation with secondary antibodies at 1:5000 to 1:10,000 in PBST with 5% (w/v) bovine serum albumin and 0.1% sodium azide; and development by Pierce enhanced chemiluminescence (Thermo Fisher Scientific) or fluorescence (LI-COR).

High-binding enzyme immunoassay microplates (Sigma-Aldrich) were coated with 50 µl per well of anti-polymer mAb2C1 (2 µg/ml) in PBS with incubation overnight at room temperature, washed once with distilled water and twice with wash buffer [0.9% (w/v) sodium chloride and 0.025% (v/v) Tween 20], and blocked for 1 hour at room temperature with 300 µl per well of PBST buffer [PBS, 0.025% (v/v) Tween 20, and 0.1% (w/v) sodium azide] supplemented with 0.25% (w/v) bovine serum albumin (PBSTB). After washing the plates, antigens in PBSTB were applied by 1:1 serial dilution at a final volume of 50 µl across the plate, incubated for 2 hours at room temperature, and washed. Fifty microliters of rabbit anti-human α1-antitrypsin polyclonal antibody (1 µg/ml) (DAKO) in PBSTB was added to each well, the plates were incubated for 2 hours at room temperature and washed, 50 µl of a 1:2000 dilution of goat anti-rabbit horseradish peroxidase antibody in PBSTB (without sodium azide) was added to each well, and the plates were incubated in the dark for 75 min at room temperature and then washed again. For detection, 3,3′,5,5′-tetramethylbenzidine substrate solution (Sigma-Aldrich) was added at 50 µl per well, the plates were incubated for ~7 min in the dark, the reaction was stopped by adding 50 µl per well of 1 M H2SO4, and the absorbance was promptly measured at 450 nm in a SpectraMax M5 plate reader (Molecular Devices).
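A 1:1 serial dilution of antigen across the plate halves the concentration at each successive well. A minimal helper makes the resulting series explicit; the 10 µg/ml starting concentration below is a hypothetical value for illustration, not one stated in the protocol.

```python
def serial_dilution(start_conc, n_wells, factor=2.0):
    """Concentrations across a serial dilution row: each well holds the
    previous concentration divided by `factor` (2.0 for a 1:1 dilution)."""
    return [start_conc / factor**i for i in range(n_wells)]

# Hypothetical 8-well row starting at 10 µg/ml
concs = serial_dilution(10.0, 8)
```

This is the dilution pattern against which the ELISA absorbance readings would be fitted.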

For crystallization trials, protein was buffer-exchanged into buffer C [10 mM tris (pH 7.4), 50 mM NaCl, and 0.02% (w/v) sodium azide] and concentrated to 10 mg/ml. Broad-screen sitting-drop approaches against commercially available buffer formulations (Molecular Dimensions and Hampton Research) were performed with 100-nl protein:100-nl buffer drops dispensed using a Mosquito robot (TTP LabTech) and equilibrated against 75 µl of buffer at 16°C, with automatic image acquisition by a CrystalMation system (Rigaku). Hanging-drop screens were performed at 20°C with 1 µl of protein:1 µl of buffer equilibrated against 250 µl of buffer. Crystals mounted on nylon loops were briefly soaked in the respective crystallization buffer supplemented with 10% (v/v) glycerol ethoxylate or 10% (v/v) ethylene glycol before plunge-freezing in liquid nitrogen. Data collection was undertaken at the European Synchrotron Radiation Facility (ESRF) ID30B beamline (with enabling work at the Diamond I03 beamline). Data reduction, integration, scaling, and merging were performed using autoPROC (45); the structures were solved by molecular replacement using Phaser (46); model refinement was undertaken with PHENIX (47); and model visualization and building were performed with Coot (48).

Recombinant α1-antitrypsin was incubated at a substoichiometric ratio to Fab4B12 for an hour at room temperature, and excess Fab was removed by anion exchange as described above. After concentration of the complex to 10 mg/ml, 50 µl was applied to a Superdex 200 Increase 5/150 column (GE Life Sciences) at a rate of 0.3 ml/min in 30 mM NaCl and 50 mM tris (pH 7.4) buffer at the P12 BioSAXS beamline, European Molecular Biology Laboratory (EMBL) Hamburg (49). The x-ray scatter (λ = 1.24 Å) was recorded on a Pilatus 6M detector at 1 frame/s. The buffer baseline-corrected scatter profile was produced by integration over time points corresponding with elution of the complex from the size exclusion column using the ATSAS software package (50).
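The buffer baseline correction amounts to averaging detector frames across the elution peak of the complex and subtracting the averaged buffer frames, one value per scattering-angle bin. This is a minimal NumPy sketch of that idea, not the ATSAS implementation; the two-bin frames and frame indices are made up.

```python
import numpy as np

def corrected_profile(frames, peak_frames, buffer_frames):
    """Average SEC-SAXS frames over the elution peak and subtract the
    averaged buffer baseline. `frames` has shape (n_frames, n_q_bins)."""
    frames = np.asarray(frames, dtype=float)
    sample = frames[list(peak_frames)].mean(axis=0)
    baseline = frames[list(buffer_frames)].mean(axis=0)
    return sample - baseline

# Toy run: frames 0-1 are buffer-only, frames 2-3 contain the eluting complex
frames = [[1.0, 1.0], [1.0, 1.0], [3.0, 2.0], [3.0, 2.0]]
profile = corrected_profile(frames, peak_frames=[2, 3], buffer_frames=[0, 1])
```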

For initial working subunit and dimer models, Coot (48) and PyMOL (Schrödinger Software) were used to position crystal structures of α1-antitrypsin [PDB: cleaved, 1EZX (19); cleaved polymer, 1D5S (21)] or mAb4B12 (PDB: 6QU9) and to modify chain boundaries, repair gaps, and improve the stereochemistry of intermolecular segments. The initial β-hairpin and loop-sheet models (Fig. 1A, H1 to H3) were further optimized in PyRosetta (20). Superposition of the model of the α1-antitrypsin-Fab4B12 complex onto the dimer was undertaken using PyMOL. Modifications had to be made to each model to reconcile it with observations made here and in recent studies:

H1 and H2. Loop-sheet models have been represented with various degrees of insertion of the donor RCL into the site of strand 4A in the acceptor molecule. To explore the compatibility of this parameter with the flexibility and periodicity of the polymers visualized here, two forms were generated: one with a substantial eight-residue insertion (loop-sheet 8, H1) and one with a marginal interaction at the base of sheet A (loop-sheet 4, H2), based on the observation that tetrameric peptides are able to block polymerization and induce stabilization of α1-antitrypsin (18, 23). The loop insertion site is permissive of noncognate peptide residues; however, such out-of-register insertion has not been observed crystallographically for intra- or interprotein loop insertion. For the arrangements used here, inserted residues were maintained in register at their cognate positions, as observed in the structures of the cleaved protein, the cleavage-induced polymer (21), and the self-terminating dimer (7) and trimer (8).

H3. The hypothesized unwinding of helix I in the β-hairpin polymer has been challenged (16) and is inconsistent with the role of this element in the 4B12 epitope (13). The ability of Fab4B12 to bind to the ex vivo polymers is unequivocal from the images recorded here; thus, if the pathological polymer is reflected by the β-hairpin model, then helix I must remain intact.

H4. Contrary to a proposal that circular polymers are the predominant species (8, 11), most of those extracted from liver tissue were observed to be linear. Accordingly, the C-terminal dimer was arranged in an open configuration through redefinition of the chain boundaries in the crystal structure of a cleavage-generated polymer (21).

During optimization of Fab-bound α1-antitrypsin dimer models, the constituent subunits were treated as rigid bodies connected by flexible linker regions. As much intersubunit linker flexibility was allowed as possible while maintaining the integrity of the core α1-antitrypsin fold, consistent with serpin monomer and oligomer crystal structures and with the high stability of the polymer. Divergence from the canonical structure was permitted where this accorded with the characteristics of the model being tested and other experimental data. Specifically:

1) Although crystal structures of cleaved α1-antitrypsin polymers (21, 22), an antithrombin dimer (7), and an antitrypsin trimer (8) all have an intact strand 1C, it has been shown that this element is labile during the process of (heat-induced) polymerization (24, 32). Accordingly, we allowed the residues of this element (362 to 368) to move in all models.

2) All models of polymerization, whether structurally defined or modeled, propose a connection between the C terminus of strand 4A and the N terminus of strand 1C (residues 357 to 362). The evidence indicates that this region lacks secondary structure: In the cleaved form, it is not part of strand 4A or strand 1C; in the native structure, it does not form polar contacts with the body of the molecule; and it forms an extended chain in the latent conformation (36). Thus, this was treated as a flexible region.

3) The β-hairpin model (H3) involves a connection between helix I of the donor subunit and strand 5A of the acceptor. Limited proteolysis data were interpreted to support the unraveling of helix I in this polymer linkage, yet this is not a feature observed in the crystal structure of the antithrombin dimer on which the model is based (7), and this conclusion has been disputed (16). If the β-hairpin model is indeed representative of the polymers considered in this study, then helix I should be intact, as it is integral to the epitope of the non-conformation-selective Fab4B12 that decorates them (13). Hence, the region 309 to 328 between helix I and strand 5A was provided with full flexibility, which maintains the integrity of elements seen in the original crystal structure but allows all other linker residues to move.

4) All crystal structures exhibit an intact strand 5A, and while there is evidence of some lability of this structural element in the native conformation of a Z-like Glu342Ala mutant, this is not shared by the wild-type protein (26). For the loop-sheet models (H1 and H2), which propose connections between strand 5A of the donor subunit and strand 4A embedded in the acceptor, all connecting residues (340 to 348 for H1 and 340 to 352 for H2) were provided full torsional freedom during refinement.

The selection of polymers was performed manually by visual inspection of micrographs, followed by automatic thresholding and excision of regions of interest from the individual polymer images. Where a region of interest contained more than one chain, the image was postprocessed to remove density not related to the polymer of interest. Starting models of each polymer configuration at an appropriate length were generated by permutation of a seed dimer structure according to the number of subunits in an oligomer. The PyRosetta application programming interface (20) was then used, in which the α1-antitrypsin-Fab4B12 subunits were treated as rigid bodies connected by flexible linker regions; a full-backbone centroid model was used in which each side chain was represented by a single pseudoatom. Following an initial rigid-body step to approximately align the model with the image, loose positional constraints were applied to subunits according to the polymer path determined during the manual selections from the micrographs. Angular relationships with respect to the underlying substrate plane were inferred according to the extent of the orthogonal Fab protrusion observable, from 90° (evidence of increased density along the z axis only) to 0° (full-length protrusion in the XY plane). A necessary simplification, resulting in an implicit minimization of the magnitude of the angular displacement between subunits, was that these would tend to orient away from the underlying carbon substrate. Refinement of these models used an energy term that sought to increase the correlation between the experimental reference image and a 2D projection of the target 3D molecule. Standard stereochemical, repulsive, and attractive terms, and loose positional restraints, were maintained throughout.
Iterative refinement proceeded for a minimum of 10 steps of 25 iterations, after which convergence was deemed to have occurred when the root mean square deviation between the prerefined and postrefined models was less than 0.05 Å. The score for a given model-oligomer pair was calculated as the ratio of the best correlation coefficient observed during the optimization of the model against the oligomer to the best score observed for any model against that oligomer image.
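The convergence criterion and the relative scoring just described reduce to two small computations, sketched here with illustrative names and values: the 0.05 Å tolerance comes from the text, while the coordinates and correlation coefficients are made up.

```python
import numpy as np

def rmsd(coords_a, coords_b):
    """Root mean square deviation between two conformations (N x 3 arrays)."""
    diff = np.asarray(coords_a) - np.asarray(coords_b)
    return float(np.sqrt((diff ** 2).sum(axis=1).mean()))

def converged(before, after, tol=0.05):
    """Convergence test from the text: RMSD between pre- and postrefined
    models below 0.05 Å."""
    return rmsd(before, after) < tol

def relative_scores(best_cc):
    """`best_cc` maps model name -> best correlation coefficient against one
    oligomer image; scores are normalized by the best-performing model."""
    top = max(best_cc.values())
    return {name: cc / top for name, cc in best_cc.items()}

# Made-up correlation coefficients for two models against one polymer image
scores = relative_scores({"H1": 0.6, "H4": 0.8})
```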

For each dimer configuration [loop-sheet 8 (H1), loop-sheet 4 (H2), β-hairpin (H3), and C-terminal (H4)], 1000 repeated rounds of optimization were undertaken from a starting model randomly perturbed by rotation around the dimer axis. Full-atom models were represented as rigid subunits connected by flexible linkers. Optimization (using PyRosetta) involved an alternating sequence of whole-dimer rigid-body shifts and torsional optimization into the experimental density. The scoring scheme used to steer the process involved default internal stereochemical, attractive, and repulsive terms as well as the correlation of the atomic configuration with the EM density, with the relative weighting of these terms progressively adjusted during the iterative procedure. To avoid any contribution of the linker regions to the scores obtained, only the rigid core subunits were used in the calculation of the correlation coefficient with respect to the electron density. The van der Waals scoring term was monitored to exclude models in which unresolvable clashes occurred. Structures were visualized using Chimera (51) and PyMOL (Schrödinger Software).
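The progressive reweighting of internal energy terms against the density correlation can be sketched as a combined objective whose density weight is ramped over refinement stages. The linear ramp and the specific weights below are assumptions for illustration only; the authors' actual schedule is not stated in the text.

```python
def combined_score(internal_energy, density_cc, density_weight):
    """Lower is better: internal (stereochemical/attractive/repulsive) energy
    minus a weighted density-correlation reward."""
    return internal_energy - density_weight * density_cc

def weight_schedule(n_stages, w_start=1.0, w_end=10.0):
    """Linearly ramp the density weight across refinement stages
    (hypothetical schedule; endpoints are illustrative)."""
    step = (w_end - w_start) / max(n_stages - 1, 1)
    return [w_start + i * step for i in range(n_stages)]

weights = weight_schedule(4)
```

Early stages (low weight) let stereochemistry dominate; late stages pull the model harder into the density.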

Statistical analyses were performed using Prism 6 software (GraphPad, La Jolla, CA, USA). The significance of the difference in correlation between the 2D projections of the different polymer models and the polymer images in Fig. 4 was determined by one-way analysis of variance (ANOVA) and Tukey's multiple comparisons test; ***P < 0.001 and ****P < 0.0001. Mean values are reported throughout the text with SD or SEM, as indicated.
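The one-way ANOVA step can be reproduced outside Prism with SciPy; a Tukey post hoc comparison would then follow, as in the authors' workflow. The three groups of correlation coefficients below are made-up numbers chosen only to show the call.

```python
from scipy import stats

# Illustrative groups of correlation coefficients, one per polymer model
# (values are invented; in the paper these come from Fig. 4)
h1 = [0.61, 0.63, 0.60, 0.62]
h3 = [0.64, 0.66, 0.65, 0.63]
h4 = [0.92, 0.95, 0.93, 0.94]

# Does at least one group mean differ from the others?
f_stat, p_value = stats.f_oneway(h1, h3, h4)
```

With one group far above the others and small within-group spread, the F statistic is large and the p-value is far below 0.001.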

Tissue was used with the informed consent of donors and in accordance with local Institutional Review Boards.

Acknowledgments: We are indebted to M. Carroni (now at SciLifeLab) for collection of EM micrographs and training, and we would like to thank N. Lukoyanova and S. Chen at the ISMB Birkbeck EM Laboratory for support, training, and facility access (as well as D. Clare and Y. Chaban, now at eBIC, for antecedent enabling work) and N. Pinotis at the ISMB X-Ray Crystallography Laboratory for logistical support and facility access. We acknowledge the ESRF (Grenoble) for provision of synchrotron radiation facilities, and we would like to thank G. Leonard for assistance in using beamline ID30B; enabling work was performed on beamline I03 at the Diamond Light Source (proposal mx17201), and we would like to thank the staff for facility provision and technical support. The synchrotron SAXS data were collected at beamline P12 operated by EMBL Hamburg at the PETRA III storage ring (DESY, Hamburg, Germany), and we would like to thank M. Graewert and D. Franke for assistance. We acknowledge the contribution to this publication made by the University of Birmingham's Human Biomaterials Resource Centre, which has been supported through the Birmingham Science City – Experimental Medicine Network of Excellence project. We acknowledge the use of the UCL Grace High Performance Computing Facility (Grace@UCL) and the UCL Legion High Performance Computing Facility (Legion@UCL), and associated support services, in the completion of this work. Funding: This work was supported by a grant from the Medical Research Council (UK) to D.A.L. (MR/N024842/1, also supporting J.A.I. as RCo-I and B.G. as Co-I) and the NIHR UCLH Biomedical Research Centre. D.A.L. is an NIHR Senior Investigator. S.V.F. was the recipient of an EPSRC/GSK CASE studentship. E.L.K.E. was the recipient of a Wellcome Trust Biomedical Research Studentship to the ISMB. B.G. was supported for this work by a Wellcome Trust Intermediate Clinical Fellowship and is currently supported by the NIHR Leicester Biomedical Research Centre.
This work was funded, in part, by an Alpha-1 Foundation grant to J.A.I. The equipment used at the ISMB/Birkbeck EM Laboratory was funded by the Wellcome Trust (grants 101488 and 058736). Author contributions: S.V.F., E.L.K.E., A.R., and B.G. collected EM data. J.A.I., E.L.K.E., S.V.F., B.G., E.V.O., and M.B. analyzed EM data. A.M.J. and J.A.I. collected and analyzed crystallography data. A.M.J. and J.A.I. collected and analyzed SAXS data. E.L.K.E., S.V.F., I.A., and J.A.I. collected and analyzed biochemical data. J.A.I. performed modeling and wrote the computer code. S.V.F., E.L.K.E., N.H.-C., A.M.J., I.A., A.R., E.M., and J.A.I. prepared reagents. S.T.R., G.M.R., and D.H.A. provided reagents. E.M. provided advice and training. J.A.I., E.V.O., B.G., and A.R. supervised data collection and analysis. J.A.I., D.A.L., and E.V.O. supervised the project. J.A.I., S.V.F., E.L.K.E., and D.A.L. drafted the manuscript. All authors contributed to and approved the final manuscript. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. The Dim60, Dim90, and Dim60H maps have been deposited in the EMDB with accessions EMD-4632, EMD-4631, and EMD-4620. The crystal structure of Fab4B12 has been deposited as PDB accession 6QU9. Additional data related to this paper may be requested from the authors.

The structural basis for Z α1-antitrypsin polymerization in the liver – Science Advances

Breakout Paper in Journal of Theoretical Biology Explicitly Supports Intelligent Design – Discovery Institute

Photo: Red poppy, Auckland Botanic Gardens, Auckland, New Zealand, by Sandy Millar via Unsplash.

As John West noted here last week, the Journal of Theoretical Biology has published an explicitly pro-intelligent design article, "Using statistical methods to model the fine-tuning of molecular machines and systems." Let's take a closer look at the contents. The paper is math-heavy, discussing statistical models of making inferences, but it is also groundbreaking for this crucial reason: it considers and proposes intelligent design, by name, as a viable explanation for the origin of fine-tuning in biology. This is a major breakthrough for science, but also for freedom of speech. If the paper is any indication, appearing as it does in a prominent peer-reviewed journal, some of the suffocating constraints on ID advocacy may be coming off.

The authors are Steinar Thorvaldsen, a professor of information science at the University of Tromsø in Norway, and Ola Hössjer, a professor of mathematical statistics at Stockholm University. The paper, which is open access, begins by noting that while fine-tuning is widely discussed in physics, it needs to be considered more in the context of biology:

Fine-tuning has received much attention in physics, and it states that the fundamental constants of physics are finely tuned to precise values for a rich chemistry and life permittance. It has not yet been applied in a broad manner to molecular biology.

The authors explain the paper's main thrust:

However, in this paper we argue that biological systems present fine-tuning at different levels, e.g. functional proteins, complex biochemical machines in living cells, and cellular networks. This paper describes molecular fine-tuning, how it can be used in biology, and how it challenges conventional Darwinian thinking. We also discuss the statistical methods underpinning fine-tuning and present a framework for such analysis.

They explain how fine-tuning is defined. The definition is essentially equivalent to specified complexity:

We define fine-tuning as an object with two properties: it must a) be unlikely to have occurred by chance, under the relevant probability distribution (i.e. complex), and b) conform to an independent or detached specification (i.e. specific).
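The first half of this definition ("unlikely to have occurred by chance, under the relevant probability distribution") can be made numerically concrete. Under a uniform, independent-draws null model over 20 amino acids (an illustrative assumption, not a model taken from the paper), the probability of any one specific 100-residue sequence is vanishingly small.

```python
import math

def log10_prob_uniform_sequence(length, alphabet=20):
    """log10 probability of one specific sequence under a uniform,
    independent-draws model (an illustrative null model only)."""
    return -length * math.log10(alphabet)

lp = log10_prob_uniform_sequence(100)  # ≈ -130.1, i.e. probability ~10^-130
```

Whether such a small probability licenses a design inference is, of course, exactly the question the paper debates; the arithmetic itself is uncontroversial.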

They then introduce the concept of design, and explain how humans are innately able to recognize it:

A design is a specification or plan for the construction of an object or system, or the result of that specification or plan in the form of a product. The very term "design" is from the Medieval Latin word designare (denoting "mark out, point out, choose"), from de ("out") and signum ("identifying mark, sign"); hence, a public notice that advertises something or gives information. The design usually has to satisfy certain goals and constraints. It is also expected to interact with a certain environment, and thus be realized in the physical world. Humans have a powerful intuitive understanding of design that precedes modern science. Our common intuitions invariably begin with recognizing a pattern as a mark of design. The problem has been that our intuitions about design have been unrefined and pre-theoretical. For this reason, it is relevant to ask ourselves whether it is possible to turn the tables on this disparity and place those rough and pre-theoretical intuitions on a firm scientific foundation.

That last sentence is key: the purpose is to understand if there is a scientific method by which design can be inferred. They propose that design can be identified by uncovering fine-tuning. The paper explicates statistical methods for understanding fine-tuning, which they argue reflects design:

Fine-tuning and design are related entities. Fine-tuning is a bottom-up method, while design is more like a top-down approach. Hence, we focus on the topic of fine-tuning in the present paper and address the following questions: Is it possible to recognize fine-tuning in biological systems at the levels of functional proteins, protein groups and cellular networks? Can fine-tuning in molecular biology be formulated using state of the art statistical methods, or are the arguments just in the eyes of the beholder?

They cite the work of multiple leading theorists in the ID research community.

They return to physics and the anthropic principle, the idea that the laws of nature are precisely suited for life:

Suppose the laws of physics had been a bit different from what they actually are, what would the consequences be? (Davies, 2006). The chances that the universe should be life permitting are so infinitesimal as to be incomprehensible and incalculable. The finely tuned universe is like a panel that controls the parameters of the universe with about 100 knobs that can be set to certain values. If you turn any knob just a little to the right or to the left, the result is either a universe that is inhospitable to life or no universe at all. If the Big Bang had been just slightly stronger or weaker, matter would not have condensed, and life never would have existed. The odds against our universe developing were enormous and yet here we are, a point that equates with religious implications

However, rather than getting into religion, they apply statistics to consider the possibility of design as an explanation for the fine-tuning of the universe. They cite ID theorist William Dembski:

William Dembski regards the fine-tuning argument as suggestive, as pointers to underlying design. We may describe this inference as abductive reasoning or inference to the best explanation. This reasoning yields a plausible conclusion that is relatively likely to be true, compared to competing hypotheses, given our background knowledge. In the case of fine-tuning of our cosmos, design is considered to be a better explanation than a set of multi-universes that lacks any empirical or historical evidence.

The article offers additional reasons why the multiverse is an unsatisfying explanation for fine-tuning, namely that "multiverse hypotheses do not predict fine-tuning for this particular universe any better than a single universe hypothesis" and that "we should prefer those theories which best predict (for this or any universe) the phenomena we observe in our universe."

The paper reviews the lines of evidence for fine-tuning in biology, including information, irreducible complexity, protein evolution, and the waiting-time problem. Along the way it considers the arguments of many ID theorists, starting with a short review showing how the literature uses words such as "sequence," "code," "information," and "machine" to describe life's complexity:

One of the surprising discoveries of modern biology has been that the cell operates in a manner similar to modern technology, while biological information is organized in a manner similar to plain text. Words and terms like "sequence," "code," "information," and "machine" have proven very useful in describing and understanding molecular biology (Wills, 2016). The basic building blocks of life are proteins, long chain-like molecules consisting of varied combinations of 20 different amino acids. Complex biochemical machines are usually composed of many proteins, each folded together and configured in a unique 3D structure dependent upon the exact sequence of the amino acids within the chain. Proteins employ a wide variety of folds to perform their biological function, and each protein has a highly specified shape with some minor variations.
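The combinatorics implicit in this passage (chains drawn from an alphabet of 20 amino acids) are easy to make concrete: the number of distinct sequences grows as 20 raised to the chain length. The function below is a trivial arithmetic illustration, not code from the paper.

```python
def sequence_space_size(length, alphabet=20):
    """Number of distinct amino acid chains of a given length."""
    return alphabet ** length

# Even a short 10-residue chain has 20**10 = 10,240,000,000,000 possibilities
n10 = sequence_space_size(10)
```

For realistic protein lengths (hundreds of residues) the count dwarfs any physical resource, which is the quantitative backdrop to the fine-tuning arguments the article surveys.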

The paper cites and reviews the work of Michael Behe, Douglas Axe, Stephen Meyer, and Günter Bechly. Some of these discussions are quite extensive. First, the article contains a lucid explanation of irreducible complexity and the work of Michael Behe:

Michael Behe and others presented ideas of design in molecular biology, and published evidence of irreducibly complex biochemical machines in living cells. In his argument, some parts of the complex systems found in biology are exceedingly important and do affect the overall function of their mechanism. The fine-tuning can be outlined through the vital and interacting parts of living organisms. In Darwin's Black Box (Behe, 1996), Behe exemplified systems, like the flagellum bacteria use to swim and the blood-clotting cascade, that he called irreducibly complex, configured as a remarkable teamwork of several (often a dozen or more) interacting proteins. Is it possible on an incremental model that such a system could evolve for something that does not yet exist? Many biological systems do not appear to have a functional viable predecessor from which they could have evolved stepwise, and the probability of their occurrence in one leap by chance is extremely small. To rephrase the first man on the moon: That's no small steps of proteins, no giant leap for biology.

[…]

A Behe-system of irreducible complexity was mentioned in Section 3. It is composed of several well-matched, interacting modules that contribute to the basic function, wherein the removal of any one of the modules causes the system to effectively cease functioning. Behe does not ignore the role of the laws of nature. Biology allows for changes and evolutionary modifications. Evolution is there, irreducible design is there, and they are both observed. The laws of nature can organize matter and force it to change. Behe's point is that there are some irreducibly complex systems that cannot be produced by the laws of nature:

If a biological structure can be explained in terms of those natural laws [reproduction, mutation and natural selection] then we cannot conclude that it was designed... however, I have shown why many biochemical systems cannot be built up by natural selection working on mutations: no direct, gradual route exists to these irreducibly complex systems, and the laws of chemistry work strongly against the undirected development of the biochemical systems that make molecules such as AMP (Behe, 1996, p. 203).

Then, even if the natural laws work against the development of these irreducible complexities, they still exist. The strong synergy within the protein complex makes it irreducible to an incremental process. These structures are rather to be acknowledged as fine-tuned initial conditions of the constituting protein sequences. They are biological examples of nano-engineering that surpass anything human engineers have created. Such systems pose a serious challenge to a Darwinian account of evolution, since irreducibly complex systems have no direct series of selectable intermediates, and in addition, as we saw in Section 4.1, each module (protein) is of low probability by itself.

The article also reviews the peer-reviewed research of protein scientist Douglas Axe, as well as his 2016 book Undeniable, on the evolvability of protein folds:

An important goal is to obtain an estimate of the overall prevalence of sequences adopting functional protein folds, i.e. the right folded structure, with the correct dynamics and a precise active site for its specific function. Douglas Axe worked on this question at the Medical Research Council Centre in Cambridge. The experiments he performed showed a prevalence between 1 in 10^50 and 1 in 10^74 of protein sequences forming a working domain-sized fold of 150 amino acids (Axe, 2004). Hence, functional proteins require highly organised sequences, as illustrated in Fig. 2. Though proteins tolerate a range of possible amino acids at some positions in the sequence, a random process producing amino-acid chains of this length would stumble onto a functional protein only about once in every 10^50 to 10^74 attempts due to genetic variation. This empirical result is quite analogous to the inference from fine-tuned physics.

[…]

The search space turns out to be far too vast for blind selection to have even a slight chance of success. The contrasting view is innovation based on ingenuity, cleverness and intelligence. An element of this is what Axe calls functional coherence, which always involves hierarchical planning and hence is a product of fine-tuning. He concludes: "Functional coherence makes accidental invention fantastically improbable and therefore physically impossible" (Axe, 2016, p. 160).

The authors conclude that the literature shows the probability of finding a functional protein in sequence space can vary broadly, but commonly remains far beyond the reach of Darwinian processes (Axe, 2010a).
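The scale of these prevalence figures can be made concrete with some back-of-the-envelope arithmetic. This is an illustration only: the chain length and prevalence range are the values quoted from Axe (2004) above, and the calculation assumes a uniform chance model over sequences.

```python
import math

CHAIN_LENGTH = 150   # amino acids in a domain-sized fold (Axe, 2004)
ALPHABET = 20        # standard amino acids

# Size of the sequence space, kept in log10 units to avoid overflow:
# 20^150 = 10^(150 * log10(20)) ~ 10^195
log10_space = CHAIN_LENGTH * math.log10(ALPHABET)

# Quoted prevalence range: one functional fold per 10^50 to 10^74 sequences.
for log10_odds in (50, 74):
    # Implied count of functional sequences in the whole space.
    log10_functional = log10_space - log10_odds
    print(f"1 in 10^{log10_odds}: ~10^{log10_functional:.0f} functional "
          f"sequences out of ~10^{log10_space:.0f}")
```

Even on the more generous end of the quoted range, a blind search would still need on the order of 10^50 draws per success.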

Citing the work of Günter Bechly and Stephen Meyer, the paper also reviews the question of whether the fossil record allows sufficient time for complex systems to arise via Darwinian mechanisms. This is known as the waiting-time problem:

Achieving fine-tuning in a conventional Darwinian model: The waiting time problem

In this section we will elaborate further on the connection between the probability of an event and the time available for that event to happen. In the context of living systems, we need to ask whether conventional Darwinian mechanisms have the ability to achieve fine-tuning during a prescribed period of time. This is of interest in order to correctly interpret the fossil record, which is often interpreted as having long periods of stasis interrupted by very sudden, abrupt changes (Bechly and Meyer, 2017). Examples of such sudden changes include the origin of photosynthesis, the Cambrian explosion, the evolution of complex eyes and the evolution of animal flight. The accompanying genetic changes are believed to have happened very rapidly, at least on a macroevolutionary timescale, during a time period of length t. In order to test whether this is possible, a mathematical model is needed to estimate the probability P(A) of the event A that the required genetic changes in a species take place within a time window of length t.
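The relationship the passage describes, between an event's occurrence rate and the probability of seeing it inside a fixed time window, can be sketched generically with a Poisson model. This is not the paper's actual model, and the rates below are hypothetical placeholders.

```python
import math

def prob_within_window(rate_per_year: float, window_years: float) -> float:
    """P(A): probability that an event occurring at a constant Poisson
    rate `rate_per_year` happens at least once within `window_years`."""
    return 1.0 - math.exp(-rate_per_year * window_years)

# Hypothetical numbers: an event expected once per 10^12 years,
# observed over a 10^7-year window.
p = prob_within_window(1e-12, 1e7)
print(f"P(A) = {p:.3e}")
```

When the expected number of occurrences (rate times window) is small, P(A) is approximately rate times window, so halving the available time roughly halves the probability.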

Throughout the discussions are multiple citations of BIO-Complexity, a journal dedicated to investigating the scientific evidence for intelligent design.

Lastly, the authors consider intelligent design as a possible explanation of biological fine-tuning, citing heavily the work of William Dembski, Winston Ewert, Robert J. Marks, and other ID theorists:

Intelligent Design (ID) has gained a lot of interest and attention in recent years, mainly in the USA, creating public attention as well as triggering vivid discussions in the scientific community and among the public. ID aims to adhere to the same standards of rational investigation as other scientific and philosophical enterprises, and it is subject to the same methods of evaluation and critique. ID has been criticized, both for its underlying logic and for its various formulations (Olofsson, 2008; Sarkar, 2011).

William Dembski originally proposed what he called an explanatory filter for distinguishing between events due to chance, lawful regularity or design (Dembski, 1998). Viewed on a sufficiently abstract level, its logic is based on well-established principles and techniques from the theory of statistical hypothesis testing. However, it is hard to apply in many interesting biological contexts, because a huge number of potential but unknown scenarios may exist, which makes it difficult to phrase a null hypothesis for a statistical test (Wilkins and Elsberry, 2001; Olofsson, 2008).

The re-formulated version of a complexity measure published by Dembski and his coworkers is named Algorithmic Specified Complexity (ASC) (Ewert et al., 2013; 2014). ASC incorporates both Shannon and Kolmogorov complexity measures, and it quantifies the degree to which an event is improbable and follows a pattern. Kolmogorov complexity is related to compression of data (and hence patterns), but suffers from being uncomputable: there is no general method to calculate it. However, it is possible to give upper bounds for the Kolmogorov complexity, and consequently ASC can be bounded without being computed exactly. ASC is based on context and is measured in bits. The same authors have applied this method to natural language, random noise, folding of proteins, images, etc. (Marks et al., 2017).
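The compression bound mentioned here is straightforward to sketch. The following is a minimal illustration, not the published implementation: it ignores ASC's conditioning context and simply uses zlib's compressed length, in bits, as the upper bound on Kolmogorov complexity.

```python
import math
import zlib

def asc_lower_bound(data: bytes, log2_prob: float) -> float:
    """Lower bound on Algorithmic Specified Complexity, in bits.

    ASC(x) = -log2 P(x) - K(x).  K(x) is uncomputable, but the bit
    length of any lossless compression of x is an upper bound on it,
    so subtracting that bound yields a lower bound on ASC.
    """
    k_upper_bits = 8 * len(zlib.compress(data))
    return -log2_prob - k_upper_bits

# A highly repetitive string: improbable under a uniform chance model
# over byte strings of its length, yet strongly patterned (compressible).
x = b"AB" * 128                         # 256 bytes
log2_p = len(x) * math.log2(1 / 256)    # uniform model: -2048 bits
print(asc_lower_bound(x, log2_p))       # large positive value
```

A random string of the same length would compress poorly, driving the bound toward zero or below, which is the sense in which ASC separates "improbable and patterned" from merely improbable.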

[…]

The laws, constants, and primordial initial conditions of nature present the flow of nature. These purely natural objects discovered in recent years show the appearance of being deliberately fine-tuned. Functional proteins, molecular machines and cellular networks are both unlikely when viewed as outcomes of a stochastic model with a relevant probability distribution (having a small P(A)), and at the same time conform to an independent or detached specification (the set A being defined in terms of specificity). These results are important and deduced from central phenomena of basic science. In both physics and molecular biology, fine-tuning emerges as a uniting principle and synthesis, an interesting observation by itself.

In this paper we have argued that a statistical analysis of fine-tuning is a useful and consistent approach to model some of the categories of design: irreducible complexity (Michael Behe) and specified complexity (William Dembski). As mentioned in Section 1, this approach requires (a) that a probability distribution for the set of possible outcomes is introduced, and (b) that a set A of fine-tuned events, or more generally a specificity function f, is defined. Here (b) requires some a priori understanding of what fine-tuning means for each type of application, whereas (a) requires a naturalistic model for how the observed structures would have been produced by chance. The mathematical properties of such a model depend on the type of data that is analyzed. Typically a stochastic process should be used that models a dynamic feature such as stellar, chemical or biological (Darwinian) evolution. In the simplest case the state space of such a stochastic process is a scalar (one nucleotide or amino acid), a vector (a DNA or amino-acid string) or a graph (protein complexes or cellular networks).
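The ingredients (a) and (b) above can be illustrated with a toy Monte Carlo sketch, using entirely hypothetical numbers: a uniform chance model over short amino-acid strings, and a specificity function f that defines the target set A.

```python
import random

random.seed(0)  # reproducible sketch

ALPHABET = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard amino acids
TARGET = "MKT"                      # toy specification, purely illustrative

def f(seq: str) -> bool:
    """Specificity function: seq lies in the fine-tuned set A
    iff it matches the target motif exactly."""
    return seq == TARGET

# (a) chance model: uniform random strings; estimate P(A) by simulation.
trials = 200_000
hits = sum(
    f("".join(random.choice(ALPHABET) for _ in range(len(TARGET))))
    for _ in range(trials)
)
p_estimate = hits / trials
p_exact = (1 / len(ALPHABET)) ** len(TARGET)   # 1/8000
print(p_estimate, p_exact)
```

Even for a 3-letter target the hit rate is only about 1 in 8,000; the probabilities discussed in the paper arise from the same construction applied to vastly longer sequences.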

A major conclusion of our work is that fine-tuning is a clear feature of biological systems. Indeed, fine-tuning is even more extreme in biological systems than in inorganic systems. It is detectable within the realm of scientific methodology. Biology is inherently more complicated than the large-scale universe, and so fine-tuning is an even more pronounced feature. Still, more work remains in order to analyze more complicated data structures, using more sophisticated empirical criteria. Typically, such criteria correspond to a specificity function f that is not merely a helpful abstraction of an underlying pattern, such as biological fitness. One rather needs a specificity function that, although of non-physical origin, can be quantified and measured empirically in terms of physical properties such as functionality. In the long term, these criteria are necessary to make the explanations both scientifically and philosophically legitimate. However, we have enough evidence to demonstrate that fine-tuning and design deserve attention in the scientific community as a conceptual tool for investigating and understanding the natural world. The main agenda is to explore some fascinating possibilities for science and create room for new ideas and explorations. Biologists need richer conceptual resources than the physical sciences until now have been able to initiate, in terms of complex structures having non-physical information as input (Ratzsch, 2010). Yet researchers have more work to do in order to establish fine-tuning as a sustainable and fully testable scientific hypothesis, and ultimately a Design Science.

This is a significant development. The article gives the arguments of intelligent design theorists a major hearing in a mainstream scientific journal. And don't miss the purpose of the article, which is stated in its final sentence: to work towards "establish[ing] fine-tuning as a sustainable and fully testable scientific hypothesis, and ultimately a Design Science." The authors present compelling arguments that biological fine-tuning cannot arise via unguided Darwinian mechanisms. Some explanation is needed to account for why biological systems show the appearance of being deliberately fine-tuned. Despite the noise that often surrounds this debate, for ID arguments to receive such a thoughtful and positive treatment in a prominent journal is itself convincing evidence that ID has intellectual merit. Claims of ID's critics notwithstanding, design science is being taken seriously by scientists.

Read the original here:
Breakout Paper in Journal of Theoretical Biology Explicitly Supports Intelligent Design - Discovery Institute

These Enzyme-Mimicking Polymers May Have Helped Start Life on Earth – SciTechDaily

The micrograph shows uniform nanoparticles under 10 nm in diameter. Credit: Tony Z. Jia, ELSI

Earth-Life Science Institute scientists find that small highly branched polymers that may have formed spontaneously on early Earth can mimic modern biological protein enzyme function. These simple catalytic structures may have helped jump start the origins of life.

Most effort in origins of life research is focused on understanding the prebiotic formation of biological building blocks. However, it is possible early biological evolution relied on different chemical structures and processes, and these were replaced gradually over time by eons of evolution. Recently, chemists Irena Mamajanov, Melina Caudan and Tony Jia at the Earth-Life Science Institute (ELSI) in Japan borrowed ideas from polymer science, drug delivery, and biomimicry to explore this possibility. Surprisingly, they found that even small highly branched polymers could serve as effective catalysts, and these may have helped life get started.

In modern biology, coded protein enzymes do most of the catalytic work in cells. These enzymes are made up of linear polymers of amino acids, which fold up and double back on themselves to form fixed three-dimensional shapes. These preformed shapes allow them to interact very specifically with the chemicals whose reactions they catalyze. Catalysts help reactions occur much more quickly than they would otherwise, but don't get consumed in the reaction themselves, so a single catalyst molecule can help the same reaction happen many times. In these three-dimensional folded states, most of the structure of the catalyst doesn't directly interact with the chemicals it acts on, and just helps the enzyme keep its shape.

Metal sulfide enzymes could have originated from globular metal-sulfide/hyperbranched polymer particles. Credit: Irena Mamajanov, ELSI

In the present work, ELSI researchers studied hyperbranched polymers, tree-like structures with a high degree and density of branching, which are intrinsically globular without the need for the informed folding required by modern enzymes. Hyperbranched polymers, like enzymes, are capable of positioning catalysts and reagents, and of modulating local chemistry in precise ways.

Most effort in origins of life research is focused on understanding the prebiotic formation of modern biological structures and building blocks. The logic is that these compounds exist now, and thus understanding how they could be made in the environment might help explain how they came to be. However, we only know of one example of life, and we know that life is constantly evolving, meaning that only the most successful variants of organisms survive. Thus it may be reasonable to assume that modern organisms are not very similar to the first organisms, and it is possible that prebiotic chemistry and early biological evolution relied on different chemical structures and processes than modern biology does. As an analogy with technological evolution, early cathode-ray TV sets performed more or less the same function as modern high-definition displays, but they are fundamentally different technologies. One technology led in some ways to the creation of the other, but it was not necessarily its logical and direct precursor.

If this kind of scaffolding model for biochemical evolution is true, the question becomes: what sort of simpler structures, besides those used in contemporary biological systems, might have helped carry out the same sorts of catalytic functions modern life requires? Mamajanov and her team reasoned that hyperbranched polymers might be good candidates.

The team synthesized some of the hyperbranched polymers they studied from chemicals that could reasonably be expected to have been present on early Earth before life began. The team then showed that these polymers could bind small naturally occurring inorganic clusters of atoms known as zinc sulfide nanoparticles. Such nanoparticles are known to be unusually catalytic on their own.

As lead scientist Mamajanov comments, "We tried two different types of hyperbranched polymer scaffolds in this study. To make them work, all we needed to do was to mix a zinc chloride solution and a solution of polymer, then add sodium sulfide, and voila, we obtained a stable and effective nanoparticle-based catalyst."

The team's next challenge was to demonstrate that these hyperbranched polymer-nanoparticle hybrids could actually do something interesting and catalytic. They found that these metal sulfide-doped polymers degraded small molecules and were especially active in the presence of light; in some cases they accelerated the reaction by as much as a factor of 20. As Mamajanov says, "So far we have only explored two possible scaffolds and only one dopant. Undoubtedly there are many, many more examples of this remaining to be discovered."

The researchers further noted this chemistry may be relevant to an origins of life model known as the Zinc World. According to this model, the first metabolism was driven by photochemical reactions catalyzed by zinc sulfide minerals. They think that, with some modifications, such hyperbranched scaffolds could be adjusted to study analogs of iron- or molybdenum-containing protein enzymes, including important ones involved in modern biological nitrogen fixation. Mamajanov says, "The other question this raises is, assuming life or pre-life used this kind of scaffolding process, why did life ultimately settle upon enzymes? Is there an advantage to using linear polymers over branched ones? How, when and why did this transition occur?"

Reference: "Protoenzymes: The Case of Hyperbranched Polymer-Scaffolded ZnS Nanocrystals" by Irena Mamajanov, Melina Caudan and Tony Z. Jia, 13 August 2020, Life. DOI: 10.3390/life10080150

Excerpt from:
These Enzyme-Mimicking Polymers May Have Helped Start Life on Earth - SciTechDaily

Solution NMR readily reveals distinct structural folds and interactions in doubly 13C- and 19F-labeled RNAs – Science Advances

Abstract

RNAs form critical components of biological processes implicated in human diseases, making them attractive for small-molecule therapeutics. Expanding the sites accessible to nuclear magnetic resonance (NMR) spectroscopy will provide atomic-level insights into RNA interactions. Here, we present an efficient strategy to introduce 19F-13C spin pairs into RNA by using a 5-fluorouridine-5′-triphosphate and T7 RNA polymerase-based in vitro transcription. Incorporating the 19F-13C label in two model RNAs produces linewidths that are twice as sharp as the commonly used 1H-13C spin pair. Furthermore, the high sensitivity of the 19F nucleus allows for clear delineation of helical and nonhelical regions as well as GU wobble and Watson-Crick base pairs. Last, the 19F-13C label enables rapid identification of a small-molecule binding pocket within the human hepatitis B virus encapsidation signal epsilon (hHBV ε) RNA. We anticipate that the methods described herein will expand the size limitations of RNA NMR and aid with RNA-drug discovery efforts.

RNAs form essential regulators of biological processes and are implicated in human diseases, making them attractive therapeutic targets (1, 2). This extensive functional diversity of RNA derives from its ability to fold into complex three-dimensional (3D) structures. Yet, the number of noncoding RNA sequences far outstrips the number of solved RNA structures deposited in the Protein Data Bank (PDB) necessary for understanding RNA function (3, 4). In comparison to x-ray crystallography, nuclear magnetic resonance (NMR) spectroscopy provides high-resolution structural and dynamic information in solution, making it an ideal biophysical technique to characterize the interactions between target RNAs and small drug-like molecules. Nonetheless, NMR studies of RNA suffer from poor spectral resolution and sensitivity, both of which worsen with increasing molecular weight. In contrast with proteins, which are made up of 20 unique amino acid building blocks, RNAs are composed of only four aromatic residues. These four resonate over a very narrow chemical shift region. At high magnetic field strengths, sizable transverse relaxation rates (R2) cause line broadening and thereby decrease both sensitivity and resolution. These problems are further exacerbated with increasing molecular weight. To overcome these limitations of RNA NMR, novel labeling strategies are needed that expand the number of NMR probes beyond the traditional nonradioactive and stable isotope labels such as hydrogen-1 (1H), phosphorus-31 (31P), carbon-13 (13C), hydrogen-2 (2H), and nitrogen-15 (15N).

Solution NMR of the magnetically active fluorine-19 (19F) isotope offers clear advantages in the study of RNA structure and conformational changes that occur upon ligand binding. 19F has high NMR sensitivity (0.83 of 1H) due to a large gyromagnetic ratio that is comparable to 1H (0.94 of 1H), a 100% natural abundance, and ~6× wider chemical shift dispersion than 1H (5, 6). In addition, 19F is sensitive to changes in its local chemical environment (5, 6). In contrast with other commonly used NMR nuclei (1H/31P/13C/15N), 19F is virtually absent in biological systems, thereby rendering 19F NMR background-free. Together, these properties make 19F an attractive probe for incorporation into nucleic acids to study their structure, interactions, and dynamics in solution.
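The quoted sensitivity figure follows directly from the gyromagnetic ratios: at a fixed field, the intrinsic NMR signal of a spin-1/2 nucleus scales roughly as the cube of γ (one factor each from equilibrium polarization, precession frequency, and the induced detection voltage). A quick check using the ratio given in the text:

```python
# gamma(19F) / gamma(1H), as quoted in the passage above
gamma_ratio = 0.94

# Intrinsic sensitivity at constant field scales approximately as gamma**3
relative_sensitivity = gamma_ratio ** 3
print(f"19F sensitivity ~ {relative_sensitivity:.2f} of 1H")  # ~0.83
```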

Given its attractive spectroscopic properties, 19F was first incorporated into RNA for NMR studies in the 1970s (7–9). Since then, 19F has been successfully incorporated into DNA and RNA oligonucleotides for NMR analysis and used to probe RNA and DNA structure, conformational exchange, and macromolecular interactions (10, 11). Most of these studies were conducted on short oligonucleotides [~30 nucleotides (nt)] prepared by solid-phase synthesis with only a few residues 19F-labeled. Even when 2-fluoroadenine (2FA) was incorporated into a 73-nt (~22 kDa) guanine-sensing riboswitch, only 4 of the 16 signals could be assigned. This 2FA study hinted at the limitations of 19F NMR for large RNAs (12). Despite its attractiveness, the application of 19F NMR to study RNA has remained limited because the large 19F chemical shift anisotropy (CSA) contributes substantially to line broadening as a function of increasing molecular weight and polarizing magnetic field.

To circumvent this limitation, Boeszoermenyi et al. (13) recently showed that direct coupling of 19F to 13C allowed for cancellation of CSA and dipole-dipole (DD) interactions. By incorporating this 19F-13C spin pair into aromatic moieties of proteins and a 16-nt DNA, they showed that a transverse relaxation optimized spectroscopy (TROSY) version of a 19F-13C heteronuclear single-quantum coherence (HSQC) (13) provided improved spectroscopic properties. These exciting results hinted that installing 19F-13C pairs in RNA nucleobases should also lead to improved spectroscopic features.

However, there were no facile methods to readily incorporate 19F-13C spin pairs into RNA. To overcome this technical obstacle of incorporating fluorinated aromatic moieties into RNA, we provide here a straightforward chemoenzymatic synthesis of [5-19F, 5-13C]-uridine 5′-triphosphate (5FUTP) for incorporation into RNA (Fig. 1) using phage T7 RNA polymerase-based in vitro transcription. To showcase its versatility, we transcribed two model RNAs using these labels: the 30-nt (~10-kDa) human immunodeficiency virus type 2 transactivation response (HIV-2 TAR) element (6, 14) and the 61-nt (~20-kDa) human hepatitis B virus encapsidation signal epsilon (hHBV ε) element (Fig. 1) (15, 16).

(A) Model RNA systems: HIV-2 TAR (30 nt, 10 kDa) and hHBV ε (61 nt, 20 kDa). Residues highlighted in green are labeled with the 19F-13C 5-fluorouridine (5FU) shown in the box. Green circle, 19F; brown circle, 13C; blue circle, 2H. (B) Theoretical 19F,13C spectrum showing the four observable magnetization components of the 19F-13C spin pair as well as the decoupled resonance that has the average chemical shift and linewidth of all four components.

With our new labels, we demonstrate several advantages for RNA NMR studies, including improved resolution and increased sensitivity to ligand binding. We show that the 19F substitution is structurally nonperturbing and has an optimal TROSY effect at the readily available magnetic field strength of 600 MHz (1H frequency), in agreement with previous studies (13). Unlike C-H spectra, the resolving power of 19F allows for easy identification of RNA structural elements in helical and nonhelical regions, as well as in wobble GU base-paired regions. With protons substituted with deuterium, and depending on the molecular weight of the RNA, the TROSY effect in the 19F-13C pair can reduce the 13C linewidth by a factor >2 compared to a 13C-1H pair, and the 19F-13C label enables detection of small-molecule binding to a 20-kDa RNA. Thus, our 19F-13C label overcomes several of the limitations in sensitivity and resolution facing RNA NMR studies, with the potential to extend the application of solution NMR measurements to larger-molecular-weight systems in vivo.

Given the potential utility but unavailability of 19F-13C spin pairs in aromatic moieties of RNA, we first sought to develop a reliable and scalable method that combined chemical synthesis with enzymatic coupling in almost quantitative yields. This chemoenzymatic approach is a versatile method that combines chemical synthesis of atom-specifically labeled nucleobases with commercially available selectively labeled ribose, using enzymes from the pentose phosphate pathway (PPP) (3, 4). To this end, we adapted the method of SantaLucia et al. (17) and Kreutz and co-workers (18) and first synthesized the uracil base (U) specifically labeled with 13C at the aromatic C5 position and 15N at the N1 and N3 positions (Fig. 1). This synthesis is readily accomplished using unlabeled potassium cyanide, 13C-labeled bromoacetic acid, and 15N-labeled urea. The resulting U was converted to 5-fluorouracil (5FU) by direct fluorination with Selectfluor (19, 20). This strategy allows for efficient and cost-effective synthesis of the 5FU base with a high yield of ~63%. In addition, to remove unwanted scalar coupling interactions (14), we selectively deuterated H6 (~95%) using well-established methods (21). Next, using enzymes from the PPP, we coupled 5FU to D-ribose labeled at the C1 position to give 5FUTP (Fig. 1) (3, 22) with an overall yield of ~50%. This site-specifically labeled 5FUTP was then used for DNA template-directed T7 RNA polymerase-based in vitro transcription with overall yields comparable to those obtained with unmodified nucleotides.

Fluorine substitution at uridine C5 is thought to reduce the imino N3 pKa values by about 1.7 to 1.8 units with respect to their protonated analogs (23), leading to extensive line broadening of imino protons in 5FU RNAs (24). To determine whether incorporation of 5-fluorouridine alters the folding thermodynamics of our RNAs (Fig. 1), we recorded ultraviolet (UV) thermal melting profiles for both wild-type (WT) and 5FU HIV-2 TAR and hHBV ε (table S1). Both WT and 5FU RNAs showed a single transition in their melting profiles, consistent with unimolecular folding (25). WT and 5FU HIV-2 TAR had melting temperatures within ~2 K of each other (WT: Tm = 355.6 ± 0.5 K; 5FU: Tm = 357.4 ± 0.4 K). Similarly, 5FU hHBV ε had a melting temperature of 327.1 ± 0.1 K, which is within the error of the melting temperature of WT. Together, these results suggest that 5FU does not markedly alter the thermodynamic stability of HIV-2 TAR and hHBV ε, in accordance with previous studies of 5FU RNAs (6, 7, 24).

The linewidth of an aromatic 19F-13C spin pair (Fig. 1B) is expected to become dominated by the CSA mechanism with increasing polarizing magnetic field (13). To estimate this effect for 5FU, we calculated the chemical shielding tensor (CST) for the 19F-13C spin pair using density functional theory (DFT) methods (tables S2 and S3) (26–29). Using these CST parameters and the relaxation theory implemented in the Spinach library (30), we computed the TROSY R2 relaxation rates for the 19F-13C pair of 5FU (13CF and 19FC) and the 13C-1H pair of U (13CH and 1HC) (Fig. 2), assuming isotropic tumbling. The R2 of the fluorinated-carbon (13CF) TROSY resonance is ~2 times smaller than that of the protonated carbon (13CH) at their respective minima of ~600 and ~950 MHz, for all rotational correlation times greater than 5 ns (Fig. 2A). Compared with the decoupled resonance, the R2 of the 13CF TROSY resonance is ~3 times smaller than that of the protonated carbon for all correlation times greater than 5 ns (fig. S1). Although the TROSY effect is quite small for 19F nuclei bonded to 13C (19FC) and for 1H nuclei bonded to 13C (1HC), the R2 of 19FC is three times larger than that of 1HC (fig. S2). Thus, sensitive, high-resolution NMR spectra for the 19F-13C pair of 5FU in RNAs can be obtained by selective detection of the 13CF TROSY resonance, as demonstrated for 19F-13C pairs in aromatic amino acids (13).

(A) Theoretical curves showing the expected R2 values for the TROSY component of 13CF (cyan) and 13CH (magenta) as a function of magnetic field strength (expressed as the 1H Larmor frequency) for τc = 6 ns (dashed line), 25 ns (solid line), and 100 ns (dotted line) at 25°C. (B) Theoretical R2 values taken at the commercially available magnetic field strength closest to the maximum TROSY effect (13CH: 950 MHz; 13CF: 600 MHz) for τc = 6, 25, and 100 ns at 25°C.

To validate these theoretical TROSY predictions experimentally, we adapted the 1H-15N TROSY experiment (3, 31, 32) to perform a 19F-13C TROSY experiment on the ~10-kDa 5FU HIV-2 TAR and the ~20-kDa 5FU hHBV ε RNAs (Fig. 3). Because of hardware limitations, we could only run experiments that start with and end on the magnetization of 19F, with the 13C frequency encoded in the indirect dimension. That is, we used the so-called 19F-detected out-and-back method, rather than the more sensitive 19F-excited, out-and-stay, 13C-detected experiment (13). We collected spectra for each of the four components (Fig. 1B) of the 19F-13C (1H-13C) correlations for both 5FU (WT) HIV-2 TAR and hHBV ε (figs. S3 to S6).

(A) 19F-13C TROSY of 5FU HIV-2 TAR. (B) 1H-13C TROSY of WT HIV-2 TAR. (C) 19F-13C TROSY of 5FU hHBV ε. (D) 1H-13C TROSY of WT hHBV ε. The assignments of 5FU and WT HIV-2 TAR are indicated, as well as the arbitrary peak numbers for 5FU and WT hHBV ε. The same window size was used in all four spectra to aid comparison. Gray dashed boxes indicate signals from helical, GU, and nonhelical regions. For (D), the black box indicates a zoomed-in view of poorly resolved signals.

Both HIV-2 TAR and hHBV ε show ~6-fold improvement in the chemical shift dispersion of 19F compared with 1H, and similar dispersion in 13C (Fig. 3). All six correlations of HIV-2 TAR are well resolved for both 1H-13C and 19F-13C and are in agreement with previously published 1H-19F and 1H-13C RNA spectra (6, 24, 33). Nonetheless, even for this small RNA, the 19F-13C spin pair markedly improves the spectral resolution. 5FU HIV-2 TAR shows a chemical shift dispersion of 2.6 parts per million (ppm) in the 19F dimension, but only 0.5 ppm in the 1H dimension for WT (Fig. 3, A and B). Replacing 1H with 19F at C5 results in a slight reduction in chemical shift dispersion along the 13C dimension, from 2.1 to 1.5 ppm, although this effect is much smaller than the gain in resolution for 19F over 1H (Fig. 3, A and B). Similarly, the 19F resonances of 5FU hHBV ε are spread over 4.5 ppm, whereas the WT 1H signals resonate over a narrow 0.8-ppm window, a 5.7-fold better dispersion (Fig. 3, C and D). Again, substitution of 1H with 19F at C5 reduces the chemical shift dispersion along the 13C dimension for hHBV ε, from 2.3 to 1.7 ppm (Fig. 3, C and D). Of the anticipated 18 signals for hHBV ε, 16 are resolved for WT and 17 for 5FU. Together, these results demonstrate the marked gain in resolution afforded by the 19F-13C spin pair in 5FU RNAs compared with the 1H-13C spin pair in WT.

In addition to this considerable gain in resolution, 19F-13C labeling confers favorable 13CF TROSY linewidths. We compared the relative linewidths for both RNAs, which we assume to be Lorentzian (Figs. 4 and 5). For 5FU HIV-2 TAR, the 13CF TROSY linewidths were on average 1.5 times sharper than the anti-TROSY components, with a range of 1.3 to 1.7 (Fig. 4A). For WT HIV-2 TAR, the 13CH TROSY component was 3.7-fold narrower than the anti-TROSY component (range, 1.6 to 8.7) (Fig. 4B). Similarly, for 5FU hHBV, the 13CF TROSY linewidths were 2.2-fold narrower than the anti-TROSY ones (range, 1.5 to 3.3) (Fig. 4C). For WT hHBV, only 5 of the 16 13CH anti-TROSY signals were observed, and these were 2.6 times broader than the TROSY resonances (range, 2.0 to 3.3) (Fig. 4D). As predicted from our simulations (Fig. 2), the 13CF TROSY component relaxes ~2 times more slowly than the 13CH TROSY component in both HIV-2 TAR and hHBV. The 19FC TROSY linewidths for 5FU HIV-2 TAR and 5FU hHBV were 1.4 (range, 1.3 to 1.6) and 1.6 (range, 1.1 to 2.5) times narrower than the anti-TROSY components, respectively (Fig. 5, A and C). For both WT HIV-2 TAR and WT hHBV, the 1HC TROSY and anti-TROSY linewidths were comparable (Fig. 5, B and D). Consistent with our simulations, the 19FC TROSY linewidth is ~2-fold larger than that of the 1HC component for both RNAs (fig. S3), in line with the poor performance of directly detected 19F NMR experiments caused by the large CSA-induced relaxation. Thus, incorporating the 13C label mitigates the deleterious relaxation of the 19F nuclei within a 19F-13C spin pair. Nonetheless, even for medium-sized RNAs (~20 kDa), 19F TROSY detection of the 19F-13C spin pair still outperforms that of a 1H-13C spin pair. To reap the maximum benefits of this label, however, it is advantageous to monitor the 13C nuclei rather than the 19F nuclei.
We anticipate that the 19F-13C TROSY effect will continue to scale with molecular weight for RNAs, as was seen recently for proteins (13) and predicted by our simulations.
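The linewidth analysis above assumes Lorentzian lineshapes, whose full width at half maximum (FWHM) is set by the transverse relaxation rate R2. A minimal sketch with hypothetical R2 values (the factor-of-2 ratio mirrors the ~2-fold slower relaxation of the TROSY component described above):

```python
import math

# Sketch: the Lorentzian lineshape assumed in the linewidth comparison.
# FWHM = R2 / pi, so a component relaxing twice as slowly is twice as narrow.
# The R2 values are hypothetical, not the measured relaxation rates.

def lorentzian(freq_hz, center_hz, r2):
    """Absorption-mode Lorentzian with peak height 1/R2 at the center."""
    return r2 / (r2 ** 2 + (2.0 * math.pi * (freq_hz - center_hz)) ** 2)

def fwhm_hz(r2):
    """Full width at half maximum (Hz) of the Lorentzian above."""
    return r2 / math.pi

r2_trosy, r2_anti = 30.0, 60.0   # s^-1, hypothetical TROSY / anti-TROSY rates
print(f"TROSY:      {fwhm_hz(r2_trosy):.1f} Hz")   # ~9.5 Hz
print(f"anti-TROSY: {fwhm_hz(r2_anti):.1f} Hz")    # ~19.1 Hz
```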

Quantification of TROSY (black) and anti-TROSY (gray) (A) 13CF and (B) 13CH linewidths for HIV-2 TAR. Note that U40 was not observed in the anti-TROSY spectrum of WT HIV-2 TAR (B). In addition, the anti-TROSY component of U38 in (B) was 97 Hz and was truncated to fit in the plot. Quantification of TROSY (black) and anti-TROSY (gray) (C) 13CF and (D) 13CH linewidths for hHBV. Note that peaks 1 through 11 in WT hHBV were not observed in the anti-TROSY spectrum (D). The average ± SD in Hz is shown for the TROSY and anti-TROSY components in each plot. Peak numbers and assignments are given in Fig. 3.

Quantification of TROSY (black) and anti-TROSY (gray) (A) 19FC and (B) 1HC linewidths for HIV-2 TAR. Quantification of TROSY (black) and anti-TROSY (gray) (C) 19FC and (D) 1HC linewidths for hHBV. The average ± SD in Hz is shown for the TROSY and anti-TROSY components in each plot. Peak numbers and assignments are given in Fig. 3.

In addition to these gains in resolution and favorable linewidths, previous work suggested that 19F chemical shifts serve as sensitive markers of RNA secondary structure (10, 11). For example, GU wobble base pairs are deshielded and shifted by ~4.5 ppm to lower field compared with AUs in Watson-Crick geometries (34). On the basis of these earlier observations, we hypothesized that the 19F-13C correlations of HIV-2 TAR and hHBV can be grouped according to whether they lie in helical, nonhelical, or GU base-paired regions of the RNA. As a positive control, we note that the nonhelical U23, U25, and U31 of 5FU HIV-2 TAR resonate at ~165.5 ppm in 19F and ~142.5 ppm in 13C (Fig. 3A). On the other hand, the helical residues U38, U40, and U42 of 5FU HIV-2 TAR are centered at ~167.5 ppm in 19F and ~141.5 ppm in 13C, in line with previous observations for 19F-1H samples of HIV-2 TAR (6) and tRNA (34). Comparison with the equivalent 1H-13C spectra shown in Fig. 3B indicates that although helical residues cannot be distinguished from nonhelical residues in the 1H dimension, nonhelical residues can be differentiated from helical base pairs in the 13C dimension of a 1H-13C spin pair.

The 17 resolved 19F-13C correlations of 5FU hHBV show clustering similar to that of 5FU HIV-2 TAR (Fig. 3C). For instance, the six most intense signals are centered at ~165.5 ppm in 19F and ~142.5 ppm in 13C, where the nonhelical signals of HIV-2 TAR are located. On the basis of the secondary structure of hHBV (Fig. 1A), these six intense peaks belong to the six nonhelical uridines (U15, U17, U18, U32, U34, and U43) (Fig. 3C). A seventh peak is also seen in this region, most likely from U48 or U49, both of which flank the bulge region. The weaker peaks arise from the helical portions of hHBV, because these signals, located at ~167.5 ppm in 19F and ~141.5 ppm in 13C, resonate in the same region as the helical signals of 5FU HIV-2 TAR (6). HIV-2 TAR contains only Watson-Crick base pairs, and so signals in this region of the hHBV spectrum correspond to AUs (U3, U7, U38, U39, U47, U48, U49, and U56). Of the eight anticipated helical peaks, only seven are observed, further suggesting that U48 or U49 may fray and resonate within the nonhelical region. Unlike HIV-2 TAR, hHBV has four noncanonical GU wobble base pairs embedded within helical regions. The three signals resonating in a distinct region centered at ~163.5 ppm in 19F and ~142.0 ppm in 13C arise from the four GUs (U4, U9, U12, and U25), in line with previous observations of GU base pairs in tRNA (34); peak 5 (Fig. 3C) is most likely two overlapping GUs. Again, comparison with the equivalent 1H-13C spectra shown in Fig. 3D indicates that although helical residues can be distinguished from nonhelical residues, nonhelical residues cannot be differentiated from GU base pairs with a 1H-13C spin pair. Thus, the high sensitivity of 19F to the local chemical environment within a 19F-13C spin pair makes it possible to discriminate spectroscopically between helical and nonhelical regions, as well as between GU wobble and Watson-Crick base pairs, in RNA structures.
This distinguishing feature is not readily available for a 1H-13C spin pair.
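The grouping described above amounts to nearest-center classification in (19F, 13C) chemical shift space. A sketch using the approximate class centers quoted in the text (ppm values as printed there; treating them as fixed cluster centers is an assumption for illustration, not a fitted model):

```python
# Sketch: assign a 19F-13C correlation to a secondary-structure class by
# nearest class center. Centers are the approximate shift values quoted in
# the text; the test peaks are hypothetical.

CENTERS = {
    "nonhelical": (165.5, 142.5),
    "helical AU": (167.5, 141.5),
    "GU wobble":  (163.5, 142.0),
}

def classify(f_ppm, c_ppm):
    # Squared Euclidean distance in (19F, 13C) shift space; 19F dominates
    # because its class separation (~2 ppm) exceeds 13C's (~0.5 to 1 ppm).
    return min(CENTERS, key=lambda k: (CENTERS[k][0] - f_ppm) ** 2
                                      + (CENTERS[k][1] - c_ppm) ** 2)

print(classify(165.4, 142.6))  # nonhelical
print(classify(167.6, 141.4))  # helical AU
print(classify(163.6, 142.1))  # GU wobble
```

Such a rule is only usable because the 19F classes are well separated; as the text notes, the same trick fails in the 1H dimension of a 1H-13C spin pair.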

Ligand-based (35) and protein-observed (36) 19F NMR screening methods are important for identifying small drug-like molecules that act as protein inhibitors. Although most work to date has focused on proteins, recent work suggests that RNAs also contain specific binding pockets that can be distinguished and targeted with small molecules (1, 2). hHBV sits at the center of the viral replication cycle, because the first two residues in its internal bulge are used by the virus to initiate synthesis of the minus-strand DNA. Targeting this RNA structure would therefore notably expand the repertoire of HBV drug targets beyond the current focus on viral proteins (37). Given that 19F chemical shifts serve as sensitive markers of RNA secondary structure, we reasoned that 19F-13C spectroscopy should distinguish loop binders from helical-region binders. Rather satisfyingly, we found a small molecule that specifically binds a subset of nonhelical residues in 5FU hHBV (Fig. 6). Overlay of the full spectra of 5FU hHBV with and without the small molecule shows chemical shift perturbations (CSPs) (38) predominantly confined to nonhelical regions (Fig. 6). Within the nonhelical residues, only four of the seven signals shift upon addition of the small molecule, suggesting selectivity for certain nonhelical residues over others (Fig. 6). We propose a model whereby our small molecule binds hHBV in the 6-nt bulge formed between C14 and C19, but nowhere else in the RNA. The minor CSPs seen in the helical portion of the 5FU hHBV spectra come from U residues flanking the 6-nt bulge, specifically U47, U48, and U49. Last, the CSP seen in the GU region comes from U12, which also flanks our proposed binding pocket.
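CSP quantification of this kind can be sketched as a weighted two-dimensional perturbation, by analogy with the combined CSPs commonly used for 1H-15N data; the 13C weight ALPHA and the per-residue shift changes below are hypothetical, not values from the text:

```python
import math

# Sketch: a combined 19F/13C chemical shift perturbation (CSP). The weight
# ALPHA and all delta-shifts are illustrative assumptions; the residue names
# echo the bulge uridines discussed in the text.

ALPHA = 0.25  # assumed weight for the 13C dimension

def csp(d_f_ppm, d_c_ppm, alpha=ALPHA):
    """Combined CSP in ppm from the 19F and 13C shift changes."""
    return math.sqrt(d_f_ppm ** 2 + (alpha * d_c_ppm) ** 2)

# Hypothetical shift changes (ppm) upon adding the small molecule
shifts = {"U15": (0.08, 0.10), "U17": (0.02, 0.01), "U32": (0.12, 0.05)}
csps = {k: csp(*v) for k, v in shifts.items()}
avg = sum(csps.values()) / len(csps)              # cf. the dashed line in Fig. 6C
binders = [k for k, v in csps.items() if v > avg]  # residues flagged as perturbed
print(binders)
```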

(A) Overlay of 19F-13C-TROSY spectra for hHBV without (black) and with small molecule (SM, magenta). (B) Zoom-in of nonhelical residues showing chemical shift perturbations (CSPs) upon addition of SM. (C) Quantification of the CSPs upon addition of SM. The average (Ave) CSP is shown as a dashed line.

19F is an attractive spectroscopic probe for studying biomolecular structure, interactions, and dynamics in solution. Nonetheless, a number of obstacles must be overcome for it to become widely useful. First, we must be able to easily install the label into any biopolymer. While incorporation of fluorinated aromatic amino acids and nucleobases into proteins and nucleic acids is usually not a technical challenge, until now, synthesis of carbon-labeled, fluorinated nucleobases to create a 19F-13C spin pair has been problematic for RNA. Here, we present a facile strategy to incorporate 19F-13C 5-fluorouridine into RNA using in vitro transcription for characterization of small-molecule binding interactions by NMR. Our protocol to prepare 19F-13C 5-fluorouridine-5′-triphosphate (5FUTP) involves chemically synthesizing 5FU and then enzymatically coupling it to 13C-labeled D-ribose. Our synthetic strategy can be generalized to selectively place labels in the pyrimidine nucleobase at 15N1, 15N3, 13C2, 13C4, 13C5, or 13C6, or any combination thereof, and then enzymatically couple to the base a ribose labeled at 13C1′, 13C2′, 13C3′, 13C4′, or 13C5′, or any combination thereof. The resulting isotopically enriched 5FUTP is then readily incorporated into any desired RNA using DNA template–directed, T7 RNA polymerase–based in vitro transcription. This enzymatic approach, unlike solid-phase RNA synthesis, is limited neither to RNAs shorter than 70 nt nor to nucleotides made of a labeled nucleobase coupled to an unlabeled ribose. Although fluorine substitution at C5 in pyrimidines strongly affects the shielding of the nearby H6, it has little effect on the anomeric H1′ chemical shifts (24). We therefore anticipate that our unique strategy, which combines a ribose 13C1′ label with 19F-13C uracil, should allow the transfer of assignments from unmodified RNAs to 5-fluoropyrimidine–substituted RNAs made with our labels.

Second, because its van der Waals radius is comparable to that of 1H, 19F is considered minimally perturbing when incorporated into biopolymers (24). Although fluorine substitution in 5FU RNAs leads to sizeable line broadening of the imino protons, thermal melting analysis indicates that the 5FU RNAs are thermodynamically equivalent to their nonfluorinated counterparts (6, 7, 24). In future work, it will be important to systematically investigate the effect of fluorine substitution not only on the thermodynamic stability but also on the folding kinetics of RNAs. Insights derived from solving, at high resolution, the 3D structures of fluorinated and nonfluorinated RNA could potentially guide the use of these spin pairs to spy on biological processes within the cell.

Third, despite its huge potential, nucleic acid–observed 19F (NOF) NMR has remained underused because the large 19F CSA induces severe line broadening at high molecular weights and magnetic fields. Using DFT calculations of CST parameters, we show that an optimal 19F-13C TROSY enhancement occurs at a 600-MHz 1H frequency, enabling slow relaxation of the 13C bonded to 19F. Our RNAs show an enhanced 19F-13C TROSY effect with increasing molecular weight and 13C linewidths that are twice as sharp as those of traditional 1H-13C spin pairs. Thus, nucleobase 19F-13C TROSY will expand the applicability of RNA NMR beyond the current ~30-nt (~10-kDa) average.

Fourth, RNA secondary structure is made up of segments of nucleotides that are either base paired or unpaired. The arrangement of base-paired and unpaired regions leaves distinct NMR chemical shift signatures that can provide low-resolution structural information with minimal expenditure of time and cost. For example, the H5 of a pyrimidine is sensitive to the nature of the preceding residue within a triplet of canonical Watson-Crick AU and GC base pairs. When the A in a central UA base pair is substituted by a G, the H5 resonance shifts downfield because of the formation of the GU base pair. Yet an analysis of the commonly used 1H-13C probes fails to unambiguously separate nonhelical residues from helical ones (39). In contrast, the 19F-13C labels resonate in distinct chemical shift regions according to their secondary structure: nonhelical residues resonate in spectral regions distinct from helical ones, which are further separated into GU wobble and AU Watson-Crick base-paired regions. The ability to differentiate structural features in an RNA simply on the basis of chemical shifts removes the need for the time-consuming and laborious process of resonance assignment.

Given the ubiquity and functional importance of GU wobble base pairs (40) in all kingdoms of life (41), the ability to easily distinguish GU from canonical GC and AU base pairs has several important implications. For instance, in the minor groove, a GU base pair presents a distinctive unpaired exocyclic amino group, and the U's C1′ atom rotates counterclockwise compared with the C's C1′ atom in a canonical GC base pair. This region serves as an important site for protein-RNA interactions. Similarly, in the major groove, G N7 and O6 together with U O4 create an area of intense negative electrostatic potential conducive to binding divalent metal ions. Furthermore, all canonical Watson-Crick base pairs are circumscribed by ~10.6-Å diameters formed by a line connecting their C1′-C1′ centers. These ribose-connected centers are superimposable with almost perfect alignment. In contrast, a GU base pair is misaligned counterclockwise by a residual twist of +14°, and a UG base pair is misaligned clockwise by a residual twist of −11° (42). That is, the GU base pair is not isosteric with canonical Watson-Crick pairs; rather, these wobble base pairs either overtwist or undertwist the RNA double helix. 19F-13C labels might aid in elucidating the structural and dynamic basis of these twists depending on the identity of the base pairs neighboring the wobble pair. We therefore anticipate that our new label could open avenues for probing GU wobble pairs in the various structural contexts outlined above, such as 19F-13C–labeled RNA-protein interactions and metalloribozyme-ion interactions.

In summary, the labeling technologies presented here open the door to characterizing the structure, dynamics, and interactions of RNA, RNA-RNA, RNA-DNA, RNA-protein, and RNA-drug complexes in vitro and in vivo, for complexes as large as 100 kDa or more with the appropriate fluorine NMR hardware. This 19F-13C labeling approach will also enable correlating chemical shift–structure relationships to aid chemical shift–centered probing of RNA structure, dynamics, and interactions. We envision that the 19F-13C spin pair, by providing a clear demarcation of RNA structural elements, may facilitate the discovery and identification of small drug-like molecules that target RNA binding pockets in vitro and in vivo.

The full description of Materials and Methods can be found in the Supplementary Materials. A brief summary is provided here.

[5-19F, 5-13C, 6-2H]- and [5-19F, 5-13C, 6-2H, 1,3-15N2]5FU were synthesized from unlabeled potassium cyanide, 13C-labeled bromoacetic acid, and 15N-labeled urea as described elsewhere (3, 17, 18, 24). The resulting uracil was converted to 5FU by direct fluorination with Selectfluor and deuteration (19–21). [1,5-13C2, 5-19F, 6-2H]5FUTP and [1,5-13C2, 5-19F, 6-2H, 1,3-15N2]5FUTP were synthesized using PPP enzymes (3, 4, 6, 22, 24, 43).

All RNAs were prepared by in vitro transcription and purified as previously described (3, 4). RNA concentrations were approximated by UV absorbance using extinction coefficients of 387.5 mM−1 cm−1 for HIV-2 TAR and 768.3 mM−1 cm−1 for hHBV. All RNA concentrations were >0.5 mM (~0.3 ml) in Shigemi NMR tubes.
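Concentration estimates of this kind follow the Beer-Lambert law, A = ε·c·l. A sketch using the extinction coefficient quoted above for HIV-2 TAR; the absorbance reading, path length, and dilution factor are hypothetical:

```python
# Sketch: estimating RNA concentration from A260 via the Beer-Lambert law
# (A = epsilon * c * l). The extinction coefficient is the HIV-2 TAR value
# quoted in the text; the reading and dilution are made up for illustration.

def conc_mM(a260, eps_per_mM_cm, path_cm=1.0):
    """Concentration in mM from absorbance at 260 nm."""
    return a260 / (eps_per_mM_cm * path_cm)

dilution = 100                              # hypothetical 1:100 dilution
stock_mM = conc_mM(0.31, 387.5) * dilution  # undiluted stock concentration
print(f"{stock_mM:.3f} mM")
```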

We collected thermal melting profiles for both WT and 5FU-substituted HIV-2 TAR and hHBV as previously described (24, 25).

Calculations were carried out on 1-methyl-uracil and 5-fluoro-1-methyl-uracil using optimized geometries (26–28). All calculations used the Gaussian-16 program (29). Details are provided in the Supplementary Materials.

All 19F-13C TROSY spectra were collected at 298 K on a Bruker 600-MHz Avance III spectrometer equipped with TXI (triple resonance inverse) and BBI (broad band inverse) probes. All data were processed with Bruker's TopSpin 4.0.7 software. 1H chemical shifts were internally referenced to DSS (0.00 ppm), and 13C chemical shifts were referenced indirectly using the 13C/1H gyromagnetic ratios (44). The 19F chemical shifts were internally referenced to trifluoroacetic acid (−75.51 ppm) (45). Experiments showing each component of the 1H/19F-13C correlations were adapted from a sensitivity- and gradient-enhanced 1H-15N TROSY used for proteins (31).
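Indirect referencing of this kind derives the 13C zero-ppm frequency from the measured 1H reference frequency via a fixed frequency ratio. A sketch using the IUPAC-recommended Ξ ratio for 13C on the DSS scale; the spectrometer frequency below is hypothetical:

```python
# Sketch: indirect 13C chemical shift referencing from the 1H DSS frequency.
# XI_13C is the IUPAC-recommended 13C/1H frequency ratio on the DSS scale;
# the 600-MHz carrier frequency is a hypothetical example value.

XI_13C = 0.251449530  # IUPAC Xi ratio for 13C (DSS reference)

def carbon_zero_hz(h1_dss_hz):
    """Absolute frequency (Hz) of 13C 0 ppm, given the 1H DSS frequency."""
    return XI_13C * h1_dss_hz

h1_ref = 600.13e6  # Hz, hypothetical 1H frequency of DSS on a 600-MHz magnet
print(f"{carbon_zero_hz(h1_ref) / 1e6:.4f} MHz")  # ~150.9024 MHz
```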

M. J. Frisch, G. W. Trucks, H. B. Schlegel, G. E. Scuseria, M. A. Robb, J. R. Cheeseman, G. Scalmani, V. Barone, G. A. Petersson, H. Nakatsuji, X. Li, M. Caricato, A. V. Marenich, J. Bloino, B. G. Janesko, R. Gomperts, B. Mennucci, H. P. Hratchian, J. V. Ortiz, A. F. Izmaylov, J. L. Sonnenberg, D. Williams-Young, F. Ding, F. Lipparini, F. Egidi, J. Goings, B. Peng, A. Petrone, T. Henderson, D. Ranasinghe, V. G. Zakrzewski, J. Gao, N. Rega, G. Zheng, W. Liang, M. Hada, M. Ehara, K. Toyota, R. Fukuda, J. Hasegawa, M. Ishida, T. Nakajima, Y. Honda, O. Kitao, H. Nakai, T. Vreven, K. Throssell, J. A. Montgomery Jr., J. E. Peralta, F. Ogliaro, M. J. Bearpark, J. J. Heyd, E. N. Brothers, K. N. Kudin, V. N. Staroverov, T. A. Keith, R. Kobayashi, J. Normand, K. Raghavachari, A. P. Rendell, J. C. Burant, S. S. Iyengar, J. Tomasi, M. Cossi, J. M. Millam, M. Klene, C. Adamo, R. Cammi, J. W. Ochterski, R. L. Martin, K. Morokuma, O. Farkas, J. B. Foresman, D. J. Fox, Gaussian 16, Revision A.03 (2016).

Acknowledgments: We thank P. Deshong, J. Kahn, L.-X. Wang, and P. Y. Zavalij (University of Maryland) and H. Arthanari (Harvard University) for the helpful comments. We thank S. Bentz and D. Oh for help in preparing samples for thermal melt analysis, and M. Svirydava for help in analyzing samples by mass spectrometry. Funding: We thank the National Science Foundation (DBI-1040158 to T.K.D. for NMR instrumentation) and the NIH (U54AI50470 to T.K.D. and D.A.C.) for support. Author contributions: T.K.D.: conceptualization. T.K.D. and O.B.B.: implementation of the project and manuscript preparation. G.Z., B.C., K.M.T., and T.K.D.: synthesis of 5FU. O.B.: synthesis of 5FUTP, RNA synthesis, and thermal melt analysis. T.K.D., K.M.T., B.C., and O.B.B.: TROSY measurements. O.B.B.: small-molecule titration. D.A.C.: DFT calculations. Competing interests: The authors declare that they have no competing interests. Data and materials availability: All data needed to evaluate the conclusions in the paper are present in the paper and/or the Supplementary Materials. Additional data related to this paper may be requested from the authors.

Originally posted here:
Solution NMR readily reveals distinct structural folds and interactions in doubly 13C- and 19F-labeled RNAs - Science Advances

After COVID-19, capital will be different, stronger and more conscious – SmartCompany.com.au

VentureCrowd executive director and Maarbani Consulting managing director. Source: supplied.

Let's talk about sushi. I love the stuff. High protein, low fat, minimalist Japanese perfection. It's great with soy, pickled ginger and a sprinkle of microplastic.

Oh, you didn't know? Let me explain.

The modern lightweight shopping bag was invented by Swedish engineer Sten Gustaf Thulin in the early 1960s (stay with me, the connection is coming). Thulin developed and patented a method of forming a one-piece bag by folding, welding and die-cutting a flat tube of plastic for the packaging company Celloplast.

Nowadays, nearly 1 trillion plastic bags are consumed worldwide every year: that's over 1 million per minute. Needless to say, the lifetime value of a plastic bag customer is a rock-solid metric, and a bunch of investors are making a ship-load of cash from this little beauty.

Convenient, cheap, disposable and, as it turns out, delicious.


You see, most plastic trash in the oceans flows from land. Once at sea, sunlight, wind and wave action break down plastic waste into small particles called microplastics. They are less than 5 millimetres in length (about the size of a sesame seed) and have been found in municipal drinking water systems, drifting through the air, and in the seafood we eat.

Ah, the circle of life.

In just one generation, we went from being plastic-free (pause for effect) to a level of reliance on plastic that results in 12 million tonnes of plastic entering our oceans every year. That's a full rubbish truck every minute.

But, whatever. We're all making money, right?

Oh, Thulin. Insert facepalm. As my mother would say, "I'm not angry, I'm just disappointed."

The new reality is that the global investment landscape is changing.

A new generation of investors is awakening and they don't want plastic in their sushi. Backed by the largest intergenerational transfer of wealth in modern history, this group is demanding the opportunity to support companies that fund more sustainable futures and solve real-world problems.

In the post-COVID-19 world, capital will be different, stronger, and more conscious.

Even before the pandemic changed our everyday lives, companies contributing to climate change were being called to account as Australia experienced its worst bushfire season on record.

Investors alarmed at the impact of companies damaging the environment have begun to look at the impact of their own investments, and whether those investments are aligned with their values. When people began to dig a little deeper and uncovered where their money was going, the floodgates opened.

In January, Ethical Super saw its net inflow increase five-fold compared with January 2019, with the fund citing increased awareness of climate change as the reason behind the growth.

The changes are not just being seen in retail investment.

Recently, over half of Woodside Petroleum's investors backed motions for the company to commit to hard targets for the reduction of its greenhouse gas emissions. As more people realised that the power to choose is in their hands, the shift towards more ethical investments began.

At the same time, the impact of the pandemic has caused many aspects of globalisation to come to a screeching halt, accelerating the pace of transformation for industries across the world. In times like these, innovation flourishes.

Uber, Airbnb and WhatsApp were all founded during the 2009 global financial crisis, underlining that some of the biggest disruptive opportunities arise during major economic downturns.

Square Peg co-founder Paul Bassat concurs: "Every time there's been a major crisis, we've seen this burst of innovation occur where there's a combination of problems needing to be solved as a result, as well as people having a chance to think differently about their career and their lives."

In the midst of the global pandemic, the Australian venture capital sector actually grew. The KPMG Venture Pulse Q1 2020 report found that investment in Australian startups reached a record high of US$944.7 million ($1.314 billion) in H1 2020.

Clearly VC firms are grasping at the opportunities. But they are not the only ones able to reap the potential benefits.

Changes to Australian legislation in 2017 have seen the creation of investment opportunities for retail investors that were previously only available to high-net-worth individuals or sophisticated investors.

If they meet the criteria, these investors are able to invest up to $10,000 in private companies launching fundraises of up to $5 million, cementing the fact that startup investment is no longer just for VCs and angel investors.

Investors have generally been motivated by two things: the opportunity to back the companies changing the world, and the outsized returns of startup investment.

As we move towards a post-COVID world, investments also need to be good for the planet.

As a new generation of investors increasingly begins to focus on the positive impacts its funding decisions can make on the world, startups will need to prove their social and environmental credentials as well as their ability to disrupt and grow.

When they do that, investors will follow and we can all enjoy sushi again, without the microplastic.



Visit link:
After COVID-19, capital will be different, stronger and more conscious - SmartCompany.com.au

Rogue Waves: Freaks of Nature Studied with Math and Lasers – Inside Science News Service

The elusive waves, once thought to be myths, are explained by the same math that's found in a wide range of settings.

(Inside Science) -- During Columbus' third voyage to the Americas, as his six-ship fleet sailed around the southern tip of Trinidad, an island just off the coast of Venezuela, they encountered a freak wave taller than the ships' masts. The wave hoisted the ships up to its peak before dropping them down into a huge trough. Columbus would later name the passageway Boca del Serpiente -- Mouth of the Serpent -- for the ferocity of its waters.

Once regarded as myths or pieces of folklore, rogue waves can spike out of nowhere and dissipate just moments later, terrifying sailors and sinking ships. Half a millennium would pass after Columbus' encounter before the first rogue wave was measured by a scientific instrument. On New Year's Day of 1995, the Draupner oil platform, perched in the North Sea off the coast of Norway, spotted a wave 84 feet tall -- more than twice the height of its neighboring waves.

Like stock market crashes and devastating earthquakes, the study of rogue waves has been plagued by the scarcity of data.

"The Draupner wave was the first time that a rogue wave was actually observed by a scientific instrument; before that it was all just people telling about it. But if we want to learn more about these waves, we'll need to obtain better statistics and more data," said Tobias Grafke, a physicist from the University of Warwick in the U.K. He is an author of a paper published in the journal Physical Review X that explored the probabilities of rogue waves from a statistical perspective.

"It's a very localized phenomenon that comes out of nowhere. I mean, you can just put certain measurement points somewhere and hope that a rogue wave would come by, but it's very, very rare," said Hadas Frostig, a physicist from Boston University not involved in Grafke's paper.

Moreover, rogue waves are so strong that they often destroy the instruments trying to measure them, said Grafke.

Due to the difficulty of collecting real-world data -- even a team of satellites would probably struggle to spot the fleeting, unpredictable waves -- researchers have mostly studied rogue waves in wave pools, dialing in specific conditions that might generate a rogue wave.

An in-lab rogue wave experiment by researchers from the University of Oxford and the University of Edinburgh. [Credit: Ton van den Bremer and Mark McAllister at the University of Oxford.]

Scientists think rogue waves can be generated via two main mechanisms. In the first, waves of different wavelengths, peaking at the same spot, combine to build a massive wave. Because each of their amplitudes simply adds up to form the final height of the rogue wave, it is referred to as a linear process. In contrast, the second mechanism is nonlinear and has to do with how waves with different wavelengths interact and exchange energy with each other. (Check out this infographic by Quanta Magazine that explains the difference between the two concepts.)

A rogue wave can be built linearly or nonlinearly, or a combination of both.
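The linear mechanism can be illustrated numerically: cosine waves of different wavelengths, phased to crest at the same point, sum to one large peak while largely canceling elsewhere. A sketch with hypothetical amplitudes and wavenumbers:

```python
import math

# Sketch: the linear rogue-wave mechanism. Twenty cosine waves of different
# wavelengths are phased so that every crest aligns at x = 0; their
# amplitudes simply add there, while away from the focus they largely
# cancel. All amplitudes and wavenumbers are hypothetical.

def surface(x, components):
    """Sea-surface height at x from a sum of (amplitude, wavenumber) waves."""
    return sum(a * math.cos(k * x) for a, k in components)

components = [(0.5, 0.1 * n) for n in range(1, 21)]  # 20 waves, 0.5 m each

peak = surface(0.0, components)      # all crests align: 0.5 m x 20 = 10.0 m
typical = surface(40.0, components)  # away from the focus, near cancellation
print(peak, typical)
```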

Depending on how a wave model is set up, the relative importance of the two mechanisms differs. "What we want is a theory that can predict the probability of these waves and the way they evolve given the state of the ocean," said Grafke. In other words, a model that can predict rogue waves based on the ocean condition without having to predetermine the significance of each mechanism.

Grafke and his colleagues developed a model based on mathematical concepts called solitons and instantons. Solitons are solitary excitations in a field, such as single, short pulses of light; instantons are mathematical devices for interpreting rare events in systems where random processes are present.

According to Amin Chabchoub, who studies environmental fluid mechanics at the University of Sydney in Australia and was not involved in the paper, the model is unique in its approach to predicting the occurrence of rogue waves independent of the formation mechanism.

The study of waves is rarely limited to a single medium. (For example, we have previously covered how a phenomenon called excitable waves plays a role in vastly different systems such as wildfires and heart arrhythmia.)

Since 2007, researchers have begun studying rogue waves in systems where the abundance of data is not a problem, because the waves can be easily generated in huge numbers with existing technologies. These waves also happen to be much, much faster: light.

"Once people started studying rogue waves, it spurred this whole field where people are asking what kind of physics gives rise to these very rare, very extreme events," said Frostig, who recently published a paper in the journal Optica that used laser systems to study rogue waves.

Using optical systems, scientists can generate the immense amount of data required to gauge the probabilities of rogue waves arising under different mechanisms. They have observed that optical rogue waves occur more frequently than would be expected if the waves' formation were governed by Gaussian statistics, commonly known as a bell curve.
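The departure from bell-curve statistics can be illustrated by simulation: for the same exceedance threshold, a heavier-tailed distribution produces far more "rogue" events than a Gaussian. A sketch with illustrative parameters (an exponential-tailed distribution stands in for the heavy-tailed case; nothing here is fitted to optical data):

```python
import random

# Sketch: frequency of extreme events under Gaussian (bell curve) statistics
# versus a heavier-tailed distribution. All parameters are illustrative.

random.seed(1)        # reproducible draws
N = 200_000           # number of simulated wave heights
THRESHOLD = 4.0       # "rogue" threshold, in units of the Gaussian sigma

# Gaussian heights: excursions beyond 4 sigma are very rare.
gauss = sum(abs(random.gauss(0.0, 1.0)) > THRESHOLD for _ in range(N))

# Exponential-tailed heights: extremes are far more common.
heavy = sum(random.expovariate(1.0) > THRESHOLD for _ in range(N))

print(gauss, heavy)   # the heavy-tailed process yields many more extremes
```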

According to Frostig, rogue wave experiments in optical systems have primarily focused on how light waves of different wavelengths interact with each other to generate an extreme event. She and her colleagues discovered that these mechanisms alone cannot account for the frequency of rogue waves present in their system. In this relatively young field, new results often create more questions than they answer.

Optical rogue waves do not play a significant role in fiber optic systems like those that bring the internet to homes and offices, because the fibers are designed to prevent signals of different frequencies from interfering with each other. Nor will ocean rogue wave models likely become a practical solution for safeguarding sailors anytime soon.

But the study of rogue waves and the statistics that govern them is not limited to the ocean or fiber optics. For example, speckle patterns -- the graininess of a laser when it is projected on a surface -- are related to optical rogue waves and have applications in imaging techniques.

Rogue waves also share a mathematical framework with other systems -- some of which aren't even waves. Protein folding, disease transmission and even some animal population estimation techniques all display statistical characteristics similar to those of rogue waves.

"The underlying math itself is very general, and it tells you how a system evolves around the probability of an extremely rare event, things like extreme shocks in acoustics systems, extreme voltage vortices and models for turbulence," said Grafke. "It doesn't need to be a rogue wave."

Read more:
Rogue Waves: Freaks of Nature Studied with Math and Lasers - Inside Science News Service

Power To The Tenth Power – IT Jungle

August 17, 2020 - Timothy Prickett Morgan

This is one of my favorite times of the year, with the Hot Chips symposium usually underway this week at Stanford University and all the vendors big and small trotting out their, well, hottest chippery. In this case, hot means extremely interesting but it often means burning shedloads of watts as well. But this is the time that the chip architects show off what they have been working on for four or five years and what has already been in production in recent months or will be in the coming months.

IBM tends to jump the gun a bit with its Power processors, and is doing so a little more than usual with the Power10 processor, which we frankly had hoped would be available later this year rather than later next year. But none of that matters. What does matter is that Power9 is giving customers plenty of headroom in compute at the moment and that Power10 will, thanks to the innovative engineering that Big Blue has come up with, be well worth the wait.

This is the kind of processor complex and system architecture that we have been waiting to see arrive for a long, long time. And we will be getting into the details of that architecture in the coming weeks after IBM's presentation is done at Hot Chips this week. In the meantime, IBM talked with us about how Power10 extends the lead that the Power architecture has over X86 and Arm alternatives for enterprise systems, and we are going to focus on that ahead of the Power10 preview and talk to the top brass at Big Blue about how they had better start thinking about systems differently, get people to start thinking about them differently, and then invest in IBM's own technology and build the best damned public cloud in the world based on it. We are talking about a moonshot-class investment the likes of which we have not seen out of IBM since it invested $100 million to create the BlueGene protein folding supercomputer back in 1999 to break through the petaflops performance barrier.

So without further ado, here is the wafer of Power10 chips that have come back as early silicon from the fabs at Samsung Electronics, IBM's manufacturing partner:

The research alliance that IBM set up with Samsung, Applied Materials, AMD, GlobalFoundries, and others many years ago has contributed tweaks to the 7 nanometer process that Samsung is using to make the Power10 chips, according to IBM, which is not just using Samsung's plain vanilla 7 nanometer etching, called V1, which uses extreme ultraviolet (EUV) lithography techniques. (This process is similar to the one that GlobalFoundries, the former AMD fab cut loose several years ago, was working on for Power10 when it decided in August 2018 to spike its whole 7 nanometer effort; importantly, both flavors of 7 nanometer, using regular lithography and using EUV, were killed off, driving IBM into Samsung's waiting arms as a foundry partner for the Power10 chips. Intel and Taiwan Semiconductor Manufacturing Corp were not going to get the deal, that is for sure.)

Samsung started building its V1 fab back in February 2018, invested $6 billion in the effort in the first two years, and has probably spent a few billion dollars more this year. Back in April 2019, Samsung said it was going to invest $115 billion between then and 2030 to build up its foundry, both for its own use and for others like IBM. And it is about the safest bet that IBM has outside of GlobalFoundries when it comes to picking a fab partner, given its long history of collaboration with Samsung and the latter company's desire to boost its merchant foundry credentials. Everybody including Intel had better hope Samsung gets good at this, because there are not enough deep pockets otherwise to absorb all of the risk as we move from 7 nanometers down to 5 nanometers and then 3 nanometers in the current decade.

We are not at liberty to say much about Power10 as we go to press for the Monday issue of The Four Hundred, but we will do a series of follow-up stories to drill down into different aspects of the machines, which we have been prebriefed about under embargo for later today. Here is one thing that IBM did allow us to share with you:

I have only seen the core count of the Power10 chip detailed in a few internal roadmaps, and all of them said that Power10 would have 48 cores. This made logical sense, given that Power8 maxed out at 12 cores and Power9 maxed out at 24 skinny cores (or 12 fat ones) across the same 96 threads per die, mostly enabled by the shrink from 22 nanometers with Power8 to 14 nanometers with Power9. It was logical to assume that with the shrink to 7 nanometers the core count could double up again.

What we now know from the roadmap above is that with the shrink to 7 nanometers, IBM gutted the core design and started with a clean slate to maximize the new 7 nanometer process (something that we suspect it was not planning to do with the GlobalFoundries 7 nanometer process) and crammed 16 fat cores or 32 skinny cores onto a die. Only 15 fat cores or 30 skinny cores are activated to help improve the yield on the chips, on the assumption, which IBM and Samsung are making, that at least 1 in 16 of the cores will be a dud on the new 7 nanometer process. At some point, when the yields on the V1 process improve, IBM could activate that latent 16th core, and there is an instant performance upgrade for those using a newer stepping of the Power10 chip. The gutting of the microarchitecture is what has allowed IBM to boost the core count from 12 to 16 per chip moving from Power9 to Power10, which is considerably more than expected.

With Power10, IBM is cutting down on the number of chips it is making, which will also help lower costs, but it also calls into question whether there will be a single-core or even dual-core variant aimed specifically at smaller IBM i shops. (We will fight that battle later.)

Rather than having three different chip implementations (a half skinny chip and a full skinny chip for machines with one or two sockets, and a full fat chip for big NUMA iron) as it did with Power8 and Power9, IBM is moving to a single chip with fat cores and putting one or two of them into a socket to get 30 cores or 60 cores into a socket. This is a much more aggressive strategy, and interestingly, either the single-chip module (SCM) or dual-chip module (DCM) variants of the Power10 chip can be run in SMT4 (four threads per core) or SMT8 (eight threads per core) mode. This mode is not switchable by users, but is set by IBM at the time it packages up the processor. In the past, getting 24 cores meant running in SMT4 mode, or four threads per core, and not all systems had this capability. This was just a funny way of isolating threads and caches to lower the core count, and therefore enterprise software licenses, for SMT8 customers, but it also meant raising the per-socket price on software running on the 24-core Power9 variant for software that was priced based on cores and not sockets. It would be useful if IBM could make this SMT level settable at system boot, but it is hard-coded into the processor microcode, which customers cannot change because of the software pricing issue mentioned above.
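The thread counts quoted in this story are simple products of active cores, SMT level, and chips per socket; a quick sanity check of that arithmetic:

```python
def threads_per_socket(cores_active, smt, chips_per_socket=1):
    """Hardware threads presented per socket = active cores x SMT level x chips."""
    return cores_active * smt * chips_per_socket

# Power9: 24 skinny cores in SMT4 mode -> the 96 threads per die cited above
print(threads_per_socket(24, 4))        # 96
# Power10 SCM: 15 active fat cores in SMT8 mode
print(threads_per_socket(15, 8))        # 120
# Power10 DCM: two chips in one socket
print(threads_per_socket(15, 8, 2))     # 240
```

The same function with a skinny-core count (30 active per chip, SMT4) gives identical totals, which is the point of making the SMT mode a packaging-time choice rather than a different die.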

We strongly suspect that IBM never intended to do a monolithic Power10 die with 48 cores on it, but rather a 7 nanometer shrink of the 24-core Nimbus part with some tweaks, with two of them put into a single socket to create a throughput monster. With the Power10 chip as it will be delivered, IBM can, in theory, once yields improve, provide customers with 33 percent more cores and, if history is any guide, somewhere around 3X the raw throughput at the 4 GHz design point that IBM has used for Power chips since the Power7 way back in 2010. (The Power6 had a 5 GHz design point, which was quite impressive but not sustainable because Dennard scaling and Moore's Law scaling were running out of steam.)

We can't say a lot about it right now, but this memory clustering technology, and indeed the whole memory subsystem of the Power10 chip, is the killer technology with Power10. IBM will be able to do things that other architectures simply cannot, with multi-petabyte memory clustering and sharing across large numbers of Power10 systems.

And that is why IBM has to be the one to invest in building and using these systems, to demonstrate their capabilities, and to make sure Power10 systems are available on the IBM Cloud on Day One of their launch and in huge numbers, not in prototype and proofs of concept onesies and twosies here and there around a dozen or so cloud regions. This is not about drinking the Kool-Aid, which is easy enough, but eating your own dog food first, as we say in this IT business. IBM has to move its own apps to its own cloud running on Power10 iron and be the case study that others can learn from and benefit from.

There's plenty of time between now and the end of 2021 to make that happen, and IBM i customers as well as those running AIX and Linux should all be invited to come along for the ride.

Power Systems Slump Is Not As Bad As It Looks

The Path Truly Opens To Alternate Power CPUs, But Is It Enough?

Powers Of Ten

What Open Sourcing Powers ISA Means For IBM i Shops

IBMs Plan For Etching Power10 And Later Chips

The Road Ahead For Power Is Paved With Bandwidth

IBM Puts Future Power Chip Stakes In The Ground

What Will IBM i Do With A Power10 Processor?

Samsung Joins The OpenPower Consortium Party

Read more here:
Power To The Tenth Power - IT Jungle

PHILANTHROPY IN EDUCATION TOWARDS THE GREATER GOOD – The Star Online

BEING Master of a Cambridge College has its highlights, one of the most pleasurable of which is meeting our benefactors.

My college, Gonville and Caius, has thousands of benefactors, but only the most eminent are commemorated in stone on the Benefactors' Wall inside the Great Gate that looks onto the busiest part of Cambridge.

The first name on the wall is the co-founder Edmund Gonville (1348) and the 30th is Jeffrey Cheah, honorary fellow of the college.

The Benefactors' Wall at Gonville and Caius College, Cambridge.

One of the greatest Jewish rabbis was Maimonides (1138-1204). As well as codifying religious law, he was also a philosopher, and his day job was being a medical doctor.

Indeed, he was physician to the great Sultan Saladin, and much admired by Islamic scientists and scholars.

He formulated Eight Levels of Giving applied to charitable donations. The highest level is: Giving an interest-free loan to a person in need; forming a partnership with a person in need; giving a grant to a person in need; finding a job for a person in need; so long as that loan, grant, partnership or job results in the person no longer living by relying upon others.

This highest level of philanthropy encompasses precisely what Tan Sri Dr Jeffrey Cheah, through his Foundation, does.

Grants are awarded to scientists and educators to enable them to carry out their groundbreaking research and scholarship, and the Foundation works in partnership with them; it also provides substantial financial support so that young people can undertake higher education, find jobs and contribute to Malaysian society.

To cap it all, medicine is a central part of those activities. Maimonides really would have approved.

It is for those reasons I have enjoyed so much my relationship with Tan Sri Dr Jeffrey Cheah and the Jeffrey Cheah Foundation: we are working together in a partnership that improves medicine, science and education in my college and university, and in our partner institutions in Malaysia, for the benefit of the world.

I take pride in being an Honorary Jeffrey Cheah Distinguished Professor so I can continue playing a small part in this great partnership.

Crises bring out the worst and the best in society. Some politicians who have spent their careers disregarding the truth have discovered, to their cost and the even greater cost of their populations, that viruses take no notice of lies.

Crazy groups protest against vaccinations and even claim that Bill Gates, who is a fine philanthropist, is putting microchips in vaccines, or that 5G networks spread the virus.

But so many people in every walk of life are doing their bit, from delivery drivers dropping off goods to those isolating at home to people sewing masks and protective gear. Much kindness is shining through.

Low-paid healthcare workers are risking their lives caring for the sick and elderly.

Doctors, scientists and engineers in global collaboration have risen as one to the task of tackling the pandemic.

Developments in biotechnology and diagnostics that have sprung from fundamental work on DNA, proteins and molecular and cellular biology are being employed in force.

Scientists are challenging ideas and concepts as new facts are discovered in order to make progress.

I take pride in being a scientist and being part of the scientific movement that has led to the breakthroughs that are allowing the rapid response to the crisis.

Experimental scientists, however, need considerable support and infrastructure in order to carry out this critical research. Much of it is provided by governments, but it is never enough and sometimes too driven by politics or not flexible enough for unpredictable innovation.

Philanthropic individuals and organisations have long catalysed scientific progress by funding individuals, laboratories and hospitals.

Long live philanthropy and those, like Tan Sri Dr Jeffrey Cheah, who selflessly practise it in the interests of mankind.

About Prof Sir Alan Fersht FRS FMedSci

Honorary Jeffrey Cheah Distinguished Professor

Master of Gonville and Caius College, Cambridge 2012-2018

Herschel Smith Professor of Organic Chemistry, University of Cambridge

A world leading protein scientist, Prof Sir Alan Fersht FRS is widely regarded as one of the main pioneers of protein engineering, a process he developed to analyse the structure, activity and folding of proteins.

He studied at Gonville and Caius College, and earned his PhD degree in 1968. He was Master of Gonville and Caius College, Cambridge from 2012-2018.

In August 2020, Fersht was awarded the Copley Medal of the Royal Society, the most prestigious science prize in the UK and the world's oldest, dating back to 1731. The medal is awarded for outstanding achievements in research in any branch of science. Previous winners encompass the most famous scientists of the last few centuries, including Joseph Priestley, Benjamin Franklin, Charles Darwin, Dmitri Mendeleev, Albert Einstein and Niels Bohr, as well as Francis Crick and Stephen Hawking, both Fellows of Gonville and Caius.

With interests spanning chemistry and biology, he has been appointed to three of the most prestigious societies in the US: Foreign Associate of the National Academy of Sciences (USA); Honorary Foreign Member of the American Academy of Arts and Sciences; and Foreign Honorary Member of the American Philosophical Society.

He is a member of prestigious UK and European academies: Fellow of the Royal Society of London; Member of the European Molecular Biology Organisation (EMBO); and Member of Academia Europaea. On top of co-founding three biotech companies, Fersht is also the recipient of many global recognitions, including the Gabor Medal of the Royal Society for molecular biology, its Davy Medal for chemistry, and its Royal Medal for his work on protein folding.

Both Fersht and Nobel Prize winner Sir Gregory Winter FRS, his collaborator, were knighted for their work on protein engineering.

This article is brought to you by Jeffrey Cheah Foundation in conjunction with its 10th anniversary.

Related stories:

https://www.thestar.com.my/news/nation/2020/07/19/leaving-a-legacy-to-inspire

https://www.thestar.com.my/news/nation/2020/07/19/setting-the-bar-in-education#cxrecs_s

More:
PHILANTHROPY IN EDUCATION TOWARDS THE GREATER GOOD - The Star Online

Folding@home infectious disease research with Spot Instances – idk.dev

This post was contributed by Jarman Hauser, Jessie Xie, and Kinnar Kumar Sen.

Folding@home (FAH) is a distributed computing project that uses computational modeling to simulate protein structure, stability, and shape (how it folds). These simulations help to advance drug discoveries and cures for diseases linked to protein dynamics within human cells. The FAH software crowdsources its distributed compute platform, allowing anyone to contribute by donating unused computational resources from personal computers, laptops, and cloud servers.

In this post, I walk through deploying EC2 Spot Instances, optimized for the latest Folding@home client software. I describe how to be flexible across a combination of GPU-optimized Amazon EC2 Spot Instances configured in an EC2 Auto Scaling group. The Auto Scaling group handles launching and maintaining a desired capacity, and automatically requests resources to replace any that are interrupted or manually shut down.

Spot Instances are spare EC2 capacity available at up to a 90% discount compared to On-Demand Instance prices. The only difference between On-Demand Instances and Spot Instances is that Spot Instances can be interrupted by EC2, with two minutes of notification, when EC2 needs the capacity back. This makes Spot Instances a great fit for stateless, fault-tolerant workloads like big data, containers, batch processing, AI/ML training, CI/CD and test/dev. For more information, see Amazon EC2 Spot Instances.

In addition to being flexible across instance types, another best practice for using Spot Instances effectively is to select the appropriate allocation strategy. Allocation strategies in EC2 Auto Scaling help you automatically provision capacity according to your workload requirements. We recommend using the capacity-optimized strategy to automatically provision instances from the most-available Spot Instance pools by looking at real-time capacity data. Because your Spot Instance capacity is sourced from pools with optimal capacity, this decreases the possibility that your Spot Instances will be reclaimed. For more information about allocation strategies, see Spot Instances in the EC2 Auto Scaling user guide and configuring Spot capacity optimization in this user guide.
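Concretely, the "flexible instance types plus capacity-optimized allocation" best practice is expressed in the Auto Scaling group's MixedInstancesPolicy. The sketch below builds that policy as a plain Python dict (the shape boto3's create_auto_scaling_group accepts); the launch template name and the particular GPU instance types are illustrative assumptions, not values taken from the Folding@home template.

```python
# A minimal sketch of a MixedInstancesPolicy requesting 100% Spot capacity
# with the capacity-optimized allocation strategy. "fah-gpu-template" and
# the instance types below are hypothetical placeholders.
mixed_instances_policy = {
    "LaunchTemplate": {
        "LaunchTemplateSpecification": {
            "LaunchTemplateName": "fah-gpu-template",  # hypothetical name
            "Version": "$Latest",
        },
        # Be flexible across several GPU instance types (Spot best practice).
        "Overrides": [
            {"InstanceType": t}
            for t in ["g4dn.xlarge", "g3s.xlarge", "p2.xlarge", "p3.2xlarge"]
        ],
    },
    "InstancesDistribution": {
        "OnDemandPercentageAboveBaseCapacity": 0,      # 100% Spot
        "SpotAllocationStrategy": "capacity-optimized",
    },
}

print(len(mixed_instances_policy["LaunchTemplate"]["Overrides"]))  # 4
```

Listing several instance types in Overrides is what lets the capacity-optimized strategy pick whichever Spot pool currently has the deepest capacity, reducing interruptions.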

Amazon CloudWatch provides instance metrics and logs for real-time monitoring of the protein folding progress.

To complete the setup, you must have an AWS account with permissions for the resources listed above. When you sign up for AWS, your AWS account is automatically signed up for all services in AWS, including Amazon EC2. If you don't have an AWS account, you can find more information about creating an account here.

The AWS CloudFormation (CFn) template includes customizable configuration parameters. Some of these settings, such as instance type, affect the cost of deployment. For cost estimates, see the pricing pages for each AWS service you are using. Prices are subject to change. You are responsible for the cost of the AWS services used. There is no additional cost for using the CFn template.

Note: There is no additional charge to use Deep Learning AMIs; you pay only for the AWS resources while they're running. The Folding@home client software is free, open-source software that is distributed under the Folding@home EULA.

Tip: After you deploy the AWS CloudFormation template, we recommend that you enable AWS Cost Explorer. Cost Explorer is an easy-to-use interface that lets you visualize, understand, and manage your AWS costs and usage; for example, you can break down costs to show hourly costs for your protein folding project.

The first thing you must do is download the template, then make a few edits to it.

Once downloaded, open the template file in your favorite text editor to make a few edits to the configuration before deploying.

In the User Information section, you have the option to create a unique user name, join or create a new team, or contribute anonymously. For this example, I leave the values set to default and contribute as an anonymous user on the default team. More details about teams and leaderboards can be found here, and details about passkeys here.

Once the template is edited and saved to a location you can easily find later, the next section shows how to upload it in the AWS CloudFormation console.

Next, log into the AWS Management Console, choose the Region you want to run the solution in, then navigate to AWS CloudFormation to launch the template.

In the AWS CloudFormation console, click on Create stack. Upload the template we just configured and click on Next to specify stack details.

Enter a stack name and adjust the capacity parameters as needed. In this example, I set the desiredCapacity and minSize to 2 to handle protein folding jobs assigned to the client, and the maxSize to 12. Setting maxSize to 12 ensures you have capacity for larger jobs that get assigned. These parameters can be adjusted based on your desired capacity.
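Those three capacity settings map onto the stack's CloudFormation parameters, and Auto Scaling requires minSize <= desiredCapacity <= maxSize. Here is a minimal sketch that builds the Parameters list and enforces that invariant; the parameter key names follow the walkthrough's wording and are assumptions not verified against the actual template.

```python
def stack_parameters(min_size, desired, max_size):
    """Build a CloudFormation Parameters list for the capacity settings,
    enforcing the Auto Scaling invariant minSize <= desiredCapacity <= maxSize."""
    if not (min_size <= desired <= max_size):
        raise ValueError("desiredCapacity must lie between minSize and maxSize")
    return [
        {"ParameterKey": key, "ParameterValue": str(value)}
        for key, value in [
            ("minSize", min_size),          # assumed parameter names
            ("desiredCapacity", desired),
            ("maxSize", max_size),
        ]
    ]

# The values used in the walkthrough: desired/min of 2, max of 12.
print(stack_parameters(2, 2, 12))
```

The same list shape is what `aws cloudformation create-stack --parameters` expects, so validating it locally catches a bad combination before the stack ever launches.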

If breaking out usage and cost data is required, you can optionally add configurations like tags, permissions, stack policies, rollback options, and more advanced options in the next stack configuration step. Click Next to review and then create the stack.

Under the Events tab, you can see the status of the AWS resources being created. When the status is CREATE_COMPLETE (approx. 35 minutes), the environment with Folding@home is installed and ready. Once the stack is created, the GPU instances will begin protein simulation.

The AWS CloudFormation template creates a log group, fahlog, that each of the instances sends log data to. This allows you to visualize the protein folding progress in near real time via the Amazon CloudWatch console. To see the log data, navigate to the Resources tab and click on the cloudWatchLogGroup link for fahlog. Alternatively, you can navigate to the Amazon CloudWatch console and choose fahlog under log groups. Note: Sometimes it takes a bit of time for Folding@home Work Units (WUs) to be downloaded to the instances and for all the available GPUs to be allocated.

In the CloudWatch console, check out the Insights feature in the left navigation menu to see analytics for your protein folding logs. Select fahlog in the search box and run the default query that is provided for you in the query editor window to see your protein folding results.
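For reference, the default query that Logs Insights pre-populates simply tails the most recent log lines:

```
fields @timestamp, @message
| sort @timestamp desc
| limit 20
```

From there you can add filter stages (for example, matching lines that report completed work-unit steps) to chart folding progress; the exact filter text depends on the fahlog message format, which I have not reproduced here.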

Another thing you can do is create a dashboard in the CloudWatch console that automatically refreshes at the time intervals you set. Under Dashboards in the left navigation bar, I was able to quickly create a few widgets to visualize CPU utilization, network in/out, and completed protein folding steps. This is a nifty tool with which, given a little more time, you could configure more detailed metrics like cost per fold and GPU monitoring.

You can let this run as long as you want to contribute to the project. When you're ready to stop, AWS CloudFormation gives you the option to delete the stack and the resources it created. On the AWS CloudFormation console, select the stack, and select Delete. When you delete a stack, you delete the stack and all of its resources.

In this post, I shared how to launch a cluster of EC2 GPU-optimized Spot Instances to aid in Folding@home's protein dynamics research, which could lead to therapeutics for infectious diseases. I leveraged Spot best practices by being flexible with instance selections across multiple families, sizes, and Availability Zones, and by choosing the capacity-optimized allocation strategy to ensure the cluster scales optimally and securely. Now you are ready to donate compute capacity with Spot Instances to aid disease research efforts on Folding@home.

Folding@home is currently based at the Washington University School of Medicine in St. Louis, under the directorship of Dr. Greg Bowman. The project was started by the Pande Laboratory at Stanford University under the direction of Dr. Vijay Pande, who led it until 2019. Since then, Folding@home has been led by Dr. Bowman, a former student of Dr. Pande, in close collaboration with Dr. John Chodera of MSKCC and Dr. Vince Voelz of Temple University.

With heightened interest in the project, Folding@home has grown to a community of 2M+ users, bringing together the compute power of over 600K GPUs and 1.6M CPUs.

This outpouring of support has made Folding@home one of the world's fastest computing systems, achieving speeds of approximately 1.2 exaFLOPS (or 2.3 x86 exaFLOPS) by April 9, 2020, making it the world's first exaFLOP computing system. Folding@home's COVID-19 effort specifically focuses on better understanding how the viral protein's moving parts enable it to infect a human host, evade an immune response, and create new copies of the virus. The project is leveraging this insight to help design new therapeutic antibodies and small molecules that might prevent infection. The team is engaged with a number of experimental collaborators to quickly iterate between computational design and experimental testing.

Original post:
Folding@home infectious disease research with Spot Instances - idk.dev

Nine Cambridge researchers among this years Royal Society medal and award winners – India Education Diary

Sir Alan Fersht is one of the 25 Royal Society medal and award winners announced today, nine of whom are researchers at the University of Cambridge. The annual prizes celebrate exceptional researchers and outstanding contributions to science across a wide array of fields.

President of the Royal Society, Venki Ramakrishnan, said:

"The Royal Society's medals and awards celebrate those researchers whose ground-breaking work has helped answer fundamental questions and advance our understanding of the world around us. They also champion those who have reinforced science's place in society, whether through inspiring public engagement, improving our education system, or by making STEM careers more inclusive and rewarding.

"This year has highlighted how integral science is in our daily lives and in tackling the challenges we face, and it gives me great pleasure to congratulate all our winners and thank them for their work."

Sir Alan Fersht FMedSci FRS, Emeritus Professor in the Department of Chemistry and former Master of Gonville and Caius College, is awarded the Copley Medal for the development and application of methods to describe protein folding pathways at atomic resolution, revolutionising our understanding of these processes.

"Most of us who become scientists do so because science is one of the most rewarding and satisfying of careers, and we actually get paid for doing what we enjoy and for benefitting humankind. Recognition of one's work, especially at home, is icing on the cake," said Sir Alan. "Like many Copley medallists, I hail from a humble immigrant background and was the first of my family to go to university. If people like me are seen to be honoured for science, then I hope it will encourage young people in similar situations to take up science."

As the latest recipient of the Royal Society's premier award, Sir Alan joins an elite group of scientists that includes Charles Darwin, Albert Einstein and Dorothy Hodgkin, and more recently Professor John Goodenough (2020) for his research on the rechargeable lithium battery, Peter Higgs (2015), the physicist who hypothesised the existence of the Higgs boson, and DNA fingerprinting pioneer Alec Jeffreys (2014).

Professor Barry Everitt FMedSci FRS, from the Department of Psychology and former Master of Downing College, receives the Croonian Medal and Lecture for research which has elucidated brain mechanisms of motivation and applied them to important societal issues such as drug addiction.

Professor Everitt said: "In addition to my personal pride at having received this prestigious award, I hope that it helps draw attention to experimental addiction research, its importance and its potential."

Professor Herbert Huppert FRS, of the Department of Applied Mathematics and Theoretical Physics and a Fellow of King's College, receives a Royal Medal for outstanding achievements in the physical sciences. He has been at the forefront of research in fluid mechanics. As an applied mathematician he has consistently developed highly original analysis of key natural and industrial processes. Beyond his research, he has chaired policy work on how science can help defend against terrorism, and on carbon capture and storage in Europe.

In addition to the work for which they are recognised with an award, several of this years recipients have also been working on issues relating to the COVID-19 pandemic.

Professor Julia Gog, of the Department of Applied Mathematics and Theoretical Physics and a Fellow of Queens' College, receives the Rosalind Franklin Award and Lecture for her achievements in the field of mathematics. Her expertise in infectious diseases and virus modelling has seen her contribute to the pandemic response, including as a participant at SAGE meetings. The STEM project component of her award will produce resources for Key Stage 3 (ages 11-14) maths pupils and teachers, exploring the curriculum in the context of modelling epidemics and infectious diseases and showing how maths can change the world for the better.

The Society's Michael Faraday Prize is awarded to Sir David Spiegelhalter OBE FRS, of the Winton Centre for Risk and Evidence Communication and a Fellow of Churchill College, for bringing key insights from the disciplines of statistics and probability vividly home to the public at large, and to key decision-makers, in entertaining and accessible ways, most recently through the COVID-19 pandemic.

The full list of Cambridge's 2020 winners and their award citations:

Copley Medal: Sir Alan Fersht FMedSci FRS, Department of Chemistry, and Gonville and Caius College. He has developed and applied the methods of protein engineering to provide descriptions of protein folding pathways at atomic resolution, revolutionising our understanding of these processes.

Croonian Medal and Lecture: Professor Barry Everitt FMedSci FRS, Department of Psychology and Downing College. He has elucidated brain mechanisms of motivation and applied them to important societal issues such as drug addiction.

Royal Medal: Professor Herbert Huppert FRS, Department of Applied Mathematics and Theoretical Physics, and King's College. He has been at the forefront of research in fluid mechanics. As an applied mathematician he has consistently developed highly original analysis of key natural and industrial processes.

Hughes Medal: Professor Clare Grey FRS, Department of Chemistry and Pembroke College. For her pioneering work on the development and application of new characterization methodology to develop fundamental insight into how batteries, supercapacitors and fuel cells operate.

Ferrier Medal and Lecture: Professor Daniel Wolpert FMedSci FRS, Department of Engineering and Trinity College. For ground-breaking contributions to our understanding of how the brain controls movement. Using theoretical and experimental approaches he has elucidated the computational principles underlying skilled motor behaviour.

Michael Faraday Prize and Lecture: Sir David Spiegelhalter OBE FRS, Winton Centre for Risk and Evidence Communication and Churchill College. For bringing key insights from the disciplines of statistics and probability vividly home to the public at large, and to key decision-makers, in entertaining and accessible ways, most recently through the COVID-19 pandemic.

Milner Award and Lecture: Professor Zoubin Ghahramani FRS, Department of Engineering and St John's College. For his fundamental contributions to probabilistic machine learning.

Rosalind Franklin Award and LectureProfessor Julia Gog, Department of Applied Mathematics and Theoretical Physics, and Queens CollegeFor her achievements in the field of mathematics and her impactful project proposal with its potential for a long-term legacy.

Royal Society Mullard AwardProfessor Stephen Jackson FMedSci FRS, Gurdon Institute, Department of BiochemistryFor pioneering research on DNA repair mechanisms and synthetic lethality that led to the discovery of olaparib, which has reached blockbuster status for the treatment of ovarian and breast cancers.

The full list of medals and awards, including their descriptions and past winners, can be found on the Royal Society website: https://royalsociety.org/grants-schemes-awards/awards/

Read the rest here:
Nine Cambridge researchers among this years Royal Society medal and award winners - India Education Diary

AI Researchers Design Program To Generate Sound Effects For Movies and Other Media – Unite.AI

A team of researchers from the Pritzker School of Molecular Engineering (PME) at the University of Chicago has recently succeeded in the creation of an AI system that can create entirely new, artificial proteins by analyzing stores of big data.

Proteins are macromolecules essential for the construction of tissues in living things, and critical to the life of cells in general. Proteins are used by cells as chemical catalysts to make various chemical reactions occur and to carry out complex tasks. If scientists can figure out how to reliably engineer artificial proteins, it could open the door to new ways of capturing carbon, new methods of harvesting energy, and new disease treatments. Artificial proteins have the power to dramatically alter the world we live in. As reported by EurekAlert!, a recent breakthrough by researchers at the University of Chicago's PME has put scientists closer to those goals. The PME researchers made use of machine learning algorithms to develop a system capable of generating novel forms of protein.

The research team created machine learning models trained on data pulled from various genomic databases. As the models learned, they began to distinguish common underlying patterns, simple rules of design that enable the creation of artificial proteins. Upon taking the patterns and synthesizing the respective proteins in the lab, the researchers found that the artificial proteins catalyzed chemical reactions approximately as effectively as those driven by naturally occurring proteins.

According to Rama Ranganathan, the Joseph Regenstein Professor at the University of Chicago's PME, the research team found that genome data contains a massive amount of information regarding the basic functions and structures of proteins. By utilizing machine learning to recognize these common structures, the researchers were able, in his words, to "bottle nature's rules" and create proteins themselves.

The researchers focused on metabolic enzymes for this study, specifically a family of proteins called chorismate mutase. This protein family is necessary for life in a wide variety of plants, fungi, and bacteria.

Ranganathan and collaborators realized that genome databases contained insights just waiting to be discovered by scientists, but that traditional methods of determining the rules governing protein structure and function have had only limited success. The team set out to design machine learning models capable of revealing these design rules. The models' findings imply that new artificial sequences can be created by conserving amino acid positions and the correlations observed in the evolution of amino acid pairs.
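The two statistics described above, per-position amino acid conservation and correlations between pairs of positions, can be illustrated with a short sketch. This is a minimal toy illustration, not the Chicago team's actual model; the alignment and all function names are invented, and real analyses use thousands of aligned sequences.

```python
from collections import Counter
from itertools import combinations
import math

# Hypothetical toy alignment: one row per homologous sequence.
alignment = [
    "MKTAY",
    "MKSAY",
    "MRTAF",
    "MKTAY",
]

def column(i):
    return [seq[i] for seq in alignment]

def conservation(i):
    """Fraction of sequences carrying the most common residue at position i."""
    counts = Counter(column(i))
    return counts.most_common(1)[0][1] / len(alignment)

def pair_correlation(i, j):
    """Mutual information (bits) between columns i and j; 0 means independent."""
    n = len(alignment)
    joint = Counter(zip(column(i), column(j)))
    pi, pj = Counter(column(i)), Counter(column(j))
    mi = 0.0
    for (a, b), c in joint.items():
        p_ab = c / n
        mi += p_ab * math.log2(p_ab / ((pi[a] / n) * (pj[b] / n)))
    return mi

for i in range(5):
    print("position", i, "conservation", conservation(i))
for i, j in combinations(range(5), 2):
    print("pair", (i, j), "correlation", round(pair_correlation(i, j), 3))
```

In the toy alignment, positions 1 and 4 co-vary (K pairs with Y, R with F), so they score high mutual information even though neither is fully conserved; this pairwise evolutionary signal, extracted at scale, is the kind of pattern the models exploit.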

The team of researchers created synthetic genes that encoded amino acid sequences producing these proteins. They cloned bacteria with these synthetic genes and found that the bacteria used the synthetic proteins in their cellular machinery, functioning almost exactly the same as regular proteins.

According to Ranganathan, the simple rules that their AI distinguished can be used to create artificial proteins of incredible complexity and variety. As Ranganathan explained to EurekAlert!:

"The constraints are much smaller than we ever imagined they would be. There is a simplicity in nature's design rules, and we believe similar approaches could help us search for models for design in other complex systems in biology, like ecosystems or the brain."

Ranganathan and collaborators want to take their models and generalize them, creating a platform scientists can use to better understand how proteins are constructed and what effects they have. They hope to use their AI systems to enable other scientists to discover proteins that can tackle important issues like climate change. Ranganathan and Associate Professor Andrew Ferguson have created a company dubbed Evozyne, which aims to commercialize the technology and promote its use in fields like agriculture, energy, and environment.

Understanding the commonalities between proteins, and the relationships between structure and function, could also assist in the creation of new drugs and forms of therapy. Though protein folding has long been considered an incredibly difficult problem for computers to crack, the insights from models like the ones produced by Ranganathan's team could help speed up these calculations, facilitating the creation of new drugs based on these proteins. Drugs could be developed that block the creation of proteins within viruses, potentially aiding in the treatment of even novel viruses like the COVID-19 coronavirus.

Ranganathan and the rest of the research team still need to understand how and why their models work and how they produce reliable protein blueprints. The research team's next goal is to better understand what attributes the models take into account to arrive at their conclusions.

Read more:
AI Researchers Design Program To Generate Sound Effects For Movies and Other Media - Unite.AI

"Floppy" Proteins Used To Create Artificial Organelles Within Human Cells – Technology Networks

Biomedical engineers at Duke University have demonstrated a method for controlling the phase separation of an emerging class of proteins to create artificial membrane-less organelles within human cells. The advance, similar to controlling how vinegar forms droplets within oil, creates opportunities for engineering synthetic structures to modulate existing cell functions or create entirely new behaviors within cells. The results appear online in the journal Nature Chemistry.

Proteins function by folding into specific 3-D shapes that interact with different biomolecular structures. Researchers previously believed that proteins needed these fixed shapes to function. But in the last two decades, a large new class of intrinsically disordered proteins (IDPs) has been discovered whose members have large regions that are "floppy", that is, they do not fold into a defined 3-D shape. It is now understood that these regions play an important, previously unrecognized role in controlling various cellular functions.

IDPs are also useful for biomedical applications because they can undergo phase transitions (changing from a liquid to a gel, for example, or from a soluble to an insoluble state, and back again) in response to environmental triggers such as changes in temperature. These features also dictate their phase behavior in cellular environments and are controlled by adjusting characteristics of the IDPs, such as their molecular weight or the sequence in which the amino acids are linked together.

"Although there are many natural IDPs that show phase behavior in cells, they come in many different flavors, and it has been difficult to discern the rules that govern this behavior," said Ashutosh Chilkoti, the Alan L. Kaganov Distinguished Professor of Biomedical Engineering at Duke. "This paper provides very simple engineering principles to program this behavior within a cell."

"Others in the field have taken a top-down approach, where they'll make a change to a natural IDP and see how its behavior changes within a cell," said Michael Dzuricky, a research scientist working in the Chilkoti laboratory and first author of the study. "We're taking the opposite approach and building our own artificial IDPs from simple thermodynamic principles. This enables us and others to precisely tune a single property, the shape of the IDP's phase diagram, to better understand how this parameter affects biological behavior."

In the new paper, the researchers begin by looking to nature for examples of IDPs that come together to form biomolecular condensates within cells. These weakly-held-together structures allow cells to create compartments without also building a membrane to encapsulate them. Using one such IDP from the common fruit fly as a basis, the researchers draw on their extensive history of working with IDPs to engineer a molecularly simpler artificial version that retains the same behavior.

This simpler version allowed the researchers to make precise changes to the molecular weight and amino acid sequence of the IDPs. The researchers show that, depending on how these two variables are tweaked, the IDPs come together to form these compartments at different temperatures in a test tube. And by systematically trying various tweaks and temperatures, the researchers gained a solid understanding of which design parameters are most important in controlling the IDPs' behavior.
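As a rough intuition for how molecular weight and sequence tune the transition temperature, here is a deliberately simple toy model. The constants, functional form, and names are entirely invented for illustration; they are not the Duke group's equations, only a sketch of the qualitative trend that longer, more hydrophobic chains condense at lower temperatures.

```python
import math

def transition_temp_c(n_repeats, hydrophobicity):
    """Hypothetical cloud-point temperature (deg C) of an artificial IDP.

    Invented rule of thumb: longer and more hydrophobic chains
    phase-separate at lower temperatures.
    """
    return 80.0 - 25.0 * hydrophobicity - 10.0 * math.log(n_repeats)

def phase_separates(temp_c, n_repeats, hydrophobicity):
    """True if droplets would form at temp_c for this chain design."""
    return temp_c >= transition_temp_c(n_repeats, hydrophobicity)

# At body temperature, a long chain condenses while a short one stays soluble.
print(phase_separates(37.0, 20, 1.0))  # True
print(phase_separates(37.0, 5, 1.0))   # False
```

The point of the sketch is only the design logic: because the transition temperature is a predictable function of a few sequence parameters, one can pick a chain length that flips the phase behavior at a chosen temperature.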

A test tube, however, is not the same as a living cell, so the researchers then went one step further to demonstrate how their engineered IDPs behave within E. coli. As predicted, their artificial IDPs grouped together to form a tiny droplet within the cell's cytoplasm. And because the IDPs' behavior was now so well understood, the researchers showed they could predictably control how they coalesced, using their test tube principles as a guide.

"We were able to change temperatures in cells to develop a complete description of their phase behavior, which mirrored our test tube predictions," said Dzuricky. "At this point, we were able to design different artificial IDP systems where the droplets that are formed have different material properties."

Put another way, because the researchers understood how to manipulate the size and composition of the IDPs to respond to temperature, they could program the IDPs to form droplets or compartments of varying densities within cells. To show how this ability might be useful to biomedical engineers, the researchers then used their newfound knowledge, as nature often does, to create an organelle that performs a specific function within a cell.

The researchers showed that they could use the IDPs to encapsulate an enzyme to control its activity level. By varying the molecular weight of the IDPs, the IDPs' hold on the enzyme either increased or decreased, which in turn affected how much it could interact with the rest of the cell.

To demonstrate this ability, the researchers chose an enzyme used by E. coli to convert lactose into usable sugars. They then tracked this enzyme's activity with a fluorescent reporter in real time to determine how the engineered IDP organelle was affecting enzyme activity.

In the future, the researchers believe they could use their new IDP organelles to control the activity levels of biomolecules important to disease states. Or to learn how natural IDPs fill similar cellular roles and understand how and why they sometimes malfunction.

"This is the first time anybody has been able to precisely define how the protein sequence controls phase separation behavior inside cells," said Dzuricky. "We used an artificial system, but we think that the same rules apply to natural IDPs and are excited to begin testing this theory."

"We can also now start to program this type of phase behavior with any protein in a cell by fusing them to these artificial IDPs," said Chilkoti. "We hope that these artificial IDPs will provide a new tool for synthetic biology to control cell behavior."

This article has been republished from the following materials. Note: material may have been edited for length and content. For further information, please contact the cited source.

Original post:
"Floppy" Proteins Used To Create Artificial Organelles Within Human Cells - Technology Networks

Nine Cambridge researchers among this year’s Royal Society medal and award winners – Cambridge Network

President of the Royal Society, Venki Ramakrishnan, said: "The Royal Society's medals and awards celebrate those researchers whose ground-breaking work has helped answer fundamental questions and advance our understanding of the world around us. They also champion those who have reinforced science's place in society, whether through inspiring public engagement, improving our education system, or by making STEM careers more inclusive and rewarding.

"This year has highlighted how integral science is in our daily lives, and tackling the challenges we face, and it gives me great pleasure to congratulate all our winners and thank them for their work."

Sir Alan Fersht FMedSci FRS, Emeritus Professor in the Department of Chemistry and former Master of Gonville and Caius College, is awarded the Copley Medal for the development and application of methods to describe protein folding pathways at atomic resolution, revolutionising our understanding of these processes.

"Most of us who become scientists do so because science is one of the most rewarding and satisfying of careers, and we actually get paid for doing what we enjoy and for benefitting humankind. Recognition of one's work, especially at home, is icing on the cake," said Sir Alan. "Like many Copley medallists, I hail from a humble immigrant background and was the first of my family to go to university. If people like me are seen to be honoured for science, then I hope it will encourage young people in similar situations to take up science."

As the latest recipient of the Royal Society's premier award, Sir Alan joins an elite group of scientists that includes Charles Darwin, Albert Einstein and Dorothy Hodgkin, and more recently Professor John Goodenough (2020), recognised for his research on the rechargeable lithium battery, Peter Higgs (2015), the physicist who hypothesised the existence of the Higgs boson, and DNA fingerprinting pioneer Alec Jeffreys (2014).

Read the full story, with the complete list of Cambridge's 2020 winners and their award citations

Reproduced courtesy of the University of Cambridge

Read this article:
Nine Cambridge researchers among this year's Royal Society medal and award winners - Cambridge Network

What’s New in Computing vs. COVID-19: Bars, Visualizations, New Therapeutics & More – HPCwire

Supercomputing, big data and artificial intelligence are crucial tools in the fight against the coronavirus pandemic. Around the world, researchers, corporations and governments are urgently devoting their computing resources to this global crisis. This column collects the biggest news about how advanced technologies are helping us fight back against COVID-19.

Nvidia uses Folding@home data to visualize moving spike proteins

Folding@home's crowdsourced network of volunteer computers, which has boomed during the pandemic, has enabled the production of massive datasets describing the folding of SARS-CoV-2's viral proteins, particularly the spike protein. A scientific visualization team at Nvidia used that dataset to produce a haunting, ultra-high-resolution fly-through visualization of those proteins. To read more, click here.

Calculations on Comet boost understanding of immune responses to foreign pathogens

Researchers at the San Diego Supercomputer Center used Comet to assist a study on T cell receptors that the team says will inform understanding of the adaptive immune system's response to pathogens like SARS-CoV-2. "Our most recent study puts us one step closer to truly understanding the extreme and beneficial diversity in the immune system, and identifying features of immunity that are shared by most people," said James E. Crowe, Jr., director of the Vanderbilt Vaccine Center of Vanderbilt University Medical Center. To read more, click here.

Supercomputer research leads to clinical study of potential COVID-19 therapeutic

Research conducted under the auspices of European public-private consortium Exscalate4CoV has led to the approval of a human clinical trial studying the use of the existing osteoporosis drug raloxifene for the treatment of mild cases of COVID-19. Raloxifene was one of several drugs to emerge from a massive supercomputer-powered screening of hundreds of thousands of candidate molecules. The researchers are hopeful that the drug may halt the progression of infection in certain cases. To read more, click here.

Researchers use supercomputing to study the minute movements of the coronavirus proteins

SARS-CoV-2's notorious spike protein, which allows it to infect human cells, relies on movement to pry open and enter host cells. While the basic stages of its movement were imaged early in the year, the intermediary states between those stages had not been fully captured until now. Researchers from UC Berkeley and Istanbul Technical University used TACC supercomputers to simulate these minute movements, identifying potential middle states that could serve as useful drug targets. To read more, click here.

RIKEN teams with businesses to study and reduce the risk of COVID-19 infection in restaurants and bars

Japanese research institute RIKEN, host of the Top500-leading supercomputer Fugaku, has teamed up with Suntory Liquors Ltd. and Toppan Printing Co., Ltd. to develop face shields specifically for eating and drinking in order to reduce the risk of COVID-19 infection in restaurants and bars. The study is making use of Fugaku, which has been involved in a variety of viral droplet simulations. To read more, click here.

Corona supercomputer receives major upgrade for coronavirus research

The (coincidentally named) Corona supercomputer at Lawrence Livermore National Laboratory (LLNL) has received a major upgrade to assist with its research on the coronavirus. The system now boasts almost 1,000 new AMD Radeon Instinct MI50 GPUs, more than doubling its speed (for a total of 11 peak petaflops). "The expansion of Corona allows us to routinely run the computationally intensive molecular dynamics simulations to obtain the free energy between antibodies and antigens," said LLNL COVID-19 researcher Felice Lightstone. To read more, click here.

Brookhaven National Laboratory issues update on its work to fight the coronavirus

Since early this year, Brookhaven National Laboratory has been supporting a range of projects aimed at combating COVID-19. The lab recently issued an update on its research, highlighting a variety of supercomputer-supported research, including a scalable high-performance computing and AI infrastructure that allows for high-throughput ensemble docking studies and AI-driven molecular dynamics simulations. To read more, click here.

Link:
What's New in Computing vs. COVID-19: Bars, Visualizations, New Therapeutics & More - HPCwire

Arizona universities join forces to contribute to COVID-19 modeling and simulation efforts – ASU Now

July 14, 2020

While the COVID-19 pandemic continues to impact the world, the research computing centers at Arizona State University, Northern Arizona University and the University of Arizona have united as a team to contribute to the Folding@home project. The project utilizes idle computing power to significantly contribute to vital scientific research and therapeutic drug discovery.

The Arizona Research Computing consortium is contributing to this collective effort by using advanced computing resources to perform complex protein modeling computations during brief idle periods on local supercomputers. By running these computations only during downtime, contributions to COVID-19 modeling and simulation efforts can be made through the Folding@home project without impacting everyday research.

Artist's rendition of the SARS-CoV-2 virus; the envelope protein is shown in cyan. (Figure from Klick Health, https://covid19.klick.com/)

The Folding@home project provides people around the world the opportunity to make active contributions to a variety of scientific research efforts including COVID-19. Volunteers, or citizen scientists, can download the Folding@home software to their personal computers, allowing simulations of complex scientific processes to run in the background while their personal computers are not in use. The Folding@home project is crowdsourcing at its best, using shared computing power at a massive scale to help solve grand challenges in biomedical research.

In addition to mobilizing citizen scientists across the globe, many institutions and corporations are contributing their own computational resources, such as high performance workstations and servers. This distributed computational power is estimated to be 10 times faster than the world's fastest individual supercomputer.

The onslaught of COVID-19 has raised the visibility of the Folding@home project, highlighting a unique opportunity to fight the virus. The project seeks to understand how proteins, which are large, complex molecules that play an important role in how our bodies function, fold to perform their biological functions. This helps researchers understand diseases that result from protein misfolding and identify novel ways to develop new drug therapies.

How proteins fold or misfold can help us understand what causes diseases like cancer, Alzheimer's disease and diabetes. It might also lend insight into viruses such as SARS-CoV-2, the cause of the recent COVID-19 pandemic.

"Imagine if I told 100 people to fold a pipe cleaner. They are going to fold it in 100 different ways because there's an infinite number of combinations of how to take something that is straight and fold it," said Blake Joyce, assistant director of research computing at the University of Arizona. "That's what viruses and living things do with proteins. They make copies of themselves and fold them up in their own particular way."

Using computational modeling, researchers can explore the mechanics of proteins of the virus and predict every possible way it might fold, or physically change shape.

"In biology, shape is function. If you can disrupt that shape, the virus is inactive or can't do its thing. If you disrupt any of the mechanisms that can damage us, you have a cure, or at least something you can treat. And that is what we're after. It just takes a lot of computing to come up with every possible way to bend a pipe cleaner," Joyce said.

By running computer simulations, researchers can take the virus and see how it interacts with various compounds or drugs and narrow down which ones might work to interrupt one of the critical mechanisms the virus needs to survive.

Folding@home assigns pieces of a protein simulation to each computer and the results are returned to create an overall simulation. Folding@home computations for COVID-19 research seem to be most productive on the kind of computers found in facilities like Arizona's research computing centers, making their contributions even more valuable.
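The scatter/gather pattern described above can be sketched in a few lines. This is a schematic of the idea only: the function names are invented, and the real Folding@home client/server protocol, work-unit format, and scheduling are far more involved.

```python
def make_work_units(total_steps, unit_size):
    """Split a long simulation into independent (start, end) chunks."""
    return [(s, min(s + unit_size, total_steps))
            for s in range(0, total_steps, unit_size)]

def run_unit(unit):
    """Stand-in for one volunteer machine simulating its chunk."""
    start, end = unit
    return list(range(start, end))  # pretend each step yields one frame

def crowd_simulate(total_steps, unit_size):
    """Coordinator: scatter work units, then gather results in order."""
    units = make_work_units(total_steps, unit_size)
    results = map(run_unit, units)  # in reality: thousands of volunteers
    return [frame for chunk in results for frame in chunk]

frames = crowd_simulate(1000, 100)
print(len(frames))  # 1000 frames reassembled from 10 work units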

Volunteers can track their contributions on the Folding@home website and combine their efforts as a team, receiving points for completing work assigned to them and even earning bonus points for work that is more computationally demanding or that might have a greater scientific priority.

The Arizona Research Computing team has risen quickly in the ranks, highlighting the powerful computing capabilities at Arizona's state universities and the effectiveness of regional collaborations. As of mid-June, the Arizona Research Computing team was ranked in the top 100 out of nearly a quarter of a million teams, surpassing Hewlett Packard, Cisco Systems, Apple Computer, Inc., Google, Ireland and Poland, as well as many other university, industry and national or international contributors.

"The Folding@home project investigates many research questions that require an enormous amount of computing, but this specific use for COVID-19 provides a unique opportunity, spurring many computing centers to participate in Folding@home for the first time," said Gil Speyer, lead scientific software engineer for Arizona State University's research computing center.

Today's biomedical research requires vast amounts of time and computing power. While the Arizona Research Computing team may directly impact COVID-19 research in a small way, the overall impact of the Folding@home project is much broader and will continue to have applications beyond the COVID-19 pandemic.

ASU:

Sean Dudley, assistant vice president and chief research information officer, Research Technology Office

Douglas Jennewein, senior director, research computing, Research Technology Office

Gil Speyer, lead scientific software engineer, Research Technology Office

Marisa Brazil, associate director, research computing, Research Technology Office

Jason Yalim, postdoc, research computing, Research Technology Office

Lee Reynolds, systems analyst principal, research computing, Research Technology Office

Eric Tannenhill, senior software engineer, research computing, Research Technology Office

NAU:

Chris Coffey

UArizona:

Blake Joyce, assistant director, research computing

Todd Merritt, information technology manager, principal

Ric Anderson, systems administrator, principal

Chris Reidy, systems administrator, principal

Adam Michel, systems administrator, principal

Here is the original post:
Arizona universities join forces to contribute to COVID-19 modeling and simulation efforts - ASU Now

HPC revolutionising speed of life science research – Business Weekly

Until recently, High Performance Computing (HPC) was largely the preserve of the automotive, aerospace and financial services industries but, increasingly, the need for HPC within the life sciences sector has come to the fore.

Never has this been more evident than in the last six months of the pandemic, which has seen global HPC resources pooled in an unprecedented effort to halt the progress of COVID-19.

HPC refers to the practice of aggregating super-computing power in a way that delivers much higher performance than the typical desktop computer or workstation.

It enables researchers to analyse vast datasets, run extraordinarily large numbers of calculations in epidemiology, bioinformatics and molecular modelling and solve highly complex scientific problems.

Most significantly, it enables scientists and researchers to do in hours and days what would have, otherwise, taken months and years via slower, traditional computing platforms.
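One way to make the "months become hours" claim concrete is Amdahl's law, which caps the speedup of a job with serial fraction s when run on n processors. The numbers below are illustrative, not taken from the article.

```python
def amdahl_speedup(serial_fraction, n_processors):
    """Maximum speedup of a job whose serial fraction cannot be parallelised."""
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_processors)

# A 95%-parallel analysis on 1,000 cores tops out near 20x:
print(round(amdahl_speedup(0.05, 1000), 1))  # 19.6
```

For embarrassingly parallel workloads such as large drug-candidate screens, the serial fraction is tiny, which is why aggregating thousands of nodes can compress months of computation into hours or days.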

According to HPC expert Adrian Wander, over the next few years "we will start seeing an ever-increasing democratisation of HPC which will be central to super-computing becoming a routine part of life sciences and pharmaceutical research."

A former research student with a PhD in theoretical solid-state physics, Adrian was a chemistry lecturer at Cambridge University, working under Professor Sir David King (later chief scientific adviser to Tony Blair's government), before devoting the next 30 years to working in HPC.

During this time, Adrian was an integral part of the team that set up the Hartree Centre (home to some of the most advanced computing, data and AI technologies in the UK), ran the scientific computing department at the Science & Technology Facilities Council, and then moved to the European Centre for Medium-Range Weather Forecasts (ECMWF), delivering highly critical, time-sensitive computations that relied heavily on HPC modelling for speed and accuracy.

He says: "Historically, HPC was seen as a complex, specialist activity requiring special machines and expert techniques. But, especially in recent months, we have seen a dramatic change in who wants and needs access to HPC."

"Traditionally, the biggest users were the automotive and aerospace sectors. But in the current COVID-19 climate, aeroplanes aren't being purchased and consumers aren't buying cars, whereas we are seeing a dramatic rise in the use of HPC within the life sciences industry, not least because the need has never been greater."

HPC expert Adrian Wander

The life sciences sector was initially slow to catch onto HPC compared with the physical science community. In 2007, the Biotechnology & Biological Sciences Research Council (BBSRC) paid for 10 per cent of Britain's academic national supercomputer service, HECToR (High End Computing Terascale Resource), but it ended up dropping the partnership because it wasn't being used by its community.

"More recently there has been a huge upsurge in life sciences to explore workloads more actively and efficiently, e.g. via simulation and modelling of protein folding and the structures of proteins, and similar areas. And, of course, Artificial Intelligence is being relied on heavily in the search for new drugs by automatically sampling huge numbers of drug candidates against the target treatment."

Adrian adds: "Over the next few years, HPC will become both simpler to use and more easily available, and it will inevitably become a more mainstream part of research portfolios, just as incubators and wet benches are now."

"Even with genome sequencing, the Oxford Nanopore system takes you down into quite small organisations doing this kind of work, because the new sequencing machines make sequencing quite easy to do."

"The tricky part is putting all the bits together to assemble the full genome, and this requires increasing amounts of compute power, irrespective of the size of the organisation doing the assembly."
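The assembly step described above can be illustrated with a toy greedy assembler that repeatedly merges the two reads with the longest suffix/prefix overlap. This is a deliberately naive sketch with invented reads; every round compares all pairs, which hints at why assembling millions of real reads demands serious compute power and much cleverer data structures.

```python
def overlap(a, b):
    """Length of the longest suffix of a that is a prefix of b."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads):
    """Merge the best-overlapping pair until one sequence remains."""
    reads = list(reads)
    while len(reads) > 1:
        k, a, b = max((overlap(a, b), a, b)
                      for a in reads for b in reads if a is not b)
        reads.remove(a)
        reads.remove(b)
        reads.append(a + b[k:])
    return reads[0]

# Three made-up overlapping reads reassemble into one contig.
print(greedy_assemble(["ATTAGAC", "GACCTG", "CTGCCA"]))  # ATTAGACCTGCCA
```

Each merge here scans every pair of reads, so the cost grows rapidly with read count regardless of who runs it, which matches the point that even small organisations need substantial compute for assembly.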

The technological advance in this field has been incredible: sequencing the first human genome was an international effort that cost around $1 billion and took 13 years to complete. Today, genomic studies and meta-genomics are routinely run for between $3,000 and $5,000 and take little more than a couple of days to complete.

To quantify the remarkable difference HPC is making to scientific discovery, one only has to note the following: After HIV-1 (the main cause of AIDS in humans) was first identified in 1981, it took almost three more decades to genetically decode it.

Four years later, in 2013, the MERS outbreak (due to another coronavirus) was decoded within three months. This year the genome behind COVID-19 was decoded and published globally within days. Things are indeed changing in the life sciences sector. And rapidly so.

At the end of May, this country, led by UK Research and Innovation, became the first European super-computing partner to join the COVID-19 High Performance Computing Consortium, contributing more than 20 petaflops of HPC capability to the global effort addressing the coronavirus crisis.

The consortium currently has 56 active projects and more than 430 petaflops of compute which, collectively, is equal to the computing power of almost 500,000 laptops.

For perspective, a supercomputer with just eight petaflops can do a million calculations per person in the world per second. But, by pooling supercomputing capacity the consortium offers 50 times that and hopes to reduce the time it takes to discover new molecules which could lead to coronavirus treatments and a vaccine.
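The per-person figure above checks out with back-of-envelope arithmetic, assuming a 2020 world population of roughly 7.8 billion:

```python
petaflops = 8
flops = petaflops * 10**15              # 8 petaflops in operations per second
world_population = 7.8e9                # approximate 2020 estimate
per_person_per_second = flops / world_population
print(f"{per_person_per_second:,.0f}")  # roughly a million per person per second
```

By the same arithmetic, the consortium's 430-plus petaflops works out to more than 50 times that capacity, matching the article's figure.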

Last week, it was revealed that Summit, the world's second-fastest supercomputer, had been employed to produce a genetic study of COVID-19 patients, analysing 2.5 billion genetic combinations in just two weeks.

The insights which Summit has produced through HPC and AI are significant in understanding how the coronavirus causes COVID-19, and additionally indicate potential new therapies to treat the worst symptoms.

Of course, return on investment is also important in driving the move towards HPC. Hyperion Research recently reported that every dollar spent on HPC in life sciences earns $160 of which $41 is profit or cost-saving.

Adrian explains: "On the face of it, when you look at the cost of hosted services, it might seem expensive. But you need to weigh that against the fact that you no longer need an in-house team of high voltage engineers and all that comes with them."

And for the big drug companies and pharmas using increasing amounts of HPC cycles for drug discovery, personalised healthcare has the potential to become a huge profit-making market.

With astronomical volumes of computational life sciences data being produced daily, the need for secure storage to house it, and for advanced computing infrastructure to rapidly analyse the vast datasets, is becoming paramount. That's why more and more research institutions are outsourcing or co-locating to specialist data centres such as the bleeding-edge Kao Data campus in Harlow.

Kao Data's director, Adam Nethersole, explains: "Staying static isn't an option within a highly competitive sector where being first to market is everything, especially when we're talking about life-saving treatments, medicines and vaccines."

"So across the life sciences sector, most universities, laboratories and research institutes are looking to expand their access to high performance computing."

"But, of course, many of these organisations are pretty landlocked, with old architecture, and there isn't space available that can be turned into a hyper-efficient data centre unless you're in new, state-of-the-art facilities."

"And even if you are able to scale internally, and have the technical expertise in-house to do it, you still need to consider how you're going to power and cool the additional servers, especially in locations like Cambridge where there simply isn't a vast surplus of electricity readily available to be utilised."

"One solution is using hyperscale cloud services but, while these are great for streaming videos, music and playing video games, they aren't optimal for specialist computing, which requires servers located close together (and not virtualised in the cloud) and, in many cases, bespoke, tailored IT architecture. Cloud platforms also tend to be expensive when moving large amounts of data."

"This is why we're seeing an increasing number of enquiries about moving computing infrastructure off-premise and into an advanced, industrial-scale data centre like the one we operate at Kao Data."

"With multi-megawatts of power and space available immediately, and excellent connectivity back to Cambridge city's research parks, we're ideally placed to support."

"One of Cambridge's most forward-thinking research institutions, EMBL-EBI, has already done this, and we are in conversation with others about helping them plan their computing footprint for the next 10, 15 and 20 years."

The rest is here:
HPC revolutionising speed of life science research - Business Weekly

Hopkins chemist awarded New Innovator Award from National Institutes of Health – The Hub at Johns Hopkins

By Rachel Wallach

Proteins must fold themselves into specific three-dimensional shapes to perform the tasks the cell requires for function and survival. But when proteins fold into the wrong shape, they aggregate, or clump together; think of the way eggs transform from liquid to solid during cooking.

Misfolded proteins can disrupt the normal functioning of the cell, and are associated with a wide range of diseases. When the proteins inside neurons aggregate, the toxic structures they create cause neurodegenerative diseases like Alzheimer's and Parkinson's. "Over billions of years of evolution, cells were challenged with the task to get their proteins to fold up correctly and stay that way," says Stephen Fried, assistant professor in the Department of Chemistry in the Krieger School of Arts and Sciences. "But we humans live for a pretty long time, and as the proteins in our brains get older, there seems to be a slow process where they forget what shape they're supposed to be in. They form structures that stick to each other, ultimately leading to the death of neurons, dementia, and other maladies associated with age."

Fried has received an NIH Director's New Innovator Award from the National Institutes of Health's High-Risk, High-Reward Research program to continue his studies into both the normal process of protein folding and what happens on a molecular level when the process goes awry. The award, in the amount of $1.5 million over five years, supports unusually innovative research from early career investigators.

"The breadth of innovative science put forth by the 2020 cohort of early career and seasoned investigators is impressive and inspiring," said NIH Director Francis S. Collins. "I am confident that their work will propel biomedical and behavioral research and lead to improvements in human health."

Scientists have been trying to understand the folding process for some time by studying purified proteins in test tubes. What sets Fried's research apart is that he and his team are studying the normal and abnormal folding of proteins in their native context, in this case within rodent brains.

"We think that the tools we're developing on the front part of our project will give us a new view into why cells are so good at getting their proteins to assemble into such complicated and intricate architectures," Fried says. "We will then apply the tools to take a look at what's going on inside rats' brains at the molecular level when they age. Specifically, we want to know what's different in cognitively healthy versus cognitively impaired rats."

Fried has been collaborating with Michela Gallagher, Krieger-Eisenhower Professor of Psychology and Neuroscience, whose long-term research on Alzheimer's-related changes within the brain has produced experimental drugs now in clinical trials. While Gallagher's team focuses on the brain's structures, Fried's team complements the research by working at the molecular level. "Our collaboration will zoom in on the proteins forming incorrect shapes inside the brain; what they are interacting with, and what shapes they are forming," Fried says.

It is the opportunity for such interdisciplinary work that made the NIH award possible, Fried says, pointing to the preliminary data he and Gallagher were able to produce that he believes convinced the reviewers to take a chance on this uncharted territory.

Jotham Suez, a postdoctoral fellow at the Weizmann Institute of Science who is expected to join the Johns Hopkins Bloomberg School of Public Health as an assistant professor in the Department of Molecular Microbiology and Immunology in January, was also awarded an Early Independence Award from the High-Risk, High-Reward Research program. A microbiologist, Suez focuses on how non-nutritive sweeteners affect the microbiome. His early research indicated that non-caloric sweeteners can negatively affect health by disrupting gut bacteria, and his research at JHU will focus on deciphering the underlying mechanisms.

The High-Risk, High-Reward Research program catalyzes scientific discovery by supporting research proposals that, due to their inherent risk, may struggle in the traditional peer-review process despite their transformative potential. Program applicants are encouraged to think "outside the box" and to pursue trailblazing ideas in any area of research relevant to the NIH's mission to advance knowledge and enhance health.

Read more here:
Hopkins chemist awarded New Innovator Award from National Institutes of Health - The Hub at Johns Hopkins