Rett Syndrome Tied to Altered Protein Levels in Brain in Early Study – Rett Syndrome News

Lack of a functional MeCP2 protein leads to Rett syndrome by altering levels of brain proteins associated with energy metabolism and protein regulation, a study in a mouse model suggests.

These altered protein levels might also predict Rett syndrome's progression, the investigators said.

The study, "Brain protein changes in Mecp2 mouse mutant models: Effects on disease progression of Mecp2 brain specific gene reactivation," was published in the Journal of Proteomics.

Rett syndrome is caused by mutations in the MECP2 gene that result in a missing functional MeCP2 protein, a regulator of gene expression. Despite prior studies in animal models, little research has focused on the effects of MeCP2 deficiency on the levels of other proteins in the brain, or on Rett syndrome's progression.

Researchers from Italy used a mouse model of Rett to address this gap. They did a proteomic analysis of the brains of mice both before and after they developed symptoms, and compared the data to controls without MECP2 mutations. (Proteomics is the large-scale study of proteins, conducted to draw more global conclusions than possible if assessing proteins one-by-one.)

Results showed abnormal levels of 20 brain proteins in symptomatic mice with Rett syndrome. Twelve of these proteins were overproduced, while eight were at lower levels compared to non-diseased control mice.

Notably, eight (40%) of these 20 proteins were involved in energy metabolism (the process by which cells get energy), and six (30%) were involved in proteostasis, which refers to cellular processes to ensure proper production and folding of proteins.

Presymptomatic mice showed abnormal levels in 18 proteins; 10 at low levels and 8 at high levels compared to controls. Similar to symptomatic mice, these proteins were primarily involved in energy metabolism and proteostasis.

The team then looked at mice that had been engineered to turn the MECP2 gene on in the brain, which was associated with mild symptoms and a longer life than otherwise expected.

By comparing animals lacking functional MeCP2 to mice with so-called MECP2 gene reactivation, the researchers worked to identify the proteins most directly impacted by missing MeCP2.

They found 12 proteins whose levels were normalized by gene reactivation. Seven of these proteins were at low levels and five at high levels without functional MeCP2 protein. Again, most were associated with energy metabolism and proteostasis, while two proteins were involved in redox regulation, which is how cells respond to oxidants (reactive molecules that can damage DNA and cellular structures).

Only two of these 12 proteins, PYL2 and SODC, had been previously associated with Rett syndrome via earlier animal model studies that recorded altered levels in the brain.

"Our findings suggest that RTT [Rett syndrome] is characterized by a complex metabolic dysfunction strictly related to energy metabolism, proteostasis processes pathways and redox regulation mechanisms," the researchers wrote.

"Alteration in the evidenced cellular processes, brain pathways and molecular mechanisms [suggest] the possibility of the use of proteins as predictive biomarkers," they added.

Marisa holds an MS in Cellular and Molecular Pathology from the University of Pittsburgh, where she studied novel genetic drivers of ovarian cancer. She specializes in cancer biology, immunology, and genetics. Marisa began working with BioNews in 2018, and has written about science and health for SelfHacked and the Genetics Society of America. She also writes/composes musicals and coaches the University of Pittsburgh fencing club.

Two years in the making, Pizza Hut tests a round pizza box – Fast Company

This week, Pizza Hut is using one Phoenix location to test two innovations. First up is a trial of plant-based Italian sausage as a pizza topping on a Garden Specialty Pizza, complemented by banana peppers, onions, and mushrooms. The sausage is made by Kellogg's incomparably named Incogmeato brand. Pizza toppings made from plant-based meat are still rather novel; in the United States, only Little Caesars has tried one, testing an Italian sausage crumble from Impossible last spring.

But while the protein may top a lot of the headlines, the real potential for change is in its second move here: a round pizza box.

According to Pizza Hut, the new box keeps pizzas crispier, and it relies on less overall packaging compared with a typical square pizza box. The company's chief customer and operations officer, Nicolas Burquier, says Pizza Hut worked with the startup Zume, best known for robotic pizza trucks, for more than two years on this design.

"The goal was to design a box that simply makes our pizza taste even better: hotter pizza, crispier crust," says Burquier. "This box will improve the pizza-eating experience for our customers and simplify operations for our team members."

The new round box has grooves to help catch grease and prevent soggy crust, and the top latches to keep heat in. It also interlocks easily to stack compactly, cutting out the employee effort that typically must be devoted to folding pizza boxes.

It may be innovative, but Pizza Hut is hardly the first to cut the corners off its pizza boxes. Back in 2010, Apple actually filed a patent for a round pizza box of its own, then touted it in a fun, three-minute ad called "The Underdogs" back in April.

In 2018, sustainable packaging company World Centric unveiled its own version of the round box. And even way back in 2004, a round pizza box called the Presseal was introduced by an inventor named John Harvey. None managed to catch on at a scale that even comes close to threatening the dominance of the almighty square.

But Burquier is confident that could change.

"One day in the future we'll reminisce about the idea of round pizzas in square boxes and laugh," says Burquier. The company plans to evaluate how the limited rollout in Phoenix goes, and look for ways to expand it across the country.

Insights into Parkinson’s Onset May Lie in New Model of Cell Aging and Damage – Parkinson’s News Today

A newly created model helps to clarify the processes by which cells grow old and die, and which are known to be involved in the onset of neurodegenerative disorders like Alzheimer's and Parkinson's disease.

The study describing this model, "Proteostasis collapse is a driver of cell aging and death," was published in PNAS.

To remain healthy, cells must be able to produce proteins and chaperone them: keeping proteins correctly folded, and destroying those that aren't.

But as cells age, oxidative stress (an imbalance between reactive and inflammatory free radicals and the ability of cells to detoxify them) slowly leads to the accumulation of irreparably damaged proteins inside cells that eventually overwhelm their quality control mechanisms.

"Irreparably damaged proteins accumulate with age, increasingly distracting the chaperones from folding the healthy proteins the cell needs. The tipping point to death occurs when replenishing good proteins no longer keeps up with depletion from misfolding, aggregation, and damage," the researchers wrote.

Investigators with the Laufer Center for Physical & Quantitative Biology at Stony Brook University created a model that is able to predict the lifespan of the roundworm Caenorhabditis elegans, an animal model often used in aging studies, based on its protein quality control mechanisms.

In their study, scientists showed their model's predictions matched the results of experiments they performed on roundworms to assess the effects of oxidative damage on the animals' lifespan.

In one experiment, they found that animals raised at a temperature of 20 degrees Celsius (about 68 degrees Fahrenheit) had an average lifespan of 20 days. Worms raised at higher temperatures and in the presence of free radicals (byproducts of oxidative stress), however, had lifespans of only a few hours.

"As the cell is stressed by heat, proteins unfold, misfold, and aggregate. Chaperones are recruited, but with age, the synthesis [production] of good protein and the chaperoning of those spontaneously unfolding ultimately succumb to damage levels, at which bad protein becomes overwhelming," the researchers said.

Their work also found that mutant animals with more chaperones or proteasomes (complexes of enzymes responsible for the destruction of unnecessary or damaged proteins) lived longer.

All these findings were in agreement with the foundations of their model, which stated that oxidative stress and protein instability increase with age and are the root cause of cell degeneration.

"This modeling is unique by being mathematically detailed and describing a broad range of cellular processes across the cell's whole proteome [all proteins found in a cell]," Ken A. Dill, PhD, a distinguished professor and director of the Laufer Center for Physical & Quantitative Biology, and a study co-author, said in a news release.

"Often, aging-related studies look at the effects of one or two proteins at a time, rather than seeking, more generally, the cellular aging mechanism itself," Dill added.

This study also sets the foundation for future research into the molecular origins of aging disorders associated with protein misfolding, such as Parkinson's.

Joana holds a BSc in Biology and an MSc in Evolutionary and Developmental Biology from Universidade de Lisboa. She is currently finishing her PhD in Biomedicine and Clinical Research at Universidade de Lisboa. Her work has been focused on the impact of non-canonical Wnt signaling in the collective behavior of endothelial cells (cells that make up the lining of blood vessels) found in the umbilical cord of newborns.

Ana holds a PhD in Immunology from the University of Lisbon and worked as a postdoctoral researcher at Instituto de Medicina Molecular (iMM) in Lisbon, Portugal. She graduated with a BSc in Genetics from the University of Newcastle and received a Master's in Biomolecular Archaeology from the University of Manchester, England. After leaving the lab to pursue a career in Science Communication, she served as the Director of Science Communication at iMM.

UT molecular evolution professor named 2019 American Physical Society Fellow – UT The Daily Texan

The American Physical Society recently recognized a UT professor of molecular evolution as one of its 2019 fellows.

Claus Wilke said he was nominated by the Division of Biological Physics within the society and was given the fellowship in October. Associate physics professor Vernita Gordon said Wilke got his nomination for his studies of protein biophysics and molecular evolution.

Gordon said half of 1% of society members receive fellowships every year based on notable research findings in their field of study. Gordon also said 168 fellows were selected this year, and seven of these fellows were nominated within the biophysics field.

"(Wilke) is really deserving of this award," Gordon said. "There's stuff to back up why they have made contributions at a level significant enough that they should be recognized as an APS fellow."

Wilke said his research identifying the primary driver of protein sequences, or sequences of organic compounds known as amino acids, kick-started his nomination from the division. He said his research combined physics and biology to showcase the functions of protein folding and the advancement of genetic sequences.

Wilke said he researched genetic mutations and patterns visible in the evolution of genomes, or an organism's genetic material, related to mutation structures. He said he studied the areas where more harm is caused from mutation and how it would affect the shape of protein sequences.

"Everybody knows Jenga," Wilke said. "So there's pieces that you just take out, and nothing is going to happen because the tower is stable, but there's other pieces that after a while you can't touch them because the whole thing would fall. So, what I'm doing is a lot like that. I'm trying to figure out which are the (proteins) that can be changed and which parts can't."

Nursing freshman Margarita Ramirez said she is becoming aware of the topics Wilke researched and how they compare to previous biophysics research.

"It's good that he got awarded for (sequences) because it's something having to do with evolution, and not a lot of people are looking into proteins," Ramirez said. "His study is very unique compared to other biological professionals because it's not just looking at DNA. It's looking deeper."

Wilke will be presented with the award in March 2020 during an annual ceremony where the society hosts fellows and other contributors to the study of physics.

Christopher Dobson: chemist whose work on proteins advanced research into neurodegenerative diseases – The BMJ

Christopher Dobson, master of St John's College, Cambridge, whose work on proteins advanced research into diseases such as Parkinson's and Alzheimer's, died at the Royal Marsden Hospital, Surrey, at the age of 69.

Born in Rinteln, Germany, the son of Arthur Dobson, an army officer, and Mabel, née Pollard, Christopher Dobson was educated at Hereford Cathedral Junior School, Abingdon School (where he was a rowing cox), and Keble College, Oxford, where he took a first in chemistry before going on to take a DPhil at Merton College. Both his parents were originally from Bradford in Yorkshire and had left school aged 14. He had two elder siblings, Graham and Gillian. Because of his father's postings, Dobson's early life was fairly nomadic. "He grew up in Nigeria in the formative part of his childhood, which created a lifelong fascination with different cultures," said his son, William. He first wanted to be an architect but owing to inspirational science teachers at Abingdon, he instead chose to study chemistry at university.

Dobson devoted his life to researching diseases such as Parkinson's and Alzheimer's and understanding the chemical processes that disrupt the production of healthy proteins and instead trigger their aggregation into toxic clumps. He became one of the world's leading experts on protein folding and aggregation, and its links to neurodegenerative conditions. "Alzheimer's disease is a new plague, already affecting 40 million people worldwide," he said. "It's one of a group of non-infectious diseases that terrifies us."

Diseases Folding@home

The Folding@home project (FAH) is dedicated to understanding protein folding, the diseases that result from protein misfolding and aggregation, and novel computational ways to develop new drugs in general. Here, we briefly describe our goals, what we are doing, and some highlights so far.

A distributed computing project must not only run calculations on millions of PCs, but such projects must produce results, especially in the form of peer-reviewed publications, public lectures, and other ways that disseminate the results from FAH to the greater scientific community. In the sidebar, you will find links to our progress in different areas.

You will also find updates about our work, advancements and new projects in the main Folding@home blog.

Proteins are necklaces of amino acids, long chain molecules. They are the basis of how biology gets things done. As enzymes, they are the driving force behind all of the biochemical reactions that make biology work. As structural elements, they are the main constituent of our bones, muscles, hair, skin and blood vessels. As antibodies, they recognize invading elements and allow the immune system to get rid of the unwanted invaders. For these reasons, scientists have sequenced the human genome, the blueprint for all of the proteins in biology, but how can we understand what these proteins do and how they work?

However, only knowing this sequence tells us little about what the protein does and how it does it. In order to carry out their function (e.g. as enzymes or antibodies), they must take on a particular shape, also known as a fold. Thus, proteins are truly amazing machines: before they do their work, they assemble themselves! This self-assembly is called folding.

Diseases such as Alzheimer's disease, Huntington's disease, cystic fibrosis, BSE (Mad Cow disease), an inherited form of emphysema, and even many cancers are believed to result from protein misfolding. When proteins misfold, they can clump together (aggregate). These clumps can often gather in the brain, where they are believed to cause the symptoms of Mad Cow or Alzheimer's disease.

DeepMind timeline: The history of the UK’s pioneering AI firm – Techworld.com

The London startup has made headlines for both breakthroughs and controversies since it was founded in 2010

DeepMind's efforts to achieve artificial general intelligence have won the firm both plaudits and critics since it was founded in 2010. The firm's research into deep learning techniques convinced the search engine giant Google to spend £400 million on the company in 2014, but it has since incurred heavy losses, and while its scientific discoveries have earned acclaim, DeepMind has also been rebuked for its laissez-faire approach to data privacy and security practices.

Here's our timeline of DeepMind's short but eventful history.

DeepMind was founded in London by machine learning researcher Shane Legg and childhood friends Demis Hassabis and former consultant Mustafa Suleyman. The cofounders all met at University College London, where Legg was a research associate and Hassabis was studying for a PhD in cognitive neuroscience.

The trio declared a grand ambition for their new company: "To solve intelligence and then to use that to solve everything else."

They initially pursued this lofty goal through video games. A 16-year-old Hassabis had co-developed the hit simulation game Theme Park, and at 22 was running his own games studio. He combined this experience with his neuroscience PhD to create AI programmes that could master video games. One of these systems taught itself how to play 49 different Atari games, including Pong and Space Invaders, just by viewing the score and pixels on the screen.

These experiments with video games led DeepMind to focus on an AI technique called deep reinforcement learning, which combines the pattern recognition of deep learning with the reward signals for completing tasks achieved through reinforcement learning.

DeepMind announced the technique in a research paper about its Atari trials, which it called "the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning."

The technique was responsible for DeepMind's most impressive achievements, but the company's relentless focus on the technique has been questioned by some AI experts. In August 2019, Gary Marcus, the founder of Robust.AI and a professor of psychology and neural science at NYU, noted in Wired that the company was still yet to find a large-scale commercial application of deep reinforcement learning.

"Ten years from now we will conclude that deep reinforcement learning was overrated in the late 2010s, and that many other important research avenues were neglected," he wrote. "Every dollar invested in reinforcement learning is a dollar not invested somewhere else, at a time when, for example, insights from the human cognitive sciences might yield valuable clues."

Google made DeepMind one of its biggest-ever European acquisitions when it splashed out £400 million on the London-based startup. Google agreed to establish an AI ethics board as part of the deal, but the members and workings of the board have never been made clear.

A DeepMind-created system became the first AI to beat a professional Go player when AlphaGo routed European champion Fan Hui by a score of five to zero. Later that year, the system defeated Ke Jie, the world's number one player of the ancient and highly complex board game.

DeepMind began its controversial relationship with the Royal Free hospital in London when the two organisations signed a deal that gave the Google subsidiary access to healthcare data on 1.6 million patients. DeepMind later announced that the partnership had yielded an app called Streams that would help clinicians monitor patients for early signs of kidney disease.

DeepMind turned its ambition to use AI to improve healthcare into a separate division of the company: DeepMind Health. Suleyman, whose mother was an NHS nurse, was chosen to lead the unit.

Suleyman went on to sign further NHS deals with Taunton & Somerset Foundation Trust, Yeovil District Hospital, University College London Hospital, Imperial College Healthcare and Moorfields Eye Hospital to apply AI to various medical challenges.

The Information Commissioner's Office (ICO), the UK's data regulator, ruled that the Royal Free "failed" to comply with data protection rules when it provided DeepMind with patient data as it didn't properly inform patients about how their details would be used.

The Royal Free accepted the findings and was not fined. The Trust announced that it had started to address the concerns.

DeepMind revealed it had attracted a major client in the US when it announced that it was teaming up with the US Department of Veterans Affairs to predict patient deterioration by analysing patterns in medical records.

The project also involves researching ways to improve the algorithms DeepMind uses to detect acute kidney injury.

Privacy campaigners raised alarm when DeepMind announced that its healthcare subsidiary was being absorbed into Google. The arrangement meant that the group would no longer operate as an independent unit but instead merge with the newly-formed Google Health team, led by former Geisinger CEO David Feinberg.

Critics argued that the shift betrayed DeepMind's promise never to share data with its parent company. DeepMind claimed that all patient data would remain separate from Google services and projects.

DeepMind made its biggest scientific breakthrough yet when its AlphaFold system won a competition to predict the 3D shapes of proteins based on their genetic codes. The victory suggested that AI could help understand the protein-folding puzzle that plays a key role in the development of new drugs.

"This is a lighthouse project, our first major investment in terms of people and resources into a fundamental, very important, real-world scientific problem," DeepMind CEO Demis Hassabis told the Guardian.

DeepMind continued its long history of applying AI to video games by introducing AlphaStar, a programme that can play strategy game StarCraft II. The system went on to defeat some of the world's best StarCraft II players.

DeepMind announced that Mustafa Suleyman, the company's cofounder and head of applied artificial intelligence, was leaving the company for an indefinite period that the company said would likely end later the same year. DeepMind claimed that the decision was mutual and not related to his performance, but rumours spread that his departure was related to the company's various healthcare controversies.

On September 18, Dr Dominic King, the UK site lead at Google Health, announced in a blogpost that Google had completed its takeover of DeepMind's health division.

"It's clear that a transition like this takes time," he wrote. "Health data is sensitive, and we gave proper time and care to make sure that we had the full consent and cooperation of our partners. This included giving them the time to ask questions and fully understand our plans and to choose whether to continue our partnerships. As has always been the case, our partners are in full control of all patient data and we will only use patient data to help improve care, under their oversight and instructions."

The Royal Free, University College London Hospitals, Imperial College Healthcare, Moorfields Eye Hospital, Taunton & Somerset, and University College London Hospitals NHS Foundation Trust all went on to release statements confirming that their contractual arrangements had been moved to Google.

Protein Structures: Primary, Secondary, Tertiary, Quaternary …

Proteins are the largest and most varied class of biological molecules, and they show the greatest variety of structures. Many have intricate three-dimensional folding patterns that result in a compact form, but others do not fold up at all (natively unstructured proteins) and exist in random conformations. The function of proteins depends on their structure, and defining the structure of individual proteins is a large part of modern Biochemistry and Molecular Biology.

To understand how proteins fold, we will start with the basics of structure, and progress through to structures of increasing complexity.

Peptide Bonds

To make a protein, amino acids are connected together by a type of amide bond called a peptide bond. This bond is formed between the alpha amino group of one amino acid and the carboxyl group of another in a condensation reaction. When two amino acids join, the result is called a dipeptide, three gives a tripeptide, etc. Multiple amino acids result in a polypeptide (often shortened to peptide). Because water is lost in the course of creating the peptide bond, individual amino acids are referred to as amino acid residues once they are incorporated. Another property of peptides is polarity: the two ends are different. One end has a free amino group (called the N-terminal) and the other has a free carboxyl group (C-terminal).

In the natural course of making a protein, polypeptides are elongated by the addition of amino acids to the C-terminal end of the growing chain. Conventionally, peptides are written N-terminal first; therefore gly-ser is not the same as ser-gly (in one-letter code, GS is not the same as SG). The connection gives rise to a repeating pattern of NCC-NCC-NCC atoms along the length of the molecule. This is referred to as the backbone of the peptide. If stretched out, the side chains of the individual residues project outwards from this backbone.
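The N-terminal-first convention can be illustrated with a short Python sketch. It converts hyphenated three-letter residue names to one-letter codes and spells out the repeating NCC backbone pattern. The helper names `to_one_letter` and `backbone_pattern` are hypothetical, introduced here only for illustration; the lookup table is the standard three-letter to one-letter code for the 20 common amino acids.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "ala": "A", "arg": "R", "asn": "N", "asp": "D", "cys": "C",
    "gln": "Q", "glu": "E", "gly": "G", "his": "H", "ile": "I",
    "leu": "L", "lys": "K", "met": "M", "phe": "F", "pro": "P",
    "ser": "S", "thr": "T", "trp": "W", "tyr": "Y", "val": "V",
}


def to_one_letter(peptide: str) -> str:
    """Convert e.g. 'gly-ser' to 'GS', keeping the N-terminal-first order."""
    return "".join(THREE_TO_ONE[name.lower()] for name in peptide.split("-"))


def backbone_pattern(n_residues: int) -> str:
    """Each residue contributes N, C-alpha and C to the backbone, in that order."""
    return "-".join(["NCC"] * n_residues)


if __name__ == "__main__":
    print(to_one_letter("gly-ser"))  # GS
    print(to_one_letter("ser-gly"))  # SG
    print(backbone_pattern(3))       # NCC-NCC-NCC
```

Because the sequence is stored as an ordered string, reversing the residue order gives a different string, which mirrors the point that gly-ser and ser-gly are different peptides.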

The peptide bond is written as a single bond, but it actually has some characteristics of a double bond because of the resonance between the C-O and C-N bonds.

This means that the six atoms involved are coplanar, and that there is not free rotation around the C-N axis. This constrains the flexibility of the chain and prevents some folding patterns.

Primary Structure of Proteins

It is convenient to discuss protein structure in terms of four levels (primary to quaternary) of increasing complexity. Primary structure is simply the sequence of residues making up the protein. Thus primary structure involves only the covalent bonds linking residues together.

The minimum size of a protein is defined as about 50 residues; smaller chains are referred to simply as peptides. So the primary structure of a small protein would consist of a sequence of 50 or so residues. Even such small proteins contain hundreds of atoms and have molecular weights of over 5000 Daltons (Da). There is no theoretical maximum size, but the largest protein so far discovered has about 30,000 residues. Since the average molecular weight of a residue is about 110 Da, that single chain has a molecular weight of over 3 million Daltons.
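The size estimates above are simple multiplications, sketched here in Python. The function name `estimate_mass_da` is a hypothetical helper; the only inputs are the figures quoted in the text (about 110 Da per residue, 50 residues for a small protein, about 30,000 residues for the largest known chain).

```python
AVERAGE_RESIDUE_MASS_DA = 110  # average residue mass quoted in the text


def estimate_mass_da(n_residues: int) -> int:
    """Rough molecular weight: residue count times the average residue mass."""
    return n_residues * AVERAGE_RESIDUE_MASS_DA


if __name__ == "__main__":
    print(estimate_mass_da(50))      # 5500
    print(estimate_mass_da(30_000))  # 3300000
```

This is only a back-of-the-envelope estimate; the real mass depends on the actual residue composition of the chain.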

Secondary Structure

This level of structure describes the local folding pattern of the polypeptide backbone and is stabilized by hydrogen bonds between N-H and C=O groups. Various types of secondary structure have been discovered, but by far the most common are the orderly repeating forms known as the α helix and the β sheet.

An α helix, as the name implies, is a helical arrangement of a single polypeptide chain, like a coiled spring. In this conformation, the carbonyl and N-H groups are oriented parallel to the axis. Each carbonyl is linked by a hydrogen bond to the N-H of a residue located 4 residues further on in the sequence within the same chain. All C=O and N-H groups are involved in hydrogen bonds, making a fairly rigid cylinder. The α helix has precise dimensions: 3.6 residues per turn, 0.54 nm per turn. The side chains project outward and contact any solvent, producing a structure something like a bottle brush or a round hair brush. An example of a protein with many α helical structures is the keratin that makes up human hair.
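The helix dimensions quoted above imply a rise of about 0.15 nm per residue (0.54 nm divided by 3.6 residues). The Python sketch below works this out and lists the hydrogen-bonded residue pairs (residue i to residue i + 4). The function names are hypothetical helpers, and the model assumes an ideal, uninterrupted helix, which real proteins only approximate.

```python
RESIDUES_PER_TURN = 3.6  # from the text
PITCH_NM = 0.54          # length of one turn along the helix axis, from the text

# Distance each residue advances along the helix axis (about 0.15 nm).
RISE_PER_RESIDUE_NM = PITCH_NM / RESIDUES_PER_TURN


def helix_turns(n_residues: int) -> float:
    """Number of turns in an ideal helix of n_residues."""
    return n_residues / RESIDUES_PER_TURN


def helix_length_nm(n_residues: int) -> float:
    """Approximate length of an ideal helix along its axis, in nanometres."""
    return n_residues * RISE_PER_RESIDUE_NM


def hydrogen_bond_pairs(n_residues: int) -> list[tuple[int, int]]:
    """Pairs (i, i + 4): the C=O of residue i bonds to the N-H of residue i + 4."""
    return [(i, i + 4) for i in range(1, n_residues - 3)]


if __name__ == "__main__":
    print(f"{RISE_PER_RESIDUE_NM:.2f} nm per residue")
    print(f"{helix_turns(36):.1f} turns, {helix_length_nm(36):.2f} nm long")
    print(hydrogen_bond_pairs(6))
```

Residues are numbered from 1 here, so a six-residue helix has just two such hydrogen bonds, between residues 1 and 5 and between residues 2 and 6.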

The structure of a β sheet is very different from the structure of an α helix. In a β sheet, the polypeptide chain folds back on itself so that polypeptide strands lie side by side, and are held together by hydrogen bonds, forming a very rigid structure. Again, the polypeptide N-H and C=O groups form hydrogen bonds to stabilize the structure, but unlike the α helix, these bonds are formed between neighbouring polypeptide (β) strands. Generally the primary structure folds back on itself in either a parallel or antiparallel arrangement, producing a parallel or antiparallel β sheet. In this arrangement, side chains project alternately upward and downward from the sheet. The major constituent of silk (silk fibroin) consists mainly of layers of β sheet stacked on top of one another.

Other types of secondary structure. While the α helix and β sheet are by far the most common types of structure, many others are possible. These include various loops, helices and irregular conformations. A single polypeptide chain may have different regions that take on different secondary structures. In fact, many proteins have a mixture of α helices, β sheets, and other types of folding patterns to form various overall shapes.

What determines whether a particular part of a sequence will fold into one or the other of these structures? A major determinant is the interactions between side chains of the residues in the polypeptide. Several factors come into play: steric hindrance between nearby large side chains, charge repulsion between nearby similarly-charged side chains, and the presence of proline. Proline contains a ring that constrains bond angles so that it will not fit exactly into an α helix or β sheet. Further, there is no H on one peptide bond when proline is present, so a hydrogen bond cannot form. Another major factor is the presence of other chemical groups that interact with each other. This contributes to the next level of protein structure, the tertiary structure.

Tertiary Structure

This level of structure describes how regions of secondary structure fold together, that is, the 3D arrangement of a polypeptide chain, including α helices, β sheets, and any other loops and folds. Tertiary structure results from interactions between side chains, or between side chains and the polypeptide backbone, which are often distant in sequence. Every protein has a particular pattern of folding and these can be quite complex.

Whereas secondary structure is stabilized by H-bonding, all four weak forces contribute to tertiary structure. Usually, the most important force is hydrophobic interaction (or hydrophobic bonds). Polypeptide chains generally contain both hydrophobic and hydrophilic residues. Much like detergent micelles, proteins are most stable when their hydrophobic parts are buried, while hydrophilic parts are on the surface, exposed to water. Thus, more hydrophobic residues such as trp are often surrounded by other parts of the protein, excluding water, while charged residues such as asp are more often on the surface.

Other forces that contribute to tertiary structure are ionic bonds between side chains, hydrogen bonds, and van der Waals forces. These bonds are far weaker than covalent bonds, and it takes multiple interactions to stabilize a structure.

There is one covalent bond that is also involved in tertiary structure, and that is the disulfide bond that can form between cysteine residues. This bond is important only in non-cytoplasmic proteins since there are enzyme systems present in the cytoplasm to remove disulfide bonds.

Visualization of protein structures

Because the 3D structures of proteins involve thousands of atoms in complex arrangements, various ways of depicting them visually have been developed, each emphasizing a different property of the protein. Software tools have been written to depict proteins in many different ways, and have become essential to understanding protein structure and function.

Structural Domains of Proteins

Protein structure can also be described by a level of organization that is distinct from the ones we have just discussed. This organizational unit is the protein domain, and the concept of domains is extremely important for understanding tertiary structure. A domain is a distinct region (sequence of amino acids) of a protein, while a structural domain is a part of a protein that folds independently into a stable structure. A protein may have many domains, or consist only of a single domain. Larger proteins generally consist of connected structural domains. Domains are often separated by a loosely folded region and may create clefts between them.

Quaternary Structure

Some proteins are composed of more than one polypeptide chain. In such proteins, quaternary structure refers to the number and arrangement of the individual polypeptide chains. Each polypeptide is referred to as a subunit of the protein. The same forces and bonds that create tertiary structure also hold subunits together in a stable complex to form the complete protein.

Individual chains may be identical, somewhat similar, or totally different. As examples, CAP protein is a dimer with two identical subunits, whereas hemoglobin is a tetramer containing two pairs of non-identical (but similar) subunits. It has 2 α subunits and 2 β subunits. Secreted proteins often have subunits that are held together by disulfide bonds. Examples include tetrameric antibody molecules that commonly have two larger subunits and two smaller subunits (heavy chains and light chains) connected by disulfide bonds and noncovalent forces.

In some proteins, intertwined α helices hold subunits together; these are called coiled-coils. This structure is stabilized by a hydrophobic surface on each α helix that is created by a heptad (seven-residue) repeat pattern of hydrophilic/hydrophobic residues. The sequence of the protein can be represented as abcdefgabcdefgabcdefg with positions a and d filled with hydrophobic residues such as A, V, L etc. Each α helix therefore has a hydrophobic surface that matches the other. When the two helices coil around each other, those surfaces come together, burying the hydrophobic side chains and forming a stable structure. An example of such a protein is myosin, the motor protein found in muscle that allows contraction.
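The heptad pattern can be checked mechanically. The following is only a minimal sketch: the set of residues counted as hydrophobic, the assumption that the sequence begins at a known register position, and the toy sequence itself are all illustrative choices, not taken from any real protein.

```python
# Assumption: which one-letter residues count as "hydrophobic" for this check.
HYDROPHOBIC = set("AVLIMF")
REGISTER = "abcdefg"


def heptad_positions(sequence, start="a"):
    """Label each residue with its heptad position, given the position of residue 0."""
    offset = REGISTER.index(start)
    return [REGISTER[(offset + i) % 7] for i in range(len(sequence))]


def core_hydrophobic_fraction(sequence, start="a"):
    """Fraction of residues at positions a and d that are hydrophobic."""
    labels = heptad_positions(sequence, start)
    core = [res for res, pos in zip(sequence, labels) if pos in "ad"]
    if not core:
        return 0.0
    return sum(res in HYDROPHOBIC for res in core) / len(core)


# Hypothetical toy sequence: three copies of a made-up heptad with L at a and V at d.
toy_sequence = "LKKVEEA" * 3

print(core_hydrophobic_fraction(toy_sequence, start="a"))  # 1.0
print(core_hydrophobic_fraction(toy_sequence, start="b"))  # 0.5
```

With the correct register every a and d position is hydrophobic; shifting the register by one position halves the score, which is roughly how a register can be guessed when it is not known in advance.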

Protein Folding

How and why do proteins naturally form secondary, tertiary and quaternary structures? This question is a very active area of research and is certainly not completely understood. A folded, biologically-active protein is considered to be in its native state, which is generally thought to be the conformation with least free energy.

Proteins can be unfolded or denatured by treatment with solvents that disrupt weak bonds. Thus organic solvents that disrupt hydrophobic interactions, high concentrations of urea or guanidine that interfere with H-bonding, extreme pH or even high temperatures, will all cause proteins to unfold. Denatured proteins have a random, flexible conformation and usually lack biological activity. Because of exposed hydrophobic groups, they often aggregate and precipitate. This is what happens when you fry an egg.

If the denaturing condition is removed, some proteins will re-fold and regain activity. This process is called renaturation. It shows that all the information necessary for folding is present in the primary structure (sequence) of the protein. During renaturation, the polypeptide chain is thought to fold up into a loose globule by hydrophobic effects, after which small regions of secondary structure form in especially favorable sequences. These regions then interact with each other to stabilize intermediate structures before the final conformation is attained.

Many proteins have great difficulty renaturing. Proteins that assist other proteins to fold are called molecular chaperones. They are thought to act by reversibly masking exposed hydrophobic regions to prevent aggregation during the multi-step folding process. Proteins that must cross membranes (e.g., mitochondrial proteins) must stay unfolded until they reach their destination, and molecular chaperones may protect and assist during this process.

Protein families/Types of proteins

Proteins are classified in a number of ways, according to structure, function, location and/or properties. For example, many proteins combine tightly with other substances such as carbohydrates (glycoproteins), lipids (lipoproteins), or metal ions (metalloproteins). The diversity of proteins that form from the 20 amino acids is greatly increased by associations such as these. Proteins that are tightly bound to membranes are called membrane proteins. Proteins with similar activities are given functional classifications. For example, proteins that break down other proteins are called proteases.

Because almost all proteins arise by an evolutionary process, i.e., new ones are derived from old ones, they can be classified into families by their relatedness. Proteins that derive from the same ancestor are called homologous proteins. Studying the sequences of homologous proteins can give clues to the structure and function of the protein. Residues that are critical for function do not change on an evolutionary timescale; they are referred to as conserved residues. Identifying such residues by comparing amino acid sequences often helps clarify what a protein is doing or how it is folded. For example, the proteases trypsin and chymotrypsin are members of the serine protease family, so named because of a conserved serine residue that is essential to catalyze the reaction. Trypsin and chymotrypsin have very similar folding patterns and reaction mechanisms. Recognizing a pattern of conserved residues in protein sequences often allows scientists to deduce the function of a protein.
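Finding conserved residues amounts to scanning the columns of an alignment. The sketch below assumes the sequences have already been aligned to equal length (with "-" marking gaps); the three short strings are hypothetical fragments made up for illustration, not real trypsin or chymotrypsin sequences.

```python
def conserved_columns(aligned_sequences):
    """Return (index, residue) for every alignment column where all sequences agree."""
    if not aligned_sequences:
        return []
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be pre-aligned to equal length")
    conserved = []
    for index, column in enumerate(zip(*aligned_sequences)):
        residues = set(column)
        # A column is conserved when it holds exactly one residue type and no gap.
        if len(residues) == 1 and "-" not in residues:
            conserved.append((index, column[0]))
    return conserved


# Hypothetical pre-aligned fragments from three imaginary family members.
toy_alignment = [
    "GDSGGPLV",
    "GDSGGPVV",
    "GDSGGALI",
]

print(conserved_columns(toy_alignment))
# [(0, 'G'), (1, 'D'), (2, 'S'), (3, 'G'), (4, 'G')]
```

Real analyses usually score columns by degree of similarity rather than requiring strict identity, but the strict version is enough to show the idea.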

Thermodynamics of spontaneous protein folding: role of …

Summary

Free energy change in individual transformations

It is standard practice in biochemistry to consider the Gibbs free energy of a transformation of the sort A → B in isolation in determining whether it will proceed spontaneously. A chemical reaction for which ΔG is negative may generate heat (i.e. have a negative enthalpy change, ΔH), which affects its aqueous surroundings, but it seems justified to consider the reaction in isolation as there is no sense that the change in the vibration of the water molecules is driving or coupled to the reaction.

This approach has been applied to the structural change of protein folding with the conclusion (consistent with the first explanation) that the change in enthalpy (ΔH) is sufficient to produce a negative ΔG and hence drive protein folding (Citation 1, below).

Free energy change in coupled transformations

Many biochemical changes involve transformations which individually have a positive free energy change, but are made possible by coupling to another reaction with a negative free energy change of greater magnitude:

A → B, ΔG₁ = +x

C → D, ΔG₂ = −y

If y > x and these two reactions are coupled (generally through a complex reaction path on an enzyme), then we have:

A + C → B + D, ΔG_overall = x − y < 0
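The bookkeeping for a coupled pair can be sketched in a few lines. The numbers below are the commonly quoted textbook standard free energies for glucose phosphorylation (about +13.8 kJ/mol) and ATP hydrolysis (about −30.5 kJ/mol), used here purely as an illustration; the function names are hypothetical.

```python
def coupled_delta_g(delta_g_values):
    """Overall free energy change of coupled reactions: the individual changes add."""
    return sum(delta_g_values)


def is_spontaneous(delta_g):
    """A transformation is thermodynamically favourable when its delta G is negative."""
    return delta_g < 0


# Illustrative textbook values in kJ/mol (assumptions, not taken from the text above).
unfavourable = +13.8   # A -> B, e.g. glucose + Pi -> glucose 6-phosphate
favourable = -30.5     # C -> D, e.g. ATP -> ADP + Pi

overall = coupled_delta_g([unfavourable, favourable])

print(is_spontaneous(unfavourable))  # False
print(round(overall, 1))             # -16.7
print(is_spontaneous(overall))       # True
```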

See also Berg et al.

Although one can reject the second explanation in the question as it stands because it ignores the free energy change in the protein folding, perhaps it was intended to mean that the folding of the protein (A → B) should be considered as coupled to the change in the environment of the water (C → D), and that the negative ΔG for the aqueous environment made a greater contribution to the overall ΔG than that for the protein folding.

Is it valid to consider these two systems as coupled? In the original version of my answer I argued against this point of view, but am no longer convinced by my own arguments. The water environment is clearly essential for the hydrophobic effect, the burying of the hydrophobic residues in the centre of the protein away from the water. This is evident if one considers the same protein in a hydrophobic environment such as a cell membrane: it would not fold. In membrane proteins it is hydrophobic residues that are exposed to the lipid bilayer and it is their interiors that sometimes have hydrophilic channels.

So in this coupled system, what is the determinant of the negative free energy change? Minikel (Citation 2, below) asserts that there is no net enthalpy change for the protein folding, and it is the entropy effect on the ΔG for the aqueous environment that drives the folding. He indicates that this view is supported by differential scanning calorimetry and, although he doesn't cite references, there is a recent (if rather complex) review of this topic by Christopher M. Johnson.

Citation 1: Assertion of role of ΔH of protein

The following explanation, taken from Essential Biochemistry, treats the protein folding in isolation and asserts that change in enthalpy is sufficient to produce a negative free energy change:

The folding of a protein also provides an example of the "ΔH" and "−TΔS" terms competing with one another to determine the ΔG of the folding process. As described above, the change in entropy of the protein as it folds is negative, so the "−TΔS" term is positive. However, in addition to entropic effects there are enthalpic contributions to protein folding. These include hydrogen bonding, ionic salt bridges, and Van der Waals forces. An input of thermal (heat) energy is required to disrupt these forces, and conversely when these interactions form during protein folding they release heat (the ΔH is negative). When all of these entropic and enthalpic contributions are weighed, the enthalpy term wins out over the entropy term. Therefore the free energy of protein folding is negative, and protein folding is a spontaneous process.

Citation 2: Rebuttal of role of ΔH of protein and assertion of role of water

The following explanation, taken from the on-line lecture notes of Eric V. Minikel of Harvard University, rebuts the point of view above:

An incorrect and simplistic view of protein folding is as follows. An unfolded protein has high configurational entropy but also high enthalpy because it has few stabilizing interactions. A folded protein has far less entropy, but also far less enthalpy. There is a tradeoff between H and S here. Note that because ΔG = ΔH − TΔS, increased temperature weights the S term more heavily, meaning that higher temperature favors unfolding.

That entire explanation only considers the energy of the protein and not that of the solvent. In fact, hydrophobic domains of a protein constrain the possible configurations of surrounding water (see explanation above), and so their burial upon folding increases the water's entropy. Moreover, it turns out that the hydrogen bonding of polar residues and the backbone is satisfied both in an unfolded state (by water) and in a folded state (by each other). Therefore enthalpy is zero sum, and protein folding is driven almost entirely by entropy.
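Whichever terms dominate, both citations rely on the same relation between free energy, enthalpy, entropy and temperature. The sketch below evaluates it for net folding values (protein plus solvent) that are entirely hypothetical and assumed to be independent of temperature; it is meant only to show how the sign of the free energy change flips as temperature rises.

```python
def delta_g(delta_h, delta_s, temperature):
    """Gibbs relation: delta G = delta H - T * delta S (kJ/mol, kJ/(mol*K), K)."""
    return delta_h - temperature * delta_s


def melting_temperature(delta_h, delta_s):
    """Temperature at which delta G = 0, assuming delta H and delta S do not vary with T."""
    return delta_h / delta_s


# Hypothetical net values for folding (illustrative only).
DELTA_H_FOLD = -200.0   # kJ/mol
DELTA_S_FOLD = -0.6     # kJ/(mol*K)

for temperature in (298.0, 350.0):
    value = delta_g(DELTA_H_FOLD, DELTA_S_FOLD, temperature)
    state = "folding favoured" if value < 0 else "unfolding favoured"
    print(f"T = {temperature:.0f} K: delta G = {value:+.1f} kJ/mol ({state})")

print(f"Tm = {melting_temperature(DELTA_H_FOLD, DELTA_S_FOLD):.1f} K")
```

With these made-up numbers folding is favourable at 298 K and unfavourable at 350 K, with the crossover near 333 K. The example does not settle which contribution (protein enthalpy or solvent entropy) supplies the net values.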

Proteopathy – Wikipedia

In medicine, proteopathy (Proteo- [pref. protein]; -pathy [suff. disease]; proteopathies pl.; proteopathic adj.) refers to a class of diseases in which certain proteins become structurally abnormal, and thereby disrupt the function of cells, tissues and organs of the body.[1][2] Often the proteins fail to fold into their normal configuration; in this misfolded state, the proteins can become toxic in some way (a gain of toxic function) or they can lose their normal function.[3] The proteopathies (also known as proteinopathies, protein conformational disorders, or protein misfolding diseases) include such diseases as Creutzfeldt–Jakob disease and other prion diseases, Alzheimer's disease, Parkinson's disease, amyloidosis, multiple system atrophy, and a wide range of other disorders (see List of Proteopathies).[2][4][5][6][7][8]

The concept of proteopathy can trace its origins to the mid-19th century, when, in 1854, Rudolf Virchow coined the term amyloid ("starch-like") to describe a substance in cerebral corpora amylacea that exhibited a chemical reaction resembling that of cellulose. In 1859, Friedreich and Kekulé demonstrated that, rather than consisting of cellulose, "amyloid" actually is rich in protein.[9] Subsequent research has shown that many different proteins can form amyloid, and that all amyloids have in common birefringence in cross-polarized light after staining with the dye Congo Red, as well as a fibrillar ultrastructure when viewed with an electron microscope.[9] However, some proteinaceous lesions lack birefringence and contain few or no classical amyloid fibrils, such as the diffuse deposits of Aβ protein in the brains of Alzheimer patients.[10] Furthermore, evidence has emerged that small, non-fibrillar protein aggregates known as oligomers are toxic to the cells of an affected organ, and that amyloidogenic proteins in their fibrillar form may be relatively benign.[11][12]

In most, if not all proteopathies, a change in 3-dimensional folding (conformation) increases the tendency of a specific protein to bind to itself.[5] In this aggregated form, the protein is resistant to clearance and can interfere with the normal capacity of the affected organs. In some cases, misfolding of the protein results in a loss of its usual function. For example, cystic fibrosis is caused by a defective cystic fibrosis transmembrane conductance regulator (CFTR) protein,[3] and in amyotrophic lateral sclerosis / frontotemporal lobar degeneration (FTLD), certain gene-regulating proteins inappropriately aggregate in the cytoplasm, and thus are unable to perform their normal tasks within the nucleus.[13][14] Because proteins share a common structural feature known as the polypeptide backbone, all proteins have the potential to misfold under some circumstances.[15] However, only a relatively small number of proteins are linked to proteopathic disorders, possibly due to structural idiosyncrasies of the vulnerable proteins. For example, proteins that are normally unfolded or relatively unstable as monomers (that is, as single, unbound protein molecules) are more likely to misfold into an abnormal conformation.[5][15][16] In nearly all instances, the disease-causing molecular configuration involves an increase in beta-sheet secondary structure of the protein.[5][15][17][18] The abnormal proteins in some proteopathies have been shown to fold into multiple 3-dimensional shapes; these variant, proteinaceous structures are defined by their different pathogenic, biochemical, and conformational properties.[19] They have been most thoroughly studied with regard to prion disease, and are referred to as protein strains.[20][21]

The likelihood that proteopathy will develop is increased by certain risk factors that promote the self-assembly of a protein. These include destabilizing changes in the primary amino acid sequence of the protein, post-translational modifications (such as hyperphosphorylation), changes in temperature or pH, an increase in production of a protein, or a decrease in its clearance.[1][5][15] Advancing age is a strong risk factor,[1] as is traumatic brain injury.[22][23] In the aging brain, multiple proteopathies can overlap.[24] For example, in addition to tauopathy and Aβ-amyloidosis (which coexist as key pathologic features of Alzheimer's disease), many Alzheimer patients have concomitant synucleinopathy (Lewy bodies) in the brain.[25]

It is hypothesized that chaperones and co-chaperones (proteins that assist protein folding) may antagonize proteotoxicity during aging and in protein-misfolding diseases to maintain proteostasis.[26][27][28]

Some proteins can be induced to form abnormal assemblies by exposure to the same (or similar) protein that has folded into a disease-causing conformation, a process called 'seeding' or 'permissive templating'.[29][30] In this way, the disease state can be brought about in a susceptible host by the introduction of diseased tissue extract from an afflicted donor. The best known form of such inducible proteopathy is prion disease,[31] which can be transmitted by exposure of a host organism to purified prion protein in a disease-causing conformation.[32][33] There is now evidence that other proteopathies can be induced by a similar mechanism, including Aβ amyloidosis, amyloid A (AA) amyloidosis, and apolipoprotein AII amyloidosis,[30][34] tauopathy,[35] synucleinopathy,[36][37][38][39] and the aggregation of superoxide dismutase-1 (SOD1),[40][41] polyglutamine,[42][43] and TAR DNA-binding protein-43 (TDP-43).[44]

In all of these instances, an aberrant form of the protein itself appears to be the pathogenic agent. In some cases, the deposition of one type of protein can be experimentally induced by aggregated assemblies of other proteins that are rich in β-sheet structure, possibly because of structural complementarity of the protein molecules. For example, AA amyloidosis can be stimulated in mice by such diverse macromolecules as silk, the yeast amyloid Sup35, and curli fibrils from the bacterium Escherichia coli.[45] In addition, apolipoprotein AII amyloid can be induced in mice by a variety of β-sheet-rich amyloid fibrils,[46] and cerebral tauopathy can be induced by brain extracts that are rich in aggregated Aβ.[47] There is also experimental evidence for cross-seeding between prion protein and Aβ.[48] In general, such heterologous seeding is less efficient than is seeding by a corrupted form of the same protein.

The development of effective treatments for many proteopathies has been challenging.[73][74] Because the proteopathies often involve different proteins arising from different sources, treatment strategies must be customized to each disorder; however, general therapeutic approaches include maintaining the function of affected organs, reducing the formation of the disease-causing proteins, preventing the proteins from misfolding and/or aggregating, or promoting their removal.[75][73][76] For example, in Alzheimer's disease, researchers are seeking ways to reduce the production of the disease-associated protein Aβ by inhibiting the enzymes that free it from its parent protein.[74] Another strategy is to use antibodies to neutralize specific proteins by active or passive immunization.[77] In some proteopathies, inhibiting the toxic effects of protein oligomers might be beneficial.[78] Amyloid A (AA) amyloidosis can be reduced by treating the inflammatory state that increases the amount of the protein in the blood (referred to as serum amyloid A, or SAA).[73] In immunoglobulin light chain amyloidosis (AL amyloidosis), chemotherapy can be used to lower the number of the blood cells that make the light chain protein that forms amyloid in various bodily organs.[79] Transthyretin (TTR) amyloidosis (ATTR) results from the deposition of misfolded TTR in multiple organs.[80] Because TTR is mainly produced in the liver, TTR amyloidosis can be slowed in some hereditary cases by liver transplantation.[81] TTR amyloidosis also can be treated by stabilizing the normal assemblies of the protein (called tetramers because they consist of four TTR molecules bound together). Stabilization prevents individual TTR molecules from escaping, misfolding, and aggregating into amyloid.[82][83]

Several other treatment strategies for proteopathies are being investigated, including small molecules and biologic medicines such as small interfering RNAs, antisense oligonucleotides, peptides, and engineered immune cells.[82][79][84][85] In some cases, multiple therapeutic agents may be combined to improve effectiveness.[79][86]

Micrograph of tauopathy (brown) in a neuronal cell body (arrow) and process (arrowhead) in the cerebral cortex of a patient with Alzheimer's disease. Bar = 25 microns (0.025 mm).

Folding@home – Wikipedia

Distributed computing project simulating protein folding

Folding@home (FAH or F@h) is a distributed computing project for disease research that simulates protein folding, computational drug design, and other types of molecular dynamics. The project uses the idle processing resources of thousands of personal computers owned by volunteers who have installed the software on their systems. Its main purpose is to determine the mechanisms of protein folding, which is the process by which proteins reach their final three-dimensional structure, and to examine the causes of protein misfolding. This is of significant academic interest with major implications for medical research into Alzheimer's disease, Huntington's disease, and many forms of cancer, among other diseases. To a lesser extent, Folding@home also tries to predict a protein's final structure and determine how other molecules may interact with it, which has applications in drug design. Folding@home is developed and operated by the Pande Laboratory at Stanford University, under the direction of Prof. Vijay Pande, and is shared by various scientific institutions and research laboratories across the world.[4]

The project has pioneered the use of graphics processing units (GPUs), PlayStation 3s, Message Passing Interface (used for computing on multi-core processors), and some Sony Xperia smartphones for distributed computing and scientific research. The project uses statistical simulation methodology that is a paradigm shift from traditional computing methods.[5] As part of the client–server model network architecture, the volunteered machines each receive pieces of a simulation (work units), complete them, and return them to the project's database servers, where the units are compiled into an overall simulation. Volunteers can track their contributions on the Folding@home website, which makes volunteers' participation competitive and encourages long-term involvement.

Folding@home is one of the world's fastest computing systems, with a speed of approximately 98.7 petaFLOPS[6] as of June 2019. This performance from its large-scale computing network has allowed researchers to run computationally costly atomic-level simulations of protein folding thousands of times longer than formerly achieved. Since its launch on 1 October 2000, the Pande Lab has produced 212 scientific research papers as a direct result of Folding@home.[7] Results from the project's simulations agree well with experiments.[8][9][10]

Proteins are an essential component of many biological functions and participate in virtually all processes within biological cells. They often act as enzymes, performing biochemical reactions including cell signaling, molecular transportation, and cellular regulation. Some proteins serve as structural elements, acting as a type of skeleton for cells, while others, such as antibodies, participate in the immune system. Before a protein can take on these roles, it must fold into a functional three-dimensional structure, a process that often occurs spontaneously and is dependent on interactions within its amino acid sequence and interactions of the amino acids with their surroundings. Protein folding is driven by the search to find the most energetically favorable conformation of the protein, i.e., its native state. Thus, understanding protein folding is critical to understanding what a protein does and how it works, and is considered a holy grail of computational biology.[11][12] Despite folding occurring within a crowded cellular environment, it typically proceeds smoothly. However, due to a protein's chemical properties or other factors, proteins may misfold, that is, fold down the wrong pathway and end up misshapen. Unless cellular mechanisms can destroy or refold misfolded proteins, they can subsequently aggregate and cause a variety of debilitating diseases.[13] Laboratory experiments studying these processes can be limited in scope and atomic detail, leading scientists to use physics-based computing models that, when complementing experiments, seek to provide a more complete picture of protein folding, misfolding, and aggregation.[14][15]

Due to the complexity of proteins' conformation or configuration space (the set of possible shapes a protein can take), and limits in computing power, all-atom molecular dynamics simulations have been severely limited in the timescales which they can study. While most proteins typically fold on the order of milliseconds,[14][16] before 2010, simulations could only reach nanosecond to microsecond timescales.[8] General-purpose supercomputers have been used to simulate protein folding, but such systems are intrinsically costly and typically shared among many research groups. Further, because the computations in kinetic models occur serially, strong scaling of traditional molecular simulations to these architectures is exceptionally difficult.[17][18] Moreover, as protein folding is a stochastic process and can statistically vary over time, it is challenging computationally to use long simulations for comprehensive views of the folding process.[19][20]

Protein folding does not occur in one step.[13] Instead, proteins spend most of their folding time, nearly 96% in some cases,[21] waiting in various intermediate conformational states, each a local thermodynamic free energy minimum in the protein's energy landscape. Through a process known as adaptive sampling, these conformations are used by Folding@home as starting points for a set of simulation trajectories. As the simulations discover more conformations, the trajectories are restarted from them, and a Markov state model (MSM) is gradually created from this cyclic process. MSMs are discrete-time master equation models which describe a biomolecule's conformational and energy landscape as a set of distinct structures and the short transitions between them. The adaptive sampling Markov state model method significantly increases the efficiency of simulation as it avoids computation inside the local energy minimum itself, and is amenable to distributed computing (including on GPUGRID) as it allows for the statistical aggregation of short, independent simulation trajectories.[22] The amount of time it takes to construct a Markov state model is inversely proportional to the number of parallel simulations run, i.e., the number of processors available. In other words, it achieves linear parallelization, leading to an approximately four orders of magnitude reduction in overall serial calculation time. A completed MSM may contain tens of thousands of sample states from the protein's phase space (all the conformations a protein can take on) and the transitions between them. The model illustrates folding events and pathways (i.e., routes) and researchers can later use kinetic clustering to view a coarse-grained representation of the otherwise highly detailed model. They can use these MSMs to reveal how proteins misfold and to quantitatively compare simulations with experiments.[5][19][23]
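The core idea of a Markov state model, pooling many short, independent trajectories into one transition matrix, can be sketched in plain Python. This is a toy illustration only: it is not the Folding@home or MSMBuilder implementation, the three state labels and the four trajectories are hypothetical, and real models use thousands of states obtained by clustering simulation frames.

```python
def estimate_transition_matrix(trajectories, n_states):
    """Count observed state-to-state jumps across all trajectories and row-normalize."""
    counts = [[0] * n_states for _ in range(n_states)]
    for trajectory in trajectories:
        for current, following in zip(trajectory, trajectory[1:]):
            counts[current][following] += 1
    matrix = []
    for state, row in enumerate(counts):
        total = sum(row)
        if total == 0:
            # Assumption: a state never seen leaving is treated as staying put.
            matrix.append([1.0 if j == state else 0.0 for j in range(n_states)])
        else:
            matrix.append([count / total for count in row])
    return matrix


def stationary_distribution(matrix, n_iter=1000):
    """Approximate long-time state populations by repeatedly applying the matrix."""
    n = len(matrix)
    dist = [1.0 / n] * n
    for _ in range(n_iter):
        dist = [sum(dist[i] * matrix[i][j] for i in range(n)) for j in range(n)]
    return dist


# Hypothetical discrete states: 0 = unfolded, 1 = intermediate, 2 = folded.
# Each short list stands for one independent work unit returned by a volunteer.
short_trajectories = [
    [0, 0, 1, 1],
    [1, 2, 2, 2],
    [0, 1, 1, 2],
    [2, 2, 1, 2],
]

T = estimate_transition_matrix(short_trajectories, n_states=3)
populations = stationary_distribution(T)
print([round(p, 3) for p in T[1]])         # [0.0, 0.4, 0.6]
print([round(p, 3) for p in populations])  # [0.0, 0.294, 0.706]
```

Because each trajectory only contributes transition counts, the trajectories can be generated on separate machines in any order and merged afterwards, which is the property that makes the approach suit distributed computing.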

Between 2000 and 2010, the length of the proteins Folding@home has studied has increased by a factor of four, while its timescales for protein folding simulations have increased by six orders of magnitude.[24] In 2002, Folding@home used Markov state models to complete approximately a million CPU days of simulations over the span of several months,[10] and in 2011, MSMs parallelized another simulation that required an aggregate 10 million CPU hours of computing.[25] In January 2010, Folding@home used MSMs to simulate the dynamics of the slow-folding 32-residue NTL9 protein out to 1.52 milliseconds, a timescale consistent with experimental folding rate predictions but a thousand times longer than formerly achieved. The model consisted of many individual trajectories, each two orders of magnitude shorter, and provided an unprecedented level of detail into the protein's energy landscape.[5][8][26] In 2010, Folding@home researcher Gregory Bowman was awarded the Thomas Kuhn Paradigm Shift Award from the American Chemical Society for the development of the open-source MSMBuilder software and for attaining quantitative agreement between theory and experiment.[27][28] For his work, Pande was awarded the 2012 Michael and Kate Bárány Award for Young Investigators for "developing field-defining and field-changing computational methods to produce leading theoretical models for protein and RNA folding",[29] and the 2006 Irving Sigal Young Investigator Award for his simulation results which "have stimulated a re-examination of the meaning of both ensemble and single-molecule measurements, making Dr. Pande's efforts pioneering contributions to simulation methodology."[30]

Protein misfolding can result in a variety of diseases including Alzheimer's disease, cancer, Creutzfeldt–Jakob disease, cystic fibrosis, Huntington's disease, sickle-cell anemia, and type II diabetes.[13][31][32] Cellular infection by viruses such as HIV and influenza also involves folding events on cell membranes.[33] Once protein misfolding is better understood, therapies can be developed that augment cells' natural ability to regulate protein folding. Such therapies include the use of engineered molecules to alter the production of a given protein, help destroy a misfolded protein, or assist in the folding process.[34] The combination of computational molecular modeling and experimental analysis has the possibility to fundamentally shape the future of molecular medicine and the rational design of therapeutics,[15] such as expediting and lowering the costs of drug discovery.[35] The goal of the first five years of Folding@home was to make advances in understanding folding, while the current goal is to understand misfolding and related disease, especially Alzheimer's.[36]

The simulations run on Folding@home are used in conjunction with laboratory experiments,[19] but researchers can use them to study how folding in vitro differs from folding in native cellular environments. This is advantageous in studying aspects of folding, misfolding, and their relationships to disease that are difficult to observe experimentally. For example, in 2011, Folding@home simulated protein folding inside a ribosomal exit tunnel, to help scientists better understand how natural confinement and crowding might influence the folding process.[37][38] Furthermore, scientists typically employ chemical denaturants to unfold proteins from their stable native state. It is not generally known how the denaturant affects the protein's refolding, and it is difficult to experimentally determine if these denatured states contain residual structures which may influence folding behavior. In 2010, Folding@home used GPUs to simulate the unfolded states of Protein L, and predicted its collapse rate in strong agreement with experimental results.[39]

The Pande Lab is part of Stanford University, a non-profit entity, and does not sell the results generated by Folding@home.[40] The large data sets from the project are freely available for other researchers to use upon request and some can be accessed from the Folding@home website.[41][42] The Pande lab has collaborated with other molecular dynamics systems such as the Blue Gene supercomputer,[43] and they share Folding@home's key software with other researchers, so that the algorithms which benefited Folding@home may aid other scientific areas.[41] In 2011, they released the open-source Copernicus software, which is based on Folding@home's MSM and other parallelizing methods and aims to improve the efficiency and scaling of molecular simulations on large computer clusters or supercomputers.[44][45] Summaries of all scientific findings from Folding@home are posted on the Folding@home website after publication.[7]

Alzheimer's disease is linked to the aggregation of amyloid beta protein fragments in the brain. Researchers have used Folding@home to simulate this aggregation process, to better understand the cause of the disease.

Alzheimer's disease is an incurable neurodegenerative disease which most often affects the elderly and accounts for more than half of all cases of dementia. Its exact cause remains unknown, but the disease is identified as a protein misfolding disease. Alzheimer's is associated with toxic aggregations of the amyloid beta (Aβ) peptide, caused by Aβ misfolding and clumping together with other Aβ peptides. These Aβ aggregates then grow into significantly larger senile plaques, a pathological marker of Alzheimer's disease.[46][47][48] Due to the heterogeneous nature of these aggregates, experimental methods such as X-ray crystallography and nuclear magnetic resonance (NMR) have had difficulty characterizing their structures. Moreover, atomic simulations of Aβ aggregation are highly demanding computationally due to their size and complexity.[49][50]

Preventing Aβ aggregation is a promising approach to developing therapeutic drugs for Alzheimer's disease, according to Drs. Naeem and Fazili in a literature review article.[51] In 2008, Folding@home simulated the dynamics of Aβ aggregation in atomic detail over timescales of the order of tens of seconds. Prior studies were only able to simulate about 10 microseconds. Folding@home was able to simulate Aβ folding for six orders of magnitude longer than formerly possible. Researchers used the results of this study to identify a beta hairpin that was a major source of molecular interactions within the structure.[52] The study helped prepare the Pande lab for future aggregation studies and for further research to find a small peptide which may stabilize the aggregation process.[49]
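The six-orders-of-magnitude figure follows from the two timescales quoted above. As a rough check, the sketch below takes 10 seconds as a representative value for "tens of seconds" and 10 microseconds for the prior studies; both numbers are illustrative assumptions rather than values from the cited study.

```python
import math


def orders_of_magnitude(longer: float, shorter: float) -> float:
    """Return how many powers of ten separate two positive quantities."""
    return math.log10(longer / shorter)


# Representative values, both in seconds (illustrative assumptions).
folding_at_home_timescale = 10.0  # "tens of seconds"
prior_timescale = 10e-6           # "about 10 microseconds"

gap = orders_of_magnitude(folding_at_home_timescale, prior_timescale)
print(f"{gap:.1f} orders of magnitude")
```

With these inputs the ratio is one million, so the gap comes out at about six orders of magnitude.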

In December 2008, Folding@home found several small drug candidates which appear to inhibit the toxicity of Aβ aggregates.[53] In 2010, in close cooperation with the Center for Protein Folding Machinery, these drug leads began to be tested on biological tissue.[32] In 2011, Folding@home completed simulations of several mutations of Aβ that appear to stabilize the aggregate formation, which could aid in the development of drug therapies for the disease and greatly assist with experimental nuclear magnetic resonance spectroscopy studies of Aβ oligomers.[50][54] Later that year, Folding@home began simulations of various Aβ fragments to determine how various natural enzymes affect the structure and folding of Aβ.[55][56]

Huntington's disease is a neurodegenerative genetic disorder that is associated with protein misfolding and aggregation. Excessive repeats of the glutamine amino acid at the N-terminus of the Huntingtin protein cause aggregation, and although the behavior of the repeats is not completely understood, it does lead to the cognitive decline associated with the disease.[57] As with other aggregates, there is difficulty in experimentally determining its structure.[58] Scientists are using Folding@home to study the structure of the Huntingtin protein aggregate and to predict how it forms, assisting with rational drug design methods to stop the aggregate formation.[32] The N17 fragment of the Huntingtin protein accelerates this aggregation, and while there have been several mechanisms proposed, its exact role in this process remains largely unknown.[59] Folding@home has simulated this and other fragments to clarify their roles in the disease.[60] Since 2008, its drug design methods for Alzheimer's disease have been applied to Huntington's.[32]

More than half of all known cancers involve mutations of p53, a tumor suppressor protein present in every cell which regulates the cell cycle and signals for cell death in the event of damage to DNA. Specific mutations in p53 can disrupt these functions, allowing an abnormal cell to continue growing unchecked, resulting in the development of tumors. Analysis of these mutations helps explain the root causes of p53-related cancers.[61] In 2004, Folding@home was used to perform the first molecular dynamics study of the refolding of p53's protein dimer in an all-atom simulation of water. The simulation's results agreed with experimental observations and gave insights into the refolding of the dimer that were formerly unobtainable.[62] This was the first peer reviewed publication on cancer from a distributed computing project.[63] The following year, Folding@home powered a new method to identify the amino acids crucial for the stability of a given protein, which was then used to study mutations of p53. The method was reasonably successful in identifying cancer-promoting mutations and determined the effects of specific mutations which could not otherwise be measured experimentally.[64]

Folding@home is also used to study protein chaperones,[32] heat shock proteins which play essential roles in cell survival by assisting with the folding of other proteins in the crowded and chemically stressful environment within a cell. Rapidly growing cancer cells rely on specific chaperones, and some chaperones play key roles in chemotherapy resistance. Inhibition of these specific chaperones is seen as a potential mode of action for efficient chemotherapy drugs or for reducing the spread of cancer.[65] Using Folding@home and working closely with the Center for Protein Folding Machinery, the Pande lab hopes to find a drug which inhibits those chaperones involved in cancerous cells.[66] Researchers are also using Folding@home to study other molecules related to cancer, such as the enzyme Src kinase, and some forms of the engrailed homeodomain: a large protein which may be involved in many diseases, including cancer.[67][68] In 2011, Folding@home began simulations of the dynamics of the small knottin protein EETI, which can identify carcinomas in imaging scans by binding to surface receptors of cancer cells.[69][70]

Interleukin 2 (IL-2) is a protein that helps T cells of the immune system attack pathogens and tumors. However, its use as a cancer treatment is restricted due to serious side effects such as pulmonary edema. IL-2 binds to these pulmonary cells differently than it does to T cells, so IL-2 research involves understanding the differences between these binding mechanisms. In 2012, Folding@home assisted with the discovery of a mutant form of IL-2 which is three hundred times more effective in its immune system role but carries fewer side effects. In experiments, this altered form significantly outperformed natural IL-2 in impeding tumor growth. Pharmaceutical companies have expressed interest in the mutant molecule, and the National Institutes of Health are testing it against a large variety of tumor models to try to accelerate its development as a therapeutic.[71][72]

Osteogenesis imperfecta, known as brittle bone disease, is an incurable genetic bone disorder which can be lethal. Those with the disease are unable to make functional connective bone tissue. This is most commonly due to a mutation in Type-I collagen,[73] which fulfills a variety of structural roles and is the most abundant protein in mammals.[74] The mutation causes a deformation in collagen's triple helix structure, which if not naturally destroyed, leads to abnormal and weakened bone tissue.[75] In 2005, Folding@home tested a new quantum mechanical method that improved upon prior simulation methods, and which may be useful for future computing studies of collagen.[76] Although researchers have used Folding@home to study collagen folding and misfolding, the interest stands as a pilot project compared to Alzheimer's and Huntington's research.[32]

Folding@home is assisting in research towards preventing some viruses, such as influenza and HIV, from recognizing and entering biological cells.[32] In 2011, Folding@home began simulations of the dynamics of the enzyme RNase H, a key component of HIV, to try to design drugs to deactivate it.[77] Folding@home has also been used to study membrane fusion, an essential event for viral infection and a wide range of biological functions. This fusion involves conformational changes of viral fusion proteins and protein docking,[33] but the exact molecular mechanisms behind fusion remain largely unknown.[78] Fusion events may consist of over a half million atoms interacting for hundreds of microseconds. This complexity limits typical computer simulations to about ten thousand atoms over tens of nanoseconds: a difference of several orders of magnitude.[52] The development of models to predict the mechanisms of membrane fusion will assist in the scientific understanding of how to target the process with antiviral drugs.[79] In 2006, scientists applied Markov state models and the Folding@home network to discover two pathways for fusion and gain other mechanistic insights.[52]

Following detailed simulations from Folding@home of small cells known as vesicles, in 2007, the Pande lab introduced a new computing method to measure the topology of its structural changes during fusion.[80] In 2009, researchers used Folding@home to study mutations of influenza hemagglutinin, a protein that attaches a virus to its host cell and assists with viral entry. Mutations to hemagglutinin affect how well the protein binds to a host's cell surface receptor molecules, which determines how infective the virus strain is to the host organism. Knowledge of the effects of hemagglutinin mutations assists in the development of antiviral drugs.[81][82] As of 2012, Folding@home continues to simulate the folding and interactions of hemagglutinin, complementing experimental studies at the University of Virginia.[32][83]

Drugs function by binding to specific locations on target molecules and causing some desired change, such as disabling a target or causing a conformational change. Ideally, a drug should act very specifically, and bind only to its target without interfering with other biological functions. However, it is difficult to precisely determine where and how tightly two molecules will bind. Due to limits in computing power, current in silico methods usually must trade speed for accuracy; e.g., use rapid protein docking methods instead of computationally costly free energy calculations. Folding@home's computing performance allows researchers to use both methods, and evaluate their efficiency and reliability.[36][84][85] Computer-assisted drug design has the potential to expedite and lower the costs of drug discovery.[35] In 2010, Folding@home used MSMs and free energy calculations to predict the native state of the villin protein to within 1.8 angstrom (Å) root mean square deviation (RMSD) from the crystalline structure experimentally determined through X-ray crystallography. This accuracy has implications for future protein structure prediction methods, including for intrinsically unstructured proteins.[52] Scientists have used Folding@home to research drug resistance by studying vancomycin, an antibiotic drug of last resort, and beta-lactamase, a protein that can break down antibiotics like penicillin.[86][87]
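To make the RMSD figure concrete, the following minimal sketch computes the root mean square deviation between two sets of atomic coordinates. It assumes the two structures are already superimposed and that atoms are listed in the same order; real structure comparisons usually perform an optimal alignment first, which is omitted here. The coordinates are made up for illustration and are not villin data.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def rmsd(predicted: Sequence[Point], reference: Sequence[Point]) -> float:
    """Root mean square deviation between two equally sized atom lists.

    Assumes the structures are already superimposed and that atom i in
    `predicted` corresponds to atom i in `reference`.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("structures must be non-empty and the same size")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(predicted, reference)
    )
    return math.sqrt(squared / len(predicted))


# Toy coordinates in angstroms (hypothetical, for illustration only).
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]
predicted = [(0.0, 0.0, 1.8), (1.5, 0.0, 1.8), (3.0, 0.0, 1.8)]
print(rmsd(predicted, reference))
```

Every atom in the toy prediction is displaced by 1.8 Å, so the RMSD is also approximately 1.8 Å; in a real comparison the per-atom deviations differ and the RMSD summarizes them in a single number.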

Chemical activity occurs along a protein's active site. Traditional drug design methods involve tightly binding to this site and blocking its activity, under the assumption that the target protein exists in one rigid structure. However, this approach works for approximately only 15% of all proteins. Proteins contain allosteric sites which, when bound to by small molecules, can alter a protein's conformation and ultimately affect the protein's activity. These sites are attractive drug targets, but locating them is very computationally costly. In 2012, Folding@home and MSMs were used to identify allosteric sites in three medically relevant proteins: beta-lactamase, interleukin-2, and RNase H.[87][88]

Approximately half of all known antibiotics interfere with the workings of a bacterium's ribosome, a large and complex biochemical machine that performs protein biosynthesis by translating messenger RNA into proteins. Macrolide antibiotics clog the ribosome's exit tunnel, preventing synthesis of essential bacterial proteins. In 2007, the Pande lab received a grant to study and design new antibiotics.[32] In 2008, they used Folding@home to study the interior of this tunnel and how specific molecules may affect it.[89] The full structure of the ribosome was determined only as of 2011, and Folding@home has also simulated ribosomal proteins, as many of their functions remain largely unknown.[90]

Many more diseases promoted by protein misfolding could benefit from Folding@home, either to discern the misfolded protein structure or the misfolding kinetics, and to assist in future drug design. The often fatal prion diseases are among the most significant.

Prion (PrP) is a transmembrane cellular protein found widely in eukaryotic cells. In mammals, it is more abundant in the central nervous system. Although its function is unknown, its high conservation among species indicates an important role in cellular function. The conformational change from the normal prion protein (PrPc, where c stands for cellular) to the disease-causing isoform PrPSc (where Sc stands for the prototypical prion disease, scrapie) causes a host of diseases collectively known as transmissible spongiform encephalopathies (TSEs), including bovine spongiform encephalopathy (BSE) in cattle, Creutzfeldt-Jakob disease (CJD) and fatal insomnia in humans, and chronic wasting disease (CWD) in the deer family. The conformational change is widely accepted as the result of protein misfolding. What distinguishes TSEs from other protein misfolding diseases is their transmissible nature. The seeding of the infectious PrPSc, whether it arises spontaneously, is inherited, or is acquired via exposure to contaminated tissues,[91] can cause a chain reaction that transforms normal PrPc into fibril aggregates or amyloid-like plaques consisting of PrPSc.[92]

The molecular structure of PrPSc has not been fully characterized due to its aggregated nature. Little is known about the mechanism of the protein misfolding or its kinetics. Using the known structure of PrPc and the results of the in vitro and in vivo studies described below, Folding@home could be valuable in elucidating how PrPSc is formed and how the infectious proteins arrange themselves to form fibrils and amyloid-like plaques, bypassing the requirement to purify PrPSc or dissolve the aggregates.

PrPc has been enzymatically dissociated from the membrane and purified, and its structure has been studied using structure characterization techniques such as NMR spectroscopy and X-ray crystallography. Post-translational PrPc has 231 amino acids (aa) in mice. The molecule consists of a long and unstructured amino terminal region spanning up to aa residue 121 and a structured carboxy terminal domain.[92] This globular domain harbours two short sheet-forming anti-parallel β-strands (aa 128 to 130 and aa 160 to 162 in murine PrPc) and three α-helices (helix I: aa 143 to 153; helix II: aa 171 to 192; helix III: aa 199 to 226 in murine PrPc).[93] Helices II and III are oriented anti-parallel and connected by a short loop. Their structural stability is supported by a disulfide bridge, which is parallel to both sheet-forming β-strands. These α-helices and the β-sheet form the rigid core of the globular domain of PrPc.[94]

The disease-causing PrPSc is proteinase K resistant and insoluble. Attempts to purify it from the brains of infected animals invariably yield heterogeneous mixtures and aggregated states that are not amenable to characterization by NMR spectroscopy or X-ray crystallography. However, it is a general consensus that PrPSc contains a higher percentage of tightly stacked β-sheets than normal PrPc, which renders the protein insoluble and resistant to proteinase. Using techniques of cryoelectron microscopy and structural modeling based on similar common protein structures, it has been discovered that PrPSc contains β-sheets in the region of aa 81-95 to aa 171, while the carboxy terminal structure is supposedly preserved, retaining the disulfide-linked α-helical conformation of the normal PrPc. These β-sheets form a parallel left-handed beta-helix.[92] Three PrPSc molecules are believed to form a primary unit and therefore build the basis for the so-called scrapie-associated fibrils.[95] The catalytic activity depends on the size of the particle. PrPSc particles which consist of only 14-28 PrPc molecules exhibit the highest rate of infectivity and conversion.[96]

Despite the difficulty of purifying and characterizing PrPSc, the potential hot spots of protein misfolding leading to the pathogenic PrPSc could be deduced from the known molecular structure of PrPc and from studies using transgenic mice and N-terminal deletion,[97] and Folding@home could be of great value in confirming these. Studies found that both the primary and secondary structure of the prion protein can be significant for the conversion.

There are more than twenty mutations of the prion protein gene (PRNP) that are known to be associated with or directly linked to the hereditary form of human TSEs [56], indicating that single amino acids at certain positions, likely within the carboxy domain,[93] of PrPc can affect the susceptibility to TSEs.

The post-translational amino terminal region of PrPc consists of residues 23-120, which make up nearly half of the amino acid sequence of full-length mature PrPc. There are two sections in the amino terminal region that may influence conversion. First, residues 52-90 contain an octapeptide repeat region (five repeats) that likely influences the initial binding (via the octapeptide repeats) and also the actual conversion via the second section, aa 108-124.[98] The highly hydrophobic AGAAAAGA sequence is located between aa residues 113 and 120 and is described as a putative aggregation site,[99] although this sequence requires its flanking parts to form fibrillar aggregates.[100]

In the carboxy globular domain,[94] among the three helices, studies show that helix II has a significantly higher propensity for β-strand conformation.[101] Due to the high conformational flexibility seen between residues 114-125 (part of the unstructured N-terminal chain) and the high β-strand propensity of helix II, only moderate changes in the environmental conditions or interactions might be sufficient to induce misfolding of PrPc and subsequent fibril formation.[92]

Other studies of NMR structures of PrPc showed that these residues (~108-189) contain most of the folded domain including both β-strands, the first two α-helices, and the loop/turn regions connecting them, but not helix III.[97] Small changes within the loop/turn structures of PrPc itself could be important in the conversion as well.[102] In another study, Riek et al. showed that the two small regions of β-strand upstream of the loop regions act as a nucleation site for the conformational conversion of the loop/turn and α-helical structures in PrPc to β-sheet.[93]

The energy threshold for the conversion is not necessarily high. The folding stability, i.e. the free energy of a globular protein in its environment, is in the range of one or two hydrogen bonds, which allows the transition to an isoform without the requirement of high transition energy.[92]

From the perspective of the interactions among the PrPc molecules, hydrophobic interactions play a crucial role in the formation of β-sheets, a hallmark of PrPSc, as the sheets bring fragments of polypeptide chains into close proximity.[103] Indeed, Kuznetsov and Rackovsky [104] showed that disease-promoting mutations in the human PrPc had a statistically significant tendency towards increasing local hydrophobicity.

In vitro experiments showed that the kinetics of misfolding have an initial lag phase followed by a rapid growth phase of fibril formation.[105] It is likely that PrPc goes through some intermediate states, such as an at least partially unfolded or degraded state, before finally ending up as part of an amyloid fibril.[92]


Like other distributed computing projects, Folding@home is an online citizen science project. In these projects non-specialists contribute computer processing power or help to analyse data produced by professional scientists. Participants in these projects play an invaluable role in facilitating research for little or no obvious reward.

Research has been carried out into the motivations of citizen scientists and most of these studies have found that participants are motivated to take part because of altruistic reasons, that is, they want to help scientists and make a contribution to the advancement of their research.[106][107][108][109] Many participants in citizen science have an underlying interest in the topic of the research and gravitate towards projects that are in disciplines of interest to them. Folding@home is no different in that respect.[110] Research carried out recently on over 400 active participants revealed that they wanted to help make a contribution to research and that many had friends or relatives affected by the diseases that the Folding@home scientists investigate.

Folding@home attracts participants who are computer hardware enthusiasts (sometimes called overclockers). These groups bring considerable expertise to the project and are able to build computers with advanced processing power.[111] Other distributed computing projects attract these types of participants, and projects are often used to benchmark the performance of modified computers; this aspect of the hobby is accommodated through the competitive nature of the project. Individuals and teams can compete to see who can process the most work for the project.

This latest research on Folding@home involving interview and ethnographic observation of online groups showed that teams of hardware enthusiasts can sometimes work together, sharing best practice with regard to maximising processing output. Such teams can become communities of practice, with a shared language and online culture. This pattern of participation has been observed in other distributed computing projects.[112][113]

Another key observation of Folding@home participants is that many are male.[110] This has also been observed in other distributed projects. Furthermore, many participants work in computer and technology-based jobs and careers.[110][114][115]

Not all Folding@home participants are hardware enthusiasts. Many participants run the project software on unmodified machines and do take part competitively. Over 100,000 participants are involved in Folding@home. It is difficult to ascertain what proportion of participants are hardware enthusiasts, although, according to the project managers, the contribution of the enthusiast community is substantially larger in terms of processing power.[116]

On September 16, 2007, due in large part to the participation of PlayStation 3 consoles, the Folding@home project officially attained a sustained performance level higher than one native petaFLOPS, becoming the first computing system of any kind to do so.[122][123] Top500's fastest supercomputer at the time was BlueGene/L, at 0.280 petaFLOPS.[124] The following year, on May 7, 2008, the project attained a sustained performance level higher than two native petaFLOPS,[125] followed by the three and four native petaFLOPS milestones in August 2008[126][127] and on September 28, 2008 respectively.[128] On February 18, 2009, Folding@home achieved five native petaFLOPS,[129][130] and was the first computing project to meet these five levels.[132] In comparison, November 2008's fastest supercomputer was IBM's Roadrunner at 1.105 petaFLOPS.[133] On November 10, 2011, Folding@home's performance exceeded six native petaFLOPS with the equivalent of nearly eight x86 petaFLOPS.[123][134] In mid-May 2013, Folding@home attained over seven native petaFLOPS, with the equivalent of 14.87 x86 petaFLOPS. It then reached eight native petaFLOPS on June 21, followed by nine on September 9 of that year, with 17.9 x86 petaFLOPS.[135] On May 11, 2016, Folding@home announced that it was moving towards reaching the 100 x86 petaFLOPS mark.[136]

Similarly to other distributed computing projects, Folding@home quantitatively assesses user computing contributions to the project through a credit system.[137] All units from a given protein project have uniform base credit, which is determined by benchmarking one or more work units from that project on an official reference machine before the project is released.[137] Each user receives these base points for completing every work unit, though through the use of a passkey they can receive added bonus points for reliably and rapidly completing units which are more demanding computationally or have a greater scientific priority.[138][139] Users may also receive credit for their work by clients on multiple machines.[40] This point system attempts to align awarded credit with the value of the scientific results.[137]

Users can register their contributions under a team, which combines the points of all its members. A user can start their own team, or they can join an existing team.[140] In some cases, a team may have its own community-driven sources of help or recruitment such as an Internet forum.[141] The points can foster friendly competition between individuals and teams to compute the most for the project, which can benefit the folding community and accelerate scientific research.[137][142][143] Individual and team statistics are posted on the Folding@home website.[137]

If a user does not form a new team, or does not join an existing team, that user automatically becomes part of a "Default" team. This "Default" team has a team number of "0". Statistics are accumulated for this "Default" team as well as for specially named teams.
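The team rules described above amount to a simple aggregation: each completed work unit carries a user, an optional team number, and a point value, and users without a team are counted under team 0. The sketch below illustrates that bookkeeping only; the record layout, user names, team numbers, and point values are hypothetical and do not reflect the project's actual statistics system.

```python
from collections import defaultdict
from typing import Dict, Iterable, Optional, Tuple

DEFAULT_TEAM = 0  # team number used when a user has not chosen a team


def team_totals(
    results: Iterable[Tuple[str, Optional[int], int]]
) -> Dict[int, int]:
    """Sum points per team from (user, team_number_or_None, points) records."""
    totals: Dict[int, int] = defaultdict(int)
    for _user, team, points in results:
        totals[DEFAULT_TEAM if team is None else team] += points
    return dict(totals)


# Hypothetical users, team numbers, and point values.
records = [
    ("alice", 1234, 500),
    ("bob", 1234, 250),
    ("carol", None, 100),  # no team chosen, so counted under team 0
    ("dave", 0, 40),       # explicitly on the default team
]
totals = team_totals(records)
print(totals)
```

Here the two members of team 1234 contribute 750 points together, while the two users without a named team accumulate 140 points under the "Default" team.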

Folding@home software at the user's end involves three primary components: work units, cores, and a client.

A work unit is the protein data that the client is asked to process. Work units are a fraction of the simulation between the states in a Markov state model. After the work unit has been downloaded and completely processed by a volunteer's computer, it is returned to Folding@home servers, which then award the volunteer the credit points. This cycle repeats automatically.[142] All work units have associated deadlines, and if this deadline is exceeded, the user may not get credit and the unit will be automatically reissued to another participant. As protein folding occurs serially, and many work units are generated from their predecessors, this allows the overall simulation process to proceed normally if a work unit is not returned after a reasonable period of time. Due to these deadlines, the minimum system requirement for Folding@home is a Pentium 3 450 MHz CPU with Streaming SIMD Extensions (SSE).[40] However, work units for high-performance clients have a much shorter deadline than those for the uniprocessor client, as a major part of the scientific benefit is dependent on rapidly completing simulations.[144]

Before public release, work units go through several quality assurance steps to keep problematic ones from becoming fully available. These testing stages include internal, beta, and advanced, before a final full release across Folding@home.[145] Folding@home's work units are normally processed only once, except in the rare event that errors occur during processing. If this occurs for three different users, the unit is automatically pulled from distribution.[146][147] The Folding@home support forum can be used to differentiate between issues arising from problematic hardware and bad work units.[148]
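The deadline and error rules for work units can be pictured as a small state machine. The following is a minimal sketch under stated assumptions: the class, method names, and status labels are invented for illustration, and the real server logic is not described in this article beyond the rules it encodes (reissue after a missed deadline, and withdrawal after errors for three different users).

```python
from dataclasses import dataclass, field
from typing import Set

MAX_FAILED_USERS = 3  # from the text: pulled after errors for three different users


@dataclass
class WorkUnit:
    unit_id: str
    deadline: float            # time by which the result is due
    assigned_to: str = ""
    failed_users: Set[str] = field(default_factory=set)
    status: str = "queued"     # queued, assigned, completed, or pulled

    def assign(self, user: str) -> None:
        if self.status != "queued":
            raise RuntimeError(f"cannot assign a unit that is {self.status}")
        self.assigned_to = user
        self.status = "assigned"

    def report_result(self, now: float) -> bool:
        """Accept a result; return True if it arrived in time to earn credit."""
        if self.status != "assigned" or now > self.deadline:
            return False
        self.status = "completed"
        return True

    def expire_if_late(self, now: float, new_deadline: float) -> bool:
        """Put the unit back in the queue if its deadline has passed."""
        if self.status == "assigned" and now > self.deadline:
            self.assigned_to = ""
            self.deadline = new_deadline
            self.status = "queued"
            return True
        return False

    def report_error(self) -> None:
        """Record a processing error from the user currently holding the unit."""
        if self.status != "assigned":
            return
        self.failed_users.add(self.assigned_to)
        self.assigned_to = ""
        self.status = (
            "pulled" if len(self.failed_users) >= MAX_FAILED_USERS else "queued"
        )


unit = WorkUnit("wu-1", deadline=100.0)
unit.assign("alice")
unit.expire_if_late(now=150.0, new_deadline=250.0)  # alice missed the deadline
for user in ("bob", "carol", "dave"):
    unit.assign(user)
    unit.report_error()
print(unit.status)
```

In this run the unit is reissued once after a missed deadline and then withdrawn after three different users report errors, so its final status is "pulled".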

Specialized molecular dynamics programs, referred to as "FahCores" and often abbreviated "cores", perform the calculations on the work unit as a background process. A large majority of Folding@home's cores are based on GROMACS,[142] one of the fastest and most popular molecular dynamics software packages, which largely consists of manually optimized assembly language code and hardware optimizations.[149][150] Although GROMACS is open-source software and there is a cooperative effort between the Pande lab and GROMACS developers, Folding@home uses a closed-source license to help ensure data validity.[151] Less active cores include ProtoMol and SHARPEN. Folding@home has used AMBER, CPMD, Desmond, and TINKER, but these have since been retired and are no longer in active service.[3][152][153] Some of these cores perform explicit solvation calculations in which the surrounding solvent (usually water) is modeled atom-by-atom; while others perform implicit solvation methods, where the solvent is treated as a mathematical continuum.[154][155] The core is separate from the client to enable the scientific methods to be updated automatically without requiring a client update. The cores periodically create calculation checkpoints so that if they are interrupted they can resume work from that point upon startup.[142]
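Checkpointing of the kind described above can be illustrated with a toy calculation that periodically saves its state and, on restart, picks up from the last saved point. This is a generic sketch, not the FahCores' actual mechanism: the function name, the JSON file format, and the checkpoint interval are all assumptions made for illustration.

```python
import json
import os
import tempfile
from typing import Optional


def run_simulation(total_steps: int, checkpoint_path: str,
                   checkpoint_every: int = 10,
                   stop_after: Optional[int] = None) -> dict:
    """Advance a toy calculation, saving and resuming from checkpoints.

    `stop_after` simulates an interruption after that many steps in this call.
    """
    # Resume from the last checkpoint if one exists, else start fresh.
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as fh:
            state = json.load(fh)
    else:
        state = {"step": 0, "value": 0}

    steps_this_call = 0
    while state["step"] < total_steps:
        if stop_after is not None and steps_this_call >= stop_after:
            break  # interrupted: work since the last checkpoint is lost
        state["step"] += 1
        state["value"] += state["step"]  # stand-in for real computation
        steps_this_call += 1
        if state["step"] % checkpoint_every == 0:
            # Write to a temporary file, then rename, so a crash during the
            # write cannot leave a half-written checkpoint behind.
            tmp_path = checkpoint_path + ".tmp"
            with open(tmp_path, "w") as fh:
                json.dump(state, fh)
            os.replace(tmp_path, checkpoint_path)
    return state


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "checkpoint.json")
    run_simulation(100, path, stop_after=25)  # interrupted after 25 steps
    final = run_simulation(100, path)         # resumes from the step-20 checkpoint
    print(final)
```

The interrupted run loses only the five steps computed after its last checkpoint, and the resumed run reaches the same final state as an uninterrupted one.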

A Folding@home participant installs a client program on their personal computer. The user interacts with the client, which manages the other software components in the background. Through the client, the user may pause the folding process, open an event log, check the work progress, or view personal statistics.[156] The computer clients run continuously in the background at a very low priority, using idle processing power so that normal computer use is unaffected.[40][140] The maximum CPU use can be adjusted via client settings.[156][157] The client connects to a Folding@home server and retrieves a work unit and may also download the appropriate core for the client's settings, operating system, and the underlying hardware architecture. After processing, the work unit is returned to the Folding@home servers. Computer clients are tailored to uniprocessor and multi-core processor systems, and graphics processing units. The diversity and power of each hardware architecture provides Folding@home with the ability to efficiently complete many types of simulations in a timely manner (in a few weeks or months rather than years), which is of significant scientific value. Together, these clients allow researchers to study biomedical questions formerly considered impractical to tackle computationally.[36][142][144]

Professional software developers are responsible for most of Folding@home's code, both for the client and server-side. The development team includes programmers from Nvidia, ATI, Sony, and Cauldron Development.[158] Clients can be downloaded only from the official Folding@home website or its commercial partners, and will only interact with Folding@home computer files. They will upload and download data with Folding@home's data servers (over port 8080, with 80 as an alternate), and the communication is verified using 2048-bit digital signatures.[40][159] While the client's graphical user interface (GUI) is open-source,[160] the client is proprietary software, with security and scientific integrity cited as the reasons.[161][162][163]

However, this rationale for using proprietary software is disputed: while the license could be enforceable in the legal domain retrospectively, it does not practically prevent the modification (also known as patching) of the executable binary files. Likewise, binary-only distribution does not prevent the malicious modification of executable binary code, either through a man-in-the-middle attack while being downloaded via the internet,[164] or by the redistribution of binaries by a third party that have been previously modified either in their binary state (i.e. patched),[165] or by decompiling[166] and recompiling them after modification.[167][168] These risks remain unless the binary files and the transport channel are signed and the recipient person or system is able to verify the digital signature, in which case unwarranted modifications should be detectable, though not always.[169] Either way, since in the case of Folding@home the input data and output result processed by the client software are both digitally signed,[40][159] the integrity of work can be verified independently from the integrity of the client software itself.

Folding@home uses the Cosm software libraries for networking.[142][158] Folding@home was launched on October 1, 2000, and was the first distributed computing project aimed at bio-molecular systems.[170] Its first client was a screensaver, which would run while the computer was not otherwise in use.[171][172] In 2004, the Pande lab collaborated with David P. Anderson to test a supplemental client on the open-source BOINC framework. This client was released to closed beta in April 2005;[173] however, the method became unworkable and was shelved in June 2006.[174]

The specialized hardware of graphics processing units (GPU) is designed to accelerate rendering of 3-D graphics applications such as video games and can significantly outperform CPUs for some types of calculations. GPUs are one of the most powerful and rapidly growing computing platforms, and many scientists and researchers are pursuing general-purpose computing on graphics processing units (GPGPU). However, GPU hardware is difficult to use for non-graphics tasks and usually requires significant algorithm restructuring and an advanced understanding of the underlying architecture.[175] Such customization is challenging, more so to researchers with limited software development resources. Folding@home uses the open-source OpenMM library, which uses a bridge design pattern with two application programming interface (API) levels to interface molecular simulation software to an underlying hardware architecture. With the addition of hardware optimizations, OpenMM-based GPU simulations need no significant modification but achieve performance nearly equal to hand-tuned GPU code, and greatly outperform CPU implementations.[154][176]
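A bridge design pattern with two API levels can be sketched in a few lines. The example below is not OpenMM's actual API; the class names (Platform, CpuPlatform, MappedPlatform, Simulation) and the toy harmonic force are assumptions chosen only to show how high-level simulation code can stay unchanged while the hardware-specific layer is swapped.

```python
from abc import ABC, abstractmethod


class Platform(ABC):
    """Low-level API (hypothetical): one implementation per hardware backend."""

    @abstractmethod
    def compute_forces(self, positions):
        """Return one force value per coordinate."""


class CpuPlatform(Platform):
    def compute_forces(self, positions):
        # Toy harmonic restoring force F = -x, computed in a plain loop.
        forces = []
        for x in positions:
            forces.append(-x)
        return forces


class MappedPlatform(Platform):
    """Stand-in for an accelerated backend: same result, different code path."""

    def compute_forces(self, positions):
        return list(map(lambda x: -x, positions))


class Simulation:
    """High-level API (hypothetical): integration logic with no hardware details."""

    def __init__(self, positions, platform, dt=0.1):
        self.positions = list(positions)
        self.velocities = [0.0] * len(self.positions)
        self.platform = platform
        self.dt = dt

    def step(self):
        # The only contact with the backend is this single call.
        forces = self.platform.compute_forces(self.positions)
        self.velocities = [v + f * self.dt
                           for v, f in zip(self.velocities, forces)]
        self.positions = [x + v * self.dt
                          for x, v in zip(self.positions, self.velocities)]


# The same simulation code runs on either backend.
sim_a = Simulation([1.0, -2.0], CpuPlatform())
sim_b = Simulation([1.0, -2.0], MappedPlatform())
sim_a.step()
sim_b.step()
```

Because Simulation depends only on the abstract Platform interface, a new backend can be added without touching the simulation logic, which is the property the paragraph above attributes to the two-level design.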

Before 2010, the computing reliability of GPGPU consumer-grade hardware was largely unknown, and circumstantial evidence related to the lack of built-in error detection and correction in GPU memory raised reliability concerns. In the first large-scale test of GPU scientific accuracy, a 2010 study of over 20,000 hosts on the Folding@home network detected soft errors in the memory subsystems of two-thirds of the tested GPUs. These errors strongly correlated to board architecture, though the study concluded that reliable GPU computing was very feasible as long as attention is paid to the hardware traits, such as software-side error detection.[177]

The first generation of Folding@home's GPU client (GPU1) was released to the public on October 2, 2006,[174] delivering a 20 to 30 times speedup for some calculations over its CPU-based GROMACS counterparts.[178] It was the first time GPUs had been used for either distributed computing or major molecular dynamics calculations.[179][180] GPU1 gave researchers significant knowledge and experience with the development of GPGPU software, but in response to scientific inaccuracies with DirectX, on April 10, 2008 it was succeeded by GPU2, the second generation of the client.[178][181] Following the introduction of GPU2, GPU1 was officially retired on June 6.[178] Compared to GPU1, GPU2 was more scientifically reliable and productive, ran on ATI and CUDA-enabled Nvidia GPUs, and supported more advanced algorithms, larger proteins, and real-time visualization of the protein simulation.[182][183] Following this, the third generation of Folding@home's GPU client (GPU3) was released on May 25, 2010. While backward compatible with GPU2, GPU3 was more stable, efficient, and flexible in its scientific abilities,[184] and used OpenMM on top of an OpenCL framework.[184][185] Although these GPU3 clients did not natively support the operating systems Linux and macOS, Linux users with Nvidia graphics cards were able to run them through the Wine software application.[186][187] GPUs remain Folding@home's most powerful platform in FLOPS. As of November 2012, GPU clients account for 87% of the entire project's x86 FLOPS throughput.[188]

Native support for Nvidia and AMD graphics cards under Linux was introduced with FahCore 17, which uses OpenCL rather than CUDA.[189]

From March 2007 until November 2012, Folding@home took advantage of the computing power of PlayStation 3s. At the time of its inception, its main streaming Cell processor delivered a 20 times speed increase over PCs for some calculations, processing power which could not be found on other systems such as the Xbox 360.[36][190] The PS3's high speed and efficiency introduced other opportunities for worthwhile optimizations according to Amdahl's law, and significantly changed the tradeoff between computing efficiency and overall accuracy, allowing the use of more complex molecular models at little added computing cost.[191] This allowed Folding@home to run biomedical calculations that would have been otherwise infeasible computationally.[192]
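Amdahl's law, mentioned above, relates the overall speedup of a program to the fraction of its work that benefits from faster hardware. The sketch below is a minimal illustration; the 95% fraction is an assumed example value, not a measurement from Folding@home, and the function name is hypothetical.

```python
def amdahl_speedup(accelerated_fraction: float, unit_speedup: float) -> float:
    """Overall speedup when only part of the work runs faster.

    accelerated_fraction: share of the original run time that benefits (0 to 1).
    unit_speedup: how many times faster that share runs on the new hardware.
    """
    if not 0.0 <= accelerated_fraction <= 1.0:
        raise ValueError("accelerated_fraction must be between 0 and 1")
    if unit_speedup <= 0:
        raise ValueError("unit_speedup must be positive")
    remaining = 1.0 - accelerated_fraction
    return 1.0 / (remaining + accelerated_fraction / unit_speedup)


# Assumed example: 95% of the run time benefits from a 20x faster processor.
overall = amdahl_speedup(0.95, 20.0)
print(f"overall speedup: {overall:.2f}x")  # about 10.26x
```

Even with a 20 times faster unit, the 5% of work that does not benefit limits the overall gain to roughly half of that, which is why faster hardware shifts attention to optimizing the remaining parts of a calculation.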

The PS3 client was developed in a collaborative effort between Sony and the Pande lab and was first released as a standalone client on March 23, 2007.[36][193] Its release made Folding@home the first distributed computing project to use PS3s.[194] On September 18 of the following year, the PS3 client became a channel of Life with PlayStation on its launch.[195][196] In the types of calculations it can perform, at the time of its introduction, the client fit in between a CPU's flexibility and a GPU's speed.[142] However, unlike clients running on personal computers, users were unable to perform other activities on their PS3 while running Folding@home.[192] The PS3's uniform console environment made technical support easier and made Folding@home more user friendly.[36] The PS3 also had the ability to stream data quickly to its GPU, which was used for real-time atomic-level visualizing of the current protein dynamics.[191]

On November 6, 2012, Sony ended support for the Folding@home PS3 client and other services available under Life with PlayStation. Over its lifetime of five years and seven months, more than 15 million users contributed over 100 million hours of computing to Folding@home, greatly assisting the project with disease research. Following discussions with the Pande lab, Sony decided to terminate the application. Pande considered the PlayStation 3 client a "game changer" for the project.[197][198][199]

Folding@home can use the parallel computing abilities of modern multi-core processors. The ability to use several CPU cores simultaneously allows completing the full simulation far faster. Working together, these CPU cores complete single work units proportionately faster than the standard uniprocessor client. This method is scientifically valuable because it enables much longer simulation trajectories to be performed in the same amount of time, and reduces the traditional difficulties of scaling a large simulation to many separate processors.[200] A 2007 publication in the Journal of Molecular Biology relied on multi-core processing to simulate the folding of part of the villin protein approximately 10 times longer than was possible with a single-processor client, in agreement with experimental folding rates.[201]

In November 2006, first-generation symmetric multiprocessing (SMP) clients were publicly released for open beta testing, referred to as SMP1.[174] These clients used Message Passing Interface (MPI) communication protocols for parallel processing, as at that time the GROMACS cores were not designed to be used with multiple threads.[144] This was the first time a distributed computing project had used MPI.[202] Although the clients performed well in Unix-based operating systems such as Linux and macOS, they were troublesome under Windows.[200][202] On January 24, 2010, SMP2, the second generation of the SMP clients and the successor to SMP1, was released as an open beta and replaced the complex MPI with a more reliable thread-based implementation.[139][158]

SMP2 supports a trial of a special category of bigadv work units, designed to simulate proteins that are unusually large and computationally intensive and have a great scientific priority. These units originally required a minimum of eight CPU cores,[203] which was later raised to sixteen on February 7, 2012.[204] Along with these added hardware requirements over standard SMP2 work units, they require more system resources such as random-access memory (RAM) and Internet bandwidth. In return, users who run these are rewarded with a 20% increase over SMP2's bonus point system.[205] The bigadv category allows Folding@home to run especially demanding simulations for long times that had formerly required use of supercomputing clusters and could not be performed anywhere else on Folding@home.[203] Many users with hardware able to run bigadv units later had their hardware setup deemed ineligible for bigadv work units when CPU core minimums were increased, leaving them able to run only the normal SMP work units. This frustrated many users who had invested significant amounts of money into the program only to have their hardware become obsolete for bigadv purposes shortly after. As a result, Pande announced in January 2014 that the bigadv program would end on January 31, 2015.[206]

The V7 client is the seventh and latest generation of the Folding@home client software, and is a full rewrite and unification of the prior clients for Windows, macOS, and Linux operating systems.[207][208] It was released on March 22, 2012.[209] Like its predecessors, V7 can run Folding@home in the background at a very low priority, allowing other applications to use CPU resources as they need. It is designed to make the installation, start-up, and operation more user-friendly for novices, and offer greater scientific flexibility to researchers than prior clients.[210] V7 uses Trac for managing its bug tickets so that users can see its development process and provide feedback.[208]

V7 consists of four integrated elements. The user typically interacts with V7's open-source GUI, named FAHControl.[160][211] This has Novice, Advanced, and Expert user interface modes, and has the ability to monitor, configure, and control many remote folding clients from one computer. FAHControl directs FAHClient, a back-end application that in turn manages each FAHSlot (or slot). Each slot acts as replacement for the formerly distinct Folding@home v6 uniprocessor, SMP, or GPU computer clients, as it can download, process, and upload work units independently. The FAHViewer function, modeled after the PS3's viewer, displays a real-time 3-D rendering, if available, of the protein currently being processed.[207][208]

In 2014, a client for the Google Chrome and Chromium web browsers was released, allowing users to run Folding@home in their web browser. The client uses Google's Native Client (NaCl) feature on Chromium-based web browsers to run the Folding@home code at near-native speed in a sandbox on the user's machine.[212] Due to the phasing out of NaCl and changes at Folding@home, the web client was permanently shut down in June 2019.[213]

In July 2015, a client for Android mobile phones was released on Google Play for devices running Android 4.4 KitKat or newer.[214][215]

On February 16, 2018, the Android client, which was offered in cooperation with Sony, was removed from Google Play. Plans were announced to offer an open-source alternative in the future.[216]

Rosetta@home is a distributed computing project aimed at protein structure prediction and is one of the most accurate tertiary structure predictors.[217][218] The conformational states from Rosetta's software can be used to initialize a Markov state model as starting points for Folding@home simulations.[22] Conversely, structure prediction algorithms can be improved from thermodynamic and kinetic models and the sampling aspects of protein folding simulations.[219] As Rosetta only tries to predict the final folded state, and not how folding proceeds, Rosetta@home and Folding@home are complementary and address very different molecular questions.[22][220]

Anton is a special-purpose supercomputer built for molecular dynamics simulations. In October 2011, Anton and Folding@home were the two most powerful molecular dynamics systems.[221] Anton is unique in its ability to produce single ultra-long computationally costly molecular trajectories,[222] such as one in 2010 which reached the millisecond range.[223][224] These long trajectories may be especially helpful for some types of biochemical problems.[225][226] However, Anton does not use Markov state models (MSM) for analysis. In 2011, the Pande lab constructed an MSM from two 100-µs Anton simulations and found alternative folding pathways that were not visible through Anton's traditional analysis. They concluded that there was little difference between MSMs constructed from a limited number of long trajectories and those assembled from many shorter trajectories.[222] In June 2011, Folding@home began added sampling of an Anton simulation in an effort to better determine how its methods compare to Anton's.[227][228] However, unlike Folding@home's shorter trajectories, which are more amenable to distributed computing and other parallelizing methods, longer trajectories do not require adaptive sampling to sufficiently sample the protein's phase space. Due to this, it is possible that a combination of Anton's and Folding@home's simulation methods would provide a more thorough sampling of this space.[222]
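How a Markov state model can be assembled from many short trajectories can be sketched as follows. This is a minimal illustration that assumes the trajectories have already been reduced to sequences of discrete state indices; it uses a plain count-and-normalize estimate, not the estimators used by the Pande lab, and the function name and sample data are hypothetical.

```python
def estimate_transition_matrix(trajectories, n_states, lag=1):
    """Estimate MSM transition probabilities by counting observed jumps.

    trajectories: iterable of lists of integer state indices (0..n_states-1).
    lag: number of frames between the start and end of each counted jump.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    counts = [[0] * n_states for _ in range(n_states)]
    for traj in trajectories:
        # Pair each frame with the frame `lag` steps later in the same trajectory.
        for start, end in zip(traj[:-lag], traj[lag:]):
            counts[start][end] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # Rows with no observations stay at zero instead of dividing by zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Three short, made-up trajectories over two states (0 = unfolded, 1 = folded).
short_trajectories = [[0, 0, 1], [1, 1, 0], [0, 1, 1]]
T = estimate_transition_matrix(short_trajectories, n_states=2)
```

Transitions are counted within each trajectory and pooled across all of them, so many short runs contribute to one model in the same way a single long run would.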

Excerpt from:

Folding@home - Wikipedia

Protein Folding: The Good, the Bad, and the Ugly – Science …

We often think of proteins as nutrients in the food we eat or the main component of muscles, but proteins are also microscopic molecules inside of cells that perform diverse and vital jobs. With the Human Genome Project complete, scientists are turning their attention to the human proteome, the catalog of all human proteins. This work has shown that the world of proteins is a fascinating one, full of molecules with such intricate shapes and precise functions that they seem almost fanciful.

A protein's function depends on its shape, and when protein formation goes awry, the resulting misshapen proteins cause problems that range from bad, when proteins neglect their important work, to ugly, when they form a sticky, clumpy mess inside of cells. Current research suggests that the world of proteins is far from pristine. Protein formation is an error-prone process, and mistakes along the way have been linked to a number of human diseases.

There are 20,000 to over 100,000 unique types of proteins within a typical human cell. Why so many? Proteins are the workhorses of the cell. Each expertly performs a specific task. Some are structural, lending stiffness and rigidity to muscle cells or long thin neurons, for example. Others bind to specific molecules and shuttle them to new locations, and still others catalyze reactions that allow cells to divide and grow. This wealth of diversity and specificity in function is made possible by a seemingly simple property of proteins: they fold.

A protein starts off in the cell as a long chain of, on average, 300 building blocks called amino acids. There are 22 different types of amino acids, and their ordering determines how the protein chain will fold upon itself. When folding, two types of structures usually form first. Some regions of the protein chain coil up into slinky-like formations called alpha helices, while other regions fold into zigzag patterns called beta sheets, which resemble the folds of a paper fan. These two structures can interact to form more complex structures. For example, in one protein structure, several beta sheets wrap around themselves to form a hollow tube with a few alpha helices jutting out from one end. The tube is short and squat such that the overall structure resembles snakes (alpha helices) emerging from a can (beta sheet tube). A few other protein structures with descriptive names include the beta barrel, the beta propeller, the alpha/beta horseshoe, and the jelly-roll fold.

These complex structures allow proteins to perform their diverse jobs in the cell. The "snakes in a can" protein, when embedded in a cell membrane, creates a tunnel that allows traffic into and out of cells. Other proteins form shapes with pockets called active sites that are perfectly shaped to bind to a particular molecule, like a lock and key. By folding into distinct shapes, proteins can perform very different roles despite being composed of the same basic building blocks. To draw an analogy, all vehicles are made from steel, but a racecar's sleek shape wins races, while a bus, dump truck, crane, or zamboni are each shaped to perform their own unique tasks.

Folding allows a protein to adopt a functional shape, but it is a complex process that sometimes fails. Protein folding can go wrong for three major reasons:

1: A person might possess a mutation that changes an amino acid in the protein chain, making it difficult for a particular protein to find its preferred fold or native state. This is the case for inherited mutations, for example, those leading to cystic fibrosis or sickle cell anemia. These mutations are located in the DNA sequence or gene that encodes one particular protein. Therefore, these types of inherited mutations affect only that particular protein and its related function.

2: On the other hand, protein folding failure can be viewed as an ongoing and more general process that affects many proteins. When proteins are created, the machine that reads the directions from DNA to create the long chains of amino acids can make mistakes. Scientists estimate that this machine, the ribosome, makes mistakes in as many as 1 in every 7 proteins! These mistakes can make the resulting proteins less likely to fold properly.

3: Even if an amino acid chain has no mutations or mistakes, it may still not reach its preferred folded shape simply because proteins do not fold correctly 100% of the time. Protein folding becomes even more difficult if the conditions in the cell, like acidity and temperature, change from those to which the organism is accustomed.

A failure in protein folding causes several known diseases, and scientists hypothesize that many more diseases may be related to folding problems. There are two completely different problems that occur in cells when their proteins do not fold properly.

One type of problem, called loss of function, results when not enough of a particular protein folds properly, causing a shortage of specialized workers needed to do a specific job. For example, imagine that a properly folded protein is perfectly shaped to bind a toxin and break it into less toxic byproducts. Without enough of the properly folded protein available, the toxin will build up to damaging levels. As another example, a protein may be responsible for metabolizing sugar so that the cell can use it for energy. The cell will grow slowly due to lack of energy if not enough of the protein is present in its functional state. The reason the cell gets sick, in these cases, is due to a lack of one specific, properly folded, functional protein. Cystic fibrosis, Tay-Sachs disease, Marfan syndrome, and some forms of cancer are examples of diseases that result when one type of protein is not able to perform its job. Who knew that one type of protein among tens of thousands could be so important?

Proteins that fold improperly may also impact the health of the cell regardless of the function of the protein. When proteins fail to fold into their functional state, the resulting misfolded proteins can be contorted into shapes that are unfavorable to the crowded cellular environment. Most proteins possess sticky, water-hating amino acids that they bury deep inside their core. Misfolded proteins wear these inner parts on the outside, like a chocolate-covered candy that has been crushed to reveal a gooey caramel center. These misfolded proteins often stick together, forming clumps called aggregates. Scientists hypothesize that the accumulation of misfolded proteins plays a role in several neurological diseases, including Alzheimer's, Parkinson's, Huntington's, and Lou Gehrig's (ALS) disease, but scientists are still working to discover exactly how these misfolded, sticky molecules inflict their damage on cells.

One misfolded protein stands out among the rest to deserve special attention. The prion protein in Creutzfeldt-Jakob disease, a human disease related to mad cow disease, is an example of a misfolded protein gone rogue. This protein is not only irreversibly misfolded, but it converts other functional proteins into its twisted state.

Recent research shows that protein misfolding happens frequently inside of cells. Fortunately, cells are accustomed to coping with this problem and have several systems in place to refold or destroy aberrant protein formations.

Chaperones are one such system. Appropriately named, they accompany proteins through the folding process, improving a protein's chances of folding properly and even allowing some misfolded proteins the opportunity to refold. Interestingly, chaperones are proteins themselves! There are many different types of chaperones. Some cater specifically to helping one type of protein fold, while others act more generally. Some chaperones are shaped like large hollow chambers and provide proteins with a safe space, isolated from other molecules, in which to fold. Production of several chaperones is boosted when a cell encounters high temperatures or other conditions making protein folding more difficult, thus earning these chaperones the alias "heat shock proteins."

Another line of cell defense against misfolded proteins is called the proteasome. If misfolded proteins linger in the cell, they will be targeted for destruction by this machine, which chews up proteins and spits them out as small fragments of amino acids. The proteasome is like a recycling center, allowing the cell to reuse amino acids to make more proteins. The proteasome itself is not one protein but many acting together. Proteins frequently interact to form larger structures with important cellular functions. For example, the tail of a human sperm is a structure composed of many types of proteins that work together to form a complex rotary engine that propels the sperm forward.

Why is it that some misfolded proteins are able to evade systems like chaperones and the proteasome? How can sticky misfolded proteins cause the neurodegenerative diseases listed above? Do some proteins misfold more often than others? These questions are at the forefront of current research seeking to understand basic protein biology and the diseases that result when protein folding goes awry.

The wide world of proteins, with its great assortment of shapes, bestows cells with capabilities that allow for life to exist and allow for its diversity (e.g., the differences between eye, skin, lung or heart cells, and the differences between species). Perhaps for this reason, the word "protein" is from the Greek word protas, meaning "of primary importance."

Contributed by Kerry Geiler, a 4th year Ph.D student in the Harvard Department of Organismic and Evolutionary Biology

Go here to read the rest:

Protein Folding: The Good, the Bad, and the Ugly - Science ...

Protein Folding – Chemistry LibreTexts

Introduction and Protein Structure

Proteins have several layers of structure, each of which is important in the process of protein folding. The first and most basic level of this structure is the sequence of amino acids themselves.[1] The sequencing is important because it will determine the types of interactions seen in the protein as it is folding. One novel sequence-based method rests on the assumption that protein-protein interactions are more related to amino acids at the surface than to those at the core.[2] This study shows that not only the amino acids in a protein are important but also the order in which they are sequenced. The interactions of the amino acids will determine what the secondary and tertiary structure of the protein will be.

The next layer in protein structure is the secondary structure. The secondary structure includes architectural structures that extend in one dimension.[1] Secondary structure includes α-helices (Figure 1) and β-sheets (Figure 2). In α-helices, the most common secondary structure in proteins, the peptide CO-NH groups in the backbone form chains held together by NH···OC hydrogen bonds.[3] The α-helices form the backbone of proteins and help to aid in the folding process. The β-sheets form in two distinct ways: as parallel β-pleated sheets and as antiparallel β-pleated sheets.[1] When the α-helix or β-sheet is formed, the excluded volumes generated by the backbone and side chains overlap, leading to an increase in the total volume available to the translational displacement of water molecules.[4] This is important because it leads to a more thermodynamically stable conformation and less strain on the protein as a whole.

Figure 1: (left) typical example of an α-helix, from Wikimedia Commons
Figure 2: (right) typical example of a β-sheet, from Wikimedia Commons

The tertiary structure is the next layer in protein structure. This takes the α-helices and β-sheets and allows them to fold into a three-dimensional structure.[1] Most proteins take on a globular structure once folded. The description of globular protein structures as an ensemble of contiguous closed loops or tightened end fragments reveals fold elements crucial for the formation of stable structures and for navigating the very process of protein folding.[5] The globular proteins generally have a hydrophobic core surrounded by a hydrophilic outer layer. These interactions are important because they lead to the global structure and help create channels and binding sites for enzymes.

The last layer of protein structure is the quaternary structure. The folding transition and the functional transitions between useful states are encoded in the linear sequence of amino acids, and a long-term goal of structural biology is to be able to predict both the structure and function of molecules from the information in the sequence.[6] The subunit organization is the last level of structure in protein molecules.[1] The organization of the subunits is important because it determines the types of interactions that can form and dictates the protein's use in the body.

Proteins are folded and held together by several forms of molecular interactions. The molecular interactions include the thermodynamic stability of the complex, the hydrophobic interactions and the disulfide bonds formed in the proteins. The figure below (figure 3) is an example of protein folding.

Figure 3: Protein Folding, from Wikimedia Commons

The biggest factor in a protein's ability to fold is the thermodynamics of the structure. The interaction scheme includes the short-range propensity to form extended conformations, residue-dependent long-range contact potentials, and orientation-dependent hydrogen bonds.[7] Thermodynamics is a main stabilizing force within a protein because if the protein is not in the lowest energy conformation it will continue to move and adjust until it finds its most stable state. The use of energy diagrams and maps is key in finding out when the protein is in the most stable form possible.

The next type of interaction in protein folding is the hydrophobic interactions within the protein. The framework model and the hydrophobic collapse model represent two canonical descriptions of the protein folding process. The first places primary reliance on the short-range interactions of secondary structure and the second assigns greater importance to the long-range interactions of tertiary structure.[6] These hydrophobic interactions have an impact not just on the primary structure but then lead to changes seen in the secondary and tertiary structure as well. Globular proteins acquire distinct compact native conformations in water as a result of the hydrophobic effect.[7] When a protein has been folded in the correct way it usually exists with the hydrophobic core as a result of being hydrated by waters in the system around it which is important because it creates a charged core to the protein and can lead to the creation of channels within the protein. The hydrophobic interactions are found to affect time correlation functions in the vicinity of the native state even though they have no impact on same time characteristics of the structure fluctuations around the native state.[7] The hydrophobic interactions are shown to have an impact on the protein even after it has found the most stable conformation in how the proteins can interact with each other as well as folding themselves.

Another type of interaction seen when the protein is folding is the disulfide linkages that form in the protein (see Figure 4). The disulfide bond is a sulfur-sulfur chemical bond that results from an oxidative process that links nonadjacent (in most cases) cysteines of a protein.[9] These are a major way that proteins get into their folded form. There are two types of disulfide bonds: those in which the cysteine-cysteine linkage is a stable part of the final folded structure, and those in which pairs of cysteines alternate between the reduced and oxidized states.[9] The more common type is the linkage that causes the protein to fold together and link back on itself, rather than the cysteines that change oxidation states, because the bonds between cysteines, once created, are fairly stable.

Figure 4: Disulfide Bonds, shown in the picture in yellow, from Wikimedia Commons

Proteins can malfunction for several reasons. When a protein is misfolded it can lead to denaturation of the protein. Denaturation is the loss of protein structure and function.[1] Misfolding does not always lead to a complete lack of function; sometimes there is only a partial loss of functionality. The malfunctioning of proteins can sometimes lead to diseases in the human body.

Alzheimer's Disease (AD) is a neurological degenerative disease that affects around 5 million Americans, including nearly half of those who are age 85 or older.[10] The predominant risk factors of AD are age, family history, and heredity. Alzheimer's disease typically results in memory loss, confusion of time and place, misplacing things, and changes in mood and behavior.[11] AD results in dense plaques in the brain that are composed of fibrillar β-amyloid proteins with a well-ordered β-sheet secondary structure.[12] These plaques visually look like voids in the brain matter (see Figure 5) and are directly connected to the deterioration of thought processes. It has been determined that AD is a protein misfolding disease, where the misfolded protein is directly related to the formation of these plaques in the brain.[13]

Figure 5: Comparison of healthy brain (left) with brain with Alzheimer's (right), from Wikimedia Commons

It is yet to be fully understood what exactly causes this protein misfolding to begin, but several theories point to oxidative stress in the brain as the initiating factor. This oxidation results in damage to the phospholipids in the brain, which has been found to result in a faster accumulation of amyloid β-proteins.[14]

Figure 6: Beta-Amyloid Plaque Formation, from Wikimedia Commons

Cystic Fibrosis (CF) is a chronic disease that affects 30,000 Americans. The typical effects of CF are the production of thick, sticky mucus that clogs the lungs and leads to life-threatening lung infection, and obstruction of the pancreas that prevents proper food processing.[15] CF is caused by protein misfolding. This misfolding then results in some change in the protein known as cystic fibrosis transmembrane conductance regulator (CFTR), which can result in this potentially fatal disease.[16] In approximately 70% of CF cases, the phenylalanine at position 508 in CFTR is deleted. This deletion of Phe508 seems to be directly connected to the formation of CF.[17] The protein misfolding that results in CF occurs prior to birth, but it is not entirely clear why.

Here is the original post:

Protein Folding - Chemistry LibreTexts

Protein Structure and Folding

After a polypeptide is produced in protein synthesis, it's not necessarily a functional protein yet! Explore protein folding that occurs within levels of protein structure with the Amoeba Sisters! Primary, secondary, tertiary, and quaternary protein structure levels are briefly discussed. Video also mentions chaperonins (chaperone proteins) and how proteins can be denatured.

Table of Contents:
0:41 Reminder of Protein Roles
1:06 Modifications of Proteins
1:25 Importance of Shape for Proteins
1:56 Levels of Protein Structure
2:06 Primary Structure
3:10 Secondary Structure
3:45 Tertiary Structure
4:58 Quaternary Structure [not in all proteins]
6:01 Proteins often have help in folding [introduces chaperonins]
6:40 Denaturing Proteins

*Further Reading Suggestions*

Related to Protein Misfoldings:

https://www.nature.com/scitable/topic...
https://www.scientificamerican.com/ar...

Learn About "The Protein Folding Problem": https://www.ncbi.nlm.nih.gov/pmc/arti...

Factual References:

OpenStax, Biology. OpenStax CNX. Jun 1, 2018 http://cnx.org/contents/185cbf87-c72e....

Reece, J. B., & Campbell, N. A. (2011). Campbell biology. Boston: Benjamin Cummings / Pearson.



Read this article:

Protein Structure and Folding

Protein folding – Wikipedia

Protein folding is the physical process by which a protein chain acquires its native 3-dimensional structure, a conformation that is usually biologically functional, in an expeditious and reproducible manner. It is the physical process by which a polypeptide folds into its characteristic and functional three-dimensional structure from a random coil.[1] Each protein exists as an unfolded polypeptide or random coil when translated from a sequence of mRNA to a linear chain of amino acids. This polypeptide lacks any stable (long-lasting) three-dimensional structure (the left hand side of the first figure). As the polypeptide chain is being synthesized by a ribosome, the linear chain begins to fold into its three-dimensional structure. Folding begins to occur even during translation of the polypeptide chain. Amino acids interact with each other to produce a well-defined three-dimensional structure, the folded protein (the right hand side of the figure), known as the native state. The resulting three-dimensional structure is determined by the amino acid sequence or primary structure (Anfinsen's dogma).[2]

The correct three-dimensional structure is essential to function, although some parts of functional proteins may remain unfolded,[3] so that protein dynamics is important. Failure to fold into native structure generally produces inactive proteins, but in some instances misfolded proteins have modified or toxic functionality. Several neurodegenerative and other diseases are believed to result from the accumulation of amyloid fibrils formed by misfolded proteins.[4] Many allergies are caused by incorrect folding of some proteins, because the immune system does not produce antibodies for certain protein structures.[5]

Denaturation of proteins is a process of transition from the folded to the unfolded state. It happens in cooking, in burns, in proteinopathies, and in other contexts.[6]

The duration of the folding process varies dramatically depending on the protein of interest. When studied outside the cell, the slowest folding proteins require many minutes or hours to fold primarily due to proline isomerization, and must pass through a number of intermediate states, like checkpoints, before the process is complete.[7] On the other hand, very small single-domain proteins with lengths of up to a hundred amino acids typically fold in a single step.[8] Time scales of milliseconds are the norm and the very fastest known protein folding reactions are complete within a few microseconds.[9]

The primary structure of a protein, its linear amino-acid sequence, determines its native conformation.[10] The specific amino acid residues and their position in the polypeptide chain are the determining factors for which portions of the protein fold closely together and form its three-dimensional conformation. The amino acid composition is not as important as the sequence.[11] The essential fact of folding, however, remains that the amino acid sequence of each protein contains the information that specifies both the native structure and the pathway to attain that state. This is not to say that nearly identical amino acid sequences always fold similarly.[12] Conformations differ based on environmental factors as well; similar proteins fold differently based on where they are found.

Formation of a secondary structure is the first step in the folding process that a protein takes to assume its native structure. Characteristic of secondary structure are the structures known as alpha helices and beta sheets that fold rapidly because they are stabilized by intramolecular hydrogen bonds, as was first characterized by Linus Pauling. Formation of intramolecular hydrogen bonds provides another important contribution to protein stability.[13] α-helices are formed by hydrogen bonding of the backbone to form a spiral shape (refer to figure on the right).[11] The β pleated sheet is a structure that forms with the backbone bending over itself to form the hydrogen bonds (as displayed in the figure to the left). The hydrogen bonds are between the amide hydrogen and carbonyl oxygen of the peptide bond. There exist anti-parallel β pleated sheets and parallel β pleated sheets; the hydrogen bonds are stronger in the anti-parallel sheet, because it hydrogen bonds at the ideal 180 degree angle, compared to the slanted hydrogen bonds formed by parallel sheets.[11]

The alpha helices and beta pleated sheets can be amphipathic in nature, or contain a hydrophilic portion and a hydrophobic portion. This property of secondary structures aids in the tertiary structure of a protein in which the folding occurs so that the hydrophilic sides are facing the aqueous environment surrounding the protein and the hydrophobic sides are facing the hydrophobic core of the protein.[14] Secondary structure hierarchically gives way to tertiary structure formation. Once the protein's tertiary structure is formed and stabilized by the hydrophobic interactions, there may also be covalent bonding in the form of disulfide bridges formed between two cysteine residues. Tertiary structure of a protein involves a single polypeptide chain; however, additional interactions of folded polypeptide chains give rise to quaternary structure formation.[15]

Tertiary structure may give way to the formation of quaternary structure in some proteins, which usually involves the "assembly" or "coassembly" of subunits that have already folded; in other words, multiple polypeptide chains could interact to form a fully functional quaternary protein.[11]

Folding is a spontaneous process that is mainly guided by hydrophobic interactions, formation of intramolecular hydrogen bonds, and van der Waals forces, and it is opposed by conformational entropy.[16] The process of folding often begins co-translationally, so that the N-terminus of the protein begins to fold while the C-terminal portion of the protein is still being synthesized by the ribosome; however, a protein molecule may fold spontaneously during or after biosynthesis.[17] While these macromolecules may be regarded as "folding themselves", the process also depends on the solvent (water or lipid bilayer),[18] the concentration of salts, the pH, the temperature, and the possible presence of cofactors and of molecular chaperones. Proteins are limited in their folding abilities by the restricted bending angles or conformations that are possible. These allowable angles of protein folding are described with a two-dimensional plot known as the Ramachandran plot, depicted with psi and phi angles of allowable rotation.[19]
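The phi and psi angles on a Ramachandran plot are backbone dihedral (torsion) angles, each defined by four consecutive backbone atoms. The following is a minimal sketch of how such an angle could be computed from Cartesian coordinates. The function name dihedral_deg and the sample coordinates are illustrative assumptions, not taken from the article, and the sign follows the common atan2-based formula in which a cis arrangement gives 0 degrees and a trans arrangement gives 180 degrees.

```python
import math

def _sub(a, b):
    """Vector a - b for 3-tuples."""
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle in degrees about the p1-p2 bond."""
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)   # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    y = b1_len * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

# Illustrative coordinates (hypothetical, not real backbone atoms).
cis = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
trans = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
quarter = dihedral_deg((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
```

For phi, the four points would be the C of the previous residue followed by N, C-alpha, and C of the current residue; for psi, they would be N, C-alpha, and C of the current residue followed by N of the next residue.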

Protein folding must be thermodynamically favorable within a cell in order for it to be a spontaneous reaction. Since it is known that protein folding is a spontaneous reaction, then it must assume a negative Gibbs free energy value. Gibbs free energy in protein folding is directly related to enthalpy and entropy.[11] For a negative delta G to arise and for protein folding to become thermodynamically favorable, then either enthalpy, entropy, or both terms must be favorable.
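The relationship referred to here is usually written as delta G = delta H - T * delta S. A minimal numerical sketch follows; the enthalpy and entropy values are hypothetical and chosen only for illustration, and treating them as temperature-independent is a simplifying assumption.

```python
def gibbs_free_energy(delta_h, temperature_k, delta_s):
    """Return delta G = delta H - T * delta S.

    delta_h in kJ/mol, temperature_k in kelvin, delta_s in kJ/(mol*K).
    """
    return delta_h - temperature_k * delta_s

# Hypothetical folding values (not from the article): folding releases heat
# (negative enthalpy) but costs chain entropy (negative entropy).
DELTA_H_FOLD = -200.0   # kJ/mol
DELTA_S_FOLD = -0.6     # kJ/(mol*K)

# Free energy of folding at 25 degrees C; negative means folding is favorable.
dg_298 = gibbs_free_energy(DELTA_H_FOLD, 298.15, DELTA_S_FOLD)

# Temperature at which delta G = 0, i.e. where the favorable enthalpy term
# is exactly cancelled by the unfavorable entropy term.
t_melt = DELTA_H_FOLD / DELTA_S_FOLD
```

With these assumed numbers the enthalpy term outweighs the entropy penalty at room temperature, so delta G is negative; above t_melt the sign flips and the unfolded state is favored.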

Minimizing the number of hydrophobic side-chains exposed to water is an important driving force behind the folding process.[20] The hydrophobic effect is the phenomenon in which the hydrophobic chains of a protein collapse into the core of the protein (away from the hydrophilic environment).[11] In an aqueous environment, the water molecules tend to aggregate around the hydrophobic regions or side chains of the protein, creating water shells of ordered water molecules.[21] An ordering of water molecules around a hydrophobic region increases order in a system and therefore contributes a negative change in entropy (less entropy in the system). The water molecules are fixed in these water cages which drives the hydrophobic collapse, or the inward folding of the hydrophobic groups. The hydrophobic collapse introduces entropy back to the system via the breaking of the water cages which frees the ordered water molecules.[11] The multitude of hydrophobic groups interacting within the core of the globular folded protein contributes a significant amount to protein stability after folding, because of the vastly accumulated van der Waals forces (specifically London Dispersion forces).[11] The hydrophobic effect exists as a driving force in thermodynamics only if there is the presence of an aqueous medium with an amphiphilic molecule containing a large hydrophobic region.[22] The strength of hydrogen bonds depends on their environment; thus, H-bonds enveloped in a hydrophobic core contribute more than H-bonds exposed to the aqueous environment to the stability of the native state.[23]

In proteins with globular folds, hydrophobic amino acids tend to be interspersed along the primary sequence, rather than randomly distributed or clustered together.[24][25] However, proteins that have recently been born de novo, which tend to be intrinsically disordered[26][27], show the opposite pattern of hydrophobic amino acid clustering along the primary sequence.[28]

Molecular chaperones are a class of proteins that aid in the correct folding of other proteins in vivo. Chaperones exist in all cellular compartments and interact with the polypeptide chain in order to allow the native three-dimensional conformation of the protein to form; however, chaperones themselves are not included in the final structure of the protein they are assisting in.[29] Chaperones may assist in folding even when the nascent polypeptide is being synthesized by the ribosome.[30] Molecular chaperones operate by binding to stabilize an otherwise unstable structure of a protein in its folding pathway, but chaperones do not contain the necessary information to know the correct native structure of the protein they are aiding; rather, chaperones work by preventing incorrect folding conformations.[30] In this way, chaperones do not actually increase the rate of individual steps involved in the folding pathway toward the native structure; instead, they work by reducing possible unwanted aggregations of the polypeptide chain that might otherwise slow down the search for the proper intermediate and they provide a more efficient pathway for the polypeptide chain to assume the correct conformations.[29] Chaperones are not to be confused with folding catalysts, which actually do catalyze the otherwise slow steps in the folding pathway. 
Examples of folding catalysts are protein disulfide isomerases and peptidyl-prolyl isomerases that may be involved in formation of disulfide bonds or interconversion between cis and trans stereoisomers, respectively.[30] Chaperones are shown to be critical in the process of protein folding in vivo because they provide the protein with the aid needed to assume its proper alignments and conformations efficiently enough to become "biologically relevant".[31] This means that the polypeptide chain could theoretically fold into its native structure without the aid of chaperones, as demonstrated by protein folding experiments conducted in vitro;[31] however, this process proves to be too inefficient or too slow to exist in biological systems; therefore, chaperones are necessary for protein folding in vivo. Along with their role in aiding native structure formation, chaperones are shown to be involved in various roles such as protein transport and degradation, and they even allow denatured proteins exposed to certain external denaturant factors an opportunity to refold into their correct native structures.[32]

A fully denatured protein lacks both tertiary and secondary structure, and exists as a so-called random coil. Under certain conditions some proteins can refold; however, in many cases, denaturation is irreversible.[33] Cells sometimes protect their proteins against the denaturing influence of heat with enzymes known as heat shock proteins (a type of chaperone), which assist other proteins both in folding and in remaining folded. Some proteins never fold in cells at all except with the assistance of chaperones which either isolate individual proteins so that their folding is not interrupted by interactions with other proteins or help to unfold misfolded proteins, allowing them to refold into the correct native structure.[34] This function is crucial to prevent the risk of precipitation into insoluble amorphous aggregates. The external factors involved in protein denaturation or disruption of the native state include temperature, external fields (electric, magnetic),[35] molecular crowding,[36] and even the limitation of space, which can have a big influence on the folding of proteins.[37] High concentrations of solutes, extremes of pH, mechanical forces, and the presence of chemical denaturants can contribute to protein denaturation, as well. These individual factors are categorized together as stresses. Chaperones are shown to exist in increasing concentrations during times of cellular stress and help the proper folding of emerging proteins as well as denatured or misfolded ones.[29]

Under some conditions proteins will not fold into their biochemically functional forms. Temperatures above or below the range that cells tend to live in will cause thermally unstable proteins to unfold or denature (this is why boiling makes an egg white turn opaque). Protein thermal stability is far from constant, however; for example, hyperthermophilic bacteria have been found that grow at temperatures as high as 122 °C,[38] which of course requires that their full complement of vital proteins and protein assemblies be stable at that temperature or above.

A protein is considered to be misfolded if it cannot achieve its normal native state. This can be due to mutations in the amino acid sequence or a disruption of the normal folding process by external factors.[39] The misfolded protein typically contains β-sheets that are organized in a supramolecular arrangement known as a cross-β structure. These β-sheet-rich assemblies are very stable, very insoluble, and generally resistant to proteolysis.[40] The structural stability of these fibrillar assemblies is caused by extensive interactions between the protein monomers, formed by backbone hydrogen bonds between their β-strands.[40] The misfolding of proteins can trigger the further misfolding and accumulation of other proteins into aggregates or oligomers. The increased levels of aggregated proteins in the cell lead to the formation of amyloid-like structures which can cause degenerative disorders and cell death.[39] The amyloids are fibrillary structures that contain intermolecular hydrogen bonds which are highly insoluble and made from converted protein aggregates.[39] Therefore, the proteasome pathway may not be efficient enough to degrade the misfolded proteins prior to aggregation. Misfolded proteins can interact with one another and form structured aggregates and gain toxicity through intermolecular interactions.[39]

Aggregated proteins are associated with prion-related illnesses such as Creutzfeldt-Jakob disease and bovine spongiform encephalopathy (mad cow disease), amyloid-related illnesses such as Alzheimer's disease and familial amyloid cardiomyopathy or polyneuropathy,[41] as well as intracellular aggregation diseases such as Huntington's and Parkinson's disease.[4][42] These age-onset degenerative diseases are associated with the aggregation of misfolded proteins into insoluble, extracellular aggregates and/or intracellular inclusions including cross-β amyloid fibrils. It is not completely clear whether the aggregates are the cause or merely a reflection of the loss of protein homeostasis, the balance between synthesis, folding, aggregation and protein turnover. Recently the European Medicines Agency approved the use of Tafamidis or Vyndaqel (a kinetic stabilizer of tetrameric transthyretin) for the treatment of transthyretin amyloid diseases. This suggests that the process of amyloid fibril formation (and not the fibrils themselves) causes the degeneration of post-mitotic tissue in human amyloid diseases.[43] Misfolding and excessive degradation instead of folding and function lead to a number of proteopathy diseases such as antitrypsin-associated emphysema, cystic fibrosis and the lysosomal storage diseases, where loss of function is the origin of the disorder. While protein replacement therapy has historically been used to correct the latter disorders, an emerging approach is to use pharmaceutical chaperones to fold mutated proteins to render them functional.

While inferences about protein folding can be made through mutation studies, typically, experimental techniques for studying protein folding rely on the gradual unfolding or folding of proteins and observing conformational changes using standard non-crystallographic techniques.

X-ray crystallography is one of the more efficient and important methods for attempting to decipher the three dimensional configuration of a folded protein.[44] To be able to conduct X-ray crystallography, the protein under investigation must be located inside a crystal lattice. To place a protein inside a crystal lattice, one must have a suitable solvent for crystallization, obtain a pure protein at supersaturated levels in solution, and precipitate the crystals in solution.[45] Once a protein is crystallized, x-ray beams can be concentrated through the crystal lattice which would diffract the beams or shoot them outwards in various directions. These exiting beams are correlated to the specific three-dimensional configuration of the protein enclosed within. The x-rays specifically interact with the electron clouds surrounding the individual atoms within the protein crystal lattice and produce a discernible diffraction pattern.[14] Only by relating the electron density clouds with the amplitude of the x-rays can this pattern be read and lead to assumptions of the phases or phase angles involved that complicate this method.[46] Without the relation established through a mathematical basis known as Fourier transform, the "phase problem" would render predicting the diffraction patterns very difficult.[14] Emerging methods like multiple isomorphous replacement use the presence of a heavy metal ion to diffract the x-rays into a more predictable manner, reducing the number of variables involved and resolving the phase problem.[44]

Fluorescence spectroscopy is a highly sensitive method for studying the folding state of proteins. Three amino acids, phenylalanine (Phe), tyrosine (Tyr) and tryptophan (Trp), have intrinsic fluorescence properties, but only Tyr and Trp are used experimentally because their quantum yields are high enough to give good fluorescence signals. Both Trp and Tyr are excited by a wavelength of 280 nm, whereas only Trp is excited by a wavelength of 295 nm. Because of their aromatic character, Trp and Tyr residues are often found fully or partially buried in the hydrophobic core of proteins, at the interface between two protein domains, or at the interface between subunits of oligomeric proteins. In this apolar environment, they have high quantum yields and therefore high fluorescence intensities. Upon disruption of the protein's tertiary or quaternary structure, these side chains become more exposed to the hydrophilic environment of the solvent, and their quantum yields decrease, leading to low fluorescence intensities. For Trp residues, the wavelength of their maximal fluorescence emission also depends on their environment.

Fluorescence spectroscopy can be used to characterize the equilibrium unfolding of proteins by measuring the variation in the intensity of fluorescence emission or in the wavelength of maximal emission as functions of a denaturant value.[47][48] The denaturant can be a chemical molecule (urea, guanidinium hydrochloride), temperature, pH, pressure, etc. The equilibrium between the different but discrete protein states, i.e. native state, intermediate states, unfolded state, depends on the denaturant value; therefore, the global fluorescence signal of their equilibrium mixture also depends on this value. One thus obtains a profile relating the global protein signal to the denaturant value. The profile of equilibrium unfolding may enable one to detect and identify intermediates of unfolding.[49][50] General equations have been developed by Hugues Bedouelle to obtain the thermodynamic parameters that characterize the unfolding equilibria for homomeric or heteromeric proteins, up to trimers and potentially tetramers, from such profiles.[47] Fluorescence spectroscopy can be combined with fast-mixing devices such as stopped flow, to measure protein folding kinetics,[51] generate a chevron plot and derive a Phi value analysis.
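A chevron plot shows the logarithm of the observed relaxation rate against denaturant concentration. As a rough sketch, assuming a simple two-state protein whose folding and unfolding rate constants depend log-linearly on denaturant (a common modeling assumption, not something stated in the article), the V shape can be reproduced as follows. All parameter values and names are hypothetical.

```python
import math

def ln_k_obs(denaturant_m, kf_water, ku_water, m_f, m_u):
    """ln of the observed rate for a two-state folder at a given denaturant
    concentration (mol/L): k_obs = k_fold + k_unfold."""
    k_fold = kf_water * math.exp(-m_f * denaturant_m)    # folding slows down
    k_unfold = ku_water * math.exp(m_u * denaturant_m)   # unfolding speeds up
    return math.log(k_fold + k_unfold)

# Hypothetical parameters: rates in 1/s in water, slopes in 1/M.
KF_WATER, KU_WATER, M_F, M_U = 100.0, 0.01, 2.0, 1.0

concs = [0.5 * i for i in range(17)]   # 0.0 to 8.0 M in 0.5 M steps
curve = [ln_k_obs(c, KF_WATER, KU_WATER, M_F, M_U) for c in concs]

# The bottom of the "V" sits where folding and unfolding contribute comparably.
min_index = curve.index(min(curve))
chevron_minimum_m = concs[min_index]
```

The left arm of the curve is dominated by the folding rate and the right arm by the unfolding rate; fitting both arms of measured data is what yields the rate constants in water.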

Circular dichroism is one of the most general and basic tools to study protein folding. Circular dichroism spectroscopy measures the absorption of circularly polarized light. In proteins, structures such as alpha helices and beta sheets are chiral, and thus absorb such light. The absorption of this light acts as a marker of the degree of foldedness of the protein ensemble. This technique has been used to measure equilibrium unfolding of the protein by measuring the change in this absorption as a function of denaturant concentration or temperature. A denaturant melt measures the free energy of unfolding as well as the protein's m value, or denaturant dependence. A temperature melt measures the denaturation temperature (Tm) of the protein.[47] As for fluorescence spectroscopy, circular-dichroism spectroscopy can be combined with fast-mixing devices such as stopped flow to measure protein folding kinetics and to generate chevron plots.
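The free energy of unfolding and the m value from a denaturant melt are often related through a linear extrapolation model, delta G(D) = delta G(water) - m * [D]. The sketch below assumes that model and a two-state equilibrium; neither is spelled out in the article, and the numbers are hypothetical.

```python
import math

R_KJ = 8.314462618e-3   # gas constant in kJ/(mol*K)

def fraction_unfolded(denaturant_m, dg_water, m_value, temperature_k=298.15):
    """Fraction of unfolded protein for a two-state equilibrium.

    dg_water: free energy of unfolding in water, kJ/mol
    m_value:  denaturant dependence, kJ/(mol*M)
    """
    dg_unfold = dg_water - m_value * denaturant_m
    k_unfold = math.exp(-dg_unfold / (R_KJ * temperature_k))
    return k_unfold / (1.0 + k_unfold)

# Hypothetical parameters for illustration only.
DG_WATER = 20.0   # kJ/mol
M_VALUE = 5.0     # kJ/(mol*M)

# Midpoint of the transition: delta G is zero, so half the protein is unfolded.
midpoint_m = DG_WATER / M_VALUE

melt_curve = [(c, fraction_unfolded(c, DG_WATER, M_VALUE)) for c in range(9)]
```

In practice the measured signal (ellipticity or fluorescence) is fitted to a curve of this shape to recover the free energy of unfolding and the m value.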

The more recent developments of vibrational circular dichroism (VCD) techniques for proteins, currently involving Fourier transform (FT) instruments, provide powerful means for determining protein conformations in solution even for very large protein molecules. Such VCD studies of proteins are often combined with X-ray diffraction of protein crystals, FT-IR data for protein solutions in heavy water (D2O), or ab initio quantum computations to provide unambiguous structural assignments that are unobtainable from CD.[citation needed]

Protein folding is routinely studied using NMR spectroscopy, for example by monitoring hydrogen-deuterium exchange of backbone amide protons of proteins in their native state, which provides both the residue-specific stability and overall stability of proteins.[52]

Dual polarisation interferometry is a surface-based technique for measuring the optical properties of molecular layers. When used to characterize protein folding, it measures the conformation by determining the overall size of a monolayer of the protein and its density in real time at sub-Angstrom resolution,[53] although real-time measurement of the kinetics of protein folding is limited to processes that occur slower than ~10 Hz. Similar to circular dichroism, the stimulus for folding can be a denaturant or temperature.

The study of protein folding has been greatly advanced in recent years by the development of fast, time-resolved techniques. Experimenters rapidly trigger the folding of a sample of unfolded protein and observe the resulting dynamics. Fast techniques in use include neutron scattering,[54] ultrafast mixing of solutions, photochemical methods, and laser temperature jump spectroscopy. Among the many scientists who have contributed to the development of these techniques are Jeremy Cook, Heinrich Roder, Harry Gray, Martin Gruebele, Brian Dyer, William Eaton, Sheena Radford, Chris Dobson, Alan Fersht, Bengt Nölting and Lars Konermann.

Proteolysis is routinely used to probe the fraction unfolded under a wide range of solution conditions (e.g. fast parallel proteolysis, FASTpp).[55][56]

Single molecule techniques such as optical tweezers and AFM have been used to understand protein folding mechanisms of isolated proteins as well as proteins with chaperones.[57] Optical tweezers have been used to stretch single protein molecules from their C- and N-termini and unfold them to allow study of the subsequent refolding.[58] The technique allows one to measure folding rates at the single-molecule level; for example, optical tweezers have recently been applied to study folding and unfolding of proteins involved in blood coagulation. von Willebrand factor (vWF) is a protein with an essential role in the blood clot formation process. It was discovered, using single molecule optical tweezers measurements, that calcium-bound vWF acts as a shear force sensor in the blood. Shear force leads to unfolding of the A2 domain of vWF, whose refolding rate is dramatically enhanced in the presence of calcium.[59] Recently, it was also shown that the simple src SH3 domain accesses multiple unfolding pathways under force.[60]

Biotin painting enables condition-specific cellular snapshots of (un)folded proteins. Biotin 'painting' shows a bias towards predicted intrinsically disordered proteins.[61]

Computational studies of protein folding include three main aspects related to the prediction of protein stability, kinetics, and structure. A recent review summarizes the available computational methods for protein folding.[62]

In 1969, Cyrus Levinthal noted that, because of the very large number of degrees of freedom in an unfolded polypeptide chain, the molecule has an astronomical number of possible conformations. An estimate of 3^300 or 10^143 was made in one of his papers.[63] Levinthal's paradox is a thought experiment based on the observation that if a protein were folded by sequential sampling of all possible conformations, it would take an astronomical amount of time to do so, even if the conformations were sampled at a rapid rate (on the nanosecond or picosecond scale).[64] Based upon the observation that proteins fold much faster than this, Levinthal then proposed that a random conformational search does not occur, and the protein must, therefore, fold through a series of meta-stable intermediate states.
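The arithmetic behind the paradox is easy to check. The sketch below reads the estimate as 3 raised to the power 300 and assumes one conformation is sampled per picosecond; the sampling rate and variable names are illustrative assumptions.

```python
import math

# Number of conformations in the estimate: 3 to the power 300.
CONFORMATIONS = 3 ** 300
log10_conformations = 300 * math.log10(3)   # about 143.1, i.e. roughly 10^143

# Assumed sampling rate: one conformation per picosecond.
SAMPLES_PER_SECOND = 1e12
SECONDS_PER_YEAR = 365.25 * 24 * 3600

# Work in log10 to avoid overflowing floating point.
log10_years = (log10_conformations
               - math.log10(SAMPLES_PER_SECOND)
               - math.log10(SECONDS_PER_YEAR))
```

Even at this sampling rate, an exhaustive search would take on the order of 10^123 years, which is why a purely random search cannot be how proteins fold.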

The configuration space of a protein during folding can be visualized as an energy landscape. According to Joseph Bryngelson and Peter Wolynes, proteins follow the principle of minimal frustration, meaning that naturally evolved proteins have optimized their folding energy landscapes,[65] and that nature has chosen amino acid sequences so that the folded state of the protein is sufficiently stable. In addition, the acquisition of the folded state had to become a sufficiently fast process. Even though nature has reduced the level of frustration in proteins, some degree of it remains, as can be observed in the presence of local minima in the energy landscape of proteins.

A consequence of these evolutionarily selected sequences is that proteins are generally thought to have globally "funneled energy landscapes" (coined by José Onuchic)[66] that are largely directed toward the native state. This "folding funnel" landscape allows the protein to fold to the native state through any of a large number of pathways and intermediates, rather than being restricted to a single mechanism. The theory is supported by both computational simulations of model proteins and experimental studies,[65] and it has been used to improve methods for protein structure prediction and design.[65] The description of protein folding by the leveling free-energy landscape is also consistent with the 2nd law of thermodynamics.[67] Physically, thinking of landscapes in terms of visualizable potential or total energy surfaces simply with maxima, saddle points, minima, and funnels, rather like geographic landscapes, is perhaps a little misleading. The relevant description is really a high-dimensional phase space in which manifolds might take a variety of more complicated topological forms.[68]

The unfolded polypeptide chain begins at the top of the funnel where it may assume the largest number of unfolded variations and is in its highest energy state. Energy landscapes such as these indicate that there are a large number of initial possibilities, but only a single native state is possible; however, it does not reveal the numerous folding pathways that are possible. A different molecule of the same exact protein may be able to follow marginally different folding pathways, seeking different lower energy intermediates, as long as the same native structure is reached.[69] Different pathways may have different frequencies of utilization depending on the thermodynamic favorability of each pathway. This means that if one pathway is found to be more thermodynamically favorable than another, it is likely to be used more frequently in the pursuit of the native structure.[69] As the protein begins to fold and assume its various conformations, it always seeks a more thermodynamically favorable structure than before and thus continues through the energy funnel. Formation of secondary structures is a strong indication of increased stability within the protein, and only one combination of secondary structures assumed by the polypeptide backbone will have the lowest energy and therefore be present in the native state of the protein.[69] Among the first structures to form once the polypeptide begins to fold are alpha helices and beta turns, where alpha helices can form in as little as 100 nanoseconds and beta turns in 1 microsecond.[29]

There exists a saddle point in the energy funnel landscape where the transition state for a particular protein is found.[29] The transition state in the energy funnel diagram is the conformation that every molecule of that protein must assume if it is to reach the native structure. No protein may assume the native structure without first passing through the transition state.[29] The transition state can be regarded as a variant or premature form of the native state rather than just another intermediary step.[70] Formation of the transition state has been shown to be rate-determining, and even though it exists in a higher energy state than the native fold, it greatly resembles the native structure. Within the transition state, there exists a nucleus around which the protein is able to fold, formed by a process referred to as "nucleation condensation", in which the structure begins to collapse onto the nucleus.[70]
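Because crossing the transition state is rate-determining, the folding rate depends exponentially on the height of the free-energy barrier. The sketch below uses a generic Arrhenius-type expression, k = k0 · exp(−ΔG‡ / RT). The prefactor of 10^6 per second and the barrier heights are illustrative assumptions, not values taken from the text.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def folding_rate(barrier_kcal, temperature_k=298.0, prefactor_per_s=1.0e6):
    """Arrhenius-type rate estimate; the prefactor is an assumed placeholder."""
    return prefactor_per_s * math.exp(-barrier_kcal / (R_KCAL * temperature_k))


# Hypothetical barrier heights in kcal/mol.
for barrier in (2.0, 4.0, 6.0):
    print(f"barrier {barrier:.1f} kcal/mol -> rate {folding_rate(barrier):.3g} per second")
```

Raising the barrier by RT·ln(10), about 1.4 kcal/mol at 298 K, slows the estimated rate tenfold.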

De novo or ab initio techniques for computational protein structure prediction are related to, but strictly distinct from, experimental studies of protein folding. Molecular dynamics (MD) is an important tool for studying protein folding and dynamics in silico.[71] The first equilibrium folding simulations were done using an implicit solvent model and umbrella sampling.[72] Because of the computational cost, ab initio MD folding simulations with explicit water are limited to peptides and very small proteins.[73][74] MD simulations of larger proteins remain restricted to the dynamics of the experimental structure or its high-temperature unfolding. Long-time folding processes (beyond about 1 millisecond), such as the folding of small proteins (about 50 residues) or larger, can be accessed using coarse-grained models.[75][76][77]
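At its core, an MD simulation repeatedly integrates Newton's equations of motion in small time steps. The following minimal sketch shows one common integration scheme, velocity Verlet, applied to a single particle on a one-dimensional harmonic spring. The spring, mass, and step size are arbitrary illustrative choices in reduced units; production MD codes apply the same idea to many atoms with far more elaborate force fields.

```python
def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Integrate one particle with the velocity Verlet scheme; returns [(x, v), ...]."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


# Toy system (assumed for illustration): unit mass on a unit-stiffness spring.
K_SPRING = 1.0
MASS = 1.0


def harmonic_force(x):
    return -K_SPRING * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * K_SPRING * x * x


trajectory = velocity_verlet(1.0, 0.0, harmonic_force, MASS, dt=0.01, n_steps=10_000)
e_start = total_energy(*trajectory[0])
e_end = total_energy(*trajectory[-1])
relative_drift = abs(e_end - e_start) / e_start
print(f"relative energy drift after 10,000 steps: {relative_drift:.2e}")
```

The total energy stays nearly constant over the run, which is one reason schemes of this kind are popular for long simulations.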

The 100-petaFLOP distributed computing project Folding@home, created by Vijay Pande's group at Stanford University, simulates protein folding using the idle processing time of the CPUs and GPUs of volunteers' personal computers. The project aims to understand protein misfolding and accelerate drug design for disease research.

Long continuous-trajectory simulations have been performed on Anton, a massively parallel supercomputer designed and built around custom ASICs and interconnects by D. E. Shaw Research. The longest published result of a simulation performed using Anton is a 2.936 millisecond simulation of NTL9 at 355 K.[78]
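To give a sense of scale, the number of integration steps in such a trajectory can be estimated by dividing the simulated time by the integration time step. The time step of 2.5 femtoseconds used below is an assumption for illustration only; the text does not state the value used in the NTL9 simulation.

```python
def md_step_count(simulated_seconds, timestep_femtoseconds):
    """Number of integration steps needed to cover the simulated time."""
    return simulated_seconds / (timestep_femtoseconds * 1e-15)


# 2.936 ms of simulated time, with an assumed (hypothetical) 2.5 fs time step.
steps = md_step_count(2.936e-3, 2.5)
print(f"approximately {steps:.2e} integration steps")
```

Under this assumption the trajectory requires on the order of 10^12 sequential steps, which helps explain why special-purpose hardware is used for simulations of this length.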
