Protein folding is the physical process by which a polypeptide folds into its characteristic three-dimensional structure [1]. Each protein begins as a polypeptide, translated from a sequence of mRNA as a linear chain of amino acids. This polypeptide lacks any developed three-dimensional structure (the left-hand side of the neighboring figure). However, each amino acid in the chain can be thought of as having certain 'gross' chemical features: it may be hydrophobic, hydrophilic, or electrically charged, for example. The amino acids interact with each other and with their surroundings in the cell to produce a well-defined three-dimensional shape, the folded protein (the right-hand side of the figure), known as the native state. The resulting three-dimensional structure is determined by the sequence of the amino acids [2]. The mechanism of protein folding is not completely understood.
Experimentally determining the three-dimensional structure of a protein is often very difficult and expensive, whereas the sequence of that protein is often known. Scientists have therefore tried to use different biophysical techniques to "manually" fold a protein, that is, to predict the structure of the complete protein from its sequence.
For many proteins the correct three-dimensional structure is essential for the protein to function correctly [3]. A "failure" of folding therefore usually produces inactive proteins with different properties; details can be found under Prions. Several diseases are believed to result from the accumulation of misfolded proteins, e.g. Alzheimer's disease, cystic fibrosis and BSE [4].
Most folded proteins have a hydrophobic core, in which side chain packing stabilizes the folded state, and charged or polar side chains on the solvent-exposed surface, where they interact with surrounding water molecules. It is generally accepted that minimizing the number of hydrophobic side chains exposed to water is the principal driving force behind the folding process [5], although a more recent theory reassesses the contributions made by hydrogen bonding [6].
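The idea that hydrophobic residues tend to be buried while polar and charged residues stay exposed can be illustrated with a sliding-window hydropathy profile. The following is a minimal sketch, assuming the Kyte-Doolittle hydropathy scale (one of several published scales, and not one named in the text above); the function name and the default window size are hypothetical choices made for illustration.

```python
# Kyte-Doolittle hydropathy values (assumed scale); higher means more hydrophobic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=5):
    """Return the mean hydropathy of every full window along the sequence."""
    seq = sequence.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    # Look up one score per residue; unknown letters raise KeyError.
    scores = [KYTE_DOOLITTLE[aa] for aa in seq]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


if __name__ == "__main__":
    # Toy sequence: a hydrophobic stretch followed by a charged stretch.
    for position, value in enumerate(hydropathy_profile("LLLLKKKK", window=4)):
        print(position, round(value, 2))
```

Stretches with a high average score are candidates for the buried core, and stretches with a low score are more likely to sit on the solvent-exposed surface. A profile like this is only a rough heuristic and does not predict a structure.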
The process of folding in vivo often begins co-translationally, so that the N-terminus of the protein begins to fold while the C-terminal portion of the protein is still being synthesized by the ribosome. Cells express specialized proteins called chaperones whose function is to aid in the folding of other proteins [7]. A major example is the bacterial GroEL system, which assists in the folding of globular proteins. Many chaperones are also known as heat shock proteins. Although most globular proteins are able to assume their native state unassisted, chaperone-assisted folding is necessary for some proteins in the crowded intracellular environment to prevent aggregation; chaperones are also used to prevent misfolding and aggregation that may occur as a consequence of exposure to heat or other changes in the cellular environment. The particular amino-acid sequence (or "primary structure") of a protein predisposes it to fold into its native conformation or conformations. Proteins do so spontaneously during or after their synthesis inside cells. While these macromolecules may be seen as "folding themselves," their folding depends on the characteristics of their surrounding solution, including the identity of the primary solvent (either water or lipid inside cells), the concentration of salts, the temperature, and molecular chaperones.
For the most part, scientists have been able to study many identical molecules folding together en masse. At the coarsest level, it appears that in transitioning to the native state, a given amino acid sequence takes roughly the same route and proceeds through roughly the same intermediates and transition states. Folding often involves first the establishment of regular secondary and supersecondary structures, particularly alpha helices and beta sheets, and afterwards tertiary structure. Formation of quaternary structure usually involves the "assembly" or "coassembly" of subunits that have already folded. The regular alpha helix and beta sheet structures fold rapidly because they are stabilized by intramolecular hydrogen bonds, as was first realized by Linus Pauling. Protein folding may involve covalent bonding in the form of disulfide bridges formed between two cysteine residues, or the formation of metal clusters. Shortly before settling into their more stable native conformation, molecules may pass through an intermediate "molten globule" state.
The essential fact of folding, however, remains that the amino acid sequence of each protein contains the information that specifies both the native structure and the pathway to attain that state: folding is a spontaneous process. The passage to the folded state is mainly guided by hydrophobic interactions, the formation of intramolecular hydrogen bonds, and van der Waals forces, and it is opposed by the conformational entropy of the polypeptide chain.
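The balance between the favorable interactions and the opposing conformational entropy can be written as a free energy of folding, ΔG = ΔH − TΔS, where folding is spontaneous when ΔG is negative. The following is a minimal sketch of a two-state (folded or unfolded) model built on that relation. The two-state assumption, the temperature-independent ΔH and ΔS, the function names, and the numerical values are all illustrative assumptions rather than data for any real protein.

```python
import math

R = 8.314e-3  # gas constant in kJ/(mol*K)


def folding_free_energy(delta_h, delta_s, temperature):
    """Free energy of folding in kJ/mol: delta_h - T * delta_s."""
    return delta_h - temperature * delta_s


def fraction_folded(delta_h, delta_s, temperature):
    """Equilibrium fraction of folded molecules in a two-state model."""
    delta_g = folding_free_energy(delta_h, delta_s, temperature)
    # Boltzmann weighting of the folded state against the unfolded state.
    return 1.0 / (1.0 + math.exp(delta_g / (R * temperature)))


# Hypothetical values: favorable enthalpy, unfavorable (negative) entropy.
DELTA_H = -200.0  # kJ/mol
DELTA_S = -0.6    # kJ/(mol*K), dominated by lost conformational entropy

if __name__ == "__main__":
    melting_temperature = DELTA_H / DELTA_S  # temperature at which delta G is zero
    print("midpoint temperature (K):", round(melting_temperature, 1))
    for t in (298.0, melting_temperature, 360.0):
        print(t, round(fraction_folded(DELTA_H, DELTA_S, t), 4))
```

With these assumed numbers the entropy term grows with temperature until it cancels the enthalpy term, at which point half of the molecules are unfolded. Real proteins also show a heat capacity change on folding, which this sketch ignores.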
In certain solutions and under some conditions proteins will not fold into their biologically "functional" forms. Temperatures above the range that cells tend to live in will cause proteins to unfold or "denature" (this is why boiling makes the white of an egg opaque). High concentrations of solutes and extremes of pH can do the same. A fully denatured protein lacks both tertiary and secondary structure, and exists as a so-called random coil. Cells sometimes protect their proteins against the denaturing influence of heat with proteins known as chaperones or heat shock proteins, which assist other proteins both in folding and in remaining folded. Some proteins never fold in cells at all except with the assistance of chaperone molecules, which either isolate individual proteins so that their folding is not interrupted by interactions with other proteins, or help to unfold misfolded proteins, giving them a second chance to refold properly.
Incorrectly folded (misfolded) proteins are responsible for prion-related illnesses such as Creutzfeldt-Jakob disease and bovine spongiform encephalopathy (BSE, or "mad cow disease"), and for amyloid-related illnesses such as Alzheimer's disease. These diseases are associated with the aggregation of misfolded proteins into insoluble plaques; it is not known whether the plaques are the cause or merely a symptom of illness.
The entire duration of the folding process varies dramatically depending on the protein of interest. The slowest folding proteins require many minutes or hours to fold, primarily due to steric hindrances. However, small proteins, with lengths of a hundred or so amino acids, typically fold on time scales of milliseconds. The very fastest known protein folding reactions are complete within a few microseconds. The Levinthal paradox, proposed by Cyrus Levinthal in 1969, states that, if a protein were to fold by sequentially sampling all possible conformations, it would take an astronomical amount of time to do so, even if the conformations were sampled at a rapid rate (on the nanosecond or picosecond scale). Based upon the observation that proteins fold much faster than this, Levinthal then proposed that a random conformational search does not occur in folding, and the protein must, therefore, fold by a directed process.
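Levinthal's argument can be made concrete with a back-of-the-envelope estimate. The sketch below assumes 3 accessible conformations per residue and a sampling rate of 10^13 conformations per second; both numbers are illustrative assumptions rather than values given in the text, and the function name is hypothetical.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600
AGE_OF_UNIVERSE_YEARS = 1.38e10  # approximate, for comparison only


def levinthal_search_time_years(n_residues, states_per_residue=3,
                                samples_per_second=1e13):
    """Time in years to visit every conformation once, one after another."""
    # Each residue is assumed to choose its state independently.
    conformations = states_per_residue ** n_residues
    seconds = conformations / samples_per_second
    return seconds / SECONDS_PER_YEAR


if __name__ == "__main__":
    years = levinthal_search_time_years(100)
    print(f"exhaustive search for 100 residues: {years:.2e} years")
    print(f"ratio to the age of the universe: {years / AGE_OF_UNIVERSE_YEARS:.2e}")
```

Under these assumptions a 100-residue chain would need on the order of 10^27 years, compared with the milliseconds observed for small proteins. The exact figure depends heavily on the assumed number of states per residue, but the conclusion that folding cannot be an exhaustive random search does not.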
The "reverse" of the folding process is called protein denaturation, whereby the native structure of a protein is disrupted and a random coil ensemble of unfolded structures is formed instead. Denaturation can be carried out chemically, by the addition of denaturants, or thermally, by heating (and sometimes cooling). Many denatured proteins precipitate into insoluble amorphous aggregates. Some proteins denatured under some conditions can reversibly refold; however, in many cases denaturation is irreversible [8]. Folding and unfolding rates also depend on environmental conditions such as temperature, solvent viscosity and pH. The folding process can also be slowed down (and the unfolding sped up) by applying mechanical forces, as revealed by single-molecule experiments.
The study of protein folding has been greatly advanced in recent years by the development of fast, time-resolved techniques. These are experimental methods for rapidly triggering the folding of a sample of unfolded protein, and then observing the resulting dynamics. Fast techniques in widespread use include ultrafast mixing of solutions, photochemical methods, and laser temperature jump spectroscopy. Among the many scientists who have contributed to the development of these techniques are Heinrich Roder, Harry Gray, Martin Gruebele, Brian Dyer, William Eaton, Sir Alan R. Fersht and Bengt Nölting.
The study of protein folding was largely an experimental endeavor until the formulation of energy landscape theory by Joseph Bryngelson and Peter Wolynes in the late 1980s and early 1990s. This approach introduced the principle of minimal frustration, which asserts that evolution has selected the amino acid sequences of natural proteins so that interactions between side chains largely favor the molecule's acquisition of the folded state. Interactions that do not favor folding are selected against, although some residual frustration is expected to exist. A consequence of these evolutionarily selected sequences is that proteins are generally thought to have globally "funneled energy landscapes" (a term coined by José Onuchic) that are largely directed towards the native state. This "folding funnel" landscape allows the protein to fold to the native state through any of a large number of pathways and intermediates, rather than being restricted to a single mechanism. The theory is supported by computational simulations of model proteins and has been used to improve methods for protein structure prediction and design.
De novo or ab initio techniques for computational protein structure prediction employ simulations of protein folding to determine the protein's final folded shape.
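To give a feel for what simulating folding means, the sketch below uses the two-dimensional HP lattice model, a deliberately simplified toy model that is not mentioned in the text above and is far removed from real de novo methods. Each residue is either hydrophobic (H) or polar (P), the chain is placed on a square grid, and the energy drops by 1 for every pair of H residues that are lattice neighbors without being chain neighbors. The function names are hypothetical, and the exhaustive enumeration is only feasible for very short chains.

```python
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def hp_energy(sequence, coords):
    """Energy of a conformation: -1 per non-bonded H-H lattice contact."""
    index_at = {position: i for i, position in enumerate(coords)}
    energy = 0
    for i, (x, y) in enumerate(coords):
        if sequence[i] != "H":
            continue
        for dx, dy in MOVES:
            j = index_at.get((x + dx, y + dy))
            # j > i + 1 counts each contact once and skips chain neighbors.
            if j is not None and j > i + 1 and sequence[j] == "H":
                energy -= 1
    return energy


def fold_exhaustively(sequence):
    """Try every self-avoiding walk; return (lowest energy, one such walk)."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    best_energy, best_coords = None, None

    def extend(coords, occupied):
        nonlocal best_energy, best_coords
        if len(coords) == len(sequence):
            energy = hp_energy(sequence, coords)
            if best_energy is None or energy < best_energy:
                best_energy, best_coords = energy, list(coords)
            return
        x, y = coords[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in occupied:  # keep the walk self-avoiding
                coords.append(nxt)
                occupied.add(nxt)
                extend(coords, occupied)
                coords.pop()
                occupied.remove(nxt)

    extend([(0, 0)], {(0, 0)})
    return best_energy, best_coords


if __name__ == "__main__":
    energy, coords = fold_exhaustively("HPPHPPHH")
    print("lowest energy:", energy)
    print("one lowest-energy conformation:", coords)
```

Even in this toy setting the number of conformations grows exponentially with chain length, which is why practical methods rely on guided sampling and more realistic energy functions instead of enumeration.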
The determination of the folded structure of a protein is a lengthy and complicated process, involving methods like X-ray crystallography and NMR. In bioinformatics, one of the major areas of interest is the prediction of native structure from amino-acid sequences alone.