
Friday, August 22, 2025

Personalized medicine

From Wikipedia, the free encyclopedia

Personalized medicine, also referred to as precision medicine, is a medical model that separates people into different groups—with medical decisions, practices, interventions and/or products being tailored to the individual patient based on their predicted response or risk of disease. The terms personalized medicine, precision medicine, stratified medicine and P4 medicine are used interchangeably to describe this concept, though some authors and organizations differentiate between these expressions based on particular nuances. P4 is short for "predictive, preventive, personalized and participatory".

While the tailoring of treatment to patients dates back at least to the time of Hippocrates, the usage of the term has risen in recent years thanks to the development of new diagnostic and informatics approaches that provide an understanding of the molecular basis of disease, particularly genomics. This provides a clear biomarker on which to stratify related patients.

Among the 14 Grand Challenges for Engineering, an initiative sponsored by National Academy of Engineering (NAE), personalized medicine has been identified as a key and prospective approach to "achieve optimal individual health decisions", therefore overcoming the challenge to "engineer better medicines".

Development of concept

In personalised medicine, diagnostic testing is often employed for selecting appropriate and optimal therapies based on the patient's genetics or their other molecular or cellular characteristics. The use of genetic information has played a major role in certain aspects of personalized medicine (e.g. pharmacogenomics), and the term was first coined in the context of genetics, though it has since broadened to encompass all sorts of personalization measures, including the use of proteomics, imaging analysis, and nanoparticle-based theranostics.

Difference between precision medicine and personalized medicine

Precision medicine is a medical model that proposes the customization of healthcare, with medical decisions, treatments, practices, or products being tailored to a subgroup of patients, instead of a one‐drug‐fits‐all model. In precision medicine, diagnostic testing is often employed for selecting appropriate and optimal therapies based on the context of a patient's genetic content or other molecular or cellular analysis. Tools employed in precision medicine can include molecular diagnostics, imaging, and analytics.

Precision medicine and personalized medicine (also individualized medicine) are analogous, applying a person's genetic profile to guide clinical decisions about the prevention, diagnosis, and treatment of a disease. Personalized medicine is established on discoveries from the Human Genome Project.

In explaining the distinction from the similar term of personalized medicine, the United States President's Council of Advisors on Science and Technology writes:

Precision medicine refers to the tailoring of medical treatment to the individual characteristics of each patient. It does not literally mean the creation of drugs or medical devices that are unique to a patient, but rather the ability to classify individuals into subpopulations that differ in their susceptibility to a particular disease, in the biology or prognosis of those diseases they may develop, or in their response to a specific treatment. Preventive or therapeutic interventions can then be concentrated on those who will benefit, sparing expense and side effects for those who will not.

The use of the term "precision medicine" can extend beyond treatment selection to also cover creating unique medical products for particular individuals—for example, "...patient-specific tissue or organs to tailor treatments for different people." Hence, the term in practice has so much overlap with "personalized medicine" that they are often used interchangeably, even though the latter is sometimes misinterpreted as involving a unique treatment for each individual.

Background

Basics

Every person has a unique variation of the human genome. Although most of the variation between individuals has no effect on health, an individual's health stems from the interplay of genetic variation with behaviors and influences from the environment.

Modern advances in personalized medicine rely on technology that characterizes a patient's fundamental biology at the level of DNA, RNA, or protein, which ultimately helps confirm disease. For example, personalised techniques such as genome sequencing can reveal mutations in DNA that influence diseases ranging from cystic fibrosis to cancer. Another method, called RNA-seq, can show which RNA molecules are involved with specific diseases. Unlike DNA, levels of RNA can change in response to the environment. Therefore, sequencing RNA can provide a broader understanding of a person's state of health. Recent studies have linked genetic differences between individuals to RNA expression, translation, and protein levels.

The concepts of personalised medicine can be applied to new and transformative approaches to health care. Personalised health care is based on the dynamics of systems biology and uses predictive tools to evaluate health risks and to design personalised health plans to help patients mitigate risks, prevent disease, and treat it with precision when it occurs. The concepts of personalised health care are receiving increasing acceptance, with the Veterans Administration committing to personalised, proactive, patient-driven care for all veterans. In some instances personalised health care can be tailored to the makeup of the disease-causing agent instead of the patient's genetic makeup; examples are drug-resistant bacteria or viruses.

Precision medicine often involves the application of panomic analysis and systems biology to analyze the cause of an individual patient's disease at the molecular level and then to utilize targeted treatments (possibly in combination) to address that individual patient's disease process. The patient's response is then tracked as closely as possible, often using surrogate measures such as tumor load (versus true outcomes, such as five-year survival rate), and the treatment finely adapted to the patient's response. The branch of precision medicine that addresses cancer is referred to as "precision oncology". The field of precision medicine that is related to psychiatric disorders and mental health is called "precision psychiatry."

Inter-personal differences in molecular pathology are diverse, as are inter-personal differences in the exposome, which influence disease processes through the interactome within the tissue microenvironment, differently from person to person. As the theoretical basis of precision medicine, the "unique disease principle" emerged to embrace the ubiquitous phenomenon of heterogeneity of disease etiology and pathogenesis. The unique disease principle was first described in neoplastic diseases as the unique tumor principle. As the exposome is a common concept of epidemiology, precision medicine is intertwined with molecular pathological epidemiology, which is capable of identifying potential biomarkers for precision medicine.

Method

In order for physicians to know if a mutation is connected to a certain disease, researchers often do a study called a "genome-wide association study" (GWA study). Such a study will look at one disease, and then sequence the genome of many patients with that particular disease to look for shared mutations in the genome. Mutations that are determined to be related to a disease by a GWA study can then be used to diagnose that disease in future patients, by looking at their genome sequence to find that same mutation. The first GWA study, conducted in 2005, studied patients with age-related macular degeneration (ARMD). It found two different mutations, each involving a variation in only a single nucleotide (called single nucleotide polymorphisms, or SNPs), which were associated with ARMD. GWA studies like this have been very successful in identifying common genetic variations associated with diseases. As of early 2014, over 1,300 GWA studies had been completed.
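The core association test in a GWA study can be sketched as a simple allele-count comparison between cases and controls. The counts below are invented for illustration, and a real study would additionally adjust for population structure and apply a genome-wide significance threshold (conventionally p < 5×10⁻⁸, which corrects for roughly a million tests):

```python
def chi_square_2x2(table):
    """Pearson chi-square statistic for a 2x2 contingency table.

    table = [[a, b], [c, d]] of observed counts, e.g. rows = cases
    and controls, columns = risk allele and other allele.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    # Expected counts under independence: row_total * col_total / n
    expected = [[(a + b) * (a + c) / n, (a + b) * (b + d) / n],
                [(c + d) * (a + c) / n, (c + d) * (b + d) / n]]
    return sum((obs - exp) ** 2 / exp
               for row_obs, row_exp in zip(table, expected)
               for obs, exp in zip(row_obs, row_exp))

# Hypothetical allele counts at one SNP:
counts = [[620, 380],   # 1000 chromosomes from cases
          [480, 520]]   # 1000 chromosomes from controls
print(round(chi_square_2x2(counts), 1))  # 39.6
```

A statistic this large for one degree of freedom corresponds to a very small p-value, which is why per-SNP tests of this kind, repeated across the genome, can flag candidate variants for follow-up.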

Disease risk assessment

Multiple genes collectively influence the likelihood of developing many common and complex diseases. Personalised medicine can also be used to predict a person's risk for a particular disease, based on one or even several genes. This approach uses the same sequencing technology to focus on the evaluation of disease risk, allowing the physician to initiate preventive treatment before the disease presents itself in their patient. For example, if it is found that a DNA mutation increases a person's risk of developing type 2 diabetes, this individual can begin lifestyle changes that will lessen their chances of developing type 2 diabetes later in life.

Practice

The ability to provide precision medicine to patients in routine clinical settings depends on the availability of molecular profiling tests, e.g. individual germline DNA sequencing. While precision medicine currently individualizes treatment mainly on the basis of genomic tests (e.g. Oncotype DX), several promising technology modalities are being developed, from techniques combining spectrometry and computational power to real-time imaging of drug effects in the body. Many different aspects of precision medicine are tested in research settings (e.g., proteome, microbiome), but in routine practice not all available inputs are used. The ability to practice precision medicine is also dependent on the knowledge bases available to assist clinicians in taking action based on test results. Early studies applying omics-based precision medicine to cohorts of individuals with undiagnosed disease have yielded a diagnosis rate of ~35%, with ~1 in 5 of those newly diagnosed receiving recommendations regarding changes in therapy. It has been suggested that until pharmacogenetics becomes further developed and able to predict individual treatment responses, N-of-1 trials are the best method of identifying patients who respond to treatments.

On the treatment side, PM can involve the use of customized medical products such as drug cocktails produced by pharmacy compounding or customized devices. It can also prevent harmful drug interactions, increase overall efficiency when prescribing medications, and reduce costs associated with healthcare.

The question of who benefits from publicly funded genomics is an important public health consideration, and attention is needed to ensure that implementation of genomic medicine does not further entrench social‐equity concerns.

Artificial intelligence in precision medicine

Artificial intelligence is providing a paradigm shift toward precision medicine. Machine learning algorithms are used for genomic sequencing and to analyze and draw inferences from the vast amounts of data that patients and healthcare institutions record at every moment. AI techniques are used in precision cardiovascular medicine to understand genotypes and phenotypes in existing diseases, improve the quality of patient care, enable cost-effectiveness, and reduce readmission and mortality rates. A 2021 paper reported that machine learning was able to predict the outcomes of Phase III clinical trials (for treatment of prostate cancer) with 76% accuracy. This suggests that clinical trial data could provide a practical source for machine learning-based tools for precision medicine.

Precision medicine may be susceptible to subtle forms of algorithmic bias. For example, the presence of multiple entry fields with values entered by multiple observers can create distortions in the ways data is understood and interpreted. A 2020 paper showed that training machine learning models in a population-specific fashion (i.e. training models specifically for Black cancer patients) can yield significantly superior performance than population-agnostic models.

Precision Medicine Initiative

In his 2015 State of the Union address, then-U.S. President Barack Obama stated his intention to give $215 million of funding to the "Precision Medicine Initiative" of the United States National Institutes of Health. A short-term goal of this initiative was to expand cancer genomics to develop better prevention and treatment methods. In the long term, the Precision Medicine Initiative aimed to build a comprehensive scientific knowledge base by creating a national network of scientists and embarking on a national cohort study of one million Americans to expand our understanding of health and disease. The mission statement of the Precision Medicine Initiative read: "To enable a new era of medicine through research, technology, and policies that empower patients, researchers, and providers to work together toward development of individualized treatments". In 2016 this initiative was renamed to "All of Us" and by January 2018, 10,000 people had enrolled in its pilot phase.

Benefits of precision medicine

Precision medicine helps health care providers better understand the many things—including environment, lifestyle, and heredity—that play a role in a patient's health, disease, or condition. This information lets them more accurately predict which treatments will be most effective and safe, or possibly how to prevent the illness from starting in the first place. In addition, benefits are to:

  • shift the emphasis in medicine from reaction to prevention
  • predict susceptibility to disease
  • improve disease detection
  • preempt disease progression
  • customize disease-prevention strategies
  • prescribe more effective drugs
  • avoid prescribing drugs with predictable negative side effects
  • reduce the time, cost, and failure rate of pharmaceutical clinical trials
  • eliminate trial-and-error inefficiencies that inflate health care costs and undermine patient care

Applications

Advances in personalised medicine will create a more unified treatment approach specific to the individual and their genome. Personalised medicine may provide better diagnoses with earlier intervention, and more efficient drug development and more targeted therapies.

Diagnosis and intervention

Having the ability to look at a patient on an individual basis will allow for a more accurate diagnosis and specific treatment plan. Genotyping is the process of obtaining an individual's DNA sequence by using biological assays. By having a detailed account of an individual's DNA sequence, their genome can then be compared to a reference genome, like that of the Human Genome Project, to assess the existing genetic variations that can account for possible diseases. A number of private companies, such as 23andMe, Navigenics, and Illumina, have created Direct-to-Consumer genome sequencing accessible to the public. Having this information from individuals can then be applied to effectively treat them. An individual's genetic make-up also plays a large role in how well they respond to a certain treatment, and therefore, knowing their genetic content can change the type of treatment they receive.

An aspect of this is pharmacogenomics, which uses an individual's genome to provide a more informed and tailored drug prescription. Often, drugs are prescribed with the idea that they will work relatively the same for everyone, but in the application of drugs, there are a number of factors that must be considered. The detailed account of genetic information from the individual will help prevent adverse events, allow for appropriate dosages, and create maximum efficacy with drug prescriptions. For instance, warfarin is an FDA-approved oral anticoagulant commonly prescribed to patients with blood clots. Due to warfarin's significant interindividual variability in pharmacokinetics and pharmacodynamics, its rate of adverse events is among the highest of all commonly prescribed drugs. However, with the discovery of polymorphic variants in the CYP2C9 and VKORC1 genotypes, two genes that shape an individual's anticoagulant response, physicians can use a patient's gene profile to prescribe optimum doses of warfarin to prevent side effects such as major bleeding and to allow sooner and better therapeutic efficacy. The pharmacogenomic process for discovery of genetic variants that predict adverse events to a specific drug has been termed toxgnostics.
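The structure of such genotype-guided prescribing can be sketched as a lookup from (CYP2C9 diplotype, VKORC1 genotype) to a starting-dose range. The diplotype notation is standard pharmacogenomic nomenclature, but the dose ranges below are placeholder numbers for illustration only, not clinical guidance; real genotype-guided dosing uses validated algorithms and the drug label's genotype table:

```python
# Illustrative only: dose ranges are placeholders, NOT clinical values.
DOSE_RANGES_MG = {
    ("*1/*1", "GG"): (5.0, 7.0),   # normal metabolizer, lower warfarin sensitivity
    ("*1/*1", "AG"): (4.0, 6.0),
    ("*1/*2", "AG"): (3.0, 4.0),
    ("*2/*3", "AA"): (0.5, 2.0),   # reduced metabolism, higher sensitivity
}

def starting_dose_range(cyp2c9: str, vkorc1: str):
    """Look up an illustrative daily starting-dose range (mg) from a
    CYP2C9 diplotype and a VKORC1 -1639G>A genotype; returns None for
    combinations outside this toy table."""
    return DOSE_RANGES_MG.get((cyp2c9, vkorc1))

print(starting_dose_range("*2/*3", "AA"))  # (0.5, 2.0)
```

In practice such tables are one input among several: published dosing algorithms also combine clinical covariates such as age, weight, and interacting drugs.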

An aspect of a theranostic platform applied to personalized medicine can be the use of diagnostic tests to guide therapy. The tests may involve medical imaging, such as MRI contrast agents (T1 and T2 agents), fluorescent markers (organic dyes and inorganic quantum dots), and nuclear imaging agents (PET radiotracers or SPECT agents), or in vitro lab tests including DNA sequencing, and often involve deep learning algorithms that weigh the results of testing for several biomarkers.

In addition to specific treatment, personalised medicine can greatly aid the advancement of preventive care. For instance, many women are already being genotyped for certain mutations in the BRCA1 and BRCA2 genes if they are predisposed because of a family history of breast cancer or ovarian cancer. As more causes of diseases are mapped to mutations that exist within a genome, the more easily those diseases can be identified in an individual. Measures can then be taken to prevent a disease from developing. Even if mutations are found within a genome, knowing the details of one's DNA can reduce the impact or delay the onset of certain diseases. Having the genetic content of an individual will allow better-guided decisions in determining the source of the disease and thus treating it or preventing its progression. This will be extremely useful for diseases like Alzheimer's or cancers that are thought to be linked to certain mutations in our DNA.

A tool that is being used now to test efficacy and safety of a drug specific to a targeted patient group/sub-group is companion diagnostics. This technology is an assay that is developed during or after a drug is made available on the market and is helpful in enhancing the therapeutic treatment available based on the individual. These companion diagnostics have incorporated the pharmacogenomic information related to the drug into their prescription label in an effort to assist in making the optimal treatment decision for the patient.

An overall process of personalized cancer therapy. Genome sequencing will allow for a more accurate and personalized drug prescription and a targeted therapy for different patients.

Drug development and usage

Having an individual's genomic information can be significant in the process of developing drugs as they await approval from the FDA for public use. Having a detailed account of an individual's genetic make-up can be a major asset in deciding if a patient can be chosen for inclusion or exclusion in the final stages of a clinical trial. Being able to identify patients who will benefit most from a clinical trial will increase the safety of patients from adverse outcomes caused by the product in testing, and will allow smaller and faster trials that lead to lower overall costs. In addition, drugs that are deemed ineffective for the larger population can gain approval by the FDA by using personal genomes to qualify the effectiveness and need for that specific drug or therapy even though it may only be needed by a small percentage of the population.

Physicians commonly use a trial and error strategy until they find the treatment therapy that is most effective for their patient. With personalized medicine, these treatments can be more specifically tailored by predicting how an individual's body will respond and whether the treatment will work based on their genome. This has been summarized as "therapy with the right drug at the right dose in the right patient." Such an approach would also be more cost-effective and accurate. For instance, tamoxifen used to be a drug commonly prescribed to women with ER+ breast cancer, but 65% of women initially taking it developed resistance. After research by people such as David Flockhart, it was discovered that women with certain mutations in their CYP2D6 gene, a gene that encodes a drug-metabolizing enzyme, were not able to efficiently break down tamoxifen, making it an ineffective treatment for them. Women are now genotyped for these specific mutations to select the most effective treatment.

Screening for these mutations is carried out via high-throughput screening or phenotypic screening. Several drug discovery and pharmaceutical companies are currently utilizing these technologies to not only advance the study of personalised medicine, but also to amplify genetic research. Alternative multi-target approaches to the traditional approach of "forward" transfection library screening can entail reverse transfection or chemogenomics.

Pharmacy compounding is another application of personalised medicine. Though not necessarily using genetic information, the customized production of a drug whose various properties (e.g. dose level, ingredient selection, route of administration, etc.) are selected and crafted for an individual patient is accepted as an area of personalised medicine (in contrast to mass-produced unit doses or fixed-dose combinations). Computational and mathematical approaches for predicting drug interactions are also being developed. For example, phenotypic response surfaces model the relationships between drugs, their interactions, and an individual's biomarkers.

One active area of research is efficiently delivering personalized drugs generated from pharmacy compounding to the disease sites of the body. For instance, researchers are trying to engineer nanocarriers that can precisely target a specific site by using real-time imaging and analyzing the pharmacodynamics of the drug delivery. Several candidate nanocarriers are being investigated, such as iron oxide nanoparticles, quantum dots, carbon nanotubes, gold nanoparticles, and silica nanoparticles. Alteration of surface chemistry allows these nanoparticles to be loaded with drugs, as well as to avoid the body's immune response, making nanoparticle-based theranostics possible. Nanocarriers' targeting strategies vary according to the disease. For example, if the disease is cancer, a common approach is to identify the biomarker expressed on the surface of cancer cells and to load its associated targeting vector onto the nanocarrier to achieve recognition and binding; the size scale of the nanocarriers will also be engineered to exploit the enhanced permeability and retention (EPR) effect in tumor targeting. If the disease is localized in a specific organ, such as the kidney, the surface of the nanocarriers can be coated with a certain ligand that binds to receptors inside that organ to achieve organ-targeting drug delivery and avoid non-specific uptake. Despite the great potential of this nanoparticle-based drug delivery system, significant progress in the field is yet to be made, and nanocarriers are still being investigated and modified to meet clinical standards.

Theranostics

Theranostics is a personalized approach in nuclear medicine, using similar molecules for both imaging (diagnosis) and therapy. The term is a portmanteau of "therapeutics" and "diagnostics". Its most common applications are attaching radionuclides (either gamma or positron emitters) to molecules for SPECT or PET imaging, or electron emitters for radiotherapy. One of the earliest examples is the use of radioactive iodine for treatment of people with thyroid cancer. Other examples include radio-labelled anti-CD20 antibodies (e.g. Bexxar) for treating lymphoma, Radium-223 for treating bone metastases, Lutetium-177 DOTATATE for treating neuroendocrine tumors and Lutetium-177 PSMA for treating prostate cancer. A commonly used reagent is fluorodeoxyglucose, using the isotope fluorine-18.

Respiratory proteomics

The preparation of a proteomics sample on a sample carrier to be analyzed by mass spectrometry

Respiratory diseases affect humanity globally, with chronic lung diseases (e.g., asthma, chronic obstructive pulmonary disease, and idiopathic pulmonary fibrosis) and lung cancer causing extensive morbidity and mortality. These conditions are highly heterogeneous and require an early diagnosis; however, initial symptoms are nonspecific, and the clinical diagnosis is frequently made late. Over the last few years, personalized medicine has emerged as a medical care approach that uses novel technology aiming to personalize treatments according to the particular patient's medical needs. Specifically, proteomics is used to analyze a series of protein expressions, instead of a single biomarker. Proteins control the body's biological activities, including health and disease, so proteomics is helpful in early diagnosis. In the case of respiratory disease, proteomics analyzes several biological samples, including serum, blood cells, bronchoalveolar lavage fluids (BAL), nasal lavage fluids (NLF), and sputum. The identification and quantification of complete protein expression from these biological samples are conducted by mass spectrometry and advanced analytical techniques. Respiratory proteomics has made significant progress in the development of personalized medicine for supporting health care in recent years. For example, in a study conducted by Lazzari et al. in 2012, a proteomics-based approach made substantial improvement in identifying multiple biomarkers of lung cancer that can be used in tailoring personalized treatments for individual patients. More and more studies have demonstrated the usefulness of proteomics in providing targeted therapies for respiratory disease.

Cancer genomics

Over recent decades cancer research has discovered a great deal about the genetic variety of types of cancer that appear the same in traditional pathology. There has also been increasing awareness of tumor heterogeneity, or genetic diversity within a single tumor. Among other prospects, these discoveries raise the possibility of finding that drugs that have not given good results applied to a general population of cases may yet be successful for a proportion of cases with particular genetic profiles.

Personalized oncogenomics is the application of personalized medicine to cancer genomics. High-throughput sequencing methods are used to characterize genes associated with cancer to better understand disease pathology and improve drug development. Oncogenomics is one of the most promising branches of genomics, particularly because of its implications in drug therapy. Examples of this include:

  • Trastuzumab (trade names Herclon, Herceptin) is a monoclonal antibody drug that interferes with the HER2/neu receptor. Its main use is to treat certain breast cancers. This drug is only used if a patient's cancer tests positive for over-expression of the HER2/neu receptor. Two tissue-typing tests are used to screen patients for possible benefit from Herceptin treatment: immunohistochemistry (IHC) and fluorescence in situ hybridization (FISH). Only HER2-positive patients will be treated with Herceptin (trastuzumab) therapy.
  • Tyrosine kinase inhibitors such as imatinib (marketed as Gleevec) have been developed to treat chronic myeloid leukemia (CML), in which the BCR-ABL fusion gene (the product of a reciprocal translocation between chromosome 9 and chromosome 22) is present in >95% of cases and produces hyperactivated abl-driven protein signaling. These medications specifically inhibit the Abelson tyrosine kinase (ABL) protein and are thus a prime example of "rational drug design" based on knowledge of disease pathophysiology.
  • The FoundationOne CDx report produced by Foundation Medicine, which looks at genes in individual patients' tumor biopsies and recommends specific drugs
  • High mutation burden is indicative of response to immunotherapy, and specific patterns of mutations have been associated with previous exposure to cytotoxic cancer drugs.

Population screening

Through the use of genomics (microarray), proteomics (tissue array), and imaging (fMRI, micro-CT) technologies, molecular-scale information about patients can be easily obtained. These so-called molecular biomarkers have proven powerful in disease prognosis, such as with cancer.[89][90][91] The three main areas of cancer prediction fall under cancer recurrence, cancer susceptibility and cancer survivability. Combining molecular-scale information with macro-scale clinical data, such as patients' tumor type and other risk factors, significantly improves prognosis. Consequently, given the use of molecular biomarkers, especially genomics, cancer prognosis or prediction has become very effective, especially when screening a large population. Essentially, population genomics screening can be used to identify people at risk for disease, which can assist in preventative efforts.

Genetic data can be used to construct polygenic scores, which estimate traits such as disease risk by summing the estimated effects of individual variants discovered through a GWA study. These have been used for a wide variety of conditions, such as cancer, diabetes, and coronary artery disease. Many genetic variants are associated with ancestry, and it remains a challenge to both generate accurate estimates and to decouple biologically relevant variants from those that are coincidentally associated. Estimates generated from one population do not usually transfer well to others, requiring sophisticated methods and more diverse and global data. Most studies have used data from those with European ancestry, leading to calls for more equitable genomics practices to reduce health disparities. Additionally, while polygenic scores have some predictive accuracy, their interpretations are limited to estimating an individual's percentile and translational research is needed for clinical use.
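The additive construction described above can be sketched in a few lines: a polygenic score is the sum of each variant's GWAS-estimated effect size, weighted by how many copies of the effect allele the person carries. The variant IDs, effect sizes, and allele counts below are invented for illustration:

```python
def polygenic_score(dosages, weights):
    """Additive polygenic score: sum of per-variant effect sizes
    weighted by the number of effect alleles carried (0, 1, or 2).

    dosages and weights are dicts keyed by variant ID; variants the
    person was not genotyped at are simply skipped.
    """
    return sum(weights[v] * d for v, d in dosages.items() if v in weights)

# Hypothetical per-variant effect sizes (e.g. log odds ratios) from a GWAS.
weights = {"rs0001": 0.12, "rs0002": -0.05, "rs0003": 0.30}
# One individual's effect-allele counts at those variants.
person = {"rs0001": 2, "rs0002": 1, "rs0003": 0}
print(round(polygenic_score(person, weights), 2))  # 2*0.12 - 0.05 = 0.19
```

The raw score has no absolute meaning; as the paragraph above notes, it is typically interpreted as a percentile within a reference population, and the weights transfer poorly across populations with different ancestry.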

Challenges

As personalised medicine is practiced more widely, a number of challenges arise. The current approaches to intellectual property rights, reimbursement policies, patient privacy, data biases and confidentiality as well as regulatory oversight will have to be redefined and restructured to accommodate the changes personalised medicine will bring to healthcare. For instance, a survey performed in the UK concluded that 63% of UK adults are not comfortable with their personal data being used for the sake of utilizing AI in the medical field. Furthermore, the analysis of acquired diagnostic data is a recent challenge of personalized medicine and its implementation. For example, genetic data obtained from next-generation sequencing requires computer-intensive data processing prior to its analysis. In the future, adequate tools will be required to accelerate the adoption of personalised medicine to further fields of medicine, which requires the interdisciplinary cooperation of experts from specific fields of research, such as medicine, clinical oncology, biology, and artificial intelligence.

Regulatory oversight

The U.S. Food and Drug Administration (FDA) has started taking initiatives to integrate personalised medicine into its regulatory policies. In October 2013, the agency published a report entitled "Paving the Way for Personalized Medicine: FDA's Role in a New Era of Medical Product Development," in which it outlined steps it would have to take to integrate genetic and biomarker information for clinical use and drug development. These included developing specific regulatory standards, research methods and reference materials. An example of the latter category it was working on is a "genomic reference library", aimed at improving the quality and reliability of different sequencing platforms. A major challenge for those regulating personalized medicine is how to demonstrate its effectiveness relative to the current standard of care. The new technology must be assessed for both clinical and cost effectiveness, and as of 2013, regulatory agencies had no standardized method.

Intellectual property rights

As with any innovation in medicine, investment and interest in personalised medicine is influenced by intellectual property rights. There has been considerable controversy regarding patent protection for diagnostic tools, genes, and biomarkers. In June 2013, the U.S. Supreme Court ruled that naturally occurring genes cannot be patented, while "synthetic DNA" that is edited or artificially created can still be patented. The Patent Office is currently reviewing a number of issues related to patent law for personalised medicine, such as whether "confirmatory" secondary genetic tests performed after an initial diagnosis can have full immunity from patent laws. Opponents of patents argue that patents on DNA sequences impede ongoing research, while proponents point to the research exemption and stress that patents are necessary to entice and protect the financial investment required for commercial research and for the development and advancement of the services offered.

Reimbursement policies

Reimbursement policies will have to be redefined to fit the changes that personalised medicine will bring to the healthcare system. Some of the factors to consider are the efficacy of various genetic tests in the general population, cost-effectiveness relative to benefits, how to handle payment for extremely rare conditions, and how to redefine the insurance concept of "shared risk" to incorporate the newer concept of "individual risk factors". The study Barriers to the Use of Personalized Medicine in Breast Cancer examined two diagnostic tests, BRACAnalysis and Oncotype DX. Both tests have turnaround times of over ten days, which can delay treatment. Because patients are not reimbursed for these delays, the tests often go unordered. Ultimately, this leads to patients paying out of pocket for treatments, because insurance companies do not want to accept the risks involved.

Patient privacy and confidentiality

Perhaps the most critical issue with the commercialization of personalised medicine is the protection of patients. One of the largest issues is the fear of, and potential consequences for, patients found by genetic testing to be predisposed to disease or non-responsive to certain treatments. This includes the psychological effects of genetic-testing results on patients. The rights of family members who have not directly consented are another issue, considering that genetic predispositions and risks are heritable. The implications for certain ethnic groups, and the presence of a common allele, would also have to be considered.

Moreover, privacy issues arise at every layer of personalized medicine, from discovery to treatment. One of the leading issues is patients' consent to having their information used in genetic-testing algorithms, primarily AI algorithms. The consent of the institution providing the data is of prominent concern as well. In 2008, the Genetic Information Nondiscrimination Act (GINA) was passed in an effort to minimize the fear of patients participating in genetic research by ensuring that their genetic information will not be misused by employers or insurers. On February 19, 2015, the FDA issued a press release titled "FDA permits marketing of first direct-to-consumer genetic carrier test for Bloom syndrome".

Data biases

Data biases also play an integral role in personalized medicine. It is important to ensure that the samples of genes being tested come from different populations, so that the samples do not reproduce the human biases that can affect decision-making.

Consequently, if the algorithms designed for personalized medicine are trained on data lacking genetic testing from certain populations, their outcomes will also be biased. For instance, results from the Framingham Heart Study led to biased predictions of cardiovascular-disease risk: because the sample included only white participants, applying the results to non-white populations both overestimated and underestimated risk.

Implementation

Several issues must be addressed before personalized medicine can be implemented. Very little of the human genome has been analyzed, and even if healthcare providers had access to a patient's full genetic information, very little of it could be effectively leveraged into treatment. Challenges also arise when processing such large amounts of genetic data. Even with error rates as low as 1 per 100 kilobases, processing a human genome could yield roughly 30,000 errors. So many errors, especially when trying to identify specific markers, make discovery and verification difficult. There are methods to overcome this, but they are computationally taxing and expensive. There are also issues from an effectiveness standpoint: after a genome has been processed, the functional effects of variation among genomes must be analyzed using genome-wide studies. While the impact of the SNPs discovered in such studies can be predicted, more work must be done to control for the vast amount of variation that can occur because of the size of the genome being studied. To move forward effectively in this area, steps must be taken to ensure that the data being analyzed are sound, and a wider view must be taken in analyzing multiple SNPs for a phenotype. The most pressing issue facing the implementation of personalized medicine is how to apply the results of genetic mapping to improve the healthcare system. This is not only because of the infrastructure and technology required for a centralized database of genome data, but also because the physicians who would have access to these tools would likely be unable to take full advantage of them. To truly implement a personalized-medicine healthcare system, there must be an end-to-end change.

The Copenhagen Institute for Futures Studies and Roche set up FutureProofing Healthcare, which produces a Personalised Health Index rating different countries' performance against 27 indicators of personalised health across four categories called 'Vital Signs'. They have run conferences in many countries to examine their findings.

Protein folding

From Wikipedia, the free encyclopedia
Protein before and after folding
Results of protein folding

Protein folding is the physical process by which a protein, after synthesis by a ribosome as a linear chain of amino acids, changes from an unstable random coil into a more ordered three-dimensional structure. This structure permits the protein to become biologically functional or active.

The folding of many proteins begins even during the translation of the polypeptide chain. The amino acids interact with each other to produce a well-defined three-dimensional structure, known as the protein's native state. This structure is determined by the amino-acid sequence or primary structure.

The correct three-dimensional structure is essential to function, although some parts of functional proteins may remain unfolded, indicating that protein dynamics are important. Failure to fold into a native structure generally produces inactive proteins, but in some instances, misfolded proteins have modified or toxic functionality. Several neurodegenerative and other diseases are believed to result from the accumulation of amyloid fibrils formed by misfolded proteins, the infectious varieties of which are known as prions. Many allergies are caused by the incorrect folding of some proteins because the immune system does not produce the antibodies for certain protein structures.

Denaturation of proteins is a process of transition from a folded to an unfolded state. It happens in cooking, burns, proteinopathies, and other contexts. Residual structure present, if any, in the supposedly unfolded state may form a folding initiation site and guide the subsequent folding reactions.

The duration of the folding process varies dramatically depending on the protein of interest. When studied outside the cell, the slowest folding proteins require many minutes or hours to fold, primarily due to proline isomerization, and must pass through a number of intermediate states, like checkpoints, before the process is complete. On the other hand, very small single-domain proteins with lengths of up to a hundred amino acids typically fold in a single step. Time scales of milliseconds are the norm, and the fastest known protein folding reactions are complete within a few microseconds. The folding time scale of a protein depends on its size, contact order, and circuit topology.

Understanding and simulating the protein folding process has been an important challenge for computational biology since the late 1960s.

Process of protein folding

Primary structure

The primary structure of a protein, its linear amino-acid sequence, determines its native conformation. The specific amino acid residues and their position in the polypeptide chain are the determining factors for which portions of the protein fold closely together and form its three-dimensional conformation. The amino acid composition is not as important as the sequence. The essential fact of folding, however, remains that the amino acid sequence of each protein contains the information that specifies both the native structure and the pathway to attain that state. This is not to say that nearly identical amino acid sequences always fold similarly. Conformations differ based on environmental factors as well; similar proteins fold differently based on where they are found.

Secondary structure

The alpha helix spiral formation
An anti-parallel beta pleated sheet displaying hydrogen bonding within the backbone

Formation of a secondary structure is the first step in the folding process that a protein takes to assume its native structure. Characteristic of secondary structure are the structures known as alpha helices and beta sheets, which fold rapidly because they are stabilized by intramolecular hydrogen bonds, as was first characterized by Linus Pauling. Formation of intramolecular hydrogen bonds provides another important contribution to protein stability. α-helices are formed by hydrogen bonding of the backbone to form a spiral shape (refer to the figure on the right). The β pleated sheet is a structure that forms with the backbone bending over itself to form the hydrogen bonds (as displayed in the figure to the left). The hydrogen bonds form between the amide hydrogen and the carbonyl oxygen of the peptide bond. There exist both anti-parallel and parallel β pleated sheets; the hydrogen bonds of the anti-parallel β sheet are more stable because they form at the ideal 180-degree angle, whereas the hydrogen bonds formed by parallel sheets are slanted.

Tertiary structure

α-helices and β-sheets are commonly amphipathic, meaning they have a hydrophilic and a hydrophobic portion. This ability helps in forming tertiary structure of a protein in which folding occurs so that the hydrophilic sides are facing the aqueous environment surrounding the protein and the hydrophobic sides are facing the hydrophobic core of the protein. Secondary structure hierarchically gives way to tertiary structure formation. Once the protein's tertiary structure is formed and stabilized by the hydrophobic interactions, there may also be covalent bonding in the form of disulfide bridges formed between two cysteine residues. These non-covalent and covalent contacts take a specific topological arrangement in a native structure of a protein. Tertiary structure of a protein involves a single polypeptide chain; however, additional interactions of folded polypeptide chains give rise to quaternary structure formation.

Quaternary structure

Tertiary structure may give way to the formation of quaternary structure in some proteins, which usually involves the "assembly" or "coassembly" of subunits that have already folded; in other words, multiple polypeptide chains could interact to form a fully functional quaternary protein.

Driving forces of protein folding

All forms of protein structure summarized

Folding is a spontaneous process that is mainly guided by hydrophobic interactions, formation of intramolecular hydrogen bonds, and van der Waals forces, and is opposed by conformational entropy. The folding time scale of an isolated protein depends on its size, contact order, and circuit topology. Inside cells, the process of folding often begins co-translationally, so that the N-terminus of the protein begins to fold while the C-terminal portion of the protein is still being synthesized by the ribosome; however, a protein molecule may fold spontaneously during or after biosynthesis. While these macromolecules may be regarded as "folding themselves", the process also depends on the solvent (water or lipid bilayer), the concentration of salts, the pH, the temperature, and the possible presence of cofactors and of molecular chaperones.

Proteins are limited in their folding by the restricted bending angles or conformations that are possible. These allowable angles of protein folding are described with a two-dimensional plot known as the Ramachandran plot, which depicts the psi and phi angles of allowable rotation.
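The idea of the Ramachandran plot can be sketched as a simple classifier of (phi, psi) backbone angles. The region boundaries below are rough illustrative assumptions for the major allowed regions, not crystallographic consensus values:

```python
# Illustrative sketch: map a residue's backbone dihedral angles (phi, psi,
# in degrees) to a broad Ramachandran region. Boundaries are approximate
# and chosen for illustration only.
def ramachandran_region(phi: float, psi: float) -> str:
    if -180 <= phi <= -30 and 90 <= psi <= 180:
        return "beta sheet"
    if -180 <= phi <= -30 and -70 <= psi <= 50:
        return "right-handed alpha helix"
    if 30 <= phi <= 100 and -20 <= psi <= 90:
        return "left-handed alpha helix"
    return "disallowed or rare"

# Typical textbook angles: alpha helix ~(-60, -45), beta sheet ~(-120, 120)
print(ramachandran_region(-60, -45))   # right-handed alpha helix
print(ramachandran_region(-120, 120))  # beta sheet
```

A real Ramachandran analysis uses empirically derived density contours from known structures rather than rectangular regions; the sketch only conveys the notion that most (phi, psi) combinations are sterically excluded.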

Hydrophobic effect

Hydrophobic collapse. In the compact fold (to the right), the hydrophobic amino acids (shown as black spheres) collapse toward the center to become shielded from aqueous environment.

Protein folding must be thermodynamically favorable within a cell in order for it to be a spontaneous reaction. Since protein folding is known to be spontaneous, it must have a negative Gibbs free energy change. Gibbs free energy in protein folding is directly related to enthalpy and entropy. For a negative ΔG to arise and for protein folding to be thermodynamically favorable, enthalpy, entropy, or both terms must be favorable.
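The spontaneity condition described above is the standard Gibbs relation, where $T$ is the absolute temperature:

```latex
\Delta G_{\text{fold}} = \Delta H - T\,\Delta S < 0
```

Folding is favorable when a sufficiently negative $\Delta H$ (favorable bonding interactions) and/or a positive $\Delta S$ (for example, the release of ordered water during hydrophobic collapse) outweigh the loss of conformational entropy of the chain.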

Minimizing the number of hydrophobic side-chains exposed to water is an important driving force behind the folding process. The hydrophobic effect is the phenomenon in which the hydrophobic chains of a protein collapse into the core of the protein (away from the hydrophilic environment). In an aqueous environment, the water molecules tend to aggregate around the hydrophobic regions or side chains of the protein, creating water shells of ordered water molecules. An ordering of water molecules around a hydrophobic region increases order in a system and therefore contributes a negative change in entropy (less entropy in the system). The water molecules are fixed in these water cages, which drives the hydrophobic collapse, or the inward folding of the hydrophobic groups. The hydrophobic collapse introduces entropy back to the system via the breaking of the water cages, which frees the ordered water molecules. The multitude of hydrophobic groups interacting within the core of the globular folded protein contributes a significant amount to protein stability after folding, because of the vastly accumulated van der Waals forces (specifically London dispersion forces). The hydrophobic effect acts as a thermodynamic driving force only in the presence of an aqueous medium and an amphiphilic molecule containing a large hydrophobic region. The strength of hydrogen bonds depends on their environment; thus, H-bonds enveloped in a hydrophobic core contribute more to the stability of the native state than H-bonds exposed to the aqueous environment.

In proteins with globular folds, hydrophobic amino acids tend to be interspersed along the primary sequence, rather than randomly distributed or clustered together. However, proteins that have recently been born de novo, which tend to be intrinsically disordered, show the opposite pattern of hydrophobic amino acid clustering along the primary sequence.

Chaperones

Example of a small eukaryotic heat shock protein

Molecular chaperones are a class of proteins that aid in the correct folding of other proteins in vivo. Chaperones exist in all cellular compartments and interact with the polypeptide chain in order to allow the native three-dimensional conformation of the protein to form; however, chaperones themselves are not included in the final structure of the protein they are assisting. Chaperones may assist in folding even while the nascent polypeptide is being synthesized by the ribosome. Molecular chaperones operate by binding to stabilize an otherwise unstable structure of a protein in its folding pathway, but chaperones do not contain the information needed to know the correct native structure of the protein they are aiding; rather, chaperones work by preventing incorrect folding conformations. In this way, chaperones do not actually increase the rate of individual steps involved in the folding pathway toward the native structure; instead, they work by reducing possible unwanted aggregations of the polypeptide chain that might otherwise slow down the search for the proper intermediate, and they provide a more efficient pathway for the polypeptide chain to assume the correct conformations. Chaperones are not to be confused with folding catalyst proteins, which catalyze chemical reactions responsible for slow steps in folding pathways. Examples of folding catalysts are protein disulfide isomerases and peptidyl-prolyl isomerases, which may be involved in formation of disulfide bonds or interconversion between cis and trans stereoisomers of the peptide group. Chaperones have been shown to be critical in the process of protein folding in vivo because they provide the protein with the aid needed to assume its proper alignments and conformations efficiently enough to become "biologically relevant". This means that the polypeptide chain could theoretically fold into its native structure without the aid of chaperones, as demonstrated by protein folding experiments conducted in vitro; however, this process proves to be too inefficient or too slow to exist in biological systems; therefore, chaperones are necessary for protein folding in vivo. Along with their role in aiding native-structure formation, chaperones have been shown to be involved in various other roles such as protein transport and degradation, and they even allow denatured proteins exposed to certain external denaturant factors an opportunity to refold into their correct native structures.

A fully denatured protein lacks both tertiary and secondary structure, and exists as a so-called random coil. Under certain conditions some proteins can refold; however, in many cases, denaturation is irreversible. Cells sometimes protect their proteins against the denaturing influence of heat with enzymes known as heat shock proteins (a type of chaperone), which assist other proteins both in folding and in remaining folded. Heat shock proteins have been found in all species examined, from bacteria to humans, suggesting that they evolved very early and have an important function. Some proteins never fold in cells at all except with the assistance of chaperones which either isolate individual proteins so that their folding is not interrupted by interactions with other proteins or help to unfold misfolded proteins, allowing them to refold into the correct native structure. This function is crucial to prevent the risk of precipitation into insoluble amorphous aggregates. The external factors involved in protein denaturation or disruption of the native state include temperature, external fields (electric, magnetic), molecular crowding, and even the limitation of space (i.e. confinement), which can have a big influence on the folding of proteins. High concentrations of solutes, extremes of pH, mechanical forces, and the presence of chemical denaturants can contribute to protein denaturation, as well. These individual factors are categorized together as stresses. Chaperones are shown to exist in increasing concentrations during times of cellular stress and help the proper folding of emerging proteins as well as denatured or misfolded ones.

Under some conditions proteins will not fold into their biochemically functional forms. Temperatures above or below the range that cells tend to live in will cause thermally unstable proteins to unfold or denature (this is why boiling makes an egg white turn opaque). Protein thermal stability is far from constant, however; for example, hyperthermophilic bacteria have been found that grow at temperatures as high as 122 °C, which of course requires that their full complement of vital proteins and protein assemblies be stable at that temperature or above.

The bacterium E. coli is the host for bacteriophage T4, and the phage encoded gp31 protein (P17313) appears to be structurally and functionally homologous to E. coli chaperone protein GroES and able to substitute for it in the assembly of bacteriophage T4 virus particles during infection. Like GroES, gp31 forms a stable complex with GroEL chaperonin that is absolutely necessary for the folding and assembly in vivo of the bacteriophage T4 major capsid protein gp23.

Fold switching

Some proteins have multiple native structures, and change their fold based on some external factors. For example, the KaiB protein switches fold throughout the day, acting as a clock for cyanobacteria. It has been estimated that around 0.5–4% of PDB (Protein Data Bank) proteins switch folds.

Protein misfolding and neurodegenerative disease

A protein is considered to be misfolded if it cannot achieve its normal native state. This can be due to mutations in the amino acid sequence or a disruption of the normal folding process by external factors. The misfolded protein typically contains β-sheets that are organized in a supramolecular arrangement known as a cross-β structure. These β-sheet-rich assemblies are very stable, very insoluble, and generally resistant to proteolysis. The structural stability of these fibrillar assemblies is caused by extensive interactions between the protein monomers, formed by backbone hydrogen bonds between their β-strands. The misfolding of proteins can trigger the further misfolding and accumulation of other proteins into aggregates or oligomers. The increased levels of aggregated proteins in the cell lead to the formation of amyloid-like structures which can cause degenerative disorders and cell death. The amyloids are fibrillar structures that contain intermolecular hydrogen bonds, are highly insoluble, and are made from converted protein aggregates. Therefore, the proteasome pathway may not be efficient enough to degrade the misfolded proteins prior to aggregation. Misfolded proteins can interact with one another and form structured aggregates and gain toxicity through intermolecular interactions.

Aggregated proteins are associated with prion-related illnesses such as Creutzfeldt–Jakob disease, bovine spongiform encephalopathy (mad cow disease), amyloid-related illnesses such as Alzheimer's disease and familial amyloid cardiomyopathy or polyneuropathy, as well as intracellular aggregation diseases such as Huntington's and Parkinson's disease. These age-onset degenerative diseases are associated with the aggregation of misfolded proteins into insoluble, extracellular aggregates and/or intracellular inclusions, including cross-β amyloid fibrils. It is not completely clear whether the aggregates are the cause or merely a reflection of the loss of protein homeostasis, the balance between synthesis, folding, aggregation and protein turnover. Recently, the European Medicines Agency approved the use of Tafamidis or Vyndaqel (a kinetic stabilizer of tetrameric transthyretin) for the treatment of transthyretin amyloid diseases. This suggests that the process of amyloid fibril formation (and not the fibrils themselves) causes the degeneration of post-mitotic tissue in human amyloid diseases. Misfolding and excessive degradation instead of folding and function leads to a number of proteopathy diseases such as antitrypsin-associated emphysema, cystic fibrosis and the lysosomal storage diseases, where loss of function is the origin of the disorder. While protein replacement therapy has historically been used to correct the latter disorders, an emerging approach is to use pharmaceutical chaperones to fold mutated proteins to render them functional.

Experimental techniques for studying protein folding

While inferences about protein folding can be made through mutation studies, typically, experimental techniques for studying protein folding rely on the gradual unfolding or folding of proteins and observing conformational changes using standard non-crystallographic techniques.

X-ray crystallography

Steps of X-ray crystallography

X-ray crystallography is one of the more efficient and important methods for deciphering the three-dimensional configuration of a folded protein. To conduct X-ray crystallography, the protein under investigation must be located inside a crystal lattice. To place a protein inside a crystal lattice, one must have a suitable solvent for crystallization, obtain a pure protein at supersaturated levels in solution, and precipitate the crystals in solution. Once a protein is crystallized, X-ray beams can be concentrated through the crystal lattice, which diffracts the beams outwards in various directions. These exiting beams are correlated to the specific three-dimensional configuration of the protein enclosed within. The X-rays interact specifically with the electron clouds surrounding the individual atoms within the protein crystal lattice and produce a discernible diffraction pattern. The pattern, however, records only the amplitudes of the diffracted X-rays and not their phases; recovering the phase angles needed to relate the pattern back to the electron-density clouds is what complicates this method. Without the relation established through the mathematical basis known as the Fourier transform, this "phase problem" would make reconstructing the electron density from the diffraction patterns very difficult. Methods such as multiple isomorphous replacement use the presence of a heavy metal ion to diffract the X-rays in a more predictable manner, reducing the number of variables involved and resolving the phase problem.
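The phase problem can be illustrated with a toy one-dimensional example: a detector records only intensities |F|², so two different "densities" that differ only in phase produce identical measurements. The pure-Python discrete Fourier transform below is an illustrative sketch, not a crystallographic computation:

```python
# Toy illustration of the crystallographic "phase problem": diffraction
# intensities |F|^2 discard the phases of the structure factors F, so a
# density and its shifted copy give identical intensity measurements.
import cmath

def dft(samples):
    """Plain discrete Fourier transform of a real-valued sequence."""
    n = len(samples)
    return [sum(samples[j] * cmath.exp(-2j * cmath.pi * k * j / n)
                for j in range(n)) for k in range(n)]

density_a = [0.0, 1.0, 0.0, 0.0]   # a "feature" at position 1
density_b = [0.0, 0.0, 0.0, 1.0]   # the same feature shifted to position 3

intensities_a = [abs(f) ** 2 for f in dft(density_a)]
intensities_b = [abs(f) ** 2 for f in dft(density_b)]

# Identical intensities despite different densities: the distinguishing
# information lives entirely in the discarded phases.
print(all(abs(a - b) < 1e-9 for a, b in zip(intensities_a, intensities_b)))  # True
```

Real phasing methods (multiple isomorphous replacement, anomalous dispersion, molecular replacement) supply extra information that pins down the lost phases.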

Fluorescence spectroscopy

Fluorescence spectroscopy is a highly sensitive method for studying the folding state of proteins. Three amino acids, phenylalanine (Phe), tyrosine (Tyr) and tryptophan (Trp), have intrinsic fluorescence properties, but only Tyr and Trp are used experimentally because their quantum yields are high enough to give good fluorescence signals. Both Trp and Tyr are excited by a wavelength of 280 nm, whereas only Trp is excited by a wavelength of 295 nm. Because of their aromatic character, Trp and Tyr residues are often found fully or partially buried in the hydrophobic core of proteins, at the interface between two protein domains, or at the interface between subunits of oligomeric proteins. In this apolar environment, they have high quantum yields and therefore high fluorescence intensities. Upon disruption of the protein's tertiary or quaternary structure, these side chains become more exposed to the hydrophilic environment of the solvent, and their quantum yields decrease, leading to low fluorescence intensities. For Trp residues, the wavelength of their maximal fluorescence emission also depends on their environment.

Fluorescence spectroscopy can be used to characterize the equilibrium unfolding of proteins by measuring the variation in the intensity of fluorescence emission or in the wavelength of maximal emission as functions of a denaturant value. The denaturant can be a chemical molecule (urea, guanidinium hydrochloride), temperature, pH, pressure, etc. The equilibrium between the different but discrete protein states, i.e. native state, intermediate states, unfolded state, depends on the denaturant value; therefore, the global fluorescence signal of their equilibrium mixture also depends on this value. One thus obtains a profile relating the global protein signal to the denaturant value. The profile of equilibrium unfolding may enable one to detect and identify intermediates of unfolding. General equations have been developed by Hugues Bedouelle to obtain the thermodynamic parameters that characterize the unfolding equilibria for homomeric or heteromeric proteins, up to trimers and potentially tetramers, from such profiles. Fluorescence spectroscopy can be combined with fast-mixing devices such as stopped flow to measure protein folding kinetics, generate a chevron plot and perform a Phi-value analysis.
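The simplest case of such an equilibrium unfolding profile, a two-state native/unfolded transition under the widely used linear extrapolation model, can be sketched numerically. This is a simplified illustration rather than the general multi-state equations mentioned above, and the stability and m-value parameters below are invented for illustration:

```python
# Sketch of a two-state (native <-> unfolded) equilibrium unfolding curve
# under the linear extrapolation model: dG([D]) = dG_H2O - m*[D].
# All parameter values are assumptions chosen for illustration.
import math

R = 8.314e-3        # gas constant, kJ/(mol*K)
T = 298.0           # temperature, K
dG_H2O = 20.0       # folding stability in water, kJ/mol (assumed)
m_value = 5.0       # denaturant dependence, kJ/(mol*M) (assumed)

def fraction_folded(denaturant_M: float) -> float:
    dG = dG_H2O - m_value * denaturant_M   # stability at this [denaturant]
    K_unfold = math.exp(-dG / (R * T))     # unfolding equilibrium constant
    return 1.0 / (1.0 + K_unfold)

# Transition midpoint is where dG = 0, i.e. [D] = dG_H2O / m = 4 M here.
print(round(fraction_folded(0.0), 3))  # ~1.0: essentially fully folded
print(round(fraction_folded(4.0), 3))  # 0.5: transition midpoint
print(round(fraction_folded(8.0), 3))  # ~0.0: essentially unfolded
```

Fitting a measured fluorescence (or circular dichroism) signal to this sigmoid yields the stability in water and the m value, which is the analysis the denaturant-melt experiments described here perform.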

Circular dichroism

Circular dichroism is one of the most general and basic tools to study protein folding. Circular dichroism spectroscopy measures the absorption of circularly polarized light. In proteins, structures such as alpha helices and beta sheets are chiral, and thus absorb such light. The absorption of this light acts as a marker of the degree of foldedness of the protein ensemble. This technique has been used to measure equilibrium unfolding of the protein by measuring the change in this absorption as a function of denaturant concentration or temperature. A denaturant melt measures the free energy of unfolding as well as the protein's m value, or denaturant dependence. A temperature melt measures the denaturation temperature (Tm) of the protein. As with fluorescence spectroscopy, circular-dichroism spectroscopy can be combined with fast-mixing devices such as stopped flow to measure protein folding kinetics and to generate chevron plots.

Vibrational circular dichroism of proteins

The more recent developments of vibrational circular dichroism (VCD) techniques for proteins, currently involving Fourier transform (FT) instruments, provide powerful means for determining protein conformations in solution even for very large protein molecules. Such VCD studies of proteins can be combined with X-ray diffraction data for protein crystals, FT-IR data for protein solutions in heavy water (D2O), or quantum computations.

Protein nuclear magnetic resonance spectroscopy

Protein nuclear magnetic resonance (NMR) collects protein structural data by applying a magnetic field to samples of concentrated protein. In NMR, depending on the chemical environment, certain nuclei absorb specific radio frequencies. Because protein structural changes operate on time scales from ns to ms, NMR is especially equipped to study intermediate structures on timescales of ps to s. Some of the main techniques for studying protein structure and non-folding structural changes include COSY, TOCSY, HSQC, time relaxation (T1 and T2), and NOE. NOE is especially useful because magnetization transfers can be observed between spatially proximal hydrogens. Different NMR experiments have varying degrees of timescale sensitivity appropriate for different protein structural changes. NOE can pick up bond vibrations or side-chain rotations; however, it cannot capture protein folding itself, which occurs on a larger timescale.

Timescale of protein structural changes matched with NMR experiments. For protein folding, CPMG Relaxation Dispersion (CPMG RD) and chemical exchange saturation transfer (CEST) collect data in the appropriate timescale.

Because protein folding takes place at rates of about 50 to 3000 s−1, CPMG relaxation dispersion and chemical exchange saturation transfer have become some of the primary techniques for NMR analysis of folding. In addition, both techniques are used to uncover excited intermediate states in the protein folding landscape. To do this, CPMG relaxation dispersion takes advantage of the spin echo phenomenon. This technique exposes the target nuclei to a 90° pulse followed by one or more 180° pulses. As the nuclei refocus, a broad distribution indicates that the target nuclei are involved in an intermediate excited state. By looking at relaxation dispersion plots, one can collect information on the thermodynamics and kinetics between the excited and ground states. Saturation transfer measures changes in the signal from the ground state as excited states become perturbed. It uses weak radio-frequency irradiation to saturate the excited state of a particular nucleus, which transfers its saturation to the ground state. This signal is amplified by decreasing the magnetization (and the signal) of the ground state.

The main limitations of NMR are that its resolution decreases for proteins larger than about 25 kDa and that it is not as detailed as X-ray crystallography. Additionally, protein NMR analysis is quite difficult and can yield multiple solutions consistent with the same NMR spectrum.

In a study of the folding of SOD1, a protein implicated in amyotrophic lateral sclerosis, excited intermediates were examined with relaxation dispersion and saturation transfer. Many disease-causing SOD1 mutants had previously been assumed to be involved in protein aggregation, but the mechanism was unknown. Relaxation dispersion and saturation transfer experiments uncovered many excited intermediate states that misfold in the SOD1 mutants.

Dual-polarization interferometry

Dual polarisation interferometry is a surface-based technique for measuring the optical properties of molecular layers. When used to characterize protein folding, it measures the conformation by determining the overall size of a protein monolayer and its density in real time at sub-Ångström resolution, although real-time measurement of folding kinetics is limited to processes slower than about 10 Hz. Similar to circular dichroism, the stimulus for folding can be a denaturant or temperature.

Studies of folding with high time resolution

The study of protein folding has been greatly advanced in recent years by the development of fast, time-resolved techniques. Experimenters rapidly trigger the folding of a sample of unfolded protein and observe the resulting dynamics. Fast techniques in use include neutron scattering, ultrafast mixing of solutions, photochemical methods, and laser temperature jump spectroscopy. Among the many scientists who have contributed to the development of these techniques are Jeremy Cook, Heinrich Roder, Terry Oas, Harry Gray, Martin Gruebele, Brian Dyer, William Eaton, Sheena Radford, Chris Dobson, Alan Fersht, Bengt Nölting and Lars Konermann.

Proteolysis

Proteolysis is routinely used to probe the fraction unfolded under a wide range of solution conditions (e.g., fast parallel proteolysis, FASTpp).

Single-molecule force spectroscopy

Single-molecule techniques such as optical tweezers and AFM have been used to understand protein folding mechanisms of isolated proteins as well as proteins with chaperones. Optical tweezers have been used to stretch single protein molecules from their C- and N-termini and unfold them, allowing study of the subsequent refolding. The technique allows folding rates to be measured at the single-molecule level; for example, optical tweezers have recently been applied to study the folding and unfolding of proteins involved in blood coagulation. von Willebrand factor (vWF) is a protein with an essential role in blood clot formation. Using single-molecule optical tweezers measurements, it was discovered that calcium-bound vWF acts as a shear-force sensor in the blood: shear force leads to unfolding of the A2 domain of vWF, whose refolding rate is dramatically enhanced in the presence of calcium. Recently, it was also shown that the simple src SH3 domain accesses multiple unfolding pathways under force.
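In such pulling experiments, the force-extension behavior of the unfolded polypeptide is commonly modeled as a worm-like chain. A minimal sketch using the Marko-Siggia interpolation formula, with assumed (illustrative) contour and persistence lengths for a ~120-residue unfolded domain:

```python
def wlc_force(extension, contour_length, persistence_length, kbt=4.114):
    """Marko-Siggia worm-like-chain interpolation: force (pN) required
    to stretch the chain to a given end-to-end extension (nm).
    kbt = 4.114 pN*nm at room temperature (~25 C)."""
    x = extension / contour_length  # fractional extension, must be < 1
    return (kbt / persistence_length) * (0.25 / (1.0 - x) ** 2 - 0.25 + x)

# Assumed numbers: ~0.38 nm contour length per residue for a
# 120-residue unfolded chain, persistence length ~0.65 nm.
Lc, Lp = 120 * 0.38, 0.65
f_low = wlc_force(0.5 * Lc, Lc, Lp)   # gentle stretching
f_high = wlc_force(0.9 * Lc, Lc, Lp)  # near full extension

# Force diverges as the extension approaches the contour length, which
# is why unfolding events appear as abrupt length jumps in the data.
assert f_high > f_low > 0.0
```

Fitting this curve to the measured length gain on unfolding is how the number of released residues, and hence the identity of the unfolded domain, is typically inferred.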

Biotin painting

Biotin painting enables condition-specific cellular snapshots of (un)folded proteins. Biotin 'painting' shows a bias towards predicted Intrinsically disordered proteins.

Computational studies of protein folding

Computational studies of protein folding address three main problems: the prediction of protein stability, kinetics, and structure. A 2013 review summarizes the available computational methods for protein folding.

Levinthal's paradox

In 1969, Cyrus Levinthal noted that, because of the very large number of degrees of freedom in an unfolded polypeptide chain, the molecule has an astronomical number of possible conformations. An estimate of 3^300, or about 10^143, was made in one of his papers. Levinthal's paradox is a thought experiment based on the observation that if a protein were folded by sequential sampling of all possible conformations, it would take an astronomical amount of time to do so, even if the conformations were sampled at a rapid rate (on the nanosecond or picosecond scale). Based upon the observation that proteins fold much faster than this, Levinthal then proposed that a random conformational search does not occur, and the protein must, therefore, fold through a series of meta-stable intermediate states.
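The arithmetic behind the paradox is easy to reproduce; the numbers below are the standard textbook assumptions (3^300 conformations, one conformation sampled per picosecond), not measurements:

```python
# Back-of-the-envelope version of Levinthal's argument: exhaustively
# sampling 3^300 conformations at one conformation per picosecond.
conformations = 3 ** 300          # Levinthal's estimate, about 1e143
rate_per_second = 1e12            # one conformation per picosecond
seconds = conformations / rate_per_second
years = seconds / (3600 * 24 * 365)

assert conformations > 10 ** 143
# The search would take vastly longer than the age of the universe
# (~1.4e10 years), yet real proteins fold in milliseconds to seconds.
assert years > 1e120
```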

Energy landscape of protein folding

The energy funnel by which an unfolded polypeptide chain assumes its native structure

The configuration space of a protein during folding can be visualized as an energy landscape. According to Joseph Bryngelson and Peter Wolynes, proteins follow the principle of minimal frustration, meaning that naturally evolved proteins have optimized their folding energy landscapes, and that nature has chosen amino acid sequences so that the folded state of the protein is sufficiently stable. In addition, the acquisition of the folded state had to become a sufficiently fast process. Even though nature has reduced the level of frustration in proteins, some degree of frustration remains, as can be observed in the presence of local minima in the energy landscape of proteins.

A consequence of these evolutionarily selected sequences is that proteins are generally thought to have globally "funneled energy landscapes" (a term coined by José Onuchic) that are largely directed toward the native state. This "folding funnel" landscape allows the protein to fold to the native state through any of a large number of pathways and intermediates, rather than being restricted to a single mechanism. The theory is supported by both computational simulations of model proteins and experimental studies, and it has been used to improve methods for protein structure prediction and design. The description of protein folding by the leveling free-energy landscape is also consistent with the second law of thermodynamics. Physically, thinking of landscapes in terms of visualizable potential or total energy surfaces simply with maxima, saddle points, minima, and funnels, rather like geographic landscapes, is perhaps a little misleading. The relevant description is really a high-dimensional phase space in which manifolds might take a variety of more complicated topological forms.

The unfolded polypeptide chain begins at the top of the funnel where it may assume the largest number of unfolded variations and is in its highest energy state. Energy landscapes such as these indicate that there are a large number of initial possibilities, but only a single native state is possible; however, it does not reveal the numerous folding pathways that are possible. A different molecule of the same exact protein may be able to follow marginally different folding pathways, seeking different lower energy intermediates, as long as the same native structure is reached. Different pathways may have different frequencies of utilization depending on the thermodynamic favorability of each pathway. This means that if one pathway is found to be more thermodynamically favorable than another, it is likely to be used more frequently in the pursuit of the native structure. As the protein begins to fold and assume its various conformations, it always seeks a more thermodynamically favorable structure than before and thus continues through the energy funnel. Formation of secondary structures is a strong indication of increased stability within the protein, and only one combination of secondary structures assumed by the polypeptide backbone will have the lowest energy and therefore be present in the native state of the protein. Among the first structures to form once the polypeptide begins to fold are alpha helices and beta turns, where alpha helices can form in as little as 100 nanoseconds and beta turns in 1 microsecond.

There exists a saddle point in the energy funnel landscape where the transition state for a particular protein is found. The transition state in the energy funnel diagram is the conformation that must be assumed by every molecule of that protein if the protein wishes to finally assume the native structure. No protein may assume the native structure without first passing through the transition state. The transition state can be referred to as a variant or premature form of the native state rather than just another intermediary step. The folding of the transition state is shown to be rate-determining, and even though it exists in a higher energy state than the native fold, it greatly resembles the native structure. Within the transition state, there exists a nucleus around which the protein is able to fold, formed by a process referred to as "nucleation condensation" where the structure begins to collapse onto the nucleus.

Modeling of protein folding

Folding@home uses Markov state models, like the one diagrammed here, to model the possible shapes and folding pathways a protein can take as it condenses from its initial randomly coiled state (left) into its native 3D structure (right).

De novo or ab initio techniques for computational protein structure prediction can be used for simulating various aspects of protein folding. Molecular dynamics (MD) was used in simulations of protein folding and dynamics in silico. First equilibrium folding simulations were done using implicit solvent model and umbrella sampling. Because of computational cost, ab initio MD folding simulations with explicit water are limited to peptides and small proteins. MD simulations of larger proteins remain restricted to dynamics of the experimental structure or its high-temperature unfolding. Long-time folding processes (beyond about 1 millisecond), like folding of larger proteins (>150 residues) can be accessed using coarse-grained models.
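The Markov state models used by Folding@home (diagrammed in the figure above) coarse-grain many short MD trajectories into a transition matrix between conformational states. A toy sketch with three invented states and made-up transition probabilities, showing how the model predicts the equilibrium population of the native state:

```python
# Toy Markov state model with three coarse conformational states:
# 0 = unfolded, 1 = intermediate, 2 = native. These transition
# probabilities are invented for illustration; real MSMs estimate
# them from large ensembles of short MD trajectories.
T = [
    [0.90, 0.09, 0.01],  # unfolded mostly stays unfolded
    [0.05, 0.80, 0.15],  # intermediate can commit to the native state
    [0.00, 0.02, 0.98],  # native state is nearly absorbing
]

def propagate(p, T, steps):
    """Evolve a state-probability vector p under transition matrix T."""
    for _ in range(steps):
        p = [sum(p[i] * T[i][j] for i in range(3)) for j in range(3)]
    return p

p = propagate([1.0, 0.0, 0.0], T, 500)  # start fully unfolded
assert abs(sum(p) - 1.0) < 1e-9         # probability is conserved
assert p[2] > p[0] and p[2] > p[1]      # equilibrium favors the native state
```

Because each matrix multiplication only needs local transition probabilities, many independent short simulations can be stitched into long-timescale folding predictions, which is exactly what makes the approach suited to distributed computing.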

Several large-scale computational projects, such as Rosetta@home, Folding@home, and Foldit, target protein folding.

Long continuous-trajectory simulations have been performed on Anton, a massively parallel supercomputer designed and built around custom ASICs and interconnects by D. E. Shaw Research. The longest published result of a simulation performed using Anton as of 2011 was a 2.936 millisecond simulation of NTL9 at 355 K. Such simulations are currently able to unfold and refold small proteins (<150 amino acids residues) in equilibrium and predict how mutations affect folding kinetics and stability.

In 2020, a team of researchers using AlphaFold, an artificial intelligence (AI) protein structure prediction program developed by DeepMind, placed first in CASP, a long-standing structure prediction contest. The team achieved a level of accuracy much higher than any other group. It scored above 90 for around two-thirds of the proteins in CASP's global distance test (GDT), a test that measures the degree of similarity between the structure predicted by a computational program and the empirical structure determined experimentally in a lab. A score of 100 is considered a complete match, within the distance cutoff used for calculating GDT.
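The GDT_TS score can be sketched directly from its definition: the mean, over several distance cutoffs, of the percentage of residues whose predicted position falls within that cutoff of the reference. The version below is simplified in that it assumes the two structures are already optimally superimposed (the real CASP calculation searches over superpositions); the coordinates are invented for illustration:

```python
import math

def gdt_ts(predicted, reference, cutoffs=(1.0, 2.0, 4.0, 8.0)):
    """Simplified global distance test (GDT_TS): the mean, over the
    standard cutoffs of 1, 2, 4, and 8 angstroms, of the fraction of
    residues whose predicted C-alpha lies within that cutoff of the
    reference position. Assumes pre-superimposed structures."""
    dists = [math.dist(p, r) for p, r in zip(predicted, reference)]
    n = len(dists)
    fractions = [sum(d <= c for d in dists) / n for c in cutoffs]
    return 100.0 * sum(fractions) / len(cutoffs)

# A perfect prediction scores 100; displacing one of four residues
# by 3 angstroms fails the 1 A and 2 A cutoffs for that residue.
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
pred = [ref[0], ref[1], ref[2], (11.4, 3.0, 0.0)]
assert gdt_ts(ref, ref) == 100.0
assert gdt_ts(pred, ref) == 87.5
```

Averaging over multiple cutoffs is what makes GDT more forgiving than RMSD, which a single badly placed loop can dominate.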

AlphaFold's protein structure prediction results at CASP were described as "transformational" and "astounding". Some researchers noted that the accuracy is not high enough for a third of its predictions, and that it does not reveal the physical mechanism of protein folding for the protein folding problem to be considered solved. Nevertheless, it is considered a significant achievement in computational biology and great progress towards a decades-old grand challenge of biology, predicting the structure of proteins.
