Saturday, August 31, 2024

Metabolism

From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Metabolism
Simplified view of the cellular metabolism
Structure of adenosine triphosphate (ATP), a central intermediate in energy metabolism

Metabolism (/məˈtæbəlɪzəm/, from Greek: μεταβολή metabolē, "change") is the set of life-sustaining chemical reactions in organisms. The three main functions of metabolism are: the conversion of the energy in food to energy available to run cellular processes; the conversion of food to building blocks of proteins, lipids, nucleic acids, and some carbohydrates; and the elimination of metabolic wastes. These enzyme-catalyzed reactions allow organisms to grow and reproduce, maintain their structures, and respond to their environments. The word metabolism can also refer to the sum of all chemical reactions that occur in living organisms, including digestion and the transportation of substances into and between different cells, in which case the above described set of reactions within the cells is called intermediary (or intermediate) metabolism.

Metabolic reactions may be categorized as catabolic—the breaking down of compounds (for example, of glucose to pyruvate by cellular respiration); or anabolic—the building up (synthesis) of compounds (such as proteins, carbohydrates, lipids, and nucleic acids). Usually, catabolism releases energy, and anabolism consumes energy.

The chemical reactions of metabolism are organized into metabolic pathways, in which one chemical is transformed through a series of steps into another chemical, each step being facilitated by a specific enzyme. Enzymes are crucial to metabolism because they allow organisms to drive desirable reactions that require energy and will not occur by themselves, by coupling them to spontaneous reactions that release energy. Enzymes act as catalysts—they allow a reaction to proceed more rapidly—and they also allow the regulation of the rate of a metabolic reaction, for example in response to changes in the cell's environment or to signals from other cells.

The metabolic system of a particular organism determines which substances it will find nutritious and which poisonous. For example, some prokaryotes use hydrogen sulfide as a nutrient, yet this gas is poisonous to animals. The basal metabolic rate of an organism is the measure of the amount of energy consumed by all of these chemical reactions.

A striking feature of metabolism is the similarity of the basic metabolic pathways among vastly different species. For example, the set of carboxylic acids that are best known as the intermediates in the citric acid cycle are present in all known organisms, being found in species as diverse as the unicellular bacterium Escherichia coli and huge multicellular organisms like elephants. These similarities in metabolic pathways are likely due to their early appearance in evolutionary history, and their retention is likely due to their efficacy. In various diseases, such as type II diabetes, metabolic syndrome, and cancer, normal metabolism is disrupted. The metabolism of cancer cells is also different from the metabolism of normal cells, and these differences can be used to find targets for therapeutic intervention in cancer.

Key biochemicals

Structure of a triacylglycerol lipid
This is a diagram depicting a large set of human metabolic pathways.

Most of the structures that make up animals, plants and microbes are made from four basic classes of molecules: amino acids, carbohydrates, nucleic acids and lipids (often called fats). As these molecules are vital for life, metabolic reactions either focus on making these molecules during the construction of cells and tissues, or on breaking them down and using them to obtain energy, by their digestion. These biochemicals can be joined to make polymers such as DNA and proteins, essential macromolecules of life.

Type of molecule | Name of monomer forms | Name of polymer forms | Examples of polymer forms
Amino acids | Amino acids | Proteins (made of polypeptides) | Fibrous proteins and globular proteins
Carbohydrates | Monosaccharides | Polysaccharides | Starch, glycogen and cellulose
Nucleic acids | Nucleotides | Polynucleotides | DNA and RNA

Amino acids and proteins

Proteins are made of amino acids arranged in a linear chain joined by peptide bonds. Many proteins are enzymes that catalyze the chemical reactions in metabolism. Other proteins have structural or mechanical functions, such as those that form the cytoskeleton, a system of scaffolding that maintains the cell shape. Proteins are also important in cell signaling, immune responses, cell adhesion, active transport across membranes, and the cell cycle. Amino acids also contribute to cellular energy metabolism by providing a carbon source for entry into the citric acid cycle (tricarboxylic acid cycle), especially when a primary source of energy, such as glucose, is scarce, or when cells undergo metabolic stress.

Lipids

Lipids are the most diverse group of biochemicals. Their main structural uses are as part of internal and external biological membranes, such as the cell membrane. Their chemical energy can also be used. Lipids contain a long, non-polar hydrocarbon chain with a small polar region containing oxygen. Lipids are usually defined as hydrophobic or amphipathic biological molecules but will dissolve in organic solvents such as ethanol, benzene or chloroform. The fats are a large group of compounds that contain fatty acids and glycerol; a glycerol molecule attached to three fatty acids by ester linkages is called a triacylglyceride. Several variations of the basic structure exist, including backbones such as sphingosine in sphingomyelin, and hydrophilic groups such as phosphate in phospholipids. Steroids such as sterol are another major class of lipids.

Carbohydrates

The straight-chain form consists of four CHOH groups linked in a row, capped at the ends by an aldehyde group (COH) and a methanol group (CH2OH). To form the ring, the aldehyde group combines with the OH group of the next-to-last carbon at the other end, just before the methanol group.
Glucose can exist in both a straight-chain and ring form.

Carbohydrates are aldehydes or ketones, with many hydroxyl groups attached, that can exist as straight chains or rings. Carbohydrates are the most abundant biological molecules, and fill numerous roles, such as the storage and transport of energy (starch, glycogen) and structural components (cellulose in plants, chitin in animals). The basic carbohydrate units are called monosaccharides and include galactose, fructose, and most importantly glucose. Monosaccharides can be linked together to form polysaccharides in almost limitless ways.

Nucleotides

The two nucleic acids, DNA and RNA, are polymers of nucleotides. Each nucleotide is composed of a phosphate attached to a ribose or deoxyribose sugar group which is attached to a nitrogenous base. Nucleic acids are critical for the storage and use of genetic information, and its interpretation through the processes of transcription and protein biosynthesis. This information is protected by DNA repair mechanisms and propagated through DNA replication. Many viruses have an RNA genome, such as HIV, which uses reverse transcription to create a DNA template from its viral RNA genome. RNA in ribozymes such as spliceosomes and ribosomes is similar to enzymes as it can catalyze chemical reactions. Individual nucleosides are made by attaching a nucleobase to a ribose sugar. These bases are heterocyclic rings containing nitrogen, classified as purines or pyrimidines. Nucleotides also act as coenzymes in metabolic-group-transfer reactions.

Coenzymes

Structure of the coenzyme acetyl-CoA. The transferable acetyl group is bonded to the sulfur atom at the extreme left.

Metabolism involves a vast array of chemical reactions, but most fall under a few basic types of reactions that involve the transfer of functional groups of atoms and their bonds within molecules. This common chemistry allows cells to use a small set of metabolic intermediates to carry chemical groups between different reactions. These group-transfer intermediates are called coenzymes. Each class of group-transfer reactions is carried out by a particular coenzyme, which is the substrate for a set of enzymes that produce it, and a set of enzymes that consume it. These coenzymes are therefore continuously made, consumed and then recycled.

One central coenzyme is adenosine triphosphate (ATP), the energy currency of cells. This nucleotide is used to transfer chemical energy between different chemical reactions. There is only a small amount of ATP in cells, but as it is continuously regenerated, the human body can use about its own weight in ATP per day. ATP acts as a bridge between catabolism and anabolism. Catabolism breaks down molecules, and anabolism puts them together. Catabolic reactions generate ATP, and anabolic reactions consume it. It also serves as a carrier of phosphate groups in phosphorylation reactions.

A vitamin is an organic compound needed in small quantities that cannot be made in cells. In human nutrition, most vitamins function as coenzymes after modification; for example, all water-soluble vitamins are phosphorylated or are coupled to nucleotides when they are used in cells. Nicotinamide adenine dinucleotide (NAD+), a derivative of vitamin B3 (niacin), is an important coenzyme that acts as a hydrogen acceptor. Hundreds of separate types of dehydrogenases remove electrons from their substrates and reduce NAD+ into NADH. This reduced form of the coenzyme is then a substrate for any of the reductases in the cell that need to transfer hydrogen atoms to their substrates. Nicotinamide adenine dinucleotide exists in two related forms in the cell, NADH and NADPH. The NAD+/NADH form is more important in catabolic reactions, while NADP+/NADPH is used in anabolic reactions.

The structure of iron-containing hemoglobin. The protein subunits are in red and blue, and the iron-containing heme groups in green.

Minerals and cofactors

Inorganic elements play critical roles in metabolism; some are abundant (e.g. sodium and potassium) while others function at minute concentrations. About 99% of a human's body weight is made up of the elements carbon, nitrogen, calcium, sodium, chlorine, potassium, hydrogen, phosphorus, oxygen and sulfur. Organic compounds (proteins, lipids and carbohydrates) contain the majority of the carbon and nitrogen; most of the oxygen and hydrogen is present as water.

The abundant inorganic elements act as electrolytes. The most important ions are sodium, potassium, calcium, magnesium, chloride, phosphate and the organic ion bicarbonate. The maintenance of precise ion gradients across cell membranes maintains osmotic pressure and pH. Ions are also critical for nerve and muscle function, as action potentials in these tissues are produced by the exchange of electrolytes between the extracellular fluid and the cell's fluid, the cytosol. Electrolytes enter and leave cells through proteins in the cell membrane called ion channels. For example, muscle contraction depends upon the movement of calcium, sodium and potassium through ion channels in the cell membrane and T-tubules.

Transition metals are usually present as trace elements in organisms, with zinc and iron being most abundant of those. Metal cofactors are bound tightly to specific sites in proteins; although enzyme cofactors can be modified during catalysis, they always return to their original state by the end of the reaction catalyzed. Metal micronutrients are taken up into organisms by specific transporters and bind to storage proteins such as ferritin or metallothionein when not in use.

Catabolism

Catabolism is the set of metabolic processes that break down large molecules. These include breaking down and oxidizing food molecules. The purpose of the catabolic reactions is to provide the energy and components needed by anabolic reactions which build molecules. The exact nature of these catabolic reactions differs from organism to organism, and organisms can be classified based on their sources of energy, hydrogen, and carbon (their primary nutritional groups), as shown in the table below. Organic molecules are used as a source of hydrogen atoms or electrons by organotrophs, while lithotrophs use inorganic substrates. Whereas phototrophs convert sunlight to chemical energy, chemotrophs depend on redox reactions that involve the transfer of electrons from reduced donor molecules such as organic molecules, hydrogen, hydrogen sulfide or ferrous ions to oxygen, nitrate or sulfate. In animals, these reactions involve complex organic molecules that are broken down to simpler molecules, such as carbon dioxide and water. Photosynthetic organisms, such as plants and cyanobacteria, use similar electron-transfer reactions to store energy absorbed from sunlight.

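Classification of organisms based on their metabolism
Criterion | Source | Prefix
Energy source | sunlight | photo-
Energy source | molecules | chemo-
Hydrogen or electron donor | organic compound | organo-
Hydrogen or electron donor | inorganic compound | litho-
Carbon source | organic compound | hetero-
Carbon source | inorganic compound | auto-
All names end in the suffix -troph.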
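
The prefixes in the table above can be chained into a single descriptive name, for instance chemoorganoheterotroph for an animal. A minimal Python sketch of that naming scheme follows; the function name trophic_name and the dictionary layout are illustrative assumptions, not an established API.

```python
# Prefixes taken from the classification table above.
ENERGY = {"sunlight": "photo", "molecules": "chemo"}
ELECTRON_DONOR = {"organic compound": "organo", "inorganic compound": "litho"}
CARBON = {"organic compound": "hetero", "inorganic compound": "auto"}


def trophic_name(energy: str, electron_donor: str, carbon: str) -> str:
    """Join the three prefixes and the -troph suffix into one name.

    Hypothetical helper for illustration; raises KeyError for a source
    that does not appear in the table.
    """
    return ENERGY[energy] + ELECTRON_DONOR[electron_donor] + CARBON[carbon] + "troph"


if __name__ == "__main__":
    # An animal: chemical energy, organic electron donor, organic carbon source.
    print(trophic_name("molecules", "organic compound", "organic compound"))
```

In practice the full three-part name is often shortened (for example to chemoheterotroph) when one of the criteria is clear from context.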

The most common set of catabolic reactions in animals can be separated into three main stages. In the first stage, large organic molecules, such as proteins, polysaccharides or lipids, are digested into their smaller components outside cells. Next, these smaller molecules are taken up by cells and converted to smaller molecules, usually acetyl coenzyme A (acetyl-CoA), which releases some energy. Finally, the acetyl group on acetyl-CoA is oxidized to water and carbon dioxide in the citric acid cycle and electron transport chain, releasing more energy while reducing the coenzyme nicotinamide adenine dinucleotide (NAD+) into NADH.

Digestion

Macromolecules cannot be directly processed by cells. Macromolecules must be broken into smaller units before they can be used in cell metabolism. Different classes of enzymes are used to digest these polymers. These digestive enzymes include proteases that digest proteins into amino acids, as well as glycoside hydrolases that digest polysaccharides into simple sugars known as monosaccharides.

Microbes simply secrete digestive enzymes into their surroundings, while animals only secrete these enzymes from specialized cells in their guts, including the stomach and pancreas, and in salivary glands. The amino acids or sugars released by these extracellular enzymes are then pumped into cells by active transport proteins.

A simplified outline of the catabolism of proteins, carbohydrates and fats

Energy from organic compounds

Carbohydrate catabolism is the breakdown of carbohydrates into smaller units. Carbohydrates are usually taken into cells after they have been digested into monosaccharides such as glucose and fructose. Once inside, the major route of breakdown is glycolysis, in which glucose is converted into pyruvate. This process generates the energy-conveying molecule NADH from NAD+, and generates ATP from ADP for use in powering many processes within the cell. Pyruvate is an intermediate in several metabolic pathways, but the majority is converted to acetyl-CoA and fed into the citric acid cycle, which enables more ATP production by means of oxidative phosphorylation. This oxidation consumes molecular oxygen and releases water and the waste product carbon dioxide. When oxygen is lacking, or when pyruvate is temporarily produced faster than it can be consumed by the citric acid cycle (as in intense muscular exertion), pyruvate is converted to lactate by the enzyme lactate dehydrogenase, a process that also oxidizes NADH back to NAD+ for re-use in further glycolysis, allowing energy production to continue. The lactate is later converted back to pyruvate for ATP production where energy is needed, or back to glucose in the Cori cycle. An alternative route for glucose breakdown is the pentose phosphate pathway, which produces less energy but supports anabolism (biomolecule synthesis). This pathway reduces the coenzyme NADP+ to NADPH and produces pentose compounds such as ribose 5-phosphate for synthesis of many biomolecules such as nucleotides and aromatic amino acids.
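
For reference, the net stoichiometry of two of the steps described above can be written out. These are the standard textbook net equations, added here for illustration rather than taken from the text; P_i denotes inorganic phosphate.

```latex
% Net reaction of glycolysis: one glucose yields two pyruvate
\[
\mathrm{Glucose} + 2\,\mathrm{NAD^+} + 2\,\mathrm{ADP} + 2\,\mathrm{P_i}
\longrightarrow
2\,\mathrm{Pyruvate} + 2\,\mathrm{NADH} + 2\,\mathrm{H^+} + 2\,\mathrm{ATP} + 2\,\mathrm{H_2O}
\]

% Lactate dehydrogenase: regenerates NAD+ so that glycolysis can continue
\[
\mathrm{Pyruvate} + \mathrm{NADH} + \mathrm{H^+}
\rightleftharpoons
\mathrm{Lactate} + \mathrm{NAD^+}
\]
```

The second equation shows why lactate formation lets glycolysis continue: each pyruvate reduced to lactate returns one NAD+ to the pool consumed by the first equation.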

Carbon Catabolism pathway map for free energy including carbohydrate and lipid sources of energy

Fats are catabolized by hydrolysis to free fatty acids and glycerol. The glycerol enters glycolysis and the fatty acids are broken down by beta oxidation to release acetyl-CoA, which then is fed into the citric acid cycle. Fatty acids release more energy upon oxidation than carbohydrates. Steroids are also broken down by some bacteria in a process similar to beta oxidation, and this breakdown process involves the release of significant amounts of acetyl-CoA, propionyl-CoA, and pyruvate, which can all be used by the cell for energy. M. tuberculosis can also grow on the lipid cholesterol as a sole source of carbon, and genes involved in the cholesterol-use pathway(s) have been validated as important during various stages of the infection lifecycle of M. tuberculosis.

Amino acids are either used to synthesize proteins and other biomolecules, or oxidized to urea and carbon dioxide to produce energy. The oxidation pathway starts with the removal of the amino group by a transaminase. The amino group is fed into the urea cycle, leaving a deaminated carbon skeleton in the form of a keto acid. Several of these keto acids are intermediates in the citric acid cycle, for example α-ketoglutarate formed by deamination of glutamate. The glucogenic amino acids can also be converted into glucose, through gluconeogenesis.

Energy transformations

Oxidative phosphorylation

In oxidative phosphorylation, the electrons removed from organic molecules in areas such as the citric acid cycle are transferred to oxygen and the energy released is used to make ATP. This is done in eukaryotes by a series of proteins in the membranes of mitochondria called the electron transport chain. In prokaryotes, these proteins are found in the cell's inner membrane. These proteins use the energy from reduced molecules like NADH to pump protons across a membrane.

Mechanism of ATP synthase. ATP is shown in red, ADP and phosphate in pink and the rotating stalk subunit in black.

Pumping protons out of the mitochondria creates a proton concentration difference across the membrane and generates an electrochemical gradient. This force drives protons back into the mitochondrion through the base of an enzyme called ATP synthase. The flow of protons makes the stalk subunit rotate, causing the active site of the synthase domain to change shape and phosphorylate adenosine diphosphate—turning it into ATP.

Energy from inorganic compounds

Chemolithotrophy is a type of metabolism found in prokaryotes where energy is obtained from the oxidation of inorganic compounds. These organisms can use hydrogen, reduced sulfur compounds (such as sulfide, hydrogen sulfide and thiosulfate), ferrous iron (Fe(II)) or ammonia as sources of reducing power and they gain energy from the oxidation of these compounds. These microbial processes are important in global biogeochemical cycles such as acetogenesis, nitrification and denitrification and are critical for soil fertility.

Energy from light

The energy in sunlight is captured by plants, cyanobacteria, purple bacteria, green sulfur bacteria and some protists. This process is often coupled to the conversion of carbon dioxide into organic compounds, as part of photosynthesis, which is discussed below. The energy capture and carbon fixation systems can, however, operate separately in prokaryotes, as purple bacteria and green sulfur bacteria can use sunlight as a source of energy, while switching between carbon fixation and the fermentation of organic compounds.

In many organisms, the capture of solar energy is similar in principle to oxidative phosphorylation, as it involves the storage of energy as a proton concentration gradient. This proton motive force then drives ATP synthesis. The electrons needed to drive this electron transport chain come from light-gathering proteins called photosynthetic reaction centres. Reaction centres are classified into two types depending on the nature of photosynthetic pigment present, with most photosynthetic bacteria only having one type, while plants and cyanobacteria have two.

In plants, algae, and cyanobacteria, photosystem II uses light energy to remove electrons from water, releasing oxygen as a waste product. The electrons then flow to the cytochrome b6f complex, which uses their energy to pump protons across the thylakoid membrane in the chloroplast. These protons move back through the membrane as they drive the ATP synthase, as before. The electrons then flow through photosystem I and can then be used to reduce the coenzyme NADP+. This coenzyme can enter the Calvin cycle or be recycled for further ATP generation.

Anabolism

Anabolism is the set of constructive metabolic processes where the energy released by catabolism is used to synthesize complex molecules. In general, the complex molecules that make up cellular structures are constructed step-by-step from smaller and simpler precursors. Anabolism involves three basic stages: first, the production of precursors such as amino acids, monosaccharides, isoprenoids and nucleotides; second, their activation into reactive forms using energy from ATP; and third, the assembly of these precursors into complex molecules such as proteins, polysaccharides, lipids and nucleic acids.

Anabolism in organisms can be different according to the source of constructed molecules in their cells. Autotrophs such as plants can construct the complex organic molecules in their cells such as polysaccharides and proteins from simple molecules like carbon dioxide and water. Heterotrophs, on the other hand, require a source of more complex substances, such as monosaccharides and amino acids, to produce these complex molecules. Organisms can be further classified by ultimate source of their energy: photoautotrophs and photoheterotrophs obtain energy from light, whereas chemoautotrophs and chemoheterotrophs obtain energy from oxidation reactions.

Carbon fixation

Plant cells (bounded by purple walls) filled with chloroplasts (green), which are the site of photosynthesis

Photosynthesis is the synthesis of carbohydrates from sunlight and carbon dioxide (CO2). In plants, cyanobacteria and algae, oxygenic photosynthesis splits water, with oxygen produced as a waste product. This process uses the ATP and NADPH produced by the photosynthetic reaction centres, as described above, to convert CO2 into glycerate 3-phosphate, which can then be converted into glucose. This carbon-fixation reaction is carried out by the enzyme RuBisCO as part of the Calvin–Benson cycle. Three types of photosynthesis occur in plants, C3 carbon fixation, C4 carbon fixation and CAM photosynthesis. These differ by the route that carbon dioxide takes to the Calvin cycle, with C3 plants fixing CO2 directly, while C4 and CAM photosynthesis incorporate the CO2 into other compounds first, as adaptations to deal with intense sunlight and dry conditions.
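
The overall process and the carbon-fixation step described above can be summarized as equations. Both are the standard textbook forms, added for illustration; the second omits protons for simplicity, and RuBP stands for ribulose 1,5-bisphosphate.

```latex
% Overall oxygenic photosynthesis, written for one glucose
\[
6\,\mathrm{CO_2} + 6\,\mathrm{H_2O}
\xrightarrow{\text{light}}
\mathrm{C_6H_{12}O_6} + 6\,\mathrm{O_2}
\]

% Carbon fixation catalyzed by RuBisCO
\[
\mathrm{RuBP} + \mathrm{CO_2} + \mathrm{H_2O}
\longrightarrow
2\ \text{glycerate 3-phosphate}
\]
```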

In photosynthetic prokaryotes the mechanisms of carbon fixation are more diverse. Here, carbon dioxide can be fixed by the Calvin–Benson cycle, a reversed citric acid cycle, or the carboxylation of acetyl-CoA. Prokaryotic chemoautotrophs also fix CO2 through the Calvin–Benson cycle, but use energy from inorganic compounds to drive the reaction.

Carbohydrates and glycans

In carbohydrate anabolism, simple organic acids can be converted into monosaccharides such as glucose and then used to assemble polysaccharides such as starch. The generation of glucose from compounds like pyruvate, lactate, glycerol, glycerate 3-phosphate and amino acids is called gluconeogenesis. Gluconeogenesis converts pyruvate to glucose-6-phosphate through a series of intermediates, many of which are shared with glycolysis. However, this pathway is not simply glycolysis run in reverse, as several steps are catalyzed by non-glycolytic enzymes. This is important as it allows the formation and breakdown of glucose to be regulated separately, and prevents both pathways from running simultaneously in a futile cycle.

Although fat is a common way of storing energy, in vertebrates such as humans the fatty acids in these stores cannot be converted to glucose through gluconeogenesis as these organisms cannot convert acetyl-CoA into pyruvate; plants do, but animals do not, have the necessary enzymatic machinery. As a result, after long-term starvation, vertebrates need to produce ketone bodies from fatty acids to replace glucose in tissues such as the brain that cannot metabolize fatty acids. In other organisms such as plants and bacteria, this metabolic problem is solved using the glyoxylate cycle, which bypasses the decarboxylation step in the citric acid cycle and allows the transformation of acetyl-CoA to oxaloacetate, where it can be used for the production of glucose. Besides fat, glucose is stored in most tissues as glycogen, formed through glycogenesis; this provides an energy resource available within the tissue and is commonly used to maintain the glucose level in the blood.

Polysaccharides and glycans are made by the sequential addition of monosaccharides by glycosyltransferase from a reactive sugar-phosphate donor such as uridine diphosphate glucose (UDP-Glc) to an acceptor hydroxyl group on the growing polysaccharide. As any of the hydroxyl groups on the ring of the substrate can be acceptors, the polysaccharides produced can have straight or branched structures. The polysaccharides produced can have structural or metabolic functions themselves, or be transferred to lipids and proteins by the enzymes oligosaccharyltransferases.

Fatty acids, isoprenoids and sterol

Simplified version of the steroid synthesis pathway with the intermediates isopentenyl pyrophosphate (IPP), dimethylallyl pyrophosphate (DMAPP), geranyl pyrophosphate (GPP) and squalene shown. Some intermediates are omitted for clarity.

Fatty acids are made by fatty acid synthases that polymerize and then reduce acetyl-CoA units. The acyl chains in the fatty acids are extended by a cycle of reactions that add the acyl group, reduce it to an alcohol, dehydrate it to an alkene group and then reduce it again to an alkane group. The enzymes of fatty acid biosynthesis are divided into two groups: in animals and fungi, all these fatty acid synthase reactions are carried out by a single multifunctional type I protein, while in plant plastids and bacteria separate type II enzymes perform each step in the pathway.
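
The repeating cycle described above lends itself to simple bookkeeping. The sketch below counts elongation cycles and reduction steps for an even-chain saturated fatty acid. It assumes, as is standard but not stated in the text, that synthesis starts from a two-carbon acetyl primer and that each cycle adds two carbons; the function name elongation_summary is hypothetical.

```python
def elongation_summary(chain_length: int) -> dict:
    """Count elongation cycles and reduction steps for a saturated fatty acid.

    Assumptions (illustrative): the chain starts from one two-carbon acetyl
    primer, each cycle adds two carbons, and each cycle contains the two
    reduction steps named in the text (to an alcohol, then to an alkane group).
    """
    if chain_length < 4 or chain_length % 2 != 0:
        raise ValueError("chain_length must be an even number of at least 4")
    cycles = (chain_length - 2) // 2
    return {"cycles": cycles, "reduction_steps": 2 * cycles}


if __name__ == "__main__":
    # Palmitate, a 16-carbon saturated fatty acid.
    print(elongation_summary(16))
```

For a 16-carbon chain this gives seven cycles and fourteen reduction steps under the stated assumptions.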

Terpenes and isoprenoids are a large class of lipids that include the carotenoids and form the largest class of plant natural products. These compounds are made by the assembly and modification of isoprene units donated from the reactive precursors isopentenyl pyrophosphate and dimethylallyl pyrophosphate. These precursors can be made in different ways. In animals and archaea, the mevalonate pathway produces these compounds from acetyl-CoA, while in plants and bacteria the non-mevalonate pathway uses pyruvate and glyceraldehyde 3-phosphate as substrates. One important reaction that uses these activated isoprene donors is sterol biosynthesis. Here, the isoprene units are joined to make squalene and then folded up and formed into a set of rings to make lanosterol. Lanosterol can then be converted into other sterols such as cholesterol and ergosterol.

Proteins

Organisms vary in their ability to synthesize the 20 common amino acids. Most bacteria and plants can synthesize all twenty, but mammals can only synthesize eleven nonessential amino acids, so nine essential amino acids must be obtained from food. Some simple parasites, such as the bacterium Mycoplasma pneumoniae, lack all amino acid synthesis and take their amino acids directly from their hosts. All amino acids are synthesized from intermediates in glycolysis, the citric acid cycle, or the pentose phosphate pathway. Nitrogen is provided by glutamate and glutamine. Nonessential amino acid synthesis depends on the formation of the appropriate alpha-keto acid, which is then transaminated to form an amino acid.

Amino acids are made into proteins by being joined in a chain of peptide bonds. Each different protein has a unique sequence of amino acid residues: this is its primary structure. Just as the letters of the alphabet can be combined to form an almost endless variety of words, amino acids can be linked in varying sequences to form a huge variety of proteins. Proteins are made from amino acids that have been activated by attachment to a transfer RNA molecule through an ester bond. This aminoacyl-tRNA precursor is produced in an ATP-dependent reaction carried out by an aminoacyl tRNA synthetase. This aminoacyl-tRNA is then a substrate for the ribosome, which joins the amino acid onto the elongating protein chain, using the sequence information in a messenger RNA.

Nucleotide synthesis and salvage

Nucleotides are made from amino acids, carbon dioxide and formic acid in pathways that require large amounts of metabolic energy. Consequently, most organisms have efficient systems to salvage preformed nucleotides. Purines are synthesized as nucleosides (bases attached to ribose). Both adenine and guanine are made from the precursor nucleoside inosine monophosphate, which is synthesized using atoms from the amino acids glycine, glutamine, and aspartic acid, as well as formate transferred from the coenzyme tetrahydrofolate. Pyrimidines, on the other hand, are synthesized from the base orotate, which is formed from glutamine and aspartate.

Xenobiotics and redox metabolism

All organisms are constantly exposed to compounds that they cannot use as foods and that would be harmful if they accumulated in cells, as they have no metabolic function. These potentially damaging compounds are called xenobiotics. Xenobiotics such as synthetic drugs, natural poisons and antibiotics are detoxified by a set of xenobiotic-metabolizing enzymes. In humans, these include cytochrome P450 oxidases, UDP-glucuronosyltransferases, and glutathione S-transferases. This system of enzymes acts in three stages to firstly oxidize the xenobiotic (phase I) and then conjugate water-soluble groups onto the molecule (phase II). The modified water-soluble xenobiotic can then be pumped out of cells and in multicellular organisms may be further metabolized before being excreted (phase III). In ecology, these reactions are particularly important in microbial biodegradation of pollutants and the bioremediation of contaminated land and oil spills. Many of these microbial reactions are shared with multicellular organisms, but due to the incredible diversity of types of microbes these organisms are able to deal with a far wider range of xenobiotics than multicellular organisms, and can degrade even persistent organic pollutants such as organochloride compounds.

A related problem for aerobic organisms is oxidative stress. Here, processes including oxidative phosphorylation and the formation of disulfide bonds during protein folding produce reactive oxygen species such as hydrogen peroxide. These damaging oxidants are removed by antioxidant metabolites such as glutathione and enzymes such as catalases and peroxidases.

Thermodynamics of living organisms

Living organisms must obey the laws of thermodynamics, which describe the transfer of heat and work. The second law of thermodynamics states that in any isolated system, the amount of entropy (disorder) cannot decrease. Although living organisms' amazing complexity appears to contradict this law, life is possible as all organisms are open systems that exchange matter and energy with their surroundings. Living systems are not in equilibrium, but instead are dissipative systems that maintain their state of high complexity by causing a larger increase in the entropy of their environments. The metabolism of a cell achieves this by coupling the spontaneous processes of catabolism to the non-spontaneous processes of anabolism. In thermodynamic terms, metabolism maintains order by creating disorder.
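
The two ideas in this paragraph can be stated compactly. The first line is the second law applied to an organism plus its surroundings; the second is the condition for a coupled pair of reactions to proceed. The numerical example uses commonly quoted textbook standard free energies (approximate values at pH 7; actual values inside cells differ) and is an illustration, not part of the source text.

```latex
% Local order is paid for by a larger entropy increase in the surroundings
\[
\Delta S_{\text{total}} = \Delta S_{\text{organism}} + \Delta S_{\text{surroundings}} \geq 0
\]

% Coupling: a non-spontaneous step can run if the sum is negative
\[
\Delta G_{\text{coupled}} = \Delta G_{\text{anabolic}} + \Delta G_{\text{catabolic}} < 0
\]

% Example: phosphorylation of glucose driven by ATP hydrolysis
\[
\underbrace{+13.8}_{\text{glucose} + \mathrm{P_i} \to \text{glucose 6-phosphate}}
\; + \;
\underbrace{(-30.5)}_{\mathrm{ATP} \to \mathrm{ADP} + \mathrm{P_i}}
\; = \; -16.7\ \mathrm{kJ\,mol^{-1}}
\]
```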

Regulation and control

As the environments of most organisms are constantly changing, the reactions of metabolism must be finely regulated to maintain a constant set of conditions within cells, a condition called homeostasis. Metabolic regulation also allows organisms to respond to signals and interact actively with their environments. Two closely linked concepts are important for understanding how metabolic pathways are controlled. Firstly, the regulation of an enzyme in a pathway is how its activity is increased and decreased in response to signals. Secondly, the control exerted by this enzyme is the effect that these changes in its activity have on the overall rate of the pathway (the flux through the pathway). For example, an enzyme may show large changes in activity (i.e. it is highly regulated) but if these changes have little effect on the flux of a metabolic pathway, then this enzyme is not involved in the control of the pathway.
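The distinction between regulation and control can be illustrated numerically. The sketch below is a toy model and is not taken from the text above: it assumes a two-step pathway with unit rate constants and a fixed substrate concentration, and estimates for each enzyme what metabolic control analysis calls a flux control coefficient (the fractional change in flux per fractional change in enzyme activity). The function names and activity values are illustrative assumptions.

```python
def steady_state_flux(e1, e2):
    """Steady-state flux of a toy pathway X -> S -> P (assumed model).

    Step 1 is reversible with rate e1 * (1 - s); step 2 is irreversible
    with rate e2 * s. Setting the two rates equal gives
    s = e1 / (e1 + e2) and therefore J = e1 * e2 / (e1 + e2).
    """
    return e1 * e2 / (e1 + e2)


def flux_control_coefficient(flux, activities, index, rel_step=1e-6):
    """Finite-difference estimate of d(ln J) / d(ln e_index)."""
    base = flux(*activities)
    perturbed = list(activities)
    perturbed[index] *= 1 + rel_step
    return (flux(*perturbed) - base) / base / rel_step


# Hypothetical activities: enzyme 2 is far more active than enzyme 1.
activities = [1.0, 100.0]
for i in range(len(activities)):
    c = flux_control_coefficient(steady_state_flux, activities, i)
    print(f"enzyme {i + 1}: flux control coefficient = {c:.4f}")
```

In this toy model the coefficient of the second enzyme is close to zero: its activity could be strongly regulated while having little influence on the flux, which is the situation described in the example above.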

Effect of insulin on glucose uptake and metabolism. Insulin binds to its receptor (1), which in turn starts many protein activation cascades (2). These include: translocation of Glut-4 transporter to the plasma membrane and influx of glucose (3), glycogen synthesis (4), glycolysis (5) and fatty acid synthesis (6).

There are multiple levels of metabolic regulation. In intrinsic regulation, the metabolic pathway self-regulates to respond to changes in the levels of substrates or products; for example, a decrease in the amount of product can increase the flux through the pathway to compensate. This type of regulation often involves allosteric regulation of the activities of multiple enzymes in the pathway. Extrinsic control involves a cell in a multicellular organism changing its metabolism in response to signals from other cells. These signals are usually in the form of water-soluble messengers such as hormones and growth factors and are detected by specific receptors on the cell surface. These signals are then transmitted inside the cell by second messenger systems that often involve the phosphorylation of proteins.

A very well understood example of extrinsic control is the regulation of glucose metabolism by the hormone insulin. Insulin is produced in response to rises in blood glucose levels. Binding of the hormone to insulin receptors on cells then activates a cascade of protein kinases that cause the cells to take up glucose and convert it into storage molecules such as fatty acids and glycogen. The metabolism of glycogen is controlled by activity of phosphorylase, the enzyme that breaks down glycogen, and glycogen synthase, the enzyme that makes it. These enzymes are regulated in a reciprocal fashion, with phosphorylation inhibiting glycogen synthase, but activating phosphorylase. Insulin causes glycogen synthesis by activating protein phosphatases and producing a decrease in the phosphorylation of these enzymes.

Evolution

Evolutionary tree showing the common ancestry of organisms from all three domains of life. Bacteria are colored blue, eukaryotes red, and archaea green. Relative positions of some of the phyla included are shown around the tree.

The central pathways of metabolism described above, such as glycolysis and the citric acid cycle, are present in all three domains of living things and were present in the last universal common ancestor. This universal ancestral cell was prokaryotic and probably a methanogen that had extensive amino acid, nucleotide, carbohydrate and lipid metabolism. The retention of these ancient pathways during later evolution may be the result of these reactions having been an optimal solution to their particular metabolic problems, with pathways such as glycolysis and the citric acid cycle producing their end products highly efficiently and in a minimal number of steps. The first pathways of enzyme-based metabolism may have been parts of purine nucleotide metabolism, while previous metabolic pathways were a part of the ancient RNA world.

Many models have been proposed to describe the mechanisms by which novel metabolic pathways evolve. These include the sequential addition of novel enzymes to a short ancestral pathway, the duplication and then divergence of entire pathways, as well as the recruitment of pre-existing enzymes and their assembly into a novel reaction pathway. The relative importance of these mechanisms is unclear, but genomic studies have shown that enzymes in a pathway are likely to have a shared ancestry, suggesting that many pathways have evolved in a step-by-step fashion with novel functions created from pre-existing steps in the pathway. An alternative model comes from studies that trace the evolution of proteins' structures in metabolic networks. These studies suggest that enzymes are pervasively recruited, with enzymes borrowed to perform similar functions in different metabolic pathways (evident in the MANET database). These recruitment processes result in an evolutionary enzymatic mosaic. A third possibility is that some parts of metabolism might exist as "modules" that can be reused in different pathways and perform similar functions on different molecules.

As well as the evolution of new metabolic pathways, evolution can also cause the loss of metabolic functions. For example, in some parasites metabolic processes that are not essential for survival are lost and preformed amino acids, nucleotides and carbohydrates may instead be scavenged from the host. Similar reduced metabolic capabilities are seen in endosymbiotic organisms.

Investigation and manipulation

Metabolic network of the Arabidopsis thaliana citric acid cycle. Enzymes and metabolites are shown as red squares and the interactions between them as black lines.

Classically, metabolism is studied by a reductionist approach that focuses on a single metabolic pathway. Particularly valuable is the use of radioactive tracers at the whole-organism, tissue and cellular levels, which define the paths from precursors to final products by identifying radioactively labelled intermediates and products. The enzymes that catalyze these chemical reactions can then be purified and their kinetics and responses to inhibitors investigated. A parallel approach is to identify the small molecules in a cell or tissue; the complete set of these molecules is called the metabolome. Overall, these studies give a good view of the structure and function of simple metabolic pathways, but are inadequate when applied to more complex systems such as the metabolism of a complete cell.

An idea of the complexity of the metabolic networks in cells that contain thousands of different enzymes is given by the figure showing the interactions between just 43 proteins and 40 metabolites to the right: the sequences of genomes provide lists containing anything up to 26,500 genes. However, it is now possible to use this genomic data to reconstruct complete networks of biochemical reactions and produce more holistic mathematical models that may explain and predict their behavior. These models are especially powerful when used to integrate the pathway and metabolite data obtained through classical methods with data on gene expression from proteomic and DNA microarray studies. Using these techniques, a model of human metabolism has now been produced, which will guide future drug discovery and biochemical research. These models are now used in network analysis, to classify human diseases into groups that share common proteins or metabolites.

Bacterial metabolic networks are a striking example of bow-tie organization, an architecture able to input a wide range of nutrients and produce a large variety of products and complex macromolecules using relatively few intermediate common currencies.

A major technological application of this information is metabolic engineering. Here, organisms such as yeast, plants or bacteria are genetically modified to make them more useful in biotechnology and aid the production of drugs such as antibiotics or industrial chemicals such as 1,3-propanediol and shikimic acid. These genetic modifications usually aim to reduce the amount of energy used to produce the product, increase yields and reduce the production of wastes.

History

The term metabolism is derived from the Ancient Greek word μεταβολή (metabole), "a change", which is derived from μεταβάλλειν (metaballein), meaning "to change".

Aristotle's metabolism as an open flow model

Greek philosophy

Aristotle's The Parts of Animals sets out enough details of his views on metabolism for an open flow model to be made. He believed that at each stage of the process, materials from food were transformed, with heat being released as the classical element of fire, and residual materials being excreted as urine, bile, or faeces.

Ibn al-Nafis described metabolism in his 1260 AD work titled Al-Risalah al-Kamiliyyah fil Siera al-Nabawiyyah (The Treatise of Kamil on the Prophet's Biography) which included the following phrase "Both the body and its parts are in a continuous state of dissolution and nourishment, so they are inevitably undergoing permanent change."

Application of the scientific method and modern metabolic theories

The history of the scientific study of metabolism spans several centuries and has moved from examining whole animals in early studies, to examining individual metabolic reactions in modern biochemistry. The first controlled experiments in human metabolism were published by Santorio Santorio in 1614 in his book Ars de statica medicina. He described how he weighed himself before and after eating, sleep, working, sex, fasting, drinking, and excreting. He found that most of the food he took in was lost through what he called "insensible perspiration".

Santorio Santorio in his steelyard balance, from Ars de statica medicina, first published 1614

In these early studies, the mechanisms of these metabolic processes had not been identified and a vital force was thought to animate living tissue. In the 19th century, when studying the fermentation of sugar to alcohol by yeast, Louis Pasteur concluded that fermentation was catalyzed by substances within the yeast cells he called "ferments". He wrote that "alcoholic fermentation is an act correlated with the life and organization of the yeast cells, not with the death or putrefaction of the cells." This discovery, along with the publication by Friedrich Wöhler in 1828 of a paper on the chemical synthesis of urea, notable for being the first organic compound prepared from wholly inorganic precursors, proved that the organic compounds and chemical reactions found in cells were no different in principle from any other part of chemistry.

It was the discovery of enzymes at the beginning of the 20th century by Eduard Buchner that separated the study of the chemical reactions of metabolism from the biological study of cells, and marked the beginnings of biochemistry. The mass of biochemical knowledge grew rapidly throughout the early 20th century. One of the most prolific of these modern biochemists was Hans Krebs who made huge contributions to the study of metabolism. He discovered the urea cycle and later, working with Hans Kornberg, the citric acid cycle and the glyoxylate cycle. Modern biochemical research has been greatly aided by the development of new techniques such as chromatography, X-ray diffraction, NMR spectroscopy, radioisotopic labelling, electron microscopy and molecular dynamics simulations. These techniques have allowed the discovery and detailed analysis of the many molecules and metabolic pathways in cells.

DNA digital data storage

From Wikipedia, the free encyclopedia

DNA digital data storage is the process of encoding and decoding binary data to and from synthesized strands of DNA.

While DNA as a storage medium has enormous potential because of its high storage density, its practical use is currently severely limited because of its high cost and very slow read and write times.

In June 2019, scientists reported that all 16 GB of text from the English Wikipedia had been encoded into synthetic DNA. In 2021, scientists reported that a custom DNA data writer had been developed that was capable of writing data into DNA at 1 Mbps.

Encoding methods

Many methods for encoding data in DNA are possible. The optimal methods are those that make economical use of DNA and protect against errors. If the message DNA is intended to be stored for a long period of time, for example, 1,000 years, it is also helpful if the sequence is obviously artificial and the reading frame is easy to identify.

Encoding text

Several simple methods for encoding text have been proposed. Most of these involve translating each letter into a corresponding "codon", consisting of a unique small sequence of nucleotides in a lookup table. Some examples of these encoding schemes include Huffman codes, comma codes, and alternating codes.

Encoding arbitrary data

To encode arbitrary data in DNA, the data is typically first converted into ternary (base 3) data rather than binary (base 2) data. Each digit (or "trit") is then converted to a nucleotide using a lookup table. To prevent homopolymers (repeating nucleotides), which can cause problems with accurate sequencing, the result of the lookup also depends on the preceding nucleotide. Using the example lookup table below, if the previous nucleotide in the sequence is T (thymine), and the trit is 2, the next nucleotide will be G (guanine).

Trits to nucleotides (example)
Previous nucleotide   Trit 0   Trit 1   Trit 2
T                     A        C        G
G                     T        A        C
C                     G        T        A
A                     C        G        T
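The lookup scheme above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the fixed width of six trits per byte, the virtual starting nucleotide "A", and all function names are hypothetical choices made here, not part of any published scheme. Only the rotating table itself comes from the example above.

```python
# Rotating lookup table from the example: key = previous nucleotide,
# string index = trit value (0, 1 or 2).
NEXT_NUCLEOTIDE = {
    "T": "ACG",
    "G": "TAC",
    "C": "GTA",
    "A": "CGT",
}

TRITS_PER_BYTE = 6  # assumption: 3**6 = 729 >= 256, so six trits hold one byte


def bytes_to_trits(data):
    """Convert bytes to base-3 digits, most significant trit first."""
    trits = []
    for byte in data:
        digits = []
        for _ in range(TRITS_PER_BYTE):
            digits.append(byte % 3)
            byte //= 3
        trits.extend(reversed(digits))
    return trits


def trits_to_bytes(trits):
    """Inverse of bytes_to_trits."""
    out = bytearray()
    for i in range(0, len(trits), TRITS_PER_BYTE):
        value = 0
        for trit in trits[i:i + TRITS_PER_BYTE]:
            value = value * 3 + trit
        out.append(value)
    return bytes(out)


def trits_to_dna(trits, start="A"):
    """Map each trit to a nucleotide that differs from the previous one."""
    previous = start  # assumed virtual nucleotide preceding the sequence
    sequence = []
    for trit in trits:
        previous = NEXT_NUCLEOTIDE[previous][trit]
        sequence.append(previous)
    return "".join(sequence)


def dna_to_trits(dna, start="A"):
    """Recover the trits by looking up each nucleotide in its predecessor's row."""
    previous = start
    trits = []
    for nucleotide in dna:
        trits.append(NEXT_NUCLEOTIDE[previous].index(nucleotide))
        previous = nucleotide
    return trits


print(trits_to_dna([2], start="T"))  # prints "G", matching the example above
```

Because no row of the table contains its own key, two identical nucleotides can never be adjacent in the output, which is how the scheme avoids homopolymers.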

Various systems may be incorporated to partition and address the data, as well as to protect it from errors. One approach to error correction is to regularly intersperse synchronization nucleotides between the information-encoding nucleotides. These synchronization nucleotides can act as scaffolds when reconstructing the sequence from multiple overlapping strands.

In vivo

The genetic code within living organisms can potentially be co-opted to store information. Furthermore, synthetic biology can be used to engineer cells with "molecular recorders" to allow the storage and retrieval of information stored in the cell's genetic material. CRISPR gene editing can also be used to insert artificial DNA sequences into the genome of the cell. For encoding developmental lineage data (a molecular flight recorder), roughly 30 trillion cell nuclei per mouse × 60 recording sites per nucleus × 7–15 bits per site yields about 2 terabytes per mouse written (but only very selectively read).

In-vivo light-based direct image and data recording

A proof-of-concept in vivo direct DNA data recording system was demonstrated through the incorporation of optogenetically regulated recombinases as part of an engineered "molecular recorder", which allows for direct encoding of light-based stimuli into engineered E. coli cells. This approach can also be parallelized to store and write text or data in 8-bit form through the use of physically separated individual cell cultures in cell-culture plates.

This approach leverages the editing of a "recorder plasmid" by the light-regulated recombinases, allowing for identification of cell populations exposed to different stimuli. The physical stimulus is thus encoded directly into the "recorder plasmid" through recombinase action. Unlike other approaches, this one does not require manual design, insertion and cloning of artificial sequences to record the data into the genetic code. In this recording process, the cell population in each well of a cell-culture plate can be treated as a digital "bit", functioning as a biological transistor capable of recording a single bit of data.
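The 8-bit scheme described above can be sketched in software. The layout used here (one row of eight wells per character, most significant bit first, with an illuminated well standing for 1) is an assumption made for illustration; the actual plate layout and readout procedure are not specified in the text, and the function names are hypothetical.

```python
def text_to_well_plate(text):
    """Return one row of eight wells per ASCII character.

    True means "expose this well to light" (recorded as 1); False means
    "keep this well dark" (recorded as 0). Most significant bit first.
    """
    rows = []
    for byte in text.encode("ascii"):
        rows.append([bool((byte >> shift) & 1) for shift in range(7, -1, -1)])
    return rows


def well_plate_to_text(rows):
    """Rebuild the text from the recorded state of each well."""
    chars = []
    for row in rows:
        value = 0
        for bit in row:
            value = (value << 1) | int(bit)
        chars.append(chr(value))
    return "".join(chars)


plate = text_to_well_plate("DNA")
for row in plate:
    print("".join("1" if well else "0" for well in row))
print(well_plate_to_text(plate))
```

Each well holds a single bit, so a message of n characters needs 8n separate cultures under this assumed layout.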

History

The idea of DNA digital data storage dates back to 1959, when the physicist Richard P. Feynman, in "There's Plenty of Room at the Bottom: An Invitation to Enter a New Field of Physics", outlined the general prospects for creating artificial objects similar to objects of the microcosm (including biological ones) and having similar or even more extensive capabilities. In 1964–65, the Soviet physicist Mikhail Samoilovich Neiman published three articles about microminiaturization in electronics at the molecular-atomic level, which independently presented general considerations and some calculations regarding the possibility of recording, storing, and retrieving information on synthesized DNA and RNA molecules. After Neiman's first paper had been published, and after the editor had received the manuscript of his second paper (on January 8, 1964, as indicated in that paper), an interview with the cybernetician Norbert Wiener was published. Wiener expressed ideas about the miniaturization of computer memory that were close to those proposed independently by Neiman. Neiman mentioned Wiener's ideas in the third of his papers. This story has been described in detail.

One of the earliest uses of DNA storage occurred in a 1988 collaboration between artist Joe Davis and researchers from Harvard University. The image, stored in a DNA sequence in E. coli, was organized in a 5 × 7 matrix that, once decoded, formed a picture of an ancient Germanic rune representing life and the female Earth. In the matrix, ones corresponded to dark pixels while zeros corresponded to light pixels.

In 2007 a device was created at the University of Arizona using addressing molecules to encode mismatch sites within a DNA strand. These mismatches were then able to be read out by performing a restriction digest, thereby recovering the data.

In 2011, George Church, Sri Kosuri, and Yuan Gao carried out an experiment to encode a 659 kb book co-authored by Church. To do this, the research team used a two-to-one correspondence in which a binary zero was represented by either an adenine or a cytosine and a binary one was represented by either a guanine or a thymine. After examination, 22 errors were found in the DNA.
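A two-to-one mapping of this kind leaves a free choice of base for every bit. The sketch below shows one way that freedom could be used, namely to avoid repeating the previous base. The tie-break rule and the function names are assumptions for illustration, not the procedure reported by the team; only the assignment of A or C to zero and G or T to one comes from the description above.

```python
ZERO_BASES = "AC"  # a binary zero may be written as adenine or cytosine
ONE_BASES = "GT"   # a binary one may be written as guanine or thymine


def bits_to_dna(bits):
    """Encode bits as bases, never emitting the same base twice in a row."""
    sequence = []
    for bit in bits:
        first, second = ONE_BASES if bit else ZERO_BASES
        # Hypothetical tie-break: prefer the first candidate unless it
        # would repeat the base that was just written.
        base = second if sequence and sequence[-1] == first else first
        sequence.append(base)
    return "".join(sequence)


def dna_to_bits(dna):
    """Decode by base class; the choice within a class carries no data."""
    return [1 if base in ONE_BASES else 0 for base in dna]


encoded = bits_to_dna([0, 0, 0, 1, 1])
print(encoded)               # prints "ACAGT"
print(dna_to_bits(encoded))  # prints [0, 0, 0, 1, 1]
```

Decoding ignores which of the two candidate bases was chosen, so the encoder is free to pick whichever one suits the surrounding sequence.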

In 2012, George Church and colleagues at Harvard University published an article in which DNA was encoded with digital information that included an HTML draft of a 53,400-word book written by the lead researcher, eleven JPEG images, and one JavaScript program. Multiple copies were added for redundancy, and 5.5 petabits can be stored in each cubic millimeter of DNA. The researchers used a simple code where bits were mapped one-to-one with bases, which had the shortcoming that it led to long runs of the same base, the sequencing of which is error-prone. This result showed that, besides its other functions, DNA can also serve as another type of storage medium, like hard disk drives and magnetic tapes.

In 2013, an article led by researchers from the European Bioinformatics Institute (EBI) and submitted at around the same time as the paper of Church and colleagues detailed the storage, retrieval, and reproduction of over five million bits of data. All the DNA files reproduced the information with an accuracy between 99.99% and 100%. The main innovations in this research were the use of an error-correcting encoding scheme to ensure the extremely low data-loss rate, as well as the idea of encoding the data in a series of overlapping short oligonucleotides identifiable through a sequence-based indexing scheme. Also, the sequences of the individual strands of DNA overlapped in such a way that each region of data was repeated four times to avoid errors. Two of these four strands were constructed backwards, also with the goal of eliminating errors. The costs per megabyte were estimated at $12,400 to encode data and $220 for retrieval. However, it was noted that the exponential decrease in DNA synthesis and sequencing costs, if it continues into the future, should make the technology cost-effective for long-term data storage by 2023.
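The overlapping layout described above can be sketched as follows. The fragment length (100 bases), the step between fragment starts (25 bases), and the use of the reverse complement for the "backwards" strands are assumptions chosen here so that each interior region is covered by four fragments, two of them reversed. The function names are hypothetical, and the indexing and error-correcting parts of the scheme are left out.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand read in the opposite direction."""
    return seq.translate(COMPLEMENT)[::-1]


def overlapping_fragments(seq, step=25, copies=4):
    """Split seq into fragments of length step * copies, one every `step` bases.

    Every second fragment is stored as its reverse complement, so any
    interior stretch of `step` bases appears in `copies` fragments,
    half of them reversed.
    """
    length = step * copies
    fragments = []
    for index, start in enumerate(range(0, len(seq) - length + 1, step)):
        fragment = seq[start:start + length]
        if index % 2 == 1:
            fragment = reverse_complement(fragment)
        fragments.append((index, fragment))
    return fragments


def coverage(seq_length, step=25, copies=4):
    """Count how many fragments cover each position of the sequence."""
    length = step * copies
    counts = [0] * seq_length
    for start in range(0, seq_length - length + 1, step):
        for position in range(start, start + length):
            counts[position] += 1
    return counts


sequence = "ACGT" * 50  # placeholder 200-base sequence
for index, fragment in overlapping_fragments(sequence):
    print(index, fragment[:12] + "...")
```

With these assumed parameters, the positions near either end of the sequence are covered by fewer than four fragments, so a practical scheme would need padding or some other treatment of the ends.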

In 2013, software called DNACloud was developed by Manish K. Gupta and co-workers to encode computer files to their DNA representation. It implements a memory-efficient version of the algorithm proposed by Goldman et al. to encode (and decode) data to DNA (.dnac files).

The long-term stability of data encoded in DNA was reported in February 2015, in an article by researchers from ETH Zurich. The team added redundancy via Reed–Solomon error-correction coding and protected the DNA by encapsulating it within silica glass spheres via sol-gel chemistry.

In 2016, research by Church and Technicolor Research and Innovation was published in which 22 MB of an MPEG-compressed movie sequence were stored in and recovered from DNA. The recovery of the sequence was found to have zero errors.

In March 2017, Yaniv Erlich and Dina Zielinski of Columbia University and the New York Genome Center published a method known as DNA Fountain that stored data at a density of 215 petabytes per gram of DNA. The technique approaches the Shannon capacity of DNA storage, achieving 85% of the theoretical limit. The method was not ready for large-scale use, as it cost $7,000 to synthesize 2 megabytes of data and another $2,000 to read it.

In March 2018, the University of Washington and Microsoft published results demonstrating storage and retrieval of approximately 200 MB of data. The research also proposed and evaluated a method for random access of data items stored in DNA. In March 2019, the same team announced that it had demonstrated a fully automated system to encode and decode data in DNA.

Research published by Eurecom and Imperial College in January 2019 demonstrated the ability to store structured data in synthetic DNA. The research showed how to encode structured or, more specifically, relational data in synthetic DNA and also demonstrated how to perform data processing operations (similar to SQL) directly on the DNA as chemical processes.

In April 2019, through a collaboration with TurboBeads Labs in Switzerland, Mezzanine by Massive Attack was encoded into synthetic DNA, making it the first album to be stored in this way.

In June 2019, scientists reported that all 16 GB of Wikipedia had been encoded into synthetic DNA. In 2021, CATALOG reported that they had developed a custom DNA writer capable of writing data at 1 Mbps into DNA.

The first article describing data storage on native DNA sequences via enzymatic nicking was published in April 2020. In the paper, scientists demonstrated a new method of recording information in the DNA backbone which enables bit-wise random access and in-memory computing.

In 2021, a research team at Newcastle University led by N. Krasnogor implemented a stack data structure using DNA, allowing for last-in, first-out (LIFO) data recording and retrieval. Their approach used hybridization and strand displacement to record DNA signals in DNA polymers, which were then released in reverse order. The study demonstrated that data structure-like operations are possible in the molecular realm. The researchers also explored the limitations and future improvements for dynamic DNA data structures, highlighting the potential for DNA-based computational systems.

Davos Bitcoin Challenge

On January 21, 2015, Nick Goldman from the European Bioinformatics Institute (EBI), one of the original authors of the 2013 Nature paper, announced the Davos Bitcoin Challenge at the World Economic Forum annual meeting in Davos. During his presentation, DNA tubes were handed out to the audience, with the message that each tube contained the private key of exactly one bitcoin, all coded in DNA. The first one to sequence and decode the DNA could claim the bitcoin and win the challenge. The challenge was set for three years and would close if nobody claimed the prize before January 21, 2018.

Almost three years later, on January 19, 2018, the EBI announced that a Belgian PhD student, Sander Wuyts, of the University of Antwerp and Vrije Universiteit Brussel, was the first one to complete the challenge. Alongside the instructions on how to claim the bitcoin (stored as a plain text file and a PDF file), the logo of the EBI, the logo of the company that printed the DNA (CustomArray), and a sketch of James Joyce were retrieved from the DNA.

The Lunar Library

The Lunar Library, launched on the Beresheet Lander by the Arch Mission Foundation, carries information encoded in DNA, which includes 20 famous books and 10,000 images. DNA was considered one of the best storage choices because it can last a long time; the Arch Mission Foundation suggests that it can still be read after billions of years. The lander crashed on 11 April 2019 and was lost.

DNA of things

The concept of the DNA of Things (DoT) was introduced in 2019 by a team of researchers from Israel and Switzerland, including Yaniv Erlich and Robert Grass. DoT encodes digital data into DNA molecules, which are then embedded into objects. This gives the ability to create objects that carry their own blueprint, similar to biological organisms. In contrast to the Internet of things, which is a system of interrelated computing devices, DoT creates objects which are independent storage objects, completely off-grid.

As a proof of concept for DoT, the researchers 3D-printed a Stanford bunny which contains its blueprint in the plastic filament used for printing. By clipping off a tiny bit of the ear of the bunny, they were able to read out the blueprint, multiply it and produce a next generation of bunnies. In addition, the ability of DoT to serve steganographic purposes was shown by producing non-distinguishable lenses which contain a YouTube video integrated into the material.

Entropy (information theory)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Entropy_(information_theory) In info...