Tuesday, December 5, 2023

Analytic–synthetic distinction

From Wikipedia, the free encyclopedia

The analytic–synthetic distinction is a semantic distinction used primarily in philosophy to distinguish between propositions (in particular, statements that are affirmative subject–predicate judgments) that are of two types: analytic propositions and synthetic propositions. Analytic propositions are true or not true solely by virtue of their meaning, whereas synthetic propositions' truth, if any, derives from how their meaning relates to the world.

While the distinction was first proposed by Immanuel Kant, it was revised considerably over time, and different philosophers have used the terms in very different ways. Furthermore, some philosophers (starting with Willard Van Orman Quine) have questioned whether there is even a clear distinction to be made between propositions which are analytically true and propositions which are synthetically true. Debates regarding the nature and usefulness of the distinction continue to this day in contemporary philosophy of language.

Kant

Conceptual containment

The philosopher Immanuel Kant uses the terms "analytic" and "synthetic" to divide propositions into two types. Kant introduces the analytic–synthetic distinction in the Introduction to his Critique of Pure Reason (1781/1998, A6–7/B10–11). There, he restricts his attention to statements that are affirmative subject–predicate judgments and defines "analytic proposition" and "synthetic proposition" as follows:

  • analytic proposition: a proposition whose predicate concept is contained in its subject concept
  • synthetic proposition: a proposition whose predicate concept is not contained in its subject concept but related

Examples of analytic propositions, on Kant's definition, include:

  • "All bachelors are unmarried."
  • "All triangles have three sides."

Kant's own example is:

  • "All bodies are extended": that is, they occupy space. (A7/B11)

Each of these statements is an affirmative subject–predicate judgment, and, in each, the predicate concept is contained within the subject concept. The concept "bachelor" contains the concept "unmarried"; the concept "unmarried" is part of the definition of the concept "bachelor". Likewise, for "triangle" and "has three sides", and so on.

Examples of synthetic propositions, on Kant's definition, include:

  • "All bachelors are alone."
  • "All creatures with hearts have kidneys."

Kant's own example is:

  • "All bodies are heavy": that is, they experience a gravitational force. (A7/B11)

As with the previous examples classified as analytic propositions, each of these new statements is an affirmative subject–predicate judgment. However, in none of these cases does the subject concept contain the predicate concept. The concept "bachelor" does not contain the concept "alone"; "alone" is not a part of the definition of "bachelor". The same is true for "creatures with hearts" and "have kidneys"; even if every creature with a heart also has kidneys, the concept "creature with a heart" does not contain the concept "has kidneys". So the philosophical issue is: What kind of statement is "Language is used to transmit meaning"?

Kant's version and the a priori–a posteriori distinction

In the Introduction to the Critique of Pure Reason, Kant contrasts his distinction between analytic and synthetic propositions with another distinction, the distinction between a priori and a posteriori propositions. He defines these terms as follows:

  • a priori proposition: a proposition whose justification does not rely upon experience. Moreover, the proposition can be validated by experience, but is not grounded in experience. Therefore, it is logically necessary.
  • a posteriori proposition: a proposition whose justification does rely upon experience. The proposition is validated by, and grounded in, experience. Therefore, it is logically contingent.

Examples of a priori propositions include:

  • "All bachelors are unmarried."
  • "7 + 5 = 12."

The justification of these propositions does not depend upon experience: one need not consult experience to determine whether all bachelors are unmarried, nor whether 7 + 5 = 12. (Of course, as Kant would grant, experience is required to understand the concepts "bachelor", "unmarried", "7", "+" and so forth. However, the a priori–a posteriori distinction as employed here by Kant refers not to the origins of the concepts but to the justification of the propositions. Once we have the concepts, experience is no longer necessary.)

Examples of a posteriori propositions include:

  • "All bachelors are unhappy."
  • "Tables exist."

Both of these propositions are a posteriori: any justification of them would require one's experience.

The analytic–synthetic distinction and the a priori–a posteriori distinction together yield four types of propositions:

  • analytic a priori
  • synthetic a priori
  • analytic a posteriori
  • synthetic a posteriori

Kant posits the third type as obviously self-contradictory. Ruling it out, he discusses only the remaining three types as components of his epistemological framework, calling them, for brevity's sake, "analytic", "synthetic a priori", and "empirical" or "a posteriori" propositions respectively. This triad accounts for all possible propositions. Examples of analytic and a posteriori statements have already been given; for synthetic a priori propositions, he gives examples from mathematics and physics.
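The four-way classification above, and Kant's exclusion of one cell, can be sketched as a simple cross product. This is only an illustration of the combinatorics; the variable names are hypothetical and nothing here models the philosophical argument itself.

```python
from itertools import product

# The two distinctions, using the labels from the text.
SEMANTIC = ("analytic", "synthetic")
EPISTEMIC = ("a priori", "a posteriori")

# Every pairing of one label from each distinction.
all_types = [f"{s} {e}" for s, e in product(SEMANTIC, EPISTEMIC)]

# Kant rules out the analytic a posteriori as self-contradictory.
EXCLUDED = {"analytic a posteriori"}
kantian_types = [t for t in all_types if t not in EXCLUDED]

print(all_types)
print(kantian_types)
```

The first list has four entries and the second has the three types Kant goes on to discuss.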

The ease of knowing analytic propositions

Part of Kant's argument in the Introduction to the Critique of Pure Reason involves arguing that there is no problem figuring out how knowledge of analytic propositions is possible. To know an analytic proposition, Kant argued, one need not consult experience. Instead, one needs merely to take the subject and "extract from it, in accordance with the principle of contradiction, the required predicate" (A7/B12). In analytic propositions, the predicate concept is contained in the subject concept. Thus, to know an analytic proposition is true, one need merely examine the concept of the subject. If one finds the predicate contained in the subject, the judgment is true.

Thus, for example, one need not consult experience to determine whether "All bachelors are unmarried" is true. One need merely examine the subject concept ("bachelors") and see if the predicate concept "unmarried" is contained in it. And in fact, it is: "unmarried" is part of the definition of "bachelor" and so is contained within it. Thus the proposition "All bachelors are unmarried" can be known to be true without consulting experience.

It follows from this, Kant argued, first: All analytic propositions are a priori; there are no a posteriori analytic propositions. It follows, second: There is no problem understanding how we can know analytic propositions; we can know them because we only need to consult our concepts in order to determine that they are true.

The possibility of metaphysics

After ruling out the possibility of analytic a posteriori propositions, and explaining how we can obtain knowledge of analytic a priori propositions, Kant also explains how we can obtain knowledge of synthetic a posteriori propositions. That leaves only the question of how knowledge of synthetic a priori propositions is possible. This question is exceedingly important, Kant maintains, because all scientific knowledge (for him Newtonian physics and mathematics) is made up of synthetic a priori propositions. If it is impossible to determine which synthetic a priori propositions are true, he argues, then metaphysics as a discipline is impossible. The remainder of the Critique of Pure Reason is devoted to examining whether and how knowledge of synthetic a priori propositions is possible.

Logical positivists

Frege's revision of the Kantian definition

Over a hundred years later, a group of philosophers took interest in Kant and his distinction between analytic and synthetic propositions: the logical positivists.

Part of Kant's examination of the possibility of synthetic a priori knowledge involved the examination of mathematical propositions, such as

  • "7 + 5 = 12." (B15–16)
  • "The shortest distance between two points is a straight line." (B16–17)

Kant maintained that mathematical propositions such as these are synthetic a priori propositions, and that we know them. That they are synthetic, he thought, is obvious: the concept "equal to 12" is not contained within the concept "7 + 5"; and the concept "straight line" is not contained within the concept "the shortest distance between two points". From this, Kant concluded that we have knowledge of synthetic a priori propositions.

Gottlob Frege's notion of analyticity included a number of logical properties and relations beyond containment: symmetry, transitivity, antonymy, or negation and so on. He had a strong emphasis on formality, in particular formal definition, and also emphasized the idea of substitution of synonymous terms. "All bachelors are unmarried" can be expanded out with the formal definition of bachelor as "unmarried man" to form "All unmarried men are unmarried", which is recognizable as tautologous and therefore analytic from its logical form: any statement of the form "All X that are (F and G) are F". Using this particular expanded idea of analyticity, Frege concluded that Kant's examples of arithmetical truths are analytical a priori truths and not synthetic a priori truths.
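The claim that "All X that are (F and G) are F" is true by logical form alone can be checked mechanically. The sketch below makes a simplifying assumption: it reads the quantified statement for one arbitrary individual, so that F and G become plain truth values, and then tries every assignment. The helper names are hypothetical.

```python
from itertools import product

def implies(p: bool, q: bool) -> bool:
    """Material conditional: false only when p is true and q is false."""
    return (not p) or q

def is_tautology(formula, n_vars: int) -> bool:
    """True if the formula holds under every assignment of truth values."""
    return all(formula(*values) for values in product([False, True], repeat=n_vars))

# "All X that are (F and G) are F", read for one arbitrary individual.
def frege_form(f: bool, g: bool) -> bool:
    return implies(f and g, f)

# Contrast case: "All X that are F are G" has no such guarantee.
def contrast_form(f: bool, g: bool) -> bool:
    return implies(f, g)

print(is_tautology(frege_form, 2))     # True
print(is_tautology(contrast_form, 2))  # False
```

With "bachelor" expanded to "unmarried (F) man (G)", the first form covers "All unmarried men are unmarried"; the contrast form corresponds to a statement such as "All bachelors are alone", whose truth is not settled by form.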

Thanks to Frege's logical semantics, particularly his concept of analyticity, arithmetic truths like "7+5=12" are no longer synthetic a priori but analytical a priori truths in Carnap's extended sense of "analytic". Hence logical empiricists are not subject to Kant's criticism of Hume for throwing out mathematics along with metaphysics.

(Here "logical empiricist" is a synonym for "logical positivist".)

The origin of the logical positivist's distinction

The logical positivists agreed with Kant that we have knowledge of mathematical truths, and further that mathematical propositions are a priori. However, they did not believe that any complex metaphysics, such as the type Kant supplied, is necessary to explain our knowledge of mathematical truths. Instead, the logical positivists maintained that our knowledge of judgments like "all bachelors are unmarried" and our knowledge of mathematics (and logic) are in the basic sense the same: both proceed from our knowledge of the meanings of terms or the conventions of language.

Since empiricism had always asserted that all knowledge is based on experience, this assertion had to include knowledge in mathematics. On the other hand, we believed that with respect to this problem the rationalists had been right in rejecting the old empiricist view that the truth of "2+2=4" is contingent on the observation of facts, a view that would lead to the unacceptable consequence that an arithmetical statement might possibly be refuted tomorrow by new experiences. Our solution, based upon Wittgenstein's conception, consisted in asserting the thesis of empiricism only for factual truth. By contrast, the truths of logic and mathematics are not in need of confirmation by observations, because they do not state anything about the world of facts, they hold for any possible combination of facts.

— Rudolf Carnap, "Autobiography": §10: Semantics, p. 64

Logical positivist definitions

Thus the logical positivists drew a new distinction, and, inheriting the terms from Kant, named it the "analytic-synthetic distinction". They provided many different definitions, such as the following:

  • analytic proposition: a proposition whose truth depends solely on the meaning of its terms
  • analytic proposition: a proposition that is true (or false) by definition
  • analytic proposition: a proposition that is made true (or false) solely by the conventions of language

(While the logical positivists believed that the only necessarily true propositions were analytic, they did not define "analytic proposition" as "necessarily true proposition" or "proposition that is true in all possible worlds".)

Synthetic propositions were then defined as:

  • synthetic proposition: a proposition that is not analytic

These definitions applied to all propositions, regardless of whether they were of subject–predicate form. Thus, under these definitions, the proposition "It is raining or it is not raining" was classified as analytic, while for Kant it was analytic by virtue of its logical form. And the proposition "7 + 5 = 12" was classified as analytic, while under Kant's definitions it was synthetic.

Two-dimensionalism

Two-dimensionalism is an approach to semantics in analytic philosophy. It is a theory of how to determine the sense and reference of a word and the truth-value of a sentence. It is intended to resolve a puzzle that has plagued philosophy for some time, namely: How is it possible to discover empirically that a necessary truth is true? Two-dimensionalism provides an analysis of the semantics of words and sentences that makes sense of this possibility. The theory was first developed by Robert Stalnaker, but it has been advocated by numerous philosophers since, including David Chalmers and Berit Brogaard.

Any given sentence, for example, the words,

"Water is H2O"

is taken to express two distinct propositions, often referred to as a primary intension and a secondary intension, which together compose its meaning.

The primary intension of a word or sentence is its sense, i.e., is the idea or method by which we find its referent. The primary intension of "water" might be a description, such as watery stuff. The thing picked out by the primary intension of "water" could have been otherwise. For example, on some other world where the inhabitants take "water" to mean watery stuff, but, where the chemical make-up of watery stuff is not H2O, it is not the case that water is H2O for that world.

The secondary intension of "water" is whatever thing "water" happens to pick out in this world, whatever that world happens to be. So if we assign "water" the primary intension watery stuff then the secondary intension of "water" is H2O, since H2O is watery stuff in this world. The secondary intension of "water" in our world is H2O, which is H2O in every world because unlike watery stuff it is impossible for H2O to be other than H2O. When considered according to its secondary intension, "Water is H2O" is true in every world.
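The two intensions can be pictured as two ways of evaluating the same word across worlds. The toy model below is a sketch under stated assumptions: there are only two worlds, the label "XYZ" for the other world's watery stuff is a hypothetical placeholder, and the function names are invented for illustration.

```python
# Hypothetical toy worlds: the chemical make-up of the watery stuff in each.
WATERY_STUFF = {"actual": "H2O", "other": "XYZ"}

def primary_intension(world_as_actual: str) -> str:
    """What "water" picks out if the given world is treated as the actual one."""
    return WATERY_STUFF[world_as_actual]

def secondary_intension(evaluated_world: str, actual: str = "actual") -> str:
    """Referent fixed in the actual world, then held constant in every evaluated world."""
    return primary_intension(actual)

def water_is_h2o_primary(world: str) -> bool:
    return primary_intension(world) == "H2O"

def water_is_h2o_secondary(world: str) -> bool:
    return secondary_intension(world) == "H2O"

for w in WATERY_STUFF:
    print(w, water_is_h2o_primary(w), water_is_h2o_secondary(w))
```

Read by its primary intension, "Water is H2O" comes out false in the other world; read by its secondary intension, it comes out true in both, which matches the description above.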

If two-dimensionalism is workable it solves some very important problems in the philosophy of language. Saul Kripke has argued that "Water is H2O" is an example of the necessary a posteriori, since we had to discover that water was H2O, but given that it is true, it cannot be false. It would be absurd to claim that something that is water is not H2O, for these are known to be identical.

Carnap's distinction

Rudolf Carnap was a strong proponent of the distinction between what he called "internal questions", questions entertained within a "framework" (like a mathematical theory), and "external questions", questions posed outside any framework – posed before the adoption of any framework. The "internal" questions could be of two types: logical (or analytic, or logically true) and factual (empirical, that is, matters of observation interpreted using terms from a framework). The "external" questions were also of two types: those that were confused pseudo-questions ("one disguised in the form of a theoretical question") and those that could be re-interpreted as practical, pragmatic questions about whether a framework under consideration was "more or less expedient, fruitful, conducive to the aim for which the language is intended". The adjective "synthetic" was not used by Carnap in his 1950 work Empiricism, Semantics, and Ontology. Carnap did define a "synthetic truth" in his work Meaning and Necessity: a sentence that is true, but not simply because "the semantical rules of the system suffice for establishing its truth".

The notion of a synthetic truth is of something that is true both because of what it means and because of the way the world is, whereas analytic truths are true in virtue of meaning alone. Thus, what Carnap calls internal factual statements (as opposed to internal logical statements) could be taken as being also synthetic truths because they require observations, but some external statements also could be "synthetic" statements and Carnap would be doubtful about their status. The analytic–synthetic distinction therefore is not identical with the internal–external distinction.

Quine's criticisms

In 1951, Willard Van Orman Quine published the essay "Two Dogmas of Empiricism" in which he argued that the analytic–synthetic distinction is untenable. The argument at bottom is that there are no "analytic" truths, but all truths involve an empirical aspect. In the first paragraph, Quine takes the distinction to be the following:

  • analytic propositions – propositions grounded in meanings, independent of matters of fact.
  • synthetic propositions – propositions grounded in fact.

Quine's position denying the analytic–synthetic distinction is summarized as follows:

It is obvious that truth in general depends on both language and extralinguistic fact. ... Thus one is tempted to suppose in general that the truth of a statement is somehow analyzable into a linguistic component and a factual component. Given this supposition, it next seems reasonable that in some statements the factual component should be null; and these are the analytic statements. But, for all its a priori reasonableness, a boundary between analytic and synthetic statements simply has not been drawn. That there is such a distinction to be drawn at all is an unempirical dogma of empiricists, a metaphysical article of faith.

— Willard V. O. Quine, "Two Dogmas of Empiricism", p. 64

To summarize Quine's argument, the notion of an analytic proposition requires a notion of synonymy, but establishing synonymy inevitably leads to matters of fact – synthetic propositions. Thus, there is no non-circular (and so no tenable) way to ground the notion of analytic propositions.

While Quine's rejection of the analytic–synthetic distinction is widely known, the precise argument for the rejection and its status is highly debated in contemporary philosophy. However, some (for example, Paul Boghossian) argue that Quine's rejection of the distinction is still widely accepted among philosophers, even if for poor reasons.

Responses

Paul Grice and P. F. Strawson criticized "Two Dogmas" in their 1956 article "In Defense of a Dogma". Among other things, they argue that Quine's skepticism about synonyms leads to a skepticism about meaning. If statements can have meanings, then it would make sense to ask "What does it mean?". If it makes sense to ask "What does it mean?", then synonymy can be defined as follows: Two sentences are synonymous if and only if the true answer to the question "What does it mean?" asked of one of them is the true answer to the same question asked of the other. They also draw the conclusion that discussion about correct or incorrect translations would be impossible given Quine's argument. Four years after Grice and Strawson published their paper, Quine's book Word and Object was released. In the book Quine presented his theory of indeterminacy of translation.

In Speech Acts, John Searle argues that from the difficulties encountered in trying to explicate analyticity by appeal to specific criteria, it does not follow that the notion itself is void. Considering the way in which we would test any proposed list of criteria, which is by comparing their extension to the set of analytic statements, it would follow that any explication of what analyticity means presupposes that we already have at our disposal a working notion of analyticity.

In "'Two Dogmas' Revisited", Hilary Putnam argues that Quine is attacking two different notions:

It seems to me there is as gross a distinction between 'All bachelors are unmarried' and 'There is a book on this table' as between any two things in this world, or at any rate, between any two linguistic expressions in the world;

— Hilary Putnam, Philosophical Papers, p. 36

Analytic truth defined as a true statement derivable from a tautology by putting synonyms for synonyms is near Kant's account of analytic truth as a truth whose negation is a contradiction. Analytic truth defined as a truth confirmed no matter what, however, is closer to one of the traditional accounts of a priori. While the first four sections of Quine's paper concern analyticity, the last two concern a priority. Putnam considers the argument in the last two sections to be independent of the first four, and even as he criticizes Quine, he emphasizes Quine's historical importance as the first top-rank philosopher both to reject the notion of a priority and to sketch a methodology without it.

Jerrold Katz, a one-time associate of Noam Chomsky, countered the arguments of "Two Dogmas" directly by trying to define analyticity non-circularly on the syntactical features of sentences. Chomsky himself critically discussed Quine's conclusion, arguing that it is possible to identify some analytic truths (truths of meaning, not truths of facts) which are determined by specific relations holding among some innate conceptual features of the mind or brain.

In Philosophical Analysis in the Twentieth Century, Volume 1: The Dawn of Analysis, Scott Soames pointed out that Quine's circularity argument needs two of the logical positivists' central theses to be effective:

  • All necessary (and all a priori) truths are analytic.
  • Analyticity is needed to explain and legitimate necessity.

It is only when these two theses are accepted that Quine's argument holds. It is not a problem that the notion of necessity is presupposed by the notion of analyticity if necessity can be explained without analyticity. According to Soames, both theses were accepted by most philosophers when Quine published "Two Dogmas". Today, however, Soames holds both statements to be antiquated. He says: "Very few philosophers today would accept either [of these assertions], both of which now seem decidedly antique."

In other fields

This distinction was imported from philosophy into theology, with Albrecht Ritschl attempting to demonstrate that Kant's epistemology was compatible with Lutheranism.

Y-chromosomal Adam

From Wikipedia, the free encyclopedia

In human genetics, the Y-chromosomal most recent common ancestor (Y-MRCA, informally known as Y-chromosomal Adam) is the patrilineal most recent common ancestor (MRCA) from whom all currently living humans are descended. He is the most recent male from whom all living humans are descended through an unbroken line of their male ancestors. The term Y-MRCA reflects the fact that the Y chromosomes of all currently living human males are directly derived from the Y chromosome of this remote ancestor. The analogous concept of the matrilineal most recent common ancestor is known as "Mitochondrial Eve" (mt-MRCA, named for the matrilineal transmission of mtDNA), the most recent woman from whom all living humans are descended matrilineally. As with "Mitochondrial Eve", the title of "Y-chromosomal Adam" is not permanently fixed to a single individual, but can advance over the course of human history as paternal lineages become extinct.

Estimates of the time when Y-MRCA lived have also shifted as modern knowledge of human ancestry changes. For example, in 2013, the discovery of a previously unknown Y-chromosomal haplogroup was announced, which resulted in a slight adjustment of the estimated age of the human Y-MRCA.

By definition, it is not necessary that the Y-MRCA and the mt-MRCA should have lived at the same time. While estimates as of 2014 suggested the possibility that the two individuals may well have been roughly contemporaneous, the discovery of the archaic Y-haplogroup has pushed back the estimated age of the Y-MRCA beyond the most likely age of the mt-MRCA. As of 2015, estimates of the age of the Y-MRCA range around 200,000 to 300,000 years ago, roughly consistent with the emergence of anatomically modern humans.

Y-chromosomal data taken from a Neanderthal from El Sidrón, Spain, produced a Y-T-MRCA of 588,000 years ago for Neanderthal and Homo sapiens patrilineages, dubbed ante Adam, and 275,000 years ago for Y-MRCA.

Definition

The Y-chromosomal most recent common ancestor is the most recent common ancestor of the Y-chromosomes found in currently living human males.

Due to the definition via the "currently living" population, the identity of a MRCA, and by extension of the human Y-MRCA, is time-dependent (it depends on the moment in time intended by the term "currently"). The MRCA of a population may move forward in time as archaic lineages within the population go extinct: once a lineage has died out, it is irretrievably lost. This mechanism can thus only shift the title of Y-MRCA forward in time. Such an event could be due to the total extinction of several basal haplogroups. The same holds for the concepts of matrilineal and patrilineal MRCAs: it follows from the definition of Y-MRCA that he had at least two sons who both have unbroken lineages that have survived to the present day. If the lineages of all but one of those sons die out, then the title of Y-MRCA shifts forward from the remaining son through his patrilineal descendants, until the first descendant is reached who had at least two sons who both have living, patrilineal descendants. The title of Y-MRCA is not permanently fixed to a single individual, and the Y-MRCA for any given population would himself have been part of a population which had its own, more remote, Y-MRCA.
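The forward shift of the title can be illustrated on a toy patrilineal tree. The sketch below is a minimal illustration: the names ("adam", "b", "c" and so on) and the dictionary layout are hypothetical, and real analyses work from sequence data rather than a known pedigree.

```python
def lineage(man, father_of):
    """Patriline from `man` back to the earliest recorded ancestor."""
    path = [man]
    while path[-1] in father_of:
        path.append(father_of[path[-1]])
    return path

def y_mrca(living, father_of):
    """Most recent man who appears in the patriline of every living male."""
    paths = [lineage(m, father_of) for m in living]
    common = set(paths[0]).intersection(*(set(p) for p in paths[1:]))
    # Walking back from a living man, the first shared ancestor is the most recent one.
    return next(a for a in paths[0] if a in common)

# Hypothetical pedigree: "adam" has sons "b" and "c"; "b" has sons "d" and "e"; "c" has son "f".
father_of = {"b": "adam", "c": "adam", "d": "b", "e": "b", "f": "c"}

print(y_mrca(["d", "e", "f"], father_of))  # adam
print(y_mrca(["d", "e"], father_of))       # b, once the line through "c" has died out
```

Once the lineage through "c" is extinct, "adam" has only one son with surviving patrilineal descendants, so the title passes forward to "b", the first descendant with two surviving lines.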

Although the informal name "Y-chromosomal Adam" is a reference to the biblical Adam, this should not be misconstrued as implying that the bearer of the chromosome was the only human male alive during his time. His other male contemporaries may also have descendants alive today, but not, by definition, through solely patrilineal descent; in other words, none of them have an unbroken male line of descendants (son's son's son's … son) connecting them to currently living people.

By the nature of the concept of most recent common ancestors, these estimates can only represent a terminus ante quem ("limit before which"), until the genome of the entire population has been examined (in this case, the genome of all living humans).

Age estimate

Estimates on the age of the Y-MRCA crucially depend on the most archaic known haplogroup extant in contemporary populations. As of 2018, this is haplogroup A00 (discovered in 2013). Age estimates based on this published during 2014–2015 range between 160,000 and 300,000 years, compatible with the time of emergence and early dispersal of Homo sapiens.

Method

In addition to the tendency of the title of Y-MRCA to shift forward in time, the estimate of the Y-MRCA's DNA sequence, his position in the family tree, the time when he lived, and his place of origin, are all subject to future revisions.

The following events would change the estimate of who the individual designated as Y-MRCA was:

  • Further sampling of Y chromosomes could uncover previously unknown divergent lineages. If this happens, Y-chromosome lineages would converge on an individual who lived further back in time.
  • The discovery of additional deep rooting mutations in known lineages could lead to a rearrangement of the family tree.
  • Revision of the Y-chromosome mutation rate (see below) can change the estimate of the time when he lived.

The time when Y-MRCA lived is determined by applying a molecular clock to human Y-chromosomes. In contrast to mitochondrial DNA (mtDNA), which has a short sequence of 16,000 base pairs, and mutates frequently, the Y chromosome is significantly longer at 60 million base pairs, and has a lower mutation rate. These features of the Y chromosome have slowed down the identification of its polymorphisms; as a consequence, they have reduced the accuracy of Y-chromosome mutation rate estimates.
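As a rough sketch of how a molecular clock turns mutation counts into a date, assume a strict clock in which mutations accumulate along a lineage at a constant rate per base pair per year. The function and the numbers below are hypothetical round values chosen for illustration; they are not taken from any of the studies discussed here.

```python
def clock_age_years(mutations: float, rate_per_bp_per_year: float, sites_bp: float) -> float:
    """Years needed to accumulate `mutations` along one lineage under a strict clock."""
    return mutations / (rate_per_bp_per_year * sites_bp)

# Hypothetical inputs for illustration only.
MUTATIONS_ON_LINEAGE = 2_000   # mutations counted from the root to a living male
RATE = 1e-9                    # mutations per base pair per year
SITES = 10_000_000             # base pairs compared

age = clock_age_years(MUTATIONS_ON_LINEAGE, RATE, SITES)
age_if_rate_doubled = clock_age_years(MUTATIONS_ON_LINEAGE, 2 * RATE, SITES)

print(round(age), round(age_if_rate_doubled))
```

With these inputs the estimate is roughly 200,000 years, and doubling the assumed rate halves it, which is why a revision of the mutation rate changes the estimated time when the Y-MRCA lived.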

Methods of estimating the age of the Y-MRCA for a population of human males whose Y-chromosomes have been sequenced are based on applying the theories of molecular evolution to the Y chromosome. Unlike the autosomes, the human Y-chromosome does not recombine often with the X chromosome during meiosis, but is usually transferred intact from father to son; however, it can recombine with the X chromosome in the pseudoautosomal regions at the ends of the Y chromosome. Mutations occur periodically within the Y chromosome, and these mutations are passed on to males in subsequent generations.

These mutations can be used as markers to identify shared patrilineal relationships. Y chromosomes that share a specific mutation are referred to as haplogroups. Y chromosomes within a specific haplogroup are assumed to share a common patrilineal ancestor who was the first to carry the defining mutation. (This assumption could be mistaken, as it is possible for the same mutation to occur more than once.) A family tree of Y chromosomes can be constructed, with the mutations serving as branching points along lineages. The Y-MRCA is positioned at the root of the family tree, as the Y chromosomes of all living males are descended from his Y chromosome.

Researchers can reconstruct ancestral Y chromosome DNA sequences by reversing mutated DNA segments to their original condition. The most likely original or ancestral state of a DNA sequence is determined by comparing human DNA sequences with those of a closely related species, usually non-human primates such as chimpanzees and gorillas. By reversing known mutations in a Y-chromosome lineage, a hypothetical ancestral sequence for the MRCA, Y-chromosomal Adam, can be inferred.
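The idea of using a related species to decide which state is ancestral can be sketched site by site. The code below is a simplified illustration, assuming aligned sequences of equal length and a single outgroup; the function name, the toy sequences, and the use of "N" for unresolved sites are assumptions, not a description of any published pipeline.

```python
def infer_ancestral(lineages: list[str], outgroup: str) -> str:
    """Infer an ancestral sequence from aligned human lineages and one outgroup."""
    ancestral = []
    for i, out_base in enumerate(outgroup):
        observed = {seq[i] for seq in lineages}
        if len(observed) == 1:
            # All sampled lineages agree, so their shared base is kept.
            ancestral.append(next(iter(observed)))
        elif out_base in observed:
            # Lineages disagree: the base shared with the outgroup is treated
            # as ancestral, and the other variants as later mutations.
            ancestral.append(out_base)
        else:
            # No variant matches the outgroup, so the site is left unresolved.
            ancestral.append("N")
    return "".join(ancestral)

# Hypothetical toy alignment: three human Y lineages and one outgroup sequence.
humans = ["ACGTA", "ACTTA", "GCGTA"]
outgroup = "ACGTC"

print(infer_ancestral(humans, outgroup))  # ACGTA
```

At the first and third sites the lineages disagree and the outgroup decides; at the last site all human lineages agree with one another, so their shared base is kept even though the outgroup differs.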

Determining the Y-MRCA's DNA sequence, and the time when he lived, involves identifying the human Y-chromosome lineages that are most divergent from each other—the lineages that share the fewest mutations with each other when compared to a non-human primate sequence in a phylogenetic tree. The common ancestor of the most divergent lineages is therefore the common ancestor of all lineages.

History of estimates

Early estimates of the age for the Y-MRCA published during the 1990s ranged between roughly 200 and 300 thousand years ago (kya). Such estimates were later substantially revised downward, as in Thomson et al. 2000, which proposed an age of about 59,000 years. This date suggested that the Y-MRCA lived about 84,000 years after his female counterpart mt-MRCA (the matrilineal most recent common ancestor), who lived 150,000–200,000 years ago. This date also meant that Y-chromosomal Adam lived at a time very close to, and possibly after, the migration from Africa which is believed to have taken place 50,000–80,000 years ago. One explanation given for this discrepancy in the time depths of patrilineal vs. matrilineal lineages was that females have a better chance of reproducing than males due to the practice of polygyny. When a male individual has several wives, he has effectively prevented other males in the community from reproducing and passing on their Y chromosomes to subsequent generations. On the other hand, polygyny does not prevent most females in a community from passing on their mitochondrial DNA to subsequent generations. This differential reproductive success of males and females can lead to fewer male lineages relative to female lineages persisting into the future. These fewer male lineages are more sensitive to drift and would most likely coalesce on a more recent common ancestor. This would potentially explain the more recent dates associated with the Y-MRCA.

The "hyper-recent" estimate of significantly below 100 kya was again corrected upward in studies of the early 2010s, which ranged at about 120 kya to 160 kya. This revision was due to the rearrangement of the backbone of the Y-chromosome phylogeny following the resequencing of Haplogroup A lineages. In 2013, Francalacci et al. reported the sequencing of male-specific single-nucleotide Y-chromosome polymorphisms (MSY-SNPs) from 1204 Sardinian males, which indicated an estimate of 180,000 to 200,000 years for the common origin of all humans through paternal lineage. Also in 2013, Poznik et al. reported the Y-MRCA to have lived between 120,000 and 156,000 years ago, based on genome sequencing of 69 men from 9 different populations. In addition, the same study estimated the age of Mitochondrial Eve at between 99,000 and 148,000 years. As these ranges overlap for a time-range of 28,000 years (148 to 120 kya), the results of this study have been cast in terms of the possibility that "Genetic Adam and Eve may have walked on Earth at the same time" in the popular press.
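The 28,000-year overlap quoted above follows directly from the two ranges reported by Poznik et al. A minimal check of the arithmetic, with hypothetical helper and variable names:

```python
def overlap(a, b):
    """Overlap of two closed intervals (start, end), or None if they are disjoint."""
    start, end = max(a[0], b[0]), min(a[1], b[1])
    return (start, end) if start <= end else None

# Ranges in years before present, as given in the text.
y_mrca_range = (120_000, 156_000)
mt_mrca_range = (99_000, 148_000)

shared = overlap(y_mrca_range, mt_mrca_range)
width = shared[1] - shared[0]

print(shared, width)  # (120000, 148000) 28000
```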

The announcement by Mendez et al. of the discovery of a previously unknown lineage, haplogroup A00, in 2013, resulted in another shift in the estimate for the age of Y-chromosomal Adam. The authors estimated the split from the other haplogroups at 338,000 years ago (95% confidence interval 237–581 kya), but later Elhaik et al. (2014) dated it to between 163,900 and 260,200 years ago (95% CI), and Karmin et al. (2015) dated it to between 192,000 and 307,000 years ago (95% CI). The same study reports that non-African populations converge to a cluster of Y-MRCAs in a window close to 50 kya (out-of-Africa migration), and an additional bottleneck for non-African populations at about 10 kya, interpreted as reflecting cultural changes increasing the variance in male reproductive success (i.e. increased social stratification) in the Neolithic.

Family tree

The revised root of the Y-chromosome family tree by Cruciani et al. 2011 compared with the family tree from Karafet et al. 2008. It is now known that there is a haplogroup (A00) outside of this scheme. The group designated A1b here is now called A0, and "A1b" is now used for what is here called A2-T.

Initial sequencing (Karafet et al., 2008) of the human Y chromosome suggested that the two most basal Y-chromosome lineages were Haplogroup A and Haplogroup BT. Haplogroup A is found at low frequencies in parts of Africa, but is common among certain hunter-gatherer groups. Haplogroup BT lineages represent the majority of African Y-chromosome lineages and virtually all non-African lineages. Y-chromosomal Adam was represented as the root of these two lineages. Haplogroup A and Haplogroup BT represented the lineages of Y-chromosomal Adam himself and of one of his sons, who had a new SNP.

Cruciani et al. 2011 determined that the deepest split in the Y-chromosome tree was found between two previously reported subclades of Haplogroup A, rather than between Haplogroup A and Haplogroup BT. Later, group A00 was found, outside of the previously known tree. The rearrangement of the Y-chromosome family tree implies that lineages classified as Haplogroup A do not necessarily form a monophyletic clade. Haplogroup A therefore refers to a collection of lineages that do not possess the markers that define Haplogroup BT, though Haplogroup A includes the most distantly related Y chromosomes.

The M91 and P97 mutations distinguish Haplogroup A from Haplogroup BT. Within Haplogroup A chromosomes, the M91 marker consists of a stretch of 8 T nucleobase units. In Haplogroup BT and chimpanzee chromosomes, this marker consists of 9 T nucleobase units. This pattern suggested that the 9T stretch of Haplogroup BT was the ancestral version and that Haplogroup A was formed by the deletion of one nucleobase. Haplogroups A1b and A1a were considered subclades of Haplogroup A as they both possessed the M91 with 8Ts.

But according to Cruciani et al. 2011, the region surrounding the M91 marker is a mutational hotspot prone to recurrent mutations. It is therefore possible that the 8T stretch of Haplogroup A may be the ancestral state of M91 and the 9T of Haplogroup BT may be the derived state that arose by an insertion of 1T. This would explain why subclades A1b and A1a-T, the deepest branches of Haplogroup A, both possess the same version of M91 with 8Ts. Furthermore, Cruciani et al. 2011 determined that the P97 marker, which is also used to identify Haplogroup A, possessed the ancestral state in Haplogroup A but the derived state in Haplogroup BT.

Likely geographic origin

As current estimates on TMRCA converge with estimates for the age of anatomically modern humans and well predate the Out of Africa migration, geographical origin hypotheses continue to be limited to the African continent.

According to Cruciani et al. 2011, the most basal lineages have been detected in West, Northwest and Central Africa, suggesting plausibility for the Y-MRCA living in the general region of "Central-Northwest Africa".

Scozzari et al. (2012) agreed with a plausible placement in "the north-western quadrant of the African continent" for the emergence of the A1b haplogroup. The 2013 report of haplogroup A00 found among the Mbo people of western present-day Cameroon is also compatible with this picture.

The revision of Y-chromosomal phylogeny since 2011 has affected estimates for the likely geographical origin of Y-MRCA as well as estimates on time depth. By the same reasoning, future discovery of presently-unknown archaic haplogroups in living people would again lead to such revisions. In particular, the possible presence of between 1% and 4% Neanderthal-derived DNA in Eurasian genomes implies that the (unlikely) event of a discovery of a single living Eurasian male exhibiting a Neanderthal patrilineal line would immediately push back T-MRCA ("time to MRCA") to at least twice its current estimate. However, the discovery of a Neanderthal Y-chromosome by Mendez et al. suggests the extinction of Neanderthal patrilineages, as the lineage inferred from the Neanderthal sequence is outside of the range of contemporary human genetic variation. Questions of geographical origin would become part of the debate on Neanderthal evolution from Homo erectus.

Exon shuffling

From Wikipedia, the free encyclopedia

Exon shuffling is a molecular mechanism for the formation of new genes. It is a process through which two or more exons from different genes can be brought together ectopically, or the same exon can be duplicated, to create a new exon-intron structure. There are different mechanisms through which exon shuffling occurs: transposon mediated exon shuffling, crossover during sexual recombination of parental genomes and illegitimate recombination.

Exon shuffling follows certain splice frame rules. Introns can interrupt the reading frame of a gene by inserting a sequence between two consecutive codons (phase 0 introns), between the first and second nucleotide of a codon (phase 1 introns), or between the second and third nucleotide of a codon (phase 2 introns). Additionally, exons can be classified into nine different groups based on the phase of the flanking introns (symmetrical: 0-0, 1-1, 2-2 and asymmetrical: 0–1, 0–2, 1–0, 1–2, etc.). Symmetric exons are the only ones that can be inserted into introns, undergo duplication, or be deleted without changing the reading frame.
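The splice frame rule can be illustrated with a short sketch. It assumes the usual convention that an exon's class is written as (phase of the upstream intron)-(phase of the downstream intron), so that the exon's length modulo 3 equals the difference between the two phases. The function names are illustrative only.

```python
from itertools import product

PHASES = (0, 1, 2)


def frame_shift(upstream_phase, downstream_phase):
    """Exon length modulo 3 implied by the phases of its flanking introns."""
    return (downstream_phase - upstream_phase) % 3


def is_symmetric(upstream_phase, downstream_phase):
    """A symmetric exon is flanked by two introns of the same phase."""
    return upstream_phase == downstream_phase


# All nine exon classes: every pairing of upstream and downstream intron phase.
classes = list(product(PHASES, repeat=2))

for up, down in classes:
    kind = "symmetric" if is_symmetric(up, down) else "asymmetric"
    print(f"class {up}-{down}: {kind}, shifts downstream frame by {frame_shift(up, down)}")
```

Only the three symmetric classes give a shift of 0, which is why inserting, duplicating, or deleting such an exon leaves the downstream reading frame unchanged.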

History

Exon shuffling was first introduced in 1978 when Walter Gilbert discovered that the existence of introns could play a major role in the evolution of proteins. It was noted that recombination within introns could help assort exons independently and that repetitive segments in the middle of introns could create hotspots for recombination to shuffle the exonic sequences. However, the presence of these introns in eukaryotes and absence in prokaryotes created a debate about the time in which these introns appeared. Two theories arose: the "introns early" theory and the "introns late" theory. Supporters of the "introns early" theory believed that introns and RNA splicing were the relics of the RNA world and therefore both prokaryotes and eukaryotes had introns in the beginning. However, prokaryotes eliminated their introns in order to obtain a higher efficiency, while eukaryotes retained the introns and the genetic plasticity of the ancestors. On the other hand, supporters of the "introns late" theory believe that prokaryotic genes resemble the ancestral genes and introns were inserted later in the genes of eukaryotes. What is clear now is that the eukaryotic exon-intron structure is not static: introns are continually inserted and removed from genes, and the evolution of introns proceeds in parallel with exon shuffling.

In order for exon shuffling to start to play a major role in protein evolution, spliceosomal introns first had to appear. This was because the self-splicing introns of the RNA world were unsuitable for exon shuffling by intronic recombination. These introns had an essential function and therefore could not be recombined. Additionally, there is strong evidence that spliceosomal introns evolved fairly recently and are restricted in their evolutionary distribution. Therefore, exon shuffling came to play a major role in the construction of younger proteins.

Moreover, to define more precisely the time when exon shuffling became significant in eukaryotes, the evolutionary distribution of modular proteins that evolved through this mechanism was examined in different organisms such as Escherichia coli, Saccharomyces cerevisiae, and Arabidopsis thaliana. These studies suggested that there was an inverse relationship between the genome compactness and the proportion of intronic and repetitive sequences, and that exon shuffling became significant after metazoan radiation.

Mechanisms

Crossover during sexual recombination of parental genomes

Evolution of eukaryotes is mediated by sexual recombination of parental genomes, and since introns are longer than exons, most of the crossovers occur in noncoding regions. In these introns there are large numbers of transposable elements and repeated sequences which promote recombination of nonhomologous genes. In addition, it has been shown that mosaic proteins are composed of mobile domains which have spread to different genes during evolution and which are capable of folding themselves.

The modularization hypothesis describes a mechanism for the formation and shuffling of these domains. This mechanism is divided into three stages. The first stage is the insertion of introns at positions that correspond to the boundaries of a protein domain. The second stage is when the "protomodule" undergoes tandem duplications by recombination within the inserted introns. The third stage is when one or more protomodules are transferred to a different nonhomologous gene by intronic recombination. All stages of modularization have been observed in different domains such as those of hemostatic proteins.

Transposon mediated

Long interspersed element (LINE)-1

A potential mechanism for exon shuffling is long interspersed element (LINE)-1-mediated 3' transduction. However, it is important first to understand what LINEs are. LINEs are a group of genetic elements that are found in abundant quantities in eukaryotic genomes. LINE-1 is the most common LINE found in humans. It is transcribed by RNA polymerase II to give an mRNA that codes for two proteins: ORF1 and ORF2, which are necessary for transposition.

Upon transposition, L1 associates with 3' flanking DNA and carries the non-L1 sequence to a new genomic location. This new location does not have to be in a homologous sequence or in close proximity to the donor DNA sequence. The donor DNA sequence remains unchanged throughout this process because it functions in a copy-paste manner via RNA intermediates; however, only those regions located in the 3' region of the L1 have been proven to be targeted for duplication.

Nevertheless, there is reason to believe that this may not hold true every time, as shown by the following example. The human ATM gene is responsible for the human autosomal-recessive disorder ataxia-telangiectasia and is located on chromosome 11. However, a partial ATM sequence is found in chromosome 7. Molecular features suggest that this duplication was mediated by L1 retrotransposition: the derived sequence was flanked by 15 bp target site duplications (TSD), the sequence around the 5' end matched the consensus sequence for the L1 endonuclease cleavage site, and a poly(A) tail preceded the 3' TSD. But since the L1 element was present in neither the retrotransposed segment nor the original sequence, the mobilization of the segment cannot be explained by 3' transduction. Additional information has led to the belief that trans-mobilization of the DNA sequence is another mechanism by which L1 shuffles exons, but more research on the subject must be done.

Helitron

Another mechanism through which exon shuffling occurs is by the usage of helitrons. Helitron transposons were first discovered during studies of repetitive DNA segments of rice, worm and the thale cress genomes. Helitrons have been identified in all eukaryotic kingdoms, but the number of copies varies from species to species.

Helitron-encoded proteins are composed of a rolling-circle (RC) replication initiator (Rep) and a DNA helicase (Hel) domain. The Rep domain is involved in the catalytic reactions for endonucleolytic cleavage, DNA transfer and ligation. In addition, this domain contains three motifs. The first motif is necessary for DNA binding. The second motif has two histidines and is involved in metal ion binding. Lastly, the third motif has two tyrosines and catalyzes DNA cleavage and ligation.

There are three models of gene capture by helitrons: the "read-through" model 1 (RTM1), the "read-through" model 2 (RTM2) and a filler DNA model (FDNA). According to the RTM1 model, an accidental "malfunction" of the replication terminator at the 3' end of the Helitron leads to transposition of genomic DNA. It is composed of the read-through Helitron element and its downstream genomic regions, flanked by a random DNA site, serving as a "de novo" RC terminator. According to the RTM2 model, the 3' terminus of another Helitron serves as an RC terminator of transposition. This occurs after a malfunction of the RC terminator. Lastly, in the FDNA model, portions of genes or non-coding regions can accidentally serve as templates during repair of double-stranded DNA breaks occurring in helitrons. Even though helitrons have been proven to be a very important evolutionary tool, the specific details of their mechanisms of transposition are yet to be defined.

An example of evolution by using helitrons is the diversity commonly found in maize. Helitrons in maize cause a constant change of genic and nongenic regions by using transposable elements, leading to diversity among different maize lines.

Long-terminal repeat (LTR) retrotransposons

Long-terminal repeat (LTR) retrotransposons are part of another mechanism through which exon shuffling takes place. They usually encode two open reading frames (ORF). The first ORF, named gag, is related to viral structural proteins. The second ORF, named pol, is a polyprotein composed of an aspartic protease (AP) which cleaves the polyprotein, an RNase H (RH) which splits the DNA-RNA hybrid, a reverse transcriptase (RT) which produces a cDNA copy of the transposon's RNA, and a DDE integrase which inserts cDNA into the host's genome. Additionally, LTR retrotransposons are classified into five subfamilies: Ty1/copia, Ty3/gypsy, Bel/Pao, retroviruses and endogenous retroviruses.

The LTR retrotransposons require an RNA intermediate in their transposition cycle mechanism. Retrotransposons synthesize a cDNA copy based on the RNA strand using a reverse transcriptase related to retroviral RT. The cDNA copy is then inserted into new genomic positions to form a retrogene. This mechanism has been proven to be important in gene evolution of rice and other grass species through exon shuffling.

Transposons with Terminal inverted repeats (TIRs)

DNA transposons with terminal inverted repeats (TIRs) can also contribute to gene shuffling. In plants, some non-autonomous elements called Pack-TYPE can capture gene fragments during their mobilization. This process appears to be mediated by acquisition of genic DNA residing between neighbouring Pack-TYPE transposons and its subsequent mobilization.

Illegitimate recombination

Lastly, illegitimate recombination (IR) is another of the mechanisms through which exon shuffling occurs. IR is the recombination between short homologous sequences or nonhomologous sequences.

There are two classes of IR. The first corresponds to errors of enzymes which cut and join DNA (i.e., DNases). This process is initiated by a replication protein which helps generate a primer for DNA synthesis. While one DNA strand is being synthesized, the other is being displaced. This process ends when the displaced strand is joined by its ends by the same replication protein. The second class of IR corresponds to the recombination of short homologous sequences which are not recognized by the previously mentioned enzymes. However, they can be recognized by non-specific enzymes which introduce cuts between the repeats. The ends are then removed by exonuclease to expose the repeats. Then the repeats anneal and the resulting molecule is repaired using polymerase and ligase.
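The net outcome of the second class of IR can be sketched at the level of a sequence string. This is a deliberately simplified toy model, not a simulation of the enzymes involved: it assumes two exact copies of a short direct repeat, and it assumes that annealing leaves a single copy of the repeat with the intervening sequence lost. The function name and the example sequence are hypothetical.

```python
def anneal_direct_repeats(seq, repeat):
    """Collapse the first two copies of `repeat` into one, dropping what lies between.

    Toy model only: returns `seq` unchanged if fewer than two copies are found.
    """
    first = seq.find(repeat)
    if first == -1:
        return seq
    second = seq.find(repeat, first + len(repeat))
    if second == -1:
        return seq
    # Keep everything up to the end of the first copy, then resume after the second copy.
    return seq[: first + len(repeat)] + seq[second + len(repeat):]


# Hypothetical sequence: flank, repeat, intervening segment, repeat, flank.
before = "AAAC" + "GATC" + "TTTTTT" + "GATC" + "CCCA"
after = anneal_direct_repeats(before, "GATC")
print(before)
print(after)
```

In this toy case the six-base intervening segment and one copy of the repeat disappear, which is how recombination between short repeats can bring previously separated sequences next to each other.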

Exon

From Wikipedia, the free encyclopedia
Introns are removed and exons joined in the process of RNA splicing. RNAs could be mRNA or non-coding RNA.

An exon is any part of a gene that will form a part of the final mature RNA produced by that gene after introns have been removed by RNA splicing. The term exon refers to both the DNA sequence within a gene and to the corresponding sequence in RNA transcripts. In RNA splicing, introns are removed and exons are covalently joined to one another as part of generating the mature RNA. Just as the entire set of genes for a species constitutes the genome, the entire set of exons constitutes the exome.
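As a minimal sketch of this definition, the mature RNA can be modelled as the concatenation of the exon intervals of a precursor, with everything else (the introns) discarded. The toy sequence, the coordinates, and the function name `splice` are assumptions made for illustration; coordinates are 0-based and end-exclusive.

```python
def splice(pre_rna, exons):
    """Join the exon intervals of `pre_rna` in order; all other bases are dropped."""
    return "".join(pre_rna[start:end] for start, end in sorted(exons))


# Hypothetical precursor: exon 1, one intron, exon 2.
exon1 = "AUGGCC"
intron = "GUAAGUUUUCAG"
exon2 = "GAUUAA"
pre_rna = exon1 + intron + exon2

# Exon coordinates within the precursor.
exons = [(0, 6), (18, 24)]

mature = splice(pre_rna, exons)
print(mature)
```

The same interval list describes the exons in the DNA and in the transcript, which matches the statement that the term refers to both.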

History

The term exon derives from the expressed region and was coined by American biochemist Walter Gilbert in 1978: "The notion of the cistron… must be replaced by that of a transcription unit containing regions which will be lost from the mature messenger – which I suggest we call introns (for intragenic regions) – alternating with regions which will be expressed – exons."

This definition was originally made for protein-coding transcripts that are spliced before being translated. The term later came to include sequences removed from rRNA and tRNA, and other ncRNA, and it was also used later for RNA molecules originating from different parts of the genome that are then ligated by trans-splicing.

Contribution to genomes and size distribution

Although unicellular eukaryotes such as yeast have either no introns or very few, metazoans and especially vertebrate genomes have a large fraction of non-coding DNA. For instance, in the human genome only 1.1% of the genome is spanned by exons, whereas 24% is in introns, with 75% of the genome being intergenic DNA. This can provide a practical advantage in omics-aided health care (such as precision medicine) because it makes commercialized whole exome sequencing a smaller and less expensive challenge than commercialized whole genome sequencing. The large variation in genome size and C-value across life forms has posed an interesting challenge called the C-value enigma.

Across all eukaryotic genes in GenBank, there were (in 2002), on average, 5.48 exons per protein coding gene. The average exon encoded 30-36 amino acids. While the longest exon in the human genome is 11555 bp long, several exons have been found to be only 2 bp long. A single-nucleotide exon has been reported from the Arabidopsis genome. In humans, like protein coding mRNA, most non-coding RNAs also contain multiple exons.

Structure and function

Exons in a messenger RNA precursor (pre-mRNA). Exons can include both sequences that code for amino acids (red) and untranslated sequences (grey). Introns — those parts of the pre-mRNA that are not in the mRNA — (blue) are removed, and the exons are joined (spliced) to form the final functional mRNA. The 5′ and 3′ ends of the mRNA are marked to differentiate the two untranslated regions (grey).

In protein-coding genes, the exons include both the protein-coding sequence and the 5′- and 3′-untranslated regions (UTR). Often the first exon includes both the 5′-UTR and the first part of the coding sequence, but exons containing only regions of 5′-UTR or (more rarely) 3′-UTR occur in some genes, i.e. the UTRs may contain introns. Some non-coding RNA transcripts also have exons and introns.

Mature mRNAs originating from the same gene need not include the same exons, since different introns in the pre-mRNA can be removed by the process of alternative splicing.
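One simple way to picture this is exon skipping, in which internal exons may be either kept or left out while the first and last exons are always retained. That is only one form of alternative splicing and is used here as a simplifying assumption. The exon labels and the function name are hypothetical.

```python
from itertools import combinations


def exon_skipping_isoforms(exons):
    """List every isoform that keeps the first and last exon and any subset of the rest.

    Assumes at least two exons; exon order is preserved in every isoform.
    """
    first, *internal, last = exons
    isoforms = []
    for kept_count in range(len(internal) + 1):
        for kept in combinations(internal, kept_count):
            isoforms.append([first, *kept, last])
    return isoforms


gene = ["E1", "E2", "E3", "E4"]
for isoform in exon_skipping_isoforms(gene):
    print("-".join(isoform))
```

With two internal exons this gives four possible isoforms, so even a small gene can in principle yield several distinct mature mRNAs under this model.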

Exonization is the creation of a new exon, as a result of mutations in introns.

Experimental approaches using exons

Exon trapping or 'gene trapping' is a molecular biology technique that exploits the existence of the intron-exon splicing to find new genes. The first exon of a 'trapped' gene splices into the exon that is contained in the insertional DNA. This new exon contains the ORF for a reporter gene that can now be expressed using the enhancers that control the target gene. A scientist knows that a new gene has been trapped when the reporter gene is expressed.

Splicing can be experimentally modified so that targeted exons are excluded from mature mRNA transcripts by blocking the access of splice-directing small nuclear ribonucleoprotein particles (snRNPs) to pre-mRNA using Morpholino antisense oligos. This has become a standard technique in developmental biology. Morpholino oligos can also be targeted to prevent molecules that regulate splicing (e.g. splice enhancers, splice suppressors) from binding to pre-mRNA, altering patterns of splicing.

Common misuse of the term

Common incorrect uses of the term exon are that 'exons code for protein', or 'exons code for amino-acids' or 'exons are translated'. However, these sorts of definitions only cover protein-coding genes, and omit those exons that become part of a non-coding RNA or the untranslated region of an mRNA. Such incorrect definitions still occur in overall reputable secondary sources.

Inequality (mathematics)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Inequality...