A Medley of Potpourri

Thursday, December 12, 2019

Richard Feynman on Artificial General Intelligence

Jørgen Veisdal
https://medium.com/cantors-paradise/richard-feynman-on-artificial-general-intelligence-2c1b9d8aae31

In a lecture held by Nobel Laureate Richard Feynman (1918–1988) on September 26th, 1985, the question of artificial general intelligence (also known as “strong-AI”) comes up.

Audience Question

Do you think there will ever be a machine that will think like human beings and be more intelligent than human beings?

Below is a structured transcript of Feynman’s verbatim response. With the advent of machine learning via artificial neural nets, it’s fascinating to hear Feynman’s thoughts on the subject and just how close he gets, even 35 years ago.

Estimated reading time is 8 minutes. Happy reading!

Richard Feynman’s Answer

"First of all, do they think like human beings? I would say no and I’ll explain in a minute why I say no. Second, for "whether they be more intelligent than human beings" to be a question, intelligences must first be defined. If you were to ask me are they better chess players than any human being? Possibly can be, yes, 'I'll get you, some day'."

In 1985, of course, human chess grandmasters were still stronger than machines. Not until the legendary six-game matches between world chess champion GM Garry Kasparov and the IBM supercomputer Deep Blue in 1996 and 1997 did a computer beat a reigning world chess champion. Even then, the score of the 1997 rematch was 3½ to 2½, and Kasparov ended up disputing the loss, claiming the IBM team had somehow intervened on behalf of the machine between games.

The AI Effect

“As soon as it works, no one calls it AI anymore” — John McCarthy
Feynman next addresses the so-called “AI effect”: once a programmed machine actually performs the task it was given, onlookers tend to discount the achievement, arguing that what the machine did is not “real” intelligence:

"They're better chess players than most human beings right now! One of the things, by the way we always do is we want the darn machine to be better than ANYBODY, not just better than us. If we find a machine that can play chess better than us it doesn't impress us much. We keep saying 'and what happens when it comes up against the masters?'. We imagine that we human beings are equivalent to the masters in everything, right? The machine has to be better in everything that the best person does at the best level. Okay, but that's hard on the machine."

On Building Artificial Machines

Feynman next addresses the question of mental models by analogy to the differences between a naturally evolved mode of locomotion (e.g. the running gait of a mammal with ligaments, tendons, joints and muscle) and mechanically designed modes of locomotion (using wheels, wings and/or propellers):

"With regard to the question of whether we can make it to think like [human beings], my opinion is based on the following idea: That we try to make these things work as efficiently as we can with the materials that we have. Materials are different than nerves, and so on. If we would like to make something that runs rapidly over the ground, then we could watch a cheetah running, and we could try to make a machine that runs like a cheetah. But, it's easier to make a machine with wheels. With fast wheels or something that flies just above the ground in the air. When we make a bird, the airplanes don't fly like a bird, they fly but they don't fly like a bird, okay? They don't flap their wings exactly, they have in front, another gadget that goes around, or the more modern airplane has a tube that you heat the air and squirt it out the back, a jet propulsion, a jet engine, has internal rotating fans and so on, and uses gasoline. It's different, right? So, there's no question that the later machines are not going to think like people think, in that sense. With regard to intelligence, I think it's exactly the same way, for example they're not going to do arithmetic the same way as we do arithmetic, but they'll do it better."

Superhuman Narrow AI

As an example of the superiority in performance of a mental task by a designed mechanical machinery versus a naturally evolved organ, Feynman next describes the differences between a superhuman narrow AI (such as e.g. a calculator) and the human brain:

"Let's take mathematics, very elementary mathematics. Arithmetic. They do arithmetic better than anybody. Much faster and differently, but it's fundamentally the same because in the end, the numbers are equivalent, right? So that's a good example of.. We're never going to change how they do arithmetic, to make it more like humans. That would be going backwards. Because, the arithmetic done by humans is slow, cumbersome, confused and full of errors. Where, these guys (machines) are fast. If one compares what computers can do, to the human beings, we find the following rather interesting comparisons. First of all, if I give you, a human being, a problem like this: I'm going to ask you for these numbers back, every other one, in reverse order, please. Right? I've got a series of numbers, and I want you to give them to me back, in reverse order, every other one. I'll tell you, I'll make it easy for you. Just give me the numbers back the way I gave them to you. You ready? 1, 7, 3, 9, 2, 6, 5, 8, 3, 1, 7, 2, 6, 3. Anybody gonna be able to do that? No. And that's not more than twenty or thirty numbers, but you can give a computer 50,000 numbers like that and ask it for any reverse order, the sum of them all, do different things with them, and so on. And it doesn't forget them for a long time. So there are some things a computer does much better than a human, and you'd better remember that if you're trying to compare machines to humans."
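The contrast is easy to make concrete. As a minimal sketch in Python (the variable names are illustrative, and "every other one, in reverse order" is read here as starting from the last number), each of the requests Feynman puts to the audience takes a single line:

```python
# The fourteen numbers Feynman reads out to the audience.
numbers = [1, 7, 3, 9, 2, 6, 5, 8, 3, 1, 7, 2, 6, 3]

same_order = list(numbers)            # "the way I gave them to you"
reverse_order = numbers[::-1]         # the whole series, reversed
every_other_reversed = numbers[::-2]  # every other one, starting from the end
total = sum(numbers)                  # "the sum of them all"

print(same_order)
print(reverse_order)
print(every_other_reversed)
print(total)
```

The same four lines work unchanged on a list of 50,000 numbers.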

The Problem of Pattern Recognition

In what follows, Feynman moves closer and closer to describing the problem later solved by supervised machine learning, namely pattern recognition from large data sets:

"But, what a human has to do for his own.. Always, they always do this. They always try to find one thing, darn-it that they can do better than the computer. So, we now know many, many things that humans can do better than a computer. She's walking down the street and she's got a certain kind of a wiggle, and you know that's Jane, right? Or, this guy is going and you see his hair flip just a little bit, it's hard to see, it's at a distance but the particular funny way that the back of his head looks, that's Jack, okay? To recognize things, to recognize patterns, seems to be something we have not been able to put into a definite procedure. You would say, 'I have a good procedure for recognizing Jack. Just take lots of pictures of Jack' --by the way, a picture can be put into the computer by this method here, if this were very much finer I could tell whether it's black and white at different spots. You know, you in fact get pictures in a newspaper by black and white dots and if you do it fine enough you can't see the dots. So, with enough information I can load pictures in so you put all the pictures of Jack under different circumstances, and there is a machine to compare it."

The Bias–Variance Tradeoff

Feynman moves on to essentially address the problem of variance in training data sets, and so implicitly also addresses the so-called bias–variance tradeoff. In statistics and machine learning, the bias–variance tradeoff is the property of a set of predictive models whereby models with a lower bias in parameter estimation have a higher variance of the parameter estimates across samples, and vice versa. The bias–variance dilemma describes the optimization problem whereby one tries to simultaneously minimize bias errors from erroneous assumptions in a learning algorithm and the variance from sensitivity to small fluctuations in the training set.

"The trouble is that the actual new circumstance is different. The lighting is different, the distance is different, the tilt of the head is different and you have to figure out how to allow for all that. It's so complicated and elaborate that even with the large machines with the amount of storage that's available and the speed that they go, we can't figure out how to make a definite procedure that works at all, or at least works anywhere within a reasonable speed. So, recognizing things is difficult for the machines at the present time, and some of those things that are done in a snap by a person.. So, there are things that humans can do that we don't know how to do in a filing system. It is recognition, and that brings me back to something I left which is what kind of a file clerk that has some special skill which requires recognition of a complicated kind. For instance a clerk in the fingerprint department which looks at the fingerprints and then makes a careful comparison to see if these fingerprints match, has not been.. It's just about ready to be.. It's hard to do, but almost possible to do it by a computer."
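The bias–variance tradeoff described above can be illustrated with a standard textbook case that is not taken from the lecture: estimating a mean from n noisy observations with the shrunken estimator c × (sample mean). The sketch below, in which the function name and all numbers are assumptions chosen for illustration, computes the squared bias, the variance, and their sum (the mean squared error) for several values of c:

```python
def shrinkage_mse(c, mu, sigma, n):
    """Squared bias, variance and MSE of the estimator c * (sample mean).

    mu and sigma are the true mean and standard deviation of the data,
    n is the number of independent observations.
    """
    bias = (c - 1.0) * mu                 # E[c * mean] - mu
    variance = (c ** 2) * sigma ** 2 / n  # Var(c * mean)
    return bias ** 2, variance, bias ** 2 + variance

# Hypothetical setting: true mean 2, standard deviation 3, five observations.
for c in (1.0, 0.8, 0.5, 0.0):
    bias_sq, variance, mse = shrinkage_mse(c, mu=2.0, sigma=3.0, n=5)
    print(f"c={c:.1f}  bias^2={bias_sq:.3f}  variance={variance:.3f}  mse={mse:.3f}")
```

With these numbers the unbiased choice c = 1 has the highest variance, c = 0 has no variance but the largest bias, and an intermediate value such as c = 0.8 gives a lower mean squared error than either extreme.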

The Current State of Artificial Intelligence (1985)

In his last comment, Feynman discusses the difficulties humans at the time still had with designing machines for the purposes of fingerprint matching:

"You'd think there's nothing to it, I look at the two fingerprints and see if all the blood dots are the same, but of course, it's not the case. The finger was dirty, the print was made at a different angle, the pressure was different, the ridges are not exactly in the same place. If you were trying to match exactly the same picture it would be easy, but where the center of the print is, which way the finger is turned, where there's been squashed a little more, a little bit less, where there's some dirt on the finger, whether in the meantime you got a wart on this thumb and so forth are all complications. These little complications make the comparison so much more difficult for the machine, for the "blind filing clerk system", that is too much. Too slow, certainly to be utterly impractical, almost, at the present time. I don't know where they stand but they're going fast trying to do it. Whereas a human can go across all of that somehow, just like they do in the chess game. They seem to be able to catch on to patterns rapidly and we don't know how to do that rapidly and automatically."
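Feynman's point about exact matching can be seen in a toy example. The sketch below is only an illustration (the 5×5 dot pattern and the function names are made up, and this is not how real fingerprint systems work): it compares two black-and-white dot images position by position, the way a "blind filing clerk" would.

```python
def matching_fraction(image_a, image_b):
    """Fraction of positions at which two equally sized dot images agree."""
    pairs = [
        (a, b)
        for row_a, row_b in zip(image_a, image_b)
        for a, b in zip(row_a, row_b)
    ]
    return sum(a == b for a, b in pairs) / len(pairs)

def shift_right(image):
    """Shift every row one dot to the right, filling the gap with white (0)."""
    return [[0] + row[:-1] for row in image]

# Hypothetical "print": 1 is a black dot, 0 a white one (vertical ridges).
print_on_file = [[1, 0, 1, 0, 1] for _ in range(5)]
same_print_moved = shift_right(print_on_file)

print(matching_fraction(print_on_file, print_on_file))     # identical picture
print(matching_fraction(print_on_file, same_print_moved))  # same ridges, moved one dot
```

An identical copy agrees at every position, while the same pattern moved by a single dot agrees at none, even though a person would call the two prints the same.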

Video

Video of Feynman’s full response is available at the link below:

https://youtu.be/ipRvjS7q1DI

This essay is part of a series of stories on math-related topics, published in Cantor’s Paradise, a weekly Medium publication. Thank you for reading!

Phylogenetic nomenclature

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Phylogenetic_nomenclature

Phylogenetic nomenclature, often called cladistic nomenclature, is a method of nomenclature for taxa in biology that uses phylogenetic definitions for taxon names as explained below. This contrasts with the traditional approach, in which taxon names are defined by a type, which can be a specimen or a taxon of lower rank, and a description in words. Phylogenetic nomenclature is currently not regulated, but the International Code of Phylogenetic Nomenclature (PhyloCode) is intended to regulate it once it is ratified.

Definitions

The clade shown by the dashed lines in each figure is specified by the ancestor X. Under the hypothesis that the relationships are as in the left tree, the clade includes X, A, B and C. Under the hypothesis that the relationships are as in the right tree, the clade includes X, A and B.

Phylogenetic nomenclature ties names to clades, groups consisting of an ancestor and all its descendants. These groups can equivalently be called monophyletic. There are slightly different ways of specifying the ancestor, which are discussed below. Once the ancestor is specified, the meaning of the name is fixed: the ancestor and all organisms which are its descendants are included in the named taxon. Listing all these organisms (i.e. providing a full circumscription) requires the full phylogenetic tree to be known. In practice, there are only one or more hypotheses as to the correct tree. Different hypotheses lead to different organisms being thought to be included in the named taxon, but do not affect what organisms the name actually applies to. In this sense the name is independent of theory revision.

Phylogenetic definitions of clade names

Phylogenetic nomenclature ties names to clades, groups consisting solely of an ancestor and all its descendants. All that is needed to specify a clade, therefore, is to designate the ancestor. There are a number of ways of doing this. Commonly, the ancestor is indicated by its relation to two or more specifiers (species, specimens, or traits) that are mentioned explicitly. The diagram shows three common ways of doing this. For previously defined clades A, B, and C, the clade X can be defined as:

The three most common ways to define the name of a clade: node-based, branch-based and apomorphy-based definition. The tree represents a phylogenetic hypothesis on the relations of A, B and C.

A node-based definition could read: "the last common ancestor of A and B, and all descendants of that ancestor". Thus, the entire line below the junction of A and B does not belong to the clade to which the name with this definition refers.

Example: The sauropod dinosaurs consist of the last common ancestor of Vulcanodon (A) and Apatosaurus (B) and all of that ancestor's descendants. This ancestor was the first sauropod. C could include other dinosaurs like Stegosaurus.

A branch-based definition, often called a stem-based definition, could read: "the first ancestor of A which is not also an ancestor of C, and all descendants of that ancestor". Thus, the entire line below the junction of A and B (other than the bottommost point) does belong to the clade to which the name with this definition refers.

Example: The rodents consist of the first ancestor of the house mouse (A) that is not also an ancestor of the eastern cottontail rabbit (C) together with all descendants of that ancestor. Here, the ancestor is the very first rodent. B is some other descendant, perhaps the red squirrel.

An apomorphy-based definition could read: "the first ancestor of A to possess trait M that is inherited by A, and all descendants of that ancestor". In the diagram, M evolves at the intersection of the horizontal line with the tree. Thus, the clade to which the name with this definition refers contains that part of the line below the last common ancestor of A and B which corresponds to ancestors possessing the apomorphy M. The lower part of the line is excluded. It is not required that B have trait M; it may have disappeared in the lineage leading to B.

Example: the tetrapods consist of the first ancestor of humans (A) from which humans inherited limbs with fingers or toes (M) and all descendants of that ancestor. These descendants include snakes (B), which do not have limbs.

Several other alternatives are provided in the PhyloCode (see below), though there is no attempt to be exhaustive.
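The three kinds of definition can be sketched on a toy tree. The following Python example is an illustration under stated assumptions rather than part of any code of nomenclature: the tree, the node labels (R, S1, S2, N) and the function names are hypothetical, and a real lineage is continuous rather than a chain of discrete nodes. S1 and S2 stand for ancestors on the line below the junction (N) of A and B, and trait M is assumed to appear first in S2 and to be lost in B.

```python
# Hypothetical tree, stored as child -> parent. R is the root.
PARENT = {"C": "R", "S1": "R", "S2": "S1", "N": "S2", "A": "N", "B": "N"}
HAS_TRAIT_M = {"S2", "N", "A"}  # B has lost the trait

def lineage(node):
    """Return the node followed by its ancestors, ending at the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def clade(ancestor):
    """Return the ancestor together with all of its descendants."""
    all_nodes = set(PARENT) | set(PARENT.values())
    return {n for n in all_nodes if ancestor in lineage(n)}

def node_based(a, b):
    """Last common ancestor of a and b, and all its descendants."""
    ancestors_of_b = set(lineage(b))
    return clade(next(n for n in lineage(a) if n in ancestors_of_b))

def branch_based(a, c):
    """First ancestor of a that is not also an ancestor of c, and all its descendants."""
    ancestors_of_c = set(lineage(c))
    outside = [n for n in lineage(a) if n not in ancestors_of_c]
    return clade(outside[-1])  # the oldest ancestor of a outside c's lineage

def apomorphy_based(a, has_trait):
    """First ancestor of a to possess the trait inherited by a, and all its descendants."""
    with_trait = []
    for n in lineage(a):
        if n not in has_trait:
            break  # stop at the first ancestor that lacks the trait
        with_trait.append(n)
    return clade(with_trait[-1])

print(sorted(node_based("A", "B")))
print(sorted(branch_based("A", "C")))
print(sorted(apomorphy_based("A", HAS_TRAIT_M)))
```

On this tree the node-based clade contains only N, A and B, the branch-based clade adds both S1 and S2, and the apomorphy-based clade adds S2 but not S1. B belongs to all three even though it lacks trait M.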

Phylogenetic nomenclature allows the use, not only of ancestral relations, but also of the property of being extant. One of the many ways of specifying the Neornithes (modern birds), for example, is:

The Neornithes consist of the last common ancestor of the extant members of the most inclusive clade containing the cockatoo Cacatua galerita but not the dinosaur Stegosaurus armatus as well as all descendants of that ancestor.

Neornithes is a crown clade, a clade for which the last common ancestor of its extant members is also the last common ancestor of all its members.

Node names

Crown node: Most recent common ancestor of the sampled species of the clade of interest
Stem node: Most recent common ancestor of the clade of interest and its sister clade

Ancestry-based definitions of the names of paraphyletic and polyphyletic taxa

In the PhyloCode, only a clade can receive a "phylogenetic definition", and this restriction is observed in the present article. However, it is also possible to create definitions for the names of other groups that are phylogenetic in the sense that they use only ancestral relations anchored on species or specimens. For example, assuming Mammalia and Aves (birds) are defined in this manner, Reptilia could be defined as "the most recent common ancestor of Mammalia and Aves and all its descendants except Mammalia and Aves". This is an example of a paraphyletic group, a clade minus one or more subordinate clades. Names of polyphyletic groups, characterized by a trait that evolved convergently in two or more subgroups, can similarly be defined as the sum of multiple clades.
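In terms of sets, a paraphyletic group is a clade with one or more subordinate clades subtracted, and a polyphyletic group is a union of clades. The sketch below illustrates this in Python on a deliberately simplified tree; the node labels and function names are assumptions, each leaf stands for a whole clade, and the warm-blooded grouping is added here only as a familiar example of a polyphyletic group.

```python
# Simplified, illustrative tree, stored as child -> parent.
PARENT = {
    "Mammalia": "amniote ancestor",
    "sauropsid ancestor": "amniote ancestor",
    "lizards": "sauropsid ancestor",
    "archosaur ancestor": "sauropsid ancestor",
    "crocodiles": "archosaur ancestor",
    "Aves": "archosaur ancestor",
}

def lineage(node):
    """Return the node followed by its ancestors, ending at the root."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def clade(ancestor):
    """Return the ancestor together with all of its descendants."""
    all_nodes = set(PARENT) | set(PARENT.values())
    return {n for n in all_nodes if ancestor in lineage(n)}

def mrca(a, b):
    """Most recent common ancestor of a and b."""
    ancestors_of_b = set(lineage(b))
    return next(n for n in lineage(a) if n in ancestors_of_b)

def paraphyletic(a, b, excluded):
    """Clade of the MRCA of a and b, minus the listed subordinate clades."""
    group = clade(mrca(a, b))
    for sub in excluded:
        group -= clade(sub)
    return group

# "Reptilia" as defined in the text: everything descended from the most
# recent common ancestor of Mammalia and Aves, except Mammalia and Aves.
reptilia = paraphyletic("Mammalia", "Aves", excluded=["Mammalia", "Aves"])

# A polyphyletic group as a sum of clades.
warm_blooded = clade("Mammalia") | clade("Aves")

print(sorted(reptilia))
print(sorted(warm_blooded))
```

Both groups are still specified purely by ancestry anchored on named taxa; what differs from a clade is only the subtraction or union applied afterwards.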

Ranks

Under the traditional nomenclature codes, such as the International Code of Zoological Nomenclature and the International Code of Nomenclature for algae, fungi, and plants, taxa that are not explicitly associated with a rank cannot be formally named, because the application of a name to a taxon is based on both a type and a rank. The requirement for a rank is a major difference between traditional and phylogenetic nomenclature. It has several consequences: it limits the number of nested levels at which names can be applied; it causes the endings of names to change if a group has its rank changed, even if it has precisely the same members (i.e. the same circumscription); and it is logically inconsistent with all taxa being monophyletic.

Especially in recent decades (due to advances in phylogenetics), taxonomists have named many "nested" taxa (i.e. taxa which are contained inside other taxa). No system of nomenclature attempts to name every clade; this would be particularly difficult in traditional nomenclature since every named taxon must be given a lower rank than any named taxon in which it is nested, so the number of names that can be assigned in a nested set of taxa can be no greater than the number of generally recognized ranks. Gauthier et al. (1988) suggested that, if Reptilia is assigned its traditional rank of class, then a phylogenetic classification has to assign the rank of genus to Aves. In such a classification, all ~12,000 known species of extant and extinct birds would then have to be incorporated into this genus.

Various solutions have been proposed while keeping the rank-based nomenclature codes. Patterson and Rosen (1977) suggested nine new ranks between family and superfamily in order to be able to classify a clade of herrings, and McKenna and Bell (1997) introduced a large array of new ranks in order to cope with the diversity of Mammalia; these have not been widely adopted. In botany, the Angiosperm Phylogeny Group, responsible for the currently most widely used classification of flowering plants, chose a different approach. They retained the traditional ranks of family and order, considering them to be of value in teaching and in studying relationships between taxa, but also introduced named clades without formal ranks.

The current codes also have rules stating that names must have certain endings depending on the rank of the taxa to which they are applied. When a group has a different rank in different classifications, its name must have a different suffix. Ereshefsky gave an example. He noted that Simpson in 1963 and Wiley in 1981 agreed that the same group of genera, which included the genus Homo, should be placed together in a taxon. Simpson treated this taxon as a family, and so gave it the name "Hominidae": "Homin-" from "Homo" and "-idae" as the family ending under the zoological code. Wiley considered it to be at the rank of tribe, and so gave it the name "Hominini", "-ini" being the tribe ending. Wiley's tribe Hominini formed only part of a family which he called "Hominidae". Thus, under the zoological code, two groups with precisely the same circumscription were given different names (Simpson's Hominidae and Wiley's Hominini) and two groups with the same name had different circumscriptions (Simpson's Hominidae and Wiley's Hominidae).
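The effect of rank-dependent endings can be sketched in a few lines of Python. The endings "-idae" and "-ini" are those quoted above; the remaining entries are the other family-group endings of the zoological code, added for completeness, and the function name is hypothetical.

```python
# Endings that the zoological code attaches to family-group names, by rank.
ZOOLOGICAL_ENDINGS = {
    "superfamily": "oidea",
    "family": "idae",
    "subfamily": "inae",
    "tribe": "ini",
    "subtribe": "ina",
}

def rank_based_name(stem, rank):
    """Form a family-group name from the stem of the type genus and a rank."""
    return stem + ZOOLOGICAL_ENDINGS[rank]

# The same group of genera, ranked differently by two authors.
simpson_name = rank_based_name("Homin", "family")
wiley_name = rank_based_name("Homin", "tribe")

print(simpson_name)
print(wiley_name)
```

The same stem yields "Hominidae" or "Hominini" depending only on the rank assigned, which is how Simpson and Wiley arrived at different names for the same group of genera.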

In phylogenetic nomenclature, ranks have no bearing on the spelling of taxon names (see e.g. Gauthier (1994) and the PhyloCode). Ranks are, however, not altogether forbidden in phylogenetic nomenclature. They are merely decoupled from nomenclature: they do not influence which names can be used, which taxa are associated with which names, and which names can refer to nested taxa.

The principles of traditional rank-based nomenclature are logically incompatible with all taxa being strictly monophyletic. Every organism must belong to a genus, for example, so there would have to be a genus for every common ancestor of the mammals and the birds. For such a genus to be monophyletic, it would have to include both the class Mammalia and the class Aves. In rank-based nomenclature, however, classes must include genera, not the other way around.

Philosophy

The conflict between phylogenetic and traditional nomenclature reflects differing views of the metaphysics of taxa. For the advocates of phylogenetic nomenclature, a taxon is an individual, an entity that gains and loses attributes as time passes. Just as a person does not become somebody else when his or her properties change through maturation, senility, or more radical changes like amnesia, the loss of a limb, or a change in sex, so a taxon remains the same entity whatever characteristics are gained or lost.

For any individual, there has to be something that connects its temporal stages in virtue of which it remains the same thing. For a person, the spatiotemporal continuity of the body provides the relevant connection; from infancy to old age, the body traces a continuous path through the world and it is this path, rather than any characteristics of the individual, that connects the baby and the octogenarian. For a taxon, if characteristics are not relevant, it can only be ancestral relations that connect the Devonian Rhyniognatha hirsti with the modern monarch butterfly as representatives, separated by 400 million years, of the taxon Insecta.

If ancestry is sufficient for the continuity of a taxon, however, then all descendants of a taxon member will also be included in the taxon, so all bona fide taxa are monophyletic; the names of paraphyletic groups do not merit formal recognition. As "Pelycosauria" refers to a paraphyletic group that includes some Permian tetrapods but not their extant descendants, it cannot be admitted as a valid taxon name.

To the adherent of traditional nomenclature, on the other hand, taxa are sets or classes. Unlike individuals, they are constituted by similarities, characteristics shared among their members. Monophyletic groups are particularly worthy of attention and naming primarily because they often share properties of interest. Since many paraphyletic groups also share such properties, plesiomorphies in their case, providing them with names is also conducive to productive research. Such naming is strongly defended by some scientists; in a 2005 letter to the editors of the journal Taxon, 150 biologists from around the world joined in defense of paraphyletic taxa. For Darwin, they pointed out, evolution involved descent and modification, not just descent. Taxa, for them, are sets of organisms united by similarity; when the similarity is too weak, descendants are not in all of their ancestors' taxa.

History

"Monophyletic phylogenetic tree of organisms".

Phylogenetic nomenclature is a result of the general acceptance of branching in the course of evolution, represented in the diagrams of Jean-Baptiste Lamarck and later writers like Charles Darwin and Ernst Haeckel. In 1866, Haeckel for the first time constructed a single tree of all life and immediately proceeded to translate it into a classification. This classification was rank-based, as was usual at the time, but did not contain taxa that Haeckel considered polyphyletic. In it, Haeckel introduced the rank of phylum which carries a connotation of monophyly in its name (literally meaning "stem").

Ever since, it has been debated in which ways and to what extent the phylogeny of life should be used as a basis for its classification, with views ranging from "numerical taxonomy" (phenetics) over "evolutionary taxonomy" (gradistics) to "phylogenetic systematics". From the 1960s onwards, rankless classifications were occasionally proposed, but in general the principles of rank-based nomenclature were used by all three schools of thought.

Most of the basic tenets of phylogenetic nomenclature (lack of obligatory ranks, and something close to phylogenetic definitions) can, however, be traced to 1916, when Edwin Goodrich interpreted the name Sauropsida, erected 40 years earlier by T. H. Huxley, to include the birds (Aves) as well as part of Reptilia, and coined the new name Theropsida to include the mammals as well as another part of Reptilia. Goodrich did not give them ranks, and treated them exactly as if they had phylogenetic definitions, using neither contents nor diagnostic characters to decide whether a given animal should belong to Theropsida, Sauropsida, or something else once its phylogenetic position was agreed upon. Goodrich also opined that the name Reptilia should be abandoned once the phylogeny of the reptiles was better known.

The principle that only clades should be formally named became popular in some circles in the second half of the 20th century. It spread together with the methods for discovering clades (cladistics) and is an integral part of phylogenetic systematics (see above). At the same time, it became apparent that the obligatory ranks that are part of the traditional systems of nomenclature produced problems. Some authors suggested abandoning them altogether, starting with Willi Hennig's abandonment of his earlier proposal to define ranks as geological age classes.

The first use of phylogenetic nomenclature in a publication can be dated to 1986. Theoretical papers outlining the principles of phylogenetic nomenclature, as well as further publications containing applications of phylogenetic nomenclature (mostly to vertebrates), soon followed (see Literature section).

In an attempt to avoid a schism in the biologist community, "Gauthier suggested to two members of the ICZN to apply formal taxonomic names ruled by the zoological code only to clades (at least for supraspecific taxa) and to abandon Linnean ranks, but these two members promptly rejected these ideas" (Laurin, 2008: 224). This led Kevin de Queiroz and the botanist Philip Cantino to start drafting their own code of nomenclature, the PhyloCode, for regulating phylogenetic nomenclature.

Controversy

Willi Hennig's pioneering work provoked a spirited debate about the relative merits of phylogenetic nomenclature versus Linnaean taxonomy, or the related approach of evolutionary taxonomy, which has continued down to the present. Some of the debates in which the cladists were engaged had been running since the 19th century. While Hennig insisted that different classification schemes were useful for different purposes, he gave primacy to his own, claiming that the categories of his system had "individuality and reality" in contrast to the "timeless abstractions" of morphology-based classifications.

Formal classifications based on cladistic reasoning are said to emphasize ancestry at the expense of descriptive characteristics. Nonetheless, most taxonomists today avoid paraphyletic groups whenever they think it is possible within Linnaean taxonomy; polyphyletic taxa have long fallen out of fashion.

The International Code of Phylogenetic Nomenclature

The ICPN, or PhyloCode, is a draft code of rules and recommendations for phylogenetic nomenclature.

The ICPN will only regulate clade names. Names for para- or polyphyletic taxa, and names for species (which may or may not be clades), will not be considered, at least not at first. This means that the regulation of species names will be left, for the time being, to the rank-based codes of nomenclature.
The Principle of Priority will be introduced for names and for definitions. The starting point for priority will be the publication date of the ICPN.
Definitions for existing names, and new names along with their definitions, will have to be published in peer-reviewed works (on or after the starting date) and will have to be registered in an online database in order to be valid.

The number of supporters for widespread adoption of the PhyloCode is still small, and it is uncertain (as of December 2019) when the code will be implemented and how widely it will be followed.

Evolutionary taxonomy

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Evolutionary_taxonomy

Evolutionary taxonomy, evolutionary systematics or Darwinian classification is a branch of biological classification that seeks to classify organisms using a combination of phylogenetic relationship (shared descent), progenitor-descendant relationship (serial descent), and degree of evolutionary change. This type of taxonomy may consider whole taxa rather than single species, so that groups of species can be inferred as giving rise to new groups. The concept found its most well-known form in the modern evolutionary synthesis of the early 1940s.

Evolutionary taxonomy differs from strict pre-Darwinian Linnaean taxonomy (producing orderly lists only), in that it builds evolutionary trees. While in phylogenetic nomenclature each taxon must consist of a single ancestral node and all its descendants, evolutionary taxonomy allows for groups to be excluded from their parent taxa (e.g. dinosaurs are not considered to include birds, but to have given rise to them), thus permitting paraphyletic taxa.

Origin of evolutionary taxonomy

Jean-Baptiste Lamarck's 1815 diagram showing branching in the course of invertebrate evolution

Evolutionary taxonomy arose as a result of the influence of the theory of evolution on Linnaean taxonomy. The idea of translating Linnaean taxonomy into a sort of dendrogram of the Animal and Plant Kingdoms was formulated toward the end of the 18th century, well before Charles Darwin's book On the Origin of Species was published. The first to suggest that organisms had common descent was Pierre-Louis Moreau de Maupertuis in his 1751 Essai de Cosmologie. Transmutation of species entered wider scientific circles with Erasmus Darwin's 1796 Zoönomia and Jean-Baptiste Lamarck's 1809 Philosophie Zoologique. The idea was popularised in the English-speaking world by the speculative but widely read Vestiges of the Natural History of Creation, published anonymously by Robert Chambers in 1844.

Following the appearance of On the Origin of Species, Tree of Life representations became popular in scientific works. In On the Origin of Species, the ancestor remained largely a hypothetical species; Darwin was primarily occupied with showing the principle, carefully refraining from speculating on relationships between living or fossil organisms and using theoretical examples only. In contrast, Chambers had proposed specific hypotheses, the evolution of placental mammals from marsupials, for example.

Following Darwin's publication, Thomas Henry Huxley used the fossils of Archaeopteryx and Hesperornis to argue that the birds are descendants of the dinosaurs. Thus, a group of extant animals could be tied to a fossil group. The resulting description, that of dinosaurs "giving rise to" or being "the ancestors of" birds, exhibits the essential hallmark of evolutionary taxonomic thinking.

The past three decades have seen a dramatic increase in the use of DNA sequences for reconstructing phylogeny and a parallel shift in emphasis from evolutionary taxonomy towards Hennig's 'phylogenetic systematics'.

Today, with the advent of modern genomics, scientists in every branch of biology make use of molecular phylogeny to guide their research. One common method is multiple sequence alignment.

Cavalier-Smith, G. G. Simpson and Ernst Mayr are some representative evolutionary taxonomists.

New methods in modern evolutionary systematics

Efforts in combining modern methods of cladistics, phylogenetics, and DNA analysis with classical views of taxonomy have recently appeared. Certain authors have found that phylogenetic analysis is acceptable scientifically as long as paraphyly at least for certain groups is allowable. Such a stance is promoted in papers by Tod F. Stuessy and others. A particularly strict form of evolutionary systematics has been presented by Richard H. Zander in a number of papers, but summarized in his "Framework for Post-Phylogenetic Systematics".

Briefly, Zander's pluralistic systematics is based on the incompleteness of each of the theories: A method that cannot falsify a hypothesis is as unscientific as a hypothesis that cannot be falsified. Cladistics generates only trees of shared ancestry, not serial ancestry. Taxa evolving seriatim cannot be dealt with by analyzing shared ancestry with cladistic methods. Hypotheses such as adaptive radiation from a single ancestral taxon cannot be falsified with cladistics. Cladistics offers a way to cluster by trait transformations but no evolutionary tree can be entirely dichotomous. Phylogenetics posits shared ancestral taxa as causal agents for dichotomies yet there is no evidence for the existence of such taxa. Molecular systematics uses DNA sequence data for tracking evolutionary changes, thus paraphyly and sometimes phylogenetic polyphyly signal ancestor-descendant transformations at the taxon level, but otherwise molecular phylogenetics makes no provision for extinct paraphyly. Additional transformational analysis is needed to infer serial descent.

Cladogram of the moss genus Didymodon showing taxon transformations. Colors denote dissilient groups.

The Besseyan cactus or commagram is the best evolutionary tree for showing both shared and serial ancestry. First, a cladogram or natural key is generated. Generalized ancestral taxa are identified and specialized descendant taxa are noted as coming off the lineage with a line of one color representing the progenitor through time. A Besseyan cactus or commagram is then devised that represents both shared and serial ancestry. Progenitor taxa may have one or more descendant taxa. Support measures in terms of Bayes factors may be given, following Zander's method of transformational analysis using decibans.

Cladistic analysis groups taxa by shared traits but incorporates a dichotomous branching model borrowed from phenetics. It is essentially a simplified dichotomous natural key, although reversals are tolerated. The problem, of course, is that evolution is not necessarily dichotomous. An ancestral taxon generating two or more descendants requires a longer, less parsimonious tree. A cladogram node summarizes all traits distal to it, not those of any one taxon, and continuity in a cladogram is from node to node, not taxon to taxon. This is not a model of evolution, but a variant of hierarchical cluster analysis (trait changes and non-ultrametric branches). This is why a tree based solely on shared traits is not called an evolutionary tree but merely a cladistic tree. This tree reflects to a large extent evolutionary relationships through trait transformations but ignores relationships made by species-level transformation of extant taxa.

A Besseyan cactus evolutionary tree of the moss genus Didymodon with generalized taxa in color and specialized descendants in white. Support measures are given in terms of Bayes factors, using deciban analysis of taxon transformation. Only two progenitors are considered unknown shared ancestors.

Phylogenetics attempts to inject a serial element by postulating ad hoc, undemonstrable shared ancestors at each node of a cladistic tree. A fully dichotomous cladogram has one fewer invisible shared ancestor than it has terminal taxa. We get, then, in effect a dichotomous natural key with an invisible shared ancestor generating each couplet. This cannot imply a process-based explanation without justification of the dichotomy, and supposition of the shared ancestors as causes. The cladistic form of analysis of evolutionary relationships cannot falsify any genuine evolutionary scenario incorporating serial transformation, according to Zander.
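The count of postulated shared ancestors can be checked on a small example. The following Python sketch represents a rooted, fully dichotomous cladogram as nested two-element tuples (a representation assumed here purely for illustration, with placeholder taxon names) and counts its internal nodes, each of which stands for one postulated shared ancestor.

```python
def count_nodes(tree):
    """Return (terminal_taxa, internal_nodes) for a nested-tuple tree.

    A terminal taxon is a string; an internal node is a tuple of
    exactly two subtrees (a fully dichotomous cladogram).
    """
    if isinstance(tree, str):
        return 1, 0
    left, right = tree
    left_tips, left_inner = count_nodes(left)
    right_tips, right_inner = count_nodes(right)
    # This node itself is one more postulated shared ancestor.
    return left_tips + right_tips, left_inner + right_inner + 1


# Hypothetical five-taxon cladogram; the names are placeholders.
cladogram = ((("A", "B"), "C"), ("D", "E"))
terminals, ancestors = count_nodes(cladogram)
print(terminals, ancestors)
```

For this five-taxon tree the function reports four internal nodes, one fewer than the number of terminal taxa.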

Zander has detailed methods for generating support measures for molecular serial descent and for morphological serial descent using Bayes factors and sequential Bayes analysis through Turing deciban or Shannon informational bit addition.
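The arithmetic behind deciban and bit addition is simple to show. A deciban is ten times the base-10 logarithm of a Bayes factor and a bit is its base-2 logarithm, so independent pieces of evidence combine by addition. The Python sketch below illustrates only that arithmetic; the Bayes factors are invented for the example, and the code is not a reproduction of Zander's procedure.

```python
import math


def decibans(bayes_factor):
    """Convert a Bayes factor to decibans (10 * log10)."""
    return 10.0 * math.log10(bayes_factor)


def bits(bayes_factor):
    """Convert a Bayes factor to informational bits (log2)."""
    return math.log2(bayes_factor)


def combined_decibans(bayes_factors):
    """Add the deciban support of independent pieces of evidence."""
    return sum(decibans(bf) for bf in bayes_factors)


# Hypothetical Bayes factors for three independent traits.
evidence = [10.0, 2.0, 5.0]
total = combined_decibans(evidence)
# The product of the factors is 100, so the sum is close to 20 decibans.
print(round(total, 6))
```

Adding logarithms is equivalent to multiplying the underlying Bayes factors, which is why the three factors above combine to about 20 decibans.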

The Tree of Life

Evolution of the vertebrates at class level, width of spindles indicating number of families. Spindle diagrams are often used in evolutionary taxonomy.

As more and more fossil groups were found and recognized in the late 19th and early 20th century, palaeontologists worked to understand the history of animals through the ages by linking together known groups. The Tree of life was slowly being mapped out, with fossil groups taking up their position in the tree as understanding increased.

These groups still retained their formal Linnaean taxonomic ranks. Some of them are paraphyletic in that, although every organism in the group is linked to a common ancestor by an unbroken chain of intermediate ancestors within the group, some other descendants of that ancestor lie outside the group. The evolution and distribution of the various taxa through time is commonly shown as a spindle diagram (often called a Romerogram after the American palaeontologist Alfred Romer) where various spindles branch off from each other, with each spindle representing a taxon. The width of the spindles is meant to imply the abundance (often number of families) plotted against time.

Vertebrate palaeontology had mapped out the evolutionary sequence of vertebrates as currently understood fairly well by the closing of the 19th century, followed by a reasonable understanding of the evolutionary sequence of the plant kingdom by the early 20th century. The tying together of the various trees into a grand Tree of Life only really became possible with advancements in microbiology and biochemistry in the period between the World Wars.

Terminological difference

The two approaches, evolutionary taxonomy and the phylogenetic systematics derived from Willi Hennig, differ in the use of the word "monophyletic". For evolutionary systematicists, "monophyletic" means only that a group is derived from a single common ancestor. In phylogenetic nomenclature, there is an added caveat that the ancestral species and all descendants should be included in the group. The term "holophyletic" has been proposed for the latter meaning. As an example, amphibians are monophyletic under evolutionary taxonomy, since they have arisen from fishes only once. Under phylogenetic taxonomy, amphibians do not constitute a monophyletic group in that the amniotes (reptiles, birds and mammals) have evolved from an amphibian ancestor and yet are not considered amphibians. Such paraphyletic groups are rejected in phylogenetic nomenclature, but are considered a signal of serial descent by evolutionary taxonomists.
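The difference between the two senses of "monophyletic" can be made concrete on a toy tree. The Python sketch below uses nested tuples and a deliberately simplified, hypothetical tetrapod topology (the tip names are placeholders, not a published phylogeny). It finds the smallest subtree containing a group and lists any descendants that the group leaves out; an empty result means the group is holophyletic.

```python
def leaves(tree):
    """Return the set of tip names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def smallest_subtree(tree, group):
    """Return the smallest subtree whose tips include every member of group."""
    children = () if isinstance(tree, str) else tree
    for child in children:
        if group <= leaves(child):
            return smallest_subtree(child, group)
    return tree


def excluded_descendants(tree, group):
    """Tips descended from the group's common ancestor but not in the group."""
    return leaves(smallest_subtree(tree, group)) - group


# Hypothetical, simplified topology for illustration only.
tetrapods = ("amphibian_lineage_1",
             ("amphibian_lineage_2", ("reptiles_and_birds", "mammals")))

amphibians = {"amphibian_lineage_1", "amphibian_lineage_2"}
amniotes = {"reptiles_and_birds", "mammals"}

print(excluded_descendants(tetrapods, amphibians))  # non-empty: paraphyletic
print(excluded_descendants(tetrapods, amniotes))    # empty: holophyletic
```

In this toy tree the amphibians share a single common ancestor, which satisfies the evolutionary taxonomist's sense of monophyly, but the amniotes descended from that ancestor are excluded, so the group fails the stricter phylogenetic sense.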

Phylogenetics (updated)

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Phylogenetics

In biology, phylogenetics /ˌfaɪloʊdʒəˈnɛtɪks, -lə-/ (Greek: φυλή, φῦλον – phylé, phylon = tribe, clan, race + γενετικός – genetikós = origin, source, birth) is the study of the evolutionary history and relationships among individuals or groups of organisms (e.g. species, or populations). These relationships are discovered through phylogenetic inference methods that evaluate observed heritable traits, such as DNA sequences or morphology under a model of evolution of these traits. The result of these analyses is a phylogeny (also known as a phylogenetic tree)—a diagrammatic hypothesis about the history of the evolutionary relationships of a group of organisms. The tips of a phylogenetic tree can be living organisms or fossils, and represent the 'end', or the present, in an evolutionary lineage. A phylogenetic tree can be rooted or unrooted. A rooted tree indicates the common ancestor, or ancestral lineage, of the tree. An unrooted tree makes no assumption about the ancestral line, and does not show the origin or "root" of the gene or organism in question. Phylogenetic analyses have become central to understanding biodiversity, evolution, ecology, and genomes.

Taxonomy is the identification, naming and classification of organisms. It is usually richly informed by phylogenetics, but remains a methodologically and logically distinct discipline. The degree to which taxonomies depend on phylogenies (or classification depends on evolutionary development) differs depending on the school of taxonomy: phenetics ignores phylogeny altogether, trying to represent the similarity between organisms instead; cladistics (phylogenetic systematics) tries to reproduce phylogeny in its classification without loss of information; evolutionary taxonomy tries to find a compromise between them.

Construction of a phylogenetic tree

Usual methods of phylogenetic inference involve computational approaches implementing the optimality criteria and methods of parsimony, maximum likelihood (ML), and MCMC-based Bayesian inference. All these depend upon an implicit or explicit mathematical model describing the evolution of characters observed.
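As one concrete instance of an optimality criterion, the Python sketch below scores a single character on a fixed rooted binary tree using the Fitch small-parsimony procedure (Fitch parsimony appears in the timeline further on). The trees, taxon names, and character states are hypothetical, and a real analysis would also search over tree topologies and many characters.

```python
def fitch(tree, states):
    """Return (candidate_states, changes) for one character.

    tree   : nested two-element tuples with tip names as strings
    states : dict mapping each tip name to its observed state
    """
    if isinstance(tree, str):
        return {states[tree]}, 0
    left, right = tree
    left_set, left_cost = fitch(left, states)
    right_set, right_cost = fitch(right, states)
    common = left_set & right_set
    if common:
        # The children agree on at least one state: no extra change.
        return common, left_cost + right_cost
    # No shared state: take the union and count one change.
    return left_set | right_set, left_cost + right_cost + 1


# Hypothetical nucleotide states at one site for four taxa.
site = {"w": "A", "x": "A", "y": "G", "z": "G"}
tree_1 = (("w", "x"), ("y", "z"))
tree_2 = (("w", "y"), ("x", "z"))

print(fitch(tree_1, site)[1])  # changes needed on tree_1
print(fitch(tree_2, site)[1])  # changes needed on tree_2
```

Under the parsimony criterion, the tree that requires fewer changes (here tree_1, with one change against two) is preferred for this character.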

Phenetics, popular in the mid-20th century but now largely obsolete, used distance matrix-based methods to construct trees based on overall similarity in morphology or similar observable traits (i.e. in the phenotype or the overall similarity of DNA, not the DNA sequence), which was often assumed to approximate phylogenetic relationships.
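A distance matrix-based method can be sketched in a few lines. The Python example below uses average-linkage clustering (UPGMA), which is one common distance method and is chosen here as an assumption, since the text does not name a specific algorithm. The taxa and distances are invented for illustration.

```python
from itertools import combinations


def upgma(names, dist):
    """Cluster taxa by average pairwise distance; return a nested-tuple tree.

    dist maps frozenset({a, b}) to the distance between taxa a and b.
    """
    # Each cluster is (subtree, tuple_of_member_names).
    clusters = [(name, (name,)) for name in names]

    def average_distance(c1, c2):
        pairs = [(a, b) for a in c1[1] for b in c2[1]]
        return sum(dist[frozenset(p)] for p in pairs) / len(pairs)

    while len(clusters) > 1:
        # Join the two clusters with the smallest average distance.
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: average_distance(*pair))
        clusters = [c for c in clusters if c is not c1 and c is not c2]
        clusters.append(((c1[0], c2[0]), c1[1] + c2[1]))
    return clusters[0][0]


# Hypothetical overall-similarity distances between four taxa.
d = {
    frozenset({"A", "B"}): 2.0,
    frozenset({"C", "D"}): 4.0,
    frozenset({"A", "C"}): 8.0,
    frozenset({"A", "D"}): 8.0,
    frozenset({"B", "C"}): 8.0,
    frozenset({"B", "D"}): 8.0,
}
print(upgma(["A", "B", "C", "D"], d))
```

The resulting tree groups taxa purely by overall similarity, which is the sense in which such trees were assumed to approximate, rather than directly infer, phylogenetic relationships.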

Prior to 1950, phylogenetic inferences were generally presented as narrative scenarios. Such methods are often ambiguous and lack explicit criteria for evaluating alternative hypotheses.

History

The term "phylogeny" derives from the German Phylogenie, introduced by Haeckel in 1866, and the Darwinian approach to classification became known as the "phyletic" approach.

Ernst Haeckel's recapitulation theory

During the late 19th century, Ernst Haeckel's recapitulation theory, or "biogenetic fundamental law", was widely accepted. It was often expressed as "ontogeny recapitulates phylogeny", i.e. the development of a single organism during its lifetime, from germ to adult, successively mirrors the adult stages of successive ancestors of the species to which it belongs. But this theory has long been rejected. Instead, ontogeny evolves – the phylogenetic history of a species cannot be read directly from its ontogeny, as Haeckel thought would be possible, but characters from ontogeny can be (and have been) used as data for phylogenetic analyses; the more closely related two species are, the more apomorphies their embryos share.

Timeline of key events

Branching tree diagram from Heinrich Georg Bronn's work (1858)

Phylogenetic tree suggested by Haeckel (1866)

14th century, lex parsimoniae (parsimony principle), William of Ockham, English philosopher, theologian, and Franciscan friar, but the idea actually goes back to Aristotle, precursor concept
1763, Bayesian probability, Rev. Thomas Bayes, precursor concept
18th century, Pierre Simon (Marquis de Laplace), perhaps first to use ML (maximum likelihood), precursor concept
1809, evolutionary theory, Philosophie Zoologique, Jean-Baptiste de Lamarck, precursor concept, foreshadowed in the 17th century and 18th century by Voltaire, Descartes, and Leibniz, with Leibniz even proposing evolutionary changes to account for observed gaps suggesting that many species had become extinct, others transformed, and different species that share common traits may have at one time been a single race, also foreshadowed by some early Greek philosophers such as Anaximander in the 6th century BC and the atomists of the 5th century BC, who proposed rudimentary theories of evolution
1837, Darwin's notebooks show an evolutionary tree
1843, distinction between homology and analogy (the latter now referred to as homoplasy), Richard Owen, precursor concept
1858, Paleontologist Heinrich Georg Bronn (1800–1862) published a hypothetical tree to illustrate the paleontological "arrival" of new, similar species following the extinction of an older species. Bronn did not propose a mechanism responsible for such phenomena, precursor concept.
1858, elaboration of evolutionary theory, Darwin and Wallace, also in Origin of Species by Darwin the following year, precursor concept
1866, Ernst Haeckel, first publishes his phylogeny-based evolutionary tree, precursor concept
1893, Dollo's Law of Character State Irreversibility, precursor concept
1912, ML recommended, analyzed, and popularized by Ronald Fisher, precursor concept
1921, Tillyard uses term "phylogenetic" and distinguishes between archaic and specialized characters in his classification system
1940, term "clade" coined by Lucien Cuénot
1949, Jackknife resampling, Maurice Quenouille (foreshadowed in '46 by Mahalanobis and extended in '58 by Tukey), precursor concept
1950, Willi Hennig's classic formalization
1952, William Wagner's groundplan divergence method
1953, "cladogenesis" coined
1960, "cladistic" coined by Cain and Harrison
1963, first attempt to use ML (maximum likelihood) for phylogenetics, Edwards and Cavalli-Sforza
1965
- Camin-Sokal parsimony, first parsimony (optimization) criterion and first computer program/algorithm for cladistic analysis both by Camin and Sokal
- character compatibility method, also called clique analysis, introduced independently by Camin and Sokal (loc. cit.) and E. O. Wilson
1966
- English translation of Hennig
- "cladistics" and "cladogram" coined (Webster's, loc. cit.)
1969
- dynamic and successive weighting, James Farris
- Wagner parsimony, Kluge and Farris
- CI (consistency index), Kluge and Farris
- introduction of pairwise compatibility for clique analysis, Le Quesne
1970, Wagner parsimony generalized by Farris
1971
- first successful application of ML to phylogenetics (for protein sequences), Neyman
- Fitch parsimony, Fitch
- NNI (nearest neighbour interchange), first branch-swapping search strategy, developed independently by Robinson and Moore et al.
- ME (minimum evolution), Kidd and Sgaramella-Zonta (it is unclear if this is the pairwise distance method or related to ML as Edwards and Cavalli-Sforza call ML "minimum evolution")
1972, Adams consensus, Adams
1976, prefix system for ranks, Farris
1977, Dollo parsimony, Farris
1979
- Nelson consensus, Nelson
- MAST (maximum agreement subtree), also called GAS (greatest agreement subtree), a consensus method, Gordon
- bootstrap, Bradley Efron, precursor concept
1980, PHYLIP, first software package for phylogenetic analysis, Felsenstein
1981
- majority consensus, Margush and MacMorris
- strict consensus, Sokal and Rohlf
- first computationally efficient ML algorithm, Felsenstein
1982
- PHYSIS, Mickevich and Farris
- branch and bound, Hendy and Penny
1985
- first cladistic analysis of eukaryotes based on combined phenotypic and genotypic evidence, Diana Lipscomb
- first issue of Cladistics
- first phylogenetic application of bootstrap, Felsenstein
- first phylogenetic application of jackknife, Scott Lanyon
1986, MacClade, Maddison and Maddison
1987, neighbor-joining method, Saitou and Nei
1988
- Hennig86 (version 1.5), Farris
- Bremer support (decay index), Bremer
1989
- RI (retention index), RCI (rescaled consistency index), Farris
- HER (homoplasy excess ratio), Archie
1990
- combinable components (semi-strict) consensus, Bremer
- SPR (subtree pruning and regrafting), TBR (tree bisection and reconnection), Swofford and Olsen
1991
- DDI (data decisiveness index), Goloboff
- first cladistic analysis of eukaryotes based only on phenotypic evidence, Lipscomb
1993, implied weighting, Goloboff
1994, reduced consensus: RCC (reduced cladistic consensus) for rooted trees, Wilkinson
1995, reduced consensus RPC (reduced partition consensus) for unrooted trees, Wilkinson
1996, first working methods for BI (Bayesian inference), independently developed by Li, by Mau, and by Rannala and Yang, all using MCMC (Markov chain Monte Carlo)
1998, TNT (Tree Analysis Using New Technology), Goloboff, Farris, and Nixon
1999, Winclada, Nixon
2003, symmetrical resampling, Goloboff

Clade

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Clade

Cladogram (family tree) of a biological group, showing the last common ancestor of the composite tree, which is the vertical line 'trunk' (stem) at the bottom, with all descendant branches shown above. The blue and red subgroups (at left and right) are clades, or monophyletic (complete) groups; each shows its common ancestor 'stem' at the bottom of the subgroup 'branch'. The green subgroup is not a clade; it is a paraphyletic group, which is an incomplete clade here because it excludes the blue branch even though it has also descended from the common ancestor stem at the bottom of the green branch. The green subgroup together with the blue one forms a clade again.

A clade (from Ancient Greek: κλάδος, klados, "branch"), also known as monophyletic group, is a group of organisms that consists of a common ancestor and all its lineal descendants, and represents a single "branch" on the "tree of life". Rather than the English term, the equivalent Latin term cladus (plural cladi) is often used in taxonomical literature.

The common ancestor may be an individual, a population, a species (extinct or extant), and so on right up to a kingdom and further. Clades are nested, one in another, as each branch in turn splits into smaller branches. These splits reflect evolutionary history as populations diverged and evolved independently. Clades are termed monophyletic (Greek: "one clan") groups.

Over the last few decades, the cladistic approach has revolutionized biological classification and revealed surprising evolutionary relationships among organisms. Increasingly, taxonomists try to avoid naming taxa that are not clades; that is, taxa that are not monophyletic. Some of the relationships between organisms that the molecular biology arm of cladistics has revealed are that fungi are closer relatives to animals than they are to plants, archaea are now considered different from bacteria, and multicellular organisms may have evolved from archaea.

Etymology

The term "clade" was coined in 1957 by the biologist Julian Huxley to refer to the result of cladogenesis, a concept Huxley borrowed from Bernhard Rensch.

Many commonly named groups, rodents and insects for example, are clades because, in each case, the group consists of a common ancestor with all its descendant branches. Rodents, for example, are a branch of mammals that split off after the end of the period when the clade Dinosauria stopped being the dominant terrestrial vertebrates 66 million years ago. The original population and all its descendants are a clade. The rodent clade corresponds to the order Rodentia, and insects to the class Insecta. These clades include smaller clades, such as chipmunk or ant, each of which consists of even smaller clades. The clade "rodent" is in turn included in the mammal, vertebrate and animal clades.

History of nomenclature and taxonomy

Early phylogenetic tree by Haeckel, 1866. Groups once thought to be more advanced, such as birds ("Aves"), are placed at the top.

The idea of a clade did not exist in pre-Darwinian Linnaean taxonomy, which was based by necessity only on internal or external morphological similarities between organisms – although as it happens, many of the better known animal groups in Linnaeus' original Systema Naturae (notably among the vertebrate groups) do represent clades. The phenomenon of convergent evolution is, however, responsible for many cases where there are misleading similarities in the morphology of groups that evolved from different lineages.

With the increasing realization in the first half of the 19th century that species had changed and split through the ages, classification increasingly came to be seen as branches on the evolutionary tree of life. The publication of Darwin's theory of evolution in 1859 gave this view increasing weight. Thomas Henry Huxley, an early advocate of evolutionary theory, proposed a revised taxonomy based on clades. For example, he grouped birds with reptiles, based on fossil evidence.

German biologist Emil Hans Willi Hennig (1913–1976) is considered to be the founder of cladistics. He proposed a classification system that represented repeated branchings of the family tree, as opposed to the previous systems, which put organisms on a "ladder", with supposedly more "advanced" organisms at the top.

Taxonomists have increasingly worked to make the taxonomic system reflect evolution. When it comes to naming, however, this principle is not always compatible with the traditional rank-based nomenclature. In the latter, only taxa associated with a rank can be named, yet there are not enough ranks to name a long series of nested clades. For these and other reasons, phylogenetic nomenclature has been developed; it is still controversial.

Definitions

Gavialidae, Crocodylidae and Alligatoridae are clade names that are here applied to a phylogenetic tree of crocodylians.

A clade is by definition monophyletic, meaning that it contains one ancestor (which can be an organism, a population, or a species) and all its descendants. The ancestor can be known or unknown; any and all members of a clade can be extant or extinct.

Clades and phylogenetic trees

The science that tries to reconstruct phylogenetic trees and thus discover clades is called phylogenetics or cladistics, the latter term coined by Ernst Mayr (1965), derived from "clade". The results of phylogenetic/cladistic analyses are tree-shaped diagrams called cladograms; they, and all their branches, are phylogenetic hypotheses.

Three methods of defining clades are featured in phylogenetic nomenclature: node-, stem-, and apomorphy-based.

Terminology

Cladogram of modern primate groups; all tarsiers are haplorhines, but not all haplorhines are tarsiers; all apes are catarrhines, but not all catarrhines are apes; etc.

The relationship between clades can be described in several ways:

A clade located within a clade is said to be nested within that clade. In the diagram, the hominoid clade, i.e. the apes and humans, is nested within the primate clade.
Two clades are sisters if they have an immediate common ancestor. In the diagram, lemurs and lorises are sister clades, while humans and tarsiers are not.
A clade A is basal to a clade B if A branches off the lineage leading to B before the first branch leading only to members of B. In the diagram, the strepsirrhine/prosimian clade is basal to the hominoid/ape clade. However, in this example, both the haplorhines and the prosimians should be considered the most basal groupings. It is better to say that the prosimians are the sister group to the rest of the primates. This also avoids unintended and mistaken connotations about evolutionary advancement, complexity, diversity, ancestor status, and antiquity, e.g. due to the impact of sampling diversity and extinction. Basal clades should not be confused with stem groupings, as the latter are associated with paraphyletic or unresolved groupings.
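The sister relationship described above can be tested mechanically on a tree. The Python sketch below encodes a simplified primate topology as nested tuples; because the diagram itself is not reproduced here, the topology and tip names are assumptions made for illustration. Two clades are treated as sisters when they are the two children of the same node.

```python
def leaves(tree):
    """Return the set of tip names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    return set().union(*(leaves(child) for child in tree))


def are_sisters(tree, group_a, group_b):
    """True if the two tip sets are the children of one node in the tree."""
    if isinstance(tree, str):
        return False
    child_sets = [leaves(child) for child in tree]
    if group_a != group_b and group_a in child_sets and group_b in child_sets:
        return True
    return any(are_sisters(child, group_a, group_b) for child in tree)


# Assumed, simplified primate topology (tip names are illustrative).
primates = (("lemurs", "lorises"),
            ("tarsiers",
             ("new_world_monkeys", ("old_world_monkeys", "hominoids"))))

strepsirrhines = {"lemurs", "lorises"}
haplorhines = leaves(primates) - strepsirrhines

print(are_sisters(primates, {"lemurs"}, {"lorises"}))      # sister clades
print(are_sisters(primates, {"hominoids"}, {"tarsiers"}))  # not sisters
print(are_sisters(primates, strepsirrhines, haplorhines))  # sister groups
```

Describing the two deepest branches as sister groups, as in the last call, avoids singling out either one as the more "basal" lineage.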

In popular culture

Clade is the title of a novel by James Bradley, who chose it both because of its biological meaning and also because of the larger implications of the word.

An episode of Elementary was titled "Dead Clade Walking" and dealt with a case involving a rare fossil.

