Thursday, May 17, 2018

Genomics

From Wikipedia, the free encyclopedia

Genomics is an interdisciplinary field of science focusing on the structure, function, evolution, mapping, and editing of genomes. A genome is an organism's complete set of DNA, including all of its genes. In contrast to genetics, which refers to the study of individual genes and their roles in inheritance, genomics aims at the collective characterization and quantification of genes, which direct the production of proteins with the assistance of enzymes and messenger molecules. In turn, proteins make up body structures such as organs and tissues as well as control chemical reactions and carry signals between cells. Genomics also involves the sequencing and analysis of genomes through uses of high throughput DNA sequencing and bioinformatics to assemble and analyze the function and structure of entire genomes.[1][2][3] Advances in genomics have triggered a revolution in discovery-based research and systems biology to facilitate understanding of even the most complex biological systems such as the brain.[4]

The field also includes studies of intragenomic (within the genome) phenomena such as epistasis (effect of one gene on another), pleiotropy (one gene affecting more than one trait), heterosis (hybrid vigour), and other interactions between loci and alleles within the genome.[5]

History

Etymology

From the Greek ΓΕΝ[6] gen, "gene" (gamma, epsilon, nu) meaning "become, create, creation, birth", and subsequent variants: genealogy, genesis, genetics, genic, genomere, genotype, genus etc. While the word genome (from the German Genom, attributed to Hans Winkler) was in use in English as early as 1926,[7] the term genomics was coined by Tom Roderick, a geneticist at the Jackson Laboratory (Bar Harbor, Maine), over beer at a meeting held in Maryland on the mapping of the human genome in 1986.[8]

Early sequencing efforts

Following Rosalind Franklin's confirmation of the helical structure of DNA, James D. Watson and Francis Crick's publication of the structure of DNA in 1953 and Fred Sanger's publication of the amino acid sequence of insulin in 1955, nucleic acid sequencing became a major target of early molecular biologists.[9] In 1964, Robert W. Holley and colleagues published the first nucleic acid sequence ever determined, the ribonucleotide sequence of alanine transfer RNA.[10][11] Extending this work, Marshall Nirenberg and Philip Leder revealed the triplet nature of the genetic code and were able to determine the sequences of 54 out of 64 codons in their experiments.[12] In 1972, Walter Fiers and his team at the Laboratory of Molecular Biology of the University of Ghent (Ghent, Belgium) were the first to determine the sequence of a gene: the gene for bacteriophage MS2 coat protein.[13] Fiers' group expanded on their MS2 coat protein work, determining the complete nucleotide sequence of bacteriophage MS2 RNA (whose genome encodes just four genes in 3569 base pairs [bp]) and Simian virus 40 in 1976 and 1978, respectively.[14][15]

DNA-sequencing technology developed

Frederick Sanger
Walter Gilbert
Frederick Sanger and Walter Gilbert shared half of the 1980 Nobel Prize in chemistry for independently developing methods for the sequencing of DNA.

In addition to his seminal work on the amino acid sequence of insulin, Frederick Sanger and his colleagues played a key role in the development of DNA sequencing techniques that enabled the establishment of comprehensive genome sequencing projects.[5] In 1975, he and Alan Coulson published a sequencing procedure using DNA polymerase with radiolabelled nucleotides that he called the Plus and Minus technique.[16][17] This involved two closely related methods that generated short oligonucleotides with defined 3' termini. These could be fractionated by electrophoresis on a polyacrylamide gel (called polyacrylamide gel electrophoresis) and visualised using autoradiography. The procedure could sequence up to 80 nucleotides in one go and was a big improvement, but was still very laborious. Nevertheless, in 1977 his group was able to sequence most of the 5,386 nucleotides of the single-stranded bacteriophage φX174, completing the first fully sequenced DNA-based genome.[18] The refinement of the Plus and Minus method resulted in the chain-termination, or Sanger method (see below), which formed the basis of the techniques of DNA sequencing, genome mapping, data storage, and bioinformatic analysis most widely used in the following quarter-century of research.[19][20] In the same year Walter Gilbert and Allan Maxam of Harvard University independently developed the Maxam-Gilbert method (also known as the chemical method) of DNA sequencing, involving the preferential cleavage of DNA at known bases, a less efficient method.[21][22] For their groundbreaking work in the sequencing of nucleic acids, Gilbert and Sanger shared half the 1980 Nobel Prize in chemistry with Paul Berg (recombinant DNA).

Complete genomes

The advent of these technologies resulted in a rapid intensification in the scope and speed of completion of genome sequencing projects. The first complete genome sequence of a eukaryotic organelle, the human mitochondrion (16,568 bp, about 16.6 kb [kilobase]), was reported in 1981,[23] and the first chloroplast genomes followed in 1986.[24][25] In 1992, the first eukaryotic chromosome, chromosome III of brewer's yeast Saccharomyces cerevisiae (315 kb), was sequenced.[26] The first free-living organism to be sequenced was Haemophilus influenzae (1.8 Mb [megabase]) in 1995.[27] The following year a consortium of researchers from laboratories across North America, Europe, and Japan announced the completion of the first complete genome sequence of a eukaryote, S. cerevisiae (12.1 Mb), and since then genomes have continued being sequenced at an exponentially growing pace.[28] As of October 2011, complete sequences are available for: 2,719 viruses, 1,115 archaea and bacteria, and 36 eukaryotes, of which about half are fungi.[29][30]

"Hockey stick" graph showing the exponential growth of public sequence databases.
The number of genome projects has increased as technological improvements continue to lower the cost of sequencing. (A) Exponential growth of genome sequence databases since 1995. (B) The cost in US Dollars (USD) to sequence one million bases. (C) The cost in USD to sequence a 3,000 Mb (human-sized) genome on a log-transformed scale.

Most of the microorganisms whose genomes have been completely sequenced are problematic pathogens, such as Haemophilus influenzae, which has resulted in a pronounced bias in their phylogenetic distribution compared to the breadth of microbial diversity.[31][32] Of the other sequenced species, most were chosen because they were well-studied model organisms or promised to become good models. Yeast (Saccharomyces cerevisiae) has long been an important model organism for the eukaryotic cell, while the fruit fly Drosophila melanogaster has been a very important tool (notably in early pre-molecular genetics). The worm Caenorhabditis elegans is an often used simple model for multicellular organisms. The zebrafish Brachydanio rerio is used for many developmental studies on the molecular level, and the plant Arabidopsis thaliana is a model organism for flowering plants. The Japanese pufferfish (Takifugu rubripes) and the spotted green pufferfish (Tetraodon nigroviridis) are interesting because of their small and compact genomes, which contain very little noncoding DNA compared to most species.[33][34] The mammals dog (Canis familiaris),[35] brown rat (Rattus norvegicus), mouse (Mus musculus), and chimpanzee (Pan troglodytes) are all important model animals in medical research.[22]

A rough draft of the human genome was completed by the Human Genome Project in early 2001, creating much fanfare.[36] This project, completed in 2003, sequenced the entire genome for one specific person, and by 2007 this sequence was declared "finished" (less than one error in 20,000 bases and all chromosomes assembled).[36] In the years since then, the genomes of many other individuals have been sequenced, partly under the auspices of the 1000 Genomes Project, which announced the sequencing of 1,092 genomes in October 2012.[37] Completion of this project was made possible by the development of dramatically more efficient sequencing technologies and required the commitment of significant bioinformatics resources from a large international collaboration.[38] The continued analysis of human genomic data has profound political and social repercussions for human societies.[39]

The "omics" revolution

The English-language neologism omics informally refers to a field of study in biology ending in -omics, such as genomics, proteomics or metabolomics. The related suffix -ome is used to address the objects of study of such fields, such as the genome, proteome or metabolome respectively. The suffix -ome as used in molecular biology refers to a totality of some sort; similarly, omics has come to refer generally to the study of large, comprehensive biological data sets. While the growth in the use of the term has led some scientists (Jonathan Eisen, among others[40]) to claim that it has been oversold,[41] it reflects the change in orientation towards the quantitative analysis of the complete or near-complete assortment of all the constituents of a system.[42] In the study of symbioses, for example, researchers who were once limited to the study of a single gene product can now simultaneously compare the total complement of several types of biological molecules.[43][44]

Genome analysis

After an organism has been selected, genome projects involve three components: the sequencing of DNA, the assembly of that sequence to create a representation of the original chromosome, and the annotation and analysis of that representation.[5]

Overview of a genome project. First, the genome must be selected, which involves several factors including cost and relevance. Second, the sequence is generated and assembled at a given sequencing center (such as BGI or DOE JGI). Third, the genome sequence is annotated at several levels: DNA, protein, gene pathways, or comparatively.

Sequencing

Historically, sequencing was done in sequencing centers, centralized facilities (ranging from large independent institutions such as Joint Genome Institute which sequence dozens of terabases a year, to local molecular biology core facilities) which contain research laboratories with the costly instrumentation and technical support necessary. As sequencing technology continues to improve, however, a new generation of effective fast turnaround benchtop sequencers has come within reach of the average academic laboratory.[45][46] On the whole, genome sequencing approaches fall into two broad categories, shotgun and high-throughput (or next-generation) sequencing.[5]

Shotgun sequencing

An ABI PRISM 3100 Genetic Analyzer. Such capillary sequencers automated early large-scale genome sequencing efforts.

Shotgun sequencing is a sequencing method designed for analysis of DNA sequences longer than 1000 base pairs, up to and including entire chromosomes.[47] It is named by analogy with the rapidly expanding, quasi-random firing pattern of a shotgun. Since gel electrophoresis sequencing can only be used for fairly short sequences (100 to 1000 base pairs), longer DNA sequences must be broken into random small segments which are then sequenced to obtain reads. Multiple overlapping reads for the target DNA are obtained by performing several rounds of this fragmentation and sequencing. Computer programs then use the overlapping ends of different reads to assemble them into a continuous sequence.[47][48] Shotgun sequencing is a random sampling process, requiring over-sampling to ensure a given nucleotide is represented in the reconstructed sequence; the average number of reads by which a genome is over-sampled is referred to as coverage.[49]
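The relationship between over-sampling and coverage described above is captured by the classic Lander–Waterman model. The sketch below illustrates the arithmetic; the read count, read length, and genome size are hypothetical project numbers, not data from any real sequencing run:

```python
import math

def coverage(num_reads, read_length, genome_size):
    """Average (redundant) coverage: total bases sequenced / genome size."""
    return num_reads * read_length / genome_size

def fraction_uncovered(c):
    """Lander-Waterman estimate of the fraction of the genome left
    uncovered at coverage c, assuming reads start at uniformly
    random positions (the 'random sampling' assumption in the text)."""
    return math.exp(-c)

# Hypothetical project: 100,000 reads of 500 bp against a 5 Mb genome.
c = coverage(100_000, 500, 5_000_000)        # 10x coverage
print(f"{c:.0f}x coverage, ~{fraction_uncovered(c):.1e} of bases missed")
```

At 10x coverage the expected uncovered fraction is e^(-10), on the order of a few dozen bases in a 5 Mb genome, which is why shotgun projects deliberately over-sample.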

For much of its history, the technology underlying shotgun sequencing was the classical chain-termination method or 'Sanger method', which is based on the selective incorporation of chain-terminating dideoxynucleotides by DNA polymerase during in vitro DNA replication.[18][50] Recently, shotgun sequencing has been supplanted by high-throughput sequencing methods, especially for large-scale, automated genome analyses. However, the Sanger method remains in wide use, primarily for smaller-scale projects and for obtaining especially long contiguous DNA sequence reads (>500 nucleotides).[51] Chain-termination methods require a single-stranded DNA template, a DNA primer, a DNA polymerase, normal deoxynucleoside triphosphates (dNTPs), and modified dideoxynucleotide triphosphates (ddNTPs) that terminate DNA strand elongation. These chain-terminating nucleotides lack a 3'-OH group required for the formation of a phosphodiester bond between two nucleotides, causing DNA polymerase to cease extension of DNA when a ddNTP is incorporated. The ddNTPs may be radioactively or fluorescently labelled for detection in DNA sequencers.[5] Typically, these machines can sequence up to 96 DNA samples in a single batch (run) in up to 48 runs a day.[52]

High-throughput sequencing

The high demand for low-cost sequencing has driven the development of high-throughput sequencing technologies that parallelize the sequencing process, producing thousands or millions of sequences at once.[53][54] High-throughput sequencing is intended to lower the cost of DNA sequencing beyond what is possible with standard dye-terminator methods. In ultra-high-throughput sequencing, as many as 500,000 sequencing-by-synthesis operations may be run in parallel.[55][56]

Illumina Genome Analyzer II System. Illumina technologies have set the standard for high-throughput massively parallel sequencing.[45]

The Illumina dye sequencing method is based on reversible dye-terminators and was developed in 1996 at the Geneva Biomedical Research Institute, by Pascal Mayer and Laurent Farinelli.[57] In this method, DNA molecules and primers are first attached on a slide and amplified with polymerase so that local clonal colonies, initially coined "DNA colonies", are formed. To determine the sequence, four types of reversible terminator bases (RT-bases) are added and non-incorporated nucleotides are washed away. Unlike pyrosequencing, the DNA chains are extended one nucleotide at a time and image acquisition can be performed at a delayed moment, allowing for very large arrays of DNA colonies to be captured by sequential images taken from a single camera. Decoupling the enzymatic reaction and the image capture allows for optimal throughput and theoretically unlimited sequencing capacity; with an optimal configuration, the ultimate throughput of the instrument depends only on the A/D conversion rate of the camera. The camera takes images of the fluorescently labeled nucleotides, then the dye along with the terminal 3' blocker is chemically removed from the DNA, allowing the next cycle.[58]

An alternative approach, ion semiconductor sequencing, is based on standard DNA replication chemistry. This technology measures the release of a hydrogen ion each time a base is incorporated. A microwell containing template DNA is flooded with a single nucleotide; if the nucleotide is complementary to the template strand, it is incorporated and a hydrogen ion is released. This release triggers an ISFET ion sensor. If a homopolymer is present in the template sequence, multiple nucleotides will be incorporated in a single flood cycle, and the detected electrical signal will be proportionally higher.[59]
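The flood-cycle and homopolymer behaviour can be illustrated with a toy simulation. This is a minimal sketch, not the chemistry or flow order of any real instrument; the four-base flow order and example template are assumptions for illustration:

```python
def ion_torrent_signals(template, flow_order="TACG", cycles=1):
    """Toy model of ion semiconductor sequencing: each flow floods the
    well with one nucleotide species; the signal is proportional to the
    number of bases incorporated, so a homopolymer run in the template
    produces a proportionally stronger signal in a single flood cycle."""
    # The synthesized strand is complementary to the template, so a
    # flowed nucleotide is incorporated wherever it pairs with the
    # next unsequenced template base.
    pair = {"A": "T", "T": "A", "C": "G", "G": "C"}
    pos, signals = 0, []
    for _ in range(cycles):
        for nt in flow_order:
            n = 0
            while pos < len(template) and pair[template[pos]] == nt:
                pos += 1
                n += 1          # one H+ released per incorporation
            signals.append((nt, n))
    return signals

# Template "AAAT": flooding with T pairs with the AAA homopolymer,
# giving a signal three times as strong as a single incorporation.
print(ion_torrent_signals("AAAT"))
```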

Assembly

Overlapping reads form contigs; contigs and gaps of known length form scaffolds.

Paired-end reads of next-generation sequencing data mapped to a reference genome.

Multiple, fragmented sequence reads must be assembled together on the basis of their overlapping areas.

Sequence assembly refers to aligning and merging fragments of a much longer DNA sequence in order to reconstruct the original sequence.[5] This is needed as current DNA sequencing technology cannot read whole genomes as a continuous sequence, but rather reads small pieces of between 20 and 1000 bases, depending on the technology used. Third-generation sequencing technologies such as PacBio or Oxford Nanopore routinely generate sequencing reads longer than 10 kb; however, they have a high error rate, at approximately 15 percent.[60][61] Typically the short fragments, called reads, result from shotgun sequencing genomic DNA, or gene transcripts (ESTs).[5]
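The merging of reads by their overlaps can be sketched with a naive greedy assembler. This is an illustration of the principle only, with made-up reads; it ignores sequencing errors, repeats, and reverse-complement reads that real assemblers must handle:

```python
def overlap(a, b, min_len=3):
    """Length of the longest suffix of read a that is a prefix of read b."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0

def greedy_assemble(reads):
    """Repeatedly merge the pair of reads with the largest overlap
    until no overlaps of at least min_len remain."""
    reads = list(reads)
    while len(reads) > 1:
        best = (0, None, None)
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j and (k := overlap(a, b)) > best[0]:
                    best = (k, i, j)
        k, i, j = best
        if k == 0:
            break                      # no overlaps left: separate contigs
        merged = reads[i] + reads[j][k:]
        reads = [r for n, r in enumerate(reads) if n not in (i, j)] + [merged]
    return reads

# Three overlapping reads reconstruct one contiguous sequence.
print(greedy_assemble(["ATTAGACCTG", "CCTGCCGGAA", "AGACCTGCCG"]))
```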

Assembly approaches

Assembly can be broadly categorized into two approaches: de novo assembly, for genomes which are not similar to any sequenced in the past, and comparative assembly, which uses the existing sequence of a closely related organism as a reference during assembly.[49] Relative to comparative assembly, de novo assembly is computationally difficult (NP-hard), making it less favorable for short-read NGS technologies. Within the de novo assembly paradigm there are two primary strategies: Eulerian path strategies and overlap-layout-consensus (OLC) strategies. OLC strategies ultimately try to create a Hamiltonian path through an overlap graph, which is an NP-hard problem. Eulerian path strategies are computationally more tractable because they try to find an Eulerian path through a de Bruijn graph.[49]
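The Eulerian-path strategy can be sketched in a few lines: k-mers become edges of a de Bruijn graph on (k-1)-mer nodes, and walking every edge exactly once (Hierholzer's algorithm) spells out the sequence. The reads and k below are toy values chosen so a unique path exists; real data with errors and repeats makes the graph far messier:

```python
from collections import defaultdict

def de_bruijn_assemble(reads, k=4):
    """Toy Eulerian-path assembly: nodes are (k-1)-mers, each k-mer is a
    directed edge; an Eulerian path through the graph reconstructs the
    original sequence."""
    graph = defaultdict(list)   # (k-1)-mer -> list of successor (k-1)-mers
    indeg = defaultdict(int)
    for read in reads:
        for i in range(len(read) - k + 1):
            kmer = read[i:i + k]
            graph[kmer[:-1]].append(kmer[1:])
            indeg[kmer[1:]] += 1
    # The path starts at a node whose out-degree exceeds its in-degree.
    start = next((n for n in list(graph) if len(graph[n]) > indeg[n]),
                 next(iter(graph)))
    # Hierholzer's algorithm: walk edges, backtracking when stuck.
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if graph[node]:
            stack.append(graph[node].pop())
        else:
            path.append(stack.pop())
    path.reverse()
    return path[0] + "".join(n[-1] for n in path[1:])

# Two overlapping reads reconstruct the 10 bp source sequence.
print(de_bruijn_assemble(["ATGGCGT", "CGTGCA"], k=4))
```

The practical appeal, as the text notes, is that finding an Eulerian path is polynomial-time, whereas the Hamiltonian path sought by OLC assemblers is NP-hard.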

Finishing

Finished genomes are defined as having a single contiguous sequence with no ambiguities representing each replicon.[62]

Annotation

The DNA sequence assembly alone is of little value without additional analysis.[5] Genome annotation is the process of attaching biological information to sequences, and consists of three main steps:[63]
  1. identifying portions of the genome that do not code for proteins
  2. identifying elements on the genome, a process called gene prediction, and
  3. attaching biological information to these elements.
Automatic annotation tools try to perform these steps in silico, as opposed to manual annotation (a.k.a. curation) which involves human expertise and potential experimental verification.[64] Ideally, these approaches co-exist and complement each other in the same annotation pipeline (also see below).
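The gene-prediction step can be illustrated with the simplest possible in silico approach: scanning reading frames for open reading frames (ORFs). This is a deliberately naive sketch with a made-up sequence; real gene predictors also scan the reverse strand and use statistical models, splice sites, and homology evidence:

```python
def find_orfs(seq, min_codons=2):
    """Naive gene prediction: scan the three forward reading frames for
    open reading frames (ATG ... in-frame stop codon)."""
    stops = {"TAA", "TAG", "TGA"}
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon == "ATG" and start is None:
                start = i               # first in-frame start codon
            elif codon in stops and start is not None:
                if (i - start) // 3 >= min_codons:
                    orfs.append(seq[start:i + 3])
                start = None            # look for the next ORF
    return orfs

# A toy 9 bp sequence containing one complete ORF (start, one codon, stop).
print(find_orfs("ATGAAATAG"))
```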

Traditionally, the basic level of annotation has been to use BLAST to find similarities, and then to annotate genomes on the basis of homologues.[5] More recently, additional information is added to the annotation platform. The additional information allows manual annotators to deconvolute discrepancies between genes that are given the same annotation. Some databases use genome context information, similarity scores, experimental data, and integrations of other resources to provide genome annotations through their Subsystems approach. Other databases (e.g. Ensembl) rely on both curated data sources as well as a range of software tools in their automated genome annotation pipeline.[65] Structural annotation consists of the identification of genomic elements, primarily ORFs and their localisation, or gene structure. Functional annotation consists of attaching biological information to genomic elements.

Sequencing pipelines and databases

The need for reproducibility and efficient management of the large amount of data associated with genome projects means that computational pipelines have important applications in genomics.[66]

Research areas

Functional genomics

Functional genomics is a field of molecular biology that attempts to make use of the vast wealth of data produced by genomic projects (such as genome sequencing projects) to describe gene (and protein) functions and interactions. Functional genomics focuses on the dynamic aspects such as gene transcription, translation, and protein–protein interactions, as opposed to the static aspects of the genomic information such as DNA sequence or structures. Functional genomics attempts to answer questions about the function of DNA at the levels of genes, RNA transcripts, and protein products. A key characteristic of functional genomics studies is their genome-wide approach to these questions, generally involving high-throughput methods rather than a more traditional “gene-by-gene” approach.
A major branch of genomics is still concerned with sequencing the genomes of various organisms, but the knowledge of full genomes has created the possibility for the field of functional genomics, mainly concerned with patterns of gene expression during various conditions. The most important tools here are microarrays and bioinformatics.

Structural genomics

An example of a protein structure determined by the Midwest Center for Structural Genomics.

Structural genomics seeks to describe the 3-dimensional structure of every protein encoded by a given genome.[67][68] This genome-based approach allows for a high-throughput method of structure determination by a combination of experimental and modeling approaches. The principal difference between structural genomics and traditional structural prediction is that structural genomics attempts to determine the structure of every protein encoded by the genome, rather than focusing on one particular protein. With full-genome sequences available, structure prediction can be done more quickly through a combination of experimental and modeling approaches, especially because the availability of large numbers of sequenced genomes and previously solved protein structures allow scientists to model protein structure on the structures of previously solved homologs. Structural genomics involves taking a large number of approaches to structure determination, including experimental methods using genomic sequences or modeling-based approaches based on sequence or structural homology to a protein of known structure or based on chemical and physical principles for a protein with no homology to any known structure. As opposed to traditional structural biology, the determination of a protein structure through a structural genomics effort often (but not always) comes before anything is known regarding the protein function. This raises new challenges in structural bioinformatics, i.e. determining protein function from its 3D structure.[69]

Epigenomics

Epigenomics is the study of the complete set of epigenetic modifications on the genetic material of a cell, known as the epigenome.[70] Epigenetic modifications are reversible modifications on a cell’s DNA or histones that affect gene expression without altering the DNA sequence (Russell 2010 p. 475). Two of the most characterized epigenetic modifications are DNA methylation and histone modification. Epigenetic modifications play an important role in gene expression and regulation, and are involved in numerous cellular processes such as in differentiation/development and tumorigenesis.[70] The study of epigenetics on a global level has been made possible only recently through the adaptation of genomic high-throughput assays.[71]

Metagenomics

Environmental Shotgun Sequencing (ESS) is a key technique in metagenomics. (A) Sampling from habitat; (B) filtering particles, typically by size; (C) Lysis and DNA extraction; (D) cloning and library construction; (E) sequencing the clones; (F) sequence assembly into contigs and scaffolds.

Metagenomics is the study of metagenomes, genetic material recovered directly from environmental samples. The broad field may also be referred to as environmental genomics, ecogenomics or community genomics. While traditional microbiology and microbial genome sequencing rely upon cultivated clonal cultures, early environmental gene sequencing cloned specific genes (often the 16S rRNA gene) to produce a profile of diversity in a natural sample. Such work revealed that the vast majority of microbial biodiversity had been missed by cultivation-based methods.[72] Recent studies use "shotgun" Sanger sequencing or massively parallel pyrosequencing to get largely unbiased samples of all genes from all the members of the sampled communities.[73] Because of its power to reveal the previously hidden diversity of microscopic life, metagenomics offers a powerful lens for viewing the microbial world that has the potential to revolutionize understanding of the entire living world.[74][75]

Model systems

Viruses and bacteriophages

Bacteriophages have played and continue to play a key role in bacterial genetics and molecular biology. Historically, they were used to define gene structure and gene regulation. Indeed, the first genome to be sequenced was that of a bacteriophage. However, bacteriophage research did not lead the genomics revolution, which is clearly dominated by bacterial genomics. Only very recently has the study of bacteriophage genomes become prominent, thereby enabling researchers to understand the mechanisms underlying phage evolution. Bacteriophage genome sequences can be obtained through direct sequencing of isolated bacteriophages, but can also be derived as part of microbial genomes. Analysis of bacterial genomes has shown that a substantial amount of microbial DNA consists of prophage sequences and prophage-like elements.[76] A detailed database mining of these sequences offers insights into the role of prophages in shaping the bacterial genome.[77][78]

Cyanobacteria

At present there are 24 cyanobacteria for which a total genome sequence is available. Fifteen of these cyanobacteria come from the marine environment: six Prochlorococcus strains, seven marine Synechococcus strains, Trichodesmium erythraeum IMS101 and Crocosphaera watsonii WH8501. Several studies have demonstrated how these sequences could be used very successfully to infer important ecological and physiological characteristics of marine cyanobacteria. However, many more genome projects are currently in progress, among them further Prochlorococcus and marine Synechococcus isolates, Acaryochloris and Prochloron, the N2-fixing filamentous cyanobacteria Nodularia spumigena, Lyngbya aestuarii and Lyngbya majuscula, as well as bacteriophages infecting marine cyanobacteria. Thus, the growing body of genome information can also be tapped in a more general way to address global problems by applying a comparative approach. Some new and exciting examples of progress in this field are the identification of genes for regulatory RNAs, insights into the evolutionary origin of photosynthesis, and estimation of the contribution of horizontal gene transfer to the genomes that have been analyzed.[79]

Applications of genomics

Genomics has provided applications in many fields, including medicine, biotechnology, anthropology and other social sciences.[39]

Genomic medicine

Next-generation genomic technologies allow clinicians and biomedical researchers to drastically increase the amount of genomic data collected on large study populations.[80] When combined with new informatics approaches that integrate many kinds of data with genomic data in disease research, this allows researchers to better understand the genetic bases of drug response and disease.[81][82]

Synthetic biology and bioengineering

The growth of genomic knowledge has enabled increasingly sophisticated applications of synthetic biology.[83] In 2010 researchers at the J. Craig Venter Institute announced the creation of a partially synthetic species of bacterium, Mycoplasma laboratorium, derived from the genome of Mycoplasma genitalium.[84]

Conservation genomics

Conservationists can use the information gathered by genomic sequencing in order to better evaluate genetic factors key to species conservation, such as the genetic diversity of a population or whether an individual is heterozygous for a recessive inherited genetic disorder.[85] By using genomic data to evaluate the effects of evolutionary processes and to detect patterns in variation throughout a given population, conservationists can formulate plans to aid a given species without as many variables left unknown as those unaddressed by standard genetic approaches.[86]

Nature versus nurture

The nature versus nurture debate involves whether human behaviour is determined by the environment, either prenatal or during a person's life, or by a person's genes. The alliterative expression "nature and nurture" in English has been in use since at least the Elizabethan period[1] and goes back to medieval French.[2] The combination of the two concepts as complementary is ancient (Greek: ἀπό φύσεως καὶ εὐτροφίας[3]). Nature is what we think of as pre-wiring and is influenced by genetic inheritance and other biological factors. Nurture is generally taken as the influence of external factors after conception, e.g. the product of exposure, experience and learning on an individual.[4]

The phrase in its modern sense was popularized by the English Victorian polymath Francis Galton, the modern founder of eugenics and behavioral genetics, discussing the influence of heredity and environment on social advancement.[5][6][7] Galton was influenced by the book On the Origin of Species written by his half-cousin, Charles Darwin.

The view that humans acquire all or almost all their behavioral traits from "nurture" was termed tabula rasa ("blank slate") by John Locke in 1690. A "blank slate view" in human developmental psychology, assuming that human behavioral traits develop almost exclusively from environmental influences, was widely held during much of the 20th century (sometimes termed "blank-slatism"). The debate between "blank-slate" denial of the influence of heritability, and the view admitting both environmental and heritable traits, has often been cast in terms of nature versus nurture. These two conflicting approaches to human development were at the core of an ideological dispute over research agendas throughout the second half of the 20th century. As both "nature" and "nurture" factors were found to contribute substantially, often in an inextricable manner, such views were seen as naive or outdated by most scholars of human development by the 2000s.[8][9][10][11][12][13]

The strong dichotomy of nature versus nurture has thus been claimed to have limited relevance in some fields of research. Close feedback loops have been found in which "nature" and "nurture" influence one another constantly, as seen in self-domestication. In ecology and behavioral genetics, researchers think nurture has an essential influence on nature.[14][15] Similarly in other fields, the dividing line between an inherited and an acquired trait becomes unclear, as in epigenetics[16] or fetal development.[17][18]

History of the debate

John Locke's An Essay Concerning Human Understanding (1690) is often cited as the foundational document of the "blank slate" view. Locke was criticizing René Descartes' claim of an innate idea of God universal to humanity. Locke's view was harshly criticized in his own time. Anthony Ashley-Cooper, 3rd Earl of Shaftesbury, complained that by denying the possibility of any innate ideas, Locke "threw all order and virtue out of the world", leading to total moral relativism. Locke's was not the predominant view in the 19th century, which on the contrary tended to focus on "instinct". Leda Cosmides and John Tooby noted that William James (1842–1910) argued that humans have more instincts than animals, and that greater freedom of action is the result of having more psychological instincts, not fewer.[19]

The question of "innate ideas" or "instincts" was of some importance in the discussion of free will in moral philosophy. In 18th-century philosophy, this was cast in terms of "innate ideas" establishing the presence of a universal virtue, prerequisite for objective morals. In the 20th century, this argument was in a way inverted, as some philosophers now argued that the evolutionary origins of human behavioral traits force us to concede that there is no foundation for ethics (J. L. Mackie), while others treat ethics as a field in complete isolation from evolutionary considerations (Thomas Nagel).[20]

In the early 20th century, there was an increased interest in the role of the environment, as a reaction to the strong focus on pure heredity in the wake of the triumphal success of Darwin's theory of evolution.[21]

During this time, the social sciences developed as the project of studying the influence of culture in clean isolation from questions related to "biology". Franz Boas's The Mind of Primitive Man (1911) established a program that would dominate American anthropology for the next fifteen years. In this study he established that in any given population, biology, language, and material and symbolic culture are autonomous; that each is an equally important dimension of human nature; but that no one of these dimensions is reducible to another.

The tool of twin studies was developed as a research design intended to exclude all confounders based on inherited behavioral traits.[22] Such studies are designed to decompose the variability of a given trait in a given population into a genetic and an environmental component.

John B. Watson in the 1920s and 1930s established the school of purist behaviorism that would become dominant over the following decades. Watson was convinced of the complete dominance of cultural influence over anything that heredity might contribute, to the point of claiming
"Give me a dozen healthy infants, well-formed, and my own specified world to bring them up in and I'll guarantee to take any one at random and train him to become any type of specialist I might select – doctor, lawyer, artist, merchant-chief and, yes, even beggar-man and thief, regardless of his talents, penchants, tendencies, abilities, vocations, and race of his ancestors" (Behaviorism, 1930, p. 82).
During the 1940s to 1960s, Ashley Montagu was a notable proponent of this purist form of behaviorism which allowed no contribution from heredity whatsoever:
"Man is man because he has no instincts, because everything he is and has become he has learned, acquired, from his culture ... with the exception of the instinctoid reactions in infants to sudden withdrawals of support and to sudden loud noises, the human being is entirely instinctless."[23]
In 1951, Calvin Hall[24] suggested that the dichotomy opposing nature to nurture is ultimately fruitless.

Robert Ardrey in the 1960s argued for innate attributes of human nature, especially concerning territoriality, in the widely read African Genesis (1961) and The Territorial Imperative. Desmond Morris in The Naked Ape (1967) expressed similar views. Organised opposition to Montagu's kind of purist "blank-slatism" began to pick up in the 1970s, notably led by E. O. Wilson (On Human Nature, 1979). Twin studies established that there was, in many cases, a significant heritable component. These results did not in any way point to an overwhelming contribution of heritable factors, with heritability typically ranging around 40% to 50%, so the controversy could not be cast in terms of purist behaviorism versus purist nativism. Rather, it was purist behaviorism that was gradually replaced by the now-predominant view that both kinds of factors usually contribute to a given trait, anecdotally phrased by Donald Hebb, who answered the question "Which, nature or nurture, contributes more to personality?" by asking in response, "Which contributes more to the area of a rectangle, its length or its width?"[25] In a comparable avenue of research, anthropologist Donald Brown in the 1980s surveyed hundreds of anthropological studies from around the world and collected a set of cultural universals. He identified approximately 150 such features, coming to the conclusion that there is indeed a "universal human nature", and that these features point to what that universal human nature is.[26]

At the height of the controversy, during the 1970s to 1980s, the debate was highly ideologised. In Not in Our Genes: Biology, Ideology and Human Nature (1984), Richard Lewontin, Steven Rose and Leon Kamin criticise "genetic determinism" from a Marxist framework, arguing that "Science is the ultimate legitimator of bourgeois ideology ... If biological determinism is a weapon in the struggle between classes, then the universities are weapons factories, and their teaching and research faculties are the engineers, designers, and production workers." The debate thus shifted away from whether heritable traits exist to whether it was politically or ethically permissible to admit their existence. The authors argue that evolutionary inclinations should be discarded in ethical and political discussions regardless of whether they exist.[27]

Heritability studies became much easier to perform, and hence much more numerous, with the advances of genetic studies during the 1990s. By the late 1990s, an overwhelming amount of evidence had accumulated that amounts to a refutation of the extreme forms of "blank-slatism" advocated by Watson or Montagu.

This revised state of affairs was summarized in books aimed at a popular audience from the late 1990s. The Nurture Assumption: Why Children Turn Out the Way They Do (1998) by Judith Rich Harris was heralded by Steven Pinker as a book that "will come to be seen as a turning point in the history of psychology",[28] but Harris was criticized for exaggerating the point that "parental upbringing seems to matter less than previously thought" into the implication that "parents do not matter".[29]

The situation as it presented itself by the end of the 20th century was summarized in The Blank Slate: The Modern Denial of Human Nature (2002) by Steven Pinker. The book became a best-seller, and was instrumental in bringing to the attention of a wider public the paradigm shift away from the behaviourist purism of the 1940s to 1970s that had taken place over the preceding decades. Pinker portrays the adherence to pure blank-slatism as an ideological dogma linked to two other dogmas found in the dominant view of human nature in the 20th century, which he termed "noble savage" (in the sense that people are born good and corrupted by bad influence) and "ghost in the machine" (in the sense that there is a human soul capable of moral choices completely detached from biology). Pinker argues that all three dogmas were held onto for an extended period even in the face of evidence because they were seen as desirable: if any human trait is purely conditioned by culture, any undesired trait (such as crime or aggression) may be engineered away by purely cultural (political) means. Pinker focuses on reasons he assumes were responsible for unduly repressing evidence to the contrary, notably the fear of (imagined or projected) political or ideological consequences.[30]

Heritability estimates


This chart illustrates three patterns one might see when studying the influence of genes and environment on traits in individuals. Trait A shows a high sibling correlation, but little heritability (i.e. high shared environmental variance c²; low heritability h²). Trait B shows a high heritability, since the correlation of the trait rises sharply with degree of genetic similarity. Trait C shows low heritability, but also low correlations generally; this means Trait C has a high nonshared environmental variance e². In other words, the degree to which individuals display Trait C has little to do with either genes or broadly predictable environmental factors—roughly, the outcome approaches random for an individual. Notice also that even identical twins raised in a common family rarely show 100% trait correlation.

It is important to note that the term heritability refers only to the degree of genetic variation between people on a trait. It does not refer to the degree to which a trait of a particular individual is due to environmental or genetic factors. The traits of an individual are always a complex interweaving of both.[31] For an individual, even strongly genetically influenced, or "obligate" traits, such as eye color, assume the inputs of a typical environment during ontogenetic development (e.g., certain ranges of temperatures, oxygen levels, etc.).

In contrast, the "heritability index" statistically quantifies the extent to which variation between individuals on a trait is due to variation in the genes those individuals carry. In animals where breeding and environments can be controlled experimentally, heritability can be determined relatively easily. Such experiments would be unethical for human research. This problem can be overcome by finding existing populations of humans that reflect the experimental setting the researcher wishes to create.

One way to determine the contribution of genes and environment to a trait is to study twins. In one kind of study, identical twins reared apart are compared to randomly selected pairs of people. The twins share identical genes, but different family environments. In another kind of twin study, identical twins reared together (who share family environment and genes) are compared to fraternal twins reared together (who also share family environment but only share half their genes). Another condition that permits the disassociation of genes and environment is adoption. In one kind of adoption study, biological siblings reared together (who share the same family environment and half their genes) are compared to adoptive siblings (who share their family environment but none of their genes).
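The arithmetic behind these twin comparisons can be sketched with Falconer's formula, which converts the trait correlations of identical (MZ) and fraternal (DZ) twin pairs into rough variance components. The correlations below are hypothetical, chosen only for illustration, and not taken from the studies discussed here:

```python
# Sketch of Falconer's decomposition: split trait variance into
# additive genetic (A), shared-environment (C), and
# non-shared-environment (E) components from twin correlations.

def ace_estimates(r_mz, r_dz):
    a2 = 2 * (r_mz - r_dz)  # MZ twins share twice the segregating genes of DZ twins
    c2 = r_mz - a2          # MZ similarity not explained by genes
    e2 = 1 - r_mz           # whatever makes even MZ twins differ (incl. measurement error)
    return a2, c2, e2

# Hypothetical correlations for illustration only:
a2, c2, e2 = ace_estimates(r_mz=0.74, r_dz=0.47)
print(f"heritability = {a2:.2f}, shared env = {c2:.2f}, non-shared env = {e2:.2f}")
```

Real behavioral-genetic models fit these components by maximum likelihood rather than by simple subtraction, but the logic is the same: genetic influence is inferred from the excess similarity of identical over fraternal twins.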

In many cases, it has been found that genes make a substantial contribution, including psychological traits such as intelligence and personality.[32] Yet heritability may differ in other circumstances, for instance environmental deprivation. Examples of low, medium, and high heritability traits include:

Low heritability: specific language, specific religion
Medium heritability: weight, religiosity
High heritability: blood type, eye color

Twin and adoption studies have their methodological limits. For example, both are limited to the range of environments and genes which they sample. Almost all of these studies are conducted in Western, first-world countries, and therefore cannot be extrapolated globally to include poorer, non-western populations. Additionally, both types of studies depend on particular assumptions, such as the equal environments assumption in the case of twin studies, and the lack of pre-adoptive effects in the case of adoption studies.

Since the definition of "nature" in this context is tied to "heritability", the definition of "nurture" has necessarily become very wide, including any type of causality that is not heritable. The term has thus moved away from its original connotation of "cultural influences" to include all effects of the environment; indeed, a substantial source of environmental input to human nature may arise from stochastic variations in prenatal development and is thus in no sense of the term "cultural".[33][34]

Interaction of genes and environment

Heritability refers to the origins of differences between people. Individual development, even of highly heritable traits, such as eye color, depends on a range of environmental factors, from the other genes in the organism, to physical variables such as temperature, oxygen levels etc. during its development or ontogenesis.

The variability of a trait can be meaningfully spoken of as being due in certain proportions to genetic differences ("nature") or environments ("nurture"). For highly penetrant Mendelian genetic disorders such as Huntington's disease, virtually all the incidence of the disease is due to genetic differences. Huntington's animal models live much longer or shorter lives depending on how they are cared for[citation needed].

At the other extreme, traits such as native language are environmentally determined: linguists have found that any child (if capable of learning a language at all) can learn any human language with equal facility.[35] With virtually all biological and psychological traits, however, genes and environment work in concert, communicating back and forth to create the individual.

At a molecular level, genes interact with signals from other genes and from the environment. While there are many thousands of single-gene-locus traits, so-called complex traits are due to the additive effects of many (often hundreds) of small gene effects. A good example of this is height, where variance appears to be spread across many hundreds of loci.[36]
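The "many small additive effects" picture can be made concrete with a toy polygenic model: summing tiny, independent allele effects across a few hundred loci yields a smooth, bell-shaped trait distribution. The locus count, allele frequency, and per-allele effect size below are all hypothetical:

```python
import random

rng = random.Random(1)
n_loci, n_people = 200, 2_000
effect = 0.1  # hypothetical contribution of each "+" allele copy to the trait

# Each locus contributes 0, 1, or 2 copies of a "+" allele (frequency 0.5);
# the trait is simply the sum of all those small contributions.
traits = [
    effect * sum(rng.randint(0, 1) + rng.randint(0, 1) for _ in range(n_loci))
    for _ in range(n_people)
]

mean = sum(traits) / n_people
print(f"mean trait value = {mean:.2f}")  # expected near 200 loci * 1 allele * 0.1 = 20
```

No single locus matters much here, yet the population still shows wide, continuous variation in the trait, which is the pattern observed for height.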

Extreme genetic or environmental conditions can predominate in rare circumstances—if a child is born mute due to a genetic mutation, it will not learn to speak any language regardless of the environment; similarly, someone who is practically certain to eventually develop Huntington's disease according to their genotype may die in an unrelated accident (an environmental event) long before the disease will manifest itself.


The "two buckets" view of heritability.

More realistic "homogenous mudpie" view of heritability.

Steven Pinker likewise described several examples:[37][38]
concrete behavioral traits that patently depend on content provided by the home or culture—which language one speaks, which religion one practices, which political party one supports—are not heritable at all. But traits that reflect the underlying talents and temperaments—how proficient with language a person is, how religious, how liberal or conservative—are partially heritable.
When traits are determined by a complex interaction of genotype and environment it is possible to measure the heritability of a trait within a population. However, many non-scientists who encounter a report of a trait having a certain percentage heritability imagine non-interactional, additive contributions of genes and environment to the trait. As an analogy, some laypeople may think of the degree of a trait being made up of two "buckets," genes and environment, each able to hold a certain capacity of the trait. But even for intermediate heritabilities, a trait is always shaped by both genetic dispositions and the environments in which people develop, merely with greater and lesser plasticities associated with these heritability measures.

Heritability measures always refer to the degree of variation between individuals in a population. That is, these statistics cannot be applied at the level of the individual: it would be incorrect to say that, because the heritability index of personality is about 0.6, 60% of one's personality is obtained from one's parents and 40% from the environment. To help understand this, imagine that all humans were genetic clones. The heritability index for all traits would then be zero (all variability between clonal individuals must be due to environmental factors). And, contrary to erroneous interpretations of the heritability index, as societies become more egalitarian (everyone has more similar experiences), the heritability index goes up (as environments become more similar, variability between individuals is due more to genetic factors).
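The clone thought experiment can be simulated directly. Heritability here is just the ratio of genetic variance to total phenotypic variance in a population, so shrinking either source of variation changes the ratio. This is a toy model under the simplifying assumption that genetic and environmental contributions add independently:

```python
import random

def simulated_heritability(genetic_sd, env_sd, n=100_000, seed=0):
    """Simulate phenotype = genetic value + environmental deviation,
    then estimate h^2 = Var(G) / Var(P) from the sample."""
    rng = random.Random(seed)
    g = [rng.gauss(0, genetic_sd) for _ in range(n)]
    e = [rng.gauss(0, env_sd) for _ in range(n)]
    p = [gi + ei for gi, ei in zip(g, e)]

    def var(xs):
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs) / len(xs)

    return var(g) / var(p)

print(simulated_heritability(1.0, 1.0))  # equal variation: h^2 near 0.5
print(simulated_heritability(0.0, 1.0))  # genetic clones: h^2 is exactly 0
print(simulated_heritability(1.0, 0.5))  # more uniform environments: h^2 rises toward 0.8
```

With genetic_sd set to zero everyone is a "clone" and all remaining variation is environmental, so the estimate is zero; holding genes fixed while equalizing environments (smaller env_sd) pushes the estimate up, exactly as described above.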

One should also take into account that the variables of heritability and environmentality are not precise and vary within a chosen population and across cultures. It would be more accurate to state that the degree of heritability and environmentality is measured with reference to a particular phenotype in a chosen group of a population in a given period of time. The accuracy of the calculations is further hindered by the number of coefficients taken into consideration, age being one such variable. The influence of heritability and environmentality differs drastically across age groups: the older the subjects studied, the more noticeable the heritability factor becomes; the younger the test subjects, the more likely they are to show signs of strong influence from environmental factors.

Some have pointed out that environmental inputs affect the expression of genes[16] (see the article on epigenetics). This is one explanation of how environment can influence the extent to which a genetic disposition will actually manifest.[citation needed] The interactions of genes with environment, called gene–environment interactions, are another component of the nature–nurture debate. A classic example of gene–environment interaction is the ability of a diet low in the amino acid phenylalanine to partially suppress the genetic disease phenylketonuria. Yet another complication to the nature–nurture debate is the existence of gene-environment correlations. These correlations indicate that individuals with certain genotypes are more likely to find themselves in certain environments. Thus, it appears that genes can shape (the selection or creation of) environments. Even using experiments like those described above, it can be very difficult to determine convincingly the relative contribution of genes and environment.

A study conducted by T. J. Bouchard, Jr. provided evidence for the importance of genes by testing middle-aged twins reared together and reared apart. The results have served as important evidence against the importance of environment in determining, for example, happiness. In the Minnesota study of twins reared apart, the correlation for monozygotic twins reared apart (0.52) was actually found to be higher than for monozygotic twins reared together (0.44). Also highlighting the importance of genes, correlations were much higher among monozygotic than among dizygotic twins, who had a correlation of 0.08 when reared together and −0.02 when reared apart.[39]

Social pre-wiring

The social pre-wiring hypothesis refers to the ontogeny of social interaction, informally described as being "wired to be social". The theory questions whether there is a propensity to socially oriented action already present before birth. Research on the theory concludes that newborns are born into the world with a unique genetic wiring to be social.[40]

Circumstantial evidence supporting the social pre-wiring hypothesis can be found in newborns' behavior. Newborns, not even hours after birth, have been found to display a preparedness for social interaction. This preparedness is expressed in ways such as their imitation of facial gestures. This observed behavior cannot be attributed to any current form of socialization or social construction. Rather, newborns most likely inherit social behavior and identity to some extent through genetics.[40]

Principal evidence for this theory comes from examining twin pregnancies. The main argument is that, if there are social behaviors that are inherited and developed before birth, then one should expect twin foetuses to engage in some form of social interaction before they are born. Thus, ten foetuses were analyzed over a period of time using ultrasound techniques. Using kinematic analysis, the experiment found that the twin foetuses would interact with each other for longer periods and more often as the pregnancies went on. Researchers were able to conclude that the movements between the co-twins were not accidental but specifically aimed.[40]

The social pre-wiring hypothesis thus found support: "The central advance of this study is the demonstration that 'social actions' are already performed in the second trimester of gestation. Starting from the 14th week of gestation twin foetuses plan and execute movements specifically aimed at the co-twin. These findings force us to predate the emergence of social behavior: when the context enables it, as in the case of twin foetuses, other-directed actions are not only possible but predominant over self-directed actions."[40]

Obligate vs. facultative adaptations

Traits may be considered to be adaptations (such as the umbilical cord), byproducts of adaptations (the belly button) or due to random variation (convex or concave belly button shape).[41] An alternative to contrasting nature and nurture focuses on "obligate vs. facultative" adaptations.[41] Adaptations may be generally more obligate (robust in the face of typical environmental variation) or more facultative (sensitive to typical environmental variation). For example, the rewarding sweet taste of sugar and the pain of bodily injury are obligate psychological adaptations—typical environmental variability during development does not much affect their operation.[42] On the other hand, facultative adaptations are somewhat like "if-then" statements.[43] An example of a facultative psychological adaptation may be adult attachment style. The attachment style of adults, (for example, a "secure attachment style," the propensity to develop close, trusting bonds with others) is proposed to be conditional on whether an individual's early childhood caregivers could be trusted to provide reliable assistance and attention. An example of a facultative physiological adaptation is tanning of skin on exposure to sunlight (to prevent skin damage).

Advanced techniques

Quantitative studies of heritable traits throw light on the question.

Developmental genetic analysis examines the effects of genes over the course of a human lifespan. Early studies of intelligence, which mostly examined young children, found that heritability measured 40–50%. Subsequent developmental genetic analyses found that variance attributable to additive environmental effects is less apparent in older individuals,[44][45][46] with estimated heritability of IQ increasing in adulthood.

Multivariate genetic analysis examines the genetic contribution to several traits that vary together. For example, multivariate genetic analysis has demonstrated that the genetic determinants of all specific cognitive abilities (e.g., memory, spatial reasoning, processing speed) overlap greatly, such that the genes associated with any specific cognitive ability will affect all others. Similarly, multivariate genetic analysis has found that genes that affect scholastic achievement completely overlap with the genes that affect cognitive ability.

Extremes analysis examines the link between normal and pathological traits. For example, it is hypothesized that a given behavioral disorder may represent an extreme of a continuous distribution of a normal behavior and hence an extreme of a continuous distribution of genetic and environmental variation. Depression, phobias, and reading disabilities have been examined in this context.

For a few highly heritable traits, studies have identified loci associated with variance in that trait, for instance in some individuals with schizophrenia.[47]

Entrepreneurship

Studies of identical twins separated at birth suggest that one-third of creative thinking ability comes from genetics and two-thirds from learning.[48] Research suggests that between 37 and 42 percent of the explained variance can be attributed to genetic factors.[49] The learning primarily comes in the form of human-capital transfers of entrepreneurial skills through parental role modeling.[50] Other findings agree that the key to innovative entrepreneurial success comes from environmental factors and working "10,000 hours" to gain mastery of entrepreneurial skills.[51]

Heritability of intelligence

Evidence from behavioral genetic research suggests that family environmental factors may have an effect upon childhood IQ, accounting for up to a quarter of the variance. The American Psychological Association's report "Intelligence: Knowns and Unknowns" (1995) states that there is no doubt that normal child development requires a certain minimum level of responsible care: severely deprived, neglectful, or abusive environments were found to have highly negative effects on many aspects of children's intellectual development. Beyond that minimum, however, the role of family experience is in serious dispute. By late adolescence this correlation disappears, such that adoptive siblings no longer have similar IQ scores.[52]

Moreover, adoption studies indicate that, by adulthood, adoptive siblings are no more similar in IQ than strangers (IQ correlation near zero), while full siblings show an IQ correlation of 0.6. Twin studies reinforce this pattern: monozygotic (identical) twins raised separately are highly similar in IQ (0.74), more so than dizygotic (fraternal) twins raised together (0.6) and much more than adoptive siblings (~0.0).[53] Recent adoption studies also found that supportive parents can have a positive effect on the development of their children.[54]

Personality traits

Personality is a frequently cited example of a heritable trait that has been studied in twins and adoptees using behavioral genetic study designs. The most famous categorical organization of heritable personality traits was created by Goldberg (1990), who had college students rate their personalities on 1,400 dimensions and then narrowed these down into "The Big Five" factors of personality: openness, conscientiousness, extraversion, agreeableness, and neuroticism. The close genetic relationship between positive personality traits and, for example, happiness traits is the mirror image of comorbidity in psychopathology. These personality factors were consistent across cultures, and many studies have also tested the heritability of these traits.

Identical twins reared apart are far more similar in personality than randomly selected pairs of people. Likewise, identical twins are more similar than fraternal twins. Also, biological siblings are more similar in personality than adoptive siblings. Each observation suggests that personality is heritable to a certain extent. One supporting study, using a representative sample of 973 twin pairs, focused on the heritability of personality (estimated to be around 50% for subjective well-being); the heritable differences in subjective well-being were found to be fully accounted for by the genetic model of the Five-Factor Model's personality domains.[55] However, these same study designs allow for the examination of environment as well as genes.

Adoption studies also directly measure the strength of shared family effects. Adopted siblings share only family environment. Most adoption studies indicate that by adulthood the personalities of adopted siblings are little or no more similar than random pairs of strangers. This would mean that shared family effects on personality are zero by adulthood.

In the case of personality traits, non-shared environmental effects are often found to outweigh shared environmental effects. That is, environmental effects that are typically thought to be life-shaping (such as family life) may have less of an impact than non-shared effects, which are harder to identify. One possible source of non-shared effects is the environment of pre-natal development. Random variations in the genetic program of development may be a substantial source of non-shared environment. These results suggest that "nurture" may not be the predominant factor in "environment". Environment and our situations do in fact impact our lives, but not the way in which we would typically react to these environmental factors: we are preset with personality traits that are the basis for how we react to situations. An example would be how extraverted prisoners become less happy than introverted prisoners and react to their incarceration more negatively due to their preset extraverted personality.[31]:Ch 19 The existence of behavioral genes receives some support from fraternal twins: when fraternal twins are reared apart, they show the same similarities in behavior and response as if they had been reared together.[56]

Genetics

Genomics

The relationship between personality and people's own well-being is influenced and mediated by genes (Weiss, Bates, & Luciano, 2008). There has been found to be a stable set point for happiness that is characteristic of the individual (largely determined by the individual's genes). Happiness fluctuates around that set point (again, genetically determined) based on whether good things or bad things are happening to us ("nurture"), but only fluctuates in small magnitude in a normal human. The midpoint of these fluctuations is determined by the "great genetic lottery" that people are born with, leading some researchers to conclude that how happy people feel at the moment or over time is simply due to the luck of the draw, or genes. This fluctuation was also not due to educational attainment, which accounted for less than 2% of the variance in well-being for women and less than 1% of the variance for men.[39]

Some researchers consider that the personality traits measured by personality tests remain stable throughout an individual's lifespan: human beings may refine their personalities but can never change them entirely. Darwin's theory of evolution steered naturalists such as George Williams and William Hamilton to the concept of personality evolution. They suggested that physical organs, and also personality, are products of natural selection.[57]

With the advent of genomic sequencing, it has become possible to search for and identify specific gene polymorphisms that affect traits such as IQ and personality. These techniques work by tracking the association of differences in a trait of interest with differences in specific molecular markers or functional variants. An example of a visible human trait for which the precise genetic basis of differences is relatively well known is eye color. For traits with many genes affecting the outcome, a smaller portion of the variance is currently understood: for instance, for height, known gene variants account for around 5–10% of height variance at present.[citation needed] When discussing the significant role of genetic heritability in relation to one's level of happiness, it has been found that from 44% to 52% of the variance in one's well-being is associated with genetic variation. Based on the retest of smaller samples of twin studies after 4, 5, and 10 years, it is estimated that the heritability of the genetically stable component of subjective well-being approaches 80%.[39] Other studies have found that genes are a large influence on the variance found in happiness measures, around 35–50%.[58][59][60][61]

In contrast to views developed in the 1960s that gender identity is primarily learned (which led to policy-based surgical sex changes in children, as in the case of David Reimer), genomics has provided solid evidence that both sex and gender identities are primarily influenced by genes:
It is now clear that genes are vastly more influential than virtually any other force in shaping sex identity and gender identity…[T]he growing consensus in medicine is that…children should be assigned to their chromosomal (i.e., genetic) sex regardless of anatomical variations and differences—with the option of switching, if desired, later in life.
— Siddhartha Mukherjee, The Gene: An Intimate History

Linkage and association studies

In their attempts to locate the genes responsible for configuring certain phenotypes, researchers resort to two different techniques. Linkage studies facilitate the process of determining a specific location at which a gene of interest resides. This methodology is applied only among individuals that are related and does not serve to pinpoint specific genes. It does, however, narrow down the area of search, making it easier to locate one or several genes in the genome which constitute a specific trait.

Association studies, on the other hand, are more hypothesis-driven and seek to verify whether a particular genetic variable really influences the phenotype of interest. Association studies commonly use a case-control approach, comparing subjects carrying relatively higher or lower values of the hereditary determinant with control subjects.
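The case-control logic can be illustrated with a toy 2x2 allele-count table, comparing how often a variant appears among cases versus controls via an odds ratio. The counts are hypothetical; real association studies use far larger samples and must correct for multiple testing and population stratification:

```python
# Hypothetical allele counts for one genetic variant:
#                 (variant, non-variant)
cases    = (240, 760)   # people with the phenotype of interest
controls = (180, 820)   # matched people without it

def odds_ratio(cases, controls):
    """Odds of carrying the variant among cases relative to controls."""
    a, b = cases      # variant / non-variant counts among cases
    c, d = controls   # variant / non-variant counts among controls
    return (a * d) / (b * c)

or_value = odds_ratio(cases, controls)
print(f"odds ratio = {or_value:.2f}")  # values above 1 suggest an association
```

An odds ratio near 1 means the variant is equally common in both groups (no association); the further it departs from 1, the stronger the apparent link between genotype and phenotype.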
