Neuroevolution, or neuro-evolution, is a form of artificial intelligence that uses evolutionary algorithms to generate artificial neural networks (ANN), parameters, and rules. It is most commonly applied in artificial life, general game playing and evolutionary robotics. The main benefit is that neuroevolution can be applied more widely than supervised learning
algorithms, which require a syllabus of correct input-output pairs. In
contrast, neuroevolution requires only a measure of a network's
performance at a task. For example, the outcome of a game (i.e., whether
one player won or lost) can be easily measured without providing
labeled examples of desired strategies. Neuroevolution is commonly used
as part of the reinforcement learning paradigm, and it can be contrasted with conventional deep learning techniques that use backpropagation (gradient descent on a neural network) with a fixed topology.
Features
Many neuroevolution algorithms
have been defined. One common distinction is between algorithms that
evolve only the strength of the connection weights for a fixed network
topology (sometimes called conventional neuroevolution), and algorithms
that evolve both the topology of the network and its weights (called
TWEANNs, for Topology and Weight Evolving Artificial Neural Network
algorithms).
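As an illustration of the first category (weights evolved for a fixed topology), the following is a minimal sketch rather than any particular published algorithm. The 2-2-1 network, the XOR task, the Gaussian mutation and the elitist selection scheme are all assumptions chosen for brevity. Note that only a fitness score is used; no gradients or labelled strategies are required.

```python
import math
import random

# Fixed 2-2-1 topology: 6 weights + 3 biases = 9 genes (an illustrative choice).
GENOME_LENGTH = 9
XOR_CASES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]


def forward(genome, inputs):
    """Evaluate the fixed-topology network encoded by a flat list of weights."""
    w = genome
    h1 = math.tanh(w[0] * inputs[0] + w[1] * inputs[1] + w[2])
    h2 = math.tanh(w[3] * inputs[0] + w[4] * inputs[1] + w[5])
    out = w[6] * h1 + w[7] * h2 + w[8]
    out = max(-60.0, min(60.0, out))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-out))


def fitness(genome):
    """Higher is better: negative squared error over the four XOR cases."""
    return -sum((forward(genome, x) - y) ** 2 for x, y in XOR_CASES)


def mutate(genome, rng, sigma=0.5):
    """Perturb every weight with Gaussian noise; the topology never changes."""
    return [g + rng.gauss(0.0, sigma) for g in genome]


def evolve(generations=200, population_size=30, seed=0):
    """Return (best initial fitness, best final fitness, best genome)."""
    rng = random.Random(seed)
    population = [[rng.uniform(-1, 1) for _ in range(GENOME_LENGTH)]
                  for _ in range(population_size)]
    initial_best = max(fitness(g) for g in population)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        elite = population[: population_size // 5]
        # Elites survive unchanged, so the best fitness never decreases.
        children = [mutate(rng.choice(elite), rng)
                    for _ in range(population_size - len(elite))]
        population = elite + children
    best = max(population, key=fitness)
    return initial_best, fitness(best), best
```

A TWEANN would additionally mutate the structure itself (adding or removing neurons and connections), which this sketch deliberately leaves out.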
A separate distinction can be made between methods that evolve
the structure of ANNs in parallel to their parameters (those applying
standard evolutionary algorithms) and those that develop them separately
(through memetic algorithms).
Most neural networks use gradient descent rather than neuroevolution. However, around 2017 researchers at Uber
stated they had found that simple structural neuroevolution algorithms
were competitive with sophisticated modern industry-standard
gradient-descent deep learning algorithms, in part because neuroevolution was found to be less likely to get stuck in local minima. In Science,
journalist Matthew Hutson speculated that neuroevolution is succeeding
where it had failed before partly because of the
increased computational power available in the 2010s.
It can be shown that there is a correspondence between neuroevolution and gradient descent.
Direct and indirect encoding
Evolutionary algorithms operate on a population of genotypes (also referred to as genomes). In neuroevolution, a genotype is mapped to a neural network phenotype that is evaluated on some task to derive its fitness.
In direct encoding schemes the genotype directly maps to
the phenotype. That is, every neuron and connection in the neural
network is specified directly and explicitly in the genotype. In
contrast, in indirect encoding schemes the genotype specifies indirectly how that network should be generated.
Indirect encodings are often used to achieve several aims:
modularity and other regularities;
compression of phenotype to a smaller genotype, providing a smaller search space;
mapping the search space (genome) to the problem domain.
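The difference between the two encoding styles can be sketched as follows. This is an illustrative toy, not a specific published encoding: the layer sizes, the function names and the sine-based generating rule are assumptions. The direct genotype needs one gene per connection, while the indirect genotype has three genes that parameterise a rule producing every weight from neuron positions, which yields both compression and regular weight patterns.

```python
import math

N_INPUTS, N_OUTPUTS = 4, 3  # hypothetical layer sizes


def decode_direct(genotype):
    """Direct encoding: one gene per connection, copied into the weight matrix."""
    assert len(genotype) == N_INPUTS * N_OUTPUTS
    return [genotype[i * N_OUTPUTS:(i + 1) * N_OUTPUTS] for i in range(N_INPUTS)]


def decode_indirect(genotype):
    """Indirect encoding: three genes drive a rule that generates each weight
    from the positions of the two neurons it connects (an assumed rule)."""
    amplitude, frequency, offset = genotype
    weights = []
    for i in range(N_INPUTS):
        x_in = i / (N_INPUTS - 1)  # input neuron position in [0, 1]
        row = []
        for j in range(N_OUTPUTS):
            x_out = j / (N_OUTPUTS - 1)  # output neuron position in [0, 1]
            row.append(amplitude * math.sin(frequency * (x_in - x_out)) + offset)
        weights.append(row)
    return weights


# 12 genes describe 12 weights directly ...
direct_weights = decode_direct([0.1 * k for k in range(12)])
# ... whereas 3 genes generate all 12 weights indirectly.
indirect_weights = decode_indirect([1.0, 2.0, 0.0])
```

With the indirect scheme the genotype size stays at three genes however many neurons are added, at the cost of only being able to express weight patterns the rule can generate.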
Taxonomy of embryogenic systems for indirect encoding
Traditionally indirect encodings that employ artificial embryogeny (also known as artificial development) have been categorised along the lines of a grammatical approach versus a cell chemistry approach. The former evolves sets of rules in the form of grammatical rewrite
systems. The latter attempts to mimic how physical structures emerge in
biology through gene expression. Indirect encoding systems often use
aspects of both approaches.
Stanley and Miikkulainen propose a taxonomy for embryogenic systems that is intended to reflect
their underlying properties. The taxonomy identifies five continuous
dimensions, along which any embryogenic system can be placed:
Cell (neuron) fate: the final characteristics and role of
the cell in the mature phenotype. This dimension counts the number of
methods used for determining the fate of a cell.
Targeting: the method by which connections are directed from
source cells to target cells. This ranges from specific targeting
(source and target are explicitly identified) to relative targeting
(e.g., based on locations of cells relative to each other).
Heterochrony: the timing and ordering of events during embryogeny. Counts the number of mechanisms for changing the timing of events.
Canalization: how tolerant the genome is to mutations
(brittleness). Ranges from requiring precise genotypic instructions to a
high tolerance of imprecise mutation.
Complexification: the ability of the system (including
evolutionary algorithm and genotype to phenotype mapping) to allow
complexification of the genome (and hence phenotype) over time. Ranges
from allowing only fixed-size genomes to allowing highly variable length
genomes.
Geometry of the water molecule with values for O-H bond length and for H-O-H bond angle between two bonds
Molecular geometry is the three-dimensional arrangement of the atoms that constitute a molecule. It includes the general shape of the molecule as well as bond lengths, bond angles, torsional angles and any other geometrical parameters that determine the position of each atom.
The molecular geometry can be determined by various spectroscopic methods and diffraction methods. IR, microwave and Raman spectroscopy
can give information about the molecule geometry from the details of
the vibrational and rotational absorbance detected by these techniques. X-ray crystallography, neutron diffraction and electron diffraction
can give molecular structure for crystalline solids based on the
distance between nuclei and concentration of electron density. Gas electron diffraction can be used for small molecules in the gas phase. NMR and FRET methods can be used to determine complementary information including relative distances, dihedral angles, angles, and connectivity. Molecular geometries are best determined at
low temperature because at higher temperatures the molecular structure
is averaged over more accessible geometries (see next section). Larger
molecules often exist in multiple stable geometries (conformational isomerism) that are close in energy on the potential energy surface. Geometries can also be computed by ab initio quantum chemistry methods to high accuracy. The molecular geometry can be different as a solid, in solution, and as a gas.
The position of each atom is determined by the nature of the chemical bonds
by which it is connected to its neighboring atoms. The molecular
geometry can be described by the positions of these atoms in space,
evoking bond lengths of two joined atoms, bond angles of three connected atoms, and torsion angles (dihedral angles) of three consecutive bonds.
Influence of thermal excitation
Since the motions of the atoms in a molecule are determined by
quantum mechanics, "motion" must be defined in a quantum mechanical way.
The overall (external) quantum mechanical motions translation and
rotation hardly change the geometry of the molecule. (To some extent
rotation influences the geometry via Coriolis forces and centrifugal distortion, but this is negligible for the present discussion.) In addition to translation and rotation, a third type of motion is molecular vibration,
which corresponds to internal motions of the atoms such as bond
stretching and bond angle variation. The molecular vibrations are harmonic
(at least to good approximation), and the atoms oscillate about their
equilibrium positions, even at the absolute zero of temperature. At
absolute zero all atoms are in their vibrational ground state and show zero point quantum mechanical motion, so that the wavefunction of a single vibrational mode is not a sharp peak, but approximately a Gaussian function (the wavefunction for n = 0 depicted in the article on the quantum harmonic oscillator).
At higher temperatures the vibrational modes may be thermally excited
(in a classical interpretation one expresses this by stating that "the
molecules will vibrate faster"), but they oscillate still around the
recognizable geometry of the molecule.
To get a feeling for the probability that a vibration of a molecule may be thermally excited,
we inspect the Boltzmann factor β ≡ exp(−ΔE/kT), where ΔE is the excitation energy of the vibrational mode, k the Boltzmann constant and T the absolute temperature. At 298 K (25 °C), typical values for the Boltzmann factor β are:
β = 0.089 for ΔE = 500 cm−1
β = 0.008 for ΔE = 1000 cm−1
β = 0.0007 for ΔE = 1500 cm−1.
(The reciprocal centimeter is an energy unit that is commonly used in infrared spectroscopy; 1 cm−1 corresponds to 1.23984×10−4 eV). When an excitation energy is 500 cm−1,
then about 8.9 percent of the molecules are thermally excited at room
temperature. To put this in perspective: the lowest excitation
vibrational energy in water is the bending mode (about 1600 cm−1).
Thus, at room temperature less than 0.07 percent of all the molecules
of a given amount of water will vibrate faster than at absolute zero.
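The Boltzmann factors quoted above can be reproduced with a few lines of code. This is a minimal sketch; the function name is hypothetical, and the Boltzmann constant is expressed in wavenumber units (k/hc, about 0.695 cm−1 per kelvin) so that ΔE can be entered directly in cm−1.

```python
import math

# Boltzmann constant divided by h*c, in cm^-1 per kelvin (CODATA value).
K_B_CM = 0.6950348


def boltzmann_factor(delta_e_cm, temperature_k=298.0):
    """Return exp(-dE / kT) for an excitation energy given in cm^-1."""
    return math.exp(-delta_e_cm / (K_B_CM * temperature_k))


for delta_e in (500, 1000, 1500, 1600):
    print(f"dE = {delta_e:4d} cm^-1  beta = {boltzmann_factor(delta_e):.5f}")
```

For the water bending mode at roughly 1600 cm−1 the factor comes out near 0.0004, consistent with the statement that less than 0.07 percent of the molecules are vibrationally excited at room temperature.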
As stated above, rotation hardly influences the molecular
geometry. But, as a quantum mechanical motion, it is thermally excited
at relatively (as compared to vibration) low temperatures. From a
classical point of view it can be stated that at higher temperatures
more molecules will rotate faster,
which implies that they have higher angular velocity and angular momentum. In quantum mechanical language: more eigenstates of higher angular momentum become thermally populated with rising temperatures. Typical rotational excitation energies are on the order of a few cm−1.
The results of many spectroscopic experiments are broadened because
they involve an averaging over rotational states. It is often difficult
to extract geometries from spectra at high temperatures, because the
number of rotational states probed in the experimental averaging
increases with increasing temperature. Thus, many spectroscopic
observations can only be expected to yield reliable molecular geometries
at temperatures close to absolute zero, because at higher temperatures
too many higher rotational states are thermally populated.
Bonding
Molecules, by definition, are most often held together with covalent bonds involving single, double, and/or triple bonds, where a "bond" is a shared pair of electrons (the other method of bonding between atoms is called ionic bonding and involves a positive cation and a negative anion).
Molecular geometries can be specified in terms of 'bond lengths',
'bond angles' and 'torsional angles'. The bond length is defined to be
the average distance between the nuclei of two atoms bonded together in
any given molecule. A bond angle is the angle formed between three atoms
across at least two bonds. For four atoms bonded together in a chain,
the torsional angle is the angle between the plane formed by the first three atoms and the plane formed by the last three atoms.
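These three quantities can be computed from Cartesian atom positions. The following is a minimal sketch using plain tuples; the helper names are hypothetical, and the sign of the torsion angle follows one common convention (others exist). The water-like coordinates at the end use assumed values of 0.96 Å and 104.5° purely for illustration.

```python
import math


def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def norm(a):
    return math.sqrt(dot(a, a))


def bond_length(a, b):
    """Distance between two bonded atoms."""
    return norm(sub(b, a))


def bond_angle(a, b, c):
    """Angle a-b-c in degrees, with b as the central atom."""
    u, v = sub(a, b), sub(c, b)
    cos_angle = dot(u, v) / (norm(u) * norm(v))
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))


def torsion_angle(a, b, c, d):
    """Signed angle in degrees between the planes (a, b, c) and (b, c, d)."""
    b1, b2, b3 = sub(b, a), sub(c, b), sub(d, c)
    n1, n2 = cross(b1, b2), cross(b2, b3)  # normals of the two planes
    x = dot(n1, n2)
    y = dot(cross(n1, n2), b2) / norm(b2)
    return math.degrees(math.atan2(y, x))


# Assumed water-like geometry: O-H = 0.96 angstrom, H-O-H = 104.5 degrees.
r, theta = 0.96, math.radians(104.5)
oxygen = (0.0, 0.0, 0.0)
h1 = (r, 0.0, 0.0)
h2 = (r * math.cos(theta), r * math.sin(theta), 0.0)
print(bond_length(oxygen, h1), bond_angle(h1, oxygen, h2))
```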
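There exists a mathematical relationship among the bond angles
for one central atom and four peripheral atoms (labeled 1 through 4),
expressed by the following determinant:

$$
\begin{vmatrix}
\cos\theta_{11} & \cos\theta_{12} & \cos\theta_{13} & \cos\theta_{14} \\
\cos\theta_{21} & \cos\theta_{22} & \cos\theta_{23} & \cos\theta_{24} \\
\cos\theta_{31} & \cos\theta_{32} & \cos\theta_{33} & \cos\theta_{34} \\
\cos\theta_{41} & \cos\theta_{42} & \cos\theta_{43} & \cos\theta_{44}
\end{vmatrix} = 0
$$

This constraint removes one
degree of freedom from the choices of (originally) six free bond angles
to leave only five choices of bond angles. (The angles θ11, θ22, θ33, and θ44
are always zero, and this relationship can be modified for a
different number of peripheral atoms by expanding/contracting the square
matrix.)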
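The constraint can be checked numerically. The sketch below assumes the determinant in question is that of the 4×4 matrix whose entries are cos θij, which must vanish because four bond vectors in three-dimensional space are linearly dependent. The function names are hypothetical. A regular tetrahedral arrangement (all angles arccos(−1/3)) satisfies the condition, whereas four mutually perpendicular bonds do not, since they cannot exist in three dimensions.

```python
import math


def determinant(matrix):
    """Determinant by cofactor expansion along the first row (fine for 4x4)."""
    if len(matrix) == 1:
        return matrix[0][0]
    total = 0.0
    for col, value in enumerate(matrix[0]):
        minor = [row[:col] + row[col + 1:] for row in matrix[1:]]
        total += (-1) ** col * value * determinant(minor)
    return total


def angle_matrix_determinant(angles_deg):
    """angles_deg[i][j] is the bond angle i-centre-j in degrees (0 on the diagonal)."""
    cosines = [[math.cos(math.radians(a)) for a in row] for row in angles_deg]
    return determinant(cosines)


TETRAHEDRAL = math.degrees(math.acos(-1.0 / 3.0))  # about 109.47 degrees

methane_like = [[0.0 if i == j else TETRAHEDRAL for j in range(4)] for i in range(4)]
impossible = [[0.0 if i == j else 90.0 for j in range(4)] for i in range(4)]

print(angle_matrix_determinant(methane_like))  # approximately zero
print(angle_matrix_determinant(impossible))    # approximately one, so not realisable
```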
Molecular geometry is determined by the quantum mechanical behavior of the electrons. Using the valence bond approximation this can be understood by the type of bonds between the atoms that make up the molecule. When atoms interact to form a chemical bond, the atomic orbitals of each atom are said to combine in a process called orbital hybridisation. The two most common types of bonds are sigma bonds (usually formed by hybrid orbitals) and pi bonds (formed by unhybridized p orbitals for atoms of main group elements). The geometry can also be understood by molecular orbital theory where the electrons are delocalised.
An understanding of the wavelike behavior of electrons in atoms and molecules is the subject of quantum chemistry.
Isomers
Isomers are types of molecules that share a chemical formula but have different geometries, resulting in different properties:
A pure substance is composed of only one type of isomer of a molecule (all have the same geometrical structure).
Structural isomers
have the same chemical formula but different physical arrangements,
often forming alternate molecular geometries with very different
properties. The atoms are not bonded (connected) together in the same
orders.
Functional isomers
are special kinds of structural isomers, where certain groups of atoms
exhibit a special kind of behavior, such as an ether or an alcohol.
Stereoisomers may have many similar physicochemical properties (melting point, boiling point) and at the same time very different biochemical activities. This is because they exhibit a handedness that is commonly found in living systems. One manifestation of this chirality or handedness is that they have the ability to rotate polarized light in different directions.
Protein folding concerns the complex geometries and different isomers that proteins can take.
Types of molecular structure
A bond angle is the geometric angle between two adjacent bonds. Some common shapes of simple molecules include:
Linear: In a linear model, atoms are connected in a straight line. The bond angles are set at 180°. For example, carbon dioxide and nitric oxide have a linear molecular shape.
Trigonal planar: Molecules with the trigonal planar shape are somewhat triangular and in one plane (flat). Consequently, the bond angles are set at 120°. For example, boron trifluoride.
Angular: Angular molecules (also called bent or V-shaped) have a non-linear shape. For example, water (H2O), which has an angle of about 105°. A water molecule has two pairs of bonded electrons and two unshared lone pairs.
Tetrahedral: Tetra- signifies four, and -hedral relates to a face of a solid, so "tetrahedral" literally means "having four faces". This shape is found when there are four bonds all on one central atom, with no extra unshared electron pairs. In accordance with VSEPR (valence-shell electron pair repulsion) theory, the bond angles between the electron bonds are arccos(−1/3) = 109.47°. For example, methane (CH4) is a tetrahedral molecule.
Octahedral: Octa- signifies eight, and -hedral relates to a face of a solid, so "octahedral" means "having eight faces". The bond angle is 90 degrees. For example, sulfur hexafluoride (SF6) is an octahedral molecule.
Trigonal pyramidal: A trigonal pyramidal molecule has a pyramid-like shape
with a triangular base. Unlike the linear and trigonal planar shapes
but similar to the tetrahedral orientation, pyramidal shapes require
three dimensions in order to fully separate the electrons. Here, there
are only three pairs of bonded electrons, leaving one unshared lone
pair. Lone pair – bond pair repulsions change the bond angle from the
tetrahedral angle to a slightly lower value. For example, ammonia (NH3).
Artificial Life, also referred to as ALife, is a field of study wherein researchers examine systems related to natural life, its processes, and its evolution, through the use of simulations with computer models, robotics, and biochemistry. The discipline was named by Christopher Langton, an American computer scientist, in 1986. In 1987, Langton organized the first conference on the field, in Los Alamos, New Mexico. There are three main kinds of artificial life, named for their approaches: soft, from software; hard, from hardware; and wet, from biochemistry. Artificial life researchers study traditional biology by trying to replicate aspects of biological phenomena.
Overview
Artificial life studies the fundamental processes of living systems
in artificial environments in order to gain a deeper understanding of
the complex information processing that defines such systems. These
topics are broad, but often include evolutionary dynamics, emergent properties of collective systems, biomimicry, as well as related issues about the philosophy of the nature of life and the use of lifelike properties in artistic works.
Philosophy
The modeling philosophy of artificial life strongly differs from
traditional modeling by studying not only "life as we know it" but also
"life as it could be".
A traditional model of a biological system will focus on
capturing its most important parameters. In contrast, an ALife modeling
approach will generally seek to decipher the most simple and general
principles underlying life and implement them in a simulation. The
simulation then offers the possibility to analyse new and different
lifelike systems.
Vladimir Georgievich Red'ko proposed to generalize this
distinction to the modeling of any process, leading to the more general
distinction of "processes as we know them" and "processes as they could
be".
At present, the commonly accepted definition of life does not consider any current ALife simulations or software to be alive, and they do not constitute part of the evolutionary process of any ecosystem. However, different opinions about artificial life's potential have arisen:
The strong ALife (cf. Strong AI) position states that "life is a process which can be abstracted away from any particular medium" (John von Neumann). This view is rooted in von Neumann's work on cellular automata and universal constructors,
which demonstrated that self-reproduction could be achieved by
logic-based machines regardless of their physical substrate. Notably, Tom Ray declared that his program Tierra is not simulating life in a computer but synthesizing it.
The weak ALife position denies the possibility of generating a
"living process" outside of a chemical solution. Its researchers try
instead to simulate life processes to understand the underlying
mechanics of biological phenomena.
A central goal in the philosophy and modeling of artificial life is achieving Open-Ended Evolution (OEE).
This refers to the capacity of a system to continually produce novel,
complex, and adaptive behaviors or entities without reaching a stable
equilibrium or predefined end-point. Researchers argue that OEE is a hallmark of natural life that current artificial systems have yet to fully replicate.
Software-based ("soft")
Techniques
Cellular automata were used in the early days of artificial life, and are still often used for ease of scalability and parallelization. ALife and cellular automata share a closely tied history.
Artificial neural networks are sometimes used to model the brain of an agent. Although traditionally more of an artificial intelligence technique, neural nets can be important for simulating population dynamics of organisms that can learn.
The symbiosis between learning and evolution is central to theories
about the development of instincts in organisms with higher neurological
complexity, as in, for instance, the Baldwin effect.
Program-based simulations contain organisms with a "genome" language. This language is more often in the form of a Turing complete
computer program than actual biological DNA. Assembly derivatives are
the most common languages used. An organism "lives" when its code is
executed, and there are usually various methods allowing self-replication. Mutations are generally implemented as random changes to the code. Use of cellular automata is common but not required. Another example could be an artificial intelligence and multi-agent system/program.
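A toy version of such a program-based organism can be sketched as follows. The four-instruction language, the function names and the per-instruction mutation rate are assumptions made for illustration; unlike the assembly-like languages described above, this tiny language is not Turing complete.

```python
import random

# A hypothetical four-instruction "genome" language.
INSTRUCTIONS = ("INC", "DEC", "DOUBLE", "NOP")


def execute(genome, value=0):
    """The organism 'lives' by running its code on an accumulator."""
    for op in genome:
        if op == "INC":
            value += 1
        elif op == "DEC":
            value -= 1
        elif op == "DOUBLE":
            value *= 2
        # "NOP" leaves the accumulator unchanged.
    return value


def mutate(genome, rng, rate=0.1):
    """Mutation as random changes to the code: each instruction may be replaced."""
    return [rng.choice(INSTRUCTIONS) if rng.random() < rate else op
            for op in genome]


def replicate(genome, rng, rate=0.1):
    """Self-replication with copy errors: the child is a mutated copy."""
    return mutate(list(genome), rng, rate)


rng = random.Random(42)
parent = ["INC", "DOUBLE", "INC"]
child = replicate(parent, rng)
print(execute(parent), child, execute(child))
```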
Module-based
Individual modules are added to a creature. These modules modify the
creature's behaviors and characteristics either directly, by hard
coding into the simulation (leg type A increases speed and metabolism),
or indirectly, through the emergent interactions between a creature's
modules (leg type A moves up and down with a frequency of X, which
interacts with other legs to create motion). Generally, these are
simulators that emphasize user creation and accessibility over mutation
and evolution.
Parameter-based
Organisms are generally constructed with pre-defined and fixed
behaviors that are controlled by various parameters that mutate. That
is, each organism contains a collection of numbers or other finite parameters. Each parameter controls one or several aspects of an organism in a well-defined way.
Neural net–based
These simulations have creatures that learn and grow using neural
nets or a close derivative. Emphasis is often, although not always, on
learning rather than on natural selection.
Complex systems modeling
Mathematical models of complex systems are of three types: black-box (phenomenological), white-box (mechanistic, based on the first principles) and grey-box (mixtures of phenomenological and mechanistic models). In black-box models, the individual-based (mechanistic) mechanisms of a complex dynamic system remain hidden.
Mathematical models for complex systems
Black-box models are completely nonmechanistic. They are
phenomenological and ignore the composition and internal structure of a
complex system. Because the model is not transparent,
interactions of subsystems cannot be investigated. In contrast, a
white-box model of a complex dynamic system has ‘transparent walls’ and
directly shows underlying mechanisms. All events at the micro-, meso-
and macro-levels of a dynamic system are directly visible at all stages
of a white-box model's evolution. In most cases, mathematical modelers
use heavy black-box mathematical methods, which cannot produce
mechanistic models of complex dynamic systems. Grey-box models are
intermediate and combine black-box and white-box approaches.
Logical deterministic individual-based cellular automata model of single species population growth
Creating a white-box model of a complex system requires a priori
basic knowledge of the modeling
subject. Deterministic logical cellular automata
are a necessary but not sufficient condition of a white-box model. The
second necessary prerequisite of a white-box model is the presence of
the physical ontology of the object under study. White-box modeling represents an automatic hyper-logical inference from first principles
because it is completely based on the deterministic logic and axiomatic
theory of the subject. The purpose of the white-box modeling is to
derive from the basic axioms a more detailed, more concrete mechanistic
knowledge about the dynamics of the object under study. The necessity to
formulate an intrinsic axiomatic system
of the subject before creating its white-box model distinguishes the
cellular automata models of white-box type from cellular automata models
based on arbitrary logical rules. If cellular automata rules have not
been formulated from the first principles of the subject, then such a
model may have a weak relevance to the real problem.
Logical deterministic individual-based cellular automata model of interspecific competition for a single limited resource
Notable simulators
This is a list of artificial life and digital organism simulators:
Biochemical-based life is studied in the field of synthetic biology. It involves research such as the creation of synthetic DNA. The term "wet" is an extension of the term "wetware". Efforts toward "wet" artificial life focus on engineering live minimal cells from living bacteria (Mycoplasma laboratorium) and on building non-living biochemical cell-like systems from scratch.
In 2021, the same team that developed Xenobots reported a further breakthrough: the first biological robots capable of kinematic self-replication.
Unlike traditional biological reproduction (growth/birth), these
synthetic organisms spontaneously collect loose cells in their
environment to assemble new copies of themselves, a process previously
seen only at the molecular level.
Artificial chemistry started as a method within the ALife community to abstract the processes of chemical reactions.
Evolutionary algorithms are a practical application of the weak ALife principle applied to optimization problems.
Many optimization algorithms have been crafted which borrow from or
closely mirror ALife techniques. The primary difference lies in
explicitly defining the fitness of an agent by its ability to solve a
problem, instead of its ability to find food, reproduce, or avoid death. The following is a list of evolutionary algorithms closely related to and used in ALife:
Artificial life has had a controversial history. John Maynard Smith criticized certain artificial life work in 1994 as "fact-free science". Mario Bunge criticized the ideas of strong artificial life as part of his wider critique of computationalism.
He wrote that proponents of strong ALife are mistakenly erasing the
distinction between a simulation and the process that is being
simulated. He had no such objections to the weak ALife program.
In chemistry, orbital hybridisation (or hybridization) is the concept of mixing atomic orbitals to form new hybrid orbitals (with different energies, shapes, etc., than the component atomic orbitals) suitable for the pairing of electrons to form chemical bonds in valence bond theory.
For example, in a carbon atom which forms four single bonds, the
valence-shell s orbital combines with three valence-shell p orbitals to
form four equivalent sp3 mixtures in a tetrahedral arrangement around the carbon to bond to four different atoms. Hybrid orbitals are useful in the explanation of molecular geometry
and atomic bonding properties and are symmetrically disposed in space.
Usually hybrid orbitals are formed by mixing atomic orbitals of
comparable energies.
History and uses
Chemist Linus Pauling first developed the hybridisation theory in 1931 to explain the structure of simple molecules such as methane (CH4) using atomic orbitals. Pauling pointed out that a carbon atom forms four bonds by using one s
and three p orbitals, so that "it might be inferred" that a carbon atom
would form three bonds at right angles (using p orbitals) and a fourth
weaker bond using the s orbital in some arbitrary direction. In reality,
methane has four C–H bonds of equivalent strength. The angle between
any two bonds is the tetrahedral bond angle of 109°28' (around 109.5°). Pauling supposed that in the presence of four hydrogen
atoms, the s and p orbitals form four equivalent combinations which he
called hybrid orbitals. Each hybrid is denoted sp3 to indicate its composition, and is directed along one of the four C–H bonds. This concept was developed for such simple chemical systems, but the
approach was later applied more widely, and today it is considered an
effective heuristic for rationalizing the structures of organic compounds. It gives a simple orbital picture equivalent to Lewis structures.
Hybridisation theory is an integral part of organic chemistry, one of the most compelling examples being Baldwin's rules. For drawing reaction mechanisms sometimes a classical bonding picture is needed with two atoms sharing two electrons. Hybridisation theory explains bonding in alkenes and methane. The amount of p character or s character, which is decided mainly by
orbital hybridisation, can be used to reliably predict molecular
properties such as acidity or basicity.
Overview
Orbitals are a model representation of the behavior of electrons
within molecules. In the case of simple hybridization, this
approximation is based on atomic orbitals, similar to those obtained for the hydrogen atom, the only neutral atom for which the Schrödinger equation
can be solved exactly. In heavier atoms, such as carbon, nitrogen, and
oxygen, the atomic orbitals used are the 2s and 2p orbitals, similar to
excited state orbitals for hydrogen.
Hybrid orbitals are assumed to be mixtures of atomic orbitals,
superimposed on each other in various proportions. For example, in methane, the C hybrid orbital which forms each carbon–hydrogen bond consists of 25% s character and 75% p character and is thus described as sp3 (read as s-p-three) hybridised. Quantum mechanics describes this hybrid as an sp3 wavefunction of the form $N(s + \sqrt{3}\,p_\sigma)$, where N is a normalisation constant (here 1/2) and pσ is a p orbital directed along the C-H axis to form a sigma bond. The ratio of coefficients (denoted λ in general) is $\sqrt{3}$ in this example. Since the electron density associated with an orbital is proportional to the square of the wavefunction, the ratio of p-character to s-character is $\lambda^2 = 3$. The p character or the weight of the p component is $N^2\lambda^2 = 3/4$.
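The arithmetic for the sp3 case can be verified with a short sketch. It assumes a hybrid of the general form N(s + λ pσ) built from orthonormal s and p orbitals, so that normalisation requires N = 1/√(1 + λ²); the function names are hypothetical.

```python
import math


def hybrid_coefficients(lam):
    """Return (s coefficient, p coefficient) of the normalised hybrid N(s + lam * p)."""
    n = 1.0 / math.sqrt(1.0 + lam ** 2)
    return n, n * lam


def p_character(lam):
    """Weight of the p component, N^2 * lam^2."""
    _, p_coeff = hybrid_coefficients(lam)
    return p_coeff ** 2


for label, lam in (("sp", 1.0), ("sp2", math.sqrt(2.0)), ("sp3", math.sqrt(3.0))):
    s_coeff, p_coeff = hybrid_coefficients(lam)
    print(f"{label}: N = {s_coeff:.4f}, p character = {p_character(lam):.4f}")
```

For λ = √3 this gives N = 1/2 and a p character of 3/4, matching the 25% s and 75% p composition stated above.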
Hybridisation describes the bonding of atoms from an atom's point of view. For a tetrahedrally coordinated carbon (e.g., methane CH4), the carbon should have 4 orbitals directed towards the 4 hydrogen atoms.
Carbon's ground state configuration is 1s² 2s² 2p² or more easily read:
C    1s [↑↓]   2s [↑↓]   2p [↑] [↑] [ ]
This diagram suggests that the carbon atom could use its two singly occupied p-type orbitals to form two covalent bonds with two hydrogen atoms in a methylene (CH2)
molecule, with a hypothetical bond angle of 90° corresponding to the
angle between two p orbitals on the same atom. However the true H–C–H
angle in singlet methylene is about 102° which implies the presence of some orbital hybridisation.
The carbon atom can also bond to four hydrogen atoms in methane
by an excitation (or promotion) of an electron from the doubly occupied
2s orbital to the empty 2p orbital, producing four singly occupied
orbitals.
C*   1s [↑↓]   2s [↑]   2p [↑] [↑] [↑]
The energy released by the formation of two additional bonds more
than compensates for the excitation energy required, energetically
favoring the formation of four C-H bonds.
According to quantum mechanics, the lowest energy is obtained if
the four bonds are equivalent, which requires that they are formed from
equivalent orbitals on the carbon. A set of four equivalent orbitals can
be obtained that are linear combinations of the valence-shell (core
orbitals are almost never involved in bonding) s and p wave functions, which are the four sp3 hybrids.
C*   1s [↑↓]   sp3 [↑] [↑] [↑] [↑]
In CH4, four sp3 hybrid orbitals are overlapped by the four hydrogens' 1s orbitals, yielding four σ (sigma) bonds (that is, four single covalent bonds) of equal length and strength.
Other carbon compounds and other molecules may be explained in a similar way. For example, ethylene (C2H4) has a double bond between the carbons. For this molecule, carbon sp2 hybridises, because one π (pi) bond is required for the double bond between the carbons and only three σ bonds are formed per carbon atom. In sp2 hybridisation the 2s orbital is mixed with only two of the three available 2p orbitals, usually denoted 2px and 2py. The third 2p orbital (2pz) remains unhybridised.
C*   1s [↑↓]   sp2 [↑] [↑] [↑]   2p [↑]
This gives a total of three sp2 orbitals with one remaining p orbital. In ethylene, the two carbon atoms form a σ bond by overlapping one sp2
orbital from each carbon atom. The π bond between the carbon atoms
perpendicular to the molecular plane is formed by 2p–2p overlap. Each
carbon atom forms covalent C–H bonds with two hydrogens by s–sp2
overlap, all with 120° bond angles. The hydrogen–carbon bonds are all
of equal strength and length, in agreement with experimental data.
The chemical bonding in compounds such as alkynes with triple bonds is explained by sp hybridization. In this model, the 2s orbital is mixed with only one of the three p orbitals,
C*   1s [↑↓]   sp [↑] [↑]   2p [↑] [↑]
resulting in two sp orbitals and two remaining p orbitals. The chemical bonding in acetylene (ethyne) (C2H2) consists of sp–sp overlap between the two carbon atoms forming a σ bond and two additional π bonds formed by p–p overlap. Each carbon also bonds to hydrogen in a σ s–sp overlap at 180° angles.
Molecule shape
Shapes of the different types of hybrid orbitals
Hybridisation helps to explain molecule shape, since the angles between bonds are approximately equal to the angles between hybrid orbitals. This is in contrast to valence shell electron-pair repulsion (VSEPR) theory, which can be used to predict molecular geometry based on empirical rules rather than on valence-bond or orbital theories.
spx hybridisation
As the valence orbitals of main group elements are the one s and three p orbitals with the corresponding octet rule, spx hybridization is used to model the shape of these molecules.
As the valence orbitals of transition metals are the five d, one s and three p orbitals with the corresponding 18-electron rule, spxdy
hybridisation is used to model the shape of these molecules. These
molecules tend to have multiple shapes corresponding to the same
hybridization due to the different d-orbitals involved. A square planar
complex has one unoccupied p-orbital and hence has 16 valence electrons.
In certain transition metal complexes with a low d electron count, the p-orbitals are unoccupied and sdx hybridisation is used to model the shape of these molecules.
In some general chemistry textbooks, hybridization is presented for
main group coordination number 5 and above using an "expanded octet"
scheme with d-orbitals first proposed by Pauling. However, such a scheme
is now considered to be incorrect in light of computational chemistry
calculations.
In 1990, Eric Alfred Magnusson of the University of New South Wales
published a paper definitively excluding the role of d-orbital
hybridisation in bonding in hypervalent compounds of second-row (period 3)
elements, ending a point of contention and confusion. Part of the
confusion originates from the fact that d-functions are essential in the
basis sets used to describe these compounds (or else unreasonably high
energies and distorted geometries result). Also, the contribution of the
d-function to the molecular wavefunction is large. These facts were
incorrectly interpreted to mean that d-orbitals must be involved in
bonding.
Resonance
In light of computational chemistry, a better treatment would be to invoke sigma bond resonance
in addition to hybridisation, which implies that each resonance
structure has its own hybridisation scheme. All resonance structures
must obey the octet rule.
While the simple model of orbital hybridisation is commonly used to
explain molecular shape, hybridisation is used differently when computed
in modern valence bond programs. Specifically, hybridisation is not
determined a priori but is instead variationally optimized to
find the lowest energy solution and then reported. This lifts two
artificial constraints on orbital hybridisation:
that hybridisation is restricted to integer values (isovalent hybridisation)
that hybrid orbitals are orthogonal to one another (hybridisation defects)
This means that in practice, hybrid orbitals do not conform to the
simple ideas commonly taught and thus in scientific computational papers
are simply referred to as spx, spxdy or sdx hybrids to express their nature instead of more specific integer values.
Although ideal hybrid orbitals can be useful, in reality, most bonds
require orbitals of intermediate character. This requires an extension
to include flexible weightings of atomic orbitals of each type (s, p, d)
and allows for a quantitative depiction of the bond formation when the
molecular geometry deviates from ideal bond angles. The amount of
p-character is not restricted to integer values; i.e., hybridizations
like sp2.5 are also readily described.
The hybridization of bond orbitals is determined by Bent's rule: "Atomic s character concentrates in orbitals directed towards electropositive substituents".
For molecules with lone pairs, the bonding orbitals are isovalent spx hybrids. For example, the two bond-forming hybrid orbitals of oxygen in water can be described as sp4.0 to give the interorbital angle of 104.5°. This means that they have 20% s character and 80% p character. It does not imply that a hybrid orbital is formed from one s and four p orbitals on oxygen, since the 2p subshell of oxygen contains only three p orbitals.
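The water figures can be reproduced with a small sketch. It assumes the usual relation λ = −1/cos θ for two equivalent, orthogonal sp^λ hybrids and an s fraction of 1/(1 + λ); neither formula appears in the text above, and the function names are illustrative.

```python
import math


def hybridisation_index(angle_deg: float) -> float:
    """Index lam of two equivalent sp^lam hybrids separated by angle_deg.

    Assumes orthogonal hybrids, so that lam = -1 / cos(angle).
    """
    cos_theta = math.cos(math.radians(angle_deg))
    if cos_theta >= 0:
        raise ValueError("angle must be greater than 90 degrees")
    return -1.0 / cos_theta


def s_fraction(lam: float) -> float:
    """Fraction of s character in an sp^lam hybrid: 1 / (1 + lam)."""
    return 1.0 / (1.0 + lam)


lam = hybridisation_index(104.5)  # H-O-H angle quoted in the text
print(f"lam = {lam:.2f}, s character = {100 * s_fraction(lam):.0f}%")
```

For 104.5° the index comes out very close to 4, corresponding to about 20% s character. The same `s_fraction` helper applies to non-integer cases such as sp2.5, which has roughly 29% s character.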
Hybridisation defects
Hybridisation of s and p orbitals to form effective spx
hybrids requires that they have comparable radial extent. While 2p
orbitals are on average less than 10% larger than 2s, in part
attributable to the lack of a radial node in 2p orbitals, 3p orbitals,
which have one radial node, exceed the 3s orbitals by 20–33%. The difference in extent of s and p orbitals increases further down a
group. The hybridisation of atoms in chemical bonds can be analysed by
considering localised molecular orbitals, for example using natural
localised molecular orbitals in a natural bond orbital (NBO) scheme. In methane, CH4, the calculated p/s ratio is approximately 3, consistent with "ideal" sp3 hybridisation, whereas for silane, SiH4,
the p/s ratio is closer to 2. A similar trend is seen for the other 2p
elements. Substitution of fluorine for hydrogen further decreases the
p/s ratio. The 2p elements exhibit near ideal hybridisation with orthogonal hybrid
orbitals. For heavier p block elements this assumption of orthogonality
cannot be justified. These deviations from the ideal hybridisation were
termed hybridisation defects by Kutzelnigg.
However, computational VB groups such as Gerratt, Cooper and
Raimondi (SCVB) as well as Shaik and Hiberty (VBSCF) go a step further
to argue that even for model molecules such as methane, ethylene and
acetylene, the hybrid orbitals are already defective and nonorthogonal,
with hybridisations such as sp1.76 instead of sp3 for methane.
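A rough way to see why hybrids such as sp1.76 are nonorthogonal at tetrahedral angles is to compute the overlap of two equivalent hybrids on the same atom. The sketch below is a simplified model, not the procedure used in SCVB or VBSCF calculations: it assumes each hybrid has the form (s + √λ·p)/√(1 + λ) with orthonormal s and p functions, which gives an overlap of (1 + λ cos θ)/(1 + λ). The function name is hypothetical.

```python
import math


def hybrid_overlap(lam: float, angle_deg: float) -> float:
    """Overlap of two equivalent sp^lam hybrids on one atom.

    Each hybrid is taken as (s + sqrt(lam) * p_i) / sqrt(1 + lam), with s and
    the p orbitals assumed orthonormal, so the overlap is
    (1 + lam * cos(theta)) / (1 + lam).
    """
    cos_theta = math.cos(math.radians(angle_deg))
    return (1.0 + lam * cos_theta) / (1.0 + lam)


TETRAHEDRAL = math.degrees(math.acos(-1.0 / 3.0))  # about 109.47 degrees

print(f"sp3    overlap: {hybrid_overlap(3.0, TETRAHEDRAL):.3f}")
print(f"sp1.76 overlap: {hybrid_overlap(1.76, TETRAHEDRAL):.3f}")
```

In this model the overlap is essentially zero for ideal sp3 hybrids and about 0.15 for sp1.76 hybrids pointing along the same tetrahedral directions.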
Photoelectron spectra
One misconception concerning orbital hybridization is that it incorrectly predicts the ultraviolet photoelectron spectra of many molecules. While this is true if Koopmans' theorem
is applied to localized hybrids, quantum mechanics requires that the
(in this case ionized) wavefunction obey the symmetry of the molecule
which implies resonance in valence bond theory. For example, in methane, the ionised states (CH4+) can be constructed out of four resonance structures attributing the ejected electron to each of the four sp3 orbitals. A linear combination of these four structures, conserving the number of structures, leads to a triply degenerate T2 state and an A1 state. The difference in energy between each ionized state and the ground state is the ionization energy, which yields two values in agreement with experimental results.
Two distinct states for CH4+ exist (A1 and T2), both of which result from the ionization of CH4. This gives rise to the two unique peaks on the photoelectron spectrum of methane.
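The splitting into one A1 level and a triply degenerate T2 level follows from symmetry alone, as a toy model shows. The sketch below treats the four resonance structures as an orthonormal basis with a common energy `A` and an identical coupling `B` between every pair; both numbers are hypothetical placeholders in arbitrary units, not ionisation energies, and the neglect of overlap between structures is a simplifying assumption.

```python
# Toy model: four equivalent structures, one per sp3 orbital of methane.
A = 0.0   # energy of a single structure (hypothetical, arbitrary units)
B = -1.0  # coupling between any two structures (hypothetical)

H = [[A if i == j else B for j in range(4)] for i in range(4)]


def apply(matrix, vec):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vec)) for row in matrix]


def eigenvalue(matrix, vec):
    """Return the eigenvalue if vec is an eigenvector, otherwise raise."""
    out = apply(matrix, vec)
    idx = next(i for i, v in enumerate(vec) if v != 0)
    lam = out[idx] / vec[idx]
    if any(abs(o - lam * v) > 1e-12 for o, v in zip(out, vec)):
        raise ValueError("not an eigenvector")
    return lam


a1 = [1, 1, 1, 1]  # totally symmetric combination
t2 = [[1, 1, -1, -1], [1, -1, 1, -1], [1, -1, -1, 1]]

print("A1:", eigenvalue(H, a1))               # equals A + 3B
print("T2:", [eigenvalue(H, v) for v in t2])  # equals A - B, three times
```

Whatever values are chosen for `A` and `B`, the four combinations fall into exactly two distinct levels, which mirrors the two peaks described above.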
Bonding orbitals formed from hybrid atomic orbitals may be considered
as localised molecular orbitals, which can be formed from the
delocalised orbitals of molecular orbital theory by an appropriate
mathematical transformation. For molecules in the ground state, this
transformation of the orbitals leaves the total many-electron wave
function unchanged. The hybrid orbital description of the ground state
is therefore equivalent to the delocalised orbital description
for ground state total energy and electron density, as well as the
molecular geometry that corresponds to the minimum total energy value.
The symmetry-adapted and hybridised lone pairs of H2O
Molecules with multiple bonds or multiple lone pairs can have
orbitals represented in terms of sigma and pi symmetry or equivalent
orbitals. Different valence bond methods use either of the two
representations, which have mathematically equivalent total
many-electron wave functions and are related by a unitary transformation of the set of occupied molecular orbitals.
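The equivalence of the two representations can be illustrated numerically. The sketch below builds one σ-type and one π-type lone pair orbital, forms the two equivalent orbitals as their normalised sum and difference, and compares the summed outer products of the coefficient vectors (proportional to the density contributed by the doubly occupied orbitals). The basis, the mixing angle `t`, and all names are hypothetical choices made for illustration.

```python
import math


def outer_sum(orbitals):
    """Sum of outer products c c^T over a set of orbital coefficient vectors."""
    n = len(orbitals[0])
    return [[sum(c[i] * c[j] for c in orbitals) for j in range(n)]
            for i in range(n)]


# Hypothetical coefficients in an orthonormal (s, p_z, p_x) basis.
t = 0.6  # illustrative s-p mixing angle in radians
sigma = [math.cos(t), math.sin(t), 0.0]  # sigma-type (hybridised) lone pair
pi = [0.0, 0.0, 1.0]                     # pi-type (pure p) lone pair

# Equivalent orbitals: a unitary (here orthogonal) mix of sigma and pi.
r = 1.0 / math.sqrt(2.0)
equiv_1 = [r * (a + b) for a, b in zip(sigma, pi)]
equiv_2 = [r * (a - b) for a, b in zip(sigma, pi)]

d_sigma_pi = outer_sum([sigma, pi])
d_equiv = outer_sum([equiv_1, equiv_2])

max_diff = max(abs(x - y)
               for row_a, row_b in zip(d_sigma_pi, d_equiv)
               for x, y in zip(row_a, row_b))
print(f"largest difference between the two densities: {max_diff:.1e}")
```

The difference is zero to within rounding error, so in this simple picture the choice between the two representations does not change the electron density.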
For multiple bonds, the sigma-pi representation is the predominant one compared to the equivalent orbital (bent bond)
representation. In contrast, for multiple lone pairs, most textbooks
use the equivalent orbital representation. However, the sigma-pi
representation is also used, such as by Weinhold and Landis within the
context of natural bond orbitals,
a localised orbital theory containing modernised analogs of classical
(valence bond/Lewis structure) bonding pairs and lone pairs. For the hydrogen fluoride molecule, for example, two F lone pairs are
essentially unhybridised p orbitals, while the other is an spx hybrid orbital. An analogous consideration applies to water (one O lone pair is in a pure p orbital, another is in an spx hybrid orbital).