Search This Blog

Tuesday, April 7, 2026

Optics

From Wikipedia, the free encyclopedia
A researcher working on an optical system

Optics is the branch of physics that studies the behaviour, manipulation, and detection of electromagnetic radiation, including its interactions with matter and instruments that use or detect it. Optics usually describes the behaviour of visible, ultraviolet, and infrared light. The study of optics extends to other forms of electromagnetic radiation, including radio wavesmicrowaves, and X-rays. The term optics is also applied to technology for manipulating beams of elementary charged particles.

Most optical phenomena can be accounted for by using the classical electromagnetic description of light, however, complete electromagnetic descriptions of light are often difficult to apply in practice. Practical optics is usually done using simplified models. The most common of these, geometric optics, treats light as a collection of rays that travel in straight lines and bend when they pass through or reflect from surfaces. Physical optics is a more comprehensive model of light, which includes wave effects such as diffraction and interference that cannot be accounted for in geometric optics. Historically, the ray-based model of light was developed first, followed by the wave model of light. Progress in electromagnetic theory in the 19th century led to the discovery that light waves were in fact electromagnetic radiation.

Some phenomena depend on light having both wave-like and particle-like properties. Explanation of these effects requires quantum mechanics. When considering light's particle-like properties, the light is modelled as a collection of particles called "photons". Quantum optics deals with the application of quantum mechanics to optical systems.

Optical science is relevant to and studied in many related disciplines including astronomy, various engineering fields, photography, and medicine, especially in radiographic methods such as beam radiation therapy and CT scans, and in the physiological optical fields of ophthalmology and optometry. Practical applications of optics are found in a variety of technologies and everyday objects, including mirrors, lenses, telescopes, microscopes, lasers, and fibre optics.

History

The Nimrud lens

Optics began with the development of lenses by the ancient Egyptians and Mesopotamians. The earliest known lenses, made from polished crystal, often quartz, date from as early as 2000 BC from Crete (Archaeological Museum of Heraclion, Greece). Lenses from Rhodes date around 700 BC, as do Assyrian lenses such as the Nimrud lens.[9] The ancient Romans and Greeks filled glass spheres with water to make lenses. These practical developments were followed by the development of theories of light and vision by ancient Greek and Indian philosophers, and the development of geometrical optics in the Greco-Roman world. The word optics comes from the ancient Greek word ὀπτική, optikē 'appearance, look'.

Greek philosophy on optics broke down into two opposing theories on how vision worked, the intromission theory and the emission theory. The intromission approach saw vision as coming from objects casting off copies of themselves (called eidola) that were captured by the eye. With many propagators including Democritus, Epicurus, Aristotle and their followers, this theory seems to have some contact with modern theories of what vision really is, but it remained only speculation lacking any experimental foundation.

Plato first articulated the emission theory, the idea that visual perception is accomplished by rays emitted by the eyes. He also commented on the parity reversal of mirrors in Timaeus. Some hundred years later, Euclid (4th–3rd century BC) wrote a treatise entitled Optics where he linked vision to geometry, creating geometrical optics. He based his work on Plato's emission theory wherein he described the mathematical rules of perspective and described the effects of refraction qualitatively, although he questioned that a beam of light from the eye could instantaneously light up the stars every time someone blinked. Euclid stated the principle of the shortest trajectory of light, and considered multiple reflections on flat and spherical mirrors. Ptolemy, in his treatise Optics, held an extramission-intromission theory of vision: the rays (or flux) from the eye formed a cone, the vertex being within the eye, and the base defining the visual field. The rays were sensitive, and conveyed information back to the observer's intellect about the distance and orientation of surfaces. He summarized much of Euclid and went on to describe a way to measure the angle of refraction, though he failed to notice the empirical relationship between it and the angle of incidence. Plutarch (1st–2nd century AD) described multiple reflections on spherical mirrors and discussed the creation of magnified and reduced images, both real and imaginary, including the case of chirality of the images.

Reproduction of a page of Ibn Sahl's manuscript showing his knowledge of the law of refraction

During the Middle Ages, Greek ideas about optics were resurrected and extended by writers in the Muslim world. One of the earliest of these was Al-Kindi (c. 801–873) who wrote on the merits of Aristotelian and Euclidean ideas of optics, favouring the emission theory since it could better quantify optical phenomena. In 984, the Persian mathematician Ibn Sahl wrote the treatise "On burning mirrors and lenses", correctly describing a law of refraction equivalent to Snell's law. He used this law to compute optimum shapes for lenses and curved mirrors. In the early 11th century, Alhazen (Ibn al-Haytham) wrote the Book of Optics (Kitab al-manazir) in which he explored reflection and refraction and proposed a new system for explaining vision and light based on observation and experiment. He rejected the "emission theory" of Ptolemaic optics with its rays being emitted by the eye, and instead put forward the idea that light reflected in all directions in straight lines from all points of the objects being viewed and then entered the eye, although he was unable to correctly explain how the eye captured the rays. Alhazen's work was largely ignored in the Arabic world but it was anonymously translated into Latin around 1200 A.D. and further summarised and expanded on by the Polish monk Witelo making it a standard text on optics in Europe for the next 400 years.

In the 13th century in medieval Europe, English bishop Robert Grosseteste wrote on a wide range of scientific topics, and discussed light from four different perspectives: an epistemology of light, a metaphysics or cosmogony of light, an etiology or physics of light, and a theology of light, basing it on the works of Aristotle and Platonism. Grosseteste's most famous disciple, Roger Bacon, wrote works citing a wide range of recently translated optical and philosophical works, including those of Alhazen, Aristotle, Avicenna, Averroes, Euclid, al-Kindi, Ptolemy, Tideus, and Constantine the African. Bacon was able to use parts of glass spheres as magnifying glasses to demonstrate that light reflects from objects rather than being released from them.

The first wearable eyeglasses were invented in Italy around 1286. This was the start of the optical industry of grinding and polishing lenses for these "spectacles", first in Venice and Florence in the thirteenth century, and later in the spectacle making centres in both the Netherlands and Germany. Spectacle makers created improved types of lenses for the correction of vision based more on empirical knowledge gained from observing the effects of the lenses rather than using the rudimentary optical theory of the day (theory which for the most part could not even adequately explain how spectacles worked). This practical development, mastery, and experimentation with lenses led directly to the invention of the compound optical microscope around 1595, and the refracting telescope in 1608, both of which appeared in the spectacle making centres in the Netherlands.

The first treatise about optics by Johannes Kepler, Ad Vitellionem paralipomena quibus astronomiae pars optica traditur (1604), generally recognized as the foundation of modern optics.
Cover of the first edition of Newton's Opticks (1704)
Board with optical devices, 1728 Cyclopaedia

In the early 17th century, Johannes Kepler expanded on geometric optics in his writings, covering lenses, reflection by flat and curved mirrors, the principles of pinhole cameras, inverse-square law governing the intensity of light, and the optical explanations of astronomical phenomena such as lunar and solar eclipses and astronomical parallax. He was also able to correctly deduce the role of the retina as the actual organ that recorded images, finally being able to scientifically quantify the effects of different types of lenses that spectacle makers had been observing over the previous 300 years. After the invention of the telescope, Kepler set out the theoretical basis on how they worked and described an improved version, known as the Keplerian telescope, using two convex lenses to produce higher magnification.

Optical theory progressed in the mid-17th century with treatises written by philosopher René Descartes, which explained a variety of optical phenomena including reflection and refraction by assuming that light was emitted by objects which produced it. This differed substantively from the ancient Greek emission theory. In the late 1660s and early 1670s, Isaac Newton expanded Descartes's ideas into a corpuscle theory of light, famously determining that white light was a mix of colours that can be separated into its component parts with a prism. In 1690, Christiaan Huygens proposed a wave theory for light based on suggestions that had been made by Robert Hooke in 1664. Hooke himself publicly criticised Newton's theories of light and the feud between the two lasted until Hooke's death. In 1704, Newton published Opticks and, at the time, partly because of his success in other areas of physics, he was generally considered to be the victor in the debate over the nature of light.

Newtonian optics was generally accepted until the early 19th century when Thomas Young and Augustin-Jean Fresnel conducted experiments on the interference of light that firmly established light's wave nature. Young's famous double slit experiment showed that light followed the superposition principle, which is a wave-like property not predicted by Newton's corpuscle theory. This work led to a theory of diffraction for light and opened an entire area of study in physical optics. Wave optics was successfully unified with electromagnetic theory by James Clerk Maxwell in the 1860s.

The next development in optical theory came in 1899 when Max Planck correctly modelled blackbody radiation by assuming that the exchange of energy between light and matter only occurred in discrete amounts he called quanta. In 1905, Albert Einstein published the theory of the photoelectric effect that firmly established the quantization of light itself. In 1913, Niels Bohr showed that atoms could only emit discrete amounts of energy, thus explaining the discrete lines seen in emission and absorption spectra. The understanding of the interaction between light and matter that followed from these developments not only formed the basis of quantum optics but also was crucial for the development of quantum mechanics as a whole. The ultimate culmination, the theory of quantum electrodynamics, explains all optics and electromagnetic processes in general as the result of the exchange of real and virtual photons. Quantum optics gained practical importance with the inventions of the maser in 1953 and of the laser in 1960.

Following the work of Paul Dirac in quantum field theory, George Sudarshan, Roy J. Glauber, and Leonard Mandel applied quantum theory to the electromagnetic field in the 1950s and 1960s to gain a more detailed understanding of photodetection and the statistics of light.

Classical optics

Classical optics

Classical optics is divided into two main branches: geometrical (or ray) optics and physical (or wave) optics. In geometrical optics, light is considered to travel in straight lines, while in physical optics, light is considered as an electromagnetic wave.

Geometrical optics can be viewed as an approximation of physical optics that applies when the wavelength of the light used is much smaller than the size of the optical elements in the system being modelled.

Geometrical optics

Geometry of reflection and refraction of light rays

Geometrical optics, or ray optics, describes the propagation of light in terms of "rays" which travel in straight lines, and whose paths are governed by the laws of reflection and refraction at interfaces between different media. These laws were discovered empirically as far back as 984 AD and have been used in the design of optical components and instruments from then until the present day. They can be summarised as follows:

When a ray of light hits the boundary between two transparent materials, it is divided into a reflected and a refracted ray.

  • The law of reflection says that the reflected ray lies in the plane of incidence, and the angle of reflection equals the angle of incidence.
  • The law of refraction says that the refracted ray lies in the plane of incidence, and the sine of the angle of incidence divided by the sine of the angle of refraction is a constant: where n is a constant for any two materials and a given wavelength of light. If the first material is air or vacuum, n is the refractive index of the second material.

The laws of reflection and refraction can be derived from Fermat's principle which states that the path taken between two points by a ray of light is the path that can be traversed in the least time.[43]

Approximations

Geometric optics is often simplified by making the paraxial approximation, or "small angle approximation". The mathematical behaviour then becomes linear, allowing optical components and systems to be described by simple matrices. This leads to the techniques of Gaussian optics and paraxial ray tracing, which are used to find basic properties of optical systems, such as approximate image and object positions and magnifications.

Reflections

Diagram of specular reflection

Reflections can be divided into two types: specular reflection and diffuse reflection. Specular reflection describes the gloss of surfaces such as mirrors, which reflect light in a simple, predictable way. This allows for the production of reflected images that can be associated with an actual (real) or extrapolated (virtual) location in space. Diffuse reflection describes non-glossy materials, such as paper or rock. The reflections from these surfaces can only be described statistically, with the exact distribution of the reflected light depending on the microscopic structure of the material. Many diffuse reflectors are described or can be approximated by Lambert's cosine law, which describes surfaces that have equal luminance when viewed from any angle. Glossy surfaces can give both specular and diffuse reflection.

In specular reflection, the direction of the reflected ray is determined by the angle the incident ray makes with the surface normal, a line perpendicular to the surface at the point where the ray hits. The incident and reflected rays and the normal lie in a single plane, and the angle between the reflected ray and the surface normal is the same as that between the incident ray and the normal. This is known as the Law of Reflection.

For flat mirrors, the law of reflection implies that images of objects are upright and the same distance behind the mirror as the objects are in front of the mirror. The image size is the same as the object size. The law also implies that mirror images are parity inverted, which we perceive as a left-right inversion. Images formed from reflection in two (or any even number of) mirrors are not parity inverted. Corner reflectors produce reflected rays that travel back in the direction from which the incident rays came. This is called retroreflection.

Mirrors with curved surfaces can be modelled by ray tracing and using the law of reflection at each point on the surface. For mirrors with parabolic surfaces, parallel rays incident on the mirror produce reflected rays that converge at a common focus. Other curved surfaces may also focus light, but with aberrations due to the diverging shape causing the focus to be smeared out in space. In particular, spherical mirrors exhibit spherical aberration. Curved mirrors can form images with a magnification greater than or less than one, and the magnification can be negative, indicating that the image is inverted. An upright image formed by reflection in a mirror is always virtual, while an inverted image is real and can be projected onto a screen.

Refractions

Illustration of Snell's Law for the case n1 < n2, such as air/water interface

Refraction occurs when light travels through an area of space that has a changing index of refraction; this principle allows for lenses and the focusing of light. The simplest case of refraction occurs when there is an interface between a uniform medium with index of refraction n1 and another medium with index of refraction n2. In such situations, Snell's Law describes the resulting deflection of the light ray:

where θ1 and θ2 are the angles between the normal (to the interface) and the incident and refracted waves, respectively.

The index of refraction of a medium is related to the speed, v, of light in that medium by where c is the speed of light in vacuum.

Snell's Law can be used to predict the deflection of light rays as they pass through linear media as long as the indexes of refraction and the geometry of the media are known. For example, the propagation of light through a prism results in the light ray being deflected depending on the shape and orientation of the prism. In most materials, the index of refraction varies with the frequency of the light, known as dispersion. Taking this into account, Snell's Law can be used to predict how a prism will disperse light into a spectrum. The discovery of this phenomenon when passing light through a prism is famously attributed to Isaac Newton.

Some media have an index of refraction which varies gradually with position and, therefore, light rays in the medium are curved. This effect is responsible for mirages seen on hot days: a change in index of refraction air with height causes light rays to bend, creating the appearance of specular reflections in the distance (as if on the surface of a pool of water). Optical materials with varying indexes of refraction are called gradient-index (GRIN) materials. Such materials are used to make gradient-index optics.

For light rays travelling from a material with a high index of refraction to a material with a low index of refraction, Snell's law predicts that there is no θ2 when θ1 is large. In this case, no transmission occurs; all the light is reflected. This phenomenon is called total internal reflection and allows for fibre optics technology. As light travels down an optical fibre, it undergoes total internal reflection allowing for essentially no light to be lost over the length of the cable.

Lenses
A ray tracing diagram for a converging lens

A device that produces converging or diverging light rays due to refraction is known as a lens. Lenses are characterized by their focal length: a converging lens has positive focal length, while a diverging lens has negative focal length. Smaller focal length indicates that the lens has a stronger converging or diverging effect. The focal length of a simple lens in air is given by the lensmaker's equation.

Ray tracing can be used to show how images are formed by a lens. For a thin lens in air, the location of the image is given by the simple equation

where S1 is the distance from the object to the lens, S2 is the distance from the lens to the image, and f is the focal length of the lens. In the sign convention used here, the object and image distances are positive if the object and image are on opposite sides of the lens.

Incoming parallel rays are focused by a converging lens onto a spot one focal length from the lens, on the far side of the lens. This is called the rear focal point of the lens. Rays from an object at a finite distance are focused further from the lens than the focal distance; the closer the object is to the lens, the further the image is from the lens.

With diverging lenses, incoming parallel rays diverge after going through the lens, in such a way that they seem to have originated at a spot one focal length in front of the lens. This is the lens's front focal point. Rays from an object at a finite distance are associated with a virtual image that is closer to the lens than the focal point, and on the same side of the lens as the object. The closer the object is to the lens, the closer the virtual image is to the lens. As with mirrors, upright images produced by a single lens are virtual, while inverted images are real.

Lenses suffer from aberrations that distort images. Monochromatic aberrations occur because the geometry of the lens does not perfectly direct rays from each object point to a single point on the image, while chromatic aberration occurs because the index of refraction of the lens varies with the wavelength of the light.

Images of black letters in a thin convex lens of focal length f are shown in red. Selected rays are shown for letters E, I and K in blue, green and orange, respectively. Note that E (at 2f) has an equal-size, real and inverted image; I (at f) has its image at infinity; and K (at f/2) has a double-size, virtual and upright image.

Physical optics

In physical optics, light is considered to propagate as waves. This model predicts phenomena such as interference and diffraction, which are not explained by geometric optics. The speed of light waves in air is approximately 3.0×108 m/s (exactly 299,792,458 m/s in vacuum). The wavelength of visible light waves varies between 400 and 700 nm, but the term "light" is also often applied to infrared (0.7–300 μm) and ultraviolet radiation (10–400 nm).

The wave model can be used to make predictions about how an optical system will behave without requiring an explanation of what is "waving" in what medium. Until the middle of the 19th century, most physicists believed in an "ethereal" medium in which the light disturbance propagated. The existence of electromagnetic waves was predicted in 1865 by Maxwell's equations. These waves propagate at the speed of light and have varying electric and magnetic fields which are orthogonal to one another, and also to the direction of propagation of the waves. Light waves are now generally treated as electromagnetic waves except when quantum mechanical effects have to be considered.

Modelling and design of optical systems using physical optics

Many simplified approximations are available for analysing and designing optical systems. Most of these use a single scalar quantity to represent the electric field of the light wave, rather than using a vector model with orthogonal electric and magnetic vectors. The Huygens–Fresnel equation is one such model. This was derived empirically by Fresnel in 1815, based on Huygens' hypothesis that each point on a wavefront generates a secondary spherical wavefront, which Fresnel combined with the principle of superposition of waves. The Kirchhoff diffraction equation, which is derived using Maxwell's equations, puts the Huygens-Fresnel equation on a firmer physical foundation. Examples of the application of Huygens–Fresnel principle can be found in the articles on diffraction and Fraunhofer diffraction.

More rigorous models, involving the modelling of both electric and magnetic fields of the light wave, are required when dealing with materials whose electric and magnetic properties affect the interaction of light with the material. For instance, the behaviour of a light wave interacting with a metal surface is quite different from what happens when it interacts with a dielectric material. A vector model must also be used to model polarised light.

Numerical modeling techniques such as the finite element method, the boundary element method and the transmission-line matrix method can be used to model the propagation of light in systems which cannot be solved analytically. Such models are computationally demanding and are normally only used to solve small-scale problems that require accuracy beyond that which can be achieved with analytical solutions.

All of the results from geometrical optics can be recovered using the techniques of Fourier optics which apply many of the same mathematical and analytical techniques used in acoustic engineering and signal processing.

Gaussian beam propagation is a simple paraxial physical optics model for the propagation of coherent radiation such as laser beams. This technique partially accounts for diffraction, allowing accurate calculations of the rate at which a laser beam expands with distance, and the minimum size to which the beam can be focused. Gaussian beam propagation thus bridges the gap between geometric and physical optics.[59]

Superposition and interference

In the absence of nonlinear effects, the superposition principle can be used to predict the shape of interacting waveforms through the simple addition of the disturbances. This interaction of waves to produce a resulting pattern is generally termed "interference" and can result in a variety of outcomes. If two waves of the same wavelength and frequency are in phase, both the wave crests and wave troughs align. This results in constructive interference and an increase in the amplitude of the wave, which for light is associated with a brightening of the waveform in that location. Alternatively, if the two waves of the same wavelength and frequency are out of phase, then the wave crests will align with wave troughs and vice versa. This results in destructive interference and a decrease in the amplitude of the wave, which for light is associated with a dimming of the waveform at that location. See below for an illustration of this effect.

combined
waveform
wave 1
wave 2

Two waves in phase Two waves 180° out
of phase
When oil or fuel is spilled, colourful patterns are formed by thin-film interference.

Since the Huygens–Fresnel principle states that every point of a wavefront is associated with the production of a new disturbance, it is possible for a wavefront to interfere with itself constructively or destructively at different locations producing bright and dark fringes in regular and predictable patterns. Interferometry is the science of measuring these patterns, usually as a means of making precise determinations of distances or angular resolutions. The Michelson interferometer was a famous instrument which used interference effects to accurately measure the speed of light.

The appearance of thin films and coatings is directly affected by interference effects. Antireflective coatings use destructive interference to reduce the reflectivity of the surfaces they coat, and can be used to minimise glare and unwanted reflections. The simplest case is a single layer with a thickness of one-fourth the wavelength of incident light. The reflected wave from the top of the film and the reflected wave from the film/material interface are then exactly 180° out of phase, causing destructive interference. The waves are only exactly out of phase for one wavelength, which would typically be chosen to be near the centre of the visible spectrum, around 550 nm. More complex designs using multiple layers can achieve low reflectivity over a broad band, or extremely low reflectivity at a single wavelength.

Constructive interference in thin films can create a strong reflection of light in a range of wavelengths, which can be narrow or broad depending on the design of the coating. These films are used to make dielectric mirrors, interference filters, heat reflectors, and filters for colour separation in colour television cameras. This interference effect is also what causes the colourful rainbow patterns seen in oil slicks.

Diffraction and optical resolution

Diffraction on two slits separated by distance d. The bright fringes occur along lines where black lines intersect with black lines and white lines intersect with white lines. These fringes are separated by angle θ and are numbered as order n.

Diffraction is the process by which light interference is most commonly observed. The effect was first described in 1665 by Francesco Maria Grimaldi, who also coined the term from the Latin diffringere 'to break into pieces'. Later that century, Robert Hooke and Isaac Newton also described phenomena now known to be diffraction in Newton's rings while James Gregory recorded his observations of diffraction patterns from bird feathers.

The first physical optics model of diffraction that relied on the Huygens–Fresnel principle was developed in 1803 by Thomas Young in his interference experiments with the interference patterns of two closely spaced slits. Young showed that his results could only be explained if the two slits acted as two unique sources of waves rather than corpuscles. In 1815 and 1818, Augustin-Jean Fresnel firmly established the mathematics of how wave interference can account for diffraction.

The simplest physical models of diffraction use equations that describe the angular separation of light and dark fringes due to light of a particular wavelength (λ). In general, the equation takes the form where d is the separation between two wavefront sources (in the case of Young's experiments, it was two slits), θ is the angular separation between the central fringe and the m-th order fringe, where the central maximum is m = 0.

This equation is modified slightly to take into account a variety of situations such as diffraction through a single gap, diffraction through multiple slits, or diffraction through a diffraction grating that contains a large number of slits at equal spacing. More complicated models of diffraction require working with the mathematics of Fresnel or Fraunhofer diffraction.

X-ray diffraction makes use of the fact that atoms in a crystal have regular spacing at distances that are on the order of one angstrom. To see diffraction patterns, x-rays with similar wavelengths to that spacing are passed through the crystal. Since crystals are three-dimensional objects rather than two-dimensional gratings, the associated diffraction pattern varies in two directions according to Bragg reflection, with the associated bright spots occurring in unique patterns and d being twice the spacing between atoms.

Diffraction effects limit the ability of an optical detector to optically resolve separate light sources. In general, light that is passing through an aperture will experience diffraction and the best images that can be created (as described in diffraction-limited optics) appear as a central spot with surrounding bright rings, separated by dark nulls; this pattern is known as an Airy pattern, and the central bright lobe as an Airy disk. The size of such a disk is given by where θ is the angular resolution, λ is the wavelength of the light, and D is the diameter of the lens aperture. If the angular separation of the two points is significantly less than the Airy disk angular radius, then the two points cannot be resolved in the image, but if their angular separation is much greater than this, distinct images of the two points are formed and they can therefore be resolved. Rayleigh defined the somewhat arbitrary "Rayleigh criterion" that two points whose angular separation is equal to the Airy disk radius (measured to first null, that is, to the first place where no light is seen) can be considered to be resolved. It can be seen that the greater the diameter of the lens or its aperture, the finer the resolution. Interferometry, with its ability to mimic extremely large baseline apertures, allows for the greatest angular resolution possible.

For astronomical imaging, the atmosphere prevents optimal resolution from being achieved in the visible spectrum due to the atmospheric scattering and dispersion which cause stars to twinkle. Astronomers refer to this effect as the quality of astronomical seeing. Techniques known as adaptive optics have been used to eliminate the atmospheric disruption of images and achieve results that approach the diffraction limit.

Dispersion and scattering

Conceptual animation of light dispersion through a prism. High frequency (blue) light is deflected the most, and low frequency (red) the least.

Refractive processes take place in the physical optics limit, where the wavelength of light is similar to other distances, as a kind of scattering. The simplest type of scattering is Thomson scattering which occurs when electromagnetic waves are deflected by single particles. In the limit of Thomson scattering, in which the wavelike nature of light is evident, light is dispersed independent of the frequency, in contrast to Compton scattering which is frequency-dependent and strictly a quantum mechanical process, involving the nature of light as particles. In a statistical sense, elastic scattering of light by numerous particles much smaller than the wavelength of the light is a process known as Rayleigh scattering while the similar process for scattering by particles that are similar or larger in wavelength is known as Mie scattering with the Tyndall effect being a commonly observed result. A small proportion of light scattering from atoms or molecules may undergo Raman scattering, wherein the frequency changes due to excitation of the atoms and molecules. Brillouin scattering occurs when the frequency of light changes due to local changes with time and movements of a dense material.

Dispersion occurs when different frequencies of light have different phase velocities, due either to material properties (material dispersion) or to the geometry of an optical waveguide (waveguide dispersion). The most familiar form of dispersion is a decrease in index of refraction with increasing wavelength, which is seen in most transparent materials. This is called "normal dispersion". It occurs in all dielectric materials, in wavelength ranges where the material does not absorb light. In wavelength ranges where a medium has significant absorption, the index of refraction can increase with wavelength. This is called "anomalous dispersion".

The separation of colours by a prism is an example of normal dispersion. At the surfaces of the prism, Snell's law predicts that light incident at an angle θ to the normal will be refracted at an angle arcsin(sin (θ) / n). Thus, blue light, with its higher refractive index, is bent more strongly than red light, resulting in the well-known rainbow pattern.

Dispersion: two sinusoids propagating at different speeds make a moving interference pattern. The red dot moves with the phase velocity, and the green dots propagate with the group velocity. In this case, the phase velocity is twice the group velocity. The red dot overtakes two green dots, when moving from the left to the right of the figure. In effect, the individual waves (which travel with the phase velocity) escape from the wave packet (which travels with the group velocity).

Material dispersion is often characterised by the Abbe number, which gives a simple measure of dispersion based on the index of refraction at three specific wavelengths. Waveguide dispersion is dependent on the propagation constant. Both kinds of dispersion cause changes in the group characteristics of the wave, the features of the wave packet that change with the same frequency as the amplitude of the electromagnetic wave. "Group velocity dispersion" manifests as a spreading-out of the signal "envelope" of the radiation and can be quantified with a group dispersion delay parameter:

where vg is the group velocity. For a uniform medium, the group velocity is

where n is the index of refraction and c is the speed of light in a vacuum. This gives a simpler form for the dispersion delay parameter:

If D is less than zero, the medium is said to have positive dispersion or normal dispersion. If D is greater than zero, the medium has negative dispersion. If a light pulse is propagated through a normally dispersive medium, the result is the higher frequency components slow down more than the lower frequency components. The pulse therefore becomes positively chirped, or up-chirped, increasing in frequency with time. This causes the spectrum coming out of a prism to appear with red light the least refracted and blue/violet light the most refracted. Conversely, if a pulse travels through an anomalously (negatively) dispersive medium, high-frequency components travel faster than the lower ones, and the pulse becomes negatively chirped, or down-chirped, decreasing in frequency with time.

The result of group velocity dispersion, whether negative or positive, is ultimately temporal spreading of the pulse. This makes dispersion management extremely important in optical communications systems based on optical fibres, since if dispersion is too high, a group of pulses representing information will each spread in time and merge, making it impossible to extract the signal.

Polarisation

Polarisation is a general property of waves that describes the orientation of their oscillations. For transverse waves such as many electromagnetic waves, it describes the orientation of the oscillations in the plane perpendicular to the wave's direction of travel. The oscillations may be oriented in a single direction (linear polarisation), or the oscillation direction may rotate as the wave travels (circular or elliptical polarisation). Circularly polarised waves can rotate rightward or leftward in the direction of travel, and which of those two rotations is present in a wave is called the wave's chirality.

The typical way to consider polarisation is to keep track of the orientation of the electric field vector as the electromagnetic wave propagates. The electric field vector of a plane wave may be arbitrarily divided into two perpendicular components labeled x and y (with z indicating the direction of travel). The shape traced out in the x-y plane by the electric field vector is a Lissajous figure that describes the polarisation state. The following figures show some examples of the evolution of the electric field vector (blue), with time (the vertical axes), at a particular point in space, along with its x and y components (red/left and green/right), and the path traced by the vector in the plane (purple): The same evolution would occur when looking at the electric field at a particular time while evolving the point in space, along the direction opposite to propagation.

Linear polarisation diagram
Linear
Circular polarisation diagram
Circular
Elliptical polarisation diagram
Elliptical polarisation

In the leftmost figure above, the x and y components of the light wave are in phase. In this case, the ratio of their strengths is constant, so the direction of the electric vector (the vector sum of these two components) is constant. Since the tip of the vector traces out a single line in the plane, this special case is called linear polarisation. The direction of this line depends on the relative amplitudes of the two components.

In the middle figure, the two orthogonal components have the same amplitudes and are 90° out of phase. In this case, one component is zero when the other component is at maximum or minimum amplitude. There are two possible phase relationships that satisfy this requirement: the x component can be 90° ahead of the y component or it can be 90° behind the y component. In this special case, the electric vector traces out a circle in the plane, so this polarisation is called circular polarisation. The rotation direction in the circle depends on which of the two-phase relationships exists and corresponds to right-hand circular polarisation and left-hand circular polarisation.

In all other cases, where the two components either do not have the same amplitudes and/or their phase difference is neither zero nor a multiple of 90°, the polarisation is called elliptical polarisation because the electric vector traces out an ellipse in the plane (the polarisation ellipse). This is shown in the above figure on the right. Detailed mathematics of polarisation is done using Jones calculus and is characterised by the Stokes parameters.

Changing polarisation

Media that have different indexes of refraction for different polarisation modes are called birefringent. Well known manifestations of this effect appear in optical wave plates/retarders (linear modes) and in Faraday rotation/optical rotation (circular modes). If the path length in the birefringent medium is sufficient, plane waves will exit the material with a significantly different propagation direction, due to refraction. For example, this is the case with macroscopic crystals of calcite, which present the viewer with two offset, orthogonally polarised images of whatever is viewed through them. It was this effect that provided the first discovery of polarisation, by Erasmus Bartholinus in 1669. In addition, the phase shift, and thus the change in polarisation state, is usually frequency dependent, which, in combination with dichroism, often gives rise to bright colours and rainbow-like effects. In mineralogy, such properties, known as pleochroism, are frequently exploited for the purpose of identifying minerals using polarisation microscopes. Additionally, many plastics that are not normally birefringent will become so when subject to mechanical stress, a phenomenon which is the basis of photoelasticity. Non-birefringent methods, to rotate the linear polarisation of light beams, include the use of prismatic polarisation rotators which use total internal reflection in a prism set designed for efficient collinear transmission.

A polariser changing the orientation of linearly polarised light. In this picture, θ1θ0 = θi.

Media that reduce the amplitude of certain polarisation modes are called dichroic, with devices that block nearly all of the radiation in one mode known as polarising filters or simply "polarisers". Malus' law, which is named after Étienne-Louis Malus, says that when a perfect polariser is placed in a linear polarised beam of light, the intensity, I, of the light that passes through is given by

where I0 is the initial intensity, and θi is the angle between the light's initial polarisation direction and the axis of the polariser.

A beam of unpolarised light can be thought of as containing a uniform mixture of linear polarisations at all possible angles. Since the average value of cos2 θ is 1/2, the transmission coefficient becomes

In practice, some light is lost in the polariser and the actual transmission of unpolarised light will be somewhat lower than this, around 38% for Polaroid-type polarisers but considerably higher (>49.9%) for some birefringent prism types.

In addition to birefringence and dichroism in extended media, polarisation effects can also occur at the (reflective) interface between two materials of different refractive index. These effects are treated by the Fresnel equations. Part of the wave is transmitted and part is reflected, with the ratio depending on the angle of incidence and the angle of refraction. In this way, physical optics recovers Brewster's angle. When light reflects from a thin film on a surface, interference between the reflections from the film's surfaces can produce polarisation in the reflected and transmitted light.

Natural light
The effects of a polarising filter on the sky in a photograph. Left picture is taken without polariser. For the right picture, filter was adjusted to eliminate certain polarisations of the scattered blue light from the sky.

Most sources of electromagnetic radiation contain a large number of atoms or molecules that emit light. The orientation of the electric fields produced by these emitters may not be correlated, in which case the light is said to be unpolarised. If there is partial correlation between the emitters, the light is partially polarised. If the polarisation is consistent across the spectrum of the source, partially polarised light can be described as a superposition of a completely unpolarised component, and a completely polarised one. One may then describe the light in terms of the degree of polarisation, and the parameters of the polarisation ellipse.

Light reflected by shiny transparent materials is partly or fully polarised, except when the light is normal (perpendicular) to the surface. It was this effect that allowed the mathematician Étienne-Louis Malus to make the measurements that allowed for his development of the first mathematical models for polarised light. Polarisation occurs when light is scattered in the atmosphere. The scattered light produces the brightness and colour in clear skies. This partial polarisation of scattered light can be taken advantage of using polarising filters to darken the sky in photographs. Optical polarisation is principally of importance in chemistry due to circular dichroism and optical rotation (circular birefringence) exhibited by optically active (chiral) molecules.

Modern optics

Modern optics encompasses the areas of optical science and engineering that became popular in the 20th century. These areas of optical science typically relate to the electromagnetic or quantum properties of light but do include other topics. A major subfield of modern optics, quantum optics, deals with specifically quantum mechanical properties of light. Quantum optics is not just theoretical; some modern devices, such as lasers, have principles of operation that depend on quantum mechanics. Light detectors, such as photomultipliers and channeltrons, respond to individual photons. Electronic image sensors, such as CCDs, exhibit shot noise corresponding to the statistics of individual photon events. Light-emitting diodes and photovoltaic cells, too, cannot be understood without quantum mechanics. In the study of these devices, quantum optics often overlaps with quantum electronics.

Specialty areas of optics research include the study of how light interacts with specific materials as in crystal optics and metamaterials. Other research focuses on the phenomenology of electromagnetic waves as in singular optics, non-imaging optics, non-linear optics, statistical optics, and radiometry. Additionally, computer engineers have taken an interest in integrated optics, machine vision, and photonic computing as possible components of the "next generation" of computers.

Today, the pure science of optics is called optical science or optical physics to distinguish it from applied optical sciences, which are referred to as optical engineering. Prominent subfields of optical engineering include illumination engineering, photonics, and optoelectronics with practical applications like lens design, fabrication and testing of optical components, and image processing. Some of these fields overlap, with nebulous boundaries between the subjects' terms that mean slightly different things in different parts of the world and in different areas of industry. A professional community of researchers in nonlinear optics has developed in the last several decades due to advances in laser technology.

Lasers

Experiments such as this one with high-power lasers are part of the modern optics research.

A laser is a device that emits light, a kind of electromagnetic radiation, through a process called stimulated emission. The term laser is an acronym for 'Light Amplification by Stimulated Emission of Radiation'. Laser light is usually spatially coherent, which means that the light either is emitted in a narrow, low-divergence beam, or can be converted into one with the help of optical components such as lenses. Because the microwave equivalent of the laser, the maser, was developed first, devices that emit microwave and radio frequencies are usually called masers.

VLT's laser guide star

The first working laser was demonstrated on 16 May 1960 by Theodore Maiman at Hughes Research Laboratories. When first invented, they were called "a solution looking for a problem". Since then, lasers have become a multibillion-dollar industry, finding utility in thousands of highly varied applications. The first application of lasers visible in the daily lives of the general population was the supermarket barcode scanner, introduced in 1974. The laserdisc player, introduced in 1978, was the first successful consumer product to include a laser, but the compact disc player was the first laser-equipped device to become truly common in consumers' homes, beginning in 1982. These optical storage devices use a semiconductor laser less than a millimetre wide to scan the surface of the disc for data retrieval. Fibre-optic communication relies on lasers to transmit large amounts of information at the speed of light. Other common applications of lasers include laser printers and laser pointers. Lasers are used in medicine in areas such as bloodless surgery, laser eye surgery, and laser capture microdissection and in military applications such as missile defence systems, electro-optical countermeasures (EOCM), and lidar. Lasers are also used in holograms, bubblegrams, laser light shows, and laser hair removal.

Kapitsa–Dirac effect

The Kapitsa–Dirac effect causes beams of particles to diffract as the result of meeting a standing wave of light. Light can be used to position matter using various phenomena (see optical tweezers).

Applications

Optics is part of everyday life. The ubiquity of visual systems in biology indicates the central role optics plays as the science of one of the five senses. Many people benefit from eyeglasses or contact lenses, and optics are integral to the functioning of many consumer goods including cameras. Rainbows and mirages are examples of optical phenomena. Optical communication provides the backbone for both the Internet and modern telephony.

Human eye

Model of a human eye. Features mentioned in this article are 1. vitreous humour 3. ciliary muscle, 6. pupil, 7. anterior chamber, 8. cornea, 10. lens cortex, 22. optic nerve, 26. fovea, 30. retina.
The human eye is a living optical device. The iris (light brown region), pupil (black circle in the centre), and sclera (white surrounding area) are visible in this image, along with the eyelids and eyelashes which protect the eye

The human eye functions by focusing light onto a layer of photoreceptor cells called the retina, which forms the inner lining of the back of the eye. The focusing is accomplished by a series of transparent media. Light entering the eye passes first through the cornea, which provides much of the eye's optical power. The light then continues through the fluid just behind the cornea—the anterior chamber, then passes through the pupil. The light then passes through the lens, which focuses the light further and allows adjustment of focus. The light then passes through the main body of fluid in the eye—the vitreous humour, and reaches the retina. The cells in the retina line the back of the eye, except for where the optic nerve exits; this results in a blind spot.

There are two types of photoreceptor cells, rods and cones, which are sensitive to different aspects of light. Rod cells are sensitive to the intensity of light over a wide frequency range, thus are responsible for black-and-white vision. Rod cells are not present on the fovea, the area of the retina responsible for central vision, and are not as responsive as cone cells to spatial and temporal changes in light. There are, however, twenty times more rod cells than cone cells in the retina because the rod cells are present across a wider area. Because of their wider distribution, rods are responsible for peripheral vision.

In contrast, cone cells are less sensitive to the overall intensity of light, but come in three varieties that are sensitive to different frequency-ranges and thus are used in the perception of colour and photopic vision. Cone cells are highly concentrated in the fovea and have a high visual acuity meaning that they are better at spatial resolution than rod cells. Since cone cells are not as sensitive to dim light as rod cells, most night vision is limited to rod cells. Likewise, since cone cells are in the fovea, central vision (including the vision needed to do most reading, fine detail work such as sewing, or careful examination of objects) is done by cone cells.

Ciliary muscles around the lens allow the eye's focus to be adjusted. This process is known as accommodation. The near point and far point define the nearest and farthest distances from the eye at which an object can be brought into sharp focus. For a person with normal vision, the far point is located at infinity. The near point's location depends on how much the muscles can increase the curvature of the lens, and how inflexible the lens has become with age. Optometrists, ophthalmologists, and opticians usually consider an appropriate near point to be closer than normal reading distance—approximately 25 cm.

Defects in vision can be explained using optical principles. As people age, the lens becomes less flexible and the near point recedes from the eye, a condition known as presbyopia. Similarly, people suffering from hyperopia cannot decrease the focal length of their lens enough to allow for nearby objects to be imaged on their retina. Conversely, people who cannot increase the focal length of their lens enough to allow for distant objects to be imaged on the retina suffer from myopia and have a far point that is considerably closer than infinity. A condition known as astigmatism results when the cornea is not spherical but instead is more curved in one direction. This causes horizontally extended objects to be focused on different parts of the retina than vertically extended objects, and results in distorted images.

All of these conditions can be corrected using corrective lenses. For presbyopia and hyperopia, a converging lens provides the extra curvature necessary to bring the near point closer to the eye while for myopia a diverging lens provides the curvature necessary to send the far point to infinity. Astigmatism is corrected with a cylindrical surface lens that curves more strongly in one direction than in another, compensating for the non-uniformity of the cornea.

The optical power of corrective lenses is measured in diopters, a value equal to the reciprocal of the focal length measured in metres; with a positive focal length corresponding to a converging lens and a negative focal length corresponding to a diverging lens. For lenses that correct for astigmatism as well, three numbers are given: one for the spherical power, one for the cylindrical power, and one for the angle of orientation of the astigmatism.

Visual effects

The Ponzo Illusion relies on the fact that parallel lines appear to converge as they approach infinity.

Optical illusions (also called visual illusions) are characterized by visually perceived images that differ from objective reality. The information gathered by the eye is processed in the brain to give a percept that differs from the object being imaged. Optical illusions can be the result of a variety of phenomena including physical effects that create images that are different from the objects that make them, the physiological effects on the eyes and brain of excessive stimulation (e.g. brightness, tilt, colour, movement), and cognitive illusions where the eye and brain make unconscious inferences.

Cognitive illusions include some which result from the unconscious misapplication of certain optical principles. For example, the Ames room, Hering, Müller-Lyer, Orbison, Ponzo, Sander, and Wundt illusions all rely on the suggestion of the appearance of distance by using converging and diverging lines, in the same way that parallel light rays (or indeed any set of parallel lines) appear to converge at a vanishing point at infinity in two-dimensionally rendered images with artistic perspective. This suggestion is also responsible for the famous moon illusion where the moon, despite having essentially the same angular size, appears much larger near the horizon than it does at zenith. This illusion so confounded Ptolemy that he incorrectly attributed it to atmospheric refraction when he described it in his treatise, Optics.

Another type of optical illusion exploits broken patterns to trick the mind into perceiving symmetries or asymmetries that are not present. Examples include the café wall, Ehrenstein, Fraser spiral, Poggendorff, and Zöllner illusions. Related, but not strictly illusions, are patterns that occur due to the superimposition of periodic structures. For example, transparent tissues with a grid structure produce shapes known as moiré patterns, while the superimposition of periodic transparent patterns comprising parallel opaque lines or curves produces line moiré patterns.

Optical instruments

Illustrations of various optical instruments from the 1728 Cyclopaedia

Single lenses have a variety of applications including photographic lenses, corrective lenses, and magnifying glasses while single mirrors are used in parabolic reflectors and rear-view mirrors. Combining a number of mirrors, prisms, and lenses produces compound optical instruments which have practical uses. For example, a periscope is simply two plane mirrors aligned to allow for viewing around obstructions. The most famous compound optical instruments in science are the microscope and the telescope which were both invented by the Dutch in the late 16th century.

Microscopes were first developed with just two lenses: an objective lens and an eyepiece. The objective lens is essentially a magnifying glass and was designed with a very small focal length while the eyepiece generally has a longer focal length. This has the effect of producing magnified images of close objects. Generally, an additional source of illumination is used since magnified images are dimmer due to the conservation of energy and the spreading of light rays over a larger surface area. Modern microscopes, known as compound microscopes have many lenses in them (typically four) to optimize the functionality and enhance image stability. A slightly different variety of microscope, the comparison microscope, looks at side-by-side images to produce a stereoscopic binocular view that appears three dimensional when used by humans.

The first telescopes, called refracting telescopes, were also developed with a single objective and eyepiece lens. In contrast to the microscope, the objective lens of the telescope was designed with a large focal length to avoid optical aberrations. The objective focuses an image of a distant object at its focal point which is adjusted to be at the focal point of an eyepiece of a much smaller focal length. The main goal of a telescope is not necessarily magnification, but rather the collection of light which is determined by the physical size of the objective lens. Thus, telescopes are normally indicated by the diameters of their objectives rather than by the magnification which can be changed by switching eyepieces. Because the magnification of a telescope is equal to the focal length of the objective divided by the focal length of the eyepiece, smaller focal-length eyepieces cause greater magnification.

Since crafting large lenses is much more difficult than crafting large mirrors, most modern telescopes are reflecting telescopes, that is, telescopes that use a primary mirror rather than an objective lens. The same general optical considerations apply to reflecting telescopes that applied to refracting telescopes, namely, the larger the primary mirror, the more light collected, and the magnification is still equal to the focal length of the primary mirror divided by the focal length of the eyepiece. Professional telescopes generally do not have eyepieces and instead place an instrument (often a charge-coupled device) at the focal point instead.

Photography

Photograph taken with aperture f/32
Photograph taken with aperture f/5

The optics of photography involves both lenses and the medium in which the electromagnetic radiation is recorded, whether it be a plate, film, or charge-coupled device. Photographers must consider the reciprocity of the camera and the shot which is summarized by the relation

Exposure ∝ ApertureArea × ExposureTime × SceneLuminance

In other words, the smaller the aperture (giving greater depth of focus), the less light coming in, so the length of time has to be increased (leading to possible blurriness if motion occurs). An example of the use of the law of reciprocity is the Sunny 16 rule which gives a rough estimate for the settings needed to estimate the proper exposure in daylight.

A camera's aperture is measured by a unitless number called the f-number or f-stop, f/#, often notated as , and given by

where is the focal length, and is the diameter of the entrance pupil. By convention, "f/#" is treated as a single symbol, and specific values of f/# are written by replacing the number sign with the value. The two ways to increase the f-stop are to either decrease the diameter of the entrance pupil or change to a longer focal length (in the case of a zoom lens, this can be done by simply adjusting the lens). Higher f-numbers also have a larger depth of field due to the lens approaching the limit of a pinhole camera which is able to focus all images perfectly, regardless of distance, but requires very long exposure times.

The field of view that the lens will provide changes with the focal length of the lens. There are three basic classifications based on the relationship to the diagonal size of the film or sensor size of the camera to the focal length of the lens:

  • Normal lens: angle of view of about 50° (called normal because this angle considered roughly equivalent to human vision) and a focal length approximately equal to the diagonal of the film or sensor.
  • Wide-angle lens: angle of view wider than 60° and focal length shorter than a normal lens.
  • Long focus lens: angle of view narrower than a normal lens. This is any lens with a focal length longer than the diagonal measure of the film or sensor. The most common type of long focus lens is the telephoto lens, a design that uses a special telephoto group to be physically shorter than its focal length.

Modern zoom lenses may have some or all of these attributes.

The required exposure time depends on how sensitive to light the medium being used is (measured by the film speed, or, for digital media, by the quantum efficiency). Early photography used media that had very low light sensitivity, and so exposure times had to be long even for very bright shots. As technology has improved, so has the sensitivity through film cameras and digital cameras.

Other results from physical and geometrical optics apply to camera optics. For example, the maximum resolution capability of a particular camera set-up is determined by the diffraction limit associated with the pupil size and given, roughly, by the Rayleigh criterion.

Atmospheric optics

A colourful sky is often due to scattering of light off particulates and pollution, as in this photograph of a sunset during the October 2007 California wildfires.

The unique optical properties of the atmosphere cause a wide range of spectacular optical phenomena. The blue colour of the sky is a direct result of Rayleigh scattering which redirects higher frequency (blue) sunlight back into the field of view of the observer. Because blue light is scattered more easily than red light, the sun takes on a reddish hue when it is observed through a thick atmosphere, as during a sunrise or sunset. Additional particulate matter in the sky can scatter different colours at different angles creating colourful glowing skies at dusk and dawn. Scattering off of ice crystals and other particles in the atmosphere are responsible for halos, afterglows, coronas, rays of sunlight, and sun dogs. The variation in these kinds of phenomena is due to different particle sizes and geometries.

Mirages are optical phenomena in which light rays are bent due to thermal variations in the refraction index of air, producing displaced or heavily distorted images of distant objects. Other dramatic optical phenomena associated with this include the Novaya Zemlya effect where the sun appears to rise earlier than predicted with a distorted shape. A spectacular form of refraction occurs with a temperature inversion called the Fata Morgana where objects on the horizon or even beyond the horizon, such as islands, cliffs, ships or icebergs, appear elongated and elevated, like "fairy tale castles".

Rainbows are the result of a combination of internal reflection and dispersive refraction of light in raindrops. A single reflection off the backs of an array of raindrops produces a rainbow with an angular size on the sky that ranges from 40° to 42° with red on the outside. Double rainbows are produced by two internal reflections with angular size of 50.5° to 54° with violet on the outside. Because rainbows are seen with the sun 180° away from the centre of the rainbow, rainbows are more prominent the closer the sun is to the horizon.

Computational chemistry

From Wikipedia, the free encyclopedia
C60 molecule with isosurface of ground-state electron density as calculated with density functional theory

Computational chemistry is a branch of chemistry that uses computer simulations to assist in solving chemical problems. It uses methods of theoretical chemistry incorporated into computer programs to calculate the structures and properties of molecules, groups of molecules, and solids. Computational chemists typically focus on developing and applying computer programs and methodologies to specific chemical questions.

The complexity inherent in the many-body problem exacerbates the challenge of providing detailed descriptions of quantum mechanical systems. Computational results may complement information obtained by chemical experiments or predict unobserved chemical phenomena.

History

Building on the founding discoveries and theories in the history of quantum mechanics, the first theoretical calculations in chemistry were those of Walter Heitler and Fritz London in 1927, using valence bond theory. The books that were influential in the early development of computational quantum chemistry include Linus Pauling and E. Bright Wilson's 1935 Introduction to Quantum Mechanics – with Applications to ChemistryEyring, Walter and Kimball's 1944 Quantum Chemistry, Heitler's 1945 Elementary Wave Mechanics – with Applications to Quantum Chemistry, and later Coulson's 1952 textbook Valence, each of which served as primary references for chemists in the decades to follow.

With the development of efficient computer technology in the 1940s, the solutions of elaborate wave equations for complex atomic systems began to be a realizable objective. In the early 1950s, the first semi-empirical atomic orbital calculations were performed. Theoretical chemists became extensive users of the early digital computers. One significant advancement was marked by Clemens C. J. Roothaan's 1951 paper in the Reviews of Modern Physics. This paper focused largely on the "LCAO MO" approach (Linear Combination of Atomic Orbitals Molecular Orbitals). A very detailed account of such use in the United Kingdom is given by Smith and Sutcliffe. The first ab initio Hartree–Fock method calculations on diatomic molecules were performed in 1956 at MIT, using a basis set of Slater orbitals. For diatomic molecules, a systematic study using a minimum basis set and the first calculation with a larger basis set were published by Ransil and Nesbet respectively in 1960. The first polyatomic calculations using Gaussian orbitals were performed in the late 1950s. The first configuration interaction calculations were performed in Cambridge on the EDSAC computer in the 1950s using Gaussian orbitals by Boys and coworkers. By 1971, when a bibliography of ab initio calculations was published, the largest molecules included were naphthalene and azulene.

In 1964, Hückel method calculations (using a simple linear combination of atomic orbitals (LCAO) method to determine electron energies of molecular orbitals of π electrons in conjugated hydrocarbon systems) of molecules, ranging in complexity from butadiene and benzene to ovalene, were generated on computers at Berkeley and Oxford. These empirical methods were replaced in the 1960s by semi-empirical methods such as CNDO.

In the early 1970s, efficient ab initio computer programs such as ATMOL, Gaussian, IBMOL, and POLYAYTOM, began to be used to speed ab initio calculations of molecular orbitals. Of these four programs, only Gaussian, now vastly expanded, is still in use, but many other programs are now in use. At the same time, the methods of molecular mechanics, such as MM2 force field, were developed, primarily by Norman Allinger.

One of the first mentions of the term computational chemistry can be found in the 1970 book Computers and Their Role in the Physical Sciences by Sidney Fernbach and Abraham Haskell Taub, where they state "It seems, therefore, that 'computational chemistry' can finally be more and more of a reality." During the 1970s, widely different methods began to be seen as part of a new emerging discipline of computational chemistry. The Journal of Computational Chemistry was first published in 1980.

Computational chemistry has featured in several Nobel Prize awards. In 1998 the Nobel Prize in Chemistry was awarded to Walter Kohn, "for his development of the density-functional theory", and John Pople, "for his development of computational methods in quantum chemistry". Martin Karplus, Michael Levitt and Arieh Warshel received the 2013 Nobel Prize in Chemistry for "the development of multiscale models for complex chemical systems".

Applications

Computational chemistry has a wide breadth of applications

  • The prediction of the molecular structure of molecules by the use of the simulation of forces, or more accurate quantum chemical methods, to find stationary points on the energy surface as the position of the nuclei is varied.
  • Storing and searching for data on chemical entities (see chemical databases).
  • Identifying correlations between chemical structures and properties (see quantitative structure–property relationship (QSPR) and quantitative structure–activity relationship (QSAR)).
  • Computational approaches to help in the efficient synthesis of compounds.
  • Computational approaches to design molecules that interact in specific ways with other molecules (e.g. drug design and catalysis.

These fields can give rise to several applications as shown below.

Catalysis

Computational chemistry can help predict values like activation energy from catalysis. The presence of the catalyst opens a different reaction pathway (shown in red) with lower activation energy. The final result and the overall thermodynamics are the same.

Computational chemistry is a tool for analyzing catalytic systems without doing experiments. Modern electronic structure theory and density functional theory has allowed researchers to discover and understand catalysts. Computational studies apply theoretical chemistry to catalysis research. Density functional theory methods calculate the energies and orbitals of molecules to give models of those structures. Using these methods, researchers can predict values like activation energy, site reactivity and other thermodynamic properties.

Data that is difficult to obtain experimentally can be found using computational methods to model the mechanisms of catalytic cycles. Skilled computational chemists provide predictions that are close to experimental data with proper considerations of methods and basis sets. With good computational data, researchers can predict how catalysts can be improved to lower the cost and increase the efficiency of these reactions.

Drug development

Computational chemistry is used in drug development to model potentially useful drug molecules and help companies save time and cost in drug development. The drug discovery process involves analyzing data, finding ways to improve current molecules, finding synthetic routes, and testing those molecules. Computational chemistry helps with this process by giving predictions of which experiments would be best to do without conducting other experiments. Computational methods can also find values that are difficult to find experimentally like pKa's of compounds. Methods like density functional theory can be used to model drug molecules and find their properties, like their HOMO and LUMO energies and molecular orbitals.

Aside from drug synthesis, drug carriers are also researched by computational chemists for nanomaterials. It allows researchers to simulate environments to test the effectiveness and stability of drug carriers. Understanding how water interacts with these nanomaterials ensures stability of the material in human bodies. These computational simulations help researchers optimize the material find the best way to structure these nanomaterials before making them.

Computational chemistry databases

Databases are useful for both computational and non computational chemists in research and verifying the validity of computational methods. Empirical data is used to analyze the error of computational methods against experimental data. Empirical data helps researchers with their methods and basis sets to have greater confidence in the researchers results. Computational chemistry databases are also used in testing software or hardware for computational chemistry.

Databases can also use purely calculated data. Purely calculated data uses calculated values over experimental values for databases. Purely calculated data avoids dealing with these adjusting for different experimental conditions like zero-point energy. These calculations can also avoid experimental errors for difficult to test molecules. Though purely calculated data is often not perfect, identifying issues is often easier for calculated data than experimental.

Databases also give public access to information for researchers to use. They contain data that other researchers have found and uploaded to these databases so that anyone can search for them. Researchers use these databases to find information on molecules of interest and learn what can be done with those molecules. Some publicly available chemistry databases include the following.

  • BindingDB: Contains experimental information about protein-small molecule interactions.
  • RCSB: Stores publicly available 3D models of macromolecules (proteins, nucleic acids) and small molecules (drugs, inhibitors)
  • ChEMBL: Contains data from research on drug development such as assay results.
  • DrugBank: Data about mechanisms of drugs can be found here.

Methods

Ab initio method

The programs used in computational chemistry are based on many different quantum-chemical methods that solve the molecular Schrödinger equation associated with the molecular Hamiltonian. Methods that do not include any empirical or semi-empirical parameters in their equations – being derived directly from theory, with no inclusion of experimental data – are called ab initio methods.

Ab initio methods need to define a level of theory (the method) and a basis set. A basis set consists of functions centered on the molecule's atoms. These sets are then used to describe molecular orbitals via the linear combination of atomic orbitals (LCAO) molecular orbital method ansatz.

Diagram illustrating various ab initio electronic structure methods in terms of energy. Spacings are not to scale.

A common type of ab initio electronic structure calculation is the Hartree–Fock method (HF), an extension of molecular orbital theory, where electron-electron repulsions in the molecule are not specifically taken into account; only the electrons' average effect is included in the calculation. As the basis set size increases, the energy and wave function tend towards a limit called the Hartree–Fock limit.

Many types of calculations begin with a Hartree–Fock calculation and subsequently correct for electron-electron repulsion, referred to also as electronic correlation. These types of calculations are termed post–Hartree–Fock methods. By continually improving these methods, scientists can get increasingly closer to perfectly predicting the behavior of atomic and molecular systems under the framework of quantum mechanics, as defined by the Schrödinger equation.

In most cases, the Hartree–Fock wave function occupies a single configuration or determinant. In some cases, particularly for bond-breaking processes, this is inadequate, and several configurations must be used.

The total molecular energy can be evaluated as a function of the molecular geometry; in other words, the potential energy surface. Such a surface can be used for reaction dynamics. The stationary points of the surface lead to predictions of different isomers and the transition structures for conversion between isomers.

Molecular orbital diagram of the conjugated pi systems of the diazomethane molecule using Hartree-Fock Method, CH2N2

Computational thermochemistry

A particularly important objective, called computational thermochemistry, is to calculate thermochemical quantities such as the enthalpy of formation to chemical accuracy. Chemical accuracy is the accuracy required to make realistic chemical predictions and is generally considered to be 1 kcal/mol or 4 kJ/mol. To reach that accuracy in an economic way, it is necessary to use a series of post–Hartree–Fock methods and combine the results. These methods are called quantum chemistry composite methods.

Chemical dynamics

After the electronic and nuclear variables are separated within the Born–Oppenheimer representation, the wave packet corresponding to the nuclear degrees of freedom is propagated via the time evolution operator (physics) associated to the time-dependent Schrödinger equation (for the full molecular Hamiltonian). In the complementary energy-dependent approach, the time-independent Schrödinger equation is solved using the scattering theory formalism. The potential representing the interatomic interaction is given by the potential energy surfaces. In general, the potential energy surfaces are coupled via the vibronic coupling terms.

The most popular methods for propagating the wave packet associated to the molecular geometry are:

Split operator technique

How a computational method solves quantum equations impacts the accuracy and efficiency of the method. The split operator technique is one of these methods for solving differential equations. In computational chemistry, split operator technique reduces computational costs of simulating chemical systems. Computational costs are about how much time it takes for computers to calculate these chemical systems, as it can take days for more complex systems. Quantum systems are difficult and time-consuming to solve for humans. Split operator methods help computers calculate these systems quickly by solving the sub problems in a quantum differential equation. The method does this by separating the differential equation into two different equations, like when there are more than two operators. Once solved, the split equations are combined into one equation again to give an easily calculable solution.

This method is used in many fields that require solving differential equations, such as biology. However, the technique comes with a splitting error. For example, with the following solution for a differential equation.

The equation can be split, but the solutions will not be exact, only similar. This is an example of first order splitting.

There are ways to reduce this error, which include taking an average of two split equations.

Another way to increase accuracy is to use higher order splitting. Usually, second order splitting is the most that is done because higher order splitting requires much more time to calculate and is not worth the cost. Higher order methods become too difficult to implement, and are not useful for solving differential equations despite the higher accuracy.

Computational chemists spend much time making systems calculated with split operator technique more accurate while minimizing the computational cost. Calculating methods is a massive challenge for many chemists trying to simulate molecules or chemical environments.

Density functional methods

Density functional theory (DFT) methods are often considered to be ab initio methods for determining the molecular electronic structure, even though many of the most common functionals use parameters derived from empirical data, or from more complex calculations. In DFT, the total energy is expressed in terms of the total one-electron density rather than the wave function. In this type of calculation, there is an approximate Hamiltonian and an approximate expression for the total electron density, with various different levels of approximation and accuracy. DFT methods can be very accurate for relatively low computational cost. Some methods combine the density functional exchange functional with the Hartree–Fock exchange term and are termed hybrid functional methods, or an additional term for correlation in double-methods methods.

Semi-empirical methods

Semi-empirical quantum chemistry methods are based on the Hartree–Fock method formalism, but make many approximations and obtain some parameters from empirical data. They were very important in computational chemistry from the 60s to the 90s, especially for treating large molecules where the full Hartree–Fock method without the approximations were too costly. The use of empirical parameters appears to allow some inclusion of correlation effects into the methods.

Primitive semi-empirical methods were designed even before, where the two-electron part of the Hamiltonian is not explicitly included. For π-electron systems, this was the Hückel method proposed by Erich Hückel, and for all valence electron systems, the extended Hückel method proposed by Roald Hoffmann. Sometimes, Hückel methods are referred to as "completely empirical" because they do not derive from a Hamiltonian. Yet, the term "empirical methods", or "empirical force fields" is usually used to describe molecular mechanics.

Molecular mechanics potential energy function with continuum solvent

Molecular mechanics

In many cases, large molecular systems can be modeled successfully while avoiding quantum mechanical calculations entirely. Molecular mechanics simulations, for example, use one classical expression for the energy of a compound, for instance, the harmonic oscillator. All constants appearing in the equations must be obtained beforehand from experimental data or ab initio calculations.

The database of compounds used for parameterization, i.e. the resulting set of parameters and functions is called the force field, is crucial to the success of molecular mechanics calculations. A force field parameterized against a specific class of molecules, for instance, proteins, would be expected to only have any relevance when describing other molecules of the same class. These methods can be applied to proteins and other large biological molecules, and allow studies of the approach and interaction (docking) of potential drug molecules.

Molecular dynamics

Molecular dynamics (MD) use either quantum mechanics, molecular mechanics or a mixture of both to calculate forces which are then used to solve Newton's laws of motion to examine the time-dependent behavior of systems. The result of a molecular dynamics simulation is a trajectory that describes how the position and velocity of particles varies with time. The phase point of a system described by the positions and momenta of all its particles on a previous time point will determine the next phase point in time by integrating over Newton's laws of motion.

Monte Carlo

Monte Carlo (MC) generates configurations of a system by making random changes to the positions of its particles, together with their orientations and conformations where appropriate. It is a random sampling method, which makes use of the so-called importance sampling. Importance sampling methods are able to generate low energy states, as this enables properties to be calculated accurately. The potential energy of each configuration of the system can be calculated, together with the values of other properties, from the positions of the atoms.

Quantum mechanics/molecular mechanics (QM/MM)

QM/MM is a hybrid method that attempts to combine the accuracy of quantum mechanics with the speed of molecular mechanics. It is useful for simulating very large molecules such as enzymes.

Quantum Computational Chemistry

Quantum computational chemistry aims to exploit quantum computing to simulate chemical systems, distinguishing itself from the QM/MM (Quantum Mechanics/Molecular Mechanics) approach. While QM/MM uses a hybrid approach, combining quantum mechanics for a portion of the system with classical mechanics for the remainder, quantum computational chemistry exclusively uses quantum computing methods to represent and process information, such as Hamiltonian operators.

Conventional computational chemistry methods often struggle with the complex quantum mechanical equations, particularly due to the exponential growth of a quantum system's wave function. Quantum computational chemistry addresses these challenges using quantum computing methods, such as qubitization and quantum phase estimation, which are believed to offer scalable solutions.

Qubitization involves adapting the Hamiltonian operator for more efficient processing on quantum computers, enhancing the simulation's efficiency. Quantum phase estimation, on the other hand, assists in accurately determining energy eigenstates, which are critical for understanding the quantum system's behavior.

While these techniques have advanced the field of computational chemistry, especially in the simulation of chemical systems, their practical application is currently limited mainly to smaller systems due to technological constraints. Nevertheless, these developments may lead to significant progress towards achieving more precise and resource-efficient quantum chemistry simulations.

Computational costs in chemistry algorithms

The computational cost and algorithmic complexity in chemistry are used to help understand and predict chemical phenomena. They help determine which algorithms/computational methods to use when solving chemical problems. This section focuses on the scaling of computational complexity with molecule size and details the algorithms commonly used in both domains.

In quantum chemistry, particularly, the complexity can grow exponentially with the number of electrons involved in the system. This exponential growth is a significant barrier to simulating large or complex systems accurately.

Advanced algorithms in both fields strive to balance accuracy with computational efficiency. For instance, in MD, methods like Verlet integration or Beeman's algorithm are employed for their computational efficiency. In quantum chemistry, hybrid methods combining different computational approaches (like QM/MM) are increasingly used to tackle large biomolecular systems.

Algorithmic complexity examples

The following list illustrates the impact of computational complexity on algorithms used in chemical computations. It is important to note that while this list provides key examples, it is not comprehensive and serves as a guide to understanding how computational demands influence the selection of specific computational methods in chemistry.

Molecular dynamics

Algorithm

Solves Newton's equations of motion for atoms and molecules.


Complexity

The standard pairwise interaction calculation in MD leads to an complexity for particles. This is because each particle interacts with every other particle, resulting in interactions. Advanced algorithms, such as the Ewald summation or Fast Multipole Method, reduce this to or even by grouping distant particles and treating them as a single entity or using clever mathematical approximations.

Quantum mechanics/molecular mechanics (QM/MM)

Algorithm

Combines quantum mechanical calculations for a small region with molecular mechanics for the larger environment.

Complexity

The complexity of QM/MM methods depends on both the size of the quantum region and the method used for quantum calculations. For example, if a Hartree-Fock method is used for the quantum part, the complexity can be approximated as , where is the number of basis functions in the quantum region. This complexity arises from the need to solve a set of coupled equations iteratively until self-consistency is achieved.

Algorithmic flowchart illustrating the Hartree–Fock method

Hartree-Fock method

Algorithm

Finds a single Fock state that minimizes the energy.

Complexity

NP-hard or NP-complete as demonstrated by embedding instances of the Ising model into Hartree-Fock calculations. The Hartree-Fock method involves solving the Roothaan-Hall equations, which scales as to depending on implementation, with being the number of basis functions. The computational cost mainly comes from evaluating and transforming the two-electron integrals. This proof of NP-hardness or NP-completeness comes from embedding problems like the Ising model into the Hartree-Fock formalism.

An acrolein molecule. DFT gives good results in the prediction of sensitivity of some nanostructures to environmental pollutants such as Acrolein.

Proving the complexity classes for algorithms involves a combination of mathematical proof and computational experiments. For example, in the case of the Hartree-Fock method, the proof of NP-hardness is a theoretical result derived from complexity theory, specifically through reductions from known NP-hard problems.

Density functional theory

Algorithm

Investigates the electronic structure or nuclear structure of many-body systems such as atoms, molecules, and the condensed phases.

Complexity

Traditional implementations of DFT typically scale as , mainly due to the need to diagonalize the Kohn-Sham matrix. The diagonalization step, which finds the eigenvalues and eigenvectors of the matrix, contributes most to this scaling. Recent advances in DFT aim to reduce this complexity through various approximations and algorithmic improvements. There have also been significant speed improvements in eigensolvers, in particular the ELPA solver which is used in many codes.

DFT has QMA complexity.

Standard CCSD and CCSD(T) method

Algorithm

CCSD and CCSD(T) methods are advanced electronic structure techniques involving single, double, and in the case of CCSD(T), perturbative triple excitations for calculating electronic correlation effects.

Complexity

CCSD scales as where is the number of basis functions. This intense computational demand arises from the inclusion of single and double excitations in the electron correlation calculation. With the addition of perturbative triples in CCSD(Tthe complexity increases to . This elevated complexity restricts practical usage to smaller systems, typically up to 20-25 atoms in conventional implementations.

Electron density plot of the 2a1 molecular orbital of methane at the CCSD(T)/cc-pVQZ level. Graphic created with Molden based on correlated geometry optimization with CFOUR at the CCSD(T) level in cc-pVQZ basis.

Linear-scaling CCSD(T) method

Algorithm

An adaptation of the standard CCSD(T) method using local natural orbitals (NOs) to significantly reduce the computational burden and enable application to larger systems.

Complexity

Achieves linear scaling with the system size, a major improvement over the traditional fifth-power scaling of CCSD. This advancement allows for practical applications to molecules of up to 100 atoms with reasonable basis sets, marking a significant step forward in computational chemistry's capability to handle larger systems with high accuracy.

Accuracy

Computational chemistry is not an exact description of real-life chemistry, as the mathematical and physical models of nature can only provide an approximation. However, the majority of chemical phenomena can be described to a certain degree in a qualitative or approximate quantitative computational scheme.

Molecules consist of nuclei and electrons, so the methods of quantum mechanics apply. Computational chemists often attempt to solve the non-relativistic Schrödinger equation, with relativistic corrections added, although some progress has been made in solving the fully relativistic Dirac equation. In principle, it is possible to solve the Schrödinger equation in either its time-dependent or time-independent form, as appropriate for the problem in hand; in practice, this is not possible except for very small systems. Therefore, a great number of approximate methods strive to achieve the best trade-off between accuracy and computational cost.

Accuracy can always be improved with greater computational cost. Significant errors can present themselves in ab initio models comprising many electrons, due to the computational cost of full relativistic-inclusive methods. This complicates the study of molecules interacting with high atomic mass unit atoms, such as transitional metals and their catalytic properties. Present algorithms in computational chemistry can routinely calculate the properties of small molecules that contain up to about 40 electrons with errors for energies less than a few kJ/mol. For geometries, bond lengths can be predicted within a few picometers and bond angles within 0.5 degrees. The treatment of larger molecules that contain a few dozen atoms is computationally tractable by more approximate methods such as density functional theory (DFT).

There is some dispute within the field whether or not the latter methods are sufficient to describe complex chemical reactions, such as those in biochemistry. Large molecules can be studied by semi-empirical approximate methods. Even larger molecules are treated by classical mechanics methods that use what are called molecular mechanics (MM). In QM-MM methods, small parts of large complexes are treated quantum mechanically (QM), and the remainder is treated approximately (MM).

Quantum chemistry

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Quantum_chemistry   ...