
Sunday, August 24, 2025

Information theory

From Wikipedia, the free encyclopedia

A key measure in information theory is entropy. Entropy quantifies the amount of uncertainty involved in the value of a random variable or the outcome of a random process. For example, identifying the outcome of a fair coin flip (which has two equally likely outcomes) provides less information (lower entropy, less uncertainty) than identifying the outcome from a roll of a die (which has six equally likely outcomes). Some other important measures in information theory are mutual information, channel capacity, error exponents, and relative entropy. Important sub-fields of information theory include source coding, algorithmic complexity theory, algorithmic information theory and information-theoretic security.

Applications of fundamental topics of information theory include source coding/data compression (e.g. for ZIP files), and channel coding/error detection and correction (e.g. for DSL). Its impact has been crucial to the success of the Voyager missions to deep space, the invention of the compact disc, the feasibility of mobile phones and the development of the Internet and artificial intelligence. The theory has also found applications in other areas, including statistical inference, cryptography, neurobiology, perception, signal processing, linguistics, the evolution and function of molecular codes (bioinformatics), thermal physics, molecular dynamics, black holes, quantum computing, information retrieval, intelligence gathering, plagiarism detection, pattern recognition, anomaly detection, the analysis of music, art creation, imaging system design, the study of outer space, the dimensionality of space, and epistemology.

Overview

Information theory studies the transmission, processing, extraction, and utilization of information. Abstractly, information can be thought of as the resolution of uncertainty. In the case of communication of information over a noisy channel, this abstract concept was formalized in 1948 by Claude Shannon in a paper entitled A Mathematical Theory of Communication, in which information is thought of as a set of possible messages, and the goal is to send these messages over a noisy channel, and to have the receiver reconstruct the message with low probability of error, in spite of the channel noise. Shannon's main result, the noisy-channel coding theorem, showed that, in the limit of many channel uses, the rate of information that is asymptotically achievable is equal to the channel capacity, a quantity dependent merely on the statistics of the channel over which the messages are sent.

Coding theory is concerned with finding explicit methods, called codes, for increasing the efficiency and reducing the error rate of data communication over noisy channels to near the channel capacity. These codes can be roughly subdivided into data compression (source coding) and error-correction (channel coding) techniques. In the latter case, it took many years to find the methods Shannon's work proved were possible.

A third class of information theory codes are cryptographic algorithms (both codes and ciphers). Concepts, methods and results from coding theory and information theory are widely used in cryptography and cryptanalysis, such as the unit ban.

Historical background

The landmark event establishing the discipline of information theory and bringing it to immediate worldwide attention was the publication of Claude E. Shannon's classic paper "A Mathematical Theory of Communication" in the Bell System Technical Journal in July and October 1948. Historian James Gleick rated the paper as the most important development of 1948, noting that the paper was "even more profound and more fundamental" than the transistor. Shannon came to be known as the "father of information theory". Shannon outlined some of his initial ideas of information theory as early as 1939 in a letter to Vannevar Bush.

Prior to this paper, limited information-theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability. Harry Nyquist's 1924 paper, Certain Factors Affecting Telegraph Speed, contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation W = K log m (recalling the Boltzmann constant), where W is the speed of transmission of intelligence, m is the number of different voltage levels to choose from at each time step, and K is a constant. Ralph Hartley's 1928 paper, Transmission of Information, uses the word information as a measurable quantity, reflecting the receiver's ability to distinguish one sequence of symbols from any other, thus quantifying information as H = log S^n = n log S, where S was the number of possible symbols, and n the number of symbols in a transmission. The unit of information was therefore the decimal digit, which since has sometimes been called the hartley in his honor as a unit or scale or measure of information. Alan Turing in 1940 used similar ideas as part of the statistical analysis of the breaking of the German Second World War Enigma ciphers.

Much of the mathematics behind information theory with events of different probabilities was developed for the field of thermodynamics by Ludwig Boltzmann and J. Willard Gibbs. Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by Rolf Landauer in the 1960s, are explored in Entropy in thermodynamics and information theory.

In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion:

"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."

With it came the ideas of the information entropy and redundancy of a source, and its relevance through the source coding theorem; the mutual information and the capacity of a noisy channel, including the promise of perfect loss-free communication given by the noisy-channel coding theorem; the practical result of the Shannon–Hartley law for the channel capacity of a Gaussian channel; and the bit, a new way of seeing the most fundamental unit of information.

Quantities of information

Information theory is based on probability theory and statistics, where quantified information is usually described in terms of bits. Information theory often concerns itself with measures of information of the distributions associated with random variables. One of the most important measures is called entropy, which forms the building block of many other measures. Entropy quantifies the amount of information in a single random variable.

Another useful concept is mutual information, defined on two random variables, which describes the amount of information the variables have in common and can be used to characterize their correlation. The former quantity (entropy) is a property of the probability distribution of a random variable and gives a limit on the rate at which data generated by independent samples with the given distribution can be reliably compressed. The latter (mutual information) is a property of the joint distribution of two random variables, and is the maximum rate of reliable communication across a noisy channel in the limit of long block lengths, when the channel statistics are determined by the joint distribution.

The choice of logarithmic base in the following formulae determines the unit of information entropy that is used. A common unit of information is the bit or shannon, based on the binary logarithm. Other units include the nat, which is based on the natural logarithm, and the decimal digit, which is based on the common logarithm.

In what follows, an expression of the form p log p is considered by convention to be equal to zero whenever p = 0. This is justified because p log p → 0 as p → 0, for any logarithmic base.

Entropy of an information source

Based on the probability mass function of a source, the Shannon entropy H, in units of bits per symbol, is defined as the expected value of the information content of the symbols.

The amount of information conveyed by an individual source symbol x with probability p(x) is known as its self-information or surprisal, I(x). This quantity is defined as:

I(x) = −log p(x)

A less probable symbol has a larger surprisal, meaning its occurrence provides more information. The entropy is the weighted average of the surprisal of all possible symbols from the source's probability distribution:

H(X) = −Σ_x p(x) log p(x)

Intuitively, the entropy of a discrete random variable X is a measure of the amount of uncertainty associated with the value of X when only its distribution is known. A high entropy indicates the outcomes are more evenly distributed, making the result harder to predict.

For example, if one transmits 1000 bits (0s and 1s), and the value of each of these bits is known to the receiver (has a specific value with certainty) ahead of transmission, no information is transmitted. If, however, each bit is independently and equally likely to be 0 or 1, 1000 shannons of information (more often called bits) have been transmitted.
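
A minimal Python sketch of this calculation (the helper name shannon_entropy is illustrative, not from the article):

    import math

    def shannon_entropy(probs):
        """Entropy H = -sum p*log2(p), using the convention 0*log 0 = 0."""
        return -sum(p * math.log2(p) for p in probs if p > 0)

    print(shannon_entropy([0.5, 0.5]))         # fair coin: 1.0 bit
    print(shannon_entropy([1/6] * 6))          # fair die: ~2.585 bits
    print(1000 * shannon_entropy([0.5, 0.5]))  # 1000 independent fair bits: 1000 shannons

The fair die comes out with higher entropy than the fair coin, matching the comparison made at the start of the article.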

The entropy of a Bernoulli trial as a function of success probability, often called the binary entropy function, Hb(p). The entropy is maximized at 1 bit per trial when the two possible outcomes are equally probable, as in an unbiased coin toss.

Properties

A key property of entropy is that it is maximized when all the messages in the message space are equiprobable. For a source with n possible symbols, where p_i = 1/n for all i, the entropy is given by:

H = log n

This maximum value represents the most unpredictable state.

For a source that emits a sequence of N symbols that are independent and identically distributed (i.i.d.), the total entropy of the message is N · H bits. If the source data symbols are identically distributed but not independent, the entropy of a message of length N will be less than N · H.

Units

The choice of the logarithmic base in the entropy formula determines the unit of entropy used:

  • A base-2 logarithm (as shown in the main formula) measures entropy in bits per symbol. This unit is also sometimes called the shannon in honor of Claude Shannon.
  • A natural logarithm (base e) measures entropy in nats per symbol. This is often used in theoretical analysis as it avoids the need for scaling constants (like ln 2) in derivations.
  • Other bases are also possible. A base-10 logarithm measures entropy in decimal digits, or hartleys, per symbol. A base-256 logarithm measures entropy in bytes per symbol, since 2^8 = 256.

Binary entropy function

The special case of information entropy for a random variable with two outcomes (a Bernoulli trial) is the binary entropy function. This is typically calculated using a base-2 logarithm, and its unit is the shannon. If one outcome has probability p, the other has probability 1 − p. The entropy is given by:

Hb(p) = −p log2 p − (1 − p) log2(1 − p)

This function is depicted in the plot shown above, reaching its maximum of 1 bit when p = 0.5, corresponding to the highest uncertainty.
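
A short Python sketch of the binary entropy function (the function name binary_entropy is illustrative):

    import math

    def binary_entropy(p):
        """Hb(p) = -p*log2(p) - (1-p)*log2(1-p), with Hb(0) = Hb(1) = 0."""
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    for p in (0.1, 0.25, 0.5, 0.9):
        print(p, round(binary_entropy(p), 4))  # peaks at 1.0 bit when p = 0.5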

Joint entropy

The joint entropy of two discrete random variables X and Y is merely the entropy of their pairing: (X, Y). This implies that if X and Y are independent, then their joint entropy is the sum of their individual entropies.

For example, if (X, Y) represents the position of a chess piece, with X the row and Y the column, then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.

Despite similar notation, joint entropy should not be confused with cross-entropy.

Conditional entropy (equivocation)

The conditional entropy or conditional uncertainty of X given random variable Y (also called the equivocation of X about Y) is the average conditional entropy over Y:

H(X|Y) = Σ_y p(y) H(X|Y=y) = −Σ_{x,y} p(x,y) log p(x|y)

Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use. A basic property of this form of conditional entropy is that:

H(X|Y) = H(X,Y) − H(Y)
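
A minimal Python sketch verifying this chain rule on a small assumed joint distribution (the dictionary p_xy and helper H are illustrative):

    import math
    from collections import defaultdict

    # An assumed joint distribution p(x, y); the probabilities sum to 1.
    p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

    def H(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    p_y = defaultdict(float)
    for (x, y), p in p_xy.items():
        p_y[y] += p

    H_joint = H(p_xy.values())  # H(X, Y)
    H_Y = H(p_y.values())       # H(Y)
    print(H_joint - H_Y)        # H(X|Y) by the chain rule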

Mutual information (transinformation)

Mutual information measures the amount of information that can be obtained about one random variable by observing another. It is important in communication where it can be used to maximize the amount of information shared between sent and received signals. The mutual information of X relative to Y is given by:

I(X; Y) = Σ_{x,y} p(x,y) log [ p(x,y) / (p(x) p(y)) ] = Σ_{x,y} p(x,y) SI(x, y)

where SI (specific mutual information) is the pointwise mutual information.

A basic property of the mutual information is that:

I(X; Y) = H(X) − H(X|Y)

That is, knowing Y, we can save an average of I(X; Y) bits in encoding X compared to not knowing Y.

Mutual information is symmetric:

I(X; Y) = I(Y; X) = H(X) + H(Y) − H(X, Y)

Mutual information can be expressed as the average Kullback–Leibler divergence (information gain) between the posterior probability distribution of X given the value of Y and the prior distribution on X:

I(X; Y) = E_{p(y)} [ D_KL( p(X|Y=y) ‖ p(X) ) ]

In other words, this is a measure of how much, on the average, the probability distribution on X will change if we are given the value of Y. This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:

I(X; Y) = D_KL( p(X, Y) ‖ p(X) p(Y) )

Mutual information is closely related to the log-likelihood ratio test in the context of contingency tables and the multinomial distribution and to Pearson's χ2 test: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.
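
A small Python sketch computing mutual information for an assumed joint distribution and checking two of the identities above (the names p_xy, p_x, p_y are illustrative):

    import math

    p_xy = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

    p_x, p_y = {}, {}
    for (x, y), p in p_xy.items():
        p_x[x] = p_x.get(x, 0.0) + p
        p_y[y] = p_y.get(y, 0.0) + p

    def H(probs):
        return -sum(p * math.log2(p) for p in probs if p > 0)

    # I(X;Y) as the expected pointwise mutual information
    I = sum(p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in p_xy.items() if p > 0)

    print(I)                                                     # mutual information in bits
    print(H(p_x.values()) + H(p_y.values()) - H(p_xy.values()))  # same value, via entropies

The second printed value equals the first, illustrating I(X; Y) = H(X) + H(Y) − H(X, Y); the sum itself is exactly the divergence between the joint distribution and the product of its marginals.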

Kullback–Leibler divergence (information gain)

The Kullback–Leibler divergence (or information divergence, information gain, or relative entropy) is a way of comparing two distributions: a "true" probability distribution p(X), and an arbitrary probability distribution q(X). If we compress data in a manner that assumes q(X) is the distribution underlying some data, when, in reality, p(X) is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression. It is thus defined:

D_KL( p(X) ‖ q(X) ) = Σ_x p(x) log [ p(x) / q(x) ]

Although it is sometimes used as a 'distance metric', KL divergence is not a true metric since it is not symmetric and does not satisfy the triangle inequality (making it a semi-quasimetric).

Another interpretation of the KL divergence is the "unnecessary surprise" introduced by a prior from the truth: suppose a number X is about to be drawn randomly from a discrete set with probability distribution p(x). If Alice knows the true distribution p(x), while Bob believes (has a prior) that the distribution is q(x), then Bob will be more surprised than Alice, on average, upon seeing the value of X. The KL divergence is the (objective) expected value of Bob's (subjective) surprisal minus Alice's surprisal, measured in bits if the log is in base 2. In this way, the extent to which Bob's prior is "wrong" can be quantified in terms of how "unnecessarily surprised" it is expected to make him.
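
A brief Python sketch of the definition, using assumed distributions for Alice (p) and Bob (q):

    import math

    def kl_divergence(p, q):
        """D_KL(p || q) = sum p(x) * log2(p(x) / q(x)), in bits."""
        return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

    p = [0.5, 0.5]   # the true distribution Alice knows
    q = [0.9, 0.1]   # Bob's prior

    print(kl_divergence(p, q))  # ~0.74 extra bits per symbol if compressing assuming q
    print(kl_divergence(q, p))  # ~0.53 bits: the divergence is not symmetric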

Directed information

Directed information, I(X^n → Y^n), is an information theory measure that quantifies the information flow from the random process X^n to the random process Y^n. The term directed information was coined by James Massey and is defined as:

I(X^n → Y^n) = Σ_{i=1}^{n} I(X^i; Y_i | Y^{i−1}),

where I(X^i; Y_i | Y^{i−1}) is the conditional mutual information between the first i inputs and the i-th output, given the previous outputs.

In contrast to mutual information, directed information is not symmetric. It measures the information bits that are transmitted causally from X^n to Y^n. Directed information has many applications in problems where causality plays an important role, such as the capacity of channels with feedback, the capacity of discrete memoryless networks with feedback, gambling with causal side information, compression with causal side information, real-time control communication settings, and statistical physics.

Other quantities

Other important information theoretic quantities include the Rényi entropy and the Tsallis entropy (generalizations of the concept of entropy), differential entropy (a generalization of quantities of information to continuous distributions), and the conditional mutual information. Also, pragmatic information has been proposed as a measure of how much information has been used in making a decision.

Coding theory

A picture showing scratches on the readable surface of a CD-R. Music and data CDs are coded using error correcting codes and thus can still be read, via error detection and correction, even if they have minor scratches.

Coding theory is one of the most important and direct applications of information theory. It can be subdivided into source coding theory and channel coding theory. Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.

  • Data compression (source coding): There are two formulations for the compression problem: lossless data compression, in which the data must be reconstructed exactly, and lossy data compression, which allocates the bits needed to reconstruct the data within a fidelity level specified by a distortion function.
  • Error-correcting codes (channel coding): While data compression removes as much redundancy as possible, an error-correcting code adds just the right kind of redundancy (i.e., error correction) needed to transmit the data efficiently and faithfully across a noisy channel.

This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts. However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user. In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the broadcast channel) or intermediary "helpers" (the relay channel), or more general networks, compression followed by transmission may no longer be optimal.

Source theory

Any process that generates successive messages can be considered a source of information. A memoryless source is one in which each message is an independent identically distributed random variable, whereas the properties of ergodicity and stationarity impose less restrictive constraints. All such sources are stochastic. These terms are well studied in their own right outside information theory.

Rate

Information rate is the average entropy per symbol. For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is:

r = lim_{n→∞} H(X_n | X_{n−1}, X_{n−2}, ..., X_1);

that is, the conditional entropy of a symbol given all the previous symbols generated. For the more general case of a process that is not necessarily stationary, the average rate is:

r = lim_{n→∞} (1/n) H(X_1, X_2, ..., X_n);

that is, the limit of the joint entropy per symbol. For stationary sources, these two expressions give the same result.

Either limit, where it exists, defines the information rate of the source.

It is common in information theory to speak of the "rate" or "entropy" of a language. This is appropriate, for example, when the source of information is English prose. The rate of a source of information is related to its redundancy and how well it can be compressed, the subject of source coding.
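
A rough Python sketch of an order-0 (single-symbol) entropy estimate for a piece of text; this ignores context between characters, so it only upper-bounds the true entropy rate of English prose (the sample string and function name are illustrative):

    import math
    from collections import Counter

    def order0_entropy_per_char(text):
        """Entropy of the empirical single-character distribution, in bits per character."""
        counts = Counter(text)
        n = len(text)
        return -sum((c / n) * math.log2(c / n) for c in counts.values())

    sample = "the quick brown fox jumps over the lazy dog"
    print(order0_entropy_per_char(sample))  # roughly 4 bits/char; real English is far more redundant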

Channel capacity

Communication over a channel is the primary motivation of information theory. However, channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.

Consider the communications process over a discrete channel. A simple model of the process is a transmitter encoding messages X, sending them through a noisy channel, and a receiver observing the outputs Y.

Here X represents the space of messages transmitted, and Y the space of messages received during a unit time over our channel. Let p(y|x) be the conditional probability distribution function of Y given X. We will consider p(y|x) to be an inherent fixed property of our communications channel (representing the nature of the noise of our channel). Then the joint distribution of X and Y is completely determined by our channel and by our choice of f(x), the marginal distribution of messages we choose to send over the channel. Under these constraints, we would like to maximize the rate of information, or the signal, we can communicate over the channel. The appropriate measure for this is the mutual information, and this maximum mutual information is called the channel capacity and is given by:

C = max_{f(x)} I(X; Y)

This capacity has the following property related to communicating at information rate R (where R is usually bits per symbol). For any information rate R < C and coding error ε > 0, for large enough N, there exists a code of length N and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error. In addition, for any rate R > C, it is impossible to transmit with arbitrarily small block error.

Channel coding is concerned with finding such nearly optimal codes that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.

Capacity of particular channel models

  • A continuous-time analog communications channel subject to Gaussian noise—see Shannon–Hartley theorem.
  • A binary symmetric channel (BSC) with crossover probability p is a binary input, binary output channel that flips the input bit with probability p. The BSC has a capacity of 1 − Hb(p) bits per channel use, where Hb is the binary entropy function to the base-2 logarithm: Hb(p) = −p log2 p − (1 − p) log2(1 − p).
  • A binary erasure channel (BEC) with erasure probability p is a binary input, ternary output channel. The possible channel outputs are 0, 1, and a third symbol 'e' called an erasure. The erasure represents complete loss of information about an input bit. The capacity of the BEC is 1 − p bits per channel use. Both capacities are evaluated numerically in the sketch after this list.
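
A minimal Python sketch of the two capacity formulas above (the function names are illustrative):

    import math

    def binary_entropy(p):
        if p in (0.0, 1.0):
            return 0.0
        return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

    def bsc_capacity(p):
        """Binary symmetric channel with crossover probability p."""
        return 1.0 - binary_entropy(p)

    def bec_capacity(p):
        """Binary erasure channel with erasure probability p."""
        return 1.0 - p

    print(bsc_capacity(0.11))  # ~0.50 bits per channel use
    print(bec_capacity(0.25))  # 0.75 bits per channel use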

Channels with memory and directed information

In practice many channels have memory. Namely, at time i the channel is given by the conditional probability P(y_i | x_i, x_{i−1}, ..., x_1, y_{i−1}, ..., y_1). It is often more convenient to use the notation x^i = (x_i, x_{i−1}, ..., x_1), so that the channel becomes P(y_i | x^i, y^{i−1}). In such a case the capacity is given by the mutual information rate when there is no feedback available, and by the directed information rate whether or not there is feedback (if there is no feedback the directed information equals the mutual information).

Fungible information

Fungible information is information for which the means of encoding is not important. Classical information theorists and computer scientists are mainly concerned with information of this sort. It is sometimes referred to as speakable information.

Applications to other fields

Intelligence uses and secrecy applications

Information theoretic concepts apply to cryptography and cryptanalysis. Turing's information unit, the ban, was used in the Ultra project, breaking the German Enigma machine code and hastening the end of World War II in Europe. Shannon himself defined an important concept now called the unicity distance. Based on the redundancy of the plaintext, it attempts to give a minimum amount of ciphertext necessary to ensure unique decipherability.

Information theory leads us to believe it is much more difficult to keep secrets than it might first appear. A brute force attack can break systems based on asymmetric key algorithms or on most commonly used methods of symmetric key algorithms (sometimes called secret key algorithms), such as block ciphers. The security of all such methods comes from the assumption that no known attack can break them in a practical amount of time.

Information theoretic security refers to methods such as the one-time pad that are not vulnerable to such brute force attacks. In such cases, the positive conditional mutual information between the plaintext and ciphertext (conditioned on the key) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications. In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key. However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the Venona project was able to crack the one-time pads of the Soviet Union due to their improper reuse of key material.

Pseudorandom number generation

Pseudorandom number generators are widely available in computer language libraries and application programs. They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software. A class of improved random number generators is termed cryptographically secure pseudorandom number generators, but even they require random seeds external to the software to work as intended. These can be obtained via extractors, if done carefully. The measure of sufficient randomness in extractors is min-entropy, a value related to Shannon entropy through Rényi entropy; Rényi entropy is also used in evaluating randomness in cryptographic systems. Although related, the distinctions among these measures mean that a random variable with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptography uses.
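
To illustrate why the distinction matters for extractors, here is a small Python sketch with an assumed toy distribution: it has many bits of Shannon entropy but only one bit of min-entropy, because a single outcome occurs half the time.

    import math

    # One value with probability 1/2; the remaining 2**20 - 1 values share the rest uniformly.
    n_rest = 2**20 - 1
    p_top = 0.5
    p_rest = 0.5 / n_rest

    shannon = -(p_top * math.log2(p_top) + n_rest * p_rest * math.log2(p_rest))
    min_entropy = -math.log2(p_top)

    print(shannon)      # ~11 bits of Shannon entropy
    print(min_entropy)  # only 1 bit of min-entropy: a guesser is right half the time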

Seismic exploration

One early commercial application of information theory was in the field of seismic oil exploration. Work in this field made it possible to strip off and separate the unwanted noise from the desired seismic signal. Information theory and digital signal processing offer a major improvement of resolution and image clarity over previous analog methods.

Semiotics

Semioticians Doede Nauta [nl] and Winfried Nöth both considered Charles Sanders Peirce as having created a theory of information in his works on semiotics. Nauta defined semiotic information theory as the study of "the internal processes of coding, filtering, and information processing."

Concepts from information theory such as redundancy and code control have been used by semioticians such as Umberto Eco and Ferruccio Rossi-Landi [it] to explain ideology as a form of message transmission whereby a dominant social class emits its message by using signs that exhibit a high degree of redundancy such that only one message is decoded among a selection of competing ones.

Integrated process organization of neural information

Quantitative information-theoretic methods have been applied in cognitive science to analyze the integrated process organization of neural information in the context of the binding problem in cognitive neuroscience. In this context, an information-theoretical measure is defined on the basis of a reentrant process organization, i.e. the synchronization of neurophysiological activity between groups of neuronal populations; examples are functional clusters (Gerald Edelman and Giulio Tononi's functional clustering model and dynamic core hypothesis (DCH)) and effective information (Tononi's integrated information theory (IIT) of consciousness). Alternatively, the minimization of free energy is measured on the basis of statistical methods (Karl J. Friston's free energy principle (FEP), an information-theoretical measure which states that every adaptive change in a self-organized system leads to a minimization of free energy, and the Bayesian brain hypothesis).

Miscellaneous applications

Information theory also has applications in the search for extraterrestrial intelligence, black holes, bioinformatics, and gambling.

Saturday, August 23, 2025

Crystallography

From Wikipedia, the free encyclopedia
A crystalline solid: atomic resolution image of strontium titanate. Brighter spots are columns of strontium atoms and darker ones are titanium-oxygen columns.
Octahedral and tetrahedral interstitial sites in a face centered cubic structure
Kikuchi lines in an electron backscatter diffraction pattern of monocrystalline silicon, taken at 20 kV with a field-emission electron source

Crystallography is the branch of science devoted to the study of molecular and crystalline structure and properties. The word crystallography is derived from the Ancient Greek word κρύσταλλος (krústallos; "clear ice, rock-crystal"), and γράφειν (gráphein; "to write"). In July 2012, the United Nations recognised the importance of the science of crystallography by proclaiming 2014 the International Year of Crystallography.

Crystallography is a broad topic, and many of its subareas, such as X-ray crystallography, are themselves important scientific topics. Crystallography ranges from the fundamentals of crystal structure to the mathematics of crystal geometry, including crystals that are not periodic, or quasicrystals. At the atomic scale it can involve the use of X-ray diffraction to produce experimental data that the tools of X-ray crystallography can convert into detailed positions of atoms, and sometimes electron density. At larger scales it includes experimental tools such as orientational imaging to examine the relative orientations at the grain boundary in materials. Crystallography plays a key role in many areas of biology, chemistry, and physics, as well as in new developments in these fields.

History and timeline

Before the 20th century, the study of crystals was based on physical measurements of their geometry using a goniometer. This involved measuring the angles of crystal faces relative to each other and to theoretical reference axes (crystallographic axes), and establishing the symmetry of the crystal in question. The position in 3D space of each crystal face is plotted on a stereographic net such as a Wulff net or Lambert net. The pole to each face is plotted on the net. Each point is labelled with its Miller index. The final plot allows the symmetry of the crystal to be established.

The discovery of X-rays and electrons in the last decade of the 19th century enabled the determination of crystal structures on the atomic scale, which brought about the modern era of crystallography. The first X-ray diffraction experiment was conducted in 1912 by Max von Laue, while electron diffraction was first realized in 1927 in the Davisson–Germer experiment and parallel work by George Paget Thomson and Alexander Reid. These developed into the two main branches of crystallography, X-ray crystallography and electron diffraction. The quality and throughput of solving crystal structures greatly improved in the second half of the 20th century, with the developments of customized instruments and phasing algorithms. Nowadays, crystallography is an interdisciplinary field, supporting theoretical and experimental discoveries in various domains. Modern-day scientific instruments for crystallography vary from laboratory-sized equipment, such as diffractometers and electron microscopes, to dedicated large facilities, such as photoinjectors, synchrotron light sources and free-electron lasers.

Methodology

Crystallographic methods depend mainly on analysis of the diffraction patterns of a sample targeted by a beam of some type. X-rays are most commonly used; other beams used include electrons or neutrons. Crystallographers often explicitly state the type of beam used, as in the terms X-ray diffraction, neutron diffraction and electron diffraction. These three types of radiation interact with the specimen in different ways.

It is hard to focus X-rays or neutrons, but since electrons are charged they can be focused and are used in electron microscopes to produce magnified images. Transmission electron microscopy and related techniques such as scanning transmission electron microscopy and high-resolution electron microscopy can be used in many ways to obtain images with, in many cases, atomic resolution, from which crystallographic information can be obtained. There are also other methods such as low-energy electron diffraction, low-energy electron microscopy and reflection high-energy electron diffraction which can be used to obtain crystallographic information about surfaces.

Applications in various areas

Materials science

Crystallography is used by materials scientists to characterize different materials. In single crystals, the effects of the crystalline arrangement of atoms are often easy to see macroscopically because the natural shapes of crystals reflect the atomic structure. In addition, physical properties are often controlled by crystalline defects. The understanding of crystal structures is an important prerequisite for understanding crystallographic defects. Most materials do not occur as a single crystal, but are poly-crystalline in nature (they exist as an aggregate of small crystals with different orientations). As such, powder diffraction techniques, which take diffraction patterns of samples with a large number of crystals, play an important role in structural determination.

Other physical properties are also linked to crystallography. For example, the minerals in clay form small, flat, platelike structures. Clay can be easily deformed because the platelike particles can slip along each other in the plane of the plates, yet remain strongly connected in the direction perpendicular to the plates. Such mechanisms can be studied by crystallographic texture measurements. Crystallographic studies help elucidate the relationship between a material's structure and its properties, aiding in developing new materials with tailored characteristics. This understanding is crucial in various fields, including metallurgy, geology, and materials science. Advancements in crystallographic techniques, such as electron diffraction and X-ray crystallography, continue to expand our understanding of material behavior at the atomic level.

In another example, iron transforms from a body-centered cubic (bcc) structure called ferrite to a face-centered cubic (fcc) structure called austenite when it is heated. The fcc structure is a close-packed structure unlike the bcc structure; thus the volume of the iron decreases when this transformation occurs.

Crystallography is useful in phase identification. When manufacturing or using a material, it is generally desirable to know what compounds and what phases are present in the material, as their composition, structure and proportions will influence the material's properties. Each phase has a characteristic arrangement of atoms. X-ray or neutron diffraction can be used to identify which structures are present in the material, and thus which compounds are present. Crystallography covers the enumeration of the symmetry patterns which can be formed by atoms in a crystal and for this reason is related to group theory.

Biology

X-ray crystallography is the primary method for determining the molecular conformations of biological macromolecules, particularly proteins and nucleic acids such as DNA and RNA. The first crystal structure of a macromolecule was solved in 1958, a three-dimensional model of the myoglobin molecule obtained by X-ray analysis. Neutron crystallography is often used to help refine structures obtained by X-ray methods or to solve a specific bond; the methods are often viewed as complementary, as X-rays are sensitive to electron positions and scatter most strongly off heavy atoms, while neutrons are sensitive to nucleus positions and scatter strongly even off many light isotopes, including hydrogen and deuterium. Electron diffraction has been used to determine some protein structures, most notably membrane proteins and viral capsids.

Macromolecular structures determined through X-ray crystallography (and other techniques) are housed in the Protein Data Bank (PDB), a freely accessible repository for the structures of proteins and other biological macromolecules. There are many molecular graphics codes available for visualising these structures.

Neutron diffraction

From Wikipedia, the free encyclopedia
 

Neutron diffraction or elastic neutron scattering is the application of neutron scattering to the determination of the atomic and/or magnetic structure of a material. A sample to be examined is placed in a beam of thermal or cold neutrons to obtain a diffraction pattern that provides information on the structure of the material. The technique is similar to X-ray diffraction, but due to their different scattering properties, neutrons and X-rays provide complementary information: X-rays are suited for superficial analysis, strong X-rays from synchrotron radiation are suited for shallow depths or thin specimens, while neutrons, having high penetration depth, are suited for bulk samples.[1]

History

Discovery of the neutron

In 1921, American chemist and physicist William D. Harkins introduced the term "neutron" while studying atomic structure and nuclear reactions. He proposed the existence of a neutral particle within the atomic nucleus, though there was no experimental evidence for it at the time. In 1932, British physicist James Chadwick provided experimental proof of the neutron's existence. His discovery confirmed the presence of this neutral subatomic particle, earning him the Nobel Prize in Physics in 1935. Chadwick's research was influenced by earlier work from Irène and Frédéric Joliot-Curie, who had detected unexplained neutral radiation but had not recognized it as a distinct particle. Neutrons are subatomic particles that exist in the nucleus of the atom; they have a slightly higher mass than protons but no electrical charge.

In the 1930s Enrico Fermi and colleagues made theoretical contributions that established the foundation of neutron scattering. Fermi developed a framework to understand how neutrons interact with atomic nuclei.

Early diffraction work

Diffraction was first observed in 1936 by two groups: von Halban and Preiswerk, and Mitchell and Powers. In 1944, Ernest O. Wollan, with a background in X-ray scattering from his PhD work under Arthur Compton, recognized the potential for applying thermal neutrons from the newly operational X-10 nuclear reactor to crystallography. Joined by Clifford G. Shull, they developed neutron diffraction throughout the 1940s.

Neutron diffraction experiments were carried out in 1945 by Ernest O. Wollan using the Graphite Reactor at Oak Ridge. He was joined shortly thereafter (June 1946) by Clifford Shull, and together they established the basic principles of the technique, and applied it successfully to many different materials, addressing problems like the structure of ice and the microscopic arrangements of magnetic moments in materials. For this achievement, Shull was awarded one half of the 1994 Nobel Prize in Physics. (Wollan died in 1984). (The other half of the 1994 Nobel Prize for Physics went to Bert Brockhouse for development of the inelastic scattering technique at the Chalk River facility of AECL. This also involved the invention of the triple axis spectrometer).

1950–60s

Neutron sources such as reactors and spallation sources were developed, providing high-intensity neutron beams and enabling advanced scattering experiments. Notably, the High Flux Isotope Reactor (HFIR) at Oak Ridge and the Institut Laue-Langevin (ILL) in Grenoble, France, emerged as key institutions for neutron scattering studies.

1970–1980s

This period saw major advancements in neutron scattering, with new techniques developed to explore different aspects of materials science, structure, and behaviour.

Small angle neutron scattering (SANS): Used to investigate large-scale structural features in materials. The works of Glatter and Kratky also helped in the advancements of this method, though it was primarily developed for X-rays.

Inelastic neutron scattering (INS): Provides insights into dynamic processes at the microscopic level. Mainly used to examine atomic and molecular motions.

In 1949, Ernest Wollan and Clifford Shull conducted experiments using a double-crystal neutron spectrometer positioned on the southern side of the ORNL graphite reactor to collect data.

1990-present

Recent advancements focus on improved sources, using sophisticated detectors and enhanced computational techniques. Spallation sources have been developed at SNS (Spallation Neutron Source) in the U.S. and ISIS Neutron and Muon Source in the U.K., which can generate pulsed neutron beams for time-of-flight experiments. Neutron imaging and reflectometry were also developed, which are powerful tools to analyse surfaces, interfaces and thin film structures, thus providing valuable insights into the material properties.

Comparison of neutron scattering, XRD and electron scattering


Feature | Neutron diffraction | X-ray diffraction | Electron scattering
Principle | Interacts with atomic nuclei and magnetic moments, enabling nuclear and magnetic scattering | Scatters off the electron cloud, probing electron density | Scatters off the electrostatic potential, probing electron density
Penetration depth | High (suitable for bulk materials, since neutrons penetrate deeply) | Moderate (good penetration, but absorption by heavy elements) | Low (suitable for surface studies, since electrons are strongly absorbed), or quite deep depending on the energy
Sensitivity to light elements | High (very sensitive to lighter elements like hydrogen or lithium) | Low (poor sensitivity to lighter elements) | High (can detect lighter elements)
Magnetic studies | Excellent (can probe magnetic structure and spin dynamics) | Limited (requires specialized techniques like resonant magnetic scattering) | Yields local information
Resolution | High (depending on technique and instrument) | High (can yield very precise positions for crystal structures) | Very high (can achieve high resolution)
Sample environment | Flexible (used to study samples in different environments) | Flexible | Limited (requires vacuum and thin samples)
Applications | Structure and magnetic properties of materials | X-ray crystallography | Bulk materials, surfaces, defects; see electron diffraction

Principle

Processes

Neutrons are produced through three major processes: fission, spallation, and low-energy nuclear reactions.

Fission

In research reactors, fission takes place when a fissile nucleus, such as uranium-235 (235U), absorbs a neutron and subsequently splits into two smaller fragments. This process releases energy along with additional neutrons. On average, each fission event produces about 2.5 neutrons. While one neutron is required to maintain the chain reaction, the surplus neutrons can be utilized for various experimental applications.

Spallation

In spallation sources, high-energy protons (on the order of 1 GeV) bombard a heavy metal target (e.g., uranium (U), tungsten (W), tantalum (Ta), lead (Pb), or mercury (Hg)). This interaction causes the nuclei to eject neutrons. Proton interactions result in around ten to thirty neutrons per event, of which the bulk are known as "evaporation neutrons" (~2 MeV), while a minority are identified as "cascade neutrons" with energies reaching up to the GeV range. Although spallation is a very efficient method of neutron production, it generates high-energy particles and therefore requires shielding for safety.

Illustration of three major fundamental processes generating neutrons for scattering experiments: Nuclear fission (Top), Spallation (middle), Low energy reaction (bottom).

Low energy nuclear reactions

Low-energy nuclear reactions are the basis of neutron production in accelerator-driven sources. Target materials are selected based on the beam energy; lighter metals such as lithium (Li) and beryllium (Be) can be used to achieve their maximum possible reaction rate under 30 MeV, while heavier elements such as tungsten (W) and carbon (C) provide better performance above 312 MeV. These Compact Accelerator-driven Neutron Sources (CANS) have matured and are now approaching the performance of fission and spallation sources.

De-Broglie relation

Neutron scattering relies on the wave-particle dual nature of neutrons. The de Broglie relation links the wavelength λ of a neutron to its momentum p and kinetic energy E:

λ = h / p = h / (m v),  E = p² / (2m) = h² / (2 m λ²),

where h is the Planck constant, p is the momentum of the neutron, m is the mass of the neutron, and v is the velocity of the neutron.
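
A small Python sketch evaluating these relations for an assumed thermal-neutron speed of about 2200 m/s:

    # Physical constants and an assumed thermal-neutron speed.
    h = 6.62607015e-34       # Planck constant, J*s
    m_n = 1.67492749804e-27  # neutron mass, kg
    v = 2200.0               # typical thermal-neutron speed, m/s

    wavelength = h / (m_n * v)   # lambda = h / (m * v)
    energy_J = 0.5 * m_n * v**2  # E = p^2 / (2m)
    energy_meV = energy_J / 1.602176634e-19 * 1e3

    print(wavelength * 1e10)  # ~1.8 angstroms, comparable to interatomic spacings
    print(energy_meV)         # ~25 meV

The resulting wavelength of roughly one angstrom is what makes thermal neutrons suitable for diffraction from crystals.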

Scattering

Neutron scattering is used to detect the distance between atoms and study the dynamics of materials. It involves two major principles: elastic scattering and inelastic scattering.

Elastic scattering provides insight into the structural properties of materials by looking at the angles at which neutrons are scattered. The resulting pattern of the scattering provides information regarding the atomic structure of crystals, liquids and amorphous materials.

Inelastic scattering focuses on material dynamics through the study of neutron energy and momentum changes during interactions. It is key to study phonons, magnons, and other excitations of solid materials.[22]

Neutron matter interaction

X-rays interact with matter through the electrostatic interaction with the electron cloud of atoms; this limits their application, as they are scattered strongly by electrons. Being neutral, neutrons primarily interact with matter through the short-range strong force with atomic nuclei. Nuclei are far smaller than the electron cloud, meaning most materials are transparent to neutrons and allow deeper penetration. The interaction between neutrons and nuclei is described by the Fermi pseudopotential, in which the nuclei, being far smaller than the neutron wavelength, are treated effectively as point-like scatterers. While most elements have a low tendency to absorb neutrons, certain ones such as cadmium (Cd), gadolinium (Gd), helium (3He), lithium (6Li), and boron (10B) exhibit strong neutron absorption due to nuclear resonance effects. The likelihood of absorption increases with neutron wavelength (σ_a ∝ λ), meaning slower neutrons are absorbed more readily than faster ones.

Instrumental and sample requirements

The technique requires a source of neutrons. Neutrons are usually produced in a nuclear reactor or spallation source. At a research reactor, other components are needed, including a crystal monochromator (in the case of thermal neutrons), as well as filters to select the desired neutron wavelength. Some parts of the setup may also be movable. For the long-wavelength neutrons, crystals cannot be used and gratings are used instead as diffractive optical components. At a spallation source, the time of flight technique is used to sort the energies of the incident neutrons (higher energy neutrons are faster), so no monochromator is needed, but rather a series of aperture elements synchronized to filter neutron pulses with the desired wavelength.

The technique is most commonly performed as powder diffraction, which only requires a polycrystalline powder. Single crystal work is also possible, but the crystals must be much larger than those that are used in single-crystal X-ray crystallography. It is common to use crystals that are about 1 mm³.

The technique also requires a device that can detect the neutrons after they have been scattered.

Summarizing, the main disadvantage to neutron diffraction is the requirement for a nuclear reactor. For single crystal work, the technique requires relatively large crystals, which are usually challenging to grow. The advantages to the technique are many: sensitivity to light atoms, the ability to distinguish isotopes, the absence of radiation damage, and a penetration depth of several cm.

Nuclear scattering

Like all quantum particles, neutrons can exhibit wave phenomena typically associated with light or sound. Diffraction is one of these phenomena; it occurs when waves encounter obstacles whose size is comparable with the wavelength. If the wavelength of a quantum particle is short enough, atoms or their nuclei can serve as diffraction obstacles. When a beam of neutrons emanating from a reactor is slowed and selected properly by speed, the wavelength lies near one angstrom (0.1 nm), the typical separation between atoms in a solid material. Such a beam can then be used to perform a diffraction experiment. Impinging on a crystalline sample, it will scatter under a limited number of well-defined angles, according to the same Bragg law that describes X-ray diffraction.
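
A short Python sketch of the Bragg condition n λ = 2 d sin θ, with assumed example values for the wavelength and lattice-plane spacing:

    import math

    wavelength = 1.8e-10  # assumed neutron wavelength, ~1.8 angstrom
    d_spacing = 2.0e-10   # assumed lattice-plane spacing, 2.0 angstrom
    n = 1                 # first-order reflection

    theta = math.asin(n * wavelength / (2 * d_spacing))  # Bragg angle
    print(math.degrees(2 * theta))  # scattering angle 2-theta, ~53.5 degrees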

Neutrons and X-rays interact with matter differently. X-rays interact primarily with the electron cloud surrounding each atom. The contribution to the diffracted X-ray intensity is therefore larger for atoms with larger atomic number (Z). On the other hand, neutrons interact directly with the nucleus of the atom, and the contribution to the diffracted intensity depends on each isotope; for example, regular hydrogen and deuterium contribute differently. It is also often the case that light (low Z) atoms contribute strongly to the diffracted intensity, even in the presence of large-Z atoms. The scattering length varies from isotope to isotope rather than linearly with the atomic number. An element like vanadium strongly scatters X-rays, but its nucleus hardly scatters neutrons, which is why it is often used as a container material. Non-magnetic neutron diffraction is directly sensitive to the positions of the nuclei of the atoms.

The nuclei of atoms, from which neutrons scatter, are tiny. Furthermore, there is no need for an atomic form factor to describe the shape of the electron cloud of the atom and the scattering power of an atom does not fall off with the scattering angle as it does for X-rays. Diffractograms therefore can show strong, well-defined diffraction peaks even at high angles, particularly if the experiment is done at low temperatures. Many neutron sources are equipped with liquid helium cooling systems that allow data collection at temperatures down to 4.2 K. The superb high angle (i.e. high resolution) information means that the atomic positions in the structure can be determined with high precision. On the other hand, Fourier maps (and to a lesser extent difference Fourier maps) derived from neutron data suffer from series termination errors, sometimes so much that the results are meaningless.

Magnetic scattering

Although neutrons are uncharged, they carry a magnetic moment, and therefore interact with magnetic moments, including those arising from the electron cloud around an atom. Neutron diffraction can therefore reveal the microscopic magnetic structure of a material.

Magnetic scattering does require an atomic form factor as it is caused by the much larger electron cloud around the tiny nucleus. The intensity of the magnetic contribution to the diffraction peaks will therefore decrease towards higher angles.

Uses

Neutron diffraction can be used to determine the static structure factor of gases, liquids or amorphous solids. Most experiments, however, aim at the structure of crystalline solids, making neutron diffraction an important tool of crystallography.

Neutron diffraction is closely related to X-ray powder diffraction. In fact, the single crystal version of the technique is less commonly used because currently available neutron sources require relatively large samples and large single crystals are hard or impossible to come by for most materials. Future developments, however, may well change this picture. Because the data are typically a 1D powder diffractogram, they are usually processed using Rietveld refinement. In fact the latter found its origin in neutron diffraction (at Petten in the Netherlands) and was later extended for use in X-ray diffraction.

One practical application of elastic neutron scattering/diffraction is that the lattice constant of metals and other crystalline materials can be very accurately measured. Together with an accurately aligned micropositioner a map of the lattice constant through the metal can be derived. This can easily be converted to the stress field experienced by the material. This has been used to analyse stresses in aerospace and automotive components to give just two examples. The high penetration depth permits measuring residual stresses in bulk components as crankshafts, pistons, rails, gears. This technique has led to the development of dedicated stress diffractometers, such as the ENGIN-X instrument at the ISIS neutron source.

Neutron diffraction can also be employed to give insight into the 3D structure of any material that diffracts.

Another use is for the determination of the solvation number of ion pairs in electrolyte solutions.

The magnetic scattering effect has been used since the establishment of the neutron diffraction technique to quantify magnetic moments in materials, and study the magnetic dipole orientation and structure. One of the earliest applications of neutron diffraction was in the study of magnetic dipole orientations in antiferromagnetic transition metal oxides such as manganese, iron, nickel, and cobalt oxides. These experiments, first performed by Clifford Shull, were the first to show the existence of the antiferromagnetic arrangement of magnetic dipoles in a material structure. Now, neutron diffraction continues to be used to characterize newly developed magnetic materials.

Hydrogen, null-scattering and contrast variation

Neutron diffraction can be used to establish the structure of low atomic number materials like proteins and surfactants much more easily with lower flux than at a synchrotron radiation source. This is because some low atomic number materials have a higher cross section for neutron interaction than higher atomic weight materials.

One major advantage of neutron diffraction over X-ray diffraction is that the latter is rather insensitive to the presence of hydrogen (H) in a structure, whereas the nuclei 1H and 2H (i.e. Deuterium, D) are strong scatterers for neutrons. The greater scattering power of protons and deuterons means that the position of hydrogen in a crystal and its thermal motions can be determined with greater precision by neutron diffraction. The structures of metal hydride complexes, e.g., Mg2FeH6 have been assessed by neutron diffraction.

The neutron scattering lengths bH = −3.7406(11) fm and bD = 6.671(4) fm, for H and D respectively, have opposite sign, which allows the technique to distinguish them. In fact there is a particular isotope ratio for which the contribution of the element would cancel; this is called null scattering.
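
A tiny Python sketch of the null-scattering condition, using the scattering lengths quoted above; the hydrogen fraction x is the only introduced variable:

    b_H = -3.7406  # coherent scattering length of 1H, in fm
    b_D = 6.671    # coherent scattering length of 2H (deuterium), in fm

    # Average scattering length for a site that is H with fraction x and D with fraction 1 - x:
    #   b_avg(x) = x * b_H + (1 - x) * b_D
    # Null scattering occurs where b_avg(x) = 0:
    x_null = b_D / (b_D - b_H)
    print(x_null)  # ~0.64, i.e. roughly 64% H and 36% D gives zero average scattering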

It is undesirable to work with a relatively high concentration of H in a sample, however. The scattering intensity from H nuclei has a large inelastic component, which creates a large continuous background that is more or less independent of scattering angle. The elastic pattern typically consists of sharp Bragg reflections if the sample is crystalline; these tend to drown in the inelastic background. This is even more serious when the technique is used for the study of liquid structure. Nevertheless, by preparing samples with different isotope ratios, it is possible to vary the scattering contrast enough to highlight one element in an otherwise complicated structure. The variation of other elements is possible but usually rather expensive. Hydrogen is inexpensive and particularly interesting, because it plays an exceptionally large role in biochemical structures and is difficult to study structurally in other ways.

Applications

Study of hydrogen storage materials

Since neutron diffraction is particularly sensitive to lighter elements like hydrogen, it can be used to detect them. It can play a role in determining the crystal structure and hydrogen binding sites within metal hydrides, a class of materials of interest for hydrogen storage applications. The ordering of hydrogen atoms in the lattice reflects the storage capacity and kinetics of the material.

Magnetic structure determination

Neutron diffraction is also a useful technique for determining magnetic structures in materials, as neutrons interact with magnetic moments. It was used, for example, to determine the antiferromagnetic structure of manganese oxide (MnO). Neutron diffraction studies can also measure the size of magnetic moments, and orientation studies demonstrate how neutron diffraction can detect the precise alignment of magnetic moments in materials, something that is much more challenging with X-rays.

Phase transition in ferroelectrics

Neutron diffraction has been widely employed to understand phase transitions in materials, including ferroelectrics, which show changes of crystal structure with temperature or pressure. It has been used, for example, to study the ferroelectric phase transition in lead titanate (PbTiO3) and to analyse the atomic displacements and corresponding lattice distortions.

Residual stress analysis in engineering materials

Neutron diffraction can be used as a nondestructive technique for assessing residual stresses in engineering materials, including metals and alloys.

Lithium-ion batteries

Neutron diffraction is especially useful for the investigation of lithium-ion battery materials, because lithium atoms scatter X-rays only weakly and are therefore difficult to locate with X-ray diffraction. It can further be used to investigate the structural evolution of lithium-ion battery cathode materials during charge and discharge cycles.

High temperature superconductors

Neutron diffraction has played an important role in revealing the crystal and magnetic structures of high-temperature superconductors, for example through studies of magnetic order in the high-temperature superconductor YBa2Cu3O6+x. This work, together with that of other groups across the globe, has revealed the origins of the relationship between magnetic ordering and superconductivity, delivering crucial insights into the mechanism of high-temperature superconductivity.

Mechanical behaviour of alloys

Advancements in neutron diffraction have facilitated in situ investigations into the mechanical deformation of alloys under load, permitting observations on the mechanisms of deformation. The deformation behavior of titanium alloys under mechanical loads can be investigated using in situ neutron diffraction. This technique allows real-time monitoring of lattice strains and phase transformations throughout deformation.

Neutron diffraction, used along with molecular simulations, revealed that an ion channel's voltage sensing domain (red, yellow and blue molecule at center) perturbs the two-layered cell membrane that surrounds it (yellow surfaces), causing the membrane to thin slightly.

Neutron diffraction for ion channels

Neutron diffraction can be used to study ion channels, highlighting how neutrons interact with biological structures to reveal atomic details. Neutron diffraction is particularly sensitive to light elements like hydrogen, making it ideal for mapping water molecules, ion positions, and hydrogen bonds within the channel. By analysing neutron scattering patterns, researchers can determine ion binding sites, hydration structures, and conformational changes essential for ion transport and selectivity.

Current developments in neutron diffraction

Advancements in Neutron Diffraction Research

Neutron diffraction has made significant progress, particularly at Oak Ridge National Laboratory (ORNL), which operates a suite of 12 diffractometers—seven at the Spallation Neutron Source (SNS) and five at the High Flux Isotope Reactor (HFIR). These instruments are designed for different applications and are grouped into three categories: powder diffraction, single crystal diffraction, and advanced diffraction techniques.

To further enhance neutron diffraction research, ORNL is undertaking several key projects:

  • Expansion of the SNS First Target Station: New beamlines equipped with state-of-the-art instruments are being installed to broaden the scope of scientific investigations.
  • Proton Power Upgrade: This initiative aims to double the proton power used for neutron production, which will enhance research efficiency, allow for the study of smaller and more complex samples, and support the eventual development of a next-generation neutron source at SNS.
  • Development of the SNS Second Target Station: A new facility is being constructed to house 22 beamlines, making it a leading source for cold neutron research, crucial for studying soft matter, biological systems, and quantum materials.
  • Enhancements at HFIR: Planned upgrades include optimizing the cold neutron guide hall to improve experimental capabilities, expanding isotope production (including plutonium-238 for space exploration), and enhancing the performance of existing instruments.

These advancements are set to significantly improve neutron diffraction techniques, allowing for more precise and detailed analysis of material structures. By expanding research capabilities and increasing neutron production efficiency, these developments will support a wide range of scientific fields, from materials science to energy research and quantum physics.

Neutron diffraction technology is evolving rapidly, with a focus on improving beam intensity and instrument efficiency. Modern instruments are designed to produce smaller, more intense beams, enabling high-precision studies of smaller samples, which is particularly beneficial for new material research. Advanced detectors, such as boron-based alternatives to helium-3, are being developed to address material shortages, while improved neutron spin manipulation enhances the study of magnetic and structural properties. Computational advancements, including simulations and virtual instruments, are optimizing neutron sources, streamlining experimental design, and integrating machine learning for data analysis. Multiplexing and event-based acquisition systems are enhancing data collection by capturing multiple datasets simultaneously. Additionally, next-generation spallation sources like the European Spallation Source (ESS) and Oak Ridge's Second Target Station (STS) are increasing neutron production efficiency. Lastly, the rise of remote-controlled experiments and automation is improving accessibility and precision in neutron diffraction research.

Modern advancements in neutron diffraction are enhancing data precision, broadening structural research applications, and refining experimental methodologies. A key focus is the improved visualization of hydrogen atoms in biological macromolecules, crucial for studying enzymatic activity and hydrogen bonding. The expansion of specialized diffractometers has increased accessibility in structural biology, with techniques like monochromatic, quasi-Laue, and time-of-flight methods being optimized for efficiency. Innovations in sample preparation, particularly protein deuteration, are minimizing background noise and reducing the need for large crystals. Additionally, computational tools, including quantum chemical modeling, are aiding in the interpretation of complex molecular interactions. Improved neutron sources, such as spallation facilities, along with advanced detectors, are further boosting measurement accuracy and structural resolution. These developments are solidifying neutron diffraction as a critical technique for exploring the molecular architecture of biological systems.

African Pygmies

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/African_Py...