A Medley of Potpourri

Monday, June 11, 2018

Orthogonality

From Wikipedia, the free encyclopedia

The line segments AB and CD are orthogonal to each other.

In mathematics, orthogonality is the generalization of the notion of perpendicularity to the linear algebra of bilinear forms. Two elements u and v of a vector space with bilinear form B are orthogonal when B(u, v) = 0. Depending on the bilinear form, the vector space may contain nonzero self-orthogonal vectors. In the case of function spaces, families of orthogonal functions are used to form a basis.

By extension, orthogonality is also used to refer to the separation of specific features of a system. The term also has specialized meanings in other fields including art and chemistry.

Etymology

The word comes from the Greek ὀρθός (orthos), meaning "upright", and γωνία (gonia), meaning "angle". The ancient Greek ὀρθογώνιον orthogōnion (< ὀρθός orthos 'upright'^[1] + γωνία gōnia 'angle'^[2]) and classical Latin orthogonium originally denoted a rectangle.^[3] Later, they came to mean a right triangle. In the 12th century, the post-classical Latin word orthogonalis came to mean a right angle or something related to a right angle.^[4]

Mathematics and physics

Orthogonality and rotation of coordinate systems compared between left: Euclidean space through circular angle ϕ, right: in Minkowski spacetime through hyperbolic angle ϕ (red lines labelled c denote the worldlines of a light signal, a vector is orthogonal to itself if it lies on this line).^[5]

Definitions

In geometry, two Euclidean vectors are orthogonal if they are perpendicular, i.e., they form a right angle.
Two vectors, x and y, in an inner product space, V, are orthogonal if their inner product $\langle x,y\rangle$ is zero.^[6] This relationship is denoted $x\perp y$ .
Two vector subspaces, A and B, of an inner product space V, are called orthogonal subspaces if each vector in A is orthogonal to each vector in B. The largest subspace of V that is orthogonal to a given subspace is its orthogonal complement.
Given a module M and its dual M^∗, an element m′ of M^∗ and an element m of M are orthogonal if their natural pairing is zero, i.e. ⟨m′, m⟩ = 0. Two sets S′ ⊆ M^∗ and S ⊆ M are orthogonal if each element of S′ is orthogonal to each element of S.^[7]
A term rewriting system is said to be orthogonal if it is left-linear and is non-ambiguous. Orthogonal term rewriting systems are confluent.

A set of vectors in an inner product space is called pairwise orthogonal if each pairing of them is orthogonal. Such a set is called an orthogonal set.

In certain cases, the word normal is used to mean orthogonal, particularly in the geometric sense as in the normal to a surface. For example, the y-axis is normal to the curve y = x² at the origin. However, normal may also refer to the magnitude of a vector. In particular, a set is called orthonormal (orthogonal plus normal) if it is an orthogonal set of unit vectors. As a result, use of the term normal to mean "orthogonal" is often avoided. The word "normal" also has a different meaning in probability and statistics.

A vector space with a bilinear form generalizes the case of an inner product. When the bilinear form applied to two vectors results in zero, then they are orthogonal. The case of a pseudo-Euclidean plane uses the term hyperbolic orthogonality. In the diagram, axes x′ and t′ are hyperbolic-orthogonal for any given ϕ.
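The self-orthogonal vectors and hyperbolic-orthogonal axes mentioned above can be checked numerically. The following is a minimal sketch, assuming the form B(u, v) = u_t v_t − u_x v_x on the (t, x) plane in units where c = 1; the sign convention and the helper names `minkowski` and `boosted_axes` are illustrative choices, not taken from the article.

```python
import math

def minkowski(u, v):
    """Assumed bilinear form B(u, v) = u_t*v_t - u_x*v_x on pairs (t, x), with c = 1."""
    return u[0] * v[0] - u[1] * v[1]

def boosted_axes(phi):
    """Directions of the t' and x' axes after rotating through hyperbolic angle phi."""
    t_axis = (math.cosh(phi), math.sinh(phi))
    x_axis = (math.sinh(phi), math.cosh(phi))
    return t_axis, x_axis

# A vector along a light worldline is nonzero yet orthogonal to itself.
light_ray = (1.0, 1.0)
self_pairing = minkowski(light_ray, light_ray)

# The boosted axes stay hyperbolic-orthogonal for any phi.
t_axis, x_axis = boosted_axes(0.7)
axes_pairing = minkowski(t_axis, x_axis)
```

With this form, `self_pairing` is zero even though `light_ray` is not the zero vector, which is exactly the behaviour an inner product rules out.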

Euclidean vector spaces

In Euclidean space, two vectors are orthogonal if and only if their dot product is zero, i.e. they make an angle of 90° (π/2 radians), or one of the vectors is zero.^[8] Hence orthogonality of vectors is an extension of the concept of perpendicular vectors to spaces of any dimension.

The orthogonal complement of a subspace is the space of all vectors that are orthogonal to every vector in the subspace. In a three-dimensional Euclidean vector space, the orthogonal complement of a line through the origin is the plane through the origin perpendicular to it, and vice versa.^[9]

Note that the geometric concept of two planes being perpendicular does not correspond to the orthogonal complement, since in three dimensions a pair of vectors, one from each of a pair of perpendicular planes, might meet at any angle.

In four-dimensional Euclidean space, the orthogonal complement of a line is a hyperplane and vice versa, and that of a plane is a plane.^[9]

Orthogonal functions

By using integral calculus, it is common to use the following to define the inner product of two functions f and g with respect to a nonnegative weight function w over an interval [a, b]:

$\langle f,g\rangle _{w}=\int _{a}^{b}f(x)g(x)w(x)\,dx.$

In simple cases, w(x) = 1.

We say that functions f and g are orthogonal if their inner product (equivalently, the value of this integral) is zero:

$\langle f,g\rangle _{w}=0.$

Orthogonality of two functions with respect to one inner product does not imply orthogonality with respect to another inner product.

We write the norm with respect to this inner product as

$\|f\|_{w}={\sqrt {\langle f,f\rangle _{w}}}.$

The members of a set of functions {f_i : i = 1, 2, 3, ...} are orthogonal with respect to w on the interval [a, b] if

$\langle f_{i},f_{j}\rangle _{w}=0\quad {\text{for }}i\neq j.$

The members of such a set of functions are orthonormal with respect to w on the interval [a, b] if

$\langle f_{i},f_{j}\rangle _{w}=\delta _{i,j},$

where

$\delta _{i,j}={\begin{cases}1,&i=j\\0,&i\neq j\end{cases}}$

is the Kronecker delta. In other words, every pair of them (excluding pairing of a function with itself) is orthogonal, and the norm of each is 1. See in particular the orthogonal polynomials.
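The definitions above translate directly into a numerical check. The sketch below approximates the weighted inner product with a midpoint rule; the quadrature method, the step count, the tolerance and the function names are assumptions made for illustration. It also shows that orthogonality depends on the weight: 1 and x are orthogonal on [−1, 1] for w(x) = 1 but not for the nonnegative weight w(x) = 1 + x.

```python
import math

def inner(f, g, a, b, w=lambda x: 1.0, steps=20_000):
    """Midpoint-rule approximation of <f, g>_w = integral over [a, b] of f(x) g(x) w(x)."""
    h = (b - a) / steps
    total = 0.0
    for i in range(steps):
        x = a + (i + 0.5) * h
        total += f(x) * g(x) * w(x)
    return total * h

def norm(f, a, b, w=lambda x: 1.0):
    """Norm induced by the weighted inner product."""
    return math.sqrt(inner(f, f, a, b, w))

def is_orthonormal(funcs, a, b, w=lambda x: 1.0, tol=1e-6):
    """Check <f_i, f_j>_w = delta_ij for every pair, up to the tolerance tol."""
    for i, f in enumerate(funcs):
        for j, g in enumerate(funcs):
            target = 1.0 if i == j else 0.0
            if abs(inner(f, g, a, b, w) - target) > tol:
                return False
    return True

# Trigonometric functions rescaled to unit norm on [-pi, pi].
family = [
    lambda x: math.sin(x) / math.sqrt(math.pi),
    lambda x: math.cos(x) / math.sqrt(math.pi),
    lambda x: math.sin(2 * x) / math.sqrt(math.pi),
]
family_ok = is_orthonormal(family, -math.pi, math.pi)

# Same pair of functions, two different weights.
unweighted = inner(lambda x: 1.0, lambda x: x, -1.0, 1.0)
weighted = inner(lambda x: 1.0, lambda x: x, -1.0, 1.0, w=lambda x: 1.0 + x)
```

Here `unweighted` is zero up to rounding, while `weighted` approximates the integral of x(1 + x) over [−1, 1], which is 2/3.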

Examples

The vectors (1, 3, 2)^T, (3, −1, 0)^T, (1, 3, −5)^T are orthogonal to each other, since (1)(3) + (3)(−1) + (2)(0) = 0, (3)(1) + (−1)(3) + (0)(−5) = 0, and (1)(1) + (3)(3) + (2)(−5) = 0.
The vectors (1, 0, 1, 0, ...)^T and (0, 1, 0, 1, ...)^T are orthogonal to each other. The dot product of these vectors is 0. We can then make the generalization to consider the vectors in Z₂ⁿ:

$\mathbf {v} _{k}=\sum _{i=0 \atop ai+k<n}^{n/a}\mathbf {e} _{ai+k}$

for some positive integer a, and for 1 ≤ k ≤ a − 1, these vectors are orthogonal, for example (1, 0, 0, 1, 0, 0, 1, 0)^T, (0, 1, 0, 0, 1, 0, 0, 1)^T, (0, 0, 1, 0, 0, 1, 0, 0)^T are orthogonal.

The functions 2t + 3 and 45t² + 9t − 17 are orthogonal with respect to a unit weight function on the interval from −1 to 1:

$\int _{-1}^{1}\left(2t+3\right)\left(45t^{2}+9t-17\right)\,dt=0$
The functions 1, sin(nx), cos(nx) : n = 1, 2, 3, ... are orthogonal with respect to Riemann integration on the intervals [0, 2π], [−π, π], or any other closed interval of length 2π. This fact is a central one in Fourier series.
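The first and third examples above can be verified with exact arithmetic. This is a small sketch using Python's `fractions` module; the helper names `dot`, `poly_mul` and `integrate`, and the choice to store polynomials as coefficient lists with the lowest degree first, are illustrative assumptions.

```python
from fractions import Fraction
from itertools import combinations

def dot(u, v):
    """Euclidean dot product of two equal-length tuples."""
    return sum(a * b for a, b in zip(u, v))

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def integrate(p, a, b):
    """Exact integral of a polynomial over [a, b] using rational arithmetic."""
    a, b = Fraction(a), Fraction(b)
    return sum(Fraction(c, k + 1) * (b ** (k + 1) - a ** (k + 1))
               for k, c in enumerate(p))

vectors = [(1, 3, 2), (3, -1, 0), (1, 3, -5)]
pairwise = [dot(u, v) for u, v in combinations(vectors, 2)]

f = [3, 2]          # 2t + 3
g = [-17, 9, 45]    # 45t^2 + 9t - 17
fg_integral = integrate(poly_mul(f, g), -1, 1)
```

Every entry of `pairwise` and the value of `fg_integral` come out as exactly zero, matching the hand computations in the text.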

Orthogonal polynomials

Various polynomial sequences named for mathematicians of the past are sequences of orthogonal polynomials. In particular:
- The Hermite polynomials are orthogonal with respect to the Gaussian distribution with zero mean value.
- The Legendre polynomials are orthogonal with respect to the uniform distribution on the interval [−1, 1].
- The Laguerre polynomials are orthogonal with respect to the exponential distribution. Somewhat more general Laguerre polynomial sequences are orthogonal with respect to gamma distributions.
- The Chebyshev polynomials of the first kind are orthogonal with respect to the measure $1/{\sqrt {1-x^{2}}}.$
- The Chebyshev polynomials of the second kind are orthogonal with respect to the Wigner semicircle distribution.
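As an illustration of the Legendre case in the list above, the following sketch builds the first few Legendre polynomials and computes their pairwise integrals over [−1, 1] exactly. It assumes the standard three-term (Bonnet) recurrence and uses the constant weight 1 rather than the uniform density 1/2; the constant factor does not affect which integrals vanish. The function names are hypothetical.

```python
from fractions import Fraction

def legendre(n):
    """Coefficients (lowest degree first) of P_n, built with Bonnet's recurrence
    (k + 1) P_{k+1}(x) = (2k + 1) x P_k(x) - k P_{k-1}(x)."""
    prev, cur = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return prev
    for k in range(1, n):
        shifted = [Fraction(0)] + cur                                  # x * P_k
        padded = prev + [Fraction(0)] * (len(shifted) - len(prev))     # P_{k-1}
        nxt = [((2 * k + 1) * s - k * p) / (k + 1)
               for s, p in zip(shifted, padded)]
        prev, cur = cur, nxt
    return cur

def inner(p, q):
    """Exact integral of p(x) q(x) over [-1, 1] with constant weight 1."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:        # odd powers integrate to zero on [-1, 1]
                total += a * b * Fraction(2, i + j + 1)
    return total

# Gram matrix of P_0 .. P_4: zero off the diagonal, 2 / (2n + 1) on it.
gram = [[inner(legendre(m), legendre(n)) for n in range(5)] for m in range(5)]
```

The nonzero diagonal shows that these polynomials are orthogonal but not orthonormal; dividing each P_n by the square root of 2/(2n + 1) would normalize them.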

Orthogonal states in quantum mechanics

In quantum mechanics, a sufficient (but not necessary) condition that two eigenstates of a Hermitian operator, $\psi _{m}$ and $\psi _{n}$ , are orthogonal is that they correspond to different eigenvalues. This means, in Dirac notation, that $\langle \psi _{m}|\psi _{n}\rangle =0$ if $\psi _{m}$ and $\psi _{n}$ correspond to different eigenvalues. This follows from the fact that Schrödinger's equation is a Sturm–Liouville equation (in Schrödinger's formulation) or that observables are given by Hermitian operators (in Heisenberg's formulation).^{[citation needed]}

Art

In art, the perspective (imaginary) lines pointing to the vanishing point are referred to as "orthogonal lines".

The term "orthogonal line" often has a quite different meaning in the literature of modern art criticism. Many works by painters such as Piet Mondrian and Burgoyne Diller are noted for their exclusive use of "orthogonal lines" — not, however, with reference to perspective, but rather referring to lines that are straight and exclusively horizontal or vertical, forming right angles where they intersect. For example, an essay at the Web site of the Thyssen-Bornemisza Museum states that "Mondrian ... dedicated his entire oeuvre to the investigation of the balance between orthogonal lines and primary colours." [1]

Computer science

Orthogonality in programming language design is the ability to use various language features in arbitrary combinations with consistent results.^[10] This usage was introduced by Van Wijngaarden in the design of Algol 68:

The number of independent primitive concepts has been minimized in order that the language be easy to describe, to learn, and to implement. On the other hand, these concepts have been applied “orthogonally” in order to maximize the expressive power of the language while trying to avoid deleterious superfluities.^[11]

Orthogonality is a system design property which guarantees that modifying the technical effect produced by a component of a system neither creates nor propagates side effects to other components of the system. Typically this is achieved through the separation of concerns and encapsulation, and it is essential for feasible and compact designs of complex systems. The emergent behavior of a system consisting of components should be controlled strictly by formal definitions of its logic and not by side effects resulting from poor integration, i.e., non-orthogonal design of modules and interfaces. Orthogonality reduces testing and development time because it is easier to verify designs that neither cause side effects nor depend on them.

An instruction set is said to be orthogonal if it lacks redundancy (i.e., there is only a single instruction that can be used to accomplish a given task)^[12] and is designed such that instructions can use any register in any addressing mode. This terminology results from considering an instruction as a vector whose components are the instruction fields. One field identifies the registers to be operated upon and another specifies the addressing mode. An orthogonal instruction set uniquely encodes all combinations of registers and addressing modes.^{[citation needed]}

Communications

In communications, multiple-access schemes are orthogonal when an ideal receiver can completely reject arbitrarily strong unwanted signals from the desired signal using different basis functions. One such scheme is TDMA, where the orthogonal basis functions are nonoverlapping rectangular pulses ("time slots").

Another scheme is orthogonal frequency-division multiplexing (OFDM), which refers to the use, by a single transmitter, of a set of frequency-multiplexed signals with the exact minimum frequency spacing needed to make them orthogonal so that they do not interfere with each other. Well-known examples include the a, g, and n versions of 802.11 Wi-Fi; WiMAX; ITU-T G.hn; DVB-T, the terrestrial digital TV broadcast system used in most of the world outside North America; and DMT (Discrete Multi Tone), the standard form of ADSL.

In OFDM, the subcarrier frequencies are chosen so that the subcarriers are orthogonal to each other, meaning that crosstalk between the subchannels is eliminated and intercarrier guard bands are not required. This greatly simplifies the design of both the transmitter and the receiver. In conventional FDM, a separate filter for each subchannel is required.
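The orthogonality of the subcarriers can be illustrated with a short numerical sketch. It assumes idealized complex-exponential subcarriers sampled N times over one symbol period; the sampling model, the value N = 64 and the helper names are assumptions for illustration and are not drawn from any of the standards named above.

```python
import cmath

def subcarrier(k, n_samples):
    """Samples of the subcarrier exp(2*pi*i*k*n/N) for n = 0 .. N-1."""
    return [cmath.exp(2j * cmath.pi * k * n / n_samples) for n in range(n_samples)]

def correlate(x, y):
    """Discrete inner product: sum of x[n] * conj(y[n]) over one symbol period."""
    return sum(a * b.conjugate() for a, b in zip(x, y))

N = 64
# Two subcarriers on the grid (integer k): no crosstalk, up to rounding.
crosstalk = abs(correlate(subcarrier(3, N), subcarrier(5, N)))
# A subcarrier correlated with itself: magnitude N.
self_power = abs(correlate(subcarrier(3, N), subcarrier(3, N)))
# Half a grid step off: the correlation no longer vanishes.
off_grid = abs(correlate(subcarrier(3, N), subcarrier(3.5, N)))
```

The `off_grid` case suggests why the frequency spacing has to be exact: a subcarrier that falls between grid frequencies leaks into its neighbours.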

Statistics, econometrics, and economics

When performing statistical analysis, independent variables that affect a particular dependent variable are said to be orthogonal if they are uncorrelated,^[13] since the covariance forms an inner product. In this case the same results are obtained for the effect of any of the independent variables upon the dependent variable, regardless of whether one models the effects of the variables individually with simple regression or simultaneously with multiple regression. If correlation is present, the factors are not orthogonal and different results are obtained by the two methods. This usage arises from the fact that if centered by subtracting the expected value (the mean), uncorrelated variables are orthogonal in the geometric sense discussed above, both as observed data (i.e., vectors) and as random variables (i.e., density functions). One econometric formalism that is alternative to the maximum likelihood framework, the Generalized Method of Moments, relies on orthogonality conditions. In particular, the Ordinary Least Squares estimator may be easily derived from an orthogonality condition between the explanatory variables and model residuals.
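The orthogonality condition behind ordinary least squares can be seen in a small example. The sketch below fits a line to made-up data with the closed-form simple-regression formulas and then checks that the residuals are orthogonal to both the constant column and the explanatory variable; the data values and the name `ols_fit` are purely illustrative.

```python
def ols_fit(xs, ys):
    """Simple least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b

xs = [1.0, 2.0, 3.0, 4.0, 5.0]      # made-up data for illustration only
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
a, b = ols_fit(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# Dot products of the residual vector with each regressor column.
resid_dot_const = sum(residuals)
resid_dot_x = sum(r * x for r, x in zip(residuals, xs))
```

Both dot products vanish up to rounding. Requiring them to be zero is one way to derive the estimator, since these two conditions are the normal equations for the fit.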

Taxonomy

In taxonomy, an orthogonal classification is one in which no item is a member of more than one group, that is, the classifications are mutually exclusive.

Combinatorics

In combinatorics, two n×n Latin squares are said to be orthogonal if their superimposition yields all possible n² combinations of entries.^[14]
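The definition can be tested directly by collecting the ordered pairs produced by superimposing two squares. A minimal sketch follows; the two 3×3 squares are an illustrative choice built from (r + c) mod 3 and (2r + c) mod 3, and the function names are hypothetical.

```python
def is_latin(square):
    """True if every row and every column contains each of the n symbols once."""
    n = len(square)
    symbols = set(square[0])
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all({square[r][c] for r in range(n)} == symbols for c in range(n))
    return len(symbols) == n and rows_ok and cols_ok

def are_orthogonal(a, b):
    """True if superimposing a and b yields all n*n ordered pairs of entries."""
    n = len(a)
    pairs = {(a[r][c], b[r][c]) for r in range(n) for c in range(n)}
    return len(pairs) == n * n

A = [[0, 1, 2], [1, 2, 0], [2, 0, 1]]   # entry (r + c) mod 3
B = [[0, 1, 2], [2, 0, 1], [1, 2, 0]]   # entry (2r + c) mod 3
```

`are_orthogonal(A, B)` holds for this pair, whereas superimposing a square on itself produces only n distinct pairs, so `are_orthogonal(A, A)` does not.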

Chemistry and biochemistry

In synthetic organic chemistry, orthogonal protection is a strategy allowing the deprotection of functional groups independently of each other. In chemistry and biochemistry, an orthogonal interaction occurs when there are two pairs of substances and each substance can interact with its respective partner, but does not interact with either substance of the other pair. For example, DNA has two orthogonal pairs: cytosine and guanine form a base-pair, and adenine and thymine form another base-pair, but other base-pair combinations are strongly disfavored. As a chemical example, tetrazine reacts with trans-cyclooctene and azide reacts with cyclooctyne without any cross-reaction, so these are mutually orthogonal reactions and can be performed simultaneously and selectively.^[15] Bioorthogonal chemistry refers to chemical reactions occurring inside living systems without reacting with naturally present cellular components. In supramolecular chemistry, the notion of orthogonality refers to the possibility of two or more supramolecular, often non-covalent, interactions being compatible, each forming reversibly without interference from the other.

In analytical chemistry, analyses are "orthogonal" if they make a measurement or identification in completely different ways, thus increasing the reliability of the measurement. This is often required as a part of a new drug application.

System reliability

In the field of system reliability, orthogonal redundancy is that form of redundancy where the form of backup device or method is completely different from the error-prone device or method. The failure mode of an orthogonally redundant backup device or method does not intersect with and is completely different from the failure mode of the device or method in need of redundancy to safeguard the total system against catastrophic failure.

Neuroscience

In neuroscience, a sensory map in the brain which has overlapping stimulus coding (e.g. location and quality) is called an orthogonal map.

Gaming

In board games such as chess which feature a grid of squares, 'orthogonal' is used to mean "in the same row/'rank' or column/'file'". This is the counterpart to squares which are "diagonally adjacent".^[16] In the ancient Chinese board game Go a player can capture the stones of an opponent by occupying all orthogonally-adjacent points.
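For a grid-based board, orthogonal adjacency reduces to four coordinate offsets. The sketch below is a simplified illustration: it treats a board as a size × size grid of (row, column) points and only considers a single stone, not connected groups as the full rules of Go require. The function names are hypothetical.

```python
def orthogonal_neighbors(row, col, size):
    """Points one step away along the same row or column, inside the board."""
    steps = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    return [(row + dr, col + dc) for dr, dc in steps
            if 0 <= row + dr < size and 0 <= col + dc < size]

def is_surrounded(stone, occupied, size):
    """True if every orthogonally adjacent point of a single stone is occupied."""
    return all(p in occupied for p in orthogonal_neighbors(*stone, size))

corner = orthogonal_neighbors(0, 0, 19)
centre = orthogonal_neighbors(9, 9, 19)
```

A corner point has two orthogonal neighbours and an interior point has four; diagonally adjacent points never appear in either list.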

Other examples

Stereo vinyl records encode both the left and right stereo channels in a single groove. The V-shaped groove in the vinyl has walls that are 90 degrees to each other, with variations in each wall separately encoding one of the two analogue channels that make up the stereo signal. The cartridge senses the motion of the stylus following the groove in two orthogonal directions: 45 degrees from vertical to either side.^[17] A pure horizontal motion corresponds to a mono signal, equivalent to a stereo signal in which both channels carry identical (in-phase) signals.

Alpha Centauri system could have favorable conditions for life

X-ray radiation poses no threat to planets orbiting these two nearby Sun-like stars.

By Amber Jorgenson | Published: Friday, June 8, 2018

Original link: http://www.astronomy.com/news/2018/06/alpha-centauri-system-could-have-favorable-conditions-for-life

Alpha Centauri is the closest star system to Earth, and it happens to house Sun-like stars. The system sits only 4 light-years away, or 25 trillion miles (40 trillion kilometers), and Chandra found that two of its stars could have favorable conditions for habitable exoplanets.

X-ray: NASA/CXC/University of Colorado/T.Ayres; Optical: Zdeněk Bardon/ESO

The search for habitable exoplanets spans far and wide, pushing the limits of what our modern telescopes are capable of. But rest assured that we aren’t ignoring what’s in our own backyard. Researchers have kept diligent eyes on Alpha Centauri, the closest system to Earth that happens to house Sun-like stars. And now, a comprehensive study published in Research Notes of the AAS clears Alpha Centauri’s two brightest stars of a crucial habitability factor: dangerous X-ray radiation.

In the study, NASA’s Chandra X-ray Observatory has observed the three stars of Alpha Centauri, which sits just 4 light-years from Earth, twice a year since 2005. In an effort to determine the habitability of any planets within their orbits, Chandra monitored the amount of X-ray radiation that each star emitted into its habitable zone. An excess of X-ray radiation can wreak havoc on a planet by dissolving its atmosphere, causing harmful effects for potential residents, and creating destructive space weather that could mess with any technology possibly in use. But thankfully, the potential planets orbiting two of the three stars don’t have to worry about any of that. In fact, these stars might actually create better planetary conditions than our own Sun.

"Because it is relatively close, the Alpha Centauri system is seen by many as the best candidate to explore for signs of life," said the study’s author, Tom Ayres of the University of Colorado Boulder, in a press release. "The question is, will we find planets in an environment conducive to life as we know it?"

The three stars that make up Alpha Centauri aren’t exactly created equal, with some more hospitable to life than others. The two brightest stars in the system are a pair known as Alpha Cen A and Alpha Cen B (AB for short), which orbit each other so closely that Chandra is the only observatory precise enough to differentiate their X-rays. Farther out in the system is Alpha Cen C, known as Proxima, which is the closest non-Sun-like star to Earth. The AB pair are both remarkably similar to our Sun, with Alpha Cen A almost identical in size, brightness, and age, and Alpha Cen B only slightly smaller and dimmer.

Alpha Cen A and Alpha Cen B might look distinct in this image captured by NASA’s Hubble Space Telescope, but without high-precision instruments, the two Sun-like stars appear as a single bright object in the sky. ESA/NASA

In regard to X-ray radiation, Alpha Cen A actually provides a safer planetary environment than the Sun, emitting lower doses of X-rays to its habitable zone. Alpha Cen B creates an environment that’s only marginally worse than the Sun, releasing higher amounts of X-rays by only a factor of five.

"This is very good news for Alpha Cen AB in terms of the ability of possible life on any of their planets to survive radiation bouts from the stars," Ayres said. "Chandra shows us that life should have a fighting chance on planets around either of these stars."

Proxima is a different story, though. It’s a significantly smaller red dwarf that emits about 500 times more X-ray radiation into its habitable zone than Earth receives from the Sun, and can radiate 50,000 times more during the massive X-ray flares that it’s known to hurl into space. While the AB duo’s X-ray radiation isn’t a threat to life, the massive dose expelled by Proxima definitely is.

And as luck would have it, the only exoplanet that’s been identified in Alpha Centauri is orbiting uninhabitable Proxima. Researchers haven’t given up hope, though. They continue to search for exoplanets around the AB pair, although their tight orbit makes it difficult to spot anything in between the two. But even if the search continues to turn up empty, Chandra’s extensive investigation will help researchers study the X-ray radiation patterns of stars similar to our Sun, allowing us to pinpoint any potential threats to Earth. And if we do come across planets orbiting these two stars, we might just find signs of life in our own backyard.

Vector space

From Wikipedia, the free encyclopedia

Vector addition and scalar multiplication: a vector v (blue) is added to another vector w (red, upper illustration). Below, w is stretched by a factor of 2, yielding the sum v + 2w.

A vector space (also called a linear space) is a collection of objects called vectors, which may be added together and multiplied ("scaled") by numbers, called scalars. Scalars are often taken to be real numbers, but there are also vector spaces with scalar multiplication by complex numbers, rational numbers, or generally any field. The operations of vector addition and scalar multiplication must satisfy certain requirements, called axioms, listed below.

Euclidean vectors are an example of a vector space. They represent physical quantities such as forces: any two forces (of the same type) can be added to yield a third, and the multiplication of a force vector by a real multiplier is another force vector. In the same vein, but in a more geometric sense, vectors representing displacements in the plane or in three-dimensional space also form vector spaces. Vectors in vector spaces do not necessarily have to be arrow-like objects as they appear in the mentioned examples: vectors are regarded as abstract mathematical objects with particular properties, which in some cases can be visualized as arrows.

Vector spaces are the subject of linear algebra and are well characterized by their dimension, which, roughly speaking, specifies the number of independent directions in the space. Infinite-dimensional vector spaces arise naturally in mathematical analysis, as function spaces, whose vectors are functions. These vector spaces are generally endowed with additional structure, which may be a topology, allowing the consideration of issues of proximity and continuity. Among these topologies, those that are defined by a norm or inner product are more commonly used, as having a notion of distance between two vectors. This is particularly the case of Banach spaces and Hilbert spaces, which are fundamental in mathematical analysis.

Historically, the first ideas leading to vector spaces can be traced back as far as the 17th century's analytic geometry, matrices, systems of linear equations, and Euclidean vectors. The modern, more abstract treatment, first formulated by Giuseppe Peano in 1888, encompasses more general objects than Euclidean space, but much of the theory can be seen as an extension of classical geometric ideas like lines, planes and their higher-dimensional analogs.

Today, vector spaces are applied throughout mathematics, science and engineering. They are the appropriate linear-algebraic notion to deal with systems of linear equations. They offer a framework for Fourier expansion, which is employed in image compression routines, and they provide an environment that can be used for solution techniques for partial differential equations. Furthermore, vector spaces furnish an abstract, coordinate-free way of dealing with geometrical and physical objects such as tensors. This in turn allows the examination of local properties of manifolds by linearization techniques. Vector spaces may be generalized in several ways, leading to more advanced notions in geometry and abstract algebra.

Introduction and definition

The concept of vector space will first be explained by describing two particular examples:

First example: arrows in the plane

The first example of a vector space consists of arrows in a fixed plane, starting at one fixed point. This is used in physics to describe forces or velocities. Given any two such arrows, v and w, the parallelogram spanned by these two arrows contains one diagonal arrow that starts at the origin, too. This new arrow is called the sum of the two arrows and is denoted v + w. In the special case of two arrows on the same line, their sum is the arrow on this line whose length is the sum or the difference of the lengths, depending on whether the arrows have the same direction. Another operation that can be done with arrows is scaling: given any positive real number a, the arrow that has the same direction as v, but is dilated or shrunk by multiplying its length by a, is called multiplication of v by a. It is denoted av. When a is negative, av is defined as the arrow pointing in the opposite direction, instead.

The following shows a few examples: if a = 2, the resulting vector aw has the same direction as w, but is stretched to the double length of w (right image below). Equivalently, 2w is the sum w + w. Moreover, (−1)v = −v has the opposite direction and the same length as v (blue vector pointing down in the right image).

Second example: ordered pairs of numbers

A second key example of a vector space is provided by pairs of real numbers x and y. (The order of the components x and y is significant, so such a pair is also called an ordered pair.) Such a pair is written as (x, y). The sum of two such pairs and multiplication of a pair with a number are defined as follows:

$(x_1, y_1) + (x_2, y_2) = (x_1 + x_2, y_1 + y_2)$

and

$a(x, y) = (ax, ay).$

The first example above reduces to this one if the arrows are represented by the pair of Cartesian coordinates of their end points.

Definition

In this article, vectors are represented in boldface to distinguish them from scalars.^{[nb 1]}

A vector space over a field F is a set V together with two operations that satisfy the eight axioms listed below.

The first operation, called vector addition or simply addition $+ : V \times V \to V$, takes any two vectors $v$ and $w$ and assigns to them a third vector which is commonly written as $v + w$, and called the sum of these two vectors. (Note that the resultant vector is also an element of the set $V$.)
The second operation, called scalar multiplication $\cdot : F \times V \to V$, takes any scalar $a$ and any vector $v$ and gives another vector $a v$. (Similarly, the vector $a v$ is an element of the set $V$.)

Elements of V are commonly called vectors. Elements of F are commonly called scalars.

In the two examples above, the field is the field of the real numbers and the set of the vectors consists of the planar arrows with fixed starting point and of pairs of real numbers, respectively.

To qualify as a vector space, the set V and the operations of addition and multiplication must adhere to a number of requirements called axioms.^[1] In the list below, let u, v and w be arbitrary vectors in V, and a and b scalars in F.

Axiom	Meaning
Associativity of addition	$u + (v + w) = (u + v) + w$
Commutativity of addition	$u + v = v + u$
Identity element of addition	There exists an element $0 \in V$ , called the zero vector, such that $v + 0 = v$ for all $v \in V$ .
Inverse elements of addition	For every $v \in V$ , there exists an element $- v \in V$ , called the additive inverse of $v$ , such that $v + (- v) = 0$ .
Compatibility of scalar multiplication with field multiplication	$a (b v) = (ab) v$ ^{[nb 2]}
Identity element of scalar multiplication	$1 v = v$ , where $1$ denotes the multiplicative identity in $F$ .
Distributivity of scalar multiplication with respect to vector addition	$a (u + v) = a u + a v$
Distributivity of scalar multiplication with respect to field addition	$(a + b) v = a v + b v$

These axioms generalize properties of the vectors introduced in the above examples. Indeed, the result of addition of two ordered pairs (as in the second example above) does not depend on the order of the summands:

$(x_v, y_v) + (x_w, y_w) = (x_w, y_w) + (x_v, y_v).$

Likewise, in the geometric example of vectors as arrows,

v + w = w + v

since the parallelogram defining the sum of the vectors is independent of the order of the vectors. All other axioms can be checked in a similar manner in both examples. Thus, by disregarding the concrete nature of the particular type of vectors, the definition incorporates these two and many more examples in one notion of vector space.
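The remaining axioms can be checked for the ordered-pair example in the same spirit. The sketch below tests all eight axioms on a small sample of pairs with rational components, using exact `Fraction` arithmetic so that equality tests are meaningful; the sample grid, the scalar set and the function names are illustrative assumptions, and passing on a finite sample is a sanity check rather than a proof.

```python
from fractions import Fraction
from itertools import product

def add(v, w):
    """Componentwise sum of two ordered pairs."""
    return (v[0] + w[0], v[1] + w[1])

def scale(a, v):
    """Multiplication of the pair v by the scalar a."""
    return (a * v[0], a * v[1])

ZERO = (Fraction(0), Fraction(0))
scalars = [Fraction(-1), Fraction(0), Fraction(1, 2), Fraction(2)]
vectors = list(product(scalars, repeat=2))      # small sample grid, not all pairs

def axioms_hold():
    for u, v, w in product(vectors, repeat=3):
        if add(u, add(v, w)) != add(add(u, v), w):              # associativity
            return False
    for u, v in product(vectors, repeat=2):
        if add(u, v) != add(v, u):                              # commutativity
            return False
    for v in vectors:
        if add(v, ZERO) != v:                                   # additive identity
            return False
        if add(v, scale(-1, v)) != ZERO:                        # additive inverse
            return False
        if scale(1, v) != v:                                    # scalar identity
            return False
    for a, b, v in product(scalars, scalars, vectors):
        if scale(a, scale(b, v)) != scale(a * b, v):            # compatibility
            return False
        if scale(a + b, v) != add(scale(a, v), scale(b, v)):    # distributes over field addition
            return False
    for a, u, v in product(scalars, vectors, vectors):
        if scale(a, add(u, v)) != add(scale(a, u), scale(a, v)):  # distributes over vector addition
            return False
    return True
```

Exact rationals are used instead of floating-point numbers because rounding can make floating-point addition fail associativity, which would obscure the point of the check.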

Subtraction of two vectors and division by a (non-zero) scalar can be defined as

v - w = v + (- w)

v / a = (1/ a) v

When the scalar field F is the real numbers R, the vector space is called a real vector space. When the scalar field is the complex numbers C, the vector space is called a complex vector space. These two cases are the ones used most often in engineering. The general definition of a vector space allows scalars to be elements of any fixed field F. The notion is then known as an F-vector space or a vector space over $F$. A field is, essentially, a set of numbers possessing addition, subtraction, multiplication and division operations.^{[nb 3]} For example, rational numbers form a field.

In contrast to the intuition stemming from vectors in the plane and higher-dimensional cases, there is, in general vector spaces, no notion of nearness, angles or distances. To deal with such matters, particular types of vector spaces are introduced; see below.

Alternative formulations and elementary consequences

Vector addition and scalar multiplication are operations, satisfying the closure property: u + v and av are in V for all a in F, and u, v in V. Some older sources mention these properties as separate axioms.^[2]

In the parlance of abstract algebra, the first four axioms are equivalent to requiring the set of vectors to be an abelian group under addition. The remaining axioms give this group an F-module structure. In other words, there is a ring homomorphism f from the field F into the endomorphism ring of the group of vectors. Then scalar multiplication av is defined as (f(a))(v).^[3]

There are a number of direct consequences of the vector space axioms. Some of them derive from elementary group theory, applied to the additive group of vectors: for example, the zero vector 0 of V and the additive inverse −v of any vector v are unique. Other properties follow from the distributive law, for example av equals 0 if and only if a equals 0 or v equals 0.

History

Vector spaces stem from affine geometry via the introduction of coordinates in the plane or three-dimensional space. Around 1636, Descartes and Fermat founded analytic geometry by equating solutions to an equation of two variables with points on a plane curve.^[4] In 1804, to achieve geometric solutions without using coordinates, Bolzano introduced certain operations on points, lines and planes, which are predecessors of vectors.^[5] His work was then used in the conception of barycentric coordinates by Möbius in 1827.^[6] In 1828 C. V. Mourey suggested the existence of an algebra surpassing not only ordinary algebra but also the two-dimensional algebra he had created while searching for a geometrical interpretation of complex numbers.^[7]

The definition of vectors was founded on Bellavitis' notion of the bipoint, an oriented segment of which one end is the origin and the other a target, then further elaborated with the presentation of complex numbers by Argand and Hamilton and the introduction of quaternions and biquaternions by the latter.^[8] They are elements in R², R⁴, and R⁸; their treatment as linear combinations can be traced back to Laguerre in 1867, who also defined systems of linear equations.

In 1857, Cayley introduced matrix notation, which allows for a harmonization and simplification of linear maps. Around the same time, Grassmann studied the barycentric calculus initiated by Möbius. He envisaged sets of abstract objects endowed with operations.^[9] In his work, the concepts of linear independence and dimension, as well as scalar products, are present. In fact, Grassmann's 1844 work extended a vector space of n dimensions to one of 2ⁿ dimensions by consideration of 2-vectors $u\wedge v$ and 3-vectors $u\wedge v\wedge w$, called multivectors. This extension, called multilinear algebra, is governed by the rules of exterior algebra. Peano was the first to give the modern definition of vector spaces and linear maps in 1888.^[10]

An important development of vector spaces is due to the construction of function spaces by Lebesgue. This was later formalized by Banach and Hilbert, around 1920.^[11] At that time, algebra and the new field of functional analysis began to interact, notably with key concepts such as spaces of p-integrable functions and Hilbert spaces.^[12] Vector spaces, including infinite-dimensional ones, then became a firmly established notion, and many mathematical branches started making use of this concept.

Examples

Coordinate spaces

The simplest example of a vector space over a field $F$ is the field itself, equipped with its standard addition and multiplication. More generally, a vector space can be composed of n-tuples (sequences of length $n$) of elements of $F$, such as $(a_1, a_2, \ldots, a_n)$, where each $a_i$ is an element of $F$.^[13]

A vector space composed of all the $n$-tuples of a field $F$ is known as a coordinate space, usually denoted $F^n$. The case $n = 1$ is the above-mentioned simplest example, in which the field $F$ is also regarded as a vector space over itself. The case $F = \mathbf{R}$ and $n = 2$ was discussed in the introduction above.

Complex numbers and other field extensions

The set of complex numbers $\mathbf{C}$, i.e., numbers that can be written in the form $x + iy$ for real numbers $x$ and $y$, where $i$ is the imaginary unit, forms a vector space over the reals with the usual addition and multiplication: $(x + iy) + (a + ib) = (x + a) + i(y + b)$ and $c \cdot (x + iy) = (c \cdot x) + i(c \cdot y)$ for real numbers $x$, $y$, $a$, $b$ and $c$. The various axioms of a vector space follow from the fact that the same rules hold for complex number arithmetic.

In fact, the example of complex numbers is essentially the same as (i.e., it is isomorphic to) the vector space of ordered pairs of real numbers mentioned above: if we think of the complex number $x + iy$ as representing the ordered pair $(x, y)$ in the complex plane, then we see that the rules for sum and scalar product correspond exactly to those in the earlier example.
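The correspondence between complex numbers and ordered pairs of reals can be checked numerically. The following is a minimal Python sketch and not part of the original text; the helper names `to_pair`, `add_pairs` and `scale_pair` are hypothetical and chosen only for illustration.

```python
def to_pair(z):
    """Send the complex number x + iy to the ordered pair (x, y)."""
    return (z.real, z.imag)

def add_pairs(p, q):
    """Componentwise addition of ordered pairs of reals."""
    return (p[0] + q[0], p[1] + q[1])

def scale_pair(c, p):
    """Multiplication of an ordered pair by a real scalar c."""
    return (c * p[0], c * p[1])

z, w, c = complex(1, 2), complex(3, -5), 4.0

# Adding first and then converting agrees with converting first and then adding.
sum_matches = to_pair(z + w) == add_pairs(to_pair(z), to_pair(w))
# The same holds for multiplication by a real scalar.
scale_matches = to_pair(c * z) == scale_pair(c, to_pair(z))

print(sum_matches, scale_matches)
```

Both checks compare the two routes around the isomorphism for one choice of inputs; they illustrate the claim but do not prove it.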

More generally, field extensions provide another class of examples of vector spaces, particularly in algebra and algebraic number theory: a field $F$ containing a smaller field $E$ is an $E$-vector space, by the given multiplication and addition operations of $F$.^[14] For example, the complex numbers are a vector space over $\mathbf{R}$, and the field extension $\mathbf{Q}(i\sqrt{5})$ is a vector space over $\mathbf{Q}$.

Function spaces

Functions from any fixed set $\Omega$ to a field $F$ also form vector spaces, by performing addition and scalar multiplication pointwise. That is, the sum of two functions $f$ and $g$ is the function $(f + g)$ given by $(f + g)(w) = f(w) + g(w)$, and similarly for multiplication. Such function spaces occur in many geometric situations, when $\Omega$ is the real line or an interval, or other subsets of $\mathbf{R}$. Many notions in topology and analysis, such as continuity, integrability or differentiability, are well-behaved with respect to linearity: sums and scalar multiples of functions possessing such a property still have that property.^[15] Therefore, the sets of such functions are vector spaces. They are studied in greater detail using the methods of functional analysis, see below. Algebraic constraints also yield vector spaces: the vector space $F[x]$ is given by polynomial functions $f(x) = r_0 + r_1 x + \cdots + r_{n-1} x^{n-1} + r_n x^n$, where the coefficients $r_0, \ldots, r_n$ are in $F$.^[16]

Linear equations

Systems of homogeneous linear equations are closely tied to vector spaces.^[17] For example, the solutions of

$a$	$+$	$3 b$	$+$	$c$	$= 0$
$4 a$	$+$	$2 b$	$+$	$2 c$	$= 0$

are given by triples with arbitrary $a$, $b = a/2$, and $c = -5a/2$. They form a vector space: sums and scalar multiples of such triples still satisfy the same ratios of the three variables; thus they are solutions, too. Matrices can be used to condense multiple linear equations as above into one vector equation, namely $A x = 0$, where $A = \begin{bmatrix}1&3&1\\4&2&2\end{bmatrix}$ is the matrix containing the coefficients of the given equations, $x$ is the vector $(a, b, c)$, $A x$ denotes the matrix product, and $0 = (0, 0)$ is the zero vector. In a similar vein, the solutions of homogeneous linear differential equations form vector spaces. For example, $f''(x) + 2 f'(x) + f(x) = 0$ yields $f(x) = a e^{-x} + b x e^{-x}$, where $a$ and $b$ are arbitrary constants, and $e^x$ is the natural exponential function.
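The closure of the solution set under linear combinations can be illustrated with exact rational arithmetic. This is a small sketch under the assumption that solutions are parametrized as $(a, a/2, -5a/2)$ as stated above; the function names `mat_vec` and `solution` are illustrative and not taken from the original text.

```python
from fractions import Fraction

# Coefficient matrix of the two homogeneous equations.
A = [[1, 3, 1],
     [4, 2, 2]]

def mat_vec(matrix, x):
    """Matrix-vector product A x, computed row by row."""
    return [sum(row[j] * x[j] for j in range(len(x))) for row in matrix]

def solution(a):
    """The solution triple (a, a/2, -5a/2) for a given free parameter a."""
    a = Fraction(a)
    return [a, a / 2, -5 * a / 2]

s1, s2 = solution(2), solution(-3)
# An arbitrary linear combination of two solutions.
combo = [3 * u + 7 * v for u, v in zip(s1, s2)]

# Each residual A x should be the zero vector (0, 0).
residuals = [mat_vec(A, v) for v in (s1, s2, combo)]
print(residuals)
```

Using `Fraction` avoids floating-point rounding, so each residual is exactly zero rather than approximately zero.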

Basis and dimension

A vector $v$ in $\mathbf{R}^2$ (blue) expressed in terms of different bases: using the standard basis of $\mathbf{R}^2$, $v = x e_1 + y e_2$ (black), and using a different, non-orthogonal basis: $v = f_1 + f_2$ (red).

Bases allow one to represent vectors by a sequence of scalars called coordinates or components. A basis is a (finite or infinite) set $B = \{\mathbf{b}_i\}_{i \in I}$ of vectors $\mathbf{b}_i$, for convenience often indexed by some index set $I$, that spans the whole space and is linearly independent. "Spanning the whole space" means that any vector $\mathbf{v}$ can be expressed as a finite sum (called a linear combination) of the basis elements:

$\mathbf{v} = a_1 \mathbf{b}_{i_1} + a_2 \mathbf{b}_{i_2} + \cdots + a_n \mathbf{b}_{i_n},$ (1)

where the $a_k$ are scalars, called the coordinates (or the components) of the vector $\mathbf{v}$ with respect to the basis $B$, and $\mathbf{b}_{i_k}$ $(k = 1, \ldots, n)$ elements of $B$. Linear independence means that the coordinates $a_k$ are uniquely determined for any vector in the vector space.

For example, the coordinate vectors $e_1 = (1, 0, \ldots, 0)$, $e_2 = (0, 1, 0, \ldots, 0)$, to $e_n = (0, 0, \ldots, 0, 1)$, form a basis of $F^n$, called the standard basis, since any vector $(x_1, x_2, \ldots, x_n)$ can be uniquely expressed as a linear combination of these vectors:

$(x_1, x_2, \ldots, x_n) = x_1 (1, 0, \ldots, 0) + x_2 (0, 1, 0, \ldots, 0) + \cdots + x_n (0, \ldots, 0, 1) = x_1 e_1 + x_2 e_2 + \cdots + x_n e_n.$

The corresponding coordinates $x_1$, $x_2$, ..., $x_n$ are just the Cartesian coordinates of the vector.
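Coordinates depend on the chosen basis. As an illustration, the sketch below computes the coordinates of a plane vector with respect to a non-standard basis by Cramer's rule. It assumes integer entries and the two-dimensional case only; the names `det2` and `coordinates` are hypothetical helpers, not notation from the original text.

```python
from fractions import Fraction

def det2(p, q):
    """Determinant of the 2-by-2 matrix whose columns are p and q."""
    return p[0] * q[1] - p[1] * q[0]

def coordinates(b1, b2, v):
    """Return (a1, a2) with v = a1*b1 + a2*b2, using Cramer's rule.

    Assumes integer entries; raises if b1, b2 are linearly dependent.
    """
    d = det2(b1, b2)
    if d == 0:
        raise ValueError("b1 and b2 are linearly dependent, so they are not a basis")
    return (Fraction(det2(v, b2), d), Fraction(det2(b1, v), d))

b1, b2 = (2, 1), (1, 3)   # a non-orthogonal basis of the plane
v = (5, 5)

a1, a2 = coordinates(b1, b2, v)
# Rebuild v from its coordinates to confirm the linear combination.
reconstructed = (a1 * b1[0] + a2 * b2[0], a1 * b1[1] + a2 * b2[1])
print(a1, a2, reconstructed)
```

With the standard basis `(1, 0)`, `(0, 1)` the same function simply returns the Cartesian coordinates of `v`.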

Every vector space has a basis. This follows from Zorn's lemma, an equivalent formulation of the Axiom of Choice.^[18] Given the other axioms of Zermelo–Fraenkel set theory, the existence of bases is equivalent to the axiom of choice.^[19] The ultrafilter lemma, which is weaker than the axiom of choice, implies that all bases of a given vector space have the same number of elements, or cardinality (cf. Dimension theorem for vector spaces).^[20] It is called the dimension of the vector space, denoted by dim V. If the space is spanned by finitely many vectors, the above statements can be proven without such fundamental input from set theory.^[21]

The dimension of the coordinate space $F^n$ is $n$, by the basis exhibited above. The dimension of the polynomial ring F[x] introduced above is countably infinite; a basis is given by $1$, $x$, $x^2$, ... A fortiori, the dimension of more general function spaces, such as the space of functions on some (bounded or unbounded) interval, is infinite.^{[nb 4]} Under suitable regularity assumptions on the coefficients involved, the dimension of the solution space of a homogeneous ordinary differential equation equals the degree of the equation.^[22] For example, the solution space for the above equation is generated by $e^{-x}$ and $x e^{-x}$. These two functions are linearly independent over $\mathbf{R}$, so the dimension of this space is two, as is the degree of the equation.

A field extension over the rationals $\mathbf{Q}$ can be thought of as a vector space over $\mathbf{Q}$ (by defining vector addition as field addition, defining scalar multiplication as field multiplication by elements of $\mathbf{Q}$, and otherwise ignoring the field multiplication). The dimension (or degree) of the field extension $\mathbf{Q}(\alpha)$ over $\mathbf{Q}$ depends on $\alpha$. If $\alpha$ satisfies some polynomial equation

$q_{n}\alpha ^{n}+q_{n-1}\alpha ^{n-1}+\ldots +q_{0}=0$

with rational coefficients $q_n, \ldots, q_0$ (in other words, if α is algebraic), the dimension is finite. More precisely, it equals the degree of the minimal polynomial having α as a root.^[23] For example, the complex numbers C are a two-dimensional real vector space, generated by 1 and the imaginary unit i.

The latter satisfies i² + 1 = 0, an equation of degree two. Thus, C is a two-dimensional R-vector space (and, as any field, one-dimensional as a vector space over itself, C). If α is not algebraic, the dimension of Q(α) over Q is infinite. For instance, for α = π there is no such equation, in other words π is transcendental.^[24]

Linear maps and matrices

The relation of two vector spaces can be expressed by a linear map or linear transformation. Linear maps are functions that reflect the vector space structure, i.e., they preserve sums and scalar multiplication:

f(x + y) = f(x) + f(y) and f(a · x) = a · f(x) for all x and y in V, all a in F.^[25]

An isomorphism is a linear map $f : V \to W$ such that there exists an inverse map $g : W \to V$, which is a map such that the two possible compositions $f \circ g : W \to W$ and $g \circ f : V \to V$ are identity maps. Equivalently, f is both one-to-one (injective) and onto (surjective).^[26] If there exists an isomorphism between V and W, the two spaces are said to be isomorphic; they are then essentially identical as vector spaces, since all identities holding in V are, via f, transported to similar ones in W, and vice versa via g.

Describing an arrow vector v by its coordinates x and y yields an isomorphism of vector spaces.

For example, the "arrows in the plane" and "ordered pairs of numbers" vector spaces in the introduction are isomorphic: a planar arrow v departing at the origin of some (fixed) coordinate system can be expressed as an ordered pair by considering the x- and y-component of the arrow, as shown in the image at the right. Conversely, given a pair (x, y), the arrow going by x to the right (or to the left, if x is negative), and y up (down, if y is negative) recovers the arrow v.

Linear maps V → W between two vector spaces form a vector space Hom_F(V, W), also denoted L(V, W).^[27] The space of linear maps from V to F is called the dual vector space, denoted V^∗.^[28] Via the injective natural map $V \to V^{**}$, any vector space can be embedded into its bidual; the map is an isomorphism if and only if the space is finite-dimensional.^[29]

Once a basis of $V$ is chosen, linear maps $f : V \to W$ are completely determined by specifying the images of the basis vectors, because any element of V is expressed uniquely as a linear combination of them.^[30] If $\dim V = \dim W$, a 1-to-1 correspondence between fixed bases of $V$ and $W$ gives rise to a linear map that maps any basis element of $V$ to the corresponding basis element of $W$. It is an isomorphism, by its very definition.^[31] Therefore, two vector spaces are isomorphic if their dimensions agree and vice versa. Another way to express this is that any vector space is completely classified (up to isomorphism) by its dimension, a single number. In particular, any n-dimensional $F$-vector space $V$ is isomorphic to $F^n$. There is, however, no "canonical" or preferred isomorphism; actually an isomorphism $\varphi : F^n \to V$ is equivalent to the choice of a basis of $V$, by mapping the standard basis of $F^n$ to $V$, via $\varphi$. The freedom of choosing a convenient basis is particularly useful in the infinite-dimensional context, see below.

Matrices

A typical matrix

Matrices are a useful notion to encode linear maps.^[32] They are written as a rectangular array of scalars as in the image at the right. Any m-by-n matrix A gives rise to a linear map from Fⁿ to F^m, by the following

$\mathbf{x} = (x_1, x_2, \cdots, x_n) \mapsto \left(\sum_{j=1}^{n} a_{1j} x_j, \sum_{j=1}^{n} a_{2j} x_j, \cdots, \sum_{j=1}^{n} a_{mj} x_j\right)$, where $\sum$ denotes summation,

or, using the matrix multiplication of the matrix $A$ with the coordinate vector $\mathbf{x}$:

$\mathbf{x} \mapsto A \mathbf{x}.$

Moreover, after choosing bases of $V$ and $W$, any linear map $f : V \to W$ is uniquely represented by a matrix via this assignment.^[33]
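The assignment of a linear map to a matrix can be written out directly from the summation formula. The sketch below is illustrative only: the function name `apply_matrix` and the sample matrix and vectors are assumptions, and the two boolean checks test additivity and homogeneity for one choice of inputs rather than proving linearity.

```python
from fractions import Fraction

def apply_matrix(a, x):
    """Map x in F^n to the vector in F^m whose i-th entry is sum_j a[i][j] * x[j]."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]

A = [[1, 3, 1],
     [4, 2, 2]]          # a 2-by-3 matrix, so a map from F^3 to F^2
x = [Fraction(1), Fraction(2), Fraction(3)]
y = [Fraction(-1), Fraction(0), Fraction(5)]
c = Fraction(7, 2)

# f(x + y) == f(x) + f(y)
additive = apply_matrix(A, [u + v for u, v in zip(x, y)]) == [
    p + q for p, q in zip(apply_matrix(A, x), apply_matrix(A, y))
]
# f(c * x) == c * f(x)
homogeneous = apply_matrix(A, [c * u for u in x]) == [
    c * p for p in apply_matrix(A, x)
]
print(apply_matrix(A, x), additive, homogeneous)
```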

The volume of this parallelepiped is the absolute value of the determinant of the 3-by-3 matrix formed by the vectors $r_1$, $r_2$, and $r_3$.

The determinant $\det(A)$ of a square matrix $A$ is a scalar that tells whether the associated map is an isomorphism or not: to be so it is sufficient and necessary that the determinant is nonzero.^[34] The linear transformation of $\mathbf{R}^n$ corresponding to a real n-by-n matrix is orientation preserving if and only if its determinant is positive.
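For 3-by-3 matrices the determinant can be computed by cofactor expansion along the first row. The following sketch uses that expansion; the function name `det3` and the sample vectors are illustrative assumptions, not taken from the original text.

```python
def det3(m):
    """Determinant of a 3-by-3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# Three edge vectors of a parallelepiped, used as the rows of a matrix.
r1, r2, r3 = (2, 0, 0), (1, 3, 0), (1, 1, 4)
volume = abs(det3([r1, r2, r3]))

# The second row is twice the first, so the rows are linearly dependent.
dependent = [(1, 2, 3), (2, 4, 6), (0, 1, 1)]

print(volume)                  # volume of the parallelepiped
print(det3(dependent) != 0)    # False: the associated map is not an isomorphism
```

A nonzero determinant signals that the three vectors form a basis of $\mathbf{R}^3$; a zero determinant means the parallelepiped is flattened into a plane or a line.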

Eigenvalues and eigenvectors

Endomorphisms, linear maps $f : V \to V$, are particularly important since in this case vectors $v$ can be compared with their image under $f$, $f(v)$. Any nonzero vector $v$ satisfying $\lambda v = f(v)$, where $\lambda$ is a scalar, is called an eigenvector of $f$ with eigenvalue $\lambda$.^{[nb 5]}^[35] Equivalently, $v$ is an element of the kernel of the difference $f - \lambda \cdot \mathrm{Id}$ (where Id is the identity map $V \to V$). If $V$ is finite-dimensional, this can be rephrased using determinants: $f$ having eigenvalue $\lambda$ is equivalent to

$\det(f - \lambda \cdot \mathrm{Id}) = 0.$

By spelling out the definition of the determinant, the expression on the left hand side can be seen to be a polynomial function in $\lambda$, called the characteristic polynomial of $f$.^[36] If the field $F$ is large enough to contain a zero of this polynomial (which automatically happens for $F$ algebraically closed, such as $F = \mathbf{C}$) any linear map has at least one eigenvector. The vector space $V$ may or may not possess an eigenbasis, a basis consisting of eigenvectors. This phenomenon is governed by the Jordan canonical form of the map.^[37]^{[nb 6]} The set of all eigenvectors corresponding to a particular eigenvalue of $f$, together with the zero vector, forms a vector space known as the eigenspace corresponding to the eigenvalue (and $f$) in question. To achieve the spectral theorem, the corresponding statement in the infinite-dimensional case, the machinery of functional analysis is needed, see below.
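For a 2-by-2 matrix the characteristic polynomial is $\lambda^2 - \operatorname{tr}(A)\,\lambda + \det(A)$, so its zeros follow from the quadratic formula. The sketch below is limited to that case and works over $\mathbf{C}$ so that a zero always exists; the names `eigenvalues_2x2` and `apply` are hypothetical helpers.

```python
import cmath

def eigenvalues_2x2(m):
    """Zeros of the characteristic polynomial t^2 - trace*t + det of a 2-by-2 matrix."""
    (a, b), (c, d) = m
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return ((trace + root) / 2, (trace - root) / 2)

def apply(m, v):
    """Matrix-vector product."""
    return tuple(sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in m)

symmetric = [[2, 1], [1, 2]]    # real eigenvalues
rotation = [[0, -1], [1, 0]]    # quarter turn: no real eigenvalues
shear = [[1, 1], [0, 1]]        # repeated eigenvalue

print(eigenvalues_2x2(symmetric))
print(eigenvalues_2x2(rotation))
print(eigenvalues_2x2(shear))
print(apply(symmetric, (1, 1)))  # image of a candidate eigenvector
```

The rotation shows why the field matters: its eigenvalues are complex, so it has no eigenvector in $\mathbf{R}^2$. The shear has a single repeated eigenvalue and is a standard example of a map without an eigenbasis.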

Basic constructions

In addition to the above concrete examples, there are a number of standard linear algebraic constructions that yield vector spaces related to given ones. In addition to the definitions given below, they are also characterized by universal properties, which determine an object $X$ by specifying the linear maps from $X$ to any other vector space.

Subspaces and quotient spaces

A line passing through the origin (blue, thick) in $\mathbf{R}^3$ is a linear subspace. It is the intersection of two planes (green and yellow).

A nonempty subset W of a vector space V that is closed under addition and scalar multiplication (and therefore contains the 0-vector of V) is called a linear subspace of V, or simply a subspace of V, when the ambient space is unambiguously a vector space.^[38]^{[nb 7]} Subspaces of V are vector spaces (over the same field) in their own right. The intersection of all subspaces containing a given set S of vectors is called its span, and it is the smallest subspace of V containing the set S. Expressed in terms of elements, the span is the subspace consisting of all the linear combinations of elements of S.^[39]

A linear subspace of dimension 1 is a vector line. A linear subspace of dimension 2 is a vector plane. A linear subspace that contains all elements but one of a basis of the ambient space is a vector hyperplane. In a vector space of finite dimension $n$, a vector hyperplane is thus a subspace of dimension $n - 1$.

The counterpart to subspaces are quotient vector spaces.^[40] Given any subspace $W \subset V$, the quotient space V/W ("V modulo W") is defined as follows: as a set, it consists of

$v + W = \{v + w : w \in W\},$

where v is an arbitrary vector in V. The sum of two such elements $v_1 + W$ and $v_2 + W$ is $(v_1 + v_2) + W$, and scalar multiplication is given by $a \cdot (v + W) = (a \cdot v) + W$. The key point in this definition is that $v_1 + W = v_2 + W$ if and only if the difference of v₁ and v₂ lies in W.^{[nb 8]} This way, the quotient space "forgets" information that is contained in the subspace W.

The kernel ker(f) of a linear map $f : V \to W$ consists of vectors v that are mapped to 0 in W.^[41] Both kernel and image $\operatorname{im}(f) = \{f(v) : v \in V\}$ are subspaces of V and W, respectively.^[42] The existence of kernels and images is part of the statement that the category of vector spaces (over a fixed field F) is an abelian category, i.e. a corpus of mathematical objects and structure-preserving maps between them (a category) that behaves much like the category of abelian groups.^[43] Because of this, many statements such as the first isomorphism theorem (also called rank–nullity theorem in matrix-related terms)

$V / \ker(f) \cong \operatorname{im}(f)$

and the second and third isomorphism theorem can be formulated and proven in a way very similar to the corresponding statements for groups.

An important example is the kernel of a linear map $x \mapsto A x$ for some fixed matrix A, as above. The kernel of this map is the subspace of vectors x such that $A x = 0$, which is precisely the set of solutions to the system of homogeneous linear equations belonging to A. This concept also extends to linear differential equations

$a_{0}f+a_{1}{\frac {df}{dx}}+a_{2}{\frac {d^{2}f}{dx^{2}}}+\cdots +a_{n}{\frac {d^{n}f}{dx^{n}}}=0,$

where the coefficients a_i are functions in x, too. In the corresponding map

$f\mapsto D(f)=\sum _{i=0}^{n}a_{i}{\frac {d^{i}f}{dx^{i}}},$

the derivatives of the function f appear linearly (as opposed to f′′(x)², for example). Since differentiation is a linear procedure (i.e., $(f + g)' = f' + g'$ and $(c \cdot f)' = c \cdot f'$ for a constant $c$) this assignment is linear, called a linear differential operator. In particular, the solutions to the differential equation $D(f) = 0$ form a vector space (over $\mathbf{R}$ or $\mathbf{C}$).

Direct product and direct sum

The direct product of vector spaces and the direct sum of vector spaces are two ways of combining an indexed family of vector spaces into a new vector space.
The direct product $\textstyle \prod_{i\in I} V_i$ of a family of vector spaces V_i consists of the set of all tuples $(v_i)_{i \in I}$, which specify for each index i in some index set I an element v_i of V_i.^[44] Addition and scalar multiplication are performed componentwise. A variant of this construction is the direct sum $\bigoplus_{i\in I} V_i$ (also called coproduct and denoted $\textstyle \coprod_{i\in I} V_i$), where only tuples with finitely many nonzero vectors are allowed. If the index set I is finite, the two constructions agree, but in general they are different.

Tensor product

The tensor product $V \otimes_F W$, or simply $V \otimes W$, of two vector spaces V and W is one of the central notions of multilinear algebra which deals with extending notions such as linear maps to several variables. A map $g : V \times W \to X$ is called bilinear if g is linear in both variables v and w. That is to say, for fixed w the map $v \mapsto g(v, w)$ is linear in the sense above and likewise for fixed v.
The tensor product is a particular vector space that is a universal recipient of bilinear maps g, as follows. It is defined as the vector space consisting of finite (formal) sums of symbols called tensors

v₁ ⊗ w₁ + v₂ ⊗ w₂ + ... + v_n ⊗ w_n,

subject to the rules

a · (v ⊗ w) = (a · v) ⊗ w = v ⊗ (a · w), where a is a scalar,

(v₁ + v₂) ⊗ w = v₁ ⊗ w + v₂ ⊗ w, and

v ⊗ (w₁ + w₂) = v ⊗ w₁ + v ⊗ w₂.^[45]

Commutative diagram depicting the universal property of the tensor product.

These rules ensure that the map f from $V \times W$ to $V \otimes W$ that maps a tuple $(v, w)$ to $v \otimes w$ is bilinear. The universality states that given any vector space X and any bilinear map $g : V \times W \to X$, there exists a unique map u, shown in the diagram with a dotted arrow, whose composition with f equals g: $u(v \otimes w) = g(v, w)$.^[46] This is called the universal property of the tensor product, an instance of the method, much used in advanced abstract algebra, to indirectly define objects by specifying maps from or to this object.

Vector spaces with additional structure

From the point of view of linear algebra, vector spaces are completely understood insofar as any vector space is characterized, up to isomorphism, by its dimension. However, vector spaces per se do not offer a framework to deal with the question—crucial to analysis—whether a sequence of functions converges to another function. Likewise, linear algebra is not adapted to deal with infinite series, since the addition operation allows only finitely many terms to be added. Therefore, the needs of functional analysis require considering additional structures.

A vector space may be given a partial order ≤, under which some vectors can be compared.^[47] For example, n-dimensional real space Rⁿ can be ordered by comparing its vectors componentwise. Ordered vector spaces, for example Riesz spaces, are fundamental to Lebesgue integration, which relies on the ability to express a function as a difference of two positive functions

f = f⁺ − f⁻,

where f⁺ denotes the positive part of f and f⁻ the negative part.^[48]

Normed vector spaces and inner product spaces

"Measuring" vectors is done by specifying a norm, a datum which measures lengths of vectors, or by an inner product, which measures angles between vectors. Norms and inner products are denoted

$|\mathbf{v}|$ and $\langle \mathbf{v}, \mathbf{w} \rangle$, respectively. The datum of an inner product entails that lengths of vectors can be defined too, by defining the associated norm $|\mathbf{v}| := \sqrt{\langle \mathbf{v}, \mathbf{v} \rangle}$. Vector spaces endowed with such data are known as normed vector spaces and inner product spaces, respectively.^[49]
Coordinate space Fⁿ can be equipped with the standard dot product:

$\langle \mathbf{x}, \mathbf{y} \rangle = \mathbf{x} \cdot \mathbf{y} = x_1 y_1 + \cdots + x_n y_n.$

In R², this reflects the common notion of the angle between two vectors x and y, by the law of cosines:

$\mathbf{x} \cdot \mathbf{y} = \cos\left(\angle(\mathbf{x}, \mathbf{y})\right) \cdot |\mathbf{x}| \cdot |\mathbf{y}|.$

Because of this, two vectors satisfying $\langle \mathbf{x}, \mathbf{y} \rangle = 0$ are called orthogonal. An important variant of the standard dot product is used in Minkowski space: R⁴ endowed with the Lorentz product

$\langle \mathbf{x} | \mathbf{y} \rangle = x_1 y_1 + x_2 y_2 + x_3 y_3 - x_4 y_4.$^[50]

In contrast to the standard dot product, it is not positive definite: $\langle \mathbf{x} | \mathbf{x} \rangle$ also takes negative values, for example for $\mathbf{x} = (0,0,0,1)$. Singling out the fourth coordinate, corresponding to time as opposed to three space-dimensions, makes it useful for the mathematical treatment of special relativity.
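The difference between the standard dot product and the Lorentz product shows up already on simple vectors. The sketch below implements both forms as stated above; the function names `dot` and `lorentz` and the labels `timelike` and `lightlike` are illustrative choices, not notation from the original text.

```python
def dot(x, y):
    """Standard dot product on F^n."""
    return sum(a * b for a, b in zip(x, y))

def lorentz(x, y):
    """Lorentz product on R^4 with the fourth coordinate entering with a minus sign."""
    return x[0] * y[0] + x[1] * y[1] + x[2] * y[2] - x[3] * y[3]

timelike = (0, 0, 0, 1)    # the example vector from the text
lightlike = (1, 0, 0, 1)   # equal space and time components

print(lorentz(timelike, timelike))     # negative: the form is not positive definite
print(lorentz(lightlike, lightlike))   # zero: a nonzero self-orthogonal vector
print(dot(lightlike, lightlike))       # positive under the standard dot product
```

The second vector is nonzero yet orthogonal to itself under the Lorentz product, which cannot happen for the standard dot product on $\mathbf{R}^n$.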

Topological vector spaces

Convergence questions are treated by considering vector spaces V carrying a compatible topology, a structure that allows one to talk about elements being close to each other.^[51]^[52] Compatible here means that addition and scalar multiplication have to be continuous maps. Roughly, if x and y in V, and a in F vary by a bounded amount, then so do $x + y$ and $a x$.^{[nb 9]} To make sense of specifying the amount a scalar changes, the field F also has to carry a topology in this context; common choices are the reals or the complex numbers.
In such topological vector spaces one can consider series of vectors. The infinite sum

$\sum_{i=0}^{\infty} f_i$

denotes the limit of the corresponding finite partial sums of the sequence $(f_i)_{i \in \mathbf{N}}$ of elements of V. For example, the f_i could be (real or complex) functions belonging to some function space V, in which case the series is a function series. The mode of convergence of the series depends on the topology imposed on the function space. In such cases, pointwise convergence and uniform convergence are two prominent examples.

Unit "spheres" in R² consist of plane vectors of norm 1. Depicted are the unit spheres in different p-norms, for p = 1, 2, and ∞. The bigger diamond depicts points of 1-norm equal to 2.

A way to ensure the existence of limits of certain infinite series is to restrict attention to spaces where any Cauchy sequence has a limit; such a vector space is called complete. Roughly, a vector space is complete provided that it contains all necessary limits. For example, the vector space of polynomials on the unit interval [0,1], equipped with the topology of uniform convergence is not complete because any continuous function on [0,1] can be uniformly approximated by a sequence of polynomials, by the Weierstrass approximation theorem.^[53] In contrast, the space of all continuous functions on [0,1] with the same topology is complete.^[54] A norm gives rise to a topology by defining that a sequence of vectors v_n converges to v if and only if

$\lim_{n\to\infty} |\mathbf{v}_n - \mathbf{v}| = 0.$

Banach and Hilbert spaces are complete topological vector spaces whose topologies are given, respectively, by a norm and an inner product. Their study, a key piece of functional analysis, focusses on infinite-dimensional vector spaces, since all norms on finite-dimensional topological vector spaces give rise to the same notion of convergence.^[55] The image at the right shows the equivalence of the 1-norm and ∞-norm on R²: as the unit "balls" enclose each other, a sequence converges to zero in one norm if and only if it does so in the other norm. In the infinite-dimensional case, however, there will generally be inequivalent topologies, which makes the study of topological vector spaces richer than that of vector spaces without additional data.

From a conceptual point of view, all notions related to topological vector spaces should match the topology. For example, instead of considering all linear maps (also called functionals) $V \to W$, maps between topological vector spaces are required to be continuous.^[56] In particular, the (topological) dual space $V^*$ consists of continuous functionals $V \to \mathbf{R}$ (or to $\mathbf{C}$). The fundamental Hahn–Banach theorem is concerned with separating subspaces of appropriate topological vector spaces by continuous functionals.^[57]

Banach spaces

Banach spaces, introduced by Stefan Banach, are complete normed vector spaces.^[58] A first example is the vector space $\ell^p$ consisting of infinite vectors with real entries $\mathbf{x} = (x_1, x_2, \ldots)$ whose p-norm $(1 \leq p \leq \infty)$, given by

$|\mathbf{x}|_p := \left(\sum_i |x_i|^p\right)^{1/p}$ for p < ∞ and $|\mathbf{x}|_\infty := \sup_i |x_i|$,

is finite. The topologies on the infinite-dimensional space $\ell^p$ are inequivalent for different p. E.g. the sequence of vectors $\mathbf{x}_n = (2^{-n}, 2^{-n}, \ldots, 2^{-n}, 0, 0, \ldots)$, i.e. the first 2ⁿ components are 2⁻ⁿ, the following ones are 0, converges to the zero vector for $p = \infty$, but does not for $p = 1$:

$|\mathbf{x}_n|_\infty = \sup(2^{-n}, 0) = 2^{-n} \rightarrow 0$, but $|\mathbf{x}_n|_1 = \sum_{i=1}^{2^n} 2^{-n} = 2^n \cdot 2^{-n} = 1.$
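The two norms of the vectors $\mathbf{x}_n$ can be tabulated for small n. In this sketch only the nonzero leading entries are stored, on the assumption that the trailing zeros contribute nothing to either norm; the names `x_n`, `norm_1` and `norm_inf` are illustrative.

```python
from fractions import Fraction

def x_n(n):
    """The 2**n nonzero leading entries of x_n, each equal to 2**(-n)."""
    return [Fraction(1, 2**n)] * (2**n)

def norm_1(x):
    """1-norm: sum of absolute values."""
    return sum(abs(t) for t in x)

def norm_inf(x):
    """Sup-norm: largest absolute value."""
    return max(abs(t) for t in x)

# One row per n: (n, sup-norm, 1-norm).
rows = [(n, norm_inf(x_n(n)), norm_1(x_n(n))) for n in range(1, 6)]
for n, sup_norm, one_norm in rows:
    print(n, sup_norm, one_norm)
```

The sup-norm column halves at each step while the 1-norm column stays constant, matching the computation above.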

More generally than sequences of real numbers, functions $f : \Omega \to \mathbf{R}$ are endowed with a norm that replaces the above sum by the Lebesgue integral

$|f|_p := \left(\int_\Omega |f(x)|^p \, dx\right)^{1/p}.$

The spaces of integrable functions on a given domain Ω (for example an interval) satisfying $|f|_p < \infty$, and equipped with this norm, are called Lebesgue spaces, denoted L^p(Ω).^{[nb 10]} These spaces are complete.^[59] (If one uses the Riemann integral instead, the space is not complete, which may be seen as a justification for Lebesgue's integration theory.^{[nb 11]}) Concretely this means that for any sequence of Lebesgue-integrable functions $f_1, f_2, \ldots$ with $|f_n|_p < \infty$, satisfying the condition

$\lim_{k,\ n\to\infty} \int_\Omega |f_k(x) - f_n(x)|^p \, dx = 0,$

there exists a function f(x) belonging to the vector space L^p(Ω) such that

$\lim_{k\to\infty} \int_\Omega |f(x) - f_k(x)|^p \, dx = 0.$

Imposing boundedness conditions not only on the function, but also on its derivatives leads to Sobolev spaces.^[60]

Hilbert spaces

The succeeding snapshots show summation of 1 to 5 terms in approximating a periodic function (blue) by finite sum of sine functions (red).

Complete inner product spaces are known as Hilbert spaces, in honor of David Hilbert.^[61] The Hilbert space L²(Ω), with inner product given by

$\langle f, g \rangle = \int_\Omega f(x) \overline{g(x)} \, dx,$

where $\overline{g(x)}$ denotes the complex conjugate of g(x),^[62]^{[nb 12]} is a key case.

By definition, in a Hilbert space any Cauchy sequence converges to a limit. Conversely, finding a sequence of functions f_n with desirable properties that approximates a given limit function is equally crucial. Early analysis, in the guise of the Taylor approximation, established an approximation of differentiable functions f by polynomials.^[63] By the Stone–Weierstrass theorem, every continuous function on $[a, b]$ can be approximated as closely as desired by a polynomial.^[64] A similar approximation technique by trigonometric functions is commonly called Fourier expansion, and is much applied in engineering, see below. More generally, and more conceptually, the theorem yields a simple description of what "basic functions", or, in abstract Hilbert spaces, what basic vectors suffice to generate a Hilbert space H, in the sense that the closure of their span (i.e., finite linear combinations and limits of those) is the whole space. Such a set of functions is called a basis of H, its cardinality is known as the Hilbert space dimension.^{[nb 13]} Not only does the theorem exhibit suitable basis functions as sufficient for approximation purposes, but together with the Gram–Schmidt process, it enables one to construct a basis of orthogonal vectors.^[65] Such orthogonal bases are the Hilbert space generalization of the coordinate axes in finite-dimensional Euclidean space.
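The Gram–Schmidt process mentioned above can be sketched in the finite-dimensional setting. The code below is an illustration under simplifying assumptions: it works in $\mathbf{R}^n$ with the standard dot product rather than in a function space, it produces orthogonal but not normalized vectors so that the arithmetic stays exact, and the names `dot` and `gram_schmidt` are hypothetical.

```python
from fractions import Fraction

def dot(x, y):
    """Standard dot product."""
    return sum(a * b for a, b in zip(x, y))

def gram_schmidt(vectors):
    """Return orthogonal vectors spanning the same subspace as the input.

    Each input vector has its components along the previously accepted
    vectors subtracted; vectors that reduce to zero are dropped.
    """
    basis = []
    for v in vectors:
        w = [Fraction(t) for t in v]
        for u in basis:
            coeff = dot(w, u) / dot(u, u)
            w = [wi - coeff * ui for wi, ui in zip(w, u)]
        if any(w):
            basis.append(w)
    return basis

basis = gram_schmidt([(1, 1, 0), (1, 0, 1), (0, 1, 1)])
for u in basis:
    print([str(t) for t in u])

# All pairwise products of distinct basis vectors.
pairwise = [dot(basis[i], basis[j]) for i in range(3) for j in range(i + 1, 3)]
print(pairwise)
```

In a Hilbert space of functions the same recipe applies with the dot product replaced by the integral inner product, although convergence questions then need the completeness discussed above.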

The solutions to various differential equations can be interpreted in terms of Hilbert spaces. For example, a great many fields in physics and engineering lead to such equations and frequently solutions with particular physical properties are used as basis functions, often orthogonal.^[66] As an example from physics, the time-dependent Schrödinger equation in quantum mechanics describes the change of physical properties in time by means of a partial differential equation, whose solutions are called wavefunctions.^[67] Definite values for physical properties such as energy, or momentum, correspond to eigenvalues of a certain (linear) differential operator and the associated wavefunctions are called eigenstates. The spectral theorem decomposes a linear compact operator acting on functions in terms of these eigenfunctions and their eigenvalues.^[68]

Algebras over fields

A hyperbola, given by the equation $x \cdot y = 1$. The coordinate ring of functions on this hyperbola is given by $\mathbf{R}[x, y] / (x \cdot y - 1)$, an infinite-dimensional vector space over $\mathbf{R}$.

General vector spaces do not possess a multiplication between vectors. A vector space equipped with an additional bilinear operator defining the multiplication of two vectors is an algebra over a field.^[69] Many algebras stem from functions on some geometrical object: since functions with values in a given field can be multiplied pointwise, these entities form algebras. The Stone–Weierstrass theorem mentioned above, for example, relies on Banach algebras which are both Banach spaces and algebras.

Commutative algebra makes great use of rings of polynomials in one or several variables, introduced above. Their multiplication is both commutative and associative. These rings and their quotients form the basis of algebraic geometry, because they are rings of functions of algebraic geometric objects.^[70]

Another crucial example is that of Lie algebras, which are neither commutative nor associative, but the failure to be so is limited by the constraints ($[x, y]$ denotes the product of $x$ and $y$):

$[x, y] = -[y, x]$ (anticommutativity), and
$[x, [y, z]] + [y, [z, x]] + [z, [x, y]] = 0$ (Jacobi identity).^[71]

Examples include the vector space of n-by-n matrices, with $[x, y] = xy - yx$, the commutator of two matrices, and $\mathbf{R}^3$, endowed with the cross product.
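The two Lie algebra constraints can be checked numerically on both examples. The sketch below uses plain Python lists; the helper names (`commutator`, `cross`, `jacobi_sum`) and the sample matrices and vectors are hypothetical choices made for this illustration.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_add(a, b):
    return [[p + q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def mat_neg(a):
    return [[-p for p in row] for row in a]

def commutator(x, y):
    """Lie bracket on matrices: [x, y] = xy - yx."""
    return mat_add(mat_mul(x, y), mat_neg(mat_mul(y, x)))

def cross(u, v):
    """Lie bracket on R^3: the cross product."""
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]

def vec_add(u, v):
    return [p + q for p, q in zip(u, v)]

def jacobi_sum(bracket, add, x, y, z):
    """Left-hand side of the Jacobi identity for the given bracket."""
    return add(add(bracket(x, bracket(y, z)),
                   bracket(y, bracket(z, x))),
               bracket(z, bracket(x, y)))

# Sample 2-by-2 matrices and vectors in R^3 (arbitrary integer values).
x, y, z = [[1, 2], [3, 4]], [[0, 1], [1, 0]], [[2, 0], [1, 5]]
u, v, w = [1, 2, 3], [4, 5, 6], [7, 8, 10]

matrix_jacobi = jacobi_sum(commutator, mat_add, x, y, z)  # zero matrix
vector_jacobi = jacobi_sum(cross, vec_add, u, v, w)       # zero vector
```

Integer entries keep the arithmetic exact, so both sums vanish identically rather than up to rounding. The same vectors also show that the cross product is not associative: `cross(u, cross(v, w))` and `cross(cross(u, v), w)` differ.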

The tensor algebra T(V) is a formal way of adding products to any vector space V to obtain an algebra.^[72] As a vector space, it is spanned by symbols, called simple tensors, $v_1 \otimes v_2 \otimes \cdots \otimes v_n$, where the degree $n$ varies.

The multiplication is given by concatenating such symbols, imposing the distributive law under addition, and requiring that scalar multiplication commute with the tensor product ⊗, much the same way as with the tensor product of two vector spaces introduced above. In general, there are no relations between $v_1 \otimes v_2$ and $v_2 \otimes v_1$. Forcing two such elements to be equal leads to the symmetric algebra, whereas forcing $v_1 \otimes v_2 = -v_2 \otimes v_1$ yields the exterior algebra.^[73]
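Multiplication by concatenation can be sketched in a few lines of Python. In this illustration an element of the tensor algebra is stored as a dictionary from "words" (tuples of basis indices) to coefficients; this representation, the function name `tensor_mul`, and the basis vectors `e1`, `e2` are assumptions made for the example.

```python
from collections import defaultdict

def tensor_mul(a, b):
    """Multiply two tensor-algebra elements given as {word: coefficient}.

    A word is a tuple of basis indices, so (1, 2) stands for e1 (x) e2.
    """
    out = defaultdict(int)
    for word_a, coeff_a in a.items():
        for word_b, coeff_b in b.items():
            # Concatenate the words and multiply the scalars;
            # summing over all pairs enforces the distributive law.
            out[word_a + word_b] += coeff_a * coeff_b
    return {word: c for word, c in out.items() if c != 0}

e1 = {(1,): 1}
e2 = {(2,): 1}
s = {(1,): 1, (2,): 1}       # the vector e1 + e2

left = tensor_mul(e1, e2)    # e1 (x) e2
right = tensor_mul(e2, e1)   # e2 (x) e1, a different element
square = tensor_mul(s, s)    # (e1 + e2) (x) (e1 + e2), four terms
```

Because no relation identifies the words `(1, 2)` and `(2, 1)`, `left` and `right` stay distinct. The symmetric and exterior algebras would arise by additionally identifying such words, up to sign in the exterior case, which this sketch does not implement.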

When a field $F$ is explicitly stated, a common term used is $F$-algebra.

Applications

Vector spaces have many applications as they occur frequently in common circumstances, namely wherever functions with values in some field are involved. They provide a framework to deal with analytical and geometrical problems, and underlie tools such as the Fourier transform. This list is not exhaustive: many more applications exist, for example in optimization. The minimax theorem of game theory, stating the existence of a unique payoff when all players play optimally, can be formulated and proven using vector space methods.^[74] Representation theory fruitfully transfers the good understanding of linear algebra and vector spaces to other mathematical domains such as group theory.^[75]

Distributions

A distribution (or generalized function) is a linear map assigning a number to each "test" function, typically a smooth function with compact support, in a continuous way: in the above terminology the space of distributions is the (continuous) dual of the test function space.^[76] The latter space is endowed with a topology that takes into account not only f itself, but also all its higher derivatives. A standard example is the result of integrating a test function f over some domain Ω:

$I(f)=\int_{\Omega} f(x)\,dx.$

When $\Omega = \{p\}$, the set consisting of a single point, this reduces to the Dirac distribution, denoted by δ, which associates to a test function f its value at p: $\delta(f) = f(p)$. Distributions are a powerful instrument to solve differential equations. Since all standard analytic notions such as derivatives are linear, they extend naturally to the space of distributions. Therefore, the equation in question can be transferred to a distribution space, which is bigger than the underlying function space, so that more flexible methods are available for solving the equation. For example, Green's functions and fundamental solutions are usually distributions rather than proper functions, and can then be used to find solutions of the equation with prescribed boundary conditions. The found solution can then in some cases be proven to be actually a true function, and a solution to the original equation (e.g., using the Lax–Milgram theorem, a consequence of the Riesz representation theorem).^[77]
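The Dirac distribution can be written directly as a linear functional on test functions. The sketch below also compares it with integration against narrow Gaussians centred at p, a standard way to approximate δ that is not described in the text above and is included only as an illustration. The names `bump`, `dirac` and `smeared`, the point p = 0.3 and the widths are all hypothetical.

```python
import math

def bump(x):
    """A smooth test function with compact support in [-1, 1]."""
    if abs(x) >= 1.0:
        return 0.0
    return math.exp(-1.0 / (1.0 - x * x))

def dirac(p):
    """Return the functional f -> f(p)."""
    return lambda f: f(p)

def smeared(p, eps, lo=-2.0, hi=2.0, n=40_000):
    """Return f -> integral of g(x) * f(x) over [lo, hi] (midpoint rule),

    where g is the Gaussian density with mean p and standard deviation eps.
    """
    h = (hi - lo) / n
    norm = eps * math.sqrt(2.0 * math.pi)

    def functional(f):
        total = 0.0
        for k in range(n):
            x = lo + (k + 0.5) * h
            total += math.exp(-((x - p) ** 2) / (2.0 * eps * eps)) / norm * f(x)
        return total * h

    return functional

p = 0.3
delta = dirac(p)
exact = delta(bump)  # simply bump(0.3)

# Errors of the Gaussian approximations for decreasing widths.
errors = [abs(smeared(p, eps)(bump) - exact) for eps in (0.2, 0.1, 0.05)]
```

The list `errors` shrinks as the Gaussian narrows, which matches the picture of δ as a limit of ordinary functions even though δ itself is not one.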

Fourier analysis

The heat equation describes the dissipation of physical properties over time, such as the decline of the temperature of a hot body placed in a colder environment (yellow depicts colder regions than red).

Resolving a periodic function into a sum of trigonometric functions forms a Fourier series, a technique much used in physics and engineering.^{[nb 14]}^[78] The underlying vector space is usually the Hilbert space L²(0, 2π), for which the functions sin mx and cos mx (m an integer) form an orthogonal basis.^[79] The Fourier expansion of an L² function f is

$\frac{a_0}{2}+\sum_{m=1}^{\infty}\left[a_m\cos(mx)+b_m\sin(mx)\right].$

The coefficients a_m and b_m are called Fourier coefficients of f, and are calculated by the formulas^[80]

$a_m=\frac{1}{\pi}\int_0^{2\pi}f(t)\cos(mt)\,dt$

$b_m=\frac{1}{\pi}\int_0^{2\pi}f(t)\sin(mt)\,dt.$
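The coefficient formulas can be evaluated numerically. As a rough sketch, the code below approximates both integrals with a midpoint Riemann sum for the sample function f(t) = t on [0, 2π]. The function names, the choice of f and the number of sample points are assumptions made for this example.

```python
import math

def fourier_coefficients(f, m, n=20_000):
    """Approximate a_m and b_m of f on [0, 2*pi] with a midpoint Riemann sum."""
    h = 2.0 * math.pi / n
    a = 0.0
    b = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        a += f(t) * math.cos(m * t)
        b += f(t) * math.sin(m * t)
    return a * h / math.pi, b * h / math.pi

def sawtooth(t):
    """Sample function f(t) = t on [0, 2*pi]."""
    return t

# Pairs (a_m, b_m) for m = 0, 1, 2, 3.
coeffs = [fourier_coefficients(sawtooth, m) for m in range(4)]
```

For this f, integration by parts gives a_0 = 2π, a_m = 0 and b_m = −2/m for m ≥ 1, so the constant term a_0/2 = π is the mean value of f, and the numerical values in `coeffs` agree with these closed forms to within the quadrature error.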

In physical terms the function is represented as a superposition of sine waves and the coefficients give information about the function's frequency spectrum.^[81] A complex-number form of Fourier series is also commonly used.^[80] The concrete formulae above are consequences of a more general mathematical duality called Pontryagin duality.^[82] Applied to the group R, it yields the classical Fourier transform; an application in physics is that of reciprocal lattices, where the underlying group is a finite-dimensional real vector space endowed with the additional datum of a lattice encoding positions of atoms in crystals.^[83]

Fourier series are used to solve boundary value problems in partial differential equations.^[84] In 1822, Fourier first used this technique to solve the heat equation.^[85] A discrete version of the Fourier series can be used in sampling applications where the function value is known only at a finite number of equally spaced points. In this case the Fourier series is a finite sum, and its value agrees with the sampled value at each sample point.^[86] The set of coefficients is known as the discrete Fourier transform (DFT) of the given sample sequence. The DFT is one of the key tools of digital signal processing, a field whose applications include radar, speech encoding, and image compression.^[87] The JPEG image format is an application of the closely related discrete cosine transform.^[88]
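A direct, quadratic-time evaluation of the DFT and its inverse illustrates the statement that the finite series reproduces the samples. This is a minimal sketch: the sign and normalization conventions (negative exponent in the forward transform, factor 1/n in the inverse) are one common choice among several, and the function names and sample values are hypothetical.

```python
import cmath

def dft(samples):
    """Discrete Fourier transform, computed directly from the definition."""
    n = len(samples)
    return [sum(samples[k] * cmath.exp(-2j * cmath.pi * m * k / n)
                for k in range(n))
            for m in range(n)]

def idft(coeffs):
    """Inverse transform: evaluates the finite Fourier sum at each sample point."""
    n = len(coeffs)
    return [sum(coeffs[m] * cmath.exp(2j * cmath.pi * m * k / n)
                for m in range(n)) / n
            for k in range(n)]

samples = [1, 2, 0, -1]
spectrum = dft(samples)      # approximately [2, 1-3j, 0, 1+3j]
recovered = idft(spectrum)   # approximately the original samples
```

Up to floating-point rounding, `recovered` equals `samples`, so no information is lost in passing to the coefficients.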

The fast Fourier transform is an algorithm for rapidly computing the discrete Fourier transform.^[89] It is used not only for calculating the Fourier coefficients but, using the convolution theorem, also for computing the convolution of two finite sequences.^[90] Convolutions in turn are applied in digital filters^[91] and as a rapid multiplication algorithm for polynomials and large integers (Schönhage–Strassen algorithm).^[92]^[93]
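The link between the fast Fourier transform and polynomial multiplication can be sketched as follows. The code uses a recursive radix-2 Cooley–Tukey scheme, which is one of several FFT variants and is not the Schönhage–Strassen algorithm itself. The function names are hypothetical, and the final rounding step assumes integer coefficients.

```python
import cmath

def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two.

    With invert=True the conjugate transform is computed, without the 1/n factor.
    """
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def poly_multiply(p, q):
    """Multiply integer polynomials given as coefficient lists (lowest degree first)."""
    result_len = len(p) + len(q) - 1
    size = 1
    while size < result_len:
        size *= 2
    fp = fft(list(p) + [0] * (size - len(p)))
    fq = fft(list(q) + [0] * (size - len(q)))
    # Convolution theorem: pointwise product of transforms, then transform back.
    product = fft([x * y for x, y in zip(fp, fq)], invert=True)
    return [round((c / size).real) for c in product[:result_len]]

# (1 + 2x + 3x^2)(4 + 5x) = 4 + 13x + 22x^2 + 15x^3
coefficients = poly_multiply([1, 2, 3], [4, 5])
```

The coefficient list of a product is the convolution of the two coefficient lists, which is why zero-padding to a power of two and multiplying the transforms pointwise gives the right answer.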

Differential geometry

The tangent space to the 2-sphere at some point is the infinite plane touching the sphere at this point.

The tangent plane to a surface at a point is naturally a vector space whose origin is identified with the point of contact. The tangent plane is the best linear approximation, or linearization, of a surface at a point.^{[nb 15]} Even in a three-dimensional Euclidean space, there is typically no natural way to prescribe a basis of the tangent plane, and so it is conceived of as an abstract vector space rather than a real coordinate space. The tangent space is the generalization to higher-dimensional differentiable manifolds.^[94]

Riemannian manifolds are manifolds whose tangent spaces are endowed with a suitable inner product.^[95] Derived therefrom, the Riemann curvature tensor encodes all curvatures of a manifold in one object, which finds applications in general relativity, for example, where the Einstein curvature tensor describes the matter and energy content of space-time.^[96]^[97] The tangent space of a Lie group can be given naturally the structure of a Lie algebra and can be used to classify compact Lie groups.^[98]

Generalizations

Vector bundles

A Möbius strip. Locally, it looks like $U \times \mathbf{R}$.

A vector bundle is a family of vector spaces parametrized continuously by a topological space X.^[94] More precisely, a vector bundle over X is a topological space E equipped with a continuous map $\pi : E \to X$ such that for every x in X, the fiber π⁻¹(x) is a vector space. The case $\dim V = 1$ is called a line bundle. For any vector space V, the projection $X \times V \to X$ makes the product $X \times V$ into a "trivial" vector bundle. Vector bundles over X are required to be locally a product of X and some (fixed) vector space V: for every x in X, there is a neighborhood U of x such that the restriction of π to π⁻¹(U) is isomorphic^{[nb 16]} to the trivial bundle $U \times V \to U$. Despite their locally trivial character, vector bundles may (depending on the shape of the underlying space X) be "twisted" in the large (i.e., the bundle need not be (globally isomorphic to) the trivial bundle $X \times V$). For example, the Möbius strip can be seen as a line bundle over the circle S¹ (by identifying open intervals with the real line). It is, however, different from the cylinder $S^1 \times \mathbf{R}$, because the latter is orientable whereas the former is not.^[99]

Properties of certain vector bundles provide information about the underlying topological space. For example, the tangent bundle consists of the collection of tangent spaces parametrized by the points of a differentiable manifold. The tangent bundle of the circle S¹ is globally isomorphic to $S^1 \times \mathbf{R}$, since there is a global nonzero vector field on S¹.^{[nb 17]} In contrast, by the hairy ball theorem, there is no (tangent) vector field on the 2-sphere S² which is everywhere nonzero.^[100] K-theory studies the isomorphism classes of all vector bundles over some topological space.^[101] In addition to deepening topological and geometrical insight, it has purely algebraic consequences, such as the classification of finite-dimensional real division algebras: R, C, the quaternions H and the octonions O.

The cotangent bundle of a differentiable manifold consists, at every point of the manifold, of the dual of the tangent space, the cotangent space. Sections of that bundle are known as differential one-forms.

Modules

Modules are to rings what vector spaces are to fields: the same axioms, applied to a ring R instead of a field F, yield modules.^[102] The theory of modules, compared to that of vector spaces, is complicated by the presence of ring elements that do not have multiplicative inverses. For example, modules need not have bases, as the Z-module (i.e., abelian group) Z/2Z shows; those modules that do (including all vector spaces) are known as free modules. Nevertheless, a vector space can be compactly defined as a module over a ring which is a field with the elements being called vectors. Some authors use the term vector space to mean modules over a division ring.^[103] The algebro-geometric interpretation of commutative rings via their spectrum allows the development of concepts such as locally free modules, the algebraic counterpart to vector bundles.

Affine and projective spaces

An affine plane (light blue) in R³. It is a two-dimensional subspace shifted by a vector x (red).

Roughly, affine spaces are vector spaces whose origins are not specified.^[104] More precisely, an affine space is a set with a free transitive vector space action. In particular, a vector space is an affine space over itself, by the map $V \times V \to V, (v, a) \mapsto a + v$. If W is a vector space, then an affine subspace is a subset of W obtained by translating a linear subspace V by a fixed vector $x \in W$; this space is denoted by $x + V$ (it is a coset of V in W) and consists of all vectors of the form $x + v$ for $v \in V$.

An important example is the space of solutions of a system of inhomogeneous linear equations $Ax = b$, generalizing the homogeneous case $b = 0$ above.^[105] The space of solutions is the affine subspace $x + V$, where x is a particular solution of the equation, and V is the space of solutions of the homogeneous equation (the nullspace of A).
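A small worked example makes the affine structure of the solution set concrete. The matrix `A`, the right-hand side `b`, the particular solution and the nullspace vector below are hand-picked for this illustration; here the nullspace is one-dimensional, so the solution set is a line.

```python
def mat_vec(a, x):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]

# The system x1 + x2 = 2, x2 + x3 = 3.
A = [[1, 1, 0],
     [0, 1, 1]]
b = [2, 3]

particular = [2, 0, 3]     # one solution of A x = b
null_vector = [1, -1, 1]   # spans the nullspace of A

def affine_point(t):
    """The point particular + t * null_vector of the affine subspace."""
    return [p + t * n for p, n in zip(particular, null_vector)]

# Every point on the line solves the inhomogeneous system.
solutions = [affine_point(t) for t in range(-3, 4)]
all_solve = all(mat_vec(A, x) == b for x in solutions)

# The difference of two solutions solves the homogeneous system A v = 0.
difference = [p - q for p, q in zip(solutions[0], solutions[-1])]
homogeneous = mat_vec(A, difference)
```

Here `all_solve` is true and `homogeneous` is the zero vector, which is the coset description x + V in coordinates: solutions differ exactly by elements of the nullspace.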

The set of one-dimensional subspaces of a fixed finite-dimensional vector space V is known as projective space; it may be used to formalize the idea of parallel lines intersecting at infinity.^[106] Grassmannians and flag manifolds generalize this by parametrizing linear subspaces of fixed dimension k and flags of subspaces, respectively.
