Search This Blog

Thursday, August 18, 2022

Numerical integration

From Wikipedia, the free encyclopedia
 
Numerical integration is used to calculate a numerical approximation for the value , the area under the curve defined by .

In analysis, numerical integration comprises a broad family of algorithms for calculating the numerical value of a definite integral, and by extension, the term is also sometimes used to describe the numerical solution of differential equations. This article focuses on calculation of definite integrals.

The term numerical quadrature (often abbreviated to quadrature) is more or less a synonym for numerical integration, especially as applied to one-dimensional integrals. Some authors refer to numerical integration over more than one dimension as cubature; others take quadrature to include higher-dimensional integration.

The basic problem in numerical integration is to compute an approximate solution to a definite integral

to a given degree of accuracy. If f(x) is a smooth function integrated over a small number of dimensions, and the domain of integration is bounded, there are many methods for approximating the integral to the desired precision.

Reasons for numerical integration

There are several reasons for carrying out numerical integration, as opposed to analytical integration by finding the antiderivative:

  1. The integrand f(x) may be known only at certain points, such as obtained by sampling. Some embedded systems and other computer applications may need numerical integration for this reason.
  2. A formula for the integrand may be known, but it may be difficult or impossible to find an antiderivative that is an elementary function. An example of such an integrand is f(x) = exp(−x2), the antiderivative of which (the error function, times a constant) cannot be written in elementary form.
  3. It may be possible to find an antiderivative symbolically, but it may be easier to compute a numerical approximation than to compute the antiderivative. That may be the case if the antiderivative is given as an infinite series or product, or if its evaluation requires a special function that is not available.

History

The term "numerical integration" first appears in 1915 in the publication A Course in Interpolation and Numeric Integration for the Mathematical Laboratory by David Gibb.

Quadrature is a historical mathematical term that means calculating area. Quadrature problems have served as one of the main sources of mathematical analysis. Mathematicians of Ancient Greece, according to the Pythagorean doctrine, understood calculation of area as the process of constructing geometrically a square having the same area (squaring). That is why the process was named quadrature. For example, a quadrature of the circle, Lune of Hippocrates, The Quadrature of the Parabola. This construction must be performed only by means of compass and straightedge.

The ancient Babylonians used the trapezoidal rule to integrate the motion of Jupiter along the ecliptic.

Antique method to find the Geometric mean

For a quadrature of a rectangle with the sides a and b it is necessary to construct a square with the side (the Geometric mean of a and b). For this purpose it is possible to use the following fact: if we draw the circle with the sum of a and b as the diameter, then the height BH (from a point of their connection to crossing with a circle) equals their geometric mean. The similar geometrical construction solves a problem of a quadrature for a parallelogram and a triangle.

The area of a segment of a parabola

Problems of quadrature for curvilinear figures are much more difficult. The quadrature of the circle with compass and straightedge had been proved in the 19th century to be impossible. Nevertheless, for some figures (for example the Lune of Hippocrates) a quadrature can be performed. The quadratures of a sphere surface and a parabola segment done by Archimedes became the highest achievement of the antique analysis.

  • The area of the surface of a sphere is equal to quadruple the area of a great circle of this sphere.
  • The area of a segment of the parabola cut from it by a straight line is 4/3 the area of the triangle inscribed in this segment.

For the proof of the results Archimedes used the Method of exhaustion of Eudoxus.

In medieval Europe the quadrature meant calculation of area by any method. More often the Method of indivisibles was used; it was less rigorous, but more simple and powerful. With its help Galileo Galilei and Gilles de Roberval found the area of a cycloid arch, Grégoire de Saint-Vincent investigated the area under a hyperbola (Opus Geometricum, 1647), and Alphonse Antonio de Sarasa, de Saint-Vincent's pupil and commentator, noted the relation of this area to logarithms.

John Wallis algebrised this method: he wrote in his Arithmetica Infinitorum (1656) series that we now call the definite integral, and he calculated their values. Isaac Barrow and James Gregory made further progress: quadratures for some algebraic curves and spirals. Christiaan Huygens successfully performed a quadrature of some Solids of revolution.

The quadrature of the hyperbola by Saint-Vincent and de Sarasa provided a new function, the natural logarithm, of critical importance.

With the invention of integral calculus came a universal method for area calculation. In response, the term quadrature has become traditional, and instead the modern phrase "computation of a univariate definite integral" is more common.

Methods for one-dimensional integrals

Numerical integration methods can generally be described as combining evaluations of the integrand to get an approximation to the integral. The integrand is evaluated at a finite set of points called integration points and a weighted sum of these values is used to approximate the integral. The integration points and weights depend on the specific method used and the accuracy required from the approximation.

An important part of the analysis of any numerical integration method is to study the behavior of the approximation error as a function of the number of integrand evaluations. A method that yields a small error for a small number of evaluations is usually considered superior. Reducing the number of evaluations of the integrand reduces the number of arithmetic operations involved, and therefore reduces the total round-off error. Also, each evaluation takes time, and the integrand may be arbitrarily complicated.

A 'brute force' kind of numerical integration can be done, if the integrand is reasonably well-behaved (i.e. piecewise continuous and of bounded variation), by evaluating the integrand with very small increments.

Quadrature rules based on interpolating functions

A large class of quadrature rules can be derived by constructing interpolating functions that are easy to integrate. Typically these interpolating functions are polynomials. In practice, since polynomials of very high degree tend to oscillate wildly, only polynomials of low degree are used, typically linear and quadratic.

Illustration of the rectangle rule.

The simplest method of this type is to let the interpolating function be a constant function (a polynomial of degree zero) that passes through the point . This is called the midpoint rule or rectangle rule

Illustration of the trapezoidal rule.

The interpolating function may be a straight line (an affine function, i.e. a polynomial of degree 1) passing through the points and . This is called the trapezoidal rule

Illustration of Simpson's rule.

For either one of these rules, we can make a more accurate approximation by breaking up the interval into some number of subintervals, computing an approximation for each subinterval, then adding up all the results. This is called a composite rule, extended rule, or iterated rule. For example, the composite trapezoidal rule can be stated as

where the subintervals have the form with and Here we used subintervals of the same length but one could also use intervals of varying length .

Interpolation with polynomials evaluated at equally spaced points in yields the Newton–Cotes formulas, of which the rectangle rule and the trapezoidal rule are examples. Simpson's rule, which is based on a polynomial of order 2, is also a Newton–Cotes formula.

Quadrature rules with equally spaced points have the very convenient property of nesting. The corresponding rule with each interval subdivided includes all the current points, so those integrand values can be re-used.

If we allow the intervals between interpolation points to vary, we find another group of quadrature formulas, such as the Gaussian quadrature formulas. A Gaussian quadrature rule is typically more accurate than a Newton–Cotes rule that uses the same number of function evaluations, if the integrand is smooth (i.e., if it is sufficiently differentiable). Other quadrature methods with varying intervals include Clenshaw–Curtis quadrature (also called Fejér quadrature) methods, which do nest.

Gaussian quadrature rules do not nest, but the related Gauss–Kronrod quadrature formulas do.

Generalized midpoint rule formula

A generalized midpoint rule formula is given by

or

where denotes -th derivative. For example, substituting and

in the generalized midpoint rule formula, we obtain an equation of the inverse tangent

where is imaginary unit and

Since at each odd the numerator of the integrand becomes , the generalized midpoint rule formula can be reorganized as

The following example of Mathematica code generates the plot showing difference between inverse tangent and its approximation truncated at and :

f[theta_, x_] := theta/(1 + theta^2 * x^2);

aTan[theta_, M_, nMax_] := 
    2*Sum[(Function[x, Evaluate[D[f[theta, x], {x, 2*n}]]][(m - 1/2)/
        M])/((2*n + 1)!*(2*M)^(2*n + 1)), {m, 1, M}, {n, 0, nMax}];

Plot[{ArcTan[theta] - aTan[theta, 5, 10]}, {theta, -Pi, Pi}, 
 PlotRange -> All]

For a function defined over interval , its integral is

Therefore, we can apply the generalized midpoint integration formula above by assuming that .

Adaptive algorithms

If f(x) does not have many derivatives at all points, or if the derivatives become large, then Gaussian quadrature is often insufficient. In this case, an algorithm similar to the following will perform better:

def calculate_definite_integral_of_f(f, initial_step_size):
    """
    This algorithm calculates the definite integral of a function
    from 0 to 1, adaptively, by choosing smaller steps near
    problematic points.
    """
    x = 0.0
    h = initial_step_size
    accumulator = 0.0
    while x < 1.0:
        if x + h > 1.0:
            h = 1.0 - x  # At end of unit interval, adjust last step to end at 1.
        if error_too_big_in_quadrature_of_f_over_range(f, [x, x + h]):
            h = make_h_smaller(h)
        else:
            accumulator += quadrature_of_f_over_range(f, [x, x + h])
            x += h
            if error_too_small_in_quadrature_of_over_range(f, [x, x + h]):
                h = make_h_larger(h)  # Avoid wasting time on tiny steps.
    return accumulator

Some details of the algorithm require careful thought. For many cases, estimating the error from quadrature over an interval for a function f(x) isn't obvious. One popular solution is to use two different rules of quadrature, and use their difference as an estimate of the error from quadrature. The other problem is deciding what "too large" or "very small" signify. A local criterion for "too large" is that the quadrature error should not be larger than t ⋅ h where t, a real number, is the tolerance we wish to set for global error. Then again, if h is already tiny, it may not be worthwhile to make it even smaller even if the quadrature error is apparently large. A global criterion is that the sum of errors on all the intervals should be less than t. This type of error analysis is usually called "a posteriori" since we compute the error after having computed the approximation.

Heuristics for adaptive quadrature are discussed by Forsythe et al. (Section 5.4).

Extrapolation methods

The accuracy of a quadrature rule of the Newton–Cotes type is generally a function of the number of evaluation points. The result is usually more accurate as the number of evaluation points increases, or, equivalently, as the width of the step size between the points decreases. It is natural to ask what the result would be if the step size were allowed to approach zero. This can be answered by extrapolating the result from two or more nonzero step sizes, using series acceleration methods such as Richardson extrapolation. The extrapolation function may be a polynomial or rational function. Extrapolation methods are described in more detail by Stoer and Bulirsch (Section 3.4) and are implemented in many of the routines in the QUADPACK library.

Conservative (a priori) error estimation

Let have a bounded first derivative over i.e. The mean value theorem for where gives

for some depending on .

If we integrate in from to on both sides and take the absolute values, we obtain

We can further approximate the integral on the right-hand side by bringing the absolute value into the integrand, and replacing the term in by an upper bound

 

 

 

 

(1)

where the supremum was used to approximate.

Hence, if we approximate the integral by the quadrature rule our error is no greater than the right hand side of 1. We can convert this into an error analysis for the Riemann sum, giving an upper bound of

for the error term of that particular approximation. (Note that this is precisely the error we calculated for the example .) Using more derivatives, and by tweaking the quadrature, we can do a similar error analysis using a Taylor series (using a partial sum with remainder term) for f. This error analysis gives a strict upper bound on the error, if the derivatives of f are available.

This integration method can be combined with interval arithmetic to produce computer proofs and verified calculations.

Integrals over infinite intervals

Several methods exist for approximate integration over unbounded intervals. The standard technique involves specially derived quadrature rules, such as Gauss-Hermite quadrature for integrals on the whole real line and Gauss-Laguerre quadrature for integrals on the positive reals.[4] Monte Carlo methods can also be used, or a change of variables to a finite interval; e.g., for the whole line one could use

and for semi-infinite intervals one could use

as possible transformations.

Multidimensional integrals

The quadrature rules discussed so far are all designed to compute one-dimensional integrals. To compute integrals in multiple dimensions, one approach is to phrase the multiple integral as repeated one-dimensional integrals by applying Fubini's theorem (the tensor product rule). This approach requires the function evaluations to grow exponentially as the number of dimensions increases. Three methods are known to overcome this so-called curse of dimensionality.

A great many additional techniques for forming multidimensional cubature integration rules for a variety of weighting functions are given in the monograph by Stroud. Integration on the sphere has been reviewed by Hesse et al. (2015).

Monte Carlo

Monte Carlo methods and quasi-Monte Carlo methods are easy to apply to multi-dimensional integrals. They may yield greater accuracy for the same number of function evaluations than repeated integrations using one-dimensional methods.

A large class of useful Monte Carlo methods are the so-called Markov chain Monte Carlo algorithms, which include the Metropolis–Hastings algorithm and Gibbs sampling.

Sparse grids

Sparse grids were originally developed by Smolyak for the quadrature of high-dimensional functions. The method is always based on a one-dimensional quadrature rule, but performs a more sophisticated combination of univariate results. However, whereas the tensor product rule guarantees that the weights of all of the cubature points will be positive if the weights of the quadrature points were positive, Smolyak's rule does not guarantee that the weights will all be positive.

Bayesian Quadrature

Bayesian quadrature is a statistical approach to the numerical problem of computing integrals and falls under the field of probabilistic numerics. It can provide a full handling of the uncertainty over the solution of the integral expressed as a Gaussian process posterior variance.

Connection with differential equations

The problem of evaluating the integral

can be reduced to an initial value problem for an ordinary differential equation by applying the first part of the fundamental theorem of calculus. By differentiating both sides of the above with respect to the argument x, it is seen that the function F satisfies

Methods developed for ordinary differential equations, such as Runge–Kutta methods, can be applied to the restated problem and thus be used to evaluate the integral. For instance, the standard fourth-order Runge–Kutta method applied to the differential equation yields Simpson's rule from above.

The differential equation has a special form: the right-hand side contains only the independent variable (here ) and not the dependent variable (here ). This simplifies the theory and algorithms considerably. The problem of evaluating integrals is thus best studied in its own right.

Chirality (physics)

From Wikipedia, the free encyclopedia

A chiral phenomenon is one that is not identical to its mirror image (see the article on mathematical chirality). The spin of a particle may be used to define a handedness, or helicity, for that particle, which, in the case of a massless particle, is the same as chirality. A symmetry transformation between the two is called parity transformation. Invariance under parity transformation by a Dirac fermion is called chiral symmetry.

Chirality and helicity

The helicity of a particle is positive (“right-handed”) if the direction of its spin is the same as the direction of its motion. It is negative (“left-handed”) if the directions of spin and motion are opposite. So a standard clock, with its spin vector defined by the rotation of its hands, has left-handed helicity if tossed with its face directed forwards.

Mathematically, helicity is the sign of the projection of the spin vector onto the momentum vector: “left” is negative, “right” is positive.

Right left helicity.svg

The chirality of a particle is more abstract: It is determined by whether the particle transforms in a right- or left-handed representation of the Poincaré group.

For massless particles – photons, gluons, and (hypothetical) gravitons – chirality is the same as helicity; a given massless particle appears to spin in the same direction along its axis of motion regardless of point of view of the observer.

For massive particles – such as electrons, quarks, and neutrinos – chirality and helicity must be distinguished: In the case of these particles, it is possible for an observer to change to a reference frame moving faster than the spinning particle, in which case the particle will then appear to move backwards, and its helicity (which may be thought of as “apparent chirality”) will be reversed. That is, helicity is a constant of motion, but it is not Lorentz invariant. Chirality is Lorentz invariant, but is not a constant of motion - a propagating massive left-handed spinor will evolve into a right handed spinor over time, and vice-versa.

A massless particle moves with the speed of light, so no real observer (who must always travel at less than the speed of light) can be in any reference frame where the particle appears to reverse its relative direction of spin, meaning that all real observers see the same helicity. Because of this, the direction of spin of massless particles is not affected by a change of viewpoint (Lorentz boost) in the direction of motion of the particle, and the sign of the projection (helicity) is fixed for all reference frames: The helicity of massless particles is a relativistic invariant (a quantity whose value is the same in all inertial reference frames) which always matches the massless particles' chirality.

The discovery of neutrino oscillation implies that neutrinos have mass, so the photon is the only known massless particle. Gluons are also expected to be massless, although the assumption that they are has not been conclusively tested. Hence, these are the only two particles now known for which helicity could be identical to chirality, and only the photon has been confirmed by measurement. All other observed particles have mass and thus may have different helicities in different reference frames.

Chiral theories

Particle physicists have only observed or inferred left-chiral fermions and right-chiral antifermions engaging in the charged weak interaction. Even in the case of the electrically neutral weak interaction, which can engage with both left- and right-chiral fermions, in most circumstances two left-handed fermions interact more strongly than right-handed or opposite-handed fermions, implying that the universe has a preference for left-handed chirality. This preferential treatment of one chirality over another violates a symmetry that holds for all other forces of nature.

Chirality for a Dirac fermion ψ is defined through the operator γ5, which has eigenvalues ±1. Any Dirac field can thus be projected into its left- or right-handed component by acting with the projection operators ½(1 − γ5) or ½(1 + γ5) on ψ.

The coupling of the charged weak interaction to fermions is proportional to the first projection operator, which is responsible for this interaction's parity symmetry violation.

A common source of confusion is due to conflating the γ5, chirality operator with the helicity operator. Since the helicity of massive particles is frame-dependent, it might seem that the same particle would interact with the weak force according to one frame of reference, but not another. The resolution to this paradox is that the chirality operator is equivalent to helicity for massless fields only, for which helicity is not frame-dependent. By contrast, for massive particles, chirality is not the same as helicity, so there is no frame dependence of the weak interaction: A particle that couples to the weak force in one frame does so in every frame.

A theory that is asymmetric with respect to chiralities is called a chiral theory, while a non-chiral (i.e., parity-symmetric) theory is sometimes called a vector theory. Many pieces of the Standard Model of physics are non-chiral, which is traceable to anomaly cancellation in chiral theories. Quantum chromodynamics is an example of a vector theory, since both chiralities of all quarks appear in the theory, and couple to gluons in the same way.

The electroweak theory, developed in the mid 20th century, is an example of a chiral theory. Originally, it assumed that neutrinos were massless, and only assumed the existence of left-handed neutrinos (along with their complementary right-handed antineutrinos). After the observation of neutrino oscillations, which imply that neutrinos are massive (like all other fermions) the revised theories of the electroweak interaction now include both right- and left-handed neutrinos. However, it is still a chiral theory, as it does not respect parity symmetry.

The exact nature of the neutrino is still unsettled and so the electroweak theories that have been proposed are somewhat different, but most accommodate the chirality of neutrinos in the same way as was already done for all other fermions.

Chiral symmetry

Vector gauge theories with massless Dirac fermion fields ψ exhibit chiral symmetry, i.e., rotating the left-handed and the right-handed components independently makes no difference to the theory. We can write this as the action of rotation on the fields:

  and  

or

  and  

With N flavors, we have unitary rotations instead: U(N)L×U(N)R.

More generally, we write the right-handed and left-handed states as a projection operator acting on a spinor. The right-handed and left-handed projection operators are

and

Massive fermions do not exhibit chiral symmetry, as the mass term in the Lagrangian, mψψ, breaks chiral symmetry explicitly.

Spontaneous chiral symmetry breaking may also occur in some theories, as it most notably does in quantum chromodynamics.

The chiral symmetry transformation can be divided into a component that treats the left-handed and the right-handed parts equally, known as vector symmetry, and a component that actually treats them differently, known as axial symmetry. (cf. Current algebra.) A scalar field model encoding chiral symmetry and its breaking is the chiral model.

The most common application is expressed as equal treatment of clockwise and counter-clockwise rotations from a fixed frame of reference.

The general principle is often referred to by the name chiral symmetry. The rule is absolutely valid in the classical mechanics of Newton and Einstein, but results from quantum mechanical experiments show a difference in the behavior of left-chiral versus right-chiral subatomic particles.

Example: u and d quarks in QCD

Consider quantum chromodynamics (QCD) with two massless quarks u and d (massive fermions do not exhibit chiral symmetry). The Lagrangian reads

In terms of left-handed and right-handed spinors, it reads

(Here, i is the imaginary unit and the Dirac operator.)

Defining

it can be written as

The Lagrangian is unchanged under a rotation of qL by any 2×2 unitary matrix L, and qR by any 2×2 unitary matrix R.

This symmetry of the Lagrangian is called flavor chiral symmetry, and denoted as U(2)L×U(2)R. It decomposes into

The singlet vector symmetry, U(1)V, acts as

and thus invariant under U(1) gauge symmetry. This corresponds to baryon number conservation.

The singlet axial group U(1)A transforms as the following global transformation

However, it does not correspond to a conserved quantity, because the associated axial current is not conserved. It is explicitly violated by a quantum anomaly.

The remaining chiral symmetry SU(2)L×SU(2)R turns out to be spontaneously broken by a quark condensate formed through nonperturbative action of QCD gluons, into the diagonal vector subgroup SU(2)V known as isospin. The Goldstone bosons corresponding to the three broken generators are the three pions. As a consequence, the effective theory of QCD bound states like the baryons, must now include mass terms for them, ostensibly disallowed by unbroken chiral symmetry. Thus, this chiral symmetry breaking induces the bulk of hadron masses, such as those for the nucleons — in effect, the bulk of the mass of all visible matter.

In the real world, because of the nonvanishing and differing masses of the quarks, SU(2)L × SU(2)R is only an approximate symmetry to begin with, and therefore the pions are not massless, but have small masses: they are pseudo-Goldstone bosons.

More flavors

For more "light" quark species, N flavors in general, the corresponding chiral symmetries are U(N)L × U(N)R, decomposing into

and exhibiting a very analogous chiral symmetry breaking pattern.

Most usually, N = 3 is taken, the u, d, and s quarks taken to be light (the Eightfold way (physics)), so then approximately massless for the symmetry to be meaningful to a lowest order, while the other three quarks are sufficiently heavy to barely have a residual chiral symmetry be visible for practical purposes.

An application in particle physics

In theoretical physics, the electroweak model breaks parity maximally. All its fermions are chiral Weyl fermions, which means that the charged weak gauge bosons W+ and W only couple to left-handed quarks and leptons.

Some theorists found this objectionable, and so conjectured a GUT extension of the weak force which has new, high energy W' and Z' bosons, which do couple with right handed quarks and leptons:

to

Here, SU(2)L (pronounced “SU(2) left”) is none other than SU(2)W from above, while B−L is the baryon number minus the lepton number. The electric charge formula in this model is given by

where and are the left and right weak isospin values of the fields in the theory.

There is also the chromodynamic SU(3)C. The idea was to restore parity by introducing a left-right symmetry. This is a group extension of (the left-right symmetry) by

to the semidirect product

This has two connected components where acts as an automorphism, which is the composition of an involutive outer automorphism of SU(3)C with the interchange of the left and right copies of SU(2) with the reversal of U(1)B−L . It was shown by Mohapatra & Senjanovic (1975) that left-right symmetry can be spontaneously broken to give a chiral low energy theory, which is the Standard Model of Glashow, Weinberg, and Salam, and also connects the small observed neutrino masses to the breaking of left-right symmetry via the seesaw mechanism.

In this setting, the chiral quarks

and

are unified into an irreducible representation (“irrep”)

The leptons are also unified into an irreducible representation

The Higgs bosons needed to implement the breaking of left-right symmetry down to the Standard Model are

This then provides three sterile neutrinos which are perfectly consistent with current neutrino oscillation data. Within the seesaw mechanism, the sterile neutrinos become superheavy without affecting physics at low energies.

Because the left-right symmetry is spontaneously broken, left-right models predict domain walls. This left-right symmetry idea first appeared in the Pati–Salam model (1974) and Mohapatra–Pati models (1975).

 

Operator (computer programming)

From Wikipedia, the free encyclopedia https://en.wikipedia.org/wiki/Operator_(computer_programmin...