A Medley of Potpourri

Sunday, June 4, 2023

Propagation of uncertainty

From Wikipedia, the free encyclopedia

In statistics, propagation of uncertainty (or propagation of error) is the effect of variables' uncertainties (or errors, more specifically random errors) on the uncertainty of a function based on them. When the variables are the values of experimental measurements they have uncertainties due to measurement limitations (e.g., instrument precision) which propagate due to the combination of variables in the function.

The uncertainty u can be expressed in a number of ways. It may be defined by the absolute error $Δ x$ . Uncertainties can also be defined by the relative error $(Δ x)/ x$ , which is usually written as a percentage. Most commonly, the uncertainty on a quantity is quantified in terms of the standard deviation, $σ$ , which is the positive square root of the variance. The value of a quantity and its error are then expressed as an interval $x \pm u$ . However, the most general way of characterizing uncertainty is by specifying its probability distribution. If the probability distribution of the variable is known or can be assumed, in theory it is possible to get any of its statistics. In particular, it is possible to derive confidence limits to describe the region within which the true value of the variable may be found. For example, the 68% confidence limits for a one-dimensional variable belonging to a normal distribution are approximately ± one standard deviation $σ$ from the central value $x$ , which means that the region $x \pm σ$ will cover the true value in roughly 68% of cases.

If the uncertainties are correlated then covariance must be taken into account. Correlation can arise from two different sources. First, the measurement errors may be correlated. Second, when the underlying values are correlated across a population, the uncertainties in the group averages will be correlated.

In a general context where a nonlinear function modifies the uncertain parameters (correlated or not), the standard tools to propagate uncertainty, and to infer the probability distribution or statistics of the resulting quantity, are sampling techniques from the Monte Carlo method family. For very large data sets or complex functions, the calculation of the error propagation may be very expensive, so that a surrogate model or a parallel computing strategy may be necessary.
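As an illustration of the sampling approach, the following is a minimal sketch of Monte Carlo propagation using only the Python standard library. The helper name `monte_carlo_propagate`, the choice of independent normally distributed inputs, and the example function and values are assumptions made for this sketch.

```python
import math
import random
import statistics

def monte_carlo_propagate(f, means, sigmas, n_samples=100_000, seed=0):
    """Estimate the mean and standard deviation of f(x1, ..., xn) by sampling.

    Assumes independent, normally distributed inputs (an assumption of this
    sketch, not a requirement of Monte Carlo methods in general).
    """
    rng = random.Random(seed)
    outputs = [
        f(*(rng.gauss(m, s) for m, s in zip(means, sigmas)))
        for _ in range(n_samples)
    ]
    return statistics.fmean(outputs), statistics.stdev(outputs)

# Hypothetical example: f(a, b) = a * exp(b) with a = 2.0 ± 0.1 and b = 0.5 ± 0.05
mc_mean, mc_std = monte_carlo_propagate(
    lambda a, b: a * math.exp(b), [2.0, 0.5], [0.1, 0.05]
)
print(mc_mean, mc_std)
```

Correlated inputs would require sampling from a joint distribution instead of drawing each variable separately.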

In some particular cases, the uncertainty propagation calculation can be done through simple algebraic procedures. Some of these scenarios are described below.

Linear combinations

Let $\{ f_{k} (x_{1}, x_{2}, \dots, x_{n}) \}$ be a set of $m$ functions, which are linear combinations of $n$ variables $x_{1}, x_{2}, \dots, x_{n}$ with combination coefficients $A_{k 1}, A_{k 2}, \dots, A_{k n}, (k = 1, \dots, m)$ :

f_{k} = \sum_{i = 1}^{n} A_{k i} x_{i},

or in matrix notation,

f = A x .

Also let the variance–covariance matrix of x = (x₁, ..., x_n) be denoted by $Σ^{x}$ and let the mean value be denoted by $μ$ :

Σ^{x} = E [(x - μ) \otimes (x - μ)] = (\begin{matrix} σ_{1}^{2} & σ_{12} & σ_{13} & \dots \\ σ_{21} & σ_{2}^{2} & σ_{23} & \dots \\ σ_{31} & σ_{32} & σ_{3}^{2} & \dots \\ ⋮ & ⋮ & ⋮ & ⋱ \end{matrix}) = (\begin{matrix} Σ_{11}^{x} & Σ_{12}^{x} & Σ_{13}^{x} & \dots \\ Σ_{21}^{x} & Σ_{22}^{x} & Σ_{23}^{x} & \dots \\ Σ_{31}^{x} & Σ_{32}^{x} & Σ_{33}^{x} & \dots \\ ⋮ & ⋮ & ⋮ & ⋱ \end{matrix}) .

$\otimes$ is the outer product.

Then, the variance–covariance matrix $Σ^{f}$ of f is given by

Σ^{f} = E [(f - E [f]) \otimes (f - E [f])] = E [A (x - μ) \otimes A (x - μ)] = A E [(x - μ) \otimes (x - μ)] A^{T} = A Σ^{x} A^{T}

In component notation, the equation

Σ^{f} = A Σ^{x} A^{T}

reads

Σ_{i j}^{f} = \sum_{k}^{n} \sum_{l}^{n} A_{i k} Σ_{k l}^{x} A_{j l} .

This is the most general expression for the propagation of error from one set of variables onto another. When the errors on x are uncorrelated, the general expression simplifies to

Σ_{i j}^{f} = \sum_{k}^{n} A_{i k} Σ_{k}^{x} A_{j k},

where $Σ_{k}^{x} = σ_{x_{k}}^{2}$ is the variance of k-th element of the x vector. Note that even though the errors on x may be uncorrelated, the errors on f are in general correlated; in other words, even if $Σ^{x}$ is a diagonal matrix, $Σ^{f}$ is in general a full matrix.
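The component formula can be sketched directly with nested lists. The function name `propagate_linear` and the input values are hypothetical; the example shows that two uncorrelated inputs can still yield correlated outputs.

```python
def propagate_linear(A, cov_x):
    """Return cov_f[i][j] = sum_k sum_l A[i][k] * cov_x[k][l] * A[j][l]."""
    m, n = len(A), len(cov_x)
    return [
        [
            sum(A[i][k] * cov_x[k][l] * A[j][l] for k in range(n) for l in range(n))
            for j in range(m)
        ]
        for i in range(m)
    ]

# Hypothetical inputs: two uncorrelated variables with variances 4 and 9
cov_x = [[4.0, 0.0], [0.0, 9.0]]
# f1 = x1 + x2 and f2 = x1 - x2
A = [[1.0, 1.0], [1.0, -1.0]]
cov_f = propagate_linear(A, cov_x)
print(cov_f)  # [[13.0, -5.0], [-5.0, 13.0]]
```

The off-diagonal entries of the result are non-zero even though `cov_x` is diagonal.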

The general expressions for a scalar-valued function f are a little simpler (here a is a row vector):

f = \sum_{i}^{n} a_{i} x_{i} = a x,

σ_{f}^{2} = \sum_{i}^{n} \sum_{j}^{n} a_{i} Σ_{i j}^{x} a_{j} = a Σ^{x} a^{T} .

Each covariance term $σ_{i j}$ can be expressed in terms of the correlation coefficient $ρ_{i j}$ by $σ_{i j} = ρ_{i j} σ_{i} σ_{j}$ , so that an alternative expression for the variance of f is

σ_{f}^{2} = \sum_{i}^{n} a_{i}^{2} σ_{i}^{2} + \sum_{i}^{n} \sum_{j (j \neq i)}^{n} a_{i} a_{j} ρ_{i j} σ_{i} σ_{j} .

In the case that the variables in x are uncorrelated, this simplifies further to

σ_{f}^{2} = \sum_{i}^{n} a_{i}^{2} σ_{i}^{2} .

In the simple case of identical coefficients and variances, we find

σ_{f} = \sqrt{n} | a | σ .

For the arithmetic mean, $a = 1 / n$ , the result is the standard error of the mean:

σ_{f} = σ / \sqrt{n} .
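A short numerical check of the uncorrelated case, assuming hypothetical values n = 25 and σ = 2, recovers the standard error of the mean:

```python
import math

def linear_combination_sigma(coeffs, sigmas):
    """Standard deviation of sum(a_i * x_i) for uncorrelated x_i."""
    return math.sqrt(sum((a * s) ** 2 for a, s in zip(coeffs, sigmas)))

n = 25       # hypothetical number of measurements
sigma = 2.0  # hypothetical common standard deviation
sem = linear_combination_sigma([1.0 / n] * n, [sigma] * n)
print(sem, sigma / math.sqrt(n))  # both close to 0.4
```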

Non-linear combinations

When f is a set of non-linear combinations of the variables x, an interval propagation could be performed in order to compute intervals which contain all consistent values for the variables. In a probabilistic approach, the function f must usually be linearised by approximation to a first-order Taylor series expansion, though in some cases exact formulae can be derived that do not depend on the expansion, as is the case for the exact variance of products.^[7] The Taylor expansion would be:

f_{k} \approx f_{k}^{0} + \sum_{i}^{n} \frac{\partial f_{k}}{\partial x_{i}} x_{i}

where $\partial f_{k} / \partial x_{i}$ denotes the partial derivative of f_k with respect to the i-th variable, evaluated at the mean value of all components of vector x. Or in matrix notation,

f \approx f^{0} + J x

where J is the Jacobian matrix. Since f⁰ is a constant it does not contribute to the error on f. Therefore, the propagation of error follows the linear case, above, but replacing the linear coefficients, A_ki and A_kj by the partial derivatives, $\frac{\partial f_{k}}{\partial x_{i}}$ and $\frac{\partial f_{k}}{\partial x_{j}}$ . In matrix notation,^[8]

Σ^{f} = J Σ^{x} J^{⊤} .

That is, the Jacobian of the function is used to transform the rows and columns of the variance-covariance matrix of the argument. Note this is equivalent to the matrix expression for the linear case with $J = A$ .

Simplification

Neglecting correlations or assuming independent variables yields a common formula among engineers and experimental scientists to calculate error propagation, the variance formula:^[9]

s_{f} = \sqrt{{(\frac{\partial f}{\partial x})}^{2} s_{x}^{2} + {(\frac{\partial f}{\partial y})}^{2} s_{y}^{2} + {(\frac{\partial f}{\partial z})}^{2} s_{z}^{2} + \dots}

where $s_{f}$ represents the standard deviation of the function $f$ , $s_{x}$ represents the standard deviation of $x$ , $s_{y}$ represents the standard deviation of $y$ , and so forth.

It is important to note that this formula is based on the linear characteristics of the gradient of $f$ and therefore it is a good estimation for the standard deviation of $f$ as long as $s_{x}, s_{y}, s_{z}, \dots$ are small enough. Specifically, the linear approximation of $f$ has to be close to $f$ inside a neighbourhood of radius $s_{x}, s_{y}, s_{z}, \dots$ .^[10]
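The variance formula can be sketched numerically by estimating each partial derivative with a central difference. The helper `propagate_uncorrelated`, the step-size rule, and the cylinder example are assumptions of this sketch; analytic derivatives are preferable when available.

```python
import math

def propagate_uncorrelated(f, values, sigmas, rel_step=1e-6):
    """First-order standard deviation of f for uncorrelated inputs."""
    total = 0.0
    for i, (v, s) in enumerate(zip(values, sigmas)):
        h = rel_step * max(abs(v), 1.0)
        up = list(values)
        down = list(values)
        up[i] = v + h
        down[i] = v - h
        # Central-difference estimate of the partial derivative df/dx_i
        partial = (f(*up) - f(*down)) / (2.0 * h)
        total += (partial * s) ** 2
    return math.sqrt(total)

# Hypothetical example: cylinder volume V = pi * r^2 * h, r = 2.0 ± 0.02, h = 10.0 ± 0.1
s_volume = propagate_uncorrelated(
    lambda r, h: math.pi * r**2 * h, [2.0, 10.0], [0.02, 0.1]
)
print(s_volume)
```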

Example

Any non-linear differentiable function, $f (a, b)$ , of two variables, $a$ and $b$ , can be expanded as

f \approx f^{0} + \frac{\partial f}{\partial a} a + \frac{\partial f}{\partial b} b

Now take the variance of both sides, using the formula^[11] for the variance of a linear combination of variables:

$\operatorname{Var} (a X + b Y) = a^{2} \operatorname{Var} (X) + b^{2} \operatorname{Var} (Y) + 2 a b \operatorname{Cov} (X, Y)$

hence:

σ_{f}^{2} \approx {| \frac{\partial f}{\partial a} |}^{2} σ_{a}^{2} + {| \frac{\partial f}{\partial b} |}^{2} σ_{b}^{2} + 2 \frac{\partial f}{\partial a} \frac{\partial f}{\partial b} σ_{a b}

where $σ_{f}$ is the standard deviation of the function $f$ , $σ_{a}$ is the standard deviation of $a$ , $σ_{b}$ is the standard deviation of $b$ and $σ_{a b} = σ_{a} σ_{b} ρ_{a b}$ is the covariance between $a$ and $b$ .

In the particular case that $f = a b$ , $\frac{\partial f}{\partial a} = b, \frac{\partial f}{\partial b} = a$ . Then

σ_{f}^{2} \approx b^{2} σ_{a}^{2} + a^{2} σ_{b}^{2} + 2 a b σ_{a b}

{(\frac{σ_{f}}{f})}^{2} \approx {(\frac{σ_{a}}{a})}^{2} + {(\frac{σ_{b}}{b})}^{2} + 2 (\frac{σ_{a}}{a}) (\frac{σ_{b}}{b}) ρ_{a b}

where $ρ_{a b}$ is the correlation between $a$ and $b$ .

When the variables $a$ and $b$ are uncorrelated, $ρ_{a b} = 0$ . Then

{(\frac{σ_{f}}{f})}^{2} \approx {(\frac{σ_{a}}{a})}^{2} + {(\frac{σ_{b}}{b})}^{2} .
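The relative-uncertainty form for a product can be sketched as follows; the function name and the numerical values are hypothetical.

```python
import math

def product_sigma(a, sigma_a, b, sigma_b, rho=0.0):
    """First-order standard deviation of f = a * b with correlation rho."""
    rel_a = sigma_a / a
    rel_b = sigma_b / b
    rel_var = rel_a**2 + rel_b**2 + 2.0 * rho * rel_a * rel_b
    return abs(a * b) * math.sqrt(rel_var)

# Hypothetical values: a = 10 ± 0.3 (3%), b = 5 ± 0.2 (4%)
uncorrelated = product_sigma(10.0, 0.3, 5.0, 0.2)           # about 2.5 (5% of 50)
fully_correlated = product_sigma(10.0, 0.3, 5.0, 0.2, 1.0)  # about 3.5 (7% of 50)
print(uncorrelated, fully_correlated)
```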

Caveats and warnings

Error estimates for non-linear functions are biased on account of using a truncated series expansion. The extent of this bias depends on the nature of the function. For example, the bias on the error calculated for log(1+x) increases as x increases, since the expansion to x is a good approximation only when x is near zero.

For highly non-linear functions, there exist five categories of probabilistic approaches for uncertainty propagation; see Uncertainty quantification for details.

Reciprocal and shifted reciprocal

In the special case of the inverse or reciprocal $1 / B$ , where $B \sim N (0, 1)$ follows a standard normal distribution, the resulting distribution is a reciprocal standard normal distribution, and there is no definable variance.

However, in the slightly more general case of a shifted reciprocal function $1 / (p - B)$ for $B \sim N (μ, σ)$ following a general normal distribution, mean and variance statistics do exist in a principal value sense, if the difference between the pole $p$ and the mean $μ$ is real-valued.

Ratios

Ratios are also problematic; normal approximations exist under certain conditions.

Example formulae

This table shows the variances and standard deviations of simple functions of the real variables $A, B$ , with standard deviations $σ_{A}, σ_{B},$ covariance $σ_{A B} = ρ_{A B} σ_{A} σ_{B}$ , and correlation $ρ_{A B}$ . The real-valued coefficients $a$ and $b$ are assumed exactly known (deterministic), i.e., $σ_{a} = σ_{b} = 0$ .

In the columns "Variance" and "Standard Deviation", A and B should be understood as expectation values (i.e. values around which we're estimating the uncertainty), and $f$ should be understood as the value of the function calculated at the expectation value of $A, B$ .

Function	Variance	Standard Deviation
$f = a A$	$σ_{f}^{2} = a^{2} σ_{A}^{2}$	$σ_{f} = | a | σ_{A}$
$f = a A + b B$	$σ_{f}^{2} = a^{2} σ_{A}^{2} + b^{2} σ_{B}^{2} + 2 a b σ_{A B}$	$σ_{f} = \sqrt{a^{2} σ_{A}^{2} + b^{2} σ_{B}^{2} + 2 a b σ_{A B}}$
$f = a A - b B$	$σ_{f}^{2} = a^{2} σ_{A}^{2} + b^{2} σ_{B}^{2} - 2 a b σ_{A B}$	$σ_{f} = \sqrt{a^{2} σ_{A}^{2} + b^{2} σ_{B}^{2} - 2 a b σ_{A B}}$
$f = A - B$	$σ_{f}^{2} = σ_{A}^{2} + σ_{B}^{2} - 2 σ_{A B}$	$σ_{f} = \sqrt{σ_{A}^{2} + σ_{B}^{2} - 2 σ_{A B}}$
$f = A B$	$σ_{f}^{2} \approx f^{2} [{(\frac{σ_{A}}{A})}^{2} + {(\frac{σ_{B}}{B})}^{2} + 2 \frac{σ_{A B}}{A B}]$	$σ_{f} \approx | f | \sqrt{{(\frac{σ_{A}}{A})}^{2} + {(\frac{σ_{B}}{B})}^{2} + 2 \frac{σ_{A B}}{A B}}$
$f = \frac{A}{B}$	$σ_{f}^{2} \approx f^{2} [{(\frac{σ_{A}}{A})}^{2} + {(\frac{σ_{B}}{B})}^{2} - 2 \frac{σ_{A B}}{A B}]$	$σ_{f} \approx | f | \sqrt{{(\frac{σ_{A}}{A})}^{2} + {(\frac{σ_{B}}{B})}^{2} - 2 \frac{σ_{A B}}{A B}}$
$f = a A^{b}$	$σ_{f}^{2} \approx {(a b A^{b - 1} σ_{A})}^{2} = {(\frac{f b σ_{A}}{A})}^{2}$	$σ_{f} \approx | a b A^{b - 1} σ_{A} | = | \frac{f b σ_{A}}{A} |$
$f = a \ln (b A)$	$σ_{f}^{2} \approx {(a \frac{σ_{A}}{A})}^{2}$	$σ_{f} \approx | a \frac{σ_{A}}{A} |$
$f = a \log_{10} (b A)$	$σ_{f}^{2} \approx {(a \frac{σ_{A}}{A \ln (10)})}^{2}$	$σ_{f} \approx | a \frac{σ_{A}}{A \ln (10)} |$
$f = a e^{b A}$	$σ_{f}^{2} \approx f^{2} {(b σ_{A})}^{2}$ ^[19]	$σ_{f} \approx | f | | (b σ_{A}) |$
$f = a^{b A}$	$σ_{f}^{2} \approx f^{2} (b \ln (a) σ_{A})^{2}$	$σ_{f} \approx | f | | (b \ln (a) σ_{A}) |$
$f = a \sin (b A)$	$σ_{f}^{2} \approx {[a b \cos (b A) σ_{A}]}^{2}$	$σ_{f} \approx | a b \cos (b A) σ_{A} |$
$f = a \cos (b A)$	$σ_{f}^{2} \approx {[a b \sin (b A) σ_{A}]}^{2}$	$σ_{f} \approx | a b \sin (b A) σ_{A} |$
$f = a \tan (b A)$	$σ_{f}^{2} \approx {[a b \sec^{2} (b A) σ_{A}]}^{2}$	$σ_{f} \approx | a b \sec^{2} (b A) σ_{A} |$
$f = A^{B}$	$σ_{f}^{2} \approx f^{2} [{(\frac{B}{A} σ_{A})}^{2} + {(\ln (A) σ_{B})}^{2} + 2 \frac{B \ln (A)}{A} σ_{A B}]$	$σ_{f} \approx | f | \sqrt{{(\frac{B}{A} σ_{A})}^{2} + {(\ln (A) σ_{B})}^{2} + 2 \frac{B \ln (A)}{A} σ_{A B}}$
$f = \sqrt{a A^{2} \pm b B^{2}}$	$σ_{f}^{2} \approx {(\frac{A}{f})}^{2} a^{2} σ_{A}^{2} + {(\frac{B}{f})}^{2} b^{2} σ_{B}^{2} \pm 2 a b \frac{A B}{f^{2}} σ_{A B}$	$σ_{f} \approx \sqrt{{(\frac{A}{f})}^{2} a^{2} σ_{A}^{2} + {(\frac{B}{f})}^{2} b^{2} σ_{B}^{2} \pm 2 a b \frac{A B}{f^{2}} σ_{A B}}$

For uncorrelated variables ( $ρ_{A B} = 0$ , $σ_{A B} = 0$ ) expressions for more complicated functions can be derived by combining simpler functions. For example, repeated multiplication, assuming no correlation, gives

f = A B C; {(\frac{σ_{f}}{f})}^{2} \approx {(\frac{σ_{A}}{A})}^{2} + {(\frac{σ_{B}}{B})}^{2} + {(\frac{σ_{C}}{C})}^{2} .

For the case $f = A B$ we also have Goodman's expression for the exact variance: for the uncorrelated case it is

V (X Y) = E (X)^{2} V (Y) + E (Y)^{2} V (X) + E ((X - E (X))^{2} (Y - E (Y))^{2})

and therefore we have:

σ_{f}^{2} = A^{2} σ_{B}^{2} + B^{2} σ_{A}^{2} + σ_{A}^{2} σ_{B}^{2}
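The difference between the first-order approximation and the exact expression can be made concrete with a small sketch. The function names and values are hypothetical, and the exact form is used here under the assumption that A and B are independent.

```python
def product_variance_first_order(a, var_a, b, var_b):
    """Linearised variance of f = A * B for uncorrelated A and B."""
    return b**2 * var_a + a**2 * var_b

def product_variance_exact(a, var_a, b, var_b):
    """Exact variance of f = A * B, assuming A and B are independent."""
    return a**2 * var_b + b**2 * var_a + var_a * var_b

# Hypothetical values with large relative uncertainties: A = 1 ± 0.5, B = 2 ± 1
approx_var = product_variance_first_order(1.0, 0.25, 2.0, 1.0)  # 2.0
exact_var = product_variance_exact(1.0, 0.25, 2.0, 1.0)         # 2.25
print(approx_var, exact_var)
```

The extra term only matters when the relative uncertainties are large; for small ones the two results nearly coincide.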

Effect of correlation on differences

If A and B are uncorrelated, their difference A-B will have more variance than either of them. An increasing positive correlation ( $ρ_{A B} \to 1$ ) will decrease the variance of the difference, converging to zero variance for perfectly correlated variables with the same variance. On the other hand, a negative correlation ( $ρ_{A B} \to - 1$ ) will further increase the variance of the difference, compared to the uncorrelated case.

For example, the self-subtraction f=A-A has zero variance $σ_{f}^{2} = 0$ only if the variate is perfectly autocorrelated ( $ρ_{A} = 1$ ). If A is uncorrelated, $ρ_{A} = 0$ , then the output variance is twice the input variance, $σ_{f}^{2} = 2 σ_{A}^{2}$ . And if A is perfectly anticorrelated, $ρ_{A} = - 1$ , then the input variance is quadrupled in the output, $σ_{f}^{2} = 4 σ_{A}^{2}$ (notice $1 - ρ_{A} = 2$ for f = aA - aA in the table above).
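The three limiting cases can be checked with the table formula for f = A - B; the function name is hypothetical and unit variances are assumed for illustration.

```python
def difference_variance(sigma_a, sigma_b, rho):
    """Variance of f = A - B given the correlation rho between A and B."""
    return sigma_a**2 + sigma_b**2 - 2.0 * rho * sigma_a * sigma_b

for rho in (1.0, 0.0, -1.0):
    print(rho, difference_variance(1.0, 1.0, rho))  # variances 0.0, 2.0, 4.0
```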

Example calculations

Inverse tangent function

We can calculate the uncertainty propagation for the inverse tangent function as an example of using partial derivatives to propagate error.

Define

f (x) = \arctan (x),

where $Δ_{x}$ is the absolute uncertainty on our measurement of $x$ . The derivative of $f (x)$ with respect to $x$ is

\frac{d f}{d x} = \frac{1}{1 + x^{2}} .

Therefore, our propagated uncertainty is

Δ_{f} \approx \frac{Δ_{x}}{1 + x^{2}},

where $Δ_{f}$ is the absolute propagated uncertainty.
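A brief sketch compares the propagated uncertainty with the actual change in the function; the function name and the values x = 1 and Δ_x = 0.1 are hypothetical.

```python
import math

def arctan_uncertainty(x, delta_x):
    """First-order absolute uncertainty of arctan(x)."""
    return abs(delta_x) / (1.0 + x * x)

x, delta_x = 1.0, 0.1  # hypothetical measurement
delta_f = arctan_uncertainty(x, delta_x)
# Half the actual spread of arctan over [x - delta_x, x + delta_x], for comparison
half_spread = (math.atan(x + delta_x) - math.atan(x - delta_x)) / 2.0
print(delta_f, half_spread)
```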

Resistance measurement

A practical application is an experiment in which one measures current, $I$ , and voltage, $V$ , on a resistor in order to determine the resistance, $R$ , using Ohm's law, $R = V / I$ .

Given the measured variables with uncertainties, $I \pm σ_{I}$ and $V \pm σ_{V}$ , and neglecting their possible correlation, the uncertainty in the computed quantity, $σ_{R}$ , is:

σ_{R} \approx \sqrt{σ_{V}^{2} {(\frac{1}{I})}^{2} + σ_{I}^{2} {(\frac{- V}{I^{2}})}^{2}} = R \sqrt{{(\frac{σ_{V}}{V})}^{2} + {(\frac{σ_{I}}{I})}^{2}} .
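The resistance formula can be evaluated as in the following sketch. The function name and the readings are hypothetical.

```python
import math

def resistance_with_uncertainty(voltage, sigma_v, current, sigma_i):
    """Return (R, sigma_R) for R = V / I, neglecting any V-I correlation."""
    resistance = voltage / current
    # Partial derivatives: dR/dV = 1/I and dR/dI = -V/I^2
    sigma_r = math.sqrt(
        (sigma_v / current) ** 2 + (sigma_i * voltage / current**2) ** 2
    )
    return resistance, sigma_r

# Hypothetical readings: V = 12.0 ± 0.1 V, I = 2.0 ± 0.05 A
resistance, sigma_r = resistance_with_uncertainty(12.0, 0.1, 2.0, 0.05)
print(f"R = {resistance:.2f} ± {sigma_r:.2f} ohm")  # R = 6.00 ± 0.16 ohm
```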

Volatility (finance)

From Wikipedia, the free encyclopedia

The VIX

In finance, volatility (usually denoted by σ) is the degree of variation of a trading price series over time, usually measured by the standard deviation of logarithmic returns.

Historic volatility measures a time series of past market prices. Implied volatility looks forward in time, being derived from the market price of a market-traded derivative (in particular, an option).

Volatility terminology

Volatility as described here refers to the actual volatility, more specifically:

actual current volatility of a financial instrument for a specified period (for example 30 days or 90 days), based on historical prices over the specified period with the last observation the most recent price.
actual historical volatility which refers to the volatility of a financial instrument over a specified period but with the last observation on a date in the past
- near synonymous is realized volatility, the square root of the realized variance, in turn calculated using the sum of squared returns divided by the number of observations.
actual future volatility which refers to the volatility of a financial instrument over a specified period starting at the current time and ending at a future date (normally the expiry date of an option)
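Realized volatility as described in the list above can be sketched as follows. The use of logarithmic returns, the function names, and the price series are assumptions of this example.

```python
import math

def log_returns(prices):
    """Logarithmic returns between consecutive prices."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

def realized_volatility(returns):
    """Square root of the realized variance (mean of squared returns)."""
    return math.sqrt(sum(r * r for r in returns) / len(returns))

prices = [100.0, 101.0, 99.5, 100.2, 100.9]  # hypothetical closing prices
returns = log_returns(prices)
print(realized_volatility(returns))
```

Note that this measure does not subtract the mean return, unlike a sample standard deviation.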

Now turning to implied volatility, we have:

historical implied volatility which refers to the implied volatility observed from historical prices of the financial instrument (normally options)
current implied volatility which refers to the implied volatility observed from current prices of the financial instrument
future implied volatility which refers to the implied volatility observed from future prices of the financial instrument

For a financial instrument whose price follows a Gaussian random walk, or Wiener process, the width of the distribution increases as time increases. This is because there is an increasing probability that the instrument's price will be farther away from the initial price as time increases. However, rather than increase linearly, the volatility increases with the square-root of time as time increases, because some fluctuations are expected to cancel each other out, so the most likely deviation after twice the time will not be twice the distance from zero.

Since observed price changes do not follow Gaussian distributions, others such as the Lévy distribution are often used. These can capture attributes such as "fat tails". More generally, volatility is a statistical measure of the dispersion of any random variable, such as a market parameter, around its average.

Mathematical definition

For any fund that evolves randomly with time, volatility is defined as the standard deviation of a sequence of random variables, each of which is the return of the fund over some corresponding sequence of (equally sized) times.

Thus, "annualized" volatility σ_annually is the standard deviation of an instrument's yearly logarithmic returns.

The generalized volatility σ_T for time horizon T in years is expressed as:

σ_{T} = σ_{annually} \sqrt{T} .

Therefore, if the daily logarithmic returns of a stock have a standard deviation of σ_daily and the time period of returns is P in trading days, the annualized volatility is

σ_{annually} = σ_{daily} \sqrt{P},

so that, for a time horizon of T years,

σ_{T} = σ_{daily} \sqrt{P T} .

A common assumption is that P = 252 trading days in any given year. Then, if σ_daily = 0.01, the annualized volatility is

σ_{annually} = 0.01 \sqrt{252} = 0.1587.

The monthly volatility (i.e. $T = \frac{1}{12}$ of a year) is

σ_{monthly} = 0.01 \sqrt{\frac{252}{12}} = 0.0458.
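The two conversions above can be reproduced with a short sketch; the function name and the constant `TRADING_DAYS` are names chosen for this example.

```python
import math

TRADING_DAYS = 252  # common assumption for P

def scale_daily_volatility(sigma_daily, horizon_years, periods_per_year=TRADING_DAYS):
    """Scale a daily volatility to a horizon of T years using sqrt(P * T)."""
    return sigma_daily * math.sqrt(periods_per_year * horizon_years)

annual = scale_daily_volatility(0.01, 1.0)
monthly = scale_daily_volatility(0.01, 1.0 / 12.0)
print(f"{annual:.4f} {monthly:.4f}")  # 0.1587 0.0458
```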

The formulas used above to convert returns or volatility measures from one time period to another assume a particular underlying model or process. These formulas are accurate extrapolations of a random walk, or Wiener process, whose steps have finite variance. However, more generally, for natural stochastic processes, the precise relationship between volatility measures for different time periods is more complicated. Some use the Lévy stability exponent α to extrapolate natural processes:

σ_{T} = T^{1 / α} σ .

If α = 2 the Wiener process scaling relation is obtained, but some people believe α < 2 for financial activities such as stocks, indexes and so on. This was discovered by Benoît Mandelbrot, who looked at cotton prices and found that they followed a Lévy alpha-stable distribution with α = 1.7. (See New Scientist, 19 April 1997.)
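The effect of the exponent can be illustrated with a small sketch. The function name is hypothetical, and the horizon is assumed to be expressed in units of the period over which σ is measured (here, trading days).

```python
def levy_scaled_volatility(sigma, horizon, alpha=2.0):
    """Scale volatility by horizon ** (1 / alpha); alpha = 2 is the Wiener case."""
    if not 0.0 < alpha <= 2.0:
        raise ValueError("alpha is expected to lie in (0, 2]")
    return horizon ** (1.0 / alpha) * sigma

wiener = levy_scaled_volatility(0.01, 252)              # same as 0.01 * sqrt(252)
heavy_tailed = levy_scaled_volatility(0.01, 252, 1.7)   # grows faster with horizon
print(wiener, heavy_tailed)
```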

Volatility origin

Much research has been devoted to modeling and forecasting the volatility of financial returns, and yet few theoretical models explain how volatility comes to exist in the first place.

Roll (1984) shows that volatility is affected by market microstructure. Glosten and Milgrom (1985) show that at least one source of volatility can be explained by the liquidity provision process. When market makers infer the possibility of adverse selection, they adjust their trading ranges, which in turn increases the band of price oscillation.

In September 2019, JPMorgan Chase quantified the effect of US President Donald Trump's tweets on volatility and called the resulting measure the Volfefe index, a name combining "volatility" and the "covfefe" meme.

Volatility for investors

Investors care about volatility for at least eight reasons:

The wider the swings in an investment's price, the harder emotionally it is to not worry;
Price volatility of a trading instrument can define position sizing in a portfolio;
When certain cash flows from selling a security are needed at a specific future date, higher volatility means a greater chance of a shortfall;
Higher volatility of returns while saving for retirement results in a wider distribution of possible final portfolio values;
Higher volatility of return when retired gives withdrawals a larger permanent impact on the portfolio's value;
Price volatility presents opportunities to buy assets cheaply and sell when overpriced;
Portfolio volatility has a negative impact on the compound annual growth rate (CAGR) of that portfolio;
Volatility affects pricing of options, being a parameter of the Black–Scholes model.

Volatility versus direction

Volatility does not measure the direction of price changes, merely their dispersion. This is because when calculating standard deviation (or variance), all differences are squared, so that negative and positive differences are combined into one quantity. Two instruments with different volatilities may have the same expected return, but the instrument with higher volatility will have larger swings in values over a given period of time.

For example, a lower volatility stock may have an expected (average) return of 7%, with annual volatility of 5%. This would indicate returns from approximately negative 3% to positive 17% most of the time (19 times out of 20, or 95% via a two standard deviation rule). A higher volatility stock, with the same expected return of 7% but with annual volatility of 20%, would indicate returns from approximately negative 33% to positive 47% most of the time (19 times out of 20, or 95%). These estimates assume a normal distribution; in reality stock returns are found to be leptokurtic.
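The two-standard-deviation bands quoted above can be reproduced as follows, together with the coverage they imply under a normal distribution. The function name is hypothetical; `statistics.NormalDist` is part of the Python standard library.

```python
from statistics import NormalDist

def two_sigma_range(expected_return, volatility):
    """Return the (low, high) band at two standard deviations around the mean."""
    return expected_return - 2 * volatility, expected_return + 2 * volatility

for vol in (0.05, 0.20):
    low, high = two_sigma_range(0.07, vol)
    dist = NormalDist(mu=0.07, sigma=vol)
    coverage = dist.cdf(high) - dist.cdf(low)
    print(f"vol={vol:.0%}: {low:+.0%} to {high:+.0%}, coverage {coverage:.1%}")
```

The coverage is slightly above 95% in both cases, since it depends only on the number of standard deviations.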

Volatility over time

Although the Black–Scholes equation assumes predictable constant volatility, this is not observed in real markets. Models that relax this assumption include Emanuel Derman and Iraj Kani's and Bruno Dupire's local volatility, Poisson processes in which volatility jumps to new levels with a predictable frequency, and the increasingly popular Heston model of stochastic volatility.

It is common knowledge that many types of assets experience periods of high and low volatility. That is, during some periods, prices go up and down quickly, while during other times they barely move at all. In the foreign exchange market, price changes are seasonally heteroskedastic with periods of one day and one week.

Periods when prices fall quickly (a crash) are often followed by prices going down even more, or going up by an unusual amount. Also, a time when prices rise quickly (a possible bubble) may often be followed by prices going up even more, or going down by an unusual amount.

Most typically, extreme movements do not appear 'out of nowhere'; they are presaged by larger movements than usual. This is termed autoregressive conditional heteroskedasticity. Whether such large movements have the same direction, or the opposite, is more difficult to say. And an increase in volatility does not always presage a further increase—the volatility may simply go back down again.

Volatility depends not only on the period over which it is measured but also on the selected time resolution. The effect is observed because the information flow between short-term and long-term traders is asymmetric. As a result, volatility measured with high resolution contains information that is not covered by low resolution volatility and vice versa.

The risk parity weighted volatility of the three assets gold, Treasury bonds and the Nasdaq, acting as a proxy for the market portfolio, seems to have a low point at 4% after turning upwards for the 8th time since 1974 at this reading in the summer of 2014.

Alternative measures of volatility

Some authors point out that realized volatility and implied volatility are backward and forward looking measures, and do not reflect current volatility. To address that issue, alternative ensemble measures of volatility have been suggested. One of the measures is defined as the standard deviation of ensemble returns instead of time series of returns. Another considers the regular sequence of directional changes as the proxy for the instantaneous volatility.

Implied volatility parametrisation

There exist several known parametrisations of the implied volatility surface: Schonbucher, SVI and gSVI.

Crude volatility estimation

Using a simplification of the above formula it is possible to estimate annualized volatility based solely on approximate observations. Suppose you notice that a market price index, which has a current value near 10,000, has moved about 100 points a day, on average, for many days. This would constitute a 1% daily movement, up or down.

To annualize this, you can use the "rule of 16", that is, multiply by 16 to get 16% as the annual volatility. The rationale for this is that 16 is the square root of 256, which is approximately the number of trading days in a year (252). This also uses the fact that the standard deviation of the sum of n independent variables (with equal standard deviations) is √n times the standard deviation of the individual variables.

The average magnitude of the observations is merely an approximation of the standard deviation of the market index. Assuming that the market index daily changes are normally distributed with mean zero and standard deviation σ, the expected value of the magnitude of the observations is $\sqrt{2 / π} \, σ \approx 0.798 σ$ . The net effect is that this crude approach underestimates the true volatility by about 20%.
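The crude estimate and its correction can be sketched numerically. The variable names are hypothetical, and the correction assumes zero-mean normally distributed daily changes as stated above.

```python
import math

# E|X| / sigma for a zero-mean normal variable X
mean_abs_factor = math.sqrt(2.0 / math.pi)

avg_daily_move = 0.01                  # observed average magnitude (1% per day)
crude_annual = 16 * avg_daily_move     # "rule of 16" estimate
# Dividing by the factor converts the mean magnitude into a standard deviation
corrected_annual = crude_annual / mean_abs_factor

print(f"{mean_abs_factor:.3f} {crude_annual:.3f} {corrected_annual:.3f}")
```

Under these assumptions the corrected figure is close to 20% a year, against the crude 16%.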

Estimate of compound annual growth rate (CAGR)

Consider the Taylor series:

\log (1 + y) = y - \frac{1}{2} y^{2} + \frac{1}{3} y^{3} - \frac{1}{4} y^{4} + \dots

Taking only the first two terms one has:

\mathrm{CAGR} \approx \mathrm{AR} - \frac{1}{2} σ^{2}

Volatility thus mathematically represents a drag on the CAGR (formalized as the "volatility tax"). Realistically, most financial assets have negative skewness and leptokurtosis, so this formula tends to be over-optimistic. Some people use the formula:

\mathrm{CAGR} \approx \mathrm{AR} - \frac{1}{2} k σ^{2}

for a rough estimate, where k is an empirical factor (typically five to ten).
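The approximation can be compared with an exactly compounded growth rate in a small sketch. Here AR is read as the arithmetic average return, which is an assumption of this example, as are the function name and the return series.

```python
import math
import statistics

def cagr_estimate(arithmetic_return, sigma, k=1.0):
    """Approximate CAGR as AR - k * sigma^2 / 2."""
    return arithmetic_return - 0.5 * k * sigma**2

yearly_returns = [0.30, -0.10, 0.30, -0.10]  # hypothetical yearly returns
ar = statistics.fmean(yearly_returns)        # arithmetic average, about 0.10
sigma = statistics.pstdev(yearly_returns)    # population std. dev., about 0.20
estimate = cagr_estimate(ar, sigma)
# Exact compound growth rate of the same series, for comparison
exact = math.prod(1.0 + r for r in yearly_returns) ** (1.0 / len(yearly_returns)) - 1.0
print(f"{estimate:.4f} {exact:.4f}")
```

For this series the estimate is about 8.0% against an exact value of about 8.2%, both below the 10% arithmetic average.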

Criticisms of volatility forecasting models

Performance of VIX (left) compared to past volatility (right) as 30-day volatility predictors, for the period of Jan 1990-Sep 2009. Volatility is measured as the standard deviation of S&P500 one-day returns over a month's period. The blue lines indicate linear regressions, resulting in the correlation coefficients r shown. Note that VIX has virtually the same predictive power as past volatility, insofar as the shown correlation coefficients are nearly identical.

Despite the sophisticated composition of most volatility forecasting models, critics claim that their predictive power is similar to that of plain-vanilla measures, such as simple past volatility, especially out-of-sample, where different data are used to estimate the models and to test them. Other works have agreed, but claim critics failed to correctly implement the more complicated models. Some practitioners and portfolio managers seem to completely ignore or dismiss volatility forecasting models. For example, Nassim Taleb famously titled one of his Journal of Portfolio Management papers "We Don't Quite Know What We are Talking About When We Talk About Volatility". On a similar note, Emanuel Derman expressed his disillusion with the enormous supply of empirical models unsupported by theory. He argues that, while "theories are attempts to uncover the hidden principles underpinning the world around us, as Albert Einstein did with his theory of relativity", we should remember that "models are metaphors – analogies that describe one thing relative to another".