From Wikipedia, the free encyclopedia
https://en.wikipedia.org/wiki/Statistical_hypothesis_testing

A statistical hypothesis is a hypothesis that is testable on the basis of observed data modelled as the realised values taken by a collection of random variables. The data are assumed to come from a joint probability distribution belonging to some set of possible joint distributions. The hypothesis being tested, called the null hypothesis, is exactly that set of possible probability distributions. A statistical hypothesis test is a method of statistical inference. An alternative hypothesis is proposed for the probability distribution of the data, either explicitly or only informally. The comparison of the two models is deemed statistically significant if, according to a threshold probability (the significance level), the data would be unlikely to occur if the null hypothesis were true. A hypothesis test specifies which outcomes of a study may lead to a rejection of the null hypothesis at a pre-specified level of significance, while using a pre-chosen measure of deviation from that hypothesis (the test statistic, or goodness-of-fit measure). The pre-chosen level of significance is the maximal allowed "false positive rate": it bounds the risk of incorrectly rejecting a true null hypothesis.
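As a rough illustration of the ingredients just described (a null hypothesis, a test statistic, a pre-chosen significance level, and a rejection decision), the sketch below runs a two-sided z-test for a mean. The setting is an assumption made for the example and is not taken from the text above: the data are treated as normally distributed with a known standard deviation `sigma`, and the function name `two_sided_z_test` and the measurements are made up.

```python
import math
from statistics import NormalDist


def two_sided_z_test(sample, mu0, sigma, alpha=0.05):
    """Test H0: mean == mu0, assuming normal data with known sigma.

    Returns the test statistic, the p-value and the rejection decision.
    """
    n = len(sample)
    sample_mean = sum(sample) / n
    # Test statistic: standardised deviation of the sample mean from mu0.
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    # p-value: probability, under H0, of a deviation at least this large.
    p_value = 2.0 * (1.0 - NormalDist().cdf(abs(z)))
    # Reject H0 when the p-value falls below the significance level.
    return z, p_value, p_value < alpha


# Hypothetical measurements, invented purely for illustration.
measurements = [5.1, 4.9, 5.3, 5.2, 5.0, 5.4, 5.1, 5.2]
z, p_value, reject = two_sided_z_test(measurements, mu0=5.0, sigma=0.2)
print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Note that `alpha` is fixed before the data are examined; choosing it afterwards would undermine the control of the false positive rate.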

The process of distinguishing between the null hypothesis and the alternative hypothesis is aided by considering two conceptual types of errors. The first type of error occurs when the null hypothesis is wrongly rejected (a false positive). The second type of error occurs when the null hypothesis is wrongly not rejected (a false negative). The two types are known as type I and type II errors.
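The two error types can be estimated by simulation. The sketch below is only an illustration under assumed conditions: normal data with known standard deviation, a two-sided z-test, and hypothetical parameter values (`n=25`, a true mean of 0.5 under the alternative). It repeatedly draws samples and counts how often the null hypothesis is rejected.

```python
import math
import random
from statistics import NormalDist


def rejection_rate(true_mean, mu0=0.0, sigma=1.0, n=25,
                   alpha=0.05, trials=4000, seed=0):
    """Fraction of simulated samples in which H0: mean == mu0 is rejected."""
    rng = random.Random(seed)
    # Critical value of the two-sided z-test at level alpha.
    critical = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
        if abs(z) > critical:
            rejections += 1
    return rejections / trials


# Type I error: H0 is true (true mean equals mu0) but is rejected.
type_1_rate = rejection_rate(true_mean=0.0)
# Type II error: H0 is false (true mean is 0.5) but is not rejected.
type_2_rate = 1.0 - rejection_rate(true_mean=0.5)
print(f"estimated type I rate:  {type_1_rate:.3f}")
print(f"estimated type II rate: {type_2_rate:.3f}")
```

The estimated type I rate should land close to the chosen `alpha`, while the type II rate depends on the sample size and on how far the true mean lies from `mu0`.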

Hypothesis tests based on statistical significance are another way of expressing confidence intervals (more precisely, confidence sets). In other words, every hypothesis test based on significance can be obtained via a confidence interval, and every confidence interval can be obtained via a hypothesis test based on significance: a confidence set at level 1 − α consists of exactly those parameter values that a test at significance level α would not reject.
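The correspondence between tests and confidence intervals can be checked numerically. The following sketch assumes, for the sake of the example, normal data with a known standard deviation and a two-sided z-test; the data and function names are hypothetical. A candidate mean `mu0` lies inside the 95% interval exactly when the 5% test does not reject it.

```python
import math
from statistics import NormalDist


def confidence_interval(sample, sigma, alpha=0.05):
    """Two-sided (1 - alpha) confidence interval for the mean, known sigma."""
    n = len(sample)
    mean = sum(sample) / n
    half_width = NormalDist().inv_cdf(1.0 - alpha / 2.0) * sigma / math.sqrt(n)
    return mean - half_width, mean + half_width


def z_test_rejects(sample, mu0, sigma, alpha=0.05):
    """True if the two-sided z-test rejects H0: mean == mu0 at level alpha."""
    n = len(sample)
    z = (sum(sample) / n - mu0) / (sigma / math.sqrt(n))
    return abs(z) > NormalDist().inv_cdf(1.0 - alpha / 2.0)


# Hypothetical measurements, invented purely for illustration.
measurements = [5.1, 4.9, 5.3, 5.2, 5.0, 5.4, 5.1, 5.2]
low, high = confidence_interval(measurements, sigma=0.2)
print(f"95% confidence interval: ({low:.4f}, {high:.4f})")
for mu0 in (4.9, 5.0, 5.1, 5.2, 5.3):
    inside = low <= mu0 <= high
    rejected = z_test_rejects(measurements, mu0, sigma=0.2)
    print(f"mu0 = {mu0}: inside interval = {inside}, rejected = {rejected}")
```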

Significance-based hypothesis testing is the most common framework for statistical hypothesis testing. An alternative framework is to specify a set of statistical models, one for each candidate hypothesis, and then use model selection techniques to choose the most appropriate model. The most common selection techniques are based on either the Akaike information criterion or the Bayes factor. This is arguably less an "alternative framework" than a more complex one: it applies when one wants to distinguish between many possible hypotheses, not just two. Alternatively, one can see it as a hybrid between testing and estimation, where one of the parameters is discrete and specifies which of a hierarchy of increasingly complex models is correct.
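A minimal sketch of model selection with the Akaike information criterion, AIC = 2k − 2 ln L, where k is the number of fitted parameters and L the maximised likelihood. The two candidate models here are assumptions chosen for illustration (a normal distribution with mean fixed at zero versus one with a freely estimated mean); they are not taken from the text above.

```python
import math


def gaussian_log_likelihood(sample, mean, variance):
    """Log-likelihood of the sample under a normal(mean, variance) model."""
    n = len(sample)
    squared_error = sum((x - mean) ** 2 for x in sample)
    return (-0.5 * n * math.log(2.0 * math.pi * variance)
            - squared_error / (2.0 * variance))


def aic(log_likelihood, num_params):
    """Akaike information criterion; lower values are preferred."""
    return 2.0 * num_params - 2.0 * log_likelihood


def compare_models(sample):
    """Return the name of the lower-AIC model and the scores of both."""
    n = len(sample)
    # Model A: mean fixed at 0, variance fitted by maximum likelihood (k = 1).
    var_a = sum(x * x for x in sample) / n
    # Model B: mean and variance both fitted by maximum likelihood (k = 2).
    mean_b = sum(sample) / n
    var_b = sum((x - mean_b) ** 2 for x in sample) / n
    scores = {
        "mean fixed at 0": aic(gaussian_log_likelihood(sample, 0.0, var_a), 1),
        "mean estimated": aic(gaussian_log_likelihood(sample, mean_b, var_b), 2),
    }
    return min(scores, key=scores.get), scores


# Hypothetical data, invented purely for illustration.
best, scores = compare_models([5.1, 4.9, 5.3, 5.2, 5.0, 5.4, 5.1, 5.2])
print(best, scores)
```

The penalty term 2k is what keeps the more flexible model from winning automatically: when the extra parameter does not improve the fit, the simpler model has the lower AIC.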

  • Null hypothesis significance testing is the name for a version of hypothesis testing with no explicit mention of possible alternatives, and not much consideration of error rates. It was championed by Ronald Fisher in a context in which he downplayed any explicit choice of alternative hypothesis and consequently paid no attention to the power of a test. One simply set up a null hypothesis as a kind of straw man, or more kindly, as a formalisation of a standard, establishment, default idea of how things were. One tried to overthrow this conventional view by showing that it led to the conclusion that something extremely unlikely had happened, thereby discrediting the theory.

The testing process