A Medley of Potpourri

Monday, January 15, 2024

Causal model

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Causal_model

In the philosophy of science, a causal model (or structural causal model) is a conceptual model that describes the causal mechanisms of a system. Several types of causal notation may be used in the development of a causal model. Causal models can improve study designs by providing clear rules for deciding which independent variables need to be included/controlled for.

They can allow some questions to be answered from existing observational data without the need for an interventional study such as a randomized controlled trial. Some interventional studies are inappropriate for ethical or practical reasons, meaning that without a causal model, some hypotheses cannot be tested.

Causal models can help with the question of external validity (whether results from one study apply to unstudied populations). Causal models can allow data from multiple studies to be merged (in certain circumstances) to answer questions that cannot be answered by any individual data set.

Causal models have found applications in signal processing, epidemiology and machine learning.

Definition

Causal models are mathematical models representing causal relationships within an individual system or population. They facilitate inferences about causal relationships from statistical data. They can teach us a good deal about the epistemology of causation, and about the relationship between causation and probability. They have also been applied to topics of interest to philosophers, such as the logic of counterfactuals, decision theory, and the analysis of actual causation.
— Stanford Encyclopedia of Philosophy

Judea Pearl defines a causal model as an ordered triple $⟨ U, V, E ⟩$ , where U is a set of exogenous variables whose values are determined by factors outside the model; V is a set of endogenous variables whose values are determined by factors within the model; and E is a set of structural equations that express the value of each endogenous variable as a function of the values of the other variables in U and V.
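The triple can be made concrete with a small sketch. Everything below is an illustrative assumption rather than part of Pearl's definition: the variable names, the two equations and the helper `solve` are hypothetical, and the model is assumed to be acyclic so the endogenous variables can be evaluated in a fixed order.

```python
# Hypothetical toy model: U, E and ORDER mirror the triple <U, V, E>;
# the specific equations and values are made up for illustration.

def solve(exogenous, equations, order, do=None):
    """Evaluate endogenous variables in causal order.

    `do` maps a variable name to a forced value, which replaces that
    variable's structural equation (an intervention).
    """
    do = do or {}
    values = dict(exogenous)
    for name in order:
        if name in do:
            values[name] = do[name]
        else:
            values[name] = equations[name](values)
    return values

U = {"u_x": 1.0, "u_y": 0.5}              # exogenous variables
E = {                                      # one structural equation per endogenous variable
    "x": lambda v: v["u_x"],
    "y": lambda v: 2.0 * v["x"] + v["u_y"],
}
ORDER = ["x", "y"]                         # endogenous variables V, causes before effects

observed = solve(U, E, ORDER)
intervened = solve(U, E, ORDER, do={"x": 3.0})
```

With these assumed values, `observed["y"]` is 2.5 and `intervened["y"]` is 6.5, because forcing x overrides its equation while the equation for y is left unchanged.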

History

Aristotle defined a taxonomy of causality, including material, formal, efficient and final causes. Hume rejected Aristotle's taxonomy in favor of counterfactuals. At one point, he denied that objects have "powers" that make one a cause and another an effect. Later he adopted "if the first object had not been, the second had never existed" ("but-for" causation).

In the late 19th century, the discipline of statistics began to form. After a years-long effort to identify causal rules for domains such as biological inheritance, Galton introduced the concept of mean regression (epitomized by the sophomore slump in sports) which later led him to the non-causal concept of correlation.

As a positivist, Pearson expunged the notion of causality from much of science as an unprovable special case of association and introduced the correlation coefficient as the metric of association. He wrote, "Force as a cause of motion is exactly the same as a tree god as a cause of growth" and that causation was only a "fetish among the inscrutable arcana of modern science". Pearson founded Biometrika and the Biometrics Lab at University College London, which became the world leader in statistics.

In 1908 Hardy and Weinberg solved the problem of trait stability that had led Galton to abandon causality, by resurrecting Mendelian inheritance.

In 1921 Wright's path analysis became the theoretical ancestor of causal modeling and causal graphs. He developed this approach while attempting to untangle the relative impacts of heredity, development and environment on guinea pig coat patterns. He backed up his then-heretical claims by showing how such analyses could explain the relationship between guinea pig birth weight, in utero time and litter size. Opposition to these ideas by prominent statisticians led them to be ignored for the following 40 years (except among animal breeders). Instead scientists relied on correlations, partly at the behest of Wright's critic (and leading statistician), Fisher. One exception was Burks, a student who in 1926 was the first to apply path diagrams to represent a mediating influence (mediator) and to assert that holding a mediator constant induces errors. She may have invented path diagrams independently.

In 1923, Neyman introduced the concept of a potential outcome, but his paper was not translated from Polish to English until 1990.

In 1958 Cox warned that controlling for a variable Z is valid only if it is highly unlikely to be affected by independent variables.

In the 1960s, Duncan, Blalock, Goldberger and others rediscovered path analysis. While reading Blalock's work on path diagrams, Duncan remembered a lecture by Ogburn twenty years earlier that mentioned a paper by Wright that in turn mentioned Burks.

Sociologists originally called causal models structural equation modeling, but once it became a rote method, it lost its utility, leading some practitioners to reject any relationship to causality. Economists adopted the algebraic part of path analysis, calling it simultaneous equation modeling. However, economists still avoided attributing causal meaning to their equations.

Sixty years after his first paper, Wright published a piece that recapitulated it, following Karlin et al.'s critique, which objected that it handled only linear relationships and that robust, model-free presentations of data were more revealing.

In 1973 Lewis advocated replacing correlation with but-for causality (counterfactuals). He referred to humans' ability to envision alternative worlds in which a cause did or did not occur, and in which an effect appeared only following its cause. In 1974 Rubin introduced the notion of "potential outcomes" as a language for asking causal questions.

In 1983 Cartwright proposed that any factor that is "causally relevant" to an effect be conditioned on, moving beyond simple probability as the only guide.

In 1986 Baron and Kenny introduced principles for detecting and evaluating mediation in a system of linear equations. As of 2014 their paper was the 33rd most-cited of all time. That year Greenland and Robins introduced the "exchangeability" approach to handling confounding by considering a counterfactual. They proposed assessing what would have happened to the treatment group if they had not received the treatment and comparing that outcome to that of the control group. If they matched, confounding was said to be absent.

Ladder of causation

Pearl's causal metamodel involves a three-level abstraction he calls the ladder of causation. The lowest level, Association (seeing/observing), entails the sensing of regularities or patterns in the input data, expressed as correlations. The middle level, Intervention (doing), predicts the effects of deliberate actions, expressed as causal relationships. The highest level, Counterfactuals (imagining), involves constructing a theory of (part of) the world that explains why specific actions have specific effects and what happens in the absence of such actions.

Association

One object is associated with another if observing one changes the probability of observing the other. Example: shoppers who buy toothpaste are more likely to also buy dental floss. Mathematically:

P(\text{floss} \mid \text{toothpaste})

or the probability of (purchasing) floss given (the purchase of) toothpaste. Associations can also be measured via computing the correlation of the two events. Associations have no causal implications. One event could cause the other, the reverse could be true, or both events could be caused by some third event (unhappy hygienist shames shopper into treating their mouth better).

Intervention

This level asserts specific causal relationships between events. Causality is assessed by experimentally performing some action that affects one of the events. Example: after doubling the price of toothpaste, what would be the new probability of purchasing? Causality cannot be established by examining history (of price changes) because the price change may have been for some other reason that could itself affect the second event (a tariff that increases the price of both goods). Mathematically:

P(\text{floss} \mid do(\text{toothpaste}))

where do is an operator that signals the experimental intervention (doubling the price). The operator indicates performing the minimal change in the world necessary to create the intended effect, a "mini-surgery" on the model with as little change from reality as possible.
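The gap between seeing and doing can be sketched with exact arithmetic on a toy model. All numbers below are assumptions chosen for illustration. The sketch also simplifies the intervention to forcing the toothpaste purchase itself (rather than changing its price), and it assumes the only link between the two purchases is a hidden common cause H (the hygienist).

```python
# Assumed toy probabilities; H is the hidden common cause (hygienist shaming).
P_H = 0.3                          # P(H = 1)
P_T_GIVEN_H = {1: 0.9, 0: 0.4}     # P(toothpaste = 1 | H)
P_F_GIVEN_H = {1: 0.8, 0: 0.2}     # P(floss = 1 | H); toothpaste has no effect on floss

def p_h(h):
    return P_H if h == 1 else 1 - P_H

def p_floss_given_toothpaste():
    """P(floss = 1 | toothpaste = 1): observing the purchase is evidence about H."""
    joint = sum(p_h(h) * P_T_GIVEN_H[h] * P_F_GIVEN_H[h] for h in (0, 1))
    evidence = sum(p_h(h) * P_T_GIVEN_H[h] for h in (0, 1))
    return joint / evidence

def p_floss_do_toothpaste():
    """P(floss = 1 | do(toothpaste = 1)): forcing the purchase cuts the H -> T
    arrow, so H keeps its prior distribution."""
    return sum(p_h(h) * P_F_GIVEN_H[h] for h in (0, 1))
```

Under these assumptions the observational probability is about 0.49 while the interventional one is 0.38, which equals the baseline rate of floss purchases: the association exists even though toothpaste has no causal effect on floss in this toy model.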

Counterfactuals

The highest level, counterfactual, involves consideration of an alternate version of a past event, or what would happen under different circumstances for the same experimental unit. For example, what is the probability that, if a store had doubled the price of floss, the toothpaste-purchasing shopper would still have bought it?

P(\text{floss} \mid \text{toothpaste}, \text{price} \times 2)

Counterfactuals can indicate the existence of a causal relationship. Models that can answer counterfactuals allow precise interventions whose consequences can be predicted. At the extreme, such models are accepted as physical laws (as in the laws of physics, e.g., inertia, which says that if force is not applied to a stationary object, it will not move).

Causality

Causality vs correlation

Statistics revolves around the analysis of relationships among multiple variables. Traditionally, these relationships are described as correlations, associations without any implied causal relationships. Causal models attempt to extend this framework by adding the notion of causal relationships, in which changes in one variable cause changes in others.

Twentieth century definitions of causality relied purely on probabilities/associations. One event ( $X$ ) was said to cause another if it raises the probability of the other ( $Y$ ). Mathematically this is expressed as:

P(Y \mid X) > P(Y)

Such definitions are inadequate because other relationships (e.g., a common cause for $X$ and $Y$ ) can satisfy the condition. Causality is relevant to the second ladder step. Associations are on the first step and provide only evidence to the latter.

A later definition attempted to address this ambiguity by conditioning on background factors. Mathematically:

P(Y \mid X, K = k) > P(Y \mid K = k)

where $K$ is the set of background variables and $k$ represents the values of those variables in a specific context. However, the required set of background variables is indeterminate (multiple sets may increase the probability), as long as probability is the only criterion.

Other attempts to define causality include Granger causality, a statistical hypothesis test based on the idea that causality (in economics) can be assessed by measuring the ability to predict the future values of one time series using prior values of another time series.

Types

A cause can be necessary, sufficient, contributory or some combination.

Necessary causes

For x to be a necessary cause of y, the presence of y must imply the prior occurrence of x. The presence of x, however, does not imply that y will occur. Necessary causes are also known as "but-for" causes, as in y would not have occurred but for the occurrence of x.

Sufficient causes

For x to be a sufficient cause of y, the presence of x must imply the subsequent occurrence of y. However, another cause z may independently cause y. Thus the presence of y does not require the prior occurrence of x.

Contributory causes

For x to be a contributory cause of y, the presence of x must increase the likelihood of y. If the likelihood is 100%, then x is instead called sufficient. A contributory cause may also be necessary.

Model

Causal diagram

A causal diagram is a directed graph that displays causal relationships between variables in a causal model. A causal diagram includes a set of variables (or nodes). Each node is connected by an arrow to one or more other nodes upon which it has a causal influence. An arrowhead delineates the direction of causality, e.g., an arrow connecting variables $A$ and $B$ with the arrowhead at $B$ indicates that a change in $A$ causes a change in $B$ (with an associated probability). A path is a traversal of the graph between two nodes following causal arrows.

Causal diagrams include causal loop diagrams, directed acyclic graphs, and Ishikawa diagrams.

Causal diagrams are independent of the quantitative probabilities that inform them. Changes to those probabilities (e.g., due to technological improvements) do not require changes to the model.

Model elements

Causal models have formal structures with elements with specific properties.

Junction patterns

The three types of connections of three nodes are linear chains, branching forks and merging colliders.

Chain

Chains are straight line connections with arrows pointing from cause to effect. In this model, $B$ is a mediator in that it mediates the change that $A$ would otherwise have on $C$ .

A \to B \to C

Fork

In forks, one cause has multiple effects. The two effects have a common cause. There exists a (non-causal) spurious correlation between $A$ and $C$ that can be eliminated by conditioning on $B$ (for a specific value of $B$ ).

A \leftarrow B \to C

"Conditioning on $B$ " means "given $B$ " (i.e., given a value of $B$ ).

An elaboration of a fork is the confounder:

A \leftarrow B \to C \to A

In such models, $B$ is a common cause of $A$ and $C$ (which also causes $A$ ), making $B$ the confounder.

Collider

In colliders, multiple causes affect one outcome. Conditioning on $B$ (for a specific value of $B$ ) often reveals a non-causal negative correlation between $A$ and $C$ . This negative correlation has been called collider bias and the "explain-away" effect as $B$ explains away the correlation between $A$ and $C$ . The correlation can be positive in the case where contributions from both $A$ and $C$ are necessary to affect $B$ .

A \to B \leftarrow C
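A minimal sketch of the explain-away effect, using exact enumeration. The model is an assumption made for illustration: $A$ and $C$ are independent fair coins and $B$ is 1 when either of them is 1. The helper `prob` is hypothetical.

```python
from fractions import Fraction
from itertools import product

def prob(event, given=lambda a, b, c: True):
    """Exact conditional probability in the assumed model:
    A and C are independent fair coins, B = A or C."""
    weight = Fraction(1, 4)            # each (a, c) pair is equally likely
    num = den = Fraction(0)
    for a, c in product((0, 1), repeat=2):
        b = int(a or c)
        if given(a, b, c):
            den += weight
            if event(a, b, c):
                num += weight
    return num / den

def a_is_1(a, b, c):
    return a == 1

p_a = prob(a_is_1)
p_a_given_c = prob(a_is_1, lambda a, b, c: c == 1)
p_a_given_b = prob(a_is_1, lambda a, b, c: b == 1)
p_a_given_b_and_c = prob(a_is_1, lambda a, b, c: b == 1 and c == 1)
```

Without conditioning, learning $C$ says nothing about $A$ (both probabilities are 1/2). Once $B = 1$ is given, the probability of $A = 1$ is 2/3, and it drops back to 1/2 when $C = 1$ is also known: $C$ "explains away" $B$, which induces a negative dependence between $A$ and $C$.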

Node types

Mediator

A mediator node modifies the effect of other causes on an outcome (as opposed to simply affecting the outcome). For example, in the chain example above, $B$ is a mediator, because it modifies the effect of $A$ (an indirect cause of $C$ ) on $C$ (the outcome).

Confounder

A confounder node affects multiple outcomes, creating a positive correlation among them.

Instrumental variable

An instrumental variable is one that:

has a path to the outcome;
has no other path to causal variables;
has no direct influence on the outcome.

Regression coefficients can serve as estimates of the causal effect of an instrumental variable on an outcome as long as that effect is not confounded. In this way, instrumental variables allow causal factors to be quantified without data on confounders.

For example, given the model:

Z \to X \to Y \leftarrow U \to X

$Z$ is an instrumental variable, because it has a path to the outcome $Y$ and is unconfounded, e.g., by $U$ .

In the above example, if $Z$ and $X$ take binary values, then the assumption that $Z = 0, X = 1$ does not occur is called monotonicity.

Refinements to the technique include creating an instrument by conditioning on another variable to block the paths between the instrument and the confounder, and combining multiple variables to form a single instrument.
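As a rough illustration of the model $Z \to X \to Y \leftarrow U \to X$, the sketch below simulates a linear version of it and compares a naive regression slope with the ratio estimate $\mathrm{cov}(Z, Y) / \mathrm{cov}(Z, X)$. The linear form, the coefficients, the Gaussian noise and the function names are all assumptions made for this example; the ratio estimator is a standard technique for linear instrumental-variable models, not something prescribed by the text above.

```python
import random

def cov(a, b):
    """Population covariance of two equal-length lists."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / len(a)

def simulate(n=20000, effect=2.0, seed=0):
    """Assumed linear model: Z -> X -> Y, with unobserved U affecting both X and Y."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]
    u = [rng.gauss(0, 1) for _ in range(n)]      # hidden confounder
    x = [zi + ui + rng.gauss(0, 1) for zi, ui in zip(z, u)]
    y = [effect * xi + 3.0 * ui + rng.gauss(0, 1) for xi, ui in zip(x, u)]
    return z, x, y

z, x, y = simulate()
naive_slope = cov(x, y) / cov(x, x)   # biased upward by the hidden U
iv_estimate = cov(z, y) / cov(z, x)   # uses only variation in X that comes from Z
```

In this simulated setting the naive slope lands near 3 while the instrument-based estimate lands near the assumed true effect of 2, even though `u` is never used in either estimate.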

Mendelian randomization

Definition: Mendelian randomization uses measured variation in genes of known function to examine the causal effect of a modifiable exposure on disease in observational studies.

Because genes vary randomly across populations, presence of a gene typically qualifies as an instrumental variable, implying that in many cases, causality can be quantified using regression on an observational study.

Associations

Independence conditions

Independence conditions are rules for deciding whether two variables are independent of each other. Variables are independent if the values of one do not directly affect the values of the other. Multiple causal models can share independence conditions. For example, the models

A \to B \to C

and

A \leftarrow B \to C

have the same independence conditions, because conditioning on $B$ leaves $A$ and $C$ independent. However, the two models do not have the same meaning and can be falsified based on data (that is, if observational data show an association between $A$ and $C$ after conditioning on $B$ , then both models are incorrect). Conversely, data cannot show which of these two models is correct, because they have the same independence conditions.

Conditioning on a variable is a mechanism for conducting hypothetical experiments. Conditioning on a variable involves analyzing the values of other variables for a given value of the conditioned variable. In the first example, conditioning on $B$ implies that observations for a given value of $B$ should show no dependence between $A$ and $C$ . If such a dependence exists, then the model is incorrect. Non-causal models cannot make such distinctions, because they do not make causal assertions.

Confounder/deconfounder

An essential element of correlational study design is to identify potentially confounding influences on the variable under study, such as demographics. These variables are controlled for to eliminate those influences. However, the correct list of confounding variables cannot be determined a priori. It is thus possible that a study may control for irrelevant variables or even (indirectly) the variable under study.

Causal models offer a robust technique for identifying appropriate confounding variables. Formally, Z is a confounder if "Y is associated with Z via paths not going through X". These can often be determined using data collected for other studies. Mathematically, if

P(Y \mid X) \neq P(Y \mid do(X))

X and Y are confounded (by some confounder variable Z).

Earlier, allegedly incorrect definitions of confounder include:

"Any variable that is correlated with both X and Y."
Y is associated with Z among the unexposed.
Noncollapsibility: A difference between the "crude relative risk and the relative risk resulting after adjustment for the potential confounder".
Epidemiological: A variable associated with X in the population at large and associated with Y among people unexposed to X.

The last of these is flawed in that, in the model:

X \to Z \to Y

Z matches the definition, but is a mediator, not a confounder, and is an example of controlling for the outcome.

In the model

X \leftarrow A \to B \leftarrow C \to Y

Traditionally, B was considered to be a confounder, because it is associated with X and with Y but is not on a causal path nor is it a descendant of anything on a causal path. Controlling for B causes it to become a confounder. This is known as M-bias.

Backdoor adjustment

For analysing the causal effect of X on Y in a causal model all confounder variables must be addressed (deconfounding). To identify the set of confounders, (1) every noncausal path between X and Y must be blocked by this set; (2) without disrupting any causal paths; and (3) without creating any spurious paths.

Definition: a backdoor path from variable X to Y is any path from X to Y that starts with an arrow pointing to X.

Definition: Given an ordered pair of variables (X,Y) in a model, a set of confounder variables Z satisfies the backdoor criterion if (1) no confounder variable Z is a descendant of X and (2) all backdoor paths between X and Y are blocked by the set of confounders.

If the backdoor criterion is satisfied for (X,Y), X and Y are deconfounded by the set of confounder variables. It is not necessary to control for any variables other than the confounders. The backdoor criterion is a sufficient but not necessary condition to find a set of variables Z to deconfound the analysis of the causal effect of X on Y.

When the causal model is a plausible representation of reality and the backdoor criterion is satisfied, then partial regression coefficients can be used as (causal) path coefficients (for linear relationships).

P(Y \mid do(X)) = \sum_{z} P(Y \mid X, Z = z) P(Z = z)
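The adjustment formula can be checked on a small discrete example. The sketch below assumes a single binary confounder Z with arrows Z → X, Z → Y and X → Y; every probability in it is made up for illustration, and the helper names are hypothetical.

```python
from itertools import product

# Assumed conditional probabilities for a toy model Z -> X, Z -> Y, X -> Y.
P_Z1 = 0.5
P_X1_GIVEN_Z = {0: 0.2, 1: 0.8}
P_Y1_GIVEN_XZ = {(0, 0): 0.1, (1, 0): 0.3, (0, 1): 0.5, (1, 1): 0.7}

def bern(p, value):
    """Probability that a Bernoulli(p) variable takes `value`."""
    return p if value == 1 else 1 - p

# Observational joint distribution P(x, y, z).
JOINT = {
    (x, y, z): bern(P_Z1, z) * bern(P_X1_GIVEN_Z[z], x) * bern(P_Y1_GIVEN_XZ[(x, z)], y)
    for x, y, z in product((0, 1), repeat=3)
}

def prob(x=None, y=None, z=None):
    """Marginal probability of the specified values; None means 'sum over'."""
    return sum(p for (xi, yi, zi), p in JOINT.items()
               if (x is None or xi == x)
               and (y is None or yi == y)
               and (z is None or zi == z))

def p_y_given_x(x):
    """Plain conditional P(Y = 1 | X = x), which mixes in the effect of Z."""
    return prob(x=x, y=1) / prob(x=x)

def p_y_do_x(x):
    """Backdoor adjustment: sum over z of P(Y = 1 | X = x, Z = z) P(Z = z)."""
    return sum(prob(x=x, y=1, z=z) / prob(x=x, z=z) * prob(z=z) for z in (0, 1))
```

With these assumed numbers, `p_y_given_x(1)` is 0.62 while `p_y_do_x(1)` is 0.5; the difference is the part of the association that flows through the backdoor path via Z.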

Frontdoor adjustment

If the elements of a blocking path are all unobservable, the backdoor path is not calculable, but if all forward paths from $X \to Y$ have elements $z$ where no open paths connect $z \to Y$, then $Z$, the set of all $z$s, can measure $P(Y \mid do(X))$. Effectively, there are conditions where $Z$ can act as a proxy for $X$.

Definition: a frontdoor path is a direct causal path for which data is available for all $z \in Z$ , $Z$ intercepts all directed paths $X$ to $Y$ , there are no unblocked paths from $Z$ to $Y$ , and all backdoor paths from $Z$ to $Y$ are blocked by $X$ .

The following converts a do expression into a do-free expression by conditioning on the variables along the front-door path.

P(Y \mid do(X)) = \sum_{z} \left[ P(Z = z \mid X) \sum_{x} P(Y \mid X = x, Z = z) P(X = x) \right]

Presuming data for these observable probabilities is available, the ultimate probability can be computed without an experiment, regardless of the existence of other confounding paths and without backdoor adjustment.
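A sketch of the front-door computation on a toy model with a hidden confounder U (U → X, U → Y) and a mediator Z (X → Z → Y). The probabilities are assumptions chosen for illustration. The hidden U is used only to build the observable distribution and to compute a ground truth for comparison; the `frontdoor` function itself reads only the observable joint over X, Z and Y.

```python
from itertools import product

# Assumed toy model: U -> X, U -> Y (U hidden), X -> Z -> Y.
P_U1 = 0.5
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}
P_Z1_GIVEN_X = {0: 0.1, 1: 0.9}
P_Y1_GIVEN_ZU = {(0, 0): 0.1, (1, 0): 0.5, (0, 1): 0.4, (1, 1): 0.8}

def bern(p, value):
    return p if value == 1 else 1 - p

# Observable joint P(x, z, y), with the hidden U summed out.
OBSERVED = {}
for u, x, z, y in product((0, 1), repeat=4):
    p = (bern(P_U1, u) * bern(P_X1_GIVEN_U[u], x)
         * bern(P_Z1_GIVEN_X[x], z) * bern(P_Y1_GIVEN_ZU[(z, u)], y))
    OBSERVED[(x, z, y)] = OBSERVED.get((x, z, y), 0.0) + p

def prob(x=None, z=None, y=None):
    """Marginal of the observable joint; None means 'sum over'."""
    return sum(p for (xi, zi, yi), p in OBSERVED.items()
               if (x is None or xi == x)
               and (z is None or zi == z)
               and (y is None or yi == y))

def frontdoor(x):
    """Front-door estimate of P(Y = 1 | do(X = x)) from observable quantities only."""
    total = 0.0
    for z in (0, 1):
        inner = sum(prob(x=xp, z=z, y=1) / prob(x=xp, z=z) * prob(x=xp)
                    for xp in (0, 1))
        total += prob(x=x, z=z) / prob(x=x) * inner
    return total

def truth(x):
    """Ground truth using the hidden U (available only because this is a toy model)."""
    return sum(bern(P_Z1_GIVEN_X[x], z) * bern(P_U1, u) * P_Y1_GIVEN_ZU[(z, u)]
               for z, u in product((0, 1), repeat=2))
```

In this example `frontdoor(1)` and `truth(1)` agree at 0.61, even though U never appears in the front-door calculation.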

Interventions

Queries

Queries are questions asked based on a specific model. They are generally answered via performing experiments (interventions). Interventions take the form of fixing the value of one variable in a model and observing the result. Mathematically, such queries take the form (from the example):

P(\text{floss} \mid do(\text{toothpaste}))

where the do operator indicates that the experiment explicitly modified the price of toothpaste. Graphically, this blocks any causal factors that would otherwise affect that variable. Diagrammatically, this erases all causal arrows pointing at the experimental variable.

More complex queries are possible, in which the do operator is applied (the value is fixed) to multiple variables.

Do calculus

The do calculus is the set of manipulations that are available to transform one expression into another, with the general goal of transforming expressions that contain the do operator into expressions that do not. Expressions that do not include the do operator can be estimated from observational data alone, without the need for an experimental intervention, which might be expensive, lengthy or even unethical (e.g., asking subjects to take up smoking). The set of rules is complete (it can be used to derive every true statement in this system). An algorithm can determine whether, for a given model, a solution is computable in polynomial time.

Rules

The calculus includes three rules for the transformation of conditional probability expressions involving the do operator.

Rule 1

Rule 1 permits the addition or deletion of observations:

P(Y \mid do(X), Z, W) = P(Y \mid do(X), Z)

in the case that the variable set Z blocks all paths from W to Y and all arrows leading into X have been deleted.

Rule 2

Rule 2 permits the replacement of an intervention with an observation or vice versa:

P(Y \mid do(X), Z) = P(Y \mid X, Z)

in the case that Z satisfies the back-door criterion.

Rule 3

Rule 3 permits the deletion or addition of interventions:

P(Y \mid do(X)) = P(Y)

in the case where no causal paths connect X and Y.

Extensions

The rules do not imply that any query can have its do operators removed. In those cases, it may be possible to substitute a variable that is subject to manipulation (e.g., diet) in place of one that is not (e.g., blood cholesterol), which can then be transformed to remove the do. Example:

P(\text{Heart disease} \mid do(\text{blood cholesterol})) = P(\text{Heart disease} \mid do(\text{diet}))

Counterfactuals

Counterfactuals consider possibilities that are not found in data, such as whether a nonsmoker would have developed cancer had they instead been a heavy smoker. They are the highest step on Pearl's causality ladder.

Potential outcome

Definition: A potential outcome for a variable Y is "the value Y would have taken for individual u, had X been assigned the value x". Mathematically:

Y_{X = x} (u)

Y_{x} (u)

The potential outcome is defined at the level of the individual u.

The conventional approach to potential outcomes is data-, not model-driven, limiting its ability to untangle causal relationships. It treats causal questions as problems of missing data and gives incorrect answers to even standard scenarios.

Causal inference

In the context of causal models, potential outcomes are interpreted causally, rather than statistically.

The first law of causal inference states that the potential outcome

Y_{X} (u)

can be computed by modifying causal model M (by deleting arrows into X) and computing the outcome for some x. Formally:

Y_{X}(u) = Y_{M_{x}}(u)

Conducting a counterfactual

Examining a counterfactual using a causal model involves three steps. The approach is valid regardless of the form of the model relationships, linear or otherwise. When the model relationships are fully specified, point values can be computed. In other cases (e.g., when only probabilities are available) a probability-interval statement, such as non-smoker x would have a 10-20% chance of cancer, can be computed.

Given the model:

Y \leftarrow X \to M \to Y \leftarrow U

the equations for calculating the values of M and Y derived from regression analysis or another technique can be applied, substituting known values from an observation and fixing the value of other variables (the counterfactual).

Abduct

Apply abductive reasoning (logical inference that uses observation to find the simplest/most likely explanation) to estimate u, the proxy for the unobserved variables on the specific observation that supports the counterfactual. Compute the probability of u given the propositional evidence.

Act

For a specific observation, use the do operator to establish the counterfactual (e.g., m=0), modifying the equations accordingly.

Predict

Calculate the values of the output (y) using the modified equations.
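The three steps can be sketched for a fully specified linear version of the model. The structural equation y = B_X·x + C_M·m + u, its coefficients and the function names are assumptions for illustration; u stands in for the unobserved influences on this particular observation.

```python
# Assumed structural equation for the outcome: y = B_X * x + C_M * m + u.
B_X, C_M = 1.0, 2.0

def abduct(x, m, y):
    """Step 1 (abduct): recover the unit-specific u implied by the observation."""
    return y - B_X * x - C_M * m

def counterfactual_y(x, m, y, m_forced):
    """What y would have been for this same unit had M been m_forced."""
    u = abduct(x, m, y)                  # 1. abduct: infer u from the evidence
    m_cf = m_forced                      # 2. act: do(M = m_forced) replaces M's equation
    return B_X * x + C_M * m_cf + u      # 3. predict: re-evaluate y with u held fixed
```

For an observation with x = 1, m = 2 and y = 7, the abduction step gives u = 2, so `counterfactual_y(1.0, 2.0, 7.0, m_forced=0.0)` returns 3.0: the same unit, with the same background factors, but with the mediator forced to zero.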

Mediation

Direct and indirect (mediated) causes can only be distinguished via conducting counterfactuals. Understanding mediation requires holding the mediator constant while intervening on the direct cause. In the model

$Y \leftarrow M \leftarrow X \to Y$

M mediates X's influence on Y, while X also has an unmediated effect on Y. Thus M is held constant, while do(X) is computed.

The Mediation Fallacy instead involves conditioning on the mediator if the mediator and the outcome are confounded, as they are in the above model.

For linear models, the indirect effect can be computed by taking the product of all the path coefficients along a mediated pathway. The total indirect effect is computed by the sum of the individual indirect effects. For linear models mediation is indicated when the coefficients of an equation fitted without including the mediator vary significantly from an equation that includes it.
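For a linear model the decomposition can be verified directly. The sketch assumes noise-free structural equations m = A_XM·x and y = C_XY·x + B_MY·m with made-up coefficients; the names are hypothetical.

```python
# Assumed path coefficients: X -> M (A_XM), M -> Y (B_MY), X -> Y (C_XY).
A_XM, B_MY, C_XY = 0.5, 2.0, 1.5

def m_of(x):
    return A_XM * x

def y_of(x, m):
    return C_XY * x + B_MY * m

# Total effect: change x from 0 to 1 and let the mediator respond.
total = y_of(1, m_of(1)) - y_of(0, m_of(0))

# Direct effect: change x from 0 to 1 while holding the mediator at its x = 0 value.
direct = y_of(1, m_of(0)) - y_of(0, m_of(0))

# Indirect effect: product of the coefficients along the mediated path.
indirect = A_XM * B_MY
```

Here `total` is 2.5, `direct` is 1.5 and `indirect` is 1.0, so the total effect is the sum of the direct and indirect effects, as expected for a linear model.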

Direct effect

In experiments on such a model, the controlled direct effect (CDE) is computed by forcing the value of the mediator M (do(M = 0)) and randomly assigning some subjects to each of the values of X (do(X=0), do(X=1), ...) and observing the resulting values of Y.

CDE(0) = P(Y = 1 \mid do(X = 1), do(M = 0)) - P(Y = 1 \mid do(X = 0), do(M = 0))

Each value of the mediator has a corresponding CDE.

However, a better experiment is to compute the natural direct effect (NDE). This is the effect determined by leaving the relationship between X and M untouched while intervening on the relationship between X and Y.

NDE = P(Y_{M = M_0} = 1 \mid do(X = 1)) - P(Y_{M = M_0} = 1 \mid do(X = 0))

For example, consider the direct effect of increasing dental hygienist visits (X) from every other year to every year, which encourages flossing (M). Gums (Y) get healthier, either because of the hygienist (direct) or the flossing (mediator/indirect). The experiment is to continue flossing while skipping the hygienist visit.

Indirect effect

The indirect effect of X on Y is the "increase we would see in Y while holding X constant and increasing M to whatever value M would attain under a unit increase in X".

Indirect effects cannot be "controlled" because the direct path cannot be disabled by holding another variable constant. The natural indirect effect (NIE) is the effect on gum health (Y) from flossing (M). The NIE is calculated as the sum, over the floss and no-floss cases, of the difference between the probability of flossing given the hygienist and without the hygienist, each weighted by the probability of healthy gums for that case without the hygienist, or:

NIE = \sum_{m} [P(M = m \mid X = 1) - P(M = m \mid X = 0)] \times P(Y = 1 \mid X = 0, M = m)

The above NDE calculation includes counterfactual subscripts ($Y_{M = M_0}$). For nonlinear models, the seemingly obvious equivalence

\text{Total effect} = \text{Direct effect} + \text{Indirect effect}

does not apply because of anomalies such as threshold effects and binary values. However,

\text{Total effect}(X = 0 \to X = 1) = NDE(X = 0 \to X = 1) - NIE(X = 1 \to X = 0)

works for all model relationships (linear and nonlinear). It allows NDE to then be calculated directly from observational data, without interventions or use of counterfactual subscripts.
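The failure of the additive decomposition, and the identity that replaces it, can be checked on a small threshold model. The sketch assumes no confounding among X, M and Y, so that the effects can be computed from conditional probabilities. It uses the NIE expression given above and, as an assumption not spelled out in this text, the analogous observational expression for the NDE: the sum over m of [E(Y | X = 1, m) - E(Y | X = 0, m)] P(m | X = 0). The mechanism (Y = 1 only when both X and M are 1) and the probabilities are made up.

```python
from fractions import Fraction as F

P_M1_GIVEN_X = {0: F(1, 5), 1: F(4, 5)}    # assumed P(M = 1 | X)

def p_m(m, x):
    """P(M = m | X = x)."""
    return P_M1_GIVEN_X[x] if m == 1 else 1 - P_M1_GIVEN_X[x]

def e_y(x, m):
    """Assumed threshold mechanism: Y = 1 only when both X and M are 1."""
    return F(x * m)

def total_effect():
    return sum(e_y(1, m) * p_m(m, 1) - e_y(0, m) * p_m(m, 0) for m in (0, 1))

def nde(x_from, x_to):
    """Natural direct effect: change X while M keeps its distribution under x_from."""
    return sum((e_y(x_to, m) - e_y(x_from, m)) * p_m(m, x_from) for m in (0, 1))

def nie(x_from, x_to):
    """Natural indirect effect: hold X at x_from while M shifts as if X were x_to."""
    return sum(e_y(x_from, m) * (p_m(m, x_to) - p_m(m, x_from)) for m in (0, 1))
```

With these assumptions the total effect is 4/5, while `nde(0, 1)` is 1/5 and `nie(0, 1)` is 0, so the naive sum falls short. Subtracting the reverse transition instead, `nde(0, 1) - nie(1, 0)`, gives 1/5 + 3/5 = 4/5, matching the total effect.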

Transportability

Causal models provide a vehicle for integrating data across datasets, known as transport, even though the causal models (and the associated data) differ. E.g., survey data can be merged with randomized, controlled trial data. Transport offers a solution to the question of external validity, whether a study can be applied in a different context.

Where two models match on all relevant variables and data from one model is known to be unbiased, data from one population can be used to draw conclusions about the other. In other cases, where data is known to be biased, reweighting can allow the dataset to be transported. In a third case, conclusions can be drawn from an incomplete dataset. In some cases, data from studies of multiple populations can be combined (via transportation) to allow conclusions about an unmeasured population. In some cases, combining estimates (e.g., P(W|X)) from multiple studies can increase the precision of a conclusion.

Do-calculus provides a general criterion for transport: A target variable can be transformed into another expression via a series of do-operations that does not involve any "difference-producing" variables (those that distinguish the two populations). An analogous rule applies to studies that have relevantly different participants.

Bayesian network

Any causal model can be implemented as a Bayesian network. Bayesian networks can be used to provide the inverse probability of an event (given an outcome, what are the probabilities of a specific cause). This requires preparation of a conditional probability table, showing all possible inputs and outcomes with their associated probabilities.

For example, given a two variable model of Disease and Test (for the disease) the conditional probability table takes the form:

Probability (%) of each test result, given disease status
Disease	Test positive	Test negative
Negative	12	88
Positive	73	27

According to this table, when a patient does not have the disease, the probability of a positive test is 12%.
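The inverse probability mentioned above follows from Bayes' rule once a prior is supplied. The table gives only the test probabilities, so the prevalence used in the example call (1%) is an assumption for illustration, as is the function name.

```python
def posterior_disease(prior, p_pos_given_disease=0.73, p_pos_given_healthy=0.12):
    """P(disease | positive test) by Bayes' rule.

    The two likelihoods come from the table above; `prior` (disease
    prevalence) is not given in the text and must be supplied.
    """
    evidence = p_pos_given_disease * prior + p_pos_given_healthy * (1 - prior)
    return p_pos_given_disease * prior / evidence

# Assumed prevalence of 1%: a positive test still leaves the disease unlikely.
p = posterior_disease(prior=0.01)
```

With the assumed 1% prevalence, the posterior is a little under 6%, because false positives among the many healthy patients outnumber true positives among the few sick ones.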

While this is tractable for small problems, as the number of variables and their associated states increase, the probability table (and associated computation time) increases exponentially.

Bayesian networks are used commercially in applications such as wireless data error correction and DNA analysis.

Invariants/context

A different conceptualization of causality involves the notion of invariant relationships. In the case of identifying handwritten digits, digit shape controls meaning, thus shape and meaning are the invariants. Changing the shape changes the meaning. Other properties do not (e.g., color). This invariance should carry across datasets generated in different contexts (the non-invariant properties form the context). Rather than learning (assessing causality) using pooled data sets, learning on one and testing on another can help distinguish variant from invariant properties.

Four causes

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Four_causes

The four causes or four explanations are, in Aristotelian thought, four fundamental types of answer to the question "why?" in analysis of change or movement in nature: the material, the formal, the efficient, and the final. Aristotle wrote that "we do not have knowledge of a thing until we have grasped its why, that is to say, its cause." While there are cases in which classifying a "cause" is difficult, or in which "causes" might merge, Aristotle held that his four "causes" provided an analytical scheme of general applicability.

Aristotle's word aitia (Greek: αἰτία) has, in philosophical scholarly tradition, been translated as 'cause'. This peculiar, specialized, technical usage of the word 'cause' is not that of everyday English. Rather, the translation of Aristotle's αἰτία that is nearest to current ordinary language is "explanation."

In Physics II.3 and Metaphysics V.2, Aristotle holds that there are four kinds of answers to "why" questions:

Matter: The material cause of a change or movement. This is the aspect of the change or movement that is determined by the material that composes the moving or changing things. For a table, this might be wood; for a statue, it might be bronze or marble.
Form: The formal cause of a change or movement. This is a change or movement caused by the arrangement, shape, or appearance of the thing changing or moving. Aristotle says, for example, that the ratio 2:1, and number in general, is the formal cause of the octave.
Efficient, or agent: The efficient or moving cause of a change or movement. This consists of things apart from the thing being changed or moved, which interact so as to be an agency of the change or movement. For example, the efficient cause of a table is a carpenter, or a person working as one, and according to Aristotle the efficient cause of a child is a parent.
Final, end, or purpose: The final cause of a change or movement. This is a change or movement for the sake of a thing to be what it is. For a seed, it might be an adult plant; for a sailboat, it might be sailing; for a ball at the top of a ramp, it might be coming to rest at the bottom.

The four "causes" are not mutually exclusive. For Aristotle, several, preferably four, answers to the question "why" have to be given to explain a phenomenon and especially the actual configuration of an object. For example, if asking why a table is such and such, an explanation in terms of the four causes would sound like this: This table is solid and brown because it is made of wood (matter); it does not collapse because it has four legs of equal length (form); it is as it is because a carpenter made it, starting from a tree (agent); it has these dimensions because it is to be used by humans (end).

Aristotle distinguished between intrinsic and extrinsic causes. Matter and form are intrinsic causes because they deal directly with the object, whereas efficient and final causes are said to be extrinsic because they are external.

Thomas Aquinas argued that only those four types of causes can exist and no others. He also introduced a priority order according to which "matter is made perfect by the form, form is made perfect by the agent, and agent is made perfect by the finality." Hence, the finality is the cause of causes or, equivalently, the queen of causes.

Definition of "cause"

In his philosophical writings, Aristotle used the Greek word αἴτιον (aition), a neuter singular form of an adjective. The Greek word had meant, perhaps originally in a "legal" context, what or who is "responsible," mostly but not always in a bad sense of "guilt" or "blame." Alternatively, it could mean "to the credit of" someone or something. The appropriation of this word by Aristotle and other philosophers reflects how the Greek experience of legal practice influenced the concern in Greek thought to determine what is responsible. The word developed other meanings, including its use in philosophy in a more abstract sense.

About a century before Aristotle, the anonymous author of the Hippocratic text On Ancient Medicine had described the essential characteristics of a cause as it is considered in medicine:

We must, therefore, consider the causes of each [medical] condition to be those things which are such that, when they are present, the condition necessarily occurs, but when they change to another combination, it ceases.

Aristotle's "four causes"

Aristotle used the four causes to provide different answers to the question, "because of what?" The four answers to this question illuminate different aspects of how a thing comes into being or of how an event takes place.

Material

Aristotle considers the material "cause" (ὕλη, hū́lē) of an object as equivalent to the nature of the raw material out of which the object is composed. (The word "nature" for Aristotle applies to both its potential in the raw material and its ultimate finished form. In a sense this form already existed in the material: see potentiality and actuality.)

Whereas modern physics looks to simple bodies, Aristotle's physics took a more general viewpoint, and treated living things as exemplary. Nevertheless, he felt that simple natural bodies such as earth, fire, air, and water also showed signs of having their own innate sources of motion, change, and rest. Fire, for example, carries things upwards, unless stopped from doing so. Things formed by human artifice, such as beds and cloaks, have no innate tendency to become beds or cloaks.

In traditional Aristotelian philosophical terminology, material is not the same as substance. Matter has parallels with substance in so far as primary matter serves as the substratum for simple bodies which are not substance: sand and rock (mostly earth), rivers and seas (mostly water), atmosphere and wind (mostly air and then mostly fire below the moon). In this traditional terminology, 'substance' is a term of ontology, referring to really existing things; only individuals are said to be substances (subjects) in the primary sense. Secondary substance, in a different sense, also applies to man-made artifacts.

Formal

Aristotle considers the formal "cause" (εἶδος, eîdos) as describing the pattern or form which when present makes matter into a particular type of thing, which we recognize as being of that particular type.

By Aristotle's own account, this is a difficult and controversial concept. It links with theories of forms such as those of Aristotle's teacher, Plato, but in his own account (see his Metaphysics), Aristotle considers many previous writers who had expressed opinions about forms and ideas, and shows how his own views differ from theirs.

Efficient

Aristotle defines the agent or efficient "cause" (κινοῦν, kinoûn) of an object as that which causes change and drives transient motion (such as a painter painting a house) (see Aristotle, Physics II 3, 194b29). In many cases, this is simply the thing that brings something about. For example, in the case of a statue, it is the person chiseling away who transforms a block of marble into a statue. According to Lloyd, of the four causes, only this one is what is meant by the modern English word "cause" in ordinary speech.

Final

Aristotle defines the end, purpose, or final "cause" (τέλος, télos) as that for the sake of which a thing is done. Like the form, this is a controversial type of explanation in science; some have argued for its survival in evolutionary biology, while Ernst Mayr denied that it continued to play a role. It is commonly recognised that Aristotle's conception of nature is teleological in the sense that Nature exhibits functionality in a more general sense than is exemplified in the purposes that humans have. Aristotle observed that a telos does not necessarily involve deliberation, intention, consciousness, or intelligence:

This is most obvious in the animals other than man: they make things neither by art nor after inquiry or deliberation. That is why people wonder whether it is by intelligence or by some other faculty that these creatures work, – spiders, ants, and the like... It is absurd to suppose that purpose is not present because we do not observe the agent deliberating. Art does not deliberate. If the ship-building art were in the wood, it would produce the same results by nature. If, therefore, purpose is present in art, it is present also in nature.
— Aristotle, Physics, II.8

According to Aristotle, a seed has the eventual adult plant as its end (i.e., as its telos) if and only if the seed would become the adult plant under normal circumstances. In Physics II.9, Aristotle hazards a few arguments that a determination of the end (i.e., final cause) of a phenomenon is more important than the others. He argues that the end is that which brings it about, so for example "if one defines the operation of sawing as being a certain kind of dividing, then this cannot come about unless the saw has teeth of a certain kind; and these cannot be unless it is of iron." According to Aristotle, once a final "cause" is in place, the material, efficient and formal "causes" follow by necessity. However, he recommends that the student of nature determine the other "causes" as well, and notes that not all phenomena have an end, e.g., chance events.

Aristotle saw that his biological investigations provided insights into the causes of things, especially into the final cause:

We should approach the investigation of every kind of animal without being ashamed, since in each one of them there is something natural and something beautiful. The absence of chance and the serving of ends are found in the works of nature especially. And the end, for the sake of which a thing has been constructed or has come to be, belongs to what is beautiful.
— Aristotle, On the Parts of Animals 645a21–26, Book I, Part 5.

George Holmes Howison highlights "final causation" in presenting his theory of metaphysics, which he terms "personal idealism", and to which he invites not only man, but all (ideal) life:

Here, in seeing that Final Cause – causation at the call of self-posited aim or end – is the only full and genuine cause, we further see that Nature, the cosmic aggregate of phenomena and the cosmic bond of their law which in the mood of vague and inaccurate abstraction we call Force, is after all only an effect... Thus teleology, or the Reign of Final Cause, the reign of ideality, is not only an element in the notion of Evolution, but is the very vital cord in the notion. The conception of evolution is founded at last and essentially in the conception of Progress: but this conception has no meaning at all except in the light of a goal; there can be no goal unless there is a Beyond for everything actual; and there is no such Beyond except through a spontaneous ideal. The presupposition of Nature, as a system undergoing evolution, is therefore the causal activity of our Pure Ideals. These are our three organic and organizing conceptions called the True, the Beautiful, and the Good.
— George Holmes Howison, The Limits of Evolution (1901)

However, Edward Feser argues, in line with the Aristotelian and Thomistic tradition, that finality has been greatly misunderstood. Indeed, without finality, efficient causality becomes inexplicable. Finality thus understood is not purpose but that end towards which a thing is ordered. When a match is rubbed against the side of a matchbox, the effect is not the appearance of an elephant or the sounding of a drum, but fire. The effect is not arbitrary because the match is ordered towards the end of fire which is realized through efficient causes.

In their biosemiotic study, Stuart Kauffman, Robert K. Logan et al. (2007) remark:

Our language is teleological. We believe that autonomous agents constitute the minimal physical system to which teleological language rightly applies.
— Biology and Philosophy

Scholasticism

In Scholasticism, efficient causality was governed by two principles:

omne agens agit simile sibi (every agent produces something similar to itself): stated frequently in the writings of St. Thomas Aquinas, the principle establishes a relationship of similarity and analogy between cause and effect;
nemo dat quod non habet (no one gives what he does not possess): partially similar to the legal principle of the same name, in metaphysics it establishes that the cause cannot bestow on the effect the quantity of being (and thus of unity, truth, goodness, reality and perfection) that it does not already possess within itself. Otherwise, there would be creation out of nothingness of self and other-from-self. In other words, the cause must possess a degree of reality greater than or equal to that of the effect. If it is greater, we speak of equivocal causation, in analogy to the three types of logical predication (univocal, equivocal, analogical); if it is equal, we speak of univocal causation.

Thomas in this regard distinguished between causa fiendi (cause of occurring, of only beginning to be) and causa essendi (cause of being and also of beginning to be). When the being of the agent cause is in the effect in a lesser or equal degree, this is a causa fiendi. Furthermore, the second principle also establishes a qualitative link: the cause can only transmit its own essence to the effect. For example, a dog cannot transmit the essence of a feline to its young, but only that of a dog. The principle is equivalent to that of causa aequat effectum (cause equals effect) in both a quantitative and qualitative sense.

Modern science

In his Advancement of Learning (1605), Francis Bacon wrote that natural science "doth make inquiry, and take consideration of the same natures: but how? Only as to the material and efficient causes of them, and not as to the forms." Using the terminology of Aristotle, Bacon demands that, apart from the "laws of nature" themselves, the causes relevant to natural science are only efficient causes and material causes, or, to use the formulation which became famous later, natural phenomena require scientific explanation in terms of matter and motion.

In The New Organon, Bacon divides knowledge into physics and metaphysics:

From the two kinds of axioms which have been spoken of arises a just division of philosophy and the sciences, taking the received terms (which come nearest to express the thing) in a sense agreeable to my own views. Thus, let the investigation of forms, which are (in the eye of reason at least, and in their essential law) eternal and immutable, constitute Metaphysics; and let the investigation of the efficient cause, and of matter, and of the latent process, and the latent configuration (all of which have reference to the common and ordinary course of nature, not to her eternal and fundamental laws) constitute Physics. And to these let there be subordinate two practical divisions: to Physics, Mechanics; to Metaphysics, what (in a purer sense of the word) I call Magic, on account of the broadness of the ways it moves in, and its greater command over nature.

Biology

Explanations in terms of final causes remain common in evolutionary biology. Francisco J. Ayala has claimed that teleology is indispensable to biology since the concept of adaptation is inherently teleological. In an appreciation of Charles Darwin published in Nature in 1874, Asa Gray noted "Darwin's great service to Natural Science" lies in bringing back teleology "so that, instead of Morphology versus Teleology, we shall have Morphology wedded to Teleology." Darwin quickly responded, "What you say about Teleology pleases me especially and I do not think anyone else has ever noticed the point." Francis Darwin and T. H. Huxley reiterate this sentiment. The latter wrote that "the most remarkable service to the philosophy of Biology rendered by Mr. Darwin is the reconciliation of Teleology and Morphology, and the explanation of the facts of both, which his view offers." James G. Lennox states that Darwin uses the term 'Final Cause' consistently in his Species Notebook, On the Origin of Species, and after.

Contrary to the position described by Francisco J. Ayala, Ernst Mayr states that "adaptedness... is [an] a posteriori result rather than an a priori goal-seeking." Various commentators view the teleological phrases used in modern evolutionary biology as a type of shorthand. For example, S. H. P. Maddrell writes that "the proper but cumbersome way of describing change by evolutionary adaptation [may be] substituted by shorter overtly teleological statements" for the sake of saving space, but that this "should not be taken to imply that evolution proceeds by anything other than from mutations arising by chance, with those that impart an advantage being retained by natural selection." However, Lennox states that in evolution as conceived by Darwin, it is true both that evolution is the result of mutations arising by chance and that evolution is teleological in nature.

Statements that a species does something "in order to" achieve survival are teleological. The validity or invalidity of such statements depends on the species and the intention of the writer as to the meaning of the phrase "in order to." Sometimes it is possible or useful to rewrite such sentences so as to avoid teleology. Some biology courses have incorporated exercises requiring students to rephrase such sentences so that they do not read teleologically. Nevertheless, biologists still frequently write in a way which can be read as implying teleology even if that is not the intention.

Animal behaviour (Tinbergen's four questions)

Tinbergen's four questions, named after the ethologist Nikolaas Tinbergen and based on Aristotle's four causes, are complementary categories of explanations for animal behaviour. They are also commonly referred to as levels of analysis.

The four questions are on:

function, what an adaptation does that is selected for in evolution;
phylogeny, the evolutionary history of an organism, revealing its relationships to other species;
mechanism, namely the proximate cause of a behaviour, such as the role of testosterone in aggression; and
ontogeny, the development of an organism from egg to embryo to adult.

Technology (Heidegger's four causes)

In The Question Concerning Technology, echoing Aristotle, Martin Heidegger describes the four causes as follows:

causa materialis: the material or matter
causa formalis: the form or shape into which the material or matter enters
causa finalis: the end
causa efficiens: that which brings about the effect that is the finished result.

Heidegger explains that "[w]hoever builds a house or a ship or forges a sacrificial chalice reveals what is to be brought forth, according to the terms of the four modes of occasioning."

The educationist David Waddington comments that although the efficient cause, which he identifies as "the craftsman," might be thought the most significant of the four, in his view each of Heidegger's four causes is "equally co-responsible" for producing a craft item, in Heidegger's terms "bringing forth" the thing into existence. Waddington cites Lovitt's description of this bringing forth as "a unified process."
