Graphical summary of a meta-analysis of over 1,000 cases of diffuse intrinsic pontine glioma and other pediatric gliomas, in which information about the mutations involved as well as generic outcomes was distilled from the underlying primary literature.
A meta-analysis is a statistical analysis that combines the results of multiple scientific studies.
 Meta-analysis can be performed when there are multiple scientific 
studies addressing the same question, with each individual study 
reporting measurements that are expected to have some degree of error. 
The aim then is to use approaches from statistics
 to derive a pooled estimate closest to the unknown common truth based 
on how this error is perceived. Existing methods for meta-analysis yield
 a weighted average
 from the results of the individual studies, and what differs is the 
manner in which these weights are allocated and also the manner in which
 the uncertainty is computed around the point estimate thus generated. 
In addition to providing an estimate of the unknown common truth, 
meta-analysis has the capacity to contrast results from different 
studies and identify patterns among study results, sources of 
disagreement among those results, or other interesting relationships 
that may come to light in the context of multiple studies.
A key benefit of this approach is the aggregation of information leading to a higher statistical power
 and more robust point estimate than is possible from the measure 
derived from any individual study. However, in performing a 
meta-analysis, an investigator must make choices which can affect the 
results, including deciding how to search for studies, selecting studies
 based on a set of objective criteria, dealing with incomplete data, 
analyzing the data, and accounting for or choosing not to account for publication bias.
Meta-analyses are often, but not always, important components of a systematic review
 procedure. For instance, a meta-analysis may be conducted on several 
clinical trials of a medical treatment, in an effort to obtain a better 
understanding of how well the treatment works. Here it is convenient to 
follow the terminology used by the Cochrane Collaboration, and use "meta-analysis" to refer to statistical methods of combining evidence, leaving other aspects of 'research synthesis'
 or 'evidence synthesis', such as combining information from qualitative
 studies, for the more general context of systematic reviews.  A 
meta-analysis is a secondary source.
History
The historical roots of meta-analysis can be traced back to 17th century studies of astronomy, while a paper published in 1904 by the statistician Karl Pearson in the British Medical Journal
 which collated data from several studies of typhoid inoculation is seen
 as the first time a meta-analytic approach was used to aggregate the 
outcomes of multiple clinical studies.
 The first meta-analysis of all conceptually identical experiments 
concerning a particular research issue, and conducted by independent 
researchers, has been identified as the 1940 book-length publication Extrasensory Perception After Sixty Years, authored by Duke University psychologists J. G. Pratt, J. B. Rhine, and associates. This encompassed a review of 145 reports on ESP
 experiments published from 1882 to 1939, and included an estimate of 
the influence of unpublished papers on the overall effect (the file-drawer problem). Although meta-analysis is widely used in epidemiology and evidence-based medicine
 today, a meta-analysis of a medical treatment was not published until 
1955. In the 1970s, more sophisticated analytical techniques were 
introduced in educational research, starting with the work of Gene V. Glass, Frank L. Schmidt and John E. Hunter. 
The term "meta-analysis" was coined in 1976 by the statistician Gene V. Glass, who stated "my
 major interest currently is in what we have come to call ...the 
meta-analysis of research. The term is a bit grand, but it is precise 
and apt ... Meta-analysis refers to the analysis of analyses". 
Although this led to him being widely recognized as the modern founder 
of the method, the methodology behind what he termed "meta-analysis" 
predates his work by several decades. The statistical theory surrounding meta-analysis was greatly advanced by the work of Nambury S. Raju, Larry V. Hedges, Harris Cooper, Ingram Olkin, John E. Hunter, Jacob Cohen, Thomas C. Chalmers, Robert Rosenthal, Frank L. Schmidt, and Douglas G. Bonett.
Advantages
Conceptually,
 a meta-analysis uses a statistical approach to combine the results from
 multiple studies in an effort to increase power (over individual 
studies), improve estimates of the size of the effect and/or to resolve 
uncertainty when reports disagree. A meta-analysis is a statistical 
overview of the results from one or more systematic reviews. Basically, 
it produces a weighted average of the included study results and this 
approach has several advantages:
- Results can be generalized to a larger population
 - The precision and accuracy of estimates can be improved as more data is used. This, in turn, may increase the statistical power to detect an effect
 - Inconsistency of results across studies can be quantified and analyzed. For instance, inconsistency may arise from sampling error, or study results (partially) influenced by differences between study protocols
 - Hypothesis testing can be applied on summary estimates
 - Moderators can be included to explain variation between studies
 - The presence of publication bias can be investigated
 
Steps in a meta-analysis
A
 meta-analysis is usually preceded by a systematic review, as this 
allows identification and critical appraisal of all the relevant 
evidence (thereby limiting the risk of bias in summary estimates). The 
general steps are then as follows:
- Formulation of the research question, e.g. using the PICO model (Population, Intervention, Comparison, Outcome).
 - Search of literature
 - Selection of studies ('incorporation criteria')
- Based on quality criteria, e.g. the requirement of randomization and blinding in a clinical trial
 - Selection of specific studies on a well-specified subject, e.g. the treatment of breast cancer.
 - Decide whether unpublished studies are included to avoid publication bias (file drawer problem)
 
 - Decide which dependent variables or summary measures are allowed. 
For instance, when considering a meta-analysis of published (aggregate) 
data:
- Differences (discrete data)
 - Means (continuous data)
 - Hedges' g is a popular summary measure for continuous data that is standardized 
in order to eliminate scale differences, but it incorporates an index of
 variation between groups:
- $\delta = \frac{\mu_t - \mu_c}{\sigma}$, in which $\mu_t$ is the treatment mean, $\mu_c$ is the control mean, and $\sigma^2$ is the pooled variance.
 
 
 - Selection of a meta-analysis model, e.g. fixed effect or random effects meta-analysis.
 - Examine sources of between-study heterogeneity, e.g. using subgroup analysis or meta-regression.
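As an illustration of the standardized summary measure listed in the steps above, the following minimal sketch computes the difference between the treatment and control means divided by the pooled standard deviation. The function names and the example numbers are hypothetical, and the small-sample correction factor is the commonly used approximation, which is an assumption here rather than something stated in the text.

```python
import math


def pooled_sd(sd_t, n_t, sd_c, n_c):
    """Pooled standard deviation of the treatment and control groups."""
    pooled_var = ((n_t - 1) * sd_t ** 2 + (n_c - 1) * sd_c ** 2) / (n_t + n_c - 2)
    return math.sqrt(pooled_var)


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c, correct=True):
    """Difference in means divided by the pooled standard deviation.

    With correct=True the result is multiplied by the usual approximate
    small-sample factor 1 - 3 / (4 * (n_t + n_c) - 9) (an assumption here).
    """
    d = (mean_t - mean_c) / pooled_sd(sd_t, n_t, sd_c, n_c)
    if correct:
        d *= 1 - 3 / (4 * (n_t + n_c) - 9)
    return d


# Hypothetical study: treatment mean 12, control mean 10, both SD 2, 20 per arm.
d_raw = standardized_mean_difference(12.0, 2.0, 20, 10.0, 2.0, 20, correct=False)
g = standardized_mean_difference(12.0, 2.0, 20, 10.0, 2.0, 20)
print(round(d_raw, 3), round(g, 3))
```

Because the result is expressed in pooled standard deviation units, studies that measured the same outcome on different scales can be placed on a common footing before pooling.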
 
Formal guidance for the conduct and reporting of meta-analyses is provided by the Cochrane Handbook.
For reporting guidelines, see the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement.[14]
Methods and assumptions
Approaches
In general, two types of evidence can be distinguished when performing a meta-analysis: individual participant data (IPD), and aggregate data (AD). The aggregate data can be direct or indirect. 
AD is more commonly available (e.g. from the literature) and 
typically represents summary estimates such as odds ratios or relative 
risks. This can be directly synthesized across conceptually similar 
studies using several approaches (see below). On the other hand, 
indirect aggregate data measures the effect of two treatments that were 
each compared against a similar control group in a meta-analysis. For 
example, if treatment A and treatment B were directly compared vs 
placebo in separate meta-analyses, we can use these two pooled results 
to get an estimate of the effects of A vs B in an indirect comparison as
 effect A vs Placebo minus effect B vs Placebo. 
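The subtraction described above can be sketched as follows. This is a minimal illustration, not a full indirect-comparison method: it assumes the two pooled estimates are independent (so that their variances add), that the effects are on an additive scale such as the log odds ratio, and the function name and numbers are hypothetical.

```python
import math


def indirect_comparison(effect_a, se_a, effect_b, se_b, z=1.96):
    """Indirect estimate of A vs B from A vs placebo and B vs placebo.

    Assumes the two pooled estimates are independent, so their variances add.
    """
    diff = effect_a - effect_b
    se = math.sqrt(se_a ** 2 + se_b ** 2)
    return diff, se, (diff - z * se, diff + z * se)


# Hypothetical pooled log odds ratios versus placebo.
diff, se, ci = indirect_comparison(-0.50, 0.15, -0.20, 0.20)
print(diff, se, ci)
```

The standard error of the indirect estimate is always larger than either input standard error, which is why indirect comparisons tend to be less precise than direct ones.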
IPD evidence represents raw data as collected by the study 
centers. This distinction has raised the need for different 
meta-analytic methods when evidence synthesis is desired, and has led to
 the development of one-stage and two-stage methods.
 In one-stage methods the IPD from all studies are modeled 
simultaneously whilst accounting for the clustering of participants 
within studies. Two-stage methods first compute summary statistics for 
AD from each study and then calculate overall statistics as a weighted 
average of the study statistics. By reducing IPD to AD, two-stage 
methods can also be applied when IPD is available; this makes them an 
appealing choice when performing a meta-analysis. Although it is 
conventionally believed that one-stage and two-stage methods yield 
similar results, recent studies have shown that they may occasionally 
lead to different conclusions.
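A two-stage analysis of the kind described above can be sketched as follows, assuming a continuous outcome summarized as a difference in means and inverse-variance weighting in the second stage. The data, the study labels and the function names are all hypothetical.

```python
from statistics import mean, variance

# Hypothetical individual participant data (outcome values per arm).
ipd = {
    "study_1": {"treatment": [5.1, 6.0, 5.6, 6.3, 5.8],
                "control": [4.2, 4.9, 4.4, 5.0, 4.6]},
    "study_2": {"treatment": [7.0, 6.1, 6.6, 7.4],
                "control": [5.9, 6.2, 5.5, 6.0]},
    "study_3": {"treatment": [4.8, 5.5, 5.0, 5.9, 5.2, 5.4],
                "control": [4.9, 4.3, 4.7, 5.1, 4.4, 4.6]},
}


def stage_one(treatment, control):
    """Reduce one study's IPD to a mean difference and its variance."""
    effect = mean(treatment) - mean(control)
    var = variance(treatment) / len(treatment) + variance(control) / len(control)
    return effect, var


def stage_two(summaries):
    """Inverse-variance weighted average of the study-level summaries."""
    weights = [1 / var for _, var in summaries]
    pooled = sum(w * eff for w, (eff, _) in zip(weights, summaries)) / sum(weights)
    return pooled, 1 / sum(weights)


summaries = [stage_one(s["treatment"], s["control"]) for s in ipd.values()]
pooled_effect, pooled_var = stage_two(summaries)
print(summaries, pooled_effect, pooled_var)
```

A one-stage analysis would instead fit a single model to all participants at once with a term for study membership, which this sketch does not attempt.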
Statistical models for aggregate data
Direct evidence: Models incorporating study effects only
Fixed effects model
The
 fixed effect model provides a weighted average of a series of study 
estimates. The inverse of the estimates' variance is commonly used as 
study weight, so that larger studies tend to contribute more than 
smaller studies to the weighted average. Consequently, when studies 
within a meta-analysis are dominated by a very large study, the findings
 from smaller studies are practically ignored.
 Most importantly, the fixed effects model assumes that all included 
studies investigate the same population, use the same variable and 
outcome definitions, etc. This assumption is typically unrealistic as 
research is often prone to several sources of heterogeneity; for example, 
treatment effects may differ according to locale, dosage levels, and study 
conditions.
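The dominance of a very large study under inverse-variance weighting can be seen in a small numerical sketch. The effect sizes and variances below are hypothetical, and the function name is illustrative.

```python
def fixed_effect(effects, variances):
    """Inverse-variance weighted average and the normalized study weights."""
    raw = [1 / v for v in variances]
    total = sum(raw)
    weights = [w / total for w in raw]
    pooled = sum(w * y for w, y in zip(weights, effects))
    return pooled, 1 / total, weights


# Hypothetical data: one very precise (large) study and three small ones.
effects = [0.10, 0.45, 0.50, 0.40]
variances = [0.001, 0.04, 0.05, 0.06]
pooled, pooled_var, weights = fixed_effect(effects, variances)
print(round(pooled, 3), [round(w, 3) for w in weights])
```

In this example the first study receives well over 90% of the total weight, so the pooled estimate sits close to its effect size regardless of what the three smaller studies report.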
Random effects model
A
 common model used to synthesize heterogeneous research is the random 
effects model of meta-analysis. This is simply the weighted average of 
the effect sizes of a group of studies. The weight that is applied in 
this process of weighted averaging with a random effects meta-analysis 
is achieved in two steps:
- Step 1: Inverse variance weighting
 - Step 2: Un-weighting of this inverse variance weighting by applying a random effects variance component (REVC) that is simply derived from the extent of variability of the effect sizes of the underlying studies.
 
This means that the greater this variability in effect sizes 
(otherwise known as heterogeneity), the greater the un-weighting and 
this can reach a point when the random effects meta-analysis result 
becomes simply the un-weighted average effect size across the studies. 
At the other extreme, when all effect sizes are similar (or variability 
does not exceed sampling error), no REVC is applied and the random 
effects meta-analysis defaults to simply a fixed effect meta-analysis 
(only inverse variance weighting). 
The extent of this reversal is solely dependent on two factors:
- Heterogeneity of precision
 - Heterogeneity of effect size
 
Since neither of these factors automatically indicates a faulty 
larger study or more reliable smaller studies, the re-distribution of 
weights under this model will not bear a relationship to what these 
studies actually might offer. Indeed, it has been demonstrated that 
redistribution of weights is simply in one direction from larger to 
smaller studies as heterogeneity increases until eventually all studies 
have equal weight and no more redistribution is possible.
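The one-directional redistribution of weights can be illustrated by computing the normalized weights for increasing values of the between-study variance. The sketch below assumes the usual random effects weights of the form 1 / (within-study variance + between-study variance); the variances are hypothetical and the between-study variance is supplied directly rather than estimated.

```python
def random_effects_weights(variances, tau2):
    """Normalized weights 1 / (v_i + tau2) for a given between-study variance."""
    raw = [1 / (v + tau2) for v in variances]
    total = sum(raw)
    return [w / total for w in raw]


variances = [0.001, 0.04, 0.05, 0.06]  # hypothetical within-study variances
for tau2 in (0.0, 0.01, 0.1, 1.0, 100.0):
    shares = random_effects_weights(variances, tau2)
    print(tau2, [round(w, 3) for w in shares])
```

With a between-study variance of zero the weights are the fixed effect weights; as it grows, the share of the most precise study falls steadily and all four shares approach one quarter.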
Another issue with the random effects model is that the most commonly 
used confidence intervals generally do not retain their coverage 
probability above the specified nominal level and thus substantially 
underestimate the statistical error and are potentially
overconfident in their conclusions. Several fixes have been suggested but the debate continues.
 A further concern is that the average treatment effect can sometimes be
 even less conservative compared to the fixed effect model
 and therefore misleading in practice. One interpretational fix that has
 been suggested is to create a prediction interval around the random 
effects estimate to portray the range of possible effects in practice.
 However, an assumption behind the calculation of such a prediction 
interval is that trials are considered more or less homogeneous entities
 and that included patient populations and comparator treatments should 
be considered exchangeable and this is usually unattainable in practice.
The most widely used method to estimate between studies variance (REVC) is the DerSimonian-Laird (DL) approach.
 Several advanced iterative (and computationally expensive) techniques 
for computing the between studies variance exist (such as maximum 
likelihood, profile likelihood and restricted maximum likelihood 
methods) and random effects models using these methods can be run in 
Stata with the metaan command.
 The metaan command must be distinguished from the classic metan (single
 "a") command in Stata that uses the DL estimator. These advanced 
methods have also been implemented in a free and easy to use Microsoft 
Excel add-on, MetaEasy.
 However, a comparison between these advanced methods and the DL method 
of computing the between studies variance demonstrated that there is 
little to gain and DL is quite adequate in most scenarios.
Most meta-analyses, however, include between 2 and 4 studies and 
such a sample is more often than not inadequate to accurately estimate 
heterogeneity. Thus it appears that in small meta-analyses, an incorrect
 zero between study variance estimate is obtained, leading to a false 
homogeneity assumption. Overall, it appears that heterogeneity is being 
consistently underestimated in meta-analyses and sensitivity analyses in
 which high heterogeneity levels are assumed could be informative.
 These random effects models and software packages mentioned above 
relate to study-aggregate meta-analyses and researchers wishing to 
conduct individual patient data (IPD) meta-analyses need to consider 
mixed-effects modelling approaches.
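The DerSimonian-Laird approach mentioned above is a method-of-moments estimator, and a minimal sketch for aggregate data is given here. The function names and the two small data sets are hypothetical; the second data set shows how the estimate is truncated at zero when the observed variability does not exceed what sampling error alone would produce.

```python
def dersimonian_laird_tau2(effects, variances):
    """Method-of-moments estimate of the between-study variance."""
    w = [1 / v for v in variances]
    sw = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sw
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    c = sw - sum(wi ** 2 for wi in w) / sw
    return max(0.0, (q - (len(effects) - 1)) / c)  # truncated at zero


def random_effects_pooled(effects, variances):
    """Pooled estimate, its variance, and the estimated between-study variance."""
    tau2 = dersimonian_laird_tau2(effects, variances)
    w = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * yi for wi, yi in zip(w, effects)) / sum(w)
    return pooled, 1 / sum(w), tau2


# Hypothetical three-study meta-analyses with equal within-study variances.
heterogeneous = random_effects_pooled([0.1, 0.3, 0.5], [0.01, 0.01, 0.01])
homogeneous = random_effects_pooled([0.29, 0.30, 0.31], [0.01, 0.01, 0.01])
print(heterogeneous, homogeneous)
```

With only three studies the estimate rests on very little information, which is consistent with the concern above that small meta-analyses often return a between-study variance of zero.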
IVhet model
Doi
 & Barendregt working in collaboration with Khan, Thalib and 
Williams (from the University of Queensland, University of Southern 
Queensland and Kuwait University), have created an inverse variance 
quasi likelihood based alternative (IVhet) to the random effects (RE) 
model for which details are available online. This was incorporated into MetaXL version 2.0,
 a free Microsoft Excel add-in for meta-analysis produced by Epigear 
International Pty Ltd, and made available on 5 April 2014. The authors 
state that a clear advantage of this model is that it resolves the two 
main problems of the random effects model. The first advantage of the 
IVhet model is that coverage remains at the nominal (usually 95%) level 
for the confidence interval unlike the random effects model which drops 
in coverage with increasing heterogeneity.
 The second advantage is that the IVhet model maintains the inverse 
variance weights of individual studies, unlike the RE model which gives 
small studies more weight (and therefore larger studies less) with 
increasing heterogeneity. When heterogeneity becomes large, the 
individual study weights under the RE model become equal and thus the RE
 model returns an arithmetic mean rather than a weighted average. This 
side-effect of the RE model does not occur with the IVhet model which 
thus differs from the RE model estimate in two respects:
 pooled estimates will favor larger trials (as opposed to penalizing 
larger trials in the RE model) and will have a confidence interval that 
remains within the nominal coverage under uncertainty (heterogeneity). 
Doi & Barendregt suggest that while the RE model provides an 
alternative method of pooling the study data, their simulation results
 demonstrate that using a more specified probability model with 
untenable assumptions, as with the RE model, does not necessarily 
provide better results. The latter study also reports that the IVhet 
model resolves the problems related to underestimation of the 
statistical error, poor coverage of the confidence interval and 
increased MSE seen with the random effects model and the authors 
conclude that researchers should henceforth abandon use of the random 
effects model in meta-analysis. While their data is compelling, the 
ramifications (in terms of the magnitude of spuriously positive results 
within the Cochrane database) are huge and thus accepting this 
conclusion requires careful independent confirmation. The availability 
of free software (MetaXL) that runs the IVhet model (and all other models for comparison) facilitates this for the research community.
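The two properties claimed for the IVhet model can be illustrated with a short sketch. It rests on an assumption about the form of the estimator that is not spelled out in the text above: the point estimate uses the normalized inverse-variance weights, and the variance is the sum of the squared weights times (within-study variance + between-study variance). The between-study variance is passed in directly (in practice it might come from a DerSimonian-Laird estimate), and the data and function name are hypothetical.

```python
def ivhet(effects, variances, tau2):
    """Sketch of an IVhet-style estimate for a given between-study variance.

    Assumption: fixed-effect weights give the point estimate, and the variance
    is sum(w_i**2 * (v_i + tau2)) with w_i the normalized weights.
    """
    raw = [1 / v for v in variances]
    total = sum(raw)
    weights = [r / total for r in raw]
    pooled = sum(w * y for w, y in zip(weights, effects))
    var = sum(w ** 2 * (v + tau2) for w, v in zip(weights, variances))
    return pooled, var


effects = [0.10, 0.45, 0.50, 0.40]      # hypothetical effect sizes
variances = [0.001, 0.04, 0.05, 0.06]   # hypothetical within-study variances
print(ivhet(effects, variances, tau2=0.0))   # variance equals 1 / sum(1 / v_i)
print(ivhet(effects, variances, tau2=0.02))  # same estimate, larger variance
```

Under this assumed form, heterogeneity widens the confidence interval without moving weight from larger to smaller studies, which is the behavior the authors describe.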
Direct evidence: Models incorporating additional information
Quality effects model
Doi and Thalib originally introduced the quality effects model. They
 introduced a new approach to adjustment for inter-study variability by 
incorporating the contribution of variance due to a relevant component 
(quality) in addition to the contribution of variance due to random 
error that is used in any fixed effects meta-analysis model to generate 
weights for each study. The strength of the quality effects 
meta-analysis is that it allows available methodological evidence to be 
used over subjective random effects, and thereby helps to close the 
damaging gap which has opened up between methodology and statistics in 
clinical research. To do this a synthetic bias variance is computed 
based on quality information to adjust inverse variance weights and the 
quality adjusted weight of the ith study is introduced. These adjusted weights are then used in meta-analysis. In other words, if study i
 is of good quality and other studies are of poor quality, a proportion 
of their quality adjusted weights is mathematically redistributed to 
study i giving it more weight towards the overall effect size. As
 studies become increasingly similar in terms of quality, 
re-distribution becomes progressively less and ceases when all studies 
are of equal quality (in the case of equal quality, the quality effects 
model defaults to the IVhet model – see previous section). A recent 
evaluation of the quality effects model (with some updates) demonstrates
 that despite the subjectivity of quality assessment, the performance 
(MSE and true variance under simulation) is superior to that achievable 
with the random effects model.
 This model thus replaces the untenable interpretations that abound in 
the literature and software is available to explore this method 
further.
Indirect evidence: Network meta-analysis methods
A
 network meta-analysis looks at indirect comparisons. In the image, A 
has been analyzed in relation to C and C has been analyzed in relation 
to B. However, the relation between A and B is only known indirectly, and
 a network meta-analysis looks at such indirect evidence of differences 
between methods and interventions using statistical methods.
Indirect comparison meta-analysis methods (also called network 
meta-analyses, in particular when multiple treatments are assessed 
simultaneously) generally use two main methodologies. The first is the 
Bucher method
 which is a single or repeated comparison of a closed loop of 
three-treatments such that one of them is common to the two studies and 
forms the node where the loop begins and ends. Therefore, multiple 
two-by-two comparisons (3-treatment loops) are needed to compare 
multiple treatments. Because independent pair-wise comparisons are 
required, only two arms can be selected from trials with more than two 
arms. The alternative methodology uses complex statistical modelling
 to include the multiple arm trials and comparisons simultaneously 
between all competing treatments. These have been executed using 
Bayesian methods, mixed linear models and meta-regression approaches.
Bayesian framework
Specifying a Bayesian network meta-analysis model involves writing a directed acyclic graph (DAG) model for general-purpose Markov chain Monte Carlo (MCMC) software such as WinBUGS.
 In addition, prior distributions have to be specified for a number of 
the parameters, and the data have to be supplied in a specific format.
 Together, the DAG, priors, and data form a Bayesian hierarchical model.
 To complicate matters further, because of the nature of MCMC 
estimation, overdispersed starting values have to be chosen for a number
 of independent chains so that convergence can be assessed.
 Currently, there is no software that automatically generates such 
models, although there are some tools to aid in the process. The 
complexity of the Bayesian approach has limited usage of this 
methodology. Methodology for automation of this method has been 
suggested
 but requires that arm-level outcome data are available, and this is 
usually unavailable. Great claims are sometimes made for the inherent 
ability of the Bayesian framework to handle network meta-analysis and 
its greater flexibility. However, this choice of implementation of 
framework for inference, Bayesian or frequentist, may be less important 
than other choices regarding the modeling of effects (see discussion on models above).
Frequentist multivariate framework
On
 the other hand, the frequentist multivariate methods involve 
approximations and assumptions that are not stated explicitly or 
verified when the methods are applied (see discussion on meta-analysis 
models above). For example, the mvmeta package for Stata enables network
 meta-analysis in a frequentist framework.
 However, if there is no common comparator in the network, then this has
 to be handled by augmenting the dataset with fictional arms with high 
variance, which is not very objective and requires a decision as to what
 constitutes a sufficiently high variance.
 The other issue is use of the random effects model in both this 
frequentist framework and the Bayesian framework. Senn advises analysts 
to be cautious about interpreting the 'random effects' analysis since 
only one random effect is allowed for but one could envisage many.
 Senn goes on to say that it is rather naïve, even in the case where 
only two treatments are being compared, to assume that random-effects 
analysis accounts for all
uncertainty about the way effects can vary from trial to trial. Newer 
models of meta-analysis such as those discussed above would certainly 
help alleviate this situation and have been implemented in the next 
framework.
Generalized pairwise modelling framework
An
 approach that has been tried since the late 1990s is the implementation
 of the multiple three-treatment closed-loop analysis. This has not been
 popular because the process rapidly becomes overwhelming as network 
complexity increases. Development in this area was then abandoned in 
favor of the Bayesian and multivariate frequentist methods which emerged
 as alternatives. Very recently, automation of the three-treatment 
closed loop method has been developed for complex networks by some 
researchers
 as a way to make this methodology available to the mainstream research 
community. This proposal does restrict each trial to two interventions, 
but also introduces a workaround for multiple arm trials: a different 
fixed control node can be selected in different runs. It also utilizes 
robust meta-analysis methods so that many of the problems highlighted 
above are avoided. Further research around this framework is required to
 determine if this is indeed superior to the Bayesian or multivariate 
frequentist frameworks. Researchers willing to try this out have access 
to this framework through free software.
Tailored meta-analysis
Another
 form of additional information comes from the intended setting. If the 
target setting for applying the meta-analysis results is known then it 
may be possible to use data from the setting to tailor the results thus 
producing a ‘tailored meta-analysis’.
 This has been used in test accuracy meta-analyses, where empirical 
knowledge of the test positive rate and the prevalence have been used to
 derive a region in Receiver Operating Characteristic
 (ROC) space known as an ‘applicable region’. Studies are then selected 
for the target setting based on comparison with this region and 
aggregated to produce a summary estimate which is tailored to the target
 setting.
Validation of meta-analysis results
The meta-analysis estimate represents a weighted average across studies and when there is heterogeneity
 this may result in the summary estimate not being representative of 
individual studies. Qualitative appraisal of the primary studies using 
established tools can uncover potential biases,
 but does not quantify the aggregate effect of these biases on the 
summary estimate. Although the meta-analysis result could be compared 
with an independent prospective primary study, such external validation 
is often impractical. This has led to the development of methods that 
exploit a form of leave-one-out cross validation, sometimes referred to as internal-external cross validation (IOCV).
 Here each of the k included studies in turn is omitted and compared 
with the summary estimate derived from aggregating the remaining k − 1 
studies. A general validation statistic, Vn, based on IOCV has been developed to measure the statistical validity of meta-analysis results.
 For test accuracy and prediction, particularly when there are 
multivariate effects, other approaches which seek to estimate the 
prediction error have also been proposed.
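The leave-one-out procedure described above can be sketched as follows. This is not the Vn statistic itself, whose definition is not given here; it only shows the mechanics of omitting each study in turn and comparing it with the pooled estimate of the rest. Inverse-variance pooling, the standardized difference used for the comparison, the function names and the data are all assumptions made for illustration.

```python
def pool(effects, variances):
    """Inverse-variance weighted average and its variance."""
    w = [1 / v for v in variances]
    return sum(wi * yi for wi, yi in zip(w, effects)) / sum(w), 1 / sum(w)


def leave_one_out(effects, variances):
    """Compare each omitted study with the pooled estimate of the other k - 1."""
    results = []
    for i in range(len(effects)):
        rest_eff = effects[:i] + effects[i + 1:]
        rest_var = variances[:i] + variances[i + 1:]
        pooled, pooled_var = pool(rest_eff, rest_var)
        # Standardized difference between the omitted study and the rest.
        z = (effects[i] - pooled) / (variances[i] + pooled_var) ** 0.5
        results.append((i, pooled, z))
    return results


effects = [0.20, 0.25, 0.30, 0.90]     # hypothetical; the last study is an outlier
variances = [0.01, 0.02, 0.015, 0.02]
for i, pooled, z in leave_one_out(effects, variances):
    print(i, round(pooled, 3), round(z, 2))
```

A study whose standardized difference is large in absolute value is poorly predicted by the remaining studies, which is the kind of discrepancy a validation statistic would summarize.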
Challenges
A meta-analysis of several small studies does not always predict the results of a single large study.
 Some have argued that a weakness of the method is that sources of bias 
are not controlled by the method: a good meta-analysis cannot correct 
for poor design or bias in the original studies.
 This would mean that only methodologically sound studies should be 
included in a meta-analysis, a practice called 'best evidence 
synthesis'.
 Other meta-analysts would include weaker studies, and add a study-level
 predictor variable that reflects the methodological quality of the 
studies to examine the effect of study quality on the effect size.
 However, others have argued that a better approach is to preserve 
information about the variance in the study sample, casting as wide a 
net as possible, and that methodological selection criteria introduce 
unwanted subjectivity, defeating the purpose of the approach.
Publication bias: the file drawer problem
A
 funnel plot expected without the file drawer problem. The largest 
studies converge at the tip while smaller studies show more or less 
symmetrical scatter at the base
A
 funnel plot expected with the file drawer problem. The largest studies 
still cluster around the tip, but the bias against publishing negative 
studies has caused the smaller studies as a whole to have an 
unjustifiably favorable result to the hypothesis
Another potential pitfall is the reliance on the available body of 
published studies, which may create exaggerated outcomes due to publication bias, as studies which show negative results or non-significant
 results are less likely to be published. For example, pharmaceutical 
companies have been known to hide negative studies and researchers may 
have overlooked unpublished studies such as dissertation studies or 
conference abstracts that did not reach publication. This is not easily 
solved, as one cannot know how many studies have gone unreported.
This file drawer problem
 (characterized by negative or non-significant results being tucked away
 in a cabinet), can result in a biased distribution of effect sizes thus
 creating a serious base rate fallacy,
 in which the significance of the published studies is overestimated, as
 other studies were either not submitted for publication or were 
rejected. This should be seriously considered when interpreting the 
outcomes of a meta-analysis.
The distribution of effect sizes can be visualized with a funnel plot
 which (in its most common version) is a scatter plot of standard error 
versus the effect size. It makes use of the fact that the smaller 
studies (thus larger standard errors) have more scatter of the magnitude
 of effect (being less precise) while the larger studies have less 
scatter and form the tip of the funnel. If many negative studies were 
not published, the remaining positive studies give rise to a funnel plot
 in which the base is skewed to one side (asymmetry of the funnel plot).
 In contrast, when there is no publication bias, the effect of the 
smaller studies has no reason to be skewed to one side and so a 
symmetric funnel plot results. This also means that if no publication 
bias is present, there would be no relationship between standard error 
and effect size.
 A negative or positive relation between standard error and effect size 
would imply that smaller studies that found effects in one direction 
only were more likely to be published and/or to be submitted for 
publication.
Apart from the visual funnel plot, statistical methods for 
detecting publication bias have also been proposed. These are 
controversial because they typically have low power for detection of 
bias, but may also produce false positives under some circumstances.
 For instance small study effects (biased smaller studies), wherein 
methodological differences between smaller and larger studies exist, may
 cause asymmetry in effect sizes that resembles publication bias. 
However, small study effects may be just as problematic for the 
interpretation of meta-analyses, and the onus is on meta-analytic 
authors to investigate potential sources of bias. 
A Tandem Method for analyzing publication bias has been suggested for cutting down false positive error problems.
 This Tandem method consists of three stages. Firstly, one calculates 
Orwin's fail-safe N, to check how many studies should be added in order 
to reduce the test statistic to a trivial size. If this number of 
studies is larger than the number of studies used in the meta-analysis, 
it is a sign that there is no publication bias, as in that case, one 
needs a lot of studies to reduce the effect size. Secondly, one can do 
an Egger's regression test, which tests whether the funnel plot is 
symmetrical. As mentioned before: a symmetrical funnel plot is a sign 
that there is no publication bias, as the effect size and sample size 
are not dependent. Thirdly, one can do the trim-and-fill method, which 
imputes data if the funnel plot is asymmetrical. 
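The first two stages can be sketched numerically. The fail-safe N function below follows from asking how many added studies with a given average effect would pull the mean effect down to a chosen trivial value. The regression function fits an ordinary least squares line of the standardized effect (effect divided by standard error) on precision (one over standard error) and returns the intercept, which is one common formulation of Egger's test. The significance test of the intercept and the trim-and-fill stage are omitted, and all names and numbers are hypothetical.

```python
def orwin_fail_safe_n(k, mean_effect, trivial_effect, missing_effect=0.0):
    """Number of added studies (averaging missing_effect) needed to pull the
    mean effect of k studies down to trivial_effect."""
    return k * (mean_effect - trivial_effect) / (trivial_effect - missing_effect)


def egger_intercept(effects, std_errors):
    """Ordinary least squares of effect / SE on 1 / SE; returns (intercept, slope).

    An intercept far from zero suggests funnel-plot asymmetry.
    """
    x = [1 / se for se in std_errors]
    y = [eff / se for eff, se in zip(effects, std_errors)]
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return my - slope * mx, slope


std_errors = [0.05, 0.10, 0.20, 0.30, 0.40]          # hypothetical
symmetric = [0.3 for _ in std_errors]                # no small-study trend
asymmetric = [0.3 + 1.5 * se for se in std_errors]   # smaller studies look better
print(orwin_fail_safe_n(k=10, mean_effect=0.5, trivial_effect=0.1))
print(egger_intercept(symmetric, std_errors))
print(egger_intercept(asymmetric, std_errors))
```

In the symmetric example the intercept is essentially zero, while in the asymmetric example it recovers the size of the built-in small-study trend.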
The problem of publication bias is not trivial as it is suggested
 that 25% of meta-analyses in the psychological sciences may have 
suffered from publication bias.
 However, low power of existing tests and problems with the visual 
appearance of the funnel plot remain an issue, and estimates of 
publication bias may remain lower than what truly exists. 
Most discussions of publication bias focus on journal practices 
favoring publication of statistically significant findings. However, 
questionable research practices, such as reworking statistical models 
until significance is achieved, may also favor statistically significant
 findings in support of researchers' hypotheses.
Studies often do not report the effects when they do not reach statistical significance.
 For example, they may simply say that the groups did not show 
statistically significant differences, without reporting any other 
information (e.g. a statistic or p-value). Exclusion of these studies 
would lead to a situation similar to publication bias, but their 
inclusion (assuming null effects) would also bias the meta-analysis. 
MetaNSUE, a method created by Joaquim Radua, has been shown to allow researchers to include these studies without bias. Its steps are as follows:
- Maximum likelihood estimation of the meta-analytic effect and the heterogeneity between studies.
- Multiple imputation of the non-statistically-significant unreported effects (NSUEs), adding noise to the estimate of the effect.
- Separate meta-analyses for each imputed dataset.
- Pooling of the results of these meta-analyses.
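The final pooling step can be illustrated with Rubin's rules, a standard way of combining estimates across multiply imputed datasets. That MetaNSUE pools in exactly this way is an assumption here, and the function name and numbers are hypothetical; the sketch covers only the pooling step, not the estimation or imputation steps.

```python
import statistics

def pool_rubin(estimates, variances):
    """Pool per-imputation estimates with Rubin's rules (hypothetical helper).

    estimates: pooled effect from the meta-analysis of each imputed dataset
    variances: squared standard error of each of those pooled effects
    """
    m = len(estimates)
    pooled = statistics.fmean(estimates)
    within = statistics.fmean(variances)            # average within-imputation variance
    between = statistics.variance(estimates) if m > 1 else 0.0  # between-imputation variance
    total = within + (1 + 1 / m) * between          # total variance of the pooled estimate
    return pooled, total

# Two imputed datasets with made-up meta-analytic results.
pooled, total_var = pool_rubin([0.2, 0.4], [0.01, 0.03])
```

The between-imputation term widens the uncertainty to reflect that the unreported effects were imputed rather than observed.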
 
Other weaknesses are that it has not been determined whether the statistically most accurate method for combining results is the fixed, IVhet, random or quality effect model, though criticism of the random effects model is mounting because of the perception that the new random effects (used in meta-analysis) are essentially formal devices to facilitate smoothing or shrinkage, and prediction may be impossible or ill-advised. The main problem with the random effects approach is that it uses the classic statistical idea of generating a "compromise estimator" whose weights are close to those of the naturally weighted estimator if heterogeneity across studies is large, but close to those of the inverse variance weighted estimator if the between-study heterogeneity is small. However, what has been ignored is the distinction between the model we choose to analyze a given dataset and the mechanism by which the data came into being. A random effect can be present in either of these roles, but the two roles are quite distinct. There is no reason to think the analysis model and the data-generation mechanism (model) are similar in form, but many sub-fields of statistics have developed the habit of assuming, for theory and simulations, that the data-generation mechanism (model) is identical to the analysis model we choose (or would like others to choose). As a hypothesized mechanism for producing the data, the random effect model for meta-analysis is implausible, and it is more appropriate to think of this model as a superficial description and something we choose as an analytical tool. This choice may not work for meta-analysis, however, because the study effects are a fixed feature of the respective meta-analysis and the probability distribution is only a descriptive tool.
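The "compromise" behaviour of random effects weights can be seen in a small sketch. It assumes the conventional random effects weight 1 / (v_i + tau2), where v_i is a study's within-study variance and tau2 the between-study variance, and it reads "naturally weighted" as equal weighting; both the function name and the numbers are hypothetical.

```python
def random_effects_weights(variances, tau2):
    """Normalized random effects weights 1 / (v_i + tau2) (hypothetical helper)."""
    raw = [1.0 / (v + tau2) for v in variances]
    total = sum(raw)
    return [w / total for w in raw]

variances = [1.0, 4.0]  # made-up within-study variances

# No between-study heterogeneity: weights equal the inverse variance weights.
print(random_effects_weights(variances, tau2=0.0))

# Very large heterogeneity: weights approach equal weighting.
print(random_effects_weights(variances, tau2=1e9))
```

With tau2 = 0 the precise study dominates; as tau2 grows, study size matters less and less to the pooled estimate.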
Problems arising from agenda-driven bias
The most severe fault in meta-analysis often occurs when the person or persons doing the meta-analysis have an economic, social, or political agenda such as the passage or defeat of legislation. People with these types of agendas may be more likely to abuse meta-analysis due to personal bias. For example, researchers favorable to the author's agenda are likely to have their studies cherry-picked
 while those not favorable will be ignored or labeled as "not credible".
 In addition, the favored authors may themselves be biased or paid to 
produce results that support their overall political, social, or 
economic goals in ways such as selecting small favorable data sets and 
not incorporating larger unfavorable data sets. The influence of such 
biases on the results of a meta-analysis is possible because the 
methodology of meta-analysis is highly malleable.
A 2011 study done to disclose possible conflicts of interest in underlying research studies used for medical meta-analyses reviewed 29 meta-analyses and found that conflicts of interest in the studies underlying the meta-analyses were rarely disclosed. The 29 meta-analyses included 11 from general medicine journals, 15 from specialty medicine journals, and three from the Cochrane Database of Systematic Reviews. The 29 meta-analyses reviewed a total of 509 randomized controlled trials (RCTs). Of these, 318 RCTs reported funding sources, with 219 (69%) receiving funding from industry. Of the 509 RCTs, 132 reported author conflict of interest disclosures, with 91 studies (69%) disclosing one or more authors having financial ties to industry. The information was, however, seldom reflected in the meta-analyses. Only two (7%) reported RCT funding sources and none reported RCT author-industry ties. The authors concluded "without acknowledgment of COI due to industry funding or author industry financial ties from RCTs included in meta-analyses, readers' understanding and appraisal of the evidence from the meta-analysis may be compromised."
For example, in 1998, a US federal judge found that the United States Environmental Protection Agency
 had abused the meta-analysis process to produce a study claiming cancer
 risks to non-smokers from environmental tobacco smoke (ETS) with the 
intent to influence policy makers to pass smoke-free–workplace laws. The
 judge found that:
EPA's study selection is disturbing. First, there is evidence in the record supporting the accusation that EPA "cherry picked" its data. Without criteria for pooling studies into a meta-analysis, the court cannot determine whether the exclusion of studies likely to disprove EPA's a priori hypothesis was coincidence or intentional. Second, EPA's excluding nearly half of the available studies directly conflicts with EPA's purported purpose for analyzing the epidemiological studies and conflicts with EPA's Risk Assessment Guidelines. See ETS Risk Assessment at 4-29 ("These data should also be examined in the interest of weighing all the available evidence, as recommended by EPA's carcinogen risk assessment guidelines (U.S. EPA, 1986a)" (emphasis added)). Third, EPA's selective use of data conflicts with the Radon Research Act. The Act states EPA's program shall "gather data and information on all aspects of indoor air quality" (Radon Research Act § 403(a)(1)) (emphasis added).
As a result of the abuse, the court vacated Chapters 1–6 of, and the
Appendices to, EPA's "Respiratory Health Effects of Passive Smoking: Lung
Cancer and other Disorders".
Applications in modern science
Modern statistical meta-analysis does more than just combine the effect sizes of a set of studies using a weighted average. It can test whether the outcomes of studies vary more than would be expected from the sampling of different numbers of research participants alone. Additionally, study characteristics such as the measurement instrument used, the population sampled, or aspects of the studies' design can be coded and used to reduce the variance of the estimator (see statistical models above). Thus some methodological weaknesses in studies can be corrected statistically. Other uses of meta-analytic methods include the development and validation of clinical prediction models, where meta-analysis may be used to combine individual participant data from different research centers and to assess the model's generalisability, or even to aggregate existing prediction models.
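One common way to test for excess variation between studies is Cochran's Q statistic, often reported alongside the I² index. The text above does not name a specific test, so treating Q as the test in question is an assumption, and the function name and numbers below are hypothetical.

```python
def cochran_q(effects, variances):
    """Return (Q, degrees of freedom, I-squared) for a set of studies.

    Hypothetical helper: Q is the weighted sum of squared deviations of the
    study effects from the inverse variance pooled effect.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    q = sum(w * (e - pooled) ** 2 for w, e in zip(weights, effects))
    df = len(effects) - 1
    # I-squared: share of total variation attributed to heterogeneity, floored at 0.
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return q, df, i2

# Two made-up studies with equal variances but different effects.
q, df, i2 = cochran_q([0.0, 1.0], [0.25, 0.25])
```

Under the hypothesis of no heterogeneity, Q approximately follows a chi-squared distribution with `df` degrees of freedom, so a Q much larger than `df` suggests more variation than sampling error alone would produce.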
Meta-analysis can be done with single-subject designs as well as group research designs. This is important because much research has been done with single-subject research designs. Considerable dispute exists over the most appropriate meta-analytic technique for single-subject research.
Meta-analysis leads to a shift of emphasis from single studies to
 multiple studies. It emphasizes the practical importance of the effect 
size instead of the statistical significance of individual studies. This
 shift in thinking has been termed "meta-analytic thinking". The results
 of a meta-analysis are often shown in a forest plot. 
Results from studies are combined using different approaches. One approach frequently used in meta-analysis in health care research is termed the 'inverse variance method'. The average effect size across all studies is computed as a weighted mean, whereby the weights are equal to the inverse variance of each study's effect estimator. Larger studies and studies with less random variation are given greater weight than smaller studies. Other common approaches include the Mantel–Haenszel method and the Peto method.
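A minimal sketch of the inverse variance method described above follows. It implements the fixed effect form, in which the pooled standard error is the square root of one over the sum of the weights; the function name and the numbers are hypothetical.

```python
import math

def inverse_variance_pool(effects, variances):
    """Return (pooled effect, standard error) using inverse variance weights."""
    weights = [1.0 / v for v in variances]   # more precise studies get more weight
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, effects)) / total
    se = math.sqrt(1.0 / total)              # fixed effect standard error
    return pooled, se

# Made-up example: the first study is four times as precise as the second,
# so the pooled effect sits closer to its estimate.
pooled, se = inverse_variance_pool([0.0, 1.0], [1.0, 4.0])
```

Here the weights are 1.0 and 0.25, so the pooled effect is 0.25 / 1.25 = 0.2 rather than the unweighted mean of 0.5.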
Seed-based d mapping
 (formerly signed differential mapping, SDM) is a statistical technique
for meta-analyzing studies on differences in brain activity or structure
 that used neuroimaging techniques such as fMRI, VBM or PET.
Different high-throughput techniques such as microarrays have been used to understand gene expression. MicroRNA expression profiles have been used to identify differentially expressed microRNAs in a particular cell or tissue type or disease condition, or to check the effect of a treatment. A meta-analysis of such expression profiles was performed to derive novel conclusions and to validate known findings.