Monday, December 23, 2013

Scientific Groupthink and a Summary of the Evidence that most Published Research is False

If these articles are both true, then are Americans right to distrust scientists and their claims?

http://www.american.com/archive/2013/december/scientific-groupthink-and-gay-parenting

Part One: Scientific Groupthink

 
Wednesday, December 18, 2013
The controversy over a recent study on gay parenting illustrates a sociopolitical groupthink operating in the social scientific community. Scientists should go where the science takes them, not where their politics does.

University of Texas sociology professor Mark Regnerus’s study, “How Different Are the Adult Children of Parents Who Have Same-Sex Relationships? Findings from the New Family Structures Study,” published in the academic journal Social Science Research last year, caused a firestorm in the scientific community. Unlike most previous studies, Regnerus found that children of parents who had experienced a same-sex relationship fared worse than children of heterosexual parents on measures of social, emotional, and psychological adjustment as well as educational attainment, employment history, need for public assistance, substance abuse, and criminal justice system involvement.

The reaction to the Regnerus study was swift and harsh. Many of his academic colleagues said it was fatally flawed. Many questioned the motives of the author, reviewers, and journal editor. Did they have an anti-gay political agenda?

The controversy illustrates how tougher standards for assessing scientific worth are applied if a study produces results that are inconsistent with the scientists’ own political views. Suppose Regnerus had conducted an identical study, with the same methodological flaws, that had produced results consistent with previous studies, finding no differences between the children of gay or lesbian ("lesbigay") versus heterosexual parents. Would this one study (among the over 60 studies on lesbigay parenting) receive the same criticism, or any criticism at all, from the academic community? Would 201 scholars send a letter to the journal objecting to its publication of the study? Would the author’s former department chair publish an op-ed saying that she was “furious” about her junior colleague’s “pseudo-science”? Would academics make allegations in blogs and other forums about the integrity of the author, journal editor, and editorial review process?1 Would the professor’s university subject him to an intrusive investigation for possible scientific misconduct (of which it found no evidence)? And would similar attacks have been launched against other researchers who dared to question the scholarly consensus?2
This is not the first time that science has clashed with politics. The Bell Curve, a book about the heritability of intelligence and the resulting libertarian or conservative policy implications, created great controversy. The Regnerus case unfolded similarly to the controversy surrounding the publication of a meta-analysis of child sexual abuse studies that was published in the journal Psychological Bulletin and reported that childhood sexual abuse often caused few long-lasting psychological effects. The article caused outrage. The study was attacked as substandard, and many questioned the authors’ motives and alleged scientific misconduct.

Most would acknowledge that science, particularly policy-relevant social science, is often politicized. The Regnerus controversy illustrates that scientists’ sociopolitical views frequently affect the kind of science that is conducted on policy-relevant questions, how findings are interpreted and received, and the degree of critical scrutiny such studies receive.

Scientific Groupthink
“If when a study yields an unpopular conclusion it is subjected to greater scrutiny, and more effort is expended towards its refutation, an obvious bias to ‘find what the community is looking for’ will have been introduced.”3

The Regnerus case illustrates a sociopolitical groupthink operating in the social scientific community. Surveys of the professoriate consistently find faculties to be quite lopsidedly liberal. The political imbalance is particularly acute in the social sciences, with liberal-conservative ratios between 8:1 and 30:1 in most disciplines, and it is especially pronounced on social issues like gay marriage.

Such homogeneity of sociopolitical views among social scientists almost invariably leads to “groupthink,” a phenomenon that occurs when group members have relatively homogeneous backgrounds or ideological views. With this groupthink comes self-censorship and pressure on dissenters, the negative stereotyping and discounting of conservative perspectives, and a failure to consider conservative-friendly (as compared with liberal-friendly) question framing and data interpretation. A recent national survey of psychology professors found that one in four reported that they would be less likely to give a positive recommendation on a journal manuscript or grant application having a conservative perspective, and one in six would be less likely to invite conservative colleagues to participate in a symposium. In sociology, University of Notre Dame sociology professor Christian Smith notes that:
 
The temptation . . . to advance a political agenda is too often indulged in sociology, especially by activist faculty in certain fields, like marriage, family, sex, and gender ... Research programs that advance narrow agendas compatible with particular ideologies are privileged ... the influence of progressive orthodoxy in sociology is evident in decisions made by graduate students, junior faculty, and even senior faculty about what, why, and how to research, publish, and teach ... The result is predictable: Play it politically safe, avoid controversial questions, publish the right conclusions.

Regnerus did not, however, play it safe. He did not publish the right conclusions on a politically controversial topic. Politically correct sociologists, on the other hand, enjoy certain privileges in a very politically conscious and liberal discipline. Indeed, there sometimes is the belief “that social science should be an instrument for social change and thus should promote the ‘correct’ values and ideological positions.”4

No wonder there is so little research by academics that arguably supports conservative policy perspectives. When such research is published, the Regnerus controversy illustrates how it may be received. Critics used the liberal norms and privileges of their discipline to marginalize the Regnerus study. A point-by-point methodological comparison of the Regnerus study alongside previous lesbigay parenting studies reveals the selective scrutiny applied by the critics of the Regnerus study.5

Ideological Diversity Is the Antidote
“No one knows how many research programs [social scientists] have failed to launch, or how many research discoveries they have failed to make, as a result of the skew in the distribution of [political] views within their discipline.”6

Contrary to the critics’ concerns about the political conservatism of Regnerus and his funders, the Regnerus study illustrates the value of ideological diversity among both researchers and funders. The allegedly conservative researcher Regnerus, funded by advocacy organizations opposing gay marriage, conducted a study producing findings useful to gay marriage opponents. Many previous studies were conducted and/or funded by those favoring gay marriage, and they produced findings useful to the gay-marriage cause.
It is not surprising, nor is it indicative of nefarious scientific misconduct, that researchers of different ideological persuasions would produce findings consistent with their own ideology. It is human nature to frame research questions and interpret findings in ways that confirm one’s political beliefs. Such biases are the norm, even among scientists. This is particularly true when it comes to research on social issues because social scientists, many of whom were attracted to social science because of its progressive ideology, often have values invested in the issues they research. One can find such ideological tilt throughout social science research. For instance, how researchers interpret data on the relative contributions of hereditary factors versus environment to intelligence, or on biological factors in personality styles, seems to be partly a function of their political views.

Politics inevitably enter into the scientific endeavor as a consequence of the sociopolitical, parochial, financial, or career interests of researchers, funders, and professional organizations as well as those of the larger scientific community and polity. Scientists’ values and interests influence how they define and conceptualize social and behavioral issues, the data collection and analysis methods chosen, how results are interpreted, how scientists scrutinize and evaluate a study’s quality, and whether there are incentives or disincentives to advance research findings in policy advocacy.

Because biases are endemic to the scientific enterprise, the Regnerus case illustrates how research conducted or funded by those outside the sociopolitical mainstream, insofar as social scientists are concerned, may be the only way that “politically incorrect” research challenging the scientific consensus gets done. Theoretical or ideological homogeneity among researchers tends to produce myopic, one-sided research, whereas ideological diversity fosters a more dynamic climate that encourages unorthodox, diverse (and sometimes politically incorrect) research. Not only do those in the political minority bring diverse perspectives to the research endeavor, but their very presence has the effect of widening perspective and reducing bias in the rest of the scientific community. If social scientists were embedded in ideologically diverse networks of other scientists, they would be more likely to consider and test alternative hypotheses and perspectives on the social issues they research.

Science and Scientists in the Policy Debate
“Social scientists are never more revealing of themselves than when challenging the objectivity of one another’s work. In some fields almost any study is assumed to have a more or less discoverable political purpose.”7

Especially with controversies like the Regnerus study, it is no wonder that policymakers of all political persuasions are often skeptical about policy research coming from the academy, or that conservatives’ trust in science has dipped to an all-time low. This is what happens when policy-relevant research fails to be politically inclusive because virtually everyone funding and doing the research comes from the same political perspective.
Indeed, scientists who do research on policy issues arguably have an obligation to inform policymakers and the public about their research findings. But it is dangerous for science, policymaking, and the public’s trust in science when scientists are encouraged to do so only when the science supports liberal positions but are discouraged from doing so, or risk disapprobation from their colleagues, when the findings do not. Sadly, this is often the case. Scientists should go where the science takes them, not where their politics does. To attack a study based on the political incorrectness of its findings or its author’s and funder’s politics is scientifically irrelevant and ad hominem.
Rather, studies must stand or fall on the weight of their methodological reliability and validity.

Part Two:  A summary of the evidence that most published research is false


One of the hottest topics in science has two main conclusions:
  • Most published research is false
  • There is a reproducibility crisis in science
The first claim is often stated in a slightly different way: that most results of scientific experiments do not replicate. I recently got caught up in this debate and I frequently get asked about it.
So I thought I'd do a very brief review of the reported evidence for the two perceived crises. An important point is that all of the scientists below have made the best effort they can to tackle a fairly complicated problem, and it is still early days in the study of science-wise false discovery rates. But the take-home message is that there is currently no definitive evidence one way or another about whether most results are false.
  1. Paper: Why most published research findings are false
     Main idea: People use hypothesis testing to determine if specific scientific discoveries are significant. This significance calculation is used as a screening mechanism in the scientific literature. Under assumptions about the way people perform these tests and report them it is possible to construct a universe where most published findings are false positive results.
     Important drawback: The paper contains no real data, it is purely based on conjecture and simulation.
  2. Paper: Drug development: Raise standards for preclinical research
     Main idea: Many drugs fail when they move through the development process. Amgen scientists tried to replicate 53 high-profile basic research findings in cancer and could only replicate 6.
     Important drawback: This is not a scientific paper. The study design, replication attempts, selected studies, and the statistical methods to define "replicate" are not defined. No data is available or provided.
  3. Paper: An estimate of the science-wise false discovery rate and application to the top medical literature
     Main idea: The paper collects P-values from published abstracts of papers in the medical literature and uses a statistical method to estimate the false discovery rate proposed in paper 1 above.
     Important drawback: The paper only collected data from major medical journals and the abstracts. P-values can be manipulated in many ways that could call into question the statistical results in the paper.
  4. Paper: Revised standards for statistical evidence
     Main idea: The P-value cutoff of 0.05 is used by many journals to determine statistical significance. This paper proposes an alternative method for screening hypotheses based on Bayes factors.
     Important drawback: The paper is a theoretical and philosophical argument for simple hypothesis tests. The data analysis recalculates Bayes factors for reported t-statistics and plots the Bayes factor versus the t-test then makes an argument for why one is better than the other.
  5. Paper: Contradicted and initially stronger effects in highly cited research
     Main idea: This paper looks at studies that attempted to answer the same scientific question where the second study had a larger sample size or more robust (e.g. randomized trial) study design. Some effects reported in the second study do not exactly match the results from the first.
     Important drawback: The title does not match the results. 16% of studies were contradicted (meaning effect in a different direction), 16% reported a smaller effect size, 44% were replicated, and 24% were unchallenged. So 44% + 24% + 16% = 84% were not contradicted. Lack of replication is also not proof of error.
  6. Paper: Modeling the effects of subjective and objective decision making in scientific peer review
     Main idea: This paper considers a theoretical model for how referees of scientific papers may behave socially. They use simulations to point out how an effect called "herding" (basically peer-mimicking) may lead to biases in the review process.
     Important drawback: The model makes major simplifying assumptions about human behavior and supports these conclusions entirely with simulation. No data is presented.
  7. Paper: Repeatability of published microarray gene expression analyses
     Main idea: This paper attempts to collect the data used in published papers and to repeat one randomly selected analysis from the paper. For many of the papers the data was either not available or available in a format that made it difficult/impossible to repeat the analysis performed in the original paper. The types of software used were also not clear.
     Important drawback: This paper was written about 18 data sets in 2005-2006. This is both early in the era of reproducibility and not comprehensive in any way. This says nothing about the rate of false discoveries in the medical literature but does speak to the reproducibility of genomics experiments 10 years ago.
  8. Paper: Investigating variation in replicability: The "Many Labs" replication project. (not yet published)
     Main idea: The idea is to take a bunch of published high-profile results and try to get multiple labs to replicate the results. They successfully replicated 10 out of 13 results and the distribution of results you see is about what you'd expect (see embedded figure below).
     Important drawback: The paper isn't published yet and it only covers 13 experiments. That being said, this is by far the strongest, most comprehensive, and most reproducible analysis of replication among all the papers surveyed here.
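The claim in paper 1, that assumptions about testing and reporting can produce a universe where most findings are false positives, can be illustrated with Bayes' rule. The following is a minimal sketch, not the paper's own calculation: the function name and the two scenarios are hypothetical, and the numbers are chosen only for illustration. It computes what share of statistically significant results would be false, given an assumed fraction of tested hypotheses that are really true, an assumed statistical power, and a significance cutoff.

```python
def positive_predictive_value(prior_true, power, alpha):
    """Share of significant results that reflect a real effect (Bayes' rule).

    prior_true: assumed fraction of tested hypotheses that are actually true
    power:      assumed probability a test detects a real effect
    alpha:      significance cutoff, i.e. false positive rate for null effects
    """
    true_positives = power * prior_true
    false_positives = alpha * (1.0 - prior_true)
    return true_positives / (true_positives + false_positives)


# Hypothetical scenarios; these numbers are illustrative assumptions,
# not estimates taken from any of the papers reviewed above.
scenarios = {
    "well-powered tests of plausible hypotheses": (0.50, 0.80, 0.05),
    "underpowered tests of long-shot hypotheses": (0.05, 0.20, 0.05),
}

for label, (prior_true, power, alpha) in scenarios.items():
    ppv = positive_predictive_value(prior_true, power, alpha)
    print(f"{label}: {1 - ppv:.0%} of significant findings would be false")
```

Under the first set of assumptions only a small share of significant findings are false; under the second, most are. Which set is closer to reality is exactly the empirical question that a purely theoretical argument cannot settle.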
I do think that the reviewed papers are important contributions because they draw attention to real concerns about the modern scientific process. Namely:
  • We need more statistical literacy
  • We need more computational literacy
  • We need to require code be published
  • We need mechanisms of peer review that deal with code
  • We need a culture that doesn't use reproducibility as a weapon
  • We need increased transparency in review and evaluation of papers
Some of these have simple fixes (more statistics courses, publishing code); some are much, much harder (changing publication/review culture).
The Many Labs project (Paper 8) points out that statistical research is proceeding in a fairly reasonable fashion. Some effects are overestimated in individual studies, some are underestimated, and some are just about right. Regardless, no single study should stand alone as the last word about an important scientific issue. It obviously won't be possible to replicate every study as intensely as those in the Many Labs project, but this is a reassuring piece of evidence that things aren't as bad as some paper titles and headlines may make it seem.

Many Labs data. Blue x's are original effect sizes. Other dots are effect sizes from replication experiments (http://rolfzwaan.blogspot.com/2013/11/what-can-we-learn-from-many-labs.html)
 
The Many Labs results suggest that the hype about the failures of science is, at the very least, premature. I think an equally important idea is that science has pretty much always worked with some number of false positive and irreplicable studies. This was beautifully described by Jared Horvath in this blog post from the Economist. I think the take-home message is that regardless of the rate of false discoveries, the scientific process has led to amazing and life-altering discoveries.
