A Medley of Potpourri

Sunday, April 11, 2021

Scholarly peer review

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Scholarly_peer_review

Scholarly peer review (also known as refereeing) is the process of subjecting an author's scholarly work, research, or ideas to the scrutiny of others who are experts in the same field, before a paper describing this work is published in a journal, conference proceedings or as a book. The peer review helps the publisher (that is, the editor-in-chief, the editorial board or the program committee) decide whether the work should be accepted, considered acceptable with revisions, or rejected.

Peer review requires a community of experts in a given (and often narrowly defined) field, who are qualified and able to perform reasonably impartial review. Impartial review, especially of work in less narrowly defined or inter-disciplinary fields, may be difficult to accomplish, and the significance (good or bad) of an idea may never be widely appreciated among its contemporaries. Peer review is generally considered necessary to academic quality and is used in most major scholarly journals. However, peer review does not prevent publication of invalid research, and there is little evidence that peer review improves the quality of published papers.

Scholarly peer review has been subject to a number of criticisms, and various proposals for reforming the system have been suggested over the years. Attempts to reform the peer review process originate, among others, in the fields of metascience and journalology. Reformers seek to increase the reliability and efficiency of the peer review process and to provide it with a scientific foundation. Alternatives to common peer review practices have been put to the test, in particular open peer review, where the comments are visible to readers, generally with the identities of the peer reviewers disclosed as well, e.g., F1000, eLife, BMJ, Sci and BioMed Central.

History

The first record of an editorial pre-publication peer-review is from 1665 by Henry Oldenburg, the founding editor of Philosophical Transactions of the Royal Society at the Royal Society of London.

The first peer-reviewed publication might have been the Medical Essays and Observations published by the Royal Society of Edinburgh in 1731. The present-day peer-review system evolved from this 18th-century process, began to involve external reviewers in the mid-19th century, and did not become commonplace until the mid-20th century.

Peer review became a touchstone of the scientific method, but until the end of the 19th century was often performed directly by an editor-in-chief or editorial committee. Editors of scientific journals at that time made publication decisions without seeking outside input, i.e. an external panel of reviewers, giving established authors latitude in their journalistic discretion. For example, Albert Einstein's four revolutionary Annus Mirabilis papers in the 1905 issue of Annalen der Physik were peer-reviewed by the journal's editor-in-chief, Max Planck, and its co-editor, Wilhelm Wien, both future Nobel prize winners and together experts on the topics of these papers. On a much later occasion, Einstein was severely critical of the external review process, saying that he had not authorized the editor in chief to show his manuscript "to specialists before it is printed", and informing him that he would "publish the paper elsewhere"—which he did, and in fact he later had to withdraw the publication.

While some medical journals started to systematically appoint external reviewers, it is only since the middle of the 20th century that this practice has spread widely and that external reviewers have been given some visibility within academic journals, including being thanked by authors and editors. A 2003 editorial in Nature stated that, in the early 20th century, "the burden of proof was generally on the opponents rather than the proponents of new ideas." Nature itself instituted formal peer review only in 1967. Journals such as Science and the American Journal of Medicine increasingly relied on external reviewers in the 1950s and 1960s, in part to reduce the editorial workload. In the 20th century, peer review also became common for science funding allocations. This process appears to have developed independently from that of editorial peer review.

Gaudet provides a social science view of the history of peer review that attends carefully to the object under investigation, peer review itself, rather than looking only at superficial or self-evident commonalities among inquisition, censorship, and journal peer review. This view builds on historical research by Gould, Biagioli, Spier, and Rip. The first Peer Review Congress met in 1989. Over time, the fraction of papers devoted to peer review has steadily declined, suggesting that as a field of sociological study, it has been replaced by more systematic studies of bias and errors. In parallel with "common experience" definitions based on the study of peer review as a "pre-constructed process", some social scientists have looked at peer review without considering it as pre-constructed. Hirschauer proposed that journal peer review can be understood as reciprocal accountability of judgements among peers. Gaudet proposed that journal peer review could be understood as a social form of boundary judgement: determining what can be considered as scientific (or not) set against an overarching knowledge system, and following predecessor forms of inquisition and censorship.

Pragmatically, peer review refers to the work done during the screening of submitted manuscripts. This process encourages authors to meet the accepted standards of their discipline and reduces the dissemination of irrelevant findings, unwarranted claims, unacceptable interpretations, and personal views. Publications that have not undergone peer review are likely to be regarded with suspicion by academic scholars and professionals. Non-peer-reviewed work does not contribute, or contributes less, to a scholar's academic credit, as measured for example by the h-index, although this heavily depends on the field.

Justification

It is difficult for authors and researchers, whether individually or in a team, to spot every mistake or flaw in a complicated piece of work. This is not necessarily a reflection on those concerned, but because with a new and perhaps eclectic subject, an opportunity for improvement may be more obvious to someone with special expertise or who simply looks at it with a fresh eye. Therefore, showing work to others increases the probability that weaknesses will be identified and improved. For both grant-funding and publication in a scholarly journal, it is also normally a requirement that the subject is both novel and substantial.

The decision whether or not to publish a scholarly article, or what should be modified before publication, ultimately lies with the publisher (editor-in-chief or the editorial board) to which the manuscript has been submitted. Similarly, the decision whether or not to fund a proposed project rests with an official of the funding agency. These individuals usually refer to the opinion of one or more reviewers in making their decision. This is primarily for three reasons:

Workload. A small group of editors/assessors cannot devote sufficient time to each of the many articles submitted to many journals.
Miscellany of ideas. Were the editor/assessor to judge all submitted material themselves, approved material would solely reflect their opinion.
Limited expertise. An editor/assessor cannot be expected to be sufficiently expert in all areas covered by a single journal or funding agency to adequately judge all submitted material.

Reviewers are often anonymous and independent. However, some reviewers may choose to waive their anonymity, and in other limited circumstances, such as the examination of a formal complaint against the referee, or a court order, the reviewer's identity may have to be disclosed. Anonymity may be unilateral or reciprocal (single- or double-blinded reviewing).

Since reviewers are normally selected from experts in the fields discussed in the article, the process of peer review helps to keep some invalid or unsubstantiated claims out of the body of published research and knowledge. Scholars will read published articles outside their limited area of detailed expertise, and then rely, to some degree, on the peer-review process to have provided reliable and credible research that they can build upon for subsequent or related research. Significant scandal ensues when an author is found to have falsified the research included in an article, as other scholars, and the field of study itself, may have relied upon the invalid research.

For US universities, peer reviewing of books before publication is a requirement for full membership of the Association of American University Presses.

Procedure

In the case of proposed publications, the publisher (editor-in-chief or the editorial board, often with assistance of corresponding or associate editors) sends advance copies of an author's work or ideas to researchers or scholars who are experts in the field (known as "referees" or "reviewers"). Communication is normally by e-mail or through a web-based manuscript processing system such as ScholarOne, Scholastica, or Open Journal Systems. Depending on the field of study and on the specific journal, there are usually one to three referees for a given article. For example, Springer states that there are two or three reviewers per article.

The peer-review process involves three steps:

Step 1: Desk evaluation

An editor evaluates the manuscript to judge whether the paper will be passed on to journal referees. At this phase many articles receive a “desk reject,” that is, the editor chooses not to pass along the article. The authors may or may not receive a letter of explanation.

Desk rejection is intended to be a streamlined process so that editors may move past nonviable manuscripts quickly and provide authors with the opportunity to pursue a more suitable journal. For example, the European Accounting Review editors subject each manuscript to three questions to decide whether a manuscript moves forward to referees: 1) Is the article a fit for the journal's aims and scope, 2) is the paper content (e.g. literature review, methods, conclusions) sufficient and does the paper make a worthwhile contribution to the larger body of literature, and 3) does it follow format and technical specifications? If “no” to any of these, the manuscript receives a desk rejection.
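The three-question screen described above amounts to an all-or-nothing check, which can be sketched in Python. This is only an illustration of the rule as stated here; the class, field, and function names are hypothetical and do not reflect any journal's actual system.

```python
from dataclasses import dataclass


@dataclass
class DeskCheck:
    """An editor's answers to the three screening questions (hypothetical names)."""
    fits_aims_and_scope: bool
    content_sufficient_and_worthwhile: bool
    follows_format_and_technical_specs: bool


def desk_decision(check: DeskCheck) -> str:
    """Send the manuscript to referees only if every answer is yes."""
    answers = (
        check.fits_aims_and_scope,
        check.content_sufficient_and_worthwhile,
        check.follows_format_and_technical_specs,
    )
    # A single "no" is enough for a desk rejection.
    return "send to referees" if all(answers) else "desk reject"
```

For instance, a manuscript that fits the journal's scope and format but whose content is judged insufficient would be desk rejected under this rule.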

Desk rejection rates vary by journal. For example, in 2017 researchers at the World Bank compiled rejection rates of several global economics journals; the desk rejection rate ranged from 21% (Economic Lacea) to 66% (Journal of Development Economics). The American Psychological Association publishes rejection rates for several major publications in the field, and although they do not specify whether the rejection is pre- or post- desk evaluation, their figures in 2016 ranged from a low of 49% to a high of 90%.

Step 2: External review

If the paper is not desk rejected, the editors send the manuscript to the referees, who are chosen for their expertise and distance from the authors. At this point, referees may reject, accept without changes (rare) or instruct the authors to revise and resubmit.

Reasons vary for acceptance of an article by editors, but Elsevier published an article where three editors weigh in on factors that drive article acceptance. These factors include whether the manuscript: delivers “new insight into an important issue,” will be useful to practitioners, advances or proposes a new theory, raises new questions, has appropriate methods and conclusion, presents a good argument based on the literature, and tells a good story. One editor notes that he likes papers that he “wished he’d done” himself.

These referees each return an evaluation of the work to the editor, noting weaknesses or problems along with suggestions for improvement. Typically, most of the referees' comments are eventually seen by the author, though a referee can also send 'for your eyes only' comments to the publisher; scientific journals observe this convention almost universally. The editor then weighs the referees' comments together with her or his own opinion of the manuscript before passing a decision back to the author(s), usually with the referees' comments.

Referees' evaluations usually include an explicit recommendation of what to do with the manuscript or proposal, often chosen from options provided by the journal or funding agency. For example, Nature recommends four courses of action:

to unconditionally accept the manuscript or the proposal,
to accept it in the event that its authors improve it in certain ways,
to reject it, but encourage revision and invite re-submission,
to reject it outright.

During this process, the role of the referees is advisory. The editor(s) are typically under no obligation to accept the opinions of the referees, though they will most often do so. Furthermore, the referees in scientific publication do not act as a group, do not communicate with each other, and typically are not aware of each other's identities or evaluations. Proponents argue that if the reviewers of a paper are unknown to each other, the editor(s) can more easily verify the objectivity of the reviews. There is usually no requirement that the referees achieve consensus, with the decision instead often made by the editor(s) based on their best judgement of the arguments.

In situations where multiple referees disagree substantially about the quality of a work, there are a number of strategies for reaching a decision. The paper may be rejected outright, or the editor may choose which reviewer's point the authors should address. When a publisher receives very positive and very negative reviews for the same manuscript, the editor will often solicit one or more additional reviews as a tie-breaker. As another strategy in the case of ties, the publisher may invite authors to reply to a referee's criticisms and permit a compelling rebuttal to break the tie. If a publisher does not feel confident to weigh the persuasiveness of a rebuttal, the publisher may solicit a response from the referee who made the original criticism. An editor may convey communications back and forth between authors and a referee, in effect allowing them to debate a point.
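One of the strategies above, soliciting an additional review when the existing reviews are very positive and very negative, can be sketched in Python. This is a minimal sketch under stated assumptions: the `Recommendation` scale encodes the four courses of action listed earlier, and the `max_spread` threshold is a hypothetical parameter, not a rule any journal is known to apply.

```python
from enum import IntEnum


class Recommendation(IntEnum):
    """The four referee recommendations, ordered from least to most favourable."""
    REJECT = 0
    REJECT_AND_RESUBMIT = 1
    ACCEPT_WITH_REVISIONS = 2
    ACCEPT = 3


def needs_tie_breaker(recs: list[Recommendation], max_spread: int = 1) -> bool:
    """Return True when the reviews diverge by more than max_spread steps.

    max_spread is an assumed threshold chosen for illustration only.
    """
    if len(recs) < 2:
        return False
    return max(recs) - min(recs) > max_spread
```

With the default threshold, one outright acceptance and one outright rejection would trigger an extra review, while "accept" paired with "accept with revisions" would not. The final decision would still rest with the editor.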

Even in these cases, however, publishers do not allow multiple referees to confer with each other, though each reviewer may often see earlier comments submitted by other reviewers. The goal of the process is explicitly not to reach consensus or to persuade anyone to change their opinions, but instead to provide material for an informed editorial decision. One early study regarding referee disagreement found that agreement was greater than chance, if not much greater than chance, on six of seven article attributes (e.g. literature review and final recommendation to publish), but this study was small and it was conducted on only one journal. At least one study has found that reviewer disagreement is not common, but this study is also small and on only one journal.

Traditionally, reviewers would often remain anonymous to the authors, but this standard varies both with time and with academic field. In some academic fields, most journals offer the reviewer the option of remaining anonymous or not, or a referee may opt to sign a review, thereby relinquishing anonymity. Published papers sometimes contain, in the acknowledgments section, thanks to anonymous or named referees who helped improve the paper. For example, Nature journals provide this option.

Sometimes authors may exclude certain reviewers: one study conducted on the Journal of Investigative Dermatology found that excluding reviewers doubled the chances of article acceptance. Some scholars are uncomfortable with this idea, arguing that it distorts the scientific process. Others argue that it protects against referees who are biased in some manner (e.g. professional rivalry, grudges). In some cases, authors can choose referees for their manuscripts. mSphere, an open-access journal in microbial science, has moved to this model. Editor-in-Chief Mike Imperiale says this process is designed to reduce the time it takes to review papers and permit the authors to choose the most appropriate reviewers. But a scandal in 2015 shows how letting authors choose reviewers can encourage fraudulent reviews. Fake reviews were submitted to the Journal of the Renin-Angiotensin-Aldosterone System in the names of author-recommended reviewers, causing the journal to eliminate this option.

Step 3: Revisions

If the manuscript has not been rejected during peer review, it returns to the authors for revisions. During this phase, the authors address the concerns raised by reviewers. Dr. William Stafford Noble offers ten rules for responding to reviewers. His rules include:

“Provide an overview, then quote the full set of reviews”
“Be polite and respectful of all reviewers”
“Accept the blame”
“Make the response self-contained”
“Respond to every point raised by the reviewer”
“Use typography to help the reviewer navigate your response”
“Whenever possible, begin your response to each comment with a direct answer to the point being raised”
“When possible, do what the reviewer asks”
“Be clear about what changed relative to the previous version”
“If necessary, write the response twice” (i.e. write a version for “venting” but then write a version the reviewers will see)

Recruiting referees

At a journal or book publisher, the task of picking reviewers typically falls to an editor. When a manuscript arrives, an editor solicits reviews from scholars or other experts who may or may not have already expressed a willingness to referee for that journal or book division. Granting agencies typically recruit a panel or committee of reviewers in advance of the arrival of applications.

Referees are supposed to inform the editor of any conflicts of interest that might arise. Journals or individual editors may invite a manuscript's authors to name people whom they consider qualified to referee their work. For some journals this is a requirement of submission. Authors are sometimes also given the opportunity to name natural candidates who should be disqualified, in which case they may be asked to provide justification (typically expressed in terms of conflict of interest).

Editors solicit author input in selecting referees because academic writing typically is very specialized. Editors often oversee many specialties, and cannot be experts in all of them. But after an editor selects referees from the pool of candidates, the editor typically is obliged not to disclose the referees' identities to the authors, and in scientific journals, to each other. Policies on such matters differ among academic disciplines. One difficulty with respect to some manuscripts is that there may be few scholars who truly qualify as experts, that is, people who have themselves done work similar to that under review. This can frustrate the goals of reviewer anonymity and avoidance of conflicts of interest. Low-prestige or local journals and granting agencies that award little money are especially handicapped with regard to recruiting experts.

A potential hindrance in recruiting referees is that they are usually not paid, largely because doing so would itself create a conflict of interest. Also, reviewing takes time away from their main activities, such as their own research. To the would-be recruiter's advantage, most potential referees are authors themselves, or at least readers, who know that the publication system requires that experts donate their time. Serving as a referee can even be a condition of a grant, or professional association membership.

Referees have the opportunity to prevent work that does not meet the standards of the field from being published, which is a position of some responsibility. Editors are at a special advantage in recruiting a scholar when they have overseen the publication of his or her work, or if the scholar is one who hopes to submit manuscripts to that editor's publishing entity in the future. Granting agencies, similarly, tend to seek referees among their present or former grantees.

Peerage of Science is an independent service and a community where reviewer recruitment happens via Open Engagement: authors submit their manuscript to the service where it is made accessible for any non-affiliated scientist, and 'validated users' choose themselves what they want to review. The motivation to participate as a peer reviewer comes from a reputation system where the quality of the reviewing work is judged and scored by other users, and contributes to user profiles. Peerage of Science does not charge any fees to scientists, and does not pay peer reviewers. Participating publishers however pay to use the service, gaining access to all ongoing processes and the opportunity to make publishing offers to the authors.

With independent peer review services the author usually retains the right to the work throughout the peer review process, and may choose the most appropriate journal to submit the work to. Peer review services may also provide advice or recommendations on most suitable journals for the work. Journals may still want to perform an independent peer review, without the potential conflict of interest that financial reimbursement may cause, or the risk that an author has contracted multiple peer review services but only presents the most favorable one.

An alternative or complementary system of performing peer review is for the author to pay for having it performed. An example of such a service provider is Rubriq, which assigns to each work peer reviewers who are financially compensated for their efforts.

Different styles

Anonymous and attributed

For most scholarly publications, the identity of the reviewers is kept anonymised (also called "blind peer review"). The alternative, attributed peer review involves revealing the identities of the reviewers. Some reviewers choose to waive their right to anonymity, even when the journal's default format is blind peer review.

In anonymous peer review, reviewers are known to the journal editor or conference organiser but their names are not given to the article's author. In some cases, the author's identity can also be anonymised for the review process, with identifying information stripped from the document before review. The system is intended to reduce or eliminate bias.

Some experts proposed blind review procedures for reviewing controversial research topics.

In double-blind peer review, which was fashioned by sociology journals in the 1950s and remains more common in the social sciences and humanities than in the natural sciences, the identity of the authors is concealed from the reviewers ("blinded"), and vice versa, lest the knowledge of authorship or concern about disapprobation from the author bias their review. Critics of the double-blind review process point out that, despite any editorial effort to ensure anonymity, the process often fails to do so, since certain approaches, methods, writing styles, notations, etc., point to a certain group of people in a research stream, and even to a particular person.

In many fields of "big science", the publicly available operation schedules of major equipment, such as telescopes or synchrotrons, would make the authors' names obvious to anyone who would care to look them up. Proponents of double-blind review argue that it performs no worse than single-blind, and that it generates a perception of fairness and equality in academic funding and publishing. Single-blind review is strongly dependent upon the goodwill of the participants, but no more so than double-blind review with easily identified authors.

As an alternative to single-blind and double-blind review, authors and reviewers are encouraged to declare their conflicts of interest when the names of authors and sometimes reviewers are known to the other. When conflicts are reported, the conflicting reviewer can be prohibited from reviewing and discussing the manuscript, or his or her review can instead be interpreted with the reported conflict in mind; the latter option is more often adopted when the conflict of interest is mild, such as a previous professional connection or a distant family relation. The incentive for reviewers to declare their conflicts of interest is a matter of professional ethics and individual integrity. Even when the reviews are not public, they are still a matter of record and the reviewer's credibility depends upon how they represent themselves among their peers. Some software engineering journals, such as the IEEE Transactions on Software Engineering, use non-blind reviews with reporting to editors of conflicts of interest by both authors and reviewers.

A more rigorous standard of accountability is known as an audit. Because reviewers are not paid, they cannot be expected to put as much time and effort into a review as an audit requires. Therefore, academic journals such as Science, organizations such as the American Geophysical Union, and agencies such as the National Institutes of Health and the National Science Foundation maintain and archive scientific data and methods in the event another researcher wishes to replicate or audit the research after publication.

The traditional anonymous peer review has been criticized for its lack of accountability, the possibility of abuse by reviewers or by those who manage the peer review process (that is, journal editors), its possible bias, and its inconsistency, alongside other flaws. Eugene Koonin, a senior investigator at the National Center for Biotechnology Information, asserts that the system has "well-known ills" and advocates "open peer review".

Open peer review

In 1999, the open access journal Journal of Medical Internet Research was launched, which from its inception decided to publish the names of the reviewers at the bottom of each published article. Also in 1999, the British Medical Journal moved to an open peer review system, revealing reviewers' identities to the authors but not the readers, and in 2000, the medical journals in the open access BMC series published by BioMed Central launched using open peer review. As with the BMJ, the reviewers' names are included on the peer review reports. In addition, if the article is published, the reports are made available online as part of the "pre-publication history".

Several other journals published by the BMJ Group allow optional open peer review, as does PLoS Medicine, published by the Public Library of Science. The BMJ's Rapid Responses allows ongoing debate and criticism following publication.

In June 2006, Nature launched an experiment in parallel open peer review: some articles that had been submitted to the regular anonymous process were also available online for open, identified public comment. The results were less than encouraging – only 5% of authors agreed to participate in the experiment, and only 54% of those articles received comments. The editors have suggested that researchers may have been too busy to take part and were reluctant to make their names public. The knowledge that articles were simultaneously being subjected to anonymous peer review may also have affected the uptake.

In February 2006, the journal Biology Direct was launched by BioMed Central, adding another alternative to the traditional model of peer review. If authors can find three members of the Editorial Board who will each return a report or will themselves solicit an external review, the article will be published. As with Philica, reviewers cannot suppress publication, but in contrast to Philica, no reviews are anonymous and no article is published without being reviewed. Authors have the opportunity to withdraw their article, to revise it in response to the reviews, or to publish it without revision. If the authors proceed with publication of their article despite critical comments, readers can clearly see any negative comments along with the names of the reviewers. In the social sciences, there have been experiments with wiki-style, signed peer reviews, for example in an issue of the Shakespeare Quarterly.

In 2010, the BMJ began publishing signed reviewer's reports alongside accepted papers, after determining that telling reviewers that their signed reviews might be posted publicly did not significantly affect the quality of the reviews.

In 2011, Peerage of Science, an independent peer review service, was launched with several non-traditional approaches to academic peer review. Most prominently, these include the judging and scoring of the accuracy and justifiability of peer reviews, and concurrent usage of a single peer review round by several participating journals.

Starting in 2013 with the launch of F1000Research, some publishers have combined open peer review with postpublication peer review by using a versioned article system. At F1000Research, articles are published before review, and invited peer review reports (and reviewer names) are published with the article as they come in. Author-revised versions of the article are then linked to the original. A similar postpublication review system with versioned articles is used by ScienceOpen, launched in 2014.

In 2014, Life implemented an open peer review system, under which the peer-review reports and authors’ responses are published as an integral part of the final version of each article.

Since 2016, Synlett has been experimenting with closed crowd peer review. The article under review is sent to a pool of 80+ expert reviewers who then collaboratively comment on the manuscript.

In an effort to address issues with the reproducibility of research results, some scholars are asking that authors agree to share their raw data as part of the peer review process. As far back as 1962, for example, a number of psychologists have attempted to obtain raw data sets from other researchers, with mixed results, in order to reanalyze them. A recent attempt resulted in only seven data sets out of fifty requests. The notion of obtaining, let alone requiring, open data as a condition of peer review remains controversial. In 2020, peer reviewers' lack of access to raw data led to article retractions in the prestigious journals The New England Journal of Medicine and The Lancet. Many journals now require access to raw data to be included in peer review.

Pre- and post-publication peer review

The process of peer review is not restricted to the publication process managed by academic journals. In particular, some forms of peer review can occur before an article is submitted to a journal and/or after it is published by the journal.

Pre-publication peer review

Manuscripts are typically reviewed by colleagues before submission, and if the manuscript is uploaded to a preprint server, such as ArXiv, BioRxiv or SSRN, researchers can read and comment on it. The practice of uploading to preprint servers and the amount of discussion depend heavily on the field; the practice allows an open pre-publication peer review. The advantage of this method is the speed and transparency of the review process. Anyone can give feedback, typically in the form of comments, and typically not anonymously. These comments are also public and can be responded to, so author-reviewer communication is not restricted to the typical 2–4 rounds of exchanges in traditional publishing. The authors can incorporate comments from a wide range of people instead of feedback from the typically 3–4 reviewers. The disadvantage is that a far larger number of papers are presented to the community without any guarantee on quality.

Post-publication peer review

After a manuscript is published, the process of peer review continues as publications are read, known as post-publication peer review. Readers will often send letters to the editor of a journal, or correspond with the editor via an on-line journal club. In this way, all "peers" may offer review and critique of published literature. The introduction of the "epub ahead of print" practice in many journals has made possible the simultaneous publication of unsolicited letters to the editor together with the original paper in the print issue.

A variation on this theme is open peer commentary, in which commentaries from specialists are solicited on published articles and the authors are invited to respond. Journals using this process solicit and publish non-anonymous commentaries on the "target paper" together with the paper, and with the original authors' reply as a matter of course. Open peer commentary was first implemented by the anthropologist Sol Tax, who founded the journal Current Anthropology in 1957. The journal Behavioral and Brain Sciences, published by Cambridge University Press, was founded by Stevan Harnad in 1978 and modeled on Current Anthropology's open peer commentary feature. Psycoloquy (1990–2002) was based on the same feature, but this time implemented online. Since 2016, open peer commentary has also been provided by the journal Animal Sentience.

In addition to journals hosting reviews of their own articles, there are also external, independent websites dedicated to post-publication peer review, such as PubPeer, which allows anonymous commenting on published literature and pushes authors to answer these comments. It has been suggested that post-publication reviews from these sites should be editorially considered as well. The megajournals F1000Research and ScienceOpen openly publish both the identities of the reviewers and the reviewers' reports alongside the article.

Some journals use post-publication peer review as their formal review method, instead of pre-publication review. This was first introduced in 2001 by Atmospheric Chemistry and Physics (ACP). More recently, F1000Research and ScienceOpen were launched as megajournals with post-publication review as their formal review method. At both ACP and F1000Research, peer reviewers are formally invited, much as at journals that use pre-publication review. Articles that pass peer review at those two journals are included in external scholarly databases.

In 2006, a small group of UK academic psychologists launched Philica, an instant online journal billed as the "Journal of Everything", to redress many of what they saw as the problems of traditional peer review. All submitted articles are published immediately and may be reviewed afterwards. Any researcher who wishes to review an article can do so, and reviews are anonymous. Reviews are displayed at the end of each article, and are used to give the reader criticism or guidance about the work, rather than to decide whether it is published or not. This means that reviewers cannot suppress ideas if they disagree with them. Readers use reviews to guide their reading, and particularly popular or unpopular work is easy to identify.

Sci (ISSN 2413-4155) from MDPI, a scholarly, open access journal which covers all research fields and publishes reviews, regular research papers, communications, and short notes, was established in March 2018 to open the "black box of peer-review". It subsequently adopted a more transparent workflow, post-publication public peer review (P4R), advocating the maintenance of transparency and scientific originality. The P4R system, in place from March 2019 until November 2020, promised authors immediate visibility of their manuscripts on the journal’s online platform after a brief and limited check of scientific soundness and proper reporting, and against plagiarism and offensive material. This approach, however, faced some challenges, namely:

extended manuscript processing times, caused by waiting for volunteer reviewers to come forward
refusal by some authors to accept comments or reviews, possibly fueled by the fact that the manuscript had de facto already been published as part of the P4R strategy of post-publication review
logistical difficulties, as the options of retraction or rejection are not really available in P4R, leaving a highly problematic public naming and shaming of a weak manuscript as the only tool available to guard against a lack of quality
the inability to include Sci as a P4R journal in Clarivate’s Web of Science and Science Citation Index, due to the generation of several DOIs

Therefore, a switch to a hybrid workflow, P4R hybrid, has been sought since November 2020.

Social media and informal peer review

Recent research has called attention to the use of social media technologies and science blogs as a means of informal, post-publication peer review, as in the case of the #arseniclife (or GFAJ-1) controversy. In December 2010, an article published in Scienceexpress (the ahead-of-print version of Science) generated both excitement and skepticism, as its authors, led by NASA astrobiologist Felisa Wolfe-Simon, claimed to have discovered and cultured a bacterium that could replace phosphorus with arsenic in its physiological building blocks. At the time of the article's publication, NASA issued press statements suggesting that the finding would impact the search for extraterrestrial life, sparking excitement on Twitter under the hashtag #arseniclife, as well as criticism from fellow experts who voiced skepticism via their personal blogs. Ultimately, the controversy surrounding the article attracted media attention, and one of the most vocal scientific critics, Rosemary Redfield, formally published in July 2012 on her and her colleagues' unsuccessful attempt to replicate the NASA scientists’ original findings.

Researchers following the impact of the #arseniclife case on social media discussions and peer review processes concluded the following:

Our results indicate that interactive online communication technologies can enable members in the broader scientific community to perform the role of journal reviewers to legitimize scientific information after it has advanced through formal review channels. In addition, a variety of audiences can attend to scientific controversies through these technologies and observe an informal process of post-publication peer review. (p 946)

Result-blind peer review

Studies which report a positive or statistically-significant result are far more likely to be published than ones which do not. A counter-measure to this positivity bias is to hide or make unavailable the results, making journal acceptance more like scientific grant agencies reviewing research proposals. Versions include:

Result-blind peer review or results-blind peer review, first proposed in 1966: Reviewers receive an edited version of the submitted paper which omits the results and conclusion sections. In a two-stage version, first proposed in 1977, a second round of reviews or editorial judgment is based on the full version of the paper.
Conclusion-blind review, proposed by Robin Hanson in 2007, extends this further by asking all authors to submit a positive and a negative version; only after the journal has accepted the article do the authors reveal which is the real version.
Pre-accepted articles (also called outcome-unbiased journals, advance publication review, registered reports, prior to results submission, or early acceptance) extend study pre-registration to the point that journals accept or reject papers based on a version of the paper written before the results or conclusions are known (an enlarged study protocol), which instead describes the theoretical justification, experimental design, and statistical analysis. Only once the proposed hypothesis and methodology have been accepted by reviewers do the authors collect the data or analyze previously collected data. A limited variant of the pre-accepted article was The Lancet's study protocol review, which from 1997 to 2015 reviewed and published randomized trial protocols with a guarantee that the eventual paper would at least be sent out to peer review rather than immediately rejected. As a more recent example, Nature Human Behaviour has adopted the registered report format, as it “shift[s] the emphasis from the results of research to the questions that guide the research and the methods used to answer them”. The European Journal of Personality defines this format: “In a registered report, authors create a study proposal that includes theoretical and empirical background, research questions/hypotheses, and pilot data (if available). Upon submission, this proposal will then be reviewed prior to data collection, and if accepted, the paper resulting from this peer-reviewed procedure will be published, regardless of the study outcomes.”

The following journals used result-blind peer review or pre-accepted articles:

The European Journal of Parapsychology, under Martin Johnson (who proposed a version of Registered Reports in 1974), began accepting papers based on submitted designs and then publishing them, from 1976 to 1993, and published 25 RRs total
The International Journal of Forecasting used opt-in result-blind peer review and pre-accepted articles from before 1986 through 1996/1997.
The journal Applied Psychological Measurement offered an opt-in "advance publication review" process from 1989–1996, ending use after only 5 papers were submitted.
JAMA Internal Medicine found in a 2009 survey that 86% of its reviewers would be willing to work in a result-blind peer review process, and ran a pilot experiment with two-stage result-blind peer review, which showed that the unblinded step benefited positive studies more than negative ones. However, the journal does not currently use result-blind peer review.

The Center for Open Science has encouraged the use of "Registered Reports" (pre-accepted articles) since 2013. As of October 2017, about 80 journals offer Registered Reports in general, have had special issues of Registered Reports, or offer limited acceptance of Registered Reports (e.g. replications only), including AIMS Neuroscience, Cortex, Perspectives on Psychological Science, Social Psychology, and Comparative Political Studies.
Comparative Political Studies published the results of its pilot experiment, in which 3 of 19 submissions were pre-accepted, in 2016. The process worked well, although submissions were weighted towards quantitative experimental designs. It reduced the amount of 'fishing', as submitters and reviewers focused on theoretical backing and the substantive importance of results, with attention to statistical power and the implications of a null result. The report concluded that "we can clearly state that this form of review lead to papers that were of the highest quality. We would love to see a top journal adopt results-free review as a policy, at very least allowing results-free review as one among several standard submission options."

Criticism

Various editors have expressed criticism of peer review. In addition, a Cochrane review found little empirical evidence that peer review ensures quality in biomedical research, while a second systematic review and meta-analysis found a need for evidence-based peer review in biomedicine given the paucity of assessment of the interventions designed to improve the process.

To an outsider, the anonymous, pre-publication peer review process is opaque. Certain journals are accused of not carrying out stringent peer review in order to more easily expand their customer base, particularly in journals where authors pay a fee before publication. Richard Smith, MD, former editor of the British Medical Journal, has claimed that peer review is "ineffective, largely a lottery, anti-innovatory, slow, expensive, wasteful of scientific time, inefficient, easily abused, prone to bias, unable to detect fraud and irrelevant; Several studies have shown that peer review is biased against the provincial and those from low- and middle-income countries; Many journals take months and even years to publish and the process wastes researchers' time. As for the cost, the Research Information Network estimated the global cost of peer review at £1.9 billion in 2008."

In addition, Australia's Innovative Research Universities group (a coalition of seven comprehensive universities committed to inclusive excellence in teaching, learning and research in Australia) has found that "peer review disadvantages researchers in their early careers, when they rely on competitive grants to cover their salaries, and when unsuccessful funding applications often mark the end of a research idea".

Low-end distinctions in articles understandable to all peers

John Ioannidis argues that since the exams and other tests that people pass on their way from "layman" to "expert" focus on answering the questions in time and in accordance with a list of answers, and not on making precise distinctions (the latter of which would be unrecognizable to experts of lower cognitive precision), there is as much individual variation in the ability to distinguish causation from correlation among "experts" as there is among "laymen". Ioannidis argues that as a result, scholarly peer review by many "experts" allows only articles that are understandable at a wide range of cognitive precision levels including very low ones to pass, biasing publications towards favoring articles that infer causation from correlation while mislabelling articles that make the distinction as "incompetent overestimation of one's ability" on the side of the authors because some of the reviewing "experts" are cognitively unable to distinguish the distinction from alleged rationalization of specific conclusions. It is argued by Ioannidis that this makes peer review a cause of selective publication of false research findings while stopping publication of rigorous criticism thereof, and that further post-publication review repeats the same bias by selectively retracting the few rigorous articles that may have made it through initial pre-publication peer review while letting the low-end ones that confuse correlation and causation remain in print.

Peer review and trust

Researchers have peer reviewed manuscripts prior to publishing them in a variety of ways since the 18th century. The main goal of this practice is to improve the relevance and accuracy of scientific discussions. Even though experts often criticize peer review for a number of reasons, the process is still often considered the "gold standard" of science. Occasionally, however, peer review approves studies that are later found to be wrong, and only rarely are deceptive or fraudulent results discovered prior to publication. Thus, there seems to be an element of discord between the ideology behind peer review and its practice. Because the imperfection of peer review is not effectively communicated, the message conveyed to the wider public is that studies published in peer-reviewed journals are "true" and that peer review protects the literature from flawed science. A number of well-established criticisms exist of many elements of peer review. The following cases illustrate the wider impact that inappropriate peer review can have on public understanding of scientific literature.

Multiple examples across several areas of science find that scientists elevated the importance of peer review for research that was questionable or corrupted. For example, climate change deniers have published studies in the Energy and Environment journal, attempting to undermine the body of research that shows how human activity impacts the Earth's climate. Politicians in the United States who reject the established science of climate change have then cited this journal on several occasions in speeches and reports.

At times, peer review has been exposed as a process that was orchestrated for a preconceived outcome. The New York Times gained access to confidential peer review documents for studies sponsored by the National Football League (NFL) that were cited as scientific evidence that brain injuries do not cause long-term harm to its players. During the peer review process, the authors of the study stated that all NFL players were part of a study, a claim that the reporters found to be false by examining the database used for the research. Furthermore, The Times noted that the NFL sought to legitimize the studies' methods and conclusion by citing a "rigorous, confidential peer-review process" despite evidence that some peer reviewers seemed "desperate" to stop their publication. Recent research has also demonstrated that widespread industry funding for published medical research often goes undeclared and that such conflicts of interest are not appropriately addressed by peer review.

Another problem that peer review fails to catch is ghostwriting, a process by which companies draft articles for academics who then publish them in journals, sometimes with little or no changes. These studies can then be used for political, regulatory and marketing purposes. In 2010, the US Senate Finance Committee released a report that found this practice was widespread, that it corrupted the scientific literature and increased prescription rates. Ghostwritten articles have appeared in dozens of journals, involving professors at several universities.

Just as experts in a particular field have a better understanding of the value of papers published in their area, scientists are considered to have a better grasp of the value of published papers than the general public, to see peer review as a human process with human failings, and to hold that "despite its limitations, we need it. It is all we have, and it is hard to imagine how we would get along without it". But these subtleties are lost on the general public, who are often misled into thinking that being published in a peer-reviewed journal is the "gold standard" and can erroneously equate published research with the truth. Thus, more care must be taken over how peer review, and the results of peer-reviewed research, are communicated to non-specialist audiences, particularly during a time in which a range of technical changes and a deeper appreciation of the complexities of peer review are emerging. This will be needed as the scholarly publishing system has to confront wider issues such as retractions and the replication or reproducibility "crisis".

Views of peer review

Peer review is often considered integral to scientific discourse in one form or another. Its gatekeeping role is supposed to be necessary to maintain the quality of the scientific literature and avoid a risk of unreliable results, inability to separate signal from noise, and slow scientific progress.

Shortcomings of peer review have been met with calls for even stronger filtering and more gatekeeping. A common argument in favor of such initiatives is the belief that this filter is needed to maintain the integrity of the scientific literature.

Calls for more oversight have at least two implications that run counter to what is known to be true of scholarship.

The belief that scholars are incapable of evaluating the quality of work on their own, that they are in need of a gatekeeper to inform them of what is good and what is not.
The belief that scholars need a "guardian" to make sure they are doing good work.

Others argue that authors most of all have a vested interest in the quality of a particular piece of work. Only the authors could have, as Feynman (1974) puts it, the "extra type of integrity that is beyond not lying, but bending over backwards to show how you're maybe wrong, that you ought to have when acting as a scientist." If anything, the current peer review process and academic system could penalize, or at least fail to incentivize, such integrity.

Instead, the credibility conferred by the "peer-reviewed" label could diminish what Feynman calls the culture of doubt necessary for science to operate as a self-correcting, truth-seeking process. The effects of this can be seen in the ongoing replication crisis, hoaxes, and widespread outrage over the inefficacy of the current system. It is common to think that more oversight is the answer, as peer reviewers are not at all lacking in skepticism. But the issue is not the skepticism shared by the select few who determine whether an article passes through the filter. It is the validation, and accompanying lack of skepticism, that comes afterwards. Here again, more oversight only adds to the impression that peer review ensures quality, thereby further diminishing the culture of doubt and counteracting the spirit of scientific inquiry.

Quality research - even some of our most fundamental scientific discoveries - dates back centuries, long before peer review took its current form. Whatever peer review existed centuries ago, it took a different form than it does in modern times, without the influence of large, commercial publishing companies or a pervasive culture of publish or perish. Though in its initial conception it was often a laborious and time-consuming task, researchers took peer review on nonetheless, not out of obligation but out of duty to uphold the integrity of their own scholarship. They managed to do so, for the most part, without the aid of centralised journals, editors, or any formalised or institutionalised process whatsoever. Supporters of modern technology argue that it makes it possible to communicate instantaneously with scholars around the globe, make such scholarly exchanges easier, and restore peer review to a purer scholarly form, as a discourse in which researchers engage with one another to better clarify, understand, and communicate their insights.

Such modern technology includes posting results to preprint servers, preregistration of studies, open peer review, and other open science practices. In all these initiatives, the role of gatekeeping remains prominent, as if it were a necessary feature of all scholarly communication, but critics argue that a proper, real-world implementation could test and disprove this assumption, demonstrate researchers' desire for more than traditional journals can offer, and show that researchers can be entrusted to perform their own quality control independent of journal-coupled review. Jon Tennant also argues that the outcry over the inefficiencies of traditional journals centers on their inability to provide rigorous enough scrutiny, and on the outsourcing of critical thinking to a concealed and poorly understood process. Thus, the assumption that journals and peer review are required to protect scientific integrity seems to undermine the very foundations of scholarly inquiry.

To test the hypothesis that filtering is indeed unnecessary to quality control, many of the traditional publication practices would need to be redesigned, editorial boards repurposed if not disbanded, and authors granted control over the peer review of their own work. Putting authors in charge of their own peer review is seen as serving a dual purpose. On one hand, it removes the conferral of quality within the traditional system, thus eliminating the prestige associated with the simple act of publishing. Perhaps paradoxically, the removal of this barrier might actually result in an increase of the quality of published work, as it eliminates the cachet of publishing for its own sake. On the other hand, readers know that there is no filter so they must interpret anything they read with a healthy dose of skepticism, thereby naturally restoring the culture of doubt to scientific practice.

In addition to concerns about the quality of work produced by well-meaning researchers, there are concerns that a truly open system would allow the literature to be populated with junk and propaganda by those with a vested interest in certain issues. A counterargument is that the conventional model of peer review diminishes the healthy skepticism that is a hallmark of scientific inquiry, and thus confers credibility upon subversive attempts to infiltrate the literature. Allowing such "junk" to be published could make individual articles less reliable but render the overall literature more robust by fostering a "culture of doubt".

Allegations of bias and suppression

The interposition of editors and reviewers between authors and readers may enable the intermediators to act as gatekeepers. Some sociologists of science argue that peer review makes the ability to publish susceptible to control by elites and to personal jealousy. The peer review process may sometimes impede progress and may be biased against novelty. A linguistic analysis of review reports suggests that reviewers focus on rejecting applications by searching for weak points, and not on finding the high-risk/high-gain groundbreaking ideas that may be in the proposal. Reviewers tend to be especially critical of conclusions that contradict their own views, and lenient towards those that match them. At the same time, established scientists are more likely than others to be sought out as referees, particularly by high-prestige journals and publishers. As a result, ideas that harmonize with those of the established experts are more likely to see print and to appear in premier journals than are iconoclastic or revolutionary ones. This accords with Thomas Kuhn's well-known observations regarding scientific revolutions. A theoretical model has been established whose simulations imply that peer review and over-competitive research funding drive mainstream opinion towards monopoly.

Criticisms of traditional anonymous peer review allege that it lacks accountability, can lead to abuse by reviewers, and may be biased and inconsistent.

There have also been suggestions of gender bias in peer review, with male authors being likely to receive more favorable treatment. However, a 2021 study found no evidence for such bias (and found that in some respects female authors were treated more favourably).

Open access journals and peer review

Some critics of open access (OA) journals have argued that, compared to traditional subscription journals, open access journals might utilize substandard or less formal peer review practices, and, as a consequence, the quality of scientific work in such journals will suffer. In a study published in 2012, this hypothesis was tested by evaluating the relative "impact" (using citation counts) of articles published in open access and subscription journals, on the grounds that members of the scientific community would presumably be less likely to cite substandard work, and that citation counts could therefore act as one indicator of whether or not the journal format indeed impacted peer review and the quality of published scholarship. This study ultimately concluded that "OA journals indexed in Web of Science and/or Scopus are approaching the same scientific impact and quality as subscription journals, particularly in biomedicine and for journals funded by article processing charges," and the authors consequently argue that "there is no reason for authors not to choose to publish in OA journals just because of the ‘OA’ label".

Failures

Peer review fails when a peer-reviewed article contains fundamental errors that undermine at least one of its main conclusions and that could have been identified by more careful reviewers. Many journals have no procedure to deal with peer review failures beyond publishing letters to the editor. Peer review in scientific journals assumes that the article reviewed has been honestly prepared. The process occasionally detects fraud, but is not designed to do so. When peer review fails and a paper is published with fraudulent or otherwise irreproducible data, the paper may be retracted. A 1998 experiment on peer review with a fictitious manuscript found that peer reviewers failed to detect some manuscript errors and the majority of reviewers may not notice that the conclusions of the paper are unsupported by its results.

Fake peer review

There have been instances where peer review was claimed to be performed but in fact was not; this has been documented in some predatory open access journals (e.g., the Who's Afraid of Peer Review? affair) or in the case of sponsored Elsevier journals.

In November 2014, an article in Nature exposed that some academics were submitting fake contact details for recommended reviewers to journals, so that if the publisher contacted the recommended reviewer, the person replying was in fact the original author, reviewing their own work under a fake name. The Committee on Publication Ethics issued a statement warning of the fraudulent practice. In March 2015, BioMed Central retracted 43 articles, and in August 2015 Springer retracted 64 papers in 10 journals. The journal Tumor Biology is another example of a publication affected by peer review fraud.

In 2020, the Journal of Nanoparticle Research fell victim to an "organized rogue editor network", who impersonated respected academics, got a themed issue created, and got 19 substandard articles published (out of 80 submitted). The journal was praised for dealing with the scam openly and transparently.

Plagiarism

Reviewers generally lack access to raw data, but do see the full text of the manuscript, and are typically familiar with recent publications in the area. Thus, they are in a better position to detect plagiarism of prose than fraudulent data. A few cases of such textual plagiarism by historians, for instance, have been widely publicized.

On the scientific side, a poll of 3,247 scientists funded by the U.S. National Institutes of Health found that 0.3% admitted faking data and 1.4% admitted plagiarism. Additionally, 4.7% of respondents to the same poll admitted to self-plagiarism or autoplagiarism, in which an author republishes the same material, data, or text without citing their earlier work.

Examples

"Perhaps the most widely recognized failure of peer review is its inability to ensure the identification of high-quality work. The list of important scientific papers that were rejected by some peer-reviewed journals goes back at least as far as the editor of Philosophical Transaction's 1796 rejection of Edward Jenner's report of the first vaccination against smallpox."
The Soon and Baliunas controversy involved the publication in 2003 of a review study written by aerospace engineer Willie Soon and astronomer Sallie Baliunas in the journal Climate Research, which was quickly taken up by the G.W. Bush administration as a basis for amending the first Environmental Protection Agency Report on the Environment. The paper was strongly criticized by numerous scientists for its methodology and for its misuse of data from previously published studies, prompting concerns about the peer review process of the paper. The controversy resulted in the resignation of several editors of the journal and the admission by its publisher Otto Kinne that the paper should not have been published as it was.
The trapezoidal rule, a method of numerical integration closely related to Riemann sums, was republished as a new method in a diabetes research journal, Diabetes Care. The method is almost always taught in high school calculus, and the case was thus considered an example of an extremely well known idea being re-branded as a new discovery.
A conference organized by the Wessex Institute of Technology was the target of an exposé by three researchers who wrote nonsensical papers (including one that was composed of random phrases). They reported that the papers were "reviewed and provisionally accepted" and concluded that the conference was an attempt to "sell" publication possibilities to less experienced or naive researchers. This may, however, be better described as a lack of any actual peer review, rather than peer review having failed.
In the humanities, one of the most infamous cases of plagiarism undetected by peer review involved Martin Stone, formerly professor of medieval and Renaissance philosophy at the Hoger Instituut voor Wijsbegeerte of the KU Leuven. Martin Stone managed to publish at least forty articles and book chapters that were almost entirely stolen from the work of others. Most of these publications appeared in highly rated peer-reviewed journals and book series.
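To make concrete how elementary the trapezoidal rule mentioned in the examples above is, here is a minimal Python sketch. It is an illustration only, not the procedure from the republished paper: the function name `trapezoid_area` and the sample readings are hypothetical, chosen purely for demonstration.

```python
def trapezoid_area(times, values):
    """Approximate the area under a sampled curve with the trapezoidal rule.

    Each pair of neighbouring samples forms a trapezoid whose area is
    width * (left height + right height) / 2; the areas are summed.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if len(times) < 2:
        raise ValueError("at least two samples are needed")

    area = 0.0
    for i in range(1, len(times)):
        width = times[i] - times[i - 1]
        area += width * (values[i] + values[i - 1]) / 2.0
    return area


# Hypothetical readings taken every 30 time units (illustrative values only).
times = [0, 30, 60, 90, 120]
readings = [95, 147, 124, 111, 101]

print(trapezoid_area(times, readings))  # prints 14400.0
```

The whole method is a single loop over neighbouring samples, which is why its presentation as a new discovery is cited as a review failure.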

In popular culture

In 2017, the Higher School of Economics in Moscow unveiled a "Monument to an Anonymous Peer Reviewer". It takes the form of a large concrete cube, or die, with "Accept", "Minor Changes", "Major Changes", "Revise and Resubmit" and "Reject" on its five visible sides. Sociologist Igor Chirikov, who devised the monument, said that while researchers have a love-hate relationship with peer review, peer reviewers nonetheless do valuable but mostly invisible work, and the monument is a tribute to them.

Metascience

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Metascience

Metascience (also known as meta-research) is the use of scientific methodology to study science itself. Metascience seeks to increase the quality of scientific research while reducing waste. It is also known as "research on research" and "the science of science", as it uses research methods to study how research is done and where improvements can be made. Metascience concerns itself with all fields of research and has been described as "a bird's eye view of science." In the words of John Ioannidis, "Science is the best thing that has happened to human beings ... but we can do it better."

Measures have been implemented to address the issues revealed by metascience. These measures include the pre-registration of scientific studies and clinical trials as well as the founding of organizations such as CONSORT and the EQUATOR Network that issue guidelines for methodology and reporting. There are continuing efforts to reduce the misuse of statistics, to eliminate perverse incentives from academia, to improve the peer review process, to combat bias in scientific literature, and to increase the overall quality and efficiency of the scientific process.

History

John Ioannidis (2005), "Why Most Published Research Findings Are False".

In 1966, an early meta-research paper examined the statistical methods of 295 papers published in ten high-profile medical journals. It found that, "in almost 73% of the reports read ... conclusions were drawn when the justification for these conclusions was invalid." In 2005, John Ioannidis published a paper titled "Why Most Published Research Findings Are False", which argued that a majority of papers in the medical field produce conclusions that are wrong. The paper went on to become the most downloaded paper in the Public Library of Science and is considered foundational to the field of metascience. In a related study with Jeremy Howick and Despina Koletsi, Ioannidis showed that only a minority of medical interventions are supported by 'high quality' evidence according to The Grading of Recommendations Assessment, Development and Evaluation (GRADE) approach. Later meta-research identified widespread difficulty in replicating results in many scientific fields, including psychology and medicine. This problem was termed "the replication crisis". Metascience has grown as a reaction to the replication crisis and to concerns about waste in research.

Many prominent publishers are interested in meta-research and in improving the quality of their publications. Top journals such as Science, The Lancet, and Nature provide ongoing coverage of meta-research and problems with reproducibility. In 2012 PLOS ONE launched a Reproducibility Initiative. In 2015 BioMed Central introduced a minimum-standards-of-reporting checklist to four titles.

The first international conference in the broad area of meta-research was the Research Waste/EQUATOR conference held in Edinburgh in 2015; the first international conference on peer review was the Peer Review Congress held in 1989. In 2016, Research Integrity and Peer Review was launched. The journal's opening editorial called for "research that will increase our understanding and suggest potential solutions to issues related to peer review, study reporting, and research and publication ethics".

Areas of meta-research

Metascience can be categorized into five major areas of interest: Methods, Reporting, Reproducibility, Evaluation, and Incentives. These correspond, respectively, with how to perform, communicate, verify, evaluate, and reward research.

Methods

Metascience seeks to identify poor research practices, including biases in research, poor study design, and abuse of statistics, and to find methods to reduce these practices. Meta-research has identified numerous biases in scientific literature. Of particular note is the widespread misuse of p-values and abuse of statistical significance.

Reporting

Meta-research has identified poor practices in reporting, explaining, disseminating and popularizing research, particularly within the social and health sciences. Poor reporting makes it difficult to accurately interpret the results of scientific studies, to replicate studies, and to identify biases and conflicts of interest in the authors. Solutions include the implementation of reporting standards, and greater transparency in scientific studies (including better requirements for disclosure of conflicts of interest). There is an attempt to standardize reporting of data and methodology through the creation of guidelines by reporting agencies such as CONSORT and the larger EQUATOR Network.

Reproducibility

The replication crisis is an ongoing methodological crisis in which it has been found that many scientific studies are difficult or impossible to replicate. While the crisis has its roots in the meta-research of the mid- to late-1900s, the phrase "replication crisis" was not coined until the early 2010s as part of a growing awareness of the problem. The replication crisis particularly affects psychology (especially social psychology) and medicine. Replication is an essential part of the scientific process, and the widespread failure of replication puts into question the reliability of affected fields.

Moreover, replication of research (or failure to replicate) is considered less influential than original research, and is less likely to be published in many fields. This discourages the reporting of, and even attempts to replicate, studies.

Evaluation

Metascience seeks to create a scientific foundation for peer review. Meta-research evaluates peer review systems including pre-publication peer review, post-publication peer review, and open peer review. It also seeks to develop better research funding criteria.

Incentives

Metascience seeks to promote better research through better incentive systems. This includes studying the accuracy, effectiveness, costs, and benefits of different approaches to ranking and evaluating research and those who perform it. Critics argue that perverse incentives have created a publish-or-perish environment in academia which promotes the production of junk science, low quality research, and false positives. According to Brian Nosek, “The problem that we face is that the incentive system is focused almost entirely on getting research published, rather than on getting research right.” Proponents of reform seek to structure the incentive system to favor higher-quality results.

Reforms

Meta-research identifying flaws in scientific practice has inspired reforms in science. These reforms seek to address and fix problems in scientific practice which lead to low-quality or inefficient research.

Pre-registration

The practice of registering a scientific study before it is conducted is called pre-registration. It arose as a means to address the replication crisis. Pre-registration requires the submission of a registered report, which is then accepted for publication or rejected by a journal based on theoretical justification, experimental design, and the proposed statistical analysis. Pre-registration of studies serves to prevent publication bias, reduce data dredging, and increase replicability.

Reporting standards

Studies showing poor consistency and quality of reporting have demonstrated the need for reporting standards and guidelines in science, which has led to the rise of organisations that produce such standards, such as CONSORT (Consolidated Standards of Reporting Trials) and the EQUATOR Network.

The EQUATOR (Enhancing the QUAlity and Transparency Of health Research) Network is an international initiative aimed at promoting transparent and accurate reporting of health research studies to enhance the value and reliability of medical research literature. The EQUATOR Network was established with the goals of raising awareness of the importance of good reporting of research, assisting in the development, dissemination and implementation of reporting guidelines for different types of study designs, monitoring the status of the quality of reporting of research studies in the health sciences literature, and conducting research relating to issues that impact the quality of reporting of health research studies. The Network acts as an "umbrella" organisation, bringing together developers of reporting guidelines, medical journal editors and peer reviewers, research funding bodies, and other key stakeholders with a mutual interest in improving the quality of research publications and research itself.

Applications

Medicine

Clinical research in medicine is often of low quality, and many studies cannot be replicated. An estimated 85% of research funding is wasted. Additionally, the presence of bias affects research quality. The pharmaceutical industry exerts substantial influence on the design and execution of medical research. Conflicts of interest are common among authors of medical literature and among editors of medical journals. While almost all medical journals require their authors to disclose conflicts of interest, editors are not required to do so. Financial conflicts of interest have been linked to higher rates of positive study results. In antidepressant trials, pharmaceutical sponsorship is the best predictor of trial outcome.

Blinding is another focus of meta-research, as error caused by poor blinding is a source of experimental bias. Blinding is not well reported in medical literature, and widespread misunderstanding of the subject has resulted in poor implementation of blinding in clinical trials. Furthermore, failure of blinding is rarely measured or reported. Research showing the failure of blinding in antidepressant trials has led some scientists to argue that antidepressants are no better than placebo. In light of meta-research showing failures of blinding, CONSORT standards recommend that all clinical trials assess and report the quality of blinding.

Studies have shown that systematic reviews of existing research evidence are sub-optimally used in planning new research or summarizing the results. Cumulative meta-analyses of studies evaluating the effectiveness of medical interventions have shown that many clinical trials could have been avoided if a systematic review of existing evidence had been done prior to conducting a new trial. For example, Lau et al. analyzed 33 clinical trials (involving 36974 patients) evaluating the effectiveness of intravenous streptokinase for acute myocardial infarction. Their cumulative meta-analysis demonstrated that 25 of 33 trials could have been avoided if a systematic review had been conducted prior to conducting a new trial. In other words, randomizing 34542 patients was potentially unnecessary. One study analyzed 1523 clinical trials included in 227 meta-analyses and concluded that "less than one quarter of relevant prior studies" were cited. They also confirmed earlier findings that most clinical trial reports do not present a systematic review to justify the research or summarize the results.
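
The idea behind a cumulative meta-analysis can be sketched in a few lines of code. The example below is a minimal illustration, assuming a fixed-effect, inverse-variance pooling model and a 95% confidence interval; it is not the method used by Lau et al., and the trial values are hypothetical numbers invented for the illustration, not data from any real study. The function names are likewise assumptions.

```python
import math


def cumulative_meta_analysis(studies):
    """Pool studies one at a time using fixed-effect inverse-variance weights.

    `studies` is a chronologically ordered list of (effect, standard_error)
    pairs, for example log odds ratios. Returns one
    (pooled_effect, ci_low, ci_high) tuple per added study.
    """
    results = []
    weight_sum = 0.0
    weighted_effect_sum = 0.0
    for effect, standard_error in studies:
        weight = 1.0 / standard_error ** 2
        weight_sum += weight
        weighted_effect_sum += weight * effect
        pooled = weighted_effect_sum / weight_sum
        pooled_se = math.sqrt(1.0 / weight_sum)
        results.append((pooled, pooled - 1.96 * pooled_se, pooled + 1.96 * pooled_se))
    return results


def first_conclusive_step(results):
    """1-based index of the first step whose 95% interval excludes zero."""
    for step, (_, ci_low, ci_high) in enumerate(results, start=1):
        if ci_high < 0 or ci_low > 0:
            return step
    return None


# Hypothetical log odds ratios and standard errors, in order of publication.
# Negative values mean the treatment reduced mortality in that trial.
hypothetical_trials = [
    (-0.30, 0.40),
    (-0.20, 0.30),
    (-0.35, 0.25),
    (-0.25, 0.20),
    (-0.28, 0.15),
    (-0.22, 0.10),
]

results = cumulative_meta_analysis(hypothetical_trials)
conclusive_at = first_conclusive_step(results)
print(f"Evidence became conclusive after trial {conclusive_at} of {len(results)}")
```

With these made-up inputs the pooled interval first excludes zero after the fourth trial, so under this simplified model the later trials would add precision but not change the conclusion. A real analysis would also need to consider heterogeneity between trials and corrections for repeated testing.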

Many treatments used in modern medicine have been proven to be ineffective, or even harmful. A 2007 study by John Ioannidis found that it took an average of ten years for the medical community to stop referencing popular practices after their efficacy was unequivocally disproven.

Psychology

Metascience has revealed significant problems in psychological research. The field suffers from high bias, low reproducibility, and widespread misuse of statistics. The replication crisis affects psychology more strongly than any other field; as many as two-thirds of highly publicized findings may be impossible to replicate. Meta-research finds that 80-95% of psychological studies support their initial hypotheses, which strongly implies the existence of publication bias.

The replication crisis has led to renewed efforts to re-test important findings. In response to concerns about publication bias and p-hacking, more than 140 psychology journals have adopted result-blind peer review, in which studies are pre-registered and published without regard for their outcome. An analysis of these reforms estimated that 61 percent of result-blind studies produce null results, in contrast with 5 to 20 percent in earlier research. This analysis shows that result-blind peer review substantially reduces publication bias.

Psychologists routinely confuse statistical significance with practical importance, enthusiastically reporting great certainty in unimportant facts. Some psychologists have responded with an increased use of effect size statistics, rather than sole reliance on the p values.
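
The gap between statistical significance and practical importance can be made concrete with a small calculation. The sketch below assumes a large-sample z-test for the difference between two equally sized groups and uses Cohen's d as the effect size; the scores, standard deviation and sample size are hypothetical values chosen for illustration.

```python
import math


def cohens_d(mean_a, mean_b, pooled_sd):
    """Standardized mean difference between two groups."""
    return (mean_a - mean_b) / pooled_sd


def z_for_two_groups(d, n_per_group):
    """Large-sample z statistic for a standardized difference, equal group sizes."""
    return d * math.sqrt(n_per_group / 2)


def two_sided_p_from_z(z):
    """P(|Z| >= |z|) for a standard normal Z."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical test scores: a 0.3-point difference on a scale with SD 15.
d = cohens_d(100.3, 100.0, 15.0)
z = z_for_two_groups(d, n_per_group=100_000)
p = two_sided_p_from_z(z)

print(f"effect size d = {d:.3f}, z = {z:.2f}, p = {p:.2g}")
```

With 100,000 subjects per group the p-value falls far below 0.05 even though d is about 0.02, a difference that is usually regarded as negligible. Reporting the effect size alongside the p-value makes that visible.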

Physics

Richard Feynman noted that estimates of physical constants were closer to published values than would be expected by chance. This was believed to be the result of confirmation bias: results that agreed with existing literature were more likely to be believed, and therefore published. Physicists now implement blinding to prevent this kind of bias.

Associated fields

Journalology

Journalology, also known as publication science, is the scholarly study of all aspects of the academic publishing process. The field seeks to improve the quality of scholarly research by implementing evidence-based practices in academic publishing. The term "journalology" was coined by Stephen Lock, the former editor-in-chief of the BMJ. The first Peer Review Congress, held in 1989 in Chicago, Illinois, is considered a pivotal moment in the founding of journalology as a distinct field. The field of journalology has been influential in pushing for study pre-registration in science, particularly in clinical trials. Clinical-trial registration is now expected in most countries.

Scientometrics

Scientometrics concerns itself with measuring bibliographic data in scientific publications. Major research issues include the measurement of the impact of research papers and academic journals, the understanding of scientific citations, and the use of such measurements in policy and management contexts.

Scientific data science

Scientific data science is the use of data science to analyse research papers. It encompasses both qualitative and quantitative methods. Research in scientific data science includes fraud detection and citation network analysis.

Invalid science

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Invalid_science

Invalid science consists of scientific claims based on experiments that cannot be reproduced or that are contradicted by experiments that can be reproduced. Recent analyses indicate that the proportion of retracted claims in the scientific literature is steadily increasing. The number of retractions has grown tenfold over the past decade, but they still make up approximately 0.2% of the 1.4m papers published annually in scholarly journals.

The U.S. Office of Research Integrity (ORI) investigates scientific misconduct.

Incidence

Science magazine ranked first for the number of articles retracted at 70, just edging out PNAS, which retracted 69. Thirty-two of Science's retractions were due to fraud or suspected fraud, and 37 to error. A subsequent "retraction index" indicated that journals with relatively high impact factors, such as Science, Nature and Cell, had a higher rate of retractions. Of the more than 25 million papers in PubMed going back to the 1940s, under 0.1% had been retracted.

The fraction of retracted papers due to scientific misconduct was estimated at two-thirds, according to studies of 2047 papers published since 1977. Misconduct included fraud and plagiarism. Another one-fifth were retracted because of mistakes, and the rest were pulled for unknown or other reasons.

A separate study analyzed 432 claims of genetic links for various health risks that vary between men and women. Only one of these claims proved to be consistently reproducible. Another meta-review found that of the 49 most-cited clinical research studies published between 1990 and 2003, more than 40 percent were later shown to be either totally wrong or significantly incorrect.

Biological sciences

In 2012 biotech firm Amgen was able to reproduce just six of 53 important studies in cancer research. Earlier, a group at Bayer, a drug company, successfully repeated only one fourth of 67 important papers. In 2000-10 roughly 80,000 patients took part in clinical trials based on research that was later retracted because of mistakes or improprieties.

Paleontology

Nathan Myhrvold failed repeatedly to replicate the findings of several papers on dinosaur growth. Dinosaurs added a layer to their bones each year. Tyrannosaurus rex was thought to have increased in size by more than 700 kg a year, until Myhrvold showed that this was a factor of 2 too large. In 4 of 12 papers he examined, the original data had been lost. In three, the statistics were correct, while three had serious errors that invalidated their conclusions. Two papers mistakenly relied on data from these three. He discovered that some of the papers' graphs did not reflect the data. In one case, he found that only four of nine points on the graph came from data cited in the paper.

Major retractions

Torcetrapib was originally hyped as a drug that could block a protein that converts HDL cholesterol into LDL with the potential to "redefine cardiovascular treatment". One clinical trial showed that the drug could increase HDL and decrease LDL. Two days after Pfizer announced its plans for the drug, it ended the Phase III clinical trial due to higher rates of chest pain and heart failure and a 60 percent increase in overall mortality. Pfizer had invested more than $1 billion in developing the drug.

An in-depth review of the most highly cited biomarkers (whose presence is used to infer illness and measure treatment effects) claimed that 83 percent of supposed correlations became significantly weaker in subsequent studies. Homocysteine is an amino acid whose levels correlated with heart disease. However, a 2010 study showed that lowering homocysteine by nearly 30 percent had no effect on heart attack or stroke.

Priming

Priming studies claim that decisions can be influenced by apparently irrelevant events that a subject witnesses just before making a choice. Nobel Prize-winner Daniel Kahneman alleges that much of it is poorly founded. Researchers have been unable to replicate some of the more widely cited examples. A paper in PLoS ONE reported that nine separate experiments could not reproduce a study purporting to show that thinking about a professor before taking an intelligence test leads to a higher score than imagining a football hooligan. A further systematic replication involving 40 different labs around the world did not replicate the main finding. However, this latter systematic replication showed that participants who did not think there was a relation between thinking about a hooligan or a professor were significantly more susceptible to the priming manipulation.

Potential causes

Competition

In the 1950s, when academic research accelerated during the cold war, the total number of scientists was a few hundred thousand. In the new century 6m-7m researchers are active. The number of research jobs has not matched this increase. Every year six new PhDs compete for every academic post. Replicating other researchers' results is not perceived to be valuable. The struggle to compete encourages exaggeration of findings and biased data selection. A recent survey found that one in three researchers knows of a colleague who has at least somewhat distorted their results.

Publication bias

Major journals reject in excess of 90% of submitted manuscripts and tend to favor the most dramatic claims. The statistical measures that researchers use to test their claims allow a fraction of false claims to appear valid. Invalid claims are more likely to be dramatic (because they are false). Without replication, such errors are less likely to be caught.

Conversely, failures to prove a hypothesis are rarely even offered for publication. “Negative results” now account for only 14% of published papers, down from 30% in 1990. Knowledge of what is not true is as important as of what is true.

Peer review

Peer review is the primary validation technique employed by scientific publications. However, a prominent medical journal tested the system and found major failings. It supplied research with induced errors and found that most reviewers failed to spot the mistakes, even after being told of the tests.

A pseudonymous fabricated paper on the effects of a chemical derived from lichen on cancer cells was submitted to 304 journals for peer review. The paper was filled with errors of study design, analysis and interpretation. 157 lower-rated journals accepted it. Another study sent an article containing eight deliberate mistakes in study design, analysis and interpretation to more than 200 of the British Medical Journal’s regular reviewers. On average, they reported fewer than two of the problems.

Peer reviewers typically do not re-analyse data from scratch, checking only that the authors’ analysis is properly conceived.

Statistics

Type I and type II errors

Scientists divide errors into type I, incorrectly asserting the truth of a hypothesis (false positive), and type II, rejecting a correct hypothesis (false negative). Statistical checks assess the probability that data which seem to support a hypothesis come about simply by chance. If the probability is less than 5%, the evidence is rated “statistically significant”. One definitional consequence is a type I error rate of one in 20.

Statistical power

In 2005 Stanford epidemiologist John Ioannidis showed that the idea that only one paper in 20 gives a false-positive result was incorrect. He claimed, “most published research findings are probably false.” He found three categories of problems: insufficient “statistical power” (the ability to avoid type II errors); the unlikeliness of the hypothesis; and publication bias favoring novel claims.

A statistically powerful study identifies factors with only small effects on data. In general, studies that run the experiment more times on more subjects have greater power. A power of 0.8 means that of ten true hypotheses tested, the effects of two are missed. Ioannidis found that in neuroscience the typical statistical power is 0.21; another study found that psychology studies average 0.35.
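
How power grows with sample size can be sketched with a standard normal approximation. The function below is an illustrative assumption, not a calculation taken from the studies cited above: it treats the test as a two-sided z-test comparing two equally sized groups, with the effect expressed as a standardized mean difference. The effect size of 0.5 and the group sizes are hypothetical.

```python
from statistics import NormalDist


def approximate_power(effect_size, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test.

    `effect_size` is the true standardized mean difference and
    `n_per_group` the number of subjects in each of the two groups.
    """
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the z statistic when the effect is real.
    expected_z = effect_size * (n_per_group / 2) ** 0.5
    # Probability that the observed z falls in either rejection region.
    return std_normal.cdf(expected_z - z_critical) + std_normal.cdf(-expected_z - z_critical)


for n in (16, 32, 64, 128):
    print(f"n per group = {n:3d}: power = {approximate_power(0.5, n):.2f}")
```

Under these assumptions a medium-sized effect of 0.5 needs roughly 64 subjects per group to reach a power of about 0.8, while 16 per group gives a power below 0.3, so most real effects of that size would be missed.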

Unlikeliness is a measure of the degree of surprise in a result. Scientists prefer surprising results, leading them to test hypotheses that are unlikely to very unlikely. Ioannidis claimed that in epidemiology, some one in ten hypotheses should be true. In exploratory disciplines like genomics, which rely on examining voluminous data about genes and proteins, only one in a thousand should prove correct.

In a discipline in which 100 out of 1,000 hypotheses are true, studies with a power of 0.8 will find 80 and miss 20. Of the 900 incorrect hypotheses, 5% or 45 will be accepted because of type I errors. Adding the 45 false positives to the 80 true positives gives 125 positive results, or 36% specious. Dropping statistical power to 0.4, optimistic for many fields, would still produce 45 false positives but only 40 true positives, less than half.

Negative results are more reliable. Statistical power of 0.8 produces 875 negative results of which only 20 are false, giving an accuracy of over 97%. Negative results however account for a minority of published results, varying by discipline. A study of 4,600 papers found that the proportion of published negative results dropped from 30% to 14% between 1990 and 2007.
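
The arithmetic in the two preceding paragraphs can be reproduced with a short calculation. This is a minimal sketch of the expected counts only, assuming every hypothesis is tested once at a 5% significance threshold; the function and key names are chosen for illustration.

```python
def expected_outcomes(n_hypotheses, n_true, power, alpha=0.05):
    """Expected counts of each test outcome for a batch of hypotheses."""
    n_false = n_hypotheses - n_true
    true_positives = power * n_true
    false_negatives = n_true - true_positives
    false_positives = alpha * n_false
    true_negatives = n_false - false_positives
    return {
        "true_positives": true_positives,
        "false_negatives": false_negatives,
        "false_positives": false_positives,
        "true_negatives": true_negatives,
    }


def share_of_positives_that_are_false(outcomes):
    positives = outcomes["true_positives"] + outcomes["false_positives"]
    return outcomes["false_positives"] / positives


def share_of_negatives_that_are_correct(outcomes):
    negatives = outcomes["true_negatives"] + outcomes["false_negatives"]
    return outcomes["true_negatives"] / negatives


high_power = expected_outcomes(1000, 100, power=0.8)
low_power = expected_outcomes(1000, 100, power=0.4)

print(f"power 0.8: {share_of_positives_that_are_false(high_power):.0%} of positives are false")
print(f"power 0.4: {share_of_positives_that_are_false(low_power):.0%} of positives are false")
print(f"power 0.8: {share_of_negatives_that_are_correct(high_power):.1%} of negatives are correct")
```

At a power of 0.8 the calculation gives 80 true and 45 false positives, so 36% of positive results are false, and 855 of 875 negative results are correct. At a power of 0.4 the false positives outnumber the true positives, 45 to 40.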

Subatomic physics sets an acceptable false-positive rate of one in 3.5m (known as the five-sigma standard). However, even this does not provide perfect protection. The problem invalidates some three-quarters of machine learning studies according to one review.
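
The "one in 3.5m" figure follows from the tail of the normal distribution. The sketch below assumes the one-sided convention, in which the false-positive rate is the probability that a standard normal variable exceeds five standard deviations.

```python
import math


def one_sided_tail_probability(n_sigma):
    """P(Z > n_sigma) for a standard normal variable Z."""
    return 0.5 * math.erfc(n_sigma / math.sqrt(2))


p_five_sigma = one_sided_tail_probability(5.0)
print(f"five-sigma tail probability: {p_five_sigma:.3g}")
print(f"roughly one in {1 / p_five_sigma:,.0f}")
```

For comparison, the conventional 5% threshold used in many other fields corresponds to about 1.64 standard deviations one-sided, or about 1.96 two-sided.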

Statistical significance

Statistical significance is a measure for testing statistical correlation. It was invented by English mathematician Ronald Fisher in the 1920s. It defines a “significant” result as any data point that would be produced by chance less than 5 (or more stringently, 1) percent of the time. A significant result is widely seen as an important indicator that the correlation is not random.

While correlations track the relationship between truly independent measurements, such as smoking and cancer, they are much less effective when variables cannot be isolated, a common circumstance in biological systems. For example, statistics found a high correlation between lower back pain and abnormalities in spinal discs, although it was later discovered that serious abnormalities were present in two-thirds of pain-free patients.

Minimum threshold publishers

Journals such as PLoS One use a “minimal-threshold” standard, seeking to publish as much science as possible, rather than to pick out the best work. Their peer reviewers assess only whether a paper is methodologically sound. Almost half of their submissions are still rejected on that basis.

Unpublished research

Only 22% of the clinical trials financed by the National Institutes of Health (NIH) released summary results within one year of completion, even though the NIH requires it. Fewer than half published within 30 months; a third remained unpublished after 51 months. When other scientists rely on invalid research, they may waste time on lines of research that are themselves invalid. The failure to report failures means that researchers waste money and effort exploring blind alleys already investigated by other scientists.

Fraud

In 21 surveys of academics (mostly in the biomedical sciences but also in civil engineering, chemistry and economics) carried out between 1987 and 2008, 2% admitted fabricating data, but 28% claimed to know of colleagues who engaged in questionable research practices.

Lack of access to data and software

Clinical trials are generally too costly to rerun. Access to trial data is the only practical approach to reassessment. A campaign to persuade pharmaceutical firms to make all trial data available won its first convert in February 2013 when GlaxoSmithKline became the first to agree.

Software used in a trial is generally considered to be proprietary intellectual property and is not available to replicators, further complicating matters. Journals that insist on data-sharing tend not to do the same for software.

Even well-written papers may not include sufficient detail and/or tacit knowledge (subtle skills and extemporisations not considered notable) for the replication to succeed. One cause of replication failure is insufficient control of the protocol, which can cause disputes between the original and replicating researchers.

Reform

Statistics training

Geneticists have begun more careful reviews, particularly of the use of statistical techniques. The effect was to stop a flood of specious results from genome sequencing.

Protocol registration

Registering research protocols in advance and monitoring them over the course of a study can prevent researchers from modifying the protocol midstream to highlight preferred results. Providing raw data for other researchers to inspect and test can also better hold researchers to account.

Post-publication review

Replacing peer review with post-publication evaluations can encourage researchers to think more about the long-term consequences of excessive or unsubstantiated claims. That system was adopted in physics and mathematics with good results.

Replication

Few researchers, especially junior workers, seek opportunities to replicate others' work, partly to protect relationships with senior researchers.

Reproduction benefits from access to the original study's methods and data. More than half of 238 biomedical papers published in 84 journals failed to identify all the resources (such as chemical reagents) necessary to reproduce the results. In 2008 some 60% of researchers said they would share raw data; in 2013 just 45% did. Journals have begun to demand that at least some raw data be made available, although only 143 of 351 randomly selected papers covered by some data-sharing policy actually complied.

The Reproducibility Initiative is a service allowing life scientists to pay to have their work validated by an independent lab. In October 2013 the initiative received funding to review 50 of the highest-impact cancer findings published between 2010 and 2012. Blog Syn is a website run by graduate students that is dedicated to reproducing chemical reactions reported in papers.

In 2013 replication efforts received greater attention. Nature and related publications introduced an 18-point checklist for life science authors in May, in an effort to ensure that their published research can be reproduced. Expanded "methods" sections and all data were to be available online. The Center for Open Science opened as an independent laboratory focused on replication. The journal Perspectives on Psychological Science announced a section devoted to replications. Another project announced plans to replicate 100 studies published in the first three months of 2008 in three leading psychology journals.

Major funders, including the European Research Council, the US National Science Foundation and Research Councils UK, have not changed their preference for new work over replications.

Publication bias

From Wikipedia, the free encyclopedia

https://en.wikipedia.org/wiki/Publication_bias

Publication bias is a type of bias that occurs in published academic research. It occurs when the outcome of an experiment or research study influences the decision whether to publish or otherwise distribute it. Publishing only results that show a significant finding disturbs the balance of findings, and inserts bias in favor of positive results. The study of publication bias is an important topic in metascience.

Studies with significant results can be of the same standard as studies with a null result with respect to quality of execution and design. However, statistically significant results are three times more likely to be published than papers with null results. A consequence of this is that researchers are unduly motivated to manipulate their practices to ensure that a statistically significant result is reported.

Multiple factors contribute to publication bias. For instance, once a scientific finding is well established, it may become newsworthy to publish reliable papers that fail to reject the null hypothesis. It has been found that the most common reason for non-publication is simply that investigators decline to submit results, leading to non-response bias. Factors cited as underlying this effect include investigators assuming they must have made a mistake, failure to support a known finding, loss of interest in the topic, or anticipation that others will be uninterested in the null results. The nature of these issues and the problems they have triggered have been referred to as the 5 diseases that threaten science, which include: "significosis, an inordinate focus on statistically significant results; neophilia, an excessive appreciation for novelty; theorrhea, a mania for new theory; arigorium, a deficiency of rigor in theoretical and empirical work; and finally, disjunctivitis, a proclivity to produce large quantities of redundant, trivial, and incoherent works."

Attempts to identify unpublished studies often prove difficult or are unsatisfactory. In an effort to combat this problem, some journals require that studies submitted for publication are pre-registered (registering a study prior to collection of data and analysis) with organizations like the Center for Open Science.

Other proposed strategies to detect and control for publication bias include p-curve analysis and disfavoring small and non-randomised studies because of their demonstrated high susceptibility to error and bias.

Definition

Publication bias occurs when the publication of research results depends not just on the quality of the research but also on the hypothesis tested, and the significance and direction of effects detected. The subject was first discussed in 1959 by statistician Theodore Sterling, who referred to fields in which "successful" research is more likely to be published. As a result, "the literature of such a field consists in substantial part of false conclusions resulting from errors of the first kind in statistical tests of significance". In the worst case, false conclusions could become canonized as true if the publication rate of negative results is too low.

Publication bias is sometimes called the file-drawer effect, or file-drawer problem. This term suggests that results not supporting the hypotheses of researchers often go no further than the researchers' file drawers, leading to a bias in published research. The term "file drawer problem" was coined by psychologist Robert Rosenthal in 1979.

Positive-results bias, a type of publication bias, occurs when authors are more likely to submit, or editors are more likely to accept, positive results than negative or inconclusive results. Outcome reporting bias occurs when multiple outcomes are measured and analyzed, but the reporting of these outcomes is dependent on the strength and direction of their results. A generic term coined to describe these post-hoc choices is HARKing ("Hypothesizing After the Results are Known").

Evidence

Meta-analysis of stereotype threat on girls' math scores showing asymmetry typical of publication bias. From Flore, P. C., & Wicherts, J. M. (2015)

There is extensive meta-research on publication bias in the biomedical field. Investigators following clinical trials from the submission of their protocols to ethics committees (or regulatory authorities) until the publication of their results observed that those with positive results are more likely to be published. In addition, studies often fail to report negative results when published, as demonstrated by research comparing study protocols with published articles.

The presence of publication bias has been investigated in meta-analyses. The largest such analysis investigated the presence of publication bias in systematic reviews of medical treatments from the Cochrane Library. The study showed that positive, statistically significant findings are 27% more likely to be included in meta-analyses of efficacy than other findings. Results showing no evidence of adverse effects have a 78% greater probability of inclusion in safety studies than statistically significant results showing adverse effects. Evidence of publication bias was also found in meta-analyses published in prominent medical journals.

Impact on meta-analysis

Where publication bias is present, published studies are no longer a representative sample of the available evidence. This bias distorts the results of meta-analyses and systematic reviews. For example, evidence-based medicine is increasingly reliant on meta-analysis to assess evidence.
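
The distortion described above can be illustrated with a small simulation. The sketch below is only an illustration under assumed conditions, not a model taken from the literature: every study estimates the same true effect from normally distributed data with unit variance, and a study counts as "published" only if its one-sided z statistic exceeds 1.96. The function name simulate_literature and all parameter values are hypothetical.

```python
import math
import random
import statistics


def simulate_literature(true_effect, n_studies, n_per_study, seed=0):
    """Simulate study estimates and a significance-based publication filter.

    Assumption: outcomes are normal with known unit variance, so each
    study's standard error is 1 / sqrt(n_per_study).
    Returns (all_estimates, published_estimates).
    """
    rng = random.Random(seed)
    std_error = 1.0 / math.sqrt(n_per_study)
    all_estimates = []
    published = []
    for _ in range(n_studies):
        sample = [rng.gauss(true_effect, 1.0) for _ in range(n_per_study)]
        estimate = statistics.fmean(sample)
        all_estimates.append(estimate)
        # Hypothetical filter: only positive, significant results are published.
        if estimate / std_error > 1.96:
            published.append(estimate)
    return all_estimates, published


all_estimates, published = simulate_literature(
    true_effect=0.2, n_studies=2000, n_per_study=20
)
print("mean of all studies:      ", round(statistics.fmean(all_estimates), 3))
print("mean of published studies:", round(statistics.fmean(published), 3))
print("share published:          ", round(len(published) / len(all_estimates), 3))
```

Under these assumptions the mean of the published estimates exceeds the mean of all estimates, because only studies that happened to overestimate the effect cleared the significance threshold. A meta-analysis restricted to the published subset would inherit that overestimate.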

Meta-analyses and systematic reviews can account for publication bias by including evidence from unpublished studies and the grey literature. The presence of publication bias can also be explored by constructing a funnel plot in which the estimate of the reported effect size is plotted against a measure of precision or sample size. The premise is that the scatter of points should reflect a funnel shape, indicating that the reporting of effect sizes is not related to their statistical significance. However, when small studies are predominantly in one direction (usually the direction of larger effect sizes), asymmetry will ensue and this may be indicative of publication bias.

Because an inevitable degree of subjectivity exists in the interpretation of funnel plots, several tests have been proposed for detecting funnel plot asymmetry. These are often based on linear regression, and may adopt a multiplicative or additive dispersion parameter to adjust for the presence of between-study heterogeneity. Some approaches may even attempt to compensate for the (potential) presence of publication bias, which is particularly useful to explore the potential impact on meta-analysis results.
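
One widely used regression-based test of this kind is Egger's test, which regresses each study's standardized effect (the effect divided by its standard error) on its precision (the reciprocal of the standard error). An intercept far from zero, relative to its standard error, suggests funnel plot asymmetry. The following is a minimal sketch using ordinary least squares and made-up numbers, without the dispersion adjustments mentioned above; the function name egger_regression and the example data are illustrative assumptions.

```python
import math


def egger_regression(effects, std_errors):
    """Fit standardized effect = intercept + slope * precision by OLS.

    Returns (intercept, slope, intercept_std_error). The intercept is the
    asymmetry term; the slope estimates the underlying effect.
    """
    n = len(effects)
    if n != len(std_errors) or n < 3:
        raise ValueError("need at least three studies with matching lengths")

    precision = [1.0 / se for se in std_errors]
    standardized = [e / se for e, se in zip(effects, std_errors)]

    mean_x = sum(precision) / n
    mean_y = sum(standardized) / n
    sxx = sum((x - mean_x) ** 2 for x in precision)
    if sxx == 0:
        raise ValueError("all studies have the same standard error")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(precision, standardized))

    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom.
    residual_ss = sum((y - (intercept + slope * x)) ** 2
                      for x, y in zip(precision, standardized))
    residual_var = residual_ss / (n - 2)
    intercept_se = math.sqrt(residual_var * (1.0 / n + mean_x ** 2 / sxx))
    return intercept, slope, intercept_se


# Made-up data: smaller studies (larger standard errors) report larger effects.
std_errors = [0.10, 0.15, 0.20, 0.30, 0.40, 0.50]
effects = [0.52, 0.61, 0.75, 0.78, 0.95, 0.98]
intercept, slope, intercept_se = egger_regression(effects, std_errors)
print("intercept:", round(intercept, 3), "+/-", round(intercept_se, 3))
print("slope:    ", round(slope, 3))
```

Dividing the intercept by its standard error gives a t statistic with n - 2 degrees of freedom, which can then be compared against a t distribution. With few studies the test has low power, so a small intercept does not rule out publication bias.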

Compensation examples

Two meta-analyses of the efficacy of reboxetine as an antidepressant demonstrated attempts to detect publication bias in clinical trials. Based on positive trial data, reboxetine was originally approved as a treatment for depression in many countries in Europe and the UK in 2001 (though in practice it is rarely used for this indication). A 2010 meta-analysis concluded that reboxetine was ineffective and that the preponderance of positive-outcome trials reflected publication bias, mostly due to trials published by the drug manufacturer Pfizer. A subsequent meta-analysis published in 2011, based on the original data, found flaws in the 2010 analyses and suggested that the data indicated reboxetine was effective in severe depression. Further examples of publication bias are given by Ben Goldacre and Peter Wilmshurst.

In the social sciences, a study of published papers exploring the relationship between corporate social and financial performance found that "in economics, finance, and accounting journals, the average correlations were only about half the magnitude of the findings published in Social Issues Management, Business Ethics, or Business and Society journals".

One example cited as an instance of publication bias is the refusal by The Journal of Personality and Social Psychology (the original publisher of Bem's article) to publish attempted replications of Bem's work that claimed evidence for precognition.

An analysis comparing studies of gene-disease associations originating in China to those originating outside China found that those conducted within the country reported a stronger association and a more statistically significant result.

Risks

John Ioannidis argues that "claimed research findings may often be simply accurate measures of the prevailing bias." He lists the following factors as those that make positive-result papers more likely to enter the literature and negative-result papers more likely to be suppressed:

The studies conducted in a field have small sample sizes.
The effect sizes in a field tend to be smaller.
There is both a greater number and lesser preselection of tested relationships.
There is greater flexibility in designs, definitions, outcomes, and analytical modes.
There are prejudices (financial interest, political, or otherwise).
The scientific field is hot and there are more scientific teams pursuing publication.

Other factors include experimenter bias and white hat bias.

Remedies

Publication bias can be contained through better-powered studies, enhanced research standards, and careful consideration of true and non-true relationships. Better-powered studies refer to large studies that deliver definitive results or test major concepts and lead to low-bias meta-analysis. Enhanced research standards such as the pre-registration of protocols, the registration of data collections and adherence to established protocols are other techniques. To avoid false-positive results, the experimenter must consider the chances that they are testing a true or non-true relationship. This can be undertaken by properly assessing the false positive report probability based on the statistical power of the test and reconfirming (whenever ethically acceptable) established findings of prior studies known to have minimal bias.
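
The false positive report probability mentioned above can be computed from the significance level, the statistical power, and an assumed prior probability that the tested relationship is true. The sketch below uses the usual Bayes-rule form, FPRP = alpha * (1 - prior) / (alpha * (1 - prior) + power * prior); the function name and the example values are assumptions chosen for illustration.

```python
def false_positive_report_probability(alpha, power, prior):
    """Probability that a statistically significant finding is a false positive.

    alpha: significance level (probability of a type I error)
    power: probability of detecting a true relationship (1 - beta)
    prior: assumed probability that the tested relationship is true
    """
    for name, value in (("alpha", alpha), ("power", power), ("prior", prior)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie between 0 and 1")
    false_positives = alpha * (1.0 - prior)
    true_positives = power * prior
    if false_positives + true_positives == 0:
        raise ValueError("no significant results are possible with these inputs")
    return false_positives / (false_positives + true_positives)


# Same test (alpha = 0.05, power = 0.8) under two assumed priors.
print(f"{false_positive_report_probability(0.05, 0.8, 0.5):.3f}")  # 0.059
print(f"{false_positive_report_probability(0.05, 0.8, 0.1):.3f}")  # 0.360
```

The two calls show why the prior matters: with identical alpha and power, lowering the assumed chance of a true relationship from one in two to one in ten raises the share of significant findings that are false from about 6% to 36%. Lower power pushes the figure higher still, which is one argument for better-powered studies.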

Study registration

In September 2004, editors of prominent medical journals (including the New England Journal of Medicine, The Lancet, Annals of Internal Medicine, and JAMA) announced that they would no longer publish results of drug research sponsored by pharmaceutical companies unless that research was registered in a public clinical trials registry database from the start. Furthermore, some journals (e.g. Trials) encourage publication of study protocols in their journals.

The World Health Organization (WHO) agreed that basic information about all clinical trials should be registered at the study's inception, and that this information should be publicly accessible through the WHO International Clinical Trials Registry Platform. Additionally, public availability of complete study protocols, alongside reports of trials, is becoming more common for studies.

Atmospheric refraction