Reproduction and replication in scientific research – part 1

By Ernesto Spinak

Introduction1,2

Image: Screenshot from the public domain film Maniac (1934), showing Horace B. Carpenter as the character "Dr. Meirschultz".

The scientific enterprise depends on the scientific community’s ability to examine its claims and gain confidence over time in results and inferences that have withstood repeated testing. Reporting uncertainties in scientific results is a core principle of the scientific process. It is up to scientists to convey the appropriate information and the degree of uncertainty in presenting their claims. Science points to refined degrees of confidence rather than complete certainty.3

Many scientific studies seek to measure, explain, and make predictions about natural phenomena. Other studies seek to detect and measure the effects of an intervention on a system. Statistical inference provides a conceptual and computational framework for addressing the scientific questions in each setting. Estimation and hypothesis testing are the two broad groupings of inferential procedures.
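
To make these two groupings concrete, here is a minimal, hypothetical sketch in Python (using NumPy and SciPy, neither of which is mentioned in the original text): the same simulated data are used first to estimate an effect with a confidence interval, and then to test the hypothesis that there is no effect at all.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)  # fixed seed so the example itself is repeatable

# Simulated measurements for a control group and a treated group
control = rng.normal(loc=10.0, scale=2.0, size=50)
treated = rng.normal(loc=11.0, scale=2.0, size=50)

# Estimation: point estimate of the difference and an approximate 95% confidence interval
diff = treated.mean() - control.mean()
se = np.sqrt(treated.var(ddof=1) / treated.size + control.var(ddof=1) / control.size)
ci_low, ci_high = diff - 1.96 * se, diff + 1.96 * se

# Hypothesis testing: two-sample t-test of "no difference between the groups"
t_stat, p_value = stats.ttest_ind(treated, control)

print(f"estimated difference = {diff:.2f}, 95% CI = ({ci_low:.2f}, {ci_high:.2f})")
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
```

Both procedures operate on the same data; the difference lies in whether the question asked is "how large is the effect?" or "is there evidence of any effect at all?"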

Reproducibility and replicability in data-intensive and computationally intensive scientific work have received growing attention as computational tools have become ubiquitous. In the 1990s, Jon Claerbout launched the “reproducible research movement”, built on the assumption that reanalysis of the same data using the same methods would yield the same results.
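
In Claerbout's sense, this kind of reproducibility can be checked mechanically: re-running the same code on the same archived data should produce identical results. The sketch below (the file name measurements.csv and the analysis function are hypothetical, used only for illustration) shows one way to verify that, by fingerprinting the output of two runs.

```python
import hashlib
import json

import numpy as np


def analysis(data: np.ndarray) -> dict:
    """The 'methods': a deterministic analysis applied to the raw data."""
    return {"mean": float(np.mean(data)), "std": float(np.std(data, ddof=1))}


def fingerprint(results: dict) -> str:
    """Hash the results so that two runs can be compared byte for byte."""
    return hashlib.sha256(json.dumps(results, sort_keys=True).encode()).hexdigest()


# Same data, same methods: the fingerprints of two runs must match exactly.
data = np.loadtxt("measurements.csv", delimiter=",")  # hypothetical archived dataset
assert fingerprint(analysis(data)) == fingerprint(analysis(data)), "analysis is not reproducible"
```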

With the onset of massive data analysis, particularly in the medical sciences, the following questions were introduced:

  • How should studies and research be designed in the various disciplines that rely on algorithmic and computational models?
  • Can these approaches be designed to generate reliable knowledge efficiently?
  • How can hypotheses and results be better communicated so that others can confirm, refute, or build on them?
  • How can scientists’ potential biases be understood, identified, and exposed to improve accuracy in generating and interpreting research results?

To summarize, we consider reproducibility to mean “obtaining the same results in an independent study whose procedures are as close as possible to those of the original experiment”. In practice, however, things have not turned out that way.

The sense of crisis began with widespread public awareness of reproducibility failures when the Center for Open Science announced in 2015 that it had been able to confirm only 39 out of 100 published psychology studies. For many scientists, and not just in psychology, attempts at reproduction have not worked or have not been effective.

For more than a decade, the inability to reproduce discoveries in various disciplines, including biomedical, behavioral, and social sciences, has led some authors to assert the existence of a so-called “reproducibility crisis” in these disciplines. Some symptoms detected were:

  • several aspects of the reproducibility of scientific studies, starting with the very definition of reproducibility, were interpreted ambiguously;
  • several variables involved in evaluating the success of attempts to reproduce a study were identified, as well as other factors suggested as responsible for reproducibility failures;
  • various types of validity of experimental studies and threats to validity regarding reproducibility were observed;
  • these ambiguities and uncertainties have been presented in the behavior science/analysis literature as evidence of threats to reproducibility.

Much of the criticism and commentary about reproducibility, and about solutions to the crisis, whether real or perceived, has focused on statistics, methodologies, and how results are reported. Over the past decade, statisticians have shown how statistics can be unintentionally misused, or in some cases deliberately abused, as researchers try to produce results that appeal to professional colleagues and potential funders.
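
One of the patterns statisticians have documented is that testing many outcomes and reporting only whatever crosses the p < 0.05 threshold manufactures “significant” findings out of noise. The simulation below is a hypothetical illustration of that effect, not an analysis from any of the cited studies: it tests 20 unrelated outcomes per study on pure noise and counts how often at least one looks significant.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
n_studies, n_outcomes, n_subjects = 5_000, 20, 30

studies_with_false_positive = 0
for _ in range(n_studies):
    # Two groups drawn from the same distribution: any "effect" is pure noise.
    group_a = rng.normal(size=(n_outcomes, n_subjects))
    group_b = rng.normal(size=(n_outcomes, n_subjects))
    p_values = stats.ttest_ind(group_a, group_b, axis=1).pvalue
    if (p_values < 0.05).any():
        studies_with_false_positive += 1

# With 20 uncorrelated outcomes, roughly 1 - 0.95**20 (about 64%) of these
# null "studies" still contain at least one nominally significant result.
print(studies_with_false_positive / n_studies)
```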

Peer review does not protect us enough either. To this day, peer review is what supposedly guarantees what has been published: that the findings are correct and, implicitly, that they could be reproduced if other researchers tried. However, as several scholars have shown, the current incarnation of peer review – in which submissions to journals are reviewed by anonymous peers – is a historical accident, far from being a procedure devised to separate truth from fiction.

In the late 1990s, peer review came under heavy criticism. Its many flaws, which contribute directly to reproducibility difficulties, are by now well known, but one bears repeating: studies with negative or null results are rarely reported and few are published, opening the door to false positives. The expectation is that open science practices – initiating research communication through preprints, sharing the data underlying articles, and publishing review reports – can alleviate this problem.

Reproducibility failures increase research costs, particularly in the health sciences. Let’s look at some recent articles as examples.

“Low reproducibility rates in life science research undermine the production of cumulative knowledge and contribute to delays and costs in the development of therapeutic drugs. An analysis of studies between 2012 and 2015 indicates that the cumulative (total) prevalence of non-reproducible preclinical research exceeds 50%, resulting in approximately $28 billion/year spent on non-reproducible preclinical research in the United States alone.”4

Also significant was a 2005 article by John Ioannidis provocatively titled Why Most Published Research Findings Are False.5 Ioannidis argued that “most research findings are false for most research designs and for most fields”5 due to a combination of design, analysis, and reporting biases; tests carried out by several independent teams leading to the publication of false-positive results; and studies with low statistical power. Admitting that there was no way to reach 100% certainty, Ioannidis called for evidence with greater probative power, correction for publication bias, and attention to other forms of bias.6
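
The core of Ioannidis’s argument can be expressed through the positive predictive value (PPV) of a claimed finding – the probability that a “statistically significant” result is actually true – which depends on the pre-study odds R that the tested relationship exists, the statistical power 1 − β, and the significance threshold α: before accounting for bias, PPV = (1 − β)R / ((1 − β)R + α). The specific numbers in the sketch below are illustrative, not taken from the paper.

```python
def ppv(prior_odds: float, power: float, alpha: float = 0.05) -> float:
    """Positive predictive value of a 'significant' finding, ignoring bias."""
    true_positives = power * prior_odds
    false_positives = alpha
    return true_positives / (true_positives + false_positives)


# Exploratory field: 1 tested relationship in 20 is real, studies are underpowered.
print(ppv(prior_odds=1 / 20, power=0.35))  # ~0.26 – most "positive" findings are false
# Confirmatory field: even pre-study odds and well-powered studies.
print(ppv(prior_odds=1.0, power=0.80))     # ~0.94
```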

“Currently, many published research results are false or exaggerated. It is estimated that 85% of research resources are wasted.”7

For those involved in discussions of rigor, reproducibility, and replication in science, conversations about the “reproducibility crisis” seem ill-structured.

Apparently, many different issues fall under this label, and not just those related to the “purity of reagents, the accessibility of computer code, or the structure of incentives in academic research”.

Papers over the past two decades have attempted to address these problems by creating various definitions of the terms under discussion, such as reproducibility, replicability, and so on. A correspondence analysis of terminology in scientific publications, carried out by Nelson et al. in Mapping the discursive dimensions of the reproducibility crisis: A mixed methods analysis (2021),8 identified three discussion groups in the articles: one focused on the use of reagents, another on statistical methods, and a third on the heterogeneity of the natural world.

Daniele Fanelli and John Ioannidis of the Meta-Research Innovation Center at Stanford have argued that “the lexicon of reproducibility to date has been multiple and ill-defined,”9 and that the lack of clarity about the specific types of reproducibility that are discussed has been an impediment to progress on these issues. Many commentators have pointed out that there is considerable confusion between the terms reproducibility and replicability, and that these terms are often used interchangeably in the literature. Victoria Stodden has argued that there are three main types of reproducibility: empirical, statistical, and computational, each representing a different narrative linked to a different discipline.

So far, scholars have tried to address these concerns by proposing clarifying definitions or typologies to guide discussions. The 2019 report Reproducibility and Replicability in Science3 from the National Academies of Sciences, Engineering, and Medicine addresses the problem of terminological confusion and draws a defining distinction between reproducibility and replicability, a distinction that aligns with the use of these terms in computer science but is at odds with the more flexible usage of relevant organizations such as the Center for Open Science and with the plans of the National Institutes of Health.

Many commentators argue that reproducibility is a societal problem that will require changes in the culture of science and, therefore, methodologies designed to study cultural variation and change – participant observation, ethnography, cross-cultural comparison, qualitative analysis, and data analysis – methodologies that are rarely used in reproducibility-oriented research. Achieving lasting change in scientific cultures will first require a more systematic understanding of the variation in how scientists interpret reproducibility issues, in order to create “culturally competent”8 interventions.

Before examining the theories that underlie the lack of replicability of experiments, let us look at some basic issues that might explain it. The formal and epistemological foundations will be presented in a later contribution, where we examine in detail the National Academies of Sciences document already mentioned.

Below is an example of a study in which the non-replication of results led researchers to look for the source of the discrepancies and ultimately increased their understanding of the system under study and of how it should be reported.

How do you determine to what extent a replication attempt was successful or not? Sometimes the problem is simply that the report is not clear or detailed enough about the procedures.

Two independent laboratories were performing experiments on breast tissue using what they assumed was the same protocol; however, their results kept differing. When researchers from the two labs sat side by side to perform the experiment, they found that one lab was gently removing cells from the flask, while the other was using a more vigorous stirring procedure.

Both methods are common, so neither group of researchers had thought to mention this detail of the stirring process. Before they discovered the variation in technique, it was not known that the stirring method could affect the outcome of this experiment. Once it was discovered, failure to specify the stirring method in a study’s methods section became an avoidable source of non-replicability.

Non-replicability can also be the result of human error or an inexperienced researcher. Deficiencies in the design, conduct of a study, or subsequent communication may contribute to non-replicability. We consider here a selected set of such avoidable sources of non-replication, which will be explained in detail in future notes:

  • Publication bias;
  • Misaligned incentives for publication;
  • The use of inappropriate statistical inference;
  • Deficient study design;
  • Errors in conducting the experiment;
  • Incomplete report of a study.

To complete today’s note, let’s provide some useful suggestions:

The systems needed to promote reproducible research must come from institutions, from scientists and from sponsors, since journals cannot build them alone. These types of changes will require additional resources, infrastructure, personnel, and procedures. The burden on institutions and researchers will be real, but so will the burden of irreproducible research.

To make published research more credible, practices that have improved credibility and efficiency in specific fields can be transplanted so that other disciplines benefit from them; possibilities include the adoption of large-scale collaborative systems.

It is necessary to change the system of incentives and rewards in science – affecting, for example, publications, grants, and academic advancement – so that it is better aligned with reproducible research.

Conclusion

In this post, we present a current overview of the problems of Replicability and Reproducibility in scholarly communication. In the next two posts we will address the philosophical foundations offered by the National Academies of Sciences, the extent to which these guidelines are applicable to the social sciences and humanities, and what contribution open science, open peer review, and preprint servers can make.

The Replication in Research series consists of three posts.

Notes

1. The issue of replicability in scientific publishing has been described as being in “crisis”; starting with this note, we will dedicate a series of posts to explaining the meanings of the terms Replicability, Reproducibility, Robustness, and Generalizability.

We will also analyze how replicability is understood in different scientific disciplines, what the most frequent errors are, and what bearing they have on the validation of published scientific knowledge.

The Research Replication series will consist of three posts:

  1. Scenario of the supposed “crisis” of replication in scientific publishing.
  2. Expert comments on the terminology used: (a) the Guide published by the National Academies of Sciences, Engineering, and Medicine trying to standardize the concepts; (b) Differing opinions from the Guide of Social Sciences and Humanities disciplines whose paradigms do not conform to the exact sciences, and (c) views of the medical sciences that point to other problems.
  3. Summary of previous notes with suggestions for resolving the biggest conflicts that point to Open Science and the use of preprint servers.

2. Many of the concepts expressed in this series of notes are adapted from a guidance document published by the National Academies Press3 in 2019.

3. NATIONAL ACADEMIES OF SCIENCES, ENGINEERING, AND MEDICINE. Reproducibility and Replicability in Science. Washington, DC: The National Academies Press, 2019. https://doi.org/10.17226/25303. Available from: https://nap.nationalacademies.org/catalog/25303/reproducibility-and-replicability-in-science

4. FREEDMAN, L.P., COCKBURN, I.M. and SIMCOE, T.S. The Economics of Reproducibility in Preclinical Research. PLoS Biol [online]. 2015, vol. 13, no. 6, e1002165 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pbio.1002165. Available from: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165

5. IOANNIDIS, J.P.A. Why Most Published Research Findings Are False. PLoS Med [online]. 2005, vol. 2, no. 8, e124 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pmed.0020124. Available from: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124

6. SAYRE, F. and RIEGELMAN, A. The reproducibility crisis and academic libraries. College & Research Libraries [online]. 2018, vol. 79, no. 1 [viewed 19 May 2023]. https://doi.org/10.5860/crl.79.1.2. Available from: https://crl.acrl.org/index.php/crl/article/view/16846

7. IOANNIDIS, J.P.A. How to make more published research true. PLoS Med [online]. 2014, vol. 11, e1001747 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pmed.1001747. Available from: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001747

8. NELSON, N.C., et al. Mapping the discursive dimensions of the reproducibility crisis: A mixed methods analysis. PLoS ONE [online]. 2021, vol. 16, no. 7, e0254090 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pone.0254090. Available from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0254090

9. GOODMAN, S.N., FANELLI, D. and IOANNIDIS, J.P.A. What does research reproducibility mean? Science Translational Medicine [online]. 2016, vol. 8, no. 341, 341ps12 [viewed 19 May 2023]. https://doi.org/10.1126/scitranslmed.aaf5027. Available from: https://www.science.org/doi/10.1126/scitranslmed.aaf5027

References

CLAERBOUT, J.F., AND KARRENBACH, M. Electronic Documents Give Reproducible Research a New Meaning. SEG Technical Program Expanded Abstracts. 1992, 601-604 [viewed 19 May 2023]. https://doi.org/10.1190/1.1822162. Available from: https://library.seg.org/doi/abs/10.1190/1.1822162

COLLINS, F. AND TABAK, L. Policy: NIH plans to enhance reproducibility. Nature [online]. 2014, vol. 505, pp. 612–613 [viewed 19 May 2023]. https://doi.org/10.1038/505612a. Available from: https://www.nature.com/articles/505612a

FREEDMAN, L.P., COCKBURN, I.M. and SIMCOE, T.S. The Economics of Reproducibility in Preclinical Research. PLoS Biol [online]. 2015, vol. 13, no. 6, e1002165 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pbio.1002165. Available from: https://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002165

GOODMAN, S.N., FANELLI, D. and IOANNIDIS, J.P.A. What does research reproducibility mean? Science Translational Medicine [online]. 2016, vol. 8, no. 341, 341ps12 [viewed 19 May 2023]. https://doi.org/10.1126/scitranslmed.aaf5027. Available from: https://www.science.org/doi/10.1126/scitranslmed.aaf5027

HARRIS, R.F. Rigor Mortis: How Sloppy Science Creates Worthless Cures, Crushes Hope, and Wastes Billions. New York: Basic Books, 2017.

HINES, W.C., et al. Sorting out the FACS: A Devil in the Details. Cell Reports. 2014, vol. 6, no. 5, pp. 779-781 [viewed 19 May 2023]. http://doi.org/10.1016/j.celrep.2014.02.021. Available from: https://www.cell.com/cell-reports/fulltext/S2211-1247(14)00121-1

IOANNIDIS, J.P.A. How to make more published research true. PLoS Med [online]. 2014, vol. 11, e1001747 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pmed.1001747. Available from: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.1001747

IOANNIDIS, J.P.A. Why Most Published Research Findings Are False. PLoS Med [online]. 2005, vol. 2, no. 8, e124 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pmed.0020124. Available from: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124

LARAWAY, S, et al. An Overview of Scientific Reproducibility: Consideration of Relevant Issues for Behavior Science/Analysis. Perspect Behav Sci [online]. 2019, vol.42, no. 1, pp. 33-57 [viewed 19 May 2023]. https://doi.org/10.1007/s40614-019-00193-3. Available from: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6701706/

NATIONAL ACADEMIES OF SCIENCES, ENGINEERING, AND MEDICINE. Reproducibility and Replicability in Science. Washington, DC: The National Academies Press, 2019. https://doi.org/10.17226/25303. Available from: https://nap.nationalacademies.org/catalog/25303/reproducibility-and-replicability-in-science

NELSON, N.C., et al. Mapping the discursive dimensions of the reproducibility crisis: A mixed methods analysis. PLoS ONE [online]. 2021, vol. 16, no. 7, e0254090 [viewed 19 May 2023]. https://doi.org/10.1371/journal.pone.0254090. Available from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0254090

OPEN SCIENCE COLLABORATION. An open, large-scale, collaborative effort to estimate the reproducibility of psychological science. Perspect Psychol Sci [online]. 2012, vol. 7, pp. 657–660 [viewed 19 May 2023]. https://doi.org/10.1177/1745691612462588. Available from: https://journals.sagepub.com/doi/10.1177/1745691612462588

POPPER, K. The Logic of Scientific Discovery. London: Routledge, 2005.

SAYRE, F. and RIEGELMAN, A. The reproducibility crisis and academic libraries. College & Research Libraries [online]. 2018, vol. 79, no. 1 [viewed 19 May 2023]. https://doi.org/10.5860/crl.79.1.2. Available from: https://crl.acrl.org/index.php/crl/article/view/16846

STODDEN, V. Resolving irreproducibility in empirical and computational research [online]. IMS Bulletin blog, 2013 [viewed 19 May 2023]. Available from: https://imstat.org/2013/11/17/resolving-irreproducibility-in-empirical-and-computational-research/

STUPPLE, A., SINGERMAN, D. and CELI, L.A. The reproducibility crisis in the age of digital medicine. npj Digit. Med. [online]. 2019, vol. 2, no. 1 [viewed 19 May 2023]. https://doi.org/10.1038/s41746-019-0079-z. Available from: https://www.nature.com/articles/s41746-019-0079-z

External link

Center for Open Science: https://www.cos.io/

 

About Ernesto Spinak

Collaborator on the SciELO program, a Systems Engineer with a Bachelor’s degree in Library Science, and a Diploma of Advanced Studies from the Universitat Oberta de Catalunya (Barcelona, Spain) and a Master’s in “Sociedad de la Información” (Information Society) from the same university. Currently has a consulting company that provides services in information projects to 14 government institutions and universities in Uruguay.

 

Translated from the original in Spanish by Lilian Nassi-Calò.

 

How to cite this post [ISO 690/2010]:

SPINAK, E. Reproduction and replication in scientific research – part 1 [online]. SciELO in Perspective, 2023 [viewed ]. Available from: https://blog.scielo.org/en/2023/05/19/reproduction-and-replication-in-scientific-research-part-1/

 
