By Bruno Schaefer, Luiz Augusto Campos, and Marcia Rangel Candido
1. Introduction
As of this year, Revista DADOS will have an editorial department specifically set up to deal with issues of the replicability of its articles. Since its foundation in 1966, the publication has built its name on a commitment to producing objective and valid information about the social world. This commitment included a break with essayism in favor of a more systematic research view, which led to the publication of manuscripts strongly supported by empirical evidence.
Continuing its tradition of innovation, DADOS is taking part, along with other Brazilian and international journals, in a revolution in the scientific world driven by the open science movement, which, among other things, concerns the processes of replicating data and making data available. To keep up with recent changes, we have created a Replicability Editorial Office, now headed by Bruno Schaefer, a professor at Universidade do Estado do Rio de Janeiro’s (UERJ) Institute of Social and Political Studies.
This text is divided into seven sections and discusses the importance of replicability in the social sciences; the concept of replicability that we use in DADOS’ editorial process; the differences in replicability between quantitative and qualitative research; international and national experiences on the subject; the ethical issues involved in replicability; and, finally, the impacts on the workflow and management of submissions to the journal.
2. What does replicability mean and why does it matter?
For almost two decades, the debate about the “replicability crisis” has been a specter haunting scientific practice, whether in the “hard” sciences (chemistry, physics, and biology) or in the social sciences (psychology, economics, political science, and sociology, among others). Ioannidis’ paper, Why Most Published Research Findings Are False,1 published in 2005, postulated that a large number of randomized studies in epidemiology include false results, since they are not replicable. This finding mainly involved two problems: studies with a low number of cases analyzed and statistical significance bias. The problem, therefore, was the recurrence of experiments with little empirical evidence that “exaggerate too much” in order to reach a p-value below 5%, a conventional threshold for statistical significance. In the field of psychology, an effort to replicate 100 experiments that assumed causal inferences managed to find the same results in fewer than half of them. In political science, the recent article Transparency and Replication in Brazilian Political Science: A First Look,2 published in DADOS, indicated an even worse performance for Brazilian output. From a corpus of 197 papers with some kind of quantitative analysis, only 28% of their respective authors agreed to share their data and code, of which only 14% were able to attempt replication, with a 5% success rate. The most common problems in the replication process involved the absence of a computer routine (script), problems with the results, and problems with the data.
Before we analyze what is behind this “replicability crisis”, it is important to understand what the concept means. For Janz (2016), replication means “the process by which a published article’s hypotheses and findings are re-analyzed to confirm or challenge the results”.3 For King (1995), replicability means: “(…) that sufficient information exists with which to understand, evaluate, and build upon a prior work if a third party could replicate the results without any additional information from the author.”4
Despite the differences between the authors, one common point is the idea that replicable research is that which clearly provides the process of collecting, processing, and analyzing data, so that a third party can follow the same path and find similar results, either by analyzing the same empirical material (database, for example) or by applying the research design to other cases.
Figueiredo Filho et al. (2019)5 put forward seven reasons why we should take replicability in the social sciences seriously:
- The availability of data prevents errors and misconduct. In the first case, researchers can make mistakes in the data analysis process that will be corrected, since the empirical material and analysis techniques are available to referees and the scientific community at large. In the second case, taking replication seriously allows misconduct (data invention, p-hacking, among other frauds) to be spotted;
- Thinking about research based on replication patterns makes it easier to conduct the analysis. When we know that our analysis can be replicated, we make an extra effort to make our ideas and choices clearer;
- Replication facilitates the process of evaluating papers. Without some possibility of replication, we are forced to rely blindly on what is written, which greatly limits evaluation;
- Replicable materials help to accumulate knowledge and develop the scientific field. Not only does replication itself provide greater validation of scientific discoveries, but it also guarantees accessibility to evidence and databases that were previously completely inaccessible to a wider public;
- Replicability boosts the researchers’ reputation;
- Making research material available helps in the process of learning and training new researchers;
- Replicability increases the article’s impact. Papers that publish their databases have more citations than papers that do not (Christensen, et al. 2019).6
3. Replicability, reproduction, and transparency
The concept of replicability is often used synonymously with other, equally important practices such as reproducibility or transparency, which can lead to confusion and noise. Transparency is a broader concept, which involves communicating in a clear and open manner how the research procedures were carried out and how the path from the research question to the results was traveled. In this sense, the idea of transparency is in dialogue with the open science movement: “broad access to the sources of knowledge involved in and produced by research is intended to maximize the raison d’être of science as a cooperative cultural and social enterprise.”7 Being transparent involves the ability to communicate. Reproduction, or reproducibility, in turn, involves making each step of the research available, usually as scripts or computer routines that make it possible to reproduce the work. A reproducible piece of work is one in which re-analysis of the same data using the same methods would produce the same results.
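As a rough illustration of reproducibility in this narrow sense, the Python sketch below runs the same analysis twice on the same data and compares a fingerprint of the results. The toy dataset, the `analyze` function, and the fingerprinting scheme are all hypothetical and serve only as an example; they do not describe any procedure used by DADOS.

```python
import hashlib
import json
import statistics

# Hypothetical toy dataset: (years of schooling, monthly income) pairs.
DATA = [(8, 1200.0), (11, 1800.0), (15, 3500.0), (16, 4100.0), (4, 900.0)]

def analyze(rows):
    """Return summary statistics rounded to a fixed precision."""
    schooling = [r[0] for r in rows]
    income = [r[1] for r in rows]
    return {
        "n": len(rows),
        "mean_schooling": round(statistics.mean(schooling), 4),
        "mean_income": round(statistics.mean(income), 4),
    }

def fingerprint(results):
    """Hash a canonical JSON rendering of the results."""
    canonical = json.dumps(results, sort_keys=True)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

original = analyze(DATA)        # the "published" run
rerun = analyze(list(DATA))     # an independent re-run on a copy of the data
is_reproducible = fingerprint(original) == fingerprint(rerun)
print(is_reproducible)
```

Rounding before hashing is a design choice in this sketch: it keeps harmless floating-point differences between machines from being read as a failure to reproduce.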
Research can only be replicable if it is transparent, just as research can only be reproducible if it is replicable. The concept of replicability involves the need for clarity in the knowledge production process, which may or may not be reproducible using computer routines. In other words, all replicable research is transparent, but not necessarily reproducible stricto sensu, as it can use data collection and analysis methods that are not directly reproducible (ethnography and other qualitative methods), or that do not use computer routines (scripts). The concept of replicability is also broader and underlies the idea that the same research design can be used for other empirical material.
A large part of the “replicability crisis” involves the failure to replicate experiments in other contexts. For example, an experiment with undergraduate students that finds positive results for the hypothesis that people tend to obey authorities blindly (Milgram’s experiment) should be replicable in another context (other students and another university).
In DADOS, we have adopted the concept of replicability, the idea that research should be clear and transparent about the methodological step-by-step (path between problem and answers), make available all the data necessary for the same results to be found and, where possible, share computer routines that facilitate the reproduction of findings.
4. Replicability in quantitative and qualitative research
The debate on replicability and reproducibility is very rich and we do not intend to cover it all here. The point we would like to make is that, in the social sciences, we often deal with problems that are not directly replicable. The debate between qualitative and quantitative research highlights this issue. In the mid-1990s, King, Keohane & Verba (henceforth KKV) published a seminal book on methodology in the social sciences, Designing Social Inquiry: Scientific Inference in Qualitative Research.8 The argument put forward by the authors is that the aim of the social sciences is to construct valid descriptive or explanatory inferences, and that qualitative and quantitative research would have the same logic. The essence of science would be the method, not the subjects dealt with.
In this sense, adherents of the qualitative approach should pay attention to the use of methodological strategies already used by quantitativists (especially those imported from statistics), which are capable of building valid inferences. Between the description of a phenomenon and the search for one or more causes, research should focus on the search for causality(ies). For the authors, inference therefore refers to the process in which we use known (and available) information to learn about unknown (and unavailable) information.
Criticism of the KKV proposal has come from several quarters. Brady & Collier (2004), for example, in Rethinking Social Inquiry: Diverse Tools, Shared Standards,9 tackle KKV’s notion that the structure of the quantitative approach would be the only possibility of achieving valid inferences or a standard of scientificity. For Haverland and Yanow (2012), among others, it would also be necessary to differentiate between methods and methodology. According to these authors, the two terms are frequently confused, which affects the construction of research and the analysis of results. While method refers to the tools and techniques used in a piece of work, methodology refers to a broader level, which concerns the ontological and epistemological constructs that guide the adoption of one method or another. It is precisely at this point that it becomes necessary to differentiate between the construction of knowledge proposed by quantitative and qualitative approaches. For researchers guided by a quantitative design, the main issue would be to “explain” a given phenomenon (roughly speaking, the effect of X₁ and X₂ on Y), whereas researchers guided by a qualitative research design tend to focus on the interpretation and meaning of certain results.
In A Tale of Two Cultures: qualitative and quantitative research in social sciences,10 Goertz & Mahoney (2012) propose a possible integration between qualitative and quantitative research. For the authors, it is necessary to consider that these approaches start from different epistemological positions. Quantitative research starts from an objectivist epistemology (to avoid using the term positivism, which is used erroneously most of the time), while qualitative research starts from a constructivist or interpretivist epistemology. This difference is even mathematical, since the former are based on statistics and probability, while the latter are based on logic and set theory. Within these approaches themselves, or “cultures”, there would also be divisions: quantitative research interested in making causal or descriptive inferences (advances in computing and machine learning, among others); and “qualitative” research focused on interpretation and the production of meaning (with a capital Q) or working with qualitative methods guided by objectivist epistemology, such as QCA, process tracing, among others.
The aforementioned distinctions are of interest here insofar as they relate to the debate on replicability. Quantitative studies are usually more replicable because, ideally, they use structured databases, computer routines and analysis methods that can be reproduced as well as extended. Qualitative research using methods such as QCA or process tracing follows similar patterns. Other techniques and methods, however, are by their nature non-reproducible. How do you redo an ethnography? Go back in time and observe the same phenomenon with the same eyes? Therefore, for qualitative research based on an interpretive epistemology, DADOS adopts the standard that authors must be as transparent as possible in describing their methods, and it is desirable that, together with the papers, they send methodological annexes that can be published: videos, transcripts and recordings of interviews, and field diaries, among others.
Making these materials from qualitative research available fulfills two additional functions. Firstly, it ensures that complementary information is available beyond the increasingly narrow confines of an academic article. Secondly, it helps to preserve data from qualitative research, which is often lost in personal or restricted archives. For all these reasons, DADOS strongly recommends making available evidence from qualitative research (interview transcripts, videos, recordings, coding used for content analysis, and field diaries, among others).
5. National and international experiences
Although the “replicability crisis” has caught the attention of scientists around the world, editorial policies that actually encourage greater transparency, replication and reproducibility are in the minority in the social sciences (Gherghina and Katsanidou 2013). In the recent period, there has been progress in efforts to make data available and in the adherence of journals to the open science movement. In political science and sociology, DADOS’ main areas of activity, it is possible to identify a marked advance in journals with a higher Impact Factor, characterized by the institution of replicability policies in cases such as Political Analysis, the American Political Science Review, the American Journal of Political Science, and Sociological Methods & Research. The British Journal of Political Science, for example, now requires authors to deposit their data, the code book, the computer routine and the tables, graphs and figures that generated the analysis.
In the national context, the Brazilian Political Science Review was a pioneer in making article data available in the Dataverse repository and, more recently, adopted a data curation process: the analyses are reproduced by the journal’s editors and, once the same results are found, the article is published.
In a broader sense, initiatives such as the Rede Brasileira de Reprodutibilidade (RBR) seek to bring together different organizations and areas of knowledge: “(…) to promote rigorous, reliable and transparent science in Brazil”.11 The creation of a Brazilian data repository, Lattes Data, is also an important step.
6. Ethical issues of replicability
The quest for replicability addresses important ethical issues, ranging from controlling bad scientific practices to making valuable information available to society, which often funds research with public resources. But depending on the nature of the data, replicability can give rise to ethical problems, which almost always have to do with the risk of direct or indirect identification of the individuals or organizations that are the focus of research.
Direct identification occurs when elements of the identity of an individual or organization are explicitly included in the databases sent for replication. This is not always a problem; on the contrary, public figures such as politicians and civil servants have much of their personal data publicized precisely so that there is greater civil control of their activities. However, this does not apply to everyone. There are subjects whose exposure is sensitive, such as minors or people in conflict with the law. In such cases, databases or evidence are often de-identified, by either deleting or modifying identifying variables (name, ID, address, etc.).
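To make the two de-identification strategies concrete, here is a minimal Python sketch that deletes direct identifiers and replaces the ID with a pseudonymous key. The column names, the `deidentify` helper, and the salted-hash scheme are assumptions chosen for illustration, not a DADOS requirement; a salted hash is only one possible way of modifying an identifier and is not, by itself, a guarantee of anonymity.

```python
import hashlib

# Hypothetical column names; real databases will differ.
DIRECT_IDENTIFIERS = {"name", "id_number", "address"}

def pseudonym(value, salt):
    """Replace an identifier with a salted hash, truncated for readability."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return digest[:12]

def deidentify(records, salt):
    """Drop direct identifiers and add a stable pseudonymous case key."""
    cleaned = []
    for record in records:
        kept = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
        kept["case_key"] = pseudonym(record["id_number"], salt)
        cleaned.append(kept)
    return cleaned

records = [
    {"name": "Person A", "id_number": "000001", "address": "Street X",
     "age": 17, "region": "Southeast"},
    {"name": "Person B", "id_number": "000002", "address": "Street Y",
     "age": 34, "region": "North"},
]

# The salt must stay with the authors; publishing it would weaken the scheme.
public_records = deidentify(records, salt="keep-this-secret")
print(public_records)
```

The pseudonymous key lets a replicator link rows belonging to the same case across files without learning who the case is.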
Indirect identification, on the other hand, can happen when data that has already been de-identified still allows for detailed knowledge of the cases. This can happen in databases that gather much information about specific cases. Even if I do not know any personal information about a given case, I can locate it in the world because the database contains a lot of indirect information (race, gender, region, education, age, etc.). Although more difficult to evaluate, these cases should be judged jointly by authors and editors to ensure the greatest replicability without exposing the study populations to any risk.
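One simple way to screen for this risk is to count how many cases share each combination of indirect attributes and flag the combinations that are rare. The Python sketch below does this for a toy dataset; the attribute names, the `risky_combinations` helper, and the threshold `k` are hypothetical, and such a count is only a first check that does not replace the joint judgment of authors and editors.

```python
from collections import Counter

# Hypothetical quasi-identifiers, following the examples in the text.
QUASI_IDENTIFIERS = ("race", "gender", "region", "education", "age_group")

def risky_combinations(records, k=2):
    """Return quasi-identifier combinations shared by fewer than k records."""
    counts = Counter(tuple(r[q] for q in QUASI_IDENTIFIERS) for r in records)
    return [combo for combo, n in counts.items() if n < k]

records = [
    {"race": "white", "gender": "F", "region": "South",
     "education": "secondary", "age_group": "30-39"},
    {"race": "white", "gender": "F", "region": "South",
     "education": "secondary", "age_group": "30-39"},
    {"race": "black", "gender": "M", "region": "North",
     "education": "doctorate", "age_group": "60-69"},
]

# Combinations that describe a single case are candidates for
# coarsening (e.g. wider age groups) or suppression before release.
flagged = risky_combinations(records, k=2)
print(flagged)
```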
7. The Data Replicability Editorial Office
Having considered the conceptual and situational aspects, in this section we describe how the Replicability Editor in DADOS will work. It is important to start by pointing out that the creation of this function is in line with recent initiatives to modernize the journal, which has joined the open science movement, starting to receive preprint submissions, and requiring the submission of databases to encourage transparency in research evaluations. Moreover, we have also improved scientific dissemination actions and instituted policies to promote diversity and gender and race equity among reviewers and authors.
In general terms, the journal already has guidelines in its submission rules for researchers to send in their detailed research materials, computer routines and other information when submitting the manuscript for evaluation. This facilitates the preliminary verification work of the editors and referees who, however, do not have the same responsibilities as a replicability editor. DADOS has a webpage on the Dataverse portal, which will only publish what has been authorized by the replicability editor following desk and peer reviews.
The main change, therefore, is that the journal now has a specific editor to curate the scientific evidence presented in the manuscripts. This means that, even if accepted, articles will only be published when their analytical material has been verified as reproducible by the replicability editor. The process from submission to publication will be as follows:
- Submission of the article (in preprint or traditional format);
- Desk-review;
- Appointment of reviewers;
- Authors’ comments and responses to reviews;
- Approval (or not);
- Data curation (replication of article findings by journal editors and assistants), which will involve communication between authors and the journal;
- Publishing the article and making the data available in the Dataverse repository.
The action aims to ensure greater security and rigor in the findings we publish, to contribute more broadly to the process of building knowledge in the social sciences, and to align DADOS’ editorial practices with cutting-edge national and international replicability standards, thus following the open science paradigm. In the midst of the rapid transformation of scientific work and communication tools with the advent of various artificial intelligence resources, the promotion of transparency is becoming increasingly necessary and beneficial for exchanges between the academic community and the general public, and it can also be a tool that boosts trust in science.
Notes
1. IOANNIDIS, J.P.A. Why Most Published Research Findings Are False. PLOS Medicine [online]. 2005, vol. 2, no. 8, e124. https://doi.org/10.1371/journal.pmed.0020124. Available from: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124
2. AVELINO, G., DESPOSATO, S. and MARDEGAN, I. Transparency and Replication in Brazilian Political Science: A First Look. Dados rev. ciênc. sociais [online]. 2021, vol. 64, no. 3, e20190304 [viewed 20 October 2023]. https://doi.org/10.1590/dados.2021.64.3.242. Available from: https://www.scielo.br/j/dados/a/4pMrxZVYv4pXypfGrRr55Nx/
3. JANZ, N. Bringing the Gold Standard into the Classroom: Replication in University Teaching. International Studies Perspectives [online]. 2016, vol. 17, no. 4, pp. 392–407 [viewed 20 October 2023]. https://doi.org/10.1111/insp.12104. Available from: https://academic.oup.com/isp/article-abstract/17/4/392/2528285?redirectedFrom=fulltext
4. KING, G. Replication, Replication. PS: Political Science & Politics [online]. 1995, vol. 28, no. 3, pp. 444-452 [viewed 20 October 2023]. https://doi.org/10.2307/420301. Available from: https://www.cambridge.org/core/journals/ps-political-science-and-politics/article/abs/replication-replication/85C204B396C5060963589BDC1A8E7357
5. FIGUEIREDO FILHO, D. et al. Seven Reasons Why: A User’s Guide to Transparency and Reproducibility. Bras. Political Sci. Rev. [online]. 2019, vol. 13, no. 2 [viewed 20 October 2023]. https://doi.org/10.1590/1981-3821201900020001. Available from: https://www.scielo.br/j/bpsr/a/sytyL4L63976XCHfK3d7Qjh/
6. CHRISTENSEN, G. et al. A Study of the Impact of Data Sharing on Article Citations Using Journal Policies as a Natural Experiment. PLoS One [online]. 2019, vol. 14, no. 12, e0225883 [viewed 20 October 2023]. https://doi.org/10.1371/journal.pone.0225883. Available from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0225883
7. Open Science and the new research communication modus operandi – Part II [online]. SciELO in Perspective, 2019 [viewed 20 October 2023]. Available from: https://blog.scielo.org/en/2019/08/01/open-science-and-the-new-research-communication-modus-operandi-part-ii/
8. KING, G., KEOHANE, R.O. and VERBA, S. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press, 2021.
9. COLLIER, D. and BRADY, H.E. Rethinking Social Inquiry: Diverse Tools, Shared Standards. Lanham: Rowman & Littlefield Publishers, 2004.
10. GOERTZ, G. and MAHONEY, J. A tale of two cultures: Qualitative and quantitative research in the social sciences. Princeton: Princeton University Press, 2012.
11. Rede Brasileira De Reprodutibilidade – Site Institucional: https://www.reprodutibilidade.org
References
AVELINO, G., DESPOSATO, S. and MARDEGAN, I. Transparency and Replication in Brazilian Political Science: A First Look. Dados rev. ciênc. sociais [online]. 2021, vol. 64, no. 3, e20190304 [viewed 20 October 2023]. https://doi.org/10.1590/dados.2021.64.3.242. Available from: https://www.scielo.br/j/dados/a/4pMrxZVYv4pXypfGrRr55Nx/
CHRISTENSEN, G. et al. A Study of the Impact of Data Sharing on Article Citations Using Journal Policies as a Natural Experiment. PLoS One [online]. 2019, vol. 14, no. 12, e0225883 [viewed 20 October 2023]. https://doi.org/10.1371/journal.pone.0225883. Available from: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0225883
DINIZ, D. Ética na pesquisa em ciências humanas: novos desafios. Ciência & Saúde Coletiva [online]. 2008, vol.13, no. 2, pp. 417–26 [viewed 20 October 2023]. https://doi.org/10.1590/S1413-81232008000200017. Available from: https://www.scielo.br/j/csc/a/QDNVw9nGF7X7b8Kf4LNvRVs/
FIGUEIREDO FILHO, D. et al. Seven Reasons Why: A User’s Guide to Transparency and Reproducibility. Bras. Political Sci. Rev. [online]. 2019, vol. 13, no. 2 [viewed 20 October 2023]. https://doi.org/10.1590/1981-3821201900020001. Available from: https://www.scielo.br/j/bpsr/a/sytyL4L63976XCHfK3d7Qjh/
GHERGHINA, S. and KATSANIDOU, A. Data Availability in Political Science Journals. European Political Science [online]. 2013, vol. 12, no. 3, pp. 333–49 [viewed 20 October 2023]. https://doi.org/10.1057/eps.2013.8. Available from: https://link.springer.com/article/10.1057/eps.2013.8
GOERTZ, G. and MAHONEY, J. A tale of two cultures: Qualitative and quantitative research in the social sciences. Princeton: Princeton University Press, 2012.
IOANNIDIS, J.P.A. Why Most Published Research Findings Are False. PLOS Medicine [online]. 2005, vol. 2, no. 8, e124. https://doi.org/10.1371/journal.pmed.0020124. Available from: https://journals.plos.org/plosmedicine/article?id=10.1371/journal.pmed.0020124
JANZ, N. Bringing the Gold Standard into the Classroom: Replication in University Teaching. International Studies Perspectives [online]. 2016, vol. 17, no. 4, pp. 392–407 [viewed 20 October 2023]. https://doi.org/10.1111/insp.12104. Available from: https://academic.oup.com/isp/article-abstract/17/4/392/2528285?redirectedFrom=fulltext
KIDDER, L.H. and FINE, M. Qualitative and Quantitative Methods: When Stories Converge. New Directions for Program Evaluation [online]. 1987, vol. 35, pp. 57–75 [viewed 20 October 2023]. https://doi.org/10.1002/ev.1459. Available from: https://onlinelibrary.wiley.com/doi/10.1002/ev.1459
KING, G., KEOHANE, R.O. and VERBA, S. Designing Social Inquiry: Scientific Inference in Qualitative Research. Princeton: Princeton University Press, 2021.
KING, G. Replication, Replication. PS: Political Science & Politics [online]. 1995, vol. 28, no. 3, pp. 444-452 [viewed 20 October 2023]. https://doi.org/10.2307/420301. Available from: https://www.cambridge.org/core/journals/ps-political-science-and-politics/article/abs/replication-replication/85C204B396C5060963589BDC1A8E7357
MAKEL, M.C. et al. Replication is relevant to qualitative research. Educational Research and Evaluation [online]. 2022, vol. 27, no. 1, pp. 215–19 [viewed 20 October 2023]. https://doi.org/10.1080/13803611.2021.2022310. Available from: https://www.tandfonline.com/doi/full/10.1080/13803611.2021.2022310
OPEN SCIENCE COLLABORATION. Estimating the reproducibility of psychological science. Science [online]. 2015, vol. 349, no. 6251, aac4716 [viewed 20 October 2023]. https://doi.org/10.1126/science.aac4716. Available from: https://www.science.org/doi/10.1126/science.aac4716
Open Science and the new research communication modus operandi – Part II [online]. SciELO in Perspective, 2019 [viewed 20 October 2023]. Available from: https://blog.scielo.org/en/2019/08/01/open-science-and-the-new-research-communication-modus-operandi-part-ii/
PIPER, K. Science Has Been in a ‘Replication Crisis’ for a Decade. Have We Learned Anything? [online]. Vox. 2020 [viewed 20 October 2023]. Available from: https://www.vox.com/future-perfect/21504366/science-replication-crisis-peer-review-statistics.
POWNALL, M. Is Replication Possible for Qualitative Research? [online]. PsyArXiv. 2022 [viewed 20 October 2023]. https://doi.org/10.31234/osf.io/dwxeg. Available from: https://osf.io/preprints/psyarxiv/dwxeg/
SPINAK, E. Reproduction and replication in scientific research – part 1 [online]. SciELO in Perspective, 2023 [viewed 20 October 2023]. Available from: https://blog.scielo.org/en/2023/05/19/reproduction-and-replication-in-scientific-research-part-1/
External links
Instituto de Estudos Sociais e Políticos da Universidade do Estado do Rio de Janeiro (UERJ), https://iesp.uerj.br/
Lattes Data (CNPq): https://lattesdata.cnpq.br/
Rede Brasileira De Reprodutibilidade – Site Institucional: https://www.reprodutibilidade.org
Revista DADOS – Dataverse: https://dataverse.harvard.edu/dataverse/revistadados
Revista DADOS: https://www.scielo.br/j/dados/
Translated from the original in Portuguese by Lilian Nassi-Calò.