SciELO Data repository in regular operation

By Solange Santos, Carolina Tanigushi, and Abel L Packer

The third development cycle of the SciELO Program envisions its positioning as an open science program. Accordingly, the SciELO Program is formally incorporating open science practices into its principles and into the strategy, objectives, functions, and work plans of the SciELO Publishing Model¹ (Portuguese only).

Under this new modus operandi, it is up to each journal to renew its editorial policy and management so as to accept manuscripts deposited on reliable preprint servers; to cite, reference, and share the content underlying the articles’ texts; and to offer options that bring greater transparency and quality to the peer review process.

In August 2020, the SciELO Data research data repository was launched, and it operated experimentally until January 2022. SciELO Data has a multidisciplinary scope and handles the deposit, preservation, and dissemination of research data from articles submitted, approved for publication, or already published in journals of the SciELO Network, or posted on SciELO Preprints. Research data are handled as digital files containing numerical data, computer code, texts, schematics, and other content underlying the articles’ texts.

SciELO Data uses the open source Dataverse platform, developed by the Institute for Quantitative Social Science (IQSS) at Harvard University. As of June 2022, there are 79 data repositories operating with Dataverse spread across the world.

A Dataverse repository can host multiple collections, known as dataverses, which can themselves be nested recursively. In the context of SciELO Data, each journal has its own dataverse, as does the SciELO Preprints server. A journal may eventually have more than one dataverse. Journals are responsible for managing their dataverses with the support of the SciELO team, so that research data are shared in accordance with the editorial policies of each journal and the specifics of its research community.

Dataverses are made up of sets of data files, or datasets. Each set and each of its files is always accompanied by metadata. In this way, in addition to storing, preserving, and publishing research data, SciELO journals will increase research visibility through sharing, citing, exploring, and analyzing research data.

Diagram showing the levels of the SciELO Data dataverse collection. The SciELO Data dataverse collection encompasses the dataverses of journals that have the datasets within them.
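The hierarchy described above can be sketched as a small data model. This is only an illustrative sketch, not the Dataverse software's own data model: the class names, field names, and sample values below are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class DataFile:
    """A single data file; every file carries its own metadata."""
    name: str
    metadata: dict = field(default_factory=dict)


@dataclass
class Dataset:
    """A set of data files, itself described by metadata."""
    title: str
    metadata: dict = field(default_factory=dict)
    files: list = field(default_factory=list)


@dataclass
class Dataverse:
    """A collection that holds datasets and, recursively, other dataverses."""
    name: str
    datasets: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def iter_datasets(self):
        """Yield the datasets of this dataverse, then those of its children."""
        yield from self.datasets
        for child in self.children:
            yield from child.iter_datasets()


# Hypothetical content: one journal dataverse and the preprints dataverse,
# both nested inside the top-level SciELO Data collection.
journal = Dataverse(
    "Example Journal",
    datasets=[
        Dataset(
            "Survey responses",
            metadata={"authors": ["A. Author"]},
            files=[DataFile("responses.csv", {"format": "text/csv"})],
        )
    ],
)
preprints = Dataverse("SciELO Preprints")
root = Dataverse("SciELO Data", children=[journal, preprints])

titles = [dataset.title for dataset in root.iter_datasets()]
print(titles)
```

Because a dataverse can contain other dataverses, walking the whole collection is naturally recursive, which is what `iter_datasets` illustrates.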

In this context, the SciELO Data workflow occurs as follows:

  1. Researchers prepare the dataset during their research (numerical data files in different formats, computer code, documents, schematics, audio, video, etc.) following the Research data preparation guidelines;²
  2. The researchers deposit the dataset in the journal’s dataverse in SciELO Data following the Research data deposit guidelines;³
  3. The researchers send the dataset to be curated by the journal’s editorial team;
  4. The journal receives a notification from SciELO Data about the deposit of the dataset;
  5. The journal curates the dataset in accordance with the Research data curation guidelines for editorial teams;⁴
  6. Once the dataset meets the established standards, the journal informs the SciELO Data team that the dataset is ready for review by the SciELO team;
  7. The SciELO Data team reviews the dataset;
  8. SciELO informs the journal’s editorial team that the review has been completed; and,
  9. The editorial team decides on the publication of the dataset.
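One way to picture the nine steps above is as a sequence of dataset states. The sketch below is a simplified illustration and does not reflect how SciELO Data or Dataverse implement the workflow: the state names are assumptions, only the path that ends in publication is modeled, and returning a dataset to the authors for corrections is left out because the steps above do not describe it.

```python
from enum import Enum


class Status(Enum):
    """Hypothetical dataset states, each covering one or two workflow steps."""
    DEPOSITED = "prepared and deposited by the researchers"   # steps 1-2
    SUBMITTED = "sent to the journal for curation"            # steps 3-4
    JOURNAL_CURATED = "curated by the journal"                # steps 5-6
    SCIELO_REVIEWED = "reviewed by the SciELO Data team"      # steps 7-8
    PUBLISHED = "published by decision of the editorial team"  # step 9


# Each state has exactly one successor on the path to publication.
TRANSITIONS = {
    Status.DEPOSITED: Status.SUBMITTED,
    Status.SUBMITTED: Status.JOURNAL_CURATED,
    Status.JOURNAL_CURATED: Status.SCIELO_REVIEWED,
    Status.SCIELO_REVIEWED: Status.PUBLISHED,
}


def advance(status):
    """Return the next state, or raise ValueError if the state is final."""
    if status not in TRANSITIONS:
        raise ValueError(f"{status.name} has no next step")
    return TRANSITIONS[status]


def path_to_publication(start=Status.DEPOSITED):
    """List every state from `start` up to and including the final one."""
    path = [start]
    while path[-1] in TRANSITIONS:
        path.append(TRANSITIONS[path[-1]])
    return path


for state in path_to_publication():
    print(f"{state.name}: {state.value}")
```

Keeping the transitions in a single table makes the order of responsibilities explicit: researchers, then the journal, then the SciELO Data team, and finally the editorial team's decision.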

SciELO Data flowchart

To promote research visibility, the content of the SciELO Data repository is indexed and retrievable by some of the most important data search and indexing services, such as Google Dataset Search, re3data, FAIRsharing, Repository Finder, and Ranking Web of Repositories.

Adopting open science practices entails reviewing and enriching methodologies and concepts, as well as facing the challenge of adding new functionalities, which are often perceived as disruptive to scholarly communication.

In this sense, one of the biggest challenges for the consolidation of SciELO Data is precisely the difficulty and resistance that researchers show in sharing their research data, and that journals show in updating their editorial policies to make data sharing an integrated practice in the editorial process. Added to this are the time and energy needed to learn and implement these changes, which demand new activities, aligned with the Open Science modus operandi, from the different actors involved (authors, reviewers, editors, editorial teams).

As a reliable data repository, SciELO Data has, in addition to the long-term preservation of research data, the role of facilitating the sharing and reuse of the data that support the analyses, results, discussions, and conclusions described in the manuscripts. It should also be noted that publishing these research data increases the transparency of the peer review process and facilitates research reproduction and replication.

Since January 2021, the SciELO Program has promoted the training of a technical group in research data management and in operating the Dataverse system; this group is responsible for the development of SciELO Data. The group prepared, and keeps updated, extensive documentation to support the operation, which can be consulted in Portuguese, English, and Spanish on the SciELO Data webpage on the SciELO Network portal.⁵ Responsible for technical assistance to authors, journals, and the coordination of the SciELO Network, the group can be contacted by the email <>.


1. SciELO – modelo de publicação eletrônica para países em desenvolvimento [online]. 2019 [viewed 24 August 2022]. Available from:

2. Research data preparation guidelines [online]. 2022 [viewed 24 August 2022]. Available from:

3. Research data deposit guidelines [online]. 2022 [viewed 24 August 2022]. Available from:

4. Research data curation guidelines for editorial teams [online]. 2022 [viewed 24 August 2022]. Available from:

5. SciELO Data |


Launch of the SciELO Data repository [online]. MailChimp. 2020 [viewed 24 August 2022]. Available from:

External links

DataCite Repository Selector:

Dataset Search:

FAIRsharing | SciELO Data:


Metrics | The Dataverse Project:

Ranking Web of Repositories:

SciELO Data |

SciELO Data:

SciELO Preprints:

The Dataverse Project:


Translated from the original in Portuguese by Lilian Nassi-Calò.


How to cite this post [ISO 690/2010]:

SANTOS, S., TANIGUSHI, C., and PACKER, A.L. SciELO Data repository in regular operation [online]. SciELO in Perspective, 2022 [viewed ]. Available from:

