The open access sharing of machine-readable research data has been attracting the attention of academia, funding agencies, private companies and civil society since 2013, when a report was published on its potential scientific and financial impact, including its impact on the productive sector.
Since then, there has been significant progress in the discussion of technological, methodological, and ethical issues in data sharing practices. An open access peer-reviewed scientific journal – Scientific Data – was launched by the Nature Publishing Group (NPG) to publish research datasets exclusively. A recent article in this journal describes the initiative Making Data Count1, which aims to promote data sharing, to foster its recognition as citable scientific output and, especially, to gauge metrics of the scientific impact of datasets.
One way to encourage the production and publication of data is to establish metrics to assess their significance and impact, since impact is the academic world’s currency. Academia traditionally uses citation-based metrics, but more recently, alternative metrics based on social media (‘altmetrics’) provide quick evaluation after publication and diversify the ways to measure impact. Several groups have conducted studies on traditional, alternative and innovative metrics for research data.
Making Data Count is the result of collaboration between the California Digital Library (CDL), the Public Library of Science (PLoS) and the Data Observation Network for Earth (DataOne), which aims not only to define, but also to implement a set of metrics for research data. The homepage of the initiative states “Sharing data is time consuming and researchers need incentives for undertaking the extra work. Metrics for data will provide feedback on data usage, page-views, and impact that will help encourage researchers to share their data. This project will explore and test the metrics (“data-level metrics”) to capture activity surrounding research data”.
The creation of Making Data Count was preceded by an online survey conducted in November and December 2014, answered by 247 researchers and 73 repository administrators. Most researchers (78%) were from academic institutions, located mainly in the US (57%) and the UK (14%). Among administrators, 64% were from academic repositories and 22% from government-managed repositories, based mainly in the US (72%) and the UK (11%).
The questions posed covered the proportion of data shared by the authors, where they look for public data to reuse, how frequently they use such data, what they would want to know about those reusing their data, their interest in the impact of their data, and which metrics the repositories housing their data track and display.
Regarding the form of sharing data, researchers reported frequently fulfilling personal requests, for example via e-mail. The disadvantages of this practice range from failing to provide the data requested to the impossibility of measuring impact. Fortunately, 75% of respondents reported sharing part of their data through repositories specifically designed for this purpose.
Concerning the search for data to reuse, the majority of respondents reported searching various sources, including journal articles, databases, Internet search engines, social media and discussion forums within their communities.
Open data is frequently used at some stage of developing a research project, according to 96% of respondents. Researchers use such data to a comparable extent across the various stages of a research project, and 70% reported using it to reach their main conclusion. These facts underscore the importance of open data in the way research is currently performed.
Researchers who share their data are interested in knowing who uses it and for what purpose. Repositories satisfy part of this interest: about half request detailed contact information (name, institution and e-mail), while the other half collects no information at all. For data involving confidentiality, repositories require detailed identification of users, whereas easier access to data makes it more likely to be used. As a compromise, repositories may ask users for the area of expertise in which they intend to use the open data, which lets administrators know where the data is being used while preserving the confidentiality of users.
Data-level metrics are of potential interest to managers, funding agencies and the researchers who generated the data. Most researchers (85%) and repository administrators (61%) who were surveyed ranked citations as the most prestigious metric, followed by downloads and, finally, page views, which is consistent with article- and journal-level metrics. Administrators also reported which metrics their repositories track and display. Almost all monitor downloads, and most also monitor page views, but less than half display these results. Only a small percentage (about 20%) tracks citations to individual datasets or to repositories in general, despite their prestige, presumably because of the difficulty in obtaining them.
Researchers, mostly academics, value citations, but their use for open data is limited, since datasets are rarely cited. Most researchers, however, believe that “formal citation would be a fair condition for data sharing”1.
The complex issue of credit for research data was recently addressed in the document Joint Declaration of Data Citation Principles (DC1), prepared by the international working group FORCE11 and previously reported on this blog2. To date, the Declaration has been signed by 94 repositories, publishers, and prestigious academic institutions. In line with the DC1 principles, the Making Data Count initiative recommends that formal citation of open data be encouraged, in addition to collecting and making available download counts, which are easier to obtain and already enjoy a certain standing among researchers. Last but not least is counting article-level metrics and alternative metrics based on bookmarks and social media such as Mendeley, CiteULike, Facebook, Twitter and others.
It is expected, therefore, that initiatives such as Making Data Count, and others to come, will meet the goal of measuring the impact of open data and thereby increase its availability and use by researchers.
1. KRATZ, J. E. and STRASSER, C. Making data count. Scientific Data. 2015, 150039 [online]. DOI: http://dx.doi.org/10.1038/sdata.2015.39
2. SPINAK, E. Principles for the citation of scientific data. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2015/01/15/principles-for-the-citation-of-scientific-data/
Dataset Level Metrics Subject Group. Consortia Advancing Standards in Research Administration Information (CASRAI). Available from: http://casrai.org/standards/subject-groups/dataset-level-metrics
KRATZ, J. E. and STRASSER, C. Making data count. Scientific Data. 2015, 150039 [online]. DOI: http://dx.doi.org/10.1038/sdata.2015.39
MARTONE, M. Joint Declaration of Data Citation Principles. FORCE11, Data Citation Synthesis Group, 2014. Available from: https://www.force11.org/group/joint-declaration-data-citation-principles-final
NISO Alternative Assessment Metrics (Altmetrics) Initiative. National Information Standards Organization (NISO). Available from: http://www.niso.org/topics/tl/altmetrics_initiative/
RDA/WDS Publishing Data Bibliometrics WG Case Statement. Research Data Alliance. Available from: http://rd-alliance.org/group/rdawds-publishing-data-bibliometrics-wg/case-statement/rdawds-publishing-data-bibliometrics-wg
SCIENTIFIC ELECTRONIC LIBRARY ONLINE. The Open Data movement: international consolidation. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2014/07/14/the-open-data-movement-international-consolidation/
SPINAK, E. and PACKER, A. Scientific Data: Nature Publishing Group moves the communication of scientific data forward with its new online open access publication. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2014/02/04/scientific-data-nature-publishing-group-moves-the-communication-of-scientific-data-forward-with-its-new-online-open-access-publication/
SPINAK, E. Exchange of research data remains low and increases slowly. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2014/11/12/exchange-of-research-data-remains-low-and-increases-slowly/
SPINAK, E. International Open Data Week – what’s new?. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2015/01/07/international-open-data-week-whats-new/
SPINAK, E. Open-Data: liquid information, democracy, innovation… the times they are a-changin’. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2013/11/18/open-data-liquid-information-democracy-innovation-the-times-they-are-a-changin/
SPINAK, E. Principles for the citation of scientific data. SciELO in Perspective. [viewed 18 September 2015]. Available from: http://blog.scielo.org/en/2015/01/15/principles-for-the-citation-of-scientific-data/
Altmetric – <http://www.altmetric.com/>
Making Data Count: Project to develop Data-Level Metrics – <http://mdc.lagotto.io/>
Scientific Data – <http://www.nature.com/sdata/>
Lilian Nassi-Calò studied chemistry at Instituto de Química – USP, holds a doctorate in Biochemistry by the same institution and a post-doctorate as an Alexander von Humboldt fellow in Wuerzburg, Germany. After her studies, she was a professor and researcher at IQ-USP. She also worked as an industrial chemist and presently she is Coordinator of Scientific Communication at BIREME/PAHO/WHO and a collaborator of SciELO.
Translated from the original in Portuguese by Lilian Nassi-Calò
How to cite this post [ISO 690/2010]: