By Jan Velterop
Quality often seems to be the first thing on scientists’ lips when talking about academic journal articles. I have written about that before on this blog1. In many – my impression is most – cases, the title of the journal the article is published in serves as a proxy for quality. However, a clear definition of what constitutes quality is always missing. The journal impact factor (JIF) is the most widely used indicator of the quality of the articles published in the journal in question.
This is problematic. It has been problematic for a long time, of course. Yet the San Francisco Declaration on Research Assessment (DORA)2, which recommends “not [to] use journal-based metrics, such as Journal Impact Factors, as a surrogate measure of the quality of individual research articles, to assess an individual scientist’s contributions, or in hiring, promotion, or funding decisions”, has been signed by only 840 institutions (April 2017) out of an estimated 25,000+ world-wide. Journals with a high JIF are still considered prestigious journals, and as a general rule, prestigious journals are keen to publish articles with spectacular results. And spectacular results correlate not so much with quality as with an increased risk of being flawed. Curt Rice, professor in Norway, puts it like this: “The most prestigious journals have the highest rates of retraction, and fraud and misconduct are greater sources of retraction in these journals than in less prestigious ones”3.
An article being published in a prestigious journal is clearly not the same as it having high quality. In fact, research results can be spectacular precisely because they are statistical outliers. It is a known phenomenon that when experiments with spectacular results are subsequently replicated, the results often look less spectacular, even if the evidence is still valid – a process most likely caused by ‘regression to the mean’. Yet it is exactly the spectacular nature of statistical outliers that is so attractive to a high-prestige journal. Quoting Curt Rice again: “it is to some degree logical that you see these things, because statistical flukes are quite often very nice and very unusual. This increases the odds of being published in one of these major journals”.
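To see how regression to the mean produces this pattern, here is a minimal simulation sketch in Python (my illustration; the effect size, noise level and selection threshold are invented for the example). A ‘prestige journal’ that publishes only the most spectacular 1% of noisy study results will find that replications of exactly those studies drift back towards the true effect.

```python
import random

random.seed(1)

TRUE_EFFECT = 0.5   # assumed true underlying effect size (illustrative)
NOISE = 1.0         # assumed measurement noise per study (illustrative)
N_STUDIES = 10_000

# First round: each observed result is the true effect plus random noise.
first_round = [TRUE_EFFECT + random.gauss(0, NOISE) for _ in range(N_STUDIES)]

# Select only the most spectacular 1% of results, as a prestige journal might.
cutoff = sorted(first_round, reverse=True)[int(0.01 * N_STUDIES)]
spectacular = [x for x in first_round if x >= cutoff]

# Replicate exactly those selected studies: fresh noise around the same true effect.
replications = [TRUE_EFFECT + random.gauss(0, NOISE) for _ in spectacular]

print(f"mean of selected 'spectacular' results: {sum(spectacular) / len(spectacular):.2f}")
print(f"mean of their replications:             {sum(replications) / len(replications):.2f}")
print(f"true effect:                            {TRUE_EFFECT:.2f}")
```

Nothing in the selected studies need be fraudulent: they are simply the ones in which random noise happened to push the result upwards, which is why the replications look less spectacular even though the underlying evidence may still be valid.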
A particularly disastrous example of an article published in a prestige journal and then retracted is the now infamous Andrew Wakefield4 paper, in which autism and disorders of the gut in children were associated with vaccination. Unfortunately, it was published in what is usually seen as a quality journal: The Lancet. That gave it a credibility and false authority as a result of which it received much publicity, in the lay press as well, and consequently there is still a sizeable anti-vaccination movement, particularly in the United States (but not only there). According to a 2015 report by the Pew Research Center5, about one in ten Americans thinks vaccines are not safe. Other scientists could not replicate Wakefield’s results and his co-authors withdrew their support for the study. After conducting an official inquiry, a tribunal of the British General Medical Council concluded that Wakefield had acted dishonestly and irresponsibly. The Lancet retracted the paper, and Wakefield was struck off the UK medical register with a statement that he had deliberately falsified scientific results, but it turned out to be too little, too late. The Pew report mentions that “by then, however, the damage had already been done. Many people in the US and Europe still believe that vaccinations cause illnesses and conditions including autism in children. Despite official medical advice that says vaccines are safe and vital, many parents still worry about inoculating their children”. To a large degree this can be blamed on the prestige accorded to Wakefield’s article by The Lancet.
If you search for retractions on the internet, you will come across quite a few journal names that are familiar. They are familiar because they are widely known as prestigious ones. Retraction Watch keeps a close eye on retractions, monitoring them throughout the scientific realm. But it is unlikely that Retraction Watch catches all fraudulent articles, let alone articles with ‘merely’ deep statistical flaws, mainly because not all of them are retracted. Chris Hartgerink, a researcher who studies bias, error and fraud in scientific publications, concludes that “the scientific system as we know it is pretty screwed up” (in an interview with Stephen Buranyi6). In the same article7, Buranyi also points to a 2009 study by the Stanford researcher Daniele Fanelli, who concludes that “it is likely that, if on average 2% of scientists admit to have falsified research at least once and up to 34% admit other questionable research practices, the actual frequencies of misconduct could be higher than [what is often reported]”.
Is quality illusory?
So, is quality illusory? Is believing in the quality of scientific publications on the basis of their prestige just a bureaucratic necessity in the scientific ‘ego’-system? After all, quality is often associated with a high impact factor, which is based on a simple count of the average number of times articles in a given journal are cited, and researchers rely on impact factors to signal the importance of the publications they list on their CVs. They need to do this, of course, because the ego-system is reinforced by the widespread – almost universal – practice of using citations (usually via the impact factor) to evaluate journals, papers, people, funding proposals, research groups, institutions, and even countries.
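For readers unfamiliar with how that ‘simple count’ works, here is a minimal sketch of the standard two-year impact factor calculation (all figures are invented for the example; the real calculation also depends on what the indexer counts as a ‘citable item’).

```python
# Sketch of the standard two-year journal impact factor, with invented figures.
citations_2016_to_2014_items = 320   # citations received in 2016 to items published in 2014
citations_2016_to_2015_items = 410   # citations received in 2016 to items published in 2015
citable_items_2014 = 150             # 'citable items' (articles, reviews) published in 2014
citable_items_2015 = 160             # 'citable items' published in 2015

jif_2016 = (citations_2016_to_2014_items + citations_2016_to_2015_items) / \
           (citable_items_2014 + citable_items_2015)

print(f"2016 impact factor: {jif_2016:.2f}")  # (320 + 410) / (150 + 160) ≈ 2.35
```

Being an average over a whole journal, the figure says nothing about any individual article: a handful of highly cited papers can carry a journal whose typical article is cited rarely, if at all.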
Einstein is alleged to have said: “Not everything that can be counted counts, and not everything that counts can be counted”. Nature, a journal you will know, seems to agree that research assessment rests too heavily on the inflated status of the impact factor. One of its editorials in 2005 indeed carried the headline “Research assessment rests too heavily on the inflated status of the impact factor”8. Nature does not seem to do irony, though: it keeps advertising its impact factor very prominently, featuring it on the cover.
Clearly, the impact factor still plays a very important role. Even an impact factor of less than one is considered worth boasting about by many a researcher. Apparently, there are even pseudo-impact factors, invented by organizations that cannot count on the ISI Journal Impact Factor from Thomson Reuters.
The notion of impact really is quite incoherent. According to Stefan Collini, professor at Cambridge, that is because it “rewards the sensationalist and second-rate […] and risks turning academics into door-to-door salesmen for vulgarized versions of their increasingly market-oriented products”9.
Can citations even be the right sort of measure for quality? If that is the big question, the answer must simply be ‘no’. You cannot conflate impact and influence with quality. As Lea Velho, professor in Brazil, puts it: “To conflate impact/influence with quality […] is to assume perfect communication in the international scientific community. […] citation patterns are significantly influenced by factors ‘external’ to the scientific realm and, thus, reflect neither simply the quality, influence nor even the impact of the research work referred to”10. In other words, quality cannot be described with the impact factor as we know it.
How decisive are reviewers’ judgements?
But the impact factor is still widely seen as a mark of quality, as very important, even though an article’s ‘quality’ is routinely assessed by just a few people – the peer reviewers and the editor – at the point of publication. Usually there are two reviewers. They are sometimes chosen rather randomly, probably more often than you would like to believe. The correlation of a particular reviewer’s evaluation with ‘quality’, as measured by later citations of the manuscript reviewed, is low. This raises questions as to the importance of reviewers’ judgement. Some suggest that downplaying the impact of peer review can have beneficial effects, by having referees decide only whether a paper reaches a minimum level of technical quality. Osterloh and Frey11 propose that, within the resulting set, each paper should then have the same probability of being published. This procedure should make it more likely that unconventional and innovative articles are published. If you were to make the probability of being published 100% once a paper has reached a minimum level of (technical) quality, you essentially have the method used by so-called ‘mega-journals’ such as PLOS One.
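As a concrete reading of that proposal, here is a minimal sketch (my interpretation; the function and field names are invented for illustration) of the two-step procedure: referees apply only a minimum technical-quality screen, and every paper that passes has the same probability of acceptance. Setting that probability to 1 gives the PLOS One-style mega-journal model.

```python
import random

def select_for_publication(submissions, min_quality, acceptance_probability, seed=None):
    """Step 1: keep papers that pass the referees' minimum technical-quality screen.
    Step 2: accept each surviving paper with the same probability (a lottery)."""
    rng = random.Random(seed)
    screened = [p for p in submissions if p["technical_quality"] >= min_quality]
    return [p for p in screened if rng.random() < acceptance_probability]

# Invented example; 'technical_quality' stands in for the referees' screening judgement.
submissions = [
    {"title": "Paper A", "technical_quality": 0.9},
    {"title": "Paper B", "technical_quality": 0.6},
    {"title": "Paper C", "technical_quality": 0.3},
]

lottery = select_for_publication(submissions, min_quality=0.5, acceptance_probability=0.5, seed=7)
mega_journal = select_for_publication(submissions, min_quality=0.5, acceptance_probability=1.0)

print("lottery:     ", [p["title"] for p in lottery])
print("mega-journal:", [p["title"] for p in mega_journal])
```

In the lottery variant, acceptance no longer signals anything beyond having cleared the technical screen, which – as I read Osterloh and Frey – is what removes the premium on second-guessing what reviewers will find important.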
Though the PLOS One approach is gaining a following, it is by no means the prevailing method. Is it in the nature of science to just keep on counting and to infer quality from the quantity of citations? Or is this really a case of academic ‘managerialism’, to keep the assessment of researchers simple and straightforward (albeit at the expense of fairness and accuracy)? It is no more than the lazy approach to quality assessment, in my view, and only assesses pseudo-quality anyway.
Towards a socio-cultural solution?
What are we doing about it? There may be technical solutions to the problem of scientific communication, which really is not doing what it is supposed to do, and the default focus always seems to be on technical solutions. However, we really need socio-cultural ones. We need to reconsider what we mean by quality and how to assess it.
Rankings are seen as important in science, and as such, achieving high rankings forms part of the incentives for researchers. There are rankings of journals, individual researchers, universities, even whole countries. Is ‘katataxiphilia’ (‘the love of ranking’, from Greek κατάταξη = classification, rank) impeding knowledge exchange? And if we rank – if we really believe that we need to rank – should we not rank on different things than citations? Should we not, first and foremost, reward people for collaborating and for sharing the data and knowledge they have gathered?
Do we still need journals?
The whole point of scientific knowledge – of the knowledge sphere around the world, if you wish – is to disseminate it so that whoever needs it can take it in. So why are we still using the model of ‘journals’ for our primary communication, even though our modern technology, the internet, no longer requires them? And even though the publishing process can introduce quite a delay? Is it because journals give us the ‘quality’ rankings researchers so crave?
I am excited by the emergence of so-called preprint servers (such as bioRχiv) in areas other than physics, where preprints have long existed, even in print, before Arχiv was established in 1991. Preprint servers do not pronounce anything about the ‘significance’ of articles posted on them. They just enable open sharing of research results. The importance, significance and quality of an article are very difficult – probably impossible – to determine at the point of publication, and will only emerge over time. After any experiments have been replicated. After the broader discipline community has reached a consensus. Preprint services are only concerned with the inherent qualities of what is being presented, such as the measurable technical article qualities I mentioned above with respect to PLOS One. And after articles have been posted, they allow the wider community to peer-review and comment on them, openly and transparently. The need for ‘ribbons’ (journal citations for the purpose of career progression), where necessary, can be satisfied in a separate, parallel procedure that involves journals.
Notes
1. VELTEROP, J. Openness and quality of a published article [online]. SciELO in Perspective, 2015 [viewed 11 April 2017]. Available from: http://blog.scielo.org/en/2015/12/16/openness-and-quality-of-a-published-article/
2. The San Francisco Declaration on Research Assessment (DORA) [online]. San Francisco Declaration on Research Assessment (DORA) [viewed 11 April 2017]. Available from: http://www.ascb.org/dora/
3. RICE, C. Why you can’t trust research: 3 problems with the quality of science [online]. Curt Rice, 2013. [viewed 11 April 2017]. Available from: http://curt-rice.com/2013/02/06/why-you-cant-trust-research-3-problems-with-the-quality-of-science/
4. WAKEFIELD, A. J., et al. Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children – Retraction notice. Article originally published in The Lancet, Volume 375, Issue 9713, 6–12 February 2010, Page 445 [online]. Science Direct, 2017 [viewed 11 April 2017]. Available from: http://www.sciencedirect.com/science/article/pii/S0140673697110960
5. 83% Say Measles Vaccine Is Safe for Healthy Children [online]. Pew Research Center, 2015 [viewed 11 April 2017]. Available from: http://www.people-press.org/2015/02/09/83-percent-say-measles-vaccine-is-safe-for-healthy-children/
6. In an interview with Stephen Buranyi, The hi-tech war on science fraud [online]. The Guardian, 2017 [viewed 11 April 2017]. Available from: http://www.theguardian.com/science/2017/feb/01/high-tech-war-on-science
7. FANELLI, D. How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data. PLOS One [online]. 2009, vol. 4, no. 5, e5738 [viewed 11 April 2017]. DOI: 10.1371/journal.pone.0005738. Available from: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0005738
8. Not-so-deep impact – Research assessment rests too heavily on the inflated status of the impact factor. Nature [online]. 2005, vol. 435, pp 1003-1004 [viewed 11 April 2017]. DOI: 10.1038/4351003b. Available from: http://www.nature.com/nature/journal/v435/n7045/full/4351003b.html
9. REISZ, M. The core connection [online]. Times Higher Education, 2010 [viewed 11 April 2017]. Available from: https://www.timeshighereducation.com/features/the-core-connection/409838.article
10. VELHO, L. The ‘meaning’ of citation in the context of a scientifically peripheral country. Scientometrics. 1986, vol. 9, nos. 1-2, pp. 71-89. DOI: 10.1007/BF02016609
11. OSTERLOH, M. and FREY, B. S. Input Control and Random Choice Improving the Selection Process for Journal Articles. University of Zurich, Department of Economics, Working Paper No. 25. 2011. Available from: www.econ.uzh.ch/static/wp/econwp025.pdf
References
83% Say Measles Vaccine Is Safe for Healthy Children [online]. Pew Research Center, 2015 [viewed 11 April 2017]. Available from: http://www.people-press.org/2015/02/09/83-percent-say-measles-vaccine-is-safe-for-healthy-children/
Criteria for Publication [online]. PLOS One [viewed 11 April 2017]. Available from: http://journals.plos.org/plosone/s/journal-information
FANELLI, D. How Many Scientists Fabricate and Falsify Research? A Systematic Review and Meta-Analysis of Survey Data. PLOS One [online]. 2009, vol. 4, no. 5, e5738 [viewed 11 April 2017]. DOI: 10.1371/journal.pone.0005738. Available from: http://journals.plos.org/plosone/article?id=10.1371/journal.pone.0005738
GORSKI, D. The General Medical Council to Andrew Wakefield: “The panel is satisfied that your conduct was irresponsible and dishonest” [online]. Science-Based Medicine, 2010 [viewed 11 April 2017]. Available from: https://www.sciencebasedmedicine.org/andrew-wakefield-the-panel-is-satisfied-that-your-conduct-was-irresponsible-and-dishonest/
In an interview with Stephen Buranyi, The hi-tech war on science fraud [online]. The Guardian, 2017 [viewed 11 April 2017]. Available from: http://www.theguardian.com/science/2017/feb/01/high-tech-war-on-science
JALALIAN, M. The story of fake impact factor companies and how we detected them. Electronic Physician [online]. 2015, vol. 7, no. 2, pp.1069-1072 [viewed 11 April 2017]. DOI: 10.14661/2015.1069-1072. PMID: 26120416. Available from: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4477767/
LEHRER, J. The Truth Wears Off. The New Yorker [online]. 2010 [viewed 11 April 2017]. Available from: http://www.newyorker.com/magazine/2010/12/13/the-truth-wears-off
Not-so-deep impact – Research assessment rests too heavily on the inflated status of the impact factor. Nature [online]. 2005, vol. 435, pp 1003-1004 [viewed 11 April 2017]. DOI: 10.1038/4351003b. Available from: http://www.nature.com/nature/journal/v435/n7045/full/4351003b.html
OSTERLOH, M. and FREY, B. S. Input Control and Random Choice Improving the Selection Process for Journal Articles. University of Zurich, Department of Economics, Working Paper No. 25. 2011. Available from: www.econ.uzh.ch/static/wp/econwp025.pdf
REISZ, M. The core connection [online]. Times Higher Education, 2010 [viewed 11 April 2017]. Available from: https://www.timeshighereducation.com/features/the-core-connection/409838.article
RICE, C. Why you can’t trust research: 3 problems with the quality of science [online]. Curt Rice, 2013. [viewed 11 April 2017]. Available from: http://curt-rice.com/2013/02/06/why-you-cant-trust-research-3-problems-with-the-quality-of-science/
Scimago Journal and Country Rank. Available from: http://www.scimagojr.com/countryrank.php?order=ci&ord=desc
STARBUCK, W. H. The production of knowledge. The challenge of social science research. Oxford University Press. 2006. pp. 83-84.
The San Francisco Declaration on Research Assessment (DORA) [online]. San Francisco Declaration on Research Assessment (DORA) [viewed 11 April 2017]. Available from: http://www.ascb.org/dora/
VELHO, L. The ‘meaning’ of citation in the context of a scientifically peripheral country. Scientometrics. 1986, vol. 9, nos. 1-2, pp. 71-89. DOI: 10.1007/BF02016609
VELTEROP, J. Openness and quality of a published article [online]. SciELO in Perspective, 2015 [viewed 11 April 2017]. Available from: http://blog.scielo.org/en/2015/12/16/openness-and-quality-of-a-published-article/
VELTEROP, J. The best of both worlds [online]. SciELO in Perspective, 2016 [viewed 11 April 2017]. Available from: http://blog.scielo.org/en/2016/06/13/the-best-of-both-worlds/
WAKEFIELD, A. J., et al. Ileal-lymphoid-nodular hyperplasia, non-specific colitis, and pervasive developmental disorder in children – Retraction notice. Article originally published in The Lancet, Volume 375, Issue 9713, 6–12 February 2010, Page 445 [online]. Science Direct, 2017 [viewed 11 April 2017]. Available from: http://www.sciencedirect.com/science/article/pii/S0140673697110960
External link
Retraction Watch – <http://retractionwatch.com/>
About Jan Velterop
Jan Velterop (1949) is a marine geophysicist who became a science publisher in the mid-1970s. He started his publishing career at Elsevier in Amsterdam. In 1990 he became director of a Dutch newspaper, but returned to international science publishing in 1993 at Academic Press in London, where he developed the first country-wide deal that gave all institutes of higher education in the United Kingdom electronic access to all AP journals (later known as the Big Deal). He next joined Nature as director, but moved quickly on to help get BioMed Central off the ground. He participated in the Budapest Open Access Initiative. In 2005 he joined Springer, based in the UK, as Director of Open Access. In 2008 he left to help further develop semantic approaches to accelerate scientific discovery. He is an active advocate of BOAI-compliant open access and of the use of microattribution, the hallmark of so-called “nanopublications”. He has published several articles on both topics.