Scientific Integrity in the Age of AI and the challenges of transparency: Fraud, manipulation, and the new transparency challenges

By Fabiano Couto Corrêa da Silva

Introduction

Scientific integrity has always been a cornerstone of public trust in science. Yet we are living through an unprecedented crisis in which the scale, sophistication and speed of scientific fraud challenge traditional mechanisms of control and verification. The rise of artificial intelligence (AI) introduces an entirely new complexity: while it offers powerful tools to detect anomalies and fraud, it also lowers the cost and scales up the production of fraudulent content at an alarming pace. For editors, reviewers and researchers, this is a conversation of great importance for preserving trust in science as a social institution.

In the book Quem controla seus dados?1, I argue that the expansion of AI, coupled with the concentration of platforms and data flows, intensifies informational colonialism and demands transparency, provenance and sovereign governance to safeguard scientific integrity.

The contemporary crisis: the industrialization of scientific fraud

Scientific integrity faces an industrial-scale crisis. The phenomenon of “paper mills”, veritable production lines of fraudulent articles, represents a qualitative transformation in the nature of scientific misconduct. This is no longer about isolated cases of dishonest researchers, but commercial operations that mass-produce fake studies sold to researchers under pressure from the “publish or perish” logic.

The numbers are alarming. Recent studies document an exponential rise in retractions over the past two decades. Conservative estimates suggest that up to 400,000 fraudulent articles infiltrated the scientific literature in the past twenty years, while only 7,275 retractions linked to paper mills have been officially recorded, revealing the hidden magnitude of the problem.

Recent studies also show that the number of retractions has been increasing over the years, even though it represents a small proportion of total publications, and that many articles continue to receive a substantial volume of citations even after being retracted (2024)2. At the same time, analyses of paper mills indicate that specialized organizations have already managed to publish many thousands of fabricated manuscripts in peer-reviewed journals, selling scientific authorship as a service (2024)3. Taken together, these studies suggest that scientific misconduct is not limited to isolated cases but takes on structured, large-scale forms, with lasting effects on the scientific literature.

The COVID-19 pandemic, with its desperate rush for publications, dramatically exacerbated this problem. Articles about miracle treatments, fabricated data on the effectiveness of medications, and studies with questionable methodologies flooded scientific journals, many of which were only retracted months or years later, when the damage to public health had already been done.

This industrialization of fraud exposes deep fragilities in an overburdened peer-review system and in a publishing model that sometimes prioritizes quantity over quality. Volunteer reviewers, often unpaid and time-pressed, are ill-equipped to detect ever more sophisticated frauds. Publishers, meanwhile, face a dilemma between maintaining publication speed (and hence revenue) and investing in rigorous checks that slow the process.

Artificial intelligence, a double-edged sword

Generative AI marks a paradigm shift in this crisis. Where fraud once required considerable effort (fabricating data, manipulating images, writing coherent text), today it is possible to generate fraudulent content with startling ease. This expands the typology of scientific misconduct far beyond plagiarism and traditional data fabrication.

Multimodal fraudulent content

AI can create not only academically plausible text but also fake scientific images (microscopy slides, experiment charts, MRI scans), videos and even audio of interviews or testimonies. Generative AI tools can produce entire datasets that appear statistically valid yet are entirely fictitious. The sophistication of these frauds makes detection extremely difficult, even for experienced reviewers.

Deep epistemic implications

The very nature of scientific evidence is called into question. If an image can be generated by AI to be indistinguishable from a real one, how can we trust what we see? If data can be fabricated with perfect statistical distributions, how do we distinguish the real from the fake? Authorship also becomes murky. Who is the author of text generated with AI assistance? The researcher who wrote the prompt? The company that built the model? The community that produced the training data?

The transparency “paradox”

As discussed in the post The transparency paradox when using generative AI in academic research4, published on the SciELO in Perspective blog itself, declaring the use of AI, though ethically advisable and increasingly required by journals, may lead readers to perceive a work as less trustworthy. This creates a dilemma for researchers: being transparent about AI tools can harm reception; not declaring them may be considered misconduct. This paradox underscores the urgent need to develop new norms and expectations around AI’s role in scientific production.

Detection tools and a technological arms race

Fortunately, the same technology that powers fraud also offers new defensive tools. Detection approaches are constantly evolving, and include:

  • Advanced digital forensics — Sophisticated techniques to identify manipulations in images and data, including metadata inspection, detection of anomalous compression patterns and checks for statistical consistency in datasets.
  • Linguistic and stylometric analysis — Algorithms that detect textual patterns suggestive of AI generation, including semantic-coherence analysis, identification of “hallucinations” (plausible-but-false outputs) and inconsistent writing styles.
  • Emerging technologies: blockchain and data provenance — Blockchain, with its capacity to create immutable, auditable records, is being explored to ensure research-data provenance and integrity. Cryptographic timestamping systems can prove when data were collected, by whom and under what conditions, creating a verifiable chain of custody.
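The chain-of-custody idea in the last bullet can be illustrated with a minimal sketch: each dataset is fingerprinted with a cryptographic hash, and each provenance record links to the previous one, so altering any dataset or record afterwards breaks the chain. The function names and record layout below are illustrative assumptions, not a standard protocol or any specific blockchain implementation.

```python
import hashlib
import json
from datetime import datetime, timezone

GENESIS = "0" * 64  # placeholder hash for the first record in a chain


def fingerprint(data: bytes) -> str:
    """SHA-256 digest of the raw dataset bytes."""
    return hashlib.sha256(data).hexdigest()


def make_provenance_record(data: bytes, collector: str, prev_hash: str = GENESIS) -> dict:
    """Build a timestamped provenance entry that links to the previous record."""
    record = {
        "data_hash": fingerprint(data),
        "collector": collector,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "prev_hash": prev_hash,
    }
    # Hash the record itself (canonical JSON, excluding its own hash)
    # so that tampering with any field is detectable.
    record["record_hash"] = hashlib.sha256(
        json.dumps(record, sort_keys=True).encode()
    ).hexdigest()
    return record


def verify_chain(records: list, datasets: list) -> bool:
    """Check each dataset against its recorded hash and each record's link."""
    prev = GENESIS
    for rec, data in zip(records, datasets):
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        if rec["data_hash"] != fingerprint(data):
            return False  # the dataset was altered after being recorded
        if rec["prev_hash"] != prev or rec["record_hash"] != expected:
            return False  # the chain of custody was broken or the record edited
        prev = rec["record_hash"]
    return True
```

For example, two records registered in sequence verify successfully, but replacing the first dataset afterwards makes `verify_chain` fail, which is exactly the auditable property the provenance systems described above aim for. Real systems add an external trusted timestamp authority or a distributed ledger so that the researchers themselves cannot rewrite the chain.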

Limitations and challenges

No technology is a silver bullet. The race between fraud and detection is constant and asymmetric: fraudsters need only find a single gap, whereas detection systems must be broad and near-infallible. There is also the risk of false positives—legitimate work erroneously flagged as fraudulent—and algorithmic biases that disproportionately penalize certain researcher groups (e.g., non-native English speakers or less-prestigious institutions).

Ethical and social dimensions that put trust at stake

The end result is erosion of public trust in science. Each fraud uncovered, each paper-mill scandal, each high-profile retraction undermines the credibility of the entire system. At a time when we face global challenges that require evidence-based responses (climate change, pandemics, social inequalities), loss of trust in science is an existential risk.

This crisis also has a colonial dimension. Systemic vulnerabilities (excessive publication pressure, lack of resources for rigorous checks, and dependence on proprietary platforms and tools) disproportionately affect researchers and institutions in the Global South. Researchers in less-resourced countries are more vulnerable to paper-mill offers promising rapid publication in seemingly legitimate journals. At the same time, they are often the first to be suspected when frauds are discovered, reinforcing harmful stereotypes.

Responsibility, therefore, cannot be merely individual. It is institutional and systemic, demanding deep cultural transformation in evaluation practices, incentives and training. Universities and funders need to rethink academic success metrics, valuing quality over quantity. Publishers must invest in more robust verification processes, even if that means publishing less and more slowly. And the scientific community as a whole must develop a culture of transparency, reproducibility and collective responsibility.

Towards a governance of scientific integrity

Addressing the integrity crisis in the age of AI requires more than stopgaps or isolated technologies. It calls for integrated governance that combines technological recommendations, institutional reforms and—crucially—international cooperation. Building an ethical and equitable scientific system hinges on scientific sovereignty: the capacity of each community to define its own rules and standards of integrity, in dialogue with the global community, without unilateral imposition of Northern-devised models.

Transparency, accountability and epistemic justice must be the pillars of this new architecture of trust. This challenge requires the engagement of all actors in the scientific ecosystem: researchers, editors, reviewers, institutions, funding agencies and civil society. Scientific integrity is not merely a technical or individual ethical issue; it is a collective responsibility that defines the future of science as a social institution.

Posts in the series about the book Quem controla seus dados?

Note

1. SILVA, F. C. C. Quem controla seus dados? Ciência Aberta, Colonialismo de Dados e Soberania na era da Inteligência Artificial e do Big Data. São Paulo: Pimenta Cultural, 2025 [viewed 10 December 2025] https://doi.org/10.31560/pimentacultural/978-85-7221-474-2. Available from: https://www.pimentacultural.com/livro/quem-controla-dados/

2. SCHMIDT, M. et al. Why do some retracted articles continue to get cited? Scientometrics [online]. 2024 vol. 129, pp. 7535–7563, ISSN:1588-2861 [viewed 10 December 2025] https://doi.org/10.1007/s11192-024-05147-4. Available from: https://link.springer.com/article/10.1007/s11192-024-05147-4

3. PARKER, L.; BOUGHTON, S.; BERO, L.; BYRNE, J. A. Paper mill challenges: past, present, and future. Journal of Clinical Epidemiology [online] 2024, v. 176 [viewed 10 December 2025] https://doi.org/10.1016/j.jclinepi.2024.111549. Available from: https://www.sciencedirect.com/science/article/pii/S0895435624003056

4. SAMPAIO, R. C. O paradoxo da transparência no uso de IA generativa na pesquisa acadêmica. Blog SciELO em Perspectiva, 2025 [viewed 10 December 2025]. Available from: https://blog.scielo.org/blog/2025/10/10/o-paradoxo-da-transparencia-no-uso-de-ia-generativa-na-pesquisa-academica/

References

PARKER, L.; BOUGHTON, S.; BERO, L.; BYRNE, J. A. Paper mill challenges: past, present, and future. Journal of Clinical Epidemiology [online] 2024, v. 176 [viewed 10 December 2025] https://doi.org/10.1016/j.jclinepi.2024.111549. Available from: https://www.sciencedirect.com/science/article/pii/S0895435624003056

SILVA, F. C. C. Quem controla seus dados? Ciência Aberta, Colonialismo de Dados e Soberania na era da Inteligência Artificial e do Big Data. São Paulo: Pimenta Cultural, 2025. [viewed 10 December 2025] https://doi.org/10.31560/pimentacultural/978-85-7221-474-2. Available from: https://www.pimentacultural.com/livro/quem-controla-dados/

SCHMIDT, M. et al. Why do some retracted articles continue to get cited? Scientometrics [online]. 2024 vol. 129, pp. 7535–7563, ISSN:1588-2861 [viewed 10 December 2025] https://doi.org/10.1007/s11192-024-05147-4. Available from: https://link.springer.com/article/10.1007/s11192-024-05147-4.

 

About Fabiano Couto Corrêa da Silva


Fabiano Couto Corrêa da Silva is a researcher in Information Science focusing on open science, data colonialism and informational sovereignty. He leads DataLab – Laboratory for Data, Institutional Metrics and Scientific Reproducibility, with an emphasis on FAIR/CARE.

 

Translated from the original in Portuguese by Fabiano Couto Corrêa da Silva.

 

How to cite this post [ISO 690/2010]:

SILVA, F.C.C. Scientific Integrity in the Age of AI and the challenges of transparency: Fraud, manipulation, and the new transparency challenges [online]. SciELO in Perspective, 2025 [viewed ]. Available from: https://blog.scielo.org/en/2025/12/10/scientific-integrity-in-the-age-of-ai-and-the-challenges-of-transparency-fraud-manipulation-and-the-new-transparency-challenges/

 
