As new tools flourish, AI 'fingerprints' on scientific papers could damage trust in vital research
Are some researchers using too much artificial intelligence (AI) in their scientific papers? Experts say that "fingerprints" of generative AI (GenAI) can be found in an increasing number of studies.
A recent preprint, which has not yet been peer-reviewed, analysed writing style and estimated that at least 60,000 papers were probably "polished" using AI in some way.
"It's not to say that we knew how much LLM [large language model] work was involved in them, but certainly, these are immensely high shifts overnight," Andrew Gray, a librarian at University College London, told Euronews Next, adding that these types of "fingerprints" can be expected even if the tools were used for mere copyediting.
While certain shifts can be linked to natural changes in how people write, the rise in the use of some words is "staggering".
"Based on what we're seeing, those numbers look like they're going steadily up," Gray said.
The issue has already started making waves. In February, a peer-reviewed study containing AI-generated images, which the authors openly credited to the Midjourney tool, was published in the journal Frontiers in Cell and Developmental Biology and went viral on social media.
The journal has since retracted the study and apologised "to the scientific community".
"There's very few that explicitly mention the use of ChatGPT and similar tools," Gray said about the papers he analysed.
New tools pose trust issues
While GenAI may help speed up the editing process, such as when an author is not a native speaker of the language they are writing in, a lack of transparency regarding the use of these tools is concerning, according to experts.
"There is concern that experiments, for example, are not being carried out properly, that there is cheating at all levels," Guillaume Cabanac, a professor of computer science at the University of Toulouse, told Euronews Next.
Nicknamed a "deception sleuth" by Nature, Cabanac tracks fake science and dubious papers.
"Society gives credit to science but this credit can be withdrawn at any time," he added, explaining that misusing AI tools could damage the public’s trust in scientific research.
With colleagues, Cabanac developed a tool called the Problematic Paper Screener to detect "tortured phrases" – those that are found when a paraphrasing tool is used, for example, to avoid plagiarism detection.
But since GenAI tools became publicly available, Cabanac has noticed new fingerprints appearing in papers, such as the word "regenerate" – the label of a button that appears beneath AI chatbots' answers – or sentences beginning with "As an AI language model".
They are telltale signs of text that was taken from an AI tool.
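A crude version of this kind of fingerprint check can be written in a few lines: scan a paper's text for phrases that are very unlikely to appear outside a chatbot's output. The sketch below illustrates the principle only and is not the Problematic Paper Screener itself; the phrase list and function name are assumptions for illustration.

```python
import re

# Telltale phrases of the kind mentioned in the article, plus common variants.
# The exact lists used by real screening tools are an assumption here.
TELLTALE_PHRASES = [
    "as an ai language model",
    "regenerate response",
    "i cannot fulfill this request",
    "certainly, here is",
]

def find_ai_fingerprints(text: str) -> list[tuple[str, int]]:
    """Return (phrase, offset) pairs for every telltale phrase found in the text."""
    lowered = text.lower()
    hits = []
    for phrase in TELLTALE_PHRASES:
        for match in re.finditer(re.escape(phrase), lowered):
            hits.append((phrase, match.start()))
    return hits

sample = "In conclusion, as an AI language model, I cannot verify these results."
for phrase, pos in find_ai_fingerprints(sample):
    print(f"Suspicious phrase at offset {pos}: '{phrase}'")
```

Matching fixed strings like this only catches the most careless copy-pasting, which is why, as Cabanac notes, it surfaces just a small fraction of what is likely being produced.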
“I only detect a tiny fraction of what I assume to be produced today, but it's enough to establish a proof of concept,” Cabanac said.
One of the issues is that AI-generated content will likely be increasingly difficult to spot as the technology progresses.
“It's very easy for these tools to subtly change things, or to change things in a way that maybe you didn't quite anticipate with a secondary meaning. So, if you're not checking it carefully after it's gone through the tool, there's a real risk of errors creeping in,” Gray said.
Harder to spot in the future
The peer-review process is meant to prevent blatant mistakes from appearing in journals, but that is often not the case, as Cabanac points out on social media.
Some publishers have released guidelines regarding the use of AI in submitted publications.
The journal Nature said in 2023 that an AI tool could not be a credited author on a research paper, and that any researchers using AI tools must document their use.
Gray fears that these papers will be harder to spot in the future.
"As the tools get better, we would expect fewer really obvious [cases]," he said, adding that publishers should give "serious thought" to the guidelines and expected disclosure.
Both Gray and Cabanac urged authors to be cautious, with Cabanac calling on researchers to flag suspicious papers and to regularly check whether the works they cite have been retracted.
"We can't allow ourselves to quote, for example, a study or a scientific article that has been retracted," Cabanac said.
"You always have to double-check what you're basing your work on".
He also questioned the soundness of the peer-review process, which has proved deficient in some cases.
"Making assessments badly, too quickly or helped by ChatGPT without rereading, that's not good for science," he said.