One of the world’s most prestigious machine learning conferences has banned authors from using AI tools like ChatGPT to write research papers, sparking a debate about the role of AI-generated text in academia.
The International Conference on Machine Learning (ICML) announced the policy earlier this week, stating: “Papers that include text generated from a large-scale language model (LLM) such as ChatGPT are prohibited unless the produced text is presented as a part of the paper’s experimental analysis.” The news sparked widespread discussion on social media, with academics and AI researchers both defending and criticizing the policy. Conference organizers responded by releasing a longer statement explaining their thinking. (ICML responded to requests from The Verge for comment by directing us to the same statement.)
According to ICML, the rise of publicly available AI language models such as ChatGPT — a general-purpose AI chatbot that launched on the web last November — represents an “exciting” development that nevertheless comes with “unexpected consequences [and] unanswered questions.” ICML says these include questions about who owns the output of such systems (they are trained on public data, usually collected without consent, and sometimes repeat that information verbatim) and whether AI-generated text and images should be considered novel or merely derivative of existing work.
Are AI writing tools just assistants or something more?
The latter question ties into a difficult debate about authorship: who “writes” an AI-generated text, the machine or its human controller? This is especially important given that the ICML only prohibits text “produced entirely” by AI. Conference organizers say they are not prohibiting the use of tools such as ChatGPT “to edit or polish author-written text” and note that many authors have already used “semi-automated editing tools” such as the grammar-correcting software Grammarly for this purpose.
“These questions, and many more, are sure to be answered over time as these large-scale generative models are more widely adopted. However, we do not yet have clear answers to any of these questions,” the conference organizers wrote.
As a result, the ICML says its ban on AI-generated text will be reviewed next year.
However, the issues that ICML addresses may not be easily resolved. The presence of AI tools like ChatGPT is causing confusion for many organizations, some of which have responded with bans of their own. Last year, Q&A coding site Stack Overflow banned users from submitting answers created with ChatGPT, while the New York City Department of Education blocked access to the tool for anyone on its network just this week.
AI language models are auto-completion tools with no inherent sense of fact
In any case, there are various fears about the harmful effects of AI-generated text. One of the most common is that the output of these systems is simply unreliable. These AI tools are massive auto-completion systems, trained to predict which word comes next in any given sentence. As such, they have no hard-coded database of “facts” to refer to – only the ability to write plausible-sounding claims. This means they tend to present false information as truth: the fact that a sentence sounds plausible does not guarantee that it is accurate.
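To make the “auto-completion” framing concrete, here is a minimal sketch of next-word prediction, assuming the open-source Hugging Face transformers library and the small public GPT-2 model — illustrative stand-ins chosen for this example, not the systems ICML’s statement discusses:

```python
# A minimal sketch of next-word prediction, the basic operation behind
# systems like ChatGPT. Assumes the Hugging Face `transformers` library
# and the small public GPT-2 model; these are illustrative stand-ins,
# not the systems named by ICML.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

prompt = "The capital of France is"
inputs = tokenizer(prompt, return_tensors="pt")

with torch.no_grad():
    # Logits have shape (batch, sequence_length, vocabulary_size).
    logits = model(**inputs).logits

# Turn the scores at the final position into a probability
# distribution over every possible next token.
next_token_probs = torch.softmax(logits[0, -1], dim=-1)

# Show the five continuations the model finds most plausible.
top = torch.topk(next_token_probs, k=5)
for prob, token_id in zip(top.values, top.indices):
    print(f"{tokenizer.decode(int(token_id))!r}: {float(prob):.3f}")
```

Nothing in this loop consults a store of facts: the model simply ranks continuations by how plausible they look given its training data, which is why fluent but false output is entirely possible.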
In the case of ICML’s ban on AI-generated text, another potential challenge is distinguishing between writing that is merely “polished” or “edited” by AI and writing that is “produced entirely” by these tools. At what point does a series of small AI-driven tweaks amount to a wholesale rewrite? What if a user asks an AI tool to summarize their paper into a quick abstract? Does that count as freshly generated text (because the text is new) or mere polishing (because it is a summary of words the author actually wrote)?
Before the ICML clarified the scope of its policy, many researchers worried that a potential ban on AI-generated text could also harm those who do not speak or write English as a first language. Professor Yoav Goldberg of Bar-Ilan University in Israel told The Verge that a blanket ban on the use of AI writing tools would amount to gatekeeping against these communities.
“There’s a clear unconscious bias in the evaluation of peer-reviewed papers to favor more fluent ones, and that works in favor of native speakers,” says Goldberg. “By using tools like ChatGPT to help articulate their ideas, many non-native speakers seem to believe they can ‘level the playing field’ around these issues.” Such tools can also help researchers save time, Goldberg said, and communicate better with their peers.
But AI writing tools are also qualitatively different from simpler software like Grammarly. Deb Raji, a research fellow in artificial intelligence at the Mozilla Foundation who writes extensively on large language models, told The Verge that it makes sense for ICML to introduce a policy specifically targeting these systems. Like Goldberg, she said she’s heard from non-English speakers that such tools can be “incredibly helpful” for drafting documents, but noted that language models have the potential to make far more drastic changes to text.
“I see LLMs as quite different from something like autocorrect or Grammarly, which are corrective and educational tools,” Raji said. “Although they can be used for this purpose, LLMs are not specifically intended to correct the structure and language of text that has already been written – there are other, more problematic possibilities, such as generating new text and spam.”
Goldberg said that while he thinks it’s certainly possible for academics to generate articles entirely using AI, “there’s very little incentive for them to actually do it.”
“At the end of the day, authors sign the paper and have a reputation to uphold,” he said. “Even if the bogus paper somehow passes peer review, any incorrect statement will be associated with the author and ‘stick’ with them throughout their career.”
This point is particularly important given that there is no completely reliable way to detect AI-generated text. Even the ICML notes that flawless detection is “difficult” and that the conference will not proactively enforce its ban by running submissions through detection software. Instead, it will only investigate submissions that other academics have flagged as suspect.
In other words: faced with the rise of a new and disruptive technology, organizers are relying on traditional social mechanisms to enforce academic norms. AI may be used to polish, edit, or write text, but humans will still have to assess its worth.