In a startling turn of events this week, artificial intelligence startup OpenAI filed court documents claiming that The New York Times intentionally tried to trick its popular ChatGPT chatbot into plagiarizing Times articles verbatim.
The accusations came as part of OpenAI’s response to a copyright infringement lawsuit filed against it in December by the Times. In the lawsuit, the Times alleges that OpenAI and Microsoft, which has invested billions into the startup, trained their AI systems on Times articles without permission. This training then enabled ChatGPT to reproduce full New York Times stories word-for-word when given certain prompts, depriving the Times of licensing revenue and harming its relationship with readers.
OpenAI’s latest court filings do not deny that ChatGPT can reproduce Times content. However, the startup claims that the Times arrived at many of its examples of copied text by exploiting a loophole in ChatGPT that OpenAI engineers are working to fix.
Specifically, OpenAI says that the Times took advantage of the fact that ChatGPT has nearly perfect recall of text it has seen before. By copying and pasting Times articles directly into ChatGPT prompts, reporters could get the system to regurgitate verbatim passages from those same articles. OpenAI argues that this method of prompting the system does not reflect how the average user interacts with ChatGPT.
“Normal people do not use OpenAI’s products this way,” the company’s filing states, even citing a Times article from April 2023 titled “35 Ways Real People Are Using A.I. Right Now” as evidence that the Times was aware its testing methods went beyond typical use cases.
A spokesperson for the Times pushed back strongly on the notion that its reporters “hacked” or tricked ChatGPT, countering that they were “simply using OpenAI’s products to look for evidence that they stole and reproduced The Times’s copyrighted works.” The Times alleges that regardless of how examples were obtained, they still prove OpenAI copied original reporting without permission.
While strongly worded, OpenAI’s accusations did not make up the entirety of its latest court filings. The startup also asked the judge to dismiss several aspects of the Times’s lawsuit entirely, arguing that any copyright infringement outside the standard three-year statute of limitations window cannot be considered.
How the judge ultimately rules on OpenAI’s requests could set an important precedent in what is likely to become an avalanche of AI copyright cases. The Times is only the first of many major publishers considering lawsuits against AI startups like OpenAI. With billions invested in the space and new startups emerging, courts will soon be flooded with thorny questions about the legal limits of machine learning systems built on scraped data.
Both OpenAI and journalistic institutions like the Times clearly recognize the pivotal nature of this first confrontation, a clash over the AI industry’s disruptive effects on copyright law and the free press. With reputations on the line, neither side seems willing to pull any punches in what is gearing up to be an ugly legal brawl between former tech allies.
