The New York Times says OpenAI violated copyright law and used its work “unlawfully”; OpenAI believes The Times’ lawsuit is “without merit.”
Generative Artificial Intelligence (AI) models have been under a lot of scrutiny lately. In July 2023, OpenAI, the company behind ChatGPT, received three lawsuits in just 10 days. In September 2023, Jonathan Franzen, George R.R. Martin, John Grisham, Jodi Picoult, and 17 other writers sued OpenAI for the alleged theft of their copyrighted work.
Also citing copyright infringement, the online picture library Getty Images is suing Stability AI, the company behind the Stable Diffusion image generator, while Universal Music and other music publishers are suing Anthropic, maker of the Claude 2 AI model, for distributing copyrighted song lyrics. Some privacy and security groups have even accused AI companies of putting personal data at risk.
The biggest lawsuit yet may be the one brought by The New York Times, if only because of the size and influence of the news organization. But why did The Times sue OpenAI and Microsoft? And how did the companies respond?
Why the New York Times is suing OpenAI and Microsoft
The New York Times sued OpenAI and Microsoft over the use of its copyrighted work to train AI chatbots. The lawsuit was filed in the Federal District Court in Manhattan on Dec. 27, 2023.
In the suit, as detailed in an article published by the newspaper, The New York Times alleges it has suffered “billions of dollars in statutory and actual damages” related to the “unlawful copying and use of The Times’s uniquely valuable works” by OpenAI and Microsoft.
According to the lawsuit, OpenAI trained its AI models on “millions of articles published by The Times,” alongside content from other news organizations.
“Settled copyright law protects our journalism and content,” the NYT lawsuit says. “If Microsoft and OpenAI want to use our work for commercial purposes, the law requires that they first obtain our permission. They have not done so.”
Microsoft is a major investor in OpenAI and provides it with cloud infrastructure. It also uses OpenAI’s GPT models to power its Bing search engine.
The two companies’ GPT models, according to The Times’ lawsuit, “directly compete with Times content.”
“Defendants seek to free-ride on The Times’s massive investment in its journalism” and use its content “without payment to create products that substitute for The Times and steal audiences away from it,” the NYT says in the suit.
What does the New York Times demand?
The New York Times lawsuit does not include an exact monetary demand. It does, however, ask that OpenAI and Microsoft be held responsible for damages related to the use of The Times’ copyrighted work.
According to the suit, The Times also wants OpenAI and Microsoft to destroy the chatbot models and training data that use its copyrighted materials.
The paper compares the advent of AI to that of broadcast radio or the file-sharing service Napster. As with those earlier technological shifts, it argues, legislation will need to be enacted, and soon.
The Times quotes Richard Tofel, a news consultant who believes that intervention by the Supreme Court is inevitable, especially where journalism companies are concerned. But is that really the case?
OpenAI responds to New York Times accusations
OpenAI responded to The New York Times’ claims in a post published on its website on Jan. 8, 2024. In it, OpenAI says it “support[s] journalism” and believes the suit is “without merit.”
The company disputes The New York Times’ claims, accuses the paper of “not telling the full story,” and adds that it was surprised to first learn of the lawsuit by reading about it in The Times.
Artificial intelligence and journalism don’t have to be at odds, the company says, which is why it is working with news organizations to create new opportunities through partnerships.
Indeed, OpenAI now has licensing agreements in place with the Associated Press (AP) as well as Axel Springer, owner of Politico and Business Insider, that allow it to use those organizations’ work to train its models.
Further, OpenAI argues that training AI models on information that is publicly available on the internet is fair use, pointing to countries whose laws “permit training models on copyrighted content.”
The company also says it was the first in the AI industry to give publishers a “simple opt-out process” that, if adopted, prevents its tools from accessing their sites. The NYT, it writes, opted out in August 2023.
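In practice, that opt-out works through a site’s robots.txt file: OpenAI’s web crawler, GPTBot, checks the file and skips any sections a publisher has disallowed. As a simple illustration (the exact directives a given publisher uses may differ), a site that wanted to block the crawler entirely would publish something like:

    User-agent: GPTBot
    Disallow: /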
Finally, OpenAI argues that “regurgitation,” the behavior that causes a chatbot to reproduce whole articles verbatim without attributing them to their author, is a rare bug it is working to eliminate, and that The New York Times isn’t telling the whole story.
“Interestingly, the regurgitations The New York Times induced appear to be from years-old articles that have proliferated on multiple third-party websites,” OpenAI writes. “It seems they intentionally manipulated prompts, often including lengthy excerpts of articles, in order to get our model to regurgitate.”
OpenAI also accuses the NYT of “repeatedly refus[ing] to share any examples, despite [their] commitment to investigate and fix any issues.”