The Times is the first major media outlet to sue the AI leader.
The New York Times has become the first major media organization to file a copyright infringement lawsuit against OpenAI and Microsoft, a case that many had long expected. According to the complaint, filed in federal court in Manhattan, OpenAI language models trained on Times content now compete with the news outlet as a source of online information. The lawsuit does not specify a dollar amount, but it says OpenAI should be held liable for “billions of dollars in statutory and actual damages.”
Copyright law has remained largely unchanged for decades. One might have expected that the arrival of content that is easily shared and infinitely copyable would force a reexamination of intellectual property, but if anything, copyright has been strengthened in response to new technology. Copyright terms have been extended several times, and websites are compelled to comply with DMCA takedown demands, which are frequently incorrect. Given how little the law has evolved, it is impossible to predict how the courts will treat AI training data. Training a model is not the same as stealing someone else’s work and selling it over and over, but it is still something.
The Times was reportedly in talks with OpenAI over the summer to negotiate a licensing agreement similar to the one OpenAI has with the Associated Press, but the negotiations ended without a deal. This case will be the first major test of the legal issues surrounding artificial intelligence. Large language models (LLMs) are the foundation of today’s popular AI systems. These models are massive, sometimes consisting of billions of machine learning parameters. To generate human-like responses in natural language, the models need to be large, which means they need to ingest a large amount of training data. OpenAI is known for scraping a significant portion of the internet to feed its GPT models.
The Times says a number of ChatGPT responses are based on New York Times data, and it provided several examples. The lawsuit shows that ChatGPT can answer questions with near-verbatim quotes from articles on the New York Times website that would have required a subscription to read. The investigation also found that ChatGPT made extensive use of Wirecutter, the Times’ product recommendation site. The AI, however, does not cite Wirecutter, and it certainly does not include the commerce links that generate revenue for the site.
In addition to its copyright complaints, the lawsuit raises concerns about the inaccuracy of generative AI. These models are essentially sophisticated word calculators; they do not know what is true and what is not. As a result, the models can “hallucinate” details and may even defend these falsehoods when challenged. The Times says people might ask ChatGPT about current events, and the bot would draw on all of that New York Times content to make it seem as though it knows the answers. In reality, it could be passing along potentially harmful misinformation.
This case marks a dramatic escalation of OpenAI’s legal problems. A few people were paying attention to OpenAI before 2023, but it was the partnership with Microsoft announced in early 2023 that got everyone’s attention. Since then, authors and smaller media organizations have launched legal salvos against OpenAI, claiming their content was ingested by the model without permission. If it succeeds, the Times could demand a mountain of cash, but more importantly, it could force OpenAI to remove the models it trained on New York Times data. That would immediately strip Microsoft of its lead in AI. As of publication, neither OpenAI nor Microsoft had responded to the case.