Okay, so before we delve into this debate, let's take a moment to understand how AI works. Training an AI model takes data. That data is fed into an algorithm, processed into learned patterns, and the model then produces output based on what it has learned. It is much like teaching children with books and other aids: they use the knowledge they have gained to answer questions.
But it's not quite that simple with AI; the process is fundamentally more complex.
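To make the data-in, knowledge-out idea concrete, here is a deliberately tiny sketch: a bigram counter standing in for a real model. This is an illustrative toy, not how systems like ChatGPT actually work (those use neural networks trained on vastly more data), but it shows the key point of the lawsuit in miniature: every answer the "model" gives traces directly back to its training corpus.

```python
from collections import Counter, defaultdict

def train(corpus):
    """'Training': count which word follows which in the corpus."""
    model = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            model[current][nxt] += 1
    return model

def predict(model, word):
    """'Inference': output the most frequent follower of `word`."""
    followers = model.get(word.lower())
    if not followers:
        return None  # the model knows nothing it wasn't trained on
    return followers.most_common(1)[0][0]

# Toy training data -- in the dispute at hand, imagine news articles here.
corpus = [
    "the model learns from data",
    "the model learns patterns",
    "the model answers questions",
]
model = train(corpus)
print(predict(model, "model"))  # prints "learns"
```

Even at this scale, the model can only ever remix what it was fed, which is precisely why who supplied the training data, and on what terms, matters so much.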
“The lawsuit, filed in Federal District Court in Manhattan, contends that millions of articles published by The Times were used to train automated chatbots that now compete with the news outlet as a source of reliable information.”
So basically, the lawsuit argues that AI has become a very real competitor to the very source used to train it. What's more, in a crowded information space with cut-throat competition, there is a sense of betrayal: the training data was used without any real consent.
“Defendants seek to free-ride on The Times’s massive investment in its journalism,” the complaint says, accusing OpenAI and Microsoft of “using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.”
One may argue here that the information was public and free to use, but is it fair to use it to train models that can threaten the actual content creators?
On this, OpenAI comments – “We respect the rights of content creators and owners and are committed to working with them to ensure they benefit from A.I. technology and new revenue models. We’re hopeful that we will find a mutually beneficial way to work together, as we are doing with many other publishers.”
There is also an ethical aspect at play here: content creators use their original ideas, thoughts and creativity to generate content, while AI assimilates, processes and reconstructs it into seemingly unique outcomes. This is what builds the premise for The New York Times's lawsuit, which seeks billions of dollars in statutory and actual damages, argues that these bots are harming its business, and demands that chatbots trained on its materials be discontinued.
The world is watching this lawsuit closely, as it would set a precedent for using content from reliable outlets as training material. It puts the legal boundaries of generative AI and emerging tech to the test. What does it mean for the news and content industry, where a prompt can whip up creations learned from large datasets? The outcome will affect everyone who has built a successful business model on online journalism, writing, art and more.
Intellectual property and ownership concerns have been a topic of discussion since the launch of ChatGPT and other generative AI models. But their real impact is only now becoming visible, as the models get better at producing accurate results and their use penetrates every industry at various levels.
Andreessen Horowitz, a venture capital firm and early backer of OpenAI, wrote in comments to the U.S. Copyright Office that exposing A.I. companies to copyright liability would “either kill or significantly hamper their development.”
But there is another side to this argument: the cost to creative and independent thinking.
The Times's complaint itself makes this case: "If The Times and other news organizations cannot produce and protect their independent journalism, there will be a vacuum that no computer or artificial intelligence can fill," it reads. It adds, "Less journalism will be produced, and the cost to society will be enormous."
The whole world is waiting on the outcome. It certainly promises to be an interesting 2024.