Encyclopaedia Britannica and its dictionary subsidiary Merriam-Webster have filed a lawsuit against OpenAI in federal court in Manhattan, accusing the company of unlawfully using their reference materials to train its artificial intelligence models.
According to the complaint filed Friday, the publishers allege that OpenAI copied nearly 100,000 encyclopedia articles and dictionary entries from their online platforms to train its large language models, including the system behind ChatGPT.
Britannica claims that the AI-generated summaries produced by ChatGPT replicate its content and may reduce traffic to its websites by providing answers directly to users. The complaint argues that this practice “cannibalizes” Britannica’s audience and undermines its subscription-based information services.
The publishers also allege that ChatGPT can generate responses that closely resemble original Britannica entries, sometimes reproducing passages nearly verbatim. In addition, the lawsuit claims OpenAI improperly references Britannica in AI responses, which the publishers argue could mislead users into believing the chatbot has permission to reproduce Britannica's content.
Britannica is seeking unspecified financial damages and a court order preventing further use of its materials in AI training.
Part of Broader AI Copyright Disputes
The case is the latest in a series of legal battles between content owners and AI developers over the use of copyrighted material to train generative AI systems. Authors, publishers, and media companies have filed multiple lawsuits arguing that their work was used without permission or compensation.
Technology companies, including OpenAI, have defended their practices by arguing that training AI models on large datasets constitutes fair use because the models transform the information into new outputs rather than reproducing the original content.
Britannica previously filed a related lawsuit against AI startup Perplexity AI, which remains ongoing. The outcome of these cases could influence how courts interpret copyright law in the era of large-scale AI model training.
