Anthropic Lands Legal Victory Over Use Of Books In AI Training Models
Dario Amodei, CEO of Anthropic. Photo by Kimberly White/Getty Images for TechCrunch

A judge’s decision that Anthropic’s use of copyrighted books to train its AI models is a “fair use” is likely only the start of lengthy litigation to resolve one of the most hotly contested questions raised by the latest tech revolution.

U.S. District Judge William Alsup ruled that Anthropic’s use of the books was “exceedingly transformative,” one of the factors courts weigh in determining whether the unauthorized use of protected works is legal. It was the first major ruling to address the fair use question for generative AI systems.

Alsup’s summary judgment ruling came in a case brought by a group of authors, including Andrea Bartz, author of The Lost Night: A Novel, The Herd, We Were Never Here, and The Spare Room.

The judge did rule, however, that Anthropic must face a trial on the question of whether it is liable for downloading millions of pirated books in digital form off the internet, something it did in order to train its models.

“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft but it may affect the extent of statutory damages,” the judge wrote.

The attorney for the authors did not immediately respond to a request for comment.

There have been numerous lawsuits challenging the use of copyrighted material to train AI models. One of the most recent was filed by Disney and Comcast’s NBCUniversal against Midjourney over the use of their characters in its image-generating service. While that litigation focuses on the output of Midjourney’s service, it also makes mention of its inputs, noting that Midjourney has been teasing a new video service, “meaning that Midjourney is very likely already infringing Plaintiffs’ copyrighted works.”

Some content creators’ groups, while dismayed by the Anthropic ruling, are taking solace in the fact that Anthropic may be on the hook for the way that it accessed the millions of books.

In a statement, the Authors Guild said, “The decision allows the copying of millions of books for training, whether legally acquired or not; but if the books are not used for training and are simply downloaded and stored, that is not fair use. The court does not explain its reasoning well, but it seems to find training for AI use to be a sufficient justification.”

The guild, though, wrote that the court’s decision on the pirated books may still have an impact. “We expect that courts taking note of the staggering scope of piracy will send a clear message to all AI model developers and operators that they must license the books and other commercial copyrighted content that they use and may not help themselves to these works on pirate and other websites,” the guild wrote.

There are numerous other lawsuits targeting AI companies over their use of copyrighted material to train their models. Nor is there a bright-line rule for whether something is fair use; rather, Section 107 of the Copyright Act sets out a series of factors for determining that question, including the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect on the market for the work.

Per the U.S. Copyright Office, “Courts evaluate fair use claims on a case-by-case basis, and the outcome of any given case depends on a fact-specific inquiry. This means that there is no formula to ensure that a predetermined percentage or amount of a work—or specific number of words, lines, pages, copies—may be used without permission.”

The “fair use” question is unlikely to be taken up any time soon by Congress, which has shown little appetite for reopening copyright law. Instead, it is being left to the courts to establish the legal framework for the fast-developing technology.
