Disruptive Competition Project

Presumption of Copying in AI Training

· September 13, 2024

Credit: JannHuizenga

New bills introduced in Congress seek to help rightsholders prove that their works were used to train generative AI models. These bills contemplate a complicated technical process that relies on persistent watermarks or on maintaining records of the content used to train AI models. Unfortunately, these well-intentioned efforts may lead to a frustrating result: they ignore much simpler tools that already exist in copyright law.

To prevail on a copyright infringement claim, a copyright owner must show that:

  1. the defendant copied from her original work; and
  2. what was copied was protected expression.

Because there rarely is direct proof of copying, courts have established presumptions to assist the plaintiff in meeting her burden. Thus, if the plaintiff can show that the defendant had access to her work and that the allegedly infringing work is substantially similar to it, the court will presume that the defendant copied from the plaintiff’s work. Furthermore, if the plaintiff can show that the allegedly infringing work is strikingly similar to her work, the court will presume copying without requiring a showing of access. 

Since the emergence of the Internet, courts have wrestled with the question of whether a defendant should be deemed to have had access to a work simply because the work was available on the Internet. Courts usually have required more than the mere presence of the work on a publicly accessible site to establish access. In August 2024, for example, the Eleventh Circuit in Morford v. Cattelan found that the presence of an image on an artist’s public Facebook page for ten years, as well as on the artist’s YouTube channel and blog post, did not establish a sufficient nexus between the plaintiff’s work and the defendant; any inference of access was too speculative. The court required more evidence that the plaintiff’s work enjoyed success or publicity to trigger the presumption that the defendant had seen the plaintiff’s work.

Thus, the courts not only have created reasonable presumptions easing the burden on plaintiffs to prove that copying occurred, but also have demonstrated their adeptness at updating those presumptions to accommodate new technologies. The courts clearly are capable of adapting these presumptions in copyright cases relating to AI. While a court might choose not to presume that an artist accessed an image posted on another artist’s Facebook page, it might well decide to presume that a company that systematically scraped content from the World Wide Web accessed any content found on the Web. In other words, if the plaintiff shows that her work was available on a publicly accessible website, the court could presume that an AI firm harvested the work for training purposes: that it accessed the work and made a verbatim copy of it in the training database. The defendant could rebut this presumption by proving that it did not scrape the website, for example by producing a record of which websites it crawled and when. Or the defendant could concede that it copied the content, but argue that the copying was a fair use. This approach would be far more flexible and far less burdensome than imposing watermarking or other metadata-related requirements on all works on the Internet.
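As a rough illustration of the kind of record described above, the following Python sketch shows one way a crawler operator might log which websites it fetched and when, and later check whether a given site appears in that log. The class names, method names, and fields (CrawlRecord, CrawlLog, record, visits, crawled_before) are hypothetical and chosen only for this example; they do not describe any actual AI firm's system, and whether such a log would suffice to rebut a presumption is a legal question the code does not answer.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlsplit


@dataclass(frozen=True)
class CrawlRecord:
    """One fetched URL and the time it was fetched (hypothetical schema)."""
    url: str
    fetched_at: datetime


class CrawlLog:
    """Append-only record of what a crawler fetched, indexed by host name."""

    def __init__(self) -> None:
        self._by_host: dict[str, list[CrawlRecord]] = {}

    def record(self, url: str, fetched_at: datetime) -> None:
        """Log a single fetch under the (lowercased) host of the URL."""
        host = (urlsplit(url).hostname or "").lower()
        self._by_host.setdefault(host, []).append(CrawlRecord(url, fetched_at))

    def visits(self, host: str) -> list[CrawlRecord]:
        """Return every logged fetch from the given host, oldest first."""
        records = self._by_host.get(host.lower(), [])
        return sorted(records, key=lambda r: r.fetched_at)

    def crawled_before(self, host: str, cutoff: datetime) -> bool:
        """True if any fetch from the host was logged at or before the cutoff."""
        return any(r.fetched_at <= cutoff for r in self._by_host.get(host.lower(), []))


# Example: the operator logged one fetch from example.com and none from example.org.
log = CrawlLog()
log.record("https://Example.com/gallery/1", datetime(2023, 5, 1, tzinfo=timezone.utc))

cutoff = datetime(2024, 1, 1, tzinfo=timezone.utc)
site_a_crawled = log.crawled_before("example.com", cutoff)
site_b_crawled = log.crawled_before("example.org", cutoff)
```

A log like this only reflects what the crawler actually recorded, so its evidentiary value would depend on how completely and reliably it was kept.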

For decades, the U.S. copyright system has been resilient in the face of massive technological change because it accommodates judicial creativity. Rather than adopting these prescriptive proposals, a more appropriate approach would be to study existing copyright law, which has been interpreted by courts and is backed by clear constitutional protections like the freedoms of speech and expression, and how it scales with existing technologies. Without this attention and care, legislation that conflicts with existing law would ultimately do more harm than good to technological innovation and fair use.
