A Dream of Spring for AI Copyright
This week marks the 10th anniversary of Fair Use Week. Join us for a discussion of artificial intelligence (AI) that analyzes how AI applications function under copyright law, focusing on how fair use affects and enables this burgeoning technology.
AI technologies have reached a popularity level not seen before. How does this new ubiquity affect fair use and the creation of new works?
The first AI winter happened in 1974, after a report commissioned by the UK Science Research Council criticized how AI had failed to achieve its objectives at the time and noted that “in no part of the field have the discoveries made so far produced the major impact that was then promised.”1 What followed was a slump in popularity for technologies related to the field that lasted until 1980, after an initial excitement that stemmed from the creation of the computer itself.
Going back to the basics, an AI application, not unlike any computer, needs three key elements to function properly: hardware, software that runs on it, and data that the software relies on to produce results. The key difference is that artificial intelligence performs “tasks that would normally require human intelligence, such as visual perception, speech recognition, decision making, and language translation”. This definition is attributed to John McCarthy, one of the leading researchers in the field, at the Dartmouth Conference of 1956, which sought to unify the various research efforts at the time under a single banner. Even earlier, Alan Turing had posed the question “Can machines think?” in his seminal 1950 paper Computing Machinery and Intelligence.
The main reason for that first AI winter was the limited power of the hardware at the time. AI researchers during the 1970s found that it was far easier to teach an application to play chess than to lift a pen, a phenomenon later dubbed Moravec’s Paradox. Abilities that humans take for granted (like walking or recognizing a face) ultimately demand far greater computational power than, for instance, calculating pi. In other words, problems that are hard for people turn out to be easy for machines, and the easy ones turn out to be hard. That is why research into computer vision and robotics made little progress during the 1970s.
By the 1980s hardware had improved, with systems like LISP machines becoming more popular and being advertised as capable of simulating the decision-making capabilities of humans. However, smaller personal computers from the likes of Apple and IBM started gaining traction among the population, as specialized hardware like LISP machines was too expensive to maintain and incapable of adequately dealing with unusual inputs. This brought about the second AI winter by 1993, with popularity in the area reaching a new low point.
Since then a lot has changed. Hardware continued to improve (as per Moore’s Law), with computers getting more powerful and smaller with each generation. And with the growth of the internet, computing capacity no longer needed to be located in a single place. Instead, for companies like Google, it could be distributed all over the globe. Moreover, the internet’s rise in popularity with the general public created the opportunity to gather more data points than ever. Newer AI applications have begun to draw on this rapidly developing hardware, evolving software, and growing volume of data to flourish.
In this new spring for AI applications, we can now find tools that generate art, create texts of various types, and translate more accurately, among the technology’s myriad uses. However, the rapid deployment of these AI tools is bringing new challenges. There are concerns about the use of copyrighted works to train AI, and not everyone is happy that these applications can suddenly write children’s books or win art competitions. As scrutiny of such applications increases, legislators and the public across the globe have started to look at these mystery boxes with greater interest.
While some welcome these AI developments with excitement, others have been less accepting, and lawsuits against artwork-generating AI have already been filed in the U.S. and the UK. The crux of the matter is whether these systems infringed the copyright of artists in order to generate their creations. In one case filed on American soil, three artists initiated a class-action lawsuit against the AI apps Stability.ai and Midjourney, and against the image repository DeviantArt, alleging direct and vicarious copyright infringement, DMCA violations, unfair competition, and violations of publicity rights. The complaint can be found here. Specifically, the artists claim the defendants have “taken billions of Training Images scraped from public websites” and used them “to produce seemingly new images through a mathematical software process”.
The UK case follows along the same lines, with Getty Images suing Stability.ai and claiming that the latter “infringed intellectual property rights including copyright in content owned or represented by Getty Images”. The argument is similar to that of the U.S. case: that the defendant “unlawfully copied and processed millions of images protected by copyright and the associated metadata owned or represented by Getty Images absent a license to benefit Stability AI’s commercial interests and to the detriment of the content creators”.
Setting aside the technical aspects of how an AI application is trained (see a great explainer on the topic here), the heart of not only these two lawsuits but of the functioning of AI applications as a whole is whether the scraping of third-party content for use as input can be regarded as fair use. As a general concept, Text and Data Mining (TDM) is the collection of vast amounts of digitized material for use in software that analyzes it and extracts information from it. In the U.S., this topic is governed by §107 of the Copyright Act, which states that the fair use of a copyrighted work, by reproduction or other means, for purposes such as criticism, comment, news reporting, teaching, scholarship, or research, is not an infringement of copyright.
The law establishes four factors to determine whether a use is “fair”: the purpose and character of the use, the nature of the copyrighted work, the amount and substantiality of the portion used, and the effect of the use on the potential market for the work. This type of flexible exception has generally been interpreted by courts to permit some of the TDM uses that AI applications require to generate images, as Jonathan Band points out.
One landmark case on the topic, Authors Guild, Inc. v. Google, Inc., was litigated between 2005 and 2015 and ultimately decided by the Court of Appeals for the Second Circuit. The court concluded that Google’s digitization of books through scanning and computer-aided recognition for use in its search engine was a transformative step for libraries, even though Google had not sought permission to do so. Because this judgment deemed such TDM practices to be fair use, it is a key precedent that makers of AI applications currently rely upon to legally support their practices.
However, this ruling is increasingly being called into question, with publishers now wary of the Bing chatbot’s media diet and the UK government curbing the expansion of TDM exceptions. Will the cases mentioned above challenge this understanding? It remains to be seen. The understanding of how this technology works is still incipient, and it will take time until lawmakers can fully grasp the concept and consider proposals that do not curb the progress of innovative technologies such as AI.
This is just another example of how important fair use and limitations and exceptions are for the advancement of new technologies. The fair use doctrine has become one of the legal linchpins that AI applications rely on. Its defense and broadening are paramount so that creators and inventors can continue to recombine existing knowledge to create newer and more exciting possibilities, as they did before with the camera and with image editing programs like Photoshop. This will ensure a long spring for AI tools and for the new works of art and innovations that artists, musicians, researchers, and the public in general will create using them.
1 Lighthill, J. (1973), “Artificial intelligence: a general survey”, Artificial intelligence: a paper symposium