American lawmakers have failed to agree on how AI companies should compensate content creators for the massive amounts of data, often scraped or licensed from the web, used to train their models. In the legislative vacuum, a flurry of lawsuits and tentative voluntary industry measures have surfaced, leaving the courts to interpret existing laws.

Rightsholders are looking for damages and seeking clarity. Courts have yet to deliver a definitive verdict on whether AI companies’ use of data qualifies as “fair use” or if creators must be compensated. The uncertainty hurts both copyright holders and AI developers.

While US law prohibits unauthorized copying, it recognizes an exception for fair use. Courts assess fair use by weighing four key factors: the purpose and character of the use (favoring nonprofit, educational, or transformative uses), the nature of the copyrighted work (with factual works more likely to qualify than creative ones), the amount and substantiality of the portion used (weighing both quantity and quality), and the effect of the use on the potential market for the original work. Courts decide fair use case by case.

The AI industry claims that the use of copyrighted materials for AI training is “transformative” and satisfies the first fair use factor. The use is not “expressive,” the argument goes, because the process produces “a useful generative AI system” rather than replicating the original works. Developers contend that the third factor also supports this view: the materials are never made public or used to compete with the originals but serve only for training, or, as some say, “learning.”

Legal precedent for “transformative” fair use arguably comes from The Authors Guild, Inc. v. Google, Inc. In 2013, a federal court ruled that Google could copy entire books to build a searchable database. Because the company displayed only excerpts, the court found the use fair.

Currently, the most prominent AI cases before US courts target large language models. They pit some of the country’s most famed content creators against the biggest names in tech. In The New York Times v. OpenAI, the newspaper accuses OpenAI of using its copyrighted material to train ChatGPT without permission. In Silverman v. OpenAI and Alter v. OpenAI, authors likewise allege that their works were used without consent. And in In re Google Generative AI Copyright Litigation, Google faces accusations of scraping copyrighted content to develop Gemini.

Lawsuits over visual AI models are also proliferating. In Getty Images v. Stability AI, Getty claims Stability AI used its photos without permission. The legal challenges extend to the music industry as well: in Concord Music Group v. Anthropic, music publishers contest Anthropic’s use of copyrighted lyrics.

Legal experts anticipate conflicting rulings in lower courts, which could require the Supreme Court to provide a definitive resolution. The economic implications are significant: a ruling against the fair use of copyrighted works could hinder innovation and slow the burgeoning AI sector. That prospect may push the Court to interpret the law cautiously to avoid stifling technological advancement.

A clear federal framework remains out of reach, stymied by deep ideological divisions in Congress. Over the past two years, US legislators have held four hearings: two on artificial intelligence and intellectual property, a third on AI-assisted inventions and creative works, and an oversight hearing at which Copyright Office head Shira Perlmutter called for increased transparency and clarity on fair use in AI training.

Some federal legislation has been introduced. The Generative AI Copyright Disclosure Act of 2024 would force AI developers to disclose the copyrighted works used to train their models. The proposed AI Foundation Model Transparency Act would direct the Federal Trade Commission to create standards for public access to AI training data and algorithms. But neither proposal is close to becoming law.

The US Copyright Office has been active. In 2023, it hosted listening sessions on AI’s impact across creative fields, webinars on registering AI-generated content, and a roundtable discussing global perspectives on AI and copyright. In March 2024, it released a report on AI’s impact on copyright, focusing on AI-generated works and the use of copyrighted materials in training.

Although the Copyright Office took no definitive stance on the key fair use issue, it ruled on the broader, related question of who owns the rights to the output of a large language model. The answer: no one. “Works created solely by AI without human authorship do not qualify for copyright protection under US law,” the Copyright Office says.

With no federal legislation, states could fill the gap. They have not. Despite more than 700 bills introduced on AI, none addresses copyright. Most focus instead on transparency and accountability, like California’s AI Transparency Act and Washington state’s copycat legislation, HB 1168.

As lawsuits proliferate, media organizations and platforms are striking licensing deals. OpenAI has partnered with organizations including The Atlantic, Shutterstock, Axel Springer, Condé Nast, and Wiley, securing access to high-quality content ranging from journalism to images and academic papers. The Atlantic deal ensures proper attribution and links back to its content within OpenAI’s products, while Shutterstock provides images for training OpenAI’s visual generation models. Although media conglomerates are finding new revenue streams through AI partnerships, the deals do not clarify whether such use is “fair” or a copyright violation.

Google’s collaborations with The Associated Press and Reddit illustrate a growing awareness of the value of licensing for AI development. AP provides real-time news updates to enhance Google’s Gemini chatbot, setting a precedent for news agencies. Reddit has licensed its user-generated content to Google, recognizing the value of social media data for training.

Licensing agreements represent a promising step toward reducing friction between content creators and AI developers. By fostering partnerships, such agreements offer a practical alternative to drawn-out courtroom battles, paving the way for innovation while respecting intellectual property rights.

But licensing deals provide only a temporary fix: the fundamental issues of fair use and creator compensation remain unresolved. Legal battles will shape the future of American AI development, weighing the rights of creators against the promise of innovation. If the US wishes to maintain its technological lead, courts will bear the responsibility of setting clear rules.

Hillary Brill is a non-resident Senior Fellow with the Tech Policy Program at the Center for European Policy Analysis (CEPA). Brill served as interim Executive Director of the Georgetown Law Institute for Technology Law & Policy and teaches Copyright Law and a new Technology Policy Practice. Previously, Brill was the IP Practitioner-in-Residence at the American University Washington College of Law. Brill received her BA from Harvard University and her JD from Georgetown.

Bandwidth is CEPA’s online journal dedicated to advancing transatlantic cooperation on tech policy. All opinions expressed on Bandwidth are those of the author alone and may not represent those of the institutions they represent or the Center for European Policy Analysis. CEPA maintains a strict intellectual independence policy across all its projects and publications.
