A brief look at the copyright issues raised by generative AI

Today, generative AI tools can be used to produce content of all sorts, ranging from complex texts and essays on virtually any topic to short stories, photographs and drawings, computer code, and even poetry and music. In fact, we created the image for this blog post using Bing’s AI-based image creator.

This brings up a myriad of intellectual property concerns. Are the developers of AI tools infringing IP rights when they collect and use such content to train their AI models? Is the output of AI tools copyrightable? If yes, who is the author or owner of such works?

These questions take on much greater significance as AI tools become part of business workflows. Are you opening yourself up to allegations of IP infringement by using these tools? Would you be able to enforce rights in content produced by these tools? Before companies embrace the benefits of generative AI, they need to understand the risks and how to protect themselves.

Copyrightability of AI Generated works – Are AI generated works original?

The Copyright Act 1957 (Copyright Act) provides that, for the grant of copyright protection, the work must be original, i.e. it should originate from the author.[1] In India, a work must involve a minimum degree of creativity and not be a product of only skill and labour.[2] Generative AI tools train on data scraped from millions of pre-existing sources and give outputs based on a combination of these sources and their models. Output produced by AI tools may not satisfy the requirement of “creativity” for copyright protection if it is viewed as a collection of data compiled from already existing sources without any infusion of creativity. However, anyone who has experimented with these tools would struggle to argue that the output is a mere compilation which lacks creativity. Does “creativity” necessarily require the involvement of a human mind?

In 2021, an application was filed with the US Copyright Office for copyright registration of a comic book consisting of text and images (created partly by a human and partly by the AI tool “Midjourney”). The US Copyright Office refused to grant copyright protection to the portion of the comic book that was created by the AI tool. It also clarified that it will not register works produced by a machine or mere mechanical process that operates randomly or automatically without any creative input or intervention from a human author.[3]

Arguably, the position under Indian law may be different from the US Copyright Office’s view (for reasons discussed below). But is there a way in which works produced by AI can still be granted protection even though they are based on underlying original works? One solution is for output produced by AI tools to be protected as “derivative works”. The Copyright Act protects two kinds of works: primary works and derivative works. Primary works are not based on any pre-existing works.[4] Derivative works are works based on existing subject matter. For derivative works to get copyright protection, there should be substantial variation from the pre-existing work. The variation cannot be trivial in nature.[5] AI tools scrape data from pre-existing sources; however, the output produced is not a verbatim copy of those sources. The output is based on the AI model’s specific learning from the data it has trained on. Hence there may be a case to argue that the outputs produced by AI tools are derivative works and not copied from the primary works.

Who is the author of AI generated works?

If AI-generated works are entitled to copyright protection, the next question is: who is the author of the creative outputs produced by generative AI tools?

Interestingly, the Copyright Act recognises authorship of computer-generated works. It stipulates that the author of a literary, dramatic, musical or artistic work which is computer-generated is the “person who causes the work to be created.”[6] This provision was inserted into the Copyright Act through the 1994 amendments. A reading of the Joint Committee Report on the Copyright (Second Amendment) Bill, 1992 (Report) suggests that the provision was inserted keeping in mind the rapid developments in technology, including artificial intelligence. The Committee acknowledged that through artificial intelligence, computers were capable of giving new ideas, apart from what is fed into the system. The Committee also discussed the need for protecting works generated by a computer, such as computerised music and animated films. The Committee relied on the Copyright, Designs and Patents Act, 1988 of the UK, which grants protection to “computer-generated works”, i.e. works generated by a computer where there is no human author. The Committee also acknowledged the need for distinguishing between computer-generated works and computer-assisted works (where human contributions can be easily identified).[7]

As per the provision, the person who causes the computer-generated work to be created will be the author of the work. This raises the question: who is the “person” who has caused the work to be created? Is it the developer of the AI tool, the user who input the query, or the AI tool itself? Copyright law has until now only recognized natural persons as authors of a copyrighted work. The term “person” has also been interpreted conservatively by the Courts in respect of copyright law.

In 2019, the Delhi High Court rejected a copyright claim over a list compiled by a computer, on the grounds of, inter alia, lack of human intervention.[8] Further, in 2020, the Copyright Office recognized an AI tool, Raghav, as an author of an artwork produced by the tool, along with the developer of the tool. This was seen as the first time that an AI tool was recognized as an author of a copyrighted work in India. However, the Copyright Office subsequently issued a withdrawal notice, stating that the onus was on the applicant to inform the Copyright Office about the legal status of the AI tool.[9] Arguably, this decision ignored the legislative intent of Indian law, which specifically grants protection to computer-generated works.

As businesses start deploying these tools as part of their workflows, whether to help write code or to create marketing materials, it is imperative to understand if these works are protectable. If they are not going to receive protection, businesses’ ability to protect their IP can be seriously impacted and alternate strategies may need to be put in place.

IP issues in collection and usage of data

Another issue is that AI tools are trained on large amounts of data, which could include copyright-protected data. For example, AI tools can be trained on content such as books, articles, photographs and paintings, among others. Written materials such as books and articles are entitled to copyright protection as “literary works”, and photographs and paintings are entitled to be protected as “artistic works”. If the training data used by AI tools is protected as “copyrighted works”, the output produced by such tools may infringe the copyright in these works. For example, DALL-E can generate images in the style of famous artists, meaning that the training data would have included samples of copyrighted work of various artists. It is therefore important that the developers of DALL-E obtain a license to use such works. Developers run the risk of facing legal action if they use unlicensed content. In fact, developers are already facing legal action over their AI tools using protected content. A few of these cases are discussed below.

Getty, an image licensing service, has brought a lawsuit against the creators of the art-generating AI “Stable Diffusion” in a US federal court, alleging that the tool unlawfully copied and processed millions of images, violating its copyright in the images. Stability AI has not responded to the allegations on merits and no order has been passed in the case yet.[10] However, Stability AI’s CEO has been quoted as saying that he believes that generative AI “transforms” the work product and hence is protected by fair use.[11]

This is not the only copyright case against Stability AI. Three artists have filed a class action suit against Stability AI, Midjourney and DeviantArt for copyright infringement.[12] The artists allege that these tools are trained on billions of copyrighted images and, on being prompted to produce work “in the style” of a particular artist, produce seemingly new images which are actually derivative works of these copyrighted images. Stability AI[13] and Midjourney[14] have pleaded that the lawsuit does not identify any output image which has allegedly infringed the artists’ copyright, let alone one that is substantially similar to any of their copyrighted works. However, no order has been passed in the case yet.

A class action lawsuit was filed by software developers in a US District Court against GitHub, Microsoft, and OpenAI, alleging that GitHub’s AI tool Copilot, which generates code, has been trained by scraping unlicensed data. GitHub denied the allegations, stating that the tool had been trained on publicly available code and that the software developers had failed to prove injury. In a recent order on the motions to dismiss, the US District Court for the Northern District of California refused to dismiss the claim of breach of software licensing terms, on the ground that the software developers had sufficiently identified the contractual obligations allegedly breached. Certain claims, including tortious interference, fraud, unjust enrichment, unfair competition, breach of the GitHub Privacy Policy and Terms of Service, violation of the California Consumer Privacy Act and negligence, have been dismissed with leave to amend.[15]

Developers could, in certain cases, justify the collection and use of these works as “fair use” under copyright law. Under India’s Copyright Act, using a copyrighted work for “criticism or review”, being fair use, may not require the consent of the copyright owner.[16] The success of a fair use claim will depend on whether the outputs created by the AI tools are “transformative”. The test for whether a work is transformative is whether it is different in character, serves a different purpose than the prior work, and is not a mere substitute. It is not sufficient that only superficial changes are made in the work.[17] Hence developers could argue that the outputs are responses to prompts specifically input by users and are therefore transformative in nature. For example, Google’s Image Search facility was challenged on the ground that Google was displaying copyrighted images as “thumbnail” images on its search engine. The US Court of Appeals for the Ninth Circuit held that Google’s use of the thumbnail versions was “significantly transformative”, since Google transformed the actual image and put it in a different context as a pointer directing the user to a source of information, thereby providing social benefit as an electronic reference tool.[18]

Things to consider as developers and deployers

Companies, whether developing or using AI-based tools, need to take steps to protect themselves against possible claims of infringement.

Developers of AI tools should adopt systems that enable appropriate and fair use of licensed content, by obtaining the necessary licenses and giving content creators credit for reproduction of their works. Developers should also have their AI systems audited regularly, which will help in defending infringement claims.

Deployers of AI solutions should carefully read the terms of use or terms of service to assess if the AI model was trained on any protected content and if appropriate licences were taken. Companies should take reasonable efforts to ensure that the AI tool does not produce outputs that are unauthorised. User companies can also insist on indemnification for intellectual property infringements in case the AI tool is not trained on licensed data. For example, Adobe has recently announced that it is offering financial indemnity in case of copyright claims on its new generative AI tool “Firefly”.[19]

There is also a growing need for policy and legislative guidance from the Copyright Office and the Courts to provide clarity to the developers and deployers of generative AI tools. The law needs to find a balance between protecting copyright and encouraging innovation. We may need to relook at how we understand the concept of creativity and what the law seeks to incentivise.

This post has been authored by Pallavi Sondhi, Senior Associate, with inputs from Aman Taneja, Principal Associate, and Anirudh Rastogi, Managing Partner.

This image has been generated using Bing image creator.

For more on the topic please reach out to us at contact@ikigailaw.com.

[1] Eastern Book Company v. D.B. Modak 2002 PTC 641

[2] Eastern Book Company v. D.B. Modak 2002 PTC 641

[3] https://www.copyright.gov/docs/zarya-of-the-dawn.pdf

[4] Eastern Book Company v. D.B. Modak 2002 PTC 641

[5] Eastern Book Company v. D.B. Modak 2002 PTC 641

[6] Section 2(d)(vi), the Copyright Act 1957.

[7] The Committee Report does not appear to be available online. Please reach out to the authors if you would like to receive a copy.

[8] Navigators Logistics Ltd. v. Kashif Qureshi & Ors. 254 (2018) DLT 307

[9] https://www.managingip.com/article/2a5d0jj2zjo7fajsjwwlc/exclusive-indian-copyright-office-issues-withdrawal-notice-to-ai-co-author

[10] https://aboutblaw.com/6DW; Getty Images (US) Inc v. Stability AI Inc.

[11] https://sifted.eu/articles/stable-diffusion-ai-emad-mostaque

[12] https://ipwatchdog.com/wp-content/uploads/2023/02/Andersen_et_al_v._Stability_AI.pdf ; Andersen et al v. Stability AI Ltd. et al.

[13] https://fingfx.thomsonreuters.com/gfx/legaldocs/akpeqnbmopr/AI%20COPYIRGHT%20LAWSUIT%20stabilitymtd.pdf

[14] https://fingfx.thomsonreuters.com/gfx/legaldocs/zgvobjokkpd/AI%20COPYRIGHT%20LAWSUIT%20midjourneymtd.pdf

[15] J. Doe 1, et al. v. GitHub, Inc., et al., Case No. 22-cv-06823-JST.

[16] Section 52(1)(a), the Copyright Act 1957.

[17] Syndicate of The Press of The University of Cambridge on Behalf of The Chancellor, Masters and School and Ors. v. B.D. Bhandari and Ors. 185 (2011) DLT 346

[18] https://cases.justia.com/federal/appellate-courts/ca9/06-55405/0655405-2011-02-26.pdf?ts=1411056857; Perfect 10, Inc. v. Amazon.com, Inc 508 F.3d 1146 (9th Cir. 2007)

[19] https://techmonitor.ai/technology/ai-and-automation/adobe-firefly-generative-ai