Ethics of Copyright Attribution in Generative AI

Arturo Dos
Philosophy of Entrepreneurship
8 min read · Mar 6, 2023


tl;dr: As the world rages about ChatGPT and the new “Generative AI” buzz, the question of attribution for AI-generated work remains unanswered. As of today, the development of the technology has far outpaced the development of ethics: there is still no standard legal or social framework for claiming ownership of, or royalties for, works that have been used as input for AI-generated output.

source: unsplash

It was a much simpler time, when crafted goods and artwork were made by humans and sold to other human beings.

It was still a simpler time when the Venetians started a legal convention for patenting intangible assets. Since then, people in the modern world have had (sort of) a common legal language for claiming ownership and collecting royalties.

Moving Copyright into the Age of the Internet

Who owns what in the age of the Internet? The question has plagued us for more than two decades.

Take music, for example. Morpheus, Kazaa and Napster once dominated a nascent digital world in which the transfer of digital versions of physical albums had yet to be regulated.

In short, these music-sharing platforms profited on the assumption that they or their users never owed a cent for owning or transferring a digital replica of somebody else’s work.

That very quickly changed in 2001, when a federal court order blocked all transfers of copyrighted digital material on Napster. This eventually led to the downfall of the original music-sharing platforms.

Meanwhile, players like Pandora Radio played by the rules with the RIAA and obtained cheaper limited licenses to allow users to listen to music online legally but without full control of their playlists — like listening to a radio station online.

Soon after, YouTube came along with the “share your creativity through videos” bandwagon, which encouraged users to upload videos that could potentially “go viral”. Perhaps not surprisingly, many videos contained copyrighted content, such as parts or the entirety of music videos, movies and songs.

For a while, YouTube hid behind the claim that copyright holders cannot hold YouTube (or any content-sharing platform) liable for copyright infringement of their users.

Surely enough, after Google’s 2006 YouTube acquisition, both companies were hit with a mega lawsuit from Viacom in 2007 for enabling copyright infringement.

Despite Google and YouTube’s legal triumph over Viacom, the lawsuit was one of the catalysts that ushered in a new era in which content-sharing and content-hosting companies use AI to detect copyrighted material, and in turn offer revenue-sharing to copyright holders.

After around 2014, the dust finally settled — the record industry and the tech companies more or less reached a win-win formula:

Today, on one hand we have companies like YouTube that use AI to detect copyrighted material in videos and then share a fraction of their video advertising revenue with copyright owners; on the other hand, we have Spotify and Apple iTunes, which pay royalties every time one of their users listens to a song.

Tech companies get to use copyrighted music to make money, record companies take a cut of the revenue (and share some with the artists), and users get cheap music. Everybody wins, well, sort of.

Similar problems have obviously happened with movies and TV shows, and similar deals have been reached between the MPAA and companies like Netflix and Amazon over streaming rights and royalties.

What you may not expect is that even the use of dance moves has become a controversy on the Internet.

This move is called Milly Rock by Rapper 2 Milly (in his single “Milly Rock”):

source

Popularized in 2015 after the release of his single, the move could be widely seen in dance clubs and on the Internet.

Then Fortnite, the blockbuster online third-person shooter game by Epic Games, added the move to its game:

source

For those who are not familiar with in-game purchases, these “emotes” are usually sold alone or as a part of a bundle. These purely software-based in-game purchases can be priced anywhere between $1.99 and $19.99.

With more than 400 million players, you can imagine Epic Games raking in hundreds of millions with just one set of in-game purchases in Fortnite.

In 2018, 2 Milly sued Epic Games for copyright infringement. The case was eventually dropped after the Supreme Court ruled, in an unrelated case, that a copyright must be registered with the Copyright Office before its owner can sue for infringement.

Even though the Milly Rock case never had a proper resolution, it surely stirred up a few debates:

Can you copyright a dance move that is not a choreography?

Then what is stopping people from copyrighting a jumping jack?

If you can copyright a dance move, then can a musical artist copyright a motif?

Artists sample from each other all the time, like how DJ Khaled’s “Wild Thoughts” (featuring Rihanna) sampled Santana’s “Maria Maria”, or how the Black Eyed Peas’ “Pump It” sampled Dick Dale’s version of the folk song “Misirlou” (the Pulp Fiction theme song). There is a tacit consensus as to how much of a song you can use before you have to pay royalties, but that line sure is blurry.

The debate will continue into 2023 as there are billions of dollars at stake.

But hey, that was all just handling the distribution of physically recorded music and video over digital media.

From here on out, it gets way more complicated. It really does.

Copyright In the Age of AI Generation

When a digital asset is no longer created entirely by humans, but is instead generated by AI from partial data collected from humans’ (potentially copyrighted) work, can the AI output be copyrighted?

And who owns the copyright?

AI used to be the science and engineering of a descriptive replica of the human mind, but nowadays it is all about ingesting data to create models that can approximate what humans do.

Speech recognition and speech synthesis, as you can imagine, require tons of voice data.

Some voice data could unknowingly become the basis of the world’s most recognizable artificially generated voice: Siri.

Similarly, when we look at something like this generated by Midjourney:

source

It is undeniable that this art piece is a visual and conceptual montage of thousands if not millions of existing artworks.

Some would argue that this is merely the generation of a painting that resembles the Baroque or Rococo styles, and that you cannot copyright, trademark or patent the idea of an artistic movement or artistic style.

Sure, but let’s not forget that artistic styles and philosophical movements are extrapolated from token creations, and these creations themselves are very much eligible for ownership or copyright.

One may not be able to own the idea of impressionism in music, but Debussy, if he were still alive, would very much own his compositions.

Same goes for dance styles like hip-hop, breaking (commonly misnamed “breakdancing”) or salsa dancing — while the concept of the styles cannot be copyrighted or patented, the choreographers who create their works of art are very much in ownership of their respective works.

When so-called “generative AI” trains itself on data sets gathered from a supposed “public realm” like the Internet, the software is effectively creating a complex montage of decomposed works of others.

Some may argue that a modern diffusion model uses data points to extrapolate and create new information, and never actually “uses” or “outputs” the original data points, so is it really subject to this debate over copyright attribution?

In my honest opinion?

Of course, it is.

In layman’s terms, diffusion models learn from their training data what images typically look like, then use those learned patterns to predict what is missing. One way to picture this is restoring a damaged or blurred image, like so:

source

Something similar happens when your 4K or Ultra HD TV automatically upscales an older Blu-ray. Basically, your TV is running a model that generates pixels between the pixels of the existing lower-fidelity image or video to make the picture sharper.
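To make “generating pixels between pixels” concrete, here is a minimal sketch in Python. It uses plain linear interpolation on grayscale values, which is an assumption made for illustration: real TVs use far more sophisticated (and proprietary) upscalers, and the function names `upscale_row` and `upscale_image` are made up for this example.

```python
def upscale_row(row, factor=2):
    """Insert linearly interpolated values between neighbouring pixels."""
    if factor < 1:
        raise ValueError("factor must be at least 1")
    if not row:
        return []
    out = []
    for left, right in zip(row, row[1:]):
        # Keep the original pixel (step 0), then add the in-between guesses.
        for step in range(factor):
            out.append(left + (right - left) * step / factor)
    out.append(row[-1])  # the last original pixel has no right-hand neighbour
    return out


def upscale_image(image, factor=2):
    """Upscale a 2-D grid of grayscale values: rows first, then columns."""
    wide = [upscale_row(row, factor) for row in image]
    # Transpose, upscale each column as if it were a row, then transpose back.
    tall = [upscale_row(list(col), factor) for col in zip(*wide)]
    return [list(row) for row in zip(*tall)]


small = [[0, 10],
         [20, 30]]
big = upscale_image(small)
```

Every value in `big` that was not in `small` is an educated guess derived from its neighbours. None of the new pixels is “original”, yet all of them are entirely determined by the source image.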

In short, what we call “generative AI” is basically a different use for these types of models — instead of generating missing pixels in lower fidelity images, we’re using them to generate a new image with patterns and trends the machine has already seen before.
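For the curious, here is a toy Python sketch of the noising step that diffusion models are trained to undo. It is an illustration under loud assumptions, not how any production system is built: the names `add_noise`, `remove_noise` and `alpha_bar` are made up for this example, and the “model” is an oracle that is simply handed the true noise, whereas a real model has to predict that noise from patterns learned across its training images.

```python
import math
import random


def add_noise(pixels, alpha_bar, rng):
    """Blend each pixel with Gaussian noise; alpha_bar near 0 means mostly noise."""
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [
        math.sqrt(alpha_bar) * p + math.sqrt(1.0 - alpha_bar) * n
        for p, n in zip(pixels, noise)
    ]
    return noisy, noise


def remove_noise(noisy, predicted_noise, alpha_bar):
    """Invert the blend above, given an estimate of the noise that was added."""
    return [
        (x - math.sqrt(1.0 - alpha_bar) * n) / math.sqrt(alpha_bar)
        for x, n in zip(noisy, predicted_noise)
    ]


rng = random.Random(0)             # fixed seed so the run is repeatable
original = [0.1, 0.5, 0.9, 0.3]    # a tiny one-row "image"
noisy, true_noise = add_noise(original, alpha_bar=0.5, rng=rng)

# ASSUMPTION: an oracle hands us the exact noise. A trained model would
# supply its own prediction here instead.
restored = remove_noise(noisy, true_noise, alpha_bar=0.5)
```

Because the oracle knows the exact noise, `restored` matches `original` up to rounding. A trained model only has an estimate, and that estimate is shaped by every image it was trained on, which is exactly where the attribution question comes in.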

The underlying premise of a diffusion model remains the same. Because “generative AI” in reality extrapolates and generates information that sits so close to the original data points, the type of derivation we are witnessing blurs the line between replication and creation.

So, if upscaling a Blu-ray copy of The Godfather to 4K quality doesn’t release you from copyright claims from Paramount/Viacom, then employing generative AI doesn’t release you from copyright claims associated with the data points you used either.

No Verdict Today

If you thought I was here to incite a movement to collect royalties from those who use “generative AI” training data, hold your horses.

I don’t think one way or the other at this point — I don’t believe that tech companies should profit off of other people’s work — the way Epic Games profited off of Milly Rock, but at the same time I don’t believe data points, musical motifs, dance moves or simple English quotes should be allowed to be copyrighted.

No copyright protection disincentivizes original creators, but too much protection also suffocates later creativity.

The last thing we want is for the battle over AI data to become anything like the legal abomination that is the mobile device industry’s patent rivalry.

To give you a sense of the insufferable ugliness: Apple was granted a patent for a boring, unoriginal tablet design, and Apple consistently uses its mobile patents to push Samsung out of global markets. All the while, Google gobbled up mobile patents to join the legal brawl, not to mention that Microsoft’s old patents allow it to collect a few dollars on many of the Android devices sold. Even though Apple is losing to Microsoft in the tablet-laptop hybrid game, Apple got the patent for something it obviously wasn’t the first to build.

The mobile device industry is an example of how intellectual property abuse and trolling can get way out of hand when things that should not be protected are granted protection (patents in this case, but the same goes for copyright), and corporations and individuals with the financial means can push others out of the market by making vacuous claims in court.

Thus, I’m on the fence when it comes to copyright attribution in generative AI.

How do we make an innovative, productive and profitable AI company pay its dues to society without killing it?

That is the question.


Serial Entrepreneur in Education and B2B SaaS. Product and Engineering Management. AI, Education and UX. Philosophy, Dance, Music and Culinary Hobbyist.