Getty Images bans AI-generated content over fears of copyright claims (theverge.com)
305 points by baptiste313 on Sept 21, 2022 | 383 comments



Reading between the lines of this, it sounds to me like Getty is preparing a copyright claim against the AI companies:

1. They seem to be of the opinion that the copyright question is open.

2. Their business stands to lose substantially as a result of such models existing.

3. It would be a bad look for them to make a claim whilst simultaneously accepting works from the models into Getty.

4. At least some of their watermarked content seems to have been included in the training data of the OpenAI model: https://news.ycombinator.com/item?id=32573523

If I'm correct about that, they will probably not settle, as their business is likely worth substantially more than any feasible settlement arrangement.


I've seen a lot of confidence on HN and other tech communities that a court would never rule that training an AI on copyrighted images is infringement, but I'm not so sure. To be clear, I hope that training AI on copyrighted images remains legal, because it would cripple the field of AI text and image generation if it wasn't!

But think about these similar hypotheticals:

1. I take a copyrighted Getty stock image (that I don't own, maybe even watermarked), blur it with a strong Gaussian blur filter until it's unrecognizable, and use it as the background of an otherwise original digital painting (sketched in code after this list).

2. I take a small GPL project on GitHub, manually translate it from C to Python (so that the resulting code does not contain a single line of code identical to the original), then redistribute the translated project under a GPL-incompatible license without acknowledging the original.
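
A minimal sketch of hypothetical 1 in Python, assuming Pillow is installed (the file names are hypothetical):

    from PIL import Image, ImageFilter

    # hypothetical case 1: blur a stock photo until it is unrecognizable,
    # then use it as the background layer of an otherwise original painting
    src = Image.open("getty_stock_photo.jpg")  # hypothetical file name
    background = src.filter(ImageFilter.GaussianBlur(radius=60))  # strong blur
    background.save("painting_background.png")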

Are these infringements?

In both of these cases, a copyrighted original work is transformed and incorporated into a different work in such a way that the original could not be reconstructed. But, intuitively, both cases feel like infringement. I don't know how a court would rule, but there's at least some chance these would be infringements, and they're conceptually not too different from distilling an image into an AI model and generating something new based on it.


It’s not about reconstruction, it’s about the notion of a “derivative work”. Translating a work would absolutely be derivative (consider the case of translating a literary work between languages: this is a classic example of a derivative work). Blurring a work but incorporating it would nonetheless still be derivative, I think.

The challenge with these models is that they’ve clearly been trained on (exposed to) copyrighted material, and can also demonstrably reproduce elements of copyrighted works on demand. If they were humans, a court could deem the outputs copyright infringement, perhaps invoking the subconscious copying doctrine (https://www.americanbar.org/groups/intellectual_property_law...). Similarly, if a person uses a model to generate an infringing work, I suspect that person could be held liable for copyright infringement. Intention to infringe is not necessary in order to prove copyright infringement.

The harder question is whether the models themselves constitute copyright infringement. Maybe there’s a Google Books-esque defense here? Hard to tell if it would work.


> If they were humans, a court could deem the outputs copyright infringement

I'm not sure I understand how this is self-evident. The closest equivalent I can see would be a human who looks at many pieces of art to understand:

- What is art and what is just scribbles or splatter?

- What is good and what isn't?

- What different styles are possible?

Then the human goes and creates their own piece.

It turns out, the legal solution is to evaluate each piece individually rather than the process. And, within that, the court has settled on "if it looks like a duck and it quacks like a duck..." which is where the subconscious copying presumably comes in.

I don't know where courts will go. The new challenge is AI can generate "potentially infringing" work at a much higher rate than humans, but that's really about it. I'd be surprised if it gets treated materially different than human-created works.


> The new challenge is AI can generate "potentially infringing" work at a much higher rate than humans, but that's really about it

The other challenges are: (i) the model isn't a human who can defend themselves by explaining their creative process; it's a literal mathematical transformation of the inputs, including the copyrighted work. (And I'm not sure "actually the human brain is just computation" defences offered by lawyers are ever likely to prevail in court, because if they do, that opens much bigger cans of worms in virtually every legal field...) (ii) the representatives of OpenAI Inc who do have to explain themselves are going to have to talk about their approach to licenses for use of the material (which in this case appears to have been to disregard them altogether). That could be a serious issue for them even if the court agrees with the general principle that diffusion models or GANs are not plagiarism.

And possibly also (iii) the AI has ridiculous failure modes, like reproducing the Getty watermark, which make the model look far more closely derived from its source data than it actually is.


It's worth pointing out that the problem, in this scenario, is for the creator (ie, the human running the algorithm). They will need to determine whether a piece might violate copyright before using it or selling it. That seems like a very hard problem, and could be the justification for more [new] blanket rules on the AI process.


Proving artwork you created is free from all copyright issues is similarly impossible, but in practice isn’t an issue. So, I don’t see any AI specific justification being relevant.


How common is it for an artist to accidentally generate a work that resembles an existing work?


I can't speak for digital art because I'm not an artist but I can say it's extremely common for original music to (often accidentally/subconsciously) include melodies or pieces of melodies from other music.


That's more like two paintings sharing the same color scheme or using the same brand+color of paint but still being unique. The performance of those melodies and structures is what makes a song unique and creative.


Adam Neely the YouTuber has produced several videos about recent legal cases where musicians sue because of some superficial similarities. In most cases the higher courts in the US recognize that copying is part of the normal creative process and is not infringement unless it is a blatant rip-off.


This is exactly my thinking. If the court finds somebody guilty of infringing on a human-made piece of digital art, the response is to punish the human, not to ban or impose limits on Photoshop.

At risk of stretching the analogy, you don’t charge the gun with murder…


Except, of course, the human has very little control over what the AI outputs in txt2img scenarios, at least in terms of whether the output would match the definition of copyright infringement of someone else's work. img2img is kinda different -- I think you could make a much stronger case for derivative work there. So you have a tool that can randomly create massive liability for you, and you can't know whether it's done so until someone sues you.


The humans are the cause of it happening in the first place. Maybe the gun analogy is not such a stretch: if you pull the trigger, you own the consequences.

Reasonable fair use principles could distinguish personal and R&D use from commercial use.


I think it's more common with music. Some musician goes to a foreign country and hears an obscure local song. 20 years later the musician has completely forgotten about the song and the trip. One day a catchy melody appears in the musician's head out of the blue, and the musician completes the song and adds lyrics. The song gets famous, later reaches the foreign country, and everyone accuses the musician of plagiarism.


Reminds me of a recent scandal involving Adele, where she is accused of plagiarizing a Brazilian composer: https://english.elpais.com/usa/2021-10-19/toninho-geraes-vs-...


>I'm not sure I understand how this is self-evident. The closest equivalent I can see would be a human who looks at many pieces of art

...and then gets told "Hey, go and paint me a copy of that Andy Warhol piece from memory".

The model might not violate the copyright, but its output is a derivative work if the copyrighted works are included in the training set.


That would be over-fitting which is certainly a failure mode of ML. But it's still a failure mode, not an inherent property.


Note that being a derivative work doesn't instantly make something 'fair use' for the purposes of copyright. You typically still need 'adaptation' permission from the copyright holder to make a derivative work, so you can't make 'Breaking Bad: The Musical' by recreating major scenes in a play format, at least not without substantially changing it[0].

For the purpose of fair use, copyright.gov has an informative section titled "About Fair Use" which details what sort of modifications and usage of a copyrighted work would be legal without any permission from the copyright holder https://www.copyright.gov/fair-use/#:~:text=a)(3).-,About%20...

0: https://en.wikipedia.org/wiki/Say_My_Name!_(Musical)


It’s the opposite of what you’re suggesting in the first sentence. Something being a derivative work is a fairly clear sign that it’s not fair use.


It's kind of a combination of these two positions: something being a derivative work means that it requires an excuse under copyright law, whether that be a license, or a fair use defense (which may be fact-specific), or some other justification. Without such a justification, the derivative work is treated as a copyright infringement.

This is a result of 17 USC §106(2), which says that one of the things that "the owner of a copyright [...] has the exclusive rights to do and to authorize" is "to prepare derivative works based upon the copyrighted work", unless an exception (including fair use) applies.

https://www.law.cornell.edu/uscode/text/17/106


> If they were humans, a court could deem the outputs copyright infringement

Why? I am a human, I can learn the composition of stock photos and use the idea in my work. There is no copyright infringement, or there is simply no way to prove it.

The connection needs to be really strong to consider something as derivative work.


> consider the case of translating a literary work between languages: this is a classic example of a derivative work

This is a classic example of "you need a license on the original material to sell it" (unless it is public domain)

You also need the rights if you are translating from a pre-existing translation in a different language (i.e. you're translating a Japanese book from the authorized French translation).

Translating an opera does not automatically give you rights over the resulting derivative work, and it does not fall under the umbrella of fair use.

source: my sister is a professional translator

EDIT: technically you'd need to acquire the rights even if you translate it by yourself, for fun, and show it to someone else.


> The challenge with these models is that they’ve clearly been trained on (exposed to) copyrighted material, and can also demonstrably reproduce elements of copyrighted works on demand. If they were humans, a court could deem the outputs copyright infringement, perhaps invoking the subconscious copying doctrine (https://www.americanbar.org/groups/intellectual_property_law...).

Every single human has been exposed to copyrighted material, and probably can reproduce fragments of copyrighted material on demand. Nobody ever writes a book or paints a picture without reading a lot of books and looking at a lot of paintings first. For a "subconscious copying" suit to apply, you need to demonstrate "probative similarity" - that is, similarity to copyrighted material that is unlikely to be coincidental.

In other words - it's not clear to me that the situation with AI is any different than with a human, or that it presents new legal challenges. If it looks new, it is new.


> Blurring a work but incorporating it would nonetheless still be derivative, I think.

You have to show that the original work forms a substantial part of the new work.

If your background is a small portion of the new image, and the blur makes it difficult to see what the original was, and any other blurred image would've done the same job, then I would argue that the final new painting does not constitute a derivative work.

A similar argument could be made for AI models. The model is trained on billions of images. None of these images is individually substantial in the final output, even if in some part of the output you can trace a derivative element. For example, if an author used 1 million books, and copied every nth word from each book and merged them to produce a final book (which happens to be coherent), I would argue that the author did not infringe copyright on any of the 1 million source books.
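
A toy sketch of that every-nth-word construction in Python (the books/ directory and the choice of n are made up for illustration):

    from pathlib import Path

    # take every nth word from each source book and merge the samples
    n = 100
    merged = []
    for path in sorted(Path("books").glob("*.txt")):  # hypothetical book files
        words = path.read_text(encoding="utf-8").split()
        merged.extend(words[::n])  # a tiny, scattered sample of each book

    # no single source contributes a substantial, recognizable portion
    print(" ".join(merged[:50]))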


>> Translating a work would absolutely be derivative (consider the case of translating a literary work between languages: this is a classic example of a derivative work)

True for literature, which can be copyrighted. Not true for logic. Code that performs the same function but is written independently can't, by definition, violate copyright. It could violate a patent, but it's much harder to patent code.

However, in the AI case, the models themselves do nothing even approaching such translation. They fully incorporate what are essentially just compressed representations of existing works into their weights.

I'd guess if any single copyright holder could show that their work could be substantially retrieved from the model with the right query string, they'd have a reasonable claim on royalties from the entire model, or else the whole model would have to be thrown out.


Yes, I expect that if you ask the model for "Getty images photo of [famous person] doing [thing Getty Images has only one photo of that person doing]" you might well get the original photo out.


Should be easy to try. My guess is that it most likely won't.


You won't get the exact image, because the model doesn't perfectly memorize all inputs, but you'll likely get something so close that everyone would consider it a derivative work.

This is very similar to getting Copilot to spit out Carmack's fast inverse square root code with the right prompt.


There are some interesting thought experiments around this:

1. You do the same thing but don't use "getty" in the prompt.

2. You do the same thing with "getty" in the prompt on a model NOT trained with Getty images

3. You ask a human photographer to do the same thing and you show him a Getty image

4. You ask a human photographer to do the same thing but without showing him a Getty image (but he's presumably seen many in the past).

If 1 and 2 produce images that are also similar to a Getty image, where do we stand? I imagine it's likely that a trained model can learn "getty-like" without any actual Getty images to make 2 happen.

And currently IP law treats humans and AIs the same (in the sense that infringement rules don't distinguish between them), so would you consider 3 and 4 to be similar to 1 and 2?


For 1, there's probably a case where a picture that exactly matches the prompt is very famous, so maybe the right prompt would manage to effectively select an image licensed by Getty. The GPLed fast inverse square root routine was an example of this kind of thing: one good match and the model finds it. Finding such a case might or might not be possible.

For 2, you can't find what isn't there, so something random would come out.

For 3 or 4, I suppose you could pay paparazzi to stalk the celebrity and try to produce a similar original shot, and this might be impossible depending on the prompt (for example, "photo of [person] at [event] on [date]"), but if it's possible it has absolutely no resemblance to 1 or 2. The photographer would produce an original work, unless your #3 contractor tries to remove the watermark and pass off the Getty image as their own, which would be idiotic.

There is no particular "Getty Images" style, other than their quality requirements; they are a huge company that acquires and licenses a ton of pro photography. There's no such thing as "getty-like".

So, no, only option 1 might possibly produce a problematic derivative work.


> For 2, you can't find what isn't there, so something random would come out.

It's not a search engine. It can synthesise things that it hasn't seen to some degree (assuming it knows the elements that make up the request).

The tricky bit would be to teach it "Gettyness" without showing it Getty images, but I think that's entirely possible. Getty images aren't astonishing examples of unprecedented originality. So it just needs to know a) the celebrity, b) the action, and c) what people mean when they ask for something that looks like a Getty image.

EDIT - I answered in a rush and realise you made a similar point to me about "Gettyness" - which makes it even harder for me to understand why you think it would be a violation.

How many different ways can George Clooney eat a burrito in Times Square?


Suppose that there's a really iconic pic of George Clooney eating a burrito in Times Square, with a "Getty Images" watermark on it, in the training set. It's perfectly framed, has a somewhat comic expression, and a bit of the contents spilled on his shirt. It seems quite possible that the model could produce an image that isn't identical to this but so much so that people assume it's a copy, maybe produced by an artist based on the picture. Worse, it will have most of the "Getty Images" watermark right on the image.

If that doesn't convince you that there might be an issue I'll just stop here.


Both hypotheticals are likely infringement. The first example may be considered de minimis, but the courts hate using those words, so they might just argue that you didn't blur it enough to be unrecognizable or that it could be unblurred.

However, the thing that makes AI training different is that:

1. In the US, it was ruled that scraping an entire corpus of books for the purpose of providing a search index of them is fair use (see Authors Guild v. Google). The logic in that suit would be quite similar to a defense of ML training.

2. In the EU, ML training on copyrighted material is explicitly legal as per the latest EU copyright directive.

Note that neither of these apply to the use of works generated by an AI. If I get GitHub Copilot to regurgitate GPL code, I haven't magically laundered copyrighted source code. I've just copied the GPL code - I had access to it through the AI and the thing I put out is substantially similar to the original. This is likely the reason why Getty Images is worried about AI-generated art, because we don't have adequate controls against training data regurgitation and people might be using it as a way to (insufficiently) launder copyright.


Searching books is different than generating books and selling them.


> To be clear, I hope that training AI on copyrighted images remains legal, because it would cripple the field of AI text and image generation if it wasn't!

To be clear, there's no law banning training an AI. There are laws for what you can do with other people's stuff.

In short, maybe the AI field would indeed be crippled if they no longer freely take input from others without asking permission and/or offering compensation. And maybe that's far, far from a bad thing.


That's true, but AI models trained on copyrighted images already exist and can't just be removed from the internet, and their output will often be indistinguishable from that of "clean" models. What I fear is a kind of legal hazard that would make even the possibility that AI had been used anywhere in a work radioactive.

Imagine another hypothetical: I create a derivative work by running img2img on another artist's painting without their permission. Whether the AI model in question contains copyrighted content or not, this is probably infringement.

Now suppose that, instead, I create an original work, without using img2img on someone else's art. But, as part of my process, I use AI inpainting, with a clean AI model, so that the work has telltale signs of AI generation in it.

And then suppose an artist I've never heard of notices that my painting is superficially similar to theirs--not enough to be infringement on its own, even with a subconscious infringement argument. But they sue me, claiming that my image was an img2img AI-generated derivative of theirs, and the AI artifacts in the image are proof.

With enough scaremongering about AI infringement, it might be possible for a plaintiff to win a frivolous lawsuit like this. After all, courts are unlikely to understand the technology well enough to make fine distinctions, and there's no way for me to prove the provenance of my image! If it becomes common knowledge that AI models can easily launder copyrighted images, and assumed that this is the primary reason people use AI, then the existence of any AI artifacts in a work could become grounds for a copyright lawsuit.


It’s completely ridiculous to believe that copyright claims are unenforceable because you ran it through an ML transformation engine.

I hope Getty sues and wins. Train your datasets on your own data! This is mass IP theft.


Be careful not to watch any copyrighted films through your wetware engine. The copyright owner could ask for all copies to be removed from wet storage, plus fair damages. I wouldn't wish that on anybody.

https://en.m.wikipedia.org/wiki/Monkey_selfie_copyright_disp... Slater claims the copyright (the monkey and the Canon EOS 5D DSLR definitely don’t have the copyright).


Nonsense. Observations about certain characteristics of a copyrighted work are not covered under that work's copyright. If I take a copyrighted book and produce a table of word frequencies in that book, no serious person would claim that the author's copyright domain extends to my table.
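
A minimal sketch of that in Python ("book.txt" stands in for any copyrighted text):

    import re
    from collections import Counter

    # a word-frequency table records facts about the work, not the work itself
    text = open("book.txt", encoding="utf-8").read().lower()
    freq = Counter(re.findall(r"[a-z']+", text))

    print(freq.most_common(10))  # e.g. [('the', 9421), ('and', 5312), ...]
    # nothing here lets you reconstruct even one sentence of the original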


Everyone on HN keeps pretending that ML transformation = human inspiration. This is really funny - we don't have AGI, but we have an AGI-like capability to avoid copyright. It seems the only place where human rights and AI rights are matched is where it most benefits AI research. How interesting.

A for-profit computer program != a human being.


Can you articulate a meaningful (and more importantly legally provable) difference? Both human brains and these sorts of AI programs are intractable black boxes. Maybe you think computer programs lack some sort of divine spark, but even if we accept that it seems to me that it's not a given that humans apply their divine spark every time they create something either.


That’s a bizarre definition.

So because they’re both black boxes we suddenly treat them both the same legally?

I don’t have to prove that an ML transform is equivalent to a human being. Divine spark isn’t necessary to protect you from copyright - we have laws that determine what you need to do and those laws have allowed IP to exist as a profitable area for a century.

Here's a simple question - if I were to take an image from an artist on ArtStation and announce my for-money Warhammer tournament with it, without paying him, I'd be violating his IP rights.

But if I build an ML engine I apparently can take 100 of his images, produce similar images and charge money for it. Magic!

Explain to me - how exactly did the artist suddenly lose his IP rights?

I submit it is up to ML researchers to prove this isn’t the biggest IP land grab in history. Not for me to prove that somehow humans are different. We know humans are different and we’ve codified in law just how much new IP needs to be different in order to avoid copyright. That’s good enough for humans.

But somehow ML enthusiasts want to say that if I feed an ML all the Disney movies and get it to make a derivative work, it's somehow protected? I can't wait to see all the Mickey Mouse movies made by AI. Guess what - that's never going to happen. If you think Disney or any other IP empire will let that go, I don't know what to tell you. It's absolute madness to say that anything that goes into an ML transformation is uncopyrightable. Or that somehow I have no rights over my artwork because I don't have the ability to sue OpenAI into oblivion like Disney does? Because that's literally what we're saying - the artists from ArtStation and Getty had their images used exactly because it was believed they wouldn't be able to sue - thanks to distinctions like your own!

Disney won't let that stand; they would sue and push for annihilation. But hey, the 100,000 artists on ArtStation? F** them, they can't sue us. Let's use their shit.

That’s basically where we’re at.

Let's work out an example. This AI can be fed all of Lucian Freud's works. Then it can produce Lucian Freud-like artwork. And in this process, where all that is missing is Lucian Freud's own signature, it could potentially make a virtual replica of one of his most famous paintings! Thus I would have a copy of a Lucian Freud! But completely copyright free.

Are you for real? This cannot, absolutely cannot be allowed to happen. It is the biggest data theft in history, larger than Facebook or anything, to basically say that any human data, when run through an ML transformation engine, is no longer the property of the human being who produced it but of the ML engineer. This is absolute madness if you consider the second-order and third-order effects.

Essentially all human data that can be automated through ML would be automated to the gain of only the ML engineers involved. The intellectual property of millions of human beings producing the data would be nothing but compost. This will lead to an unsustainable situation, where no one but ML engineers will be able to make any profit in the world. Human beings must be compensated for their data, or we’re headed straight into a dystopia.


>But if I build an ML engine I apparently can take 100 of his images, produce similar images and charge money for it. Magic!

You can pay a human artist to produce "similar" images too, and as long as they aren't too similar it's fine.

You seem to be conflating - repeatedly - exact copies of a work with merely copying a style. Family Guy had an episode where Brian and Stewie visit a Disney universe, and everything was animated in the style of a classic Disney film. A bunch of humans did indeed watch a bunch of Disney films, and make a new animation based on what they saw. Did Disney - a notoriously litigious company - sue? Of course not. Copying a style is not infringement! I don't see why it's any different if an AI program does that.


> A bunch of humans did indeed watch a bunch of Disney films, and make a new animation based on what they saw. Did Disney - a notoriously litigious company - sue? Of course not.

Why would Disney sue themselves? Disney owns Family Guy.

In a world where Disney did not own it, Disney would probably sue if they used a cartoon mouse. "Style" may not be covered by copyright, but characters likely are. A human animator would know enough not to make a parody cartoon mouse look too Mickey-like, but an AI wouldn't know to avoid that.


Parody is a well known copyright exemption.

I just don’t see how you can go - this is just like a human being, let me make a million dollars out of it.


It wouldn't extend to your table, but if you used that table to generate a book of your own, it's very possible a serious person would claim the author's copyright extends to the generated book.


There is only one way a frequency table could return the original work; all other reconstructions would be gibberish. It's like Huffman tables in compression: the table itself is not equal to the original data.


Is the copyright still not applicable if your encoding table has rules for reconstructing the text from it? No serious person would argue that copyright doesn't extend to the encoded version of the book and prevent me from profiting off it.

I believe the same applies to AI generation of text/images. Just because you're encoding the data in statistical models doesn't mean that it's not encoded.


> I've seen a lot of confidence on HN and other tech communities that a court would never rule that training an AI on copyrighted images is infringement, but I'm not so sure. To be clear, I hope that training AI on copyrighted images remains legal, because it would cripple the field of AI text and image generation if it wasn't!

Regardless of the copyright status of the training data, which really is unresolved, the copyrightability of the output of AI is questionable at best. There's no way to monetize the generated images as stock art that isn't at risk of a court ruling pulling the rug from under it.


AI-produced art is still human-made, as a person does the job of engineering a prompt and selecting from the generated images. The copyrightability of such work is unlikely to ever seriously be in question.


This is not as obvious as you may think. This is closely related to the "monkey selfies" copyright claim issues. The court didn't seem to agree with the photographer who said "I own the copyright of those pictures because I configured the camera by myself and put it there, the monkey only pushed the button."


The court didn't rule on the monkey selfies copyright claim. The copyright owner just ran out of money to pay for a lawyer and gave up fighting against Wikimedia.


That's not so obviously clear cut. Can the model produce identical output from the same simple prompt?


But that's exactly what happens. AI isn't randomness; it's a set of predefined calculations. The randomness is in the seed/starting point, which for e.g. Stable Diffusion is given as user input, resulting in perfect reproducibility.
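
A minimal sketch of that reproducibility, assuming the Hugging Face diffusers library (the model id and prompt are placeholders):

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5"  # placeholder model id
    ).to("cuda")

    prompt = "a lighthouse at dusk"  # placeholder prompt

    # same prompt + same seed -> the same image, run after run
    a = pipe(prompt, generator=torch.Generator("cuda").manual_seed(42)).images[0]
    b = pipe(prompt, generator=torch.Generator("cuda").manual_seed(42)).images[0]
    # a and b are pixel-identical given identical model, settings, and hardware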


I should have been clearer - that was rhetorical to point out that if you and I use the same prompt and pick the same resultant image, it's harder to claim either of us have copyright. This is a gray area. The tools themselves could introduce some stochastic aspect so that outputs are never identical, also.

None of this leads to an obviously clear cut legal position wrt copyright.


Your (and the other person's) set of inputs drives a mathematical function that derives the same output.

If we were all honest, we'd give up the idea of copyright altogether where ML is concerned. You can't get much closer to "It's just math, man" than what it is currently.


I think we are saying the same thing, roughly.

Of course, "it's just math, man" isn't precisely a legal argument, either.


Vector art is just math, yet you can get copyright on it.


It might be hard to argue that a prompt is original enough to be covered by copyright. Shorter, simpler prompts might be ruled unoriginal.


This is what I would like to believe… but needs to be proven before I stake my business on it.


> because it would cripple the field of AI text and image generation

I don't disagree with this statement. But arguably, it's becoming clear that these fields exist, on an economic level, as a means for already powerful corporations and technocrats to gain ownership over and repurpose the labor of previous generations for profitable automation (not simply to create C-3PO or something).

Not unlike all those other non-digital areas of the economy (e.g. railroads were built on the blood, sweat, and tears of previous generations and they are now owned by a small few, likely unrelated to the descendants of the laborers who built them).

For some ML-related commentary on this subject, see e.g. https://nathanieltravis.com/2022/08/01/ai-research-the-corpo...


> I've seen a lot of confidence on HN and other tech communities that a court would never rule that training an AI on copyrighted images is infringement, but I'm not so sure.

Indeed, and your examples are intended to point to gray areas. But a much more problematic (for the user) example is: some Dall-E-like program spits out seemingly original images, but 0.1% are visibly near-duplicates of copyrighted images, and these form the basis of a lawsuit that costs someone a lot of money. Copyright in general tends to use the concept of provenance - knowing the sequence of authors and processes that went into the creation of the object [1] - and naturally AI makes this impossible.

The AI training sequence either muddies the waters hopelessly or creates a situation where the trainer is liable to everyone who created the data. And I don't think the question will be answered just once. The thing to consider is that anyone can create an "AI" that just spits out stock images (which is obviously a copyright violation), and so the court would have to look at the details involved, and neither the court nor the AI creator would want that at all.

ianal... [1] https://serc.carleton.edu/serc/cms/prov_reuse.html


> To be clear, I hope that training AI on copyrighted images remains legal, because it would cripple the field of AI text and image generation if it wasn't!

Temporarily, yes. However it wouldn’t be the worst thing if they were forced to work on sample-efficiency. And I maintain that if they want a large dataset of images they can collaborate with Twitter and Instagram to add an image license option to the image uploader, and provide a license option that explicitly allows this kind of use.

AI needs sample efficiency research and if they actually had to get consent for their images there are a lot of artists who wouldn’t feel like their work was being ripped off.

It's probably better if this kind of use is broadly permitted, but I don't think it's the disaster some think it would be if they were forced to get consent. There are already millions of openly licensed images in several collections online (Creative Commons, Wikimedia Commons), and if they forced people to provide licenses on sites like Twitter we could actually have open datasets for this instead of these kinds of grey areas of privately scraped datasets.


There's a literal arms race behind the scenes in AI right now.

I think it's very unlikely that corporate IP claims that could substantially hold back progress in domestic AI development will end up being successful.


I don't know if there's a clear answer to (1), but with respect to (2) I believe there is precedent that the copyright owner would have a strong case if they had reason to believe the Python library author had seen their work and that it played a role in the translation to Python. You can't look at a GPL work and literally transcribe it to launder the license. Companies have tried. You can implement a similar idea in a clean room, though.


Honestly, as much as I am rooting for AI "art" (still not sure about that term here), I can see how Getty would easily have a claim in court if the AI was indeed trained on some of their images AND they can prove it somehow. If that's not derivative then I don't know what is. Maybe a special niche could be carved out for people who are only researching and experimenting and not really "selling" or profiting from the resulting images. If they're right, it would seem they could bury a watermark in their images that identifies them as Getty's (or just whomever's), notes that they're copyrighted, and states that no permission is given to use them for training AI. Maybe I just don't know enough about how the algorithms work though shrug


Unfortunately for Getty et al., "feels like" is worth exactly $0.

Any court that understands the technology at even a lay level has no path to find infringement applying existing precedent.

NB "style" is not protected.


I do understand the technology at at least a lay level (unless your Scotsman is true by your conclusion), and it seems to me that the idea that it's "style" and not "permuting the input data" is one that seems to be a postulate, not a fact.


For case 2, translating a novel from, say, English to Japanese, still requires permission of the holder of the copyright to the English version, even though the resulting novel "does not contain a single line ... identical to the original".


It might technically be infringement, but the proof is in the pudding. It may be very hard to prove a specific image (or set of millions of images) were used in training.


> 1. They seem of the opinion that the copyright question is open.

I'm surprised it has taken this long, to be honest. I've seen generated images with the blurred Getty watermark on them.


It's not that the watermark is on them per se, but that the model tried to emulate an image it had seen before which had a watermark on it. Imagine showing a child a bunch of pictures with Getty watermarks on them, then they draw their own, with their own emulation of the watermark. They don't know it's a watermark, they don't know what a watermark is, they just see this shape on a lot of pictures and put it on their own. That's essentially what's going on.

The model is only around 4GB, and it was trained on ~5B images. At 24-bit depth, that'd be ~786 KB of raw data per image, which would be about 3.5 petabytes of uncompressed information in the full training set. Either the authors have invented the world's greatest compression algorithm, or the original image data isn't actually in the model.
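
The arithmetic, as a back-of-the-envelope sketch in Python (512x512 RGB assumed):

    images = 5_000_000_000                # ~5B training images
    bytes_per_image = 512 * 512 * 3       # 24-bit color: ~786 KB raw per image
    raw_total = images * bytes_per_image  # ~3.9e15 bytes, ~3.5 pebibytes

    model_bytes = 4 * 1024**3             # ~4 GB of weights
    print(raw_total // model_bytes)       # roughly a 900,000:1 ratio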

So, I think the argument is: if you look at someone else's (copyrighted) work, and produce your own work incorporating style, composition, etc elements which you learned from their work, are you engaged in copyright infringement? IANAL but I think the answer is "no" - you would have to try to reproduce the actual work to be engaging in copyright infringement, and not only do these models not do that, it would be extremely hard to get them to do so without feeding them the actual copyrighted work as an input to the inference procedure.


> It's not that the watermark is on them per se, but that the model tried to emulate an image it had seen before which had a watermark on it. Imagine showing a child a bunch of pictures with Getty watermarks on them, then they draw their own, with their own emulation of the watermark. That's essentially what's going on.

The blurred watermark is what makes it obvious they used Getty's (copyrighted?) images to train the model.


I understand that, but why would using copyrighted images to train a model be any more illegal than studying copyrighted paintings in art school? Copyright doesn't prevent consumption or interpretation, simply reproduction.


If a student studied an old master in school and produced a painting inspired by that old master that included a copy of the old master's signature, this would be more indication of intent to defraud than if they didn't include the signature.

Copyright can be fairly flexible in interpreting what constitutes a derivative work. The Getty watermark is evidence that an image belongs to Getty. If someone produces an image with the watermark and gets sued, they could say "your honor, I know it looks like I copied that image, but let's consider the details of how my hypercomplex whatsit works..." and then the judge, say a nontechnical person, looks at the defendant and says "no, just no, the court isn't going to look at those details; how could the court do that?". Or maybe the court would consider it only if you paid 1000 neutral lawyer-programmers to come up with a judgement, at a cost of millions or billions per case.


What if it was not the copied signature of the old master, but a new one with a similar style and placed in a similar spot on the painting, but with the name of the student instead and looking blurry/a bit different? Because that's what's happening here, and that doesn't sound quite like fraud.

Another scenario: what if I create a painting of a river by hand in acrylic and also draw a Getty-watermark-looking thing on top, in acrylic? As for why, I would put it there as an integral part of the piece, to allude to how corporations got their hands on even the purest things that have nothing to do with them, with the fake watermark in acrylic symbolizing that. You can make up any other reason; this is just the one I thought of as I was writing this. It won't look exactly like the real Getty watermark - it will be acrylic and drawn by hand, so pretty uneven, with the colors off and way less detail. Doesn't feel like fraud to me.


My argument isn't really about whether this is morally fraud. Maybe the device is really being "creative" or maybe it's copying. The question is whether the thing's supposed originality can be defended in court.

> What if I create a painting of a river by hand in acrylic and also draw a Getty-watermark-looking thing on top, in acrylic? As for why, I would put it there as an integral part of the piece, to allude to how corporations got their hands on even the purest things that have nothing to do with them, with the fake watermark in acrylic symbolizing that.

A human artist might well do that and make that defense in court. For all anyone knows, some GPT-3-derived thing might go through such a thought process also (though it seems unlikely). However, the GPT-3-derived thing can't testify in court concerning its intent, and that produces problems. And it's difficult for anyone to make this claim for it.

Edit: Also, if instead of a single work (of parody), you produced a series of your own stock photos, used the Getty Watermark and invited people to use them for stock photo purposes, then your use of the copyrighted Getty Watermark would no longer fall under the parody exception for fair use.


I'm not a lawyer and I can't say how existing copyright law applies to this situation, but, how is taking images and feeding them into an ML model different from taking library code and including it in your software?

In both cases, you take a series of bytes (the image data / the library source code) that is ultimately crucial to the functioning of your software, combine it with your own original code you wrote (training / compilation), and end up with a new output (the trained model / the binary executable) that is distinct from any of the original sources.

If you use a GPL'd library in your software, then it's uncontroversial to say that you have to follow the terms of the GPL. You can't say "well actually, the compiler is just reading your source code and learning what sort of binary it should produce, just like a human learns by studying source code, so I actually don't have to follow your licensing terms". No one would buy that. You clearly used that library, so you have to obey whatever terms come along with it.

Why is it fine to ignore the licensing terms for image data you incorporate into your software, but not third-party source code that you incorporate?


> If you use a GPL'd library in your software, then it's uncontroversial to say that you have to follow the terms of the GPL. You can't say "well actually, the compiler is just reading your source code and learning what sort of binary it should produce, just like a human learns by studying source code, so I actually don't have to follow your licensing terms". No one would buy that. You clearly used that library, so you have to obey whatever terms come along with it.

What if I read the code, understand its concepts, and re-implement another library that provides similar functionality without directly linking to the original repository? That is not an infringement, and it's actually how the open source community has always operated - like MariaDB to MySQL, or any project that markets itself as an 'open source alternative' to some commercial software.

I would argue the diffusion models are really good; it is possible that they capture the essentials of drawing, learning no differently than a human does. Put another way, they master the imaging process at a fundamental level.


> What if I read the code, understand its concepts, and re-implement another library that provides similar functionality without directly linking to the original repository? That is not an infringement

I agree, that's fine.

The analogous situation with image generators would be if the companies that trained the models had a human artist look at every image in their dataset, paint a unique but similar image, and then feed all of those images into the model, so that no copyrighted images were used in training without permission. But that's obviously not what they did. They just fed in the images unaltered, without getting permission.


> What if I read the code, understand its concepts, and re-implement another library that provides similar functionality without directly linking to the original repository? That is not an infringement

It might be… which is why, famously, Wine developers don't look at leaked Windows code.


A better metaphor would be copying and then claiming that you created the original art. Studying can explain the subject of study, but it does not replace it. Producing a painting based on previous paintings can create further problems.

In art it is very common for an author to revisit their own work and make several paintings of the same subject. Artists make several attempts to conquer a painting; they draw studies or use different mediums. Photographers reproduce a portrait 20 years later to see how the subject changed. If you insert an AI image in the middle and copyright it, then any later painting of the same subject, even by the original author, would become derivative of the AI image that replaced it. This could even go so far as to exclude artists from revisiting their most successful works.


Because the copyright holder has granted you the right to look at paintings and hasn't granted you the right to store them on your server to perform the mathematical transformations necessary to facilitate an adaptation-on-demand service.

Even if it was plausible to believe the mechanics of how human brains process art was particularly similar to a diffusion model or GAN, I don't see "but human brains are deterministic functions of their inputs too" as being a successful legal argument any time soon. You'd have to throw out rather more of the legal system than just copyright if those arguments start to prevail...


> Because the copyright holder has granted you the right to look at paintings and hasn't granted you the right to store them on your server to perform the mathematical transformations necessary to facilitate an adaptation-on-demand service.

You just described how modern browsers cache images.

The diffusion model is revolutionary at scale, but that doesn't mean it is doing anything drastically different from what is allowed right now, e.g. any AI-based image beautifying/denoising filter; just the scale changes everything.


The fact that watermarked images are available for free doesn't necessarily mean you can do whatever you want with them. It depends on what the Getty licence on watermarked images says exactly. I'm pretty sure it doesn't include something like "you can use these pictures as data in automated processes". They are (I guess) available "for your eyes only".


IIRC a court already stated that you can crawl the publicly available net and use what you find there; it was after that company doing face recognition, wasn't it?


A world where we can't use copyrighted material to update neural network weights is a world where we can buy books but not read them...


You can train it but not for commercial purposes. Nobody cares what you do at home, but if you want to use someone else's work to make money they will come knocking for their cut.


Does O'Reilly ask for a percentage of a software engineer's income after they've read their book of Perl recipes?


O'Reilly sells their books to software engineers with the intent for them to use the information to further their knowledge and apply it in a commercial setting.

The images in Getty are provided with the intent to be used only as a catalogue for purchasing corresponding images without watermarks.

The difference in intent is very clear, and a judge would make a distinction between these.


Is it allowed to sell a book that is a collection of pages from copyrighted books? Paragraph 1 is from a Stephen King novel, paragraph 2 is from A Storm of Swords, and so on? I am not a copyright attorney, but that sounds like a violation to me.


If you have legally obtained copies of the relevant novels, according to the first sale doctrine you should be allowed to cut them up, staple the first chapter of one novel to the second chapter of another and the third chapter of the next, and then sell the result.

But the authors have the exclusive right to making more copies of their work, so if you'd want to make a thousand of these frankensteinbooks, you would need to get a thousand copies of the original books.


No one is contesting the fact that images whose copyright is owned by Getty were used to train the model.

The contested issue is whether training a model requires permission from the copyright holder, because for most ways of using a copyrighted work - all uses except those where copyright law explicitly asserts that copyright holders have exclusive rights - no permission is needed.


I think you'd struggle to argue that the Getty watermark was a general style and composition principle and not a distinct motif unique to Getty (and in music copyright cases, the defence of plagiarising motifs inadvertently frequently fails).


From the model's perspective, it's not a distinct motif, that's the thing (and, it struggles quite a lot to reproduce the actual mark). The model doesn't have any concept of what a "watermark" is. As far as it's concerned, it's just a compositional element that happens to be in some images. Most "watermarks" Stable Diffusion produces are jumbles of colorized pixels which we can recognize as being evocative of a watermark, but which isn't the actual mark.

A quick demo: I fed in the prompts "a getty watermark", "an image with a getty watermark", and "getty", and it spat out these: https://imgur.com/a/mKeFECG - not a watermark to be seen (though lots of water).

I was then able to generate an obviously-not-a-stock photo containing something approximating a Getty watermark, with the prompt "++++(stock photo) of a sea monster, art": https://imgur.com/a/mNC6XtQ - the heavily forced attention on "stock photo" pushes the model to say "okay, fine, what's something that means stock photo? I'll add this splorch of white that's kinda like what I've seen in a lot of things tagged as stock photos", and it incorporates that into the image as a way to satisfy the prompt.

We can easily recognize that as attempting to mimic the Getty watermark, but it's not clearly recognizable as the mark itself, nor is the image likely to resemble much of anything in Getty's library.


> From the model's perspective, it's not a distinct motif, that's the thing (and, it struggles quite a lot to reproduce the actual mark). The model doesn't have any concept of what a "watermark" is.

The court delivers the judgement, not the model.

If courts can find against musicians whilst accepting they 'unconsciously' plagiarised key elements of a song in their own completely different song played by different musicians, based on maybe hearing it in the background somewhere, then they can certainly find against the creators of a model whose dependency on the Getty IP it ingested is strong and obvious enough that it outputs reasonably close approximations of Getty watermarks.


IMO, it is very difficult to prove:

1. Whether that watermark is recognizable enough to be a Getty watermark, or just something whose shape vaguely looks like a watermark.

2. Where that watermark is coming from.

From how the model is trained, it is possible that the model considers the watermark itself a style of the picture and mimics it. But it would be mission impossible to trace the inspiration back to a particular work.


It's not even necessarily trying to emulate any particular image it's seen before; it may just decide 'this is the kind of image that often has a watermark, so here goes.'


> Either the authors have invented the world's greatest compression algorithm, or the original image data isn't actually in the model.

AI and finding the best compression algorithm for an input are essentially the same problem.
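
A standard way to make that equivalence concrete (an information-theory sketch, not from the thread): an ideal arithmetic coder driven by a predictive model spends about -log2(p) bits on a symbol the model assigned probability p, so better prediction is literally better compression.

    import math

    # code length under a predictive model equals its log loss (in bits)
    def code_length_bits(probs):
        # probs: the model's probability for each symbol that actually occurred
        return sum(-math.log2(p) for p in probs)

    print(code_length_bits([1 / 256] * 1000))  # uniform over bytes: 8000 bits
    print(code_length_bits([0.9] * 1000))      # confident correct model: ~152 bits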


First time I generated an image that had some iStockPhoto watermarks on it I started getting really uncomfortable.


A little delay at $50k an infringement can go a long, long way.

I'm not sure what the statute of limitations is for infringement, but I bet it is two years or more.


In the US it seems to be 5 years for criminal infringement, or 3 years for civil actions, according to Title 17, Chapter 5, Section 507: https://www.copyright.gov/title17/92chap5.html#507


There are two separate open US legal questions:

1) Does the use of copyrighted images in the OpenAI training model make the output a copyright infringement? This is not yet settled law. If I read a thousand books and write my own book, it's not an infringement, but if I copy and paste one page from each of a thousand books it would be.

2) Can an OpenAI generated image be copyrighted? Courts have ruled that an AI cannot hold a copyright itself, but whether or not the output can be copyrighted by the AI's "operator" depends a lot on how the AI is perceived as a tool vs. as a creator. Nobody would argue that Picasso can't hold a copyright because he used a brush as a tool, but courts have ruled that a photographer couldn't hold the rights to a photo that a monkey took with his equipment. The ruling here will probably stem from whether the AI is ruled to be a creator itself, which the human just pushes a button on, vs. a tool like a brush, which needs a lot of skill and creativity to operate.


Yeah, I think it might actually be a good thing to have some of the copyright questions settled one way or the other.


> if I copy and paste one page from each of a thousand books it would be.

It almost certainly would not be infringement. One page of text out of an entire work is very small. Amount and substantiality of the portion used in relation to the copyrighted work as a whole is one of the factors considered when making a fair use defense. This hypothetical book would also have zero effect on the potential market for the source books.

AI-generated images generally won't infringe on the training images because they don't substantially contain portions of the sources. If a generated image happened to be substantially similar to a particular source image, it could also affect the potential market for that source image. But it's also likely that there are tons of human-made images that are coincidentally also substantially similar and they're not infringing on each other; they're also probably in the AI's training set so good luck trying to make the case that your image in the training set is the one being infringed. On top of that, if the plaintiff could win, the actual market value of the source work is relevant to damages and that value is likely almost nothing so congratulations, you tied up the legal system just to get one AI-image taken down.

BTW, using all the images to train the AI is not itself infringement; I can't say there was no wrong-doing in the process of acquiring the images but using them to train the AI was not infringing on the copyrights of those images.

https://www.copyright.gov/fair-use/


A few seconds of a song is also very small, and yet it only takes a few notes in some cases for a court to find someone guilty of copyright infringement.

> On top of that, if the plaintiff could win, the actual market value of the source work is relevant to damages and that value is likely almost nothing so congratulations, you tied up the legal system just to get one AI-image taken down.

The Pirate Bay trial set a precedent that people can be found guilty of copyright infringement in the general sense, rather than for a specific case of a copyrighted work. The lawyers for the site tried to argue that the legal system should have been forced to first go to court over a specific work, with a specific person in mind who did the infringement, but the judges disagreed. According to the court, it was enough that infringement was likely to have occurred somewhere and somehow. A site like Getty could make the same argument: that infringement of their images is likely to have occurred somewhere, by someone, and the court could accept that as fact and continue on that basis.


Different domains of copyrightable material have different norms. The music industry, in response to sampling, has established that even small, recognizable snippets have marketable value and can therefore be infringed upon (it's also rife with case law featuring, in my opinion, bad wins by plaintiffs). For photographic images, collage is already an established art form and is generally considered transformative and fair use of the source images.

I'm not familiar with any Pirate Bay case; if you are referring to this one [0], it was in a Swedish court and I'm not familiar with Swedish copyright law. However, the first sentence says the charge was promoting infringement, not that they were engaged in infringement themselves. I don't think that's relevant to what I was replying to, but it could be very relevant to Getty Images's decision: if AI-generated content is infringing, they don't want to be accused of promoting infringement. There's undoubtedly already infringement taking place on Getty Images, but likely at such a small scale that the organization itself is not put at risk.

[0] https://en.wikipedia.org/wiki/The_Pirate_Bay_trial


It is the correct case, but the "promoting" might be a bit of a translation issue. They were found guilty of aiding and enabling infringement, but, as I describe above, not for any specific infringement of a specific work, rather for the act of infringement in a general sense.

There have been cases where "enabling" infringement was argued by US prosecutors, but I don't know what the legal status of that argument is. I have heard lawyers argue that nothing in copyright law explicitly forbids enabling, but that was mostly around 2010.


It will be interesting to see whether the norms change.

It's my faint memory that for music, when it was just some kids looping breaks on vinyl printed in the hundreds or low thousands, the sampling was fine or at least not obviously an issue; but then copyright-holders saw that those artists and their management had started bringing in real money, and the laws were clarified.


I predict they'll lose because any of the existing contenders floats effortlessly over the 'transformativity' hurdle.

While I'm worried about the impact of widely deployed AI on commercial artists, musicians etc. and don't think many developers have really come to grips with the implications and possibilities for all fields, including their own, I feel nothing but amusement at the grim prospects of commercial image brokers who have spent years collecting rent on the creativity of others.


It seems clear that such training of AIs requires copying an image onto a computer system in which the training algorithms are performed. Maybe that fits in Fair Use (I doubt it: it's commercial and harms the original creators) but it certainly doesn't fit in Fair Dealing (in UK).

I certainly, personally, approve of weak copyright laws that allow for things like training AIs without getting permission; neither the USA, the EU, nor the UK seems to have such legislation, however.

This is all personal opinion and not legal advice, nor related to my employment.


How is this any different from training artists? Other art is copied into their brains and they transform it to create something new.


Yes, but a computer is not a person - and copyright law is not very friendly to machine generated output.


> It seems clear that such training of AIs requires copying an image onto a computer system in which the training algorithms are performed

This is how web browsers work. If you go to https://www.gettyimages.nl/ your browser will copy the images from their server to your computer.
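To make that concrete, here's a rough sketch in Python (using the requests library) of what "going to a page" entails:

    import requests  # pip install requests

    # A browser does essentially this for the page and for every <img> on it:
    # copy bytes from the server into local memory (and usually a disk cache).
    html = requests.get("https://www.gettyimages.nl/").text
    # Each image the page references is then fetched the same way. At the byte
    # level, "viewing" an image and copying it are the same operation.

Transient copies like this are exactly the kind of thing copyright law has already had to carve out room for.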


(IANAL) Use of copyrighted materials for AI training is explicitly allowed without permission in Japan law since 2019. https://storialaw.jp/en/service/bigdata/bigdata-12


The AI doesn't actually need the image. It needs a two dimensional array that represents the pixel values. I am sure there are some very clever ways to get around that hurdle if that is where the bar is set.


Well, if you can create the array without using the image then you're golden; but if you're not using the image then you're not training on the image. If you are using the image, then you're deriving (at least) a representation from the image.

IME, limited as it is, courts don't take kindly to overly clever attempts to differentiate things from what everyone agrees they are; like 'the file isn't the image, so I can copy that', judges aren't stupid.


I'm curious how a "two dimensional pixel array" doesn't correspond to a picture.


It's not a pirated image/film/book/music - it's just a (very long) array of numbers!


> It needs a two dimensional array that represents the pixel values.

That's an image…

Are you saying that if I shut down the screen and the image isn't shown, I can copy it and send it around because it isn't an image but some numbers?
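For anyone who doubts that, a minimal sketch (Python with PIL/numpy, assuming some photo.jpg on disk):

    from PIL import Image  # pip install pillow numpy
    import numpy as np

    img = Image.open("photo.jpg")   # "the image"
    arr = np.asarray(img)           # "just an array of numbers"
    print(arr.shape, arr.dtype)     # e.g. (512, 512, 3) uint8

    # The round trip is lossless: the array *is* the image.
    Image.fromarray(arr).save("copy.png")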


They realized immediately that AI has reached a disruptive point for the stock photography industry, just like digital composition changed the rules years ago.

Probably their value in the advertisement production chain is going to become close to zero in a few years, and they will try to stop it, or at least slow it down.

But once we have open-source models released to the public, I cannot see how local legislation can have any impact at all.


They'll still be staggeringly rich, their wealth level just won't be automatically accelerating any more. If they litigate over it I doubt jurors will feel much of their pain.


I don't think this would be a criminal trial, so it wouldn't go before a jury at all.


I’m going against the grain but we should also recognize that it takes real effort and time to professionally take stock photographs and Getty has a lot of consignments. They travel, they go to junkyards, they search for subjects to take photos of. Catalog them. All this isn’t free and these people would need to put food on the table.

This isn’t similar to “Elevator operators were obsolete after they had a control panel and elevators became reliable for people to feel safe. So their jobs must go”.

Here, it is more like “High quality stock photography would become scarce”. There are free stock photography resources but IMO Getty’s photos are on another level.

That said, Getty has a history of being ridiculously protective and is a litigation powerhouse. They have a lot of lawyers.


I don't think most stock image users really care all that much. They just want a picture of someone pointing at a whiteboard which AI will generate pretty well. And then people with a little more talent will use AI plus their knowledge of good photography to tweak and tune the output to generate high quality stuff at a fraction of the cost of going out and taking real photos.


In addition, Getty won't want to be liable for hosting something on which a third party holds copyright and hasn't assigned it to the submitter. They'll have plenty of liability and they have deep pockets.


Is it copyright infringement to experience copyrighted material and make new art based on those experiences? Of course not. The end here is inevitable. Even if some really backward thinking judgements go through, eventually it will wash out. In 20 years AI will be generating absurd amounts of original content.


> Is it copyright infringement to experience copyrighted material and make new art based on those experiences?

Yes, covered under "derivative work": https://www.copyright.gov/circs/circ14.pdf

> A derivative work is a work based on or derived from one or more already existing works. Common derivative works include translations, ..., art reproductions, abridgments, and condensations of preexisting works. Another common type of derivative work is a “new edition” of a preexisting work in which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work.

> In any case where a copyrighted work is used without the permission of the copyright owner, copyright protection will not extend to any part of the work in which such material has been used unlawfully. The unauthorized adaptation of a work may constitute copyright infringement.


This is being downvoted, so I'm worried my understanding is incorrect. Could anyone explain how this is wrong? IANAL.


So by this logic, Deadly Premonition and Mizzurna Falls are derivative works of Twin Peaks?


"inspired by"


They should sue; it would clarify things a bit, and the matter would be settled for years to come.

What is certain is that AI-powered image creation is here to stay.


But this will decide whether it's democratised or locked-up by those who own large bodies of training data.


Stable Diffusion is already out in the world. The cat is out of the bag.


Yah, but think of Napster eventually getting usurped by Spotify. The danger is that it's legally no longer possible to update the models (which are very expensive to train), and we end up with only Disney having a copyright hoard large enough to train decent models, let alone good ones...


They are indeed hard to train, but groups like EleutherAI have already basically crowdsourced training LLMs successfully, so it's absolutely doable for non-corporate/academic/government entities to train high-end models like this.


The models are expensive to train right now, but I suspect in 10 years, anyone with a multi gpu rig could train the equivalent of Stable Diffusion.


But in ten years the state of the art will be something better than Stable Diffusion.


China.


Should’ve said, everywhere except the US. I’d point to classics like Luxembourg or Sweden?


I first thought that if Getty were to sue anyone, it would be those actually publishing and distributing works derived from copyright material (i.e. end users) rather than the providers of mere tools.

But I wonder if they could make the argument that the ML model itself is a "derivative work" (I don't agree that it is, but I can see the case being made). That would be a heck of a court case and resurface a lot of the "illegal number" (https://en.wikipedia.org/wiki/Illegal_number) stuff again.


Most of the usage I've seen or even tried myself is like "Sonic the hedgehog doing a kickflip". It kind of makes it obvious, in my opinion, that yeah, this is pretty much not right. Even worse, I'm seeing things like "Sonic the hedgehog artstation in the style of (artist xyz)". Is it just ripping artists' images from ArtStation without explicit permission?


It's my understanding that a lot of branded material (like "Sonic the Hedgehog") was filtered out of the training data so that the copyright challenges were limited to small holders who can't fight back, instead of large holders like Sega and Disney.

So any prompts expecting copyrighted characters are going to end up weird because of the lack of training data.


I believe the more important issue is that material generated through AI is not copyrightable by the author of such images.

If you are an artist you can't claim any copyright on what you're generating.

If you're not the copyright holder, it follows that you can't sell it, and you can't complain if someone else copies it (verbatim) and sells it.


The criterion for a work being copyrightable is literally the slightest touch of creativity, and I'm quite certain that writing a prompt and selecting a result out of a bunch of random seeds would qualify for that.

Fully automated mass creation would get excluded, as would any attempt to assign that copyright to a non-human entity, but all the artwork I've seen generated by people should be copyrightable - the main debatable question is whether it infringes on the copyright of the training data.

On a different note, I'd argue that the models themselves (the large model parameters, as opposed to the source code of the system) are not copyrightable works, being the result of a mechanistic transformation, as pure 'sweat of the brow' (i.e. time, effort and cost of training them) does not suffice for copyright protection, no matter how large.


> The criteria for a work being copyrightable literally is the slightest touch of creativity

If a prompt is copyrightable, that's a problem. Because it's just words. Recipes should be in the same league then.

If I can get the same output with a slightly different prompt, how would you protect your works?

If I copy your output, how can you protect your works, given that the output depends on something not copyrightable? (as per your statement, which I agree with)

Look at it this way: If I make something out of a Spirograph, is it a copyrightable work?


5. Getty's pictures were likely trained on without permission [0]. So they might well be considering a lawsuit.

[0] https://news.ycombinator.com/item?id=32573523


Why would it be illegal to train a model on their free samples? I thought their business was paying to remove the watermark?


The "free samples" are still copyrighted by the artist. Adding a watermark to it doesn't remove the copyright and arguably, adding the copyright doesn't even create a new work.

Their business is hosting, indexing, and managing the licensing for art that has been submitted to them and licensed to another party.


And they're available for public consumption at the website, albeit at reduced quality.

Are the images part of the distributed data set? I thought it was values/coefficients that manifest from the algorithmic analysis of the source image?
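My mental model is that training only ever touches the weights, something like this toy sketch (PyTorch; purely illustrative, not the actual diffusion objective):

    import torch

    image = torch.rand(3, 8, 8)                    # stand-in training image
    model = torch.nn.Linear(3 * 8 * 8, 3 * 8 * 8)  # stand-in "network"
    opt = torch.optim.SGD(model.parameters(), lr=0.01)

    noisy = image + 0.1 * torch.randn_like(image)
    loss = (model(noisy.flatten()) - image.flatten()).pow(2).mean()
    loss.backward()   # gradients flow to the weights only
    opt.step()        # weights get nudged slightly...
    del image, noisy  # ...and the image itself is discarded

So the image influences the coefficients but is never stored as such - or is that wrong?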


Yes... ish.

On one hand, if you take "this is the size of the net" and divide it by the number of training images, it's a rather small amount of storage per image.

On the other hand, when I was playing with stable diffusion on the command line following the instructions of https://replicate.com/blog/run-stable-diffusion-on-m1-mac

python scripts/txt2img.py --prompt "wolf with bling walking down a street" --n_samples 6 --n_iter 1 --plms

I got: https://imgur.com/a/N1OufD1

Now, you tell me if there's a copyrighted image encoded in that data set or not.


You have the version that filters out NSFW images based on keywords. The code literally replaces images it thinks are NSFW with Rick Astley. Copyright aside (yes, it's probably wrong to hard-code an image of Rick Astley into the actual Stable Diffusion git repository), that image is not contained in the weights of the model.

- edit - please god tell me this is not an elaborate rick roll :)


It's not... though if

    stable-diffusion % python scripts/txt2img.py --prompt "Rick Astley Never Gonna Give You Up" --n_samples 1 --n_iter 1 --plms
is such that it triggers NSFW sometimes, then... I'm... let's say "confused" about what entails NSFW prompts.

(digging through scroll back)

    Creating invisible watermark encoder (see https://github.com/ShieldMnt/invisible-watermark)...
    Sampling:   0%|                                           | 0/1 [00:00<?, ?it/sData shape for PLMS sampling is (1, 4, 64, 64)             | 0/1 [00:00<?, ?it/s]
    Running PLMS Sampling with 50 timesteps
    PLMS Sampler: 100%|| 50/50 [03:23<00:00,  4.06s/it]
    Potential NSFW content was detected in one or more images. A black image will be returned instead. Try again with a different prompt and/or seed.:00,  3.99s/it]
    data: 100%|| 1/1 [03:29<00:00, 209.20s/it]
    Sampling: 100%|| 1/1 [03:29<00:00, 209.20s/it]
    Your samples are ready and waiting for you here: 
    outputs/txt2img-samples 
Apparently you're right... though the "black image" is a poor description of the image.


Yeah, I see noisy images in your output; it may just be glitching. I may have poorly described how it works because I'm not fully sure; it may be an NSFW image-detection model rather than something keyed off the prompt. Either way you can disable it in code; I tried it.
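From memory (function name may be slightly off), the stock check_safety() in scripts/txt2img.py runs a CLIP-based NSFW classifier and swaps flagged images for the hard-coded replacement, so a no-op stub skips it entirely:

    # Replace the original check_safety() in scripts/txt2img.py with a no-op.
    # It must keep the same return shape: (images, per-image NSFW flags).
    def check_safety(x_image):
        return x_image, [False] * len(x_image)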


That's a very interesting result. Did you happen to capture the seed for either of those first two images? It would be interesting to try to reproduce.


Alas no. And I haven't been able to tickle it again in the right way to get those images out.

The invocation of that run is still in my scroll back:

    (venv) shagie@MacM1 stable-diffusion % python scripts/txt2img.py --prompt "wolf with bling walking down a street" --n_samples 6 --n_iter 1 --plms
    Global seed set to 42
    Loading model from models/ldm/stable-diffusion-v1/model.ckpt
    Global Step: 470000
    LatentDiffusion: Running in eps-prediction mode
    DiffusionWrapper has 859.52 M params.
    making attention of type 'vanilla' with 512 in_channels
    Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
    making attention of type 'vanilla' with 512 in_channels
That's the only spot I see the seed mentioned and then it goes on with lots of other logging but nothing seed related that would indicate a way to reproduce it.

---

(late edit) you can fairly accurately (so far 1 image out of 20) get that image out with the prompt "Rick Astley Never Gonna Give You Up"
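For what it's worth, that "Global seed set to 42" line is the script's default seed, and if I remember the flags right you can pin it explicitly, e.g.:

    python scripts/txt2img.py --prompt "wolf with bling walking down a street" --n_samples 6 --n_iter 1 --plms --seed 42

Rerunning the identical command on the same hardware should then be deterministic.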


I'm thus far unable to reproduce it.

Given:

Rick Astley Never Gonna Give You Up Steps: 20, Sampler: PLMS, CFG scale: 7, Seed: 4231695436, Size: 512x512, Batch size: 2, Batch pos: 0

I ran a couple of batches of 32 (64 images total): https://imgur.com/a/74IbCuD

(The images with the nonsensical but obvious Impact font that was learned from memes are quite funny, though)

If you can get a full set of parameters (size, sampler, seed, prompt, cfg scale) then I should hopefully be able to reproduce your results, though.


So what? Me drawing the exact same image based on a Getty image doesn't violate anything, and everything the AI is doing is massively derivative, so I don't see how they could possibly have a case.


Creating a drawing based on an image clearly falls under the existing definition of a derivative work.

The "what is a model" and "what is the copyright status of the output of the model" are questions that have yet to be settled from the legal standpoint.

That Getty has images available for viewing with a watermark, and that the watermark is sort of reproduced in some model-generated results, suggests that the model was trained on images that were not licensed the way the people who created the model claimed.

I'll also point to the "I created images from Stable Diffusion that are clearly the cover image from 'Never Gonna Give You Up'" example above, which suggests that images aren't as impossible to extract from a model as one would believe.

Copyright and derivative works are ultimately the domain of humans looking at laws, not deterministic machines. The case is argued by humans and before humans. A lawyer can and will make the case that the model itself is a derivative work, and that the images it produces can be identified as mechanical modifications of existing works and are therefore derivative themselves - just as a photograph of a painting is a derivative work of the painting.

If the output of the ML model can be identified as having major copyrightable elements from an existing original work, then it is derivative - no matter how it got there.

So, returning to your question. If you draw the exact same image based on an image hosted and licensed by Getty - it certainly will be a derivative work and violate copyright.


Getty already ruined Google images by not letting them link directly to the image, I’m going to be massively pissed off if they try and ruin this ecosystem as well.


[flagged]


So funny how ppl are defending Getty when they don’t know their history…


It's always weird to see the contrast between HN's reaction to copyright questions about text/image generation, and HN's reaction when it's code generation.

When a model is trained on 'all-rights-reserved' content like most image datasets, the community say it's fair game. But when it's 'just-a-few-rights-reserved' content like GPL code, apparently the community says that crosses a line?

Realistically, this tells me that we need ways for people to share things along the lines of all the open-source licenses we see.

You could imagine a GPL-like license being really good for the community/ecosystem: "If you train on this content, you have to release the model."


> When a model is trained on 'all-rights-reserved' content like most image datasets, the community say it's fair game. But when it's 'just-a-few-rights-reserved' content like GPL code, apparently the community says that crosses a line?

A) This is just taking divided opinion and treating it like a person with a contradictory opinion (as others have noted).

B) Nothing about GPL makes it "less copyrighted". Acting like a commercial copyright is "stronger" because it doesn't immediately grant certain uses is false and needs to be challenged whenever the claim is made.

C) If anything, I suspect image generation is going to be practically more problematic for users - you'll be exhibiting a result that might be very similar to a copyrighted training image, might contain a watermark, etc. If you paste a big piece of GPL'd code into your commercial source code, it won't necessarily be obvious once the thing is compiled (though there are whistleblowers etc., don't do it).


> Nothing about GPL makes it "less copyrighted". Acting like a commercial copyright is "stronger" because it doesn't immediately grant certain uses is false and needs to be challenged whenever the claim is made.

GPL says "you have a license to use it if you do XYZ." The alternative is "you have no license to use it." How is that not strictly "stronger?"


The GPL is as strong as a commercial license in the sense that the conditions it does specify are exactly as legally binding as those of a commercial license.


Right, but a license is more permissive than no license.


If this was a website for artists instead of programmers you'd see the exact opposite pattern. Unfortunately, people only seem to care when it threatens their own livelihood, not when it threatens that of the people around them.


I don't care either way. If an AI can do your programmer job, it means you aren't using your brain enough.

An AI can probably do half of my day job because it's stupidly repetitive. Leadership imposes old ways, and they dismiss anything "new" (i.e. newer than 2005). For example, writing all these high-level data pipelines and even web backends in C++ despite having no special performance need for it. Even though I'm not literally copy-pasting code, I'm copy-pasting something in my mind only a little higher-level than that, then relying on some procedural and muscle memory to pump it out. It's a skill that anyone can learn, just takes time. If I didn't have side projects, I'd forget what it's like to think about my code.

Some old-school programmers complain about kids with high-level languages doing their job more efficiently, so they work on lower-level stuff instead. It's been that way for decades. Now AI is knocking on that door. But before AI, C was the high-level thing, and we got compiler optimizations obviating much of the need for asm expertise, undoubtedly pissing off some who really invested in that skillset. If I'm working on something needing the performance guarantees of C or Asm, and the computer can assist me, I'm all for it. Please take this repetitive job so I can use my brain instead.

And the copyright thing is just an excuse. Programmers usually don't give a darn about copyright other than being legally obligated to comply with it. So much of programming is copy-paste. GPL had its day, and it makes less and less sense as services take over. GPL locks small-time devs out of including the code in a for-profit project but does nothing to stop big corps from using it in a SaaS. The biggest irony is how Microsoft not only uses GPL'd code for profit but also ships WSL, all legally.


The average HN denizen gets so pissed off when I ask them to save their comment rejoicing in the inevitability of the elimination of my entire field, and take it back out when next year's descendant of Copilot gives them the same horrible sinking feeling that these things give me.


If Copilot can replace my job, I think that job should be replaced by Copilot. I don't think that saving jobs should be a reason to hinder progress. I hope that what I contribute to my company is more than whatever future version of Copilot can create, but if not I will try to find another career.

Will I be sad if I lose my job to AI/ML? Yeah, probably, but at some fundamental level that's why I've always tried to keep myself up to date with stuff that's harder to automate.


People don't like being told no.

The vast majority of all-rights-reserved content is either not licensable, or not licensable at a price that anyone would be willing to pay or can afford. Ergo we[0] would much rather see more opportunities to use the work without needing permission, because we will never have permission.

When getting permission is reasonable then people are willing to defend the system. And code is much more likely to be licensable than art.

I still think the "Copilot is GPL evasion" argument is bad, though.

[0] As in the average HN user


That's a good way to look at it: The difference between "no" and "yes if XYZ."

Maybe it's that people respect "yes if XYZ" more than "no" because there's some path to yes that way. In that case, it really does speak to the need for some open-ish text and image licenses, like "you can use my image in a model if you share the model with me."


> When a model is trained on 'all-rights-reserved' content like most image datasets, the community say it's fair game. But when it's 'just-a-few-rights-reserved' content like GPL code, apparently the community says that crosses a line?

I don't think this is right. I think different people have different views, and you're just assuming that the same people have contradictory views.


I'm not assuming that, but I see how it could read that way, since I'm being fast and loose with the language. The community (anthropomorphizing the blob again) definitely empirically reacts very differently to the two topics.


To me the difference is this: https://news.ycombinator.com/item?id=27710287

It's possible for generation models to perfectly memorize and reproduce training data, at which point I view them as a sort of indexed, slightly-lossy compression. But that's almost never happening with image generation: the models are too small to memorize billions of pictures, so they can't produce copies.

Stable diffusion 1.4 has been shrunk to around 4.3GB, and has around 900 million parameters.

I don't know how big Copilot is, but a relatively recently released 20 billion parameter language model is over 40GB. ( https://huggingface.co/EleutherAI/gpt-neox-20b/tree/main ) GPT-3, according to OpenAI, is 175 billion parameters.

It's possible there are some images in there you can pull out exactly as is from the training data, if they were to appear enough times, like I suspect the Mona Lisa could be almost identically reconstructed, but it would take a lot of random generation. I'm trying it now and most of the images are cropped, colors blown out, wrong number of hands or fingers, eyes are wrong, etc.
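The arithmetic is worth doing explicitly (Python; the training-set size is my assumption, roughly LAION-scale):

    model_bytes = 4.3e9                 # ~4.3 GB checkpoint, per above
    train_images = 2e9                  # assumed ~2 billion training images
    print(model_bytes / train_images)   # ~2.2 bytes per training image

A couple of bytes per image leaves no room for general memorization; only images that recur many times in the training set (the Mona Lisa, famous album covers) can get anywhere near being stored.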


that's bc hn has a lot of gpl zealots who want special rules because they believe a viral license is a "fundamental good" and closing stuff off via copyright is the opposite. i don't agree and think viral licenses suck, but it's not an unpopular opinion on here.


Is there a difference here?

With code, you literally copy it and put it in a device and fail to provide the source, breaking the GPL.

With a copyrighted set of images, you scan those and break them down into some set of data and then never need to actually copy the images themselves -- my understanding anyway.

Does that set of data contain copied works? Or is it just a set of notes about the works that have been viewed by the AI?


The difference seems like semantics. You're taking copyrighted data, encoding it, then deriving a work from the output of that encoded data.

If I compress copyrighted works and redistribute them as my own, I'm technically breaking them down into some set of data and never actually distributing the copyrighted work, right?
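A trivial sketch of why "it's been broken down into data" isn't a defense by itself (Python, assuming some novel.txt):

    import zlib

    original = open("novel.txt", "rb").read()  # a copyrighted work
    blob = zlib.compress(original)             # now "just a set of data"
    assert zlib.decompress(blob) == original   # ...that yields the work back bit-for-bit

The legally interesting question is whether a trained model is more like this, or more like notes about what was seen.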


> You could imagine a GPL-like license being really good for the community/ecosystem: "If you train on this content, you have to release the model."

The issue with that is that it is perfectly legal to ignore the license, if you use the copyrighted work in a transformative way.

It doesn't matter what the license says, if it is legal to ignore the license.


That seems to be the status quo, but if huge companies benefit from the electorate's work and use that to put them out of jobs, I wouldn't be surprised if the law changes.


There's no community consensus on either of them, so I'm not sure how you're trying to draw a contrast. There are people who think both are wrong, people who think both are right, and everything in between.


Email this morning:

AI Generated Content

Effective immediately, Getty Images will cease to accept all submissions created using AI generative models (e.g., Stable Diffusion, Dall‑E 2, MidJourney, etc.) and prior submissions utilizing such models will be removed.

There are open questions with respect to the copyright of outputs from these models and there are unaddressed rights issues with respect to the underlying imagery and metadata used to train these models.

These changes do not prevent the submission of 3D renders and do not impact the use of digital editing tools (e.g., Photoshop, Illustrator, etc.) with respect to modifying and creating imagery.

Best wishes,

Getty Images | iStock


What if someone uses the Stable Diffusion editor plugin for photoshop, etc?


I don't see how that's an issue. Using photoshop doesn't automatically allow you to post the images. You can post the images IF it was not made/edited by an AI model, regardless of if photoshop was used.

If you use the Stable Diffusion plugin, then you're using stable diffusion, and therefore can't post the image. It doesn't matter at all that you used the photoshop plugin.


A lot of Photoshop's built-in tools could arguably qualify as "AI models" (think things like content-aware fill!). I don't think Getty would say that using them would disqualify your image, but isn't that essentially the same thing, except ultra high-powered?


Except they're not. Content aware fill uses only your image as the input and runs space-time video completion. Since it doesn't have to train on datasets, it's not part of the conversation.


LOL how will they even know? Are they going to ask for every .psd and read the entire edit history? what's to stop you making images all day in SD, and when you find one you like, just tracing over it in a new layer?


Ignoring the philosophical ambiguity, their announcement specifically says this policy change doesn't apply to pictures made with Photoshop.


It’s funny because, I can make an image and they wouldn’t know it’s A.i generated.


You can also plagiarize someone else's work and submit it as your own, and they won't notice. But if they find out, it'll be removed and there will be some punitive action. I assume it's the same enforcement model in this case.


>You can also plagiarize someone else's work and submit it as your own, and they won't notice.

Or just take an image that is already in public domain and just slap a gettyimages® watermark on it...



That's Getty's business model.


Why is this voted down? It's true. How can this be in actuality prevented?

Changing times. Not even imagery or art can be trusted. A whole range of human creativity based occupations is approaching the border of redundancy.

You can argue that they may never arrive at that border, but there is no arguing that artists are, for the first time in the history of humankind, approaching that border of redundancy.

And what does this mean for every other human occupation? The AI may be primitive now, but this is still just the beginning, and it is certainly possible that we are approaching a future where it produces things BETTER than what a human produces.


> How can this be in actuality prevented?

By abolishing copyright and making crediting the author a habit. There is no way copyrights can survive the AI. There is no way human civilization will let 200 year old legal practices hold back technological advancement.


> There is no way human civilization will let 200 year old legal practices hold back technological advancement.

Human civilization has kings and queens, and laws based on the sayings of ancient prophets.

Instead of trying to figure out what "human civilization" will accept, figuring out what current wealthy capital-owners will accept will be more predictive.


Even if wealthy capital-owners get AI banned where they can, the countries where they don't hold power will get ahead by not banning it.


I think there's a strong case for arguing that we may actually completely ban AI and any sufficiently strong ML algorithm. AI hasn't even realised a millionth of its potential yet, and it's already running rings around humans (cf. algorithmic feeds driving political conflicts online). I think potentially it will cease to be tolerated and be treated a bit like WMDs.


> we may actually completely ban AI and any sufficiently strong ML algorithm.

Who is 'we'? The US, or maybe Anglo-American countries, may do it. Many countries won't, and those that don't will get ahead of everyone else.


At what cost? Perhaps the WMD comparison continues to work here.


How are you going to un-invent it? Will this involve confiscating GPUs or criminalizing owning more than 1 of them? The thing is much like the problem of gun control in a warzone; weapons are just not that hard to make especially if you have a surplus of parts.


First time I've heard that line of thinking. How and when do you think that'll happen?


Okay, so there's a sense in which AI essentially destroys knowledge culture by performing a reductio ad absurdum on it.

Examples:

1) Social content. We start with friend feeds (FB), they become algorithmic, and eventually are replaced entirely with algorithmic recommendations (Tiktok), which escalate in an AI-fuelled arms race creating increasingly compulsive generated content (or an AI manipulates people into generating that content for it). Regardless, it becomes apparent that the eventual infinitely engaging result is bad for humans.

2) Social posting. It becomes increasingly impossible to distinguish a bot from a human on Twitter et al. People realise that they're spending their time having passionate debates with machines whose job it is to outrage them. We realise that the chance of someone we meet online being human is 1/1000 and the other 999 are propaganda-advertising machines so sophisticated that we can't actually resist their techniques. [Arguably the world is going so totally nuts right now because this is already happening - the tail is wagging the dog. AI is creating a culture which optimises for AIs; an AI-Corporate Complex?]

3) Art and Music. These become unavoidably engaging. See 1 and 2 above.

This can be applied to any field of the knowledge economy. AI conducts an end-run around human nature - in fact huge networks of interacting AIs and corporations do it, and there are three possible outcomes:

1) We become inured to it, and switch off from the internet.

2) We realise in time how bad it is, but can't trust ourselves, so we ban it.

3) We become puppets driven by intelligences orders of magnitude more sophisticated than us to mine resources in order to keep them running.

History says that it would really be some combination of the above, but AI is self-reinforcing, so I'm not sure that can be relied upon. We may put strong limits on the behaviour and generality of AIs, and how they communicate and interact.

There will definitely be jobs in AI reeducation and inquisition; those are probably already a thing.


> We become puppets driven by intelligences orders of magnitude more sophisticated than us to mine resources in order to keep them running

What do you think corporations are?


Well, typically not cleverer than most people, until you combine them with AI.


I am beginning to feel like the Butlerian Jihad may have had the right idea.


Because it's a legal thing, not a practical thing. If they're preparing a lawsuit, they want to show the court that they forbid people from uploading AI-generated images. It's a rule without real enforcement.


>Why is this voted down? It's true. How can this be in actuality prevented?

It's no different than submitting a plagiarized essay to a teacher. Yeah, it's often hard to detect/prevent such submissions and you could even get away with it. But if you get caught, you'll still get in trouble and it will be removed.


This is different from plagiarism, because with plagiarism there is an original that can be compared against: a specific work and person against whom an infraction was committed.

In AI-produced artwork, the artwork is genuinely original and possibly better than what other humans can produce. No one was actually harmed. Thus it actually offers true value, possibly better value than what a human can produce.

It displaces humanity and that is horrifying, but technically no crime was committed, and nothing was plagiarized.


You asked why someone was downvoted for laughing about how AI-generated content is hard to detect/can still be submitted, and then asked, "How can this be in actuality prevented?". The comparison I was making was the comparison to the process of submitting plagiarized content, not whether or not AI-generated content and plagiarized works are the same thing.


>No one was actually harmed. //

In a couple of years, unchecked, many artists will be out of work and companies like Getty will be making a lot less revenue. That's "harm" in legal terms.

On a previous SD story someone noted they create recipe pictures using an AI instead of using a stock service.

These sorts of developments are great, IMO, but we have to democratise the benefits.


That's like me creating a car that's faster and better than other cars and more energy efficient.

Would it be legal to ban Tesla because it harms the car industry through disruption? AI-generated art is disrupting the art industry by creating original art that's cheaper and, in the future, possibly better. Why should we ban that?

By harm I mean direct harm: theft of ideas and direct copying. Harm through creating a better product is different; morally I think there is nothing wrong with that.

Practically we may have to do it, but this is like banning marijuana.


Yeah, the law isn't morality.

>Why should we ban that? //

Well, we have copyright law, so I was starting there, though I'd personally be pretty interested in a state that made copyright very minimal. The question is what type of encouragement we want in law for the creative arts; I doubt any individual would create the copyright law that lobbying has left us with, but equally I think most people would want to protect human artists somewhat when AIs come to eat their lunch and want to make a plate out of the human artists' previous work [it's like a whole sector is getting used to train their replacement, but they're not even being paid whilst they do it].


Copyright doesn't apply. Nothing was copied.

The AI was trained in the same way you train your brain when you look at something. Nothing is copied. Nothing immoral was done. No law was broken.


When I look at something I create an image of it on my retina. When a computer "looks", it creates an image somewhere. The former is allowed; the latter comes under copyright scrutiny. Things like caching images to show you a webpage have been addressed by copyright law (through precedent), and this will be similarly addressed.


The image is not saved in a neural network. No identical image can be extracted from the network.


Yes, it's a derivative that relies on use of the copyright works. You can't create a NN without using a copy of the work, so copyright applies -- there might be a Fair Use exception in USA but the outputs compete with the original creators of the works and so IMO courts are likely to rule it as non-Fair Use.


>Yes, it's a derivative that relies on use of the copyright works.

Almost every idea on the face of the earth is a derivative of something else. This includes ideas from a Human brain so applying such laws is inconsistent.

>but the outputs compete with the original creators of the works

All art competes with other art. And all art is derivative of other art other things.


They can ask that you assert, under substantial penalty of assuming Getty's potential liabilities, that it was not AI generated.


And force them to prove that you lied AND that it actually infringes copyright AND there are actual damages. GL, brah.

Or we can just not use Getty anymore. GANNY Images is born. Only AI, no artists, and have that entity sue Getty for anything similar under same theory.

There is always a counter-play.


Doesn't matter, as far as they are concerned that's on you for breaking the terms and conditions. They're just 'making an effort' for legal reasons.


You can make an image and have it be AI generated (say using Photoshop's content aware fill), and it will be allowed. They are drawing a pretty arbitrary line.


Their legal worry probably makes sense, but my suspicious mind also feels like it's in their long-term interest not to open Pandora's box too much by letting AI art in. Isn't one of Getty's competitive advantages the relationships it has with (I imagine) hundreds of thousands of artists? If they let AI art in, that historic artist relationship suddenly means less (because a lot more people can now contribute), and they may end up competing against new and emerging low-cost AI art marketplaces. Not sure, just speculating about future scenarios and about not eroding one's own competitive moat.


The US Copyright Office asserts that AI generated images can’t be copyrighted. Getty lives and dies by copyright and artificial scarcity/control of image rights.

For stock images and non current/news events, Stable Diffuison and its successors are the future.


No they do not. They assert that the AI can't be the "author", it has to be a human.

It's exactly the same as trying to assign copyright to your camera. Even though it generated the image from photons, it was the human pressing the button that mattered.


https://www.smithsonianmag.com/smart-news/us-copyright-offic...

> An image generated through artificial intelligence lacked the “human authorship” necessary for protection

> Both in its 2019 decision and its decision this February, the USCO found the “human authorship” element was lacking and was wholly necessary to obtain a copyright, Engadget’s K. Holt wrote. Current copyright law only provides protections to “the fruits of intellectual labor” that “are founded in the creative powers of the [human] mind,” the USCO states. In his most recent appeal, Thaler argued this “human authorship” requirement was unconstitutional, but the USCO has proven unwilling to “depart from a century of copyright jurisprudence.”

Lots of posts on the topic here: https://hn.algolia.com/?q=copyright+office+ai


Take a look at the decision: https://www.copyright.gov/rulings-filings/review-board/docs/...

The person was still trying to get the AI marked as the owner. In fact, the application "does not assert that the Work was created with contribution from a human author," a framing the office went along with but did not actually agree or disagree with.

So it still says nothing about whether a human can hold copyright over an image they used an AI to make. It is another example of the AI itself being rejected as a copyright holder.


A distinction without a difference, since this was just someone who was hoping to be the beneficial owner of an AI with an enforceable copyright interest. Recall the failure of the photographer who allowed monkeys to play with his camera equipment, leading one of them to take a selfie photo that became famous.

The photographer asserted copyright on the basis that he had brought his camera there, befriended the monkeys, and set his equipment up in such a way that even a monkey could use it and get a quality image, but his claim to authorship was rejected and so he was unable to realize any profit from selling the photo - although I'm sure he made it up on speaking tours telling the story of how he got it.

To be sure, AI created art is done in response to a prompt provided by a human, but unless that human has done all the training and calculation of weights, they can't claim full ownership on the output from the model. There's a stronger case where a human supplies an image prompt and the textual input describes stylistic rather than structural content.


"Is an AI-created work copyrighted and owned by the AI?" and "Is an AI-created work copyrighted and owned by a human?" and " Is an AI-created work copyright infringement?" are three separate questions.

Just because the answer is no to the first doesn't mean the answer is no to the second, and the third is even more distinct. That's why this is an open question that educated lawyers make guesses about.


I feel like recent discussion of how much creativity it takes to produce good prompts for AIs like Stable Diffusion promotes the argument that a human is sufficiently involved in the process (sometimes!) to warrant a claim to ownership of the copyright. If they trained the AI then I think it would be a no-brainer in favour of the human being a creative input into the creation of the work.

Just my personal opinion.


Doesn't seem to apply to Stable Diffusion and Dall-E because there is substantial human work involved - picking and evolving the prompt and selecting the best result. Sometimes it's also collaging and masking. Maybe it could apply to making "variations" where you just have to click a button. But you still have to choose the subject image on which you do variations, and to pick the best one or scrap the lot.

It's completely different from a monkey stealing your camera and shooting some pictures. AI art depends on the intent and taste of the human, except when it's an almost copy of a training set example, but that can be filtered by software.

And if we look up from art, there are other fields that stand to benefit from AI image generation. For example, illustration for educational content, or ideas for products - shoes, clothes, cars, interior designs, maybe even cosplay costume design. It could be used to quickly create training data on a very specific topic, or to create de-biasing datasets (super balanced and diverse by design). A specially tuned version could function as a form of mood-inducing therapy - or a Rorschach test. As a generative model of images it is of interest for AI research. So copyright's gotta be weighed against all the other interests.


What if you were to... I dunno, create a community that lets in any anonymous person and talks about art. Generate Stable Diffusion prompts from their conversations. Then add an upvote system, so that rather than having an individual pick particular results, the best results generally filter to the top. You could even have the "upvotes" be based on dwell-times or something like that.


> a monkey stealing your camera

It's really not different, because while everyone remembers it that way, the photographer went to great lengths to facilitate the monkey selfie.

https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...


Advisedwang summarizes it below, but here is the key line from the ruling:

https://www.copyright.gov/rulings-filings/review-board/docs/...

"the author of the Work was identified as the “Creativity Machine,” with Thaler listed as the claimant alongside a transfer statement: “ownership of the machine.”

USCO is maintaining that computers/AI cannot hold copyright, only humans can.

If Thaler were to submit again with his own name, I wholeheartedly expect that they would accept it as long as they have no reason to believe it's been previously copyrighted or is not "sufficiently creative". Is that copyright legitimate? Someone would have to then challenge it in court!


In that case the human asserted he had absolutely 0 involvement in the matter, less than even using buttons on a camera.

If you say "I directed the computer" or "I gave the AI a prompt", that would be enough to support your copyright claims.


Wasn't there an issue a few years back about a photo taken by a monkey was non-copyrightable?


Yes and the ruling GP is citing is basically about that scenario but with a computer instead of a monkey.

The monkey pressed the buttons on the camera without human direction, and so was the "author". Since the monkey is not a human, no copyright existed.

In this case, the human claims the computer had no input at all, it did it by itself, hence no copyright. However we all know that computers can't do things by themselves in the same way as a monkey. The USCO accepts whatever explanation you give them though, so they had to go by the stated facts.

In any case where the human operator of the computer actually does want copyright of the images, all they have to do is say "I setup the computer to generate the images" and they will own the copyright.


“I setup the monkey to take the picture”



You still need computing power to generate those images, so definitely room for commercial activity there. Getty could precompute billions of images and enlarge their inventory.


Sure, but why should anyone waste time browsing their inventory when they can just make up their own?


How are you going to sell stock images if you can't copyright them??


One could sell ML API credits to generate the images. For sure, industry revenue would decline if, once an image is generated, you can't lock those specific bits up behind copyright.


Any marketplace will eventually deal with low quality at scale without serious intervention. Amazon. Github when there's financial incentive for commits. eBay. Your local flea market. etc


My guess is that they will want to exploit their archive to train their own commercial model. And to integrate AI into their existing product to modify their images.


Haha, Getty images should be planning a big pivot strategy, not worrying about AI content. I predict that within 5 years we will have this all refined to the point where we can generate almost any image we want, to our liking.

If you're shopping for a stock photo, you only have what's available. I've looked for things before, and sometimes you have lots of options which just aren't quite what you want. So you take "good enough". AI can already generate "good enough" with some prompt and parameter practice.


1. Train AI model using your massive library of stock images and offer image generation to your massive existing customer base.

2. Sue or C&D all other image generation services that can be shown to be using models trained in part with your images.

3. Profit!

It's really not a far step from "search for existing images with these keywords" to "generate image from these keywords/prompt". There's also the option to start with stock images, pick the closest, and have AI refine it to better match the user's vision, so now instead of "good enough" you have "perfect."
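That refinement flow already exists in rough form. A hedged sketch with the diffusers library (parameter names are from memory and vary across versions; older releases called the image argument init_image):

    from diffusers import StableDiffusionImg2ImgPipeline
    from PIL import Image
    import torch

    pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    # Start from the closest stock hit and nudge it toward the brief.
    init = Image.open("closest_stock_hit.jpg").convert("RGB").resize((512, 512))
    result = pipe(
        prompt="person pointing at whiteboard, warm office lighting",
        image=init,      # older diffusers versions: init_image=init
        strength=0.4,    # low strength = stay close to the original
    ).images[0]
    result.save("refined.png")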


> It's really not a far step from "search for existing images with these keywords" to "generate image from these keywords/prompt". There's also the option to start with stock images, pick the closest, and have AI refine it to better match the user's vision, so now instead of "good enough" you have "perfect."

This is what I expect to see soon, from many competing providers. There are more than enough free images to train from such that Getty doesn't really have an advantage (imo).

Unfortunately for them, I suspect they will follow the "big laggard corporation" playbook and use protectionist strategies instead of evolutionary. So they'll be late, and they'll lose significance.


> Getty images should be planning a big pivot strategy

They very well might be, with step 1 of the plan being "delay".


It does seem like much of Getty's business is redundant if you can trivially generate "photograph of person laughing while eating a bowl of salad"

Maybe the new business is "reasonably high degree of trust that if it's a Getty image it's not an AI fake" for news outlets and the like who want to sell trustworthiness
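And the "trivially generate" part really is already about this simple, at least as a sketch (diffusers library; model-access details omitted):

    from diffusers import StableDiffusionPipeline
    import torch

    pipe = StableDiffusionPipeline.from_pretrained(
        "CompVis/stable-diffusion-v1-4", torch_dtype=torch.float16
    ).to("cuda")

    img = pipe("photograph of a person laughing while eating a bowl of salad").images[0]
    img.save("stock_replacement.png")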


But if OpenAI's ability to create a photograph of a person laughing while eating a bowl of salad is due to mashing up these photos that were mainly from Getty Images, then Getty and its photographers would have a reasonable claim.


No, they wouldn't. Getty and its ilk would go to court and try to argue that they owned the concept of a portrait photograph if they could get away with it. They won't be able to produce specific images that match generated ones well enough to impress a jury at the emotional level. Their abstract arguments about model construction won't land because they're too abstract and, frankly, boring; the defense will just bore the jury slightly less by putting some neural-network nerd on the stand to say the opposite, and the jury will shrug and go with its gut.


How different[0] is that from a painter viewing half a dozen Getty images, then repainting them in combination to near (though not pixel-perfect) detail? Afaik[1] the hypothetical painter has not committed infringement on the source works and has put sufficient creative effort into the new one that it would be copyrightable.

[0]Granted it differs a little, but not by that much either.

[1]IANAL


Morally it seems to basically be the same to me, but in practice it's different. The AI training is done by big corporations that are traceable and have big pockets that can be sued whereas "inspiration" cannot be taken to court even if you feel like you were stolen from.


> "inspiration" cannot be taken to court even if you feel like you were stolen from.

Never doubt the ability of people's greed to force someone else into a courtroom.

Just look at the 'blurred lines' case where someone was inspired by a musical genre to write a song that didn't copy anyone's music but was still ordered to pay millions because he copied the general "feel" or "vibe" of someone else's work. (https://abovethelaw.com/2018/03/blurred-lines-can-you-copy-a...)

I imagine a lot of AI generated images might copy the "feel" or "vibe" of something in its training data.


Ohhh good callout.

I also wonder, if Getty wins, whether this will spur the creation of a new army of artists manually creating facsimiles of existing art expressly for the purpose of AI training.


The difference is that it's mechanical. Copyright is intended to protect creators, and it's unclear whether this type of fair use properly balances the original creators' needs against the ML-based creators' needs.


Your argument is that "painting is transformative enough", and I suspect that wouldn't hold up in court.

If you want an example of a painting that might be transformative enough, here is Ryder Ripps https://www.artsy.net/artwork/ryder-ripps-sup-1 who referenced copyrighted photos and recreated them in this "wavy style".


The history of law and courts have decided that "different things can be treated differently."


There are a lot of things (animals, buildings, locations, etc) that I'd bet I've only ever seen in (copyrighted) Getty images, but I could probably also paint new representations of from memory.

Would Getty also have a claim against me?


So, where is the line?

Diffusion models, Midjourney: not accepted. Okay that was the easy part.

What if I use an AI-powered sharpening tool like Sharpen AI? Technically that's adding "AI generation" to the image. What about Photoshop neural filters? What if I just extend or "touch up" an image using DALL-E or Midjourney and still have the original image, but with slight additions?

Probably what they mean is "majorly created with AI", but then how is "majorly" defined?


It's not just splitting hairs either. Training datasets also come into play with tools like resolution upscalers.


Yeah, I'm curious too. My first thought was the recent post about the Blender plugin for using Stable Diffusion to generate textures. If you made an elaborate 3D render, but most of your textures were AI-generated, is that excluded? The wording suggests it would be banned, but it's not detailed enough to be sure.


They are hedging their bets. Effectively the only time this will get resolved is if someone takes an AI artist to court for alleged infringement. Then and only then will the law be clarified wrt AI art.

Until then, Getty is betting their business on the parts of copyright law they know well to reduce risk.


Until paint was produced commercially during the Industrial Revolution (circa 1800), painters had to make their own paints by grinding pigment into oil.[1]

Photography drove painting deeper towards abstraction. [2]

I'm not unsympathetic but the AI revolution might be a similar revolution despite fiddling with code currently being much less pleasant than flinging industrially mass-produced paint from mass-produced tools in a sunlit studio. At least in the medium term, someone will still have to manage the machine.

[1] http://www.webexhibits.org/pigments/intro/paintings4.html

[2] http://www.peareylalbhawan.com/blog/2017/04/12/how-the-inven...


This is purely a PR move. They don't ban AI content because of fears of legal challenges, but because they see their entire business model falling to pieces. Why would anyone license images from them when they can instead generate any image for free? They only ban AI images to generate PR around "fears of legal challenges", in the hope that the message that AI-generated content could be a legal risk will stick in people's heads.


I suspect that Getty makes most of their money from licensing current interest photos to news orgs.

I don't think that particular business is going anywhere (I hope; the last thing we need is faked-up news images).

I think ShutterStock may be more threatened.


> The creators of AI image generators say the technology is legal...

Which means nothing really, they always make this claim, whether it's correct or not. There's too strong an ethos of an "ask for forgiveness, not permission" in the tech world.


>There's too strong an ethos of an "ask for forgiveness, not permission"

What's the alternative though? Waiting for permission could take decades while your competitors are eating your lunch.


The only winning move is not to play


The tech world shows us that the winning strategy is to just go ahead and do what you want, ideally with consistent financial backing, and pay our way out of any issues that arise.


Yet another example of when you have money, normal rules no longer apply to you.


That helps, but simply doing something that nobody has anticipated is often sufficient. If enough people like what you did, technicalities often fall by the wayside.


There is criminal and there is civil. Just because something is illegal (speeding, jaywalking) doesn't mean I'm some sort of psychopath for ignoring those laws - I make a calculation about how likely it is I'll be caught vs. time saved.

Similarly a business looks at a penalty and makes a judgement. This isn’t some insane immoral concept.


>There is criminal and there is civil.

Okay, but everything you said after that has nothing to do with that statement.

You choosing to break laws has nothing to do with criminal vs civil. It has everything to do with where your moral compass points. You choose to break rules because you've decided to do that based on whatever moral integrity you do/don't have.


Sure, but many laws are ridiculous and immoral. See slavery, segregation, anti-women laws, etc. People love to get on their moral high horse when it suits them ("Business bad!") but conveniently ignore it in other cases.


the only truly illegal thing is to not have enough money


I'm not a copyright lawyer or anything, but the way I look at it is that the big concern is that you can't easily prove that a particular AI output is not just a memorized copyrighted training example. So even if we assume that it is perfectly allowable to train your model on unlicensed images, that doesn't protect you if your model spits out a carbon copy (or something close enough to be infringing) of a copyrighted image.

A similar concern exists for things like Copilot, but it feels even harder to detect in the image domain.


Yeah, it's pretty easy to get Stable Diffusion to spit out images which are blatantly recognizable as slight distortions of real photographs or product images. I think "medieval market square" was a prompt which got me variations of one photo of a European city.

It's sophisticated software, but the analogy with teaching a human artist really doesn't hold. Ultimately it's making complex mashups, and the copyright law around that is not straightforward.


Great.

So who is now starting the Getty competitor that does accept these or (even better) accepts these and makes them available via CC license?

With good curation and tagging (easier since you have the text prompt that generated the content), you could probably disrupt Getty's entire business model in half a year.


This.

AI imagery is here to stay and will get better every day.

A service should either embrace it or they will lose a significant portion of their users/customers to another who does support AI content. While most of the AI content is not ready for prime time in terms of coherence and resolution, it's just a matter of time that it reaches (and quickly surpasses) traditional methods.


But it's not copyrightable. I guess you can lie and say you created it, but you didn't; it's computer generated. You created it no more than you created your house because you picked the layout and paint colors. There is no money in non-copyrightable computer-generated images.


The question of whether it's copyrightable is completely up in the air right now. To my mind, emitting it into the Creative Commons somewhat side steps the question.

Besides, there's all sorts of use cases where copyright is less relevant. Advertising agencies care less if their artwork gets copied because it puts it in front of more eyeballs.

> There is no money in non-copyrightable computer-generated images.

I'm a game developer but not an artist. If a computer can generate the artwork for my game, I completely get to sidestep paying an artist for that work. That's huge value. I would pay a subscription service to a database of AI generated content even if it couldn't be copyrighted.


The computer can generate artwork for your game, but only after training on other people's artwork that they spent effort and time creating. I think auto-generating art from other people's creations and hard work is wrong personally, unless they have specifically granted that permission when sharing their work. You are basically side-stepping paying for artists by laundering their work in my eyes.

If you trained only on things that granted you permission to do so, or if you bought a bunch of art works to train on this would be really cool.

I am interested in seeing how this turns out legally. Copyright law is intended to protect humans and their intellectual property, and these "AI" or "ML" systems are not humans, so I am not so confident that generating work from other people's is going to be legal.


> I think auto-generating art from other people's creations and hard work is wrong personally, unless they have specifically granted that permission when sharing their work.

I think this is going to be one of the more interesting aspects of any attempt to ban this sort of technology, because a system like Stable Diffusion is just a denoiser attached to an image fitness algorithm that has been trained on a bunch of input. And we certainly can't make training such algorithms illegal; if we do, YouTube loses the ability to police uploads to its site overnight, because such image recognizers can also be used to detect likely instances of copyright infringement.
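To make "just a denoiser" concrete, the core loop looks roughly like this. This is a toy sketch in Python; every name in it is illustrative, not the actual Stable Diffusion API:

    import numpy as np

    def generate(denoiser, prompt_embedding, steps=50):
        x = np.random.randn(64, 64, 4)               # start from pure noise
        for t in reversed(range(steps)):
            # The trained network predicts which part of x is noise,
            # conditioned on the text prompt.
            predicted_noise = denoiser(x, t, prompt_embedding)
            x = x - (1.0 / steps) * predicted_noise  # peel a little noise off
        return x  # a separate decoder turns this into pixels

    # Dummy stand-in so the sketch runs; a real model replaces this.
    out = generate(lambda x, t, p: 0.1 * x, prompt_embedding=None)
    print(out.shape)  # (64, 64, 4)

The prompt conditioning plays the "fitness" role: the denoiser is steered toward outputs that match the text. Nowhere does it paste stored images together.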


Bullshit.

I made a new image using a computer program that I was legally licensed to use.

The program might be Corel Draw.

It might be Blah Blah Diffusion Pro Plus.

Either way, I made the image and I own the copyright, unless some other contract was made between myself and the program's owner or my employer.


A machine operator does not own the copyright on the parts his machine stamps out even though he puts in inputs. GM's engineers can own the copyright on a car they design in CAD.

If you put creative inputs into a tool, the result is copyrightable (a car's design). If all you did was say "give me XYZ widget" (in this case 'give me a picture of a frog holding an umbrella under a rainbow'), you only gave instructions for generating a widget; you did not create art.


If someone asks me to make/paint/draw a picture of a 'frog holding an umbrella under a rainbow', I would be basing my new work on all the 'art' and images that I have seen in the past. I might even search for related content on the internet for inspiration!

So long as I don't copy a previous, copyrighted image 'too much' (this is fuzzy and maybe should be quantified legally in our bright, digital future), I can claim copyright on the new image.

Since, so far, non-human things are not allowed to hold copyright, a human can claim copyright over works created by non-human things, that the human owns or controls. It's even easier to reason about if the human and non-human thing 'collaborate' on the final creative product. So, if I fiddle around with my inputs (prompts) into Super Diffusion Power Plus Gold Edition, then we (the software and I) collaborated. And I own the output that I chose (curated) as the best one.


>If you put creative inputs into a tool, the result is copyrightable (a car's design). If all you did was say "give me XYZ widget" (in this case 'give me a picture of a frog holding an umbrella under a rainbow'), you only gave instructions for generating a widget; you did not create art.

Does this still hold true if you worked through hundreds of variants of 'give me a picture of a frog holding an umbrella under a rainbow', generating dozens or hundreds of images for each version of your prompt, ultimately putting in hours or days (or weeks) of work to produce a single image depicting the perfect umbrella-wielding frog you were originally envisioning? This isn't even considering inpainting, masking, and other image-in-image manipulations these models can also do to get you closer to the results you're trying to create.

For artists using AI, this seems to be the more common case and is a far cry from the idealistic one-prompt-and-done that usually gets used for an example in these conversations.


Yes. Museum curators/gallery owners do not own the copyrights to the pictures they select. What you are describing puts the computer in the position of 'anonymous creator' as far as Title 17 is concerned.


Do you have citations for a ruling that AI generated art is not copyrightable? To my knowledge, such a ruling has not been made, so we can at best make wild guesses at what the courts will find.

I don't think this particular wild guess is on the right track; the creative input given to the AI generator was the prompt. There is also an argument to be made that, in the same sense a photograph can be copyrighted even though it's just a single still image from a moment that occurred around it, an AI-generated artwork can be copyrighted because the artist performed the creative act of retaining it. In essence, they pulled it from the soup of possible outputs and held it up as one worth noting, as a photographer pulls an image from the soup of possible moments and framings.


But the photographer took the picture, determined the framing and composition. A copyright owner for AI would be claiming to own their contribution, the prompt, but not the automated machine generated portion, the image.

Their pulling it from a pile is not an act of creation, and therefore does not qualify. You have to CREATE a work, not critique/curate it.

At best you have an 'anonymous work' by Title 17.

An “anonymous work” is a work on the copies or phonorecords of which no natural person is identified as author.

The fact that it is a curated 'anonymous work' does not make it somehow more copyrightable.


The picture was there in electromagnetic flux; the photographer just got in the way of the photons.

I think a similar argument can be made that in the infinite space of stable diffusion solutions, asking for one is just getting in the way of the noise. If a photograph is art, a human saying "this shaped noise is worth showing people" is art.

Does the equation of copyright change if the artist says "I want a boat on a calm sea" and starts from a white triangle and blue rectangle? If not, there must be some point of human contribution of information between "words into the stable diffuser" and "using Photoshop on a canvas starting blank" where the work becomes copyrightable. Where is the line?

> A copyright owner for AI would be claiming to own their contribution, the prompt, but not the automated machine generated portion, the image.

If I take a blank canvas, set color to [200, 10, 10], and click in the corner with the paint bucket, my contribution is five parcels of information (compressible to one, the color), and I get to claim copyright on the whole ruddy square that results.

Especially if I name it "Ruddy Square" and sell it to the MoMA.


Maybe. Everything from back when I was a real person and dealt with copyright, patent, and trademark lawyers tells me otherwise, but I know this from the tech industry side/tech industry lawyers and not art specifically. My reading of Title 17 tells me otherwise. But maybe you are right. And maybe museum/gallery owners actually own the copyright of the works they 'find' and display, especially if the gallery gave the artists 'prompts' for what they wanted the art to contain.

In your case, you would not be able to copyright that piece of art. You can't own colors or dimensions. You could copyright an installation of the art piece, so that no one else could display a [200,10,10] colored piece of art with the same dimensions you used and call their installation 'Ruddy Square', but that is the only copy protection you would be given.


> And maybe museum/gallery owners actually own the copyright of the works they 'find' and display, especially if the gallery gave the artists 'prompts' for what they wanted the art to contain.

In general, not if those prompts were given to a human who does the actual work. Unless that human was contracted under a work-for-hire agreement, in which case absolutely yes.

> You can't own colors or dimensions

Interesting. I was mistaken and in the specific case of a square of a particular color, you are quite right. https://www.owe.com/resources/legalities/legalities-15-is-a-...

This surprises me because I've observed you can own zero volume for a duration in the music space. I'm not sure why the visual space would differ, but the interpretation of copyright is extremely path-dependent law, so I shouldn't be surprised if they do.

https://edition.cnn.com/2002/SHOWBIZ/Music/09/23/uk.silence/

That having been said, I suspect that while simple shapes are not copyrightable because they fail a uniqueness and common-usage test, the output of a stable diffusion run does not fail such a test, being a unique artifact that has never been seen before.


The creator would still own the copyright and have to assign it to the person that hired them. It's the same as when we software engineers get our names on patents and then assign them to our employers. I'm surprised more people on HN are not familiar with how intellectual property rights work.


What you have just described matches no mechanical process I've undergone getting my name on a patent my employer owns. At no time did I receive a patent that I then assigned to my employer. Such assignment is included in my employment contract.

I think that may be hair-splitting over the process; the point is that it's possible to write a contract where work done by someone else has its copyright assigned to a contracting employer. But honestly, that entire point is less interesting than the question of how many quanta of work one has to put into a mechanism-facilitated process to be able to claim copyright of the result.

So paint-bucketing one square is insufficient for unrelated reasons of originality (the inability to copyright a shape). But Piet Mondrian's "Composition with Red, Blue, and Yellow" (1930) was just a few squares and lines and was copyrightable. So clearly, it doesn't take too many horizontal and vertical black lines and full-block fills to create original art.

If I write a small script and put it in the public domain to generate Mondrian-like output and hand it to you, and you run it twelve times and pick your favorite, is there any reason you couldn't copyright that one? What's the important difference between picking the output of the script you ran on your hardware and drawing a few grids and paint-bucket-filling yourself? Is it not two paths to the same result: a novel Mondrian-style image that you created? How much intention is needed to make it copyrightable vs. how much random-algorithm output?


Patents are assigned to inventors. How can you legally not create the patent under the inventors' names and then assign the patent to the company? Maybe my company's lawyers were just doing CYA, and granted, I took much of the patent stuff on the lawyers' word, but I don't think the process you describe is legal. The fact that my company owns the fruit of my labor has nothing to do with the legal requirement of an inventor's name being on a patent. Your refusal to acknowledge that these laws require an inventor/artist is the core of my argument.

If you generate images in the manner you describe you are nothing more than a machinist plugging in coordinates and generating widgets, not an artist using tools. The generating program meets the 'anonymous artist' portion of the relevant Title code in that case, and you can not copyright it. The office might mistakenly give you copyright, but I don't think it would hold up to a challenge.


I wanted to thank you for this interaction because I've learned quite a bit about the boundary layer on copyright. I think I see what you're saying in this topic.

It looks like the copyright office is perfectly willing to grant copyright on work that uses an AI generator for even a substantial portion of it (https://arstechnica.com/information-technology/2022/09/artis...) but not the whole thing. I suspect there will be a series of cases in the not too distant future to make the boundary line clearer.


Also there was no need to call bullshit. You're the reason the internet sucks now.


Assume a painter with a good visual memory has observed many scenes around the world. They are very knowledgeable about physical objects, shapes, and visual styles. They have the ability to draw very realistic scenes of whatever we ask, in whatever style we want. We pay the person (or they might offer this as a courtesy), and they paint and give us exactly what we want, with all the rights to sell the image.

It's the same here: the "someone" is replaced by computer software that "learned" in a similar fashion and is able to draw what we ask it to draw. Unless it's directly sampling some copyrighted work as-is, whatever it creates can be copyrightable.

Whether the painter allows this is another story.


Of course it is. Following your logic, no photograph taken with a camera could be copyrighted. After all, the subject you saw through your camera already existed; you merely recorded the photons with a sensor - digital or analog, it doesn't matter.


Your response lacks any merit. The photographer framed the picture, chose the focus, etc. Those are artistic inputs, not simply 'give me a street scene'. More apt: a photographer can't claim as art a picture he googled with 'give me a street scene'. He was not an artistic creator in that case. Read Title 17 instead of pulling arguments out of thin air.


I think the copyright stuff is a ruse to distract from the real thing... they just don't want AI artwork.

I asked this same question on here a few days ago when another site blocked AI artwork. I realized that the more that AI artwork gets blocked, the more that this is an opportunity to provide exactly that! Niches make great business models.

Someone responded...

https://lexica.art/

While not quite exactly what I was thinking, I think it is a good first start.


What would be the use-case of an AI-art library? The AI is the library.


img2img?

Sharing generative text / parameters?

Not just a library, but a community.


It's inevitable there will be a AI Stock Photo competitor, but I don't envy the legal battles that'll follow. Getty, Adobe, iStock, etc are almost guaranteed to sue, and it'll be an expensive and long process.

I look forward to these stock photo sites dying - image copyright is a royal PITA, since discovering a copyright problem almost always means paying shakedown notices from DMCA bot lawyers after the fact when you make mistakes.


Why would you search a Getty competitor for AI generated images when you can just roll your own?


It's faster to sift through pre-generated images than to build novel ones.


As a designer who searches stock photography for work fairly often, I'm not so sure about that. Plus the minor time savings might well be offset by the cost savings and licensing freedoms of rolling your own depending on the generating engine.


Getty's business relies on the legal framework of copyright, and how it enables control (and sale) of the licensing of copyrighted material. And they're saying: nope - AI output is so ambiguous w.r.t. copyright and licensing of the inputs (when it's not flagrantly in violation, as with recreating our watermarks), that we want to steer totally clear of this.

When HN has discussed Github's Copilot [1] for coding, it seems like the role of copyright and licensing isn't discussed in much detail [2] (with some exceptions awhile back [3, 4]).

Do you think there is a software-development analog to Getty (I mean a company, not FSF), saying "no copilot-generated code here"? Or is the issue of copyright/licensing/attribution even murkier with code than for images?

[1] https://github.com/features/copilot/

[2] https://hn.algolia.com/?dateRange=all&page=1&prefix=false&qu...

[3] https://news.ycombinator.com/item?id=32187362

[4] https://news.ycombinator.com/item?id=31874166


It is very easy to see hypocrisy in the FurAffinity statement: "Human artists see, analyze and even sample other artists' work to create content. That content generated can reference hundreds, even thousands of pieces of work from other artists that they have consumed in their lifetime to create derivative images," ... "Our goal is to support artists and their content. We don't believe it's in our community's best interests to allow AI generated content on the site."


It's most of all not in the best interest of GI's bank account when people learn that they can generate any image they desire - for free.


generated images =/= created

AI generated images have the potential to be art in the eyes of the beholder, but let's not pretend that generation is the same as the mental, physical, and spiritual flow state that goes into painting or drawing a piece.


I don't think that's what the parent is doing. They're pointing out the hypocrisy of claiming AI art is copying copyrighted works because human artists are trained in similar ways. That's not making a claim about whether or not AI art is "real" art.


Why not? Human artists do exactly the same thing - combine learned patterns into new compositions.


Human artists create with intent. Statistical image generation throws paint at a million walls and keeps the handful that are statistically close to images tagged with words in a prompt.

That's not the same thing, and there's a reason why all of those generated images seem... off.


I think we may be talking about two different concepts regarding creation of art.

Absolutely humans (and myself, I'm a professional illustrator) use a mental patterns to come up with ideas.

The physical difference in AI generation is the lack of butt-in-chair time of the flow state. Painting/drawing/rendering art is not just mindless time to be compressed; it's a mental/physical/emotional/(and some would say spiritual) flow state with a lot of "input" abstractions beyond the patterns. Things like the creative's personal mood, personal past experiences, recent discussions with friends, recent texts they read ... those all fold into it. I wouldn't trade that flow state for the world, and it absolutely leaves fingerprints in my creations.


So you're saying what disqualifies AI is that it's a lot faster than humans at doing the same task?


That’s definitely part of it, yeah. There’s other factors too, but that’s obviously one of the big ones.

So what? FurAffinity’s stated goal with the ban is to protect human artists. Obviously banning something that undermines human artists is a step towards that goal. If you want a place to show your AI art, there are plenty of other sites that will welcome you.


Humans don't usually do stroke for stroke copies of paintings. Or pixel for pixel sampling of photos, unless they get rights to the sources.


Neither does the AI, so what’s the point?

Yes, if you look hard enough you’ll find some. But that’s true on either side.


When humans copy verbatim, even only partially, there are consequences unless it's fair use.


If someone notices. There's no guarantee that anyone will, even the person doing it. I've certainly done my fair share of verbatim copying -- something I only realised weeks later, if ever.


Neither does AI. They don't operate in pixel space but in latent space, which is analogous to a mental model; the neural networks that do this even have a lot in common with how our visual cortex works. The conversion to pixels only happens in the last step, once the concept has been generated as a latent representation. They're doing the same thing human designers do, just orders of magnitude faster.
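To make the latent-space point concrete, here's a minimal sketch assuming HuggingFace's diffusers library; the model ID and the 0.18215 scaling constant are Stable Diffusion v1 conventions and may differ by version:

    import torch
    from diffusers import AutoencoderKL

    # The denoiser never touches pixels: it manipulates a 4x64x64 latent.
    vae = AutoencoderKL.from_pretrained(
        "runwayml/stable-diffusion-v1-5", subfolder="vae")
    latents = torch.randn(1, 4, 64, 64)

    with torch.no_grad():
        # Only this final VAE decode produces pixels.
        image = vae.decode(latents / 0.18215).sample
    print(image.shape)  # torch.Size([1, 3, 512, 512])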


When your business sees an existential threat on the horizon you have two options – be at the forefront of the change and get ahead of your competitors and any new entrants by adopting the change yourself, or stall/threaten/litigate/raise prices/lower prices and otherwise hold on to your business model at all costs. Those in the latter group don't survive very long.


What about images that are a hybrid, i.e. human composed but using AI elements? Similar to Warhol and photography. I'd imagine a large percentage of artwork will use this approach, and Getty will have to evolve their stance to stay relevant.


I wonder if it applies to photoshop neural filters, or any kind of ML touchup like an instagram face filter. (Probably not but the line seems awfully thin and arbitrary.)


One big difference is: what was it trained on? It is unlikely that a little Photoshop filter, ML or not, was trained on billions of images.


If AI-generated images are of better quality than the alternatives, Getty or other orgs rejecting AI won't suppress its rise.

Rather, alternate markets for these images will arise quickly, and people will flock there, leaving Getty behind.


I'm looking at this too. The kneejerk reaction over the last few weeks has been for incumbent communities to start discriminating against AI art due to a standard of creative merit that they can't articulate. Whereas the AI art communities are rapidly iterating with what is a very creative process, that is just new and unfamiliar. I fully expect them to refine and articulate what that is. Controlling Stable Diffusion with variables is honing one's tools, seeding it with a sketch is very creative. Having tools for faster processing is no different than other content creation and what artists do. I think these communities will make their own places very quickly. Some people will monetize, others are just more able to fulfill their creative vision which I'm more a fan of than gatekeeping how much discipline is necessary to do art.


Once copyright rules for AI are ironed out, I imagine Getty will join in (or not be able to, if copyright law goes that way). Other commenters in this post have made points about the courts viewing AI output as non-copyrightable, but prompt-based AI works might be considered copyrightable. I think Getty just wants someone else to handle that inevitable legal battle first.


The impression that I’m getting from a lot of the comments here (and a lot of past discussions about AI art on HN) is that tech people view the art industry as a “challenge”, and they want to use machine learning tools to “defeat” it - either because they just want to demonstrate the sheer power of these tools, or because they think artists are irrational for thinking that there’s something special about human art and they want to prove them wrong, or what have you.

I can’t think of any other way to explain the persistent desire to keep forcing AI art into spaces where it’s not wanted, or the repeated discussions about loopholes in the rules, how AI art can avoid detection, etc.

I suppose the comparison I would make is to chess. Computer assistance is strictly forbidden in chess tournaments - it’s cheating. Both the players and the spectators want to see matches played between two humans, without computer interference. You could devise clever ways of getting around the rules and cheating (there’s a big cheating scandal rocking the chess world as we speak), but no one would praise you for doing this. They would just think you were being a jerk.

Similarly, there will always be people who want to create communities centered around human art, simply because of the mere fact that it was made by a human and they want to see what human skill is able to accomplish without AI assistance.


I'm simply deeply interested in knowing where this all goes. I'm not exactly forcing AI art into places where it's not wanted, but I am deeply interested in knowing where we will end up with all of this.

I have basically no prediction, but it feels like we're on the cusp of some very powerful tools. I don't know if it will end up just being a novelty, or if it will replace current jobs, or if it will simply create powerful tools for humans to make new art with. Just no clue.

In addition to this deep curiosity of what the future holds, I'm also very interested in it philosophically. I run a daily word game website, and every day's puzzle comes with an image that I generated for that day's words when you solve the puzzle. I often get people emailing in, asking who the artist was, because the day's image spoke to them, they found it beautiful.

It always feels like I'm giving them such a let down when I give them the answer. They were looking for a human connection. The most valuable thing isn't the technical skill; it's the connection to another human who generated this image (though the technical skill is a demonstration of the artist's commitment and is another thing to connect with). Finding out that a human didn't generate the image is a lead balloon. I find this a very interesting, salient example in the "what is art?" and "what is meaning?" categories.

It also brings me back to the tools question. Because, in many cases, ultimately there is a human connection to be made. I had an idea, and I worked with an AI to realize it. Sometimes it took quite a bit of work on prompts. Not nearly as much as it would take a technical artist, and I didn't get as exact an output as I had in my head.

But if they want a connection to another human imagining something that speaks to them, it's there, I imagined it and then a computer realized it. It's not as meaningful as a piece that an artist committed numerous hours to after a lifetime of honing the craft, but it's not devoid of meaning either. And I don't have any art skills to speak of, so in fact it enabled some human connection that wasn't possible before.


There are different sorts of creative exercises. I do think there is something interesting going on in the contrast between generative art and generative programming responses, but it's also true that most creative output is work and isn't done for the joy of doing it. Getty Images exists to provide quick, easy-to-use stock images, not to plumb the depths of human existence. Photorealistic painting or abstract art will remain interesting, and some people will want to know a work was done by a human, just as cameras, Photoshop, and digital art in general haven't destroyed the market for paintings; but they have pretty much destroyed the market for portrait painters. Idk what the future will bring, but it probably won't be a bunch of humans hired to paint the goings-on at red carpets, and it probably will involve AI doing more of what we might call creative work.


That reads a lot like pretending that progress can be held back.


Is it “holding back progress” to ban the use of engines in chess tournaments?


I always got a weird spidey sense from Getty Images and similar stock photo sites, and this just solidifies it. They exist in this limbo between open and closed, somehow finding a way to greatly enrich themselves while simultaneously not enriching their content contributors. Same for sites that put published papers behind paywalls, and even news sites that block access to articles unless the user goes to the enormous effort of (gasp) opening a new private window in the browser.

I don't know how, but that stuff all has to end. We've got to move past this tragedy of the commons that's happening with copyright. We all suffer just so a handful of greedy rent seekers can be the gate keepers and shake us down.

I had hoped that we'd have real micropayments by now so people could each spend perhaps $10 per month and distribute those funds to websites they visit and content they download, a few pennies at a time. Instead we somehow got crypto pyramid schemes and NFTs.

I'm just, I'm just, I don't even know anymore! Can someone explain it? Why aren't the people with the resources solving this stuff? Why do they put that duty onto the hacker community, which has to toil its life away in its parents' basement on a shoestring budget as yet another guy becomes a billionaire?

I'm just so over all of the insanity. How would something like this AI ban even be enforceable? Whatever happened to fair use? Are they going to spy on everything we do now and send the copyright police? Truly, what gives them the right? Why are we paying these people again?


I love the micropayments idea for consumers, but I suspect it would be really hard to get businesses on board, similarly to how music companies lost money when moving from albums to individual songs, and now streaming.


Oh ya, good point. I hadn't considered the long tail problem, how democratizing music production counterintuitively killed the recorded music business so performing onstage is the only real way to make money now.

I feel like that problem is only going to spread to all industries, so that the only work that will be compensated is manual labor.

Someday we'll have to evaluate whether running the rat race is the best use of human potential. Is that premature right now? Will it be in 10 years? I dunno, but I feel like now is the time to be solving this, not in some dystopian future where we spend our entire lives just working to make rent.


I've actually been working on a stock photos site built around Stable Diffusion for some of these exact reasons. https://ghostlystock.com/ is the first version, but we're adding a bunch of useful features to make it more useful for people to find legitimately useful stock images.


So it's just for copyright reasons.

I was wondering whether there were authenticity issues at stake? For example you can imagine someone wanting a stock image of "New York Skyline" and using an AI generated image that looks right but actually contains elements not in the skyline. This could undermine trust in Getty, which would be something they'd want to avoid.


Getty isn't alone in banning AI art, but they're doing it for different reasons than most.

Lots of art sites are currently being flooded by subpar AI generated garbage. If humans curate AI output and upload only that one-in-a-thousand good looking output, that is fine.

Instead we have bots uploading one image every few minutes, auto-generated from some randomly selected tags. Mostly the tags are wrong and the art should maybe instead be tagged such things as "grotesque" and "body-horror".


I feel like the easiest thing to do would be to declare that entirely AI-generated images are public domain, because a human didn't have enough of a hand in making them (and only humans and groups of humans can hold a copyright), and there's not enough of any one training image in the output to say that it contains a recognizable segment of any of the images it was trained on, even assuming the training images were all copyrighted.


Getty at this point should be feeling extremely threatened by AI-generated images. They may not fully replace Getty, but they will take a large chunk of its business.


> The creators of AI image generators say the technology is legal...

I'd bet that said creators are not lawyers, nor (well-)advised by lawyers, nor even able to cite substantial law nor case law to back up that "say".

And with the extremely low bar for calling something "AI" these days - how close to "kinda like Google image search, but with a random filter or two applied" might a low-budget "AI" get?


Makes sense, like banning electricity in candle stores.


I don't mean to be dismissive, but there's no doubt there'll be plenty of other places to host images like this.

Do Stable Diffusion and the like automatically add some kind of steganographic code to images so they can be automatically detected, e.g. and not added to future training sets? Obviously this could be removed deliberately, but it would prevent the vast majority of cases.


Is it a big issue if some AI generated pictures are included in training datasets?

They would be cherry-picked by humans (we don't share the bad pictures) and generated from other/older models.


Stable diffusion does: https://github.com/CompVis/stable-diffusion#reference-sampli...

Seems like a good idea to me with almost no downsides. If sites start discriminating against images based on that watermark, though, that's going to incentivize people to turn it off.
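For reference, the mechanism in those scripts is the invisible-watermark package, which hides a fixed byte string in the frequency domain of the image. Roughly (exact calls may vary by version):

    import cv2
    from imwatermark import WatermarkDecoder, WatermarkEncoder

    bgr = cv2.imread("generated.png")  # the library works on BGR arrays

    encoder = WatermarkEncoder()
    encoder.set_watermark("bytes", "StableDiffusionV1".encode("utf-8"))
    marked = encoder.encode(bgr, "dwtDct")  # invisible to the eye

    decoder = WatermarkDecoder("bytes", 136)  # 17 chars * 8 bits
    print(decoder.decode(marked, "dwtDct"))   # b'StableDiffusionV1'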


Many people, if not most, use one of the many forks with the watermarking and the aggressive NSFW filter removed from the code. Graphics card RAM and speed are precious.


I haven't looked into exactly what it does, but it seems like there's a (hidden?) watermark: https://github.com/CompVis/stable-diffusion/blob/69ae4b35e0a...


By default, Stable Diffusion watermarks images. However as it's open source, it's obviously trivial to remove it.


And thus the AI auto-filtering arms race has begun.

New filters will appear to detect AI-generated images, then new models will be trained to bypass the filter, then upgraded filters will detect the images made by the new model, etc...

In the end, though, it's likely that we won't be able to distinguish between a real image and an AI-generated image; it's only a question of time.


I read a paper some time ago, where somebody ran a facial recognition algorithm on the output of a GAN face generator. It found lots of images in the training data that looked strikingly similar.

In other words, one reason AI images look so good is that they look a lot like actual images.

Thing is, for the love of me, I can't find the paper anymore. Does anyone know it?
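The approach would look roughly like this (a hedged sketch; embed stands in for whatever face-recognition embedding model the paper used):

    import numpy as np

    def nearest_training_match(generated_img, training_imgs, embed):
        # Cosine similarity between the generated face's embedding and
        # every training face; a hit near 1.0 suggests the GAN
        # effectively memorized that training image.
        g = embed(generated_img)
        g = g / np.linalg.norm(g)
        best_idx, best_sim = -1, -1.0
        for i, img in enumerate(training_imgs):
            e = embed(img)
            sim = float(np.dot(g, e / np.linalg.norm(e)))
            if sim > best_sim:
                best_idx, best_sim = i, sim
        return best_idx, best_sim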


This question of copyright legalities and AI-generated media reminds me of the "Monkey selfie copyright dispute" from a few years back: https://en.wikipedia.org/wiki/Monkey_selfie_copyright_disput...


This is the pre-news story to the inevitable: “AI generated content bans Getty Images by forcing it into early retirement”


The real question is: Why would we need Getty Images or other stock photo providers if we can have AI-generated content?


I'm not sure how they'd enforce it while they still accept other digital art.

But it seems a good decision, both on their stated concern about copyright (it's all a synthesis of the training sets, but who knows how much?), and also that AI art is effectively an infinite output stream that would very rapidly swamp all human output.


Getty could in theory just ask for a "fair" share (however large or small that might be) in any AI company that uses Getty's images to train its models. It could be a battle not for extinction, but for new market share and control over AI companies.


Depending on the prompt, it is certainly possible to generate images that are watermarked. I got some istock watermarks on a couple of images last I tried and I wasn't using istock or anything related as part of the prompt.


Can two images be identical if both are generated using the same prompt and model combination?

If all images generated are unique, I fail to see how copyright can ever be enforced.
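For what it's worth, they can be identical: sampling is deterministic once you pin the random seed, so the same prompt + model + seed + sampler settings reproduces the same image (modulo minor GPU nondeterminism). A sketch assuming HuggingFace's diffusers library and a hypothetical model ID:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

    def render(seed):
        gen = torch.Generator("cpu").manual_seed(seed)
        return pipe("medieval market square", generator=gen).images[0]

    a, b = render(42), render(42)
    # a and b are pixel-identical; change the seed or any sampler
    # setting and they diverge.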


Meanwhile China doesn't care and is creating super apps unencumbered.


Interesting that they will still accept fully computer generated images, most of them created with a massive amount of help from AI-assisted algorithms in Photoshop and the like.


Seems like a statement of intent without any details. How do they define 'AI-generated'? And do they plan to automatically detect it, and how?


This is a legal distinction, not a technical filter. They are telling their users that it breaks their terms of service to upload "AI-generated" images. So if you upload an image and it turns out, through a legal challenge, that Getty learns it was made by AI (whatever that means) - they can just walk away.


I guess images generated by computer systems that use if/else technology and mathematical vectors. /s

That's probably how lawyers will put it.


Of all the example images Verge could have chosen for this article, why did they choose the one they chose? Apologies if this is too OT.


I can't wait until I can sue everyone that's ever used copilot because MS trained its corpus on my code


It would probably be easier to sue MS, because there you know they used your code.


A visualization of the announcement by a neural network:

https://twitter.com/illubots/status/1572620909885669378

(I'm building an illustration agency for robot brains, aka neural networks. So far, I have 3 robots who can consistently draw in their unique style. This is by illustration robot Jonas)


That's a cool idea, wish you luck!


These are really cool


Business opportunity right there...


Ya you're right, I always seem to go to the negative instead of recognizing opportunity. Companies that set these backwards-looking policies should be aware that someone else can always come along and eat their lunch. I'm always amazed that they think they can just get away with it without consequence!


So... stealing from humans, good. Stealing from robots, bad.

You had your chance, meatbags!


Getty is being the Blockbuster of the generated-art age.


Getty opening a market opportunity to competitors


Getty, skating to where the puck was.


Next week: Effective immediately, Getty Images will cease operations citing loss of their entire business model due to AI.


How would they know?


the innovator's dilemma



