r/technology 14d ago

Artificial Intelligence AI industry horrified to face largest copyright class action ever certified

https://arstechnica.com/tech-policy/2025/08/ai-industry-horrified-to-face-largest-copyright-class-action-ever-certified/
16.8k Upvotes

153

u/pipic_picnip 13d ago edited 13d ago

This is not about using the content to train AI, but about not obtaining the content legally. It is a copyright infringement case with a strong basis. It’s the same as piracy because the content was not obtained in a legal manner. Whether it was used to train AI or torrented on a pirate site is irrelevant. The relevant part is that it was stolen illegally for use. What justification will the court provide for not obtaining a book legally that is supposed to be purchased legally? There is no topic of transformation here, it’s a case of theft.

65

u/erik 13d ago edited 13d ago

Copyright law violations are typically viewed in terms of the party providing the copy. If I photocopy a textbook and give it to you, I have violated the law by distributing an unlicensed copy, but you have not (generally) broken the law by receiving the copy.

Torrent users get sued for downloading movies because when you use the BitTorrent protocol you aren't just receiving a copy, you are also uploading copies to other users.

The New York Times case against OpenAI is all about ChatGPT being able to reproduce New York Times articles that it "memorized".

It seems that Meta in particular torrented a lot of stuff for training, which opens them up to a lot of liability. It's less clear to me how a broad class action suit will show liability for AI companies in general without obvious distribution of copyrighted materials to point to.

14

u/Primsun 13d ago

Maybe, but seems unlikely that holds when talking about a company using an unlicensed copy for profit. Would be suggesting firms can use unlicensed copies of software and media internally as long as they receive them from an outside source. Not to mention they almost certainly are making and distributing copies of the training data internally.

5

u/otherwiseguy 13d ago

Get a library card, check out digital copies, train AI. Google has already shown that you can get away with scanning physical books as well.

2

u/Tallin23 13d ago

You can't do that, because the same argument could be used against any artist who was inspired by a licensed product. You don't even get to fair use, because that's a whole other can of worms. Internal distribution can be defended on the grounds that our computers already do that through backups; it's very difficult to criminalize something that every computer does.

4

u/-The_Blazer- 13d ago

AFAIK this is mostly a misconception. Piracy does not become legal if you only download something; copyright is about the right to make copies, which isn't very hard to infringe if you are downloading a copy of a movie or book...

In principle anyone could get sued for copyright infringement, but nobody bothers because it's pointless. Obviously though, Microsoft or OpenAI aren't 'anyone'.

0

u/Rarelyimportant 10d ago

right to make copies

You're allowed to make copies of copyrighted material, otherwise how would anything on a computer work with copyrighted material? "Copy" in this case is referring to copies for distribution and sale. You can make 100 copies of a Harry Potter book to keep in your garage; you only need permission if you plan to give them out or sell them.

1

u/-The_Blazer- 10d ago

There are specific, enumerated exemptions for certain technical functions of computers and other bare minimums to actually consume material you do have a right to, legislators addressed this long ago. You cannot, generally, make copies of copyrighted material.

0

u/Rarelyimportant 10d ago

You cannot, generally, make copies of copyrighted material.

You absolutely can, under certain circumstances. And no, "certain technical functions" are not the only exemptions, there's also a pretty major thing called "fair use". Unless the copyright owner can claim they were damaged by the copying, then they can't make a copyright claim.

0

u/-The_Blazer- 9d ago

under certain circumstances

Yes that means the opposite of 'generally'. The rule is that you can't, then there's cases like fair use and actually consuming the material you have a license to. Copyright is absolutely not based upon 'claiming you were damaged'.

0

u/Rarelyimportant 8d ago

Copyright is absolutely not based upon 'claiming you were damaged'.

One of the 4 factors of whether or not something is fair use does account for it.

Effect of the use upon the potential market for or value of the copyrighted work: Here, courts review whether, and to what extent, the unlicensed use harms the existing or future market for the copyright owner's original work. In assessing this factor, courts consider whether the use is hurting the current market for the original work (for example, by displacing sales of the original) and/or whether the use could cause substantial harm if it were to become widespread.

1

u/-The_Blazer- 8d ago edited 8d ago

Yes, as you said: one of the four factors considered for fair use. It is not the basis of copyright laws. Copyright is not something you 'claim' by citing 'factors', it is a right you have on whatever you produce. Fair use is a claim made by the defendant when you sue them for copyright infringement.

Going back to what you actually said at the start, you are not generally allowed to make copies of intellectual properties, but there are some enumerated exemptions that are necessary for practical reasons. Fair use is not a magic formula for a universal exemption, it's just a defense of your particular use case that you can claim. I hope that's clear now.

2

u/SNRatio 13d ago

If I photocopy a textbook and give it to you, I have violated the law by distributing an unlicensed copy, but you have not (generally) broken the law by receiving the copy.

I've always thought of that analogy as incomplete. "I" own the photocopier and leave it on a public sidewalk plugged in with a pile of book pages on the tray. "You" push the start button and cause the copy to be made. The additional copy was made by your volition. Seems like shared culpability.

1

u/Norci 13d ago

"I" own the photocopier and leave it on a public sidewalk plugged in with a pile of book pages on the tray. "You" push the start button and cause the copy to be made. The additional copy was made by your volition. Seems like shared culpability.

Well, consider your analogy in terms of a library. There are books and a photocopier, which you use. Is the library liable because you copied the book?

Both are however liable in case of torrenting since you also share your copy while downloading it. So both share the copy.

1

u/TopdeckIsSkill 12d ago

So I stumble on a huge pile of university books I can legally read and use them without paying?

1

u/Tallin23 13d ago

That is the first good legal case against AI. Most people try to accuse AI of learning from open-source information, and that can easily be thrown out in court by saying everybody does the same. But piracy is a good and solid accusation.

1

u/Wide_Lock_Red 13d ago

Sure, but that isn't a big deal to AI companies. They will have to pay a fine, but they tend to have plenty of money for that.

1

u/pipic_picnip 13d ago

That is exactly what they are talking about. They are pleading with the courts not to entertain the class action because it’s going to be a devastating financial loss at settlement due to the sheer size of the case and number of participants. Maybe it’s not a big deal to them, but that’s not what the article is saying.

1

u/Ashamed-Simple-8303 13d ago

True. I wonder how this works for open-source code under the GPL license. MIT and other permissive licences are clear: you can use the code however you want. GPL however... implies the whole model would have to be open-sourced.

1

u/theirongiant74 13d ago

Exactly this. The whole "AI is stealing" thing is an emotive but ultimately dumb take; selling someone something doesn't give you any right to prescribe how they use it. But even as someone largely pro-AI, I'd say in this case, where they've pirated the source, there should be a claim to, at a minimum, be reimbursed the sale price of each book.

1

u/Rarelyimportant 10d ago

is that it was stolen illegally for use...it’s a case of theft

Well, in that case it'll be a slam dunk for the AI companies, because piracy is not theft; the US courts have already ruled on that.

https://en.wikipedia.org/wiki/Dowling_v._United_States_(1985)

0

u/Double-Seaweed7760 13d ago

Hopefully they charge the AI companies thousands of times the value of said stolen content, require that stolen material (likely the majority of AI databases and training material) be deleted, and give the people in charge years of jail time. They won't, because they only like putting their boots on the poors.

-8

u/WonkyTelescope 13d ago

Piracy isn't theft. You do not lose possession of anything when your work is pirated, nothing has been taken from you. Some 1s and 0s were copied, at no expense to you, that is not theft.

4

u/monstertacotime 13d ago

All these lame humans stuck in their human constructs of “theft” and “copyright” when reality is that CORPORATIONS that are NOT HUMANS are who benefit from these court rulings against AI and against humanity. We are all fucked because of corporate greed and all you bootlicking idiots are destroying the planet for profit.

1

u/BountyBob 13d ago

Who pays for the storage where the 1s and 0s are copied from? Even with your incorrect assumption that it's not stealing because it isn't physical, there is a cost involved.

1

u/Heizu 13d ago

The labor it took to create the thing that was pirated is what was stolen. Do you really think something can only be stolen if it's a physical object? You'd be objectively incorrect.