cross-posted from: https://lemmy.intai.tech/post/43759
cross-posted from: https://lemmy.world/post/949452
OpenAI’s ChatGPT and Sam Altman are in massive trouble. OpenAI is getting sued in the US for illegally using content from the internet to train their LLM or large language models
But if we’re drawing the line at “did it for profit”, how much technological advancement will happen? I suspect most advancement is profit driven. Obviously people should be paid for any work they actually put in, but we’re talking about content on the internet that you willingly create for fun and the fact it’s used by someone else for profit is a side thing.
And quite frankly, there’s no way to pay you for this. No company is gonna pay you to use your social media comments to train their AI and even if they did, your share would likely be pennies at best. The only people who would get paid would be companies like reddit and Twitter, which would just write into their terms of service that they’re allowed to do that (and I mean, they already use your data for targeting ads and it’s of course visible to anyone on the internet).
So it’s really a choice between helping train AI (which could be viewed as a net benefit for society, depending on how you view those AIs) vs simply not helping train them.
Also, if we’re requiring payment, only the super big AI companies can afford to frankly pay anything at all. Training an AI is already so expensive that it’s hard enough for small players to enter this business without having to pay for training data too (and at insane prices, if Twitter and Reddit are any indication).
Hundreds of projects in github are supported by donations, innovation happens even without profit incentives. It may slow down the pace of AI development but I am willing to wait anothrt decade for AIs if it protects user data and let’s regulation catch up.
Reddit is currently trying to monetize their user comments and other content by charging for API access. Which creates a system where only the corporations profit and the users generating the content are not only unpaid, but expected to pay directly or are monetized by ads. And if the users want to use the technogy trained by their content they also have to pay for it.
Sure seems like a great deal for corporations and users getting fleeced as much as possible.