• mrmaplebar@fedia.io
    2 days ago

    I’m not sure the law is as settled as you’re making it out to be here… We’re still watching a lot of these questions slowly trickle up the courts.

    In my opinion, the idea that someone can feed a bunch of unlicensed copyrighted material as input into a generative AI meat grinder and produce public domain outputs doesn’t really pass the smell test.

    For example, could I use an LLM trained on GPL code to rewrite Linux in a legally distinct way, and then treat it as permissive or proprietary code after minor modifications? Likewise, can you train an LLM on someone else’s proprietary code and rewrite it as a GPL program?

    This sort of copyright/license laundering seems like an existential threat to the way that copyleft FOSS has existed for decades. I think it makes a lot of sense to be extremely cautious and skeptical of AI-generated code submissions.

    Edit: I also want to point out that the issue of whether or not it can be considered “fair use” to train generative AI models on unlicensed copyrighted works is very much an open question. If it were determined that it isn’t always fair use, then I don’t know what that might mean for the many existing models that have been trained that way, or for the outputs that they produce.