

This really isn’t true, though, even if it currently holds in many cases. Case in point: if I wrote something and published it right now, it wouldn’t be part of any AI model yet. A party with deep pockets (say, a tech corporation) could create a bespoke coding model trained on everything except the libraries in question, achieving a de facto “clean room”.
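To make that concrete, here’s a minimal sketch of what excluding a library from a training corpus might look like. The library name, the corpus layout, and the regex heuristic are all illustrative assumptions, not anyone’s actual pipeline; a real effort would need far more robust detection than matching import statements.

```python
import re
from pathlib import Path

# Hypothetical library to exclude from the training corpus;
# the name and corpus layout here are illustrative assumptions.
EXCLUDED_LIBRARY = "somelib"

# Crude heuristic: match common import/require forms referencing the library.
IMPORT_PATTERN = re.compile(
    rf"\b(import|from|require)\b.*\b{re.escape(EXCLUDED_LIBRARY)}\b"
)

def is_clean(source: str) -> bool:
    """Return True if the file never references the excluded library."""
    return not IMPORT_PATTERN.search(source)

def build_filtered_corpus(corpus_dir: str, out_dir: str) -> int:
    """Copy only 'clean' files into out_dir; return how many were kept."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    kept = 0
    for path in Path(corpus_dir).rglob("*.py"):
        text = path.read_text(errors="ignore")
        if is_clean(text):
            (out / path.name).write_text(text)
            kept += 1
    return kept

if __name__ == "__main__":
    kept = build_filtered_corpus("raw_corpus", "clean_corpus")
    print(f"kept {kept} files free of {EXCLUDED_LIBRARY}")
```

The point isn’t that this particular filter is sufficient, only that the filtering step itself is mundane engineering for anyone already building training pipelines at scale.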



Again, while this may be largely true today, it ignores where the technology is headed. Models are only going to get cheaper to produce. Even if training is prohibitively expensive right now (and I’m not convinced that’s universally the case), it won’t stay that way: hardware advances all but guarantee that training costs will drop dramatically in the coming years. Burying our heads in the sand now isn’t going to help anything.