I said I was focusing on copyleft, cool that you ignored the entire post though. 😑
yoasif
IANAL, but does it even work like that? Is there any specific reason to think it does? I don't believe you really get credit for purity and fairness vibes in the legal system. The same goes for the idea that code whose status as AI output is ambiguous could be considered public domain; that seems implausible. Is there actually any reason to think the law works that way? If it did, then any copyrighted work not accompanied by proof of human authorship would be at risk, which would be uncharacteristic for a system focused on giving big copyright holders what they want without trouble.
I'm mostly just playing along with your thought experiment. As I said, we know that projects are already accepting LLM code into projects that are nominally copyleft.
There is no way that happens. Leaks occur, and big tech companies have massive influence; a situation where their code falls into the public domain as soon as the public gets their hands on it just isn't realistic.
If that is the case, is chardet 7.0.0 a derivative work of chardet, or is it a public domain LLM work? The whole LLM project is fraught with questions like these, but it seems that the vendors at least are counting on not copying leaked software and instead copying open source code that is publicly hosted.
Why is it okay to strip copyright from open source works but not from leaked closed source works?
We know that Disney is suing to protect its works - if it is true that LLM outputs are transformative, they should lose, as should any vendor whose leaked code was "transformed" by an LLM.
But any source code leak is also open sourcing in that world.
I don't see how that helps free software, though. Those programmers got paid. Volunteers didn't.
It ends up as a weird reverse Robin Hood situation: LLM vendors steal from the poor and sell it to the rich. Do the rich give back? Only if it is stolen from them.
Making use of the non-copyrightability of AI output to copy code in otherwise unauthorized ways does not seem like a straightforward or legally safe thing to do, especially because high-profile proprietary software projects also make heavy use of AI. It doesn't seem likely that the courts will support a legal precedent that strips those projects of copyright and allows anyone to use them for anything.
I think what may happen in practice could be worse: if we can't tell whether some code is the work of a human, but the project accepts AI code and we forego the analysis of whether any given contribution was produced by a human, the entire project may be deemed public domain, perhaps after a certain date (when LLM contributions were first welcomed).
Beyond that, by integrating LLM code into those projects, the projects are signaling assent to having their works consumed by LLMs, infringing on the whole work, not just the LLM-produced portions. It is hard to be doctrinaire about adherence to the open source license when the maintainers themselves are violating it.
We may see a future where copyrights for works become more like trademarks - if you don't make any attempt to protect your work from piracy, you may simply lose the right to contest its theft.
Obviously, it is as you say: today the courts may smile upon a GPL project whose work a commercial vendor copied and released as their own without sharing alike. But if the vendor instead says they copied the work into their LLM and produced a copy without protections (as chardet has done), the courts might be less willing to afford the project copyright protections if the project itself was using the same copyright-stripping technology to strip others' work while claiming protection over the copied result.
Besides which, "authored by Claude" seems like a pretty easy way to find public domain code, and as Malus presents, the only code that may ultimately be protected is closed source code - you can't copy it if you don't have the source.
The objection that "people may try to pass off LLM code as their own" is a nice diversion, but it is ancillary to the existing situation where projects are incorporating public domain code as if it were licensed. We can start there before we start worrying about fraud.
Can I legally reverse engineer AI generated software?
If you have the source, why would you need to?
Can you even put terms and conditions on this supposed public domain copyright free compiled software product?
You can put terms on anything, but you can't protect the underlying asset if someone breaks your terms. Think of the code produced by Grsecurity that they put behind a paywall: people were free to release the code (since, as a derivative work, it was licensed as open source), but obviously Grsecurity was able to discontinue its agreements with clients who did so.
Is the compiled version even different than the raw AI generated source code in its ability to be licensed?
People aren't generally licensing compiled binaries as open source, since you can't produce derivative works from them. But I think that if there is no copyright protection for the work, compiling it doesn't change its copyrightability. Curious what you think.
What rights does one have to AI generated code? Be it compiled or source. It’s surely not just communal.
Why is that surely the case? It is public domain - that is the most "communal" you can get for copyright.
I have seen this sentiment, but I don't know what the world looks like without copyright protections for creative works.
Does open source exist in your vision? How?
My imagination for this topic may not be as expansive as yours, but my interpretation is that if people contribute code to the commons, it will immediately be available for any use, including use by massive corporations.
So it ends up looking like people working for big companies for free.
"as soon as it's modified by a human in nontrivial ways"
is doing a lot of heavy lifting here.
We know that people are using coding LLMs as slot machines - pull the handle and see if it solves your problem. Where is the human modifying anything? That is a "straight dump" of AI output without modifications.
Honestly, if AI destroys copyright, it's the best thing it can do.
I have seen this being said, but I really don't understand it. Just because copyright can be abused doesn't mean (to me) that we ought to throw the baby out with the bathwater.
If copyright no longer exists, what incentive do people have to share copyleft code at all? It clearly would no longer exist, so can you help me understand how both copyright can be dead and open source exist? Or are you simply accepting that rather than copyright, we are using trade secrets (like the KFC chicken recipe) to protect works?
I don't really think we need to go down the copyfraud path to see that AI code damages copyleft projects no matter what - we know that some projects are already accepting AI generated code, and they don't ask you to hide it - it is all in the open.
How does this apply to software made by, say, Anthropic? They proudly say Claude Code is written by AI. If it can't be copyrighted, or licensed, then it's just a matter of figuring out how to acquire a copy of the source code, and you could do whatever you want with it. Right?
If you were on Mastodon last week when the Claude source code was released (by Claude, accidentally), people were joking about how Anthropic was trying to use the DMCA to get the source removed from websites -- even though clearly, copyrights don't apply, since the code is clearly in the public domain.
If the LLM wrote the code, it is uncopyrightable.
All works created by a person are copyrighted by default, so people need to release their works to allow others to build on or use them (beyond the limited uses allowed by fair use). Like-minded people have come up with various licenses that let authors release their works in the ways they prefer.
I know what copyleft licenses are about, that was covered in the post - if you read it. If you are saying that you are making long comments without reading the post, great I guess, but not super interesting (to me).
I'm not really interested in getting into an argument around license choice because I wasn't advocating for any particular license (like you seem to be).