AI Rewrites Popular Python Library, Sparks License War Over Copyleft Ethics

Main Takeaway
Chardet maintainer used Claude to reimplement LGPL library as MIT-licensed code, triggering fierce debate over whether AI-assisted rewrites can legally strip copyleft protections from open source projects.
Summary
The 48x Speedup That Broke Copyleft
Dan Blanchard just rebuilt one of Python's most-used libraries from the ground up and bolted on a new license while he was at it. Chardet 7.0 dropped last week with a 48x performance boost and multi-core support, but the real fireworks came from what got stripped away: the LGPL license that had governed every previous version.
The library handles text encoding detection and sees roughly 130 million downloads a month. Blanchard had wanted it in Python's standard library for years, but told The Register that the LGPL license kept blocking the path. So he fed Claude Code nothing but the API docs and test suite, never once looking at the original source. The AI spat back code that shares less than 1.3% similarity with prior versions by JPlag's measure. Blanchard shipped it as MIT-licensed, declaring it a clean rewrite.
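For readers unfamiliar with what chardet does: "encoding detection" means guessing which character encoding produced a given byte string. As a rough illustration of the problem (not chardet's actual method, which scores byte-frequency statistics), here's a naive stdlib-only sketch that simply tries candidate codecs in order — the function name and candidate list are invented for the example:

```python
def naive_detect(data: bytes, candidates=("ascii", "utf-8", "utf-16", "latin-1")):
    """Return the first candidate encoding that decodes the bytes cleanly.

    A toy stand-in for chardet: the real library scores statistical
    byte patterns per encoding rather than trying codecs in order.
    """
    for enc in candidates:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None  # nothing fit (rare here, since latin-1 maps every byte)

print(naive_detect(b"plain text"))            # ascii
print(naive_detect("naïve".encode("utf-8")))  # utf-8
```

The try-in-order approach is why real detectors exist: many byte strings decode "successfully" under several encodings, so chardet instead ranks encodings by how plausible the decoded text looks.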
Mark Pilgrim, who wrote chardet back in 2006, immediately opened a GitHub issue calling bullshit. His argument's simple: you can't call something a "complete rewrite" when you've had "extensive exposure to the original codebase." The LGPL requires derivative works carry the same license. Adding an AI into the mix doesn't grant extra rights.
The Legal vs Legitimate Divide
Here's where it gets thorny. Copyright law protects specific expression, not ideas or functionality. Clean-room reverse engineering has always lived in this gap — think GNU rebuilding UNIX tools without touching the original code. The law says that's fine.
But Blanchard's approach wasn't truly clean-room. He knew the codebase intimately from years of maintaining it. He just didn't look at it while Claude was generating the new version. Antirez (Redis creator) argues this mirrors how GNU reimplemented UNIX, making it legally permissible. Armin Ronacher (Flask creator) welcomed the relicensing, seeing it as progress.
The Hacker News AI piece slices through this reasoning. When GNU rebuilt UNIX, they turned proprietary code into free software. The ethical force came from expanding freedom, not just exploiting legal loopholes. Blanchard did the reverse: took copyleft code and made it more permissive, potentially enabling its use in closed-source products.
The AI Clean Room Problem
Traditional clean-room engineering uses two teams: one analyzes the original, another implements from specs. AI breaks this model. Claude absorbed patterns from who-knows-how-much training data. Blanchard's prompt included the API and test suite — essentially the library's DNA. The AI generated functionally equivalent code without direct copying, but the process feels more like laundering than engineering.
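The 1.3% figure cited earlier comes from JPlag, a token-based plagiarism detector for source code. A minimal sketch of the underlying idea — pairwise similarity scoring between two code texts — using only Python's stdlib (the code snippets and helper names are invented; JPlag's actual comparison runs on normalized token streams, which is far more resistant to renaming and reformatting than this raw-text ratio):

```python
import difflib

def text_similarity(a: str, b: str) -> float:
    # Fraction of matching content between two strings, in [0.0, 1.0].
    # JPlag tokenizes first, so identical logic with renamed variables
    # still scores high; this character-level ratio is a crude stand-in.
    return difflib.SequenceMatcher(None, a, b).ratio()

old_src = "def detect(data):\n    return scan_byte_frequencies(data)\n"
new_src = "def detect(buf):\n    hist = build_histogram(buf)\n    return best_match(hist)\n"
print(f"similarity: {text_similarity(old_src, new_src):.0%}")
```

A low score on a measure like this shows the *expression* differs, which is exactly the distinction copyright law cares about — and exactly why critics argue the number misses the point when the functional behavior is identical.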
This creates a practical nightmare for license enforcement. If AI can strip copyleft protections by reimplementing exposed APIs, every GPL and LGPL project becomes vulnerable. Developers could feed their least-favorite license into an AI and get back MIT-licensed equivalents. The four freedoms that copyleft protects start looking negotiable.
What Happens Next
The chardet situation remains unresolved. Pilgrim wants the license reverted. Blanchard's holding firm. GitHub's watching closely — this could set precedent for how platforms handle AI-assisted license laundering.
More importantly, this opens a massive loophole in copyleft enforcement. The Free Software Foundation hasn't weighed in yet, but they'll need to. AI companies might face pressure to detect and refuse reimplementation requests that target specific open source projects. Or we might see new licenses that explicitly prohibit AI-assisted reimplementation.
For now, every maintainer of LGPL or GPL code has to wonder: is their project's license just one AI prompt away from irrelevance?
Key Points
Chardet maintainer used Claude AI to reimplement LGPL library as MIT-licensed code, claiming 48x performance improvement and less than 1.3% code similarity
Original author Mark Pilgrim argues this violates LGPL terms since maintainer had extensive exposure to original codebase, making it a derivative work
Debate centers on whether AI-assisted reimplementation constitutes legitimate clean-room engineering or license laundering through AI mediation
Case could establish precedent for using AI to circumvent copyleft licenses across open source ecosystem
Legal analysis suggests current copyright law may permit this practice, raising ethical questions about exploiting legal loopholes versus respecting license intent
FAQs
Can an AI-generated reimplementation legally drop a copyleft license?
Under current copyright law, yes. Copyright protects specific expression, not functionality or ideas. If AI-generated code shares minimal similarity with the original while implementing the same API, it may qualify as a new work. However, this remains legally untested territory.
Are other copyleft projects vulnerable to the same move?
Potentially yes. Any license that relies on derivative work definitions could be vulnerable to AI reimplementation. GPL, LGPL, and other copyleft licenses are particularly at risk since their copyleft provisions trigger on derivative works.
How does AI change clean-room engineering?
Traditional clean-room uses two isolated teams: one analyzes the original, another implements from specifications. AI breaks this model by mediating between analysis and implementation, creating a gray area where the implementer (AI) has potentially absorbed patterns from training data that includes similar code.
How can copyleft projects protect themselves?
Projects might need license updates explicitly prohibiting AI-assisted reimplementation, technical measures like code obfuscation, or new legal strategies. The Free Software Foundation may issue guidance, but enforcement becomes nearly impossible against distributed AI systems.
Where does GitHub stand?
GitHub hasn't taken a position yet. They face competing pressures: supporting developers who want to use AI tools while protecting open source licenses. The outcome of the chardet dispute may influence platform policies.