Alejandro Cantero Jódar

Claude Code vs. Codex: Limits, Strategies, and the Chinese Alternative

Programming


New usage limits in Claude Code

Anthropic, the company behind Claude Code, has recently imposed stricter restrictions on the use of its popular coding assistant. Starting August 28, 2025, the paid plans Claude Pro ($20/month) and Claude Max ($100–$200/month) include weekly hour limits. These are added to the existing 5-hour session windows (once the first prompt starts, a 5-hour block begins during which all messages count). In practice, this means even paid subscribers now have a weekly cap on Claude Code usage—around 40–80 hours for Pro users, with proportionally higher limits in Max tiers.

The restrictions aim to curb a small group of users who abused the system, such as those running Claude Code 24/7 or even reselling access by sharing accounts. Before these changes, some “superusers” reportedly consumed up to $10,000 worth of tokens in a single month under the $200 plan—a clearly unsustainable level of use that forced Anthropic to act. According to the company, “most users won’t notice any difference,” since only about 5% exceeded the new limits, but for heavy developers, the adjustment has been harsh.

In the free Claude.ai plan, the allowance is even smaller (estimated at 2–5 prompts every 5 hours on average), while the Pro plan allows around 10–40 prompts per session. It’s clear Anthropic is redefining what “unlimited” really means, prioritizing service stability and cost control over complete freedom for advanced users.

 

OpenAI and the progressive limitation of free tokens

OpenAI, known for its Codex model (and ChatGPT), has also adopted usage control strategies. Initially, it attracted developers with generous free token allowances but has gradually reduced these “gifts,” similar to Anthropic.

An example is the free tokens program launched in 2024–2025: OpenAI offered users up to 1 million free tokens per day on its code models if they agreed to share their data to improve the AI. This program, initially set to end on April 30, 2025, was extended indefinitely (with a 30-day notice required before its eventual closure). However, there was a catch: the free tokens could only be used with the GPT-4o model (an older model), not the newer and more powerful “GPT-4o-latest.” In other words, OpenAI restricted free access to an inferior model, incentivizing users who wanted top quality to pay.

Additionally, ChatGPT itself enforces limits: free-tier users have a message cap (roughly 10 messages every 5 hours), and advanced features like GPT-4, Code Interpreter, or image tools require a paid Plus subscription. Even in developer environments, Codex/ChatGPT users report hidden weekly caps in practice: before a recent change, a Plus subscriber could hit their quota after a few million tokens in a few days and be blocked until the next week—something OpenAI addressed only by introducing a higher-tier Pro plan.

All this shows that OpenAI has embraced the same strategy: rationing free or included tokens per plan and adjusting limits as demand grows. For developers, these graduated restrictions cause the same frustration felt with Claude: the “unlimited use” era of coding assistants has ended, replaced by invisible walls determining how much you can code before paying extra.

 

GLM-4.6: the low-cost Chinese alternative

In response to Anthropic and OpenAI’s restrictions, new alternatives have emerged. One of the most notable is GLM-4.6, a Chinese large language model that’s becoming a serious contender in coding tasks. What’s interesting about GLM-4.6 is not just its technical performance but its pricing strategy: it’s available through Zhipu AI’s GLM Coding Plan for as little as $3 per month (in promotional annual plans, or about $6/month with monthly billing).

This low-cost plan allows integration of GLM-4.6 into popular development environments—even inside Claude Code’s own interface or tools like Cline, OpenCode, or VS Code. In other words, users can “plug in” the Chinese model into the same tools previously used for Claude, gaining more usage for much less money.
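As a minimal sketch of what that “plugging in” looks like: Claude Code reads its backend endpoint and credentials from environment variables, so redirecting it to an Anthropic-compatible GLM endpoint typically comes down to two exports. The URL and key below are illustrative placeholders, not verified values; check Zhipu AI’s GLM Coding Plan documentation for the exact endpoint and your own API key.

```shell
# Point Claude Code at an Anthropic-compatible GLM endpoint.
# Both values below are illustrative: substitute the base URL and
# API key shown in your own GLM Coding Plan dashboard.
export ANTHROPIC_BASE_URL="https://api.z.ai/api/anthropic"
export ANTHROPIC_AUTH_TOKEN="your-glm-api-key"

# Then launch Claude Code as usual; its requests now go to the GLM backend:
#   claude
```

Because these are ordinary environment variables, you can scope them to a single project (e.g., in a `.envrc` or shell script) and keep your regular Anthropic account untouched elsewhere.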

My own tests with GLM-4.6 inside Claude Code have been pleasantly surprising: the model performs as expected, generating code comparable in quality to Claude or ChatGPT in most cases. Developers behind GLM-4.6 claim that in formal evaluations, the model matches Claude 4’s performance across multiple benchmarks, making it the most advanced AI model developed in China. This isn’t empty hype; in 74 real coding tests within Claude Code, GLM-4.6 won or tied against Claude Sonnet 4 (Claude’s programming-optimized variant) in the majority of tasks, proving equally effective and more efficient in token consumption.

This is remarkable given the huge price difference. GLM-4.6 also boasts strong technical features: it handles up to 200,000 tokens of context (enough to work across large codebases) and can use external tools during generation, giving it powerful reasoning and web navigation abilities. In short, for a fraction of the cost of ChatGPT or Claude, this Chinese alternative delivers high-level performance that’s attracting developers tired of Western firms’ restrictions.


Chart: Comparison of GLM-4.6 vs. Claude Sonnet 4 in real-world programming tasks. In ~49% of tested cases, GLM-4.6 achieved better results (blue bars), tied in ~9% (black), and lost in ~42% (green). This indicates near-par performance with Claude in coding—remarkable for such a low-cost model.


Chart: Token efficiency per interaction (prompt + responses including tool calls). GLM-4.6 (blue) consumes fewer tokens per round (~651K on average) than competitors like its predecessor GLM-4.5 (green) or other advanced models like K2 or DeepSeek (gray). This efficiency lowers operating costs while maintaining productivity, enabling cheap pricing without sacrificing capability.

 

Thanks to these traits, GLM-4.6 is rapidly gaining disillusioned users. For just $3–6/month, developers constrained by Anthropic and OpenAI’s limits can code again with virtually unlimited freedom. The community has noticed: some call GLM’s new plan “insane value” compared to Claude—and rightly so. Instead of worrying about 5-hour message windows or weekly caps, GLM users can simply focus on coding. The promise of “3× more usage than Claude Code for $3” sounds almost utopian but is now reality for anyone willing to step outside the usual ecosystem.

 

Tight competition and the balance between tokens and price

The rise of models like GLM-4.6 shows how fierce the coding AI race has become. Just two years ago, few imagined a Chinese model would rival Silicon Valley’s best. Today, Chinese models rank among the world’s top three in this domain, matching benchmarks of Claude and GPT—at a fraction of the cost.

This shift has forced a reevaluation of the balance between the number of tokens an effective coding agent needs and the price users are willing to pay. Companies like Anthropic and OpenAI have invested huge sums to build cutting-edge models and now seek to recoup costs via limits and premium tiers; but if they overdo the restrictions, they risk alienating their most loyal base, power users and developers, and pushing them straight to the competition. Many professional programmers process large codebases or multiple complex queries daily; for them, hitting “you’ve reached your limit, wait X hours” mid-work isn’t just annoying: it’s a productivity and revenue loss.

These users are proving they’ll pay, but only when they feel the value is fair. They prefer a slightly less polished or famous service if it gives them more freedom per dollar spent.

In this context, GLM-4.6 has found the perfect niche: enough tokens and power to meet real development needs, at a price that makes $20 or $200 monthly plans look excessive. Its success reveals an uncomfortable truth for the big players: the coding assistant market is becoming commoditized, and developer loyalty is fragile. Winning the race is no longer about being the smartest model—it’s about being the most accessible and generous.

The “token war” is underway, and for now, users benefit from having alternatives. In the end, this will push Anthropic, OpenAI, and new competitors toward a healthier balance between quality, quantity, and cost—a balance where limits are reasonable and transparent, and where paying for premium service means empowering your projects, not throttling them.

That’s the key lesson here: when a community feels unfairly restricted, it will always find—or create—an alternative to reopen the gates. And today, that alternative is called GLM-4.6.

