A working agent, not a source dump
A genuine tool-calling loop, a streaming REPL, session history, and multi-turn execution. Ported from the real Claude Code TypeScript architecture and shipped as a CLI you actually run.
ClawCodex is a token-efficient Python rebuild of Claude Code — 230K lines of pure Python. A genuine tool-calling loop, a streaming REPL, skills, and one runtime in front of Anthropic, OpenAI, Z.ai GLM, MiniMax, OpenRouter, and DeepSeek, where prefix-cache reuse makes long DeepSeek sessions over 200× cheaper to run.
ClawCodex puts a real runtime around the model: a working loop, every provider, and code you can read and extend.
A genuine tool-calling loop, a streaming REPL, session history, and multi-turn execution. Ported from the real Claude Code TypeScript architecture and shipped as a CLI you actually run.
Claude Code targets Claude models only. ClawCodex puts Anthropic, OpenAI, Z.ai GLM, MiniMax, OpenRouter, and DeepSeek behind the same loop — so you swap vendor, region, and price tier without giving up tools or skills.
Idiomatic Python with full type hints, real test suites, and markdown-driven SKILL.md extensibility. Fork it, add a tool or a skill, and make it yours. MIT licensed.
Streaming, skills, permissions, sessions, MCP, and a scriptable headless mode — the same loop whichever model you point it at.
ClawCodex keeps the request prefix byte-stable so DeepSeek's prompt cache covers your whole system + tools + history span. Cache-hit input bills at about $0.0435 per 1M tokens, over 200× cheaper than Claude Fable 5, so long agentic sessions cost pennies. The longer you code, the more you save.
True API streaming for direct replies plus richer streaming during tool-driven loops. Toggle live output with /stream and re-render clean Markdown with /render-last.
Markdown-based SKILL.md slash commands with named arguments and per-skill tool limits. Project skills and user skills, loaded from .clawcodex/skills.
An inline prompt_toolkit + Rich REPL with history, tab completion, and multiline input. Launch the Textual TUI any time with clawcodex --tui or /tui.
Plan (read-only), acceptEdits, dontAsk, and an explicit bypass for sandboxes. The REPL, TUI, and headless -p mode all honor the same gates.
Save and reload conversations locally. auto_save writes each session; max_history caps retained turns. Everything stays on your machine.
Model Context Protocol tools and resources, plus pre/post tool-use lifecycle hooks for shell, prompt, agent, and HTTP automation.
Run -p for one-shot prompts, with --output-format json or stream-json for pipes, CI, and agent-to-agent workflows.
A TS-parity Read pipeline sniffs magic bytes and resizes to API limits. @-mention an image to inline it, with Anthropic-to-OpenAI block translation for vision-capable backends.
Swap vendor, region, and price tier without giving up tools or skills. Six providers ship today.
Plus any OpenAI-compatible endpoint via a custom base_url — self-hosted vLLM, SGLang, or Ollama included.
# point a single run at a different backend
clawcodex --provider deepseek --model deepseek-v4-pro
clawcodex --provider zai --model glm-5.2 -p "refactor utils.py"
Read code, call tools behind permission gates, leave real evidence, and resume across sessions.
You type in the REPL or pipe a prompt with -p. Plan mode is read-only by default.
The selected provider streams the reply; tool calls overlap with the stream.
Reads, edits, bash, and web run behind permission gates, leaving real evidence.
Results feed the next turn. Sessions save locally so long work survives.
The heartbeat that orchestrates model calls and tool execution.
30+ tools: read files, run bash, search the web, drive sub-agents and MCP.
Background workflows and agent orchestration, journaled and resumable.
A bootstrap layer and an app layer, kept independent by design.
Relevance scanning over project, user, and team CLAUDE.md files.
Lifecycle events with shell, prompt, agent, and HTTP executors.
Full SWE-bench Verified split (499 instances), both agents driven by Gemini 2.5 Pro under one standardized harness.
Same CLI you just installed, same agent loop, same tools. No hand-edits.
A mini CRM with contacts, deals, a dashboard, and a full test suite.
demos/crm-app
Profile, network, jobs, and messaging in a familiar feed layout.
demos/linkedin-app
A browser voxel sandbox with terrain, mining, a HUD, and player controls.
demos/minecraft-app
Animated hero, live countdown, host nations, and 16 stadiums. Built with GLM-5.2.
demos/wc26-intro
DeepSeek is now the default provider, with a prefix-cache optimization that makes long agentic coding sessions dramatically token-cheaper. Recent highlights from the repository.
ClawCodex keeps its request prefix byte-stable across turns, so DeepSeek's prompt cache covers the whole system + tools + history span. Cache-hit input bills at about $0.0435 per 1M tokens, over 200× cheaper than Claude Fable 5. Gated to DeepSeek; every other provider is byte-for-byte unchanged.
A Python workflow engine — agent(), parallel(), pipeline(), phase() — with journaling and resume, wired end-to-end with per-agent retry and worktree isolation, plus a bundled /deep-research harness.
DeepSeek-V4-Pro is wired in as the default, with its 1M-token context window registered and a per-model prompt-cache hit-rate plus cost surfaced in /cost. Interrupted streams now recover truncated tool-call argument JSON instead of dropping it.
Pair a cheap worker model with an expensive reviewer consulted only at decision points — roughly 6x cheaper than running the expensive model alone, with live token and USD cost in the status bar.
No CLA. No sponsor lockouts. Issues triaged in the open, releases cut from main, and the maintainer reads everything. Bring a real test and prose that tells the reviewer what you were thinking.