DeepSeek Reasonix: Architecture Analysis
What is Reasonix?
A DeepSeek-native, cache-first coding agent for the terminal. Unlike general-purpose agents, every architectural decision is justified by a DeepSeek-specific behavior or economic property. The product north star: "a coding agent that stays cheap enough to leave on."
Key differentiator: cache stability is an invariant the loop is designed around, not a feature you toggle on. The entire codebase is tuned to DeepSeek's byte-stable automatic prefix-caching mechanic, achieving a real-world 99.82% cache hit rate (435M input tokens, ~$12 instead of ~$61).
npm: reasonix · Node ≥22 · MIT License · TypeScript 5.6+ · Commander.js + Ink 5 TUI
flowchart TB
subgraph "User Interface"
CLI["CLI Entry\nsrc/cli/index.ts"]
TUI["Ink TUI\nsrc/cli/ui/App.tsx\n(~1984 LOC)"]
DASH["Dashboard SPA\nReact + Express\nport :3100"]
end
subgraph "Core Engine"
LOOP["CacheFirstLoop\nsrc/loop.ts\n(1052 LOC)"]
CM["ContextManager\nsrc/context-manager.ts\n(345 LOC)"]
REPAIR["ToolCallRepair\nsrc/repair/"]
PROMPT["Prompt Builder\nsrc/code/prompt.ts\n+ prompt-fragments.ts"]
end
subgraph "Tool System"
TOOLS["ToolRegistry\nsrc/tools.ts"]
FS["Filesystem\nread/write/edit/search"]
SHELL["Shell\nrun_command + jobs"]
SUB["Subagents\nspawn_subagent"]
MCP_BRIDGE["MCP Bridge\nstdio + SSE"]
WEB["Web\nsearch + fetch"]
end
subgraph "Infrastructure"
CLIENT["DeepSeek Client\nsrc/client.ts\nfetch + SSE streaming"]
SESSION["Session Store\nJSONL persistence"]
MEMORY["Memory Store\nUser / Project / Runtime"]
TOKENIZER["DeepSeek V3 Tokenizer\nPorted from Python"]
end
CLI --> TUI
TUI --> LOOP
LOOP --> CM
LOOP --> REPAIR
LOOP --> PROMPT
LOOP --> TOOLS
TOOLS --> FS
TOOLS --> SHELL
TOOLS --> SUB
TOOLS --> MCP_BRIDGE
TOOLS --> WEB
LOOP --> CLIENT
LOOP --> SESSION
LOOP --> MEMORY
LOOP --> TOKENIZER
DASH --> LOOP
Technology Stack & Module Layout
Core Stack
- Language: TypeScript 5.6+, ES2022, ESM ("type": "module")
- CLI Framework: Commander.js + Ink 5 (React 18 for terminal UI)
- Testing: Vitest 2.x, ~250 test files
- Lint/Format: Biome 1.9 (2-space, double quotes, always semicolons, 100 width)
- Build: tsup (bundler), tsx (dev runner)
- Desktop: Tauri (Rust shell, macOS/Windows/Linux)
- Dashboard: React + Vite + Express (port 3100)
Module Map
| Directory | Purpose | Key Files |
|---|---|---|
| src/cli/ | CLI Entry + TUI | index.ts, commands/code.tsx, commands/chat.tsx, ui/App.tsx |
| src/tools/ | Tool Definitions | filesystem.ts, shell.ts, subagent.ts, plan.ts, web.ts, memory.ts |
| src/loop/ | Agent Loop Engine | dispatch.ts, messages.ts, streaming.ts, thinking.ts, healing.ts |
| src/repair/ | Tool-Call Repair | flatten.ts, scavenge.ts, storm.ts, truncation.ts |
| src/mcp/ | MCP Protocol | client.ts, stdio.ts, sse.ts, registry.ts, spec.ts |
| src/code/ | Edit Engine | edit-blocks.ts, prompt.ts, setup.ts, diff-preview.ts |
| src/memory/ | Persistence | session.ts, runtime.ts (3-region model), project.ts, user.ts |
| src/core/ | Kernel | events.ts, reducers.ts, eventize.ts, inflight.ts, pause-gate.ts |
| src/ports/ | Interfaces | model-client.ts, tool-host.ts, event-sink.ts, memory-store.ts |
| src/transcript/ | Logging | log.ts, diff.ts, replay.ts |
The Four Architectural Pillars
Pillar 1: Cache-First Loop src/loop.ts
DeepSeek bills cached input at ~10% of the miss rate. The loop partitions context into three regions to maximize prefix-cache stability:
flowchart TB
subgraph I["IMMUTABLE PREFIX: Fixed for session"]
SYS["System Prompt\n(codeSystemBase)"]
TOOLS["Tool Specifications\n(ToolRegistry.specs)"]
FEW["Few-Shot Examples"]
end
subgraph L["APPEND-ONLY LOG: Grows monotonically"]
direction TB
A1["assistant_1"]
T1["tool_result_1"]
A2["assistant_2"]
T2["tool_result_2"]
A1 --> T1 --> A2 --> T2
end
subgraph S["VOLATILE SCRATCH: Reset each turn"]
R1["R1 Reasoning\n(reasoning_content)"]
PLAN["Transient Plan State"]
THOUGHTS["Working Thoughts"]
end
I --> L --> S
Invariants: (1) Prefix computed once per session, hashed, and pinned. (2) Log entries serialized in append order, with no rewrites. (3) Scratch distilled via Pillar 2 before folding into the log.
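The three regions and their invariants can be illustrated with a small sketch. This is not Reasonix's actual code: the `ThreeRegionContext` class, the `Message` shape, and the `fnv1a` hash are hypothetical stand-ins chosen to keep the example dependency-free; only the region names and invariants come from the text above.

```typescript
// Hypothetical sketch of the three-region context model; all names are illustrative.
type Message = { role: "system" | "user" | "assistant" | "tool"; content: string };

// FNV-1a (32-bit): a stand-in hash so the sketch needs no imports.
function fnv1a(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

class ThreeRegionContext {
  readonly prefix: readonly Message[]; // immutable prefix, fixed for the session
  readonly prefixHash: string; // computed once in the constructor, then pinned
  private readonly log: Message[] = []; // append-only log
  private scratch: string[] = []; // volatile scratch, reset each turn

  constructor(prefix: Message[]) {
    this.prefix = Object.freeze([...prefix]);
    this.prefixHash = fnv1a(JSON.stringify(this.prefix));
  }

  // Invariant 2: entries are only ever appended, never rewritten.
  append(msg: Message): void {
    this.log.push(msg);
  }

  note(thought: string): void {
    this.scratch.push(thought);
  }

  resetScratch(): void {
    this.scratch = [];
  }

  scratchSize(): number {
    return this.scratch.length;
  }

  // Prefix first, then the log in append order (scratch handling is omitted here).
  buildMessages(): Message[] {
    return [...this.prefix, ...this.log];
  }

  // True while the serialized prefix still matches the pinned hash.
  prefixIsStable(): boolean {
    return fnv1a(JSON.stringify(this.prefix)) === this.prefixHash;
  }
}
```

Because every request begins with the same serialized prefix followed by an unchanged log head, each turn extends the previous request rather than diverging from it, which is the property an automatic prefix cache rewards.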
Cache hit ratio, prompt_cache_hit_tokens / (hit + miss), is exposed per turn and aggregated per session. It is visible in the TUI top-bar cache cell.
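The ratio can be computed directly from the usage fields returned with each response. The sketch below assumes the session figure is obtained by summing tokens across turns before dividing; the source does not say how the aggregation is done, and the function names are illustrative.

```typescript
// Usage fields named in the text; the interface name is hypothetical.
interface CacheUsage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
}

// Per-turn ratio: hit / (hit + miss). Returns 0 when no prompt tokens were counted.
function cacheHitRatio(u: CacheUsage): number {
  const total = u.prompt_cache_hit_tokens + u.prompt_cache_miss_tokens;
  return total === 0 ? 0 : u.prompt_cache_hit_tokens / total;
}

// Session ratio (assumption): sum tokens first, then divide, so large turns
// weigh more than small ones. This differs from averaging per-turn ratios.
function sessionHitRatio(turns: CacheUsage[]): number {
  const sum: CacheUsage = { prompt_cache_hit_tokens: 0, prompt_cache_miss_tokens: 0 };
  for (const t of turns) {
    sum.prompt_cache_hit_tokens += t.prompt_cache_hit_tokens;
    sum.prompt_cache_miss_tokens += t.prompt_cache_miss_tokens;
  }
  return cacheHitRatio(sum);
}
```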
Pillar 2: Tool-Call Repair src/repair/
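Four-pass pipeline addressing DeepSeek-specific failure modes (flatten is applied to tool schemas at registration; the other three passes run on each model response):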
flowchart LR
INPUT["Model Response\n(tool_calls + reasoning_content)"]
subgraph "Pass 1: Flatten"
FLAT["Schema Flatten\n>10 params or depth>2\n→ dot-notation"]
end
subgraph "Pass 2: Scavenge"
SCAV["Scavenge\nRegex + JSON parser\nSweeps reasoning_content\nfor forgotten tool calls"]
end
subgraph "Pass 3: Truncation Repair"
TRUNC["Truncation Repair\nDetect unbalanced JSON\nClose braces or\nrequest continuation"]
end
subgraph "Pass 4: Storm"
STORM["Storm Breaker\nIdentical (tool, args)\nwithin sliding window\n→ suppress + reflect"]
end
INPUT --> FLAT --> SCAV --> TRUNC --> STORM
STORM --> DISPATCH["dispatchToolCallsChunked()"]
| Pass | File | Problem Solved |
|---|---|---|
| 1. Flatten | flatten.ts | DeepSeek drops args when schema has >10 leaf params or depth >2; auto-flatten to dot-notation, re-nest at dispatch |
| 2. Scavenge | scavenge.ts | Tool-call JSON emitted inside <think>, missing from final message |
| 3. Truncation | truncation.ts | Truncated JSON due to max_tokens hit mid-structure |
| 4. Storm | storm.ts | Same tool called repeatedly with identical args (call-storm) |
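To make the flatten pass concrete, here is a minimal sketch of flattening nested arguments to dot-notation keys and re-nesting them at dispatch. The function names are hypothetical and the sketch assumes argument keys contain no dots; flatten.ts may handle arrays, empty objects, and schema metadata differently.

```typescript
type Json = string | number | boolean | null | Json[] | { [k: string]: Json };
type JsonObject = { [k: string]: Json };

// Flatten nested objects into dot-notation keys (arrays are kept as leaf values).
function flattenArgs(obj: JsonObject, prefix = ""): JsonObject {
  const out: JsonObject = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (value !== null && typeof value === "object" && !Array.isArray(value)) {
      Object.assign(out, flattenArgs(value, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Inverse step: rebuild the nested shape before calling the real tool.
function renestArgs(flat: JsonObject): JsonObject {
  const root: JsonObject = {};
  for (const [path, value] of Object.entries(flat)) {
    const parts = path.split(".");
    const leaf = parts.pop() ?? path;
    let node = root;
    for (const part of parts) {
      const next = node[part];
      if (next === null || typeof next !== "object" || Array.isArray(next)) {
        const created: JsonObject = {};
        node[part] = created;
        node = created;
      } else {
        node = next;
      }
    }
    node[leaf] = value;
  }
  return root;
}
```

For example, `{ file: { path: "a.ts" } }` flattens to `{ "file.path": "a.ts" }`, which gives the model a single-level schema to fill in while the tool still receives its original nested arguments.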
Pillar 3: Cost Control v0.6
Tiered Model Presets
| Preset | Model | Effort | Relative Cost |
|---|---|---|---|
| flash | v4-flash | max | 1× |
| auto (default) | v4-flash → v4-pro on hard turns | max | 1–3× |
| pro | v4-pro | max | ~12× |
Cost Mechanisms
- Turn-end auto-compaction: Tool results exceeding 3000 tokens shrunk when turn ends
- Proactive 40% threshold: Context-ratio above 40% triggers pre-emptive shrink before 80% emergency
- Model self-report escalation (<<<NEEDS_PRO>>>): Model decides when task exceeds current tier
- Auxiliary calls hard-code flash: Summarization, subagent spawns, truncation repair all use v4-flash regardless of user preset
- Parallel tool dispatch: Read-only tools race concurrently (up to 3) to reduce turn latency
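The preset and escalation rules above can be condensed into one selection function. This is a sketch under assumptions: the function name, the `auxiliary` flag, and the plain substring check for the marker are illustrative, and the real loop may detect the marker differently (for example, only at the start of a reply).

```typescript
const NEEDS_PRO_MARKER = "<<<NEEDS_PRO>>>";

type Preset = "flash" | "auto" | "pro";

// Pick the model for the next call.
// - Auxiliary calls (summaries, subagent spawns, truncation repair) always use flash.
// - "pro" always uses pro; "flash" never escalates.
// - "auto" escalates only when the previous reply carried the self-report marker.
function pickModel(preset: Preset, previousReply: string, auxiliary = false): string {
  if (auxiliary) return "v4-flash";
  if (preset === "pro") return "v4-pro";
  if (preset === "auto" && previousReply.includes(NEEDS_PRO_MARKER)) return "v4-pro";
  return "v4-flash";
}
```

Keeping the decision in one pure function makes every escalation easy to surface to the user, in line with the "no silent cost escalation" non-goal.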
Pillar 4: MCP Protocol Integration src/mcp/
Model Context Protocol support with stdio + SSE transports. Tools registered via registry at startup; third-party MCP tools default to parallelSafe: false.
- Transports: stdio (subprocess), SSE (HTTP streaming), Streamable HTTP
- Registry: Marketplace overlay with categorized server list
- Lifecycle: Connect → initialize → list_tools → ready. Reconnect on crash
- Truncation: Results capped at DEFAULT_MAX_RESULT_TOKENS with save-to-disk option
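The connect, initialize, and list-tools lifecycle maps onto a short sequence of JSON-RPC 2.0 messages. The sketch below shows the wire shapes defined by the MCP specification; the helper names and the `"2024-11-05"` protocol version are assumptions for illustration and say nothing about how Reasonix's own client.ts builds them.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: unknown;
}

interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: unknown;
}

// Build the messages a client sends, in order, to reach the "ready" state.
function handshakeMessages(
  clientName: string,
  clientVersion: string,
): (JsonRpcRequest | JsonRpcNotification)[] {
  let nextId = 1;
  const initialize: JsonRpcRequest = {
    jsonrpc: "2.0",
    id: nextId++,
    method: "initialize",
    params: {
      protocolVersion: "2024-11-05", // assumed version string
      capabilities: {},
      clientInfo: { name: clientName, version: clientVersion },
    },
  };
  // Sent after the server answers "initialize"; notifications carry no id.
  const initialized: JsonRpcNotification = {
    jsonrpc: "2.0",
    method: "notifications/initialized",
  };
  const listTools: JsonRpcRequest = { jsonrpc: "2.0", id: nextId++, method: "tools/list" };
  return [initialize, initialized, listTools];
}

// The stdio transport frames each message as one line of JSON.
function frameForStdio(msg: JsonRpcRequest | JsonRpcNotification): string {
  return `${JSON.stringify(msg)}\n`;
}
```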
Prompt Construction Lineage
flowchart TB
subgraph "Layer 1: Shared Fragments"
TUI["TUI_FORMATTING_RULES\n• Table formatting\n• Code blocks\n• No ASCII art"]
NEG["NEGATIVE_CLAIM_RULE\n• Cite or shut up\n• search_content FIRST"]
ESC["escalationContract(modelId)\n• ESCALATION_CONTRACT\n• <<<NEEDS_PRO>>> marker"]
end
subgraph "Layer 2: Identity + Rules"
ID["Identity Block\n• You are Reasonix Code\n• Don't infer from workspace"]
CITE["Citation Rules\n• File paths with line ranges\n• Broken paths = red strikethrough"]
AUDIT["Audit Rails (6 rules)\n• Auto-preview limits\n• Flag→consumer trace\n• No fabricated %"]
end
subgraph "Layer 3: Tool Guidance"
PICK["Tool Selection\n• submit_plan vs ask_choice\n• todo_write"]
PLAN_MODE["Plan Mode\n• Bounced writes\n• submit_plan required"]
SUBAGENT["Subagent Delegation\n• Default: don't delegate\n• Skill index with tags"]
end
subgraph "Layer 4: Edit Protocol"
EDIT["SEARCH/REPLACE Format\n• Read before edit enforced\n• Exact whitespace match\n• Edit gate routes review/auto"]
WHEN["When to Edit\n• Only on explicit change ask\n• Analyze → tools + prose"]
end
subgraph "Layer 5: Memory Stack"
UM["User Memory\n~/.reasonix/memory/"]
PM["Project Memory\nREASONIX.md"]
SM["Skill Memos\n@reasonix/core-utils"]
end
subgraph "Output"
FINAL["CODE_SYSTEM_PROMPT\nBuilt per-session by\ncodeSystemBase(modelId)\nwith real model name"]
end
TUI --> ID
NEG --> CITE
ESC --> PICK
ID --> PICK
CITE --> AUDIT
AUDIT --> PLAN_MODE
PICK --> SUBAGENT
PLAN_MODE --> EDIT
SUBAGENT --> WHEN
EDIT --> FINAL
WHEN --> FINAL
UM --> FINAL
PM --> FINAL
SM --> FINAL
Prompt Assembly Code Path
codeSystemBase(modelId)
├─ CODE_SYSTEM_TEMPLATE (180 lines, src/code/prompt.ts)
├─ TUI_FORMATTING_RULES (from prompt-fragments.ts)
├─ NEGATIVE_CLAIM_RULE (from prompt-fragments.ts)
├─ escalationContract(modelId) (from prompt-fragments.ts)
├─ Identity block (lines 13-17)
├─ Citation rules (lines 19-21)
├─ Audit rails (lines 23-31)
├─ Tool selection guidance (lines 34-38)
├─ Plan mode constraints (lines 40-42)
├─ Subagent delegation policy (lines 44-48)
├─ Edit protocol + SEARCH/REPLACE format (lines 50-80+)
└─ __ESCALATION_CONTRACT__ (replaced at runtime with model-specific contract)
applyMemoryStack() overlays:
├─ User memory files (HIGH PRIORITY constraints block)
├─ Project memory (REASONIX.md or CLAUDE.md)
└─ Skill memos (@reasonix/core-utils compaction module)
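The assembly path above amounts to a placeholder substitution followed by fixed-order overlays. The sketch below is a simplified, hypothetical rendering: `buildSystemPrompt`, the `MemoryStack` shape, the section headings, and the bracketed contract text are all invented for illustration. What it aims to show is that the output depends only on its inputs, so the same session inputs yield a byte-identical prompt.

```typescript
const ESCALATION_PLACEHOLDER = "__ESCALATION_CONTRACT__";

// Placeholder text only; the real contract lives in prompt-fragments.ts.
function escalationContractSketch(modelId: string): string {
  return `[escalation contract for ${modelId}]`;
}

// Hypothetical shape for the three overlay sources.
interface MemoryStack {
  user: string[];
  project: string | null;
  skills: string[];
}

function buildSystemPrompt(template: string, modelId: string, memory: MemoryStack): string {
  // Replace every occurrence of the placeholder with the model-specific contract.
  const base = template.split(ESCALATION_PLACEHOLDER).join(escalationContractSketch(modelId));
  const sections: string[] = [base];
  // Overlays are appended in a fixed order so the result is deterministic.
  if (memory.user.length > 0) {
    sections.push(["# User memory (HIGH PRIORITY)", ...memory.user].join("\n"));
  }
  if (memory.project !== null) {
    sections.push(`# Project memory\n${memory.project}`);
  }
  if (memory.skills.length > 0) {
    sections.push(["# Skill memos", ...memory.skills].join("\n"));
  }
  return sections.join("\n\n");
}
```

Avoiding timestamps, random IDs, or unordered iteration in this step matters because any byte that changes in the system prompt invalidates the cached prefix for everything after it.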
Agent Loop: CacheFirstLoop
flowchart TB
START(["User submits prompt"])
subgraph "Turn Setup"
RESET["resetStorm()\nFresh intent = clean slate"]
LOAD["loadSessionMessages()\nRestore from JSONL"]
HEAL["healLoadedMessages()\nFix tool call pairing"]
CHECK["Budget check\nRefuse if over cap"]
end
subgraph "Iteration Loop"
BUILD["Build messages:\nprefix + log + scratch"]
STREAM["streamModelResponse()\nSSE from DeepSeek API"]
THINK["Strip/retain reasoning\nthinking mode handling"]
REPAIR["ToolCallRepair.process()\nscavenge→truncation→storm"]
end
subgraph "Dispatch Phase"
DISPATCH["dispatchToolCallsChunked()\nParallel-safe chunking"]
RUN["runOne(call)\nToolRegistry.dispatch()"]
APPEND["appendAndPersist(msg)\nLog + session JSONL"]
end
subgraph "Post-Turn"
COMPACT["decideAfterUsage()\nFold if ratio > 75%"]
BUDGET["updateBudget()\nWarn at 80%, stop at 100%"]
end
START --> RESET --> LOAD --> HEAL --> CHECK
CHECK --> BUILD --> STREAM --> THINK --> REPAIR
REPAIR -->|has tool calls| DISPATCH
REPAIR -->|no tool calls| COMPACT
DISPATCH --> RUN --> APPEND
APPEND -->|more iter cycles| BUILD
APPEND -->|turn done| COMPACT
COMPACT -->|fold| HEAL
COMPACT -->|carry on| DONE(["Await next user input"])
COMPACT -->|exit summary| FORCE(["forceSummaryAfterIterLimit()"])
Key Loop States
| State | File | Description |
|---|---|---|
| ImmutablePrefix | memory/runtime.ts | System prompt + tool specs + few-shots, pinned for session |
| AppendOnlyLog | memory/runtime.ts | Monotonically growing conversation; preserves cache prefix |
| VolatileScratch | memory/runtime.ts | R1 reasoning + transient state, reset each turn |
| ReadTracker | tools/read-tracker.ts | Files model has read; gates edit_file. Cleared on fold |
| SessionStats | telemetry/stats.ts | Per-session token usage + cost + cache hit ratio |
| InflightSet | core/inflight.ts | Tracking in-progress tool calls for TUI spinner display |
Mid-Turn Steering
Users can inject guidance during a running turn. The wrapper constant preserves context:
MID_TURN_STEER_WRAPPER = "[Mid-turn steer queued by the user. Do not treat this as a new task; use it only as additional guidance for the current task after completing the current step.]"
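One way to picture mid-turn steering is a small queue that is drained between steps. The wrapper text below is quoted from the constant above; the `SteerQueue` class, the blank-line separator, and the choice to deliver steers as user messages are assumptions made for this sketch.

```typescript
const MID_TURN_STEER_WRAPPER =
  "[Mid-turn steer queued by the user. Do not treat this as a new task; use it only as additional guidance for the current task after completing the current step.]";

// Hypothetical queue: collects steers while a turn runs, releases them between steps.
class SteerQueue {
  private pending: string[] = [];

  // Called when the user types while a turn is still running.
  enqueue(text: string): void {
    const trimmed = text.trim();
    if (trimmed.length > 0) this.pending.push(trimmed);
  }

  // Called after the current step: returns wrapped messages and empties the queue.
  drain(): { role: "user"; content: string }[] {
    const out = this.pending.map((text) => ({
      role: "user" as const,
      content: `${MID_TURN_STEER_WRAPPER}\n\n${text}`,
    }));
    this.pending = [];
    return out;
  }
}
```

Delivering the steer as a new message at the end of the conversation, rather than editing earlier messages, keeps the log append-only and leaves the cached prefix untouched.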
Parallel Tool Dispatch
flowchart TB
CALLS["Repaired Tool Calls\n(repairedCalls[])"]
subgraph "Chunking Logic (dispatchToolCallsChunked)"
CHECK{"REASONIX_TOOL_DISPATCH\n=== 'serial'?"}
SERIAL["Serial Mode\nOne call per chunk"]
GROUP["Group parallel-safe calls\nup to REASONIX_PARALLEL_MAX (default 3)\nStop at first non-parallel-safe"]
BARRIER["Serial Barrier\nNon-parallel-safe call\nruns alone"]
end
subgraph "Execution"
RACE["Promise.allSettled()\nRace chunk in parallel"]
ORDER["Yield results\nin DECLARED ORDER\n(not completion order)"]
APPEND["Append tool messages\nto AppendOnlyLog"]
end
CALLS --> CHECK
CHECK -->|yes| SERIAL
CHECK -->|no| GROUP
GROUP --> BARRIER
SERIAL --> RACE
BARRIER --> RACE
RACE --> ORDER --> APPEND
APPEND -->|more calls| CHECK
APPEND -->|done| NEXT(["Return to agent loop"])
Parallel-Safe Tools (Built-In)
| Tool | Why Safe |
|---|---|
| read_file, list_directory, directory_tree | Read-only filesystem ops |
| search_files, search_content, get_file_info | Read-only search ops |
| web_search, web_fetch | External, no filesystem side effects |
| recall_memory, semantic_search | Read-only memory access |
| run_skill, spawn_subagent | Isolated child loops |
| job_output, list_jobs | In-memory job queries |
Not parallel-safe: edit_file, write_file, run_command. These mutate state and must execute serially for read-after-write ordering.
REASONIX_PARALLEL_MAX (default 3, hard cap 16) · REASONIX_TOOL_DISPATCH=serial (escape hatch)
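The chunking and ordering rules can be sketched as two small functions. This is an illustration rather than the code in dispatch.ts: the `ToolCall` shape, the function names, and the error-string format are hypothetical. It relies on the documented behavior of `Promise.allSettled`, which returns results in input order regardless of completion order.

```typescript
interface ToolCall {
  name: string;
  args: unknown;
  parallelSafe: boolean;
}

// Group consecutive parallel-safe calls (up to parallelMax per chunk);
// any other call acts as a serial barrier and runs alone.
function chunkCalls(calls: ToolCall[], parallelMax = 3, serial = false): ToolCall[][] {
  const chunks: ToolCall[][] = [];
  let current: ToolCall[] = [];
  const flush = (): void => {
    if (current.length > 0) {
      chunks.push(current);
      current = [];
    }
  };
  for (const call of calls) {
    if (serial || !call.parallelSafe) {
      flush();
      chunks.push([call]);
    } else {
      current.push(call);
      if (current.length >= parallelMax) flush();
    }
  }
  flush();
  return chunks;
}

// Run chunks one after another; within a chunk, calls run concurrently
// and results are collected in declared order.
async function dispatchChunked(
  calls: ToolCall[],
  run: (call: ToolCall) => Promise<string>,
  parallelMax = 3,
): Promise<string[]> {
  const results: string[] = [];
  for (const chunk of chunkCalls(calls, parallelMax)) {
    const settled = await Promise.allSettled(chunk.map((c) => run(c)));
    for (const s of settled) {
      results.push(s.status === "fulfilled" ? s.value : `error: ${String(s.reason)}`);
    }
  }
  return results;
}
```

Yielding in declared order means the tool messages appended to the log are identical from run to run for the same model output, even when one read finishes before another.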
Tool-Call Repair Pipeline
Pass Order: scavenge → truncation → storm (flatten runs at construction)
flowchart TB
RAW["Raw Model Response\n• tool_calls[]\n• reasoning_content\n• content"]
subgraph "Phase 0: Registration Time"
ANALYZE["analyzeSchema()\nCheck depth > 2\nor leaf params > 10"]
FLAT_STORE["Store flatSchema\non InternalTool"]
end
subgraph "Phase 1: Scavenge"
COMBINE["Combine channels\nreasoning + content"]
SCAN["Regex + JSON parse\nFind tool calls in\n<think> + DSML markup"]
DEDUP["Deduplicate\n(by name+args signature)\nMax 4 scavenged calls"]
end
subgraph "Phase 2: Truncation Repair"
CHECK_JSON["Check each call's\nargument JSON"]
FIX["Close braces/strings\nFallback: leave original\n(rejects with 'invalid JSON')"]
end
subgraph "Phase 3: Storm Breaker"
WINDOW["Sliding window\n(default 6 calls)"]
DETECT["Detect identical\n(tool_name, args) tuples"]
ACT["3+ repeats in window\n→ suppress call\n→ inject reflection"]
end
RAW --> ANALYZE
RAW --> COMBINE --> SCAN --> DEDUP
DEDUP --> CHECK_JSON --> FIX
FIX --> WINDOW --> DETECT --> ACT
ACT --> REPAIRED(["Repaired calls → dispatchToolCallsChunked()"])
Scavenge Deep-Dive src/repair/scavenge.ts
Searches BOTH channels:
- reasoning_content: Where R1 models leak JSON tool calls inside <think> blocks
- content: Where DSML markup appears in regular text turns
Channels are joined with newline and scanned independently. Dedup prefers first-seen (declared calls take priority over scavenged). Only novel (name, args) signatures are added. Default max: 4 scavenged calls.
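A simplified version of the scavenge-and-dedup step might look like the following. It assumes leaked calls appear as JSON objects of the form `{"name": ..., "arguments": {...}}`, which is an illustrative format rather than the DSML markup the real scavenger also parses; the function names and the key-order-sensitive signature are likewise assumptions.

```typescript
interface Call {
  name: string;
  args: Record<string, unknown>;
}

// Collect top-level {...} spans, ignoring braces that sit inside JSON strings.
function jsonCandidates(text: string): string[] {
  const out: string[] = [];
  let depth = 0;
  let start = -1;
  let inString = false;
  let escaped = false;
  for (let i = 0; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"' && depth > 0) inString = true;
    else if (ch === "{") {
      if (depth === 0) start = i;
      depth++;
    } else if (ch === "}" && depth > 0) {
      depth--;
      if (depth === 0) out.push(text.slice(start, i + 1));
    }
  }
  return out;
}

// Signature used for dedup (sensitive to key order; fine for a sketch).
const signature = (c: Call): string => `${c.name}:${JSON.stringify(c.args)}`;

// Declared calls come first and win ties; at most `max` novel calls are added.
function scavenge(declared: Call[], reasoning: string, content: string, max = 4): Call[] {
  const seen = new Set(declared.map(signature));
  const result = [...declared];
  let added = 0;
  for (const raw of jsonCandidates(`${reasoning}\n${content}`)) {
    if (added >= max) break;
    let parsed: unknown;
    try {
      parsed = JSON.parse(raw);
    } catch {
      continue; // not valid JSON, skip this span
    }
    if (typeof parsed !== "object" || parsed === null) continue;
    const { name, arguments: args } = parsed as { name?: unknown; arguments?: unknown };
    if (typeof name !== "string" || typeof args !== "object" || args === null) continue;
    const call: Call = { name, args: args as Record<string, unknown> };
    const sig = signature(call);
    if (seen.has(sig)) continue;
    seen.add(sig);
    result.push(call);
    added++;
  }
  return result;
}
```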
Storm Breaker src/repair/storm.ts
Prevents infinite loops where the model calls the same tool with identical args repeatedly. Configurable:
- Window size: 6 calls (default)
- Threshold: 3 identical calls in window → suppress
- Exemptions: stormExempt: true for cheap state-inspection tools
- Reset: Mutating calls (writes, edits) clear the window, so post-edit verify reads aren't false positives
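The window, threshold, exemption, and reset rules can be sketched as a small class. The class name, method signature, and the decision to keep a suppressed call inside the window are assumptions for illustration; storm.ts may differ in these details.

```typescript
interface StormOptions {
  mutating?: boolean; // writes and edits clear the window
  stormExempt?: boolean; // cheap state-inspection tools are never counted
}

class StormBreaker {
  private recent: string[] = [];
  private readonly windowSize: number;
  private readonly threshold: number;

  constructor(windowSize = 6, threshold = 3) {
    this.windowSize = windowSize;
    this.threshold = threshold;
  }

  // Returns true when the call should be suppressed and a reflection injected.
  shouldSuppress(name: string, args: unknown, opts: StormOptions = {}): boolean {
    if (opts.mutating) {
      this.recent = []; // state changed, so a repeated read is legitimate again
      return false;
    }
    if (opts.stormExempt) return false;
    const sig = `${name}:${JSON.stringify(args)}`;
    this.recent.push(sig);
    if (this.recent.length > this.windowSize) this.recent.shift();
    const repeats = this.recent.filter((s) => s === sig).length;
    return repeats >= this.threshold;
  }

  // Called at the start of a new user turn (compare resetStorm() in the loop diagram).
  reset(): void {
    this.recent = [];
  }
}
```

With the defaults, the third identical `(tool, args)` pair within six calls is suppressed, while a read repeated after an intervening edit starts from an empty window.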
Context Manager: Auto-Compaction
flowchart TB
USAGE["After turn response\nUsage.promptTokens"]
subgraph "Threshold Decision (decideAfterUsage)"
FST{"> 80%?\nFORCE_SUMMARY_THRESHOLD"}
AGGR{"> 78%?\nHISTORY_FOLD_AGGRESSIVE"}
NORM{"> 75%?\nHISTORY_FOLD_THRESHOLD"}
PRE{"> 90% at turn start?\nTURN_START_FOLD_THRESHOLD"}
end
subgraph "Fold Execution"
PINS["Extract pinned skills\n(collectPinnedSkills)"]
SUMMARIZE["Summarize head messages\nv4-flash + effort=high\n15s timeout"]
REWRITE["rewriteSession()\nReplace head with summary\n+ pinned skill bodies\n+ recent tail"]
end
FST -->|yes| EXIT(["Exit with summary\nforceSummaryAfterIterLimit()"])
FST -->|no| AGGR
AGGR -->|yes| AGG_FOLD(["Aggressive fold\ntailBudget = 10% ctxMax"])
AGGR -->|no| NORM
NORM -->|yes| NORM_FOLD(["Normal fold\ntailBudget = 20% ctxMax"])
NORM -->|no| CARRY(["Carry on"])
PRE -->|yes| PRE_FOLD(["Pre-iter fold\nbefore turn starts"])
AGG_FOLD --> PINS
NORM_FOLD --> PINS
PRE_FOLD --> PINS
PINS --> SUMMARIZE --> REWRITE
Compaction Thresholds
| Constant | Value | Trigger |
|---|---|---|
| FORCE_SUMMARY_THRESHOLD | 0.80 | Exit turn with summary (defense in depth) |
| HISTORY_FOLD_AGGRESSIVE_THRESHOLD | 0.78 | Fold harder: tail budget 10% |
| HISTORY_FOLD_THRESHOLD | 0.75 | Normal fold: tail budget 20% |
| TURN_START_FOLD_THRESHOLD | 0.90 | Pre-turn fold: covers session restore, huge paste |
| HISTORY_FOLD_MIN_SAVINGS_FRACTION | 0.30 | Skip fold if head wouldn't shrink by ≥30% |
Blocks of the form <skill-pin name="...">...</skill-pin> are lifted from the head before summarization and re-appended verbatim after. This ensures skill procedures survive context folds.
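The threshold cascade and the pin-lifting step can be sketched together. The constants and their values come from the table above; the function names, the `Decision` type, the savings formula, and the regular expression are assumptions made for illustration, and the turn-start check is omitted.

```typescript
const FORCE_SUMMARY_THRESHOLD = 0.8;
const HISTORY_FOLD_AGGRESSIVE_THRESHOLD = 0.78;
const HISTORY_FOLD_THRESHOLD = 0.75;
const HISTORY_FOLD_MIN_SAVINGS_FRACTION = 0.3;

type Decision =
  | { kind: "exit-with-summary" }
  | { kind: "fold"; tailBudget: number }
  | { kind: "carry-on" };

// Checks run from the highest threshold down, mirroring the flowchart.
function decideAfterUsageSketch(promptTokens: number, ctxMax: number): Decision {
  const ratio = promptTokens / ctxMax;
  if (ratio > FORCE_SUMMARY_THRESHOLD) return { kind: "exit-with-summary" };
  if (ratio > HISTORY_FOLD_AGGRESSIVE_THRESHOLD) {
    return { kind: "fold", tailBudget: Math.floor(ctxMax * 0.1) };
  }
  if (ratio > HISTORY_FOLD_THRESHOLD) {
    return { kind: "fold", tailBudget: Math.floor(ctxMax * 0.2) };
  }
  return { kind: "carry-on" };
}

// Skip the fold when the summary would not shrink the head by at least 30%.
function worthFolding(headTokens: number, summaryTokens: number): boolean {
  if (headTokens <= 0) return false;
  return (headTokens - summaryTokens) / headTokens >= HISTORY_FOLD_MIN_SAVINGS_FRACTION;
}

// Lift pinned skill blocks out of the head so they can be re-appended verbatim.
function extractSkillPins(head: string): { pins: string[]; rest: string } {
  const pinPattern = /<skill-pin name="[^"]*">[\s\S]*?<\/skill-pin>/g;
  const pins: string[] = head.match(pinPattern) ?? [];
  return { pins, rest: head.replace(pinPattern, "") };
}
```

A fold rewrites the log head, so it necessarily invalidates part of the cached prefix; the minimum-savings check is what keeps that cost from being paid for a negligible reduction.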
MCP Protocol Architecture
flowchart TB
subgraph "Transport Layer"
STDIO["StdioTransport\nSubprocess spawn\nJSON-RPC over stdin/stdout"]
SSE["SseTransport\nHTTP GET /sse event stream\nPOST for client→server"]
STREAM_HTTP["StreamableHttpTransport\nStreaming HTTP with\nupgrade path"]
end
subgraph "Client"
CONNECT["McpClient.connect()\nHandshake + initialize"]
TOOLS["listTools()\n→ ToolSpec[] registered\nin ToolRegistry"]
CALL["callTool(name, args)\n→ result string\ncapped at maxResultTokens"]
end
subgraph "Registry"
CATALOG["catalog.ts\nMCP server directory"]
MARKETPLACE["marketplace-overlay/\nCurated server list\nwith i18n (zh-CN.json)"]
SPEC["spec.ts\nMCP schema validation"]
end
STDIO --> CONNECT
SSE --> CONNECT
STREAM_HTTP --> CONNECT
CONNECT --> TOOLS --> CALL
CATALOG --> MARKETPLACE
SPEC --> CONNECT
Design Evolution
| Version | Milestone |
|---|---|
| v0.0.x | Pillar 1 end-to-end, repair pipeline complete, Ink TUI scaffold |
| v0.1 | τ-bench numbers published, streaming polish, transcript replay |
| v0.3 | MCP client (stdio + SSE), session persistence |
| v0.4.x | reasonix code with SEARCH/REPLACE edits, review/auto gate, background jobs, hooks |
| v0.5.x | V4 model support, skills, memory, subagents, actionable error messages |
| v0.6 | Cost control (flash-first, auto-compaction, /pro one-shot, self-report escalation, cost badges). Shared prompt fragments. UI refactor (App.tsx split into hooks, slash.ts split into 13 modules) |
| v0.31 (current) | Branch + harvest removed. Leaner surface, fewer slash commands |
Explicit Non-Goals
- Multi-agent orchestration (subagents are cost-reduction, not coordination)
- RAG / vector retrieval
- Non-DeepSeek backends (OpenAI shim possible via --model but untested)
- Web UI / SaaS
- Silent cost escalation (every pro call is user-visible)
DeepSeek Reasonix Architecture Analysis · Generated by Hermes Agent · github.com/esengine/deepseek-reasonix