🏗️ DeepSeek Reasonix: Architecture Analysis

  • ~350 source files
  • 4 architectural pillars
  • 99.82% real-world cache hit rate
  • 1,052 lines in loop.ts
  • 180 lines in the code prompt
  • ~250 tests

What is Reasonix?

A DeepSeek-native, cache-first coding agent for the terminal. Unlike general-purpose agents, every architectural decision is justified by a DeepSeek-specific behavior or economic property. The product north star: "a coding agent that stays cheap enough to leave on."

Key differentiator: cache stability is an invariant the loop is designed around, not a feature you toggle on. The entire codebase is tuned to DeepSeek's byte-stable automatic prefix-caching mechanic, achieving a real-world cache hit rate of 99.82% (435M input tokens, ~$12 instead of ~$61).

npm: reasonix · Node ≥22 · MIT License · TypeScript 5.6+ · Commander.js + Ink 5 TUI

flowchart TB
  subgraph "User Interface"
    CLI["CLI Entry\nsrc/cli/index.ts"]
    TUI["Ink TUI\nsrc/cli/ui/App.tsx\n(~1984 LOC)"]
    DASH["Dashboard SPA\nReact + Express\nport :3100"]
  end

  subgraph "Core Engine"
    LOOP["CacheFirstLoop\nsrc/loop.ts\n(1052 LOC)"]
    CM["ContextManager\nsrc/context-manager.ts\n(345 LOC)"]
    REPAIR["ToolCallRepair\nsrc/repair/"]
    PROMPT["Prompt Builder\nsrc/code/prompt.ts\n+ prompt-fragments.ts"]
  end

  subgraph "Tool System"
    TOOLS["ToolRegistry\nsrc/tools.ts"]
    FS["Filesystem\nread/write/edit/search"]
    SHELL["Shell\nrun_command + jobs"]
    SUB["Subagents\nspawn_subagent"]
    MCP_BRIDGE["MCP Bridge\nstdio + SSE"]
    WEB["Web\nsearch + fetch"]
  end

  subgraph "Infrastructure"
    CLIENT["DeepSeek Client\nsrc/client.ts\nfetch + SSE streaming"]
    SESSION["Session Store\nJSONL persistence"]
    MEMORY["Memory Store\nUser / Project / Runtime"]
    TOKENIZER["DeepSeek V3 Tokenizer\nPorted from Python"]
  end

  CLI --> TUI
  TUI --> LOOP
  LOOP --> CM
  LOOP --> REPAIR
  LOOP --> PROMPT
  LOOP --> TOOLS
  TOOLS --> FS
  TOOLS --> SHELL
  TOOLS --> SUB
  TOOLS --> MCP_BRIDGE
  TOOLS --> WEB
  LOOP --> CLIENT
  LOOP --> SESSION
  LOOP --> MEMORY
  LOOP --> TOKENIZER
  DASH --> LOOP

📦 Technology Stack & Module Layout

Core Stack

  • Language: TypeScript 5.6+, ES2022, ESM ("type": "module")
  • CLI Framework: Commander.js + Ink 5 (React 18 for terminal UI)
  • Testing: Vitest 2.x, ~250 test files
  • Lint/Format: Biome 1.9 (2-space, double quotes, always semicolons, 100 width)
  • Build: tsup (bundler), tsx (dev runner)
  • Desktop: Tauri (Rust shell, macOS/Windows/Linux)
  • Dashboard: React + Vite + Express (port 3100)

Module Map

| Directory | Purpose | Key Files |
| --- | --- | --- |
| src/cli/ | CLI Entry + TUI | index.ts, commands/code.tsx, commands/chat.tsx, ui/App.tsx |
| src/tools/ | Tool Definitions | filesystem.ts, shell.ts, subagent.ts, plan.ts, web.ts, memory.ts |
| src/loop/ | Agent Loop Engine | dispatch.ts, messages.ts, streaming.ts, thinking.ts, healing.ts |
| src/repair/ | Tool-Call Repair | flatten.ts, scavenge.ts, storm.ts, truncation.ts |
| src/mcp/ | MCP Protocol | client.ts, stdio.ts, sse.ts, registry.ts, spec.ts |
| src/code/ | Edit Engine | edit-blocks.ts, prompt.ts, setup.ts, diff-preview.ts |
| src/memory/ | Persistence | session.ts, runtime.ts (3-region model), project.ts, user.ts |
| src/core/ | Kernel | events.ts, reducers.ts, eventize.ts, inflight.ts, pause-gate.ts |
| src/ports/ | Interfaces | model-client.ts, tool-host.ts, event-sink.ts, memory-store.ts |
| src/transcript/ | Logging | log.ts, diff.ts, replay.ts |

🏛️ The Four Architectural Pillars

Pillar 1: Cache-First Loop src/loop.ts

DeepSeek bills cached input at ~10% of the miss rate. The loop partitions context into three regions to maximize prefix-cache stability:

flowchart TB
  subgraph I["IMMUTABLE PREFIX: Fixed for session"]
    SYS["System Prompt\n(codeSystemBase)"]
    TOOLS["Tool Specifications\n(ToolRegistry.specs)"]
    FEW["Few-Shot Examples"]
  end

  subgraph L["APPEND-ONLY LOG: Grows monotonically"]
    direction TB
    A1["assistant_1"]
    T1["tool_result_1"]
    A2["assistant_2"]
    T2["tool_result_2"]
    A1 --> T1 --> A2 --> T2
  end

  subgraph S["VOLATILE SCRATCH: Reset each turn"]
    R1["R1 Reasoning\n(reasoning_content)"]
    PLAN["Transient Plan State"]
    THOUGHTS["Working Thoughts"]
  end

  I --> L --> S

Invariants: (1) The prefix is computed once per session, hashed, and pinned. (2) Log entries are serialized in append order, with no rewrites. (3) Scratch is distilled via Pillar 2 before being folded into the log.
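
The three regions and the first two invariants can be sketched as a small class. This is a minimal illustration, not the code in src/memory/runtime.ts: the `Message` shape, the `ThreeRegionContext` class, and its method names are hypothetical, and the FNV-1a `fingerprint` is a stand-in for whatever hash the real loop uses.

```typescript
// Hypothetical message shape; the real types in src/memory/runtime.ts may differ.
interface Message {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// 32-bit FNV-1a, used only as a stand-in fingerprint for this sketch.
function fingerprint(text: string): string {
  let h = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16).padStart(8, "0");
}

class ThreeRegionContext {
  private readonly prefix: readonly Message[];
  private readonly prefixHash: string;
  private readonly log: Message[] = [];
  private scratch: Message[] = [];

  constructor(prefix: Message[]) {
    // Invariant 1: the prefix is copied once, hashed, and never reassigned.
    this.prefix = prefix.map((m) => ({ ...m }));
    this.prefixHash = fingerprint(JSON.stringify(this.prefix));
  }

  // Invariant 2: the log only grows; there is no method that rewrites an entry.
  append(msg: Message): void {
    this.log.push({ ...msg });
  }

  setScratch(msgs: Message[]): void {
    this.scratch = [...msgs];
  }

  resetScratch(): void {
    this.scratch = [];
  }

  // Request order is prefix, then log, then scratch, so earlier bytes stay stable.
  build(): Message[] {
    if (fingerprint(JSON.stringify(this.prefix)) !== this.prefixHash) {
      throw new Error("immutable prefix changed mid-session");
    }
    return [...this.prefix, ...this.log, ...this.scratch];
  }
}
```

Because volatile content always sits at the tail, resetting the scratch region never changes the bytes that precede it, which is what keeps a provider-side prefix cache warm.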

Metric: prompt_cache_hit_tokens / (hit + miss) exposed per-turn and aggregated per-session. Visible in TUI top-bar cache cell.
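
The metric itself is a plain ratio. The sketch below assumes the two token counters arrive as `prompt_cache_hit_tokens` and `prompt_cache_miss_tokens` on each response's usage object; the function names are hypothetical, and summing tokens before dividing is one reasonable way to aggregate per session, not necessarily the one in telemetry/stats.ts.

```typescript
// Subset of the usage object that matters for the cache metric.
interface Usage {
  prompt_cache_hit_tokens: number;
  prompt_cache_miss_tokens: number;
}

// Per-turn ratio: hit / (hit + miss). Returns 0 when no prompt tokens were billed.
function cacheHitRatio(u: Usage): number {
  const total = u.prompt_cache_hit_tokens + u.prompt_cache_miss_tokens;
  return total === 0 ? 0 : u.prompt_cache_hit_tokens / total;
}

// Session aggregate (assumption): sum tokens first, then divide, so that
// short turns do not carry the same weight as long ones.
function sessionHitRatio(turns: Usage[]): number {
  const sum = turns.reduce<Usage>(
    (acc, u) => ({
      prompt_cache_hit_tokens: acc.prompt_cache_hit_tokens + u.prompt_cache_hit_tokens,
      prompt_cache_miss_tokens: acc.prompt_cache_miss_tokens + u.prompt_cache_miss_tokens,
    }),
    { prompt_cache_hit_tokens: 0, prompt_cache_miss_tokens: 0 },
  );
  return cacheHitRatio(sum);
}
```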

Pillar 2: Tool-Call Repair src/repair/

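Four-pass pipeline addressing DeepSeek-specific failure modes. Flatten is applied to tool schemas at registration time; scavenge, truncation repair, and storm breaking run on each model response: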

flowchart LR
  INPUT["Model Response\n(tool_calls + reasoning_content)"]
  
  subgraph "Pass 1: Flatten"
    FLAT["Schema Flatten\n>10 params or depth>2\n→ dot-notation"]
  end
  
  subgraph "Pass 2: Scavenge"
    SCAV["Scavenge\nRegex + JSON parser\nSweeps reasoning_content\nfor forgotten tool calls"]
  end
  
  subgraph "Pass 3: Truncation Repair"
    TRUNC["Truncation Repair\nDetect unbalanced JSON\nClose braces or\nrequest continuation"]
  end
  
  subgraph "Pass 4: Storm"
    STORM["Storm Breaker\nIdentical (tool, args)\nwithin sliding window\n→ suppress + reflect"]
  end

  INPUT --> FLAT --> SCAV --> TRUNC --> STORM
  STORM --> DISPATCH["dispatchToolCallsChunked()"]

| Pass | File | Problem Solved |
| --- | --- | --- |
| 1. Flatten | flatten.ts | DeepSeek drops args when a schema has >10 leaf params or depth >2; schemas are auto-flattened to dot-notation and re-nested at dispatch |
| 2. Scavenge | scavenge.ts | Tool-call JSON emitted inside <think>, missing from final message |
| 3. Truncation | truncation.ts | Truncated JSON due to max_tokens hit mid-structure |
| 4. Storm | storm.ts | Same tool called repeatedly with identical args (call-storm) |
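
The flatten pass is easiest to see as a round trip on argument objects. The following is a minimal sketch under stated assumptions: `flattenArgs` and `nestArgs` are hypothetical names, arrays are treated as leaf values, and keys are assumed not to contain dots. The real flatten.ts operates on schemas and may handle more cases.

```typescript
type Json = string | number | boolean | null | Json[] | JsonObject;
interface JsonObject {
  [key: string]: Json;
}

function isPlainObject(v: Json): v is JsonObject {
  return typeof v === "object" && v !== null && !Array.isArray(v);
}

// Flatten nested objects into dot-notation keys; arrays stay as leaf values.
function flattenArgs(obj: JsonObject, prefix = ""): JsonObject {
  const out: JsonObject = {};
  for (const [key, value] of Object.entries(obj)) {
    const path = prefix ? `${prefix}.${key}` : key;
    if (isPlainObject(value)) {
      Object.assign(out, flattenArgs(value, path));
    } else {
      out[path] = value;
    }
  }
  return out;
}

// Inverse step: rebuild the nested shape before the tool function runs.
function nestArgs(flat: JsonObject): JsonObject {
  const root: JsonObject = {};
  for (const [path, value] of Object.entries(flat)) {
    const parts = path.split(".");
    const leaf = parts.pop() ?? path;
    let node = root;
    for (const part of parts) {
      const next = node[part];
      if (next !== undefined && isPlainObject(next)) {
        node = next;
      } else {
        const created: JsonObject = {};
        node[part] = created;
        node = created;
      }
    }
    node[leaf] = value;
  }
  return root;
}
```

For example, `{ file: { path: "a.ts", range: { start: 1 } } }` becomes `{ "file.path": "a.ts", "file.range.start": 1 }` for the model and is re-nested before dispatch. Empty nested objects are dropped by this sketch.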

Pillar 3: Cost Control v0.6

Tiered Model Presets

| Preset | Model | Effort | Relative Cost |
| --- | --- | --- | --- |
| flash | v4-flash | max | 1× |
| auto (default) | v4-flash → v4-pro on hard turns | max | 1–3× |
| pro | v4-pro | max | ~12× |

Cost Mechanisms

  • Turn-end auto-compaction: Tool results exceeding 3000 tokens are shrunk when the turn ends
  • Proactive 40% threshold: A context ratio above 40% triggers a pre-emptive shrink, ahead of the 80% emergency threshold
  • Model self-report escalation (<<<NEEDS_PRO>>>): The model decides when a task exceeds its current tier
  • Auxiliary calls hard-code flash: Summarization, subagent spawns, and truncation repair all use v4-flash regardless of the user preset
  • Parallel tool dispatch: Read-only tools race concurrently (up to 3) to reduce turn latency

Budget Cap: Soft USD cap that warns at 80% and refuses the next turn at 100%. Per-turn cost is colored green (<$0.05), yellow ($0.05–0.20), or red (≥$0.20).
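
The budget gate and cost badge reduce to two threshold checks. A minimal sketch using the numbers stated above; the names `budgetState` and `turnCostColor` are hypothetical, and treating a non-positive cap as "no cap" is an assumption.

```typescript
type BudgetState = "ok" | "warn" | "refuse";

// Soft cap: warn from 80% of the cap, refuse the next turn from 100%.
function budgetState(spentUsd: number, capUsd: number): BudgetState {
  if (capUsd <= 0) return "ok"; // assumption: a non-positive cap means no cap is set
  const ratio = spentUsd / capUsd;
  if (ratio >= 1) return "refuse";
  if (ratio >= 0.8) return "warn";
  return "ok";
}

type CostColor = "green" | "yellow" | "red";

// Per-turn badge: green below $0.05, yellow from $0.05 up to $0.20, red from $0.20.
function turnCostColor(usd: number): CostColor {
  if (usd < 0.05) return "green";
  if (usd < 0.2) return "yellow";
  return "red";
}
```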

Pillar 4: MCP Protocol Integration src/mcp/

Model Context Protocol support over stdio, SSE, and Streamable HTTP transports. Tools are registered via the registry at startup; third-party MCP tools default to parallelSafe: false.

  • Transports: stdio (subprocess), SSE (HTTP streaming), Streamable HTTP
  • Registry: Marketplace overlay with categorized server list
  • Lifecycle: Connect → initialize → list_tools → ready. Reconnect on crash
  • Truncation: Results capped at DEFAULT_MAX_RESULT_TOKENS with save-to-disk option
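
On the wire, the lifecycle above corresponds to a short JSON-RPC 2.0 exchange. The sketch below lists the client-side messages in order, using the method names from the MCP specification (`initialize`, `notifications/initialized`, `tools/list`) and newline-delimited framing for stdio. The helper names, the hard-coded ids, and the protocol version string are illustrative choices, not taken from src/mcp/.

```typescript
interface JsonRpcRequest {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

interface JsonRpcNotification {
  jsonrpc: "2.0";
  method: string;
  params?: Record<string, unknown>;
}

type JsonRpcMessage = JsonRpcRequest | JsonRpcNotification;

// Client messages in send order. In a real client, the notification is sent
// only after the server has answered the initialize request.
function handshakeMessages(clientName: string, clientVersion: string): JsonRpcMessage[] {
  return [
    {
      jsonrpc: "2.0",
      id: 1,
      method: "initialize",
      params: {
        protocolVersion: "2024-11-05", // one published MCP revision, chosen for illustration
        capabilities: {},
        clientInfo: { name: clientName, version: clientVersion },
      },
    },
    { jsonrpc: "2.0", method: "notifications/initialized" },
    { jsonrpc: "2.0", id: 2, method: "tools/list" },
  ];
}

// stdio transport: one JSON message per line, with no embedded newlines.
function encodeStdioFrame(msg: JsonRpcMessage): string {
  return `${JSON.stringify(msg)}\n`;
}
```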

📝 Prompt Construction Lineage

flowchart TB
  subgraph "Layer 1: Shared Fragments"
    TUI["TUI_FORMATTING_RULES\n• Table formatting\n• Code blocks\n• No ASCII art"]
    NEG["NEGATIVE_CLAIM_RULE\n• Cite or shut up\n• search_content FIRST"]
    ESC["escalationContract(modelId)\n• ESCALATION_CONTRACT\n• <<<NEEDS_PRO>>> marker"]
  end

  subgraph "Layer 2: Identity + Rules"
    ID["Identity Block\n• You are Reasonix Code\n• Don't infer from workspace"]
    CITE["Citation Rules\n• File paths with line ranges\n• Broken paths = red strikethrough"]
    AUDIT["Audit Rails (6 rules)\n• Auto-preview limits\n• Flag→consumer trace\n• No fabricated %"]
  end

  subgraph "Layer 3: Tool Guidance"
    PICK["Tool Selection\n• submit_plan vs ask_choice\n• todo_write"]
    PLAN_MODE["Plan Mode\n• Bounced writes\n• submit_plan required"]
    SUBAGENT["Subagent Delegation\n• Default: don't delegate\n• Skill index with tags"]
  end

  subgraph "Layer 4: Edit Protocol"
    EDIT["SEARCH/REPLACE Format\n• Read before edit enforced\n• Exact whitespace match\n• Edit gate routes review/auto"]
    WHEN["When to Edit\n• Only on explicit change ask\n• Analyze → tools + prose"]
  end

  subgraph "Layer 5: Memory Stack"
    UM["User Memory\n~/.reasonix/memory/"]
    PM["Project Memory\nREASONIX.md"]
    SM["Skill Memos\n@reasonix/core-utils"]
  end

  subgraph "Output"
    FINAL["CODE_SYSTEM_PROMPT\nBuilt per-session by\ncodeSystemBase(modelId)\nwith real model name"]
  end

  TUI --> ID
  NEG --> CITE
  ESC --> PICK
  ID --> PICK
  CITE --> AUDIT
  AUDIT --> PLAN_MODE
  PICK --> SUBAGENT
  PLAN_MODE --> EDIT
  SUBAGENT --> WHEN
  EDIT --> FINAL
  WHEN --> FINAL
  UM --> FINAL
  PM --> FINAL
  SM --> FINAL

Prompt Assembly Code Path

codeSystemBase(modelId)
  └─ CODE_SYSTEM_TEMPLATE (180 lines, src/code/prompt.ts)
       ├─ TUI_FORMATTING_RULES (from prompt-fragments.ts)
       ├─ NEGATIVE_CLAIM_RULE (from prompt-fragments.ts)
       ├─ escalationContract(modelId) (from prompt-fragments.ts)
       ├─ Identity block (lines 13-17)
       ├─ Citation rules (lines 19-21)
       ├─ Audit rails (lines 23-31)
       ├─ Tool selection guidance (lines 34-38)
       ├─ Plan mode constraints (lines 40-42)
       ├─ Subagent delegation policy (lines 44-48)
       ├─ Edit protocol + SEARCH/REPLACE format (lines 50-80+)
       └─ __ESCALATION_CONTRACT__ (replaced at runtime with model-specific contract)

applyMemoryStack() overlays:
  β”œβ”€ User memory files (HIGH PRIORITY constraints block)
  β”œβ”€ Project memory (REASONIX.md or CLAUDE.md)
  └─ Skill memos (@reasonix/core-utils compaction module)
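
The assembly path above amounts to a placeholder substitution followed by memory overlays. The sketch below keeps the names `codeSystemBase`, `escalationContract`, `applyMemoryStack`, and `__ESCALATION_CONTRACT__` from the tree; every string body, the `MemoryStack` shape, and the ordering of overlays are placeholders and assumptions, since the real template text is not reproduced here.

```typescript
// Placeholder text only; the real fragments live in prompt-fragments.ts.
const TUI_FORMATTING_RULES = "[TUI formatting rules]";
const NEGATIVE_CLAIM_RULE = "[negative-claim rule]";
const ESCALATION_PLACEHOLDER = "__ESCALATION_CONTRACT__";

const CODE_SYSTEM_TEMPLATE = [
  "[identity, citation rules, audit rails, tool guidance, edit protocol]",
  TUI_FORMATTING_RULES,
  NEGATIVE_CLAIM_RULE,
  ESCALATION_PLACEHOLDER,
].join("\n\n");

function escalationContract(modelId: string): string {
  return `[escalation contract for ${modelId}, including the <<<NEEDS_PRO>>> marker]`;
}

function codeSystemBase(modelId: string): string {
  const contract = escalationContract(modelId);
  // A replacer function avoids `$`-pattern expansion in the replacement text.
  return CODE_SYSTEM_TEMPLATE.replace(ESCALATION_PLACEHOLDER, () => contract);
}

// Hypothetical shape for the three memory layers.
interface MemoryStack {
  user: string[];
  project?: string;
  skillMemos: string[];
}

function applyMemoryStack(base: string, memory: MemoryStack): string {
  const blocks = [base];
  if (memory.user.length > 0) {
    blocks.push(`HIGH PRIORITY constraints:\n${memory.user.join("\n")}`);
  }
  if (memory.project) blocks.push(memory.project);
  blocks.push(...memory.skillMemos);
  return blocks.join("\n\n");
}
```

Both functions are pure: the same model id and memory files yield a byte-identical prompt, which is the property the immutable prefix depends on.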

🔄 Agent Loop: CacheFirstLoop

flowchart TB
  START(["User submits prompt"])
  
  subgraph "Turn Setup"
    RESET["resetStorm()\nFresh intent = clean slate"]
    LOAD["loadSessionMessages()\nRestore from JSONL"]
    HEAL["healLoadedMessages()\nFix tool call pairing"]
    CHECK["Budget check\nRefuse if over cap"]
  end
  
  subgraph "Iteration Loop"
    BUILD["Build messages:\nprefix + log + scratch"]
    STREAM["streamModelResponse()\nSSE from DeepSeek API"]
    THINK["Strip/retain reasoning\nthinking mode handling"]
    REPAIR["ToolCallRepair.process()\nscavenge→truncation→storm"]
  end
  
  subgraph "Dispatch Phase"
    DISPATCH["dispatchToolCallsChunked()\nParallel-safe chunking"]
    RUN["runOne(call)\nToolRegistry.dispatch()"]
    APPEND["appendAndPersist(msg)\nLog + session JSONL"]
  end
  
  subgraph "Post-Turn"
    COMPACT["decideAfterUsage()\nFold if ratio > 75%"]
    BUDGET["updateBudget()\nWarn at 80%, stop at 100%"]
  end

  START --> RESET --> LOAD --> HEAL --> CHECK
  CHECK --> BUILD --> STREAM --> THINK --> REPAIR
  REPAIR -->|has tool calls| DISPATCH
  REPAIR -->|no tool calls| COMPACT
  DISPATCH --> RUN --> APPEND
  APPEND -->|more iter cycles| BUILD
  APPEND -->|turn done| COMPACT
  COMPACT -->|fold| HEAL
  COMPACT -->|carry on| DONE(["Await next user input"])
  COMPACT -->|exit summary| FORCE(["forceSummaryAfterIterLimit()"])

Key Loop States

| State | File | Description |
| --- | --- | --- |
| ImmutablePrefix | memory/runtime.ts | System prompt + tool specs + few-shots, pinned for session |
| AppendOnlyLog | memory/runtime.ts | Monotonically growing conversation that preserves the cache prefix |
| VolatileScratch | memory/runtime.ts | R1 reasoning + transient state, reset each turn |
| ReadTracker | tools/read-tracker.ts | Files the model has read; gates edit_file. Cleared on fold |
| SessionStats | telemetry/stats.ts | Per-session token usage + cost + cache hit ratio |
| InflightSet | core/inflight.ts | Tracking in-progress tool calls for TUI spinner display |

Mid-Turn Steering

Users can inject guidance during a running turn. The wrapper constant preserves context:

MID_TURN_STEER_WRAPPER = "[Mid-turn steer queued by the user. 
Do not treat this as a new task; use it only as additional 
guidance for the current task after completing the current step.]"

⚑ Parallel Tool Dispatch

flowchart TB
  CALLS["Repaired Tool Calls\n(repairedCalls[])"]
  
  subgraph "Chunking Logic (dispatchToolCallsChunked)"
    CHECK{"REASONIX_TOOL_DISPATCH\n=== 'serial'?"}
    SERIAL["Serial Mode\nOne call per chunk"]
    GROUP["Group parallel-safe calls\nup to REASONIX_PARALLEL_MAX (default 3)\nStop at first non-parallel-safe"]
    BARRIER["Serial Barrier\nNon-parallel-safe call\nruns alone"]
  end
  
  subgraph "Execution"
    RACE["Promise.allSettled()\nRace chunk in parallel"]
    ORDER["Yield results\nin DECLARED ORDER\n(not completion order)"]
    APPEND["Append tool messages\nto AppendOnlyLog"]
  end

  CALLS --> CHECK
  CHECK -->|yes| SERIAL
  CHECK -->|no| GROUP
  GROUP --> BARRIER
  SERIAL --> RACE
  BARRIER --> RACE
  RACE --> ORDER --> APPEND
  APPEND -->|more calls| CHECK
  APPEND -->|done| NEXT(["Return to agent loop"])

Parallel-Safe Tools (Built-In)

| Tool | Why Safe |
| --- | --- |
| read_file, list_directory, directory_tree | Read-only filesystem ops |
| search_files, search_content, get_file_info | Read-only search ops |
| web_search, web_fetch | External, no filesystem side effects |
| recall_memory, semantic_search | Read-only memory access |
| run_skill, spawn_subagent | Isolated child loops |
| job_output, list_jobs | In-memory job queries |

Not parallel-safe: edit_file, write_file, run_command. These mutate state and must execute serially for read-after-write ordering.

Configuration: REASONIX_PARALLEL_MAX (default 3, hard cap 16) · REASONIX_TOOL_DISPATCH=serial (escape hatch)
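
The chunking rules can be sketched in a few lines. This is an illustration under assumptions, not the body of dispatchToolCallsChunked(): `chunkCalls`, `runChunk`, and the `ToolCall` shape are hypothetical, and errors are rendered as strings only to keep the example short.

```typescript
interface ToolCall {
  id: string;
  name: string;
  args: unknown;
}

// Group consecutive parallel-safe calls (up to `max`); any other call is a
// serial barrier that runs alone. `serial` mimics REASONIX_TOOL_DISPATCH=serial.
function chunkCalls(
  calls: ToolCall[],
  isParallelSafe: (name: string) => boolean,
  max = 3,
  serial = false,
): ToolCall[][] {
  const limit = Math.min(Math.max(1, max), 16); // hard cap of 16
  const chunks: ToolCall[][] = [];
  let current: ToolCall[] = [];
  for (const call of calls) {
    if (serial || !isParallelSafe(call.name)) {
      if (current.length > 0) {
        chunks.push(current);
        current = [];
      }
      chunks.push([call]);
      continue;
    }
    current.push(call);
    if (current.length === limit) {
      chunks.push(current);
      current = [];
    }
  }
  if (current.length > 0) chunks.push(current);
  return chunks;
}

// Promise.allSettled returns results in input order, so tool messages can be
// appended in declared order regardless of which call finishes first.
async function runChunk(
  chunk: ToolCall[],
  runOne: (call: ToolCall) => Promise<string>,
): Promise<string[]> {
  const settled = await Promise.allSettled(chunk.map((c) => runOne(c)));
  return settled.map((r) => (r.status === "fulfilled" ? r.value : `error: ${String(r.reason)}`));
}
```

Yielding in declared order matters for the cache as well as for correctness: the log must be byte-identical across replays, so completion order cannot be allowed to leak into it.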

🔧 Tool-Call Repair Pipeline

Pass Order: scavenge → truncation → storm (flatten runs at construction)

flowchart TB
  RAW["Raw Model Response\n• tool_calls[]\n• reasoning_content\n• content"]

  subgraph "Phase 0: Registration Time"
    ANALYZE["analyzeSchema()\nCheck depth > 2\nor leaf params > 10"]
    FLAT_STORE["Store flatSchema\non InternalTool"]
  end

  subgraph "Phase 1: Scavenge"
    COMBINE["Combine channels\nreasoning + content"]
    SCAN["Regex + JSON parse\nFind tool calls in\n<think> + DSML markup"]
    DEDUP["Deduplicate\n(by name+args signature)\nMax 4 scavenged calls"]
  end

  subgraph "Phase 2: Truncation Repair"
    CHECK_JSON["Check each call's\nargument JSON"]
    FIX["Close braces/strings\nFallback: leave original\n(rejects with 'invalid JSON')"]
  end

  subgraph "Phase 3: Storm Breaker"
    WINDOW["Sliding window\n(default 6 calls)"]
    DETECT["Detect identical\n(tool_name, args) tuples"]
    ACT["3+ repeats in window\n→ suppress call\n→ inject reflection"]
  end

  RAW --> ANALYZE
  RAW --> COMBINE --> SCAN --> DEDUP
  DEDUP --> CHECK_JSON --> FIX
  FIX --> WINDOW --> DETECT --> ACT
  ACT --> REPAIRED(["Repaired calls → dispatchToolCallsChunked()"])

Scavenge Deep-Dive src/repair/scavenge.ts

Searches BOTH channels:

  • reasoning_content: Where R1 models leak JSON tool calls inside <think> blocks
  • content: Where DSML markup appears in regular text turns

Channels are joined with newline and scanned independently. Dedup prefers first-seen (declared calls take priority over scavenged). Only novel (name, args) signatures are added. Default max: 4 scavenged calls.
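
A simplified version of the scan-and-dedup step is sketched below. It assumes leaked calls look like `{"name": ..., "arguments": {...}}` objects embedded in free text; that shape, the brace-matching scanner, and all function names are assumptions for illustration. DSML markup parsing is not covered.

```typescript
interface ToolCall {
  name: string;
  arguments: string; // JSON-encoded arguments
}

// Normalize arguments via parse + stringify so whitespace does not defeat dedup.
function signature(call: ToolCall): string {
  let args = call.arguments;
  try {
    args = JSON.stringify(JSON.parse(call.arguments));
  } catch {
    // keep the raw string when it is not valid JSON
  }
  return `${call.name}\u0000${args}`;
}

// Index of the brace closing the object that opens at `start`, or -1.
function matchingBrace(text: string, start: number): number {
  let depth = 0;
  let inString = false;
  let escaped = false;
  for (let i = start; i < text.length; i++) {
    const ch = text[i];
    if (inString) {
      if (escaped) escaped = false;
      else if (ch === "\\") escaped = true;
      else if (ch === '"') inString = false;
      continue;
    }
    if (ch === '"') inString = true;
    else if (ch === "{") depth++;
    else if (ch === "}" && --depth === 0) return i;
  }
  return -1;
}

function extractJsonObjects(text: string): unknown[] {
  const found: unknown[] = [];
  let i = 0;
  while (i < text.length) {
    const end = text[i] === "{" ? matchingBrace(text, i) : -1;
    if (end === -1) {
      i++;
      continue;
    }
    try {
      found.push(JSON.parse(text.slice(i, end + 1)));
      i = end + 1;
    } catch {
      i++;
    }
  }
  return found;
}

// Declared calls come first; only novel signatures are added, up to `max`.
function scavenge(reasoning: string, content: string, declared: ToolCall[], max = 4): ToolCall[] {
  const seen = new Set(declared.map(signature));
  const out = [...declared];
  let added = 0;
  for (const obj of extractJsonObjects(`${reasoning}\n${content}`)) {
    if (added >= max) break;
    if (typeof obj !== "object" || obj === null) continue;
    const { name, arguments: args } = obj as { name?: unknown; arguments?: unknown };
    if (typeof name !== "string" || typeof args !== "object" || args === null) continue;
    const call: ToolCall = { name, arguments: JSON.stringify(args) };
    const sig = signature(call);
    if (seen.has(sig)) continue;
    seen.add(sig);
    out.push(call);
    added++;
  }
  return out;
}
```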

Storm Breaker src/repair/storm.ts

Prevents infinite loops where the model calls the same tool with identical args repeatedly. Configurable:

  • Window size: 6 calls (default)
  • Threshold: 3 identical calls in window → suppress
  • Exemptions: stormExempt: true for cheap state-inspection tools
  • Reset: Mutating calls (writes, edits) clear the window, so post-edit verification reads are not flagged as false positives
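
The four settings above fit in a small sliding-window class. A minimal sketch, assuming the third identical call inside the window is the first one suppressed and that mutating calls clear the window without being counted themselves; the `StormBreaker` class and its `check` signature are hypothetical, not the API of storm.ts.

```typescript
interface StormOptions {
  mutating?: boolean;
  stormExempt?: boolean;
}

class StormBreaker {
  private window: string[] = [];
  private readonly windowSize: number;
  private readonly threshold: number;

  constructor(windowSize = 6, threshold = 3) {
    this.windowSize = windowSize;
    this.threshold = threshold;
  }

  // Called at the start of a new user turn and after mutating calls.
  reset(): void {
    this.window = [];
  }

  // Returns true when the call should be suppressed.
  check(name: string, argsJson: string, opts: StormOptions = {}): boolean {
    if (opts.stormExempt) return false;
    if (opts.mutating) {
      // Assumption: a write or edit changes the world, so earlier reads may
      // legitimately be repeated afterwards.
      this.reset();
      return false;
    }
    const sig = `${name}\u0000${argsJson}`;
    this.window.push(sig);
    if (this.window.length > this.windowSize) this.window.shift();
    const repeats = this.window.filter((s) => s === sig).length;
    return repeats >= this.threshold;
  }
}
```

When a call is suppressed, the loop described above injects a reflection message in its place; that part is omitted here.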

📐 Context Manager: Auto-Compaction

flowchart TB
  USAGE["After turn response\nUsage.promptTokens"]

  subgraph "Threshold Decision (decideAfterUsage)"
    FST{"> 80%?\nFORCE_SUMMARY_THRESHOLD"}
    AGGR{"> 78%?\nHISTORY_FOLD_AGGRESSIVE"}
    NORM{"> 75%?\nHISTORY_FOLD_THRESHOLD"}
    PRE{"> 90% at turn start?\nTURN_START_FOLD_THRESHOLD"}
  end

  subgraph "Fold Execution"
    PINS["Extract pinned skills\n(collectPinnedSkills)"]
    SUMMARIZE["Summarize head messages\nv4-flash + effort=high\n15s timeout"]
    REWRITE["rewriteSession()\nReplace head with summary\n+ pinned skill bodies\n+ recent tail"]
  end

  FST -->|yes| EXIT(["Exit with summary\nforceSummaryAfterIterLimit()"])
  FST -->|no| AGGR
  AGGR -->|yes| AGG_FOLD(["Aggressive fold\ntailBudget = 10% ctxMax"])
  AGGR -->|no| NORM
  NORM -->|yes| NORM_FOLD(["Normal fold\ntailBudget = 20% ctxMax"])
  NORM -->|no| CARRY(["Carry on"])
  PRE -->|yes| PRE_FOLD(["Pre-iter fold\nbefore turn starts"])

  AGG_FOLD --> PINS
  NORM_FOLD --> PINS
  PRE_FOLD --> PINS
  PINS --> SUMMARIZE --> REWRITE
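
The post-turn branch of this flowchart is a cascade of ratio checks. The sketch below uses the threshold values given in this section (0.80, 0.78, 0.75, tail budgets of 10% and 20%, minimum savings of 30%); the `Decision` type, the function signatures, and the use of strict `>` comparisons are assumptions for illustration.

```typescript
const FORCE_SUMMARY_THRESHOLD = 0.8;
const HISTORY_FOLD_AGGRESSIVE_THRESHOLD = 0.78;
const HISTORY_FOLD_THRESHOLD = 0.75;
const HISTORY_FOLD_MIN_SAVINGS_FRACTION = 0.3;

type Decision =
  | { kind: "exit-with-summary" }
  | { kind: "fold"; aggressive: boolean; tailBudgetTokens: number }
  | { kind: "carry-on" };

// Checks run from the most to the least severe threshold.
function decideAfterUsage(promptTokens: number, ctxMax: number): Decision {
  const ratio = promptTokens / ctxMax;
  if (ratio > FORCE_SUMMARY_THRESHOLD) return { kind: "exit-with-summary" };
  if (ratio > HISTORY_FOLD_AGGRESSIVE_THRESHOLD) {
    return { kind: "fold", aggressive: true, tailBudgetTokens: Math.floor(ctxMax * 0.1) };
  }
  if (ratio > HISTORY_FOLD_THRESHOLD) {
    return { kind: "fold", aggressive: false, tailBudgetTokens: Math.floor(ctxMax * 0.2) };
  }
  return { kind: "carry-on" };
}

// A fold is skipped when the summary would not shrink the head enough.
function foldIsWorthwhile(headTokens: number, summaryTokens: number): boolean {
  if (headTokens <= 0) return false;
  return (headTokens - summaryTokens) / headTokens >= HISTORY_FOLD_MIN_SAVINGS_FRACTION;
}
```

The minimum-savings check exists because a fold rewrites the log and therefore invalidates the cached prefix for everything after the summary; it only pays off when the head shrinks substantially.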

Compaction Thresholds

| Constant | Value | Trigger |
| --- | --- | --- |
| FORCE_SUMMARY_THRESHOLD | 0.80 | Exit turn with summary (defense in depth) |
| HISTORY_FOLD_AGGRESSIVE_THRESHOLD | 0.78 | Fold harder; tail budget 10% |
| HISTORY_FOLD_THRESHOLD | 0.75 | Normal fold; tail budget 20% |
| TURN_START_FOLD_THRESHOLD | 0.90 | Pre-turn fold; covers session restore, huge paste |
| HISTORY_FOLD_MIN_SAVINGS_FRACTION | 0.30 | Skip fold if head wouldn't shrink by ≥30% |

Skill Pin Preservation: Active skill memos wrapped in <skill-pin name="...">...</skill-pin> are lifted from the head before summarization and re-appended verbatim after. This ensures skill procedures survive context folds.
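
Lifting and re-appending pins can be done with a single regular expression over the head text. A minimal sketch: the name `collectPinnedSkills` comes from the flowchart above, but its signature, the `SkillPin` shape, and `rebuildHead` are hypothetical, and the head is treated as one string rather than a message array.

```typescript
interface SkillPin {
  name: string;
  body: string;
  raw: string; // the full <skill-pin ...>...</skill-pin> element, kept verbatim
}

// Non-greedy body match so adjacent pins are captured separately.
const SKILL_PIN_RE = /<skill-pin name="([^"]*)">([\s\S]*?)<\/skill-pin>/g;

// Remove pins from the head so the summarizer never sees (or paraphrases) them.
function collectPinnedSkills(head: string): { pins: SkillPin[]; rest: string } {
  const pins: SkillPin[] = [];
  const rest = head.replace(SKILL_PIN_RE, (raw: string, name: string, body: string) => {
    pins.push({ name, body, raw });
    return "";
  });
  return { pins, rest };
}

// After summarization, pins are re-attached byte-for-byte.
function rebuildHead(summary: string, pins: SkillPin[]): string {
  return [summary, ...pins.map((p) => p.raw)].join("\n\n");
}
```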

🔌 MCP Protocol Architecture

flowchart TB
  subgraph "Transport Layer"
    STDIO["StdioTransport\nSubprocess spawn\nJSON-RPC over stdin/stdout"]
    SSE["SseTransport\nHTTP GET /sse event stream\nPOST for client→server"]
    STREAM_HTTP["StreamableHttpTransport\nStreaming HTTP with\nupgrade path"]
  end

  subgraph "Client"
    CONNECT["McpClient.connect()\nHandshake + initialize"]
    TOOLS["listTools()\n→ ToolSpec[] registered\nin ToolRegistry"]
    CALL["callTool(name, args)\n→ result string\ncapped at maxResultTokens"]
  end

  subgraph "Registry"
    CATALOG["catalog.ts\nMCP server directory"]
    MARKETPLACE["marketplace-overlay/\nCurated server list\nwith i18n (zh-CN.json)"]
    SPEC["spec.ts\nMCP schema validation"]
  end

  STDIO --> CONNECT
  SSE --> CONNECT
  STREAM_HTTP --> CONNECT
  CONNECT --> TOOLS --> CALL
  CATALOG --> MARKETPLACE
  SPEC --> CONNECT

📈 Design Evolution

| Version | Milestone |
| --- | --- |
| v0.0.x | Pillar 1 end-to-end, repair pipeline complete, Ink TUI scaffold |
| v0.1 | τ-bench numbers published, streaming polish, transcript replay |
| v0.3 | MCP client (stdio + SSE), session persistence |
| v0.4.x | reasonix code with SEARCH/REPLACE edits, review/auto gate, background jobs, hooks |
| v0.5.x | V4 model support, skills, memory, subagents, actionable error messages |
| v0.6 | Cost control (flash-first, auto-compaction, /pro one-shot, self-report escalation, cost badges). Shared prompt fragments. UI refactor (App.tsx split into hooks, slash.ts split into 13 modules) |
| v0.31 (current) | Branch + harvest removed. Leaner surface, fewer slash commands |

Explicit Non-Goals

  • Multi-agent orchestration (subagents are cost-reduction, not coordination)
  • RAG / vector retrieval
  • Non-DeepSeek backends (OpenAI shim possible via --model but untested)
  • Web UI / SaaS
  • Silent cost escalation (every pro call is user-visible)

DeepSeek Reasonix Architecture Analysis · Generated by Hermes Agent · github.com/esengine/deepseek-reasonix