THYNKQ PRIVATE BRIEF

Local AI Architecture

One operator. Two machines. Router intelligence. Token-efficient delivery at production speed.

Rack Topology

Orchestrator + fast local assist
Model: deepseek-coder-v2:16b
Direct lane: 10.0.0.1

Heavy code worker + dashboard fleet node
Model: qwen3-coder:30b
Direct lane: 10.0.0.2 / 10.0.0.3

Unified Router

Task Intake

Classifier

Route Score

Dispatch + Fallback

Probe Ethernet targets first, then fall back to WiFi.
Attach i5 preflight context before high-scope k8 code runs.
Fall back to the secondary node when the primary times out, is unreachable, or produces no change.
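
The probe order and fallback rules above can be sketched in a few lines of Python. This is a minimal illustration, not the thynkq-router implementation: the names Node, NodeFailure, pick_address, and dispatch are hypothetical, and the probe and run callables stand in for whatever reachability check and task runner the router actually uses.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Node:
    """One machine in the rack (hypothetical structure)."""
    name: str
    ethernet_targets: Sequence[str]
    wifi_targets: Sequence[str] = ()


class NodeFailure(Exception):
    """Raised by a runner when the node drops out mid-task."""


def pick_address(node: Node, probe: Callable[[str], bool]) -> Optional[str]:
    """Return the first reachable address, trying Ethernet before WiFi."""
    for address in list(node.ethernet_targets) + list(node.wifi_targets):
        if probe(address):
            return address
    return None


def dispatch(
    task: str,
    nodes: Sequence[Node],
    probe: Callable[[str], bool],
    run: Callable[[str, str], bool],
) -> Optional[str]:
    """Try nodes in priority order; return the name of the node that succeeded.

    `run(address, task)` is assumed to return True when the task changed
    the working tree and False when it produced no change.
    """
    for node in nodes:
        address = pick_address(node, probe)
        if address is None:
            continue  # unreachable: fall back to the next node
        try:
            changed = run(address, task)
        except (NodeFailure, TimeoutError):
            continue  # timeout or failure: fall back to the next node
        if changed:
            return node.name
        # no change: treat as a miss and fall back to the next node
    return None
```

Passing the nodes as an ordered list keeps the primary and secondary choice outside the dispatch logic, so routing defaults can change without touching the fallback code.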

End-to-End Task Flow

1. Rodolf defines objective

2. Claude or Codex frames execution prompt

3. thynkq-router selects best local node

4. run-task creates branch and executes

5. Commit and push branch to GitHub

6. Rodolf reviews and merges
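
The branch, commit, and push steps in the flow above might look like the following sketch. The brief does not document how run-task works internally, so the helper names (branch_name, open_branch_cmd, publish_cmds), the task/ branch prefix, and the origin remote are all assumptions made for illustration. The functions only build git command lines; nothing is executed.

```python
import re
from typing import List


def branch_name(objective: str, prefix: str = "task") -> str:
    """Derive a branch name from the objective text (hypothetical convention)."""
    slug = re.sub(r"[^a-z0-9]+", "-", objective.lower()).strip("-")
    return f"{prefix}/{slug[:40].rstrip('-')}"


def open_branch_cmd(branch: str) -> List[str]:
    """Command that creates and switches to the task branch before the model runs."""
    return ["git", "checkout", "-b", branch]


def publish_cmds(branch: str, message: str) -> List[List[str]]:
    """Commands that commit the model's changes and push the branch for review."""
    return [
        ["git", "add", "-A"],
        ["git", "commit", "-m", message],
        ["git", "push", "-u", "origin", branch],  # "origin" is assumed to be the GitHub remote
    ]
```

Each command list could be handed to subprocess.run with check=True. Keeping command construction separate from execution makes the plan easy to log or dry-run before anything touches the repository.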

Weekly Benchmark Control Loop

Weekly model benchmark job runs on k8
Measure quality, tokens per second (t/s), wall time, and stability
Update routing defaults from measured results
Keep primary and backup model choices evidence-based
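
One way to turn the weekly measurements into routing defaults is sketched below. The BenchResult fields mirror the four measurements listed above, but the ranking rule (drop unstable runs, then prefer higher quality, then higher t/s, then shorter wall time) is an assumption, not the brief's stated policy, and choose_defaults is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class BenchResult:
    """One model's result from the weekly benchmark (hypothetical schema)."""
    model: str
    quality: float            # higher is better; the scale is an assumption
    tokens_per_second: float  # t/s, higher is better
    wall_time_s: float        # total wall time in seconds, lower is better
    stable: bool              # True if the run finished without crashes or stalls


def choose_defaults(
    results: Sequence[BenchResult],
) -> Tuple[Optional[str], Optional[str]]:
    """Return (primary, backup) model names chosen from measured results."""
    candidates = [r for r in results if r.stable]
    ranked = sorted(
        candidates,
        key=lambda r: (-r.quality, -r.tokens_per_second, r.wall_time_s),
    )
    primary = ranked[0].model if ranked else None
    backup = ranked[1].model if len(ranked) > 1 else None
    return primary, backup
```

Because the function is pure, the benchmark job can write its results to a file and the router can recompute its defaults from that file at any time.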

Token Saving Strategy

Local deterministic edits: 95%
Local bounded implementation tasks: 84%
Cross-file tasks with fallback: 68%
Cloud escalation for edge cases: 22%