THYNKQ PRIVATE BRIEF
Local AI Architecture
One operator. Two machines. Router intelligence. Token-efficient delivery at production speed.
Rack Topology
i5
- Orchestrator + fast local assist
- Model: deepseek-coder-v2:16b
- Direct lane: 10.0.0.1
k8
- Heavy code worker + dashboard fleet node
- Model: qwen3-coder:30b
- Direct lane: 10.0.0.2 / 10.0.0.3
Unified Router
Task Intake -> Classifier -> Route Score -> Dispatch + Fallback
- Probe ethernet targets first, then WiFi fallback.
- Attach i5 preflight context before high-scope k8 code runs.
- Fall back to the secondary node on timeout, an unreachable target, or a run that produces no change.
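The probe order and fallback rules above can be sketched roughly as follows. This is a minimal illustration, not the actual thynkq-router code: the names Endpoint, tcp_probe, pick_endpoint, and dispatch are hypothetical, the port (11434) is an assumption because the brief does not name the serving stack, and the WiFi address in the usage note is a placeholder.

```python
import socket
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class Endpoint:
    host: str
    link: str  # "ethernet" or "wifi"


def tcp_probe(host: str, port: int = 11434, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection succeeds (port is an assumption)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_endpoint(
    endpoints: Sequence[Endpoint],
    probe: Callable[[str], bool] = tcp_probe,
) -> Optional[Endpoint]:
    """Probe ethernet targets first, then WiFi; return the first that answers."""
    # sorted() is stable, so ethernet entries keep their relative order.
    ordered = sorted(endpoints, key=lambda e: e.link != "ethernet")
    for ep in ordered:
        if probe(ep.host):
            return ep
    return None


def dispatch(
    task: str,
    nodes: Sequence[str],
    run: Callable[[str, str], bool],
) -> Optional[str]:
    """Try nodes in order; move on after a timeout, an unreachable node,
    or a run that reports no change. Returns the node that succeeded."""
    for node in nodes:
        try:
            changed = run(node, task)  # hypothetical: True if the run changed files
        except (TimeoutError, ConnectionError):
            continue  # timeout or unreachable: try the next node
        if changed:
            return node
    return None
```

For example, pick_endpoint([Endpoint("10.0.0.2", "ethernet"), Endpoint("192.168.1.20", "wifi")]) would try the direct lane first and only then the placeholder WiFi address. The probe and run callables are injected so the routing logic can be exercised without touching the network.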
End-to-End Task Flow
1. Rodolf defines objective
2. Claude or Codex frames execution prompt
3. thynkq-router selects best local node
4. run-task creates branch and executes
5. Commit and push branch to GitHub
6. Rodolf reviews and merges
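Steps 4 and 5 of the flow (create a branch, execute, commit, push) might look something like the sketch below. The real run-task tool is not shown in this brief, so everything here is an assumption: the function names, the "thynkq/" branch prefix, the use of the objective as the commit message, and a remote called origin. Only standard git subcommands are used.

```python
import re
import subprocess
from typing import Callable, List


def branch_name(objective: str, prefix: str = "thynkq") -> str:
    """Derive a branch name from the objective (prefix is hypothetical)."""
    slug = re.sub(r"[^a-z0-9]+", "-", objective.lower()).strip("-")
    return f"{prefix}/{slug[:40].rstrip('-')}"


def _git(argv: List[str]) -> None:
    """Default runner: execute the command and raise if it fails."""
    subprocess.run(argv, check=True)


def run_task(
    objective: str,
    execute: Callable[[str], None],
    sh: Callable[[List[str]], None] = _git,
) -> str:
    """Create a branch, run the task on it, then commit and push."""
    branch = branch_name(objective)
    sh(["git", "switch", "-c", branch])          # step 4: create branch
    execute(objective)                           # step 4: node performs the edit
    sh(["git", "add", "-A"])                     # step 5: stage all changes
    sh(["git", "commit", "-m", objective])       # step 5: commit
    sh(["git", "push", "-u", "origin", branch])  # step 5: push for review
    return branch
```

Passing a recording function as sh gives a dry run that lists the commands without touching the repository, which is one way to preview what a task would do before the review step.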
Weekly Benchmark Control Loop
- Weekly model benchmark job runs on k8
- Measure quality, t/s, wall time, and stability
- Update routing defaults from measured results
- Keep primary and backup model choices evidence-based
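One way the loop above could turn measurements into routing defaults is sketched here. The BenchResult fields mirror the four measured quantities, but the ranking rule (stable models only, then quality, then t/s, then wall time) is an assumption for illustration, not the brief's actual scoring, and any values passed in are supplied by the caller rather than taken from real benchmark runs.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Sequence


@dataclass(frozen=True)
class BenchResult:
    model: str
    quality: float       # assumed scale: 0.0 to 1.0, higher is better
    tokens_per_s: float  # t/s, higher is better
    wall_s: float        # wall time in seconds, lower is better
    stable: bool         # True if the run finished without crashes or stalls


def rank(results: Sequence[BenchResult]) -> List[BenchResult]:
    """Drop unstable models, then order by quality, speed, and wall time."""
    eligible = [r for r in results if r.stable]
    return sorted(
        eligible,
        key=lambda r: (-r.quality, -r.tokens_per_s, r.wall_s),
    )


def routing_defaults(results: Sequence[BenchResult]) -> Dict[str, Optional[str]]:
    """Pick primary and backup models from the ranked results."""
    ranked = rank(results)
    return {
        "primary": ranked[0].model if ranked else None,
        "backup": ranked[1].model if len(ranked) > 1 else None,
    }
```

Because unstable runs are filtered out before ranking, a fast but flaky model can never become the primary, which keeps the defaults tied to what was actually measured that week.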
Token Saving Strategy
- Local deterministic edits: 95%
- Local bounded implementation tasks: 84%
- Cross-file tasks with fallback: 68%
- Cloud escalation for edge cases: 22%