🧠 Multi-Agent Local Stack · Plug-and-Play
Your own Manus-like system,
fully under your control
Five AI agents running locally via Docker — working on the same project, sharing files and vector memory, reviewing each other's outputs, and iterating until quality passes a judge threshold. Perplexity researches the web. Vane researches private data. Manus builds. Claude critiques. Loop until score ≥ 8.
5
Agents
6
Services
3
Context layers
≥8
Quality gate
Part of AgentOS
This Ops Deck is the Intelligence + Execution layers of AgentOS — the full 7-layer AI operating system. View the complete system architecture:
AgentOS →
🔥
Key Insight
This stack gives you multi-agent collaboration, shared memory, critique loops, real API integration, and fully local orchestration — essentially a self-hosted Manus-like system you control completely. The loop is: Research → Build → Review → Revise (if score < 8) → Done. All agents share the same /data volume, Postgres state, and Qdrant vector memory.
Multi-agent collaboration Shared vector memory Critique loops Real API integration Local orchestration Self-healing revisions
1
Architecture Overview
User → Orchestrator → Workers → Shared Context
User
Orchestrator (Node.js) :3000 · routes + loops
Perplexity Research Agent
sonar-pro
·
Vane Private Research
local LLMs + SearxNG
·
Manus Build Agent
gpt-4o-mini
·
Judge (Claude) Critic Agent
claude-3-opus
Shared Context Layer /data (files) · Postgres (state) · Qdrant (vector memory)
Orchestration Loop
1
Perplexity Research Phase
POST http://perplexity_worker:8000/run — queries sonar-pro with the task goal, writes findings to /data/docs/research.md
4b
Vane Private Research Phase
If task involves sensitive data (financials, agreements, proprietary datasets), Vane runs local research via SearxNG + Ollama. Custom actions can call ML pipeline sidecars. No data leaves infrastructure.
2
Manus Build Phase
POST http://manus_worker:8000/run — reads research.md, builds implementation via gpt-4o-mini, writes to /data/code/implementation.py
3
Judge Review Phase
POST http://judge_worker:8000/review — Claude reviews implementation.py, scores 1–10, writes feedback to /data/docs/review.md
?
Gate Score < 8 → Revise Loop
If review score < 8: POST manus_worker:8000/revise — Manus reads review.md + current code, improves and rewrites. Judge re-reviews. Loop continues until score ≥ 8.
Done — quality gate passed
Score ≥ 8: orchestrator returns { done: true }. Final artifacts in /data/code/ and /data/docs/.
2
Agent Workers
5 agents · 3 primary + 2 extended
Perplexity
Research Agent
Deep web research using sonar-pro — the same model powering Perplexity's answer engine. Receives the task goal, queries the live web, synthesizes findings into structured research.md. This grounds Manus's implementation in real, current information rather than stale training data.
sonar-pro api.perplexity.ai PERPLEXITY_API_KEY
worker.py endpoint
@app.post("/run") def run(state): # POST to sonar-pro with goal text = query_perplexity(state["goal"]) write("/data/docs/research.md", text) return {"status": "done"}
Vane
Private Research Agent
Self-hosted AI answering engine for sensitive data analysis. Runs SearxNG meta-search locally, connects to any OpenAI-compatible LLM (Ollama, vLLM, llama.cpp). Custom ActionRegistry lets you register ML pipeline sidecars as tools the LLM calls autonomously. Handles business financials, contracts, agreements — nothing leaves your infrastructure.
local LLMs SearxNG VANE_API_URL
Key capabilities
# Private document analysis (financials, contracts) # Local web search via SearxNG # Custom tool actions (ML pipeline sidecar) # Shared Qdrant vector memory # SSE streaming to orchestrator # Speed / Balanced / Quality modes
Manus
Build Agent
Execution agent — reads research, builds implementation, revises based on judge feedback. Uses gpt-4o-mini for cost-effective code synthesis. Has two endpoints: /run for initial build and /revise for improvement loops — the core of the self-healing cycle.
gpt-4o-mini api.openai.com OPENAI_API_KEY
worker.py endpoints
@app.post("/run") # initial build @app.post("/revise") # improvement loop reads: research.md | review.md writes: /data/code/implementation.py
Judge (Claude)
Critic Agent
Quality gate — Claude reads the implementation and returns structured feedback + a numeric score. Uses claude-3-opus-20240229 for the best critique quality. Score < 8 triggers a Manus revision. Score ≥ 8 marks the task complete. Writes feedback to review.md so Manus can act on it.
claude-3-opus api.anthropic.com ANTHROPIC_API_KEY
worker.py endpoint
@app.post("/review") reads: implementation.py writes: /data/docs/review.md return {"score": 110} # gate at 8
Extended Agent Roles (plug in as additional workers)
Gemini
Multimodal Reasoning
Add as a second research worker for multimodal tasks — image analysis, long-context document reading (2M tokens), or cross-format synthesis. Replace or augment the Perplexity research step for document-heavy projects.
gemini-2.5-pro GOOGLE_API_KEY
Codex
Test Generation
Add between Manus /run and Judge /review — Codex writes a test suite against the implementation before Claude judges it. Tests failing = automatic revise loop before the score gate, saving Judge API calls.
gpt-5.4 OPENAI_API_KEY
3
Shared Context Layer
/data · Postgres · Qdrant
/data Volume
Shared Docker volume mounted at /app/data on every container. Any agent write is immediately visible to all other agents — no API calls needed for file exchange.
/data ├── /docs├── research.md ← Perplexity writes├── private-research.md ← Vane writes (sensitive data)└── review.md ← Judge writes ├── /code└── implementation.py ← Manus writes └── state.json ← Orchestrator tracks
Postgres (State)
Persistent relational state — task metadata, agent run history, loop counts, scores per iteration. Survives container restarts. Lets the orchestrator query "how many revise loops have run for task X?"
POSTGRES_USER: ai
POSTGRES_PASSWORD: ai
POSTGRES_DB: ai
Port: 5432 (internal)
Image: postgres:15
Qdrant (Vector Memory)
Long-term semantic memory — agents store key decisions, research findings, and past solutions as vectors. Future tasks retrieve relevant context via similarity search, so agents don't repeat prior mistakes or redo prior research.
Port: 6333 (exposed)
Image: qdrant/qdrant
Collection: "memory"
client.upsert() to store
client.search() to recall
Qdrant integration snippet
from qdrant_client import QdrantClient client = QdrantClient(host="qdrant", port=6333) client.upsert( collection_name="memory", points=[{ "id": 1, "vector": embedding, "payload": {"text": decision} }] )
4
Docker Compose Services
8 services · docker-compose up --build
Service Type Port Image / Build Env & Volumes
orchestrator Orchestrator 3000 ./orchestrator (Node.js) .env · ./data:/app/data
perplexity_worker Worker 8000 (internal) ./workers/perplexity (FastAPI) .env · ./data:/app/data
manus_worker Worker 8000 (internal) ./workers/manus (FastAPI) .env · ./data:/app/data
judge_worker Worker 8000 (internal) ./workers/judge (FastAPI) .env · ./data:/app/data
postgres Infra 5432 (internal) postgres:15 POSTGRES_USER/PASSWORD/DB
qdrant Infra 6333 qdrant/qdrant
docker-compose.yml (key structure)
version: "3.9" services: orchestrator: build: ./orchestrator ports: ["3000:3000"] depends_on: [postgres, qdrant] perplexity_worker: # Research build: ./workers/perplexity vane: # Private Research image: itzcrazykns1337/perplexica:latest searxng: # Meta-search backend for Vane image: searxng/searxng:latest manus_worker: # Build + Revise build: ./workers/manus judge_worker: # Critique (Claude) build: ./workers/judge postgres: # State DB image: postgres:15 qdrant: # Vector memory image: qdrant/qdrant ports: ["6333:6333"]
5
Environment Variables
Create .env in project root
Agent API Keys
OPENAI_API_KEYManus worker (gpt-4o-mini)
ANTHROPIC_API_KEYJudge worker (claude-3-opus)
GOOGLE_API_KEYGemini worker (optional)
PERPLEXITY_API_KEYPerplexity worker (sonar-pro)
VANE_API_URLVane search endpoint (http://vane:3000/api/search)
SEARXNG_URLSearxNG backend for Vane (http://searxng:8080)
Infrastructure
POSTGRES_USERai
POSTGRES_PASSWORDai (change in prod)
POSTGRES_DBai
QDRANT_HOSTqdrant (internal DNS)
6
Run It
Two commands to launch the full stack
Start all services
# Build images and start all 8 services docker-compose up --build # Wait for healthy status, then trigger a task curl -X POST http://localhost:3000/run \ -H "Content-Type: application/json" \ -d '{"goal": "build a trading bot"}'
Orchestrator
localhost:3000
POST /run to trigger
Qdrant UI
localhost:6333
Browse vector memory
Output files
./data/
research.md · implementation.py · review.md
7
Upgrade Path
Expand the stack after the baseline works
🤖
Gemini Worker — Multimodal Reasoning
Add a second research worker using gemini-2.5-pro for tasks requiring image analysis, PDF parsing, or 2M-token context windows. Route multimodal research tasks to Gemini, text-only to Perplexity. Orchestrator picks based on task type in state.json.
Redis Pub/Sub — Real-Time Agent Triggers
Replace sequential HTTP calls with Redis publish/subscribe events. Each agent subscribes to its trigger channel — faster, non-blocking, and allows multiple agents to work in parallel. Add redis:alpine to docker-compose.
📊
Next.js Dashboard — Visualize Agent Outputs
Add a Next.js frontend service that reads state.json + Postgres, showing live agent activity, loop counts, review scores per iteration, and final artifacts. Connects to the AI Shop dashboard you're reading now.
🔀
Task Graphs Instead of Linear Flows
Replace the linear Research→Build→Review chain with a DAG (directed acyclic graph) of tasks. Parallel sub-tasks — e.g., Perplexity researches API docs while Gemini reads existing codebase — then Manus merges both. Add a planner agent that generates the graph from the goal.
🔄
Git Auto-Commit Agent
Add a lightweight git worker that commits implementation.py after each successful review (score ≥ 8). Message includes agent, score, and iteration count: feat(manus): impl v3 · judge 9/10. Full audit trail of the build loop.
⚖️
Evaluation Agent Swarm — Multiple Judges
Run 3+ judge workers in parallel (Claude, Gemini, GPT-4o as critics), aggregate their scores. Average score ≥ 8 triggers completion. Eliminates single-judge bias and produces richer review.md with multi-perspective feedback for Manus to revise against.
🔐
Agent Roles — Planner / Executor / Verifier
Add a Planner agent (Claude) that breaks the goal into sub-tasks before Manus executes them. Add a Verifier agent (Codex) that runs tests on the implementation before it reaches the Judge. Three distinct roles prevent the Executor from trying to do everything.
🧠
Self-Healing Loops — Retry with Modified Prompts
If Manus fails to improve (score doesn't go up after 2 revisions), the orchestrator automatically modifies the build prompt — adds constraints from review.md, changes the framing, reduces scope. Prevents infinite loops on intractable tasks.