The orchestrator-workers pattern combines elements of planning and parallelization: a central orchestrator LLM dynamically analyzes a task, breaks it into subtasks, delegates those subtasks to worker LLMs, and then synthesizes their results into a final output.
The key difference from prompt chaining is flexibility — the subtasks are not predefined by the developer but determined by the orchestrator based on the specific input. The key difference from parallelization is that the orchestrator acts as an intelligent coordinator, not just a simple fan-out/fan-in mechanism.
```
            ┌───────────────┐
            │   User Task   │
            └───────┬───────┘
                    │
            ┌───────▼───────┐
            │  Orchestrator │
            │   (Analyze &  │
            │   Delegate)   │
            └──┬────┬────┬──┘
               │    │    │
      ┌────────┘    │    └────────┐
      │             │             │
┌─────▼────┐  ┌─────▼────┐  ┌─────▼────┐
│ Worker 1 │  │ Worker 2 │  │ Worker N │
│ (subtask)│  │ (subtask)│  │ (subtask)│
└─────┬────┘  └─────┬────┘  └─────┬────┘
      │             │             │
      └────────┐    │    ┌────────┘
               │    │    │
            ┌──▼────▼────▼──┐
            │  Orchestrator │
            │  (Synthesize) │
            └───────────────┘
```
```python
# See code/orchestrator_workers.py for the full implementation

class Orchestrator:
    def __init__(self, llm, worker_pool=None):
        self.llm = llm
        self.worker_pool = worker_pool or []

    def run(self, task):
        # Step 1: Analyze and decompose the task
        decomposition = self.llm.generate(
            f"Analyze this task and break it into independent subtasks.\n"
            f"For each subtask, specify:\n"
            f"- description: what needs to be done\n"
            f"- worker_type: what kind of specialist is needed\n"
            f"- dependencies: which other subtasks must complete first\n\n"
            f"Task: {task}\n\n"
            f"Output as JSON."
        )
        subtasks = parse_subtasks(decomposition)

        # Step 2: Execute subtasks (respecting dependencies)
        results = {}
        for batch in topological_sort(subtasks):
            batch_results = execute_parallel(batch, self.worker_pool)
            results.update(batch_results)

        # Step 3: Synthesize results
        synthesis = self.llm.generate(
            f"Original task: {task}\n\n"
            f"Subtask results:\n{format_results(results)}\n\n"
            f"Synthesize these results into a comprehensive response."
        )
        return synthesis
```
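Helpers such as `topological_sort` and `execute_parallel` are left to the full implementation. A minimal sketch of the batching logic, assuming each parsed subtask is a dict with `id` and `dependencies` keys (the exact schema is an assumption here):

```python
def topological_sort(subtasks):
    """Yield batches of subtasks whose dependencies have all completed.

    Subtasks within a batch are independent of each other, so a batch
    can be handed to execute_parallel as a unit.
    """
    remaining = {s["id"]: s for s in subtasks}
    done = set()
    while remaining:
        # Everything whose dependencies are already satisfied runs now
        batch = [s for s in remaining.values()
                 if all(dep in done for dep in s.get("dependencies", []))]
        if not batch:
            raise ValueError("Cyclic or unsatisfiable dependencies")
        for s in batch:
            done.add(s["id"])
            del remaining[s["id"]]
        yield batch
```

If the orchestrator's decomposition contains a dependency cycle (which an LLM can produce), this surfaces it immediately rather than deadlocking.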
Workers can be specialized agents with different tools, system prompts, and even different models:
```python
# See code/orchestrator_workers.py for the full implementation

class Worker:
    def __init__(self, name, specialty, tools=None, model=None):
        self.name = name
        self.specialty = specialty
        self.tools = tools or []
        # Fall back to a shared default LLM client, not a bare string,
        # since execute() calls self.model.generate(...)
        self.model = model or default_model
        self.system_prompt = (
            f"You are a specialist in {specialty}. "
            f"Complete the assigned task thoroughly and precisely."
        )

    def execute(self, subtask, context=None):
        prompt = f"Task: {subtask}"
        if context:
            prompt = f"Context:\n{context}\n\n{prompt}"
        return self.model.generate(
            prompt,
            system=self.system_prompt,
            tools=self.tools,
        )

# Create a pool of specialized workers
workers = [
    Worker("researcher", "web research and data gathering",
           tools=[web_search, arxiv_search]),
    Worker("analyst", "data analysis and statistics",
           tools=[python_executor, data_tools]),
    Worker("writer", "clear technical writing",
           tools=[]),
    Worker("reviewer", "code review and quality assurance",
           tools=[linter, test_runner]),
]
```
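Routing a subtask to the right specialist can be as simple as matching on the `worker_type` the orchestrator assigned during decomposition. A hypothetical dispatch helper; the name-matching rule and the first-worker fallback are assumptions, not part of the full implementation:

```python
def assign_worker(subtask, workers):
    """Pick the worker whose name matches the subtask's worker_type.

    Falls back to the first worker in the pool when the orchestrator
    requests a specialty that no worker covers.
    """
    for worker in workers:
        if worker.name == subtask.get("worker_type"):
            return worker
    return workers[0]
```

In production you would likely want fuzzier matching (or ask the orchestrator to choose only from the known worker names in its decomposition prompt).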
A coding orchestrator-workers system that handles multi-file refactoring:
```python
# See code/orchestrator_workers.py for the full implementation

def code_refactoring_orchestrator(llm, codebase, refactoring_request):
    """Orchestrate a multi-file code refactoring."""
    # Orchestrator analyzes what needs to change
    analysis = llm.generate(
        f"Analyze this refactoring request and determine which files "
        f"need to change and what changes are needed in each:\n\n"
        f"Request: {refactoring_request}\n"
        f"Files in codebase: {list(codebase.keys())}\n\n"
        f"For each file that needs changes, describe the change."
    )
    file_changes = parse_file_changes(analysis)

    # Workers apply the changes (shown sequentially here; the per-file
    # edits are independent and could run in parallel)
    results = {}
    for filename, change_description in file_changes.items():
        worker_result = llm.generate(
            f"Apply this change to the file:\n"
            f"Change: {change_description}\n"
            f"Current file content:\n{codebase[filename]}\n\n"
            f"Return the updated file content."
        )
        results[filename] = worker_result

    # Orchestrator verifies consistency
    verification = llm.generate(
        f"Verify that these file changes are consistent with each other "
        f"and correctly implement the refactoring:\n"
        f"Request: {refactoring_request}\n"
        f"Changes: {results}\n\n"
        f"Are there any inconsistencies or missing changes?"
    )
    return results, verification
```
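The `parse_file_changes` helper is left to the full implementation. A minimal sketch, assuming the analysis prompt is tightened to ask for a JSON object mapping filenames to change descriptions (the JSON-extraction regex and the silent-empty fallback are illustrative choices):

```python
import json
import re

def parse_file_changes(analysis):
    """Extract a {filename: change_description} dict from the model's reply.

    Assumes the model was asked to answer with a JSON object; the regex
    tolerates surrounding prose or a fenced json block.
    """
    match = re.search(r"\{.*\}", analysis, re.DOTALL)
    if not match:
        return {}
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError:
        return {}
```

Returning an empty dict on a malformed reply lets the orchestrator detect "no changes parsed" and re-prompt rather than crash mid-refactor.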
The orchestrator assigns all subtasks upfront and waits for all results:
```python
subtasks = orchestrator.decompose(task)
results = parallel_execute(subtasks)
final = orchestrator.synthesize(results)
```
The orchestrator assigns subtasks one at a time, adapting based on results:
```python
while not done:
    next_subtask = orchestrator.decide_next(task, completed_results)
    result = worker.execute(next_subtask)
    completed_results.append(result)
    done = orchestrator.check_completion(completed_results)
```
The orchestrator reviews worker outputs and sends back for revision if needed:
```python
for subtask in subtasks:
    result = worker.execute(subtask)
    while not orchestrator.is_satisfactory(result):
        feedback = orchestrator.critique(result)
        result = worker.revise(subtask, result, feedback)
```
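A review loop like this can run forever if the worker never satisfies the orchestrator, so a practical version caps the number of revisions. A runnable sketch with illustrative stand-in objects (the method names mirror the pseudocode above; the cap value is an assumption):

```python
def review_loop(orchestrator, worker, subtask, max_revisions=3):
    """Execute a subtask, revising up to max_revisions times on critique."""
    result = worker.execute(subtask)
    for _ in range(max_revisions):
        if orchestrator.is_satisfactory(result):
            break
        feedback = orchestrator.critique(result)
        result = worker.revise(subtask, result, feedback)
    return result  # best effort after the revision budget is spent
```

Capping revisions bounds cost and latency; the orchestrator can still flag an unsatisfactory final result during synthesis.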
| Aspect | Orchestrator-Workers | Prompt Chaining | Parallelization |
|---|---|---|---|
| Subtask definition | Dynamic (LLM) | Static (developer) | Static (developer) |
| Number of subtasks | Variable | Fixed | Fixed |
| Coordination | Intelligent | Sequential | Simple fan-out |
| Best for | Complex, unpredictable tasks | Known workflows | Independent subtasks |