Open Agent SDK expands a single agent into an orchestrated multi-role collaboration system through sub-agents, a Task state machine, and Team/Mailbox communication. This design addresses complex task decomposition, progress tracking, and cross-agent coordination. Keywords: Swift Agent, multi-agent orchestration, Task state machine.
Technical Specification Snapshot
| Parameter | Description |
|---|---|
| Primary Language | Swift |
| Core Protocols | SubAgentSpawner, ToolProtocol |
| Collaboration Mechanisms | TaskStore, TeamStore, MailboxStore |
| Repository | terryso/open-agent-sdk-swift |
| Article Focus | Multi-agent collaboration and task orchestration |
| Built-in Sub-Agents | Explore, Plan |
| Core Dependencies | LLMProvider, LLMClient, Swift Actor concurrency model |
Open Agent SDK breaks multi-agent collaboration into three clear layers
A single agent is good at local decision-making, but when it faces long, multi-step tasks such as code exploration, solution design, implementation, and validation, the context grows rapidly. Open Agent SDK does not solve this by stacking more prompts. Instead, it distributes execution across multiple capability units.
This mechanism is divided into three layers: sub-agents decide who executes, Task tracks how far the work has progressed, and Team plus Mailbox define how members collaborate. This layered design makes runtime orchestration more stable and easier to control in terms of cost and permissions.
Sub-agent creation uses the SubAgentSpawner protocol to achieve decoupling
SubAgentSpawner is defined at the type level instead of being embedded directly into Core. This is a classic dependency inversion pattern: the tool layer depends on a protocol rather than a concrete implementation, and the actual spawning logic is injected through ToolContext.agentSpawner.
```swift
public protocol SubAgentSpawner: Sendable {
    func spawn(
        prompt: String,
        model: String?,
        systemPrompt: String?,
        allowedTools: [String]?,
        maxTurns: Int?
    ) async -> SubAgentResult
}
```
This protocol defines the entry point for sub-agent creation, allowing the tool layer to delegate work safely without knowing the specific agent implementation.
An extended version of spawn adds parameters such as disallowedTools, mcpServers, teamName, and mode. Its default implementation stays backward-compatible with the basic version, so existing conforming types do not need to be modified.
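One way to get this kind of backward compatibility in Swift is a protocol extension that forwards the extended call to the basic one. The following is a minimal sketch, not the SDK's actual code: the fields of `SubAgentResult`, the subset of extra parameters, and the `EchoSpawner` test double are all assumptions made for illustration.

```swift
// Hypothetical result type; the SDK's real SubAgentResult may carry other fields.
struct SubAgentResult: Sendable {
    let text: String
    let isError: Bool
}

protocol SubAgentSpawner: Sendable {
    // Basic entry point, as shown in the article.
    func spawn(prompt: String, model: String?, systemPrompt: String?,
               allowedTools: [String]?, maxTurns: Int?) async -> SubAgentResult

    // Extended entry point (assumed signature, only two of the extra parameters).
    func spawn(prompt: String, model: String?, systemPrompt: String?,
               allowedTools: [String]?, maxTurns: Int?,
               disallowedTools: [String]?, teamName: String?) async -> SubAgentResult
}

extension SubAgentSpawner {
    // Default: drop the extra parameters and forward to the basic version,
    // so types that only implement the basic method still conform.
    func spawn(prompt: String, model: String?, systemPrompt: String?,
               allowedTools: [String]?, maxTurns: Int?,
               disallowedTools: [String]?, teamName: String?) async -> SubAgentResult {
        await spawn(prompt: prompt, model: model, systemPrompt: systemPrompt,
                    allowedTools: allowedTools, maxTurns: maxTurns)
    }
}

// Hypothetical test double that implements only the basic requirement.
struct EchoSpawner: SubAgentSpawner {
    func spawn(prompt: String, model: String?, systemPrompt: String?,
               allowedTools: [String]?, maxTurns: Int?) async -> SubAgentResult {
        SubAgentResult(text: "echo: \(prompt)", isError: false)
    }
}

let spawner: any SubAgentSpawner = EchoSpawner()
let result = await spawner.spawn(prompt: "list files", model: nil, systemPrompt: nil,
                                 allowedTools: nil, maxTurns: 3,
                                 disallowedTools: ["Bash"], teamName: nil)
print(result.text)  // echo: list files
```

The trade-off of this approach is that a conformer relying on the default silently ignores the extra parameters, so a spawner that needs to honor them has to implement the extended method itself.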
The core value of DefaultSubAgentSpawner is safe delegation, not simple instance creation
The real key is not creating an agent. It is restricting what that agent can do. DefaultSubAgentSpawner inherits the parent agent’s toolset by default, but it removes AgentTool first so that sub-agents cannot keep spawning recursively.
```swift
var subTools = parentTools.filter { $0.name != "Agent" } // Filter out AgentTool to prevent recursion
if let allowed = allowedTools, !allowed.isEmpty {
    let allowedSet = Set(allowed)
    subTools = subTools.filter { allowedSet.contains($0.name) } // Keep only allowed tools
}
```
This logic handles both recursion prevention and tool whitelist filtering, making it the core implementation of the multi-agent security boundary.
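To see the two rules interact, the same filtering can be wrapped in a standalone function and run against a small toolset. This is a sketch under assumptions: `ToolStub`, `toolsForSubAgent`, and the tool names other than "Agent" are hypothetical stand-ins, not SDK types.

```swift
// Hypothetical stand-in for a tool; only the name matters for filtering.
struct ToolStub {
    let name: String
}

// Returns the tools a sub-agent may use.
// "Agent" is always removed; a non-empty allowedTools list then acts as a whitelist.
func toolsForSubAgent(parentTools: [ToolStub], allowedTools: [String]?) -> [ToolStub] {
    var subTools = parentTools.filter { $0.name != "Agent" }
    if let allowed = allowedTools, !allowed.isEmpty {
        let allowedSet = Set(allowed)
        subTools = subTools.filter { allowedSet.contains($0.name) }
    }
    return subTools
}

let parent = ["Agent", "Read", "Grep", "Bash"].map { ToolStub(name: $0) }

let inherited = toolsForSubAgent(parentTools: parent, allowedTools: nil)
let restricted = toolsForSubAgent(parentTools: parent, allowedTools: ["Read", "Agent"])

print(inherited.map { $0.name })   // ["Read", "Grep", "Bash"]
print(restricted.map { $0.name })  // ["Read"]
```

Two consequences follow from the order of the filters: listing "Agent" in the whitelist does not bring it back, and a nil or empty whitelist means the sub-agent inherits everything except "Agent".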
The system then builds AgentOptions from the parent model or override parameters, awaits the sub-agent's completion, and wraps the text output into SubAgentResult before returning it to the parent agent. The parent's tool call does not return until the sub-agent finishes, so the current implementation favors blocking collaboration rather than a fully asynchronous pipeline.
AgentTool is the multi-agent entry point exposed to the LLM
From the model’s perspective, it cannot see SubAgentSpawner. It only sees a standard tool: AgentTool. The model uses structured parameters to specify the task prompt, sub-agent type, maximum turns, and tool scope.
```json
{
  "prompt": "Explore the project structure and find all Swift source files",
  "description": "Explore codebase",
  "subagent_type": "Explore"
}
```
This call triggers a dedicated exploration-oriented sub-agent instead of forcing the parent agent to continue exploring the repository inside a long context window.
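On the receiving side, a tool typically decodes this JSON into a typed input before acting on it. The sketch below shows one way to do that with `Codable`. `AgentToolInput` is a hypothetical type, and the `max_turns` key is an assumed name; only `prompt`, `description`, and `subagent_type` come from the example above.

```swift
import Foundation

// Hypothetical input model for the AgentTool call.
struct AgentToolInput: Decodable {
    let prompt: String
    let description: String
    let subagentType: String?   // optional: absent keys decode to nil
    let maxTurns: Int?          // assumed key name "max_turns"

    enum CodingKeys: String, CodingKey {
        case prompt, description
        case subagentType = "subagent_type"
        case maxTurns = "max_turns"
    }
}

let json = """
{
  "prompt": "Explore the project structure and find all Swift source files",
  "description": "Explore codebase",
  "subagent_type": "Explore"
}
"""

let input = try JSONDecoder().decode(AgentToolInput.self, from: Data(json.utf8))
print(input.subagentType ?? "default")  // Explore
```

Keeping the optional fields as Swift optionals lets the tool fall back to parent defaults when the model omits them.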
The SDK also includes two built-in sub-agent types: Explore and Plan. The former focuses on code retrieval and file understanding, while the latter focuses on architecture design and implementation planning. In practice, both are predefined combinations of system prompts and tool permission templates.
The Task system provides stable execution tracking through Actors and a state machine
Once tasks start to split, messages alone cannot reveal global progress. The purpose of the Task system is to provide the LLM with a task ledger that it can query, update, and filter.
TaskStore uses a Swift Actor to hold state, which means concurrent access is serialized by default without explicit locking. This property is critical when multiple agents create or update tasks at the same time.
```swift
public actor TaskStore {
    private var tasks: [String: Task] = [:]
    private var taskCounter: Int = 0

    public func create(subject: String, owner: String? = nil) -> Task {
        taskCounter += 1
        let id = "task_\(taskCounter)" // Generate a unique task ID
        let task = Task(id: id, subject: subject, status: .pending, owner: owner)
        tasks[id] = task
        return task
    }
}
```
This code shows the minimal closed loop of task storage: ID generation, in-memory retention, and concurrency safety.
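The concurrency claim can be checked with a small experiment: many child tasks call `create` at once, and every returned ID should still be unique. This is a self-contained sketch, not SDK code. `TaskItem` is a hypothetical stand-in for the SDK's Task model, renamed so it does not shadow Swift's own concurrency `Task`, and `createConcurrently` is an illustrative helper.

```swift
enum TaskStatus: Sendable { case pending, inProgress, completed, failed, cancelled }

// Hypothetical stand-in for the SDK's Task model.
struct TaskItem: Sendable {
    let id: String
    let subject: String
    var status: TaskStatus
    var owner: String?
}

actor TaskStore {
    private var tasks: [String: TaskItem] = [:]
    private var taskCounter = 0

    func create(subject: String, owner: String? = nil) -> TaskItem {
        taskCounter += 1  // actor isolation makes this increment race-free
        let task = TaskItem(id: "task_\(taskCounter)", subject: subject,
                            status: .pending, owner: owner)
        tasks[task.id] = task
        return task
    }
}

// Fires n concurrent create calls and collects the distinct IDs that come back.
func createConcurrently(_ n: Int) async -> Set<String> {
    let store = TaskStore()
    return await withTaskGroup(of: String.self, returning: Set<String>.self) { group in
        for i in 1...n {
            group.addTask { await store.create(subject: "job \(i)").id }
        }
        var seen = Set<String>()
        for await id in group { seen.insert(id) }
        return seen
    }
}

let ids = await createConcurrently(100)
print(ids.count)  // 100
```

If the counter lived in a plain class without synchronization, the same experiment could produce duplicate IDs. The actor removes that risk without any explicit lock.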
The Task state machine enforces irreversible terminal states
Task states include pending, inProgress, completed, failed, and cancelled. The last three are terminal states. Once a task enters one of them, it cannot transition back, which makes the task system auditable.
```swift
private func isValidTransition(from: TaskStatus, to: TaskStatus) -> Bool {
    switch from {
    case .pending, .inProgress:
        return true // Allow transitions only during active phases
    case .completed, .failed, .cancelled:
        return false // Disallow changes after terminal states
    }
}
```
This validation prevents the LLM from changing a completed task back to in-progress during retries, which protects the integrity of orchestration history.
In addition, TaskStatus.parse() accepts both inProgress and in_progress. Tolerating both spellings absorbs variance in model output and avoids tool invocation failures caused purely by naming style.
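Parsing and transition validation usually run back to back when the model submits a status update. The sketch below combines them. It assumes `parse` takes a raw string and returns an optional, and `applyUpdate` is a hypothetical helper; the SDK's real signatures may differ.

```swift
enum TaskStatus: Equatable {
    case pending, inProgress, completed, failed, cancelled

    // Accepts the camelCase and snake_case spellings of "in progress".
    static func parse(_ raw: String) -> TaskStatus? {
        switch raw {
        case "pending": return .pending
        case "inProgress", "in_progress": return .inProgress
        case "completed": return .completed
        case "failed": return .failed
        case "cancelled": return .cancelled
        default: return nil
        }
    }
}

// Same rule as in the article: only active states may change.
func isValidTransition(from: TaskStatus, to: TaskStatus) -> Bool {
    switch from {
    case .pending, .inProgress: return true
    case .completed, .failed, .cancelled: return false
    }
}

// Hypothetical helper: returns the new status, or nil if the update is rejected.
func applyUpdate(_ raw: String, to current: TaskStatus) -> TaskStatus? {
    guard let next = TaskStatus.parse(raw),
          isValidTransition(from: current, to: next) else { return nil }
    return next
}

print(applyUpdate("in_progress", to: .pending) == .inProgress)  // true
print(applyUpdate("inProgress", to: .completed) == nil)         // true
```

The second call is rejected because `completed` is terminal, regardless of how the requested status is spelled.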
Team and Mailbox upgrade multi-agent work from one-off delegation to sustained collaboration
If sub-agents are well suited for short-term outsourced work, the Team system is better suited for long-running multi-role collaboration. TeamStore is also built on an Actor and is responsible for creating teams, maintaining membership, and marking disbanded status.
The Team design stays intentionally minimal: roles include only leader and member, and states include only active and disbanded. It does not introduce a complex organizational hierarchy. Instead, it prioritizes clear orchestration semantics.
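A store with this shape can be sketched in a few dozen lines. The example below is an illustration built only from the roles and states named above. The type names `TeamRole`, `TeamStatus`, and `Team`, and the method names `create`, `addMember`, and `disband`, are assumptions rather than the SDK's actual API.

```swift
enum TeamRole: Sendable { case leader, member }
enum TeamStatus: Sendable { case active, disbanded }

struct Team: Sendable {
    let name: String
    var members: [String: TeamRole]  // agent name -> role
    var status: TeamStatus
}

actor TeamStore {
    private var teams: [String: Team] = [:]

    // Creates a team with its leader; returns nil if the name is already taken.
    func create(name: String, leader: String) -> Team? {
        guard teams[name] == nil else { return nil }
        let team = Team(name: name, members: [leader: .leader], status: .active)
        teams[name] = team
        return team
    }

    // Adds a member; fails for unknown or disbanded teams.
    func addMember(_ agent: String, to name: String) -> Bool {
        guard var team = teams[name], team.status == .active else { return false }
        if team.members[agent] == nil { team.members[agent] = .member }
        teams[name] = team
        return true
    }

    // Marks the team as disbanded but keeps the record for later inspection.
    func disband(_ name: String) -> Bool {
        guard var team = teams[name], team.status == .active else { return false }
        team.status = .disbanded
        teams[name] = team
        return true
    }
}

let teams = TeamStore()
_ = await teams.create(name: "feature-x", leader: "planner")
let joined = await teams.addMember("coder", to: "feature-x")
_ = await teams.disband("feature-x")
let late = await teams.addMember("tester", to: "feature-x")
print(joined, late)  // true false
```

Marking a team as disbanded instead of deleting it keeps the membership record available, which matches the article's emphasis on traceable history.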
Mailbox uses a pull-based message model to trade real-time behavior for implementation stability
Communication between agents is not real-time push. It uses mailbox-style delivery instead. The sender writes to the recipient’s mailbox, and the recipient actively reads and clears messages during its own turn.
```swift
public actor MailboxStore {
    private var mailboxes: [String: [AgentMessage]] = [:]

    public func send(from: String, to: String, content: String) {
        let message = AgentMessage(from: from, to: to, content: content)
        mailboxes[to, default: []].append(message) // Write to the recipient mailbox
    }
}
```
This code captures the essence of Mailbox: weak real-time guarantees, strong decoupling, and better support for offline agents.
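The snippet above covers only the write side. The read-and-clear side described earlier could look like the following sketch. The `drain(for:)` method name and the fields of `AgentMessage` are assumptions; the SDK may name and shape them differently.

```swift
// Hypothetical message shape with only the three fields used by send.
struct AgentMessage: Sendable {
    let from: String
    let to: String
    let content: String
}

actor MailboxStore {
    private var mailboxes: [String: [AgentMessage]] = [:]

    func send(from: String, to: String, content: String) {
        let message = AgentMessage(from: from, to: to, content: content)
        mailboxes[to, default: []].append(message)
    }

    // Returns everything queued for the agent and empties its mailbox in one step.
    func drain(for agent: String) -> [AgentMessage] {
        mailboxes.removeValue(forKey: agent) ?? []
    }
}

let mailbox = MailboxStore()
await mailbox.send(from: "explorer", to: "planner", content: "Found the Swift sources")
await mailbox.send(from: "coder", to: "planner", content: "Waiting for the plan")

let first = await mailbox.drain(for: "planner")
let second = await mailbox.drain(for: "planner")
print(first.count, second.count)  // 2 0
```

Because reading and clearing happen inside a single actor method, a message sent at the same moment lands either in this batch or in the next one, never in neither.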
SendMessageTool also performs three levels of validation: whether a MailboxStore exists in the environment, whether the sender belongs to the team, and whether the recipient is a member of the same team. This prevents uncontrolled cross-team messaging and preserves collaboration boundaries.
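The three checks can be expressed as an ordered chain of guards. The sketch below is a simplified model, not the tool's real code: `SendCheck` and `checkSend` are hypothetical names, and team membership is reduced to a set of agent names.

```swift
enum SendCheck: Equatable {
    case ok
    case noMailbox
    case senderNotInTeam
    case recipientNotInTeam
}

// Runs the three validations in order and reports the first one that fails.
func checkSend(mailboxAvailable: Bool, teamMembers: Set<String>,
               sender: String, recipient: String) -> SendCheck {
    guard mailboxAvailable else { return .noMailbox }
    guard teamMembers.contains(sender) else { return .senderNotInTeam }
    guard teamMembers.contains(recipient) else { return .recipientNotInTeam }
    return .ok
}

let members: Set<String> = ["explorer", "planner", "coder"]

print(checkSend(mailboxAvailable: true, teamMembers: members,
                sender: "explorer", recipient: "planner"))   // ok
print(checkSend(mailboxAvailable: true, teamMembers: members,
                sender: "explorer", recipient: "outsider"))  // recipientNotInTeam
```

Returning a distinct case per failure gives the LLM a specific reason it can act on, for example by picking a different recipient.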
Three orchestration patterns cover most agent workflow scenarios
The first pattern is parent agent plus parallel sub-agents. The parent agent creates multiple tasks, delegates module analysis to separate Explore sub-agents, and then consolidates the results. This pattern works well for code review, repository exploration, and module inventory.
The second pattern is team collaboration plus message passing. A typical flow is that an explorer investigates first, a planner produces the approach, and a coder implements it. Messages flow through the mailbox, while tasks accumulate in TaskStore, making responsibilities clearer.
```swift
let tools = [
    createAgentTool(),
    createTaskCreateTool(),
    createTaskUpdateTool(),
    createSendMessageTool()
] // Combine delegation, tracking, and communication tools
```
This tool combination is the minimum needed for collaboration: agents must be able to delegate, record progress, and communicate in order to form a true workflow loop.
The third pattern is a work queue. The parent agent submits multiple pending tasks at once, and sub-agents claim them, update status, and submit outputs on their own. This resembles lightweight distributed scheduling, although the scheduling rules are still decided by the LLM.
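The claiming step is where a work queue can go wrong, because two workers must never pick up the same task. The sketch below models that step with plain Swift concurrency instead of LLM-driven agents. `WorkQueue`, `WorkItem`, `claimNext`, and `runWorkers` are hypothetical names used only for this illustration.

```swift
enum WorkStatus: Sendable { case pending, inProgress, completed }

struct WorkItem: Sendable {
    let id: String
    var status: WorkStatus = .pending
    var owner: String? = nil
}

actor WorkQueue {
    private var items: [WorkItem]

    init(ids: [String]) { items = ids.map { WorkItem(id: $0) } }

    // Atomically finds the first pending item and assigns it to the caller.
    func claimNext(owner: String) -> WorkItem? {
        guard let index = items.firstIndex(where: { $0.status == .pending }) else { return nil }
        items[index].status = .inProgress
        items[index].owner = owner
        return items[index]
    }

    func complete(_ id: String) {
        guard let index = items.firstIndex(where: { $0.id == id && $0.status == .inProgress })
        else { return }
        items[index].status = .completed
    }

    func completedCount() -> Int { items.filter { $0.status == .completed }.count }
}

// Each worker keeps claiming until the queue is empty; returns items done per worker.
func runWorkers(queue: WorkQueue, workers: [String]) async -> [String: Int] {
    await withTaskGroup(of: (String, Int).self, returning: [String: Int].self) { group in
        for name in workers {
            group.addTask {
                var done = 0
                while let item = await queue.claimNext(owner: name) {
                    await queue.complete(item.id)
                    done += 1
                }
                return (name, done)
            }
        }
        var tally: [String: Int] = [:]
        for await (name, done) in group { tally[name] = done }
        return tally
    }
}

let queue = WorkQueue(ids: (1...10).map { "task_\($0)" })
let tally = await runWorkers(queue: queue, workers: ["a", "b", "c"])
print(tally.values.reduce(0, +))  // 10
```

How the ten items split across the three workers varies from run to run, but the total is always ten because finding and marking an item happen inside one actor call.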
The real strength of this design is clear boundaries and controllable cost
Open Agent SDK does not try to solve every problem with one super-agent. Instead, it decomposes complexity into three stable abstractions: protocols, state machines, and mailboxes. The benefits are clear: context does not grow without bound, task history remains traceable, and team boundaries are verifiable.
Its trade-offs are also explicit: sub-agents execute in blocking mode by default, task dependencies are not resolved automatically, and messages use polling-style retrieval. But these constraints reduce implementation complexity and improve runtime controllability, which is a reasonable engineering trade-off for production-grade agent systems.
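Since dependencies are not resolved automatically, whoever drives the workflow has to check blockers before starting a task. The sketch below shows what such a check amounts to, assuming each task carries a `blockedBy` list of task IDs. `TaskRecord` and `openBlockers` are hypothetical names, not SDK API.

```swift
enum Status: Equatable { case pending, inProgress, completed, failed, cancelled }

// Hypothetical task shape with an explicit list of blocking task IDs.
struct TaskRecord {
    let id: String
    var status: Status
    var blockedBy: [String] = []
}

// Returns the blockers of a task that are not yet completed.
// Unknown blocker IDs are treated as unfinished.
func openBlockers(of id: String, in tasks: [String: TaskRecord]) -> [String] {
    guard let task = tasks[id] else { return [] }
    return task.blockedBy.filter { tasks[$0]?.status != Status.completed }
}

let ledger: [String: TaskRecord] = [
    "task_1": TaskRecord(id: "task_1", status: .completed),
    "task_2": TaskRecord(id: "task_2", status: .pending, blockedBy: ["task_1"]),
    "task_3": TaskRecord(id: "task_3", status: .pending, blockedBy: ["task_2"]),
]

print(openBlockers(of: "task_2", in: ledger))  // []
print(openBlockers(of: "task_3", in: ledger))  // ["task_2"]
```

In this example `task_2` is ready to start, while `task_3` has to wait until `task_2` is completed.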
FAQ
1. Why can sub-agents not spawn more sub-agents by default?
Because DefaultSubAgentSpawner filters out AgentTool, which prevents recursive spawning from causing runaway token costs, unbounded scheduling depth, and unclear permission boundaries.
2. Does the Task system support automatic dependency scheduling?
The current structure reserves blockedBy and blocks fields, but TaskStore does not resolve dependencies automatically. Dependency checks still need to be handled explicitly by the LLM through TaskList and TaskUpdate.
3. What scenarios are Team and Mailbox best suited for?
They are best suited for workflows that require multiple agents to collaborate over time, hand off results, and maintain stable responsibilities, such as pipeline-style cooperation across exploration, planning, coding, and testing.
Core Summary: This article systematically breaks down the three-layer multi-agent collaboration model implemented by Open Agent SDK in Swift: sub-agent spawning, the Task state machine, and Team plus Mailbox communication. It also covers typical orchestration patterns and design trade-offs, making it a strong reference for building scalable agent workflows.