Operators
An Operator is MOSAIC's agent-level interface: a unified abstraction that lets the GUI assign a worker to an individual agent or to a group of agents. Workers handle process-level concerns (training, telemetry, GPU isolation); Operators are used strictly for evaluation and interactive play. The worker inside an Operator loads a trained policy (or calls an LLM API, or reads keyboard input) and computes actions step by step. The Operator wraps this worker and answers the question “given this observation, what action should I take?”
%%{init: {"flowchart": {"curve": "linear"}} }%%
graph TB
GUI["Qt6 GUI<br/>(Main Process)"]
LAUNCHER["OperatorLauncher<br/>(Subprocess Manager)"]
GUI --> LAUNCHER
LAUNCHER -- "stdin/stdout JSON" --> H_OP
LAUNCHER -- "stdin/stdout JSON" --> L_OP
LAUNCHER -- "stdin/stdout JSON" --> R_OP
LAUNCHER -- "stdin/stdout JSON" --> B_OP
subgraph H_OP["Human Operator"]
HW["human_worker<br/>Keyboard Input"]
end
subgraph L_OP["LLM Operator"]
LW1["balrog_worker<br/>Single-Agent"]
LW2["mosaic_llm_worker<br/>Multi-Agent"]
LW3["chess_worker<br/>Two-Player"]
end
subgraph R_OP["RL Operator"]
RW1["cleanrl_worker<br/>PPO / DQN"]
RW2["xuance_worker<br/>MAPPO / QMIX"]
RW3["ray_worker<br/>PPO / IMPALA"]
end
subgraph B_OP["Baseline Operator"]
BW["operators_worker<br/>Random / Scripted"]
end
style GUI fill:#4a90d9,stroke:#2e5a87,color:#fff
style LAUNCHER fill:#50c878,stroke:#2e8b57,color:#fff
style H_OP fill:#9370db,stroke:#6a0dad,color:#fff
style L_OP fill:#9370db,stroke:#6a0dad,color:#fff
style R_OP fill:#9370db,stroke:#6a0dad,color:#fff
style B_OP fill:#9370db,stroke:#6a0dad,color:#fff
style HW fill:#ff7f50,stroke:#cc5500,color:#fff
style LW1 fill:#ff7f50,stroke:#cc5500,color:#fff
style LW2 fill:#ff7f50,stroke:#cc5500,color:#fff
style LW3 fill:#ff7f50,stroke:#cc5500,color:#fff
style RW1 fill:#ff7f50,stroke:#cc5500,color:#fff
style RW2 fill:#ff7f50,stroke:#cc5500,color:#fff
style RW3 fill:#ff7f50,stroke:#cc5500,color:#fff
style BW fill:#ff7f50,stroke:#cc5500,color:#fff
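The diagram labels the link between the launcher and each operator as "stdin/stdout JSON". As a rough illustration of that pattern only, the sketch below starts a child Python process and exchanges one JSON object per line with it. The message fields (`obs`, `action`), the toy policy, and the helper name `query_operator` are assumptions made for this example and are not MOSAIC's actual wire format or API.

```python
import json
import subprocess
import sys

# Hypothetical child "operator": reads one JSON object per line from stdin
# and replies with one JSON object per line on stdout. The field names are
# illustrative and not taken from MOSAIC.
CHILD_SOURCE = r"""
import json, sys
for line in sys.stdin:
    msg = json.loads(line)
    # Toy policy: action 0 for even observations, 1 for odd ones.
    reply = {"action": msg["obs"] % 2}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
"""


def query_operator(observations):
    """Send each observation to the child process and collect its actions."""
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD_SOURCE],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    actions = []
    try:
        for obs in observations:
            # One request per line; flush so the child sees it immediately.
            proc.stdin.write(json.dumps({"obs": obs}) + "\n")
            proc.stdin.flush()
            reply = json.loads(proc.stdout.readline())
            actions.append(reply["action"])
    finally:
        proc.stdin.close()  # EOF ends the child's read loop
        proc.wait(timeout=10)
        proc.stdout.close()
    return actions
```

Line-delimited JSON keeps the framing trivial: each side reads exactly one line per message, so neither process needs a length prefix or a separate socket.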
Key Principles
Principle | Description |
|---|---|
Protocol-Based | Operators implement Python |
Category System | Every operator belongs to a category: |
Interactive Mode | Operators run as subprocesses with |
Multi-Operator Comparison | Multiple operators can run side-by-side on the same environment with shared seeds for scientific comparison (e.g., LLM vs RL on the same MiniGrid layout). |
Decoupled Execution | Manual mode (click-to-step) and Script mode (automated experiments) are fully independent code paths with separate state machines. |
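The shared-seed idea behind Multi-Operator Comparison can be sketched with a toy environment whose layout depends only on the seed, so every policy is evaluated on identical layouts. Everything here (`ToyCorridorEnv`, `run_episode`, `compare`, and the two policies) is a hypothetical stand-in written for this example; it does not use MiniGrid or any MOSAIC API.

```python
import random


class ToyCorridorEnv:
    """Hypothetical 1-D corridor; the goal cell depends only on the seed."""

    def __init__(self, length=8):
        self.length = length
        self.pos = 0
        self.goal = 1

    def reset(self, seed):
        layout_rng = random.Random(seed)  # same seed -> same layout
        self.goal = layout_rng.randrange(1, self.length)
        self.pos = 0
        return self.pos

    def step(self, action):
        # Action 1 moves one cell to the right; action 0 stays in place.
        self.pos = min(self.pos + action, self.length - 1)
        return self.pos, self.pos == self.goal


def run_episode(policy, seed, max_steps=50):
    """Return (goal, steps_taken) for one policy on one seeded layout."""
    env = ToyCorridorEnv()
    obs = env.reset(seed)
    for step_count in range(1, max_steps + 1):
        obs, done = env.step(policy(obs))
        if done:
            return env.goal, step_count
    return env.goal, max_steps


def compare(policies, seeds):
    """Run every policy on the same list of seeds."""
    return {
        name: [run_episode(policy, seed) for seed in seeds]
        for name, policy in policies.items()
    }


def always_right(obs):
    return 1


def make_random_policy(seed):
    rng = random.Random(seed)  # policy randomness is separate from the layout
    return lambda obs: rng.choice([0, 1])
```

Keeping the layout RNG separate from any randomness inside a policy is the design point: differences in step counts then reflect the policies rather than the layouts they happened to draw.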
Available Operators
| Operator | Category | Backend | Use Case |
|---|---|---|---|
| Human | human | Keyboard input via GUI | Manual play and debugging |
| BALROG LLM | llm | balrog_worker (vLLM, OpenRouter) | Single-agent LLM benchmarking on MiniGrid/BabyAI |
| MOSAIC LLM | llm | mosaic_llm_worker (vLLM, OpenRouter, OpenAI, Anthropic) | Multi-agent LLM with coordination and Theory of Mind |
| Chess LLM | llm | chess_worker (llm_chess prompting) | LLM chess play with multi-turn dialog |
| CleanRL | rl | cleanrl_worker (PPO, DQN) | Trained single-agent RL policy evaluation |
| XuanCe | rl | xuance_worker (MAPPO, QMIX) | Trained multi-agent RL policy evaluation |
| Ray RLlib | rl | ray_worker (PPO, IMPALA) | Distributed RL policy evaluation |
| Random Baseline | baseline | operators_worker (random action) | Baseline comparison for experiments |
Tip
An Operator wraps one or more Workers. The Operator is the
agent-level interface (select_action(obs) -> action) that the
GUI interacts with. The Worker is the process-level engine that
runs inside the Operator. This separation is what enables heterogeneous
teams – e.g., an RL-trained policy and an LLM playing side-by-side
in the same multi-agent environment. See What Is an Operator? for the
full motivation and diagrams.
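The `select_action(obs) -> action` interface mentioned in the tip can be illustrated with a structural `typing.Protocol`. This is a minimal sketch under assumptions: only the method name `select_action` comes from the text above, while the `Operator` protocol definition, the two example classes, and the `step_team` helper are hypothetical and not MOSAIC's actual classes.

```python
import random
from typing import Any, Dict, Protocol, runtime_checkable


@runtime_checkable
class Operator(Protocol):
    """Anything with a select_action method counts as an Operator here."""

    def select_action(self, obs: Any) -> Any:
        ...


class RandomOperator:
    """Hypothetical baseline: samples uniformly from n_actions discrete actions."""

    def __init__(self, n_actions: int, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._n_actions = n_actions

    def select_action(self, obs: Any) -> int:
        return self._rng.randrange(self._n_actions)


class AlwaysForwardOperator:
    """Hypothetical scripted operator: always returns the same action id."""

    def select_action(self, obs: Any) -> int:
        return 2


def step_team(team: Dict[str, Operator], observations: Dict[str, Any]) -> Dict[str, Any]:
    """Ask each agent's operator for an action given that agent's observation."""
    return {
        agent_id: team[agent_id].select_action(obs)
        for agent_id, obs in observations.items()
    }
```

Because the protocol is structural, neither class inherits from `Operator`; a mixed team is just a dictionary mapping agent ids to objects that expose `select_action`, which is what makes heterogeneous pairings straightforward to express.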