Operators

An Operator is MOSAIC's agent-level interface: the unified abstraction that lets the GUI assign a worker to an individual agent or to a group of agents. Workers handle process-level concerns (training, telemetry, GPU isolation), whereas Operators are used strictly for evaluation and interactive play. The worker inside an Operator loads a trained policy (or calls an LLM API, or reads keyboard input) and computes actions step by step. The Operator wraps that worker and answers one question: “given this observation, what action should I take?”
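The "observation in, action out" contract can be illustrated with a small sketch. It assumes only the `select_action(obs)` method named in the Key Principles; the `Operator` protocol class, the `RandomOperator` class, and its `category` attribute are hypothetical names chosen for illustration, not MOSAIC's actual definitions.

```python
import random
from typing import Any, Protocol, runtime_checkable


@runtime_checkable
class Operator(Protocol):
    """Hypothetical protocol: any object with select_action(obs) qualifies."""

    def select_action(self, obs: Any) -> Any: ...


class RandomOperator:
    """Baseline-style operator. Note that it does not inherit from Operator."""

    category = "baseline"  # assumed attribute name, for illustration only

    def __init__(self, n_actions: int, seed: int = 0) -> None:
        self._n_actions = n_actions
        self._rng = random.Random(seed)

    def select_action(self, obs: Any) -> int:
        # Ignores the observation and samples a uniformly random action index.
        return self._rng.randrange(self._n_actions)


op = RandomOperator(n_actions=4, seed=42)
# Structural check: passes because the method exists, not because of inheritance.
assert isinstance(op, Operator)
action = op.select_action({"image": None})
```

Because the check is structural, a wrapper around a trained policy, an LLM client, or a keyboard reader could satisfy the same protocol without sharing a base class.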

%%{init: {"flowchart": {"curve": "linear"}} }%%
graph TB
    GUI["Qt6 GUI<br/>(Main Process)"]
    LAUNCHER["OperatorLauncher<br/>(Subprocess Manager)"]

    GUI --> LAUNCHER

    LAUNCHER -- "stdin/stdout JSON" --> H_OP
    LAUNCHER -- "stdin/stdout JSON" --> L_OP
    LAUNCHER -- "stdin/stdout JSON" --> R_OP
    LAUNCHER -- "stdin/stdout JSON" --> B_OP

    subgraph H_OP["Human Operator"]
        HW["human_worker<br/>Keyboard Input"]
    end

    subgraph L_OP["LLM Operator"]
        LW1["balrog_worker<br/>Single-Agent"]
        LW2["mosaic_llm_worker<br/>Multi-Agent"]
        LW3["chess_worker<br/>Two-Player"]
    end

    subgraph R_OP["RL Operator"]
        RW1["cleanrl_worker<br/>PPO / DQN"]
        RW2["xuance_worker<br/>MAPPO / QMIX"]
        RW3["ray_worker<br/>PPO / IMPALA"]
    end

    subgraph B_OP["Baseline Operator"]
        BW["operators_worker<br/>Random / Scripted"]
    end

    style GUI fill:#4a90d9,stroke:#2e5a87,color:#fff
    style LAUNCHER fill:#50c878,stroke:#2e8b57,color:#fff
    style H_OP fill:#9370db,stroke:#6a0dad,color:#fff
    style L_OP fill:#9370db,stroke:#6a0dad,color:#fff
    style R_OP fill:#9370db,stroke:#6a0dad,color:#fff
    style B_OP fill:#9370db,stroke:#6a0dad,color:#fff
    style HW fill:#ff7f50,stroke:#cc5500,color:#fff
    style LW1 fill:#ff7f50,stroke:#cc5500,color:#fff
    style LW2 fill:#ff7f50,stroke:#cc5500,color:#fff
    style LW3 fill:#ff7f50,stroke:#cc5500,color:#fff
    style RW1 fill:#ff7f50,stroke:#cc5500,color:#fff
    style RW2 fill:#ff7f50,stroke:#cc5500,color:#fff
    style RW3 fill:#ff7f50,stroke:#cc5500,color:#fff
    style BW fill:#ff7f50,stroke:#cc5500,color:#fff

Key Principles

Protocol-Based	Operators implement Python `Protocol` classes – no base class inheritance required. Any object with a `select_action(obs)` method is a valid operator.
Category System	Every operator belongs to a category: `human`, `llm`, `rl`, or `baseline`. The GUI adapts its configuration UI based on category.
Interactive Mode	Operators run as subprocesses with the `--interactive` flag, which enables step-by-step JSON commands over stdin/stdout. This keeps the GUI responsive while operators compute.
Multi-Operator Comparison	Multiple operators can run side-by-side on the same environment with shared seeds for scientific comparison (e.g., LLM vs RL on the same MiniGrid layout).
Decoupled Execution	Manual mode (click-to-step) and Script mode (automated experiments) are fully independent code paths with separate state machines.
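To make the Interactive Mode principle concrete, here is a minimal sketch of a line-oriented JSON command loop. The message fields (`cmd`, `obs`, `type`, `action`) and the command names (`step`, `stop`) are assumptions made for this example; the actual MOSAIC wire format may differ. The demo uses in-memory streams so it runs without spawning a subprocess.

```python
import io
import json
from typing import Any, TextIO


class ConstantOperator:
    """Stand-in operator that always returns the same action."""

    def __init__(self, action: int) -> None:
        self._action = action

    def select_action(self, obs: Any) -> int:
        return self._action


def serve(operator: Any, stdin: TextIO, stdout: TextIO) -> None:
    """Answer one JSON command per input line until 'stop' or end of input."""
    for line in stdin:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        cmd = msg.get("cmd")  # hypothetical field name
        if cmd == "step":
            action = operator.select_action(msg.get("obs"))
            reply = {"type": "action", "action": action}
        elif cmd == "stop":
            reply = {"type": "stopped"}
        else:
            reply = {"type": "error", "message": f"unknown command: {cmd!r}"}
        # One reply per line, flushed so the parent process is never left waiting.
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()
        if cmd == "stop":
            break


# In-memory demo; a real subprocess would pass sys.stdin and sys.stdout instead.
fake_in = io.StringIO('{"cmd": "step", "obs": [0, 1]}\n{"cmd": "stop"}\n')
fake_out = io.StringIO()
serve(ConstantOperator(action=2), fake_in, fake_out)
replies = [json.loads(reply_line) for reply_line in fake_out.getvalue().splitlines()]
```

Keeping one JSON object per line lets the parent process read replies with a simple line reader, and the operator's computation stays outside the GUI's event loop.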

Available Operators

Operator	Category	Backend	Use Case
Human	human	Keyboard input via GUI	Manual play and debugging
BALROG LLM	llm	balrog_worker (vLLM, OpenRouter)	Single-agent LLM benchmarking on MiniGrid/BabyAI
MOSAIC LLM	llm	mosaic_llm_worker (vLLM, OpenRouter, OpenAI, Anthropic)	Multi-agent LLM with coordination and Theory of Mind
Chess LLM	llm	chess_worker (llm_chess prompting)	LLM chess play with multi-turn dialog
CleanRL	rl	cleanrl_worker (PPO, DQN)	Trained single-agent RL policy evaluation
XuanCe	rl	xuance_worker (MAPPO, QMIX)	Trained multi-agent RL policy evaluation
Ray RLlib	rl	ray_worker (PPO, IMPALA)	Distributed RL policy evaluation
Random Baseline	baseline	operators_worker (random action)	Baseline comparison for experiments

Tip

An Operator wraps one or more Workers. The Operator is the agent-level interface (`select_action(obs) -> action`) that the GUI interacts with. The Worker is the process-level engine that runs inside the Operator. This separation is what enables heterogeneous teams – e.g., an RL-trained policy and an LLM playing side by side in the same multi-agent environment. See What Is an Operator? for the full motivation and diagrams.
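A heterogeneous team can be sketched as a mapping from agent IDs to operators that share only the `select_action(obs)` method. Everything else below is hypothetical: `TablePolicyOperator` stands in for an RL-backed operator, `CannedTextOperator` stands in for an LLM-backed one (no model is called), and `joint_step`, the agent IDs, and the integer observations are illustrative choices.

```python
from typing import Any, Dict


class TablePolicyOperator:
    """Stand-in for an RL operator: looks actions up in a fixed table."""

    category = "rl"

    def __init__(self, table: Dict[int, int]) -> None:
        self._table = table

    def select_action(self, obs: int) -> int:
        # Falls back to action 0 for observations missing from the table.
        return self._table.get(obs, 0)


class CannedTextOperator:
    """Stand-in for an LLM operator: turns a text reply into an action index."""

    category = "llm"
    _ACTIONS = {"left": 0, "right": 1}

    def select_action(self, obs: int) -> int:
        # A real operator would query an LLM here; this reply is hard-coded.
        reply = "right" if obs % 2 == 0 else "left"
        return self._ACTIONS[reply]


def joint_step(team: Dict[str, Any], observations: Dict[str, int]) -> Dict[str, int]:
    """Ask each agent's operator for an action, given that agent's observation."""
    return {
        agent_id: team[agent_id].select_action(obs)
        for agent_id, obs in observations.items()
    }


team = {
    "agent_0": TablePolicyOperator({0: 1, 1: 0}),
    "agent_1": CannedTextOperator(),
}
actions = joint_step(team, {"agent_0": 0, "agent_1": 3})
```

The calling code never inspects which backend sits behind an agent, so any operator in the team can be swapped for another category without touching `joint_step`.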