MOSAIC Random Worker

The MOSAIC Random Worker is MOSAIC’s lightweight baseline agent for multi-agent and single-agent environments. It selects actions without any learned policy, providing a performance floor for comparison against RL, LLM, and human decision-makers.

The worker supports three action-selection behaviors: random (uniform sampling), noop (always action 0), and cycling (round-robin through the action space). This makes it useful for sanity checks, environment debugging, and establishing baselines in heterogeneous experiments (e.g., RL + LLM teammates vs. Random opponents).

Paradigm	Baseline agent (single-agent and multi-agent)
Task Type	Random baseline, noop baseline, cycling baseline
Behaviors	`random` (uniform), `noop` (action 0), `cycling` (round-robin)
Environments	All Gymnasium-compatible: MiniGrid, BabyAI, MosaicMultiGrid, Atari, FrozenLake, Taxi, Blackjack, MeltingPot, PettingZoo, and more
Execution	Subprocess (autonomous or interactive step-by-step)
GPU required	No
Source	`3rd_party/mosaic/random_worker/random_worker/`
Entry point	`random-worker` (CLI)

Overview

The MOSAIC Random Worker provides a zero-intelligence baseline for any Gymnasium-compatible environment. It requires no training, no API keys, and no GPU: point it at an environment and it selects actions according to one of three simple strategies.

This enables several research workflows:

Baseline comparison: Measure how much better a trained RL policy or LLM agent performs compared to random play.
Environment debugging: Verify that environments, rendering, and telemetry pipelines work end-to-end before deploying expensive workers.
Heterogeneous experiments: Fill opponent or teammate slots with random agents in mixed-paradigm evaluations (e.g., RL team vs Random team).

Key features:

3 action behaviors: random (uniform sampling), noop (always still), cycling (round-robin)
Automatic action space resolution: creates a temporary env to detect Discrete(N)
Multi-agent support: handles Dict action spaces (MosaicMultiGrid Soccer, Basketball, etc.)
Dual runtime modes: autonomous (batch episodes) or interactive (GUI step-by-step)
Action-selector mode: for PettingZoo games where the GUI owns the environment
Deterministic seeding: reproducible action sequences via --seed
RGB rendering: captures frames for GUI visualization
117 tests across 15 test classes covering all behaviors and environments

Architecture

The worker follows the standard MOSAIC shim pattern with two runtime modes:

graph TB
    subgraph "MOSAIC GUI"
        FORM["Operator Config<br/>(Baseline worker)"]
        DAEMON["Operator Launcher"]
    end

    subgraph "Random Worker Subprocess"
        CLI["cli.py<br/>(random-worker)"]
        CFG["config.py<br/>(RandomWorkerConfig)"]
        RT["runtime.py<br/>(RandomWorkerRuntime)"]
    end

    subgraph "Environment"
        ENV["Gymnasium<br/>(any compatible env)"]
    end

    FORM -->|"config JSON"| DAEMON
    DAEMON -->|"spawn"| CLI
    CLI --> CFG --> RT
    RT -->|"reset / step"| ENV

    style FORM fill:#4a90d9,stroke:#2e5a87,color:#fff
    style DAEMON fill:#50c878,stroke:#2e8b57,color:#fff
    style CLI fill:#ff7f50,stroke:#cc5500,color:#fff
    style CFG fill:#ff7f50,stroke:#cc5500,color:#fff
    style RT fill:#ff7f50,stroke:#cc5500,color:#fff
    style ENV fill:#e8e8e8,stroke:#999

Action Behaviors

Behavior	Strategy	Description
random	Uniform sampling	Samples uniformly from `Discrete(N)`. Default behavior. Provides a true random baseline for comparison.
noop	Always action 0	Always selects action 0 (`still` / do nothing). Tests environment behavior with a passive agent.
cycling	Round-robin	Cycles through actions `0, 1, 2, ..., N-1, 0, 1, ...` Deterministically exercises every action in order.
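
The three behaviors in the table can be sketched in a few lines of Python. This is a minimal illustration, not the worker's actual implementation: the class name `ActionSelector` and its `select` method are hypothetical, and the sketch assumes a single `Discrete(N)` space represented by the integer `n_actions`.

```python
import random
from typing import Optional


class ActionSelector:
    """Illustrative selector for a Discrete(n_actions) space (hypothetical class)."""

    def __init__(self, n_actions: int, behavior: str = "random",
                 seed: Optional[int] = None):
        if behavior not in ("random", "noop", "cycling"):
            raise ValueError(f"unknown behavior: {behavior!r}")
        self.n_actions = n_actions
        self.behavior = behavior
        self._rng = random.Random(seed)  # private RNG, so a seed fixes the sequence
        self._next = 0                   # next index for the cycling behavior

    def select(self) -> int:
        if self.behavior == "noop":
            return 0  # always action 0
        if self.behavior == "cycling":
            action = self._next
            self._next = (self._next + 1) % self.n_actions  # wrap around after N-1
            return action
        return self._rng.randrange(self.n_actions)  # uniform over 0..N-1


cycling = ActionSelector(3, "cycling")
print([cycling.select() for _ in range(7)])  # [0, 1, 2, 0, 1, 2, 0]
```

Keeping a private `random.Random` instance, rather than the module-level generator, is one way to make two selectors with the same seed produce the same action sequence regardless of other code that draws random numbers.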

Supported Environments

The Random Worker automatically resolves the action space by creating a temporary environment instance. It supports any Gymnasium-compatible environment, including:

Environment	Action Space	Notes
MiniGrid (all variants)	Discrete(7)	Empty, DoorKey, KeyCorridor, etc.
BabyAI (all variants)	Discrete(7)	GoTo, Pickup, Open tasks
MosaicMultiGrid Soccer 1v1	Discrete(8) x 2	Multi-agent Dict action space
MosaicMultiGrid Soccer 2v2	Discrete(8) x 4	4-agent team play
MosaicMultiGrid Basketball 3v3	Discrete(8) x 6	6-agent team play
MosaicMultiGrid Collect	Discrete(8) x 2–4	Ball collection variants
Gymnasium Toy Text	Discrete(N)	FrozenLake, Taxi, Blackjack, CliffWalking
Atari / ALE	Discrete(N)	All 128 ALE games
PettingZoo	varies	Chess, Connect Four, Go (action-selector mode)

For multi-agent environments with Dict action spaces, the worker automatically unwraps to individual Discrete spaces and builds per-agent action dictionaries at each step.
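
The unwrapping step described above might look roughly like the following sketch. The helper names `action_sizes` and `build_joint_action` are hypothetical. The code relies only on the `.spaces` mapping of a Dict space and the `.n` attribute of a Discrete space, which are the attribute names Gymnasium uses; lightweight stand-in objects are used here so the example runs without Gymnasium installed, and integer agent ids are an assumption.

```python
import random
from types import SimpleNamespace
from typing import Dict, Hashable


def action_sizes(space) -> Dict[Hashable, int]:
    """Unwrap a Dict-like space (has `.spaces`) into per-agent Discrete sizes."""
    return {agent: sub.n for agent, sub in space.spaces.items()}


def build_joint_action(sizes: Dict[Hashable, int],
                       rng: random.Random) -> Dict[Hashable, int]:
    """Sample one uniform action per agent from that agent's own Discrete(n)."""
    return {agent: rng.randrange(n) for agent, n in sizes.items()}


# Stand-in for a 4-agent Dict action space with Discrete(8) per agent.
fake_space = SimpleNamespace(
    spaces={agent_id: SimpleNamespace(n=8) for agent_id in range(4)}
)

sizes = action_sizes(fake_space)
joint = build_joint_action(sizes, random.Random(0))
print(sorted(joint))  # [0, 1, 2, 3]
```

The resulting dictionary has one entry per agent, which is the shape a multi-agent `step` call with a Dict action space would expect.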

Runtime Modes

Autonomous mode (batch episodes, for Script Experiments):

random-worker --run-id test123 \
    --task MiniGrid-Empty-8x8-v0 \
    --behavior random --seed 42

Interactive mode (GUI step-by-step, action-selector protocol):

random-worker --run-id test123 --interactive \
    --task MosaicMultiGrid-Soccer-1vs1-IndAgObs-v0

Interactive mode reads JSON commands from stdin and emits telemetry to stdout:

{"cmd": "init_agent", "game_name": "chess_v6", "player_id": "player_0"}
{"cmd": "select_action", "observation": ["..."], "player_id": "player_0"}
{"cmd": "stop"}
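
A handler for these action-selector commands could be sketched as below. Only the three `cmd` values come from the examples above; the function name `handle_command` and every field in the reply dictionaries (`type`, `action`, `message`) are hypothetical, since the worker's actual telemetry format is not shown here. The default of 7 actions mirrors the fallback `Discrete(7)` mentioned in the test table.

```python
import json
import random
from typing import Optional


def handle_command(line: str, rng: random.Random,
                   n_actions: int = 7) -> Optional[dict]:
    """Return a reply dict for one JSON command, or None when the loop should stop."""
    try:
        msg = json.loads(line)
    except json.JSONDecodeError:
        return {"type": "error", "message": "malformed JSON"}
    if not isinstance(msg, dict):
        return {"type": "error", "message": "expected a JSON object"}

    cmd = msg.get("cmd")
    if cmd == "init_agent":
        # Reply shape is an assumption made for this sketch.
        return {"type": "agent_ready", "player_id": msg.get("player_id")}
    if cmd == "select_action":
        # The observation is ignored: a random baseline does not look at it.
        return {"type": "action_selected",
                "player_id": msg.get("player_id"),
                "action": rng.randrange(n_actions)}
    if cmd == "stop":
        return None
    return {"type": "error", "message": f"unknown cmd: {cmd!r}"}


rng = random.Random(42)
reply = handle_command(
    '{"cmd": "select_action", "observation": [], "player_id": "player_0"}', rng
)
print(reply["player_id"])  # player_0
```

In a real loop, each line read from stdin would be passed to such a handler and each non-`None` reply written to stdout as one JSON line.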

Autonomous mode uses the env-owning protocol:

{"cmd": "reset", "seed": 42}
{"cmd": "step"}
{"cmd": "stop"}
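
For contrast with the action-selector protocol, the following sketch shows what "env-owning" could mean in practice: the worker holds the environment, so `step` both chooses an action and advances the environment. Everything here is illustrative. `CountdownEnv` is a toy stand-in with a Gymnasium-style `reset`/`step` signature, `EnvOwningWorker` is a hypothetical class, and the reply fields are assumptions rather than the worker's real telemetry schema.

```python
import json
import random
from typing import Optional


class CountdownEnv:
    """Toy stand-in environment; each episode lasts `horizon` steps."""

    n_actions = 4

    def __init__(self, horizon: int = 3):
        self.horizon = horizon
        self._t = 0

    def reset(self, seed: Optional[int] = None):
        self._t = 0
        return self._t, {}  # (observation, info)

    def step(self, action: int):
        self._t += 1
        reward = 1.0 if action == 0 else 0.0
        terminated = self._t >= self.horizon
        return self._t, reward, terminated, False, {}


class EnvOwningWorker:
    """Hypothetical worker that owns its environment and picks uniform actions."""

    def __init__(self, env: CountdownEnv):
        self.env = env
        self.rng = random.Random()

    def handle(self, line: str) -> Optional[dict]:
        msg = json.loads(line)
        cmd = msg.get("cmd")
        if cmd == "reset":
            seed = msg.get("seed")
            self.rng = random.Random(seed)  # reseed action sampling
            self.env.reset(seed=seed)
            return {"type": "ready", "seed": seed}
        if cmd == "step":
            action = self.rng.randrange(self.env.n_actions)
            _, reward, terminated, truncated, _ = self.env.step(action)
            return {"type": "step", "action": action, "reward": reward,
                    "terminated": terminated, "truncated": truncated}
        if cmd == "stop":
            return None
        return {"type": "error", "message": f"unknown cmd: {cmd!r}"}


worker = EnvOwningWorker(CountdownEnv(horizon=3))
worker.handle('{"cmd": "reset", "seed": 42}')
replies = [worker.handle('{"cmd": "step"}') for _ in range(3)]
print(replies[-1]["terminated"])  # True
```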

Configuration

CLI arguments:

Argument	Default	Description
`--run-id`	(required)	Unique run identifier (assigned by GUI)
`--task`	`""`	Gymnasium environment ID (required for autonomous mode)
`--env-name`	`""`	Environment family name
`--seed`	`None`	Random seed for reproducible action sequences
`--behavior`	`random`	Action selection strategy: `random`, `noop`, or `cycling`
`--interactive`	`false`	Run in interactive (action-selector) mode

RandomWorkerConfig dataclass:

from dataclasses import dataclass
from typing import Optional


@dataclass
class RandomWorkerConfig:
    run_id: str = ""
    env_name: str = ""
    task: str = ""
    seed: Optional[int] = None
    behavior: str = "random"  # random | noop | cycling
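
As a rough sketch of how the CLI arguments above could map onto the dataclass, the following uses the standard `argparse` module. The flag names and defaults are taken from the argument table; the function `parse_args`, the way `--interactive` is returned alongside the config, and the use of `choices` to validate `--behavior` are assumptions for illustration and may differ from the real `cli.py`.

```python
import argparse
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class RandomWorkerConfig:
    run_id: str = ""
    env_name: str = ""
    task: str = ""
    seed: Optional[int] = None
    behavior: str = "random"  # random | noop | cycling


def parse_args(argv: List[str]) -> Tuple[RandomWorkerConfig, bool]:
    """Parse CLI flags into a config plus the interactive-mode switch."""
    parser = argparse.ArgumentParser(prog="random-worker")
    parser.add_argument("--run-id", required=True)
    parser.add_argument("--task", default="")
    parser.add_argument("--env-name", default="")
    parser.add_argument("--seed", type=int, default=None)
    parser.add_argument("--behavior", default="random",
                        choices=["random", "noop", "cycling"])
    parser.add_argument("--interactive", action="store_true")
    ns = parser.parse_args(argv)

    config = RandomWorkerConfig(
        run_id=ns.run_id,       # argparse turns --run-id into run_id
        env_name=ns.env_name,
        task=ns.task,
        seed=ns.seed,
        behavior=ns.behavior,
    )
    return config, ns.interactive


config, interactive = parse_args(
    ["--run-id", "test123", "--task", "MiniGrid-Empty-8x8-v0",
     "--behavior", "random", "--seed", "42"]
)
print(config.seed, interactive)  # 42 False
```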

Test Coverage

The Random Worker has 117 tests across 15 test classes:

Test Class	Tests	Coverage
TestRandomWorkerConfig	2	Config dataclass defaults and construction
TestCLI	3	Argument parsing and entry point
TestRuntimeProtocol	6	JSON protocol with fallback Discrete(7)
TestFullLoop	6	Stdin/stdout loop, malformed JSON, empty lines
TestActionSpaceResolution	12	MiniGrid, BabyAI, MosaicMultiGrid, FrozenLake, Taxi, Blackjack
TestBehaviorsWithRealEnvs	12	random/noop/cycling across Discrete(2,4,6,7,8)
TestMultiAgent	4	4-agent Soccer, 6-agent Basketball, reinit
TestReproducibility	3	Seed determinism
TestFullLoopRealEnvs	3	Full protocol with Soccer 2v2, FrozenLake, BabyAI
TestSubprocessIntegration	7	Real subprocess launches
TestEdgeCases	7	Large obs, missing fields, rapid fire
TestAutonomousMode	16	Env-owning reset/step/stop protocol
TestMosaicMultigridAutonomous	18	All 11 MosaicMultiGrid envs, render, episode end
TestMosaicMultigridInteractive	14	Action space resolution, 4-agent Soccer, 6-agent Basketball
TestMosaicMultigridSubprocess	3	Real subprocess: Soccer 2v2, Soccer 1v1, Basketball 3v3

Installation