Overview

MOSAIC is built on a layered architecture that separates concerns and enables extensibility.

MOSAIC Full Architecture

Full architecture: Evaluation Phase (left), Training Phase (right), Daemon Process (gRPC Server, RunRegistry, Dispatcher, Broadcasters), and Worker Processes (CleanRL, XuanCe, Ray RLlib, BALROG, MOSAIC LLM).

System Layers

Each layer and its key components:

  • Visual Layer: MainWindow, ControlPanel, RenderTabs, AdvancedConfigTab

  • Service Layer: PolicyMappingService, ActorService, TelemetryService, OperatorService, SessionSeedManager, StorageRecorderService

  • Controller Layer: SessionController, HumanInputController, InteractionController, LiveTelemetryController

  • Adapter Layer: EnvironmentAdapter (base), PettingZooAdapter, ALEAdapter, MiniGridAdapter, ViZDoomAdapter, SMACAdapter, and 50+ environment-specific adapters

  • Worker Layer (gRPC/IPC): CleanRL, XuanCe, RLlib, BALROG, MOSAIC LLM, Chess LLM

  • Fast Lane (Shared Memory): FastLaneWriter, FastLaneReader, FastLaneConsumer, SPSC ring buffer for real-time frame delivery

Visual Layer

The visual layer provides the user interface built with PyQt6. The screenshot below shows the four main regions of the MOSAIC Qt Shell, each highlighted with a distinct colour:

MOSAIC Visual Layer, annotated screenshot

Annotated screenshot of the MOSAIC Qt Shell showing all four visual regions.

Regions and their descriptions:

  • Main Window: The top-level application shell (MainWindow). Hosts the menu bar (Settings, Control Panel, Render View, Game Info, Runtime Log, Chat), the three content panes below, and the global Dark Mode toggle.

  • Control Panel: Left sidebar (ControlPanel). Contains environment selection (Family, Environment, Seed), game configuration (Input Mode, Display Resolution, Control Mode), the game-flow buttons (Start / Pause / Continue / Terminate / Agent Step / Reset), and keyboard assignment for multi-human play via evdev.

  • Render Tabs: Centre area (RenderTabs, a QTabWidget). Displays the live environment frame through switchable tabs: Grid, Raw, Video, Human Replay, Multi-Operator, Management, Tensorboard. Dynamic per-run tabs (e.g. FastLaneTab, LiveTelemetryTab) are added automatically when training starts. See Overview for the full rendering architecture.

  • Runtime Logs: Right panel (RuntimeLogPanel). Streams structured log messages with Component and Severity filters. Every log line carries a LOG### code (see Log Constants) for fast searching.

Component summary:

  • MainWindow: Application shell with tab management and menu bar

  • ControlPanel: Environment selection and actor configuration

  • RenderTabs: Displays environment renders (RGB, ASCII, etc.)

  • RuntimeLogPanel: Filterable structured log viewer

  • AdvancedConfigTab: Fine-grained experiment configuration (accessible via the Settings menu)

Service Layer

Services provide business logic independent of the UI:

  • PolicyMappingService: Per-agent policy binding with paradigm awareness and link groups for one-to-one and one-to-many policy mappings in multi-agent RL

  • ActorService: Actor registration and action selection (see Actors)

  • TelemetryService: Aggregates telemetry events and forwards them to storage backends

  • OperatorService: Multi-agent environment orchestration during evaluation

  • SessionSeedManager: Deterministic seeding across Python, NumPy, and Qt for reproducibility

  • StorageRecorderService: HDF5-based session recording and replay

  • ServiceLocator: Central registry for service discovery
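To illustrate what a central registry for service discovery can look like, here is a minimal sketch. It is not MOSAIC's actual ServiceLocator: the `register` and `resolve` method names, the type-keyed lookup, and the in-memory `TelemetryService` stand-in are all assumptions made for the example.

```python
from typing import Any, Dict, List, Type, TypeVar

T = TypeVar("T")


class ServiceLocator:
    """Hypothetical registry that maps a service type to one instance."""

    def __init__(self) -> None:
        self._services: Dict[type, Any] = {}

    def register(self, service_type: Type[T], instance: T) -> None:
        # Refuse silent overwrites so wiring mistakes surface early.
        if service_type in self._services:
            raise ValueError(f"{service_type.__name__} is already registered")
        self._services[service_type] = instance

    def resolve(self, service_type: Type[T]) -> T:
        try:
            return self._services[service_type]
        except KeyError:
            raise LookupError(f"{service_type.__name__} is not registered") from None


class TelemetryService:
    """Stand-in service (assumed API): collects events in memory only."""

    def __init__(self) -> None:
        self.events: List[dict] = []

    def record(self, event: dict) -> None:
        self.events.append(event)


# Wiring happens once at startup; consumers look services up by type.
locator = ServiceLocator()
locator.register(TelemetryService, TelemetryService())
locator.resolve(TelemetryService).record({"step": 1, "reward": 0.5})
```

Keying the registry by type keeps controllers and UI code free of direct constructor calls, which is one common way to keep business logic independent of the UI.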

Controller Layer

Controllers coordinate between services and the UI via Qt signals:

  • SessionController: Manages the adapter lifecycle and evaluation loop

  • HumanInputController: Captures keyboard and mouse input for human agents

  • InteractionController: Abstract base with environment-specific subclasses (Box2D, TurnBased, ALE, ViZDoom, SMAC, Procgen, Jumanji)

  • LiveTelemetryController: Real-time telemetry display and updates

Adapter Layer

Adapters provide a unified EnvironmentAdapter interface to different environment types. MOSAIC uses an adapter factory pattern to instantiate the correct adapter at runtime:

  • EnvironmentAdapter: Abstract base class defining the step/reset/render contract

  • PettingZooAdapter: PettingZoo multi-agent environments (AEC and Parallel)

  • ALEAdapter: Atari 2600 games via the Arcade Learning Environment

  • MiniGridAdapter: Procedural grid-world navigation (25+ variants)

  • BabyAIAdapter: Language-grounded instruction following (35+ variants)

  • ViZDoomAdapter: Doom-based first-person visual RL

  • SMACAdapter / SMACv2Adapter: StarCraft Multi-Agent Challenge

  • JumanjiAdapter: JAX-accelerated environments (20+ variants)

  • And 50+ more covering Gymnasium, Box2D, MuJoCo, Crafter, MiniHack, NetHack, TextWorld, Procgen, Overcooked, RWARE, Melting Pot, PyBullet Drones, and others
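The adapter factory pattern described above can be sketched as follows. This is an illustration under stated assumptions, not MOSAIC's real interface: the method signatures, the `register_adapter` and `create_adapter` helpers, and the toy `CounterAdapter` are hypothetical, and the actual EnvironmentAdapter contract may expose more than step/reset/render.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Optional, Tuple


class EnvironmentAdapter(ABC):
    """Illustrative base class for the step/reset/render contract."""

    @abstractmethod
    def reset(self, seed: Optional[int] = None) -> Any: ...

    @abstractmethod
    def step(self, action: Any) -> Tuple[Any, float, bool]: ...

    @abstractmethod
    def render(self) -> Any: ...


class CounterAdapter(EnvironmentAdapter):
    """Toy adapter (hypothetical): the episode ends once the counter reaches 3."""

    def __init__(self) -> None:
        self._t = 0

    def reset(self, seed: Optional[int] = None) -> int:
        self._t = 0
        return self._t

    def step(self, action: int) -> Tuple[int, float, bool]:
        self._t += action
        return self._t, float(action), self._t >= 3

    def render(self) -> str:
        return f"t={self._t}"


# Factory: maps an environment family name to a callable that builds its adapter.
_REGISTRY: Dict[str, Callable[[], EnvironmentAdapter]] = {}


def register_adapter(family: str, factory: Callable[[], EnvironmentAdapter]) -> None:
    _REGISTRY[family] = factory


def create_adapter(family: str) -> EnvironmentAdapter:
    try:
        factory = _REGISTRY[family]
    except KeyError:
        raise ValueError(f"no adapter registered for family {family!r}") from None
    return factory()


register_adapter("counter", CounterAdapter)
env = create_adapter("counter")
env.reset(seed=0)
obs, reward, done = env.step(1)
```

Because callers only see the base-class methods, supporting a new environment family comes down to writing one subclass and registering it; the code that drives the evaluation loop does not need to change.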

Worker Layer

External training and inference backends communicate via gRPC/IPC:

  • CleanRL: Single-agent RL (PPO, DQN, SAC, TD3, DDPG, C51, Rainbow)

  • XuanCe: Multi-agent RL (MAPPO, QMIX, MADDPG, VDN, COMA)

  • RLlib: Distributed RL with Ray (PPO, IMPALA, APPO)

  • BALROG: Single-agent LLM benchmarking (MiniGrid, BabyAI, MiniHack, Crafter)

  • MOSAIC LLM: Multi-agent LLM with coordination strategies and Theory of Mind

  • Chess LLM: LLM chess play with multi-turn dialog

Fast Lane

The Fast Lane provides real-time frame delivery from workers to the GUI via a shared-memory SPSC ring buffer, bypassing the gRPC/SQLite slow lane for rendering. Key components:

  • FastLaneWriter / FastLaneReader: Shared-memory ring buffer with seqlock semantics

  • FastLaneConsumer: Qt-side poller that converts shared-memory frames to QImage

  • tile_frames(): Composites vectorized environment frames into a single image

  • worker_helpers: Injects fast-lane environment variables when launching worker subprocesses

See the Overview section for the full rendering architecture.
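The seqlock semantics mentioned above can be illustrated with a single-slot sketch. It is a simplification and not the FastLaneWriter / FastLaneReader implementation: it uses an in-process `bytearray` instead of real shared memory, holds one slot rather than a ring, and the `SeqlockSlot` class with its header layout is hypothetical. A real cross-process version would also need to consider memory ordering, which this sketch ignores.

```python
import struct
from typing import Optional, Tuple

# Assumed layout: an 8-byte little-endian sequence counter, then the payload.
HEADER = struct.Struct("<Q")


class SeqlockSlot:
    """Single-writer slot guarded by a sequence counter (seqlock idea)."""

    def __init__(self, payload_size: int) -> None:
        self.payload_size = payload_size
        self.buf = bytearray(HEADER.size + payload_size)

    def _seq(self) -> int:
        return HEADER.unpack_from(self.buf, 0)[0]

    def write(self, payload: bytes) -> None:
        if len(payload) != self.payload_size:
            raise ValueError("payload size mismatch")
        seq = self._seq()
        HEADER.pack_into(self.buf, 0, seq + 1)  # odd: write in progress
        self.buf[HEADER.size:] = payload
        HEADER.pack_into(self.buf, 0, seq + 2)  # even: payload is consistent

    def read(self, max_retries: int = 100) -> Optional[Tuple[int, bytes]]:
        for _ in range(max_retries):
            before = self._seq()
            if before % 2:  # writer is mid-update, retry
                continue
            payload = bytes(self.buf[HEADER.size:])
            if self._seq() == before:  # counter unchanged: the copy is not torn
                return before // 2, payload
        return None  # gave up; the caller can keep showing the previous frame


slot = SeqlockSlot(payload_size=4)
slot.write(b"\x00\x01\x02\x03")
frame_index, frame = slot.read()
```

The reader never blocks the writer: it copies optimistically and retries only when the counter shows that a write overlapped the copy, which suits a GUI that simply wants the most recent frame.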