Changelog

All notable changes to MOSAIC will be documented here.

[1.0.0] - 2026-02-24

Added

Operator Abstraction: Unified agent-level interface for RL, LLM, VLM, and Human decision-makers
IPC Worker Protocol: stdin/stdout JSON protocol making any worker interchangeable
Two Evaluation Modes: Manual Mode (lock-step side-by-side) and Script Mode (automated batch)
8 Integrated Workers: CleanRL, XuanCe, Ray RLlib, BALROG, MOSAIC LLM, Chess LLM, Human Worker, Random Worker
26 Environment Families: including Gymnasium, Atari, MiniGrid, BabyAI, ViZDoom, NetHack, Crafter, Procgen, BabaIsAI, Jumanji, PyBullet Drones, PettingZoo, MOSAIC MultiGrid, INI MultiGrid, Melting Pot, Overcooked, SMAC, SMACv2, RWARE, and MuJoCo
Heterogeneous Decision-Maker: Mix RL, LLM, Human, and Random agents in the same multi-agent environment
Homogeneous Decision-Maker: Deploy teams that share a single paradigm (all-RL, all-LLM, or all-Human)
Multi-Keyboard Support: Linux evdev-based per-keyboard routing for multi-human play
Deterministic Cross-Paradigm Evaluation: Shared seed schedules for reproducible comparison
Script Mode Configs: Declarative Python scripts for automated batch evaluation
Curriculum Training: CleanRL DoorKey progression with environment wrappers
XuanCe Solo Training Configs: Soccer and Basketball (blue/green teams)
Visual-First GUI: PyQt6 interface for experiment configuration, rendering, and telemetry
Resource Management: GPU allocation, queue limits, health monitoring
Per-Agent Policy Binding: PolicyMappingService for routing agents to workers
Link Groups for Multi-Agent RL: Flexible one-to-one and one-to-many policy mappings for MAPPO/IPPO evaluation. Link groups prevent manual copy-paste errors, update policy paths automatically, and support complex team configurations with multiple independent policy groups
Runtime Logging: JSONL telemetry per step and episode
Sphinx Documentation: Full docs with video demos, architecture diagrams, and API reference
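
As a rough illustration of the stdin/stdout JSON idea behind the IPC Worker Protocol entry above, the sketch below shows a random-action worker that reads one JSON object per line and writes one JSON reply per line. The message fields (`type`, `agent_id`, `legal_actions`, `action`) and the function names are hypothetical, chosen only for this example; they are not MOSAIC's actual schema or API, so consult the API reference for the real message format.

```python
import io
import json
import random


def handle_message(message, rng):
    """Build a reply for one request. The schema here is hypothetical."""
    if message.get("type") == "step":
        # Pick uniformly among the actions the host reports as legal.
        action = rng.choice(message["legal_actions"])
        return {"type": "action", "agent_id": message["agent_id"], "action": action}
    return {"type": "error", "reason": "unknown message type"}


def run(in_stream, out_stream, seed=0):
    """Read one JSON object per line and write one JSON reply per line."""
    rng = random.Random(seed)  # seeded so repeated runs choose the same actions
    for line in in_stream:
        line = line.strip()
        if not line:
            continue  # ignore blank lines between messages
        reply = handle_message(json.loads(line), rng)
        out_stream.write(json.dumps(reply) + "\n")
        out_stream.flush()  # the host is assumed to wait on each reply


# In-memory demonstration; a real worker would call run(sys.stdin, sys.stdout).
requests = io.StringIO(
    '{"type": "step", "agent_id": "agent_0", "legal_actions": [0, 1, 2]}\n'
)
replies = io.StringIO()
run(requests, replies, seed=42)
print(replies.getvalue(), end="")
```

Because the worker only touches two text streams, the host process can swap it for any other program that speaks the same line-delimited JSON, which is the interchangeability the entry describes. Seeding the generator also mirrors the shared-seed idea behind reproducible comparison, though how MOSAIC distributes seeds is not specified here.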