What Is a Worker?¶
Workers solve a fundamental problem in multi-framework RL research: how do you run CleanRL, Ray RLlib, XuanCe, and LLM agents in the same platform without them conflicting?
The Problem¶
RL libraries are designed to run standalone. They each have their own:
Dependency trees (often conflicting versions of NumPy, PyTorch, etc.)
Logging systems (TensorBoard, Weights & Biases, custom loggers)
Configuration formats (YAML, JSON, CLI args, Python dicts)
GPU management strategies
Error handling and signal traps
Running them inside a single process leads to import conflicts, segfaults, and unpredictable behaviour.
graph LR
subgraph "❌ Single-Process (Fragile)"
APP["Monolithic App"]
APP --> CRL["CleanRL"]
APP --> RAY["Ray RLlib"]
APP --> XUA["XuanCe"]
CRL -.->|"torch 2.0"| CONFLICT["⚠️ Version Conflict"]
RAY -.->|"torch 2.1"| CONFLICT
end
style CONFLICT fill:#ff4444,stroke:#cc0000,color:#fff
style APP fill:#ddd,stroke:#999
The Solution: Process Isolation¶
MOSAIC runs every RL library in its own OS process. The only communication channel is a simple, well-defined IPC protocol.
graph TB
subgraph "✅ Multi-Process (Robust)"
DAEMON["Trainer Daemon"]
DAEMON -->|"spawn"| P1["Process 1<br/>CleanRL + torch 2.0"]
DAEMON -->|"spawn"| P2["Process 2<br/>Ray RLlib + torch 2.1"]
DAEMON -->|"spawn"| P3["Process 3<br/>XuanCe + torch 2.0"]
P1 -->|"JSONL stdout"| DAEMON
P2 -->|"JSONL stdout"| DAEMON
P3 -->|"JSONL stdout"| DAEMON
end
style DAEMON fill:#50c878,stroke:#2e8b57,color:#fff
style P1 fill:#ff7f50,stroke:#cc5500,color:#fff
style P2 fill:#ff7f50,stroke:#cc5500,color:#fff
style P3 fill:#ff7f50,stroke:#cc5500,color:#fff
This gives us:
Fault containment: a crashed worker cannot freeze the GUI or kill other workers
Dependency freedom: each worker can pin its own library versions
GPU isolation:
CUDA_VISIBLE_DEVICESis set per-processClean shutdown:
SIGTERMto the process group kills the entire worker tree
Worker vs Operator¶
MOSAIC distinguishes two concepts that are easy to confuse:
Concept |
Definition |
Examples |
|---|---|---|
Worker |
A process-level execution unit that handles training, API
calls, or bash script for custom workflows. Workers live in |
|
Operator |
An agent-level interface used strictly for evaluation.
The operator assigns workers to agents, loads trained policies
(or connects to LLM endpoints), and exposes
|
RL Operator, LLM Operator, Human Operator |
Workers handle training separately from operators. An RL worker
trains a policy and saves a checkpoint; an LLM worker connects to a
vLLM instance. The operator
then takes the result (trained policy or LLM endpoint) and assigns it
to an agent for evaluation via select_action().
%%{init: {"flowchart": {"curve": "linear"}} }%%
graph TB
subgraph Training["Training Phase (Worker)"]
WORKER["CleanRL Worker"]
WORKER -->|"produces"| CKPT["Checkpoint"]
end
subgraph Eval["Evaluation Phase (Operator)"]
OP["RL Operator"]
SA["select_action(obs)"]
OP --> SA
end
CKPT -->|"loaded by"| OP
style Training fill:#ff7f50,stroke:#cc5500,color:#fff
style Eval fill:#4a90d9,stroke:#2e5a87,color:#fff
style CKPT fill:#f0e68c,stroke:#bdb76b
The Shim Pattern¶
Workers never modify the upstream library. Instead, a thin shim layer sits between MOSAIC and the library, translating everything:
Benefits:
Upstream libraries can be updated independently
The shim is the only code that needs testing against MOSAIC
Adding a new worker means writing a new shim, not forking a library
Protocol changes only affect the shim layer
Directory Layout¶
Every worker follows the same structure:
3rd_party/my_worker/
├── pyproject.toml # Package metadata + entry point
├── README.md
├── my_worker/ # MOSAIC shim layer
│ ├── __init__.py # get_worker_metadata()
│ ├── config.py # WorkerConfig implementation
│ ├── runtime.py # Training orchestration
│ ├── analytics.py # Analytics manifest generation
│ ├── telemetry.py # JSONL lifecycle events
│ ├── fastlane.py # Shared-memory rendering
│ └── cli.py # CLI entry point
├── upstream_lib/ # Vendored or pip-installed
└── tests/
└── test_standardization.py # Protocol compliance tests