Services API

Service classes providing business logic.

OperatorConfig

class gym_gui.services.operator.OperatorConfig(operator_id: str, display_name: str, env_name: str = 'babyai', task: str = 'BabyAI-GoToRedBall-v0', workers: Dict[str, WorkerAssignment] = <factory>, run_id: str | None = None, execution_mode: str = 'aec', max_steps: int | None = None, observation_mode: str = 'visible_teammates', coordination_level: int = 1, view_size: int | None = None)

Configuration for a single operator instance in multi-operator mode.

This dataclass holds all the information needed to configure and run an operator (LLM or RL worker) in the multi-operator comparison view.

An Operator binds one or more workers to a single environment:

- Single-agent envs (babyai, minigrid): 1 worker → 1 environment
- Multi-agent envs (chess, connect4): N workers → 1 environment

operator_id

Unique ID for this operator instance (e.g., “operator_0”).

Type: str

display_name

User-visible name (e.g., “GPT-4 LLM”, “Chess Match”).

Type: str

env_name

Environment family (e.g., “babyai”, “minigrid”, “pettingzoo”).

Type: str

task

Task/level within the environment (e.g., “BabyAI-GoToRedBall-v0”, “chess_v6”).

Type: str

workers

Dict mapping player_id to WorkerAssignment.

- Single-agent: {"agent": WorkerAssignment(…)}
- Multi-agent: {"player_0": WorkerAssignment(…), "player_1": WorkerAssignment(…)}

Type: Dict[str, gym_gui.services.operator.WorkerAssignment]

run_id

Assigned run ID when operator is started (for telemetry routing).

Type: str | None

Factory Methods: Use OperatorConfig.single_agent() for single-agent environments. Use OperatorConfig.multi_agent() for multi-agent environments (chess, Go, etc.).

Backwards Compatibility: The properties operator_type, worker_id, and settings read from workers["agent"] for compatibility with existing code that expects single-worker operators.
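The factory method and the backwards-compatibility properties can be sketched with stand-in dataclasses. This is only an illustration under stated assumptions: the classes below are simplified stand-ins that mirror the documented field names, not the code in gym_gui.services.operator, the model_id value is hypothetical, and the sketch assumes at least one worker is assigned.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class WorkerAssignment:
    """Stand-in mirroring the documented WorkerAssignment fields."""
    worker_id: str
    worker_type: str
    settings: Dict[str, Any] = field(default_factory=dict)


@dataclass
class OperatorConfig:
    """Stand-in holding only the fields this sketch needs."""
    operator_id: str
    display_name: str
    env_name: str = "babyai"
    task: str = "BabyAI-GoToRedBall-v0"
    workers: Dict[str, WorkerAssignment] = field(default_factory=dict)

    @classmethod
    def single_agent(
        cls,
        operator_id: str,
        display_name: str,
        worker_id: str,
        worker_type: str,
        env_name: str = "babyai",
        task: str = "BabyAI-GoToRedBall-v0",
        settings: Optional[Dict[str, Any]] = None,
    ) -> "OperatorConfig":
        # Documented behavior: the single worker is assigned to the "agent" slot.
        worker = WorkerAssignment(worker_id, worker_type, dict(settings or {}))
        return cls(operator_id, display_name, env_name, task, {"agent": worker})

    @property
    def operator_type(self) -> str:
        # Documented behavior: 'multiagent' when several workers are assigned,
        # otherwise the worker_type of the first worker.
        if len(self.workers) > 1:
            return "multiagent"
        return next(iter(self.workers.values())).worker_type

    @property
    def worker_id(self) -> str:
        # Documented behavior: the worker_id of the first worker.
        return next(iter(self.workers.values())).worker_id


config = OperatorConfig.single_agent(
    operator_id="operator_0",
    display_name="GPT-4 LLM",
    worker_id="balrog_worker",
    worker_type="llm",
    settings={"model_id": "example-model"},  # hypothetical value
)
```

Code written against the older single-worker shape can keep reading config.operator_type and config.worker_id, while newer code inspects config.workers directly.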

coordination_level: int = 1

display_name: str

env_name: str = 'babyai'

execution_mode: str = 'aec'

get_ai_agents() → list[str]

Get list of agent IDs that are controlled by AI (LLM/RL).

Returns: List of agent IDs where worker_type != "human".

get_human_agents() → list[str]

Get list of agent IDs that are controlled by humans.

Returns: List of agent IDs where worker_type == "human".

get_worker_for_player(player_id: str) → WorkerAssignment | None

Get the worker assignment for a specific player.

Parameters: player_id – The player ID (e.g., “player_0”, “agent”).
Returns: WorkerAssignment for that player, or None if not found.

has_human_agents() → bool

Check if this operator has any human-controlled agents.

Returns: True if at least one agent is human-controlled.

property is_multiagent: bool: Check if this operator has multiple workers (multi-agent environment).

is_parallel_multiagent() → bool

Check if this is a parallel/simultaneous multi-agent environment.

Returns: True if execution_mode is “parallel” and there are multiple agents.
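The query methods above reduce to simple filters over the workers dict. The following sketch restates the documented rules as free functions; the Worker class and the function bodies are assumptions written for illustration, not the library's implementation.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Worker:
    """Hypothetical stand-in carrying only the field the queries inspect."""
    worker_type: str


def get_ai_agents(workers: Dict[str, Worker]) -> List[str]:
    # Documented rule: agents whose worker_type is not "human".
    return [pid for pid, w in workers.items() if w.worker_type != "human"]


def get_human_agents(workers: Dict[str, Worker]) -> List[str]:
    # Documented rule: agents whose worker_type is "human".
    return [pid for pid, w in workers.items() if w.worker_type == "human"]


def is_parallel_multiagent(workers: Dict[str, Worker], execution_mode: str) -> bool:
    # Documented rule: "parallel" execution mode and more than one agent.
    return execution_mode == "parallel" and len(workers) > 1


workers = {"player_0": Worker("human"), "player_1": Worker("rl")}
ai_agents = get_ai_agents(workers)
human_agents = get_human_agents(workers)
turn_based = is_parallel_multiagent(workers, "aec")
simultaneous = is_parallel_multiagent(workers, "parallel")
```

With one human and one RL worker, the two lists partition the player IDs, and only the "parallel" execution mode counts as a parallel multi-agent setup.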

max_steps: int | None = None

classmethod multi_agent(operator_id: str, display_name: str, env_name: str, task: str, player_workers: Dict[str, WorkerAssignment], execution_mode: str = 'aec', observation_mode: str = 'visible_teammates', coordination_level: int = 1, max_steps: int | None = None, view_size: int | None = None) → OperatorConfig

Create a multi-agent operator config.

Parameters:

operator_id – Unique operator ID.
display_name – Display name for UI.
env_name – Environment family (e.g., “pettingzoo”).
task – Specific task (e.g., “chess_v6”).
player_workers – Dict mapping player_id to WorkerAssignment.
execution_mode – Execution paradigm - “aec” (turn-based) or “parallel” (simultaneous).
observation_mode – Observation mode for MultiGrid - “egocentric” or “visible_teammates”.
coordination_level – Coordination strategy level (1=Emergent, 2=Basic Hints, 3=Role-Based).
max_steps – Maximum steps per episode before truncation.
view_size – Agent view size for MOSAIC (None = env default of 3).

Returns:

OperatorConfig with multiple workers for multi-agent env.

observation_mode: str = 'visible_teammates'

operator_id: str

property operator_type: str

Get operator type (for backwards compatibility).

Returns the worker_type of the first worker (single-agent mode) or ‘multiagent’ if multiple workers are assigned.

property player_ids: list[str]: Get list of player IDs this operator manages.

run_id: str | None = None

property settings: Dict[str, Any]

Get worker settings (for backwards compatibility).

Returns the settings of the first worker.

classmethod single_agent(operator_id: str, display_name: str, worker_id: str, worker_type: str, env_name: str = 'babyai', task: str = 'BabyAI-GoToRedBall-v0', settings: Dict[str, Any] | None = None, max_steps: int | None = None) → OperatorConfig

Create a single-agent operator config.

Parameters:

operator_id – Unique operator ID.
display_name – Display name for UI.
worker_id – Worker to use (e.g., “balrog_worker”).
worker_type – Type of worker (“llm”, “vlm”, “rl”, “human”).
env_name – Environment family.
task – Specific task/level.
settings – Worker-specific settings.
max_steps – Maximum steps per episode before truncation.

Returns:

OperatorConfig with single worker assigned to “agent”.

task: str = 'BabyAI-GoToRedBall-v0'

view_size: int | None = None

with_run_id(run_id: str) → OperatorConfig: Return a copy of this config with the run_id set.
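One common way to return a modified copy of a dataclass is dataclasses.replace. The sketch below uses it on a two-field stand-in; whether the real with_run_id is built this way is an assumption, and the run ID string is hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class OperatorConfig:
    """Stand-in with two of the documented fields."""
    operator_id: str
    run_id: Optional[str] = None

    def with_run_id(self, run_id: str) -> "OperatorConfig":
        # dataclasses.replace builds a new instance; the original is untouched.
        return replace(self, run_id=run_id)


pending = OperatorConfig(operator_id="operator_0")
started = pending.with_run_id("run-001")  # hypothetical run ID
```

Because the original config is left unchanged, the same pending config can be reused to start further runs with different run IDs.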

property worker_id: str

Get worker ID (for backwards compatibility).

Returns the worker_id of the first worker.

workers: Dict[str, WorkerAssignment]

WorkerAssignment

class gym_gui.services.operator.WorkerAssignment(worker_id: str, worker_type: str, settings: Dict[str, Any] = <factory>)

Configuration for a single worker assigned to a player in an operator.

This dataclass holds worker-specific settings for one player/agent slot within an operator. For single-agent environments, there’s one WorkerAssignment. For multi-agent environments (e.g., chess), there’s one per player.

worker_id

References WorkerDefinition (e.g., “balrog_worker”, “cleanrl_worker”).

Type: str

worker_type

Type of worker - “llm”, “vlm”, “rl”, “human”, or “baseline”.

Type: str

settings

Worker-specific settings (client_name, model_id, api_key, etc.).

Type: Dict[str, Any]

settings: Dict[str, Any]

worker_id: str

worker_type: str
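The `<factory>` default shown for settings in the class signature indicates a dataclass default factory, so each instance gets its own settings dict. A minimal sketch with a stand-in class (the field names follow the documentation; the model_id value is hypothetical):

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class WorkerAssignment:
    """Stand-in mirroring the documented fields."""
    worker_id: str
    worker_type: str
    # default_factory creates a fresh dict per instance.
    settings: Dict[str, Any] = field(default_factory=dict)


white = WorkerAssignment("balrog_worker", "llm")
black = WorkerAssignment("cleanrl_worker", "rl")

# Mutating one worker's settings leaves the other untouched.
white.settings["model_id"] = "example-model"  # hypothetical value
```

This matters in multi-agent setups such as chess, where each player slot usually needs different worker settings.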

OperatorLauncher

OperatorScriptExecutionManager

class gym_gui.services.operator_script_execution_manager.OperatorScriptExecutionManager(*args: Any, **kwargs: Any)

Manages automated execution of baseline operator experiments from scripts.

This class implements an event-driven state machine for running multiple episodes with different seeds automatically. It handles:

- Launching operators with initial seeds
- Stepping operators until episode completion
- Advancing to next episode with new seed
- Tracking progress and emitting updates

Signals:

- launch_operator: Request to launch operator subprocess
- reset_operator: Request to reset operator with new seed
- step_operator: Request to step operator
- stop_operator: Request to stop operator
- progress_updated: Progress notification for UI
- experiment_completed: Experiment finished notification
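The launch, step, and reset cycle described above can be sketched without Qt by replacing signals with a plain callback. Everything in this block is an assumption made for illustration: the EpisodeRunner class, the emit callback and its payloads, the single-operator scope, and the omission of the pacing delay. It also assumes a non-empty seed list.

```python
from typing import Callable, List


class EpisodeRunner:
    """Hypothetical stand-in: a plain callback replaces the documented signals."""

    def __init__(self, emit: Callable[[str, str, object], None]) -> None:
        self._emit = emit            # emit(signal_name, operator_id, payload)
        self._seeds: List[int] = []
        self._episode = 0            # 1-indexed once an experiment is running
        self.is_running = False

    def start_experiment(self, operator_id: str, seeds: List[int]) -> None:
        self._seeds = list(seeds)
        self._episode = 1
        self.is_running = True
        self._emit("launch_operator", operator_id, self._seeds[0])

    def on_ready_received(self, operator_id: str) -> None:
        # Ready after launch or reset triggers the first step of the episode.
        if self.is_running:
            self._emit("step_operator", operator_id, None)

    def on_step_received(self, operator_id: str) -> None:
        # The documented manager waits for a pacing delay here; this sketch does not.
        if self.is_running:
            self._emit("step_operator", operator_id, None)

    def on_episode_ended(self, operator_id: str) -> None:
        if not self.is_running:
            return
        if self._episode >= len(self._seeds):
            # Last episode finished: report completion and stop.
            self.is_running = False
            self._emit("experiment_completed", operator_id, self._episode)
            return
        # Otherwise advance to the next episode with the next seed.
        self._episode += 1
        self._emit("reset_operator", operator_id, self._seeds[self._episode - 1])


events = []
runner = EpisodeRunner(lambda name, op, payload: events.append((name, payload)))
runner.start_experiment("operator_0", seeds=[11, 22])
runner.on_ready_received("operator_0")   # launch acknowledged, first step requested
runner.on_step_received("operator_0")    # one step done, next step requested
runner.on_episode_ended("operator_0")    # episode 1 over, reset with seed 22
runner.on_ready_received("operator_0")   # reset acknowledged, stepping resumes
runner.on_episode_ended("operator_0")    # episode 2 over, experiment completes
```

The runner never loops on its own: each handler reacts to one response from the operator and requests the next action, which is what keeps the state machine event-driven.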

property current_episode: int: Get current episode number (1-indexed).

experiment_completed: alias of int

property is_running: bool: Check if experiment is currently running.

launch_operator: alias of str

on_episode_ended(operator_id: str, terminated: bool, truncated: bool) → None

Handle episode end from operator.

This advances to the next episode or completes the experiment.

Parameters:

operator_id – ID of the operator that finished an episode.
terminated – Whether episode terminated naturally.
truncated – Whether episode was truncated.

on_ready_received(operator_id: str) → None

Handle ready response from operator (after reset/launch).

This triggers automatic stepping.

Parameters: operator_id – ID of the operator that is ready.

on_step_received(operator_id: str) → None

Handle step response from operator.

Triggers the next step after a pacing delay, giving Qt’s event loop time to process paint events so the render view updates each frame.

Parameters: operator_id – ID of the operator that completed a step.

progress_updated: alias of int

reset_operator: alias of str

start_experiment(operator_configs: List[OperatorConfig], execution_config: Dict[str, Any]) → None

Start automated experiment execution.

Parameters:

operator_configs – List of operator configurations from script.
execution_config – Execution settings (num_episodes, seeds).

property step_delay_ms: int: Get step pacing delay in milliseconds.

step_operator: alias of str

stop_experiment() → None: Stop running experiment.

stop_operator: alias of str

property total_episodes: int: Get total number of episodes.

PolicyMappingService

class gym_gui.services.policy_mapping.PolicyMappingService(actor_service: ActorService)

Per-agent policy mapping for multi-agent environments.

Extends ActorService to support:

1. Multiple active policies (one per agent)
2. Paradigm-aware action selection
3. Worker-specific routing

For single-agent environments, delegates to ActorService. For multi-agent, maintains agent_id → policy_id mapping.

Example

>>> from gym_gui.services.actor import ActorService
>>> from gym_gui.services.policy_mapping import PolicyMappingService
>>> # HumanKeyboardActor, CleanRLWorkerActor, and SteppingParadigm must also be imported.
>>> actor_service = ActorService()
>>> actor_service.register_actor(HumanKeyboardActor(), activate=True)
>>> actor_service.register_actor(CleanRLWorkerActor())
>>>
>>> mapping = PolicyMappingService(actor_service)
>>> mapping.set_paradigm(SteppingParadigm.SEQUENTIAL)
>>> mapping.set_agents(["player_0", "player_1"])
>>> mapping.bind_agent_policy("player_0", "human_keyboard")
>>> mapping.bind_agent_policy("player_1", "cleanrl_worker")
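The agent_id → policy_id mapping can be illustrated with a small self-contained stand-in. The PolicyMapping class, its select_action method, the Policy callable shape, and the "pressed_key" observation key are all hypothetical; only the name bind_agent_policy and the policy IDs are taken from the example above.

```python
from typing import Callable, Dict, Optional

# Hypothetical policy shape: observation dict in, action (or None) out.
Policy = Callable[[dict], Optional[int]]


class PolicyMapping:
    """Hypothetical stand-in for per-agent policy routing."""

    def __init__(self, policies: Dict[str, Policy], default_policy_id: str) -> None:
        self._policies = policies
        self._default = default_policy_id
        self._bindings: Dict[str, str] = {}   # agent_id -> policy_id

    def bind_agent_policy(self, agent_id: str, policy_id: str) -> None:
        if policy_id not in self._policies:
            raise KeyError(f"unknown policy: {policy_id}")
        self._bindings[agent_id] = policy_id

    def select_action(self, agent_id: str, observation: dict) -> Optional[int]:
        # Unbound agents fall back to the default policy, loosely mirroring the
        # documented delegation to ActorService for single-agent environments.
        policy_id = self._bindings.get(agent_id, self._default)
        return self._policies[policy_id](observation)


policies: Dict[str, Policy] = {
    "human_keyboard": lambda obs: obs.get("pressed_key"),
    "cleanrl_worker": lambda obs: 0,
}
mapping = PolicyMapping(policies, default_policy_id="human_keyboard")
mapping.bind_agent_policy("player_1", "cleanrl_worker")

human_action = mapping.select_action("player_0", {"pressed_key": 3})
rl_action = mapping.select_action("player_1", {"pressed_key": 3})
```

Here player_0 has no explicit binding and is routed to the default policy, while player_1 is routed to the policy it was bound to.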

ActorService

class gym_gui.services.actor.ActorService

Registry that coordinates active actors for the current session.

available_actor_ids() → Iterable[str]

describe_actors() → tuple[ActorDescriptor, ...]: Return metadata for all registered actors in registration order.

get_active_actor() → Actor | None

get_active_actor_id() → str | None

get_actor_descriptor(actor_id: str) → ActorDescriptor | None

property last_seed: int | None

notify_episode_end(summary: EpisodeSummary) → None

notify_step(snapshot: StepSnapshot) → None

register_actor(actor: Actor, *, display_name: str | None = None, description: str | None = None, policy_label: str | None = None, backend_label: str | None = None, activate: bool = False) → None

seed(seed: int) → None: Propagate a deterministic seed to all registered actors.

select_action(snapshot: StepSnapshot) → int | None

set_active_actor(actor_id: str) → None
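The registry role can be sketched as a dict of actors plus one active ID. The ActorRegistry and ConstantActor classes below are stand-ins written for this illustration; in particular, the actor attributes id, seed, and select_action, and the use of a plain dict as the snapshot, are assumptions rather than the documented Actor interface.

```python
from typing import Dict, Optional


class ConstantActor:
    """Hypothetical actor that always returns the same action."""

    def __init__(self, actor_id: str, action: int) -> None:
        self.id = actor_id
        self.action = action
        self.last_seed: Optional[int] = None

    def seed(self, seed: int) -> None:
        self.last_seed = seed

    def select_action(self, snapshot: dict) -> Optional[int]:
        return self.action


class ActorRegistry:
    """Stand-in for the registry role described above."""

    def __init__(self) -> None:
        self._actors: Dict[str, ConstantActor] = {}
        self._active_id: Optional[str] = None

    def register_actor(self, actor: ConstantActor, *, activate: bool = False) -> None:
        self._actors[actor.id] = actor
        if activate:
            self._active_id = actor.id

    def set_active_actor(self, actor_id: str) -> None:
        if actor_id not in self._actors:
            raise KeyError(f"unknown actor: {actor_id}")
        self._active_id = actor_id

    def seed(self, seed: int) -> None:
        # Propagate the same seed to every registered actor.
        for actor in self._actors.values():
            actor.seed(seed)

    def select_action(self, snapshot: dict) -> Optional[int]:
        # With no active actor there is nothing to ask, so return None.
        if self._active_id is None:
            return None
        return self._actors[self._active_id].select_action(snapshot)


registry = ActorRegistry()
left = ConstantActor("left", 0)
right = ConstantActor("right", 1)
registry.register_actor(left, activate=True)
registry.register_actor(right)
registry.seed(42)

first = registry.select_action({})      # asks "left"
registry.set_active_actor("right")
second = registry.select_action({})     # asks "right"
```

Only the active actor is consulted for actions, while seeding reaches every registered actor, matching the description of seed() above.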

AgentPolicyBinding

class gym_gui.services.policy_mapping.AgentPolicyBinding(agent_id: str, policy_id: str, worker_id: str | None = None, config: Dict[str, Any] = <factory>)

Binding between an agent and its policy controller.

agent_id

Unique identifier for the agent in the environment.

Type: str

policy_id

References an Actor registered in ActorService.

Type: str

worker_id

Optional worker identifier (e.g., “cleanrl_worker”, “llm_worker”).

Type: str | None

config

Worker-specific configuration options.

Type: Dict[str, Any]

agent_id: str

config: Dict[str, Any]

policy_id: str

worker_id: str | None = None