Services API

Service classes providing business logic.

OperatorConfig

class gym_gui.services.operator.OperatorConfig(operator_id: str, display_name: str, env_name: str = 'babyai', task: str = 'BabyAI-GoToRedBall-v0', workers: Dict[str, WorkerAssignment] = <factory>, run_id: str | None = None, execution_mode: str = 'aec', max_steps: int | None = None, observation_mode: str = 'visible_teammates', coordination_level: int = 1, view_size: int | None = None)

Configuration for a single operator instance in multi-operator mode.

This dataclass holds all the information needed to configure and run an operator (LLM or RL worker) in the multi-operator comparison view.

An Operator binds one or more workers to a single environment:

  • Single-agent envs (babyai, minigrid): 1 worker → 1 environment

  • Multi-agent envs (chess, connect4): N workers → 1 environment

operator_id

Unique ID for this operator instance (e.g., “operator_0”).

Type:

str

display_name

User-visible name (e.g., “GPT-4 LLM”, “Chess Match”).

Type:

str

env_name

Environment family (e.g., “babyai”, “minigrid”, “pettingzoo”).

Type:

str

task

Task/level within the environment (e.g., “BabyAI-GoToRedBall-v0”, “chess_v6”).

Type:

str

workers

Dict mapping player_id to WorkerAssignment.

  • Single-agent: {“agent”: WorkerAssignment(…)}

  • Multi-agent: {“player_0”: WorkerAssignment(…), “player_1”: WorkerAssignment(…)}

Type:

Dict[str, gym_gui.services.operator.WorkerAssignment]

run_id

Assigned run ID when operator is started (for telemetry routing).

Type:

str | None

Factory Methods:

Use OperatorConfig.single_agent() for single-agent environments. Use OperatorConfig.multi_agent() for multi-agent environments (chess, Go, etc.).

Backwards Compatibility:

Properties operator_type, worker_id, settings read from workers[“agent”] for compatibility with existing code that expects single-worker operators.

coordination_level: int = 1
display_name: str
env_name: str = 'babyai'
execution_mode: str = 'aec'
get_ai_agents() → list[str]

Get list of agent IDs that are controlled by AI (LLM/RL).

Returns:

List of agent IDs where worker_type != “human”.

get_human_agents() → list[str]

Get list of agent IDs that are controlled by humans.

Returns:

List of agent IDs where worker_type == “human”.

get_worker_for_player(player_id: str) → WorkerAssignment | None

Get the worker assignment for a specific player.

Parameters:

player_id – The player ID (e.g., “player_0”, “agent”).

Returns:

WorkerAssignment for that player, or None if not found.

has_human_agents() → bool

Check if this operator has any human-controlled agents.

Returns:

True if at least one agent is human-controlled.
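The three human/AI helpers above amount to filtering the workers mapping by worker_type. The following is a minimal sketch of that idea, not the library's code: `Worker` is a hypothetical stand-in for WorkerAssignment, and the worker ID "human_worker" is invented for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Worker:
    """Hypothetical stand-in for WorkerAssignment (not the real class)."""
    worker_id: str
    worker_type: str
    settings: Dict[str, Any] = field(default_factory=dict)


def human_agents(workers: Dict[str, Worker]) -> List[str]:
    # Agent IDs whose worker_type is "human".
    return [pid for pid, w in workers.items() if w.worker_type == "human"]


def ai_agents(workers: Dict[str, Worker]) -> List[str]:
    # Agent IDs whose worker_type is anything other than "human".
    return [pid for pid, w in workers.items() if w.worker_type != "human"]


workers = {
    "player_0": Worker("human_worker", "human"),  # hypothetical worker_id
    "player_1": Worker("balrog_worker", "llm"),
}
humans = human_agents(workers)
ais = ai_agents(workers)
has_humans = bool(humans)  # mirrors the has_human_agents() description
```

Every agent lands in exactly one of the two lists, because the two predicates are complements of each other.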

property is_multiagent: bool

Check if this operator has multiple workers (multi-agent environment).

is_parallel_multiagent() → bool

Check if this is a parallel/simultaneous multi-agent environment.

Returns:

True if execution_mode is “parallel” and there are multiple agents.
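The condition described above combines two checks. A small sketch, written as a free function with hypothetical arguments rather than as the real method:

```python
from typing import List


def is_parallel_multiagent(execution_mode: str, player_ids: List[str]) -> bool:
    # Simultaneous stepping only counts when more than one agent is present.
    return execution_mode == "parallel" and len(player_ids) > 1


turn_based = is_parallel_multiagent("aec", ["player_0", "player_1"])
simultaneous = is_parallel_multiagent("parallel", ["player_0", "player_1"])
lone_agent = is_parallel_multiagent("parallel", ["agent"])
```

Here only `simultaneous` is True: the turn-based case fails the mode check and the lone agent fails the count check.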

max_steps: int | None = None
classmethod multi_agent(operator_id: str, display_name: str, env_name: str, task: str, player_workers: Dict[str, WorkerAssignment], execution_mode: str = 'aec', observation_mode: str = 'visible_teammates', coordination_level: int = 1, max_steps: int | None = None, view_size: int | None = None) → OperatorConfig

Create a multi-agent operator config.

Parameters:
  • operator_id – Unique operator ID.

  • display_name – Display name for UI.

  • env_name – Environment family (e.g., “pettingzoo”).

  • task – Specific task (e.g., “chess_v6”).

  • player_workers – Dict mapping player_id to WorkerAssignment.

  • execution_mode – Execution paradigm - “aec” (turn-based) or “parallel” (simultaneous).

  • observation_mode – Observation mode for MultiGrid - “egocentric” or “visible_teammates”.

  • coordination_level – Coordination strategy level (1=Emergent, 2=Basic Hints, 3=Role-Based).

  • max_steps – Maximum steps per episode before truncation.

  • view_size – Agent view size for MOSAIC (None = env default of 3).

Returns:

OperatorConfig with multiple workers for multi-agent env.

observation_mode: str = 'visible_teammates'
operator_id: str
property operator_type: str

Get operator type (for backwards compatibility).

Returns the worker_type of the first worker (single-agent mode) or ‘multiagent’ if multiple workers are assigned.

property player_ids: list[str]

Get list of player IDs this operator manages.

run_id: str | None = None
property settings: Dict[str, Any]

Get worker settings (for backwards compatibility).

Returns the settings of the first worker.

classmethod single_agent(operator_id: str, display_name: str, worker_id: str, worker_type: str, env_name: str = 'babyai', task: str = 'BabyAI-GoToRedBall-v0', settings: Dict[str, Any] | None = None, max_steps: int | None = None) → OperatorConfig

Create a single-agent operator config.

Parameters:
  • operator_id – Unique operator ID.

  • display_name – Display name for UI.

  • worker_id – Worker to use (e.g., “balrog_worker”).

  • worker_type – Type of worker (“llm”, “vlm”, “rl”, “human”).

  • env_name – Environment family.

  • task – Specific task/level.

  • settings – Worker-specific settings.

  • max_steps – Maximum steps per episode before truncation.

Returns:

OperatorConfig with single worker assigned to “agent”.
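To illustrate how a factory like this can wrap one worker under the “agent” key while still supporting the backwards-compatible operator_type property, here is a self-contained sketch. `WorkerSketch` and `OperatorSketch` are hypothetical stand-ins that copy only a few of the documented fields; the real classes may differ in detail, and the "example-model" setting is invented.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class WorkerSketch:
    worker_id: str
    worker_type: str
    settings: Dict[str, Any] = field(default_factory=dict)


@dataclass
class OperatorSketch:
    operator_id: str
    display_name: str
    env_name: str = "babyai"
    task: str = "BabyAI-GoToRedBall-v0"
    workers: Dict[str, WorkerSketch] = field(default_factory=dict)
    max_steps: Optional[int] = None

    @classmethod
    def single_agent(cls, operator_id, display_name, worker_id, worker_type,
                     env_name="babyai", task="BabyAI-GoToRedBall-v0",
                     settings=None, max_steps=None):
        # Wrap the one worker under the conventional "agent" key.
        worker = WorkerSketch(worker_id, worker_type, dict(settings or {}))
        return cls(operator_id, display_name, env_name, task,
                   workers={"agent": worker}, max_steps=max_steps)

    @property
    def is_multiagent(self) -> bool:
        return len(self.workers) > 1

    @property
    def operator_type(self) -> str:
        # "multiagent" for several workers, else the first worker's type.
        if self.is_multiagent:
            return "multiagent"
        return next(iter(self.workers.values())).worker_type


cfg = OperatorSketch.single_agent(
    "operator_0", "GPT-4 LLM", "balrog_worker", "llm",
    settings={"model_id": "example-model"},  # invented setting value
)
```

Code written for single-worker operators can keep reading `cfg.operator_type` without knowing that the worker now lives inside a dict.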

task: str = 'BabyAI-GoToRedBall-v0'
view_size: int | None = None
with_run_id(run_id: str) → OperatorConfig

Return a copy of this config with the run_id set.
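One common way to implement a copy-with-change method on a dataclass is dataclasses.replace. Whether OperatorConfig does exactly this is an assumption. `ConfigSketch` and the run ID "run-123" below are hypothetical.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass
class ConfigSketch:
    """Hypothetical two-field stand-in for OperatorConfig."""
    operator_id: str
    run_id: Optional[str] = None

    def with_run_id(self, run_id: str) -> "ConfigSketch":
        # replace() builds a new instance; the original is left untouched.
        return replace(self, run_id=run_id)


original = ConfigSketch("operator_0")
started = original.with_run_id("run-123")
```

Returning a copy keeps the unstarted configuration reusable, for example when the same operator is launched again under a different run.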

property worker_id: str

Get worker ID (for backwards compatibility).

Returns the worker_id of the first worker.

workers: Dict[str, WorkerAssignment]

WorkerAssignment

class gym_gui.services.operator.WorkerAssignment(worker_id: str, worker_type: str, settings: Dict[str, Any] = <factory>)

Configuration for a single worker assigned to a player in an operator.

This dataclass holds worker-specific settings for one player/agent slot within an operator. For single-agent environments, there’s one WorkerAssignment. For multi-agent environments (e.g., chess), there’s one per player.

worker_id

References WorkerDefinition (e.g., “balrog_worker”, “cleanrl_worker”).

Type:

str

worker_type

Type of worker - “llm”, “vlm”, “rl”, “human”, or “baseline”.

Type:

str

settings

Worker-specific settings (client_name, model_id, api_key, etc.).

Type:

Dict[str, Any]

settings: Dict[str, Any]
worker_id: str
worker_type: str
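The `<factory>` default shown in the signature is how Sphinx renders a dataclass field declared with default_factory. A minimal sketch of why that matters, using a hypothetical `AssignmentSketch` class and an invented setting value:

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class AssignmentSketch:
    """Hypothetical stand-in mirroring the three documented fields."""
    worker_id: str
    worker_type: str
    # default_factory gives every instance its own fresh dict.
    settings: Dict[str, Any] = field(default_factory=dict)


a = AssignmentSketch("balrog_worker", "llm")
b = AssignmentSketch("cleanrl_worker", "rl")
a.settings["model_id"] = "example-model"  # invented value; only touches a
```

Because each instance owns its dict, configuring one player's worker cannot leak settings into another player's worker.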

OperatorLauncher

OperatorScriptExecutionManager

class gym_gui.services.operator_script_execution_manager.OperatorScriptExecutionManager(*args: Any, **kwargs: Any)

Manages automated execution of baseline operator experiments from scripts.

This class implements an event-driven state machine for running multiple episodes with different seeds automatically. It handles:

  • Launching operators with initial seeds

  • Stepping operators until episode completion

  • Advancing to next episode with new seed

  • Tracking progress and emitting updates

Signals:

  • launch_operator: Request to launch operator subprocess

  • reset_operator: Request to reset operator with new seed

  • step_operator: Request to step operator

  • stop_operator: Request to stop operator

  • progress_updated: Progress notification for UI

  • experiment_completed: Experiment finished notification

property current_episode: int

Get current episode number (1-indexed).

experiment_completed

alias of int

property is_running: bool

Check if experiment is currently running.

launch_operator

alias of str

on_episode_ended(operator_id: str, terminated: bool, truncated: bool) → None

Handle episode end from operator.

This advances to the next episode or completes the experiment.

Parameters:
  • operator_id – ID of the operator that finished an episode.

  • terminated – Whether episode terminated naturally.

  • truncated – Whether episode was truncated.

on_ready_received(operator_id: str) → None

Handle ready response from operator (after reset/launch).

This triggers automatic stepping.

Parameters:

operator_id – ID of the operator that is ready.

on_step_received(operator_id: str) → None

Handle step response from operator.

Triggers the next step after a pacing delay, giving Qt’s event loop time to process paint events so the render view updates each frame.

Parameters:

operator_id – ID of the operator that completed a step.

progress_updated

alias of int

reset_operator

alias of str

start_experiment(operator_configs: List[OperatorConfig], execution_config: Dict[str, Any]) → None

Start automated experiment execution.

Parameters:
  • operator_configs – List of operator configurations from script.

  • execution_config – Execution settings (num_episodes, seeds).

property step_delay_ms: int

Get step pacing delay in milliseconds.

step_operator

alias of str

stop_experiment() → None

Stop running experiment.

stop_operator

alias of str

property total_episodes: int

Get total number of episodes.
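The launch, ready, step, episode-end cycle described for this class can be sketched as a small state machine in plain Python. This is an illustration only. `ExperimentSketch` is hypothetical, it drives a single operator, it uses a callback instead of Qt signals, it skips the pacing delay, and the signal payloads shown are assumptions rather than the real signatures.

```python
from typing import Callable, List


class ExperimentSketch:
    """Plain-Python stand-in for the documented state machine (no Qt)."""

    def __init__(self, seeds: List[int], emit: Callable[[str, object], None]):
        self._seeds = seeds
        self._emit = emit  # stands in for signal emission
        self._index = 0
        self.is_running = False

    @property
    def current_episode(self) -> int:
        return self._index + 1  # 1-indexed, as documented

    @property
    def total_episodes(self) -> int:
        return len(self._seeds)

    def start(self, operator_id: str) -> None:
        self._index = 0
        self.is_running = True
        self._emit("launch_operator", (operator_id, self._seeds[0]))

    def on_ready_received(self, operator_id: str) -> None:
        # Ready after launch/reset: begin stepping.
        if self.is_running:
            self._emit("step_operator", operator_id)

    def on_step_received(self, operator_id: str) -> None:
        # The documented class waits for a pacing delay here; omitted.
        if self.is_running:
            self._emit("step_operator", operator_id)

    def on_episode_ended(self, operator_id: str) -> None:
        if not self.is_running:
            return
        self._emit("progress_updated", self.current_episode)
        if self.current_episode >= self.total_episodes:
            self.is_running = False
            self._emit("experiment_completed", self.total_episodes)
            return
        self._index += 1
        self._emit("reset_operator", (operator_id, self._seeds[self._index]))


events = []
exp = ExperimentSketch(seeds=[1, 2], emit=lambda name, payload: events.append(name))
exp.start("operator_0")
exp.on_ready_received("operator_0")   # episode 1 begins stepping
exp.on_step_received("operator_0")
exp.on_episode_ended("operator_0")    # advance to episode 2 with next seed
exp.on_ready_received("operator_0")
exp.on_episode_ended("operator_0")    # last episode: experiment completes
```

The sketch never loops on its own: every transition is triggered by an incoming event, which is what lets the surrounding event loop keep processing other work between steps.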

PolicyMappingService

class gym_gui.services.policy_mapping.PolicyMappingService(actor_service: ActorService)

Per-agent policy mapping for multi-agent environments.

Extends ActorService to support:

  1. Multiple active policies (one per agent)

  2. Paradigm-aware action selection

  3. Worker-specific routing

For single-agent environments, delegates to ActorService. For multi-agent, maintains agent_id → policy_id mapping.

Example

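>>> from gym_gui.services.actor import ActorService
>>> from gym_gui.services.policy_mapping import PolicyMappingService
>>> # HumanKeyboardActor, CleanRLWorkerActor and SteppingParadigm must also be
>>> # imported; their module paths are not listed in this reference.
>>>
>>> # Register two actors; the keyboard actor starts out active.
>>> actor_service = ActorService()
>>> actor_service.register_actor(HumanKeyboardActor(), activate=True)
>>> actor_service.register_actor(CleanRLWorkerActor())
>>>
>>> # Map each player to its own policy for turn-based play.
>>> mapping = PolicyMappingService(actor_service)
>>> mapping.set_paradigm(SteppingParadigm.SEQUENTIAL)
>>> mapping.set_agents(["player_0", "player_1"])
>>> mapping.bind_agent_policy("player_0", "human_keyboard")
>>> mapping.bind_agent_policy("player_1", "cleanrl_worker")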

property actor_service: ActorService

Get the underlying ActorService.

property agent_ids: List[str]

Get list of configured agent IDs.

available_policy_ids() → Iterable[str]

Get available policy IDs from ActorService.

Returns:

Iterable of policy IDs.

bind_agent_policy(agent_id: str, policy_id: str, *, worker_id: str | None = None, config: Dict[str, Any] | None = None) → None

Bind an agent to a specific policy.

Parameters:
  • agent_id – The agent to bind.

  • policy_id – The policy (Actor) to use for this agent.

  • worker_id – Optional worker identifier for remote execution.

  • config – Optional worker-specific configuration.

Raises:

KeyError – If policy_id is not registered in ActorService.
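The validation rule above (reject a binding whose policy is not registered) can be sketched as follows. `MappingSketch` is a hypothetical stand-in that tracks registered policy IDs in a plain set instead of consulting a real ActorService, and "missing_policy" is an invented ID.

```python
from typing import Dict, Optional, Set


class MappingSketch:
    """Hypothetical stand-in for the agent-to-policy binding table."""

    def __init__(self, registered: Set[str]):
        self._registered = registered
        self._bindings: Dict[str, str] = {}

    def bind_agent_policy(self, agent_id: str, policy_id: str) -> None:
        # Fail early so a typo in policy_id surfaces at bind time.
        if policy_id not in self._registered:
            raise KeyError(f"Unknown policy: {policy_id}")
        self._bindings[agent_id] = policy_id

    def get_binding(self, agent_id: str) -> Optional[str]:
        return self._bindings.get(agent_id)


mapping = MappingSketch({"human_keyboard", "cleanrl_worker"})
mapping.bind_agent_policy("player_0", "human_keyboard")
try:
    mapping.bind_agent_policy("player_1", "missing_policy")
    rejected = False
except KeyError:
    rejected = True
```

A failed bind leaves the table unchanged, so "player_1" simply has no binding afterwards.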

get_all_bindings() → Dict[str, AgentPolicyBinding]

Get all agent-policy bindings.

Returns:

Copy of bindings dictionary.

get_binding(agent_id: str) → AgentPolicyBinding | None

Get the policy binding for an agent.

Parameters:

agent_id – The agent identifier.

Returns:

The binding if found, otherwise None.

is_multi_agent() → bool

Check if we’re in multi-agent mode.

Returns:

True if more than one agent is configured.

notify_all_episode_end(summaries: Dict[str, EpisodeSummary]) → None

Notify all agents of episode end.

Parameters:

summaries – Dict mapping agent_id to EpisodeSummary.

notify_episode_end(agent_id: str, summary: EpisodeSummary) → None

Notify the appropriate policy of episode end.

Parameters:
  • agent_id – The agent whose episode ended.

  • summary – Episode summary information.

notify_step(agent_id: str, snapshot: StepSnapshot) → None

Notify the appropriate policy of a step result.

Parameters:
  • agent_id – The agent that took the step.

  • snapshot – The step result.

notify_steps(snapshots: Dict[str, StepSnapshot]) → None

Notify all agents of their step results (Simultaneous mode).

Parameters:

snapshots – Dict mapping agent_id to StepSnapshot.

property paradigm: SteppingParadigm

Get the current stepping paradigm.

reset() → None

Reset all bindings for a new session.

select_action(agent_id: str, snapshot: StepSnapshot) → int | None

Select action for a specific agent (Sequential/AEC mode).

Parameters:
  • agent_id – The agent needing an action.

  • snapshot – Current step state.

Returns:

The action to take, or None to abstain.

select_actions(observations: Dict[str, Any], snapshots: Dict[str, StepSnapshot]) → Dict[str, int | None]

Select actions for all agents (Simultaneous/POSG mode).

Parameters:
  • observations – Dict mapping agent_id to observation.

  • snapshots – Dict mapping agent_id to StepSnapshot.

Returns:

Dict mapping agent_id to action (or None).
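Simultaneous action selection boils down to looking up each agent's bound policy and asking it for an action. The sketch below is simplified and hypothetical: policies are plain callables, the snapshots argument is left out, and an agent without a usable binding is assumed to yield None.

```python
from typing import Any, Callable, Dict, Optional

Policy = Callable[[Any], Optional[int]]


def select_actions(
    bindings: Dict[str, str],
    policies: Dict[str, Policy],
    observations: Dict[str, Any],
) -> Dict[str, Optional[int]]:
    actions: Dict[str, Optional[int]] = {}
    for agent_id, obs in observations.items():
        policy_id = bindings.get(agent_id)
        policy = policies.get(policy_id) if policy_id is not None else None
        # Unbound agents abstain (assumed behaviour for this sketch).
        actions[agent_id] = policy(obs) if policy is not None else None
    return actions


policies = {"always_left": lambda obs: 0, "echo": lambda obs: obs}
bindings = {"player_0": "always_left", "player_1": "echo"}
actions = select_actions(
    bindings, policies, {"player_0": 7, "player_1": 3, "player_2": 5}
)
```

The result has one entry per observed agent, which matches the dict-in, dict-out shape that parallel environments typically expect.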

set_agents(agent_ids: List[str]) → None

Configure the list of agents in the environment.

Auto-binds agents to the default policy if not already bound.

Parameters:

agent_ids – List of agent identifiers from the environment.
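The auto-binding behaviour can be pictured as "fill in a default only where nothing is bound yet". A minimal sketch with a hypothetical free function; how the real service chooses its default policy is not stated here, so it is passed in explicitly.

```python
from typing import Dict, List


def set_agents(
    bindings: Dict[str, str], agent_ids: List[str], default_policy: str
) -> Dict[str, str]:
    for agent_id in agent_ids:
        # setdefault keeps an existing binding and only fills the gaps.
        bindings.setdefault(agent_id, default_policy)
    return bindings


bindings = {"player_0": "human_keyboard"}
set_agents(bindings, ["player_0", "player_1"], default_policy="cleanrl_worker")
```

After the call, "player_0" keeps its explicit binding and only "player_1" receives the default.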

set_paradigm(paradigm: SteppingParadigm) → None

Set the stepping paradigm for this session.

Parameters:

paradigm – The stepping paradigm (SINGLE_AGENT, SEQUENTIAL, etc.).

unbind_agent(agent_id: str) → None

Remove binding for an agent.

Parameters:

agent_id – The agent to unbind.

ActorService

class gym_gui.services.actor.ActorService

Registry that coordinates active actors for the current session.

available_actor_ids() → Iterable[str]
describe_actors() → tuple[ActorDescriptor, ...]

Return metadata for all registered actors in registration order.

get_active_actor() → Actor | None
get_active_actor_id() → str | None
get_actor_descriptor(actor_id: str) → ActorDescriptor | None
property last_seed: int | None
notify_episode_end(summary: EpisodeSummary) → None
notify_step(snapshot: StepSnapshot) → None
register_actor(actor: Actor, *, display_name: str | None = None, description: str | None = None, policy_label: str | None = None, backend_label: str | None = None, activate: bool = False) → None
seed(seed: int) → None

Propagate a deterministic seed to all registered actors.

select_action(snapshot: StepSnapshot) → int | None
set_active_actor(actor_id: str) → None
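Most ActorService methods above carry no description, so the sketch below shows one plausible reading of a registry with a single active actor and seed propagation. It is not the real implementation. `ActorStub` and `RegistrySketch` are hypothetical, actors are assumed to expose an `id` attribute, and an actor becomes active only when asked to.

```python
from typing import Dict, Optional


class ActorStub:
    """Hypothetical actor: has an id, accepts a seed, returns a fixed action."""

    def __init__(self, actor_id: str, action: int):
        self.id = actor_id
        self.action = action
        self.seed_value: Optional[int] = None

    def seed(self, seed: int) -> None:
        self.seed_value = seed

    def select_action(self, snapshot: object) -> Optional[int]:
        return self.action


class RegistrySketch:
    def __init__(self) -> None:
        self._actors: Dict[str, ActorStub] = {}
        self._active_id: Optional[str] = None
        self.last_seed: Optional[int] = None

    def register_actor(self, actor: ActorStub, *, activate: bool = False) -> None:
        self._actors[actor.id] = actor
        if activate:
            self._active_id = actor.id

    def set_active_actor(self, actor_id: str) -> None:
        if actor_id not in self._actors:
            raise KeyError(actor_id)
        self._active_id = actor_id

    def seed(self, seed: int) -> None:
        # Every registered actor gets the seed, not just the active one.
        self.last_seed = seed
        for actor in self._actors.values():
            actor.seed(seed)

    def select_action(self, snapshot: object) -> Optional[int]:
        # Only the active actor is consulted.
        if self._active_id is None:
            return None
        return self._actors[self._active_id].select_action(snapshot)


registry = RegistrySketch()
keyboard = ActorStub("human_keyboard", action=2)
rl = ActorStub("cleanrl_worker", action=5)
registry.register_actor(keyboard, activate=True)
registry.register_actor(rl)
registry.seed(42)
first = registry.select_action(snapshot=None)
registry.set_active_actor("cleanrl_worker")
second = registry.select_action(snapshot=None)
```

Seeding all actors up front means that switching the active actor mid-session does not require reseeding.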

AgentPolicyBinding

class gym_gui.services.policy_mapping.AgentPolicyBinding(agent_id: str, policy_id: str, worker_id: str | None = None, config: Dict[str, Any] = <factory>)

Binding between an agent and its policy controller.

agent_id

Unique identifier for the agent in the environment.

Type:

str

policy_id

References an Actor registered in ActorService.

Type:

str

worker_id

Optional worker identifier (e.g., “cleanrl_worker”, “llm_worker”).

Type:

str | None

config

Worker-specific configuration options.

Type:

Dict[str, Any]

agent_id: str
config: Dict[str, Any]
policy_id: str
worker_id: str | None = None