Services API
Service classes providing business logic.
OperatorConfig
- class gym_gui.services.operator.OperatorConfig(operator_id: str, display_name: str, env_name: str = 'babyai', task: str = 'BabyAI-GoToRedBall-v0', workers: Dict[str, WorkerAssignment] = <factory>, run_id: str | None = None, execution_mode: str = 'aec', max_steps: int | None = None, observation_mode: str = 'visible_teammates', coordination_level: int = 1, view_size: int | None = None)
Configuration for a single operator instance in multi-operator mode.
This dataclass holds all the information needed to configure and run an operator (LLM or RL worker) in the multi-operator comparison view.
An Operator binds one or more workers to a single environment:
- Single-agent envs (babyai, minigrid): 1 worker → 1 environment
- Multi-agent envs (chess, connect4): N workers → 1 environment
- operator_id
Unique ID for this operator instance (e.g., “operator_0”).
- Type:
str
- display_name
User-visible name (e.g., “GPT-4 LLM”, “Chess Match”).
- Type:
str
- env_name
Environment family (e.g., “babyai”, “minigrid”, “pettingzoo”).
- Type:
str
- task
Task/level within the environment (e.g., “BabyAI-GoToRedBall-v0”, “chess_v6”).
- Type:
str
- workers
Dict mapping player_id to WorkerAssignment.
Single-agent: {“agent”: WorkerAssignment(…)}
Multi-agent: {“player_0”: WorkerAssignment(…), “player_1”: WorkerAssignment(…)}
- Type:
Dict[str, gym_gui.services.operator.WorkerAssignment]
- run_id
Assigned run ID when operator is started (for telemetry routing).
- Type:
str | None
- Factory Methods:
Use OperatorConfig.single_agent() for single-agent environments. Use OperatorConfig.multi_agent() for multi-agent environments (chess, Go, etc.).
- Backwards Compatibility:
The properties operator_type, worker_id, and settings read from workers[“agent”], for compatibility with existing code that expects single-worker operators.
- coordination_level: int = 1
- display_name: str
- env_name: str = 'babyai'
- execution_mode: str = 'aec'
- get_ai_agents() → list[str]
Get list of agent IDs that are controlled by AI (LLM/RL).
- Returns:
List of agent IDs where worker_type != “human”.
- get_human_agents() → list[str]
Get list of agent IDs that are controlled by humans.
- Returns:
List of agent IDs where worker_type == “human”.
- get_worker_for_player(player_id: str) → WorkerAssignment | None
Get the worker assignment for a specific player.
- Parameters:
player_id – The player ID (e.g., “player_0”, “agent”).
- Returns:
WorkerAssignment for that player, or None if not found.
- has_human_agents() → bool
Check if this operator has any human-controlled agents.
- Returns:
True if at least one agent is human-controlled.
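The human/AI helpers above amount to partitioning the workers mapping by worker_type. A minimal sketch of that idea, assuming a stand-in Worker dataclass and free functions in place of the methods (both hypothetical, not the library's own code); the worker ID "human_worker" is a placeholder:

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Worker:
    """Hypothetical stand-in for WorkerAssignment (only the fields used here)."""
    worker_id: str
    worker_type: str


def get_ai_agents(workers: Dict[str, Worker]) -> List[str]:
    # Agent IDs whose worker_type is anything other than "human".
    return [pid for pid, w in workers.items() if w.worker_type != "human"]


def get_human_agents(workers: Dict[str, Worker]) -> List[str]:
    # Agent IDs whose worker_type is exactly "human".
    return [pid for pid, w in workers.items() if w.worker_type == "human"]


def has_human_agents(workers: Dict[str, Worker]) -> bool:
    # True as soon as one human-controlled slot is found.
    return any(w.worker_type == "human" for w in workers.values())


workers = {
    "player_0": Worker("human_worker", "human"),   # placeholder worker ID
    "player_1": Worker("balrog_worker", "llm"),
}
ai_agents = get_ai_agents(workers)
human_agents = get_human_agents(workers)
```

With this mapping, player_0 lands in the human list and player_1 in the AI list.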
- property is_multiagent: bool
Check if this operator has multiple workers (multi-agent environment).
- is_parallel_multiagent() → bool
Check if this is a parallel/simultaneous multi-agent environment.
- Returns:
True if execution_mode is “parallel” and there are multiple agents.
- classmethod multi_agent(operator_id: str, display_name: str, env_name: str, task: str, player_workers: Dict[str, WorkerAssignment], execution_mode: str = 'aec', observation_mode: str = 'visible_teammates', coordination_level: int = 1, max_steps: int | None = None, view_size: int | None = None) → OperatorConfig
Create a multi-agent operator config.
- Parameters:
operator_id – Unique operator ID.
display_name – Display name for UI.
env_name – Environment family (e.g., “pettingzoo”).
task – Specific task (e.g., “chess_v6”).
player_workers – Dict mapping player_id to WorkerAssignment.
execution_mode – Execution paradigm - “aec” (turn-based) or “parallel” (simultaneous).
observation_mode – Observation mode for MultiGrid - “egocentric” or “visible_teammates”.
coordination_level – Coordination strategy level (1=Emergent, 2=Basic Hints, 3=Role-Based).
max_steps – Maximum steps per episode before truncation.
view_size – Agent view size for MOSAIC (None = env default of 3).
- Returns:
OperatorConfig with multiple workers for multi-agent env.
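As an illustration of the parameters above, here is a hedged sketch of assembling a two-player config. WorkerAssignment and OperatorConfig below are simplified stand-ins that copy a subset of the documented field names and defaults; the real classes in gym_gui.services.operator may store or validate more. The values "human_worker" and "example-model" are placeholders.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class WorkerAssignment:
    """Stand-in with the documented fields; not the library class."""
    worker_id: str
    worker_type: str
    settings: Dict[str, Any] = field(default_factory=dict)


@dataclass
class OperatorConfig:
    """Stand-in carrying a subset of the documented fields and defaults."""
    operator_id: str
    display_name: str
    env_name: str = "babyai"
    task: str = "BabyAI-GoToRedBall-v0"
    workers: Dict[str, WorkerAssignment] = field(default_factory=dict)
    execution_mode: str = "aec"
    max_steps: Optional[int] = None

    @classmethod
    def multi_agent(
        cls,
        operator_id: str,
        display_name: str,
        env_name: str,
        task: str,
        player_workers: Dict[str, WorkerAssignment],
        execution_mode: str = "aec",
        max_steps: Optional[int] = None,
    ) -> "OperatorConfig":
        # Copying the mapping is a choice of this sketch, so later changes
        # to the caller's dict do not alter the config.
        return cls(
            operator_id=operator_id,
            display_name=display_name,
            env_name=env_name,
            task=task,
            workers=dict(player_workers),
            execution_mode=execution_mode,
            max_steps=max_steps,
        )


# Placeholder worker ID and model name, for illustration only.
config = OperatorConfig.multi_agent(
    operator_id="operator_0",
    display_name="Chess Match",
    env_name="pettingzoo",
    task="chess_v6",
    player_workers={
        "player_0": WorkerAssignment("human_worker", "human"),
        "player_1": WorkerAssignment("balrog_worker", "llm", {"model_id": "example-model"}),
    },
)
```

The sketch leaves execution_mode at its documented default of "aec", which suits a turn-based game such as chess.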
- observation_mode: str = 'visible_teammates'
- operator_id: str
- property operator_type: str
Get operator type (for backwards compatibility).
Returns the worker_type of the first worker (single-agent mode) or ‘multiagent’ if multiple workers are assigned.
- property settings: Dict[str, Any]
Get worker settings (for backwards compatibility).
Returns the settings of the first worker.
- classmethod single_agent(operator_id: str, display_name: str, worker_id: str, worker_type: str, env_name: str = 'babyai', task: str = 'BabyAI-GoToRedBall-v0', settings: Dict[str, Any] | None = None, max_steps: int | None = None) → OperatorConfig
Create a single-agent operator config.
- Parameters:
operator_id – Unique operator ID.
display_name – Display name for UI.
worker_id – Worker to use (e.g., “balrog_worker”).
worker_type – Type of worker (“llm”, “vlm”, “rl”, “human”).
env_name – Environment family.
task – Specific task/level.
settings – Worker-specific settings.
max_steps – Maximum steps per episode before truncation.
- Returns:
OperatorConfig with single worker assigned to “agent”.
- task: str = 'BabyAI-GoToRedBall-v0'
- property worker_id: str
Get worker ID (for backwards compatibility).
Returns the worker_id of the first worker.
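The single_agent() factory and the backwards-compatibility properties can be pictured together. The sketch below uses stand-in dataclasses with a reduced field set, and it assumes that "first worker" means the first entry in dict insertion order; neither is confirmed by the reference above. The model name is a placeholder.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class WorkerAssignment:
    """Stand-in with the documented fields; not the library class."""
    worker_id: str
    worker_type: str
    settings: Dict[str, Any] = field(default_factory=dict)


@dataclass
class OperatorConfig:
    """Stand-in with a reduced field set."""
    operator_id: str
    display_name: str
    workers: Dict[str, WorkerAssignment] = field(default_factory=dict)

    @classmethod
    def single_agent(
        cls,
        operator_id: str,
        display_name: str,
        worker_id: str,
        worker_type: str,
        settings: Optional[Dict[str, Any]] = None,
    ) -> "OperatorConfig":
        # The single worker is stored under the fixed key "agent".
        assignment = WorkerAssignment(worker_id, worker_type, dict(settings or {}))
        return cls(operator_id, display_name, {"agent": assignment})

    def _first_worker(self) -> WorkerAssignment:
        # Assumption: "first" means first in insertion order.
        return next(iter(self.workers.values()))

    @property
    def operator_type(self) -> str:
        # 'multiagent' when several workers are assigned, else the worker's type.
        if len(self.workers) > 1:
            return "multiagent"
        return self._first_worker().worker_type

    @property
    def worker_id(self) -> str:
        return self._first_worker().worker_id

    @property
    def settings(self) -> Dict[str, Any]:
        return self._first_worker().settings


legacy = OperatorConfig.single_agent(
    "operator_0", "GPT-4 LLM", "balrog_worker", "llm", {"model_id": "example-model"}
)
duo = OperatorConfig(
    "operator_1",
    "Chess Match",
    {
        "player_0": WorkerAssignment("balrog_worker", "llm"),
        "player_1": WorkerAssignment("cleanrl_worker", "rl"),
    },
)
```

Code written against the older single-worker shape can keep reading legacy.worker_id and legacy.settings, while duo.operator_type reports 'multiagent'.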
WorkerAssignment
- class gym_gui.services.operator.WorkerAssignment(worker_id: str, worker_type: str, settings: Dict[str, Any] = <factory>)
Configuration for a single worker assigned to a player in an operator.
This dataclass holds worker-specific settings for one player/agent slot within an operator. For single-agent environments, there’s one WorkerAssignment. For multi-agent environments (e.g., chess), there’s one per player.
- worker_id
References WorkerDefinition (e.g., “balrog_worker”, “cleanrl_worker”).
- Type:
str
- worker_type
Type of worker - “llm”, “vlm”, “rl”, “human”, or “baseline”.
- Type:
str
- settings
Worker-specific settings (client_name, model_id, api_key, etc.).
- Type:
Dict[str, Any]
- worker_id: str
- worker_type: str
OperatorLauncher
OperatorScriptExecutionManager
- class gym_gui.services.operator_script_execution_manager.OperatorScriptExecutionManager(*args: Any, **kwargs: Any)
Manages automated execution of baseline operator experiments from scripts.
This class implements an event-driven state machine for running multiple episodes with different seeds automatically. It handles:
- Launching operators with initial seeds
- Stepping operators until episode completion
- Advancing to next episode with new seed
- Tracking progress and emitting updates
- Signals:
launch_operator: Request to launch operator subprocess
reset_operator: Request to reset operator with new seed
step_operator: Request to step operator
stop_operator: Request to stop operator
progress_updated: Progress notification for UI
experiment_completed: Experiment finished notification
- on_episode_ended(operator_id: str, terminated: bool, truncated: bool) → None
Handle episode end from operator.
This advances to the next episode or completes the experiment.
- Parameters:
operator_id – ID of the operator that finished an episode.
terminated – Whether episode terminated naturally.
truncated – Whether episode was truncated.
- on_ready_received(operator_id: str) → None
Handle ready response from operator (after reset/launch).
This triggers automatic stepping.
- Parameters:
operator_id – ID of the operator that is ready.
- on_step_received(operator_id: str) → None
Handle step response from operator.
Triggers the next step after a pacing delay, giving Qt’s event loop time to process paint events so the render view updates each frame.
- Parameters:
operator_id – ID of the operator that completed a step.
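The three handlers above form a loop: ready triggers a step, each step triggers the next, and an episode end either resets with the next seed or finishes the experiment. The toy model below traces that loop in plain Python. It is only a sketch: it records request names in a list instead of emitting Qt signals, it skips the pacing delay, and the signal payloads, the ordering of progress updates, and the stop request at completion are assumptions rather than documented behavior.

```python
from typing import List, Tuple


class ScriptRunSketch:
    """Toy model of the documented episode loop (hypothetical class)."""

    def __init__(self, operator_id: str, seeds: List[int]) -> None:
        self.operator_id = operator_id
        self.seeds = list(seeds)
        self.episode_index = 0
        self.requests: List[Tuple[str, object]] = []

    def start(self) -> None:
        # Launch with the first seed.
        self.requests.append(("launch_operator", self.seeds[0]))

    def on_ready_received(self, operator_id: str) -> None:
        # Ready after launch or reset: begin stepping.
        self.requests.append(("step_operator", operator_id))

    def on_step_received(self, operator_id: str) -> None:
        # The real manager waits for a pacing delay; the sketch steps at once.
        self.requests.append(("step_operator", operator_id))

    def on_episode_ended(self, operator_id: str, terminated: bool, truncated: bool) -> None:
        self.episode_index += 1
        self.requests.append(("progress_updated", self.episode_index))
        if self.episode_index < len(self.seeds):
            # More seeds remain: reset with the next one.
            self.requests.append(("reset_operator", self.seeds[self.episode_index]))
        else:
            # No seeds left: stop and report completion (assumed ordering).
            self.requests.append(("stop_operator", operator_id))
            self.requests.append(("experiment_completed", operator_id))


run = ScriptRunSketch("operator_0", seeds=[1, 2])
run.start()
run.on_ready_received("operator_0")
run.on_step_received("operator_0")
run.on_episode_ended("operator_0", terminated=True, truncated=False)
run.on_ready_received("operator_0")
run.on_episode_ended("operator_0", terminated=False, truncated=True)
event_names = [name for name, _ in run.requests]
```

Reading event_names from start to end gives the order in which a two-seed run would issue its requests under these assumptions.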
PolicyMappingService
- class gym_gui.services.policy_mapping.PolicyMappingService(actor_service: ActorService)
Per-agent policy mapping for multi-agent environments.
Extends ActorService to support:
1. Multiple active policies (one per agent)
2. Paradigm-aware action selection
3. Worker-specific routing
For single-agent environments, delegates to ActorService. For multi-agent, maintains agent_id → policy_id mapping.
Example
>>> actor_service = ActorService()
>>> actor_service.register_actor(HumanKeyboardActor(), activate=True)
>>> actor_service.register_actor(CleanRLWorkerActor())
>>>
>>> mapping = PolicyMappingService(actor_service)
>>> mapping.set_paradigm(SteppingParadigm.SEQUENTIAL)
>>> mapping.set_agents(["player_0", "player_1"])
>>> mapping.bind_agent_policy("player_0", "human_keyboard")
>>> mapping.bind_agent_policy("player_1", "cleanrl_worker")
See also
PolicyMappingService for policy mapping details
- property actor_service: ActorService
Get the underlying ActorService.
- available_policy_ids() → Iterable[str]
Get available policy IDs from ActorService.
- Returns:
Iterable of policy IDs.
- bind_agent_policy(agent_id: str, policy_id: str, *, worker_id: str | None = None, config: Dict[str, Any] | None = None) → None
Bind an agent to a specific policy.
- Parameters:
agent_id – The agent to bind.
policy_id – The policy (Actor) to use for this agent.
worker_id – Optional worker identifier for remote execution.
config – Optional worker-specific configuration.
- Raises:
KeyError – If policy_id is not registered in ActorService.
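The KeyError contract can be illustrated with a small stand-in. MappingSketch and Binding below are hypothetical classes that take a plain set of registered policy IDs instead of an ActorService; they only mirror the documented parameters and the failure case, and "unknown_policy" is a made-up ID.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional, Set


@dataclass
class Binding:
    """Hypothetical stand-in for AgentPolicyBinding."""
    agent_id: str
    policy_id: str
    worker_id: Optional[str] = None
    config: Dict[str, Any] = field(default_factory=dict)


class MappingSketch:
    """Hypothetical stand-in for the binding part of PolicyMappingService."""

    def __init__(self, registered_policy_ids: Set[str]) -> None:
        self._registered = set(registered_policy_ids)
        self._bindings: Dict[str, Binding] = {}

    def bind_agent_policy(
        self,
        agent_id: str,
        policy_id: str,
        *,
        worker_id: Optional[str] = None,
        config: Optional[Dict[str, Any]] = None,
    ) -> None:
        # Reject policies that were never registered, as documented.
        if policy_id not in self._registered:
            raise KeyError(policy_id)
        self._bindings[agent_id] = Binding(agent_id, policy_id, worker_id, dict(config or {}))

    def get_binding(self, agent_id: str) -> Optional[Binding]:
        return self._bindings.get(agent_id)


mapping = MappingSketch({"human_keyboard", "cleanrl_worker"})
mapping.bind_agent_policy("player_1", "cleanrl_worker", worker_id="cleanrl_worker")

try:
    mapping.bind_agent_policy("player_0", "unknown_policy")
    rejected = False
except KeyError:
    rejected = True
```

In this sketch a failed bind leaves the agent unbound, so a caller can catch the KeyError and fall back to another policy.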
- get_all_bindings() → Dict[str, AgentPolicyBinding]
Get all agent-policy bindings.
- Returns:
Copy of bindings dictionary.
- get_binding(agent_id: str) → AgentPolicyBinding | None
Get the policy binding for an agent.
- Parameters:
agent_id – The agent identifier.
- Returns:
The binding if found, otherwise None.
- is_multi_agent() → bool
Check if we’re in multi-agent mode.
- Returns:
True if more than one agent is configured.
- notify_all_episode_end(summaries: Dict[str, EpisodeSummary]) → None
Notify all agents of episode end.
- Parameters:
summaries – Dict mapping agent_id to EpisodeSummary.
- notify_episode_end(agent_id: str, summary: EpisodeSummary) → None
Notify the appropriate policy of episode end.
- Parameters:
agent_id – The agent whose episode ended.
summary – Episode summary information.
- notify_step(agent_id: str, snapshot: StepSnapshot) → None
Notify the appropriate policy of a step result.
- Parameters:
agent_id – The agent that took the step.
snapshot – The step result.
- notify_steps(snapshots: Dict[str, StepSnapshot]) → None
Notify all agents of their step results (Simultaneous mode).
- Parameters:
snapshots – Dict mapping agent_id to StepSnapshot.
- property paradigm: SteppingParadigm
Get the current stepping paradigm.
- select_action(agent_id: str, snapshot: StepSnapshot) → int | None
Select action for a specific agent (Sequential/AEC mode).
- Parameters:
agent_id – The agent needing an action.
snapshot – Current step state.
- Returns:
The action to take, or None to abstain.
- select_actions(observations: Dict[str, Any], snapshots: Dict[str, StepSnapshot]) → Dict[str, int | None]
Select actions for all agents (Simultaneous/POSG mode).
- Parameters:
observations – Dict mapping agent_id to observation.
snapshots – Dict mapping agent_id to StepSnapshot.
- Returns:
Dict mapping agent_id to action (or None).
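In simultaneous mode every agent gets an entry in the result, with None marking an agent whose policy abstains. The function below is a minimal sketch of that fan-out, assuming policies are plain callables taking (observation, snapshot) and that an unbound agent also maps to None. Both assumptions, the Policy alias, and the policy names are hypothetical.

```python
from typing import Any, Callable, Dict, Optional

# Hypothetical policy shape: (observation, snapshot) -> action index or None.
Policy = Callable[[Any, Any], Optional[int]]


def select_actions(
    bindings: Dict[str, str],
    policies: Dict[str, Policy],
    observations: Dict[str, Any],
    snapshots: Dict[str, Any],
) -> Dict[str, Optional[int]]:
    actions: Dict[str, Optional[int]] = {}
    for agent_id, observation in observations.items():
        # Look up the policy bound to this agent, if any.
        policy = policies.get(bindings.get(agent_id, ""))
        if policy is None:
            actions[agent_id] = None  # unbound agent: no action (sketch assumption)
        else:
            actions[agent_id] = policy(observation, snapshots.get(agent_id))
    return actions


policies: Dict[str, Policy] = {
    "always_zero": lambda observation, snapshot: 0,
    "abstain": lambda observation, snapshot: None,
}
bindings = {"player_0": "always_zero", "player_1": "abstain"}
actions = select_actions(
    bindings,
    policies,
    observations={"player_0": [0, 1], "player_1": [1, 0]},
    snapshots={"player_0": None, "player_1": None},
)
```

Returning a full dict, rather than omitting abstaining agents, keeps the result aligned with the keys of observations.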
- set_agents(agent_ids: List[str]) → None
Configure the list of agents in the environment.
Auto-binds agents to the default policy if not already bound.
- Parameters:
agent_ids – List of agent identifiers from the environment.
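The auto-binding rule can be sketched as a fill-in over the existing bindings: agents that already have a policy keep it, and the rest receive the default. The function below is a hypothetical free-function version. How the service picks its default policy, and what it does with bindings for agents that are no longer listed, is not stated above; here the default is passed in explicitly and stale bindings are kept.

```python
from typing import Dict, List


def set_agents(
    bindings: Dict[str, str],
    agent_ids: List[str],
    default_policy_id: str,
) -> Dict[str, str]:
    # Work on a copy so the caller's mapping is left untouched.
    updated = dict(bindings)
    for agent_id in agent_ids:
        # Existing bindings win; only unbound agents get the default.
        updated.setdefault(agent_id, default_policy_id)
    return updated


before = {"player_0": "human_keyboard"}
after = set_agents(before, ["player_0", "player_1"], default_policy_id="cleanrl_worker")
```

Here player_0 keeps its explicit human_keyboard binding and only player_1 picks up the default.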
- set_paradigm(paradigm: SteppingParadigm) → None
Set the stepping paradigm for this session.
- Parameters:
paradigm – The stepping paradigm (SINGLE_AGENT, SEQUENTIAL, etc.)
ActorService
- class gym_gui.services.actor.ActorService
Registry that coordinates active actors for the current session.
- describe_actors() → tuple[ActorDescriptor, ...]
Return metadata for all registered actors in registration order.
AgentPolicyBinding
- class gym_gui.services.policy_mapping.AgentPolicyBinding(agent_id: str, policy_id: str, worker_id: str | None = None, config: Dict[str, Any] = <factory>)
Binding between an agent and its policy controller.
- agent_id
Unique identifier for the agent in the environment.
- Type:
str
- policy_id
References an Actor registered in ActorService.
- Type:
str
- worker_id
Optional worker identifier (e.g., “cleanrl_worker”, “llm_worker”).
- Type:
str | None
- config
Worker-specific configuration options.
- Type:
Dict[str, Any]
- agent_id: str
- policy_id: str