Single-Agent Human Control
In single-agent mode, one human player controls the environment using the system keyboard. This page traces the full journey of a keypress from the physical keyboard to the updated render view.
End-to-End Flow
sequenceDiagram
participant User
participant HIC as HumanInputController
participant SC as SessionController
participant Adapter
participant Env as Gymnasium Environment
participant RR as RendererRegistry
participant RV as Render View
User->>HIC: keyPress (e.g., Arrow Right)
HIC->>SC: perform_human_action(action=2)
SC->>Adapter: _apply_action(2)
Adapter->>Env: env.step(2)
Env-->>Adapter: obs, reward, terminated, truncated, info
Adapter-->>SC: step result
SC->>SC: step_processed signal
SC-->>RR: render payload
RR->>RV: strategy.render(payload)
RV-->>User: Updated visual display
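The call chain in the diagram can be sketched without Qt or Gymnasium installed. This is a minimal illustration, not the project's implementation: `StubEnv`, `SessionControllerSketch`, `connect_step_processed`, and the payload keys are hypothetical names, and a plain callback list stands in for the `step_processed` signal. The stub follows the Gymnasium `step()` return shape and uses the FrozenLake index 2 for a move right.

```python
from typing import Any, Callable, Dict, List, Tuple

StepResult = Tuple[int, float, bool, bool, Dict[str, Any]]


class StubEnv:
    """Hypothetical stand-in for a Gymnasium environment: a 1-D track with 4 cells."""

    def __init__(self) -> None:
        self.position = 0

    def step(self, action: int) -> StepResult:
        # Mirrors the Gymnasium step() return shape:
        # (observation, reward, terminated, truncated, info).
        if action == 2:      # move right
            self.position = min(self.position + 1, 3)
        elif action == 0:    # move left
            self.position = max(self.position - 1, 0)
        terminated = self.position == 3
        return self.position, float(terminated), terminated, False, {}


class SessionControllerSketch:
    """Illustrative only; the real SessionController is not shown on this page."""

    def __init__(self, env: StubEnv) -> None:
        self._env = env
        self._listeners: List[Callable[[Dict[str, Any]], None]] = []

    def connect_step_processed(self, listener: Callable[[Dict[str, Any]], None]) -> None:
        # Plays the role of connecting a slot to the step_processed signal.
        self._listeners.append(listener)

    def perform_human_action(self, action: int) -> None:
        self._apply_action(action)

    def _apply_action(self, action: int) -> None:
        obs, reward, terminated, truncated, info = self._env.step(action)
        payload = {"observation": obs, "reward": reward,
                   "terminated": terminated, "truncated": truncated}
        for listener in self._listeners:
            listener(payload)  # e.g. hand the payload to a renderer


rendered: List[Dict[str, Any]] = []
controller = SessionControllerSketch(StubEnv())
controller.connect_step_processed(rendered.append)
controller.perform_human_action(2)  # one "Arrow Right" keypress
```

After the single call, `rendered` holds one payload, which is the point at which a renderer would redraw the view.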
Control Modes
One ControlMode value drives human input in single-agent
environments:
| Mode | Behaviour |
|---|---|
| | All actions come from the keyboard. The environment waits for human input before advancing. |
Example: Playing FrozenLake
FrozenLake is a turn-based grid-world environment with 4 discrete actions. It uses shortcut-based input mode because every action is a single keypress and there is no need for simultaneous key combinations.
Key mappings (shortcut-based)
| Key | Action Index | Effect |
|---|---|---|
| Arrow Left / A | 0 | Move left |
| Arrow Down / S | 1 | Move down |
| Arrow Right / D | 2 | Move right |
| Arrow Up / W | 3 | Move up |
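The mapping in this table amounts to a plain lookup from key to action index. The sketch below assumes keys are identified by simple strings. A Qt application would use key codes or `QKeySequence` objects instead, and the names `FROZEN_LAKE_SHORTCUTS` and `action_for_key` are hypothetical.

```python
from typing import Dict, Optional

# Key name -> FrozenLake action index, copied from the table above.
FROZEN_LAKE_SHORTCUTS: Dict[str, int] = {
    "Left": 0, "A": 0,    # move left
    "Down": 1, "S": 1,    # move down
    "Right": 2, "D": 2,   # move right
    "Up": 3, "W": 3,      # move up
}


def action_for_key(key: str) -> Optional[int]:
    """Return the action index for a key, or None if the key is not bound."""
    return FROZEN_LAKE_SHORTCUTS.get(key)


right_action = action_for_key("D")    # same action as the Right arrow
unbound_action = action_for_key("Q")  # not bound, so no action is sent
```

Returning `None` for an unbound key lets the caller skip the step entirely, which suits a turn-based environment that only advances on valid input.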
Renderer: GridRendererStrategy renders the FrozenLake map as a
tile grid using FrozenLakeAssets. The elf sprite moves from tile to
tile as the agent position updates.
Walkthrough
1. The player presses Right (or D).
2. The QShortcut fires and calls SessionController.perform_human_action(2).
3. SessionController checks that the control mode allows human input, then calls _apply_action(2).
4. The adapter calls env.step(2), which returns a new observation (the elf moved one tile right), reward, and termination flag.
5. SessionController emits the step_processed signal.
6. The RendererRegistry passes the new observation to GridRendererStrategy.render(), which updates the elf sprite position on the tile map.
7. The player sees the elf in its new position and presses the next key.
Example: Playing Atari
Atari games are real-time environments with up to 18 discrete actions that include directional movement, fire, and all direction+fire combinations. They use state-based input mode to support simultaneous key presses (e.g., Up+Space = UPFIRE).
Key handling (state-based)
| Keys Held | Action Index | Effect |
|---|---|---|
| (none) | 0 | NOOP |
| Space | 1 | FIRE |
| Arrow Up | 2 | UP |
| Arrow Right | 3 | RIGHT |
| Arrow Left | 4 | LEFT |
| Arrow Down | 5 | DOWN |
| Arrow Up + Arrow Right | 6 | UPRIGHT |
| Arrow Up + Arrow Left | 7 | UPLEFT |
| Arrow Down + Arrow Right | 8 | DOWNRIGHT |
| Arrow Down + Arrow Left | 9 | DOWNLEFT |
| Space + Arrow Up | 10 | UPFIRE |
| Space + Arrow Right | 11 | RIGHTFIRE |
| Space + Arrow Left | 12 | LEFTFIRE |
| Space + Arrow Down | 13 | DOWNFIRE |
Resolver: AleKeyCombinationResolver inspects the set of
currently pressed keys and returns the correct composite action index.
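One way to picture such a resolver is a dictionary keyed by `frozenset`, so that the order in which keys were pressed does not matter. The sketch below is not the actual `AleKeyCombinationResolver`. The string key names, the `ALE_COMBOS` table, and the fallback to NOOP for unlisted combinations are assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Iterable

# Held-key set -> ALE action index, following the key handling table.
ALE_COMBOS: Dict[FrozenSet[str], int] = {
    frozenset(): 0,                    # NOOP
    frozenset({"Space"}): 1,           # FIRE
    frozenset({"Up"}): 2,              # UP
    frozenset({"Right"}): 3,           # RIGHT
    frozenset({"Left"}): 4,            # LEFT
    frozenset({"Down"}): 5,            # DOWN
    frozenset({"Up", "Right"}): 6,     # UPRIGHT
    frozenset({"Up", "Left"}): 7,      # UPLEFT
    frozenset({"Down", "Right"}): 8,   # DOWNRIGHT
    frozenset({"Down", "Left"}): 9,    # DOWNLEFT
    frozenset({"Space", "Up"}): 10,    # UPFIRE
    frozenset({"Space", "Right"}): 11, # RIGHTFIRE
    frozenset({"Space", "Left"}): 12,  # LEFTFIRE
    frozenset({"Space", "Down"}): 13,  # DOWNFIRE
}


def resolve(pressed: Iterable[str]) -> int:
    """Map the held keys to an action index.

    Assumption: combinations missing from the table (for example
    Up + Down) fall back to NOOP; the real resolver may behave differently.
    """
    return ALE_COMBOS.get(frozenset(pressed), 0)


up_fire = resolve({"Up", "Space"})
idle = resolve(set())
```

Because the lookup key is a set, `resolve(["Space", "Up"])` and `resolve(["Up", "Space"])` give the same result.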
Renderer: RgbRendererStrategy displays each game frame as a
scaled RGB image.
Walkthrough
1. The player holds Up and presses Space simultaneously.
2. HumanInputController.eventFilter tracks both keys in _pressed_keys.
3. AleKeyCombinationResolver.resolve({Up, Space}) returns action index 10 (UPFIRE).
4. perform_human_action(10) is called on SessionController.
5. The adapter calls env.step(10); the Atari emulator advances one frame and returns an RGB observation.
6. RgbRendererStrategy.render() converts the NumPy array to a QPixmap and paints it in the render tab.
7. The loop repeats at the environment's native frame rate.
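The state-based part of this walkthrough, keeping a set of currently held keys, can be sketched independently of Qt. `PressedKeyTracker` and its method names are hypothetical. In the real controller this bookkeeping happens inside `HumanInputController.eventFilter`, whose details are not shown here.

```python
from typing import FrozenSet, Set


class PressedKeyTracker:
    """Hypothetical sketch: tracks which keys are currently held down."""

    def __init__(self) -> None:
        self._pressed_keys: Set[str] = set()

    def key_pressed(self, key: str) -> None:
        # Adding to a set is idempotent, so repeated press events
        # for a key that is already held change nothing.
        self._pressed_keys.add(key)

    def key_released(self, key: str) -> None:
        # discard() does not raise if the key was never recorded.
        self._pressed_keys.discard(key)

    def snapshot(self) -> FrozenSet[str]:
        """Immutable copy of the held keys, suitable for a resolver lookup."""
        return frozenset(self._pressed_keys)


tracker = PressedKeyTracker()
tracker.key_pressed("Up")
tracker.key_pressed("Space")
held_both = tracker.snapshot()      # both keys held: resolves to UPFIRE
tracker.key_released("Space")
held_up_only = tracker.snapshot()   # only Up held: resolves to UP
```

Taking a snapshot once per frame, rather than reacting to each key event, is what lets the real-time loop send an action even when no key changes between frames.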