Single-Agent Human Control

In single-agent mode, one human player controls the environment using the system keyboard. This page traces the full journey of a keypress from the physical keyboard to the updated render view.

End-to-End Flow

        sequenceDiagram
    participant User
    participant HIC as HumanInputController
    participant SC as SessionController
    participant Adapter
    participant Env as Gymnasium Environment
    participant RR as RendererRegistry
    participant RV as Render View

    User->>HIC: keyPress (e.g., Arrow Right in FrozenLake)
    HIC->>SC: perform_human_action(action=2)
    SC->>Adapter: _apply_action(2)
    Adapter->>Env: env.step(2)
    Env-->>Adapter: obs, reward, terminated, truncated, info
    Adapter-->>SC: step result
    SC->>SC: step_processed signal
    SC-->>RR: render payload
    RR->>RV: strategy.render(payload)
    RV-->>User: Updated visual display
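
The call chain in the diagram can be sketched in plain Python. This is a minimal illustration, not the project's implementation: `StubEnv`, `StubAdapter`, `SessionControllerSketch`, and `connect_step_processed` are hypothetical stand-ins, and a plain callback list replaces the Qt `step_processed` signal. Only the method names `perform_human_action` and `_apply_action` come from this page.

```python
from typing import Any, Callable, Dict, List, Tuple

# Gymnasium's step() returns (obs, reward, terminated, truncated, info).
StepResult = Tuple[Any, float, bool, bool, Dict[str, Any]]


class StubEnv:
    """Hypothetical stand-in for a Gymnasium environment: a 1-D track of length 3."""

    def __init__(self) -> None:
        self.position = 0

    def step(self, action: int) -> StepResult:
        if action == 2:  # treat 2 as "move right", as in the FrozenLake table
            self.position += 1
        terminated = self.position >= 3
        reward = 1.0 if terminated else 0.0
        return self.position, reward, terminated, False, {}


class StubAdapter:
    """Hypothetical adapter: forwards the action to the wrapped environment."""

    def __init__(self, env: StubEnv) -> None:
        self._env = env

    def step(self, action: int) -> StepResult:
        return self._env.step(action)


class SessionControllerSketch:
    """Assumed shape of the controller; callbacks stand in for the Qt signal."""

    def __init__(self, adapter: StubAdapter) -> None:
        self._adapter = adapter
        self._step_processed: List[Callable[[StepResult], None]] = []

    def connect_step_processed(self, callback: Callable[[StepResult], None]) -> None:
        self._step_processed.append(callback)

    def perform_human_action(self, action: int) -> None:
        self._apply_action(action)

    def _apply_action(self, action: int) -> None:
        result = self._adapter.step(action)
        for callback in self._step_processed:  # "emit" step_processed
            callback(result)


# The "renderer" here simply records every payload it receives.
payloads: List[StepResult] = []
controller = SessionControllerSketch(StubAdapter(StubEnv()))
controller.connect_step_processed(payloads.append)
controller.perform_human_action(2)
```

After the single call, `payloads` holds one step result, which is what a real renderer would receive as its render payload.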

Control Modes

One ControlMode value drives human input in single-agent environments:

Mode	Behaviour
`HUMAN_ONLY`	All actions come from the keyboard. The environment waits for human input before advancing.
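
The check that the control mode allows human input might look like the sketch below. Only `ControlMode` and `HUMAN_ONLY` appear on this page; the `AGENT_ONLY` member and the `allows_human_input` helper are hypothetical and exist only to give the check something to reject.

```python
from enum import Enum, auto


class ControlMode(Enum):
    HUMAN_ONLY = auto()
    AGENT_ONLY = auto()  # hypothetical member, included only for contrast


def allows_human_input(mode: ControlMode) -> bool:
    """Return True when keyboard actions should be forwarded to the environment."""
    return mode is ControlMode.HUMAN_ONLY
```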

Example: Playing FrozenLake

FrozenLake is a turn-based grid-world environment with four discrete actions. It uses the shortcut-based input mode: every action is a single keypress, so simultaneous key combinations are not needed.

Key mappings (shortcut-based)

Key	Action Index	Effect
Arrow Left / A	0	Move left
Arrow Down / S	1	Move down
Arrow Right / D	2	Move right
Arrow Up / W	3	Move up
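
The table above amounts to a lookup from key to action index. The sketch below uses plain strings as key names to stay self-contained; a Qt application would more likely key on `Qt.Key` values or `QKeySequence` objects. `FROZEN_LAKE_KEYMAP` and `action_for_key` are hypothetical names.

```python
from typing import Dict, Optional

# Key name -> FrozenLake action index, mirroring the table above.
FROZEN_LAKE_KEYMAP: Dict[str, int] = {
    "Left": 0, "A": 0,   # move left
    "Down": 1, "S": 1,   # move down
    "Right": 2, "D": 2,  # move right
    "Up": 3, "W": 3,     # move up
}


def action_for_key(key: str) -> Optional[int]:
    """Return the action index for a key, or None if the key is unmapped."""
    return FROZEN_LAKE_KEYMAP.get(key)
```

Returning `None` for unmapped keys lets the caller ignore stray keypresses instead of stepping the environment.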

Renderer: GridRendererStrategy renders the FrozenLake map as a tile grid using FrozenLakeAssets. The elf sprite moves from tile to tile as the agent position updates.

Walkthrough

1. The player presses Right (or D).
2. The QShortcut fires and calls SessionController.perform_human_action(2).
3. SessionController checks that the control mode allows human input, then calls _apply_action(2).
4. The adapter calls env.step(2), which returns a new observation (the elf has moved one tile right), a reward, and the termination and truncation flags.
5. SessionController emits the step_processed signal.
6. The RendererRegistry passes the new observation to GridRendererStrategy.render(), which updates the elf sprite position on the tile map.
7. The player sees the elf in its new position and presses the next key.
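
To make the observation in this walkthrough concrete: FrozenLake reports the agent position as a single integer, `row * ncols + col`. The sketch below assumes the default 4x4 map with slipping disabled, so each action moves exactly one tile. `move` and `obs_to_tile` are hypothetical helpers, not part of Gymnasium or this project.

```python
from typing import Tuple


def obs_to_tile(obs: int, ncols: int = 4) -> Tuple[int, int]:
    """Convert a FrozenLake observation into (row, col) for sprite placement."""
    return divmod(obs, ncols)


def move(obs: int, action: int, nrows: int = 4, ncols: int = 4) -> int:
    """Apply one deterministic move, clamping at the grid edges."""
    row, col = divmod(obs, ncols)
    if action == 0:    # left
        col = max(col - 1, 0)
    elif action == 1:  # down
        row = min(row + 1, nrows - 1)
    elif action == 2:  # right
        col = min(col + 1, ncols - 1)
    elif action == 3:  # up
        row = max(row - 1, 0)
    return row * ncols + col


start = 0                      # top-left tile
after_right = move(start, 2)   # the player pressed Right (or D)
tile = obs_to_tile(after_right)
```

Here `tile` is the grid cell where a tile-based renderer would draw the elf after the keypress.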

Example: Playing Atari

Atari games are real-time environments with up to 18 discrete actions that include directional movement, fire, and all direction+fire combinations. They use state-based input mode to support simultaneous key presses (e.g., Up+Space = UPFIRE).

Key handling (state-based)

Keys Held	Action Index	Effect
(none)	0	NOOP
Space	1	FIRE
Arrow Up	2	UP
Arrow Right	3	RIGHT
Arrow Left	4	LEFT
Arrow Down	5	DOWN
Arrow Up + Arrow Right	6	UPRIGHT
Arrow Up + Arrow Left	7	UPLEFT
Arrow Down + Arrow Right	8	DOWNRIGHT
Arrow Down + Arrow Left	9	DOWNLEFT
Space + Arrow Up	10	UPFIRE
Space + Arrow Right	11	RIGHTFIRE
Space + Arrow Left	12	LEFTFIRE
Space + Arrow Down	13	DOWNFIRE

Resolver: AleKeyCombinationResolver inspects the set of currently pressed keys and returns the correct composite action index.
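
One way such a resolver could work is a lookup keyed on the exact set of held keys. This is a sketch under assumptions, not the actual `AleKeyCombinationResolver`: keys are plain strings, only the combinations from the table above are included, and falling back to NOOP for unlisted combinations is a choice made here for illustration.

```python
from typing import Dict, FrozenSet, Iterable

# Held-key set -> ALE action index, mirroring the table above.
_COMBINATIONS: Dict[FrozenSet[str], int] = {
    frozenset(): 0,                     # NOOP
    frozenset({"Space"}): 1,            # FIRE
    frozenset({"Up"}): 2,               # UP
    frozenset({"Right"}): 3,            # RIGHT
    frozenset({"Left"}): 4,             # LEFT
    frozenset({"Down"}): 5,             # DOWN
    frozenset({"Up", "Right"}): 6,      # UPRIGHT
    frozenset({"Up", "Left"}): 7,       # UPLEFT
    frozenset({"Down", "Right"}): 8,    # DOWNRIGHT
    frozenset({"Down", "Left"}): 9,     # DOWNLEFT
    frozenset({"Space", "Up"}): 10,     # UPFIRE
    frozenset({"Space", "Right"}): 11,  # RIGHTFIRE
    frozenset({"Space", "Left"}): 12,   # LEFTFIRE
    frozenset({"Space", "Down"}): 13,   # DOWNFIRE
}


def resolve(pressed_keys: Iterable[str]) -> int:
    """Return the action index for the held keys; unlisted sets map to NOOP (assumed)."""
    return _COMBINATIONS.get(frozenset(pressed_keys), 0)
```

Keying on a `frozenset` makes the lookup independent of the order in which the keys were pressed.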

Renderer: RgbRendererStrategy displays each game frame as a scaled RGB image.

Walkthrough

1. The player holds Up and presses Space simultaneously.
2. HumanInputController.eventFilter tracks both keys in _pressed_keys.
3. AleKeyCombinationResolver.resolve({Up, Space}) returns action index 10 (UPFIRE).
4. perform_human_action(10) is called on SessionController.
5. The adapter calls env.step(10); the Atari emulator advances one frame and returns an RGB observation.
6. RgbRendererStrategy.render() converts the NumPy array to a QPixmap and paints it in the render tab.
7. The loop repeats at the environment’s native frame rate.
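
The state-based loop in this walkthrough can be sketched without Qt. In the sketch below, `PressedKeyTracker`, `run_frames`, and `tiny_resolve` are hypothetical; press and release events are passed in as tuples rather than arriving through `eventFilter`, and the resolver covers only the four key sets this example needs.

```python
from typing import Callable, FrozenSet, Iterable, List, Set, Tuple

KeyEvent = Tuple[str, str]  # ("press" | "release", key name)


class PressedKeyTracker:
    """Keeps the set of currently held keys, as _pressed_keys does in the walkthrough."""

    def __init__(self) -> None:
        self._pressed_keys: Set[str] = set()

    def handle(self, kind: str, key: str) -> None:
        if kind == "press":
            self._pressed_keys.add(key)
        elif kind == "release":
            self._pressed_keys.discard(key)

    def snapshot(self) -> FrozenSet[str]:
        return frozenset(self._pressed_keys)


def tiny_resolve(keys: FrozenSet[str]) -> int:
    """Minimal resolver for this example; unlisted sets fall back to NOOP (assumed)."""
    table = {
        frozenset(): 0,
        frozenset({"Space"}): 1,
        frozenset({"Up"}): 2,
        frozenset({"Up", "Space"}): 10,
    }
    return table.get(keys, 0)


def run_frames(
    events_per_frame: Iterable[List[KeyEvent]],
    resolve: Callable[[FrozenSet[str]], int],
) -> List[int]:
    """Apply each frame's key events, then resolve one action for that frame."""
    tracker = PressedKeyTracker()
    actions: List[int] = []
    for events in events_per_frame:
        for kind, key in events:
            tracker.handle(kind, key)
        actions.append(resolve(tracker.snapshot()))
    return actions


actions = run_frames(
    [
        [("press", "Up"), ("press", "Space")],  # both keys go down
        [],                                     # keys still held
        [("release", "Space")],                 # only Up remains
        [("release", "Up")],                    # nothing held
    ],
    tiny_resolve,
)
```

Because the action is derived from the held-key state on every frame, a key that stays down keeps producing its action without further key events, which is what separates this mode from the shortcut-based one.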