Keyboard Input System
The keyboard input system translates physical key presses into
environment actions. It is implemented primarily in
gym_gui.controllers.human_input and covers two input modes,
per-family key combination resolvers, shortcut mappings, and
multi-keyboard support for multi-agent play.
Two Input Modes
MOSAIC offers two fundamentally different ways to process keyboard input.
The mode is selectable in the Game Configuration panel and is stored as
an InputMode enum value.
| Mode | Mechanism | Best For |
|---|---|---|
| Shortcut-based | Qt QShortcut | Turn-based games: FrozenLake, MiniGrid, Chess, MiniHack, Jumanji puzzles. |
| State-based | Pressed-key set resolved by a KeyCombinationResolver | Real-time arcade games: Procgen, Atari, ViZDoom, Box2D, MeltingPot. |
sequenceDiagram
participant User
participant Widget as Qt Widget
participant QS as QShortcut
participant HIC as HumanInputController
participant Res as KeyCombinationResolver
participant SC as SessionController
participant Adapt as Adapter
alt Shortcut-Based
User->>Widget: keyPressEvent
Widget->>QS: triggered signal
QS->>SC: perform_human_action(action)
else State-Based
User->>Widget: keyPressEvent
Widget->>HIC: eventFilter (KeyPress)
HIC->>Res: resolve(pressed_keys)
Res-->>HIC: action index
HIC->>SC: perform_human_action(action)
end
SC->>Adapt: _apply_action(action)
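The state-based branch of the diagram can be approximated without Qt. The sketch below is not MOSAIC's HumanInputController: `PressedKeyTracker`, its method names, and the string key names are hypothetical, and the `resolve` and `perform_action` callables stand in for the resolver and SessionController calls shown in the diagram.

```python
from typing import Callable, FrozenSet, Optional, Set

# Hypothetical signature for the resolver step: pressed keys in, action index out.
Resolve = Callable[[FrozenSet[str]], Optional[int]]


class PressedKeyTracker:
    """Illustrative stand-in for the state-based input path (assumed design)."""

    def __init__(self, resolve: Resolve, perform_action: Callable[[int], None]) -> None:
        self._pressed: Set[str] = set()
        self._resolve = resolve
        self._perform_action = perform_action

    def key_press(self, key: str) -> None:
        # Remember the key, then resolve the whole set of held keys.
        self._pressed.add(key)
        action = self._resolve(frozenset(self._pressed))
        if action is not None:
            self._perform_action(action)

    def key_release(self, key: str) -> None:
        # Releasing a key only updates the held-key state in this sketch.
        self._pressed.discard(key)


def demo_resolve(keys: FrozenSet[str]) -> Optional[int]:
    # Toy rule: Up+Space is action 1, Up alone is action 0.
    if {"Up", "Space"} <= keys:
        return 1
    if "Up" in keys:
        return 0
    return None


performed = []
tracker = PressedKeyTracker(demo_resolve, performed.append)
tracker.key_press("Up")
tracker.key_press("Space")
tracker.key_release("Up")
```

Because the tracker resolves the full set of held keys on every press, simultaneous keys can map to a single combined action, which is the property that makes this mode suitable for real-time games.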
Key Combination Resolvers
In state-based mode, a KeyCombinationResolver subclass inspects the
set of currently pressed keys and returns the appropriate action index.
Each resolver targets a specific environment family.
| Resolver | Environments | Actions |
|---|---|---|
| | MiniGrid (7 actions) | Left / Right / Forward / Pickup / Drop / Toggle / Done |
| | mosaic_multigrid (8 actions) | Noop / Left / Right / Forward / Pickup / Drop / Toggle / Done |
| | INI multigrid (7 actions) | Left / Right / Forward / Pickup / Drop / Toggle / Done |
| | RWARE warehouse (5 actions) | Noop / Forward / Left / Right / Toggle |
| | MeltingPot (8 to 11 actions) | Noop / Forward / Backward / Strafe / Turn / Interact / Fire |
| | Procgen (15 actions) | Noop + 8 directions + 6 action buttons |
| | Atari / ALE (18 actions) | 4 dirs + fire + all direction-fire combos |
| | LunarLander (4 actions) | Idle / Left engine / Main engine / Right engine |
| | CarRacing (5 actions) | Coast / Right / Left / Accel / Brake |
| | BipedalWalker (5 actions) | Neutral / Forward / Back / Crouch / Extend |
| | ViZDoom scenarios | Scenario-specific button sets |
The factory function get_key_combination_resolver(game_id, action_space)
selects the correct resolver based on the GameId enum and the
environment’s action space.
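As a rough illustration of the resolver pattern, the sketch below defines one resolver for LunarLander and a small factory. Only the names KeyCombinationResolver, GameId, and get_key_combination_resolver come from the text above. The class body, the `LunarLanderResolver` name, the `GameId` members, the key bindings, and the use of a plain action count in place of a real action space are all assumptions made so the example runs without Qt or Gymnasium.

```python
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, FrozenSet, Optional, Type


class GameId(Enum):
    # Hypothetical subset; the real enum's members are not shown in this page.
    LUNAR_LANDER = "LunarLander"
    CAR_RACING = "CarRacing"


class KeyCombinationResolver(ABC):
    """Maps the set of currently pressed keys to an action index."""

    @abstractmethod
    def resolve(self, pressed_keys: FrozenSet[str]) -> Optional[int]:
        ...


class LunarLanderResolver(KeyCombinationResolver):
    # Action order follows the table: Idle / Left engine / Main engine / Right engine.
    IDLE, LEFT_ENGINE, MAIN_ENGINE, RIGHT_ENGINE = range(4)

    def resolve(self, pressed_keys: FrozenSet[str]) -> Optional[int]:
        # Assumed bindings and priority: main engine wins over side engines.
        if pressed_keys & {"Up", "W"}:
            return self.MAIN_ENGINE
        if pressed_keys & {"Left", "A"}:
            return self.LEFT_ENGINE
        if pressed_keys & {"Right", "D"}:
            return self.RIGHT_ENGINE
        return self.IDLE


# Registry of (resolver class, expected action count); contents are illustrative.
_RESOLVERS: Dict[GameId, tuple] = {GameId.LUNAR_LANDER: (LunarLanderResolver, 4)}


def get_key_combination_resolver(
    game_id: GameId, n_actions: int
) -> Optional[KeyCombinationResolver]:
    """Sketch of the factory: pick a resolver by game and check the action count."""
    entry = _RESOLVERS.get(game_id)
    if entry is None:
        return None  # no resolver registered for this game in the sketch
    resolver_cls, expected = entry
    if n_actions != expected:
        raise ValueError(f"{game_id.name}: expected {expected} actions, got {n_actions}")
    return resolver_cls()
```

A caller would pass the frozen set of held keys on each input event, for example `get_key_combination_resolver(GameId.LUNAR_LANDER, 4).resolve(frozenset({"Up"}))`.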
Shortcut Mappings
In shortcut-based mode, MOSAIC registers QShortcut objects using
the ShortcutMapping dataclass, which pairs a QKeySequence with
an action index and a human-readable label. The helper function
_mapping(key_str, action, label) constructs these entries.
Each environment family defines its own mapping dictionary:
- _TOY_TEXT_MAPPINGS: FrozenLake, Taxi, CliffWalking, Blackjack
- _MINIG_GRID_MAPPINGS: MiniGrid family
- _MULTIGRID_MAPPINGS: MOSAIC MultiGrid and INI MultiGrid
- _BOX_2D_MAPPINGS: LunarLander, CarRacing, BipedalWalker
- _VIZDOOM_MAPPINGS: ViZDoom scenarios
- _MINIHACK_MAPPINGS: MiniHack dungeon crawling
- _NETHACK_MAPPINGS: NetHack challenge
- _CRAFTER_MAPPINGS: Crafter survival
- _BABAISAI_MAPPINGS: BabaIsAI puzzles
- _PROCGEN_MAPPINGS: Procgen games
- _ALE_MAPPINGS: Atari / ALE
- _JUMANJI_MAPPINGS: Jumanji logic games
Common Key Reference
The table below summarises keys shared across most environment families.
| Key | Common Action |
|---|---|
| Arrow Up / W | Move forward / Up |
| Arrow Down / S | Move backward / Down |
| Arrow Left / A | Turn or move left |
| Arrow Right / D | Turn or move right |
| Space | Fire / Interact / Pickup |
| E / Enter | Toggle / Use / Interact |
| G | Pickup (grid worlds) |
| H | Drop (grid worlds) |
| Q | Done / Noop |
Multi-Keyboard Support
For multi-agent environments where multiple humans each control a separate agent, MOSAIC supports routing physical USB keyboards to different agents via Linux evdev. A USB hub (4+ ports) with one keyboard per agent lets each player press the same keys (WASD, Space, etc.) on their own keyboard while only their agent responds.
On Linux, X11 merges all keyboards into a single virtual device, so
Qt’s QInputDevice.systemId() cannot distinguish them. MOSAIC
bypasses X11 entirely: EvdevKeyboardMonitor, a background QThread,
reads raw events directly from each keyboard’s /dev/input/eventX
device node and emits per-device key_pressed / key_released signals.
For the full architecture, data flow, hardware requirements, setup instructions, and troubleshooting guide, see the dedicated page: Multi-Keyboard Support (Evdev).
Tip
For multi-agent environments, MOSAIC automatically forces state-based input mode because shortcut-based mode conflicts with per-device keyboard routing.
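The override described in this tip amounts to a small rule, sketched below. The InputMode member names and the `effective_input_mode` helper are assumptions for illustration. The text names only the enum itself and the behavior of forcing state-based mode.

```python
from enum import Enum


class InputMode(Enum):
    # Member names and values are assumed; only the enum name appears in the docs.
    SHORTCUT_BASED = "shortcut_based"
    STATE_BASED = "state_based"


def effective_input_mode(requested: InputMode, is_multi_agent: bool) -> InputMode:
    """Return the mode to use; multi-agent play always gets state-based input."""
    if is_multi_agent:
        return InputMode.STATE_BASED
    return requested
```

Centralizing the rule in one function means the Game Configuration panel can keep storing the user's requested mode while the session applies the override only when needed.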