Multi-Keyboard Support (Evdev)¶
MOSAIC supports true multi-keyboard input so that multiple human players can each control a separate agent in the same multi-agent environment. Because the standard Linux display server merges all keyboards into one virtual device, MOSAIC reads each physical keyboard independently via the Linux evdev subsystem.
Why Multiple Keyboards?¶
Multi-agent environments such as MultiGrid-Soccer (2 v 2, 4 agents), Overcooked (2 cooperating chefs), MeltingPot (up to 16 agents), or RWARE (2 to 8 warehouse robots) are designed for simultaneous human play. Each player needs a dedicated keyboard so that their key presses only move their own agent.
A USB hub with at least 4 ports is recommended because:
Most laptops expose only 1 to 3 USB ports, which are already occupied by mouse, headset, or other peripherals.
A 4-port hub lets you connect 4 inexpensive USB keyboards (one per agent) to a single host USB port.
Larger hubs (7-port) support up to 7 human players for environments like MeltingPot that can exceed 4 agents.
Tested hardware combinations:
| Keyboard | Vendor ID | Product ID | USB Port |
|---|---|---|---|
| Logitech USB Keyboard | 046d | c31c | Hub port 1 |
| Logitech USB Keyboard | 046d | c34b | Hub port 2 |
| SIGMACHIP USB Keyboard | 1c4f | 0002 | Hub port 3 |
| Lenovo Black Silk | 17ef | 602d | Hub port 4 |
All four keyboards share the same key bindings (WASD for movement, Space for interact, etc.). The evdev layer distinguishes them by the physical USB port each one is plugged into, not by which keys are pressed.
The X11 Problem¶
On Linux with X11 (the most common display server on Ubuntu 22.04), all
connected keyboards are merged into a single logical device called the
Virtual core keyboard. Every key event Qt sees comes from this
virtual device, so QInputDevice.systemId() returns the same ID
regardless of which physical keyboard was pressed.
Three approaches were evaluated:
| Approach | Details | Result |
|---|---|---|
| Qt QInputDevice.systemId() | Under X11, the display server merges all physical keyboards into a single virtual core device. As a result, QInputDevice.systemId() returns the same ID for every keyboard. | Rejected |
| XInput2 native event filter | The XInput2 extension can report distinct device identifiers for pointing devices; however, Qt consumes keyboard events at the toolkit level before they reach the application’s native event filter, preventing per-device attribution for key presses. | Rejected |
| evdev (direct device reads) | Opens each keyboard's /dev/input event device and reads key events from it directly, independent of the display server. | Adopted |
Evidence is preserved in the test suite:
test_qt_keyboard_device_id.py: all keyboards show systemId=3
test_xinput2_multi_keyboard.py: mouse events work, keyboard events do not
test_usb_hub_keyboards.py: evdev correctly identifies per-device key presses
Architecture¶
graph TD
KB1[USB Keyboard 1] --> HUB[USB Hub]
KB2[USB Keyboard 2] --> HUB
KB3[USB Keyboard 3] --> HUB
KB4[USB Keyboard 4] --> HUB
HUB --> EVDEV["/dev/input/eventX"]
EVDEV --> MON[EvdevKeyboardMonitor<br/>QThread]
MON --> |key_pressed signal| HIC[HumanInputController]
HIC --> |device_path → agent_id| ROUTE{Agent Router}
ROUTE --> A0[Agent 0 pressed_keys]
ROUTE --> A1[Agent 1 pressed_keys]
ROUTE --> A2[Agent 2 pressed_keys]
ROUTE --> A3[Agent 3 pressed_keys]
A0 --> RES0[KeyCombinationResolver]
A1 --> RES1[KeyCombinationResolver]
A2 --> RES2[KeyCombinationResolver]
A3 --> RES3[KeyCombinationResolver]
RES0 --> SC[SessionController]
RES1 --> SC
RES2 --> SC
RES3 --> SC
style KB1 fill:#4a90d9,stroke:#2e5a87,color:#fff
style KB2 fill:#4a90d9,stroke:#2e5a87,color:#fff
style KB3 fill:#4a90d9,stroke:#2e5a87,color:#fff
style KB4 fill:#4a90d9,stroke:#2e5a87,color:#fff
style HUB fill:#4a90d9,stroke:#2e5a87,color:#fff
style MON fill:#ff7f50,stroke:#cc5500,color:#fff
style HIC fill:#50c878,stroke:#2e8b57,color:#fff
style SC fill:#50c878,stroke:#2e8b57,color:#fff
style ROUTE fill:#9370db,stroke:#6a0dad,color:#fff
Data Flow¶
The complete path of a single key press from hardware to game action:
Player presses W on Keyboard 2 (plugged into hub port 2).
The Linux kernel writes a 24-byte input_event struct to /dev/input/event17.
EvdevKeyboardMonitor (running on a background QThread) wakes from select() and reads the raw bytes.
The struct is unpacked:
struct input_event {
    struct timeval time;  /* 16 bytes */
    __u16 type;           /* 2 bytes (EV_KEY = 1) */
    __u16 code;           /* 2 bytes (KEY_W = 17) */
    __s32 value;          /* 4 bytes (1 = press) */
};                        /* Total: 24 bytes */
The monitor emits key_pressed(device_path, keycode=17, timestamp).
HumanInputController receives the signal and looks up device_path in _evdev_device_to_agent to find agent_id (e.g., agent_1).
linux_keycode_to_qt_key(17) converts Linux keycode 17 to Qt.Key.Key_W, then int() converts the enum to an integer (critical: the pressed-keys set stores plain integers for compatibility with the KeyCombinationResolver).
The key is added to _agent_pressed_keys["agent_1"].
The KeyCombinationResolver for this environment inspects agent 1’s pressed-key set and returns an action index (e.g., FORWARD = 2).
perform_human_action(2) is emitted for agent 1 only.
The MultiGrid environment receives agent 1’s action; all other agents receive STILL (no-op) unless their keyboards were also pressed.
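The unpacking step can be illustrated with a minimal sketch. It assumes 64-bit Linux, where the native layout of input_event matches the format string 'llHHi'; the helper name decode_event is hypothetical and not part of MOSAIC. A synthetic packet is built with struct.pack so the example runs without access to /dev/input.

```python
import struct

# Native layout of struct input_event on 64-bit Linux (assumption):
# two C longs (tv_sec, tv_usec), two unsigned shorts (type, code), one int (value).
EVENT_FORMAT = "llHHi"
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)  # 24 on 64-bit Linux

EV_KEY = 1   # event type for key presses and releases
KEY_W = 17   # Linux keycode for the W key


def decode_event(data: bytes):
    """Hypothetical helper: return (timestamp, code, value) for EV_KEY events, else None."""
    tv_sec, tv_usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, data)
    if ev_type != EV_KEY:
        return None  # ignore sync, misc, and other non-key events
    return tv_sec + tv_usec / 1_000_000, code, value


# Build a synthetic "W pressed" packet instead of reading a real device.
packet = struct.pack(EVENT_FORMAT, 1700000000, 250000, EV_KEY, KEY_W, 1)
decoded = decode_event(packet)
```

Here decoded carries the keycode 17 and the value 1 (press), which is the information the monitor forwards in its key_pressed signal.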
Implementation Components¶
EvdevKeyboardMonitor¶
A QObject that runs on a dedicated QThread. Located at
gym_gui/controllers/evdev_keyboard_monitor.py.
Signals:
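| Signal | Description |
|---|---|
| key_pressed | Emitted on an EV_KEY press event; carries device_path, keycode, and timestamp. |
| key_released | Emitted on an EV_KEY release event. |
| | A |
| | A device was removed (hot-unplug or read error). |
| | Human-readable error message (e.g., permission denied). |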
Key methods:
discover_keyboards(): scans /dev/input/by-path/*kbd and /dev/input/by-id/*kbd, deduplicates by real path, reads device names from /proc/bus/input/devices.
add_device(device): opens the event file with O_RDONLY | O_NONBLOCK. On PermissionError, emits an error with instructions to run sudo usermod -a -G input $USER.
start_monitoring(): moves itself to a QThread and enters the _monitor_loop().
_monitor_loop(): calls select() with a 0.5 s timeout across all open file descriptors; dispatches to _process_device_events() for each readable fd.
_process_device_events(fd): reads 24-byte packets, unpacks with struct.unpack('llHHi', data), emits key_pressed or key_released on EV_KEY events.
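The monitor loop described above can be approximated by the following sketch. It is not the MOSAIC implementation: poll_once is a hypothetical name, the Qt signal emission is replaced by a returned list, and a pipe stands in for a keyboard device so the code runs without /dev/input permissions. The 'llHHi' layout again assumes 64-bit Linux.

```python
import os
import select
import struct

EVENT_FORMAT = "llHHi"                      # input_event layout on 64-bit Linux
EVENT_SIZE = struct.calcsize(EVENT_FORMAT)
EV_KEY = 1


def poll_once(fd_to_path, timeout=0.5):
    """Wait up to `timeout` seconds and return decoded key events.

    Each result is (device_path, keycode, value); in the evdev protocol a
    value of 1 is a press, 0 a release, and 2 an auto-repeat.
    """
    events = []
    readable, _, _ = select.select(list(fd_to_path), [], [], timeout)
    for fd in readable:
        data = os.read(fd, EVENT_SIZE)
        if len(data) < EVENT_SIZE:
            continue  # short read: skip rather than unpack garbage
        _sec, _usec, ev_type, code, value = struct.unpack(EVENT_FORMAT, data)
        if ev_type == EV_KEY:
            events.append((fd_to_path[fd], code, value))
    return events


# Demonstration with a pipe in place of a real keyboard device.
read_fd, write_fd = os.pipe()
os.write(write_fd, struct.pack(EVENT_FORMAT, 0, 0, EV_KEY, 30, 1))  # KEY_A press
result = poll_once({read_fd: "keyboard-on-port-1"}, timeout=0.1)
os.close(read_fd)
os.close(write_fd)
```

Because select() watches every file descriptor at once, one thread can serve all keyboards, and the mapping from fd to device path preserves which keyboard produced each event.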
Device Path Strategy¶
MOSAIC uses /dev/input/by-path/*-event-kbd paths because they are
stable across reboots: the path encodes the USB topology (PCI bus,
hub port number), not a volatile event number.
Example paths for 4 keyboards on one hub:
/dev/input/by-path/pci-0000:00:14.0-usb-0:5.1:1.0-event-kbd
/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2:1.0-event-kbd
/dev/input/by-path/pci-0000:00:14.0-usb-0:5.3:1.0-event-kbd
/dev/input/by-path/pci-0000:00:14.0-usb-0:5.4:1.0-event-kbd
/dev/input/by-id/ is not used because two keyboards of the same
make and model produce duplicate by-id names.
/dev/input/eventX numbers change on reboot and are also avoided.
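To show how the topology is encoded in these names, the sketch below extracts the USB port chain from a by-path name. The function usb_port_chain and the regular expression are illustrative assumptions based on the example paths above, not MOSAIC code; other by-path naming variants may not match.

```python
import re

# Matches the "-usb-0:5.2:1.0-event-kbd" tail of a by-path name.
# The captured group is the port chain ("5.2"): root port 5, then hub port 2.
_USB_PORT_RE = re.compile(r"-usb-\d+:(\d+(?:\.\d+)*):\d+\.\d+-event-kbd$")


def usb_port_chain(by_path: str):
    """Hypothetical helper: return the USB port chain as a tuple of ints, or None."""
    match = _USB_PORT_RE.search(by_path)
    if match is None:
        return None
    return tuple(int(part) for part in match.group(1).split("."))


paths = [
    "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2:1.0-event-kbd",
    "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.1:1.0-event-kbd",
]
# Sorting by the numeric chain orders keyboards by physical hub port.
ordered = sorted(paths, key=usb_port_chain)
```

Sorting on the numeric tuple rather than the raw string keeps port 10 after port 9 on larger hubs.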
Keycode Translation¶
gym_gui/controllers/keycode_translation.py maps Linux keycodes
(defined in /usr/include/linux/input-event-codes.h) to
Qt.Key enum values.
LINUX_TO_QT_KEYCODE = {
17: Qt.Key.Key_W, # KEY_W
30: Qt.Key.Key_A, # KEY_A
31: Qt.Key.Key_S, # KEY_S
32: Qt.Key.Key_D, # KEY_D
57: Qt.Key.Key_Space, # KEY_SPACE
103: Qt.Key.Key_Up, # KEY_UP
105: Qt.Key.Key_Left, # KEY_LEFT
106: Qt.Key.Key_Right,# KEY_RIGHT
108: Qt.Key.Key_Down, # KEY_DOWN
# ... 100+ entries covering letters, numbers,
# function keys, numpad, modifiers, etc.
}
After translation, the rest of the input pipeline (resolvers, action mapping) is identical to single-keyboard mode.
KeyboardAssignmentWidget¶
A Qt widget (gym_gui/ui/widgets/keyboard_assignment_widget.py) that
provides the user interface for mapping keyboards to agents:
Auto-detection: discovers all connected USB keyboards via the evdev monitor.
Auto-assign: maps the first N discovered keyboards to agents 0 to N-1 in the order they appear on the USB bus.
Manual override: drag-and-drop or drop-down selection to reassign any keyboard to any agent.
Test mode: press any key and the widget highlights which agent the keyboard controls.
Persistence: assignments are saved to config/keyboard_assignments.yaml and restored on next launch.
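The auto-assign behaviour listed above could look roughly like the following. This is a sketch under assumptions: auto_assign is a hypothetical function, agent identifiers follow the agent_0, agent_1 pattern seen elsewhere on this page, and a plain lexical sort stands in for "order on the USB bus" (which is only accurate while hub port numbers are single digits).

```python
def auto_assign(device_paths, num_agents):
    """Hypothetical helper: map the first `num_agents` keyboards to agent_0..agent_{N-1}."""
    ordered = sorted(device_paths)  # lexical sort; assumed to follow hub port order
    return {f"agent_{i}": path for i, path in enumerate(ordered[:num_agents])}


discovered = [
    "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.3:1.0-event-kbd",
    "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.1:1.0-event-kbd",
    "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2:1.0-event-kbd",
]
assignments = auto_assign(discovered, num_agents=2)
```

With three keyboards and two agents, the keyboard on hub port 3 is left unassigned and remains available for manual override.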
Why State-Based Mode Is Required¶
Multi-agent environments force state-based input mode. This is enforced at three layers to prevent misconfiguration:
| Layer | Mechanism | Effect |
|---|---|---|
| Configuration | | Declares the requirement in code. |
| UI | | User cannot select the wrong mode. |
| Controller | | Catches programmatic errors. |
The technical reason is that shortcut-based mode uses Qt’s global
QShortcut system. When evdev is active, a key press generates
two events:
The evdev monitor correctly routes the key to one agent.
Qt’s QShortcut also fires for all agents (because X11 merged the keyboard into the virtual core device).
The result is that one key press moves every agent instead of just
the intended one. State-based mode avoids this because it polls
per-agent _agent_pressed_keys sets, which are populated exclusively
by evdev signals.
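The per-agent pressed-key sets can be pictured with a small, Qt-free sketch. The class PressedKeyRouter and its method names are hypothetical; it only illustrates the idea that each key event updates the set of exactly one agent, chosen by device path.

```python
from collections import defaultdict

KEY_W = 17  # Linux keycode for W


class PressedKeyRouter:
    """Hypothetical stand-in for the device_path -> agent_id routing step."""

    def __init__(self, device_to_agent):
        self._device_to_agent = dict(device_to_agent)
        self._pressed = defaultdict(set)  # agent_id -> set of held keycodes

    def on_key(self, device_path, keycode, pressed):
        agent_id = self._device_to_agent.get(device_path)
        if agent_id is None:
            return  # event from a keyboard that is not assigned to any agent
        if pressed:
            self._pressed[agent_id].add(keycode)
        else:
            self._pressed[agent_id].discard(keycode)

    def pressed_keys(self, agent_id):
        return frozenset(self._pressed.get(agent_id, ()))


router = PressedKeyRouter({"kbd-port-1": "agent_0", "kbd-port-2": "agent_1"})
router.on_key("kbd-port-2", KEY_W, pressed=True)
```

After this call only agent_1 has W in its set, so a resolver polling the sets would produce an action for agent_1 and a no-op for agent_0.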
System Requirements¶
Platform¶
Linux only: evdev is a Linux kernel interface.
Tested on Ubuntu 22.04 with X11.
Kernel 2.6 or later (evdev has been stable since 2003).
Permissions¶
The user must be a member of the input group to read
/dev/input/event* files:
sudo usermod -a -G input $USER
# Log out and log back in for the change to take effect
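Alternatively, create a udev rule that grants access through a different group (plugdev in this example; the user must belong to that group):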
# /etc/udev/rules.d/99-mosaic-keyboards.rules
KERNEL=="event*", SUBSYSTEM=="input", MODE="0660", GROUP="plugdev"
Hardware¶
A USB hub (4+ ports recommended).
One USB keyboard per agent. Any brand works; keyboards do not need to match.
User Workflow¶
One-Time Setup¶
Plug a USB hub into the host machine.
Connect one keyboard per agent to the hub (e.g., 4 keyboards for a 4-player Soccer match).
Add your user to the input group (see Permissions above).
Log out and back in.
In-Game Usage¶
Launch the MOSAIC GUI.
Select a multi-agent environment (e.g., MultiGrid-Soccer-v0).
Set the number of agents (e.g., 4).
Click Load Environment.
The Keyboard Assignment Widget appears automatically.
Click Auto-Assign; keyboards are mapped to agents in USB-port order.
Click Apply Assignments.
Start the game. Each keyboard controls only its assigned agent.
Testing and Verification¶
Click Test on the Keyboard Assignment Widget.
Press any key on each keyboard.
The widget highlights which agent the keyboard controls.
Verify all keyboards operate independently.
Per-Agent Key Bindings¶
Every keyboard uses the same key layout. The evdev layer distinguishes keyboards by USB port, not by key.
Agent 0 (Keyboard on hub port 1):
W = Forward, A = Left, S = Backward, D = Right, Space = Interact
Agent 1 (Keyboard on hub port 2):
W = Forward, A = Left, S = Backward, D = Right, Space = Interact
Agent 2 (Keyboard on hub port 3):
W = Forward, A = Left, S = Backward, D = Right, Space = Interact
Agent 3 (Keyboard on hub port 4):
W = Forward, A = Left, S = Backward, D = Right, Space = Interact
Configuration Persistence¶
Keyboard-to-agent assignments are saved to
config/keyboard_assignments.yaml:
keyboard_assignments:
agent_0:
device_path: /dev/input/by-path/pci-0000:00:14.0-usb-0:5.1:1.0-event-kbd
device_name: "Logitech USB Keyboard"
vendor_id: "046d"
product_id: "c31c"
agent_1:
device_path: /dev/input/by-path/pci-0000:00:14.0-usb-0:5.2:1.0-event-kbd
device_name: "Logitech USB Keyboard"
vendor_id: "046d"
product_id: "c34b"
Because paths are based on USB topology, the same physical port always maps to the same agent, even after a reboot.
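Once the file is parsed, the controller needs the inverse mapping, from device path to agent. The sketch below assumes the YAML has already been loaded into a dictionary with the structure shown above (loading would need a YAML library, which is not shown); build_device_lookup is a hypothetical helper name.

```python
def build_device_lookup(config):
    """Hypothetical helper: invert agent -> device entries into device_path -> agent_id."""
    lookup = {}
    for agent_id, entry in config.get("keyboard_assignments", {}).items():
        path = entry["device_path"]
        if path in lookup:
            # Two agents sharing one keyboard would defeat per-device routing.
            raise ValueError(f"{path} is assigned to both {lookup[path]} and {agent_id}")
        lookup[path] = agent_id
    return lookup


# Dictionary mirroring the YAML structure shown above.
config = {
    "keyboard_assignments": {
        "agent_0": {
            "device_path": "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.1:1.0-event-kbd",
            "device_name": "Logitech USB Keyboard",
        },
        "agent_1": {
            "device_path": "/dev/input/by-path/pci-0000:00:14.0-usb-0:5.2:1.0-event-kbd",
            "device_name": "Logitech USB Keyboard",
        },
    }
}
device_to_agent = build_device_lookup(config)
```

Rejecting duplicate paths at load time surfaces a hand-edited configuration error before a game starts.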
Performance¶
| Metric | Value |
|---|---|
| Input latency | < 10 ms (keypress to action) |
| CPU usage | < 2 % for the monitoring thread |
| Memory overhead | ~5 MB for the evdev monitor |
| Simultaneous keys | 20+ keys across 4 keyboards |
The select()-based event loop wakes only when there is data to read or
its 0.5 s timeout expires, keeping idle CPU usage near zero.
Troubleshooting¶
Permission Denied¶
Symptom: PermissionError: [Errno 13] Permission denied: '/dev/input/event16'
Fix:
sudo usermod -a -G input $USER
# Log out and back in
groups | grep input # verify
No Keyboards Detected¶
Symptom: The Keyboard Assignment Widget shows “No keyboards detected.”
Diagnosis:
# Verify kernel sees the keyboards
ls -la /dev/input/by-path/*kbd
# Check file permissions
ls -la /dev/input/event*
# Test direct device access
sudo evtest /dev/input/event16 # should show key events
One Keyboard Moves All Agents¶
Symptom: Pressing a key on any keyboard moves every agent at once.
Cause: evdev monitoring failed to start, so the system fell back to Qt shortcuts which fire globally.
Diagnosis:
# Check the MOSAIC log for evdev setup messages
grep "LOG407" var/logs/gym_gui.log | tail -10
# LOG4072 = setup started, LOG4073 = success, LOG4074 = failure
# Verify assignments were applied
grep "Apply Assignments" var/logs/gym_gui.log
Worker-Based Keyboard Input (Subprocess)¶
For multi-agent environments where adapter.step() and
env.render() block the Qt main thread (causing agents to appear
frozen and then “teleport”), MOSAIC provides a subprocess-based
keyboard input mode.
Each physical keyboard is assigned to a human_worker subprocess that reads evdev events in its own process. The GUI never blocks on keyboard I/O because the worker communicates via JSON pipes.
graph TD
KB1[USB Keyboard 1] --> P1[human_worker subprocess<br/>agent_0]
KB2[USB Keyboard 2] --> P2[human_worker subprocess<br/>agent_1]
P1 --> |action_selected JSON| GUI[GUI Main Process]
P2 --> |action_selected JSON| GUI
GUI --> |select_action JSON| P1
GUI --> |select_action JSON| P2
GUI --> ENV["env.step({0: action_0, 1: action_1})"]
ENV --> RENDER[Render Frame]
style KB1 fill:#4a90d9,stroke:#2e5a87,color:#fff
style KB2 fill:#4a90d9,stroke:#2e5a87,color:#fff
style P1 fill:#ff7f50,stroke:#cc5500,color:#fff
style P2 fill:#ff7f50,stroke:#cc5500,color:#fff
style GUI fill:#50c878,stroke:#2e8b57,color:#fff
style ENV fill:#9370db,stroke:#6a0dad,color:#fff
How It Works¶
The GUI launches one human_worker subprocess per agent in --mode keyboard with the evdev device path.
Each subprocess opens its /dev/input/eventX device and creates a pure-Python keycode resolver (no Qt dependency).
When the GUI sends {"cmd": "select_action"}, the subprocess blocks reading the keyboard until a recognized key is pressed.
The subprocess returns {"type": "action_selected", "action": 3}.
The GUI collects all agents’ actions via MultiAgentStepState and calls env.step(all_actions) through the standard pipeline.
Rewards, observations, and replay recording are fully preserved.
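The request and reply exchange in the steps above can be sketched as follows. The message shapes come from this page; everything else is an assumption for illustration: the framing is taken to be one JSON object per line, serve_one_command is a hypothetical name, and the blocking keyboard read is replaced by a callable so the example runs without a device.

```python
import io
import json


def serve_one_command(stdin, stdout, wait_for_action):
    """Hypothetical worker step: read one command line and answer it.

    Returns False when the parent has closed the pipe, True otherwise.
    """
    line = stdin.readline()
    if not line:
        return False  # EOF: the GUI process went away
    message = json.loads(line)
    if message.get("cmd") == "select_action":
        # In a real worker this call would block on the keyboard.
        reply = {"type": "action_selected", "action": wait_for_action()}
        stdout.write(json.dumps(reply) + "\n")
        stdout.flush()
    return True


# In-memory streams stand in for the subprocess pipes.
fake_stdin = io.StringIO('{"cmd": "select_action"}\n')
fake_stdout = io.StringIO()
alive = serve_one_command(fake_stdin, fake_stdout, wait_for_action=lambda: 3)
reply = json.loads(fake_stdout.getvalue())
```

Keeping the blocking read inside the worker is what lets the GUI process stay responsive while it waits for every agent's reply.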
CLI Usage¶
human-worker --mode keyboard \
--device-path /dev/input/by-path/pci-0000:00:14.0-usb-0:5.1:1.0-event-kbd \
--env-name multigrid \
--player-name "Player 1"
The subprocess uses pure Python (select(), os.read(),
struct.unpack()) to read raw evdev events and maps Linux
keycodes directly to action indices without requiring Qt.
Known Limitations¶
Linux only: evdev is not available on Windows or macOS.
USB required: Bluetooth keyboards may have unreliable device paths.
Same key layout: all keyboards share the same key bindings (WASD, etc.). Per-keyboard remapping is not yet supported.