LLM Agent Testbed Architecture
John McCardle edited this page 2026-02-07 23:49:24 +00:00


Status: Design Complete, Implementation Pending
Parent Issue: #154 - Grounded Multi-Agent Testbed
Last Updated: 2026-02-07

Overview

This document describes the architecture for running LLM agents in McRogueFace environments. The system serves dual purposes:

  1. Human-playable game with traditional roguelike mechanics
  2. Agent testbed for studying grounded language understanding

The key insight is that both modes share the same physics layer, but differ in their perception layer. An ActionLog bridges this gap, providing structured event data that humans can view as a combat log and agents receive as text context.

Headless Mode Available: McRogueFace now supports true headless execution (--headless --exec) with deterministic frame stepping via mcrfpy.step() (#157 completed). This enables agent testing without an X11 display server or GPU, making it suitable for CI pipelines, remote servers, and automated evaluation harnesses.


Architecture Layers

┌─────────────────────────────────────────────────────────────┐
│                    PHYSICS LAYER                             │
│         Entity interactions: bump, ev_enter, ev_exit         │
│         Deterministic, turn-based, grid-based                │
└─────────────────────────────────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────┐
│                  EPISTEMOLOGY LAYER                          │
│         ActionLog + SpeechChannel + TurnContext              │
│         "What happened and who perceived it"                 │
└─────────────────────────────────────────────────────────────┘
                              │
            ┌─────────────────┼─────────────────┐
            ▼                 ▼                 ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│   HUMAN CLIENT    │ │   AGENT CLIENT    │ │  REPLAY CLIENT    │
│ • Renders grid    │ │ • Screenshot +    │ │ • Step through    │
│ • Keyboard input  │ │   TurnContext     │ │   ActionLog       │
│ • Optional log UI │ │ • LLM query/parse │ │ • Animate actions │
└───────────────────┘ └───────────────────┘ └───────────────────┘

Physics Layer: Entity Interaction Model

Adapted from Crypt of Sokoban's proven interaction system.

Core Events

Event                  Trigger                                   Purpose
bump(other, dx, dy)    Entity attempts to enter occupied tile    Collision resolution, combat, interaction
ev_enter(other)        Entity successfully enters tile           Triggers (pressure plates, pickups)
ev_exit(other)         Entity leaves tile                        State reversal (plate release)

Resolution Order

The draw_order property determines which entity handles a bump first when multiple entities occupy a tile:

draw_order   Entity Type      Rationale
10           Player/Agent     Highest priority - always respond to bumps
7            Enemies          Combat entities
5            Items            Can be picked up
2            Doors            Conditional passage
1            Floor triggers   Lowest - checked last, allows overlap

Entity Base Class

from typing import Optional

class GameEntity:
    """Base class for all interactive entities."""
    
    draw_order: int = 5
    description: str = "an entity"
    detailed_description: str = "You see an entity."
    
    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        """Called when another entity tries to move into this tile."""
        raise NotImplementedError
    
    def ev_enter(self, other) -> Optional[ActionRecord]:
        """Called when another entity enters this tile."""
        pass
    
    def ev_exit(self, other) -> Optional[ActionRecord]:
        """Called when another entity leaves this tile."""
        pass
    
    def act(self) -> Optional[ActionRecord]:
        """Called on this entity's turn (for NPCs)."""
        pass

Epistemology Layer: ActionLog

The central event bus that records all game actions for perception filtering.

ActionRecord

from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple

@dataclass
class ActionRecord:
    """A single recorded action in the game world."""
    
    turn: int                    # Game turn number
    actor_id: str                # Unique entity identifier
    actor_description: str       # "the knight", "a guard"
    action_type: str             # "move", "take", "unlock", "attack", "speak"
    
    # Action details
    args: Dict[str, Any]         # {"direction": "north", "item": "brass_key"}
    target_id: Optional[str]     # Entity acted upon
    target_description: Optional[str]
    
    # Outcome
    result: str                  # "success", "blocked", "hit", "miss"
    result_message: str          # "The knight unlocks the door with the brass key."
    
    # Spatial info (for perception filtering)
    position: Tuple[int, int]    # Where it happened
    sound_radius: int            # How far the sound travels (0 = silent)

ActionLog Class

class ActionLog:
    """Central record of all game actions."""
    
    def record(self, action: ActionRecord) -> None:
        """Record an action."""
        
    def get_visible_to(self, observer, grid, since_turn: int) -> List[ActionRecord]:
        """Get actions the observer could SEE (in FOV when they happened)."""
        
    def get_audible_to(self, observer, grid, since_turn: int) -> List[ActionRecord]:
        """Get actions the observer could HEAR.
        
        - In FOV: Full detail ("The guard says 'Halt!'")
        - Out of FOV, in range: Vague ("You hear a voice to the east")
        - Out of range: Nothing
        """
    
    def get_turn_summary(self, turn: int) -> List[ActionRecord]:
        """Get all actions from a specific turn (for replay)."""

SpeechChannel

Speech is a special action type with FOV-based reception:

class SpeechChannel:
    """Handles agent-to-agent communication."""
    
    def speak(self, speaker, message: str, volume: str = "normal") -> ActionRecord:
        """
        Broadcast speech from speaker.
        
        volume:
        - "whisper": Adjacent tiles only (radius 1)
        - "normal": FOV range (same as sight)
        - "shout": Entire room
        
        Returns ActionRecord with sound_radius set appropriately.
        """
    
    def get_heard_speech(self, listener, grid, since_turn: int) -> List[SpeechRecord]:
        """
        Get speech heard by listener.
        
        Returns:
        - Full text if speaker was in listener's FOV
        - "You hear indistinct speech to the {direction}" if out of FOV but in range
        """

TurnContext

What an entity perceives when its turn begins:

from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class TurnContext:
    """Complete perception state for an entity's turn."""
    
    # Identity
    actor_id: str
    actor_description: str
    position: Tuple[int, int]
    current_room: str
    
    # Visual perception (current FOV)
    visible_entities: List[EntitySnapshot]
    visible_terrain: List[TileSnapshot]
    
    # Auditory perception (since last turn)
    heard_speech: List[SpeechRecord]
    heard_sounds: List[str]  # "footsteps to the north", "a door creaking"
    
    # Observed actions (in FOV since last turn)
    observed_actions: List[ActionRecord]
    
    # Available actions
    available_actions: List[str]  # ["GO NORTH", "TAKE brass_key", "SPEAK '...'"]
    
    # Inventory
    inventory: List[str]
    
    def to_prose(self) -> str:
        """Generate natural language description for LLM context."""
        parts = []
        
        # Location
        parts.append(f"You are in {self.current_room}.")
        
        # What you see
        if self.visible_entities:
            parts.append(self._describe_visible())
        
        # What happened (observed actions)
        if self.observed_actions:
            parts.append("Since your last turn:")
            for action in self.observed_actions:
                parts.append(f"  - {action.result_message}")
        
        # What you heard
        if self.heard_speech:
            for speech in self.heard_speech:
                if speech.in_fov:
                    parts.append(f'{speech.speaker} says: "{speech.message}"')
                else:
                    parts.append(f"You hear someone speaking to the {speech.direction}.")
        
        # Available actions
        parts.append(f"Available actions: {', '.join(self.available_actions)}")
        
        return "\n".join(parts)

Game Entities

Academically presentable entity types for agent learning scenarios.

KeyEntity

class KeyEntity(GameEntity):
    """A key that can be picked up and used on matching doors."""
    
    draw_order = 5
    
    def __init__(self, x, y, key_id: str, display_name: str = "a brass key"):
        self.key_id = key_id
        self.description = display_name
        self.detailed_description = f"{display_name}. It might unlock something."
    
    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        # Actor picks up the key
        if hasattr(actor, 'inventory'):
            if not test:
                actor.inventory.append(self.key_id)
                self.die()  # Remove from world
            return ActionRecord(
                action_type="take",
                result="success",
                result_message=f"{actor.description} picks up {self.description}.",
                sound_radius=2
            )
        return ActionRecord(
            action_type="take",
            result="failure",
            result_message=f"{self.description} lies on the ground.",
            sound_radius=0
        )

LockedDoorEntity

class LockedDoorEntity(GameEntity):
    """A door that requires a specific key to open."""
    
    draw_order = 2
    
    def __init__(self, x, y, key_id: str, destination_room: str):
        self.key_id = key_id
        self.destination_room = destination_room
        self.locked = True
        self.description = "a locked door"
        self.detailed_description = "A sturdy wooden door. It has a brass keyhole."
    
    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        if self.locked:
            # Check if actor has the key
            if hasattr(actor, 'inventory') and self.key_id in actor.inventory:
                if not test:
                    self.unlock()
                return ActionRecord(
                    action_type="unlock",
                    result="success",
                    result_message=f"{actor.description} unlocks the door.",
                    sound_radius=5  # Loud click
                )
            else:
                return ActionRecord(
                    action_type="open",
                    result="blocked",
                    result_message="The door is locked.",
                    sound_radius=1
                )
        else:
            # Door is open, allow passage
            if not test:
                actor.do_move(actor.x + dx, actor.y + dy)
            return ActionRecord(
                action_type="move",
                result="success",
                result_message=f"{actor.description} passes through the doorway.",
                sound_radius=2
            )
    
    def unlock(self):
        self.locked = False
        self.sprite_number = OPEN_DOOR_SPRITE
        self.description = "an open doorway"

GuardEntity

class GuardEntity(GameEntity):
    """An NPC that patrols a fixed route and reacts to intruders."""
    
    draw_order = 7
    
    def __init__(self, x, y, patrol_route: List[Tuple[int, int]], 
                 behavior: str = "patrol"):
        self.patrol_route = patrol_route
        self.patrol_index = 0
        self.behavior = behavior  # "patrol", "stationary", "chase"
        self.alert = False
        self.target = None  # set when an intruder is spotted (see check_fov_for_intruders)
        self.description = "a guard"
        self.detailed_description = "An armored guard. They look vigilant."
        self.sight_range = 6
    
    def act(self) -> Optional[ActionRecord]:
        """Called on guard's turn."""
        
        if self.behavior == "patrol":
            return self._patrol_step()
        elif self.behavior == "chase" and self.target:
            return self._chase_step()
        
        return ActionRecord(
            action_type="wait",
            result="success",
            result_message=f"{self.description} stands watch.",
            sound_radius=0
        )
    
    def _patrol_step(self) -> ActionRecord:
        """Move to next point on patrol route."""
        next_pos = self.patrol_route[self.patrol_index]
        self.patrol_index = (self.patrol_index + 1) % len(self.patrol_route)
        
        # Move toward next patrol point
        dx = sign(next_pos[0] - self.x)
        dy = sign(next_pos[1] - self.y)
        
        if self.try_move(dx, dy):
            return ActionRecord(
                action_type="move",
                result="success",
                result_message=f"{self.description} continues their patrol.",
                sound_radius=3  # Footsteps
            )
        # Path blocked this turn; report it rather than returning None
        return ActionRecord(
            action_type="move",
            result="blocked",
            result_message=f"{self.description} hesitates, their path blocked.",
            sound_radius=0
        )
    
    def check_fov_for_intruders(self, entities, grid) -> Optional[ActionRecord]:
        """Check if any player/agent is visible."""
        for entity in entities:
            if isinstance(entity, PlayerEntity) and grid.is_in_fov(entity.x, entity.y):
                self.alert = True
                self.target = entity
                self.behavior = "chase"
                return ActionRecord(
                    action_type="speak",
                    result="success",
                    result_message=f'{self.description} shouts: "Halt! Intruder!"',
                    sound_radius=10  # Shout carries far
                )
        return None

CombatantEntity

class CombatantEntity(GameEntity):
    """Basic enemy that engages in melee combat."""
    
    draw_order = 7
    
    def __init__(self, x, y, hp: int = 3, damage: int = 1, 
                 description: str = "a hostile creature"):
        self.hp = hp
        self.max_hp = hp
        self.damage = damage
        self.description = description
    
    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        """Handle being bumped (attacked)."""
        
        if self.hp <= 0:
            # Dead - allow walking over
            if not test:
                actor.do_move(self.x, self.y)
            return ActionRecord(
                action_type="move",
                result="success",
                result_message=f"{actor.description} steps over the fallen {self.description}.",
                sound_radius=1
            )
        
        # Combat!
        if hasattr(actor, 'damage'):
            damage_dealt = actor.damage
            remaining_hp = self.hp - damage_dealt  # projected, so test=True previews the outcome
            if not test:
                self.hp = remaining_hp
            
            if remaining_hp <= 0:
                return ActionRecord(
                    action_type="attack",
                    result="kill",
                    result_message=f"{actor.description} defeats {self.description}!",
                    sound_radius=5
                )
            else:
                return ActionRecord(
                    action_type="attack",
                    result="hit",
                    result_message=f"{actor.description} strikes {self.description}.",
                    sound_radius=4
                )
        
        return ActionRecord(
            action_type="bump",
            result="blocked",
            result_message=f"{self.description} blocks the path.",
            sound_radius=1
        )

Agent Learning Scenarios

Progressive scenarios for evaluating agent capabilities.

Scenario 1: Key Hunt

Setup:

  • Agent spawns in Room A
  • Key in Room B (visible from A through doorway)
  • Locked door to Goal Room C

Learning Objectives:

  • Object recognition ("I see a key")
  • Cause-effect reasoning ("key unlocks door")
  • Sequential planning (go to B → get key → return to door → unlock → enter C)

Success Metric: Agent reaches Goal Room C


Scenario 2: Guard Patrol

Setup:

  • Single guard on 4-point rectangular patrol
  • Agent must cross patrol path to reach goal
  • No combat option - must avoid detection

Learning Objectives:

  • Behavior observation ("guard moves clockwise")
  • Prediction ("guard will be at X in 2 turns")
  • Timing ("I should move now while guard faces away")

Success Metric: Agent reaches goal without triggering alert
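The prediction objective can be illustrated at the waypoint level. This sketch assumes the guard advances one waypoint per turn, a deliberate simplification: the real _patrol_step moves one tile per turn, so a full predictor would also need the tile-by-tile path between waypoints.

```python
from typing import List, Tuple

def predict_waypoint(route: List[Tuple[int, int]], index: int,
                     turns_ahead: int) -> Tuple[int, int]:
    """Waypoint the guard will target after turns_ahead advances."""
    return route[(index + turns_ahead) % len(route)]

route = [(1, 1), (5, 1), (5, 5), (1, 5)]  # 4-point rectangular patrol
print(predict_waypoint(route, 0, 2))  # → (5, 5)
```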


Scenario 3: Cooperative Unlock

Setup:

  • Agent A in Room 1 with key
  • Agent B in Room 2 near locked door
  • Door blocks Agent B's goal

Learning Objectives:

  • Situation communication ("I have the key")
  • Coordination ("Wait, I'm coming to unlock it")
  • Theory of mind ("Agent B needs the door opened")

Success Metric: Both agents reach their respective goals


Scenario 4: Combat Decision

Setup:

  • Weak enemy blocks direct path to goal
  • Alternative route available (longer, safe)
  • Agent has limited HP

Learning Objectives:

  • Risk assessment ("Can I win this fight?")
  • Resource management ("Is the shortcut worth the HP?")
  • Alternative planning ("I could go around instead")

Success Metric: Agent reaches goal (either path)
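The risk-assessment objective reduces to a back-of-envelope calculation under simplifying assumptions: deterministic melee, both sides trading one hit per turn, attacker striking first. This is purely illustrative; the actual combat rules live in CombatantEntity.bump().

```python
import math

def can_win(agent_hp: int, agent_dmg: int, enemy_hp: int, enemy_dmg: int) -> bool:
    """True if the agent kills the enemy before dying, striking first."""
    turns_to_kill = math.ceil(enemy_hp / agent_dmg)
    turns_to_die = math.ceil(agent_hp / enemy_dmg)
    return turns_to_kill <= turns_to_die  # attacker strikes first, so ties favor the agent

print(can_win(agent_hp=3, agent_dmg=1, enemy_hp=3, enemy_dmg=1))  # → True
```

An agent that can perform this kind of estimate from observed HP and damage values has grounds for choosing the shortcut over the safe route.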


Development Phases

Phase 1: ActionLog Foundation

Goal: Instrument entity interactions without breaking existing CoS gameplay.

Tasks:

  1. Create ActionRecord dataclass
  2. Create ActionLog class with basic recording
  3. Modify COSEntity.bump() to return ActionRecord
  4. Add ActionRecord returns to ev_enter(), ev_exit()
  5. Wire ActionLog into game loop

Validation: CoS plays normally; ActionLog captures all events


Phase 2: TurnContext & Perception

Goal: Generate per-entity perception from ActionLog.

Tasks:

  1. Create TurnContext dataclass
  2. Implement FOV-filtered action retrieval
  3. Implement to_prose() for LLM text generation
  4. Add SpeechChannel with FOV-based reception
  5. Create TurnOrchestrator that generates TurnContext per turn

Validation: Print TurnContext.to_prose() alongside gameplay; verify accuracy


Phase 3: Clean Entity Set

Goal: Implement academically presentable entities.

Tasks:

  1. KeyEntity - takeable item
  2. LockedDoorEntity - conditional passage
  3. GuardEntity - patrol behavior with FOV detection
  4. CombatantEntity - basic melee combat
  5. Level loader for scenario definitions

Validation: Human can play through all 4 scenarios


Phase 4: Agent Integration

Goal: Connect LLM agents to the game world.

Tasks:

  1. Adapt existing TurnOrchestrator from vllm_demo
  2. Connect TurnContext.to_prose() to LLM prompts
  3. Parse LLM responses to ActionRecord
  4. Execute actions through entity system
  5. Screenshot + text context packaging

Validation: Single agent completes Scenario 1 (Key Hunt)


Phase 5: Multi-Agent & Speech

Goal: Enable agent-to-agent communication.

Tasks:

  1. Multi-agent turn sequencing
  2. Speech action execution via SpeechChannel
  3. Speech reception in TurnContext
  4. Test Scenario 3 (Cooperative Unlock)

Validation: Two agents coordinate via speech to solve puzzle


Phase 6: Evaluation & Analysis

Goal: Systematic evaluation of agent capabilities.

Tasks:

  1. Simulation logging (full ActionLog export)
  2. Success/failure metrics per scenario
  3. Behavior analysis tools
  4. Comparison across LLM models

Deliverable: Paper-ready evaluation results


File Structure

tests/
├── vllm_demo/                    # Existing demo code
│   ├── action_parser.py          # LLM response parsing
│   ├── action_executor.py        # → Refactor to use ActionLog
│   ├── world_graph.py            # → Keep for room descriptions
│   ├── turn_orchestrator.py      # → Enhance with TurnContext
│   └── scenarios/                # New: scenario definitions
│       ├── key_hunt.py
│       ├── guard_patrol.py
│       ├── cooperative_unlock.py
│       └── combat_decision.py
│
src/scripts/
├── cos_entities.py               # → Add ActionRecord returns
├── game_entities.py              # New: clean academic entities
├── action_log.py                 # New: ActionLog system
├── speech_channel.py             # New: agent communication
└── turn_context.py               # New: perception generation

  • #153 - Separate render loop from game state loop (complete)
  • #154 - Grounded Multi-Agent Testbed (parent, open, 2 comments)
  • #155 - Deterministic Text Descriptions (closed)
  • #156 - Turn-based LLM Agent Orchestration (open, 4 comments, in progress)
  • #157 - Headless mode (complete - enables display-free agent testing)

References

  • Crypt of Sokoban entity system: src/scripts/cos_entities.py
  • Existing VLLM demos: tests/vllm_demo/
  • FOV/Perspective system: Issue #154 comment (2025-12-01)