Add LLM Agent Testbed Architecture

parent 79c9584cac
commit 2c9d9cd341

1 changed file with 658 additions and 0 deletions

LLM-Agent-Testbed-Architecture.md (normal file, 658 lines)

@ -0,0 +1,658 @@

# LLM Agent Testbed Architecture

**Status**: Planning
**Parent Issue**: #154 - Grounded Multi-Agent Testbed
**Last Updated**: 2025-12-14

## Overview

This document describes the architecture for running LLM agents in McRogueFace environments. The system serves dual purposes:

1. **Human-playable game** with traditional roguelike mechanics
2. **Agent testbed** for studying grounded language understanding

The key insight is that both modes share the same physics layer, but differ in their perception layer. An **ActionLog** bridges this gap, providing structured event data that humans can view as a combat log and agents receive as text context.

---

## Architecture Layers

```
┌─────────────────────────────────────────────────────────────┐
│                        PHYSICS LAYER                        │
│  Entity interactions: bump, ev_enter, ev_exit               │
│  Deterministic, turn-based, grid-based                      │
└─────────────────────────────────────────────────────────────┘
                               │
                               ▼
┌─────────────────────────────────────────────────────────────┐
│                     EPISTEMOLOGY LAYER                      │
│  ActionLog + SpeechChannel + TurnContext                    │
│  "What happened and who perceived it"                       │
└─────────────────────────────────────────────────────────────┘
                               │
             ┌─────────────────┼─────────────────┐
             ▼                 ▼                 ▼
┌───────────────────┐ ┌───────────────────┐ ┌───────────────────┐
│   HUMAN CLIENT    │ │   AGENT CLIENT    │ │   REPLAY CLIENT   │
│ • Renders grid    │ │ • Screenshot +    │ │ • Step through    │
│ • Keyboard input  │ │   TurnContext     │ │   ActionLog       │
│ • Optional log UI │ │ • LLM query/parse │ │ • Animate actions │
└───────────────────┘ └───────────────────┘ └───────────────────┘
```

---

## Physics Layer: Entity Interaction Model

Adapted from the proven entity interaction system of Crypt of Sokoban (CoS).

### Core Events

| Event | Trigger | Purpose |
|-------|---------|---------|
| `bump(other, dx, dy)` | Entity attempts to enter occupied tile | Collision resolution, combat, interaction |
| `ev_enter(other)` | Entity successfully enters tile | Triggers (pressure plates, pickups) |
| `ev_exit(other)` | Entity leaves tile | State reversal (plate release) |

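As a rough illustration of how the three events in the table could fit together during one movement attempt, here is a minimal sketch. `Thing` and `try_move` are hypothetical names introduced only for this example, and `bump` is simplified to return a boolean; in the entity code later in this document, `bump` returns an `ActionRecord` and may move the actor itself.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Thing:
    """Hypothetical stand-in for an entity; it records the events it receives."""
    name: str
    x: int
    y: int
    solid: bool = False  # solid things veto movement when bumped
    events: List[str] = field(default_factory=list)

    def bump(self, other: "Thing", dx: int, dy: int) -> bool:
        self.events.append(f"bump by {other.name}")
        return not self.solid  # True means the mover may pass

    def ev_enter(self, other: "Thing") -> None:
        self.events.append(f"enter by {other.name}")

    def ev_exit(self, other: "Thing") -> None:
        self.events.append(f"exit by {other.name}")


def try_move(mover: Thing, dx: int, dy: int, world: List[Thing]) -> bool:
    """Attempt one step: bump first, then ev_exit and ev_enter on success."""
    src = (mover.x, mover.y)
    dst = (mover.x + dx, mover.y + dy)
    occupants = [t for t in world if t is not mover and (t.x, t.y) == dst]
    # Any blocking occupant vetoes the move (all() stops at the first veto).
    if not all(t.bump(mover, dx, dy) for t in occupants):
        return False
    leaving = [t for t in world if t is not mover and (t.x, t.y) == src]
    mover.x, mover.y = dst
    for t in leaving:
        t.ev_exit(mover)
    for t in occupants:
        t.ev_enter(mover)
    return True
```

In this sketch a pressure plate would be a non-solid `Thing`: it sees `bump` and `ev_enter` when stepped on, and `ev_exit` when the mover leaves again.
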
### Resolution Order

The `draw_order` property determines which entity handles `bump` first when several entities occupy the same tile; higher values are handled first:

| draw_order | Entity Type | Rationale |
|------------|-------------|-----------|
| 10 | Player/Agent | Highest priority - always responds to bumps |
| 7 | Enemies | Combat entities |
| 5 | Items | Can be picked up |
| 2 | Doors | Conditional passage |
| 1 | Floor triggers | Lowest - checked last, allows overlap |

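The ordering rule can be sketched in a few lines. `Occupant`, `bump_handler`, and `bump_sequence` are hypothetical names used only for illustration; the sketch assumes that ties are broken by the order in which entities were added to the tile, which the document does not specify.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Occupant:
    """Hypothetical minimal entity: only the fields needed for ordering."""
    name: str
    draw_order: int


def bump_handler(occupants: List[Occupant]) -> Optional[Occupant]:
    """Return the occupant that should receive `bump` first, if any."""
    # Highest draw_order wins; max() keeps the first of equal candidates.
    return max(occupants, key=lambda e: e.draw_order, default=None)


def bump_sequence(occupants: List[Occupant]) -> List[Occupant]:
    """Full resolution order, highest draw_order first (stable for ties)."""
    return sorted(occupants, key=lambda e: e.draw_order, reverse=True)
```

With a floor trigger (1), a key (5), and an enemy (7) on one tile, the enemy would handle the bump first and the floor trigger last.
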
### Entity Base Class

```python
from typing import Optional


class GameEntity:
    """Base class for all interactive entities."""

    draw_order: int = 5
    description: str = "an entity"
    detailed_description: str = "You see an entity."

    def bump(self, actor, dx, dy, test=False) -> "ActionRecord":
        """Called when another entity tries to move into this tile.

        With test=True, report the outcome without changing any state.
        """
        raise NotImplementedError

    def ev_enter(self, other) -> Optional["ActionRecord"]:
        """Called when another entity enters this tile."""
        pass

    def ev_exit(self, other) -> Optional["ActionRecord"]:
        """Called when another entity leaves this tile."""
        pass

    def act(self) -> Optional["ActionRecord"]:
        """Called on this entity's turn (for NPCs)."""
        pass
```

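One possible use of the `test` flag is to probe neighbouring tiles without side effects, for example to build the list of currently legal moves. The sketch below is an assumption about how that could look: `Probe`, `Crate`, `DIRECTIONS`, and `legal_moves` are hypothetical, and the coordinate convention (y grows downward) is assumed rather than taken from the engine.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Probe:
    """Hypothetical stand-in for ActionRecord: only the `result` field."""
    result: str


class Crate:
    """Hypothetical entity that never lets anyone through."""

    def __init__(self) -> None:
        self.bump_calls = 0

    def bump(self, actor, dx, dy, test=False) -> Probe:
        if not test:
            self.bump_calls += 1  # state changes only on a real bump
        return Probe(result="blocked")


# Assumed convention: north is y - 1, as on screen.
DIRECTIONS: Dict[str, Tuple[int, int]] = {
    "NORTH": (0, -1), "SOUTH": (0, 1), "EAST": (1, 0), "WEST": (-1, 0),
}


def legal_moves(pos: Tuple[int, int],
                occupied: Dict[Tuple[int, int], object],
                actor=None) -> List[str]:
    """List moves whose dry-run bump does not report "blocked"."""
    moves = []
    for name, (dx, dy) in DIRECTIONS.items():
        target = occupied.get((pos[0] + dx, pos[1] + dy))
        if target is None or target.bump(actor, dx, dy, test=True).result != "blocked":
            moves.append(f"GO {name}")
    return moves


crate = Crate()
moves = legal_moves((1, 1), {(1, 0): crate})  # crate sits directly north
```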

---

## Epistemology Layer: ActionLog

The ActionLog is the central event bus: it records every game action so that perception can later be filtered per observer.

### ActionRecord

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class ActionRecord:
    """A single recorded action in the game world."""

    turn: int                          # Game turn number
    actor_id: str                      # Unique entity identifier
    actor_description: str             # "the knight", "a guard"
    action_type: str                   # "move", "take", "unlock", "attack", "speak"

    # Action details
    args: Dict[str, Any]               # {"direction": "north", "item": "brass_key"}
    target_id: Optional[str]           # Entity acted upon
    target_description: Optional[str]

    # Outcome
    result: str                        # "success", "blocked", "hit", "miss"
    result_message: str                # "The knight unlocks the door with the brass key."

    # Spatial info (for perception filtering)
    position: Tuple[int, int]          # Where it happened
    sound_radius: int                  # How far the sound travels (0 = silent)
```

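Because every field is a plain value, a record can be written out as one JSON line, which is one way the full ActionLog export planned for Phase 6 might work. The sketch below repeats the dataclass so that it runs on its own; `record_to_json`, `record_from_json`, and the identifiers `agent_1` and `door_3` are hypothetical.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional, Tuple


@dataclass
class ActionRecord:
    turn: int
    actor_id: str
    actor_description: str
    action_type: str
    args: Dict[str, Any]
    target_id: Optional[str]
    target_description: Optional[str]
    result: str
    result_message: str
    position: Tuple[int, int]
    sound_radius: int


def record_to_json(record: ActionRecord) -> str:
    """Serialize one record as a single JSON line (tuples become lists)."""
    return json.dumps(asdict(record), sort_keys=True)


def record_from_json(line: str) -> ActionRecord:
    """Rebuild a record, restoring the tuple type of `position`."""
    data = json.loads(line)
    data["position"] = tuple(data["position"])
    return ActionRecord(**data)


example = ActionRecord(
    turn=12, actor_id="agent_1", actor_description="the knight",
    action_type="unlock", args={"item": "brass_key"},
    target_id="door_3", target_description="a locked door",
    result="success",
    result_message="The knight unlocks the door with the brass key.",
    position=(4, 7), sound_radius=5,
)
```

A round trip through `record_to_json` and `record_from_json` should give back an equal record, which makes the format usable for the replay client as well.
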
### ActionLog Class

```python
from typing import List


class ActionLog:
    """Central record of all game actions."""

    def record(self, action: ActionRecord) -> None:
        """Record an action."""

    def get_visible_to(self, observer, grid, since_turn: int) -> List[ActionRecord]:
        """Get actions the observer could SEE (in FOV when they happened)."""

    def get_audible_to(self, observer, grid, since_turn: int) -> List[ActionRecord]:
        """Get actions the observer could HEAR.

        - In FOV: Full detail ("The guard says 'Halt!'")
        - Out of FOV, in range: Vague ("You hear a voice to the east")
        - Out of range: Nothing
        """

    def get_turn_summary(self, turn: int) -> List[ActionRecord]:
        """Get all actions from a specific turn (for replay)."""
```

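The seen/heard split described in the docstrings could be filtered roughly as follows. This is a sketch under stated assumptions: `Event` is a reduced, hypothetical version of `ActionRecord`, the observer's FOV is passed in as a precomputed set of tiles, `since_turn` is treated as exclusive, and sound range uses Chebyshev distance. None of these choices is fixed by the document.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple

Pos = Tuple[int, int]


@dataclass
class Event:
    """Hypothetical reduced ActionRecord: only the fields used for filtering."""
    turn: int
    position: Pos
    sound_radius: int
    result_message: str


def visible_events(log: Iterable[Event], fov: Set[Pos], since_turn: int) -> List[Event]:
    """Events after since_turn that happened on a tile inside the FOV set."""
    return [e for e in log if e.turn > since_turn and e.position in fov]


def heard_only_events(log: Iterable[Event], fov: Set[Pos],
                      listener: Pos, since_turn: int) -> List[Event]:
    """Events outside the FOV but within their own sound_radius of the listener."""
    heard = []
    for e in log:
        if e.turn <= since_turn or e.position in fov:
            continue
        distance = max(abs(e.position[0] - listener[0]),
                       abs(e.position[1] - listener[1]))
        if e.sound_radius > 0 and distance <= e.sound_radius:
            heard.append(e)
    return heard
```

Events returned by `heard_only_events` would be the ones rendered vaguely ("You hear a voice to the east"), while `visible_events` would be rendered in full.
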
### SpeechChannel

Speech is a special action type with FOV-based reception:

```python
from typing import List


class SpeechChannel:
    """Handles agent-to-agent communication."""

    def speak(self, speaker, message: str, volume: str = "normal") -> ActionRecord:
        """
        Broadcast speech from speaker.

        volume:
        - "whisper": Adjacent tiles only (radius 1)
        - "normal": FOV range (same as sight)
        - "shout": Entire room

        Returns ActionRecord with sound_radius set appropriately.
        """

    def get_heard_speech(self, listener, grid, since_turn: int) -> List["SpeechRecord"]:
        """
        Get speech heard by listener.

        Returns:
        - Full text if speaker was in listener's FOV
        - "You hear indistinct speech to the {direction}" if out of FOV but in range
        """
```

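To make the volume levels and the out-of-FOV message concrete, here is a small sketch. The radius values for "normal" and "shout" are assumptions borrowed from the `GuardEntity` example (sight range 6, shout radius 10); `VOLUME_RADIUS`, `compass_direction`, and `describe_speech` are hypothetical helpers, and the y axis is assumed to grow downward.

```python
from typing import Dict, Tuple

Pos = Tuple[int, int]

# Illustrative radii only. The document fixes "whisper" at radius 1; the other
# two values are assumptions taken from the GuardEntity example.
VOLUME_RADIUS: Dict[str, int] = {"whisper": 1, "normal": 6, "shout": 10}


def compass_direction(listener: Pos, source: Pos) -> str:
    """Coarse direction from listener to source (same tile falls back to east)."""
    dx = source[0] - listener[0]
    dy = source[1] - listener[1]
    if abs(dx) >= abs(dy):
        return "east" if dx >= 0 else "west"
    return "south" if dy > 0 else "north"


def describe_speech(speaker: str, message: str, in_fov: bool,
                    listener: Pos, source: Pos) -> str:
    """Full text when the speaker is in view, a vague cue otherwise."""
    if in_fov:
        return f'{speaker} says: "{message}"'
    direction = compass_direction(listener, source)
    return f"You hear indistinct speech to the {direction}."
```

A real implementation would probably derive the "normal" radius from the listener's actual FOV and the "shout" radius from room bounds rather than from constants.
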
### TurnContext

What an entity perceives when its turn begins:

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class TurnContext:
    """Complete perception state for an entity's turn."""

    # Identity
    actor_id: str
    actor_description: str
    position: Tuple[int, int]
    current_room: str

    # Visual perception (current FOV)
    visible_entities: List["EntitySnapshot"]
    visible_terrain: List["TileSnapshot"]

    # Auditory perception (since last turn)
    heard_speech: List["SpeechRecord"]
    heard_sounds: List[str]  # "footsteps to the north", "a door creaking"

    # Observed actions (in FOV since last turn)
    observed_actions: List["ActionRecord"]

    # Available actions
    available_actions: List[str]  # ["GO NORTH", "TAKE brass_key", "SPEAK '...'"]

    # Inventory
    inventory: List[str]

    def to_prose(self) -> str:
        """Generate natural language description for LLM context."""
        parts = []

        # Location
        parts.append(f"You are in {self.current_room}.")

        # What you see
        if self.visible_entities:
            parts.append(self._describe_visible())

        # What happened (observed actions)
        if self.observed_actions:
            parts.append("Since your last turn:")
            for action in self.observed_actions:
                parts.append(f"  - {action.result_message}")

        # What you heard
        if self.heard_speech:
            for speech in self.heard_speech:
                if speech.in_fov:
                    parts.append(f'{speech.speaker} says: "{speech.message}"')
                else:
                    parts.append(f"You hear someone speaking to the {speech.direction}.")
        for sound in self.heard_sounds:
            parts.append(f"You hear {sound}.")

        # Available actions
        parts.append(f"Available actions: {', '.join(self.available_actions)}")

        return "\n".join(parts)
```

---

## Game Entities

Academically presentable entity types for agent learning scenarios.

### KeyEntity

```python
class KeyEntity(GameEntity):
    """A key that can be picked up and used on matching doors."""

    draw_order = 5

    def __init__(self, x, y, key_id: str, display_name: str = "a brass key"):
        self.key_id = key_id
        self.description = display_name
        self.detailed_description = f"{display_name}. It might unlock something."

    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        # Actor picks up the key
        if hasattr(actor, 'inventory'):
            if not test:
                actor.inventory.append(self.key_id)
                self.die()  # Remove from world
            return ActionRecord(
                action_type="take",
                result="success",
                result_message=f"{actor.description} picks up {self.description}.",
                sound_radius=2
            )
        return ActionRecord(
            action_type="take",
            result="failure",
            result_message=f"{self.description} lies on the ground.",
            sound_radius=0
        )
```

### LockedDoorEntity

```python
class LockedDoorEntity(GameEntity):
    """A door that requires a specific key to open."""

    draw_order = 2

    def __init__(self, x, y, key_id: str, destination_room: str):
        self.key_id = key_id
        self.destination_room = destination_room
        self.locked = True
        self.description = "a locked door"
        self.detailed_description = "A sturdy wooden door. It has a brass keyhole."

    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        if self.locked:
            # Check if actor has the key
            if hasattr(actor, 'inventory') and self.key_id in actor.inventory:
                if not test:
                    self.unlock()
                return ActionRecord(
                    action_type="unlock",
                    result="success",
                    result_message=f"{actor.description} unlocks the door.",
                    sound_radius=5  # Loud click
                )
            else:
                return ActionRecord(
                    action_type="open",
                    result="blocked",
                    result_message="The door is locked.",
                    sound_radius=1
                )
        else:
            # Door is open, allow passage
            if not test:
                actor.do_move(actor.x + dx, actor.y + dy)
            return ActionRecord(
                action_type="move",
                result="success",
                result_message=f"{actor.description} passes through the doorway.",
                sound_radius=2
            )

    def unlock(self):
        self.locked = False
        self.sprite_number = OPEN_DOOR_SPRITE
        self.description = "an open doorway"
```

### GuardEntity

```python
from typing import List, Optional, Tuple


def sign(n: int) -> int:
    """Return -1, 0, or 1 according to the sign of n."""
    return (n > 0) - (n < 0)


class GuardEntity(GameEntity):
    """An NPC that patrols a fixed route and reacts to intruders."""

    draw_order = 7

    def __init__(self, x, y, patrol_route: List[Tuple[int, int]],
                 behavior: str = "patrol"):
        self.patrol_route = patrol_route
        self.patrol_index = 0
        self.behavior = behavior  # "patrol", "stationary", "chase"
        self.alert = False
        self.target = None  # Set once an intruder has been spotted
        self.description = "a guard"
        self.detailed_description = "An armored guard. They look vigilant."
        self.sight_range = 6

    def act(self) -> Optional[ActionRecord]:
        """Called on guard's turn."""

        if self.behavior == "patrol":
            return self._patrol_step()
        elif self.behavior == "chase" and self.target:
            return self._chase_step()

        return ActionRecord(
            action_type="wait",
            result="success",
            result_message=f"{self.description} stands watch.",
            sound_radius=0
        )

    def _patrol_step(self) -> ActionRecord:
        """Move one tile toward the current patrol waypoint."""
        # Advance to the next waypoint only once the current one is reached
        if (self.x, self.y) == tuple(self.patrol_route[self.patrol_index]):
            self.patrol_index = (self.patrol_index + 1) % len(self.patrol_route)
        next_pos = self.patrol_route[self.patrol_index]

        # Move toward next patrol point
        dx = sign(next_pos[0] - self.x)
        dy = sign(next_pos[1] - self.y)

        if self.try_move(dx, dy):
            return ActionRecord(
                action_type="move",
                result="success",
                result_message=f"{self.description} continues their patrol.",
                sound_radius=3  # Footsteps
            )
        # Always return a record, even when the step was blocked
        return ActionRecord(
            action_type="move",
            result="blocked",
            result_message=f"{self.description} finds the way blocked.",
            sound_radius=1
        )

    def check_fov_for_intruders(self, entities, grid) -> Optional[ActionRecord]:
        """Check if any player/agent is visible."""
        for entity in entities:
            if isinstance(entity, PlayerEntity) and grid.is_in_fov(entity.x, entity.y):
                self.alert = True
                self.target = entity
                self.behavior = "chase"
                return ActionRecord(
                    action_type="speak",
                    result="success",
                    result_message=f'{self.description} shouts: "Halt! Intruder!"',
                    sound_radius=10  # Shout carries far
                )
        return None
```

### CombatantEntity

```python
class CombatantEntity(GameEntity):
    """Basic enemy that engages in melee combat."""

    draw_order = 7

    def __init__(self, x, y, hp: int = 3, damage: int = 1,
                 description: str = "a hostile creature"):
        self.hp = hp
        self.max_hp = hp
        self.damage = damage
        self.description = description

    def bump(self, actor, dx, dy, test=False) -> ActionRecord:
        """Handle being bumped (attacked)."""

        if self.hp <= 0:
            # Dead - allow walking over
            if not test:
                actor.do_move(self.x, self.y)
            return ActionRecord(
                action_type="move",
                result="success",
                result_message=f"{actor.description} steps over the fallen {self.description}.",
                sound_radius=1
            )

        # Combat!
        if hasattr(actor, 'damage'):
            damage_dealt = actor.damage
            # Compute the outcome first so that a dry run (test=True)
            # reports the same result as a real attack would
            remaining_hp = self.hp - damage_dealt
            if not test:
                self.hp = remaining_hp

            if remaining_hp <= 0:
                return ActionRecord(
                    action_type="attack",
                    result="kill",
                    result_message=f"{actor.description} defeats {self.description}!",
                    sound_radius=5
                )
            else:
                return ActionRecord(
                    action_type="attack",
                    result="hit",
                    result_message=f"{actor.description} strikes {self.description}.",
                    sound_radius=4
                )

        return ActionRecord(
            action_type="bump",
            result="blocked",
            result_message=f"{self.description} blocks the path.",
            sound_radius=1
        )
```

---

## Agent Learning Scenarios

Progressive scenarios for evaluating agent capabilities.

### Scenario 1: Key Hunt

**Setup:**
- Agent spawns in Room A
- Key in Room B (visible from A through doorway)
- Locked door to Goal Room C

**Learning Objectives:**
- Object recognition ("I see a key")
- Cause-effect reasoning ("key unlocks door")
- Sequential planning (go to B → get key → return to door → unlock → enter C)

**Success Metric:** Agent reaches Goal Room C

---

### Scenario 2: Guard Patrol

**Setup:**
- Single guard on 4-point rectangular patrol
- Agent must cross patrol path to reach goal
- No combat option - must avoid detection

**Learning Objectives:**
- Behavior observation ("guard moves clockwise")
- Prediction ("guard will be at X in 2 turns")
- Timing ("I should move now while guard faces away")

**Success Metric:** Agent reaches goal without triggering alert

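For evaluation it may help to have a ground-truth answer to the prediction objective ("guard will be at X in 2 turns"). The sketch below assumes the guard moves exactly one tile per turn along a closed loop through the patrol corners and is never blocked; `expand_loop`, `predict_position`, and the example corner coordinates are hypothetical.

```python
from typing import List, Tuple

Pos = Tuple[int, int]


def sign(n: int) -> int:
    """Return -1, 0, or 1 according to the sign of n."""
    return (n > 0) - (n < 0)


def expand_loop(corners: List[Pos]) -> List[Pos]:
    """List every tile on the closed patrol loop, in visiting order."""
    tiles: List[Pos] = []
    for i, (x, y) in enumerate(corners):
        tx, ty = corners[(i + 1) % len(corners)]
        # Walk one tile at a time toward the next corner (exclusive).
        while (x, y) != (tx, ty):
            tiles.append((x, y))
            x += sign(tx - x)
            y += sign(ty - y)
    return tiles


def predict_position(corners: List[Pos], current: Pos, turns: int) -> Pos:
    """Where an unobstructed guard at `current` will be after `turns` turns."""
    loop = expand_loop(corners)
    start = loop.index(current)  # raises ValueError if off the route
    return loop[(start + turns) % len(loop)]


corners = [(0, 0), (3, 0), (3, 2), (0, 2)]  # illustrative 4-point rectangle
ahead = predict_position(corners, (0, 0), 4)  # (3, 1)
```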

---

### Scenario 3: Cooperative Unlock

**Setup:**
- Agent A in Room 1 with key
- Agent B in Room 2 near locked door
- Door blocks Agent B's goal

**Learning Objectives:**
- Situation communication ("I have the key")
- Coordination ("Wait, I'm coming to unlock it")
- Theory of mind ("Agent B needs the door opened")

**Success Metric:** Both agents reach their respective goals

---

### Scenario 4: Combat Decision

**Setup:**
- Weak enemy blocks direct path to goal
- Alternative route available (longer, safe)
- Agent has limited HP

**Learning Objectives:**
- Risk assessment ("Can I win this fight?")
- Resource management ("Is the shortcut worth the HP?")
- Alternative planning ("I could go around instead")

**Success Metric:** Agent reaches goal (either path)

---

## Development Phases

### Phase 1: ActionLog Foundation

**Goal:** Instrument entity interactions without breaking existing CoS gameplay.

**Tasks:**
1. Create `ActionRecord` dataclass
2. Create `ActionLog` class with basic recording
3. Modify `COSEntity.bump()` to return `ActionRecord`
4. Add `ActionRecord` returns to `ev_enter()`, `ev_exit()`
5. Wire `ActionLog` into game loop

**Validation:** CoS plays normally; ActionLog captures all events

---

### Phase 2: TurnContext & Perception

**Goal:** Generate per-entity perception from ActionLog.

**Tasks:**
1. Create `TurnContext` dataclass
2. Implement FOV-filtered action retrieval
3. Implement `to_prose()` for LLM text generation
4. Add `SpeechChannel` with FOV-based reception
5. Create `TurnOrchestrator` that generates `TurnContext` per turn

**Validation:** Print `TurnContext.to_prose()` alongside gameplay; verify accuracy

---

### Phase 3: Clean Entity Set

**Goal:** Implement academically presentable entities.

**Tasks:**
1. `KeyEntity` - takeable item
2. `LockedDoorEntity` - conditional passage
3. `GuardEntity` - patrol behavior with FOV detection
4. `CombatantEntity` - basic melee combat
5. Level loader for scenario definitions

**Validation:** Human can play through all 4 scenarios

---

### Phase 4: Agent Integration

**Goal:** Connect LLM agents to the game world.

**Tasks:**
1. Adapt existing `TurnOrchestrator` from vllm_demo
2. Connect `TurnContext.to_prose()` to LLM prompts
3. Parse LLM responses to `ActionRecord`
4. Execute actions through entity system
5. Screenshot + text context packaging

**Validation:** Single agent completes Scenario 1 (Key Hunt)

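Task 3 depends on turning free-form model output into one of the action strings listed in `TurnContext.available_actions`. The sketch below shows one possible approach using the three example formats from that field ("GO NORTH", "TAKE brass_key", "SPEAK '...'"). It is not the existing `action_parser.py`; `parse_action` and its return shape are hypothetical.

```python
import re
from typing import Dict, Optional

# One pattern per action format shown in TurnContext.available_actions.
_PATTERNS = [
    ("move", re.compile(r"^GO\s+(NORTH|SOUTH|EAST|WEST)$", re.IGNORECASE)),
    ("take", re.compile(r"^TAKE\s+(\w+)$", re.IGNORECASE)),
    ("speak", re.compile(r"^SPEAK\s+['\"](.*)['\"]$", re.IGNORECASE)),
]


def parse_action(reply: str) -> Optional[Dict[str, str]]:
    """Return the first recognised action in the reply, or None."""
    for line in reply.splitlines():
        line = line.strip()
        for action_type, pattern in _PATTERNS:
            match = pattern.match(line)
            if match:
                arg = match.group(1)
                if action_type == "move":
                    arg = arg.lower()  # normalise direction names
                return {"action_type": action_type, "arg": arg}
    return None


parsed = parse_action("The key is right there.\nTAKE brass_key")
```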

---

### Phase 5: Multi-Agent & Speech

**Goal:** Enable agent-to-agent communication.

**Tasks:**
1. Multi-agent turn sequencing
2. Speech action execution via `SpeechChannel`
3. Speech reception in `TurnContext`
4. Test Scenario 3 (Cooperative Unlock)

**Validation:** Two agents coordinate via speech to solve puzzle

---

### Phase 6: Evaluation & Analysis

**Goal:** Systematic evaluation of agent capabilities.

**Tasks:**
1. Simulation logging (full ActionLog export)
2. Success/failure metrics per scenario
3. Behavior analysis tools
4. Comparison across LLM models

**Deliverable:** Paper-ready evaluation results

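Tasks 2 and 4 could start from a simple per-model, per-scenario success rate. The sketch below assumes each run is summarised as an `Episode`; that class, its field names, and `success_rates` are hypothetical and not part of the planned modules.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, Tuple


@dataclass(frozen=True)
class Episode:
    """Hypothetical per-run summary; field names are illustrative."""
    model: str
    scenario: str
    success: bool
    turns: int


def success_rates(episodes: Iterable[Episode]) -> Dict[Tuple[str, str], float]:
    """Fraction of successful runs for each (model, scenario) pair."""
    wins: Dict[Tuple[str, str], int] = defaultdict(int)
    runs: Dict[Tuple[str, str], int] = defaultdict(int)
    for ep in episodes:
        key = (ep.model, ep.scenario)
        runs[key] += 1
        wins[key] += int(ep.success)
    return {key: wins[key] / runs[key] for key in runs}
```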

---

## File Structure

```
tests/
├── vllm_demo/                    # Existing demo code
│   ├── action_parser.py          # LLM response parsing
│   ├── action_executor.py        # → Refactor to use ActionLog
│   ├── world_graph.py            # → Keep for room descriptions
│   ├── turn_orchestrator.py      # → Enhance with TurnContext
│   └── scenarios/                # New: scenario definitions
│       ├── key_hunt.py
│       ├── guard_patrol.py
│       ├── cooperative_unlock.py
│       └── combat_decision.py
│
src/scripts/
├── cos_entities.py               # → Add ActionRecord returns
├── game_entities.py              # New: clean academic entities
├── action_log.py                 # New: ActionLog system
├── speech_channel.py             # New: agent communication
└── turn_context.py               # New: perception generation
```

---

## Related Issues

- #154 - Grounded Multi-Agent Testbed (parent)
- #155 - Deterministic Text Descriptions (closed)
- #156 - Turn-based LLM Agent Orchestration (in progress)
- #153 - Separate render loop from game state loop (complete)

---

## References

- Crypt of Sokoban entity system: `src/scripts/cos_entities.py`
- Existing VLLM demos: `tests/vllm_demo/`
- FOV/Perspective system: Issue #154 comment (2025-12-01)