# SYSADMIN CHRONICLES — ARCHITECTURE DOCUMENT
> Version 5.0 | Status: Active development
>
> Changelog:
> v5.0 — GDScript/Godot codebase removed. Node.js + Svelte is the only codebase.
> v4.0 — Full architecture pivot to Node.js game server + Svelte web HUD.
> v3.x — Save system, world flags, trust, incidents, pressure system (GDScript era).
> v2.0 — Native Godot 4 + libvirt design (superseded).
> v1.0 — Browser/v86 prototype (superseded).

---

## 1. PROJECT OVERVIEW

**Sysadmin Chronicles** is a native Linux-only game where the player works as a
junior sysadmin at Axiom Works, handling tickets inside **real Linux virtual
machines** managed by **QEMU/KVM via libvirt**.

The runtime stack (as of v4.0):
- **Game server** — Node.js / Express + WebSocket (`server/`). Owns all game
  logic: quest state, trust, validation, VM lifecycle, incidents, save state.
- **Web HUD** — Svelte single-page app (`frontend/`). Tickets, mail, Sage, docs,
  trust bar. Served from the game server at `http://192.168.100.1:3000`.
- **Workstation VM** — XFCE desktop (Debian 12, sc-workstation). Player's desk.
  Chromium auto-opens the HUD. Tilix provides a real terminal for SSH to target VMs.
- **Target VMs** — Headless Debian (hermes) and Arch (vulcan). Quest objectives
  live here. Player investigates and fixes via SSH from the workstation terminal.

The player experience:
- Sits at the workstation VM (via SPICE/remote-viewer fullscreen on the host)
- Reads tickets and mail in the Chromium HUD
- Opens Tilix, SSHes to hermes or vulcan, fixes real problems
- Clicks "Mark Complete" in the HUD — game server SSHes in and validates VM state
- World reacts, trust shifts, new mail arrives via WebSocket push

No simulated terminal. No fake SSH sessions.

---

## 2. CORE DESIGN PRINCIPLES

- Realism over simulation
- Native Linux execution only
- CLI-first development and asset wiring
- Minimal, stable scenes; behavior lives in scripts
- Data-driven content for quests, tickets, incidents, and dialogue
- State-based validation only; never command-sequence checking
- Multiple valid solutions where possible
- Pressure comes from evolving systems, not arbitrary timers
- Progression unlocks access, tools, and scope, not RPG stats
- Deterministic systems so content is testable and agent-friendly
- The dirty VM state is the game — preserve it, do not erase it

---

## 3. HIGH-LEVEL ARCHITECTURE

```
HOST MACHINE
├── game-server/            Node.js/Express + WebSocket (server/src/)
│   ├── ContentLoader       loads content/ JSON at startup
│   ├── QuestEngine         quest state machine
│   ├── TicketService       ticket state, mark-complete handler
│   ├── ValidationEngine    SSH into VMs, evaluates rules
│   ├── VMManager           virsh start/stop/snapshot wrappers
│   ├── TrustSystem         score, unlock evaluation, revocation
│   ├── ProgressionSystem   unlocked docs, VMs, access
│   ├── EmailService        inbox, follow-up emails, reply options
│   ├── SageService         rule-based knowledge base / dialogue
│   ├── ShiftTimer          shift clock, pressure tick schedule
│   ├── IncidentScheduler   incident injection
│   └── SaveState           ~/.local/share/sysadmin-chronicles/save.json
│
├── frontend/               Svelte web HUD (frontend/src/)
│   ├── TicketsPanel        ticket list, detail, "Mark Complete" button
│   ├── MailPanel           inbox, message view, reply buttons
│   ├── DocsPanel           trust-gated internal docs
│   ├── SagePanel           chat / knowledge base search
│   └── HeaderBar           trust indicator, shift timer, unread count
│
└── content/                JSON content — quests, tickets, dialogue, etc.

NETWORK: sc-internal (libvirt bridge 192.168.100.0/24)
  192.168.100.1  host (game server port 3000)

VMs on sc-internal
├── sc-workstation (ares)      Debian 12 XFCE — player's desk
│   ├── Chromium → http://192.168.100.1:3000 (HUD, always open)
│   └── Tilix → SSH to hermes/vulcan (real terminal)
├── sc-web-server (hermes)     headless Debian (Q002–Q005, Q007)
└── sc-build-machine (vulcan)  headless Arch (Q006, Q008)

PLAYER FLOW:
  Host starts game server → boots sc-workstation via SPICE
  Player sees XFCE desktop → Chromium with HUD auto-open
  Reads ticket → opens Tilix → SSH hermes → fixes problem
  Clicks "Mark Complete" → server SSHes hermes → validates
  Trust updates → WebSocket pushes to browser → new mail arrives
```

---

## 4. RUNTIME MODEL

### 4.1 Game Server — Node.js

The game server (`server/src/index.js`) is a Node.js/Express application:
- Serves `frontend/dist/` as static files at `/`
- WebSocket server on the same port (real-time event push to HUD)
- On startup: loads all content JSON, hydrates services from save file,
  ensures workstation VM is live via VMManager

The server is responsible for:
- All game logic (quest state, trust, progression, incidents)
- VM lifecycle management (virsh via child_process)
- Validation — SSH into target VMs and evaluate rules
- Save/load (single JSON file at `~/.local/share/sysadmin-chronicles/save.json`)
- WebSocket broadcast of trust changes, new mail, shift ticks, incident alerts

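
The broadcast responsibility listed above can be pictured as a thin bridge between an internal event bus and the connected HUD clients. The following is a minimal sketch, not the project's actual wiring: `attachBroadcaster`, the JSON envelope shape, and every event name except `shift:tick` are assumptions made for illustration. A fake client stands in for a real WebSocket connection so the snippet runs on its own.

```javascript
const { EventEmitter } = require('node:events');

// Hypothetical internal bus, standing in for lib/eventBus.js.
const eventBus = new EventEmitter();

// Event names forwarded to the HUD. Only "shift:tick" appears in this
// document; the other names are illustrative assumptions.
const PUSHED_EVENTS = ['trust:changed', 'mail:new', 'shift:tick', 'incident:alert'];

// Forward selected bus events to every connected client as a JSON envelope.
// `clients` is any iterable of objects exposing send(string), for example
// the `clients` set of a `ws` WebSocketServer.
function attachBroadcaster(bus, clients) {
  for (const type of PUSHED_EVENTS) {
    bus.on(type, (payload) => {
      const message = JSON.stringify({ type, payload });
      for (const client of clients) {
        client.send(message);
      }
    });
  }
}

// Usage with a fake client that records what it receives.
const received = [];
const fakeClients = new Set([{ send: (msg) => received.push(JSON.parse(msg)) }]);
attachBroadcaster(eventBus, fakeClients);
eventBus.emit('shift:tick', { shift: 3, minutesRemaining: 42 });
```

With real sockets, a production version would likely also skip clients whose connection is no longer open before calling `send`.
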
### 4.2 Frontend — Svelte

The web HUD (`frontend/src/`) is a Svelte single-page app:
- Built with Vite; output lands in `frontend/dist/` and is served by the game server
- All data fetched from the game server API; no local state beyond UI
- WebSocket client for real-time updates
- Does not run validation — only displays results

### 4.3 Target Platform

- Host OS: Linux
- Supported deployment model: start game server on host, view workstation via SPICE
- Required host: KVM, libvirt, virsh, Node.js 18+, virt-viewer
- Required install model: one-time host setup with clean uninstall path

No Windows, macOS, or browser target is planned for the host. The HUD is a web
app served locally — it is never exposed to the internet.

---

## 5. VIRTUAL MACHINE SYSTEM

### 5.1 Required Stack

- `qemu-system-*`
- `KVM`
- `libvirtd`
- `virsh`
- libvirt virtual networks
- qcow2-backed VM images

Runtime policy:
- The shipped game should not require broad `sudo` usage during normal play
- One-time host setup may require admin approval
- Ongoing gameplay should run as a regular user against a prepared VM runtime

### 5.2 Core Behavior

The game controls VMs through libvirt, not by emulating them internally.

Responsibilities:
- Ensure required domains and networks exist
- Start the active VM
- Stop or suspend inactive VMs
- Revert to known snapshots for resets
- Query runtime state for evaluation
- Attach the player to the appropriate VM workflow

The workstation and at least one target VM must be able to run at the same
time. This is required for real SSH-based play and for background incidents to
continue evolving while the player works elsewhere.

Operational guidance:
- `workstation` stays live during normal play
- At least one target VM stays live with it
- Later phases may keep all major quest VMs active simultaneously
- Resource budgets should be documented and enforced conservatively

Lab finding:
- Small headless target VMs were inexpensive on the test host
- The workstation became materially heavier once a real graphical session and
  browser were added
- Budget the workstation separately from server-style quest VMs

### 5.3 Initial VM Roles

| ID | Role | Distro | Hostname | Purpose |
|----|------|--------|----------|---------|
| `workstation` | Player desktop | Debian 12 | `ares` | XFCE + Chromium HUD + Tilix terminal |
| `web_server` | Service host | Debian 12 | `hermes` | Web/service quests (Q002–Q005, Q007) |
| `build_machine` | Build box | Arch | `vulcan` | Package/build/update quests (Q006, Q008) |

### 5.3.1 Workstation Profile

The workstation is a full XFCE desktop (Debian 12, 768–1536 MB RAM):
- **Chromium** — opens `http://192.168.100.1:3000` on login (game HUD)
- **Tilix** — split-pane terminal, set as default; player SSHes to hermes/vulcan from here
- **Full sysadmin CLI toolkit** pre-installed (vim, htop, tmux, curl, nmap, tcpdump, etc.)
- SPICE display with QXL video — dynamic resolution via vdagent; fullscreen via `remote-viewer`
- `always_live: true` — stays running between shifts; suspended on game quit, resumed on next launch

The player never needs to interact with the workstation VM's internal file system for
game objectives — all quest work happens on the target VMs via SSH.

### 5.3.2 Why XFCE + Chromium (not terminal-only)

Earlier iterations used a terminal-only workstation. The game was redesigned
because a terminal-only approach would require building a fake terminal and fake SSH.
The XFCE + real browser approach is simpler, more realistic, and requires no
terminal simulation at all:

- Player uses a real Tilix terminal — no simulation
- Player SSHes with real SSH — no protocol emulation
- The HUD is a real web app — no custom UI framework needed for game chrome
- Downside: workstation VM costs ~480–768 MB RAM; budget accordingly

### 5.4 Snapshot Strategy

Snapshots are the reset primitive and the save primitive.

Named snapshot tiers per VM:

| Name | Purpose |
|------|---------|
| `baseline.clean` | Authored starting state for a fresh quest arc |
| `baseline.recovery` | Fallback if live state is unrecoverable |
| `checkpoint.shift-{N}` | Auto-saved at start of each in-game shift |

Rules:
- Snapshot names are deterministic
- Quest scripts may declare required baseline snapshots
- Validation never depends on snapshot history; only current observed state
- The game retains a maximum of 5 shift checkpoints per VM; older ones are pruned
- `baseline.clean` and `baseline.recovery` are never pruned by the game

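
Because snapshot names are deterministic, the retention rule can be expressed as a pure function over a VM's snapshot list. The sketch below assumes the list has already been obtained from libvirt; `checkpointsToPrune` is a hypothetical helper name, and ordering by the shift number embedded in the name is one reasonable reading of "older ones are pruned".

```javascript
// Matches only shift checkpoints, so baseline.clean and baseline.recovery
// can never be selected for pruning.
const CHECKPOINT_RE = /^checkpoint\.shift-(\d+)$/;

// Return the snapshot names to delete so that at most `maxCheckpoints`
// shift checkpoints remain, keeping the highest shift numbers.
function checkpointsToPrune(snapshotNames, maxCheckpoints = 5) {
  const checkpoints = snapshotNames
    .map((name) => ({ name, match: CHECKPOINT_RE.exec(name) }))
    .filter(({ match }) => match !== null)
    .map(({ name, match }) => ({ name, shift: Number(match[1]) }))
    .sort((a, b) => b.shift - a.shift); // newest shift first

  return checkpoints.slice(maxCheckpoints).map((c) => c.name);
}

// Usage: seven checkpoints exist, so the two oldest are selected.
const snapshotNames = [
  'baseline.clean',
  'baseline.recovery',
  'checkpoint.shift-1',
  'checkpoint.shift-2',
  'checkpoint.shift-3',
  'checkpoint.shift-4',
  'checkpoint.shift-5',
  'checkpoint.shift-6',
  'checkpoint.shift-7',
];
const doomed = checkpointsToPrune(snapshotNames);
```

Keeping the selection separate from the actual deletion makes the rule easy to test without touching libvirt.
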
### 5.5 Networking Model

Networking is host-controlled through libvirt.

Supported modes:
- `quest`: constrained, deterministic virtual networks and fixtures
- `sandbox`: broader connectivity for experimentation

Examples:
- Internal-only network between workstation and target VM
- Broken DNS as part of a quest
- Deliberately degraded service reachability
- Optional outbound package mirror access for selected scenarios

### 5.6 VM Provisioning Hooks

Quest-specific VM state — broken configs, missing files, log histories — is
authored into the VM baseline before the snapshot is taken. This is done via
idempotent provisioning scripts:

```
tools/vm/quest-prep/Q0XX-prep.sh
```

These scripts run against the target VM before the quest's `baseline.clean`
snapshot is taken. They are never run at quest activation time. See
QUEST_AUTHORING.md for the full provisioning workflow.

---

## 6. OBSERVATION AND VALIDATION

### 6.1 Validation Philosophy

Quest completion is based on **system state**, not on how the player got there.

Allowed evidence includes:
- Files and directory contents
- Ownership and permissions
- Service state
- Process state
- Open ports
- Package state
- Mount state
- Disk utilization
- System configuration values

Disallowed as primary success conditions:
- Specific commands typed
- Specific files opened
- UI click history

### 6.2 Observation Sources

Primary sources:
- `virsh domstate`, `domifaddr`, and domain metadata
- Host-driven inspection tooling such as libguestfs where practical
- SSH-based read-only checks initiated by the host when needed
- Quest-specific host probe scripts for higher-level state summaries

Authoritative rule:
- Quest validation must use host-authoritative checks only
- In-guest helpers may improve responsiveness, but cannot decide success

In-guest helpers should use neutral names (examples: `atlas-index`, `yardd`,
`ops-telemetry-cache`) and must not be trusted as a security boundary.

Operational note:
- Routine package operations inside guests may emit maintenance or virtualization
  notices that break immersion
- Base images should suppress or tune guest maintenance messaging where safe
  for the authored environment
- Validation and incident design should not rely on noisy package-manager side
  effects being visible to the player

### 6.3 Validation Rule Model

Core rule families:
- `file_exists` / `file_contains` / `file_mode` / `file_owner`
- `directory_exists`
- `service_state` / `service_enabled`
- `process_running` / `process_user`
- `port_listening`
- `package_installed`
- `mount_present`
- `disk_usage_below` / `disk_usage_above`
- `command_assert` — fallback only, must verify state not behavior
- `and` / `or` / `not`

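
To show how the leaf rules and the `and` / `or` / `not` combinators compose, here is a small evaluator over an already-collected observation object. It is a sketch under stated assumptions: the rule field names (`rules`, `rule`, `path`, `service`, `port`), the shape of `observed`, and the `evaluate` function are hypothetical, and only four of the rule families are implemented. In the real server the observation would come from host-initiated checks rather than a literal.

```javascript
// Observed facts about a VM, as the host might collect them.
// The shape of this object is an assumption made for the example.
const observed = {
  files: { '/etc/nginx/nginx.conf': { mode: '0644', owner: 'root' } },
  services: { nginx: { state: 'active', enabled: true } },
  listeningPorts: [22, 80],
};

// Leaf rule checks, keyed by rule type.
const leafChecks = {
  file_exists: (rule, obs) => Object.hasOwn(obs.files, rule.path),
  file_mode: (rule, obs) => obs.files[rule.path]?.mode === rule.mode,
  service_state: (rule, obs) => obs.services[rule.service]?.state === rule.state,
  port_listening: (rule, obs) => obs.listeningPorts.includes(rule.port),
};

// Recursively evaluate a rule tree against the observed state.
function evaluate(rule, obs) {
  switch (rule.type) {
    case 'and':
      return rule.rules.every((r) => evaluate(r, obs));
    case 'or':
      return rule.rules.some((r) => evaluate(r, obs));
    case 'not':
      return !evaluate(rule.rule, obs);
    default: {
      const check = leafChecks[rule.type];
      if (!check) throw new Error(`Unknown rule type: ${rule.type}`);
      return check(rule, obs);
    }
  }
}

// Usage: nginx is active, serves on 80 or 443, and no maintenance flag exists.
const rule = {
  type: 'and',
  rules: [
    { type: 'service_state', service: 'nginx', state: 'active' },
    { type: 'or', rules: [{ type: 'port_listening', port: 80 }, { type: 'port_listening', port: 443 }] },
    { type: 'not', rule: { type: 'file_exists', path: '/etc/nginx/maintenance.flag' } },
  ],
};
const passed = evaluate(rule, observed);
```

The evaluator only ever reads the end state, which is what keeps validation independent of the commands the player typed.
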
### 6.4 Trust Boundary

The player may gain root access on some machines. The guest is not trusted. The
host validation layer is trusted. Anti-cheat is achieved through external
validation, not secrecy.

---

## 7. GAMEPLAY SYSTEMS

### 7.1 Core Loop

1. Ticket arrives with incomplete context
2. Player evaluates urgency against other active problems
3. Player connects to the relevant VM
4. Player investigates using real Linux tools
5. Player applies a fix
6. Game validates resulting state
7. World reacts
8. Trust shifts
9. Future conditions reflect earlier choices

### 7.2 System Pressure

Pressure is systemic, not a countdown bar. Examples:
- Disk usage keeps climbing
- A log fills with worsening symptoms
- A degraded service starts affecting another team
- A quick fix suppresses one symptom while creating later instability

Pressure is authored as state transitions and event chains via incident files.

### 7.3 Trust / Reputation

Trust measures how much the organization relies on the player.

Trust affects:
- sudo scope
- accessible machines
- diagnostic tooling
- ticket sensitivity
- documentation visibility

**Trust increases** when the player resolves problems cleanly, finds root causes,
and avoids collateral damage.

**Trust decreases** when the player breaks unrelated systems, applies fragile
fixes, ignores urgent incidents, or resolves symptoms but not causes.

**Trust revocation**: if trust falls below a declared threshold in the trust
unlock table, specific access strings are revoked. A subsequent trust increase
does not automatically restore revoked access — the player must re-earn the
unlock tier. Revocation rules must be explicitly declared per unlock tier.

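
The revocation rule behaves like hysteresis: access is granted at one threshold and removed at a lower one, and the gap between them is where "must re-earn" applies. The sketch below illustrates that reading. The tier fields `unlockAt` and `revokeBelow`, the access string `sudo:systemctl`, the numeric scores, and `updateAccess` are all assumptions for the example, not the project's schema.

```javascript
// One hypothetical unlock tier with an explicit revocation threshold.
const tiers = [
  { access: 'sudo:systemctl', unlockAt: 50, revokeBelow: 30 },
];

// Return the new set of granted access strings for the given trust score.
// Between revokeBelow and unlockAt nothing changes, so revoked access
// stays revoked until the unlock threshold is reached again.
function updateAccess(granted, score, unlockTiers) {
  const next = new Set(granted);
  for (const tier of unlockTiers) {
    if (score >= tier.unlockAt) {
      next.add(tier.access);
    } else if (score < tier.revokeBelow) {
      next.delete(tier.access);
    }
  }
  return next;
}

// Usage: earn, dip slightly, fall below the threshold, then partially recover.
let access = new Set();
access = updateAccess(access, 55, tiers); // earned
access = updateAccess(access, 40, tiers); // still held
const keptAt40 = access.has('sudo:systemctl');
access = updateAccess(access, 25, tiers); // revoked
access = updateAccess(access, 40, tiers); // not restored yet
const restoredAt40 = access.has('sudo:systemctl');
access = updateAccess(access, 50, tiers); // re-earned
const reEarnedAt50 = access.has('sudo:systemctl');
```

The same score of 40 yields different access depending on history, which is exactly the behavior the revocation paragraph describes.
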
### 7.4 Multiple Valid Solutions

Quests support realistic alternatives where possible:
- quick workaround
- operationally acceptable fix
- proper long-term fix

Branch resolution rule:
- multiple branches may match the same final state
- each branch must declare a numeric `priority`
- the highest matching priority wins
- ties are a content error and fail validation during authoring checks

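
The branch resolution rule is small enough to sketch directly. In the example below, `priority` is the field named above; the branch `id` values, the `matches` callback, and the function names `resolveBranch` and `findPriorityTies` are hypothetical. The tie check is shown as a separate authoring-time helper, since ties are meant to be caught before content ships.

```javascript
// Pick the matching branch with the highest priority, or null if none match.
// `matches` is a predicate telling whether a branch's rules hold right now.
function resolveBranch(branches, matches) {
  const matching = branches.filter(matches);
  if (matching.length === 0) return null;
  const top = Math.max(...matching.map((b) => b.priority));
  const winners = matching.filter((b) => b.priority === top);
  if (winners.length > 1) {
    throw new Error(`Content error: priority ${top} is declared by more than one branch`);
  }
  return winners[0];
}

// Authoring check: list every priority value used by more than one branch.
function findPriorityTies(branches) {
  const counts = new Map();
  for (const { priority } of branches) {
    counts.set(priority, (counts.get(priority) ?? 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([priority]) => priority);
}

// Usage: the workaround and the acceptable fix both match the final state.
const branches = [
  { id: 'workaround', priority: 10 },
  { id: 'acceptable_fix', priority: 20 },
  { id: 'proper_fix', priority: 30 },
];
const matchedIds = new Set(['workaround', 'acceptable_fix']);
const winner = resolveBranch(branches, (b) => matchedIds.has(b.id));
const ties = findPriorityTies(branches);
```
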
### 7.5 Dynamic Events

Dynamic events inject prioritization pressure and are authored in incident files.
Events are selected from authored pools and activated by progression, trust,
current system state, and world flags.

Each incident declares a `blast_radius_quests` list so the incident scheduler
can avoid activating an incident that would corrupt active quest evidence or
simultaneously interfere with an in-progress objective.

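
The blast-radius check amounts to a set intersection between an incident's declared quests and the quests currently in progress. A minimal sketch follows, assuming incidents are plain objects loaded from JSON; the incident IDs and the `eligibleIncidents` helper are hypothetical, while `blast_radius_quests` is the field named above.

```javascript
// Keep only incidents whose blast radius does not touch any active quest.
function eligibleIncidents(pool, activeQuestIds) {
  const active = new Set(activeQuestIds);
  return pool.filter(
    (incident) => !incident.blast_radius_quests.some((questId) => active.has(questId)),
  );
}

// Usage: Q003 is in progress, so the incident that could disturb it is held back.
const incidentPool = [
  { id: 'INC-disk-creep', blast_radius_quests: ['Q003', 'Q004'] },
  { id: 'INC-dns-flap', blast_radius_quests: ['Q006'] },
  { id: 'INC-noisy-cron', blast_radius_quests: [] },
];
const eligible = eligibleIncidents(incidentPool, ['Q003']).map((i) => i.id);
```

Progression, trust, and world-flag conditions would be applied as further filters on the same pool.
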
### 7.6 Investigation Quality

Clues must be legible and grounded. Every quest declares a `clue_fingerprint`
documenting what evidence exists in the VM baseline. Content validation checks
that the fingerprint is plausible. The player should feel rewarded for competent
debugging rather than guessing.

### 7.7 Progression

Progression unlocks:
- broader sudo access
- new servers
- more dangerous responsibilities
- better internal docs
- helper scripts and diagnostics

This is institutional progression, not character stats.

### 7.8 Mentor Thread

Marcus is the primary mentor character. His dialogue runs across the full game
as a `series_id: marcus-main` thread. Each dialogue file that belongs to an
ongoing character relationship declares `series_id` and `series_position`.

The dialogue system tracks series state so Marcus remembers what happened in
earlier quests and can reference it in later ones. This is the primary vehicle
for institutional memory and character continuity.

### 7.9 Tone and Humor

The tone is dry, realistic, and slightly dysfunctional. Examples:
- contradictory runbooks
- tickets that misidentify the problem
- passive-aggressive internal notes
- perfect urgency attached to trivial formatting requests

Humor must support immersion, not break it.

---

## 8. COMMAND AND ACCESS MODEL

Access is controlled realistically through:
- user accounts and group membership
- sudoers configuration
- reachable hosts
- available packages and tooling

If a player cannot run `systemctl`, the reason is that the VM account lacks the
required privileges, not that the game disabled the verb.

---

## 9. PRESENTATION LAYER

The player's view is the workstation VM desktop, viewed fullscreen via SPICE:

```bash
scripts/start-game.sh
# → starts game server
# → virsh start sc-workstation (if not already running)
# → remote-viewer --full-screen spice://127.0.0.1:<port>
```

The player sees an XFCE desktop with Chromium pre-opened to the HUD.

### 9.1 VM Display

- **Protocol**: SPICE with QXL video driver
- **Client**: `remote-viewer` (from `virt-viewer` package) in fullscreen mode
- **Resolution**: dynamic — guest vdagent resizes to match host display
- **Cursor release**: `Ctrl+Alt`; fullscreen toggle: `F11`
- **Clipboard sharing**: via spice-vdagent in the guest

No VNC, no custom viewer widget. The host runs `remote-viewer` and the player
works inside the workstation VM.

### 9.2 HUD (Svelte Web App)

The game HUD is a Svelte single-page app served at `http://192.168.100.1:3000`:

- **TicketsPanel** — ticket list, detail view, "Mark Complete" button
- **MailPanel** — inbox, message body, reply buttons (where applicable)
- **DocsPanel** — trust-gated internal docs, rendered from content/docs/
- **SagePanel** — chat interface to SageService knowledge base
- **HeaderBar** — trust indicator (no number, behavior only), shift timer, unread badge

The HUD is a company intranet portal in look and feel — dark, monospace, minimal.

### 9.3 One-Time Setup and Uninstall

Host-side setup is unavoidable (KVM, libvirt, VM images). It must be simple.

Principles:
- one-time setup only (`tools/setup/first-run-setup.sh`)
- plain-language explanation of what will be installed
- managed resources use the `sc-` prefix (never touch other libvirt domains)
- full uninstall removes all game-owned domains, networks, storage, helper files
- normal gameplay does not require broad `sudo`

---

## 10. DATA MODEL

Authoring formats:
- JSON for quests, tickets, incidents, dialogue, documentation metadata
- Shell helper scripts where CLI integration is necessary

Top-level content domains:

| Domain | Purpose |
|--------|---------|
| `quests/` | Objective chains and validation rules |
| `tickets/` | Player-facing problem statements |
| `incidents/` | Dynamic system pressure events |
| `dialogue/` | Workplace messages, hints, follow-ups |
| `docs/` | Internal documentation metadata/content |
| `progression/` | Trust thresholds, unlocks, access tiers |
| `vm_profiles/` | Domain names, snapshots, networks, probe config |
| `helpers/` | Non-obvious guest helper naming/config data |
| `world_flags/` | Central registry of all world state flags |

Each authored scenario must declare:
- `required_vms` — all VMs the quest touches
- `baseline_snapshot` — starting snapshot for this quest
- `clue_fingerprint` — evidence declared in the VM baseline
- validation rules and branch priorities
- escalation behavior
- trust impact
- `blast_radius` — incident IDs the quest may interact with
- follow-on world effects

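
A content check for these required declarations could look like the sketch below. Only `required_vms`, `baseline_snapshot`, `clue_fingerprint`, and `blast_radius` are field names taken from this document; `branches`, `escalation`, `trust_impact`, and `world_effects` are placeholder keys invented for the example, as are the quest ID, the sample values, and the `missingFields` helper. The sample scenario deliberately omits its escalation block so the check has something to report.

```javascript
// Keys every authored scenario must carry. The last four names before
// world_effects are partly assumptions; see the paragraph above.
const REQUIRED_FIELDS = [
  'required_vms',
  'baseline_snapshot',
  'clue_fingerprint',
  'branches',      // hypothetical key: validation rules and branch priorities
  'escalation',    // hypothetical key: escalation behavior
  'trust_impact',  // hypothetical key: trust impact
  'blast_radius',
  'world_effects', // hypothetical key: follow-on world effects
];

// Return the required keys that a scenario object does not declare.
function missingFields(scenario) {
  return REQUIRED_FIELDS.filter((field) => !Object.hasOwn(scenario, field));
}

// A made-up scenario, written as a JS object for brevity; on disk it
// would be a JSON file. The escalation block is intentionally missing.
const scenario = {
  id: 'Q00X-example',
  required_vms: ['web_server'],
  baseline_snapshot: 'baseline.clean',
  clue_fingerprint: ['error lines in the service log'],
  branches: [{ id: 'proper_fix', priority: 30 }],
  trust_impact: { proper_fix: 5 },
  blast_radius: [],
  world_effects: [],
};
const missing = missingFields(scenario);
```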
---

## 11. SAVE MODEL

### 11.1 Dirty State Model

The game uses a **dirty state model**. VM disk state is preserved across
sessions as-is. The game does not revert to a clean baseline on load — it
resumes from whatever state the VMs are currently in.

This is intentional. The player's history of changes is part of the game. A
machine they fixed stays fixed. A machine they damaged stays damaged until they
repair it or request reimage.

Two persistence layers:

**Game State Layer** — saved as JSON:
- Trust score and history
- Unlocked access, sudo scopes, docs, tools
- Active/completed quest and ticket state
- World flags (current values and change history)
- Incident scheduler state
- In-world clock and shift counter

**VM State Layer** — saved as libvirt snapshot references:
- Per-VM reference to current snapshot tier or live disk
- Per-VM managed recovery checkpoint list
- Reimage history per VM

### 11.2 Shift Checkpoints

At the start of each in-game shift:
1. Game state JSON is saved
2. A named snapshot is created per active VM: `checkpoint.shift-{N}`
3. The checkpoint reference is recorded in the save file
4. Shift checkpoints beyond the retention limit (default: 5) are pruned

Shift checkpoint rollback is an explicit player action ("start this shift
over") with a confirmation prompt. It does not undo trust changes or dialogue
already delivered.

### 11.3 Load-Time Reconciliation

On load, the observation service validates current VM state against saved world
flags. Minor drift is handled silently. Major drift — missing snapshots,
unbootable VMs — triggers the recovery flow.

If a referenced snapshot is missing:
- If `baseline.recovery` exists, offer resume from recovery
- If `baseline.recovery` is also gone, the VM is treated as unrecoverable

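
The missing-snapshot handling reduces to a three-way decision that can be tested without any VM present. The sketch below assumes the list of snapshots that actually exist has already been read from libvirt; `resolveSnapshotReference` and the `action` labels are hypothetical names for illustration.

```javascript
// Decide what to do with a saved snapshot reference at load time.
function resolveSnapshotReference(referenced, availableSnapshots) {
  const available = new Set(availableSnapshots);
  if (available.has(referenced)) {
    return { action: 'resume', snapshot: referenced };
  }
  if (available.has('baseline.recovery')) {
    return { action: 'offer_recovery', snapshot: 'baseline.recovery' };
  }
  return { action: 'unrecoverable', snapshot: null };
}

// Usage: the save file points at a checkpoint that no longer exists.
const decision = resolveSnapshotReference('checkpoint.shift-4', [
  'baseline.clean',
  'baseline.recovery',
  'checkpoint.shift-5',
]);
```

An `unrecoverable` result is the point at which the reimage flow takes over.
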
### 11.4 Recovery / Reimage Flow

When a VM is unrecoverable, the player can report it for reimage through an
in-world mechanic:

1. Player submits a reimage request (ticket to management)
2. In-world delay is imposed (one in-game shift)
3. Machine is restored from `baseline.recovery` or `baseline.clean`
4. Trust penalty is applied based on severity
5. In-progress quests on that VM are reset
6. Evidence from before the reimage is gone — acknowledged in-world

This is the designed escape valve. It has visible consequences but allows
forward progress.

### 11.5 Host Storage Management

qcow2 images with many snapshots can balloon. The game enforces:
- Maximum of 5 shift checkpoints per VM (configurable in vm_profile)
- Authored baseline and recovery snapshots are never pruned by the game
- `resource_budget` in vm_profile declares expected disk footprint

### 11.6 Developer Reset

Not available in the shipped game. CLI only:

```bash
bash tools/vm/snapshot-all.sh --revert-to baseline.clean
```

Completely resets all VMs to authored baseline. Used during content authoring
and automated test runs.

---

## 12. MODULE BREAKDOWN

### Server (`server/src/`)

| Module | Responsibility |
|--------|----------------|
| `index.js` | Express + WebSocket entry point; service wiring; static file serving |
| `ContentLoader` | Loads all content/ JSON at startup; never writes |
| `QuestEngine` | Quest state machine (pending → active → resolved) |
| `TicketService` | Ticket state, mark-complete handler, branch resolution |
| `ValidationEngine` | SSH into VMs, evaluates all rule types against real state |
| `VMManager` | virsh start/stop/snapshot/getIP wrappers |
| `TrustSystem` | Score tracking, unlock evaluation, revocation |
| `ProgressionSystem` | Unlocked docs, VMs, access strings |
| `EmailService` | Inbox, follow-up emails, reply options, WebSocket push |
| `SageService` | Rule-based dialogue / knowledge base |
| `ShiftTimer` | Shift clock, broadcasts shift:tick via WebSocket |
| `IncidentScheduler` | Pressure tick loop, incident injection |
| `ShiftReviewService` | End-of-shift performance review email generation |
| `CertificationService` | Awards internal certs after quest chain completion |
| `SaveState` | Read/write `~/.local/share/sysadmin-chronicles/save.json` |
| `lib/ssh.js` | Promisified SSH command execution (node-ssh) |
| `lib/virsh.js` | virsh command wrappers |
| `lib/eventBus.js` | Internal Node.js EventEmitter for service coordination |

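
As one illustration of the table above, the `QuestEngine` state machine (pending → active → resolved) can be modelled with a transition table. This is a sketch only: the `TRANSITIONS` map, the `transition` function, and the quest object shape are assumptions, and the real engine may track more states or side effects.

```javascript
// Allowed next states for each quest state.
const TRANSITIONS = {
  pending: ['active'],
  active: ['resolved'],
  resolved: [],
};

// Return a copy of the quest in its next state, rejecting illegal jumps.
function transition(quest, nextState) {
  const allowed = TRANSITIONS[quest.state];
  if (!allowed || !allowed.includes(nextState)) {
    throw new Error(`Illegal quest transition: ${quest.state} -> ${nextState}`);
  }
  return { ...quest, state: nextState };
}

// Usage: walk a quest through its full lifecycle.
let quest = { id: 'Q00X-example', state: 'pending' };
quest = transition(quest, 'active');
quest = transition(quest, 'resolved');
```

Returning a new object instead of mutating keeps each state change easy to persist and to replay in tests.
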
### Frontend (`frontend/src/`)

| Component | Responsibility |
|-----------|----------------|
| `App.svelte` | Root component; WebSocket connection; panel routing |
| `TicketsPanel` | Ticket list, detail, mark-complete flow |
| `MailPanel` | Inbox, message body, reply buttons |
| `DocsPanel` | Trust-gated doc list and content viewer |
| `SagePanel` | Chat interface, follow-up prompts |
| `VmsPanel` | Live VM status indicators |
| `HeaderBar` | Trust display, shift timer, mail unread count |
| `lib/api.js` | Fetch wrapper for all REST API calls |

---

## 13. SECURITY AND SAFETY

Requirements:
- Scope libvirt resources to dedicated game domains/networks/storage pools
- Never operate on arbitrary host VMs by default
- Use explicit naming/prefixing for all game-managed resources (`sc-` prefix)
- Separate quest-mode constrained networks from broader sandbox networks
- Prefer least-privilege host integration
- Provide a dry-run and diagnostic mode for development scripts

The game manages only the resources it created or was explicitly pointed at
during setup.

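
One way to enforce the `sc-` prefix rule is to route every `virsh` call through a guard that refuses unmanaged domain names before any process is spawned. The sketch below assumes this approach; `assertManaged`, `virshArgs`, and `runVirsh` are hypothetical names, not the contents of `lib/virsh.js`. It uses `execFile` rather than a shell string so domain and snapshot names are passed as plain arguments. The usage lines only build argument lists and do not run `virsh`.

```javascript
const { execFile } = require('node:child_process');

// Accept only domain names that carry the game's prefix.
function assertManaged(domain) {
  if (typeof domain !== 'string' || !/^sc-[a-z0-9-]+$/.test(domain)) {
    throw new Error(`Refusing to touch unmanaged libvirt domain: ${domain}`);
  }
  return domain;
}

// Build the argument list for a virsh subcommand against a managed domain.
function virshArgs(action, domain, extra = []) {
  return [action, assertManaged(domain), ...extra];
}

// Run virsh without a shell and resolve with its trimmed stdout.
function runVirsh(action, domain, extra = []) {
  return new Promise((resolve, reject) => {
    execFile('virsh', virshArgs(action, domain, extra), (error, stdout) => {
      if (error) reject(error);
      else resolve(stdout.trim());
    });
  });
}

// Usage: argument lists for a state query and a shift checkpoint.
const stateArgs = virshArgs('domstate', 'sc-web-server');
const checkpointArgs = virshArgs('snapshot-create-as', 'sc-web-server', ['checkpoint.shift-3']);
```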
---

## 14. TECHNOLOGY DECISIONS

| Technology | Role | Reason |
|-----------|------|--------|
| Node.js / Express | Game server | Async I/O, native SSH/virsh via child_process, easy JSON |
| Svelte / Vite | Web HUD | Lightweight, no virtual DOM overhead, fast build |
| WebSocket (`ws`) | Real-time push | Trust changes, mail, incidents without polling |
| QEMU/KVM | Virtualization backend | Real Linux environments |
| libvirt / virsh | VM lifecycle control | Standard Linux automation surface |
| SPICE + QXL | Workstation display | Dynamic resolution, clipboard sharing, fullscreen |
| `remote-viewer` | Host-side SPICE client | Ships with virt-viewer; fullscreen with F11 |
| JSON | Content authoring | Data-driven, easy to diff, unchanged from prior design |
| node-ssh | SSH execution in validation | Clean Promise API; BatchMode, key-based auth |

Not in scope: v86, WebAssembly, browser-only runtime, service-worker networking.

---

## 15. DEVELOPMENT PRIORITIES

1. Native architecture consistency
2. VM control integration
3. Observation and validation
4. Core gameplay loop
5. Pressure, trust, and dynamic event systems
6. Presentation polish

If a design choice improves presentation but weakens VM realism or maintainable
automation, reject it.