chore: bootstrap lean sysadmin-chronicles repo
Import the runnable game code, content, docs, scripts, and repo guidance while leaving local agent state, dependency installs, build output, and backup copies out of the published tree.
@@ -0,0 +1,199 @@
# SYSADMIN CHRONICLES — OPEN ISSUES
> Version 2.0 | Last updated: Phase 1 skeleton build
>
> All known design gaps, content bugs, and deferred decisions.
> Items here must NOT be implemented on a best guess; wait for a resolution.
> Mark items RESOLVED with the fix details when closed.

---

## AGENT INSTRUCTIONS (READ FIRST)
> Added during Phase 1 skeleton build. Document any decisions you make here.
>
> When you resolve an open issue:
> 1. Move it to the RESOLVED ISSUES section at the bottom.
> 2. Add resolution details and the file(s) changed.
> 3. Update ROADMAP.md to mark relevant tasks complete.
>
> When you make a minor direction change (non-game-changing):
> 1. Note it here under NEW DECISIONS.
> 2. Update the relevant doc (ARCHITECTURE.md, SAVE_SYSTEM.md, etc.).
> 3. Do NOT silently patch content files — note it here first.
>
> **CODE QUALITY AUDIT — COMPLETE**
> All P1, P2, and P3 items from docs/CODEX_AUDIT_FIXES.md have been resolved.
> docs/CODEX_AUDIT_FIXES.md has been deleted per its own WHEN DONE instruction.

---

## NEW DECISIONS (Made During Phase 1 Build)

### ND-001 — T005-T008 bundled file needs split (same as OI-008 pattern)
The file T005-T008.json is a bundled array. The content loader expects one file per
ticket with an "id" field. The bundle has been kept as-is (the content loader skips arrays).
Split it into T005.json–T008.json before Phase 5 ticket loading is implemented.
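
The split described above could be sketched roughly as follows. This is only an illustration: `splitTicketBundle` is a hypothetical helper, not part of the repo, and it assumes each ticket object in the bundle carries a string `id` such as `"T005"`.

```javascript
// Hypothetical helper: turn a bundled ticket array into one JSON document per ticket.
// Assumes every ticket object has a non-empty string "id" (for example "T005").
function splitTicketBundle(bundle) {
  if (!Array.isArray(bundle)) {
    throw new TypeError("expected a bundled array of tickets");
  }
  const files = {};
  for (const ticket of bundle) {
    if (!ticket || typeof ticket.id !== "string" || ticket.id === "") {
      throw new Error("every ticket needs a non-empty string id");
    }
    const name = `${ticket.id}.json`;
    if (name in files) {
      throw new Error(`duplicate ticket id: ${ticket.id}`);
    }
    // One pretty-printed JSON document per ticket, keyed by its file name.
    files[name] = JSON.stringify(ticket, null, 2) + "\n";
  }
  return files;
}

const exampleFiles = splitTicketBundle([
  { id: "T005", title: "placeholder" },
  { id: "T006", title: "placeholder" },
]);
console.log(Object.keys(exampleFiles)); // [ 'T005.json', 'T006.json' ]
```

The function returns file names and contents instead of writing to disk, so the caller decides where the per-ticket files go.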

### ND-002 — OI-006 (persists: false) already resolved in SAVE_SYSTEM.md v1.3
SAVE_SYSTEM.md v1.3 defines the shift-boundary reset. SaveState.js implements the
reset_shift_flags() equivalent at shift start. Closed.
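
For orientation, a shift-boundary reset of this kind might look like the sketch below. Only the `persists: false` marker comes from this document; the flag shape (`value`, `default`) and the function name `resetShiftFlags` are assumptions for illustration and may not match SaveState.js.

```javascript
// Hypothetical flag shape: { value, default, persists }.
// Only "persists: false" is taken from the docs; the other fields are assumed.
function resetShiftFlags(flags) {
  const next = {};
  for (const [name, flag] of Object.entries(flags)) {
    next[name] = flag.persists === false
      ? { ...flag, value: flag.default } // shift-scoped: back to its default
      : { ...flag };                     // persistent: carried over unchanged
  }
  return next;
}

const afterReset = resetShiftFlags({
  coffee_spilled: { value: true, default: false, persists: false },
  met_marcus: { value: true, default: false, persists: true },
});
console.log(afterReset.coffee_spilled.value, afterReset.met_marcus.value); // false true
```

The sketch returns a new object instead of mutating its input, which keeps the previous shift's state available for comparison if that is ever needed.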

### ND-003 — OI-007 (Q002 blast_radius) safe to fix mechanically
In Q002-syntax-error.json, blast_radius: ["I001"] should be []. This is a mechanical fix:
the next agent can apply it directly without asking. See OI-007.

### ND-004 — Stack is Node.js + Svelte
The game runs as a Node.js/Express server with a Svelte web HUD. The workstation
is a real XFCE VM (sc-workstation). All game logic lives in `server/src/`.

### ND-005 — opsbridge/sudo SSH path for host→workstation validation
The validation path from the host uses the `opsbridge` management user, then
`sudo -H -i -u player` inside the guest, because Q001 intentionally removes
`/home/player/.ssh/authorized_keys`. The correct form is:

```
ssh opsbridge@<guest> sudo -H -i -u player -- sh -c '<command>'
```

The sudo flags must be passed separately (`-H -i -u`); the combined form `-Hiu` misparses on some builds.

**Status**: RESOLVED — confirmed working in ValidationEngine.js SSH path.
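
As a rough illustration of the command form above, the argument vector could be assembled like this. `buildPlayerSshArgv` and `shellQuote` are hypothetical names, not the functions used in ValidationEngine.js, and the guest name in the example is a placeholder.

```javascript
// Quote a string for a POSIX shell: wrap it in single quotes and escape embedded ones.
function shellQuote(text) {
  return "'" + String(text).replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: argv for the opsbridge -> player hop described in ND-005.
function buildPlayerSshArgv(guest, command) {
  return [
    "ssh",
    `opsbridge@${guest}`,
    "sudo", "-H", "-i", "-u", "player", // flags kept separate, never "-Hiu"
    "--",
    "sh", "-c", shellQuote(command),
  ];
}

console.log(buildPlayerSshArgv("sc-workstation", "ls -la ~/.ssh").join(" "));
// ssh opsbridge@sc-workstation sudo -H -i -u player -- sh -c 'ls -la ~/.ssh'
```

Building an array keeps each sudo flag as its own argument, which is the property ND-005 cares about. The quoting step assumes the remote side re-parses the command with a POSIX shell.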

### ND-006 — build_machine snapshot chain now materializes baseline.post-q006 from Q006 clean state
`tools/setup/seed-vms.sh` now builds `sc-build-machine` in two authored stages:
`Q006-prep.sh` creates the broken `baseline.clean` state for "Time Is A Flat
Circle", and `Q006-post-clean.sh` applies the clean branch outcome before taking
`baseline.post-q006`.

`Q008` is still a separate multi-VM provisioning gap. Its authored starting
state touches both vulcan and hermes, so it should not be guessed into the
single-domain snapshot chain until that flow is designed explicitly.
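
One way to picture the single-domain snapshot chain is as plain data, using the per-VM order listed under OI-011. The object layout, the `nextBaseline` helper, and the `baseline.` prefix on the web_server entries are assumptions; only the build_machine names appear verbatim in this entry. Q008 is deliberately absent.

```javascript
// Chain order as listed under OI-011. The "baseline." prefix follows ND-006;
// the data layout itself is an assumption for illustration.
const BASELINE_CHAIN = {
  workstation: ["baseline.day-one"],
  web_server: [
    "baseline.clean",
    "baseline.post-q002",
    "baseline.post-q003",
    "baseline.post-q004",
  ],
  build_machine: ["baseline.clean", "baseline.post-q006"],
};

// Hypothetical lookup: the baseline that follows `current` on a VM, or null at the end.
function nextBaseline(vm, current) {
  const chain = BASELINE_CHAIN[vm];
  if (!chain) throw new Error(`unknown VM: ${vm}`);
  const index = chain.indexOf(current);
  if (index === -1) throw new Error(`${current} is not in the ${vm} chain`);
  return index + 1 < chain.length ? chain[index + 1] : null;
}

console.log(nextBaseline("build_machine", "baseline.clean")); // baseline.post-q006
```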

### ND-007 — terminal UX: Tilix is the player's terminal (no in-game simulation)
**Status**: RESOLVED. The player uses a real Tilix terminal inside the workstation
XFCE VM. All terminal UX (history, scrollback, copy/paste) is handled by Tilix.
No terminal simulation is needed. See `docs/WORKSTATION_POLISH_BACKLOG.md` for
outstanding workstation desktop polish items.

### ND-008 — vulcan player shell/PATH is still misprovisioned
**Status**: RESOLVED 2026-04-24.
Root cause: `inetutils` (provides `/usr/bin/hostname` on Arch) was not in the
`build-build-machine.sh` pacman install. Hermes (Debian) has hostname pre-installed.
Fix applied in `tools/vm/build-build-machine.sh`:
- Added `inetutils` to the runcmd pacman install line.
- Added runcmd entries to write `/home/player/.bashrc` (explicit PATH) and
  `.bash_profile` (sources .bashrc), then chown both to player.

Regression gate added to `tools/setup/seed-vms.sh` (STEP 1b): after the builds, it
SSH-tests `hostname` on sc-web-server and sc-build-machine and fails fast if it is missing.

---

## MUST RESOLVE BEFORE PHASE 3

### OI-001 — Q001 permissive-setup branch contradictory logic
**File**: content/quests/Q001-welcome-aboard.json
Option A: bad-but-not-fatal permissions (755 dir), quest completes with warning.
Option B: fatally wrong permissions (777), quest does NOT complete via this branch.
**Decision needed**: Which option?
**Status**: RESOLVED — the permissive-setup branch (Option A/lenient) was already correctly implemented. The Q001 branch validates file_exists and file_owner without checking mode, so the 755 directory case completes the quest with trust_delta 0. marcus-Q001.json already has a complete-permissive stage.

### OI-002 — Q008 rollback-only vs rollback-and-pin have identical validation
**File**: content/quests/Q008-bad-upstream.json
A distinguishing rule is needed for pinned vs unpinned. Likely an IgnorePkg entry
in /etc/pacman.conf (detectable via file_contains).
**Status**: RESOLVED — Q008 already has a file_contains check for IgnorePkg in /etc/pacman.conf on the rollback-and-pin branch, and a not-rule on the rollback-only branch to ensure mutual exclusion. Confirmed in Q008 internal_notes.

---

## MUST RESOLVE BEFORE PHASE 5

### OI-003 — Incident files I002 and I003 are missing
Author I002-backup-pressure-recurrence.json and I003-app-update-recurrence.json
following the I001 pattern before Phase 6.
**Status**: RESOLVED — both files authored (content/incidents/I002-backup-pressure-recurrence.json, I003-app-update-recurrence.json). Content validator passes with zero errors.

### OI-004 — pressure_profile field is referenced but never defined
Recommend: separate files in content/pressure_profiles/ with a defined schema.
**Status**: RESOLVED — created content/pressure_profiles/ with web_outage_escalation.json and app_outage_escalation.json. The schema uses trigger_after_seconds steps with notification, notification_severity, and escalate_linked_ticket fields. escalate_linked_ticket resolves to the quest's own ticket_id at runtime.
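
To make the schema concrete, here is a minimal sketch of how such a profile might be evaluated. The step field names come from the status note above; the top-level `steps` key, the boolean type of `escalate_linked_ticket`, the sample values, and both helper names are assumptions.

```javascript
// Sample profile. Step field names are from OI-004; "steps" and the values are assumed.
const sampleProfile = {
  steps: [
    { trigger_after_seconds: 300, notification: "Second nudge", notification_severity: "warning", escalate_linked_ticket: true },
    { trigger_after_seconds: 60, notification: "First nudge", notification_severity: "info", escalate_linked_ticket: false },
  ],
};

// Steps whose trigger time has passed, ordered by trigger time.
function dueSteps(profile, elapsedSeconds) {
  return profile.steps
    .filter((step) => step.trigger_after_seconds <= elapsedSeconds)
    .sort((a, b) => a.trigger_after_seconds - b.trigger_after_seconds);
}

// escalate_linked_ticket resolves to the quest's own ticket_id at runtime.
function escalationTarget(step, quest) {
  return step.escalate_linked_ticket ? quest.ticket_id : null;
}

const due = dueSteps(sampleProfile, 120);
console.log(due.map((step) => step.notification)); // [ 'First nudge' ]
```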

### OI-005 — check_mode: explicit trigger mechanism undefined
Proposed mechanism: a "Verify Fix" button in the ticket panel UI, shown per objective when check_mode == explicit.
**Status**: RESOLVED — Verify Fix button implemented in TicketsPanel.svelte. The button appears
per objective when check_mode == explicit, is disabled while a check runs, and is re-enabled after a 2s delay on failure.

---

## LOW PRIORITY / ANYTIME

### OI-007 — Q002 blast_radius incorrectly references I001
Fix: change blast_radius: ["I001"] to blast_radius: [] in Q002.
**Status**: RESOLVED — blast_radius set to [] in Q002-syntax-error.json; _blast_radius_note added explaining I001 triggers only from Q003 quick-fix branch.

### OI-008 — tier2-dialogue.json naming convention
Individual files exist: marcus-Q005.json, marcus-Q006.json, marcus-Q008.json,
priya-Q007.json. The bundled file is kept as tier2-dialogue.SPLIT_PENDING.json.
Verify that the individual files are complete, then delete the SPLIT_PENDING file.
**Status**: RESOLVED — individual files confirmed present. Bundled file removed.

### OI-009 — sarah-web series has only one member
sarah-Q003-angry.json declares series_id: "sarah-web" but no second member exists.
Either add sarah-Q004+ or remove series_id until a second file is authored.
**Status**: RESOLVED — series_id and series_position removed from sarah-Q003-angry.json. Series grouping deferred until a second sarah-web member is authored.

### OI-011 — Snapshot baseline chain
seed-vms.sh implements the chain. A formal policy is needed in QUEST_AUTHORING.md.
Chain: workstation: baseline.day-one; web_server: clean→post-q002→post-q003→post-q004;
build_machine: clean→post-q006. Each post-qXXX baseline is taken from the CLEAN branch resolution.
**Status**: RESOLVED — Baseline Snapshot Chain subsection added to docs/QUEST_AUTHORING.md in the VM PROVISIONING HOOKS section. It documents the chain per VM, the clean-branch-only rule, and the naming convention.

---

## RESOLVED ISSUES

### OI-006 — persists: false flag semantics
Resolution: Shift-boundary reset handled in SaveState.js at shift start.

### OI-010 — file_absent and file_owner_is_not undocumented rule types
Resolution: Added to ValidationEngine.js as full rule types.
Still needs: update QUEST_AUTHORING.md rule reference table.

### OI-012 — SSH execution contract
Resolution: server/src/lib/ssh.js — Promise-based, structured result (stdout/stderr/exitCode),
BatchMode key-based auth, 30s default timeout.
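
A minimal sketch of a wrapper with that contract is shown below. It is not the contents of server/src/lib/ssh.js: the names `buildSshArgs` and `runSsh` and the `-1` exit code for timeouts and spawn failures are assumptions. It relies only on Node's built-in `child_process.execFile` and the OpenSSH `BatchMode` option.

```javascript
const { execFile } = require("node:child_process");

// BatchMode=yes makes ssh fail instead of prompting, so only key-based auth can succeed.
function buildSshArgs(target, command) {
  return ["-o", "BatchMode=yes", target, command];
}

// Always resolves with { stdout, stderr, exitCode }; never rejects.
function runSsh(target, command, { timeoutMs = 30_000 } = {}) {
  return new Promise((resolve) => {
    execFile("ssh", buildSshArgs(target, command), { timeout: timeoutMs }, (error, stdout, stderr) => {
      let exitCode = 0;
      if (error) {
        // A numeric code is the remote exit status; anything else (timeout kill,
        // missing ssh binary) is mapped to -1 in this sketch.
        exitCode = typeof error.code === "number" ? error.code : -1;
      }
      resolve({ stdout, stderr, exitCode });
    });
  });
}
```

A caller would then write something like `const { exitCode, stdout } = await runSsh("opsbridge@<guest>", "hostname")` and branch on `exitCode` without needing a try/catch.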

### OI-013 — Language choice
Resolution: Node.js + Svelte. See ND-004.

---

## ADDITIONAL RESOLUTIONS (Phase 1 continued)

### OI-003 — I002 and I003 incident files authored
**Resolution**:
- I002-backup-pressure-recurrence.json authored — triggers on the hermes_backup_partial flag,
  3-step escalation, resolves when cron, ownership, and disk are all correct.
- I003-app-update-recurrence.json authored — triggers when the rollback-only branch is taken on Q008,
  re-installs the broken version unless pinned. Resolves when IgnorePkg and the correct version are confirmed.
- Content validator now passes with zero errors.
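
The IgnorePkg condition that resolves I003 (and separates the two Q008 branches in OI-002) could be checked roughly as follows. This is stricter than a plain file_contains rule and is only a sketch: `isPinned`, `q008Branch`, and the package name are hypothetical, and glob patterns in IgnorePkg are not handled.

```javascript
// True when pacman.conf text has an uncommented IgnorePkg line that names `pkg`.
// Lines starting with "#" are comments and are skipped by the anchored regex.
function isPinned(confText, pkg) {
  return confText.split(/\r?\n/).some((line) => {
    const match = /^\s*IgnorePkg\s*=\s*(.*)$/.exec(line);
    return match !== null && match[1].trim().split(/\s+/).includes(pkg);
  });
}

// Branch names are from OI-002; exactly one of them applies for any config.
function q008Branch(confText, pkg) {
  return isPinned(confText, pkg) ? "rollback-and-pin" : "rollback-only";
}

const sampleConf = "[options]\n#IgnorePkg   =\nIgnorePkg = example-app\n";
console.log(q008Branch(sampleConf, "example-app")); // rollback-and-pin
```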

### ND-001 — T005-T008 split complete
T005.json, T006.json, T007.json, T008.json created from the bundled file.
Content validator now loads all 8 tickets correctly.

### OI-007 — Q002 blast_radius fix
**Resolution**: The validator was fixed to normalize world_flags.json array format.
The Q002 blast_radius: ["I001"] issue is documented — apply this one-line fix
directly: change blast_radius in Q002-syntax-error.json from ["I001"] to [].

### VALIDATOR FIXES applied this session:
- validate-content.js now normalizes world_flags.json array format correctly
- Advisory clue_fingerprint rule types (service_state_is, file_size_above, etc.)
  are now accepted — they describe evidence, not runtime-evaluated rules
- T005-T008 bundled file is now skipped correctly (SPLIT_DONE suffix)
- WorldFlags handling now normalizes both Array and Dict flag formats
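
The Array/Dict normalization in the last item might look something like the sketch below. The two input shapes (an array of objects with an `id`, or an object keyed by flag id) are assumptions about world_flags.json, and `normalizeWorldFlags` is a hypothetical name.

```javascript
// Assumed input shapes: an array of { id, ...fields } objects, or a dict keyed by id.
// Output is always a dict keyed by flag id.
function normalizeWorldFlags(raw) {
  if (Array.isArray(raw)) {
    const byId = {};
    for (const flag of raw) {
      const { id, ...fields } = flag;
      if (typeof id !== "string") {
        throw new Error("array-form flags need a string id");
      }
      byId[id] = fields;
    }
    return byId;
  }
  if (raw !== null && typeof raw === "object") {
    return { ...raw }; // already keyed by id; return a shallow copy
  }
  throw new TypeError("world flags must be an array or an object");
}

console.log(normalizeWorldFlags([{ id: "hermes_backup_partial", persists: true }]));
// { hermes_backup_partial: { persists: true } }
```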

### CONTENT STATUS: validate-content.js exits 0 (zero errors, 2 warnings)
Warnings are all expected and documented:
- priya-ops series: 1 member (needs future dialogue)
- T005-T008.SPLIT_DONE.json: skipped (bundled file, split done)

(sarah-web series warning removed — series_id stripped from sarah-Q003-angry.json per OI-009)
(tier2-dialogue.SPLIT_PENDING warning removed — renamed to .SPLIT_DONE.bak per OI-008)