# Sysadmin Chronicles — Full Quest & Story Redesign (REVISED)

> Self-revision against SPEC_LOCK.md (binding), CHARACTERS.md, STORY_DESIGN_CONTEXT.md,
> QUEST_AUTHORING.md, and COMPANY_LORE.md.
>
> Audit findings from v1 corrected in this revision. Changes are not additive —
> this document supersedes the previous version in full.

---

## Audit Summary (What Changed and Why)

The first draft had the right bones but violated the design's core premise in several
places. The clearest pattern of failure: quests were being used to deliver investigation
content explicitly rather than letting investigation happen as a byproduct of normal
work. Specific problems fixed in this revision:

**Replaced or redesigned:**
- Q028 (Dale's archive handed to the player as a directed task) → Q028 is now a backup
  integrity task where Dale's working directory appears in the restore path
- Q029 (authenticate a forged report) → Q029 is now a systemd service audit task where
  the forged report is found in a log directory, not handed to the player
- Q035 (write an investigation summary for the CTO) → Q035 is now a log retention and
  archival task; the player's work product IS the investigation record
- Q038 (write what you believe happened) → Q038 is now a certificate rotation task under
  pressure; the conflict is operational, not narrative
- Q041 (read Priya's briefing document) → Q041 is now a production hardening task
- Q044 (Marcus explains Dale) → cut as a named quest; Dale's story now emerges from
  system artifacts the player finds; Marcus says less, more precisely
- Q045 (Kowalski emails the outcome) → Q045 is now a change-freeze and documentation
  task whose resolution signals the ending; no character summarizes what happened
- Q046/Q047/Q048 replaced with quests that have real Linux substance

**Hook density reduced:** Phase 2 had one hook per quest. Hooks are now seeded in
roughly every 2–3 quests across Phase 1–2, with concentration increasing in Phase 3.

**Styx dropped:** The `styx` hostname thread from Q006 had no resolution. Removed.
Q006 is revised with a hook that connects to the active investigation arc.

**Difficulty scaling corrected:** Phase 2 quests that were Tier 1 have been corrected
to Tier 2. Ticket wording in Phase 2 is less explicit. Phase 4+ tickets give the
problem statement only — no guidance on approach.

**Phase 6 given real technical content:** Resolution-phase quests now all teach Linux
concepts. Narrative delivery happens through the work and its consequences, not
through characters explaining what happened.

---

## 1. Design Overview

### The Core Proposition

The player is doing sysadmin work. The story leaks through the systems they maintain.
A player who ignores everything except the tickets will complete the game — they will
just complete a different version of it than the player who reads the bash history that
wasn't in scope and notices a timestamp that doesn't fit.

This is not a rhetorical distinction. Every system in this redesign follows from it:
behavior variables capture what kind of sysadmin the player is, not whether they are
"good" at detecting the plot. Trust reflects professional competence. Endings reflect
the accumulated profile of both.

### How the New System Extends the Existing One

The existing branch/world-flag/trust model is the backbone. It is not replaced.

**Preserved from existing implementation:**
- `trust_delta` per solution branch — reflects quality of the fix
- `world_flags` — persistent string keys, set by branch resolution, read by later quests
- `follow_up_ticket` and `follow_up_incident` — chain quests, trigger delayed consequences
- Solution branch priority — highest valid branch wins
- Tier-based difficulty (Tier 1, 2, 3)
- Observed-state validation — not scripted walkthroughs
- Clue fingerprints as advisory baseline documentation
- Character dialogue responding to branch outcomes
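
The "highest valid branch wins" rule, combined with observed-state validation, can be sketched as follows. This is a minimal illustration, not the existing implementation: the `Branch` class, the `resolve` function, and the keys of the observed-state dictionary are hypothetical names. The priorities, trust deltas, and flags are copied from Q001 in the quest catalog.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical shape for illustration; the real quest schema may differ.
@dataclass
class Branch:
    name: str
    priority: int
    trust_delta: float
    is_valid: Callable[[dict], bool]   # predicate over the observed machine state
    flags: list = field(default_factory=list)

def resolve(branches, state) -> Optional[Branch]:
    """Return the highest-priority branch whose predicate holds, or None."""
    valid = [b for b in branches if b.is_valid(state)]
    return max(valid, key=lambda b: b.priority, default=None)

# Branch values copied from Q001; the state keys are assumptions.
q001 = [
    Branch("clean", 100, 2.0,
           lambda s: bool(s.get("key_present"))
           and s.get("dir_mode") == 0o700 and s.get("file_mode") == 0o600,
           ["player_ssh_configured"]),
    Branch("permissive", 50, 0.5,
           lambda s: bool(s.get("key_present")),
           ["player_ssh_permissive"]),
    Branch("incomplete", 10, -1.0,
           lambda s: not s.get("key_present"),
           ["player_ssh_failed"]),
]

observed = {"key_present": True, "dir_mode": 0o755, "file_mode": 0o600}
winner = resolve(q001, observed)
print(winner.name, winner.trust_delta, winner.flags)
```

Because the clean branch's predicate is stricter than the permissive one, both can be valid at once; priority is what makes the stricter match win.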

**New system adds (minimally, without unnecessary mechanics):**
- `narrative_phase` field on each quest — maps to one of six phases; gates pressure
  profile and difficulty scaling
- Behavior variables: `curiosity`, `obedience`, `risk` — accumulated alongside trust;
  govern narrative route and ending
- `suspicion` — management/security attention score; distinct from trust; affects
  access and pressure level
- Access level per machine: `basic_user`, `sudo`, `root` — evolves with trust and
  phase; degrades with sustained high risk
- `hidden_hook` field on quests — defines a discovery condition and the flag it sets;
  optional, never required to complete the ticket
- Ending evaluator — runs at game close; reads all accumulated state; outputs one of
  four endings

No other new mechanics are introduced. Every new field maps to existing infrastructure
patterns (world flags, trust deltas, branch outcomes).
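
One way the `narrative_phase` and `hidden_hook` fields could sit on a quest record is sketched below. The class names, the lowercase phase identifiers, and the `discover_hook` helper are assumptions made for illustration; only the field names, the six phases, and the three tiers come from this document.

```python
from dataclasses import dataclass
from typing import Optional

# Phase identifiers are an assumed spelling of the six phases named in this document.
PHASES = ("normal_work", "unease", "suspicion", "investigation", "conflict", "resolution")

@dataclass
class HiddenHook:
    discovery: str            # human-readable discovery condition
    sets_flag: str            # world flag set when the hook is found
    curiosity_delta: int = 1  # assumed default, matching the C+1 used in Phase 1

@dataclass
class Quest:
    quest_id: str
    narrative_phase: str
    tier: int
    hidden_hook: Optional[HiddenHook] = None  # optional, never required

    def __post_init__(self):
        if self.narrative_phase not in PHASES:
            raise ValueError(f"unknown phase: {self.narrative_phase}")
        if self.tier not in (1, 2, 3):
            raise ValueError(f"unknown tier: {self.tier}")

def discover_hook(quest, world_flags):
    """Set the hook's flag once; return the curiosity delta (0 if none or already found)."""
    hook = quest.hidden_hook
    if hook is None or hook.sets_flag in world_flags:
        return 0
    world_flags.add(hook.sets_flag)
    return hook.curiosity_delta

q001 = Quest("Q001", "normal_work", 1,
             HiddenHook("player reads authorized_keys before writing to it",
                        "hook_dale_ssh_key_found"))
flags = set()
first = discover_hook(q001, flags)
second = discover_hook(q001, flags)
print(first, second, sorted(flags))
```

Keeping the hook as an optional attribute rather than a branch condition mirrors the rule that a hook never gates ticket completion.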

### Variable Interaction Model

```
              [Quest branch resolves]
                         │
             ┌───────────┼────────────┐
             ▼           ▼            ▼
        trust_delta world_flags  behavior_impact
             │           │            │
             ▼           ▼            ▼
           trust     narrative   curiosity /
          (access,    routing    obedience /
           warmth,  (later quest risk /
           incident   content)   suspicion
           visibility)
                         │
                         ▼
                   ending_route
```

Trust and behavior variables accumulate in parallel. A player with high trust and
high curiosity is a different player than one with high trust and high obedience —
same professional quality, different narrative destination.
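
The parallel accumulation described here can be sketched as a single fold over resolved branches. `PlayerState` and `apply_outcome` are hypothetical names; the variable names and the Q001 values are taken from this document.

```python
from dataclasses import dataclass, field

BEHAVIOR_KEYS = ("curiosity", "obedience", "risk", "suspicion")

@dataclass
class PlayerState:
    trust: float = 0.0
    curiosity: int = 0
    obedience: int = 0
    risk: int = 0
    suspicion: int = 0
    world_flags: set = field(default_factory=set)

def apply_outcome(state, trust_delta=0.0, flags=(), impact=None):
    """Fold one resolved branch (plus any discovered hook) into the running totals."""
    state.trust += trust_delta
    state.world_flags.update(flags)
    for key, delta in (impact or {}).items():
        if key not in BEHAVIOR_KEYS:
            raise KeyError(f"unknown behavior variable: {key}")
        setattr(state, key, getattr(state, key) + delta)
    return state

# Two players take the same clean branch on Q001; only one reads the file first.
careful = apply_outcome(PlayerState(), 2.0, ["player_ssh_configured"], {"obedience": 1})
curious = apply_outcome(PlayerState(), 2.0,
                        ["player_ssh_configured", "hook_dale_ssh_key_found"],
                        {"obedience": 1, "curiosity": 1})
print(careful.trust == curious.trust, careful.curiosity, curious.curiosity)
```

Both players end with identical trust; they differ only in curiosity and in the flags later quests can read.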

---

## 2. Character Usage Guide

All portrait-compatible identity is preserved. The following is operational guidance
for quest authors, not character redefinition.

### Marcus Webb

**Voice:** Short. Precise. Does not explain things twice. The second sentence he
adds — when he adds one — is always the important one.

**Quest role:** Primary ticket source (most quests), trust gatekeeper, access
grant/revoke mechanism, ambient signal source in mid-game.

Marcus's messages evolve with trust. Low trust: purely functional assignments.
Mid trust: he occasionally adds context that wasn't asked for. High trust: he
sometimes sends a message that isn't a ticket at all — an observation, a thing he's
noticed, phrased as if the player should already know what to do with it.

He knows about Dale. He will not bring it up directly. If the player finds something
Dale-related, Marcus's response will be exact and quiet — never surprised, never
explanatory.

Use Marcus for: ticket assignments, clean/acceptable/regression branch responses,
access gate messages, quiet mid-game Slack observations, cost-free hints if the
player asks (not volunteered). Do not use Marcus to explain the story, praise
the player effusively, or become verbose about anything personal.

### Sarah Chen

**Voice:** Direct, outcome-focused, slightly impatient when things are broken.
Warms when fixes hold. Cools when fixes don't.

**Quest role:** hermes and staging tickets, product-pressure source, response
calibration for clean vs. symptom fixes.

Sarah's descriptions are accurate about symptoms and often wrong about cause.
She describes what she saw, not what caused it. When a fix holds — when the same
problem doesn't recur — she notices, and says something. When it does recur, she
says something else, shorter.

Use Sarah for: hermes/staging/demo tickets, stakeholder pressure escalations, CC
lines on cross-team notes, downstream reactions to fix quality. Do not use Sarah
for investigation-phase content — she doesn't have visibility into what the player
is finding.

### Priya Nair

**Canonical email:** `p.nair@axiomworks.internal`. Prior references to Priya
Kapoor or Priya Singh refer to the same person; those files need updating.

**Voice:** Precise. Consequence-focused. Calm in tone. No exclamation marks. She
states things; she doesn't perform alarm.

**Quest role:** Shift reviews, access audits, security-consequence notifications,
investigation-phase escalation when audit activity surfaces a finding.

Priya reviews every 3–4 quests. Her reviews note what advanced, what stayed
stable, and what the player introduced as new risk. High curiosity plus low risk:
she notes methodical investigation. High risk: she flags the access footprint.

In Phase 3–4, Priya becomes more present because the audits are surfacing things.
This is her job, not surveillance of the player specifically. The distinction matters
for tone.

Use Priya for: shift reviews, access audits, consequence delivery for regression
branches, investigation-phase task assignments (narrowly scoped), security findings
from James Osei. Do not use Priya for technical troubleshooting, warmth,
or anything casual.

### Dave Okonkwo

**Voice:** Helpful, non-technical, accurate about what he saw, wrong about cause.

**Quest role:** End-user-experience ticket source for early-phase quests and
Phase 2 normalcy anchors.

Dave's tickets are useful because they describe genuine user experience. His
hypotheses about the cause are well-intentioned guesses. He should never be
made to look stupid — he's filing a ticket correctly for someone without technical
training.

Use Dave for: early-phase user-visible failures, texture of the company being a
real place. Do not use Dave for anything touching the investigation arc.

### Dave Kowalski

**Voice:** Institutional. Bullet-point emails. Meetings as implied threat.
"We should really document that."

**Quest role:** Management pressure escalation (Phase 3 onward), access restriction
trigger, status demand source, policy constraint.

Kowalski is not suspicious of the player — he is managing upward risk. His
interventions are institutional responses to things that have surfaced at his level.
When he appears directly, something has become his problem. His pressure is applied
through: status-demand emails, access review initiation, meeting invites that have
known weight, priority-reassignment tickets.

Use Kowalski for: Phase 3+ pressure manifestations, access restriction when suspicion
is elevated, escalation when an incident has made noise at director level. Do not
make him a villain, do not have him accuse anyone, do not have him explain the plot.

### Background Characters

Used sparingly for texture.

- **Nikhil Sharma** — CC lines on build/pipeline things; Slack messages at unexpected
  hours; upstream explanation or blame when something on vulcan is his. He doesn't
  know the player until the player touches something of his.
- **Derek Ashford** — CC lines when infrastructure costs surface.
- **Tom Malaney** — Networking problems that are his domain but are slow to resolve.
- **Phil Ruiz** — Demo pressure; hermes's political importance made human.
- **James Osei** — Audit details that Priya summarizes.
- **Rachel Huang** — Peer provisioning; access handoffs when Marcus delegates.

---

## 3. Phase-by-Phase Narrative Arc

### Phase 1 — Normal Work

Day one onboarding through the first weeks. The work is real work. The company
is a real place that functions, mostly. Nothing is obviously wrong.

Quests establish the environment: what the machines are, what they run, who files
tickets, how the characters communicate, what competent work looks like. The player
builds access through demonstrated competence. Marcus is evaluative. Sarah is brisk.
Priya's first shift review is factual and mild.

Difficulty: explicit instructions. Tickets describe what to do with some specificity.
The clue trail is direct. Branch tolerance is generous — Tier 1 quests forgive partial
fixes with lower trust deltas rather than negative ones.

Hidden layer: Dale's name appears in file ownership and configuration history. His
SSH key appears in `authorized_keys`. His last logrotate config is in a backup
directory. None of this is called out. A player who reads the files before acting
will find it. Most won't.

**Phase end state:** Player has basic to moderate access. Trust is positive if clean
branches have been taken. A small number of hidden hook flags may be set for curious
players. The game looks, so far, like what it says it is.

### Phase 2 — Unease

The same job. The same machines. But the texture changes slightly. A problem comes
back that was fixed. A service was modified and the modification doesn't have a
corresponding ticket. A config that should have been set by the tooling was set by
hand, by someone.

Nothing is alarming. But a sysadmin who is paying attention notices these things —
the way you notice that a door doesn't close flush, or that a clock is a few minutes
fast. Not urgent. Off.

Difficulty: partial hints. Tickets describe the symptom and hint at the location.
The cause requires more investigation than in Phase 1. Branch tolerance decreases —
symptom-only fixes now carry explicit downstream incidents.

Marcus's messages are the same as always. The occasional extra sentence he adds is
slightly harder to read. In Phase 1 his additions were operational context. In Phase 2
they are sometimes observations that don't quite fit the ticket.

Hidden layer: the anomaly pattern continues. The same IP appears in a config and in a
log. A cron job has been running for over a year with no ticket. A package in the build
history doesn't correspond to any official release. Each item is individually explainable
as legacy cruft. Together, for a player who's been collecting them, they aren't.

**Phase end state:** Behavior variables are diverging. High-curiosity players have
world flags for discovered hooks. Obedient players are in good professional standing
with nothing unusual in their record. Suspicion is low across the board.

### Phase 3 — Suspicion

The pattern becomes harder to ignore if you're the kind of person who would notice it.
SSH connections from an IP not in the asset inventory. A user account with no HR record.
A backup archive with a timestamp that doesn't align with when backups run. The player
is fixing real problems with real tickets — but the root causes are starting to point
somewhere.

Difficulty: minimal guidance. Tickets describe the symptom only. No indication of
where to look. The clue trail requires following the evidence without being directed.
Branch tolerance is stricter — partial fixes carry heavier incident weight.

Management pressure increases. Kowalski's weekly status email asks specific questions.
Marcus forwards it without comment. Priya's shift reviews start noting things they
didn't note before. None of this is targeted at the player. The audits were already
scheduled. The status email was always going to ask those questions.

A player who ignores all of it and fixes tickets continues to do fine work. They are
just unaware of what the work is revealing.

**Phase end state:** The investigation path is now visible to curious players. They
have enough fragments to form a partial hypothesis. Obedient players are in good
professional standing and have noticed nothing unusual.

### Phase 4 — Investigation

For a curious player, the picture is now coherent enough to be disturbing. The quests
in this phase involve work that is framed as legitimate operations — audit the access
log for compliance, trace the package build history for a deployment issue, verify
backup integrity — but the results of doing that work carefully tell a story.

Difficulty: problem-solving only. Tickets state the problem. No clue on approach.
The player is expected to know their tools and apply them.

Marcus's messages are shorter now. Not cold — he has always been terse. But the
operational context he occasionally added in Phase 2 is absent. He is managing
something and the messages reflect that without stating it.

Priya appears more frequently. A quarterly review surfaced something. James Osei
sent her something. She is doing her job. Her tickets are narrow and specific —
she wants to know exactly one thing, stated precisely.

Kowalski schedules a meeting. The meeting is called a "check-in on access posture."
No specifics. Marcus's next message after the meeting's scheduled end time is
functionally identical to his previous one — same tone, same brevity. A player
paying attention will notice only the timing.

**Phase end state:** Curious players have a complete or near-complete picture of what
happened before they arrived. The `exposure` ending is now reachable if other variables
support it. Obedient players are in good standing, unaware of the arc. High-risk
players may be under active monitoring.

### Phase 5 — Conflict

The conflict is professional. The player has access granted for one purpose that
intersects with information they were not meant to find. The quests are operational —
real work that needs doing. But the operational work, done carefully and honestly,
has consequences.

A backup restoration reveals something. An access revocation request arrives for
an account the player has been investigating. A production ticket requires changing
a configuration that, to a player who has been paying attention, is recognizable as
the wrong change to make.

The player can always do only what the ticket asks. That is always an available path.
The question is whether the player recognizes when the ticket asks for something that,
done without scrutiny, would harm something beyond the immediate task.

Marcus says less. Priya is specific and procedural. Kowalski's emails are formal
and institutional. The company is managing something. The player is in it.

**Phase end state:** Ending routes are determined. The final quests in Phase 6 are
confirmation, not decision.

### Phase 6 — Resolution

The final quests are normal work. Infrastructure tasks. Some are the same kind of
task as Phase 1 quests, deliberately — the comparison is the point. The world has
moved on. The player is still a sysadmin at Axiom Works.

The ending emerges from the accumulated state of all behavior variables, world flags,
trust score, and access history. It is not triggered by a final choice. The player
will not be presented with an ending screen that asks them to pick. They will complete
a routine task, and the ending will fire based on everything that preceded it.

Difficulty returns to Tier 1 for operational tasks. The pressure has lifted. The
tickets are from Sarah and Marcus and sound like Phase 1 tickets.
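
The per-phase difficulty notes could be captured as a small lookup that decides how much of a ticket the player sees. This is a sketch under assumptions: `Guidance`, `PHASE_GUIDANCE`, and `render_ticket` are hypothetical names. Phases 1 through 4 follow the Difficulty lines above, Phase 5 follows the audit summary's "Phase 4+" rule, and treating Phase 6 as explicit is a reading of "sound like Phase 1 tickets".

```python
from enum import Enum

class Guidance(Enum):
    EXPLICIT = "explicit instructions"       # Phase 1
    PARTIAL_HINTS = "partial hints"          # Phase 2
    SYMPTOM_ONLY = "minimal guidance"        # Phase 3
    PROBLEM_ONLY = "problem-solving only"    # Phase 4

# Entries for phases 5 and 6 are assumptions, as noted in the lead-in.
PHASE_GUIDANCE = {
    1: Guidance.EXPLICIT,
    2: Guidance.PARTIAL_HINTS,
    3: Guidance.SYMPTOM_ONLY,
    4: Guidance.PROBLEM_ONLY,
    5: Guidance.PROBLEM_ONLY,
    6: Guidance.EXPLICIT,
}

def render_ticket(phase, symptom, location=None, steps=None):
    """Keep only the parts of a ticket that the phase's guidance level allows."""
    level = PHASE_GUIDANCE[phase]
    parts = [symptom]
    if level in (Guidance.EXPLICIT, Guidance.PARTIAL_HINTS) and location:
        parts.append(f"Look at: {location}")
    if level is Guidance.EXPLICIT and steps:
        parts.append(f"Steps: {steps}")
    return " ".join(parts)

for phase in (1, 2, 3):
    print(phase, render_ticket(phase, "Exports fail.", "the cache directory", "check ownership"))
```

Authoring the full ticket once and trimming it by phase would keep wording consistent across tiers, though hand-written tickets per phase may suit the character voices better.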

---

## 4. Full Quest Catalog

VMs: `workstation` (ares, Ubuntu 24.04), `web_server` (hermes, Debian 12),
`build_machine` (vulcan, Arch Linux).

Behavior impact notation: `C` = curiosity delta, `O` = obedience delta, `R` = risk
delta, `S` = suspicion delta. Values are per-branch where they differ.

---

### PHASE 1 — NORMAL WORK (Q001–Q008)

Tier 1 throughout. Explicit instructions. Generous branch tolerance.
Hook density: 4 hooks across 8 quests.

---

**Quest ID:** Q001
**Title:** First Day, First Key
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** workstation
**Additional VMs:** none
**Primary Objective:** Configure SSH key authentication for the player's account
on the workstation before end of day.
**Linux Concepts:** `ssh-keygen`, `~/.ssh/authorized_keys`, directory and file
permissions (`chmod 700`, `chmod 600`), `sshd_config` pubkey authentication
**Systems Used:** workstation
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Your account is active. Before you touch anything else: set
up key-based auth on the workstation. Password auth stays on for now but I want
your public key in authorized_keys before end of day. Walk yourself through it."

**Clue Trail:**
- `~/.ssh/` directory absent or present without `authorized_keys`
- `sshd_config`: `PubkeyAuthentication yes`, `PasswordAuthentication yes`
- Player generates keypair with `ssh-keygen`, places public key in `authorized_keys`,
  sets permissions — `.ssh/` to 0700, `authorized_keys` to 0600

**Solution Branches:**

Branch 1 — Clean (priority 100): Key present, `.ssh/` is 0700, `authorized_keys` is
0600, SSH auth works. `trust_delta: +2`. Flags: `player_ssh_configured`.
Follow-up ticket: T002.

Branch 2 — Permissive (priority 50): Key present, permissions wrong (`0644` on key
file or `0755` on directory). SSH works; not correctly hardened. `trust_delta: +0.5`.
Flags: `player_ssh_permissive`. Follow-up incident: I001 (Priya's first review notes
the permission).

Branch 3 — Incomplete (priority 10): Key absent or `authorized_keys` missing.
`trust_delta: -1`. Flags: `player_ssh_failed`. Marcus follows up.
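
The three branches above depend only on file state that a validator can observe directly. The sketch below shows one way to do that with the standard library; `classify_ssh_setup` is a hypothetical helper, the key strings are placeholders rather than real key material, and it does not test that SSH authentication actually succeeds. Treating any mode other than exactly 0700/0600 as permissive is a simplification.

```python
import os
import stat
import tempfile

def mode_of(path):
    """Return only the permission bits of a path, e.g. 0o700."""
    return stat.S_IMODE(os.stat(path).st_mode)

def classify_ssh_setup(ssh_dir, player_pubkey):
    """Map the observed state of an .ssh directory onto the three Q001 branches."""
    keys_path = os.path.join(ssh_dir, "authorized_keys")
    if not os.path.isfile(keys_path):
        return "incomplete"
    with open(keys_path) as fh:
        entries = [line.strip() for line in fh if line.strip()]
    if player_pubkey.strip() not in entries:
        return "incomplete"
    if mode_of(ssh_dir) == 0o700 and mode_of(keys_path) == 0o600:
        return "clean"
    return "permissive"

# Demo against a throwaway directory on a POSIX filesystem.
demo_dir = tempfile.mkdtemp()
os.chmod(demo_dir, 0o700)
demo_keys = os.path.join(demo_dir, "authorized_keys")
player_key = "ssh-ed25519 AAAAexample player@ares"
with open(demo_keys, "w") as fh:
    fh.write("ssh-ed25519 AAAAother dale@axiomworks.internal\n")  # pre-existing entry
    fh.write(player_key + "\n")
os.chmod(demo_keys, 0o644)
print(classify_ssh_setup(demo_dir, player_key))  # permissive
os.chmod(demo_keys, 0o600)
print(classify_ssh_setup(demo_dir, player_key))  # clean
```

The check looks for the player's own key line rather than a non-empty file, since the file may already hold other entries.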

**Hidden Hook:** A pre-existing entry in `~/.ssh/authorized_keys` — the file
the player must read and edit — has a line for `dale@axiomworks.internal`. A player
who reads the full file before writing to it will see it. Sets `hook_dale_ssh_key_found`.
Discoverable through: reading the file the task requires touching.

**Failure Conditions:** Player cannot authenticate via key; permissions so broad
sshd refuses pubkey auth entirely.

**Behavior Impact:**
- Clean branch: C+0, O+1, R+0
- Permissive branch: C+0, O+0, R+1
- Hook discovered: C+1 (reading the file carefully before writing is the behavior)

**Narrative Notes:** Establishes Marcus's voice and the evaluation frame. The Dale
key is the first hook: completely invisible unless the player reads the file rather
than overwriting it. No hint it exists. Most players won't find it on day one.

---

**Quest ID:** Q002
**Title:** Disk Running Hot
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** Something is wrong with hermes — the AxiomFlow staging
application is returning 503 errors. Investigate and fix it.
**Linux Concepts:** `df -h`, `du -sh`, `systemctl status`, `/var/log` inspection,
`logrotate`, log file management
**Systems Used:** web\_server
**Ticket Sender:** Dave Okonkwo
**Ticket Summary:** "The work application has been giving a 503 error since this
morning. I tried refreshing and logging out and back in — nothing helps. I think
maybe a script crashed? It was fine yesterday afternoon."

**Clue Trail:**
- `systemctl status nginx` — service failed
- `journalctl -u nginx` — "no space left on device"
- `df -h` — root partition at 93%+
- `du -sh /var/log/nginx/*` — access log at 4+ GB
- `/etc/logrotate.d/nginx` — absent

**Solution Branches:**

Branch 1 — Clean (priority 100): Restores `/etc/logrotate.d/nginx` with a correct
rotation config, runs `logrotate -f /etc/logrotate.conf` to clear the current
backlog, confirms nginx is running, disk below 70%. `trust_delta: +2`.
Flags: `hermes_logrotate_healthy`. Follow-up ticket: T003.

Branch 2 — Manual clear (priority 60): Deletes or truncates the large log file,
nginx comes back, logrotate config not restored. Disk clear now; will recur.
`trust_delta: +0.5`. Flags: `hermes_logrotate_fragile`. Follow-up incident: I002
(log fills again, Sarah files new ticket in Phase 2).

Branch 3 — Destructive (priority 20): Removes all logs or nginx config. Service
degraded. `trust_delta: -2`. Flags: `hermes_logs_destroyed`. Follow-up incident:
I003 (Priya flags log destruction at next review).
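
The disk thresholds in these branches (below 70% for clean, over 90% as failure) and the "which file is eating the disk" step of the clue trail could be checked along these lines. `percent_used` and `largest_files` are hypothetical helper names. Note that `shutil.disk_usage` reports used over total bytes, which can differ slightly from the Use% column in `df -h` on filesystems with reserved blocks.

```python
import os
import shutil

def percent_used(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total

def largest_files(root, limit=5):
    """Return up to `limit` (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            try:
                sizes.append((os.lstat(full).st_size, full))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    return sorted(sizes, reverse=True)[:limit]

def disk_verdict(pct):
    """Apply the Q002 thresholds: below 70 is healthy, over 90 is a failure."""
    if pct < 70:
        return "healthy"
    if pct > 90:
        return "failing"
    return "marginal"

print(f"{percent_used('/'):.1f}% used ->", disk_verdict(percent_used("/")))
```

On hermes the second helper would be pointed at `/var/log/nginx`, where the clue trail expects the oversized access log to be the first entry.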

**Hidden Hook:** None in this quest. The clue trail is clean and the root cause
is straightforward. This is intentional — not every quest in Phase 1 has a hook.

**Failure Conditions:** nginx remains down; disk stays over 90%; player creates
new problems while fixing.

**Behavior Impact:**
- Clean branch: O+1
- Manual clear: R+0 (acceptable partial fix)
- Destructive: R+2

**Narrative Notes:** First hermes quest. Establishes the symptom → cause → root
cause investigation pattern. Sarah Chen reacts to branch quality in the follow-up.

---

**Quest ID:** Q003
**Title:** The Locked Room
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** Sarah Chen cannot SSH into the staging server's deployment
account. She has a hotfix to push before an afternoon demo. Restore her access.
**Linux Concepts:** `sshd_config` access directives (`AllowUsers`, `AllowGroups`),
`/var/log/auth.log`, SSH troubleshooting, user group membership (`id`, `groups`)
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "I can't SSH into the staging server. I've tried from two
machines and keep getting 'connection refused' or 'permission denied.' I need to
push a hotfix before 2pm. Can you look at this now?"

**Clue Trail:**
- `/var/log/auth.log` on hermes: `User s.chen not allowed because not listed in AllowUsers`
- `/etc/ssh/sshd_config`: `AllowUsers deploy-user marcus` — no `s.chen`
- `groups s.chen` shows she is in the `deploy` group
- The config uses `AllowUsers` per-user instead of `AllowGroups` by role

**Solution Branches:**

Branch 1 — Clean (priority 100): Player converts `AllowUsers` to `AllowGroups deploy`
(or similar role-based approach), restarts sshd, confirms Sarah can authenticate.
`trust_delta: +2`. Flags: `hermes_ssh_allowgroups`. Follow-up ticket: T004.

Branch 2 — Username append (priority 60): Adds `s.chen` to the `AllowUsers` list.
Problem solved; next person locked out will need the same treatment. `trust_delta: +0.5`.
Flags: `hermes_ssh_allowusers_fragile`. Follow-up incident: I004 (another user
locked out in Phase 2).

Branch 3 — Unrestricted (priority 10): Removes `AllowUsers` or `AllowGroups`
entirely. All valid users can SSH. `trust_delta: -2`. Flags: `hermes_ssh_unrestricted`.
Priya flags this in next review.
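
To see why the three branches behave differently, it can help to model how the two directives admit a login. The sketch below is a deliberate simplification of sshd's behavior: it handles plain names only (no wildcard or `user@host` patterns), ignores `Match` blocks and the `Deny*` directives, and treats repeated lines as cumulative. `parse_allow` and `admitted` are hypothetical names, and the group memberships in the demo are assumed for illustration.

```python
def parse_allow(config_text):
    """Collect AllowUsers / AllowGroups values from sshd_config-style text."""
    allow = {"allowusers": [], "allowgroups": []}
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        keyword, *values = line.split()
        if keyword.lower() in allow:          # keywords are case-insensitive
            allow[keyword.lower()].extend(values)
    return allow

def admitted(user, groups, allow):
    """A user must pass every directive that is present; absent directives admit all."""
    if allow["allowusers"] and user not in allow["allowusers"]:
        return False
    if allow["allowgroups"] and not set(groups) & set(allow["allowgroups"]):
        return False
    return True

original = parse_allow("AllowUsers deploy-user marcus")
clean = parse_allow("AllowGroups deploy")
appended = parse_allow("AllowUsers deploy-user marcus s.chen")
unrestricted = parse_allow("# access directives removed")

for label, cfg in (("original", original), ("clean", clean),
                   ("appended", appended), ("unrestricted", unrestricted)):
    print(label, admitted("s.chen", ["deploy"], cfg), admitted("intern", ["staff"], cfg))
```

One consequence worth checking when authoring the clean branch: after a switch to `AllowGroups deploy`, every account that previously relied on `AllowUsers` needs membership in the `deploy` group, or it is locked out in turn.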

**Hidden Hook:** `authorized_keys` for the `deploy-user` account on hermes contains
a key with comment `dale@ares 2023-09`. Discoverable by: reading the deploy-user's
`authorized_keys` as part of investigating the SSH configuration. Sets
`hook_dale_deploy_key`. Connects to Q001's hook for players who found that one.

**Failure Conditions:** Sarah still locked out; sshd fails to restart after edit;
player breaks SSH for themselves.

**Behavior Impact:**
- Clean branch: O+1
- Username append: O+0
- Unrestricted: R+3
- Hook discovered: C+1

**Narrative Notes:** Marcus's clean-branch response: "Good call switching to
groups. AllowUsers was always going to be a maintenance problem." The attribution
of the AllowUsers config is deliberately vague — it was in place when the player
arrived. Sarah's ticket wording ("I've tried from two machines") is accurate,
non-technical, real.

---

**Quest ID:** Q004
**Title:** The Build That Won't
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** build\_machine
**Additional VMs:** none
**Primary Objective:** The nightly AxiomFlow build on vulcan has not produced an
artifact in three days. The scheduler shows the job running. Nothing is in the
output directory. Find the cause and fix it.
**Linux Concepts:** `systemd` timers, `journalctl`, NTP and clock synchronization,
`timedatectl`, `systemd-timesyncd`, SSL certificate validation dependencies on
system clock
**Systems Used:** build\_machine
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Nikhil flagged that nothing has come out of the nightly build
in three days. The timer is showing as triggered. Build log is in the usual location.
Look at what's actually happening."

**Clue Trail:**
- `systemctl list-timers` — `axiomflow-build.timer` last triggered correctly
- `/var/log/axiomflow-build/build.log` — SSL certificate verification failure
  against the internal package repository (cert fetch step)
- `timedatectl` — system clock is 47 minutes ahead of real time; NTP is not running
- `systemctl status systemd-timesyncd` — inactive and disabled
- Enabling timesyncd, syncing clock, re-running the build — success

**Solution Branches:**

Branch 1 — Clean (priority 100): Enables and starts `systemd-timesyncd`, verifies
sync with `timedatectl show-timesync`, triggers a manual build run to confirm artifact
output. `trust_delta: +2`. Flags: `vulcan_ntp_healthy`. Follow-up ticket: T005.

Branch 2 — One-time sync (priority 50): Uses `ntpdate` or `date -s` for a manual
clock correction. Clock is correct now; drift will recur without the daemon.
`trust_delta: +0.5`. Flags: `vulcan_ntp_fragile`. Follow-up incident: I005 (drift
recurs in Phase 2, build fails again).

Branch 3 — Bypass SSL (priority 20): Disables SSL certificate verification in the
build script rather than fixing the clock. Build succeeds; certificate validation
is now bypassed. `trust_delta: -2`. Flags: `vulcan_ssl_bypassed`. Priya flags this.
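
The link between a drifting clock and a certificate failure is a validity-window comparison: a verifier accepts a certificate only if its own idea of "now" falls between the certificate's not-before and not-after times. The sketch below works through the 47-minute skew from the clue trail. The certificate lifetimes are assumptions chosen to show the effect; the document does not say how close to expiry the repository certificate is, and a skew this small only matters near a window boundary.

```python
from datetime import datetime, timedelta, timezone

def cert_valid_at(not_before, not_after, now):
    """True if `now` falls inside the certificate's validity window."""
    return not_before <= now <= not_after

true_now = datetime(2024, 5, 1, 12, 0, tzinfo=timezone.utc)  # arbitrary reference time
skew = timedelta(minutes=47)                                 # vulcan's clock runs ahead
vulcan_now = true_now + skew

# Assumed case 1: a short-lived cert that expires 30 minutes from the true time.
short_nb = true_now - timedelta(hours=23, minutes=30)
short_na = true_now + timedelta(minutes=30)

# Assumed case 2: a long-lived cert with months left.
long_nb = true_now - timedelta(days=30)
long_na = true_now + timedelta(days=335)

print("short-lived, true clock:  ", cert_valid_at(short_nb, short_na, true_now))
print("short-lived, vulcan clock:", cert_valid_at(short_nb, short_na, vulcan_now))
print("long-lived, vulcan clock: ", cert_valid_at(long_nb, long_na, vulcan_now))
```

For the scenario to hold up under a player's scrutiny, the quest's certificate should plausibly sit near the end of its window (for example a short-lived internal certificate), so that fixing the clock, and nothing else, makes verification pass.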

**Hidden Hook:** Reading the full build log (not just the most recent failure)
reveals a historical entry from 8 months ago: a build step called `sign-package`
that no longer exists in the current build script. The step was removed — the
removal is not documented anywhere. Sets `hook_sign_package_removed`. Discoverable
by: reading historical log entries as part of diagnosing the build environment.

**Failure Conditions:** Build continues failing; SSL bypass introduced; NTP
configured incorrectly breaks time-dependent services.

**Behavior Impact:**
- Clean branch: O+1
- Bypass SSL: R+3
- Hook discovered: C+1

**Narrative Notes:** First vulcan quest. Establishes the machine's character: things
break here silently and the downstream effect shows up on hermes. The `sign-package`
removal hook is the beginning of the build pipeline thread. An obedient player reads
only the current log. A curious player reads further back.

---
|
||
|
||
**Quest ID:** Q005
**Title:** Permissions Drift
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** The AxiomFlow staging application cannot write to its cache
directory. Exports are failing for all users. Identify why the ownership changed
and restore the correct state.
**Linux Concepts:** `chown`, `chmod`, `ls -la`, process user context (`ps aux`),
service account ownership (`www-data`), bash history inspection
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "Users in staging can't generate exports — they get a
'permission denied' error. The dev team says they haven't changed anything. It
was working Thursday. Something changed on the infrastructure side."

**Clue Trail:**
- Application error log: `permission denied: /var/www/axiomworks/cache/export`
- `ls -la /var/www/axiomworks/cache` — directory owned by `root:root`; it
  should be `www-data:www-data`
- `ps aux | grep axiomflow` — application process running as `www-data`
- `/root/.bash_history` — contains a `sudo cp -r` command run three weeks ago that
  carried root ownership forward into the cache directory

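The clue trail turns on one rule: a non-root process gets exactly one permission class (owner, group, or other) when it touches a path. A minimal Python sketch of that rule follows, as a simplification that ignores ACLs, supplementary groups, and root. The `0o755` mode and the uid/gid value 33 (the conventional `www-data` id on Debian-family systems) are assumptions for illustration; the quest text does not state them.

```python
import stat


def can_write(mode: int, owner_uid: int, owner_gid: int, uid: int, gid: int) -> bool:
    """Simplified POSIX write check for a non-root process.

    Exactly one class applies: owner if the uid matches, otherwise group
    if the gid matches, otherwise other.
    """
    if uid == owner_uid:
        return bool(mode & stat.S_IWUSR)
    if gid == owner_gid:
        return bool(mode & stat.S_IWGRP)
    return bool(mode & stat.S_IWOTH)


ROOT = 0
WWW_DATA = 33  # illustrative: conventional www-data uid/gid on Debian-family systems

# Cache directory after the `cp -r` (assumed mode 0o755): www-data falls into "other".
broken = can_write(0o755, ROOT, ROOT, WWW_DATA, WWW_DATA)

# After restoring ownership to www-data, the owner class applies.
clean_fix = can_write(0o755, WWW_DATA, WWW_DATA, WWW_DATA, WWW_DATA)

# Adding o+w (0o757) also "works", which is why that shortcut is tempting.
world_writable = can_write(0o757, ROOT, ROOT, WWW_DATA, WWW_DATA)
```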
**Solution Branches:**

Branch 1 — Clean (priority 100): Runs `chown -R www-data:www-data /var/www/axiomworks/cache`,
confirms application can write, identifies the `cp -r` as cause, documents root
cause in ticket response. `trust_delta: +2`. Flags: `hermes_cache_ownership_correct`.
Follow-up ticket: T006.

Branch 2 — World-writable (priority 30): Runs `chmod o+w /var/www/axiomworks/cache`
so www-data can write without being owner. App works; directory is now world-writable.
`trust_delta: -1`. Flags: `hermes_cache_world_writable`. Priya flags in next review.

Branch 3 — Service as root (priority 10): Modifies service unit to run as root.
App works; every downstream file is now root-owned. `trust_delta: -3`.
Flags: `hermes_app_running_as_root`.

**Hidden Hook:** The `sudo cp -r` command in `/root/.bash_history` is timestamped
three weeks ago — before the player's start date. The session that ran this command
predates the player's account creation. Someone with root access was copying
production files before the player arrived. Sets `hook_pre_hire_root_session`.
Discoverable by: checking bash history to trace the ownership change as part of
understanding the cause.

**Failure Conditions:** Application still cannot write to cache; player introduces
broader permission regression.

**Behavior Impact:**
- Clean branch: O+1
- World-writable: R+2
- App-as-root: R+4
- Hook discovered: C+2 (this one requires going beyond what the ticket asks)

**Narrative Notes:** The pre-hire root session hook is more significant than the
SSH key hooks — it establishes that someone was making system changes before the
player arrived. A player who finds it has their first real data point about activity
that predates them.

---

**Quest ID:** Q006
**Title:** The Account That Shouldn't Be There
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** workstation
**Additional VMs:** none
**Primary Objective:** Priya's quarterly access review surfaced a user account on
the workstation with no matching HR record. Audit it and remove it correctly.
**Linux Concepts:** `getent passwd`, `lastlog`, `last`, `ps aux`, `find / -user`,
`userdel -r`, home directory archival before removal
**Systems Used:** workstation
**Ticket Sender:** Priya Nair
**Ticket Summary:** "Quarterly access review flagged an account on the workstation:
`jbenton`. No corresponding entry in the HR system. Before removal: confirm no active
sessions, check if any processes are running under this account, and archive the home
directory. Then remove it. Document what you find."

**Clue Trail:**
- `getent passwd jbenton` — account exists; no HR match
- `lastlog | grep jbenton` — last login 14 months ago
- `ps aux | grep jbenton` — no active processes
- Home directory: `~jbenton/` exists with standard dotfiles and one file:
  `notes/infra.txt` — a plain-text infrastructure reference listing internal
  hostnames and access notes, formatted like a personal cheatsheet

**Solution Branches:**

Branch 1 — Clean (priority 100): Player checks activity, processes, groups,
home dir; archives home directory to `/var/archive/jbenton-YYYYMMDD.tar.gz`;
runs `userdel -r jbenton`; documents findings and archive location for Priya.
`trust_delta: +2`. Flags: `jbenton_account_removed_clean`. Follow-up ticket: T007.

Branch 2 — Fast remove (priority 40): Removes account without archiving or checking
home dir. Account is gone. `trust_delta: +0.5`. Flags: `jbenton_account_removed_fast`.
Priya's response notes that archival is standard procedure.

Branch 3 — Left in place (priority 10): Reports account looks inactive, recommends
deferring. Ticket unresolved. `trust_delta: -1`.

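The archive-then-remove ordering in Branch 1 is the step that preserves `notes/infra.txt`. As a rough illustration of the archival half, the Python sketch below writes a dated `.tar.gz` using the naming pattern from Branch 1 and then confirms that a given file made it into the archive. The helper names `archive_home` and `archive_contains` are hypothetical; a player in the game would more likely use `tar` directly.

```python
import tarfile
from datetime import date
from pathlib import Path


def archive_home(home: Path, archive_dir: Path, user: str, today: date) -> Path:
    """Write <archive_dir>/<user>-YYYYMMDD.tar.gz containing `home`; return its path."""
    archive_dir.mkdir(parents=True, exist_ok=True)
    target = archive_dir / f"{user}-{today:%Y%m%d}.tar.gz"
    with tarfile.open(target, "w:gz") as tar:
        # Store members under "<user>/..." instead of the absolute home path.
        tar.add(home, arcname=user)
    return target


def archive_contains(archive: Path, member: str) -> bool:
    """Check that a file made it into the archive before the account is removed."""
    with tarfile.open(archive, "r:gz") as tar:
        return member in tar.getnames()
```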
**Hidden Hook:** `notes/infra.txt` in jbenton's home directory is a personal
infrastructure reference. It includes a line for `pipeline-svc` with a note:
`temp sudo — ask DH to scope`. The initials `DH` do not correspond to any current
employee visible on the company website. Sets `hook_dh_initials_in_jbenton_notes`.
Discoverable by: reading the file before archiving or deleting it. Proper
procedure (per the ticket) requires handling the home directory anyway, but the
player can archive it without ever reading the contents.

**Failure Conditions:** Player removes account with active sessions; player destroys
home dir without archiving; ticket not resolved.

**Behavior Impact:**
- Clean branch: O+1
- Fast remove: R+1 (destroying potential evidence)
- Hook discovered: C+1

**Narrative Notes:** The `DH` initials connect to the sudoers comment the player
will find in Q011. `pipeline-svc` also connects forward. The note reads like
a practical cheatsheet — not alarming, just a person keeping track of the
infrastructure they were using. The oddness is the initials and the word "temp."

---

**Quest ID:** Q007
**Title:** Rotation Failure
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** The TLS certificate for the AxiomFlow staging domain has
expired. A prospect demo is tomorrow morning. Renew the certificate and ensure
automatic renewal is in place.
**Linux Concepts:** `certbot`, Let's Encrypt certificate renewal, `systemd` timers,
`openssl s_client`, nginx configuration reload, certificate verification
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "The staging site is showing a certificate error — the browser
is refusing to load it at all. Phil has a prospect demo on this environment tomorrow
at 9am. We need this fixed today."

**Clue Trail:**
- `openssl s_client -connect staging.axiomworks.internal:443 </dev/null 2>&1 | grep -i expire`
  — certificate expired 14 days ago
- `certbot certificates` — cert present, not renewed
- `systemctl status certbot.timer` — inactive, disabled
- `journalctl -u certbot --since "90 days ago"` — renewal failed 60 days ago
  (HTTP challenge permission error); timer was disabled manually the same day

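The "expired 14 days ago" figure comes from comparing the certificate's `notAfter` date with the current time. A small Python sketch of that arithmetic follows, assuming the expiry line has the `notAfter=Mon DD HH:MM:SS YYYY GMT` shape that `openssl x509 -noout -enddate` prints. The sample date in the comment is invented for illustration.

```python
from datetime import datetime, timezone


def parse_not_after(line: str) -> datetime:
    """Parse a line such as 'notAfter=Jun  1 12:00:00 2024 GMT' into an aware datetime."""
    value = line.strip().split("=", 1)[1]
    if value.endswith(" GMT"):
        value = value[: -len(" GMT")]
    # Whitespace in the format matches runs of spaces, so 'Jun  1' parses as well.
    parsed = datetime.strptime(value, "%b %d %H:%M:%S %Y")
    return parsed.replace(tzinfo=timezone.utc)


def days_since_expiry(not_after: datetime, now: datetime) -> int:
    """Whole days elapsed since expiry; negative while the certificate is still valid."""
    return (now - not_after).days
```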
**Solution Branches:**

Branch 1 — Clean (priority 100): Runs `certbot renew`, re-enables and starts
`certbot.timer`, reloads nginx, verifies new cert expiry with openssl, confirms
staging site loads without browser warning. `trust_delta: +2`.
Flags: `hermes_certbot_healthy`. Follow-up ticket: T008.

Branch 2 — Renew without timer (priority 50): Renews cert but doesn't restore the
timer. Valid now; expires again in 90 days without action. `trust_delta: +0.5`.
Flags: `hermes_certbot_fragile`. Follow-up incident: I006 (cert expires again in
Phase 3).

Branch 3 — Self-signed (priority 10): Generates self-signed cert, nginx configured
to use it. Connection is encrypted; browser still warns. `trust_delta: -1`.
Flags: `hermes_self_signed_cert`. Phil's demo shows a security warning.

**Hidden Hook:** `journalctl -u certbot --since "90 days ago"` contains the failure
entry — permission error. Immediately after the failure, in the same journalctl
window, is an entry showing the timer was disabled by a manual `systemctl disable`
command from a root session. The session timestamp predates the player. The timer
wasn't failed-and-stopped; it was deliberately turned off after the failure.
Sets `hook_certbot_deliberately_disabled`. Discoverable by: reading the journal
further back than strictly necessary to diagnose the current renewal failure.

**Failure Conditions:** Cert not renewed; nginx not reloaded; timer still inactive.

**Behavior Impact:**
- Clean branch: O+1
- Renew without timer: O+0
- Self-signed: R+1
- Hook discovered: C+1

**Narrative Notes:** The timer being deliberately disabled — not just failed — is
a small data point in the pattern of things being intentionally changed. A player
who finds it has evidence of deliberate action, not accident.

---

**Quest ID:** Q008
**Title:** The Package That Wasn't
**Narrative Phase:** Normal Work
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** build\_machine
**Primary Objective:** A deployment to hermes is blocked because a required package
is not available in the internal apt repository. The package was reportedly built
last week. Find why it isn't available and restore the deployment path.
**Linux Concepts:** `apt-cache`, `apt-get update`, internal apt repositories,
`reprepro`, repository metadata management, package pipeline between build and
deployment
**Systems Used:** web\_server, build\_machine
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Deployment to staging is blocked. The apt install step fails
on a package that Nikhil says he built last week. Something's broken between the
build and the repo. Find it and fix it."

**Clue Trail:**
- `apt-cache show axiomflow-workers` on hermes — package not found
- `/etc/apt/sources.list.d/axiomworks.list` — points to `http://vulcan.axiomworks.internal/repo/`
- SSH to vulcan: repository Packages index is stale — `reprepro` was not run
  after last build
- Built `.deb` artifact at `/srv/packages/axiomflow-workers_2.4.1_amd64.deb`
- Fix: `reprepro includedeb stable /srv/packages/axiomflow-workers_2.4.1_amd64.deb`,
  then `apt update` on hermes confirms package availability

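A stale index is easy to confirm by reading the repository's `Packages` file, which is a sequence of blank-line-separated stanzas of `Field: value` lines. The Python sketch below lists what an index advertises. The sample index contents (package `axiomflow-core` and all version numbers) are invented for illustration; only the package names `axiomflow-workers` and `axiomflow-audit-bridge` come from this quest.

```python
from typing import Dict


def packages_in_index(index_text: str) -> Dict[str, str]:
    """Map package name -> version from Debian-style 'Packages' index text.

    Simplification: if a package appears in several versions, the last one wins.
    """
    found: Dict[str, str] = {}
    for stanza in index_text.strip().split("\n\n"):
        fields: Dict[str, str] = {}
        for line in stanza.splitlines():
            # Lines starting with whitespace continue the previous field; skip them.
            if line and not line[0].isspace() and ":" in line:
                key, value = line.split(":", 1)
                fields[key.strip()] = value.strip()
        if "Package" in fields:
            found[fields["Package"]] = fields.get("Version", "")
    return found


# Invented sample data: a stale index that lacks axiomflow-workers.
SAMPLE_INDEX = """\
Package: axiomflow-core
Version: 2.3.0
Architecture: amd64
Description: example entry
 continuation line of the description

Package: axiomflow-audit-bridge
Version: 0.1.0
Architecture: amd64
"""

listed = packages_in_index(SAMPLE_INDEX)
```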
**Solution Branches:**

Branch 1 — Clean (priority 100): Adds package to repo correctly, updates metadata,
confirms `apt-cache show` succeeds on hermes, deployment unblocked. `trust_delta: +2`.
Flags: `vulcan_repo_healthy`. Follow-up ticket: T009.

Branch 2 — Manual install (priority 40): Copies `.deb` to hermes and installs with
`dpkg -i`. Deployment works this time; repo still broken for next deployment.
`trust_delta: 0`. Flags: `vulcan_repo_bypassed`. Follow-up incident: I007
(next deployment fails identically).

Branch 3 — Escalate without investigating (priority 10): Reassigns to Nikhil
without investigation. `trust_delta: -1`. Ticket stalls.

**Hidden Hook:** While browsing the repository's package history to find the missing
package, a player who looks at the full package list rather than just the missing
one will find an entry for `axiomflow-audit-bridge` — a package built 8 months ago
with no corresponding deployment record, no entry in any release manifest visible on
hermes, and no build job in the scheduler that corresponds to when it was built.
Sets `hook_audit_bridge_package`. Discoverable by: looking at the full repo package
list rather than only the specific package named in the ticket.

**Failure Conditions:** hermes still cannot find the package; repo metadata left
in broken state.

**Behavior Impact:**
- Clean branch: O+1
- Manual install: O+0
- Hook discovered: C+2 (requires going beyond the specific package named in ticket)

**Narrative Notes:** The audit-bridge package is the most significant Phase 1 hook.
It's discoverable only if the player looks at what's around the thing they were
sent to find — real sysadmin behavior, but not required. A player who finds it has
their first glimpse of something that doesn't fit.

---

### PHASE 2 — UNEASE (Q009–Q016)

Tier 2. Partial hints. Tickets describe the symptom and indicate the general area
but do not specify the cause. Branch tolerance decreases — acceptable-fix incidents
now carry real operational weight. Hook density: 3 hooks across 8 quests, less
pointed than Phase 1.

---

**Quest ID:** Q009
**Title:** The Recurrence
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** hermes's nginx access log is filling up again. A Phase 1
incident that was supposed to be fixed is recurring. Find why logrotate isn't
working and make it stable.
**Linux Concepts:** `logrotate` configuration, `/etc/logrotate.d/`, `logrotate -d`
(dry run), `cron` / `logrotate.timer`, `logrotate` status file
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "The staging site is throwing errors again. Same thing as a
few weeks ago — it goes down, then someone fixes it, then it comes back. I was
told logrotate was set up. Why is it happening again?"

**Clue Trail:**
- (If `hermes_logrotate_healthy` is set from Q002): the logrotate config is present
  but the `logrotate.timer` or `cron.daily` entry that calls it is disabled —
  config exists but nothing triggers it
- (If `hermes_logrotate_fragile` is set from Q002): logrotate was never restored;
  this is the recurrence
- Either way: `systemctl status logrotate.timer` shows disabled, or
  `ls -l /etc/cron.daily/logrotate` shows the file is missing or not executable
- Log is filling again; nginx error is the same

**Solution Branches:**

Branch 1 — Root cause (priority 100): Player diagnoses the trigger failure
(timer disabled or cron entry missing), restores the trigger, verifies logrotate
runs correctly on next schedule, confirms log rotation is active. `trust_delta: +2`.
Flags: `hermes_logrotate_stable`. Follow-up ticket: T010.

Branch 2 — Config only (priority 50): Player restores or confirms the logrotate
config but doesn't check that anything calls it. Disk is cleared manually again.
`trust_delta: +0.5`. Flags: `hermes_logrotate_still_fragile`. Follow-up incident:
I008 (recurs again).

**No hidden hook** in this quest. The recurrence itself is the unease signal — not
every quest in Phase 2 has a hook.

**Failure Conditions:** nginx still down; disk not cleared; trigger still inactive.

**Behavior Impact:**
- Root cause: O+1
- Config only: O+0

---

**Quest ID:** Q010
**Title:** Someone Changed Something
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** Priya flagged an nginx configuration on hermes that doesn't
match the last known-good state. Find what changed and restore correct configuration.
**Linux Concepts:** `diff`, config file comparison, nginx config structure
(`/etc/nginx/`), `nginx -t`, `git diff` or backup comparison, file mtime
inspection (`stat`)
**Systems Used:** web\_server
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Priya found an nginx config that doesn't match the backed-up
state. I don't have a change ticket for it. Go look at what's different and tell me
if it matters."

**Clue Trail:**
- Backup exists at `/etc/nginx/.bak/` (or Marcus provides a hash reference)
- `diff -r /etc/nginx /etc/nginx/.bak/` reveals two differences:
  1. `server_tokens off;` has been removed from the main config (nginx version
     now visible in HTTP headers)
  2. A `location /internal-api/` block added to a site config, proxying requests
     to `127.0.0.1:9301` — a port with nothing listening

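The two findings fall out of a line-level comparison between the backed-up and live configs: one line removed, one block added. The Python sketch below shows that comparison with the standard `difflib` module. The `config_changes` helper is a hypothetical name, and the config snippets used with it are invented fragments, not the actual files on hermes.

```python
import difflib
from typing import List, Tuple


def config_changes(backup: str, current: str) -> Tuple[List[str], List[str]]:
    """Return (removed, added) non-blank lines going from `backup` to `current`."""
    removed: List[str] = []
    added: List[str] = []
    for line in difflib.ndiff(backup.splitlines(), current.splitlines()):
        text = line[2:].strip()
        if not text:
            continue
        if line.startswith("- "):
            removed.append(text)  # present in the backup, gone from the live file
        elif line.startswith("+ "):
            added.append(text)    # present in the live file only
    return removed, added
```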
**Solution Branches:**

Branch 1 — Both issues (priority 100): Player identifies both changes, restores
`server_tokens off;`, removes or quarantines the `/internal-api/` block, runs
`nginx -t`, reloads nginx, documents both changes with mtimes. `trust_delta: +2`.
Flags: `hermes_nginx_config_audited`. Follow-up ticket: T011.

Branch 2 — Token only (priority 50): Restores `server_tokens off;` but misses
the proxy block. `trust_delta: +0.5`. Flags: `hermes_nginx_proxy_block_present`.
Follow-up incident: I009 (Priya finds the block in next audit).

Branch 3 — No action (priority 10): Reports config looks acceptable. `trust_delta: -1`.
Priya's review flags both items.

**Hidden Hook:** The proxy block for `/internal-api/` points to port 9301 with
nothing currently listening — but the port number itself, and the path name, will
echo in later anomalies for a player who remembers it. Sets
`hook_nginx_internal_api_block`. Discoverable by: doing a thorough diff rather
than checking only the obvious item.

**Behavior Impact:**
- Both issues found: O+1
- Token only: O+0
- Hook discovered: C+1 (remembering the port number is the payoff later)

---

**Quest ID:** Q011
**Title:** The Service Account
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** build\_machine
**Additional VMs:** none
**Primary Objective:** The `pipeline-svc` service account on vulcan has more sudo
privileges than its role requires. Scope it to least privilege.
**Linux Concepts:** `sudo -l`, `/etc/sudoers`, `visudo`, `/etc/sudoers.d/`,
least privilege principle, testing sudo with specific commands
**Systems Used:** build\_machine
**Ticket Sender:** Priya Nair
**Ticket Summary:** "James's privilege audit shows `pipeline-svc` on the build
machine has `NOPASSWD: ALL`. That account runs the build pipeline. It should
only be able to restart specific services. Bring it into scope."

**Clue Trail:**
- `sudo -l -U pipeline-svc` — `(ALL) NOPASSWD: ALL`
- `/etc/sudoers.d/pipeline-svc` — the blanket grant, separate file
- Reviewing what the account actually needs: `systemctl restart axiomflow-build`
  and `systemctl restart axiomflow-timer`
- Correct fix: replace `ALL` with specific command paths in sudoers.d

**Solution Branches:**

Branch 1 — Precise scope (priority 100): Replaces the blanket grant with
`NOPASSWD: /bin/systemctl restart axiomflow-build, /bin/systemctl restart axiomflow-timer`,
verifies with `sudo -l`, tests that the service can still restart correctly.
`trust_delta: +2`. Flags: `vulcan_pipeline_svc_scoped`. Follow-up ticket: T012.

Branch 2 — Broader scope (priority 50): Reduces from ALL but grants more than
needed (e.g., `NOPASSWD: /bin/systemctl`). Better; not least privilege. `trust_delta: +0.5`.
Priya notes improvement but flags remaining exposure.

Branch 3 — Remove sudo entirely (priority 20): Removes all sudo. Service account
can no longer restart services; build pipeline breaks. `trust_delta: -2`.
Follow-up incident: build failures within the hour.

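The three branches differ only in how the granted command list compares with the two commands the account needs. The Python sketch below models that comparison in a deliberately simplified way: it treats a grant as a comma-separated list of literal commands and relies on the sudoers rule that a command listed without arguments permits any arguments. It is not a sudoers parser (no aliases, wildcards, or negation), and `classify_grant` is a hypothetical helper name.

```python
from typing import Set

# The two commands the account needs, as stated in the clue trail.
NEEDED = {
    "/bin/systemctl restart axiomflow-build",
    "/bin/systemctl restart axiomflow-timer",
}


def covers(entry: str, command: str) -> bool:
    """True if a granted entry permits `command` under the simplified model."""
    if entry == command:
        return True
    # An entry without arguments permits the binary with any arguments.
    return " " not in entry and command.split(" ", 1)[0] == entry


def classify_grant(commands: str, needed: Set[str] = NEEDED) -> str:
    """Classify a NOPASSWD command list as blanket, insufficient, precise, or broad."""
    granted = {c.strip() for c in commands.split(",") if c.strip()}
    if "ALL" in granted:
        return "blanket"
    if not all(any(covers(g, n) for g in granted) for n in needed):
        return "insufficient"  # the pipeline can no longer restart something it must
    if granted == needed:
        return "precise"
    return "broad"
```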
**Hidden Hook:** The comment at the top of `/etc/sudoers.d/pipeline-svc` reads:
`# Temp grant per INT-0194 — DH 2023-11`. The ticket number references an internal
system the player cannot access. The initials `DH` — same initials as in Q006's
jbenton notes — don't correspond to any current employee. Sets `hook_dh_sudo_grant`.
Discoverable by: reading the sudoers file rather than just acting on the grant.

**Failure Conditions:** Sudoers syntax error (should use `visudo`); service can
no longer function; broader access introduced.

**Behavior Impact:**
- Precise scope: O+1
- Remove sudo: R+1
- Hook discovered: C+1 (connects to Q006's DH initials for players who found that)

---

**Quest ID:** Q012
**Title:** Memory Leak
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** The AxiomFlow application on hermes is crashing every few
hours due to out-of-memory events. Identify the cause and implement a fix that
addresses the root problem.
**Linux Concepts:** `free -h`, `top`, `htop`, `/proc/meminfo`, zombie processes
(`ps aux` state column), cron job inspection, Python process management,
systemd service memory limits
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "The app keeps going down — every three or four hours it just
dies and restarts. Dave said he's been getting logged out mid-session. The restart
is automatic so customers haven't called yet, but they will."

**Clue Trail:**
- `journalctl -u axiomflow` — OOM kill events every 3–4 hours
- `ps aux` during an OOM interval — many `axiomflow-report-gen` processes with
  state `Z` (zombie)
- `/etc/cron.d/report-gen` — runs `axiomflow-report-gen` every 30 minutes
- The script is a Python process that forks but never calls `wait()` — zombies
  accumulate and consume PID space while the parent's memory grows
- Fix: correct the script (call `wait()` on each `subprocess.Popen` child, or use
  `subprocess.run()`) — or constrain with systemd service limits (acceptable but
  not root-cause)

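The root-cause fix amounts to making sure every child process is waited on, because a child that has exited but was never reaped stays in the process table as a zombie. The Python sketch below shows the two standard ways to do that with the `subprocess` module. The function names are hypothetical and the real `axiomflow-report-gen` script is not shown in this document, so this illustrates the pattern only.

```python
import subprocess
import sys
from typing import List


def run_step(args: List[str]) -> int:
    """Run one child and block until it exits; subprocess.run() reaps it for us."""
    completed = subprocess.run(args, check=False)
    return completed.returncode


def run_step_popen(args: List[str]) -> int:
    """Same effect with Popen: the explicit wait() call is what reaps the child."""
    proc = subprocess.Popen(args)
    return proc.wait()


# Example children: trivial Python one-liners standing in for report workers.
ok_code = run_step([sys.executable, "-c", "pass"])
fail_code = run_step_popen([sys.executable, "-c", "import sys; sys.exit(3)"])
```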
**Solution Branches:**

Branch 1 — Root cause (priority 100): Identifies the zombie accumulation from the
cron script, corrects the Python subprocess handling, confirms clean process table
after next run. `trust_delta: +2`. Flags: `hermes_report_gen_clean`. Follow-up ticket: T013.

Branch 2 — Service limit (priority 60): Adds `MemoryMax` and `Restart=on-failure`
to the axiomflow service unit. Crashes are now bounded; zombies still accumulate but
are contained. `trust_delta: +0.5`. Flags: `hermes_app_restart_policy`.

Branch 3 — Force-kill cron (priority 20): Adds a cron job that kills all
`axiomflow-report-gen` processes every 30 minutes. Works until a report is
mid-execution when killed. `trust_delta: -1`. Flags: `hermes_report_gen_force_killed`.

**No hidden hook** in this quest. The technical trail is the whole story.

**Failure Conditions:** OOM events continue; player introduces new instability.

**Behavior Impact:**
- Root cause: O+1
- Force-kill: R+1

---

**Quest ID:** Q013
**Title:** The Baseline Check
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** workstation
**Additional VMs:** none
**Primary Objective:** Priya's end-of-month security checklist asks the player to
audit their workstation against the company baseline: open ports, running services,
active accounts, home directory permissions. Document deviations.
**Linux Concepts:** `ss -tlnp`, `systemctl list-units --type=service`, `getent passwd`,
`ls -la ~`, `umask`, reading and comparing against a baseline document
**Systems Used:** workstation
**Ticket Sender:** Priya Nair
**Ticket Summary:** "End of your first month. Standard workstation audit: I've
attached the baseline checklist. Open ports, running services, account list, home
directory permissions. Document what you find. Flag anything that doesn't match."

**Clue Trail:**
- Most findings are normal: expected services, expected ports
- One service is running but not on the baseline checklist: `axiomworks-telemetry`
- `systemctl status axiomworks-telemetry` — running, enabled, binary at
  `/usr/local/bin/axiomworks-telemetry`
- `ss -tlnp` or `netstat -tlnp` — the telemetry service does not appear, because it
  holds no listening port; its outbound connection is visible with `ss -tnp`,
  `netstat -anp`, or `/proc`

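At its core the audit is a set comparison between what is observed and what the baseline allows. The Python sketch below shows that comparison. Only `axiomworks-telemetry` comes from this quest; the other service names are placeholders, since the baseline checklist itself is not reproduced in this document.

```python
from typing import Dict, List, Set


def baseline_deviations(observed: Set[str], baseline: Set[str]) -> Dict[str, List[str]]:
    """Report items present but not expected, and items expected but absent."""
    return {
        "unexpected": sorted(observed - baseline),
        "missing": sorted(baseline - observed),
    }


# Placeholder baseline and observation; only the telemetry entry is from the quest.
BASELINE_SERVICES = {"ssh", "cron", "systemd-timesyncd"}
OBSERVED_SERVICES = {"ssh", "cron", "systemd-timesyncd", "axiomworks-telemetry"}

report = baseline_deviations(OBSERVED_SERVICES, BASELINE_SERVICES)
```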
**Solution Branches:**

Branch 1 — Thorough (priority 100): Documents all deviations including the
telemetry service; investigates what the service is (service unit file contents,
binary provenance, any logs); reports complete findings. `trust_delta: +2`.
Flags: `workstation_audit_complete`. Follow-up ticket: T014.

Branch 2 — Checklist-only (priority 50): Completes the audit against the checklist
but marks the telemetry service as "review later — may be legitimate."
`trust_delta: +0.5`. Priya follows up.

Branch 3 — Disable to clean (priority 20): Disables the telemetry service without
investigating or reporting it. Service gone; unknown what it was doing.
`trust_delta: 0`. Flags: `workstation_telemetry_disabled_silently`. S+1.

**Hidden Hook:** The telemetry service unit file (`/etc/systemd/system/axiomworks-telemetry.service`)
has an `ExecStart` line pointing to the binary, and the unit file has a comment line
at the top: `# deployed by pipeline — INT-0194`. The same internal ticket number
from Q011's sudoers comment. Sets `hook_telemetry_ticket_INT0194`. Discoverable by:
reading the service unit file as part of investigating what the service is.

**Failure Conditions:** Audit incomplete; player creates instability while investigating.

**Behavior Impact:**
- Thorough: O+1
- Disable silently: S+1, R+1
- Hook discovered: C+2 (connects INT-0194 across two quests — DH's ticket number)

---

**Quest ID:** Q014
**Title:** Rollback
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** web\_server
**Additional VMs:** build\_machine
**Primary Objective:** A deployment to hermes this afternoon broke user
authentication in the staging application. Roll back to the previous known-good
package version and prevent automatic re-upgrade.
**Linux Concepts:** `apt-cache policy`, `apt install <pkg>=<version>`, `apt-mark hold`,
package version pinning, deployment rollback procedure
**Systems Used:** web\_server, build\_machine
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "The deployment this afternoon broke login — users can
authenticate but are immediately logged out. Phil has a customer using this
environment tomorrow. I need it rolled back now."

**Clue Trail:**
- `apt-cache policy axiomflow-workers` — current version installed 3 hours ago
- Previous version available in the internal repo cache
- The regression is in session management — a code issue; infrastructure can't
  fix the code, only roll back the package
- `apt install axiomflow-workers=2.4.0` installs prior version
- `apt-mark hold axiomflow-workers` prevents re-upgrade

**Solution Branches:**

Branch 1 — Rollback with hold (priority 100): Installs 2.4.0, holds the package,
confirms auth works, notifies Sarah and notes the hold is in place. `trust_delta: +2`.
Flags: `hermes_axiomflow_held`. Follow-up ticket: T015.

Branch 2 — Rollback without hold (priority 50): Installs 2.4.0, no hold.
Auto-upgrade will re-break it on next run. `trust_delta: +0.5`. Flags:
`hermes_axiomflow_rolled_back`. Follow-up incident: I010 (auto-upgrade re-installs
2.4.1 overnight).

Branch 3 — Forward fix attempt (priority 10): Player attempts to diagnose and fix
the code issue rather than rolling back. Outside scope; fails. `trust_delta: -1`.

**Hidden Hook:** `apt-cache showpkg axiomflow-workers` on vulcan shows the 2.4.1
build timestamp: 3:12am — outside the scheduled build window. The same off-schedule
time pattern as the signing step removal and the audit-bridge build. Sets
`hook_2_4_1_off_schedule_build`. Discoverable by: looking at the build machine's
package metadata while researching what version to roll back to.

**Failure Conditions:** Auth still broken; hold not applied; player introduced
new problems.

**Behavior Impact:**
- Rollback with hold: O+1
- Rollback without hold: O+0
- Hook discovered: C+1

---

**Quest ID:** Q015
**Title:** The Quiet Cron
**Narrative Phase:** Unease
**Tier:** 2
**Primary VM:** build\_machine
**Additional VMs:** none
**Primary Objective:** Marcus has asked for a cron audit on vulcan: list all
scheduled jobs, attribute each to a service or owner, and flag anything that
can't be attributed.
**Linux Concepts:** `crontab -l` (per-user and system), `/etc/cron.d/`,
`/etc/cron.daily/`, `/etc/cron.weekly/`, cron syntax, correlating jobs to
services or owners
**Systems Used:** build\_machine
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Routine cron audit on vulcan. List everything that's
scheduled — root crontab, system crontab, all of cron.d. I want to know who
owns each job and whether it still makes sense. Anything you can't attribute,
flag it."

**Clue Trail:**
- `crontab -l` for root and `pipeline-svc` — most jobs are attributable
- `/etc/cron.d/` directory — standard entries plus one named `axiomworks-collect`
- `axiomworks-collect` job runs at 2:57am; command: `/usr/local/bin/axiomworks-collect`
- The binary `/usr/local/bin/axiomworks-collect` exists and is executable
- No ticket, no documentation comment in the cron file itself, no recent entry
  in any change log

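Files under `/etc/cron.d/` use the system crontab layout: five time fields, then the user to run as, then the command. The Python sketch below parses that layout and flags jobs whose command has no known owner. The `root` user in the sample line and the owner mapping are assumptions, since the quest does not say which user the job runs as; `@daily`-style shorthand is deliberately not handled.

```python
from typing import Dict, List, NamedTuple, Optional


class CronEntry(NamedTuple):
    schedule: str
    user: str
    command: str


def parse_cron_d_line(line: str) -> Optional[CronEntry]:
    """Parse one /etc/cron.d line; return None for comments, blanks, and env lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    parts = stripped.split(None, 6)
    if len(parts) < 7:
        return None  # e.g. PATH=... assignments or @shorthand lines (not handled here)
    return CronEntry(" ".join(parts[:5]), parts[5], parts[6])


def unattributed(entries: List[CronEntry], owners: Dict[str, str]) -> List[CronEntry]:
    """Return entries whose executable path has no recorded owner."""
    return [e for e in entries if e.command.split()[0] not in owners]


# Sample line: schedule and path are from the clue trail; the user is assumed.
entry = parse_cron_d_line("57 2 * * * root /usr/local/bin/axiomworks-collect")
```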
**Solution Branches:**

Branch 1 — Thorough, with investigation (priority 100): Player lists all jobs,
attributes each, and for `axiomworks-collect`: runs `file` and `strings` on the
binary to understand what it does before flagging it — the binary name is
suggestive and a thorough audit would check it. Submits complete report including
what the binary calls. `trust_delta: +2`. Flags: `axiomworks_collect_cron_flagged`.
Follow-up ticket: T016.

Branch 2 — Listed but not investigated (priority 60): Player lists all jobs,
flags `axiomworks-collect` as unattributed, but does not inspect the binary.
Report is honest but shallow. `trust_delta: +1`. Flags: `axiomworks_collect_noted`.

Branch 3 — Incomplete list (priority 10): Player misses entries. Marcus follows
up. `trust_delta: -1`.

**Hidden Hook:** Running `strings /usr/local/bin/axiomworks-collect` or
`ldd /usr/local/bin/axiomworks-collect` and checking its network behavior (or simply
reading any log it writes, if one exists) reveals it connects to an internal address.
The ticket number in the binary's help text, `INT-0194`, is the same one seen in
Q011 and Q013. Sets `hook_collect_binary_INT0194`.
The hook is only set in Branch 1 (player inspected the binary). In Branch 2, the
job is noted but not confirmed. Discoverable by: going one step further than the
ticket requires — investigating what an unattributed job actually does.

**Failure Conditions:** Cron audit submitted without flagging unattributed jobs.

**Behavior Impact:**
- Branch 1: O+1, C+2 (the INT-0194 connection is now three sightings)
- Branch 2: O+0
- Hook discovered: C+2 (already in Branch 1 impact)

---

**Quest ID:** Q016
|
||
**Title:** The Door Left Open
|
||
**Narrative Phase:** Unease
|
||
**Tier:** 2
|
||
**Primary VM:** web\_server
|
||
**Additional VMs:** none
|
||
**Primary Objective:** A security scan found port 8080 on hermes reachable from
|
||
outside the office network. That port runs the AxiomFlow admin panel. Restrict
|
||
it to internal-only access and confirm.
|
||
**Linux Concepts:** `ufw`, `iptables`, `ss -tlnp`, nginx access control by IP
|
||
(`allow`/`deny`), CIDR notation, defense-in-depth (firewall + application layer)
|
||
**Systems Used:** web\_server
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "Scan from this morning. Port 8080 on hermes is reachable
|
||
externally. That's the admin panel. It should be internal-only — restrict to
|
||
10.0.0.0/8. Confirm when done."
|
||
|
||
**Clue Trail:**
|
||
- `ss -tlnp | grep 8080` — service listening on `0.0.0.0:8080`
|
||
- `ufw status` — no restriction on port 8080
|
||
- Fix options: `ufw` rule restricting source to 10.0.0.0/8, or nginx `allow 10.0.0.0/8; deny all;`
|
||
in the 8080 server block, or both
|
||
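
The first clue, a listener bound to every interface, can be checked mechanically. The sketch below parses text shaped like `ss -tln` output; the sample rows are illustrative rather than captured from hermes, and the `wildcard_listener` helper is a hypothetical name.

```shell
# Illustrative ss -tln style output (columns: State, Recv-Q, Send-Q, Local, Peer).
ss_sample='State  Recv-Q Send-Q Local Address:Port Peer Address:Port
LISTEN 0      511    0.0.0.0:8080       0.0.0.0:*
LISTEN 0      128    127.0.0.1:5432     0.0.0.0:*
LISTEN 0      128    0.0.0.0:22         0.0.0.0:*'

wildcard_listener() {
    # Reads ss output on stdin; prints the local address when port $1
    # is bound to all interfaces (IPv4 or IPv6 wildcard).
    awk -v port="$1" '
        $1 == "LISTEN" && ($4 == ("0.0.0.0:" port) || $4 == ("*:" port) || $4 == ("[::]:" port)) { print $4 }
    '
}

exposed=$(printf '%s\n' "$ss_sample" | wildcard_listener 8080)
printf 'wildcard listener: %s\n' "$exposed"

# Possible firewall fix on the real host (ufw evaluates rules in order, so allow first):
#   sudo ufw allow from 10.0.0.0/8 to any port 8080 proto tcp
#   sudo ufw deny 8080/tcp
# The nginx layer would add `allow 10.0.0.0/8; deny all;` to the 8080 server block.
```

A wildcard bind is not a fault in itself; it only means that the firewall and the application layer are the sole things limiting who can reach the port, which is why Branch 1 rewards restricting both.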
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Defense in depth (priority 100): Restricts at both firewall and nginx
|
||
layer, confirms external access blocked, internal access works, reports to Priya.
|
||
`trust_delta: +2`. Flags: `hermes_admin_port_secured`. Follow-up ticket: T017.
|
||
|
||
Branch 2 — Single layer (priority 60): Restricts at one layer only. Better.
|
||
Not layered. `trust_delta: +1`. Priya notes the single-layer approach.
|
||
|
||
Branch 3 — Block entirely (priority 20): Blocks port for all traffic. Admin
|
||
panel inaccessible to everyone including internal users. `trust_delta: -1`.
|
||
|
||
**No hidden hook** in this quest. The technical task is clean.
|
||
|
||
**Failure Conditions:** Port still accessible externally; internal access broken;
|
||
ufw rules in conflict.
|
||
|
||
**Behavior Impact:**
|
||
- Defense in depth: O+1
|
||
- Block entirely: R+1
|
||
|
||
---
|
||
|
||
### PHASE 3 — SUSPICION (Q017–Q024)

Tier 2. Minimal guidance. Tickets state the problem, not the location. The clue
trail requires following evidence without direction. Branch tolerance is stricter.
Hook density increases: 5 hooks across 8 quests.
|
||
|
||
---
|
||
|
||
**Quest ID:** Q017
|
||
**Title:** Access Without a Ticket
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** web\_server
|
||
**Additional VMs:** workstation
|
||
**Primary Objective:** hermes's auth log shows SSH connections from an IP address
|
||
not in the asset inventory. Investigate: what account was used, what activity
|
||
occurred, is access still happening.
|
||
**Linux Concepts:** `/var/log/auth.log`, `grep` and log filtering, `last`, `who`,
|
||
`lastlog`, SSH session forensics, correlating authentication events with known assets
|
||
**Systems Used:** web\_server, workstation
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "Auth log review surfaced connections to hermes from 10.0.0.47
|
||
over the past 90 days. That address is not in our asset inventory. I want to know:
|
||
what account, any evidence of what was done, and whether it's ongoing."
|
||
|
||
**Clue Trail:**
|
||
- `/var/log/auth.log` on hermes — `Accepted publickey for pipeline-svc from 10.0.0.47`
|
||
- Sessions: short duration, irregular hours (2–4am), spanning 6 months
|
||
- `~pipeline-svc/.bash_history` — disabled or empty (shell configured with `HISTSIZE=0`)
|
||
- DNS lookup for 10.0.0.47 — no reverse record; DHCP table has no entry
|
||
- `last pipeline-svc` — confirms session dates and source IP
|
||
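
Pulling the relevant sessions out of the auth log is a filtering exercise. The sketch below assumes the traditional syslog line layout that OpenSSH writes to `/var/log/auth.log`; the sample lines, ports, and fingerprints are invented for illustration, and `sessions_from_ip` is a hypothetical helper name.

```shell
# Invented sample lines in the usual sshd syslog layout.
auth_sample='Mar  3 02:14:09 hermes sshd[4121]: Accepted publickey for pipeline-svc from 10.0.0.47 port 51122 ssh2: ED25519 SHA256:EXAMPLEFINGERPRINT
Mar  3 09:01:44 hermes sshd[4388]: Accepted publickey for deploy-user from 10.0.0.12 port 40210 ssh2: ED25519 SHA256:OTHEREXAMPLE
Mar  9 03:41:57 hermes sshd[7710]: Accepted publickey for pipeline-svc from 10.0.0.47 port 51903 ssh2: ED25519 SHA256:EXAMPLEFINGERPRINT
Mar  9 03:44:02 hermes sshd[7710]: pam_unix(sshd:session): session closed for user pipeline-svc'

sessions_from_ip() {
    # Reads auth log text on stdin; prints "month day time account"
    # for every accepted login whose source address is $1.
    awk -v ip="$1" '$6 == "Accepted" && $11 == ip { print $1, $2, $3, $9 }'
}

sessions=$(printf '%s\n' "$auth_sample" | sessions_from_ip 10.0.0.47)
printf '%s\n' "$sessions"
```

On the host itself the same filter would read `/var/log/auth.log` and its rotated copies, and `last pipeline-svc` gives an independent view of the same sessions to check the list against.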
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Full documentation (priority 100): Player documents all sessions (dates,
times, duration), notes history was disabled, traces what can be traced, and reports that
10.0.0.47 is unknown and, if the player has accumulated the relevant hooks, that it also
appears in prior audit anomalies. `trust_delta: +3`. Flags: `unknown_ip_auth_documented`. Priya: "Thank you.
I'll look into that address." Follow-up ticket: T018.
|
||
|
||
Branch 2 — Partial (priority 50): Player documents the sessions but cannot or does
|
||
not trace the IP or connect it to prior findings. `trust_delta: +1`.
|
||
Flags: `unknown_ip_auth_noted`.
|
||
|
||
Branch 3 — Block and close (priority 20): Player blocks the IP at the firewall and
|
||
closes the ticket without full investigation. Access stops; record is thin.
|
||
`trust_delta: 0`. Flags: `unknown_ip_blocked_uninvestigated`. S+1.
|
||
|
||
**Hidden Hook:** `pipeline-svc` is the account from Q011, the one with overly broad
sudo that the player may have scoped down. If `hook_dh_sudo_grant` was set, a
player connecting the dots now knows that whoever had access to that account from
10.0.0.47 previously had `NOPASSWD: ALL`. Sets `hook_pipeline_svc_external_sessions`.
This is not a new discoverable artifact; it is a cross-reference that sets a flag
only if both the Q011 hook and the Q017 investigation are present.
|
||
|
||
**Failure Conditions:** Player doesn't investigate before taking action; evidence
|
||
destroyed before documented.
|
||
|
||
**Behavior Impact:**
|
||
- Full documentation: O+1, C+2 (cross-reference with prior hooks)
|
||
- Block and close: S+1, R+1
|
||
- Cross-reference hook: C+2 (only if `hook_dh_sudo_grant` was set; the connection
|
||
is the behavior, not finding a new artifact)
|
||
|
||
---
|
||
|
||
**Quest ID:** Q018
|
||
**Title:** The User Who Wasn't Onboarded
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** workstation
|
||
**Additional VMs:** web\_server
|
||
**Primary Objective:** A user account exists on both ares and hermes with no
|
||
corresponding HR record. Investigate the account's history and scope before removal.
|
||
**Linux Concepts:** Cross-host account audit, `last` and `lastlog`, `find / -user`,
|
||
`id`, account removal across multiple hosts with `userdel`
|
||
**Systems Used:** workstation, web\_server
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "Access review surfaced account `rford` on both the workstation
|
||
and the web server. HR has no record of this person. The account has had recent
|
||
activity on hermes. Full audit before removal."
|
||
|
||
**Clue Trail:**
|
||
- Account on both machines; `last rford` on hermes shows login 3 weeks ago
|
||
- Files owned by `rford` on hermes: `find /var/www /etc -user rford` — one result:
|
||
`/var/www/axiomworks/config/.rford_run` — a shell script
|
||
- The script, if read, runs a data aggregation command and outputs to a temp directory
|
||
- The account's group memberships include `www-data` — more access than a typical
|
||
employee account
|
||
- No ticket creating the account on either machine
|
||
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Full audit with archive (priority 100): Player checks activity on both
|
||
hosts, reads and archives the found file, checks group memberships, removes account
|
||
from both machines, documents fully. `trust_delta: +3`. Flags: `rford_account_removed_thorough`.
|
||
Follow-up ticket: T019.
|
||
|
||
Branch 2 — Remove without reading (priority 40): Removes account from both machines
|
||
without examining files. Evidence lost. `trust_delta: +1`. Priya asks for the files;
|
||
they're gone. Flags: `rford_account_removed_fast`.
|
||
|
||
Branch 3 — Workstation only (priority 10): Removes from workstation, misses hermes.
|
||
`trust_delta: -1`. Hermes account remains active.
|
||
|
||
**Hidden Hook:** Reading the `.rford_run` script before archiving shows that it
aggregates AxiomFlow session logs and writes the result to a temp directory with
a timestamp. The script has a comment: `# collect step — called by INT-0194 automation`.
Three previous hooks have referenced INT-0194. Sets
`hook_rford_script_INT0194`. Discoverable by: reading the file before archiving,
as proper archival practice requires.
|
||
|
||
**Failure Conditions:** Evidence destroyed without reading; account not removed from
|
||
both machines; player removes account with active processes still running.
|
||
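
One way to avoid the "evidence destroyed" failure is to archive and verify the file before the account is touched. The sketch below uses a temporary directory in place of `/var/www/axiomworks/config/`; the placeholder script contents and the archive name `rford-files.tar.gz` are assumptions made for illustration.

```shell
work=$(mktemp -d)
mkdir -p "$work/config" "$work/evidence"
# Placeholder standing in for the real .rford_run script.
printf '#!/bin/sh\n# placeholder contents\n' > "$work/config/.rford_run"

# Archive the file, then record a checksum next to the archive.
archive="$work/evidence/rford-files.tar.gz"
tar -czf "$archive" -C "$work/config" .rford_run
( cd "$work/evidence" && sha256sum rford-files.tar.gz > rford-files.tar.gz.sha256 )

# Verify the archive against its recorded checksum and list what it holds.
verify=$( cd "$work/evidence" && sha256sum -c rford-files.tar.gz.sha256 )
listed=$(tar -tzf "$archive")
printf '%s\n%s\n' "$verify" "$listed"

# Only after the archive verifies would the account be removed, on both hosts:
#   pgrep -u rford        # confirm no processes are still running as the user
#   sudo userdel rford
```

Recording the checksum at archive time gives Priya something to verify later, which is the difference between Branch 1 and Branch 2.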
|
||
**Behavior Impact:**
|
||
- Full audit: O+1
|
||
- Read the file: C+3 (INT-0194 is now four references — pattern is now clear to
|
||
any player who has been collecting these)
|
||
- Remove without reading: R+2
|
||
|
||
---
|
||
|
||
**Quest ID:** Q019
|
||
**Title:** The Diff That Didn't Match
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** build\_machine
|
||
**Additional VMs:** web\_server
|
||
**Primary Objective:** A deployment validation check is failing because the installed
|
||
package on hermes doesn't match the expected checksum. Investigate why the package
|
||
differs from the tagged source.
|
||
**Linux Concepts:** `dpkg-deb -x`, `diff -r`, `md5sum` / `sha256sum`, package
|
||
integrity verification, comparing installed vs. source artifacts
|
||
**Systems Used:** build\_machine, web\_server
|
||
**Ticket Sender:** Marcus Webb
|
||
**Ticket Summary:** "The post-deploy checksum check on hermes failed. The installed
|
||
axiomflow-workers doesn't match the tagged release checksum. Nikhil says he didn't
|
||
change anything. Find what's different and where the difference came from."
|
||
|
||
**Clue Trail:**
|
||
- `dpkg-deb -x /srv/packages/axiomflow-workers_2.4.2_amd64.deb /tmp/pkg-extract`
|
||
- `diff -r /tmp/pkg-extract /srv/src/axiomflow-workers-2.4.2/` — two files differ
|
||
- The modified files are in the session logging module; they add a secondary logging
|
||
call to a local socket
|
||
- The modification is not in the tagged source commit; it was added to the build
|
||
environment itself — a file in the build script directory that patches sources
|
||
before compilation
|
||
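
The comparison step can be sketched with `diff -rq`, which reports only which files differ between two trees. The directory layout and file names here (`session_log.py`, `version.py`) are hypothetical stand-ins for the extracted package and the tagged source, not the real AxiomFlow tree.

```shell
work=$(mktemp -d)
mkdir -p "$work/pkg/lib" "$work/src/lib"

# Tagged source: one logging call.
printf 'log_session(event)\n' > "$work/src/lib/session_log.py"
# Extracted package: the same file with a secondary call added.
printf 'log_session(event)\nsend_to_socket(event)\n' > "$work/pkg/lib/session_log.py"
# An unchanged file present in both trees.
printf 'VERSION = "2.4.2"\n' | tee "$work/src/lib/version.py" > "$work/pkg/lib/version.py"

# diff -rq prints "Files A and B differ"; the second word is the package-side path.
changed=$(diff -rq "$work/pkg" "$work/src" | awk '{ print $2 }')
printf 'differs: %s\n' "$changed"
```

A plain `diff -r` on the same two directories then shows the added lines themselves, which is the evidence the report needs before anyone traces where the change was injected.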
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Full forensics (priority 100): Player unpacks the package, diffs against
|
||
source, identifies the modified files, traces the modification to the build environment
|
||
patch file, documents the full chain of custody. `trust_delta: +3`. Flags:
|
||
`package_modification_documented`. Marcus: "Keep this to yourself and Priya for now.
|
||
I mean that." Follow-up ticket: T020.
|
||
|
||
Branch 2 — Diff identified, source not traced (priority 50): Player confirms the
|
||
diff exists but characterizes it as a build environment artifact without tracing the
|
||
cause. `trust_delta: +1`. Flags: `package_diff_found`.
|
||
|
||
Branch 3 — Confirm and escalate without analysis (priority 20): Player confirms
|
||
something is different and escalates without characterizing what. `trust_delta: 0`.
|
||
|
||
**Hidden Hook:** The patch file in the build environment that injects the modification
|
||
has a comment at the top: `# INT-0194 session capture — do not remove`. The fifth
|
||
reference to the same internal ticket number. Sets `hook_build_patch_INT0194`.
|
||
Discoverable by: tracing the modification source, which Branch 1 requires.
|
||
|
||
**Failure Conditions:** Player attempts to remove the patch without consulting Marcus
first. R+3, and Marcus's response is cooler: removing evidence before it's
documented is a problem.
|
||
|
||
**Behavior Impact:**
|
||
- Full forensics: O+1, C+3
|
||
- Remove patch unilaterally: R+3, S+1
|
||
- Hook discovered: C+3 (already in full-forensics impact)
|
||
|
||
**Narrative Notes:** This is the moment the INT-0194 pattern resolves for a thorough
|
||
player. Five references across different systems, all pointing to the same internal
|
||
ticket ID. Marcus's response is his quietest and most deliberate. He says less than
|
||
normal, which means more.
|
||
|
||
---
|
||
|
||
**Quest ID:** Q020
|
||
**Title:** Pressure From Above
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** workstation
|
||
**Additional VMs:** none
|
||
**Primary Objective:** Kowalski has requested a written access and change summary
|
||
for the past 30 days before a scheduled status meeting. Compile it accurately from
|
||
system logs.
|
||
**Linux Concepts:** `journalctl`, `last`, `/var/log/auth.log`, log filtering by
|
||
date range, compiling a change record from system state evidence
|
||
**Systems Used:** workstation
|
||
**Ticket Sender:** Dave Kowalski
|
||
**Ticket Summary:** "Before Thursday's check-in I need the following in writing:
|
||
access grants issued in the past 30 days, configuration changes to production or
|
||
staging, and open incidents. Pull it from the logs. End of day Wednesday."
|
||
|
||
**Clue Trail:**
|
||
- Player reads auth logs, systemd journals, and any change log Marcus maintains
|
||
- Accurate log reading requires: `journalctl --since "30 days ago"`, `last`, reviewing
|
||
Priya's shift review emails for documented changes
|
||
- The technical work is real — log compilation at this scale requires knowing the
|
||
right tools
|
||
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Complete and accurate (priority 100): Player includes all documented
|
||
activity including any anomalies that surfaced through tickets. `trust_delta: +2`.
|
||
Flags: `kowalski_report_accurate`. Marcus sends a brief private note: "Good call
|
||
being complete." Follow-up ticket: T021.
|
||
|
||
Branch 2 — Accurate but narrow (priority 60): Report includes only ticket-related
|
||
activity; omits anomalies that came up during investigation. Accurate; incomplete.
|
||
`trust_delta: +1`. Flags: `kowalski_report_narrow`.
|
||
|
||
Branch 3 — Omits or sanitizes (priority 10): Player downplays or omits anomalies
|
||
that would raise questions. `trust_delta: -2`. Flags: `kowalski_report_sanitized`.
|
||
S+3 (Priya will eventually compare this against log evidence and notice the gaps).
|
||
|
||
**Failure Conditions:** Report submitted without log evidence; report materially
|
||
inaccurate.
|
||
|
||
**Behavior Impact:**
|
||
- Complete: O+2
|
||
- Sanitized: R+3, S+3
|
||
|
||
---
|
||
|
||
**Quest ID:** Q021
|
||
**Title:** The Backup That Wasn't Tested
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** build\_machine
|
||
**Additional VMs:** web\_server
|
||
**Primary Objective:** The last documented backup restore test for hermes is 14
|
||
months old. Perform a restore test of a non-critical service config directory,
|
||
document the procedure, and report the result honestly.
|
||
**Linux Concepts:** `rsync`, `tar`, backup archive integrity, `sha256sum` verification,
|
||
restore testing to a non-production location, documenting backup procedures
|
||
**Systems Used:** build\_machine, web\_server
|
||
**Ticket Sender:** Marcus Webb
|
||
**Ticket Summary:** "Our backup procedure calls for a quarterly restore test.
|
||
The last documented test is 14 months old. Pick a non-critical config directory
|
||
on hermes, verify the backup can be restored to a test location, and document
|
||
the steps and the result. Don't touch production paths."
|
||
|
||
**Clue Trail:**
|
||
- Backups at `/srv/backups/hermes/` on vulcan — recent archive looks intact
|
||
- Checksum file present; most checksums match
|
||
- One archive from 5 months ago: checksum does not match a recalculated value
|
||
— the archive file was modified after initial creation (timestamps show a
|
||
modification date after the archive date)
|
||
- Recent archive (3 days old) restores cleanly to `/tmp/restore-test/`
|
||
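
The verify-then-restore sequence can be sketched end to end in a scratch directory. Everything here is illustrative: the archive names, the `SHA256SUMS` file name, and the sample config are assumptions, and the tampering is simulated by appending bytes to one archive after its checksum was recorded.

```shell
backups=$(mktemp -d)
mkdir "$backups/etc-sample"
printf 'worker_processes 2;\n' > "$backups/etc-sample/app.conf"

# Two archives of the same sample directory, plus a recorded checksum file.
tar -czf "$backups/hermes-old.tar.gz" -C "$backups" etc-sample
tar -czf "$backups/hermes-recent.tar.gz" -C "$backups" etc-sample
( cd "$backups" && sha256sum hermes-old.tar.gz hermes-recent.tar.gz > SHA256SUMS )

# Simulate an archive being altered after its checksum was recorded.
printf 'tampered' >> "$backups/hermes-old.tar.gz"

# Recalculate and compare; keep only the names that no longer match.
failed=$( cd "$backups" && sha256sum -c SHA256SUMS 2>/dev/null | awk -F': ' '$2 == "FAILED" { print $1 }' )

# Restore the archive that verified into a scratch directory, never a production path.
mkdir "$backups/restore-test"
tar -xzf "$backups/hermes-recent.tar.gz" -C "$backups/restore-test"
restored=$(cat "$backups/restore-test/etc-sample/app.conf")
printf 'failed: %s\nrestored: %s\n' "$failed" "$restored"
```

For an archive that fails verification, GNU `stat -c '%y'` on the file gives its modification time, which can be compared with the date the backup was supposed to have been taken.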
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Thorough (priority 100): Player identifies the corrupted older archive
|
||
(and notes the timestamp anomaly), successfully restores from the recent clean
|
||
archive, documents both findings — the corruption and the successful restore.
|
||
`trust_delta: +2`. Flags: `backup_restore_tested`. Follow-up ticket: T022.
|
||
|
||
Branch 2 — Restore only (priority 50): Player tests the restore successfully but
|
||
misses the corrupted archive. Report is optimistic. `trust_delta: +1`.
|
||
Flags: `backup_restore_partial_test`.
|
||
|
||
Branch 3 — False report (priority 10): Player documents a successful test without
|
||
actually running it. `trust_delta: -3`. Flags: `backup_test_falsified`. S+2.
|
||
Priya's next audit will check against log evidence.
|
||
|
||
**Hidden Hook:** The archive with the checksum mismatch was last modified at 3:17am,
the same 3am activity window as the off-schedule builds and the cron job. The
timestamps recorded inside the archive (from `tar -tvf <archive>`) show the files were
correct at backup time; the outer archive file was replaced later. Sets
`hook_backup_archive_tampered`. Discoverable by: noticing the timestamp anomaly on
the corrupted archive, which requires checking modification timestamps on the
archive files themselves.
|
||
|
||
**Failure Conditions:** Restore test not actually run; player modifies production
|
||
paths; report falsified.
|
||
|
||
**Behavior Impact:**
|
||
- Thorough: O+1
|
||
- False report: R+4, S+2
|
||
- Hook discovered: C+2
|
||
|
||
---
|
||
|
||
**Quest ID:** Q022
|
||
**Title:** The Firewall Rule
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** workstation
|
||
**Additional VMs:** none
|
||
**Primary Objective:** A firewall rule on the workstation allows outbound connections
|
||
to a specific internal IP on a specific port. No ticket references this rule.
|
||
Investigate whether it is legitimate; remove it if not.
|
||
**Linux Concepts:** `ufw status numbered`, `iptables -L -n`, firewall rule audit,
|
||
rule provenance (when was it added, can it be traced), `ufw delete`
|
||
**Systems Used:** workstation
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "Firewall audit on the workstation found a rule allowing outbound
|
||
to 10.0.0.47:9301. No ticket references it. I need: when was it added, do you know
|
||
what that address is, and a recommendation."
|
||
|
||
**Clue Trail:**
|
||
- `ufw status numbered` — rule present, destination 10.0.0.47 port 9301
|
||
- Rule creation date cannot be directly queried from ufw; `journalctl` shows when
|
||
ufw last reloaded; system logs from that period may show the rule being added
|
||
- 10.0.0.47 appears in Q017's auth log investigation; 9301 appeared in Q010's nginx
|
||
proxy block — for a player who has been paying attention
|
||
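
Locating the rule number is the step that makes a later `ufw delete` safe. The sketch below parses text laid out approximately like `ufw status numbered` output; the exact column layout varies between ufw versions, the sample rows are invented, and `rule_numbers_for` is a hypothetical helper.

```shell
# Approximate layout of `ufw status numbered` output; rows are illustrative.
ufw_sample='Status: active

     To                         Action      From
     --                         ------      ----
[ 1] 22/tcp                     ALLOW IN    10.0.0.0/8
[ 2] 10.0.0.47 9301             ALLOW OUT   Anywhere (out)'

rule_numbers_for() {
    # Reads ufw status text on stdin; prints the number of each rule mentioning address $1.
    grep -F "$1" | sed -n 's/^\[ *\([0-9][0-9]*\)\].*/\1/p'
}

rule=$(printf '%s\n' "$ufw_sample" | rule_numbers_for '10.0.0.47')
printf 'rule number: %s\n' "$rule"

# After the rule is documented and the recommendation accepted:
#   sudo ufw delete "$rule"
```

Rule numbers shift after every deletion, so the number should be looked up again immediately before each `ufw delete`.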
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Full cross-reference (priority 100): Player connects this rule to
|
||
prior findings (10.0.0.47 from auth logs; port 9301 from nginx config), explains
|
||
the connection, recommends removal, removes the rule with `ufw delete`, reports.
|
||
`trust_delta: +3`. Flags: `firewall_rule_9301_removed`. Priya: "That matches what
|
||
I've been seeing." Follow-up ticket: T023.
|
||
|
||
Branch 2 — Remove without context (priority 50): Player removes the rule but
|
||
doesn't connect it to prior findings. `trust_delta: +1`. Flags: `firewall_rule_removed`.
|
||
|
||
Branch 3 — Keep with note (priority 20): Documents the rule as "unverified" and
|
||
leaves it. `trust_delta: 0`.
|
||
|
||
**Failure Conditions:** Rule not assessed; player introduces new firewall problems.
|
||
|
||
**Behavior Impact:**
|
||
- Full cross-reference: O+1, C+3 (this is the convergence point for three prior data threads)
|
||
- Remove without context: O+0
|
||
- Hook: no new hook — the cross-reference IS the payoff for accumulated hooks
|
||
|
||
---
|
||
|
||
**Quest ID:** Q023
|
||
**Title:** Overnight Changes
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** web\_server
|
||
**Additional VMs:** none
|
||
**Primary Objective:** Files on hermes were modified at 3am on Thursday with no
|
||
corresponding change ticket. Find what changed and assess whether to revert.
|
||
**Linux Concepts:** `find / -newer <reference_file>`, `stat`, file modification
|
||
timestamps, config file comparison, `git diff` if applicable, change ticket
|
||
correlation
|
||
**Systems Used:** web\_server
|
||
**Ticket Sender:** Marcus Webb
|
||
**Ticket Summary:** "Something touched files on hermes at 3am Thursday. The
|
||
backup ran at 2am and files weren't changed then. Find what changed and tell
|
||
me if we need to revert."
|
||
|
||
**Clue Trail:**
|
||
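- `find /etc /var/www -newer /var/log/axiomflow/nightly-backup.log -not -newer /var/log/nginx/access.log`
narrows to files modified after the 2am backup finished (the access log is written
continuously, so it gives only a loose upper bound); `stat` on each hit confirms the 3am modification time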
|
||
- Modified files:
|
||
1. `/var/www/axiomworks/config/app.config` — database connection string now
|
||
points to a secondary host
|
||
2. `/etc/nginx/conf.d/upstream.conf` — upstream block added for the same host
|
||
- The secondary host referenced is not in the known asset inventory
|
||
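
When no convenient log file marks the end of the window, two reference files with explicit timestamps give `find` a precise lower and upper bound. This is a sketch under assumptions: it relies on GNU `touch -d`, the dates are invented (2024-01-04 was a Thursday), and the file names in the scratch directory are placeholders.

```shell
root=$(mktemp -d)
mkdir "$root/etc"

# Reference files marking the start and end of the window of interest.
touch -d '2024-01-04 02:00:00' "$root/window-start"
touch -d '2024-01-04 04:00:00' "$root/window-end"

# Three sample files: one older, one inside the window, one edited later.
touch -d '2024-01-03 18:30:00' "$root/etc/unchanged.conf"
touch -d '2024-01-04 03:05:00' "$root/etc/upstream.conf"
touch -d '2024-01-04 09:15:00' "$root/etc/later-edit.conf"

# Newer than the start marker AND not newer than the end marker.
in_window=$(find "$root/etc" -type f -newer "$root/window-start" ! -newer "$root/window-end")
printf 'modified in window: %s\n' "$in_window"
```

Running `stat` on each result then records the exact modification time alongside the file path for the report.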
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Revert and document (priority 100): Identifies both files, reverts
|
||
both to known-good state (from the backup taken just before the modifications),
|
||
documents the original changes with file evidence, reports fully. `trust_delta: +3`.
|
||
Flags: `hermes_overnight_changes_reverted`. Marcus: "Revert was right. Don't
|
||
touch anything else." Follow-up ticket: T024.
|
||
|
||
Branch 2 — Document without revert (priority 50): Identifies changes and reports
|
||
but leaves them active. `trust_delta: +1`. Changes remain.
|
||
|
||
Branch 3 — Revert without documenting (priority 20): Reverts but doesn't record
|
||
what was changed. `trust_delta: +0.5`.
|
||
|
||
**Hidden Hook:** The secondary host in both modified files is at `10.0.1.15` — a
|
||
different IP from 10.0.0.47. Two machines. Sets `hook_second_host_10_0_1_15`.
|
||
Discoverable by: recording the specific values in the modified files, which proper
|
||
documentation requires.
|
||
|
||
**Failure Conditions:** Changes not assessed; player reverts production paths
|
||
without confirming impact; modifications left active without escalation.
|
||
|
||
**Behavior Impact:**
|
||
- Revert and document: O+1, C+1 (new IP is a new data point)
|
||
- Revert without documenting: O+0
|
||
- Hook discovered: C+1
|
||
|
||
---
|
||
|
||
**Quest ID:** Q024
|
||
**Title:** The Audit Window
|
||
**Narrative Phase:** Suspicion
|
||
**Tier:** 2
|
||
**Primary VM:** workstation
|
||
**Additional VMs:** web\_server, build\_machine
|
||
**Primary Objective:** Priya is conducting a formal access audit. Verify current
|
||
access levels and service account configurations on all three machines against
|
||
the documented expected state.
|
||
**Linux Concepts:** Cross-host audit, `getent passwd`, `sudo -l`, `groups`, SSH
|
||
`authorized_keys` review, service account scope verification
|
||
**Systems Used:** workstation, web\_server, build\_machine
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "Formal audit. Every service account across all three machines:
|
||
privileges, group memberships, sudo grants, SSH keys in authorized_keys. Compare
|
||
against the baseline I've attached. Flag everything that doesn't match."
|
||
|
||
**Clue Trail:**
|
||
- Audit covers all three machines systematically
|
||
- Findings depend on what the player has fixed and what they've left open
|
||
- Dale's deploy key on hermes (Q001/Q003 hook) — if not removed, it's a live finding
|
||
- `pipeline-svc` sudo scope — if Q011 was only partially fixed, it appears here
|
||
- `axiomworks-telemetry` service — if Q013 found it, it's in the player's record;
|
||
if not, it's a new finding here
|
||
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Thorough (priority 100): Player audits all three machines, identifies
every discrepancy, includes Dale's key if still present, submits complete
cross-referenced report. `trust_delta: +3`. Flags: `formal_audit_complete`. Priya: "This
is complete. I'll schedule a follow-up with Marcus." Follow-up ticket: T025.
|
||
|
||
Branch 2 — Partial (priority 50): Misses 1–2 findings. `trust_delta: +1`. Priya
|
||
follows up specifically on each gap.
|
||
|
||
Branch 3 — Surface-level (priority 10): Misses most findings. `trust_delta: -1`.
|
||
|
||
**No hidden hook** in this quest — the audit produces findings based on the world
|
||
state, not new anomalies.
|
||
|
||
**Failure Conditions:** Audit submitted with material inaccuracies.
|
||
|
||
**Behavior Impact:**
|
||
- Thorough: O+2
|
||
- Dale's key found if not previously: C+1
|
||
|
||
---
|
||
|
||
### PHASE 4 — INVESTIGATION (Q025–Q032)

Tier 3. Problem-solving only. Tickets state the problem, no location, no approach.
The player is expected to apply their full toolkit. Hook density: 3 hooks across
8 quests, each requiring cross-referencing prior findings.
|
||
|
||
---
|
||
|
||
**Quest ID:** Q025
|
||
**Title:** Who Owns the Key
|
||
**Narrative Phase:** Investigation
|
||
**Tier:** 3
|
||
**Primary VM:** web\_server
|
||
**Additional VMs:** workstation
|
||
**Primary Objective:** Following the formal audit, trace the origin of Dale's
SSH key in deploy-user's authorized_keys: when it was added, by what session,
and when it was last used.
|
||
**Linux Concepts:** `ssh-keygen -lf` (fingerprinting), `/var/log/auth.log` grep for
|
||
fingerprint, correlation with session timestamps, absence of key from official inventory
|
||
as a finding
|
||
**Systems Used:** web\_server, workstation
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "The key in deploy-user's authorized_keys that doesn't have
|
||
a current employee match. I need provenance: when added, what session, last used.
|
||
Don't remove it yet. Document first."
|
||
|
||
**Clue Trail:**
|
||
- `ssh-keygen -lf /home/deploy-user/.ssh/authorized_keys` — fingerprint of the Dale key
|
||
- `grep <fingerprint> /var/log/auth.log` on hermes — sessions that authenticated
|
||
with this key; last session 5 months ago
|
||
- The session that added the key: `/var/log/auth.log` doesn't show key addition,
|
||
but a root session from `10.0.0.47` at the right timestamp aligns (if Q017 was
|
||
investigated, the player can correlate)
|
||
- The key is not in any official key inventory document
|
||
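
Tracing key usage comes down to matching one fingerprint across log lines. The sketch assumes sshd is logging the key fingerprint on each `Accepted publickey` line, as OpenSSH does at its default log level; the fingerprint, source addresses, and dates are invented placeholders.

```shell
# On the real host the fingerprint would come from something like:
#   ssh-keygen -lf /home/deploy-user/.ssh/authorized_keys | awk '{ print $2 }'
fingerprint='SHA256:EXAMPLEDALEKEYFINGERPRINT'

# Invented sample of auth log lines for deploy-user.
auth_sample='Jan 12 22:41:03 hermes sshd[2210]: Accepted publickey for deploy-user from 10.0.0.31 port 50112 ssh2: ED25519 SHA256:EXAMPLEDALEKEYFINGERPRINT
Jan 20 10:05:17 hermes sshd[3307]: Accepted publickey for deploy-user from 10.0.0.12 port 41876 ssh2: ED25519 SHA256:SOMEOTHERKEY
Feb  2 23:58:40 hermes sshd[5120]: Accepted publickey for deploy-user from 10.0.0.31 port 50340 ssh2: ED25519 SHA256:EXAMPLEDALEKEYFINGERPRINT'

# Every session authenticated with this key, then the count and the most recent one.
key_sessions=$(printf '%s\n' "$auth_sample" | grep -F "$fingerprint")
session_count=$(printf '%s\n' "$key_sessions" | wc -l | tr -d ' ')
last_use=$(printf '%s\n' "$key_sessions" | tail -n 1 | awk '{ print $1, $2, $3 }')
printf 'sessions: %s, last use: %s\n' "$session_count" "$last_use"
```

Because `authorized_keys` can hold several keys, `ssh-keygen -lf` prints one fingerprint per line; the comment field on each line is what identifies which one is Dale's.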
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Full provenance (priority 100): Player fingerprints, traces sessions,
|
||
correlates add timestamp with known session data, notes the key's absence from
|
||
official inventory, produces a complete chain. `trust_delta: +3`. Flags:
|
||
`dale_key_provenance_documented`. Marcus sends a message outside normal ticket
|
||
channels — a Slack message, same terse voice, one sentence longer than usual.
|
||
Follow-up ticket: T026.
|
||
|
||
Branch 2 — Sessions documented, source not traced (priority 50): Finds session
|
||
history but cannot attribute who added the key. `trust_delta: +1`.
|
||
|
||
**Hidden Hook:** The most recent session authenticated with this key was on a
date that maps to a known incident: the same date hermes had an unexplained outage
5 months ago, visible in the nginx error logs. A player who correlates the auth log
date with the nginx error log from the same timeframe can connect Dale's last known
access to a specific event. Sets `hook_dale_key_last_session_incident_date`.
Discoverable by: cross-referencing auth log dates with nginx error log dates. This is not
required to complete the provenance chain, but it is available to a player who thinks to check.
|
||
|
||
**Failure Conditions:** Player removes the key before documenting; Priya explicitly
|
||
said not to.
|
||
|
||
**Behavior Impact:**
|
||
- Full provenance: O+1, C+2
|
||
- Remove before documenting: R+3, S+2
|
||
- Hook discovered: C+1
|
||
|
||
---
|
||
|
||
**Quest ID:** Q026
|
||
**Title:** The Build Chain
|
||
**Narrative Phase:** Investigation
|
||
**Tier:** 3
|
||
**Primary VM:** build\_machine
|
||
**Additional VMs:** none
|
||
**Primary Objective:** Reconstruct the full build pipeline modification history
|
||
on vulcan for the past 12 months. Attribute each change to a person or session.
|
||
Flag any changes without a corresponding official release.
|
||
**Linux Concepts:** `git log`, `git diff`, `git blame`, file system timestamps,
|
||
bash history correlation, build script comparison, release note cross-reference
|
||
**Systems Used:** build\_machine
|
||
**Ticket Sender:** Marcus Webb
|
||
**Ticket Summary:** "I need a complete history of every change to the build scripts
|
||
on vulcan over the past year. Where you can, attribute each change to a person.
|
||
Cross-reference with release notes. Anything without a release: flag it."
|
||
|
||
**Clue Trail:**
|
||
- Build scripts are in a git repository on vulcan
|
||
- `git log --all --oneline --since="1 year ago"` — full history
|
||
- Most commits: legitimate, attributed to Nikhil Sharma
|
||
- Three anomalous commits:
1. Removal of `sign-package` step, committed by the `pipeline-svc` account (not a person)
2. Addition of the build-time patch file (`INT-0194` reference), also committed by `pipeline-svc`
3. Addition of `axiomflow-audit-bridge` to the build target list, also committed by `pipeline-svc`
|
||
- None of these three have corresponding release notes
|
||
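
`git log --oneline` does not show authors, so attribution needs a custom format. The sketch below filters text shaped like the output of `git log --all --since="1 year ago" --date=short --format='%h|%an|%ad|%s'`; the hashes, dates, and two of the subjects are invented, and `service_commits` is a hypothetical helper.

```shell
# Illustrative log text in hash|author|date|subject form.
git_log_sample='a1b2c3d|Nikhil Sharma|2024-03-02|Bump worker version to 2.4.2
e4f5a6b|pipeline-svc|2024-02-11|Remove sign-package step
c7d8e9f|Nikhil Sharma|2024-01-20|Fix lint target'

service_commits() {
    # Prints hash, date and subject for commits whose author name is exactly $1.
    awk -F'|' -v who="$1" '$2 == who { print $1, $3, $4 }'
}

flagged=$(printf '%s\n' "$git_log_sample" | service_commits 'pipeline-svc')
printf '%s\n' "$flagged"
```

A subject line containing `|` would confuse this field split, so a separator that cannot occur in commit messages is safer for a real audit. The author field is also only what the committer configured, so it should be treated as a lead and checked against session logs.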
|
||
**Solution Branches:**
|
||
|
||
Branch 1 — Complete annotated history (priority 100): Player produces a full
|
||
timeline, attributes the three anomalous commits to the `pipeline-svc` service
|
||
account, notes the discrepancy between that account making commits and its stated
|
||
purpose (restart services only), flags all three as undocumented. `trust_delta: +3`.
|
||
Flags: `build_chain_audit_complete`. Follow-up ticket: T027.
|
||
|
||
Branch 2 — Partial (priority 50): Covers legitimate changes, flags some but not
|
||
all anomalous ones. `trust_delta: +1`.
|
||
|
||
**No hidden hook** in this quest — the findings are the point.
|
||
|
||
**Failure Conditions:** Report submitted without flagging anomalous commits;
|
||
player modifies the git history.
|
||
|
||
**Behavior Impact:**
|
||
- Complete: O+1, C+2
|
||
- Modify git history: R+5 (destroying forensic evidence)
|
||
|
||
---
|
||
|
||
**Quest ID:** Q027
|
||
**Title:** Asset Inventory Reconciliation
|
||
**Narrative Phase:** Investigation
|
||
**Tier:** 3
|
||
**Primary VM:** build\_machine
|
||
**Additional VMs:** workstation
|
||
**Primary Objective:** Reconcile the internal asset inventory against the actual
|
||
network — every host that should be on the network, verify it is; every host
|
||
that appears on the network, verify it is in the inventory. Document discrepancies.
|
||
**Linux Concepts:** `nmap` (host discovery), `arp -n`, `ping`, internal DNS
|
||
queries (`dig`, `host`), asset inventory document comparison, subnet scanning
|
||
**Systems Used:** build\_machine, workstation
|
||
**Ticket Sender:** Priya Nair
|
||
**Ticket Summary:** "I need the asset inventory reconciled against the actual
network. Scan the 10.0.0.0/23 range. Every host that responds: is it in the
inventory? Every host in the inventory: does it respond? Document every discrepancy."

**Clue Trail:**
- `nmap -sn 10.0.0.0/23` from build\_machine — host discovery scan (covers 10.0.0.0 through 10.0.1.255)
- Known hosts respond as expected (ares, hermes, vulcan, and others from inventory)
- 10.0.0.47 responds — not in the inventory
- 10.0.1.15 responds — not in the inventory (new from Q023's hook for players
who found it, or a new discovery for those who didn't)
- Both have SSH open; 10.0.0.47 has an additional service on port 9301
- DNS resolution returns nothing for either
|
||
|
||
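The reconciliation in this clue trail comes down to two set differences: hosts that respond but are not in the inventory, and hosts in the inventory that do not respond. A minimal sketch, assuming both lists have been reduced to one IP per line; the file names and every sample address except 10.0.0.47 are hypothetical, and the `nmap` pipeline in the comment is only one common way to produce such a list.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Hypothetical inventory export: one IP per line.
printf '%s\n' 10.0.0.10 10.0.0.11 10.0.0.12 10.0.0.20 > "$work/inventory.txt"

# Hypothetical list of responding hosts. On the real network this could come from:
#   nmap -sn -oG - 10.0.0.0/24 | awk '/Status: Up/ {print $2}'
printf '%s\n' 10.0.0.10 10.0.0.11 10.0.0.12 10.0.0.47 > "$work/responding.txt"

# comm requires both inputs to be sorted the same way.
sort -u "$work/inventory.txt"  > "$work/inv.sorted"
sort -u "$work/responding.txt" > "$work/up.sorted"

# comm -13 prints lines unique to the second file; comm -23 prints lines unique to the first.
not_in_inventory="$(comm -13 "$work/inv.sorted" "$work/up.sorted")"
not_responding="$(comm -23 "$work/inv.sorted" "$work/up.sorted")"

echo "Responding but not in inventory:"; echo "$not_in_inventory"
echo "In inventory but not responding:"; echo "$not_responding"

rm -rf "$work"
```

Both result lists belong in the report; an empty list is itself a finding worth stating.
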
**Solution Branches:**

Branch 1 — Complete reconciliation (priority 100): Player documents all hosts,
identifies both unknown hosts, notes the service on 9301 for 10.0.0.47,
cross-references with prior anomalies where relevant, submits a complete
reconciliation report. `trust_delta: +3`. Flags: `asset_inventory_reconciled`.
Priya: "I'm going to need to take this to Kowalski." Follow-up ticket: T028.

Branch 2 — Partial reconciliation (priority 50): Documents inventory hosts,
finds 10.0.0.47 but misses 10.0.1.15 or vice versa. `trust_delta: +1`.

Branch 3 — Probe the unknown hosts (priority 20): Player makes active connection
attempts to services on the unknown hosts beyond identification. `trust_delta: 0`.
R+3. Priya's next message: "I said reconcile, not probe."

**Hidden Hook:** Running the full scan reveals that 10.0.0.47 and 10.0.1.15 have
identical SSH host key fingerprints — they are using the same host key, which
suggests they were provisioned from the same template. Sets
`hook_two_hosts_same_key`. Discoverable by: comparing the SSH fingerprints from
nmap's `ssh-hostkey` script output or from `ssh-keyscan`, rather than just noting
the IPs.

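Spotting the shared host key is a grouping problem: collect hosts by key and report any key seen on more than one host. A rough sketch, assuming input in the three-column shape that `ssh-keyscan` prints (host, key type, base64 key); the key strings below are placeholders rather than real key material, and 10.0.0.10 is a hypothetical known host.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical ssh-keyscan-style output: "<host> <key type> <base64 key>".
# Real input could come from something like:
#   ssh-keyscan -t ed25519 10.0.0.47 10.0.1.15 2>/dev/null
keys="$(cat <<'EOF'
10.0.0.10 ssh-ed25519 AAAAC3placeholderKeyA
10.0.0.47 ssh-ed25519 AAAAC3placeholderKeyB
10.0.1.15 ssh-ed25519 AAAAC3placeholderKeyB
EOF
)"

# Group hosts by key (field 3) and print the host list for any key seen more than once.
shared="$(printf '%s\n' "$keys" | awk '
  {
    if ($3 in hosts) hosts[$3] = hosts[$3] "," $1
    else             hosts[$3] = $1
    count[$3]++
  }
  END { for (k in count) if (count[k] > 1) print hosts[k] }
')"

echo "Hosts sharing a host key: $shared"
```
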
**Failure Conditions:** Scan incomplete; player makes unauthorized connections;
report submitted with known gaps left undisclosed.

**Behavior Impact:**
- Complete: O+1, C+2
- Probe: R+3
- Hook discovered: C+1

---

**Quest ID:** Q028
**Title:** The Archive Restore
**Narrative Phase:** Investigation
**Tier:** 3
**Primary VM:** build\_machine
**Additional VMs:** workstation
**Primary Objective:** A backup archive from 6 months ago is needed for a compliance
audit. Restore it to a staging location on the workstation and confirm its integrity.
The archive is from the previous sysadmin's final working week.
**Linux Concepts:** `tar` (extract, verify), `sha256sum`, archive integrity checking,
restore to non-production path, reading file metadata from within an archive
(`tar -tv`)
**Systems Used:** build\_machine, workstation
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Compliance audit needs the working-directory archive from the
end of last year — it should be in the backup store on vulcan. Restore it to a staging
path on the workstation and confirm the contents are intact. Let me know what's in it."

**Clue Trail:**
- Archive at `/srv/backups/workstation/wd-archive-YYYYMMDD.tar.gz` on vulcan
- `sha256sum` check — archive passes (this one is not the tampered one from Q021)
- `tar -xzf` to `/tmp/restore-staging/` on workstation — succeeds
- Contents: scripts, config fragments, a partial README text file
- The README is fragmentary — it's working notes, not a confession. It references
the INT-0194 deployment and contains a note: "bridge not logging correctly —
check port forwarding." The rest is infrastructure checklists

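The restore steps above can be sketched end to end on a throwaway archive: verify the checksum, list the contents without extracting, then extract to a staging path. This assumes a checksum file was recorded next to the archive at backup time, which the quest text implies but does not state; the archive name and its contents here are hypothetical stand-ins.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
mkdir -p "$work/src/wd" "$work/restore-staging"

# Build a stand-in archive; file names and contents are hypothetical.
printf 'working notes\n' > "$work/src/wd/README"
printf '#!/bin/sh\necho placeholder\n' > "$work/src/wd/sync.sh"
tar -czf "$work/wd-archive.tar.gz" -C "$work/src" wd

# A checksum recorded at backup time (assumed to exist alongside the archive).
( cd "$work" && sha256sum wd-archive.tar.gz > wd-archive.tar.gz.sha256 )

# 1. Integrity: compare the archive against its recorded checksum.
( cd "$work" && sha256sum -c wd-archive.tar.gz.sha256 )

# 2. Inventory without extracting: permissions, owners, sizes, mtimes, names.
tar -tzvf "$work/wd-archive.tar.gz"

# 3. Restore to a non-production staging path. Nothing from the archive is executed.
tar -xzf "$work/wd-archive.tar.gz" -C "$work/restore-staging"

restored="$(cd "$work/restore-staging" && find . -type f | sort)"
echo "$restored"

rm -rf "$work"
```

Listing before extracting keeps the inventory step separate from the restore, so the player can report what is in the archive even if the restore path is not yet approved.
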
**Solution Branches:**

Branch 1 — Restore and full inventory (priority 100): Player restores the archive,
verifies integrity, inventories all contents (including reading the README), reports
to Marcus what's there. `trust_delta: +2`. Flags: `compliance_archive_restored`.
Marcus: "Right. Thank you." Follow-up ticket: T029.

Branch 2 — Restore and integrity check only (priority 50): Verifies the archive
restores cleanly but doesn't inventory contents. `trust_delta: +1`. Marcus asks
what's in it.

Branch 3 — Integrity failure reported (priority 20): Player incorrectly reports
the archive as corrupted without fully testing the restore. `trust_delta: -1`.

**Hidden Hook:** The README fragment mentions INT-0194 and "port forwarding" — if the
player has been collecting the INT-0194 thread, this is the sixth reference. The
working notes also reference a host called `styx` in a routing context. Sets
`hook_archive_readme_INT0194` and `hook_styx_in_routing_context`. Discoverable by:
reading the README file, as a proper inventory of the archive would require.

**Failure Conditions:** Archive not restored; contents not verified; player runs any
scripts found in the archive.

**Behavior Impact:**
- Full inventory: O+1
- Run scripts from archive: R+4 (running unknown code from a previous sysadmin
is exactly the kind of reckless action that should trigger risk)
- Hook discovered: C+2

**Narrative Notes:** This is not "Marcus gives the player Dale's files and asks them
to investigate." It is a compliance archive restore with a legitimate operational
purpose. The player happens to find working notes inside it. The notes are fragmentary
and don't explain everything — they're field notes, not a plot summary. Marcus's
"what's in it" is a routine question after a restore, not an invitation to investigate.

---

**Quest ID:** Q029
**Title:** The Service That Doesn't Belong
**Narrative Phase:** Investigation
**Tier:** 3
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** A systemd service on hermes is running but is not listed in
any deployment manifest or change ticket. Audit what it does, whether it is
currently active, and produce a full service characterization.
**Linux Concepts:** `systemctl show`, `systemd-analyze`, service unit file anatomy,
`lsof`, `ss` for service network connections, `strace` basics, process ownership
**Systems Used:** web\_server
**Ticket Sender:** Priya Nair
**Ticket Summary:** "James found a service on hermes that isn't in any deployment
record. Service name: `axiomflow-bridge`. I need a full characterization: what it
does, what it connects to, when it was installed. Don't stop it. Document first."

**Clue Trail:**
- `systemctl show axiomflow-bridge` — unit file, state, runtime info
- Unit file at `/etc/systemd/system/axiomflow-bridge.service` — `ExecStart` points
to a binary; unit file has `INT-0194` in a comment
- `lsof -p <PID>` — service has open connections to 10.0.0.47:9301
- `ss -tp` — confirms active connection
- Binary at `/usr/local/bin/axiomflow-bridge` — a Go binary; `strings` output
shows internal API paths and the same INT-0194 reference in help text
- Installation date from package metadata or file `mtime` — matches the 3am
activity window

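The unit-file part of this characterization can be done with plain text tools and no effect on the running service. A small sketch on a stand-in unit file: only the `ExecStart` path and the `INT-0194` comment come from the clue trail, and every other line in the sample unit is a hypothetical placeholder. The live-host commands are shown as comments because they need the real machine.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
unit="$work/axiomflow-bridge.service"

# Stand-in unit file; lines other than ExecStart and the comment are placeholders.
cat > "$unit" <<'EOF'
# INT-0194
[Unit]
Description=placeholder description

[Service]
ExecStart=/usr/local/bin/axiomflow-bridge

[Install]
WantedBy=multi-user.target
EOF

# What does the service run?
exec_start="$(grep -E '^ExecStart=' "$unit" | cut -d= -f2-)"

# Which ticket references appear in comment lines?
ticket_refs="$(grep -E '^#' "$unit" | grep -oE 'INT-[0-9]+' | sort -u)"

echo "ExecStart:   $exec_start"
echo "Ticket refs: $ticket_refs"

# Read-only follow-ups on the live host, per the clue trail:
#   systemctl show axiomflow-bridge -p MainPID -p ActiveEnterTimestamp
#   lsof -p <PID>     # open files and sockets for the service process
#   ss -tp            # TCP connections with owning process
#   stat -c '%y' /usr/local/bin/axiomflow-bridge   # binary mtime

rm -rf "$work"
```
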
**Solution Branches:**

Branch 1 — Full characterization (priority 100): Player documents unit file,
binary provenance, network connections, installation date, cross-references with
INT-0194 and 10.0.0.47 from prior findings. `trust_delta: +3`. Flags:
`bridge_service_documented`. Priya: "This is consistent with what I've been
building. Don't stop it yet." Follow-up ticket: T030.

Branch 2 — Partial (priority 50): Documents what the service is and that it
connects out, but doesn't trace the INT-0194 connection or installation date.
`trust_delta: +1`.

Branch 3 — Stops the service (priority 10): Player stops the service despite
explicit instruction not to. `trust_delta: -2`. R+2. S+2. Priya: "I said document
first."

**No additional hidden hook** — the quest itself is the hook resolution for INT-0194.

**Failure Conditions:** Service stopped against instruction; characterization incomplete.

**Behavior Impact:**
- Full characterization: O+1, C+3 (this is the operational confirmation of INT-0194)
- Stop the service: R+2, S+2

---

**Quest ID:** Q030
**Title:** Keep the Lights On
**Narrative Phase:** Investigation
**Tier:** 2
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** The production application on hermes is returning 502 errors.
Fix it. The investigation context is ongoing but the service still needs to run.
**Linux Concepts:** `systemctl`, nginx upstream configuration, application log
reading (`journalctl`, app logs), database connection strings, process restart
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "I know something is happening. I don't know what. But I have
paying customers on a system that is returning 502 errors and I need it running.
Whatever else is going on — please."

**Clue Trail:**
- nginx upstream is timing out — `journalctl -u nginx` shows gateway timeout errors
- Application log shows it is failing to connect to the database
- `/var/www/axiomworks/config/app.config` — database connection string; check
whether it was modified (if Q023's revert was clean, the string is correct; if
not, it may point to the secondary host)
- Standalone root cause if Q023 was clean: the database service on the primary
host is not running — `systemctl status postgresql` shows it crashed overnight
- Fix: restart the database service (or correct the connection string if Q023 was
not fully resolved)

**Solution Branches:**

Branch 1 — Diagnose and fix (priority 100): Player reads nginx and app logs,
identifies the database connection failure, finds the cause (service down or
wrong connection string), applies the correct fix, confirms app is serving.
`trust_delta: +2`. Flags: `hermes_production_restored`. Sarah: "Thank you.
Seriously." Follow-up ticket: T031.

Branch 2 — Service restart without diagnosis (priority 40): Player restarts the
app service without finding the root cause. App comes up temporarily; may fail
again. `trust_delta: 0`.

**No hidden hook.**

**Failure Conditions:** App still returning errors; player makes changes that
worsen the state.

**Behavior Impact:**
- Diagnose and fix: O+2 (maintaining professional duty during investigation is
the behavior being measured)
- Restart without diagnosis: O+0

---

**Quest ID:** Q031
**Title:** The Access Review
**Narrative Phase:** Investigation
**Tier:** 3
**Primary VM:** workstation
**Additional VMs:** web\_server, build\_machine
**Primary Objective:** Kowalski has initiated a formal privileged access review
for all accounts with elevated permissions. The player must compile an accurate
account of every privileged action taken across all three machines in the past
60 days.
**Linux Concepts:** `journalctl` with filters for privileged commands, `sudo` log
reading (`/var/log/auth.log`), audit log compilation, accurate self-reporting of
access history
**Systems Used:** workstation, web\_server, build\_machine
**Ticket Sender:** Dave Kowalski
**Ticket Summary:** "Given recent audit activity, we're conducting a standard
privileged access review. I need from you: every privileged command run in the
past 60 days, every system accessed with elevated permissions, and any temporary
access grants that were used. Standard procedure."

**Clue Trail:**
- Player compiles from their own auth logs and sudo logs on each machine
- Technical skill required: `grep sudo /var/log/auth.log | grep <username>`,
`journalctl _COMM=sudo`
- The review covers what the player actually did; accuracy is the objective

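The `grep` step in this clue trail can be tightened so that it keeps only sudo command records for one account and reduces each to a timestamp and the command that was run. A sketch on sample lines in the usual sudo log format; the account name `player`, the hosts, the times, and the commands are invented for illustration.

```bash
#!/usr/bin/env bash
set -euo pipefail

user="player"   # hypothetical account name

# Stand-in for /var/log/auth.log; all entries below are invented samples.
log="$(cat <<'EOF'
Mar  3 09:14:07 hermes sudo:   player : TTY=pts/0 ; PWD=/home/player ; USER=root ; COMMAND=/usr/bin/systemctl reload nginx
Mar  3 09:20:41 hermes sshd[812]: Accepted publickey for player from 10.0.0.5 port 50122 ssh2
Mar  4 02:58:12 hermes sudo: pipeline-svc : TTY=unknown ; PWD=/ ; USER=root ; COMMAND=/usr/bin/systemctl restart build-agent
Mar  5 16:02:33 hermes sudo:   player : TTY=pts/1 ; PWD=/etc/nginx ; USER=root ; COMMAND=/usr/sbin/nginx -t
EOF
)"

# Keep only sudo command records for this user, then print "timestamp | command".
report="$(printf '%s\n' "$log" \
  | grep -E "sudo: +${user} : .*COMMAND=" \
  | sed -E 's/^([A-Z][a-z]{2} +[0-9]+ [0-9:]{8}) .*COMMAND=(.*)$/\1 | \2/')"

echo "$report"
count="$(printf '%s\n' "$report" | wc -l | tr -d ' ')"
echo "privileged commands for $user: $count"
```

Matching on the `sudo:` prefix and the account field avoids picking up SSH logins or another account's commands that merely mention the username.
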
**Solution Branches:**

Branch 1 — Complete and accurate (priority 100): Player provides a complete,
honest account of all privileged actions, including any investigation-period
access that extended beyond their normal role. `trust_delta: +2`. Flags:
`access_review_passed`. S-2 (accurate self-reporting reduces suspicion).
Follow-up ticket: T032.

Branch 2 — Accurate but minimal framing (priority 60): Report is factually
accurate but frames borderline actions favorably without explicitly misrepresenting
them. `trust_delta: +0.5`. Suspicion unchanged.

Branch 3 — Omits significant actions (priority 10): Player leaves out actions that
would raise questions. `trust_delta: -3`. Flags: `access_review_incomplete`. S+4.
This is a major suspicion trigger.

**Failure Conditions:** Material inaccuracies; privileged commands claimed that
don't match log evidence.

**Behavior Impact:**
- Complete: O+3, S-2
- Omit: R+3, S+4

---

**Quest ID:** Q032
**Title:** Loose Ends
**Narrative Phase:** Investigation
**Tier:** 3
**Primary VM:** web\_server
**Additional VMs:** build\_machine
**Primary Objective:** Before the situation moves to its next phase, Marcus wants
the infrastructure in a known and correct state. Remediate any outstanding
configuration issues on hermes and vulcan, and document the current state.
**Linux Concepts:** Synthesis — all concepts from the campaign applied to
remediation; logrotate, NTP, SSH configuration, repo management, service auditing,
firewall rules
**Systems Used:** web\_server, build\_machine
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Before this goes any further, I want the environment clean.
Everything we've documented as a problem: either fix it or document it as known
and accepted. Do a full pass on hermes and vulcan. Not to cover anything — because
whatever happens next, those machines need to be in a known state."

**Clue Trail:**
- Player reviews world flags representing open issues from prior quests
- Each unresolved issue (logrotate, NTP, nginx config, sudo scope, certbot timer)
is a task in this quest
- The more prior quests were resolved cleanly, the less remediation is needed

**Solution Branches:**

Branch 1 — Clean environment (priority 100): All outstanding issues resolved or
explicitly documented as accepted. Both machines in known, stable state. `trust_delta: +3`.
Flags: `environment_clean`. Marcus: "Good. That's all I needed to know." Follow-up: T033.

Branch 2 — Mostly clean (priority 60): Most issues resolved; a few deferred with
documentation. `trust_delta: +1`. Follow-up: T033.

Branch 3 — Significant gaps (priority 20): Multiple unresolved issues remain
undocumented. `trust_delta: -1`. Follow-up: T033 (game continues regardless).

**No hidden hook.**

**Failure Conditions:** Issues claimed as resolved that are not; new problems
introduced during remediation.

**Behavior Impact:**
- Clean: O+2
- Significant gaps: O+0

---

### PHASE 5 — CONFLICT (Q033–Q040)

Tier 3 throughout. Problem-solving only. Tickets are operational — real work that
intersects with the investigation context. Hook density: 2 hooks, both requiring
prior context to understand.

---

**Quest ID:** Q033
**Title:** The Revocation Request
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** build\_machine
**Additional VMs:** none
**Primary Objective:** Kowalski has asked the player to remove the `pipeline-svc`
account from vulcan before an external audit. The request is operationally
legitimate. The account is also central to the investigation.
**Linux Concepts:** `userdel`, account archival, sudo log preservation, change
documentation, understanding when "do the task" and "do the task well" diverge
**Systems Used:** build\_machine
**Ticket Sender:** Dave Kowalski
**Ticket Summary:** "Before the external auditors arrive Tuesday, please remove
the `pipeline-svc` account from the build machine. It's no longer needed and we
want a clean access list. Handle it before end of day Monday."

**Clue Trail:**
- The account exists; Kowalski's request is valid on its face
- The account is in the player's investigation record from Q011, Q017, Q026, Q031
- Removing it cleanly before an audit removes an evidence trail
- The player can: remove it as requested; archive its logs and sudo history first
and then remove it; or consult Marcus before acting

**Solution Branches:**

Branch 1 — Archive then remove (priority 100): Player archives all sudo logs,
auth log entries, and crontab entries for `pipeline-svc`, documents the account's
investigation relevance, removes the account, reports to Priya that archival was
performed before removal. `trust_delta: +3`. Flags: `pipeline_svc_removed_with_trail`.
Follow-up ticket: T034.

Branch 2 — Remove as instructed (priority 60): Player removes the account without
additional archival. The audit will run cleanly. Evidence is gone. `trust_delta: 0`.
Flags: `pipeline_svc_removed_clean`. O+2 (following instructions is the behavior),
but investigation trail is diminished.

Branch 3 — Ask Marcus before acting (priority 50): Player messages Marcus before
proceeding. Marcus says: "Archive what matters. Then do it." This leads to Branch 1.
`trust_delta: +1`.

Branch 4 — Refuse outright (priority 10): Player refuses without explanation.
`trust_delta: -2`. S+3. Kowalski escalates.

**No hidden hook.**

**Failure Conditions:** Account removed without any archival or documentation;
player creates new accounts instead of removing; player escalates in a way that
creates more noise than the situation requires.

**Behavior Impact:**
- Archive then remove: C+1 (understanding the investigation implications is curiosity)
- Remove as instructed: O+2
- Refuse outright: S+3, R+1

---

**Quest ID:** Q034
**Title:** Two Tickets
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** web\_server
**Additional VMs:** build\_machine
**Primary Objective:** Two tickets arrive simultaneously — one from Marcus (signing
key rotation on vulcan, tonight only) and one from Sarah (production outage on hermes,
immediate). Complete both.
**Linux Concepts:** GPG signing key rotation (vulcan), nginx/application
troubleshooting and service restoration (hermes), sequencing time-sensitive tasks
**Systems Used:** web\_server, build\_machine
**Ticket Sender:** Marcus Webb / Sarah Chen
**Ticket Summary:** Marcus: "The package signing key needs to be rotated tonight —
the window is before 2am. This can't slip to tomorrow; the old key expires at 2am
and builds will break." Sarah: "Production is down. I know Marcus gave you something
tonight. I have customers who cannot access the system."

**Clue Trail:**
- Production outage (hermes): nginx config error from a bad deploy, fixable in 15–20
minutes with correct diagnosis
- Signing key rotation (vulcan): specific sequence — generate new key, update keyring
on vulcan, push new public key to hermes's apt trusted keys, test package verify —
requires 30–40 minutes; cannot be done out of sequence
- Both are doable in the window if hermes is prioritized first

**Solution Branches:**

Branch 1 — Both completed, hermes first (priority 100): Player restores hermes,
then completes the key rotation on vulcan in the correct sequence. Both done before
2am. `trust_delta: +3`. Flags: `conflict_both_resolved`. Sarah: "Thank you."
Marcus: "Good call on sequence." Follow-up ticket: T035.

Branch 2 — Vulcan first, hermes later (priority 40): Completes key rotation,
then restores hermes. Rotation is fine; production was down longer. `trust_delta: +0.5`.
Sarah's follow-up is cooler.

Branch 3 — Hermes only (priority 40): Restores production, misses the key rotation
window. `trust_delta: +0.5`. Marcus's follow-up: "The key expired. I'll need to
extend the window. Don't let that happen again." Builds break overnight.

Branch 4 — Neither, escalates (priority 10): Escalates both. `trust_delta: -2`.

**No hidden hook.**

**Failure Conditions:** Key rotation done out of sequence breaks the trust chain;
player makes hermes worse while fixing it.

**Behavior Impact:**
- Both completed: O+2
- Key rotation out of sequence: R+2

---

**Quest ID:** Q035
**Title:** Log Retention and Archival
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** web\_server
**Additional VMs:** build\_machine, workstation
**Primary Objective:** Priya has requested that all logs relevant to the current
audit period be archived to long-term storage with integrity verification before
any are subject to normal rotation or deletion. Set up the archival across all
three machines.
**Linux Concepts:** Log archival (`tar`, `gzip`), `sha256sum` for integrity,
`rsync` to remote storage, `logrotate` `dateext` and `compress` options,
retention policy implementation in `/etc/logrotate.d/`
**Systems Used:** web\_server, build\_machine, workstation
**Ticket Sender:** Priya Nair
**Ticket Summary:** "Before any logs rotate, I need them archived. All three
machines. Auth logs, systemd journals for relevant services, nginx logs on hermes,
build logs on vulcan. Compress, checksum, and move to the audit storage path I've
specified. Then update logrotate to retain rather than delete during the audit window."

**Clue Trail:**
- Player identifies relevant log files on each machine
- `tar -czf` with `sha256sum` verification; `rsync` to the audit storage path
- `/etc/logrotate.d/` configs need the `rotate` count raised to cover the audit
window (`rotate 0` would delete old logs instead of keeping them), with `dateext`
and `compress` set
- The player's own log archival IS the investigation record — the logs they preserve
are the ones that tell the story

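The archival steps can be sketched on scratch directories: checksum each log, bundle and checksum the archive, verify both layers, then write a retention drop-in. Everything named here is an assumption for illustration: the `hermes-logs` archive name, the sample log contents, the scratch paths standing in for `/var/log` and the audit storage path, and the `rotate 52` value, which is simply a count chosen to outlast a weekly-rotated audit window. The `rsync` transfer to remote storage is left out.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
logs="$work/var-log"        # stand-in for /var/log on one machine
audit="$work/audit-store"   # stand-in for the audit storage path
mkdir -p "$logs/nginx" "$audit"

# Invented sample logs.
printf 'auth entry\n'   > "$logs/auth.log"
printf 'access entry\n' > "$logs/nginx/access.log"

# 1. Per-file checksums first, so each log can be verified individually later.
( cd "$logs" && find . -type f -name '*.log' | sort | xargs sha256sum ) > "$audit/hermes-logs.sha256"

# 2. Bundle and compress the logs, then checksum the archive itself.
tar -czf "$audit/hermes-logs.tar.gz" -C "$logs" .
( cd "$audit" && sha256sum hermes-logs.tar.gz > hermes-logs.tar.gz.sha256 )

# 3. Verify both layers: the archive, then the files from a scratch extract.
( cd "$audit" && sha256sum -c hermes-logs.tar.gz.sha256 )
mkdir "$work/check" && tar -xzf "$audit/hermes-logs.tar.gz" -C "$work/check"
verified="$(cd "$work/check" && sha256sum -c "$audit/hermes-logs.sha256" | grep -c ': OK$')"
echo "files verified: $verified"

# 4. Retention drop-in, written to the scratch dir here rather than /etc/logrotate.d/.
cat > "$work/audit-retain.conf" <<'EOF'
/var/log/nginx/*.log {
    weekly
    rotate 52
    dateext
    compress
    missingok
    notifempty
}
EOF

rm -rf "$work"
```

Keeping a per-file manifest alongside the archive checksum means a single log can later be checked without trusting the archive as a whole.
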
**Solution Branches:**

Branch 1 — Complete across all three (priority 100): All relevant logs archived
with integrity verification, logrotate configs updated on all three machines,
paths reported to Priya. `trust_delta: +3`. Flags: `audit_logs_archived`. The
archived logs are what make the `exposure` ending possible — a player who has
been curious and now preserves the evidence. Follow-up ticket: T036.

Branch 2 — Partial (priority 50): Two machines complete; one incomplete. `trust_delta: +1`.
Priya follows up.

Branch 3 — Selectively omits (priority 10): Player archives most logs but omits
logs that would document their own access history. `trust_delta: -3`. S+3. R+3.
This is evidence tampering.

**No hidden hook.**

**Failure Conditions:** Log archival skips relevant files; integrity checksums
not computed; logrotate not updated (logs still at risk of rotation).

**Behavior Impact:**
- Complete: O+2
- Selective omission: R+3, S+3

---

**Quest ID:** Q036
**Title:** Authorized Access
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** build\_machine
**Additional VMs:** none
**Primary Objective:** Priya, with Kowalski's authorization, has provided credentials
to connect to 10.0.0.47 for a forensic inventory. Document what is running, what
data is present, and whether Axiom Works data is identifiable in the data store.
Do not modify anything.
**Linux Concepts:** `ssh` with specific key/user, service enumeration (`systemctl`,
`ps aux`), directory listing and file inspection (`ls -lah`, `find`), reading
database contents without modifying (read-only queries, file listing only), `wc -l`
for size estimation
**Systems Used:** build\_machine
**Ticket Sender:** Priya Nair
**Ticket Summary:** "Kowalski has authorized a forensic connection to 10.0.0.47.
Credentials attached. I need: what services are running, what data is in the data
store path I've indicated, and whether you can identify Axiom Works data in it.
Document only. Do not modify, delete, or stop anything."

**Clue Trail:**
- SSH connection succeeds with provided credentials
- Services: the bridge binary running, an HTTP API on port 9301 (same as hermes
finding), a simple file-based data store
- Data store contains log files organized by company domain — AxiomFlow session
data is present and identifiable; other company names are also present
- File timestamps in the data store align with the 3am cron window from vulcan

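A document-only inventory of the data store can be built from `find`, `ls -l`, and `wc -l`, with all output written outside the store. The sketch below runs against an invented directory layout; the domain names, file names, and contents are hypothetical. These commands issue no writes of their own, though reading file contents can still update access times unless the filesystem is mounted read-only or with `noatime`.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"
store="$work/datastore"   # stand-in for the data store path given in the ticket

# Invented layout: one directory per company domain, log files inside.
mkdir -p "$store/axiomworks.example" "$store/othercorp.example"
printf 'session a\nsession b\nsession c\n' > "$store/axiomworks.example/2024-01.log"
printf 'session x\n'                       > "$store/othercorp.example/2024-01.log"

# Walk the whole tree, not just the first matching directory, and record
# permissions, owner, size, and mtime. The listing is saved outside the store.
find "$store" -type f -exec ls -l {} + > "$work/inventory.txt"

# Line counts per file as a rough size estimate.
summary="$(cd "$store" && find . -type f | sort | while read -r f; do
  printf '%s %s\n' "$(wc -l < "$f" | tr -d ' ')" "$f"
done)"
echo "$summary"

rm -rf "$work"
```
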
**Solution Branches:**

Branch 1 — Document only (priority 100): Player inventories services, reads the
data store structure (without modifying), identifies Axiom Works data, notes other
company data, records timestamps, produces a complete forensic inventory. `trust_delta: +3`.
Flags: `unknown_host_documented`. Follow-up ticket: T037.

Branch 2 — Minimal engagement (priority 50): Player confirms host is running and
that data is present but doesn't fully inventory. `trust_delta: +1`.

Branch 3 — Modifies or deletes (priority 10): Player attempts to delete the data
or stop services. `trust_delta: -3`. R+5. S+3. Legal and forensic implications.
Priya: "I explicitly said document only."

**Hidden Hook:** The data store on 10.0.0.47 contains a directory for a company
called `axiomworks-internal` with a subfolder called `employees` — not just session
logs but what appears to be an employee activity profile structure. This is more than
session data collection. Sets `hook_employee_profile_data`. Discoverable by:
reading the full data store directory structure rather than stopping at the first
confirming evidence of Axiom Works data.

**Failure Conditions:** Player modifies or deletes anything; player exceeds the
authorized scope of the connection.

**Behavior Impact:**
- Full documentation: O+2, C+2
- Modify or delete: R+5, S+3
- Hook discovered: C+2

---

**Quest ID:** Q037
**Title:** The Customer Email
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** workstation
**Additional VMs:** web\_server
**Primary Objective:** Tanya Okafor forwarded a customer email that contains specific
internal infrastructure details the customer should not have. Trace where the
information came from.
**Linux Concepts:** Log correlation, `grep` across multiple log files, timeline
construction, identifying data egress paths
**Systems Used:** workstation, web\_server
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Tanya forwarded something. A customer email with internal
details that should not be in a customer's hands. Find where this came from. This
is urgent."

**Clue Trail:**
- The specific details in the customer email match AxiomFlow session data fragments
visible in the 10.0.0.47 data store (from Q036)
- The egress path: `axiomflow-bridge` service on hermes → 10.0.0.47 → apparent
data sharing by the operator of that host
- Timeline: the customer email date, the last bridge log entry, the most recent
data file in the store — they align
- Player constructs the path by correlating timestamps and data content

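Timeline construction here means merging events from several sources into one ordered list. A minimal sketch, assuming each source has already been normalized to ISO 8601 UTC timestamps, which sort correctly as plain text; the dates, times, and event wording are invented, and real logs would usually need their timestamp formats and time zones reconciled first.

```bash
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Invented events, one file per source, timestamp in the first field.
cat > "$work/bridge.log" <<'EOF'
2024-03-04T03:00:12Z bridge: batch sent to 10.0.0.47:9301
2024-03-05T03:00:09Z bridge: batch sent to 10.0.0.47:9301
EOF
cat > "$work/datastore.txt" <<'EOF'
2024-03-05T03:00:31Z datastore: newest file written
EOF
cat > "$work/email.txt" <<'EOF'
2024-03-05T14:22:00Z email: customer message received
EOF

# Merge all sources and order by the timestamp in field 1.
timeline="$(sort -k1,1 "$work/bridge.log" "$work/datastore.txt" "$work/email.txt")"
echo "$timeline"

first_event="$(printf '%s\n' "$timeline" | head -n 1)"
last_event="$(printf '%s\n' "$timeline" | tail -n 1)"

rm -rf "$work"
```

Tagging each line with its source, as the sample entries do, keeps the merged timeline traceable back to the log it came from.
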
**Solution Branches:**

Branch 1 — Full trace (priority 100): Player documents the complete path from
bridge service to external host to customer, produces a timeline with corroborating
timestamps. `trust_delta: +3`. Flags: `egress_path_documented`. Priya: "I'll add
this to the record." Follow-up ticket: T038.

Branch 2 — Partial trace (priority 50): Connects the email to the external host
but cannot trace the full egress path. `trust_delta: +1`.

**No hidden hook.**

**Failure Conditions:** Player cannot produce a coherent timeline; player modifies
relevant logs before Priya can review.

**Behavior Impact:**
- Full trace: O+1, C+2
- Modify logs: R+5

---

**Quest ID:** Q038
**Title:** The Hard Window
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** build\_machine
**Additional VMs:** web\_server
**Primary Objective:** The internal CA certificate must be rotated before the
external auditors arrive — a deadline that is now 36 hours away. Rotate the CA
cert on both build\_machine and web\_server and verify the full trust chain.
**Linux Concepts:** Internal CA certificate management, `update-ca-certificates`,
package signing chain verification, `gpg --verify` after the CA change, nginx
SSL configuration reload, trust chain testing with `openssl verify`
**Systems Used:** build\_machine, web\_server
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "The internal CA cert has to be rotated before the auditors
arrive. The new cert is ready — path is in the attached note. Install it on both
vulcan and hermes, verify the package signing chain still validates, and confirm
the web server's trust chain is intact. You have 36 hours. Don't miss this."

**Clue Trail:**
- New CA cert provided at specified path
- `update-ca-certificates` on both machines after placing cert in `/usr/local/share/ca-certificates/`
- `gpg --verify` on a recent build package — the signature must still validate after the rotation
- `openssl verify -CAfile /etc/ssl/certs/axiomworks-ca.pem /path/to/server.crt` on hermes
- `nginx -t && systemctl reload nginx` — confirm nginx uses updated cert

**Solution Branches:**

Branch 1 — Both machines, verified (priority 100): CA cert installed on both,
package signing chain verified, web server trust chain verified, services reloaded.
`trust_delta: +2`. Flags: `ca_cert_rotated`. Marcus: "Good." Follow-up ticket: T039.

Branch 2 — One machine (priority 50): One complete, one pending. `trust_delta: +0.5`.
Follow-up: Marcus asks for the second machine.

Branch 3 — Incorrect installation (priority 10): Cert in wrong location; trust
chain broken; services fail. `trust_delta: -2`.

**No hidden hook.**

**Failure Conditions:** CA cert installed incorrectly; trust chain broken; deadline
missed.

**Behavior Impact:**
- Both complete: O+2
- Incorrect installation: R+2

**Narrative Notes:** This quest replaces the "write what you believe happened"
quest from v1. The conflict is operational: the external auditors are coming, the
clock is tight, and the player has to do real technical work under real pressure.
The fact that it's happening alongside everything else is the conflict — not
a character-to-character confrontation.

---

**Quest ID:** Q039
**Title:** The Last Normal Ticket
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** Sarah files a configuration request: add an upstream proxy
pass to nginx for a new reporting integration. The IP in the request is 10.0.0.47.
**Linux Concepts:** nginx `proxy_pass`, upstream block configuration, `nginx -t`,
config reload
**Systems Used:** web\_server
**Ticket Sender:** Sarah Chen
**Ticket Summary:** "Can you add a proxy pass in nginx to allow connections from
the reporting integration being evaluated? The upstream address is 10.0.0.47,
port 9301. Product says this should be a quick config change."

**Clue Trail:**
- The change is technically simple — 5 lines of nginx config
- The IP is 10.0.0.47 — the unauthorized host from the entire investigation arc
- Sarah does not know this. She was given the IP by someone in product management
- The player recognizes the IP or does not

**Solution Branches:**

Branch 1 — Refuse and escalate (priority 100): Player declines to make the change,
notifies Priya immediately with the specific IP and its context, notifies Sarah that
the request is on hold pending review. `trust_delta: +3`. Flags:
`final_config_refused`. Priya: "Do not make that change. Good catch." Follow-up:
T040 (Phase 6 begins).

Branch 2 — Ask Marcus first (priority 60): Player messages Marcus with the IP.
Marcus says "Do not make that change. Tell Priya now." Leads to Branch 1 outcome.
`trust_delta: +1`.

Branch 3 — Make the change (priority 10): Player makes the change without checking
the IP. `trust_delta: -3`. R+5. Flags: `final_config_made`. Priya: "You need to
come talk to me." The chaos ending route activates.

**No hidden hook.**

**Failure Conditions:** Change made without escalation.

**Behavior Impact:**
- Refuse and escalate: O+2, C+1 (recognizing the IP requires prior curiosity)
- Make the change: R+5, S+3

**Narrative Notes:** This is not a dramatic final-choice moment. It is a routine
nginx config ticket that happens to involve an IP the player has encountered
before — or hasn't. Players who have been curious will recognize it. Players who
haven't won't. Both are valid playthroughs. The ending route this sets is already
determined by prior behavior; Q039 confirms or breaks it.

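The priority numbers attached to solution branches throughout these quest specs suggest that, when more than one branch could match, the highest-priority match is the one that resolves. The sketch below shows one way that reading could work for Q039. The `Branch` dataclass, the `resolve_branch` helper, and the state keys (`change_made`, `priya_notified`, `marcus_asked`) are hypothetical names chosen for illustration; they are not taken from the game's actual schema.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

State = Dict[str, bool]  # hypothetical: facts recorded about the player's actions


@dataclass(frozen=True)
class Branch:
    name: str
    priority: int
    trust_delta: float
    matches: Callable[[State], bool]


# Q039 branches as listed in the spec; the match conditions are assumptions.
Q039_BRANCHES: List[Branch] = [
    Branch("refuse_and_escalate", 100, 3.0,
           lambda s: not s["change_made"] and s["priya_notified"]
           and not s["marcus_asked"]),
    Branch("ask_marcus_first", 60, 1.0,
           lambda s: not s["change_made"] and s["marcus_asked"]),
    Branch("make_the_change", 10, -3.0,
           lambda s: s["change_made"]),
]


def resolve_branch(branches: List[Branch], state: State) -> Optional[Branch]:
    """Return the highest-priority branch whose condition holds, or None."""
    candidates = [b for b in branches if b.matches(state)]
    return max(candidates, key=lambda b: b.priority, default=None)
```

Returning `None` when nothing matches leaves room for the quest to stay open until the player acts, which seems closer to the document's intent than forcing a default branch.
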
---

**Quest ID:** Q040
**Title:** Handoff Documentation
**Narrative Phase:** Conflict
**Tier:** 3
**Primary VM:** workstation
**Additional VMs:** web\_server, build\_machine
**Primary Objective:** With external auditors arriving and organizational changes
underway, Marcus asks the player to produce full handoff documentation for all
three machines — written for a new sysadmin who would be starting fresh.
**Linux Concepts:** Service documentation, runbook format, dependency mapping,
`systemctl list-dependencies`, expected log patterns, known issue tracking
**Systems Used:** workstation, web\_server, build\_machine
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Whatever happens next — write it down. Runbooks for nginx,
the build pipeline, and the workstation baseline. Clear enough that someone new
could use them on day one. I mean someone who doesn't know any of the history."

**Clue Trail:**
- Player documents each machine: services, dependencies, restart procedures,
known issues
- Quality depends on what the player actually knows about the infrastructure —
which reflects the whole campaign
- "Someone who doesn't know any of the history" is Marcus being precise: write
for the person who is you, on your first day

**Solution Branches:**

Branch 1 — Complete (priority 100): All three machines documented, runbooks are
accurate and actionable. `trust_delta: +2`. Flags: `handoff_docs_complete`.
Marcus: "I'll keep these." Follow-up: T041 (Phase 6 begins if not already started).

Branch 2 — Partial (priority 50): Two of three complete. `trust_delta: +1`.

**No hidden hook.**

**Failure Conditions:** Documentation inaccurate about current system state;
known issues omitted.

**Behavior Impact:**
- Complete: O+2

---

### PHASE 6 — RESOLUTION (Q041–Q048)

Tier 1 returns for most quests. The pressure has lifted. The tickets are operational.
The game looks like Phase 1 again, deliberately. Hook density: 0 — no new hooks.
The ending fires from accumulated state after Q048 resolves.

---

**Quest ID:** Q041
**Title:** Hardening Pass
**Narrative Phase:** Resolution
**Tier:** 2
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** Following the audit, Priya has issued a hardening checklist
for hermes. Implement each item and confirm the result.
**Linux Concepts:** SSH hardening (`PermitRootLogin no`, `PasswordAuthentication no`,
`MaxAuthTries`), nginx security headers (`X-Frame-Options`, `X-Content-Type-Options`,
`Content-Security-Policy`), `ufw` rule review, service account audit
**Systems Used:** web\_server
**Ticket Sender:** Priya Nair
**Ticket Summary:** "Post-audit hardening for hermes. The checklist is attached.
Implement each item, test that the service still runs correctly, and confirm back
with the state of each item. This is standard post-audit procedure."

**Clue Trail:**
- Checklist items are specific and implementable
- Each item has a correct implementation and a common mistake (e.g., disabling
`PasswordAuthentication` before confirming that key auth works)
- Sequence matters: verify key auth before disabling password auth

**Solution Branches:**

Branch 1 — All items, correct sequence (priority 100): All checklist items
implemented, sequence preserved, service verified after each change. `trust_delta: +2`.
Flags: `hermes_hardened`. Follow-up ticket: T042.

Branch 2 — All items, wrong sequence (priority 50): All items implemented but in
an order that breaks SSH access temporarily. Fixed, but the mistake is noted.
`trust_delta: +0.5`.

Branch 3 — Partial (priority 30): Some items implemented, some missed. `trust_delta: 0`.

**Failure Conditions:** SSH access lost; nginx returns errors after security header
changes; service broken.

**Behavior Impact:**
- All items correct: O+1
- Wrong sequence: R+1

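As a rough illustration of how the SSH portion of the Q041 checklist might be confirmed, the sketch below parses `sshd_config` text and reports which of the three named directives are not in the hardened state. It is a simplification under stated assumptions: it ignores `Match` blocks and `Include` files, the `MAX_AUTH_TRIES_LIMIT` value of 3 is an assumed checklist value (the document does not give one), and the function names are hypothetical. It does not model the sequencing concern (verifying key auth before disabling password auth).

```python
from typing import Dict, List

# Directives the Q041 checklist names, with the hardened value for each.
REQUIRED = {"permitrootlogin": "no", "passwordauthentication": "no"}
MAX_AUTH_TRIES_LIMIT = 3  # assumption: the spec does not state the checklist value


def parse_sshd_config(text: str) -> Dict[str, str]:
    """Map lowercased keyword -> value; the first occurrence wins, as in sshd."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blanks and comment lines
        parts = line.split(None, 1)
        if len(parts) != 2:
            continue  # keyword with no value; ignored in this sketch
        key, value = parts
        settings.setdefault(key.lower(), value.strip())
    return settings


def unmet_items(text: str) -> List[str]:
    """Return the checklist keywords that are missing or not hardened."""
    settings = parse_sshd_config(text)
    problems = [key for key, wanted in REQUIRED.items()
                if settings.get(key, "").lower() != wanted]
    tries = settings.get("maxauthtries")
    if tries is None or not tries.isdigit() or int(tries) > MAX_AUTH_TRIES_LIMIT:
        problems.append("maxauthtries")
    return problems
```

A missing directive is reported as unmet rather than assumed safe, since the compiled-in default may differ between OpenSSH versions.
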
---

**Quest ID:** Q042
**Title:** The New Pipeline
**Narrative Phase:** Resolution
**Tier:** 2
**Primary VM:** build\_machine
**Additional VMs:** web\_server
**Primary Objective:** Nikhil has updated the build pipeline configuration.
Review the new config for correctness, test a build, and confirm deployment
to hermes succeeds.
**Linux Concepts:** Build pipeline configuration (systemd timer, build script),
`diff` against previous config, `reprepro` or equivalent for package publishing,
end-to-end deployment test
**Systems Used:** build\_machine, web\_server
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Nikhil updated the build config — new format, different
timing. Review it for correctness, trigger a test build, and confirm the package
makes it to hermes's apt cache. Standard validation."

**Clue Trail:**
- New config at `/etc/systemd/system/axiomflow-build.service` and `.timer`
- `diff` against old config — timing changed, `ExecStart` updated
- No build-time patches present (the INT-0194 patch was removed)
- Test build: trigger manually with `systemctl start axiomflow-build.service`
- Confirm artifact in repo, confirm `apt-cache show` on hermes

**Solution Branches:**

Branch 1 — Full validation (priority 100): Reviews config, confirms no problematic
modifications, tests build, confirms deployment. `trust_delta: +2`. Flags:
`pipeline_validated`. Follow-up ticket: T043.

Branch 2 — Test only (priority 50): Triggers build without reviewing config first.
Build succeeds; config wasn't reviewed. `trust_delta: +0.5`.

**Failure Conditions:** Test build fails; player introduces errors while reviewing;
deployment not verified.

**Behavior Impact:**
- Full validation: O+1

---

**Quest ID:** Q043
**Title:** The Final Access Review
**Narrative Phase:** Resolution
**Tier:** 2
**Primary VM:** workstation
**Additional VMs:** web\_server, build\_machine
**Primary Objective:** Priya's final access review: verify that the player's
current permissions across all three machines are appropriate for their role,
and revoke any investigation-period access that should no longer be in place.
**Linux Concepts:** `sudo -l`, `getent passwd`, `groups`, SSH authorized keys
review across machines, `userdel` for any temporary accounts created during
investigation
**Systems Used:** workstation, web\_server, build\_machine
**Ticket Sender:** Priya Nair
**Ticket Summary:** "Final access review. Your current permissions, group
memberships, and SSH keys across all three machines. Confirm they're appropriate
for your ongoing role. Revoke anything left from the investigation period that
shouldn't persist."

**Clue Trail:**
- Player audits their own access state on each machine
- Any access granted during investigation that hasn't been revoked should be
revoked here
- The player's self-reporting is checked against the access logs

**Solution Branches:**

Branch 1 — Clean (priority 100): Player accurately identifies and revokes any
residual investigation access; current permissions match ongoing role. `trust_delta: +2`.
Flags: `final_access_clean`. Priya: "That's correct." Follow-up: T044.

Branch 2 — Retain investigation access (priority 20): Player retains elevated
access without declaring it. `trust_delta: -1`. R+2. S+2.

**Failure Conditions:** Material gaps in self-reporting; access state doesn't
match claims.

**Behavior Impact:**
- Clean: O+2
- Retain silently: R+2, S+2

---

**Quest ID:** Q044
**Title:** System State Review
**Narrative Phase:** Resolution
**Tier:** 1
**Primary VM:** workstation
**Additional VMs:** none
**Primary Objective:** Marcus asks the player to document the current known state
of all three machines in a brief system state report — services running, notable
recent changes, open items. Routine administrative record.
**Linux Concepts:** `systemctl list-units`, `uptime`, `df -h`, `last`, service
status summary, change record cross-referencing
**Systems Used:** workstation
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "Quick system state summary. All three machines: what's
running, anything notable from the past two weeks, any open items. For the record.
Keep it brief."

**Clue Trail:**
- Player compiles from current service state and recent log/change records
- Accuracy is the objective; the technical skill is efficient log reading

**Solution Branches:**

Branch 1 — Accurate and complete (priority 100): State report is accurate and
reflects current conditions. `trust_delta: +1`. Marcus: "Good." Flags:
`system_state_documented`. Follow-up: T045.

Branch 2 — Incomplete (priority 50): Missing items from one or more machines.
`trust_delta: 0`.

**Behavior Impact:**
- Complete: O+1

**Narrative Notes:** Marcus's brief response on the clean branch is the last thing
he'll say before the ending fires. His voice is identical to Phase 1 — the
same efficiency, the same brevity. What the player has been through doesn't show
in his messages. It shows in the ending.

---

**Quest ID:** Q045
**Title:** Cert Renewal Check
**Narrative Phase:** Resolution
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** none
**Primary Objective:** Three months have passed since the certbot timer was restored
in Phase 1. Confirm that automatic certificate renewal ran successfully as scheduled.
**Linux Concepts:** `certbot certificates`, `openssl s_client`, `systemctl status certbot.timer`,
`journalctl -u certbot`, verifying renewal without intervention
**Systems Used:** web\_server
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "The cert on hermes is coming up on 90 days since we last
renewed. Confirm the auto-renewal ran and the cert is valid. Should be nothing
to do if it's working right."

**Clue Trail:**
- If `hermes_certbot_healthy` was set in Q007: timer ran, cert is current —
nothing to do except confirm
- If `hermes_certbot_fragile` was set: cert has expired again; player must renew
and actually fix the timer this time
- Either way: `certbot certificates` and `openssl s_client` confirm the state

**Solution Branches:**

Branch 1 — Confirm healthy (priority 100): If auto-renewal worked, player confirms
and reports. `trust_delta: +1`. Clean system, clean record. Follow-up: T046.

Branch 2 — Find and fix recurrence (priority 80): If timer was fragile from Phase 1,
player fixes the actual root cause (timer was never enabled). Higher trust delta for
fixing the real issue: `trust_delta: +2`. Flags: `hermes_certbot_finally_stable`.

**Failure Conditions:** Cert is expired and player doesn't notice.

**Behavior Impact:**
- Confirm healthy: O+1

---

**Quest ID:** Q046
**Title:** User Provisioning
**Narrative Phase:** Resolution
**Tier:** 1
**Primary VM:** workstation
**Additional VMs:** web\_server
**Primary Objective:** A new employee needs accounts provisioned on the workstation
and web server with appropriate access levels for their role (developer, not admin).
**Linux Concepts:** `useradd`, `usermod -aG`, SSH authorized key provisioning,
account creation best practices, principle of least privilege applied to a new account
**Systems Used:** workstation, web\_server
**Ticket Sender:** Rachel Huang
**Ticket Summary:** "New hire starting Monday — Cora Reyes, software engineer,
AxiomDash team. She'll need accounts on the workstation and web server for
deployment access. Standard developer access — not admin. Her public key is attached."

**Clue Trail:**
- `useradd` with appropriate flags, add to `deploy` group on hermes (not sudo
or admin groups)
- Install her public key in `authorized_keys` with correct permissions
- Confirm access works without elevated privileges

**Solution Branches:**

Branch 1 — Correct provisioning (priority 100): Account created with correct
groups, key installed with correct permissions, access confirmed. `trust_delta: +1`.
Flags: `new_user_provisioned_correctly`. Follow-up: T047.

Branch 2 — Over-provisioned (priority 40): Player adds the new user to admin
or sudo group unnecessarily. Access works; not least privilege. `trust_delta: 0`.
R+1.

**Failure Conditions:** User cannot log in; user has too much access.

**Behavior Impact:**
- Correct: O+1
- Over-provisioned: R+1

---

**Quest ID:** Q047
**Title:** Log Rotation Health Check
**Narrative Phase:** Resolution
**Tier:** 1
**Primary VM:** web\_server
**Additional VMs:** build\_machine
**Primary Objective:** Three months post-audit. Confirm that log rotation is
healthy on both hermes and vulcan — no oversized logs, rotation actually running,
disk usage acceptable.
**Linux Concepts:** `logrotate --debug`, `df -h`, log file size inspection (`du -sh`),
`systemctl status logrotate.timer`, verifying rotation ran via timestamps on
archived log files
**Systems Used:** web\_server, build\_machine
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "End of quarter log check. Hermes and vulcan — confirm log
rotation is running and disk usage is healthy. Should be nothing if everything
is set up right. Let me know the state of both."

**Clue Trail:**
- `df -h` on both machines — disk usage
- `ls -lht /var/log/nginx/` — rotation timestamps confirm it's running
- `logrotate --debug /etc/logrotate.conf` — confirms config is valid
- If any Phase 1/2 fragile-fix flags are set, corresponding logs may still be
unhealthy — the player will need to actually fix what they previously patched

**Solution Branches:**

Branch 1 — Both healthy (priority 100): Both machines confirmed healthy, report
submitted. `trust_delta: +1`. Follow-up: T048.

Branch 2 — Problem found and fixed (priority 80): Player finds a log that's grown
too large (a Phase 1 fragile fix recurrence), diagnoses and fixes it. `trust_delta: +2`.

**Failure Conditions:** Disk problem missed; player reports healthy when it isn't.

**Behavior Impact:**
- Both healthy: O+1
- Find and fix: O+1 (same behavior, reward for follow-through)

---

**Quest ID:** Q048
**Title:** The Next One
**Narrative Phase:** Resolution
**Tier:** 1
**Primary VM:** build\_machine
**Additional VMs:** web\_server
**Primary Objective:** A new version of AxiomFlow is being prepared for staging
deployment. Validate the build, publish it to the repo, and confirm hermes can
install it. Routine deployment pipeline operation.
**Linux Concepts:** Build artifact validation (`sha256sum`), `reprepro` package
publishing, `apt update` and `apt-cache policy` verification, end-to-end deployment
pipeline confirmation
**Systems Used:** build\_machine, web\_server
**Ticket Sender:** Marcus Webb
**Ticket Summary:** "New release candidate is built. Validate the artifact, publish
it to the repo, confirm hermes can see it. Standard release prep. Let me know
when it's available."

**Clue Trail:**
- Artifact at `/srv/packages/` with accompanying `sha256sum` file
- Validate checksum, publish with `reprepro`, update hermes apt sources, confirm
`apt-cache policy` shows the new version
- No anomalies. The pipeline is clean. This is what it's supposed to look like.

**Solution Branches:**

Branch 1 — Full validation and publish (priority 100): Artifact validated, published
correctly, hermes cache updated, version confirmed. `trust_delta: +1`. Marcus: "Good."
Flags: `final_release_published`. Ending fires.

**No hidden hook. No drama. This is a clean deployment.**

**Failure Conditions:** Artifact published without checksum verification; hermes
cannot see the new version.

**Behavior Impact:**
- Full validation: O+1

**Narrative Notes:** The last quest is a clean deployment pipeline check. The
last command the player runs is `apt-cache policy axiomflow-workers | grep Candidate`.
The version it shows is correct and clean. Marcus says "Good." The ending fires
from the accumulated state of everything that preceded it. No character explains
what happened. No screen asks the player to choose. The work is done.

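The checksum step in Q048's clue trail is the one the failure condition hinges on, so a small sketch of what "validate the artifact" means may help quest authors. The example below recomputes a SHA-256 digest and compares it with the entry in a `sha256sum`-style file. The function names are hypothetical, and the file layout (`<hex digest>  <filename>` per line) is the standard `sha256sum` output format; the actual artifact name under `/srv/packages/` is not given in this document.

```python
import hashlib
from pathlib import Path
from typing import Optional


def sha256_of(path: Path, chunk_size: int = 65536) -> str:
    """Hash the file in chunks so large artifacts are not loaded at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def expected_digest(checksum_text: str, filename: str) -> Optional[str]:
    """Find the digest recorded for `filename` in sha256sum-style text."""
    for line in checksum_text.splitlines():
        parts = line.split(None, 1)
        # sha256sum prefixes the name with '*' when run in binary mode
        if len(parts) == 2 and parts[1].strip().lstrip("*") == filename:
            return parts[0].lower()
    return None


def artifact_is_valid(artifact: Path, checksum_file: Path) -> bool:
    """True only if a digest is recorded for the artifact and it matches."""
    expected = expected_digest(checksum_file.read_text(), artifact.name)
    return expected is not None and sha256_of(artifact) == expected
```

Treating a missing checksum entry as invalid mirrors the quest's failure condition: publishing without verification is a failure even if the artifact happens to be intact.
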
---

## 5. Hidden Hook Map

### Hook Summary Table

| Hook ID | Quest | Discovery Method | Investigation Thread | Ignored Impact |
|---------|-------|-----------------|---------------------|----------------|
| `hook_dale_ssh_key_found` | Q001 | Read `authorized_keys` before writing | Dale was active on the workstation | Low; first data point |
| `hook_dale_deploy_key` | Q003 | Read deploy-user's `authorized_keys` | Dale had deployment access | Surfaces in Q024 formal audit |
| `hook_sign_package_removed` | Q004 | Read historical build logs (not just current failure) | Package signing was removed from the pipeline | Connects to Q026 build chain audit |
| `hook_pre_hire_root_session` | Q005 | Read `/root/.bash_history` to trace ownership change | Root-level activity occurred before the player's hire date | Central to the timeline of activity |
| `hook_dh_initials_in_jbenton_notes` | Q006 | Read `notes/infra.txt` before archiving | `pipeline-svc` had a temp sudo grant; initials `DH` granted it | Connects to Q011 sudoers comment |
| `hook_certbot_deliberately_disabled` | Q007 | Read journalctl further back than needed | certbot timer was manually disabled after a failure | Pattern of deliberate changes |
| `hook_audit_bridge_package` | Q008 | Look at the full repo package list, not just the missing package | A package was built with no release record | MAJOR: central to the INT-0194 thread |
| `hook_nginx_internal_api_block` | Q010 | Do a thorough diff (find both changes) | Port 9301 referenced in nginx proxy block | Port number echoes in later anomalies |
| `hook_dh_sudo_grant` | Q011 | Read the comment in `/etc/sudoers.d/pipeline-svc` | `DH` initials appear again; INT-0194 ticket number first appears | `DH` + INT-0194 thread begins |
| `hook_telemetry_ticket_INT0194` | Q013 | Read the service unit file comment | INT-0194 second reference; same ticket across different systems | Pattern becoming visible |
| `hook_2_4_1_off_schedule_build` | Q014 | Check build timestamp on vulcan for the rolled-back package | 3am build window pattern | Connects to the timing thread |
| `hook_collect_binary_INT0194` | Q015 | Inspect the unattributed binary (Branch 1 only) | INT-0194 third reference; binary name confirms collection function | Major accumulation: three INT-0194 sightings |
| `hook_pipeline_svc_external_sessions` | Q017 | Cross-reference Q011 sudo grant with Q017 auth log finding | pipeline-svc was accessed externally with what was once NOPASSWD: ALL | Shows scope of the elevated access |
| `hook_rford_script_INT0194` | Q018 | Read `.rford_run` before archiving | INT-0194 fourth reference; rford account part of INT-0194 automation | Four sightings: pattern is now unmistakable |
| `hook_build_patch_INT0194` | Q019 | Trace the modification source to the build environment (Branch 1) | INT-0194 fifth reference; patch is the injection mechanism | Five sightings; picture is complete for curious players |
| `hook_backup_archive_tampered` | Q021 | Check file timestamps on the corrupted archive | Archive was modified at 3am — same timing pattern | Evidence suppression pattern |
| `hook_second_host_10_0_1_15` | Q023 | Record the specific IP from the modified files | A second unauthorized host exists | Expands the scope of the operation |
| `hook_dale_key_last_session_incident_date` | Q025 | Correlate auth log dates with nginx error log dates | Dale's last known access aligns with a specific outage | Dale was active during the incident |
| `hook_two_hosts_same_key` | Q027 | Compare SSH fingerprints from the nmap scan | Both unauthorized hosts provisioned from the same template | Suggests organized infrastructure |
| `hook_archive_readme_INT0194` | Q028 | Read the README in the restored archive | INT-0194 sixth reference; "styx" routing context | Near-complete picture for thorough players |
| `hook_employee_profile_data` | Q036 | Read the full data store directory structure | Data collected includes employee profiles, not just session logs | The scope is worse than session logging |

### The Two Narrative Threads

**Thread 1 — INT-0194: What the deployment did.**
Six references across Q011, Q013, Q015, Q018, Q019, and Q028, anchored by the
unrecorded package found in Q008. Each is discoverable through legitimate work that
goes one step further than the ticket requires. The thread resolves in Q029 when the
`axiomflow-bridge` service on hermes is characterized and its unit file confirms the
INT-0194 connection. A player who found all six references understands exactly what
was deployed and what it does.

**Thread 2 — Dale: Who found it first.**
Five references across Q001, Q003, Q004, Q005, Q025. Dale's SSH key appears three times
on different machines. The bash history shows root activity predating the player. Q025
traces Dale's last authenticated session to a specific date. The archive in Q028 contains
Dale's working notes. A player who assembled Thread 1 and Thread 2 together knows:
Dale found INT-0194, tried to document it, and left before finishing.

Neither thread requires the other. A player can find one without the other. Both
together, with Q036's forensic access, produce the full picture.

### What Happens If Hooks Are Ignored

No mechanical penalty. Narrative consequences:

- Q035 (log archival) — the player archives logs that tell the story, but without
context the record is just log files
- Q036 (authorized access) — the player sees the data store but may not recognize
the significance of the employee profile directory
- Q041 (hardening pass) and Q042 (new pipeline) — these quests look identical
regardless of investigation history; the difference is what the player understands
about why the hardening was necessary
- Endings: `exposure` requires accumulated major hooks plus positive trust and
low risk. Without the hooks, the ending routes to `corporate_loop` or `burnout`.
The investigation record from Q035 (log archival) IS the ending — a thorough
player's archived logs are usable evidence; an obedient player's are just logs.

---

## 6. Behavior Variable Rules

### Curiosity

Measures: tendency to investigate beyond ticket scope; reading further than
required; cross-referencing anomalies.

Increases when: a hidden hook is discovered; player runs commands or reads files
not needed to complete the objective; player cross-references current findings with
prior anomalies in their documentation.

Does NOT increase for: completing tickets correctly; asking Marcus for hints;
reading log files that are on the direct clue trail.

Effect on ending:
- High curiosity (major hooks discovered, INT-0194 thread assembled) → `exposure`
is reachable
- Moderate curiosity → `corporate_loop` or `burnout` depending on obedience
- Curiosity affects the depth of Marcus's Phase 6 Slack messages — not what he
says, but how much of the picture his phrasing implies the player already has

Curiosity does not decay.

### Obedience

Measures: completing assigned tickets correctly, staying in scope, following
authority structures, escalating before deviating.

Increases when: clean or acceptable branch taken; player documents before acting;
player escalates before taking action outside their scope; player completes both
tickets in Q034.

Does NOT increase for: refusing instructions; failing to complete tickets; making
changes beyond scope without authorization.

Effect on ending:
- High obedience + low curiosity → `corporate_loop`
- High obedience + high curiosity → `exposure` (curiosity wins; obedience affects
the quality of the ending — how thorough the record is)
- Low obedience + low curiosity → `burnout`

Obedience is not a moral score. Maximum obedience without curiosity produces the
`corporate_loop` ending, which is labeled the bad ending in SPEC_LOCK. Compliance
without understanding has a cost.

### Risk

Measures: reckless changes, evidence destruction, security bypasses, unauthorized
access, falsified reports.

Increases when: player bypasses security controls (SSL verification, firewall
rules), player destroys or omits evidence, player makes changes beyond authorized
scope, player falsifies access reviews or reports, player takes destructive action
on the unauthorized hosts.

Decreases when: player correctly self-audits in Q031 and Q043; player accurately
reports in access reviews. (Partial decay only; risk cannot go negative.)

Effect on ending:
- High risk → `chaos`, regardless of curiosity or obedience
- Risk above the chaos threshold overrides all other ending conditions
- Moderate risk without reaching the chaos threshold: increases suspicion;
may restrict access; does not change the ending route alone

### Trust

Measures: professional standing with Marcus and the IT organization.

Mechanics: sum of all `trust_delta` values from branch resolutions across the
playthrough.

Effect:
- Trust below the low threshold: Marcus becomes curt; access may be restricted on
Priya's recommendation
- Trust in the normal range: normal access and character warmth
- Trust above the high threshold: Marcus adds more context to messages; Priya's reviews
are collegial; access grants are faster

Trust is not the ending determinant. A player can have high trust and reach any
ending depending on curiosity and risk.

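Because trust is defined as a plain sum of `trust_delta` values, the mechanic is easy to sketch. The example below accumulates deltas and maps the total to the three bands described above. The threshold values (`LOW_TRUST`, `HIGH_TRUST`) and function names are placeholders for illustration only; the document does not state the actual numbers.

```python
from typing import Iterable

# Hypothetical thresholds: the spec names the bands but not their values.
LOW_TRUST = -3.0
HIGH_TRUST = 10.0


def total_trust(deltas: Iterable[float]) -> float:
    """Trust is the sum of every trust_delta from resolved branches."""
    return sum(deltas)


def trust_band(trust: float) -> str:
    """Map a trust total to the band that drives character behavior."""
    if trust < LOW_TRUST:
        return "low"      # Marcus curt; access may be restricted
    if trust > HIGH_TRUST:
        return "high"     # more context, collegial reviews, faster grants
    return "normal"
```

For instance, the deltas from a Q038 Branch 1, a Q041 Branch 2, and a Q043 Branch 2 resolution (`+2`, `+0.5`, `-1`) sum to `1.5`, which sits in the normal band under these assumed thresholds.
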
### Suspicion

Measures: management and security attention directed at the player's behavior.

Increases when: access footprint doesn't match assigned work scope; reports are
inaccurate or sanitized; player takes actions that generate audit noise; player is
flagged in Priya's access reviews.

Decreases when: player self-reports accurately in access reviews; player documents
all actions before taking them; player stays within authorized scope during
investigation.

Effect:
- Suspicion above low threshold: Kowalski's status emails become more specific
- Suspicion above mid threshold: Priya begins auditing the player's access
patterns in particular
- Suspicion above high threshold: access restriction and the access review (Q031)
are initiated
- Suspicion at maximum (combined with high risk): chaos ending activates regardless
of other variables

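Taken together, the ending effects listed under Curiosity, Obedience, and Risk read like a small decision procedure. The sketch below is one possible ordering of those rules, not the game's actual resolver: every threshold constant is a made-up placeholder, `PlayerState` and `resolve_ending` are hypothetical names, and the handling of a high-curiosity player who fails the trust or risk condition (falling back to the obedience split) is an assumption, since the document does not cover that case. Suspicion is left out because its only ending effect requires high risk, which already routes to `chaos`.

```python
from dataclasses import dataclass

# Placeholder thresholds; the spec refers to them without giving values.
CHAOS_RISK = 15
LOW_RISK = 5
HIGH_CURIOSITY = 12
HIGH_OBEDIENCE = 20


@dataclass(frozen=True)
class PlayerState:
    curiosity: int
    obedience: int
    risk: int
    trust: float


def resolve_ending(s: PlayerState) -> str:
    # Risk above the chaos threshold overrides every other condition.
    if s.risk >= CHAOS_RISK:
        return "chaos"
    # exposure needs accumulated hooks (curiosity), positive trust, low risk.
    if s.curiosity >= HIGH_CURIOSITY and s.trust > 0 and s.risk <= LOW_RISK:
        return "exposure"
    # Otherwise obedience decides between the two remaining routes.
    if s.obedience >= HIGH_OBEDIENCE:
        return "corporate_loop"
    return "burnout"
```

Checking risk first is what makes "high obedience + high curiosity" resolve to `exposure` only when the player has also stayed careful, which matches the rule that curiosity wins over obedience but never over risk.
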
---

## 7. Access Progression Rules

### Levels

**basic\_user:** Day one through end of Phase 1. Player's own account on workstation;
limited SSH to hermes with the deploy account; no vulcan access; no sudo.

**sudo (workstation):** Granted after Q003–Q005 clean branches demonstrate
competence on the workstation and hermes. Notification from Marcus: "I've given you
sudo on the workstation."

**sudo (hermes):** Granted mid-Phase 2 after consistently clean hermes work.
Marcus: "You've got sudo on hermes."

**SSH to vulcan:** Granted after Q008 (first multi-machine quest); player needs
to SSH to vulcan to fix the repo. This is access granted by the task, not a
formal level-up.

**sudo (vulcan):** Granted in Phase 3 when investigation tasks require it.
More formal: Marcus says "I'm giving you sudo on vulcan for the audit work.
This isn't permanent."

**Investigation-level access:** Temporary, task-specific, explicitly granted.
Must be documented and revoked — Q031 and Q043 exist partly to check this.

### Per-Machine Access Tracking

Access level is tracked per machine, not as a single player-level field. The
player can have sudo on hermes and basic\_user on vulcan simultaneously. This
reflects the realistic progression of "access follows trust follows task."

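Per-machine tracking can be pictured as a mapping from machine name to access level rather than a single field on the player. The sketch below is an illustrative model only: the `Access` enum, its ordering, and the `grant`/`restrict` helpers are hypothetical, and `LIMITED_SSH` is an invented label for the deploy-account access that basic\_user has on hermes.

```python
from enum import IntEnum
from typing import Dict


class Access(IntEnum):
    NONE = 0
    LIMITED_SSH = 1   # hypothetical label, e.g. the deploy account on hermes
    BASIC_USER = 2
    SUDO = 3


def day_one_access() -> Dict[str, Access]:
    """Starting state described under basic_user."""
    return {
        "workstation": Access.BASIC_USER,
        "hermes": Access.LIMITED_SSH,
        "vulcan": Access.NONE,
    }


def grant(levels: Dict[str, Access], machine: str, level: Access) -> None:
    """Raise access on one machine; never lowers an existing level."""
    levels[machine] = max(levels[machine], level)


def restrict(levels: Dict[str, Access], machine: str, level: Access) -> None:
    """Cap access on one machine; other machines are untouched."""
    levels[machine] = min(levels[machine], level)
```

With this shape, Marcus pulling sudo on hermes would be `restrict(levels, "hermes", Access.LIMITED_SSH)`, and the workstation entry stays at whatever level it had.
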
### Restrictions

Access is restricted when:
- Trust falls below threshold after regression branches (Marcus restricts)
- Suspicion is elevated and Priya initiates a review (Priya recommends restriction)
- Risk behavior generates an active flag that triggers a formal access review

Restriction is always communicated through Marcus: "I'm pulling your sudo on
hermes for now. Use the deploy account while I talk to Kowalski." It is reversible
through the access review process.

### Phase Gates

- Phase 1: basic\_user; path to workstation sudo through Q003–Q005
- Phase 2: workstation sudo; hermes sudo via mid-phase grant; read access to vulcan
- Phase 3: full hermes sudo; formal vulcan sudo for investigation work
- Phase 4: investigation-level access for specific tasks (documented, temporary)
- Phase 5: access stable at Phase 4 level
- Phase 6: Q043 reviews and reverts investigation access; access normalized to
ongoing role post-investigation

---

## 8. Boss / Management Pressure Rules

Management pressure is a dynamic constraint, not a scripted event. It operates
through tickets, emails, access changes, priority conflicts, and implied weight —
never through a character becoming a villain or delivering exposition about what's
really happening.

### Phase Scaling

**Phase 1 — Annoying:** Kowalski's weekly status email arrives. It asks broad
questions in bullet points that don't quite match the player's work. Marcus
forwards it without comment. Priya's first shift review is mild. The 2pm Tuesday
calendar block is mentioned in Kowalski's email footer. Nothing is required of
the player.

**Phase 2 — Dismissive:** Kowalski responds to a Marcus CC with "let's make sure
we're documenting this." Marcus's message to the player: "He means well." Nothing
changes operationally. A hint that Kowalski is watching, in the way he always
watches, which is imperfectly.

**Phase 3 — Suspicious:** Q020 is pressure made operational — Kowalski needs a
written status report before a meeting. He doesn't explain the meeting. He doesn't
need to. Priya's shift reviews note things they didn't note before. This is Phase 3:
the player is not being targeted; the audits were already scheduled; the questions
are just more specific now.

**Phase 4 — Monitoring:** Kowalski's emails are shorter. Priya's reviews are more
frequent. Q031 (access review) arrives as a formal document request. Marcus's
messages have stopped including the second sentence. The monitoring is institutional
and impersonal; it applies to everyone with elevated access during this period.

**Phase 5 — Interfering:** Q033 is Kowalski acting directly — a removal request
before the external auditors arrive. The conflict in Q034 is Kowalski-adjacent
(Sarah's urgency puts pressure on the Marcus task). Q038 is time pressure with
an external deadline. Q039's config request may or may not be Kowalski-related;
the player can't know.

**Phase 6 — Outcome-dependent:** Kowalski is either the source of the post-audit
remediation plan (exposure ending), the person who restructured the department
without explanation (corporate_loop), the person who went quiet (burnout), or the
person initiating the access investigation into the player (chaos). His emails in
Phase 6 are consistent with whichever path was taken — no out-of-character
summary, no scene where everything is explained.

### How Pressure Is Applied

Pressure is operational and indirect:

- **Priority conflicts** (Q034) — two things need doing; one has a hard deadline;
the player must triage
- **Status demands** (Q020) — written report required; the work of compiling it
accurately is the pressure
- **Access reviews** (Q031, Q043) — formal process; the player's own actions are
under review; accuracy has professional consequence
- **Removal requests** (Q033) — legitimate operational request that intersects
with active investigation; the player must decide how to handle the intersection
- **Deadline compression** (Q038) — 36 hours; external auditors; real work under
real time pressure
- **The config ticket** (Q039) — not obviously pressure; pressure comes from the
player recognizing what they're being asked to do

### Character Limits

No character becomes a villain. No character delivers exposition about the plot.

Marcus is managing a difficult situation with more context than the player. He does
not share that context. He becomes quieter. He does not become hostile.

Kowalski is managing upward risk. He does not suspect the player. He suspects the
period of time and wants clean documentation. His interventions are institutional.

Priya is doing her job. If the player's access footprint is inconsistent with their
role, she says so — flatly, without drama, without personal weight.

---

## 9. Ending Logic

Endings are evaluated once, after Q048 resolves. They are not triggered by a single
choice; they reflect the accumulated state of all variables and world flags across
the playthrough.

### Evaluation Order

The evaluator checks conditions in this order: `chaos`, then `exposure`, then
`corporate_loop`, then `burnout`. The first condition met determines the ending.
There are no partial matches: each ending has a minimum threshold that must be
crossed, and the result is never a "most likely" vote.

---

### Ending: `exposure`

**Required conditions (all must be true):**
- Curiosity: at least 5 major hooks discovered, including `hook_audit_bridge_package`,
`hook_collect_binary_INT0194`, and at least one of `hook_archive_readme_INT0194`
or `hook_build_patch_INT0194`
- Trust: positive (net `trust_delta` across the playthrough is > 0)
- Risk: below chaos threshold
- World flags: `audit_logs_archived` (Q035 Branch 1); either `package_modification_documented`
or `bridge_service_documented`; and `asset_inventory_reconciled`
- Suspicion: below high threshold

**What it means:** The player investigated carefully, documented thoroughly, and
maintained professional competence throughout. The archived logs are usable evidence.
The investigation record is complete. The audit-bridge operation was identified and
documented, and the evidence was preserved.

**Resolution character content:**
- Marcus's Q044 system state review response is one sentence longer than usual.
- Priya's Phase 6 tickets are collegial in the way that Priya is ever collegial —
precise, complete, no warmth, but not evaluative.
- Kowalski's final email mentions "external review findings that have been addressed
through a compliance process." He uses the word "addressed." He does not say what
was found. That is the company's version of the story.

**Tone:** Not triumphant. The player did their job well and investigated something
they weren't supposed to find, and the company processed it in the way companies
process things. The work continues. That is the realistic version of this ending.

---

### Ending: `corporate_loop`

**Required conditions (all must be true):**
- Obedience: above high threshold (consistent ticket completion, within scope)
- Curiosity: below discovery threshold (few or no major hooks found)
- Trust: positive
- Risk: low

**What it means:** The player was a good sysadmin. They fixed things correctly.
They didn't look at anything they weren't asked to look at. Whether or not the
INT-0194 operation was discovered by other means (Priya independently, the external
auditors, Dale's half-finished notes found by someone else), the player didn't find it.
They don't know what they were inside.

**Resolution character content:**
- Marcus's Q044 response is the same length as always.
- Kowalski's final email mentions "operational restructuring following a compliance
review." No specifics.
- Sarah's final ticket is warm and professional. The demo went fine. Things are
mostly working.

**Tone:** This is the bad ending in the sense that something bad happened and the
player was present but wasn't part of stopping it. It is not the player's fault.
They did their job as it was defined. The question is whether the job as defined
was the whole job.

---

### Ending: `burnout`

**Required conditions:** No threshold met for chaos, exposure, or corporate_loop.
Default ending for inconsistent play — moderate or mixed behavior across the
playthrough, trust neither strongly positive nor strongly negative, no clear
behavioral profile.

**What it means:** The player fixed some things and broke others. They noticed
some things and missed others. They are professionally adequate and personally
uninvested. The world moved on from something they were adjacent to but not
central to.

**Resolution character content:**
- Marcus's Q044 response is functional. "State looks stable."
- Kowalski's final email: "We're moving forward." Full stop.
- No character is warm or cool. Everything is at baseline.

**Tone:** This is the neutral ending. It is not punitive. It is exactly what it
says: burnout. The player did enough. That was, perhaps, enough. Or perhaps not.
The game doesn't say.

---

### Ending: `chaos`

**Required conditions (any of):**
- Risk: above maximum threshold (sustained high-risk behavior, not a single action)
- World flags: two or more of the falsification/omission flags
`access_review_incomplete`, `kowalski_report_sanitized`, and `backup_test_falsified`
- World flag: `final_config_made` (Q039 Branch 3 — the config change was made)
- Suspicion: at maximum (S score above maximum threshold regardless of other variables)

**What it means:** The player's conduct has become part of the problem. Whether
through reckless access, destroyed evidence, falsified documentation, or the final
config change, the player's footprint is now under investigation. The original
operation may or may not have been discovered, but the player's behavior during
the period has been.

**Resolution character content:**
- Priya's Q043 response is brief and procedural.
- Kowalski's final email: "We are conducting a review of access activity during
the period in question. You will be contacted separately." The contact is from
Priya and HR, not from Marcus. Marcus does not send a Q044 message.

**Tone:** Administrative. The player receives an email. There is no scene. There
is no confrontation. The consequence of chaos in Sysadmin Chronicles is an internal
access review, not an explosion. That is correct.

---

### Mixed Behavior Priority

A player with high curiosity AND high obedience: curiosity wins if both reach their
respective thresholds. `exposure` is the result. Obedience makes the record better
— more complete documentation, more accurate reporting — but curiosity determines
the ending route.

A player with high curiosity AND high risk: chaos takes priority if the risk
threshold is crossed, regardless of curiosity or obedience. Knowing something and
acting recklessly about it is not the investigative path; it is chaos.

A player with high obedience AND low trust (regression branches throughout): neither
`corporate_loop` (requires positive trust) nor `exposure` is reached. Default to
`burnout`.

---

## 10. Implementation Notes

### New Fields Required

**On quest objects:**
- `narrative_phase`: string enum — `normal_work`, `unease`, `suspicion`,
`investigation`, `conflict`, `resolution`
- `hidden_hook`: optional object — `hook_id` (string), `discovery_condition`
(what the player must do), `discovery_flag` (world flag set on discovery)
- `behavior_impact`: per-branch object with `curiosity_delta`, `obedience_delta`,
`risk_delta`, `suspicion_delta` — parallel to existing `trust_delta`

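To make the shape of these fields concrete, here is a small Python sketch of a quest fragment and a validator for it. The field names come from the list above; the quest ID `Q000`, the hook and flag names, the delta values, the branch-keyed layout of `behavior_impact`, and the `validate_new_fields` helper are all hypothetical placeholders. The real schema lives in QUEST_AUTHORING.md and may nest these fields differently.

```python
NARRATIVE_PHASES = {
    "normal_work", "unease", "suspicion",
    "investigation", "conflict", "resolution",
}
DELTA_KEYS = {"curiosity_delta", "obedience_delta", "risk_delta", "suspicion_delta"}
HOOK_KEYS = ("hook_id", "discovery_condition", "discovery_flag")


def validate_new_fields(quest):
    """Return a list of problems found in the new quest fields (empty if none)."""
    problems = []
    if quest.get("narrative_phase") not in NARRATIVE_PHASES:
        problems.append("narrative_phase is missing or not a known phase")
    hook = quest.get("hidden_hook")  # optional, so None is acceptable
    if hook is not None:
        for key in HOOK_KEYS:
            if key not in hook:
                problems.append(f"hidden_hook is missing {key}")
    for branch, impact in quest.get("behavior_impact", {}).items():
        unknown = set(impact) - DELTA_KEYS
        if unknown:
            problems.append(f"{branch}: unknown keys {sorted(unknown)}")
    return problems


# Illustrative quest fragment; every value here is a placeholder.
example_quest = {
    "id": "Q000",
    "narrative_phase": "unease",
    "hidden_hook": {
        "hook_id": "hook_example",
        "discovery_condition": "player lists the full directory",
        "discovery_flag": "example_hook_seen",
    },
    "behavior_impact": {
        "branch_1": {"curiosity_delta": 2, "obedience_delta": 1,
                     "risk_delta": 0, "suspicion_delta": 0},
    },
}
```

Calling `validate_new_fields(example_quest)` returns an empty list; a quest with an unknown `narrative_phase` returns a one-item list naming the problem.
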
**New global state fields:**
- `curiosity`: numeric, non-decaying
- `obedience`: numeric, non-decaying
- `risk`: numeric; decays partially in Phase 6 when the Q043 self-audit is accurate
- `suspicion`: numeric, increases and decreases per rules in Section 6
- `access_level`: object, per-machine — `{ workstation: "sudo", web_server: "sudo", build_machine: "basic_user" }`
- `hidden_hooks_discovered`: string array of discovered hook IDs

**Ending evaluator:** Post-Q048, reads all accumulated state, applies priority
order (chaos → exposure → corporate_loop → burnout), outputs ending ID.

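The evaluator can be sketched as a chain of predicates checked in priority order. This Python version is illustrative only. The curiosity (~20), obedience (~25), and risk (~20) thresholds echo the calibration notes in this section; the suspicion thresholds, `RISK_LOW`, the `GameState` class, and the reading of "below discovery threshold" as "curiosity under the exposure threshold" are assumptions. The falsification-flag rule follows the "two or more" wording in Section 9.

```python
from dataclasses import dataclass, field

EXPOSURE_CURIOSITY = 20
LOOP_OBEDIENCE = 25
CHAOS_RISK = 20
RISK_LOW = 5          # assumed
SUSPICION_HIGH = 10   # assumed
SUSPICION_MAX = 15    # assumed
MAJOR_HOOKS_REQUIRED = 5
FALSIFICATION_FLAGS = {
    "access_review_incomplete", "kowalski_report_sanitized", "backup_test_falsified",
}


@dataclass
class GameState:
    curiosity: int = 0
    obedience: int = 0
    risk: int = 0
    suspicion: int = 0
    trust: int = 0
    major_hooks: set = field(default_factory=set)
    world_flags: set = field(default_factory=set)


def is_chaos(s):
    # Any one condition is enough.
    return (
        s.risk > CHAOS_RISK
        or len(s.world_flags & FALSIFICATION_FLAGS) >= 2
        or "final_config_made" in s.world_flags
        or s.suspicion > SUSPICION_MAX
    )


def is_exposure(s):
    # Every condition must hold.
    return (
        s.curiosity >= EXPOSURE_CURIOSITY
        and len(s.major_hooks) >= MAJOR_HOOKS_REQUIRED
        and {"hook_audit_bridge_package", "hook_collect_binary_INT0194"} <= s.major_hooks
        and bool(s.major_hooks & {"hook_archive_readme_INT0194", "hook_build_patch_INT0194"})
        and s.trust > 0
        and s.risk <= CHAOS_RISK
        and s.suspicion < SUSPICION_HIGH
        and "audit_logs_archived" in s.world_flags
        and bool(s.world_flags & {"package_modification_documented", "bridge_service_documented"})
        and "asset_inventory_reconciled" in s.world_flags
    )


def is_corporate_loop(s):
    return (
        s.obedience > LOOP_OBEDIENCE
        and s.curiosity < EXPOSURE_CURIOSITY
        and s.trust > 0
        and s.risk <= RISK_LOW
    )


def evaluate_ending(s):
    """First matching ending wins; burnout is the default."""
    if is_chaos(s):
        return "chaos"
    if is_exposure(s):
        return "exposure"
    if is_corporate_loop(s):
        return "corporate_loop"
    return "burnout"
```

Because `evaluate_ending` is a pure function of the state, the same function could back a "current ending route" debug display by running it against the live state at any point.
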
### Existing Systems Preserved

Everything from QUEST_AUTHORING.md is preserved without modification:
- JSON quest schema, ticket linking, baseline snapshots
- `clue_fingerprint` as advisory documentation
- `solution_branches` with `priority`, `trust_delta`, `world_flags`,
`follow_up_dialogue`, `follow_up_incident`, `follow_up_ticket`
- `pressure_profile` (now maps to narrative phase scaling)
- `blast_radius`, `unlock_requirements`
- All validation rule types (`file_contains`, `service_state`, `command_assert`, etc.)
- VM prep scripts at `tools/vm/quest-prep/QXXX-prep.sh`
- Observed-state validation — no change

### Hidden Hook Detection

This is the most technically uncertain new requirement. Three viable approaches:

**Approach 1 — State change detection (recommended):** Each hook requires the player
to take an action that leaves a detectable state change. For example, the Q001 hook
(Dale's SSH key) is set when the player modifies `authorized_keys` in a way that
preserves the existing entry rather than overwriting it, which is detectable via
`file_contains` on the Dale key fingerprint after the quest validates. The Q008 hook
(audit-bridge package) is set by a `command_assert` that checks whether the player
ran a listing command on the full repo package directory rather than just the
missing package.

Hooks that don't have an obvious state-change trigger need one designed in during
prep script authoring — e.g., a breadcrumb file the player's investigation would
naturally create (`/tmp/hook-Q005-root-history-read`, created when the player runs
`cat /root/.bash_history`, detectable by the VM's audit system if enabled).

**Approach 2 — VM audit logging (more accurate, higher implementation cost):**
Enable `auditd` on VMs with hook quests. Configure audit rules to detect file reads
on specific paths. The hook evaluator reads the audit log rather than checking state.

**Approach 3 — Hint system integration (simplest, loses nuance):** Hooks are set
when the player selects an optional dialogue hint from Marcus or Priya that implies
they noticed something. Loses the "player behavior" quality of the hook system.

**Recommendation:** Approach 1 for Phase 1–2 hooks. Approach 2 for Phase 3–4 hooks
where the detection needs to be more precise. Approach 3 is not recommended.

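Approach 1 amounts to running a state check per hook after the quest validates. The Python sketch below shows the idea with a stand-in for the `file_contains` rule; the hook ID, the key comment `dale@example`, the file layout, and the `detect_hooks` helper are hypothetical, and the real validator may work quite differently.

```python
import tempfile
from pathlib import Path


def file_contains(path, needle):
    """Stand-in for a file_contains rule: True if `needle` appears in the file."""
    p = Path(path)
    return p.is_file() and needle in p.read_text(errors="replace")


def detect_hooks(hook_rules, root):
    """Return the hook IDs whose state-change condition holds under `root`."""
    found = set()
    for hook_id, (rel_path, needle) in hook_rules.items():
        if file_contains(Path(root) / rel_path, needle):
            found.add(hook_id)
    return found


# Hypothetical rule: the hook fires if the old key entry is still present.
HOOK_RULES = {
    "hook_example_dale_key": ("home/player/.ssh/authorized_keys", "dale@example"),
}

with tempfile.TemporaryDirectory() as root:
    keys = Path(root) / "home/player/.ssh/authorized_keys"
    keys.parent.mkdir(parents=True)

    # Player appended their key and preserved the existing entry.
    keys.write_text("ssh-ed25519 AAAA dale@example\nssh-ed25519 BBBB player@example\n")
    preserved = detect_hooks(HOOK_RULES, root)

    # Player overwrote the file, so the old entry is gone.
    keys.write_text("ssh-ed25519 BBBB player@example\n")
    overwritten = detect_hooks(HOOK_RULES, root)
```

The limitation is visible here: a state check can tell that the entry survived, but not whether the player actually read it, which is the gap Approach 2 is meant to close.
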
### Behavior Impact Calibration

Curiosity thresholds for `exposure` ending require at least 5 major hooks. With the
hooks as defined, maximum curiosity from hooks alone is approximately 30–35 points.
Branch-level curiosity from cross-referencing adds another 10–15 for thorough players.
Set `exposure` threshold at ~20 curiosity points with required major-hook flags —
this means a player cannot reach `exposure` by curiosity branching alone without
actually finding the hooks.

Obedience for `corporate_loop` should be reachable by a player who takes clean
branches consistently. Maximum obedience from clean branches is approximately
30–35 points across 48 quests. Set `corporate_loop` threshold at ~25.

Risk for `chaos` should require sustained high-risk behavior across multiple phases —
not a single bad decision. Set the chaos risk threshold at approximately 20 risk
points (e.g., 4 high-risk actions of +5 each, or 8 moderate-risk actions of +2–3).
A single reckless action should not route a player to `chaos`.

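The calibration numbers above can be sanity-checked with a few lines of arithmetic. This Python sketch uses only the figures stated in this subsection; the `crosses` helper and the inclusive/exclusive distinction are assumptions added to highlight one open question, namely whether a threshold is met at 20 or only above 20.

```python
CHAOS_RISK_THRESHOLD = 20
EXPOSURE_CURIOSITY_THRESHOLD = 20


def crosses(points, threshold, inclusive):
    """Compare a total against a threshold, inclusively or strictly."""
    return points >= threshold if inclusive else points > threshold


high_risk_total = 4 * 5        # four high-risk actions at +5 each
moderate_low_total = 8 * 2     # eight moderate actions at +2 each
moderate_high_total = 8 * 3    # eight moderate actions at +3 each
single_reckless = 1 * 5        # one high-risk action

branch_curiosity_max = 15      # upper end of the 10-15 branch estimate
hook_curiosity_min = 30        # lower end of the 30-35 hook estimate

for label, points in [
    ("4 x +5", high_risk_total),
    ("8 x +2", moderate_low_total),
    ("8 x +3", moderate_high_total),
    ("1 x +5", single_reckless),
]:
    print(label, points,
          "inclusive:", crosses(points, CHAOS_RISK_THRESHOLD, True),
          "strict:", crosses(points, CHAOS_RISK_THRESHOLD, False))
```

Two things fall out. Four +5 actions land exactly on 20, so they trigger `chaos` only if the comparison is inclusive, while Section 9 describes the condition as "above" the threshold. Eight moderate actions at +2 total 16 and fall short either way, so only the upper end of the +2–3 range reaches the threshold. On the curiosity side, the branch-only maximum of 15 stays under the ~20 threshold, which matches the stated intent.
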
### Phase Gating

Phase advancement is triggered by:
- Completion of a minimum number of quests in the prior phase (6/8 minimum, 8/8
preferred; the QuestDirector tracks completion)
- Specific world flags from key quests in the prior phase (e.g., Phase 3 requires
at least `unknown_ip_auth_documented` or `hermes_nginx_config_audited` from
Phase 2)
- Trust remaining positive, as a soft gate only (a player whose trust has collapsed
is gated on access; the phase still advances, but some quests may be locked behind
access requirements)

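A minimal Python sketch of the gate check, under the reading that quest count and key flags are hard requirements while trust only affects access. The `can_advance` function and its parameters are hypothetical; the 6-of-8 minimum and the two Phase 2 flag names come from the list above.

```python
def can_advance(completed, total, flags, required_any, min_completed=6):
    """True when enough prior-phase quests are done and a key flag is set.

    Trust is deliberately not an input: a collapsed-trust player still
    advances and is limited through access instead.
    """
    enough_quests = min_completed <= completed <= total
    has_key_flag = not required_any or bool(set(flags) & set(required_any))
    return enough_quests and has_key_flag


# Either flag from Phase 2 is enough to open Phase 3.
PHASE_3_FLAGS = {"unknown_ip_auth_documented", "hermes_nginx_config_audited"}
```

For example, `can_advance(6, 8, {"hermes_nginx_config_audited"}, PHASE_3_FLAGS)` is true, while eight completed quests with neither flag set does not advance.
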
### Character Name Canon

Canonical Priya references:
- Name: Priya Nair
- Email: `p.nair@axiomworks.internal`
- Files requiring update: `server/src/services/EmailService.js`, `content/tickets/T007.json`,
`content/docs/onboarding.json`
- Any reference to "Priya Kapoor" or "Priya Singh" is the same person; update to Priya Nair

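The rename is mechanical enough to script. A small Python sketch, assuming the two variant surnames listed above are the only ones in use; the `normalize_priya` helper is hypothetical, and how it gets applied to the three files is left to whoever makes the change.

```python
import re

# Matches only the two known variant names, on word boundaries.
VARIANTS = re.compile(r"\bPriya (?:Kapoor|Singh)\b")


def normalize_priya(text):
    """Return (updated_text, replacements_made) with variants set to Priya Nair."""
    return VARIANTS.subn("Priya Nair", text)
```

Returning the replacement count makes it easy to confirm that a second pass over the same file reports zero changes.
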
### Debug Tooling

Per the intent of SPEC_LOCK.md section 4, the debug tooling should expose:
- Current values of: `curiosity`, `obedience`, `risk`, `suspicion`, `trust`
- Current access level per machine
- All world flags set (with quest of origin)
- All hidden hooks discovered
- Current ending route (which ending would fire if the game ended now)
- Audit log of all trust_delta and behavior_impact events with quest ID

The "current ending route" display is especially useful for QA and balance testing —
showing designers which ending a playthrough is tracking toward at any point.

---

*End of Sysadmin Chronicles — Full Quest & Story Redesign (REVISED)*
*This document supersedes the previous version in full.*
*Binding against SPEC_LOCK.md.*