Pi-Kit DietPi Image Prep Spec
This file defines how to design a prep script for a DietPi-based Pi-Kit image.
The script’s job:
Prepare a running Pi-Kit system to be cloned as a “golden image” without removing any intentional software, configs, hostname, or passwords.
0. Context & Goals
Starting point
- OS: DietPi (Debian-based), already installed.
- Extra software: web stack, Pi-Kit dashboard, DNS/ad-blocker, DBs, monitoring, etc.
- System has been used for testing (logs, histories, test data, junk).
Goal
- Prepare system for cloning as a product image.
- KEEP:
- All intentionally installed packages/software.
- All custom configs (web, apps, DietPi configs, firewall).
- Current hostname.
- Existing passwords (system + services) as shipped defaults.
- RESET/CLEAR:
- Host-unique identity data (machine-id, SSH host keys, etc.).
- Logs, histories, caches.
- Test/personal accounts and data.
1. Discovery Phase (MUST HAPPEN BEFORE SCRIPT DESIGN)
Before writing any code, inspect the system and external docs.
The AI MUST:
Detect installed components
- Determine which key packages/services are present, e.g.:
- Web server (nginx, lighttpd, apache2, etc.).
- DNS/ad-blocker (Pi-hole or similar).
- DB engines (MariaDB, PostgreSQL, SQLite usage).
- Monitoring/metrics (Netdata, Uptime Kuma, etc.).
- Use this to decide which cleanup sections apply.
Verify paths/layouts
- For each service or category:
- Confirm relevant paths/directories actually exist.
- Do not assume standard paths without checking.
- Example: Only treat /var/log/nginx as Nginx logs if:
- Nginx is installed, AND
- That directory exists.
Consult upstream docs (online)
- Check current:
- DietPi docs and/or DietPi GitHub.
- Docs for major services (e.g. Pi-hole, Nginx, MariaDB, etc.).
- Use docs to confirm:
- Data vs config locations.
- Safe cache/log cleanup methods.
- Prefer documented behavior over guesses.
Classify actions
- For each potential cleanup:
- Mark as safe if clearly understood and documented.
- Mark as uncertain if layout deviates or docs are unclear.
- Plan to:
- Perform safe actions.
- Skip uncertain actions and surface them for manual review.
Fail safe
- If something doesn’t match expectations:
- Do NOT plan a destructive operation on it.
- Flag it as “needs manual review” in the confirmation phase.
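The "installed AND directory exists" rule and the safe/uncertain classification can be sketched as a small check. This is a minimal sketch, not a required design: the function name `classify_cleanup` and the three result strings are hypothetical, and treating "executable found on PATH" as "installed" is an assumption that discovery may need to replace with a package-manager query.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def classify_cleanup(
    binary: str,
    path: Path,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Classify one cleanup candidate as 'safe', 'not applicable', or 'uncertain'."""
    installed = which(binary) is not None  # assumption: on PATH means installed
    present = path.is_dir()                # does the expected directory exist?
    if installed and present:
        return "safe"
    if not installed and not present:
        return "not applicable"
    # Mismatch (e.g. a directory left behind by a removed package): never guess.
    return "uncertain"
```

For example, `classify_cleanup("nginx", Path("/var/log/nginx"))` returns "safe" only when both conditions hold. An "uncertain" result is surfaced for manual review and never acted on.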
2. Identity & Host-Specific Secrets
DO NOT CHANGE:
- Hostname (whatever it currently is).
- Any existing passwords (system or service-level) that are part of the appliance defaults.
RESET/CLEAR:
Machine identity
- Clear:
- /etc/machine-id
- /var/lib/dbus/machine-id (if present)
- Rely on OS to recreate them on next boot.
Random seed
- Clear persisted random seed (e.g. /var/lib/systemd/random-seed) so each clone gets unique entropy.
SSH host keys
- Remove all SSH host key files (server keys only).
- Leave user SSH keypairs unless explicitly identified as dev/test and safe to remove.
SSH known_hosts
- Clear known_hosts for:
- root
- dietpi (or primary DietPi user)
- Any other persistent users
VPN keys (conditional)
- If keys are meant to be unique per device:
- Remove WireGuard/OpenVPN private keys and per-device configs embedding them.
- If the design requires fixed server keys:
- KEEP server keys.
- REMOVE test/client keys/profiles that are tied to dev use.
TLS certificates (conditional)
- REMOVE:
- Let’s Encrypt/ACME certs tied to personal domains.
- Per-device self-signed certs that should regenerate.
- KEEP:
- Shared CAs/certs only if explicitly part of product design.
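The unconditional resets in this section (machine-id, random seed, SSH host keys) can be sketched as follows. This is a sketch under stated assumptions: the function name `reset_identity` and the `root` parameter (which lets the logic be exercised against a scratch tree) are hypothetical, and the /etc/ssh layout assumes OpenSSH; a Dropbear install keeps its host keys elsewhere and needs its own check. The machine-id file is emptied rather than deleted, which is the form the systemd machine-id documentation describes for generic images.

```python
from pathlib import Path


def reset_identity(root: Path) -> list[str]:
    """Reset host-unique identity files below `root`; return a log of actions."""
    actions = []

    machine_id = root / "etc/machine-id"
    if machine_id.is_file() and machine_id.stat().st_size > 0:
        machine_id.write_text("")  # empty the file but keep it in place
        actions.append(f"emptied {machine_id}")

    dbus_id = root / "var/lib/dbus/machine-id"
    # Often a symlink to /etc/machine-id; only remove a real, separate copy.
    if dbus_id.is_file() and not dbus_id.is_symlink():
        dbus_id.unlink()
        actions.append(f"removed {dbus_id}")

    seed = root / "var/lib/systemd/random-seed"
    if seed.is_file():
        seed.unlink()
        actions.append(f"removed {seed}")

    ssh_dir = root / "etc/ssh"
    if ssh_dir.is_dir():
        # Server host keys only: ssh_host_<type>_key and the matching .pub file.
        for key in sorted(ssh_dir.glob("ssh_host_*_key*")):
            key.unlink()
            actions.append(f"removed {key}")
    return actions
```

A second run returns an empty list, which matches the idempotency requirement. Whether host keys are regenerated automatically on first boot should be confirmed during discovery; if nothing does so, a first-boot hook could run `ssh-keygen -A`.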
3. Users & Personal Traces
Accounts
- KEEP:
- Accounts that are part of the product.
- REMOVE:
- Test-only accounts (users created for dev/debug).
Shell histories
- Clear shell histories for all remaining users: root, dietpi, others that stay.
Home directories
- For users that remain:
- KEEP:
- Intentional config/dotfiles (shell rc, app config, etc.).
- REMOVE:
- Downloads, random files, scratch notes.
- Editor backup/swap files, stray temp files.
- Debug dumps, one-off scripts not part of product.
- For users that are removed:
- Delete their home dirs entirely.
SSH client keys
- REMOVE:
- Clearly personal/test keys (e.g. with your email in comments).
- KEEP:
- Only keys explicitly required by product design.
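Clearing shell histories for the users that stay can be sketched like this. The function name `clear_histories` and the list of history file names are assumptions; the real list should come from checking which shells and tools are actually in use.

```python
from pathlib import Path

# Assumed history file names; extend after checking which shells are in use.
HISTORY_FILES = (".bash_history", ".zsh_history", ".python_history", ".lesshst")


def clear_histories(home_dirs: list[Path]) -> list[Path]:
    """Truncate known history files in each home directory; return the files cleared."""
    cleared = []
    for home in home_dirs:
        if not home.is_dir():
            continue  # no such home directory: skip instead of failing
        for name in HISTORY_FILES:
            history = home / name
            # Skip symlinks so a link never redirects the write elsewhere.
            if history.is_file() and not history.is_symlink():
                history.write_text("")  # keep file, owner and mode; drop contents
                cleared.append(history)
    return cleared
```

The shell that launches the prep script typically writes its own history on exit, so this step is best run last or from a session with history disabled.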
4. Logs & Telemetry
System logs
- Clear:
- Systemd journal (persistent logs).
- /var/log files + rotated/compressed variants, where safe.
Service logs
- For installed services (web servers, DNS/ad-blockers, DBs, etc.):
- Clear their log files and rotated versions.
Monitoring/metrics
- For tools like Netdata, Uptime Kuma, etc.:
- KEEP:
- Config, target definitions.
- CLEAR:
- Historical metric/alert data (TSDBs, history files, etc.).
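For a log directory that discovery has already verified, the clearing step can be sketched as below. The function name `clear_log_dir` and the suffix pattern are assumptions: live logs are taken to end in `.log`, and rotated copies in a number or a compression suffix. Anything else, including date-stamped rotations, is left untouched. Live logs are truncated rather than deleted so that a daemon finds its file, owner and mode unchanged.

```python
import re
from pathlib import Path

# Assumed rotation suffixes: app.log.1, app.log.2.gz, app.log.gz, ...
ROTATED = re.compile(r"\.(\d+|gz|bz2|xz|zst)$")


def clear_log_dir(log_dir: Path) -> dict[str, int]:
    """Truncate live logs and delete rotated copies directly inside `log_dir`."""
    counts = {"truncated": 0, "removed": 0}
    if not log_dir.is_dir():
        return counts  # not found: skip, do not fail
    for entry in sorted(log_dir.iterdir()):
        if not entry.is_file() or entry.is_symlink():
            continue  # leave subdirectories and links alone
        if ROTATED.search(entry.name):
            entry.unlink()  # rotated or compressed variant
            counts["removed"] += 1
        elif entry.suffix == ".log" and entry.stat().st_size > 0:
            entry.write_text("")  # live log: keep the file, drop contents
            counts["truncated"] += 1
    return counts
```

The systemd journal is not covered by this sketch; `journalctl` provides `--rotate` and `--vacuum-time` options for that purpose.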
5. Package Manager & Caches
APT
- Clear:
- Downloaded .deb archives.
- Safe APT caches (as per documentation).
Other caches
- Under /var/cache and ~/.cache:
- CLEAR:
- Caches known to be safe and auto-regenerated.
- DO NOT CLEAR:
- Caches that are required for correct functioning or very expensive to rebuild, unless docs confirm safety.
Temp directories
- Empty:
- /tmp
- /var/tmp
Crash dumps
- Remove crash dumps and core files (e.g. /var/crash and similar locations).
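"Empty" here means removing the contents while keeping the directory itself, so that its ownership and mode (for example the sticky bit on /tmp) survive. A minimal sketch, with the hypothetical helper name `empty_directory`:

```python
import shutil
from pathlib import Path


def empty_directory(directory: Path) -> int:
    """Delete everything inside `directory` but keep the directory itself."""
    if not directory.is_dir() or directory.is_symlink():
        return 0  # missing or unexpected layout: skip
    removed = 0
    for entry in list(directory.iterdir()):
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # real subdirectory: remove the whole tree
        else:
            entry.unlink()  # file or symlink: remove the entry only
        removed += 1
    return removed
```

Symlinks are unlinked rather than followed, so a link pointing outside the directory never causes data elsewhere to be deleted. For the APT part of this section, `apt-get clean` removes the downloaded .deb archives.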
6. Service Data vs Config (Per-App Logic)
General rule:
Keep configuration & structure. Remove dev/test data, history, and personal content.
The AI must apply this using detected services + docs.
6.1 Web Servers (nginx / lighttpd / apache2)
- KEEP:
- Main config and site configs that define Pi-Kit behavior.
- App code in /var/www/... (or equivalent Pi-Kit web root).
- CLEAR:
- Access/error logs.
- Non-critical caches if docs confirm they’re safe to recreate.
6.2 DNS / Ad-blockers (Pi-hole or similar)
- KEEP:
- Upstream DNS settings.
- Blocklists / adlists / local DNS overrides.
- DHCP config if it is part of the product’s behavior.
- CLEAR:
- Query history / statistics DB.
- Log files.
- DO NOT:
- Change the current admin password (it is the product default).
6.3 Databases (MariaDB, PostgreSQL, SQLite, etc.)
- KEEP:
- DB schema.
- Seed/default data required for every user.
- REMOVE/RESET:
- Dev/test user accounts (with your email, etc.).
- Test content/records not meant for production image.
- Access tokens, session records, API keys tied to dev use.
- For SQLite-based apps:
- Decide per app (based on docs) whether to:
- Ship a pre-seeded “clean” DB, OR
- Let it auto-create DB on first run.
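For a SQLite-based app that ships a pre-seeded DB, "keep schema, remove test records" can be sketched as deleting rows from specific tables and then compacting the file. The function name `clear_tables` is hypothetical, and which tables hold dev/test data is an assumption that must come from the app's own documentation. Tables that do not exist are skipped rather than guessed at.

```python
import sqlite3
from pathlib import Path


def clear_tables(db_path: Path, tables: list[str]) -> dict[str, int]:
    """Delete all rows from the named tables, keeping the schema; return rows removed."""
    if not db_path.is_file():
        return {}  # connecting would create an empty DB, so skip instead
    removed = {}
    conn = sqlite3.connect(db_path)
    try:
        existing = {
            row[0]
            for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
        }
        for table in tables:
            if table not in existing:
                continue  # layout differs from expectations: skip
            quoted = '"' + table.replace('"', '""') + '"'  # quote the identifier
            removed[table] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
            conn.execute(f"DELETE FROM {quoted}")
        conn.commit()
        conn.execute("VACUUM")  # rebuild the file so deleted rows do not linger in it
    finally:
        conn.close()
    return removed
```

The owning service should be stopped before its database file is modified.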
6.4 Other services (Nextcloud, Jellyfin, Gotify, Uptime Kuma, etc.)
For each detected service:
- KEEP:
- Global config, ports, base URLs, application settings needed for Pi-Kit.
- CLEAR:
- Personal/dev user accounts.
- Your media/content (unless intentionally shipping sample content).
- Notification endpoints tied to your own email / Gotify / Telegram, unless explicitly desired.
If docs or structure are unclear, mark cleanup as uncertain and surface in confirmation instead of guessing.
7. Networking & Firewall
HARD CONSTRAINTS:
- Do NOT modify hostname.
- Do NOT weaken/remove the product firewall rules.
Firewall
- Detect firewall system in use (iptables, nftables, UFW, etc.).
- KEEP:
- All persistent firewall configs that define Pi-Kit’s security behavior.
- DO NOT:
- Flush or reset firewall rules unless it’s clearly a dev-only configuration (and that’s confirmed).
Other networking state
- Safe to CLEAR:
- DHCP lease files.
- DNS caches.
- DO NOT ALTER:
- Static IP/bridge/VLAN config that appears to be part of the intended appliance setup.
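One way to enforce the hard constraints is a guard that every planned deletion must pass before it runs. The sketch below is an assumption-heavy illustration: the name `violates_hard_constraint` is hypothetical, and the protected paths listed are common Debian locations that discovery must confirm or replace for the actual system.

```python
from pathlib import Path

# Assumed locations of hostname and firewall state; confirm during discovery.
PROTECTED = (
    Path("/etc/hostname"),
    Path("/etc/nftables.conf"),
    Path("/etc/iptables"),
    Path("/etc/ufw"),
)


def violates_hard_constraint(target: Path) -> bool:
    """True if `target` is, contains, or lies inside a protected path."""
    for protected in PROTECTED:
        if (
            target == protected
            or protected in target.parents   # target lies inside a protected dir
            or target in protected.parents   # target would take a protected path with it
        ):
            return True
    return False
```

The comparison is purely lexical, so targets should be absolute, normalized paths before they are checked.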
8. DietPi-Specific State & First-Boot Behavior
DietPi automation/config
- Identify DietPi automation configuration (e.g. dietpi.txt, related files).
- KEEP:
- The intended defaults (locale, timezone, etc.).
- Any automation that is part of Pi-Kit behavior.
- AVOID:
- Re-triggering DietPi’s generic first-boot flow unless that is intentionally desired.
DietPi logs/temp
- CLEAR:
- DietPi-specific logs and temp files.
- KEEP:
- All DietPi configuration and automation files.
Pi-Kit first-boot logic
- Ensure any Pi-Kit specific first-run services/hooks are:
- Enabled.
- Not dependent on data being cleaned (e.g., they must not require removed dev tokens/paths).
9. Shell & Tooling State
Tool caches
- For root and main user(s), CLEAR:
- Safe caches in ~/.cache (pip, npm, cargo, etc.), if not needed at runtime.
- Avoid clearing caches that are critical or painful to rebuild unless doc-backed.
Build artifacts
- REMOVE:
- Source trees, build directories, and other dev artifacts that are not part of final product.
Cronjobs / timers
- Audit:
- User crontabs.
- System crontabs.
- Systemd timers.
- KEEP:
- Jobs that are part of Pi-Kit behavior.
- REMOVE:
- Jobs/timers clearly used for dev/testing only.
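The cron part of the audit is read-only and can be sketched as a listing of active crontab lines for human review. The function name `list_cron_jobs` and the `root` parameter are hypothetical; the directories are the usual Debian locations and should be verified on the actual system.

```python
from pathlib import Path


def list_cron_jobs(root: Path) -> dict[str, list[str]]:
    """Collect active (non-comment, non-blank) crontab lines below `root` for review."""
    candidates = [root / "etc/crontab"]
    for directory in (root / "etc/cron.d", root / "var/spool/cron/crontabs"):
        if directory.is_dir():
            candidates.extend(sorted(p for p in directory.iterdir() if p.is_file()))
    jobs = {}
    for crontab in candidates:
        if not crontab.is_file():
            continue  # file not present: nothing to report
        active = [
            line.strip()
            for line in crontab.read_text(errors="replace").splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        ]
        if active:
            jobs[str(crontab.relative_to(root))] = active
    return jobs
```

Nothing is removed here; deciding which jobs are dev-only belongs in the confirmation step. Systemd timers can be listed separately with `systemctl list-timers --all`.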
10. Implementation Requirements (For the Future Script)
When generating the actual script, the AI MUST:
Error handling
- Check exit statuses where relevant.
- Handle missing paths/directories gracefully:
- If a path doesn’t exist, skip and log; do not fail hard.
- Avoid broadly destructive operations without validation:
- No “blind” deletions on unverified globs.
Idempotency
- Script can run multiple times without progressively breaking the system.
- After repeated runs, image should remain valid and “clean”.
Conservative behavior
- If uncertain about an operation:
- Do NOT perform it.
- Log a warning and mark for manual review.
Logging
- For each major category (identity, logs, caches, per-service cleanup, etc.):
- Log what was targeted and outcome:
- cleaned
- skipped (not installed/not found)
- skipped (uncertain; manual review)
- Provide a summary at the end.
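The logging requirement can be sketched as a small recorder that accepts only the three outcomes named above and produces the end-of-run summary. The class name `CleanupLog` and its methods are hypothetical; a shell implementation would follow the same shape.

```python
from collections import Counter
from dataclasses import dataclass, field

CLEANED = "cleaned"
SKIPPED_MISSING = "skipped (not installed/not found)"
SKIPPED_UNCERTAIN = "skipped (uncertain; manual review)"


@dataclass
class CleanupLog:
    """Collects (category, target, outcome) entries and summarizes them."""

    entries: list[tuple[str, str, str]] = field(default_factory=list)

    def record(self, category: str, target: str, outcome: str) -> None:
        """Log one targeted item; reject outcomes the spec does not define."""
        if outcome not in (CLEANED, SKIPPED_MISSING, SKIPPED_UNCERTAIN):
            raise ValueError(f"unknown outcome: {outcome}")
        self.entries.append((category, target, outcome))
        print(f"[{category}] {target}: {outcome}")

    def summary(self) -> dict[str, int]:
        """Count entries per outcome for the end-of-run summary."""
        return dict(Counter(outcome for _, _, outcome in self.entries))
```

Rejecting unknown outcome strings keeps the summary limited to the categories a reviewer expects to see.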
11. Mandatory Pre-Script Confirmation Step
Before writing any script, the AI MUST:
Present a system-specific plan
- Based on discovery + docs, list:
- Exactly which paths, files, DBs, and data types it intends to:
- Remove
- Reset
- Leave untouched
- For each item or group: a short explanation of why.
Highlight conflicts / ambiguities
- If any cleanup might:
- Affect passwords,
- Affect hostname,
- Affect firewall rules,
- Or contradict this spec in any way,
- The AI must:
- Call it out explicitly.
- Explain tradeoffs and propose a safe option.
Highlight extra opportunities
- If the AI finds additional cleanup opportunities not explicitly listed here (e.g., new DietPi features, new log paths):
- Describe them clearly.
- Explain pros/cons of adding them.
- Ask whether to include them.
Wait for explicit approval
- Do NOT generate the script until:
- The user (me) has reviewed the plan.
- Conflicts and extra opportunities have been discussed.
- Explicit approval (with any modifications) has been given.
Only after that confirmation may the AI produce the actual prep script.
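The confirmation step can be pictured as a plan grouped by intended action, plus a gate that accepts nothing short of explicit approval. This is an illustrative sketch only: `PlanItem`, `render_plan`, `approved`, and the literal reply "approve" are all hypothetical choices, not part of this spec.

```python
from dataclasses import dataclass

ACTIONS = ("remove", "reset", "leave untouched")


@dataclass(frozen=True)
class PlanItem:
    target: str  # path, file, DB, or data type
    action: str  # one of ACTIONS
    reason: str  # short explanation shown to the reviewer


def render_plan(items: list[PlanItem]) -> str:
    """Group plan items by action so the reviewer sees each list separately."""
    lines = []
    for action in ACTIONS:
        lines.append(f"{action.upper()}:")
        matching = [item for item in items if item.action == action]
        for item in matching:
            lines.append(f"- {item.target}: {item.reason}")
        if not matching:
            lines.append("- (nothing)")
    return "\n".join(lines)


def approved(reply: str) -> bool:
    """Only an explicit 'approve' counts; silence or a vague reply does not."""
    return reply.strip().lower() == "approve"
```

Treating every reply other than the explicit keyword as "not approved" mirrors the conservative default used elsewhere in this spec.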