Diffstat (limited to 'AGENTS.md')
| -rw-r--r-- | AGENTS.md | 304 |
1 files changed, 304 insertions, 0 deletions
diff --git a/AGENTS.md b/AGENTS.md
new file mode 100644
index 0000000..97b3aab
--- /dev/null
+++ b/AGENTS.md
@@ -0,0 +1,304 @@
+# CLAUDE.md
+
+## Project Overview
+
+Witryna is a minimalist Git-based static site deployment orchestrator. It listens for webhook triggers, pulls Git repositories, runs containerized build commands, and publishes static assets via atomic symlink switching. Following the Unix philosophy: "Do one thing and do it well."
+
+## Design Philosophy
+
+This project follows a **minimal philosophy**: software should be simple, minimal, and frugal.
+
+- **No feature creep.** Only add functionality that serves the core mission. If a feature is "nice to have" but not essential, leave it out.
+- **Minimal dependencies.** Every external crate is a liability — it adds compile time, attack surface, and maintenance burden. Prefer the standard library and existing dependencies over pulling in new ones. Justify any new dependency before adding it.
+- **No over-engineering.** Write the simplest code that solves the problem. Avoid abstractions, indirection, and generalization unless there is a concrete, present need. Three similar lines are better than a premature abstraction.
+- **Small, auditable codebase.** The entire program should be understandable by a single person. Favour clarity and brevity over cleverness.
+- **Lean runtime.** Witryna delegates heavy lifting to external tools (Git, Podman/Docker, the OS). It does not reimplement functionality that already exists in well-tested programs.
+
+## Commands
+
+### CLI Subcommands
+
+```bash
+witryna serve                          # Start the deployment server
+witryna validate                       # Validate config and print summary
+witryna run <site> [-v]                # One-off build (synchronous)
+witryna status [-s <site>] [--json]    # Deployment status
+# Config discovery: ./witryna.toml → $XDG_CONFIG_HOME/witryna/witryna.toml → /etc/witryna/witryna.toml
+# Override with: witryna --config /path/to/witryna.toml <command>
+```
+
+### Development
+
+```bash
+# Development (just recipes)
+just fmt                        # Auto-format Rust code
+just lint                       # Run all lints (fmt check + clippy + yamllint + gitleaks)
+just test                       # Run unit tests
+just test-integration           # Run integration tests (Tier 1 + Tier 2)
+just test-integration-serial    # Integration tests with --test-threads=1 (for SIGHUP)
+just test-all                   # All lints + unit tests + integration tests
+just pre-commit                 # Mirrors lefthook pre-commit checks
+just man-1                      # View witryna(1) man page (needs cargo build first)
+just man-5                      # View witryna.toml(5) man page
+
+# Cargo (direct)
+cargo build                     # Build the project
+cargo run                       # Run the application
+cargo check                     # Type-check without building
+```
+
+## Architecture
+
+### Core Components
+
+1. **HTTP Server (axum)**: Listens on localhost, handles webhook POST requests
+2. **Site Manager**: Manages site configurations from `witryna.toml`
+3. **Build Executor**: Runs containerized builds via Podman/Docker
+4. **Asset Publisher**: Atomic symlink switching for zero-downtime deployments
+
+### Key Files
+
+- `witryna.toml` - Main configuration (listen address, sites, tokens)
+- `witryna.yaml` - Per-repository build configuration (image, command, public dir). Searched in order: `.witryna.yaml`, `.witryna.yml`, `witryna.yaml`, `witryna.yml`. Or set `config_file` in `witryna.toml` for a custom path.
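Review note on the Key Files entry above: the search order for the per-repository build config can be sketched in a few lines of Rust. This is only an illustration, not code from the Witryna source. The names `find_repo_config` and `CANDIDATES` are hypothetical, and the sketch assumes that a `config_file` set in `witryna.toml` is resolved relative to the clone directory and bypasses the search entirely.

```rust
use std::path::{Path, PathBuf};

/// Candidate file names, in the search order listed under "Key Files".
const CANDIDATES: [&str; 4] = [
    ".witryna.yaml",
    ".witryna.yml",
    "witryna.yaml",
    "witryna.yml",
];

/// Hypothetical helper: locate the build config inside a clone directory.
/// Assumption: an explicit `config_file` replaces the search instead of
/// being tried first and then falling back to the candidates.
pub fn find_repo_config(clone_dir: &Path, config_file: Option<&str>) -> Option<PathBuf> {
    if let Some(custom) = config_file {
        let path = clone_dir.join(custom);
        // Return the custom path only if it points at an existing file.
        return path.is_file().then_some(path);
    }
    // The first candidate that exists wins.
    CANDIDATES
        .iter()
        .map(|name| clone_dir.join(name))
        .find(|path| path.is_file())
}
```

Calling `find_repo_config(clone_dir, None)` returns the first existing candidate, so a repository that ships both `.witryna.yaml` and `witryna.yml` resolves to the dotfile.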
+
+### Directory Structure
+
+```
+/var/lib/witryna/
+├── clones/{site-name}/          # Git repository clones
+├── builds/{site-name}/
+│   ├── {timestamp}/             # Timestamped build outputs
+│   └── current -> {latest}      # Symlink to current build
+└── cache/{site-name}/           # Persistent build caches
+
+/var/log/witryna/
+└── {site-name}/
+    └── {timestamp}.log          # Build logs
+```
+
+### API
+
+- `GET /health` - Health check (returns `200 OK`)
+- `POST /{site_name}` - Trigger deployment (`Authorization: Bearer <token>` required when `webhook_token` is configured)
+  - `202 Accepted` - Build triggered (immediate or queued)
+  - `401 Unauthorized` - Invalid token (only when `webhook_token` is configured; `{"error": "unauthorized"}`)
+  - `404 Not Found` - Unknown site (`{"error": "not_found"}`)
+  - `429 Too Many Requests` - Rate limit exceeded (`{"error": "rate_limit_exceeded"}`)
+
+### System Diagram
+
+```
+                                   Internet
+                                       |
++-----------------------------------------------------------------------------+
+|                      User's Server (e.g., DigitalOcean)                     |
+|                                                                             |
+| +--------------------------------+    +-----------------------------------+ |
+| | Web Server (VHOST 1: Public)   |    | Web Server (VHOST 2: Webhooks)    | |
+| |                                |    |                                   | |
+| | my-cool-site.com               |    | witryna-endpoint.com/{site_name}  | |
+| +--------------|-----------------+    +-----------------|-----------------+ |
+|                | (serves files)                         | (reverse proxy)   |
+|                |                                        |                   |
+|  /var/www/my-site/ <------------------.   +-------------------------------+ |
+|                ^                      `---|      Witryna (Rust App)       | |
+|                | (symlink)                |         listening on          | |
+|                |                          |  127.0.0.1:8080/{site_name}   | |
+|  /var/lib/witryna/builds/..               +----------|--------------------+ |
+|                ^                                     | (executes commands)  |
+|                |                                     v                      |
+|                `----------------------------------(uses)-------------------> Git & Container Runtime
+|                                                                             | (e.g., Podman/Docker)
++-----------------------------------------------------------------------------+
+```
+
+### Deployment Workflow
+
+Upon receiving a valid webhook request, Witryna executes asynchronously:
+
+1. **Acquire Lock / Queue:** Per-site non-blocking lock. If a build is in progress, the request is queued (depth-1, latest-wins). Queued rebuilds run after the current build completes.
+2. **Determine Paths:** Construct clone/build paths from `base_dir` and `site_name`.
+3. **Fetch Source Code:** `git clone` if first time, `git pull` otherwise.
+3b. **Initialize Submodules:** If `.gitmodules` exists, run `git submodule sync --recursive` (pull only) then `git submodule update --init --recursive [--depth N]`.
+4. **Parse Repository Config:** Read build config (`.witryna.yaml` / `witryna.yaml` / custom `config_file`) or use `witryna.toml` overrides.
+5. **Execute Build:** Run container command, e.g.:
+   ```bash
+   # Podman (default --network=bridge, rootless with userns mapping; -w may also be /workspace/{container_workdir}):
+   podman run --rm --cap-drop=ALL --network=bridge --userns=keep-id \
+     -v /var/lib/witryna/clones/my-site:/workspace:Z \
+     -w /workspace \
+     node:20-alpine sh -c "npm install && npm run build"
+
+   # Docker (needs DAC_OVERRIDE for host-UID workspace access; -w may also be /workspace/{container_workdir}):
+   docker run --rm --cap-drop=ALL --cap-add=DAC_OVERRIDE --network=bridge \
+     -v /var/lib/witryna/clones/my-site:/workspace \
+     -w /workspace \
+     node:20-alpine sh -c "npm install && npm run build"
+   ```
+6. **Publish Assets:** Copy built `public` dir to timestamped directory, atomically switch symlink via `ln -sfn`.
+6b. **Post-Deploy Hook (Optional):** Run `post_deploy` command with `WITRYNA_SITE`, `WITRYNA_BUILD_DIR`, `WITRYNA_BUILD_TIMESTAMP` env vars. 30s timeout, non-fatal on failure.
+7. **Release Lock:** Release the per-site lock.
+8. **Log Outcome:** Log success or failure.
+
+## Testing
+
+### Unit Tests
+
+- Keep tests in the same files as implementation using `#[cfg(test)]` modules
+- Think TDD: identify the function's purpose, its expected outputs, and its failure modes — then write tests for those. Test *behaviour*, not implementation details.
+- Do not write dummy tests just for coverage (e.g., asserting a constructor returns an object, or that `Option` defaults to `None`). Every test must verify a meaningful property.
+- Test both happy paths and error conditions
+- Use descriptive test names: `<function>_<scenario>_<expected_result>`
+
+```rust
+#[cfg(test)]
+mod tests {
+    use super::*;
+
+    #[tokio::test]
+    async fn build_executor_valid_config_returns_success() {
+        // ...
+    }
+}
+```
+
+### Integration Tests
+
+Integration tests run locally via `cargo test --features integration`. Each test starts its own server on a random port with a temporary directory — no VMs, no containers for the test harness itself (container runtimes are only needed for tests that exercise the build pipeline).
+
+#### Running Integration Tests
+
+```bash
+# Run all integration tests
+cargo test --features integration
+
+# Run with single thread (required if running SIGHUP tests)
+cargo test --features integration -- --test-threads=1
+
+# Run specific test categories
+cargo test --features integration auth        # Authentication tests
+cargo test --features integration deploy      # Full deployment pipeline
+cargo test --features integration sighup      # SIGHUP reload tests
+cargo test --features integration polling     # Periodic polling tests
+cargo test --features integration edge        # Edge case / security tests
+cargo test --features integration overrides   # Build config override tests
+```
+
+#### Test Tiers
+
+- **Tier 1 (no container runtime needed):** health, auth (401), 404, concurrent build (409), rate limit (429), edge cases, SIGHUP
+- **Tier 2 (requires podman or docker):** deploy, logs, cleanup, overrides, polling
+
+Tests that require git or a container runtime automatically skip with an explicit message (e.g., `SKIPPED: no container runtime (podman/docker) found`) when the dependency is missing.
+
+#### SIGHUP Test Isolation
+
+SIGHUP tests send real signals to the test process. They use `#[serial]` from the `serial_test` crate to ensure they run one at a time. For full safety, run them with a single test thread:
+
+```bash
+cargo test --features integration sighup -- --test-threads=1
+```
+
+#### Test Structure
+
+```
+tests/integration/
+  main.rs           # Feature gate + mod declarations
+  harness.rs        # TestServer (async reqwest, TempDir, shutdown oneshot)
+  git_helpers.rs    # Local bare repo creation + git detection
+  runtime.rs        # Container runtime detection + skip macros
+  health.rs         # GET /health → 200
+  auth.rs           # 401: missing/invalid/malformed/empty bearer
+  not_found.rs      # 404: unknown site
+  deploy.rs         # Full build pipeline (Tier 2)
+  concurrent.rs     # 409 via DashSet injection (Tier 1)
+  rate_limit.rs     # 429 with isolated server (Tier 1)
+  logs.rs           # Build log verification (Tier 2)
+  cleanup.rs        # Old build cleanup (Tier 2)
+  sighup.rs         # SIGHUP reload (#[serial], Tier 1)
+  overrides.rs      # Build config overrides (Tier 2)
+  polling.rs        # Periodic polling (#[serial], Tier 2)
+  edge_cases.rs     # Path traversal, long names, etc. (Tier 1)
+  cache.rs          # Cache directory persistence (Tier 2)
+  env_vars.rs       # Environment variable passing (Tier 2)
+  cli_run.rs        # witryna run command (Tier 2)
+  cli_status.rs     # witryna status command (Tier 1)
+  hooks.rs          # Post-deploy hooks (Tier 2)
+```
+
+#### Test Categories
+
+- **Core pipeline** — health, auth (401), 404, deployment (202), concurrent build rejection (409), rate limiting (429)
+- **FEAT-001** — SIGHUP config hot-reload
+- **FEAT-002** — build config overrides from `witryna.toml` (complete and partial)
+- **FEAT-003** — periodic repository polling, new commit detection
+- **OPS** — build log persistence, old build cleanup
+- **Edge cases** — path traversal, long site names, rapid SIGHUP, empty auth headers
+
+## Security
+
+### OWASP Guidelines for Endpoints
+
+Follow OWASP best practices for all HTTP endpoints:
+
+1. **Authentication & Authorization**
+   - Validate `Authorization: Bearer <token>` on every request when `webhook_token` is configured
+   - Use constant-time comparison for token validation to prevent timing attacks
+   - Reject requests with missing or malformed tokens with `401 Unauthorized`
+   - When `webhook_token` is omitted (empty), authentication is disabled for that site; a warning is logged at startup
+
+2. **Input Validation**
+   - Validate and sanitize `site_name` parameter (alphanumeric, hyphens only)
+   - Reject path traversal attempts (`../`, encoded variants)
+   - Limit request body size to prevent DoS
+
+3. **Rate Limiting**
+   - Implement rate limiting per token/IP to prevent abuse
+   - Return `429 Too Many Requests` when exceeded
+
+4. **Error Handling**
+   - Never expose internal error details in responses
+   - Log detailed errors server-side with `tracing`
+   - Return generic error messages to clients
+
+5. **Command Injection Prevention**
+   - Never interpolate user input into shell commands
+   - Use typed arguments when invoking Podman/Docker
+   - Validate repository URLs against allowlist
+
+6. **Container Security**
+   - Drop all capabilities not explicitly needed (`--cap-drop=ALL`)
+   - Default network mode is `bridge` (standard NAT networking); set to `none` for maximum isolation
+   - Configurable resource limits: `container_memory`, `container_cpus`, `container_pids_limit`
+   - Configurable working directory: `container_workdir` (relative path, no traversal)
+   - Podman: rootless via `--userns=keep-id`; Docker: `--cap-add=DAC_OVERRIDE` for workspace access
+
+## Conventions
+
+- Use `anyhow` for error handling with context
+- Use `tracing` macros for logging (`info!`, `debug!`, `error!`)
+- Async-first: prefer `tokio::fs` over `std::fs`
+- Use `DashSet` for concurrent build tracking
+- `SPRINT.md` is gitignored — update it after each task to track progress, but **never commit it**
+- Test functions: do **not** use the `test_` prefix — the `#[test]` attribute is sufficient
+- String conversion: use `.to_owned()` on `&str`, not `.to_string()` — reserve `.to_string()` for `Display` types
+
+## Branching
+
+- Implement each new feature or task on a **dedicated branch** named after the task ID (e.g., `cli-002-man-pages`, `pkg-001-cargo-deb`)
+- Branch from `main` before starting work: `git checkout -b <branch-name> main`
+- Keep the branch focused on a single task — do not mix unrelated changes
+- Merge back to `main` only after the task is complete and tests pass
+- Do not delete the branch until the merge is confirmed
+
+## Commit Rules
+
+**IMPORTANT:** Before completing any task, run `just test-all` to verify everything passes, then run `/commit-smart` to commit changes.
+
+- Only commit files modified in the current session
+- Use atomic commits with descriptive messages
+- Do not push unless explicitly asked
+- Always use Cargo for dependency management
+- **NEVER** touch the `.git` directory directly (no removing lock files, no manual index manipulation)
+- **NEVER** run `git reset --hard`, `git checkout .`, `git restore --staged`, or `git config`
+- Always use `git add` to stage files — do not use `git restore --staged :/` or other reset-style commands
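Review note on the Security section of the added file: two of its rules, the `site_name` character set and the constant-time token comparison, are easy to misread, so here is a minimal Rust sketch of both. The function names `is_valid_site_name` and `tokens_match` are hypothetical and not taken from the Witryna source. The comparison loop only illustrates the idea: it still returns early on a length mismatch, and an optimizing compiler gives no timing guarantee, so real code would more likely rely on a vetted crate such as `subtle`.

```rust
/// Hypothetical check for the `site_name` path segment: ASCII letters,
/// digits, and hyphens only. Because `/`, `.`, and `%` are rejected,
/// `../` and its percent-encoded variants cannot pass.
pub fn is_valid_site_name(name: &str) -> bool {
    !name.is_empty()
        && name
            .bytes()
            .all(|b| b.is_ascii_alphanumeric() || b == b'-')
}

/// Hypothetical token comparison that inspects every byte instead of
/// stopping at the first mismatch. Assumption: leaking the token length
/// through the early return is acceptable for this sketch.
pub fn tokens_match(expected: &str, provided: &str) -> bool {
    let (a, b) = (expected.as_bytes(), provided.as_bytes());
    if a.len() != b.len() {
        return false;
    }
    // OR together the XOR of each byte pair; the result is zero only
    // when every pair is equal.
    let diff = a.iter().zip(b).fold(0u8, |acc, (x, y)| acc | (x ^ y));
    diff == 0
}
```

An allowlist on the character set is simpler to audit than a denylist of traversal patterns, which fits the "small, auditable codebase" goal stated in the file.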
