# CLAUDE.md

## Project Overview

Witryna is a minimalist Git-based static site deployment orchestrator. It listens for webhook triggers, pulls Git repositories, runs containerized build commands, and publishes static assets via atomic symlink switching. It follows the Unix philosophy: "Do one thing and do it well."

## Design Philosophy

This project follows a **minimal philosophy**: software should be simple, minimal, and frugal.

- **No feature creep.** Only add functionality that serves the core mission. If a feature is "nice to have" but not essential, leave it out.
- **Minimal dependencies.** Every external crate is a liability — it adds compile time, attack surface, and maintenance burden. Prefer the standard library and existing dependencies over pulling in new ones. Justify any new dependency before adding it.
- **No over-engineering.** Write the simplest code that solves the problem. Avoid abstractions, indirection, and generalization unless there is a concrete, present need. Three similar lines are better than a premature abstraction.
- **Small, auditable codebase.** The entire program should be understandable by a single person. Favour clarity and brevity over cleverness.
- **Lean runtime.** Witryna delegates heavy lifting to external tools (Git, Podman/Docker, the OS). It does not reimplement functionality that already exists in well-tested programs.

## Commands

### CLI Subcommands

```bash
witryna serve              # Start the deployment server
witryna validate           # Validate config and print summary
witryna run <site> [-v]    # One-off build (synchronous)
witryna status [-s <site>] [--json]  # Deployment status
# Config discovery: ./witryna.toml → $XDG_CONFIG_HOME/witryna/witryna.toml → /etc/witryna/witryna.toml
# Override with: witryna --config /path/to/witryna.toml <command>
```
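
The lookup order in the comments above can be sketched as a small pure function. This is only an illustration, not the project's actual code: the names `config_candidates` and `discover_config` are hypothetical, the filesystem check is passed in as a closure so the logic can be tested without touching disk, and skipping the XDG candidate when `XDG_CONFIG_HOME` is unset is an assumption.

```rust
use std::path::{Path, PathBuf};

/// Returns the candidate config paths in the documented lookup order.
/// Assumption: the XDG candidate is skipped when `xdg_config_home` is `None`.
fn config_candidates(cwd: &Path, xdg_config_home: Option<&Path>) -> Vec<PathBuf> {
    let mut candidates = vec![cwd.join("witryna.toml")];
    if let Some(xdg) = xdg_config_home {
        candidates.push(xdg.join("witryna").join("witryna.toml"));
    }
    candidates.push(PathBuf::from("/etc/witryna/witryna.toml"));
    candidates
}

/// Picks the config file: an explicit `--config` path wins, otherwise the
/// first candidate for which `exists` returns true.
fn discover_config(
    explicit: Option<&Path>,
    cwd: &Path,
    xdg_config_home: Option<&Path>,
    exists: impl Fn(&Path) -> bool,
) -> Option<PathBuf> {
    if let Some(path) = explicit {
        return Some(path.to_path_buf());
    }
    config_candidates(cwd, xdg_config_home)
        .into_iter()
        .find(|candidate| exists(candidate.as_path()))
}
```

In real use the closure would typically be `|p| p.is_file()`; a test can substitute any predicate.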

### Development

```bash
# Development (just recipes)
just fmt                     # Auto-format Rust code
just lint                    # Run all lints (fmt check + clippy + yamllint + gitleaks)
just test                    # Run unit tests
just test-integration        # Run integration tests (Tier 1 + Tier 2)
just test-integration-serial # Integration tests with --test-threads=1 (for SIGHUP)
just test-all                # All lints + unit tests + integration tests
just pre-commit              # Mirrors lefthook pre-commit checks
just man-1                   # View witryna(1) man page (needs cargo build first)
just man-5                   # View witryna.toml(5) man page

# Cargo (direct)
cargo build           # Build the project
cargo run             # Run the application
cargo check           # Type-check without building
```

## Architecture

### Core Components

1. **HTTP Server (axum)**: Listens on localhost, handles webhook POST requests
2. **Site Manager**: Manages site configurations from `witryna.toml`
3. **Build Executor**: Runs containerized builds via Podman/Docker
4. **Asset Publisher**: Atomic symlink switching for zero-downtime deployments
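
To make "atomic symlink switching" concrete, here is a minimal sketch of one common way to do it with the standard library alone: create a temporary symlink next to the real one, then rename it over the old link. This is a different mechanism from shelling out to `ln -sfn`, and the function name `switch_symlink` and the `.tmp` suffix are hypothetical, so treat it as an illustration of the idea rather than a description of Witryna's code.

```rust
use std::fs;
use std::io;
use std::os::unix::fs::symlink;
use std::path::Path;

/// Points `link` at `target` by creating a temporary symlink beside it and
/// renaming it into place. rename(2) replaces the old link in a single step,
/// so readers see either the old target or the new one.
fn switch_symlink(target: &Path, link: &Path) -> io::Result<()> {
    let file_name = link
        .file_name()
        .ok_or_else(|| io::Error::new(io::ErrorKind::InvalidInput, "link has no file name"))?;
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(".tmp"); // hypothetical suffix for the temporary link
    let tmp = link.with_file_name(tmp_name);

    // Remove a stale temporary link left behind by an interrupted run.
    match fs::remove_file(&tmp) {
        Ok(()) => {}
        Err(e) if e.kind() == io::ErrorKind::NotFound => {}
        Err(e) => return Err(e),
    }

    symlink(target, &tmp)?;
    fs::rename(&tmp, link)
}
```

Calling `switch_symlink(Path::new("{timestamp}"), &builds_dir.join("current"))` with a relative target keeps the link valid if the whole builds directory is moved.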

### Key Files

- `witryna.toml` - Main configuration (listen address, sites, tokens)
- `witryna.yaml` - Per-repository build configuration (image, command, public dir). Searched in order: `.witryna.yaml`, `.witryna.yml`, `witryna.yaml`, `witryna.yml`. Or set `config_file` in `witryna.toml` for a custom path.

### Directory Structure

```
/var/lib/witryna/
├── clones/{site-name}/     # Git repository clones
├── builds/{site-name}/
│   ├── {timestamp}/        # Timestamped build outputs
│   └── current -> {latest} # Symlink to current build
└── cache/{site-name}/      # Persistent build caches

/var/log/witryna/
└── {site-name}/
    └── {timestamp}.log     # Build logs
```
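
The layout above maps naturally onto a small path helper. The sketch below is illustrative only: `SitePaths`, its field names, and the separate `log_root` parameter are hypothetical, and it assumes `site_name` has already been validated (joining an absolute or `..`-containing name would escape the base directory).

```rust
use std::path::{Path, PathBuf};

/// Per-site paths derived from the documented directory layout.
#[derive(Debug, PartialEq, Eq)]
struct SitePaths {
    clone_dir: PathBuf,
    builds_dir: PathBuf,
    current_link: PathBuf,
    cache_dir: PathBuf,
    log_dir: PathBuf,
}

impl SitePaths {
    /// `site_name` must already be validated before it is joined onto a path.
    fn new(base_dir: &Path, log_root: &Path, site_name: &str) -> Self {
        let builds_dir = base_dir.join("builds").join(site_name);
        Self {
            clone_dir: base_dir.join("clones").join(site_name),
            current_link: builds_dir.join("current"),
            builds_dir,
            cache_dir: base_dir.join("cache").join(site_name),
            log_dir: log_root.join(site_name),
        }
    }

    /// Directory for one timestamped build output.
    fn build_dir(&self, timestamp: &str) -> PathBuf {
        self.builds_dir.join(timestamp)
    }

    /// Log file for one timestamped build.
    fn log_file(&self, timestamp: &str) -> PathBuf {
        self.log_dir.join(format!("{timestamp}.log"))
    }
}
```

Keeping path construction in one place means the clone, build, cache, and log locations cannot drift apart between modules.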

### API

- `GET /health` - Health check (returns `200 OK`)
- `POST /{site_name}` - Trigger deployment (`Authorization: Bearer <token>` required when `webhook_token` is configured)
  - `202 Accepted` - Build triggered (immediate or queued)
  - `401 Unauthorized` - Invalid token (only when `webhook_token` is configured; `{"error": "unauthorized"}`)
  - `404 Not Found` - Unknown site (`{"error": "not_found"}`)
  - `429 Too Many Requests` - Rate limit exceeded (`{"error": "rate_limit_exceeded"}`)
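
As a compact reference, the responses listed for `POST /{site_name}` can be modelled as a plain enum. This is a framework-independent sketch: `TriggerOutcome` and its methods are hypothetical names, and because the body of the `202` response is not specified here, the sketch returns no body for it.

```rust
/// Possible results of a deployment trigger, as listed in the API section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TriggerOutcome {
    Accepted,
    Unauthorized,
    NotFound,
    RateLimited,
}

impl TriggerOutcome {
    /// HTTP status code for this outcome.
    fn status_code(self) -> u16 {
        match self {
            Self::Accepted => 202,
            Self::Unauthorized => 401,
            Self::NotFound => 404,
            Self::RateLimited => 429,
        }
    }

    /// JSON error body for error outcomes; `None` for `Accepted`, whose body
    /// is not specified in this document.
    fn error_body(self) -> Option<&'static str> {
        match self {
            Self::Accepted => None,
            Self::Unauthorized => Some(r#"{"error": "unauthorized"}"#),
            Self::NotFound => Some(r#"{"error": "not_found"}"#),
            Self::RateLimited => Some(r#"{"error": "rate_limit_exceeded"}"#),
        }
    }
}
```

A handler could convert this into its framework's response type in one place, which keeps error bodies generic and consistent.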

### System Diagram

```
                  Internet
                     |
+-----------------------------------------------------------------------------+
|                          User's Server (e.g., DigitalOcean)                   |
|                                                                             |
|  +--------------------------------+   +-----------------------------------+ |
|  | Web Server (VHOST 1: Public)   |   | Web Server (VHOST 2: Webhooks)    | |
|  |                                |   |                                   | |
|  |  my-cool-site.com              |   |  witryna-endpoint.com/{site_name} | |
|  +--------------|-----------------+   +-----------------|-----------------+ |
|                 | (serves files)                        | (reverse proxy)   |
|                 |                                       |                   |
|  /var/www/my-site/ <------------------.   +-------------------------------+ |
|                 ^                     `---| Witryna (Rust App)            | |
|                 | (symlink)               | listening on                  | |
|                 |                         | 127.0.0.1:8080/{site_name}    | |
|  /var/lib/witryna/builds/..                +----------|--------------------+ |
|                 ^                                    | (executes commands)  |
|                 |                                    v                      |
|                 `----------------------------------(uses)-------------------> Git & Container Runtime
|                                                                             | (e.g., Podman/Docker)
+-----------------------------------------------------------------------------+
```

### Deployment Workflow

Upon receiving a valid webhook request, Witryna runs the following steps asynchronously:

1. **Acquire Lock / Queue:** Per-site non-blocking lock. If a build is in progress, the request is queued (depth-1, latest-wins). Queued rebuilds run after the current build completes.
2. **Determine Paths:** Construct clone/build paths from `base_dir` and `site_name`.
3. **Fetch Source Code:** `git clone` if first time, `git pull` otherwise.
3b. **Initialize Submodules:** If `.gitmodules` exists, run `git submodule sync --recursive` (pull only) then `git submodule update --init --recursive [--depth N]`.
4. **Parse Repository Config:** Read build config (`.witryna.yaml` / `witryna.yaml` / custom `config_file`) or use `witryna.toml` overrides.
5. **Execute Build:** Run container command, e.g.:
   ```bash
   # Podman (default --network=bridge, rootless with userns mapping).
   # The working directory is /workspace, or /workspace/{container_workdir} if set.
   podman run --rm --cap-drop=ALL --network=bridge --userns=keep-id \
     -v /var/lib/witryna/clones/my-site:/workspace:Z \
     -w /workspace \
     node:20-alpine sh -c "npm install && npm run build"

   # Docker (needs DAC_OVERRIDE for host-UID workspace access).
   # The working directory follows the same rule as for Podman.
   docker run --rm --cap-drop=ALL --cap-add=DAC_OVERRIDE --network=bridge \
     -v /var/lib/witryna/clones/my-site:/workspace \
     -w /workspace \
     node:20-alpine sh -c "npm install && npm run build"
   ```
6. **Publish Assets:** Copy built `public` dir to timestamped directory, atomically switch symlink via `ln -sfn`.
6b. **Post-Deploy Hook (Optional):** Run `post_deploy` command with `WITRYNA_SITE`, `WITRYNA_BUILD_DIR`, `WITRYNA_BUILD_TIMESTAMP` env vars. 30s timeout, non-fatal on failure.
7. **Release Lock:** Release the per-site lock.
8. **Log Outcome:** Log success or failure.
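
The depth-1, latest-wins queue from step 1 can be sketched as a small state machine. This is a single-threaded illustration with hypothetical names (`BuildTracker`, `SiteState`, `Trigger`) that uses a plain `HashMap`; real code would need a concurrent structure and per-site locking. Collapsing repeated requests into one queued rebuild is safe on the assumption that a rebuild always fetches the latest commit.

```rust
use std::collections::HashMap;

/// State of a site that currently has a build in flight.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum SiteState {
    Building,
    BuildingWithQueued,
}

/// What the caller should do with an incoming trigger.
#[derive(Debug, PartialEq, Eq)]
enum Trigger {
    StartNow,
    Queued,
    AlreadyQueued,
}

#[derive(Default)]
struct BuildTracker {
    sites: HashMap<String, SiteState>,
}

impl BuildTracker {
    /// Registers a trigger for `site` without blocking.
    fn trigger(&mut self, site: &str) -> Trigger {
        match self.sites.get(site).copied() {
            None => {
                self.sites.insert(site.to_owned(), SiteState::Building);
                Trigger::StartNow
            }
            Some(SiteState::Building) => {
                self.sites.insert(site.to_owned(), SiteState::BuildingWithQueued);
                Trigger::Queued
            }
            // The queue holds at most one rebuild; extra requests collapse into it.
            Some(SiteState::BuildingWithQueued) => Trigger::AlreadyQueued,
        }
    }

    /// Marks the current build as finished. Returns `true` if a queued
    /// rebuild should start now (the site then stays in `Building`).
    fn finish(&mut self, site: &str) -> bool {
        match self.sites.remove(site) {
            Some(SiteState::BuildingWithQueued) => {
                self.sites.insert(site.to_owned(), SiteState::Building);
                true
            }
            _ => false,
        }
    }
}
```

In this model both `Queued` and `AlreadyQueued` would be answered with `202 Accepted`, since the request will be covered by the pending rebuild.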

## Testing

### Unit Tests

- Keep tests in the same files as implementation using `#[cfg(test)]` modules
- Think TDD: identify the function's purpose, its expected outputs, and its failure modes — then write tests for those. Test *behaviour*, not implementation details.
- Do not write dummy tests just for coverage (e.g., asserting a constructor returns an object, or that `Option` defaults to `None`). Every test must verify a meaningful property.
- Test both happy paths and error conditions
- Use descriptive test names: `<function>_<scenario>_<expected_result>`

```rust
#[cfg(test)]
mod tests {
    use super::*;

    #[tokio::test]
    async fn build_executor_valid_config_returns_success() {
        // ...
    }
}
```

### Integration Tests

Integration tests run locally via `cargo test --features integration`. Each test starts its own server on a random port with a temporary directory — no VMs, no containers for the test harness itself (container runtimes are only needed for tests that exercise the build pipeline).

#### Running Integration Tests

```bash
# Run all integration tests
cargo test --features integration

# Run with single thread (required if running SIGHUP tests)
cargo test --features integration -- --test-threads=1

# Run specific test categories
cargo test --features integration auth       # Authentication tests
cargo test --features integration deploy     # Full deployment pipeline
cargo test --features integration sighup     # SIGHUP reload tests
cargo test --features integration polling    # Periodic polling tests
cargo test --features integration edge       # Edge case / security tests
cargo test --features integration overrides  # Build config override tests
```

#### Test Tiers

- **Tier 1 (no container runtime needed):** health, auth (401), 404, concurrent build (409), rate limit (429), edge cases, SIGHUP
- **Tier 2 (requires podman or docker):** deploy, logs, cleanup, overrides, polling

Tests that require git or a container runtime automatically skip with an explicit message (e.g., `SKIPPED: no container runtime (podman/docker) found`) when the dependency is missing.
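
One way such a skip check could detect a container runtime is to search `PATH` for the binaries. The sketch below is an assumption about how this might look, not the contents of `runtime.rs`: the function names are hypothetical, podman is preferred over docker arbitrarily, and it only checks that a file exists, not that it is executable.

```rust
use std::env;
use std::ffi::OsStr;
use std::path::PathBuf;

/// Returns the first `PATH` entry that contains a file named `program`.
fn find_in_path(program: &str, path_var: &OsStr) -> Option<PathBuf> {
    env::split_paths(path_var)
        .map(|dir| dir.join(program))
        .find(|candidate| candidate.is_file())
}

/// Returns the name of the first available runtime, preferring podman.
/// (The preference order is an assumption made for this sketch.)
fn detect_runtime(path_var: &OsStr) -> Option<&'static str> {
    ["podman", "docker"]
        .iter()
        .copied()
        .find(|name| find_in_path(name, path_var).is_some())
}
```

A test could call `detect_runtime` with the value of `std::env::var_os("PATH")` and print the `SKIPPED: ...` message before returning early when it yields `None`.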

#### SIGHUP Test Isolation

SIGHUP tests send real signals to the test process. They use `#[serial]` from the `serial_test` crate to ensure they run one at a time. For full safety, run them with a single test thread:

```bash
cargo test --features integration sighup -- --test-threads=1
```

#### Test Structure

```
tests/integration/
  main.rs            # Feature gate + mod declarations
  harness.rs         # TestServer (async reqwest, TempDir, shutdown oneshot)
  git_helpers.rs     # Local bare repo creation + git detection
  runtime.rs         # Container runtime detection + skip macros
  health.rs          # GET /health → 200
  auth.rs            # 401: missing/invalid/malformed/empty bearer
  not_found.rs       # 404: unknown site
  deploy.rs          # Full build pipeline (Tier 2)
  concurrent.rs      # 409 via DashSet injection (Tier 1)
  rate_limit.rs      # 429 with isolated server (Tier 1)
  logs.rs            # Build log verification (Tier 2)
  cleanup.rs         # Old build cleanup (Tier 2)
  sighup.rs          # SIGHUP reload (#[serial], Tier 1)
  overrides.rs       # Build config overrides (Tier 2)
  polling.rs         # Periodic polling (#[serial], Tier 2)
  edge_cases.rs      # Path traversal, long names, etc. (Tier 1)
  cache.rs           # Cache directory persistence (Tier 2)
  env_vars.rs        # Environment variable passing (Tier 2)
  cli_run.rs         # witryna run command (Tier 2)
  cli_status.rs      # witryna status command (Tier 1)
  hooks.rs           # Post-deploy hooks (Tier 2)
```

#### Test Categories

- **Core pipeline** — health, auth (401), 404, deployment (202), concurrent build rejection (409), rate limiting (429)
- **FEAT-001** — SIGHUP config hot-reload
- **FEAT-002** — build config overrides from `witryna.toml` (complete and partial)
- **FEAT-003** — periodic repository polling, new commit detection
- **OPS** — build log persistence, old build cleanup
- **Edge cases** — path traversal, long site names, rapid SIGHUP, empty auth headers

## Security

### OWASP Guidelines for Endpoints

Follow OWASP best practices for all HTTP endpoints:

1. **Authentication & Authorization**
   - Validate `Authorization: Bearer <token>` on every request when `webhook_token` is configured
   - Use constant-time comparison for token validation to prevent timing attacks
   - Reject requests with missing or malformed tokens with `401 Unauthorized`
   - When `webhook_token` is omitted (empty), authentication is disabled for that site; a warning is logged at startup

2. **Input Validation**
   - Validate and sanitize `site_name` parameter (alphanumeric, hyphens only)
   - Reject path traversal attempts (`../`, encoded variants)
   - Limit request body size to prevent DoS

3. **Rate Limiting**
   - Implement rate limiting per token/IP to prevent abuse
   - Return `429 Too Many Requests` when exceeded

4. **Error Handling**
   - Never expose internal error details in responses
   - Log detailed errors server-side with `tracing`
   - Return generic error messages to clients

5. **Command Injection Prevention**
   - Never interpolate user input into shell commands
   - Use typed arguments when invoking Podman/Docker
   - Validate repository URLs against allowlist

6. **Container Security**
   - Drop all capabilities not explicitly needed (`--cap-drop=ALL`)
   - Default network mode is `bridge` (standard NAT networking); set to `none` for maximum isolation
   - Configurable resource limits: `container_memory`, `container_cpus`, `container_pids_limit`
   - Configurable working directory: `container_workdir` (relative path, no traversal)
   - Podman: rootless via `--userns=keep-id`; Docker: `--cap-add=DAC_OVERRIDE` for workspace access
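
Two of the checks above, `site_name` validation and constant-time token comparison, are small enough to sketch with the standard library. Both functions are hypothetical illustrations: the 64-byte length cap is an assumed limit, and a hand-written comparison loop is not guaranteed to stay constant-time under compiler optimisation, so a vetted implementation may be preferable if the dependency can be justified.

```rust
/// Accepts only non-empty names made of ASCII alphanumerics and hyphens.
/// This rejects `/`, `.`, `%` and therefore plain or encoded path traversal.
/// Assumption: the 64-byte cap is illustrative, not a documented limit.
fn is_valid_site_name(name: &str) -> bool {
    !name.is_empty()
        && name.len() <= 64
        && name.bytes().all(|b| b.is_ascii_alphanumeric() || b == b'-')
}

/// Compares two byte strings without returning early at the first mismatch.
/// The length check still reveals whether the lengths differ.
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        // Accumulate differences so every byte pair is always inspected.
        diff |= x ^ y;
    }
    diff == 0
}
```

Validating the name before any path is built means later code can join it onto `base_dir` without further escaping.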

## Conventions

- Use `anyhow` for error handling with context
- Use `tracing` macros for logging (`info!`, `debug!`, `error!`)
- Async-first: prefer `tokio::fs` over `std::fs`
- Use `DashSet` for concurrent build tracking
- `SPRINT.md` is gitignored — update it after each task to track progress, but **never commit it**
- Test functions: do **not** use the `test_` prefix — the `#[test]` attribute is sufficient
- String conversion: use `.to_owned()` on `&str`, not `.to_string()` — reserve `.to_string()` for `Display` types

## Branching

- Implement each new feature or task on a **dedicated branch** named after the task ID (e.g., `cli-002-man-pages`, `pkg-001-cargo-deb`)
- Branch from `main` before starting work: `git checkout -b <branch-name> main`
- Keep the branch focused on a single task — do not mix unrelated changes
- Merge back to `main` only after the task is complete and tests pass
- Do not delete the branch until the merge is confirmed

## Commit Rules

**IMPORTANT:** Before completing any task, run `just test-all` to verify everything passes, then run `/commit-smart` to commit changes.

- Only commit files modified in the current session
- Use atomic commits with descriptive messages
- Do not push unless explicitly asked
- Always use Cargo for dependency management
- **NEVER** touch the `.git` directory directly (no removing lock files, no manual index manipulation)
- **NEVER** run `git reset --hard`, `git checkout .`, `git restore --staged`, or `git config`
- Always use `git add` to stage files — do not use `git restore --staged :/` or other reset-style commands