Commit graph

11 commits

Author SHA1 Message Date
atagen
c65c75bb9f 7: packaging — systemd user unit + Nix modules + README
Ships the daemon as a real installable, not just `cargo build`.

Artifacts

  - `contrib/systemd/headroom.service` — user-scope unit. Type=simple
    (the daemon doesn't fork), After=pipewire.service,
    Restart=on-failure with a 2 s back-off so a crash loop doesn't spam stderr,
    StandardOutput/Error=journal, LimitRTPRIO=20 / LimitNICE=-11 to
    match the rtkit-style grant PipeWire's own unit carries. The
    file is templated with `@bindir@` so the build derivation can
    substitute in an absolute store path at install time, without
    the unit having to rely on whatever `headroom` happens to be on
    PATH.

  - `nix/home-module.nix` — `services.headroom.enable`. Installs the
    package on the user's PATH, symlinks the shipped profiles into
    `$XDG_CONFIG_HOME/headroom/profiles/`, and writes the systemd
    user unit (After=pipewire.service, Requires=pipewire.service,
    Wants=wireplumber.service, WantedBy=pipewire.service). Knobs:
    `installDefaultProfiles` for users who maintain their own set,
    `extraProfiles` (attrset of filename → path) to drop in personal
    profiles that override shipped ones by name.

  - `nix/nixos-module.nix` — `programs.headroom.enable`. Narrow scope:
    binary on global PATH, the package's `lib/systemd/user/*.service`
    is materialised under `/etc/systemd/user/` via `systemd.packages`,
    and an assertion fires if pipewire isn't enabled (clearer than a
    runtime crash). Per-user defaults (profile install, RT priority
    tuning) live in the Home Manager module; the two compose.

Build derivation

  `postInstall` now installs the unit (with `@bindir@` substituted to
  `$out/bin`) and copies `profiles/*.toml` to
  `$out/share/headroom/profiles/`. The flake's version lookup moved
  from `crates/headroom-cli/Cargo.toml` (where `version.workspace =
  true` evaluates to a table, not a string) to the workspace
  `Cargo.toml`. Modules exposed under `nixosModules.default` and
  `homeModules.default`.

README

  Rewrote the install section: Nix flake-based install with both
  Home Manager and NixOS module examples, plus a from-scratch
  `cargo install` + `install`/`sed` recipe for non-Nix users. Added
  a usage section with the common `headroom` subcommands and bumped
  the status banner from "pre-alpha" to "alpha" (signal chain,
  routing, IPC, monitor TUI, profile reload, and packaging all work
  end-to-end now).

Verified

  - `nix flake check` passes; NixOS module type-checks under
    nixpkgs eval.
  - `nix build .#headroom` produces `bin/headroom`,
    `lib/systemd/user/headroom.service` with the absolute store-path
    ExecStart baked in, and all five shipped profiles under
    `share/headroom/profiles/`.
  - `systemd-analyze verify --user` accepts the unit.
  - 185 workspace tests still pass; clippy clean at -D warnings
    --all-targets; `nix fmt` happy.
2026-05-21 17:00:25 +10:00
atagen
d52cd6db3b 8e: playback callback timing instrumentation + spike investigation
Adds a lock-free `PlaybackTiming` struct (atomics: call_count,
sum_us, max_us, spike_count, last_spike_us, last_spike_at_call)
shared between the bus filter's `playback_process` callback (RT
thread, writes) and the AGC controller (daemon thread, reads).
The audio thread wraps each inner call in
`Instant::now()` ... `state.timing.record(elapsed)` — wait-free,
no allocation. The AGC tick samples the snapshot once per second
and logs at WARN when new spikes have landed since the previous
sample, DEBUG otherwise. `#[global_allocator]` declaration in
`headroom-cli` now sits behind `cfg(debug_assertions)` so release
builds compile cleanly (assert_no_alloc strips `AllocDisabler`
under its default `disable_release` feature).
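
A minimal sketch of what such a timing struct could look like, using
only `std` atomics. The field names and `SPIKE_THRESHOLD_US` come from
this commit; the `record`/`snapshot` bodies, the `TimingSnapshot` type,
the `Relaxed` orderings and the strict `>` comparison are assumptions
for illustration, not the code that shipped.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Calls slower than this count as spikes (value taken from this commit).
const SPIKE_THRESHOLD_US: u64 = 5000;

#[derive(Default)]
pub struct PlaybackTiming {
    call_count: AtomicU64,
    sum_us: AtomicU64,
    max_us: AtomicU64,
    spike_count: AtomicU64,
    last_spike_us: AtomicU64,
    last_spike_at_call: AtomicU64,
}

/// Hypothetical reader-side view; not a name from the commit.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct TimingSnapshot {
    pub call_count: u64,
    pub sum_us: u64,
    pub max_us: u64,
    pub spike_count: u64,
    pub last_spike_us: u64,
    pub last_spike_at_call: u64,
}

impl PlaybackTiming {
    /// Audio-thread side: a few atomic updates, no locks, no allocation.
    pub fn record(&self, elapsed: Duration) {
        let us = elapsed.as_micros() as u64;
        let call = self.call_count.fetch_add(1, Ordering::Relaxed) + 1;
        self.sum_us.fetch_add(us, Ordering::Relaxed);
        self.max_us.fetch_max(us, Ordering::Relaxed);
        if us > SPIKE_THRESHOLD_US {
            self.spike_count.fetch_add(1, Ordering::Relaxed);
            self.last_spike_us.store(us, Ordering::Relaxed);
            self.last_spike_at_call.store(call, Ordering::Relaxed);
        }
    }

    /// Daemon-thread side. Fields are loaded one by one, so the snapshot is
    /// not atomic as a whole; that is acceptable for once-per-second logging.
    pub fn snapshot(&self) -> TimingSnapshot {
        TimingSnapshot {
            call_count: self.call_count.load(Ordering::Relaxed),
            sum_us: self.sum_us.load(Ordering::Relaxed),
            max_us: self.max_us.load(Ordering::Relaxed),
            spike_count: self.spike_count.load(Ordering::Relaxed),
            last_spike_us: self.last_spike_us.load(Ordering::Relaxed),
            last_spike_at_call: self.last_spike_at_call.load(Ordering::Relaxed),
        }
    }
}
```

A reader that keeps the previous `spike_count` can pick WARN or DEBUG
by comparing it with the new snapshot's value.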

Spike investigation outcome

  PLAN §11 follow-up noted: ~240 μs steady state, ~2 ms BUSY
  spikes at ~10 s cadence. My ~3 min capture of a 1 kHz sine
  routed through processed (release build) showed:

  - Steady state ~2180 μs / call
  - Max climbed slowly: 2186 → 2222 → 2606 → 2655 → 2812 μs over
    ~1 min (1.3× steady-state, well within the per-quantum budget)
  - Callback rate ~4 Hz, implying the Mbox is negotiating a large
    quantum (~12k frames per call vs the 1024-frame baseline
    PLAN §4.7 measured). Per-frame DSP cost is identical to the
    original budget; the longer wall-clock time per call simply
    reflects the longer quantum

  No clear ~10 s-cadence outlier pattern reproduced. The system
  is comfortably inside budget (~2.2 ms / 250 ms quantum ≈ 1% of
  one core). Without an audible artefact or a reproducible
  failure mode I'm not chasing the original spike further; the
  instrumentation stays so future regressions are visible at
  WARN level. `SPIKE_THRESHOLD_US = 5000` is comfortably above
  steady-state at both small and large quanta, so only real
  outliers trip the log.
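
The budget arithmetic above can be reproduced with a few lines. This
is only a worked check: the function names are made up for the
example, and the 48 kHz sample rate is an assumption (the capture
notes give the callback rate and the approximate frame count, not the
rate itself).

```rust
/// Frames delivered per callback at a given sample rate and callback rate.
pub fn frames_per_quantum(sample_rate_hz: f64, callback_hz: f64) -> f64 {
    sample_rate_hz / callback_hz
}

/// Fraction of one core spent inside the callback:
/// time per call divided by the quantum period.
pub fn core_load(per_call_us: f64, callback_hz: f64) -> f64 {
    let quantum_us = 1_000_000.0 / callback_hz;
    per_call_us / quantum_us
}
```

With the measured ~2180 μs per call at ~4 Hz, `core_load(2180.0, 4.0)`
comes out just under 0.009, i.e. the "about 1% of one core" quoted
above, and `frames_per_quantum(48_000.0, 4.0)` gives the ~12k frames
per call.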

Verified

  185 tests pass; clippy clean at -D warnings --all-targets.
  Release build runs sine playback continuously for >3 min with
  no assert_no_alloc abort, no panic, no spike warning. Debug
  build (with assert_no_alloc active) likewise stable across
  thousands of audio callbacks (revalidated as part of the
  release-build comparison).
2026-05-21 16:42:46 +10:00
atagen
9220143db7 8a: assert_no_alloc on audio-thread callbacks
Wraps the three audio-thread `process` callbacks
(`capture_process`, `playback_process`, `tap_process`) with
`assert_no_alloc::assert_no_alloc(|| inner(...))`. The
`headroom-cli` binary installs `AllocDisabler` as `#[global_allocator]`
so any allocation inside one of those blocks during debug builds
aborts the process with "memory allocation of N bytes failed".

Each callback was renamed to `*_inner` to keep the thin wrapper
function pointer stable for pipewire-rs's `process(fn_ptr)`.

`assert_no_alloc`'s `disable_release` feature is on by default: release
builds keep the plain system allocator and the `assert_no_alloc`
wrappers become no-ops, so the audio thread pays zero runtime cost in
production.

Verified

  Positive smoke (5 s of 1 kHz sine through processed): daemon
  stays up across thousands of capture/playback/tap callbacks. No
  abort. Audio threads are alloc-free as designed.

  Negative smoke (temporarily inserted `Vec::with_capacity(1024)`
  inside `capture_process_inner`): daemon aborts (SIGABRT, exit
  134) on the first audio callback with the expected
  "memory allocation of 1024 bytes failed" stderr message —
  confirming the harness is wired correctly and not silently a
  no-op. Sanity-check alloc reverted before commit.

  185 tests pass; clippy clean at -D warnings --all-targets.
2026-05-21 16:21:53 +10:00
atagen
8af6dff98d 4l: filter.playback through 4k enforcement + sticky default.audio.sink
Two follow-ups from 4k's commit body, both surfaced by the same
smoke-test setup.

filter.playback through 4k

  `try_capture_filter_playback` and the bypass-retarget pass in
  `adopt_new_real_sink` now call `enqueue_route` on top of the
  existing `write_stream_target`. Without that, WirePlumber was
  fanning the filter's output port to *both* the real sink (the
  intended target) and `headroom-processed:playback` — a feedback
  loop where the filter's output flowed back into the processed
  sink, then through monitor → filter capture → DSP → filter
  playback again.

  Plumbing 4k for the filter required two small tweaks elsewhere:

  - `enforce_link_for_managed_stream` and `apply_pending_routes`
    were destroying every non-target outbound link from a managed
    source. That included Layer A passive tap links, which sent
    Layer A's own retry loop into a create/destroy fight with this
    code. Both paths now skip links whose destination isn't a
    known Audio/Sink, so only WP-created sink links get torn down.

  - The processed sink is now also recorded in `sinks_by_name`
    (previously skipped because it's "tracked elsewhere" in
    `processed_sink_id`). `apply_pending_routes` resolves the
    target by name, so it needed processed visible here to handle
    Route::Processed.

Sticky default.audio.sink

  `adopt_new_real_sink` previously short-circuited via
  `apply_real_sink_change` when the real sink name hadn't changed
  — which meant the *first* time WP rewrote `default.audio.sink`
  away from `headroom-processed` we'd re-assert, but on every
  subsequent rewrite to the same Mbox value we'd skip out before
  reaching the re-assert call at the bottom of the function. WP
  won 1-0 after the first round.

  Fixed by hoisting the re-assertion into a dedicated method
  (`reassert_default_processed`) capped at 10 attempts per
  second, called both from the idempotent early-exit path
  and from the end of the full retarget path. The cap is what
  keeps a hostile WP policy from pulling us into a hot loop — at
  10 Hz we tolerate a brief metadata storm, then back off for the
  remainder of the window.
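
One way to picture the cap is a fixed one-second window with a budget
of 10 attempts. The sketch below is an illustration under that
assumption; `AttemptCap` and `try_acquire` are hypothetical names, and
the real `reassert_default_processed` may track its window differently.

```rust
use std::time::{Duration, Instant};

/// Fixed-window attempt limiter (hypothetical helper, not from the commit).
pub struct AttemptCap {
    max_per_window: u32,
    window: Duration,
    window_start: Option<Instant>,
    used: u32,
}

impl AttemptCap {
    pub fn new(max_per_window: u32, window: Duration) -> Self {
        Self { max_per_window, window, window_start: None, used: 0 }
    }

    /// Returns true if the caller may attempt now, false once the budget
    /// for the current window is spent. `now` is passed in so the logic
    /// can be tested without sleeping.
    pub fn try_acquire(&mut self, now: Instant) -> bool {
        let expired = match self.window_start {
            Some(start) => now.duration_since(start) >= self.window,
            None => true,
        };
        if expired {
            // Start a fresh window and refill the budget.
            self.window_start = Some(now);
            self.used = 0;
        }
        if self.used < self.max_per_window {
            self.used += 1;
            true
        } else {
            false
        }
    }
}
```

The caller would write `default.audio.sink` only when `try_acquire`
returns true, which bounds the write rate no matter how often
WirePlumber rewrites the key.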

Verified

  185 tests still pass; clippy clean at -D warnings --all-targets.

  Live smoke against a running PipeWire/WP:
  - `pw-metadata` confirms `default.audio.sink` settles on
    `headroom-processed` after daemon startup (daemon wrote 3
    times in ~30 ms, WP yielded; metadata then stayed put).
  - `pw-link` confirms `headroom-filter.playback:output_{FL,FR}`
    has exactly one outbound link each — to the Mbox playback
    ports — with no link back to processed:playback.
  - Sine-into-processed regression still passes: 59/59
    meter ticks above the floor, momentary_lufs around -28, true
    peak around -21 dBTP — bus DSP chain still processing
    end-to-end after the filter's link surface was tightened.
2026-05-21 15:58:18 +10:00
atagen
df8af6c4d2 4k: routing establishes explicit links, not just target.object
Phase 5 smoke-tested the monitor TUI and surfaced that the bus DSP
never sees signal: bus meters stay at the LUFS floor / -200 dBTP
even when `headroom status` reports a stream as route=processed.
The root cause is in routing, not the TUI.

Why writing target.object alone wasn't enough

  The daemon's routing engine wrote `target.object` on the stream
  node and relied on WirePlumber to (re-)link the stream to the
  declared sink. That works for streams the daemon creates itself
  (`headroom-filter.playback`): the `pw_stream` carries
  target.object at connect time, before WP sees the node global,
  so WP's first linking decision honours it.

  For external clients (pw-cat, Strawberry) the order is reversed:
  WP links the stream the instant the node global appears,
  *before* the daemon's registry callback fires
  `try_route_stream`. The metadata write that follows is a no-op
  for routing — WP doesn't re-link in response to a target.object
  change on an already-linked node. Verified manually: writing
  target.object on a live stream + severing its bad link did NOT
  cause WP to relink to the declared target. WP just left the
  stream unrouted.

What this commit changes

  RoutingState now tracks `Link` registry globals (`links_by_id` +
  `outbound_links_by_node` reverse index) and Audio/Sink globals
  by name (`sinks_by_name` now also carries `headroom-processed`,
  not just the real-hardware sinks). On every routing decision —
  `try_route_stream`, `apply_pw_command(RouteStream)`, and the
  bypass-retarget pass inside `adopt_new_real_sink` — the daemon
  also enqueues a `PendingRoute` for the source node.

  Two enforcement paths:

  - **Fast vigilance** in `try_capture_link`: when WP creates a
    new link out of a managed stream that lands on a different
    Audio/Sink, the daemon calls `registry.destroy_global(link_id)`
    immediately. Links to non-sinks (Layer A taps, other
    downstream consumers) are left alone — Layer A owns those.

  - **50 ms drain loop** in `apply_pending_routes`: for each
    pending route, once the source's output ports and the target
    sink's input ports are visible on the registry, the daemon
    destroys any remaining outbound link landing on the wrong
    sink and creates the desired link via `link-factory` (new
    `create_routing_link` helper — non-passive variant of the
    existing `create_explicit_link` Layer A uses). The owned
    `Link` proxies live in `managed_route_links` keyed by source
    node id; dropping them tears the links down via
    `object.linger = "false"`.

  `target.object` writes are kept (cheap hint that helps fresh
  pw_streams and documents intent) but are no longer the source
  of truth.
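
The decision both enforcement paths share (tear down links that land
on the wrong sink, leave non-sink links alone, create the target link
if it is missing) can be sketched as a pure function. Everything here
is illustrative: `Link`, `RoutePlan` and `plan_route` are hypothetical,
and sinks are identified by node id for brevity, whereas the daemon
resolves them by name through `sinks_by_name`.

```rust
use std::collections::HashSet;

/// Simplified view of a link registry global (hypothetical type).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Link {
    pub id: u32,
    pub src_node: u32,
    pub dst_node: u32,
}

/// What the drain loop should do for one pending route.
#[derive(Debug, PartialEq, Eq)]
pub struct RoutePlan {
    /// Link ids to destroy: they land on a known sink other than the target.
    pub destroy: Vec<u32>,
    /// Whether an explicit link to the target still has to be created.
    pub create: bool,
}

pub fn plan_route(outbound: &[Link], target_sink: u32, sinks: &HashSet<u32>) -> RoutePlan {
    let mut destroy = Vec::new();
    let mut have_target = false;
    for link in outbound {
        if link.dst_node == target_sink {
            have_target = true;
        } else if sinks.contains(&link.dst_node) {
            destroy.push(link.id);
        }
        // Any other destination is not an Audio/Sink (for example a
        // Layer A tap) and is deliberately left untouched.
    }
    RoutePlan { destroy, create: !have_target }
}
```

Keeping the decision free of PipeWire types makes it easy to unit-test
separately from the registry callbacks.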

Verified

  All 185 tests still pass; clippy clean at -D warnings
  --all-targets.

  Live smoke (pw-cat /dev/zero of a 1 kHz sine at -20 dBFS into
  `--target headroom-processed`):
  - Before: pw-cat:output → Mbox:playback directly; bus meters
    pinned at floor, integrated_lufs = -200, true_peak = -200.
  - After: `routed pw-cat → headroom-processed` followed within
    50 ms by `explicit routing link established`; pw-link confirms
    pw-cat:output → headroom-processed:playback (+ the Layer A
    tap link, preserved). Bus meters show momentary -28 → -16
    LUFS, true_peak around -34 to -19 dBTP, compressor GR -2.6 dB,
    limiter GR -6.7 dB — i.e. the bus DSP chain is processing
    signal end-to-end for the first time.
  - Layer A tap creation logs exactly once (vs. the
    create/destroy fighting loop the first cut had before
    `enforce_link_for_managed_stream` learned to skip non-sink
    destinations).

Known limits not addressed here

  - `default.audio.sink` reassertion by WP. The daemon still
    writes `default.audio.sink = headroom-processed` but WP's
    session policy may rewrite it back. With explicit links, this
    is now mostly cosmetic — new streams whose target.object
    matches headroom-processed will be routed correctly via the
    same enforcement path even if default is something else. The
    metadata side will be tightened later if it turns out to
    matter operationally.

  - A spurious filter.playback → processed:playback feedback link
    still appears in the live graph (the bus filter's own output
    being linked back to its sink). Suspected source: a leftover
    rule on the filter node. To investigate separately; doesn't
    currently affect signal flow because filter capture sees
    signal from the real producer.
2026-05-21 15:36:15 +10:00
atagen
e528a98417 5: monitor TUI + wire fill-ins
`headroom monitor` becomes a full-screen ratatui TUI by default;
the previous behaviour (line-delimited JSON, useful for scripts and
tests) is preserved behind --json.

5 — Monitor TUI

  New `crates/headroom-cli/src/tui.rs` (~700 lines incl. tests).
  Main thread does subscribe + initial status() + route_list() before
  entering raw mode, so connect errors surface as clean stderr
  messages instead of corrupting the terminal. A reader thread owns
  the headroom_client::Client and forwards each subscription event
  through a crossbeam channel; an input thread blocks on
  event::read() and forwards keys (q / Esc / Ctrl-C) through a
  second channel; the main thread `select!`s both plus a 10 Hz
  ticker (so uptime + staleness display advance even when no
  events are flowing). On quit the OS reaps the reader; a CLI tool
  doesn't need a graceful UnixStream shutdown.
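
  The loop shape can be approximated with the standard library alone.
  This sketch swaps the two crossbeam channels and `select!` for a
  single `std::sync::mpsc` channel carrying a tagged enum, and uses
  `recv_timeout` as the ticker. `UiEvent`, `LoopStats` and `run` are
  hypothetical names; the real `tui.rs` is structured differently.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::{Duration, Instant};

/// Events forwarded by the reader and input threads (hypothetical type).
pub enum UiEvent {
    Daemon(String),
    Key(char),
}

#[derive(Debug, Default, PartialEq)]
pub struct LoopStats {
    pub daemon_events: u32,
    pub ticks: u32,
    pub last_payload: Option<String>,
}

/// Runs until 'q' arrives or every sender is gone.
pub fn run(rx: &Receiver<UiEvent>, tick: Duration) -> LoopStats {
    let mut stats = LoopStats::default();
    let mut next_tick = Instant::now() + tick;
    loop {
        // Wait for an event, but never past the next tick deadline.
        let wait = next_tick.saturating_duration_since(Instant::now());
        match rx.recv_timeout(wait) {
            Ok(UiEvent::Key('q')) => break,
            Ok(UiEvent::Key(_)) => {}
            Ok(UiEvent::Daemon(payload)) => {
                stats.daemon_events += 1;
                stats.last_payload = Some(payload);
            }
            Err(RecvTimeoutError::Timeout) => {
                // Nothing arrived: still advance uptime / staleness display.
                stats.ticks += 1;
                next_tick += tick;
            }
            Err(RecvTimeoutError::Disconnected) => break,
        }
    }
    stats
}
```

  A 10 Hz ticker corresponds to `tick = Duration::from_millis(100)`.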

  Layout: outer block carries the profile / version / uptime in the
  top-right title and a footer with subscribed topics + an overflow /
  error / disconnected banner when relevant. Inside: bus DSP gauges
  (AGC target, compressor GR, limiter GR, true peak), a loudness
  panel (momentary / short-term / integrated, greyed when stale),
  and a streams table with route + Layer A reduction column.

Wire types caught up to the daemon

  `headroom-ipc::RoutingEvent` gained `StreamRemoved`,
  `LayerAAttached`, `LayerADetached` variants — these are events the
  daemon already publishes (registry.rs §pw) but that
  weren't typed in the proto. Without `StreamRemoved` the TUI would
  accumulate departed streams forever; without the Layer A pair the
  per-stream column couldn't track tap state.

  New `LayerALevel` struct types the `meters/layer_a_level` payload
  (node_id, app, volume_lin, reduction_db).

  `headroom_core::agc::LOUDNESS_FLOOR_LUFS` is now `pub` — it's
  published as-is in MeterTick.*_lufs fields when ebur128 has no
  useful measurement yet, so clients need it to render "no
  measurement" without hard-coding `-200.0`.
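
  On the client side, the floor constant allows a check like the one
  below. The constant is redefined locally so the sketch stands alone
  (a real client would import it from `headroom_core::agc`), and
  `fmt_lufs` plus the `--` placeholder are made up for the example.

```rust
/// Same value the daemon publishes when there is no usable measurement.
pub const LOUDNESS_FLOOR_LUFS: f32 = -200.0;

/// Renders a loudness reading, or a placeholder at or below the floor.
pub fn fmt_lufs(value: f32) -> String {
    if !value.is_finite() || value <= LOUDNESS_FLOOR_LUFS {
        "--".to_string()
    } else {
        format!("{value:.1} LUFS")
    }
}
```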

Toolchain notes

  ratatui and crossterm pinned to =0.28.1. Newer ratatui pulls in
  `instability` 0.3.12 + `darling` 0.23 which need rustc 1.88+; the
  project pins 1.86 via rust-toolchain.toml. Lockfile also pins
  `instability` to 0.3.7 and `darling` to 0.20.10 (older patches that
  still build on 1.86).

Verified

  185 tests passing (was 178: +5 for TUI event mapping +
  fmt_uptime, +2 for stream_removed / layer_a_level handling).
  Clippy clean at -D warnings --all-targets.

  Live smoke: daemon emits routing/{stream_routed, stream_removed,
  layer_a_attached, layer_a_detached} and meters/{tick, layer_a_level}
  in shapes that round-trip cleanly through the new typed enums.
  TUI binary survives raw-mode init + initial RPCs + subscription
  against a live daemon.

Known unrelated daemon gap (to be fixed next): pre-existing streams
aren't actually re-linked when the daemon writes target.object —
WirePlumber updates metadata but doesn't tear the old link down or
create a new one into the processed sink. Bus DSP path therefore
sees silence even when status reports route=processed. Not Phase 5;
addressed separately.
2026-05-21 13:35:27 +10:00
atagen
79e4baedd0 4g: bus meters publishing + housekeeping
Closes the last gap before Phase 5's monitor TUI: per-app meter
events already publish on the meters topic via the registry watcher;
bus-level DSP meters now also publish.

4g — Bus meters
  headroom_core::meters::BusMetrics is an Arc<parking_lot::Mutex<...>>
  snapshot owned by the playback callback (try_lock; skip on
  contention) and read by the AGC controller on each 50 ms tick.
  Carries: compressor GR, limiter total/soft/hard GR, true peak. The
  AGC controller combines these with its ebur128 readings (momentary,
  short-term, integrated) and the current smoothed AGC target, then
  publishes a headroom_ipc::MeterTick on Topic::Meters.
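
  The "try_lock; skip on contention" hand-off can be sketched as
  follows. To keep the example dependency-free it uses
  `std::sync::Mutex` in place of `parking_lot::Mutex`; `BusSnapshot`
  and `publish` are hypothetical names, and only a subset of the
  fields listed above is modelled.

```rust
use std::sync::{Arc, Mutex};

/// Subset of the bus-level readings (hypothetical type).
#[derive(Debug, Default, Clone, Copy, PartialEq)]
pub struct BusSnapshot {
    pub compressor_gr_db: f32,
    pub limiter_gr_db: f32,
    pub true_peak_dbtp: f32,
}

pub type BusMetrics = Arc<Mutex<BusSnapshot>>;

/// Audio-thread side: never blocks. Returns false when the reader
/// currently holds the lock, in which case this update is dropped
/// and the next callback tries again.
pub fn publish(metrics: &BusMetrics, snapshot: BusSnapshot) -> bool {
    match metrics.try_lock() {
        Ok(mut slot) => {
            *slot = snapshot;
            true
        }
        Err(_) => false,
    }
}
```

  Dropping an occasional update is harmless here because the reader
  only wants the latest value on its next tick.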

  Publish cadence honours profile.meters.publish_hz, capped at the
  AGC tick rate (20 Hz). Lower publish_hz throttles to every Nth
  tick.
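
  A possible reading of "every Nth tick", as a sketch. The 20 Hz tick
  rate is from this commit; `publish_stride` and `should_publish` are
  hypothetical names, and rounding N up (so the effective rate never
  exceeds the requested one) is an assumption.

```rust
/// AGC tick rate: one tick every 50 ms.
pub const AGC_TICK_HZ: u32 = 20;

/// Number of AGC ticks per published MeterTick.
pub fn publish_stride(publish_hz: u32) -> u32 {
    // Clamp to 1..=20 so zero never divides and 20 Hz is the ceiling.
    let hz = publish_hz.clamp(1, AGC_TICK_HZ);
    // Ceiling division.
    (AGC_TICK_HZ + hz - 1) / hz
}

/// True on the ticks that should publish.
pub fn should_publish(tick_index: u64, stride: u32) -> bool {
    tick_index % u64::from(stride) == 0
}
```

  Under these assumptions `publish_hz = 5` publishes on every fourth
  tick, and any value of 20 or above publishes on every tick.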

  Mode::I added to the AGC's EbuR128 so loudness_global() is
  available without a second ebur128 instance. Bounded cost — a
  histogram walk per call, <=20 Hz.

  LUFS values are sanitised to a -200.0 dB floor via
  finite_or_floor() — ebur128 returns -inf (not Err) for "no usable
  measurement yet," and non-finite f32 can't survive JSON
  serialisation (serde_json renders as null).
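
  A sketch of the sanitising step, assuming a plain function over
  `f32`. The name `finite_or_floor` and the -200.0 floor are from this
  commit; the body, including the choice to also clamp finite values
  that fall below the floor, is an assumption.

```rust
pub const LOUDNESS_FLOOR_LUFS: f32 = -200.0;

/// Maps -inf, +inf and NaN to the floor so the value survives JSON.
pub fn finite_or_floor(value: f32) -> f32 {
    if value.is_finite() {
        value.max(LOUDNESS_FLOOR_LUFS)
    } else {
        LOUDNESS_FLOOR_LUFS
    }
}
```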

Housekeeping shipped alongside

  headroom-client moved from [dependencies] to [dev-dependencies] in
  headroom-core — it's only used inside ipc::server's tests. Verified
  by full clippy + test run; production builds no longer pull it in.

  Pre-existing clippy nits cleared (limiter.rs x5, app_level.rs,
  ipc/ops.rs, pw/filter.rs). All field_reassign_with_default or
  assign_op_pattern in test code; stage-6 commit ran clippy without
  --all-targets so these slipped through.

Verified

  178 tests passing (28 dsp + 48 dsp + 20 ipc + 106 core including
  +2 new meters tests + 4 client). Clippy clean at default level with
  -D warnings --all-targets.

  Smoke test: monitor meters subscription receives 20 Hz MeterTick
  events with the expected JSON shape (all fields finite).
2026-05-21 10:29:38 +10:00
atagen
fcf421b94c stage 6: per-app
2026-05-20 23:49:58 +10:00
atagen
9edd809416 stage 4 (a–d): IPC server, ops, broadcast
Phase 4 first four checkpoints — daemon now serves the wire protocol
specified in IPC.md and broadcasts events to subscribers.

  4a IPC server skeleton
     UnixListener at $XDG_RUNTIME_DIR/headroom/control.sock, accept
     thread, per-connection thread, hello-on-connect, codec
     round-trip, 0600 perms with stale-socket detection. Caught and
     fixed a sigprocmask ordering bug: block SIGTERM/SIGINT
     process-wide BEFORE the IPC accept thread spawns, otherwise it
     inherits the unblocked mask and the signal takes the default
     disposition before pipewire's signalfd can read it.

  4b Read-only ops + shared state
     Arc<Mutex<DaemonState>> (parking_lot) for cross-thread daemon
     state. RoutingState moved off Rc<RefCell<>>-only and reads
     profile from the shared lock. Captures the headroom-processed
     node id via the registry. Implements: status, profile.list,
     profile.show, route.list, setting.get (serde-roundtrip dotted
     lookup), setting.list (flattened).

  4c Mutating ops
     profile.use (idempotent no-op until 4e ships the disk loader),
     profile.reload (empty list till 4e), route.set/unset with
     single-app user-rule replace semantics, setting.set with serde
     round-trip type-safety, bypass.set. CLI fix:
     allow_hyphen_values so 'headroom set foo.bar -0.5' works.

  4d Subscriptions + broadcast
     Per-connection split into reader thread + writer thread, joined
     by a bounded crossbeam_channel<ServerFrame>(64). Broadcaster in
     DaemonState fans out events via try_send; bounded queues drop
     on overflow with per-(subscriber, topic) counters and a
     daemon::overflow flush event piggybacked onto the next
     successful publish.

     Live events wired: daemon::started, daemon::shutdown,
     routing::rule_changed, routing::stream_routed,
     routing::stream_removed. CLI 'monitor [topics]' command
     subscribes by topic list.
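
The drop-on-overflow behaviour in 4d can be illustrated with a bounded
`std::sync::mpsc::sync_channel`, standing in for the bounded crossbeam
channel. `Frame`, `Subscriber` and the exact ordering of the overflow
report are assumptions made for the sketch; only the idea (non-blocking
`try_send`, per-topic drop counters, report piggybacked on a later
publish) comes from the description above.

```rust
use std::collections::HashMap;
use std::sync::mpsc::{sync_channel, Receiver, SyncSender, TrySendError};

/// Frames queued for one subscriber (hypothetical wire shape).
#[derive(Debug, Clone, PartialEq)]
pub enum Frame {
    Event { topic: &'static str, body: String },
    Overflow { topic: &'static str, dropped: u64 },
}

pub struct Subscriber {
    tx: SyncSender<Frame>,
    dropped: HashMap<&'static str, u64>,
}

impl Subscriber {
    pub fn new(capacity: usize) -> (Self, Receiver<Frame>) {
        let (tx, rx) = sync_channel(capacity);
        (Self { tx, dropped: HashMap::new() }, rx)
    }

    /// Never blocks the publisher. A full queue drops the event and
    /// bumps the counter for that topic.
    pub fn publish(&mut self, topic: &'static str, body: &str) {
        // Report earlier drops first, if the queue has room again.
        let pending = self.dropped.get(topic).copied().unwrap_or(0);
        if pending > 0 {
            let report = Frame::Overflow { topic, dropped: pending };
            if self.tx.try_send(report).is_ok() {
                self.dropped.insert(topic, 0);
            }
        }
        let frame = Frame::Event { topic, body: body.to_string() };
        if let Err(TrySendError::Full(_)) = self.tx.try_send(frame) {
            *self.dropped.entry(topic).or_insert(0) += 1;
        }
    }
}
```

A slow subscriber therefore loses events but learns how many it lost,
and a stalled connection cannot back-pressure the daemon.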

Workspace deps unchanged; uses already-declared crossbeam-channel,
parking_lot. Sinks/SinkInfo gained Default derives.

Tests: 97 passing (28 dsp, 20 ipc, 45 core, 4 client). Clippy clean
at default level under -D warnings.

Remaining Phase 4 punch-list (recommended order):
  4e profile TOML loader + hot reload (notify-debouncer-mini)
  4h preferred_real_sink tracking
  4i target.object routing reliability on real WirePlumber
  4f slow AGC loop with ebur128
  4g meters publishing
  4j auto-promote to default sink (optional flag)
2026-05-19 23:14:18 +10:00
atagen
ae83310772 stage 3: daemon core
Phase 3 — bring up the daemon end-to-end through six checkpoints:

  3a Module skeleton (error, profile, routing, runtime, pw/*)
  3b Pure routing engine + 13 tests (no PipeWire dep)
  3c PwContext: main loop, sigprocmask-block SIGTERM/SIGINT before
     add_signal_local so signalfd actually picks them up
  3d headroom-processed virtual sink via the adapter factory with
     factory.name=support.null-audio-sink
  3e Filter: two pw_streams (capture from monitor / playback to real
     sink) with an rtrb SPSC ring between them. DSP chain
     (Compressor → two-tier Limiter) runs in the playback callback.
     Allocation-free; #![forbid(unsafe_code)] preserved via
     bytemuck::try_cast_slice for the byte↔f32 reinterpretation.
  3f Registry watcher binds the default metadata, evaluates new
     Stream/Output/Audio nodes against profile rules, writes
     target.object for processed routes. Self-stream guard skips
     anything whose node.name starts with 'headroom-filter'.
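
As an illustration of 3b and the 3f self-stream guard, a PipeWire-free
routing decision might look like this. The `headroom-filter` prefix
check is from the description above; `Route`, `Rule`, `decide` and the
match-on-binary rule shape are hypothetical and much simpler than a
real profile.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Processed,
    Bypass,
}

/// One profile rule: route streams from a given binary (hypothetical shape).
pub struct Rule {
    pub binary: &'static str,
    pub route: Route,
}

/// Returns None for the daemon's own filter streams, otherwise the
/// route from the first matching rule or the profile default.
pub fn decide(
    node_name: &str,
    binary: Option<&str>,
    rules: &[Rule],
    default: Route,
) -> Option<Route> {
    // Self-stream guard: never route the bus filter back into itself.
    if node_name.starts_with("headroom-filter") {
        return None;
    }
    let route = binary
        .and_then(|b| rules.iter().find(|r| r.binary == b))
        .map(|r| r.route)
        .unwrap_or(default);
    Some(route)
}
```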

Workspace deps added: pipewire = { features = ["v0_3_44"] } for the
modern TARGET_OBJECT key, libspa, rtrb, nix (sigprocmask), bytemuck.

Tests: 65 passing (28 dsp, 20 ipc, 4 client, 13 core). Clippy clean
at default level under -D warnings.

PLAN.md §5 renumbered to fix stale subsection labels (was 4.1–4.4
from before the per-app insertion).

Known limitations punted to Phase 4 (documented in commit history
and team memory):
  - WirePlumber doesn't always honor late target.object writes once
    a stream is already linked (timing race).
  - preferred_real_sink dynamic tracking stubbed.
  - No auto-promote of headroom-processed to system default.
  - application.process.binary occasionally arrives in late metadata
    updates after the global registers; routing logs show '?' until
    we add a re-read.
2026-05-19 22:15:49 +10:00
atagen
ca1910de60 stage 2
2026-05-19 16:33:09 +10:00