Headroom

A Rust AGC + compressor + true-peak limiter for PipeWire. Per-application exclusion, profile-based presets, single-binary daemon, scriptable over a Unix-domain socket.

This document is the canonical plan. It supersedes the earlier conversational sketch.


1. Goals & non-goals

Goals

  • Hard safety net on the processed route. Audio routed through headroom-processed is guaranteed to leave the filter below a configurable ceiling (default -0.1 dBTP) with proper inter-sample peak handling. The guarantee is enforced inline in the filter, downstream of every control-plane code path, and survives daemon misbehaviour, profile reloads, and bad routing decisions. Streams routed bypass ride the real sink directly and are not subject to this contract (see §2 path ①); the contract also does not extend to whatever resampling or post-processing the downstream device path applies after the filter's output.
  • Per-application exclusion. Music players, games, and DAWs route around the processor; browsers, voice chat, and "everything else" go through it. Rules are app-level and live in profiles.
  • Drop-in defaults. First-run experience: install, enable user service, done. No mandatory config. Power users edit TOML or use the CLI.
  • Profiles for distinct listening scenarios (default / night / speech / transparent / bypass-all).
  • Single binary. Daemon, filter, routing, and control loop all live in one process. The DSP kernels are a separate crate so they can be reused (LV2/standalone) later.
  • Scriptable. Unix-domain-socket IPC with a documented JSON schema so anyone can write an alternative client (Qt/QuickShell panel, Eww widget, scripts). A first-party Rust crate (headroom-ipc) wraps it.
  • Rust, lean dep tree. No NIH where mature crates exist, no bloat where they don't.

Non-goals (v0)

  • Surround / >2-channel content. v0 is stereo only; >2ch is routed directly to the real sink, untouched by Headroom's filter chain.
  • LV2/CLAP plugin distribution. The DSP crate is plugin-shaped so this is cheap to add later, but it's not a v0 deliverable.
  • GUI. Third parties can build one against the IPC.
  • Capture-side processing (microphone). v0 is playback only.

2. Architecture

Each app's audio takes one of four end-to-end paths, chosen by two orthogonal profile flags: a routing decision (processed vs. bypass) and a per-app level-control flag (on vs. off).

                ┌─── optional, opt-in per app (Layer A) ────────────────┐
                │                                                       │
                │   ┌─► passive tap ─► peak + RMS ─► AppLevelController │
                │   │      (sibling link in same quantum)         │     │
                │   │                                             │     │
                │   │           Props.channelVolumes write ◄──────┘     │
                │   │                                                   │
                └───┼───────────────────────────────────────────────────┘
                    │
                    │       APP STREAM NODE
                    │   ┌──────────────────────────┐
                    │   │ raw output               │
   app's audio  ───►├──►│   × channelVolumes       │──► output port
                    │   └──────────────────────────┘
                    │                                            │
                    └────────────────────────────────────────────│
                                                                 │
                              routing decision (Layer B)         │
                              target.object set by daemon        │
                                                                 │
                       ┌─────────────────────────────────────────┴┐
                       ▼                                          ▼
                  route = "bypass"                       route = "processed"
                  target.object =                        target.object =
                  preferred_real_sink                    headroom-processed
                       │                                          │
                       │                                          ▼
                       │                              ┌─────────────────────┐
                       │                              │ headroom-processed  │
                       │                              │ (virtual sink, the  │
                       │                              │  system default)    │
                       │                              └─────────┬───────────┘
                       │                                        ▼
                       │                              ┌─────────────────────┐
                       │                              │  headroom-filter    │
                       │                              │  (pw_filter node,   │
                       │                              │   4 mono ports —    │
                       │              Layer C (bus DSP) │  FL/FR in + out)   │
                       │                              │  AGC → compressor   │
                       │                              │  → soft → hard      │
                       │                              └─────────┬───────────┘
                       │                                        │
                       ▼                                        ▼
                  preferred_real_sink  ◄──────────────────────► (DAC)

The four end-to-end paths

| | Routing = bypass | Routing = processed |
|---|---|---|
| per-app off | true bypass: Headroom touches nothing on the signal path. Same latency as if Headroom weren't installed. | bus DSP only: stream flows through headroom-processed and the inline chain. channelVolumes left at whatever the user/app set. |
| per-app on | per-app only: level-reactive channelVolumes writes, no graph hop. Zero added signal-path latency. | full stack: per-app level control and bus DSP. Maximum protection. |

Path-by-path properties:

| Path | Signal-path latency added | Limiter contract? | Per-app gain ride? |
|---|---|---|---|
| ① bypass / per-app off | 0 | no | no |
| ② bypass / per-app on | 0 | no | yes (Layer A) |
| ③ processed / per-app off | filter hop + ~2 ms lookahead | yes (Layer C hard tier) | no |
| ④ processed / per-app on | filter hop + ~2 ms lookahead | yes (Layer C hard tier) | yes (Layer A) |

The two flags are independent. A competitive game's typical config is ①: zero Headroom involvement in its audio. A user concerned about notification dings on top of that game would put Discord on ② or ④ (so notifications get tamed via Discord's own channelVolumes) while leaving the game on ①.

                  headroom-core (daemon, one process)
                  • per-app level controllers (Layer A)
                  • routing engine + preferred_real_sink (Layer B)
                  • slow AGC loop, profile manager (Layer C)
                  • IPC server
                          │
                          ▼
              $XDG_RUNTIME_DIR/headroom/control.sock
                          │
              ┌───────────┴───────────┐
              ▼                       ▼
         headroom CLI         third-party clients
                              (Qt panel, widgets, …)

See §4 for Layer A's mechanics and §5 for the PipeWire-level details of Layers B and C.

One virtual sink, one daemon process

  • headroom-processed — virtual sink. Set as the system default so new streams land in it by default. Its monitor is captured by headroom-filter, pushed through the DSP graph, and emitted to the current preferred_real_sink.
  • No bypass sink. Streams marked route = "bypass" are pointed directly at preferred_real_sink via a target.object metadata write. They pay zero added latency vs. running without Headroom installed at all — there's no extra graph hop, no extra DSP. The word "bypass" in the profile DSL means "route directly to the real sink, untouched."
  • The daemon owns:
    • the one virtual sink (created on startup, torn down on exit);
    • the filter — a single pw_filter node (headroom-filter) with four mono DSP ports (input FL/FR + output FL/FR) running on PipeWire's realtime data thread. Wrapped by the in-house pipewire-filter workspace crate because pipewire-rs 0.8 doesn't expose pw_filter. WirePlumber doesn't auto-link pw_filter, so the routing engine creates the processed.monitor → filter in and filter out → preferred_real_sink links explicitly via link-factory (the same primitive the routing engine already uses for stream re-pinning in Phase 4k);
    • one AppLevelController per managed app stream (§4), each with its own passive pw_stream tap, peak/RMS envelopes, and Props.channelVolumes writer. Created/destroyed on stream lifecycle events.
    • preferred_real_sink tracking. The daemon watches the default.audio.sink metadata key. When the user changes the system default (via pavucontrol, wpctl set-default, etc.) to a hardware sink, the daemon (a) treats that sink as the new preferred_real_sink, (b) re-links headroom-filter's playback stream to it, and (c) rewrites target.object for every currently-bypassed stream so they follow. Hotplug / Bluetooth handoffs use the same machinery.
    • the slow AGC loop (reads loudness, writes gain target into the filter via an rtrb channel);
    • the routing engine (subscribes to the PipeWire registry, evaluates rules on new streams, writes target.object to the default metadata: either headroom-processed for processed streams or preferred_real_sink for bypassed streams);
    • the IPC server.

Why no headroom-bypass sink

An earlier iteration of the design had a second virtual sink (headroom-bypass) that loopback'd to the real sink, so "bypassed" streams routed to it. This added one PipeWire quantum of latency to every bypassed stream for no functional benefit — module-loopback buffers across the quantum boundary even when the DSP is a no-op. Direct routing via target.object skips the hop entirely. The win is real for competitive games, DAW monitoring, and music players: they now ride exactly the same path they'd take if Headroom weren't installed.

Why this is not the "analytical sink + adjust master volume" shape originally proposed

Volume control via SPA Props updates is not sample-accurate. A true-peak limiter needs a small internal delay line so gain reduction is applied to the same samples that were analyzed. Therefore the brickwall must be inline. The analytical-monitor approach is still used — for the slow AGC loop, where multi-second time constants make control-plane latency irrelevant — but it cannot own the ceiling.

Why a native pw_filter node, not an LV2 plugin in module-filter-chain

LV2 is not native to PipeWire; it's one of several plugin formats module-filter-chain happens to host (via lilv). Using LV2 would split Headroom into a plugin + a daemon + a filter-chain JSON, pull in a lilv runtime, and force gain-target updates through a 32-bit-float control-port abstraction. A native pw_filter node is the same primitive module-filter-chain itself uses internally, but written directly in Rust, in the same process as the rest of the daemon. One binary, no IPC for parameter updates, idiomatic Rust audio thread. An LV2 wrapper of headroom-dsp remains a viable optional deliverable for use in DAWs.

Bus filter implementation — the in-house pipewire-filter crate. pipewire-rs 0.8 ships Stream but not Filter. Since headroom-core declares #![forbid(unsafe_code)], the unsafe FFI lives in a separate small workspace crate (crates/pipewire-filter/), mirroring pipewire-rs's Stream patterns (heap-boxed FilterListener<D>, RAII Buffer<'p>, // SAFETY: on every unsafe block). The crate covers exactly the events Headroom needs (process, state_changed, param_changed). Audited by Codex on landing; the two findings that would have been real bugs in our use (over-permissive Sync on PortData; passing the error ptr to the old state in state_changed) were applied. Architectural rule: when pipewire-rs later ships its own Filter, switch to it and delete this crate.

Earlier shape (now retired): two pw_streams + a ring. The dual-pw_stream arrangement we shipped in Phase 3 had no PipeWire graph dependency between capture and playback, so the scheduler was free to fire playback before capture in the same quantum → ring-empty → tremolo at quantum cadence. The mitigation was a 65k-sample SPSC ring sized for 4× the worst-case buffer (clock.quantum-limit × CHANNELS), adding ~340 ms average latency. pw_filter removes the ring entirely: a single node has its own input→process→output ordering by construction (the same ordering module-filter-chain relies on). See headroom-pipewire-gotchas #14, #17, #18 for the full diagnostic trail.
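The ~340 ms figure can be reproduced with a little arithmetic. The sketch below is illustrative only: it assumes PipeWire's stock clock.quantum-limit of 8192 frames, a 48 kHz graph, and a ring that hovers around half full on average. None of these values are taken from the retired Phase 3 code, and the function names are hypothetical.

```rust
/// Ring capacity in interleaved samples: 4x the worst-case buffer
/// (quantum limit in frames times channel count).
fn ring_capacity_samples(quantum_limit_frames: usize, channels: usize) -> usize {
    4 * quantum_limit_frames * channels
}

/// Average added latency in milliseconds, assuming the ring sits
/// half full on average (an assumption, not a measured value).
fn average_latency_ms(capacity_samples: usize, channels: usize, rate_hz: f64) -> f64 {
    let half_full_frames = (capacity_samples / 2) as f64 / channels as f64;
    half_full_frames / rate_hz * 1000.0
}
```

With 8192 frames and two channels the capacity comes out at 65536 samples, and the half-full latency at 48 kHz lands at roughly 341 ms, which matches the figure quoted above.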


3. DSP

3.1 Two-tier true-peak limiter (headroom-dsp::limiter)

The limiter has two parallel tiers sharing the same upsampler, downsampler, delay line, and sliding peak buffer. Both run at the oversampled rate.

Hard tier: the safety contract. Output ceiling default -0.1 dBTP, configurable. Instant attack on the gain envelope plus a brief hold and a slow release. Two defensive clamp stages downstream (once in the oversampled domain, once at the input rate after downsampling) guarantee the contract numerically, so the envelope can misbehave and the contract still holds. Never bypassed, never disabled.

Contract scope (caveat). The ≤ -0.1 dBTP guarantee holds at the filter's output, not at the speaker. The bus filter is hardcoded F32 stereo @ 48 kHz (headroom-dsp::limiter's 4× oversampler is sized for 48 kHz); when the real sink negotiates a different rate (44.1 kHz, 96 kHz, 192 kHz), PipeWire inserts a downstream resampler between filter.playback and the sink. Polynomial / windowed-sinc resamplers can elevate inter-sample peaks slightly through their own reconstruction, so the limiter's true-peak guarantee leaks across that resampling stage. In practice the elevation is small (a few tenths of a dB worst case for a clean band-limited resampler), and the contract still holds at the bus output where Headroom is in control. For the contract to hold end-to-end the filter would need to match the real sink's rate and rebuild its DSP coefficients on rate change; that is the v1 work tracked as PLAN §11 "filter rate matching" (deferred from 8d, gated on a multi-rate hardware test bench).

Soft tier — the comfort cap. Targets a dynamic ceiling computed as program_lufs + max_psr_db. Smooth attack/release envelope so the gain reduction sounds like volume riding, not a slap. Pulls transients to a comfortable peak-to-loudness ratio (default 14 dB) before they ever threaten the hard ceiling. When the AGC hasn't yet provided a program loudness (startup, after reset), the soft tier falls back to a static ceiling. Disabled by omitting [limiter.soft] in a profile — useful for the transparent profile where users want pure brickwall behavior.
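As a minimal sketch of the dynamic ceiling described above: the soft ceiling is the program loudness plus the allowed peak-to-loudness ratio, with a static fallback while the AGC has not reported a loudness yet. The function names and the fallback parameter are hypothetical; only program_lufs and max_psr_db come from this plan.

```rust
/// Soft ceiling in dB: program loudness plus the allowed
/// peak-to-loudness ratio, or a static fallback when the AGC has
/// not provided a program loudness yet (startup, after reset).
fn soft_ceiling_db(program_lufs: Option<f64>, max_psr_db: f64, fallback_db: f64) -> f64 {
    match program_lufs {
        Some(lufs) => lufs + max_psr_db,
        None => fallback_db,
    }
}

/// Convert a dB value to a linear amplitude factor.
fn db_to_lin(db: f64) -> f64 {
    10f64.powf(db / 20.0)
}
```

For example, a program at -23 LUFS with the default 14 dB ratio would give a soft ceiling of -9 dB, which the limiter would then use in linear form.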

Algorithm (per oversampled sample, after upsampling):

  1. Push raw |s| into the sliding-window peak buffer; read the max-of-window.
  2. Soft tier computes target = soft_ceiling / window_peak (clamped to ≤ 1), runs through the smooth attack/release envelope, yields soft_gain.
  3. Hard tier predicts the worst-case effective peak after the soft tier acts (max of window_peak * soft_gain and the asymptote min(window_peak, soft_ceiling)), then sizes hard_target to keep that under the hard ceiling. Instant attack, hold, exponential release. Yields hard_gain.
  4. total_gain = min(soft_gain, hard_gain).
  5. Multiply the delayed sample by total_gain.
  6. Clamp at hard ceiling (defense-in-depth).
  7. Downsample, clamp again at hard ceiling at the input rate.

When the soft tier is doing its job, the hard tier's "predicted-post-soft" target stays above 1.0 and the hard tier never engages. When the soft tier is mid-attack (peak just arrived), the hard tier snaps in as a safety, then releases as the soft tier catches up.
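Two pieces of the algorithm can be sketched without guessing at the envelope design: the sliding-window peak of step 1 and the gain application of steps 4 to 6. The monotonic-deque approach below is one common way to get a max-of-window in amortised constant time; it is an assumption that the real limiter works this way, and all names are hypothetical. Oversampling and the delay line are left out.

```rust
use std::collections::VecDeque;

/// Sliding-window maximum over the last `len` values.
struct SlidingMax {
    len: usize,
    index: usize,
    // (sample index, value); values strictly decrease front to back.
    candidates: VecDeque<(usize, f32)>,
}

impl SlidingMax {
    fn new(len: usize) -> Self {
        assert!(len > 0);
        Self { len, index: 0, candidates: VecDeque::with_capacity(len) }
    }

    /// Step 1: push |s| and read the max of the current window.
    fn push(&mut self, abs_sample: f32) -> f32 {
        // Older candidates that are not larger can never be the max again.
        while matches!(self.candidates.back(), Some(&(_, v)) if v <= abs_sample) {
            self.candidates.pop_back();
        }
        self.candidates.push_back((self.index, abs_sample));
        // Drop the front candidate once it has left the window.
        if let Some(&(i, _)) = self.candidates.front() {
            if i + self.len <= self.index {
                self.candidates.pop_front();
            }
        }
        self.index += 1;
        self.candidates.front().map(|&(_, v)| v).unwrap_or(0.0)
    }
}

/// Steps 4 to 6: take the smaller gain, apply it to the delayed
/// sample, then clamp at the hard ceiling as defense in depth.
fn apply_gain(delayed: f32, soft_gain: f32, hard_gain: f32, hard_ceiling: f32) -> f32 {
    let total_gain = soft_gain.min(hard_gain);
    (delayed * total_gain).clamp(-hard_ceiling, hard_ceiling)
}
```

The final clamp is what lets the contract hold even if the envelopes misbehave: whatever gain the tiers produce, the output magnitude cannot exceed hard_ceiling.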

The compressor and AGC stages run before the limiter.

3.2 Feed-forward compressor (headroom-dsp::compressor)

Standard shape: log-domain detector (peak or RMS, switchable) → ratio + soft knee → attack/release envelope smoother → makeup gain → linear gain → apply to (small) delayed input. ~150 lines of clean code.

Defaults aimed at "gentle, transparent": threshold -24 dBFS, ratio 2.5:1, knee 6 dB, attack 10 ms, release 100 ms, makeup auto.
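For illustration, the ratio + soft knee stage can be written as a static gain curve in the log domain. This is the textbook quadratic-knee formulation, not necessarily what headroom-dsp::compressor implements, and the function name is hypothetical. Detector, envelope smoothing, and makeup gain are omitted.

```rust
/// Static gain change in dB (zero or negative) for a detector level,
/// using a quadratic soft knee centred on the threshold.
fn gain_change_db(level_db: f32, threshold_db: f32, ratio: f32, knee_db: f32) -> f32 {
    let over = level_db - threshold_db;
    let slope = 1.0 / ratio - 1.0; // negative for ratio > 1
    if 2.0 * over < -knee_db {
        // Below the knee: no gain reduction.
        0.0
    } else if knee_db > 0.0 && 2.0 * over.abs() <= knee_db {
        // Inside the knee: blend smoothly from 0 to the full slope.
        let x = over + knee_db / 2.0;
        slope * x * x / (2.0 * knee_db)
    } else {
        // Above the knee: full ratio.
        slope * over
    }
}
```

With the defaults listed above, a detector level of -14 dBFS (10 dB over threshold) yields about 6 dB of reduction, and a level exactly at the threshold yields about 0.45 dB because it sits in the middle of the knee.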

3.3 Slow AGC (headroom-core::agc)

Algorithmic descendant of EasyEffects' autogain.cpp. Runs outside the audio thread, on a ~50 ms control tick.

  • Feeds the audio thread's monitor tap into ebur128 with Mode::M | S | I | TRUE_PEAK.
  • Computes target_gain_dB = target_lufs - measured_lufs.
  • Smooths with separate attack/release coefficients (leaky integrator).
  • Gates when momentary loudness < silence threshold.
  • Soft-clamps so the AGC can never push more than ±N dB (profile knob).
  • Writes the new gain target into the audio thread via an rtrb queue.

The AGC's gain is applied before the compressor. The compressor and limiter still own their own behaviour and ceilings.
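One control tick of the loop above might look like the following sketch. Several details are assumptions: the gate simply holds the current gain, the clamp is a plain hard clamp rather than the soft clamp the plan calls for, and the struct, field names, and coefficient values are hypothetical. The ebur128 measurement and the rtrb write are left out.

```rust
/// State for a slow AGC loop (hypothetical field names).
struct Agc {
    target_lufs: f64,
    gate_lufs: f64,   // silence threshold for the momentary reading
    max_gain_db: f64, // the "±N dB" profile knob
    attack: f64,      // one-pole coefficient when gain must fall
    release: f64,     // one-pole coefficient when gain may rise
    gain_db: f64,
}

impl Agc {
    /// Run one control tick and return the gain target in dB.
    fn tick(&mut self, momentary_lufs: f64, measured_lufs: f64) -> f64 {
        // Gate: during silence, hold the current gain (assumption).
        if momentary_lufs < self.gate_lufs {
            return self.gain_db;
        }
        // target_gain_dB = target_lufs - measured_lufs, limited to ±N dB.
        let wanted = (self.target_lufs - measured_lufs)
            .clamp(-self.max_gain_db, self.max_gain_db);
        // Leaky integrator with separate attack/release coefficients.
        let coeff = if wanted < self.gain_db { self.attack } else { self.release };
        self.gain_db += coeff * (wanted - self.gain_db);
        self.gain_db
    }
}
```

The returned value is what would be pushed to the audio thread as the new gain target.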

3.4 Measurement: ebur128

Mode::M | S | I | TRUE_PEAK. EBU TECH 3341/3342 conformant via the ebur128 crate. Constructed on the daemon thread; fed from a ring-buffer consumer that pulls from the audio thread. The audio thread allocates nothing.

This is bus-level measurement only — used to drive the slow AGC loop and meter the processed sink output. Per-app measurement (§4) uses a different, much cheaper metric.


4. Per-application level control (Layer A)

An opt-in, near-zero-latency feedback loop that watches each managed application's output stream and adjusts its Props.channelVolumes multiplier in response to two parallel level metrics:

  • a fast peak envelope that catches short bursts and sustained loud passages (think: a notification ding, a video that just got louder), and
  • a slow RMS envelope that catches sustained loudness mismatches (think: "Discord is permanently louder than everything else even when nobody's shouting").

A stream's applied gain reduction is max(peak_reduction, rms_reduction) — whichever path is asking for more cut wins, and recovery only happens when both paths agree the stream has settled. This is the layer's whole point: the peak path handles transients within one quantum; the RMS path keeps long-term inter-app loudness balanced. Neither alone is enough.

Orthogonal to bus routing — a stream can be processed or bypassed and level-controlled independently. Its goal is "tame noisy apps without startling the listener and without making the chronic loudmouth permanently dominate," while the signal path itself stays untouched.

4.1 Why this is zero-latency

The per-app multiplier is the channelVolumes value PipeWire already applies inside the app's stream node — it's the same number pavucontrol's per-app slider writes to. Adjusting it doesn't insert a graph node; nothing new sits between the app and its destination sink. The only cost is that the analysis happens via a sibling fanout link, not in the playback path: PipeWire schedules fanout consumers in parallel within the same quantum, so the playback path's timing is identical to the no-tap case.

                    ┌──► passive tap (analysis only)
                    │       │
                    │       ▼
                    │   peak + RMS envelopes
                    │   (audio thread, sub-ms)
   app stream ──────┤       │
   (output port)    │       ▼
                    │   rtrb push
                    │       │
                    │       ▼
                    │   AppLevelController (daemon thread)
                    │       │
                    │       │  Props.channelVolumes write
                    │       ▼  (back into the app stream node)
                    │   ┌─────────────────────┐
                    └──►│ app stream multiplies
                        │ by channelVolumes,  │──► (its sink — Layer B)
                        │ then publishes.     │
                        └─────────────────────┘

Important correction (2026-05-22): the diagram above shows the tap branching off the source before the channelVolumes multiplier, but in practice PipeWire's standard adapter applies channelVolumes inside the source node, so anything reading the output port sees the post-attenuation signal. Untreated, this closes a feedback loop on the controller: write reduction → tap measures attenuated signal → envelopes release → "no reduction needed" → controller stops writing, gain freezes wherever it last was, dynamics no longer tracked. The implementation compensates inside AppLevelController::process_block by dividing the incoming peak_lin by last_written_lin and mean_sq_lin by its square, recovering the pre-attenuation signal estimate. Below a floor of -40 dB applied gain (GAIN_COMPENSATION_FLOOR = 0.01) the compensation is skipped, because a fully-muted stream would otherwise amplify floor noise back to max-cut and lock the user out of unmuting. See app_level.rs and the per-app-gain memory note for the rationale and the corner cases.
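A minimal sketch of that compensation follows. GAIN_COMPENSATION_FLOOR and the variable names come from the text above; the free-function shape and the exact comparison at the floor are assumptions, and the real code lives in app_level.rs.

```rust
/// Linear gain below which compensation is skipped (-40 dB).
const GAIN_COMPENSATION_FLOOR: f32 = 0.01;

/// Undo our own attenuation so the envelopes see an estimate of the
/// pre-channelVolumes signal. Returns (peak, mean_square).
fn compensate(peak_lin: f32, mean_sq_lin: f32, last_written_lin: f32) -> (f32, f32) {
    if last_written_lin < GAIN_COMPENSATION_FLOOR {
        // Near-muted stream: dividing would blow floor noise up to
        // full scale, so pass the measurement through untouched.
        return (peak_lin, mean_sq_lin);
    }
    (
        peak_lin / last_written_lin,
        // Mean-square scales with the square of the gain.
        mean_sq_lin / (last_written_lin * last_written_lin),
    )
}
```

For a stream we have cut to 0.5, a measured peak of 0.25 is reported to the envelopes as 0.5, and a mean-square of 0.0625 as 0.25.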

Source-suspension catch-up. When the source node suspends (PipeWire's adapter stops delivering buffers — Strawberry between tracks, the user pausing, a screensaver kicking in) the tap's process_block doesn't run, so the envelopes don't release and the controller carries stale attenuation into the next stretch of audio. AppLevelController::tick_silent(now) — called from the Layer A drain timer on every pass — advances envelopes through silent gaps by feeding (0, 0) inputs at the controller's block period. Bounded by MAX_SILENT_CATCHUP_BLOCKS (~10 s); past that the envelopes have fully released anyway and we short-circuit via envelopes.reset(). The drain pass runs at 5 ms cadence, so post-resume audio sees a fresh controller within one tick.
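The catch-up decision can be sketched as a small pure function: work out how many block periods elapsed while the source was suspended, and either feed that many silent blocks or reset outright once the cap is exceeded. The CatchUp enum and plan_catch_up name are hypothetical; only the cap (MAX_SILENT_CATCHUP_BLOCKS) and the reset shortcut come from the text.

```rust
use std::time::Duration;

/// What the drain pass should do for a silent gap.
#[derive(Debug, PartialEq)]
enum CatchUp {
    /// Feed this many (0, 0) blocks through the envelopes.
    Feed(u32),
    /// Gap is longer than the cap: reset the envelopes instead.
    Reset,
}

fn plan_catch_up(elapsed: Duration, block_period: Duration, max_blocks: u32) -> CatchUp {
    if block_period.is_zero() {
        // Defensive: no meaningful block period, nothing to feed.
        return CatchUp::Feed(0);
    }
    let blocks = elapsed.as_nanos() / block_period.as_nanos();
    if blocks > max_blocks as u128 {
        CatchUp::Reset
    } else {
        CatchUp::Feed(blocks as u32)
    }
}
```

With a 10 ms block period and a cap of 1000 blocks (about 10 s), a 50 ms gap feeds five silent blocks and a 30 s gap resets.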

Per-app user_ceiling persistence across stream lifecycles. Apps like Strawberry create a fresh Stream/Output/Audio node per track. PipeWire carries over the previous node's Props.channelVolumes — frequently our own last-written value from the prior track. The new managed_stream's controller is fresh (last_written_lin = 1.0); without intervention, the first param event from subscribe_params(Props) fires with the inherited daemon-value, the echo check fails (diff vs 1.0 is huge), on_external_change misattributes it as user-set, and the ceiling gets locked at whatever the previous track's reduction was. RoutingState therefore holds a persisted_ceilings map keyed by app_label (process_binary, falling back to application_name); on managed_stream teardown we save the controller's current user_ceiling_lin, and on spawn we AppLevelController::restore_state(ceiling, now) plus write the ceiling to Props.channelVolumes BEFORE calling subscribe_params(Props). The ordering is load-bearing — writing after subscribe races against the initial-state replay and the bug recurs. First-time apps (no persisted entry) still treat the first observation as user-set, which is correct because no daemon-value can have been inherited yet.

4.2 The metrics: peak + RMS, no LUFS

LUFS is the wrong measurement here. Its shortest window (momentary, 400 ms) blurs out exactly the transients we want to catch, and the K-weighting filter adds CPU for no benefit when we're trying to react fast. We also explicitly want a second path that targets sustained loudness — for that, plain mean-square RMS is the right cheap stand-in, not LUFS.

| Metric | Window | Job |
|---|---|---|
| Peak envelope | max(\|samples\|) per block, smoothed ~100 ms attack window, ~500 ms release | Fast: catches a notification ding, a clip getting louder, a partner standing up and shouting. Triggers cut on peak_threshold_db (default -6 dBFS). |
| RMS envelope | block mean-square, smoothed ~12 s | Slow: catches "this app is just chronically louder than everything else." Triggers cut on rms_target_db (default ≈ -20 dBFS RMS). |

Both are computed from the same raw buffer in the audio thread, so the audio-thread cost is one additional MAC accumulator and a max-scan per sample. Cost analysis in §4.7.
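A single pass over the block is enough to produce both raw metrics. The sketch below shows that one-loop shape; the function name is hypothetical and the real tap callback may be organised differently (for example per channel).

```rust
/// One pass over a block: returns (peak, mean_square).
/// Allocation-free, so it is safe to call on the audio thread.
fn block_metrics(samples: &[f32]) -> (f32, f32) {
    if samples.is_empty() {
        return (0.0, 0.0);
    }
    let mut peak = 0.0f32;
    let mut sum_sq = 0.0f32;
    for &x in samples {
        peak = peak.max(x.abs()); // max-scan
        sum_sq += x * x;          // MAC accumulator
    }
    (peak, sum_sq / samples.len() as f32)
}
```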

4.3 Architecture

For each managed playback stream (matched by routing rule — see §6):

  1. Audio thread (tap stream's process callback):
    • Pull the buffer from the fanout link.
    • peak = max(|samples|) over the block.
    • mean_sq = Σ(x*x) / n over the block.
    • Push {node_id, peak, mean_sq} to a per-stream rtrb.
  2. Daemon thread (AppLevelController per stream):
    • Drain the rtrb.
    • Update peak envelope (one-pole, fast α — attack within a block, release ~500 ms).
    • Update RMS envelope (one-pole, slow α — window ~12 s).
    • Compute peak_reduction_db and rms_reduction_db independently, then proposed = max(peak_reduction_db, rms_reduction_db).
    • Smooth toward proposed.
    • If the smoothed value is significantly different from last-written AND we're not rate-limited (~10 Hz max writes per stream), submit Props.channelVolumes update.

The recovery condition is intentionally both-paths-agree: a release on the peak path only counts toward unwinding gain reduction if the RMS path also reads quiet. This avoids the pumping artefact where a transient-heavy stream would rapidly release between transients only to be slapped back down on the next one.
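The last three daemon-thread steps can be sketched as follows. The mapping from envelope level to reduction (level above threshold, floored at zero) is an assumption, as are the WriteGate type and its fields; envelope smoothing is omitted.

```rust
use std::time::{Duration, Instant};

/// Reduction asked for by one path: how far the level sits above its limit.
fn reduction_db(level_db: f32, limit_db: f32) -> f32 {
    (level_db - limit_db).max(0.0)
}

/// Whichever path asks for more cut wins.
fn proposed_reduction_db(
    peak_env_db: f32,
    rms_env_db: f32,
    peak_threshold_db: f32,
    rms_target_db: f32,
) -> f32 {
    reduction_db(peak_env_db, peak_threshold_db).max(reduction_db(rms_env_db, rms_target_db))
}

/// Decides whether a smoothed value is worth a Props.channelVolumes write.
struct WriteGate {
    last_written_db: f32,
    last_write: Option<Instant>,
    min_interval: Duration, // e.g. 100 ms for ~10 Hz max writes
    min_delta_db: f32,      // "significantly different" threshold
}

impl WriteGate {
    /// Returns true and records the write if it should be submitted.
    fn try_write(&mut self, smoothed_db: f32, now: Instant) -> bool {
        let changed = (smoothed_db - self.last_written_db).abs() >= self.min_delta_db;
        let rate_ok = self
            .last_write
            .map_or(true, |t| now.duration_since(t) >= self.min_interval);
        if changed && rate_ok {
            self.last_written_db = smoothed_db;
            self.last_write = Some(now);
            true
        } else {
            false
        }
    }
}
```

Because the proposal is a max of the two paths, it only falls back to zero once both the peak path and the RMS path read below their limits, which is one simple way to obtain the both-paths-agree recovery described above.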

4.4 Honouring user-set volumes

The daemon subscribes to Props param-change events on each managed stream. When a channelVolumes change arrives that's meaningfully different from last_written_volume, it wasn't us — the user adjusted via pavucontrol, a hotkey, an app's own UI, etc. The controller then either:

  • defers entirely (stops adjusting the stream until the user opts back in via headroom per-app reset <app>), or
  • treats the user value as a ceiling (continues to cut on spikes but never raises above what the user wanted).

Default is the ceiling behaviour — it's the principle-of-least-surprise choice. Users who want strict deference set a profile flag.
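The two deference modes can be sketched as a small state machine. The type names, the ECHO_EPSILON tolerance, and the method names are hypothetical; the plan only fixes the behaviour (echoes of our own writes are ignored, external changes either become a ceiling or stop the controller).

```rust
/// How the controller reacts to an external volume change.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Deference {
    Ceiling,
    Strict,
}

/// Hypothetical tolerance for "meaningfully different".
const ECHO_EPSILON: f32 = 0.005;

struct VolumeState {
    last_written_lin: f32,
    user_ceiling_lin: f32,
    deferred: bool,
}

impl VolumeState {
    fn new() -> Self {
        Self { last_written_lin: 1.0, user_ceiling_lin: 1.0, deferred: false }
    }

    /// Handle a Props change; returns true if it was an external change.
    fn on_props_change(&mut self, observed_lin: f32, mode: Deference) -> bool {
        if (observed_lin - self.last_written_lin).abs() <= ECHO_EPSILON {
            return false; // echo of our own write
        }
        match mode {
            Deference::Ceiling => self.user_ceiling_lin = observed_lin,
            Deference::Strict => self.deferred = true,
        }
        true
    }

    /// Gain the controller may write: None while deferred,
    /// otherwise never above the user's ceiling.
    fn allowed_gain(&self, wanted_lin: f32) -> Option<f32> {
        if self.deferred {
            None
        } else {
            Some(wanted_lin.min(self.user_ceiling_lin))
        }
    }
}
```

In ceiling mode a user who sets 0.8 still gets cuts below 0.8 on spikes, but the controller never raises the stream above 0.8.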

A historical concern: apps that fight back

Some PulseAudio-era apps (Discord most famously) used to read and re-assert their own channelVolumes periodically, fighting any external volume manager. The pattern produced a visible ping-pong loop and effectively disabled per-app management.

The pattern is largely absent from modern PipeWire-native and Electron-based apps in 2024+: in-app sliders write channelVolumes only on user interaction, not on a timer. From Headroom's perspective, those user-interaction writes are indistinguishable from a pavucontrol slider move — both are legitimate external changes the deference policy correctly yields to.

If a fight-back app does appear, the ceiling deference mode degrades gracefully:

  • App produces hot output → Headroom cuts to 0.5.
  • App writes channelVolumes = 1.0 back over our cut.
  • Headroom detects the external change, marks the new value (1.0) as the ceiling, and stops actively writing.
  • Layer A becomes effectively inert for that stream — there is no ping-pong, the user just doesn't get the per-app cut they were hoping for. The bus-level Layer C limiter (if engaged) still enforces the absolute output ceiling regardless.

Explicit pattern detection and rate-limiting of ceiling updates (e.g., "ignore ceiling-restoring writes that arrive within N seconds of our own writes") is deferred to v1, pending evidence from real-world testing that any modern app warrants it. The graceful degradation property is the v0 contract.

4.5 Reaction-time honesty

The signal-path latency is zero. The reaction latency to a spike is bounded by:

spike in block N ─► analysis (same quantum)
                ─► rtrb push (ns)
                ─► controller computes (μs)
                ─► Props write to pw main loop
                ─► applied to block N+1 of the app stream

So sustained loud passages are attenuated within roughly one quantum (5–20 ms, depending on the system's quantum). Isolated one-block transients still leak through: the first block carrying the spike plays with the old gain; subsequent blocks see the reduction. This is the irreducible cost of "no lookahead allowed." For absolute spike prevention you need lookahead, which means latency, which contradicts the constraint of this layer.
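The quantum duration that bounds this reaction time is just frames divided by sample rate. A tiny helper (hypothetical name) makes the range concrete:

```rust
/// Duration of one graph quantum in milliseconds.
fn quantum_ms(frames: u32, rate_hz: u32) -> f64 {
    frames as f64 / rate_hz as f64 * 1000.0
}
```

At 48 kHz, a 256-frame quantum lasts about 5.3 ms and a 1024-frame quantum about 21.3 ms, so the first block of a spike plays unattenuated for roughly that long.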

On the processed route the bus-level Layer C limiter (§3.1) catches anything that would exceed the ceiling regardless of whether Layer A has caught up; on bypass routes Layer A is the only thing watching, so isolated one-block transients reach the real sink. Layer A reduces workload on Layer C where Layer C is in the path, and is a best-effort comfort filter where it isn't; it doesn't replace the limiter.

4.6 Layered budget summary

| Layer | Metric | Time scale | Signal-path latency added |
|---|---|---|---|
| A: per-app peak | sample peak per block | tens of ms | 0 |
| A: per-app RMS | block mean-square | seconds | 0 |
| C: inline soft tier | true-peak, lookahead | sub-ms | shared with hard tier |
| C: inline hard tier | true-peak, lookahead | sub-ms | ~2 ms lookahead |
| C: bus AGC | LUFS (ebur128) | many seconds | none (control plane only) |

Five distinct jobs, five distinct time scales, no two layers duplicate each other. Layer A is the cheapest line of defense and the only one that costs zero latency on the audio path.

4.7 Resource budget per stream

| Resource | No TRUE_PEAK (recommended for Layer A) |
|---|---|
| Audio thread per quantum | ~10 μs (peak + RMS pass) |
| Daemon thread per measurement | ~few μs (HashMap lookup + envelope math) |
| Memory per controller | ~100 bytes |
| Memory per ebur128 (if enabled) | N/A; Layer A doesn't use ebur128 |

At realistic stream counts (2–5 managed apps): <0.5% CPU total, <1 KB RAM total. Doesn't move the needle.

4.8 Lifecycle

  • Stream appears with media.class = Stream/Output/Audio matching a [[per_app.rules]] pattern: create tap link (pw_link_create), spawn controller, register rtrb.
  • Stream disappears (pw_registry::global_removed): tear down tap, drop controller, clean up rtrb.
  • App restarts: new node_id → fresh controller. User-volume deference state is per-stream-instance, which is the right default.

5. PipeWire integration

5.1 Sinks

Two creation paths were considered. The first emits a pipewire.conf.d fragment into $XDG_CONFIG_HOME/pipewire/pipewire.conf.d/headroom.conf on daemon startup (if not already present) and reloads. The second creates the sink at runtime via pw-loopback equivalents using pipewire-rs. v0 ships with the runtime-creation path so the install footprint is "one binary, one unit file."

Sink properties:

  • headroom-processed: node.name=headroom-processed, media.class=Audio/Sink, audio.position=[FL,FR], node.description="Headroom (processed)". Promoted to system default on startup so new streams land in it by default.

There is no second sink. Bypassed streams are routed directly to the current preferred_real_sink via target.object metadata writes (see §5.3).

5.2 The filter

One pw_filter node (headroom-filter), wrapped by the in-house pipewire-filter workspace crate, with four mono DSP ports — the canonical shape module-filter-chain uses:

  • input_FL / input_FR: Direction::Input, format.dsp = "32 bit float mono audio", audio.channel = FL|FR. The routing engine links these to the corresponding monitor ports on headroom-processed.
  • output_FL / output_FR: Direction::Output, same format properties. The routing engine links these to the corresponding input ports on preferred_real_sink.

Single process callback per quantum: dequeue all four mono buffers, run AGC gain → compressor → limiter on the (in_l[i], in_r[i]) pair, write (out_l[i], out_r[i]). Queue all four buffers back via Buffer::Drop. Allocation-free; guarded by assert_no_alloc in debug. Parameter updates arrive over an rtrb SPSC queue from the control thread.

Routing. WirePlumber's policy does not auto-link pw_filter nodes (the pw_filter API has no AUTOCONNECT flag and WP has no default linking heuristic for hybrid input+output nodes). The routing engine therefore wires the filter explicitly: try_capture_filter_playback matches the filter's registry global by node.name, then enqueues two routes through the existing pending_routes machinery: one with source = processed and target = filter for the input legs, and one with source = filter and target = real_sink for the output legs. The pair_count >= 2 ordinal pairing in apply_pending_routes (FL→FL, FR→FR) is exactly the per-channel mono structure above.

The filter is resolved as a routing target via resolve_routing_target / is_routing_target helpers that check filter_playback_id ahead of sinks_by_name — the filter is not registered as a fake Audio/Sink, so the map stays genuinely sink-only.

Rebuild on rate change. When the real sink's negotiated rate changes (PwCommand::RebuildFilter), the routing engine clears filter_playback_id before dropping the old filter so the new filter's registry global is recaptured even if its global_add races ahead of the old global_remove.

5.3 Routing

  • On startup, write default.audio.sink in the default metadata to point at headroom-processed so new streams default to the processor. The previous value (the user's hardware sink) is captured as the initial preferred_real_sink.
  • Subscribe to pw_registry global-added events.
  • On any new node with media.class == "Stream/Output/Audio" and node.dont-move != true:
    • Read application.process.binary, application.name, pipewire.access.portal.app_id, media.role.
    • Evaluate routing rules from the active profile to decide processed vs. bypass.
    • Write target.object into the default metadata for the new stream:
      • processed → headroom-processed's object.serial.
      • bypass → preferred_real_sink's object.serial. WirePlumber honours this for any movable stream.
  • Watch default.audio.sink metadata changes. When the user switches the system default to a hardware sink, the daemon:
    • records that sink as the new preferred_real_sink,
    • re-links headroom-filter's playback stream to it,
    • rewrites target.object for every currently-bypassed stream so they follow the new hardware,
    • re-asserts headroom-processed as the default for new streams (so subsequent app launches still land in the processor).
  • Hotplug (sink appears/disappears) goes through the same code path.
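
The per-stream decision above can be sketched as a pure function. Everything here is illustrative: the type names, exact string matching on the binary name, and first-match-wins ordering are assumptions, not the behaviour of routing::evaluate.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Route {
    Processed,
    Bypass,
}

/// One [[rules]] entry, reduced to the process_binary key.
pub struct Rule {
    pub process_binary: Vec<String>,
    pub route: Route,
}

/// The subset of stream properties this sketch looks at.
pub struct StreamInfo<'a> {
    /// application.process.binary, if the client set it.
    pub process_binary: Option<&'a str>,
    /// node.dont-move == true
    pub dont_move: bool,
}

/// Returns None when the stream must be left alone (node.dont-move),
/// otherwise the route of the first matching rule, falling back to
/// the profile's default route.
pub fn evaluate(
    rules: &[Rule],
    default_route: Route,
    stream: &StreamInfo<'_>,
) -> Option<Route> {
    if stream.dont_move {
        return None;
    }
    let matched = stream.process_binary.and_then(|bin| {
        rules
            .iter()
            .find(|r| r.process_binary.iter().any(|b| b.as_str() == bin))
    });
    Some(matched.map(|r| r.route).unwrap_or(default_route))
}
```

The returned route then selects which object.serial is written as target.object: headroom-processed's for Processed, preferred_real_sink's for Bypass.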

5.4 Stream identification

| Property | Reliability | Use |
|---|---|---|
| application.process.binary | high (kernel-sourced) | primary key |
| application.name | medium | secondary / display |
| pipewire.access.portal.app_id | high (Flatpak only) | match sandboxed apps |
| media.role | low (most apps omit) | bonus signal only |
| media.class | structural | gate to playback streams |

6. Profiles

Profile files live in $XDG_CONFIG_HOME/headroom/profiles/*.toml, shadowing shipped defaults in /usr/share/headroom/profiles/ by name. Profile files are user-authored configuration — they're the thing you open in $EDITOR. File-watcher hot-reload via notify-debouncer-mini is planned; in the meantime profile.reload re-scans on demand.

Daemon-managed user state — active profile name, per-app route overrides made via route.set, dotted-key tweaks made via setting.set, the global bypass flag — is not mixed in with the profile TOMLs. It lives in a single overlay.toml at $XDG_STATE_HOME/headroom/overlay.toml, written atomically by the daemon (stage to overlay.toml.tmp-…, then rename). The overlay rides on top of whichever profile is active, so route.set obs bypass persists across profile.use night — that's a user preference, not a tweak of default. If the overlay names an active profile that's not on disk, the daemon falls back to the built-in default and surfaces a warning; it does not refuse to start.
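
A minimal sketch of the stage-then-rename write, using only the standard library. The temp-file suffix (the process id) is an assumption made for illustration, since the plan elides the real suffix, and `write_atomic` is a hypothetical name.

```rust
use std::fs::{self, File};
use std::io::{self, Write};
use std::path::Path;

/// Write `contents` to `target` so readers see either the old file or the
/// new one, never a half-written mix. Relies on rename() being atomic
/// within one filesystem, which holds because the temp file is a sibling.
pub fn write_atomic(target: &Path, contents: &[u8]) -> io::Result<()> {
    let file_name = target.file_name().ok_or_else(|| {
        io::Error::new(io::ErrorKind::InvalidInput, "target has no file name")
    })?;

    // Hypothetical suffix: "<name>.tmp-<pid>".
    let mut tmp_name = file_name.to_os_string();
    tmp_name.push(format!(".tmp-{}", std::process::id()));
    let tmp = target.with_file_name(tmp_name);

    // Stage: write everything and flush it to disk before the rename.
    let staged = File::create(&tmp).and_then(|mut f| {
        f.write_all(contents)?;
        f.sync_all()
    });

    // Publish: rename over the target; clean up the temp file on failure.
    if let Err(e) = staged.and_then(|()| fs::rename(&tmp, target)) {
        let _ = fs::remove_file(&tmp);
        return Err(e);
    }
    Ok(())
}
```

A crash between the two steps leaves a stray temp file but never a truncated overlay.toml, which is why the daemon can always fall back to the last good overlay.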

Each profile is a complete listening scenario. Schema (headroom-core::profile):

name = "default"
description = "Gentle transparent processing for everyday use."

[agc]
enabled = true
target_lufs = -18.0       # ITU-R BS.1770 integrated target
attack_ms = 2000.0
release_ms = 800.0
silence_threshold_lufs = -70.0
max_boost_db = 12.0
max_cut_db = 12.0

[compressor]
enabled = true
detector = "peak"          # "peak" | "rms"
threshold_db = -24.0
ratio = 2.5
knee_db = 6.0
attack_ms = 10.0
release_ms = 100.0
makeup_db = "auto"         # number or "auto"

[limiter]
ceiling_dbtp = -0.1
lookahead_ms = 2.0
release_ms = 80.0
hold_ms = 5.0
oversample = 4             # 1 | 2 | 4 | 8 (1 disables ISP detection)
link = "stereo"            # "stereo" | "dual-mono"

[meters]
publish_hz = 20.0

[[rules]]
match = { process_binary = ["spotify", "mpv", "ardour", "reaper", "qpwgraph"] }
route = "bypass"

[[rules]]
match = { process_binary = ["firefox", "chromium", "google-chrome", "Discord", "discord", "element-desktop", "Slack", "zoom", "WEBRTC VoiceEngine"] }
route = "processed"

[default_route]
route = "processed"        # safe default: anything unmatched is processed

# ----------------------------------------------------------------------
# Per-application level control (Layer A). Orthogonal to routing — you
# can enable per-app on bypass-routed streams to get zero-latency
# level control (e.g. tame Discord notifications without touching
# the game's audio path).
# ----------------------------------------------------------------------
[per_app]
enabled = true                # master switch; false disables Layer A entirely
default_enabled = false       # for streams not matched by any rule below

# Per-rule knobs. Matches use the same key set as [[rules]] above.
[[per_app.rules]]
match = { process_binary = ["Discord", "discord", "element-desktop", "Slack", "zoom"] }
enabled = true
peak_threshold_db = -6.0      # short-window peak above this triggers cut
rms_target_db = -20.0         # long-term RMS target (slow path)
max_cut_db = 12.0             # never cut more than this
peak_attack_ms = 5.0
peak_release_ms = 500.0
rms_window_ms = 1500.0
# Controller-side knobs (all optional; defaults shown).
smoother_ms = 30.0            # anti-bounce smoother on max(peak,rms)
write_db_threshold = 0.5      # dB diff below which we don't fire a write
min_write_interval_ms = 100.0 # min ms between writes per stream (10 Hz cap)
defer_to_user = "ceiling"     # "ceiling" | "strict"

[[per_app.rules]]
match = { process_binary = ["firefox", "chromium", "google-chrome"] }
enabled = true
peak_threshold_db = -3.0      # browsers run hotter; raise the trigger
rms_target_db = -18.0

# Music, DAWs, games default to per-app off — they're either trusted
# to set their own level or routed bypass for a reason.
[[per_app.rules]]
match = { process_binary = ["spotify", "mpv", "ardour", "reaper", "qpwgraph", "carla"] }
enabled = false

Shipped profiles

| name | one-liner |
|---|---|
| default | Gentle transparent processing, sensible for daily use. |
| night | Aggressive: -20 LUFS, 4:1, fast release, narrow dynamic range. |
| speech | VoIP-focused; short attack, fast release, controlled dynamic range. |
| transparent | Limiter only. Compressor + AGC bypassed. Safety net only. |
| bypass-all | Routes everything directly to the real sink. The kill switch. |
| spike-protection | Minimal processing; high-threshold catch only. Untouched audio, hard guard against blasts. |
| movie | Wide-DR film: lifts dialogue, keeps action punchy but bounded. |
| music | Inter-track loudness leveling; routes music players through the bus. |
| podcast | Spoken-word playback: even narration loudness, smooth and unfatiguing. |
| commute | Listening in noise: heavy normalization + boost, kept loud. |
| gaming | Latency-first: games bypass, voice chat processed, notifications tamed per-app. |
| party | Loud room playback (anti-night): maximum loudness, dynamics sacrificed. |
| broadcast-14 | Normalizes everything to -14 LUFS (streaming loudness) so sources match. |
| quiet-hours | More aggressive than night: very low ceiling, near-flat dynamics. |

The limiter section of bypass-all is irrelevant in practice (nothing flows through headroom-processed), but its ceiling field is still respected as a fail-safe in case a stream lands on the processed sink anyway.


7. IPC

Transport: Unix-domain socket, SOCK_STREAM, 0600, at $XDG_RUNTIME_DIR/headroom/control.sock.

Wire protocol: see IPC.md for the full normative schema. Summary: u32 BE length prefix + UTF-8 JSON payload. Three message shapes: Request (id + op + args), Response (id + result|error), Event (topic + data). Subscribers signal interest by topic; events fan out to all subscribers with bounded per-subscriber queues. Slow subscribers have events dropped; the count of dropped events is itself published on the daemon topic so clients know they fell behind.
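
For readers writing an alternative client, the framing summarised above can be sketched with the standard library alone. IPC.md remains the normative reference; the `MAX_FRAME` cap and the function names are assumptions for illustration, not part of the schema.

```rust
use std::io::{self, Read, Write};

/// Hypothetical sanity cap so a corrupt length prefix cannot trigger a
/// huge allocation. IPC.md may specify a different limit, or none.
pub const MAX_FRAME: u32 = 1 << 20;

/// Write one frame: 4-byte big-endian length, then the UTF-8 JSON bytes.
pub fn write_frame<W: Write>(w: &mut W, json: &str) -> io::Result<()> {
    if json.len() > MAX_FRAME as usize {
        return Err(io::Error::new(io::ErrorKind::InvalidInput, "payload too large"));
    }
    let len = json.len() as u32;
    w.write_all(&len.to_be_bytes())?;
    w.write_all(json.as_bytes())
}

/// Read one frame and return its payload as a String.
pub fn read_frame<R: Read>(r: &mut R) -> io::Result<String> {
    let mut header = [0u8; 4];
    r.read_exact(&mut header)?;
    let len = u32::from_be_bytes(header);
    if len > MAX_FRAME {
        return Err(io::Error::new(io::ErrorKind::InvalidData, "frame too large"));
    }
    let mut payload = vec![0u8; len as usize];
    r.read_exact(&mut payload)?;
    String::from_utf8(payload).map_err(|e| io::Error::new(io::ErrorKind::InvalidData, e))
}
```

Both functions are generic over Read / Write, so the same code works against a std::os::unix::net::UnixStream connected to control.sock or against an in-memory buffer in tests.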

The first-party Rust wrapper is headroom-client, mirroring how niri-ipc wraps Niri's socket: a thin, no-magic crate that re-exports the wire types from headroom-ipc and adds a blocking Client (and an optional async AsyncClient behind a feature flag).


8. CLI

headroom status                              # current profile, sinks, levels
headroom daemon                              # run the daemon (systemd Type=simple)
headroom profile list | use <name> | show [name]
headroom route list
headroom route set   <app> processed|bypass  # persists in overlay.toml
headroom route unset <app>
headroom route stream <node-id> processed|bypass    # ad-hoc
headroom set <key> <value>                   # tweak active profile in place
headroom get <key>
headroom bypass on|off                       # global kill switch
headroom reload                              # reload profiles from disk
headroom monitor                             # live meter TUI (uses subscribe)

CLI is sync, blocks on UnixStream. Talks the same JSON wire as any other client.


9. Crates

headroom/
├── flake.nix                  # devshell + package
├── Cargo.toml                 # workspace
├── PLAN.md                    # this file
├── IPC.md                     # wire-protocol schema (normative)
├── README.md
└── crates/
    ├── headroom-dsp/            # AGC + compressor + limiter (pure DSP, no PW)
    ├── headroom-ipc/            # wire types, framing, serde; no I/O
    ├── headroom-client/         # blocking client (+ optional async); thin
    ├── headroom-core/           # daemon: PW integration, routing, profiles, IPC server
    └── headroom-cli/            # `headroom` binary; depends on headroom-client

External crates (final v0 dep list)

Audio / DSP

  • pipewire, libspa — official PipeWire bindings.
  • ebur128 — measurement.
  • rtrb — SPSC ring buffer (audio ↔ control).
  • basedrop — RT-safe shared ownership.
  • assert_no_alloc — debug-build tripwire.

Plumbing

  • serde, serde_json — IPC + profile (de)serialization.
  • serde-toml (toml) — profile files.
  • clap (derive) — CLI.
  • tracing, tracing-subscriber, tracing-journald — logs.
  • notify, notify-debouncer-mini — profile hot-reload.
  • crossbeam-channel — control-plane channels.
  • parking_lot — mutexes.
  • signal-hook — clean shutdown.
  • thiserror — error types.

No tokio, no zbus, no dbus-*.


10. Nix

flake.nix ships:

  • A devshell with the Rust toolchain from nixpkgs, pkg-config, pipewire's dev outputs, clang (for bindgen if invoked by deps), socat (handy for poking the IPC), jq.
  • A package output (packages.<system>.default) that builds the daemon + CLI with rustPlatform.buildRustPackage. v0 uses cargoLock.lockFile. Crane can come later if incremental builds in CI become a bottleneck.
  • A nixosModules.default placeholder so packagers can wire the user unit later. Not implemented in v0 of the flake itself.

Intermediate dev work uses plain cargo inside nix develop. Final builds and any CI go through nix build.


11. Phased implementation

The phases are roughly token-of-work units, not calendar weeks. All planned phases (0–8) are done as of 2026-05-21; this section is preserved as historical context + a reading guide to the commit log. See headroom-project in team memory for the per-commit ledger.

Phase 0 — scaffolding. Flake, workspace, crate skeletons, README, PLAN/IPC docs. (done as part of this commit)

Phase 1 — IPC + client. headroom-ipc (types, framing, codec) and headroom-client (blocking Client) implemented against the schema in IPC.md. Round-trip tests, fuzz the codec. (this commit)

Phase 2: DSP kernels. headroom-dsp with limiter, compressor, AGC, oversampler, envelope. Tested in isolation against synthesized signals; limiter validated to hold a -0.1 dBTP ceiling on EBU TECH 3341 generators. (this commit: limiter first)

Phase 3 — daemon core. headroom-core brings up the headroom-processed virtual sink, the bus filter (originally a pw_stream pair + SPSC ring; rewritten to a single pw_filter node in 2026-05-22 — see PW gotchas #14, #17, #18 and the pipewire-filter workspace crate), the preferred_real_sink tracker, the registry subscriber, and the routing engine. Hardcoded profile, no IPC server yet.

Phase 4 — IPC server + profile manager. Wire headroom-core to the IPC schema. Profile loading + hot-reload. Slow AGC loop ticking on real loudness measurements.

Sub-stages used in commits / TODOs:

  • 4a–4d: Unix socket server, op dispatch, mutating ops, event broadcaster.

  • 4e: ProfileStore. Shipped + user profiles, atomic reload, user overlay at $XDG_STATE_HOME/headroom/overlay.toml. profile.use, profile.reload, setting.set, route.set all dispatch through it.

  • 4f — DSP parameter propagation: setting.set reaches the running filter via the rtrb control queue, so live profile/setting edits take effect without restart.

  • 4h: preferred_real_sink tracking. Subscribe to default.audio.sink, snapshot the prior default, promote headroom-processed, retarget every bypassed stream on default-sink change, on hotplug, and on Bluetooth handoff. Also pins the filter's playback to the tracked real sink so processed audio follows when the user switches default, and resolves the real sink's node id from the registry for status reporting.

  • 4i: route.stream <node-id> processed|bypass, an ad-hoc per-stream override that doesn't write a profile rule. Crosses the IPC-thread → PipeWire-thread boundary via a crossbeam channel drained by a 50 ms timer source on the main loop. State updates synchronously; metadata write follows ≤ ~50 ms later.

  • Slow AGC loop (wraps up Phase 4). Audio-thread AgcGain stage sits at the head of the DSP chain (anti-zipper smoother around a per-sample multiplier). Filter pushes pre-AGC input samples into a dedicated measurement ring. An AgcController on the PipeWire main loop ticks at 50 ms: drains the ring into ebur128 (Mode S | M | TRUE_PEAK), reads [agc] config from the active profile, computes target_lufs - short_term_lufs clamped to [-max_cut_db, +max_boost_db], gates below silence_threshold_lufs, slow-smooths via leaky integrator, and pushes the result through FilterControl on the same rtrb channel setting.set uses.
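
The gain computation in the slow AGC loop can be sketched as a small state machine. The names below are hypothetical, as are two behaviours the plan does not spell out: the gate is modelled as "hold the current gain", and the leaky integrator uses a single coefficient `alpha` rather than separate attack and release constants.

```rust
/// Subset of the [agc] profile section used by this sketch.
pub struct AgcConfig {
    pub target_lufs: f32,
    pub silence_threshold_lufs: f32,
    pub max_boost_db: f32,
    pub max_cut_db: f32,
    /// Hypothetical leaky-integrator coefficient in (0, 1]; for a tick
    /// period dt and time constant tau, alpha = 1 - exp(-dt / tau).
    pub alpha: f32,
}

/// Smoothed AGC gain in dB, updated once per controller tick.
pub struct AgcSmoother {
    pub gain_db: f32,
}

impl AgcSmoother {
    /// Feed one short-term loudness reading; returns the gain to push.
    pub fn tick(&mut self, cfg: &AgcConfig, short_term_lufs: f32) -> f32 {
        // Gate: on silence (or a non-finite reading) hold the current
        // gain instead of chasing the noise floor. Assumed behaviour.
        if !short_term_lufs.is_finite() || short_term_lufs < cfg.silence_threshold_lufs {
            return self.gain_db;
        }
        // Wanted correction, clamped to [-max_cut_db, +max_boost_db].
        let wanted = (cfg.target_lufs - short_term_lufs)
            .clamp(-cfg.max_cut_db, cfg.max_boost_db);
        // Leaky integrator: move a fraction of the way to the target.
        self.gain_db += cfg.alpha * (wanted - self.gain_db);
        self.gain_db
    }
}

/// Convert a dB gain to the linear multiplier the audio thread applies.
pub fn db_to_linear(db: f32) -> f32 {
    10f32.powf(db / 20.0)
}
```

With the default profile's target of -18 LUFS and 12 dB limits, a -30 LUFS reading asks for the full +12 dB, and the smoother approaches it over successive 50 ms ticks.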

Tracked follow-ups (carried past their sub-stage)

Items deliberately deferred from earlier sub-stages so they don't get lost. Pick up by name when the trigger that gates them fires.

  • Ephemeral overlay mutations. (4e follow-up.) All route.set / setting.set changes are persisted to overlay.toml. A --ephemeral flag (or --volatile) on the CLI for one-shot tweaks that don't outlive the daemon was considered and dropped from v0 for simplicity. Revisit if real users ask for it; the store-level change is a flag on the setter methods. Dormant — no user has asked through Phase 8.
  • Filter rate matching to the real sink. (F5 follow-up.) §3.1 documents the contract leak when the real sink runs at a non-48 kHz native rate. Closing it requires dynamic FILTER_SAMPLE_RATE, kernel rebuild on real-sink change (compressor + limiter coefficients are rate-dependent), and Layer A's LAYER_A_BLOCK_DT_S constant becoming dynamic too. Gated on a multi-rate hardware test bench — no point shipping the refactor without something to validate it against. v1 scope.
  • Bus filter is two pw_streams + an SPSC ring → per-quantum tremolo on shared-driver topologies. Closed 2026-05-22 by rewrite to a single pw_filter node (new in-house pipewire-filter workspace crate holding the unsafe FFI; one process callback with input→DSP→output ordering by construction; capture↔playback ring deleted entirely). Surfaced on first soak that WP doesn't auto-link pw_filter, so the filter was restructured to 4 mono ports (canonical module-filter-chain shape) and the routing engine extended to wire it explicitly via link-factory. See §5.2 above and pipewire-gotchas #14/#17/#18.
  • Filter playback BUSY spikes (periodic, ~10 s cadence). Closed in 8e (d52cd6d). The instrumentation added by 8e did not reproduce the ~8×-baseline outlier pattern in a ~3 min release-build capture; steady state was ~2.2 ms / call at this hardware's quantum with max growing only to 1.3× baseline. PlaybackTiming stays so future regressions surface at WARN. Original observation may have been a transient WP/PW housekeeping artefact under a different config; no actionable code change.
  • Sub-millisecond dispatch primitive for spike-reactive writes. (Phase 6 optimisation, downgraded from prerequisite.) The 4i PwCommand channel uses a 50 ms polling timer, fine for route.stream and slow AGC. Layer A's per-app Props.channelVolumes writes were originally feared to need a sub-ms wake primitive. After 6a/6b benches landed (see §11.6 below) we re-evaluated: at a 5 ms polling timer and 21 ms PipeWire quantum, the worst-case detection-to-write latency stays well inside one quantum, which is what PLAN §4.5 actually promises. Polling reuses existing infrastructure and is cheap (controller tick is ~30 ns; even at 200 Hz it's lost in the noise). The tighter primitive — EventSource::signal with an unsafe impl Send shim around spa_loop_utils.signal_event, or a pipe + IoSource — stays on the table as an optimisation if manual testing shows audible spike-leak artefacts. pw::command module docs still carry the constraint warning for future variants that might be tempted to share the 50 ms timer.

Phase 5 — CLI + monitor TUI. headroom-cli implements all the subcommands above, plus a monitor TUI built on the meters subscription.

Phase 6 — Per-application level control (Layer A). Per-managed-stream tap creation, AppLevelController with peak + RMS envelopes, Props.channelVolumes writer, user-volume deference logic, [per_app] profile parsing, headroom per-app … CLI verbs, and a per-stream meter event on the IPC. Land after the bus path is stable so we have a baseline to compare against.

Sub-stages:

  • 6a: Pure DSP. headroom_dsp::LevelEnvelopes is a two-tier (peak + RMS) block-rate detector, max(peak_reduction, rms_reduction) combined, clamped to max_cut_db. Allocation-free, block-rate-driven (audio thread emits one (peak, mean_sq) pair per quantum).
  • 6b — Daemon-side glue. headroom_core::app_level::AppLevelController: rule snapshot, envelopes, 30 ms anti-bounce smoother, 0.5 dB / 100 ms write gate, ceiling vs strict deference state. app_level::evaluate matches [[per_app.rules]] against PwNodeInfo using the same matcher the routing engine uses.
  • 6c — PipeWire tap + audio-thread analysis. Mechanism: per managed stream we create our own pw_stream (Direction::Input, F32LE stereo, rate left unspecified to negotiate with the source, AUTOCONNECT off, NODE_DONT_RECONNECT, node.dont-move), connect() with no target, set_active(true). PipeWire creates our input ports from the declared format. We then build explicit passive port-level links via link-factory with link.output.port / link.input.port set to the source's and tap's port global IDs respectively, plus link.passive = true. Why not target.object or target_id: empirically (6c manual smoke) WirePlumber's policy refuses to wire Stream/Output → Stream/Input via any session-manager-mediated path — it logs no error, just doesn't act. The stream-level target was getting set on the node (node.target = <source-id>) but no link ever appeared. Going through link-factory with explicit port IDs bypasses the session manager entirely and uses PipeWire core directly. Per managed stream: one pw_stream, two Link proxies (one per channel), one MeasurementSample rtrb (capacity 64). Audio-thread process runs peak = max(|x|) and mean_sq = Σx²/N over the block, pushes one sample to the ring. Lifecycle: registry watcher sees a Stream/Output/Audio matching a per_app rule → spawn tap (ports come up asynchronously) → the Layer A drain timer (6d) retries link creation each tick until both port sets are visible on the registry → links built, stream transitions to Streaming, samples flow. On registry global_remove of the source, drop the ManagedStream; declaration order severs links first, then the tap stream + listener.
  • 6d: Props.channelVolumes writes + controller drain timer. A polling timer source on the PipeWire main loop ticks every 5 ms (200 Hz, CPU cost ≪ 0.1% of one core per the benches), iterates active controllers, drains each measurement ring, calls process_block, and on a Some return writes Props.channelVolumes via the bound default metadata (subject = source node id). The 5 ms tick guarantees a spike detected at quantum boundary N is written before quantum N+1 starts on typical 21 ms quanta; see the §4.5 reaction-time honesty table.
  • 6e — User-volume deference + per-stream meter events. Subscribe to Props param-change events on each managed stream. Distinguish daemon writes from external by comparing against last_written_lin (within 1e-4) — external changes apply ceiling-mode or strict-mode deference per the matched rule's defer_to_user field. Per-stream meters publish on the meters topic with the smoothed reduction, the peak/RMS envelope values, and the current applied channelVolumes.
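
Two pieces of the Layer A pipeline are small enough to sketch directly: the per-block measurement from 6c and the write gate from 6b. The names `block_stats` and `WriteGate` are hypothetical, and the gate's exact comparison rules (at least 0.5 dB moved and at least 100 ms since the last write) are an assumption about how the two thresholds combine.

```rust
use std::time::{Duration, Instant};

/// Audio-thread measurement for one block:
/// peak = max(|x|), mean_sq = sum(x^2) / N.
pub fn block_stats(block: &[f32]) -> (f32, f32) {
    if block.is_empty() {
        return (0.0, 0.0);
    }
    let mut peak = 0.0f32;
    let mut sum_sq = 0.0f32;
    for &x in block {
        peak = peak.max(x.abs());
        sum_sq += x * x;
    }
    (peak, sum_sq / block.len() as f32)
}

/// Daemon-side gate that keeps volume writes from flapping.
pub struct WriteGate {
    db_threshold: f32,
    min_interval: Duration,
    last_db: f32,
    last_at: Option<Instant>,
}

impl WriteGate {
    pub fn new(db_threshold: f32, min_interval: Duration) -> Self {
        Self { db_threshold, min_interval, last_db: 0.0, last_at: None }
    }

    /// True when `reduction_db` differs enough from the last written
    /// value AND enough time has passed; records the write if so.
    pub fn should_write(&mut self, reduction_db: f32, now: Instant) -> bool {
        let moved = (reduction_db - self.last_db).abs() >= self.db_threshold;
        let rested = self
            .last_at
            .map_or(true, |t| now.duration_since(t) >= self.min_interval);
        if moved && rested {
            self.last_db = reduction_db;
            self.last_at = Some(now);
            true
        } else {
            false
        }
    }
}
```

Passing `now` in, rather than calling Instant::now() inside the gate, keeps the logic deterministic under test and lets the 5 ms drain timer share one timestamp across all controllers.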

Validated cost budget (criterion microbenches, run 2026-05). PLAN §4.7 budgeted "~10 μs/quantum audio thread, few μs/measurement daemon thread." Reality on this hardware:

| Bench | Time |
|---|---|
| Audio-thread peak + mean_sq scan, 1024-frame stereo block | 1.33 μs |
| LevelEnvelopes::process_block (daemon) | 18 ns |
| AppLevelController::process_block hot signal | 29 ns |
| AppLevelController::process_block quiet signal | 22 ns |

5 managed streams: audio thread ≈ 6.6 μs/quantum (0.03% of one core at 21 ms quanta); daemon ≈ 145 ns/quantum. ~7-10× under the PLAN budget, so the design has room for many more managed streams, or for adding ebur128 / TRUE_PEAK to Layer A later if useful.

Manual latency validation (post-6c implementation). PipeWire scheduling can't be benched from Rust alone. Use:

  • pw-top — note the source-node QUANT and any WAIT/BUSY or delay column before attaching the tap; attach Layer A; confirm the source-node numbers don't change. The tap appears as a new row with its own quantum; the test is whether the app's numbers degrade.
  • qpwgraph / helvum — visually confirm the source node has two outgoing links (one to its original destination, one to our tap), both terminating correctly.
  • Ear — connect/disconnect the tap on live audio. Crackles or dropouts on attach indicate the §4.1 sibling-fanout claim doesn't hold and the design needs revisiting.

If those three say "fine," the §4.1 promise is upheld in practice and 6c is acceptance-tested. jack_iodelay and other true-round-trip tools are overkill.

Phase 7 — Packaging. Done — c65c75b. contrib/systemd/headroom.service (user-scope, Type=simple, After=pipewire.service, Restart=on-failure, journald, LimitRTPRIO=20). The package's postInstall substitutes the unit's @bindir@ placeholder with an absolute store path and copies profiles/*.toml to share/headroom/profiles/. Two Nix modules: nixosModules.default (programs.headroom.enable — binary on global PATH + systemd.packages for systemctl --user discovery + hard assertion on services.pipewire.enable) and homeModules.default (services.headroom.enable — symlinks shipped profiles into $XDG_CONFIG_HOME/headroom/profiles/, extraProfiles attrset for per-user overrides, writes the systemd user unit). README rewritten with install + usage sections.

Phase 8 — Hardening. Done — 9220143 + d52cd6d + verification.

  • 8a — assert_no_alloc on audio-thread callbacks (9220143). #[global_allocator] AllocDisabler in headroom-cli/src/main.rs behind cfg(debug_assertions) (release strips it via the crate's default disable_release). The three RT callbacks (capture_process, playback_process, tap_process) wrap their body in assert_no_alloc(|| inner(...)). Verified by a deliberate Vec::with_capacity injection → SIGABRT on first audio callback; reverted before commit. Audio thread proven alloc-free under multi-thousand-callback live load.
  • 8b — live profile-reload under signal flow (verification only). Edit $XDG_CONFIG_HOME/headroom/profiles/<active>.toml while a sine plays: notify-debouncer-mini fires, ProfileStore::reload runs, setting.set propagates via FilterControl's rtrb to the audio thread. Compressor GR went 0 → 9.3 dB ≈ 1 s after edit and back to 0 after restore; 180 meter ticks over 9 s with max inter-tick gap = exact 50.0 ms (the AGC period). No glitches.
  • 8c — sink hotplug / default-sink change (verification only). wpctl set-default <other-sink> while daemon runs: on_metadata_property fires, adopt_new_real_sink runs, filter.playback re-pinned via 4k explicit-link enforcement, routing/real_sink_changed emitted on the wire. Bounces back cleanly.
  • 8d — multi-rate hardware (partial / deferred). Filter is hardcoded F32 stereo @ 48 kHz; PipeWire's link layer inserts a resampler at the filter.playback → real-sink edge when rates differ; bus DSP stays at 48 kHz internally. Architecture is sound; real-hardware validation (USB DAC at 96k etc.) deferred until available.
  • 8e — playback callback timing instrumentation (d52cd6d). Lock-free PlaybackTiming atomics in meters.rs; AGC controller drains once per second and logs at WARN above SPIKE_THRESHOLD_US = 5000. The original ~10 s-cadence ~8× spike pattern from §11 follow-ups did not reproduce in a ~3 min release-build capture; steady state 2.2 ms / call at ~4 Hz, max climbed to only 1.3× baseline. Instrumentation kept so future regressions surface.
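
The lock-free timing pattern from 8e can be sketched with three atomics. The field names and the tuple returned by `drain` are hypothetical; only SPIKE_THRESHOLD_US = 5000 comes from the text above.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// WARN threshold for a single callback, in microseconds.
pub const SPIKE_THRESHOLD_US: u64 = 5000;

/// Callback timing shared between the audio thread (writer) and the
/// controller (reader). Only atomic read-modify-write ops, so the audio
/// thread never blocks or allocates.
pub struct PlaybackTiming {
    calls: AtomicU64,
    total_us: AtomicU64,
    max_us: AtomicU64,
}

impl PlaybackTiming {
    pub const fn new() -> Self {
        Self {
            calls: AtomicU64::new(0),
            total_us: AtomicU64::new(0),
            max_us: AtomicU64::new(0),
        }
    }

    /// Audio thread: record the duration of one callback.
    pub fn record(&self, elapsed_us: u64) {
        self.calls.fetch_add(1, Ordering::Relaxed);
        self.total_us.fetch_add(elapsed_us, Ordering::Relaxed);
        self.max_us.fetch_max(elapsed_us, Ordering::Relaxed);
    }

    /// Controller: take (calls, total_us, max_us) and reset to zero.
    /// The three swaps are not one atomic snapshot, so a callback racing
    /// the drain may be split across two windows. Fine for diagnostics.
    pub fn drain(&self) -> (u64, u64, u64) {
        (
            self.calls.swap(0, Ordering::Relaxed),
            self.total_us.swap(0, Ordering::Relaxed),
            self.max_us.swap(0, Ordering::Relaxed),
        )
    }
}

/// True when the worst callback in a window deserves a WARN line.
pub fn is_spike(max_us: u64) -> bool {
    max_us > SPIKE_THRESHOLD_US
}
```

A controller draining once per second would log the window's mean (total_us / calls) alongside the maximum; at the ~2.2 ms steady state reported above, `is_spike` stays false.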

12. Risks & open questions

These are the original v0 design risks — still useful as a checklist for new contributors. Phase 4k/4l/8c have exercised the routing / hotplug / default-sink branches; the bullets below are unchanged since several of them remain live concerns for non-NixOS distros and multi-rate hardware. See headroom-project in team memory for current status per risk.

  • WirePlumber re-linking on device hotplug. When a Bluetooth headset connects, WP re-evaluates linking. Headroom must re-pin its routed streams. Tractable; the registry events surface this.
  • Latency budget. Processed path: one quantum hop (the filter) plus lookahead (~2 ms) plus 4× oversampling buffering ≈ 8–15 ms added to processed-path latency. Fine for video/voice. Bypass path: zero added latency, because the stream rides the real sink directly.
  • Default-sink changes. When the user switches the system default to a hardware sink, the daemon adopts it as preferred_real_sink, re-links the filter's playback, retargets bypassed streams, and re-asserts headroom-processed as the default for new streams. Watching default.audio.sink in the metadata is the trigger.
  • Sample-rate mismatch. headroom-processed, the filter, and the real sink must agree, or PipeWire resamples behind our back. The filter should source its rate from the real sink and convert on the capture side only.
  • Surround content downmix vs. passthrough. v0 punts: anything >2ch is force-bypassed regardless of profile rule. The bus filter is F32 stereo by construction and pulling a 5.1+ stream into it would either drop the centre/LFE/surround channels (with explicit links pairing only the first two ports) or run our DSP on a downmix that wasn't asked for. The check fires in routing::evaluate based on PwNodeInfo.audio_channels (parsed from the stream's audio.channels property). The explicit-link pairing in apply_pending_routes was generalised from take(2) to take(min(src, dst)) so wide bypass to a wide real sink links all channels; narrower sinks let PipeWire's source-side adapter handle downmix as usual.
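
The surround handling in the last bullet comes down to two small rules, sketched below with hypothetical names. The sketch assumes port IDs arrive already sorted in channel order; how apply_pending_routes establishes that order is not shown here.

```rust
/// v0 rule: anything wider than stereo is force-bypassed, regardless of
/// what the profile rules say.
pub fn must_bypass(audio_channels: u32) -> bool {
    audio_channels > 2
}

/// Ordinal pairing of source ports to destination ports: first with
/// first, second with second, and so on, for min(src, dst) pairs.
pub fn pair_ports(src_ports: &[u32], dst_ports: &[u32]) -> Vec<(u32, u32)> {
    let n = src_ports.len().min(dst_ports.len());
    src_ports
        .iter()
        .copied()
        .zip(dst_ports.iter().copied())
        // zip already stops at the shorter side; the explicit take
        // mirrors the take(min(src, dst)) wording in the text.
        .take(n)
        .collect()
}
```

For a stereo stream this yields the familiar FL→FL, FR→FR pair; for a 5.1 stream bypassed to a 5.1 sink it yields six links instead of the two that the old take(2) would have produced.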

13. License

GPL-3.0-or-later for the daemon and CLI. headroom-dsp and headroom-ipc are MPL-2.0 so third-party clients and plugin hosts can link them without GPL contagion. (Re-evaluate when LSP-derived code is introduced; current plan does not pull any.)