AETHER audio pipeline: a runnable claim for ~9h 49m of shipped spoken-word

2026-05-19 · project aether

This is a lab notebook entry, not a marketing brief. Every claim is graded against the controlled evidence vocabulary¹; every empirical number is footnoted to a file in the working tree or to an ffprobe reading of the shipped artifact. The substrate under examination is Aether.Audio.* — an Elixir / Membrane Framework pipeline plus an operationally-equivalent bash harness — that has produced 15 shipped spoken-word programs across the Stax canon.

1. The claim

AETHER's audio pipeline takes a markdown script of a Stax canon program and emits a finished .opus + .mp3 pair plus four sidecar artifacts (.json chapter markers, .txt timecodes, .html player page, content .md) through a single substrate. As of 2026-05-19, the pipeline has produced 15 shipped programs: Edition I (Phonograph Object Lesson, 40:08)², Edition III (Anti-Edison Vol I LP, 1:27:59)³, Edition VI (Anti-Edison Vol II LP, 1:20:45)⁴, and 12 × monthly Stax Almanac (Jan–Dec, 6:20:47 total)⁵.

Total shipped audio: 9 h 49 m 39 s (35,378.98 s of .opus measured via ffprobe)⁶. Total narration words: ~95,600 across the 15 scripts⁷. Voice is en_US-libritts_r-medium (Piper TTS)⁸; ambient beds are sox-synthesized via one of 14 named recipes in Aether.Audio.AmbientSource⁹.
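The aggregate figure can be re-derived from the four component durations quoted in this entry. The sketch below is a minimal arithmetic check in Python; the component values are copied from the ffprobe footnotes, and the helper name `to_hms` is illustrative rather than part of the pipeline.

```python
# Component durations in seconds, as quoted in this entry's ffprobe footnotes.
COMPONENTS = {
    "edition-i": 2407.985125,
    "edition-iii": 5278.920417,
    "edition-vi": 4844.774583,
    "almanac-jan-dec": 22847.30,  # sum over the 12 monthly .opus files
}


def to_hms(seconds: float) -> str:
    """Format a duration as 'H h MM m SS s', rounded to the nearest second."""
    total = round(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    return f"{hours} h {minutes:02d} m {secs:02d} s"


total_seconds = sum(COMPONENTS.values())
print(f"{total_seconds:.2f} s = {to_hms(total_seconds)}")
# prints: 35378.98 s = 9 h 49 m 39 s
```

Summing raw seconds and rounding once at the end avoids the one-second drift that appears when already-rounded per-program display values are added together.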

<!-- runnable-claim: aether-audio-pipeline-shipped-programs -->

Evidence grade: compiled. The Elixir code path passes mix compile clean on Elixir 1.17 / OTP 27¹⁰; 38 / 39 tests in mix test pass (the one failure is an unrelated Phoenix page-controller assertion against stale boilerplate)¹¹. The substrate has no dedicated unit tests for Aether.Audio.* — integration evidence is the 15 produced programs and their ffprobe-verifiable durations. The shell-harness path is a separate implementation running the same toolchain; the two paths produce the same shape of output but have not been bit-compared. Do not read this entry as unit-tested or audited — those grades are not yet earned.

2. Why audio

The Stax canon has print-form essays — Mercantile Thesis, Anti-Edison arc, twelve-figure Almanac. Print and long-form audio reach different attention surfaces: a 6,000-word lineage essay sits in the research-paper surface, a 33-minute monthly Almanac sits in the commute / walk-hours podcast surface. Both surfaces are load-bearing.

The pipeline's job is to make audio shipment the same operational cost as essay shipment: one markdown script in, finished .opus + .mp3 + chapter markers + player page out. Without the pipeline, each program would be a bespoke production round — voice talent booking, studio time, ambient sourcing, manual mastering, manual chaptering, manual page wiring. The pipeline collapses that to "write the script, run the build."

This is the substrate move: the pipeline-shape is the artifact, the audio files are the by-product¹².

3. Architecture — two operationally-equivalent paths

The substrate has two paths that produce the same shape of output from the same input script:

Path 1 — Elixir / Membrane Framework (canonical). Top-level orchestrator Aether.Audio.LineageProgram (260 LOC) composes shell-out wrappers for Piper TTS (PiperShim), sox-synthesized ambient (AmbientSource), sox-backed mix (Mixer), ffmpeg loudnorm (LoudnessNormalize), and a real Membrane.Pipeline for the WAV → Opus encode leg (OpusEncodePipeline) with a WavStripHeader filter element bridging Membrane.File.Source to Membrane.FFmpeg.SWResample.Converter¹³. The Membrane element model is the production-correct factoring for the linear encoder DAG; the synthesis-and-mix leg shells out because no upstream-stable Membrane mixer or loudnorm element existed in the 1.0 plug-in set¹⁴. The wrapper factoring preserves the option to swap to a native Membrane implementation in v2 without changing the call surface.

Path 2 — bash + python3 harness (operational shortcut). Three shell scripts under ~/aether/audio-pipeline/: build-edition-iii.sh (Anti-Edison Vol I)¹⁵, build-edition-vi.sh (Vol II)¹⁶, and build-almanac-batch.sh (the 8-program May–December batch)¹⁷. The harness invokes the same Piper binary, the same sox filter chains (recipe arguments copied verbatim from AmbientSource's build_args/6 clauses), and the same ffmpeg command lines the Elixir path runs through System.cmd/3. The two paths are not bit-identical (Opus encoder non-determinism; the Vol I / Vol II shell scripts skip the explicit loudnorm pass that the Elixir path and the Almanac batch include); they are operationally equivalent at the format and duration level¹⁸.

The Elixir path is the documented canonical implementation. The shell-harness produced 14 of the 15 shipped files because for a one-shot long-form build it is faster to debug shell-stderr than a Phoenix-app-spawned Port.

4. The ten-phase build

The full pipeline runs ten phases per program:

1. Chapter split. Regex on `^# Chapter (\d+) — (.+)$`; the preamble before the first heading becomes chapter 0 (program intro). PiperShim.split_chapters/2 for Elixir; an inline Python heredoc in the shell harness¹⁹.
2. Piper TTS per chapter. `piper --model en_US-libritts_r-medium.onnx --output_file chapter-N.wav --sentence_silence 0.4 < chapter-N.txt` (22050 Hz mono). Mtime-cached²⁰.
3. Duration measurement. `soxi -D chapter-N.wav`; sum to total narration length, used to size the ambient bed²¹.
4. Narration concatenation. `sox chapter-0.wav … chapter-N.wav narration-full.wav`. For Editions III / VI, four 25-second interludes are interleaved at side breaks²².
5. Ambient bed synthesis. `sox -n … synth <dur+10> <recipe-args>` writes background.wav 10 seconds longer than the narration. Recipe selection is the load-bearing per-program parameter (§5).
6. Mix narration + ambient. `sox -m -v 1.0 narration-full.wav -v 0.13 background.wav mixed.wav trim 0 <narration-dur>`; the ambient bed sits ~18 dB below the narration²⁴.
7. Loudness normalization. `ffmpeg -af loudnorm=I=-16:TP=-1.5:LRA=11` writes normalized.wav at -16 LUFS integrated, -1.5 dBFS true-peak, the broadcast-podcast central target (Apple -16, Spotify -14, Amazon -14)²³.
8. Opus encode. `ffmpeg -c:a libopus -b:a 96k -application audio` with title / artist=Stax / album / date=2026 / genre="Spoken Word" metadata²⁵.
9. MP3 fallback encode. `ffmpeg -c:a libmp3lame -b:a 128k`, same metadata schema²⁶.
10. Sidecar generation. `python3 gen-sidecars.py` (690 LOC) reads a per-program registry and emits .json chapter markers, .txt timecodes, .html player page, and content .md²⁷.
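The chapter-split phase is the one step the entry describes concretely enough to sketch. The Python below is a minimal illustration of the `re.split` shape the shell harness is described as using; it is not the contents of the heredoc. The function name `split_chapters`, the dict layout, and the "Introduction" label for chapter 0 are assumptions, as is the rule that any `# ` line in the preamble is the document H1.

```python
import re

# Heading pattern as quoted in this entry (the dash is a literal em dash).
CHAPTER_RE = re.compile(r"^# Chapter (\d+) — (.+)$", flags=re.MULTILINE)


def split_chapters(src: str) -> list[dict]:
    """Split a script into chapter 0 (preamble) plus numbered chapters."""
    # With two capture groups, re.split yields:
    # [preamble, number, title, body, number, title, body, ...]
    parts = CHAPTER_RE.split(src)
    preamble, rest = parts[0], parts[1:]

    # Assumption: the document H1 is a "# ..." line inside the preamble.
    intro_lines = [ln for ln in preamble.splitlines() if not ln.startswith("# ")]
    chapters = [{"index": 0, "title": "Introduction",
                 "text": "\n".join(intro_lines).strip()}]

    for i in range(0, len(rest), 3):
        number, title, body = rest[i:i + 3]
        chapters.append({"index": int(number), "title": title.strip(),
                         "text": body.strip()})
    return chapters


sample = (
    "# Demo Program\n\nWelcome.\n\n"
    "# Chapter 1 — Origins\n\nFirst body.\n\n"
    "# Chapter 2 — Aftermath\n\nSecond body.\n"
)
for chapter in split_chapters(sample):
    print(chapter["index"], chapter["title"], "|", chapter["text"])
```

Each chapter's `text` would then be written to its own chapter-N.txt for the TTS phase.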

Phase order is fixed by data dependency. The Elixir path encodes the graph via a with chain in LineageProgram.run/3²⁹; the shell harness uses sequential phase blocks with set -e.
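To make the mix and normalize steps concrete, here is a hedged Python sketch that builds the two argument vectors from the command shapes quoted in this entry, in the order the LineageProgram.run/3 chain is described as using (mix, then loudnorm). The function names are illustrative, the `-y -i` prefix on the ffmpeg call is borrowed from the encode commands quoted in the footnotes, and nothing here is executed.

```python
import math


def gain_to_db(gain: float) -> float:
    """Convert a linear sox -v gain factor to decibels."""
    return 20.0 * math.log10(gain)


def mix_argv(narration: str, ambient: str, out: str,
             narration_dur: float, ambient_gain: float = 0.13) -> list[str]:
    """sox -m mix, trimmed to the narration length."""
    return ["sox", "-m",
            "-v", "1.0", narration,
            "-v", str(ambient_gain), ambient,
            out, "trim", "0", f"{narration_dur:.3f}"]


def loudnorm_argv(src: str, out: str) -> list[str]:
    """Single-pass ffmpeg loudnorm at the -16 LUFS streaming target."""
    return ["ffmpeg", "-y", "-i", src,
            "-af", "loudnorm=I=-16:TP=-1.5:LRA=11", out]


print(f"{gain_to_db(0.13):.1f} dB")  # prints: -17.7 dB
print(" ".join(mix_argv("narration-full.wav", "background.wav",
                        "mixed.wav", 1800.0)))
print(" ".join(loudnorm_argv("mixed.wav", "normalized.wav")))
```

Passing each list to `subprocess.run(argv, check=True)` in sequence would give the same stop-on-first-failure behaviour that `set -e` provides in the shell harness.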

5. The ambient recipe library

The per-program ambient bed is the load-bearing aesthetic parameter of the pipeline. A monthly on Medici banking gets :florence_duomo (110 Hz + 165 Hz sines, 0.15 Hz tremolo, reverb at 100 % reverberance; a synthesized chant drone, NOT a sampled choir); a monthly on the Hanseatic League's Treaty of Stralsund gets :hanseatic_dock (brown noise, 0.18 Hz tremolo, reverb at 70 % reverberance, 60–900 Hz band-pass)³³. The full set is the 14 recipes in AmbientSource.build_args/6 plus a :gaslit_factory_floor recipe that lives only in the shell-harness build script for Edition III³⁴. The §7 table lists the recipe assigned to each program.

Every recipe is sox-synthesized: no sampled audio, no field recordings, no third-party sound libraries. The license posture is clean by construction — synthetic noise is mathematically generated with no human creative input beyond the recipe parameters, so the produced audio is unambiguously the work of the pipeline-author³⁵. The beds are suggestive of the historical setting (a Hanseatic dock has waves, a Yokohama steamship has piston cadence), not field-recorded reconstructions.
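A recipe is, in effect, a fixed sox effect chain parameterized by duration. The Python sketch below rebuilds the :hanseatic_dock argument list from the values quoted in this entry's footnote for that recipe. The function names, the 22050 Hz mono output format, and the 10-second pad default are assumptions for illustration; the real build_args/6 clause is not reproduced here.

```python
def hanseatic_dock_args(narration_seconds: float,
                        pad_seconds: float = 10.0) -> list[str]:
    """Effect chain for a brown-noise dock bed, sized to narration + pad."""
    dur = f"{narration_seconds + pad_seconds:.2f}"
    return [
        "synth", dur, "brownnoise",
        "tremolo", "0.18", "35",      # slow swell: 0.18 Hz at 35 % depth
        "vol", "0.06",
        "reverb", "70", "60", "70",   # reverberance, HF damping, room scale
        "highpass", "60",
        "lowpass", "900",
        "fade", "t", "4", dur, "4",   # 4 s linear fade in and out
    ]


def sox_argv(out_path: str, effect_args: list[str],
             rate: int = 22050, channels: int = 1) -> list[str]:
    """Wrap an effect chain in a null-input sox call (assumed output format)."""
    return ["sox", "-n", "-r", str(rate), "-c", str(channels),
            out_path, *effect_args]


argv = sox_argv("background.wav", hanseatic_dock_args(1800.0))
print(" ".join(argv))
```

Because the whole bed is described by this argument list, a recipe can be reviewed or diffed as text, which is what makes the shell harness and the Elixir module comparable clause by clause.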

6. The runnable-claim contract

A reader can rebuild any of the 15 programs from source. For Anti-Edison Vol I¹⁵:

cd ~/aether/audio-pipeline
bash build-edition-iii.sh   # ~5 min wallclock on this host
ffprobe ~/blog/public/canon/audio/edition-iii-anti-edison-vol-i.opus

ffprobe should report duration 5278.92 ± ε s (≈ 1 h 27 m 59 s) and a libopus stream at 96 kbps³¹. The output .opus is not bit-identical to the shipped file: libopus and libmp3lame carry frame-timing and silence-detection state across frames, so two runs over the same WAV can differ at the container level. Duration is deterministic to within the encoder's ~10 ms frame boundary; the narration content is bit-identical at the WAV stage before the lossy encoder runs³⁰. The Almanac batch build runs eight programs back-to-back via bash build-almanac-batch.sh (~40 min wallclock).
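The "± ε" check can be scripted rather than eyeballed. The following Python sketch wraps the ffprobe invocation quoted in the footnotes and compares the reading against the expected duration. The function names and the 20 ms default tolerance are assumptions; the entry itself only states a ~10 ms frame boundary.

```python
import os
import subprocess


def probe_duration(path: str) -> float:
    """Read the container duration in seconds via ffprobe (must be on PATH)."""
    result = subprocess.run(
        ["ffprobe", "-v", "error", "-show_entries", "format=duration",
         "-of", "csv=p=0", os.path.expanduser(path)],
        check=True, capture_output=True, text=True,
    )
    return float(result.stdout.strip())


def duration_matches(measured: float, expected: float,
                     tolerance: float = 0.02) -> bool:
    """True when the measured duration is within tolerance of the claim."""
    return abs(measured - expected) <= tolerance


# Shipped reading for Edition III, as quoted in this entry.
print(duration_matches(5278.920417, 5278.92))  # prints: True
```

On a host with the rebuilt file, `duration_matches(probe_duration("~/blog/public/canon/audio/edition-iii-anti-edison-vol-i.opus"), 5278.92)` would perform the whole check in one call.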

The Elixir canonical path is a single iex call (Phoenix context, with NATS progress events to the /audio LiveView):

iex> Aether.Audio.Pipeline.run!(%{
...>   script: "~/aether/audio-pipeline/scripts/almanac-january-rockefeller.md",
...>   basename: "almanac-january-rockefeller",
...>   output_dir: "~/blog/public/canon/audio",
...>   title: "Stax Almanac · January · John D. Rockefeller",
...>   album: "Stax Almanac"
...> })

Aether.Audio.Pipeline wraps LineageProgram.run/3 and publishes seven NATS subjects under homelab.audio.pipeline.* so the /audio LiveView can render progress as the run advances³².
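For readers unfamiliar with NATS subject wildcards, the sketch below lists the seven subjects named in this entry's footnote and shows how a `homelab.audio.pipeline.*` subscription would select them. It is a simplified stand-alone matcher written in Python for illustration; it is not the NATS client and not the LiveView code, and the function names are hypothetical.

```python
PREFIX = "homelab.audio.pipeline"
EVENTS = ["started", "chapter", "mixed", "normalized",
          "encoded", "done", "error"]


def subject(event: str) -> str:
    """Build the full subject for one lifecycle event."""
    if event not in EVENTS:
        raise ValueError(f"unknown pipeline event: {event}")
    return f"{PREFIX}.{event}"


def matches(pattern: str, subj: str) -> bool:
    """Simplified NATS matching: '*' is one token, '>' is the remaining tail."""
    p_tokens, s_tokens = pattern.split("."), subj.split(".")
    for i, token in enumerate(p_tokens):
        if token == ">":
            return i == len(p_tokens) - 1 and len(s_tokens) > i
        if i >= len(s_tokens) or (token != "*" and token != s_tokens[i]):
            return False
    return len(p_tokens) == len(s_tokens)


for event in EVENTS:
    print(subject(event), matches(f"{PREFIX}.*", subject(event)))
```

Every one of the seven subjects matches the single-token wildcard, so one subscription is enough for a progress panel.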

7. The 15 shipped programs

All durations measured via ffprobe -show_entries format=duration on the published .opus files at ~/blog/public/canon/audio/ as of 2026-05-19⁶:

| Program | Duration | Chapters | Ambient recipe |
|---|---|---|---|
| Edition I — Phonograph Object Lesson | 40:08 | 6 | :pink_reverb |
| Edition III — Anti-Edison Vol I LP | 1:27:59 | 6 + 4 IL | :gaslit_factory_floor |
| Edition VI — Anti-Edison Vol II LP | 1:20:45 | 6 + 4 IL | :electrified_factory_floor |
| Almanac · January · Rockefeller | 33:59 | 6 | :rumble_machinery |
| Almanac · February · Tudor | 29:37 | 6 | :north_atlantic_winter |
| Almanac · March · Perkin | 32:45 | 6 | :victorian_laboratory |
| Almanac · April · Medici | 35:13 | 6 | :florence_duomo |
| Almanac · May · Hanseatic League | 31:22 | 6 | :hanseatic_dock |
| Almanac · June · Rothschild | 31:00 | 6 | :waterloo_courier_road |
| Almanac · July · Carnegie | 32:43 | 6 | :homestead_blast_furnace |
| Almanac · August · Slim | 32:33 | 6 | :trading_floor_after_hours |
| Almanac · September · Ren Zhengfei | 32:38 | 6 | :shenzhen_apartment_1987 |
| Almanac · October · Morgan | 32:39 | 6 | :library_mahogany |
| Almanac · November · Polo | 27:37 | 6 | :venetian_lagoon |
| Almanac · December · Iwasaki | 28:42 | 6 | :yokohama_steamship |
| Total | 9:49:39 | 90 + 8 IL | 15 distinct recipe-instances |

(Chapters count includes the program-introduction chapter 0; "IL" = side-break interlude on Editions III / VI.)

Every program is published under CC BY-NC 4.0 per the Stax Editions drop-house charter §12³⁵.

8. Pipeline-as-substrate

What the pipeline enables that bespoke per-program production wouldn't:

- Marginal-cost collapse. A new monthly Almanac is "write the script, add a registry entry, pick a recipe, run the build." Piper TTS runs at real-time factor ~0.13–0.17× on this laptop³⁷; a 30-minute program takes ~4–5 minutes to synthesize and another ~30 seconds to mix, normalize, encode. The 8-program May–December batch ran in under 40 minutes total.
- Format consistency. Every program ships with the same chapter-marker shape, the same -16 LUFS target, the same 96k Opus + 128k MP3 encode, the same four sidecar artifacts. The player page renders identically for the Phonograph Edition and a monthly Almanac.
- Substrate for the 2027 cycle. The 2027 twelve-figure Almanac roster is already published; the pipeline is what makes that ship-cadence feasible without re-tooling per program.
- Composability with the Director's Track DSL. Chapter-marker sidecars let the Director's DSL³⁸ address program sections by basename + chapter index: a scene like play :almanac-jan-rockefeller chapter: 3 zone: "kitchen" reaches the section level, not just the file level.
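The marginal-cost figures above follow directly from the real-time factor. A small Python sketch of that arithmetic, assuming the 0.13–0.17× range and the ~30 seconds of post-processing quoted in this section; the function name `build_estimate` is illustrative.

```python
def build_estimate(program_minutes: float,
                   rtf_range: tuple[float, float] = (0.13, 0.17),
                   post_seconds: float = 30.0) -> tuple[float, float]:
    """Return (low, high) wallclock minutes: TTS synthesis plus post-processing."""
    post_minutes = post_seconds / 60.0
    low, high = (program_minutes * rtf + post_minutes for rtf in rtf_range)
    return low, high


low, high = build_estimate(30.0)
print(f"30-minute program: {low:.1f} to {high:.1f} min wallclock")
# prints: 30-minute program: 4.4 to 5.6 min wallclock
```

The estimate covers synthesis and post-processing only; cache hits on unchanged chapters would shorten a rebuild further.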

9. Honest limitations

Six things the current pipeline does not do:

1. Piper TTS produces serviceable but unmistakably-synthesized narration. en_US-libritts_r-medium is the best-public-OSS multi-speaker LibriTTS variant (uniform prosody at 165–185 wpm), but it is not Studio One quality and there has been no professional voice audition pipeline⁸. The Edition III physical-press capsule (Q4 2026) will need a real-human-voice master.
2. Loudness normalization targets streamed playback (-16 LUFS), not vinyl pressing. The physical LP master will need a separate normalization pass tuned to the pressing facility's spec³⁹.
3. Ambient beds are sox-synthesized at recipe-level fidelity. :venetian_lagoon is suggestive of the historical setting; it is not a field recording of San Marco at dawn. "Synthetic, suggestive, licensed-clean," not "actual lagoon water."
4. Differential testing against the canonical Membrane pipeline is partial. The shell-harness and Elixir paths produce the same shape of output (Opus + MP3 + sidecars, same metadata schema, durations within encoder frame-boundary tolerance), but the two paths have not been bit-compared end-to-end. Opus encoder non-determinism makes byte-identical comparison non-trivial; the next-frontier work is a duration+integrated-LUFS comparison harness³⁶.
5. The pipeline does not handle music yet. Spoken-word + synthetic ambient is the entire scope. A real instrumental score (e.g., for the Edition III LP physical press) would need a parallel pipeline or external mastering.
6. No dedicated unit tests for Aether.Audio.*. The 38 / 39 passing tests in mix test cover the Director runtime, Director scene, and web controllers, not the audio modules¹¹. The integration evidence is the 15 produced programs; that is real evidence and it is narrower than what unit-tested would warrant. A PiperShim.split_chapters/2 test, an AmbientSource argument-builder test, and a LineageProgram.chapter_timestamps/1 property test are the next-frontier closures.
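As an illustration of what a chapter-timestamp property test would assert, here is a Python sketch of the cumulative-sum reading of chapter start times. The actual behaviour of LineageProgram.chapter_timestamps/1 is not shown in this entry, so both functions and the invariants listed are assumptions; the HH:MM:SS.mmm format follows the .txt sidecar description. Interludes are ignored.

```python
from itertools import accumulate


def chapter_starts(durations: list[float]) -> list[float]:
    """Start offset of each chapter, given per-chapter durations in seconds."""
    return [0.0, *accumulate(durations)][:-1]


def format_timecode(seconds: float) -> str:
    """Render seconds as HH:MM:SS.mmm."""
    ms_total = round(seconds * 1000)
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"


durations = [10.0, 20.5, 30.0]
starts = chapter_starts(durations)

# Invariants a property test could check for any non-negative durations:
assert len(starts) == len(durations)
assert starts[0] == 0.0
assert all(a <= b for a, b in zip(starts, starts[1:]))
assert starts[-1] + durations[-1] == sum(durations)

print([format_timecode(s) for s in starts])
# prints: ['00:00:00.000', '00:00:10.000', '00:00:30.500']
```

A property-based runner would generate the `durations` list instead of hard-coding it and re-check the same four invariants.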

10. v2 deferrals

Honest deferred items for the next pass:

- Real human voice. Audition / Pro Tools handoff for the Edition III physical LP press master; long-term audition pipeline for the broader public-distribution surface.
- Music score layering. Instrumental composition + multi-track music+narration mix for programs where a synthetic ambient bed is insufficient.
- Container-level chapter markers. Bake chapter timecodes into the Opus OpusTags block (and ID3v2 CHAP for MP3) so chapter-aware players render them as a scrubbable list⁴⁰.
- Public-API exposure. Once the AETHER NATS bus matures, a mix call AETHER /audio/produce '{"script": "...", "ambient": ":victorian_laboratory"}' returning a finished .opus is a clean v2 shape that composes with the rest of the agent fleet.
- Native Membrane mixer + loudnorm. Swap sox / ffmpeg shell-outs for native Membrane elements when the plug-in set ships upstream-stable equivalents; the wrapper factoring is designed for that swap¹⁴.
- External CC0 ambient pool. Curated pool of CC0 / public-domain field recordings (archive.org, Free Music Archive CC0 tier, Library of Congress field-recording collection) selectable in place of synthetic beds where field fidelity matters⁴¹.
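For the container-level chapter-marker deferral, one plausible first step is generating an ffmpeg metadata file from the chapter timecodes the pipeline already computes. The Python sketch below emits the FFMETADATA1 chapter format. The function names and the `(title, start, end)` tuple layout are assumptions, and the remux step itself is left out because it has not been tried against this pipeline's outputs.

```python
def escape_ffmeta(value: str) -> str:
    """Backslash-escape the characters ffmpeg's metadata format treats as special."""
    for ch in ("\\", "=", ";", "#", "\n"):
        value = value.replace(ch, "\\" + ch)
    return value


def ffmetadata_chapters(chapters: list[tuple[str, float, float]]) -> str:
    """Render (title, start_s, end_s) tuples as an FFMETADATA1 chapter file."""
    lines = [";FFMETADATA1"]
    for title, start_s, end_s in chapters:
        lines += [
            "[CHAPTER]",
            "TIMEBASE=1/1000",               # START/END are in milliseconds
            f"START={round(start_s * 1000)}",
            f"END={round(end_s * 1000)}",
            f"title={escape_ffmeta(title)}",
        ]
    return "\n".join(lines) + "\n"


print(ffmetadata_chapters([
    ("Introduction", 0.0, 60.5),
    ("Chapter 1: Origins", 60.5, 400.0),
]))
```

The resulting file would be supplied to ffmpeg as a second input and selected with `-map_chapters`; whether the chapters survive in the Opus and MP3 outputs is something to confirm with `ffprobe -show_chapters` before relying on it.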

11. Cross-references

- Lineage Mode: the substrate concept of "every Stax canon program ships an audio component"; the pipeline is the operational implementation.
- Director's Track DSL: composes with this pipeline at the program-section level.
- Stax Editions I, III, VI: ship audio components produced by this pipeline; Edition II (Almanac) is now audio-complete across the full 12-month roster via this substrate.
- Design-system contract: ~/codex/methods/stax-dev-portfolio-design-system.md defines the evidence vocabulary¹.
- License posture: pipeline code AGPL-3.0⁴²; produced audio CC BY-NC 4.0³⁵; Piper voice model CC BY 4.0 (LibriTTS-R derivative)⁸; sox-synthesized ambient CC0-equivalent by construction.

12. Status footer

- Evidence grade: compiled. Elixir code path compiles clean under Elixir 1.17 / OTP 27; 38 / 39 tests pass in mix test; the audio modules have no dedicated unit tests. Integration evidence is the 15 shipped programs (9 h 49 m 39 s of audio, ffprobe-verifiable at ~/blog/public/canon/audio/). Not unit-tested, not differential-tested end-to-end, not audited.
- Reproducible: true. The bash-harness commands and iex call in §6 are the canonical reproduction recipes.
- Last verified: 2026-05-19, Intel Core i7-1065G7 @ 1.30 GHz, Linux 7.0.3-arch1-1 x86_64, Elixir 1.17.3 / OTP 27, Piper v2023.11.14-2, sox v14.4.2, ffmpeg v8.0.1.
- Open gaps: unit-test coverage for PiperShim / AmbientSource / LineageProgram; duration+LUFS differential test between shell and Elixir paths; container-level chapter markers; human-voice audition pipeline; vinyl-press master pass; native Membrane mixer + loudnorm.
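The duration+LUFS differential test named in the open gaps reduces to a tolerance comparison once both paths have been measured. A minimal Python sketch of that comparison, using the 10 ms / 0.5 LUFS tolerances stated in this entry's footnotes; the `Measurement` type and `compare` function are hypothetical, and collecting the measurements (ffprobe for duration, an ffmpeg loudnorm analysis pass for integrated loudness) is assumed to happen elsewhere.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measurement:
    duration_s: float
    integrated_lufs: float


def compare(shell: Measurement, elixir: Measurement,
            max_duration_delta: float = 0.010,
            max_lufs_delta: float = 0.5) -> list[str]:
    """Return a list of human-readable failures; empty means the paths agree."""
    failures = []
    d_delta = abs(shell.duration_s - elixir.duration_s)
    if d_delta > max_duration_delta:
        failures.append(f"duration differs by {d_delta * 1000:.1f} ms")
    l_delta = abs(shell.integrated_lufs - elixir.integrated_lufs)
    if l_delta > max_lufs_delta:
        failures.append(f"integrated loudness differs by {l_delta:.2f} LUFS")
    return failures


# Illustrative values only, not measurements of shipped files.
print(compare(Measurement(1800.000, -16.1), Measurement(1800.004, -15.9)))
print(compare(Measurement(1800.000, -16.0), Measurement(1800.250, -14.8)))
```

Running this over all 15 basenames and requiring an empty failure list for each is the shape of check that would support a differential-tested grade.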

Footnotes

~/codex/methods/stax-dev-portfolio-design-system.md defines the controlled evidence vocabulary used across /lab entries: compiled, unit-tested, property-tested, fuzz-tested, differential-tested, benchmarked, audited, sketch, NOASSERTION. The vocabulary is enforced via the frontmatter evidence: field, which the renderer surfaces as a pill in the entry header. ↩
ffprobe -v error -show_entries format=duration -of csv=p=0 ~/blog/public/canon/audio/edition-i-phonograph-object-lesson.opus → 2407.985125 s = 40:07.99 (rounded to 40:08 for the display table). Six chapters: program intro + five narrative chapters, per the JSON sidecar at ~/blog/public/canon/audio/edition-i-phonograph-object-lesson.json. ↩
ffprobe …/edition-iii-anti-edison-vol-i.opus → 5278.920417 s = 1:27:58.92. Six chapters (intro + four narrative sides A1/A2/B1/B2 + coda) plus four side-break interludes at A1→A2, A2→B1, B1→B2, B2→coda; structure per the JSON sidecar at ~/blog/public/canon/audio/edition-iii-anti-edison-vol-i.json. ↩
ffprobe …/edition-vi-anti-edison-vol-ii.opus → 4844.774583 s = 1:20:44.77. Same shape as Edition III (intro + four narrative sides + coda + four interludes); structure per the JSON sidecar at ~/blog/public/canon/audio/edition-vi-anti-edison-vol-ii.json. ↩
Sum of ffprobe durations across the 12 monthly .opus files (Jan through Dec) = 22847.30 s = 6 h 20 m 47 s. Per-file values in §7. Source-of-truth durations measured 2026-05-19 against ~/blog/public/canon/audio/almanac-*.opus. ↩
Aggregate measurement: for f in edition-i-phonograph-object-lesson edition-iii-anti-edison-vol-i edition-vi-anti-edison-vol-ii almanac-january-rockefeller almanac-{02-feb-tudor,03-mar-perkin,04-apr-medici,05-may-hanse,06-jun-rothschild,07-jul-carnegie,08-aug-slim,09-sep-ren,10-oct-morgan,11-nov-polo,12-dec-iwasaki}; do ffprobe -v error -show_entries format=duration -of csv=p=0 $f.opus; done | awk '{s+=$1} END {print s}' → 35378.98 s = 9 h 49 m 38.98 s. ↩
wc -w across the 15 script files under ~/aether/audio-pipeline/scripts/edition-*.md and ~/aether/audio-pipeline/scripts/almanac-*.md: 95,632 words total. Per-script counts range from 4,171 (December · Iwasaki) to 13,595 (Edition III · Anti-Edison Vol I). The total counts the markdown source words including chapter headings; the spoken narration after stripping headings and code blocks is marginally smaller, but the ~95,600 figure is the load-bearing approximation. ↩
~/aether/audio-pipeline/piper/en_US-libritts_r-medium.onnx, model card at https://huggingface.co/rhasspy/piper-voices/tree/main/en/en_US/libritts_r/medium. Per ~/aether/audio-pipeline/scripts/credits.md: license is CC BY 4.0 (LibriTTS-R dataset derivative). Piper engine itself (v2023.11.14-2) at https://github.com/rhasspy/piper is MIT-licensed. Tested against amy-medium and ryan-medium; libritts_r-medium had the most uniform prosody for long-form essay narration at 165–185 wpm. ↩
~/aether/lib/aether/audio/ambient_source.ex, lines 154–512. Fourteen named build_args/6 clauses, one per recipe atom; clause heads enumerate :pink_reverb, :rumble_machinery, :north_atlantic_winter, :victorian_laboratory, :florence_duomo, :hanseatic_dock, :waterloo_courier_road, :homestead_blast_furnace, :trading_floor_after_hours, :shenzhen_apartment_1987, :library_mahogany, :venetian_lagoon, :yokohama_steamship, :electrified_factory_floor. Default-gain and default-fade clauses at lines 122–152. ↩
cd ~/aether && mix compile --force 2>&1 | tail -2 on 2026-05-19: Compiling 37 files (.ex) → Generated aether app. Clean compile, zero warnings, on Elixir 1.17.3 / OTP 27. ↩
cd ~/aether && mix test 2>&1 | tail -2 on 2026-05-19: Finished in 6.9 seconds (0.8s async, 6.1s sync) → 39 tests, 1 failure. The one failure is in test/aether_web/controllers/page_controller_test.exs:6 — an assertion against the Phoenix-generated "Peace of mind from prototype to production" boilerplate text that was overwritten by the AETHER Sonos-zones dashboard. Unrelated to audio. The audio modules themselves have no dedicated tests; the only files under test/aether/ are director/runtime_test.exs and director/scene_test.exs. ↩
The "pipeline is the artifact, audio files are the by-product" framing is the same substrate move that the portfolio-bench /lab entry makes for benchmarks: a single bench is a perf claim, five aligned bench files are a substrate claim. One audio file is a production artifact; 15 audio files through one pipeline is a substrate claim. ↩
Module roster at ~/aether/lib/aether/audio/: pipeline.ex (193 LOC — orchestrator with NATS events), lineage_program.ex (260 LOC — DAG), piper_shim.ex (180 LOC — TTS subprocess shim), ambient_source.ex (513 LOC — recipe library), mixer.ex (92 LOC — sox-backed two-input mix), loudness_normalize.ex (93 LOC — ffmpeg loudnorm), opus_encode_pipeline.ex (105 LOC — real Membrane.Pipeline for WAV→Opus), wav_strip_header.ex (106 LOC — Membrane.Filter for header strip). Total: ~1,542 Elixir LOC for the audio pipeline. ↩
~/aether/lib/aether/audio/lineage_program.ex lines 23–38: "The encoder leg is implemented as a real Membrane.Pipeline … because that is the part of the pipeline where Membrane's element model is the production-correct factoring: file-source → format-converter → encoder → file-sink is a single linear DAG with no multi-source mixing and no subprocess streaming, which is exactly what Membrane Core 1.3 is good at. The synthesis-and-mix leg shells out to piper/sox/ffmpeg behind Membrane-element-shaped Elixir modules, because Membrane has no upstream-stable mixer or loudnorm element in the 1.0 plugin set and rolling them is an unjustified scope expansion. The wrapper factoring preserves the option to swap to a native Membrane implementation in v2 without changing the call surface." ↩
~/aether/audio-pipeline/build-edition-iii.sh, 157 lines. Phases enumerated as echo "=== Phase N: …" blocks; Phase 1 (split chapters via inline Python heredoc), Phase 2 (Piper TTS per chapter, mtime-cached), Phase 3 (chapter durations via soxi -D), Phase 4 (synthesize 4 × 25-second interludes), Phase 5 (assemble program with sox concat), Phase 6 (ambient bed via :gaslit_factory_floor recipe — sox-only inline), Phase 7 (mix narration + ambient), Phase 8 (Opus encode at 96k), Phase 9 (MP3 encode at 128k). Note: this script omits the explicit loudnorm pass that the Elixir path and the Almanac batch script run. The integrated loudness of the shipped Edition III is set by the mix gains rather than by a loudnorm post-pass. ↩
~/aether/audio-pipeline/build-edition-vi.sh, 157 lines. Same 9-phase shape as the Vol I script; recipe is :electrified_factory_floor (the Vol II sister recipe documented in ambient_source.ex lines 471–512); narrative ordering interleaves four interludes between five chapter sides plus a coda. ↩
~/aether/audio-pipeline/build-almanac-batch.sh, 357 lines. The build_one shell function (lines 37–298) implements the 10-phase per-program build including the explicit loudnorm step at Phase 7. Twelve build_one invocations at the bottom of the script (May through December in batch-2; the Feb/Mar/Apr batch-1 invocations are commented out as previously shipped earlier the same day per the comment block at lines 13–22). ↩
The Elixir path and the shell-harness path share the same Piper binary, the same en_US-libritts_r-medium.onnx voice, the same sox synth-and-reverb arguments (the bash case "$RECIPE" block in build-almanac-batch.sh lines 116–262 mirrors Aether.Audio.AmbientSource.build_args/6 clause-by-clause), and the same ffmpeg libopus + libmp3lame command lines. They are not bit-identical (Opus encoder non-determinism, and the Edition III/VI shell scripts skip the explicit loudnorm pass). They are operationally equivalent at the format and duration level. A duration+LUFS differential-test harness is an open v2 gap. ↩
Elixir path: Aether.Audio.PiperShim.split_chapters/2 at ~/aether/lib/aether/audio/piper_shim.ex lines 56–105 — regex pattern ^# Chapter (\d+) — (.+?)$\n(.*?)(?=^# Chapter |\z) matches against the script, preamble (anything before the first # Chapter heading, with the document's H1 stripped) becomes chapter 0. Shell path: inline Python python3 - "$SCRIPT" "$CHAPTERS_DIR" <<'PY' … PY heredoc in each of the three build scripts; same regex shape (re.split(r'^# Chapter (\d+) — (.+)$', src, flags=re.MULTILINE)). ↩
Elixir path: Aether.Audio.PiperShim.run_piper/3 at piper_shim.ex lines 145–169; constructs <piper> --model <voice> --output_file <wav> --sentence_silence <s> --quiet < <txt> via sh -c so piper reads from the file and gets a real EOF. Mtime-cache check at lines 127–143 — synthesis short-circuits if the output WAV is newer than the input .txt. Shell path: equivalent inline [[ -f "$out" && "$out" -nt "$in" ]] cache check + "$PIPER" --model "$VOICE" --output_file "$out" --sentence_silence 0.5 < "$in". ↩
Elixir path: LineageProgram.duration_seconds/1 at lineage_program.ex lines 120–131 — wraps soxi -D <path>, parses the float from stdout. Shell path: inline dur=$(soxi -D "$CHAPTERS_DIR/chapter-$i.wav") in each script. ↩
Elixir path: LineageProgram.concat_chapters/2 at lineage_program.ex lines 107–118 — sox <chapter-0.wav> … <chapter-N.wav> narration-full.wav. Shell path: explicit sox chapter-0.wav chapter-1.wav … narration-full.wav enumeration; Edition III/VI scripts interleave interlude-N.wav files at the side breaks. ↩
~/aether/lib/aether/audio/loudness_normalize.ex lines 1–31 — moduledoc names the broadcast-podcast target reasoning: "Apple Podcasts target is −16 LUFS, Spotify is −14 LUFS, Amazon Music is −14 LUFS; −16 is the conservative central value that plays comfortably on every major podcast platform without triggering platform-side compression." Single-pass loudnorm (accurate to ~0.5 LUFS of target); v2 can upgrade to two-pass for finer precision. Filter at line 53: loudnorm=I=-16:TP=-1.5:LRA=11:print_format=summary. ↩
~/aether/lib/aether/audio/mixer.ex lines 32–78 — sox -m -v <narration-gain> <narration> -v <ambient-gain> <ambient> <output> [trim 0 <narration-dur>]. Defaults: narration gain 1.0 (0 dB), ambient gain 0.12 (~-18 dB). Shell-harness Almanac uses ambient gain 0.15 (~-16 dB); Edition III/VI shell uses 0.13 (~-18 dB). ↩
~/aether/lib/aether/audio/lineage_program.ex encode_opus/3 at lines 149–183 — ffmpeg -y -i <wav> -c:a libopus -b:a 96000 -application audio -metadata title=… -metadata artist=Stax -metadata album=… -metadata date=2026 -metadata genre="Spoken Word" <output>. Verification step at verify_with_membrane/1 lines 190–197 (file exists and size > 100 bytes — minimal smoke test). ↩
lineage_program.ex encode_mp3/3 at lines 199–221 — ffmpeg -y -i <wav> -c:a libmp3lame -b:a 128k -metadata title=… <output>. Same metadata schema as Opus. ↩
~/aether/audio-pipeline/gen-sidecars.py, 690 lines. Per-program registry at lines 25 onward (one dict literal per basename); the script reads <basename>.json chapter-marker output, writes .txt (HH:MM:SS.mmm chapter timecodes), .html (full player + transcript page), and a ~/blog/content/canon/audio/<basename>.md for the blog builder. Invoked from build-almanac-batch.sh Phase 10 as python3 "$PIPELINE_DIR/gen-sidecars.py" "$BASENAME" "$SCRIPT" "$WORK_DIR". ↩
~/aether/audio-pipeline/gen-sidecars.py. The REGISTRY dict at lines 25 onward keys every shipped program by basename and stores title, edition number, month, ambient recipe, lineage cluster, primary sources, and chapter structure. Adding a new program is one registry entry. ↩
~/aether/lib/aether/audio/lineage_program.ex run/3 at lines 68–101 — single with expression chains PiperShim.synthesize → concat_chapters → duration_seconds → AmbientSource.render → Mixer.mix → LoudnessNormalize.normalize → encode_opus → encode_mp3 → build_manifest. Any failure short-circuits to {:error, reason}. ↩
libopus carries internal state across frames (silence-detection, frame-timing, look-ahead buffering) such that two invocations on the same input WAV produce .opus files that have the same audio content at PCM-decode level but are not byte-identical at the container level. Duration is reproducible to within the Opus frame boundary (~10 ms). libmp3lame has analogous non-determinism. Byte-deterministic encoding would require a different codec (e.g., FLAC) or codec-level seed pinning; neither is in v1 scope. ↩
ffprobe ~/blog/public/canon/audio/edition-iii-anti-edison-vol-i.opus reports: Format ogg; Stream #0:0 Audio: opus, 48000 Hz, mono, fltp, 96 kb/s; Duration 01:27:58.92; Metadata title, artist=Stax, album=Stax Editions, date=2026, genre=Spoken Word. Properties consistent across the 15 shipped programs (the Almanac album metadata reads Stax Almanac; the Edition I album reads Stax Edition I). ↩
~/aether/lib/aether/audio/pipeline.ex lines 30–36 — seven NATS subjects published over the lifecycle of a run: homelab.audio.pipeline.started, .chapter, .mixed, .normalized, .encoded, .done, .error. The orchestrator generates a run-id (line 185–192) and publishes a coalesced summary at the end; v0.2 will push event emission down into LineageProgram itself for live streaming. The /audio LiveView consumes these subjects to render a "now mastering" panel. ↩
~/aether/lib/aether/audio/ambient_source.ex :florence_duomo clause at lines 261–305: synth <dur> sine 110, synth <dur> sine mix 165, tremolo 0.15 25, vol 0.06, reverb 100 90 100, highpass 80, lowpass 1500, fade t 5 <dur> 5. :hanseatic_dock clause at lines 307–320: synth <dur> brownnoise, tremolo 0.18 35, vol 0.06, reverb 70 60 70, highpass 60, lowpass 900, fade t 4 <dur> 4. Every clause is sox-synthesized arguments only; no sample file paths. ↩
The :gaslit_factory_floor recipe used by the Edition III shipped audio lives only in ~/aether/audio-pipeline/build-edition-iii.sh lines 122–129 (inline sox-synth args). The Elixir Aether.Audio.AmbientSource defines :electrified_factory_floor (Edition VI Anti-Edison Vol II) but not :gaslit_factory_floor (Edition III Anti-Edison Vol I). The Edition VI (Vol II) module-doc comment at ambient_source.ex lines 471–481 explicitly references "the Edition III Vol I shell-harness ambient bed", which acknowledges the shell-only origin. Porting :gaslit_factory_floor to the Elixir module is an honest v2 cleanup. ↩
~/aether/audio-pipeline/scripts/credits.md, "License of the resulting audio" section: "CC BY-NC 4.0 per the Stax Editions drop-house charter §12." The same line appears in every program's gen-sidecars.py REGISTRY entry under the license key. ↩
A duration+integrated-LUFS differential-test harness — run the shell-harness build of all 15 programs, run the Elixir-pipeline build of the same 15, compare ffprobe duration and ffmpeg loudnorm integrated-LUFS readings within 10 ms / 0.5 LUFS tolerance — is the next-frontier closure that would justify upgrading the evidence grade from compiled to differential-tested. Not done. ↩
Piper TTS real-time factor on this host (Intel Core i7-1065G7 @ 1.30 GHz, CPU-only inference): 0.13–0.17×, i.e., synthesizing 60 seconds of narration takes 7.8–10.2 seconds of wallclock. Measured empirically across the 15 program builds; the per-chapter Piper invocation in the shell-harness logs reports a "synthesized in Xs" line that gives the read-off. ↩
The Director's Track DSL is the AETHER substrate's scene-driven composition layer; the play_program operation in the DSL takes a basename + zone and (where applicable) chapter index, which addresses an audio file produced by this pipeline. The Director runtime + scene tests at ~/aether/test/aether/director/{runtime,scene}_test.exs are where the DSL's evidence sits; the binding to this pipeline is the basename + chapter-index addressing scheme that this pipeline's sidecar artifacts make addressable. ↩
The Edition III physical-press LP (Q4 2026) is a separate mastering job from the streamed-podcast target. RIAA EQ pre-emphasis, lacquer-cutting headroom, and side-A / side-B level matching are all out of scope for the -16 LUFS streamed target. The streamed .opus and the LP master are different artifacts produced from the same narration WAV. ↩
The Opus OpusTags block and the MP3 ID3v2 CHAP frame both support container-level chapter markers; chapter-aware players (Overcast, Apple Podcasts, VLC) will render them as a scrubbable list. The pipeline currently emits chapter timecodes to a .txt sidecar and to the <basename>.json for the player page, but does not bake them into the Opus or MP3 container. ffmpeg supports -map_metadata + -map_chapters for this; the next-frontier work is a small post-encode pass that writes a chapter-metadata file and re-encodes (or remuxes) with chapters. ↩
~/aether/lib/aether/audio/ambient_source.ex moduledoc lines 81–84: "v2 enhancement: replace the synthetic bed with selection from a curated CC0 ambient pool (e.g. archive.org Public Domain Audio, Free Music Archive CC0 tier)." The wrapper factoring at render/3 makes the swap one-clause-change in the dispatch: add a defp render_from_pool/3 and route :archive_org_<recipe> atoms to it. ↩
The AETHER application as a whole is licensed AGPL-3.0; the audio pipeline modules live under ~/aether/lib/aether/audio/ inside that license boundary. The Piper voice model carries its own CC BY 4.0 (LibriTTS-R derivative); the produced audio is CC BY-NC 4.0 per the Stax Editions charter; these three licenses compose without conflict because each governs a different layer (code / model / output). ↩