Week of 2026-05-18 — music client v0.2: UPnP ContentDirectory browse + queue management

Started 2026-05-18 · shipped · SMC17/aether

The v0.1 commit shipped the now-playing surface and called out queue inspection as the work to come: "Queue inspection requires UPnP browse — coming in v0.2." This is that ship.

What v0.2 adds

The thesis stays the same: anti-Sonos-S2, restrained typography, no account-auth on a LAN, no SDK dependence. v0.2 closes the gap between "a control surface for the currently-playing track" and "a control surface for the whole listening session."

Three layers extended in lock-step:

  1. <internal-lab>/sonos-cli/sonos.py: added UPnP ContentDirectory Browse plus the five queue-mutation ops from AVTransport:1 (AddURIToQueue, RemoveTrackFromQueue, ReorderTracksInQueue, RemoveAllTracksFromQueue, Seek(Unit=TRACK_NR)).

  2. <internal-lab>/sonos-cli/sonos_bridge.py: handle_command() dispatches the new ops as NATS subjects, and after any successful queue mutation re-publishes the new queue contents to lab.sonos.zone.<zone>.queue so the LiveView gets push-only updates with no polling.

  3. <internal-lab>/aether/lib/aether/zone.ex + <internal-lab>/aether/lib/aether_web/live/music_live.ex: Aether.Zone now subscribes to all three per-zone subjects (.state + .queue + .browse) and routes each on the topic suffix; MusicLive renders a queue panel as the third column, binds HTML5-native drag-to-reorder, and opens a five-tab library browser modal (Favorites / Playlists / Artists / Albums / Tracks).
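The bridge's "mutate, then re-publish" behaviour can be sketched roughly as follows. This is a minimal illustration, not the actual sonos_bridge.py: the `handlers` table, the `publish` callable and the `MUTATING_OPS` set are assumptions, injected so the sketch runs without a NATS server or a speaker on the LAN. Only the subject name and the op names come from this post.

```python
import json

# Ops that change queue contents (names as used in this post). After any of
# these succeeds, a fresh queue snapshot is pushed to subscribers.
MUTATING_OPS = {"queue_add", "queue_remove", "queue_reorder", "queue_clear"}


def handle_command(zone, payload, handlers, publish):
    """Dispatch one command; re-publish the queue if the op mutated it.

    handlers: hypothetical dict of op name -> callable(zone, cmd) -> dict
    publish:  hypothetical callable(subject, data_bytes)
    """
    cmd = json.loads(payload)
    op = cmd.get("op", "")
    handler = handlers.get(op)
    if handler is None:
        return {"op": op, "status": "error", "error": "unknown op"}
    result = {"op": op, "status": "ok", **handler(zone, cmd)}
    if op in MUTATING_OPS:
        # Push the post-mutation queue so the UI never has to poll for it.
        snapshot = handlers["queue_get"](zone, {})
        publish(f"lab.sonos.zone.{zone}.queue", json.dumps(snapshot).encode())
    return result
```

Keeping the re-publish inside the dispatcher, rather than in each handler, means a new mutating op only needs to be added to one set to get push updates.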

DIDL-Lite parser — a real XML parser, not regex

Sonos returns ContentDirectory results as DIDL-Lite — a well-formed XML payload — wrapped (entity-encoded) inside the SOAP envelope's <Result> element. v0.1 parsed GetPositionInfo with regex because the field shape is fixed and the payload tiny. v0.2 parses with xml.etree.ElementTree because DIDL-Lite items have variable child sets (some have upnp:album, some don't; favorites can be tracks or radio stations or container references; the <res> element carries both the play URI as text and a duration attribute).

Regex would have shipped, but it would have been wrong the first time a track title contained an angle bracket or an ampersand. Our own queue-add test hit the same class of failure on the outbound side: a Spotify URI with & query separators broke soap_call's naïve interpolation. A real parser handles those metacharacters in the XML layer, so the same payload parses no matter what the title or URI contains:

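import xml.etree.ElementTree as ET

_DIDL_NS = {
    "didl": "urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/",
    "dc":   "http://purl.org/dc/elements/1.1/",
    "upnp": "urn:schemas-upnp-org:metadata-1-0/upnp/",
    "r":    "urn:schemas-rinconnetworks-com:metadata-1-0/",
}

def _text(el, path: str) -> str:
    # Not shown in the original excerpt; assumed to return the stripped text
    # of the first matching child, or "" when the child is absent.
    return (el.findtext(path, default="", namespaces=_DIDL_NS) or "").strip()

def _parse_didl(didl_xml: str) -> list:
    # Flatten every top-level container and item into a plain dict.
    root = ET.fromstring(didl_xml)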
    out = []
    for tag in ("container", "item"):
        for el in root.findall(f"didl:{tag}", _DIDL_NS):
            entry = {
                "id":         el.attrib.get("id", ""),
                "title":      _text(el, "dc:title"),
                "creator":    _text(el, "dc:creator"),
                "album":      _text(el, "upnp:album"),
                "upnp_class": _text(el, "upnp:class"),
                "duration":   "",
                "res":        "",
            }
            res_el = el.find("didl:res", _DIDL_NS)
            if res_el is not None:
                entry["res"] = (res_el.text or "").strip()
                entry["duration"] = res_el.attrib.get("duration", "")
            out.append(entry)
    return out
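To make the "entity-encoded inside <Result>" step concrete, here is a small self-contained sketch of the two-stage parse. The envelope below is hand-built for illustration and is much smaller than a real Sonos response; `extract_titles` is a hypothetical helper, not a function from sonos.py. The point it shows is that parsing the outer envelope already undoes one level of escaping, so the text of <Result> is itself a parseable DIDL-Lite document.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

NS = {
    "didl": "urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Inner document: a title containing both an ampersand and angle brackets.
DIDL = (
    '<DIDL-Lite xmlns="urn:schemas-upnp-org:metadata-1-0/DIDL-Lite/" '
    'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    '<item id="Q:0/1"><dc:title>Tom &amp; Jerry &lt;Live&gt;</dc:title></item>'
    "</DIDL-Lite>"
)

# Outer document: the whole DIDL payload travels as escaped text.
ENVELOPE = (
    '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/"><s:Body>'
    '<u:BrowseResponse xmlns:u="urn:schemas-upnp-org:service:ContentDirectory:1">'
    f"<Result>{escape(DIDL)}</Result>"
    "</u:BrowseResponse></s:Body></s:Envelope>"
)


def extract_titles(envelope_xml):
    """Parse the envelope, then parse the text of <Result> as DIDL-Lite."""
    outer = ET.fromstring(envelope_xml)
    result = outer.find(".//Result")      # un-namespaced child element
    inner = ET.fromstring(result.text)    # first unescape already applied
    return [
        item.findtext("dc:title", default="", namespaces=NS)
        for item in inner.findall("didl:item", NS)
    ]


print(extract_titles(ENVELOPE))
```

A regex over the raw envelope would see `&amp;amp;` and `&amp;lt;` and would have to unescape twice by hand; the parser does each level exactly once.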

While we were in there: soap_call now XML-escapes string values exactly once when constructing the SOAP envelope (a latent bug v0.1 got away with because none of its values contained metacharacters). The DIDL-Lite metadata builders (play_uri, _track_metadata_didl) now hand off raw markup and let soap_call do the single entity-encode. That is the only correct layering: Sonos un-escapes once on receive, so the earlier builders that pre-escaped by hand worked only because their values were short ASCII with nothing to escape.
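The escape-exactly-once rule can be checked with a round trip. The following is a sketch under assumptions: `soap_body` is a hypothetical stand-in for the envelope-building part of soap_call (the real function's signature is not shown in this post), and the receiver is simulated by parsing the envelope with ElementTree, which un-escapes once.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape


def soap_body(action, service, args):
    """Build a SOAP envelope, escaping every argument value exactly once."""
    fields = "".join(
        f"<{name}>{escape(str(value))}</{name}>" for name, value in args.items()
    )
    return (
        '<s:Envelope xmlns:s="http://schemas.xmlsoap.org/soap/envelope/">'
        f'<s:Body><u:{action} xmlns:u="{service}">{fields}</u:{action}></s:Body>'
        "</s:Envelope>"
    )


AVT = "urn:schemas-upnp-org:service:AVTransport:1"
URI = ("x-sonos-spotify:spotify%3atrack%3a5EWPGh7jbTNO2wakv8LjUI"
       "?sid=12&flags=8192&sn=10")


def received(envelope, field):
    """Simulate the receiver: one parse, one unescape."""
    return ET.fromstring(envelope).find(f".//{field}").text


# Raw value in, single escape by soap_body: the receiver sees the original.
ok = soap_body("AddURIToQueue", AVT, {"InstanceID": 0, "EnqueuedURI": URI})
print(received(ok, "EnqueuedURI") == URI)

# Caller pre-escapes, soap_body escapes again: the receiver sees "&amp;".
double = soap_body("AddURIToQueue", AVT, {"EnqueuedURI": escape(URI)})
print(received(double, "EnqueuedURI"))
```

The second case is the failure mode of mixed layering: it stays invisible until a value contains `&`, `<` or `>`.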

Drag-to-reorder — HTML5 native, no library

The queue rows are draggable="true" divs carrying a data-position attribute. Drop-target dragover paints a thin red top-border (.q-item.drag-over). On drop, we read data-position from the source and the target, then push queue_reorder with {from, to} integers to the LiveView. The LiveView translates that to the UPnP ReorderTracksInQueue triple — StartingIndex, NumberOfTracks=1, InsertBefore — accounting for the off-by-one that UPnP requires when dragging downward (the source's own removal shifts every later slot by one, so InsertBefore must be to + 1 when moving down and just to when moving up).
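The off-by-one is easy to get wrong, so here is a small sketch of the translation plus a list-based simulation to check it. Two assumptions are baked in and are not confirmed by this post: positions are 1-based (as Sonos queue positions are), and InsertBefore is interpreted against the queue's numbering before the move. `reorder_args` and `apply_reorder` are hypothetical names; the real translation lives in the LiveView, in Elixir.

```python
def reorder_args(src, dst):
    """Translate a drag from row `src` to row `dst` (both 1-based)."""
    # Moving down: the source's removal shifts later rows up by one, so the
    # anchor is the row after the target. Moving up: the target row itself.
    insert_before = dst + 1 if dst > src else dst
    return {"StartingIndex": src, "NumberOfTracks": 1,
            "InsertBefore": insert_before}


def apply_reorder(queue, args):
    """Simulate the reorder on a plain list (assumed semantics, see above)."""
    start = args["StartingIndex"]
    count = args["NumberOfTracks"]
    before = args["InsertBefore"]
    moved = queue[start - 1:start - 1 + count]
    out = []
    for pos, track in enumerate(queue, start=1):
        if pos == before:
            out.extend(moved)              # drop the moved run in here
        if not (start <= pos < start + count):
            out.append(track)              # keep every track that did not move
    if before == len(queue) + 1:
        out.extend(moved)                  # InsertBefore past the end: append
    return out


queue = ["A", "B", "C", "D"]
print(apply_reorder(queue, reorder_args(1, 3)))  # A dragged down to row 3
print(apply_reorder(queue, reorder_args(4, 2)))  # D dragged up to row 2
```

In both directions the dragged track ends up at exactly the row it was dropped on, which is the property the `to + 1` adjustment exists to preserve.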

No drag-and-drop library. No react-dnd. No sortablejs. The HTML5 drag-and-drop API is a W3C standard supported in every browser that runs LiveView; same posture as the MediaSession bridge in v0.1.

Library modal — five containers, lazy-loaded

The modal opens on the "Browse library →" button below the queue panel. The first tab (Favorites, container ID FV:2) fetches on open; switching tabs fires a new browse cmd at the bridge for that container's object_id. Results land at lab.sonos.zone.<zone>.browse, the Zone GenServer stores them keyed by object_id (so revisiting a tab is instant — the data is already in z.browse[object_id]), and the LiveView reads browse_state.items to render rows.
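The "fetch once, then serve from state" behaviour of the tab cache might look like this in outline. It is a Python sketch of the idea only: the real store is the Aether.Zone GenServer in Elixir, and `BrowseCache`, `open_tab`, `on_browse_message` and the `request_browse` callable are hypothetical names introduced here.

```python
class BrowseCache:
    """Keeps browse results keyed by object_id so revisits need no request."""

    def __init__(self, request_browse):
        self._request = request_browse  # hypothetical callable(object_id)
        self.browse = {}                # object_id -> last result payload
        self._pending = set()           # requests sent but not yet answered

    def open_tab(self, object_id):
        """Return cached results, or None after firing (at most) one request."""
        if object_id in self.browse:
            return self.browse[object_id]
        if object_id not in self._pending:
            self._pending.add(object_id)
            self._request(object_id)
        return None

    def on_browse_message(self, payload):
        """Store a result that arrived on the .browse subject."""
        object_id = payload["object_id"]
        self._pending.discard(object_id)
        self.browse[object_id] = payload


requested = []
cache = BrowseCache(requested.append)
cache.open_tab("FV:2")                                   # first open: request
cache.on_browse_message({"object_id": "FV:2", "items": []})
cache.open_tab("FV:2")                                   # revisit: cache hit
print(requested)
```

Tracking pending requests separately keeps a quick double-click on a tab from sending the same browse command twice.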

Each row has three add-buttons: Play now (clear queue → add → Seek), Play next (EnqueueAsNext=1), and + queue (EnqueueAsNext=0, append). All three flow through the same queue_add LiveView event; the modes are translated into the correct UPnP arg combination inside the event handler. The bridge publishes a fresh queue snapshot after the mutation lands, and the LiveView's queue panel re-renders without explicit refresh.
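As a rough sketch, the three modes could be expanded into an ordered list of UPnP actions like this. The mode strings (`play_now`, `play_next`, `append`) and the function name are assumptions, since the post does not show the event payload. The action and argument names are the AVTransport ones named above; the Seek target of 1 assumes the queue was just cleared, so the added track is the first.

```python
def plan_queue_add(mode, uri, metadata=""):
    """Return the ordered (action, args) list for one add-button press."""

    def add(enqueue_as_next):
        return ("AddURIToQueue", {
            "InstanceID": 0,
            "EnqueuedURI": uri,
            "EnqueuedURIMetaData": metadata,
            "DesiredFirstTrackNumberEnqueued": 0,
            "EnqueueAsNext": enqueue_as_next,
        })

    if mode == "play_now":
        # Clear, add, then jump to the track that was just added.
        return [
            ("RemoveAllTracksFromQueue", {"InstanceID": 0}),
            add(0),
            ("Seek", {"InstanceID": 0, "Unit": "TRACK_NR", "Target": "1"}),
        ]
    if mode == "play_next":
        return [add(1)]
    if mode == "append":
        return [add(0)]
    raise ValueError(f"unknown queue_add mode: {mode!r}")


for action, args in plan_queue_add("play_now", "x-file-cifs://nas/a.flac"):
    print(action, args.get("EnqueueAsNext", ""))
```

Returning a plan instead of performing the calls keeps the mode logic testable without a speaker, and leaves the single post-mutation queue publish to whoever executes the plan.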

Pagination is --count 100 per request; a v0.3 follow-up would add a "load next 100" footer when total_matches > returned.
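The "load next 100" decision reduces to a little arithmetic on the fields the browse result already carries. A minimal sketch, assuming a hypothetical helper `next_page` and that `start` is the zero-based index the previous request began at:

```python
def next_page(start, returned, total_matches, count=100):
    """Return (start, count) for the next browse request, or None when done."""
    loaded = start + returned
    if returned == 0 or loaded >= total_matches:
        return None  # nothing more to fetch, so no footer button
    return (loaded, min(count, total_matches - loaded))


print(next_page(0, 100, 250))    # first page of a 250-item container
print(next_page(200, 50, 250))   # last page already loaded
```

The `returned == 0` guard stops the footer from looping forever if a device reports a total it never actually delivers.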

End-to-end verification

# Restart bridge with new ops (no zone-state interruption visible
# to the LiveView — the publisher_loop's next 10-second poll continues
# fresh; the LiveView only sees a brief gap, never a stale state).
cd <internal-lab>/sonos-cli
pkill -f sonos_bridge.py
nohup python3 sonos_bridge.py > /tmp/sonos_bridge.log 2>&1 &

# queue_get from CLI
nats pub lab.sonos.zone.office.cmd '{"op":"queue_get"}'
# → bridge log:  cmd: office queue_get → ok
# → audit:       {"op":"queue_get","status":"ok","count":0}
# → queue subj:  {"object_id":"Q:0","items":[],"total_matches":0,...}

# queue_add (adds Free Bird via Spotify-via-Sonos)
nats pub lab.sonos.zone.office.cmd \
  '{"op":"queue_add","uri":"x-sonos-spotify:spotify%3atrack%3a5EWPGh7jbTNO2wakv8LjUI?sid=12&flags=8192&sn=10","title":"Free Bird","artist":"Lynyrd Skynyrd"}'
# → audit: {"op":"queue_add","first_track_enqueued":1,"num_tracks_added":1,"new_queue_length":1,"status":"ok"}

# queue_reorder + queue_remove + queue_clear all confirmed via the audit
# subject — each lands status:"ok" and triggers a queue re-publish that
# fans into Aether.Zone's :queue state and back out via Phoenix.PubSub.

# /music HTML render shows the new bound events:
#   queue_get, queue_clear, queue_remove, play_at, open_browser
# plus the existing transport set: play, pause, stop, next, previous,
# select_zone.

License + posture

AGPL-3.0 across all new code, same as v0.1. Zero new third-party SDKs. The two browser APIs touched, HTML5 drag-and-drop (new in v0.2) and the MediaSession bridge (carried over from v0.1), are both W3C standards, not platform-vendor APIs. The DIDL-Lite parser uses stdlib xml.etree, not lxml. The Sonos protocol is unchanged from the documented UPnP MediaServer/MediaRenderer spec; nothing here depends on the Sonos cloud, Sonos S2 app, Sonos account auth, or the proprietary Sonos Local Music Library extension.

What's deferred — natural v0.3

Three threads opened naturally and got bounded:

  1. Pagination "load more": browse already supports start/count, but the modal only shows the first 100 of any container. Adding a footer button is one event handler and one render branch.

  2. Album/Artist drill-in: clicking a kind:container row should browse into that container (a Sonos album's id is itself a browse object). Today, container rows show "(container)" and a disabled button. The browse op handles this already; only the LiveView click handler is missing.

  3. Now-playing index in the queue: Sonos returns the active queue position as GetPositionInfo.Track. We track current title and artist in Aether.Zone, but not the current queue index. A v0.3 state-merge pass that includes current_track_index would replace the brittle title-match heuristic that today highlights the now-playing row.
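Why the title match is brittle shows up as soon as a queue holds the same title twice. A small sketch, with hypothetical helper names and a made-up queue; `current_track_index` is assumed to be the 1-based Track value from GetPositionInfo:

```python
def highlight_by_title(queue, current_title):
    """Today's heuristic: every row whose title matches is a candidate."""
    return [pos for pos, item in enumerate(queue, start=1)
            if item["title"] == current_title]


def highlight_by_index(queue, current_track_index):
    """v0.3 idea: trust the 1-based queue position reported by the device."""
    if 1 <= current_track_index <= len(queue):
        return current_track_index
    return None  # nothing playing from the queue, or stale state


queue = [{"title": "Intro"}, {"title": "Free Bird"}, {"title": "Intro"}]
print(highlight_by_title(queue, "Intro"))  # ambiguous: two rows match
print(highlight_by_index(queue, 3))        # unambiguous
```

The index also survives tracks whose reported title differs from the queued metadata, which a string comparison cannot.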

The natural next layer beyond v0.3 is the Zig rewrite of sonos.py — the file's docstring has called out the Zig substrate under <internal-lab>/sonos-cli/src/ since the beginning, and v0.2's DIDL-Lite parser + entity-encoding rules now give us a complete behavioral spec to port against.

Cross-references

this software instantiates.

the PWA is the screen surface that pairs with the eventual physical artifact. v0.2's library browser is what makes the Phonograph's no-screen mode practical (queue your set from the phone, then walk away).

the NATS bus, the LiveView socket. The music client composes-with AETHER, not replaces it.

[s2]: /journal/sonos-s2-touchscreen-monoculture
[phono]: /objects/phonograph
[aether]: https://github.com/SMC17/aether