CLAUDE.md -- Shanty Architecture Reference
This document is the authoritative reference for any LLM working on the Shanty codebase. Read this before making changes.
What Is Shanty?
Shanty is a self-hosted music management application ("better Lidarr"). It searches MusicBrainz for music metadata, downloads from YouTube via yt-dlp, tags and organizes files, and serves the library over the Subsonic protocol. It is a Cargo workspace where each component is both a standalone CLI tool and a library consumed by the web app.
Design Philosophy
- Modular crates. Each crate is a library and a CLI binary. The web app imports the library side; the CLI binary is for standalone use. Crates are git submodules hosted at ssh://connor@git.rcjohnstone.com:2222/Shanty/{name}.git. The exceptions are shanty-config and shanty-data, which are local workspace crates (not submodules).
- MBID-first matching. All matching and deduplication in the web app uses MusicBrainz recording MBIDs, never string-based name matching. Name matching is only used in standalone CLI tools as a fallback.
- Provider-swappable data layer. All external API calls go through trait-based providers in shanty-data. Metadata, artist images, bios, lyrics, and cover art each have a trait with multiple implementations. The active provider is selected via config.
- Track-level watchlist. When a user watches an artist or album, it is expanded into individual track wanted_item records via MusicBrainz, each with a recording MBID. This enables per-track status tracking through the pipeline.
- Release groups, not releases. The UI shows deduplicated release groups (album concepts), not individual releases (which have tons of reissues/regional editions). Filtered by secondary type -- default is studio only.
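
The release-group filtering in the last point could look roughly like the sketch below. It is an illustration only, not code from the Shanty repository: ReleaseGroup, SecondaryType, and filter_release_groups are hypothetical names, and it assumes that "studio only" means a release group with no secondary types at all.

```rust
use std::collections::HashSet;

/// Hypothetical subset of MusicBrainz secondary types (assumed for illustration).
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum SecondaryType {
    Live,
    Compilation,
    Soundtrack,
    Remix,
}

/// Hypothetical release-group record; not the real Shanty type.
#[derive(Debug, Clone, PartialEq)]
pub struct ReleaseGroup {
    pub mbid: String,
    pub title: String,
    pub secondary_types: Vec<SecondaryType>,
}

/// Keep release groups whose secondary types are all in `allowed`.
/// With an empty `allowed` set, only groups without any secondary type
/// (assumed here to mean "studio") pass. Repeated MBIDs are dropped,
/// keeping the first occurrence.
pub fn filter_release_groups(
    groups: &[ReleaseGroup],
    allowed: &HashSet<SecondaryType>,
) -> Vec<ReleaseGroup> {
    let mut seen = HashSet::new();
    groups
        .iter()
        .filter(|g| g.secondary_types.iter().all(|t| allowed.contains(t)))
        .filter(|g| seen.insert(g.mbid.clone()))
        .cloned()
        .collect()
}
```

Deduplication keys on the MBID rather than the title, in line with the MBID-first rule above.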
Workspace Structure
All crates live in the workspace root /home/connor/docs/projects/shanty/.
| Crate | Purpose |
|---|---|
| shanty (root) | Top-level binary entry point, Actix server setup, graceful shutdown, background task spawning |
| shanty-config | Shared config types (AppConfig), YAML loading/saving, environment variable overrides |
| shanty-data | Unified external data providers: MusicBrainz (remote + local hybrid), Wikipedia, fanart.tv, Last.fm, LRCLIB, Cover Art Archive, MB dump import |
| shanty-db | Sea-ORM + SQLite schema, migrations, query modules for all tables |
| shanty-index | Scan directories, extract metadata from audio files via lofty |
| shanty-tag | MusicBrainz lookup, fuzzy matching, file tag writing |
| shanty-org | File organization with configurable format templates |
| shanty-watch | Watchlist management, MusicBrainz discography expansion (artist/album to tracks) |
| shanty-dl | yt-dlp download backend, rate limiting, download queue processing, ytmusicapi search |
| shanty-search | SearchProvider trait, MusicBrainz search + release group listing |
| shanty-playlist | Playlist generation strategies (similar-artist, genre, random, smart rules) |
| shanty-web | Actix backend routes, Yew (WASM) frontend, background task modules |
| shanty-notify | Notifications via Apprise/webhooks (stub -- not yet implemented) |
| shanty-serve | Subsonic streaming (stub -- functionality is in shanty-web) |
| shanty-play | Built-in web player (stub -- not yet implemented) |
The frontend is at shanty-web/frontend/ and is excluded from the Cargo workspace. It builds separately with Trunk to shanty-web/static/.
Key Architectural Patterns
Data Providers (shanty-data)
shanty-data owns all external API calls. Key traits:
- MetadataFetcher -- artist info, release tracks, release resolution (MusicBrainz)
- ArtistImageFetcher -- artist photos (Wikipedia, fanart.tv)
- ArtistBioFetcher -- artist biographies (Wikipedia, Last.fm)
- LyricsFetcher -- song lyrics (LRCLIB)
- CoverArtFetcher -- album art (Cover Art Archive)
- SimilarArtistFetcher -- similar artist data (Last.fm)
HybridMusicBrainzFetcher wraps LocalMusicBrainzFetcher (optional) + MusicBrainzFetcher (remote API). It tries the local SQLite database first and falls back to the rate-limited remote API. The local DB is populated by importing MusicBrainz JSON dumps.
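
The local-first fallback can be sketched as below. This is a simplified, synchronous illustration under stated assumptions, not the real trait: ArtistLookup, LocalSource, RemoteSource, and Hybrid are hypothetical stand-ins, the "databases" are in-memory maps, and the real fetchers are presumably async and return richer types.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical, heavily simplified stand-in for a metadata trait.
pub trait ArtistLookup {
    /// Returns the artist name for an MBID, or None if this source does not know it.
    fn artist_name(&self, mbid: &str) -> Option<String>;
}

/// Stand-in for the local SQLite-backed fetcher (an in-memory map here).
pub struct LocalSource {
    pub rows: HashMap<String, String>,
}

impl ArtistLookup for LocalSource {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        self.rows.get(mbid).cloned()
    }
}

/// Stand-in for the remote API client; counts calls so the fallback is observable.
pub struct RemoteSource {
    pub rows: HashMap<String, String>,
    pub calls: Cell<u32>,
}

impl ArtistLookup for RemoteSource {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        self.calls.set(self.calls.get() + 1);
        self.rows.get(mbid).cloned()
    }
}

/// Local-first wrapper: the local source is optional, the remote one is required.
pub struct Hybrid<L: ArtistLookup, R: ArtistLookup> {
    pub local: Option<L>,
    pub remote: R,
}

impl<L: ArtistLookup, R: ArtistLookup> ArtistLookup for Hybrid<L, R> {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        // Try the local source if one is configured; only a miss reaches the remote.
        self.local
            .as_ref()
            .and_then(|l| l.artist_name(mbid))
            .or_else(|| self.remote.artist_name(mbid))
    }
}
```

Because the wrapper implements the same trait as the sources it wraps, callers do not need to know whether a local database is present.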
Web Server (shanty-web + root binary)
The root shanty binary sets up the Actix server, creates shared AppState, and spawns background tasks. The shanty-web crate provides the route handlers and frontend.
AppState holds: database connection, MusicBrainz client (hybrid), search provider, Wikipedia fetcher, shared config (behind Arc<RwLock>), task manager, scheduler info, Firefox login session state.
Background Tasks
Four background loops run via tokio::spawn + sleep:
- cookie_refresh -- refreshes YouTube cookies via headless Firefox (every 6 hours, configurable)
- pipeline_scheduler -- runs the full download pipeline automatically (every 3 hours, configurable)
- monitor -- checks monitored artists for new releases (every 12 hours, configurable)
- mb_update -- re-imports MusicBrainz dumps if auto_update is enabled (weekly)
One-off tasks (index, tag, organize, download process, monitor check, MB import) are spawned on demand and tracked via TaskManager.
Database
Sea-ORM with SQLite. Migrations run automatically on startup. Key tables:
- artists -- name (unique), musicbrainz_id (unique), monitored flag, top_songs/similar_artists (JSON)
- albums -- name, album_artist, year, genre, musicbrainz_id, artist_id FK
- tracks -- file_path (unique), all metadata fields, musicbrainz_id, artist_id/album_id FKs
- wanted_items -- item_type, name, musicbrainz_id, artist_id, status (Wanted/Available/Downloaded/Owned), user_id
- download_queue -- query, wanted_item_id FK, status, retry_count
- search_cache -- query_key (unique), provider, result_json, expires_at (used for MB data, lyrics, artist enrichment)
- users -- username, password_hash (bcrypt), role (Admin/User), subsonic_password (plaintext per Subsonic protocol)
- playlists / playlist_tracks -- saved playlists with ordered track references
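
The expires_at column on search_cache implies a time-to-live lookup. The sketch below shows one way such a check might work, using an in-memory map in place of the table. SearchCache, CacheEntry, and the method names are hypothetical, timestamps are assumed to be Unix seconds, and the real implementation goes through Sea-ORM queries rather than a HashMap.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory analogue of one search_cache row.
#[derive(Debug, Clone)]
struct CacheEntry {
    result_json: String,
    expires_at: u64, // Unix seconds (assumed representation)
}

/// In-memory stand-in for the table, keyed by query_key.
#[derive(Default)]
pub struct SearchCache {
    entries: HashMap<String, CacheEntry>,
}

impl SearchCache {
    /// Insert or overwrite an entry that expires `ttl_secs` after `now`.
    pub fn put(&mut self, query_key: &str, result_json: &str, now: u64, ttl_secs: u64) {
        self.entries.insert(
            query_key.to_string(),
            CacheEntry {
                result_json: result_json.to_string(),
                expires_at: now + ttl_secs,
            },
        );
    }

    /// Return the cached JSON only if it has not expired at `now`.
    pub fn get(&self, query_key: &str, now: u64) -> Option<&str> {
        self.entries
            .get(query_key)
            .filter(|e| now < e.expires_at)
            .map(|e| e.result_json.as_str())
    }

    /// Drop expired entries and return how many were removed.
    pub fn purge_expired(&mut self, now: u64) -> usize {
        let before = self.entries.len();
        self.entries.retain(|_, e| now < e.expires_at);
        before - self.entries.len()
    }
}
```

Passing `now` explicitly keeps the expiry logic deterministic and easy to test.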
Subsonic API
Mounted at /rest/* with separate authentication (username + MD5 token, per the Subsonic protocol spec). Supports browsing, streaming, playlists, search, cover art, and scrobbling. Opus files are auto-transcoded to MP3 via ffmpeg for client compatibility.
Frontend
Yew 0.21 with client-side rendering (CSR, no SSR). Built with Trunk to WASM. The compiled output goes to shanty-web/static/ and is served by Actix with SPA fallback (all non-API routes serve index.html).
Data Flow (Pipeline)
The full automated pipeline, triggered by "Set Sail" in the UI or by the pipeline scheduler:
- Search -- find artist/album on MusicBrainz
- Watch -- add to watchlist (expands to individual track wanted_items with recording MBIDs)
- Sync -- shanty_dl::sync_wanted_to_queue creates download_queue entries for wanted items
- Download -- yt-dlp downloads via YouTube Music search (ytmusicapi Python script), creates track records in DB with MBIDs from wanted_items
- Index -- scan library, extract metadata from new files
- Tag -- MusicBrainz lookup by MBID (skips search since MBID is known), write tags to files
- Organize -- move files to {artist}/{album}/{track_number} - {title}.{ext} in the library
- Promote -- all Downloaded wanted_items are marked as Owned
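
The format template in the Organize step can be expanded with plain string substitution. The sketch below is a minimal illustration, not shanty-org's actual code: TrackMeta, sanitize, and render_path are hypothetical names, and both the two-digit zero padding of the track number and the replacement of path separators with underscores are assumptions.

```rust
use std::path::PathBuf;

/// Hypothetical metadata needed to place one file.
pub struct TrackMeta<'a> {
    pub artist: &'a str,
    pub album: &'a str,
    pub track_number: u32,
    pub title: &'a str,
    pub ext: &'a str,
}

/// Replace path separators so a tag value cannot create extra directories.
fn sanitize(component: &str) -> String {
    component
        .chars()
        .map(|c| if c == '/' || c == '\\' { '_' } else { c })
        .collect()
}

/// Expand the placeholders in `template` and return a path relative to the library root.
pub fn render_path(template: &str, meta: &TrackMeta<'_>) -> PathBuf {
    let rendered = template
        .replace("{artist}", &sanitize(meta.artist))
        .replace("{album}", &sanitize(meta.album))
        // Zero padding to two digits is an assumption, chosen so files sort correctly.
        .replace("{track_number}", &format!("{:02}", meta.track_number))
        .replace("{title}", &sanitize(meta.title))
        .replace("{ext}", meta.ext);
    PathBuf::from(rendered)
}
```

With the template above, an artist tagged "AC/DC" would land in a single AC_DC directory rather than a nested AC/DC one. A production version would also need to handle tag values that themselves contain placeholder text.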
Configuration
YAML config file at ~/.config/shanty/config.yaml (or SHANTY_CONFIG env var). Environment variables override YAML values. In Docker, the config file is at /config/config.yaml.
Key environment variables:
- SHANTY_DATA_DIR -- base directory for all application data (Docker: /data)
- SHANTY_DATABASE_URL -- SQLite connection string
- SHANTY_LIBRARY_PATH -- music library root
- SHANTY_CONFIG -- path to config YAML
- SHANTY_WEB_PORT / SHANTY_WEB_BIND -- server binding
- SHANTY_LASTFM_API_KEY -- Last.fm API key (for bios and similar-artist playlists)
- SHANTY_FANART_API_KEY -- fanart.tv API key (for artist images/banners)
The config is loaded once at startup and held in Arc<RwLock<AppConfig>>. It can be updated at runtime via the /api/config PUT endpoint, which writes back to the YAML file and updates the in-memory config.
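
The override-then-share flow could be sketched as follows. This is an assumption-laden illustration: the two-field AppConfig is a hypothetical subset of the real struct, the signature given to apply_env_overrides is a guess (the real function may be a method), and update_config is a hypothetical helper standing in for the in-memory half of the PUT handler.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical two-field subset of the config; the real AppConfig has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct AppConfig {
    pub web_port: u16,
    pub library_path: String,
}

/// Overlay environment values on top of values loaded from YAML.
/// `lookup` abstracts the environment so the function is testable;
/// a caller could pass `|k| std::env::var(k).ok()`.
pub fn apply_env_overrides(
    mut cfg: AppConfig,
    lookup: impl Fn(&str) -> Option<String>,
) -> AppConfig {
    // An unparsable port is ignored and the YAML value is kept.
    if let Some(port) = lookup("SHANTY_WEB_PORT").and_then(|v| v.parse::<u16>().ok()) {
        cfg.web_port = port;
    }
    if let Some(path) = lookup("SHANTY_LIBRARY_PATH") {
        cfg.library_path = path;
    }
    cfg
}

pub type SharedConfig = Arc<RwLock<AppConfig>>;

/// Replace the in-memory config, as a PUT handler might do after writing the YAML file.
pub fn update_config(shared: &SharedConfig, new_cfg: AppConfig) {
    let mut guard = shared.write().expect("config lock poisoned");
    *guard = new_cfg;
}
```

Readers take a short read lock and clone what they need, so a runtime update is visible to the next request without a restart.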
Coding Standards
- Rust edition 2024 with resolver v3
- cargo clippy -- -D warnings must pass
- cargo fmt for formatting
- All crates must compile independently
- Never use #[allow(dead_code)] -- remove dead code instead
- Never create local DB records for artists the user is just browsing (only persist when they watch)
- Use MBIDs for all matching in the web app, never name-based matching
- Artist credits: use the primary (first) artist only, never concatenate collaborators
Testing
cargo test --workspace
Integration tests per crate. Mock providers exist for MusicBrainz in tests. The frontend is excluded from workspace tests (it has its own build process via Trunk).
Important Constraints
- MusicBrainz rate limit: 1 request per 1.1 seconds for the remote API. Mitigated by the local SQLite database (imported from MB dumps) and aggressive caching in search_cache.
- YouTube cookies expire roughly every 2 weeks. Auto-refreshed by headless Firefox every 6 hours when cookie_refresh is enabled.
- Session key is random on startup -- user sessions do not survive server restarts.
- Subsonic password is stored in plaintext per the Subsonic protocol specification. Users are warned about this in the UI.
- Opus transcoding for Subsonic clients transcodes the entire file to memory before streaming. Not ideal for very large files.
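
The MusicBrainz rate limit listed above amounts to enforcing a minimum interval between requests. A minimal sketch of that idea follows; RateLimiter and its methods are hypothetical names, not the limiter Shanty actually uses, and the caller is expected to do the sleeping itself.

```rust
use std::time::{Duration, Instant};

/// Minimum-interval limiter: at most one request per `min_interval`.
pub struct RateLimiter {
    min_interval: Duration,
    last_request: Option<Instant>,
}

impl RateLimiter {
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last_request: None }
    }

    /// How long the caller must wait before sending a request at `now`.
    pub fn delay_needed(&self, now: Instant) -> Duration {
        match self.last_request {
            Some(last) => self
                .min_interval
                .saturating_sub(now.saturating_duration_since(last)),
            None => Duration::ZERO,
        }
    }

    /// Record that a request was sent at `now`.
    pub fn record(&mut self, now: Instant) {
        self.last_request = Some(now);
    }
}
```

A caller would construct it with `Duration::from_millis(1100)`, sleep for `delay_needed(Instant::now())` before each remote call (with an async sleep inside a Tokio task), and then call `record`.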
Making Changes
- Backend route changes: edit files in shanty-web/src/routes/
- Frontend changes: edit files in shanty-web/frontend/src/, then cd shanty-web/frontend && trunk build
- Config changes: update shanty-config/src/lib.rs (add field + default), update apply_env_overrides if adding env var support
- Database schema changes: add a migration in shanty-db, update entities and queries
- Adding a new external data source: add a provider implementation in shanty-data behind the appropriate trait
- Each crate submodule must be committed and pushed independently before updating the parent workspace