Connor Johnstone 3708829c3b
2026-03-23 17:02:11 -04:00

CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

What Is Shanty?

Shanty is a self-hosted music management application ("better Lidarr"). It searches MusicBrainz for music metadata, downloads from YouTube via yt-dlp, tags and organizes files, and serves the library over the Subsonic protocol. It is a Cargo workspace where each component is both a standalone CLI tool and a library consumed by the web app.

Development Commands

A justfile provides common workflows. Run the just command with no arguments to see all targets.

just check          # fmt + lint + test (full pre-commit check)
just dev            # build frontend + run server
just build          # cargo build --workspace
just test           # cargo test --workspace
just lint           # cargo clippy --workspace -- -D warnings
just fmt            # cargo fmt
just frontend       # cd shanty-web/frontend && trunk build
just run            # cargo run --bin shanty

Running single crate tests:

cargo test --package shanty-db
cargo test --package shanty-tag

Running a single test by name:

cargo test --package shanty-tag test_tag_with_match

Frontend build (Yew/Trunk → WASM):

cd shanty-web/frontend && trunk build              # dev
cd shanty-web/frontend && trunk build --release    # optimized

Running the server with verbose logging:

cargo run --bin shanty -- -v      # info
cargo run --bin shanty -- -vv     # debug
cargo run --bin shanty -- -vvv    # trace

MusicBrainz dump import subcommand:

cargo run --bin shanty -- mb-import --download --data-dir /path/to/dumps

Prerequisites: Rust (stable, edition 2024), yt-dlp, ffmpeg, Python 3, ytmusicapi, Trunk. The rust-toolchain.toml pins stable and adds the wasm32-unknown-unknown target.

Design Philosophy

  1. Modular crates. Each crate is a library and a CLI binary. The web app imports the library side; the CLI binary is for standalone use. Crates are git submodules hosted at ssh://connor@git.rcjohnstone.com:2222/Shanty/{name}.git. The exceptions are shanty-config and shanty-data, which are local workspace crates (not submodules).

  2. MBID-first matching. All matching and deduplication in the web app uses MusicBrainz recording MBIDs, never string-based name matching. Name matching is only used in standalone CLI tools as a fallback.

  3. Provider-swappable data layer. All external API calls go through trait-based providers in shanty-data. Metadata, artist images, bios, lyrics, and cover art each have a trait with multiple implementations. The active provider is selected via config.

  4. Track-level watchlist. When a user watches an artist or album, it is expanded into individual track wanted_item records via MusicBrainz, each with a recording MBID. This enables per-track status tracking through the pipeline.

  5. Release groups, not releases. The UI shows deduplicated release groups (album concepts), not individual releases, which include many reissues and regional editions. Release groups are filtered by secondary type -- the default shows studio releases only.
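
The release-group filtering in point 5 could look roughly like the sketch below. The ReleaseGroup struct, the SecondaryType enum, and filter_release_groups are hypothetical names for illustration, not Shanty's actual types, and the sketch assumes "studio only" means a release group with no secondary types.

```rust
use std::collections::HashSet;

/// Hypothetical subset of MusicBrainz secondary types.
#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum SecondaryType {
    Live,
    Compilation,
    Remix,
    Soundtrack,
}

/// Hypothetical release-group record; not Shanty's real type.
#[derive(Debug, Clone, PartialEq)]
pub struct ReleaseGroup {
    pub title: String,
    pub secondary_types: Vec<SecondaryType>,
}

/// Keep a release group only when every one of its secondary types is allowed.
/// An empty `allowed` set therefore keeps studio releases only
/// (assumed here to mean "no secondary types at all").
pub fn filter_release_groups(
    groups: &[ReleaseGroup],
    allowed: &HashSet<SecondaryType>,
) -> Vec<ReleaseGroup> {
    groups
        .iter()
        .filter(|g| g.secondary_types.iter().all(|t| allowed.contains(t)))
        .cloned()
        .collect()
}
```

Passing an empty set gives the default view; adding SecondaryType::Live to the set would also show live albums.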

Workspace Structure

All crates live in the workspace root.

Crate             Purpose
shanty (root)     Top-level binary entry point, Actix server setup, graceful shutdown, background task spawning
shanty-config     Shared config types (AppConfig), YAML loading/saving, environment variable overrides
shanty-data       Unified external data providers: MusicBrainz (remote + local hybrid), Wikipedia, fanart.tv, Last.fm, LRCLIB, Cover Art Archive, MB dump import
shanty-db         Sea-ORM + SQLite schema, migrations, query modules for all tables
shanty-index      Scan directories, extract metadata from audio files via lofty
shanty-tag        MusicBrainz lookup, fuzzy matching, file tag writing
shanty-org        File organization with configurable format templates
shanty-watch      Watchlist management, MusicBrainz discography expansion (artist/album to tracks)
shanty-dl         yt-dlp download backend, rate limiting, download queue processing, ytmusicapi search
shanty-search     SearchProvider trait, MusicBrainz search + release group listing
shanty-playlist   Playlist generation strategies (similar-artist, genre, random, smart rules)
shanty-web        Actix backend routes, Yew (WASM) frontend, background task modules
shanty-notify     Notifications via Apprise/webhooks (stub -- not yet implemented)
shanty-serve      Subsonic streaming (stub -- functionality is in shanty-web)
shanty-play       Built-in web player (stub -- not yet implemented)

The frontend is at shanty-web/frontend/ and is excluded from the Cargo workspace. It builds separately with Trunk to shanty-web/static/.

Key Architectural Patterns

Data Providers (shanty-data)

shanty-data owns all external API calls. Key traits:

  • MetadataFetcher -- artist info, release tracks, release resolution (MusicBrainz)
  • ArtistImageFetcher -- artist photos (Wikipedia, fanart.tv)
  • ArtistBioFetcher -- artist biographies (Wikipedia, Last.fm)
  • LyricsFetcher -- song lyrics (LRCLIB)
  • CoverArtFetcher -- album art (Cover Art Archive)
  • SimilarArtistFetcher -- similar artist data (Last.fm)

HybridMusicBrainzFetcher wraps LocalMusicBrainzFetcher (optional) + MusicBrainzFetcher (remote API). It tries the local SQLite database first and falls back to the rate-limited remote API. The local DB is populated by importing MusicBrainz JSON dumps.
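
The local-first, remote-fallback behavior can be sketched as follows. This is a simplified illustration, not the real HybridMusicBrainzFetcher: the ArtistLookup trait, HybridLookup, and MapLookup are hypothetical names, and the sketch is synchronous where the real fetchers are presumably async.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a metadata trait.
pub trait ArtistLookup {
    /// Returns the artist name for an MBID, or None when it is unknown.
    fn artist_name(&self, mbid: &str) -> Option<String>;
}

/// Tries an optional local source first, then falls back to the remote one.
pub struct HybridLookup<L: ArtistLookup, R: ArtistLookup> {
    pub local: Option<L>,
    pub remote: R,
}

impl<L: ArtistLookup, R: ArtistLookup> ArtistLookup for HybridLookup<L, R> {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        self.local
            .as_ref()
            .and_then(|local| local.artist_name(mbid))
            // Only reached when there is no local source or it had no answer.
            .or_else(|| self.remote.artist_name(mbid))
    }
}

/// In-memory source used only to exercise the sketch.
pub struct MapLookup(pub HashMap<String, String>);

impl ArtistLookup for MapLookup {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        self.0.get(mbid).cloned()
    }
}
```

Because the hybrid type implements the same trait as its parts, callers do not need to know whether a local database is configured.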

Web Server (shanty-web + root binary)

The root shanty binary sets up the Actix server, creates shared AppState, and spawns background tasks. The shanty-web crate provides the route handlers and frontend.

AppState holds: database connection, MusicBrainz client (hybrid), search provider, Wikipedia fetcher, shared config (behind Arc<RwLock>), task manager, scheduler info, Firefox login session state.

Background Tasks

Four background loops run via tokio::spawn + sleep:

  1. cookie_refresh -- refreshes YouTube cookies via headless Firefox (every 6 hours, configurable)
  2. pipeline_scheduler -- runs the full download pipeline automatically (every 3 hours, configurable)
  3. monitor -- checks monitored artists for new releases (every 12 hours, configurable)
  4. mb_update -- re-imports MusicBrainz dumps if auto_update is enabled (weekly)

One-off tasks (index, tag, organize, download process, monitor check, MB import) are spawned on demand and tracked via TaskManager.
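
The spawn-then-sleep loop shape can be sketched as below. To stay dependency-free, this sketch swaps in std::thread for tokio::spawn and thread::sleep for tokio's async sleep; spawn_periodic and count_runs are hypothetical helpers, not functions from the Shanty codebase.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Arc;
use std::thread::{self, JoinHandle};
use std::time::Duration;

/// Spawn a loop that runs `task`, sleeps for `interval`, and repeats until `stop` is set.
/// Uses std::thread in place of tokio::spawn so the sketch has no dependencies.
pub fn spawn_periodic<F>(interval: Duration, stop: Arc<AtomicBool>, mut task: F) -> JoinHandle<()>
where
    F: FnMut() + Send + 'static,
{
    thread::spawn(move || {
        while !stop.load(Ordering::SeqCst) {
            task();
            thread::sleep(interval);
        }
    })
}

/// Run the loop for `run_for` and report how many times the task fired.
pub fn count_runs(interval: Duration, run_for: Duration) -> usize {
    let stop = Arc::new(AtomicBool::new(false));
    let runs = Arc::new(AtomicUsize::new(0));
    let counter = Arc::clone(&runs);
    let handle = spawn_periodic(interval, Arc::clone(&stop), move || {
        counter.fetch_add(1, Ordering::SeqCst);
    });
    thread::sleep(run_for);
    // Signal shutdown and wait for the loop to observe it.
    stop.store(true, Ordering::SeqCst);
    handle.join().expect("background thread panicked");
    runs.load(Ordering::SeqCst)
}
```

The stop flag stands in for graceful shutdown; a loop like this notices the flag only after its current sleep ends.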

Database

Sea-ORM with SQLite. Migrations run automatically on startup. Key tables:

  • artists -- name (unique), musicbrainz_id (unique), monitored flag, top_songs/similar_artists (JSON)
  • albums -- name, album_artist, year, genre, musicbrainz_id, artist_id FK
  • tracks -- file_path (unique), all metadata fields, musicbrainz_id, artist_id/album_id FKs
  • wanted_items -- item_type, name, musicbrainz_id, artist_id, status (Wanted/Available/Downloaded/Owned), user_id
  • download_queue -- query, wanted_item_id FK, status, retry_count
  • search_cache -- query_key (unique), provider, result_json, expires_at (used for MB data, lyrics, artist enrichment)
  • users -- username, password_hash (bcrypt), role (Admin/User), subsonic_password (plaintext per Subsonic protocol)
  • playlists / playlist_tracks -- saved playlists with ordered track references
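
As an illustration of the wanted_items status field, the sketch below models the four listed statuses and the bulk transition from Downloaded to Owned. The WantedStatus and WantedItem types and promote_downloaded are hypothetical in-memory stand-ins; the real code works through Sea-ORM entities and queries.

```rust
/// The four status values listed for wanted_items; the enum itself is a sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WantedStatus {
    Wanted,
    Available,
    Downloaded,
    Owned,
}

/// Hypothetical in-memory view of a wanted_items row.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct WantedItem {
    pub name: String,
    pub musicbrainz_id: Option<String>,
    pub status: WantedStatus,
}

/// Mark every Downloaded item as Owned and return how many items changed.
pub fn promote_downloaded(items: &mut [WantedItem]) -> usize {
    let mut promoted = 0;
    for item in items
        .iter_mut()
        .filter(|i| i.status == WantedStatus::Downloaded)
    {
        item.status = WantedStatus::Owned;
        promoted += 1;
    }
    promoted
}
```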

Subsonic API

Mounted at /rest/* with separate authentication (username + MD5 token, per the Subsonic protocol spec). Supports browsing, streaming, playlists, search, cover art, and scrobbling. Opus files are auto-transcoded to MP3 via ffmpeg for client compatibility.

Frontend

Yew 0.21 with client-side rendering (CSR, no SSR). Built with Trunk to WASM. The compiled output goes to shanty-web/static/ and is served by Actix with SPA fallback (all non-API routes serve index.html).

Data Flow (Pipeline)

The full automated pipeline, triggered by "Set Sail" in the UI or by the pipeline scheduler:

  1. Search -- find artist/album on MusicBrainz
  2. Watch -- add to watchlist (expands to individual track wanted_items with recording MBIDs)
  3. Sync -- shanty_dl::sync_wanted_to_queue creates download_queue entries for wanted items
  4. Download -- yt-dlp downloads via YouTube Music search (ytmusicapi Python script), creates track records in DB with MBIDs from wanted_items
  5. Index -- scan library, extract metadata from new files
  6. Tag -- MusicBrainz lookup by MBID (skips search since MBID is known), write tags to files
  7. Organize -- move files to {artist}/{album}/{track_number} - {title}.{ext} in the library
  8. Promote -- all Downloaded wanted_items are marked as Owned
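
The path template in the Organize step could be rendered along these lines. This is a minimal sketch under assumptions: TrackInfo, sanitize, and render_path are hypothetical names, the two-digit zero padding of track numbers is assumed, and the set of replaced characters is illustrative rather than shanty-org's actual sanitization rule.

```rust
/// Hypothetical track metadata used to fill the path template.
pub struct TrackInfo<'a> {
    pub artist: &'a str,
    pub album: &'a str,
    pub track_number: u32,
    pub title: &'a str,
    pub ext: &'a str,
}

/// Replace characters that are unsafe in a path component (assumed rule).
fn sanitize(component: &str) -> String {
    component
        .chars()
        .map(|c| match c {
            '/' | '\\' | ':' | '*' | '?' | '"' | '<' | '>' | '|' => '_',
            other => other,
        })
        .collect()
}

/// Fill each placeholder in `template` with the sanitized field value.
/// Substitution is sequential, so a field value that itself contains a
/// placeholder such as "{album}" would be expanded again; the sketch ignores that edge case.
pub fn render_path(template: &str, track: &TrackInfo) -> String {
    template
        .replace("{artist}", &sanitize(track.artist))
        .replace("{album}", &sanitize(track.album))
        .replace("{track_number}", &format!("{:02}", track.track_number))
        .replace("{title}", &sanitize(track.title))
        .replace("{ext}", track.ext)
}
```

Sanitizing each field before substitution keeps a name like "AC/DC" from creating an extra directory level.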

Configuration

YAML config file at ~/.config/shanty/config.yaml (or SHANTY_CONFIG env var). Environment variables override YAML values. In Docker, the config file is at /config/config.yaml.

Key environment variables:

  • SHANTY_DATA_DIR -- base directory for all application data (Docker: /data)
  • SHANTY_DATABASE_URL -- SQLite connection string
  • SHANTY_LIBRARY_PATH -- music library root
  • SHANTY_CONFIG -- path to config YAML
  • SHANTY_WEB_PORT / SHANTY_WEB_BIND -- server binding
  • SHANTY_LASTFM_API_KEY -- Last.fm API key (for bios and similar-artist playlists)
  • SHANTY_FANART_API_KEY -- fanart.tv API key (for artist images/banners)
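
How environment variables might take precedence over YAML values is sketched below. The AppConfig fields shown are a hypothetical subset, and this apply_env_overrides signature (taking a lookup closure so it can be tested without touching the process environment) is an assumption, not the signature in shanty-config.

```rust
/// Hypothetical subset of the config; the real AppConfig has more fields.
#[derive(Debug, Clone, PartialEq)]
pub struct AppConfig {
    pub library_path: String,
    pub web_port: u16,
    pub web_bind: String,
}

/// Overwrite YAML-loaded values with any environment values that are set.
/// `lookup` mirrors `std::env::var(key).ok()`.
pub fn apply_env_overrides<F>(config: &mut AppConfig, lookup: F)
where
    F: Fn(&str) -> Option<String>,
{
    if let Some(path) = lookup("SHANTY_LIBRARY_PATH") {
        config.library_path = path;
    }
    if let Some(bind) = lookup("SHANTY_WEB_BIND") {
        config.web_bind = bind;
    }
    // A port that does not parse is ignored rather than failing startup (assumed policy).
    if let Some(port) = lookup("SHANTY_WEB_PORT").and_then(|p| p.parse::<u16>().ok()) {
        config.web_port = port;
    }
}
```

In production the call would be apply_env_overrides(&mut config, |key| std::env::var(key).ok()).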

The config is loaded once at startup and held in Arc<RwLock<AppConfig>>. It can be updated at runtime via the /api/config PUT endpoint, which writes back to the YAML file and updates the in-memory config.
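
The runtime update path can be sketched as follows. This is an illustration only: it uses std::sync::RwLock (the real code may use an async lock), a one-field AppConfig, and a hypothetical update_config helper whose persist closure stands in for writing the YAML file. Persisting before swapping is a design choice of the sketch, not a confirmed detail of Shanty.

```rust
use std::sync::{Arc, RwLock};

/// Hypothetical one-field config, just enough to show the update path.
#[derive(Debug, Clone, PartialEq)]
pub struct AppConfig {
    pub web_port: u16,
}

pub type SharedConfig = Arc<RwLock<AppConfig>>;

/// Persist the new config first, then swap it into memory, so a failed
/// write leaves the old config active and consistent with the file on disk.
pub fn update_config<F>(shared: &SharedConfig, new_config: AppConfig, persist: F) -> Result<(), String>
where
    F: FnOnce(&AppConfig) -> Result<(), String>,
{
    persist(&new_config)?;
    let mut guard = shared.write().map_err(|e| e.to_string())?;
    *guard = new_config;
    Ok(())
}
```

Readers take a short read lock and clone what they need, so a config update never blocks request handling for long.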

Coding Standards

  • Rust edition 2024 with resolver v3
  • cargo clippy -- -D warnings must pass
  • cargo fmt for formatting
  • All crates must compile independently
  • Never use #[allow(dead_code)] -- remove dead code instead
  • Never create local DB records for artists the user is just browsing (only persist when they watch)
  • Use MBIDs for all matching in the web app, never name-based matching
  • Artist credits: use the primary (first) artist only, never concatenate collaborators
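
The MBID-matching and artist-credit rules above can be illustrated with a short sketch. The Candidate struct and both function names are hypothetical, chosen for this example only.

```rust
use std::collections::HashSet;

/// Hypothetical search result carrying a recording MBID and its artist credits.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Candidate {
    pub recording_mbid: String,
    pub title: String,
    pub artist_credits: Vec<String>,
}

/// Keep the first candidate seen for each recording MBID.
/// Titles are never compared, so differently spelled duplicates still collapse.
pub fn dedupe_by_mbid(candidates: Vec<Candidate>) -> Vec<Candidate> {
    let mut seen = HashSet::new();
    candidates
        .into_iter()
        .filter(|c| seen.insert(c.recording_mbid.clone()))
        .collect()
}

/// Return the primary (first) credited artist; collaborators are not concatenated.
pub fn primary_artist(candidate: &Candidate) -> Option<&str> {
    candidate.artist_credits.first().map(String::as_str)
}
```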

Testing

cargo test --workspace           # all tests
cargo test --package shanty-db   # single crate

The frontend is excluded from workspace tests (it has its own build process via Trunk).

Test patterns used across crates:

  • In-memory SQLite: Integration tests create Database::new("sqlite::memory:") for fast, isolated DB testing. No fixtures directory -- data is inserted programmatically.
  • Mock providers: Each crate that depends on external APIs defines its own mock trait implementations (e.g., MockProvider for MetadataProvider, MockBackend for DownloadBackend). Mocks are self-contained per crate.
  • Temp files: Tests for index/org/tag use tempfile::TempDir and lofty to create real MP3 files with valid ID3 tags.
  • Integration tests: {crate}/tests/integration.rs (async via #[tokio::test])
  • Unit tests: #[cfg(test)] modules inline in source files for pure functions (sanitization, parsing, normalization, template rendering).
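
The per-crate mock pattern might look like the sketch below. The trait shape, the recording_title method, and the tag_title function are assumptions for illustration; the real MetadataProvider trait and its mocks may have different signatures and are likely async.

```rust
use std::collections::HashMap;

/// Hypothetical provider trait; the real trait's methods may differ.
pub trait MetadataProvider {
    fn recording_title(&self, mbid: &str) -> Result<String, String>;
}

/// Mock that answers from a fixed map instead of calling an external API.
pub struct MockProvider {
    pub responses: HashMap<String, String>,
}

impl MetadataProvider for MockProvider {
    fn recording_title(&self, mbid: &str) -> Result<String, String> {
        self.responses
            .get(mbid)
            .cloned()
            .ok_or_else(|| format!("no recording for {mbid}"))
    }
}

/// Code under test depends only on the trait, so the mock can be swapped in.
pub fn tag_title<P: MetadataProvider>(provider: &P, mbid: &str) -> String {
    provider
        .recording_title(mbid)
        .unwrap_or_else(|_| "Unknown".to_string())
}
```

Keeping the mock next to the tests that use it is what lets each crate compile and test independently.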

Important Constraints

  • MusicBrainz rate limit: 1 request per 1.1 seconds for the remote API. Mitigated by the local SQLite database (imported from MB dumps) and aggressive caching in search_cache.
  • YouTube cookies expire roughly every 2 weeks. When cookie_refresh is enabled, headless Firefox refreshes them every 6 hours.
  • Session key is random on startup -- user sessions do not survive server restarts.
  • Subsonic password is stored in plaintext, as the Subsonic protocol requires: token authentication (MD5 of password plus salt) only works if the server knows the original password. Users are warned about this in the UI.
  • Opus transcoding for Subsonic clients buffers the entire transcoded file in memory before streaming, so memory use grows with file size. Not ideal for very large files.
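
The MusicBrainz limit of one request per 1.1 seconds could be enforced with a minimum-interval limiter like the sketch below. RateLimiter is a hypothetical, blocking, single-threaded illustration; the limiter in the real fetcher is likely async and may be structured differently.

```rust
use std::thread;
use std::time::{Duration, Instant};

/// Enforces a minimum gap between consecutive requests.
pub struct RateLimiter {
    min_interval: Duration,
    last: Option<Instant>,
}

impl RateLimiter {
    pub fn new(min_interval: Duration) -> Self {
        Self { min_interval, last: None }
    }

    /// How long a caller must wait before a request at `now` is allowed.
    pub fn delay_at(&self, now: Instant) -> Duration {
        match self.last {
            Some(last) => self
                .min_interval
                .saturating_sub(now.saturating_duration_since(last)),
            // The first request is never delayed.
            None => Duration::ZERO,
        }
    }

    /// Block until a request is allowed, then record the send time.
    pub fn acquire(&mut self) {
        let wait = self.delay_at(Instant::now());
        if !wait.is_zero() {
            thread::sleep(wait);
        }
        self.last = Some(Instant::now());
    }
}
```

For the remote API the limiter would be built with RateLimiter::new(Duration::from_millis(1100)) and acquire() called before each request.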

Making Changes

  • Backend route changes: edit files in shanty-web/src/routes/
  • Frontend changes: edit files in shanty-web/frontend/src/, then cd shanty-web/frontend && trunk build
  • Config changes: update shanty-config/src/lib.rs (add field + default), update apply_env_overrides if adding env var support
  • Database schema changes: add a migration in shanty-db, update entities and queries
  • Adding a new external data source: add a provider implementation in shanty-data behind the appropriate trait
  • Each crate submodule must be committed and pushed independently before updating the parent workspace