Connor Johnstone 3708829c3b

2026-03-23 17:02:11 -04:00


CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

What Is Shanty?

Shanty is a self-hosted music management application ("better Lidarr"). It searches MusicBrainz for music metadata, downloads from YouTube via yt-dlp, tags and organizes files, and serves the library over the Subsonic protocol. It is a Cargo workspace where each component is both a standalone CLI tool and a library consumed by the web app.

Development Commands

A justfile provides common workflows. Run just with no arguments to see all targets.

just check          # fmt + lint + test (full pre-commit check)
just dev            # build frontend + run server
just build          # cargo build --workspace
just test           # cargo test --workspace
just lint           # cargo clippy --workspace -- -D warnings
just fmt            # cargo fmt
just frontend       # cd shanty-web/frontend && trunk build
just run            # cargo run --bin shanty

Running the tests for a single crate:

cargo test --package shanty-db
cargo test --package shanty-tag

Running a single test by name:

cargo test --package shanty-tag test_tag_with_match

Frontend build (Yew/Trunk → WASM):

cd shanty-web/frontend && trunk build              # dev
cd shanty-web/frontend && trunk build --release    # optimized

Running the server with verbose logging:

cargo run --bin shanty -- -v      # info
cargo run --bin shanty -- -vv     # debug
cargo run --bin shanty -- -vvv    # trace
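The flag-to-level mapping above could be expressed roughly as follows. This is a minimal sketch, not the binary's actual argument handling: LogLevel and level_for_verbosity are hypothetical names, and the level used when no -v flag is passed (Warn here) is an assumption, since only one to three flags are documented.

```rust
/// Log levels named in the commands above, plus an assumed default.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum LogLevel {
    Warn,
    Info,
    Debug,
    Trace,
}

/// Map the number of `-v` flags to a log level.
/// Assumption: zero flags means `Warn`; the document only covers 1 to 3.
pub fn level_for_verbosity(count: u8) -> LogLevel {
    match count {
        0 => LogLevel::Warn,
        1 => LogLevel::Info,
        2 => LogLevel::Debug,
        // Three or more flags all map to the most verbose level.
        _ => LogLevel::Trace,
    }
}
```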

MusicBrainz dump import subcommand:

cargo run --bin shanty -- mb-import --download --data-dir /path/to/dumps

Prerequisites: Rust (stable, edition 2024), yt-dlp, ffmpeg, Python 3, ytmusicapi, Trunk. The rust-toolchain.toml pins stable and adds the wasm32-unknown-unknown target.

Design Philosophy

Modular crates. Each crate is a library and a CLI binary. The web app imports the library side; the CLI binary is for standalone use. Crates are git submodules hosted at ssh://connor@git.rcjohnstone.com:2222/Shanty/{name}.git. The exceptions are shanty-config and shanty-data, which are local workspace crates (not submodules).
MBID-first matching. All matching and deduplication in the web app uses MusicBrainz recording MBIDs, never string-based name matching. Name matching is only used in standalone CLI tools as a fallback.
Provider-swappable data layer. All external API calls go through trait-based providers in shanty-data. Metadata, artist images, bios, lyrics, and cover art each have a trait with multiple implementations. The active provider is selected via config.
Track-level watchlist. When a user watches an artist or album, it is expanded into individual track wanted_item records via MusicBrainz, each with a recording MBID. This enables per-track status tracking through the pipeline.
Release groups, not releases. The UI shows deduplicated release groups (album concepts), not individual releases (which include many reissues and regional editions). Release groups are filtered by secondary type; the default shows studio releases only.
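To illustrate the MBID-first rule, here is a minimal sketch of deduplication keyed only on the recording MBID. The Candidate type, its fields, and dedup_by_mbid are hypothetical and exist only for this example; the real crates may structure this differently.

```rust
use std::collections::HashSet;

/// Hypothetical track candidate; field names are assumptions for illustration.
#[derive(Debug, Clone, PartialEq)]
pub struct Candidate {
    pub recording_mbid: String,
    pub title: String,
}

/// Keep the first candidate seen for each recording MBID.
/// Titles are never compared, so two differently named entries with the
/// same MBID collapse into one, and same-named entries with different
/// MBIDs are both kept.
pub fn dedup_by_mbid(candidates: Vec<Candidate>) -> Vec<Candidate> {
    let mut seen = HashSet::new();
    candidates
        .into_iter()
        // `insert` returns false when the MBID was already present.
        .filter(|c| seen.insert(c.recording_mbid.clone()))
        .collect()
}
```

Keying on the MBID means a title such as "Song (Remastered)" can never be mistaken for a different recording, or merged with an unrelated one, because of how it is spelled.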

Workspace Structure

All crates live in the workspace root.

Crate	Purpose
`shanty` (root)	Top-level binary entry point, Actix server setup, graceful shutdown, background task spawning
`shanty-config`	Shared config types (AppConfig), YAML loading/saving, environment variable overrides
`shanty-data`	Unified external data providers: MusicBrainz (remote + local hybrid), Wikipedia, fanart.tv, Last.fm, LRCLIB, Cover Art Archive, MB dump import
`shanty-db`	Sea-ORM + SQLite schema, migrations, query modules for all tables
`shanty-index`	Scan directories, extract metadata from audio files via lofty
`shanty-tag`	MusicBrainz lookup, fuzzy matching, file tag writing
`shanty-org`	File organization with configurable format templates
`shanty-watch`	Watchlist management, MusicBrainz discography expansion (artist/album to tracks)
`shanty-dl`	yt-dlp download backend, rate limiting, download queue processing, ytmusicapi search
`shanty-search`	SearchProvider trait, MusicBrainz search + release group listing
`shanty-playlist`	Playlist generation strategies (similar-artist, genre, random, smart rules)
`shanty-web`	Actix backend routes, Yew (WASM) frontend, background task modules
`shanty-notify`	Notifications via Apprise/webhooks (stub -- not yet implemented)
`shanty-serve`	Subsonic streaming (stub -- functionality is in shanty-web)
`shanty-play`	Built-in web player (stub -- not yet implemented)

The frontend is at shanty-web/frontend/ and is excluded from the Cargo workspace. It builds separately with Trunk to shanty-web/static/.

Key Architectural Patterns

Data Providers (shanty-data)

shanty-data owns all external API calls. Key traits:

MetadataFetcher -- artist info, release tracks, release resolution (MusicBrainz)
ArtistImageFetcher -- artist photos (Wikipedia, fanart.tv)
ArtistBioFetcher -- artist biographies (Wikipedia, Last.fm)
LyricsFetcher -- song lyrics (LRCLIB)
CoverArtFetcher -- album art (Cover Art Archive)
SimilarArtistFetcher -- similar artist data (Last.fm)

HybridMusicBrainzFetcher wraps LocalMusicBrainzFetcher (optional) + MusicBrainzFetcher (remote API). It tries the local SQLite database first and falls back to the rate-limited remote API. The local DB is populated by importing MusicBrainz JSON dumps.
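The local-first fallback described above can be sketched as below. This is a simplified, synchronous stand-in: the single artist_name method, the generic Hybrid struct, and the in-memory MapFetcher are all assumptions made for illustration and are not the real signatures in shanty-data.

```rust
use std::collections::HashMap;

/// Simplified stand-in for a metadata trait.
/// The method name and types are assumptions, not the crate's real API.
pub trait MetadataFetcher {
    fn artist_name(&self, mbid: &str) -> Option<String>;
}

/// In-memory fetcher, used for both sides so the sketch runs offline.
pub struct MapFetcher(pub HashMap<String, String>);

impl MetadataFetcher for MapFetcher {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        self.0.get(mbid).cloned()
    }
}

/// Tries an optional local source first, then falls back to the remote one.
pub struct Hybrid<L, R> {
    pub local: Option<L>,
    pub remote: R,
}

impl<L: MetadataFetcher, R: MetadataFetcher> MetadataFetcher for Hybrid<L, R> {
    fn artist_name(&self, mbid: &str) -> Option<String> {
        self.local
            .as_ref()
            // A local miss (or no local DB at all) yields None here...
            .and_then(|local| local.artist_name(mbid))
            // ...in which case the remote source is consulted.
            .or_else(|| self.remote.artist_name(mbid))
    }
}
```

Because the wrapper implements the same trait as its parts, callers do not need to know whether a local database is configured.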

Web Server (shanty-web + root binary)

The root shanty binary sets up the Actix server, creates shared AppState, and spawns background tasks. The shanty-web crate provides the route handlers and frontend.

AppState holds: database connection, MusicBrainz client (hybrid), search provider, Wikipedia fetcher, shared config (behind Arc<RwLock>), task manager, scheduler info, Firefox login session state.

Background Tasks

Four background loops run via tokio::spawn + sleep:

cookie_refresh -- refreshes YouTube cookies via headless Firefox (every 6 hours, configurable)
pipeline_scheduler -- runs the full download pipeline automatically (every 3 hours, configurable)
monitor -- checks monitored artists for new releases (every 12 hours, configurable)
mb_update -- re-imports MusicBrainz dumps if auto_update is enabled (weekly)

One-off tasks (index, tag, organize, download process, monitor check, MB import) are spawned on demand and tracked via TaskManager.
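The default intervals of the four loops can be captured in a small table like the one below. This is a sketch only: BackgroundTask and run_loop are hypothetical names, run_loop is a blocking std::thread stand-in for the tokio::spawn + sleep pattern, and whether the real loops run before or after the first sleep is an assumption.

```rust
use std::time::Duration;

/// The four background loops listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BackgroundTask {
    CookieRefresh,
    PipelineScheduler,
    Monitor,
    MbUpdate,
}

impl BackgroundTask {
    /// Default interval between runs, taken from the list above.
    pub fn default_interval(self) -> Duration {
        const HOUR: u64 = 60 * 60;
        let secs = match self {
            BackgroundTask::CookieRefresh => 6 * HOUR,
            BackgroundTask::PipelineScheduler => 3 * HOUR,
            BackgroundTask::Monitor => 12 * HOUR,
            BackgroundTask::MbUpdate => 7 * 24 * HOUR,
        };
        Duration::from_secs(secs)
    }
}

/// Run `tick` a fixed number of times, sleeping `interval` between runs.
/// Assumption: the task runs first and sleeps afterwards.
pub fn run_loop(interval: Duration, iterations: usize, mut tick: impl FnMut()) {
    for i in 0..iterations {
        tick();
        // Skip the sleep after the final run so the function returns promptly.
        if i + 1 < iterations {
            std::thread::sleep(interval);
        }
    }
}
```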

Database

Sea-ORM with SQLite. Migrations run automatically on startup. Key tables:

artists -- name (unique), musicbrainz_id (unique), monitored flag, top_songs/similar_artists (JSON)
albums -- name, album_artist, year, genre, musicbrainz_id, artist_id FK
tracks -- file_path (unique), all metadata fields, musicbrainz_id, artist_id/album_id FKs
wanted_items -- item_type, name, musicbrainz_id, artist_id, status (Wanted/Available/Downloaded/Owned), user_id
download_queue -- query, wanted_item_id FK, status, retry_count
search_cache -- query_key (unique), provider, result_json, expires_at (used for MB data, lyrics, artist enrichment)
users -- username, password_hash (bcrypt), role (Admin/User), subsonic_password (plaintext per Subsonic protocol)
playlists / playlist_tracks -- saved playlists with ordered track references
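As an illustration of how the expires_at column in search_cache might be used, the sketch below models the table as an in-memory map. It is not the Sea-ORM code: SearchCache, put, and get_fresh are hypothetical, and representing expires_at as a Unix timestamp in seconds is an assumption.

```rust
use std::collections::HashMap;

/// Mirrors the search_cache columns listed above.
/// Assumption: expires_at is a Unix timestamp in seconds.
pub struct CacheEntry {
    pub provider: String,
    pub result_json: String,
    pub expires_at: u64,
}

/// In-memory stand-in for the table, keyed by the unique query_key.
#[derive(Default)]
pub struct SearchCache {
    entries: HashMap<String, CacheEntry>,
}

impl SearchCache {
    /// Insert or overwrite an entry that expires `ttl_secs` after `now`.
    pub fn put(&mut self, query_key: &str, provider: &str, result_json: &str, now: u64, ttl_secs: u64) {
        self.entries.insert(
            query_key.to_string(),
            CacheEntry {
                provider: provider.to_string(),
                result_json: result_json.to_string(),
                expires_at: now + ttl_secs,
            },
        );
    }

    /// Return the cached JSON only while the entry has not yet expired.
    pub fn get_fresh(&self, query_key: &str, now: u64) -> Option<&str> {
        self.entries
            .get(query_key)
            .filter(|entry| entry.expires_at > now)
            .map(|entry| entry.result_json.as_str())
    }
}
```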

Subsonic API

Mounted at /rest/* with separate authentication (username + MD5 token, per the Subsonic protocol spec). Supports browsing, streaming, playlists, search, cover art, and scrobbling. Opus files are auto-transcoded to MP3 via ffmpeg for client compatibility.

Frontend

Yew 0.21 with client-side rendering (CSR, no SSR). Built with Trunk to WASM. The compiled output goes to shanty-web/static/ and is served by Actix with SPA fallback (all non-API routes serve index.html).

Data Flow (Pipeline)

The full automated pipeline, triggered by "Set Sail" in the UI or by the pipeline scheduler:

Search -- find artist/album on MusicBrainz
Watch -- add to watchlist (expands to individual track wanted_items with recording MBIDs)
Sync -- shanty_dl::sync_wanted_to_queue creates download_queue entries for wanted items
Download -- yt-dlp downloads via YouTube Music search (ytmusicapi Python script), creates track records in DB with MBIDs from wanted_items
Index -- scan library, extract metadata from new files
Tag -- MusicBrainz lookup by MBID (skips search since MBID is known), write tags to files
Organize -- move files to {artist}/{album}/{track_number} - {title}.{ext} in the library
Promote -- all Downloaded wanted_items are marked as Owned
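The final Promote step can be sketched as a status transition over wanted items. The status names come from the wanted_items table; the WantedItem struct and promote_downloaded function are hypothetical simplifications, and the real step presumably runs as a database update.

```rust
/// Status values from the wanted_items table.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum WantedStatus {
    Wanted,
    Available,
    Downloaded,
    Owned,
}

/// Hypothetical in-memory view of a wanted_items row.
pub struct WantedItem {
    pub recording_mbid: String,
    pub status: WantedStatus,
}

/// Promote step: every Downloaded item becomes Owned.
/// Items in any other status are left untouched.
/// Returns the number of items that changed.
pub fn promote_downloaded(items: &mut [WantedItem]) -> usize {
    let mut promoted = 0;
    for item in items
        .iter_mut()
        .filter(|item| item.status == WantedStatus::Downloaded)
    {
        item.status = WantedStatus::Owned;
        promoted += 1;
    }
    promoted
}
```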

Configuration

The YAML config file lives at ~/.config/shanty/config.yaml, or at the path given by the SHANTY_CONFIG environment variable. Environment variables override YAML values. In Docker, the config file is at /config/config.yaml.

Key environment variables:

SHANTY_DATA_DIR -- base directory for all application data (Docker: /data)
SHANTY_DATABASE_URL -- SQLite connection string
SHANTY_LIBRARY_PATH -- music library root
SHANTY_CONFIG -- path to config YAML
SHANTY_WEB_PORT / SHANTY_WEB_BIND -- server binding
SHANTY_LASTFM_API_KEY -- Last.fm API key (for bios and similar-artist playlists)
SHANTY_FANART_API_KEY -- fanart.tv API key (for artist images/banners)

The config is loaded once at startup and held in Arc<RwLock<AppConfig>>. It can be updated at runtime via the /api/config PUT endpoint, which writes back to the YAML file and updates the in-memory config.
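The override and runtime-update behavior described above might look roughly like this. It is a sketch under stated assumptions: the two AppConfig fields, the closure-based signature of apply_env_overrides, and the replace_config helper are illustrative and not taken from shanty-config. Passing the environment lookup as a closure keeps the function testable; a caller could pass |key| std::env::var(key).ok().

```rust
use std::sync::{Arc, RwLock};

/// Reduced config with two illustrative fields (names are assumptions).
#[derive(Debug, Clone, PartialEq)]
pub struct AppConfig {
    pub web_port: u16,
    pub library_path: String,
}

/// Overwrite YAML-loaded values with any environment values that are set.
/// An unparsable port is ignored and the YAML value is kept (an assumption).
pub fn apply_env_overrides(cfg: &mut AppConfig, env: impl Fn(&str) -> Option<String>) {
    if let Some(port) = env("SHANTY_WEB_PORT").and_then(|v| v.parse::<u16>().ok()) {
        cfg.web_port = port;
    }
    if let Some(path) = env("SHANTY_LIBRARY_PATH") {
        cfg.library_path = path;
    }
}

/// Shared handle with the same shape as the one held in AppState.
pub type SharedConfig = Arc<RwLock<AppConfig>>;

/// Swap the in-memory config, as a PUT handler might do after
/// writing the new values back to the YAML file.
pub fn replace_config(shared: &SharedConfig, new_cfg: AppConfig) {
    *shared.write().expect("config lock poisoned") = new_cfg;
}
```

Readers take shared.read() for each request, so a replaced config becomes visible to later requests without a restart.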

Coding Standards

Rust edition 2024 with resolver v3
cargo clippy -- -D warnings must pass
cargo fmt for formatting
All crates must compile independently
Never use #[allow(dead_code)] -- remove dead code instead
Never create local DB records for artists the user is just browsing (only persist when they watch)
Use MBIDs for all matching in the web app, never name-based matching
Artist credits: use the primary (first) artist only, never concatenate collaborators

Testing

cargo test --workspace           # all tests
cargo test --package shanty-db   # single crate

The frontend is excluded from workspace tests (it has its own build process via Trunk).

Test patterns used across crates:

In-memory SQLite: Integration tests create Database::new("sqlite::memory:") for fast, isolated DB testing. No fixtures directory -- data is inserted programmatically.
Mock providers: Each crate that depends on external APIs defines its own mock trait implementations (e.g., MockProvider for MetadataProvider, MockBackend for DownloadBackend). Mocks are self-contained per crate.
Temp files: tempfile::TempDir + lofty to create real MP3 files with valid ID3 tags for index/org/tag tests.
Integration tests: {crate}/tests/integration.rs (async via #[tokio::test])
Unit tests: #[cfg(test)] modules inline in source files for pure functions (sanitization, parsing, normalization, template rendering).
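The inline unit-test pattern for pure functions can be sketched with a small path renderer for the organize template. Everything here is illustrative: sanitize_component, render_path, the set of replaced characters, and the unpadded track number are assumptions, not the behavior of shanty-org.

```rust
/// Replace characters that are unsafe in file names with '_'.
/// The exact character set is an assumption for this sketch.
pub fn sanitize_component(s: &str) -> String {
    s.trim()
        .chars()
        .map(|c| match c {
            '/' | '\\' | ':' | '*' | '?' | '"' | '<' | '>' | '|' => '_',
            other => other,
        })
        .collect()
}

/// Fill a format template such as
/// "{artist}/{album}/{track_number} - {title}.{ext}".
pub fn render_path(
    template: &str,
    artist: &str,
    album: &str,
    track_number: u32,
    title: &str,
    ext: &str,
) -> String {
    template
        .replace("{artist}", &sanitize_component(artist))
        .replace("{album}", &sanitize_component(album))
        .replace("{track_number}", &track_number.to_string())
        .replace("{title}", &sanitize_component(title))
        .replace("{ext}", ext)
}

// Inline test module, compiled only under `cargo test`.
#[cfg(test)]
mod path_tests {
    use super::*;

    #[test]
    fn renders_default_template() {
        let path = render_path(
            "{artist}/{album}/{track_number} - {title}.{ext}",
            "AC/DC",
            "Back in Black",
            1,
            "Hells Bells",
            "opus",
        );
        assert_eq!(path, "AC_DC/Back in Black/1 - Hells Bells.opus");
    }
}
```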

Important Constraints

MusicBrainz rate limit: 1 request per 1.1 seconds for the remote API. This is mitigated by the local SQLite database (imported from MB dumps) and aggressive caching in search_cache.
YouTube cookies expire roughly every 2 weeks. They are auto-refreshed by headless Firefox every 6 hours when cookie_refresh is enabled.
Session key is random on startup -- user sessions do not survive server restarts.
Subsonic password is stored in plaintext per the Subsonic protocol specification. Users are warned about this in the UI.
Opus transcoding for Subsonic clients buffers the entire transcoded file in memory before streaming, which is not ideal for very large files.
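The MusicBrainz limit of one request per 1.1 seconds can be enforced with a small minimum-interval limiter. The sketch below is an assumption about how such a limiter could work, not the code in shanty-data; RateLimiter and its methods are hypothetical names. The clock is passed in explicitly so the logic can be tested without sleeping.

```rust
use std::time::{Duration, Instant};

/// Enforces a minimum gap between consecutive requests.
pub struct RateLimiter {
    min_interval: Duration,
    last_request: Option<Instant>,
}

impl RateLimiter {
    pub fn new(min_interval: Duration) -> Self {
        Self {
            min_interval,
            last_request: None,
        }
    }

    /// How long a caller must wait at `now` before sending a request.
    /// Returns zero if no request has been sent yet or enough time has passed.
    pub fn delay_at(&self, now: Instant) -> Duration {
        match self.last_request {
            Some(last) => (last + self.min_interval).saturating_duration_since(now),
            None => Duration::ZERO,
        }
    }

    /// Record that a request was sent at `now`.
    pub fn record(&mut self, now: Instant) {
        self.last_request = Some(now);
    }
}
```

A caller would construct the limiter with Duration::from_millis(1100), sleep for delay_at(Instant::now()) before each remote call, and then call record.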

Making Changes

Backend route changes: edit files in shanty-web/src/routes/
Frontend changes: edit files in shanty-web/frontend/src/, then cd shanty-web/frontend && trunk build
Config changes: update shanty-config/src/lib.rs (add field + default), update apply_env_overrides if adding env var support
Database schema changes: add a migration in shanty-db, update entities and queries
Adding a new external data source: add a provider implementation in shanty-data behind the appropriate trait
Each crate submodule must be committed and pushed independently before updating the parent workspace
