Replace ad-hoc scheduling with a concurrent work queue system #43
Current state
The current scheduling uses three separate `tokio::spawn` + `sleep` loops (in `shanty-web/src/pipeline_scheduler.rs`, `shanty-web/src/monitor.rs`, and `shanty-web/src/cookie_refresh.rs`). Each is a copy-paste pattern that reads config, sleeps, and runs work.

The pipeline (`shanty-web/src/pipeline.rs` / `shanty-web/src/routes/system.rs`) runs 6 strictly sequential steps: sync → download → index → tag → organize → enrich. Each step must fully complete before the next begins. This means if 50 tracks are downloading, the tagger waits idle for all 50 to finish before it can tag even the first one.

Current problems
- `SchedulerInfo` struct in `shanty-web/src/state.rs` tracks `next_pipeline` and `next_monitor` as `Option<NaiveDateTime>`, but these are volatile.
- (`SchedulerInfo`).
- Skip flags (`skip_pipeline`, `skip_monitor` in `SchedulerInfo`) that could race with the scheduler loop.
- `pipeline_scheduler.rs`, `monitor.rs`, and `cookie_refresh.rs` all duplicate the same loop structure.
- The TaskManager (`shanty-web/src/tasks.rs`) is an in-memory HashMap of `TaskInfo` structs that accumulates indefinitely, with no cleanup and no persistence.

Proposed architecture: Work queue with typed workers
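On the scheduling side, the duplicated loop structure noted in the problems above could collapse into one table of recurring jobs. The following is a minimal, std-only sketch rather than the actual implementation: the `Job` and `Scheduler` shapes and the `register`, `tick`, and `next_wakeup` methods are assumptions made for illustration, and time is a plain `u64` of seconds so the behaviour stays deterministic.

```rust
use std::collections::BTreeMap;
use std::time::Duration;

/// One recurring job (hypothetical shape). `next_run` is seconds since an
/// arbitrary epoch; real code would use a proper timestamp type.
#[derive(Debug, Clone, PartialEq)]
struct Job {
    interval: Duration,
    next_run: u64,
}

/// Hypothetical unified scheduler: one table of jobs instead of one loop per job.
#[derive(Debug, Default)]
struct Scheduler {
    jobs: BTreeMap<&'static str, Job>,
}

impl Scheduler {
    /// Registers a job whose first run is one interval after `now`.
    fn register(&mut self, name: &'static str, interval: Duration, now: u64) {
        let next_run = now + interval.as_secs();
        self.jobs.insert(name, Job { interval, next_run });
    }

    /// Returns the names of all jobs due at `now` (in name order) and
    /// reschedules each of them one interval into the future.
    fn tick(&mut self, now: u64) -> Vec<&'static str> {
        let mut due = Vec::new();
        for (name, job) in self.jobs.iter_mut() {
            if job.next_run <= now {
                due.push(*name);
                job.next_run = now + job.interval.as_secs();
            }
        }
        due
    }

    /// Earliest upcoming run, i.e. the point until which a single loop may sleep.
    fn next_wakeup(&self) -> Option<u64> {
        self.jobs.values().map(|job| job.next_run).min()
    }
}
```

A single loop would sleep until `next_wakeup()`, call `tick(now)`, and dispatch whatever job names come back. Because each `next_run` is ordinary data, it could be written to a table and reloaded on startup instead of living only in memory.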
Replace the linear pipeline with a task bucket (work queue) with typed workers:
- `Download`, `Index`, `Tag`, `Organize`: each is a fine-grained unit of work (one task per file/track, not one task per batch).
- `Sync` creates individual `Download` tasks (one per wanted item)
- `Download` worker finishes a file → creates an `Index` task for that file
- `Index` worker finishes a track → creates a `Tag` task for that track
- `Tag` worker finishes → creates an `Organize` task
- `Organize` worker finishes → track is done
- Tasks are stored in the database (`work_queue` table with `id`, `task_type`, `status`, `payload_json`, `created_at`, `started_at`, `completed_at`, `error`). In-progress tasks can be resumed after restart.
- A `Scheduler` struct replaces the three separate scheduler files. It manages all recurring jobs with:
  - (`scheduler_state` table or similar)
  - (`SchedulingConfig`)

Key files to modify
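The task chaining and restart behaviour proposed above can be sketched with an in-memory stand-in for the `work_queue` table. This is an illustration under stated assumptions, not the intended implementation: the `Status` enum, the `WorkQueue` type, and its `enqueue`, `claim`, `complete`, and `recover` methods are hypothetical names, the timestamp and `error` columns are omitted, and a real version would read and write database rows instead of a `Vec`.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum TaskType {
    Download,
    Index,
    Tag,
    Organize,
}

impl TaskType {
    /// Stage that a finished task enqueues next; `None` means the track is done.
    fn next(self) -> Option<TaskType> {
        match self {
            TaskType::Download => Some(TaskType::Index),
            TaskType::Index => Some(TaskType::Tag),
            TaskType::Tag => Some(TaskType::Organize),
            TaskType::Organize => None,
        }
    }
}

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Status {
    Pending,
    Running,
    Completed,
}

/// Stand-in for one `work_queue` row (timestamps and `error` omitted).
#[derive(Debug, Clone, PartialEq)]
struct WorkItem {
    id: u64,
    task_type: TaskType,
    status: Status,
    payload_json: String,
}

#[derive(Debug, Default)]
struct WorkQueue {
    items: Vec<WorkItem>,
    next_id: u64,
}

impl WorkQueue {
    /// Adds a pending task and returns its id.
    fn enqueue(&mut self, task_type: TaskType, payload_json: &str) -> u64 {
        self.next_id += 1;
        self.items.push(WorkItem {
            id: self.next_id,
            task_type,
            status: Status::Pending,
            payload_json: payload_json.to_string(),
        });
        self.next_id
    }

    /// Claims the oldest pending task of one type, as a typed worker would.
    fn claim(&mut self, task_type: TaskType) -> Option<u64> {
        let item = self
            .items
            .iter_mut()
            .find(|i| i.task_type == task_type && i.status == Status::Pending)?;
        item.status = Status::Running;
        Some(item.id)
    }

    /// Marks a task completed and enqueues the follow-up stage for the same
    /// payload. Returns `None` if the id is unknown or the chain has ended.
    fn complete(&mut self, id: u64) -> Option<u64> {
        let item = self.items.iter_mut().find(|i| i.id == id)?;
        item.status = Status::Completed;
        let (next, payload) = (item.task_type.next()?, item.payload_json.clone());
        Some(self.enqueue(next, &payload))
    }

    /// After a restart, tasks left `Running` go back to `Pending` for a retry.
    fn recover(&mut self) -> usize {
        let mut reset = 0;
        for item in self.items.iter_mut().filter(|i| i.status == Status::Running) {
            item.status = Status::Pending;
            reset += 1;
        }
        reset
    }
}
```

With two `Download` tasks enqueued, completing the first one immediately yields an `Index` task that an index worker can claim while the second download is still pending, which is the per-track overlap the linear pipeline cannot provide.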
- `shanty-web/src/pipeline.rs`: replace with work queue dispatch logic
- `shanty-web/src/pipeline_scheduler.rs`: merge into unified scheduler
- `shanty-web/src/monitor.rs`: merge scheduler portion into unified scheduler
- `shanty-web/src/cookie_refresh.rs`: merge into unified scheduler
- `shanty-web/src/tasks.rs`: replace in-memory TaskManager with DB-backed work queue
- `shanty-web/src/state.rs`: simplify `SchedulerInfo`, remove skip flags
- `shanty-db/src/migration/`: new migration for `work_queue` and `scheduler_state` tables
- `shanty-web/src/routes/system.rs`: update `trigger_pipeline()` to enqueue work items instead of running sequentially
- `shanty-web/frontend/src/pages/dashboard.rs`: update Background Tasks display to show work queue state

Acceptance criteria: