Building My Personal Streaming Library, Part 3: Wrangling a 115,000-Track Music Collection With Claude’s Help

Welcome back. If you’ve read this far into the series, you know I’ve been converting my videos from DVD and Blu-ray over to my Plex server, with a stack of complex scripts written mostly by Claude. But what about music?

The Scale of the Problem

My largest digital collection by file count is my music: 115,000 tracks ripped from many thousands of CDs over the years. Part of that comes from my radio-station days. We used to get CDs by the truckload, and anything that didn’t make it onto air was fair game for the staff to take home. So when I say I have a lot of CDs, it’s like saying a record store has a lot of records.

I knew I’d carried this collection from computer to computer over the years and that there were duplicates I needed to weed out. What I didn’t realize was how many: deduplication cut the file count by a little more than half. That’s fine. My plan was to get to the bottom of the mess anyway, and a leaner library is the goal. I also haven’t ripped audio to my computer in many years, and I want to start on my more recent CD purchases. That’s probably about 500 more discs, since library book sales in my area tend to have a lot of CDs.

What Claude Built

Same approach as the video pipeline: I described the problem and Claude wrote the scripts. The first one handled the deduplication. The second organized the survivors into a folder structure that’s actually navigable: top-level folders for each letter A through Z, plus one for Compilations and one for Soundtracks. Inside each letter folder is a list of artists; inside an artist folder, a list of albums; inside an album folder, the individual tracks. Compilations and Soundtracks are organized by album rather than artist.
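For the first of those scripts, here is a minimal sketch of one way a deduplication pass could work. It assumes duplicates are byte-identical copies, which is what tends to pile up when a library is carried from computer to computer; the real script may well compare tags or audio content instead. The extension list and the function names are illustrative, not taken from the actual script.

```python
import hashlib
from collections import defaultdict
from pathlib import Path

# Assumed set of audio formats; adjust to whatever the library really contains.
AUDIO_EXTENSIONS = {".mp3", ".flac", ".m4a", ".ogg", ".wav"}


def file_digest(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root: Path) -> list[list[Path]]:
    """Group byte-identical audio files under root; every group has 2+ paths."""
    # Cheap first pass: files of different sizes can never be identical,
    # so only same-size files need to be hashed.
    by_size = defaultdict(list)
    for path in root.rglob("*"):
        if path.is_file() and path.suffix.lower() in AUDIO_EXTENSIONS:
            by_size[path.stat().st_size].append(path)

    by_hash = defaultdict(list)
    for same_size in by_size.values():
        if len(same_size) < 2:
            continue
        for path in same_size:
            by_hash[file_digest(path)].append(path)

    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Each returned group is a set of copies of the same file, so a caller would keep one path and remove or archive the rest. Exact hashing will not catch the same song ripped twice or re-tagged, since even a one-byte tag change produces a different digest.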

That reorganization was a lot of work, and even on my fairly powerful server it wasn’t a quick run.
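The folder layout described above can be expressed as a small path-building function. This is a sketch under stated assumptions, not the actual script: the function names are hypothetical, the "#" bucket for artists that don't start with A through Z (10cc, say) is my own placeholder since the layout above only names letter folders, and leading articles like "The" are left alone.

```python
import re
from pathlib import Path

# Characters that are unsafe in folder names on common filesystems.
_UNSAFE = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def safe_name(text: str, fallback: str) -> str:
    """Replace unsafe characters and fall back to a placeholder if nothing is left."""
    cleaned = _UNSAFE.sub("_", text).strip(" .")
    return cleaned or fallback


def destination(library: Path, artist: str, album: str, filename: str,
                kind: str = "artist") -> Path:
    """Build the target path for one track.

    kind is "artist", "compilation", or "soundtrack" (hypothetical labels).
    """
    album_dir = safe_name(album, "Unknown Album")
    # Compilations and soundtracks are filed by album, not by artist.
    if kind == "compilation":
        return library / "Compilations" / album_dir / filename
    if kind == "soundtrack":
        return library / "Soundtracks" / album_dir / filename

    artist_dir = safe_name(artist, "Unknown Artist")
    first = artist_dir[0].upper()
    # Assumption: anything outside A-Z goes into a "#" folder.
    letter = first if "A" <= first <= "Z" else "#"
    return library / letter / artist_dir / album_dir / filename
```

For example, `destination(Path("Music"), "Miles Davis", "Kind of Blue", "01 So What.flac")` yields `Music/M/Miles Davis/Kind of Blue/01 So What.flac`. Keeping this as a pure function, separate from the code that moves files, makes it easy to preview every move before anything on disk changes.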

Once that was done, Claude wrote another script to handle the stragglers: tracks that weren’t properly labeled, tracks in compilations where the compilation itself wasn’t tagged as one, and tracks the organizing script couldn’t identify from their metadata. The script analyzes the audio from each unknown file and compares it against an online fingerprint database to figure out exactly which song it is. Once a track is identified, the script can put it in the right folder.
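The fingerprinting itself depends on whichever database and client library the script uses, so it isn't shown here. What can be sketched is the decision that comes after the lookup: whether a result is trustworthy enough to act on. The `Candidate` shape, the 0-to-1 score, and both thresholds below are assumptions for illustration, not the format of any particular service.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Candidate:
    """One possible identification from a fingerprint lookup (hypothetical shape)."""
    title: str
    artist: str
    score: float  # assumed range 0.0 to 1.0, higher means a closer match


def pick_match(candidates: list[Candidate],
               min_score: float = 0.85,
               min_margin: float = 0.05) -> Optional[Candidate]:
    """Return the best candidate, or None if the lookup is too uncertain to trust."""
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    if not ranked or ranked[0].score < min_score:
        return None

    best = ranked[0]
    for other in ranked[1:]:
        same_song = (other.title, other.artist) == (best.title, best.artist)
        # A different song scoring almost as high means the result is ambiguous.
        if not same_song and best.score - other.score < min_margin:
            return None
    return best
```

Returning `None` for weak or ambiguous results leaves those files for manual review, which is safer than filing a track under the wrong artist.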

From there, another script downloads album art and other publicly available metadata, writes NFO files for Plex, and fills in the missing details. And there’s a small cleanup script that removes empty folders left behind by the earlier scripts.
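The cleanup step is the simplest of the bunch. Here is a minimal sketch of removing empty folders, assuming "empty" means a folder with no files anywhere beneath it; the function name and the dry-run default are my own choices rather than details of the actual script.

```python
import os
from pathlib import Path


def remove_empty_dirs(root: Path, dry_run: bool = True) -> list[Path]:
    """Remove (or, in dry-run mode, just list) folders under root that hold no files."""
    removed: set[Path] = set()
    ordered: list[Path] = []
    # topdown=False visits children before parents, so a folder that only
    # contains empty folders is handled after those children.
    for current, dirnames, filenames in os.walk(root, topdown=False):
        path = Path(current)
        if path == root or filenames:
            continue
        # Empty if every subfolder was itself found empty. Tracking this in a
        # set keeps the dry run accurate even though nothing is deleted.
        if all(path / name in removed for name in dirnames):
            if not dry_run:
                path.rmdir()
            removed.add(path)
            ordered.append(path)
    return ordered
```

Calling `remove_empty_dirs(library)` first returns what would go without touching anything; passing `dry_run=False` does the deletion. The library root itself is never removed.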

Where I Am Now

All of this work, and I’m only at phase 5 of a 7-phase project. Music turned out to be a much bigger problem than the video collection, partly because of the sheer file count and partly because so many of the files had been moved and copied across computers over the years. The structure is finally coming together, though, and once the last two phases finish I’ll have a clean, fully tagged, Plex-friendly music library that mirrors what I’ve done with movies and TV.

If you’ve read all three articles in this series, thanks for sticking with me. If you’re working on a similar project, I’d love to hear what you’ve figured out that I haven’t — there’s a lot of trial and error in this kind of work, and most of it never gets written up anywhere.

