Building My Personal Streaming Library, Part 2: File Naming and the Scripts That Organize It All

Welcome back. I’m assuming you’ve either read Part 1 of this series and have started converting some of your discs, or you already had a media collection and skipped ahead. Either way, this article is about file naming and the scripts I’ve built to organize everything.

The Scripts

You don’t have to use the file-naming convention I chose, but if you’re going to use the scripts I wrote, it’s much easier on you if you do. After some research, I settled on what seemed like the most standardized way of naming movie and TV files. The full breakdown — including examples for standalone movies, movies in series, and TV shows — is in the README of the repo:

https://github.com/kg4dkf/Media-Library-Tools

Having chosen my naming format, I signed up for a Claude Max subscription. That’s $100 for a month of coding help. To set expectations: this is not the same as hiring a developer and asking them to deliver a finished product. Claude will write code with incredible efficiency, but that code will sometimes contain bugs that Claude didn’t see coming. You’ll go through several rounds of “oh, I see what I did wrong” before everything settles. With that caveat, the work it did was absolutely worth the $100, and I got plenty of help on other projects in the same month.

What the Scripts Do for Your Movie Collection

My recommendation is to download the repo and try the movie scripts on a backup copy of your collection first. On an initial run, the scripts will:

  • Walk your library folders and identify what you already have.
  • Organize movies into their proper folders.
  • Rename them to the standard format.
  • Eliminate duplicates by keeping the highest-quality copy and moving the others to a quarantine folder.
  • Generate reports on what was done and what needs your attention.
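
To make the duplicate step concrete, here is a minimal sketch of how "keep the highest-quality copy" could be decided. It is not the code from the repo. It assumes that "quality" means vertical resolution first and file size second, and the names Copy and plan_dedupe are hypothetical. It only builds a plan; nothing is moved or deleted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Copy:
    path: str
    title_key: str   # normalized "title (year)" used to group duplicates
    height: int      # vertical resolution in pixels, e.g. 1080
    size_bytes: int


def plan_dedupe(copies):
    """Return (keep, quarantine) lists. No files are touched here."""
    groups = defaultdict(list)
    for c in copies:
        groups[c.title_key].append(c)

    keep, quarantine = [], []
    for group in groups.values():
        # Assumed ranking: higher resolution wins, file size breaks ties.
        ranked = sorted(group, key=lambda c: (c.height, c.size_bytes), reverse=True)
        keep.append(ranked[0])
        quarantine.extend(ranked[1:])
    return keep, quarantine
```

Splitting the decision from the file moves means the plan can be written to a report and reviewed before anything lands in the quarantine folder.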

Once your collection is cleaned up, there’s an intake script that handles new additions. You drop newly ripped movies into a temp folder, and every so often you run the intake script. It moves the files into the main directory, renames them, and slots them into the right folder structure. Day-to-day maintenance drops to almost nothing.
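
For a sense of what an intake pass involves, here is a rough sketch. It assumes a "Title (Year)/Title (Year).ext" layout, which is a common convention but may not match the README exactly, so treat the README as the authority. The function names, the filename pattern, and the extension list are all illustrative rather than taken from the repo.

```python
import re
import shutil
from pathlib import Path

VIDEO_EXTS = {".mkv", ".mp4", ".m4v", ".avi"}
# Assumed input pattern: a title followed by a four-digit year,
# e.g. "Blade.Runner.1982" or "Blade Runner (1982)".
NAME_RE = re.compile(r"^(?P<title>.+?)[\s._(\[-]+(?P<year>(?:19|20)\d{2})\b")


def destination_for(src: Path, library: Path):
    """Return the target path for a new file, or None if it can't be placed."""
    m = NAME_RE.match(src.stem)
    if src.suffix.lower() not in VIDEO_EXTS or m is None:
        return None
    title = re.sub(r"[._]+", " ", m["title"]).strip()
    folder = f"{title} ({m['year']})"
    return library / folder / f"{folder}{src.suffix.lower()}"


def intake(temp_dir: Path, library: Path, dry_run: bool = True):
    """Move placeable files into the library; return the ones needing attention."""
    needs_attention = []
    for src in sorted(temp_dir.iterdir()):
        if not src.is_file():
            continue
        dest = destination_for(src, library)
        if dest is None or dest.exists():
            # Unparseable name or a copy already in place: leave it for a human.
            needs_attention.append(src)
            continue
        if not dry_run:
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.move(str(src), str(dest))
    return needs_attention
```

Defaulting to dry_run=True is a deliberate choice in this sketch: the first run only reports what would be left behind, which fits the advice to experiment on a backup copy first.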

TV Shows Are a Separate Problem

TV files live in a separate directory in my scripts because they work very differently. The biggest issue is that most of my TV rips came off the disc with no episode information at all. The tracks were named “disc one track one,” “disc one track two,” “disc one track three” — there was nothing on the disc identifying which one was, say, episode 4 of season 2.

I could go through and manually watch each track until it hit a title screen, compare the episode title to an online episode list, and rename the file accordingly. But I have hundreds of TV shows. That’s thousands of hours of manual work.

So I asked Claude how to solve this, and Claude came up with a clever pipeline:

  • One script samples 90 seconds of audio from the middle of each unidentified episode and transcribes it.
  • A second script downloads the full subtitle transcript for every official episode of that show.
  • A third script takes the 90-second excerpt transcript and searches for that exact text inside each official episode transcript. When it finds the match, that’s the answer — the file is now identifiable. If there are multiple matches, the script pauses and asks me. (In theory two episodes could share the same 90 seconds of dialogue, but in practice that should be very rare.)
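
The matching step in the third script can be sketched in a few lines. This is an illustration under stated assumptions, not the repo’s implementation: it assumes both texts are normalized (lowercased, punctuation stripped) before comparison, and the function names and the episode codes used as dictionary keys are hypothetical.

```python
import re


def normalize(text: str) -> str:
    """Lowercase and reduce to words separated by single spaces."""
    return " ".join(re.findall(r"[a-z0-9']+", text.lower()))


def find_episodes(excerpt: str, transcripts: dict) -> list:
    """Return the episode codes whose transcript contains the excerpt."""
    needle = normalize(excerpt)
    if not needle:
        return []
    # Pad with spaces so the excerpt only matches on whole-word boundaries.
    padded = f" {needle} "
    return [
        code
        for code, text in transcripts.items()
        if padded in f" {normalize(text)} "
    ]


def identify(excerpt: str, transcripts: dict):
    """One match identifies the file; zero or several need a human decision."""
    matches = find_episodes(excerpt, transcripts)
    return matches[0] if len(matches) == 1 else None
```

Returning None for both "no match" and "several matches" keeps the automatic path strict; a real script would presumably report the candidate list so the pause-and-ask step has something to show.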

As of this writing, I haven’t run the whole pipeline end to end. I’m still in the subtitle-downloading phase. There isn’t a great free source for subtitles, so I signed up for a $20-a-month subscription to OpenSubtitles.org, which lets me download up to 2,000 episodes a day. The download script will eventually get through everything, and then the matching script will tear through the collection.

I’ve already done the audio extraction and the excerpt transcriptions, so the show-by-show identification is ready to go. I just don’t want to manually point the script at each show. I’d rather wait until I have all the subtitle data and then let it crawl the whole TV directory in one big run.
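
For readers curious what the audio-sampling step looks like, here is a small sketch of picking a 90-second window from the middle of a file and building an ffmpeg command for it. It assumes ffmpeg is installed and that the file’s duration is already known (for example from ffprobe). The function names and the mono 16 kHz output settings are my assumptions, not details from the repo.

```python
def middle_window(duration_s: float, length_s: float = 90.0):
    """Return (start, length) in seconds for a clip centered in the file."""
    length = min(length_s, duration_s)      # short files: take the whole thing
    start = max(0.0, (duration_s - length) / 2)
    return start, length


def ffmpeg_sample_cmd(src: str, dst: str, duration_s: float):
    """Build an ffmpeg argument list that extracts the middle clip as audio."""
    start, length = middle_window(duration_s)
    return [
        "ffmpeg",
        "-ss", f"{start:.3f}",   # seek to the start of the window
        "-t", f"{length:.3f}",   # read only this many seconds
        "-i", src,
        "-vn",                   # drop the video stream
        "-ac", "1",              # mono (assumed; convenient for transcription)
        "-ar", "16000",          # 16 kHz sample rate (assumed)
        dst,
    ]
```

The returned list can be handed to subprocess.run; keeping the command as a list rather than a single string avoids quoting problems with file names that contain spaces.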

What’s Next

That’s the file-naming and movie/TV organization story. In the next article I’ll talk about how I applied similar logic to a much larger problem: my music collection.

