I recently decided to take on the challenge of self-hosting and curating my music collection. I originally looked at Lidarr, as I am already a big fan of Radarr and Sonarr, but it wasn’t really what I was looking for. I’m not often seeking out full albums; more often I find my music by listening to single tracks from Spotify’s Discover Weekly playlist. I needed a solution that would let me replicate this experience while hosting my own MP3s, ideally entirely automated.
I currently have the following setup running on a VPS:
- Azuracast - This provides me a streaming radio station that cycles through my entire library 24/7
- Navidrome - This fills the gap of the Spotify-like interface where I can play specific tracks, albums, or playlists
I bootstrapped my library with a Python script that parsed a list of Spotify URLs and downloaded all of the tracks with the spotdl library. This allowed me to grab my liked tracks, the playlists I had created, and a large number of albums I wanted.
I then used ChatGPT to write two Python scripts:
- The first script runs via cron every Monday and uses spotdl to grab the contents of my Discover Weekly playlist from Spotify. It puts all of the files into a folder named with that week’s date and also creates a playlist file. This way I can easily browse that week’s playlist in Navidrome and decide what to keep. It also sends me an email on completion or error.
- The second script is a bit more complex. It produces the same end result, but for all of my Last.fm recommendations. It spins up a headless Chrome browser with Selenium in a Docker container, logs into my Last.fm account, parses each recommendation, and then uses pytube to download the linked videos, since Last.fm links directly to YouTube. This list should change as I continue scrobbling via Navidrome and other sources, but I still need to determine how often the cron job should run.
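For anyone who wants the general shape of the first script without opening the gist, here is a minimal sketch. It is not the actual script: the function names and the folder layout are made up for illustration, and it assumes `spotdl` is installed and saves into the current working directory by default. The email step is left out.

```python
import subprocess
from datetime import date
from pathlib import Path


def weekly_folder(root: Path, day: date) -> Path:
    """Create and return a folder named after the given date, e.g. 2023-07-10."""
    folder = root / day.isoformat()
    folder.mkdir(parents=True, exist_ok=True)
    return folder


def write_playlist(folder: Path) -> Path:
    """Write an M3U playlist listing every MP3 in the folder, sorted by name."""
    tracks = sorted(p.name for p in folder.glob("*.mp3"))
    playlist = folder / f"{folder.name}.m3u"
    body = "#EXTM3U\n" + "".join(f"{name}\n" for name in tracks)
    playlist.write_text(body, encoding="utf-8")
    return playlist


def download_week(root: Path, playlist_url: str, day: date) -> Path:
    """Download a playlist into that week's folder, then build the playlist file."""
    folder = weekly_folder(root, day)
    # Assumption: running spotdl with cwd set to the folder saves the files there.
    subprocess.run(["spotdl", playlist_url], cwd=folder, check=True)
    return write_playlist(folder)
```

Calling `download_week(Path("music/discover-weekly"), "<your playlist URL>", date.today())` from a Monday cron job would give one dated folder per week. Both arguments there are placeholders.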
My next step is figuring out how to connect to Azuracast/Navidrome using the many Subsonic-compatible clients so I can have mobile and offline playback. I’m currently looking at substreamer for Android.
I’d also like to look into a more seamless way of picking out the tracks I want to keep and discard from the playlists in Navidrome. I’m considering writing something to check its SQL database for liked tracks in each playlist and automatically move those into the main folder/playlist that Azuracast is playing from.
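I haven’t written this yet, but a rough sketch of the idea might look like the following. Big caveat: the table and column names (`playlist_tracks`, `media_file`, `annotation`, `starred`) are my assumption about how Navidrome lays out its SQLite database, so inspect your own copy before trusting the query. The helper names are made up for illustration.

```python
import shutil
import sqlite3
from pathlib import Path

# ASSUMED schema, not verified against any particular Navidrome version:
#   playlist_tracks(playlist_id, media_file_id)
#   media_file(id, path)
#   annotation(item_id, item_type, starred)
STARRED_IN_PLAYLIST = """
SELECT mf.path
FROM playlist_tracks AS pt
JOIN media_file AS mf ON mf.id = pt.media_file_id
JOIN annotation AS a ON a.item_id = mf.id AND a.item_type = 'media_file'
WHERE pt.playlist_id = ? AND a.starred = 1
ORDER BY mf.path
"""


def starred_paths(conn: sqlite3.Connection, playlist_id: str) -> list:
    """Return file paths of starred tracks that belong to the given playlist."""
    return [row[0] for row in conn.execute(STARRED_IN_PLAYLIST, (playlist_id,))]


def move_to_main(paths, main_folder: Path) -> list:
    """Move each file into the main library folder and return the new paths."""
    main_folder.mkdir(parents=True, exist_ok=True)
    moved = []
    for src in paths:
        dest = main_folder / Path(src).name
        shutil.move(str(src), str(dest))
        moved.append(dest)
    return moved
```

It would probably be safest to run this against a copy of the database, or open it read-only with `sqlite3.connect("file:navidrome.db?mode=ro", uri=True)`, since Navidrome may be writing to it at the same time. On a multi-user install the query would likely also need to filter by user.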
This whole setup took me only a couple days to create, and largely relied on ChatGPT to write the scripts and dockerfiles. I’m a capable programmer but GPT-4 is absolutely OP if you know what you’re trying to accomplish and how to debug its mistakes. That Selenium script only took me an hour from idea to completion and I never modified the code by hand, only prompted it for corrections/additions.
If anyone is interested, I’ve uploaded all the scripts to a gist; you just need to go through and update them with your credentials/URLs.
Holy shit dude - you just made an automated radio station - pretty damn cool. Question for you, because you got my creative curiosity going and it looks like you’ve got the tech chops to answer - how difficult would it be to serve it up on a public web interface as a pirate radio station?
Do you mean like this?
;)
Edit: To more directly answer your question, this is using the “Public Pages” feature that is already built into Azuracast along with a bunch of custom CSS to make it look nicer
Oh dang, this is really good. I was thinking about trying to self-host all my music as well, but I still need good storage lol.
This is all being hosted directly on the VPS, unless you’re storing FLAC it doesn’t take a ton of space
That’s fucking sick bro - nice work!
Haha I can feel the satisfaction behind this comment, well deserved! Your setup is really cool!
Also, clicking on that link gave me a Stromae song, so I must immediately assume that you have good taste in music.
I don’t speak a word of French but those folks make great music
Pretty cool, but won’t your Discover Weekly playlists stop updating based on your new self-hosted songs? I figure if you stop using Spotify, the lists it suggests will eventually tend to have the same songs/artists on repeat. It’s a very minor gripe seeing as how you can always just manually search and find new songs based on radios and such. Best of luck!
I’ve gone ages without using Spotify and found the list still regularly updates regardless of whether or not I’m actually there listening. This is also why I threw in the Last.FM recommendations though, so I can have something more dynamic based on my current listening.
I’m interested in you getting chat gpt to write scripts, can you share tips and tricks on your prompt? This might make me use it for the first time lol.
Honestly just go for it, it’s pretty straightforward! I’d share my chat transcript but it at points contained things like my API keys.
I can however give some excerpts from the conversation:
You are a senior software engineer. Create a python script that logs into a website using the selenium/standalone-chrome docker container
This was actually my first time using the “You are a senior software engineer” bit, but I’ve heard a few people saying it works. I came across the idea for using Selenium from this prompt:
Please write a python script to load a website, login, navigate to a URL, and then scrape all of the text that matches a CSS selector
In fact here is the chat transcript for that one. Once I got to the end of this transcript I decided to try out the code. I realized selenium was using my installed browser and that wasn’t going to work once I moved this to a server. That was when I moved into a new chat that contains what became the final script, where I started the conversation with this prompt:
I need to write a script that submits a form on a webpage. This script needs to be run from a VPS that does not have a browser
It was in this conversation that I learned about using the headless chrome container. Everything I did was a combination of prompting for additions and reading the documentation on what that was capable of.
I will regularly ditch a chat thread and take the output from a previous one into a new one, as it takes the previous context of the conversation into account for informing future generation, and sometimes I want to pivot or I want to focus in on a specific approach.
Once I had a more focused idea of what the tech stack was going to be, it was just a matter of prompting for what I needed, testing, feeding back any errors to get corrections, noticing something wrong (like it wasn’t appending .mp3 to the files) or something else I wanted to change, and prompting it in plain English.
There’s all kinds of people saying you should use X method and Y approach, but I find I get great results by just being clear and concise with what I’m looking for, as I would when speaking to another developer.
Thanks, I’ll definitely mess around with it soon!
Please update this if you continue working on it, I’ve been looking for something like this.
Is chrome browser needed? Could this be swapped out for any chromium browser? I try not to use any google services (within reason, still need a gmail for work).
I don’t see why it wouldn’t be possible to swap out the docker container running Chrome with another that is running Chromium or Firefox. The only interaction with the browser itself is via Selenium, which should be agnostic. I just went with what ChatGPT was able to suggest immediately.
I should clarify this is running a headless browser, so you don’t actually need Chrome installed, it exists entirely within the confines of the container and is completely ephemeral. You could also modify this to work with the standard Selenium webdriver and your installed browser of choice, but I made this with the intention of running it on my server rather than my personal machine.
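To show what I mean by Selenium being agnostic, here is a minimal sketch of connecting to a browser container. It assumes the container is reachable on `localhost:4444`, the port the `selenium/standalone-chrome` image listens on by default; the helper names are made up and this is not the code from my gist.

```python
# Maps a browser name to the name of the Selenium options class used for it.
# Chromium is driven through the same options class as Chrome.
OPTIONS_CLASS = {
    "chrome": "ChromeOptions",
    "chromium": "ChromeOptions",
    "firefox": "FirefoxOptions",
}


def grid_url(host: str = "localhost", port: int = 4444) -> str:
    """Build the WebDriver endpoint URL for a standalone Selenium container."""
    return f"http://{host}:{port}/wd/hub"


def open_remote_browser(browser: str = "chrome", host: str = "localhost", port: int = 4444):
    """Open a session in whichever browser the container at host:port provides."""
    # Imported here so the helpers above can be used without Selenium installed.
    from selenium import webdriver

    options = getattr(webdriver, OPTIONS_CLASS[browser])()
    return webdriver.Remote(command_executor=grid_url(host, port), options=options)
```

Swapping browsers should then mostly come down to running a different image (for example `selenium/standalone-firefox`) and calling `open_remote_browser("firefox")`. The rest of the script only talks to the returned driver, so the login and scraping code should not need to change. Remember to call `driver.quit()` when done so the session is released.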
I would also be running on a server if I do this, and I know having it containerised would be fine privacy wise, it was more just curiosity about why you went with chrome. Makes sense that ChatGPT went with chrome though, since it’s the most used browser in the world at the moment.
How is the music quality that’s downloaded determined? (It could be somewhere in your script, haven’t looked at those yet).
Both spotdl and pytube are downloading from Youtube as their source, my understanding is they’re able to grab 320kbps audio if it’s available. It’s no FLAC ripped from CD, but it’s good enough for my use case since I don’t want to drag torrenting or usenet into my VPS
tx 4 sharing op. I always dream of making some sort of app/script that can play/stream songs from a radio stations playlist, but my programming skills are noobish and only have access to free ai. sounds like gpt 4 is something really useful for scripts.
ChatGPT and GPT in general is like having a pair programming intern who simultaneously knows everything but is also capable of making really dumb and obvious mistakes. When combined with the new code interpreter it’s crazy powerful, but at the end of the day to get the best results you need a skilled human operator guiding its outputs.
That said, give it a shot as it can definitely help you to improve your skill by explaining what a block of code does.
👍
I’ve got Navidrome feeding Substreamer on iOS. Works pretty well only it has a bit of a buffering lead in if I get skip happy or there’s too many FLAC files in a row.
I just manually feed it on the backend though. Will look at lidarr one day.
Very interested in Azuracast.
pretty cool! how good are the songs downloaded with spotdl? Also ever heard of soulseek? You can selfhost that i believe.
spotdl, like pytube, is looking up the tracks on Youtube and then using youtube-dl to grab the audio. It’s not FLAC, but it’s perfectly good for my needs.
Interesting
I have something similar but without the Discover Weekly and Last.fm parts. Awesome idea. Guess that’ll be this week’s project.