LISTener's Friend

Github Repo

A flexible, interactive program that converts album lists to Spotify playlists

Ingredients: Python, API, OAuth, Recursion, File I/O, Web Scraping, Selenium, Headless-Chrome, BeautifulSoup, HTML Parsing

Below you will find the current contents of the project README to which I've added some snippets of the relevant code. If you would like a closer look at the program and how it is written please take a look at the repo linked above.


Why? When exploring new music I've always preferred listening to albums rather than "top" tracks or algorithmically generated playlists. The problem is that building playlists manually, say from a list like this, takes a lot of copy-paste-searching and click-n-dragging, so I built this to make the process faster and easier.

Getting Started

  1. Clone or download a copy of this repo.
  2. Install the required dependencies: pip install python-dotenv spotipy selenium bs4
  3. Rename the included example.env to .env, change the default input file path if desired, and update with your API credentials.
  4. Run the program: python listeners_friend.py
  5. Select type of input source from given options
    • Option 1: Add a list of albums in the format artist - album, each on a new line, to input.txt
    • Options 2, 6, 8: Provide a URL for a supported source
    • Options 3, 4: No user input needed, fully automated
    • Options 5, 7: Interactive user input selection
  6. A new playlist will be created directly in your Spotify account, ready to play!
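The Option 1 input format (artist - album, one per line) can be sketched as a small parser. This is an illustration under stated assumptions, not the project's code: `parse_album_lines` is a hypothetical helper, and silently skipping lines that lack the " - " separator is an assumption about how malformed input might be handled.

```python
def parse_album_lines(lines):
    """Turn 'artist - album' lines into (artist, album) tuples.

    Hypothetical helper: blank lines and lines without the ' - '
    separator are skipped rather than raising an error.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # Split on the first ' - ' only, so album titles may contain ' - '
        artist, sep, album = line.partition(" - ")
        if not sep or not artist or not album:
            continue
        pairs.append((artist.strip(), album.strip()))
    return pairs


sample = [
    "Alice Coltrane - Journey in Satchidananda",
    "",
    "no separator here",
    "Can - Tago Mago - Remaster",
]
print(parse_album_lines(sample))
```

Splitting on the first separator only mirrors the `split(" - ", 1)` call the program uses when reading input.txt.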

How It Works

Basics

// utils.py
import re

import spotipy

# `today` (a date string used in file names) is assumed to be defined elsewhere in the project

# Write the missing items to a text file and prepend a summary to the playlist description
def handle_missing(missing, playlist_name, playlist_description, item_type):
    # Replace characters that are not allowed in file names
    clean_playlist_name = re.sub(r"[/\\?%*:|\"<>\x7F\x00-\x1F]", "-", playlist_name)
    with open(f"not_found_for_{clean_playlist_name}_{today}.txt", "w", encoding="utf-8") as file:
        file.write("Not Found:\n")
        for item in missing:
            file.write(item + "\n")
    # Pluralize the label when more than one item is missing
    if len(missing) > 1:
        item_type = item_type + "s"
    playlist_description = f"{len(missing)} {item_type} not found. {playlist_description}"
    print(f"Info on missing {item_type} saved to text file in directory...")
    return playlist_description

def album_search(playlist_name, playlist_description, input_list, spotify):
    # Collected locally so every call starts from empty lists
    album_uris = []
    missing = []
    # Search for each album
    print("Searching Spotify library for albums...")
    for artist, album in input_list:
        # Album-specific search using Spotify's "album:{query} artist:{query}" field filters
        # Returns only the top result
        result = spotify.search(q=f"album:{album} artist:{artist}", type="album", limit=1)
        if result['albums']['items']:
            album_uris.append(result['albums']['items'][0]['uri'])
        else:
            missing.append(f"{artist} - {album}")

    if missing:
        print(f"{len(missing)} albums weren't found...")
        playlist_description = handle_missing(missing, playlist_name, playlist_description, "album")

    track_uris = track_search(spotify, album_uris)
    return track_uris, playlist_description, len(input_list) - len(missing)

# Take album URIs (or artist/title pairs) and return track URIs
def track_search(spotify, album_uris=None, track_input=None, playlist_name=None, playlist_description=None):
    # Collected locally so every call starts from empty lists
    track_uris = []
    missing = []
    if album_uris:
        print("Searching for associated tracks...")
        # Get the tracks for each album URI
        for album in album_uris:
            tracks = spotify.album_tracks(album)['items']
            # Get the URIs for each track
            for track in tracks:
                track_uris.append(track['uri'])
        print(f"{len(album_uris)} albums have been converted to {len(track_uris)} tracks...")
        return track_uris
    elif track_input:
        print("Searching Spotify library for tracks...")
        for artist, title in track_input:
            result = spotify.search(q=f"artist:{artist} track:{title}", type="track", limit=1)
            if result['tracks']['items']:
                uri = result['tracks']['items'][0]['uri']
                if uri:
                    track_uris.append(uri)
            else:
                missing.append(f"{artist} - {title}")
        if missing:
            print(f"{len(missing)} tracks weren't found...")
            playlist_description = handle_missing(missing, playlist_name, playlist_description, "track")
        return track_uris, playlist_description
    # Nothing to search for: return an empty list instead of None
    return track_uris

# Split a track list larger than 11,000 (the max playlist size) into two playlists
# The size check itself happens in create_playlist
# You don't need more than two... right? 22,000 tracks seems like an OK maximum
def giant_check(track_uris, playlist_description, playlist_name, spotify):
    print('Uh oh, this list is too big to contain in a single playlist!')
    print(f"Splitting {len(track_uris)} tracks in two...")
    max_track_uris = track_uris[:11000]
    overflow_track_uris = track_uris[11000:]
    overflow_playlist_name = playlist_name + " (Part 2)"
    playlist_name = playlist_name + " (Part 1)"
    print(f"Creating {playlist_name}, which contains 11,000 tracks")
    create_playlist(max_track_uris, playlist_description, playlist_name, spotify)
    print(f"Creating {overflow_playlist_name}, which contains {len(overflow_track_uris)} tracks")
    create_playlist(overflow_track_uris, playlist_description, overflow_playlist_name, spotify)
    

# Break potentially huge list of tracks into easily manageable chunks
def chunk_list(lst, chunk_size):
    for i in range(0, len(lst), chunk_size):
        yield lst[i:i + chunk_size]

# Take track URIs and final playlist title & description
def create_playlist(track_uris, playlist_description, playlist_name, spotify):
    # Split into two playlists if maximum track count is hit;
    # giant_check calls create_playlist again for each half, so stop here
    if len(track_uris) > 11000:
        giant_check(track_uris, playlist_description, playlist_name, spotify)
        return
    if track_uris:
        user_id = spotify.current_user()["id"]
        playlist = spotify.user_playlist_create(user_id, playlist_name, public=True, description=playlist_description)
        print(f"Playlist {playlist_name} has been created...")
        track_uris = [uri for uri in track_uris if uri is not None]

        # Split to chunks of 100 tracks, the max allowed by the API in a single post
        for chunk in chunk_list(track_uris, 100):
            try:
                spotify.playlist_add_items(playlist['id'], chunk)
            except spotipy.exceptions.SpotifyException as e:
                print(f"An error occurred: {e}")
    else:
        print("List of track URIs is empty, nothing added to playlist.")
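The splitting and chunking above could be generalized along these lines. This is a minimal sketch, not the project's implementation: `split_for_playlists` is a hypothetical helper, and the 11,000 and 100 limits are simply carried over from the code above rather than checked against Spotify's documentation. It produces as many parts as needed instead of stopping at two.

```python
def chunk_list(lst, chunk_size):
    """Yield successive chunk_size-sized slices of lst."""
    for i in range(0, len(lst), chunk_size):
        yield lst[i:i + chunk_size]


def split_for_playlists(track_uris, playlist_name, max_tracks=11000):
    """Return (name, uris) pairs, one per playlist needed.

    Hypothetical helper: a list that fits in one playlist keeps its
    name; a longer list is split into '(Part N)' playlists.
    """
    parts = list(chunk_list(track_uris, max_tracks))
    if len(parts) <= 1:
        return [(playlist_name, list(track_uris))]
    return [(f"{playlist_name} (Part {n})", part)
            for n, part in enumerate(parts, start=1)]


# Placeholder URIs stand in for real Spotify track URIs
uris = [f"spotify:track:{i}" for i in range(25000)]
for name, part in split_for_playlists(uris, "Big List"):
    batches = list(chunk_list(part, 100))
    print(f"{name}: {len(part)} tracks in {len(batches)} API calls")
```

Reusing the same generator for both the playlist-size split and the 100-track batches keeps the two limits in one place each.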

Spotify API Limitations

Authorization

It is necessary to have Spotify API credentials stored in the .env file. If you need help, see the guide at the end of this doc for details.

The first time you run the program you'll be sent to a Spotify authorization page in your browser. It should ask whether you want to allow access for { whatever you named your app when getting your API credentials }. After you approve, you'll be routed to your Redirect URI. Copy the full URL and paste it into the command prompt to finalize authorization. Your OAuth token will be stored in the .cache file.

import os

from dotenv import load_dotenv

import spotipy
from spotipy.oauth2 import SpotifyOAuth

# Read the credentials from the .env file into the environment
load_dotenv()

spotify = spotipy.Spotify(auth_manager=SpotifyOAuth(
    client_id=os.getenv("SPOTIPY_CLIENT_ID"),
    client_secret=os.getenv("SPOTIPY_CLIENT_SECRET"),
    redirect_uri=os.getenv("SPOTIPY_REDIRECT_URI"),
    scope="playlist-modify-public"
))
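Before starting the OAuth flow it may help to confirm that the .env values actually loaded. A minimal sketch: `missing_credentials` is a hypothetical helper that is not part of the project, and only the three variable names are taken from the code above.

```python
import os

REQUIRED_VARS = (
    "SPOTIPY_CLIENT_ID",
    "SPOTIPY_CLIENT_SECRET",
    "SPOTIPY_REDIRECT_URI",
)


def missing_credentials(env=None):
    """Return the names of required variables that are unset or blank."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not (env.get(name) or "").strip()]


missing = missing_credentials()
if missing:
    print("Missing from .env: " + ", ".join(missing))
else:
    print("All Spotify credentials found.")
```

Run it after `load_dotenv()` so the check sees the values from the .env file.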

Options

// utils.py
def display_options(options):
    # Check if this is just a bunch of strings to print or something more complex
    if isinstance(options[0], str):
        for idx, option in enumerate(options, start=1):
            print(f"{idx}. {option}")
    # Right now this is just for displaying lists of NTS Episodes
    else:
        for idx, option in enumerate(options, start=1):
            # Render tags as "#tag1 #tag2 ..."
            tags = " #".join(str(tag) for tag in option["tags"])
            print(f"{idx}. {option['date']} {option['title']}, {option['location']} #{tags}")
        
def get_user_selection(options):
    while True:
        display_options(options)
        try:
            print()
            selected_option = int(input("Please select an option: "))
            if 1 <= selected_option <= len(options):
                return selected_option
            else:
                print(f"Invalid selection. Please choose a number between 1 and {len(options)}.")
        except ValueError:
            print("Invalid input. Please enter a number.")
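The validation loop in get_user_selection can be exercised without a keyboard by passing the reader in as a parameter. A minimal sketch: `choose` and its `read`/`write` parameters are hypothetical and not part of the project, and the two option labels are placeholders.

```python
def choose(options, read=input, write=print):
    """Prompt until the reader returns a number between 1 and len(options)."""
    while True:
        for idx, option in enumerate(options, start=1):
            write(f"{idx}. {option}")
        try:
            selected = int(read("Please select an option: "))
        except ValueError:
            write("Invalid input. Please enter a number.")
            continue
        if 1 <= selected <= len(options):
            return selected
        write(f"Invalid selection. Please choose a number between 1 and {len(options)}.")


# Simulate a user who types "abc", then "9", then "2"
answers = iter(["abc", "9", "2"])
picked = choose(
    ["Text file", "RateYourMusic list"],
    read=lambda _prompt: next(answers),
    write=lambda _msg: None,
)
print(picked)
```

The first two answers are rejected (not a number, then out of range) and the loop only returns once a valid choice arrives.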

Option 1: Use txt file

// main.py
if selected_option == 1:
   with open(os.getenv("INPUT_PATH"), "r") as file:
      input_list = [tuple(line.strip().split(" - ", 1)) for line in file if line.strip()]
      playlist_name = input("Enter playlist name: ")
      playlist_description = input("Enter playlist description: ")

// ... snip ...

if output_type == 'albums':  
   print("Creating an album-based playlist")
   track_uris, playlist_description, count = album_search(playlist_name, playlist_description, input_list, spotify)

// ... snip ...

create_playlist(track_uris, playlist_description, playlist_name, spotify)
print(f"Playlist \"{playlist_name}\" has been successfully created!")
if output_type == 'tracks':
   print(f"It contains {len(track_uris)} tracks!")
elif output_type == 'albums':
   print(f"It contains {count} albums for a total of {len(track_uris)} tracks!")
print("Get to listening!")

Options 2-8: Scraped Web Data Input

Lists hosted on supported websites are loaded with Selenium (driving headless Chrome) and parsed with BeautifulSoup to build the final input list of (artist, album) pairs.

// scraper.py

import os
from dotenv import load_dotenv
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options
from bs4 import BeautifulSoup

load_dotenv()

def get_soup(url):
    print("Fetching data...")
    chrome_options = Options()
    chrome_options.add_argument("--headless")
    chrome_options.add_argument(f"user-agent={os.getenv('USER_AGENT')}")
    chrome_options.add_argument("--no-sandbox")
    chrome_options.add_argument("--disable-dev-shm-usage")

    driver = webdriver.Chrome(service=Service(), options=chrome_options)
    # Always close the browser, even if the page fails to load
    try:
        driver.get(url)
        html = driver.page_source
    finally:
        driver.quit()

    return BeautifulSoup(html, 'html.parser')

If you are running the program in Bash you may have difficulty entering a URL for options 2 or 8!

Option 2: Use RateYourMusic List URL
// option_handlers.py
def handle_rym_list(soup):
    global page
    print(f"Processing page {str(page)}...")
    playlist_name = soup.find('h1').get_text(strip=True)
    playlist_description = soup.find('span', class_='rendered_text')
    if playlist_description is None:
        playlist_description = ''
    else:
        playlist_description = playlist_description.get_text(strip=True)
    # Truncate long descriptions to fit within Spotify's 300 character limit
    # Also taking into consideration the 19-21 chars of "### items not found. " prepended later
    playlist_description = (playlist_description[:276] + '...') if len(playlist_description) > 279 else playlist_description
    
    has_next_page = False
    next_url = None
    
    nav_div = soup.find('div', id='nav_bottom')
    if nav_div:
        nav_span = nav_div.find('span', class_='navspan')
        if nav_span:
            navlink_next = nav_span.find('a', class_='navlinknext')
            if navlink_next:
                next_url = navlink_next['href']
                if next_url:
                    next_url = "https://rateyourmusic.com" + next_url
                    has_next_page = True
        
    # Named album_table to avoid shadowing the built-in list
    album_table = soup.find('table', id='user_list')
    if album_table is None:
        print(f"List not found on page {page}, potential captcha block")
    else:
        for row in album_table.find_all('tr'):
            artist_tag = row.find('a', class_='list_artist')
            album_tag = row.find('a', class_='list_album')
            if not artist_tag or not album_tag:
                continue
            artist = artist_tag.get_text(strip=True)
            album = album_tag.get_text(strip=True)
            input_list.append((artist, album))
    # Recurse into the next page; its rows are appended to the shared input_list
    if has_next_page:
        page = page + 1
        next_bowl_of_soup = get_soup(next_url)
        handle_rym_list(next_bowl_of_soup)
    return playlist_name, playlist_description, input_list
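The truncation arithmetic in handle_rym_list can be checked in isolation. A minimal sketch: `fit_description` is a hypothetical helper, and the 300-character limit and 21-character reserve are taken from the comments in the handler above, not verified against Spotify's documentation.

```python
def fit_description(text, limit=300, reserved=21):
    """Trim text so that a prefix of up to `reserved` characters still fits.

    Text longer than the remaining budget is cut and ends in '...'.
    """
    budget = limit - reserved
    if len(text) <= budget:
        return text
    return text[:budget - 3] + "..."


long_text = "x" * 500
trimmed = fit_description(long_text)
prefix = "12 albums not found. "  # the kind of note handle_missing prepends
print(len(trimmed), len(prefix + trimmed))
```

With the defaults this keeps 276 characters plus the ellipsis, the same cut the handler makes, so a 21-character "not found" note lands exactly on the limit.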
Option 3: Use Current Boomkat Bestsellers List
// option_handlers.py
def handle_boomkat(soup):
    print(f"Processing Boomkat Bestsellers list for the week ending {today}...")
    table = soup.find('div', class_='bestsellers')
    # Only look for the list when the table exists, otherwise .find() would fail on None
    bestsellers_list = table.find('ol', class_='bestsellers-list') if table else None
    if table is None:
        print("Table not found")
    elif bestsellers_list is None:
        print("List not found")
    else:
        for item in bestsellers_list.find_all('li', class_='bestsellers-item'):
            # The first link holds the artist, the second the album
            links = item.find('div', class_='product-name').find_all('a')
            artist = links[0].text.strip().title()
            album = links[1].text.strip()
            input_list.append((artist, album))
    playlist_name = "This Week's Boomkat Bestsellers"
    playlist_description = "For the week ending " + today
    return playlist_name, playlist_description, input_list

Option 4: Use Current Forced Exposure Bestsellers List

(Code to handle this is conceptually the same as the above Boomkat list handler with modifications to handle the differing page structure)

Options 5 & 6: Browse and Select from WFMU's "Heavy Play" Archive

These can be pretty huge (as in a few thousand songs) so they take a little longer to build than other options and the playlists themselves can be a little slow in your Spotify client!

// option_handlers.py
def handle_wfmu_latest(soup):
    
    year = input("For what year (2014-present): ")
    print("Select a date: ")
    
    # List available dates for selected year as YYYY-MM-DD
    for a_tag in soup.find_all("a", class_="playlist"):
        href = a_tag.get("href", "")  # fall back to "" so re.search never receives None
        match = re.search(r"/(\d{4})/", href)
        if match:
            url_year = match.group(1)
            if url_year == year:
                date_match = re.search(r"(\d{4}-\d{2}-\d{2})\.html", href)
                if date_match:
                    date = date_match.group(1)
                    print(date)
    date = input("Enter selection as YYYY-MM-DD: ")
    sub_url = "http://blogfiles.wfmu.org/BT/Airplay_Lists/" + year + "/" + date + ".html"
    return handle_wfmu_list(get_soup(sub_url))
    

def handle_wfmu_list(soup):
    date = soup.title.text[28:]
    print(date)
    playlist_name = "WFMU Heavy Play " + date
    playlist_description = ""
    for ul in soup.find_all('ul'):
        if any(li.find('strong') for li in ul.find_all('li')):
            for li in ul.find_all('li'):
                match = re.match(r'^(.*?) - (.*?) \((.*?)\)$', li.text)
                if match:
                    artist_name, album_title, record_label = match.groups()
                    input_list.append((artist_name.title(), album_title))
    return playlist_name, playlist_description, input_list
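The line pattern used in handle_wfmu_list can be tried on its own. A minimal sketch: `parse_heavy_play_line` is a hypothetical helper, the sample strings are made up, and the extra `strip()` is an addition that the handler above does not perform.

```python
import re

# Same pattern as handle_wfmu_list: "Artist - Album (Label)"
LINE_PATTERN = re.compile(r'^(.*?) - (.*?) \((.*?)\)$')


def parse_heavy_play_line(text):
    """Return (artist, album) for a matching line, else None."""
    match = LINE_PATTERN.match(text.strip())
    if not match:
        return None
    artist, album, _label = match.groups()
    return artist.title(), album


print(parse_heavy_play_line("SOME ARTIST - Some Album (Some Label)"))
print(parse_heavy_play_line("a line without a label"))
```

Because the album group is lazy, a title that itself contains a parenthesized part (for example "Some Album (Deluxe) (Some Label)") is cut at the first " (", which may be worth keeping in mind when an album is reported as not found.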
Options 7 & 8: Browse and Select from Recent NTS Radio Broadcasts

(Code to handle this is conceptually the same as the above WFMU list handler with modifications to handle the differing page structures)

How to Get Spotify API Credentials

To get the necessary info for your .env file you'll first need a (free) Spotify Developer account.

  1. After logging in and landing on the dev dashboard click Create app.

    Screenshot of the Create app screen from the Spotify Developer website
  2. Fill out the required fields:

    • Give your app a name (e.g. Text-to-Playlist App) and a brief description, maybe something to remind you why you made it.
    • For the Redirect URI you can supply your own or just use https://example.org/callback. Click Add.
    • Check the box for Web API access and save.
  3. After creating the app you'll be taken to its dashboard. Click Settings in the top right corner. Everything you need for your .env file is here on this page:

    Screenshot of the app settings screen from the Spotify Developer website
  4. Copy the Client ID and Client Secret (click View client secret) to your .env file. If you forgot what Redirect URI you chose earlier you can also grab that from here. The example.env is prepopulated with https://example.org/callback.

  5. You're ready to start building playlists!

