How I got data off Spotify via the Spotify Web API.

by Penny Richmond

This post assumes you already have Anaconda and JupyterLab installed.

  1. First, install Spotipy with pip (enter pip install spotipy in the terminal). Spotipy is a thin Python wrapper around the Spotify Web API that makes the API easier to use.

  2. Open JupyterLab (run jupyter-lab from the terminal).

  3. Obtain your Client ID and Client Secret from Spotify.
    Go to https://developer.spotify.com/ and log in using your Spotify details. Go to 'Dashboard', then 'Create An App' or 'My New App'. After creating the app you'll see the Client ID and Client Secret underneath the app name.

    Like so: [screenshot of the app page showing the Client ID and Client Secret]

  4. To import and connect to Spotipy, go to JupyterLab and copy and paste:
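import spotipy
from spotipy.oauth2 import SpotifyClientCredentials
import pandas as pd

# Your own credentials go here in place of X and Y
CLIENT_ID = "X"
CLIENT_SECRET = "Y"

# Pass the variables themselves, not the literal strings "CLIENT_ID" and "CLIENT_SECRET"
sp = spotipy.Spotify(auth_manager=SpotifyClientCredentials(client_id=CLIENT_ID, client_secret=CLIENT_SECRET))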

Replace X and Y with the strings you just obtained from https://developer.spotify.com/.
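
If you'd rather not paste your secret straight into a notebook, one option is to read both values from environment variables. Here is a minimal sketch, assuming you have set SPOTIPY_CLIENT_ID and SPOTIPY_CLIENT_SECRET (the names Spotipy itself looks for) before starting JupyterLab. load_credentials is a made-up helper name, not part of Spotipy:

```python
import os


def load_credentials(env=os.environ):
    """Return (client_id, client_secret) read from environment variables."""
    try:
        return env["SPOTIPY_CLIENT_ID"], env["SPOTIPY_CLIENT_SECRET"]
    except KeyError as missing:
        # Fail early with a readable message instead of a confusing auth error later
        raise RuntimeError(f"Environment variable {missing} is not set") from None


# In the notebook, once the variables are set:
# CLIENT_ID, CLIENT_SECRET = load_credentials()
```

The env argument is only there so the helper can be tried out with a plain dict.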

  5. Now for getting the data!
    Create a function to retrieve data from the API:

def analyze_playlist(user_id, playlist_id):

    # Column names: four metadata fields followed by the audio features
    playlist_features_list = ["artist", "album", "track_name", "track_id", "danceability", "energy", "key", "loudness", "mode", "speechiness", "instrumentalness", "liveness", "valence", "tempo", "duration_ms", "time_signature"]

    # Create empty dataframe
    playlist_df = pd.DataFrame(columns=playlist_features_list)

    # Loop through every track in the playlist, extract features and append the features to the playlist df.
    # user_playlist_tracks returns one page of results (up to 100 tracks), with the tracks under "items".
    playlist = sp.user_playlist_tracks(user_id, playlist_id)["items"]
    for track in playlist:
        # Create empty dict
        playlist_features = {}

        # Get metadata
        playlist_features["artist"] = track["track"]["album"]["artists"][0]["name"]
        playlist_features["album"] = track["track"]["album"]["name"]
        playlist_features["track_name"] = track["track"]["name"]
        playlist_features["track_id"] = track["track"]["id"]

        # Get audio features (everything after the four metadata columns)
        audio_features = sp.audio_features(playlist_features["track_id"])[0]
        for feature in playlist_features_list[4:]:
            playlist_features[feature] = audio_features[feature]

        # Concat the dfs
        track_df = pd.DataFrame(playlist_features, index=[0])
        playlist_df = pd.concat([playlist_df, track_df], ignore_index=True)

    return playlist_df

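
One thing to watch: Spotify hands back playlist tracks a page at a time, so the function above only sees the first page of a long playlist. Here is a rough sketch of collecting every page, assuming Spotipy's playlist_items and next methods. get_all_playlist_items is a made-up helper name, and sp is the client created in step 4:

```python
def get_all_playlist_items(sp, playlist_id):
    """Collect the items from every page of a playlist, not just the first."""
    page = sp.playlist_items(playlist_id)
    items = list(page["items"])

    # Each page carries a "next" URL; sp.next() follows it and returns
    # the following page. The last page has "next" set to None.
    while page["next"]:
        page = sp.next(page)
        items.extend(page["items"])

    return items
```

The returned list has the same shape as the playlist variable in analyze_playlist, so it could be looped over in the same way.
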
  6. Go to Spotify and get the playlist ID (or the artist ID or track ID).

To get the playlist ID, go to the playlist and click on the three little dots next to the "<3". Go to "Share", then "Copy Playlist Link".


The playlist link looks like this:
https://open.spotify.com/playlist/37i9dQZF1DZ06evO3ZnsAw?si=CYGg2uRvRnaXCf6UP02dVA
We want the part between "playlist/" and the "?": 37i9dQZF1DZ06evO3ZnsAw. This is our playlist_id. We will need it when we call our function to retrieve the data.
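
If you don't fancy picking the ID out by hand, it can be pulled from the link with a few lines of standard-library Python. This is just a sketch. playlist_id_from_link is a made-up helper name, and it assumes the link has the /playlist/<id> shape shown above:

```python
from urllib.parse import urlparse


def playlist_id_from_link(link):
    """Pull the playlist ID out of a 'Copy Playlist Link' URL."""
    # The path looks like /playlist/<id>; the ?si=... query string is ignored
    parts = urlparse(link).path.strip("/").split("/")
    if len(parts) < 2 or parts[-2] != "playlist":
        raise ValueError(f"Not a playlist link: {link}")
    return parts[-1]


link = "https://open.spotify.com/playlist/37i9dQZF1DZ06evO3ZnsAw?si=CYGg2uRvRnaXCf6UP02dVA"
print(playlist_id_from_link(link))  # 37i9dQZF1DZ06evO3ZnsAw
```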

To get your User ID, click the little down arrow next to your name in your Spotify window, then click Account.

This should take you to a web page featuring your account overview, and your User ID should be at the top as "username".

  7. Call the function using

df = analyze_playlist("user_id", "playlist_id")

and pop in the user_id and playlist_id we found in step 6.

  8. Create an Excel file (or text file, or whatever you want) of the data using:

df.to_excel("dataframe.xlsx", index = False)
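
If you'd rather have a plain text file, pandas' to_csv works the same way (and unlike to_excel it doesn't need an Excel writer such as openpyxl installed). Here is a small sketch with a stand-in DataFrame. The two rows are made-up values, not real Spotify data:

```python
import pandas as pd

# Stand-in for the DataFrame returned by analyze_playlist (made-up values)
df = pd.DataFrame({"track_name": ["Song A", "Song B"], "tempo": [120.0, 98.5]})

# Same idea as to_excel, but comma-separated text; index=False drops the row numbers
df.to_csv("dataframe.csv", index=False)
```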

Et voilà!!

P.S. The data might need a bit of cleaning in Alteryx.


Thu 29 Apr 2021

Tue 20 Apr 2021