Spotify Scraper

This is the first real coding project I ever did outside of a programming class, and it was for a different class altogether: statistics.

In this class we were allowed to choose any data we wanted and test a hypothesis about it using a p-value (roughly, how likely you would be to see a result like yours by chance alone, which tells you whether it is statistically significant). One person looked at dice rolls in Dungeons & Dragons to see if someone was lucky, while someone else looked at what type of fish to expect at a certain lake. I chose the current Billboard Hot 100 and compared it to the most popular songs of the year 2000 to test the hypothesis that today's popular songs are shorter than the popular songs of 2000.
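To make the p-value idea concrete, here is a minimal sketch of one way to test "songs today are shorter". It uses a permutation test, which is my choice for illustration because it needs only the standard library; it is not necessarily the test used in the class. The function name `permutation_p_value` and the durations are made up for the example and are not the real data.

```python
import random
from statistics import mean


def permutation_p_value(new_lengths, old_lengths, n_resamples=10_000, seed=0):
    """One-sided p-value for the claim 'new songs are shorter than old songs'."""
    rng = random.Random(seed)  # fixed seed so the result is repeatable
    observed = mean(old_lengths) - mean(new_lengths)
    pooled = list(new_lengths) + list(old_lengths)
    n_new = len(new_lengths)

    at_least_as_extreme = 0
    for _ in range(n_resamples):
        # Shuffle the labels: if song era did not matter, any split is equally likely.
        rng.shuffle(pooled)
        diff = mean(pooled[n_new:]) - mean(pooled[:n_new])
        if diff >= observed:
            at_least_as_extreme += 1

    # Add-one correction so the estimate is never exactly zero.
    return (at_least_as_extreme + 1) / (n_resamples + 1)


# Made-up durations in seconds, for illustration only (not the project's data).
songs_today = [185, 172, 201, 160, 194, 178, 169, 188]
songs_2000 = [232, 245, 218, 260, 227, 241, 236, 250]

p = permutation_p_value(songs_today, songs_2000)
print(f"p-value: {p:.4f}")
```

A small p-value (commonly below 0.05) means a gap this large would be unlikely if era made no difference, so the "songs got shorter" hypothesis is supported.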

I chose popular songs because they reflect people's taste, and it would be hard to get a random sample of just any song out there. I first searched online for existing code so I wouldn't have to write everything myself. This crawler, https://github.com/KoreanThinker/billboard-json/blob/main/crawler/billboard-hot-100.ts, pulls the current Billboard Hot 100, so that was helpful. But I still needed the list for the year 2000, which was only available on a Wikipedia page, and in order to scrape that I had to spend 3 hours watching a YouTube tutorial about Python web scraping. Once I had the two lists, it was time to set up Spotipy (a Python library for the Spotify Web API) and the Spotify developer API to make an app that would cycle through the lists and return each song's length in a readable format for my presentation.
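The lookup step can be sketched roughly like this. It is not the original script: the helper names (`format_duration`, `lookup_duration_ms`, `song_lengths`, `make_client`) are hypothetical, and it assumes a Spotify developer app whose credentials are stored in the `SPOTIPY_CLIENT_ID` and `SPOTIPY_CLIENT_SECRET` environment variables. It also assumes the top search hit is the right track, which will not always be true.

```python
def format_duration(duration_ms):
    """Turn milliseconds into a readable m:ss string."""
    minutes, seconds = divmod(duration_ms // 1000, 60)
    return f"{minutes}:{seconds:02d}"


def lookup_duration_ms(sp, title, artist):
    """Return the top search hit's duration in ms, or None if nothing matches."""
    results = sp.search(q=f"track:{title} artist:{artist}", type="track", limit=1)
    items = results["tracks"]["items"]
    return items[0]["duration_ms"] if items else None


def song_lengths(sp, songs):
    """Map each (title, artist) pair to a readable duration string."""
    lengths = {}
    for title, artist in songs:
        duration_ms = lookup_duration_ms(sp, title, artist)
        if duration_ms is None:
            lengths[(title, artist)] = "not found"
        else:
            lengths[(title, artist)] = format_duration(duration_ms)
    return lengths


def make_client():
    """Build a Spotipy client (assumes the credential environment variables are set)."""
    # Imported here so the helpers above work even without Spotipy installed.
    import spotipy
    from spotipy.oauth2 import SpotifyClientCredentials

    return spotipy.Spotify(auth_manager=SpotifyClientCredentials())
```

With credentials in place, calling `song_lengths(make_client(), songs)` on a list of `(title, artist)` pairs returns a dictionary of song lengths. Keeping the client as a parameter also makes it easy to try the helpers with a stand-in object before touching the real API.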

In the end I realized it would have been several hours faster to just do it all manually. But through this experience I learned how cool it is to make something for yourself, and how working on something you actually care about keeps a project alive.