Spotify Top Hits Analysis

As a music lover, I've long wanted to apply data analysis to music. After exploring several music-related datasets, I came across the Spotify Top Hits dataset on Kaggle, which seemed like a good fit for my first music-related analysis.
Link to Original Dataset: Click Here
Questions to Answer
Below are questions I wanted to answer regarding the dataset:
1. What are the top 5 most popular songs?
2. What are the top 5 artists based on most hits?
3. What are the top 5 most popular genres?
   - 3.1 What is the trend of each of these genres over time?
   - 3.2 How danceable is each of these genres?
4. How has the average duration of songs changed over time?
5. How has the mood of songs changed over time?
Approach
Data Cleaning
- Performed data cleaning tasks using Pandas in a Jupyter Notebook
- Identified and removed outlier values
- Eliminated duplicate data entries
- Conducted a thorough check for null values to ensure data integrity
Link to Data Cleaning: Click Here
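The cleaning steps listed above might look roughly like the sketch below. This is not the notebook's actual code: the column names (`artist`, `song`, `duration_ms`), the helper `clean_tracks`, and the 1.5 × IQR rule for outliers are all assumptions made for illustration, and the toy rows are invented rather than taken from the dataset.

```python
import pandas as pd


def clean_tracks(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical helper: drop duplicates, nulls, and duration outliers."""
    # Remove rows that are exact duplicates of an earlier row.
    df = df.drop_duplicates()
    # Remove rows that contain any null value.
    df = df.dropna()
    # Assumed outlier rule: keep durations within 1.5 * IQR of the quartiles.
    q1 = df["duration_ms"].quantile(0.25)
    q3 = df["duration_ms"].quantile(0.75)
    iqr = q3 - q1
    in_range = df["duration_ms"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[in_range].reset_index(drop=True)


# Toy rows for illustration only; not real dataset values.
toy = pd.DataFrame({
    "artist": ["A", "A", "B", "C", "D", "E", "F"],
    "song": ["s1", "s1", "s2", "s3", "s4", "s5", "s6"],
    "duration_ms": [200000, 200000, 210000, 220000, 230000, None, 3000000],
})

cleaned = clean_tracks(toy)
print(cleaned)
```

In the toy data, the repeated `s1` row, the row with a missing duration, and the 50-minute track are the ones removed. Other outlier rules (for example a z-score threshold) would work equally well here.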
Analysis
- Performed analysis on the cleaned dataset using Pandas in a Jupyter Notebook
- Used data visualization libraries such as Matplotlib and Seaborn to plot the findings
- Used NumPy and scikit-learn for linear regression to identify possible trends in the data
Link to Analysis: Click Here
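As a rough illustration of the analysis steps above, the sketch below counts hits per artist and fits a straight line to the yearly average song duration. It uses NumPy's `polyfit` for the regression instead of scikit-learn to keep the example short, so it is a stand-in rather than the notebook's method. The column names (`artist`, `year`, `duration_ms`), the helpers `top_n` and `yearly_trend`, and the toy rows are assumptions for illustration.

```python
import numpy as np
import pandas as pd


def top_n(df: pd.DataFrame, column: str, n: int = 5) -> pd.Series:
    """Hypothetical helper: the n most frequent values in a column."""
    return df[column].value_counts().head(n)


def yearly_trend(df: pd.DataFrame, value_col: str) -> tuple[float, float]:
    """Hypothetical helper: fit value = slope * year + intercept to yearly means."""
    yearly = df.groupby("year")[value_col].mean()
    years = yearly.index.to_numpy(dtype=float)
    values = yearly.to_numpy(dtype=float)
    # A degree-1 polynomial fit is an ordinary least-squares line.
    slope, intercept = np.polyfit(years, values, deg=1)
    return float(slope), float(intercept)


# Toy rows for illustration only; not real dataset values.
toy = pd.DataFrame({
    "artist": ["A", "A", "B", "A", "B", "C"],
    "year": [2000, 2000, 2001, 2001, 2002, 2002],
    "duration_ms": [240000, 260000, 240000, 240000, 230000, 230000],
})

top_artists = top_n(toy, "artist", n=2)
slope, intercept = yearly_trend(toy, "duration_ms")
print(top_artists)
print(f"Average duration changes by about {slope:.0f} ms per year")
```

A negative slope means the average song is getting shorter over time. The same `yearly_trend` call could be pointed at another numeric column, such as a mood-related audio feature, if the dataset provides one.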
You can click below to see the entire project repo on GitHub.