Stack Overflow Trends is a great way to track how different technologies are evolving over time. But if you’ve ever wanted to analyze that data yourself, you might have noticed that there’s no download button in sight. Frustrating, right?
I found myself in this exact situation when I wanted to dig into the trends for Apache Kafka. No CSV, no Excel export, nothing. But there’s a way around it. Here’s how you can extract the data directly from the API using the Mac terminal and convert it into a CSV file for easy analysis.
Prerequisites
Before diving in, make sure you have the following installed:
- Homebrew: The package manager for macOS. Install it with:
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
- Python and Pandas: If you don’t have Python installed, get it using:
brew install python3
Then install Pandas:
python3 -m pip install pandas
- wget: To download the JSON response:
brew install wget
Step 1: Download the JSON Data
The Stack Overflow Trends site doesn’t provide direct download links, but we can grab the data using wget:
wget -O apache-kafka-response.json \
--header='Content-Type:application/json' \
'https://trends.stackoverflow.co/api/trends/'
Note: The ?tags= parameter does not work as expected, so we grab the entire dataset. The script in the next step converts every tag into its own CSV column, so you can pick out the one you need afterward.
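Before converting, it can help to confirm that the downloaded file has the shape the conversion script expects. The sketch below assumes the layout that the Step 2 script reads: parallel Year and Month lists plus a TagPercents dictionary of per-tag lists. The function name summarize_trends and the sample values are made up for illustration and are not taken from the real response.

```python
import json


def summarize_trends(data):
    """Return (month_count, tag_count) for a Trends-style payload.

    Assumes 'Year' and 'Month' are parallel lists and 'TagPercents'
    maps each tag name to a list of monthly values.
    """
    for key in ("Year", "Month", "TagPercents"):
        if key not in data:
            raise KeyError(f"missing top-level key: {key}")
    if len(data["Year"]) != len(data["Month"]):
        raise ValueError("'Year' and 'Month' have different lengths")
    return len(data["Year"]), len(data["TagPercents"])


# Hand-made sample for illustration only, not real Trends data.
sample = json.loads(
    '{"Year": [2020, 2020], "Month": [1, 2],'
    ' "TagPercents": {"apache-kafka": [0.5, 0.6], "python": [10.0]}}'
)
month_count, tag_count = summarize_trends(sample)
print(f"{month_count} months, {tag_count} tags")
```

To check the real download, load apache-kafka-response.json with json.load and pass the result to summarize_trends. A KeyError here would suggest the response layout differs from what this walkthrough assumes.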
Step 2: Convert JSON to CSV
Now, let’s write a simple Python script to parse the JSON and convert it to CSV using Pandas:
import json
import pandas as pd
# Load the JSON file
with open('apache-kafka-response.json', 'r') as file:
    data = json.load(file)
# Extract Year and Month
years = data['Year']
months = data['Month']
# Extract TagPercents keys
tags = list(data['TagPercents'].keys())
# Construct the data rows
rows = []
for i in range(len(years)):
    row = {'Year': years[i], 'Month': months[i]}
    for tag in tags:
        # Fill missing values with empty strings
        row[tag] = data['TagPercents'][tag][i] if i < len(data['TagPercents'][tag]) else ""
    rows.append(row)
# Create a DataFrame
df = pd.DataFrame(rows)
# Save to CSV
df.to_csv('apache-kafka-response.csv', index=False)
print("Conversion to CSV completed successfully.")
Step 3: Run the Script
Save the above script as convert.py and run it using:
python3 convert.py
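If the row-by-row loop in convert.py feels slow on the full dataset, the same table can be built column by column. This is only a sketch of an alternative, not a replacement for the script above: trends_to_frame is a hypothetical helper name, and it assumes every value under TagPercents is numeric (or null) and that Year and Month have the same length.

```python
import pandas as pd


def trends_to_frame(data):
    """Build one row per month and one column per tag."""
    n_months = len(data["Year"])
    columns = {
        "Year": pd.Series(data["Year"]),
        "Month": pd.Series(data["Month"]),
    }
    for tag, values in data["TagPercents"].items():
        # Trim anything beyond the known months; float64 keeps short
        # or empty lists numeric.
        columns[tag] = pd.Series(values[:n_months], dtype="float64")
    # Series of different lengths are aligned on their index, so tags
    # with fewer values than months get NaN in the missing rows.
    return pd.DataFrame(columns)
```

Calling trends_to_frame(data).to_csv('apache-kafka-response.csv', index=False) writes NaN as an empty field by default, so short lists still show up as blank cells, much like the empty-string fill in the loop.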
And that’s it!
You now have a CSV file that you can open in Excel, Google Sheets, or any data analysis tool you prefer. This workflow is a lifesaver for anyone who wants to do more than just visually track trends. It puts the raw data in your hands for deeper analysis.
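Since the CSV holds every tag, a natural next step is to pull out just the column you care about. Here is a minimal sketch, assuming the tag appears as a column named apache-kafka (check the CSV header, since that name is a guess based on the tag's usual spelling). The helper select_tag and the demo values are made up for illustration.

```python
import pandas as pd


def select_tag(df, tag):
    """Return the Year, Month, and one tag column, dropping empty months."""
    if tag not in df.columns:
        raise KeyError(f"tag column not found: {tag}")
    subset = df[["Year", "Month", tag]].copy()
    # Blank CSV cells load as NaN; drop months with no value for this tag.
    return subset.dropna(subset=[tag]).reset_index(drop=True)


# Hand-made frame standing in for pd.read_csv('apache-kafka-response.csv').
demo = pd.DataFrame(
    {
        "Year": [2020, 2020, 2020],
        "Month": [1, 2, 3],
        "apache-kafka": [0.5, None, 0.7],
        "python": [10.0, 10.1, 10.2],
    }
)
kafka = select_tag(demo, "apache-kafka")
print(kafka)
```

With the real file, replace demo with pd.read_csv('apache-kafka-response.csv') and save the result with kafka.to_csv(...) if you want a smaller file to share.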
Happy coding!