Visualisation — Make the Table Talk
.plot() for quick looks; seaborn makes statistical charts beautiful with one call; matplotlib is the underlying canvas you tweak (titles, labels) and save to a file.
The fastest chart — straight off a DataFrame
pandas wires into matplotlib, so any Series or DataFrame can plot itself. This is your exploratory reflex — see the shape before you polish anything.
import pandas as pd
import matplotlib.pyplot as plt
df = pd.DataFrame({
"month": ["Jan","Feb","Mar","Apr","May"],
"revenue": [1200, 1800, 1500, 2100, 2400],
})
df.plot(x="month", y="revenue", kind="line", marker="o")
plt.show()
Swap kind= for the question you're asking: "line" for a trend over time, "bar" for comparing categories, "hist" for a distribution, "scatter" for the relationship between two numeric columns (scatter needs both x= and y=). In a notebook the chart renders inline; in a script, plt.show() opens a window.
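To see the four kinds side by side, here is a minimal sketch that draws each one into its own panel of a 2x2 grid. The "spend" column is made up for the scatter panel, and the Agg backend is an assumption so the sketch runs without a display; in a notebook you can drop that line.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Same monthly data as above, plus a hypothetical "spend" column for the scatter
df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "spend": [300, 450, 380, 520, 600],
    "revenue": [1200, 1800, 1500, 2100, 2400],
})

fig, axes = plt.subplots(2, 2, figsize=(9, 6))

# ax= tells pandas which panel to draw into; title= labels the panel
df.plot(x="month", y="revenue", kind="line", marker="o",
        ax=axes[0, 0], title="line: trend")
df.plot(x="month", y="revenue", kind="bar",
        ax=axes[0, 1], title="bar: comparison")
df["revenue"].plot(kind="hist", bins=5,
                   ax=axes[1, 0], title="hist: distribution")
df.plot(x="spend", y="revenue", kind="scatter",
        ax=axes[1, 1], title="scatter: relationship")

fig.tight_layout()
```

Passing ax= is the trick worth remembering: pandas draws onto whatever matplotlib axes you hand it, so the same call works in a single chart or a grid.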
🐘 PHP: To chart in PHP you'd ship data to a JS library like Chart.js in the browser — like the live timer you wired into the dashboard. Here the chart is generated server-side in a few lines and saved as a PNG you can drop into a report or email. No front end required.
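As a rough illustration of the "no front end" point, the sketch below renders the chart straight into PNG bytes in memory, which is roughly what a web handler or email job would want. The Agg backend and the in-memory buffer are choices made for this example, not something the chapter requires.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, as on a server
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "revenue": [1200, 1800, 1500, 2100, 2400],
})

ax = df.plot(x="month", y="revenue", kind="line", marker="o")
fig = ax.get_figure()

# Write the PNG into a bytes buffer instead of opening a window
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=150)
png_bytes = buf.getvalue()

plt.close(fig)  # free the figure once it has been rendered
```

From here png_bytes can be attached to an email or returned from an HTTP endpoint; writing it to disk with a filename works just as well.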
Common business charts, paired to their question
# Trend: how did revenue move over time?
df.plot(x="month", y="revenue", kind="line", marker="o")
# Comparison: which region sold most? (group first, then plot)
# (assumes a sales DataFrame with "region" and "revenue" columns)
by_region = sales.groupby("region")["revenue"].sum()
by_region.sort_values().plot(kind="barh") # horizontal bars read easily
# Distribution: what does order size look like?
sales["revenue"].plot(kind="hist", bins=20)
# Composition: share of total (use sparingly — bars usually beat pies)
by_region.plot(kind="pie", autopct="%1.0f%%")
The pattern to notice: aggregate first, then plot the result. A chart is usually the picture of a groupby, so your Chapter 7 skills feed directly in here.
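Here is the aggregate-then-plot pattern as a self-contained sketch. The sales rows are invented for illustration; only the column names "region" and "revenue" come from the snippets above.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so this runs anywhere
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical order-level data: one row per order
sales = pd.DataFrame({
    "region": ["North", "South", "East", "North", "South", "East"],
    "revenue": [1200, 800, 2100, 1500, 900, 2300],
})

# Step 1: aggregate. One number per region.
by_region = sales.groupby("region")["revenue"].sum()

# Step 2: plot the aggregate. Sorting first puts the biggest bar at the top.
ordered = by_region.sort_values()
ax = ordered.plot(kind="barh")
ax.set_xlabel("Revenue")
```

Keeping the aggregate in its own variable (by_region) means you can print it, check it, and reuse it for a second chart before anything is drawn.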
seaborn — statistical charts that look good by default
seaborn sits on top of matplotlib and is built for exactly this kind of data — it takes a DataFrame and column names, and handles the styling for you.
import seaborn as sns
sns.barplot(data=sales, x="region", y="revenue", estimator="sum", errorbar=None) # totals; errorbar=None hides the default bootstrap interval (seaborn 0.12+)
sns.histplot(data=sales, x="revenue", bins=20)
sns.scatterplot(data=sales, x="spend", y="revenue", hue="region")
sns.boxplot(data=sales, x="region", y="revenue") # spread + outliers per group
The hue= argument is the standout — colour the points by a third column and a flat scatter becomes a multi-group comparison for free. The boxplot is underrated for analytics: it shows median, spread, and outliers per category in one glance, which is often more honest than a bar of averages.
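To make the "more honest than a bar of averages" point concrete, here is a small sketch with invented numbers: two regions whose averages are nearly identical but whose typical order is very different. The orders DataFrame and its values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical orders: region A has one very large order, region B is steady
orders = pd.DataFrame({
    "region": ["A"] * 4 + ["B"] * 4,
    "revenue": [100, 110, 120, 900, 300, 310, 320, 330],
})

# The averages look alike, the medians do not
stats = orders.groupby("region")["revenue"].agg(["mean", "median"])
print(stats)

# The boxplot shows the same story visually: A's box sits low,
# with the 900 order drawn as a separate outlier point
fig, ax = plt.subplots()
sns.boxplot(data=orders, x="region", y="revenue", ax=ax)
ax.set_title("Revenue per order by region")
```

A bar chart of means would draw A and B at almost the same height; the boxplot makes it plain that A's figure rests on a single order.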
Polish — the parts that make it shareable
An untitled chart with axis labels like "0" is a draft. A few matplotlib calls turn it into something you'd put in front of leadership — and crucially, save to a file.
ax = by_region.sort_values().plot(kind="barh", color="steelblue")
ax.set_title("Q1 Revenue by Region")
ax.set_xlabel("Revenue ($)")
ax.set_ylabel("")
plt.tight_layout()
plt.savefig("revenue_by_region.png", dpi=150) # the deliverable
plt.show()
Always title the chart, label the axes in human terms, and savefig() at dpi=150 so it's crisp in a slide deck. tight_layout() stops labels getting clipped. That PNG is frequently the actual product of an analysis.
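One further polish step you may find useful is formatting the tick labels themselves, so the axis reads "$4,000" rather than "4000". The sketch below does this with matplotlib's StrMethodFormatter. The region totals are invented, and the temp-directory output path is an assumption made so the example runs anywhere.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import pandas as pd
from matplotlib.ticker import StrMethodFormatter

# Hypothetical aggregate, standing in for by_region above
by_region = pd.Series({"North": 2700, "South": 1700, "East": 4400})

fig, ax = plt.subplots(figsize=(6, 3.5))
by_region.sort_values().plot(kind="barh", color="steelblue", ax=ax)

ax.set_title("Q1 Revenue by Region")
ax.set_xlabel("Revenue ($)")
ax.set_ylabel("")
# Show ticks as $1,000 / $2,000 instead of raw numbers
ax.xaxis.set_major_formatter(StrMethodFormatter("${x:,.0f}"))

fig.tight_layout()

# Save before showing or closing; the path here is just a temp location
out_path = Path(tempfile.gettempdir()) / "revenue_by_region.png"
fig.savefig(out_path, dpi=150)
plt.close(fig)
```

Calling savefig() before show() or close() matters in scripts: once the figure has been closed, there may be nothing left to save.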
A One-Page Visual Summary
Goal: go from a dataset to three polished, saved charts that together tell a story — a realistic analytics deliverable.
- Make some data with a relationship worth seeing:
import pandas as pd, seaborn as sns, matplotlib.pyplot as plt
df = pd.DataFrame({
"region": ["N","S","E","N","S","E","N","S"],
"spend": [200, 150, 400, 250, 180, 420, 300, 160],
"revenue":[1200, 800, 2100, 1500, 900, 2300, 1700, 850],
})
- Comparison — revenue by region:
sns.barplot(data=df, x="region", y="revenue", estimator="sum"), title it, savefig("c1.png")
- Relationship — does spend drive revenue?
sns.scatterplot(data=df, x="spend", y="revenue", hue="region"), save as c2.png
- Distribution — spread of revenue:
sns.boxplot(data=df, x="region", y="revenue"), save as c3.png
- Open the three PNGs — that's a visual narrative: who's biggest, what drives sales, how consistent each region is.
Comparison, relationship, distribution — three lenses on the same data. Picking the right chart for the question is the real craft, and you just practised all three.
sns.regplot(data=df, x="spend", y="revenue") fits and draws a regression line through the points. You're now visualising a prediction — a natural bridge into the final chapter, where you'll fit that line as a real model and use it to forecast.
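regplot draws the line but doesn't hand you its equation. As a hedged sketch, the snippet below draws the same chart and then uses numpy.polyfit to compute a straight-line least-squares fit on the exercise data, which should correspond to the line regplot draws with its default settings. Using polyfit here is a choice for this example, not the method of the final chapter.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

df = pd.DataFrame({
    "region": ["N", "S", "E", "N", "S", "E", "N", "S"],
    "spend": [200, 150, 400, 250, 180, 420, 300, 160],
    "revenue": [1200, 800, 2100, 1500, 900, 2300, 1700, 850],
})

# Scatter plus fitted line (the shaded band is seaborn's confidence interval)
fig, ax = plt.subplots()
sns.regplot(data=df, x="spend", y="revenue", ax=ax)
ax.set_title("Revenue vs spend")

# Degree-1 least-squares fit: revenue is approximately slope * spend + intercept
slope, intercept = np.polyfit(df["spend"], df["revenue"], deg=1)
print(f"revenue = {slope:.2f} * spend + {intercept:.2f}")

# Use the line to estimate revenue at a spend level of 350 (illustrative value)
estimate = slope * 350 + intercept
```

The slope is the number leadership usually cares about: roughly how much extra revenue goes with each extra unit of spend in this sample.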