Visualisation — Make the Table Talk

A summary table informs; a good chart persuades. This chapter turns your DataFrames into visuals with matplotlib and seaborn — the difference between handing a stakeholder a wall of numbers and handing them an insight they grasp in two seconds. For Business Analytics, the chart is often the whole deliverable.

pandas has built-in .plot() for quick looks; seaborn makes statistical charts beautiful with one call; matplotlib is the underlying canvas you tweak (titles, labels) and save to a file.

The fastest chart — straight off a DataFrame

pandas wires into matplotlib, so any Series or DataFrame can plot itself. This is your exploratory reflex — see the shape before you polish anything.

import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "month": ["Jan","Feb","Mar","Apr","May"],
    "revenue": [1200, 1800, 1500, 2100, 2400],
})

df.plot(x="month", y="revenue", kind="line", marker="o")
plt.show()

Swap kind= for the question you're asking: "line" for a trend over time, "bar" for comparing categories, "hist" for a distribution, "scatter" for the relationship between two numeric columns (scatter needs both x= and y=). In a notebook the chart renders inline; in a script, plt.show() opens a window and blocks until you close it.
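To see the kind= swap side by side, here is a minimal sketch that draws three kinds from the same small table. The "orders" column is a made-up addition so the scatter has a second number to work with, and the three-panel layout is just one convenient way to compare them.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "month": ["Jan", "Feb", "Mar", "Apr", "May"],
    "revenue": [1200, 1800, 1500, 2100, 2400],
    "orders": [30, 41, 36, 52, 58],  # hypothetical column, added for the scatter
})

# One figure, three panels. Passing ax= tells pandas which panel to draw on.
fig, axes = plt.subplots(1, 3, figsize=(12, 3.5))

df.plot(x="month", y="revenue", kind="bar", ax=axes[0], legend=False)
df["revenue"].plot(kind="hist", bins=5, ax=axes[1])
df.plot(x="orders", y="revenue", kind="scatter", ax=axes[2])

axes[0].set_title("bar: compare categories")
axes[1].set_title("hist: distribution")
axes[2].set_title("scatter: relationship")
fig.tight_layout()
```

Same data, three different questions answered. In a notebook the figure appears when the cell finishes; in a script, add plt.show() at the end.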

🐘 PHP: To chart in PHP you'd ship data to a JS library like Chart.js in the browser — like the live timer you wired into the dashboard. Here the chart is generated server-side in a few lines and saved as a PNG you can drop into a report or email. No front end required.

Common business charts, paired to their question

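# `sales` is assumed to be an order-level DataFrame with "region" and "revenue" columns.
# Run each chart in its own notebook cell, or call plt.figure() between them in a script;
# otherwise a later chart can land on the previous chart's axes.

# Trend: how did revenue move over time?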
df.plot(x="month", y="revenue", kind="line", marker="o")

# Comparison: which region sold most? (group first, then plot)
by_region = sales.groupby("region")["revenue"].sum()
by_region.sort_values().plot(kind="barh")     # horizontal bars read easily

# Distribution: what does order size look like?
sales["revenue"].plot(kind="hist", bins=20)

# Composition: share of total (use sparingly — bars usually beat pies)
by_region.plot(kind="pie", autopct="%1.0f%%")

The pattern to notice: aggregate first, then plot the result. A chart is usually the picture of a groupby, so your Chapter 7 skills feed directly in here.
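Here is that aggregate-then-plot pattern as a small runnable sketch. The `sales` rows are invented stand-ins for a real order table; only the two-step shape matters.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical order-level data standing in for the `sales` table.
sales = pd.DataFrame({
    "region": ["North", "South", "East", "North", "South", "East"],
    "revenue": [1200, 800, 2100, 1500, 900, 2300],
})

# Step 1: aggregate. One value per region, sorted so the bars read in order.
by_region = sales.groupby("region")["revenue"].sum().sort_values()

# Step 2: plot the aggregated Series, not the raw rows.
fig, ax = plt.subplots()
by_region.plot(kind="barh", ax=ax)
ax.set_xlabel("Revenue")
fig.tight_layout()
```

Six raw rows become three bars. If you plotted `sales` directly you would get one bar per order, which answers a different (and usually less useful) question.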

seaborn — statistical charts that look good by default

seaborn sits on top of matplotlib and is built for exactly this kind of data — it takes a DataFrame and column names, and handles the styling for you.

import seaborn as sns

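# Each call below is a separate chart: run one per notebook cell,
# or call plt.figure() before each one in a script.
sns.barplot(data=sales, x="region", y="revenue", estimator="sum")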
sns.histplot(data=sales, x="revenue", bins=20)
sns.scatterplot(data=sales, x="spend", y="revenue", hue="region")
sns.boxplot(data=sales, x="region", y="revenue")   # spread + outliers per group

The hue= argument is the standout — colour the points by a third column and a flat scatter becomes a multi-group comparison for free. The boxplot is underrated for analytics: it shows median, spread, and outliers per category in one glance, which is often more honest than a bar of averages.
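A minimal sketch of those two charts next to each other, using invented sample rows for `sales`. Passing ax= to each seaborn call is one way to keep the charts on separate panels instead of stacking them on the same axes.

```python
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Hypothetical sample rows; a real `sales` table would have many more.
sales = pd.DataFrame({
    "region": ["N", "S", "E", "N", "S", "E", "N", "S", "E"],
    "spend": [200, 150, 400, 250, 180, 420, 300, 160, 380],
    "revenue": [1200, 800, 2100, 1500, 900, 2300, 1700, 850, 2000],
})

fig, (ax_scatter, ax_box) = plt.subplots(1, 2, figsize=(10, 4))

# hue= colours each point by region and adds a legend automatically.
sns.scatterplot(data=sales, x="spend", y="revenue", hue="region", ax=ax_scatter)

# One box per region: median line, quartile box, whiskers, outlier points.
sns.boxplot(data=sales, x="region", y="revenue", ax=ax_box)

fig.tight_layout()
```

Read the left panel for "does spend track revenue, and is that true in every region?" and the right panel for "how consistent is each region?".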

Polish — the parts that make it shareable

An untitled chart with axis labels like "0" is a draft. A few matplotlib calls turn it into something you'd put in front of leadership — and crucially, save to a file.

ax = by_region.sort_values().plot(kind="barh", color="steelblue")
ax.set_title("Q1 Revenue by Region")
ax.set_xlabel("Revenue ($)")
ax.set_ylabel("")
plt.tight_layout()
plt.savefig("revenue_by_region.png", dpi=150)   # the deliverable
plt.show()

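Always title the chart, label the axes in human terms, and call savefig() with dpi=150 so it's crisp in a slide deck. Call savefig() before plt.show(): in a script the figure is discarded once the window closes, so saving afterwards can produce a blank image. tight_layout() stops labels getting clipped. That PNG is frequently the actual product of an analysis.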
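The same polish steps can also be written against an explicit figure object, which makes it obvious which chart is being saved. This is a sketch under assumptions: the `by_region` totals are invented, and the thousands-separator formatter is an optional extra, not part of the recipe above.

```python
from pathlib import Path

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.ticker import StrMethodFormatter

# Hypothetical regional totals standing in for the groupby result.
by_region = pd.Series({"North": 2700, "South": 1700, "East": 4400}, name="revenue")

fig, ax = plt.subplots(figsize=(6, 3.5))
by_region.sort_values().plot(kind="barh", color="steelblue", ax=ax)

ax.set_title("Q1 Revenue by Region")
ax.set_xlabel("Revenue ($)")
ax.set_ylabel("")
# Optional: show tick values as 4,000 instead of 4000.
ax.xaxis.set_major_formatter(StrMethodFormatter("{x:,.0f}"))

fig.tight_layout()
out = Path("revenue_by_region.png")
fig.savefig(out, dpi=150)  # save first...
plt.close(fig)             # ...then release the figure
```

Holding on to `fig` and `ax` costs one extra line and pays off as soon as a script produces more than one chart.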

A One-Page Visual Summary

Goal: go from a dataset to three polished, saved charts that together tell a story — a realistic analytics deliverable.

Make some data with a relationship worth seeing:

import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "region": ["N","S","E","N","S","E","N","S"],
    "spend":  [200, 150, 400, 250, 180, 420, 300, 160],
    "revenue":[1200, 800, 2100, 1500, 900, 2300, 1700, 850],
})

Comparison (revenue by region): start a fresh figure with plt.figure(), call sns.barplot(data=df, x="region", y="revenue", estimator="sum"), title it with plt.title(...), then plt.savefig("c1.png")
Relationship (does spend drive revenue?): start a new figure, call sns.scatterplot(data=df, x="spend", y="revenue", hue="region"), title it, save as c2.png
Distribution (spread of revenue): start a new figure, call sns.boxplot(data=df, x="region", y="revenue"), title it, save as c3.png
Open the three PNGs. That's a visual narrative: who's biggest, what drives sales, how consistent each region is

Comparison, relationship, distribution — three lenses on the same data. Picking the right chart for the question is the real craft, and you just practised all three.

You can produce quick exploratory plots off a DataFrame, reach for seaborn when you want statistical charts that look good, and polish + save a figure for an audience. You can now turn analysis into something a stakeholder actually feels.

Take the scatter from the project and add a trend line to make the relationship explicit: sns.regplot(data=df, x="spend", y="revenue") fits a straight line through the points and draws it, with a shaded confidence band around it. With only eight points, treat the line as a suggestion rather than proof. You're now visualising a prediction, a natural bridge into the final chapter, where you'll fit that line as a real model and use it to forecast.
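A short sketch of that stretch goal. regplot draws the fitted line but does not hand the fitted numbers back to you, so this example also uses np.polyfit to recover a slope and intercept for a straight-line fit on the same data. Using polyfit here is a choice made for this example, not something the project requires.

```python
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

df = pd.DataFrame({
    "region": ["N", "S", "E", "N", "S", "E", "N", "S"],
    "spend": [200, 150, 400, 250, 180, 420, 300, 160],
    "revenue": [1200, 800, 2100, 1500, 900, 2300, 1700, 850],
})

fig, ax = plt.subplots(figsize=(6, 4))
# Scatter points plus a fitted straight line and its confidence band.
sns.regplot(data=df, x="spend", y="revenue", ax=ax)
ax.set_title("Revenue vs. spend, with fitted line")
fig.tight_layout()

# Least-squares straight line through the same points, to read off the numbers.
slope, intercept = np.polyfit(df["spend"], df["revenue"], deg=1)
print(f"revenue ~ {slope:.2f} * spend + {intercept:.0f}")
```

The slope reads as "extra revenue per extra unit of spend" within this sample, which is exactly the quantity a forecasting model would lean on.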