matplotlib styling
Level 2 · Lesson 6
Hook
The same data, charted with default matplotlib vs five small tweaks, looks like two different reports. None of these are decoration — each removes a question the reader would have to ask.
Concept
The five moves that turn a default chart into a clean one:
- Set the figure size. Default is square-ish and tiny. figsize=(8, 5) for most things, (10, 6) if you have long labels.
- Pick one color per series, intentionally. No rainbows. Honolulu Blue (#0076B6) for Lions, neutral gray for comparisons.
- Kill the top and right spines. They’re chartjunk.
- Add comparison context. A second team, a league average line, a previous-season ghost — whatever makes the number mean something.
- Annotate the point that matters. If the chart has a “story” data point, call it out with ax.annotate(...).
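The annotation move is the least self-explanatory of the five, so here is a minimal sketch of it. The weekly values are placeholders invented for illustration, not real stats, and the label offset in xytext is an arbitrary choice you would tune per chart.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

LIONS = "#0076B6"

# Placeholder values for illustration only, not real stats.
weeks = [1, 2, 3, 4, 5, 6]
yards = [40, 65, 52, 118, 70, 61]

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(weeks, yards, color=LIONS, marker="o", linewidth=2.5)

# Find the "story" point (here: the single best week) and label it.
peak_idx = max(range(len(yards)), key=yards.__getitem__)
peak_week, peak_yards = weeks[peak_idx], yards[peak_idx]
note = ax.annotate(
    f"Week {peak_week}: {peak_yards} yds",
    xy=(peak_week, peak_yards),                # the point being called out
    xytext=(peak_week + 0.4, peak_yards + 8),  # where the label text sits
    arrowprops=dict(arrowstyle="->", color="gray"),
    fontsize=9,
)

ax.set_ylim(0, 140)  # leave headroom so the label is not clipped
ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
plt.tight_layout()
```

One annotation per chart is usually enough. Once every point carries a label, none of them stands out.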
import matplotlib.pyplot as plt

LIONS = '#0076B6'
NEUTRAL = '#999999'

fig, ax = plt.subplots(figsize=(8, 5))
# ... plot calls here ...

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_title('Title that names the metric, the subject, and the period',
             fontsize=13, pad=12)
ax.set_xlabel('X label', fontsize=10)
ax.set_ylabel('Y label', fontsize=10)
ax.grid(axis='y', linestyle='--', alpha=0.4)
fig.text(0.99, 0.01, 'Source: nflverse', ha='right', fontsize=8, color='gray')
plt.tight_layout()

Lions example
Comparing Amon-Ra St. Brown’s 2024 weekly receiving yards against the NFC North WR1 group average:
import matplotlib.pyplot as plt
import pandas as pd
from sqlalchemy import create_engine

eng = create_engine("postgresql+psycopg://onepride:lions@localhost:5432/onepride")

q = """
WITH wr1 AS (
    SELECT recent_team, player_display_name,
           SUM(receiving_yards) AS total
    FROM weekly_stats
    WHERE season = 2024
      AND season_type = 'REG'
      AND position_group = 'WR'
      AND recent_team IN ('DET', 'GB', 'MIN', 'CHI')
    GROUP BY recent_team, player_display_name
),
top1 AS (
    -- DISTINCT ON is a Postgres extension that picks the first row per
    -- group given an ORDER BY. Equivalent to a window-rank + filter.
    SELECT DISTINCT ON (recent_team) recent_team, player_display_name
    FROM wr1
    ORDER BY recent_team, total DESC
)
SELECT ws.week, ws.player_display_name, ws.recent_team, ws.receiving_yards
FROM weekly_stats ws
JOIN top1 USING (recent_team, player_display_name)
WHERE ws.season = 2024
  AND ws.season_type = 'REG'
ORDER BY ws.week, ws.recent_team;
"""

df = pd.read_sql(q, eng)
arsb = df[df['player_display_name'] == 'Amon-Ra St. Brown']
peers = df[df['player_display_name'] != 'Amon-Ra St. Brown']
peer_avg = peers.groupby('week', as_index=False)['receiving_yards'].mean()

LIONS = '#0076B6'
NEUTRAL = '#999999'

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(arsb['week'], arsb['receiving_yards'],
        color=LIONS, marker='o', linewidth=2.5,
        label='Amon-Ra St. Brown')
ax.plot(peer_avg['week'], peer_avg['receiving_yards'],
        color=NEUTRAL, linestyle='--', linewidth=1.5,
        label='NFC North WR1 avg (GB, MIN, CHI)')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_title('ARSB vs NFC North WR1 average — 2024 regular season',
             fontsize=13, pad=12)
ax.set_xlabel('Week')
ax.set_ylabel('Receiving yards')
ax.grid(axis='y', linestyle='--', alpha=0.4)
ax.legend(frameon=False, loc='upper left')
fig.text(0.99, 0.01, 'Source: nflverse', ha='right', fontsize=8, color='gray')
plt.tight_layout()
plt.savefig('arsb-vs-nfc-north.png', dpi=150)

That chart answers a question; the default version would just show one line.
Try it
Make a horizontal bar chart of the top 8 NFC North receivers by 2024 total receiving yards. Color Lions players Honolulu Blue, everyone else neutral gray. Sort with the leader on top.
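If the mechanics are the sticking point, here is a rough sketch of the three pieces this exercise needs: selecting the top rows, building a per-bar color list, and flipping the y-axis so the leader lands on top. The DataFrame below is made-up placeholder data with hypothetical column names (player, team, yards); swap in your own query against weekly_stats.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

LIONS = "#0076B6"
NEUTRAL = "#999999"

# Placeholder rows for illustration only, not real stats.
totals = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C", "Player D"],
    "team": ["DET", "MIN", "GB", "DET"],
    "yards": [1200, 1500, 900, 1000],
})

# nlargest returns rows sorted from highest to lowest.
top = totals.nlargest(8, "yards").reset_index(drop=True)

# One color per bar: Lions blue for DET, neutral gray for everyone else.
colors = [LIONS if team == "DET" else NEUTRAL for team in top["team"]]

fig, ax = plt.subplots(figsize=(8, 5))
ax.barh(top["player"], top["yards"], color=colors)

# barh draws the first row at the bottom, so invert to put the leader on top.
ax.invert_yaxis()

ax.spines["top"].set_visible(False)
ax.spines["right"].set_visible(False)
ax.set_xlabel("Receiving yards")
plt.tight_layout()
```

With real data, nlargest(8, ...) keeps only the top eight rows; with the four placeholder rows here it simply returns all of them in sorted order.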
Common mistakes
- Default palette. Tab-orange and tab-blue everywhere. Pick colors that match your subject.
- Tiny figures. A 4x3 PNG looks fine in a notebook and unreadable when embedded in a doc. Default to figsize=(8, 5) or larger.
- Forgetting tight_layout(). Long labels and titles get clipped in the saved PNG even if they look fine on screen.
- Overloading. Two lines beat five. If you need five, use small multiples (subplots), one player per panel.
- No comparison context. A solo line on a chart implies “this is good” or “this is bad” but doesn’t say compared to what. Always have a peer line, a league average, or a prior-year ghost.
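The small-multiples fix for overloading can be sketched as below. The player names and weekly values are placeholders invented for illustration, and sharing the y-axis is one reasonable choice rather than a rule: it keeps the panels directly comparable at the cost of flattening low-volume players.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

NEUTRAL = "#999999"

# Placeholder series for illustration only, not real stats.
weeks = [1, 2, 3, 4, 5]
series = {
    "Player A": [40, 65, 52, 118, 70],
    "Player B": [80, 30, 95, 60, 45],
    "Player C": [20, 55, 48, 33, 90],
    "Player D": [60, 62, 58, 71, 66],
}

# One panel per player; sharey=True gives every panel the same y-scale.
fig, axes = plt.subplots(1, len(series), figsize=(10, 3.5), sharey=True)

for ax, (name, yards) in zip(axes, series.items()):
    ax.plot(weeks, yards, color=NEUTRAL, marker="o", linewidth=2)
    ax.set_title(name, fontsize=10)
    ax.set_xlabel("Week", fontsize=9)
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

# Label the shared y-axis once, on the leftmost panel.
axes[0].set_ylabel("Receiving yards", fontsize=9)
plt.tight_layout()
```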
Quick check
- Which two spines are usually safe to remove?
- Why default to figsize=(8, 5) instead of matplotlib’s default?
- What’s the one thing a comparison line adds that a solo line cannot?