matplotlib styling

Level 2 · Lesson 6

Hook

The same data, charted once with default matplotlib and once with five small tweaks, looks like two different reports. None of the tweaks is decoration; each one removes a question the reader would otherwise have to ask.

Concept

The five moves that turn a default chart into a clean one:

Set the figure size. The default is 6.4 x 4.8 inches, which is boxy and small once embedded in a document. Use figsize=(8, 5) for most charts and (10, 6) if you have long labels.
Pick one color per series, intentionally. No rainbows. Honolulu Blue (#0076B6) for Lions, neutral gray for comparisons.
Kill the top and right spines. They’re chartjunk.
Add comparison context. A second team, a league average line, a previous-season ghost — whatever makes the number mean something.
Annotate the point that matters. If the chart has a “story” data point, call it out with ax.annotate(...).

import matplotlib.pyplot as plt

LIONS = '#0076B6'
NEUTRAL = '#999999'

fig, ax = plt.subplots(figsize=(8, 5))

# ... plot calls here ...

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_title('Title that names the metric, the subject, and the period',
             fontsize=13, pad=12)
ax.set_xlabel('X label', fontsize=10)
ax.set_ylabel('Y label', fontsize=10)
ax.grid(axis='y', linestyle='--', alpha=0.4)
fig.text(0.99, 0.01, 'Source: nflverse', ha='right', fontsize=8, color='gray')
plt.tight_layout()
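
The template above covers size, color, and spines but has no call for the fifth move. Here is a minimal sketch of ax.annotate, assuming the point worth calling out is the series maximum. The weekly values are made up for illustration and are not real stats.

```python
import matplotlib.pyplot as plt

LIONS = '#0076B6'

# Made-up weekly values, for illustration only (not real stats).
weeks = [1, 2, 3, 4, 5, 6]
yards = [55, 70, 62, 140, 80, 66]

fig, ax = plt.subplots(figsize=(8, 5))
ax.plot(weeks, yards, color=LIONS, marker='o', linewidth=2.5)

# Look the peak up from the data so the callout moves if the data changes.
peak_yards = max(yards)
peak_week = weeks[yards.index(peak_yards)]

note = ax.annotate(
    f'Season high: {peak_yards} yds',
    xy=(peak_week, peak_yards),                # the data point being called out
    xytext=(peak_week + 0.4, peak_yards + 8),  # label position, also in data coordinates
    arrowprops=dict(arrowstyle='->', color='gray'),
    fontsize=10,
)

ax.set_ylim(0, 165)  # headroom so the label is not cut off at the top
ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
plt.tight_layout()
```

Both xy and xytext default to data coordinates, so the offsets here are in weeks and yards. Adjust them to suit your own axis ranges.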

Lions example

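This example compares Amon-Ra St. Brown’s 2024 weekly receiving yards against the average of the other NFC North teams’ WR1s, where a team’s WR1 is its wide receiver with the most receiving yards over the regular season: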

import matplotlib.pyplot as plt
import pandas as pd
from sqlalchemy import create_engine

eng = create_engine("postgresql+psycopg://onepride:lions@localhost:5432/onepride")

q = """
WITH wr1 AS (
    SELECT recent_team,
           player_display_name,
           SUM(receiving_yards) AS total
    FROM weekly_stats
    WHERE season = 2024 AND season_type = 'REG'
      AND position_group = 'WR'
      AND recent_team IN ('DET', 'GB', 'MIN', 'CHI')
    GROUP BY recent_team, player_display_name
),
top1 AS (
    -- DISTINCT ON is a Postgres extension that picks the first row per
    -- group given an ORDER BY. Equivalent to a window-rank + filter.
    SELECT DISTINCT ON (recent_team) recent_team, player_display_name
    FROM wr1
    ORDER BY recent_team, total DESC
)
SELECT ws.week,
       ws.player_display_name,
       ws.recent_team,
       ws.receiving_yards
FROM weekly_stats ws
JOIN top1 USING (recent_team, player_display_name)
WHERE ws.season = 2024 AND ws.season_type = 'REG'
ORDER BY ws.week, ws.recent_team;
"""

df = pd.read_sql(q, eng)
arsb = df[df['player_display_name'] == 'Amon-Ra St. Brown']
peers = df[df['player_display_name'] != 'Amon-Ra St. Brown']
peer_avg = peers.groupby('week', as_index=False)['receiving_yards'].mean()

LIONS = '#0076B6'
NEUTRAL = '#999999'

fig, ax = plt.subplots(figsize=(9, 5))
ax.plot(arsb['week'], arsb['receiving_yards'], color=LIONS, marker='o',
        linewidth=2.5, label='Amon-Ra St. Brown')
ax.plot(peer_avg['week'], peer_avg['receiving_yards'], color=NEUTRAL,
        linestyle='--', linewidth=1.5, label='NFC North WR1 avg (GB, MIN, CHI)')

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_title('ARSB vs NFC North WR1 average — 2024 regular season',
             fontsize=13, pad=12)
ax.set_xlabel('Week')
ax.set_ylabel('Receiving yards')
ax.grid(axis='y', linestyle='--', alpha=0.4)
ax.legend(frameon=False, loc='upper left')
fig.text(0.99, 0.01, 'Source: nflverse', ha='right', fontsize=8, color='gray')
plt.tight_layout()
plt.savefig('arsb-vs-nfc-north.png', dpi=150)

That chart answers a question: is St. Brown outproducing the other WR1s in his division? A solo line would show only his yardage, with nothing to measure it against.
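
The DISTINCT ON step in the query is Postgres-specific. If the season totals are already in a DataFrame, the same "first row per group" pick can be sketched in pandas with a sort followed by drop_duplicates. The frame below uses invented players and totals purely for illustration. Its column names mirror the query, but the data is hypothetical.

```python
import pandas as pd

# Invented season totals, for illustration only (not real stats).
totals = pd.DataFrame({
    'recent_team': ['DET', 'DET', 'GB', 'GB', 'MIN', 'CHI'],
    'player_display_name': ['Player A', 'Player B', 'Player C',
                            'Player D', 'Player E', 'Player F'],
    'total': [1200, 800, 600, 900, 1500, 1000],
})

# Sort so each team's largest total comes first, then keep the first row per team.
top1 = (
    totals.sort_values(['recent_team', 'total'], ascending=[True, False])
          .drop_duplicates('recent_team')
          .reset_index(drop=True)
)

print(top1[['recent_team', 'player_display_name']])
```

As with DISTINCT ON, a tie in total is broken by whichever row happens to sort first, so add a tiebreaker column to the sort if that matters.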

Try it

Make a horizontal bar chart of the top 8 NFC North receivers by 2024 total receiving yards. Color Lions players Honolulu Blue, everyone else neutral gray. Sort with the leader on top.
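
If you get stuck, here is one possible starting point. It assumes the rows are already sorted from largest to smallest. The names and totals are placeholders, and the query for the real top 8 is left to you. The two ideas it shows are a per-bar color list and flipping the y-axis.

```python
import matplotlib.pyplot as plt

LIONS = '#0076B6'
NEUTRAL = '#999999'

# Placeholder rows (name, team, total), sorted largest first.
# Names and totals are invented; swap in your own query result.
rows = [
    ('Receiver A', 'MIN', 1400),
    ('Receiver B', 'DET', 1250),
    ('Receiver C', 'CHI', 980),
    ('Receiver D', 'DET', 900),
]
names = [name for name, _, _ in rows]
totals = [total for _, _, total in rows]
# One color per bar, chosen by team.
colors = [LIONS if team == 'DET' else NEUTRAL for _, team, _ in rows]

fig, ax = plt.subplots(figsize=(8, 5))
bars = ax.barh(names, totals, color=colors)
# barh draws the first row at the bottom; flip the axis so the leader is on top.
ax.invert_yaxis()

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.set_xlabel('Receiving yards')
plt.tight_layout()
```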

Common mistakes

Default palette. matplotlib’s default tab:blue and tab:orange everywhere. Pick colors that match your subject.
Tiny figures. A 4 x 3 inch figure looks fine in a notebook and is unreadable when embedded in a doc. Default to figsize=(8, 5) or larger.
Forgetting tight_layout(). Long labels and titles get clipped in the saved PNG even if they look fine on screen.
Overloading. Two lines beat five. If you need five, use small multiples (subplots) — one player per panel.
No comparison context. A solo line on a chart implies “this is good” or “this is bad” but doesn’t say compared to what. Always have a peer line, a league average, or a prior-year ghost.
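
The small-multiples fix for overloading can be sketched as follows, assuming four hypothetical players with made-up weekly values. The y-axis is shared so the panels can be compared at a glance, and only the subject gets the accent color.

```python
import matplotlib.pyplot as plt

LIONS = '#0076B6'
NEUTRAL = '#999999'

# Made-up weekly values for four hypothetical players (not real stats).
weeks = [1, 2, 3, 4, 5]
players = {
    'Player A': [60, 85, 40, 110, 75],
    'Player B': [30, 55, 70, 45, 90],
    'Player C': [95, 20, 65, 80, 50],
    'Player D': [45, 60, 35, 70, 100],
}
highlight = 'Player A'  # the subject of the chart

# One panel per player; shared axes keep the scales identical across panels.
fig, axes = plt.subplots(2, 2, figsize=(10, 6), sharex=True, sharey=True)
for ax, (name, yards) in zip(axes.flat, players.items()):
    color = LIONS if name == highlight else NEUTRAL
    ax.plot(weeks, yards, color=color, marker='o', linewidth=2)
    ax.set_title(name, fontsize=10)
    ax.spines['top'].set_visible(False)
    ax.spines['right'].set_visible(False)
    ax.grid(axis='y', linestyle='--', alpha=0.4)

# Label only the outer edges to avoid repeating the same text in every panel.
for ax in axes[-1, :]:
    ax.set_xlabel('Week')
for ax in axes[:, 0]:
    ax.set_ylabel('Receiving yards')
plt.tight_layout()
```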

Quick check

Which two spines are usually safe to remove?
Why default to figsize=(8, 5) instead of matplotlib’s default?
What’s the one thing a comparison line adds that a solo line cannot?
