
Matplotlib

Expert guidance on Matplotlib for creating static, animated, and interactive visualizations in Python.


Matplotlib — Data Science

You are an expert in Matplotlib for data analysis and science.

Overview

Matplotlib is Python's most widely used plotting library. It provides fine-grained control over every aspect of a figure — from axes layout to tick formatting. While higher-level libraries like Seaborn build on it, understanding Matplotlib's object-oriented API is essential for customizing any Python visualization.

Core Concepts

Figure and Axes

Every plot lives inside a Figure containing one or more Axes objects.

```python
import matplotlib.pyplot as plt
import numpy as np

# Preferred: explicit Figure + Axes
fig, ax = plt.subplots(figsize=(8, 5))
ax.plot([1, 2, 3], [4, 5, 6])
ax.set_xlabel("X")
ax.set_ylabel("Y")
ax.set_title("Simple Line Plot")
fig.tight_layout()               # call layout and save on the Figure itself
fig.savefig("plot.png", dpi=150)
plt.show()
```

Subplots

```python
# A 2x2 grid: `axes` is a 2-D NumPy array of Axes, indexed [row, col]
fig, axes = plt.subplots(2, 2, figsize=(10, 8))

axes[0, 0].bar(["A", "B", "C"], [3, 7, 5])
axes[0, 1].scatter(np.random.randn(50), np.random.randn(50))
axes[1, 0].hist(np.random.randn(1000), bins=30, edgecolor="black")
axes[1, 1].boxplot([np.random.randn(100) for _ in range(4)])

# axes.flat iterates over every Axes regardless of grid shape
for ax in axes.flat:
    ax.set_xlabel("x")
fig.tight_layout()
```
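
What `plt.subplots` returns depends on the grid you ask for, which is a common source of indexing errors. The sketch below shows the three cases under the default `squeeze=True` and with `squeeze=False`; the variable names are illustrative only, and the `Agg` backend is selected just so the example runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# A single Axes: a bare Axes object is returned, not an array
fig1, ax = plt.subplots()

# One row of three: a 1-D array, indexed with a single integer
fig2, row_axes = plt.subplots(1, 3, figsize=(9, 3), sharey=True)
row_shape = row_axes.shape        # (3,)

# squeeze=False always returns a 2-D array, which keeps indexing uniform
fig3, grid_axes = plt.subplots(1, 3, squeeze=False)
grid_shape = grid_axes.shape      # (1, 3)

x = np.linspace(0, 2 * np.pi, 50)
for i, a in enumerate(row_axes, start=1):
    a.plot(x, np.sin(i * x))
    a.set_title(f"sin({i}x)")

# Close everything so no figures linger in pyplot's registry
for f in (fig1, fig2, fig3):
    plt.close(f)
```

Passing `squeeze=False` is worth considering when the grid size is computed at run time, since code that indexes `axes[row, col]` then works even for a 1x1 grid.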

Plot Types

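```python
# Illustrative sample data so the calls below run as written
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = rng.normal(size=x.size)
colors = y                                # values mapped through the colormap
sizes = rng.uniform(10, 80, size=x.size)  # marker areas in points^2
categories, values = ["A", "B", "C"], [3, 7, 5]
data = rng.normal(size=1000)
matrix = rng.normal(size=(8, 8))
y_lower, y_upper = np.sin(x) - 0.3, np.sin(x) + 0.3

# One Axes per plot type
fig, axes = plt.subplots(2, 3, figsize=(12, 7))
ax_line, ax_scatter, ax_bar, ax_barh, ax_hist, ax_heat = axes.flat

# Line
ax_line.plot(x, np.sin(x), label="sin", linestyle="--", color="steelblue")

# Scatter
ax_scatter.scatter(x, y, c=colors, s=sizes, alpha=0.6, cmap="viridis")

# Bar
ax_bar.bar(categories, values, color="coral", edgecolor="black")
ax_barh.barh(categories, values)  # horizontal

# Histogram
ax_hist.hist(data, bins=30, density=True, alpha=0.7)

# Heatmap
im = ax_heat.imshow(matrix, cmap="coolwarm", aspect="auto")
fig.colorbar(im, ax=ax_heat)

# Fill between (a band around the line drawn above)
ax_line.fill_between(x, y_lower, y_upper, alpha=0.3)
```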

Implementation Patterns

Style and Theming

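```python
# Use a built-in style
# (this name exists in Matplotlib 3.6+; older releases call it "seaborn-whitegrid")
plt.style.use("seaborn-v0_8-whitegrid")

# Custom rcParams: applied globally to every figure created afterwards
plt.rcParams.update({
    "font.size": 12,
    "axes.labelsize": 14,
    "figure.dpi": 100,
    "axes.spines.top": False,
    "axes.spines.right": False,
})
```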
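
Both calls above change global state for the rest of the session. When only one figure needs different settings, a scoped override may be a better fit. The following is a minimal sketch using `plt.rc_context` and `plt.style.context`; the specific values and the choice of the `ggplot` style are arbitrary, chosen only for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

default_size = plt.rcParams["font.size"]

# rc_context applies the overrides only inside the with-block
with plt.rc_context({"font.size": 18, "axes.spines.top": False}):
    size_inside = plt.rcParams["font.size"]
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_title("Larger fonts, no top spine")
    top_spine_visible = ax.spines["top"].get_visible()
    plt.close(fig)

# Outside the block the previous value is back
size_after = plt.rcParams["font.size"]

# style.context does the same for a whole style sheet
with plt.style.context("ggplot"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [4, 1, 0])
    plt.close(fig)
```

Most rcParams are read when an artist is created, so the figure should be built inside the `with` block for the overrides to take effect.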

Annotations and Text

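```python
# Illustrative curve; the annotation marks its maximum
x = np.linspace(0, 10, 200)
y = 10 * np.sin(x)
peak_x, peak_y = x[np.argmax(y)], y.max()

fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_ylim(-12, 18)  # leave room above the curve for the label
ax.annotate(
    "Peak",
    xy=(peak_x, peak_y),              # point being annotated, in data coordinates
    xytext=(peak_x + 1, peak_y + 5),  # position of the label text
    arrowprops=dict(arrowstyle="->", color="red"),
    fontsize=12,
)
# transAxes places text in Axes-fraction coordinates: (0, 0) bottom-left, (1, 1) top-right
ax.text(0.05, 0.95, "R² = 0.93", transform=ax.transAxes, va="top")
```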

Twin Axes (Dual Y-Axis)

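```python
# Made-up monthly figures, for illustration only
dates = np.arange("2024-01", "2024-07", dtype="datetime64[M]")
revenue = np.array([10, 12, 15, 14, 18, 21]) * 1_000
users = np.array([200, 260, 310, 400, 480, 610])

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y-axis sharing the same x-axis

ax1.plot(dates, revenue, color="blue", label="Revenue")
ax2.plot(dates, users, color="orange", label="Users")

ax1.set_ylabel("Revenue ($)", color="blue")
ax2.set_ylabel("Users", color="orange")
```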
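
With twin axes, each Axes only knows about its own lines, so calling `legend()` on one of them omits the other series. One common way around this, sketched below with made-up data, is to collect handles and labels from both Axes and pass them to a single legend call.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up monthly figures, for illustration only
months = np.arange(1, 7)
revenue = np.array([10, 12, 15, 14, 18, 21]) * 1_000
users = np.array([200, 260, 310, 400, 480, 610])

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()

ax1.plot(months, revenue, color="blue", label="Revenue")
ax2.plot(months, users, color="orange", label="Users")

# Gather the labelled artists from both Axes into one legend
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
legend = ax2.legend(h1 + h2, l1 + l2, loc="upper left")

legend_labels = [t.get_text() for t in legend.get_texts()]
plt.close(fig)
```

The legend is attached to `ax2` here because the twin is drawn on top of `ax1`, so its lines cannot cover the legend box.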

Saving Publication-Quality Figures

```python
fig.savefig("figure.pdf", bbox_inches="tight", dpi=300)
fig.savefig("figure.svg", bbox_inches="tight")
fig.savefig("figure.png", bbox_inches="tight", dpi=300, transparent=True)
```
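
When the same figure has to be exported in several formats, the three calls can be wrapped in a small helper. The function below is a hypothetical convenience wrapper, not part of Matplotlib; it relies only on `Figure.savefig` inferring the format from the file extension, and it writes to a temporary directory so the sketch has no side effects.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt


def save_all(fig, stem, formats=("pdf", "svg", "png"), dpi=300):
    """Save `fig` once per format and return the written paths (hypothetical helper)."""
    paths = []
    for fmt in formats:
        path = Path(f"{stem}.{fmt}")
        # bbox_inches="tight" trims surplus margin without cropping labels
        fig.savefig(path, bbox_inches="tight", dpi=dpi)
        paths.append(path)
    return paths


fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 1, 2], [0, 1, 4])
ax.set_xlabel("x")
ax.set_ylabel("y")

out_dir = Path(tempfile.mkdtemp())
written = save_all(fig, out_dir / "figure")
plt.close(fig)
```

For the vector formats (PDF, SVG), `dpi` only affects elements that are rasterized, such as images; lines and text stay resolution-independent.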

Best Practices

  • Use the object-oriented API (fig, ax = plt.subplots()) instead of the pyplot state machine (plt.plot()). It is clearer and avoids accidental cross-talk between figures.
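  • Call fig.tight_layout() or create the figure with constrained_layout=True (spelled layout="constrained" in recent Matplotlib releases) to prevent label overlap. Use one or the other, not both: they are separate layout mechanisms.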
  • Use bbox_inches="tight" when saving to avoid cropped labels.
  • Label everything: axes, legends, titles. A plot without labels is incomplete.
  • Choose colormaps carefully: use perceptually uniform colormaps (viridis, plasma) instead of jet or rainbow.
  • Close figures with plt.close(fig) in loops to avoid memory leaks.
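
Several of the practices above fit into one short batch-export loop. This is a minimal sketch with randomly generated data and a temporary output directory, both chosen only for illustration: each figure uses constrained layout, is fully labeled, is saved with `bbox_inches="tight"`, and is closed before the next iteration.

```python
import tempfile
from pathlib import Path

import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)
out_dir = Path(tempfile.mkdtemp())

for i in range(5):
    fig, ax = plt.subplots(constrained_layout=True)
    ax.hist(rng.normal(size=500), bins=30, edgecolor="black")
    ax.set_xlabel("value")
    ax.set_ylabel("count")
    ax.set_title(f"Sample {i}")
    fig.savefig(out_dir / f"hist_{i}.png", bbox_inches="tight")
    plt.close(fig)  # release the figure before the next iteration

# pyplot's registry is empty again once every figure has been closed
open_figures = plt.get_fignums()
```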

Core Philosophy

A visualization exists to communicate an insight, not to demonstrate technical skill. The best Matplotlib plots are the ones where the reader immediately grasps the pattern, trend, or comparison without needing to decode the chart mechanics. Every element -- color, label, annotation, axis range -- should serve the message. If an element does not help the reader understand the data, it is clutter.

Matplotlib's strength is control. Unlike higher-level libraries that impose opinionated defaults, Matplotlib lets you adjust every pixel. This power comes with responsibility: the defaults are often not publication-ready, so you must deliberately set font sizes, remove unnecessary spines, choose appropriate colormaps, and add clear labels. Treating these adjustments as mandatory rather than optional is what separates informative plots from confusing ones.

Prefer the object-oriented API from the start. The pyplot state machine (plt.plot(), plt.xlabel()) is convenient for quick one-offs, but it becomes a source of subtle bugs as soon as you have multiple figures or subplots. Building the habit of working with explicit fig and ax objects eliminates an entire category of errors and makes code easier to refactor into functions.
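
As a rough illustration of the refactoring point, a plotting function that receives its target Axes as an argument can be reused in any layout. The helper name `plot_signal` and its labels are hypothetical, chosen only for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def plot_signal(ax, x, y, label):
    """Draw one labelled line on the given Axes and return it (hypothetical helper)."""
    (line,) = ax.plot(x, y, label=label)
    ax.set_xlabel("x")
    ax.set_ylabel("amplitude")
    ax.legend()
    return line


x = np.linspace(0, 2 * np.pi, 200)
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(9, 3.5))

# The caller decides where each plot goes; nothing depends on the "current" Axes
plot_signal(ax_left, x, np.sin(x), "sin")
plot_signal(ax_right, x, np.cos(x), "cos")

lines_per_axes = [len(ax_left.lines), len(ax_right.lines)]
fig.tight_layout()
plt.close(fig)
```

Because the function never touches `plt` state, it behaves the same whether it is called on a standalone figure, one panel of a grid, or a twin axis.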

Anti-Patterns

  • Rainbow colormaps on sequential data: Using jet or rainbow colormaps for continuous data distorts perception because they are not perceptually uniform. Viewers see false boundaries where hue transitions occur. Use viridis, plasma, or inferno instead.

  • Unlabeled axes and missing legends: Producing plots without axis labels, titles, or legends. A chart that requires the reader to guess what the axes represent or which color corresponds to which series has failed at its communication purpose.

  • Generating plots in a loop without closing figures: Creating figures inside a loop with plt.subplots() but never calling plt.close(fig). pyplot keeps a reference to every figure it creates until that figure is closed, so memory grows without bound, and long-running processes eventually crash.

  • Mixing pyplot and object-oriented API: Alternating between plt.plot() and ax.plot() in the same script, creating confusion about which figure or axes is being modified. Pick the OO API and use it consistently.

  • Hardcoding figure aesthetics instead of using rcParams or style sheets: Setting font size, line width, and colors individually on every plot call rather than configuring them once via plt.rcParams or a style file. This makes it tedious to maintain visual consistency across a project.
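
To make the first anti-pattern above more concrete, the sketch below compares how brightness changes along `jet` and `viridis`. It uses Rec. 709 luma as a rough stand-in for perceived lightness, which is an approximation chosen for brevity rather than a proper perceptual color model; the helper `approx_luma` is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np


def approx_luma(cmap_name, n=256):
    """Rec. 709 luma of n colormap samples (a rough proxy for lightness)."""
    rgb = plt.get_cmap(cmap_name)(np.linspace(0, 1, n))[:, :3]
    return rgb @ np.array([0.2126, 0.7152, 0.0722])


jet_luma = approx_luma("jet")
viridis_luma = approx_luma("viridis")

# jet is brightest somewhere in the middle and dark at both ends,
# so equal data steps do not look like equal brightness steps
jet_peak_index = int(np.argmax(jet_luma))

fig, ax = plt.subplots(figsize=(6, 3))
position = np.linspace(0, 1, jet_luma.size)
ax.plot(position, jet_luma, label="jet")
ax.plot(position, viridis_luma, label="viridis")
ax.set_xlabel("position in colormap")
ax.set_ylabel("approximate luma")
ax.legend()
plt.close(fig)
```

The resulting curves give a quick visual check: a colormap suited to sequential data should rise or fall steadily, whereas a curve that peaks in the middle suggests banding artifacts.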

Common Pitfalls

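  • Forgetting plt.show() in scripts: without it, no window appears when the script runs. Jupyter's inline backend renders figures automatically, so the call is not needed there.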
  • Mixing pyplot and OO API causes confusing state. Pick one — prefer OO.
  • Not closing figures in loops leads to memory exhaustion when generating many plots.
  • Using plt.subplot instead of plt.subplots — the plural form returns a figure and array of axes in one call, which is almost always what you want.
  • Overlapping tick labels: use fig.autofmt_xdate() for date axes, or rotate the tick labels manually, for example with ax.tick_params(axis="x", labelrotation=45).
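
The last pitfall can be sketched with a short date series. The data here is randomly generated for illustration; `fig.autofmt_xdate()` is called with its defaults, which rotate the x tick labels by 30 degrees and right-align them.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np

# Made-up daily values, for illustration only
dates = np.arange("2024-01-01", "2024-03-01", dtype="datetime64[D]")
values = np.cumsum(np.random.default_rng(0).normal(size=dates.size))

fig, ax = plt.subplots(figsize=(6, 3))
ax.plot(dates, values)
ax.set_xlabel("date")
ax.set_ylabel("value")

# Rotate and right-align the date labels so neighbours do not collide
fig.autofmt_xdate()
rotations = {label.get_rotation() for label in ax.get_xticklabels()}
plt.close(fig)
```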
