Scatter plot with regression

Two continuous variables, one dot per observation, an optional regression line. The chart that answers "are these two things related?" — yes, no, or "kind of, with outliers".

Charts · 15 of 20 · 1 primitive · Chart.Scatter · ~15 min · React · @corelithzw/react

Overview

Two continuous axes, one dot per data point, an optional fit line and confidence band. The fastest way to spot a relationship — or refute one.

Scatter plots answer correlation questions: does this go up when that goes up? They also reveal what other chart types hide: outliers (the dots that sit far from the line), clusters (clumps that suggest sub-populations), and heteroscedasticity (spread that grows with the value). A regression line on top tells you the direction and steepness; a confidence band shows how much uncertainty surrounds that fit.

Reach for it for any two-variable comparison with more than 20 observations — operator output vs. shift length, household size vs. monthly spend, package weight vs. delivery cost. Skip it for time series (use a line), for category counts (use bars), or for under 10 points (just print the numbers).

The opinion: show the regression line by default, but always state the R² nearby. A confident-looking line with R² of 0.08 is a lie; the chart should call out its own weakness.

The chart

Operator shift-length vs. entries logged. Positive correlation with a clear ceiling around 11 hours.

[Figure: Shift length vs. entries logged. 40 observations of operator shifts. R² = 0.62 (n = 40), strong positive correlation. Outliers cluster above 9-hour shifts, where entries plateau. X axis: shift length, 4h to 12h. Y axis: entries logged, 0 to 80.]

Required pieces

Roadmap: Chart.Scatter with built-in OLS regression and outlier detection ships in v0.2. v0.1 users compute regression server-side and pass the fit line as a sibling.
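For v0.1, where the fit is computed outside the chart, a minimal sketch of an ordinary least squares fit might look like the following. The XYPoint and LinearFit shapes and both function names are illustrative assumptions, not part of @corelithzw/react. How the resulting line is handed to the chart as a sibling is left out because this page does not specify it.

```typescript
// Hypothetical point shape; the field names mirror the x/y accessors used on this page.
interface XYPoint {
  x: number;
  y: number;
}

// Hypothetical result shape for a straight-line fit.
interface LinearFit {
  slope: number;
  intercept: number;
  rSquared: number;
  n: number;
}

// Ordinary least squares for y = intercept + slope * x.
function fitLine(points: XYPoint[]): LinearFit {
  const n = points.length;
  if (n < 2) {
    throw new Error("Need at least two points to fit a line");
  }
  const meanX = points.reduce((sum, p) => sum + p.x, 0) / n;
  const meanY = points.reduce((sum, p) => sum + p.y, 0) / n;

  let sxx = 0; // sum of squared x deviations
  let sxy = 0; // sum of x and y co-deviations
  let syy = 0; // total sum of squares for y
  for (const p of points) {
    const dx = p.x - meanX;
    const dy = p.y - meanY;
    sxx += dx * dx;
    sxy += dx * dy;
    syy += dy * dy;
  }
  if (sxx === 0) {
    throw new Error("All x values are identical; the slope is undefined");
  }

  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  // For a one-predictor fit, R² equals the squared correlation coefficient.
  // If y is constant the line fits trivially; this sketch reports 1 in that case.
  const rSquared = syy === 0 ? 1 : (sxy * sxy) / (sxx * syy);
  return { slope, intercept, rSquared, n };
}

// Two endpoints are enough to draw the fit line across the observed x range.
function fitLineEndpoints(points: XYPoint[], fit: LinearFit): [XYPoint, XYPoint] {
  const xs = points.map((p) => p.x);
  const minX = Math.min(...xs);
  const maxX = Math.max(...xs);
  return [
    { x: minX, y: fit.intercept + fit.slope * minX },
    { x: maxX, y: fit.intercept + fit.slope * maxX },
  ];
}
```

Returning n and rSquared alongside the slope keeps the numbers needed for the R² badge in one place, so the chart and its caption cannot drift apart.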

React snippet

Pass data with x and y; from v0.2 the chart fits the regression itself.

Chart.Scatter · @corelithzw/react

Customising

Bubble (third dimension as size)

Encode a third numeric variable as dot radius. Useful for "value, frequency, recency"-style three-axis comparisons.

<Chart.Scatter
  data={data}
  xAccessor="x"
  yAccessor="y"
  sizeAccessor="count"
  variant="bubble"
/>
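This page does not say how Chart.Scatter maps sizeAccessor values to radii. If you pre-scale sizes yourself, a common convention is to make dot area, not radius, proportional to the value, so a value four times larger looks four times bigger rather than sixteen. A minimal sketch, where bubbleRadius and its parameters are hypothetical names:

```typescript
// Area-proportional bubble sizing: the radius grows with the square root of the value.
// maxValue is the largest value in the data; maxRadius is the radius (in px) it should get.
function bubbleRadius(value: number, maxValue: number, maxRadius: number): number {
  if (maxValue <= 0 || value <= 0) {
    return 0; // nothing sensible to draw for non-positive values in this sketch
  }
  return maxRadius * Math.sqrt(value / maxValue);
}
```

With a maxRadius of 20, a value that is a quarter of the maximum gets a radius of 10, which is a quarter of the largest dot's area.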

Grouped by category

Colour-code dots by a categorical variable — region, role, age band. Reveals whether the relationship varies by group.

<Chart.Scatter
  data={data}
  colorAccessor="region"
  legend
  regression="per-group"
/>
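To check what a per-group fit would show before relying on the chart, the same least squares arithmetic can be run once per category. The sketch below is an assumption-laden illustration, not the library's implementation: GroupedPoint, GroupFit, and fitPerGroup are hypothetical names, and groups with fewer than two points or no spread in x are skipped.

```typescript
// Hypothetical shapes: one point with a category label, and one fit per category.
interface GroupedPoint {
  x: number;
  y: number;
  group: string;
}

interface GroupFit {
  group: string;
  n: number;
  slope: number;
  intercept: number;
}

function fitPerGroup(points: GroupedPoint[]): GroupFit[] {
  // Bucket the points by their group label.
  const buckets: Record<string, GroupedPoint[]> = {};
  for (const p of points) {
    const bucket = buckets[p.group] || (buckets[p.group] = []);
    bucket.push(p);
  }

  const fits: GroupFit[] = [];
  for (const group of Object.keys(buckets)) {
    const members = buckets[group];
    const n = members.length;

    let sumX = 0;
    let sumY = 0;
    for (const m of members) {
      sumX += m.x;
      sumY += m.y;
    }
    const meanX = sumX / n;
    const meanY = sumY / n;

    let sxx = 0; // sum of squared x deviations
    let sxy = 0; // sum of x and y co-deviations
    for (const m of members) {
      sxx += (m.x - meanX) * (m.x - meanX);
      sxy += (m.x - meanX) * (m.y - meanY);
    }

    // A slope needs at least two points with different x values.
    if (n < 2 || sxx === 0) {
      continue;
    }
    const slope = sxy / sxx;
    fits.push({ group, n, slope, intercept: meanY - slope * meanX });
  }
  return fits;
}
```

Two groups with opposite slopes can cancel out in a pooled fit, which is the situation where a per-group line is most informative.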

No regression line

Sometimes you just want the cloud — for exploration, before you commit to a fit.

<Chart.Scatter
  data={data}
  regression={false}
  pointOpacity={0.5}
/>

In context

Inside an exec dashboard row card. Title and R² in a stat-card-style header; the scatter sits underneath.

Shift length × Entries
40 shifts · October
R² 0.62 · strong

Accessibility

Hundreds of dots and a fit line are nearly impossible to read non-visually without a summary statement. Lead with the statistics.

  • One role="img" per scatter. Inside, <title> names it and <desc> states n, R², slope, and any obvious clusters or outliers.
  • Aria-label leads with the verdict. "Shift length vs. entries logged, positive correlation, R squared 0.62, two outliers above 10 hours" — not "Scatter plot with 40 dots".
  • Outliers carry a non-colour cue. Red dots are also labelled "outlier" in the desc and listed in the fallback table, so users who cannot rely on colour can still identify them by name.
  • Hidden numeric table. Every data point appears with its x, y, group, and "outlier" flag in an sr-only <table>.
  • Keyboard nav by data point. Tab enters the chart; left/right arrows move to the next-nearest point along x; the value is announced.
  • R² is text, not a colour. "R² = 0.62 · strong" appears as a visible badge so screen-reader users and colour-blind users get the strength of the fit without relying on the chart's shape.
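The verdict-first label described in the list above can be generated from the fit statistics rather than written by hand. The following is a sketch under stated assumptions: ScatterSummary, strengthLabel, and describeScatter are hypothetical names, and the 0.5 and 0.25 cut-offs for "strong" and "moderate" are illustrative choices, not thresholds defined by this page or the library.

```typescript
// Hypothetical summary of a fitted scatter; compute these values wherever the fit is done.
interface ScatterSummary {
  xLabel: string;
  yLabel: string;
  n: number;
  slope: number;
  rSquared: number;
  outlierCount: number;
}

// Assumed cut-offs for the strength wording; pick thresholds that suit your domain.
function strengthLabel(rSquared: number): string {
  if (rSquared >= 0.5) return "strong";
  if (rSquared >= 0.25) return "moderate";
  return "weak";
}

// Builds a label that leads with the verdict, then the supporting numbers.
function describeScatter(s: ScatterSummary): string {
  let verdict: string;
  if (s.slope > 0) {
    verdict = `${strengthLabel(s.rSquared)} positive correlation`;
  } else if (s.slope < 0) {
    verdict = `${strengthLabel(s.rSquared)} negative correlation`;
  } else {
    verdict = "no linear trend";
  }
  const outliers = s.outlierCount === 1 ? "1 outlier" : `${s.outlierCount} outliers`;
  return (
    `${s.xLabel} vs. ${s.yLabel}, ${verdict}, ` +
    `R squared ${s.rSquared.toFixed(2)}, ${s.n} points, ${outliers}`
  );
}
```

The same string can feed both the aria-label and the desc element, and strengthLabel can supply the word shown in the visible R² badge so the spoken and visual verdicts stay in step.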