Data Science Fundamentals — Why Choosing the Right Average Matters More Than You Think

One of the most common mistakes I see in data work, from junior analysts to experienced engineers looking at monitoring dashboards, is the uncritical use of the arithmetic mean. The mean is the default in every spreadsheet, every charting tool, every df.describe() output. And for a surprising range of real-world data, it’s the wrong statistic.

This is a topic I wrote about for the Ultra Tendency Academy blog: The Fundamentals of Data Science — Measures of Central Tendency. That article goes into the full mathematical derivation and worked examples for each measure. I’d recommend reading it for the complete treatment. What I want to do here is add the engineering intuition — the “why does this bite me in practice” layer that is easy to skip over in a foundations article.

The intuition problem

Central tendency sounds like a statistics concept. But engineers encounter it constantly under different names:

“What’s our average latency?” → P50 (median), not mean
“What’s the average growth rate?” → geometric mean, not arithmetic
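“What’s average throughput in requests/sec?” → harmonic mean, not arithmetic (when averaging over equal amounts of work)
“What’s the typical salary at this company?” → median, not mean (the Bezos problem: one extreme earner drags the mean far above what almost everyone is paid)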

The arithmetic mean (x₁ + x₂ + ... + xₙ) / n is the right answer to a specific question: what single value would I need to multiply by n to get the total? That’s a useful quantity. It’s just not “the middle” and it’s not “typical” when distributions are skewed.
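To make the distinction concrete, here is a minimal sketch using Python's standard statistics module. The salary figures are invented for illustration: nine ordinary salaries plus one extreme earner.

```python
from statistics import mean, median

# Hypothetical payroll: nine ordinary salaries and one extreme earner.
salaries = [50_000] * 9 + [10_000_000]

avg = mean(salaries)
mid = median(salaries)

# The mean is the value that, multiplied by n, reproduces the total.
print(avg * len(salaries) == sum(salaries))  # True

print(f"mean:   {avg:,.0f}")   # mean:   1,045,000
print(f"median: {mid:,.0f}")   # median: 50,000
```

The mean answers the "total" question exactly, yet nobody on this payroll earns anything close to it. The median lands on what nine out of ten people are actually paid.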

When the mean actively misleads

Latency

Response times are right-skewed. A single outlier above the 99th percentile, say one 10-second request in a hundred, will pull the arithmetic mean far above what the other 99% of users experience. That’s why SLAs are defined on percentiles, not means. The mean hides the tail. The median shows you what the typical user actually gets.
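A small sketch of that effect, assuming a made-up sample of 100 requests where 99 take 100 ms and one takes 10 seconds:

```python
from statistics import mean, median

# Hypothetical sample: 99 fast requests and a single 10-second outlier.
latencies_ms = [100.0] * 99 + [10_000.0]

mean_ms = mean(latencies_ms)
median_ms = median(latencies_ms)

# How many requests were actually faster than the "average"?
below_mean = sum(1 for x in latencies_ms if x < mean_ms)

print(mean_ms)     # 199.0
print(median_ms)   # 100.0
print(below_mean)  # 99
```

One request out of a hundred nearly doubles the mean, while 99 of the 100 requests finished in half that time. The median does not move at all.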

Rates and throughputs

Suppose a system handles 100 req/s from service A and 1 req/s from service B, and it processes the same number of requests from each. The arithmetic mean says 50.5 req/s. The harmonic mean, which is correct when averaging rates over equal amounts of work, gives 2 / (1/100 + 1/1) ≈ 1.98 req/s. The harmonic mean reflects the bottleneck. (Note: if instead you measured throughput over equal time periods, the arithmetic mean would be correct. The harmonic mean applies when the numerator of the rate, the amount of work, is held constant; the arithmetic mean applies when the denominator, the time, is held constant.)
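One way to check which mean is right is to compute total work divided by total time directly. The sketch below does that for both cases; the workload of 1,000 requests per service and the 10-second window are arbitrary values chosen for illustration.

```python
from statistics import harmonic_mean, mean

rates = [100.0, 1.0]  # req/s for service A and service B

arith = mean(rates)
harmonic = harmonic_mean(rates)

# Case 1: equal work. Each service processes the same number of requests.
requests_each = 1_000  # hypothetical workload
total_seconds = sum(requests_each / r for r in rates)        # 10 s + 1000 s
overall_equal_work = (requests_each * len(rates)) / total_seconds

# Case 2: equal time. Each service runs for the same number of seconds.
seconds_each = 10  # hypothetical window
total_requests = sum(r * seconds_each for r in rates)        # 1000 + 10
overall_equal_time = total_requests / (seconds_each * len(rates))

print(round(harmonic, 4), round(overall_equal_work, 4))  # 1.9802 1.9802
print(arith, overall_equal_time)                         # 50.5 50.5
```

With equal work, the true overall rate matches the harmonic mean. With equal time, it matches the arithmetic mean.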

Growth rates

A stock goes from 100 to 200 (100% return) then from 200 to 100 (−50% return). Arithmetic mean: (100% + (−50%)) / 2 = 25% gain. Actual result: you’re back to 100. Zero gain. The geometric mean of the growth factors: √(2.0 × 0.5) = 1.0 — exactly right.
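The same example as a short sketch, using growth factors (2.0 for +100%, 0.5 for −50%) and the standard library's geometric_mean. Compounding each "average" over the two periods shows which one reproduces the real outcome.

```python
from statistics import geometric_mean, mean

factors = [2.0, 0.5]  # +100% followed by -50%
start = 100.0

arith = mean(factors)           # 1.25, i.e. the misleading "25% gain"
geo = geometric_mean(factors)   # approximately 1.0, i.e. no gain

# Apply each average factor for two periods and compare with reality.
print(start * arith ** 2)           # 156.25
print(round(start * geo ** 2, 6))   # 100.0
```

Compounding the arithmetic mean predicts 156.25, which never happened. Compounding the geometric mean returns the actual end value of 100.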

Compound annual growth rate (CAGR) is a geometric mean by construction: it is the geometric mean of the yearly growth factors, minus one. Every time you see a CAGR figure, someone correctly chose the geometric mean over the arithmetic mean.
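A minimal sketch of that equivalence. The cagr helper and the three yearly growth factors are hypothetical, introduced only for illustration. CAGR computed from the start and end values should agree with the geometric mean of the yearly factors minus one.

```python
from statistics import geometric_mean

def cagr(start_value: float, end_value: float, years: int) -> float:
    """Compound annual growth rate from start and end values."""
    return (end_value / start_value) ** (1 / years) - 1

# Hypothetical yearly growth factors: +10%, +20%, -5%.
yearly_factors = [1.10, 1.20, 0.95]

start = 1_000.0
end = start
for factor in yearly_factors:
    end *= factor  # compound year by year

from_endpoints = cagr(start, end, len(yearly_factors))
from_factors = geometric_mean(yearly_factors) - 1

print(f"{from_endpoints:.4%}  {from_factors:.4%}")
```

Both routes print the same rate, up to floating-point rounding.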

What the article covers

The Ultra Tendency Academy article walks through all five primary measures (arithmetic, geometric, and harmonic mean, median, and mode) with formal definitions and Python examples. It then covers the practical decision framework: which distribution shapes and data types call for which measure.

It also covers the less commonly taught alternatives: trimmed mean (drop the top and bottom k% before averaging, which balances robustness and efficiency), winsorized mean (clip outliers to a threshold rather than remove them, which preserves sample size), and weighted mean (when observations carry different importance, like course credits in a GPA).
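As a rough sketch of how these three differ, here are hand-rolled versions in plain Python. The function names, the sample data, and the convention of trimming int(n × proportion) values from each end are assumptions made for illustration, not the implementations from the original article. Library versions exist (for example in SciPy and NumPy) and may round or interpolate differently.

```python
from statistics import mean

def trimmed_mean(values, proportion):
    """Drop the lowest and highest `proportion` of values, then average."""
    data = sorted(values)
    k = int(len(data) * proportion)  # number of values cut from each end
    if 2 * k >= len(data):
        raise ValueError("proportion too large for this sample")
    return mean(data[k:len(data) - k])

def winsorized_mean(values, proportion):
    """Clip the extremes to the nearest kept value, then average."""
    data = sorted(values)
    k = int(len(data) * proportion)
    if 2 * k >= len(data):
        raise ValueError("proportion too large for this sample")
    low, high = data[k], data[len(data) - k - 1]
    return mean([min(max(x, low), high) for x in data])

def weighted_mean(values, weights):
    """Average in which each value counts in proportion to its weight."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

sample = [2, 4, 4, 5, 5, 5, 6, 7, 9, 250]  # hypothetical data, one outlier

print(mean(sample))                  # 29.7
print(trimmed_mean(sample, 0.10))    # 5.625
print(winsorized_mean(sample, 0.10)) # 5.8

# GPA-style example: grades weighted by (hypothetical) course credits.
print(weighted_mean([4.0, 3.0, 2.0], [3, 4, 1]))  # 3.25
```

The trimmed mean averages 8 values, while the winsorized mean still averages all 10. That is the sample-size difference mentioned above.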

These three rarely appear in introductory statistics but come up constantly in production data systems. Weighted means are everywhere in recommendation engines, index calculations, and aggregated dashboards. Trimmed means appear in financial benchmarks (the ECB’s HICPX excludes energy and food prices, which is similar in spirit, although it drops fixed categories rather than the most extreme values). Winsorized means are used in robust regression when you can’t simply discard outliers.

The engineering takeaway

Before you compute any average: ask what question you’re actually answering.

Total is what matters → arithmetic mean (revenue per user, items per order)
Distribution is skewed and you want the typical value → median (latency, salary, property prices); add percentiles when the tail matters
Rates, ratios, or speeds averaged over equal amounts of work → harmonic mean (req/s, MB/s, miles per gallon)
Multiplicative growth → geometric mean (CAGR, percentage returns, infection rates)
Categorical data → mode (most common error type, most popular item)
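The last row is the only one that works on non-numeric data. A minimal sketch with a made-up list of error labels, using mode and multimode from the standard library:

```python
from statistics import mode, multimode

# Hypothetical error labels pulled from a log.
errors = ["timeout", "timeout", "500", "404", "timeout", "500"]

most_common = mode(errors)
print(most_common)  # timeout

# multimode returns every value tied for most frequent.
print(multimode(["a", "a", "b", "b"]))  # ['a', 'b']
```

When ties are possible, multimode is the safer call, since mode returns only the first of the tied values it encounters.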

The formula is secondary. The question is primary. Getting the question right is what separates a data scientist from a person who computed a number.

For the full mathematical treatment and worked Python examples for each measure, read the original article: The Fundamentals of Data Science — Measures of Central Tendency.
