Summary Statistics Calculator

Mean, median, mode, standard deviation, variance, range, quartiles, IQR, skewness, and kurtosis — 14 descriptive statistics from one dataset, instantly.

💡 Quick Summary

This free online summary statistics calculator computes 14 descriptive statistics for any numeric dataset in one click: mean (arithmetic average), median (middle value), mode (most frequent value), sample standard deviation, sample variance, minimum, maximum, range, Q1 (first quartile / 25th percentile), Q3 (third quartile / 75th percentile), IQR (interquartile range), skewness, and excess kurtosis — plus a count of valid values (N) and missing values. Enter your data by typing, or paste directly from Excel, Google Sheets, or any spreadsheet. Used in biostatistics, data analysis, research, and statistics coursework.

📋 How to Use

Enter your numbers in the grid — one value per row. You can also paste from Excel or Google Sheets: just copy a column and paste into any cell. The calculator fills all values automatically.
Click + 10 Rows to add more rows if your dataset is larger than the default grid.
Choose Decimal Places (2–4) to control how results are rounded.
Click Calculate. All 14 descriptive statistics appear instantly, grouped by category: central tendency (mean, median, mode), spread (SD, variance, range, min, max), quartiles (Q1, Q3, IQR), and shape (skewness, kurtosis).
Non-numeric entries (text, empty cells) are automatically skipped and counted as Missing — no need to clean your data first.
Click Load Example to see results with a built-in sample dataset, or Reset to start fresh.

🔢 Formulas

Mean (arithmetic average)

x̄ = Σxᵢ / n

Median

Sort values ascending. If n is odd: median = x₍ₖ₎ with k = (n+1)/2. If n is even: median = (x₍ₖ₎ + x₍ₖ₊₁₎) / 2 with k = n/2.

Mode

The value(s) that appear most frequently. Reported only when at least one value repeats.
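
As a rough illustration of the three definitions above, here is a minimal Python sketch. It relies on the standard library's `statistics.mean` and `statistics.median`; the helper name `modes` is hypothetical and is only one way to apply the "report a mode only when a value repeats" rule, not necessarily how the calculator does it.

```python
from collections import Counter
from statistics import mean, median

def modes(values):
    """Return every value tied for the highest count, or [] if nothing repeats."""
    counts = Counter(values)
    top = max(counts.values())
    if top < 2:  # every value is unique, so no mode is reported
        return []
    return sorted(v for v, c in counts.items() if c == top)

data = [2, 3, 3, 5, 7, 7, 7]
print(mean(data))            # sum of the values divided by n
print(median(data))          # middle value of the sorted data
print(modes(data))           # [7]
print(modes([1, 2, 3]))      # [] because no value repeats
print(modes([1, 1, 2, 2]))   # [1, 2] for bimodal data
```

The sketch assumes a non-empty list; an empty input would need its own guard.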

Range

Range = Maximum − Minimum

Sample Variance

s² = Σ(xᵢ − x̄)² / (n − 1)

Sample SD

s = √[ Σ(xᵢ − x̄)² / (n − 1) ]
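
The two formulas above can be written out directly. The following is a minimal sketch, assuming a plain list of numbers; the function name `sample_variance` is illustrative, and the result is checked against Python's own `statistics.variance`, which also divides by n − 1.

```python
import math
import statistics

def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("sample variance needs at least two values")
    m = sum(values) / n
    return sum((x - m) ** 2 for x in values) / (n - 1)

data = [2, 4, 6, 8, 10]
s2 = sample_variance(data)   # squared deviations sum to 40, and 40 / 4 = 10.0
s = math.sqrt(s2)            # sample SD is the square root of the variance

print(s2, s)
print(math.isclose(s2, statistics.variance(data)))  # True
```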

Q1 (First Quartile)

Linear interpolation at the 25th percentile — equivalent to Excel QUARTILE.INC and R quantile type 7.

Q3 (Third Quartile)

Linear interpolation at the 75th percentile.
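
For readers who want to see the interpolation spelled out, here is a hedged Python sketch of the type 7 rule: the position is h = (n − 1) × p on the zero-indexed sorted data, and the result is interpolated between the two neighbouring values. The function name `quantile_type7` is hypothetical. The standard library's `statistics.quantiles(..., method="inclusive")` uses the same rule and serves as a cross-check.

```python
import math
import statistics

def quantile_type7(values, p):
    """Linear interpolation at fraction p (0 <= p <= 1) of the sorted data."""
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    h = (len(xs) - 1) * p             # fractional zero-based position
    lo = math.floor(h)
    hi = min(lo + 1, len(xs) - 1)     # stay in range when p == 1
    return xs[lo] + (h - lo) * (xs[hi] - xs[lo])

data = [3, 7, 9, 12, 15, 20]
q1 = quantile_type7(data, 0.25)   # h = 1.25, so 7 + 0.25 * (9 - 7) = 7.5
q3 = quantile_type7(data, 0.75)   # h = 3.75, so 12 + 0.75 * (15 - 12) = 14.25
print(q1, q3)

# Cross-check against the standard library (Python 3.8+).
lib_q1, _, lib_q3 = statistics.quantiles(data, n=4, method="inclusive")
print(math.isclose(q1, lib_q1), math.isclose(q3, lib_q3))
```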

IQR (Interquartile Range)

IQR = Q3 − Q1

Skewness

g₁ = [n / ((n−1)(n−2))] × Σ((xᵢ − x̄)/s)³ (adjusted Fisher-Pearson; requires n ≥ 3)
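
A minimal Python sketch of the skewness formula above, assuming s is the sample SD (n − 1 denominator). The function name `adjusted_skewness` is illustrative. The extra guard for s = 0 is an assumption of this sketch, since the formula divides by s.

```python
import math

def adjusted_skewness(values):
    """Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(z**3)."""
    n = len(values)
    if n < 3:
        raise ValueError("skewness requires n >= 3")
    m = sum(values) / n
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("skewness is undefined when all values are identical")
    return n / ((n - 1) * (n - 2)) * sum(((x - m) / s) ** 3 for x in values)

print(adjusted_skewness([1, 2, 3, 4, 5]))    # symmetric data, so the result is 0
print(adjusted_skewness([1, 2, 3, 4, 20]))   # long right tail, so the result is positive
```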

Excess Kurtosis

g₂ = {n(n+1)/[(n−1)(n−2)(n−3)]} × Σ((xᵢ − x̄)/s)⁴ − 3(n−1)²/[(n−2)(n−3)] (requires n ≥ 4)
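
The excess kurtosis formula can be sketched the same way. This is an illustrative Python version, again assuming s is the sample SD; `excess_kurtosis` is a hypothetical name, and the zero-variance guard is an addition of this sketch.

```python
import math

def excess_kurtosis(values):
    """Sample excess kurtosis with the small-sample adjustment shown above."""
    n = len(values)
    if n < 4:
        raise ValueError("excess kurtosis requires n >= 4")
    m = sum(values) / n
    s = math.sqrt(sum((x - m) ** 2 for x in values) / (n - 1))
    if s == 0:
        raise ValueError("kurtosis is undefined when all values are identical")
    z4 = sum(((x - m) / s) ** 4 for x in values)
    scale = n * (n + 1) / ((n - 1) * (n - 2) * (n - 3))
    correction = 3 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return scale * z4 - correction

# For 1..5: sum of z**4 is 5.44, scale is 1.25, correction is 8, so g2 = -1.2
print(excess_kurtosis([1, 2, 3, 4, 5]))
```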

📖 Interpreting Results

Mean — the arithmetic average

Add up all values and divide by n. The mean is sensitive to outliers: a single extreme value can shift it significantly. Compare it to the median to judge whether your data is skewed.

Median — the middle value

The median splits sorted data into two equal halves. It is resistant to outliers, making it the preferred measure of central tendency for skewed distributions such as income, survival times, or reaction times.

Mode — the most frequent value

The mode is the value that occurs most often. Data can be unimodal (one mode), bimodal (two modes), or multimodal. If all values are unique, there is no mode. The mode is the only measure of central tendency applicable to nominal (unordered categorical) data.

Range

Range = Max − Min. It captures the total spread of your data but is heavily influenced by outliers. For a more robust measure of spread, use the IQR instead.

Standard deviation (SD)

SD measures the typical distance of data points from the mean (strictly, the square root of the average squared deviation, computed here with n−1). In a normal distribution: ~68% of data falls within ±1 SD, ~95% within ±2 SD, and ~99.7% within ±3 SD (the empirical rule). A larger SD means more variability; SD = 0 means all values are identical.
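
The empirical-rule percentages can be checked numerically. The sketch below uses Python's `statistics.NormalDist` (Python 3.8+) and applies only to an exactly normal distribution; real datasets will match these shares only approximately. The mean of 50 and SD of 5 are example values chosen for illustration.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

# Share of a normal distribution lying within k standard deviations of the mean.
shares = {k: standard_normal.cdf(k) - standard_normal.cdf(-k) for k in (1, 2, 3)}
for k, share in shares.items():
    print(f"within ±{k} SD: {share:.1%}")   # 68.3%, 95.4%, 99.7%

# The same rule on the original scale: mean = 50, SD = 5.
example = NormalDist(mu=50, sigma=5)
between_45_and_55 = example.cdf(55) - example.cdf(45)
print(f"{between_45_and_55:.1%}")           # 68.3%
```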

Variance

Variance = SD². It is the sum of squared deviations from the mean divided by n−1, which is approximately the average squared deviation. Variance is harder to interpret directly (units are squared), but it is the foundation of many statistical tests such as ANOVA.

Q1, Q3, and IQR

Q1 (25th percentile) and Q3 (75th percentile) mark the boundaries of the middle 50% of your data. The interquartile range (IQR = Q3 − Q1) is a robust measure of spread, largely unaffected by outliers. A common rule flags values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR as potential outliers (Tukey's fence).
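
Here is a short Python sketch of the Tukey fence rule, using made-up example data. Quartiles come from `statistics.quantiles` with `method="inclusive"`, which applies the same linear interpolation as R type 7. The function name `tukey_outliers` and the default multiplier argument `k` are illustrative.

```python
import statistics

def tukey_outliers(values, k=1.5):
    """Return (lower fence, upper fence, values outside the fences)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    flagged = [x for x in values if x < lower or x > upper]
    return lower, upper, flagged

data = [10, 12, 13, 14, 15, 16, 18, 45]
lower, upper, flagged = tukey_outliers(data)
# Q1 = 12.75 and Q3 = 16.5, so IQR = 3.75 and the fences are 7.125 and 22.125
print(lower, upper, flagged)   # 7.125 22.125 [45]
```

Flagged values are candidates for a closer look, not automatic deletions.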

Five-number summary

The five-number summary — Minimum, Q1, Median, Q3, Maximum — gives a compact picture of your data's distribution and is the basis of box plots (box-and-whisker plots).
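
A five-number summary takes only a few lines of Python. This sketch assumes the inclusive (type 7) quartile method and at least two values; `five_number_summary` is a hypothetical helper name.

```python
import statistics

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    if len(values) < 2:
        raise ValueError("need at least two values")
    q1, med, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, med, q3, max(values)

print(five_number_summary([3, 7, 9, 12, 15, 20]))
# (3, 7.5, 10.5, 14.25, 20)
```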

Mean vs Median — detecting skew

When mean ≈ median, the distribution is roughly symmetric. Mean > median signals right (positive) skew — a few high outliers pulling the mean up. Mean < median signals left (negative) skew — a few low outliers pulling the mean down.
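
To see the effect in numbers, the sketch below compares the mean and median of a small made-up dataset before and after one large value is added.

```python
from statistics import mean, median

symmetric = [30, 32, 35, 38, 40]
with_outlier = symmetric + [400]   # one unusually large value

print(mean(symmetric), median(symmetric))         # 35 and 35: mean equals median
print(mean(with_outlier), median(with_outlier))   # mean jumps to about 95.8, median is 36.5
```

The single extreme value moves the mean by about 60 units but the median by only 1.5.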

Skewness near 0

The distribution is approximately symmetric. Values between −0.5 and +0.5 are generally considered roughly symmetric; |skewness| > 1 is considered substantially skewed.

Positive skewness (right skew)

The right tail is longer — a few unusually large values stretch the distribution rightward. Common in income data, reaction times, and biological concentration data.

Negative skewness (left skew)

The left tail is longer — a few unusually small values stretch the distribution leftward. Common in exam scores near the maximum or age at death in a healthy population.

Excess kurtosis = 0 (mesokurtic)

Tail weight is similar to that of a normal distribution.

Excess kurtosis > 0 (leptokurtic)

Heavier tails than normal — more extreme outliers than expected. Common in financial returns and biological measurements with rare extreme events.

Excess kurtosis < 0 (platykurtic)

Lighter tails than normal, with fewer extreme values. The uniform distribution is a typical example. This describes the tails, not how flat or peaked the center looks.

⚠️ Common Mistakes

Sample statistics, not population statistics

Variance and SD use Bessel's correction (dividing by n−1 instead of n). This makes the variance an unbiased estimate when your data is a sample from a larger population, which is the standard assumption in biostatistics, biology, and most research. This matches SPSS, R (var(), sd()), and Excel (VAR.S, STDEV.S). If your dataset is the entire population, use population formulas (divide by n) instead.
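
Python's standard library exposes both versions, which makes the difference easy to see. In this sketch, `statistics.variance` and `statistics.stdev` divide by n − 1 (sample), while `statistics.pvariance` and `statistics.pstdev` divide by n (population). The data is a made-up example.

```python
import math
import statistics

data = [2, 4, 6, 8, 10]
n = len(data)

sample_var = statistics.variance(data)        # 40 / (n - 1) = 10
population_var = statistics.pvariance(data)   # 40 / n = 8
print(sample_var, population_var)

# The two differ only by the factor (n - 1) / n, which approaches 1 as n grows.
print(math.isclose(population_var, sample_var * (n - 1) / n))   # True
print(statistics.stdev(data), statistics.pstdev(data))
```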

Skewness requires n ≥ 3; kurtosis requires n ≥ 4

The adjusted Fisher-Pearson skewness formula is mathematically undefined for fewer than 3 values. Excess kurtosis is undefined for fewer than 4 values. These show as "n < 3" or "n < 4" when the sample is too small.

"No mode" means all values are unique

Mode is only reported when at least one value appears more than once. If every value in your dataset is unique, there is no mode. When multiple values share the highest frequency, all are shown as modes (multimodal data).

Mean and SD are unreliable for heavily skewed data

For skewed distributions (e.g. reaction times, income, cytokine concentrations), the mean and standard deviation can be misleading. In these cases, prefer the median as the measure of center and the IQR as the measure of spread.

❓ FAQ

What is the mean in statistics and how is it calculated?

The mean (arithmetic mean or average) is the sum of all values divided by the number of values: x̄ = Σxᵢ / n. For example, the mean of {2, 4, 6, 8, 10} = (2+4+6+8+10) / 5 = 30 / 5 = 6. The mean is the most common measure of central tendency, but it is sensitive to outliers — a single extreme value can pull it far from the "typical" value in your dataset.

How do you calculate the median step by step?

To find the median: (1) Sort all values from smallest to largest. (2) If n is odd, the median is the middle value at position (n+1)/2. (3) If n is even, the median is the average of the two middle values at positions n/2 and n/2+1. Example (odd n): {3, 7, 9, 12, 15} → median = 9. Example (even n): {3, 7, 9, 12} → median = (7+9)/2 = 8. The median is resistant to outliers and is preferred for skewed data.

What is mode in statistics?

The mode is the value that appears most frequently in a dataset. A dataset can have one mode (unimodal), two modes (bimodal), or more (multimodal). If all values appear exactly once, there is no mode. Example: in {2, 3, 3, 5, 7, 7, 7}, the mode is 7. The mode is the only measure of central tendency that can be used for nominal (unordered, non-numeric) categorical data.

What is standard deviation and how do you interpret it?

Standard deviation (SD) measures how spread out values are around the mean. A small SD means values cluster closely around the mean; a large SD means they are spread widely. In a normal distribution, ~68% of data falls within ±1 SD of the mean, ~95% within ±2 SD, and ~99.7% within ±3 SD. For example, if a roughly normal dataset has mean = 50 and SD = 5, then about 68% of values lie between 45 and 55. This calculator uses the sample standard deviation (divides by n−1), which matches R, SPSS, and Excel STDEV.S.

What is variance and how is it different from standard deviation?

Variance is the sum of squared deviations from the mean divided by n−1, which is approximately the average squared deviation: s² = Σ(xᵢ − x̄)² / (n−1). Standard deviation is simply the square root of variance. Variance is harder to interpret directly because its units are squared (e.g., kg² instead of kg), but it is mathematically convenient in statistical tests like ANOVA. Standard deviation is easier to interpret because it shares the same units as the original data.

What is the interquartile range (IQR) and how is it calculated?

The interquartile range (IQR) is a measure of statistical spread that covers the middle 50% of your data. It is calculated as IQR = Q3 − Q1, where Q1 is the 25th percentile and Q3 is the 75th percentile. For example, if Q1 = 20 and Q3 = 35, then IQR = 15. Unlike range (Max − Min), the IQR is resistant to outliers, making it a better measure of spread for skewed data. IQR is also used to detect outliers: values below Q1 − 1.5×IQR or above Q3 + 1.5×IQR are flagged as potential outliers (Tukey's fence).

How do you find Q1 and Q3 (first and third quartiles)?

Q1 (first quartile) is the 25th percentile — the value below which 25% of the data falls. Q3 (third quartile) is the 75th percentile — the value below which 75% of the data falls. This calculator uses linear interpolation (equivalent to Excel's QUARTILE.INC function and R's quantile(type=7)). Different software may give slightly different quartile values depending on the interpolation method used.

What is the five-number summary in statistics?

The five-number summary consists of: (1) Minimum — the smallest value, (2) Q1 — the first quartile (25th percentile), (3) Median — the middle value (50th percentile), (4) Q3 — the third quartile (75th percentile), and (5) Maximum — the largest value. Together, these five numbers give a concise description of the distribution and form the basis of box plots (box-and-whisker plots). All five are calculated by this summary statistics calculator.

What is the range in statistics?

The range is the simplest measure of spread: Range = Maximum − Minimum. It tells you the total width of your data. The range is easy to calculate but is heavily influenced by outliers — a single extreme value can make the range very large even if most data is tightly clustered. For a more robust measure of spread, use the IQR (interquartile range) instead.

What is skewness and how do you interpret it?

Skewness measures the asymmetry of a distribution. A skewness near 0 means the distribution is roughly symmetric. Positive skewness (right skew) means the right tail is longer — a few high outliers pull the mean above the median (common in income, survival times, or cytokine levels). Negative skewness (left skew) means the left tail is longer — a few low outliers pull the mean below the median. As a rough guide: |skewness| < 0.5 is approximately symmetric; 0.5–1.0 is moderately skewed; >1.0 is substantially skewed.

What is excess kurtosis and what does it measure?

Excess kurtosis measures the heaviness of the tails of a distribution relative to a normal distribution (which has excess kurtosis = 0). Positive excess kurtosis (leptokurtic) indicates heavier tails, meaning more extreme outliers than you would expect from a normal distribution. Negative excess kurtosis (platykurtic) indicates lighter tails, meaning fewer extremes. Important: excess kurtosis does NOT measure how peaked or flat the center of the distribution is; that is a widespread misconception. This calculator uses the bias-adjusted sample formula given under Formulas, matching SPSS and Excel KURT().

What is the difference between sample and population standard deviation?

Population standard deviation (σ) divides by n and is used when you have data for the entire population. Sample standard deviation (s) divides by n−1 (Bessel's correction) and is used when your data is a sample drawn from a larger population — which is almost always the case in research. Dividing by n−1 corrects the downward bias that occurs when estimating population variance from a sample. This calculator uses the sample standard deviation (n−1), matching R sd(), Excel STDEV.S, and SPSS.

What are measures of central tendency?

Measures of central tendency describe the "center" or "typical value" of a dataset. The three main measures are: (1) Mean — the arithmetic average, best for symmetric data without outliers; (2) Median — the middle value, best for skewed data or data with outliers; (3) Mode — the most frequent value, useful for categorical data or identifying peaks in a distribution. This calculator computes all three simultaneously.

What are measures of spread (dispersion) in statistics?

Measures of spread quantify how variable your data is. This calculator reports five: (1) Range — Max minus Min, simple but sensitive to outliers; (2) Variance — average squared deviation from the mean; (3) Standard deviation — square root of variance, in the same units as the data; (4) IQR — interquartile range (Q3 − Q1), resistant to outliers; (5) Q1 and Q3 — the boundaries of the middle 50% of data.

What is the difference between mean and median?

The mean is the arithmetic average — the sum of all values divided by n. The median is the middle value when data is sorted. When the distribution is symmetric and has no outliers, mean ≈ median. When data is right-skewed (e.g. income, reaction times), mean > median because a few large values pull the average up. For skewed data, the median is the better "typical value." Comparing mean and median is a quick way to assess skewness.

Can I paste data from Excel or Google Sheets?

Yes. Copy a single column of numbers from Excel or Google Sheets (Ctrl+C) and paste into any cell in the data grid (Ctrl+V). The calculator automatically fills all rows downward. Comma-separated, space-separated, tab-separated, and semicolon-separated formats also work. Non-numeric tokens such as column headers or empty cells are automatically skipped and counted as missing values — no need to clean your data first.
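
The calculator's own parsing code is not shown here, so the following Python sketch is only an assumption about how pasted text could be split and filtered. It splits on commas, semicolons, and any whitespace (including tabs and newlines), keeps tokens that parse as finite numbers, and counts the rest as missing. The function name `parse_pasted` is hypothetical. Unlike the calculator, this sketch does not count empty cells as missing, because consecutive separators are collapsed.

```python
import math
import re

def parse_pasted(text):
    """Split pasted text into numeric values and a count of non-numeric tokens."""
    tokens = [t for t in re.split(r"[,;\s]+", text.strip()) if t]
    values, missing = [], 0
    for token in tokens:
        try:
            x = float(token)
        except ValueError:
            missing += 1          # e.g. a column header or "n/a"
            continue
        if math.isfinite(x):
            values.append(x)
        else:
            missing += 1          # float() accepts "nan" and "inf"; treat them as missing
    return values, missing

values, missing = parse_pasted("weight\n12.5\n13\nn/a\n14.2;15")
print(values, missing)   # [12.5, 13.0, 14.2, 15.0] 2
```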

Why does my dataset have no mode?

Mode is only reported when at least one value appears more than once. If every value in your dataset is unique (appears exactly once), the data has no mode. This is common in continuous measurements like height, weight, or concentration where repeated exact values are unlikely. In that case, use mean and median to describe the center of your data.

Why does this calculator divide variance by n−1 instead of n?

Dividing by n (population variance) systematically underestimates the true population variance when calculated from a sample. Bessel's correction (dividing by n−1) removes this bias and produces an unbiased estimate of the population variance. This is the standard approach in statistics, biostatistics, and data analysis. It matches what SPSS, R (var()), Excel (VAR.S), and Python (statistics.variance()) all use.