# What are the range and standard deviation?

This guide outlines three methods used to summarise the variability in a dataset.

It will help you identify which measure is most appropriate to use for a particular set of data.

Examples are also given of the use of these measures and how the standard deviation can be calculated using Excel.

Other useful guides:  Using averages, Working with percentages

## Introduction

Measures of average such as the median and mean represent the typical value for a dataset. Within the dataset the actual values usually differ from one another and from the average value itself.

The extent to which the median and mean are good representatives of the values in the original dataset depends upon the variability or dispersion in the original data.

Datasets are said to have high dispersion when they contain values considerably higher and lower than the mean value.

In figure 1 the numbers of tutorial groups of each size in semester 1 and semester 2 are presented. In both semesters the mean and median tutorial group size is 5 students; however, the groups in semester 2 show more dispersion (or variability in size) than those in semester 1.

Dispersion within a dataset can be measured or described in several ways including the range, inter-quartile range and standard deviation.

### The Range

The range is the most obvious measure of dispersion and is the difference between the lowest and highest values in a dataset.

In figure 1, the size of the largest semester 1 tutorial group is 6 students and the size of the smallest group is 4 students, resulting in a range of 2 (6-4).

In semester 2, the largest tutorial group size is 7 students and the smallest tutorial group contains 3 students, so the range is 4 (7-3).

• The range is simple to compute and is useful when you wish to evaluate the whole of a dataset.
• The range is useful for showing the spread within a dataset and for comparing the spread between similar datasets.

An example of the use of the range to compare spread within datasets is provided in table 1. The scores of individual students in the examination and coursework component of a module are shown.

To find the range in marks the highest and lowest values need to be found from the table. The highest coursework mark was 48 and the lowest was 27 giving a range of 21. In the examination, the highest mark was 45 and the lowest 12 producing a range of 33. This indicates that there was wider variation in the students’ performance in the examination than in the coursework for this module.

Since the range is based solely on the two most extreme values within the dataset, if one of these is either exceptionally high or low (sometimes referred to as an outlier) it will result in a range that is not typical of the variability within the dataset.

For example, imagine in the above example that one student failed to hand in any coursework and was awarded a mark of zero, but sat the exam and scored 40. The range for the coursework marks would now become 48 (48-0), rather than 21. The new range is not typical of the dataset as a whole and is distorted by the outlier in the coursework marks.
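The effect of a single outlier on the range can be sketched in Python. The coursework marks below are hypothetical; only the highest (48) and lowest (27) values are taken from the example, and `value_range` is a name chosen for illustration.

```python
# Hypothetical coursework marks; only the extremes (27 and 48) come from the text.
coursework = [27, 31, 35, 38, 40, 42, 45, 48]


def value_range(values):
    """Return the difference between the highest and lowest values."""
    return max(values) - min(values)


# Range of the marks as given.
range_without_outlier = value_range(coursework)

# Range after one student is awarded a mark of zero.
range_with_outlier = value_range(coursework + [0])

print(range_without_outlier, range_with_outlier)
```

Adding one value more than doubles the range, even though the other marks are unchanged.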

In order to reduce the problems caused by outliers in a dataset, the inter-quartile range is often calculated instead of the range.

### The Inter-quartile Range

The inter-quartile range is a measure that indicates the extent to which the central 50% of values within the dataset are dispersed. It is based upon, and related to, the median.

In the same way that the median divides a dataset into two halves, it can be further divided into quarters by identifying the upper and lower quartiles.

The lower quartile is found one quarter of the way along a dataset when the values have been arranged in order of magnitude; the upper quartile is found three quarters along the dataset.

Therefore, in the ordered dataset the upper quartile lies halfway between the median and the highest value, whilst the lower quartile lies halfway between the median and the lowest value. The inter-quartile range is found by subtracting the lower quartile from the upper quartile.

For example, the examination marks for 20 students following a particular module are arranged in order of magnitude.

• The median lies at the mid-point between the two central values (10th and 11th) = half-way between 60 and 62 = 61
• The lower quartile lies at the mid-point between the 5th and 6th values = half-way between 52 and 53 = 52.5
• The upper quartile lies at the mid-point between the 15th and 16th values = half-way between 70 and 71 = 70.5

The inter-quartile range for this dataset is therefore 70.5 – 52.5 = 18 whereas the range is: 80 – 43 = 37.
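A minimal Python sketch of this calculation follows. The 20 marks are hypothetical, chosen only to agree with the values quoted above (5th = 52, 6th = 53, 10th = 60, 11th = 62, 15th = 70, 16th = 71, lowest = 43, highest = 80). The function `quartiles_by_halves` is an illustrative name, and taking the median of each half is one common convention; other quartile conventions can give slightly different results.

```python
from statistics import median

# Hypothetical marks, consistent with the values quoted in the example.
marks = [43, 46, 48, 50, 52, 53, 55, 57, 58, 60,
         62, 64, 66, 68, 70, 71, 73, 75, 78, 80]


def quartiles_by_halves(values):
    """Return (lower quartile, median, upper quartile).

    Each quartile is the median of one half of the ordered data.
    For an odd number of values the middle value is left out of both halves.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower_half = ordered[:half]
    upper_half = ordered[half:] if n % 2 == 0 else ordered[half + 1:]
    return median(lower_half), median(ordered), median(upper_half)


q1, q2, q3 = quartiles_by_halves(marks)
iqr = q3 - q1
full_range = max(marks) - min(marks)
print(q1, q2, q3, iqr, full_range)
```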

The inter-quartile range provides a clearer picture of the overall dataset by removing/ignoring the outlying values.

## Measures of Variance

Range

The range is the difference between the high and low values. Since it uses only the extreme values, it is greatly affected by extreme values.

Procedure for finding

1. Take the largest value and subtract the smallest value

Formula

$$\text{Range} = x_{\max} - x_{\min}$$

Variance

The variance is the average squared deviation from the mean. Its usefulness is limited because the units are squared and so are not the same as those of the original data. The sample variance is denoted by $s^2$ and is an unbiased estimator of the population variance.

Procedure for finding

1. Find the mean of the data
2. Subtract the mean from each value to find the deviation from the mean
3. Square the deviation from the mean
4. Total the squares of the deviation from the mean
5. Divide by the degrees of freedom (one less than the sample size)

Formula

$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$

Standard Deviation

The standard deviation measures the typical deviation from the mean. It is found by taking the square root of the variance, which solves the problem of not having the same units as the original data. The sample standard deviation is denoted by $s$. It is not an unbiased estimator of the population standard deviation.

Procedure for finding

1. Find the variance
2. Take the square root

Formula

$$s = \sqrt{s^2} = \sqrt{\frac{\sum (x - \bar{x})^2}{n - 1}}$$
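The procedures for the variance and standard deviation can be sketched in Python as below. The function names and the five data values are illustrative assumptions, not part of the original guide.

```python
import math


def sample_variance(values):
    """Sample variance, following the five steps of the procedure."""
    n = len(values)
    mean = sum(values) / n                      # 1. find the mean
    deviations = [x - mean for x in values]     # 2. deviation from the mean
    squared = [d ** 2 for d in deviations]      # 3. square each deviation
    total = sum(squared)                        # 4. total the squares
    return total / (n - 1)                      # 5. divide by degrees of freedom


def sample_std_dev(values):
    """Sample standard deviation: the square root of the sample variance."""
    return math.sqrt(sample_variance(values))


data = [1, 2, 3, 4, 5]  # illustrative values
print(sample_variance(data), sample_std_dev(data))
```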

### Less Common Measures of Variance

Mean Absolute Deviation

The sum of the deviations from the mean will always be zero, because the positive and negative deviations cancel. To measure spread we therefore need to make sure that none of the deviations are negative. We can do this by squaring each deviation (as we do in the variance or standard deviation) or by taking the absolute value (as we do in the mean absolute deviation).

Procedure for finding

1. Find the mean of the data
2. Subtract the mean from each data value to get the deviation from the mean
3. Take the absolute value of each deviation from the mean
4. Total the absolute values of the deviations from the mean
5. Divide the total by the sample size.

Formula

$$\text{MAD} = \frac{\sum |x - \bar{x}|}{n}$$
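A short Python sketch of the mean absolute deviation procedure follows. The function name and data values are illustrative assumptions. It also checks the claim that the plain deviations sum to zero.

```python
def mean_absolute_deviation(values):
    """Mean absolute deviation from the mean, following the steps above."""
    n = len(values)
    mean = sum(values) / n
    absolute_deviations = [abs(x - mean) for x in values]
    return sum(absolute_deviations) / n


data = [1, 2, 3, 4, 5]  # illustrative values

# The signed deviations cancel out, which is why absolute values are needed.
mean = sum(data) / len(data)
signed_total = sum(x - mean for x in data)

mad = mean_absolute_deviation(data)
print(signed_total, mad)
```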

Variation

The variation is the sum of the squares of the deviations from the mean. It has units that are squared instead of the same as the original data and it does not take the sample size into account.

Procedure for finding

1. Find the mean of the data
2. Subtract the mean from each value to find the deviation from the mean
3. Square the deviation from the mean
4. Total the squares of the deviation from the mean

Formula

$$\text{Variation} = \sum (x - \bar{x})^2$$

Range Rule of Thumb

The range rule of thumb says that the range is approximately four times the standard deviation. Equivalently, the standard deviation is approximately one-fourth of the range. The rule rests on the observation that most of the data usually lies within two standard deviations of the mean, so the full spread covers roughly four standard deviations.

Procedure for finding

1. Find the range
2. Divide it by four

Formula

$$s \approx \frac{\text{Range}}{4}$$
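As a rough check of the rule, the sketch below compares the estimate with the sample standard deviation from Python's `statistics` module. The data values are illustrative; the rule is only an approximation, so the two results are not expected to match exactly.

```python
import statistics

data = [8, 25, 7, 5, 8, 3, 10, 12, 9]  # illustrative values

# Range rule of thumb: standard deviation is roughly range / 4.
estimated_sd = (max(data) - min(data)) / 4

# Sample standard deviation (divides by n - 1) for comparison.
actual_sd = statistics.stdev(data)

print(estimated_sd, round(actual_sd, 2))
```

Here the single large value (25) makes the data less typical, which is one reason the estimate and the calculated value differ.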

Pearson's Index of Skewness

Pearson's index of skewness can be used to determine whether the data is symmetric or skewed. If the index is between -1 and 1, the distribution is treated as approximately symmetric. If the index is -1 or less, the data is skewed to the left, and if it is 1 or more, it is skewed to the right.

Procedure for finding

1. Find the mean, median, and standard deviation of the data.
2. Subtract the median from the mean.
3. Multiply by 3
4. Divide by the standard deviation

Formula

$$I = \frac{3(\bar{x} - \text{median})}{s}$$
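The four steps can be sketched in Python using the standard `statistics` module. The function name `pearson_skewness_index` and the data values are illustrative assumptions.

```python
import statistics


def pearson_skewness_index(values):
    """Pearson's index of skewness: 3 * (mean - median) / standard deviation."""
    mean = statistics.mean(values)
    middle = statistics.median(values)
    s = statistics.stdev(values)  # sample standard deviation
    return 3 * (mean - middle) / s


data = [8, 25, 7, 5, 8, 3, 10, 12, 9]  # illustrative values
index = pearson_skewness_index(data)

# Apply the cut-offs described above.
if index <= -1:
    verdict = "skewed left"
elif index >= 1:
    verdict = "skewed right"
else:
    verdict = "approximately symmetric"

print(round(index, 2), verdict)
```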

Coefficient of Variation

The coefficient of variation is expressed as a percent and describes the standard deviation relative to the mean. It can be used to compare variability when the units are different (the units will divide out, providing just a raw number).

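Procedure for finding

1. Find the mean and standard deviation of the data
2. Divide the standard deviation by the mean
3. Multiply by 100 to express the result as a percent

Formula

$$CV = \frac{s}{\bar{x}} \times 100\%$$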
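The sketch below illustrates why the coefficient of variation is unit-free: the same hypothetical heights expressed in centimetres and in metres give the same percentage. The function name and data are assumptions made for illustration.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation relative to the mean, as a percent."""
    return statistics.stdev(values) / statistics.mean(values) * 100


# Hypothetical heights, first in centimetres and then in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

# The units divide out, so both results agree (up to rounding error).
print(round(cv_cm, 2), round(cv_m, 2))
```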

Chebyshev's Rule

## Descriptive Statistics

Statistical Indices of Data Variability

Measures of Dispersion

Range: The range gives you the most basic information about the spread of scores. It is calculated as the difference between the lowest and highest scores.

Interquartile Range: The difference between the score representing the 75th percentile and the score representing the 25th percentile is the interquartile range. This value gives you the range of the middle 50% of the values in the data set.

Variance and Standard Deviation: The standard deviation is the square root of the average squared deviation from the mean. The average squared deviation from the mean is also known as the variance.

Understanding and Calculating the Standard Deviation

Computers are used extensively for calculating the standard deviation and other statistics. However, calculating the standard deviation by hand once or twice can be helpful in developing an understanding of its meaning.

Calculating the variance and standard deviation

Consider the observations 8, 25, 7, 5, 8, 3, 10, 12, 9.

1. First, determine n, which is the number of data values.
2. Second, calculate the arithmetic mean, which is the sum of scores divided by n. For this example, the mean = (8+25+7+5+8+3+10+12+9) / 9 or 9.67
3. Then, subtract the mean from each individual score to find the individual deviations.
4. Then, square the individual deviations.
5. Then, find the sum of the squares of the deviations…can you see why we squared them before adding the values?
6. Divide the sum of the squares of the deviations by n-1. This is the Variance!
7. Take the square root of the variance to obtain the standard deviation, which has the same units as the original data.

| Score | Mean | Deviation* | Squared deviation |
|---|---|---|---|
| 8 | 9.67 | -1.67 | 2.79 |
| 25 | 9.67 | +15.33 | 235.01 |
| 7 | 9.67 | -2.67 | 7.13 |
| 5 | 9.67 | -4.67 | 21.81 |
| 8 | 9.67 | -1.67 | 2.79 |
| 3 | 9.67 | -6.67 | 44.49 |
| 10 | 9.67 | +0.33 | 0.11 |
| 12 | 9.67 | +2.33 | 5.43 |
| 9 | 9.67 | -0.67 | 0.45 |

Sum of squared deviations = 320.01

*Deviation = Score - Mean

$$\text{Standard deviation} = \sqrt{\frac{\text{sum of squared deviations}}{N-1}} = \sqrt{\frac{320.01}{9-1}} = \sqrt{40} = 6.32$$
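The hand calculation can be checked with Python's standard `statistics` module, whose `variance` and `stdev` functions divide by n-1 as in step 6. This is only a cross-check, not part of the original guide. Working without intermediate rounding, the sum of squared deviations is exactly 320; the 320.01 in the hand calculation comes from rounding the mean to 9.67.

```python
import statistics

scores = [8, 25, 7, 5, 8, 3, 10, 12, 9]

mean = statistics.mean(scores)          # arithmetic mean
variance = statistics.variance(scores)  # sample variance (divides by n - 1)
std_dev = statistics.stdev(scores)      # sample standard deviation

print(round(mean, 2), variance, round(std_dev, 2))
```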

Raw score method for calculating standard deviation

Again, consider the observations 8, 25, 7, 5, 8, 3, 10, 12, 9.

1. First, square each of the scores.
2. Determine N, which is the number of scores.
3. Compute the sum of X and the sum of X-squared.
4. Then, calculate the standard deviation as illustrated below.

| Score (X) | X² |
|---|---|
| 8 | 64 |
| 25 | 625 |
| 7 | 49 |
| 5 | 25 |
| 8 | 64 |
| 3 | 9 |
| 10 | 100 |
| 12 | 144 |
| 9 | 81 |
| Sum of X = 87 | Sum of X² = 1161 |

N = 9

$$\text{Standard deviation} = \sqrt{\frac{\sum X^2 - \frac{(\sum X)^2}{N}}{N-1}} = \sqrt{\frac{1161 - \frac{87 \times 87}{9}}{9-1}} = \sqrt{\frac{1161 - 841}{8}} = \sqrt{\frac{320}{8}} = \sqrt{40} = 6.32$$

Even simple statistics, such as the standard deviation, are tedious to calculate “by hand”.
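The raw score method translates directly into a few lines of Python. The variable names below are chosen for illustration.

```python
import math

scores = [8, 25, 7, 5, 8, 3, 10, 12, 9]

n = len(scores)                      # N, the number of scores
sum_x = sum(scores)                  # sum of X
sum_x2 = sum(x * x for x in scores)  # sum of X squared

# Raw score formula: sqrt((sum of X^2 - (sum of X)^2 / N) / (N - 1))
std_dev = math.sqrt((sum_x2 - sum_x ** 2 / n) / (n - 1))

print(n, sum_x, sum_x2, round(std_dev, 2))
```

This gives the same result as the deviation method because the two formulas are algebraically equivalent.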

## Standard Deviation Calculator


Standard deviation in statistics, typically denoted by σ, is a measure of variation or dispersion (refers to a distribution's extent of stretching or squeezing) between values in a set of data.

The lower the standard deviation, the closer the data points tend to be to the mean (or expected value), μ. Conversely, a higher standard deviation indicates a wider range of values.

Similarly to other mathematical and statistical concepts, there are many different situations in which standard deviation can be used, and thus many different equations. In addition to expressing population variability, the standard deviation is also often used to measure statistical results such as the margin of error.

When used in this manner, standard deviation is often called the standard error of the mean, or standard error of the estimate with regard to a mean.

### Population Standard Deviation

The population standard deviation, the standard definition of σ, is used when an entire population can be measured, and is the square root of the variance of a given data set. In cases where every member of a population can be sampled, the following equation can be used to find the standard deviation of the entire population:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2}$$

Where:

• $x_i$ is an individual value
• $\mu$ is the mean/expected value
• $N$ is the total number of values

For those unfamiliar with summation notation, the equation above may seem daunting, but when addressed through its individual components, this summation is not particularly complicated. The i=1 in the summation indicates the starting index, i.e. for the data set 1, 3, 4, 7, 8, i=1 would be 1, i=2 would be 3, and so on.

Hence the summation notation simply means to perform the operation $(x_i - \mu)^2$ on each value through N, which in this case is 5 since there are 5 values in this data set.

EX:

$$\mu = \frac{1+3+4+7+8}{5} = 4.6$$

$$\sigma = \sqrt{\frac{(1 - 4.6)^2 + (3 - 4.6)^2 + \dots + (8 - 4.6)^2}{5}} = \sqrt{\frac{12.96 + 2.56 + 0.36 + 5.76 + 11.56}{5}} = 2.577$$
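The same example can be reproduced in Python. The sketch computes σ from the definition and compares it with `statistics.pstdev`, the standard-library function for the population standard deviation.

```python
import math
import statistics

data = [1, 3, 4, 7, 8]

N = len(data)
mu = sum(data) / N  # population mean

# Sum the squared deviations over every value, divide by N, take the root.
sigma = math.sqrt(sum((x - mu) ** 2 for x in data) / N)

# Cross-check against the standard library.
sigma_check = statistics.pstdev(data)

print(mu, round(sigma, 3), round(sigma_check, 3))
```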

### Sample Standard Deviation

In many cases, it is not possible to sample every member within a population, requiring that the above equation be modified so that the standard deviation can be measured through a random sample of the population being studied. A common estimator for σ is the sample standard deviation, typically denoted by s.

## Range and Standard Deviation – Magoosh Statistics Blog

When you start out with statistics, there are a lot of terms that can be super confusing. Take mean, median, and mode for example; they sound similar but mean completely different things.

But they are central to understanding how statistical models and methods work.

Another set of terms that are central to understanding statistical models are range and standard deviation.

### Home on the Range

When we think about it in mathematical terms, range is a pretty straightforward term. It means the distance between the highest value and the lowest value. Let’s take a look at three data sets for an idea of their ranges.

The mean of each data set is the same, so we may be tempted to think that the data are the same. But a look at the range says otherwise. In the first dataset, X1, the range is 25 – 5 = 20, while dataset X3 has a range of 90 – (-60) = 150! This represents vast differences in the data that we have to account for in some way.

The range also represents the variability of the data. Datasets with a large range are said to have large variability, while datasets with smaller ranges are said to have small variability. Generally, smaller variability is better because it represents more precise measurements and yields more accurate analyses.

The range is a descriptive term that is useful for describing data. Its chief use is in calculating quartiles and interquartile range. But while range is a good gauge of the variability of the data, there is a more accurate and useful one: standard deviation.

### Good Ol’ Standard Deviation

Standard deviation is the standard way that we understand and report variability. The most awesome thing about standard deviation is that we can use it not only to describe data but also conduct further analyses such as ANOVA or multiple linear regressions.

Standard deviation is a reliable method for determining how variable the data is for both a sample and a population. Of course, we cannot truly know the standard deviation for a population, but with the standard deviation of a sample, we can infer it.

The deviation is how much a score varies from the overall mean of the data. In the case of our example data, it would be how much each value differs from the mean of 15. We generally use s to represent deviation. For our data the deviation is

## How can I calculate SD from a mean sample, range, N?

• Eric Lim

An appreciation and understanding of statistics is important to all practising clinicians, not simply researchers. This is because mathematics is the fundamental basis on which we base clinical decisions, usually with reference to the benefit in relation to risk. Unless a clinician has a basic understanding of statistics, he or she will never be in a…

## How to Find the Mean, Median, Mode, Range, and Standard Deviation

Updated May 14, 2018

By Karen G Blaettler

Simplify comparisons of sets of numbers, especially large sets of numbers, by calculating the center values using mean, mode and median. Use the ranges and standard deviations of the sets to examine the variability of data.

The mean identifies the average value of the set of numbers. For example, consider the data set containing the values 20, 24, 25, 36, 25, 22, 23.

To find the mean, use the formula: Mean equals the sum of the numbers in the data set divided by the number of values in the data set. In mathematical terms: Mean=(sum of all terms)÷(how many terms or values in the set).

Add the numbers in the example data set: 20+24+25+36+25+22+23=175.

Divide by the number of data points in the set. This set has seven values so divide by 7.

Insert the values into the formula to calculate the mean. The mean equals the sum of the values (175) divided by the number of data points (7). Since 175÷7=25, the mean of this data set equals 25. Not all mean values will equal a whole number.
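The mean calculation above takes only a few lines of Python. The variable names are illustrative.

```python
values = [20, 24, 25, 36, 25, 22, 23]

total = sum(values)    # add the numbers in the data set
count = len(values)    # number of data points
mean = total / count   # sum divided by the number of values

print(total, count, mean)
```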

The median identifies the midpoint or middle value of a set of numbers.

Put the numbers in order from smallest to largest. Use the example set of values: 20, 24, 25, 36, 25, 22, 23. Placed in order, the set becomes: 20, 22, 23, 24, 25, 25, 36.

Since this set of numbers has seven values, the median or value in the center is 24.

If the set of numbers has an even number of values, calculate the average of the two center values. For example, suppose the set of numbers contains the values 22, 23, 25, 26. The middle lies between 23 and 25. Adding 23 and 25 yields 48. Dividing 48 by two gives a median value of 24.
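Both median cases can be checked with `statistics.median` from the Python standard library, which sorts the values and averages the two center values when the count is even.

```python
import statistics

odd_set = [20, 24, 25, 36, 25, 22, 23]  # seven values
even_set = [22, 23, 25, 26]             # four values

# With an odd count the median is the single middle value after sorting.
median_odd = statistics.median(odd_set)

# With an even count it is the average of the two center values.
median_even = statistics.median(even_set)

print(sorted(odd_set), median_odd, median_even)
```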

The mode identifies the most common value or values in the data set. Depending on the data, there might be one or more modes, or no mode at all.
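As a sketch, `statistics.multimode` (available from Python 3.8) returns every value that occurs most often. Note that when no value repeats it returns all of the values rather than reporting "no mode", so that case needs its own check. The second data set is an illustrative assumption.

```python
import statistics

values = [20, 24, 25, 36, 25, 22, 23]

# 25 appears twice and every other value once, so it is the only mode.
modes = statistics.multimode(values)

# A set with two equally common values has two modes.
two_modes = statistics.multimode([1, 1, 2, 2, 3])

print(modes, two_modes)
```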