2.3 Measures of the Location of the Data
Percentiles and Quartiles
The common measures of location are quartiles and percentiles. These are markers that divide a given ordered dataset into quarters and hundredths, respectively. Quartiles and percentiles may or may not be part of the data. If the marker falls between two observations in the data set, we take the average of the two adjacent observations.
A value is at the [latex]k^{\text{th}}[/latex] percentile if [latex]k\%[/latex] of the data observations are less than or equal to that value. Scoring in the 90th percentile on an exam does not necessarily mean that you received 90% on the test. It means that 90% of test scores are the same as or less than your score and 10% of the test scores are the same as or greater than your score.
Quartiles are special percentiles. The first quartile, [latex]Q_1[/latex], is the same as the [latex]25^{\text{th}}[/latex] percentile, and the third quartile, [latex]Q_3[/latex], is the same as the [latex]75^{\text{th}}[/latex] percentile. The median, M, is called both the second quartile and the [latex]50^{\text{th}}[/latex] percentile.
Video: Quartiles and IQR
When interpreting the median, we tend to say, “Half of the values are below [the median] and half of the values are above [the median].”
Percentiles
There are several methods for finding percentiles/quartiles and so it’s important to recognize the particular method expected in your assignments/class. For this course, we’ll be using the following approach:
\[i=\frac{k}{100}(n+1)\]
where [latex]k[/latex] is the percentile, [latex]n[/latex] is the number of data values, and [latex]i[/latex] is the rank or position of the data value in the data set ordered from smallest to largest.
- If [latex]i[/latex] is an integer, then the [latex]k^{\text{th}}[/latex] percentile is the data value in the [latex]i^{\text{th}}[/latex] position in the ordered set of data.
- If [latex]i[/latex] is NOT an integer, then the [latex]k^{\text{th}}[/latex] percentile is the average of the data values in the two positions adjacent to [latex]i[/latex] in the ordered set of data, that is, the positions obtained by rounding [latex]i[/latex] down and up.
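As a quick illustration of the integer case, assume a hypothetical ordered data set of [latex]n=19[/latex] values (not one of the data sets used in this section) and [latex]k=25[/latex]: \[i=\frac{25}{100}(19+1)=0.25(20)=5\]Since [latex]i=5[/latex] is an integer, the [latex]25^{\text{th}}[/latex] percentile would simply be the data value in the [latex]5^{\text{th}}[/latex] position of the ordered data.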
Video: Quartiles and Percentiles
Example: Find the [latex]85^{\text{th}}[/latex] percentile for the following data. \[22,35,15,26,40,28,18,20,25,34,39,42,24,22,19,27,22,34,40,20,38,28\]
First, sort the data in ascending order: [latex]15, 18, 19, 20, 20, 22, 22, 22, 24, 25, 26, 27, 28, 28, 34, 34, 35, 38, 39, 40, 40, 42[/latex].
Then calculate the location, [latex]i[/latex]: \[i=\frac{85}{100}(22 +1) = 0.85(23)=19.55\]Since this location, [latex]19.55[/latex], is not a whole number, the percentile value will be the average of the data values in the [latex]19^{\text{th}}[/latex] and [latex]20^{\text{th}}[/latex] positions. Note that [latex]i[/latex] is not the percentile value we’re after; it is only the location of the percentile, somewhere between positions [latex]19[/latex] and [latex]20[/latex].
We’ll take the average of the data values in the [latex]19^{\text{th}}[/latex] and [latex]20^{\text{th}}[/latex] positions to find our [latex]85^{\text{th}}[/latex] percentile. The value in the [latex]19^{\text{th}}[/latex] position is [latex]39[/latex] and the value in the [latex]20^{\text{th}}[/latex] position is [latex]40[/latex]. Finally, the [latex]85^{\text{th}}[/latex] percentile is the average of these two: [latex]\dfrac{39+40}{2}=39.5[/latex].
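The location rule and the worked example above can be sketched in Python as follows. This is a minimal sketch, not part of the course tools: the function name `percentile_value` is hypothetical, it assumes [latex]k[/latex] is a whole number, and it assumes (the text does not say) that a location falling outside positions 1 to [latex]n[/latex] is clamped to the smallest or largest value.

```python
def percentile_value(data, k):
    """Return the k-th percentile using the location rule i = (k/100)(n+1).

    k is assumed to be a whole number between 0 and 100.
    """
    ordered = sorted(data)
    n = len(ordered)
    # Integer arithmetic: i = whole + remainder/100. This avoids
    # floating-point surprises when checking whether i is an integer.
    whole, remainder = divmod(k * (n + 1), 100)
    # Assumption (not stated in the text): locations outside 1..n are
    # clamped to the minimum or maximum of the data.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    if remainder == 0:
        # i is an integer: take the value in position i (1-based).
        return ordered[whole - 1]
    # i is not an integer: average the values in the two adjacent positions.
    return (ordered[whole - 1] + ordered[whole]) / 2


data = [22, 35, 15, 26, 40, 28, 18, 20, 25, 34, 39,
        42, 24, 22, 19, 27, 22, 34, 40, 20, 38, 28]
print(percentile_value(data, 85))  # 39.5
```

Here [latex]85 \times 23 = 1955[/latex], so the location is [latex]19.55[/latex] and the function averages the values in positions 19 and 20, matching the hand calculation.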
Video Example: Find Value Corresponding to a Given Percentile
Practice
Percentile of a Value in a Data Set
Using the same data set as above, what is the percentile of the data value [latex]34[/latex]? In this case, we’re given the data value and need to find the corresponding percentile. We’ll use the following formula to determine the percentile for a given data value: \[\frac{x+0.5y}{n}\times 100\] where [latex]x[/latex] is the number of data values below the value whose percentile we’re trying to find in the ordered data set (smallest to largest), [latex]y[/latex] is the frequency of the data value whose percentile we want to find, and [latex]n[/latex] is the size of the data set (the total number of data values).
In the data set above: \[\underbrace{15, 18, 19, 20, 20, 22, 22, 22, 24, 25, 26, 27, 28, 28}_{x\:=\:\textrm{number of values below}\:=\:14}, \overbrace{34, 34}^{y\:=\:2}, 35, 38, 39, 40, 40, 42\]
Using the above formula, we get: \[\frac{14+0.5(2)}{22}\times 100\approx 68.18\]Rounding this to the nearest integer, we get [latex]68[/latex]. Therefore, [latex]34[/latex] is at the [latex]68^{\text{th}}[/latex] percentile.
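The same calculation can be checked with a short Python sketch. The function name `percentile_rank` is a hypothetical choice for illustration; the code simply counts [latex]x[/latex] and [latex]y[/latex] and applies the formula above, so the data does not need to be sorted first.

```python
def percentile_rank(data, value):
    """Percentile of `value` in `data`, computed as (x + 0.5*y) / n * 100."""
    n = len(data)
    x = sum(1 for d in data if d < value)   # count of values below `value`
    y = sum(1 for d in data if d == value)  # frequency of `value` itself
    return (x + 0.5 * y) / n * 100


data = [22, 35, 15, 26, 40, 28, 18, 20, 25, 34, 39,
        42, 24, 22, 19, 27, 22, 34, 40, 20, 38, 28]
rank = percentile_rank(data, 34)
print(round(rank))  # 68
```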
Calculating Five Number Summary of Data
The five-number summary consists of the following values:
- minimum
- first (lower) quartile, [latex]Q_1[/latex]
- median
- third (upper) quartile, [latex]Q_3[/latex]
- maximum
Once you have an ordered data set, two parts of the five-number summary, the minimum and the maximum, are easily identifiable. For the rest, although we can compute these values by hand, we can use technology to speed up our work. Note that since there are multiple methods to compute these values, we need to make sure that we use a method that is compatible with our textbook.
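For comparison with by-hand work, here is a minimal Python sketch of a five-number summary that takes [latex]Q_1[/latex], the median, and [latex]Q_3[/latex] as the 25th, 50th, and 75th percentiles under this section's location rule [latex]i=\frac{k}{100}(n+1)[/latex]. The function names are hypothetical, and the clamping of out-of-range locations is an assumption. Because quartile conventions differ, a calculator or library may report slightly different quartiles for the same data.

```python
def percentile_value(ordered, k):
    """k-th percentile of an already sorted list, using i = (k/100)(n+1)."""
    n = len(ordered)
    whole, remainder = divmod(k * (n + 1), 100)  # i = whole + remainder/100
    # Assumption: locations outside positions 1..n are clamped to the ends.
    if whole < 1:
        return ordered[0]
    if whole >= n:
        return ordered[-1]
    if remainder == 0:
        return ordered[whole - 1]
    return (ordered[whole - 1] + ordered[whole]) / 2


def five_number_summary(data):
    """Minimum, Q1, median, Q3, and maximum under the rule above."""
    ordered = sorted(data)
    return {
        "minimum": ordered[0],
        "Q1": percentile_value(ordered, 25),
        "median": percentile_value(ordered, 50),
        "Q3": percentile_value(ordered, 75),
        "maximum": ordered[-1],
    }


data = [22, 35, 15, 26, 40, 28, 18, 20, 25, 34, 39,
        42, 24, 22, 19, 27, 22, 34, 40, 20, 38, 28]
summary = five_number_summary(data)
print(summary)
# {'minimum': 15, 'Q1': 21.0, 'median': 26.5, 'Q3': 36.5, 'maximum': 42}
```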
FIVE NUMBER SUMMARY USING TECHNOLOGY (You do NOT need to sort data for any of the following)
SUBEDI Calculator
Go to One Variable Statistics @ rsubedi.com
Data Input
Data Type
Enter Data
Enter your data in the spreadsheet column shown for data entry. You can copy your data from the original source (comma-separated list, spreadsheet data, etc.) and paste it into the table on the calculator.
CALCULATE (Results show in a panel to the right. The five-number summary will be at the bottom of the results panel.)
DESMOS
Enter data as a list [latex]L=[22,35,15,26,40,28,18,20,25,34,39,42,24,22,19,27,22,34,40,20,38,28][/latex].
In the next input box, type: stats(L)
Results will show the five number summary.
STATKEY
Go to Descriptive Statistics for One Quantitative Variable from the main StatKey page.
Click on Edit Data to enter your data.
Results will show on the right under Summary Statistics.
Mean, [latex]\bar x[/latex]
Standard Deviation, [latex]s[/latex]
FIVE NUMBER SUMMARY
Practice