Fred B. Bryant, Ph.D., Loyola University Chicago

Measurement involves the use of predetermined rules to assign numbers to categorize or quantify objects, events, or characteristics of people. In introductory statistics courses, students often learn that there are four basic types or levels of measurement in the health and psychosocial sciences. However, rarely do students understand why this knowledge matters—in other words, what difference does the type of measurement scale make, and why is it important to know the type of measurement scale you’re working with?

Below I describe the most commonly used framework for distinguishing among measurement scales, and I explain that the type of measurement scale matters because it determines the specific descriptive statistics and inferential statistical tests that are appropriate to use. I provide concrete examples of different types of measurement scales, describe the most commonly used descriptive statistics, and summarize guidelines for selecting the appropriate **descriptive** statistics to use with each type of scale.

In a forthcoming blog post, I will summarize guidelines for selecting the appropriate **inferential** statistics to use with each type of scale.

**Four Main Types of Measurement Scales.** Although researchers have developed a variety of conceptual frameworks over the years to categorize different types of measurement scales, arguably the most influential and well-known typology is that of Stanley S. Stevens, who published his ground-breaking classification scheme in a 1946 *Science* magazine article entitled “On the Theory of Scales of Measurement.” In this article, Stevens argued that all scientific measurements reflect one of four different types of scales that he termed *nominal*, *ordinal*, *interval*, and *ratio*.

According to Stevens (1946), these four types of scales are distinguishable in terms of whether or not they possess each of three different measurement properties: (a) *magnitude* (that is, whether or not the numbers assigned to observations reflect varying amounts of the underlying variable being measured); (b) *equal intervals* (that is, whether or not the numerical differences between any two consecutive numbers on the measurement scale reflect equal differences in the amounts of the underlying variable being measured); and (c) an *absolute-zero point* (that is, whether or not there is an actual value on the measurement scale that truly reflects the complete absence of the underlying variable being measured).

**1. Nominal** (also known as “categorical”) measurement scales reflect the crudest form of qualitative measurement and involve using numbers simply to label, classify, or categorize observations of differing types. (The word “nominal” comes from the Latin noun *nomen*, which means “name.”) With a nominal scale, you cannot interpret the numbers assigned to observations as anything more than the names or labels for the things you are categorizing. Thus, *nominal* measurement scales have neither magnitude, equal intervals, nor an absolute-zero point.

As an illustration of a *nominal* measure of musical preference, we might ask people to indicate their favorite style of music, and then keep track of their responses by coding them as follows: *1* = pop, *2* = rock, *3* = hip-hop, *4* = jazz, *5* = classical, *6* = country, *7* = heavy metal, *8* = gospel, *9* = R & B, *10* = punk, or *11* = other. Notice here that the numbers assigned to the different types of music lack *magnitude*: they do not represent higher or lower amounts of an underlying characteristic, but simply reflect differences in type (or quality) rather than differences in degree (or quantity). Also, the numbers do not have *equal intervals*. For example, the difference between “rock” (2) and “pop” (1) [or, 2 – 1 = 1] is not in any meaningful way equivalent to the difference between, say, “country” (6) and “classical” (5) [or, 6 – 5 = 1]. Nor does this way of measuring musical preference provide an observable value that reflects the complete absence of a preference for any style of music.

**2. Ordinal** (also known as “ordered categorical”) measurement scales also use numbers to classify observations into categories, but the order of these numbers is meaningful in the sense that we can use them to rank observations in terms of the *amount* of the underlying characteristic. However, even though the numbers used in *ordinal* scales reflect relatively *more* or *less* of the underlying variable, they do not indicate *how much* more or *how much* less. Nor do *ordinal* scales include a value on the scale that reflects the total absence of the variable being measured. Thus, *ordinal* measurement scales possess magnitude, but they lack equal intervals and an absolute-zero point.

As an illustration of an *ordinal* measure of musical preference, we might ask people to rank order a list of 10 different styles of music (pop, rock, hip-hop, jazz, classical, country, heavy metal, gospel, R & B, and punk), from their most favorite (1) to least favorite (10) musical style. Notice here that the numbers assigned to the different types of music reflect strength or *magnitude* of preference for each musical style; that is, lower numbers represent greater preference. However, the numbers do **not** have *equal intervals*. For example, the difference between a person’s preferences for their #1 and #2 ranked styles of music may well be *much greater* than the difference between their preferences for their #9 and #10 ranked styles of music. Nor does this *ordinal* way of measuring musical preference provide an observable ranked value that reflects the complete absence of a preference for any style of music whatsoever.

**3. Interval** measurement scales use numbers that reflect the magnitude of the underlying variable the scale assesses. In addition, the numerical differences between any two consecutive numbers on the measurement scale reflect equal differences in the amounts of the underlying variable. However, interval scale measures do **not** provide an observable value that reflects the complete absence of the measured characteristic.

As an illustration of an *interval* measure of musical preference, we might ask people to rate each of the 10 different styles of music, using a 7-point response scale where *1* is labeled “strongly prefer” and *7* is labeled “strongly dislike.” As with an *ordinal* scale, the numbers assigned to the different types of music reflect relative strength or *magnitude* of preference for each musical style. However, unlike an *ordinal* scale, the numbers on an *interval* scale imply equal differences in strength of preference between consecutive numbers across the full range of the 7-point scale. In other words, the numbers are presumed to have *equal intervals*. For example, the difference in amount of preference between ratings of *1* versus *2* is considered equivalent to the difference in amount of preference between ratings of *2* versus *3*. However, this *interval-scale* way of measuring musical preference does **not** provide a rating that reflects the total absence of a preference for any musical style.

**4. Ratio** scales have all the measurement properties of an interval scale, as well as a specific value that indicates the complete absence of the characteristic that the scale assesses. Because it provides an absolute-zero point, a *ratio* scale produces numbers that you can meaningfully compare in terms of the absolute amount of the thing you are measuring. For instance, with a *ratio* scale, a number that is mathematically twice as large as another number reflects twice as much of whatever the scale measures. Thus, a *ratio* scale allows you to determine *how many times greater* one score is than another, whereas an *interval* scale only allows you to determine *how far apart* two scores are from each other.

As an illustration of a *ratio* measure of musical preference, for example, we might ask people to estimate the actual number of times they have listened to each of the 10 different styles of music during the past month. Notice that this way of measuring musical preference provides numbers that have not only magnitude and equal intervals, but also an absolute-zero point.

**Table 1** summarizes the four types of measurement scales and the specific properties they possess.

#### Table 1. Four Types of Measurement Scales

Measurement Properties | Nominal | Ordinal | Interval | Ratio |
---|---|---|---|---|
Magnitude | NO | YES | YES | YES |
Equal Intervals | NO | NO | YES | YES |
Absolute-zero point | NO | NO | NO | YES |
Examples | a. Gender; b. Ethnicity; c. Country in which you were born; d. Religious preference; e. Political affiliation; f. Eye color; g. Social security number; h. Zip code; i. Numbers on football players’ jerseys; j. List of responses to the question, “What is your favorite food?”; k. List of college majors; l. Type of car one owns | a. Class rank in high school; b. Military rank; c. Ranking of world’s top ten tennis players; d. Measure of how often you get the hiccups using the following response scale: 1 = rarely, 2 = seldom, 3 = occasionally, 4 = sometimes, 5 = often; e. Measure of annual income using the following scale: 1 = less than $25,000, 2 = $25,000 - $100,000, 3 = more than $100,000 | a. Time measured by an analog or digital clock; b. Page numbers in a book; c. Shoe size; d. Temperature measured using the Celsius or Fahrenheit scale; e. How often you get the hiccups, using a 1-to-7 scale (where 1 is labeled “rarely,” 7 is labeled “often,” and numbers 2-6 are not labeled); f. Floor numbers in an office building (but for high-rise buildings that skip the 13th floor, this scale would be ordinal); g. Score on the SAT or ACT | a. Elapsed time measured by a stopwatch; b. A weighing scale; c. Height measured using a yardstick; d. Amount of liquid in a drinking glass measured in ounces; e. The balance in your checking account; f. Number of hats that you own; g. Resting pulse rate in beats per minute; h. Number of points a basketball team scored in their last game; i. Number of pets you have owned in your life |

**Two Commonly Used Types of Descriptive Statistics.** Having used some type of measurement scale to collect data, researchers can use statistics to describe and summarize a variety of different characteristics of their data. Perhaps the most frequently reported characteristics of data in the health and psychosocial sciences are *central tendency* and *variability*.

**Central Tendency** refers to the most common or typical score that lies in the middle of a distribution of scores. The three most often used measures of central tendency are the *mean*, *median*, and *mode*.

**1.** The **mean** (aka the “arithmetic average”), as many people know, is computed by adding together all of the observed scores in a distribution and then dividing this sum by the total number of scores in the distribution. However, this *computational* formula is **not** the conceptual definition of the mean. The *definitional* formula for the mean is that the mean is the one value for which Σ(x – mean) = 0. That is, if you subtract the mean from each score in the distribution and then add together all of these deviation-scores, the sum will be zero. In other words, the mean of a distribution is the one value for which (a) the sum of its distances from each of the scores *above* it, minus (b) the sum of its distances from each of the scores *below* it, exactly equals zero. Thus, the mean is a “balancing point” for the scores in a distribution, such that it precisely balances the sum of its distances from each of the scores above it and the sum of its distances from each of the scores below it.
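To make the “balancing point” idea concrete, here is a minimal Python sketch. The five ratings are hypothetical values I made up for illustration (imagine them coming from a 7-point interval scale); the only library call is `statistics.mean` from the Python standard library.

```python
from statistics import mean

# Hypothetical interval-scale ratings, invented for this illustration.
scores = [2, 3, 3, 5, 7]

# Computational formula: sum of the scores divided by the number of scores.
m = mean(scores)

# Definitional property: the deviations from the mean sum to zero.
deviations = [x - m for x in scores]
total_deviation = sum(deviations)

# Distances of the scores above the mean vs. the scores below the mean.
sum_above = sum(d for d in deviations if d > 0)
sum_below = -sum(d for d in deviations if d < 0)

print(m, total_deviation, sum_above, sum_below)
```

With these particular numbers, the distances above the mean and the distances below the mean add up to the same total, which is exactly what “balancing” means here.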

Given this conceptual definition, notice that the measurement scale used to assess an underlying variable must have both *magnitude* (so that scores represent higher or lower amounts of the variable), as well as *equal intervals* (so that numerical differences between any two consecutive numbers on the scale reflect equal differences in amount) in order to use the *mean*. Otherwise, the mean of a set of scores will not be meaningfully interpretable as a “balancing point.” For this reason, it is appropriate to use the mean to describe central tendency only for *interval* and *ratio* measurement scales (which have magnitude and equal intervals), but it is **not** appropriate to use the mean for *nominal* or *ordinal* measurement scales (which lack equal intervals). [Notice the mean does **not** require scales to have an absolute-zero point.]

**2.** The **median** is defined as the one value that splits a distribution of scores exactly in half, such that 50% of the cases lie *above* it and 50% of cases lie *below* it, *when all of the scores are first arranged in order of magnitude* (i.e., from lowest to highest, or from highest to lowest). Given this conceptual definition, notice that the measurement scale used to assess an underlying variable must have magnitude (so that scores represent higher or lower amounts of the variable) in order to use the median, but that the median does **not** require the measurement scale to have equal intervals. Whereas the *mean* balances (a) the *sum of how far* it is from the scores above it and (b) the *sum of how far* it is from the scores below it, the *median* balances (a) the *total number of scores* that lie above it and (b) the *total number of scores* that lie below it. For this reason, it is appropriate to use the median for all measurement scales that possess magnitude (that is, for *ordinal*, *interval*, and *ratio* scales), but it is **not** appropriate to use the median for *nominal* scales (which lack magnitude).
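One way to see why the median needs only magnitude, whereas the mean also needs equal intervals, is to recode an ordinal scale with different numbers that keep the same order. The sketch below is only an illustration under assumptions of my own: the responses and the `recode` mapping are hypothetical, and the response labels follow the 1 = rarely to 5 = often hiccups scale from Table 1.

```python
from statistics import mean, median

# Hypothetical ordinal responses (1 = rarely ... 5 = often).
responses = [1, 2, 2, 4, 5]

# An order-preserving recoding with unequal spacing (an assumption for
# illustration; any numbers in the same rank order are equally valid codes).
recode = {1: 10, 2: 20, 3: 25, 4: 90, 5: 100}
recoded = [recode[r] for r in responses]

# The median picks out the same category under either coding.
median_original = median(responses)
median_recoded = median(recoded)
same_category = median_recoded == recode[median_original]

# The mean lands in a different place on the scale after recoding.
mean_original = mean(responses)  # falls between categories 2 and 3
mean_recoded = mean(recoded)     # falls between the codes for categories 3 and 4

print(median_original, median_recoded, same_category)
print(mean_original, mean_recoded)
```

Because the median depends only on how many scores lie above and below it, relabeling the categories leaves it pointing at the same response (“seldom” in this example). The mean depends on the distances between the codes, so its location on the scale shifts when those distances change.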

**3.** The **mode** is defined as the most frequently occurring score in a distribution (although it is possible for more than one score to be most frequent, in which case the distribution of scores is known as “multimodal”). Given this conceptual definition of the mode, notice that the measurement scale used to assess an underlying variable is **not** required to have either magnitude or equal intervals in order to use the mode. Thus, it is appropriate to use the mode for all four types of measurement scales, i.e., nominal, ordinal, interval, and ratio.

**Variability** refers to the degree to which the scores in a distribution deviate from one another, or how widely or narrowly spread out (or dispersed) scores are in value. The three most frequently used measures of variability are the *range*, *variance*, and *standard deviation*.

**1.** The **range** is the difference between the highest and lowest score in a distribution. Given this conceptual definition, notice that the measurement scale used to assess an underlying variable must have magnitude (so that scores represent higher or lower amounts of the variable) in order to use the range, but that the range does **not** require the measurement scale to have equal intervals. For this reason, it is appropriate to use the range for all measurement scales that possess magnitude (that is, for *ordinal*, *interval*, and *ratio* scales), but it is **not** appropriate to use the range for *nominal* scales (which lack magnitude).

**2.** The **variance** is an estimate of the typical distance that scores in a distribution are from their mean, when expressing these differences in *squared* units of measurement. Given that the mean is an essential ingredient in computing the variance, the measurement scale used to assess an underlying variable must have both *magnitude* (so that scores represent higher or lower amounts of the variable), as well as *equal intervals* (so that numerical differences between any two consecutive numbers on the scale reflect equal differences in amount) in order to use the *variance*. Otherwise, the variance of a set of scores will not be meaningfully interpretable as the size of their “typical squared deviation from the mean.” For this reason, it is appropriate to use the variance to describe variability only for *interval* and *ratio* measurement scales (which have magnitude and equal intervals), but it is **not** appropriate to use the variance for *nominal* or *ordinal* measurement scales (which lack equal intervals). [Notice that the variance does **not** require scales to have an absolute-zero point.]

**3.** The **standard deviation** is simply the square root of the variance (and conversely, the variance is simply the standard deviation squared). In other words, the standard deviation is an estimate of the typical distance that scores in a distribution are from their mean, when expressing these differences in the *original* units of measurement. As with its squared counterpart (i.e., the variance), the standard deviation is based on the value of the mean. Thus, it is appropriate to use the standard deviation to describe variability only for *interval* and *ratio* measurement scales (which have magnitude and equal intervals), but it is **not** appropriate to use the standard deviation for *nominal* or *ordinal* measurement scales (which lack equal intervals).
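The three measures of variability can be sketched in a few lines of Python. The scores below are hypothetical ratio-scale counts that I chose so the arithmetic comes out evenly. Note one assumption: I use the population formulas (`statistics.pvariance` and `statistics.pstdev`, which divide by the number of scores); the standard library's `statistics.variance` and `statistics.stdev` give the sample versions (dividing by the number of scores minus one) if that is what your analysis calls for.

```python
import math
from statistics import pstdev, pvariance

# Hypothetical ratio-scale counts, invented for this illustration.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# Range: highest score minus lowest score.
score_range = max(scores) - min(scores)

# Variance: typical squared deviation from the mean (population formula).
variance = pvariance(scores)

# Standard deviation: square root of the variance, in the original units.
std_dev = pstdev(scores)

# The two are linked exactly as described above.
sd_squared_matches = math.isclose(std_dev ** 2, variance)

print(score_range, variance, std_dev, sd_squared_matches)
```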

The bottom line here is that **different types of** measurement scales require different descriptive statistics.

When analyzing *interval* or *ratio* scales, it is appropriate to use all three measures of central tendency (mean, median, and mode), as well as all three measures of variability (range, variance, and standard deviation).

When analyzing *ordinal* scales, it is appropriate to use the *median* and the *mode* as measures of central tendency (but **not** the mean), and the *range* as a measure of variability (but **not** the variance or the standard deviation).

When analyzing *nominal* scales, it is appropriate to use the *mode* as a measure of central tendency (but **not** the mean or the median), and it is *inappropriate* to use any of the three measures of variability (the range, variance, and standard deviation).

An appropriate measure of variability for use with *nominal* scales is the total number of nominal categories for which there is at least one observed response. For example, imagine we assess people’s favorite style of music using a *nominal* measure (where *1* = pop, *2* = rock, *3* = hip-hop, *4* = jazz, *5* = classical, *6* = country, *7* = heavy metal, *8* = gospel, *9* = R & B, *10* = punk) in two different samples of respondents: a group of 100 young adults (age 18-30), and a group of 100 senior citizens (age 70-90). We might hypothesize that the younger sample would show *greater variability* in musical preference, compared to the older sample. Computing the total number of categories for which there is at least one observed response in each group, we find that all 10 types of music received at least one endorsement in the younger sample, whereas only 4 types of music received at least one endorsement in the older sample (i.e., jazz, classical, country, and gospel).
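Here is a minimal Python sketch of how the mode and this category count might be computed for nominal data. The two lists are short, made-up stand-ins for the samples described above (not 100 respondents each), and `describe_nominal` is a hypothetical helper name of my own; the counting itself relies only on `collections.Counter` from the standard library.

```python
from collections import Counter

# Made-up responses using the codes above (1 = pop ... 10 = punk).
younger = [1, 2, 3, 3, 7, 10, 9, 3, 5, 4, 6, 8]
older = [4, 5, 5, 6, 8, 5, 4]

def describe_nominal(codes):
    """Return (list of modal codes, number of categories with >= 1 response)."""
    counts = Counter(codes)
    highest = max(counts.values())
    # More than one code can tie for most frequent (a multimodal distribution).
    modes = sorted(code for code, n in counts.items() if n == highest)
    return modes, len(counts)

younger_modes, younger_categories = describe_nominal(younger)
older_modes, older_categories = describe_nominal(older)

print(younger_modes, younger_categories)
print(older_modes, older_categories)
```

Notice that nothing in this computation treats the codes as amounts: the codes are only compared for equality, which is all a nominal scale permits.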

**Table 2** provides examples of the four types of measurement scales and summarizes the specific measures of central tendency and variability that are appropriate to use with each type of scale.

#### Table 2. Examples of Measurement Scales and the Proper Descriptive Statistics to Use with Them

Conceptual Variable | Example of Measurement Scale | Type of Scale | Appropriate Descriptive Statistic(s): Central Tendency | Appropriate Descriptive Statistic(s): Variability |
---|---|---|---|---|
Political Preference | Which of the five leading presidential candidates would you vote for in the next presidential election? | Nominal | Mode | Total number of categories with at least one response |
Political Preference | Rank order the five leading presidential candidates in terms of your likelihood of voting for these candidates in the next presidential election. | Ordinal | Median, mode | Range |
Political Preference | Rate how much you like each of the five leading presidential candidates, using a scale from 1 (dislike) to 10 (like). | Interval | Mean, median, mode | Range, variance, standard deviation |
Political Preference | Rate each of the five leading presidential candidates in terms of the chance (from 0 to 100%) that you will vote for them in the next presidential election. | Ratio | Mean, median, mode | Range, variance, standard deviation |
Frequency of temper tantrums in young children | How does your child typically react to not getting his or her way? (1 = accepts it; 2 = keeps asking me; 3 = has a temper tantrum; 4 = tries to bargain with me; 5 = other). | Nominal | Mode | Total number of categories with at least one response |
Frequency of temper tantrums in young children | Rank order the frequency with which your child uses each of the following responses when the child does not get his or her way: accepts it; keeps asking me; has a temper tantrum; tries to bargain with me; other. | Ordinal | Median, mode | Range |
Frequency of temper tantrums in young children | Rate how often your child has a temper tantrum when the child does not get his or her way, using a scale from 1 (rarely) to 7 (often). | Interval | Mean, median, mode | Range, variance, standard deviation |
Frequency of temper tantrums in young children | How many temper tantrums has your child had in the past week? | Ratio | Mean, median, mode | Range, variance, standard deviation |
Content of one’s earliest memory | Describe the content of your earliest childhood memory. (The researcher then develops a qualitative coding scheme for use in categorizing the general themes that respondents mention as their earliest memory.) | Nominal | Mode | Total number of categories with at least one response |
Content of one’s earliest memory | Rank order a provided list of various general themes, in terms of how early each theme exists in your memory. | Ordinal | Median, mode | Range |
Content of one’s earliest memory | For each theme provided on a list, rate how early in your life you have a memory that involves the particular theme, using a scale from 1 (very early) to 7 (not very early). | Interval | Mean, median, mode | Range, variance, standard deviation |
Content of one’s earliest memory | For each theme provided on a list, indicate the earliest age (in years) at which you have a memory that involves the particular theme. | Ratio | Mean, median, mode | Range, variance, standard deviation |
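For readers who like to keep such guidelines close at hand while analyzing data, the recommendations summarized in Table 2 could be encoded as a simple lookup. This is only a sketch: the names `APPROPRIATE_STATS` and `appropriate_statistics` are hypothetical ones I made up, and the entries simply restate the table.

```python
# Guidelines from Table 2, restated as a lookup (names are illustrative).
APPROPRIATE_STATS = {
    "nominal": {
        "central_tendency": ["mode"],
        "variability": ["number of categories with at least one response"],
    },
    "ordinal": {
        "central_tendency": ["median", "mode"],
        "variability": ["range"],
    },
    "interval": {
        "central_tendency": ["mean", "median", "mode"],
        "variability": ["range", "variance", "standard deviation"],
    },
    "ratio": {
        "central_tendency": ["mean", "median", "mode"],
        "variability": ["range", "variance", "standard deviation"],
    },
}

def appropriate_statistics(scale_type):
    """Look up the descriptive statistics suggested for a scale type."""
    key = scale_type.strip().lower()
    if key not in APPROPRIATE_STATS:
        raise ValueError(f"Unknown scale type: {scale_type!r}")
    return APPROPRIATE_STATS[key]

print(appropriate_statistics("Ordinal"))
```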