Continued article from The Behavioral Measurements Letter, Vol. 6, No. 2, Fall 1999
What Does It Measure?: Using Measurement Modeling to Clarify the Construct(s) Underlying the Affect Intensity Measure (AIM)
Fred B. Bryant
In previous columns (Bryant, 1997, 1998, 1999), I described a versatile, new data-analytic approach to construct validation known as “measurement modeling.” With this approach, one systematically compares alternative ways of conceptualizing the construct or constructs that a particular instrument taps using powerful, state-of-the-art, multivariate statistical tools, in order to clarify what the instrument actually measures. Measurement modeling provides researchers with a host of invaluable benefits. But perhaps its most important psychometric contribution is its ability to improve construct validity by distinguishing instruments that measure a single, unitary construct from instruments that tap multidimensional constructs, and by further decomposing the latter into their constituent parts. This increased conceptual precision not only helps researchers choose appropriate instruments for a given purpose, but also reveals how to score responses to these instruments so as to assess the underlying construct(s) with maximum reliability.
To review the basics as covered in these previous columns (Bryant, 1997, 1998, 1999), measurement modeling, also known as confirmatory factor analysis, is a specific form of structural equation modeling that examines the “structure” of people’s responses to a set of questions. “Structure” refers to the relationships among the responses to the individual questions, and to the underlying construct or constructs (called “factors”) that these interrelationships define. A cluster of questions that produce similar responses (i.e., that intercorrelate) is considered to reflect or define a single common factor. A set of measures may have any number (i.e., zero or more) of such clusters of interrelated questions, i.e., underlying factors. Questions that strongly define a particular factor are said to “load” highly on that factor or to have a strong “factor loading.” In other words, each question’s loading on a particular factor indicates how strongly responses to that question define the underlying construct that the given factor taps.
To impose a formal “measurement model” on a set of questions, one specifies: (a) the number of constructs or factors underlying responses to the set of questions; (b) the specific questions that reflect each of these factors; (c) whether or not multiple factors, if they exist, are intercorrelated; and (d) whether the unique variance in responses to each question (i.e., the variance that is unrelated to the underlying factor) is independent or intercorrelated across the questions. By contrasting how well alternative measurement models explain responses to a set of questions (e.g., one-factor versus two-factor versus three-factor models), researchers can determine whether a particular instrument measures more than one construct, and if so, what these multiple constructs are and how they relate to each other. This work not only improves how we use measurement instruments, but also refines our conceptual understanding of what these instruments actually assess. (For further details about measurement modeling, see Kline, 1998.)
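The four specification choices (a) through (d) can be pictured as a small data structure. The following is only an illustrative sketch, not the software used in the studies described here; the class name, field names, and the six-question example are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MeasurementModel:
    """Hypothetical container for specification choices (a)-(d)."""

    # (a) and (b): factor name -> question numbers that reflect that factor.
    # The number of factors is simply the number of keys.
    loadings: dict
    # (c): are multiple factors allowed to intercorrelate?
    factors_correlated: bool = True
    # (d): may the unique variances correlate across questions?
    unique_variances_correlated: bool = False

    @property
    def n_factors(self) -> int:
        return len(self.loadings)

    def questions(self) -> list:
        """All question numbers covered by the model, in sorted order."""
        return sorted(q for items in self.loadings.values() for q in items)


# Two rival models imposed on the same six hypothetical questions.
one_factor = MeasurementModel(loadings={"general": [1, 2, 3, 4, 5, 6]})
two_factor = MeasurementModel(
    loadings={"factor_a": [1, 2, 3], "factor_b": [4, 5, 6]}
)
```

Writing rival models down in the same form makes explicit that they cover the same questions and differ only in how those questions are grouped, which is what allows their fit to be compared.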
The following summary of research on the Affect Intensity Measure (AIM; Larsen, 1984; Larsen & Diener, 1987) illustrates concretely how measurement modeling can be used to achieve these important benefits. Larsen (1984) developed the AIM to assess the personality trait of affect intensity, or the characteristic strength with which people experience emotions. Analogous to a kind of “emotional thermostat,” affect intensity reflects one’s emotional temperament. Larsen (1984) originally conceptualized affect intensity as a unidimensional construct – that is, people are assumed to have a trait-like tendency to feel a particular level of emotion, regardless of whether this emotion is positive or negative. The underlying theoretical model is based on the assumption that high intensity individuals actually experience lower levels of emotional arousal than do low intensity individuals, but that they express higher levels of emotion to try to achieve an optimal level of internal arousal (Larsen & Diener, 1987).
The AIM consists of 40 statements that are intended to reflect one’s characteristic level of emotion, both in general and in response to specific situations. These items cover a wide range of both positive emotions (calmness, contentment, delight, ecstasy, elation, enthusiasm, euphoria, excitement, exuberance, joy, jubilation, peacefulness, relaxation, zest) and negative emotions (anger, anxiety, guilt, nervousness, sadness, shame, tension). Respondents are instructed to indicate on a five-point scale (1 = never; 2 = almost never; 3 = occasionally; 4 = usually; 5 = almost always) how characteristic each statement is of them.
Virtually all research using the AIM has followed Larsen’s (1984) original theoretical model, which regards affect intensity as a unidimensional construct. Accordingly, researchers have typically summed responses to the 40 AIM items to obtain a global total score. A great deal of empirical evidence suggests that individual differences in total AIM score are temporally and situationally stable and are related to personality in a conceptually meaningful way. For example, total AIM score has been found to correlate with the extremity of daily moods and the frequency of emotional swings, parental reports of early childhood behaviors indicative of temperamental intensity, the strength of physiological and expressive changes associated with emotion, scores on psychosomatic symptom checklists, measures of risk for cyclothymia and bipolar affective disorder, and many important personality characteristics, social behaviors, and emotional responses. Moreover, researchers have found that 13-14% of the variance in total AIM score is linked to genetic factors.
However, there is also evidence that the AIM is multidimensional. Discussing Larsen’s (1984) original work, for example, Diener, Sandvik, and Larsen (1985) stated that the AIM assesses at least five underlying factors: positive affect intensity, negative affect intensity, preference for arousal, general emotional intensity, and visceral reactivity to emotional events. Similarly, Williams (1989) reported an exploratory factor analysis of the AIM that revealed four underlying factors, two affectively-positive (which correlated with extraversion) and two affectively-negative (which correlated with neuroticism). Whether the AIM is unidimensional or multidimensional is an important issue, because a total score that collapses across multiple constructs could distort hypothesized relationships between different aspects of affect intensity and other constructs.
About six years ago, my colleagues and I embarked on a program of research using measurement modeling to investigate more carefully whether the AIM measures a single, unitary construct or multiple sub-constructs. This work resulted in two articles (Bryant, Yarnold, & Grimm, 1996; Weinfurt, Bryant, & Yarnold, 1994) that illustrate the use of measurement modeling to: (a) determine what an instrument actually measures, (b) clarify how best to score responses to an instrument, and (c) refine our conceptual understanding of the constructs involved.
We began our first study (Weinfurt et al., 1994) by administering the AIM to 673 undergraduates and then using measurement modeling to impose both Larsen’s original one-factor model and Williams’s four-factor model on these data. Contrary to the notion that the AIM measures a single, unitary construct, the four-factor model (which explained 80% of the variance in responses to the AIM) fit the data significantly better (p < 0.0001) than did the one-factor model (which explained only 62% of the variance in responses). Although these results strongly suggest that affect intensity is multidimensional, the four-factor model fails to explain 90% or more of the variance in responses to the AIM, the standard by which a measurement model is considered adequate (Bentler & Bonett, 1980). Accordingly, we continued to search for a better-fitting measurement model for the AIM.
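The 90% standard attributed to Bentler and Bonett (1980) matches their normed fit index, which compares a model’s chi-square with that of a “null” model in which no questions are related. Assuming the percentages reported here are values of that index (the column does not name the index), the arithmetic can be sketched as follows. The function name is hypothetical, and the chi-square values are made up solely so that the results mirror the reported 62% and 80%; they are not the studies’ actual statistics.

```python
def normed_fit_index(chi2_null: float, chi2_model: float) -> float:
    """Proportional reduction in chi-square relative to the null model."""
    if chi2_null <= 0:
        raise ValueError("chi2_null must be positive")
    return (chi2_null - chi2_model) / chi2_null


ADEQUATE_FIT = 0.90  # conventional cutoff from Bentler & Bonett (1980)

# Made-up chi-square values, for illustration only.
chi2_null = 10000.0
chi2_one_factor = 3800.0
chi2_four_factor = 2000.0

nfi_one = normed_fit_index(chi2_null, chi2_one_factor)
nfi_four = normed_fit_index(chi2_null, chi2_four_factor)

for label, nfi in [("one-factor", nfi_one), ("four-factor", nfi_four)]:
    verdict = "adequate" if nfi >= ADEQUATE_FIT else "not adequate"
    print(f"{label}: NFI = {nfi:.2f} ({verdict})")
```

Under this reading, a model can fit significantly better than a rival and still fall short of the cutoff, which is the situation described for the four-factor model.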
Scrutinizing the multiple dimensions of affect intensity more carefully, we realized that they incorporated two critical distinctions: positive versus negative valence, and intensity (strength) versus reactivity (responsiveness). By explicitly crossing these two distinctions, a four-factor model emerges that consists of self-evaluations of one’s predisposition to experience: (a) positive intensity, (b) negative intensity, (c) positive reactivity, and (d) negative reactivity. We termed this structure the AIR model, for Affect Intensity and Reactivity. To develop a measurement model that explicitly embodied these dimensions, we (Bryant et al., 1996, Study 1) began by sorting the 40 AIM items into a subset of 27 items that could be judged a priori as indicative of either the characteristic intensity or reactivity of either positive or negative emotion. We categorized seven AIM items as reflecting positive intensity (e.g., item 2: “When I feel happy it is a strong type of exuberance”); six as reflecting negative intensity (e.g., item 30: “When I do feel anxiety it is normally very strong”); eight as reflecting positive reactivity (e.g., item 23: “When I receive an award I become overjoyed”); and six as reflecting negative reactivity (e.g., item 11: “Sad movies deeply touch me”). We discarded 13 AIM items because we could not unequivocally classify them as reflecting one of these four dimensions of affective experience (e.g., item 3: “I enjoy being with other people”).
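The crossing of the two distinctions, and the item bookkeeping that goes with it, can be checked in a few lines. This is a minimal sketch; the variable names are hypothetical, and the counts are the ones reported in the preceding paragraph.

```python
from itertools import product

VALENCES = ("positive", "negative")
ASPECTS = ("intensity", "reactivity")

# Number of AIM items sorted into each cell, as reported in the text.
ITEM_COUNTS = {
    ("positive", "intensity"): 7,
    ("negative", "intensity"): 6,
    ("positive", "reactivity"): 8,
    ("negative", "reactivity"): 6,
}

TOTAL_AIM_ITEMS = 40

# Crossing valence with aspect yields the four AIR factors (a)-(d).
air_factors = [f"{v} {a}" for a, v in product(ASPECTS, VALENCES)]

classified = sum(ITEM_COUNTS[cell] for cell in product(VALENCES, ASPECTS))
discarded = TOTAL_AIM_ITEMS - classified

print(air_factors)
print(f"classified: {classified}, discarded: {discarded}")
```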
Which measurement model better explains responses to these 27 AIM items: Larsen’s original one-factor model that assumes affect intensity is a unitary construct, or a four-factor AIR model that assumes people report separate experiences of positive and negative intensity and reactivity? To answer this key question, we administered the 40-item AIM to an independent sample of 631 undergraduates. We then used measurement modeling to compare how well the one-factor “total score” model and the four-factor AIR model explained responses to the 27 items for both this new sample and Weinfurt et al.’s (1994) earlier sample.
For both samples, the four-factor AIR model explained responses to the 27 AIM items significantly better (p < 0.0001) than did the one-factor model. Moreover, the four-factor model explained 83% and 85% of the variation in the responses in the two samples, respectively, whereas the one-factor model explained only 62% and 66%, respectively. While inspecting the relationships among the four AIR factors, however, we noticed that the positive intensity and positive reactivity factors were highly intercorrelated (.92 for sample 1 and .90 for sample 2), whereas the negative intensity and negative reactivity factors intercorrelated at only .55 for both samples. (Evidently, affect intensity and reactivity are different in relation to positive versus negative emotions, with the distinction between feeling and expression being much more relevant for negative affect. This may be because negative emotions often have more harmful social consequences than do positive emotions, and thus negative emotions are more likely to be repressed or inhibited.) Given the high correlation between positive intensity and positive reactivity, we combined these two factors to produce a three-factor AIR model (positive affectivity, negative intensity, negative reactivity) that achieved the same degree of goodness-of-fit as did the four-factor model. Because this three-factor AIR model provides equivalent statistical precision but greater parsimony than the four-factor model, it is currently the best measurement model for the AIM.
In the final phase of this research, we (Bryant et al., 1996, Study 2) investigated whether the three AIR factors contribute more, both conceptually and statistically, to understanding and predicting an important personality characteristic than does total AIM score. To accomplish this, we examined the relationship between affect intensity and dispositional empathy as measured by the Interpersonal Reactivity Index (IRI; Davis, 1983). First, we tested the hypothesis that the three AIR factors in combination would do a better job of predicting dimensions of dispositional empathy — empathic concern, perspective taking, personal distress, and fantasy — than would total AIM score. Second, we assessed the discriminant validity of the three AIR factors, relative to that of the unidimensional total AIM score, in predicting dimensions of dispositional empathy. For example, would positive affectivity, relative to negative intensity or negative reactivity, show a different pattern of relationships with the empathy dimensions?
We began by administering the AIM and the IRI to a new sample of 218 undergraduates. Once again, for this new sample, the three-factor AIR model provided a significantly better (p < 0.0001) measurement model for the 27 AIM items than did the one-factor “total score” model. We next used regression analyses to predict each of the four IRI factors using total AIM score first and then the three AIR factors. If affect intensity is truly unidimensional, as Larsen (1984) argued, then using the three AIR factors together as predictors should explain no more variance in dispositional empathy than using total AIM score as a global predictor. However, for each of the four empathy dimensions, the three AIR factors together explained more variance than did total AIM score, with the difference in R² ranging from a low of 8% (for fantasy) to a high of 125% (for perspective taking). These results clearly show the greater predictive utility of the three-factor AIR model.
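The logic of this comparison can be sketched with simulated data: regress an outcome first on a single total score and then on the three component scores, and compare the proportions of variance explained. Everything below is an assumption made for illustration: the data are randomly generated, the weights are arbitrary, the helper functions are hypothetical, and expressing the gain relative to the total-score R² is only one plausible reading of the percentages reported above.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest remaining entry into place.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def r_squared(predictors, outcome):
    """R-squared from an ordinary least-squares fit with an intercept."""
    rows = [[1.0] + list(p) for p in predictors]
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(b * x for b, x in zip(beta, r)) for r in rows]
    mean_y = sum(outcome) / len(outcome)
    ss_res = sum((y - f) ** 2 for y, f in zip(outcome, fitted))
    ss_tot = sum((y - mean_y) ** 2 for y in outcome)
    return 1.0 - ss_res / ss_tot


# Simulated scores (NOT real AIM or IRI data).
rng = random.Random(0)
n = 200
pos_aff = [rng.gauss(0, 1) for _ in range(n)]
neg_int = [rng.gauss(0, 1) for _ in range(n)]
neg_react = [rng.gauss(0, 1) for _ in range(n)]
# The outcome depends on the three scores with different, arbitrary weights.
outcome = [
    0.5 * a + 0.6 * b - 0.1 * c + rng.gauss(0, 1)
    for a, b, c in zip(pos_aff, neg_int, neg_react)
]
total = [a + b + c for a, b, c in zip(pos_aff, neg_int, neg_react)]

r2_total = r_squared([[t] for t in total], outcome)
r2_three = r_squared(list(zip(pos_aff, neg_int, neg_react)), outcome)
relative_gain = (r2_three - r2_total) / r2_total

print(f"R2 (total score):   {r2_total:.3f}")
print(f"R2 (three factors): {r2_three:.3f}")
print(f"relative gain:      {relative_gain:.0%}")
```

Because the total score is a fixed-weight sum of the three component scores, the three-predictor model can never explain less variance than the total-score model; the substantive question is how much more it explains.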
Investigation of discriminant validity found that no two AIR factors showed the same pattern of relationships with the four IRI factors, and none of the IRI factors showed the same pattern of relationships with the three AIR factors. For example, positive affectivity predicted greater empathic concern and greater empathic fantasy, but not personal distress and perspective taking. Negative affect intensity predicted greater personal distress and empathic fantasy, but not perspective taking and empathic concern. Negative reactivity predicted greater empathic concern, greater personal distress, and greater perspective taking, but was unrelated to empathic fantasy. The multidimensional (three-factor) model of affect intensity thus demonstrated superior conceptual and predictive precision relative to the unidimensional total AIM score. In short, affect intensity, as operationalized by the AIM, is multidimensional rather than unidimensional.
Further supporting the discriminant validity of the three AIR factors, women reported higher levels of negative reactivity than of positive affectivity (p < 0.0001), whereas men reported lower levels of negative reactivity than of positive affectivity (p < 0.0001). Thus, women say they are more emotionally reactive to negative events than to positive, whereas men say they are more emotionally reactive to positive events than to negative. (This pattern of results may reflect sex differences in socialization that encourage females to express or exaggerate negative feelings, and encourage males to suppress or deny negative feelings, in response to aversive events.)
Our research has an important implication for anyone who uses the AIM to measure affect intensity. Specifically, using total AIM score to operationalize affect intensity may well obscure findings that would otherwise emerge using the three AIR factors. Clearly, total AIM score does not provide the same picture of affect intensity as the three-factor model.
Does this mean that researchers should not use a global measure of affect intensity? The answer is that it depends on one’s purpose. If, for example, one wants to obtain a global assessment of “affect intensity” as a general personality trait, then the unidimensional model is appropriate. If, however, one wants to use “affect intensity” to predict other traits or outcomes, then the multidimensional model is more appropriate. With regard to the unidimensional approach, however, in the case of “affect intensity,” our research suggests that summing the 27 items from the AIR model into a single, total score provides a better global measure of general affect intensity than does the 40-item total AIM score.
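The two scoring options can be illustrated with a short sketch. The scoring key below is deliberately partial: it contains only the example items quoted earlier in this column, whereas the full 27-item key appears in Bryant et al. (1996) and is not reproduced here. The function name and the respondent’s ratings are hypothetical.

```python
def score_subscales(responses, key):
    """Sum item ratings within each subscale.

    responses: dict mapping item number -> rating on the response scale
    key: dict mapping subscale name -> list of item numbers
    """
    return {
        name: sum(responses[item] for item in items)
        for name, items in key.items()
    }


# Partial, illustrative key for the three-factor AIR model.
PARTIAL_AIR_KEY = {
    # Positive intensity (item 2) and positive reactivity (item 23) combined.
    "positive affectivity": [2, 23],
    "negative intensity": [30],
    "negative reactivity": [11],
}

# Hypothetical ratings for one respondent. Item 3 is answered but left
# unscored because it was among the 13 discarded items.
responses = {2: 4, 3: 5, 11: 3, 23: 5, 30: 2}

subscales = score_subscales(responses, PARTIAL_AIR_KEY)  # multidimensional
air_total = sum(subscales.values())                      # global AIR score

print(subscales)
print("global AIR total:", air_total)
```

The same responses thus yield either a profile of three subscale scores, for use as predictors, or a single global score based only on the retained items.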
As a more general point, this work on the Affect Intensity Measure illustrates how measurement modeling can be used to refine our understanding of the constructs that instruments measure, determine the most reliable and informative methods of scoring instruments to improve conceptual clarity and statistical precision, and enhance the effectiveness with which we use instruments. These important benefits make measurement modeling an invaluable psychometric tool in the behavioral sciences.
Bentler, P.M., & Bonett, D.G. (1980). Significance tests and goodness of fit in the analysis of covariance structures. Psychological Bulletin, 88, 588-606.
Bryant, F.B. (1997). The comparative anatomy of related instruments: An emerging specialty. The Behavioral Measurements Letter, 4 (2), 7-9.
Bryant, F.B. (1998). Measurement modeling: A tool for investigating the comparative anatomy of related instruments. The Behavioral Measurements Letter, 5 (2), 14-17.
Bryant, F.B. (1999). Measurement modeling: Identifying the constructs underlying the Center for Epidemiologic Studies Depression Scale (CES-D). The Behavioral Measurements Letter, 6 (1), 6-9.
Bryant, F.B., Yarnold, P.R., & Grimm, L.G. (1996). Toward a measurement model for the Affect Intensity Measure: A three-factor structure. Journal of Research in Personality, 30, 223-247.
Davis, M.H. (1983). Measuring individual differences in empathy: Evidence for a multidimensional approach. Journal of Personality and Social Psychology, 44, 113-126.
Diener, E., Sandvik, E., & Larsen, R.J. (1985). Age and sex effects for emotional intensity. Developmental Psychology, 21, 542-546.
Kline, R.B. (1998). Principles and practice of structural equation modeling. New York: Guilford Press.
Larsen, R.J. (1984). Theory and measurement of affect intensity as an individual difference characteristic. Dissertation Abstracts International, 85, 2297B. (University Microfilms No. 84-22112).
Larsen, R.J., & Diener, E. (1987). Affect intensity as an individual difference characteristic: A review. Journal of Research in Personality, 21, 1-39.
Weinfurt, K.P., Bryant, F.B., & Yarnold, P.R. (1994). The factor structure of the Affect Intensity Measure: In search of a measurement model. Journal of Research in Personality, 28, 314-331.
Williams, D.G. (1989). Neuroticism and extraversion in different factors of the Affect Intensity Measure. Personality and Individual Differences, 10, 1095-1100.
Fred Bryant is Professor of Psychology at Loyola University, Chicago. He has roughly 80 professional publications in the areas of social psychology, personality psychology, measurement, and behavioral medicine. In addition, he has coedited 5 books, including Methodological Issues in Applied Social Psychology (New York: Plenum Press, 1993). Dr. Bryant has extensive consulting experience in a wide variety of applied settings, including work as a research consultant for numerous marketing firms, medical schools, and public school systems; a methodological expert for the U.S. General Accounting Office; and an expert witness in several federal court cases involving social science research evidence. He is currently on the Editorial Board of the journal Basic and Applied Social Psychology. His current research interests include happiness, psychological well-being, Type A behavior, the measurement of cognition and emotion, and structural equation modeling.