

Vol. 4, No. 2: Spring 1997


The Method in Multitrait-Multimethod (MT-MM)

Donald W. Fiske


The multitrait-multimethod (MT-MM) matrix is now part of the common methodology for research psychologists studying individual differences. The inspiration of the late Donald T. Campbell, it will be included in the training of psychologists for years to come. The basic idea is straightforward: a researcher should design his studies so that he has measures of his essential variables by more than one method. In other words, his research should involve the correlations among several traits, each measured by the same set of methods. As the necessity of using several methods becomes more generally accepted, the field of individual differences should gain. There is some question, however, whether the published matrices are improving over the dismaying set included in the original article (Campbell & Fiske, 1959, Psychological Bulletin, 56, 81-105).

That paper has had a record-breaking citation rate and is still receiving more than a hundred citations a year. The wide range of journals in which it is being cited suggests that its message is spreading widely. Some of the citations are probably in papers by methodologists who continue to work on the problem of statistically analyzing the MT-MM matrix.

Apart from any such analyses, much can be learned about the constructs measured and the methods used by a close examination of each matrix one obtains. Answers can be sought for a number of important questions. For each construct (trait or other type of variable), is it being measured differently by the several methods? Is the level of association compatible with the investigator’s conceptualization of that variable or does it indicate appreciable method variance (unwanted variance due solely to the particular methods)? Are there one or more methods that yield distorted patterns of correlations or correlational values that are much too low? Are there methods that generate too high correlations, correlations that cannot be accepted for the constructs as currently construed?
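The kind of inspection described above can be sketched numerically. The following is a minimal illustration, not any published analysis: it assumes three hypothetical traits each measured by two hypothetical methods, builds the full correlation matrix, and compares the validity diagonal (same trait, different methods) against the heterotrait-monomethod correlations, which should be lower if method variance is not overwhelming construct variance.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200  # hypothetical sample size

# Latent scores for three uncorrelated hypothetical traits
traits = rng.normal(size=(n, 3))

def measure(latent, method_effect_sd, noise_sd):
    """Observed scores = trait + a shared method factor + random error."""
    method_factor = rng.normal(size=(n, 1)) * method_effect_sd
    return latent + method_factor + rng.normal(size=latent.shape) * noise_sd

m1 = measure(traits, 0.3, 0.5)  # method 1: modest method variance
m2 = measure(traits, 0.6, 0.5)  # method 2: stronger method variance

# Columns: T1M1, T2M1, T3M1, T1M2, T2M2, T3M2
scores = np.hstack([m1, m2])
R = np.corrcoef(scores, rowvar=False)  # the 6x6 MT-MM correlation matrix

# Validity diagonal: same trait, different methods (convergent validity)
validity = [R[t, t + 3] for t in range(3)]

# Heterotrait-monomethod: different traits, same method
hm = [R[i, j] for blk in (0, 3)
      for i in range(blk, blk + 3) for j in range(i + 1, blk + 3)]

print("convergent validities:", np.round(validity, 2))
print("heterotrait-monomethod:", np.round(hm, 2))
```

In this simulated case the validity diagonal clearly exceeds the heterotrait-monomethod values; when the pattern reverses in real data, that is the warning sign Fiske describes, that the "constructs" are riding on the methods.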

Of course, one’s research field may not enable one to set up a design with several traits or constructs, each being measured by several apparently highly diverse methods. Even so, one can measure each central construct in the research plan by at least two methods. Correlations between two methods of measuring can be illuminating. Correlations between two constructs measured by just one method seem useless. If any of the constructs can be measured only by just one method, beware! The construct may be inextricably embedded in the method.


The MT-MM matrix can help us discard a method or a construct that is not useful. We are quite willing to discard a method when we find flaws in it, when it generates weak and unreliable data, when method variance overwhelms construct variance. But we sail along happily with our constructs, until an MT-MM matrix shows that they are too specific to the method used to measure them. Or we may find that our pet or fledgling construct overlaps highly with another construct, for each of several methods. An interesting case study is “social intelligence.” In the twenties, an effort was made to measure this construct, which seemed like a reasonable one. But efforts to measure it yielded scores so highly correlated with general intelligence that it could not be considered a separate variable. [Editor’s Note: One cannot help but wonder whether a similar fate awaits Daniel Goleman’s (1995) articulation of “emotional intelligence” (New York: Bantam Books).]

The major focus of the “Convergent and Discriminant Validation” paper and of this note is on method. We can reformulate constructs as our results require, but what can we do about our methods? Sometimes we can improve them, but more usually a bad method has to (and therefore should) be discarded. Psychometricians are familiar with the several sources of possibly intrusive variance in ordinary testing, such as the items, the instructions, the examiner, and the reason for the testing session. All of these are within the testing room, the standard context well known to psychometricians. What about other contexts or situations? For each context, one can prepare a list of components or features of that context, each of which could potentially affect the measurements obtained in the given context. For a construct that you have studied, what are the one or more contexts in which you must study it? The most convenient, most readily available context, such as a laboratory bench or a psychology classroom, may not be a wise choice. Mating behavior in the laboratory cage is quite different from the behavior observed in a simulation of the natural environment in which rats mate (McClintock, 1981, New directions for methodology in social and behavioral science, San Francisco: Jossey-Bass).

Our problem is not specific to psychology or the social sciences more generally. Recently I learned that studies of the Golgi material in cells yield somewhat different results, depending upon the discipline to which the researcher adheres, as a consequence of training and practice with a given set of tools. The two sets of results are simply disparate, not contradicting each other but at the same time not mutually supportive of each other.

In other instances in the natural sciences, the situation is quite clear and clean: e.g., in the measurement of temperature, there are a score of different methods, there are conceptual linkages among them, and they agree with each other very well (especially by our standards). So when temperature is being measured, our problem does not get in the way.

An examination of the measurement of temperature may throw some light on our problem. Temperature has a variety of effects and there is some theory about each effect. So the physicist can apply a piece of general physical theory to provide a conceptual basis for each method of measuring it. Most of the physical theory relevant to temperature seems to be pretty well agreed upon by physicists. Unfortunately for us, we do not have such an agreed upon general theory in psychology, or at least not enough to provide a basis for a theory of measuring, a theory of method. (Psychometric test theory does not throw light on the underlying problems.) So we need to develop conceptual formulations about measuring that we can test empirically to see which ones need to be modified. Until such theorizing has been worked out and tested, we can use the MT-MM matrix as a prop. If we believe that a substantive construct should be constant over a set of methods for measuring it, we can apply those methods to that construct and see what happens. If the methods agree, that’s great. If they don’t, then either the methods or the conceptualization of the construct must be changed.

Perhaps the solution to this problem is to label each set of measurements not only by the substantive, conceptual core but also by the method by which they were obtained. We already do this for self-ratings as opposed to peer-ratings and for intelligence as measured by the Wechsler or as measured by the Stanford-Binet.

The pervasiveness of method has been known for centuries. Shakespeare has Polonius say of Hamlet’s strange behavior, “Though this be madness, yet there is method in’t.”

Officially retired in 1986, Fiske is now almost completely retired from the frenetic world of actual research and scholarship. At the University of Chicago from 1948, the time of his doctorate at the University of Michigan, he worked on many problems, not solving any of them but contributing to our understanding of each problem and its consequent reformulation: intraindividual variability over time; our overreliance on words as stimuli, as responses, and in instructions; rater-judge effects; and, most central of all, the method problem.




About This Newsletter… Transitionally Speaking

Robert Perloff, Guest Newsletter Editor



After I agreed to offer a segue or transition into the three innovative articles constituting the substance of and written especially for this issue of The Behavioral Measurements Letter (Volume 4, No. 2, Spring 1997), a question leaped up at me: what exactly did this issue segue from? How is this issue of the Behavioral Measurement Database Services (BMDS) semiannual newsletter different from the issues preceding it? Raising this question is no idle distraction from the business at hand, for this question mandates, necessarily and desirably, that the newsletter and its parent, BMDS, be identified and explained for new readers as well as for our friends and colleagues who have been with us since the newsletter’s inception some four years ago.

The principal offspring of BMDS is HaPI [the Health and Psychosocial Instruments (hence HaPI, get it?) database]. HaPI’s function, expressed in the sidebar to the newsletter title, above, is “Enriching the health and behavioral sciences by broadening instrument access,” a double-duty enrichment serving as an icon both for HaPI and for the newsletter.

HaPI, available online through OVID Technologies (an international vendor of databases) and now obtainable as a CD-ROM from BMDS, contains over 45,000 records of interest to psychologists, physicians, nurses, social workers, educators, evaluators, sociologists, administrators, other health and behavioral scientists, and students. Ranging from widely recognized to obscure and unpublished, these instruments include questionnaires, interview schedules, coding schemes, observation checklists, rating scales, tests, projective techniques, and measures using vignettes or scenarios.


Information Brought to Light: Most instruments are “buried” in avalanches of published literature and are hence difficult to discover. Worse still, scientists in one field (e.g., psychology) may be unfamiliar with instruments in other fields (e.g., medicine, nursing, public health). The majority of users do not have access to instruments that either have been recently developed or are described in unpublished manuscripts. These measures are generally known only by people in a particular field or subspecialty. By maintaining information on instruments from these diverse sources, HaPI enables users to retrieve relevant measures about which they might otherwise be unaware. Thus, HaPI helps researchers avoid “reinventing the wheel.” HaPI places existing information on measurement instruments at users’ fingertips, no farther away than their keyboard.


What and Why is the Behavioral Measurements Letter?

The Behavioral Measurements Letter, a semi-annual newsletter, is devoted to the exploration of timely measurement topics. The Behavioral Measurements Letter is published by Behavioral Measurement Database Services, producer of the Health and Psychosocial Instruments (HaPI) database.

The impetus for this newsletter sprang from BMDS’s belief in the paramount importance of measurement. Just as in the physical sciences, advances in the health and behavioral sciences are proportional to advances in measurement. As Robert Pool stated in the case of the physical sciences, “These advances are vital, because science’s understanding of the physical world is necessarily limited by the accuracy with which science can measure that world” (Science, 1988, 240, 604-605).


Earlier Editions of the Newsletter (Where we are Segueing From)

The first three volumes of the newsletter and No. 1 of Volume 4, while not bereft of substantive articles (“Finding the Right Measure,” “Clinical Measurement for Primary Care,” “Faith in Measurement,” “Ways to Measure Demographic Variables,” “In Memoriam: Donald T. Campbell,” “Measuring Reminiscence in Research on Type A Behavior,” “Beck Depression Inventory”), were quite appropriately devoted to housekeeping items and articles establishing the impetus for the newsletter in the first place [“HaPInings,” “HaPI Thoughts” (the lighter side), “Instrument Update,” and related developments pinpointing changes in and new features of HaPI].


Segueing Into…the Transition

When the newsletter was inaugurated there were 18,000 records; in less than four years this has increased 150%, to 45,000 records. First, a highly popular feature with users has now been established as part of HaPI’s repertoire: document delivery of instruments. Next, probing in this issue beneath measurement’s first layer of information are “something old” (Fiske’s article) and “something new” (the articles by Pfau and by Bryant). The “old” is a review by Donald W. Fiske of the celebrated classic by him and Donald Campbell on the multitrait-multimethod (MT-MM) matrix, a measurement breakthrough whose significance is attested to by its thousands of citations. The question these “hall of fame” measurement psychologists sought to answer is whether a finding about a psychological phenomenon, trait, or behavior is due fundamentally to that trait itself or is instead an artifact of the method used to explore it. In Fiske’s own words from his article in this newsletter, “Are there one or more methods that yield distorted patterns of correlations or correlational values that are much too low? Are there [rather] methods that generate too high correlations, correlations that cannot be accepted for the constructs as currently construed?”

The first “something new” is “Measuring Perceptions of Relational Communication,” by Wisconsin communication professor Michael Pfau. Like the article by Fiske, Pfau’s is significant and instructive because, in identifying a new dimension in communication, “relational messages,” he is suggesting that communication behavior is well served by looking at implicit (nonverbal) as well as explicit communication. As a matter of fact, I was instantly struck by the importance and widespread applicability of “relational messages.”

In a separate communication, Pfau says that in his article he “focused on the relational communication measure, as opposed to a combination of communication measures, because it has unique potential in the health communication context to tap both verbal and nonverbal components of person perception, and it is relatively unknown outside of the communication discipline.”

So here we have a powerful step up from what HaPI was a piddling three years ago, and a step up, also, in the context of the BMDS newsletter. And this brings us to the second “something new,” a new feature of the newsletter: illustrations provided in a new column by Fred Bryant, starting with his maiden column in this issue, “The Comparative Anatomy of Related Instruments: An Emerging Specialty.” Bryant’s column brings the newsletter to a higher level of measurement breakthroughs and to a window of erudition providing more insight into measures and the concepts they seek to elucidate, examining “alternative measures of the same construct to determine conceptual overlap and uniqueness.” This brings added value to the interpretation of measures as well as to decisions about which measures to use in a particular study. In this issue of the newsletter, Bryant’s first column illustrates this procedure using the construct of optimism. Bryant seeks to determine whether and how alternative measures of the same concept leave us with confidence that the concept is robust over different measures, in the same way that Campbell and Fiske, nearly four decades earlier, sought to determine whether the results of research involving measured entities are attributable to the concepts underlying the measure or, instead, to how the concepts are measured, which, of course, is what multitrait and multimethod are all about.

Thus, a more complex and sophisticated aspect of measurement is represented in the articles by Fiske, Pfau, and Bryant: an understanding not only of specific measures themselves, in an absolute and insulated way, but of where each measure stands with regard to other factors (the method used to measure concepts in the case of Fiske and of Bryant, and, in the case of Pfau, the “relational communications” that can be conveyed verbally as well as, and especially, nonverbally, alongside other verbal content). We are confident that you will enjoy these three articles and feel that their perusal was worthwhile.

So, coming full circle, this is what The Behavioral Measurements Letter was and what it has evolved into. We hope that you will give a “thumbs up” to the newsletter’s modest contribution to the measurement enterprise and will yourself be moved to offer articles for enriching The Behavioral Measurements Letter.

Robert Perloff, PhD, is Distinguished Service Professor Emeritus of Business Administration and of Psychology at the University of Pittsburgh. He has been president of many national professional and scientific societies including the American Psychological Association, the American Evaluation Society, and the Society of Psychologists in Management. Earlier in his career, Dr. Perloff was Director of Research & Development at Science Research Associates. He has just concluded a 3-year term as a member of the American Psychological Association’s Board of Scientific Affairs.