The Behavioral Measurement Letters

Behavioral Measurement Database Service

Selecting Instruments for Behavioral Research: Advice for the Intermediate User

Continued article from The Behavioral Measurement Letter, Vol. 9, No. 1: Spring 2006

Thomas P. Hogan
University of Scranton

General Overview

Selecting appropriate measurement instruments is a crucial part of any research study. For most psychological constructs, instruments already exist. Fortunately, there are a variety of sources of information to help locate suitable instruments. Although novices are best served by simply adopting whatever is most frequently used (or whatever they are told to use), persons at an intermediate level of training and experience should become proficient in using standard sources of information to find and evaluate instruments. I outline these sources, along with their strengths and weaknesses, in four major categories: electronic listings, hard copy listings, test reviews, and test publishers’ catalogs and websites.

Users of Instruments

For didactic purposes, it is convenient to divide people who need to select an instrument for behavioral research into three groups. I will call the three groups of users novices, intermediates, and experts, much like a classification of tennis players or do-it-yourself home repairers. Of course, there is an underlying continuum of knowledge and skill for all three groups, but the discrete groupings will be helpful for formulating advice.

Novices in the world of instrument selection include undergraduate students and others with little experience or training in behavioral research. The entry-level users might have had one course in statistics and possibly a course in research methods. In fact, novice users might need to select a behavioral instrument for a first project as part of a course in research methods or psychological testing.

The intermediate group of instrument users includes individuals who have developed a reasonable base of training and experience in behavioral research. These mid-level users have completed several courses in statistics, research methods, and measurement. Intermediate users have likely designed and carried out one or two simple research projects, but always under the supervision of a more experienced researcher. Many graduate students would fall into this middle category of instrument users.

The expert group, of course, includes experienced researchers who regularly publish the results of their research. These high-level users often supervise and teach novice and intermediate users. Expert users include professors and full-time researchers at colleges, universities, research institutes, and major test publishers.

Getting the Target Construct Right

In fundamental ways, all three groups have the same needs when it comes to selecting measurement instruments for research projects. As nicely described by Brockway and Bryant (1998) [reprinted in this issue of the newsletter], research studies typically revolve around theoretical constructs. These abstract constructs receive their concrete operational definitions in the form of some measurement instruments. A crucial question is how well a particular measurement instrument fits the construct(s) of interest.

Textbooks and journal articles tend to use cryptic labels (often one or two words) for constructs, for example, depression, self-esteem, nonverbal intelligence, or home environment. In many ways, these labels facilitate communication. Indeed, later sections of this article will use these short-hand labels for purposes of keyword searching. However, an overreliance on such labels belies the underlying richness and complexity of the construct the user wants to measure.

The Standards for Educational and Psychological Testing (Standards; American Educational Research Association, American Psychological Association, & National Council on Measurement in Education, 1999) provides a useful conceptual framework and terminology for this issue of the fit between construct and measure. The Standards refers to construct underrepresentation and construct-irrelevant variance.

Construct underrepresentation occurs when the measurement instrument does not fully capture the construct of interest. Construct-irrelevant variance arises when the instrument measures, in part, constructs other than the one the user has in mind. The Standards notes, “Nearly all tests leave out elements that some potential users believe should be measured [construct underrepresentation] and include some elements that some potential users consider inappropriate [construct-irrelevant variance]” (p. 10).

Before searching for appropriate measurement instruments, it is essential to spend time clearly defining the construct of interest. In other words, it is crucial that users get the target right. The following simple strategy may help this process. Start with a short-hand label for the construct. Then develop an elaborated definition or description of the construct. This might be one or two paragraphs. The expanded version might include synonyms, definitions from standard sources such as field-specific dictionaries, and textbook elaborations. Finally, try to identify what is not included in the construct. For example, you may want to ensure that the construct of quantitative ability is not contaminated with a heavy computational load or reading ability; or that anxiety is distinguished from fear. The important point here is that this process of clarifying the target construct should occur before you start searching for appropriate measurement instruments.
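The strategy just described can be written down as a small record, which makes it easy to see which steps are still unfinished before a search begins. The following Python sketch is only an illustration; the class name ConstructSpec and its fields are hypothetical and are not part of any of the sources discussed in this article.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ConstructSpec:
    """A written-down target construct, prepared before any instrument search."""

    label: str                                          # short-hand label
    definition: str = ""                                # elaborated description (one or two paragraphs)
    synonyms: list[str] = field(default_factory=list)   # alternative names for the construct
    excludes: list[str] = field(default_factory=list)   # what the construct does NOT include

    def missing_parts(self) -> list[str]:
        """Return the names of the steps that have not been completed yet."""
        missing = []
        if not self.definition.strip():
            missing.append("definition")
        if not self.excludes:
            missing.append("excludes")
        return missing

    def search_terms(self) -> list[str]:
        """Label plus synonyms, lowercased and de-duplicated, for keyword searching."""
        seen: list[str] = []
        for term in [self.label, *self.synonyms]:
            cleaned = term.strip().lower()
            if cleaned and cleaned not in seen:
                seen.append(cleaned)
        return seen


# Example based on the quantitative-ability illustration in the text.
spec = ConstructSpec(
    label="Quantitative ability",
    synonyms=["numerical reasoning", "quantitative ability"],
    excludes=["heavy computational load", "reading ability"],
)
print(spec.missing_parts())   # the elaborated definition has not been written yet
print(spec.search_terms())
```

In this example the record flags that the elaborated definition is still missing, which is a reminder that the search should not start yet.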

Although this basic issue of defining the target construct accurately pervades the work of people in all three groups of instrument users, the practical approaches to selecting appropriate instruments differ across the three groups. In this brief article, I concentrate on advice for the intermediate group of instrument users. But, first, I briefly consider the other two groups of users. As for the expert group, no advice is needed. People in this group already “know the ropes.”

As for the novice group of users, advice is easy. It consists of three simple points. First, resist the temptation to build your own instrument. Although developing your own instrument may sound expedient, it is almost always a bad idea. Why? Because instrument development and validation is a complex and time-consuming task, one that requires considerable technical expertise and experience. Second, find out what measurement instruments are most frequently used for the construct(s) you are studying and use these instruments. For example, if most of the studies covered in your literature review use the Piers-Harris Children’s Self-Concept Scale, then use that instrument. Adopting commonly used instruments helps to integrate your findings with previous research. Third, use simple search strategies, such as those outlined by Moore, Bryant, and Perloff (1999) [reprinted in this issue of the newsletter], to get more detailed information about the instruments you plan to use, that is, the instruments most frequently used in studying your topic.

Advice for the Intermediate Group

Now for the middle group of instrument users: What more is expected of them than is expected of the novice group? The research community, including the research mentor, has at least the following expectations for persons in the intermediate group of instrument users.

You should have increased sophistication in searching out appropriate instruments for specific applications. Later sections of this article help to ensure that this expectation is met.
You should have the ability to evaluate the reliability and validity of instruments for specific applications. Such evaluation may be beyond novices, who probably do not have sufficient training to make independent evaluations of these matters. But people in the middle group do have such training – that is partly why they get classified as intermediate users. You need to apply this training to make your own judgments about reliability and validity.
You need to be able to judiciously combine two or three instruments in a single study. This requires not only knowledge of technical matters like reliability and validity, but also the practical common sense that comes from having conducted at least a few research studies in the past.

Practical Tips About the Search Process

As you begin your search for suitable instruments, here are a few practical tips.

The first bit of advice is the same as for the novice group: Resist the temptation to build your own instrument. Leave that for the experts. Building your own instrument should be a last resort. Almost certainly, several instruments already exist for the constructs you intend to study. The trick is finding them – which is the main point of this article. (The exception to this first bit of advice is a project where the entire focus of attention is on building a new instrument.)
Use the invaluable sources of information described here to find an existing instrument or instruments for your research project.
Devote ample time to your search for instruments. You cannot do a respectable job in a mere 20 minutes of Internet searching. It will take several hours for your initial search, followed by hours of more detailed study of the most promising possibilities, followed by more time securing copies and perhaps permission to use the instrument.
Do not rely entirely on brief descriptions of instruments. You have to get your hands on the actual instrument, examine its directions, and review the items and administration procedures before deciding whether the instrument is right for your purposes.
Be sensitive to the varied roles that measurement instruments play in the research enterprise. A measurement instrument provides the operational definition of the dependent variable in a research study. However, measurement instruments can also be important in describing characteristics of the participants and as covariates or blocking variables in more sophisticated research designs.
It is tempting to use multiple instruments in order to get a rich description of the construct of interest. In fact, use of multiple instruments is the mark of a researcher with some experience. However, be careful not to overdo it. You will carry out your research with real people, who probably don’t care that much about your project. Be careful you don’t bore respondents to the point that you jeopardize the validity and reliability of their responses.
If you are reproducing or adapting an existing instrument, make sure you obtain proper permission from the author or publisher. Be aware that obtaining such permission can be time-consuming.

Sources for Your Search

Now for the search itself. There are a variety of excellent sources available to help you find suitable instruments for a research project. Here I identify these sources and describe their strengths and weaknesses. The sources for finding instruments fall into four major categories.

Electronic Databases. Currently, there are two available electronic databases specifically devoted to cataloging existing instruments:

Health and Psychosocial Instruments (HaPI), a product of Behavioral Measurement Database Services (BMDS), available online and on CD-ROM from Ovid Technologies.

Educational Testing Service Test Collection, available at http://sydneyplus.ets.org

Probably the best place to start your search is with one or both of these two electronic databases. Each database contains brief descriptions of thousands of tests and has excellent searching capabilities. Simply enter the name of the construct of interest (e.g., anxiety or self-esteem) and get a list of measurement instruments relevant for that construct.
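To make the idea of a keyword search concrete, here is a minimal Python sketch of matching a construct name against brief instrument descriptions. It does not reproduce the search interface of HaPI or the ETS Test Collection; the Entry record, the search function, and every catalog entry below are invented for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One brief database-style description of an instrument."""

    name: str
    purpose: str
    target_group: str
    source: str


# Hypothetical records, invented purely for illustration (not real instruments).
CATALOG = [
    Entry("Example Anxiety Inventory", "Measures trait anxiety",
          "Adults", "Hypothetical Publisher A"),
    Entry("Sample Self-Esteem Scale", "Measures global self-esteem",
          "Adolescents", "Journal appendix (hypothetical)"),
    Entry("Demo Test Anxiety Checklist", "Screens for test anxiety",
          "College students", "Hypothetical Publisher B"),
]


def search(catalog, keyword):
    """Return entries whose name or purpose contains the keyword (case-insensitive)."""
    needle = keyword.strip().lower()
    if not needle:
        return []
    return [
        entry for entry in catalog
        if needle in entry.name.lower() or needle in entry.purpose.lower()
    ]


# List every hypothetical instrument that mentions the construct "anxiety".
for hit in search(CATALOG, "anxiety"):
    print(f"{hit.name} | {hit.purpose} | {hit.target_group} | {hit.source}")
```

A plain substring match like this returns every entry that mentions the keyword, good or bad, so the resulting list is a starting point for evaluation and not a recommendation.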

These databases have three main advantages. First is their easy searching capability. Your list of possibly useful instruments is just a few clicks away. Second is their comprehensiveness. Each database contains information about thousands of instruments. For the more commonly studied psychological constructs (such as depression, self-concept, anxiety, introversion, intelligence, spatial ability, etc.) it is not unusual to find at least a dozen existing instruments listed in these databases. Third, the databases contain not only the names of instruments but also brief descriptive information, including the instrument’s purpose, intended target groups, scores, and publisher or other source.

The electronic databases have two main drawbacks. First, although they contain brief descriptive information about each entry, they do not provide evaluations of the quality of the entries. The bad get equal billing with the good. Thus, you, as the ultimate user, still have work to do in deciding if the instrument is appropriate for your project. Second, although the databases guarantee that a particular instrument existed somewhere at some time, they do not guarantee the ready availability of that instrument. Users must investigate availability separately. Some of the instruments in the databases can be purchased from one of the major test publishers. BMDS also provides copies and scoring instructions for many of the instruments in the HaPI database. On the other hand, some of the instruments may be out of print or may be available only in the appendix of a journal article, thus requiring permission from the journal’s publisher for reproduction.

Hard Copy Listings. There are currently at least three hard copy listings of tests. Although these hard copy listings are similar in many ways to the electronic listings, they are also different in several important ways.

Tests in Print (Murphy, Plake, Impara, & Spies, 2002), usually referred to simply as TIP, is now in its sixth edition. New editions appear about every four years. Tests: A Comprehensive Reference for Assessments in Psychology, Education, and Business (Tests; Maddox, 2003) is quite similar to TIP; new editions of Tests appear about every five years. Both TIP and Tests limit their entries to tests that are in English and are available from regular publishers. (A “regular” publisher means a commercial organization that is in the business of developing and selling tests.) The Directory of Unpublished Experimental Mental Measures (Directory; Goldman & Mitchell, 2003), as suggested by its title, lists tests that are not available from a publisher. Entries in the Directory are taken, either by reference or in full, from a journal article. Thus entries in the Directory are published in the sense that they appear in print somewhere, but they are not available from a regular publisher.

Whereas you use typical “keyword and click” methods to search the electronic databases, you use a printed index to search these hard copy listings.

Although it is certainly easier to complete a quick search electronically, browsing a hard copy listing is potentially useful; you are more likely to encounter a serendipitous result by leisurely browsing a hard copy. As with the electronic listings, all three of these hard copy listings provide brief descriptions of the tests they include (e.g., purpose, target group, and scores). Also like the electronic listings, these sources do not provide evaluative commentary about the tests. Most academic libraries have copies of all three of these hard copy listings.

Test Reviews. Two major sources provide professional, qualitative reviews of tests. The best known is Buros Mental Measurements Yearbook (Spies & Plake, 2005), now in its 16th edition. It is known simply as Buros or MMY, and new editions appear about every two or three years. Each edition contains reviews of approximately 400 tests, concentrating on new or recently revised tests. Two independent reviews are given for most entries.

Buros reviews are available in three forms. First, there are the traditional hard copy volumes, available in most academic libraries. Second, some libraries subscribe to the Buros on-line reviews (an Ovid Technologies product, like HaPI), which include all reviews since the 10th MMY. For libraries subscribing to this service, there is no user fee for accessing a review. Third, for a fee, a Buros review can be accessed via the Internet at http://www.unl.edu/buros/.

The second major source of test reviews is Test Critiques (Keyser, 2004), now available in 11 volumes, with new volumes appearing at varying intervals. Test Critiques covers fewer instruments than MMY and tends to concentrate on the more widely used tests.

Reviews in MMY and Test Critiques provide a very important professional service. They are the only sources that give evaluations of the quality of a wide variety of tests. You should definitely consult these sources to see if they contain evaluative reviews of any tests you are considering for your project and, of course, read the reviews if they are available.

These collections of test reviews have two principal drawbacks. First, they cover only regularly published tests and not even all of those. Thus, many of the tests you identify in searching electronic or hard copy listings of tests will not have been reviewed. Second, because the reviewing process takes time, reviews will not be available for tests that have become available only recently. For both of these reasons, you need to rely on your own judgment about the suitability of many of the instruments you are considering for your research. Members of the intermediate group of test users should be able to do that.

Publishers’ Catalogs and Websites. Like L.L. Bean and Sears, the major test publishers describe their products in catalogs, available in hard copy as well as on the Internet. Of course, many of these published tests are also briefly described in the electronic and hard copy listings presented earlier. However, a publisher’s catalog will contain more complete descriptions of a test. Thus, if you are considering use of an instrument from one of the major test publishers, you should definitely consult the publisher’s catalog. Especially important are the detailed facts about the latest editions, types of answer media, scoring services, and costs. Regarding costs, be aware that most publishers give discounts, usually 25% to 40%, for research use of their tests. A publisher’s website usually contains the form needed to secure this research discount.
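For budgeting purposes, the typical 25% to 40% research discount translates into a simple cost range. The Python sketch below is only a rough planning aid; the function name research_price_range and the $100 list price are hypothetical, and actual discounts depend on the individual publisher.

```python
def research_price_range(list_price, low_discount=0.25, high_discount=0.40):
    """Return (lowest, highest) expected cost after a research discount.

    The default bounds reflect the usual 25% to 40% range; a given
    publisher may offer more, less, or no discount at all.
    """
    lowest = round(list_price * (1 - high_discount), 2)   # best case: largest discount
    highest = round(list_price * (1 - low_discount), 2)   # worst case: smallest discount
    return lowest, highest


# Hypothetical catalog price of $100.00 for a set of test materials.
lowest, highest = research_price_range(100.00)
print(f"Expected cost: ${lowest:.2f} to ${highest:.2f}")
```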

Although the publisher’s catalog is the best source of information about such practical matters as cost, latest editions, etc., it is not an unbiased source of information about the quality of a test. After all, the publisher is in the business of selling the test and will, therefore, present it in the best possible light. Look elsewhere, including your own judgment, to evaluate test quality.

Other Sources

I have outlined here the main sources of information about tests appropriate for use at the intermediate level. Several more advanced sources, especially appropriate for technical work in test construction and specialized applications, are also available. For descriptions of these other sources see Hogan (2003).

Concluding Thoughts

Selecting the best instruments for your research project is a crucial part of the entire research process. An instrument is an operational definition of the construct under investigation. Novices should simply use whatever has been used most frequently in the past. More is expected of those who, by training and experience, have moved beyond the novice category. Showing sophistication in the use of sources of information about existing tests is part of this “more.” It is worth taking the time to become proficient in the use of these sources of information and then actually use them. Doing so should contribute to a richer, more meaningful research project.

References

American Educational Research Association, American Psychological Association, & National Council on Measurement in Education. (1999). Standards for educational and psychological testing. Washington, DC: American Educational Research Association.

Brockway, J. H., & Bryant, F. B. (1998). You can’t judge a measure by its label: Teaching students how to locate, evaluate, and select appropriate instruments. Teaching of Psychology, 25, 121-123.

Goldman, B. A., & Mitchell, D. F. (2003). Directory of unpublished experimental mental measures (Vol. 8). Washington, DC: American Psychological Association.

Hogan, T. P. (2003). Psychological testing: A practical introduction. New York: Wiley.

Keyser, D. J. (Ed.). (2004). Test critiques (Vol. XI). Austin, TX: Pro-Ed.

Maddox, T. (2003). Tests: A comprehensive reference for assessments in psychology, education, and business (5th ed.). Austin, TX: Pro-Ed.

Moore, D., Bryant, F. B., & Perloff, E. (1999). Measurement instruments at your fingertips. Eye on Psi Chi, 3(2), 17-19.

Murphy, L. L., Plake, B. S., Impara, J. C., & Spies, R. A. (Eds.). (2002). Tests in Print VI. Lincoln, NE: University of Nebraska Press.

Spies, R. A., & Plake, B. S. (Eds.). (2005). The sixteenth mental measurements yearbook. Lincoln, NE: University of Nebraska Press.

