Selecting Tools For Use in Cross-Cultural Measurement

Continued article from the January 2005 Issue of The Behavioral Measurement Letter, Vol. 8, No. 2  Winter 2005

Carolyn F. Waltz, RN, PhD, FAAN

Cross-cultural measurement refers to the use of tools for comparing respondents from different cultures within one country, or for comparing one cultural group to another in two or more countries (see Waltz, Strickland, & Lenz, in press). The demand for measures that can be employed cross-culturally has increased dramatically during the last decade, due to: (a) greater awareness of health problems with global impact; (b) the rise in education, research, and clinical collaborations across cultures; and (c) the attention focused on eliminating health disparities among population subgroups in the U.S. and other countries. To ensure that tools have the essential attributes for use across cultures, it is necessary to consider carefully the research concept of interest, specific attributes of the tool, and language translation strategies. Before a tool is selected for cross-cultural use, the following three conditions must hold in order to minimize threats to reliability and validity.

  1. The concept the instrument is designed to measure should have the same meaning within each of the cultures in which the instrument will be employed.

To determine whether this condition holds true, researchers should begin by conducting a review of the literature in each culture, to see if the concept is relevant, and if so, to determine the extent to which prior researchers have studied it and how they have measured it. Researchers should then conduct a pretest with respondents representative of the cultures of interest, instructing them to describe the concept, and specific traits or behaviors that characterize it (Serpell, 1993). Once they have selected or constructed a measurement tool, cross-cultural researchers should also examine similarities and differences in response patterns, by comparing factor structures of scores on the instrument across relevant subgroups within the same culture (Wilson, Hutchinson, & Holzemer, 1997) or performing differential item analyses (Allalouf, Hambleton, & Sireci, 1999). If the results reveal structural equivalence across subgroups within each culture, then scores obtained from the same tool should be comparable when employed in the different subgroups within each culture.
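One way to put a number on the comparison of factor structures described above is a congruence coefficient between the loadings obtained in two subgroups. The article does not name a specific statistic, so the sketch below is an assumption: it uses Tucker's congruence coefficient, and the function name and the loadings are hypothetical values for illustration only.

```python
import math


def tucker_congruence(loadings_a, loadings_b):
    """Tucker's congruence coefficient between two factor-loading vectors.

    Each vector holds the loadings of the same items, in the same order,
    on one factor, estimated separately in each subgroup.
    """
    if len(loadings_a) != len(loadings_b):
        raise ValueError("loading vectors must cover the same items")
    cross = sum(a * b for a, b in zip(loadings_a, loadings_b))
    norm = math.sqrt(
        sum(a * a for a in loadings_a) * sum(b * b for b in loadings_b)
    )
    if norm == 0:
        raise ValueError("loading vectors must not be all zero")
    return cross / norm


# Hypothetical loadings for five items in two subgroups (not real data).
subgroup_a = [0.72, 0.65, 0.80, 0.58, 0.69]
subgroup_b = [0.70, 0.61, 0.77, 0.35, 0.71]

phi = tucker_congruence(subgroup_a, subgroup_b)
print(f"congruence coefficient: {phi:.3f}")
```

Values close to 1 suggest that the items load on the factor in a similar pattern in both subgroups. Where to set the threshold for "structurally equivalent" is a judgment the researcher has to make and justify.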

  2. Researchers should use an appropriate translation strategy.

Assuming measurement instruments are unavailable in the languages of the cultures under investigation, translators should be carefully selected. Foremost, translators should be ethnically and culturally representative of the population among whom the tool will be employed, fluent in both the original (source) language and the target language into which the tool is to be translated, familiar with both cultures, and knowledgeable about the concepts being measured and the tool being used. The probability of a successful translation is greater when: (a) translators use terms that refer to real experiences that are familiar in both cultures; (b) translators determine that each item describes the same phenomenon in both cultures and make modifications when necessary; (c) translators recognize differences in meanings of idioms and take steps to ensure these meanings are equivalent in both languages; (d) the original and translated versions of the tool are administered in the same manner, so that valid cross-cultural comparisons can be made; and (e) methods for assessing concepts are comparable between the two cultures. Cross-cultural researchers should pretest translated tools to determine that the resulting responses have distributional properties comparable to those for respondents in the source language version. If discrepancies are found, these differences should be analyzed for mistranslation, and the instrument should be modified if necessary. Before undertaking the actual research project, cross-cultural researchers should fully establish reliability and validity in each culture, and should conduct differential item analysis, for both the original and translated versions of the tool.
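The differential item analysis recommended above can take several forms, and the article does not prescribe one. As a minimal sketch, assuming items scored as correct or incorrect (or endorsed or not endorsed) and respondents matched on total score, the Mantel-Haenszel common odds ratio compares how the source-language ("reference") and translated ("focal") groups respond to a single item. The function name, the group labels, and the data are hypothetical.

```python
from collections import defaultdict


def mantel_haenszel_odds_ratio(records):
    """Mantel-Haenszel common odds ratio for one dichotomous item.

    records: iterable of (group, matching_score, item_correct), where
    group is "reference" or "focal", matching_score is the stratum
    (for example, total test score), and item_correct is True or False.
    """
    # Per stratum: [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for group, score, correct in records:
        if group not in ("reference", "focal"):
            raise ValueError(f"unknown group: {group!r}")
        cell = (0 if correct else 1) + (0 if group == "reference" else 2)
        tables[score][cell] += 1

    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        numerator += a * d / n
        denominator += b * c / n
    if denominator == 0:
        raise ValueError("odds ratio is undefined for these data")
    return numerator / denominator


# Hypothetical pretest data: one matching stratum, identical response rates.
pretest = (
    [("reference", 10, True)] * 2 + [("reference", 10, False)] * 2
    + [("focal", 10, True)] * 2 + [("focal", 10, False)] * 2
)
print(mantel_haenszel_odds_ratio(pretest))
```

An odds ratio near 1 indicates that, among respondents with comparable total scores, the item behaves similarly in both language versions. A ratio far from 1 flags the item for review, for example for possible mistranslation.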

  3. Members of different cultures should respond in the same manner to the particular tool to be used.

Discrepancies in how people from different cultures respond to a specific tool may result from: (a) differences in the tendency to respond in ways that are socially desirable; (b) differences in the tendency to respond in an acquiescent style; (c) differences in familiarity with response procedures; (d) differences in physical conditions during administration; (e) noncomparable sampling with respect to such variables as age, gender, and educational background; (f) interviewer effects; or (g) interviewer-respondent effects such as communication problems (Van de Vijver & Poortinga, 1997). Strategies for minimizing such discrepancies include familiarizing respondents with the method of assessment before administering the instrument, and conducting a pilot study to investigate instrument reliability and validity and to explore individual differences among respondents that may influence responses to the instrument.
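A pilot study of the kind described above would typically report a reliability estimate for each cultural group separately. The article does not specify which estimate to use. The sketch below assumes internal consistency measured with Cronbach's alpha, and the function name and responses are hypothetical.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a respondents-by-items table of scores.

    responses: list of rows, one row per respondent, one score per item.
    """
    n_items = len(responses[0])
    if n_items < 2 or len(responses) < 2:
        raise ValueError("need at least two items and two respondents")
    if any(len(row) != n_items for row in responses):
        raise ValueError("every respondent must answer every item")

    # Sample variance of each item and of the respondents' total scores.
    item_variances = [variance(row[i] for row in responses) for i in range(n_items)]
    total_variance = variance(sum(row) for row in responses)
    if total_variance == 0:
        raise ValueError("total scores do not vary")
    return (n_items / (n_items - 1)) * (1 - sum(item_variances) / total_variance)


# Hypothetical pilot responses on a four-item, 5-point scale (not real data).
pilot_by_group = {
    "source-language group": [[4, 5, 4, 4], [2, 2, 3, 2], [5, 4, 5, 5], [3, 3, 2, 3]],
    "target-language group": [[4, 4, 5, 3], [2, 3, 2, 2], [5, 5, 4, 4], [3, 2, 3, 4]],
}
for group, rows in pilot_by_group.items():
    print(f"{group}: alpha = {cronbach_alpha(rows):.2f}")
```

Computing the estimate per group, rather than on the pooled sample, keeps a low value in one culture from being masked by a high value in another.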


In summary, before using a tool in cross-cultural research, it is essential to demonstrate that the concept of interest has the same meaning in each culture, use appropriate translation strategies where translation is needed, make sure that members of each cultural group respond in a similar manner to the particular tool, and firmly establish the reliability and validity of the tool to be used.


Allalouf, A., Hambleton, R.K., & Sireci, S.G. (1999). Identifying the causes of differences in translated verbal items. Journal of Educational Measurement, 36, 185-198.

Serpell, R. (1993). The significance of schooling: Life-journeys in an African society. Cambridge, UK: Cambridge University Press.

Van de Vijver, F.J.R., & Poortinga, Y.H. (1997). Towards an integrated analysis of bias in cross-cultural assessment. European Journal of Psychological Assessment, 12, 21-29.

Waltz, C. F., Strickland, O.S., & Lenz, E.R. (in press). Measurement in nursing and health research (3rd ed.). NY: Springer Publishing Company.

Wilson, H.S., Hutchinson, S., & Holzemer, W.L. (1997). Salvaging quality of life in ethnically diverse patients with advanced HIV/AIDS. Qualitative Health Research, 7, 75-97.


Carolyn F. Waltz, RN, PhD, FAAN, is Professor and Director of International Activities and Evaluation in the School of Nursing at the University of Maryland, Baltimore. She is a member of the American Academy of Nursing, and has received 11 American Journal of Nursing Book of the Year Awards since 1981. Dr. Waltz is best known for her work on the measurement of clinical and educational outcomes in nursing. Email: cwaltz@son.umaryland.edu

“Finnegan’s Finagling Factor: That quantity which, when multiplied times, divided by, added to, or subtracted from the answer you got… gives you the answer you should have gotten.”

– Anonymous

