Vol. 8, No. 2 Winter 2005
Introduction to the January 2005 Issue of The Behavioral Measurement Letter
This issue of The Behavioral Measurement Letter includes four articles addressing a diverse range of measurement-related topics: looking back fondly on the life of a legendary pioneer and measurement advocate in the field of nursing research; addressing test-takers’ right to receive feedback about their test scores; helping researchers select tools and methods in cross-cultural research; and comparing and contrasting measurement in the behavioral and natural sciences. Note the multidimensional threads of measurement that run throughout these articles, weaving the different topics together into a rich and varied conceptual tapestry.
To begin this issue of The Behavioral Measurement Letter, Deidre Blank warmly remembers the life and times of Doris Bloch, a pillar in the field of nursing research. Dr. Bloch, who died in August 2003, was instrumental in building a formal structure for federal research support in nursing, and her contributions live on in the legacy she leaves behind in the health sciences. Doris was also a champion and ally of the Health and Psychosocial Instruments (HaPI) database, from its very inception. It is with deepest respect that we fondly dedicate this issue of The Behavioral Measurement Letter to the life, work, and memory of Doris Bloch.
Also in this issue of The Behavioral Measurement Letter, Robert Perloff makes a strong argument in support of test-takers’ inalienable right to know the meaning and interpretation of their test scores, whenever test-takers willingly give their responses to a tester for scoring, analysis, and interpretation.
Advocating a quid pro quo arrangement, he argues that anyone who uses test-takers’ scores is ethically obligated to provide test-takers with feedback about the meaning of their test scores, unless an individual respondent voluntarily waives his or her right to receive such feedback. Dr. Perloff’s clarion call for justice in ethical standards to guarantee the rights of individual test-takers suggests important extensions of informed consent procedures when conducting research with human participants. Readers interested in responding to Dr. Perloff’s article should submit either a Letter to the Editor or a brief manuscript to the address provided below.
Reflecting the growth of cross-cultural research, Carolyn Waltz offers researchers studying intra-cultural (within-culture) or inter-cultural (between-culture) differences a framework for maximizing the validity and reliability of the measurements involved. Advocating a solid psychometric foundation, she highlights the preconditions necessary to optimize the validity and reliability of a particular measurement instrument in cross-cultural research. Dr. Waltz emphasizes the absolute necessity of (a) using an appropriate translation strategy when creating a new form of an original instrument in another language, and (b) demonstrating that the construct being measured is equivalent both within and across cultures by establishing psychometric equivalence intra- and inter-culturally.
Finally, in this issue of The Behavioral Measurement Letter, Fred B. Bryant compares and contrasts the process of measurement in the behavioral sciences (in which he includes the health and social sciences) and the natural sciences. Arguing that researchers in the behavioral sciences have much to learn about measurement from their counterparts in the natural sciences, Dr. Bryant presents excerpts from an interview with a research colleague in biochemistry that highlight basic similarities and differences between the two disciplines. Dr. Bryant uses the interview excerpts to identify a core set of concerns (involving instrumentation, measurement error, measurement validity, and graduate training in measurement) that interconnect and distinguish measurement in the behavioral and natural sciences.
We invite written responses from our readership. Please address comments, suggestions, letters, or ideas for topics to be covered in future issues of the journal to: The Editor, The Behavioral Measurement Letter, Behavioral Measurement Database Services, P.O. Box 110287, Pittsburgh, PA, 15232-0787. Email: email@example.com
We also consider short manuscripts for publication in The Behavioral Measurement Letter. Submit, at any time, a brief article, opinion piece, or book review on a topic related to behavioral measurement, to The Editor at the above address. Each submission will be given careful consideration for possible publication in a forthcoming issue of The Behavioral Measurement Letter.
HaPI reading …
“The purpose of computing is insight, not numbers.”
– R. Hamming
Comparing Measurement in the Natural and Behavioral Sciences
An Interview With Biochemist Duarte Mota de Freitas, Ph.D.
Fred B. Bryant, Ph.D.
Measurement—the quantifying of information—is crucial to all science. It is simply impossible to test and build knowledge efficiently without it. Yet, on the whole, behavioral scientists (by this term I mean also to include researchers in the health and social sciences) seem to know relatively little about how their counterparts in the natural sciences approach the process of measurement. This knowledge gap is regrettable, because a better understanding of measurement issues in the natural sciences may well shed light on these same issues in the behavioral sciences. The present article provides some insights into behavioral measurement by highlighting similarities and differences in comparison with measurement in biochemistry.
Clearly, the two sciences have some similarities in measurement concerns. Presumably, natural scientists wrestle with some of the same thorny measurement issues (e.g., instrumentation, reliability, error, validity, and bias) as do behavioral researchers. Yet, there are also fundamental differences in measurement concerns across the two sciences, reflecting the difference between measuring physical properties versus behavioral constructs. What exactly are the similarities and differences in approaches to measurement in the natural sciences, as compared to the behavioral sciences?
To begin to answer this question, I recently interviewed an experienced natural scientist, Duarte Mota de Freitas, who is actively engaged in experimental laboratory research in biochemistry. Having graduated from UCLA in 1984 with a Ph.D. in chemistry, Dr. de Freitas is currently a Professor of Chemistry at Loyola University Chicago, where he specializes in bioinorganic chemistry, an interdisciplinary area at the interface of biochemistry and inorganic chemistry. I have chosen to interview a scientist from this particular discipline because this specialty is representative of the natural sciences.
By way of background, bioinorganic chemists like Dr. de Freitas study the role of metal ions, such as those of lithium, magnesium, platinum, iron, copper, and sodium, in biology and medicine. During the past 18 years, the primary focus of Dr. de Freitas’ research has been to seek an understanding at the molecular and cellular levels of how lithium salts work in the treatment of bipolar disorder. He is also applying this fundamental knowledge of lithium biochemistry to clinical questions, such as response to and toxicity of lithium treatment, by using blood samples from patients being treated for bipolar depression.
Below are excerpts from the interview of Dr. de Freitas (DMF), in which the author (FBB) posed questions about four broad issues related to measurement: (a) instrumentation, (b) error in measurement, (c) establishing measurement validity, and (d) training in measurement and instrument development. Where relevant, I highlight points of similarity and dissimilarity between the natural and behavioral sciences for each of these four measurement-related areas. Finally, I conclude by integrating the main points of convergence and divergence to form a broader perspective on measurement across the two sciences.
Specialization in Measurement Instrumentation
In both behavioral science and biochemistry, scientists often specialize in particular methods of measurement. However, whereas this specialization is due primarily to specialized interests in the behavioral sciences, measurement specialization in the natural sciences is also determined by the prohibitive costs of acquiring additional measurement tools. The following interview excerpt illustrates these points.
FBB: In bioinorganic chemistry, is it ever the case that a new theory will require the development of a new measurement tool that currently does not exist, in order to test hypotheses derived from that theory? If so, can you provide any examples?
DMF: When I embarked on lithium research, the major tool for lithium detection was atomic absorption spectrophotometry. However, atomic absorption spectrophotometry can only be used to measure total lithium concentrations in biological samples, and cannot be used to discriminate between free and bound forms of lithium. It had been proposed that a cell membrane abnormality was present in bipolar patients. If that were the case, one would anticipate that the distribution of lithium between free and bound forms in cell membrane preparations would differ between bipolar patients and normal individuals. My research group spent the first 8 years or so developing NMR spectroscopy of lithium-7, and we demonstrated that this technique could indeed be used to discriminate free and bound forms of lithium and that this lithium distribution was indeed different in disease and healthy states.
FBB: Are any formal resources available to help bioinorganic chemists like you identify alternative ways of measuring the phenomena you want to study? (The Health and Psychosocial Instruments database provides this measurement information for behavioral researchers.)
DMF: No. In the natural sciences, finding the best method and experimental conditions to address a question requires creative insight. There are several regional laboratory facilities across the United States, funded by the Federal Government. Although these facilities are well equipped with state-of-the-art instrumentation and are available for use by academic researchers, the traveling time, the limited availability of each instrument, and/or the relatively costly user fees preclude most researchers from having good access to every type of instrumentation for a specific research project. No research laboratory or, for that matter, academic department in the natural sciences can afford to purchase and maintain every conceivable type of sophisticated instrumentation. Most often, a given research laboratory specializes in the use of a relatively small number of physical methods and focuses on research problems that can be addressed by those tools.
Reducing Error and Bias in Measurement
Behavioral scientists and biochemists alike strive to overcome the problems of error and bias in their research by using multiple, complementary, and independent methods of measurement; and they both wrestle with situations in which the very act of measuring can alter the phenomenon under investigation. However, unlike behavioral scientists, biochemists can check to see if a measurement instrument is working properly by using it to assess a standard specimen with known properties. The following interview excerpt illustrates these points.
FBB: Researchers in the behavioral sciences are often concerned about potential sources of error or unreliability in their measurements. These are random influences on measurements that are unrelated to the underlying variables of interest. What are the potential sources of error in the measurements you make, and what steps do you take to reduce these spurious influences?
DMF: Potential sources of error include purity of materials, viability of biological samples, and lack of a deep understanding of the physical methods used to obtain the measurements. To reduce these influences, we test for possible metal ion contamination in our biological samples. (It is ironic that some lithium researchers originally collected blood samples from lithium-treated patients in tubes containing the anticoagulant lithium heparin, which contaminated these samples. Because other nonlithium salts are commonly available as anticoagulants, the use of lithium heparin was clearly a poor choice.) We also ensure that the purified proteins and the cells that we use are biologically active and viable. Finally, we test the proper use of the instrumentation by conducting measurements in solutions of known composition with respect to lithium concentration or the other variables we are trying to measure.
FBB: Researchers in the behavioral sciences are also concerned about sources of predictable bias in their measurements. These are systematic influences on measurements that have nothing to do with the underlying variables of interest. For example, when a researcher knows the experimental hypothesis, he or she might unintentionally influence measurements in ways that confirm the expected results. In the behavioral sciences, researchers often keep the experimenter unaware or “blind” to hypotheses or experimental conditions, to control for this type of experimenter bias. Is this ever a concern in your research? Are there potential sources of systematic bias in the measurements you make?
DMF: This is an issue where, in my opinion, the natural sciences diverge the most from the behavioral sciences in terms of approaches to measurement. When we intentionally influence measurements in a way to test a hypothesis experimentally in the natural sciences, we can be sure that the observed results are due to our intentional influence. For example, we increase the lithium concentration while holding everything else constant, and we see if this change has an effect on the magnesium concentration. In this way we can make certain that the effect is due to the variation in lithium concentration and not due to some other variable (e.g., a change in sodium concentration, pH, etc.). It seems harder to be sure that observed effects are due to intentional influences in the behavioral sciences.
FBB: What steps do you take to reduce these systematic influences on measurements?
DMF: Oftentimes, by measuring standardized solutions or conducting control experiments using independent methods, one can isolate the contribution of systematic errors. Even when the contribution of systematic errors cannot be quantified, we take great care in our publications to discuss the possible sources of error and the assumptions made in the calculations of parameters derived from direct measurements. In general, we test our measurements over and over again with freshly prepared biological samples to ensure that they are accurate and reliable. We also make extensive use of statistical methods in our data analysis to quantify sampling error and conduct inferential hypothesis tests.
FBB: Although behavioral scientists rely heavily on inferential statistical tests to quantify the probability that chance produced observed effects, natural scientists seem to rely more on visual inspection of graphs in drawing research conclusions. Thus, your use of statistics is somewhat unique for a natural scientist. Why is this?
DMF: Having experienced transitions from fundamental research in inorganic chemistry to more applied research in complex systems in biochemistry, cell biology, and psychiatry during the course of my career, I notice that most basic chemists do not make as much use of statistical methods as they should. It is important to bear in mind, however, that in the basic sciences the elegance of an experiment is intimately related to the design and simplicity of the system being analyzed. Because the number of variables in these simpler systems is small, it is not surprising that any phenomenon observed tends to be huge compared to the sometimes tiny experimental effects observed in the applied physical and biological sciences.
FBB: Another common concern in the behavioral sciences is the possibility that the very act of measurement may well change the things we are studying. Is this “reactivity” issue ever a concern in your area of research? If so, how do researchers address this problem?
DMF: The “reactivity” issue is definitely a source of concern in our research. For instance, if we increase lithium concentrations in an experiment too much, we may run into problems of cell toxicity and have certain chemical reactions become prevalent, which are situations that are not generalizable to the therapeutically relevant concentration range of lithium. This problem could be safely avoided by conducting measurements within the pharmacologically relevant concentration of lithium. However, it is not always possible to avoid this problem because the physical methods used for lithium detection are sometimes not sensitive enough to pick up variations in lithium concentration within the therapeutic range.
Assessing Measurement Validity
Validating measurement instruments is vital in both the behavioral and natural sciences. Unlike natural scientists, however, behavioral scientists usually have no “gold standard” to use in validating a new measure. The best that behavioral scientists can do is correlate scores on the new measure with scores on a preexisting alternative measure of the same variable, to assess what is called “convergent validity.” The following interview excerpt illustrates these points.
FBB: In the behavioral sciences, researchers devote a great deal of attention to assessing the validity of their measurement instruments. The key question in such cases is whether a particular instrument really measures what it’s supposed to measure (i.e., the issue of construct validity). Do researchers in your field ever question whether their measurement instruments actually assess what these instruments are intended to assess? Can you provide any examples?
DMF: We always question whether the instrumentation used is actually measuring what it is supposed to measure. One of the laboratories with which we compete once reported that the total lithium concentrations in human red blood cells measured by NMR spectroscopy were less than those measured by atomic absorption spectrophotometry. They went on to conclude erroneously that the mechanism of action of lithium must take place outside the cell and not inside it. We later demonstrated that the total concentrations of lithium in human red blood cells were the same regardless of whether they were measured by NMR spectroscopy or atomic absorption spectrophotometry. The source of error for their NMR measurements was that these researchers failed to consider the long NMR relaxation properties of the lithium-7 nucleus.
FBB: In your field of specialty, are there agreed upon “gold standard” measures for assessing key research variables (e.g., a thermometer for assessing temperature)?
DMF: The use of “gold standards” is widespread in the natural sciences. For example, we test the performance of the NMR spectrometer by confirming the known value of the signal-to-noise ratio for the NMR spectrum of a standard solution.
FBB: How do researchers in your field go about determining whether a measurement instrument actually measures what it is supposed to measure? What kinds of research strategies or procedures do they use to address this question of measurement validity?
DMF: We typically use another physical method to measure the same variable as well as conduct measurements of standard solutions.
FBB: Behavioral scientists often use multiple ways of measuring things, in order to determine how much their results are due to their particular method of measurement. Do you ever use more than one method of measuring a phenomenon of interest?
DMF: We almost always use complementary methods in our research. For example, if we want to test whether competition between lithium and magnesium ions occurs under a certain experimental condition, we can measure an increase (or a decrease) in free lithium concentration by NMR spectroscopy and a decrease (or an increase) in free magnesium concentration by fluorescence spectroscopy.
Training in Measurement and Instrument Development
Another major difference between the behavioral and natural sciences concerns the emphasis placed on measurement in advanced graduate training. Whereas measurement training is typically sporadic and informal in the behavioral sciences, such training is typically an integral part of the graduate curriculum in the natural sciences. This profound difference seems to reflect the complex nature of physical measurement in contrast to behavioral measurements. The following interview excerpt illustrates this point.
FBB: Many behavioral scientists have lamented the general lack of graduate training in measurement. For example, graduate students in psychology rarely receive formal training in how to use research instruments, how to select a suitable instrument for their research needs, or how to develop new instruments. Is this also the case in your area of specialty?
DMF: No. If anything, this is one of the primary focuses of graduate training in bioinorganic chemistry.
FBB: Do you offer this type of training in your own graduate program or in your lab?
DMF: Yes, both through graduate lecture courses in bioinorganic chemistry and in biological applications of NMR spectroscopy, and through advice during research lab meetings with my graduate students. In addition, in journal clubs, critiques of the methodology used in a given research paper more often than not dwell on the sensitivity of the methods used to address the questions raised in the publication.
FBB: To what factor do you attribute this difference in educational emphasis?
DMF: Many of the measurement tools in bioinorganic chemistry are highly complex and require in-depth training to be used properly. Thus, by necessity we have made this training an integral part of the graduate curriculum.
On the one hand, these interview excerpts highlight several important similarities in the nature of measurement in the natural and behavioral sciences. For example, both types of scientists often use multiple methods of assessment in making measurements; and when the size of the effect being investigated is small, both natural and behavioral scientists use inferential statistics to establish the presence of bivariate relationships. But when natural scientists study large effects, they typically rely on visual inspection of data to draw conclusions. Thus, the issue of sampling and measurement error is typically of less concern in the natural sciences.
On the other hand, the interview excerpts also reveal several critical differences in measurement across the two disciplines. For example, unlike natural scientists, behavioral scientists have neither “gold standards” with which to gauge measurement validity, nor “standard solutions” containing known levels of the variables they wish to measure, for use in calibrating and validating measurement tools. As a consequence, it is harder to establish measurement validity in the behavioral sciences than in the natural sciences. Moreover, the measurement tools of behavioral science are usually simpler, less expensive, and more readily available than those of natural science. As a consequence, there is less specialization in methods of measurement and less emphasis on formal measurement training in the behavioral sciences.
Clearly, much work is needed before behavioral measurement instruments will be as reliable, accurately calibrated, and well established as the instruments of natural science. For some time to come, the complexities and random irrelevancies of human behavior will continue to make measurement an especially thorny concern for behavioral researchers. Indeed, behavioral scientists may never develop the level of accuracy and precision in measurement that natural scientists have achieved.
Fred B. Bryant received his Ph.D. in social psychology from Northwestern University in 1980, and currently teaches courses in social psychology, statistics, and research methodology at Loyola University Chicago. Dr. Bryant has over 100 publications in the fields of personality, cognition and emotion, psychometrics, and structural equation modeling. He has also served as a methodological consultant for the United States General Accounting Office (GAO), as a statistical consultant for numerous medical centers and internationally prominent marketing firms, and as an expert social science witness in several major Federal Court cases. Email: firstname.lastname@example.org