
Measurement Issues in Biobehavioral Studies

Vol 10, No 1 – Winter 2008


Introduction to the Winter 2008 Issue of The Behavioral Measurement Letter

Nursing research covers a broad area, with nurse scientists involved in studies that often examine not only theoretical but also behavioral, clinical, and physiological aspects of patient care, sometimes all within the same study. As a result, instrumentation may not always get the focus that is needed to ensure quality measures to make the study meaningful or to get the study published.

In this issue, Molly Dougherty, in Observations of a Nursing Research Editor, offers some insight into common problems she has observed with submitted manuscripts dealing with behavioral instruments, such as scope, copyright, literature review, and sampling. Dougherty dispenses some useful tips to potential authors for overcoming these problems, including reminding authors to read for both personal and professional reasons in order to overcome some common writing problems, considering publishing in on-line journals, and querying editors on the appropriateness of an article before submitting it to that journal. Finally, she shares with us some of the trends in the publication of behavioral instruments, including increased submissions, internationalization of nursing publications, and ‘open access publication.’

Next, in Measurement Error in Clinical Investigations, Louise Jenkins reminds us of the complexity of clinical trials and how measurement of behavioral variables may not get the attention that it needs because of the focus on other key aspects, such as sample size, randomization, etc. Jenkins cites some specific sources of behavioral measurement error implicated in clinical trial failures in data sets from the Food and Drug Administration and also asks us to consider other factors which can increase measurement error in a study. Finally, she offers a practical approach, the development of a measurement protocol, to help minimize measurement error.

Lastly, Maureen Groer, in her article Measurement Issues in Biobehavioral Studies, compares and contrasts measurement issues facing both behavioral and biological scientists. Groer reviews some important areas that need to be addressed when planning and implementing biobehavioral research. Finally, while recognizing the complexities involved in biobehavioral studies, she reminds us of the value of an integrated, holistic approach.

Just a reminder: most college and university libraries in the US, as well as many colleges and universities in Canada, subscribe to both the HaPI database service and the newsletter. If you have questions about how you might take advantage of these services, don’t hesitate to contact the staff at Behavioral Measurement Database Services (BMDS). Also, I am happy to report that, since the last issue, the titles of the articles appearing in The Behavioral Measurement Letter (BML) are now listed in the Cumulative Index to Nursing and Allied Health Literature (CINAHL). So if you missed a particular issue or would like to see what articles have been printed since 1993, just check out the listings in CINAHL.

Please address comments and suggestions to The Editor, The Behavioral Measurement Letter, Behavioral Measurement Database Services, PO Box 110287, Pittsburgh, PA 15232-0787. We also accept short manuscripts for The BML. Submit, at any time, a brief article, opinion piece, or book review on a BML-relevant topic to The Editor at the above address. Each submission will be given careful consideration for possible publication.

HaPI reading…

Deidre M. Blank, RN, DSN, FAAN Guest Editor


Measurement Issues in Biobehavioral Studies

Maureen Wimberly Groer

Research that simultaneously explores both psychosocial and biological processes provides a lens into integrated, holistic mechanisms by which human beings organize responses to external and internal events. In fact, studies that explore only a single aspect of those human responses provide only partial and often very incomplete data. By focusing on a single dimension, the complexity of a response is certainly obscured. But the exploration of biobehavioral responses is not easy, because there are multiple methodological issues that the investigator must consider. This article will describe several of the major issues and suggest solutions that will assist those scientists who wish to use a more holistic approach in their human studies.


General Biobehavioral Measurement Issues

Behavioral and biological scientists share common as well as disparate paradigms, scientific concerns, and training. While the behavioral scientist tends to seek answers to questions about relationships between large variables such as personality, environment, and behavior, the biological scientist is often focused on a particular molecule or biochemical process. The scale of inquiry is therefore quite different and the degree of specialization can appear to be extreme in some cases.

Issues related to reliability and validity, while always important in science, are of differing dimensions in the two fields of inquiry. The behavioral scientist worries about the performance of a questionnaire within a population. The biologist worries about the performance of an assay or a piece of electronic equipment. A variable such as depression may be measured by a cut-off score on a well validated instrument that has a range of 1-200. The biologist may be measuring a particular molecule in a body fluid that has a range of 1-5 picograms/ml. Measurement error on the order of a standard deviation becomes extremely worrisome when one is measuring such small and narrowly distributed values.
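To make the point about scale concrete, the following is a minimal sketch in Python. The one-unit error is a hypothetical figure chosen for illustration, not a value from any study, and the function name is an assumption of this example. Because score points and picograms/ml are different units, the comparison only shows how much of each measurable range a fixed error consumes.

```python
def error_share_of_range(abs_error, low, high):
    """Return an absolute measurement error as a fraction of the measurable range."""
    if high <= low:
        raise ValueError("high must be greater than low")
    return abs_error / (high - low)


# Hypothetical error of one unit on each scale (illustrative only).
questionnaire_share = error_share_of_range(1.0, 1, 200)  # instrument scored 1-200
assay_share = error_share_of_range(1.0, 1, 5)            # assay spanning 1-5 pg/ml

print(f"questionnaire: {questionnaire_share:.1%} of range")
print(f"assay: {assay_share:.1%} of range")
```

Under these assumptions the same one-unit error takes up about half a percent of the questionnaire's range but a quarter of the assay's range.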

Biologists are often envied for doing “hard” science, on the faulty assumption that error and bias are less likely to creep into their studies than into studies that rely on psychosocial measurements. Many times the variables of interest to the behavioral scientist are collected by self-report paper and pencil testing and recall, which are notorious sources of error in studies. But biology is fraught with similar problems. Every biologist knows that one’s laboratory and personnel are major sources of error and bias, and therefore every lab establishes its own quality control standards and controls. The general working environment, the training and level of competency of laboratory personnel, the handling of specimens, the guidelines for procedures, the proper operation and maintenance of equipment, and the accuracy of data collection and recording are all possible sources of error in biological measurement. Modern automation and robotic procedures can help reduce errors, but the equipment must be properly used, and many research labs do not handle the large number of samples that make this equipment cost effective.

Another difference between the two disciplines is related to training. Behavioral scientists tend to have much stronger training in measurement theory, research design, and statistics than classically trained biologists. The apprenticeship model for advanced training is still widely in use in the biological sciences, and students are expected to learn by example and exposure in their mentor’s laboratory as well as in classroom settings. More sophisticated statistical approaches are seen in the psychosocial literature than in the biological literature. Data are not only analyzed differently but are often presented quite differently as well. Biologists often employ graphs and tables, while behavioral scientists are leaders in the use of modern statistical approaches such as structural equation modeling and hierarchical linear modeling.

While the differences described above are significant, they do not present insurmountable obstacles to interdisciplinary collaboration. These different ways of thinking and doing research can inform biobehavioral studies. In order for truly interdisciplinary work to be carried out in the future, some aspects of the training of young scientists in both disciplines should be considered. An approach then could be coursework in biobehavioral paradigms with both types of students engaging in dialogue, thinking, and planning for research. At a later time, when thinking about a biobehavioral type of study, the scientist so trained would have a firmer ground for planning and working with others to develop ideas.

The next section of this paper will review some areas for attention and concern in the planning and implementation of biobehavioral research, particularly focusing on biological issues known to influence biobehavioral data collection and analysis.


Metabolic Pathways

Oftentimes the goal of research is to examine relationships between a particular psychological state and biological variables. The psychological state is usually measured by a paper and pencil self-report instrument, or occasionally by qualitative approaches. The researcher may be interested in a short-term concurrent behavioral state or a more chronic condition. The biological variable (biomarker) is often the measure of a particular molecule in some body fluid at a single point in time, unless the study is prospective in design. Behaviors, emotions, moods, and perceptions of stress are all potentially associated with neurohumoral, endocrine, or immune states, which can be assessed through measurement of well-chosen biomarkers. The difficulty lies in being aware of the timing of metabolic pathways involved in the secretion, release, uptake, binding, and removal of the biomarker. An example would be the measurement of interleukin-1β (IL-1β), a proinflammatory cytokine that has been shown to be increased in depression and stress (Anisman & Merali, 2002). Direct serum measures of this cytokine provide a value which ultimately reflects the influences of multiple biological and biochemical pathways. First is the actual production and secretion of the cytokine, mainly by cells of monocytic lineage. IL-1 is released from these cells after cleavage of its aminoterminal region by caspase-1 (Jacques, Gosset, Berenbaum, & Gabay, 2006). The effects of serum IL-1 are modified in vitro and in vivo by natural inhibitors such as IL-1 receptor antagonist (IL-1ra) and soluble IL-1 receptors. IL-1ra inhibits the effect of IL-1 by preventing its attachment to IL-1 cell receptors. Thus, the measurement of IL-1 as a single biomarker of inflammation yields only partial information. A full picture requires measurement of soluble receptors and antagonists. In addition, IL-1 is usually in very low concentration in serum, so a high-sensitivity assay may be necessary to produce values in the 1-3 picogram/ml range. Many cytokines have similar pathways. There are certainly multiple physiological influences on levels of most biomarkers of interest to the biobehavioral researcher, many of which are not yet well described.

Another issue is the compartmentalization of some biomarkers. Large molecules (many hormones, drugs) may not be able to cross epithelial barriers and so may not be accurately measured in secretions such as saliva and tears. Other influences must also be considered. For example, more than 90% of serum cortisol is bound to corticosteroid-binding globulin (CBG), and changes in serum proteins may alter bioavailability. The hormone is present in an unbound state in saliva, and thus salivary cortisol is more reflective of bioavailable cortisol. On the other hand, the signature molecule of hypothalamic-pituitary-adrenal (HPA) activation is corticotropin-releasing hormone (CRH), which is not measurable in the serum (except during pregnancy, when the placenta secretes it) (Mastorakos & Ilias, 2003). So indirect measures of HPA activation must be used, such as levels of serum ACTH or salivary cortisol. ACTH and many other hormones may be difficult to measure in body fluids because natural proteases may degrade them after collection of the sample. While it is often best to add a protease inhibitor such as aprotinin to collected samples, this is rarely done, and the protease inhibitor has to be chosen carefully so as not to interfere with the concentration of the protein actually being measured.

Circadian rhythms also markedly influence the concentration of many of the biomarkers of interest to the biobehavioral researcher. Most notable are the hormones of the HPA axis, but many of the cytokines (IL-6, TNF-α) are also now being identified as influenced by the time of day and involved in sleep/wake cycles (Kapsimalis, Richardson, Opp, & Kryger, 2005). Within a stress paradigm, the levels of these hormones and cytokines cannot be compared among individuals measured at different time points across the course of a day. Time must be held constant or controlled in some way.


Demographic Influences

Influences of demographics and environment cannot be ignored in biobehavioral studies. Socioeconomic status, gender, age, and body mass index are all important mediators or moderators of psychosocial-biological relationships. For example, reactivation of latent herpes viruses is often used as a marker of cell-mediated immunity. Older females, compared to males, are more at risk for reactivation of varicella-zoster virus, which produces shingles (herpes zoster), and this may be related to a lower percentage of varicella-zoster memory T cells in females (Klein et al., 2006). Thus, in experiments evaluating the influence of stress on herpes zoster reactivation, gender must be considered. Stressors from multiple potential sources must be accounted for in biobehavioral research. Vulnerabilities to stressors may be exacerbated by environmental influences, as may occur in pediatric asthma. Life stress combined with exposure to allergens and indoor pollutants may together produce the biological effect (Friedman & Lawrence, 2002).

Another potential source of error is the presence of a concomitant disease process or multiple risk behaviors in individuals being studied with a biobehavioral approach. Many of the markers associated with stress states and dysphoric moods are also associated with certain illnesses. The relationship between cardiovascular risk and C-reactive protein (CRP) is well established, but CRP is also associated with other inflammatory diseases, generally rises with age, and may be related to multiple other factors associated with cardiovascular disease (Lowe & Pepys, 2006). Another example would be the finding of a relationship between stress and a particular cytokine. Many other intervening variables could potentially account for the cytokine elevation, such as polymorphisms in genes that regulate cytokine expression, or even something as simple as a lack of sleep the night before.


Measurement Problems

Ease of measurement is often a concern, so biological fluid which can be collected without invasive procedures may be an option. However, many times the most accessible fluids provide limited information. Careful controls over the actual collection of fluids such as saliva or breast milk are necessary to avoid spurious results. For example, smoking, eating, or oral hygiene may lead to microscopic bleeding in the mouth and if these are done prior to saliva collection, then measurements of certain biomarkers will not be accurate. Subject burden may result in spurious or inaccurate biological data. In this researcher’s experience, when collecting saliva samples in people with very dry mouths, the wish to produce something in the tube may lead to undesirable sputum collection rather than saliva. The collection method may actually interfere with the accuracy of certain biomarkers. The use of cotton-based saliva collection has been found to lead to errors. Salivary assay results for testosterone, DHEA, progesterone, and estradiol were artificially high, and for secretory Immunoglobulin A (sIgA) artificially low in studies, when samples were collected using cotton absorbent materials (Shirtcliff, Granger, Schwartz & Curran, 2001). Timing of the biological collection is also important. Many times salivary markers require accurately timed collection, as not only absolute concentration is of interest, but also secretion rate. Sometimes a protein marker of secretion is used so that the molecule of interest is expressed in units per mg of a protein that is secreted at a constant rate. With regard to measurement of human milk molecules, the time of milk expression in relationship to the production of milk may be important, as fore milk, produced at the beginning of a feed, is quite different from hind milk.
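The two ways of expressing a salivary marker mentioned above (secretion rate, and units per mg of a reference protein) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function names and the sample values are hypothetical, and a real protocol would follow the assay manufacturer's instructions.

```python
def secretion_rate(concentration_per_ml, volume_ml, minutes):
    """Amount of marker secreted per minute: concentration x (volume / collection time)."""
    if minutes <= 0:
        raise ValueError("collection time must be positive")
    return concentration_per_ml * volume_ml / minutes


def per_mg_protein(concentration_per_ml, protein_mg_per_ml):
    """Marker concentration expressed per mg of a reference protein in the same sample."""
    if protein_mg_per_ml <= 0:
        raise ValueError("protein concentration must be positive")
    return concentration_per_ml / protein_mg_per_ml


# Hypothetical saliva sample: 200 micrograms/ml of marker,
# 1.5 ml collected over a timed 5 minutes, 0.5 mg/ml reference protein.
rate = secretion_rate(200.0, 1.5, 5.0)     # micrograms per minute
normalized = per_mg_protein(200.0, 0.5)    # micrograms per mg protein

print(rate, normalized)
```

Note that the secretion rate depends on the collection time, which is why an untimed or inaccurately timed sample cannot be converted after the fact.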

When data are collected in the field, the samples must be transported to the laboratory at temperatures that assure viability of the samples but that also avoid cold damage. For example, packing and transporting blood collected in plasma collection tubes in ice buckets may cause cryodamage through spotty icing along the inside surface of the collection tube, which may result in platelet clumping and aggregation onto white blood cell surfaces.

Laboratory assays of biomarkers also may not be accurate for one reason or another. Appropriate controls and intra- and inter-assay coefficients of variation are important to assess in order to be confident of the results. Many biological variables are positively skewed and require mathematical transformation in order to be amenable to parametric statistical analysis. Another problem is the limit of the assays available to measure biomarkers. Some cytokines may be present in concentrations lower than the lowest limit of the assay’s standard curve, so that the results are all essentially zero if one uses these assays, and biologically significant differences are easily missed. The use of high-sensitivity assays may be necessary to discover these differences. The reverse problem occurs when the amount of the biomarker is very high and the assay measures at a much lower range. This requires serial dilutions of samples to reach the range of the assay, but with every serial step there is potential for error. Another consideration is the statistical manipulation of data, when outliers in particular biomarkers are discarded from the analysis. Individual differences, which may be of great interest, may be obscured by this approach (Cohen, 2004). This researcher once discovered extremely high levels of proinflammatory cytokines in a study participant, values that would have skewed the data considerably. However, this participant was then diagnosed with an autoimmune disease. So mindlessly eliminating data that are far beyond the means and ranges in the study may result in statistically more homogeneous data, but care should be taken to assure that the “outlier” is indeed a methodological aberration and not a real person with an illness needing treatment.
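As an illustration of the coefficients of variation mentioned above, here is a minimal Python sketch. It assumes one common convention: the intra-assay CV is taken as the average CV of replicate wells within a run, and the inter-assay CV as the CV of a control sample's mean across runs. Laboratories differ in how they compute these, and the function names and data below are hypothetical.

```python
from statistics import mean, stdev


def cv_percent(values):
    """Coefficient of variation: sample standard deviation as a percent of the mean."""
    return 100 * stdev(values) / mean(values)


def intra_assay_cv(replicates_by_sample):
    """Average CV of replicate measurements made within a single assay run."""
    return mean(cv_percent(reps) for reps in replicates_by_sample)


def inter_assay_cv(control_means_by_run):
    """CV of the same control sample's mean value across separate assay runs."""
    return cv_percent(control_means_by_run)


# Hypothetical duplicate wells (pg/ml) for two samples on one plate.
duplicates = [[2.0, 2.2], [1.0, 1.1]]
# Hypothetical mean of one control sample on three separate plates.
control_means = [10.0, 11.0, 9.0]

print(f"intra-assay CV: {intra_assay_cv(duplicates):.1f}%")
print(f"inter-assay CV: {inter_assay_cv(control_means):.1f}%")
```

Reporting both values lets a reader judge whether a group difference is larger than the run-to-run noise of the assay itself.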



While biomarkers clearly add important information to studies, a theoretical approach which provides carefully considered and justified rationale for the choice of the biological molecule is essential. Simply adding a “stress” marker, without the framework to guide its selection, is casting a net out for biological data without a plan for integrating that data into a theoretically sound analysis plan.

The future of biobehavioral research is exciting as technology will allow great ease of measurement and extensive amounts of data to be collected. Currently the use of gene arrays and multiplex protein assays provide this opportunity. This reinforces the need for training young behavioral scientists in physiology and molecular biology, as well as educating biology students in behavioral measurement, so that future studies will be framed in the most holistic, theoretically sound, and robust manner. This type of research is strongest when interdisciplinary collaboration is employed to develop and implement the study.



Anisman, H., & Merali, Z. (2002). Cytokines, stress, and depressive illness. Brain, Behavior, and Immunity, 16, 513-524.

Cohen, N. (2004). Biological relevance of data variability. Brain, Behavior, and Immunity, 18, 495-496.

Friedman, E. M., & Lawrence, D. A. (2002). Environmental stress mediates changes in neuroimmunological interactions. Toxicological Sciences, 67, 4-10.

Jacques, C., Gosset, M., Berenbaum, F., & Gabay, C. (2006). The role of IL-1 and IL-1Ra in joint inflammation and cartilage degradation. Vitamins and Hormones, 74, 371-403.

Kapsimalis, F., Richardson, G., Opp, M. R., & Kryger, M. (2005). Cytokines and normal sleep. Current Opinion in Pulmonary Medicine, 11, 481-484.

Klein, N. P., Holmes, T. H., Sharp, M. A., Heineman, T. C., Schleiss, M. R., Bernstein, D. I., Kemble, G., et al. (2006). Variability and gender differences in memory T cell immunity to varicella-zoster virus in healthy adults. Vaccine, 24, 5913-5918.

Lowe, G., & Pepys, M. (2006). C-reactive protein and cardiovascular disease: weighing the evidence. Current Atherosclerosis Reports, 8, 421-428.

Mastorakos, G., & Ilias, I. (2003). Maternal and fetal hypothalamic-pituitary-adrenal axes during pregnancy and postpartum. Annals of the New York Academy of Sciences, 997, 136-149.

Shirtcliff, E., Granger, D., Schwartz, E., & Curran, M. (2001). Use of salivary biomarkers in biobehavioral research: cotton-based sample collection methods can interfere with salivary immunoassay results. Psychoneuroendocrinology, 26, 165-173.

Maureen Wimberly Groër is a nurse physiologist whose work has focused on the psychoneuroimmunological mechanisms of maternal-infant interactions. She is the Gordon Keller Professor at the University of South Florida College of Nursing. E-mail: mgroer@health.usf.edu


Observations of a Nursing Research Editor

Molly C. Dougherty



As an editor I see many manuscripts on behavioral instruments across a broad range of nursing areas. Quality manuscripts have specific characteristics, and there are common problems among those that are weak. I will address common problems, tips for authors, and trends in publication of behavioral instruments.


Common Problems in Submitted Manuscripts


Successful researchers give careful thought to the scope of the problems they study. Quality behavioral instrument manuscripts address a contemporary problem that is broad enough to interest a wide audience, but not so broad that it cannot be addressed in feasible studies. A quality manuscript addresses a topic that manuscript reviewers see as timely and important based on their knowledge of the literature and the potential audience for the work. Finalizing a project and then realizing that it is too narrow or of waning interest usually can be avoided by adequate forethought.



When they think about it, most authors want to retain copyright ownership of instruments they develop. But, many authors include a copy of the instrument in the manuscript; I discourage this. It is best to provide a few items or partial items in tables or text, but not the entire instrument. The author usually does not want to turn over ownership of the instrument to the publisher, but does so if the instrument is included and the author signs a standard copyright agreement. Editors are usually sensitive to this, but it is helpful if authors think about it and plan ahead.



Authors may write a proposal with current literature review years before the manuscript reporting the results is prepared. Reviewers are quick to notice if the presentation of the background literature is dated and it seems to negatively color their assessment of the manuscript. Manuscripts that include updated, integrated background literature receive more favorable reviews.



Reviewers of behavioral instrument manuscripts commonly express concern that the reported sample size is not adequate to support the analysis that was performed or that is needed. A second concern about sampling is whether the sample described in the manuscript adequately represents the population for whom the instrument is intended. Obtaining an adequate sample size and using an appropriate sampling frame are crucial to publishing behavioral instrument research.


Division of Behavioral Instrument Studies into Manuscripts

The appropriate packaging of studies into manuscripts seems to be one of the most difficult tasks authors face in behavioral instrument research. The development of a behavioral instrument involves multiple studies. Often one study, e.g., item generation, is not weighty enough to merit publication as a stand-alone article. I am not aware of any specific rules of thumb on this, but two principles pertain: (a) the content of a manuscript should make a clear contribution to the literature in the field and (b) submission of multiple manuscripts that represent the smallest publishable unit should be avoided. One approach to building the evidence for a behavioral instrument is to present early studies at professional meetings which feature published abstracts. Citing published abstracts from early studies allows the author to show the step-wise development of the behavioral instrument and avoids manuscripts that attempt too much or offer too little. Standards among journals differ, but every journal editor wants to publish articles that make a contribution and that do not duplicate other published work.

These are some of the common problems found in manuscripts. The section below offers tips that may help authors overcome some of them.


Tips for Authors

Read, Read, Read

Nearly all successful authors report that they love to read. Reading allows the author to see how others use language and overcome some of the common problems in writing. I suggest that authors read fiction (Dougherty, 2005) or any work they enjoy. Good writing flows and pulls the reader into the work. Occasionally, I receive a manuscript of this quality and I examine it closely. I think that if I can understand how the author accomplishes this flow, I will be able to help others achieve it. At minimum, I hope that by reading I will become a more accomplished author myself.


Examine Quality Work

It is useful for an author to read articles in top journals in her/his area. Often other authors have faced and surmounted problems the author has in presentation of material. The division of material into narrative, tables, and figures is often instructive. The organization of a manuscript seems obvious only when it is complete. Reviewing how articles are divided into sections and the kind of material that is included in each section often helps an author make good decisions about organizing his/her material. I am not suggesting plagiarism, but emulation, a form of role modeling at a distance.


Consider an On-Line Journal

Nursing has been a little slower to take up online publishing than some other fields. Nonetheless there are a number of quality on-line journals which are not constrained by print-page limitations and carry out quality peer review. Given the trends in academic publishing, online journals are destined to grow and become a more important part of academic life. I recommend that authors become more familiar with on-line journals in their area and think about them as a viable choice for some of their behavioral instrument products.


Query the Editor

Relatively few journals require that an author query the editor before submission of a manuscript. At Nursing Research I prefer to hear from authors before they invest the time in preparing a manuscript that will be submitted for our exclusive consideration. A query containing a preliminary abstract is easy to prepare and send by e-mail, and it can be sent to multiple editors simultaneously. With the use of queries I am able to redirect manuscripts that are not within our editorial purpose, provide information that will allow the author to prepare a manuscript better suited to Nursing Research, or encourage submission to Nursing Research. Often when I recommend another journal I save the author weeks of time in manuscript review. Communicating with editors allows authors to make sound decisions about submission of manuscripts to specific journals.

Persevere

I doubt that there is any successful author who has not experienced rejection of his/her written work by editors. One of the characteristics of successful authors is their ability to take the critical comments of reviewers and to improve their work. An example of the communications and reviews related to a recently published behavioral instrument article is on the Nursing Research Editor’s Website (http://www.nursing-research-editor.com/authors/open.php) under Open Manuscript Review, Manuscript #9. Numerous communications related to this manuscript reflect the issues reviewers raise and the competence of the author in addressing them. Novice authors are tempted to give up during the review process, but generally, when an editor requests a revision of a manuscript there is interest in seeing that manuscript to publication.

There are many books written about writing intended to help authors improve their work. These few tips are ones that seem relevant to academic authors in the area of behavioral instruments. Exploring the literature on writing for publication will uncover many other useful tips.


Trends in Publication of Behavioral Instruments

Increased Behavioral Instrument Research

As nursing science expands and priorities for research funding evolve, it is likely that behavioral instrument development will follow. Based on articles published in Nursing Research, there is increasing activity in behavioral instrument research. In 2006, 49 articles were published in regular issues and 11 of them were on behavioral instruments. In comparison, in 2000 there were 45 articles published of which 5 were on behavioral instruments. In the last two months of 2006, Nursing Research received 22 queries and 4 of these were on behavioral instrument development. It appears that publication interest in this area will continue to be brisk.
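The counts above are easier to compare as proportions. The short Python sketch below simply restates the figures given in the paragraph as percentages; the variable and function names are this example's own.

```python
def percent(part, total):
    """Share of `part` in `total`, as a percentage."""
    return 100 * part / total


# (behavioral instrument items, all items), taken from the paragraph above.
counts = {
    "articles published in 2000": (5, 45),
    "articles published in 2006": (11, 49),
    "queries, last two months of 2006": (4, 22),
}

for label, (instrument, total) in counts.items():
    print(f"{label}: {percent(instrument, total):.1f}% on behavioral instruments")
```

By this calculation the share of behavioral instrument articles roughly doubled between 2000 and 2006, from about 11% to about 22%, with recent queries at about 18%.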



Although there is relatively little research on it (Dougherty, Lin, McKenna, & Seers, 2004), the internationalization of nursing publications is an important trend. To illustrate, of the 11 behavioral instrument articles published in Nursing Research in 2006, the address of the first author of 4 of them was not in the United States. Internationalization of nursing provides fertile ground for behavioral instrument development because it requires that concepts be redefined consistent with cultural context and translation requires careful attention to linguistic and cultural meaning. It is likely that internationalization will be a driver of behavioral instrument development in nursing.


Open Access Publishing

It is likely that open access publishing will be a stimulus for behavioral instrument research. The world-wide availability of open access publications, the expansion of readership, and global opportunities for authorship permit the growth of scholarship in behavioral instrument development into areas that have been poorly represented.

Three topics (problems, tips, and trends) have been addressed here. There are outstanding opportunities in publishing for authors in behavioral instrument research. With attention to the quality of writing and the target journal, authors are in position to take advantage of the trends in publishing in this area.



Dougherty, M. C. (2005). Read fiction to know nursing. Nursing Research, 54, 73.

Dougherty, M. C., Lin, S-Y., McKenna, H. P., & Seers, K. (2004). International content of high ranking nursing journals in the year 2000. Journal of Nursing Scholarship, 36(2), 174-180.

Nursing Research Editor’s Website. Open Manuscript Review. Retrieved January 9, 2007, from http://www.nursing-research-editor.com/authors/open.php

Molly Dougherty, PhD, RN, FAAN, has been Editor of Nursing Research since 1997. Nursing Research is the premier nursing research journal and consistently ranks near the top of the ISI ranking for nursing. The Editor’s website (http://www.nursingresearch-editor.com/) provides expanded content on selected Nursing Research articles and makes the editorial review process transparent with open manuscript review. Educated at the University of Florida (Gainesville) in nursing and anthropology, Dr. Dougherty is Professor of Nursing at the University of North Carolina at Chapel Hill. E-mail: m-dougherty@unc.edu

“Do not go where the path may lead, go instead where there is no path and leave a trail.”

Ralph Waldo Emerson


Measurement Error in Clinical Investigations

Louise S. Jenkins

Clinical investigations continue to expand in size, complexity, and scope; growing numbers are multi-site or encompass clinical trials requiring significant resources. Typically, much attention is focused on aspects of design such as sample size, randomization, control of the intervention(s), and analytic methods. Measurement of key variables may receive less attention. Hopefully, at minimum, instruments are selected having strong evidence for reliability and validity as well as for appropriateness for the study population. Is that sufficient?

Kobak, Kane, Thase, and Nierenberg (2007) cite poor inter-rater reliability, interview quality, and rater bias as major problems in clinical trial failure in more than one third of 45 data sets reviewed from the Food and Drug Administration. Clearly there are other sources of measurement error that can affect the conduct and results of a study. For example, does it matter whether subjects know to which treatment they will be assigned before baseline data are collected? Brooks, Jenkins, Schron, Steinberg, Cross, and Paeth (1998) found that, despite the lack of significant differences in clinical and demographic variables, “baseline” quality of life scores collected from subjects before they knew which treatment option they would be randomized to were significantly better than scores from subjects who were already aware of their treatment option. In this study, the fact that the treatment options were dramatically different (receiving antiarrhythmic therapy or implantation of an internal cardioverter defibrillator) helped demonstrate the point that “baseline” must be carefully specified and be the same for all subjects.

Consider a few related questions:

  • Are scores different for subjects who complete study measures by themselves or have a staff member read the questions to them and record the responses?
  • Are data collection procedures carried out in the same manner by all research personnel?
  • Do scores differ by the environment in which instruments are completed? Consider how a busy clinic environment might vary from the ambience of one’s own home environment.
  • Do scores differ when subjects hand the study measures to a care provider for review versus placing them in a sealed envelope?
  • When study measures are mailed to subjects, how do you know who really completed them?
  • Does the order in which study measures are presented to subjects for completion impact scores?
  • How does the frequency with which study measures are completed impact scores?
  • How is the fidelity of scoring procedures assured?


These questions reflect just some of the things that can increase the amount of measurement error in a study. While there is no way to eliminate measurement error entirely, one approach to minimizing it is to include a strong measurement protocol in the design of study methods. Beyond describing the measures to be used, a measurement protocol should be very thorough; the level of detail should be sufficient to allow another investigator to replicate each aspect precisely. The need for detailing adequate initial training of study personnel, as well as monitoring of their actions and performance throughout the duration of the study, increases exponentially as their numbers increase; this is also true as the number of study sites increases. Timing of administration of measures with a carefully determined ± tolerance (e.g., within 3 days) relevant to the attributes being measured is crucial. Exact methods for administering and scoring measures need to be set forth and monitored throughout the study. Attention must also be given to scoring rules and to the approaches used in handling missing data. While the measurement protocol does require significant attention to develop and use, it offers a helpful approach to minimizing measurement error and contributing to the rigor of clinical investigations.
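
As an illustration only, two of the protocol elements described above (the timing tolerance and a rule for handling missing data in scoring) might be written down unambiguously along the following lines. This is a minimal sketch, not a procedure from any of the studies cited: the function names, the 3-day tolerance, and the rule that a scale is scored only when no more than 20% of its items are missing are all assumptions chosen for the example.

```python
from datetime import date

# Hypothetical protocol values; a real protocol would specify and justify its own.
TOLERANCE_DAYS = 3           # allowed deviation from the target date, in either direction
MAX_MISSING_FRACTION = 0.2   # assumed rule: score only if at most 20% of items are missing


def within_window(target: date, actual: date, tolerance_days: int = TOLERANCE_DAYS) -> bool:
    """Return True if a measure was administered within +/- tolerance_days of its target date."""
    return abs((actual - target).days) <= tolerance_days


def score_scale(items, max_missing_fraction: float = MAX_MISSING_FRACTION):
    """Return the mean of the answered items, or None if too many items are missing.

    `items` is a list of numeric responses in which None marks a missing response.
    """
    if not items:
        return None
    answered = [x for x in items if x is not None]
    missing = len(items) - len(answered)
    if missing / len(items) > max_missing_fraction:
        return None  # too many missing items to score under the assumed rule
    return sum(answered) / len(answered)


if __name__ == "__main__":
    target = date(2008, 1, 10)
    print(within_window(target, date(2008, 1, 13)))  # 3 days late: inside the window
    print(within_window(target, date(2008, 1, 14)))  # 4 days late: outside the window
    print(score_scale([4, 5, None, 3, 4]))           # one of five items missing: scored
    print(score_scale([4, None, None, 3, 4]))        # two of five items missing: not scored
```

Writing the rules at this level of detail means that every site and every staff member applies the same window and the same scoring decision, which is the kind of consistency the protocol is meant to secure.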



Brooks, M.M., Jenkins, L.S., Schron, E.B., Steinberg, J.S., Cross, J.A., & Paeth, D.S. (for the Antiarrhythmics versus Implantable Defibrillators (AVID) Investigators) (1998). Quality of life at baseline: Is assessment after randomization valid? Medical Care, 36(10), 1515-1519.

Kobak, K.A., Kane, J.M., Thase, M.E., & Nierenberg, A.A. (2007). Why do clinical trials fail? The problem of measurement error in clinical trials: Time to test new paradigms? Journal of Clinical Psychopharmacology, 27(1), 1-5.

Louise S. Jenkins, PhD, RN, is Co-Director of the Institute for Education in Nursing and Health Professions at the University of Maryland School of Nursing where she also teaches courses in measurement in the PhD program. Dr. Jenkins has developed and tested a number of patient outcome measures and served as quality of life consultant on several major clinical trials including Antiarrhythmics versus Implantable Defibrillators (AVID) and Atrial Fibrillation Follow-up: Investigation of Rhythm Management (AFFIRM). Both of these studies were sponsored by the National Heart, Lung, and Blood Institute, NIH.


“Think like a wise man but communicate in the language of the people.”

– William Butler Yeats


