
select patients (diabetes mellitus, chronic renal failure, post-MI) and in combination therapy. The beta-blockers were not compared with the diuretic for study design reasons. Some may argue that these agents may be more cost-effective than diuretics, which often require laboratory testing and potassium supplementation. However, for the majority of patients with mild to moderate hypertension, a diuretic should be the first, or perhaps second, agent chosen to control blood pressure and reduce adverse clinical outcomes. Managed care pharmacists should be familiar with the ALLHAT study findings and help educate clinicians and members of P&T committees regarding these findings.


■■ Assessing the Value of the Quality of Health Economic Studies (QHES)
Ofman and colleagues address an important challenge: increasing the use of economic evaluations among decision makers.1 Specifically, the study examines whether a scoring algorithm for checklists, referred to as the Quality of Health Economic Studies (QHES), facilitates the identification of high-quality economic evaluations. The use of checklists and summary scores is not new to health care, and the limitations of quality checklists have been previously identified.2 Checklists typically cannot separate the quality of reporting from the validity of the design and conduct of a trial. Many checklists contain items that are related not directly to validity but to the precision of results (e.g., power calculations) or to generalizability (e.g., inclusion and exclusion criteria).3 When checklist items are weighted and aggregated into a summary score, such limitations can be compounded. Despite the appeal of a summary score as a measure of quality, research has found that summary scores provide unreliable assessments of validity.4,5
It is against this backdrop that the QHES should be considered. While the objective of the QHES is to discriminate the quality of studies, its theoretical basis is unclear. Many checklist items are more closely related to reporting quality or interpretation of results than to internal validity (i.e., the strength of the cause-effect relationship). For example, the checklist places significant weight on issues such as transparency, whether the study objective was clearly stated, and, to a lesser extent, the funding source. Important issues related to internal validity, such as the nature of randomization and blinding, were not included in the checklist. For example, a cost-effectiveness study with adequately concealed randomization and double-blinding could receive the same score as a study with inadequately concealed randomization and no blinding. This is problematic since both poor allocation concealment and lack of blinding have been associated with bias.6 Similarly, the checklist lacks questions to address the internal validity of observational economic evaluations.
The authors did attempt to validate the QHES among health economists and decision makers and through their own application of the instrument. For the health economists who were surveyed, the summary score generally correlated with a more qualitative evaluation. However, no such assessment was conducted for decision makers, who were identified as the key audience for the QHES.
Furthermore, while there was evidence of convergent validity among health economists, there was a mixed reaction to the usefulness of the instrument. The authors indicated that they found greater acceptance of the QHES among decision makers than among health economists. However, the forum in which the data were collected, a group discussion at an annual meeting of a professional society, set the stage for possible selection bias (as evidenced by the large representation from the pharmaceutical industry) and social desirability bias and, hence, for overestimation of the utility of the instrument.
The authors cite further evidence of the utility of the instrument based on its use in a review of 30 economic evaluations of GERD treatments. Yet no evidence of validity or utility was presented, since QHES results were not compared with a separate assessment, and no mention was made of the time or difficulty involved in completing the QHES in this application relative to other approaches. That said, higher QHES scores were associated with GERD studies that were published after 1996, had researchers with academic affiliations, and had been conducted in the United States. Perhaps these characteristics are proxies for higher-quality studies, but the authors never addressed this issue.
The authors propose that the QHES may be simpler for decision makers to use and that it may have equal or even greater ability to discriminate study quality than other checklists. Answering these questions adequately will require rigorous research among decision makers in real-world situations. However, it should be made clear that the need for validation is not unique to the QHES. Many checklists have been designed to facilitate the review of economic evaluations by decision makers, yet the ability of any of these checklists to measure what they purport to measure remains unclear.
Research is needed to examine which criteria for assessing the validity of cost-effectiveness studies are important determinants of study results, and in what situations. For example, what is the relationship between quality scores (the QHES score, as an example) and treatment effect (i.e., the cost-effectiveness measure)? Do lower-scoring studies tend to produce more variable estimates of cost-effectiveness? Do certain components of the checklist (e.g., sufficient time horizon) relate to the size of the treatment effect? Do quality scores vary across study type (i.e., randomized controlled trial, model, and observational study)? This type of methodological work is virtually nonexistent in the pharmacoeconomic discipline, but given the plethora of quality checklists and the substantial resources devoted to the conduct of pharmacoeconomic studies, such a strategic approach seems viable.
Meanwhile, readers of economic evaluations should be careful not to assume a false sense of precision from summary quality scores, since such scores generally have not been supported by empirical evidence, may actually be misleading, and are potentially more time-consuming to apply.

Brenda Motheral, RPh, MBA, PhD
Vice President of Outcomes Research, Express Scripts, Inc.

■■ Summary Quality Scores for Pharmacoeconomic Studies: Balancing Validity With Need
Once a product has received marketing approval from the U.S. Food and Drug Administration, decisions regarding insured access to the agent are immediately raised. The existence and amount of the insured benefit for specific agents require weighing the evidence for clinical gains and their associated costs against similar measures for competing products and therapies. Pharmacoeconomics provides a systematic, explicit, and objective basis for making and defending such drug benefit decisions. However, lack of standardization in the field1 and the differences in perspectives, knowledge, and interests across and within the producers and consumers of pharmacoeconomics have limited its impact on drug coverage decisions. As the methodology advances, consumers of pharmacoeconomic studies require an efficient tool to identify superior studies. In this issue of the Journal, Ofman et al. propose such a scoring instrument, the Quality of Health Economic Studies (QHES).2
Beginning in 1973, clinical epidemiology has consistently identified large variations in the rates of performance of medical procedures and use of specific products.3 As health care costs have increased, drug cost and effectiveness analyses have become common; however, the explosion of pharmacoeconomic studies has also included some of uncertain quality, rigor, or validity. Pharmacoeconomic studies have nevertheless been subject to increasing standardization. Some are still viewed with skepticism by health plans and insurers, who perceive the potential latitude in permissible assumptions as resulting in less than objective evidence. However, purchasers face constant pressure to determine the relative value of marketed pharmaceuticals and to make decisions with imperfect and disparate information. Analyses to assist these determinations come from multiple sources, with attendant variations in the quality, reliability, validity, and timeliness of their content. Consequently, the assessment of the quality and validity of specific pharmacoeconomic results is at the center of the decision process, and uncertainty here will continue to influence the impact of pharmacoeconomic studies.
The proposed QHES instrument will be a substantial contribution if it helps end-users of pharmacoeconomic studies discriminate among the exploding body of literature4 and efficiently identify the studies with superior merit. For producers of pharmacoeconomic studies, an accepted rating instrument could establish a clearer target, potentially encouraging higher quality and greater rigor. To achieve this level of acceptance and use, however, the QHES must demonstrate key validity characteristics.
A precondition for a valid rating instrument is that it be reliable: it must yield the same results on repeated trials. On this dimension, the qualitative nature of some of the QHES questions could mean lower reliability if raters are not trained and their assessments are not standardized; different observers may, for example, weigh the validity and reliability of health outcomes measures or scales differently. Without reliability, no instrument or measure can be valid.
Beyond being reliable, the QHES must rate studies on how well they actually answer the question posed by the research. Criterion validity, the aspect closest to what is commonly meant by validity, assesses the extent to which the measure being developed correlates with another, "gold standard" measure at the same time.5 Questions of external validity, or generalizability, are at the forefront of issues confronting decision makers as users of such information. Whether the original study takes a societal, patient, provider, or health plan perspective will determine the relevance of its results to a specific setting or decision maker. One of the biggest challenges in evaluating pharmacoeconomic studies may be the