2008: A Tipping Point for Disease Management?

Given recent events, 2008 has certainly been a year of discontent for the commercial disease management industry1 and ultimately may turn out to be its tipping point. The year kicked off with continuing questions about the effectiveness of disease management, precipitated by Mattke et al.’s highly publicized study from late 2007, “Evidence for the effect of disease management: Is $1 billion a year a good investment?” Based on a review of the literature, the authors concluded that the savings from population-based programs are uncertain, warning plan sponsors to be “skeptical about vendor claims.”2 In early January, the Centers for Medicare & Medicaid Services (CMS) announced that it had lowered the savings target for the Medicare Health Support (MHS) Program from 5% net savings to budget neutrality at the request of the MHS vendor organizations.3 This news was quickly followed by a CMS announcement that it would terminate the MHS demonstrations for failure to meet the statutory requirements, including cost savings that matched vendor fees.4 At about the same time, Inverness Medical Innovations publicized its intention to purchase Matria, one of the largest disease management companies in the United States.5 With Inverness’ previous purchases of Alere and Paradigm and the late 2007 acquisition of Health Dialogue by British United Provident Association, only the largest of the 3 disease management companies remained independent. On top of the CMS announcement, the industry’s largest vendor was further challenged by unexpected reductions in demand for its services.6 Concerns about disease management resurfaced in June when a peer-reviewed study found that disease management had no significant impact on adherence to pharmacologic treatment after myocardial infarction.7 In an editorial accompanying the study, Mattke asked whether a disease management backlash exists.8 Increasingly, the answer appears to be yes.
Once the darling of employers and benefits consultants, disease management is now the object of considerable dissatisfaction and growing doubts.9,10 Industry observation suggests multiple reasons for this disaffection. Five of the most important include a desire for (a) better alignment of vendor and client interests, (b) greater transparency in business arrangements, (c) improved plausibility in reports of financial and clinical outcomes, (d) more rigorous evaluation methodology, and (e) more convincing evidence of outcomes improvement.

Misalignment of Vendor and Client Interests
The most common form of disease management enrollment is the opt-out model, which means that members are enrolled in the program unless they explicitly opt out.11 The industry shift to an opt-out model was consistent with the belief that continual efforts were needed to engage members and, conveniently, allowed use of return on investment (ROI) methodologies without case matching, which adds complexity. The opt-out model has been positive for revenue and gross margin. Members who cannot be reached are still enrolled, with an accompanying revenue stream from the plan sponsor, yet accrue little cost to the disease management vendor. Furthermore, members infrequently “graduate” from the program, thus providing a perpetual revenue stream for the disease management vendor until the member leaves the health plan or employer. Although good for vendor revenue, the opt-out model has resulted in a lack of alignment between vendor and plan sponsor. The disease management vendor has limited incentive to aggressively reach out to members for 2 reasons: Outreach efforts incur incremental costs without raising incremental revenue; and there is a risk that, once reached, the member will opt out of the program. In its extreme, the opt-out model encourages disease management vendors to cut costs in times of pricing pressure by reducing call center staff.
Although one might expect that the competitive need to demonstrate ROI would provide an incentive to enroll members into services, it is not clear that the link between staffing levels and outcomes (which appear to be more methodology driven than intervention driven) is strong enough, from the vendor’s perspective, to overcome the disincentive to incur higher operational costs. Accordingly, a disease management vendor can titrate its operational expenses to meet its profit goal in times of pricing pressure. Equally important, the opt-out model has left plan sponsors paying a monthly fee for an entire population of diseased members when only 25% may actually participate in the program at least once, and far fewer engage on an ongoing basis.12
The largest plan sponsors are increasingly addressing the lack of alignment through 2 approaches. The first approach is to have contractual requirements for minimum staffing levels or administrative performance metrics, but these arrangements require precise contractual language. For example, if staffing levels are calculated at the end of the month (per the contract), a vendor can wait until the last week of the month to hire and begin training new staff for open positions, thus effectively operating at lower-than-required staffing levels during most of the month.
The second approach has been to move to an engagement model, meaning that the vendor is paid the full fee for only those members who are reached, typically by telephone. A stricter version of the engagement model is the opt-in model, which requires members to agree to participate in the program before they are enrolled and before the plan sponsor pays any fee. Neither option provides a complete solution to the problem of misaligned incentives, but both options represent a better-aligned relationship than the opt-out model in the face of questionable improvement in clinical and financial outcomes.
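The difference among the three payment models can be made concrete with a rough sketch. The member counts and the $10 monthly fee below are hypothetical values chosen for illustration; only the 25% participation rate echoes the figure cited above. The sketch also simplifies the engagement model by assuming that no fee at all is paid for unreached members.

```python
# Hypothetical figures for illustration only.
FEE_PER_MEMBER = 10.00   # assumed monthly fee per billable member, in dollars
IDENTIFIED = 4_000       # members identified with the condition
REACHED = 1_000          # 25% of identified members reached at least once
OPTED_IN = 600           # assumed subset who agreed to participate


def monthly_fee(model: str) -> float:
    """Return the monthly fee owed by the plan sponsor under a payment model."""
    billable = {
        "opt-out": IDENTIFIED,   # everyone identified is billed unless they decline
        "engagement": REACHED,   # fee only for members actually reached
        "opt-in": OPTED_IN,      # fee only after a member agrees to participate
    }[model]
    return billable * FEE_PER_MEMBER


fees = {model: monthly_fee(model) for model in ("opt-out", "engagement", "opt-in")}
for model, fee in fees.items():
    print(f"{model:<11} ${fee:,.0f}")
```

Under these assumed numbers, the sponsor's monthly outlay falls as payment is tied more closely to actual contact, which is the alignment argument made above.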

Insufficient Transparency
Closely related to concern over misalignment of vendor and client interests in disease management is a growing frustration with insufficient transparency, particularly in the areas of engagement, intervention, and reporting. Perhaps the biggest transparency concern revolves around levels of engagement, which means knowing how many members are contacted, typically by telephone. Plan sponsors desire transparency because engagement levels can serve as an early indicator of ultimate success, a plausibility indicator when reviewing outcomes results (i.e., patients with higher engagement levels should have better outcomes), and as one comparative process measure of vendor effectiveness (i.e., better vendors are expected to contact more patients).
Despite its obvious importance, engagement is commonly a "black box" to plan sponsors because of limited vendor reporting. Even when engagement levels are reported, there are options for defining engagement and vendors use different methodologies, adding to the transparency challenge. For example, one vendor might exclude members with invalid telephone numbers from the denominator when calculating engagement levels, whereas another may not. Given that members with invalid telephone numbers can represent 10%-20% of the membership, the exclusion of these individuals from the calculation can make a material difference in the overall reported engagement level.
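A minimal sketch of the denominator effect follows. The population size and counts are hypothetical; the 15% share of invalid telephone numbers is simply a value inside the 10%-20% range mentioned above.

```python
def engagement_rate(engaged, eligible, invalid_phone=0, exclude_invalid=False):
    """Share of members reached under one of two denominator conventions."""
    denominator = eligible - invalid_phone if exclude_invalid else eligible
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return engaged / denominator


# Hypothetical population: 10,000 identified members, 15% with invalid
# telephone numbers, 2,500 members reached.
eligible = 10_000
invalid_phone = 1_500
engaged = 2_500

all_members = engagement_rate(engaged, eligible)
reachable_only = engagement_rate(engaged, eligible, invalid_phone, exclude_invalid=True)

print(f"All identified members in denominator: {all_members:.1%}")     # 25.0%
print(f"Invalid numbers excluded:              {reachable_only:.1%}")  # 29.4%
```

The same outreach performance is reported as roughly 25% or 29% depending only on the convention, so comparisons across vendors require knowing which denominator each one uses.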
For larger and more sophisticated plan sponsors, the desire for transparency around engagement extends beyond aggregate reporting to identifying exactly which members are engaged by which channel (telephone, mail, etc.). Such information allows plan sponsors to conduct independent assessments of program effectiveness and understand interactive effects between the disease management and other vendors' programs on both participation and outcomes.
Greater transparency is desired around not only the level of engagement, but also the intervention itself. Plan sponsors have no reliable, routine, and cost-effective way to assess the quality of the intervention. Neither call recording (i.e., recording of the actual phone conversation) nor intervention notes (i.e., detailed notes of the call that are taken by the nurse/coach) are standards in the industry. 13 In addition to enhanced reporting on engagement, greater transparency is desired on evaluation methodology and clinical results through timelier, more detailed, and more readily accessible reporting. For example, MacStravic suggests that vendors should better develop their "paper trail" of metrics that should be linked to outcomes improvement. 14 Examples could include motivation, self-management knowledge, behavior changes, and self-efficacy. By seeing real changes in these leading indicators, plan sponsors could have more confidence that the outcome improvements are actually a result of the disease management intervention. Finally, it is important to keep in mind that the transparency challenges go beyond the 3 issues mentioned here and extend to other aspects of operations, such as transparency in referral fees paid to consultants, quality of data on enrollment, etc.
As plan sponsors and consultants push for greater transparency, some vendors have yet to embrace the transparency movement given the potential business implications and the capital required to make the necessary system changes. Will disease management vendors hold up under the scrutiny that comes with greater transparency? The push for greater transparency in pharma funding in the pharmacy benefit management (PBM) industry resulted in one PBM discontinuing any ancillary funding from pharmaceutical manufacturers, 15 led to the development of client pledges that outlined specific behaviors around alignment and transparency, 16 and opened the door for new, smaller PBMs that claimed to offer aligned and fully transparent business models. 17,18 If other industries are any indication, expect the demand for transparency to drive widespread changes in business practices and perhaps even the competitive landscape.

Improved Plausibility Is Needed
Surprisingly, concerns about plausibility have been a more recent phenomenon as plan sponsors are increasingly questioning the size and timing of the ROI and researchers continue to challenge the scientific underpinnings of the disease management model. Although many vendors produce ROIs that range from 1.5 to 3.0, larger ROIs are sometimes still claimed for the core chronic conditions, such as diabetes and heart failure. In the case of noncore conditions (e.g., arthritis), vendors have been known to claim ROIs that are 10- or even 20-fold. Obviously, the 10- and 20-fold ROIs raise red flags and strongly suggest that the statistical phenomenon of regression to the mean (the tendency for utilization of high-cost members to decline over time, with or without intervention), rather than the intervention, is the more plausible explanation for the findings.
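Regression to the mean can be illustrated with a small simulation. This is a sketch under stated assumptions rather than a model of any real population: the lognormal cost distributions, their parameters, and the top-10% selection rule are all hypothetical. No intervention is simulated, so any decline in the selected group's costs is purely statistical.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

N = 20_000
members = []
for _ in range(N):
    # Assumed stable underlying annual cost for each member.
    baseline = random.lognormvariate(8.0, 0.5)
    # Independent year-to-year noise; nothing differs between the two years.
    year1 = baseline * random.lognormvariate(0.0, 0.8)
    year2 = baseline * random.lognormvariate(0.0, 0.8)
    members.append((year1, year2))

# Select the top 10% by year 1 cost, as a program that targets
# high-cost members might.
members.sort(key=lambda m: m[0], reverse=True)
selected = members[: N // 10]

mean_y1 = sum(m[0] for m in selected) / len(selected)
mean_y2 = sum(m[1] for m in selected) / len(selected)
apparent_savings = 1 - mean_y2 / mean_y1

print(f"Mean year 1 cost of selected members: {mean_y1:,.0f}")
print(f"Mean year 2 cost of selected members: {mean_y2:,.0f}")
print(f"Apparent 'savings' with no intervention: {apparent_savings:.0%}")
```

Because members were selected on an unusually costly year, their average cost falls in the following year even though nothing was done, which is why a pre-post comparison on the selected group alone can produce implausibly large ROIs.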
In the case of the less extreme ROIs, plan sponsors will still question the ROI, particularly when it does not align with other relevant data points, such as engagement levels. For example, a plan sponsor would question a vendor who showed an ROI that increased over time while engagement decreased over time. Similarly, a health plan would question a vendor that claimed a higher ROI for Employer A than for Employer B, even though Employer A had significantly lower engagement levels, all else being similar. Another simple and powerful plausibility test is a comparison of the ROI for engaged versus nonengaged members because one would expect the ROI to be higher for engaged members.
Although the examples above are good screeners for plausibility, they are no guarantee that the ROI is legitimate. The plausibility of the results can be further assessed by comparison to utilization changes. Lewis and Linden have led adoption of this approach. They provide an example of the utilization plausibility test for asthma based on the disease management model of reducing emergency room (ER) visits and hospitalizations for asthma through better daily management. 19 For this example, assume that the program fee is 5% of total expenditure and that asthma-related ER visits and hospitalizations represent 25% of total expenditure (Table 1). Under these assumptions, a 40% reduction in ER and hospitalization expenditure for asthma is needed to achieve a 2:1 ROI. Obviously, a 40% reduction in ER/hospital costs for asthma is a high bar and beyond the effect seen in published studies of disease management.
On a related note, CMS questioned the plausibility of cost savings for the MHS pilots given the relatively low level of disease-related admissions observed across vendors. Specifically, CMS reported that the MHS vendors had 3 nondisease-related admissions for every 1 disease-related admission, in the case of both heart failure and diabetes, and expressed concern that vendors have "overestimated the impact of their intervention on their ability to reduce the stream of beneficiary utilization." 20 It is important to keep in mind that, in the above example, a 40% reduction is conservative because it assumes no increase in drug expenditure or other services that would have to be offset through additional savings and does not reflect real-world engagement levels. Continuing the previous example, if one half of the asthmatics targeted by the program are actually engaged, an 80% reduction in asthma-related ER/hospital costs for those asthmatics engaged in the program is needed to achieve a 2:1 ROI.
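The arithmetic behind the asthma example can be written out as a short sketch. The function name and parameters are illustrative, and this is a simplification of the plausibility logic rather than a formal method. It treats ROI as gross savings divided by program fees and assumes that engaged members account for a proportional share of the targeted costs.

```python
def required_reduction(roi, fee_share, targeted_cost_share, engaged_fraction=1.0):
    """Fractional cut in targeted costs needed to reach a given gross ROI.

    roi:                 savings divided by fees (2.0 for a 2:1 ROI)
    fee_share:           program fee as a fraction of total expenditure
    targeted_cost_share: costs the program aims to reduce, as a fraction
                         of total expenditure
    engaged_fraction:    share of targeted members actually engaged
                         (assumed to carry a proportional share of costs)
    """
    required_savings = roi * fee_share
    addressable_costs = targeted_cost_share * engaged_fraction
    return required_savings / addressable_costs


# Values from the asthma example: fee = 5% of total expenditure,
# asthma-related ER/hospital costs = 25% of total expenditure.
full_engagement = required_reduction(2.0, 0.05, 0.25)
half_engagement = required_reduction(2.0, 0.05, 0.25, engaged_fraction=0.5)

print(f"All targeted members engaged: {full_engagement:.0%}")  # 40%
print(f"Half of members engaged:      {half_engagement:.0%}")  # 80%
```

A result above 100% would mean the claimed ROI cannot be reached from the targeted costs alone, whatever the quality of the intervention.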
The formal analysis for this approach is number needed to decrease (NND), which allows plan sponsors to determine whether the claimed ROI is even possible. 21 The Disease Management Association of America (DMAA) adopted the NND plausibility test as part of its outcomes guidelines. 22 However, use of this test is just beginning to take hold in the industry; the ability of disease management vendors to "pass" the plausibility test remains to be seen.
Although the plausibility test focuses on a retrospective assessment, there is growing evidence that challenges the epidemiologic foundation of the outcomes expected from disease management programs. Again, an example best illustrates the issue. Suppose that a program serves a 50-year-old, 250-pound male diabetic with a hemoglobin A1c of 9 and low-density lipoprotein cholesterol (LDL-C) of 228. A well-validated epidemiologic model shows that aggressive management to lower the A1c from 9 to 7 will reduce the 3-year risk of myocardial infarction from 12.9% to 11.3%, the risk of stroke from 1.7% to 1.4%, the risk of retinopathy from 0.23% to 0.12%, and the risk of foot problems from 0.5% to nearly zero. 23 Although clinically meaningful, these reductions seem insufficient to drive a short-term positive ROI given the small absolute risk reduction and incremental cost to manage. Accordingly, a plan sponsor must be willing to believe that short-term savings will be generated from reductions in ER visits (and hospitalizations) for short-term complications, primarily hypoglycemic events requiring emergency care, which average 3.4 per 100 person-years among diabetics. 24 Research suggests otherwise. More than a decade ago, researchers using a Monte Carlo simulation of the Diabetes Control and Complications Trial (DCCT) found that intensive management of blood glucose in patients with diabetes did not provide either short- or long-term net direct medical savings. 25 Recently, researchers from the American Diabetes Association and American Heart Association, using a well-validated epidemiologic model, found that neither blood glucose control nor blood pressure control in patients with diabetes produced cost savings even over 30 years. 26 The question of value is compounded by the recent outpouring of evidence that chasing biomarkers is also not associated with expected nonfinancial end-point outcomes.
27 Against this backdrop, disease management vendors have been conspicuous in failing to provide the epidemiologic and economic underpinnings for program intervention effects. Given the large datasets at their disposal, vendors could provide population-based modeling, based on evidence in the literature that shows the causal pathway and the corresponding effect on process measures to produce the desired results. In addition, many vendors should have sufficient experience at this point to factor in the cost of medications and other services that generate a cost increase. Of course, vendors run the risk that a priori identification of the outcomes drivers will suggest that the program is not a good investment for a particular plan sponsor given its underlying utilization patterns. This question was raised more than a decade ago, when clinical researchers challenged the economic and clinical benefit of an asthma disease management program for a health plan with a high ratio of drug costs and a low ratio of ER and inpatient hospitalization costs (relative to total expenditure). 28

More Rigorous Evaluation Methodology
The need to conduct plausibility tests is partially a symptom of the historic weakness of the methodologies used to evaluate disease management programs. The most common approach to calculating an ROI has been to compare the actual observed trend for the diseased members to an expected trend based on actuarial forecasting techniques and other adjustments developed specifically for disease management (e.g., using the trend of nondiseased members as the expected trend). Major challenges with this approach are that trends for whole programs are subject to random variation and that trend estimates can vary dramatically depending on reimbursement rates, claims run-out patterns, and numerous other wild cards. These externalities and methodological decisions can have a dramatic impact on the ROI calculation even when they affect expected or actual trend by only a few percentage points.
From a design perspective, the ROI methodology has been inadequate to establish a causal link between the program and the outcomes because the threats to validity, including regression to the mean and selection bias (i.e., a comparison to a nonequivalent group), are not sufficiently addressed. 11 Absent a randomized trial, which was used in the MHS demonstrations but is clearly not practical on a routine basis, the strongest evaluation approach for disease management interventions is the quasi-experimental, pre-post with comparison group design, adjusting for known differences. The use of a comparator group is essential to control for many potential confounders, one of the most concerning of which is regression to the mean. Vendors have developed numerous methodologies, short of control groups, in an attempt to address this limitation (e.g., adjust for nondiseased trend), but none has been shown empirically to be adequate.
More recently, DMAA has advocated the use of control groups as best practice when practical, but their use in the market has been limited to date. If disease management vendors provide a comparison group, they frequently use nonparticipants as the comparator. 29 Although convenient, this approach is problematic because the reasons why an individual chooses to participate in a program (e.g., motivation, trust of provider, or general inclination toward adherence with medical treatments) are likely related to their ultimate outcomes, yet cannot be controlled empirically through variables that are measurable using medical claims data.
In terms of the preferred approach of using comparison groups from other populations (e.g., a group of members who meet the same diagnostic criteria as disease management participants but to whom no disease management program is available), some argue that the use of comparison groups is too costly, takes too much time, or is simply not possible because no comparison group exists. In other words, they argue that there is a trade-off between rigor and practicality. Typically, these barriers are the exception rather than the rule. Many health plans sell disease management services as an add-on to the medical benefit for their administrative services only (ASO) groups (i.e., employers who self-fund). This sales strategy provides multiple employer groups for comparison because purchase of disease management services among ASO groups is rarely 100%. Similarly, in the case of an employer who purchases directly from a disease management vendor, the disease management vendor can often provide an adequate comparison group from its health plan's ASO customers who have not yet purchased the program 30 or, if desired, through staggered implementation across the employer's sites.
Although vendors cannot provide a comparison group in every case, they can usually do so for larger clients. For smaller clients, providing relevant benchmarks of expected outcomes improvement over time (for employers with and without disease management) would be a positive step forward, even though no direct statistical comparisons can be made. Increasingly, more sophisticated plan sponsors do not rely on the disease management vendors to provide comparison group results. Instead, they are producing their own analyses in-house or with third-party assistance.
Finally, as evidenced by DMAA changing its name to DMAA: The Care Continuum Alliance, 31 disease management vendors have expanded rapidly into the health and wellness arena, which is fraught with numerous new methodological issues that warrant close attention. For example, the measurement of productivity savings is currently as much art as it is science, offering degrees of freedom in terms of methodological options and making it susceptible to exaggeration. 32 A vendor might apply productivity savings from a member survey that had only a 20% response rate to the entire population to generate an ROI for the program. This approach will overstate health-related productivity loss if the individuals who responded to the survey are significantly sicker and incur greater health-related productivity loss than the overall population, as is frequently the case. Determining the value of improvements in biometrics scores is another area of relatively uncharted territory where extraordinary claims of value are being made, but the rigor of methodologies to validate those claims is not keeping pace. On these new issues, it will be important to proactively identify rigorous and practical solutions to assessing the value of these programs.

Evidence of Improvement in Outcomes
Ultimately, most of the issues identified above are symptoms of the fundamental question surrounding disease management: Does it work? By "work," decision makers usually mean "save money," with a secondary interest in improved clinical outcomes.
A meta-analysis of studies from 1995 to 2003 found a small to moderate effect of disease management on medical costs. However, the statistical analysis combined multiple models of disease management from numerous countries, making it difficult to reach any conclusions about the sparsely represented commercial, population-based, telephonic programs in the U.S. 33 As mentioned earlier, Mattke et al.'s review (December 2007) 2 found just 3 studies of population-based disease management from 1990 to 2005, only 1 of which used a quasi-experimental design but had other limitations, 34 leading the authors to conclude that plan sponsors should take a hard look at vendor assertions of success. 2 Since that time, Chan and Cooke found that disease management had no impact on pharmacotherapy after myocardial infarction. 7 Finally, CMS found no evidence of cost savings from the MHS demonstrations; 20 and although the design and implementation of the pilot have been criticized, 35,36 CMS has not wavered.
Interestingly, of all this work, Chan and Cooke received the least attention, although the study was quite important due to its use of a contemporary comparison group to examine medication compliance, a critical indicator of program success. To date, there is no strong evidence that disease management has improved the well-known compliance problem that exists for chronic diseases such as diabetes and cardiovascular disease. This is somewhat ironic given that first-generation disease management programs were frequently funded by pharmaceutical manufacturers and focused on increasing patient compliance with chronic medications.
Given little rigorous evidence suggesting that telephone-based disease management programs save money, how do many of the disease management vendors make claims of value? They do so by citing their own analyses that are lower in the hierarchy of evidence (e.g., case reports and case controls) and/or studies that do not address effectiveness directly. After 10 years and tens of millions of enrolled members, one would expect a richer body of literature. Although some might claim that rigorous studies are not feasible for all the reasons mentioned earlier, pre-post, comparison group studies are a viable approach. Accordingly, the lack of literature is more likely explained by a lack of effectiveness combined with publication bias. Absent a countervailing force, one would expect disease management vendors in a competitive marketplace to selectively publish results that put them in the most favorable light. Evidence of this publication bias was found in a meta-analysis published in 2005. 37 It is important to keep in mind that a lack of evidence and potential for publication bias are hardly unique to the disease management industry. Publication bias is a systemic problem that has been documented most extensively in the pharmaceutical industry. 38,39 If we look to that industry for guidance, several partial solutions have been suggested and a few implemented, including Guidelines for Good Publication Practice, 40 as well as calls for more government or other independent funding, independent statistical assessment of results before publication, and a trials database that requires companies to register their clinical trials at the outset. 41 All represent only partial solutions but, as a whole, provide some checks and balances for the industry.
Clearly, pharma has come under the most scrutiny because pharmaceuticals can do real harm when ingested. Accordingly, although it would be useful to implement many of the solutions above in disease management, it would be unrealistic to expect such an implementation to materialize. Even the government is not likely to fund additional pilots until 2010 or beyond. Thus, the responsibility lies with plan sponsors, both health plans and employers, to conduct these evaluations (or share their data with third-party researchers) for dissemination. MacStravic advocates that employers and health plans expand and/or coordinate their efforts, perhaps even financially sponsor a national body to conduct systematic analyses of disease management programs. 42 As MacStravic notes, the criteria for disease management interventions are already established and generally agreed upon, so the incremental time and cost should be modest. MacStravic argues that "if payers demand comparative information from all HDM [health and disease management] providers, in some standard format and understandable mode of reporting, providers will have little choice but to deliver it… No provider will want to be shut out of consideration for having failed to provide the information called for by their prospects and current clients." 42
Finally, research continues to suggest that the fundamental question of what defines the value proposition for disease management should be revisited. Although disease management had its roots in improving the quality of care, it quickly evolved to a net savings requirement, even though literature suggests that less than 20% of treatments for existing conditions are cost-saving. 43 A net savings requirement may simply be too high a hurdle given the findings to date. Kahn et al. found that smoking cessation was the only prevention intervention that produced cost savings over 30 years in the U.S. population given current market prices. 26 However, when measured on a cost per quality-adjusted life-year (QALY) saved, some disease management interventions may, in fact, provide a good value. Kahn et al. found that glucose control in patients with diabetes costs $48,759 per QALY (not including the cost of a disease management program) and using aspirin in high-risk individuals costs only $2,779 per QALY. In contrast, lowering of LDL-C in those at high-risk for coronary heart disease costs $83,327 per QALY, which exceeds the well-known but arbitrary threshold of $50,000 per QALY. 26

Conclusions
Although enthusiasm for disease management was once a matter of philosophical belief, the tolerance of plan sponsors for weak evidence of the value of disease management is dissipating. Current evidence does not suggest that commercial disease management saves money in any of its current variations. As to whether disease management can produce beneficial outcomes, the answer will likely depend, in part, on the criteria for effectiveness: net savings, clinical improvement, or a reasonable cost per QALY. That said, the business challenge is that decisions to fund disease management become much more difficult under a cost per QALY framework than when claiming net cost savings, particularly in economic downturns. Regardless of the success measure, to overcome the current hurdles to establishing value, future disease management programs may require a more targeted approach that includes more selective enrollment, more focused interventions, and greater customization to the specific gaps and cost drivers of an individual plan sponsor.
Plan sponsors also bear responsibility for the current situation. As long as they demand a short-term ROI in the current model and inconsistently require comparison groups, they are more likely to promote methodological creativity than they are to inspire true innovation. They must consistently challenge vendors, look for the evidence beyond the marketing, and share their experience. The same holds true for those benefits consultants who have bought into the ROI concept so much that their requests for proposals demand a methodologically unsound ROI calculation. That demand puts pressure on the disease management companies to have an ROI option even when they know it makes no sense, because they risk losing the contract to another company that promises an exaggerated ROI. Table 2 summarizes the 5 challenges and accompanying actions that plan sponsors and consultants can take immediately to mitigate the issues outlined here.
Although some people question whether disease management will meet a fate similar to that of the HMOs in the late 1990s, 8 the continued experimentation in the retail- and employer-based model and optimism among some stakeholders for the medical home model will keep the disease management industry on the radar for the near future. During this time, marketplace entry by new vendors with notably innovative solutions is likely. If plan sponsors and consultants work collaboratively to set a higher bar, the next generation of vendors will be required to put more emphasis on the epidemiologic and economic science of disease management than has been seen previously.

DISCLOSURES
The author is a former employee of a disease management company and provides consulting services in the design, management, and evaluation of disease management interventions.