Evaluating Oncology Value-Based Frameworks in the U.S. Marketplace and Challenges in Real-World Application: A Multiple Myeloma Test Case

BACKGROUND: With the continuous rise in costs for oncology drugs, the American Society of Clinical Oncology (ASCO), the Institute for Clinical and Economic Review (ICER), the Memorial Sloan Kettering Cancer Center’s Drug Abacus (DrugAbacus), and the National Comprehensive Cancer Network (NCCN) have developed value-based frameworks (VBFs) to assist stakeholders in formulary and treatment decision-making processes. Since emerging VBFs have the potential to affect available treatment options for patients, it is important to understand the differences associated with these VBFs within various therapeutic areas. OBJECTIVES: To (a) compare VBFs across 3 therapeutic options for relapsed or refractory multiple myeloma (RRMM) and (b) identify challenges and limitations associated with real-world decision making using VBFs in the U.S. marketplace. METHODS: The values of regimens carfilzomib (CFZ), elotuzumab (ELO), and ixazomib (IX) were generated using the ASCO, NCCN, ICER, and DrugAbacus VBFs. These regimens, used for second- or third-line treatment of RRMM, shared a common comparator in clinical trials: lenalidomide + dexamethasone (LEN + DEX). ASCO’s 2016 VBF, which incorporated clinical benefit, toxicity, and bonus points, was used to generate a net health benefit score, along with the drug wholesale acquisition cost, for each regimen compared with LEN + DEX. Results of the 2016 NCCN Evidence Blocks for multiple myeloma and the ICER 2016 report of treatment options for RRMM were extracted to generate the value of CFZ, ELO, and IX. No output was generated from DrugAbacus because of the lack of regimens included in the test case. Shortcomings associated with running the test case in RRMM for each VBF were also identified. RESULTS: Among the 3 therapeutic agents, CFZ, in combination with LEN + DEX, was the most valued. ASCO and ICER VBFs suggested that CFZ + LEN + DEX may be the most valued, followed by ELO + LEN + DEX and IX + LEN + DEX. 
NCCN suggested that LEN + DEX may be the most valued, followed by CFZ + LEN + DEX, IX + LEN + DEX, and ELO + LEN + DEX. A number of shortcomings were noted across each VBF, such as complexities of drug evidence evaluation with the ASCO VBF, the inability to adjust the ICER and NCCN VBFs to specific populations, and subjectivity associated with the NCCN VBF and DrugAbacus. CONCLUSIONS: Although the test case provided some consensus on treatment decisions, the VBFs available for RRMM have many nuances and limitations. Greater objectivity and better adaptability to specific treatment decisions are warranted.


■■ Methods

Study Overview
A literature review was performed to identify all U.S. VBFs used to assess the value of oncology drugs. Four VBFs were identified and included in this study: ASCO and ICER VBFs, NCCN Evidence Blocks, and DrugAbacus. To determine the RRMM treatment of greatest value, a test case analysis was performed for each VBF. Once the test case analysis was completed, results for each VBF were compared to evaluate whether value was equally measured across all frameworks. As the evaluation was performed, shortcomings of each VBF were reported.

Test Case: Multiple Myeloma Drugs
For our test case, we defined inclusion criteria for cancer drugs used within the same disease state. Four inclusion criteria were considered in the selection of oncology drugs for the test case: (1) oncology therapy approved between 2015 and 2016 by the U.S. Food and Drug Administration (FDA) for multiple myeloma; (2) oncology therapy with phase III clinical trial results available through databases such as PubMed and Scopus; (3) oncology therapy that used the same comparator in clinical trials; and (4) oncology therapy evaluated and included in the ICER VBF report and NCCN Evidence Blocks. Based on these criteria, RRMM was chosen for the test case analysis.
Since 2015, the FDA has approved 6 drugs for RRMM: carfilzomib (CFZ), elotuzumab (ELO), ixazomib (IX), daratumumab (DARA), panobinostat (PAN), and pomalidomide (POM). DARA was excluded from the analysis because it did not have a phase III clinical trial, and PAN and POM were excluded because they had different comparators in their clinical trials, making comparison with the ASCO VBF inappropriate. Therefore, for the test case, we used CFZ, ELO, and IX because they fulfilled the predefined inclusion criteria and used the same standard of care as the comparator, lenalidomide + dexamethasone (LEN + DEX).

Oncology Value Frameworks and Usability in Test Case
ASCO Value-Based Framework. The updated 2016 ASCO framework was used to generate the value of CFZ, ELO, and IX. This VBF allows users to generate a net health benefit (NHB) score for a regimen assessed against its comparator using the available prospective randomized controlled trial data. The NHB of the regimen is derived from the clinical benefit (calculated using hazard ratio [HR], overall survival, or progression-free survival); toxicity (calculated using frequency of side effects and grade); and bonus points (calculated by evaluating tail of the curve, palliation, health-related quality of life, and treatment-free interval). 12,13 The NHB score serves as an indicator of the clinical effect of a regimen as compared with a control regimen.

research and development, manufacturing costs for complex compounds, and the economic principles surrounding oncology drug pricing. 3 Cost sharing for patients has increased, as well. Annual patient out-of-pocket payments for intravenous and oral medications now soar to over $7,000 and $3,000, respectively. 2 Strategies to control costs have been proposed, and some have already been implemented to mitigate cancer drug costs, such as the use of evidence-based clinical treatment pathways and tools to facilitate cost discussions with patients to establish value of treatments. 3,4 These strategies highlight a key role for health technology assessment (HTA) in pharmaceutical valuation moving forward.
The United States persistently lacks meaningful use of cost-effectiveness in health economic evaluations, largely because of a diverse health care landscape that is riddled with numerous gaps in care and the complexity of a multipayer system. 5 A recent survey of private payers found that all respondents had used at least 1 external HTA organization to help inform coverage decision making with regard to personalized medicine in oncology. 6 Nonetheless, the survey respondents reported lack of availability, timeliness and redundancy of reviews, and inadequate incorporation of cost-effectiveness as main shortcomings of HTA assessments, highlighting cost-effectiveness as a primary nonclinical factor needed in HTA assessments. 6 Today, an emerging HTA model known as the value-based framework (VBF) evaluates evidence from clinical and economic data to inform health care decision making across payer, physician, and patient perspectives. The American Society of Clinical Oncology (ASCO), the National Comprehensive Cancer Network (NCCN), Memorial Sloan Kettering Cancer Center's Drug Abacus (DrugAbacus), and the Institute for Clinical and Economic Review (ICER) have developed VBFs with the collective mission to justifiably valuate therapies available for a variety of disease states. 7 With these VBFs, the use of weights to reflect user preference varies tremendously, barring the economic principle of trade-off, which ultimately convolutes the utility of these assessment models. Furthermore, these frameworks seem to continually advance towards multiple criteria decision analysis, which many health economists believe to be the next step in value assessment. 5 Yet, all existing VBFs have received criticism and challenges because of the minimal influence from health economists in their creation. 5

Despite the proliferation of recent literature assessing the validity, reliability, and practicality of VBFs, few studies have critically evaluated all available models for oncology and their potential effect on real-world decision making. 7-11 No study has compared the value of cancer regimens for a specific disease state across all available U.S. oncology VBFs. Thus, the purpose of this study was to (a) describe and assess VBFs used for relapsed or refractory multiple myeloma (RRMM) regimens and (b) identify challenges and limitations associated with real-world decision making using VBFs in the U.S. marketplace.

Each regimen was independently scored by 2 of the authors who followed the directions provided in the revised ASCO framework tool for advanced disease developed by . 13 After each author independently scored the RRMM treatment, the 2 authors compared their scoring results for similarities or discrepancies. Any discrepancies were further discussed among all 4 authors to establish a final NHB score and drug cost for that regimen. Results from the phase III randomized clinical trials ASPIRE, ELOQUENT, and TOURMALINE were used to generate the NHB scores for CFZ, ELO, and IX, respectively. 14-17 The cost of each therapy was included using the drug acquisition cost or the patient cost, depending on patient health insurance status. 13 The wholesale acquisition cost (WAC), the list price from a manufacturer to a wholesaler, was used to generate the regimen cost. WAC prices were calculated using a standard body weight of 70 kg and height of 170 cm for weight-based dosing and were obtained from Medi-Span Price Rx and RED BOOK for pricing references. 18

CFZ = carfilzomib; ELO = elotuzumab; IX = ixazomib; LEN + DEX = lenalidomide + dexamethasone; NHB = net health benefit.

19 The NCCN Evidence Blocks has an assessment of different factors that are used to evaluate a regimen.
These factors include 5 evidence blocks: efficacy, safety, quality, consistency of evidence supporting the recommended therapy, and affordability. 20 The score ranges from 1 to 5, with 5 as the most favorable. 20

ICER Value-Based Framework. ICER publishes reports that include clinical and economic evaluations of therapies for certain disease states. The ICER evidence-based medicine matrix rates comparative clinical effectiveness with 3 levels of certainty: low certainty (I); moderate certainty (B+, C+, P/I, and I); and high certainty (A, B, C, and D). 21 The cost-effectiveness analysis results are reported as cost per quality-adjusted life-year (QALY), and the budget impact analysis results are reported as average budget impact per year. The ICER report of May 5, 2016, on treatment options for RRMM was used to extract results on the value of CFZ, ELO, and IX. 22 This report incorporates comparative clinical effectiveness, incremental costs per outcome achieved, potential budgetary impact, and value-based price benchmarks. In the test case, results of the comparative effectiveness analysis, cost-effectiveness analysis, and budget impact analysis were used to estimate the value of the RRMM agents selected.
DrugAbacus. The DrugAbacus price is calculated based on the measure of each domain and the weight defining the importance of that domain according to the user. 23 This tool contains 8 domains: efficacy, tolerability, novelty, rarity, population burden, research and development costs, unmet need, and prognosis. 23 DrugAbacus determines a theoretical price, the Abacus price, for cancer drugs based on opinions of experts regarding possible domains of a drug's value. 23 The authors sent a questionnaire by e-mail to 6 pharmacists and physicians who specialized in oncology. The questionnaire was composed of the standard 8 questions included in DrugAbacus. Because of the lack of multiple myeloma drug options in DrugAbacus, we were unable to use this tool for the test case.

■■ Results
Results for CFZ, ELO, and IX were generated using the ASCO VBF and extracted from the 2016 ICER RRMM report and the 2016 NCCN Evidence Blocks.

ASCO Value-Based Framework
Scoring from the ASCO VBF suggested CFZ as the preferred option, with a cost slightly higher than ELO (Appendix, available in online article). CFZ, in combination with LEN + DEX, had an NHB of 28.8 obtained from the clinical benefit score, clinical toxicity score, and bonus points (Figure 1). The clinical benefit score of CFZ represented an HR of 0.79 for death, or 21% reduction in risk of death, when compared with the control LEN + DEX. The toxicity score was 29.5 versus 26.5 for LEN + DEX. Because of the statistically significant improvement in quality of life reported by patients taking CFZ in the ASPIRE clinical trial, 10 bonus points were added to the NHB.
IX, in combination with LEN + DEX, had an NHB of 23.0, resulting from the clinical benefit and toxicity score (Figure 1). The IX clinical benefit score represented an HR of 0.77 for death, or 23% reduction in risk of death, when compared with the control LEN + DEX.

ELO, in combination with LEN + DEX, had an NHB of 23.7 resulting from the clinical benefit score and toxicity score (Figure 1). Because of lack of information on the HR for death or disease progression, the median progression-free survival was used to determine the clinical benefit score for ELO. The addition of ELO to LEN + DEX provided a 30.2% increase in median progression-free survival compared with LEN + DEX alone, which resulted in a clinical benefit score of 24.2 (Appendix). The toxicity score for ELO + LEN + DEX was 38.5 versus 37.5 for LEN + DEX. No bonus points were awarded to the regimen because the tail of the curve could not be reliably interpreted.

The monthly WAC or WAC per cycle for LEN + DEX was $11,616. The addition of CFZ, ELO, and IX to LEN + DEX resulted in a monthly drug cost of $17,364, $16,032, and $20,607, respectively (Figure 1).
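The ELO score can be retraced with a few lines of arithmetic. The sketch below is illustrative only: it assumes that the toxicity adjustment equals the relative difference in toxicity scores versus the comparator multiplied by 20, a rule that is not spelled out in this article, and the function names are hypothetical. The clinical benefit score, toxicity scores, and WAC figures are the values reported in the text.

```python
def toxicity_adjustment(regimen_tox: float, comparator_tox: float,
                        scale: float = 20.0) -> float:
    """ASSUMPTION: relative toxicity difference vs. comparator, times `scale`."""
    return (regimen_tox - comparator_tox) / comparator_tox * scale


def net_health_benefit(clinical_benefit: float, regimen_tox: float,
                       comparator_tox: float, bonus: float = 0.0) -> float:
    """NHB = clinical benefit - toxicity adjustment + bonus points."""
    return clinical_benefit - toxicity_adjustment(regimen_tox, comparator_tox) + bonus


# ELO + LEN + DEX inputs as reported: clinical benefit 24.2,
# toxicity 38.5 vs. 37.5 for LEN + DEX, no bonus points.
elo_nhb = net_health_benefit(24.2, 38.5, 37.5)

# Incremental monthly WAC of each add-on regimen over LEN + DEX (US dollars).
LEN_DEX_WAC = 11_616
monthly_wac = {"CFZ": 17_364, "ELO": 16_032, "IX": 20_607}
incremental_wac = {drug: cost - LEN_DEX_WAC for drug, cost in monthly_wac.items()}

print(round(elo_nhb, 1), incremental_wac)
```

Under this assumed rule, the ELO inputs give an NHB close to the reported 23.7. Results for other regimens may differ slightly because the published component scores are rounded.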

NCCN Evidence Blocks
The NCCN Evidence Blocks report suggested CFZ + LEN + DEX as a preferred regimen in terms of efficacy, but LEN + DEX had a preferred safety and cost profile. As the baseline comparator, LEN + DEX achieved a score of 4, 4, 4, 4, and 2 in the efficacy, safety, quality, consistency, and affordability domains, respectively (Figure 2). 19 Efficacy was the only differentiating drug assessment factor, with scores of 5, 3, and 4 for CFZ, ELO, and IX regimens, respectively. Safety, quality, consistency, and affordability were the same across regimens with scores of 3, 4, 4, and 1, respectively (Figure 2). 19 Based on the efficacy score from the NCCN Evidence Blocks, CFZ, in combination with LEN + DEX, was the preferred option among all 3 regimens.
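The block scores reported above can be laid out as a small table to check which domains separate the regimens. This is a minimal sketch using only the scores stated in the text; the variable and function names are hypothetical, and NCCN itself does not publish such a routine.

```python
DOMAINS = ("efficacy", "safety", "quality", "consistency", "affordability")

# Evidence block scores (1 to 5, 5 most favorable) as reported in the text.
evidence_blocks = {
    "LEN + DEX": (4, 4, 4, 4, 2),
    "CFZ + LEN + DEX": (5, 3, 4, 4, 1),
    "ELO + LEN + DEX": (3, 3, 4, 4, 1),
    "IX + LEN + DEX": (4, 3, 4, 4, 1),
}


def differentiating_domains(blocks, regimens):
    """Return the domains in which the listed regimens do not all share a score."""
    return [
        domain
        for i, domain in enumerate(DOMAINS)
        if len({blocks[r][i] for r in regimens}) > 1
    ]


add_on = [r for r in evidence_blocks if r != "LEN + DEX"]
add_on_diff = differentiating_domains(evidence_blocks, add_on)
all_diff = differentiating_domains(evidence_blocks, list(evidence_blocks))

print(add_on_diff)  # domains separating the 3 add-on regimens
print(all_diff)     # domains separating all 4 regimens, comparator included
```

Among the 3 add-on regimens only efficacy differs, whereas including the comparator also brings safety and affordability into play, which is where LEN + DEX scores higher.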

ICER Value-Based Framework
Results from the ICER VBF report suggested CFZ as the preferred agent among all 3 agents in terms of cost-effectiveness and budget impact. The ICER VBF report assigned a "B+" rating, or moderate certainty, for the comparative clinical effectiveness of CFZ, ELO, and IX, in combination with LEN + DEX. 22 These 3 regimens provided a better NHB for second-line and third-line therapy in patients with RRMM compared with the standard therapy LEN + DEX. The comparative clinical effectiveness rating was based on the progression-free survival benefit observed with each regimen and the positive balance of benefits and harms of each regimen. In the ICER report, costs per QALY for second-line therapy were $199,982, $427,607, and $433,794 for CFZ, ELO, and IX, in combination with LEN + DEX, respectively (Table 1A). 22 Third-line treatment costs per QALY for CFZ, ELO, and IX, in combination with LEN + DEX, were $238,560, $481,244, and $484,582, respectively (Table 1B). 22 These results suggest that no regimen would be a good value in the long term. Thus, significant list price reductions for each regimen would be required to achieve acceptable long-term cost-effectiveness between $100,000 and $150,000 per QALY for second-line and third-line use (Table 1). None of the 3 regimens exceeded the ICER budget impact threshold of $904 million per year for a new drug. The average potential budget impact for second-line therapy was $226 million per year for CFZ + LEN + DEX, $395 million per year for ELO + LEN + DEX, and $330 million per year for IX + LEN + DEX (Figure 3). 22 The budget impact was lower for third-line regimens because of smaller patient populations.
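The threshold comparisons in this paragraph reduce to simple checks, sketched below. The figures are those reported in the text; the dictionary layout and helper names are hypothetical and are not part of the ICER report.

```python
# Cost per QALY (US dollars) for each agent added to LEN + DEX, as reported.
cost_per_qaly = {
    "second-line": {"CFZ": 199_982, "ELO": 427_607, "IX": 433_794},
    "third-line": {"CFZ": 238_560, "ELO": 481_244, "IX": 484_582},
}
# Average potential budget impact per year, second-line use (US dollars).
budget_impact_second_line = {"CFZ": 226e6, "ELO": 395e6, "IX": 330e6}

UPPER_QALY_THRESHOLD = 150_000     # upper end of the $100,000-$150,000 range
BUDGET_IMPACT_THRESHOLD = 904e6    # per-year threshold for a new drug


def exceeds(values, threshold):
    """Map each drug to True when its value is above the threshold."""
    return {drug: value > threshold for drug, value in values.items()}


above_qaly = {line: exceeds(v, UPPER_QALY_THRESHOLD) for line, v in cost_per_qaly.items()}
above_budget = exceeds(budget_impact_second_line, BUDGET_IMPACT_THRESHOLD)
lowest_cost_per_qaly = {line: min(v, key=v.get) for line, v in cost_per_qaly.items()}

print(above_qaly)
print(above_budget)
print(lowest_cost_per_qaly)
```

Every regimen sits above the upper cost-effectiveness threshold in both lines of therapy, none crosses the budget impact threshold, and CFZ has the lowest cost per QALY in each line.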

VBF Shortcomings
Because of the structure of the ASCO VBF and DrugAbacus, their results rely, respectively, on the understanding and interpretation of randomized controlled trial results and on user weights that are used as input, which could cause subjectivity bias. By contrast, the ICER VBF and NCCN Evidence Blocks are published reports and cannot be edited or adjusted for a specific situation or population. Overall, the clinical application of the 4 VBFs can be challenging, since they do not produce patient-specific outcomes. In addition, the ASCO evaluation is not practical for everyday practice, since input information is not always readily available and its collection can be time consuming. Likewise, the ICER report is extensive and not completely clear in its guidance for clinical practice and physician prescribing. Based on ease of implementation, NCCN Evidence Blocks have the highest potential for uptake in clinical practice, yet they could be subject to bias, since the categories are based on the NCCN panel's perspective.

■■ Discussion
Novel value frameworks target different audiences-clinician, patient, or payer-which makes it challenging when trying to determine the true value of an oncology regimen. This study evaluated all oncology VBFs that can be used for RRMM in the U.S. health care system. The ASCO VBF, which targets

In the test case, we identified limitations specific to VBFs that can prevent their implementation into real-world practice. For instance, we identified complexities of drug evaluation as a limitation for ASCO, which led to discrepancies among the authors' results. The ASCO framework quantifies value in terms of points awarded for clinical benefit, toxicity, and bonus points. These points can be difficult to calculate, which makes it hard to determine the right value for the regimen. 8 Moreover, previous studies have reported mixed results on interrater reliability when using the ASCO VBF. 9,11 Wilson et al. (2017) suggest low interrater reliability because of the confusion among clinicians about the calculation of the toxicity score and uncertainty about bonus points scoring. 9 The lack of available data on bonus points, such as palliation, quality of life, and treatment-free survival, was also identified in the literature as a limitation associated with ASCO that affects the NHB score used to establish the value of a regimen. 9 With the 2016 passage of the 21st Century Cures Act, a particular focus on patient experience data may encourage standard reporting of these endpoints in future studies. 28

A study that evaluated the validity of the ASCO, ICER, and NCCN VBFs for advanced lung cancer drugs showed a high convergent validity between ASCO and NCCN and a low convergent validity between ICER and NCCN. 11 However, the results of our test case demonstrated that ICER and ASCO outcomes were the same with regard to the most valued regimen (CFZ in combination with LEN + DEX), while NCCN results differed from ASCO and ICER. A unique inherent design component of the NCCN framework enabled the baseline comparator (LEN + DEX) to be considered the most valued treatment among our considered treatments. It is important to note that Bentley et al. (2017) used transparent descriptions of comparative clinical effectiveness 11 -the evidence-rating

clinicians, and the ICER report, which targets payers, suggested CFZ in combination with LEN + DEX as the preferred treatment regimen in the test case. The NCCN Evidence Blocks, which target clinicians and patients, suggested that LEN + DEX was the best regimen, followed by CFZ in combination with LEN + DEX. Because of the lack of availability of some of the cancer drugs in DrugAbacus, no output was generated from that VBF.
Despite different targeted audiences, these VBFs recommended the same regimen among all 3 novel cancer drugs. Each audience has its own preferences and assigns its own weights to the attributes of value, which can be problematic when determining whether each VBF captures the true value for its audience. 8,24 Our findings indicate that value may be measured the same across all frameworks regardless of the audience targeted. This finding may imply that attributable weights incorporated in each VBF (ASCO, ICER, or NCCN) converge in their assessment of value or that each VBF captures shared components of value across all stakeholders.
Although some VBFs highlighted the same medication as the most valued therapy option, it is unclear whether this will be the case across different therapeutic areas. At the time of this study, we were limited to 2 cancers because published ICER reports were only available for multiple myeloma and advanced non-small cell lung cancer. This study focused on multiple myeloma, since it is the second most common blood cancer, with the majority of patients experiencing intermittent relapse and remission throughout the course of the disease. 25 For patients with advanced RRMM disease, there is no standard of care despite several approved novel agents, 3 of which were included in the test case. 26,27 This lack of a standard of care highlights the need for VBFs that can aid in treatment and formulary decision making.

eliminates a user's ability to adjust weights or designate preferences of certain criteria over others. In the ASCO framework, users at least have complete transparency in the NHB derivation, but they cannot adjust the weighting of the tool. Navigation of the DrugAbacus tool is completely based on user-weighted preferences, yet it succumbs to scrutiny for its lack of explanation of domain input ranges and scarcity of drugs for evaluation.
Conversely, there is a convergent aspect among the oncology VBFs with regard to their sources of evidence. It is clear that the frameworks mainly rely on evidence from randomized clinical trials. If VBFs gain considerable traction for value assessment, manufacturers may be compelled to incorporate certain endpoints, especially "bonus" domains in the ASCO framework, as standard drug performance metrics in future study designs.
Our recommendation for future versions of VBFs includes incorporation of explicit weighting systems, such that end users may have a clear understanding of implicit trade-offs created by their decisions when certain criteria are weighted higher than others. In addition, all relevant costs should be presented relative to the perspective of the VBF. Some of the VBFs do not adhere to this stipulation and may underestimate the value of drugs, the costs of which may be offset by lesser medical expenditures. Conceivably, a framework that could comprehensively capture costs, health outcomes, toxicity, quality of life, and other value criteria (e.g., ease of use) could be customizable to multiple targeted stakeholder perspectives.
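To make the idea of an explicit weighting system concrete, the sketch below applies a simple weighted sum, one basic form of multiple criteria decision analysis, to the NCCN block scores reported in the Results. It is an illustration only: NCCN does not aggregate its blocks this way, and both weight sets and all function names are hypothetical.

```python
DOMAINS = ("efficacy", "safety", "quality", "consistency", "affordability")

# NCCN Evidence Block scores reported in the Results (1 to 5, 5 most favorable).
scores = {
    "LEN + DEX": (4, 4, 4, 4, 2),
    "CFZ + LEN + DEX": (5, 3, 4, 4, 1),
    "ELO + LEN + DEX": (3, 3, 4, 4, 1),
    "IX + LEN + DEX": (4, 3, 4, 4, 1),
}


def weighted_value(blocks, weights):
    """Weighted sum of domain scores; weights must be explicit and sum to 1."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    return sum(weights[domain] * score for domain, score in zip(DOMAINS, blocks))


def top_regimen(weights):
    """Regimen with the highest weighted value under the given weights."""
    return max(scores, key=lambda regimen: weighted_value(scores[regimen], weights))


# Two HYPOTHETICAL weight sets that a user might state up front.
equal_weights = {domain: 0.2 for domain in DOMAINS}
efficacy_heavy = {"efficacy": 0.6, "safety": 0.1, "quality": 0.1,
                  "consistency": 0.1, "affordability": 0.1}

print(top_regimen(equal_weights))
print(top_regimen(efficacy_heavy))
```

With equal weights the comparator ranks first, while an efficacy-heavy weighting moves CFZ + LEN + DEX to the top, which makes the trade-off implied by a user's weights visible rather than implicit.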

■■ Conclusions
Even in their early stages, VBFs have presented the opportunity to establish meaningful drug evaluation to help inform key stakeholders when making health care decisions. While consistency and reliability remain to be established, oncology VBFs have opened the door to multiple criteria decision analysis, a valuation method that enables user preferences to navigate through conflicting criteria such as quality and costs. The potential usefulness of this methodology has prompted ICER to incorporate a modified multiple criteria decision analysis process into the 2.0 version of its value framework report. As value for services and products continues to drive U.S. health care reform, VBFs may be a potential tool in the value-based care toolbox.

matrix provided by ICER in its analysis-while we used all of the components of the ICER RRMM report (comparative clinical effectiveness, cost-effectiveness analysis, and budget impact analysis) to obtain the value of the regimens.
There was novelty in the test case approach. We intended to evaluate multiple therapies within a disease state that had a common comparator (i.e., LEN + DEX) using different available U.S. VBFs. In essence, our study provides a unique example of an indirect comparison of value within different VBFs. Meaningful comparisons across treatment regimens are challenging when dealing with multiple comparators in clinical trial evidence in certain VBFs, such as ASCO. For example, we considered including POM in our study, but its pivotal phase III trial used high-dose DEX as the comparator. Any comparison of POM to other agents in our test case would have been biased in the ASCO framework, since each NHB score is derived from the differential advantage of the novel agent over the comparator. This example highlights a major challenge for future versions of available VBFs: how to address discrepancies when comparators are not consistent across therapies within a treatment space.
We also intended to include all available U.S. VBFs in the test case. To our knowledge, no single study has used all 4 available oncology frameworks. While no single stakeholder is anticipated to use all frameworks to assist in decision making, a comprehensive insight into the output of each VBF for a common test case would help to elucidate further the role each plays in this intricate process. The ability to evaluate any set of treatments across these 4 oncology VBFs for a common indication remains largely limited by gaps in evidence assessment.

Limitations
There are some methodological limitations associated with the test case used in this study. The pharmacologic treatment pathways (and their development) in RRMM, like many cancer types, are constantly adapting to new available therapies. This study was mainly limited to 3 recently approved therapies (CFZ, ELO, and IX); however, we recognize that these therapies do not encompass the totality of the RRMM treatment paradigm. For example, NCCN guidelines designate several "preferred" treatment regimens other than the 3 drugs included in the test case. While the test case provided a meaningful comparison of 3 add-on agents to LEN + DEX for patients with RRMM, this was the only disease state that we identified to fit our inclusion criteria. However, although DrugAbacus includes 52 cancer drugs that were approved by the FDA between 2001 and 2015, only PAN, POM, and bortezomib are included as RRMM drugs. 23 Consequently, our ability to generate a value output from DrugAbacus and effectively compare it with the ICER, NCCN, and ASCO VBFs was limited.

Implications
A divergent aspect among the oncology VBFs is their level of customizability. NCCN and ICER publish reports, which