Assessment of the Level of Satisfaction and Unmet Data Needs for Specialty Drug Formulary Decisions in the United States

BACKGROUND: Formulary management within a limited budget is critical, especially for specialty drugs, which are used for serious medical conditions and are very expensive. Despite attempts to summarize the pertinent evidence, it is uncertain whether the data needs of formulary decision makers for specialty drugs are satisfied. OBJECTIVE: To assess the level of satisfaction of specialty drug formulary decision makers with regard to the strength of currently available data sources and unmet needs regarding clinical, economic, and unpublished evidence. METHODS: This study targeted pharmacists and physicians involved with formulary decision making at health plans or pharmacy benefit management companies at the national, large regional, and local levels. Ninety-five individuals were invited to participate (without compensation) in a 21-item, web-based survey (Qualtrics), which was open from June 14 to July 31, 2014. The responses were coded for descriptive and statistical analysis. Statistical analyses included the Kruskal-Wallis test, analysis of variance, and the Mann-Whitney-Wilcoxon test. RESULTS: Of 95 pharmacists or physicians, 40 respondents initiated the survey, and 33 respondents completed the survey (response rate = 34.7%). Drug formulary decision makers infrequently rated data evidence strength (17.1% “always”). Clinical data evidence strength was rated highest for published randomized controlled trials (RCTs; mean [SD] = 4.06 [0.87] of 5.0), while participant organizations’ internal data were rated highest for economic data evidence strength (mean [SD] = 3.91 [1.07] of 5.0). Decision makers rated the highest unmet needs as more data generated from head-to-head RCTs (mean [SD] = 2.94 [0.25] of 3.0) and cost-effectiveness analyses (mean [SD] = 2.53 [0.67] of 3.0). The participants believed manufacturers might be in the best position to satisfy their desire for head-to-head RCTs (mean [SD] = 4.31 [1.09] of 5.0).
CONCLUSIONS: Despite a variety of data sources, drug formulary decision makers continue to rely on published RCTs or internal economic analyses as having the strongest evidence strength. The study respondents believed that pharmaceutical manufacturers would be best able to satisfy the greatest clinical data unmet need, that is, head-to-head RCTs in specialty drug formulary decisions.

Drug formularies for hospitals, long-term care facilities, and home care settings list available drugs and represent the compiled clinical judgment of health care providers.1 Managed care organizations have applied a similar critical drug assessment process and have adopted drug formularies as lists of covered medications to meet the specific needs of their program memberships.1 Formulary management within a limited budget is critical, especially for specialty drugs, which are used for serious medical conditions, are expensive, and have an annual cost increase 6 times that of nonspecialty drugs.2 Despite attempts to summarize the pertinent evidence for formulary management,3 it is uncertain whether the data needs of formulary decision makers for specialty drugs are satisfied.
Perception of U.S. payers on data available for their decision making has drawn research attention.4-6 For many years, the only clinical data available for the review of new drugs have been from randomized controlled trials (RCTs), which are often placebo controlled and are sponsored by pharmaceutical manufacturers for drug approvals. The growing availability of comparative effectiveness research (CER) does not fully satisfy formulary decision makers because of the paucity of relevant head-to-head trials, the paucity of economic data, and the low reliability of manufacturer-sponsored studies.4 Pragmatic clinical trials (PCTs) have been proposed to facilitate postregulatory decisions, but RCTs are required for drug registration and are more prevalent than PCTs in the published literature.7 The value of PCTs has been acknowledged for the general population and active drug comparisons, but they do not replace RCTs or a payer's internal data analyses.5 In addition, the perception of payers regarding different types and sources of evidence has been quantified; results indicate that RCTs and meta-analyses/systematic reviews are valued highest, although that investigation was limited to pharmaceutical technology assessment.6

In addition to RCTs, PCTs, and CER,4-6 clinical data and other information are available from the published literature, pharmaceutical manufacturers, third-party drug evaluation entities, the Academy of Managed Care Pharmacy (AMCP) eDossier System,8 international drug registry agencies, and other internal databases. Section 114 of the Food and Drug Administration Modernization Act of 1997 (FDAMA) also provides an opportunity for pharmaceutical manufacturers to expand the economic information provided to payers.9 Thus, formulary decision makers have access to several sources of data. The evaluation of these data and the decision processes vary by organization, which may result in different coverage decisions for the same drug among various health plan and pharmacy benefit management company (PBM) formularies. This variety suggests that decision makers independently make value assessments on the strength of available data and sources.

The purpose of this study was to assess the level of satisfaction of specialty drug formulary decision makers with regard to the strength of currently available data sources and unmet needs regarding clinical, economic, and unpublished evidence.

What is already known about this subject
• Drug formularies are common in any health care organization that manages the selection and use of pharmaceuticals.
• Health care professionals involved in drug formulary decisions have historically relied primarily on pharmaceutical manufacturer-sponsored registration randomized controlled trials (RCTs).
• Today, health care professionals conducting drug data analyses and comparisons have access to a variety of sources of clinical and economic evidence.

What this study adds
• Although clinical and economic evidence are increasingly available from a variety of data sources, drug formulary decision makers continue to rely on published RCTs as having the strongest clinical evidence strength.
• Regarding economic evidence strength, drug formulary decision makers rely mostly on their own organizational internal economic analyses, perhaps because they best know their business objectives and net cost of drugs.
• Pharmaceutical manufacturers appear to be in an important position to help satisfy unmet clinical data needs by including more head-to-head registration RCTs in submissions to the FDA.

■■ Methods
This study targeted pharmacists and physicians involved with drug evaluations and formulary decisions for specialty drugs. In this study, the term "specialty drugs" was defined as costly drugs requiring special handling and administration and was limited to brand-name drugs. Target organizations were national, multistate, large regional, and local health plans; health insurers; and PBMs with employer-based commercial, Medicare Part D, and/or Medicaid programs. Managed care organizations were identified based on 5 major sources: Utilization Review Accreditation Commission (URAC), National Committee for Quality Assurance (NCQA), Pharmacy Benefit Management Institute (PBMI), America's Health Insurance Plans (AHIP), and National Association of Insurance Commissioners (NAIC).10-14 An initial list of 54 health plans and 27 PBMs was further reviewed by one of the authors, and a final list included individuals from 89 health plans and 6 PBMs who were involved in drug formulary management. Ninety-five individuals were invited by e-mail to participate in an online Qualtrics survey developed by the authors of this study. The survey tool encrypted participant identities and archived responses, so survey responses were evaluated anonymously.

On February 24-27, 2014, a pretest to determine content validity of the survey was completed by 3 reviewers involved with formulary management. The finalized survey incorporated the feedback obtained from this pretest. The 21-item, web-based anonymous survey was open from June 14 to July 31, 2014. After the first 2 weeks, participants who had not yet completed the survey received a reminder e-mail. The questionnaire had the following 6 major parts (see Appendix, available in online article): 1. Characteristics of survey participants (questions 1-5). 2. Strength evaluation of clinical, economic, and unpublished evidence (questions 6-8). 3. Use of evidence-rating systems (questions 9-10).

A 5-point Likert scale was used for rating levels of strength, frequency, and agreement. A 3-point Likert scale was used for scoring usefulness and needs.

Data Analysis
The responses to each question were coded for descriptive and statistical analyses. Questions with rating scales were statistically analyzed, and responses in open text fields were collected. Statistical analyses included analysis of variance (ANOVA) and the Kruskal-Wallis test to examine differences among participants' responses. When the 1-way ANOVA indicated statistical significance, Kruskal-Wallis tests were also conducted; these assume a nonsymmetrical distribution and provide higher power than the 1-way ANOVA in that setting.15 Post hoc comparisons for the Kruskal-Wallis tests were then performed to identify the pairs with significant differences, using a SAS macro in SAS 9.4 (SAS Institute, Cary, NC).16 ANOVA and Kruskal-Wallis analyses generated the mean, standard deviation (SD), median, and interquartile range. The Mann-Whitney-Wilcoxon test was used to compare the reliance level for the 2 different liaison positions with manufacturers (see Appendix, question 19). All statistical tests were 2-sided and conducted at a significance level of P < 0.05.

This research was approved by the Institutional Review Board at the University of Florida.

■■ Results
Characteristics of Survey Participants
An online survey link was e-mailed to 95 pharmacists or physicians who were involved with specialty drug formulary decision making. Overall, the response rate was 34.7% (33 of 95), and approximately one third of the total responses were generated after sending a reminder. Forty-three respondents opened the survey (view rate = 43 of 95, 45.3%); 40 respondents initiated the survey (participation rate = 40 of 43, 93.0%); and 33 participants completed the survey (completion rate = 33 of 40, 82.5%).17 The majority of survey participants were pharmacists (85.0%, n = 34); most participants worked at health plans (77.5%, n = 31); and the primary line of business was commercial or employer-based (90.0%, n = 36). Most respondents were employed by multistate or national organizations (77.5%, n = 31). Most organizations covered a minimum of 0.5 million lives (70.0%, n = 28; Table 1).

The summed covered lives across all surveyed organizations, based on the participants' reported membership ranges in survey question 5 (see Appendix), were estimated to be at least 170 million, representing more than half of the U.S. population.
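The 170-million figure is a conservative lower bound: each organization reported a membership range, and the minimums of those ranges are summed. A minimal sketch of that calculation, using hypothetical range data (the actual per-organization responses to survey question 5 are not reproduced here):

```python
# Hypothetical reported membership ranges, in millions of covered lives.
# These values are illustrative only; the real survey responses differ.
reported_ranges = [
    (0.5, 1.0),   # an organization reporting "0.5-1 million lives"
    (10.0, 25.0),
    (1.0, 5.0),
]

# Conservative estimate: sum the minimum of each reported range.
lower_bound = sum(low for low, _high in reported_ranges)
print(f"Summed covered lives: at least {lower_bound} million")
```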

Frequency of Rating the Evidence Strength for Specialty Drug Evaluations
When asked how frequently an evidence strength grading system was applied, the most common answer was "never" (45.7%, n = 16), followed by "sometimes, when data are controversial" (28.6%, n = 10); "always" (17.1%, n = 6); "sometimes, for certain drug types" (5.7%, n = 2); and "sometimes, when quantity of data is insufficient" (2.9%, n = 1). Two respondents indicated in the open text field that ratings were used for a "very high cost, new entry in crowded therapeutic category" and that use "depends on the study and transparency." Respondents who answered "always" or "sometimes" were asked about the frequency of use of specific evidence-strength rating systems on a 5-point Likert scale ("never use," "rare use," "occasional use," "frequent use," and "always use"). The frequency of use did not vary across the following rating systems: (a) Grading of Recommendations Assessment, Development and Evaluation (GRADE) system (mean [SD] = 3.24 [1.15]); (b) assessment of the evidence for health care decision makers offered by the AMCP/ISPOR CER Collaborative18 (mean [SD] = 3.06 [0.77]); and (c) Delfini evidence tool kit (mean [SD] = 2.31 [1.11]). Beyond the given rating systems, 1 respondent answered "PBM analysis" with "occasional use" in an open text field.
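The category percentages above imply a denominator of 35 respondents for this item, which falls between the 33 who completed the whole survey and the 40 who initiated it (consistent with partial completions). A quick arithmetic check:

```python
# Reported category counts for the evidence-rating frequency item.
counts = {
    "never": 16,
    "sometimes, when data are controversial": 10,
    "always": 6,
    "sometimes, for certain drug types": 2,
    "sometimes, when quantity of data is insufficient": 1,
}

total = sum(counts.values())  # 35 item respondents
for label, n in counts.items():
    # Reproduces the percentages reported in the text (45.7%, 28.6%, ...).
    print(f"{label}: {100 * n / total:.1f}% (n = {n})")
```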

Perceived Strength of Clinical, Economic, and Unpublished Evidence
Clinical Data Sources. The perceived clinical evidence strength of 10 different sources or types was evaluated using a 5-point Likert scale ("very weak," "weak," "moderate," "strong," and "very strong"). "Published RCTs (randomized controlled trials)" were rated the strongest source (mean [SD] = 4.06 [0.87]), and "unpublished research presented at scientific meetings" the weakest (mean [SD] = 2.24 [0.75]; Table 2). The Kruskal-Wallis test showed that perceived strength was significantly different across data sources (P < 0.001). According to a post hoc analysis, "government sources" (e.g., the U.S. Food and Drug Administration [FDA], Centers for Disease Control and Prevention, and National Institutes of Health) and "external clinical reviews" (e.g., Cochrane, Hayes, and the National Institute for Health and Care Excellence) were also recognized as relatively strong sources, showing no difference from published RCTs (Figure 1).
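As an illustration of the omnibus test used throughout these comparisons, the tie-corrected Kruskal-Wallis H statistic can be computed directly from ranked ratings. This is a dependency-free sketch, not the authors' code (they used a SAS macro in SAS 9.4); the Likert ratings below are hypothetical, and the P value, which comes from a χ² reference distribution with k − 1 degrees of freedom, is omitted:

```python
from collections import Counter

def rank_with_ties(values):
    # Assign 1-based ranks, averaging ranks over tied observations.
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j + 2) / 2  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks

def kruskal_wallis_h(groups):
    # Tie-corrected Kruskal-Wallis H statistic across k groups.
    data = [v for g in groups for v in g]
    n = len(data)
    ranks = rank_with_ties(data)
    h, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        h += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * h - 3 * (n + 1)
    ties = sum(t ** 3 - t for t in Counter(data).values())
    return h / (1 - ties / (n ** 3 - n)) if ties else h

# Hypothetical 5-point Likert ratings for three data sources.
ratings = [
    [5, 4, 4, 5, 4, 3, 5, 4],  # e.g., published RCTs
    [4, 4, 3, 5, 4, 4, 3, 4],  # e.g., government sources
    [2, 2, 3, 1, 2, 3, 2, 2],  # e.g., unpublished meeting research
]
print(f"H = {kruskal_wallis_h(ratings):.2f}")
```

For production analyses, `scipy.stats.kruskal` implements the same statistic (including the P value) if a third-party dependency is acceptable.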

Economic Data Sources.
The strength of economic data was also perceived differently among 10 different sources or types (Kruskal-Wallis test, P < 0.001). "Internal financial analysis of own data" was rated the strongest (mean [SD] = 3.91 [1.07]), while "unpublished information provided directly by the manufacturer" (interpreted as "data on file") was the weakest data source (mean [SD] = 1.87 [0.94]; Table 2). Based on a post hoc analysis, "external economic reviews," "published cost-effectiveness analysis," and "PBM economic reviews and recommendations" were also perceived to be relatively strong evidence (Figure 1). Additionally, in a descriptive field, 1 respondent noted "national expert panels such as American Association for the Study of Liver Diseases treatment guidelines" as having "very strong" strength.

Satisfaction Level and Unmet Needs of Clinical and Economic Evidence
Respondents rated their satisfaction level for 7 data attributes each for clinical evidence and economic evidence, using separate 10-point slider questions. The 7 attributes for clinical evidence (survey question 14, see Appendix) assessed satisfaction with the amount of available data, accessibility, internal and external validity, interventions and exposures (e.g., active comparators), statistical analysis, and interpretation of results. Clinical data satisfaction differed by data attribute (Kruskal-Wallis test, P < 0.001), and "accessibility to data" received the highest satisfaction level (mean [SD] = 6.12 [1.95]). Post hoc analysis identified "interventions/exposures" as the least satisfactory attribute (mean [SD] = 3.32 [2.46]). For economic evidence (survey question 16), the satisfaction level was not significantly different among attributes (Kruskal-Wallis test, P = 0.348; Figure 2).
When participants were asked for recommendations on improving the clinical data available for formulary decision making, they rated "head-to-head RCTs" as the highest need (mean [SD] = 2.94 [0.25]) on a 3-point Likert scale ("low need," "medium need," and "high need"). "Observational studies" were also rated as a relatively high need, and among economic data types, "cost-effectiveness analyses" received the highest rating (mean [SD] = 2.53 [0.67]).

Figure 1. Perceived Strength of Clinical and Economic Data (Post Hoc Comparisons)
Note: For the clinical evidence panel, KW P < 0.001. Different alpha letters (alone or in combination) indicate significant differences between data sources (e.g., data source a was significantly different from data source b and data source c, as well as from the letter pair bc, which does not contain letter a). Data sources sharing a letter (alone or in letter pairs) were not significantly different (e.g., source a is not significantly different from source ab, which contains letter a). KW = Kruskal-Wallis test; PBM = pharmacy benefit manager; RCT = randomized controlled trial; SD = standard deviation.
Among respondents who had received data from manufacturers specifically under an FDAMA Section 114 disclosure, receipt varied by data type: on-label published data about the clinical and economic value of drugs (72.2%, n = 13), on-label but unpublished data (61.1%, n = 11), and off-label data (55.6%, n = 10). These respondents rated the usefulness of these data types on a 3-point Likert scale ("low usefulness," "medium usefulness," and "high usefulness"); the usefulness of these data types was not rated very high. Separately, reliance on the 2 different manufacturer liaison positions differed significantly (Mann-Whitney-Wilcoxon test, P = 0.002) when rated on a 5-point Likert scale ("never," "unlikely," "sometimes," "frequently," and "almost always").

■■ Discussion
Specialty drug management is a serious concern for U.S. payers because of high patient cost and the increasing availability of specialty drugs,19 which are estimated to account for half of total drug spend by 2018.20 This study described how payers perceive currently available clinical and economic evidence for formulary management of specialty drugs. Published RCT data were the highest-rated clinical data source for evidence strength. This finding is consistent with a previous study focusing on pharmaceutical technology assessment.6 In fact, this rating was not surprising, since RCTs have been the standard for evaluating the efficacy and safety of drugs and providing evidence for FDA approvals.21 On the opposite end of the spectrum, the perception of unpublished data as the weakest type of evidence reflects payer concerns regarding incomplete or misleading information.

When evaluating economic data sources, drug formulary decision makers relied mostly on their own organizational internal economic analyses, presumably because they best knew their business objectives and the net cost of drugs. Published economic sources may be supplemental, but they alone might not be sufficient for benefit design and formulary decisions on pharmacy benefit financial exposure. Pharmaceutical manufacturers often produce budget impact models (BIMs) as part of the AMCP dossier or as stand-alone economic models to supplement payer drug analyses. Study participants rated the strength of BIMs lower than their own internal analyses but at approximately the same level as other economic methodologies (e.g., cost-benefit analyses or cost-minimization analyses). BIMs are often considered less reliable because they are funded by pharmaceutical manufacturers.22 Similarly, the evidence strength of manufacturers' data on file was perceived as weak because of a possible lack of data transparency.23 The lower strength of cost-utility analyses (CUAs) shown in this study might be because the output of CUAs is usually quality-adjusted life years (QALYs), which are not commonly used by U.S. payers at this time. However, this may change as the economic training and experience of decision makers grow and more plans adopt value-based drug formularies.24

Figure 2. Satisfaction Level of Clinical and Economic Evidence (Post Hoc Comparisons)
Note: Different alpha letters indicate significant differences between data attributes (e.g., attribute a was significantly different from attribute b); attributes with the same letter were not significantly different. For economic data, post hoc comparison was not conducted because the KW test indicated nonsignificance. KW = Kruskal-Wallis test; SD = standard deviation.

When asked how unmet clinical data needs might be satisfied, the highest-rated response was head-to-head RCTs. Since pharmaceutical manufacturers have been the source of RCTs, the participants believed that manufacturers might be in the best position to provide head-to-head RCTs. One interpretation of this high rating is that pharmaceutical manufacturers should include an active comparator in the registration RCTs submitted for FDA approval. This is often requested by U.S. payers but, unfortunately, is an infrequent occurrence, presumably because of the high cost and risk associated with active comparator RCTs. Other data sources rated beneficial for formulary management included observational studies, cost-effectiveness analyses (CEAs), and cost-benefit analyses (CBAs). Increasing reliance on health economic and outcomes research among U.S. payers was noted in one survey.25 The funding of CER by the American Recovery and Reinvestment Act (ARRA) and the Patient Protection and Affordable Care Act (PPACA) might generate more comparative data, which was reported as an unmet need by study participants. Accordingly, payers continue to rely on published RCT data to determine efficacy but would like to have at their disposal other data that represent real-world effectiveness.

The systematic grading of evidence strength also varied considerably: a minority of participants always rated data, while about one third rated data selectively. We hypothesized that inconsistent application of evidence strength grading systems may result from (a) lack of familiarity with evidence comparison or the use of rating systems; (b) a belief that available data strength is adequate and rating is unnecessary; or (c) confidence in the ability to rate data without the use of an external rating system.

In our study, representatives of pharmaceutical manufacturers rarely mentioned FDAMA Section 114 when they presented scientific data. Other research has also reported that only a few manufacturers present data under the FDAMA (unpublished data obtained in other research projects for the pharmaceutical industry by Robert P. Navarro from 2013 to 2015). Section 114 of the FDAMA states that data delivered to a formulary committee shall "not be false or misleading and is based on competent and reliable scientific evidence."9 Although Section 114 of the FDAMA has been in existence for almost 20 years, presenting data with an FDAMA disclosure is rarely exercised and thus far has been considered of limited value. The reluctance of manufacturers to present data under Section 114 of the FDAMA might be due to the vague and inscrutable nature of the FDAMA itself.26 Nevertheless, drug formulary decision makers indicated their continuing reliance upon scientific information provided by pharmaceutical manufacturers' medical science liaisons (MSLs) and health outcomes liaisons (HOLs), especially for novel specialty drugs and orphan drugs. Despite this noted need, payers considered information delivered by MSLs or HOLs only modestly reliable.

Limitations
Our study includes concerns for generalizability as well as internal validity. First, we were not able to establish reliability for our questionnaire. Internal consistency is often determined to show how well a set of questions is correlated.27 The most common reliability estimate for internal consistency, Cronbach's alpha, was not applicable to our questionnaire, since we did not summate ratings.28 However, we were able to determine content validity, ensuring that newly developed survey items provided an adequate sample of all items that potentially assess our study aims.27 Second, with a response rate of 34.7% (33 of 95), the number of survey participants was not large, and we could not assess statistical significance for some questions because of the small number of participants. However, given that the overall response rate for online surveys is reported to be about 33%,29 our rate of 34.7% was not unusually low. Also, our target population was a limited population of health plan and PBM pharmacists and physicians known, or self-reported, to be engaged in drug analysis and formulary management. Third, although survey participants were known to participate in drug formulary decisions, we did not assess their knowledge level, which might have introduced inaccuracies. Fourth, nonresponse bias might have occurred if survey respondents differed from nonrespondents in meaningful ways; to minimize nonresponse, we sent a survey reminder to participants. Lastly, our survey participants may not represent all payers in the United States, since a limited portion of defined U.S. payers agreed to participate in this survey. This limited generalizability should be noted when interpreting survey results.

■■ Conclusions
Although clinical and economic evidence are increasingly available from a variety of data sources, drug formulary decision makers continue to rely mostly on published RCTs or internal economic analyses until additional reliable sources become available. Pharmaceutical manufacturers are encouraged to generate head-to-head active comparator RCTs to satisfy the clear unmet need for clinical evidence that is particularly desirable for informing specialty drug formulary decisions. Health services researchers also have an opportunity to satisfy unmet economic data needs by providing well-designed CEAs and CBAs that are applicable to specialty drug formulary decision makers. Drug formulary decision makers may also consider their plan benefit type, organizational business objectives, and other information when making specialty drug formulary decisions. Thus, future research designs that incorporate other factors affecting formulary decisions into contemporary means of analysis may be especially helpful as clinical outcomes-based contracts begin to influence drug value assessments.

APPENDIX: Study Questionnaire

[Economic data sources rated for evidence strength included:] Published cost-minimization analysis (e.g., least costly drug ingredient cost only); Published cost-benefit analysis (e.g., costs and benefits in dollars); Published budget impact analysis (e.g., estimate of the financial consequences)

9. Does your organization rate the strength of evidence using a rating system (e.g., Delfini, GRADE, USPSTF, others) when considering information collected from the above-mentioned data sources when evaluating clinical and/or economic data for brand specialty drugs? □ Yes, always rate the strength of evidence. [...]

10. If you answered either YES or SOMETIMES for using a rating system, please identify how frequently you used each of the rating systems below when reflecting on recent formulary decisions in your organization for novel brand specialty drugs.

14. [Clinical evidence satisfaction attributes:] Amount of available data; Accessibility to data; Internal validity (e.g., good study design); External validity (e.g., generalization to target population); Intervention/exposures (e.g., amount of research conducting head-to-head comparisons); Statistical analysis; Interpretation of results

15. Please indicate whether having access to more of the following study types that offered more CLINICAL information for brand specialty drugs would be helpful in improving your formulary decision making. (Scale: No need; 1 = Low need; 2 = Medium need; 3 = High need)

16. [Economic evidence satisfaction attributes:] Amount of available data; Accessibility to data; Establishment of effectiveness of each intervention; Identification of all relevant costs and consequences for each intervention; Appropriate perspective measuring costs and consequences (e.g., societal vs. insurer); Allowance for uncertainty in the estimates of costs and consequences (e.g., sensitivity analysis); Interpretation of results

17. Please indicate whether having access to more of the following study types that offered more ECONOMIC information for brand specialty drugs would be helpful in improving your formulary decision making. (Scale: No need; 1 = Low need; 2 = Medium need; 3 = High need) Items: Cost analysis data (e.g., economic burden); Cost-benefit analysis data; Cost-effectiveness analysis data; Cost-utility analysis data (QALYs); Others

18. Please indicate your agreement that pharmaceutical manufacturers could satisfy your unmet data needs for brand specialty drugs with the following evidence. (Scale up to: Strongly agree) Items: Meta-analysis; Head-to-head RCTs; Observational studies (e.g., nonrandomized comparative effectiveness research); Cost-effectiveness analysis; Cost-utility analyses (e.g., QALY); Other (please provide)

19. What is your organization's practice for brand specialty drugs regarding Medical Science Liaison (MSL) and Health Outcomes Liaison (HOL) presentations?

21. When encountered, are biopharmaceutical company MSLs/HOLs reliable sources to provide balanced, accurate, and compelling clinical and/or economic evidence for brand specialty drugs? Please provide your general experience regarding drug formulary decisions. (Scale up to: Almost always) Items: Clinical evidence; Economic evidence