In all the attention given in the past couple of years (to say nothing of the past couple of months) to the individual mandate and its effect on health care access, other key features of PPACA, which address other aspects of the health care system, have received less attention than they deserve. Two remarkable things happened yesterday that highlight the importance of addressing health care quality.
First, newly published research suggests that each year in the U.S., more than 100,000 men undergo expensive, invasive surgeries that leave many of them impotent and/or incontinent, all without increasing their chances of survival or providing other clinical benefits.
The New England Journal of Medicine published the results of what one oncologist not involved with the study called a potentially “game-changing” experiment involving men with early-stage prostate cancer. Most new diagnoses of prostate cancer are early-stage, and about 90% of those who receive this diagnosis choose immediate treatment with surgery (or radiation). But despite the considerable financial and human costs of the surgery, it’s not clear that the surgery actually improves patients’ odds or duration of survival.
To examine that question, investigators randomly assigned 731 men either to surgical removal of the prostate or to an observation group, where the cancer was monitored but not treated unless it showed signs of progressing. After 15 years, there was no statistically significant difference between the two groups in terms of either overall mortality or mortality due to prostate cancer. A secondary analysis, however, suggested that surgery may benefit men with high prostate-specific antigen (PSA) levels.
The results are complex, and an accompanying editorial argues that, with a sample of only 731, the study was insufficiently powered to support conclusions about the comparative efficacy of surgery and watchful waiting. Notably, the investigators sought 2,000 subjects but could recruit only 731, presumably because, in the face of a cancer diagnosis, being randomized to “watchful waiting” seems risky. But that is precisely the point of health services research, and of analogous research in other areas, including legal services (see, e.g., Greiner and Pattanayak’s RCT of Harvard legal aid offers): the option that seems most effective, often on the basis of little more than intuition (and perhaps a cognitive bias that favors action over inaction), is not always the most effective in fact.
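To see why recruiting 731 subjects rather than 2,000 matters, here is a minimal sketch of an approximate power calculation for comparing death rates in two equal-sized arms. The two death rates are hypothetical values chosen purely for illustration; they are not the trial's figures, and the normal-approximation formula is a textbook shortcut, not the method the investigators or the editorial used.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_power(p_control, p_treated, n_per_arm, alpha=0.05):
    """Approximate power of a two-sided z-test comparing two proportions."""
    std_normal = NormalDist()
    # Standard error of the difference in observed proportions (unpooled).
    se = sqrt(p_control * (1 - p_control) / n_per_arm
              + p_treated * (1 - p_treated) / n_per_arm)
    z_effect = abs(p_control - p_treated) / se
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability that the test statistic lands beyond either critical value.
    return std_normal.cdf(z_effect - z_crit) + std_normal.cdf(-z_effect - z_crit)


# Hypothetical death rates, assumed only for illustration (not the trial's data).
P_OBSERVATION = 0.50
P_SURGERY = 0.45

for total_subjects in (731, 2000):
    power = two_proportion_power(P_OBSERVATION, P_SURGERY, total_subjects // 2)
    print(f"{total_subjects} subjects: power is roughly {power:.2f}")
```

Under these assumed rates, the smaller trial would detect a true five-point difference well under half the time, while the trial the investigators originally planned would detect it more often than not.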
The second remarkable thing that happened yesterday is that the House proposed to gut federal involvement in health services research, a diverse category of research of which the prostate cancer study is but one example. And this, despite the fact that, historically, health services research has received only about 1% of the budget of NIH, the largest funder of basic biomedical research. Health services research evaluates, and often compares, actual health services delivery systems and interventions (e.g., treatments, diagnostics, or screening and other preventative measures), as they are implemented in actual practice settings.
The House Subcommittee on Labor, Health and Human Services, Education and Related Agencies passed the spending bill for fiscal year (FY) 2013; it goes to full committee next week. Among other cuts, section 227 of the bill (see pp. 90-93) eliminates, effective October 1, 2012, the uber-wonky, but critically important, Agency for Healthcare Research and Quality (AHRQ). Located within HHS, AHRQ (which had a $405 million budget in FY 2012) is charged with improving health care quality, safety, efficiency, and effectiveness. AHRQ is the major funder of health services research. For instance, it helped fund the prostate cancer study discussed above, among many, many others. It also studies, and tries to prevent, the medical errors that kill more than 100,000 people a year. The House bill, moreover, prohibits other agencies within the subcommittee’s jurisdiction from assuming AHRQ’s duties unless they already have statutory authority to do so.
The vote to pass the bill fell along party lines, with the exception of Rep. Jeff Flake (R-Ariz.), who sided with the Democrats in voting against the bill because he wanted even deeper cuts. And indeed, one possible explanation for the House’s decision is that cutting AHRQ is a somewhat lamentable, but ultimately wise, choice in tight economic times. After all, why spend scarce time and money studying an intervention after it has been marketed, often after having endured rigorous approval processes like those imposed by the FDA?
This thinking is short-sighted, and dangerous. Much medical practice (surgery, for instance) was never subjected to rigorous study before it became the standard of care. Much of medicine, in short, isn’t evidence-based. Hence the need for an active movement towards “Evidence-Based Medicine.”
Of course many other interventions — like drugs, and some devices — are subject to randomized, controlled trials (RCTs) before they are approved for marketing and use in medical practice. But for several reasons, traditional, premarket RCTs should be the beginning, rather than the end, of medical testing.
First, and most obviously, when the FDA finds a drug to be effective, generally the only thing it determines is that it is superior to an inactive placebo. (In the case of medical devices, FDA approval may only mean a determination that the new device is substantially equivalent to an existing device.) Placebo-controlled trials provide the least noisy evidence of causation—that is, that whatever felicitous outcomes trial subjects experience are in fact caused by the drug or device rather than some confounding factor or the placebo effect. In other words, RCTs have strong internal validity. But such trials do not say much about how the new intervention compares to existing ones that address the same condition.
Second, enrollment criteria for traditional premarket RCTs are often quite stringent. For instance, investigators look for “treatment-naïve” subjects so that the trial results won’t be “compromised” by subjects’ past or current use of other interventions. They also use many subjects who are healthy and not patients at all, and when they do use patients, they tend to prefer those who don’t suffer from any other co-morbidities. But in the real world, patients have multiple health issues and they try multiple interventions to address those conditions. And so the results of RCTs, despite their high internal validity, may not generalize well to actual patients. In other words, traditional RCTs’ strong internal validity often comes at the expense of weakened external validity.
Third, and similarly, the environment in which traditional RCTs are conducted is often highly controlled. Investigators are trained to strictly adhere to the protocol, while subjects may be monitored for proper compliance. In the real world, by contrast, clinicians and health care centers vary in skill, time, and other resources, and these variations can affect the safety and/or efficacy of interventions. Similarly, patients vary in their levels of compliance and in other potentially relevant ways.
Fourth, investigators don’t often follow subjects’ outcomes for very long after the conclusion of a trial, and trials still involve relatively modest numbers of subjects (despite the FDA requiring larger and larger Phase III trials). Some side effects emerge only after a substantial period of time, and some are rare enough to become detectable only in a very large sample.
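The sample-size half of this point is simple arithmetic. As a rough sketch, assume a side effect strikes each subject independently at some fixed rate; the chance of seeing it even once among n subjects is then 1 - (1 - rate)^n. The rate and the two population sizes below are hypothetical, picked only to show the shape of the problem.

```python
def prob_at_least_one_event(event_rate, n_subjects):
    """Chance of observing one or more events among n independent subjects."""
    return 1 - (1 - event_rate) ** n_subjects


# Hypothetical rate: a side effect that affects 1 in 10,000 people.
RARE_RATE = 1 / 10_000

# Hypothetical sizes: a premarket trial versus post-market use.
for n_subjects in (3_000, 100_000):
    chance = prob_at_least_one_event(RARE_RATE, n_subjects)
    print(f"{n_subjects:>7,} subjects: {chance:.1%} chance of seeing at least one case")
```

On these assumptions, a trial of a few thousand subjects would most likely never see the side effect at all, whereas it would almost certainly surface once the intervention is in wide use.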
Fifth, premarket approval processes, where they exist at all, are run by humans and so are imperfect in other ways. Regulators’ approval decisions are sometimes influenced by industry, unusually powerful patient advocacy groups, or other special interests, or are based on other political considerations, rather than sound science. Continued study of health services in the practice setting serves as a check on such regulatory bias.
Sixth, RCTs compare the average results for the active arm(s) of the trial to the average results for the inactive arm(s). When one arm is declared safer or more effective than the other — often, on the basis of only a small statistically significant difference between the two groups — this aggregation of outcomes masks individual differences within each group: subjects in the placebo arm who did well, say, and subjects in the active arm who did not do well.
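A toy calculation makes the masking concrete. The sketch below assumes a population made up of two hypothetical subgroups with different true responses to a treatment; the shares and effect sizes are invented for illustration and come from no actual trial.

```python
def average_effect(subgroups):
    """Population-average effect from (population_share, effect) pairs."""
    total_share = sum(share for share, _ in subgroups)
    return sum(share * effect for share, effect in subgroups) / total_share


# Hypothetical subgroups: (share of population, effect on some outcome score).
subgroups = [
    (0.30, 10.0),  # a minority who benefit substantially
    (0.70, -1.0),  # a majority who are slightly worse off
]

ate = average_effect(subgroups)
harmed_share = sum(share for share, effect in subgroups if effect < 0)

print(f"Average effect across everyone: {ate:+.1f}")
print(f"Share of people made worse off: {harmed_share:.0%}")
```

Here the treatment looks beneficial on average even though, by construction, most of the people who take it do slightly worse. A trial that reports only the difference in arm averages cannot distinguish this situation from one in which everyone benefits modestly.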
And so there are lots of excellent reasons, which should be obvious to anyone who cares to consider the matter even briefly, to continue to study medical interventions of all kinds long after any approval process they may have endured. Indeed, what we really need to do is to fully integrate research into the practice setting and to continually update standards of care based on the results of that research, in order to establish what the Institute of Medicine has called a “learning healthcare system.” Although, as this 2003 Health Affairs article* explains, AHRQ has had a turbulent political history, it hasn’t faced an existential threat like this for almost two decades. And Director Carolyn Clancy has served during both the Bush and Obama administrations.
I suspect that the House members who voted to eliminate AHRQ did not do so because they believe that health services research is trivial or redundant. To the contrary, it seems likely that their vote reflects some mixture of a misguided belief about how the results of just one type of health services research — comparative effectiveness research (CER) — might be used, and the political expediency of “defunding Obamacare,” as the press release accompanying the bill puts it. A bit of evidence for the former motivation comes from section 217 of the bill (see p. 84), which prohibits any federal agency within the subcommittee’s jurisdiction from funding “patient-centered outcomes research.”
That curious phrase is an allusion to the Patient-Centered Outcomes Research Institute (PCORI), one of PPACA’s lesser-known features. PCORI evaluates and compares health outcomes and the clinical effectiveness, risks, and benefits of two or more “health care interventions, protocols for treatment, care management, [or] delivery, procedures, medical devices, diagnostic tools, pharmaceuticals (including drugs and biologicals), integrative health practices, [or] any other strategies or items being used in the treatment, management, and diagnosis of, or prevention of illness or injury in, individuals.” In other words, PCORI conducts and sponsors CER. Its mission is “to assist patients, clinicians, purchasers, and policy-makers in making informed health decisions by advancing the quality and relevance of evidence concerning the manner in which diseases, disorders, and other health conditions can effectively and appropriately be prevented, diagnosed, treated, monitored, and managed . . . .”
PCORI was initially to be a government entity called the Comparative Effectiveness Research Institute. But, although CER and cost-effectiveness research differ, opponents managed to equate CER and rationing in the minds of many Americans. Fears of the “death panels” that CER would supposedly make inevitable led to the stripping of CER from PCORI’s name, to a prohibition on the conduct of cost-effectiveness research by PCORI, to restrictions on how the Secretary of HHS may use the results of PCORI research to determine coverage or set priorities under Medicare, and to the decision to establish PCORI as a “nonprofit corporation” that, PPACA is at pains to make clear, “is neither an agency nor establishment of the United States Government.”
Talk of death panels has never really gone away. PCORI is funded from a special government trust fund, also established by PPACA, from which PCORI may draw without further explicit appropriation by Congress. (The fund itself is fed by a yearly fee of $2 per covered life paid by Medicare, commercially insured patients and the self-insured, starting in 2013.) Had PCORI required additional annual appropriations, no doubt yesterday’s House bill would have defunded it out of existence along with AHRQ. Instead, the subcommittee must be content simply to prohibit any other agencies within its jurisdiction from engaging in CER. (Not coincidentally, AHRQ was authorized to develop a CER program through the Medicare Modernization Act of 2003, and funding for this new program was first appropriated in FY 2005. PCORI was in a way an extension of AHRQ’s CER program; PPACA provides that PCORI may contract with any governmental or private entity to conduct or fund the research, but that it must prioritize contracts with AHRQ and NIH, whose heads both serve on PCORI’s Board of Governors.)
It is, of course, true that one can use CER results to deny insurance coverage or treatment options, or to raise insurance rates. But that is not the end toward which CER inevitably leads. And it is true that some CER results have been controversial and the subject of disagreement among experts. Earlier studies suggesting that PSA-based prostate cancer screening may do more harm than good, and the FDA’s decision to withdraw its approval of the drug Avastin for breast cancer, are notable recent examples. Indeed, we should continue to debate the proper use of the results of comparative effectiveness and other health services research. But the solution to one problematic use of knowledge cannot be to avoid learning altogether. Health services research may or may not be used to address the problem of health care cost, but we should all work for it to address the problem of health care quality. Evidence-based practice, in medicine and elsewhere, should be a bipartisan goal.
Here, I’ve departed from my planned blogging to focus on timely political challenges to evidence-based medicine. In a future post, I’ll have more to say about legal and ethical challenges to evidence-based practice in a variety of fields.