So says a recent document from the Coalition for Evidence-Based Policy. The document goes on to describe how Randomized Controlled Trials (or RCTs) are the gold standard for measuring an intervention's true effect, and hence the handiest tool in our toolbox for identifying which interventions are effective and which ones aren’t. A different version of the same argument comes from two former government budget officials, who claim that less than $1 out of every $100 of government spending is backed by evidence.
This line of reasoning sounds logical. After all, who wants resources to go towards interventions that aren’t effective? In an age of budget scarcity, we want to be doubly sure that we are being good stewards of public (and private) dollars. And what better tool than RCTs to tell us, with high certainty, whether interventions are indeed effective?
This type of reasoning can be referred to as “seductive logic”. It’s hard to disagree with it, much less fight it. It just seems to make intuitive sense! Until, of course, you start digging underneath for assumptions.
One fundamental assumption embedded in the “RCT paradigm” is that there are several competing programs or interventions that can potentially be applied to solve a social problem, and that RCTs are the way to “separate the wheat from the chaff.” However, as we know, solutions come about not through stand-alone interventions, but through systems of interventions working in concert. For example, the education field has for years gone after different “programs” – teacher incentives, coaching, school leadership, learning communities, etc. It’s no surprise that 90% of RCTs commissioned by the Institute of Education Sciences since 2002 were found to have weak or no positive effect. Real progress has happened only where initiatives are threaded together holistically in a systems approach.
Another key assumption is that context does not matter: once RCT-tested, an “effective” program will always remain effective, and an “ineffective” program will always be a waste of resources, irrespective of the context in which it operates. What we know, however, is that context can make or break an intervention. Even within a particular context, an intervention may work for some populations but not others. A recent New York Times article titled “Do Clinical Trials Work?” explored how the field of drug testing (in some ways, the mecca of RCTs) has realized that drugs often fail trials even though they may work very well for some sub-groups, and is moving towards small clinical trials that enroll only patients with the appropriate genetic or molecular signature.
This doesn’t mean, however, that we throw out the baby with the bathwater. Experimentation, in itself, is a good thing. Small controlled experiments, which may include randomization, are a healthy part of any innovation and development process. With the proliferation of digital infrastructure, “A/B testing,” as it has come to be called, has become easier and cheaper to do. For example, through simple testing, the 2012 Obama campaign discovered that the most successful email subject line (in terms of generating campaign contributions) was simply “hey”.
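To make the mechanics concrete: at its core, an A/B test of this kind compares the response rates of two randomly assigned groups and asks whether the difference is larger than chance would explain. Below is a minimal sketch of that comparison using a standard two-proportion z-test. The counts are purely hypothetical (the campaign's actual figures were not published in this form), and the function name is our own illustration, not any particular library's API.

```python
import math

def two_proportion_ztest(conv_a, n_a, conv_b, n_b):
    """Two-sided z-test for the difference between two conversion rates.

    conv_a / n_a: conversions and recipients for variant A
    conv_b / n_b: conversions and recipients for variant B
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF (via the error function)
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value

# Hypothetical example: 10,000 recipients per subject line
z, p = two_proportion_ztest(conv_a=120, n_a=10_000,   # subject line A
                            conv_b=168, n_b=10_000)   # subject line "hey"
print(f"z = {z:.2f}, p = {p:.4f}")
```

With these made-up numbers, the "hey" variant's higher rate yields a small p-value, i.e. a difference unlikely to be random noise. The same logic, randomize, measure, compare, is what makes cheap digital experiments so attractive.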
The problem arises when dogma becomes attached to a particular experimental design, anointing it superior to all alternatives. As a field, we have raised legitimate arguments against the use of RCTs, but they often verge on the tactical (too expensive, hard to pull off logistically, possibility of spillover effects, tough to deny services to some populations). While reasonable, these arguments can be overcome through better design and increased resources. The time is now ripe to examine the fundamental assumptions raised above and really ask ourselves, “Even if we could overcome the tactical issues, are RCTs the right tool to measure effectiveness in the social sector?”