One of the most important developments in medicine has been an increasing push for transparency about clinical trials and their underlying data. Among many examples, the National Institutes of Health are about to issue a new rule mandating greater disclosure of clinical trial results, and the Food and Drug Administration (FDA) is considering ways to allow greater access to the clinical trial data it possesses. And in a surprise announcement on January 20, 2016, the International Committee of Medical Journal Editors—representing all the top medical journals in the world—issued a joint proposal to require any authors of clinical trial reports to share the individual patient data “no later than 6 months after publication.”
That announcement, however, was followed the next day by an editorial in the New England Journal of Medicine (NEJM) that took a much more jaded view of data sharing. In the editorial, NEJM editors Jeffrey Drazen and Dan Longo said that while data sharing was a “moral imperative,” they were worried about how data sharing might play out. Most controversially, they suggested that with more data sharing, a “new class of research person will emerge,” namely, so-called “research parasites” who either steal “from the research productivity planned by the data gatherers” or “use the data to try to disprove what the original investigators had posited.” They concluded that data sharing works “best” when other scholars develop novel research interests, work with whoever collected the original data to test a new hypothesis, and then co-author a new paper together.
The editorial was quickly mocked by scholars as “deranged,” “sleazy,” “unbelievable,” “anti-scientific,” and occasionally adjectives that would be impolite to repeat. Two scholars at Berkeley and Harvard called for the authors to resign.
Some ‘Disproven’ Examples
This heated reaction had a point. In numerous recent cases, published clinical trials have been “disproven” or corrected (and it’s a good thing, too) when later scholars have taken a second look at the data. For example, an influential study published in 2001 said that paroxetine (also known as Paxil) could be used to treat teenagers with depression. A re-analysis of the clinical trial data, published in September 2015, showed that Paxil had actually been causing significant harm.
For another example, published clinical trials on Tamiflu had suggested that the drug could prevent an array of harms from the flu, including hospitalization and death, but a massive re-analysis published in BMJ in 2014 showed that the overall effect of Tamiflu was merely to shorten a typical flu experience by 14.4 hours. For yet another example, Duke’s Anil Potti infamously published a number of papers claiming to have found genetic markers that would guide the treatment of various cancers, and it took other researchers a great deal of time and effort to discover that some of the data had been falsified.
In cases like these, the researchers who spent countless tedious hours digging through data should not be referred to as “research parasites.” They should be given an award for benefiting patients’ lives and health.
The Risks And Tradeoffs
Yet for all of this, the NEJM authors have a point, too — one that should be acknowledged by their many critics (including me). As David Shaywitz pointed out in Forbes, Drazen and Longo were merely saying out loud what many have thought privately — that even if data sharing is a net benefit, there are still risks and tradeoffs. Someone who has spent the better part of a decade raising funds, managing, and analyzing a large clinical trial might reasonably worry about what could happen if he makes the data openly available. For example, someone might scoop the original trialists in the race to write a “top” journal article. However remote this possibility, it’s only human nature for the original trialists to worry about it.
Even worse, someone who is unqualified to analyze data or who is inspired by professional malice might take a perfectly valid clinical trial, redo the analysis in an unjustified way, and then announce that the trial has been “disproven” when no such thing has occurred. Leave aside anti-vaccine activists, homeopaths, and the like: even university researchers manage to publish specious papers regularly, and it is not irrational to worry that with more data sharing, more people will incorrectly claim to have debunked a valid clinical trial.
Both I and the foundation for which I work are passionate about supporting data sharing, to be sure, and we have no doubt that data sharing will be an overwhelming net positive. Rather than be dismissive, insulting, and self-righteous towards anyone who expresses any reservations or doubts, those of us who support data sharing should figure out how to ensure that data sharing works to everyone’s benefit.
Preventing Misuse Of Data
As for the misuse of data, it is important to remember that no one has ever suggested that clinical trial data be publicly posted with no review process. Instead, data sharing efforts to date have always included many safeguards against the misuse of data.
At the Yale Open Data Access project (YODA), for example, researchers who want access to Medtronic or J&J data must complete a training module, submit a research protocol and statistical analysis plan complete with a list of the “specific hypotheses to be evaluated,” provide conflict of interest statements, and more. YODA’s website states that it will not only review proposals for scientific merit, but will “publicly post information about who has obtained access to the data and for what scientific purpose.”
Similar processes exist at other data sharing portals, and I see little reason to fear the as-yet entirely hypothetical risk that anyone will exploit the system so as to attack and undermine high-quality research. As noted above, the most famous secondary uses of clinical trial data were crucial to exploring shortcomings in how the original trials were reported. In any event, the NEJM editorial does not provide even a single example of spurious secondary uses of clinical trial data.
As for the fear that data collectors will be cheated of publication opportunities, there is a simple solution: hiring committees and especially tenure and promotion committees at universities should not merely reward scholars for the number of their own authored publications, but should reward any scholar for creating and making public a dataset (or software, for that matter) that others in the field find useful. Indeed, someone who publishes one clinical trial with a dataset that is then used for 20 other publications should be rewarded just as if she had written all those 20 publications herself.
The same goes for NIH review of grant proposals. Peer review panels and program officers should explicitly score proposals far higher when the scholar in question has a track record of having made data available at trusted repositories. Moreover, the NIH should either create or support a general repository for clinical trial data, so that academic trialists will have an obvious place to deposit data while knowing that legal and privacy requirements will be respected.
If universities and funding agencies adopt these rewards for data sharing, trialists will no longer have any basis for complaining that a new journal requirement is forcing them to do extra work for someone else’s benefit. Instead, they will realize that data sharing is in their own self-interest as well.
Given that self-interest is an unavoidable part of human nature, we might as well harness that interest on behalf of data sharing, rather than making some people feel as if they are penalized in the academic and funding rat race if they do the good deed of sharing.