- Health Affairs Blog - http://healthaffairs.org/blog -

The Million Veteran Program: Building VA’s Mega-Database for Genomic Medicine

Posted by Joel Kupersmith and Timothy O'Leary on November 19, 2012 @ 2:29 pm in All Categories, Health IT, Research, Science and Health, Technology

This year marks the fiftieth anniversary of Watson and Crick (and Wilkins) being named Nobel Prize recipients for discovering the structure of DNA, the molecule that carries the genetic code. In the half century since, there has been an exponential growth of knowledge and accomplishment based on their findings. More recently, a confluence of scientific and technical advances has made possible vast progress in our understanding of human disease, its diagnosis, and its most effective treatments. Among these advances are genetic testing, high performance computing platforms, and the electronic health record (EHR), which together offer the possibility of clinically rich databases that link genetic information to treatment outcomes.

These and other advances have made it clear that the genetic predispositions to adult diseases are in many cases extremely complex. In its early phases, human genetics focused on single genes for single diseases that generally occurred in childhood, e.g., Tay-Sachs disease. The genomics of adult diseases, such as coronary heart disease, is far more complex, reflecting multigene interactions and strong environmental influences (e.g., lifestyle and exposures) that may in some cases result in organ-specific “epigenetic” changes that modify DNA.

A prominent example of how these various factors come together can be seen by looking at diabetes. Having a gene variant associated with diabetes may modestly increase one’s chances of developing this condition from, let us say, 6 to 12 percent. But whether diabetes actually results is influenced by additional factors, such as the sequences of other genes, environmental influences (such as diet and exercise), and age.

New Databases And Technologies

Having a large database makes it likely that there will be enough research subjects to account for these numerous individual factors, thereby providing a more solid basis for predicting who will or will not develop diabetes and how the disease will manifest itself in a given subject.  In addition, large databases enable the confirmation of smaller studies insufficiently powered to validate discoveries. This is important because many genetic associations that are identified in early studies are not confirmed in subsequent studies [1].
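
To make the point about statistical power concrete, the sketch below estimates how many subjects per group a simple two-group comparison would need in order to detect a difference in disease frequency. It is an illustration only, not a method described for MVP. It uses the standard normal-approximation formula for comparing two proportions, borrows the hypothetical 6 and 12 percent risks from the diabetes example above, and assumes 80 percent power. The function name and the two significance thresholds (0.05 for a single test, and 5e-8 as a commonly used genome-wide threshold) are assumptions chosen for illustration.

```python
import math
from statistics import NormalDist


def subjects_per_group(p1, p2, alpha=0.05, power=0.80):
    """Approximate subjects needed per group to detect p1 vs p2 (two-sided test).

    Uses the unpooled normal-approximation formula for two proportions.
    """
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


# Hypothetical risks from the text: 6 percent without the variant, 12 percent with it.
single_test = subjects_per_group(0.06, 0.12)
# Assumed genome-wide threshold, reflecting the many variants tested at once.
genome_wide = subjects_per_group(0.06, 0.12, alpha=5e-8)

print(single_test, genome_wide)
```

The stricter threshold raises the requirement several-fold, and dividing a cohort by the additional factors mentioned above (other genes, environment, age) shrinks each comparison group further, which is one reason very large databases are attractive.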

A number of large databases and biorepositories have been established to pursue these and other objectives, including the UK Biobank [2], the Kaiser Biobank [3], the Marshfield Clinic Biobank [4], and the Iceland Biobank [5].

The development of robust research biobanks has been accompanied by the development of increasingly rapid and powerful techniques for genetic and other biomarker identification, such as genome-wide association studies [6], whole genome sequencing [7], and studies that examine how gene expression (effects) is mediated via “proteomics” and “metabolomics” [8]. Whole genome sequencing, in particular, has rapidly become less expensive and will almost certainly cost no more than a few hundred dollars within a few years. Interpreting whole genome sequences remains difficult, however, especially when genetic and clinical information is spread over many databases.

Against this background of scientific and technological advancement, the Department of Veterans Affairs (VA) has taken advantage of its unique resources to launch the Million Veteran Program (MVP).  As its name suggests, MVP is a mega-database that will hold genomic and clinical information for future studies about veterans — more specifically, veterans who receive their care from VA and who volunteer to participate in the program.  Launched in May 2011 by VA Chief of Staff John Gingrich, MVP now includes 40 of the 107 VA medical centers (VAMCs) that have the capacity for research (of 152 VAMCs total).

VA is especially well qualified to create such a resource. First, VA has a large healthcare system which, as of 2012, included more than 8.5 million enrollees. (On an annual basis, VA treats more than 6 million unique patients.) Second, VA has had an EHR for its enrollees for more than 15 years and thus has considerable longitudinal clinical data to correlate with genetic data. Finally, VA has an embedded, federally funded research enterprise (the VA Office of Research and Development, ORD) noteworthy for its established clinical trials program and for a central Institutional Review Board (IRB), which removes the need to obtain approval at each individual site before conducting multisite trials. Potential veteran participants are drawn from a large, diverse patient population in the VA healthcare system, one that includes the full range of complications seen in health care.

Veteran Attitudes Toward Genomic Research

A crucial part of the early MVP planning process was assessing the attitudes and concerns of veterans toward genomic research as well as gauging their level of support. A survey [9] conducted by the Johns Hopkins Genetics and Public Policy Center showed that 83 percent of veterans supported the program and 71 percent indicated they would participate in it. The survey also found that 84 percent of veterans thought the database would lead to improved treatments for veterans and 80 percent believed the program was important. Willingness to participate correlated with altruistic characteristics such as being a blood and/or organ donor. Further, the survey showed that 96 percent felt that receiving information about their health was important.

Significantly, the survey also revealed a strong concern among nearly all veterans (93 percent) regarding privacy and security of information. In summary, the survey reflected substantial veteran support for MVP and provided an important basis for moving forward, but also revealed key concerns about privacy. This information was critical to MVP planning.

Enrollment In The Million Veteran Program

Enrollment is conducted as follows: Veterans at the 40 participating VAMCs receive a letter asking if they would like to join the program. They may opt in or opt out. If they opt out, there is no further contact. If they opt in, they fill out a brief health survey and schedule an appointment at the VAMC to coincide with their next medical visit. During this visit, they are counseled about the program and given the opportunity to consent to be in the database and to make their medical records available for future studies. Also during this visit, they have blood drawn (serum and buffy coat).

As of October 29, 2012, MVP had 103,102 enrollees.  Of those responding to the request letter, 59 percent opted in while 41 percent opted out.  Notably, 14 percent of the enrollees were “drop-ins,” that is, veterans who hadn’t received a letter, but who had heard about the program and came in on their own.  This is in keeping with the altruistic nature of the veteran population. Also noteworthy is the fact that research subjects include the VA Secretary, Deputy Secretary, and Chief of Staff.

What are the characteristics of enrollees? Exhibits 1 and 2 show certain demographics of the MVP population at the 70,466 enrollee mark. Exhibit 1 shows that, compared to the MVP sites and VA health system population, the 18-49 year age groups are underrepresented, with the highest proportional recruitments in the 60-69 year age groups. Exhibit 1 also shows that female recruitments are proportional to the veteran population.

[10]

Exhibit 2 demonstrates that African Americans are enrolled at a lower rate than their proportion of the population at the recruitment sites. Exhibit 2 also shows that Hispanic Americans are enrolled in MVP at a slightly higher rate than their proportion.

[11]

To help manage the vast amount of data being collected, software in a secure IT platform called GenISIS (Genomic Information System for Integrated Science) has been developed. The software manages study enrollment, mail and call centers, consents, and genomic data sets. Moving forward, it also will be used to identify subjects within the database for specific studies, manage clinical data, hold the data repository, facilitate biorepository operation, and host bioinformatic software to facilitate genetic analysis. Further, the software will provide potentially broad resources for use of genomics (personalized medicine) in the healthcare system while at the same time assuring a secure environment. Importantly, and in keeping with veterans’ concerns about privacy, MVP analyses will be stored behind a secure firewall. Researchers accessing these analyses will see data that do not carry name, social security number, address, or date of birth. Additional policies regarding researcher access are now being formulated within the IT platform. Privacy and security, which have their own complexities, are under continuous discussion in VA as they are in the public square.
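
As a rough illustration of the de-identification described above, the sketch below strips direct identifiers from a record before it is handed to a researcher. This is not GenISIS code. The field names, the `deidentify` function, the sample record, and the choice to label each record with a random study ID are all hypothetical, and a real system would rely on many additional safeguards.

```python
import uuid

# Hypothetical field names for the four identifiers named in the text.
DIRECT_IDENTIFIERS = {"name", "social_security_number", "address", "date_of_birth"}


def deidentify(record, study_id=None):
    """Return a copy of `record` without direct identifiers, labeled by a study ID."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Assumption: a random ID lets analyses refer to a subject without naming them.
    cleaned["study_id"] = study_id or uuid.uuid4().hex
    return cleaned


# Entirely made-up example record.
record = {
    "name": "Jane Doe",
    "social_security_number": "000-00-0000",
    "address": "1 Example St",
    "date_of_birth": "1950-01-01",
    "conditions": ["diabetes"],
    "genotype_file": "sample_0001.vcf",
}

research_view = deidentify(record)
print(sorted(research_view))
```

The original record is left untouched, so in this sketch the link between a study ID and a person would exist only wherever the identified records are kept.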

Utilizing The Million Veteran Program Database

How will VA make use of the MVP database? The following are the types of studies that will be undertaken:

  • Identification and validation of genomic associations. These studies are fundamental both to determining which genes are associated with a given condition and to the genomic customization of treatments. One example is pharmacogenomic studies, which determine how genetics is related to the body’s handling of drugs, a factor in drug effects. For instance, having a particular gene may mean the body cannot efficiently eliminate a certain drug, ultimately leading to drug toxicity.
  • Studies validating the practical value of using genomic data in guiding therapy. For example, let’s presume there’s an association between a gene and a clinical issue. What, specifically, is the value of that information in treating patients and how does it reduce complications? Here, having VA’s large clinical healthcare system and its research program, with extensive genetic and clinical trials experience, under one roof makes MVP an especially powerful resource.
  • Studies of specific deployment-related conditions, e.g., post-traumatic stress disorder (PTSD), as well as chronic conditions seen in the VA health care system and other health care systems nationwide. In addition, this large cohort will provide data on issues beyond genomics, such as early surveillance of unusual symptoms that may signal toxic exposures resulting from deployment.

Perhaps the greatest promise of human genomic research lies in studying biologic pathways; i.e., the pathways through which genetic effects are implemented (including “proteomics” and “metabolomics”).  Such studies will give researchers a more profound understanding of the basic biology of disease and create unmatched possibilities for specific disease treatments.  These studies may even lead to changing the way disease is considered and classified.

Selection of research subjects for individual studies will be based on genetic and other individual characteristics, including those identified through use of the VA EHR in conjunction with the software described above. (Note: the general approach used by ORD to choose studies is an investigator-initiated process [12].) The VA EHR will aid not only in selecting subjects, but also in identifying potential research topics. To date, a study of schizophrenia and bipolar disorder has been started, and there are plans to conduct studies on PTSD, for which genetic predispositions are already known.

Looking Forward

We estimate that MVP will take five to seven years to reach its enrollment goal of one million. Ten more VAMC sites will be added as enrollment centers in the near future, and we anticipate that the VA’s community-based outpatient centers will be utilized as well.  More information, including educational materials, a newsletter and study hand-outs can be found here [13].

In the 50 years since Watson, Crick and Wilkins received the Nobel Prize for discovering the structure of DNA, biologic understanding of disease has progressed substantially. Today, that understanding points toward numerous and exciting possibilities, from identifying new and better ways of looking at and classifying disease to developing and utilizing new treatments. As we mark the fiftieth anniversary of that recognition, it is hoped that MVP will serve as a vehicle for placing those possibilities within reach.


Article printed from Health Affairs Blog: http://healthaffairs.org/blog

URL to article: http://healthaffairs.org/blog/2012/11/19/the-million-veteran-program-building-vas-mega-database-for-genomic-medicine/

URLs in this post:

[1] many genetic associations that are identified in early studies are not confirmed in subsequent studies: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC2865141/

[2] UK Biobank: http://www.ukbiobank.ac.uk/about-biobank-uk/

[3] Kaiser Biobank: http://www.dor.kaiser.org/external/DORExternal/rpgeh/index.aspx

[4] Marshfield Clinic Biobank: http://www.marshfieldclinic.org/chg/pages/default.aspx?page=chg_pers_med_res_prj

[5] Iceland Biobank: http://grants.nih.gov/grants/icelandic_research.pdf

[6] genome-wide association studies: http://www.ncbi.nlm.nih.gov/pubmed/17916250

[7] whole genome sequencing: http://genomemedicine.com/content/2/1/3

[8] examine how gene expression (effects) is mediated via “proteomics” and “metabolomics”: http://www.ncbi.nlm.nih.gov/pubmed/21311341

[9] A survey: http://www.ncbi.nlm.nih.gov/pubmed/19346960

[10] Image: http://healthaffairs.org/blog/wp-content/uploads/Kupersmith-OLeary-Exhibit-1.jpg

[11] Image: http://healthaffairs.org/blog/wp-content/uploads/Kupersmith-OLeary-Exhibit-2.jpg

[12] the general approach used by ORD to choose studies is an investigator-initiated process: http://www.ncbi.nlm.nih.gov/pubmed/21184863

[13] found here: http://www.research.va.gov/mvp/