I’ve worked in my small-town general practice in Scotland for ten years, combining clinical work there with a university post. In many ways my practice is pretty average. We have 5,700 patients, of all ages and from a mix of socioeconomic backgrounds. As in virtually all other British general practices, patient care is shared across our team of physicians and nurses, and we’ve used a form of electronic medical record (EMR) for many years. However, historically we didn’t make full use of the skills of our nurses, or exploit our EMR to its full capacity. For example, our nurses did tasks like phlebotomy that could easily have been done by a less skilled person, and we primarily used our EMR for patient registration, prescribing and morbidity coding, while maintaining a parallel paper record for free-text clinical notes (the practice finally went paperless in 2005). Although we measured and tried to change areas of care that concerned us, quality improvement was often piecemeal and wasn’t always sustained. But we were proud of our practice, and reasonably comfortable in our belief that we provided high-quality care.

In February 2003, the morning mail brought a copy of the Blue Book to me and every other general practitioner (GP) in the UK. It invited us to vote for a new contract with the National Health Service. The proposed contract had many attractions, not least the chance to increase practice income by about 20 percent. However, all of that increase depended on our delivering high performance as measured by the 147 indicators in the Quality and Outcomes Framework (QOF). Although we’d all known the broad outline of the contract for some months, the detailed proposals were met with a storm of criticism from rank-and-file GPs, though in my practice uncertainty rather than rage was the dominant emotion. After some expedient changes to pacify the most discontented, GPs voted 4 to 1 to accept the contract, thereby committing ourselves to being the subjects of the world’s largest health care pay-for-performance experiment.

How does UK pay-for-performance work?

In principle, QOF is very simple. Every QOF indicator has a points value attached, reflecting its relative importance. Together the indicators add up to 1,050 points, and for an average-size practice (about 5,400 patients, about three physicians) each point was worth £75 (approximately $140) in 2004/5, rising to £120 ($230) in 2005/6. (Currency conversions reflect exchange rates prevailing when this post was written.) Payment doesn’t depend on comparisons with other practices, so we know that we’ll get paid for any work done. We’re also allowed to exclude patients from measures for a range of reasons, which means that practices aren’t penalized for serving sicker, older, or more socioeconomically deprived populations. To get paid, we have to record all relevant data for every patient in our EMR. All practices get a “light touch” regulatory inspection annually, and a random 5 percent are additionally subject to a detailed payment verification audit (as are practices where there are concerns about data and payment accuracy).
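The payment arithmetic can be sketched in a few lines of Python. This is only an illustration built from the figures quoted in this post; the function name and the flat points-times-rate model are my own assumptions (the real QOF formula also adjusts payments for practice list size and disease prevalence):

```python
# Simplified sketch of the QOF bonus arithmetic described above.
# Per-point rates are those quoted in the text for an average-size
# practice; the flat "points x rate" model is an illustrative
# assumption -- the real formula also adjusts for list size and
# disease prevalence.

POUNDS_PER_POINT = {"2004/5": 75, "2005/6": 120}

def qof_bonus(points_earned, year):
    """Return the performance bonus in pounds for one contract year."""
    return points_earned * POUNDS_PER_POINT[year]

# Maximum possible bonus for the full 1,050 points:
print(qof_bonus(1050, "2004/5"))  # → 78750
print(qof_bonus(1050, "2005/6"))  # → 126000
```

At the 2005/6 rate, the full 1,050 points comes to £126,000, which matches the roughly £125,000 figure an average practice could earn.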

In practice, the QOF is not so simple. The size of the performance bonus meant that there was little chance of our not responding. £125,000 ($240,000) is a substantial sum for a small practice, although payment per measure is rather less impressive. For example, in the first year, the practice earned a total of £225 (approximately $430) for ensuring that more than 90% of our 220 diabetics had their glycosylated haemoglobin checked in the previous 15 months, and £375 (about $720) for ensuring that more than 90% had appropriate retinopathy screening. However, it never really occurred to us to sit down and work out which measures were worth the effort involved to deliver them. We just decided to try to get as many points as possible. Doctors are pretty good at jumping through hoops, and that’s what we enthusiastically set out to do. Remarkably, the nurses and the practice manager were equally keen participants, despite having no direct financial incentives. In part, that was because we all accepted that QOF really did represent high quality (most of the time anyway). In part, it was because public reporting of QOF data threatened our cozy assumptions of providing high-quality care.

What did we do?

Although the details weren’t clear, we knew something like QOF was coming more than a year before implementation, and we used that time to prepare. In the eighteen months leading up to it, we paid for a small increase in nursing hours, freed up nursing time through the use of a part-time phlebotomist, reorganized administrative work to facilitate the expected data collection, and started thinking about how to make better use of our existing EMR capacity and nurses’ skills.

We already had pretty good electronic data in terms of patient registration, prescribing, and morbidity. We also had experience of using a diabetes register, so creating registries for other incentivized diseases wasn’t that hard for us, although we rapidly realized that creation is in some ways the easy part. Keeping registries accurate and up-to-date is a never-ending task. There is a torrent of data to be reliably recorded, which is a constant burden on clinical and administrative time.

Towards the end of the first year, we were at about 1,030 points. By then, the law of diminishing returns had kicked in. I pointed out that working many extra hours for a likely £750 (approximately $1,425) shared among three doctors didn’t really make financial sense. By this stage, though, imagined comparisons with other practices meant that pride trumped the weakness of the business case. In the end, we achieved 1,040.7 points in 2004/5 and 1,050 the next year.

What changed?

Where we had prior audit data, we know that quality of care for incentivized diseases improved. For example, between 2003 and 2005, the percentage of our patients with coronary heart disease whose blood pressure was measured at least annually increased from 69 percent to 94 percent, and the percentage with ideal control (140/85 or lower) increased from 45 percent to 58 percent. QOF also made us act on diseases like stroke and epilepsy that we hadn’t previously paid enough attention to. I’d expect that improvements were greater for these relatively neglected conditions, but there is simply no way of knowing, because that lack of attention also means we don’t have good “before” data against which to compare our current performance.

QOF kept us consistently focused, but this greater depth for incentivized diseases is balanced by a narrowing of focus. Since 2004, we haven’t done any major quality improvement work for the 85 percent of our workload that isn’t in QOF. For example, over several years we had successfully worked to make sure that our diagnosis and initial management of hypertension and diabetes were rigorous. QOF largely ignores diagnosis, so we don’t do that anymore. Have our diagnostic accuracy and rigor got worse? We don’t know, and we are unlikely to have the time to find out. Has our care for people with complex problems, or our ability to deliver personal, individualized care, suffered? We don’t know, and I worry about it.

The QOF in its wider context

We did well, but so did almost all other practices. The 2005/6 average score was 1,026 out of a possible 1,050 points, far exceeding expectations. Unfortunately, that success created financial problems for the NHS, since the payment system placed the financial risk of “overperformance” solely on the government.

Is QOF a good thing? Patients got remarkably high quality for incentivized diseases with strikingly little variation between practices or regions. We got a large pay raise in return for significant effort to reorganize our practices, but the consequences of our new disease and incentive focus aren’t yet clear, and the jury is out on whether quality of care more broadly has been affected. The NHS got high quality for incentivized diseases, although it paid a handsome price in the first two years. However, every GP I know expects that more work will be required for no extra money, and events support this view. The 2006 QOF review raised many performance thresholds and added six diseases to QOF with no change in total financial incentive. In the longer term, perceptions of who really won and lost may well change.

It seems inevitable that pay-for-performance for physicians and practice groups will continue its expansion in the US, although its impact is likely to be smaller than many purchasers seem to assume. We changed our systems of care partly for the money, and partly for pride in our publicly reported quality. In the US, financial incentives are currently small and only linked to a few quality measures. Additionally, provider payment and public reporting are both fragmented across many payers, weakening motivation to improve care. This will change as pay-for-performance expands, but since there is usually no new money to fund programs, change will be slow. Large-scale pay-for-performance drove moderate change in UK primary care, but the continuing debate over its true impact suggests that pay-for-performance in North America will never be as rapidly transformational as its more enthusiastic proponents suggest.