Editor’s Note: This is the second installment of a series of blog posts by Ron Klar offering suggestions on how to make the Medicare Shared Savings Program a more viable vehicle for the creation of accountable care organizations. You can read the first installment here.

In this posting I will address the three financial issues that are critical to the willingness of provider groups to form accountable care organizations (ACOs) and participate in the Medicare Shared Savings Program (MSSP), and that also form the bases for assessing an ACO’s “performance”.  These are (1) computing the benchmark, (2) adjusting expenditures during the performance years, and (3) determining savings (and losses). (My suggestions for alternatives regarding (a) evaluating quality, including the selection of quality measures and the standards, (b) apportioning and limiting payments, and (c) specifying the content and timing of CMS’s provision of data and information, initially intended for this part, will now be a separate Part Three.)

With apologies in advance, these are highly “technical” issues, but ones that deserve careful analysis and consideration, as they are fundamental to understanding “the deal”.  The final rule on these issues, more than any others, will determine if this becomes a program of change or of chance.  Although CMS has made it clear they do not want a program with lottery payments, the rule includes proposals that make these more likely.

As background, the following table lists my suggested changes from Part One:

The bulk of attention since the proposed rule was released on March 31 has been focused on (a) the timing of assignment, (b) the number of measures (and the burden of reporting, if not the specifics of the measures), and (c) the presence and amount of risk.  Remarkably little has been written about establishing the “benchmark”.  This seems particularly surprising as the benchmark expenditures are the sole basis, in comparison to assigned beneficiaries’ expenditures during each performance year, for determining “savings” or “losses”.

At the end of the day the decision to become a Medicare ACO will be based on financial issues.  One’s expected opportunity for shared savings payments, and the resulting return on their prerequisite investments, will be critical. Unfortunately, this absence of discussion is understandable when one recognizes how complicated these issues are to evaluate — or even to determine — from the limited specifications and discussion contained in the rule.

Computing the Benchmark

The rule presents two options, of which CMS says “we believe both Option 1 and Option 2 are legally permissible” and “viable”.  It proposes “to adopt Option 1 … but seek comments on the merits and limitations of both options”.

Option 1 relies on the expenditure experience of different cohorts of beneficiaries, simulated to have been assigned, for each of the three most recent years available prior to the start of the first performance year of the ACO.  Option 2 relies on the three-year expenditure experience of the beneficiaries who actually are assigned, for the three years immediately prior to the first performance year of their assignment.

The major difference between the options can be summarized as follows: use of the historical experience of (1) different beneficiaries from those actually assigned, but who were previous patients of physicians as if they were organized as an ACO, or (2) the same beneficiaries as those actually assigned, but who may not have been previous patients of ACO physicians.

As the rule states, “an accurate benchmark estimate is important [I would say essential] in order to ensure that an ACO that … achieves real savings is rewarded … [and] … to ensure that shared savings are not inadvertently paid … [i.e., if not earned].”  Accuracy is the key, as the benchmark is to be viewed “as a surrogate measure of what the … expenditures would otherwise have been in the absence of the ACO.”

Unfortunately, Option 1 can produce an accurate benchmark only by coincidence, and this is not good enough.  The reason is that it would include historical expenditures of beneficiaries who would be different from most of those who are actually assigned during the performance period; it is not “some” as stated on page 241.  This simply has no “face validity”.

This difference occurs, similar to a discussion in Part One, because there will be year-to-year changes in beneficiary assignment due to changes in the physicians rendering their primary care services (by choice or circumstances), and the “time-lag” for administration.  For example, the “most recent available 3 years” required by statute, including virtually complete claim data (as proposed), to use for an Option 1 benchmark prior to CY2012 would have to be CY2010, 2009, and 2008 (what the rule refers to as BY³, BY², and BY¹).

Considering annual change percentages of 33 percent (from the Physician Group Practice (PGP) demo experience) and 25 percent (from the Actuary’s simulation with PGP data, but with assignment based on only a plurality of primary care services by only primary care physicians), the following table shows the estimated percentages of simulated assigned beneficiaries who are common during each year, indexed to 2010 = 100:

The statute did not specify how to average the three years.  One method would be a simple average; another would be a weighted average.  The rule proposes a weighted average, with BY³ = 60 percent, BY² = 30 percent, and BY¹ = 10 percent; the values shown as “Wtd Avg” above reflect this weighting.  The following table shows estimated percentages of the common assigned beneficiaries for each year 2011 – 2014, each again indexed to 2010 = 100:

The interpretation of these estimates is as follows: during 2013, the second performance year, only between 26.5 and 38.1 percent of beneficiaries actually assigned would be the same as the beneficiaries whose expenditures would be used in determining the benchmark pursuant to proposed Option 1.

If the annual change factor turns out to be greater than the 33 percent found for the PGP sites, not inconceivable recognizing that these sites are uncharacteristically “dominant” in their communities, then the percentages could be even lower.  It is also true that these percentages would have been lower if the simple average of the base years was used instead of the weighted averages.  This is why the rule states on page 241 that “the weighting results in a more accurate benchmark”.  A more correct characterization would be a “less inaccurate” benchmark.
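The effect of the proposed weighting can be made concrete with a short calculation. This is only a sketch: the 60/30/10 weights come from the rule, but the per-capita dollar amounts below are hypothetical.

```python
# Weighted-average benchmark using the rule's proposed weights
# (BY3 = 60 percent, BY2 = 30 percent, BY1 = 10 percent).
# The per-capita expenditure amounts below are hypothetical.
WEIGHTS = {"BY3": 0.60, "BY2": 0.30, "BY1": 0.10}

def weighted_benchmark(per_capita):
    """Blend three base-year per-capita expenditures into one benchmark."""
    return sum(WEIGHTS[year] * amount for year, amount in per_capita.items())

base_years = {"BY3": 10_000.0, "BY2": 9_500.0, "BY1": 9_000.0}
print(round(weighted_benchmark(base_years), 2))  # 9750.0
```

With expenditures trending upward, the 60/30/10 weighting pulls the benchmark toward the most recent (and highest) base year — here 9,750 versus a simple average of 9,500 — which is why the rule characterizes the weighting as producing a “more accurate” benchmark.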

Because the Option 1 benchmark method does not include any adjustments to reflect the differences in “clinical risk” profiles between the performance-year’s actual beneficiaries and the base-years’ simulated beneficiaries, there is a greater likelihood of “winning the lottery”, or “losing the game”.  Furthermore, by knowing in advance the risk profile of the benchmark beneficiaries, and knowing that this “base” benchmark will not change, ACOs could be tempted to improve their chances of winning simply by seeking out beneficiaries of better risks and avoiding those of poorer risks.  This must not be allowed to even be a possibility.

Suggestion I: The Benchmark Methodology for the SS Model Should Be Option 2, With Five Modifications

Benchmark Option 2 would be more accurate than Option 1 because it relies on the prior experience, for virtually all actually-assigned beneficiaries, of only the actually-assigned beneficiaries themselves.  This eliminates the possibility of low-risk seeking and high-risk avoidance because the benchmark is not determined until after the beneficiaries are assigned, which I have previously suggested to be “retrospective” and “invisible”.  Whichever beneficiaries are assigned, they bring their own prior Medicare expenditure experience with them (as long as they have it, see below).  If anything, this provides an incentive to care for previously high-risk and vulnerable beneficiaries as the opportunity to achieve improvements for these is increased.

Unlike Option 1, with Option 2 there would also be no time-lag between the “most recent” three base years for the benchmark and the first performance year for each beneficiary.  For example, assuming CY2012 is the first performance year, the base years would be 2009, 2010, and 2011.  Thus, there would no longer be the one year of imprecise, albeit reasonable, trending to get to the start of the performance year.

It is notable that, unlike Option 1, there would be no need for any weighting of these three benchmark years to enhance accuracy.  To the contrary, the Congressional intent of having three years was to capture the variations that do occur over time for “beneficiaries assigned”, believing three years (as opposed to one in the PGP Demo) to be a more accurate predictor of the next year than any single year, including the most recent one.

Modification 1:  For both options the proposed rule includes a method to adjust the first two benchmark year expenditures for changing clinical risk.  Based on the CMS-HCC (hierarchical condition categories) risk scores, BY¹ and BY² historical expenditures (HE) would be adjusted (ª) by the ratio of the BY³ risk score (rs) to that for each prior year, e.g., BY¹HEª = BY¹HE x (BY³rs ÷ BY¹rs).  While this might be appropriate for Option 1 (given the different beneficiaries in each of the years), I believe that it is not for Option 2, because the pre-performance experience is only of the same beneficiaries.  Therefore, I suggest (# I-1) that no such clinical risk adjustment be applied to the benchmark years’ expenditures.  Instead, however, I will make the opposite suggestion regarding the performance years’ expenditures (see # K1 below).
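The rule’s risk-score adjustment — which this modification would drop for Option 2 — works as follows. The expenditure and risk-score values below are hypothetical, for illustration only.

```python
# The proposed CMS-HCC risk-score adjustment of prior base years:
# BY1HEa = BY1HE x (BY3rs / BY1rs). All numeric values are hypothetical.
def risk_adjusted_history(historical_exp, year_rs, by3_rs):
    """Restate a base year's historical expenditures at the BY3 risk level."""
    return historical_exp * (by3_rs / year_rs)

# BY1 expenditures of $9,000 at risk score 1.00, with a BY3 risk score of 1.05:
print(round(risk_adjusted_history(9_000.0, 1.00, 1.05), 2))  # 9450.0
```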

Modification 2:  As we know, on average and correcting for changes in clinical risk, there tends to be an increase in year-to-year expenditures due to aging.  Therefore, I suggest (# I2) an adjustment to reflect beneficiaries’ aging during the benchmark years.  This would be accomplished with an age-specific increase ratio (1.xxxx) for each year, derived from all Medicare fee-for-service (FFS) beneficiaries, used to multiply each of the BY¹ and BY² composite (of all beneficiaries) expenditure amounts.  No further aging adjustment would be needed to align with each subsequent performance year, as the proposed “updating” of the benchmark, consistent with the precise statutory language, already includes the effect of aging.
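A minimal sketch of this aging adjustment follows. The 1.5 percent annual ratio is an invented placeholder for the FFS-derived age-specific ratio, and compounding the ratio once per intervening year is my reading of the modification, not a specification from the rule.

```python
# Modification 2 sketch: trend the BY1 and BY2 composite expenditures forward
# with an age-specific increase ratio derived from all FFS beneficiaries.
# The 1.5 percent annual ratio below is a hypothetical placeholder.
AGING_RATIO = 1.015

def age_adjusted(composite_exp, years_to_by3):
    """Compound the aging ratio once per year between the base year and BY3."""
    return composite_exp * (AGING_RATIO ** years_to_by3)

by1_adj = age_adjusted(9_000.0, 2)  # BY1 is two years before BY3
by2_adj = age_adjusted(9_500.0, 1)  # BY2 is one year before BY3
print(round(by1_adj, 2), round(by2_adj, 2))
```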

Modification 3:  The proposed rule correctly identifies that Option 2 could warrant a further adjustment for beneficiaries without three full prior years of Medicare experience; it offers several choices and seeks comments.  I suggest (# I3) that for anyone with less than one year of prior experience (a) their prior experience be excluded from the regular benchmark and update computations, and (b) an amount equal to the average for all FFS beneficiaries of the same age and clinical risk profile for this performance year be substituted.  This allows these new beneficiaries still to be counted toward the ACO’s assignment and performance, a better choice than excluding them.

Modification 4:  For assigned beneficiaries with at least one year of prior experience but less than three, I suggest (# I4) a weighted-average, by the number of months (“x”) of their experience, of (a) their prior expenditures, adjusted and updated with the regular rules, weighted [x ÷ 36], and (b) the average for all FFS beneficiaries of the same age and clinical risk profile for this performance year, weighted [(36 – x) ÷ 36].  I believe this computation would be equitable, actuarially sound, and operationally straightforward.
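The blend in Modification 4 can be sketched as follows; the dollar amounts are hypothetical.

```python
def blended_benchmark(months_x, own_prior_exp, ffs_peer_avg):
    """Modification 4 sketch: weight the beneficiary's own adjusted and updated
    prior expenditures by x/36 and the same-age, same-risk FFS average by
    (36 - x)/36. Applies to 12 <= x < 36 months of prior experience."""
    if not 12 <= months_x < 36:
        raise ValueError("requires at least 1 but less than 3 years of experience")
    weight = months_x / 36
    return weight * own_prior_exp + (1 - weight) * ffs_peer_avg

# Two years (24 months) of own experience at $9,000, FFS peer average $10,500:
print(round(blended_benchmark(24, 9_000.0, 10_500.0), 2))  # 9500.0
```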

Modification 5:  The proposed rule correctly identifies that for Option 2 this process would be repeated after the second and third performance years of the agreement period.  For each of these years the benchmark, as previously computed and updated for PY¹, would be “adjusted” to account for year-to-year changes in the assigned population.  First, as identified, the prior experience of beneficiaries no longer assigned would be removed from the benchmark previously computed.

Second, as also identified, the prior experience of newly-assigned beneficiaries would be added, just as for the first performance year example above.  However, rather than use “the 3 years prior to the agreement period” as proposed (page 243) for these newly-assigned beneficiaries, I suggest (# I5) that the immediately-preceding years be used instead.  This would take advantage of more current data.

Beneficiaries assigned in PY¹, not in PY², and assigned again in PY³, would not be considered as “newly-assigned”, but as “previously-assigned”.  For these, the third performance year benchmark would be adjusted with their previous initial-year benchmark data, just as proposed in the rule for Option 2.

Suggestion J: The Benchmark Methodology for the SS&L Model Should Be Option 2, With Five Modifications

With the same rationales as discussed above, I suggest (# J1 – # J5) that the five modifications to CMS’ benchmark Option 2 suggested for the SS model should also be incorporated for use with the SS&L model.

However, because of my suggestions E and # E2 (in the Part One blog) that assignment for the SS&L model should be “prospective” with an “opt-out”, and the likelihood of this in the final rule because of the same suggestion from many others, an additional issue arises that should be addressed: the timing of computing the benchmark.  If it is computed at the same time as the determination of assignment, then there would be one “missing” year of historical experience between the three benchmark years and its corresponding performance year.  This would require some non-specific (relative to the assigned beneficiaries) and non-precise trending to “fill-in” for this year – clearly not an optimal situation.

As an alternative to avoid this, I suggest (# J1) that the benchmark for the SS&L model be computed at the same time as the assessment of performance; that is, after the first performance year.  This would be repeated after the second and third performance years to reflect the changes in assignment.  If it is determined that providing some form of benchmark information before the start of the performance years is appropriate, CMS could compute and share a “preliminary” benchmark (with the three earlier years trended forward one year plus the preliminary one-year update).

These suggestions (J and # J1) have the added benefit of easily accommodating another possible exception to no “retrospective-reconciliation” of prospective assignment (my prior exception suggestion # E4 for “beneficiaries who move out of the area”) — beneficiaries who are not prospectively-assigned, but who meet the assignment criteria during the performance year.  With the Option 2 benchmark, as modified, such beneficiaries would “bring with them” their own benchmark experience, avoiding the “risk management” problems discussed with retrospective-reconciliation.

Knowing this, an ACO might be more willing to include beneficiaries other than those who are prospectively-assigned for their enhanced care processes.  This would be desirable.  Therefore, but only if Option 2, as modified, is adopted as the benchmark methodology for the SS&L model, I suggest (# E5) that another exception to no “retrospective-reconciliation” be for such “subsequently-aligned” beneficiaries.

Option 1 would not be compatible with any retrospective-reconciliation of prospective assignment as the opportunity for inappropriate low-risk seeking and high-risk avoidance would be significant.  [Please see Part One for the discussion of “retrospective-reconciliation” issues.]

Suggestion K: For Both the SS and SS&L Models, Performance Year Expenditures Should Be Adjusted for Beneficiary Characteristics (If Benchmark Option 2, Modified As Suggested, Is Adopted)

The statute specifies that payments for shared savings are to be “ … the difference between such estimated average per capita Medicare expenditures in a year, adjusted for beneficiary characteristics, under the ACO and such benchmark for the ACO” [section 1899(d)(2), italics added for emphasis].

Following a lengthy discussion, the proposed rule (pages 246 – 252) concludes that “changes in the assigned beneficiary population risk score from the 3-year benchmark period during the performance year will not be incorporated”, but welcomes comments.  I believe this would seriously and unfairly disadvantage all ACOs, especially those participating in the SS model, which I have suggested should have retrospective and invisible assignment, because of the elimination of any risk adjustment to the benchmark suggested above (# I-1 and # J1).  Although the care they render is under their control, the changing clinical risk of their patients, at least as determined by the CMS-HCC methodology, is beyond their control.

Therefore, I suggest (# K1) that any such changes in beneficiary risk, for both the SS and SS&L models, should result in an adjustment of average expenditures (AE) during the performance year, accomplished as follows: (1) the risk scores for each of the three benchmark years would be averaged (BY*rs); (2) a “gross” risk change ratio (gRCR) would be computed as the performance year risk score divided by this average, i.e., PYⁿrs ÷ BY*rs; (3) a “net” risk change ratio (nRCR) would be computed as gRCR ÷ DCEF, a “diagnosis coding excess factor” determined by the CMS actuary for all ACO professionals compared to non-ACO FFS professionals; and (4) if the nRCR is greater than 1.0, the adjusted actual expenditures (AEª) would be computed as AE ÷ nRCR.

This adjustment (# K1), which is operationally simpler than the four steps might suggest, would thus satisfy both the disparate interests to (a) recognize “real” increases in beneficiaries’ clinical risk during the performance years, and (b) not recognize “artificial” increases coming from more “complete” diagnosis coding, that evidence from the PGP Demo and Medicare Advantage suggest does occur.
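The four steps of suggestion # K1 can be sketched as follows. Every numeric input here is hypothetical, including the DCEF, which the suggestion leaves to the CMS actuary to determine.

```python
def adjusted_actual_expenditures(avg_exp, py_rs, benchmark_rs_by_year, dcef):
    """Suggestion # K1 sketch (all inputs hypothetical):
      1. BY*rs = mean of the three benchmark-year risk scores
      2. gRCR  = performance-year risk score / BY*rs
      3. nRCR  = gRCR / DCEF (diagnosis coding excess factor)
      4. if nRCR > 1.0, AEa = AE / nRCR; otherwise AE is unchanged
    """
    by_star = sum(benchmark_rs_by_year) / len(benchmark_rs_by_year)
    grcr = py_rs / by_star
    nrcr = grcr / dcef
    return avg_exp / nrcr if nrcr > 1.0 else avg_exp

# Risk rises from a benchmark average of 1.00 to 1.08, with a 2 percent
# coding-excess factor, so only the "real" risk increase is recognized:
print(round(adjusted_actual_expenditures(10_000.0, 1.08, [0.98, 1.00, 1.02], 1.02), 2))  # 9444.44
```

Note how step (3) satisfies the two disparate interests discussed above: dividing by the DCEF strips out the “artificial” portion of the risk-score increase before any adjustment is made.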

Another possible adjustment of performance year expenditures, rather than of Option 2 benchmark expenditures as discussed in the rule (pages 244 – 245), would be for often higher-than-projected expenditures “for beneficiaries who die during an agreement year”.  One method could be to further adjust their adjusted expenditures (suggestion # K1) by a decedent factor (DF = 1.xxxx), different depending on their terminal condition(s), to reflect the average observed differential for all FFS beneficiaries, i.e., AEª x DF.

However, because of the huge variation among decedents in their expenditures during their terminal months, using averages would yield an adjustment too low for some and too high for others.  Therefore, such an adjustment would seem inappropriate to include.  Instead, I suggest (# K2) that (a) the expenditures of assigned decedents during their last sixty days, except if all of these days were covered under hospice care, are excluded from their adjusted actual expenditures, and (b) their historical expenditures included in the benchmark are proportional to the same number of days (minus sixty) as during the performance year, i.e., xxx ÷ 365.

Such exclusion would remove virtually all concern about possible “skimping” on acute care for gravely and terminally ill beneficiaries who were (under the SS&L model) or might be (under the SS model) assigned.  This suggested handling of expenditures for decedents would seem appropriate and neutral, i.e., neither advantaging nor disadvantaging.  Of course, excluding their experience entirely would remove all concern for skimping, but this might raise other concerns.  It is notable that there would be no comparable balance if the Option 1 benchmark methodology is included in the final rule, as it would continue to include the historical (and presumably higher) expenditures of simulated-assigned beneficiaries who died.
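A sketch of one reading of suggestion # K2 follows. The daily spending figures, day counts, and benchmark amount are all hypothetical, and treating the proration denominator as 365 reflects the suggestion’s “xxx ÷ 365” formulation.

```python
def decedent_amounts(daily_exp, days_in_py, annual_benchmark, last60_hospice=False):
    """Suggestion # K2 sketch (one reading; all figures hypothetical):
    (a) exclude the decedent's last 60 days of performance-year spending,
        unless all of those days were covered under hospice care;
    (b) prorate the annual benchmark to the same span, (days - 60) / 365.
    Returns (counted performance-year spending, prorated benchmark)."""
    if last60_hospice:
        counted_days = days_in_py
    else:
        counted_days = max(days_in_py - 60, 0)
    counted_exp = sum(daily_exp[:counted_days])
    prorated = annual_benchmark * counted_days / 365
    return counted_exp, prorated

# A decedent surviving 180 days at $100/day, against a $36,500 annual benchmark:
exp, bench = decedent_amounts([100.0] * 180, 180, 36_500.0)
print(exp, bench)  # 12000.0 12000.0
```

Because both sides of the comparison are cut to the same span, the decedent’s experience stays in the computation without the terminal-months volatility that an average-based decedent factor would mishandle.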

Suggestion L: The Minimum Savings Rate (MSR) Should Vary Between 1.5 and 3.0 Percent For the SS Model, and 1.0 and 2.0 Percent For the SS&L Model

The statute requires that performance-year expenditures must be “at least the percent specified by the Secretary below the applicable benchmark”, i.e., the MSR, before “savings” can be considered achieved.  This MSR is “to account for normal variation in expenditures … based on the number of Medicare fee-for-service beneficiaries assigned to an ACO”.

Presuming the use of proposed Option 1 for estimating the benchmark, the rule includes a range of 2.0 to 3.9 percent for the MSR for “Track 1” (SS model for two years and SS&L for the third) and a fixed 2.0 percent for the MSR for “Track 2” (SS&L model).  While the range for Track 1 was based on statistical reasons, the fixed percent for Track 2 was based on strategic reasons.  Many groups considering participation in the MSSP, including the prototype PGP sites, have identified these percentages as too high and unreasonably reducing their opportunity to be successful.

My professional statistician friends tell me that the “estimation error” with Option 2 will be less than with Option 1; that is, any observed differences in expenditures between the performance year and the Option 2 benchmark will be less likely due to “normal variation”.  This is because the former has identical populations in the performance years and the benchmark years, while the latter has different populations.  As a result, only Option 1 has to also deal with “sampling error”.

For Option 2 with modifications as suggested, the “normal variation” error necessary to deal with would only be the residual variation of three consecutive years of expenditures to the next year for a fixed population.  This variation is further reduced because of the suggested adjustments to the fourth year (the performance year) expenditures for differences in risk and other beneficiary characteristics (expected to cause variation) compared to the immediately prior three years.  As a result, the MSRs can be reduced, with the range suggested for the SS model being conservative and more reasonable.

The reason for suggesting a further reduced range for the SS&L model is also strategic rather than statistical.  Because the MSR also serves as a minimum loss rate (MLR) before losses would be applicable, reducing the range would provide an even greater incentive to SS&L ACOs to implement successful care enhancement and efficiency activities.

Suggestion M: For the SS Model, All ACOs with Savings Greater Than the MSR Should Share From the Full Difference Between Adjusted Actual Expenditures and the Benchmark

As I mentioned in my April 7 Health Affairs blog regarding “Statutory Guidance Issues”, I believe the proposal in the rule to “share in savings beyond a … 2-percent threshold” is contrary to clear statutory language.  Section 1899(d)(2), “Payments for Shared Savings”, requires payments to be based on “a percent (as determined appropriate by the Secretary) … [the “sharing rate(s), to be discussed in my Part Three blog because of the relationship to quality performance issues] … of the difference between such estimated average … and such benchmark” [italics added for emphasis].

There is no reference to, suggestion of, or authority for a “net savings threshold” as proposed; of course, there is such authority for the SS&L or other payment models pursuant to section 1899(i), but such a proposal is not included in Track 2.  In addition, this threshold is not included for smaller SS ACOs meeting certain specified criteria.

This was included because CMS “believes” it “protects the program from sharing unearned savings” (page 272).  For all the reasons I discussed in Part One that mitigate this possibility, and the further assurances of improved accuracy of the benchmark combined with adjustments of actual expenditures discussed herein, I believe this protection is simply not necessary.  In addition, such a threshold would be specifically disadvantageous to certain SS ACOs, which would not only be inequitable but also place their willingness to participate in jeopardy.


While certainly tedious, I am hopeful that this analysis has assisted the evaluation of the highly complex and technical financial proposals in the rule, particularly in the context of alternatives I suggest that should improve the prospects for program success.  These issues are fundamental to determining savings and losses – and thus of critical importance to all stakeholders.  The options chosen for the final rule will determine whether ACOs receive (or make) payments only because of their delivery reform efforts, or also because of a “roll of the dice”.