This post marks one of many, many times I’ve written about our lack of sufficient data infrastructure to track the COVID-19 pandemic in the U.S. Two years in, I continue not to understand the abject failure to do a better job with this. Perhaps I should stop writing about it, but I cannot help myself.

What keeps me up at night at the moment is hospitals. And, in particular, the worry that the overwhelming of hospitals will get much worse.

To be clear: this has always been a key concern with COVID-19. If we think back to March 2020, much of the initial motivation for lockdowns was to “flatten the curve” — to slow spread to the point where we had the hospital capacity to deal with it. When I worked with the state of Rhode Island on COVID response, our focus was on modeling how much hospital capacity we would need (the modeling wasn’t very good). We have come a long way, but hospitals still remain a central concern. Emergency rooms overwhelmed with COVID patients cannot function, or serve other emergency needs.

In certain parts of the country, the most significant current issue for hospitals is staffing. One consequence of pandemic burnout has been a loss of nursing staff in particular. Hospitals have less capacity to staff beds than they did at the start of the pandemic, and less support. My friends who work in emergency departments tell me they are doing three or four people’s jobs: seeing patients, cleaning beds, transporting patients between rooms. These problems are poised to get worse as Omicron spreads, with health-care workers likely to be out for five days at a time (under new CDC guidance) with mild or asymptomatic COVID infections detected through routine testing.

(Staffing concerns are also in the way of much school reopening, but that is for another post.)

This situation is scary. A coordinated federal or state response will be necessary to figure out where we need to send more staff, including possible National Guard help. On television on Sunday, Dr. Fauci said the key thing he is paying attention to is hospitalizations. But in fact, to understand what the patterns in the data mean, we need better hospitalization data than what we have. In terms of prediction, we are flying blind as usual.

To see why, think about our method of tracking. We’ve always been interested in predicting hospitalizations, but until now we’ve effectively been using case counts to predict future hospitalization rates. There are conceptual problems with this approach, but in practice it has worked pretty well.

With Omicron, this link has totally come apart, for two reasons: rapid testing, and changes in severity. On the first point: case rates at the moment are effectively meaningless other than to show us there is a lot of COVID around. Many people are rapid testing at home and then just isolating (which is current CDC guidance), meaning they never show up in case counts. This is more true than it was in previous waves, meaning case counts are further detached from actual cases. Positivity rates (the share of tests that are positive) are likely a huge overestimate, since many tests are taken by people who have already tested positive at home.

On the second point, it is now clear that Omicron has lower severity, either due to high vaccination rates or being intrinsically milder. This means that the hospitalization rates we expected from earlier waves are going to be an overestimate for the current wave.

Both of these together predict a divergence between cases and hospitalizations, which we have already seen in many locations.
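
To make the mechanism concrete, here is a minimal sketch of how a forecast built on reported cases breaks down. Every number in it is a made-up assumption chosen for illustration, not an estimate, and the function names are hypothetical. It learns a hospitalizations-per-reported-case ratio from an earlier wave, then applies it to a wave in which a smaller share of infections is reported and each infection is less likely to lead to hospitalization.

```python
def hospitalizations(true_infections: float, hosp_per_infection: float) -> float:
    """Hospitalizations actually generated by a wave."""
    return true_infections * hosp_per_infection


def reported_cases(true_infections: float, reporting_share: float) -> float:
    """Cases that make it into official counts."""
    return true_infections * reporting_share


# Hypothetical earlier wave: half of infections reported, 2% hospitalized.
old_reported = reported_cases(100_000, reporting_share=0.5)
old_hosp = hospitalizations(100_000, hosp_per_infection=0.02)
ratio_old = old_hosp / old_reported  # what the case-based forecast "learns"

# Hypothetical Omicron wave: more infections, fewer reported (home tests),
# and lower severity per infection. All three values are assumptions.
new_reported = reported_cases(400_000, reporting_share=0.25)
forecast = new_reported * ratio_old
actual = hospitalizations(400_000, hosp_per_infection=0.005)

print(f"forecast from cases: {forecast:.0f}, actual: {actual:.0f}")
```

With these particular inputs the forecast comes out well above the actual figure. With different made-up inputs the error could run the other way, because the change in reporting and the change in severity push in opposite directions. Either way, the old ratio stops being informative.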

We can no longer rely on case counts to predict hospitalizations, at least not in the same way. Yet it remains as crucial as ever — perhaps more so — to do this prediction. And there are ways to do it, but we need better data. What does that mean?

Our current data on hospitalization, in most cases, is extremely crude. In general, hospitalization rates are simply the number of people admitted to the hospital with a positive COVID test per 100,000 population. This is problematic in two ways. First: everyone admitted to the hospital is tested for COVID. This means that if someone is admitted with an injury or mental health issue or anything else and they have an asymptomatic COVID-19 infection, they appear as a COVID case.

This makes hospital COVID-19 rates a useful way to measure overall COVID burden, but it doesn’t necessarily help predict how much help hospitals will need in treating COVID. This issue is sometimes discussed as differentiating between hospitalizations “for” COVID versus “with” COVID. A couple of states (Iowa, for example) try to separate these out in their reporting, but it is not widely done. This separation is especially crucial for understanding any changes in demographics. Is Omicron worse for children? We see rising hospitalizations, but we know many of them are incidental. Is it most? Only some? Without details it’s easy to draw the wrong conclusions in either direction.
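
As a small illustration of why the split matters, the sketch below computes the usual per-100,000 rate and then the same rate restricted to admissions "for" COVID. The admission counts and population are hypothetical, and `rate_per_100k` is just a helper name I am introducing here.

```python
def rate_per_100k(count: int, population: int) -> float:
    """Admissions per 100,000 population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count * 100_000 / population


# Hypothetical figures for one region over one week.
population = 1_000_000
admitted_for_covid = 120   # COVID-19 is the reason for admission
admitted_with_covid = 80   # admitted for something else, tested positive

total = admitted_for_covid + admitted_with_covid
headline_rate = rate_per_100k(total, population)          # what is usually reported
for_rate = rate_per_100k(admitted_for_covid, population)  # what drives COVID treatment load
incidental_share = admitted_with_covid / total

print(headline_rate, for_rate, incidental_share)
```

The headline rate is the same whether the incidental share is 10 percent or 70 percent, which is exactly why it cannot answer the "Is it most? Only some?" question on its own.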

A second issue is that these raw counts do not provide any detail that would be helpful in prediction. To what extent are the hospitalizations for breakthrough infections versus unvaccinated? Which demographic groups are most at risk for needing significant care? How much do boosters matter? Are ICU beds being used most by unvaccinated younger people, unvaccinated older people, vaccinated older people?

All of this information would be hugely valuable in predicting where we will need hospital space and staffing. It’s generally understood that unvaccinated individuals are more likely to need to be hospitalized, but the extent to which that phenomenon interacts with demographics, and which groups need hospitalization among vaccinated people — that is all completely unclear.

What we need is not complicated, though it’s richer than what we have now. In an ideal world, I would like to see basically two sets of data:

Baseline data 

  • Hospital admissions with COVID-19 as primary diagnosis
  • Hospital admissions with COVID-19 as secondary diagnosis

Detailed data for primary diagnosis

These counts would be reported only for individuals who are being treated for COVID.

  • Patient age
  • Patient vaccination status
  • Comorbidities (such as heart disease, cancer, and diabetes)
  • In ICU
  • On ventilator
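
For readers who think in data structures, the two lists above could be sketched as a single record type plus a couple of tallies. This is only one possible shape: the field names, the vaccination-status labels, and the age-65 cutoff are my own illustrative assumptions, not an existing reporting standard.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, Optional, Tuple


class Diagnosis(Enum):
    PRIMARY = "primary"      # admitted for COVID-19
    SECONDARY = "secondary"  # admitted for something else, tested positive


@dataclass(frozen=True)
class Admission:
    diagnosis: Diagnosis
    # The detailed fields would be filled in only for PRIMARY admissions.
    age: Optional[int] = None
    vaccination_status: Optional[str] = None  # e.g. "unvaccinated", "vaccinated", "boosted"
    comorbidities: Tuple[str, ...] = ()
    in_icu: bool = False
    on_ventilator: bool = False


def baseline_counts(admissions: Iterable[Admission]) -> Counter:
    """Baseline data: admissions by primary vs. secondary diagnosis."""
    return Counter(a.diagnosis for a in admissions)


def icu_by_group(admissions: Iterable[Admission]) -> Counter:
    """ICU patients among primary admissions, by (age band, vaccination status)."""
    counts: Counter = Counter()
    for a in admissions:
        if a.diagnosis is Diagnosis.PRIMARY and a.in_icu and a.age is not None:
            band = "65+" if a.age >= 65 else "under 65"  # cutoff is an assumption
            counts[(band, a.vaccination_status)] += 1
    return counts


# Made-up records, purely to show the shape of the output.
sample = [
    Admission(Diagnosis.PRIMARY, 72, "unvaccinated", ("diabetes",), in_icu=True),
    Admission(Diagnosis.PRIMARY, 40, "unvaccinated", (), in_icu=True, on_ventilator=True),
    Admission(Diagnosis.PRIMARY, 80, "boosted", ("heart disease",)),
    Admission(Diagnosis.SECONDARY),
]
print(baseline_counts(sample))
print(icu_by_group(sample))
```

Even a table this simple would start to answer questions like which groups are filling ICU beds.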

Given such data, we could do a significantly better job of predicting where hospital capacity will be needed, and what type of capacity. For example: at least some of the current discussion suggests that Omicron is less likely to affect the lungs and so may result in less need for ICU staffing and more need for staffing of non-ICU hospital beds. On the other hand, perhaps this is not true for unvaccinated individuals, which would change the calculus in areas with less vaccination. This isn’t easy to figure out from our current data.

It is important to emphasize that much of the reason to get these better data is to help hospital systems and health-care workers and to better manage the pandemic. The reason is not to understate the crisis facing hospitals at the moment. Last weekend, I tweeted asking if any states were reporting out even the first part of this breakdown — the “for” versus the “with” — and immediately people raised concerns about that data being weaponized, calling me “evil” and a “killer.”

It is true that if we collect better data, we are likely to find that some hospitalizations are “with” rather than “for” COVID. But that isn’t a reason not to get the data. And it is certainly not a reason not to collect and report better information about the people who are hospitalized for COVID.

Collecting these data isn’t necessarily simple, but in many hospital systems they already exist as part of charting or other records. The challenge is more likely to be how to extract the data and compile them in a usable format. It’s the kind of thing the CDC or the Biden administration COVID team might do, except that, based on what has happened so far in the pandemic, I am sure they will not. The request is tilting at windmills for now; I accept that. But setting up this infrastructure will matter for the next wave. I hope someone is listening.