Emily Oster

11 min Read
A question I get frequently: Why does my analysis often disagree with groups like the American Academy of Pediatrics or other national bodies, or other public health experts, or Andrew Huberman (lately I get that last one a lot)? The particular context is often in observational studies of topics in nutrition or development.

The recent analysis of processed food and cancer is emblematic of many of these cases. In that post, I argued that the relationship observed in the data was extremely likely to reflect correlation and not causation. My argument rested on the observation that people who ate differently also differed on many other features.

In response, a reader wrote in this question:

You emphasize causation vs. correlation, and I think you are pointing to potential confounders that could actually be the root cause of the findings. My question is — can’t and don’t study researchers control for that in their analysis? Can’t they look at the link between screen time and academic success while keeping potential confounders equal across the comparison groups? And if so, wouldn’t that help rule out the impact of other factors and strengthen the case that there is a true link?

This is a very good question, and it clarifies for me where many of the disagreements lie.

The questioner essentially notes: the reason we know that the processed food groups differ a lot is that the authors can see the characteristics of individuals. But because they see these characteristics, they can adjust for them (using statistical tools). While it’s true that education levels are higher among those who eat less processed food, by adjusting for education we can come closer to comparing people with the same education level who eat different kinds of food.

However, in typical data you cannot observe and adjust for all differences. You do not see everything about people. Sometimes this is simply because our variables are coarse: we see whether someone’s family income is above or below the poverty line, but nothing more detailed, and those details matter. There are also characteristics we almost never capture in data, like “How much do you like exercise?” or “How healthy are your partner’s behaviors?” or even “Where is the closest farmers’ market?”

For both of these reasons, in nearly all examples, we worry about residual confounding. That’s the concern that there are still other important differences across groups that might drive the results. Most papers list this possibility in their “limitations” section.

We all agree that this is a concern. Where we differ is in how much of a limitation we believe it to be. In my view, in these contexts (and in many others), residual confounding is so significant a factor that it is hopeless to try to learn causality from this type of observational data. 

This position drives a lot of my concerns with existing research. Thinking about these issues is a huge part of my research and teaching. So I thought I’d spend a little time today explaining why I hold this position. I’m going to start with theory and then discuss two pieces of evidence.

A quick note: this post focuses on concerns about approaches that take non-randomized data and argue for causality based on including observed controls. There are other approaches to non-randomized data (e.g. difference-in-differences, event studies) that support stronger causality claims. See some discussion of those in this older post.

Theory

Conceptually, the gold standard for causality is a randomized controlled trial. In the canonical version of such a trial, researchers randomly allocate half of their participants to treatment and half to control. They then follow them over time and compare outcomes. The key is that because you randomly choose who is in the treatment group, you expect them, on average, to be the same as the control other than the presence of the treatment. So you can get a causal effect of treatment by comparing the groups.
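The logic of randomization can be sketched in a few lines of code. This is a toy simulation with invented numbers, not any real trial: because treatment is assigned by coin flip, any background trait balances across the two arms on average.

```python
import random

random.seed(0)
treated_edu, control_edu = [], []
for _ in range(100_000):
    education = random.gauss(14, 3)   # any background trait; here, years of schooling
    if random.random() < 0.5:         # coin-flip assignment to treatment
        treated_edu.append(education)
    else:
        control_edu.append(education)

gap = sum(treated_edu) / len(treated_edu) - sum(control_edu) / len(control_edu)
print(f"education gap between arms: {gap:+.3f}")  # hovers near zero
```

The same balance holds for traits we cannot measure at all, which is exactly the guarantee that observational adjustment cannot provide.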

Randomized trials are great but not always possible. A lot of what is done in public health and economics aims to estimate causal effects without randomized trials. The key to doing this is to isolate a source of randomness in some treatment, even if that randomization is not explicit.

For example: Imagine that you’re interested in the effect of going to a selective high school on college enrollment. One simple thing to do would be to compare the students who went to the selective high school with those who did not. But this would be tricky, because there are so many other differences across the students.

Now imagine that the way that admission to the high school works is based on a test score: if you get a score above some cutoff, you get in, and if you are below, you do not. With that kind of mechanism, we can get closer to causality. Let’s say the cutoff score is 150. You’ve got some students who scored 149 and some who scored 150. The second group gets in, the first doesn’t. But their scores are really similar. It may be reasonable to claim that it is effectively random whether you got 149 or 150 — the difference is so small, it could happen by chance. In that case, you can try to figure out the causal effect of the selective high school by comparing the students just above the cutoff with those just below.

This particular technique is called regression discontinuity; it’s part of a suite of approaches to estimate causal effects that take advantage of these moments of randomness in the world. The moments do not need to be truly random, but they do need to be driving the treatment and not driving the outcome you are interested in.
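Here is a toy version of the test-score example, with illustrative numbers only (not from any real study): outcomes rise smoothly with ability, admission flips at a cutoff of 150, and comparing students just on either side of the cutoff recovers roughly the school’s true effect.

```python
import random

random.seed(1)
CUTOFF, TRUE_EFFECT = 150, 5.0

def outcome(score, admitted):
    # outcome rises smoothly in ability (proxied by score), plus the school effect
    base = 0.3 * score + (TRUE_EFFECT if admitted else 0.0)
    return base + random.gauss(0, 2)

just_below = [outcome(CUTOFF - 1, admitted=False) for _ in range(5_000)]
just_above = [outcome(CUTOFF, admitted=True) for _ in range(5_000)]

rd_estimate = sum(just_above) / 5_000 - sum(just_below) / 5_000
print(f"RD estimate near the cutoff: {rd_estimate:.2f}")
# close to the true effect of 5, plus only the tiny 0.3 ability step across one score point
```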

We can take this lens to the kind of observational data that we often consider. Let’s return to the processed food and cancer example. The approach in that paper was to compare people who ate a lot of processed food with those who ate less. Clearly, in raw terms, this would be unacceptable because there are huge differences across those groups. The authors argue, though, that once they control for those differences, they have mostly addressed this issue.

This argument comes down to: once I control for the variables I see, the choice about processed food is effectively random, or at least unrelated to other aspects of health.

I find this fundamentally unpalatable. Take two people who have the same level of income, the same education, and the same preexisting conditions, and one of them eats a lot of processed food and the other eats a lot of whole grains and fresh vegetables. I contend that those people are still different. That their choice of food isn’t effectively random — it’s related to other things about them, things we cannot see. Adding more and more controls doesn’t necessarily make this problem better. You’re isolating smaller and smaller groups, but still you have to ask why people are making different food choices.
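A minimal simulation makes this concrete (all numbers are invented): a single unobserved trait, call it “health-mindedness,” drives both diet and cancer risk; we observe only a noisy proxy for it (education); and diet has no true effect at all. Controlling for education shrinks the spurious association but does not remove it.

```python
import random

random.seed(2)
rows = []
for _ in range(200_000):
    minded = random.gauss(0, 1)                          # unobserved health-mindedness
    edu = minded + random.gauss(0, 1)                    # observed, noisy proxy
    processed = 1 if random.gauss(0, 1) > minded else 0  # less minded -> worse diet
    risk = -minded + random.gauss(0, 1)                  # diet itself does NOTHING
    rows.append((edu, processed, risk))

def mean(xs):
    return sum(xs) / len(xs)

raw_gap = mean([r for _, p, r in rows if p]) - mean([r for _, p, r in rows if not p])

# "Control for education" crudely: compare diets only within education strata
strata = {}
for edu, p, r in rows:
    strata.setdefault(round(edu), ([], []))[p].append(r)
adjusted_gap = mean([mean(proc) - mean(unproc)
                     for unproc, proc in strata.values()
                     if len(unproc) > 30 and len(proc) > 30])

print(f"raw risk gap: {raw_gap:.2f}, education-adjusted gap: {adjusted_gap:.2f}")
# the adjusted gap shrinks but stays well above the true effect, which is zero
```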

Food is a huge part of our lives, and our choices about it are not especially random. Sure, it may be random whether I have a sandwich or a salad for lunch today, but whether I’m eating a bag of Cheetos or a tomato and avocado on whole-grain toast — that is simply not random and not unrelated to other health choices.

This is where, perhaps, I conceptually differ from others. I have to imagine that researchers doing this work do not hold this view. It must be that they think that once we adjust for the observed controls, the differences across people are random, or at least are unrelated to other elements of their health.

This is a theoretical disagreement. But there are at least two things in data that have really reinforced my view — one from my own research and one example from my books.

Selection on observables: Vitamins

Underlying the issue of correlation versus causation are human choices. This is especially true in nutrition. The reason it is hard to learn about causality is that different people make different choices. One of the possible reasons for those different choices is different information, or different processing of information.

A few years ago, I got curious about the role of information — of news — in driving these choices, and I wrote a paper that looked at what happened to health behaviors after changes in health information. I wrote at more length about that paper here, but the basic idea was to analyze who adopts new health behaviors when news comes out suggesting those behaviors are good.

The main application is vitamin E. In the early 1990s, a study came out suggesting vitamin E supplements improved health. What happened as a result was that more people took vitamin E. But not just any people. The new adopters were more educated, richer, more likely to exercise, less likely to smoke, more likely to eat vegetables. In turn, over time, as these people started taking the vitamin, vitamin E started to look even better for health.

Over a period of about a decade, vitamin E went from being only mildly associated with lower mortality to being strongly associated with lower mortality. This is not because the impacts of the vitamin changed! It was because the people who took the vitamin changed. And, importantly, these patterns persisted even when I put in controls.
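The vitamin E pattern can be mimicked in a stylized sketch (invented numbers, zero true effect throughout): when good news makes healthier people likelier to adopt the vitamin, its measured association with risk strengthens even though nothing about the vitamin has changed.

```python
import random

random.seed(3)

def apparent_benefit(selection):
    # selection: how strongly health-minded people opt into taking the vitamin
    takers, others = [], []
    for _ in range(100_000):
        healthy = random.gauss(0, 1)            # unobserved health-mindedness
        takes = selection * healthy + random.gauss(0, 1) > 0
        risk = -healthy + random.gauss(0, 1)    # the vitamin has zero true effect
        (takers if takes else others).append(risk)
    return sum(others) / len(others) - sum(takers) / len(takers)

before_news = apparent_benefit(selection=0.3)   # early on: mild selection
after_news = apparent_benefit(selection=1.5)    # post-study: strong selection
print(f"apparent benefit before: {before_news:.2f}, after: {after_news:.2f}")
# the association strengthens even though the vitamin's effect never changes
```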

What this says to me is that these biases in our conclusions — and I saw this in vitamins, but also in sugar and fat — are malleable based on the information out there in the world. Once you acknowledge that what is going on here is people are reading news and reacting to it in different ways, it is hard to believe that the limited observable characteristics we can control for are enough.

Evolving coefficients: Breastfeeding

The second important data point for me is looking carefully at what happens in many of these situations when we introduce more and better controls.

The link between breastfeeding and IQ is a good example. This is a research space where you can find many, many papers showing a positive correlation. The concern, of course, is that moms who breastfeed tend to be more educated, have higher income, and have access to more resources. These variables are also known to be linked to IQ, so it’s difficult to isolate the impacts of breastfeeding.

What these papers typically do is control for some observable differences. And, like the discussion above, we might think, “Well, isn’t that enough? If we can see these detailed demographics, isn’t that going to address the problem?”

The paper I like the best to illustrate the fact that, no, that doesn’t address the problem is one that used data that — among other things — included sibling pairs. The authors of this paper do four analyses of the relationship between breastfeeding and IQ:

  1. Raw correlation — no adjustment for anything
  2. Regression adjusting for standard demographics (parental education, etc.)
  3. Regression adjusting for standard demographics plus adjusting for mom IQ score
  4. Within-sibling analysis: compare two siblings, one of whom was breastfed and one of whom was not

The graph below shows their results. When they just compare groups — without adjusting for any other differences — there is a large difference in IQ between breastfed and non-breastfed children. When they add in some demographic adjustments, this difference falls but is still statistically significant. This is where most papers stop. But as these authors add their additional controls, eventually they get to an effect of zero. Comparing across siblings, there is no difference at all.
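That ladder of results can be reproduced in a toy simulation (all numbers invented): breastfeeding has zero true effect on IQ, family background drives both breastfeeding and IQ, and while the raw comparison (analysis 1) shows a big gap, comparing discordant siblings within the same family (analysis 4) lands at roughly zero.

```python
import random

random.seed(4)
families = []
for _ in range(50_000):
    background = random.gauss(0, 1)       # family resources, mom's IQ, etc.
    kids = []
    for _ in range(2):                    # two siblings per family
        breastfed = 1 if background + random.gauss(0, 1) > 0 else 0
        iq = 100 + 5 * background + random.gauss(0, 3)   # no breastfeeding term
        kids.append((breastfed, iq))
    families.append(kids)

# Analysis 1: raw comparison across all children
bf = [iq for fam in families for b, iq in fam if b]
non = [iq for fam in families for b, iq in fam if not b]
raw_gap = sum(bf) / len(bf) - sum(non) / len(non)

# Analysis 4: within-family comparison of discordant sibling pairs
diffs = [(fam[0][1] - fam[1][1]) if fam[0][0] else (fam[1][1] - fam[0][1])
         for fam in families if fam[0][0] != fam[1][0]]
sibling_gap = sum(diffs) / len(diffs)

print(f"raw IQ gap: {raw_gap:.2f}, within-sibling gap: {sibling_gap:.2f}")
# large raw gap, near-zero sibling gap, with no true effect anywhere
```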

The point of this discussion is not to get in the weeds on breastfeeding (you can read my whole chapter from Cribsheet about it). This is an illustrative example of a general issue: the control sets we typically consider are incomplete. There are a lot of papers that report effectively only the first two bars in the graph above. But those simple observable controls are just not sufficient. The residual confounding is real and it is significant.

(If you want another example, you can look back to the very similar kind of graph in Panic Headlines from last week. This problem is everywhere.)

Conclusions

The question of whether a controlled effect in observational data is “causal” is inherently unanswerable. We are worried about differences between people that we cannot observe in the data. We can’t see them, so we must speculate about whether they are there. Based on a couple of decades of working intensely on these questions in both my research and my popular writing, I think they are almost always there. I think they are almost always important, and that a huge share of the correlations we see in observational data are not close to causal.

There are two final notes on this.

First: A common approach in these papers is to hedge in the conclusion by saying, “Well, it might not be causal.” I find this hedge problematic. If the relationship between processed food and cancer isn’t causal, why do we care about it? The obvious interpretation of this result is that you should stop eating processed foods. But if the result isn’t causal, that interpretation is wrong. This hedge is a cop-out. And this approach — to bury the hedge in the conclusion — encourages the poorly informed and inflammatory media coverage that often follows.

Second: I recognize that other people may disagree and find these relationships more compelling. I believe we can have productive conversations about that. To my mind, though, these conversations need to be grounded in the theory I started with. That is, if you want to argue that there is a causal relationship between processed food and cancer, you need to be willing to make a case that you’re approximating a randomized trial with your analysis. If we focus our discussion on that claim, it will discipline our disagreements.

And last: Thank you for indulging my love of econometrics today. My dad may be the only person who gets this far in the newsletter, but even so, it was worth it. Back with more parenting content on Thursday.
