Emily Oster

New Study on Alcohol Consumption and Heart Disease

An excuse for a (very) deep dive on Mendelian randomization

12 min Read

I spend a lot of time thinking about how difficult it is to understand the relationship between diet and health, and the two examples I come back to frequently are coffee and alcohol. Both of these choices are sometimes linked to worse health and sometimes to better health. One day, coffee helps you live forever; three weeks later, another study says it causes early death. Alcohol consumption is subject to similar fluctuations — a glass of red wine a day is key to heart health; no, actually, any drinking is dangerous.

In both cases (as with much in diet), the underlying problem is that dietary choices are not random, and it’s hard to separate these choices from the other choices or characteristics they typically go along with. As a result, researchers work to use more sophisticated data techniques to answer these questions.

One of these techniques surfaced a couple of weeks ago in the context of alcohol. A new study on alcohol and cardiovascular effects came out in late March that used a technique called Mendelian randomization to try to better isolate causal effects. It got attention, not least because of this exotic empirical strategy, based on genetics.

However, this technique is somewhat confusing and, in my view, poorly understood even by some of the people who use it. So today, I want to do a deep dive. I’ll first try to explain, in a stylized example, how this works (and what the pitfalls might be). I’ll then talk about the particular details of this study.

Yes, this post is a little more technical than usual. The newsletter is called ParentData, after all! Stick with me. It’s interesting, I promise. 

Teaching example overview

Let’s put aside drinking and heart disease and turn to perhaps the most canonical relationship in economics: the relationship between education and wages. If people get more education, do they make more money and, if so, how much? If you think about it for a moment, you can see why it might be difficult to learn the answer to this question just by looking at wages across education groups. There are many other factors (family background, circumstances, ability, patience) that likely contribute to education but also to wages directly.

When researchers study this question, then, they look for strategies to get around these confounding factors. The ideal (from a research standpoint) would be to randomize how much education people get. Since we typically cannot do that for practical or ethical reasons, a common approach is to look for some other external factor that impacts individual education. In a famous example, researchers noted that because of compulsory schooling laws, the quarter of the year that you were born in impacted your educational attainment. They could then use the time of birth as what we call an “instrumental variable” to estimate a (more plausibly) causal impact of education on wages.

Very broadly, the idea in Mendelian randomization (which I will now call “MR” for word count reasons) is to recognize that your genetic code could be used as this instrumental variable.

How might that work in practice?

Quick biology reminder: You have two copies of each of your 23 chromosomes, one inherited from each parent. Each chromosome contains a number of genes, which all together are your genetic code.

Imagine for a moment that we’ve identified a genetic variant (a “SNP”) that strongly predicts college attendance. Let’s imagine it’s on chromosome 3, and we’re going to call this variant “COLLEGE.”* Let’s say your mother has one copy of the COLLEGE variant, on one of her two chromosome 3s.

When you are conceived, you get one copy of each chromosome from your mother and your father. This means you get only one of the chromosome 3 copies from your mother. And — here’s the key — which copy you get is random. As a result, there is a 50% chance you get her COLLEGE variant and a 50% chance you get the other copy, with no variant. (You’ll get your other copy of chromosome 3 from your father; here, I’m going to assume he doesn’t have the COLLEGE variant at all, so you definitely do not get it from him.)

In this scenario, you have a few siblings, and each of them also gets a copy of chromosome 3 from your mother. Some of you get the COLLEGE variant copy, and some get the other one. In expectation, half get each. But we have now generated random variation in the propensity to go to college within your family, based on this genetic lottery. We can potentially use that to estimate the effect of college on wages. Potentially being the key word, as doing so is going to require additional assumptions.
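To make the genetic lottery concrete, here is a minimal sketch in Python. It assumes the hypothetical setup above (a mother with one COLLEGE copy, a father with none); the function name, the seed, and the family size of four are illustrative choices, not anything from a real dataset.

```python
import random

# Hypothetical setup from the text: the mother carries COLLEGE on one of her
# two chromosome 3 copies; the father does not carry it at all.
MOTHER_COPIES = ("COLLEGE", "no variant")

def child_inherits(rng: random.Random) -> str:
    """Return which of the mother's chromosome 3 copies a child receives."""
    return rng.choice(MOTHER_COPIES)

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# One family with four siblings: each draw is independent of the others.
one_family = [child_inherits(rng) for _ in range(4)]
print("One family:", one_family)

# Across many children, the share with the variant settles near one half.
n_children = 100_000
n_carriers = sum(child_inherits(rng) == "COLLEGE" for _ in range(n_children))
share_with_variant = n_carriers / n_children
print(f"Share inheriting COLLEGE: {share_with_variant:.3f}")
```

Any single family can easily come out lopsided; the 50/50 split only shows up in expectation.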

One thing I want to be clear about: The “randomness” in genetic makeup here is necessarily conditional on your parental genes. Genetic variants are not, in general, randomly allocated around the population of the world. Since your parents’ (and other ancestors’!) genes impact their behavior and outcomes, and those behaviors and outcomes can impact you directly, it’s really only among siblings who share both parents that there is a condition of randomness.

Simple approach: The simplest approach to the data here would be to compare wages across children in the same family, some of whom inherited the COLLEGE variant and some of whom did not. What we can say with confidence, comparing siblings within a family, is whether the child who gets the COLLEGE variant has higher wages.

This impact may be causal, but it is also uninteresting. The question we are interested in is to what extent going to college increases wages. There is a simple way to imagine translating between the two. Specifically:

  • Calculate how much having the COLLEGE variant increases the chance of going to college within a group of siblings.
  • Calculate how much having the COLLEGE variant increases wages within a group of siblings.
  • Divide the second number by the first number. This effectively translates the impact of the COLLEGE gene on wages into an impact of college-going on wages. It re-scales the impact to get what you want.

This is called an IV (instrumental variables) estimator or a Wald estimator. The calculation is straightforward, but interpreting what you get as the causal impact of college-going on wages requires additional assumptions. (Want a more technical explanation of all of the following? Start with this seminal 1996 JASA paper, among the origins of last year’s Nobel Prize in Economics. Or the less technical explainer here.)
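The three steps above can be sketched on simulated sibling pairs. Everything in this block is made up for illustration: the true wage effect of college (10), the variant's boost to college-going (0.3), and the way family background feeds into both college and wages. The simulation also builds in the exclusion restriction by construction, since the variant touches wages only through college.

```python
import random

rng = random.Random(1)
TRUE_EFFECT = 10.0   # assumed wage gain from college (made-up units)
VARIANT_BOOST = 0.3  # assumed increase in P(college) from the variant

def simulate_sibling(has_variant: bool, family_base: float):
    """Return (went_to_college, wage) for one sibling."""
    p_college = family_base + (VARIANT_BOOST if has_variant else 0.0)
    college = rng.random() < p_college
    # Family background also raises wages directly: this is the confounder.
    wage = 30.0 + TRUE_EFFECT * college + 20.0 * family_base + rng.gauss(0.0, 5.0)
    return college, wage

college_gaps, wage_gaps = [], []
for _ in range(50_000):
    family_base = rng.uniform(0.2, 0.6)
    # Keep only sibling pairs where exactly one child drew the variant.
    first_has, second_has = rng.random() < 0.5, rng.random() < 0.5
    if first_has == second_has:
        continue
    carrier = simulate_sibling(True, family_base)
    non_carrier = simulate_sibling(False, family_base)
    college_gaps.append(carrier[0] - non_carrier[0])
    wage_gaps.append(carrier[1] - non_carrier[1])

first_stage = sum(college_gaps) / len(college_gaps)  # step 1: variant -> college
reduced_form = sum(wage_gaps) / len(wage_gaps)       # step 2: variant -> wages
wald_estimate = reduced_form / first_stage           # step 3: the ratio

print(f"Effect of variant on college-going: {first_stage:.3f}")
print(f"Effect of variant on wages:         {reduced_form:.3f}")
print(f"Wald estimate of college on wages:  {wald_estimate:.2f}")
```

Because the family background term cancels within each sibling pair, the ratio should land near the assumed true effect of 10, up to simulation noise. That only holds because the simulated variant has no other route to wages.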

What are the assumptions, and how do they work in the genetic case?

Causal interpretation: The key assumption here is what is called the exclusion restriction. Intuitively, the exclusion restriction says that in order to interpret our simple estimate as a causal impact, it must be the case that the random variable (in this example, the COLLEGE variant) impacts the outcome (wages) only as a result of its impact on the intermediate behavior (college-going). That is, the COLLEGE gene doesn’t lead to higher wages on its own.

In the case of these genetic analyses, there are several primary ways the exclusion restriction might be violated.

The first problem has a name: pleiotropy. This is the phenomenon whereby a single gene influences multiple traits. For example, imagine that this COLLEGE gene influences college-going but also height. We know that taller people make more money (seriously). The differences we see in wages within the family might then be due to differences in height, not differences in college-going, and it would be a mistake to attribute all the impact of the COLLEGE variant to college-going.
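A bit of arithmetic shows how this distorts the ratio. The sketch below is hypothetical: the function name and all the numbers are invented, and it simply assumes the variant's total effect on wages is the college channel plus a separate direct channel (height, say).

```python
def wald_with_direct_effect(
    effect_on_college: float,
    college_effect_on_wages: float,
    direct_effect_on_wages: float,
) -> float:
    """Wald ratio when the variant also moves wages through a second channel."""
    # Total effect of the variant on wages: college channel plus direct channel.
    effect_on_wages = (
        effect_on_college * college_effect_on_wages + direct_effect_on_wages
    )
    return effect_on_wages / effect_on_college

# Exclusion restriction holds: the ratio recovers the assumed college effect.
no_pleiotropy = wald_with_direct_effect(0.25, 10.0, 0.0)

# The variant also adds 1.0 to wages via height: the ratio overshoots.
with_pleiotropy = wald_with_direct_effect(0.25, 10.0, 1.0)

print(no_pleiotropy, with_pleiotropy)
```

The overshoot equals the direct effect divided by the variant's effect on college-going, so a variant that only weakly predicts college magnifies even a small direct effect.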

A related issue is linkage disequilibrium. Genes that are near each other on a chromosome are more likely to be inherited together. If the COLLEGE gene is right next to a HEIGHT gene, you could get a form of the pleiotropy problem, even if they are distinct genes.

A final issue is that most of the time we have no idea what the gene really does. My hypothetical COLLEGE gene doesn’t, like, fill out the Common App for you. A gene that is predictive of college attendance could be predictive for any number of reasons — because it influences patience, because it influences ability, because it influences the likelihood of being good enough at tennis to play in college. But in some cases, these other factors could also independently predict wages. Again, it would be a mistake, then, to attribute all the effect of the gene to its impact on college-going.

None of this is to say that these analyses cannot be useful, or cannot deliver causal estimates. In some cases (breast cancer, for example), we have genes that clearly lead to a dramatically increased risk of the disease. We could use within-family variation in these genes to estimate the impact of getting breast cancer on various outcomes or behaviors.

Even here, though, it wouldn’t necessarily be appropriate to use this to try to (for example) estimate the impact of life expectancy on happiness, even though these genetic variants do impact life expectancy, because they could influence happiness for other reasons too. As in virtually all cases where we use these instrumental variables strategies, it is necessary to think really carefully about what, exactly, is going on.

The literature using these techniques does, in fact, think about these exclusion restrictions. My point is simply that, in some of these settings, the problems are hard to get around.

New study: alcohol and heart disease

I dragged us through that long discussion of the logic of MR in order to discuss this new study about the link between alcohol consumption and cardiovascular disease. In the paper, the authors aim to use the MR approach to generate causal estimates of this relationship. As they note, when we look at a cross section of people, we tend to find that light drinking is associated with better heart health and heavy drinking with worse. These authors are rightly concerned that this result might simply reflect the fact that people who drink lightly tend to be better educated, wealthier, and less likely to smoke than either abstainers or those who drink heavily.

Their proposed solution is to exploit a set of genetic variants that have been associated with alcohol dependence. They use data on individuals that has information on their drinking amounts, heart disease, and these genetic variants. The authors use a form of the analysis described above to estimate a relationship between drinking and heart disease, using the genetic variation as the instrumental variable. Their conclusion is that this approach shows no health advantage to light drinking and a large health risk to heavy drinking — that all drinking is at least slightly bad, and drinking more is worse.

The paper has gotten some significant attention. The New York Times wrote about it, quoting a doctor who said the conclusions of the study “totally changed my life.”

I, however, remain skeptical. The analysis here is subject to a number of the complex concerns raised above. In several cases, the variants the authors explore are associated with outcomes even for non-drinkers. They attempt to exclude these particular variants from their main results, but this raises the general concern that these genes are impacting outcomes for reasons unrelated to drinking (this would violate the exclusion restriction). In a more technical sense, they are focused on instrumenting for both a linear and squared term in the analysis, and it isn’t clear that this will generate causal impacts even putting aside confounding concerns.

The biggest issue with this paper, though, is simply that they do not do this analysis within sibling groups. I noted up top the idea that genes are randomly assigned and you could use that randomization for identification. This is true only within families. Genes aren’t randomly allocated around the population overall. However, in this paper the authors do not observe family groups. So rather than compare two siblings who got a different set of genetic endowments at random, they are comparing people whose family genetic makeup is different.

This approach is subject to more basic concerns. The individuals in the study whose genetic makeup is associated with more alcohol consumption are also more likely to have had parents who consumed more alcohol. This could matter for all kinds of reasons having nothing to do with their own consumption. The authors of the paper do not, as far as I can tell, observe anything about family background, parental drinking, or anything else like that.
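Here is a rough simulation of why this matters. It is a sketch under invented assumptions, not a reconstruction of the paper's data: a single variant raises drinking, only mothers can carry it, and growing up with a carrier parent affects the health outcome directly (the made-up `HOUSEHOLD_EFFECT`). The assumed causal effect of a person's own drinking is 0.5.

```python
import random

rng = random.Random(2)
TRUE_EFFECT = 0.5       # assumed causal effect of own drinking on the outcome
HOUSEHOLD_EFFECT = 2.0  # assumed direct effect of growing up with a carrier parent

def mean(values):
    return sum(values) / len(values)

def simulate_child(has_variant: bool, mother_carrier: bool):
    """Return (drinking, outcome) for one child."""
    drinking = 1.0 * has_variant + rng.gauss(0.0, 1.0)
    outcome = (TRUE_EFFECT * drinking
               + HOUSEHOLD_EFFECT * mother_carrier
               + rng.gauss(0.0, 1.0))
    return drinking, outcome

carriers, non_carriers = [], []    # pooled across all families
drink_gaps, outcome_gaps = [], []  # within discordant sibling pairs
for _ in range(50_000):
    mother_carrier = rng.random() < 0.4
    # A child can inherit the variant only if the mother carries it.
    genotypes = [mother_carrier and rng.random() < 0.5 for _ in range(2)]
    kids = [simulate_child(g, mother_carrier) for g in genotypes]
    for g, kid in zip(genotypes, kids):
        (carriers if g else non_carriers).append(kid)
    if genotypes[0] != genotypes[1]:
        with_variant, without = kids if genotypes[0] else kids[::-1]
        drink_gaps.append(with_variant[0] - without[0])
        outcome_gaps.append(with_variant[1] - without[1])

# Pooled comparison: carriers versus non-carriers, ignoring family.
population_wald = (
    (mean([o for _, o in carriers]) - mean([o for _, o in non_carriers]))
    / (mean([d for d, _ in carriers]) - mean([d for d, _ in non_carriers]))
)
# Sibling comparison: the shared household term cancels out.
within_family_wald = mean(outcome_gaps) / mean(drink_gaps)

print(f"Pooled across families: {population_wald:.2f}")
print(f"Within sibling pairs:   {within_family_wald:.2f}")
```

With these particular numbers, the pooled ratio works out to about 2.0 in expectation, four times the assumed true effect, because every carrier grew up with a carrier parent while most non-carriers did not. The sibling comparison should sit near 0.5.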

In the end, then, the Mendelian randomization used in this paper is … not random. Forget about exclusion restrictions or interpretation concerns. This paper falls on a much more basic sword.

To be clear: I am quite sympathetic to the authors’ views that the slight positive effect of moderate alcohol consumption is correlation rather than causation. I felt that way before, and I still do. And there are methodologically stronger papers that use this approach to answer the same question (notably, this one). This particular paper, however, is too problematic to move the needle very much.

This post was a challenge, and I’m grateful for help from Jonathan Roth, Peter Hull, Dan Benjamin, and Penelope Shapiro.


*This is all very hypothetical. Although there are some SNPs associated with education, none of them are strongly associated, and I have no idea if they are on chromosome 3. Also, genetic variants have names like “rs1260326,” not “COLLEGE.”

