Establishing a causal relationship is hard. Evaluating a causal claim isn't as difficult.

· by Brian Anderson · Read in about 7 min · (1326 words) ·

In a recent WSJ column, Christopher Mims borrows from behavioral economics to offer an explanation for “Why the world seems worse than it is.” The details of the column aren’t important for this post, but a passage caught my attention…

Sometimes known as the availability heuristic, this bias is one reason parents are afraid to let children play unsupervised, though it’s never been safer to be a child in America.

Briefly, the availability bias/heuristic posits that the more recent a piece of information is, and particularly the more easily a person can recall its consequences (as with frightening news), the more salient that information becomes, and the more likely a person is to rely on it, irrespective of its veracity, to inform a future decision. Mr. Mims posits that the availability heuristic, heightened by the nonstop barrage of information online, is the mechanism for the quote above and also “why people are afraid of shark attacks, even though they’re more likely to drown at the beach.”

Let’s leave aside whether the availability heuristic is a true causal mechanism that explains why parents are afraid to let children play unsupervised, and simply assume that it is plausible. If we were to lay out the causal chain, it might look something like this…

\[\text{Exposure to Frightening News}\rightarrow\text{Increased Parent Supervision}\]

How might we go about evaluating whether we should put much stock in Mr. Mims’ causal claim, without using sophisticated causal inference frameworks like the Rubin causal model or structural causal modeling?

Start with a thought experiment

One approach is to use a thought experiment to imagine different variables and manipulations for the predictor variable, in this case, exposure to frightening news. For example, we might say that the frightening news was an attempted abduction of a child. To a parent, this is truly horrific, and so it is easy to conclude that a normal reaction would be to increase parent supervision.

Now consider another example, recently in the news, about a parent receiving a visit from child protective services for allowing her 8-year-old daughter to take a dog for a walk by herself. To one parent, a visit from protective services might be truly frightening, and so his or her response is to increase parental supervision. But judging by the social media reaction to the story, other parents might react to the news by decreasing supervision, perhaps as a form of protest.

The key point is that because we can imagine a realistic scenario in which exposure to frightening news does not increase parent supervision, meaningful contextual factors, or boundary conditions, may be necessary to fully understand when frightening news leads to increased parental supervision and when it does not. That is, there might be another important predictor out there, one that changes the nature of the impact frightening news has on parental supervision. If that’s the case, even without knowing what that specific predictor or contextual factor is, we should be a bit skeptical about the strength of the claim and its applicability.
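This reasoning can be made concrete with a toy simulation. In the sketch below, every variable and coefficient is invented for illustration, including a hypothetical “protest-minded” reaction style; the point is only to show how a contextual factor can flip the sign of the effect of frightening news on supervision, so that the average effect hides two opposite reactions:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 10_000

# Invented variables: exposure to frightening news (0/1) and a
# hypothetical contextual factor, a parent's reaction style
# (0 = fearful, 1 = protest-minded).
news = rng.integers(0, 2, n)
protest = rng.integers(0, 2, n)

# Invented effects: news raises supervision for fearful parents (+1.0)
# but lowers it for protest-minded parents (-0.5), an interaction.
supervision = news * (1.0 - 1.5 * protest) + rng.normal(0, 1, n)

# The marginal (average) effect of news masks the sign flip.
avg_effect = supervision[news == 1].mean() - supervision[news == 0].mean()
effect_fearful = (supervision[(news == 1) & (protest == 0)].mean()
                  - supervision[(news == 0) & (protest == 0)].mean())
effect_protest = (supervision[(news == 1) & (protest == 1)].mean()
                  - supervision[(news == 0) & (protest == 1)].mean())
print(f"average effect of news:  {avg_effect:+.2f}")      # ~ +0.25
print(f"effect among fearful:    {effect_fearful:+.2f}")  # ~ +1.00
print(f"effect among protesters: {effect_protest:+.2f}")  # ~ -0.50
```

If we only looked at the average effect, we would conclude that frightening news modestly increases supervision, missing that the effect reverses entirely for one group.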

Consider the assumptions

Another approach is to think through the assumptions. Mr. Mims posits that the availability heuristic causes increased parental supervision, “though it’s never been safer to be a child in America.” The implication seems to be that parents are cognizant that it’s very safe today to be an American kid, but despite that knowledge, exposure to frightening news increases supervision, which may not be needed because America is so safe today.

Let’s work with the assumption that people are aware that it is much safer today for American kids than in the past. It seems plausible that people associate increased supervision with increased safety; after all, the frightening event could be a situation in which a child would have been safer had parental supervision been present. In the aggregate, then, knowledge that America is safer for kids may be not an assumption but a logical consequence of increased supervision. So in reality, our causal chain may look something like this…

\[\text{Exposure to Frightening News}\rightarrow\text{Increased Parent Supervision}\rightarrow\text{Kids Are Safer}\]

In this case, the assumption underlying the causal claim, namely that exposure to a frightening event increases supervision even though parents know America is safer for kids today, is not nearly as strong if parents are also aware that increased supervision, in the aggregate, is what makes America safer.

Of course, we are still assuming causality here between exposure to frightening news and increased supervision, but it’s not nearly as strong—increased supervision is simply the mechanism, or mediator, that connects exposure to frightening news to the more important outcome variable, kids being safe. When we put this possibility together with the possibility of contextual factors at play, our skepticism about the causal claims made in the column should increase.
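A toy mediation simulation can illustrate this weaker claim. All the coefficients below are invented; the sketch just shows that when supervision is the mediator, the entire association between frightening news and safety runs through supervision, and holding supervision fixed leaves no direct effect of news:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 50_000

# Invented mediation chain: news -> supervision -> safety.
news = rng.integers(0, 2, n)
supervision = 0.8 * news + rng.normal(0, 1, n)    # news raises supervision
safety = 0.5 * supervision + rng.normal(0, 1, n)  # supervision raises safety

# Total effect of news on safety: ~0.8 * 0.5 = +0.40
total = safety[news == 1].mean() - safety[news == 0].mean()

# Regress safety on news while holding supervision fixed: the direct
# effect of news (beta[1]) is ~0, since supervision mediates all of it.
X = np.column_stack([np.ones(n), news, supervision])
beta, *_ = np.linalg.lstsq(X, safety, rcond=None)
print(f"total effect of news:  {total:+.2f}")    # ~ +0.40
print(f"direct effect of news: {beta[1]:+.2f}")  # ~ +0.00
```

Under this data-generating process, frightening news still matters, but only as the first link in a chain whose real payoff is kids being safer.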

Alternate explanations

Another way to question causal claims is to think about alternate explanations, and the easiest way for me to do this is to consider the role of time. Let’s keep going with the assumptions that people are aware that America is safer today, and that they believe increased parental supervision improves kids’ safety. Mr. Mims also posits that “it’s never been safer to be a child in America.” The implication is that, in the past, America was less safe for kids, and given our earlier assumptions, it is also reasonable that parents are aware America wasn’t as safe in the past.

So let’s put together a new causal model, one based on parents knowing that America wasn’t as safe in the past, and on the assumption that increasing supervision increases safety. What might the causal chain look like?

\[\text{Kids Safety}_{t-1}\rightarrow\text{Increased Parent Supervision}_{t}\rightarrow\text{Kids Safety}_{t+1}\]

The \(t\) subscript is shorthand for a time period. We read the model above as: kids’ safety at some point before supervision increased (\(t-1\)), the point at which parents increased supervision (\(t\)), and kids’ safety after supervision increased (\(t+1\)).

What doesn’t appear in the causal chain? Exposure to a frightening event.

In the model above, under the same conditions Mr. Mims describes, parents are aware that America wasn’t as safe before, so they decide to increase supervision, which then causes safety to improve. This is an alternate causal model, one that is equally plausible given the same observations about parental supervision and kids’ safety Mr. Mims uses in the column, but one that explains a change in parental supervision without exposure to a frightening event as the causal mechanism.
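The alternate model can also be simulated. In the sketch below, every variable and coefficient is again invented for illustration: supervision responds only to past safety, and frightening news is generated completely independently of everything else. The observed data nonetheless show supervision predicting later safety:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000

# Invented alternate model: supervision responds to past safety, not news.
safety_past = rng.normal(0, 1, n)                       # kids' safety at t-1
supervision = -0.6 * safety_past + rng.normal(0, 1, n)  # less safe past -> more supervision
safety_later = (0.3 * safety_past + 0.8 * supervision   # supervision improves safety at t+1
                + rng.normal(0, 1, n))

# Frightening news is generated independently and plays no causal role.
news = rng.integers(0, 2, n)

r_sup = np.corrcoef(supervision, safety_later)[0, 1]
r_news = np.corrcoef(news, supervision)[0, 1]
print(f"corr(supervision, later safety): {r_sup:+.2f}")   # clearly positive
print(f"corr(news, supervision):         {r_news:+.2f}")  # ~ 0
```

The same observed pattern, more supervision alongside safer kids, arises with no frightening event anywhere in the data-generating process.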

Of course, the original causal model could be accurate; but so could our alternate models.

When we put together the likelihood of boundary conditions on the model, assumptions that weaken the causal claims, and the presence of alternate causal models consistent with the observed data, we should be quite skeptical about the causal claims made in the column.

So what to do?

I really do not mean to criticize Mr. Mims’ column, and his thesis may well be valid. But as we have shown, his thesis may also be invalid. The role of causal research designs is to test different theoretical perspectives and, ideally leveraging Bayesian inference, pit the causal claims of one theory against another. The hard reality is that establishing causal relationships is devilishly difficult, and in the social sciences, truly, exceptionally challenging.

Truth be told, I think the most useful part of the column comes near the end, where Mr. Mims quotes psychology professor Steven Pinker…

I’m always skeptical of now-more-than-ever observations that are not backed up by time-series data, since they themselves can be products of the availability heuristic and may be inaccurate.

So true. To Dr. Pinker’s point I would add that even with time-series data, absent steps to eliminate alternate explanations and confounding factors, skepticism is the safe response to these kinds of causal claims. Hopefully this post improves your ability to evaluate the causal claims you come across.