In November, the Supreme Court heard arguments in Fulton v. City of Philadelphia. That case involves free exercise questions surrounding a city’s refusal to continue to allow Catholic Social Services to place children in foster care after the agency clarified that it would not certify unmarried couples or same-sex couples to be foster parents. One issue raised in the briefing and oral arguments was the degree to which shutting out Catholic Social Services would harm foster children, given that the City was simultaneously making urgent appeals for more foster families amid a chronic shortage. As an example, the petitioner pointed to what happened in Boston after the state legalized same-sex marriage and refused to grant Catholic Charities an exception from a legal requirement that the agency place children with same-sex couples.

In a Balkinization blog post in late October, Professors Netta Barak-Corren and Nelson Tebbe claimed that they had uncovered new empirical evidence that children were not harmed when Catholic Charities, “one of the largest placement agencies in Boston,” shut its doors rather than comply with a state law that would have required it to place children with same-sex couples, in violation of the agency’s religious beliefs. Such an empirical claim is counterintuitive and—given that Fulton involves a similar situation in Philadelphia—worth exploring more closely.

The Evidence

Barak-Corren and Tebbe conclude that the “empirical claim that children are harmed when religious placement agencies close is not supported by precedent, and particularly its most prominent precedent—the Boston case.”

They note two kinds of evidence for this conclusion in their blog post: interviews with two “professionals who worked for Catholic Charities at the time it closed its child placement operation,” and observational data on (1) the number of days children in Massachusetts spent in foster care in the year before and two years after Catholic Charities closed its doors and (2) the Massachusetts adoption rate.

The first professional, who said that she and “her fellow agency workers were generally supportive of the practice of placing children with same-sex couples,” claimed that transitioning cases from Catholic Charities to another agency went smoothly and had no negative effects on children. True, some staff members left, so “caseloads were elevated for about six months,” and she was “working on adoption, not foster care,” but “in her experience,” most families will work with a new agency when there is a transition.

The second professional echoed the first. When Catholic Charities shut down, “most of our staff as well as their cases went with this other agency.” Thus, the transition “was as beneficial as possible for the kids and families that they were working with at the time.” And while there was “an adjustment period” given the enormity of the change, “to his knowledge, there were no families who had been working with Catholic Charities who refused to transition to other agencies after the closure.”

Barak-Corren and Tebbe also point to some data. Comparing the year before Catholic Charities closed with two years after, they note that Massachusetts children’s “time in foster care slightly decreases, from a median of 468 days in 2005 to 338 days in 2008.” Further, they point out that “the rate of adoption for children with that goal remains high and stable, at 93-95%.”

Why This Evidence Is Not Sufficiently Persuasive

The problem with the interviews is that they are merely anecdotal evidence. Moreover, two interviews are not very many, so we have very little of a very weak kind of evidence. It also appears the interviewees supported the government’s controversial policy, the very policy they now claim inflicted no harm. Would we be satisfied with such evidence in any other context? Nor are we told whether Barak-Corren and Tebbe conducted interviews they have not reported that contradict the views of the two interviewees they did report.

The problem with the data is that Barak-Corren and Tebbe report only descriptive statistics, not inferential statistics. The question under discussion is whether closing Catholic Charities’ foster program caused harm to foster children. And causal claims cannot be investigated with just descriptive statistics. That’s because questions of causal inference require determining counterfactuals. This is often referred to as the potential outcomes framework: what would the outcome have been under an alternative scenario where the unit of observation did not (or did) receive the treatment, holding all else constant?
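
To make that framework concrete in its standard notation (ours, not Barak-Corren and Tebbe’s): for a unit of observation $i$, let $Y_i(1)$ denote the outcome if the unit receives the treatment and $Y_i(0)$ the outcome if it does not. The causal effect of the treatment on that unit is

$$\tau_i = Y_i(1) - Y_i(0),$$

yet for any given unit only one of the two potential outcomes can ever be observed.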

Of course, this is impossible outside of science fiction and creates a problem of missing data—we can never see the outcome in the alternative universe for any one individual, or in this case, state. Instead, researchers seeking to answer questions like ours attempt to create two groups that appear to be essentially equal on factors that matter for the outcome being studied, giving one group the treatment (or intervention) and withholding it from the other. By measuring the differences between these two otherwise identical groups on the outcome being studied, one can infer the degree to which the treatment caused the difference. This is why random assignment of subjects to either a treatment or control group in experimental designs is the gold standard for determining causality.
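
In the simplest randomized setting, this works because random assignment makes the two groups comparable in expectation, so the difference in group means,

$$\widehat{ATE} = \bar{Y}_{\text{treatment}} - \bar{Y}_{\text{control}},$$

serves as an unbiased estimate of the average treatment effect. That is a textbook sketch, not anything computed in the Balkinization post.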

But as with alternative universes, even this is often not fully possible since some of the most interesting or important causal questions cannot be examined under the conditions of a controlled experiment, as is the case with foster care and adoptions. This leaves us with the task of inferring causality from the messy data generated by the real world.

This requires inferential statistics, rather than just the descriptive statistics Barak-Corren and Tebbe relied on. For the Massachusetts data, two potential methods are the difference-in-differences technique and the synthetic control method. Barak-Corren and Tebbe employ neither, so they cannot accurately estimate what the Massachusetts data would have looked like had Catholic Charities not dropped out in 2006. Perhaps the number of days in foster care would have been even lower. Perhaps adoption rates would have been even higher.
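
As a rough illustration of what such an approach would involve (our sketch, using hypothetical comparison states rather than anything in the Balkinization post), a difference-in-differences estimate of the closure’s effect on, say, median days in foster care would compare Massachusetts’s change from before to after 2006 with the corresponding change in comparison states that did not lose a major religious placement agency:

$$\hat{\tau}_{DD} = \left(\bar{Y}_{\text{MA, post}} - \bar{Y}_{\text{MA, pre}}\right) - \left(\bar{Y}_{\text{comparison, post}} - \bar{Y}_{\text{comparison, pre}}\right).$$

The synthetic control method goes a step further, constructing the comparison as a weighted combination of donor states chosen to track Massachusetts’s pre-2006 trajectory as closely as possible.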

In short, two interviews and some descriptive statistics tell us virtually nothing about the causal question of what effect Catholic Charities’ departure from Massachusetts foster care had on outcome variables of interest.

Another problem with extrapolating from the Massachusetts example to analogous situations in other states (as in Fulton) is that we have no idea whether the Bay State is an outlier. Barak-Corren and Tebbe admit as much: “None of this is to say that things could not turn out differently in another context where a transition is managed less well.” And they ignore descriptive statistics from Illinois on a different outcome that cut the other way. There, in the seven-year period after the state ended its partnership with faith-based agencies, the number of non-relative foster homes dropped nearly in half, from 11,386 to 6,034.

This raises another question: are the two outcome variables Barak-Corren and Tebbe focus on (days in foster care and adoption rates) the ones we should be looking at? For instance, is the number of days in foster care a good metric or do we need to know a lot more about why children are in foster care and where they are going afterwards? And what about other measures? For example, the percentage of kids aging out of the foster care system in Massachusetts—generally seen as a negative outcome—spiked right after Catholic Charities closed.

Barak-Corren and Tebbe’s claim to have “uncovered evidence that children were not harmed” would never satisfy social science standards of causation. The two professors want “arguments about outcomes [to] rest on actual data,” but any old data is not enough to support the claims they make, at least not if one wants to rigorously claim something causal about the real world.