Edited, memorised or added to reading list on 13-Apr-2022 (Wed)


#causality #statistics
We denote by π‘Œ(1) the potential outcome of happiness you would observe if you were to get a dog ( 𝑇 = 1 ). Similarly, we denote by π‘Œ(0) the potential outcome of happiness you would observe if you were to not get a dog ( 𝑇 = 0 ). In scenario 1, π‘Œ(1) = 1 and π‘Œ(0) = 1. In contrast, in scenario 2, π‘Œ(1) = 1 and π‘Œ(0) = 0.

#causality #statistics
More generally, the potential outcome Y(t) denotes what your outcome would be if you were to take treatment t. A potential outcome Y(t) is distinct from the observed outcome Y in that not all potential outcomes are observed. Rather, all potential outcomes can potentially be observed. The one that is actually observed depends on the value that the treatment T takes on.




#causality #statistics

The Fundamental Problem of Causal Inference:

It is impossible to observe all potential outcomes for a given individual





Flashcard 7070610492684

Tags
#causality #statistics
Question

The Fundamental Problem of Causal Inference:

It is [...] to observe all potential outcomes for a given individual

Answer
impossible








Flashcard 7070612065548

Tags
#causality #statistics
Question

The Fundamental Problem of Causal Inference:

It is impossible to observe [...] potential outcomes for a given individual

Answer
all












#causality #statistics

the fundamental problem of causal inference

It is fundamental because if we cannot observe both Y_i(1) and Y_i(0), then we cannot observe the causal effect Y_i(1) - Y_i(0). This problem is unique to causal inference because, in causal inference, we care about making causal claims, which are defined in terms of potential outcomes. For contrast, consider machine learning. In machine learning, we often only care about predicting the observed outcome Y, so there is no need for potential outcomes, which means machine learning does not have to deal with this fundamental problem that we must deal with in causal inference.





#causality #statistics
The potential outcomes that you do not (and cannot) observe are known as counterfactuals because they are counter to fact (reality). "Potential outcomes" are sometimes referred to as "counterfactual outcomes," but we will never do that in this book because a potential outcome Y(t) does not become counter to fact until another potential outcome Y(t′) is observed. The potential outcome that is observed is sometimes referred to as a factual. Note that there are no counterfactuals or factuals until the outcome is observed. Before that, there are only potential outcomes.




#causality #statistics
We get the average treatment effect (ATE) by taking an average over the ITEs: τ ≜ E[Y_i(1) - Y_i(0)] = E[Y(1) - Y(0)], where the average is over the individuals i if Y_i(t) is deterministic. If Y_i(t) is random, the average is also over any other randomness.
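In a simulation, where (unlike in real data) both potential outcomes are known for every individual, the ATE is just the mean of the ITEs. A minimal sketch, with made-up outcome probabilities:

```python
import numpy as np

# Hypothetical population: we generate BOTH potential outcomes for each
# individual, something only a simulation can do (the fundamental problem
# of causal inference forbids observing both in real data).
rng = np.random.default_rng(0)
n = 1000
y0 = rng.binomial(1, 0.3, size=n)  # Y_i(0): P(good outcome without treatment) = 0.3
y1 = rng.binomial(1, 0.6, size=n)  # Y_i(1): P(good outcome with treatment) = 0.6

ite = y1 - y0          # individual treatment effects Y_i(1) - Y_i(0)
ate = ite.mean()       # ATE = E[Y(1) - Y(0)], here close to 0.6 - 0.3 = 0.3
print(f"ATE estimate: {ate:.3f}")
```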





#causality #has-images #statistics
We will take this table as the whole population of interest. Because of the fundamental problem of causal inference, this is fundamentally a missing data problem. All of the question marks in the table indicate that we do not observe that cell.





#causality #has-images #statistics
Well, what assumption(s) would make it so that the ATE is simply the associational difference? This is equivalent to saying "what makes it valid to calculate the ATE by taking the average of the Y(0) column, ignoring the question marks, and subtracting that from the average of the Y(1) column, ignoring the question marks?" This ignoring of the question marks (missing data) is known as ignorability.
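The column-averaging described here can be sketched with a table in which the question marks are NaNs (the values below are made up):

```python
import numpy as np
import pandas as pd

# Each unit reveals only the potential outcome matching its treatment;
# the other cell is a "question mark" (NaN). Values are illustrative.
df = pd.DataFrame({
    "T":  [1, 1, 0, 0, 1, 0],
    "Y1": [1, 1, np.nan, np.nan, 0, np.nan],  # observed only when T = 1
    "Y0": [np.nan, np.nan, 0, 1, np.nan, 0],  # observed only when T = 0
})

# Under ignorability, averaging each column while skipping the NaNs
# (pandas' default) is a valid estimate of E[Y(1)] - E[Y(0)].
ate_est = df["Y1"].mean() - df["Y0"].mean()
print(ate_est)  # (1+1+0)/3 - (0+1+0)/3 = 1/3
```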









#causality #statistics
The ignorability assumption is used in Equation 2.3. We will talk more about Equation 2.4 when we get to Section 2.3.5. Another perspective on this assumption is that of exchangeability.




Flashcard 7070653746444

Tags
#causality #statistics
Question
The ignorability assumption is used in Equation 2.3. We will talk more about Equation 2.4 when we get to Section 2.3.5. Another perspective on this assumption is that of [...another assumption?].
Answer
exchangeability








#causality #statistics
Exchangeability means that the treatment groups are exchangeable in the sense that if they were swapped, the new treatment group would observe the same outcomes as the old treatment group, and the new control group would observe the same outcomes as the old control group.




#causality #statistics
To identify a causal effect is to reduce a causal expression to a purely statistical expression. In this chapter, that means to reduce an expression from one that uses potential outcome notation to one that uses only statistical notation such as T, X, Y, expectations, and conditioning. This means that we can calculate the causal effect from just the observational distribution P(X, T, Y).
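A sketch of identification in action on simulated data (all numbers invented): with a binary confounder X, conditional exchangeability given X, and positivity, the adjustment formula ATE = E_X[E[Y | T = 1, X] - E[Y | T = 0, X]] recovers the causal effect from samples of P(X, T, Y) alone, while the naive associational difference does not:

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 100_000
x = rng.binomial(1, 0.5, size=n)                 # confounder
t = rng.binomial(1, np.where(x == 1, 0.8, 0.2))  # X causes T
y = rng.binomial(1, 0.2 + 0.3 * t + 0.4 * x)     # true effect of T is 0.3
df = pd.DataFrame({"X": x, "T": t, "Y": y})

# Naive associational difference: biased upward by confounding.
naive = df.loc[df["T"] == 1, "Y"].mean() - df.loc[df["T"] == 0, "Y"].mean()

# Adjustment formula: average the within-X differences, weighted by P(X).
ate = 0.0
for _, grp in df.groupby("X"):
    diff = grp.loc[grp["T"] == 1, "Y"].mean() - grp.loc[grp["T"] == 0, "Y"].mean()
    ate += diff * len(grp) / len(df)
print(f"naive: {naive:.3f}, adjusted: {ate:.3f}")  # adjusted is close to 0.3
```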





#causality #statistics

Identifiability




#causality #statistics
We have seen that ignorability is extremely important (Equation 2.3), but how realistic of an assumption is it? In general, it is completely unrealistic because there is likely to be confounding in most data we observe (causal structure shown in Figure 2.1). However, we can make this assumption realistic by running randomized experiments, which force the treatment to not be caused by anything but a coin toss, so then we have the causal structure shown in Figure 2.2. We cover randomized experiments in greater depth in Chapter 5.




#causality #statistics
In observational data, it is unrealistic to assume that the treatment groups are exchangeable. In other words, there is no reason to expect that the groups are the same in all relevant variables other than the treatment.




#causality #statistics
there is no reason to expect that the groups are the same in all relevant variables other than the treatment. However, if we control for relevant variables by conditioning, then maybe the subgroups will be exchangeable. We will clarify what the "relevant variables" are in Chapter 3.




#causality #statistics
The idea is that although the treatment and potential outcomes may be unconditionally associated (due to confounding), within levels of X, they are not associated. In other words, there is no confounding within levels of X because controlling for X has made the treatment groups comparable.












#causality #has-images #statistics
We do not have exchangeability in the data because X is a common cause of T and Y. We illustrate this in Figure 2.3. Because X is a common cause of T and Y, there is non-causal association between T and Y. This non-causal association flows along the T ← X → Y path; we depict this with a red dashed arc.





#causality #has-images #statistics
However, we do have conditional exchangeability in the data. This is because, when we condition on X, there is no longer any non-causal association between T and Y. The non-causal association is now "blocked" at X by conditioning on X. We illustrate this blocking in Figure 2.4 by shading X to indicate it is conditioned on and by showing the red dashed arc being blocked there.




#causality #statistics
Conditional exchangeability is the main assumption necessary for causal inference. Armed with this assumption, we can identify the causal effect within levels of X.









#causality #statistics
The main reason for moving from exchangeability (Assumption 2.1) to conditional exchangeability (Assumption 2.2) was that it seemed like a more realistic assumption. However, we often cannot know for certain if conditional exchangeability holds. There may be some unobserved confounders that are not part of X, meaning conditional exchangeability is violated. Fortunately, that is not a problem in randomized experiments.




#causality #statistics
Positivity is the condition that all subgroups of the data with different covariates have some probability of receiving any value of treatment.




#causality #statistics

An "estimator" is a function that takes a dataset as input and outputs an estimate. We discuss this statistics terminology more in Section 2.4.

That's the math for why we need the positivity assumption, but what's the intuition? Well, if we have a positivity violation, that means that within some subgroup of the data, everyone always receives treatment or everyone always receives the control. It wouldn't make sense to be able to estimate a causal effect of treatment vs. control in that subgroup, since we see only treatment or only control. We never see the alternative in that subgroup.
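An empirical version of this check can be sketched as follows; the data is made up so that subgroup "b" only ever receives treatment:

```python
import pandas as pd

# Illustrative data: subgroup "b" never receives control, a positivity
# violation, so no treatment-vs-control contrast exists there.
df = pd.DataFrame({
    "X": ["a", "a", "a", "b", "b", "b"],
    "T": [0, 1, 1, 1, 1, 1],
})

for xv, grp in df.groupby("X"):
    p = grp["T"].mean()  # empirical probability of treatment in this subgroup
    print(xv, p, "positivity VIOLATED" if p in (0.0, 1.0) else "ok")
```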





#causality #statistics

The Positivity-Unconfoundedness Tradeoff

Although conditioning on more covariates could lead to a higher chance of satisfying unconfoundedness, it can lead to a higher chance of violating positivity. As we increase the dimension of the covariates, we make the subgroups for any level x of the covariates smaller.

This is related to the curse of dimensionality. As each subgroup gets smaller, there is a higher and higher chance that either the whole subgroup will have treatment or the whole subgroup will have control. For example, once the size of any subgroup has decreased to one, positivity is guaranteed to not hold.
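This shrinking can be simulated. In the sketch below (all parameters invented: 500 units, randomized treatment, 1 vs 4 vs 8 binary covariates), the fraction of occupied covariate strata that contain both treated and control units drops as the dimension grows:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 500
t = rng.binomial(1, 0.5, size=n)  # treatment assigned by coin flip

fracs = {}
for d in (1, 4, 8):  # number of binary covariates we condition on
    x = rng.binomial(1, 0.5, size=(n, d))
    strata = {}
    for row, ti in zip(x, t):
        # group units by their full covariate pattern
        strata.setdefault(tuple(row), set()).add(int(ti))
    # fraction of occupied strata containing both treatment groups
    fracs[d] = sum(s == {0, 1} for s in strata.values()) / len(strata)
    print(f"d={d}: {len(strata)} strata, {fracs[d]:.0%} contain both groups")
```

With d = 8 there are up to 256 strata for only 500 units, so many strata hold one or two units and empirical positivity fails in them.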
