Edited, memorised or added to reading queue on 14-Apr-2022 (Thu)


Flashcard 7070716923148

Tags
#causality #has-images #statistics

#causality #statistics
No interference means that my outcome is unaffected by anyone else's treatment; rather, my outcome is a function only of my own treatment. We've been using this assumption implicitly throughout this chapter. We'll now formalize it.

Assumption 2.4 (No Interference)

Y_i(t_1, ..., t_{i-1}, t_i, t_{i+1}, ..., t_n) = Y_i(t_i)

Of course, this assumption could be violated. For example, if the treatment is "get a dog" and the outcome is my happiness, it could easily be that my happiness is influenced by whether or not my friends get dogs, because we could end up hanging out more to have our dogs play together.

#causality #statistics
Consistency is the assumption that the outcome we observe Y is actually the potential outcome under the observed treatment T.
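
In symbols, this is commonly written as follows (the "Assumption 2.5" label follows the numbered list of assumptions later in this section):

```latex
% Assumption 2.5 (Consistency)
T = t \;\Longrightarrow\; Y = Y(t),
\qquad \text{or, more compactly,} \qquad
Y = Y(T)
```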

Flashcard 7070725573900

Tags
#causality #has-images #statistics

#causality #statistics
It might seem like consistency is obviously true, but that is not always the case. For example, if the treatment specification is simply "get a dog" or "don't get a dog," this can be too coarse to yield consistency. It might be that if I were to get a puppy, I would observe Y = 1 (happiness) because I needed an energetic friend, but if I were to get an old, low-energy dog, I would observe Y = 0 (unhappiness). However, both of these treatments fall under the category of "get a dog," so both correspond to T = 1. This means that Y(1) is not well defined, since it will be 1 or 0 depending on something that is not captured by the treatment specification. In this sense, consistency encompasses the assumption that is sometimes referred to as "no multiple versions of treatment."

#causality #statistics
stable unit-treatment value assumption (SUTVA)

#causality #statistics
SUTVA is satisfied if unit (individual) i's outcome is simply a function of unit i's treatment. Therefore, SUTVA is a combination of consistency and no interference (and also deterministic potential outcomes).

#causality #statistics

Assumptions of causal inference:

1. Unconfoundedness (Assumption 2.2)

2. Positivity (Assumption 2.3)

3. No interference (Assumption 2.4)

4. Consistency (Assumption 2.5)
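
Together, these assumptions identify the ATE. A sketch of the standard derivation, with each step labeled by the assumption it uses (no interference is what lets us write Y(t) with a single argument in the first place):

```latex
\begin{align*}
\mathbb{E}[Y(1) - Y(0)]
  &= \mathbb{E}_X\!\left[\mathbb{E}[Y(1) \mid X] - \mathbb{E}[Y(0) \mid X]\right]
     && \text{(law of total expectation)} \\
  &= \mathbb{E}_X\!\left[\mathbb{E}[Y(1) \mid T = 1, X] - \mathbb{E}[Y(0) \mid T = 0, X]\right]
     && \text{(unconfoundedness; positivity makes both terms well defined)} \\
  &= \mathbb{E}_X\!\left[\mathbb{E}[Y \mid T = 1, X] - \mathbb{E}[Y \mid T = 0, X]\right]
     && \text{(consistency)}
\end{align*}
```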


#causality #statistics
An estimate (noun) is an approximation of some estimand, which we get using data

#causality #statistics
An estimand is the quantity that we want to estimate.

#causality #statistics
When we say β€œidentification” in this book, we are referring to the process of moving from a causal estimand to an equivalent statistical estimand

#causality #statistics
When we say β€œestimation,” we are referring to the process of moving from a statistical estimand to an estimate

#causality #statistics
What do we do when we go to actually estimate quantities such as E_X[E[Y | T = 1, X] − E[Y | T = 0, X]]? We will often use a model (e.g., linear regression or some fancier predictor from machine learning) in place of the conditional expectations E[Y | T = t, X = x]. We will refer to estimators that use models like this as model-assisted estimators. Now that we've gotten some of this terminology out of the way, we can proceed to an example of estimating the ATE.
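
As a sketch of one such model-assisted estimator (illustrative only: the per-arm linear models, variable names, and toy data-generating process are my own choices, not the text's):

```python
import numpy as np

def model_assisted_ate(X, T, Y):
    """Estimate E_X[ E[Y | T=1, X] - E[Y | T=0, X] ] by fitting one
    linear model per treatment arm and averaging the predicted
    difference over the empirical distribution of X."""
    X, T, Y = np.asarray(X, float), np.asarray(T), np.asarray(Y, float)
    Xb = np.column_stack([np.ones(len(X)), X])  # add intercept column
    # Fit E[Y | T=t, X] for each arm via ordinary least squares.
    beta1, *_ = np.linalg.lstsq(Xb[T == 1], Y[T == 1], rcond=None)
    beta0, *_ = np.linalg.lstsq(Xb[T == 0], Y[T == 0], rcond=None)
    return float(np.mean(Xb @ beta1 - Xb @ beta0))

# Toy data with known ATE = 2: Y = 2*T + 3*X + noise, T randomized.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 1))
T = (rng.random(5000) < 0.5).astype(int)
Y = 2 * T + 3 * X[:, 0] + rng.normal(scale=0.1, size=5000)
ate = model_assisted_ate(X, T, Y)  # close to the true ATE of 2
```

Fancier predictors can be swapped in for the least-squares fits without changing the averaging step.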

#causality #statistics
A graph is a collection of nodes (also called β€œvertices”) and edges that connect the nodes.

#causality #statistics
If there is a directed path that starts at node X and ends at node Y, then X is an ancestor of Y, and Y is a descendant of X.

#causality #statistics
We will denote the descendants of X by de(X).
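
For instance, de(X) can be computed with a simple depth-first traversal (a minimal sketch; the adjacency-list dict representation is my own choice, and whether a node counts as its own descendant is a matter of convention, here it is excluded):

```python
def descendants(graph, node):
    """Return de(node): every node reachable from `node` by a directed
    path. `graph` maps each node to the list of its children."""
    seen = set()
    stack = [node]
    while stack:
        for child in graph.get(stack.pop(), []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen

# DAG: X -> Z -> Y and X -> W, so de(X) = {Z, W, Y}.
dag = {"X": ["Z", "W"], "Z": ["Y"], "W": [], "Y": []}
```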

#causality #statistics
If two parents X and Y share some child Z, but there is no edge connecting X and Y, then X → Z ← Y is known as an immorality.
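
Under that definition, immoralities can be found by checking every pair of parents of each node for a connecting edge (a small sketch; the adjacency-list dict representation is my own convention, not the text's):

```python
from itertools import combinations

def immoralities(graph):
    """Return (a, child, b) triples where a -> child <- b and there is
    no edge between a and b. `graph` maps nodes to lists of children."""
    # Invert the graph to get each node's parents.
    parents = {n: set() for n in graph}
    for node, children in graph.items():
        for child in children:
            parents.setdefault(child, set()).add(node)
    found = []
    for child, pars in parents.items():
        for a, b in combinations(sorted(pars), 2):
            # a and b form an immorality only if neither a -> b nor
            # b -> a exists.
            if b not in graph.get(a, []) and a not in graph.get(b, []):
                found.append((a, child, b))
    return found

# A -> C <- B with no edge between A and B is an immorality.
dag = {"A": ["C"], "B": ["C"], "C": []}
```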

#causality #has-images #statistics
For example, if we remove the A → B edge to get Figure 3.5, then A → C ← B is an immorality.

Flashcard 7070758604044

Tags
#causality #has-images #statistics


Question
For example, if we remove the A → B edge from Figure 3.3 to get Figure 3.5, then [...] is an immorality
Answer
A → C ← B


#causality #statistics
It turns out that much of the work for causal graphical models was done in the field of probabilistic graphical models. Probabilistic graphical models are statistical models while causal graphical models are causal models.

#causality #statistics

Assumption 3.1 (Local Markov Assumption)

Given its parents in the DAG, a node X is independent of all its non-descendants.


#causality #statistics
The Bayesian network factorization is also known as the chain rule for Bayesian networks or Markov compatibility.
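
Concretely, this is the standard factorization: if P is Markov with respect to a DAG with nodes X_1, ..., X_n, then

```latex
P(x_1, \dots, x_n) \;=\; \prod_{i=1}^{n} P\!\left(x_i \mid \mathrm{pa}_i\right)
```

where pa_i denotes the values of the parents of X_i in the DAG.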

#causality #statistics
As important as the local Markov assumption is, it only gives us information about the independencies in P that a DAG implies. It does not even tell us that if X and Y are adjacent in the DAG, then X and Y are dependent. And this additional information is very commonly assumed in causal DAGs. To get this guaranteed dependence between adjacent nodes, we will generally assume a slightly stronger assumption than the local Markov assumption: minimality.

#causality #statistics

Assumption 3.2 (Minimality Assumption)

1. Given its parents in the DAG, a node X is independent of all its non-descendants (Assumption 3.1).

2. Adjacent nodes in the DAG are dependent.


#causality #has-images #statistics
For example, if the DAG were simply two connected nodes X and Y as in Figure 3.8, the local Markov assumption would tell us that we can factorize P(x, y) as P(x)P(y|x), but it would also allow us to factorize P(x, y) as P(x)P(y), meaning it allows distributions where X and Y are independent. In contrast, the minimality assumption does not allow this additional independence. Minimality would tell us to factorize P(x, y) as P(x)P(y|x), and it would tell us that the additional independence (X ⊥⊥ Y) cannot hold in any P that is minimal with respect to Figure 3.8.

#causality #statistics
Definition 3.2 (What is a cause?) A variable X is said to be a cause of a variable Y if Y can change in response to changes in X.

#causality #statistics

Assumption 3.3 ((Strict) Causal Edges Assumption)

In a directed graph, every parent is a direct cause of all its children


#causality #statistics
In contrast, the non-strict causal edges assumption would allow for some parents to not be causes of their children. It would just assume that children are not causes of their parents. This allows us to draw graphs with extra edges to make fewer assumptions, just like we would in Bayesian networks, where more edges means fewer independence assumptions. Causal graphs are sometimes drawn with this kind of non-minimal meaning, but the vast majority of the time, when someone draws a causal graph, they mean that parents are causes of their children.

#causality #statistics

the main assumptions that we need for our causal graphical models to tell us how association and causation flow between variables are the following two:

1. Local Markov Assumption (Assumption 3.1)

2. Causal Edges Assumption (Assumption 3.3)


#causality #statistics
The flow of association and causation in general DAGs can be understood by understanding the flow in the minimal building blocks of graphs. The minimal building blocks of DAGs consist of chains (Figure 3.9a), forks (Figure 3.9b), immoralities (Figure 3.9c), two unconnected nodes (Figure 3.10), and two connected nodes (Figure 3.11).
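
A quick simulation illustrates how association flows through one of these building blocks, the immorality (collider); the linear-Gaussian structural equations here are my own illustrative choice:

```python
import numpy as np

# Simulate the collider building block X -> Z <- Y, with no edge
# between X and Y.
rng = np.random.default_rng(42)
n = 200_000
X = rng.normal(size=n)
Y = rng.normal(size=n)
Z = X + Y + rng.normal(size=n)

def corr(a, b):
    return float(np.corrcoef(a, b)[0, 1])

def partial_corr(a, b, c):
    """Correlation of a and b after linearly regressing out c."""
    ra = a - np.polyval(np.polyfit(c, a, 1), c)
    rb = b - np.polyval(np.polyfit(c, b, 1), c)
    return corr(ra, rb)

# Marginally, association does not flow through the collider:
# corr(X, Y) is approximately 0.
# Conditioning on the collider opens the path: partial_corr(X, Y, Z)
# is strongly negative (about -0.5 under these structural equations).
```

Chains and forks behave in the opposite way: they transmit association marginally and block it once the middle node is conditioned on.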

#causality #statistics

By β€œflow of association,” we mean whether any two nodes in a graph are associated or not associated. Another way of saying this is whether two nodes are (statistically) dependent or (statistically) independent.

Additionally, we will study whether two nodes are conditionally independent or not.



#causality #statistics
#causality #has-images #statistics
Answer: association