Do you want BuboFlash to help you learning these things? Click here to log in or create user.

Question

To work our way up to GLMs, we will begin by defining exponential family distributions. We say that a class of distributions is in the exponential family if it can be written in the form

Answer

\(p(y ; \eta)=b(y) \exp \left(\eta^{T} T(y)-a(\eta)\right)\)

Here, \(\eta\) is called the *natural parameter* (also called the canonical parameter) of the distribution; T(y) is the *sufficient statistic* (for the distributions we consider, it will often be the case that T(y) = y); and \(a(\eta)\) is the log partition function. The quantity \(e^{-a(\eta)}\) essentially plays the role of a normalization constant, that makes sure the distribution \(p(y;\eta)\) sums/integrates over y to 1.

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Question

To work our way up to GLMs, we will begin by defining exponential family distributions. We say that a class of distributions is in the exponential family if it can be written in the form

\(p(y ; \eta)=b(y) \exp \left(\eta^{T} T(y)-a(\eta)\right)\)

Here, \(\eta\) is called the **[...]** (also called the **[...]** ) of the distribution; T(y) is the *sufficient statistic* (for the distributions we consider, it will often be the case that T(y) = y); and \(a(\eta)\) is the log partition function. The quantity \(e^{-a(\eta)}\) essentially plays the role of a normalization constant, that makes sure the distribution \(p(y;\eta)\) sums/integrates over y to 1.

Answer

natural parameter

canonical parameter

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Question

To work our way up to GLMs, we will begin by defining exponential family distributions. We say that a class of distributions is in the exponential family if it can be written in the form

\(p(y ; \eta)=b(y) \exp \left(\eta^{T} T(y)-a(\eta)\right)\)

Here, \(\eta\) is called the *natural parameter* (also called the canonical parameter) of the distribution; T(y) is the **[...]** (for the distributions we consider, it will often be the case that T(y) = y); and \(a(\eta)\) is the log partition function. The quantity \(e^{-a(\eta)}\) essentially plays the role of a normalization constant, that makes sure the distribution \(p(y;\eta)\) sums/integrates over y to 1.

Answer

sufficient statistic

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Question

\(p(y ; \eta)=b(y) \exp \left(\eta^{T} T(y)-a(\eta)\right)\)

Here, \(\eta\) is called the *natural parameter* (also called the canonical parameter) of the distribution; T(y) is the *sufficient statistic* (for the distributions we consider, it will often be the case that T(y) = y); and \(a(\eta)\) is the **[...]** . The quantity \(e^{-a(\eta)}\) essentially plays the role of a normalization constant, that makes sure the distribution \(p(y;\eta)\) sums/integrates over y to 1.

Answer

log partition function

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Tags

#machine-learning #statistics #supervised-learning

Question

In CS229, what 3 modelling assumptions are used to derive GLMs?

Answer

1. \(y | x ; \theta \sim \text { Exponential Family }(\eta)\)

2. Goal is to predict \(h(x)=\mathrm{E}[T(y) | x]\)

3. The natural parameter \(\eta\) and the inputs x are related linearly: \(\eta = \theta^Tx\)

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Question

To introduce a little more terminology, the function g giving the distribution’s mean as a function of the natural parameter (\(g(\eta)=\mathrm{E}[T(y) ; \eta]\)) is called the **[...]** function. Its inverse, g −1 , is called the **[...]** function.

Answer

canonical response

canonical link

E.g. \(\phi=g(\eta)=1 /\left(1+e^{-\eta}\right)\)is the canonical response function for the Bernoulli distribution, \(\eta= g^{-1}(\phi)=\log (\phi /(1-\phi))\) is the canonical link function.

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

status | not read | reprioritisations | ||
---|---|---|---|---|

last reprioritisation on | reading queue position [%] | |||

started reading on | finished reading on |

, see transitive set § Transitive closure. In mathematics, the transitive closure of a binary relation R on a set X is the smallest relation on X that contains R and is transitive. For example, <span>if X is a set of airports and xRy means "there is a direct flight from airport x to airport y" (for x and y in X), then the transitive closure of R on X is the relation R+ such that x R+ y means "it is possible to fly from x to y in one or more flights". Informally, the transitive closure gives you the set of all places you can get to from any starting place. More formally, the transitive closure of a binary relation R on a set X is the transitive relation R+ on set X such that R+ contains R and R+ is minimal Lidl & Pilz (1998, p. 337).