# on 08-Mar-2018 (Thu)

#### Flashcard 1729398443276

Question
the beta and Dirichlet process allow [...] to drive the complexity of the learned model, while still permitting efficient inference algorithms.
the data

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
By leveraging stochastic processes such as the beta and Dirichlet process (DP), these methods allow the data to drive the complexity of the learned model, while still permitting efficient inference algorithms.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Annotation 1729511165196

#linear-algebra
In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them. Generalized inverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup.

Generalized inverse - Wikipedia
"Pseudoinverse" redirects here. For the Moore–Penrose inverse, sometimes referred to as "the pseudoinverse", see Moore–Penrose inverse. In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them. Generalized inverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup. This article describes generalized inverses of a matrix A. Formally, given a matrix A ∈

#### Annotation 1729513262348

#linear-algebra
Formally, given a matrix $A \in \mathbb{R}^{n\times m}$ and a matrix $A^{g} \in \mathbb{R}^{m\times n}$, $A^{g}$ is a generalized inverse of $A$ if it satisfies the condition $A A^{g} A = A$.

Generalized inverse - Wikipedia
nverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup. This article describes generalized inverses of a matrix A. Formally, given a matrix $A \in \mathbb{R}^{n\times m}$ and a matrix $A^{g} \in \mathbb{R}^{m\times n}$, $A^{g}$ is a generalized inverse of $A$ if it satisfies the condition $A A^{g} A = A$. The purpose of constructing a generalized inverse of a matrix is to obtain a matrix that can serve as an inverse in some sense for a wider class of matrices than invertibl
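
The defining condition is easy to check numerically. A minimal sketch (made-up matrix) using NumPy's Moore–Penrose pseudoinverse, which is one particular generalized inverse:

```python
import numpy as np

# A is rectangular, so it has no ordinary inverse; pinv gives a generalized one.
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])          # A in R^(2x3)
A_g = np.linalg.pinv(A)                  # A^g in R^(3x2)

# Defining condition of a generalized inverse: A A^g A = A
assert np.allclose(A @ A_g @ A, A)
```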

#### Flashcard 1729636470028

Tags
#gaussian-process
Question
Viewed as a machine-learning algorithm, a Gaussian process uses lazy learning and a measure of [...] to predict the value for an unseen point from training data.
the similarity between points

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Viewed as a machine-learning algorithm, a Gaussian process uses lazy learning and a measure of the similarity between points (the kernel function) to predict the value for an unseen point from training data.

#### Original toplevel document

Gaussian process - Wikipedia
f them is normally distributed. The distribution of a Gaussian process is the joint distribution of all those (infinitely many) random variables, and as such, it is a distribution over functions with a continuous domain, e.g. time or space. <span>Viewed as a machine-learning algorithm, a Gaussian process uses lazy learning and a measure of the similarity between points (the kernel function) to predict the value for an unseen point from training data. The prediction is not just an estimate for that point, but also has uncertainty information—it is a one-dimensional Gaussian distribution (which is the marginal distribution at that poi
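
A minimal GP-regression sketch (illustrative, not the article's code): the RBF kernel below is the measure of similarity between points, and the prediction at an unseen point is computed lazily from the stored training data.

```python
import numpy as np

def rbf(a, b, length=1.0):
    # kernel: similarity between points decays with squared distance
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

X = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])   # training inputs (kept around: lazy learning)
y = np.sin(X)                                # training targets
Xs = np.array([0.5])                         # unseen test point

K = rbf(X, X) + 1e-8 * np.eye(len(X))        # jitter for numerical stability
Ks = rbf(Xs, X)
mean = Ks @ np.linalg.solve(K, y)            # posterior mean at Xs
cov = rbf(Xs, Xs) - Ks @ np.linalg.solve(K, Ks.T)  # posterior variance at Xs
```

The prediction also carries uncertainty (`cov`), matching the card's point that the output at each test point is a one-dimensional Gaussian.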

#### Flashcard 1729643285772

Tags
#linear-algebra #matrix-decomposition
Question
The eigendecomposition decomposes matrix A to [...] .
$Q \Lambda Q^{-1}$

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The eigendecomposition can be derived from the fundamental property of eigenvectors: $Av = \lambda v$ and thus $AQ = Q\Lambda$, which yields $A = Q\Lambda Q^{-1}$.

#### Original toplevel document

Eigendecomposition of a matrix - Wikipedia
$v_{i}\ (i=1,\dots,N)$ can also be used as the columns of Q. That can be understood by noting that the magnitude of the eigenvectors in Q gets canceled in the decomposition by the presence of $Q^{-1}$. The decomposition can be derived from the fundamental property of eigenvectors: $Av = \lambda v$ and thus $AQ = Q\Lambda$ which yields $A = Q\Lambda Q^{-1}$. Example: Taking a 2 × 2 real matrix A = [
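
The chain $Av = \lambda v \Rightarrow AQ = Q\Lambda \Rightarrow A = Q\Lambda Q^{-1}$ can be verified numerically; a small sketch with an arbitrary diagonalizable matrix:

```python
import numpy as np

A = np.array([[4.0, 1.0],
              [2.0, 3.0]])                 # made-up diagonalizable matrix
eigvals, Q = np.linalg.eig(A)              # columns of Q are eigenvectors
Lam = np.diag(eigvals)

assert np.allclose(A @ Q[:, 0], eigvals[0] * Q[:, 0])   # A v = lambda v
assert np.allclose(A @ Q, Q @ Lam)                      # A Q = Q Lambda
assert np.allclose(Q @ Lam @ np.linalg.inv(Q), A)       # A = Q Lambda Q^{-1}
```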

#### Flashcard 1729646955788

Tags
#linear-algebra #matrix-decomposition
Question

[...] is the factorization of a matrix into a canonical form

eigendecomposition

Also called spectral decomposition

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In linear algebra, eigendecomposition or sometimes spectral decomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonal

#### Original toplevel document

Eigendecomposition of a matrix - Wikipedia
Eigendecomposition of a matrix From Wikipedia, the free encyclopedia (Redirected from Eigendecomposition) In linear algebra, eigendecomposition or sometimes spectral decomposition is the factorization of a matrix into a canonical form, whereby the matrix is represented in terms of its eigenvalues and eigenvectors. Only diagonalizable matrices can be factorized in this way.

#### Flashcard 1729666616588

Tags
#multivariate-normal-distribution
Question
The directions of the principal axes of the ellipsoids are given by [...] of the covariance matrix Σ
the eigenvectors

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The directions of the principal axes of the ellipsoids are given by the eigenvectors of the covariance matrix Σ. The squared relative lengths of the principal axes are given by the corresponding eigenvalues.

#### Original toplevel document

Multivariate normal distribution - Wikipedia
urs of a non-singular multivariate normal distribution are ellipsoids (i.e. linear transformations of hyperspheres) centered at the mean. Hence the multivariate normal distribution is an example of the class of elliptical distributions. The directions of the principal axes of the ellipsoids are given by the eigenvectors of the covariance matrix Σ. The squared relative lengths of the principal axes are given by the corresponding eigenvalues. If $\Sigma = U\Lambda U^{T} = U\Lambda^{1/2}(U\Lambda^{1/2})^{T}$ is an eigendecomposition where the columns of U are unit eigenvectors and Λ is a diagonal matrix of the eigenvalues, then we have
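
A quick numerical illustration with a made-up Σ: `eigh` returns the unit eigenvectors U (the principal-axis directions) and the eigenvalues Λ (the squared relative lengths), and Σ = UΛUᵀ reconstructs exactly.

```python
import numpy as np

Sigma = np.array([[3.0, 1.0],
                  [1.0, 2.0]])            # made-up covariance matrix
lam, U = np.linalg.eigh(Sigma)            # eigenvalues and unit eigenvectors

assert np.allclose(U @ np.diag(lam) @ U.T, Sigma)   # Sigma = U Lambda U^T
assert np.allclose(U.T @ U, np.eye(2))              # columns of U are orthonormal
assert np.all(lam > 0)                              # squared axis lengths are positive
```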

#### Flashcard 1729674480908

Tags
#multivariate-normal-distribution
Question
If Y = c + BX, then Y has variance [...]
BΣBᵀ

Corollaries: sums of Gaussian are Gaussian, marginals of Gaussian are Gaussian.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
If Y = c + BX is an affine transformation of $X \sim \mathcal{N}(\mu, \Sigma)$, where c is an $M \times 1$ vector of constants and B is a constant $M \times N$ matrix, then Y has a multivariate normal distribution with expected value c + Bμ and variance BΣBᵀ. Corollaries: sums of Gaussian are Gaussian, marginals of Gaussian are Gaussian.

#### Original toplevel document

Multivariate normal distribution - Wikipedia
$\Sigma' = \begin{bmatrix}\Sigma_{11} & \Sigma_{13}\\ \Sigma_{31} & \Sigma_{33}\end{bmatrix}$. Affine transformation: If Y = c + BX is an affine transformation of $X \sim \mathcal{N}(\mu, \Sigma)$, where c is an $M \times 1$ vector of constants and B is a constant $M \times N$ matrix, then Y has a multivariate normal distribution with expected value c + Bμ and variance BΣBᵀ, i.e., $Y \sim \mathcal{N}(c + B\mu,\ B\Sigma B^{T})$. In particular, any subset of the X_i has a marginal distribution that is also multivariate normal. To see this, consider the following example: to extract the subset (X₁, X₂, X₄)
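
The affine-transformation rule can be checked empirically; a sketch with made-up values of c, B, μ and Σ:

```python
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([1.0, -1.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
B = np.array([[1.0, 2.0]])               # constant 1x2 matrix
c = np.array([3.0])                      # constant vector

X = rng.multivariate_normal(mu, Sigma, size=200_000)
Y = c + X @ B.T                          # Y = c + BX, applied to each sample

# empirical moments match c + B mu and B Sigma B^T
assert np.allclose(Y.mean(axis=0), c + B @ mu, atol=0.05)
assert np.allclose(np.cov(Y.T), B @ Sigma @ B.T, rtol=0.05)
```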

#### Flashcard 1729676053772

Tags
#multivariate-normal-distribution
Question

To obtain the marginal distribution over a subset of multivariate normal random variables, one only needs to [...]

drop the irrelevant variables

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
To obtain the marginal distribution over a subset of multivariate normal random variables, one only needs to drop the irrelevant variables (the variables that one wants to marginalize out) from the mean vector and the covariance matrix. The proof for this follows from the definitions of multivariate normal distributions an

#### Original toplevel document

Multivariate normal distribution - Wikipedia
and then using the properties of the expectation of a truncated normal distribution. Marginal distributions: To obtain the marginal distribution over a subset of multivariate normal random variables, one only needs to drop the irrelevant variables (the variables that one wants to marginalize out) from the mean vector and the covariance matrix. The proof for this follows from the definitions of multivariate normal distributions and linear algebra. Example: Let X = [X₁, X₂, X₃] be multivariate normal random variables with mean vector μ = [μ₁, μ₂, μ₃] and covariance matrix Σ (standard parametrization for multivariate
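
A sketch of the drop-the-variables rule with made-up numbers, checked against Monte Carlo samples:

```python
import numpy as np

mu = np.array([0.0, 1.0, 2.0])
Sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 2.0, 0.1],
                  [0.2, 0.1, 1.5]])

keep = [0, 2]                             # marginalize out X2
mu_marg = mu[keep]                        # drop its entry from the mean vector
Sigma_marg = Sigma[np.ix_(keep, keep)]    # drop its row/column from the covariance

rng = np.random.default_rng(1)
X = rng.multivariate_normal(mu, Sigma, size=200_000)

# the retained coordinates' empirical moments match the dropped-variable parameters
assert np.allclose(X[:, keep].mean(axis=0), mu_marg, atol=0.05)
assert np.allclose(np.cov(X[:, keep].T), Sigma_marg, atol=0.05)
```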

#### Flashcard 1729687325964

Tags
#linear-algebra
Question
Formally, given a matrix A and a matrix A^g, A^g is a generalized inverse of A if it satisfies the condition [...] .
$A A^{g} A = A$

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Formally, given a matrix $A \in \mathbb{R}^{n\times m}$ and a matrix $A^{g} \in \mathbb{R}^{m\times n}$, $A^{g}$ is a generalized inverse of $A$ if it satisfies the condition $A A^{g} A = A$.

#### Original toplevel document

Generalized inverse - Wikipedia
nverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup. This article describes generalized inverses of a matrix A. Formally, given a matrix $A \in \mathbb{R}^{n\times m}$ and a matrix $A^{g} \in \mathbb{R}^{m\times n}$, $A^{g}$ is a generalized inverse of $A$ if it satisfies the condition $A A^{g} A = A$. The purpose of constructing a generalized inverse of a matrix is to obtain a matrix that can serve as an inverse in some sense for a wider class of matrices than invertibl

#### Annotation 1729689685260

#linear-algebra
In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them.

#### Parent (intermediate) annotation

Open it
In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them. Generalized inverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup.

#### Original toplevel document

Generalized inverse - Wikipedia
"Pseudoinverse" redirects here. For the Moore–Penrose inverse, sometimes referred to as "the pseudoinverse", see Moore–Penrose inverse. In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them. Generalized inverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup. This article describes generalized inverses of a matrix A. Formally, given a matrix A ∈

#### Flashcard 1729691258124

Tags
#linear-algebra
Question
a [...] has some properties of an inverse element but not necessarily all of them.
generalized inverse

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them.

#### Original toplevel document

Generalized inverse - Wikipedia
"Pseudoinverse" redirects here. For the Moore–Penrose inverse, sometimes referred to as "the pseudoinverse", see Moore–Penrose inverse. In mathematics, and in particular, algebra, a generalized inverse of an element x is an element y that has some properties of an inverse element but not necessarily all of them. Generalized inverses can be defined in any mathematical structure that involves associative multiplication, that is, in a semigroup. This article describes generalized inverses of a matrix A. Formally, given a matrix A ∈

#### Flashcard 1729705413900

Tags
#probability
Question
[...] also arises as a continuous mixture of Poisson distributions (a Gamma–Poisson mixture)
The negative binomial distribution

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The negative binomial distribution also arises as a continuous mixture of Poisson distributions (i.e. a compound probability distribution) where the mixing distribution of the Poisson rate is a gamma distribution.

#### Original toplevel document

Negative binomial distribution - Wikipedia
$\operatorname{Poisson}(\lambda) = \lim_{r\to\infty} \operatorname{NB}\left(r, \frac{\lambda}{\lambda + r}\right)$. Gamma–Poisson mixture: The negative binomial distribution also arises as a continuous mixture of Poisson distributions (i.e. a compound probability distribution) where the mixing distribution of the Poisson rate is a gamma distribution. That is, we can view the negative binomial as a Poisson(λ) distribution, where λ is itself a random variable, distributed as a gamma distribution with shape = r and scale θ = p/(1 − p) or correspondingly rate β = (1 − p)/p. To display the intuition behind this statement, consider two independent Poisson processes, "Success" and "Failure", with intensities p and 1 − p. Together, the Success and Failure pr
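
The Gamma–Poisson mixture claim can be checked by sampling, using the parameterisation in the text (gamma shape r, scale θ = p/(1 − p)):

```python
import numpy as np

rng = np.random.default_rng(2)
r, p = 5.0, 0.4
theta = p / (1.0 - p)                     # gamma scale, as in the text

lam = rng.gamma(shape=r, scale=theta, size=500_000)   # random Poisson rates
mixture = rng.poisson(lam)                             # Gamma-Poisson mixture

# compare with the negative binomial's mean r*theta and variance r*theta*(1+theta)
assert abs(mixture.mean() - r * theta) < 0.05
assert abs(mixture.var() - r * theta * (1.0 + theta)) < 0.2
```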

#### Flashcard 1729714326796

Tags
#singular-value-decomposition
Question
[...] generalises eigendecomposition of a positive semidefinite normal matrix to any matrix
singular-value decomposition (SVD)

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In linear algebra, the singular-value decomposition (SVD) generalises the eigendecomposition of a positive semidefinite normal matrix (for example, a symmetric matrix with positive eigenvalues) to any matrix via an extension of the polar deco

#### Original toplevel document

Singular-value decomposition - Wikipedia
nto three simple transformations: an initial rotation V*, a scaling Σ along the coordinate axes, and a final rotation U. The lengths σ₁ and σ₂ of the semi-axes of the ellipse are the singular values of M, namely Σ₁,₁ and Σ₂,₂. In linear algebra, the singular-value decomposition (SVD) is a factorization of a real or complex matrix. It is the generalization of the eigendecomposition of a positive semidefinite normal matrix (for example, a symmetric matrix with positive eigenvalues) to any $m \times n$ matrix via an extension of the polar decomposition. It has many useful applications in signal processing and statistics. Formally, the singular-value decomposition of an $m \times n$
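
A short sketch of this generalisation: the SVD factorises even a non-square matrix, where eigendecomposition is undefined, and for a symmetric positive-definite matrix its singular values coincide with the eigenvalues.

```python
import numpy as np

M = np.array([[3.0, 1.0, 1.0],
              [-1.0, 3.0, 1.0]])          # 2x3: eigendecomposition is undefined here
U, s, Vt = np.linalg.svd(M, full_matrices=False)
assert np.allclose(U @ np.diag(s) @ Vt, M)          # SVD factorises any matrix

S = np.array([[2.0, 1.0],
              [1.0, 2.0]])                # symmetric positive definite
s_svd = np.linalg.svd(S, compute_uv=False)
eig = np.sort(np.linalg.eigvalsh(S))[::-1]
assert np.allclose(s_svd, eig)            # here the SVD reduces to the eigendecomposition
```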

#### Flashcard 1730164690188

Tags
#variational-inference
Question
[...] can be seen as an extension of the expectation-maximization algorithm
Variational inference

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Variational Bayes can be seen as an extension of the EM (expectation-maximization) algorithm from maximum a posteriori estimation (MAP estimation) of the single most probable value of each parameter to f

#### Original toplevel document

Variational Bayesian methods - Wikipedia
om. In particular, whereas Monte Carlo techniques provide a numerical approximation to the exact posterior using a set of samples, Variational Bayes provides a locally-optimal, exact analytical solution to an approximation of the posterior. <span>Variational Bayes can be seen as an extension of the EM (expectation-maximization) algorithm from maximum a posteriori estimation (MAP estimation) of the single most probable value of each parameter to fully Bayesian estimation which computes (an approximation to) the entire posterior distribution of the parameters and latent variables. As in EM, it finds a set of optimal parameter values, and it has the same alternating structure as does EM, based on a set of interlocked (mutually dependent) equations that cannot be s

#### Flashcard 1730817953036

Tags
#bayesian-optimisation
Question
Bayesian optimisation treats the objective function as random and puts a [...] over it.
prior

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Since the objective function is unknown, the Bayesian strategy (of optimisation) is to treat it as a random function and place a prior over it.

#### Original toplevel document

Bayesian optimization - Wikipedia
The term is generally attributed to Jonas Mockus and is coined in his work from a series of publications on global optimization in the 1970s and 1980s. Strategy: Since the objective function is unknown, the Bayesian strategy is to treat it as a random function and place a prior over it. The prior captures our beliefs about the behaviour of the function. After gathering the function evaluations, which are treated as data, the prior is updated to form the posterior distribution over the objective function. The posterior distribution, in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines what the next query point should be. Examples of acquisition functions include probability of improvement, expected improvement, Bayesian expected losses, upper confidence bounds (UCB), Thompson s

#### Flashcard 1730911014156

[unknown IMAGE 1739364109580]
Tags
#has-images
Question
The posterior distribution (of the objective function) is used to construct the [...]
acquisition function

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The posterior distribution (of the objective function), in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines what the next query point should be.

#### Original toplevel document

Bayesian optimization - Wikipedia
The term is generally attributed to Jonas Mockus and is coined in his work from a series of publications on global optimization in the 1970s and 1980s. Strategy: Since the objective function is unknown, the Bayesian strategy is to treat it as a random function and place a prior over it. The prior captures our beliefs about the behaviour of the function. After gathering the function evaluations, which are treated as data, the prior is updated to form the posterior distribution over the objective function. The posterior distribution, in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines what the next query point should be. Examples of acquisition functions include probability of improvement, expected improvement, Bayesian expected losses, upper confidence bounds (UCB), Thompson s
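
As an illustration (not the article's code), here is one simple acquisition function, probability of improvement, scored from a hypothetical Gaussian posterior at three candidate points; the `prob_improvement` helper and the numbers are made up for the sketch.

```python
import numpy as np
from math import erf, sqrt

def prob_improvement(mean, std, best):
    # hypothetical helper: standard normal CDF of (mean - best) / std, elementwise
    z = (mean - best) / std
    return np.array([0.5 * (1.0 + erf(v / sqrt(2.0))) for v in z])

mean = np.array([0.2, 0.9, 0.5])          # posterior means at three candidates
std = np.array([0.3, 0.3, 0.05])          # posterior standard deviations
best = 0.8                                 # best objective value observed so far

pi = prob_improvement(mean, std, best)
next_query = int(pi.argmax())              # the acquisition picks the next query point
assert next_query == 1
```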

#### Flashcard 1733059022092

Tags
#kalman-filter
Question
Kalman filtering estimates a [...] over the variables for each timeframe.
joint probability distribution

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
hat uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe.

#### Original toplevel document

Kalman filter - Wikipedia
into account; $P_{k\mid k-1}$ is the corresponding uncertainty. Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers of its theory. The Kalman filter has numerous applications in technology. A common application is for guidanc
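
A minimal one-dimensional Kalman-filter sketch (illustrative model and noise levels, not from the article): each step maintains a Gaussian belief (mean, variance) over the state for that timeframe.

```python
import numpy as np

def kalman_step(mean, var, z, q=0.001, r=0.5):
    # predict: random-walk state model, process noise q
    mean_pred, var_pred = mean, var + q
    # update: fuse the noisy measurement z (measurement noise variance r)
    k = var_pred / (var_pred + r)            # Kalman gain
    return mean_pred + k * (z - mean_pred), (1.0 - k) * var_pred

rng = np.random.default_rng(3)
true_state = 2.0
mean, var = 0.0, 10.0                        # vague initial belief
for _ in range(300):
    z = true_state + rng.normal(0.0, np.sqrt(0.5))
    mean, var = kalman_step(mean, var, z)
# after many noisy measurements the belief concentrates near the true state
```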

#### Flashcard 1739221241100

Tags
#fields
Question
In mathematics, a [...] is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to rational and real numbers.
field

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to rational and real numbers.

#### Original toplevel document

Field (mathematics) - Wikipedia
Module-like[show] Module Group with operators Vector space Linear algebra Algebra-like[show] Algebra Associative Non-associative Composition algebra Lie algebra Graded Bialgebra v t e <span>In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to rational and real numbers. A field is thus a fundamental algebraic structure, which is widely used in algebra, number theory and many other areas of mathematics. The best known fields are the field of rational

#### Flashcard 1739352313100

Tags
#linear-algebra #matrix-decomposition
Question

The Cholesky decomposition only works properly for [...] matrices

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The Cholesky decomposition of a Hermitian positive-definite matrix A is a decomposition of the form where L is a lower triangular matrix with real and positive diagonal entries, and L* denotes the conjugate transpose of L.

#### Original toplevel document

Cholesky decomposition - Wikipedia
Statement: The Cholesky decomposition of a Hermitian positive-definite matrix A is a decomposition of the form $A = LL^{*}$, where L is a lower triangular matrix with real and positive diagonal entries, and L* denotes the conjugate transpose of L. Every Hermitian positive-definite matrix (and thus also every real-valued symmetric positive-definite matrix) has a unique Cholesky decomposition. If the matrix A is Hermitian and positive semi-definite, then it still has a decomposition of the form A = LL* if the diagonal entries of L are allowed to be zero. When A has real entries, L has real entries as well, and the factorization may be written A = LLᵀ. The Cholesky decomposition is unique when A is positive definite; there is only one lower triangular matrix L with strictly positive diagonal entries such that A = LL*. However, the decomposition need not be unique when A is positive semidefinite. The converse holds trivially: if A can be written as LL* for some invertible L, lower triangular or otherwise, then A is Hermitian and positive definite. LDL decomposition: A closely related variant of the classical Cholesky decomposition is the LDL decomposition, A =
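
A short sketch of the positive-definiteness requirement: NumPy's `cholesky` succeeds on a positive-definite matrix and raises on an indefinite one.

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])             # positive definite
L = np.linalg.cholesky(A)
assert np.allclose(L @ L.T, A)         # A = L L^T
assert np.allclose(L, np.tril(L))      # L is lower triangular

B = np.array([[1.0, 2.0],
              [2.0, 1.0]])             # indefinite (eigenvalues 3 and -1)
try:
    np.linalg.cholesky(B)
    raised = False
except np.linalg.LinAlgError:
    raised = True
assert raised                          # no Cholesky factor exists for B
```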

#### Flashcard 1741134630156

Tags
#measure-theory #stochastics
Question
E gives expectations of random variables, so it is a function $$X \mapsto E(X)$$ that maps [...] to [...]
random variables to real numbers.

Since the E operator operates on random variables, it's a function of functions; that is, it's a function that takes functions as input.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
E gives expectations of random variables, so it is a function X↦E(X) that maps random variables to real numbers.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 1741249187084

Tags
#probability
Question

a degenerate distribution in a space has support only on [...]
a space of lower dimension

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In mathematics, a degenerate distribution is a probability distribution in a space (discrete or continuous) with support only on a space of lower dimension.

#### Original toplevel document

Degenerate distribution - Wikipedia
$e^{ik_{0}t}$ In mathematics, a degenerate distribution is a probability distribution in a space (discrete or continuous) with support only on a space of lower dimension. If the degenerate distribution is univariate (involving only a single random variable) it is a deterministic distribution and takes only a single value. Examples include a two-headed co
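
An illustrative degenerate example (made-up numbers): a 2-D Gaussian with a rank-1 covariance matrix has support only on a 1-D line.

```python
import numpy as np

rng = np.random.default_rng(4)
b = np.array([1.0, 2.0])
Sigma = np.outer(b, b)                  # rank 1: singular, so the 2-D density is degenerate
X = rng.multivariate_normal(np.zeros(2), Sigma, size=1000)

# every sample lies on the lower-dimensional support, the line y = 2x
assert np.allclose(X[:, 1], 2.0 * X[:, 0], atol=1e-5)
```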

#### Flashcard 1741250759948

Tags
#probability
Question
a [...] distribution in a space has support only on a space of lower dimension.
degenerate

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In mathematics, a degenerate distribution is a probability distribution in a space (discrete or continuous) with support only on a space of lower dimension.

#### Original toplevel document

Degenerate distribution - Wikipedia
$e^{ik_{0}t}$ In mathematics, a degenerate distribution is a probability distribution in a space (discrete or continuous) with support only on a space of lower dimension. If the degenerate distribution is univariate (involving only a single random variable) it is a deterministic distribution and takes only a single value. Examples include a two-headed co

#### Flashcard 1741386812684

Tags
#inner-product-space #vector-space
Question
A vector space with a topology allows the consideration of issues of [...].
proximity and continuity

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Function spaces are generally endowed with additional structure beyond that of plain vector spaces, which may be a topology, allowing the consideration of issues of proximity and continuity.

#### Original toplevel document

Vector space - Wikipedia
roperties, which in some cases can be visualized as arrows. Vector spaces are the subject of linear algebra and are well characterized by their dimension, which, roughly speaking, specifies the number of independent directions in the space. <span>Infinite-dimensional vector spaces arise naturally in mathematical analysis, as function spaces, whose vectors are functions. These vector spaces are generally endowed with additional structure, which may be a topology, allowing the consideration of issues of proximity and continuity. Among these topologies, those that are defined by a norm or inner product are more commonly used, as having a notion of distance between two vectors. This is particularly the case of Banach spaces and Hilbert spaces, which are fundamental in mathematical analysis. Historically, the first ideas leading to vector spaces can be traced back as far as the 17th century's analytic geometry, matrices, systems of linear equations, and Euclidean vectors.

#### Flashcard 1741410405644

Tags
#sets
Question
a set is called [...], if it is, in a certain sense, of finite size.
bounded

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In mathematical analysis and related areas of mathematics, a set is called bounded, if it is, in a certain sense, of finite size.

#### Original toplevel document

Bounded set - Wikipedia
towards the right. "Bounded" and "boundary" are distinct concepts; for the latter see boundary (topology). A circle in isolation is a boundaryless bounded set, while the half plane is unbounded yet has a boundary. <span>In mathematical analysis and related areas of mathematics, a set is called bounded, if it is, in a certain sense, of finite size. Conversely, a set which is not bounded is called unbounded. The word bounded makes no sense in a general topological space without a corresponding metric. Contents [hide] 1

#### Flashcard 1741411978508

Tags
#sets
Question
a set is called bounded, if it is, in a certain sense, of [...].
finite size

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In mathematical analysis and related areas of mathematics, a set is called bounded, if it is, in a certain sense, of finite size.

#### Original toplevel document

Bounded set - Wikipedia
towards the right. "Bounded" and "boundary" are distinct concepts; for the latter see boundary (topology). A circle in isolation is a boundaryless bounded set, while the half plane is unbounded yet has a boundary. <span>In mathematical analysis and related areas of mathematics, a set is called bounded, if it is, in a certain sense, of finite size. Conversely, a set which is not bounded is called unbounded. The word bounded makes no sense in a general topological space without a corresponding metric. Contents [hide] 1

#### Flashcard 1741415124236

Tags
#sets
Question
The word bounded makes no sense in a general topological space without a corresponding [...].
metric

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The word bounded makes no sense in a general topological space without a corresponding metric.

#### Original toplevel document

Bounded set - Wikipedia
le the half plane is unbounded yet has a boundary. In mathematical analysis and related areas of mathematics, a set is called bounded, if it is, in a certain sense, of finite size. Conversely, a set which is not bounded is called unbounded. The word bounded makes no sense in a general topological space without a corresponding metric.

#### Flashcard 1744294251788

Tags
#lebesgue-integration
Question

To assign a value to [...], the only reasonable choice is to set:
the integral of the indicator function 1S of a measurable set S consistent with the given measure μ

Notice that the result may be equal to +∞ , unless μ is a finite measure.
Trick: just read the expression from left to right

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
To assign a value to the integral of the indicator function 1 S of a measurable set S consistent with the given measure μ, the only reasonable choice is to set: Notice that the result may be equal to +∞ , unless μ is a finite measure.

#### Original toplevel document

Lebesgue integration - Wikipedia
x ) {\displaystyle \int _{E}f\,\mathrm {d} \mu =\int _{E}f\left(x\right)\,\mathrm {d} \mu \left(x\right)} for measurable real-valued functions f defined on E in stages: Indicator functions: <span>To assign a value to the integral of the indicator function 1 S of a measurable set S consistent with the given measure μ, the only reasonable choice is to set: ∫ 1 S d μ = μ ( S ) . {\displaystyle \int 1_{S}\,\mathrm {d} \mu =\mu (S).} Notice that the result may be equal to +∞, unless μ is a finite measure. Simple functions: A finite linear combination of indicator functions ∑ k a
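
The defining identity ∫ 1S dμ = μ(S) can be sanity-checked numerically. For Lebesgue measure on the line and an interval S = [0.2, 0.5], μ(S) is just the interval's length, 0.3; the midpoint rule below is only a stand-in for the Lebesgue integral, and the helper names are made up for this sketch:

```python
def indicator(lo, hi):
    # 1_S for the interval S = [lo, hi]
    return lambda x: 1.0 if lo <= x <= hi else 0.0

def integrate(f, a, b, n=10000):
    # Midpoint rule over [a, b]; for an indicator of an interval this
    # approximates the Lebesgue integral, i.e. the measure of the set.
    h = (b - a) / n
    return sum(f(a + (i + 0.5) * h) for i in range(n)) * h

one_S = indicator(0.2, 0.5)
print(integrate(one_S, 0.0, 1.0))  # close to 0.3 = mu([0.2, 0.5])
```

The +∞ caveat in the card corresponds to taking S of infinite Lebesgue measure, e.g. S = [0, ∞).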

#### Flashcard 1744627961100

Tags
#expectation-operator
Question
For random variables such as Cauchy, the long-tails of the distribution prevent [...].
the sum/integral from converging

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
For random variables such as Cauchy, the long-tails of the distribution prevent the sum/integral from converging.

#### Original toplevel document

Expected value - Wikipedia
on subsumes both of these and also works for distributions which are neither discrete nor absolutely continuous; the expected value of a random variable is the integral of the random variable with respect to its probability measure.   <span>The expected value does not exist for random variables having some distributions with large "tails", such as the Cauchy distribution.  For random variables such as these, the long-tails of the distribution prevent the sum/integral from converging. The expected value is a key aspect of how one characterizes a probability distribution; it is one type of location parameter. By contrast, the variance is a measure of dispersion of t
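
The failure to converge is concrete: for the standard Cauchy density 1/(π(1 + x²)), the truncated integral of |x| grows like (1/π)·log(1 + T²) as the truncation T increases, so it never settles to a limit. A small numerical sketch (midpoint rule, illustrative function name):

```python
import math

def truncated_mean_abs(T, n=100000):
    # Midpoint rule for the truncated integral of |x| * cauchy_pdf(x)
    # over [-T, T]; closed form is log(1 + T^2) / pi.
    h = 2 * T / n
    total = 0.0
    for i in range(n):
        x = -T + (i + 0.5) * h
        total += abs(x) / (math.pi * (1 + x * x)) * h
    return total

# The values keep growing with T instead of converging, so E|X| is
# infinite and E[X] does not exist for the Cauchy distribution.
for T in (10, 100, 1000):
    print(T, round(truncated_mean_abs(T), 3))
```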

#### Flashcard 1756471627020

Tags
#lagrange-multiplier #optimization
Question
Solving constrained optimization by direct substitution can be difficult because finding [...] is not easy
analytic solution of the constraint equation

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Solving constrained optimization by direct substitution can be difficult because finding an analytic solution of the constraint equation is not easy, and undesirable because the natural symmetry between the variables is spoiled

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 1756885028108

[unknown IMAGE 1756483423500]
Tags
#Karush-Kuhn-Tucker-condition #has-images
Question
the function f(x) will only be at a maximum if λ satisfies [...formula...]
$$\lambda > 0$$

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In optimization with inequality constraint, the sign of the Lagrange multiplier is crucial, because the function f(x) will only be at a maximum if its gradient is oriented away from the region g(x) > 0

#### Original toplevel document (pdf)

cannot see any pdfs
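
The sign condition on λ can be checked on a toy problem. Assuming the standard form "maximize f(x) subject to g(x) ≥ 0" with Lagrangian L = f + λg, stationarity gives f'(x) + λg'(x) = 0; the concrete f and g below are made up for illustration:

```python
# f(x) = -(x - 2)**2 has its unconstrained maximum at x = 2, but the
# constraint g(x) = 1 - x >= 0 forces x <= 1, so the constrained
# maximum sits on the boundary x = 1 where g is active.

def f_prime(x):
    return -2.0 * (x - 2.0)

def g_prime(x):
    return -1.0

x_star = 1.0                                # active constraint: g(1) = 0
lam = -f_prime(x_star) / g_prime(x_star)    # stationarity: f' + lam * g' = 0
print(lam)  # 2.0 > 0: the gradient of f points out of the feasible
            # region, so x_star is indeed a constrained maximum
```

A negative λ here would mean f could still be increased without leaving g(x) ≥ 0, contradicting optimality on the boundary.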

#### Annotation 1758480436492

#function-space
In topological spaces, a function f is defined on the sets X and Y, but the continuity of f depends on the topologies used on X and Y.

Continuous function - Wikipedia
∈ X | f ( x ) ∈ V } {\displaystyle f^{-1}(V)=\{x\in X\;|\;f(x)\in V\}} is an open subset of X. That is, <span>f is a function between the sets X and Y (not on the elements of the topology T X ), but the continuity of f depends on the topologies used on X and Y. This is equivalent to the condition that the preimages of the closed sets (which are the complements of the open subsets) in Y are closed in X. An extreme example: if a set X is giv

#### Annotation 1758482533644

#function-space

A function between two topological spaces X and Y is continuous if for every open set V ⊆ Y, the inverse image is an open subset of X.

Continuous function - Wikipedia
intersections that generalize the properties of the open balls in metric spaces while still allowing to talk about the neighbourhoods of a given point. The elements of a topology are called open subsets of X (with respect to the topology). <span>A function f : X → Y {\displaystyle f\colon X\rightarrow Y} between two topological spaces X and Y is continuous if for every open set V ⊆ Y, the inverse image f − 1 ( V ) = { x ∈ X | f ( x ) ∈ V } {\displaystyle f^{-1}(V)=\{x\in X\;|\;f(x)\in V\}} is an open subset of X. That is, f is a function between the sets X and Y (not on the elements of the topology T X ), but the continuity of f depends on the topologies used on X and Y. This is equivalent to
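
For finite spaces this preimage criterion can be checked exhaustively. In the sketch below, topologies are sets of frozensets and functions are dicts; all the concrete sets and maps are made-up examples:

```python
def is_continuous(func, X, tau_X, tau_Y):
    # Continuity: the preimage of every open set of Y is open in X.
    for V in tau_Y:
        preimage = frozenset(x for x in X if func[x] in V)
        if preimage not in tau_X:
            return False
    return True

X = {1, 2, 3}
tau_X = {frozenset(), frozenset({1}), frozenset({1, 2}), frozenset(X)}
Y = {'a', 'b'}
tau_Y = {frozenset(), frozenset({'a'}), frozenset(Y)}

f = {1: 'a', 2: 'a', 3: 'b'}
print(is_continuous(f, X, tau_X, tau_Y))  # preimage of {'a'} is {1, 2}: open

g = {1: 'b', 2: 'a', 3: 'a'}
print(is_continuous(g, X, tau_X, tau_Y))  # preimage of {'a'} is {2, 3}: not open
```

This also shows the point of the surrounding cards: f and g are both plain set functions from X to Y, and only the chosen topologies decide which one is continuous.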

#### Flashcard 1758485417228

Tags
#function-space
Question

A function between two topological spaces X and Y is continuous if for every open set V ⊆ Y, [...].

the inverse image is an open subset of X

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
A function between two topological spaces X and Y is continuous if for every open set V ⊆ Y, the inverse image is an open subset of X.

#### Original toplevel document

Continuous function - Wikipedia
intersections that generalize the properties of the open balls in metric spaces while still allowing to talk about the neighbourhoods of a given point. The elements of a topology are called open subsets of X (with respect to the topology). <span>A function f : X → Y {\displaystyle f\colon X\rightarrow Y} between two topological spaces X and Y is continuous if for every open set V ⊆ Y, the inverse image f − 1 ( V ) = { x ∈ X | f ( x ) ∈ V } {\displaystyle f^{-1}(V)=\{x\in X\;|\;f(x)\in V\}} is an open subset of X. That is, f is a function between the sets X and Y (not on the elements of the topology T X ), but the continuity of f depends on the topologies used on X and Y. This is equivalent to

#### Flashcard 1758488825100

Tags
#function-space
Question
In topological spaces, a function f is defined on [...] , but the continuity of f depends on [...]
the sets X and Y, the topologies used on X and Y.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In topological spaces, a function f is defined on the sets X and Y, but the continuity of f depends on the topologies used on X and Y.

#### Original toplevel document

Continuous function - Wikipedia
∈ X | f ( x ) ∈ V } {\displaystyle f^{-1}(V)=\{x\in X\;|\;f(x)\in V\}} is an open subset of X. That is, <span>f is a function between the sets X and Y (not on the elements of the topology T X ), but the continuity of f depends on the topologies used on X and Y. This is equivalent to the condition that the preimages of the closed sets (which are the complements of the open subsets) in Y are closed in X. An extreme example: if a set X is giv

#### Flashcard 1767469092108

Tags
#politics
Question
Atrocity propaganda is the spreading of information about the crimes committed by an enemy, especially deliberate [...]
fabrications or exaggerations.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Atrocity propaganda is the spreading of information about the crimes committed by an enemy, especially deliberate fabrications or exaggerations.

#### Original toplevel document

Atrocity propaganda - Wikipedia

#### Annotation 1767480102156

#history
The Hundred Years' War was a series of conflicts waged from 1337 to 1453 by the House of Plantagenet, rulers of the Kingdom of England, against the House of Valois, rulers of the Kingdom of France, over the succession to the French throne.

Hundred Years' War - Wikipedia
Anglo-French wars 1202–04 1213–14 1215–17 1242–43 1294–1303 1337–1453 (1337–60, 1369–89, 1415–53) 1496-98 1512–14 1522–26 1542–46 1557–59 1627–29 1666–67 1689–97 1702–13 1744–48 1744–1763 1754–63 1778–83 1793–1802 1803–14 1815 <span>The Hundred Years' War was a series of conflicts waged from 1337 to 1453 by the House of Plantagenet, rulers of the Kingdom of England, against the House of Valois, rulers of the Kingdom of France, over the succession to the French throne. Each side drew many allies into the war. It was one of the most notable conflicts of the Middle Ages, in which five generations of kings from two rival dynasties fought for the throne o

#### Flashcard 1767482199308

Tags
#history
Question
The Hundred Years' War started in [...]
1337

The war between the tame and the meek (Christians).

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The Hundred Years' War was a series of conflicts waged from 1337 to 1453 by the House of Plantagenet, rulers of the Kingdom of England, against the House of Valois, rulers of the Kingdom of France, over the succession to the French throne.

#### Original toplevel document

Hundred Years' War - Wikipedia
Anglo-French wars 1202–04 1213–14 1215–17 1242–43 1294–1303 1337–1453 (1337–60, 1369–89, 1415–53) 1496-98 1512–14 1522–26 1542–46 1557–59 1627–29 1666–67 1689–97 1702–13 1744–48 1744–1763 1754–63 1778–83 1793–1802 1803–14 1815 <span>The Hundred Years' War was a series of conflicts waged from 1337 to 1453 by the House of Plantagenet, rulers of the Kingdom of England, against the House of Valois, rulers of the Kingdom of France, over the succession to the French throne. Each side drew many allies into the war. It was one of the most notable conflicts of the Middle Ages, in which five generations of kings from two rival dynasties fought for the throne o

#### Flashcard 1767484558604

Tags
#history
Question
The Hundred Years' War ended in [...]
1453

England would hate losing its heirloom.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The Hundred Years' War was a series of conflicts waged from 1337 to 1453 by the House of Plantagenet, rulers of the Kingdom of England, against the House of Valois, rulers of the Kingdom of France, over the succession to the French throne.

#### Original toplevel document

Hundred Years' War - Wikipedia
Anglo-French wars 1202–04 1213–14 1215–17 1242–43 1294–1303 1337–1453 (1337–60, 1369–89, 1415–53) 1496-98 1512–14 1522–26 1542–46 1557–59 1627–29 1666–67 1689–97 1702–13 1744–48 1744–1763 1754–63 1778–83 1793–1802 1803–14 1815 <span>The Hundred Years' War was a series of conflicts waged from 1337 to 1453 by the House of Plantagenet, rulers of the Kingdom of England, against the House of Valois, rulers of the Kingdom of France, over the succession to the French throne. Each side drew many allies into the war. It was one of the most notable conflicts of the Middle Ages, in which five generations of kings from two rival dynasties fought for the throne o

#### Flashcard 1767782878476

Tags
#function-space
Question

for a function f : X → Y and open set V ⊆ Y, the inverse image of V is defined as [...]
$$f^{-1}(V)=\{x\in X\;|\;f(x)\in V\}$$

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
A function between two topological spaces X and Y is continuous if for every open set V ⊆ Y, the inverse image is an open subset of X.

#### Original toplevel document

Continuous function - Wikipedia
intersections that generalize the properties of the open balls in metric spaces while still allowing to talk about the neighbourhoods of a given point. The elements of a topology are called open subsets of X (with respect to the topology). <span>A function f : X → Y {\displaystyle f\colon X\rightarrow Y} between two topological spaces X and Y is continuous if for every open set V ⊆ Y, the inverse image f − 1 ( V ) = { x ∈ X | f ( x ) ∈ V } {\displaystyle f^{-1}(V)=\{x\in X\;|\;f(x)\in V\}} is an open subset of X. That is, f is a function between the sets X and Y (not on the elements of the topology T X ), but the continuity of f depends on the topologies used on X and Y. This is equivalent to

#### Annotation 1782250343692

#history
A fief was the central element of feudalism and consisted of heritable property or rights granted by an overlord to a vassal who held it in fealty in return for a form of feudal allegiance and service, usually given by the personal ceremonies of homage and fealty.

Fief - Wikipedia
ism Minarchism Distributism Anarchism Socialism Communism Totalitarianism Global vs. local geo-cultural ideologies Commune City-state National government Intergovernmental organisation World government Politics portal v t e <span>A fief (/fiːf/; Latin: feudum) was the central element of feudalism and consisted of heritable property or rights granted by an overlord to a vassal who held it in fealty (or "in fee") in return for a form of feudal allegiance and service, usually given by the personal ceremonies of homage and fealty. The fees were often lands or revenue-producing real property held in feudal land tenure: these are typically known as fiefs or fiefdoms. However, not only land but anything of value cou

#### Flashcard 1782253227276

Tags
#history
Question
A [...] was the central element of feudalism
fief

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
A fief was the central element of feudalism and consisted of heritable property or rights granted by an overlord to a vassal who held it in fealty in return for a form of feudal allegiance and

#### Original toplevel document

Fief - Wikipedia
ism Minarchism Distributism Anarchism Socialism Communism Totalitarianism Global vs. local geo-cultural ideologies Commune City-state National government Intergovernmental organisation World government Politics portal v t e <span>A fief (/fiːf/; Latin: feudum) was the central element of feudalism and consisted of heritable property or rights granted by an overlord to a vassal who held it in fealty (or "in fee") in return for a form of feudal allegiance and service, usually given by the personal ceremonies of homage and fealty. The fees were often lands or revenue-producing real property held in feudal land tenure: these are typically known as fiefs or fiefdoms. However, not only land but anything of value cou

#### Annotation 1782258732300

#history
Ever since the Norman conquest of 1066, the King of England held lands in France, which made him a vassal of the King of France.

Hundred Years' War - Wikipedia
ns of kings from two rival dynasties fought for the throne of the largest kingdom in Western Europe. The war marked both the height of chivalry and its subsequent decline, and the development of strong national identities in both countries. <span>Ever since the Norman conquest of 1066, the King of England held lands in France, which made him a vassal of the King of France. Tensions over the status of the English monarch's French fiefs led to conflicts between the crowns of France and England, and the extent of these lands varied throughout the medieval pe

#### Flashcard 1782260305164

Tags
#history
Question
Ever since the Norman conquest the King of England is a [...] of the King of France.
vassal

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Ever since the Norman conquest of 1066, the King of England held lands in France, which made him a vassal of the King of France.

#### Original toplevel document

Hundred Years' War - Wikipedia
ns of kings from two rival dynasties fought for the throne of the largest kingdom in Western Europe. The war marked both the height of chivalry and its subsequent decline, and the development of strong national identities in both countries. <span>Ever since the Norman conquest of 1066, the King of England held lands in France, which made him a vassal of the King of France. Tensions over the status of the English monarch's French fiefs led to conflicts between the crowns of France and England, and the extent of these lands varied throughout the medieval pe

#### Flashcard 1782267383052

Tags
#kalman-filter
Question
Kalman filtering is also known as [...]
linear quadratic estimation (LQE)

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend

#### Original toplevel document

Kalman filter - Wikipedia
into account; P k ∣ k − 1 {\displaystyle P_{k\mid k-1}} is the corresponding uncertainty. <span>Kalman filtering, also known as linear quadratic estimation (LQE), is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone, by estimating a joint probability distribution over the variables for each timeframe. The filter is named after Rudolf E. Kálmán, one of the primary developers of its theory. The Kalman filter has numerous applications in technology. A common application is for guidanc
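
The idea of combining noisy measurements over time into estimates more accurate than any single measurement can be sketched in one dimension. This is a minimal illustrative filter for estimating a constant, not production code; the noise parameters `q`, `r` and the sample measurements are made up:

```python
def kalman_1d(measurements, q=1e-5, r=0.1 ** 2, x0=0.0, p0=1.0):
    # x: state estimate, p: its variance; q: process noise, r: measurement noise
    x, p = x0, p0
    estimates = []
    for z in measurements:
        p = p + q                 # predict: uncertainty grows by process noise
        k = p / (p + r)           # Kalman gain: how much to trust the measurement
        x = x + k * (z - x)       # update the estimate with the innovation z - x
        p = (1 - k) * p           # uncertainty shrinks after incorporating z
        estimates.append(x)
    return estimates

# Noisy readings of a constant roughly equal to 0.4:
noisy = [0.39, 0.50, 0.48, 0.29, 0.25, 0.32, 0.34, 0.48, 0.41, 0.45]
est = kalman_1d(noisy)
print(est[-1])  # settles close to the mean of the measurements
```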

#### Flashcard 1782275509516

[unknown IMAGE 1739364109580]
Tags
#bayesian-optimisation #has-images
Question
The acquisition function determines what [...] should be.
the next query point

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The posterior distribution (of the objective function), in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines what the next query point should be.

#### Original toplevel document

Bayesian optimization - Wikipedia
erences 8 External links History[edit source] The term is generally attributed to Jonas Mockus and is coined in his work from a series of publications on global optimization in the 1970s and 1980s.    Strategy[edit source] <span>Since the objective function is unknown, the Bayesian strategy is to treat it as a random function and place a prior over it. The prior captures our beliefs about the behaviour of the function. After gathering the function evaluations, which are treated as data, the prior is updated to form the posterior distribution over the objective function. The posterior distribution, in turn, is used to construct an acquisition function (often also referred to as infill sampling criteria) that determines what the next query point should be. Examples[edit source] Examples of acquisition functions include probability of improvement, expected improvement, Bayesian expected losses, upper confidence bounds (UCB), Thompson s
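
The role of the acquisition function can be shown with a tiny sketch using the upper confidence bound (UCB) criterion, one of the examples the excerpt lists. The posterior mean and standard deviation arrays below are made-up stand-ins for a fitted surrogate (e.g. a Gaussian process), not real model output:

```python
def ucb(mean, std, kappa=2.0):
    # Upper confidence bound: favor points with a good predicted value
    # (exploitation) or high uncertainty (exploration).
    return [m + kappa * s for m, s in zip(mean, std)]

candidates = [0.0, 0.25, 0.5, 0.75, 1.0]
post_mean  = [0.1, 0.4, 0.3, 0.2, 0.1]    # surrogate's predicted objective
post_std   = [0.05, 0.1, 0.3, 0.4, 0.05]  # surrogate's uncertainty

scores = ucb(post_mean, post_std)
next_query = candidates[scores.index(max(scores))]
print(next_query)  # 0.75: a decent mean plus high uncertainty wins
```

In a full Bayesian optimization loop, the objective would be evaluated at `next_query`, the surrogate refit, and the acquisition maximized again.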