# on 10-Jan-2020 (Fri)


#### Flashcard 4761342053644

[unknown IMAGE 4761343626508]
Tags
#Cardiologie #Médecine #Physiologie #Rythmologie #has-images
Question
Early After Depolarizations — EADs are triggered during prolonged action potentials. A prolonged action potential allows a longer window for reopening of [...] channels during phase 2 (or occasionally phase 3) of the action potential.
Answer

## L-type Ca2+

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
EADs — EADs are triggered during prolonged action potentials. A prolonged action potential allows a longer window for reopening of L-type Ca2+ channels during phase 2 (or occasionally phase 3) of the action potential. L-type Ca2+ current depolarizes the membrane before repolarization, triggering an afterdepolarization. Due to

#### Original toplevel document

UpToDate
the previous action potential to trigger them, hence an afterdepolarization is said to be a triggered arrhythmia. However, it is important to understand that DADs and EADs differ in mechanism. EADs — EADs are triggered during prolonged action potentials. A prolonged action potential allows a longer window for reopening of L-type Ca2+ channels during phase 2 (or occasionally phase 3) of the action potential. L-type Ca2+ current depolarizes the membrane before repolarization, triggering an afterdepolarization. Due to L-type Ca2+ channel time and voltage dependence, EADs occur at slow stimulation rates or after a ventricular pause when action potential duration (phase 2) is prolonged and they are suppressed with faster heart rates. EADs are thought to initiate the polymorphic ventricular arrhythmias torsades de pointes (TdP) found in inherited and acquired long QT syndrome (LQTS), for example drug-induced LQTS. A point of distinction to be made here is that triggered activity can initiate TdP, but TdP may be a re-entrant mechanism at the organ level with a functional (spiral reentry) rather than…

#### Annotation 4769620036876

[unknown IMAGE 4769622658316]
#MLBook #has-images #linear-regression #machine-learning

You could have noticed that the form of our linear model in eq. 1 $$\left[ f_{\mathbf w,b} (\mathbf x) = \mathbf w \mathbf x + b \right]$$ is very similar to the form of the SVM model. The only difference is the missing sign operator. The two models are indeed similar. However, the hyperplane in the SVM plays the role of the decision boundary: it’s used to separate two groups of examples from one another. As such, it has to be as far from each group as possible.

On the other hand, the hyperplane in linear regression is chosen to be as close to all training examples as possible.

You can see why this latter requirement is essential by looking at the illustration in Figure 1. It displays the regression line (in red) for one-dimensional examples (blue dots). We can use this line to predict the value of the target $$y_{new}$$ for a new unlabeled input example $$x_{new}$$. If our examples are $$D$$-dimensional feature vectors (for $$D > 1$$), the only difference with the one-dimensional case is that the regression model is not a line but a plane (for two dimensions) or a hyperplane (for $$D > 2$$).
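As a minimal sketch (the weights, bias, and inputs below are invented for illustration, not from the book), the model in eq. 1 is just a dot product plus a bias, whether the example is one-dimensional or $$D$$-dimensional:

```python
# Sketch of the linear model f(x) = w·x + b from eq. 1.
# All parameter values here are made up for illustration.

def predict(w, b, x):
    """Linear regression prediction for a D-dimensional feature vector x."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# One-dimensional case: the model is a line y = w*x + b.
print(predict([2.0], 0.5, [3.0]))                         # 2.0*3.0 + 0.5 = 6.5

# Three-dimensional case: the model is a hyperplane.
print(predict([1.0, -2.0, 0.5], 1.0, [2.0, 1.0, 4.0]))    # 2 - 2 + 2 + 1 = 3.0
```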

status not read


#### Annotation 4769626066188

#MLBook #cost-function #empirical-risk #linear-regression #loss-function #machine-learning #solution #squared-error-loss

The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression:

$$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$

In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of loss function is called squared error loss. All model-based learning algorithms have a loss function, and what we do to find the best model is minimize the objective, known as the cost function. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model is the average of all penalties obtained by applying the model to the training data.
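The objective in eq. 2 can be computed directly. A sketch with an invented one-dimensional toy data set (not from the book):

```python
# Empirical risk (eq. 2): the average squared error loss of a linear model
# f(x) = w*x + b over a training set of N examples.

def empirical_risk(w, b, xs, ys):
    """(1/N) * sum_i (f(x_i) - y_i)^2 for a one-dimensional model."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]          # perfectly explained by w = 2, b = 0

print(empirical_risk(2.0, 0.0, xs, ys))   # 0.0: zero loss on a perfect fit
print(empirical_risk(1.0, 0.0, xs, ys))   # ((1-2)^2 + (2-4)^2 + (3-6)^2) / 3 = 14/3
```

The optimization procedure then searches for the $$(\mathbf w^\ast, b^\ast)$$ that makes this number as small as possible.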

status not read


#### Annotation 4769785449740

[unknown IMAGE 4773033413900]
#MLBook #binary-classification #has-images #logistic-regression #machine-learning #problem-statement #sigmoid-function #standard-logistic-function

In logistic regression, we still want to model $$y_i$$ as a linear function of $$\mathbf x_i$$, however, with a binary $$y_i$$ this is not straightforward. The linear combination of features such as $$\mathbf w \mathbf x_i + b$$ is a function that spans from minus infinity to plus infinity, while $$y_i$$ has only two possible values.

At the time when the absence of computers required scientists to perform manual calculations, they were eager to find a linear classification model. They figured out that if we define a negative label as 0 and the positive label as 1, we would just need to find a simple continuous function whose codomain is (0, 1). In such a case, if the value returned by the model for input $$\mathbf x$$ is closer to 0, then we assign a negative label to $$\mathbf x$$; otherwise, the example is labeled as positive. One function that has such a property is the standard logistic function (also known as the sigmoid function):

$$f(x) = \displaystyle \frac{1}{1 + e^{-x}}$$,

where $$e$$ is the base of the natural logarithm (also called Euler’s number; $$e^x$$ is also known as the $$exp(x)$$ function in programming languages). Its graph is depicted in Figure 3.

The logistic regression model looks like this:

$$f_{\mathbf w, b} (\mathbf x) \stackrel{\textrm{def}}{=} \displaystyle \frac{1}{1 + e^{-(\mathbf w \mathbf x + b)}} \quad (3)$$

You can see the familiar term $$\mathbf w \mathbf x + b$$ from linear regression.

By looking at the graph of the standard logistic function, we can see how well it fits our classification purpose: if we optimize the values of $$\mathbf w$$ and $$b$$ appropriately, we could interpret the output of $$f( \mathbf x )$$ as the probability of $$y_i$$ being positive. For example, if it’s higher than or equal to the threshold 0.5 we would say that the class of $$\mathbf x$$ is positive; otherwise, it’s negative. In practice, the choice of the threshold could be different depending on the problem. We return to this discussion in Chapter 5 when we talk about model performance assessment.

Now, how do we find optimal $$\mathbf w^\ast$$ and $$b^\ast$$? In linear regression, we minimized the empirical risk which was defined as the average squared error loss, also known as the mean squared error or MSE.
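The model of eq. 3 and the 0.5 threshold can be sketched in a few lines; the parameter values below are invented for illustration:

```python
import math

# Logistic regression model (eq. 3): squash w·x + b through the standard
# logistic (sigmoid) function, then threshold at 0.5 for a class label.

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return sigmoid(z)

def predict_label(w, b, x, threshold=0.5):
    return 1 if predict_proba(w, b, x) >= threshold else 0

w, b = [1.5, -2.0], 0.25               # made-up parameters
print(predict_proba(w, b, [2.0, 0.5])) # sigmoid(1.5*2 - 2*0.5 + 0.25) = sigmoid(2.25)
print(predict_label(w, b, [2.0, 0.5])) # 1, since sigmoid(2.25) > 0.5
```

Note that the output always lies strictly between 0 and 1, which is what lets us read it as a probability.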

status not read


#### Annotation 4773036821772

#MLBook #logistic-regression #machine-learning #maximum-likelihood #solution

In logistic regression, on the other hand, we maximize the likelihood of our training set according to the model. In statistics, the likelihood function defines how likely the observation (an example) is according to our model.

For instance, let’s have a labeled example $$( \mathbf x_i, y_i )$$ in our training data. Assume also that we found (guessed) some specific values $$\hat {\mathbf w}$$ and $$\hat b$$ of our parameters. If we now apply our model $$f_{\hat{\mathbf w}, \hat b}$$ to $$\mathbf x_i$$ using eq. 3 $$\left[ f_{\mathbf w, b} (x) \stackrel{\textrm{def}}{=} \displaystyle \frac{1}{1 + e^{-(\mathbf w \mathbf x + b)}} \right]$$ we will get some value $$0 < p < 1$$ as output. If $$y_i$$ is the positive class, the likelihood of $$y_i$$ being the positive class, according to our model, is given by $$p$$. Similarly, if $$y_i$$ is the negative class, the likelihood of it being the negative class is given by $$1 − p$$.

The optimization criterion in logistic regression is called maximum likelihood. Instead of minimizing the average loss, like in linear regression, we now maximize the likelihood of the training data according to our model:

$$L_{\mathbf w, b} \stackrel{\textrm{def}}{=} \displaystyle \prod_{i = 1 \ldots N} f_{\mathbf w, b} (\mathbf x_i )^{y_i} (1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}. \quad (4)$$

The expression $$f_{\mathbf w, b} (\mathbf x_i )^{y_i} (1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}$$ may look scary but it’s just a fancy mathematical way of saying: “$$f_{\mathbf w, b} (\mathbf x_i )$$ when $$y_i = 1$$ and $$(1 - f_{\mathbf w, b} (\mathbf x_i ))$$ otherwise”. Indeed, if $$y_i = 1$$, then $$(1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}$$ equals 1 because $$(1 - y_i) = 0$$ and anything raised to the power 0 equals 1. On the other hand, if $$y_i = 0$$, then $$f_{\mathbf w, b} (\mathbf x_i )^{y_i}$$ equals 1 for the same reason.
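The product in eq. 4 is easy to compute for a fixed (guessed) parameter pair; the tiny data set and parameters below are invented for illustration:

```python
import math

# Likelihood of a training set under a one-dimensional logistic model (eq. 4).
# For each example the factor is p if y_i = 1 and (1 - p) if y_i = 0.

def f(w, b, x):
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def likelihood(w, b, data):
    L = 1.0
    for x, y in data:
        p = f(w, b, x)
        L *= p ** y * (1.0 - p) ** (1 - y)   # = p if y == 1, else 1 - p
    return L

data = [(2.0, 1), (-1.0, 0), (0.5, 1)]
print(likelihood(1.0, 0.0, data))

# Identical to multiplying p or (1 - p) case by case:
manual = f(1.0, 0.0, 2.0) * (1 - f(1.0, 0.0, -1.0)) * f(1.0, 0.0, 0.5)
print(manual)
```

In practice one maximizes the logarithm of this product (the log-likelihood) rather than the product itself, since a product of many values below 1 quickly underflows.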

status not read


#### Flashcard 4773237099788

Question
What is the task of klogd?
Answer
Kernel messages are really handled by a different program called klogd. This program preprocesses the messages and usually passes them along to syslogd. See section 1.4.

status measured difficulty not learned 37% [default] 0


#### Annotation 4773238934796

syslogd proves very useful when debugging. It logs the different system messages and is—as its name suggests—a daemon program.

status not read


#### Annotation 4773240507660

“Rsyslog” is a more modern implementation of a syslogd with more room for configuration. The additional capabilities are, however, not essential for getting started and/or passing the LPI exam.

status not read


#### Annotation 4773242080524

Instead of syslogd, certain versions of the Novell/SUSE distributions, in particular the SUSE Linux Enterprise Server, use the Syslog-NG package. This is configured in a substantially different manner. For the LPIC-1 exam, you need to know that Syslog-NG exists and roughly what it does; see section 1.6.

status not read


#### Annotation 4773243653388

By default, Rsyslog uses /etc/rsyslog.conf as its configuration file. This is largely compatible with what syslogd would use.

status not read


#### Flashcard 4773245226252

Question
[default - edit me]
Answer
Rsyslog uses /etc/rsyslog.conf as its configuration file. This is largely compatible with what syslogd would use.

status measured difficulty not learned 37% [default] 0


#### Flashcard 4773246274828

Question
[default - edit me]
Answer
Table 1.1: syslogd facilities

| Facility | Meaning |
| --- | --- |
| authpriv | Confidential security subsystem messages |
| cron | Messages from cron and at |
| daemon | Messages from daemon programs with no more specific facility |
| ftp | FTP daemon messages |
| kern | System kernel messages |
| lpr | Printer subsystem messages |
| mail | Mail subsystem messages |
| news | Usenet news subsystem messages |
| syslog | syslogd messages |
| user | Messages about users |
| uucp | Messages from the UUCP subsystem |
| local𝑟 (0 ≤ 𝑟 ≤ 7) | Freely usable for local messages |

status measured difficulty not learned 37% [default] 0


#### Flashcard 4773247847692

Question
[default - edit me]
Answer
Table 1.2: syslogd priorities (with ascending urgency)

| Priority | Meaning |
| --- | --- |
| none | No priority in the proper sense—serves to exclude all messages from a certain facility |
| debug | Messages about internal program states when debugging |
| info | Logging of normal system operations |
| notice | Documentation of particularly noteworthy situations during normal system operations |
| warning (or warn) | Warnings about occurrences which are not serious but still no longer part of normal operations |
| err | Error messages of all kinds |
| crit | Critical error messages (the dividing line between this and err is not strictly defined) |
| alert | “Alarming” messages requiring immediate attention |
| emerg | Final message before a system crash |

status measured difficulty not learned 37% [default] 0


#### Flashcard 4773251517708

Question
Who gets to determine what facility or priority is attached to a message?
Answer
Whoever uses the syslog() function, namely the developer of the program in question, must assign a facility and priority to their code’s messages. Many programs allow the administrator to at least redefine the message facility.

status measured difficulty not learned 37% [default] 0

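Concretely, the developer’s choice of facility and priority can be illustrated with Python’s standard syslog module (Unix-only), which wraps the C syslog() API; the program name "mydaemon" and the messages are invented for illustration:

```python
import syslog

# The developer picks the facility once, when opening the log...
syslog.openlog(ident="mydaemon", facility=syslog.LOG_DAEMON)

# ...and attaches a priority to each individual message.
syslog.syslog(syslog.LOG_WARNING, "disk space running low")
syslog.syslog(syslog.LOG_INFO, "routine status report")

# The numeric priority constants follow the 0 (emerg) .. 7 (debug) scale.
print(syslog.LOG_EMERG, syslog.LOG_DEBUG)
```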

#### Annotation 4773253352716

• They can be written to a file.

status not read


#### Annotation 4773254925580

Log messages can be written to a named pipe (FIFO).

status not read


#### Annotation 4773256498444

• They can be passed across the network to another syslogd.

status not read


#### Annotation 4773259906316

They can be sent directly to users.

status not read


#### Annotation 4773261479180

They can be sent to all logged-in users.

status not read


#### Annotation 4773263052044

You may also specify multiple facilities with the same priority, like mail,news.info; this expression selects messages of priority info and above that belong to the mail or news facilities.
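A rule using such a selector might look like this in the configuration file (the target file path below is invented for illustration):

```
# Selects info-and-above messages from either the mail or the news facility
mail,news.info    /var/log/mail-and-news
```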

status not read


#### Annotation 4773265149196

    *.=warn;*.=err             -/var/log/warn
    *.crit                      /var/log/warn
    *.*;mail.none;news.none    -/var/log/messages

The first column of each line determines which messages will be selected, and the second column says where these messages go. The first column’s format is ⟨facility⟩.⟨priority⟩[;⟨facility⟩.⟨priority⟩]…, where the ⟨facility⟩ denotes the system program or component giving rise to the message.

status not read


#### Flashcard 4773266722060

Question
What does the selection criterion mail.info mean in /etc/rsyslog.conf ?
Answer
A selection criterion of the form mail.info means “all messages of the mail subsystem with a priority of info and above”.

status measured difficulty not learned 37% [default] 0


#### Flashcard 4773271440652

Question
How do you get more messages by editing syslog.conf?
Answer
If you want to log more messages, for example because specific problems are occurring, you should edit the syslog.conf file and then send syslogd a SIGHUP signal to get it to re-read its configuration file.

status measured difficulty not learned 37% [default] 0


#### Annotation 4773273275660

Log files are generally created below /var/log.

status not read


#### Flashcard 4773275110668

Question

@Debian

What are the default naming conventions and storage locations of the log files, facilities, and debugging messages specified by syslog.conf?

Answer

* separate log files for the auth, daemon, kern, lpr, mail, user, and uucp facilities, predictably called auth.log etc.

* the mail system uses files called mail.info, mail.warn, and mail.err, which respectively contain only those messages with priority info etc. (and above).

* Debugging messages from all facilities except for authpriv, news, and mail end up in /var/log/debug, and messages of priority info, notice, and warn from all facilities except those just mentioned as well as cron and daemon end up in /var/log/messages.

status measured difficulty not learned 37% [default] 0


#### Flashcard 4773277732108

Question

@OpenSUSE

Where does OpenSUSE store all the logs and messages?

Answer

OpenSUSE logs all messages except those from iptables and the news and mail facilities to /var/log/messages.

* Messages from iptables go to /var/log/firewall.

* Messages that are not from iptables and have priority warn, err, or crit are also written to /var/log/warn.

* The /var/log/localmessages file receives messages from the local* facilities,

* the /var/log/NetworkManager file messages from the NetworkManager program,

* and the /var/log/acpid file messages from the ACPI daemon.

* The mail system writes its log both to /var/log/mail (all messages) and to the files mail.info, mail.warn, and mail.err (the latter for the priorities err and crit),

* and the news system writes its log to news/news.notice, news/news.err, and news/news.crit (according to the priority).

status measured difficulty not learned 37% [default] 0


#### Annotation 4773279567116

There are also special tools for reading log files, the most popular of which include logsurfer and xlogmaster.

status not read


#### Flashcard 4773281139980

Question
Why is there no klogd when rsyslog exists?
Answer
Rsyslog gets by without a separate klogd program because it takes care of kernel log messages directly by itself. Hence, if you can’t find a klogd on your system, this may very likely be because it is using rsyslog.

status measured difficulty not learned 37% [default] 0


#### Annotation 4773282974988

The dmesg command makes it possible to access the kernel log buffer retroactively and look at the system start log.

status not read


#### Flashcard 4773284547852

Question
What information does dmesg provide?
Answer

* With the dmesg command you can also delete the kernel ring buffer (-c option) and set a priority for direct notifications: messages meeting or exceeding this priority will be sent to the console immediately (-n option).

* Kernel messages have priorities from 0 to 7, corresponding to the syslogd priorities from emerg down to debug.

* The command # dmesg -n 1, for example, causes only emerg messages to be written to the console directly.

* All messages will be written to /proc/kmsg in every case—here it is the job of postprocessing software such as syslogd to suppress unwanted messages.

status measured difficulty not learned 37% [default] 0


#### Annotation 4773286907148

[unknown IMAGE 4773289528588]
#has-images #lagrange-multiplier #optimization #single-constraint

For the case of only one constraint and only two choice variables (as exemplified in Figure 1), consider the optimization problem

$${\displaystyle {\text{maximize}}\ f(x,y)}$$ $${\displaystyle {\text{subject to:}}\ g(x,y)=0}$$

(Sometimes an additive constant is shown separately rather than being included in $$g$$, in which case the constraint is written $${\displaystyle g(x,y)=c}$$, as in Figure 1.) We assume that both $$f$$ and $$g$$ have continuous first partial derivatives. We introduce a new variable ($$\lambda$$) called a Lagrange multiplier (or Lagrange undetermined multiplier) and study the Lagrange function (or Lagrangian or Lagrangian expression) defined by

$${\displaystyle {\mathcal {L}}(x,y,\lambda )=f(x,y)-\lambda g(x,y),}$$

where the $$\lambda$$ term may be either added or subtracted. If $${\displaystyle f(x_{0},y_{0})}$$ is a maximum of $${\displaystyle f(x,y)}$$ for the original constrained problem, then there exists $$\lambda_0$$ such that ($${\displaystyle x_{0},y_{0},\lambda _{0}}$$) is a stationary point for the Lagrange function (stationary points are those points where the first partial derivatives of $${\mathcal {L}}$$ are zero). Also, it must be assumed that $${\displaystyle \nabla g\neq 0.}$$ However, not all stationary points yield a solution of the original problem, as the method of Lagrange multipliers yields only a necessary condition for optimality in constrained problems.[7][8][9][10][11] Sufficient conditions for a minimum or maximum also exist, but if a particular candidate solution satisfies the sufficient conditions, it is only guaranteed that that solution is the best one locally – that is, it is better than any permissible nearby points. The global optimum can be found by comparing the values of the original objective function at the points satisfying the necessary and locally sufficient conditions.
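As a quick worked example (not from the article): maximize $$f(x,y)=xy$$ subject to $$g(x,y)=x+y-1=0$$. The Lagrange function is

$${\displaystyle {\mathcal {L}}(x,y,\lambda )=xy-\lambda (x+y-1).}$$

Setting its partial derivatives to zero gives $$y-\lambda =0$$, $$x-\lambda =0$$, and $$x+y-1=0$$, so $$x=y=\lambda$$ and the constraint forces $$2\lambda =1$$. The stationary point is $$(x_{0},y_{0},\lambda _{0})=(\tfrac{1}{2},\tfrac{1}{2},\tfrac{1}{2})$$, and indeed $$f(\tfrac{1}{2},\tfrac{1}{2})=\tfrac{1}{4}$$ is the constrained maximum of $$xy$$ on the line $$x+y=1$$.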

The method of Lagrange multipliers relies on the intuition that at a maximum, $${\displaystyle f(x,y)}$$ cannot be increasing in the direction of any such neighboring point that also has $${\displaystyle g=0}$$. If it were, we could walk along $${\displaystyle g=0}$$ to get higher, meaning that the starting point wasn't actually the maximum.

We can visualize contours of $$f$$ given by $${\displaystyle f(x,y)=d}$$ for various values of $$d$$, and the contour of $$g$$ given by $${\displaystyle g(x,y)=c}$$.

...

status not read

Lagrange multiplier - Wikipedia

Suppose we walk along the contour line with $$g=c$$. We are interested in finding points where $$f$$ does not change as we walk, since these points might be maxima. There are two ways this could happen:

* We could be following a contour line of $$f$$, since by definition $$f$$ does not change as we walk along its contour lines. This would mean that the contour lines of $$f$$ and $$g$$ are parallel here.
* We have reached a “level” part of $$f$$, meaning that $$f$$ does not change in any direction.

To check the first possibility (we are following a contour line of $$f$$), notice that since the gradient of a function is perpendicular to the contour lines, the contour lines of $$f$$ and $$g$$ are parallel if and only if the gradients of $$f$$ and $$g$$ are parallel. Thus we want points $$(x,y)$$ where $$g(x,y)=c$$ and

$${\displaystyle \nabla _{x,y}f=\lambda \,\nabla _{x,y}g,}$$

for some $$\lambda$$, where

$${\displaystyle \nabla _{x,y}f=\left({\frac {\partial f}{\partial x}},{\frac {\partial f}{\partial y}}\right),\qquad \nabla _{x,y}g=\left({\frac {\partial g}{\partial x}},{\frac {\partial g}{\partial y}}\right)}$$

are the respective gradients. The constant $$\lambda$$ is required because, although the two gradient vectors are parallel, their magnitudes are generally not equal; this constant is called the Lagrange multiplier. (In some conventions $$\lambda$$ is preceded by a minus sign.) Notice that this method also solves the second possibility, that $$f$$ is level: if $$f$$ is level, then its gradient is zero, and setting $$\lambda =0$$ is a solution regardless of $$\nabla _{x,y}g$$. To incorporate these conditions into one equation, we introduce an auxiliary function

$${\displaystyle {\mathcal {L}}(x,y,\lambda )=f(x,y)-\lambda g(x,y)}$$

and solve

$${\displaystyle \nabla _{x,y,\lambda }{\mathcal {L}}(x,y,\lambda )=0,}$$

which amounts to solving three equations in three unknowns. This is the method of Lagrange multipliers. Note that $${\displaystyle \nabla _{\lambda }{\mathcal {L}}(x,y,\lambda )=0}$$ implies $${\displaystyle g(x,y)=0}$$. To summarize,

$${\displaystyle \nabla _{x,y,\lambda }{\mathcal {L}}(x,y,\lambda )=0\iff {\begin{cases}\nabla _{x,y}f(x,y)=\lambda \,\nabla _{x,y}g(x,y)\\g(x,y)=0\end{cases}}}$$

The method generalizes readily to functions of $$n$$ variables, solving

$${\displaystyle \nabla _{x_{1},\dots ,x_{n},\lambda }{\mathcal {L}}(x_{1},\dots ,x_{n},\lambda )=0,}$$

which amounts to solving $$n+1$$ equations in $$n+1$$ unknowns. The constrained extrema of $$f$$ are critical points of the Lagrangian $${\mathcal {L}}$$, but they are not necessarily local extrema of $${\mathcal {L}}$$. One may reformulate the Lagrangian as a Hamiltonian, in which case the solutions are local minima for the Hamiltonian; this is done in optimal control theory, in the form of Pontryagin’s minimum principle. The fact that solutions of the Lagrangian are not necessarily extrema also poses difficulties for numerical optimization; this can be addressed by computing the magnitude of the gradient, as the zeros of the magnitude are necessarily local minima.

[Figure 2: A paraboloid constrained along two intersecting lines. Figure 3: Contour map of Figure 2.]

#### Annotation 4773292412172

[unknown IMAGE 4773303422220]
#has-images #lagrange-multiplier #multiple-constraints #optimization

The method of Lagrange multipliers can be extended to solve problems with multiple constraints using a similar argument. Consider a paraboloid subject to two line constraints that intersect at a single point. As the only feasible solution, this point is obviously a constrained extremum. However, the level set of $$f$$ is clearly not parallel to either constraint at the intersection point (see Figure 3); instead, it is a linear combination of the two constraints' gradients. In the case of multiple constraints, that will be what we seek in general: the method of Lagrange seeks points at which the gradient of $$f$$ is not necessarily a multiple of any single constraint's gradient, but a linear combination of all the constraints' gradients.

Concretely, suppose we have $$M$$ constraints and are walking along the set of points satisfying $${\displaystyle g_{i}(\mathbf {x} )=0,i=1,\dots ,M}$$. Every point $$\mathbf {x}$$ on the contour of a given constraint function $$g_{i}$$ has a space of allowable directions: the space of vectors perpendicular to $${\displaystyle \nabla g_{i}(\mathbf {x} )}$$. The set of directions that are allowed by all constraints is thus the space of directions perpendicular to all of the constraints' gradients. Denote this space of allowable moves by $$A$$ and denote the span of the constraints' gradients by $$S$$. Then $${\displaystyle A=S^{\perp }}$$, the space of vectors perpendicular to every element of $$S$$.

We are still interested in finding points where $$f$$ does not change as we walk, since these points might be (constrained) extrema. We therefore seek $$\mathbf {x}$$ such that any allowable direction of movement away from $$\mathbf {x}$$ is perpendicular to $$\nabla f(\mathbf{x})$$ (otherwise we could increase $$f$$ by moving along that allowable direction). In other words, $${\displaystyle \nabla f(\mathbf {x} )\in A^{\perp }=S}$$. Thus there are scalars $${\displaystyle \lambda _{1},\lambda _{2},....\lambda _{M}}$$ such that

$${\displaystyle \nabla f(\mathbf {x} )=\sum _{k=1}^{M}\lambda _{k}\,\nabla g_{k}(\mathbf {x} )\quad \iff \quad \nabla f(\mathbf {x} )-\sum _{k=1}^{M}{\lambda _{k}\nabla g_{k}(\mathbf {x} )}=0.}$$

These scalars are the Lagrange multipliers. We now have $$M$$ of them, one for every constraint.

As before, we introduce an auxiliary function

$${\displaystyle {\mathcal {L}}\left(x_{1},\ldots ,x_{n},\lambda _{1},\ldots ,\lambda _{M}\right)=f\left(x_{1},\ldots ,x_{n}\right)-\sum \limits _{k=1}^{M}{\lambda _{k}g_{k}\left(x_{1},\ldots ,x_{n}\right)}}$$

and solve

$${\displaystyle \nabla _{x_{1},\ldots ,x_{n},\lambda _{1},\ldots ,\lambda _{M}}{\mathcal {L}}(x_{1},\ldots ,x_{n},\lambda _{1},\ldots ,\lambda _{M})=0\iff {\begin{cases}\nabla f(\mathbf {x} )-\sum _{k=1}^{M}{\lambda _{k}\,\nabla g_{k}(\mathbf {x} )}=0\\g_{1}(\mathbf {x} )=\cdots =g_{M}(\mathbf {x} )=0\end{cases}}}$$

which amounts to solving $${\displaystyle n+M}$$ equations in $${\displaystyle n+M}$$ unknowns.
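As a small worked example (not from the article) with $$n=3$$ and $$M=2$$: minimize $$f(x,y,z)=x^{2}+y^{2}+z^{2}$$ subject to $$g_{1}(x,y,z)=x+y-1=0$$ and $$g_{2}(x,y,z)=y+z-1=0$$. The stationarity condition $$\nabla f=\lambda _{1}\nabla g_{1}+\lambda _{2}\nabla g_{2}$$ reads

$${\displaystyle (2x,2y,2z)=\lambda _{1}(1,1,0)+\lambda _{2}(0,1,1),}$$

so $$2x=\lambda _{1}$$, $$2y=\lambda _{1}+\lambda _{2}$$, $$2z=\lambda _{2}$$, and hence $$2y=2x+2z$$. The constraints give $$x=1-y=z$$, so $$2(1-x)=4x$$ and $$x=z=\tfrac{1}{3}$$, $$y=\tfrac{2}{3}$$, with $$\lambda _{1}=\lambda _{2}=\tfrac{2}{3}$$ and $$f=\tfrac{2}{3}$$: five equations ($$n+M=5$$) in five unknowns, as promised.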

The method of Lagrange multipliers is generalized by the

...

status not read

Lagrange multiplier - Wikipedia
mization example. Multiple constraints[edit ] [imagelink] [emptylink] Figure 2: A paraboloid constrained along two intersecting lines. [imagelink] [emptylink] Figure 3: Contour map of Figure 2. <span>The method of Lagrange multipliers can be extended to solve problems with multiple constraints using a similar argument. Consider a paraboloid subject to two line constraints that intersect at a single point. As the only feasible solution, this point is obviously a constrained extremum. However, the level set of f {\displaystyle f} is clearly not parallel to either constraint at the intersection point (see Figure 3); instead, it is a linear combination of the two constraints' gradients. In the case of multiple constraints, that will be what we seek in general: the method of Lagrange seeks points not at which the gradient of f {\displaystyle f} is multiple of any single constraint's gradient necessarily, but in which it is a linear combination of all the constraints' gradients. Concretely, suppose we have M {\displaystyle M} constraints and are walking along the set of points satisfying g i ( x ) = 0 , i = 1 , … , M {\displaystyle g_{i}(\mathbf {x} )=0,i=1,\dots ,M} . Every point x {\displaystyle \mathbf {x} } on the contour of a given constraint function g i {\displaystyle g_{i}} has a space of allowable directions: the space of vectors perpendicular to ∇ g i ( x ) {\displaystyle \nabla g_{i}(\mathbf {x} )} . The set of directions that are allowed by all constraints is thus the space of directions perpendicular to all of the constraints' gradients. Denote this space of allowable moves by A {\displaystyle A} and denote the span of the constraints' gradients by S {\displaystyle S} . Then A = S ⊥ {\displaystyle A=S^{\perp }} , the space of vectors perpendicular to every element of S {\displaystyle S} . We are still interested in finding points where f {\displaystyle f} does not change as we walk, since these points might be (constrained) extrema. 
We therefore seek $$\mathbf x$$ such that any allowable direction of movement away from $$\mathbf x$$ is perpendicular to $$\nabla f(\mathbf x)$$ (otherwise we could increase $$f$$ by moving along that allowable direction). In other words, $$\nabla f(\mathbf x) \in A^{\perp} = S$$. Thus there are scalars $$\lambda_1, \lambda_2, \ldots, \lambda_M$$ such that

$$\nabla f(\mathbf x) = \sum_{k=1}^{M} \lambda_k \nabla g_k(\mathbf x) \quad \iff \quad \nabla f(\mathbf x) - \sum_{k=1}^{M} \lambda_k \nabla g_k(\mathbf x) = 0.$$

These scalars are the Lagrange multipliers. We now have $$M$$ of them, one for every constraint. As before, we introduce an auxiliary function

$$\mathcal{L}\left(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_M\right) = f\left(x_1, \ldots, x_n\right) - \sum_{k=1}^{M} \lambda_k g_k\left(x_1, \ldots, x_n\right)$$

and solve

$$\nabla_{x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_M} \mathcal{L}(x_1, \ldots, x_n, \lambda_1, \ldots, \lambda_M) = 0 \iff \begin{cases} \nabla f(\mathbf x) - \sum_{k=1}^{M} \lambda_k \nabla g_k(\mathbf x) = 0 \\ g_1(\mathbf x) = \cdots = g_M(\mathbf x) = 0, \end{cases}$$

which amounts to solving $$n + M$$ equations in $$n + M$$ unknowns.
The method of Lagrange multipliers is generalized by the Karush–Kuhn–Tucker conditions, which can also take into account inequality constraints of the form $$h(\mathbf x) \le c$$.

Modern formulation via differentiable manifolds — The problem of finding the local maxima and minima subject to constraints can be generalized to finding local maxima and minima on
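The two-constraint recipe above can be checked numerically. For a paraboloid $$f(x, y) = x^2 + y^2$$ with the line constraints $$g_1 = x + y - 1 = 0$$ and $$g_2 = x - y = 0$$ (made-up for illustration; they intersect at a single point), stationarity of $$\mathcal{L}$$ plus feasibility is a linear system in $$n + M = 4$$ unknowns — a minimal sketch:

```python
import numpy as np

# grad L = 0 and the constraints, as a 4x4 linear system in (x, y, l1, l2):
#   2x - l1 - l2 = 0   (dL/dx = 0)
#   2y - l1 + l2 = 0   (dL/dy = 0)
#   x + y        = 1   (g1 = 0)
#   x - y        = 0   (g2 = 0)
A = np.array([[2.0, 0.0, -1.0, -1.0],
              [0.0, 2.0, -1.0,  1.0],
              [1.0, 1.0,  0.0,  0.0],
              [1.0, -1.0, 0.0,  0.0]])
b = np.array([0.0, 0.0, 1.0, 0.0])

x, y, l1, l2 = np.linalg.solve(A, b)
print(x, y, l1, l2)  # 0.5 0.5 1.0 0.0
```

At the intersection point $$(1/2, 1/2)$$, $$\nabla f = (1, 1)$$ is indeed a linear combination of the two constraint gradients, with multipliers $$\lambda_1 = 1$$, $$\lambda_2 = 0$$.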

#### Annotation 4773318364428

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Mycoses are diseases caused by microscopic fungi called mycetes (more precisely micromycetes, as opposed to macromycetes, the fungi visible in the environment), which are capable of living as parasites in humans.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773319937292

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
Mycetes, which are true eukaryotes, constitute a kingdom (Fungi) distinct from that of plants (as they lack the chlorophyll-assimilating pigment) and from the animal kingdom. Fungi obtain their nutrition solely by absorption through the mycelium (a network of filaments).

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773320985868

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
Most mycetes are opportunistic pathogens, taking advantage of a weakened host to cause infection: they are either commensal fungi normally present in humans, or fungi present in the environment (moulds) that can enter the body (for example, Aspergillus). Other mycetes (dermatophytes), which behave as obligate parasites, are pathogenic regardless of the patient's immune status.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773322034444

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
They are very widespread: the number of known species is estimated at more than one million, of which only a few hundred are potentially pathogenic in humans.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773323083020

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
From a practical standpoint, three types of mycetes are distinguished according to their morphology: filamentous, yeast-like, and dimorphic.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773324131596

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
They grow on their nutrient substrate through a system of more or less branched filaments called the thallus or mycelium, made up of filaments (or hyphae) that may or may not be septate.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773325180172

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
Among the filamentous mycetes involved in disease, a distinction is made between:

• dermatophytes: keratinophilic fungi adapted to the skin and skin appendages of humans or animals, which cause lesions regardless of the patient's immune status;
• moulds from the environment with opportunistic behaviour (for example, Aspergillus): their growth in humans is made possible by a weakening of the immune defences.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773326228748

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
B. Yeast-like fungi — In this case, the thallus is reduced to a unicellular state. The classic appearance is that of a yeast, round or oval in shape and small in size (generally less than 10 μm), which reproduces by budding. Some yeasts, belonging for example to the genus Candida, can give rise through successive budding to a pseudomycelium or even to true mycelial filaments.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773328850188

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
Dimorphic fungi, usually absent from metropolitan France, originate from tropical or subtropical regions — for example, Histoplasma and Talaromyces marneffei (formerly Penicillium marneffei).

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773329898764

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
Pneumocystis jirovecii, the agent of human pneumocystosis, deserves separate mention. It is an atypical fungus that cannot be cultured.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773330947340

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
More than 400 fungal species are now implicated in pathological processes in humans, and this number continues to grow.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773331995916

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
In humans, fungi can be responsible for:

• poisonings due to exposure to certain micromycetes (mycotoxicoses) or macromycetes (phalloides syndrome, etc.); these conditions are not mycoses and are not covered in this book;
• immunoallergic conditions linked to a state of hypersensitivity (extrinsic allergic alveolitis, asthma, etc.), which are likewise not covered in this book;
• infections resulting from parasitism: superficial or deep mycoses, most often opportunistic.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773333044492

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
With a few exceptions, keratinophilic fungi colonize only the skin and skin appendages (nails, body hair, scalp hair). The severity of the lesions varies with the degree of parasitic adaptation. Only so-called "anthropophilic" species may be relatively well tolerated; in such cases the mycotic lesions are discreet or even absent. By contrast, so-called "zoophilic" or "geophilic" species, poorly adapted or not adapted to humans, cause dermatophytoses that are often inflammatory.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773334093068

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine
A. Fungi adapted to parasitism — The best example is that of keratinophilic micromycetes, whose avidity for animal and human keratin is very pronounced.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773335141644

#MLBook #machine-learning #review
Solution

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773340646668

[unknown IMAGE 4773337763084]
#MLBook #has-images #machine-learning

I already presented SVM in the introduction, so this section only fills a couple of blanks. Two critical questions need to be answered:

1. What if there’s noise in the data and no hyperplane can perfectly separate positive examples from negative ones?
2. What if the data cannot be separated using a plane, but could be separated by a higher-order polynomial?

You can see both situations depicted in Figure 5. In the left case, the data could be separated by a straight line if not for the noise (outliers or examples with wrong labels). In the right case, the decision boundary is a circle and not a straight line.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773342743820

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
It remains today the most common opportunistic infection in this context in metropolitan France, despite the overall decline in the incidence of AIDS since antiretroviral treatments became available.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773343792396

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
The first descriptions of Pneumocystis jirovecii pneumonia (PCP) were made in the 1940s in malnourished infants in Eastern Europe.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773345627404

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Pneumocystosis is a pulmonary mycosis caused by a transmissible, cosmopolitan opportunistic fungus named Pneumocystis jirovecii, occurring mainly in immunocompromised patients. The infection presents essentially as pneumonia, while extrapulmonary localizations are rare.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773346675980

[unknown IMAGE 4773337763084]
#MLBook #hard-margin-SVM #has-images #hinge-loss #machine-learning #noise #soft-margin-SVM #support-vector-machine

To extend SVM to cases in which the data is not linearly separable, we introduce the hinge loss function: $$\max (0, 1 − y_i (\mathbf w \mathbf x_i − b))$$.

The hinge loss function is zero if the constraints in 8 [i.e., $$\mathbf w \mathbf x_i − b \ge +1 \; \textrm{if} \; y_i = +1$$ and $$\mathbf w \mathbf x_i − b \le -1 \; \textrm{if} \; y_i = -1$$] are satisfied; in other words, if $$\mathbf w \mathbf x_i$$ lies on the correct side of the decision boundary. For data on the wrong side of the decision boundary, the function’s value is proportional to the distance from the decision boundary.

We then wish to minimize the following cost function,

$$C \left\Vert \mathbf w \right\Vert^2 + \frac{1}{N} \displaystyle \sum_{i=1}^N \max (0, 1 − y_i (\mathbf w \mathbf x_i − b))$$,

where the hyperparameter $$C$$ determines the tradeoff between increasing the size of the decision boundary and ensuring that each $$\mathbf x_i$$ lies on the correct side of the decision boundary. The value of $$C$$ is usually chosen experimentally, just like ID3’s hyperparameters $$\epsilon$$ and $$d$$ . SVMs that optimize hinge loss are called soft-margin SVMs, while the original formulation is referred to as a hard-margin SVM.

As you can see, for sufficiently high values of $$C$$, the second term in the cost function will become negligible, so the SVM algorithm will try to find the highest margin by completely ignoring misclassification. As we decrease the value of $$C$$, making classification errors becomes more costly, so the SVM algorithm tries to make fewer mistakes by sacrificing the margin size. As we have already discussed, a larger margin is better for generalization. Therefore, $$C$$ regulates the tradeoff between classifying the training data well (minimizing empirical risk) and classifying future examples well (generalization).
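The cost function above can be evaluated directly. This is a minimal sketch (the toy dataset, the weights `w`, `b`, and the `C` values are made-up, not from the text) showing how the $$C\left\Vert \mathbf w \right\Vert^2$$ term comes to dominate the average hinge loss as $$C$$ grows:

```python
import numpy as np

def soft_margin_cost(w, b, X, y, C):
    # C * ||w||^2 + (1/N) * sum_i max(0, 1 - y_i * (w·x_i - b))
    hinge = np.maximum(0.0, 1.0 - y * (X @ w - b))
    return C * np.dot(w, w) + hinge.mean()

# Toy 1D data with one mislabeled point, so no hyperplane separates it perfectly
X = np.array([[-2.0], [-1.0], [1.0], [2.0], [-0.5]])
y = np.array([-1, -1, 1, 1, 1])  # the last point is "noise"

w, b = np.array([1.0]), 0.0
print(soft_margin_cost(w, b, X, y, C=0.01))  # hinge term dominates
print(soft_margin_cost(w, b, X, y, C=10.0))  # ||w||^2 term dominates
```

Only the noisy point contributes a nonzero hinge loss here; with large `C`, the regularizer swamps it, which is the "ignore misclassification, maximize margin" regime described above.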

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773348248844

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Two forms of this ascomycete are described, along with transitional forms:

• asci (formerly called cysts), which are ovoid or spherical and measure 4 to 7 μm in diameter; mature asci contain eight ascospores; empty asci have a characteristic deflated-balloon shape;
• trophic forms, or trophozoites, which are highly variable in shape and size (2 to 8 μm). They are mononuclear and amoeboid, and are equipped with elongations called filopodia that allow them to attach very tightly to the type I pulmonary epithelial cells, where they multiply actively. It is from the large trophic forms that the asci develop.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773349297420

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Mature asci contain eight ascospores which, after release and according to a hypothetical cycle, give rise to small trophic forms and then to larger, polymorphic trophic forms. The trophic forms, through conjugation, produce new asci which, after maturation, release new ascospores.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773351918860

[unknown IMAGE 4773350608140]
#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose #has-images
Life cycle of Pneumocystis jirovecii

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773353229580

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
In this context, the host-species specificity of Pneumocystis species rules out an animal reservoir for P. jirovecii and points to a strictly human reservoir.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773354278156

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Moreover, airborne person-to-person transmission of P. jirovecii is now accepted in humans. It can be the source of clusters of cases or of outbreaks in hospital wards caring for immunocompromised patients.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773355326732

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
In at-risk patients, the growth of P. jirovecii causes lesions of the alveolar epithelium whose walls thicken, producing a diffuse interstitial pneumonia responsible for hypoxemia and respiratory failure.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773356375308

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
1. In immunocompromised children not infected with HIV — Pneumocystosis occurs mainly during the first year of life in children with severe combined immunodeficiency or congenital hypogammaglobulinemia. Onset is abrupt, with dyspnea, dry cough, and fever. Mortality is 100% without treatment.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773357423884

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
2. In immunocompromised children infected with HIV — The incidence of pneumocystosis was about 40% in developed countries before the use of antiretrovirals. Clinical onset is gradual, with tachypnea, fever, and cough.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773358472460

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
1. In HIV-infected adults — Pneumocystosis presents, in half of cases, as a classic triad of gradual onset: fever, dry cough, and dyspnea of increasing intensity. Purely febrile forms may occur.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773359521036

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Conversely, when the diagnosis is made late, patients present with acute respiratory failure. The chest radiograph is almost opaque, with a "ground-glass" appearance ("white lungs"). Blood gases may show severe hypoxemia (PaO2 below 60 mmHg). These cases carry a poor prognosis. Overall mortality on treatment is estimated at 15%.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773360569612

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
When the CD4 count is below 100/mm3, other opportunistic infections may be considered in the presence of interstitial pneumonia: parasitic or fungal (toxoplasmosis, cryptococcosis, histoplasmosis, penicilliosis) or bacterial (pneumococcus, Haemophilus, tuberculosis). Kaposi's sarcoma or lymphoid interstitial pneumonia may also be discussed.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773361618188

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
The symptoms are close to those seen in HIV-infected patients, but the picture is more acute and progresses more rapidly to respiratory failure. Mortality is also higher than in HIV-infected patients (30 to 60%).

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773365026060

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Examination of bronchoalveolar lavage fluid is the best test for its detection. Examination of induced sputum or oropharyngeal washes, although less sensitive, may be proposed when bronchoalveolar lavage is contraindicated. Transparietal or transbronchial biopsy is rarely used in France.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773366861068

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Diagnosis rests on the demonstration of P. jirovecii by direct microscopic examination of stained biological specimens. Silver staining or toluidine blue stains the wall of asci grouped in clusters well (fig. 31.3 and 31.4). Giemsa staining is essential to demonstrate the trophic forms, which are not seen with the previous stains: it also shows the ascospores (up to eight, arranged in a rosette) and the foamy clusters containing the trophic forms and the mature and immature asci (fig. 31.5).

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773372628236

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
PCR, owing to its high sensitivity, can detect small quantities of P. jirovecii in bronchoalveolar lavage fluid and can thus diagnose low-burden infections. Although it also increases the sensitivity of sputum or oropharyngeal-wash examination when bronchoalveolar lavage cannot be performed, a negative result on this type of sample does not rule out pneumocystosis. It is becoming the reference test whatever the pulmonary sample analysed.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773377608972

[unknown IMAGE 4773373938956]
#MLBook #SVM #has-images #machine-learning #non-linearity

SVM can be adapted to work with datasets that cannot be separated by a hyperplane in its original space. Indeed, if we manage to transform the original space into a space of higher dimensionality, we could hope that the examples will become linearly separable in this transformed space. In SVMs, using a function to implicitly transform the original space into a higher dimensional space during the cost function optimization is called the kernel trick.

The effect of applying the kernel trick is illustrated in Figure 6. As you can see, it's possible to transform two-dimensional non-linearly-separable data into linearly-separable three-dimensional data using a specific mapping $$\phi: \mathbf x \mapsto \phi (\mathbf x)$$, where $$\phi (\mathbf x)$$ is a vector of higher dimensionality than $$\mathbf x$$. For the example of 2D data in Figure 5 (right), the mapping $$\phi$$ that projects a 2D example $$\mathbf x = \left[ q, p \right]$$ into a 3D space (Figure 6) would look like this: $$\phi \left( \left[ q, p \right] \right) \stackrel{\textrm{def}}{=} \left( q^2, \sqrt{2} qp, p^2\right)$$, where $$\cdot^2$$ means $$\cdot$$ squared. You can now see that the data becomes linearly separable in the transformed space.
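A quick numeric check of this claim (a sketch: only the mapping $$\phi$$ comes from the text; the sample points and the separating plane are assumptions). For data separated by a circle of radius 1, the plane $$w \cdot \phi(\mathbf x) = 1$$ with $$w = (1, 0, 1)$$ separates the classes in the 3D space, because $$\phi(\mathbf x) \cdot (1, 0, 1) = q^2 + p^2$$, the squared radius:

```python
import numpy as np

def phi(x):
    # maps a 2D point [q, p] to 3D: (q^2, sqrt(2)*q*p, p^2)
    q, p = x
    return np.array([q**2, np.sqrt(2) * q * p, p**2])

# Points inside the unit circle (one class) and outside it (the other)
inside  = [np.array([0.3, 0.2]), np.array([-0.5, 0.1])]
outside = [np.array([1.5, 0.5]), np.array([-1.0, 1.2])]

w = np.array([1.0, 0.0, 1.0])  # normal of a separating plane in the 3D space
print([float(w @ phi(x)) for x in inside])   # all < 1
print([float(w @ phi(x)) for x in outside])  # all > 1
```

The circular boundary in 2D has become a linear one (a plane) in the transformed space.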

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773379706124

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
C. Non-specific serum biological markers — Detection of serum β(1,3)-D-glucan antigen is positive in patients developing invasive fungal infections, including pneumocystosis, since the wall of the asci contains β(1,3)-D-glucan. It is a complementary diagnostic test; direct examination and PCR remain the first-line investigations.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773382851852

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Serum lactate dehydrogenase levels are generally elevated in HIV-infected patients developing pneumocystosis. Detection of serum anti-Pneumocystis antibodies is not used for diagnosis but is of interest for seroprevalence surveys.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773383900428

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
First-line treatment is co-trimoxazole (trimethoprim plus sulfamethoxazole); in case of intolerance or contraindication, the following may be used:

• the combination of clindamycin and primaquine, or pentamidine isethionate (often poorly tolerated), in moderate to severe forms;
• atovaquone, or dapsone with or without trimethoprim, in mild to moderate forms.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773384949004

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Co-trimoxazole (Bactrim® and its generics) is given at a dose of 20 mg/kg per day of trimethoprim and 100 mg/kg per day of sulfamethoxazole, in three to four doses, orally or intravenously, for 2 to 3 weeks depending on the underlying disease. Side effects occur in more than 50% of cases: rash, fever, leukopenia, anemia, thrombocytopenia, elevated transaminases, etc.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773385997580

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
The combination of clindamycin (Dalacine®) and primaquine is prescribed at 1,800 mg per day in three doses (IV or oral) and 15 mg per day in a single oral dose, respectively.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773387046156

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Pentamidine isethionate (Pentacarinat®) is given by slow intravenous infusion at 4 mg/kg per day for 3 weeks. Intramuscular injections are not recommended because of the risk of pain and necrosis at the injection site. Side effects are numerous: renal failure, orthostatic hypotension, leukopenia, thrombocytopenia, hypoglycemia, cardiac arrhythmias, diabetes, acute pancreatitis, elevated transaminases, etc.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773388094732

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Atovaquone (Wellvone®) is used as a suspension at a dose of 1,500 mg per day, in two doses, for 21 days.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773389143308

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
In HIV-infected patients with associated hypoxia (PaO2 below 60 mmHg), corticosteroid therapy may be added. In patients not infected with HIV, the benefit of corticosteroids is controversial.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773390191884

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Dapsone (Disulone®) is prescribed at 100 mg per day, alone or in combination with trimethoprim (Delprim®) at 20 mg per day (off-label).

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773391240460

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
1. Primary prophylaxis — In HIV-infected patients, it should be considered as soon as the CD4 count falls below 200/mm3 or below 15% of lymphocytes. It should be started earlier if the CD4 count is falling rapidly, or in the presence of concurrent chemotherapy (lymphoma, Kaposi's sarcoma), another opportunistic infection, or severe deterioration of the general condition.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773392289036

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Solid-organ transplant recipients and patients who have received an allogeneic hematopoietic stem cell transplant routinely receive co-trimoxazole prophylaxis for the 6 months following transplantation.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773393337612

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Prophylaxis is also recommended in patients with acute lymphoblastic leukemia, from induction to the end of maintenance chemotherapy.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773395172620

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
Furthermore, given the risk of person-to-person transmission, patients infected with P. jirovecii must be isolated from susceptible patients.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773396221196

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose

Discontinuation of primary and secondary prophylaxis:

• After improvement on antiretrovirals, if the CD4 count remains above 200/mm3 durably (for 3 months) and the HIV viral load is below 1,000 copies/ml, prophylactic treatments may be discontinued.
• Recommendations for immunocompromised patients other than those infected with HIV are less well established.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773397269772

#Cours #Facultaires #Infectiologie #Maladies-infectieuses-et-tropicales #Médecine #Pneumocystose
2. Secondary prophylaxis — The preferred treatment is oral co-trimoxazole (one Bactrim® tablet per day or one Bactrim® forte tablet three times a week). In case of intolerance, the alternatives are aerosolized pentamidine at 4 mg/kg per week; oral dapsone at 100 mg per day, alone or combined with 50 mg of pyrimethamine (Malocide®) per week; or atovaquone (Wellvone®) suspension at 1,500 mg per day in two doses.

status not read

#### pdf

cannot see any pdfs

#### Annotation 4773398318348

#MLBook #RBF-kernel #SVM #kernel-functions #machine-learning #non-linearity

However, we don’t know a priori which mapping $$\phi$$ would work for our data. If we first transform all our input examples using some mapping into very high dimensional vectors and then apply SVM to this data, and we try all possible mapping functions, the computation could become very inefficient, and we would never solve our classification problem.

Fortunately, scientists figured out how to use kernel functions (or, simply, kernels) to efficiently work in higher-dimensional spaces without doing this transformation explicitly. To understand how kernels work, we first have to see how the optimization algorithm for SVM finds the optimal values for $$\mathbf w$$ and $$b$$. The method traditionally used to solve the optimization problem in eq. 9 is the method of Lagrange multipliers. Instead of solving the original problem from eq. 9, it is convenient to solve an equivalent problem formulated like this:
$$\max_{\alpha_1 \ldots \alpha_N} \displaystyle \sum_{i=1}^N \alpha_i - \frac{1}{2} \sum_{i=1}^N \sum_{k=1}^N y_i \alpha_i (\mathbf x_i \mathbf x_k) y_k \alpha_k \; \textrm{subject to} \; \sum_{i=1}^N \alpha_i y_i = 0 \; \textrm{and} \; \alpha_i \ge 0, i = 1, \ldots, N,$$

where $$\alpha_i$$ are called Lagrange multipliers. When formulated like this, the optimization problem becomes a convex quadratic optimization problem, efficiently solvable by quadratic programming algorithms.

Now, you could have noticed that in the above formulation, there is a term $$\mathbf x_i \mathbf x_k$$ , and this is the only place where the feature vectors are used. If we want to transform our vector space into higher dimensional space, we need to transform $$\mathbf x_i$$ into $$\phi ( \mathbf x_i )$$ and $$\mathbf x_k$$ into $$\phi ( \mathbf x_k )$$ and then multiply $$\phi ( \mathbf x_i )$$ and $$\phi ( \mathbf x_k )$$. Doing so would be very costly.

On the other hand, we are only interested in the result of the dot-product $$\mathbf x_i \mathbf x_k$$, which, as we know, is a real number. We don’t care how this number was obtained as long as it’s correct. By using the kernel trick, we can get rid of a costly transformation of original feature vectors into higher-dimensional vectors and avoid computing their dot-product. We replace that by a simple operation on the original feature vectors that gives the same result. For example, instead of transforming $$( q_1, p_1 )$$ into $$( q_1^2, \sqrt{2} q_1 p_1, p_1^2 )$$ and $$( q_2, p_2 )$$ into $$( q_2^2, \sqrt{2} q_2 p_2, p_2^2 )$$ and then computing the dot-product of $$( q_1^2, \sqrt{2} q_1 p_1, p_1^2 )$$ and $$( q_2^2, \sqrt{2} q_2 p_2, p_2^2 )$$ to obtain $$( q_1^2 q_2^2 + 2 q_1 q_2 p_1 p_2 + p_1^2 p_2^2 )$$ we could find the dot-product between $$( q_1, p_1 )$$ and $$( q_2, p_2 )$$ to get $$( q_1 q_2 + p_1 p_2 )$$ and then square it to get exactly the same result $$( q_1^2 q_2^2 + 2 q_1 q_2 p_1 p_2 + p_1^2 p_2^2 )$$.
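The identity above can be verified directly. In this sketch, `phi` is the mapping from the text and the two sample vectors are arbitrary; the explicit dot product in the 3D space and the squared dot product in the original 2D space give the same number:

```python
import numpy as np

def phi(x):
    # the explicit mapping (q^2, sqrt(2)*q*p, p^2) from the text
    q, p = x
    return np.array([q**2, np.sqrt(2) * q * p, p**2])

x1 = np.array([1.0, 2.0])
x2 = np.array([3.0, -1.0])

explicit = phi(x1) @ phi(x2)  # dot product after the costly transformation
kernel   = (x1 @ x2) ** 2     # quadratic kernel on the original vectors
print(explicit, kernel)       # identical: 1.0 1.0
```

This is why the kernel trick works in the dual formulation: the feature vectors only ever appear inside dot products, so replacing $$\mathbf x_i \mathbf x_k$$ with $$k(\mathbf x_i, \mathbf x_k)$$ implicitly optimizes in the transformed space.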

That was an example of the kernel trick, and we used the quadratic kernel $$k ( \mathbf x_i, \mathbf x_k ) \stackrel{\textrm{def}}{=} ( \mathbf x_i \mathbf x_k )^2$$. Multiple kernel functions exist, the most widely used of which is the RBF kernel:

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789192756492

Tags
#MLBook #likelihood #logistic-regression #machine-learning
Question
In logistic regression, on the other hand, we maximize the [...] of our training set according to the model. In statistics, the likelihood function defines how likely the observation (an example) is according to our model.
Answer
likelihood

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In logistic regression, on the other hand, we maximize the likelihood of our training set according to the model. In statistics, the likelihood function defines how likely the observation (an example) is according to our model.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789194329356

Tags
#Cardiologie #Médecine #Physiologie #Rythmologie
Question
Class IV — The class IV drugs are [...].
Answer

#### calcium channel blockers

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Class IV — The class IV drugs are calcium channel blockers. Verapamil has a more pronounced inhibitory effect on the slow response SA and AV nodes than diltiazem .

#### Original toplevel document

UpToDate
iarrhythmic activity [11]. Subclass IIIc (transmitter dependent K channel blockers) such as acetylcholine activate potassium channels, with no clinically available drugs yet having this action. <span>Class IV — The class IV drugs are calcium channel blockers. Verapamil has a more pronounced inhibitory effect on the slow response SA and AV nodes than diltiazem. In comparison, the dihydropyridines, such as nifedipine, have little electrophysiologic effect on the heart. Verapamil and diltiazem can slow the sinus rate (usually in the presence of

#### Flashcard 4789195902220

Tags
#MLBook #likelihood #logistic-regression #machine-learning
Question
In logistic regression, on the other hand, we maximize the likelihood of our training set according to the model. In statistics, the likelihood function defines [...].
Answer
how likely the observation (an example) is according to our model

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In logistic regression, on the other hand, we maximize the likelihood of our training set according to the model. In statistics, the likelihood function defines how likely the observation (an example) is according to our model.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789197475084

Tags
#MLBook #likelihood #logistic-regression #machine-learning
Question
In logistic regression, on the other hand, we maximize the likelihood of our training set according to the model. In statistics, the [...] defines how likely the observation (an example) is according to our model.
Answer
likelihood function

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In logistic regression, on the other hand, we maximize the likelihood of our training set according to the model. In statistics, the likelihood function defines how likely the observation (an example) is according to our model.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789199834380

Tags
#Cardiologie #Médecine #Physiologie #Rythmologie
Question
Verapamil has a more pronounced inhibitory effect on the slow response SA and AV nodes than [...].
Answer

#### diltiazem

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Class IV — The class IV drugs are calcium channel blockers. Verapamil has a more pronounced inhibitory effect on the slow response SA and AV nodes than diltiazem .

#### Original toplevel document

UpToDate
iarrhythmic activity [11]. Subclass IIIc (transmitter dependent K channel blockers) such as acetylcholine activate potassium channels, with no clinically available drugs yet having this action. <span>Class IV — The class IV drugs are calcium channel blockers. Verapamil has a more pronounced inhibitory effect on the slow response SA and AV nodes than diltiazem. In comparison, the dihydropyridines, such as nifedipine, have little electrophysiologic effect on the heart. Verapamil and diltiazem can slow the sinus rate (usually in the presence of

#### Flashcard 4789203242252

[unknown IMAGE 4789208222988]
Tags
#Immunologie #Médecine #Physiologie #has-images
Question
Where do B lymphocytes mature ?
Answer

#### B lymphocytes mature in the bone marrow

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
All lymphocytes arise from stem cells in the bone marrow (Fig. 1-10). B lymphocytes mature in the bone marrow, and T lymphocytes mature in an organ called the thymus.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789205601548

[unknown IMAGE 4789208222988]
Tags
#Immunologie #Médecine #Physiologie #has-images
Question
Where do T lymphocytes mature ?
Answer

#### T lymphocytes mature in an organ called the thymus

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
All lymphocytes arise from stem cells in the bone marrow (Fig. 1-10). B lymphocytes mature in the bone marrow, and T lymphocytes mature in an organ called the thymus.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789213203724

Tags
#Kaliémie #Médecine #Physiologie
Question

Acid-base disturbances cause potassium to shift into and out of cells ("internal potassium balance" [2])

### Plasma potassium concentration will rise by [...] mEq/L for every 0.1 unit reduction of the extracellular pH [3].

However, this estimate was based upon only five patients with a variety of disturbances, and the range was very broad (0.2 to 1.7 mEq/L).

Answer

### 0.6

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
turbances cause potassium to shift into and out of cells, a phenomenon called "internal potassium balance" [2 ]. An often-quoted study found that the plasma potassium concentration will rise by <span>0.6 mEq/L for every 0.1 unit reduction of the extracellular pH [3 ]. However, this estimate was based upon only five patients with a variety of disturbances, and the range was very broad (0

#### Original toplevel document

UpToDate
[1]. These changes are most pronounced with metabolic acidosis but can also occur with metabolic alkalosis and, to a lesser degree, respiratory acid-base disorders. INTERNAL POTASSIUM BALANCE — <span>Acid-base disturbances cause potassium to shift into and out of cells, a phenomenon called "internal potassium balance" [2]. An often-quoted study found that the plasma potassium concentration will rise by 0.6 mEq/L for every 0.1 unit reduction of the extracellular pH [3]. However, this estimate was based upon only five patients with a variety of disturbances, and the range was very broad (0.2 to 1.7 mEq/L). This variability in the rise or fall of the plasma potassium in response to changes in extracellular pH was confirmed in subsequent studies [2,4]. Metabolic acidosis — In metabolic acidosis, more than one-half of the excess hydrogen ions are buffered in the cells. In this setting, electroneutrality is maintained in part by

#### Flashcard 4789220281612

Tags
#Kaliémie #Médecine #Physiologie
Question
A rise in the plasma potassium concentration can induce a mild metabolic [...].
Answer

### acidosis

In patients with hypoaldosteronism, for example, the mild metabolic acidosis is primarily due to the associated hyperkalemia [ 11].

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
Just as metabolic acidosis can cause hyperkalemia, a rise in the plasma potassium concentration can induce a mild metabolic acidosis. In patients with hypoaldosteronism, for example, the mild metabolic acidosis is primarily due to the associated hyperkalemia [ 11 ]. Two factors contribute to this phenomenon: ● A tran

#### Original toplevel document

UpToDate
e the ability of the organic anion to accompany the hydrogen ion into the cell, perhaps as the lipid-soluble, intact acid [9], and differential effects on insulin and glucagon secretion [4,10]. <span>Just as metabolic acidosis can cause hyperkalemia, a rise in the plasma potassium concentration can induce a mild metabolic acidosis. In patients with hypoaldosteronism, for example, the mild metabolic acidosis is primarily due to the associated hyperkalemia [11]. Two factors contribute to this phenomenon: ●A transcellular exchange occurs as the entry of most of the excess potassium into the cells is balanced in part by intracellular hydrogen ions moving into the extracellular fluid [12]. The net effect is an extracellular acidosis and an intracellular alkalosis. ●Normally, the kidney increases ammonium excretion after an acid load, an effect that is stimulated in part by a fall in intracellular pH [13]. In hyperkalemia, the associated intracellular alkalosis diminishes ammonium generation by the proximal tubule [14]. Hyperkalemia reduces the expression of ammonia-generating enzymes in the proximal tubule and upregulates expression of the ammonia-recycling enzyme glutamine synthetase [15]. Normally, ammonium exiting the proximal tubule is reabsorbed in the thick ascending limb via the apical Na+-K+/NH4+-2Cl- cotransporter (NKCC2), after which it crosses the interstitium and is excreted into the urine by the collecting duct [16-18]. However, potassium competes with ammonium for reabsorption by NKCC2, and therefore, elevated tubular potassium concentrations can impair normal renal ammonium handling, resulting in acidosis [19]. In addition, hyperkalemia reduces expression of the ammonia transporter family member Rhcg and decreases apical expression of H-ATPase in the inner stripe of the outer medullary collecting duct, further compromising urinary ammonium excretion [15]. 
The net effect of these changes in cation distribution and renal function is that metabolic acidosis and relative hyperkalemia are often seen together. Metabolic alkalosis — For similar

#### Flashcard 4789225262348

Tags
#Kaliémie #Médecine #Physiologie
Question
What are the 2 main factors that can lead to mild metabolic acidosis due to hyperkalemia ?
Answer

• #### A transcellular exchange occurs as the entry of most of the excess potassium into the cells is balanced in part by intracellular hydrogen ions moving into the extracellular fluid [12]. The net effect is an extracellular acidosis and an intracellular alkalosis.

• #### Elevated tubular potassium concentrations can impair normal renal ammonium handling, resulting in acidosis [19].

Normally, the kidney increases ammonium excretion after an acid load, an effect that is stimulated in part by a fall in intracellular pH [13]. In hyperkalemia, the associated intracellular alkalosis diminishes ammonium generation by the proximal tubule [14]. Hyperkalemia reduces the expression of ammonia-generating enzymes in the proximal tubule and upregulates expression of the ammonia-recycling enzyme glutamine synthetase [15]. Normally, ammonium exiting the proximal tubule is reabsorbed in the thick ascending limb via the apical Na+-K+/NH4+-2Cl- cotransporter (NKCC2), after which it crosses the interstitium and is excreted into the urine by the collecting duct [16-18]. However, potassium competes with ammonium for reabsorption by NKCC2, and therefore, elevated tubular potassium concentrations can impair normal renal ammonium handling, resulting in acidosis [19].

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
bolic acidosis. In patients with hypoaldosteronism, for example, the mild metabolic acidosis is primarily due to the associated hyperkalemia [ 11 ]. Two factors contribute to this phenomenon: ● <span>A transcellular exchange occurs as the entry of most of the excess potassium into the cells is balanced in part by intracellular hydrogen ions moving into the extracellular fluid [12 ]. The net effect is an extracellular acidosis and an intracellular alkalosis. ● Normally, the kidney increases ammonium excretion after an acid load, an effect that is stimulated in part by a fall in intracellular pH [13 ]. In hyperkalemia, the associated intracellular alkalosis diminishes ammonium generation by the proximal tubule [14 ]. Hyperkalemia reduces the expression of ammonia-generating enzymes in the proximal tubule and upregulates expression of the ammonia-recycling enzyme glutamine synthetase [15 ]. Normally, ammonium exiting the proximal tubule is reabsorbed in the thick ascending limb via the apical Na+-K+/NH4+-2Cl- cotransporter (NKCC2), after which it crosses the interstitium and is excreted into the urine by the collecting duct [16-18 ]. However, potassium competes with ammonium for reabsorption by NKCC2, and therefore, elevated tubular potassium concentrations can impair normal renal ammonium handling, resulting in acidosis [19 ]. In addition, hyperkalemia reduces expression of the ammonia transporter family member Rhcg and decreases apical expression of H-ATPase in the inner stripe of the outer medullary collecting duct, further compromising urinary ammonium excretion [15 ]. <span>

#### Original toplevel document

UpToDate
e the ability of the organic anion to accompany the hydrogen ion into the cell, perhaps as the lipid-soluble, intact acid [9], and differential effects on insulin and glucagon secretion [4,10]. <span>Just as metabolic acidosis can cause hyperkalemia, a rise in the plasma potassium concentration can induce a mild metabolic acidosis. In patients with hypoaldosteronism, for example, the mild metabolic acidosis is primarily due to the associated hyperkalemia [11]. Two factors contribute to this phenomenon: ●A transcellular exchange occurs as the entry of most of the excess potassium into the cells is balanced in part by intracellular hydrogen ions moving into the extracellular fluid [12]. The net effect is an extracellular acidosis and an intracellular alkalosis. ●Normally, the kidney increases ammonium excretion after an acid load, an effect that is stimulated in part by a fall in intracellular pH [13]. In hyperkalemia, the associated intracellular alkalosis diminishes ammonium generation by the proximal tubule [14]. Hyperkalemia reduces the expression of ammonia-generating enzymes in the proximal tubule and upregulates expression of the ammonia-recycling enzyme glutamine synthetase [15]. Normally, ammonium exiting the proximal tubule is reabsorbed in the thick ascending limb via the apical Na+-K+/NH4+-2Cl- cotransporter (NKCC2), after which it crosses the interstitium and is excreted into the urine by the collecting duct [16-18]. However, potassium competes with ammonium for reabsorption by NKCC2, and therefore, elevated tubular potassium concentrations can impair normal renal ammonium handling, resulting in acidosis [19]. In addition, hyperkalemia reduces expression of the ammonia transporter family member Rhcg and decreases apical expression of H-ATPase in the inner stripe of the outer medullary collecting duct, further compromising urinary ammonium excretion [15]. 
The net effect of these changes in cation distribution and renal function is that metabolic acidosis and relative hyperkalemia are often seen together. Metabolic alkalosis — For similar

#### Flashcard 4789226573068

Tags
#MLBook #logistic-regression #machine-learning #maximum-likelihood #solution
Question
Describe the optimization criterion in logistic regression.
Answer

For instance, let’s have a labeled example $$( \mathbf x_i, y_i )$$ in our training data. Assume also that we found (guessed) some specific values $$\hat {\mathbf w}$$ and $$\hat b$$ of our parameters. If we now apply our model $$f_{\hat{\mathbf w}, \hat b}$$ to $$\mathbf x_i$$ using eq. 3 $$\left[ f_{\mathbf w, b} (x) \stackrel{\textrm{def}}{=} \displaystyle \frac{1}{1 + e^{-(\mathbf w \mathbf x + b)}} \right]$$ we will get some value $$0 < p < 1$$ as output. If $$y_i$$ is the positive class, the likelihood of $$y_i$$ being the positive class, according to our model, is given by $$p$$. Similarly, if $$y_i$$ is the negative class, the likelihood of it being the negative class is given by $$1 − p$$.

The optimization criterion in logistic regression is called maximum likelihood. Instead of minimizing the average loss, like in linear regression, we now maximize the likelihood of the training data according to our model:

$$L_{\mathbf w, b} \stackrel{\textrm{def}}{=} \displaystyle \prod_{i = 1 \ldots N} f_{\mathbf w, b} (\mathbf x_i )^{y_i} (1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}. \quad (4)$$

The expression $$f_{\mathbf w, b} (\mathbf x_i )^{y_i} (1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}$$ may look scary but it’s just a fancy mathematical way of saying: “$$f_{\mathbf w, b} (\mathbf x_i )$$ when $$y_i = 1$$ and $$(1 - f_{\mathbf w, b} (\mathbf x_i ))$$ otherwise”. Indeed, if $$y_i = 1$$, then $$(1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}$$ equals 1 because $$(1 - y_i) = 0$$ and we know that anything to the power of 0 equals 1. On the other hand, if $$y_i = 0$$, then $$f_{\mathbf w, b} (\mathbf x_i )^{y_i}$$ equals 1 for the same reason.
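Eq. 4 can be evaluated directly. A minimal sketch, not from the book: the 1-D dataset and the guessed parameters below are invented, and are only meant to show that parameters that fit the labels yield a higher likelihood than parameters that contradict them.

```python
# Evaluating the likelihood of eq. 4 for a tiny 1-D dataset with guessed
# parameters (w_hat, b_hat).
import math

def f(x, w, b):                      # eq. 3: the logistic model, 1-D features
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def likelihood(data, w, b):          # eq. 4: product over all examples
    L = 1.0
    for x, y in data:
        p = f(x, w, b)
        L *= p ** y * (1 - p) ** (1 - y)
    return L

data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]

L_good = likelihood(data, 1.0, 0.0)   # parameters that match the labels
L_bad = likelihood(data, -1.0, 0.0)   # sign-flipped parameters
print(L_good, L_bad)                  # L_good is much larger than L_bad
```

Note how each factor is $$p$$ for a positive example and $$1 - p$$ for a negative one, exactly as the exponent trick in the text prescribes.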

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
er hand, we maximize the likelihood of our training set according to the model. In statistics, the likelihood function defines how likely the observation (an example) is according to our model. <span>For instance, let’s have a labeled example $$( \mathbf x_i, y_i )$$ in our training data. Assume also that we found (guessed) some specific values $$\hat {\mathbf w}$$ and $$\hat b$$ of our parameters. If we now apply our model $$f_{\hat{\mathbf w}, \hat b}$$ to $$\mathbf x_i$$ using eq. 3 $$\left[ f_{\mathbf w, b} (x) \stackrel{\textrm{def}}{=} \displaystyle \frac{1}{1 + e^{-(\mathbf w \mathbf x + b)}} \right]$$ we will get some value $$0 < p < 1$$ as output. If $$y_i$$ is the positive class, the likelihood of $$y_i$$ being the positive class, according to our model, is given by $$p$$. Similarly, if $$y_i$$ is the negative class, the likelihood of it being the negative class is given by $$1 − p$$. The optimization criterion in logistic regression is called maximum likelihood. Instead of minimizing the average loss, like in linear regression, we now maximize the likelihood of the training data according to our model: $$L_{\mathbf w, b} \stackrel{\textrm{def}}{=} \displaystyle \prod_{i = 1 \ldots N} f_{\mathbf w, b} (\mathbf x_i )^{y_i} (1 - f_{\mathbf w, b} (\mathbf x_i ))^{(1 - y_i)}. \quad (4)$$ The expression $$f_{\mathbf w, b} (\mathbf x )^{y_i} (1 - f_{\mathbf w, b} (\mathbf x ))^{(1 - y_i)}$$ may look scary but it’s just a fancy mathematical way of saying: “$$f_{\mathbf w, b} (\mathbf x )$$ when $$y_i = 1$$ and $$(1 - f_{\mathbf w, b} (\mathbf x ))$$ otherwise”. Indeed, if $$y_i = 1$$, then $$(1 - f_{\mathbf w, b} (\mathbf x ))^{(1 - y_i)}$$ equals 1 because $$(1 - y_i) = 0$$ and we know that anything power 0 equals 1. On the other hand, if $$y_i = 0$$, then $$f_{\mathbf w, b} (\mathbf x )^{y_i}$$ equals 1 for the same reason. <span>

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789229718796

[unknown IMAGE 4773033413900]
Tags
#MLBook #binary-classification #has-images #logistic-regression #machine-learning #problem-statement #sigmoid-function #standard-logistic-function
Question
State the problem in logistic regression.
Answer

In logistic regression, we still want to model $$y_i$$ as a linear function of $$\mathbf x_i$$, however, with a binary $$y_i$$ this is not straightforward. The linear combination of features such as $$\mathbf w \mathbf x_i + b$$ is a function that spans from minus infinity to plus infinity, while $$y_i$$ has only two possible values.

At the time when the absence of computers required scientists to perform manual calculations, they were eager to find a linear classification model. They figured out that if we define a negative label as 0 and the positive label as 1, we would just need to find a simple continuous function whose codomain is (0 , 1). In such a case, if the value returned by the model for input $$\mathbf x$$ is closer to 0, then we assign a negative label to $$\mathbf x$$ ; otherwise, the example is labeled as positive. One function that has such a property is the standard logistic function (also known as the sigmoid function):

$$f(x) = \displaystyle \frac{1}{1 + e^{-x}}$$,

where $$e$$ is the base of the natural logarithm (also called Euler’s number; $$e^x$$ is also known as the $$exp(x)$$ function in programming languages). Its graph is depicted in Figure 3.

The logistic regression model looks like this:
$$f_{\mathbf w, b} (\mathbf x) \stackrel{\textrm{def}}{=} \displaystyle \frac{1}{1 + e^{-(\mathbf w \mathbf x + b)}} \quad (3)$$

You can see the familiar term $$\mathbf w \mathbf x + b$$ from linear regression.

By looking at the graph of the standard logistic function, we can see how well it fits our classification purpose: if we optimize the values of $$\mathbf w$$ and $$b$$ appropriately, we could interpret the output of $$f( \mathbf x )$$ as the probability of $$y_i$$ being positive. For example, if it’s higher than or equal to the threshold 0.5 we would say that the class of $$\mathbf x$$ is positive; otherwise, it’s negative. In practice, the choice of the threshold could be different depending on the problem. We return to this discussion in Chapter 5 when we talk about model performance assessment.

Now, how do we find optimal $$\mathbf w^\ast$$ and $$b^\ast$$? In linear regression, we minimized the empirical risk which was defined as the average squared error loss, also known as the mean squared error or MSE.
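A minimal sketch of eq. 3 and the 0.5-threshold decision rule described above; the weights below are illustrative stand-ins, not fitted parameters.

```python
# The standard logistic (sigmoid) function and the logistic regression model
# of eq. 3, with the 0.5-threshold rule for turning a probability into a label.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(x, w, b):          # f_{w,b}(x) for a D-dimensional x
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def predict_label(x, w, b, threshold=0.5):
    return 1 if predict_proba(x, w, b) >= threshold else 0

w, b = (1.0, -1.0), 0.0              # hypothetical parameters

print(predict_proba((3.0, 1.0), w, b))  # > 0.5, so the class is positive
print(predict_label((3.0, 1.0), w, b))  # 1
print(predict_label((1.0, 3.0), w, b))  # 0
```

Whatever the input, the output stays strictly inside (0, 1), which is what lets us read it as a probability.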

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In logistic regression, we still want to model $$y_i$$ as a linear function of $$\mathbf x_i$$, however, with a binary $$y_i$$ this is not straightforward. The linear combination of features such as $$\mathbf w \mathbf x_i + b$$ is a function that spans from minus infinity to plus infinity, while $$y_i$$ has only two possible values. At the time where the absence of computers required scientists to perform manual calculations, they were eager to find a linear classification model. They figured out that if we define a negative label as 0 and the positive label as 1, we would just need to find a simple continuous function whose codomain is (0 , 1). In such a case, if the value returned by the model for input $$\mathbf x$$ is closer to 0, then we assign a negative label to $$\mathbf x$$ ; otherwise, the example is labeled as positive. One function that has such a property is the standard logistic function (also known as the sigmoid function): $$f(x) = \displaystyle \frac{1}{1 + e^{-x}}$$, where $$e$$ is the base of the natural logarithm (also called Euler’s number; $$e^x$$ is also known as the $$exp(x)$$ function in programming languages). Its graph is depicted in Figure 3. The logistic regression model looks like this: $$f_{\mathbf w, b} (x) \stackrel{\textrm{def}}{=} \displaystyle \frac{1}{1 + e^{-(\mathbf w \mathbf x + b)}} \quad (3)$$ You can see the familiar term $$\mathbf w \mathbf x + b$$ from linear regression. By looking at the graph of the standard logistic function, we can see how well it fits our classification purpose: if we optimize the values of $$\mathbf w$$ and $$b$$ appropriately, we could interpret the output of $$f( \mathbf x )$$ as the probability of $$y_i$$ being positive. For example, if it’s higher than or equal to the threshold 0.5 we would say that the class of $$\mathbf x$$ is positive; otherwise, it’s negative. In practice, the choice of the threshold could be different depending on the problem. 
We return to this discussion in Chapter 5 when we talk about model performance assessment. Now, how do we find optimal $$\mathbf w^\ast$$ and $$b^\ast$$? In linear regression, we minimized the empirical risk which was defined as the average squared error loss, also known as the mean squared error or MSE.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Annotation 4789231553804

#MLBook #gradient-descent #log-likelihood #logistic-regression #machine-learning #solution
You may have noticed that we used the product operator $$\prod$$ in the objective function instead of the sum operator $$\sum$$ which was used in linear regression. It’s because the likelihood of observing $$N$$ labels for $$N$$ examples is the product of likelihoods of each observation (assuming that all observations are independent of one another, which is the case). You can draw a parallel with the multiplication of probabilities of outcomes in a series of independent experiments in the probability theory.

status not read

Open it
You may have noticed that we used the product operator $$\prod$$ in the objective function instead of the sum operator $$\sum$$ which was used in linear regression. It’s because the likelihood of observing $$N$$ labels for $$N$$ examples is the product of likelihoods of each observation (assuming that all observations are independent of one another, which is the case). You can draw a parallel with the multiplication of probabilities of outcomes in a series of independent experiments in the probability theory. Because of the $$exp$$ function used in the model, in practice, it’s more convenient to maximize the log-likelihood instead of likelihood. The log-likelihood is defined like follows:

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789235748108

Tags
#MLBook #gradient-descent #log-likelihood #logistic-regression #machine-learning #solution
Question
You may have noticed that we used the product operator $$\prod$$ in the objective function instead of the sum operator $$\sum$$ which was used in linear regression. It’s because [...]
Answer
the likelihood of observing $$N$$ labels for $$N$$ examples is the product of likelihoods of each observation (assuming that all observations are independent of one another, which is the case). You can draw a parallel with the multiplication of probabilities of outcomes in a series of independent experiments in the probability theory.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
You may have noticed that we used the product operator $$\prod$$ in the objective function instead of the sum operator $$\sum$$ which was used in linear regression. It’s because the likelihood of observing $$N$$ labels for $$N$$ examples is the product of likelihoods of each observation (assuming that all observations are independent of one another, which is the case). You can draw a parallel with the multiplication of probabilities of outcomes in a series of independent experiments in the probability theory.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Annotation 4789237583116

#MLBook #gradient-descent #log-likelihood #logistic-regression #machine-learning #solution

Because of the $$exp$$ function used in the model, in practice, it’s more convenient to maximize the log-likelihood instead of the likelihood. The log-likelihood is defined as follows:

$$LogL_{\mathbf w,b} \stackrel{\textrm{def}}{=} \ln(L_{\mathbf w,b}) = \displaystyle \sum_{i=1}^N y_i \ln f_{\mathbf w,b} (\mathbf x_i) + (1 - y_i ) \ln (1 - f_{\mathbf w,b} (\mathbf x_i)).$$

Because $$\ln$$ is a strictly increasing function, maximizing this function is the same as maximizing its argument, and the solution to this new optimization problem is the same as the solution to the original problem.

Contrary to linear regression, there’s no closed form solution to the above optimization problem. A typical numerical optimization procedure used in such cases is gradient descent. We talk about it in the next chapter.
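A minimal gradient-ascent sketch for this objective (not from the book; the 1-D dataset, learning rate, and step count are invented). For the logistic model the gradient takes the well-known form $$\partial LogL / \partial w = \sum_i (y_i - f(x_i)) x_i$$ and $$\partial LogL / \partial b = \sum_i (y_i - f(x_i))$$.

```python
# Maximizing the log-likelihood by plain gradient ascent: there is no
# closed-form solution, so we follow the gradient uphill step by step.
import math

def f(x, w, b):                      # logistic model, 1-D features
    return 1.0 / (1.0 + math.exp(-(w * x + b)))

def log_likelihood(data, w, b):
    return sum(y * math.log(f(x, w, b)) + (1 - y) * math.log(1 - f(x, w, b))
               for x, y in data)

data = [(-2.0, 0), (-1.0, 0), (1.0, 1), (2.0, 1)]
w, b, lr = 0.0, 0.0, 0.1             # start at zero; lr is the step size

before = log_likelihood(data, w, b)
for _ in range(300):                 # ascent: move *along* the gradient
    gw = sum((y - f(x, w, b)) * x for x, y in data)
    gb = sum((y - f(x, w, b)) for x, y in data)
    w, b = w + lr * gw, b + lr * gb
after = log_likelihood(data, w, b)

print(before, after)                 # the log-likelihood strictly increased
```

The sign in the update is the only difference from the gradient *descent* used for minimization problems; Chapter 4 covers the descent form in detail.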

status not read

#### Parent (intermediate) annotation

Open it
re independent of one another, which is the case). You can draw a parallel with the multiplication of probabilities of outcomes in a series of independent experiments in the probability theory. <span>Because of the $$exp$$ function used in the model, in practice, it’s more convenient to maximize the log-likelihood instead of likelihood. The log-likelihood is defined like follows: $$LogL_{\mathbf w,b} \stackrel{\textrm{def}}{=} \ln(L_{\mathbf w,b} (\mathbf x)) = \displaystyle \sum_{i=1}^N y_i \ln f_{\mathbf w,b} (\mathbf x) + (1 −y_i ) \ln (1 − f_{\mathbf w,b} (\mathbf x)).$$ Because $$\ln$$ is a strictly increasing function, maximizing this function is the same as maximizing its argument, and the solution to this new optimization problem is the same as the solution to the original problem. Contrary to linear regression, there’s no closed form solution to the above optimization problem. A typical numerical optimization procedure used in such cases is gradient descent. We talk about it in the next chapter. <span>

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789239155980

Tags
#Kaliémie #Médecine #Physiologie
Question
A fall in pH is much [...] likely to raise the plasma potassium concentration in patients with lactic acidosis or ketoacidosis [7,8].
Answer

### less

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
A fall in pH is much less likely to raise the plasma potassium concentration in patients with lactic acidosis or ketoacidosis [7,8 ]. The hyperkalemia that is commonly seen in diabetic ketoacidosis (DKA), for ex

#### Original toplevel document

UpToDate
uced [5,6]. There is still a relative increase in the plasma potassium concentration, however, as evidenced by a further fall in the plasma potassium concentration if the acidemia is corrected. <span>A fall in pH is much less likely to raise the plasma potassium concentration in patients with lactic acidosis or ketoacidosis [7,8]. The hyperkalemia that is commonly seen in diabetic ketoacidosis (DKA), for example, is more closely related to the insulin deficiency and hyperosmolality than to the degree of acidemia. (See "Diabetic ketoacidosis and hyperosmolar hyperglycemic state in adults: Clinical features, evaluation, and diagnosis".) Why this occurs is not well understood. Two factors that may

#### Flashcard 4789241515276

Tags
#Kaliémie #Médecine #Physiologie
Question

A fall in pH is much less likely to raise the plasma potassium concentration in patients with lactic acidosis or ketoacidosis [7,8]. The hyperkalemia that is commonly seen in diabetic ketoacidosis (DKA), for example, is more closely related to the [...] than to the degree of acidemia.

Answer

### insulin deficiency and hyperosmolality

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
tassium concentration in patients with lactic acidosis or ketoacidosis [7,8 ]. The hyperkalemia that is commonly seen in diabetic ketoacidosis (DKA), for example, is more closely related to the <span>insulin deficiency and hyperosmolality than to the degree of acidemia. <span>

#### Original toplevel document

UpToDate
uced [5,6]. There is still a relative increase in the plasma potassium concentration, however, as evidenced by a further fall in the plasma potassium concentration if the acidemia is corrected. <span>A fall in pH is much less likely to raise the plasma potassium concentration in patients with lactic acidosis or ketoacidosis [7,8]. The hyperkalemia that is commonly seen in diabetic ketoacidosis (DKA), for example, is more closely related to the insulin deficiency and hyperosmolality than to the degree of acidemia. (See "Diabetic ketoacidosis and hyperosmolar hyperglycemic state in adults: Clinical features, evaluation, and diagnosis".) Why this occurs is not well understood. Two factors that may

#### Flashcard 4789243874572

Tags
#MLBook #linear-regression #machine-learning #problem-statement
Question
State the problem of linear regression.
Answer

We have a collection of labeled examples $$\{ ( \mathbf x_i , y_i ) \}^N_{i=1}$$ , where $$N$$ is the size of the collection, $$\mathbf x_i$$ is the $$D$$-dimensional feature vector of example $$i = 1 , . . . , N$$ , $$y_i$$ is a real-valued target and every feature $$x^{(j)}_i , j = 1, \ldots , D$$, is also a real number. We want to build a model $$f_{\mathbf w,b} (\mathbf x)$$ as a linear combination of features of example $$\mathbf x$$:

$$f_{\mathbf w,b} (\mathbf x) = \mathbf w \mathbf x + b$$,

where $$\mathbf w$$ is a $$D$$-dimensional vector of parameters and $$b$$ is a real number. The notation $$f_{\mathbf w,b} (\mathbf x)$$ means that the model $$f$$ is parametrized by two values: $$\mathbf w$$ and $$b$$.

We will use the model to predict the unknown $$y$$ for a given $$\mathbf x$$ like this: $$y \leftarrow f_{\mathbf w,b} ( x )$$. Two models parametrized by two different pairs $$( \mathbf w, b )$$ will likely produce two different predictions when applied to the same example. We want to find the optimal values $$( \mathbf w^\ast, b^\ast )$$. Obviously, the optimal values of parameters define the model that makes the most accurate predictions.
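The prediction rule above can be sketched in a few lines of NumPy; the weights, bias, and feature vector below are made up for illustration, not learned by any algorithm:

```python
import numpy as np

# Hypothetical parameters for a model f_{w,b}(x) = w·x + b with D = 3.
# These values are illustrative only; a learning algorithm would fit them.
def predict(w: np.ndarray, b: float, x: np.ndarray) -> float:
    """Linear combination of the D features of example x, plus the bias b."""
    return float(np.dot(w, x) + b)

w = np.array([2.0, -1.0, 0.5])     # D-dimensional parameter vector
b = 0.25                           # scalar bias
x_new = np.array([1.0, 2.0, 4.0])  # one unlabeled D-dimensional example

y_hat = predict(w, b, x_new)  # 2*1 - 1*2 + 0.5*4 + 0.25 = 2.25
```

Two different choices of $$( \mathbf w, b )$$ would give two different `y_hat` values for the same `x_new`, which is exactly why the optimal pair $$( \mathbf w^\ast, b^\ast )$$ has to be searched for.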

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
We have a collection of labeled examples $$\{ ( \mathbf x_i , y_i ) \}^N_{i=1}$$ , where $$N$$ is the size of the collection, $$\mathbf x_i$$ is the $$D$$-dimensional feature vector of example $$i = 1 , . . . , N$$ , $$y_i$$ is a real-valued target and every feature $$x^{(j)}_i , j = 1, \ldots , D$$, is also a real number. We want to build a model $$f_{\mathbf w,b} (\mathbf x)$$ as a linear combination of features of example $$\mathbf x$$: $$f_{\mathbf w,b} (\mathbf x) = \mathbf w \mathbf x + b$$, where $$\mathbf w$$ is a $$D$$-dimensional vector of parameters and $$b$$ is a real number. The notation $$f_{\mathbf w,b} (\mathbf x)$$ means that the model $$f$$ is parametrized by two values: $$\mathbf w$$ and $$\mathbf b$$. We will use the model to predict the unknown $$y$$ for a given $$\mathbf x$$ like this: $$y \leftarrow f_{\mathbf w,b} ( x )$$. Two models parametrized by two different pairs $$( \mathbf w, b )$$ will likely produce two different predictions when applied to the same example. We want to find the optimal values $$( \mathbf w^\ast, b^\ast )$$. Obviously, the optimal values of parameters define the model that makes the most accurate predictions.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789246233868

Tags
#Kaliémie #Médecine #Physiologie
Question

#### In several organic acidoses, the acid anion is excreted in the urine with [...] or [...] as the accompanying cation.

Hypokalemia may result despite the concurrent shift of potassium out of cells in response to acidemia.

Answer

### sodium or potassium

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
In several organic acidoses, the acid anion is excreted in the urine with sodium or potassium as the accompanying cation. Hypokalemia may result despite the concurrent shift of potassium out of cells in response to acidemia. The metabolic acidosis caused by glue sni

#### Original toplevel document

UpToDate
potassium. The net result is a normal anion gap metabolic acidosis with potassium depletion and hypokalemia. (See "Causes of hypokalemia in adults", section on 'Lower gastrointestinal losses'.) <span>In several organic acidoses, the acid anion is excreted in the urine with sodium or potassium as the accompanying cation. Hypokalemia may result despite the concurrent shift of potassium out of cells in response to acidemia. The metabolic acidosis caused by glue sniffing is the most dramatic example of this phenomenon. Inhaled toluene is metabolized to hippuric acid, and the acid anion (hippurate) is eliminated in the urine by both filtration and secretion, commonly resulting in hypokalemia [22]. (See "The delta anion gap/delta HCO3 ratio in patients with a high anion gap metabolic acidosis".) Renal potassium wasting also occurs in diabetic ketoacidosis (DKA) and occasionally may lead to hypokalemia (6 percent of patients with DKA in one study) [23]. However, in contrast to t

#### Flashcard 4789248593164

Tags
#Kaliémie #Médecine #Physiologie
Question
What is the most dramatic example of a co-excretion induced ion loss occurring in the setting of a metabolic acidosis ?
Answer

### glue sniffing

Inhaled toluene is metabolized to hippuric acid, and the acid anion (hippurate) is eliminated in the urine by both filtration and secretion, commonly resulting in hypokalemia [22 ].

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
rine with sodium or potassium as the accompanying cation. Hypokalemia may result despite the concurrent shift of potassium out of cells in response to acidemia. The metabolic acidosis caused by <span>glue sniffing is the most dramatic example of this phenomenon. Inhaled toluene is metabolized to hippuric acid, and the acid anion (hippurate) is eliminated in the urine by both filtration and secret

#### Original toplevel document

UpToDate
potassium. The net result is a normal anion gap metabolic acidosis with potassium depletion and hypokalemia. (See "Causes of hypokalemia in adults", section on 'Lower gastrointestinal losses'.) <span>In several organic acidoses, the acid anion is excreted in the urine with sodium or potassium as the accompanying cation. Hypokalemia may result despite the concurrent shift of potassium out of cells in response to acidemia. The metabolic acidosis caused by glue sniffing is the most dramatic example of this phenomenon. Inhaled toluene is metabolized to hippuric acid, and the acid anion (hippurate) is eliminated in the urine by both filtration and secretion, commonly resulting in hypokalemia [22]. (See "The delta anion gap/delta HCO3 ratio in patients with a high anion gap metabolic acidosis".) Renal potassium wasting also occurs in diabetic ketoacidosis (DKA) and occasionally may lead to hypokalemia (6 percent of patients with DKA in one study) [23]. However, in contrast to t

#### Flashcard 4789254098188

[unknown IMAGE 4769622658316]
Tags
#MLBook #SVM #has-images #linear-regression #machine-learning
Question
Compare the SVM and linear regression models.
Answer

You may have noticed that the form of our linear model in eq. 1 $$\left[ f_{\mathbf w,b} (\mathbf x) = \mathbf w \mathbf x + b \right]$$ is very similar to the form of the SVM model. The only difference is the missing sign operator. The two models are indeed similar. However, the hyperplane in the SVM plays the role of the decision boundary: it’s used to separate two groups of examples from one another. As such, it has to be as far from each group as possible.

On the other hand, the hyperplane in linear regression is chosen to be as close to all training examples as possible.

You can see why this latter requirement is essential by looking at the illustration in Figure 1. It displays the regression line (in red) for one-dimensional examples (blue dots). We can use this line to predict the value of the target $$y_{new}$$ for a new unlabeled input example $$x_{new}$$. If our examples are $$D$$-dimensional feature vectors (for $$D > 1$$), the only difference with the one-dimensional case is that the regression model is not a line but a plane (for two dimensions) or a hyperplane (for $$D > 2$$).
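The contrast between the two model forms can be sketched directly; the parameters below are arbitrary, chosen only to show the shared affine core and the extra sign operator:

```python
import numpy as np

# Both models share the affine form w·x + b; only the output differs.
def affine(w: np.ndarray, b: float, x: np.ndarray) -> float:
    return float(np.dot(w, x) + b)

def svm_predict(w, b, x):
    # classification: report which side of the boundary w·x + b = 0 we are on
    return np.sign(affine(w, b, x))

def linreg_predict(w, b, x):
    # regression: the hyperplane itself supplies the real-valued prediction
    return affine(w, b, x)

w, b = np.array([1.0, -2.0]), 0.5  # arbitrary illustrative parameters
x = np.array([3.0, 1.0])

reg_value = linreg_predict(w, b, x)  # 3 - 2 + 0.5 = 1.5
svm_label = svm_predict(w, b, x)     # sign(1.5) = 1.0
```

The sign operator throws away the magnitude and keeps only the side of the hyperplane, which is why the SVM boundary is placed for separation while the regression hyperplane is placed for closeness.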

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
You may have noticed that the form of our linear model in eq. 1 $$\left[ f_{\mathbf w,b} (\mathbf x) = \mathbf w \mathbf x + b \right]$$ is very similar to the form of the SVM model. The only difference is the missing sign operator. The two models are indeed similar. However, the hyperplane in the SVM plays the role of the decision boundary: it’s used to separate two groups of examples from one another. As such, it has to be as far from each group as possible. On the other hand, the hyperplane in linear regression is chosen to be as close to all training examples as possible. You can see why this latter requirement is essential by looking at the illustration in Figure 1. It displays the regression line (in red) for one-dimensional examples (blue dots). We can use this line to predict the value of the target $$y_{new}$$ for a new unlabeled input example $$x_{new}$$. If our examples are $$D$$-dimensional feature vectors (for $$D > 1$$), the only difference with the one-dimensional case is that the regression model is not a line but a plane (for two dimensions) or a hyperplane (for $$D > 2$$).

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789258554636

Tags
#MLBook #cost-function #empirical-risk #linear-regression #loss-function #machine-learning #solution #squared-error-loss
Question
Describe the optimization procedure to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ in the linear regression model.
Answer

The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression:

$$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$

In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data.

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression: $$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$ In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data.

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789260913932

Tags
#MLBook #cost-function #empirical-risk #linear-regression #loss-function #machine-learning #solution #squared-error-loss
Question

The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression:

$$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$

In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the [...]. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data.

Answer
loss function

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the <span>loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called squared error loss . All model-based learning algorithms have a

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789262748940

Tags
#MLBook #cost-function #empirical-risk #linear-regression #loss-function #machine-learning #solution #squared-error-loss
Question

The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression:

$$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$

In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the [...]. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data.

Answer
cost function

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
the loss function is called squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the <span>cost function. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789264321804

Tags
#MLBook #cost-function #empirical-risk #linear-regression #loss-function #machine-learning #solution #squared-error-loss
Question

The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression:

$$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$

In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called [...] . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, the cost function is given by the average loss, also called the empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data.

Answer
squared error loss

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
_i ) - y_i\right)^2\) in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called <span>squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, th

#### Original toplevel document (pdf)

cannot see any pdfs

#### Flashcard 4789265894668

Tags
#MLBook #cost-function #empirical-risk #linear-regression #loss-function #machine-learning #solution #squared-error-loss
Question

The optimization procedure which we use to find the optimal values for $$\mathbf w^\ast$$ and $$b^\ast$$ tries to minimize the following expression:

$$\displaystyle \frac{1}{N} \displaystyle \sum_{i = 1, \ldots N} \left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2. \quad (2)$$

In mathematics, the expression we minimize or maximize is called an objective function, or, simply, an objective. The expression $$\left( f_{\mathbf w, b} ( \mathbf x_i ) - y_i\right)^2$$ in the above objective is called the loss function. It’s a measure of penalty for misclassification of example $$i$$. This particular choice of the loss function is called squared error loss . All model-based learning algorithms have a loss function and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, the cost function is given by the average loss, also called the [...]. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data.

Answer
empirical risk

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
ction and what we do to find the best model is we try to minimize the objective known as the cost function. In linear regression, the cost function is given by the average loss, also called the <span>empirical risk. The average loss, or empirical risk, for a model, is the average of all penalties obtained by applying the model to the training data. <span>

#### Original toplevel document (pdf)

cannot see any pdfs

#### Annotation 4789267467532

[unknown IMAGE 4789270351116]
#MLBook #has-images #linear-regression #machine-learning #overfitting
One practical justification of the choice of the linear form for the model is that it’s simple. Why use a complex model when you can use a simple one? Another consideration is that linear models rarely overfit. Overfitting is the property of a model such that the model predicts very well the labels of the examples used during training but frequently makes errors when applied to examples that weren’t seen by the learning algorithm during training. An example of overfitting in regression is shown in Figure 2. The data used to build the red regression line is the same as in Figure 1. The difference is that this time it is polynomial regression with a polynomial of degree 10. The regression line predicts almost perfectly the targets of almost all training examples, but will likely make significant errors on new data, as you can see in Figure 1 for $$x_{new}$$. We talk more about overfitting and how to avoid it in Chapter 5.

status not read

#### pdf

cannot see any pdfs

#### Flashcard 4789274545420

[unknown IMAGE 4789270351116]
Tags
#MLBook #has-images #linear-regression #machine-learning #overfitting
Question
Discuss about overfitting in linear regression.
Answer
One practical justification of the choice of the linear form for the model is that it’s simple. Why use a complex model when you can use a simple one? Another consideration is that linear models rarely overfit. Overfitting is the property of a model such that the model predicts very well the labels of the examples used during training but frequently makes errors when applied to examples that weren’t seen by the learning algorithm during training. An example of overfitting in regression is shown in Figure 2. The data used to build the red regression line is the same as in Figure 1. The difference is that this time it is polynomial regression with a polynomial of degree 10. The regression line predicts almost perfectly the targets of almost all training examples, but will likely make significant errors on new data, as you can see in Figure 1 for $$x_{new}$$. We talk more about overfitting and how to avoid it in Chapter 5.
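The Figure 2 contrast can be reproduced on synthetic data; the data below is made up, and NumPy's `Polynomial.fit` stands in for whatever fitting procedure the book uses:

```python
import numpy as np
from numpy.polynomial import Polynomial

# Roughly linear targets with a little noise (illustrative data only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 12)
y = 3.0 * x + rng.normal(scale=0.3, size=x.size)

line = Polynomial.fit(x, y, deg=1)     # simple linear model
wiggle = Polynomial.fit(x, y, deg=10)  # degree-10 polynomial regression

def train_mse(model) -> float:
    """Average squared error of the model on the training points."""
    return float(np.mean((model(x) - y) ** 2))

# The degree-10 fit hugs the training points (lower training error) but
# oscillates between and beyond them, so it generalizes poorly.
```

The degree-10 basis contains the degree-1 basis, so its least-squares training error can never be worse; the damage only shows up on points the fit has not seen.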

status measured difficulty not learned 37% [default] 0

#### Parent (intermediate) annotation

Open it
One practical justification of the choice of the linear form for the model is that it’s simple. Why use a complex model when you can use a simple one? Another consideration is that linear models rarely overfit. Overfitting is the property of a model such that the model predicts very well the labels of the examples used during training but frequently makes errors when applied to examples that weren’t seen by the learning algorithm during training. An example of overfitting in regression is shown in Figure 2. The data used to build the red regression line is the same as in Figure 1. The difference is that this time it is polynomial regression with a polynomial of degree 10. The regression line predicts almost perfectly the targets of almost all training examples, but will likely make significant errors on new data, as you can see in Figure 1 for $$x_{new}$$. We talk more about overfitting and how to avoid it in Chapter 5.

#### Original toplevel document (pdf)

cannot see any pdfs