Edited, memorised or added to reading queue on 04-Jun-2024 (Tue)


#deep-learning #keras #lstm #python #sequence

1.4.2 LSTM Gates

The key to the memory cell is its gates. These, too, are weighted functions that further govern the flow of information in the cell.

There are three gates:
Forget Gate:
Decides what information to discard from the cell.

Input Gate:
Decides which values from the input are used to update the memory state.

Output Gate:
Decides what to output based on input and the memory of the cell.

The forget gate and input gate are used to update the internal state. The output gate is a final limiter on what the cell actually outputs.

It is these gates, together with a consistent data flow called the constant error carrousel (CEC), that keep each cell stable (neither exploding nor vanishing): "Each memory cell's internal architecture guarantees constant error flow within its constant error carrousel CEC... This represents the basis for bridging very long time lags. Two gate units learn to open and close access to error flow within each memory cell's CEC. The multiplicative input gate affords protection of the CEC from perturbation by irrelevant inputs. Likewise, the multiplicative output gate protects other units from perturbation by currently irrelevant memory contents."
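To make the gate mechanics concrete, here is a minimal sketch of a single LSTM step in plain NumPy. The function name lstm_step, the helper sigmoid, and the randomly initialised weight matrices are illustrative assumptions (biases are omitted for brevity); Keras's LSTM layer implements the same logic internally.

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, Wf, Wi, Wo, Wc):
    """One LSTM time step: the gates decide what to forget, store, and output."""
    z = np.concatenate([h_prev, x_t])      # previous output and current input
    f = sigmoid(Wf @ z)                    # forget gate: what to discard from the cell state
    i = sigmoid(Wi @ z)                    # input gate: which candidate values to write
    o = sigmoid(Wo @ z)                    # output gate: what the cell exposes
    c_tilde = np.tanh(Wc @ z)              # candidate memory content
    c_t = f * c_prev + i * c_tilde         # forget and input gates update the internal state
    h_t = o * np.tanh(c_t)                 # output gate limits what is actually emitted
    return h_t, c_t

rng = np.random.default_rng(0)
n_in, n_hidden = 3, 4
Wf, Wi, Wo, Wc = (rng.normal(size=(n_hidden, n_hidden + n_in)) for _ in range(4))
h, c = np.zeros(n_hidden), np.zeros(n_hidden)
h, c = lstm_step(rng.normal(size=n_in), h, c, Wf, Wi, Wo, Wc)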





#feature-engineering #lstm #recurrent-neural-networks #rnn
The effect of a direct mailing does not end after the campaign is over and the customer has made her decision to respond or not. An advertising campaign or customer retention program can impact customers' behaviors for several weeks, even months. Customers tend to remember past events, at least partially. Hence, the effects of marketing actions tend to carry over into numerous subsequent periods (Lilien, Rangaswamy, & De Bruyn, 2013; Schweidel & Knox, 2013; Van Diepen et al., 2009).




Flashcard 7628977999116

Tags
#ML_in_Action #learning #machine #software-engineering
Question
Part 3 (chapters 14–16) focuses on “the after”: specifically, considerations related to streamlining production release, retraining, monitoring, and attribution for a project. With examples focused on A/B testing, [...], and a passive retraining system, you’ll be shown how to implement systems and architectures that can ensure that you’re building the minimally complex solution to solve a business problem with ML
Answer
feature stores








Flashcard 7629338971404

Tags
#has-images
Question

Multiple assertions in one unit test

Test will pass only if [...].

Answer
both assertions pass
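A minimal illustration, assuming pytest as the test runner (the add function and the test name below are made up for the example): if either assertion fails, the whole test fails.

def add(a, b):
    return a + b

def test_add_multiple_assertions():
    # Both assertions must pass for the test to pass;
    # if the first fails, the second is never even evaluated.
    assert add(2, 2) == 4
    assert add(-1, 1) == 0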








Flashcard 7629342117132

Tags
#causality #statistics
Question
Exchangeability means that the treatment groups are exchangeable in the sense that if they were [...], the new treatment group would observe the same outcomes as the old treatment group
Answer
swapped








#feature-engineering #lstm #recurrent-neural-networks #rnn
Many alternative model specifications and network architectures offer the promise of improvements over vanilla LSTM models, and have already been proven superior in some domains. Such alternative specifications include Gated Recurrent Units, BiLSTM (Siami-Namini, Tavakoli, & Namin, 2019), Multi-Dimensional LSTM (Graves & Schmidhuber, 2009), Neural Turing Machines (Graves, Wayne, & Danihelka, 2014), Attention-Based RNNs and their various implementations (e.g., Bahdanau, Cho, & Bengio, 2014; Luong, Pham, & Manning, 2015), and Transformers (Vaswani et al., 2017).




Flashcard 7629345787148

Tags
#deep-learning #keras #lstm #python #sequence
Question
The choice of activation function is most important for the [...] layer as it will define the format that predictions will take.
Answer
output
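A brief sketch of how the output-layer activation fixes the prediction format, using the Keras Sequential API (the layer sizes and input shape are arbitrary placeholders, not taken from the book):

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Input, LSTM, Dense

model = Sequential([
    Input(shape=(10, 1)),                 # 10 time steps, 1 feature (illustrative)
    LSTM(32),                             # hidden recurrent layer
    Dense(1, activation='sigmoid'),       # output layer: sigmoid yields a probability for binary outcomes
])
# Dense(n_classes, activation='softmax') would give a distribution over classes;
# Dense(1, activation='linear') a real-valued regression output.
model.compile(loss='binary_crossentropy', optimizer='adam')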








#abm #agent-based #machine-learning #model #priority #synergistic-integration
Semisupervised learning falls between supervised and unsupervised learning. It is an approach that combines a small amount of labeled data with a large amount of unlabeled data during training, and is useful when the cost of labeling would render large, fully labeled training sets infeasible, whereas the acquisition of unlabeled data is relatively inexpensive.
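As a small sketch of the idea (the library choice of scikit-learn's SelfTrainingClassifier and the toy data are illustrative assumptions, not part of the annotated text), unlabeled samples are marked with -1 and the base classifier is retrained as confident pseudo-labels are added:

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

X = np.random.rand(100, 5)       # toy features for 100 samples
y = np.full(100, -1)             # -1 marks unlabeled samples
y[:5], y[5:10] = 0, 1            # only ten samples carry labels

model = SelfTrainingClassifier(LogisticRegression())
model.fit(X, y)                  # labeled and unlabeled data are used together
print(model.predict(X[:5]))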




Flashcard 7629349981452

Tags
#feature-engineering #has-images #lstm #recurrent-neural-networks #rnn
Question
Fig. 2. Classic feedforward neural network (A), [...] neural network (B), and “unrolled” graphical representation of a recurrent neural network (C), where we use sequence data (x1, x2, x3) to make sequence predictions (y1, y2, y3) while preserving information through the hidden states h1, h2, h3
Answer
recurrent
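A minimal NumPy sketch of the recursion the unrolled diagram depicts (the function name rnn_forward and the random weight matrices are illustrative assumptions): each hidden state carries information from earlier steps forward.

import numpy as np

def rnn_forward(xs, W_xh, W_hh, W_hy):
    """Run a simple RNN over a sequence, returning one prediction per step."""
    h = np.zeros(W_hh.shape[0])
    ys = []
    for x_t in xs:                           # x1, x2, x3, ...
        h = np.tanh(W_xh @ x_t + W_hh @ h)   # hidden state h_t depends on x_t and h_{t-1}
        ys.append(W_hy @ h)                  # prediction y_t at each step
    return ys

rng = np.random.default_rng(0)
W_xh, W_hh, W_hy = rng.normal(size=(4, 2)), rng.normal(size=(4, 4)), rng.normal(size=(1, 4))
xs = rng.normal(size=(3, 2))                 # a sequence x1, x2, x3 of 2-dimensional inputs
print(rnn_forward(xs, W_xh, W_hh, W_hy))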








Flashcard 7629351816460

Tags
#bayes #programming #r #statistics
Question
One way to summarize the uncertainty is by marking the [...] that are most credible and cover 95% of the distribution. This is called the highest density interval (HDI) and is marked by the black bar on the floor of the distribution in Figure 2.5.
Answer
span of values
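As a rough illustration (in Python rather than the book's R, and with made-up posterior samples), a 95% HDI can be approximated from samples by taking the narrowest interval that contains 95% of them:

import numpy as np

def hdi(samples, cred_mass=0.95):
    """Approximate the highest density interval from a 1-D array of samples."""
    s = np.sort(samples)
    n = len(s)
    k = int(np.ceil(cred_mass * n))          # number of samples the interval must cover
    widths = s[k - 1:] - s[:n - k + 1]       # width of every candidate interval of k samples
    best = np.argmin(widths)                 # the narrowest one has the highest density
    return s[best], s[best + k - 1]

samples = np.random.beta(2, 5, size=10_000)  # toy posterior samples
print(hdi(samples))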








Flashcard 7629353913612

Tags
#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
Question
Vector-based machine learning methods like logistic regression take vectors f = (f1, ..., fn) of fixed length n as inputs. Applying these methods to consumer histories of arbitrary length requires [...]
Answer
feature engineering
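As a toy sketch of such feature engineering (the helper history_to_features, the aggregates chosen, and the data are all made up for illustration), a variable-length purchase history is collapsed into a fixed-length vector before logistic regression can be applied:

import numpy as np
from sklearn.linear_model import LogisticRegression

def history_to_features(history):
    """Collapse a variable-length list of purchase amounts into a fixed-length feature vector."""
    h = np.asarray(history, dtype=float)
    return np.array([len(h), h.sum(), h.mean(), h[-1]])     # count, total, average, most recent

histories = [[5.0, 12.5], [3.0, 3.0, 7.5, 1.0], [20.0]]     # customer histories of arbitrary length
labels = [1, 0, 1]                                          # toy response labels
X = np.vstack([history_to_features(h) for h in histories])  # now fixed length n = 4 for every customer
LogisticRegression().fit(X, labels)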








Flashcard 7629356010764

Tags
#causality #statistics
Question
To identify a causal effect is to reduce a causal expression to a purely statistical expression. In this chapter, that means to reduce an expression from one that uses [...] notation to one that uses only statistical notation such as 𝑇 , 𝑋 , 𝑌 , expectations, and conditioning. This means that we can calculate the causal effect from just the observational distribution 𝑃(𝑋, 𝑇, 𝑌)
Answer
potential outcome








#English #vocabulary

acronym

noun [ C ]

UK /ˈækrəʊnɪm/ US

a word made from the first letters of other words

AIDS is the acronym for 'acquired immune deficiency syndrome'.





Flashcard 7629359942924

Tags
#Docker
Question

Docker: look at the log of an exited container (with [...])

docker logs -t [CONTAINER NAME]
Answer
timestamps








#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
In e-commerce, available data sources and prediction scenarios often change, making the generality of RNNs appealing as no problem-specific feature engineering has to take place.




Flashcard 7629443829004

Tags
#deep-learning #keras #lstm #python #sequence
Question
How to Convert Categorical Data to Numerical Data

This involves two steps: 1. [...] Encoding. 2. One Hot Encoding.
Answer
Integer
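A short sketch of the two steps using scikit-learn and Keras utilities (the toy labels are invented, and to_categorical is assumed to be available via tensorflow.keras):

import numpy as np
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.utils import to_categorical

labels = np.array(['cold', 'warm', 'hot', 'warm', 'cold'])   # toy categorical data

# Step 1: integer encoding maps each category to an integer.
integer_encoded = LabelEncoder().fit_transform(labels)       # e.g. [0, 2, 1, 2, 0]

# Step 2: one hot encoding turns each integer into a binary vector.
one_hot = to_categorical(integer_encoded)
print(one_hot)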








Flashcard 7629446974732

Tags
#feature-engineering #lstm #recurrent-neural-networks #rnn
Question
Because of their typically high dimensionality, the [...] of RNN models are usually more potent than those of hidden Markov models (e.g., Netzer, Lattin, & Srinivasan, 2008), which are commonly used in marketing to capture customer dynamics.
Answer
hidden states








Flashcard 7629448809740

Tags
#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
Question
In the future, predictions on the level of products and individual [...] will be in our focus, enabling sophisticated recommendation products. This will require richer input descriptions at individual time-steps. Likewise, more sophisticated RNN architectures will be promising for future research
Answer
tastes








Flashcard 7629450120460

Tags
#English #vocabulary
Question

[...]

/ˈɑːdjʊəs,ˈɑːdʒʊəs/


adjective

  1. involving or requiring strenuous effort; difficult and tiring

Answer
arduous








#DAG #causal #edx
For example, suppose L is fetal death. We don't know the true causal DAG, so we propose seven causal DAGs. Suppose that L does not help block a backdoor path in any of the seven DAGs; then we will not adjust for L, even if L were strongly associated with A and Y.