Edited, memorised or added to reading queue on 01-Jun-2024 (Sat)


#feature-engineering #lstm #recurrent-neural-networks #rnn

The learning mechanism of the recurrent neural network thus involves:

(1) the forward propagation step where the cross-entropy loss is calculated;

(2) the backpropagation step where the gradient of the parameters with respect to the loss is calculated; and finally,

(3) the optimization algorithm, which changes the parameters of the RNN based on the gradient.
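As a rough illustration of these three steps (not the paper's actual model; the data shapes and layer sizes below are made-up), here is a minimal Keras sketch in which model.fit runs, for every batch, the forward pass that computes the cross-entropy loss, the backpropagation that computes the parameter gradients, and the optimizer update:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Toy data: 100 sequences of length 10 with 8 features, 3 output classes (illustrative shapes)
X = np.random.rand(100, 10, 8).astype("float32")
y = keras.utils.to_categorical(np.random.randint(0, 3, size=100), num_classes=3)

model = keras.Sequential([
    layers.SimpleRNN(16, input_shape=(10, 8)),  # recurrent layer over the sequence
    layers.Dense(3, activation="softmax"),      # class probabilities
])

# (1) forward pass computes the cross-entropy loss,
# (2) backpropagation computes the gradients of the parameters w.r.t. the loss,
# (3) the optimizer (here Adam) updates the RNN parameters from those gradients.
model.compile(loss="categorical_crossentropy", optimizer="adam")
model.fit(X, y, epochs=2, batch_size=16, verbose=0)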





3.1.3 Practical Considerations When Scaling
#deep-learning #keras #lstm #python #sequence


There are some practical considerations when scaling sequence data.

Estimate Coefficients
You can estimate coefficients (min and max values for normalization or mean and standard deviation for standardization) from the training data. Inspect these first-cut estimates and use domain knowledge or domain experts to help improve these estimates so that they will be usefully correct on all data in the future.

Save Coefficients
You will need to scale new data in the future in exactly the same way as the data used to train your model. Save the coefficients used to file and load them later when you need to scale new data when making predictions.

Data Analysis
Use data analysis to help you better understand your data. For example, a simple histogram can help you quickly get a feeling for the distribution of quantities to see if standardization would make sense.

Scale Each Series
If your problem has multiple series, treat each as a separate variable and in turn scale them separately. Here, scale refers to a choice of scaling procedure such as normalization or standardization.

Scale At The Right Time
It is important to apply any scaling transforms at the right time. For example, if you have a series of quantities that is non-stationary, it may be appropriate to scale after first making your data stationary. It would not be appropriate to scale the series after it has been transformed into a supervised learning problem as each column would be handled differently, which would be incorrect.

Scale if in Doubt
You probably do need to rescale your input and output variables. If in doubt, at least normalize your data.





Flashcard 7627469360396

Tags
#tensorflow #tensorflow-certificate
Question

Preprocessing data

ct = make_column_transformer(([...](dtype="int32"), ['Sex']), remainder="passthrough") #other columns unchanged
ct.fit(X_train) 
X_train_transformed = ct.transform(X_train)
X_test_transformed = ct.transform(X_test)
Answer
OneHotEncoder


Parent (intermediate) annotation

Preprocessing data ct = make_column_transformer((OneHotEncoder(dtype="int32"), ['Sex']), remainder="passthrough") #other columns unchanged ct.fit(X_train) X_train_transformed = ct.transform(X_train) X_test_transformed = ct.transform(X_test)

Original toplevel document

TfC_01_ADDITIONAL_01_Abalone.ipynb
Preprocessing data ct = make_column_transformer((OneHotEncoder(dtype="int32"), ['Sex']), remainder="passthrough") #other columns unchanged ct.fit(X_train) X_train_transformed = ct.transform(X_train) X_test_transformed = ct.transform(X_test) Predictions valuation_predicts = model.predict(X_valuation_transformed) (array([[ 9.441547], [10.451973], [10.48082 ], ..., [10.401164], [13.13452 ], [ 8.081818]], dtype=float32), (6041
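For context, a self-contained sketch of the same preprocessing pattern (the toy DataFrame below is an assumption standing in for the Abalone data; only the 'Sex' column mirrors the card):

import pandas as pd
from sklearn.compose import make_column_transformer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical training and test frames with one categorical and one numeric column
X_train = pd.DataFrame({"Sex": ["M", "F", "I", "M"], "Length": [0.45, 0.35, 0.53, 0.44]})
X_test = pd.DataFrame({"Sex": ["F", "I"], "Length": [0.33, 0.47]})

# One-hot encode 'Sex' as int32 columns; pass the remaining columns through unchanged
ct = make_column_transformer(
    (OneHotEncoder(dtype="int32"), ["Sex"]),
    remainder="passthrough",
)
ct.fit(X_train)                              # learn the categories from the training set only
X_train_transformed = ct.transform(X_train)  # shape (4, 4): three one-hot columns + Length
X_test_transformed = ct.transform(X_test)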







Flashcard 7628313726220

Tags
#DAG #causal #edx #has-images #inference
Question
As you may have already noticed, the case-control design selects individuals based on their outcome. Women who did develop cancer are [...] to be included in the study than women who did not develop cancer. Therefore, our causal graph will include a node for selection -- C -- an arrow from the outcome Y to C, and a box around C to indicate that the analysis is conditional on having been selected into the study, which means that we are only one arrow away from selection bias.
Answer
much more likely


Parent (intermediate) annotation

As you may have already noticed, the case-control design selects individuals based on their outcome. Women who did develop cancer are much more likely to be included in the study than women who did not develop cancer. Therefore, our causal graph will include a node for selection -- C -- an arrow from the outcome Y to C, and a box around








Flashcard 7628315823372

Tags
#causality #statistics
Question
Given that we have tools to measure association, how can we isolate causation? In other words, how can we ensure that the association we measure is causation, say, for measuring the causal effect of 𝑋 on 𝑌 ? Well, we can do that by ensuring that there is [...] association flowing between 𝑋 and 𝑌
Answer
no non-causal


Parent (intermediate) annotation

solate causation? In other words, how can we ensure that the association we measure is causation, say, for measuring the causal effect of 𝑋 on 𝑌 ? Well, we can do that by ensuring that there is no non-causal association flowing between 𝑋 and 𝑌








#recurrent-neural-networks #rnn

non-contractual settings

The specific challenge in such settings is to accurately and timely inform managers on the subtle distinction between a pending defection event (i.e., a customer stops doing business with the focal firm) and an extended period of inactivity of their customers, because possible marketing implications are completely different in each of these situations.



Parent (intermediate) annotation

n-contractual business settings is by definition unobserved by the firm and thus needs to be indirectly inferred from past transaction behavior (Reinartz & Kumar, 2000; Gupta et al., 2006). The specific challenge in such settings is to accurately and timely inform managers on the subtle distinction between a pending defection event (i.e., a customer stops doing business with the focal firm) and an extended period of inactivity of their customers, because possible marketing implications are completely different in each of these situations.





#feature-engineering #lstm #recurrent-neural-networks #rnn
The HMM has N discrete hidden states (where N is typically small) and, therefore, has only log2(N) bits of information available to capture the sequence history (Brown & Hinton, 2001)


Parent (intermediate) annotation

The HMM has N discrete hidden states (where N is typically small) and, therefore, has only log2(N) bits of information available to capture the sequence history (Brown & Hinton, 2001). On the other hand, the RNN has distributed hidden states, which means that each input generally results in changes across all the hidden units of the RNN (Ming et al., 2017). RNNs comb





#recurrent-neural-networks #rnn
Extended variants of the original (‘‘Buy ’Till You Die” (BTYD)) models (e.g., Zhang, Bradlow, & Small (2015), Platzer & Reutterer (2016), Reutterer, Platzer, & Schröder (2021)) improve predictive accuracy by incorporating more hand-crafted summary statistics of customer behavior. However, including customer covariates is cumbersome and an approach to account for time-varying covariates has only just recently been introduced by Bachmann, Meierer, and Näf (2021) at the cost of manual labeling and slower performance.


Parent (intermediate) annotation

f BTYD models – while ”alive”, customers make purchases until they drop out – gives these models robust predictive power, especially on the aggregate cohort level, and over a long time horizon. Extended variants of the original models (e.g., Zhang, Bradlow, & Small (2015), Platzer & Reutterer (2016), Reutterer, Platzer, & Schröder (2021)) improve predictive accuracy by incorporating more hand-crafted summary statistics of customer behavior. However, including customer covariates is cumbersome and an approach to account for time-varying covariates has only just recently been introduced by Bachmann, Meierer, and Näf (2021) at the cost of manual labeling and slower performance. Even advanced BTYD models can be too restrictive to adequately capture diverse customer behaviors in different contexts and the derived forecasts present customer future in an oftentime





#deep-learning #keras #lstm #python #sequence
Unfortunately, the range of contextual information that standard RNNs can access is in practice quite limited. The problem is that the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the network’s recurrent connections.
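A tiny numerical sketch of that effect (the recurrent weights 0.5 and 1.5 are purely illustrative): the same factor is applied at every time step, so a unit contribution either vanishes toward zero or explodes exponentially as it cycles through the recurrent connections.

steps = 50
for w in (0.5, 1.5):          # recurrent weight below vs. above 1
    signal = 1.0
    for _ in range(steps):
        signal *= w           # the influence is re-multiplied at each recurrent step
    print(f"w={w}: contribution after {steps} steps = {signal:.3e}")
# w=0.5 -> ~8.9e-16 (decays), w=1.5 -> ~6.4e+08 (blows up)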


Parent (intermediate) annotation

Unfortunately, the range of contextual information that standard RNNs can access is in practice quite limited. The problem is that the influence of a given input on the hidden layer, and therefore on the network output, either decays or blows up exponentially as it cycles around the network’s recurrent connections. This shortcoming ... referred to in the literature as the vanishing gradient problem ... Long Short-Term Memory (LSTM) is an RNN architecture specifically designed to address the vanish





#feature-engineering #lstm #recurrent-neural-networks #rnn

The learning mechanism of the recurrent neural network thus involves:

(1) the forward propagation step where the cross-entropy loss is calculated;



Parent (intermediate) annotation

The learning mechanism of the recurrent neural network thus involves: (1) the forward propagation step where the cross-entropy loss is calculated; (2) the backpropagation step where the gradient of the parameters with respect to the loss is calculated; and finally, (3) the optimization algorithm, that changes the parameters of the





Flashcard 7628365106444

Tags
#deep-learning #keras #lstm #python #sequence
Question
Sequence-to-sequence prediction involves predicting an [...] given an input sequence. For example: Input Sequence: 1, 2, 3, 4, 5 Output Sequence: 6, 7, 8, 9, 10
Answer
output sequence


Parent (intermediate) annotation

Sequence-to-sequence prediction involves predicting an output sequence given an input sequence. For example: Input Sequence: 1, 2, 3, 4, 5 Output Sequence: 6, 7, 8, 9, 10
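A minimal sketch of framing a series as such input/output pairs (the helper name and the window lengths n_in and n_out are arbitrary choices for illustration):

import numpy as np

def to_seq2seq_pairs(series, n_in, n_out):
    """Slide a window over the series; each window is split into an input and an output sequence."""
    X, y = [], []
    for i in range(len(series) - n_in - n_out + 1):
        X.append(series[i:i + n_in])
        y.append(series[i + n_in:i + n_in + n_out])
    return np.array(X), np.array(y)

series = np.arange(1, 11)                 # 1 .. 10
X, y = to_seq2seq_pairs(series, n_in=5, n_out=5)
print(X[0], "->", y[0])                   # [1 2 3 4 5] -> [ 6  7  8  9 10]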








Flashcard 7628366679308

Question
The algorithms generate predictive scores for each customer based on journey features. These scores allow the company to predict individual customer [...] and value outcomes such as revenue, loyalty, and cost to serve. More broadly, they allow CX leaders to assess the ROI for particular CX investments and directly tie CX initiatives to business outcomes
Answer
satisfaction


Parent (intermediate) annotation

The algorithms generate predictive scores for each customer based on journey features. These scores allow the company to predict individual customer satisfaction and value outcomes such as revenue, loyalty, and cost to serve. More broadly, they allow CX leaders to assess the ROI for particular CX investments and directly tie CX initiatives to bu








Flashcard 7628368252172

Tags
#feature-engineering #lstm #recurrent-neural-networks #rnn
Question
The RNN processes the entire sequence of available data without having to [...] it into features.
Answer
summarize


Parent (intermediate) annotation

The RNN processes the entire sequence of available data without having to summarize it into features.








Flashcard 7628370087180

Tags
#recurrent-neural-networks #rnn
Question
Embedding layers are used to reduce data dimensionality, compressing large vectors of values into relatively smaller ones, to both [...] and limit the number of model parameters required
Answer
reduce noise


Parent (intermediate) annotation

Embedding layers are used to reduce data dimensionality, compressing large vectors of values into relatively smaller ones, to both reduce noise and limit the number of model parameters required
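A minimal Keras sketch of that compression (vocabulary size, embedding dimension, and the token ids are made-up numbers): integer ids from a space of 10,000 possible values are mapped to dense 16-dimensional vectors, so downstream layers need far fewer parameters than with one-hot inputs.

import numpy as np
from tensorflow.keras import layers

vocab_size, embed_dim = 10_000, 16        # illustrative sizes
embedding = layers.Embedding(input_dim=vocab_size, output_dim=embed_dim)

ids = np.array([[3, 917, 42, 7]])         # one sequence of four token/category ids
vectors = embedding(ids)
print(vectors.shape)                      # (1, 4, 16): sparse high-dimensional ids -> small dense vectors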








Flashcard 7628371922188

Tags
#deep-learning #keras #lstm #python #sequence
Question
For a multiclass classification problem, the results may be in the form of an array of probabilities (assuming a one hot encoded output variable) that may need to be converted to a single class output prediction using the [...]() NumPy function.
Answer
argmax


Parent (intermediate) annotation

ion problem, the results may be in the form of an array of probabilities (assuming a one hot encoded output variable) that may need to be converted to a single class output prediction using the argmax() NumPy function.
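A minimal NumPy illustration of that conversion (the probability values are made-up):

import numpy as np

# Softmax-style class probabilities for three samples and three classes
probs = np.array([
    [0.1, 0.7, 0.2],
    [0.8, 0.1, 0.1],
    [0.2, 0.3, 0.5],
])
classes = np.argmax(probs, axis=1)   # index of the most probable class per sample
print(classes)                       # [1 0 2]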








Flashcard 7628373232908

Tags
#DAG #causal #edx
Question
So all these methods for confounding adjustment -- stratification, matching, inverse probability weighting, G-formula, G-estimation -- have two things in common. First, they require data on the [...] that block the backdoor path. If those data are available, then the choice of one of these methods over the others is often a matter of personal taste. Unless the treatment is time-varying -- then we have to go to G-methods
Answer
confounders


Parent (intermediate) annotation

So all these methods for confounding adjustment -- stratification, matching, inverse probability weighting, G-formula, G-estimation -- have two things in common. First, they require data on the confounders that block the backdoor path. If those data are available, then the choice of one of these methods over the others is often a matter of personal taste. Unless the treatment is time-vary








Flashcard 7628374543628

Tags
#recurrent-neural-networks #rnn
Question
In this paper, we offer marketing analysts an alternative to these models by developing a deep learning based approach that does not rely on any ex-ante data [...] or feature engineering, but instead automatically detects behavioral dynamics like seasonality or changes in inter-event timing patterns by learning directly from the prior transaction history
Answer
labelling


Parent (intermediate) annotation

In this paper, we offer marketing analysts an alternative to these models by developing a deep learning based approach that does not rely on any ex-ante data labelling or feature engineering, but instead automatically detects behavioral dynamics like seasonality or changes in inter-event timing patterns by learning directly from the prior transaction








#recurrent-neural-networks #rnn
Sarkar and De Bruyn (2021) demonstrate that a special RNN type can help marketing response modelers to benefit from the multitude of inter-temporal customer-firm interactions accompanying observed transaction flows for predicting the most likely next customer action. However, their approach is limited to single point, next-step predictions and to continue with such forecasts into the long-run one must estimate the new model repeatedly with each additional future time step


Parent (intermediate) annotation

ing purchasing intent. In a similar context, Toth, Tan, Di Fabbrizio, and Datta (2017) have shown that a mixture of RNNs can approximate several complex functions simultaneously. More recently, Sarkar and De Bruyn (2021) demonstrate that a special RNN type can help marketing response modelers to benefit from the multitude of inter-temporal customer-firm interactions accompanying observed transaction flows for predicting the most likely next customer action. However, their approach is limited to single point, next-step predictions and to continue with such forecasts into the long-run one must estimate the new model repeatedly with each additional future time step





Flashcard 7628378213644

Tags
#DAG #causal #edx #has-images
Question
In those cases, it is generally better [...] L, because even though adjusting for L will not eliminate all confounding by U, it will typically eliminate some of the confounding by U
Answer
to adjust for


Parent (intermediate) annotation

In those cases, it is generally better to adjust for L, because even though adjusting for L will not eliminate all confounding by U, it will typically eliminate some of the confounding by U








#recurrent-neural-networks #rnn
The name, often shortened to seq2seq, comes from the fact that these models can translate a sequence of input elements into a sequence of outputs.


Parent (intermediate) annotation

The name, often shortened to seq2seq, comes from the fact that these models can translate a sequence of input elements into a sequence of outputs. Different seq2seq models can be created depending on how we manipulate the input data; i.e., we can conceal certain parts of the input sequence and train the model to predict what is mi





#recurrent-neural-networks #rnn
Different seq2seq models can be created depending on how we manipulate the input data; i.e., we can conceal certain parts of the input sequence and train the model to predict what is missing, to ‘‘fill in the blanks”.


Parent (intermediate) annotation

The name, often shortened to seq2seq, comes from the fact that these models can translate a sequence of input elements into a sequence of outputs. Different seq2seq models can be created depending on how we manipulate the input data; i.e., we can conceal certain parts of the input sequence and train the model to predict what is missing, to ‘‘fill in the blanks”. If we always blank only the last element in a historical sequence, the model effectively learns to predict the most likely future, conditioned on the observed past. Applying this idea t
