Edited, memorised or added to reading queue

on 23-Jan-2026 (Fri)


#Inference #causal #reading
the artificial intelligence (AI) literature has developed a wide array of techniques for causal learning that allow leveraging information from various imperfect, heterogeneous, and biased data sources (Bareinboim and Pearl, 2016). See also https://causalai-book.net/


Parent (intermediate) annotation

Building on the structural approach to causality introduced by Haavelmo (1943) and the graph-theoretic framework proposed by Pearl (1995), the artificial intelligence (AI) literature has developed a wide array of techniques for causal learning that allow leveraging information from various imperfect, heterogeneous, and biased data sources (Bareinboim and Pearl, 2016)

Original toplevel document (pdf)





#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
In principle, one could evaluate the logistic regression model at every single time-step in the consumer history to determine the influence of individual events. However, this would involve the inefficient process of re-calculating features for every time-step.


Parent (intermediate) annotation

In principle, one could evaluate the logistic regression model at every single time-step in the consumer history to determine the influence of individual events. However, this would involve the inefficient process of re-calculating features for every time-step. Calculations at timesteps t and t − 1 would be highly redundant: features at t represent the complete history until t and not only what happened in between t − 1 and t.
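The redundancy argument above can be sketched in a few lines of plain Python (toy data and a hypothetical two-number feature summary, not the paper's actual features): recomputing the history summary at every time-step repeats work, while carrying a running summary forward, as an RNN's hidden state does, visits each event only once.

```python
# Toy illustration: history-summary features computed two ways.
# Hypothetical feature: (event count so far, total spend so far).

events = [12.0, 3.5, 7.0, 1.25, 9.0]  # spend per event (made-up data)

# Naive: re-scan the full history at every time-step -> O(T^2) work.
naive = [(len(events[:t + 1]), sum(events[:t + 1]))
         for t in range(len(events))]

# Incremental: update the previous summary with one new event -> O(T) work,
# analogous to how an RNN carries its hidden state forward.
incremental = []
count, total = 0, 0.0
for spend in events:
    count += 1
    total += spend
    incremental.append((count, total))

assert naive == incremental
```

The two computations agree at every time-step; only the amount of repeated work differs.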





Flashcard 7789511838988

Tags
#Inference #causal #reading
Question
the artificial intelligence (AI) literature has developed a wide array of techniques for causal learning that allow leveraging information from various imperfect, heterogeneous, and biased data sources ([...] and Pearl, 2016)
Answer
Bareinboim


Parent (intermediate) annotation

…the artificial intelligence (AI) literature has developed a wide array of techniques for causal learning that allow leveraging information from various imperfect, heterogeneous, and biased data sources (Bareinboim and Pearl, 2016)








#R #ggplot2
Marginal distributions can now be made in R using ggside, a new ggplot2 extension

Side-Plot Tutorial with ggside
Marginal distributions can now be made in R using ggside, a new ggplot2 extension. You can make linear regression with marginal distributions using histograms, densities, box plots, and more. Bonus - The side panels are super customizable for uncovering complex relationships.




Flashcard 7789515246860

Tags
#R #ggplot2
Question
[...] can now be made in R using ggside, a new ggplot2 extension
Answer
Marginal distributions


Parent (intermediate) annotation

Marginal distributions can now be made in R using ggside, a new ggplot2 extension

Original toplevel document

Side-Plot Tutorial with ggside
Marginal distributions can now be made in R using ggside, a new ggplot2 extension. You can make linear regression with marginal distributions using histograms, densities, box plots, and more. Bonus - The side panels are super customizable for uncovering complex relationships.







Flashcard 7789517081868

Tags
#recurrent-neural-networks #rnn
Question
We show that incorporating contextual information in the model is straightforward and brings an additional boost in predictive accuracy. However, the model performance is already extremely strong when no context is available beyond the [...] of the customer’s transactions. This is welcome news for firms that do not wish to collect personal information on principle, to avoid the questionable ethics of harvesting the ‘‘behavioral surplus”
Answer
timing


Parent (intermediate) annotation

…incorporating contextual information in the model is straightforward and brings an additional boost in predictive accuracy. However, the model performance is already extremely strong when no context is available beyond the timing of the customer’s transactions. This is welcome news for firms that do not wish to collect personal information on principle, to avoid the questionable ethics of harvesting the “behavioral surplus”…








Flashcard 7789519441164

Tags
#feature-engineering #lstm #recurrent-neural-networks #rnn
Question
models with [...] capacity may overfit the training set and exhibit high variance
Answer
high


Parent (intermediate) annotation

models with high capacity may overfit the training set and exhibit high variance








Flashcard 7789521014028

Tags
#feature-engineering #lstm #recurrent-neural-networks #rnn
Question
models with high capacity may overfit the training set and exhibit [...] variance
Answer
high


Parent (intermediate) annotation

models with high capacity may overfit the training set and exhibit high variance








Flashcard 7789522586892

Tags
#tensorflow #tensorflow-certificate
Question

Bag of tricks to improve model

3. Fit the model - more epochs, more [...]

Answer
data examples


Parent (intermediate) annotation

Bag of tricks to improve model 3. Fit the model - more epochs, more data examples

Original toplevel document

TfC_02_classification-PART_1
Bag of tricks to improve model: Create model (more layers, more neurons, different activation); Compile model (other loss, other optimizer, change optimizer parameters); Fit the model (more epochs, more data examples).







#RNN #ariadne #behaviour #consumer #deep-learning #priority #retail #simulation #synthetic-data
Past study [5] has shown that retailers use conventional techniques with available data to model consumer purchase. While these help in estimating purchase pattern for loyal consumers and high selling items with reasonable accuracy, they do not perform well for the long tail.


Parent (intermediate) annotation

Past study [5] has shown that retailers use conventional techniques with available data to model consumer purchase. While these help in estimating purchase patterns for loyal consumers and high-selling items with reasonable accuracy, they do not perform well for the long tail. Since multiple parameters interact non-linearly to define consumer purchase patterns, traditional models are not sufficient to achieve high accuracy across thousands to millions of consumers.





Flashcard 7789526781196

Tags
#tensorflow #tensorflow-certificate
Question

Three types of classification problems:

  • binary classification
  • multiclass
  • [...]
Answer
multilabel


Parent (intermediate) annotation

Three types of classification problems: binary classification multiclass multilabel

Original toplevel document

TfC_02_classification-PART_1
Types of classification problems. Three types of classification problems: binary classification, multiclass, multilabel. Multilabel classification: a sample can be assigned to more than one label from more than 2 label options. Multiclass classification: a sample can be assigned to one label, but from more than 2 label options.
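A toy sketch of how the label for a single sample differs across these three problem types (made-up values; k = 4 classes assumed):

```python
# Toy labels for one sample under each classification type.

binary_label = 1                  # one of exactly two classes (0 or 1)
multiclass_label = 2              # exactly one class index out of k = 4
multilabel_label = [1, 0, 1, 0]   # any subset of the k labels may be active

assert binary_label in (0, 1)
assert 0 <= multiclass_label < 4
assert multilabel_label.count(1) == 2  # this sample carries two labels at once
```

The multilabel case is the only one where several entries can be 1 simultaneously, which is what distinguishes it from multiclass.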







Flashcard 7789528616204

Tags
#deep-learning #keras #lstm #python #sequence
Question
When a network is fit on unscaled data that has a range of values (e.g. quantities in the 10s to 100s) it is possible for large inputs to [...] the learning and convergence of your network, and in some cases prevent the network from effectively learning your problem.
Answer
slow down


Parent (intermediate) annotation

When a network is fit on unscaled data that has a range of values (e.g. quantities in the 10s to 100s) it is possible for large inputs to slow down the learning and convergence of your network, and in some cases prevent the network from effectively learning your problem.
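As a minimal illustration (plain Python, made-up quantities; in practice Keras users would typically reach for a `Normalization` layer or scikit-learn's `MinMaxScaler`), min-max scaling maps such inputs into [0, 1] before training:

```python
# Min-max scaling: map raw quantities (10s to 100s) into [0, 1]
# so that no single input dimension dominates gradient updates.

raw = [15.0, 40.0, 250.0, 90.0, 10.0]  # made-up unscaled quantities

lo, hi = min(raw), max(raw)
scaled = [(x - lo) / (hi - lo) for x in raw]

assert min(scaled) == 0.0 and max(scaled) == 1.0
# At prediction time, reuse the SAME lo/hi fitted on the training set
# rather than re-fitting them on new data.
```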








Flashcard 7789530451212

Tags
#deep-learning #keras #lstm #python #sequence
Question
Truncated Backpropagation Through Time, or [...] (acronym?)
Answer
TBPTT


Parent (intermediate) annotation

Truncated Backpropagation Through Time, or TBPTT, is a modified version of the BPTT training algorithm
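A minimal sketch of the truncation itself (plain Python; the window length k = 4 is a made-up choice, and real TBPTT also manages how hidden state is carried between windows):

```python
# TBPTT sketch: instead of backpropagating through all T time-steps,
# split the sequence into windows of length k and backpropagate
# only within each window.

def tbptt_chunks(sequence, k):
    """Yield consecutive windows of at most k time-steps."""
    return [sequence[i:i + k] for i in range(0, len(sequence), k)]

sequence = list(range(10))          # a length-10 toy sequence
chunks = tbptt_chunks(sequence, 4)  # k = 4 truncation length

assert chunks == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
# Full BPTT is the special case k >= len(sequence):
assert tbptt_chunks(sequence, 10) == [sequence]
```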








Flashcard 7789532810508

Tags
#deep-learning #keras #lstm #python #sequence
Question
Truncated Backpropagation Through Time, or TBPTT, is a modified version of the [...] training algorithm
Answer
BPTT


Parent (intermediate) annotation

Truncated Backpropagation Through Time, or TBPTT, is a modified version of the BPTT training algorithm








Flashcard 7789534645516

Tags
#tensorflow #tensorflow-certificate
Question

Bag of tricks to improve model

[...] model - other loss, other optimizer, change optimizer parameters

Answer
Compile


Parent (intermediate) annotation

Bag of tricks to improve model Compile model - other loss, other optimizer, change optimizer parameters

Original toplevel document

TfC_02_classification-PART_1
Bag of tricks to improve model: Create model (more layers, more neurons, different activation); Compile model (other loss, other optimizer, change optimizer parameters); Fit the model (more epochs, more data examples).







Flashcard 7789536480524

Tags
#tensorflow #tensorflow-certificate
Question
In case of labels as [...] use SparseCategoricalCrossentropy
Answer
integers


Parent (intermediate) annotation

In case of labels as integers use SparseCategoricalCrossentropy

Original toplevel document

TfC_02_classification-PART_2
important: This time there is a problem with the loss function. In case of categorical_crossentropy the labels have to be one-hot encoded; in case of labels as integers, use SparseCategoricalCrossentropy.
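The difference is only in the label format, which a few lines of hand-rolled cross-entropy (plain Python, not the Keras implementation) can confirm: the sparse loss on an integer label equals the categorical loss on the corresponding one-hot vector.

```python
import math

# Predicted class probabilities for one sample (3 classes, made-up).
probs = [0.1, 0.7, 0.2]

# categorical_crossentropy expects a one-hot label...
one_hot = [0, 1, 0]
cat_ce = -sum(y * math.log(p) for y, p in zip(one_hot, probs))

# ...SparseCategoricalCrossentropy expects the integer class index.
label = 1
sparse_ce = -math.log(probs[label])

assert abs(cat_ce - sparse_ce) < 1e-12  # same loss, different label format
```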







#RNN #ariadne #behaviour #consumer #deep-learning #patterns #priority #recurrent-neural-networks #retail #simulation #synthetic-data
recency (R), frequency (F), and monetary value (M) variables, called RFM [3], [4], [5]. These variables present some understanding of customer’s behaviour and try to answer the following questions: “How recently did the customer purchase?”, “How often do they purchase?”, and “How much do they spend?” [2].


Parent (intermediate) annotation

The CLV models use different strategies for customer behaviour modelling. One of the most reliable ones is using the recency (R), frequency (F), and monetary value (M) variables, called RFM [3], [4], [5]. These variables present some understanding of customer’s behaviour and try to answer the following questions: “How recently did the customer purchase?”, “How often do they purchase?”, and “How much do they spend?” [2]. RFM variables are sufficient statistics for customer behaviour modelling and are a mainstay of the industry because of their ease of implementation in practice [6], [3].
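A minimal sketch of the three RFM variables (plain Python, made-up transactions; a simple day index stands in for calendar dates, and "monetary value" is taken here as total spend):

```python
# RFM sketch: compute recency, frequency, monetary value
# from one customer's transaction history (made-up data).

today = 100  # current day index
transactions = [(10, 25.0), (55, 40.0), (90, 35.0)]  # (day, amount) pairs

recency = today - max(day for day, _ in transactions)  # days since last purchase
frequency = len(transactions)                          # how often they purchase
monetary = sum(amount for _, amount in transactions)   # how much they spend

assert (recency, frequency, monetary) == (10, 3, 100.0)
```

Each variable answers one of the three questions quoted above: recency ("how recently?"), frequency ("how often?"), monetary value ("how much?").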





#data-science #infrastructure
if we assume that human-time is more expensive than computer-time, which is certainly true for most data scientists, it makes sense to use a highly expressive, productivity-boosting language like Python instead of a low-level language like C++, even if it makes workloads more inefficient to process


Parent (intermediate) annotation

This realization has fundamental ramifications to how we should think about and design infrastructure for data scientists: for fellow human beings, instead of for machines. For instance, if we assume that human-time is more expensive than computer-time, which is certainly true for most data scientists, it makes sense to use a highly expressive, productivity-boosting language like Python instead of a low-level language like C++, even if it makes workloads more inefficient to process.
