BuboFlash - helps with learning

Edited, memorised or added to reading queue

Annotation 7554957708556

#data-science #infrastructure

A typical bottleneck arises because humans can’t deliver software (or hardware, if operating outside the cloud) fast enough. Even if they were capable of hacking code fast enough, they may be busy maintaining existing systems, which is another critically human activity. This observation helps us realize that although “infrastructure” sounds very technical, we are not building infrastructure for the machines. We are building infrastructure to make humans more productive. This realization has fundamental ramifications for how we should think about and design infrastructure for data scientists: for fellow human beings, instead of for machines. For instance, if we assume that human time is more expensive than computer time, which is certainly true for most data scientists, it makes sense to use a highly expressive, productivity-boosting language like Python instead of a low-level language like C++, even if it makes workloads less efficient to process. We will dig deeper into this question in chapter 5.

status	not read	reprioritisations
last reprioritisation on		suggested re-reading day
started reading on		finished reading on

Flashcard 7625116880140

Tags

#tensorflow #tensorflow-certificate

Question

# Create 4-[...] tensor (the same as 4 dimensions)

A = tf.constant(np.arange(0, 120), shape=(2, 3, 4, 5))

Answer

rank

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

import numpy as np
import tensorflow as tf

# Create 4-rank tensor (the same as 4 dimensions)
A = tf.constant(np.arange(0, 120), shape=(2, 3, 4, 5))
A
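As a rough check on the card above: the rank of a tensor is the number of axes in its shape, and the shape has to account for every element supplied. The sketch below uses plain Python instead of TensorFlow (an assumption, so that it runs without extra packages), and `rank_and_size` is a hypothetical helper name.

```python
import math

def rank_and_size(shape):
    """Return (rank, element count) for a tensor shape given as a tuple."""
    rank = len(shape)        # rank = number of axes
    size = math.prod(shape)  # total number of elements the shape holds
    return rank, size

shape = (2, 3, 4, 5)
rank, size = rank_and_size(shape)

# range(0, 120) stands in for np.arange(0, 120): 120 values, exactly what the shape holds.
values = list(range(0, 120))
assert len(values) == size
print(rank, size)
```

In TensorFlow itself, `tf.rank(A)` or `A.ndim` should report the same rank for the tensor on the card.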

Annotation 7625187134732

#RNN #ariadne #behaviour #consumer #deep-learning #patterns #priority #recurrent-neural-networks #retail #simulation #synthetic-data

The model utilizes an auto-encoder to represent features of input parameters (i.e. customer loyalty number, R, F, and M). The proposed model is the first of its kind in the literature and has many opportunities for further improvement. The model can be improved by using more training data. It would be interesting to explore deeper structures of the model in auto-encoder and recursion levels. Clumpiness is another variable which can be studied as an addition to the R, F, and M variables (i.e. RFMC).

Parent (intermediate) annotation
This paper proposes a new model for RFM prediction of customers based on recurrent neural networks (RNNs) with rectified linear unit activation function. The model utilizes an auto-encoder to represent features of input parameters (i.e. customer loyalty number, R, F, and M). The proposed model is the first of its kind in the literature and has many opportunities for further improvement. The model can be improved by using more training data. It would be interesting to explore deeper structures of the model in auto-encoder and recursion levels. Clumpiness is another variable which can be studied as an addition to the R, F, and M variables (i.e. RFMC). Another pathway is considering other user parameters (e.g. location, age, etc.) for automatic feature extraction and further development of recommender systems.
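For readers unfamiliar with the R, F, and M inputs, the sketch below derives them from a toy transaction log using the common definitions (recency = days since the last purchase, frequency = number of purchases, monetary = total spend). These definitions, the data, and the helper name `rfm_table` are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict
from datetime import date

def rfm_table(transactions, as_of):
    """Compute (recency_days, frequency, monetary) per customer.

    transactions: iterable of (customer_id, purchase_date, amount).
    as_of: the date relative to which recency is measured.
    """
    last_purchase = {}
    frequency = defaultdict(int)
    monetary = defaultdict(float)
    for customer, day, amount in transactions:
        if customer not in last_purchase or day > last_purchase[customer]:
            last_purchase[customer] = day
        frequency[customer] += 1
        monetary[customer] += amount
    return {
        c: ((as_of - last_purchase[c]).days, frequency[c], monetary[c])
        for c in last_purchase
    }

# Made-up transactions for two hypothetical customers.
log = [
    ("c1", date(2024, 1, 5), 20.0),
    ("c1", date(2024, 3, 1), 35.0),
    ("c2", date(2024, 2, 10), 12.5),
]
table = rfm_table(log, as_of=date(2024, 3, 31))
print(table)
```

A sequence model would typically consume such values per time window instead of once per customer; that windowing is left out here to keep the sketch short.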

Annotation 7625188969740

#has-images #recurrent-neural-networks #rnn

What would we expect from customers like the first ten individuals 1001–1010, who started out as occasional benefactors, but through an evolving relationship with the firm have developed a more regular transaction behavior? Will they continue this trend; will they eventually turn into the firm’s premium customers?

Parent (intermediate) annotation
What would we expect from customers like the first ten individuals 1001–1010, who started out as occasional benefactors, but through an evolving relationship with the firm have developed a more regular transaction behavior? Will they continue this trend; will they eventually turn into the firm’s premium customers? Conversely, how about the next ten individuals 1011–1020, who have all made a number of transactions historically, but recently have been on an unusually long hiatus? Is the customer-fi

Annotation 7625190804748

#ML-engineering #ML_in_Action #learning #machine #software-engineering

Project scoping for ML is incredibly challenging. Even for the most seasoned ML veterans, conjecturing how long a project will take, which approach is going to be most successful, and the amount of resources required is a futile and frustrating exercise.

Parent (intermediate) annotation
Project scoping for ML is incredibly challenging. Even for the most seasoned ML veterans, conjecturing how long a project will take, which approach is going to be most successful, and the amount of resources required is a futile and frustrating exercise. The risk associated with making erroneous claims is fairly high, but structuring proper scoping and solution research can help minimize the chances of being wildly off on estimation.

Flashcard 7625192639756

Tags

#feature-engineering #lstm #recurrent-neural-networks #rnn

Question

The LSTM neural network would be well-suited for modeling online customer behavior across multiple websites since it can naturally capture inter-sequence and inter-temporal interactions from multiple streams of clickstream data without growing [...] in complexity.

Answer

exponentially

Parent (intermediate) annotation
The LSTM neural network would be well-suited for modeling online customer behavior across multiple websites since it can naturally capture inter-sequence and inter-temporal interactions from multiple streams of clickstream data without growing exponentially in complexity.

Annotation 7625194736908

#recurrent-neural-networks #rnn

The simple behavioral story which sits at the core of BTYD models – while “alive”, customers make purchases until they drop out

Parent (intermediate) annotation
The simple behavioral story which sits at the core of BTYD models – while “alive”, customers make purchases until they drop out – gives these models robust predictive power, especially on the aggregate cohort level, and over a long time horizon.
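The behavioral story above can be sketched as a toy discrete-time simulation. This is a simplification under stated assumptions: every customer shares the same fixed purchase and dropout probabilities, whereas actual BTYD models such as Pareto/NBD let these vary across customers. The function name and parameter values are hypothetical.

```python
import random

def simulate_customer(n_periods, p_purchase, p_dropout, rng):
    """Simulate one customer under a toy 'buy till you die' story.

    Each period, a customer who is still alive buys with probability
    p_purchase and then drops out for good with probability p_dropout.
    Returns a list of 0/1 purchase indicators, one per period.
    """
    alive = True
    history = []
    for _ in range(n_periods):
        if alive:
            history.append(1 if rng.random() < p_purchase else 0)
            if rng.random() < p_dropout:
                alive = False  # dropout is permanent
        else:
            history.append(0)  # customers who dropped out never purchase again
    return history

rng = random.Random(0)
cohort = [simulate_customer(12, 0.4, 0.1, rng) for _ in range(1000)]
# Aggregate purchases per period across the cohort.
per_period = [sum(h[t] for h in cohort) for t in range(12)]
print(per_period)
```

At the cohort level, the per-period purchase counts tend to decline as more customers drop out, even though an individual history only shows purchases and gaps, never the dropout itself.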

Flashcard 7625197620492

Tags

#data-science #infrastructure

Question

to conduct data science projects, a common infrastructure can help to increase the number of projects that can be executed simultaneously (volume), speed up the time to market ([...]), ensure that the results are robust (validity), and make it possible to support a larger variety of projects

Answer

velocity

Parent (intermediate) annotation
to conduct data science projects, a common infrastructure can help to increase the number of projects that can be executed simultaneously (volume), speed up the time to market (velocity), ensure that the results are robust (validity), and make it possible to support a larger variety of projects

Annotation 7625199455500

#data #synthetic

Traditional methods of synthetic data generation use techniques that do not intend to replicate important statistical properties of the original data. Properties such as the distribution, the patterns or the correlation between variables, are often omitted. Moreover, most of the existing tools and approaches require a great deal of user-defined rules and do not make use of advanced techniques like Machine Learning or Deep Learning

Parent (intermediate) annotation
Traditional methods of synthetic data generation use techniques that do not intend to replicate important statistical properties of the original data. Properties such as the distribution, the patterns or the correlation between variables, are often omitted. Moreover, most of the existing tools and approaches require a great deal of user-defined rules and do not make use of advanced techniques like Machine Learning or Deep Learning. While Machine Learning is an innovative area of Artificial Intelligence and Computer Science that uses statistical techniques to give computers the ability to learn from data, Deep Lea

Annotation 7625201290508

#recurrent-neural-networks #rnn

We propose and implement a flexible methodological framework that provides marketing managers with highly accurate forecasts of fine granularity both in the short and in the long run. Our method also captures seasonal peaks and customer-level dynamics and makes it possible to differentiate between customer groups.

Parent (intermediate) annotation
Army knife-like) general-purpose problem solver that generalizes across the described decision tasks of managing customer relationships. This article makes a first step in this direction. We propose and implement a flexible methodological framework that provides marketing managers with highly accurate forecasts of fine granularity both in the short and in the long run. Our method also captures seasonal peaks and customer-level dynamics and makes it possible to differentiate between customer groups.

Flashcard 7625203387660

Tags

#recurrent-neural-networks #rnn

Question

In this specific domain of customer base analysis, probabilistic approaches from the [...] model family represent the gold standard, leveraging easily observable Recency and Frequency (RF, or RFM when including also the monetary value) metrics together with a latent attrition process to deliver accurate predictions (Schmittlein, Morrison, & Colombo, 1987; Fader, Hardie, & Lee, 2005; Fader & Hardie, 2009)

Answer

“Buy ’Till You Die” (BTYD)

Parent (intermediate) annotation
In this specific domain of customer base analysis, probabilistic approaches from the “Buy ’Till You Die” (BTYD) model family represent the gold standard, leveraging easily observable Recency and Frequency (RF, or RFM when including also the monetary value) metrics together with a latent attrition

Annotation 7625204698380

#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data

Applying RNNs directly to sequences of consumer actions yields the same or higher prediction accuracy than vector-based methods like logistic regression.

Parent (intermediate) annotation
In multiple aspects, RNNs offer advantages over existing methods that are relevant for real-world production systems. Applying RNNs directly to sequences of consumer actions yields the same or higher prediction accuracy than vector-based methods like logistic regression. Unlike the latter, the application of RNNs comes without the need for extensive feature engineering. In addition, we show that RNNs help us link individual actions directly to predictio
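To make the contrast concrete, the sketch below shows two input representations for one hypothetical session: a hand-built fixed-length feature vector of the kind a logistic regression consumes, and the ordered action sequence (one-hot encoded per step) that an RNN can consume directly. The action names and the chosen features are invented for illustration and are not taken from the paper.

```python
ACTIONS = ["view", "add_to_cart", "purchase"]  # hypothetical action vocabulary

def handcrafted_features(session):
    """Fixed-length vector: one count per action type plus session length.

    The order of actions is discarded by this representation.
    """
    return [session.count(a) for a in ACTIONS] + [len(session)]

def one_hot_sequence(session):
    """Variable-length sequence of one-hot vectors, preserving action order."""
    return [[1 if a == action else 0 for a in ACTIONS] for action in session]

s1 = ["view", "view", "add_to_cart", "purchase"]
s2 = ["add_to_cart", "purchase", "view", "view"]

# Both sessions collapse to the same feature vector ...
print(handcrafted_features(s1), handcrafted_features(s2))
# ... but remain distinguishable as sequences.
print(one_hot_sequence(s1) != one_hot_sequence(s2))
```

Recovering order information in the vector-based setting would require designing further features by hand, which is the feature-engineering effort the passage refers to.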

Flashcard 7625206795532

Tags

#bayesian #stan

Question

The Stan development crew has made it easy to interactively explore diagnostics via the shinystan package, and one should do so with each model. In addition, there are other diagnostics available in other packages like [...] and posterior.

Answer

loo

Stan - diagnostic packages
The Stan development crew has made it easy to interactively explore diagnostics via the shinystan package, and one should do so with each model. In addition, there are other diagnostics available in other packages like loo and posterior.

Flashcard 7625208368396

Tags

#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data

Question

As [...] are required directly in many practical applications, we use NLL also for evaluation. In some applications, the resulting ranking of consumers is more important than the probabilities themselves. For this reason, we also report the area under the ROC curve (AUC)

Answer

probability estimates

Parent (intermediate) annotation
As probability estimates are required directly in many practical applications, we use NLL also for evaluation. In some applications, the resulting ranking of consumers is more important than the probabilities t
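The two metrics named above can be sketched in a few lines. This is a minimal illustration assuming binary labels and predicted probabilities strictly between 0 and 1; the function names and the example data are hypothetical, and the AUC is computed by brute-force pair counting, which is not meant for large datasets.

```python
import math

def negative_log_likelihood(y_true, p_pred):
    """Mean negative log-likelihood of binary labels under predicted probabilities."""
    total = 0.0
    for y, p in zip(y_true, p_pred):
        total += -math.log(p if y == 1 else 1.0 - p)
    return total / len(y_true)

def roc_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = 0.0
    for sp in pos:
        for sn in neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(pos) * len(neg))

labels = [1, 0, 1, 0]
probs = [0.9, 0.2, 0.6, 0.7]
print(negative_log_likelihood(labels, probs), roc_auc(labels, probs))

# Halving every probability keeps the ranking (same AUC) but changes the NLL.
halved = [p / 2 for p in probs]
print(negative_log_likelihood(labels, halved), roc_auc(labels, halved))
```

The second call illustrates why both metrics are reported: AUC depends only on how consumers are ranked, while NLL also penalizes probabilities that are poorly calibrated.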

Edited, memorised or added to reading queue

on 24-Apr-2024 (Wed)
