Edited, memorised or added to reading queue

Annotation 7103967268108

#RNN #ariadne #behaviour #consumer #deep-learning #priority #retail #simulation #synthetic-data

Most retail and e-retail brands plan their short-term inventory (2-4 weeks ahead) based on consumer purchase patterns. Also, certain sales and marketing strategies, such as offer personalization and personalized item recommendations, leverage the results of consumer purchase predictions for the near future. Given that every demand planner works on a narrow segment of the item portfolio, there is high variability in the choices that different planners recommend. Additionally, the demand planners might not get enough opportunities to discuss their views and insights on their recommendations. Hence, subtle effects like cannibalization [21] and item affinity remain unaccounted for. Such inefficiencies lead to a gap between consumer needs and item availability, resulting in the loss of business opportunities in terms of consumer churn, out-of-stock items, and excess inventory.

Annotation 7545564826892

#ML-engineering #ML_in_Action #learning #machine #software-engineering

The aim of ML engineering is not to iterate through the lists of skills just mentioned and require that a data scientist (DS) master each of them. Instead, ML engineering collects certain aspects of those skills, carefully crafted to be relevant to data scientists, all with the goal of increasing the chances of getting an ML project into production and making sure that it’s not a solution that needs constant maintenance and intervention to keep running

Annotation 7545571380492

#ML-engineering #ML_in_Action #learning #machine #software-engineering

Throughout this first part of the book, we’ll discuss how to identify the reasons so many projects fail, are abandoned, or take far longer than they should to reach production. We’ll also discuss the solutions to each of these common failures and cover the processes that can significantly lower the chances of these factors derailing your projects. Generally, these failures happen because the DS team is either inexperienced with solving a problem of the scale required (a technological or process-driven failure) or hasn’t fully understood the desired outcome from the business (a communication-driven failure). I’ve never seen this happen because of malicious intent. Rather, most ML projects are incredibly challenging, complex, and composed of algorithmic software tooling that is hard to explain to a layperson, hence the breakdowns in communication with business units that most projects endure.

Annotation 7552568003852

#ML-engineering #ML_in_Action #learning #machine #software-engineering

Project scoping for ML is incredibly challenging. Even for the most seasoned ML veterans, conjecturing how long a project will take, which approach is going to be most successful, and the amount of resources required is a futile and frustrating exercise. The risk associated with making erroneous claims is fairly high, but structuring proper scoping and solution research can help minimize the chances of being wildly off on estimation.

Annotation 7552569576716

#ML-engineering #ML_in_Action #learning #machine #software-engineering

Most companies have a mix of the types of people in this hyperbolic scenario. Some are academics whose sole goal is to further the advancement of knowledge and research into algorithms, paving the way for future discoveries from within the industry. Others are “applications of ML” engineers who just want to use ML as a tool to solve a business problem. It’s important to embrace and balance both aspects of these philosophies toward ML work, strike a compromise during the research and scoping phase of a project, and know that the middle ground here is the best path to tread to ensure that a project actually makes it to production.

Annotation 7553395592460

#data-science #infrastructure

to conduct data science projects, a common infrastructure can help to increase the number of projects that can be executed simultaneously (volume), speed up the time to market (velocity), ensure that the results are robust (validity), and make it possible to support a larger variety of projects

Annotation 7553397165324

#data-science #infrastructure

We will systematically go through the stack of systems that make a modern, effective infrastructure for data science. The principles covered in this book are not specific to any particular implementation, but we will use an open source framework, Metaflow, to show how the ideas can be put into practice. Alternatively, you can customize your own solution by using other off-the-shelf libraries. This book will help you to choose the right set of tools for the job

Annotation 7559029067020

#Docker

Docker: view the log of an exited container (with timestamps)

docker logs -t [CONTAINER NAME]
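
As a rough sketch, the same command can be assembled and run from Python. The helper names `build_logs_command` and `read_logs` and the container name `my_container` are made up for illustration; `-t` and `--tail` are standard `docker logs` options. Running `read_logs` assumes the Docker CLI is installed and the container exists.

```python
import subprocess


def build_logs_command(container, timestamps=True, tail=None):
    """Build the argument list for `docker logs` (nothing is executed here)."""
    cmd = ["docker", "logs"]
    if timestamps:
        cmd.append("-t")  # -t / --timestamps prefixes each line with its timestamp
    if tail is not None:
        cmd += ["--tail", str(tail)]  # limit output to the last N lines
    cmd.append(container)
    return cmd


def read_logs(container, **options):
    """Run the command and return the log text (needs a local Docker installation)."""
    result = subprocess.run(
        build_logs_command(container, **options),
        capture_output=True,
        text=True,
        check=True,
    )
    # The container's stdout and stderr are replayed on the matching streams.
    return result.stdout + result.stderr


# "my_container" is a hypothetical container name.
print(build_logs_command("my_container"))
```

Building the argument list separately from running it keeps the command easy to inspect without a Docker daemon.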

Flashcard 7559792954636

Tags

#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data

Question

While preprocessing is an important tool to improve model performance, it artificially increases the dimensionality of the input vector. Also, the resulting binary features can be [...]. Both outcomes make it difficult to tell which action patterns in the underlying consumer histories have a strong impact on the prediction outcome

Answer

strongly correlated

Annotation 7559794789644

These so-called continual or lifelong learning systems, and in particular lifelong deep neural networks (L-DNN), were inspired by brain neurophysiology. These deep learning algorithms separate feature training and rule training and are able to add new rule information on the fly.

Deep Learning Has Reinvented Quality Control in Manufacturing—but It Hasn’t Gone Far Enough AI systems that make use of “lifelong learning” techniques are more flexible and faster to train
These so-called continual or lifelong learning systems, and in particular lifelong deep neural networks (L-DNN), were inspired by brain neurophysiology. These deep learning algorithms separate feature training and rule training and are able to add new rule information on the fly. While they still learn features slowly using a large and balanced data set, L-DNNs don't learn rules at this stage. And they don't need images of all known valve defects; the dataset can

Flashcard 7559796886796

Tags

#deep-learning #keras #lstm #python #sequence

Question

By default, the samples within an epoch are shuffled. This is a good practice when working with Multilayer Perceptron neural networks. If you are trying to preserve state across samples, then the order of samples in the training dataset may be important and must be preserved. This can be done by setting the [...] argument in the fit() function to False.

Answer

shuffle
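
To illustrate why sample order matters once state is carried from one sample to the next, here is a minimal sketch in plain Python. It does not use Keras: `epoch_order` and `running_state` are hypothetical helpers, and the "state" is a toy running value rather than an LSTM cell state.

```python
import random


def epoch_order(n_samples, shuffle=True, seed=None):
    """Return the order in which samples are visited during one epoch."""
    order = list(range(n_samples))
    if shuffle:
        random.Random(seed).shuffle(order)
    return order


def running_state(samples, order):
    """Toy stateful pass: the state after each sample depends on all earlier ones."""
    state, trace = 0.0, []
    for i in order:
        state = 0.5 * state + samples[i]
        trace.append(state)
    return trace


samples = [1.0, 2.0, 3.0, 4.0]  # hypothetical data
ordered = epoch_order(len(samples), shuffle=False)
shuffled = epoch_order(len(samples), shuffle=True, seed=0)

print(ordered, running_state(samples, ordered))
print(shuffled, running_state(samples, shuffled))
# In Keras the corresponding switch is the fit() argument: model.fit(..., shuffle=False)
```

With shuffling disabled the samples are visited in their original order, so the carried state reflects the true sequence.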

Flashcard 7559801343244

Tags

#has-images #recurrent-neural-networks #rnn

[unknown IMAGE 7101511240972]

Question

This particular individual makes a transaction in the first week, followed by one week of [...], then transacting for two consecutive weeks, and so on; in weeks 3 and 4 they also received some form of a marketing appeal.

Answer

inactivity

Annotation 7559803440396

#recurrent-neural-networks #rnn

The challenge for deep learning models of customer behavior remains their opaque nature and the lack of simple ways to interpret their behavior, which is especially true for the complex temporal dynamics of RNNs.

Parent (intermediate) annotation

The challenge for deep learning models of customer behavior remains their opaque nature and the lack of simple ways to interpret their behavior, which is especially true for the complex temporal dynamics of RNNs. Other frequently contended disadvantages are disappearing: Computational power is more affordable and efficient training methods are advancing at a fast pace, which also facilitates the

Flashcard 7559805275404

Tags

#has-images #recurrent-neural-networks #rnn

[unknown IMAGE 7101511240972]

Question

The two calendar components – the month and week [...] – represent time-varying contextual information which is shared across the individuals within a given cohort. In addition, in this example, we include also an individual time-invariant covariate (gender) and a time-varying, individual-level covariate (marketing appeals)

Answer

indicators

Annotation 7559807110412

#recurrent-neural-networks #rnn

As we have shown, it also accurately predicts periods of elevated transaction activity and captures other forms of purchase dynamics that can be leveraged in simulations of future sequences of customer transactions. We highlight our model’s flexibility and performance on two groups of valuable customers: those who keep making more and more transactions with the firm (denoted as ”opportunity” customers) and those who are at risk of defection. We demonstrate that the model also excels at automatically capturing seasonal trends in customer activity, such as the shopping period leading up to the December holidays. In Appendix Section F we provide a further characterization of scenarios where our model performs particularly well and where it does not do so relative to the used benchmark methods.

Parent (intermediate) annotation

roposed model informs managers on both short- and long-term forecasts of individual customer behavior and helps to timely uncover business opportunities as well as potential customer defection. As we have shown, it also accurately predicts periods of elevated transaction activity and captures other forms of purchase dynamics that can be leveraged in simulations of future sequences of customer transactions. We highlight our model’s flexibility and performance on two groups of valuable customers: those who keep making more and more transactions with the firm (denoted as ”opportunity” customers) and those who are at risk of defection. We demonstrate that the model also excels at automatically capturing seasonal trends in customer activity, such as the shopping period leading up to the December holidays. In Appendix Section F we provide a further characterization of scenarios where our model performs particularly well and where it does not do so relative to the used benchmark methods. The model brings many practical benefits for the marketing analyst, such as the lack of need for manual encoding of any features in the customer data, a simple optimization objective, a

Flashcard 7559808945420

Tags

#deep-learning #keras #lstm #python #sequence

Question

The promise of recurrent neural networks is that the temporal dependence and [...] information in the input data can be learned. A recurrent network whose inputs are not fixed but rather constitute an input sequence can be used to transform an input sequence into an output sequence while taking into account contextual information in a flexible way

Answer

contextual

Flashcard 7559811566860

Tags

#Docker

Question

Docker: view the log of an exited container

docker [...] -t [CONTAINER NAME]

Answer

logs

Flashcard 7559813401868

Tags

#feature-engineering #lstm #recurrent-neural-networks #rnn

Question

The [...] neural network topology is well-suited for modeling churn, especially in time-series format. However, its performance against standard churn prediction models remains an avenue for further research

Answer

LSTM

Flashcard 7559815236876

Tags

#deep-learning #keras #lstm #python #sequence

Question

Given that LSTMs operate on sequence data, it means that the addition of layers adds levels of abstraction of input observations over time. In effect, chunking observations over time or representing the problem at different time scales. ... building a deep RNN by stacking multiple recurrent hidden states on top of each other. This approach potentially allows the hidden state at each level to operate at different [...] — How to Construct Deep Recurrent Neural Networks, 2013

Answer

timescale

Flashcard 7559817071884

Tags

#bayes #programming #r #statistics

Question

The posterior distribution also shows the [...] in that estimated slope, because the distribution shows the relative credibility of values across the continuum.

Answer

uncertainty
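
One way to make this concrete is a grid approximation of the posterior over a slope. The sketch below is in Python rather than R, and everything in it is an assumption chosen for illustration: the data, a line through the origin, a known noise standard deviation of 1, and a flat prior over slopes from 0 to 4.

```python
import math

# Hypothetical data: y is roughly 2 * x plus noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
SIGMA = 1.0  # assumed known noise standard deviation


def slope_posterior(xs, ys, grid, sigma=SIGMA):
    """Posterior over the slope b in y = b * x, with a flat prior on the grid."""
    log_like = [
        -sum((y - b * x) ** 2 for x, y in zip(xs, ys)) / (2 * sigma ** 2)
        for b in grid
    ]
    top = max(log_like)  # subtract the maximum for numerical stability
    weights = [math.exp(v - top) for v in log_like]
    total = sum(weights)
    return [w / total for w in weights]


def credible_interval(grid, probs, mass=0.95):
    """Equal-tailed interval holding roughly `mass` of the posterior probability."""
    lower_tail = (1 - mass) / 2
    cumulative, low, high = 0.0, grid[0], grid[-1]
    found_low = False
    for b, p in zip(grid, probs):
        cumulative += p
        if not found_low and cumulative >= lower_tail:
            low, found_low = b, True
        if cumulative >= 1 - lower_tail:
            high = b
            break
    return low, high


grid = [i / 100 for i in range(0, 401)]  # candidate slopes 0.00 to 4.00
probs = slope_posterior(xs, ys, grid)
mean = sum(b * p for b, p in zip(grid, probs))
low, high = credible_interval(grid, probs)
print(f"posterior mean slope: {mean:.2f}, 95% interval: [{low:.2f}, {high:.2f}]")
```

The width of the interval is the uncertainty in the slope: each grid value carries a relative credibility, and the interval collects the most credible range.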

Flashcard 7559818906892

Tags

#feature-engineering #lstm #recurrent-neural-networks #rnn

Question

When an analyst uses [...] to predict behavior, the performance of the model will depend greatly on the analyst's domain knowledge

Answer

feature engineering

Flashcard 7559821004044

Tags

#deep-learning #keras #lstm #python #sequence

Question

One Hot Encoding

For categorical variables where no such ordinal relationship exists, the [...] encoding is not enough.

Answer

integer
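
A minimal sketch of the difference between the two encodings, in plain Python. The colour values and the helper names `integer_encode` and `one_hot_encode` are hypothetical; libraries such as scikit-learn or Keras provide their own utilities for this.

```python
def integer_encode(values):
    """Map each distinct category to an integer (sorted for a stable mapping)."""
    categories = sorted(set(values))
    index = {cat: i for i, cat in enumerate(categories)}
    return [index[v] for v in values], categories


def one_hot_encode(values):
    """Turn each category into a binary vector containing a single 1."""
    codes, categories = integer_encode(values)
    return [[1 if i == code else 0 for i in range(len(categories))] for code in codes]


colors = ["red", "green", "blue", "green"]  # hypothetical, unordered categories
print(integer_encode(colors)[0])  # integers imply an order (blue < green < red)
print(one_hot_encode(colors))     # one column per category, no implied order
```

The integer codes suggest that "red" is somehow greater than "blue", which is exactly the ordinal relationship the one-hot vectors avoid.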

Annotation 7559822839052

#deep-learning #keras #lstm #python #sequence

The LSTM expects input data to have the dimensions: samples, time steps, and features.
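
The three dimensions can be sketched with nested Python lists. This is an illustration only: `to_lstm_input` and `shape` are hypothetical helpers, and in practice the same result is usually obtained by reshaping a NumPy array to `(samples, time_steps, features)`.

```python
def to_lstm_input(series, time_steps, n_features=1):
    """Reshape a flat list into nested lists shaped [samples, time steps, features]."""
    per_sample = time_steps * n_features
    if len(series) % per_sample != 0:
        raise ValueError("series length must be a multiple of time_steps * n_features")
    samples = []
    for start in range(0, len(series), per_sample):
        chunk = series[start:start + per_sample]
        samples.append(
            [chunk[t * n_features:(t + 1) * n_features] for t in range(time_steps)]
        )
    return samples


def shape(nested):
    """Return (samples, time steps, features) of the nested list."""
    return len(nested), len(nested[0]), len(nested[0][0])


# Twelve hypothetical observations, three time steps per sample, one feature each.
data = to_lstm_input(list(range(12)), time_steps=3)
print(shape(data))
print(data[0])
```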

Parent (intermediate) annotation

The LSTM expects input data to have the dimensions: samples, time steps, and features. It is the second dimension of this input format, the time steps, that defines the number of time steps used for forward and backward passes on your sequence prediction problem

Annotation 7559824674060

#data-science #infrastructure

Incidental complexity is a huge problem for real-world data science because we have to deal with such a high level of inherent complexity that distinguishing between real problems and imaginary problems becomes hard

Parent (intermediate) annotation

we must avoid introducing incidental complexity, or complexity that is not necessitated by the problem itself but is an unwanted artifact of a chosen approach. Incidental complexity is a huge problem for real-world data science because we have to deal with such a high level of inherent complexity that distinguishing between real problems and imaginary problems becomes hard

Flashcard 7559826509068

Tags

#data-science #infrastructure

Question

An effective infrastructure helps to expose and manage inherent complexity, which is the natural state of the world we live in, while making a conscious effort to avoid introducing [...] complexity. Doing this well is hard and requires constant judgment.

Answer

incidental

Flashcard 7559829392652

Tags

#data-science #infrastructure

Question

to conduct data science projects, a common infrastructure can help to increase the number of projects that can be executed [...] (volume), speed up the time to market (velocity), ensure that the results are robust (validity), and make it possible to support a larger variety of projects

Answer

simultaneously

Annotation 7559834373388

#English #vocabulary

arduous

/ˈɑːdjʊəs,ˈɑːdʒʊəs/

adjective

involving or requiring strenuous effort; difficult and tiring

"an arduous journey"

Similar: onerous, taxing, difficult, hard, heavy, laborious, burdensome

Flashcard 7559836470540

Tags

#data-science #infrastructure

Question

data scientists and engineers are expected to build end-to-end solutions to business problems, of which models are a small but important part. Because this book focuses on end-to-end solutions, we say that the data scientist’s job is to build data science [...].

Answer

applications

Annotation 7559838567692

#deep-learning #keras #lstm #python #sequence

LSTMs may not be ideal for all sequence prediction problems. For example, in time series forecasting, often the information relevant for making a forecast is within a small window of past observations
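
The "small window of past observations" can be sketched as a sliding-window transform that turns a series into (window, next value) pairs. The data, the helper names, and the window-average baseline are all assumptions for illustration; any simple model could be fitted to the pairs instead.

```python
def sliding_windows(series, window):
    """Split a series into (past window, next value) training pairs."""
    if window < 1 or window >= len(series):
        raise ValueError("window must be between 1 and len(series) - 1")
    inputs, targets = [], []
    for end in range(window, len(series)):
        inputs.append(series[end - window:end])
        targets.append(series[end])
    return inputs, targets


def mean_forecast(window_values):
    """Baseline forecast: the average of the window (a stand-in for a simple model)."""
    return sum(window_values) / len(window_values)


series = [10, 12, 13, 15, 16, 18]  # hypothetical observations
X, y = sliding_windows(series, window=3)
print(X[0], "->", y[0])
print(mean_forecast(series[-3:]))
```

Each row of `X` is a fixed-length input, so a model without recurrent state can be trained on it directly.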

Parent (intermediate) annotation

LSTMs may not be ideal for all sequence prediction problems. For example, in time series forecasting, often the information relevant for making a forecast is within a small window of past observations. Often an MLP with a window or a linear model may be a less complex and more suitable model

Edited, memorised or added to reading queue

on 18-Jan-2023 (Wed)
