Edited, memorised or added to reading queue

on 18-Jan-2023 (Wed)


#RNN #ariadne #behaviour #consumer #deep-learning #priority #retail #simulation #synthetic-data
Most retail/e-retail brands plan their short-term inventory (2-4 weeks ahead) based on consumer purchase patterns. Also, certain sales and marketing strategies, such as offer personalization and personalized item recommendations, rely on predictions of consumer purchases for the near future. Given that every demand planner works on a narrow segment of the item portfolio, there is high variability in the choices that different planners recommend. Additionally, demand planners might not get enough opportunities to discuss their views and insights on their recommendations. Hence, subtle effects like cannibalization [21] and item affinity remain unaccounted for. Such inefficiencies lead to a gap between consumer needs and item availability, resulting in lost business opportunities in the form of consumer churn, out-of-stock items, and excess inventory.




#ML-engineering #ML_in_Action #learning #machine #software-engineering
The aim of ML engineering is not to iterate through the lists of skills just mentioned and require that a data scientist (DS) master each of them. Instead, ML engineering collects certain aspects of those skills, carefully crafted to be relevant to data scientists, all with the goal of increasing the chances of getting an ML project into production and making sure that it’s not a solution that needs constant maintenance and intervention to keep running




#ML-engineering #ML_in_Action #learning #machine #software-engineering
Throughout this first part of the book, we’ll discuss how to identify the reasons so many projects fail, are abandoned, or take far longer than they should to reach production. We’ll also discuss the solutions to each of these common failures and cover the processes that can significantly lower the chances of these factors derailing your projects. Generally, these failures happen because the DS team is either inexperienced with solving a problem of the scale required (a technological or process-driven failure) or hasn’t fully understood the desired outcome from the business (a communication-driven failure). I’ve never seen this happen because of malicious intent. Rather, most ML projects are incredibly challenging, complex, and composed of algorithmic software tooling that is hard to explain to a layperson, hence the breakdowns in communication with business units that most projects endure.




#ML-engineering #ML_in_Action #learning #machine #software-engineering
Project scoping for ML is incredibly challenging. Even for the most seasoned ML veterans, conjecturing how long a project will take, which approach is going to be most successful, and the amount of resources required is a futile and frustrating exercise. The risk associated with making erroneous claims is fairly high, but structuring proper scoping and solution research can help minimize the chances of being wildly off on estimation.




#ML-engineering #ML_in_Action #learning #machine #software-engineering
Most companies have a mix of the types of people in this hyperbolic scenario. Some are academics whose sole goal is to further the advancement of knowledge and research into algorithms, paving the way for future discoveries from within the industry. Others are “applications of ML” engineers who just want to use ML as a tool to solve a business problem. It’s important to embrace and balance both aspects of these philosophies toward ML work, strike a compromise during the research and scoping phase of a project, and know that the middle ground here is the best path to trod upon to ensure that a project actually makes it to production.




#data-science #infrastructure
to conduct data science projects, a common infrastructure can help to increase the number of projects that can be executed simultaneously (volume), speed up the time to market (velocity), ensure that the results are robust (validity), and make it possible to support a larger variety of projects




#data-science #infrastructure
We will systematically go through the stack of systems that make a modern, effective infrastructure for data science. The principles covered in this book are not specific to any particular implementation, but we will use an open source framework, Metaflow, to show how the ideas can be put into practice. Alternatively, you can customize your own solution by using other off-the-shelf libraries. This book will help you to choose the right set of tools for the job




#Docker

Docker: view the log of an exited container (with timestamps)

docker logs -t [CONTAINER NAME]




Flashcard 7559792954636

Tags
#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
Question
While preprocessing is an important tool to improve model performance, it artificially increases the dimensionality of the input vector. Also, the resulting binary features can be [...]. Both outcomes make it difficult to tell which action patterns in the underlying consumer histories have a strong impact on the prediction outcome
Answer
strongly correlated








These so-called continual or lifelong learning systems, and in particular lifelong deep neural networks (L-DNN), were inspired by brain neurophysiology. These deep learning algorithms separate feature training and rule training and are able to add new rule information on the fly.

Deep Learning Has Reinvented Quality Control in Manufacturing—but It Hasn’t Gone Far Enough AI systems that make use of “lifelong learning” techniques are more flexible and faster to train
These so-called continual or lifelong learning systems, and in particular lifelong deep neural networks (L-DNN), were inspired by brain neurophysiology. These deep learning algorithms separate feature training and rule training and are able to add new rule information on the fly. While they still learn features slowly using a large and balanced data set, L-DNNs don't learn rules at this stage. And they don't need images of all known valve defects; the dataset can




Flashcard 7559796886796

Tags
#deep-learning #keras #lstm #python #sequence
Question
By default, the samples within an epoch are shuffled. This is a good practice when working with Multilayer Perceptron neural networks. If you are trying to preserve state across samples, then the order of samples in the training dataset may be important and must be preserved. This can be done by setting the [...] argument in the fit() function to False.
Answer
shuffle
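
A minimal sketch of how the `shuffle` argument might be passed to `fit()`, assuming the Keras bundled with TensorFlow is installed. The toy data, layer sizes, and variable names are illustrative assumptions and do not come from the source. The model here is not stateful; preserving state across samples would need further configuration that this sketch does not show.

```python
import numpy as np
from tensorflow import keras

# Toy data (hypothetical): 8 samples, 3 time steps, 1 feature each.
X = np.arange(24, dtype="float32").reshape(8, 3, 1)
y = X.sum(axis=(1, 2)).reshape(8, 1)

model = keras.Sequential([
    keras.Input(shape=(3, 1)),
    keras.layers.LSTM(4),
    keras.layers.Dense(1),
])
model.compile(loss="mse", optimizer="adam")

# shuffle=False keeps the samples in their original order within each epoch.
history = model.fit(X, y, epochs=2, batch_size=1, shuffle=False, verbose=0)
print(len(history.history["loss"]))  # one loss value per epoch
```

Leaving `shuffle` at its default of `True` would reorder the eight samples on every epoch, which is the behaviour the card warns about when sample order matters.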








Flashcard 7559801343244

Tags
#has-images #recurrent-neural-networks #rnn
[unknown IMAGE 7101511240972]
Question
This particular individual makes a transaction in the first week, followed by one week of [...], then transacting for two consecutive weeks, and so on; in weeks 3 and 4 they also received some form of a marketing appeal.
Answer
inactivity








#recurrent-neural-networks #rnn
The challenge for deep learning models of customer behavior remains their opaque nature and the lack of simple ways to interpret their behavior, which is especially true for the complex temporal dynamics of RNNs.


Parent (intermediate) annotation

Open it
The challenge for deep learning models of customer behavior remains their opaque nature and the lack of simple ways to interpret their behavior, which is especially true for the complex temporal dynamics of RNNs. Other frequently contended disadvantages are disappearing: Computational power is more affordable and efficient training methods are advancing at a fast pace, which also facilitates the





Flashcard 7559805275404

Tags
#has-images #recurrent-neural-networks #rnn
[unknown IMAGE 7101511240972]
Question
The two calendar components – the month and week [...] – represent time-varying contextual information which is shared across the individuals within a given cohort. In addition, in this example, we include also an individual time-invariant covariate (gender) and a time-varying, individual-level covariate (marketing appeals)
Answer
indicators
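
To make the four kinds of input concrete, here is a rough sketch of how they might be laid out for one customer over four weeks, using NumPy. All values, the integer coding of gender, and the use of plain integer ids for month and week are assumptions made for illustration; the excerpt does not say how the indicators are encoded for the network.

```python
import numpy as np

n_weeks = 4

# Hypothetical values for one customer over four weeks.
transactions = np.array([1, 0, 1, 1])  # the individual's transaction sequence
marketing = np.array([0, 0, 1, 1])     # time-varying, individual-level covariate
gender = 1                             # time-invariant covariate (arbitrary coding)

# Calendar context shared across the cohort (assumed: weeks 1-4 of month 1).
month = np.array([1, 1, 1, 1])
week = np.array([1, 2, 3, 4])

# One row per week; the time-invariant covariate is repeated at every step.
features = np.column_stack([
    transactions,
    marketing,
    np.full(n_weeks, gender),
    month,
    week,
])
print(features.shape)  # (4, 5)
```

Only the transaction and marketing columns differ between customers week by week; the calendar columns would be identical for every member of the cohort.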








#recurrent-neural-networks #rnn
As we have shown, it also accurately predicts periods of elevated transaction activity and captures other forms of purchase dynamics that can be leveraged in simulations of future sequences of customer transactions. We highlight our model’s flexibility and performance on two groups of valuable customers: those who keep making more and more transactions with the firm (denoted as “opportunity” customers) and those who are at risk of defection. We demonstrate that the model also excels at automatically capturing seasonal trends in customer activity, such as the shopping period leading up to the December holidays. In Appendix Section F we provide a further characterization of scenarios where our model performs particularly well and where it does not do so relative to the used benchmark methods.


Parent (intermediate) annotation

Open it
roposed model informs managers on both short- and long-term forecasts of individual customer behavior and helps to timely uncover business opportunities as well as potential customer defection. As we have shown, it also accurately predicts periods of elevated transaction activity and captures other forms of purchase dynamics that can be leveraged in simulations of future sequences of customer transactions. We highlight our model’s flexibility and performance on two groups of valuable customers: those who keep making more and more transactions with the firm (denoted as “opportunity” customers) and those who are at risk of defection. We demonstrate that the model also excels at automatically capturing seasonal trends in customer activity, such as the shopping period leading up to the December holidays. In Appendix Section F we provide a further characterization of scenarios where our model performs particularly well and where it does not do so relative to the used benchmark methods. The model brings many practical benefits for the marketing analyst, such as the lack of need for manual encoding of any features in the customer data, a simple optimization objective, a





Flashcard 7559808945420

Tags
#deep-learning #keras #lstm #python #sequence
Question
The promise of recurrent neural networks is that the temporal dependence and [...] information in the input data can be learned. A recurrent network whose inputs are not fixed but rather constitute an input sequence can be used to transform an input sequence into an output sequence while taking into account contextual information in a flexible way
Answer
contextual








Flashcard 7559811566860

Tags
#Docker
Question

Docker: view the log of an exited container

docker [...] -t [CONTAINER NAME]
Answer
logs








Flashcard 7559813401868

Tags
#feature-engineering #lstm #recurrent-neural-networks #rnn
Question
The [...] neural network typology is well-suited for modeling churn, especially in time-series format. However, its performance against standard churn prediction models remains an avenue for further research
Answer
LSTM








Flashcard 7559815236876

Tags
#deep-learning #keras #lstm #python #sequence
Question
Given that LSTMs operate on sequence data, it means that the addition of layers adds levels of abstraction of input observations over time. In effect, chunking observations over time or representing the problem at different time scales. ... building a deep RNN by stacking multiple recurrent hidden states on top of each other. This approach potentially allows the hidden state at each level to operate at different [...] — How to Construct Deep Recurrent Neural Networks, 2013
Answer
timescale








Flashcard 7559817071884

Tags
#bayes #programming #r #statistics
Question
The posterior distribution also shows the [...] in that estimated slope, because the distribution shows the relative credibility of values across the continuum.
Answer
uncertainty








Flashcard 7559818906892

Tags
#feature-engineering #lstm #recurrent-neural-networks #rnn
Question
When an analyst uses [...] to predict behavior, the performance of the model will depend greatly on the analyst's domain knowledge
Answer
feature engineering








Flashcard 7559821004044

Tags
#deep-learning #keras #lstm #python #sequence
Question

One Hot Encoding

For categorical variables where no such ordinal relationship exists, the [...] encoding is not enough.

Answer
integer
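
A small sketch of why integer encoding alone falls short and what one hot encoding adds, in plain Python. The colour values and variable names are hypothetical and chosen only for illustration.

```python
# Hypothetical categorical values with no natural order.
colors = ["red", "green", "blue", "green"]

# Step 1: integer encoding assigns an arbitrary index to each category.
categories = sorted(set(colors))  # ['blue', 'green', 'red']
to_int = {c: i for i, c in enumerate(categories)}
integer_encoded = [to_int[c] for c in colors]

# Step 2: one hot encoding replaces each integer with a binary vector,
# so no category looks "larger" than another.
one_hot = [
    [1 if i == idx else 0 for i in range(len(categories))]
    for idx in integer_encoded
]

print(integer_encoded)  # [2, 1, 0, 1]
print(one_hot)          # [[0, 0, 1], [0, 1, 0], [1, 0, 0], [0, 1, 0]]
```

With integers alone, a model could read "red" (2) as greater than "blue" (0); the binary vectors carry no such ordering.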








#deep-learning #keras #lstm #python #sequence
The LSTM expects input data to have the dimensions: samples, time steps, and features.


Parent (intermediate) annotation

Open it
The LSTM expects input data to have the dimensions: samples, time steps, and features. It is the second dimension of this input format, the time steps, that defines the number of time steps used for forward and backward passes on your sequence prediction problem.
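
As an illustration of the three-dimensional layout, the sketch below reshapes a flat series into `(samples, time steps, features)` with NumPy. The series and the two ways of splitting it are assumptions made for the example.

```python
import numpy as np

# Hypothetical univariate series of 10 observations.
series = np.arange(10, dtype="float32")

# One sample made of 10 time steps, each with 1 feature.
one_sample = series.reshape(1, 10, 1)

# Alternatively, 2 samples of 5 time steps each.
two_samples = series.reshape(2, 5, 1)

print(one_sample.shape)   # (1, 10, 1)
print(two_samples.shape)  # (2, 5, 1)
```

The same ten values yield different numbers of time steps depending on the reshape, which is the second dimension the note refers to.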





#data-science #infrastructure
Incidental complexity is a huge problem for real-world data science because we have to deal with such a high level of inherent complexity that distinguishing between real problems and imaginary problems becomes hard


Parent (intermediate) annotation

Open it
we must avoid introducing incidental complexity, or complexity that is not necessitated by the problem itself but is an unwanted artifact of a chosen approach. Incidental complexity is a huge problem for real-world data science because we have to deal with such a high level of inherent complexity that distinguishing between real problems and imaginary problems becomes hard





Flashcard 7559826509068

Tags
#data-science #infrastructure
Question
An effective infrastructure helps to expose and manage inherent complexity, which is the natural state of the world we live in, while making a conscious effort to avoid introducing [...] complexity. Doing this well is hard and requires constant judgment.
Answer
incidental








Flashcard 7559829392652

Tags
#data-science #infrastructure
Question
to conduct data science projects, a common infrastructure can help to increase the number of projects that can be executed [...] (volume), speed up the time to market (velocity), ensure that the results are robust (validity), and make it possible to support a larger variety of projects
Answer
simultaneously








#English #vocabulary

arduous

/ˈɑːdjʊəs,ˈɑːdʒʊəs/


adjective

  1. involving or requiring strenuous effort; difficult and tiring

"an arduous journey"

Similar: onerous, taxing, difficult, hard, heavy, laborious, burdensome




Flashcard 7559836470540

Tags
#data-science #infrastructure
Question
data scientists and engineers are expected to build end-to-end solutions to business problems, of which models are a small but important part. Because this book focuses on end-to-end solutions, we say that the data scientist’s job is to build data science [...].
Answer
applications








#deep-learning #keras #lstm #python #sequence
LSTMs may not be ideal for all sequence prediction problems. For example, in time series forecasting, often the information relevant for making a forecast is within a small window of past observations


Parent (intermediate) annotation

Open it
LSTMs may not be ideal for all sequence prediction problems. For example, in time series forecasting, often the information relevant for making a forecast is within a small window of past observations. Often an MLP with a window or a linear model may be a less complex and more suitable model
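
The windowed alternative can be sketched as follows: a fixed window of past observations becomes the input row and the next observation becomes the target, after which any simple model can be fitted. The helper name `make_windows`, the toy series, and the choice of ordinary least squares via NumPy are assumptions for illustration only.

```python
import numpy as np

def make_windows(series, window):
    """Turn a 1-D series into (X, y): `window` past values -> next value."""
    X = np.array([series[i:i + window] for i in range(len(series) - window)])
    y = np.array(series[window:])
    return X, y

# Hypothetical series: 0, 2, ..., 18.
series = np.arange(10, dtype="float64") * 2.0
X, y = make_windows(series, window=3)

# Fit an ordinary least-squares linear model on the windowed inputs (with bias).
A = np.column_stack([X, np.ones(len(X))])
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
pred = A @ coef

print(X.shape, y.shape)  # (7, 3) (7,)
```

The same `X` and `y` could be passed to an MLP instead of the linear fit; the windowing step is what replaces the recurrent model's memory.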
