Edited, memorised or added to reading queue

on 19-Aug-2022 (Fri)






#ML-engineering #ML_in_Action #learning #machine #software-engineering
ML work is, after all, about solving a problem. The most effective way to solve the business problems that we’re all tasked with as data science (DS) practitioners is to follow a process designed to prevent rework, confusion, and complexity. If you embrace the concepts of ML engineering and follow the road of effective project work, the path to a useful modeling solution can be shorter and far cheaper, with a much higher probability of success than if you just wing it and hope for the best.




#ML-engineering #ML_in_Action #learning #machine #software-engineering
Data scientists are also expected to be familiar with additional realms of competency. From mid-level data engineering (DE) skills (you have to get your data for your data science from somewhere, right?) to software development, project management, visualization, and presentation skills, the list grows ever longer, and the volume of experience that needs to be gained becomes rather daunting. Considering all of this, it’s not much of a surprise that “just figuring it out” is an untenable way to acquire all the skills required to create production-grade ML solutions.




#ML-engineering #ML_in_Action #learning #machine #software-engineering
The aim of ML engineering is not to iterate through the lists of skills just mentioned and require that a data scientist (DS) master each of them. Instead, ML engineering collects certain aspects of those skills, carefully chosen to be relevant to data scientists, all with the goal of increasing the chances of getting an ML project into production and making sure that it’s not a solution that needs constant maintenance and intervention to keep running.




#ML-engineering #ML_in_Action #learning #machine #software-engineering
ML engineers need just enough software development skills to write modular code and implement unit tests. They don’t need to know the intricacies of non-blocking asynchronous message brokering. They need just enough data engineering skills to build (and schedule the ETL for) feature datasets for their models, but not to construct a petabyte-scale streaming ingestion framework. They need just enough visualization skills to create plots and charts that communicate clearly what their research and models are doing, but not to develop dynamic web apps that have complex user experience (UX) components. They also need just enough project management experience to know how to properly define, scope, and control a project to solve a problem, but they need not go through a Project Management Professional (PMP) certification.




#ML-engineering #ML_in_Action #has-images #learning #machine #software-engineering
Figure 1.2 depicts rough estimates of what I’ve come to see as the six primary reasons projects fail (and the rates of these failures in any given industry, from my experience, are truly surprising).




#ML-engineering #ML_in_Action #learning #machine #software-engineering
Throughout this first part of the book, we’ll discuss how to identify the reasons so many projects fail, are abandoned, or take far longer than they should to reach production. We’ll also discuss the solutions to each of these common failures and cover the processes that can significantly lower the chances of these factors derailing your projects. Generally, these failures happen because the DS team is either inexperienced with solving a problem of the scale required (a technological or process-driven failure) or hasn’t fully understood the desired outcome from the business (a communication-driven failure). I’ve never seen this happen because of malicious intent. Rather, most ML projects are incredibly challenging, complex, and composed of algorithmic software tooling that is hard to explain to a layperson, hence the breakdowns in communication with business units that most projects endure.




Flashcard 7545573477644

Tags
#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
Question
Consumer histories are inherently [...] and of varying lengths T, making RNNs a natural model choice.
Answer
sequential
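
To illustrate why varying lengths T are not a problem for a recurrent model, here is a minimal sketch of a plain (Elman-style) RNN written with the standard library only. The function name, the weight shapes, and the random "histories" are all assumptions made for illustration and do not come from the source paper. The same weights are reused at every step, so a history of 5 events and one of 12 events both reduce to a hidden state of the same size.

```python
import math
import random

def rnn_final_state(sequence, W_x, W_h, b):
    """Run a plain RNN over one sequence and return the last hidden state."""
    n_hidden = len(b)
    h = [0.0] * n_hidden
    for x_t in sequence:  # one update per event, however many events there are
        h = [
            math.tanh(
                sum(W_x[i][j] * x_t[j] for j in range(len(x_t)))
                + sum(W_h[i][k] * h[k] for k in range(n_hidden))
                + b[i]
            )
            for i in range(n_hidden)
        ]
    return h

random.seed(0)
n_features, n_hidden = 3, 4  # hypothetical sizes
W_x = [[random.gauss(0, 0.5) for _ in range(n_features)] for _ in range(n_hidden)]
W_h = [[random.gauss(0, 0.5) for _ in range(n_hidden)] for _ in range(n_hidden)]
b = [0.0] * n_hidden

def random_history(T):
    """Hypothetical consumer history: T events with n_features values each."""
    return [[random.gauss(0, 1) for _ in range(n_features)] for _ in range(T)]

# Two histories of different lengths are summarised by same-sized states.
histories = [random_history(5), random_history(12)]
states = [rnn_final_state(hist, W_x, W_h, b) for hist in histories]
```

A vector-based method would instead need every history padded, truncated, or hand-aggregated to one fixed width before modelling.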








Flashcard 7545578196236

Tags
#DAG #causal #edx
Question
Inverse probability [...] is in fact just one of the group of so-called G-methods.
Answer
weighting
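
As a rough illustration of inverse probability weighting, the sketch below assumes a binary treatment A, an outcome Y, and a single discrete confounder L, and estimates the treatment probability nonparametrically from stratum frequencies. The function name and the toy rows are hypothetical. Each unit is weighted by the inverse of the probability of the treatment it actually received given its L, and the weighted outcome means of the two arms are compared. The sketch also assumes both arms are non-empty.

```python
from collections import defaultdict

def ipw_effect(rows):
    """rows: list of (L, A, Y) with A in {0, 1}.
    Returns the IP-weighted difference in mean outcome (treated minus untreated)."""
    n_by_l = defaultdict(int)
    n_treated_by_l = defaultdict(int)
    for l, a, _ in rows:
        n_by_l[l] += 1
        n_treated_by_l[l] += a

    num = {0: 0.0, 1: 0.0}
    den = {0: 0.0, 1: 0.0}
    for l, a, y in rows:
        p_treated = n_treated_by_l[l] / n_by_l[l]          # estimated P(A=1 | L=l)
        p_received = p_treated if a == 1 else 1.0 - p_treated
        w = 1.0 / p_received                               # inverse probability weight
        num[a] += w * y
        den[a] += w
    return num[1] / den[1] - num[0] / den[0]

# Toy data: treatment is rare when L=0 and common when L=1.
rows = [
    (0, 1, 1), (0, 0, 0), (0, 0, 1), (0, 0, 0),
    (1, 1, 1), (1, 1, 1), (1, 1, 0), (1, 0, 0),
]
effect = ipw_effect(rows)
```

On these toy rows the weighted estimate is 2/3, while the unweighted difference in means is 1/2. The gap is the part of the crude comparison attributable to L being distributed differently in the two arms.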







Flashcard 7545580817676

Tags
#causality #statistics
Question
Conditional exchangeability is the main assumption necessary for causal inference. Armed with this assumption, we can identify the causal effect within [...] of 𝑋
Answer
levels
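
A small sketch of what "within levels of 𝑋" means in practice: under conditional exchangeability, comparing treated and untreated units that share the same value of 𝑋 gives the causal effect for that level. The treatment name t, outcome name y, function name, and toy rows below are assumptions for illustration only.

```python
from collections import defaultdict
from statistics import mean

def effects_within_levels(rows):
    """rows: list of (x, t, y) with t in {0, 1}.
    Returns {x: mean(y | t=1, x) - mean(y | t=0, x)} for each level of x."""
    outcomes = defaultdict(lambda: {0: [], 1: []})
    for x, t, y in rows:
        outcomes[x][t].append(y)
    return {
        x: mean(by_t[1]) - mean(by_t[0])
        for x, by_t in outcomes.items()
        if by_t[0] and by_t[1]  # skip levels missing one group (no comparison possible)
    }

rows = [
    ("a", 1, 3), ("a", 1, 5), ("a", 0, 1), ("a", 0, 1),
    ("b", 1, 10), ("b", 0, 4), ("b", 0, 6),
    ("c", 1, 7),  # level "c" has no untreated units
]
per_level = effects_within_levels(rows)
```

Level "c" is dropped because it has no untreated units, which is a reminder that the comparison also needs both treatment groups to be present at every level of interest.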








Flashcard 7545588681996

Tags
#abm #agent-based #machine-learning #model #priority
Question

The goal of the presented framework is to provide a universal technique for agent-based models, in which the decision making process of the agents is not determined by theory-driven or empirically found rules, but rather by an Artificial Neural Network. The process itself can be separated into four phases:

(1) Initialization

(2) [...]

(3) Training

(4) Application

Answer
Experience
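
The four phases can be sketched loosely as follows. This is not the framework's implementation: a single sigmoid neuron stands in for the Artificial Neural Network, and the environment rule, learning rate, and all names are hypothetical choices made only to keep the example short and runnable.

```python
import math
import random

random.seed(1)

def predict(weights, features):
    """Single sigmoid neuron standing in for the agents' neural network."""
    z = sum(w * f for w, f in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def payoff_is_good(signal):
    """Hypothetical environment: acting pays off only when the signal is positive."""
    return 1.0 if signal > 0 else 0.0

# (1) Initialization: the network starts with uninformative weights [bias, signal].
weights = [0.0, 0.0]

# (2) Experience: agents encounter random situations and record (observation, outcome).
experience = []
for _ in range(200):
    signal = random.uniform(-1, 1)
    experience.append(([1.0, signal], payoff_is_good(signal)))

# (3) Training: fit the network to the recorded experience with plain SGD.
for _ in range(50):
    for features, target in experience:
        error = predict(weights, features) - target
        weights = [w - 0.5 * error * f for w, f in zip(weights, features)]

# (4) Application: agents consult the trained network instead of a hand-written rule.
def agent_acts(signal):
    return predict(weights, [1.0, signal]) > 0.5
```

After training, `agent_acts` reproduces the environment's rule without that rule ever being coded into the agent, which is the point of replacing theory-driven rules with a learned model.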








Flashcard 7545590779148

Tags
#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
Question
In this paper, we show that recurrent neural networks (RNNs) are a promising way to overcome both shortcomings of vector-based methods: tedious feature engineering and lack of [...].
Answer
explainability








Flashcard 7545602575628

Tags
#deep-learning #keras #lstm #python #sequence
Question
The Long Short-Term Memory, or LSTM, network is a type of [...] Neural Network.
Answer
Recurrent








Flashcard 7545604934924

Tags
#deep-learning #keras #lstm #python #sequence
Question
LSTM cells are comprised of weights and [...]
Answer
gates
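
To make "weights and gates" concrete, here is a sketch of one time step of a single-unit LSTM cell using the commonly taught formulation with forget, input, and output gates. It uses only the standard library. The parameter dictionary layout and the numeric values are hypothetical, and real layers (for example in Keras) operate on vectors and matrices, not scalars.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def lstm_step(x, h_prev, c_prev, p):
    """One time step of a scalar LSTM cell.
    p maps each name to its weights: (input weight, recurrent weight, bias)."""
    def pre(name):
        w_x, w_h, b = p[name]
        return w_x * x + w_h * h_prev + b

    f = sigmoid(pre("forget"))       # gate: how much of the old cell state to keep
    i = sigmoid(pre("input"))        # gate: how much of the candidate to write
    o = sigmoid(pre("output"))       # gate: how much of the cell state to expose
    g = math.tanh(pre("candidate"))  # candidate value to store
    c = f * c_prev + i * g           # new cell state
    h = o * math.tanh(c)             # new hidden state (the cell's output)
    return h, c

# Hypothetical weights, chosen only so the example runs.
params = {
    "forget": (0.5, 0.1, 1.0),
    "input": (0.4, -0.2, 0.0),
    "output": (0.3, 0.2, 0.0),
    "candidate": (0.9, 0.5, 0.0),
}
h, c = 0.0, 0.0
for x in [0.5, -1.0, 0.25]:
    h, c = lstm_step(x, h, c, params)
```

Each gate is itself computed from weights, so training the weights is what teaches the cell when to remember, when to write, and when to output.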








Flashcard 7545608867084

Tags
#deep-learning #keras #lstm #python #sequence
Question
Batch: A pass through a subset of samples in the training dataset, after which the network weights are updated. One [...] is comprised of one or more batches.
Answer
epoch
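
The relationship between batches and epochs can be sketched as a pair of nested loops. The sample count, batch size, and helper name below are arbitrary assumptions, and the weight update is replaced by a counter so the example stays self-contained.

```python
def iterate_batches(samples, batch_size):
    """Yield consecutive slices of the dataset; the last one may be smaller."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

samples = list(range(10))      # stand-in for 10 training samples
n_epochs, batch_size = 3, 4

weight_updates = 0
for epoch in range(n_epochs):                       # one epoch = one pass over all samples
    for batch in iterate_batches(samples, batch_size):
        weight_updates += 1                         # stand-in for one update on this batch
```

With 10 samples and a batch size of 4, each epoch consists of three batches (4, 4, and 2 samples), so three epochs perform nine weight updates.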








Flashcard 7545618566412

Tags
#DAG #causal #edx #has-images
Question
In those cases, it is generally better [...] for L
Answer
to adjust








Flashcard 7545620401420

Question
3. [...]: Surveys often fail to reveal the root causes of customer sentiment. In fact, scores can vary based on many outside factors, including geographical bias and industry shocks, making it difficult to perform reliable root-cause analysis using surveys alone
Answer
Ambiguous








Flashcard 7545625120012

Tags
#RNN #ariadne #behaviour #consumer #deep-learning #priority #recurrent-neural-networks #retail #simulation #synthetic-data
Question
Consumer behavior is inherently [...], which makes RNNs a perfect fit.
Answer
sequential








Flashcard 7545639537932

Tags
#abm #agent-based #machine-learning #model #priority #synergistic-integration
Question
ABM is a [...] modeling approach in which every agent of the system, theoretically, can be simulated to any level of granularity.
Answer
bottom-up








Flashcard 7545645042956

Tags
#causality #statistics
Question
[...] is satisfied if unit (individual) 𝑖’s outcome is simply a function of unit 𝑖’s treatment.
Answer
SUTVA








Flashcard 7545648188684

Tags
#deep-learning #keras #lstm #python #sequence
Question
[...]: One pass through all samples in the training dataset and updating the network weights. LSTMs may be trained for tens, hundreds, or thousands of [...].
Answer
Epoch
