Edited, memorised or added to reading queue

on 07-Oct-2022 (Fri)

#data-science #infrastructure

Details of the life cycle will naturally vary between companies and projects: how you develop a predictive model for customer lifetime value differs greatly from how you build self-driving cars. However, all data science and machine learning projects have the following key elements in common:

1. From a technical point of view, all projects involve data and computation at their foundation.

2. This book focuses on practical applications of these techniques instead of pure research, so we expect that all projects will eventually need to address the question of integrating results into production systems, which typically involves a great deal of software engineering.

3. Finally, from the human point of view, all projects involve experimentation and iteration, which many consider to be the central activity of data science.

#data-science #infrastructure
to conduct data science projects, a common infrastructure can help increase the number of projects that can be executed simultaneously (volume), speed up the time to market (velocity), ensure that the results are robust (validity), and make it possible to support a larger variety of projects (variety).

#data-science #infrastructure
We will systematically go through the stack of systems that make up a modern, effective infrastructure for data science. The principles covered in this book are not specific to any particular implementation, but we will use an open source framework, Metaflow, to show how the ideas can be put into practice. Alternatively, you can customize your own solution by using other off-the-shelf libraries. This book will help you choose the right set of tools for the job.

#data-science #infrastructure
The goal of the stack, which is introduced in the next section, is to unlock the four Vs: it should enable a greater volume and variety of projects, delivered with a higher velocity, without compromising validity of results. However, the stack doesn’t deliver projects by itself: successful projects are delivered by data scientists whose productivity is hopefully greatly improved by the stack.

[Image not available] #data-science #has-images #infrastructure