In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to rational and real numbers.

Field (mathematics) - Wikipedia
<span>In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to rational and real numbers. A field is thus a fundamental algebraic structure, which is widely used in algebra, number theory and many other areas of mathematics. The best known fields are the field of rational
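
To make the definition concrete, here is a minimal sketch that checks the four operations on two sets that are fields: the rationals (via Python's `fractions.Fraction`) and the integers modulo a prime. The helper names `mod_inverse` and `mod_divide` and the prime 7 are illustrative assumptions, not part of the quoted article.

```python
from fractions import Fraction


def mod_inverse(a: int, p: int) -> int:
    """Multiplicative inverse of a modulo a prime p (requires a % p != 0)."""
    if a % p == 0:
        raise ZeroDivisionError("0 has no multiplicative inverse")
    return pow(a, -1, p)  # modular inverse, available since Python 3.8


def mod_divide(a: int, b: int, p: int) -> int:
    """Division in the integers modulo p: multiply by the inverse of b."""
    return (a * mod_inverse(b, p)) % p


# Rationals: all four operations stay inside the set (division by zero excluded).
x, y = Fraction(3, 4), Fraction(-2, 5)
results = [x + y, x - y, x * y, x / y]
assert all(isinstance(r, Fraction) for r in results)

# Integers modulo 7 (hypothetical choice of prime): dividing by a nonzero
# element and multiplying back returns the original value.
P = 7
assert all(
    (mod_divide(a, b, P) * b) % P == a
    for a in range(P)
    for b in range(1, P)
)
```

The same check would fail for the integers modulo a composite number such as 6, because some nonzero elements have no inverse there.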

In mathematics, a [...] is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to rational and real numbers.
field

In mathematics, a field is a set on which addition, subtraction, multiplication, and division are defined, and behave as when they are applied to [...]

mathematical optimization selects a [...] (with regard to some criterion) from some set of available alternatives.

Mathematical optimization - Wikipedia
+ 4. The global maximum at (x, y, z) = (0, 0, 4) is indicated by a blue dot. Nelder-Mead minimum search of Simionescu's function. Simplex vertices are ordered by their value, with 1 having the lowest (best) value. <span>In mathematics, computer science and operations research, mathematical optimization or mathematical programming, alternatively spelled optimisation, is the selection of a best element (with regard to some criterion) from some set of available alternatives. In the simplest case, an optimization problem consists of maximizing or minimizing a real function by systematically choosing input values from within an allowed set and computing the
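
The "simplest case" described above can be sketched in a few lines: evaluate a real function on every element of an allowed set and keep the best one. The objective `f` and the grid of candidates are hypothetical choices made for this sketch only.

```python
def f(x: float) -> float:
    """Objective to minimize (hypothetical example function)."""
    return (x - 1.5) ** 2 + 2.0


# Allowed set: a finite grid on [-3, 3] with step 0.5 (an assumption for the sketch).
candidates = [i * 0.5 for i in range(-6, 7)]

# Evaluate the objective on every allowed input and keep the best one.
best_x = min(candidates, key=f)
best_value = f(best_x)
```

Exhaustive evaluation only works for small finite sets; methods such as Nelder-Mead exist because most allowed sets are too large to enumerate.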

In mathematics, a field is a set on which [...] are defined, and behave as when they are applied to rational and real numbers.

#fields
The best known fields are the field of rational numbers and the field of real numbers.

Field (mathematics) - Wikipedia
ltiplication, and division are defined, and behave as when they are applied to rational and real numbers. A field is thus a fundamental algebraic structure, which is widely used in algebra, number theory and many other areas of mathematics. <span>The best known fields are the field of rational numbers and the field of real numbers. The field of complex numbers is also widely used, not only in mathematics, but also in many areas of science and engineering. Many other fields, such as fields of rational functions, al

The best known fields are the field of [...] and [...] .

Question
All search commands can be followed by , (comma) to go to the previous searched item

#### Original toplevel document

A Great Vim Cheat Sheet
Bash CheatSheet for UNIX Systems

Stan Best Practices

m used in Stan. All of these criteria are necessary but not sufficient conditions for a good fit -- in other words they all identify problems that will ensure a bad fit but none of them can guarantee a good fit. Recover simulated values <span>One of the most powerful means of validating a statistical algorithm is to verify that you can recover the ground truth from simulated data. Begin by selecting reasonable "true" values for each of your parameters, simulating data according to your model, and then trying to fit your model with the simulated data.
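
A minimal sketch of the simulate-then-recover check follows. It deliberately swaps in ordinary least squares on a straight-line model instead of a Stan fit, so that it runs with the standard library alone; the "true" values, the noise level, and the function names `simulate` and `fit_line` are assumptions made for illustration.

```python
import random


def simulate(alpha, beta, sigma, n, rng):
    """Draw n points from y = alpha + beta * x + Normal(0, sigma) noise."""
    xs = [rng.uniform(0.0, 10.0) for _ in range(n)]
    ys = [alpha + beta * x + rng.gauss(0.0, sigma) for x in xs]
    return xs, ys


def fit_line(xs, ys):
    """Ordinary least squares estimates of intercept and slope."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta_hat = sxy / sxx
    alpha_hat = mean_y - beta_hat * mean_x
    return alpha_hat, beta_hat


rng = random.Random(0)  # fixed seed so the check is repeatable
TRUE_ALPHA, TRUE_BETA, SIGMA = 2.0, 0.5, 1.0  # hypothetical ground truth
xs, ys = simulate(TRUE_ALPHA, TRUE_BETA, SIGMA, n=2000, rng=rng)
alpha_hat, beta_hat = fit_line(xs, ys)

# The fit passes the check if the estimates land close to the chosen truth.
recovered = abs(alpha_hat - TRUE_ALPHA) < 0.3 and abs(beta_hat - TRUE_BETA) < 0.05
```

With a Bayesian fit the analogous check would compare the true values against the posterior intervals rather than against point estimates.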

Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {P}}\,X$$

#### Original toplevel document

Probability theory - Wikipedia
$$\displaystyle X_{n}\,{\xrightarrow {\mathcal {D}}}\,X$$ Convergence in probability <span>The sequence of random variables $$X_{1},X_{2},\dots \,$$ is said to converge towards the random variable $$X\,$$ in probability if $$\lim _{n\rightarrow \infty }P\left(\left|X_{n}-X\right|\geq \varepsilon \right)=0$$ for every ε > 0. Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {P}}\,X$$ Strong convergence The sequence of random variables X 1 , X

#### Flashcard 2976253873420

Tags
#probability-theory
Question
strong convergence means that [...]. Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {\mathrm {a.s.} }}\,X$$
Answer
$$P(\lim _{n\rightarrow \infty }X_{n}=X)=1$$

#### Parent (intermediate) annotation

The sequence of random variables $$X_{1},X_{2},\dots \,$$ is said to converge towards the random variable $$X\,$$ strongly if $$P(\lim _{n\rightarrow \infty }X_{n}=X)=1$$. Strong convergence is also known as almost sure convergence. Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {\mathrm {a.s.} }}\,X$$

#### Original toplevel document

Probability theory - Wikipedia
$$\displaystyle X_{n}\,{\xrightarrow {P}}\,X$$ Strong convergence <span>The sequence of random variables $$X_{1},X_{2},\dots \,$$ is said to converge towards the random variable $$X\,$$ strongly if $$P(\lim _{n\rightarrow \infty }X_{n}=X)=1$$. Strong convergence is also known as almost sure convergence. Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {\mathrm {a.s.} }}\,X$$ As the names indicate, weak convergence is weaker than strong convergence.
In fact, strong convergence implies convergence in probability, and convergence in probability implies w

#### Flashcard 2976256232716

Tags
#probability-theory
Question
Strong convergence is also known as [...].
Answer
almost sure convergence

#### Parent (intermediate) annotation

The sequence of random variables $$X_{1},X_{2},\dots \,$$ is said to converge towards the random variable $$X\,$$ strongly if $$P(\lim _{n\rightarrow \infty }X_{n}=X)=1$$. Strong convergence is also known as almost sure convergence. Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {\mathrm {a.s.} }}\,X$$

#### Original toplevel document

Probability theory - Wikipedia
$$\displaystyle X_{n}\,{\xrightarrow {P}}\,X$$ Strong convergence <span>The sequence of random variables $$X_{1},X_{2},\dots \,$$ is said to converge towards the random variable $$X\,$$ strongly if $$P(\lim _{n\rightarrow \infty }X_{n}=X)=1$$. Strong convergence is also known as almost sure convergence. Most common shorthand notation: $$\displaystyle X_{n}\,{\xrightarrow {\mathrm {a.s.} }}\,X$$ As the names indicate, weak convergence is weaker than strong convergence. In fact, strong convergence implies convergence in probability, and convergence in probability implies w

#### Flashcard 2976258592012

Tags
#probability-theory
Question
to study Brownian motion, probability is defined on [...].
Answer
a space of functions

#### Parent (intermediate) annotation

to study Brownian motion, probability is defined on a space of functions.

#### Original toplevel document

Probability theory - Wikipedia
rk on probabilities outside $$\mathbb {R} ^{n}$$, as in the theory of stochastic processes. For example, <span>to study Brownian motion, probability is defined on a space of functions.
When it's convenient to work with a dominating measure, the Radon-Nikodym theorem is used to define a density as the Radon-Nikodym derivative of the probability distribution of intere

#### Flashcard 2976260164876

Tags
#probability-theory
Question
measure-theoretic treatment also allows us to work on [...], as in the theory of stochastic processes.
Answer
probabilities outside $$\mathbb {R} ^{n}$$

#### Parent (intermediate) annotation

measure-theoretic treatment also allows us to work on probabilities outside $$\mathbb {R} ^{n}$$, as in the theory of stochastic processes.

#### Original toplevel document

Probability theory - Wikipedia
$$\mu _{F}\,$$ induced by $$F\,.$$ Along with providing better understanding and unification of discrete and continuous probabilities, <span>measure-theoretic treatment also allows us to work on probabilities outside $$\mathbb {R} ^{n}$$, as in the theory of stochastic processes. For example, to study Brownian motion, probability is defined on a space of functions. When it's convenient to work with a dominating measure, the Radon-Nikodym theorem is used to def

#### Flashcard 2976261737740

Tags
#probability-theory
Question
The raison d'être of the measure-theoretic treatment of probability is that it [...], and makes the difference a question of which measure is used.
Answer
unifies the discrete and the continuous cases

#### Parent (intermediate) annotation

The raison d'être of the measure-theoretic treatment of probability is that it unifies the discrete and the continuous cases, and makes the difference a question of which measure is used.

#### Original toplevel document

Probability theory - Wikipedia
$$\mathbb {R} ^{n}$$ and other continuous sample spaces.
Measure-theoretic probability theory <span>The raison d'être of the measure-theoretic treatment of probability is that it unifies the discrete and the continuous cases, and makes the difference a question of which measure is used. Furthermore, it covers distributions that are neither discrete nor continuous nor mixtures of the two. An example of such distributions could be a mix of discrete and continuous distr

#### Annotation 2976271961356

#vim
To check, run vim --version and see if +clipboard exists.

A Great Vim Cheat Sheet
You should now be able to press [space]w in normal mode to save a file. [space]p should paste from the system clipboard (outside of Vim). If you can’t paste, it’s probably because Vim was not built with the system clipboard option. <span>To check, run vim --version and see if +clipboard exists. If it says -clipboard, you will not be able to copy from outside of Vim. For Mac users, Homebrew installs Vim with the clipboard option. Install Homebrew and then run brew install vim

#### Flashcard 2976273534220

Tags
#vim
Question
To check system clipboard compatibility, run [...] and see if +clipboard exists.
Answer
vim --version

#### Parent (intermediate) annotation

To check, run vim --version and see if +clipboard exists.

#### Original toplevel document

A Great Vim Cheat Sheet
You should now be able to press [space]w in normal mode to save a file. [space]p should paste from the system clipboard (outside of Vim). If you can’t paste, it’s probably because Vim was not built with the system clipboard option. <span>To check, run vim --version and see if +clipboard exists. If it says -clipboard, you will not be able to copy from outside of Vim. For Mac users, Homebrew installs Vim with the clipboard option. Install Homebrew and then run brew install vim

#### Annotation 2976277204236

#shell
echo $SHELL # displays the shell you're using

Bash CheatSheet for UNIX Systems
in the foreground or bg in the background DELETE # deletes one character backward !! # repeats the last command exit # logs out of current session # 1. Bash Basics. export # displays all environment variables <span>echo $SHELL # displays the shell you're using echo $BASH_VERSION # displays bash version bash # if you want to use bash (type exit to go back to your normal shell) whereis bash # finds out where bash is on

#### Flashcard 2976278777100

Tags
#shell
Question
[...] # displays the shell you're using

#### Original toplevel document

Bash CheatSheet for UNIX Systems --> UPDATED VERSION --> https://github.com/LeCoupa/awesome-cheatsheets · GitHub
in the foreground or bg in the background DELETE # deletes one character backward !! # repeats the last command exit # logs out of current session # 1. Bash Basics. export # displays all environment variables <span>echo $SHELL # displays the shell you're using echo $BASH_VERSION # displays bash version bash # if you want to use bash (type exit to go back to your normal shell) whereis bash # finds out where bash is on

#### Annotation 2976281660684

#best-practice
Treat the data (and its format) as immutable.

n—if you've got thoughts, please contribute or share them. Data is immutable Don't ever edit your raw data, especially not manually, and especially not in Excel. Don't overwrite your raw data. Don't save multiple versions of the raw data. <span>Treat the data (and its format) as immutable. The code you write should move the raw data through a pipeline to your final analysis. You shouldn't have to run all of the steps every time you want to make a new figure (see Analysis
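
One way to picture "raw data is immutable" is a pipeline step that only ever reads the raw file and writes its result somewhere else. The sketch below is an assumption-laden illustration: the file name `measurements.csv`, the `value` column, the cleaning rule, and the function names are all hypothetical, and it runs in a temporary directory so nothing real is touched.

```python
import csv
import tempfile
from pathlib import Path


def make_interim(raw_path: Path, interim_path: Path) -> None:
    """Read the raw file (never written to) and write a cleaned copy elsewhere."""
    with raw_path.open(newline="") as src, interim_path.open("w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row["value"] != "":  # hypothetical cleaning rule: drop empty values
                writer.writerow(row)


def demo_pipeline() -> tuple[str, str]:
    """Run the step in a scratch directory and return (raw text, interim text)."""
    with tempfile.TemporaryDirectory() as tmp:
        raw = Path(tmp) / "data" / "raw" / "measurements.csv"
        interim = Path(tmp) / "data" / "interim" / "measurements.csv"
        raw.parent.mkdir(parents=True)
        interim.parent.mkdir(parents=True)
        raw.write_text("id,value\n1,3.2\n2,\n3,4.1\n")
        before = raw.read_text()
        make_interim(raw, interim)
        assert raw.read_text() == before  # the raw file is byte-for-byte unchanged
        return before, interim.read_text()
```

Because the cleaning lives in code, the interim file can be regenerated from the raw file at any time instead of being edited by hand.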

#### Annotation 2976283233548

#best-practice
Therefore, by default, the data folder is included in the .gitignore file.

new figure (see Analysis is a DAG), but anyone should be able to reproduce the final products with only the code in src and the data in data/raw . Also, if data is immutable, it doesn't need source control in the same way that code does. <span>Therefore, by default, the data folder is included in the .gitignore file. If you have a small amount of data that rarely changes, you may want to include the data in the repository. Github currently warns if files are over 50MB and rejects files over 100MB.

#### Annotation 2976284806412

#best-practice
When we use notebooks in our work, we often subdivide the notebooks folder.

unication Notebook packages like the Jupyter notebook, Beaker notebook, Zeppelin, and other literate programming tools are very effective for exploratory data analysis. However, these tools can be less effective for reproducing an analysis. <span>When we use notebooks in our work, we often subdivide the notebooks folder. For example, notebooks/exploratory contains initial explorations, whereas notebooks/reports is more polished work that can be exported as html to the reports directory. Since no

#### Annotation 2976286379276

#best-practice
Notebooks are for exploration and communication

for storing/syncing large data include AWS S3 with a syncing tool (e.g., s3cmd ), Git Large File Storage, Git Annex, and dat. Currently by default, we ask for an S3 bucket and use AWS CLI to sync data in the data folder with the server. <span>Notebooks are for exploration and communication Notebook packages like the Jupyter notebook, Beaker notebook, Zeppelin, and other literate programming tools are very effective for exploratory data analysis. However, these tools can

#### Annotation 2976287952140

#best-practice
Follow a naming convention that shows the owner and the order the analysis was done in.

control (e.g., diffs of the json are often not human-readable and merging is near impossible), we recommended not collaborating directly with others on Jupyter notebooks. There are two steps we recommend for using notebooks effectively: <span>Follow a naming convention that shows the owner and the order the analysis was done in. We use the format --.ipynb (e.g., 0.3-bull-visualize-distributions.ipynb ). Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data pr
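
A naming convention like this is easy to check mechanically. The sketch below assumes the file name is made of a dotted number, an owner tag, and a hyphenated description, which is inferred from the example `0.3-bull-visualize-distributions.ipynb`; the regular expression, the field names, and the second example file name are assumptions, not the guide's specification.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern: a dotted number, an owner tag, then a hyphenated description.
NOTEBOOK_NAME = re.compile(
    r"^(?P<step>\d+(?:\.\d+)*)-(?P<owner>[A-Za-z0-9]+)-(?P<description>[a-z0-9-]+)\.ipynb$"
)


class NotebookName(NamedTuple):
    step: tuple
    owner: str
    description: str


def parse_notebook_name(filename: str) -> Optional[NotebookName]:
    """Split a notebook file name into its parts, or return None if it does not match."""
    match = NOTEBOOK_NAME.match(filename)
    if match is None:
        return None
    step = tuple(int(part) for part in match["step"].split("."))
    return NotebookName(step, match["owner"], match["description"])


names = [
    "1.0-bull-train-model.ipynb",  # hypothetical file name
    "0.3-bull-visualize-distributions.ipynb",
    "Untitled.ipynb",
]
parsed = [p for p in map(parse_notebook_name, names) if p is not None]
ordered = sorted(parsed)  # tuples compare by step first, giving the analysis order
```

Parsing the number into a tuple of integers keeps `0.10` after `0.9`, which plain string sorting would get wrong.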

#### Annotation 2976289525004

#best-practice
1. Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim. If it's useful utility code, refactor it to src.

. There are two steps we recommend for using notebooks effectively: Follow a naming convention that shows the owner and the order the analysis was done in. We use the format --.ipynb (e.g., 0.3-bull-visualize-distributions.ipynb ). <span>Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim . If it's useful utility code, refactor it to src . Now by default we turn the project into a Python package (see the setup.py file). You can import your code and use it in notebooks with a cell like the following: # OPTIONAL: Lo

#### Annotation 2976291097868

#best-practice
Now by default we turn the project into a Python package (see the setup.py file). You can import your code and use it in notebooks with a cell like the following:

rts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim . If it's useful utility code, refactor it to src . <span>Now by default we turn the project into a Python package (see the setup.py file). You can import your code and use it in notebooks with a cell like the following: # OPTIONAL: Load the "autoreload" extension so that code can change %load_ext autoreload # OPTIONAL: always reload modules so that as you change code in src, it gets loaded

#### Annotation 2976292670732

#best-practice
We prefer make for managing steps that depend on each other, especially the long-running ones.

an analysis you have long-running steps that preprocess data or train models. If these steps have been run already (and you have stored the output somewhere like the data/interim directory), you don't want to wait to rerun them every time. <span>We prefer make for managing steps that depend on each other, especially the long-running ones. Make is a common tool on Unix-based platforms (and is available for Windows). Following the make documentation, Makefile conventions, and portability guide will help ensure your Makef

#### Annotation 2976294243596

#best-practice
The first step in reproducing an analysis is always reproducing the computational environment it was run in.

re other tools for managing DAGs that are written in Python instead of a DSL (e.g., Paver, Luigi, Airflow, Snakemake, Ruffus, or Joblib). Feel free to use these if they are more appropriate for your analysis. Build from the environment up <span>The first step in reproducing an analysis is always reproducing the computational environment it was run in. You need the same tools, the same libraries, and the same versions to make everything play nicely together. One effective approach to this is use virtualenv (we recommend virtualenvwr

#### Annotation 2976295816460

#best-practice
By listing all of your requirements in the repository (we include a requirements.txt file) you can easily track the packages needed to recreate the analysis.

vironment it was run in. You need the same tools, the same libraries, and the same versions to make everything play nicely together. One effective approach to this is use virtualenv (we recommend virtualenvwrapper for managing virtualenvs). <span>By listing all of your requirements in the repository (we include a requirements.txt file) you can easily track the packages needed to recreate the analysis. Here is a good workflow: Run mkvirtualenv when creating a new project pip install the packages that your analysis needs Run pip freeze > requirements.txt to pin the exact pack
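
To show what "pinning" means, the sketch below lists the packages installed in the current environment as `name==version` lines using the standard library's `importlib.metadata`. It is only an approximation of what `pip freeze` reports, and `pinned_requirements` is a hypothetical helper name; `pip freeze > requirements.txt` remains the workflow the text describes.

```python
from importlib import metadata


def pinned_requirements() -> list[str]:
    """Return 'name==version' lines for the distributions visible to this interpreter."""
    pins = set()
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        version = dist.metadata.get("Version")
        if name and version:  # skip entries with incomplete metadata
            pins.add(f"{name}=={version}")
    return sorted(pins, key=str.lower)


requirements = pinned_requirements()
```

Writing `"\n".join(requirements)` to a file gives a list that can be compared against the committed `requirements.txt` to spot drift between the environment and the repository.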

#best-practice

Store your secrets and config variables in a special file

Create a .env file in the project root folder. Thanks to the .gitignore, this file should never get committed into the version control repository.

p secrets and configuration out of version control You really don't want to leak your AWS secret key or Postgres username and password on Github. Enough said — see the Twelve Factor App principles on this point. Here's one way to do this: <span>Store your secrets and config variables in a special file Create a .env file in the project root folder. Thanks to the .gitignore , this file should never get committed into the version control repository. Here's an example: # example .env file DATABASE_URL=postgres://username:password@localhost:5432/dbname AWS_ACCESS_KEY=myaccesskey AWS_SECRET_ACCESS_KEY=mysecretkey OTHER_VARIABLE=some
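
Reading such a file takes very little code. The sketch below is a hand-rolled parser written for illustration, not the API of any particular library; it handles only plain `KEY=value` lines and `#` comments, and `parse_env` is a hypothetical name. The sample values are the placeholders from the example above.

```python
import os


def parse_env(text: str) -> dict[str, str]:
    """Parse KEY=value lines, ignoring blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")  # split on the first '=' only
        if sep:
            values[key.strip()] = value.strip()
    return values


example = """\
# example .env file
DATABASE_URL=postgres://username:password@localhost:5432/dbname
AWS_ACCESS_KEY=myaccesskey
AWS_SECRET_ACCESS_KEY=mysecretkey
"""
config = parse_env(example)

# Expose the values through os.environ without overriding settings that already exist.
for key, value in config.items():
    os.environ.setdefault(key, value)
```

Using `setdefault` means a variable already set in the real environment wins over the file, which is usually what you want on a server.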

#### Annotation 2976308399372

but setting the default editor and then using git commit -e might be much more comfortable.

bash - Add line break to 'git commit -m' from the command line - Stack Overflow
I hope this isn't leading too far away from the posted question, <span>but setting the default editor and then using git commit -e might be much more comfortable. (edited Dec 22 '17 at 6:21)

#best-practice
Providing metadata is a fundamental requirement when publishing data on the Web because data publishers and data consumers may be unknown to each other.

Data on the Web Best Practices
to tasks where metadata are used, for example, discovery and reuse. Best Practice 1: Provide metadata Provide metadata for both human users and computer applications. Why <span>Providing metadata is a fundamental requirement when publishing data on the Web because data publishers and data consumers may be unknown to each other. Then, it is essential to provide information that helps human users and computer applications to understand the data as well as other important aspects that describes a dataset or a dis

By listing all of your requirements in the repository (we include a requirements.txt file) you can easily track the packages needed to recreate the analysis.

#### Original toplevel document

vironment it was run in. You need the same tools, the same libraries, and the same versions to make everything play nicely together. One effective approach to this is use virtualenv (we recommend virtualenvwrapper for managing virtualenvs). <span>By listing all of your requirements in the repository (we include a requirements.txt file) you can easily track the packages needed to recreate the analysis. Here is a good workflow: Run mkvirtualenv when creating a new project pip install the packages that your analysis needs Run pip freeze > requirements.txt to pin the exact pack

The first step in reproducing an analysis is always reproducing the computational environment it was run in.

#### Original toplevel document

re other tools for managing DAGs that are written in Python instead of a DSL (e.g., Paver, Luigi, Airflow, Snakemake, Ruffus, or Joblib). Feel free to use these if they are more appropriate for your analysis. Build from the environment up <span>The first step in reproducing an analysis is always reproducing the computational environment it was run in. You need the same tools, the same libraries, and the same versions to make everything play nicely together. One effective approach to this is use virtualenv (we recommend virtualenvwr

We prefer make for managing steps that depend on each other, especially the long-running ones.

#### Original toplevel document

an analysis you have long-running steps that preprocess data or train models. If these steps have been run already (and you have stored the output somewhere like the data/interim directory), you don't want to wait to rerun them every time. <span>We prefer make for managing steps that depend on each other, especially the long-running ones. Make is a common tool on Unix-based platforms (and is available for Windows). Following the make documentation, Makefile conventions, and portability guide will help ensure your Makef

Now by default we turn the project into a Python package (see the setup.py file). You can import your code and use it in notebooks with a cell like the following:

#### Original toplevel document

rts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim . If it's useful utility code, refactor it to src . <span>Now by default we turn the project into a Python package (see the setup.py file). You can import your code and use it in notebooks with a cell like the following: # OPTIONAL: Load the "autoreload" extension so that code can change %load_ext autoreload # OPTIONAL: always reload modules so that as you change code in src, it gets loaded

Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim . If it's useful utility code, refactor it to src . <span><body><html>

#### Original toplevel document

. There are two steps we recommend for using notebooks effectively: Follow a naming convention that shows the owner and the order the analysis was done in. We use the format --.ipynb (e.g., 0.3-bull-visualize-distributions.ipynb ). <span>Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data preprocessing task, put it in the pipeline at src/data/make_dataset.py and load data from data/interim . If it's useful utility code, refactor it to src . Now by default we turn the project into a Python package (see the setup.py file). You can import your code and use it in notebooks with a cell like the following: # OPTIONAL: Lo

Follow a naming convention that shows the owner and the order the analysis was done in.

#### Original toplevel document

control (e.g., diffs of the json are often not human-readable and merging is near impossible), we recommended not collaborating directly with others on Jupyter notebooks. There are two steps we recommend for using notebooks effectively: <span>Follow a naming convention that shows the owner and the order the analysis was done in. We use the format --.ipynb (e.g., 0.3-bull-visualize-distributions.ipynb ). Refactor the good parts. Don't write code to do the same task in multiple notebooks. If it's a data pr

When we use notebooks in our work, we often subdivide the notebooks folder.

#### Original toplevel document

unication Notebook packages like the Jupyter notebook, Beaker notebook, Zeppelin, and other literate programming tools are very effective for exploratory data analysis. However, these tools can be less effective for reproducing an analysis. <span>When we use notebooks in our work, we often subdivide the notebooks folder. For example, notebooks/exploratory contains initial explorations, whereas notebooks/reports is more polished work that can be exported as html to the reports directory. Since no

Notebooks are for exploration and communication

#### Original toplevel document

for storing/syncing large data include AWS S3 with a syncing tool (e.g., s3cmd ), Git Large File Storage, Git Annex, and dat. Currently by default, we ask for an S3 bucket and use AWS CLI to sync data in the data folder with the server. <span>Notebooks are for exploration and communication Notebook packages like the Jupyter notebook, Beaker notebook, Zeppelin, and other literate programming tools are very effective for exploratory data analysis. However, these tools can

Therefore, by default, the data folder is included in the .gitignore file.

#### Original toplevel document

new figure (see Analysis is a DAG), but anyone should be able to reproduce the final products with only the code in src and the data in data/raw. Also, if data is immutable, it doesn't need source control in the same way that code does. <span>Therefore, by default, the data folder is included in the .gitignore file. If you have a small amount of data that rarely changes, you may want to include the data in the repository. GitHub currently warns if files are over 50MB and rejects files over 100MB.
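
Before deciding to commit a small data folder, it can help to check it against the size limits quoted above. This is a rough sketch, not an official tool: the function names are hypothetical, and reading "MB" as 1024 * 1024 bytes is an assumption, so both thresholds are parameters.

```python
from pathlib import Path

# Thresholds quoted in the text; the exact byte interpretation is an assumption.
WARN_BYTES = 50 * 1024 * 1024
REJECT_BYTES = 100 * 1024 * 1024


def classify_size(size_bytes, warn=WARN_BYTES, reject=REJECT_BYTES):
    """Return "reject", "warn", or "ok" for a file of the given size."""
    if size_bytes > reject:
        return "reject"
    if size_bytes > warn:
        return "warn"
    return "ok"


def oversized_files(root, warn=WARN_BYTES, reject=REJECT_BYTES):
    """List (path, status) for every file under root that exceeds a threshold."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file():
            status = classify_size(path.stat().st_size, warn, reject)
            if status != "ok":
                results.append((path, status))
    return results
```

Calling `oversized_files("data")` from the project root would list the files that argue against committing the folder.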

Treat the data (and its format) as immutable.

#### Original toplevel document

n—if you've got thoughts, please contribute or share them. Data is immutable Don't ever edit your raw data, especially not manually, and especially not in Excel. Don't overwrite your raw data. Don't save multiple versions of the raw data. <span>Treat the data (and its format) as immutable. The code you write should move the raw data through a pipeline to your final analysis. You shouldn't have to run all of the steps every time you want to make a new figure (see Analysis

#Make
Besides building programs, Make can be used to manage any project where some files must be updated automatically from others whenever the others change.

Make (software) - Wikipedia
y how to derive the target program. Though integrated development environments and language-specific compiler features can also be used to manage a build process, Make remains widely used, especially in Unix and Unix-like operating systems. <span>Besides building programs, Make can be used to manage any project where some files must be updated automatically from others whenever the others change. Contents [hide] 1 Origin 2 Derivatives 3 Behavior 4 Makefile 4.1 Rules 4.2 Macros 4.3 Suffix rules 4.4 Pattern rules 4.5 Other elements 5 Example makefiles 6 See also 7 R

#### Annotation 2976346148108

#Make
Make is invoked with a list of target file names to build as command-line arguments:

Make (software) - Wikipedia
urce) and the transformation actions might be to convert the file to some specific format, copy the result into a content management system, and then send e-mail to a predefined set of users indicating that the above actions were performed. <span>Make is invoked with a list of target file names to build as command-line arguments: make [TARGET ...] Without arguments, Make builds the first target that appears in its makefile, which is traditionally a symbolic "phony" target named all. Make d
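
The invocation rule can be summarised in a few lines. The sketch below is a simplification with hypothetical function names; it ignores the special cases a real Make implementation applies when picking its default goal.

```python
def make_command(targets=()):
    """Build the argument list for `make [TARGET ...]`."""
    return ["make", *targets]


def goals(cli_targets, makefile_targets):
    """Pick what to build: the requested targets, else the makefile's first target."""
    if cli_targets:
        return list(cli_targets)
    if not makefile_targets:
        raise ValueError("no targets given and the makefile defines none")
    return [makefile_targets[0]]
```

The list returned by `make_command` is in the form that `subprocess.run` accepts, so no shell quoting is needed.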

#### Annotation 2976347720972

#Make
Make decides whether a target needs to be regenerated by comparing file modification times.

Make (software) - Wikipedia
a list of target file names to build as command-line arguments: make [TARGET ...] Without arguments, Make builds the first target that appears in its makefile, which is traditionally a symbolic "phony" target named all. <span>Make decides whether a target needs to be regenerated by comparing file modification times. This solves the problem of avoiding the building of files which are already up to date, but it fails when a file changes but its modification time stays in the past. Such changes
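
The modification-time comparison can be sketched in Python. This illustrates the idea rather than Make's actual implementation; `needs_rebuild` is a hypothetical name.

```python
import os


def needs_rebuild(target, prerequisites):
    """Return True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True  # nothing built yet, so it must be generated
    target_mtime = os.path.getmtime(target)
    # A prerequisite modified after the target means the target is stale.
    return any(os.path.getmtime(path) > target_mtime for path in prerequisites)
```

Only timestamps are compared here; file contents are never read, which is what makes the check cheap.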

#### Annotation 2976349293836

#Make
This solves the problem of avoiding the building of files which are already up to date, but it fails when a file changes but its modification time stays in the past.

Make (software) - Wikipedia
ut arguments, Make builds the first target that appears in its makefile, which is traditionally a symbolic "phony" target named all. Make decides whether a target needs to be regenerated by comparing file modification times. <span>This solves the problem of avoiding the building of files which are already up to date, but it fails when a file changes but its modification time stays in the past. Such changes could be caused by restoring an older version of a source file, or when a network filesystem is a source of files and its clock or timezone is not synchronized with the mac
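
The failure case is easy to reproduce. The sketch below contrasts the timestamp check with a content-hash check. Hashing is an alternative technique that some other build tools use, not something Make does, and all function names here are hypothetical.

```python
import hashlib
import os


def stale_by_mtime(target, source):
    """Timestamp check: the target is stale only if the source is newer."""
    return os.path.getmtime(source) > os.path.getmtime(target)


def file_digest(path):
    """Return the SHA-256 hex digest of the file's contents."""
    with open(path, "rb") as handle:
        return hashlib.sha256(handle.read()).hexdigest()


def stale_by_content(source, recorded_digest):
    """Content check: stale if the source no longer matches the digest
    recorded when the target was last built."""
    return file_digest(source) != recorded_digest
```

If an older version of the source is restored with its old timestamp, `stale_by_mtime` reports the target as up to date while `stale_by_content` detects the change. The price is that every source file has to be read and hashed.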

#### Annotation 2976351390988

#Make
Make searches the current directory for the makefile to use

Make (software) - Wikipedia
The user must handle this situation by forcing a complete build. Conversely, if a source file's modification time is in the future, it triggers unnecessary rebuilding, which may inconvenience users. Makefile Main article: Makefile <span>Make searches the current directory for the makefile to use, e.g. GNU make searches files in order for a file named one of GNUmakefile, makefile, Makefile and then runs the specified (or default) target(s) from (only) that file. The makefile
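
The search order described for GNU make can be mirrored in a small helper. This is a sketch with a hypothetical function name, and it covers only the three file names listed in the text.

```python
from pathlib import Path
from typing import Optional

# Names tried in order, as listed in the text for GNU make.
GNU_MAKE_SEARCH_ORDER = ("GNUmakefile", "makefile", "Makefile")


def find_makefile(directory=".") -> Optional[Path]:
    """Return the first makefile found in directory, or None if there is none."""
    for name in GNU_MAKE_SEARCH_ORDER:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

On a case-insensitive filesystem `makefile` and `Makefile` refer to the same file, so the helper may report the lowercase name there.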

#### Annotation 2976352963852

#Make
The makefile language is similar to declarative programming. This class of language, in which necessary end conditions are described but the order in which actions are to be taken is not important

Make (software) - Wikipedia
akefile Make searches the current directory for the makefile to use, e.g. GNU make searches files in order for a file named one of GNUmakefile, makefile, Makefile and then runs the specified (or default) target(s) from (only) that file. <span>The makefile language is similar to declarative programming. This class of language, in which necessary end conditions are described but the order in which actions are to be taken is not important, is sometimes confusing to programmers used to imperative programming. One problem in build automation is the tailoring of a build process to a given platform. For instance, the compi
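
The declarative idea can be illustrated with a dependency table: the rules are written in any order, and the build order is derived from them. The sketch below uses Python's standard `graphlib` module (Python 3.9 or later). The file names are hypothetical, and this is not how Make itself is implemented.

```python
from graphlib import TopologicalSorter


def build_order(dependencies):
    """Return targets ordered so that every prerequisite precedes its dependents.

    dependencies maps each target to the set of files it depends on.
    """
    return list(TopologicalSorter(dependencies).static_order())


# Hypothetical rules, deliberately listed "backwards": the final product first.
RULES = {
    "report.html": {"figures.png", "clean.csv"},
    "figures.png": {"clean.csv"},
    "clean.csv": {"raw.csv"},
}
```

`build_order(RULES)` places `raw.csv` first and `report.html` last, regardless of the order in which the rules were written.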

#### Annotation 2976360828172

Make allows us to specify what depends on what and how to update things that are out of date.
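
Both halves of that sentence, "what depends on what" and "how to update", fit in a very small sketch. The `Rule` class and `update` function below are hypothetical names for illustration and are far simpler than Make.

```python
import os
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Rule:
    target: str                  # file this rule produces
    prerequisites: List[str]     # what the target depends on
    recipe: Callable[[], None]   # how to update the target


def out_of_date(target, prerequisites):
    """True if target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(path) > target_mtime for path in prerequisites)


def update(target, rules: Dict[str, Rule]) -> List[str]:
    """Bring target up to date and return the list of targets that were rebuilt."""
    rule = rules.get(target)
    if rule is None:
        # A plain source file: it has no recipe, so it must already exist.
        if not os.path.exists(target):
            raise FileNotFoundError(f"no rule to make {target!r}")
        return []
    rebuilt = []
    for prerequisite in rule.prerequisites:
        rebuilt += update(prerequisite, rules)  # update dependencies first
    if out_of_date(target, rule.prerequisites):
        rule.recipe()
        rebuilt.append(target)
    return rebuilt
```

Calling `update` twice in a row runs the recipe only the first time; the second call finds the target newer than its prerequisites and does nothing.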