
#PATH

But in order for this to happen, Jupyter needs to know *where* to look for the associated executable: that is, it needs to know which path the `python` sits in. These paths are specified in Jupyter's `kernelspec`, and it's possible for the user to adjust them to their desires.


Jupyter is set up to be able to use a wide range of "kernels", or execution engines for the code. These can be Python 2, Python 3, R, Julia, Ruby... there are dozens of possible kernels to use. For example, here's the list of kernels that I have on my system:

$ jupyter kernelspec list
Available kernels:
  python2.7    /Users/jakevdp/.ipython/kernels/python2.7
  python3.
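To make the kernelspec concrete, here is a minimal sketch of what Jupyter reads from one: each kernelspec is a directory holding a `kernel.json` file whose `argv` names the interpreter to launch. The interpreter path and the spec contents below are made-up illustrations, not taken from the article:

```python
import json
import os
import tempfile

# A typical (hypothetical) Python kernelspec. The first entry of "argv" is
# the interpreter path; this is the piece that must point at a real python
# for the kernel to start.
spec = {
    "argv": ["/usr/bin/python3", "-m", "ipykernel_launcher", "-f", "{connection_file}"],
    "display_name": "Python 3",
    "language": "python",
}

kernel_dir = tempfile.mkdtemp()
with open(os.path.join(kernel_dir, "kernel.json"), "w") as f:
    json.dump(spec, f)

# Jupyter reads the file back to discover which python the kernel runs:
with open(os.path.join(kernel_dir, "kernel.json")) as f:
    loaded = json.load(f)

interpreter = loaded["argv"][0]
print(interpreter)  # -> /usr/bin/python3
```

Editing the `argv` path in such a file is exactly the "adjust them to their desires" mentioned above.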

#betancourt #probability-theory

In particular, many introductions to probability theory sloppily confound the abstract mathematics with their practical implementations, convoluting *what* we can calculate in the theory with *how* we perform those calculations. To make matters even worse, probability theory is used to model a variety of subtly different systems, which then burdens the already confused mathematics with the distinct and often conflicting philosophical connotations of those applications.


Probability theory is a rich and complex field of mathematics with a reputation for being confusing if not outright impenetrable. Much of that intimidation, however, is due not to the abstract mathematics but rather how they are employed in practice. In particular, many introductions to probability theory sloppily confound the abstract mathematics with their practical implementations, convoluting what we can calculate in the theory with how we perform those calculations. To make matters even worse, probability theory is used to model a variety of subtly different systems, which then burdens the already confused mathematics with the distinct and often conflicting philosophical connotations of those applications. In this case study I attempt to untangle this pedagogical knot to illuminate the basic concepts and manipulations of probability theory.

#betancourt #probability-theory

we cannot explicitly construct abstract probability distributions in any meaningful sense. Instead we must utilize problem-specific *representations* of abstract probability distributions


Let me open with a warning that the section on abstract probability theory will be devoid of any concrete examples. This is not because of any conspiracy to confuse the reader, but rather is a consequence of the fact that we cannot explicitly construct abstract probability distributions in any meaningful sense. Instead we must utilize problem-specific representations of abstract probability distributions, which means that concrete examples will have to wait until we introduce these representations in Section 3.

#best-practice #pystan

When they do fail, however, their failures manifest in diagnostics that are readily checked.


In particular, while dynamic implementations of Hamiltonian Monte Carlo, i.e. implementations where the integration time is dynamic, do perform well over a large class of models, their success is not guaranteed. When they do fail, however, their failures manifest in diagnostics that are readily checked. By acknowledging and respecting these diagnostics you can ensure that Stan is accurately fitting the Bayesian posterior and hence accurately characterizing your model.

#best-practice #pystan

the effective sample size quantifies the accuracy of the Markov chain Monte Carlo estimator of a given function, here each parameter mean, provided that geometric ergodicity holds. The potential problem with these effective sample sizes, however, is that we must *estimate* them from the fit output. When we generate fewer than 0.001 effective samples per transition of the Markov chain, the estimators that we use are typically biased and can significantly overestimate the true effective sample size.


We can investigate each more programmatically, however, using some of our utility functions. Checking Split R̂ and Effective Sample Sizes: as noted in Section 1, the effective sample size quantifies the accuracy of the Markov chain Monte Carlo estimator of a given function, here each parameter mean, provided that geometric ergodicity holds. The potential problem with these effective sample sizes, however, is that we must estimate them from the fit output. When we generate fewer than 0.001 effective samples per transition of the Markov chain, the estimators that we use are typically biased and can significantly overestimate the true effective sample size. We can check that our effective sample size per iteration is large enough with one of our utility functions:

In [14]: stan_utility.check_n_eff(fit)
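The check described above is easy to sketch. The function name and the 0.001 threshold follow the text's `stan_utility.check_n_eff`, but the body here is a guess at the shape of that utility, not its actual source, and the fit summary numbers are invented:

```python
def check_n_eff(n_effs, n_iter, threshold=0.001):
    """Flag parameters whose estimated effective sample size per
    transition falls below `threshold` (0.001, as in the text)."""
    return {name: n_eff / n_iter
            for name, n_eff in n_effs.items()
            if n_eff / n_iter < threshold}

# Hypothetical fit summary: 4000 post-warmup transitions.
n_effs = {"mu": 1200.0, "tau": 2.5, "theta_1": 900.0}
suspect = check_n_eff(n_effs, n_iter=4000)
print(suspect)  # tau: 2.5 / 4000 = 0.000625 < 0.001, so tau is flagged
```

A flagged parameter means its reported effective sample size estimate is itself untrustworthy, per the bias noted above.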

#best-practice #pystan

Split \( \hat{R} \) quantifies an important necessary condition for geometric ergodicity
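As a rough illustration of what split \( \hat{R} \) measures, here is a sketch of the split Gelman-Rubin statistic on synthetic chains. Stan's actual implementation differs in details, so treat this only as the basic idea: each chain is split in half so that non-stationarity within a single chain also inflates the statistic.

```python
import numpy as np

def split_rhat(chains):
    """Basic split R-hat: compare between-half-chain and within-half-chain
    variance. `chains` has shape (n_chains, n_iter)."""
    halves = np.concatenate([np.array_split(c, 2) for c in chains])
    n = halves.shape[1]
    chain_means = halves.mean(axis=1)
    W = halves.var(axis=1, ddof=1).mean()   # within-chain variance
    B = n * chain_means.var(ddof=1)         # between-chain variance
    var_hat = (n - 1) / n * W + B / n
    return np.sqrt(var_hat / W)

rng = np.random.default_rng(0)
mixed = rng.normal(0.0, 1.0, size=(4, 1000))             # four well-mixed chains
stuck = mixed + np.array([[0.0], [0.0], [0.0], [5.0]])   # one chain exploring elsewhere
print(split_rhat(mixed))  # close to 1
print(split_rhat(stuck))  # far above 1
```

When the chains have not converged to a common distribution, \( \hat{R} \) rises well above 1, which is why values near 1 are a necessary (though not sufficient) condition for trusting the fit.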


#best-practice #pystan

Both large split \( \hat{R} \) and low effective sample size per iteration are consequences of poorly mixing Markov chains. Improving the mixing of the Markov chains almost always requires tweaking the model specification, for example with a reparameterization or stronger priors.


#best-practice #pystan

The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid infinite loops that can occur for non-identified models. For sufficiently complex models, however, Stan can saturate this threshold even if the model is identified, which limits the efficacy of the sampler.


These diagnostics are extremely sensitive and typically indicate problems long before they arise in the more universal diagnostics considered above. Checking the Tree Depth: the dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid infinite loops that can occur for non-identified models. For sufficiently complex models, however, Stan can saturate this threshold even if the model is identified, which limits the efficacy of the sampler. We can check whether that threshold was hit using one of our utility functions:

In [17]: stan_utility.check_treedepth(fit)
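The saturation check itself is simple to sketch. The function name mirrors the text's `stan_utility.check_treedepth`, but this body is a simplified stand-in, and the tree depths below are invented sampler output:

```python
def check_treedepth(treedepths, max_depth=10):
    """Count transitions whose simulated trajectory hit the maximum
    tree depth (10 is Stan's default)."""
    n_hit = sum(1 for t in treedepths if t >= max_depth)
    print(f"{n_hit} of {len(treedepths)} iterations saturated "
          f"the maximum tree depth of {max_depth}")
    return n_hit

# Hypothetical per-transition tree depths for ten iterations.
depths = [4, 5, 6, 10, 10, 5, 4, 7, 10, 6]
n = check_treedepth(depths)
```

Any nonzero count suggests raising `max_treedepth` (as the text does next) or revisiting the model.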

#best-practice #pystan

Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration.

We can identify this problem by consulting the energy Bayesian Fraction of Missing Information,


fit = model.sampling(data=data, seed=194838, control=dict(max_treedepth=15))

and then check whether we still saturated this larger threshold with stan_utility.check_treedepth(fit, 15). Checking the E-BFMI: Hamiltonian Monte Carlo proceeds in two phases: the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information:

In [18]: stan_utility.check_energy(fit)
Chain 2: E-BFMI = 0.177681346951
E-BFMI below 0.2 indicates you may need to reparameterize your model
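The E-BFMI statistic is straightforward to compute from the sampler's energy trace. Below is a sketch of the standard empirical estimator (mean squared change in energy between successive transitions, relative to the marginal energy variance), demonstrated on synthetic energy series; the 0.2 threshold follows the text:

```python
import numpy as np

def e_bfmi(energies):
    """Empirical energy Bayesian Fraction of Missing Information."""
    energies = np.asarray(energies, dtype=float)
    numer = np.sum(np.diff(energies) ** 2) / len(energies)
    denom = np.var(energies)
    return numer / denom

rng = np.random.default_rng(1)
good = rng.normal(size=2000)                         # energy moves freely each step
bad = np.cumsum(rng.normal(scale=0.05, size=2000))   # energy barely changes per step
print(e_bfmi(good))  # well above 0.2
print(e_bfmi(bad))   # below 0.2 -> momenta resampling explores energies too slowly
```

A low value means the momentum resamplings change the energy much less than the marginal energy distribution requires, which is exactly the slow exploration described above.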

#best-practice #pystan

The `stan_utility` module uses the threshold of 0.2 to diagnose problems.

In [18]: stan_utility.check_energy(fit)
Chain 2: E-BFMI = 0.177681346951
E-BFMI below 0.2 indicates you may need to reparameterize your model

The stan_utility module uses the threshold of 0.2 to diagnose problems, although this is based on preliminary empirical studies and should be taken only as a very rough recommendation. In particular, this diagnostic comes out of recent theoretical work.

#best-practice #pystan

Divergences indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well.


Low E-BFMI is remedied by tweaking the specification of the model. Unfortunately the exact tweaks required depend on the exact structure of the model and, consequently, there are no generic solutions. Checking Divergences: finally, we can check divergences, which indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well. For this fit we have a significant number of divergences:

In [19]: stan_utility.check_div(fit)
202.0 of 4000 iterations ended with a divergence (5.05%)

#best-practice #pystan

Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, `adapt_delta`, which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives.


202.0 of 4000 iterations ended with a divergence (5.05%)
Try running with larger adapt_delta to remove the divergences

indicating that the Markov chains did not completely explore the posterior and that our Markov chain Monte Carlo estimators will be biased. Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta, which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives:

In [20]: fit = model.sampling(data=data, seed=194838, control=dict(adapt_delta=0.9))

Checking again:

#best-practice #pystan

In order to argue that divergences are only false positives, the divergences have to be completely eliminated for some `adapt_delta` sufficiently close to 1.

45.0 of 4000 iterations ended with a divergence (1.125%)
Try running with larger adapt_delta to remove the divergences

We see that while the divergences were reduced they did not completely vanish. In order to argue that divergences are only false positives, the divergences have to be completely eliminated for some adapt_delta sufficiently close to 1. Here we could continue increasing adapt_delta, where we would see that the divergences do not completely vanish, or we can analyze the existing divergences graphically.

#best-practice #pystan

If the divergences are not false positives then they will tend to concentrate in the pathological neighborhoods of the posterior. Falsely positive divergent iterations, however, will follow the same distribution as non-divergent iterations.


If the divergences are not false positives then they will tend to concentrate in the pathological neighborhoods of the posterior. Falsely positive divergent iterations, however, will follow the same distribution as non-divergent iterations. Here we will use the partition_div function of the stan_utility module to separate divergent and non-divergent iterations, but note that this function works only if your model pa
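The partitioning step can be sketched in plain Python. The name `partition_div` mirrors the text's `stan_utility` module, but this body and the parameter values are invented stand-ins:

```python
def partition_div(params, divergent_flags):
    """Split per-iteration parameter draws into non-divergent and
    divergent groups, so the two can be plotted separately."""
    div = {k: [v for v, d in zip(vals, divergent_flags) if d]
           for k, vals in params.items()}
    nondiv = {k: [v for v, d in zip(vals, divergent_flags) if not d]
              for k, vals in params.items()}
    return nondiv, div

# Hypothetical draws: divergences cluster at small tau.
params = {"tau": [0.1, 2.3, 0.05, 1.7], "theta_1": [0.2, 5.0, 0.1, 4.0]}
flags = [True, False, True, False]
nondiv, div = partition_div(params, flags)
print(div["tau"])     # the divergent iterations sit at small tau
print(nondiv["tau"])
```

Plotting the two groups in different colors then makes any clustering of divergences immediately visible.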

#best-practice #pystan

One of the challenges with a visual analysis of divergences is determining exactly which parameters to examine. Consequently visual analyses are most useful when there are already components of the model about which you are suspicious


One of the challenges with a visual analysis of divergences is determining exactly which parameters to examine. Consequently visual analyses are most useful when there are already components of the model about which you are suspicious, as in this case where we know that the correlation between the random effects (theta_1 through theta_8) and the hierarchical standard deviation, tau, can be problematic.

#best-practice #pystan

In order to avoid this issue we have to consider a modification to our model, and in this case we can appeal to a *non-centered parameterization* of the same model that does not suffer these issues.


Indeed we see the divergences clustering towards small values of tau where the posterior abruptly stops. This abrupt stop is indicative of a transition into a pathological neighborhood that Stan was not able to penetrate. In order to avoid this issue we have to consider a modification to our model, and in this case we can appeal to a non-centered parameterization of the same model that does not suffer these issues. A Successful Fit: multiple diagnostics have indicated that our fit of the centered parameterization of our hierarchical model is not to be trusted.
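The idea behind the non-centered parameterization can be shown with a small numerical sketch (not the article's Stan code): both forms generate the same distribution for theta, but the non-centered form lets the sampler work with independent standard normals instead of the funnel-shaped (tau, theta) geometry:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, tau, n = 5.0, 3.0, 100_000

# Centered: theta ~ Normal(mu, tau), sampled directly; near tau = 0 the
# posterior geometry becomes a narrow funnel that HMC struggles to enter.
theta_centered = rng.normal(mu, tau, size=n)

# Non-centered: theta_tilde ~ Normal(0, 1), then theta = mu + tau * theta_tilde;
# the sampler only ever sees well-behaved standard normals.
theta_tilde = rng.normal(0.0, 1.0, size=n)
theta_noncentered = mu + tau * theta_tilde

print(theta_centered.mean(), theta_noncentered.mean())  # both near mu = 5
print(theta_centered.std(), theta_noncentered.std())    # both near tau = 3
```

The model is unchanged probabilistically; only the coordinates the sampler explores are different, which is why this removes the divergences without altering the posterior.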

#Rochford #best-practice #pymc

Since December 2014, I have tracked the books I read in a Google spreadsheet.


Quantifying Three Years of Reading, posted on December 29, 2017: Since December 2014, I have tracked the books I read in a Google spreadsheet. It recently occurred to me to use this data to quantify how my reading habits have changed over time. This post will use PyMC3 to model my reading habits.

Tags

#Rochford #best-practice #pymc

Question

Since December 2014, I have tracked the books I read in a [...].

Answer

Google spreadsheet


Since December 2014, I have tracked the books I read in a Google spreadsheet.


Tags

#PATH

Question

When you type any command at the prompt (say, `python`), the system has a [...] that it looks for the executable.

Answer

well-defined sequence of places



For the sake of completeness, I'll try to do a quick ELI5 on each of these, so you'll know how to solve this issue in the best way for you. 1. Unix/Linux/OSX $PATH: When you type any command at the prompt (say, python), the system has a well-defined sequence of places that it looks for the executable. This sequence is defined in a system variable called PATH, which the user can specify. To see your PATH, you can type echo $PATH. The result is a list of directories on your computer, which will be searched in order for the desired executable. From your output above, I assume that it contains this:

$ echo $PATH
/usr/bin/:/Library/Frameworks/Python.framework/Versions/3.5/bin/:/usr/local/bin/
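The same lookup can be inspected from Python itself. This is a small sketch: `os.environ["PATH"]` holds the same colon-separated list the shell uses, and `shutil.which` performs the in-order directory scan described above:

```python
import os
import shutil

# The directories the shell searches, in order.
search_dirs = os.environ.get("PATH", "").split(os.pathsep)
print(search_dirs[:3])  # the first few places searched

# shutil.which walks those directories in order and returns the first match
# (None if no such executable is on PATH).
python_path = shutil.which("python3") or shutil.which("python")
print(python_path)
```

Because the scan stops at the first match, reordering PATH changes which of several installed pythons "wins".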

Tags

#PATH

Question

To see your `PATH`, you can type [...]

Answer

`echo $PATH`.



Tags

#PATH

Question

The result of `echo $PATH` is [...]

Answer

a list of directories on your computer


The result is a list of directories on your computer, which will be searched in order for the desired executable.

The result is a list of directories on your computer, which will be searched in order for the desired executable. From your output above, I assume that it contains this:

$ echo $PATH
/usr/bin/:/Library/Frameworks/Python.framework/Versions/3.5/bin/:/usr/local/bin/

In Windows: echo %path%

Tags

#PATH

Question

When you run python and do something like `import matplotlib`, Python has [...] that specifies where to find the package you want.

Answer

`sys.path`


When you run python and do something like import matplotlib , Python has to play a similar game to find the package you have in mind. Similar to $PATH in unix, Python has sys.path that specifies these

Changing the path is not too complicated; see e.g. How to permanently set $PATH on Linux?, or, on Windows, How to set environment variables in Windows 10. 2. How Python finds packages: When you run python and do something like import matplotlib, Python has to play a similar game to find the package you have in mind. Similar to $PATH in unix, Python has sys.path that specifies these:

$ python
>>> import sys
>>> sys.path
['', '/Users/jakevdp/anaconda/lib/python3.5', '/Users/jakevdp/anaconda/lib/python3.5/site-packages', ...]

Some important things:
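You can inspect this search list directly from any interpreter; a minimal sketch:

```python
import os.path
import sys

# sys.path is Python's analogue of $PATH for imports: a list of directories
# searched in order whenever a module is imported.
print(sys.path[:3])

# The first entry is typically the current directory ('' in interactive
# sessions) or the running script's directory, which is why a module in the
# current directory can shadow an installed package of the same name.
first = sys.path[0]
print(first == "" or os.path.isdir(first))
```

As with $PATH, the first directory containing a matching module wins, so the order matters.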

Tags

#PATH

Question

When you run python and do something like `import matplotlib`, Python has `sys.path` that specifies **[...]**

Answer

where to find the package you want




Tags

#PATH

Question

by default, the first entry in `sys.path` is [...].

Answer

the current directory


by default, the first entry in sys.path is the current directory.

Some important things: by default, the first entry in sys.path is the current directory. Also, unless you modify this (which you shouldn't do unless you know exactly what you're doing) you'll usually find something called site-packages in the path: this is the default place that Python puts packages.

Tags

#PATH

Question

the default place that Python puts packages is something called

`[...]`

Answer

`site-packages`

(this is where Python puts packages when you install them using `python setup.py install`, or `pip`, or `conda`, or a similar means)

is the current directory. Also, unless you modify this (which you shouldn't do unless you know exactly what you're doing) you'll usually find something called site-packages in the path: this is the default place that Python puts packages when you install them using python setup.py install , or pip , or conda , or a similar means.

Unless you modify this (which you shouldn't do unless you know exactly what you're doing) you'll usually find something called site-packages in the path: this is the default place that Python puts packages when you install them using python setup.py install, or pip, or conda, or a similar means. The important thing to note is that each python installation has its own site-packages, where packages are installed for that specific Python version.
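You can ask an interpreter where its own site-packages lives; a small sketch using the standard library (note that some Linux distributions name the directory dist-packages instead):

```python
import os
import sysconfig

# sysconfig reports where *this* interpreter installs pure-Python packages
# ("purelib"); a different interpreter will report a different directory.
site_packages = sysconfig.get_paths()["purelib"]
print(site_packages)
print(os.path.isabs(site_packages))  # an absolute path inside this installation
```

Running the same two lines under a different Python prints a different directory, which is exactly why packages installed for one Python are invisible to another.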

Tags

#PATH

Question

some Python packages come bundled with stand-alone scripts that you can run from [...]

Answer

the command line

*(examples are `pip`, `ipython`, `jupyter`, `pep8`, etc.)*

some Python packages come bundled with stand-alone scripts that you can run from the command line (examples are pip, ipython, jupyter, pep8, etc.). By default, these executables will be put in the same directory path as the Python used to install them, and are designed to work only with that Python installation.

This is why in our twitter exchange I recommended you focus on one Python installation, and fix your $PATH so that you're only using the one you want to use. There's another component to this: some Python packages come bundled with stand-alone scripts that you can run from the command line (examples are pip, ipython, jupyter, pep8, etc.). By default, these executables will be put in the same directory path as the Python used to install them, and are designed to work only with that Python installation. That means that, as your system is set up, when you run python, you get /usr/bin/python, but when you run ipython, you get /Library/Frameworks/Python.framework/Versions/3.5/bi
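You can verify where a given interpreter puts these entry-point scripts; a sketch using the standard library:

```python
import os
import sys
import sysconfig

# The "scripts" directory is where pip, ipython, jupyter, etc. are installed
# for *this* interpreter.
scripts_dir = sysconfig.get_paths()["scripts"]
interpreter_dir = os.path.dirname(sys.executable)
print(scripts_dir)
print(interpreter_dir)  # on Unix these are typically the same directory
```

If `which ipython` points into a different installation's scripts directory than `which python`, you are mixing two Pythons, which is the mismatch described above.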

Tags

#PATH

Question

some Python packages come bundled with **[...]** that you can run from the command line

Answer

stand-alone scripts

*(examples are `pip`, `ipython`, `jupyter`, `pep8`, etc.)*



Tags

#PATH

Question

some Python packages that you can run from the command line (examples are `pip`, `ipython`, `jupyter`, `pep8`, etc.) are put in [...where...]

Answer

the **same directory path** as the Python used to install them




Tags

#PATH

Question

the `$PATH` variable is usually set in `[...files...]`

Answer

shell dotfiles: `~/.zshrc`, `~/.bashrc`, or `~/.bash_profile`

Well, first make sure your $PATH variable is doing what you want it to. You likely have a startup script called something like ~/.bash_profile or ~/.bashrc that sets this $PATH variable.

The packages you can import when running python are entirely separate from the packages you can import when running ipython or a Jupyter notebook: you're using two completely independent Python installations. So how to fix this? Well, first make sure your $PATH variable is doing what you want it to. You likely have a startup script called something like ~/.bash_profile or ~/.bashrc that sets this $PATH variable. On Windows, you can modify the user-specific environment variables. You can manually modify that if you want your system to search things in a different order.

Tags

#PATH

Question

Jupyter uses `[...]`

to look for the associated kernels

Answer

`kernelspec`

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

But in order for this to happen, Jupyter needs to know *where* to look for the associated executable: that is, it needs to know which path the `python` sits in. These paths are specified in jupyter's `kernelspec`, and it's possible for the user to adjust them to their desires.

for a non-existent Python version. Jupyter is set-up to be able to use a wide range of "kernels", or execution engines for the code. These can be Python 2, Python 3, R, Julia, Ruby... there are dozens of possible kernels to use. <span>But in order for this to happen, Jupyter needs to know where to look for the associated executable: that is, it needs to know which path the python sits in. These paths are specified in jupyter's kernelspec , and it's possible for the user to adjust them to their desires. For example, here's the list of kernels that I have on my system: $ jupyter kernelspec list Available kernels: python2.7 /Users/jakevdp/.ipython/kernels/python2.7 python3.
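A kernelspec directory contains a `kernel.json` whose `argv` records the executable Jupyter should launch. A sketch that fabricates one to show the shape — the interpreter path in `argv` is illustrative, not a real installation:

```python
import json
import os
import tempfile

# A real kernelspec lives under a kernels directory, e.g.
# ~/.ipython/kernels/<name>/kernel.json; here we fabricate one.
spec_dir = tempfile.mkdtemp()
spec = {
    "argv": ["/usr/local/bin/python3", "-m", "ipykernel_launcher",
             "-f", "{connection_file}"],
    "display_name": "Python 3",
    "language": "python",
}
with open(os.path.join(spec_dir, "kernel.json"), "w") as f:
    json.dump(spec, f)

# Jupyter reads argv[0] to know which path the python executable sits in.
with open(os.path.join(spec_dir, "kernel.json")) as f:
    loaded = json.load(f)
executable = loaded["argv"][0]
```

Editing `argv[0]` in such a file is how you would manually repoint a kernel at a different Python.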

Tags

#PATH

Question

IPython relies on [...package...] to install a python kernel

Answer

`ipykernel`
status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

IPython relies on the ipykernel package which contains a command to install a python kernel: for example $ python -m ipykernel install It will create a kernelspec associated with the Python executable you use to run this command.

specifies the kernel name, the path to the executable, and other relevant info. You can adjust kernels manually, editing the metadata inside the directories listed above. The command to install a kernel can change depending on the kernel. <span>IPython relies on the ipykernel package which contains a command to install a python kernel: for example $ python -m ipykernel install It will create a kernelspec associated with the Python executable you use to run this command. You can then choose this kernel in the Jupyter notebook to run your code with that Python. You can see other options that ipykernel provides using the help command: $ python -m ipykernel install --help usage: ipython-kernel-install [-h] [--user] [--name NAME]

Tags

#PATH

Question

Ipython uses

`$ python -m ipykernel install`

to create a [...] associated with the Python executable you use to run this command.

Answer

kernelspec

*You can then choose this kernel in the Jupyter notebook to run your code with that Python.*

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

IPython relies on the ipykernel package which contains a command to install a python kernel: for example $ python -m ipykernel install It will create a kernelspec associated with the Python executable you use to run this command. You can then choose this kernel in the Jupyter notebook to run your code with that Python.

specifies the kernel name, the path to the executable, and other relevant info. You can adjust kernels manually, editing the metadata inside the directories listed above. The command to install a kernel can change depending on the kernel. <span>IPython relies on the ipykernel package which contains a command to install a python kernel: for example $ python -m ipykernel install It will create a kernelspec associated with the Python executable you use to run this command. You can then choose this kernel in the Jupyter notebook to run your code with that Python. You can see other options that ipykernel provides using the help command: $ python -m ipykernel install --help usage: ipython-kernel-install [-h] [--user] [--name NAME]
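What `python -m ipykernel install` records is, in essence, the invoking interpreter's path. A sketch of the spec's shape, assuming the standard `ipykernel_launcher` entry point — this is not ipykernel's actual install code:

```python
import sys

# The kernelspec written by `python -m ipykernel install` points its
# argv[0] at the interpreter that ran the command (sys.executable),
# which is what ties the kernel to that specific Python installation.
spec = {
    "argv": [sys.executable, "-m", "ipykernel_launcher",
             "-f", "{connection_file}"],
    "display_name": "Python %d" % sys.version_info[0],
    "language": "python",
}
```

Running the install command with a different `python` therefore produces a kernelspec bound to that other installation.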

Tags

#best-practice #pystan

Question

divergences indicate [...] that the simulated Hamiltonian trajectories are not able to explore sufficiently well.

Answer

pathological neighborhoods of the posterior

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

divergences indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well.

ow E-BFMI are remedied by tweaking the specification of the model. Unfortunately the exact tweaks required depend on the exact structure of the model and, consequently, there are no generic solutions. Checking Divergences¶ <span>Finally, we can check divergences which indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well. For this fit we have a significant number of divergences In [19]: stan_utility.check_div(fit) 202.0 of 4000 iterations ended with a divergen
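The check itself is just a count of the `divergent__` flags NUTS reports per iteration. A sketch with mock sampler parameters — the `check_div` helper here is a simplified stand-in for the `stan_utility` version, which would read these flags from `fit.get_sampler_params()`:

```python
def check_div(sampler_params):
    """Count divergent iterations across all chains and report the rate."""
    divergent = [d for chain in sampler_params for d in chain["divergent__"]]
    n = sum(divergent)
    total = len(divergent)
    return n, total, 100.0 * n / total

# Mock per-chain sampler parameters; a real fit supplies these.
mock_params = [
    {"divergent__": [0, 0, 1, 0]},   # chain 1: one divergent iteration
    {"divergent__": [0, 1, 0, 1]},   # chain 2: two divergent iterations
]
n, total, pct = check_div(mock_params)
```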

Tags

#best-practice #pystan

Question

divergences indicate pathological neighborhoods of the posterior that [...] are not able to explore sufficiently well.

Answer

the simulated Hamiltonian trajectories

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

divergences indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well.

ow E-BFMI are remedied by tweaking the specification of the model. Unfortunately the exact tweaks required depend on the exact structure of the model and, consequently, there are no generic solutions. Checking Divergences¶ <span>Finally, we can check divergences which indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well. For this fit we have a significant number of divergences In [19]: stan_utility.check_div(fit) 202.0 of 4000 iterations ended with a divergen

Tags

#best-practice #pystan

Question

[...] indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well.

Answer

divergences

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

divergences indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well.

ow E-BFMI are remedied by tweaking the specification of the model. Unfortunately the exact tweaks required depend on the exact structure of the model and, consequently, there are no generic solutions. Checking Divergences¶ <span>Finally, we can check divergences which indicate pathological neighborhoods of the posterior that the simulated Hamiltonian trajectories are not able to explore sufficiently well. For this fit we have a significant number of divergences In [19]: stan_utility.check_div(fit) 202.0 of 4000 iterations ended with a divergen

Tags

#best-practice #pystan

Question

Divergences can sometimes be [...].

Answer

false positives

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta , which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives.

nce (5.05%) Try running with larger adapt_delta to remove the divergences indicating that the Markov chains did not completely explore the posterior and that our Markov chain Monte Carlo estimators will be biased. <span>Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta , which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives. In [20]: fit = model.sampling(data=data, seed=194838, control=dict(adapt_delta=0.9)) Checking again, In [21]: sampler_params = fit.get_sam

Tags

#best-practice #pystan

Question

With divergence, to verify that we have real fitting issues we can rerun with `[...]`

Answer

a higher target acceptance rate

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta , which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives.

nce (5.05%) Try running with larger adapt_delta to remove the divergences indicating that the Markov chains did not completely explore the posterior and that our Markov chain Monte Carlo estimators will be biased. <span>Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta , which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives. In [20]: fit = model.sampling(data=data, seed=194838, control=dict(adapt_delta=0.9)) Checking again, In [21]: sampler_params = fit.get_sam

Tags

#best-practice #pystan

Question

a larger target acceptance probability will force [...] and reduce the false positive divergences

Answer

more accurate simulations of Hamiltonian trajectories

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta , which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives.

nce (5.05%) Try running with larger adapt_delta to remove the divergences indicating that the Markov chains did not completely explore the posterior and that our Markov chain Monte Carlo estimators will be biased. <span>Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta , which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives. In [20]: fit = model.sampling(data=data, seed=194838, control=dict(adapt_delta=0.9)) Checking again, In [21]: sampler_params = fit.get_sam

Tags

#best-practice #pystan

Question

In order to argue that divergences are only false positives, the divergences have to be [...] for some

`adapt_delta`

sufficiently close to 1.

Answer

completely eliminated

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

In order to argue that divergences are only false positives, the divergences have to be completely eliminated for some adapt_delta sufficiently close to 1.

45.0 of 4000 iterations ended with a divergence (1.125%) Try running with larger adapt_delta to remove the divergences we see that while the divergences were reduced they did not completely vanish. <span>In order to argue that divergences are only false positives, the divergences have to be completely eliminated for some adapt_delta sufficiently close to 1. Here we could continue increasing adapt_delta , where we would see that the divergences do not completely vanish, or we can analyze the existing divergences graphically. If the dive

Tags

#best-practice #pystan

Question

true divergences will tend to concentrate in [...].

Answer

the pathological neighborhoods of the posterior

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

If the divergences are not false positives then they will tend to concentrate in the pathological neighborhoods of the posterior. Falsely positive divergent iterations, however, will follow the same distribution as non-divergent iterations.

ompletely eliminated for some adapt_delta sufficiently close to 1. Here we could continue increasing adapt_delta , where we would see that the divergences do not completely vanish, or we can analyze the existing divergences graphically. <span>If the divergences are not false positives then they will tend to concentrate in the pathological neighborhoods of the posterior. Falsely positive divergent iterations, however, will follow the same distribution as non-divergent iterations. Here we will use the partition_div function of the stan_utility module to separate divergence and non-divergent iterations, but note that this function works only if your model pa

Tags

#best-practice #pystan

Question

Graphically, falsely positive divergent iterations will [...].

Answer

follow the same distribution as non-divergent iterations

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

If the divergences are not false positives then they will tend to concentrate in the pathological neighborhoods of the posterior. Falsely positive divergent iterations, however, will follow the same distribution as non-divergent iterations.

ompletely eliminated for some adapt_delta sufficiently close to 1. Here we could continue increasing adapt_delta , where we would see that the divergences do not completely vanish, or we can analyze the existing divergences graphically. <span>If the divergences are not false positives then they will tend to concentrate in the pathological neighborhoods of the posterior. Falsely positive divergent iterations, however, will follow the same distribution as non-divergent iterations. Here we will use the partition_div function of the stan_utility module to separate divergence and non-divergent iterations, but note that this function works only if your model pa
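The partitioning step can be sketched as follows, with mock draws of the hierarchical standard deviation `tau` — a simplified stand-in for `partition_div`; real draws and flags would come from the fit:

```python
def partition_div(draws, divergent_flags):
    """Split draws into divergent and non-divergent iterations."""
    div = [d for d, flag in zip(draws, divergent_flags) if flag]
    nondiv = [d for d, flag in zip(draws, divergent_flags) if not flag]
    return div, nondiv

# Mock tau draws with their divergence flags.
tau_draws = [0.05, 2.1, 0.02, 1.8, 0.03, 2.4]
flags = [1, 0, 1, 0, 1, 0]
div, nondiv = partition_div(tau_draws, flags)
# Here every divergent draw sits at small tau -- if a scatter plot showed
# this concentration, the divergences would look real, not false positives.
```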

Tags

#best-practice #pystan

Question

One of the challenges with a visual analysis of divergences is determining [...].

Answer

exactly which parameters to examine

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

One of the challenges with a visual analysis of divergences is determining exactly which parameters to examine. Consequently visual analyses are most useful when there are already components of the model about which you are suspicious

color = green, alpha=0.5) plot.gca().set_xlabel("theta_1") plot.gca().set_ylabel("tau") plot.show() WARNING:root:`dtypes` ignored when `permuted` is False. <span>One of the challenges with a visual analysis of divergences is determining exactly which parameters to examine. Consequently visual analyses are most useful when there are already components of the model about which you are suspicious, as in this case where we know that the correlation between random effects ( theta_1 through theta_8 ) and the hierarchical standard deviation, tau , can be problematic. Indeed we

Tags

#best-practice #pystan

Question

The

`stan_utility`

module uses the E-BFMI threshold of [...] to diagnose problems

Answer

0.2

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

The stan_utility module uses the threshold of 0.2 to diagnose problems

ction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your model <span>The stan_utility module uses the threshold of 0.2 to diagnose problems, although this is based on preliminary empirical studies and should be taken only as a very rough recommendation. In particular, this diagnostic comes out of recent theoretical work an
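The E-BFMI estimator compares how much the energy changes between successive iterations to the marginal variance of the energy. A sketch, assuming one chain's `energy__` values; implementation details may differ slightly from `stan_utility.check_energy`:

```python
def e_bfmi(energy):
    """Estimate E-BFMI for one chain: the summed squared change in energy
    between successive iterations, relative to the total squared deviation
    of the energy from its mean."""
    n = len(energy)
    mean_e = sum(energy) / n
    num = sum((energy[i] - energy[i - 1]) ** 2 for i in range(1, n))
    den = sum((e - mean_e) ** 2 for e in energy)
    return num / den

# A slowly drifting energy series barely moves between iterations,
# so it falls below the rough 0.2 warning threshold.
slow = [0.1 * i for i in range(100)]
flag = e_bfmi(slow) < 0.2
```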

Tags

#best-practice #pystan

Question

E-BFMI stands for [...],

Answer

energy Bayesian Fraction of Missing Information

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

ory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information,

.sampling(data=data, seed=194838, control=dict(max_treedepth=15)) and then check if still saturated this larger threshold with stan_utility.check_treedepth(fit, 15) Checking the E-BFMI¶ <span>Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your mod

Tags

#best-practice #pystan

Question

Hamiltonian Monte Carlo proceeds in [...] phases

Answer

two

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space.

.sampling(data=data, seed=194838, control=dict(max_treedepth=15)) and then check if still saturated this larger threshold with stan_utility.check_treedepth(fit, 15) Checking the E-BFMI¶ <span>Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your mod
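The two phases can be sketched with a toy fixed-length HMC sampler for a standard normal target. This is not Stan's dynamic sampler; the step size and trajectory length here are arbitrary illustrative choices:

```python
import math
import random

def hmc(n_samples, step=0.2, n_steps=20, seed=1):
    """Toy HMC for a standard normal target (U(q) = q^2/2)."""
    U = lambda q: 0.5 * q * q          # negative log density, up to a constant
    dU = lambda q: q
    rng = random.Random(seed)
    q, samples = 0.0, []
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)        # phase 2: resample auxiliary momentum
        q_new, p_new = q, p
        # phase 1: simulate a Hamiltonian trajectory with leapfrog steps
        p_new -= 0.5 * step * dU(q_new)
        for _ in range(n_steps - 1):
            q_new += step * p_new
            p_new -= step * dU(q_new)
        q_new += step * p_new
        p_new -= 0.5 * step * dU(q_new)
        # accept/reject on the change in total energy H = U(q) + p^2/2
        dH = (U(q_new) + 0.5 * p_new * p_new) - (U(q) + 0.5 * p * p)
        if dH < 0 or rng.random() < math.exp(-dH):
            q = q_new
        samples.append(q)
    return samples

draws = hmc(2000)
```

The momentum resampling between trajectories is the "jump between slices": if those jumps are short relative to the energy distribution, exploration is slow, which is what E-BFMI measures.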

Tags

#best-practice #pystan

Question

The first phase of Hamiltonian Monte Carlo simulates a Hamiltonian trajectory that [...]

Answer

rapidly explores a slice of the target parameter space

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration.

.sampling(data=data, seed=194838, control=dict(max_treedepth=15)) and then check if still saturated this larger threshold with stan_utility.check_treedepth(fit, 15) Checking the E-BFMI¶ <span>Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your mod

Tags

#best-practice #pystan

Question

The second phase of Hamiltonian Monte Carlo [...] to allow the next trajectory to explore another slice of the target parameter space.

Answer

resamples the auxiliary momenta

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration.

.sampling(data=data, seed=194838, control=dict(max_treedepth=15)) and then check if still saturated this larger threshold with stan_utility.check_treedepth(fit, 15) Checking the E-BFMI¶ <span>Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your mod

Tags

#best-practice #pystan

Question

Unfortunately, too short [...] induced by the momenta resamplings can lead to slow exploration.

Answer

jumps between slices of trajectories

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

rst simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information,

.sampling(data=data, seed=194838, control=dict(max_treedepth=15)) and then check if still saturated this larger threshold with stan_utility.check_treedepth(fit, 15) Checking the E-BFMI¶ <span>Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your mod

Tags

#best-practice #pystan

Question

The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a [...] built in to avoid infinite loops that can occur for non-identified models.

Answer

maximum trajectory length

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid infinite loops that can occur for non-identified models. For sufficiently complex models, however, Stan can saturate this threshold even if the model is identified, which limits the efficacy of the sampler.

agnostics that can indicate problems with the fit. These diagnostics are extremely sensitive and typically indicate problems long before they arise in the more universal diagnostics considered above. Checking the Tree Depth¶ <span>The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid infinite loops that can occur for non-identified models. For sufficiently complex models, however, Stan can saturate this threshold even if the model is identified, which limits the efficacy of the sampler. We can check whether that threshold was hit using one of our utility functions, In [17]: stan_utility.check_treedepth(fit) 0 of 4000 iterat

Tags

#best-practice #pystan

Question

The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid **[...]** .

Answer

possible infinite loops in non-identified models

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid infinite loops that can occur for non-identified models. For sufficiently complex models, however, Stan can saturate this threshold even if the model is identified, which limits the efficacy of the sampler.

agnostics that can indicate problems with the fit. These diagnostics are extremely sensitive and typically indicate problems long before they arise in the more universal diagnostics considered above. Checking the Tree Depth¶ <span>The dynamic implementation of Hamiltonian Monte Carlo used in Stan has a maximum trajectory length built in to avoid infinite loops that can occur for non-identified models. For sufficiently complex models, however, Stan can saturate this threshold even if the model is identified, which limits the efficacy of the sampler. We can check whether that threshold was hit using one of our utility functions, In [17]: stan_utility.check_treedepth(fit) 0 of 4000 iterat
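The check is again a simple count, this time of iterations whose `treedepth__` reached the cap. A sketch with mock sampler parameters — a simplified stand-in for `stan_utility.check_treedepth`; Stan's default `max_treedepth` is 10:

```python
def check_treedepth(sampler_params, max_depth=10):
    """Count iterations whose tree depth saturated the maximum."""
    depths = [d for chain in sampler_params for d in chain["treedepth__"]]
    saturated = sum(1 for d in depths if d >= max_depth)
    return saturated, len(depths)

# Mock per-chain sampler parameters; a real fit supplies these.
mock_params = [
    {"treedepth__": [3, 5, 10, 4]},   # one iteration hit the cap
    {"treedepth__": [6, 2, 4, 10]},   # and another here
]
hit, total = check_treedepth(mock_params)
```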

Tags

#best-practice #pystan

Question

Split \( \hat{R} \) quantifies an important necessary condition for [...]

Answer

geometric ergodicity

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Split R̂ quantifies an important necessary condition for geometric ergodicity

Tags

#best-practice #pystan

Question

Improving the mixing of the Markov chains almost always requires [...]

Answer

tweaking the model specification

*for example with a reparameterization or stronger priors.*

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

Both large split \( \hat{R} \) and low effective sample size per iteration are consequences of poorly mixing Markov chains. Improving the mixing of the Markov chains almost always requires tweaking the model specification, for example with a reparameterization or stronger priors.
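A sketch of the split \( \hat{R} \) computation: split each chain in half (to expose non-stationarity within a chain), then compare the variance between the half-chain means to the average variance within them. This is simplified relative to Stan's production version:

```python
import random

def split_rhat(chains):
    """Split R-hat for equal-length chains: values near 1 suggest mixing."""
    halves = []
    for c in chains:
        m = len(c) // 2
        halves += [c[:m], c[m:2 * m]]
    n = len(halves[0])
    means = [sum(h) / n for h in halves]
    grand = sum(means) / len(means)
    B = n * sum((mu - grand) ** 2 for mu in means) / (len(means) - 1)
    W = sum(sum((x - mu) ** 2 for x in h) / (n - 1)
            for h, mu in zip(halves, means)) / len(halves)
    var_hat = (n - 1) / n * W + B / n
    return (var_hat / W) ** 0.5

rng = random.Random(0)
# Two well-mixed chains drawing from the same distribution...
good = [[rng.gauss(0, 1) for _ in range(500)] for _ in range(2)]
# ...versus two chains stuck in different places.
bad = [[rng.gauss(0, 1) for _ in range(500)],
       [rng.gauss(5, 1) for _ in range(500)]]
```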

Tags

#best-practice #pystan

Question

[...] quantifies the accuracy of the Markov chain Monte Carlo estimator of a given function, e.g. parameter mean

Answer

the effective sample size

status | not learned | measured difficulty | 37% [default] | last interval [days] | |||
---|---|---|---|---|---|---|---|

repetition number in this series | 0 | memorised on | scheduled repetition | ||||

scheduled repetition interval | last repetition or drill |

the effective sample size quantifies the accuracy of the Markov chain Monte Carlo estimator of a given function, here each parameter mean, provided that geometric ergodicity holds. The potential problem with these effective sample sizes, however, is that we must estimate them from the fit output.

chains (at convergence, Rhat=1). We can investigate each more programmatically, however, using some of our utility functions. Checking Split \( \hat{R} \) and Effective Sample Sizes¶ As noted in Section 1, <span>the effective sample size quantifies the accuracy of the Markov chain Monte Carlo estimator of a given function, here each parameter mean, provided that geometric ergodicity holds. The potential problem with these effective sample sizes, however, is that we must estimate them from the fit output. When we generate less than 0.001 effective samples per transition of the Markov chain the estimators that we use are typically biased and can significantly overestimate the true effective sample size. We can check that our effective sample size per iteration is large enough with one of our utility functions, In [14]: stan_utility.check_n_eff(fit)
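A sketch of the effective sample size idea: discount the nominal sample size by the chain's autocorrelation, truncating the sum at the first negative estimate. This is a crude version of the initial-sequence estimators Stan actually uses:

```python
import random

def ess(x):
    """N / (1 + 2 * sum of positive lag autocorrelations)."""
    n = len(x)
    mean = sum(x) / n
    var = sum((v - mean) ** 2 for v in x) / n
    acc = 0.0
    for t in range(1, n):
        rho = sum((x[i] - mean) * (x[i + t] - mean)
                  for i in range(n - t)) / (n * var)
        if rho < 0:          # crude truncation at the first negative estimate
            break
        acc += rho
    return n / (1 + 2 * acc)

rng = random.Random(0)
iid = [rng.gauss(0, 1) for _ in range(400)]     # nearly independent draws
walk = [0.0]
for _ in range(399):                            # strongly autocorrelated chain
    walk.append(0.95 * walk[-1] + rng.gauss(0, 1))
```

A poorly mixing chain like `walk` yields far fewer effective samples than its nominal length, which is exactly what the check above flags.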

Tags

#betancourt #probability-theory

Question

we cannot explicitly construct [...] in any meaningful sense. Instead we must utilize problem-specific *representations* of it

Answer

abstract probability distributions


we cannot explicitly construct abstract probability distributions in any meaningful sense. Instead we must utilize problem-specific representations of abstract probability distributions

robability theory. Let me open with a warning that the section on abstract probability theory will be devoid of any concrete examples. This is not because of any conspiracy to confuse the reader, but rather is a consequence of the fact that we cannot explicitly construct abstract probability distributions in any meaningful sense. Instead we must utilize problem-specific representations of abstract probability distributions which means that concrete examples will have to wait until we introduce these representations in Section 3. 1 Setting A Foundation Ultimately probability theory concerns itself wi

Tags

#betancourt #probability-theory

Question

many introductions to probability theory sloppily confound [...with...]

Answer

the abstract mathematics with their practical implementations


In particular, many introductions to probability theory sloppily confound the abstract mathematics with their practical implementations, convoluting what we can calculate in the theory with how we perform those calculations. To make matters even worse, probability theory is used to model a variety of subtly different

ity theory is a rich and complex field of mathematics with a reputation for being confusing if not outright impenetrable. Much of that intimidation, however, is due not to the abstract mathematics but rather how they are employed in practice. In particular, many introductions to probability theory sloppily confound the abstract mathematics with their practical implementations, convoluting what we can calculate in the theory with how we perform those calculations. To make matters even worse, probability theory is used to model a variety of subtly different systems, which then burdens the already confused mathematics with the distinct and often conflicting philosophical connotations of those applications. In this case study I attempt to untangle this pedagogical knot to illuminate the basic concepts and manipulations of probability theory. Our ultimate goal is to demystify what we can

Tags

#betancourt #probability-theory

Question

many introductions to probability theory sloppily confound the abstract mathematics with their practical implementations, convoluting [...with...].

Answer

what we can calculate in the theory with how we perform those calculations

In particular, many introductions to probability theory sloppily confound the abstract mathematics with their practical implementations, convoluting what we can calculate in the theory with how we perform those calculations. To make matters even worse, probability theory is used to model a variety of subtly different systems, which then burdens the already confused mathematics with the distinct and often


Tags

#betancourt #probability-theory

Question

probability theory is often used to model a variety of subtly different systems, which then burdens the already confused mathematics with **[...]** .

Answer

the distinct and often conflicting **philosophical connotations** of those applications


ppily confound the abstract mathematics with their practical implementations, convoluting what we can calculate in the theory with how we perform those calculations. To make matters even worse, probability theory is used to model a variety of subtly different systems, which then burdens the already confused mathematics with the distinct and often conflicting philosophical connotations of those applications.


Tags

#best-practice #pystan

Question

The second phase of Hamiltonian Monte Carlo resamples the auxiliary momenta to allow the next trajectory to **[...]** .

Answer

explore another slice of the target parameter space


Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which

.sampling(data=data, seed=194838, control=dict(max_treedepth=15)) and then check if still saturated this larger threshold with stan_utility.check_treedepth(fit, 15) Checking the E-BFMI. Hamiltonian Monte Carlo proceeds in two phases -- the algorithm first simulates a Hamiltonian trajectory that rapidly explores a slice of the target parameter space before resampling the auxiliary momenta to allow the next trajectory to explore another slice of the target parameter space. Unfortunately, the jumps between these slices induced by the momenta resamplings can be short, which often leads to slow exploration. We can identify this problem by consulting the energy Bayesian Fraction of Missing Information, In [18]: stan_utility.check_energy(fit) Chain 2: E-BFMI = 0.177681346951 E-BFMI below 0.2 indicates you may need to reparameterize your mod
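
The E-BFMI diagnostic quoted above can be sketched as follows. This is a minimal illustration of the energy-based statistic (comparing how much the energy changes between successive iterations to the overall spread of the energy); the function name and exact estimator details are assumptions, not PyStan's implementation.

```python
import numpy as np

def e_bfmi(energy):
    """Energy Bayesian Fraction of Missing Information for one chain.

    energy: 1-D array of the sampler's energy at each iteration.
    The numerator measures the typical squared jump in energy per
    iteration (driven by the momentum resamplings); the denominator
    is the overall energy variance. Small values (below ~0.2)
    indicate the momentum resamples move between energy levels
    too slowly, leading to slow exploration.
    """
    energy = np.asarray(energy, dtype=float)
    numerator = np.sum(np.diff(energy) ** 2) / len(energy)
    return float(numerator / np.var(energy))
```

Independent energy draws give values near 2, while a slowly wandering (random-walk-like) energy series gives values far below the 0.2 warning threshold.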

Tags

#best-practice #pystan

Question

In `Stan`, a larger target acceptance probability is set by the keyword `[...]`

Answer

`adapt_delta`


Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta, which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives. In [20]:

nce (5.05%) Try running with larger adapt_delta to remove the divergences indicating that the Markov chains did not completely explore the posterior and that our Markov chain Monte Carlo estimators will be biased. Divergences, however, can sometimes be false positives. To verify that we have real fitting issues we can rerun with a larger target acceptance probability, adapt_delta, which will force more accurate simulations of Hamiltonian trajectories and reduce the false positives. In [20]: fit = model.sampling(data=data, seed=194838, control=dict(adapt_delta=0.9)) Checking again, In [21]: sampler_params = fit.get_sam
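
The divergence check itself reduces to counting flagged transitions, which can be sketched as below. This is an illustrative helper, not the case study's `stan_utility` code; in PyStan the per-transition flags typically come from the sampler parameters (e.g. the `divergent__` field), though extraction details vary by version.

```python
def check_divergences(divergent_flags):
    """Summarize divergent transitions.

    divergent_flags: iterable of 0/1 flags, one per post-warmup
    transition across all chains. Returns (count, fraction).
    A nonzero fraction that persists after raising adapt_delta
    suggests real problems with the posterior geometry rather
    than false positives.
    """
    flags = [int(f) for f in divergent_flags]
    n_div = sum(flags)
    return n_div, n_div / len(flags)
```

For example, 2 divergences out of 10 transitions gives a fraction of 0.2, well worth investigating.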