BuboFlash - helps with learning

Edited, memorised or added to reading queue

Do you want BuboFlash to help you learning these things? Click here to log in or create user.

Annotation 4403302632716

The problems t h at ha ve been addressed by AI search algorithms fall into three ge neral classes: single-agent pathfinding problems , two-player games, and constraint-satisfaction problems.

status	not read	reprioritisations
last reprioritisation on		suggested re-reading day
started reading on		finished reading on

pdf

cannot see any pdfs

Flashcard 4403492424972

Tags

#artificial-intelligence #reinforcement-learning

Question

What two steps does DeepCubeA use to solve Rubik's cubes?

Answer

Neural network is trained to learn the cost-to-go function using Autodidactic Iteration (Uniformly sample scramble depth k from 1 to K, apply this many random scrambles to the cube).
Batch Weighted A* Search is used to actually solve cubes. This is weighted A* search, plus expanding the children of the N nodes with lowest predicted cost (to use computation efficiently)

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

Flashcard 4403961138444

Tags

#artificial-intelligence

Question

What is the point in using Weighted A* over A*?

Answer

We often have that the heuristic h drastically underestimates the true distance, so that we can obtain a more realistic estimate by scaling up its influence with regard to some parameter. Although this compromises optimality, it can lead to a significant speedup; this is an appropriate choice when searching under time or space constraints.

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4404258671884

Tags

#artificial-intelligence

Question

If we parameterize [...] with \(l\in[0,1]\), we obtain a continuous range of best-first search variants \(A_l\), also denoted as weighted A*.

Answer

\(f_l(u) = l\cdot h(u) + (1-l)\cdot g(u)\)

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4408992992524

Tags

#artificial-intelligence

Question

If we parameterize \(f_l(u) = l\cdot h(u) + (1-l)\cdot g(u)\) with \(l\in[0,1]\), we obtain a continuous range of best-first search variants \(A_l\), also denoted as weighted A*. For l = 0, we simulate a [...] traversal of the problem space; for l = 1, we have [...] . Algorithm \(A_{0.5}\) selects nodes in the same order as original A*.

Answer

breadth-first

greedy best-first search

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409493687564

Tags

#artificial-intelligence

Question

What is an admissable heuristic function?

Answer

Definition 1.8. (Admissible Heuristic) An estimate h is an admissible heuristic if it is a lower bound for the optimal solution costs; that is, h(u) ≤ δ(u,T) for all u ∈ V

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409499716876

Tags

#artificial-intelligence

Question

When is a heuristic function consistent?

Answer

A goal estimate h is a consistent heuristic if h(u) ≤ h(v) +w(u,v) for all edges e = (u, v) ∈ E

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409502076172

Tags

#artificial-intelligence

Question

When is a heuristic function monotone?

Answer

Let (\(u_0\),. . . , \(u_k\)) be any path, g(\(u_i\)) be the path cost of (\(u_0\),. . . , \(u_i\)), and define f(\(u_i\)) = g(\(u_i\)) +h(\(u_i\)). A goal estimate h is a monotone heuristic if f (\(u_j\)) ≥ f (\(u_i\)) for all j > i, 0 ≤ i, j ≤ k; that is, the estimate of the total path cost is nondecreasing from a node to its successors.

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409505221900

Tags

#artificial-intelligence

Question

What is the relation of consistent and monotone heuristic functions?

Answer

Theorem 1.1. (Equivalence of Consistent and Monotone Heuristics) A heuristic is consistent if and only if it is monotone.

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409507581196

Tags

#artificial-intelligence

Question

What is the relation between consistent and admissable heuristics?

Answer

Theorem 1.2. (Consistency and Admissibility) Consistent estimates are admissible.

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409509940492

Tags

#artificial-intelligence

Question

Under a consistent heuristic, when is weighted A* search monotone?

Answer

Lemma 6.1. For l ≤ 0.5 and a consistent estimate h, f l is monotone.

(with \(f_l(u) = l\cdot h(u) + (1-l)\cdot g(u)\) for \(l\in[0,1]\))

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Flashcard 4409513086220

Tags

#artificial-intelligence

Question

Define \(\epsilon\)-optimality of a search algorithm.

Answer

Definition 6.1. ( \(\epsilon\)-Optimality) A search algorithm is \(\epsilon\)-optimal if it terminates with a solution of maximum cost (1 + \(\epsilon\)) ·δ(s,T), with \(\epsilon\) denoting an arbitrary small positive constant (s is the starting state, T is a goal state).

status	not learned	measured difficulty	37% [default]	last interval [days]
repetition number in this series	0	memorised on		scheduled repetition
scheduled repetition interval		last repetition or drill

pdf

cannot see any pdfs

Annotation 4409602477324

#artificial-intelligence #reinforcement-learning

Monte-Carlo evaluation consists in estimating a position by averaging the outcome of several random continuations, and can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework to combine tree search with Monte-Carlo eval- uation, that does not separate between a min-max phase and a Monte- Carlo phase. Instead of backing-up the min-max value close to the root, and the average value at some depth, a more general backup operator is defined that progressively changes from averaging to min-max as the number of simulations grows. This approach provides a fine-grained con- trol of the tree growth, at the level of individual simulations, and allows efficient selectivity methods. This algorithm was implemented in a 9 × 9 Go-playing program, Crazy Stone, that won the 10th KGS computer-Go tournament.

status	not read	reprioritisations
last reprioritisation on		suggested re-reading day
started reading on		finished reading on

pdf

cannot see any pdfs

Annotation 4409605360908

From a helicopter view Monte Carlo Tree Search has one main purpose: given a game state to choose the most promising next move. Throughout the rest of this post we will try to take a look at the details of Monte Carlo Tree Search to understand what we mean by that. We will also refer back to Alpha Go/Zero from time to time trying to explain what MCTS variant has been used in Deepmind’s AI.

status	not read	reprioritisations
last reprioritisation on		suggested re-reading day
started reading on		finished reading on

Monte Carlo Tree Search - beginners guide Machine learning blog
nte Carlo Tree Search – summary Introduction Monte Carlo Tree Search was introduced by Rémi Coulom in 2006 as a building block of Crazy Stone – Go playing engine with an impressive performance. <span>From a helicopter view Monte Carlo Tree Search has one main purpose: given a game state to choose the most promising next move. Throughout the rest of this post we will try to take a look at the details of Monte Carlo Tree Search to understand what we mean by that. We will also refer back to Alpha Go/Zero from time to time trying to explain what MCTS variant has been used in Deepmind’s AI. Finite Two Person Zero-Sum Sequential Game The framework/environment that Monte Carlo Tree Search operates within is a game which by itself is a very abstract broad term, therefore here

Edited, memorised or added to reading queue

on 22-Sep-2019 (Sun)

pdf

pdf

pdf

pdf

pdf

pdf

pdf

pdf

pdf

pdf

pdf

pdf