# on 22-Sep-2019 (Sun)

#### Annotation 4403302632716

 The problems t h at ha ve been addressed by AI search algorithms fall into three ge neral classes: single-agent pathfinding problems , two-player games, and constraint-satisfaction problems.

#### pdf

cannot see any pdfs

#### Flashcard 4403492424972

Tags
#artificial-intelligence #reinforcement-learning
Question
What two steps does DeepCubeA use to solve Rubik's cubes?
1. Neural network is trained to learn the cost-to-go function using Autodidactic Iteration (Uniformly sample scramble depth k from 1 to K, apply this many random scrambles to the cube).
2. Batch Weighted A* Search is used to actually solve cubes. This is weighted A* search, plus expanding the children of the N nodes with lowest predicted cost (to use computation efficiently)

status measured difficulty not learned 37% [default] 0

#### Flashcard 4403961138444

Tags
#artificial-intelligence
Question
What is the point in using Weighted A* over A*?
We often have that the heuristic h drastically underestimates the true distance, so that we can obtain a more realistic estimate by scaling up its influence with regard to some parameter. Although this compromises optimality, it can lead to a significant speedup; this is an appropriate choice when searching under time or space constraints.

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4404258671884

Tags
#artificial-intelligence
Question
If we parameterize [...] with $$l\in[0,1]$$, we obtain a continuous range of best-first search variants $$A_l$$, also denoted as weighted A*.
$$f_l(u) = l\cdot h(u) + (1-l)\cdot g(u)$$

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4408992992524

Tags
#artificial-intelligence
Question
If we parameterize $$f_l(u) = l\cdot h(u) + (1-l)\cdot g(u)$$ with $$l\in[0,1]$$, we obtain a continuous range of best-first search variants $$A_l$$, also denoted as weighted A*. For l = 0, we simulate a [...] traversal of the problem space; for l = 1, we have [...] . Algorithm $$A_{0.5}$$ selects nodes in the same order as original A*.

greedy best-first search

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409493687564

Tags
#artificial-intelligence
Question
What is an admissable heuristic function?
Definition 1.8. (Admissible Heuristic) An estimate h is an admissible heuristic if it is a lower bound for the optimal solution costs; that is, h(u) ≤ δ(u,T) for all u ∈ V

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409499716876

Tags
#artificial-intelligence
Question
When is a heuristic function consistent?
A goal estimate h is a consistent heuristic if h(u) ≤ h(v) +w(u,v) for all edges e = (u, v) ∈ E

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409502076172

Tags
#artificial-intelligence
Question
When is a heuristic function monotone?
Let ($$u_0$$,. . . , $$u_k$$) be any path, g($$u_i$$) be the path cost of ($$u_0$$,. . . , $$u_i$$), and define f($$u_i$$) = g($$u_i$$) +h($$u_i$$). A goal estimate h is a monotone heuristic if f ($$u_j$$) ≥ f ($$u_i$$) for all j > i, 0 ≤ i, j ≤ k; that is, the estimate of the total path cost is nondecreasing from a node to its successors.

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409505221900

Tags
#artificial-intelligence
Question
What is the relation of consistent and monotone heuristic functions?
Theorem 1.1. (Equivalence of Consistent and Monotone Heuristics) A heuristic is consistent if and only if it is monotone.

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409507581196

Tags
#artificial-intelligence
Question
What is the relation between consistent and admissable heuristics?

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409509940492

Tags
#artificial-intelligence
Question
Under a consistent heuristic, when is weighted A* search monotone?

Lemma 6.1. For l ≤ 0.5 and a consistent estimate h, f l is monotone.

(with $$f_l(u) = l\cdot h(u) + (1-l)\cdot g(u)$$ for $$l\in[0,1]$$)

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Flashcard 4409513086220

Tags
#artificial-intelligence
Question
Define $$\epsilon$$-optimality of a search algorithm.
Definition 6.1. ( $$\epsilon$$-Optimality) A search algorithm is $$\epsilon$$-optimal if it terminates with a solution of maximum cost (1 + $$\epsilon$$) ·δ(s,T), with $$\epsilon$$ denoting an arbitrary small positive constant (s is the starting state, T is a goal state).

status measured difficulty not learned 37% [default] 0

#### pdf

cannot see any pdfs

#### Annotation 4409602477324

 #artificial-intelligence #reinforcement-learning Monte-Carlo evaluation consists in estimating a position by averaging the outcome of several random continuations, and can serve as an evaluation function at the leaves of a min-max tree. This paper presents a new framework to combine tree search with Monte-Carlo eval- uation, that does not separate between a min-max phase and a Monte- Carlo phase. Instead of backing-up the min-max value close to the root, and the average value at some depth, a more general backup operator is defined that progressively changes from averaging to min-max as the number of simulations grows. This approach provides a fine-grained con- trol of the tree growth, at the level of individual simulations, and allows efficient selectivity methods. This algorithm was implemented in a 9 × 9 Go-playing program, Crazy Stone, that won the 10th KGS computer-Go tournament.

#### pdf

cannot see any pdfs

#### Annotation 4409605360908

 From a helicopter view Monte Carlo Tree Search has one main purpose: given a game state to choose the most promising next move. Throughout the rest of this post we will try to take a look at the details of Monte Carlo Tree Search to understand what we mean by that. We will also refer back to Alpha Go/Zero from time to time trying to explain what MCTS variant has been used in Deepmind’s AI.