Edited, memorised or added to reading list

on 03-Jan-2020 (Fri)


#MIT #jenga
In this paper, we propose a methodology to emulate hierarchical reasoning and multi-sensory fusion in a robot that learns to play Jenga, a complex game that requires physical interaction to be played effectively. The game mechanics are formulated as a generative process using a temporal hierarchical Bayesian model, with representations for both behavioral archetypes and noisy block states. This model captures descriptive latent structures, and the robot learns probabilistic models of these relationships in force and visual domains through a short exploration phase. Once learned, the robot uses this representation to infer block behavior patterns and states as it plays the game.

#MIT #jenga

Jenga is a quintessential example of a contact rich task where we need to interact with the tower to learn and infer block mechanics and multi-modal behavior by combining touch and sight.

Current learning methodologies struggle with these challenges and have not exploited physics nearly as richly as we believe humans do. Most robotic learning systems still use purely visual data, without a sense of touch; this fundamentally limits how quickly and flexibly a robot can learn about the world. Learning algorithms that build on model-free reinforcement learning methods have little to no ability to exploit knowledge about the physics of objects and actions. Even the methods using model-based reinforcement learning or imitation learning have mostly used generic statistical models that do not explicitly represent any of the knowledge about physical objects, contacts, or forces that humans have from a very early age. As a consequence, these systems require far more training data than humans do to learn new models or new tasks, and they generalize much less broadly and less robustly.

#MIT #jenga
In this work, we propose a hierarchical learning approach to acquiring manipulation skills. In particular, we pose a top-down bottom-up (7-10) learning approach to first build abstractions in the joint space of touch and vision that are then used to learn rich physics models. We use Jenga as a platform to compare and evaluate our approach. We have developed a simulation environment in which we compare the performance of our approach to three other state-of-the-art learning paradigms. We further show the efficacy of the approach on an experimental implementation of the game.

[unknown IMAGE 4737039994124]
#MIT #has-images #jenga
Our proposed approach draws from the notion of an "intuitive physics engine" in the brain that may be the cause of our ability to integrate multiple sensory channels, plan complex actions (3–5), and learn abstract latent structure (6) through physical interaction, even from an early age. Humans learn to play Jenga through physics-based integration of sight and touch: vision provides information about the location of the tower and current block arrangements, but not about block interactions. The interactions are dependent on minute geometric differences between blocks that are imperceptible to the human eye. Humans gain information by touching the blocks and combining tactile and visual senses to make inferences about their interactions. Coarse high-level abstractions such as “will a block move” play a central role in our decision making and are possible precisely because we have rich physics-based representations. We emulate this hierarchical learning and inference in the robotic system depicted in Fig. 1A using the AI schematically shown in Fig. 1B. To learn the mechanics of the game, the robot builds its physics-based representation from the data it collects during a brief exploration phase. We show that the robot builds purposeful abstractions that yield sample-efficient learning of a physics model of the game that it leverages to reason, infer, and act to play the game.

#MIT #jenga
In this study, we demonstrate the efficacy and sample-efficiency of a novel hierarchical learning approach to manipulation on the challenging game of Jenga.

#MIT #jenga
In this study, we evaluate the robot's ability to play the game by counting the number of successful consecutive block extractions in randomly generated towers.
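
The metric above can be sketched in a few lines. This is only an illustration: the function name and the convention that counting stops at the first failed extraction are assumptions, not details given in the excerpt.

```python
from typing import Iterable


def consecutive_extractions(outcomes: Iterable[bool]) -> int:
    """Count successful extractions from the start of a run up to the first failure.

    `outcomes` is a hypothetical per-attempt success flag (True = block extracted
    and placed without toppling the tower).
    """
    count = 0
    for success in outcomes:
        if not success:
            break  # assumed convention: the streak ends at the first failure
        count += 1
    return count
```

Under this convention, a run recorded as `[True, True, False, True]` scores 2.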

#MIT #jenga
Sensing: The robot has access to its own pose, the pose of the blocks, and the forces applied to it at every time-step. The simulated robot observes these states directly, whereas the experimental robot has access to noisy estimates.
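
One way to picture the difference between the two sensing setups is a single observation record that is returned exactly in simulation and perturbed on the experimental robot. The sketch below is an assumption-laden illustration: the planar `(x, y, theta)` pose, the two-component force, the Gaussian noise model, and all names are hypothetical and not taken from the excerpt.

```python
import random
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Observation:
    robot_pose: Tuple[float, float, float]  # assumed planar pose (x, y, theta)
    block_pose: Tuple[float, float, float]  # assumed planar pose (x, y, theta)
    force: Tuple[float, float]              # assumed planar force (fx, fy)


def _jitter(values: Tuple[float, ...], sigma: float, rng: random.Random) -> Tuple[float, ...]:
    """Add zero-mean Gaussian noise to every component (assumed noise model)."""
    return tuple(v + rng.gauss(0.0, sigma) for v in values)


def observe(true_state: Observation, sigma: float = 0.0,
            rng: Optional[random.Random] = None) -> Observation:
    """Return the exact state when sigma == 0 (simulation), a noisy estimate otherwise."""
    if sigma == 0.0:
        return true_state
    rng = rng or random.Random()
    return Observation(
        robot_pose=_jitter(true_state.robot_pose, sigma, rng),
        block_pose=_jitter(true_state.block_pose, sigma, rng),
        force=_jitter(true_state.force, sigma, rng),
    )
```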

#MIT #jenga
Action primitives: The robot utilizes two “primitive” actions, push and extract/place. Using the push primitive, the robot first selects a block and moves to a collision-free configuration in the plane. The robot then selects a contact location and heading, pushes for a distance of 1 mm, and repeats. The action is considered complete if either the robot chooses to retract or a maximum distance of 45 mm is reached. The extract/place primitive searches for a collision-free grasp of the block and places it on top of the tower at a random unoccupied slot. The extract/place primitives are parametric and computed per call; as such, they are not learned.
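
The push primitive described above amounts to a short loop: push in 1 mm increments until the robot retracts or 45 mm is reached. The following is a minimal sketch under stated assumptions: `policy` and `push_step` are hypothetical callables standing in for the decision-making and the motion command, and contact location and heading are reduced to single floats.

```python
from typing import Callable, Optional, Tuple

STEP_MM = 1.0       # push increment stated in the text
MAX_PUSH_MM = 45.0  # maximum push distance stated in the text

# Hypothetical interface: given the distance pushed so far, return
# (contact_location, heading) to keep pushing, or None to retract.
PushPolicy = Callable[[float], Optional[Tuple[float, float]]]


def push_primitive(policy: PushPolicy,
                   push_step: Callable[[float, float, float], None]) -> float:
    """Run the push loop and return the total distance pushed in mm."""
    travelled = 0.0
    while travelled < MAX_PUSH_MM:
        choice = policy(travelled)
        if choice is None:  # the robot chooses to retract
            break
        contact, heading = choice
        push_step(contact, heading, STEP_MM)  # hypothetical motion command
        travelled += STEP_MM
    return travelled
```

Keeping the policy separate from the loop mirrors the text: the stopping rules are fixed, while the choice of contact location, heading, and when to retract is left to whatever policy is plugged in.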

#MIT #jenga
Base exploration policy: The robot has access to a base exploration policy for data collection. This policy randomizes the push primitive by first selecting a block at random, then executing a sequence of randomized contact locations and headings.
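
A randomized policy of this kind could be sampled as follows. This is a sketch only: the uniform distributions, the offset and heading ranges, and the function name are assumptions introduced for illustration, since the excerpt does not specify how the randomization is done.

```python
import random
from typing import List, Sequence, Tuple


def sample_exploration_push(
    block_ids: Sequence[int],
    n_steps: int,
    rng: random.Random,
    max_offset_mm: float = 10.0,    # hypothetical range for contact location
    max_heading_deg: float = 15.0,  # hypothetical range for push heading
) -> Tuple[int, List[Tuple[float, float]]]:
    """Pick a block at random, then a sequence of random (contact, heading) pairs."""
    block = rng.choice(block_ids)
    steps = [
        (rng.uniform(-max_offset_mm, max_offset_mm),
         rng.uniform(-max_heading_deg, max_heading_deg))
        for _ in range(n_steps)
    ]
    return block, steps
```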

#MIT #jenga
Termination criteria: A run, defined as an attempt at a new tower, is terminated when one of the following conditions is met: (i) all blocks have been explored, or (ii) a block is dropped outside the tower or the tower has toppled.
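
The criteria reduce to a single boolean check. In the sketch below, the `RunState` fields are hypothetical names for quantities the excerpt describes only in words.

```python
from dataclasses import dataclass


@dataclass
class RunState:
    unexplored_blocks: int  # blocks not yet attempted in this run
    block_dropped: bool     # a block fell outside the tower
    tower_toppled: bool     # the tower has collapsed


def run_terminated(state: RunState) -> bool:
    """True when (i) every block has been explored or (ii) the run has failed."""
    all_explored = state.unexplored_blocks == 0
    failed = state.block_dropped or state.tower_toppled
    return all_explored or failed
```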

#MIT #jenga
Tower and robot specifications: The simulated tower is composed of the same number and similar distribution of movable vs. immobile blocks as the real tower. Whether a block is movable or immobile is due to slight perturbations to weight distribution resulting from small tolerances in the height of the blocks. The relative dimensions of the tower and the end-effector are consistent for both environments.

#MIT #jenga
The robot’s nominal execution loop is to select a block at random and attempt the push primitive. During the push primitive, the robot either chooses push poses and headings or retracts. If the block is extracted beyond ¾ of its length, the extract/place primitive is invoked. During a run, the nominal execution loop is continued until a termination criterion is met. A key challenge is that movable vs. immobile pieces are indistinguishable prior to contact, and the robot needs to control the hybrid/multimodal interaction for effective extraction without causing damage to the tower. If the damage compounds, then the tower loses integrity and termination criteria are met earlier. As such, this problem is a challenging example that motivates the need for abstract reasoning with a fusion of tactile and visual information with a rich representation of physics.
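
The nominal execution loop can be sketched as below. The callables `push`, `extract_place`, and `tower_failed` are hypothetical stand-ins for the primitives and failure detection, and the block length is left as a parameter because the excerpt does not state it. Only the ¾-length threshold and the overall control flow come from the text.

```python
import random
from typing import Callable, Sequence

EXTRACT_FRACTION = 0.75  # threshold stated in the text (3/4 of block length)


def nominal_loop(
    block_ids: Sequence[int],
    push: Callable[[int], float],          # returns displacement of the block in mm
    extract_place: Callable[[int], None],  # grasps the block and places it on top
    tower_failed: Callable[[], bool],      # True if a block dropped or the tower toppled
    block_length_mm: float,
    rng: random.Random,
) -> int:
    """Attempt blocks in random order until a termination criterion is met.

    Returns the number of blocks handed to the extract/place primitive.
    """
    unexplored = list(block_ids)
    extracted = 0
    while unexplored and not tower_failed():
        # Select a block at random and mark it as explored.
        block = unexplored.pop(rng.randrange(len(unexplored)))
        displacement = push(block)
        if displacement > EXTRACT_FRACTION * block_length_mm:
            extract_place(block)
            extracted += 1
    return extracted
```

Because movable and immobile blocks look identical before contact, the loop cannot filter candidates up front. All discrimination happens inside `push`, which is where the learned model would come in.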
