Skip directly to content

Machine Learning

Generative Modeling of Autonomous Robots and their Environments using Reservoir Computing

on Wed, 01/20/2016 - 21:02

Autonomous mobile robots form an important research topic in the field of robotics due to their near-term applicability in the real world as domestic service robots. These robots must be designed in an efficient way using training sequences. They need to be aware of their position in the environment and also need to create models of it for deliberative planning. These tasks have to be performed using a limited number of sensors with low accuracy, as well as with a restricted amount of computational power. In this contribution we show that the recently emerged paradigm of Reservoir Computing (RC) is very well suited to solve all of the above mentioned problems, namely learning by example, robot localization, map and path generation. Reservoir Computing is a technique which enables a system to learn any time-invariant filter of the input by training a simple linear regressor that acts on the states of a highdimensional but random dynamic system excited by the inputs. In addition, RC is a simple technique featuring ease of training, and low computational and memory demands.

Keywords: reservoir computing, generative modeling, map learning, T-maze task, road sign problem, path generation

 

 

 

 

 

 

Related Publications

  1. Eric AntoneloBenjamin Schrauwen and Jan Van Campenhout Generative Modeling of Autonomous Robots and their Environments using Reservoir Computing Neural Processing Letters, Vol. 26(3), pp. 233-249 (2007)   

 

Technologies used: C++, TCP/IP sockets, Linux, Qt, Qwt. 
Developed from 2001 to 2010.
 
Two different software programs were developed during my undergraduate and Master studies: SINAR, a simulator that shows graphically the representation of the environment and the simulation in real time; and CONAR, the autonomous  controller that receives sensor data (from SINAR) and output the actuators data (to SINAR). Simulations with multiple robots can be done if more than one controller (CONAR) connects to SINAR. The communication between both programs is represented on the following figure:

The communication protocol is implemented using TCP/IP sockets. Thus, several controllers can run under different computers over a network (distributing the computing load through different nodes of a network). Both simulator programs were developed under the Linux operating system; the graphical interface was developed with Qt library and some graphical plots were created with Qwt library. C++ was the programming  language used to create both programs.

SINAR

SINAR is a simulator for autonomous robot navigation experiments. Its graphical user interface contains menus, command bars, and the environment display. 

The user can create simulation environments merely by clicking and moving the mouse cursor in the display area. Objects are inserted, resized, translated and rotated simply by moving the mouse device. An object can be a obstacle, a target or even a robot. The user can also edit an object by changing its color and type of movement (for moving objects).

Environments can be saved in files and posteriorly they can be loaded for being used in simulations. Before a simulation starts, one or more controllers (CONAR) should be connected to the SINAR software. The user can control the simulation by activating appropriate (button) commands: start, pause and finish.

During a simulation, sensor data can be viewed graphically through plots in real time:

The performance of the robot (number of collisions, number of target captures, and number of executed iterations) can be verified in real time as well.

There are two modes of simulation: orginary mode and sophisticated mode. In the former mode, the environment display is updated at each iteration such that the user can view graphically the progress of the simulation. Furthermore,  the user can move any object in the environment in real time.

In the latter mode, the simulation is accomplished implicitily (not graphically) and is composed of a set of experiments configured by a specific C++ script. The script determines the sequence and the duration of simulation experiments (considering that each experiment uses distinct environments), besides the number of repetitions for a sequence of experiments. In the sophisticated mode, all generated data are recorded in files:  the trajectory of the robot and the performance measures (number of captures, collisions and their respective iteration time); the representation of the final state of the environment in PNG format and the performance plot (also called learning evolution graphic) are also generated automatically. The controller data (neural networks states) are also saved in an automatic way since the script tells CONAR to save its state when each simulation is finished.

 

CONAR

CONAR is a program that simulates the brain of a robot located in the SINAR environment. After receiving sensor data (distance, color and contact) from its respective robot in the SINAR environment, it sends actuator data (direction adjustment and velocity adjustment) to the same robot. This cycle is kept until the simulations ends.

The graphical interface of CONAR is shown on the following figure. Parameters of the controller can be adjusted before the simulation and in real time; commands can be activated by clicking on buttons: connect to SINAR, apply parameters changes in real time, generate performance data and plots for recording in files, save neural networks state, exit simulation. Furthermore, some neural networks in the controller can be disabled in real time (so that it outputs null (zero)): IP, IC, RR and AR networks.

   In addition, neural networks state can be viewed graphically in real time. In the following figures, a neuron is represented by a circle. In addition, the more black a neuron is, stronger is its output.

In above figure, it is shown a representation of PI repertoire neurons. A small red square inside a circle means that a neuron has already been winner during a learning event.

The next figure shows AR, RR and actuator neurons. The energy levels (degree of activity) of AR or RR neurons are represented by a thick line next to the respective neurons.

In following picture, it is shown the graphical representation of output of winner neurons in PI repertoire (each line represents the winner neuron output in a column: the first line corresponds to the first column and so on).

 

To see a video of a simulation run, check out this page: Reinforcement learning of robot behaviors

Related publications

  1. Eric AntoneloAlbert-Jan BaerveldtThorsteinn Rognvaldsson and Mauricio Figueiredo Modular Neural Network and Classical Reinforcement Learning for Autonomous Robot Navigation: Inhibiting Undesirable Behaviors Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), pp. 1225-1232 (2006)    
  2. Eric Antonelo A Neural Reinforcement Learning Approach for Behavior Acquisition in Intelligent Autonomous Systems Master thesis, Halmstad University (2006)    
  3. Eric AntoneloMauricio FigueiredoAlbert-Jan Baerlveldt and Rodrigo Calvo Intelligent autonomous navigation for mobile robots: spatial concept acquisition and object discrimination Proceedings of the 6th IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), pp. 553-557 (2005)    
  4. Eric Antonelo and Mauricio Figueiredo Autonomous intelligent systems applied to robot navigation: spatial concept acquisition and object discrimination Proceedings of the 2nd National Meeting of Intelligent Robotics (II ENRI) in the Congress of the Brazilian Computer Society (in Portuguese), pp. (2004)  

Learning navigation attractors for mobile robots with reinforcement learning and reservoir computing

on Wed, 12/16/2015 - 16:48

Autonomous robot navigation in partially observable environments is a complex task because the state of the environment can not be completely determined only by the current sensory readings of a robot. This work uses the recently introduced paradigm for training recurrent neural networks (RNNs), called reservoir computing (RC), to model multiple navigation attractors in partially observable environments. In RC, the RNN with randomly generated fixed weights, called reservoir, projects the input into a high-dimensional dynamic space. Only the readout output layer is trained using standard linear regression techniques, and in this work, is used to approximate the state-action value function. By using a policy iteration framework, where an alternating sequence of policy improvement (samples generation from environment interaction) and policy evaluation (network training) steps are performed, the system is able to shape navigation attractors so that, after convergence, the robot follows the correct trajectory towards the goal. The experiments are accomplished using an e-puck robot extended with 8 distance sensors in a rectangular environment with an obstacle between the robot and the target region. The task is to reach the goal through the correct side of the environment, which is indicated by a temporary stimulus previously observed at the beginning of the episode. We show that the reservoir-based system (with short-term memory) can model these navigation attractors, whereas a feedforward network without memory fails to do so.

Reservoir Computing network as a function approximator for reinforcement learning tasks with partially observable environments. The reservoir is a dynamical system of recurrent nodes. Solid lines represent connections which are fixed. Dashed lines are the connections to be trained

 

Motor primitives or basic behaviors: left, forward and right.

 

A sequence of robot trajectories as learning evolves, using the ESN. Each plot shows robot trajectories in the environment for several episodes during the learning process. In the beginning, exploration is high and several locations are visited by the robot. As the simulation develops, two navigation attractors are formed to the left and to the right so that the agent receives maximal reward.

 

Supervised Learning of Internal Models for Autonomous Goal-Oriented Robot Navigation using Reservoir Computing

on Wed, 12/16/2015 - 14:45

In this work we propose a hierarchical architecture which constructs internal models of a robot environment for goal-oriented navigation by an imitation learning process. The proposed architecture is based on the Reservoir Computing paradigm for training Recurrent Neural Networks (RNN). It is composed of two randomly generated RNNs (called reservoirs), one for modeling the localization capability and one for learning the navigation skill. The localization module is trained to detect the current and previously visited robot rooms based only on 8 noisy infra-red distance sensors. These predictions together with distance sensors and the desired goal location are used by the navigation network to actually steer the robot through the environment in a goal-oriented manner. The training of this architecture is performed in a supervised way (with examples of trajectories created by a supervisor) using linear regression on the reservoir states. So, the reservoir acts as a temporal kernel projecting the inputs to a rich feature space, whose states are linearly combined to generate the desired outputs. Experimental results on a simulated robot show that the trained system can localize itself within both simple and large unknown environments and navigate successfully to desired goals.

 

 

 

  1. Eric Antonelo and Benjamin Schrauwen On Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures IEEE Transactions on Neural Networks and Learning Systems, Vol. 26 pp. 763-780 (2014). DOI: 10.1109/TNNLS.2014.2323247.  
  2. Eric Antonelo and Benjamin Schrauwen Supervised learning of internal models for autonomous goal-oriented robot navigation using Reservoir Computing IEEE International conference on Robotics and Automation, Proceedings, pp. 6 (2010)   

 

Biologically-inspired robot localization (Place cells)

on Wed, 12/09/2015 - 17:21

This work proposes a hierarchical biologically-inspired architecture for learning sensor-based spatial representations of a robot environment in an unsupervised way. The first layer is comprised of a fixed randomly generated recurrent neural network, the reservoir, which projects the input into a high-dimensional, dynamic space. The second layer learns instantaneous slowly-varying signals from the reservoir states using Slow Feature Analysis (SFA), whereas the third layer learns a sparse coding on the SFA layer using Independent Component Analysis (ICA). While the SFA layer generates non-localized activations in space, the ICA layer presents high place selectivity, forming a localized spatial activation, characteristic of place cells found in the hippocampus area of the rodent’s brain. We show that, using a limited number of noisy short-range distance sensors as input, the proposed system learns a spatial representation of the environment which can be used to predict the actual location of simulated and real robots, without the use of odometry. The results confirm that the reservoir layer is essential for learning spatial representations from low-dimensional input such as distance sensors. The main reason is that the reservoir state reflects the recent history of the input stream. Thus, this fading memory is essential for detecting locations, mainly when locations are ambiguous and characterized by similar sensor readings.

Video for data generation:

 

 

Publications

  1. Eric Antonelo and Benjamin Schrauwen Learning slow features with reservoir computing for biologically-inspired robot localization NEURAL NETWORKS, pp. 178-190 (2011)   
  2. Eric Antonelo and Benjamin Schrauwen Towards autonomous self-localization of small mobile robots using reservoir computing and slow feature analysis IEEE International conference on Systems, Man, and Cybernetics, Conference digest, Vol. 2, pp. (2009)   
  3. Eric Antonelo and Benjamin Schrauwen Unsupervised learning in reservoir computing : modeling hippocampal place cells for small mobile robots LECTURE NOTES IN COMPUTER SCIENCE, Vol. 5768, pp. 747-756 (2009)   

 

Pages