
Artificial Intelligence

On Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures

on Wed, 12/16/2015 - 14:56

This work proposes a general Reservoir Computing (RC) learning framework which can be used to learn navigation behaviors for mobile robots in simple and complex unknown, partially observable environments. RC provides an efficient way to train recurrent neural networks by keeping the recurrent part of the network (called the reservoir) fixed while only a linear readout output layer is trained.
The proposed RC framework builds upon the notion of navigation attractor or behavior which can be embedded in the high-dimensional space of the reservoir after learning. 
The learning of multiple behaviors is possible because the dynamic robot behavior, consisting of a sensory-motor sequence, can be linearly discriminated in the high-dimensional nonlinear space of the dynamic reservoir. 
Three learning approaches for navigation behaviors are shown in this paper. The first approach learns multiple behaviors based on examples of navigation behaviors generated by a supervisor, while the second approach learns goal-directed navigation behaviors based only on rewards. The third approach learns complex goal-directed behaviors, in a supervised way, using a hierarchical architecture whose internal predictions of contextual switches guide the sequence of basic navigation behaviors towards the goal.
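The core RC idea described above (a fixed random reservoir plus a trained linear readout) can be sketched as follows. This is a minimal illustration, not the implementation used in the paper: the tanh update, the spectral-radius rescaling, the ridge regularizer, the function names and the toy delay task are all assumptions made for the example.

```python
import numpy as np

def make_reservoir(n_inputs, n_units, spectral_radius=0.9, seed=0):
    """Draw fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_units, n_inputs))
    w = rng.uniform(-0.5, 0.5, size=(n_units, n_units))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Collect reservoir states x[t] = tanh(W_in u[t] + W x[t-1])."""
    x = np.zeros(w.shape[0])
    states = np.empty((len(inputs), w.shape[0]))
    for t, u_t in enumerate(inputs):
        x = np.tanh(w_in @ u_t + w @ x)
        states[t] = x
    return states

def train_readout(states, targets, ridge=1e-6):
    """Fit the linear readout by ridge regression (the only trained part)."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(a, states.T @ targets)

# Toy task (assumed for illustration): reproduce the previous input value.
rng = np.random.default_rng(1)
u = rng.uniform(-1.0, 1.0, size=(500, 1))
y = np.vstack([np.zeros((1, 1)), u[:-1]])
w_in, w = make_reservoir(n_inputs=1, n_units=50)
states = run_reservoir(w_in, w, u)
w_out = train_readout(states, y)
mse = float(np.mean((states @ w_out - y) ** 2))
```

Because only `w_out` is fitted, training reduces to a single linear solve. The recurrent weights `w` supply the short-term memory that the delayed target depends on.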

 

 

 

 

 

Robot learning through dynamical systems (PhD thesis)

on Wed, 12/09/2015 - 14:12

During my PhD, I worked mainly on Reservoir Computing (RC) architectures, applied to modeling cognitive capabilities for mobile robots from sensor data and sometimes through interaction with the environment.

Reservoir Computing (RC) is an efficient method for training recurrent neural networks, which can handle spatio-temporal processing tasks such as speech recognition. These networks are also biologically plausible, as recently argued in the literature.

In my case, I used these RC networks for modeling a wide range of capabilities for mobile robots, such as:

These tasks were modeled using either regression (for learning behaviors) or classification (for discrete localization).

My PhD thesis can be downloaded here. It is entitled: "Reservoir Computing Architectures for Modeling Robot Navigation Systems".

My publications are listed, and can be downloaded, on Google Scholar or here.

Some simulated and real robots employed in the experiments:


 

Environment used for localization experiments using the real e-puck robot:


 

After using unsupervised learning methods for self-localization, the plots below show the mean activation of place cells as a function of the robot position in the environment.
Red denotes a high response whereas blue denotes a low response.
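A map of mean activation versus robot position, like the ones described above, could be computed along these lines. This is only a sketch: the grid resolution, the unit-square environment extent and the function name `mean_activation_map` are assumptions, and the original plots may have been produced differently.

```python
import numpy as np

def mean_activation_map(positions, activation, bins=10,
                        extent=((0.0, 1.0), (0.0, 1.0))):
    """Average one unit's activation over a grid of robot positions.

    positions:  array of shape (T, 2) with the robot's (x, y) at each step
    activation: array of shape (T,) with the unit's output at each step
    Returns a (bins, bins) array; cells the robot never visited are NaN.
    """
    xs, ys = positions[:, 0], positions[:, 1]
    total, _, _ = np.histogram2d(xs, ys, bins=bins, range=extent,
                                 weights=activation)
    count, _, _ = np.histogram2d(xs, ys, bins=bins, range=extent)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(count > 0, total / count, np.nan)

# Tiny example: two visits to one corner, one visit to the opposite corner.
positions = np.array([[0.1, 0.1], [0.1, 0.1], [0.9, 0.9]])
activation = np.array([1.0, 3.0, 5.0])
grid = mean_activation_map(positions, activation, bins=2)
```

The resulting grid can then be passed to any heat-map plotting routine with a blue-to-red color scale.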
 

It is possible to perform map generation through sensory prediction given the robot position as input. Black points represent the sensory readings whereas gray points are the robot trajectory.

 

State-of-the-art Artificial Intelligence method for detecting that it is really you, and not some intruder, entering the code on your mobile phone.

Technologies used:
Python (backend & custom Neural network model);
Java (Android app frontend);

Developed in 2016/2017.

 

More information:  TigerAI_info.pdf

 

 

Technologies used: C++, TCP/IP sockets, Linux, Qt, Qwt. 
Developed from 2001 to 2008.
 
Two different software programs were developed during my undergraduate and Master's studies: SINAR, a simulator that graphically displays the environment and the simulation in real time; and CONAR, the autonomous controller that receives sensor data (from SINAR) and outputs actuator data (to SINAR). Simulations with multiple robots can be run if more than one controller (CONAR) connects to SINAR. The communication between both programs is shown in the following figure:

The communication protocol is implemented using TCP/IP sockets. Thus, several controllers can run on different computers over a network (distributing the computing load across different nodes). Both programs were developed under the Linux operating system; the graphical interface was developed with the Qt library and some graphical plots were created with the Qwt library. C++ was the programming language used to create both programs.

SINAR

SINAR is a simulator for autonomous robot navigation experiments. Its graphical user interface contains menus, command bars, and the environment display. 

The user can create simulation environments merely by clicking and moving the mouse cursor in the display area. Objects are inserted, resized, translated and rotated simply by moving the mouse. An object can be an obstacle, a target or even a robot. The user can also edit an object by changing its color and type of movement (for moving objects).

Environments can be saved to files and later loaded for use in simulations. Before a simulation starts, one or more controllers (CONAR) should be connected to the SINAR software. The user can control the simulation by activating the appropriate button commands: start, pause and finish.

During a simulation, sensor data can be viewed graphically through plots in real time:

The performance of the robot (number of collisions, number of target captures, and number of executed iterations) can be verified in real time as well.

There are two modes of simulation: ordinary mode and sophisticated mode. In the former, the environment display is updated at each iteration so that the user can follow the progress of the simulation graphically. Furthermore, the user can move any object in the environment in real time.

In the latter mode, the simulation runs implicitly (without graphics) and consists of a set of experiments configured by a specific C++ script. The script determines the sequence and duration of the simulation experiments (each experiment uses a distinct environment), as well as the number of repetitions of a sequence of experiments. In the sophisticated mode, all generated data are recorded in files: the trajectory of the robot and the performance measures (number of captures and collisions, with their respective iteration times). The representation of the final state of the environment in PNG format and the performance plot (also called the learning evolution graphic) are also generated automatically. The controller data (neural network states) are saved automatically as well, since the script tells CONAR to save its state when each simulation finishes.
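The kind of schedule such a script describes (a sequence of experiments, each with its own environment and duration, repeated a number of times) can be illustrated as below. The original scripts were written in C++ and their format is not shown here, so this Python sketch is purely hypothetical: the class name, field names, file names and iteration counts are all invented for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Experiment:
    environment_file: str  # hypothetical name of a saved environment file
    iterations: int        # duration of the experiment, in iterations

def schedule(experiments, repetitions):
    """Yield (repetition, experiment) pairs in execution order."""
    for rep in range(repetitions):
        for exp in experiments:
            yield rep, exp

# Two experiments in distinct environments, the whole sequence repeated 3 times.
plan = [Experiment("corridor.env", 5000), Experiment("maze.env", 8000)]
runs = list(schedule(plan, repetitions=3))
total_iterations = sum(exp.iterations for _, exp in runs)
```

A batch driver would iterate over `runs`, load each environment, simulate for the given number of iterations, and ask the controller to save its state at the end of each run.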

 

CONAR

CONAR is a program that simulates the brain of a robot located in the SINAR environment. After receiving sensor data (distance, color and contact) from its respective robot in the SINAR environment, it sends actuator data (direction adjustment and velocity adjustment) back to the same robot. This cycle repeats until the simulation ends.
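The sense-act cycle between simulator and controller can be sketched in a few lines. The actual programs are written in C++ and their wire format is not documented here, so everything below is an assumption for illustration: the newline-delimited JSON messages, the field names, the three-element distance vector, the placeholder control rule, and the use of `socket.socketpair()` in place of a real TCP connection between machines.

```python
import json
import socket
import threading

def send_msg(sock_file, payload):
    """Write one JSON message per line (hypothetical wire format)."""
    sock_file.write(json.dumps(payload) + "\n")
    sock_file.flush()

def recv_msg(sock_file):
    """Read one JSON message, or return None if the peer closed."""
    line = sock_file.readline()
    return json.loads(line) if line else None

def controller(sock):
    """CONAR-like side: read sensor data, reply with actuator adjustments."""
    with sock, sock.makefile("rw") as f:
        while (sensors := recv_msg(f)) is not None:
            if sensors.get("done"):
                break
            # Placeholder rule (not the neural controller): back off on contact.
            turn = 30.0 if sensors["contact"] else 0.0
            speed = -0.5 if sensors["contact"] else 0.1
            send_msg(f, {"direction_adjustment": turn,
                         "velocity_adjustment": speed})

def simulator(sock, steps):
    """SINAR-like side: send sensor readings, collect actuator replies."""
    replies = []
    with sock, sock.makefile("rw") as f:
        for t in range(steps):
            send_msg(f, {"distance": [1.0, 0.8, 1.0], "color": 0,
                         "contact": t % 2 == 1})
            replies.append(recv_msg(f))
        send_msg(f, {"done": True})
    return replies

# Connected socket pair stands in for a TCP connection over the network.
sim_sock, ctl_sock = socket.socketpair()
worker = threading.Thread(target=controller, args=(ctl_sock,))
worker.start()
replies = simulator(sim_sock, steps=4)
worker.join()
```

With real TCP sockets, the simulator side would accept one connection per controller, which is what allows several controllers to run on different machines.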

The graphical interface of CONAR is shown in the following figure. Parameters of the controller can be adjusted before the simulation and in real time. Commands are activated by clicking on buttons: connect to SINAR, apply parameter changes in real time, generate performance data and plots for recording in files, save the state of the neural networks, and exit the simulation. Furthermore, some neural networks in the controller (the IP, IC, RR and AR networks) can be disabled in real time, so that they output null (zero).

In addition, the state of the neural networks can be viewed graphically in real time. In the following figures, a neuron is represented by a circle: the darker a neuron is, the stronger its output.

The figure above shows a representation of the PI repertoire neurons. A small red square inside a circle means that the neuron has already been a winner during a learning event.

The next figure shows AR, RR and actuator neurons. The energy levels (degree of activity) of AR or RR neurons are represented by a thick line next to the respective neurons.

The following picture shows the graphical representation of the output of winner neurons in the PI repertoire (each line represents the winner neuron output in a column: the first line corresponds to the first column, and so on).

 

To see a video of a simulation run, check out this page: Reinforcement learning of robot behaviors

Related publications

  1. Eric Antonelo, Albert-Jan Baerveldt, Thorsteinn Rognvaldsson and Mauricio Figueiredo. Modular Neural Network and Classical Reinforcement Learning for Autonomous Robot Navigation: Inhibiting Undesirable Behaviors. Proceedings of the IEEE International Joint Conference on Neural Networks (IJCNN), pp. 1225-1232 (2006)
  2. Eric Antonelo. A Neural Reinforcement Learning Approach for Behavior Acquisition in Intelligent Autonomous Systems. Master thesis, Halmstad University (2006)
  3. Eric Antonelo, Mauricio Figueiredo, Albert-Jan Baerveldt and Rodrigo Calvo. Intelligent autonomous navigation for mobile robots: spatial concept acquisition and object discrimination. Proceedings of the 6th IEEE International Symposium on Computational Intelligence in Robotics and Automation (CIRA), pp. 553-557 (2005)
  4. Eric Antonelo and Mauricio Figueiredo. Autonomous intelligent systems applied to robot navigation: spatial concept acquisition and object discrimination. Proceedings of the 2nd National Meeting of Intelligent Robotics (II ENRI) in the Congress of the Brazilian Computer Society (in Portuguese), pp. (2004)

Learning navigation attractors for mobile robots with reinforcement learning and reservoir computing

on Wed, 12/16/2015 - 16:48

Autonomous robot navigation in partially observable environments is a complex task because the state of the environment cannot be completely determined from the current sensory readings of a robot alone. This work uses a recently introduced paradigm for training recurrent neural networks (RNNs), called reservoir computing (RC), to model multiple navigation attractors in partially observable environments. In RC, an RNN with randomly generated fixed weights, called the reservoir, projects the input into a high-dimensional dynamic space. Only the readout output layer is trained, using standard linear regression techniques; in this work, it is used to approximate the state-action value function. By using a policy iteration framework, in which an alternating sequence of policy improvement (sample generation from environment interaction) and policy evaluation (network training) steps is performed, the system is able to shape navigation attractors so that, after convergence, the robot follows the correct trajectory towards the goal. The experiments are conducted using an e-puck robot extended with 8 distance sensors in a rectangular environment with an obstacle between the robot and the target region. The task is to reach the goal through the correct side of the environment, which is indicated by a temporary stimulus observed at the beginning of the episode. We show that the reservoir-based system (with short-term memory) can model these navigation attractors, whereas a feedforward network without memory fails to do so.
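The alternation between sample generation and readout training can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not the algorithm from the paper: the epsilon-greedy exploration, the bootstrapped target r + gamma * max Q, the per-action ridge regression, and all names and parameter values are choices made for this example. Reservoir states are taken as given vectors.

```python
import numpy as np

ACTIONS = ("left", "forward", "right")  # motor primitives named on this page

def q_values(w_out, state):
    """Linear readout: one Q-value per action from a reservoir state."""
    return state @ w_out

def epsilon_greedy(q, epsilon, rng):
    """Explore with probability epsilon, otherwise pick the best action."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q)))
    return int(np.argmax(q))

def evaluate_policy(w_out, samples, gamma=0.9, ridge=1e-3):
    """Refit the readout on targets r + gamma * max_a Q(x', a).

    samples: list of (state, action, reward, next_state, done) tuples.
    """
    n_units, n_actions = w_out.shape
    new_w = w_out.copy()
    for a in range(n_actions):
        rows = [(x, r, xn, d) for x, act, r, xn, d in samples if act == a]
        if not rows:
            continue  # no data for this action: keep its old weights
        X = np.array([x for x, _, _, _ in rows])
        y = np.array([r + (0.0 if d else gamma * np.max(q_values(w_out, xn)))
                      for _, r, xn, d in rows])
        new_w[:, a] = np.linalg.solve(X.T @ X + ridge * np.eye(n_units),
                                      X.T @ y)
    return new_w

# Tiny example: from state x, only action "right" (index 2) is rewarded.
rng = np.random.default_rng(0)
x = np.array([1.0, 0.0])
samples = [(x, a, 1.0 if a == 2 else 0.0, x, True) for a in range(3)]
w_out = evaluate_policy(np.zeros((2, 3)), samples)
q = q_values(w_out, x)
best = epsilon_greedy(q, epsilon=0.0, rng=rng)
```

In a full loop, new samples would be collected with the refitted readout (policy improvement) and `evaluate_policy` called again, typically while `epsilon` is gradually reduced.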

Reservoir Computing network as a function approximator for reinforcement learning tasks in partially observable environments. The reservoir is a dynamical system of recurrent nodes. Solid lines represent fixed connections. Dashed lines represent the connections to be trained.

 

Motor primitives or basic behaviors: left, forward and right.

 

A sequence of robot trajectories as learning evolves, using the ESN. Each plot shows robot trajectories in the environment for several episodes during the learning process. In the beginning, exploration is high and several locations are visited by the robot. As the simulation develops, two navigation attractors are formed to the left and to the right so that the agent receives maximal reward.

 

Publications

  1. Eric Antonelo and Benjamin Schrauwen. On Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures. IEEE Transactions on Neural Networks and Learning Systems, Vol. 26, pp. 763-780 (2014). DOI: 10.1109/TNNLS.2014.2323247.
  2. Eric Antonelo, Stefan Depeweg and Benjamin Schrauwen. Learning navigation attractors for mobile robots with reinforcement learning and reservoir computing. Proceedings of the X Brazilian Congress on Computational Intelligence, pp. (2011)

 
