Skip directly to content

Aprendizagem de Máquina

On Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures

on Wed, 12/16/2015 - 14:56

This work proposes a general Reservoir Computing (RC) learning framework which can be used to learn navigation behaviors for mobile robots in simple and complex unknown, partially observable environments. RC provides an efficient way to train recurrent neural networks by letting the recurrent part of the network (called reservoir) fixed while only a linear readout output layer is trained.
The proposed RC framework builds upon the notion of navigation attractor or behavior which can be embedded in the high-dimensional space of the reservoir after learning. 
The learning of multiple behaviors is possible because the dynamic robot behavior, consisting of a sensory-motor sequence, can be linearly discriminated in the high-dimensional nonlinear space of the dynamic reservoir. 
Three learning approaches for navigation behaviors are shown in this paper. The first approach learns multiple behaviors based on examples of navigation behaviors generated by a supervisor, while the second approach learns goal-directed navigation behaviors based only on rewards. The third approach learns complex goal-directed behaviors, in a supervised way, using an hierarchical architecture whose internal predictions of contextual switches guide the sequence of basic navigation behaviors towards the goal.

 

 

 

 

 

Robot learning through dynamical systems (PhD thesis)

on Wed, 12/09/2015 - 14:12

During my PhD, I've worked mainly on Reservoir Computing (RC) architectures with application to modeling cognitive capabilities for mobile robots from sensor data and sometimes through interaction with the environment.

Reservoir Computing (RC) is an efficient method for trainning recurrent neural networks, which can handle spatio-temporal processing tasks, such as speech recognition. These networks are also biological plausible, as recently argued in the literature.

In my case, I used these RC networks for modeling a wide range of capabilities for mobile robots, such as:

These tasks were modeled basically using regression for learning behaviors or classification for discrete localization.

My PhD thesis can be download here. It is entitled: "Reservoir Computing Architectures for Modeling Robot Navigation Systems".

My publications are listed and can be downloaded in Google Scholar or here.

Some simulated and real robots employed in the experiments:


 

Environment used for localization experiments using the real e-puck robot:


 

After using unsupervised learning methods for self-localization, the plots below show the mean activation of place cells as a function of the robot position in the environment.
Red denotes a high response whereas blue denotes a low response.
 

It is possible to perform map generation through sensory prediction given the robot position as input. Black points represent the sensory readings whereas gray points are the robot trajectory.

 

Online recurrent neural network learning for control of nonlinear plants in oil and gas production platforms

on Wed, 10/17/2018 - 12:54

This research line aims at designing adaptive controllers by using Echo State Networks (ESN) as a efficient data-driven method for training recurrent neural networks capable of controlling complex nonlinear plants, with a focus on oil and gas production platforms from Petrobras.

The resulting ESN-based controllers should learn inverse models of the controlled plant in an online fashion by interacting with the industrial plant and observing its dynamical behaviors.

In collaboration with supervised Master Student Jean P. Jordanou.

Well model. Figure by Jahanshahi et al. (2012).          

 

Manifold connecting two oil wells and a riser. Figure by Jordanou.

Scheme of Adaptive ESN-based controller and nonlinear plant. Figure by Jordanou

State-of-the-art Artificial Intelligence method for detecting that you is really you and not some intruder entering the code on your mobile phone.

Technologies used:
Python (backend & custom Neural network model);
Java (Android app frontend);

Developed in 2016/2017.

 

More information:  TigerAI_info.pdf

 

 

Learning navigation attractors for mobile robots with reinforcement learning and reservoir computing

on Wed, 12/16/2015 - 16:48

Autonomous robot navigation in partially observable environments is a complex task because the state of the environment can not be completely determined only by the current sensory readings of a robot. This work uses the recently introduced paradigm for training recurrent neural networks (RNNs), called reservoir computing (RC), to model multiple navigation attractors in partially observable environments. In RC, the RNN with randomly generated fixed weights, called reservoir, projects the input into a high-dimensional dynamic space. Only the readout output layer is trained using standard linear regression techniques, and in this work, is used to approximate the state-action value function. By using a policy iteration framework, where an alternating sequence of policy improvement (samples generation from environment interaction) and policy evaluation (network training) steps are performed, the system is able to shape navigation attractors so that, after convergence, the robot follows the correct trajectory towards the goal. The experiments are accomplished using an e-puck robot extended with 8 distance sensors in a rectangular environment with an obstacle between the robot and the target region. The task is to reach the goal through the correct side of the environment, which is indicated by a temporary stimulus previously observed at the beginning of the episode. We show that the reservoir-based system (with short-term memory) can model these navigation attractors, whereas a feedforward network without memory fails to do so.

Reservoir Computing network as a function approximator for reinforcement learning tasks with partially observable environments. The reservoir is a dynamical system of recurrent nodes. Solid lines represent connections which are fixed. Dashed lines are the connections to be trained

 

Motor primitives or basic behaviors: left, forward and right.

 

A sequence of robot trajectories as learning evolves, using the ESN. Each plot shows robot trajectories in the environment for several episodes during the learning process. In the beginning, exploration is high and several locations are visited by the robot. As the simulation develops, two navigation attractors are formed to the left and to the right so that the agent receives maximal reward.

 

Pages