Skip directly to content

Aprendizagem de Máquina

On Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures

on Wed, 12/16/2015 - 14:56

This work proposes a general Reservoir Computing (RC) learning framework which can be used to learn navigation behaviors for mobile robots in simple and complex unknown, partially observable environments. RC provides an efficient way to train recurrent neural networks by letting the recurrent part of the network (called reservoir) fixed while only a linear readout output layer is trained.
The proposed RC framework builds upon the notion of navigation attractor or behavior which can be embedded in the high-dimensional space of the reservoir after learning. 
The learning of multiple behaviors is possible because the dynamic robot behavior, consisting of a sensory-motor sequence, can be linearly discriminated in the high-dimensional nonlinear space of the dynamic reservoir. 
Three learning approaches for navigation behaviors are shown in this paper. The first approach learns multiple behaviors based on examples of navigation behaviors generated by a supervisor, while the second approach learns goal-directed navigation behaviors based only on rewards. The third approach learns complex goal-directed behaviors, in a supervised way, using an hierarchical architecture whose internal predictions of contextual switches guide the sequence of basic navigation behaviors towards the goal.

 

 

 

 

 

Robot learning through dynamical systems (PhD thesis)

on Wed, 12/09/2015 - 14:12

During my PhD, I've worked mainly on Reservoir Computing (RC) architectures with application to modeling cognitive capabilities for mobile robots from sensor data and sometimes through interaction with the environment.

Reservoir Computing (RC) is an efficient method for trainning recurrent neural networks, which can handle spatio-temporal processing tasks, such as speech recognition. These networks are also biological plausible, as recently argued in the literature.

In my case, I used these RC networks for modeling a wide range of capabilities for mobile robots, such as:

These tasks were modeled basically using regression for learning behaviors or classification for discrete localization.

My PhD thesis can be download here. It is entitled: "Reservoir Computing Architectures for Modeling Robot Navigation Systems".

My publications are listed and can be downloaded in Google Scholar or here.

Some simulated and real robots employed in the experiments:


 

Environment used for localization experiments using the real e-puck robot:


 

After using unsupervised learning methods for self-localization, the plots below show the mean activation of place cells as a function of the robot position in the environment.
Red denotes a high response whereas blue denotes a low response.
 

It is possible to perform map generation through sensory prediction given the robot position as input. Black points represent the sensory readings whereas gray points are the robot trajectory.

 

Physics-Informed Neural Nets for Control of Dynamical Systems

on Tue, 09/28/2021 - 16:00

Physics-informed neural networks (PINNs) impose known physical laws into the learning of deep neural networks, making sure they respect the physics of the process while decreasing the demand of labeled data. For systems represented by Ordinary Differential Equations (ODEs), the conventional PINN has a continuous time input variable and outputs the solution of the corresponding ODE. In their original form, PINNs do not allow control inputs neither can they simulate for long-range intervals without serious degradation in their predictions. In this context, this work presents a new framework called Physics-Informed Neural Nets for Control (PINC), which proposes a novel PINN-based architecture that is amenable to control problems and able to simulate for longer-range time horizons that are not fixed beforehand. The framework has new inputs to account for the initial state of the system and the control action. In PINC, the response over the complete time horizon is split such that each smaller interval constitutes a solution of the ODE conditioned on the fixed values of initial state and control action for that interval. The whole response is formed by feeding back the predictions of the terminal state as the initial state for the next interval. This proposal enables the optimal control of dynamic systems, integrating a priori knowledge from experts and data collected from plants into control applications. We showcase our proposal in the control of two nonlinear dynamic systems: the Van der Pol oscillator and the four-tank system.

MPC: Representation of the output prediction in a time instant, where the proposed actions generate a predicted behavior that reduces the distance between the value predicted by the model and a reference trajectory:

mpc_pred.png


The PINC network has initial state y(0) of the dynamic system and control input u as inputs, in addition to continuous time scalar t. Both y(0) and u can be multidimensional. The output y(t) corresponds to the state of the dynamic system as a function of t 2 [0; T], and initial conditions given by y(0) and u. The deep network is fully connected even though not all connections are shown:

pinc_net.png

Below, modes of operation of the PINC network. (a) PINC net operates in self-loop mode, using its own output prediction as next initial state, after T seconds. This operation mode is used within one iteration of MPC, for trajectory generation until the prediction horizon of MPC completes (predicted output from the first Figure). (b) Block diagram for PINC connected to the plant. One pass through the diagram arrows corresponds to one MPC iteration applying a control input u for Ts timesteps for both plant and PINC network. Note that the initial state of the PINC net is set to the real output of the plant. In practice, in MPC, these two operation modes are executed in an alternated way (optimization in the prediction horizon, and application of control action).

a). pinc_feedback.png           


b) pinc_plant.png

Below, the representation of a trained PINC network evolving through time in self-loop mode (previous Figure a)) for trajectory generation in prediction horizon. The top dashed black curve corresponds to a predicted trajectory y of a hypothetical dynamic system in continuous time. The states y[k] are snapshots of the system in discrete time k positioned at the equidistant vertical lines. Between two vertical lines (during the inner continuous interval between steps k and k + 1), the PINC net learns the solution of an ODE with t \in [0; T], conditioned on a fixed control input u[k] (blue solid line) and initial state y(0) (green thick dashed line). Control action u[k] is changed at the vertical lines and kept fixed for T seconds, and the initial state y(0) in the interval between steps k and k + 1 is updated to the last state of the previous interval k  1 (indicated by the red curved arrow). The PINC net can directly predict the states at the vertical lines without the need to infer intermediate states t < T as numerical simulation does. Here, we assume that T = Ts and, thus, the number of discrete timesteps M is equal to the length of the prediction horizon in MPC.

pinc_evolution.png

Online recurrent neural network learning for control of nonlinear plants in oil and gas production platforms

on Wed, 10/17/2018 - 12:54

This research line aims at designing adaptive controllers by using Echo State Networks (ESN) as a efficient data-driven method for training recurrent neural networks capable of controlling complex nonlinear plants, with a focus on oil and gas production platforms from Petrobras.

The resulting ESN-based controllers should learn inverse models of the controlled plant in an online fashion by interacting with the industrial plant and observing its dynamical behaviors.

In collaboration with supervised Master Student Jean P. Jordanou.

Well model. Figure by Jahanshahi et al. (2012).          

 

Manifold connecting two oil wells and a riser. Figure by Jordanou.

Scheme of Adaptive ESN-based controller and nonlinear plant. Figure by Jordanou

State-of-the-art Artificial Intelligence method for detecting that you is really you and not some intruder entering the code on your mobile phone.

Technologies used:
Python (backend & custom Neural network model);
Java (Android app frontend);

Developed in 2016/2017.

 

More information:  TigerAI_info.pdf

 

 

Pages