State-of-the-art Artificial Intelligence method for detecting that you are really you, and not an intruder entering the code on your mobile phone.

# Research

My research projects are listed below:

## On Learning Navigation Behaviors for Small Mobile Robots with Reservoir Computing Architectures

This work proposes a general Reservoir Computing (RC) learning framework which can be used to learn navigation behaviors for mobile robots in simple and complex unknown, partially observable environments. RC provides an efficient way to train recurrent neural networks by keeping the recurrent part of the network (called the *reservoir*) fixed while only a linear readout output layer is trained.
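
As a rough illustration of this idea, the sketch below builds a small reservoir with fixed random weights and fits only the linear readout with ridge regression. The function names, the reservoir size, the spectral radius, and the toy sine-prediction task are assumptions chosen for illustration, not the setup used in the paper.

```python
import numpy as np

def make_reservoir(n_in, n_res, spectral_radius=0.9, seed=0):
    """Create fixed random input and recurrent weights (never trained)."""
    rng = np.random.default_rng(seed)
    w_in = rng.uniform(-0.5, 0.5, size=(n_res, n_in))
    w = rng.uniform(-0.5, 0.5, size=(n_res, n_res))
    # Rescale so the largest absolute eigenvalue equals spectral_radius.
    w *= spectral_radius / np.max(np.abs(np.linalg.eigvals(w)))
    return w_in, w

def run_reservoir(w_in, w, inputs):
    """Drive the reservoir with an input sequence and collect its states."""
    x = np.zeros(w.shape[0])
    states = []
    for u in inputs:
        x = np.tanh(w_in @ u + w @ x)
        states.append(x.copy())
    return np.array(states)

def train_readout(states, targets, ridge=1e-6):
    """Fit the linear readout with ridge regression (the only trained part)."""
    a = states.T @ states + ridge * np.eye(states.shape[1])
    return np.linalg.solve(a, states.T @ targets)

# Toy task (assumption): predict the next sample of a sine wave.
t = np.arange(0, 60, 0.1)
signal = np.sin(t)[:, None]
w_in, w = make_reservoir(n_in=1, n_res=50)
states = run_reservoir(w_in, w, signal[:-1])
washout = 50  # discard the initial transient of the reservoir
w_out = train_readout(states[washout:], signal[1:][washout:])
prediction = states[washout:] @ w_out
mse = float(np.mean((prediction - signal[1:][washout:]) ** 2))
```

Because the recurrent weights stay fixed, training reduces to solving a single linear system, which is what makes the approach efficient.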

The proposed RC framework builds upon the notion of *navigation attractor* or behavior which can be embedded in the high-dimensional space of the reservoir after learning.

The learning of multiple behaviors is possible because the dynamic robot behavior, consisting of a sensory-motor sequence, can be linearly discriminated in the high-dimensional nonlinear space of the dynamic reservoir.

Three learning approaches for navigation behaviors are shown in this paper. The first approach learns multiple behaviors based on examples of navigation behaviors generated by a supervisor, while the second approach learns goal-directed navigation behaviors based only on rewards. The third approach learns complex goal-directed behaviors, in a supervised way, using a hierarchical architecture whose internal predictions of contextual switches guide the sequence of basic navigation behaviors towards the goal.

## Robot learning through dynamical systems (PhD thesis)

During my PhD, I've worked mainly on Reservoir Computing (RC) architectures with application to modeling cognitive capabilities for mobile robots from sensor data and sometimes through interaction with the environment.

Reservoir Computing (RC) is an efficient method for training recurrent neural networks, which can handle spatio-temporal processing tasks, such as speech recognition. These networks are also biologically plausible, as recently argued in the literature.

In my case, I used these RC networks for modeling a wide range of capabilities for mobile robots, such as:

- localization,
- modeling behaviors,
- goal-directed navigation,
- generation of maps and planning of trajectories (including RNN *dreaming*),
- reinforcement learning in non-Markovian environments.

These tasks were modeled basically using regression for learning behaviors or classification for discrete localization.

My PhD thesis can be downloaded here. It is entitled: **"Reservoir Computing Architectures for Modeling Robot Navigation Systems"**.

My publications are listed and can be downloaded in Google Scholar or here.

Some simulated and real robots employed in the experiments:

Environment used for localization experiments using the real e-puck robot:

It is possible to perform map generation through sensory prediction given the robot position as input. Black points represent the sensory readings whereas gray points are the robot trajectory.

## Physics-Informed Neural Nets for Control of Dynamical Systems

Physics-informed neural networks (PINNs) incorporate known physical laws into the learning of deep neural networks, making sure they respect the physics of the process while decreasing the demand for labeled data. For systems represented by Ordinary Differential Equations (ODEs), the conventional PINN has a continuous time input variable and outputs the solution of the corresponding ODE. In their original form, PINNs do not allow control inputs, nor can they simulate over long time intervals without serious degradation in their predictions. In this context, this work presents a new framework called Physics-Informed Neural Nets for Control (PINC), which proposes a novel PINN-based architecture that is amenable to control problems and able to simulate for longer-range time horizons that are not fixed beforehand. The framework has new inputs to account for the initial state of the system and the control action. In PINC, the response over the complete time horizon is split such that each smaller interval constitutes a solution of the ODE conditioned on the fixed values of initial state and control action for that interval. The whole response is formed by feeding back the predictions of the terminal state as the initial state for the next interval. This proposal enables the optimal control of dynamic systems, integrating a priori knowledge from experts and data collected from plants into control applications. We showcase our proposal in the control of two nonlinear dynamic systems: the Van der Pol oscillator and the four-tank system.

**MPC**: Representation of the output prediction in a time instant, where the proposed actions generate a predicted behavior that reduces the distance between the value predicted by the model and a reference trajectory:

The PINC network takes the initial state y(0) of the dynamic system and the control input u as inputs, in addition to the continuous-time scalar t. Both y(0) and u can be multidimensional. The output y(t) corresponds to the state of the dynamic system as a function of $t \in [0, T]$, given the initial conditions y(0) and u. The deep network is fully connected even though not all connections are shown:

Below are the modes of operation of the PINC network. (a) The PINC net operates in self-loop mode, using its own output prediction as the next initial state, after T seconds. This operation mode is used within one iteration of MPC, for trajectory generation until the prediction horizon of MPC completes (predicted output from the first figure). (b) Block diagram for PINC connected to the plant. One pass through the diagram arrows corresponds to one MPC iteration applying a control input u for Ts timesteps for both plant and PINC network. Note that the initial state of the PINC net is set to the real output of the plant. In practice, in MPC, these two operation modes alternate (optimization over the prediction horizon, then application of the control action).


Below, the representation of a trained PINC network evolving through time in self-loop mode (panel (a) of the previous figure) for trajectory generation over the prediction horizon. The top dashed black curve corresponds to a predicted trajectory y of a hypothetical dynamic system in continuous time. The states y[k] are snapshots of the system in discrete time k positioned at the equidistant vertical lines. Between two vertical lines (during the inner continuous interval between steps k and k + 1), the PINC net learns the solution of an ODE with $t \in [0, T]$, conditioned on a fixed control input u[k] (blue solid line) and initial state y(0) (green thick dashed line). Control action u[k] is changed at the vertical lines and kept fixed for T seconds, and the initial state y(0) in the interval between steps k and k + 1 is updated to the last state of the previous interval k − 1 (indicated by the red curved arrow). The PINC net can directly predict the states at the vertical lines without the need to infer intermediate states t < T as numerical simulation does. Here, we assume that T = Ts and, thus, the number of discrete timesteps M is equal to the length of the prediction horizon in MPC.
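
The self-loop mode can be sketched in a few lines. In this minimal example, `pinc_net` is a hypothetical stand-in for a trained network: instead of a neural net, it returns the exact solution of the toy ODE dy/dt = -y + u, so the sketch runs without any training. The function names and the toy ODE are assumptions, not part of the PINC work itself.

```python
import numpy as np

def pinc_net(t, y0, u):
    """Stand-in for a trained PINC network: y(t) given y(0) = y0 and fixed u.

    Assumption: the exact solution of the toy ODE dy/dt = -y + u replaces
    the neural network, so no training is needed for this sketch.
    """
    return u + (y0 - u) * np.exp(-t)

def self_loop_rollout(y_start, controls, T):
    """Chain intervals of length T, feeding each terminal state back as y(0)."""
    y = y_start
    trajectory = [y]
    for u_k in controls:
        y = pinc_net(T, y, u_k)  # jump directly to the end of the interval
        trajectory.append(y)
    return np.array(trajectory)

# u[k] is held constant for T seconds within each interval.
controls = np.array([1.0, 1.0, 0.0, 0.0])
traj = self_loop_rollout(y_start=0.0, controls=controls, T=0.5)
```

Each call evaluates the model only at t = T, which mirrors the point made above: the states at the vertical lines are obtained directly, without stepping through intermediate times.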

## ESN-PNMPC: Efficient data-driven model predictive control of unknown nonlinear processes

The **control of nonlinear industrial processes** is a challenging task since the model of the plant may not be completely known a priori. In addition, the application of **nonlinear model predictive control** may be affected by modeling errors and subject to high computational complexity.

In this work, a new efficient **data-driven scheme** is proposed that alleviates some known issues in the so-called **Practical Nonlinear Model Predictive Controller** (PNMPC). In **PNMPC**, the model is only linearized partially: while the free response of the system is kept fully nonlinear, only the forced response is linearized. The general model for PNMPC proposed in this work consists of an **Echo State Network** (**ESN**), a recurrent neural network with very efficient training for system identification.

The benefit of the proposed **ESN-PNMPC** scheme is that it allows:

- fast system identification for nonlinear dynamic systems with arbitrary accuracy;
- analytical computation of derivatives from the ESN model for the forced response.

This last feature assures significantly lower computational complexity for derivative computation when compared to the original finite difference method of PNMPC.
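
To make the contrast concrete, here is a minimal sketch comparing an analytical derivative with a finite-difference one for a single ESN step. It assumes a plain tanh reservoir with random weights and covers only the one-step derivative of the output with respect to the input; the full PNMPC scheme propagates derivatives over the prediction horizon, which is not shown. All names and sizes are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_res, n_out = 2, 30, 1
w_in = rng.uniform(-0.5, 0.5, (n_res, n_in))    # input weights
w = rng.uniform(-0.1, 0.1, (n_res, n_res))      # recurrent weights
w_out = rng.uniform(-1.0, 1.0, (n_out, n_res))  # readout weights

def esn_step(x, u):
    """One reservoir update followed by the linear readout."""
    x_next = np.tanh(w_in @ u + w @ x)
    return x_next, w_out @ x_next

def analytic_jacobian(x, u):
    """dy/du for one step via the chain rule: W_out diag(1 - x_next^2) W_in."""
    x_next, _ = esn_step(x, u)
    return w_out @ ((1.0 - x_next ** 2)[:, None] * w_in)

def finite_difference_jacobian(x, u, eps=1e-6):
    """Same derivative, approximated with one extra model call per input."""
    _, y0 = esn_step(x, u)
    jac = np.zeros((n_out, n_in))
    for i in range(n_in):
        du = np.zeros(n_in)
        du[i] = eps
        _, y1 = esn_step(x, u + du)
        jac[:, i] = (y1 - y0) / eps
    return jac

x = rng.uniform(-0.5, 0.5, n_res)
u = np.array([0.3, -0.2])
jac_exact = analytic_jacobian(x, u)
jac_fd = finite_difference_jacobian(x, u)
```

The analytical version reuses a single forward pass, whereas the finite-difference version needs one additional model evaluation per control input.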

The proposed scheme is also enhanced with a correction filter that provides robustness to unforeseen disturbances during execution time, and is compared to an **LSTM** (Long Short-Term Memory) implementation for the model as well as to a PI controller. The universality of the approach is shown by application to the **control of different nonlinear plants**.

Figure from J. Jordanou et al., 2020 (submitted).

## Online recurrent neural network learning for control of nonlinear plants in oil and gas production platforms

This research line aims at designing **adaptive controllers** by using **Echo State Networks** (ESN) as an efficient data-driven method for training **recurrent neural networks** capable of controlling complex nonlinear plants, with a focus on **oil and gas production platforms** from *Petrobras*.

The resulting **ESN**-based controllers should learn **inverse models of the controlled plant** in an online fashion by interacting with the industrial plant and observing its dynamical behaviors.
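
A minimal sketch of online inverse-model learning follows. It makes several assumptions that are not stated above: recursive least squares (RLS) is used as the online update rule, the plant is a toy linear system, and raw plant outputs stand in for the reservoir states that an ESN would provide. The class and variable names are hypothetical.

```python
import numpy as np

class RLSReadout:
    """Linear readout updated one sample at a time with recursive least squares."""

    def __init__(self, n_features, forgetting=1.0, delta=1e3):
        self.w = np.zeros(n_features)
        self.p = delta * np.eye(n_features)  # running inverse-correlation estimate
        self.lam = forgetting

    def predict(self, features):
        return float(self.w @ features)

    def update(self, features, target):
        error = target - self.predict(features)
        gain = self.p @ features / (self.lam + features @ self.p @ features)
        self.w = self.w + gain * error
        self.p = (self.p - np.outer(gain, features @ self.p)) / self.lam
        return error

# Toy plant (assumption): y[k+1] = 0.8 * y[k] + 0.5 * u[k].
rng = np.random.default_rng(0)
readout = RLSReadout(n_features=2)
y = 0.0
for _ in range(200):
    u = rng.uniform(-1.0, 1.0)  # exploratory control input
    y_next = 0.8 * y + 0.5 * u
    # Inverse model: from (current output, next output), recover the input u.
    readout.update(np.array([y, y_next]), u)
    y = y_next

# Ask the inverse model which input moves the plant from y = 0.2 to y = 0.5.
u_cmd = readout.predict(np.array([0.2, 0.5]))
```

For this toy plant the exact inverse is u = 2 y[k+1] - 1.6 y[k], so the learned weights can be checked against it. In an ESN-based controller, the feature vector would instead be the reservoir state driven by the plant signals.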

In collaboration with Jean P. Jordanou, a Master's student under my supervision.

Well model. Figure by Jahanshahi et al. (2012).

Manifold connecting two oil wells and a riser. Figure by Jordanou.

Scheme of Adaptive ESN-based controller and nonlinear plant. Figure by Jordanou.

## UTEMA (Unbiased Temporal Machine for General-purpose Time series-based Fraud detection)

In the context of energy distribution networks, frauds are non-technical losses (NTL) that may account for up to 40% of the total distributed energy in some developing countries. The fraudster alters the electricity meter in order to pay less than the right amount. In this context, the discovery or detection of frauds is necessary in order to decrease the non-technical losses of the energy distribution networks, consequently enhancing the stability and reliability of the network.

This project proposes the use of Recurrent Neural Networks (RNNs) for projecting a time series into a spatial dimension such that it can be used as a universal temporal feature for fraud detection predictive models. The particular problem tackled here jointly with the partner company is to predict whether a given time series of monthly energy consumption data is likely to indicate a fraud (NTL) or not.

Two main approaches are planned to be used with RNNs: supervised learning with bias correction techniques, and self-organized models for unsupervised learning of new fraud (anomaly) patterns. Finally, a last step is to integrate both of the previously developed models into a unified architecture that learns the responsibilities of each model in an online way by feedback from the environment using the results of the inspections of the fraudsters - the ground truth for some of the predictions.

This project has potential not only for generating significant technological and commercial value for the industrial partner, but also outstanding scientific output, being applicable in the long-term to other fields such as monitoring, prognosis/diagnostics in robotics, medical systems and security applications.

Funding: AFR-PPP / FNR, Luxembourg.

## Proxy dynamical models of offshore oil production platforms via recurrent neural networks

Process measurements are of vital importance for monitoring and **control of industrial plants**. When we consider **offshore oil production platforms**, wells that require *gas-lift* technology to yield oil production from low pressure oil reservoirs can become unstable under some conditions. This undesirable phenomenon is usually called *slugging flow*, and can be identified by an oscillatory behavior of the downhole pressure measurement.

Given the importance of this measurement and the unreliability of the related sensor, this work aims at designing data-driven **soft-sensors** for **downhole pressure estimation** in two contexts: one for speeding up first-principles model simulation of a vertical riser model; and another for estimating the downhole pressure using real-world data from an oil well from *Petrobras* based only on topside platform measurements. Both tasks are tackled by employing **Echo State Networks (ESNs)** as an efficient technique for training Recurrent Neural Networks.

We show that a single ESN is capable of robustly modeling both the slugging flow behavior and a steady state based only on a square wave input signal representing the production choke opening in the vertical riser. Besides, we compare the performance of a standard network to the performance of a multiple timescale hierarchical architecture in the second task and show that for some periods the latter architecture performs better.

## Cognitive computation for Deviation detection in Fleet of City Buses

With Prof. Thorsteinn Rögnvaldsson, from Halmstad University, Sweden, we are investigating how Reservoir Computing can help with deviation detection in a fleet of Swedish city buses. The approach uses the air tank pressure signal from the buses to predict, well in advance, when a bus is going to break down.

Video from the project at Halmstad University:

## Learning navigation attractors for mobile robots with reinforcement learning and reservoir computing

Autonomous robot navigation in partially observable environments is a complex task because the state of the environment cannot be completely determined only by the current sensory readings of a robot. This work uses the recently introduced paradigm for training recurrent neural networks (RNNs), called reservoir computing (RC), to model multiple navigation attractors in partially observable environments. In RC, the RNN with randomly generated fixed weights, called the reservoir, projects the input into a high-dimensional dynamic space. Only the readout output layer is trained using standard linear regression techniques, and in this work, it is used to approximate the state-action value function. By using a policy iteration framework, where an alternating sequence of policy improvement (sample generation from environment interaction) and policy evaluation (network training) steps is performed, the system is able to shape navigation attractors so that, after convergence, the robot follows the correct trajectory towards the goal. The experiments are accomplished using an e-puck robot extended with 8 distance sensors in a rectangular environment with an obstacle between the robot and the target region. The task is to reach the goal through the correct side of the environment, which is indicated by a temporary stimulus previously observed at the beginning of the episode. We show that the reservoir-based system (with short-term memory) can model these navigation attractors, whereas a feedforward network without memory fails to do so.

Reservoir Computing network as a function approximator for reinforcement learning tasks with partially observable environments. The reservoir is a dynamical system of recurrent nodes. Solid lines represent connections which are fixed. Dashed lines are the connections to be trained.

Motor primitives or basic behaviors: left, forward and right.

A sequence of robot trajectories as learning evolves, using the ESN. Each plot shows robot trajectories in the environment for several episodes during the learning process. In the beginning, exploration is high and several locations are visited by the robot. As the simulation develops, two navigation attractors are formed to the left and to the right so that the agent receives maximal reward.

## Biologically-inspired robot localization (Place cells)

This work proposes a hierarchical biologically-inspired architecture for learning sensor-based spatial representations of a robot environment in an unsupervised way. The first layer is comprised of a fixed randomly generated recurrent neural network, the reservoir, which projects the input into a high-dimensional, dynamic space. The second layer learns instantaneous slowly-varying signals from the reservoir states using Slow Feature Analysis (SFA), whereas the third layer learns a sparse coding on the SFA layer using Independent Component Analysis (ICA). While the SFA layer generates non-localized activations in space, the ICA layer presents high place selectivity, forming a localized spatial activation, characteristic of place cells found in the hippocampus area of the rodent’s brain. We show that, using a limited number of noisy short-range distance sensors as input, the proposed system learns a spatial representation of the environment which can be used to predict the actual location of simulated and real robots, without the use of odometry. The results confirm that the reservoir layer is essential for learning spatial representations from low-dimensional input such as distance sensors. The main reason is that the reservoir state reflects the recent history of the input stream. Thus, this fading memory is essential for detecting locations, mainly when locations are ambiguous and characterized by similar sensor readings.
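
The second layer can be illustrated with a minimal linear Slow Feature Analysis sketch. It is applied here to a toy mixture of one slow and one fast sine wave rather than to actual reservoir states; the function name and the toy signals are assumptions for illustration, and the ICA layer is not shown.

```python
import numpy as np

def slow_feature_analysis(signals, n_features):
    """Linear SFA: find projections of `signals` that vary most slowly in time."""
    centered = signals - signals.mean(axis=0)
    # Step 1: whiten so that every direction has unit variance.
    eigval, eigvec = np.linalg.eigh(np.cov(centered, rowvar=False))
    whitener = eigvec / np.sqrt(eigval)
    white = centered @ whitener
    # Step 2: among unit-variance directions, keep those whose time
    # derivative (approximated by differences) has the smallest variance.
    d_cov = np.cov(np.diff(white, axis=0), rowvar=False)
    d_eigval, d_eigvec = np.linalg.eigh(d_cov)  # eigenvalues sorted ascending
    return white @ d_eigvec[:, :n_features]

# Toy input (assumption): two channels mixing a slow and a fast sine wave.
t = np.linspace(0, 2 * np.pi, 2000)
slow = np.sin(t)
fast = np.sin(25 * t)
mixed = np.column_stack([slow + 0.5 * fast, 0.5 * slow - fast])
recovered = slow_feature_analysis(mixed, n_features=1)[:, 0]
corr = abs(np.corrcoef(recovered, slow)[0, 1])
```

In the architecture described above, `signals` would be the matrix of reservoir states over time, so the extracted slow features reflect the robot's slowly changing location rather than fast sensor fluctuations.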

**Video for data generation**:

**Publications**

- *Eric Antonelo and Benjamin Schrauwen*. **Learning slow features with reservoir computing for biologically-inspired robot localization**. NEURAL NETWORKS, pp. 178-190 (2011)
- *Eric Antonelo and Benjamin Schrauwen*. **Towards autonomous self-localization of small mobile robots using reservoir computing and slow feature analysis**. IEEE International conference on Systems, Man, and Cybernetics, Conference digest, Vol. 2, pp. (2009)
- *Eric Antonelo and Benjamin Schrauwen*. **Unsupervised learning in reservoir computing : modeling hippocampal place cells for small mobile robots**. LECTURE NOTES IN COMPUTER SCIENCE, Vol. 5768, pp. 747-756 (2009)