Job Description

One of the underlying assumptions of Reinforcement Learning (RL) methodologies is the stationarity of the environment in which the agent is embedded. Specifically, the three main elements characterizing a Markov Decision Process (MDP), i.e. the set of states, the set of actions and the reward function, are assumed to be invariant over time. In surveillance applications, however, this assumption is generally unrealistic, since the environment (i.e. the area that the agent has to monitor) is constantly changing. The number and types of objects and the varying disturbance statistics are just two examples of non-stationarity. The main aim of this project is therefore the development of original RL schemes able to cope with time-dependent MDPs. This challenging goal may bring significant benefits to many theoretical and applied AI subfields, from statistical learning theory and non-stationary random processes and sets to Signal Processing applications. The theoretical findings will be validated on an emerging and crucial problem: the detection of drones using massive antenna arrays.

Job Information

Contact
Stefano Fortunati
Related URL
https://l2s.centralesupelec...
Institution
Laboratoire des signaux et systèmes (L2S), Université Paris-Saclay/CentraleSupélec
Topic Category
Location
3 Rue Joliot-Curie, 91190 Gif-sur-Yvette, France
Closing Date
Sept. 4, 2022