Active perception in adversarial scenarios using maximum entropy deep reinforcement learning
2019 International Conference on Robotics and Automation (ICRA), 2019
We pose an active perception problem in which an autonomous agent actively interacts with a second agent whose behavior is potentially adversarial. Given the uncertainty in the other agent's intent, the objective is to collect further evidence that helps discriminate potential threats. The main technical challenges are the partial observability of the agent's intent, adversary modeling, and the corresponding uncertainty modeling. Note that an adversarial agent may act to mislead the autonomous agent by using a deceptive strategy learned from past experiences. We propose an approach that combines belief space planning, generative adversary modeling, and maximum entropy reinforcement learning to obtain a stochastic belief space policy. By accounting for various adversarial behaviors in the simulation framework and minimizing the predictability of the autonomous agent's actions, the resulting policy is more robust to unmodeled adversarial strategies. This improved robustness is shown empirically against an adversary that adapts to and exploits the autonomous agent's policy, in comparison with a standard Chance-Constrained Partially Observable Markov Decision Process (CC-POMDP) robust approach.
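As a rough illustration of the core idea (a sketch, not the authors' implementation), the snippet below pairs a discrete Bayes filter over the other agent's hidden intent with a maximum-entropy (Boltzmann) policy over belief states, following the usual objective J(pi) = sum_t E[r_t + alpha * H(pi(.|b_t))]. The intent labels, observation model, placeholder Q-values, and temperature alpha are all hypothetical; the real method learns the soft Q-function against a generative adversary model.

import numpy as np

# Minimal sketch (assumed setup, not the paper's code): a discrete belief over
# the other agent's intent is updated by Bayes' rule, and actions are sampled
# from a softmax over soft Q-values so the agent stays stochastic and hard to
# predict. OBS_MODEL and the Q-values below are illustrative placeholders.

INTENTS = ["benign", "adversarial"]          # hidden intent of the other agent
ACTIONS = ["observe", "approach", "retreat"] # probing actions of our agent

# Hypothetical observation likelihoods P(obs | intent).
OBS_MODEL = {
    "evasive":   np.array([0.2, 0.7]),  # P(evasive | benign), P(evasive | adversarial)
    "compliant": np.array([0.8, 0.3]),
}

def belief_update(belief, obs):
    """Bayes filter over the discrete intent space: b'(i) ~ P(obs|i) * b(i)."""
    posterior = OBS_MODEL[obs] * belief
    return posterior / posterior.sum()

def soft_policy(q_values, alpha=0.5):
    """Maximum-entropy policy: pi(a|b) ~ exp(Q(b,a)/alpha).
    Larger alpha -> higher entropy -> less predictable behavior."""
    logits = q_values / alpha
    logits -= logits.max()               # subtract max for numerical stability
    probs = np.exp(logits)
    return probs / probs.sum()

# Toy rollout: the q_values line stands in for a learned soft Q-function Q(b, a).
rng = np.random.default_rng(0)
belief = np.array([0.5, 0.5])            # uniform prior over intents
for obs in ["evasive", "evasive", "compliant"]:
    belief = belief_update(belief, obs)
    q_values = np.array([0.1, belief[1], -belief[1]])  # placeholder Q(b, a)
    pi = soft_policy(q_values, alpha=0.5)
    action = rng.choice(ACTIONS, p=pi)
    print(f"obs={obs:9s} belief(adversarial)={belief[1]:.2f} "
          f"pi={np.round(pi, 2)} action={action}")

Sampling from the full softmax rather than taking the argmax is what keeps the policy stochastic: the temperature alpha trades expected return against predictability, which is the property the abstract credits for robustness to unmodeled, policy-exploiting adversaries.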