Large-Scale, Multiagent, Reinforcement Learning Control

Source: U.S. Army, https://www.army.mil/article/237978/army_advances_learning_capabilities_of_drone_swarms

Presented: June 9, 2021 12:00 pm (ET)
Presented by: Jemin George

Swarming is a method of operation where multiple autonomous systems act as a cohesive unit by actively coordinating their actions. The Army is looking to swarming technology to execute time-consuming or dangerous tasks and overmatch enemy capabilities. Finding optimal guidance policies for these swarming vehicles in real time is a key requirement for enhancing Warfighters’ tactical situational awareness, allowing the U.S. Army to dominate in a contested environment. Reinforcement learning provides a way to optimally control agents with uncertain behavior to achieve multiobjective goals when a precise model of the agent is unavailable; however, existing reinforcement learning schemes can only be applied in a centralized manner, which requires pooling the state information of the entire swarm at a central learner. This drastically increases the computational complexity and communication requirements, resulting in impractically long learning times.

To tackle this issue, our research focuses on large-scale, multiagent reinforcement learning problems. The main goal of our effort is to develop a theoretical foundation for data-driven, optimal control of large-scale swarm networks, where control actions are taken based on low-dimensional measurement data instead of dynamic models. We will present our current approach, called Hierarchical Reinforcement Learning (HRL), which decomposes the global control objective into multiple hierarchies: several small, group-level microscopic controls and a broad, swarm-level macroscopic control. Our current HRL efforts will allow us to develop control policies for swarms of unmanned aerial and ground vehicles so that they can optimally accomplish different mission sets, even though the individual dynamics of the swarming agents are unknown. We will also include some preliminary experimental results obtained by applying the developed HRL scheme to multiagent target-tracking problems.
