Learning Algorithms for Data Collection in RF-Charging IIoT Networks

Data collection is a fundamental operation in energy harvesting Industrial Internet of Things (IIoT) networks. In this context, we consider a Hybrid Access Point (HAP) or controller that is responsible for charging sensor devices and collecting $L$ bits from them. The problem at hand is to optimize the transmit power allocation of the HAP over multiple time frames. The main challenge is that the HAP has only causal channel state information of its links to devices. We outline a novel Two-Step Reinforcement Learning with Gibbs sampling (TSRL-Gibbs) strategy, where the first step uses Q-learning and an action space comprising transmit power allocations sampled from a multi-dimensional simplex. The second step applies Gibbs sampling to further refine the action space. Our results show that TSRL-Gibbs requires up to 28.5\% fewer frames than competing approaches.
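
For concreteness, the following is a minimal sketch of the two steps, not the paper's exact formulation. The number of devices, power budget, reward function, channel model, and all hyperparameters below are illustrative assumptions; the refinement loop uses a Metropolis-within-Gibbs coordinate update as a stand-in for the Gibbs sampler. Candidate actions are drawn uniformly from the scaled probability simplex via a symmetric Dirichlet distribution, matching the simplex sampling described above.

```python
import numpy as np

rng = np.random.default_rng(0)

N = 4            # number of sensor devices (assumed)
P_MAX = 1.0      # HAP transmit power budget, in Watts (assumed)
K = 16           # number of candidate power allocations (actions)
FRAMES = 5000    # training frames
ALPHA, GAMMA, EPS = 0.1, 0.9, 0.1   # illustrative hyperparameters

# Step 1a: sample candidate power allocations uniformly from the scaled
# (N-1)-simplex; Dirichlet(1,...,1) is uniform over the unit simplex.
actions = P_MAX * rng.dirichlet(np.ones(N), size=K)   # shape (K, N)

def reward(p, h):
    """Illustrative reward: a throughput proxy for bits collected.
    p: power allocation, h: channel gains (both length N)."""
    return np.sum(np.log2(1.0 + h * p))

# Step 1b: single-state tabular Q-learning over the K candidate actions;
# channel gains are drawn fresh each frame, i.e., known only causally.
Q = np.zeros(K)
for _ in range(FRAMES):
    h = rng.exponential(1.0, size=N)
    a = rng.integers(K) if rng.random() < EPS else int(np.argmax(Q))
    r = reward(actions[a], h)
    Q[a] += ALPHA * (r + GAMMA * Q.max() - Q[a])

# Step 2: refine the best allocation one coordinate at a time; each
# proposal is projected back onto the simplex and accepted with
# probability min(1, exp(reward gain / T)).
best = actions[int(np.argmax(Q))].copy()
T = 0.1                         # temperature (assumed)
h_bar = np.ones(N)              # mean channel gains (assumed)
for _ in range(200):
    i = rng.integers(N)
    prop = best.copy()
    prop[i] = rng.uniform(0.0, P_MAX)
    prop *= P_MAX / prop.sum()  # re-normalize onto the scaled simplex
    if rng.random() < np.exp((reward(prop, h_bar) - reward(best, h_bar)) / T):
        best = prop

print("refined allocation:", np.round(best, 3))
```

The Dirichlet draw is a standard way to obtain uniform samples on a simplex, which keeps every candidate action feasible under the total power budget by construction.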