This paper considers Radio Frequency (RF) energy harvesting devices that use an Irregular Slotted Aloha (IRSA) channel access protocol to transmit their data to a Hybrid Access Point (HAP). Specifically, it addresses the fundamental problem of optimizing the number of packet replicas transmitted by each device in each time frame. Unlike prior works, it takes a learning approach that sets the number of replicas according to the energy level of each device. The paper first formulates the problem as a model-based Markov Decision Process (MDP). It then proposes model-free centralized and distributed Q-learning based solutions that aim to maximize the number of successful transmissions in each time frame. Our results show that the centralized and distributed solutions achieve up to 38% more successful transmissions than conventional Aloha.