Pub. ID 5321


Reinforcement learning for resource allocation in LEO satellite networks
W. Usaha and J. A. Barria
"Reinforcement learning for resource allocation in LEO satellite networks", W. Usaha and J. A. Barria, IEEE Transactions on Systems, Man, and Cybernetics – B 37 (3), pp. 515-527 (2007)


In this paper, we develop and assess online decision-making algorithms for call admission and routing in low Earth orbit (LEO) satellite networks. A recent paper has shown that, in a LEO satellite system, a semi-Markov decision process formulation of the call admission and routing problem can achieve a higher average revenue than existing routing methods. However, the conventional dynamic programming (DP) numerical solution becomes prohibitive as the problem size increases. In this paper, two solution methods based on reinforcement learning (RL) are proposed to circumvent the computational burden of DP. The first is an actor–critic method with temporal-difference (TD) learning. The second is a critic-only method, called optimistic TD learning. The algorithms reduce storage requirements, computational complexity, and computation time, and improve an overall long-term average revenue function that penalizes blocked calls. Numerical studies show that the RL framework can achieve up to 56% higher average revenue than existing routing methods used in LEO satellite networks, with reasonable storage and computational requirements.
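Both proposed methods rely on a critic trained by TD learning. As a rough illustration of that single ingredient only, the sketch below applies a tabular TD(0) value update to a toy single-link admission model. It is not the algorithm of the paper: the discrete-time model, the fixed accept-if-free policy, the discounting, and every name and parameter (`td0_update`, `simulate`, `p_arrival`, `p_departure`, `block_penalty`) are assumptions made for this example.

```python
import random


def td0_update(values, state, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply one tabular TD(0) update to values[state]; return the TD error."""
    td_error = reward + gamma * values[next_state] - values[state]
    values[state] += alpha * td_error
    return td_error


def simulate(capacity=5, steps=20000, p_arrival=0.5, p_departure=0.1,
             revenue=1.0, block_penalty=0.5, seed=0):
    """Evaluate an accept-if-free policy on a hypothetical single-link model.

    The state is the number of busy channels. An admitted call earns
    `revenue`; a call blocked at full capacity costs `block_penalty`.
    """
    rng = random.Random(seed)
    values = [0.0] * (capacity + 1)  # one value estimate per occupancy level
    busy = 0
    for _ in range(steps):
        reward = 0.0
        nxt = busy
        if rng.random() < p_arrival:      # a new call arrives
            if busy < capacity:
                nxt += 1                  # admit the call
                reward = revenue
            else:
                reward = -block_penalty   # the call is blocked
        # Each ongoing call ends independently with probability p_departure.
        nxt -= sum(rng.random() < p_departure for _ in range(nxt))
        td0_update(values, busy, reward, nxt)
        busy = nxt
    return values


if __name__ == "__main__":
    for n, v in enumerate(simulate()):
        print(f"busy channels = {n}: estimated value = {v:.3f}")
```

The table holds one entry per occupancy level, so storage grows with link capacity rather than with the number of simulated steps; this is the kind of saving over an exhaustive DP solution that the abstract refers to, here shown only on a toy scale.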