Multiple constrained continuous-time Markov Decision Processes with expected discounted reward criteria
Conference: CAIBDA 2022 - 2nd International Conference on Artificial Intelligence, Big Data and Algorithms
June 17-19, 2022, in Nanjing, China
Proceedings: CAIBDA 2022
Pages: 5; Language: English; Type: PDF
Authors:
Zhang, Lanlan; Gao, Zhuo (Department of Mathematical Teaching, Guangzhou Civil Aviation College, China)
Abstract:
The theory of discrete-time Markov decision processes (MDPs) is quite mature, so this paper studies continuous-time MDPs with multiple constraints under expected discounted reward criteria by reducing them to the discrete-time case. The criterion to be minimized is the expected discounted reward, and a given constraint vector is imposed on the expected discounted costs. For the case of bounded transition rates, the uniformization technique is used to transform the continuous-time MDP into an equivalent discrete-time MDP, and the existence of a constrained optimal policy is shown.
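The uniformization step mentioned in the abstract can be illustrated with a minimal sketch. It uses the standard textbook construction for a finite model with bounded rates, which is not necessarily the exact formulation of the paper: for a rate bound Lambda and discount rate alpha, the discrete-time model has transition probabilities P(j|i,a) = delta_ij + q(j|i,a)/Lambda, discount factor beta = Lambda/(alpha + Lambda), and one-step reward r(i,a)/(alpha + Lambda). The function names `uniformize` and `evaluate_policy`, the two-state example, and all numerical values are hypothetical and chosen only for illustration. Cost rates for the constraints would be scaled in the same way as the reward rates.

```python
def uniformize(q, r, alpha, lam=None):
    """Turn a finite continuous-time MDP into a discrete-time one.

    q[i][a][j] : transition rate from state i to j under action a
                 (each row sums to 0, diagonal entries are <= 0)
    r[i][a]    : reward rate in state i under action a
    alpha      : continuous-time discount rate (> 0)
    lam        : uniformization constant; defaults to the largest exit rate
    (All names here are illustrative assumptions, not taken from the paper.)
    """
    n = len(q)
    max_exit = max(-q[i][a][i] for i in range(n) for a in range(len(q[i])))
    if lam is None:
        lam = max_exit
    if lam <= 0 or lam < max_exit:
        raise ValueError("lam must be positive and bound every exit rate")
    beta = lam / (alpha + lam)  # discrete-time discount factor
    p = [[[(1.0 if i == j else 0.0) + q[i][a][j] / lam for j in range(n)]
          for a in range(len(q[i]))] for i in range(n)]
    r_tilde = [[r[i][a] / (alpha + lam) for a in range(len(r[i]))]
               for i in range(n)]
    return p, r_tilde, beta


def evaluate_policy(p, r_tilde, beta, policy, iters=500):
    """Discounted value of a stationary deterministic policy,
    computed by successive approximation in the discrete-time model."""
    n = len(p)
    v = [0.0] * n
    for _ in range(iters):
        v = [r_tilde[i][policy[i]]
             + beta * sum(p[i][policy[i]][j] * v[j] for j in range(n))
             for i in range(n)]
    return v


# Hypothetical two-state example with a single action per state.
q = [[[-2.0, 2.0]], [[1.0, -1.0]]]
r = [[1.0], [0.0]]
alpha = 0.5

p, r_tilde, beta = uniformize(q, r, alpha)
v = evaluate_policy(p, r_tilde, beta, policy=[0, 0])
print(beta, v)
```

In this example the continuous-time value solves (alpha*I - Q)V = r directly, which gives V = (1.5/1.75, 1/1.75); the discrete-time evaluation above should agree with it up to numerical tolerance, which is the sense in which the two models are equivalent.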