TY - GEN
T1 - A solving method for MDPs by minimizing variational free energy
AU - Yoshimoto, Junichiro
AU - Ishii, Shin
PY - 2004
Y1 - 2004
N2 - In this article, we propose a novel approach to acquiring the optimal policy for a continuous Markov decision process. Based on an analogy from statistical mechanics, we introduce a variational free energy over a policy. A good policy can be obtained by minimizing the variational free energy. According to our approach, the optimal policy in linear quadratic regulator problems can be obtained by using Kalman filtering and smoothing techniques. Even in non-linear problems, a semi-optimal policy can be obtained by a Monte Carlo technique with a Gaussian process method.
AB - In this article, we propose a novel approach to acquiring the optimal policy for a continuous Markov decision process. Based on an analogy from statistical mechanics, we introduce a variational free energy over a policy. A good policy can be obtained by minimizing the variational free energy. According to our approach, the optimal policy in linear quadratic regulator problems can be obtained by using Kalman filtering and smoothing techniques. Even in non-linear problems, a semi-optimal policy can be obtained by a Monte Carlo technique with a Gaussian process method.
UR - http://www.scopus.com/inward/record.url?scp=10844269629&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=10844269629&partnerID=8YFLogxK
U2 - 10.1109/IJCNN.2004.1380884
DO - 10.1109/IJCNN.2004.1380884
M3 - Conference contribution
AN - SCOPUS:10844269629
SN - 0780383591
T3 - IEEE International Conference on Neural Networks - Conference Proceedings
SP - 1817
EP - 1822
BT - 2004 IEEE International Joint Conference on Neural Networks - Proceedings
T2 - 2004 IEEE International Joint Conference on Neural Networks - Proceedings
Y2 - 25 July 2004 through 29 July 2004
ER -