On-line EM reinforcement learning

Research output: Contribution to conference › Paper › peer-review

4 Citations (Scopus)

Abstract

In this article, we propose a new reinforcement learning (RL) method for a system having continuous state and action spaces. Our RL method has an architecture like the actor-critic model. The critic tries to approximate the Q-function, which is the expected future return for the current state-action pair. The actor tries to approximate a stochastic soft-max policy defined by the Q-function. The soft-max policy is more likely to select an action that has a higher Q-function value. The on-line EM algorithm is used to train the critic and the actor. We apply this method to two control problems. Computer simulations show that our method is able to acquire fairly good control in the two tasks after a few learning trials.
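As a rough illustration of the soft-max policy described above, here is a minimal sketch in which actions with higher Q-function values are exponentially more likely to be selected. It is simplified to a discrete action set (the paper treats continuous state and action spaces), and the inverse-temperature parameter `beta` is an assumed knob, not taken from the paper:

```python
import math
import random

def softmax_policy(q_values, beta=1.0):
    """Sample an action index from a soft-max policy over Q-values.

    Higher Q-values get exponentially higher selection probability;
    `beta` (inverse temperature, an illustrative parameter) controls
    how greedy the policy is.
    """
    m = max(q_values)  # shift by the max for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    probs = [e / total for e in exps]
    action = random.choices(range(len(q_values)), weights=probs)[0]
    return action, probs

action, probs = softmax_policy([1.0, 2.0, 0.5])
```

In the actor-critic setting of the paper, the critic's Q-function approximation would supply `q_values`, and the actor approximates the resulting soft-max distribution directly rather than enumerating actions.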

Original language: English
Pages: 163-168
Number of pages: 6
Publication status: Published - 2000
Externally published: Yes
Event: International Joint Conference on Neural Networks (IJCNN'2000) - Como, Italy
Duration: 24-07-2000 → 27-07-2000

All Science Journal Classification (ASJC) codes

  • Software
  • Artificial Intelligence
