Train an agent for CartPole-v0 using naive Policy Gradient.
Inspired by Andrej Karpathy's blog.
Code partly from the PyTorch DQN Tutorial.
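
As a rough illustration of what "naive" policy gradient involves, the sketch below computes the discounted return for each step of one episode and normalizes the result, similar in spirit to the reward discounting described in Karpathy's blog. This is not the repository's code: the function names `discounted_returns` and `normalize` and the `gamma` values are assumptions chosen for the example. In a full agent, each normalized return would weight the negative log-probability of the action taken at that step.

```python
from statistics import mean, pstdev


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * r_{t+1} + ... for every step t.

    `gamma` is an assumed discount factor, not a value from the repo.
    """
    returns = []
    running = 0.0
    # Walk the episode backwards so each step reuses the return after it.
    for r in reversed(rewards):
        running = r + gamma * running
        returns.append(running)
    returns.reverse()
    return returns


def normalize(values, eps=1e-8):
    """Shift to zero mean and scale to unit variance (reduces gradient variance)."""
    mu = mean(values)
    sigma = pstdev(values)
    return [(v - mu) / (sigma + eps) for v in values]


# CartPole gives a reward of +1 per step; a 3-step episode as a toy example.
episode_rewards = [1.0, 1.0, 1.0]
returns = discounted_returns(episode_rewards, gamma=0.5)
advantages = normalize(returns)
```

Earlier steps receive larger returns because more of the episode's reward follows them, so actions that kept the pole up longer are reinforced more strongly.
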
Solved in 500 episodes (Avg Reward):