Copyright | (c) Sentenai 2017 |
---|---|
License | BSD3 |
Maintainer | sam@sentenai.com |
Stability | experimental |
Portability | non-portable |
Safe Haskell | None |
Language | Haskell2010 |
Environment description:

> A pole is attached by an un-actuated joint to a cart, which moves along a frictionless track. The system is controlled by applying a force of +1 or -1 to the cart. The pendulum starts upright, and the goal is to prevent it from falling over. A reward of +1 is provided for every timestep that the pole remains upright. The episode ends when the pole is more than 15 degrees from vertical, or the cart moves more than 2.4 units from the center.
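The termination rule and reward in the environment description can be sketched in plain Haskell. This is only an illustration under stated assumptions: `Observation`, `isTerminal`, and `episodeReturn` are hypothetical names that are not part of this module, and the thresholds are copied from the description rather than from the Gym server.

```haskell
-- Hypothetical sketch: none of these names are exported by this module.

-- | The two quantities the termination rule depends on.
data Observation = Observation
  { cartPosition :: Double  -- ^ signed distance from the track center, in track units
  , poleAngleDeg :: Double  -- ^ signed pole angle from vertical, in degrees
  } deriving (Show, Eq)

-- | Thresholds as stated in the environment description.
maxCartOffset, maxPoleAngleDeg :: Double
maxCartOffset   = 2.4
maxPoleAngleDeg = 15

-- | True when the pole has tipped too far or the cart has left the track.
isTerminal :: Observation -> Bool
isTerminal o =
  abs (poleAngleDeg o) > maxPoleAngleDeg || abs (cartPosition o) > maxCartOffset

-- | +1 for every timestep before the first terminal observation.
episodeReturn :: [Observation] -> Int
episodeReturn = length . takeWhile (not . isTerminal)
```

Under these assumptions, an episode whose third observation shows the pole at 16 degrees has a return of 2, because only the first two timesteps count.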
- data Action
- type Runner s a x = RunnerT s a IO x
- data StateCP = StateCP {}
- type Environment = EnvironmentT IO
- type EnvironmentT t = GymEnvironmentT StateCP Action t
- runEnvironment :: Manager -> BaseUrl -> RunnerT StateCP Action IO x
- runEnvironmentT :: MonadIO t => Manager -> BaseUrl -> RunnerT StateCP Action t x
- runDefaultEnvironment :: RunnerT StateCP Action IO x
- runDefaultEnvironmentT :: MonadIO t => RunnerT StateCP Action t x
Documentation
The cart in CartPole can only be pushed left or right, so the action space is "discrete 2", containing the indices {0..n-1} with n = 2.
FIXME: Migrate this to either a more generic "directions" actions (would need things like "up", "down" versions as well) or a "discrete actions" version. I'm a fan of the former.
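As a rough sketch of how a two-valued action can map onto a "discrete 2" space, the derived `Enum` and `Bounded` instances are enough. The type `Direction` and its constructors `PushLeft` and `PushRight` are hypothetical stand-ins, since this page does not show the constructors of `Action`. The assignment of index 0 to "left" is likewise an assumption of the sketch.

```haskell
-- Hypothetical stand-in for Action; the real constructors are not shown on this page.
data Direction = PushLeft | PushRight
  deriving (Show, Eq, Ord, Enum, Bounded)

-- | Every action in the space, in index order.
actionSpace :: [Direction]
actionSpace = [minBound .. maxBound]

-- | Index of an action within the discrete space {0..n-1}.
toIndex :: Direction -> Int
toIndex = fromEnum

-- | Decode an index, rejecting anything outside {0..n-1}.
fromIndex :: Int -> Maybe Direction
fromIndex i
  | i >= 0 && i < length actionSpace = Just (toEnum i)
  | otherwise                        = Nothing
```

Returning `Maybe` from `fromIndex` avoids the runtime error that `toEnum` raises on out-of-range input. The same pattern would extend to a more generic "directions" type by adding constructors.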
The state of a cart on a pole in a CartPole environment
type Environment = EnvironmentT IO
Alias to EnvironmentT in IO
type EnvironmentT t = GymEnvironmentT StateCP Action t
Alias to GymEnvironmentT with CartPoleV0 type dependencies
runEnvironment :: Manager -> BaseUrl -> RunnerT StateCP Action IO x
Alias to runEnvironment in IO
runEnvironmentT :: MonadIO t => Manager -> BaseUrl -> RunnerT StateCP Action t x
Alias to runEnvironmentT
runDefaultEnvironment :: RunnerT StateCP Action IO x
Alias to runDefaultEnvironment in IO
runDefaultEnvironmentT :: MonadIO t => RunnerT StateCP Action t x
Alias to runDefaultEnvironmentT
Orphan instances
(MonadIO t, MonadThrow t) => MonadEnv (EnvironmentT t) StateCP Action Reward