| Copyright | (c) Sentenai 2017 |
|---|---|
| License | BSD3 |
| Maintainer | sam@sentenai.com |
| Stability | experimental |
| Portability | non-portable |
| Safe Haskell | None |
| Language | Haskell2010 |
Control.MonadEnv
Description
User-facing API for MonadEnv, the typeclass used to implement an environment.
Documentation
class (Num r, Monad e) => MonadEnv e s a r | e -> s a r where Source #
The environment monad.
TODO: Think about two typeclasses: ContinuousMonadEnv and EpisodicMonadEnv
Methods
reset :: e (Initial s) Source #
Any environment must be initialized with reset, which can also be used to
reset the environment at any time. Resetting an environment is expected to
begin a new episode; in a continuous environment, reset can only be called
once.
step :: a -> e (Obs r s) Source #
Step through an environment with an action: run the action in the environment and return a reward and the new state of the environment.
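As an illustration of how reset and step fit together, here is a minimal sketch of an instance for a toy corridor environment. It does not import this module: the class and the Initial and Obs types are re-declared locally from the signatures and constructors listed on this page, so the block compiles on its own. The Move type, the goal constant, the walk helper, and the choice of State Int (from the transformers package) as the environment monad are all assumptions made for the example, not part of the library.

```haskell
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FunctionalDependencies #-}
{-# LANGUAGE MultiParamTypeClasses #-}

import Control.Monad.Trans.State.Strict (State, evalState, get, put)

-- Local copies of the types documented on this page.
data Initial o = Initial !o | EmptyEpisode
  deriving (Show, Eq)

data Obs r o = Next !r !o | Done !r !(Maybe o) | Terminated
  deriving (Show, Eq)

-- Local copy of the class, following the signature shown above.
class (Num r, Monad e) => MonadEnv e s a r | e -> s a r where
  reset :: e (Initial s)
  step  :: a -> e (Obs r s)

-- Hypothetical action type for a one-dimensional corridor.
data Move = MoveLeft | MoveRight
  deriving (Show, Eq)

-- Hypothetical goal position; reaching it ends the episode.
goal :: Int
goal = 3

-- The agent's position is the whole environment state.
instance MonadEnv (State Int) Int Move Double where
  -- Start every episode at position 0.
  reset = do
    put 0
    pure (Initial 0)

  -- Move one cell, clamping at the left wall; reward 1 on reaching the goal.
  step move = do
    pos <- get
    let pos' = case move of
          MoveLeft  -> max 0 (pos - 1)
          MoveRight -> pos + 1
    put pos'
    pure $ if pos' >= goal
      then Done 1 (Just pos')
      else Next 0 pos'

-- Reset, then apply each action in order, collecting the observations.
walk :: [Move] -> (Initial Int, [Obs Double Int])
walk moves = evalState episode 0
  where
    episode = do
      s0 <- reset
      os <- mapM step moves
      pure (s0, os)
```

Evaluating `walk [MoveRight, MoveRight, MoveRight]` in GHCi resets the corridor and takes three steps, the last of which reaches the goal. This toy instance never returns Terminated; a fuller environment would likely track whether the episode has already ended.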
Instances
| MonadEnv Environment () Action Reward Source # | |
| MonadEnv Environment StateCP Action Reward Source # | |
| MonadEnv m s a r => MonadEnv (MWCRandT m) s a r Source # | An instance which allows for an environment to hold a reference to a shared MWC-random generator |
| MonadEnv m s a r => MonadEnv (DebugLogger m) s a r Source # | |
| MonadEnv m s a r => MonadEnv (NoopLogger m) s a r Source # | |
| (MonadIO t, MonadThrow t) => MonadEnv (EnvironmentT t) State Action Reward Source # | |
| (MonadThrow t, MonadIO t) => MonadEnv (EnvironmentT t) State Action Reward Source # | |
| (MonadIO t, MonadThrow t) => MonadEnv (EnvironmentT t) State Action Reward Source # | |
| (MonadThrow t, MonadIO t) => MonadEnv (EnvironmentT t) StateFL Action Reward Source # | |
| MonadEnv e s a r => MonadEnv (StateT t e) s a r Source # | |
| (Monoid t, MonadEnv e s a r) => MonadEnv (WriterT t e) s a r Source # | |
| MonadEnv e s a r => MonadEnv (ReaderT * t e) s a r Source # | |
| (Monoid writer, MonadEnv e s a r) => MonadEnv (RWST reader writer state e) s a r Source # | |
data Obs r o
An observation of the environment either shows that the environment is
done with the episode (yielding Done), shows that the environment has already
Terminated, or returns the reward of the last action performed and the
next state (yielding Next).
TODO: return Terminal (or return ()) on failure
Constructors
| Next !r !o | |
| Done !r !(Maybe o) | |
| Terminated |
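To make the three cases concrete, the following sketch pattern-matches on each constructor. The Obs type is re-declared locally from the constructor table above so that the block stands alone; the helper names obsReward, episodeOver, and episodeReturn are hypothetical and are not exported by this module.

```haskell
-- Local copy of the observation type documented above.
data Obs r o = Next !r !o | Done !r !(Maybe o) | Terminated
  deriving (Show, Eq)

-- The reward carried by an observation, if it carries one.
obsReward :: Obs r o -> Maybe r
obsReward (Next r _) = Just r
obsReward (Done r _) = Just r
obsReward Terminated = Nothing

-- True when no further steps should be taken in the current episode.
episodeOver :: Obs r o -> Bool
episodeOver (Next _ _) = False
episodeOver _          = True

-- Sum rewards up to and including the first Done, ignoring anything
-- after it; a Terminated observation contributes no reward.
episodeReturn :: Num r => [Obs r o] -> r
episodeReturn = go 0
  where
    go acc []                = acc
    go acc (Next r _ : rest) = go (acc + r) rest
    go acc (Done r _ : _)    = acc + r
    go acc (Terminated : _)  = acc
```

Keeping Terminated separate from Done lets a caller distinguish "this step ended the episode and earned a reward" from "the episode was already over, so nothing happened".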
data Initial o
When starting an episode, we want to send an indication that the environment
is starting without conflating this type with future steps (in Obs r o).
Constructors
| Initial !o | |
| EmptyEpisode |
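A caller typically has to branch on the result of reset before taking any steps. The sketch below shows one way to do that with a small eliminator in the style of `maybe`. The Initial type is re-declared locally from the constructor table above, and the helper names withInitial and initialToMaybe are hypothetical, not part of this module.

```haskell
-- Local copy of the type documented above.
data Initial o = Initial !o | EmptyEpisode
  deriving (Show, Eq)

-- Case analysis on Initial: a default for an empty episode, and a
-- handler for the first observation otherwise.
withInitial :: b -> (o -> b) -> Initial o -> b
withInitial _     onStart (Initial o)  = onStart o
withInitial empty _       EmptyEpisode = empty

-- Forget the distinction and expose the first observation, if any.
initialToMaybe :: Initial o -> Maybe o
initialToMaybe = withInitial Nothing Just
```

For example, `withInitial (pure ()) runFrom` would skip an empty episode and otherwise hand the first observation to a user-supplied `runFrom` action.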