Revision - fc0c43b - RNN support for PPO2 (#859)

Revision fc0c43b1997947778cfffe9f865d21e124f1001b authored by JongGyun Kim on 26 April 2019, 22:17:57 UTC, committed by pzhokhov on 26 April 2019, 22:17:56 UTC

RNN support for PPO2 (#859)

* initial implementation of ppo2_rnn.

* set lstm memory as tf.GraphKeys.LOCAL_VARIABLES.

* replace dones with tf.placeholder_with_default.

* improvements for the 'play' option.

* removed unnecessary TODO.

* improve lstm code.

* move learning rate placeholder to optimizer scope.

* support the microbatched model.

* sync cnn lstm layer with originals.

* add cnn_lnlstm layer.

* fix a case when `states` is None.

* add initial_state variable to help test.

* make ppo2 rnn test available.

* rename 'obs' to 'observations'.
rename 'transition' to 'transitions'.
fix forgetting `dones` in the replay buffer.
fix a misuse of `states` and `next_states` in the replay buffer.

* perform initialization only once.
make `test_fixed_sequence` compatible with ppo2.

* adjust input shape.

* fix checking of a model input args in `simple_test` function.

* disable warning on purpose.

* support the 'play' option.

* improve scopes to be compatible with multiple models (i.e., other TensorFlow global/local variables)

* clean the scope of ppo2 policy model.

* name the memory variable of PPO RNNs more descriptively

* wrap the initializations in ppo2.

* remove redundant lines.

* update `README.md`.

* add RNN layers.

* add the result of the HalfCheetah-v2 env experiment.

* correct a typo.

* add RNN class.

* rename `nlstm` to `num_units` in RNN builder functions.

* remove state saving.

* reuse RNNs in a2c.utils.

* revert baselines/run.py.

* replace `ppo2.step()` with original interface.

* revert `baselines/common/tests/util.py`.

* remove redundant lines.

* revert `baselines/common/tests/util.py` to b875fb7.

* remove `states` variable.

* move RNN class to `baselines/ppo2/layers.py` and revert `baselines/common/models.py` to 858afa8.

* rename `model.step_as_dict` to `model.step_with_dict`.

* removed `ppo_lstm_mlp`.

* fix 02e26fd.
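
Several of the items above concern how the recurrent memory interacts with `dones`. As a rough illustration only, and not the code from this commit, the sketch below shows the general idea of clearing a carried state at episode boundaries. The function name `rollout_with_resets`, the `decay` parameter, and the scalar toy recurrence are all hypothetical stand-ins for an LSTM cell and its state.

```python
from typing import List, Sequence, Tuple


def rollout_with_resets(
    inputs: Sequence[float],
    dones: Sequence[bool],
    initial_state: float = 0.0,
    decay: float = 0.5,
) -> Tuple[List[float], float]:
    """Run a toy recurrent cell over a sequence, clearing memory on `dones`.

    Assumption for this sketch: dones[t] is True when step t starts a new
    episode, so the state carried into step t must not leak from the
    previous episode.
    """
    state = initial_state
    outputs = []
    for x, done in zip(inputs, dones):
        mask = 0.0 if done else 1.0
        state = state * mask        # drop memory across episode boundaries
        state = decay * state + x   # toy recurrence (stand-in for an LSTM cell)
        outputs.append(state)
    return outputs, state


if __name__ == "__main__":
    outs, final = rollout_with_resets(
        [1.0, 1.0, 1.0, 1.0], [False, False, True, False]
    )
    print(outs, final)
```

The mask is applied before the recurrence so that the first step of a new episode sees only its own input; a real recurrent policy would apply the same mask to each component of its cell and hidden state.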

1 parent 5d8041d


File	Mode	Size
baselines
data
docs
.benchmark_pattern	-rw-r--r--	1 byte
.gitignore	-rw-r--r--	283 bytes
.travis.yml	-rw-r--r--	230 bytes
Dockerfile	-rw-r--r--	459 bytes
LICENSE	-rw-r--r--	1.1 KB
README.md	-rw-r--r--	7.4 KB
benchmarks_atari10M.htm	-rw-r--r--	425.7 KB
benchmarks_mujoco1M.htm	-rw-r--r--	153.0 KB
setup.cfg	-rw-r--r--	114 bytes
setup.py	-rw-r--r--	1.7 KB
