https://github.com/openai/baselines
- HEAD
- refs/heads/aray-extra-imports
- refs/heads/fix998
- refs/heads/fix_build
- refs/heads/fix_monitor_close
- refs/heads/games/master
- refs/heads/gdb
- refs/heads/her-fixes
- refs/heads/internal
- refs/heads/like_pr_787
- refs/heads/master
- refs/heads/matthias-her
- refs/heads/observation-dtype
- refs/heads/old_acktr_cont
- refs/heads/param-noise-release
- refs/heads/peterz_383
- refs/heads/peterz_alex_propagate_vecenv_changes
- refs/heads/peterz_benchmarks
- refs/heads/peterz_codecov_report
- refs/heads/peterz_flatten_dict_wrapper
- refs/heads/peterz_import_internal
- refs/heads/peterz_import_internal_2e3a166
- refs/heads/peterz_learn_registration
- refs/heads/peterz_mpiless
- refs/heads/peterz_pr_214
- refs/heads/peterz_test_benchmarks
- refs/heads/peterz_tflstm
- refs/heads/peterz_tflstm_1
- refs/heads/peterz_tflstm_with_ppo2
- refs/heads/peterz_ubuntu18_04
- refs/heads/peterz_update_READMEs
- refs/heads/peterz_viz
- refs/heads/ppo-trpo
- refs/heads/simple_bench_mpi_cleanup
- refs/heads/stateful_rnn
- refs/heads/tf2
- refs/heads/tuple_pdtype
Revision | Author | Date | Message | Commit Date |
---|---|---|---|---|
0e423a0 | Peter Zhokhov | 06 September 2019, 21:36:35 UTC | use allreduce instead of Allreduce (send pickled data instead of floats) - probably affects performance somewhat, but avoid element number mismatch. Fixes 998 | 06 September 2019, 21:36:35 UTC |
229a772 | tanzhenyu | 29 August 2019, 21:25:44 UTC | Release notes for Tensorflow 2.0 support. (#997) | 29 August 2019, 21:25:44 UTC |
d80b075 | Tomasz Wrona | 29 August 2019, 19:16:25 UTC | Make SubprocVecEnv works with DummyVecEnv (#908) * Make SubprocVecEnv works with DummyVecEnv (nested environments for synchronous sampling) * SubprocVecEnv now supports running environments in series in each process * Added docstring to the test definition * Added additional test to check, whether SubprocVecEnv results with the same output when in_series parameter is enabled and not * Added more test cases for in_series parameter * Refactored worker function, added docstring for in_series parameter * Remove check for TF presence in setup.py | 29 August 2019, 19:16:25 UTC |
0182fe1 | NicoBach | 05 August 2019, 23:03:19 UTC | entrypoint variable made public (#970) | 05 August 2019, 23:03:19 UTC |
1fb4dfb | Seungjae Ryan Lee | 05 August 2019, 23:02:43 UTC | Fix typo in GAIL dataset log (#950) | 05 August 2019, 23:02:43 UTC |
7cadef7 | Timo Kaufmann | 05 August 2019, 23:02:21 UTC | Fix typo (#930) * Fix typo * Fix train_freq documentation Seems to be a copy-paste error, train_freq has nothing to do with printing. * Fix documentation typo | 05 August 2019, 23:02:21 UTC |
fce4370 | tanzhenyu | 05 August 2019, 23:01:54 UTC | Remove duplicate code in adaptive param noise. (#976) | 05 August 2019, 23:01:54 UTC |
c575285 | tanzhenyu | 27 June 2019, 17:12:38 UTC | Remove model def from deepq. (#946) | 27 June 2019, 17:12:38 UTC |
2bca790 | Marcin Michalski | 24 June 2019, 17:19:01 UTC | Updating the version to 0.1.6 (#933) Updating the version in setup.py to avoid conflict with the old (>1 year old) version in pypi. | 24 June 2019, 17:19:01 UTC |
ba2b017 | albert | 07 June 2019, 22:05:52 UTC | add log_path flag to command line utility (#917) * add log_path flag to command line utility * Update README with log_path flag * clarify logg and viz docs | 07 June 2019, 22:05:52 UTC |
7c52085 | Anton Grigoryev | 31 May 2019, 23:49:46 UTC | Fix converting list of LazyFrames to ndarray (#907) | 31 May 2019, 23:49:46 UTC |
1c872ca | pzhokhov | 31 May 2019, 22:36:20 UTC | run test_monitor through pytest; fix the test, add flake8 to bench directory - like PR 891 (#921) | 31 May 2019, 22:36:20 UTC |
ff8d36a | Jinho Lee | 31 May 2019, 21:31:35 UTC | Starting to reassign waiting_step in shmem_vecenv (#915) "self.waiting_step" is initialized in __init__ function but it is not reassigned anywhere. Because it is used in reset function and close_extras function, it should be fixed. So i fixed it to be similar with subproc_vec_env's one. | 31 May 2019, 21:31:35 UTC |
7614b02 | Sridhar Thiagarajan | 31 May 2019, 21:27:11 UTC | remove f strings for python back compatibility (#906) | 31 May 2019, 21:27:11 UTC |
f7d5a26 | Andy Twigg | 31 May 2019, 21:26:45 UTC | suppress excessive messages from unused loggers (#920) Only print the "logging to [dir]" message when the logger has something to output. Running with the new spawn change and both mpi and subprocvecenv, there are many "logging to [dir]" messages but most are not logging anything. | 31 May 2019, 21:26:45 UTC |
21776e8 | Joshua Meier | 31 May 2019, 21:06:20 UTC | Support Tuple observation spaces (#911) | 31 May 2019, 21:06:20 UTC |
9b68103 | pzhokhov | 08 May 2019, 18:36:10 UTC | release Internal changes (#895) * joshim5 changes (width and height to WarpFrame wrapper) * match network output with action distribution via a linear layer only if necessary (#167) * support color vs. grayscale option in WarpFrame wrapper (#166) * support color vs. grayscale option in WarpFrame wrapper * Support color in other wrappers * Updated per Peters suggestions * fixing test failures * ppo2 with microbatches (#168) * pass microbatch_size to the model during construction * microbatch fixes and test (#169) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * Peterz joshim5 subclass ppo2 model (#170) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * subclassing the model to make microbatched version of model WIP * made microbatched model a subclass of ppo2 Model * flake8 complaint * mpi-less ppo2 (resolving merge conflict) * flake8 and mpi4py imports in ppo2/model.py * more un-mpying * merge master * updates to the benchmark viewer code + autopep8 (#184) * viz docs and syntactic sugar wip * update viewer yaml to use persistent volume claims * move plot_util to baselines.common, update links * use 1Tb hard drive for results viewer * small updates to benchmark vizualizer code * autopep8 * autopep8 * any folder can be a benchmark * massage games image a little bit * fixed --preload option in app.py * remove preload from run_viewer.sh * remove pdb breakpoints * update bench-viewer.yaml * fixed bug (#185) * fixed bug it's wrong to do the else statement, because no other nodes would start. 
* changed the fix slightly * Refactor her phase 1 (#194) * add monitor to the rollout envs in her RUN BENCHMARKS her * Slice -> Slide in her benchmarks RUN BENCHMARKS her * run her benchmark for 200 epochs * dummy commit to RUN BENCHMARKS her * her benchmark for 500 epochs RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * disable saving of policies in her benchmark RUN BENCHMARKS her * run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch * run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch * launcher refactor wip * wip * her works on FetchReach * her runner refactor RUN BENCHMARKS Fetch1M * unit test for her * fixing warnings in mpi_average in her, skip test_fetchreach if mujoco is not present * pickle-based serialization in her * remove extra import from subproc_vec_env.py * investigating differences in rollout.py * try with old rollout code RUN BENCHMARKS her * temporarily use DummyVecEnv in cmd_util.py RUN BENCHMARKS her * dummy commit to RUN BENCHMARKS her * set info_values in rollout worker in her RUN BENCHMARKS her * bug in rollout_new.py RUN BENCHMARKS her * fixed bug in rollout_new.py RUN BENCHMARKS her * do not use last step because vecenv calls reset and returns obs after reset RUN BENCHMARKS her * updated buffer sizes RUN BENCHMARKS her * fixed loading/saving via joblib * dust off learning from demonstrations in HER, docs, refactor * add deprecation notice on her play and plot files * address comments by Matthias * 1.5 months of codegen changes (#196) * play with resnet * feed_dict version * coinrun prob and more stats * fixes to get_choices_specs & hp search * minor prob fixes * minor fixes * minor * alternative version of rl_algo stuff * pylint fixes * fix bugs, move node_filters to soup * changed how get_algo 
works * change how get_algo works, probably broke all tests * continue previous refactor * get eval_agent running again * fixing tests * fix tests * fix more tests * clean up cma stuff * fix experiment * minor changes to eval_agent to make ppo_metal use gpu * make dict space work * modify mac makefile to use conda * recurrent layers * play with bn and resnets * minor hp changes * minor * got rid of use_fb argument and jtft (joint-train-fine-tune) functionality built test phase directly into AlgoProb * make new rl algos generateable * pylint; start fixing tests * fixing tests * more test fixes * pylint * fix search * work on search * hack around infinite loop caused by scan * algo search fixes * misc changes for search expt * enable annealing, overriding options of Op * pylint fixes * identity op * achieve use_last_output through masking so it automatically works in other distributions * fix tests * minor * discrete * use_last_output to be just a preference, not a hard constraint * pred delay, pruning * require nontrivial inputs * aliases for get_sm * add probname to probs * fixes * small fixes * fix tests * fix tests * fix tests * minor * test scripts * dualgru network improvements * minor * work on mysterious bugs * rcall gpu-usage command for kube * use cache dir that’s not in code folder, so that it doesn’t get removed by rcall code rsync * add power mode to gpu usage * make sure train/test actually different * remove VR for now * minor fixes * simplify soln_db * minor * big refactor of mpi eda * improve mpieda for multitask * - get rid of timelimit hack - add __del__ to cleanup SubprocVecEnv * get multitask working better * fixes * working on atari, various * annotate ops with whether they’re parametrized * minor * gym version * rand atari prob * minor * SolnDb bugfix and name change * pyspy script * switch conv layers * fix roboschool/bullet3 * nenvs assertion * fix rand atari * get rid of blanket exception catching fix soln_db bug * fix rand_atari * dynamic 
routing as cmdline arg * slight modifications to test_mpi_map and pyspy-all * max_tries argument for run_until_successs * dedup option in train_mle * simplify soln_db * increase atari horizon for 1 experiment * start implementing reward increment * ent multiplier * create cc dsl other misc fixes * cc ops * q_func -> qs in rl_algos_cc.py * fix PredictDistr * rl_ops_cc fixes, MakeAction op * augment algo agent to support cc stuff * work on ddpg experiments * fix blocking temporarily change logger * allow layer scaling * pylint fixes * spawn_method * isolate ddpg hacks * improve pruning * use spawn for subproc * remove use of python -c in rcall * fix pylint warning * fix static * maybe fix local backend * switch to DummyVecEnv * making some fixes via pylint * pylint fixes * fixing tests * fix tests * fix tests * write scaffolding for SSL in Codegen * logger fix * fix error * add EMA op to sl_ops * save many changes * save * add upsampler * add sl ops, enhance state machine * get ssl search working — some gross hacking * fix session/graph issue * fix importing * work on mle * - scale embeddings in gru model - better exception handling in sl_prob - use emas for test/val - use non-contrib batch_norm layer * improve logging * option to average before dumping in logger * default arguments, etc * new ddpg and identity test * concat fix * minor * move realistic ssl stuff to third-party (underscore to dash) * fixes * remove realistic_ssl_evaluation * pylint fixes * use gym master * try again * pass around args without gin * fix tests * separate line to install gym * rename failing tests that should be ignored * add data aug * ssl improvements * use fixed time limit * try to fix baselines tests * add score_floor, max_walltime, fiddle with lr decay * realistic_ssl * autopep8 * various ssl - enable blocking grad for simplification - kl - multiple final prediction * fix pruning * misc ssl stuff * bring back linear schedule, don’t use allgather for collecting stats (i’ve been 
getting nondeterministic errors from the old code) * save/load weights in SSL, big stepsize * cleanup SslProb * fix * get rid of kl coef * fix simplification, lower lr * search over hps * minor fixes * minor * static analysis * move files and rename things for improved consistency. still broken, and just saving before making nontrivial changes * various * make tests pass * move coinrun_train to codegen since it depends on codegen * fixes * pylint fixes * improve tests fix some things * improve tests * lint * fix up db_info.py, tests * mostly restore master version of envs directory, except for makefile changes * fix tests * improve printing * minor fixes * fix fixmes * pruning test * fixes * lint * write new test that makes tf graphs of random algos; fix some bugs it caught * add —delete flag to rcall upload-code command * lint * get cifar10 lazily for testing purposes * disable codegen ci tests for now * clean up rl_ops * rename spec classes * td3 with identity test * identity tests without gin files * remove gin.configurable from AlgoAgent * comments about reduction in rl_ops_cc * address @pzhokhov comments * fix tests * more linting * better tests * clean up filtering a bit * fix concat * delayed logger configuration (#208) * delayed logger configuration * fix typo * setters and getters for Logger.DEFAULT as well * do away with fancy property stuff - unable to get it to work with class level methods * grammar and spaces * spaces * use get_current function instead of reading Logger.CURRENT * autopep8 * disable mpi in subprocesses (#213) * lazy_mpi load * cleanups * more lazy mpi * don't pretend that class is a module, just use it as a class * mass-replace mpi4py imports * flake8 * fix previous lazy_mpi imports * silly recursion * try os.environ hack * better prefix test, work with mpich * restored MPI imports * removed commented import in test_with_mpi * restored codegen from master * remove lazy mpi * restored changes from rl-algs * remove extra files * address 
Chris' comments * use spawn for shmem vec env as well (#2) (#219) * lazy_mpi load * cleanups * more lazy mpi * don't pretend that class is a module, just use it as a class * mass-replace mpi4py imports * flake8 * fix previous lazy_mpi imports * silly recursion * try os.environ hack * better prefix test, work with mpich * restored MPI imports * removed commented import in test_with_mpi * restored codegen from master * remove lazy mpi * restored changes from rl-algs * remove extra files * port mpi fix to shmem vec env * increase the mpi test default timeout * change humanoid hyperparameters, get rid of clip_Frac annealing, as it's apparently dangerous * remove clip_frac schedule from ppo2 * more timesteps in humanoid run * whitespace + RUN BENCHMARKS * baselines: export vecenvs from folder (#221) * baselines: export vecenvs from folder * put missing function back in * add missing imports * more imports * longer mpi timeout? * make default logger configuration the same as call to logger.configure() (#222) * Vecenv refactor (#223) * update karl util * restore pvi flag * change rcall auto cpu behavior, move gin.configurable, add os.makedirs * vecenv refactor * aux buf index fix * add num aux obs * reset level with enter * restore high difficulty flag * bugfix * restore train_coinrun.py * tweaks * renaming * renaming * better arguments handling * more options * options cleanup * game data refactor * more options * args for train_procgen * add close handler to interactive base class * use debug build if debug=True, fix range on aux_obs * add ProcGenEnv to __init__.py, add missing imports to procgen.py * export RemoveDictWrapper and build, update train_procgen.py, move assets download into env creation and replace init_assets_and_build with just build * fix formatting issues * only call global init once * fix path in setup.py * revert part of makefile * ignore IDE files and folders * vec remove dict * export VecRemoveDictObs * remove RemoveDictWrapper * remove IDE files * 
move shared .h and .cpp files to common folder, update build to use those, dedupe env.cpp * fix missing header * try unified build function * remove old scripts dir * add comment on build * upload libenv with render fixes * tell qthreads to die when we unload the library * pyglet.app.run is garbage * static fixes * whoops * actually vsync is on * cleanup * cleanup * extern C for libenv interface * parse util rcall arg * high difficulty fix * game type enums * ProcGenEnv subclasses * game type cleanup * unrecognized key * unrecognized game type * parse util reorg * args management * typo fix * GinParser * arg tweaks * tweak * restore start_level/num_levels setting * fix create_procgen_env interface * build fix * procgen args in init signature * fix * build fix * fix logger usage in ppo_metal/run_retro * removed unnecessary OrderedDict requirement in subproc_vec_env * flake8 fix * allow for non-mpi tests * mpi test fixes * flake8; removed special logic for discrete spaces in dummy_vec_env * remove forked argument in front of tests - does not play nicely with subprocvecenv in spawned processes; analog of forked in ddpg/test_smoke * Everyrl initial commit & a few minor baselines changes (#226) * everyrl initial commit * add keep_buf argument to VecMonitor * logger changes: set_comm and fix to mpi_mean functionality * if filename not provided, don't create ResultsWriter * change variable syncing function to simplify its usage. 
now you should initialize from all mpi processes * everyrl coinrun changes * tf_distr changes, bugfix * get_one * bring back get_next to temporarily restore code * lint fixes * fix test * rename profile function * rename gaussian * fix coinrun training script * change random seeding to work with new gym version (#231) * change random seeding to work with new gym version * move seeding to seed() method * fix mnistenv * actually try some of the tests before pushing * more deterministic fixed seq * misc changes to vecenvs and run.py for benchmarks (#236) * misc changes to vecenvs and run.py for benchmarks * dont seed global gen * update more references to assert_venvs_equal * Rl19 (#232) * everyrl initial commit * add keep_buf argument to VecMonitor * logger changes: set_comm and fix to mpi_mean functionality * if filename not provided, don't create ResultsWriter * change variable syncing function to simplify its usage. now you should initialize from all mpi processes * everyrl coinrun changes * tf_distr changes, bugfix * get_one * bring back get_next to temporarily restore code * lint fixes * fix test * rename profile function * rename gaussian * fix coinrun training script * rl19 * remove everyrl dir which appeared in the merge for some reason * readme * fiddle with ddpg * make ddpg work * steps_total argument * gpu count * clean up hyperparams and shape math * logging + saving * configuration stuff * fixes, smoke tests * fix stats * make load_results return dicts -- easier to create the same kind of objects with some other mechanism for passing to downstream functions * benchmarks * fix tests * add dqn to tests, fix it * minor * turned annotated transformer (pytorch) into a script * more refactoring * jax stuff * cluster * minor * copy & paste alec code * sign error * add huber, rename some parameters, snapshotting off by default * remove jax stuff * minor * move maze env * minor * remove trailing spaces * remove trailing space * lint * fix test breakage due to gym 
update * rename function * move maze back to codegen * get recurrent ppo working * enable both lstm and gru * script to print table of benchmark results * various * fix dqn * add fixup initializer, remove lastrew * organize logging stats * fix silly bug * refactor models * fix mpi usage * check sync * minor * change vf coef, hps * clean up slicing in ppo * minor fixes * caching transformer * docstrings * xf fixes * get rid of 'B' and 'BT' arguments * minor * transformer example * remove output_kind from base class until we have a better idea how to use it * add comments, revert maze stuff * flake8 * codegen lint * fix codegen tests * responded to peter's comments * lint fixes * minor changes to baselines (#243) * minor changes to baselines * fix spaces reference * remove flake8 disable comments and fix import * okay maybe don't add spec to vec_env * Merge branch 'master' of github.com:openai/games the commit. * flake8 complaints in baselines/her * update dmlab30 env (#258) * codegen continuous control experiment pr (#256) * finish cherry-pick td3 test commit * removed graph simplification error ingore * merge delayed logger config * merge updated baselines logger * lazy_mpi load * cleanups * use lazy mpi imports in codegen * more lazy mpi * don't pretend that class is a module, just use it as a class * mass-replace mpi4py imports * flake8 * fix previous lazy_mpi imports * removed extra printouts from TdLayer op * silly recursion * running codegen cc experiment * wip * more wip * use actor is input for critic targets, instead of the action taken * batch size 100 * tweak update parameters * tweaking td3 runs * wip * use nenvs=2 for contcontrol (to be comparable with ppo_metal) * wip. 
Doubts about usefulness of actor in critic target * delayed actor in ActorLoss * score is average of last 100 * skip lack of losses or too many action distributions * 16 envs for contcontrol, replay buffer size equal to horizon (no point in making it longer) * syntax * microfixes * minifixes * run in process logic to bypass tensorflow freezes/failures (per Oleg's suggestion) * squash-merge master, resolve conflicts * remove erroneous file * restore normal MPI imports * move wrappers around a little bit * autopep8 * cleanups * cleanup mpi_eda, autopep8 * make activation function of action distribution customizable * cleanups; preparation for a pr * syntax * merge latest master, resolve conflicts * wrap MPI import with try/except * allow import of modules through env id im baselines cmd_util * flake8 complaints * only wrap box action spaces with ClipActionsWrapper * flake8 * fixes to algo_prob according to Oleg's suggestions * use apply_without_scope flag in ActorLoss * remove extra line in algo/core.py * Rl19 metalearning (#261) * rl19 metalearning and dict obs * master merge arch fix * lint fixes * view fixes * load vars tweaks * user config cleanup * documentation and revisions * pass train comm to rl19 * cleanup * Symshapes - gives codegen ability to evaluate same algo on envs with different ob/ac shapes (#262) * finish cherry-pick td3 test commit * removed graph simplification error ingore * merge delayed logger config * merge updated baselines logger * lazy_mpi load * cleanups * use lazy mpi imports in codegen * more lazy mpi * don't pretend that class is a module, just use it as a class * mass-replace mpi4py imports * flake8 * fix previous lazy_mpi imports * removed extra printouts from TdLayer op * silly recursion * running codegen cc experiment * wip * more wip * use actor is input for critic targets, instead of the action taken * batch size 100 * tweak update parameters * tweaking td3 runs * wip * use nenvs=2 for contcontrol (to be comparable with 
ppo_metal) * wip. Doubts about usefulness of actor in critic target * delayed actor in ActorLoss * score is average of last 100 * skip lack of losses or too many action distributions * 16 envs for contcontrol, replay buffer size equal to horizon (no point in making it longer) * syntax * microfixes * minifixes * run in process logic to bypass tensorflow freezes/failures (per Oleg's suggestion) * random physics for mujoco * random parts sizes with range 0.4 * add notebook with results into x/peterz * variations of ant * roboschool use gym.make kwargs * use float as lowest score after rank transform * rcall from master * wip * re-enable dynamic routing * wip * squash-merge master, resolve conflicts * remove erroneous file * restore normal MPI imports * move wrappers around a little bit * autopep8 * cleanups * cleanup mpi_eda, autopep8 * make activation function of action distribution customizable * cleanups; preparation for a pr * syntax * merge latest master, resolve conflicts * wrap MPI import with try/except * allow import of modules through env id im baselines cmd_util * flake8 complaints * only wrap box action spaces with ClipActionsWrapper * flake8 * fixes to algo_prob according to Oleg's suggestions * use apply_without_scope flag in ActorLoss * remove extra line in algo/core.py * multi-task support * autopep8 * symbolic suffix-shapes (not B,T yet) * test_with_mpi -> with_mpi rename * remove extra blank lines in algo/core * remove extra blank lines in algo/core * remove more blank lines * symbolify shapes in existing algorithms * minor output changes * cleaning up merge conflicts * cleaning up merge conflicts * cleaning up more merge conflicts * restore mpi_map.py from master * remove tensorflow dependency from VecEnv * make tests use single-threaded session for determinism of KfacOptimizer (#298) * make tests use single-threaded session for determinism of KfacOptimizer * updated comment in kfac.py * remove unused sess_config * add score calculator wrapper, 
forward property lookups on vecenv wrap… (#300) * add score calculator wrapper, forward property lookups on vecenv wrapper, misc cleanup * tests * pylint * fix vec monitor infos * Workbench (#303) * begin workbench * cleanup * begin procgen config integration * arg tweaks * more args * parameter saving * begin procgen enjoy * tweaks * more workbench * more args sync/restore * cleanup * merge in master * rework args priority * more workbench * more loggign * impala cnn * impala lstm * tweak * tweaks * rl19 time logging * misc fixes * faster pipeline * update local.py * sess and log config tweaks * num processes * logging tweaks * difficulty reward wrapper * logging fixes * gin tweaks * tweak * fix * task id * param loading * more variable loading * entrypoint * tweak * ksync * restore lstm * begin rl19 support * tweak * rl19 rnn * more rl19 integration * fix * cleanup * restore rl19 rnn * cleanup * cleanup * wrappers.get_log_info * cleanup * cleanup * directory cleanup * logging, num_experiments * fixes * cleanup * gin fixes * fix local max gpu * resid nx * num machines and download params * rename * cleanup * create workbench * more reorg * fix * more logging wrappers * lint fix * restore train procgen * restore train procgen * pylint fix * better wrapping * config sweep * args sweep * test workers * mpi_weight * train test comm and high difficulty fix * enjoy show returns * removing gin, procgen_parser * removing gin * procgen args * config fixes * cleanup * cleanup * procgen args fix * fix * rcall syncing * lint * rename mpi_weight * use username for sync * fixes * microbatch fix * Grad clipping in MpiAdamOptimizer, transformer changes (#304) * transformer mnist experiments * version that only builds one model * work on inverted mnist * Add grad clipping to MpiAdamOptimizer * various * transformer changes, loading * get rid of soft labels * transformer baseline * minor * experiments involving all possible training sets * vary training * minor * get ready for 
fine-tuning expers * lint * minor * Add jrl19 as backend for workbench (#324) enable jrl in workbench minor logger changes * extra functionality in baselines.common.plot_util (#310) * get plot_util from mt_experiments branch * add labels * unit tests for plot_util * Fixed sequence env minor (#333) minor changes to FixedSequenceEnv to allow full score * fix tests (#335) * Procgen Benchmark Updates (#328) * directory cleanup * logging, num_experiments * fixes * cleanup * gin fixes * fix local max gpu * resid nx * tweak * num machines and download params * rename * cleanup * create workbench * more reorg * fix * more logging wrappers * lint fix * restore train procgen * restore train procgen * pylint fix * better wrapping * whackamole walls * config sweep * tweak * args sweep * tweak * test workers * mpi_weight * train test comm and high difficulty fix * enjoy show returns * better joint training * tweak * Add —update to args and add gin-config to requirements.txt * add username to download_file * removing gin, procgen_parser * removing gin * procgen args * config fixes * cleanup * cleanup * procgen args fix * fix * rcall syncing * lint * rename mpi_weight * begin composable game * more composable game * tweak * background alpha * use username for sync * fixes * microbatch fix * lure composable game * merge * proc trans update * proc trans update (#307) * finetuning experiment * Change is_local to use `use_rcall` and fix error of `enjoy.py` with multiple ends * graphing help * add --local * change args_dict['env_name'] to ENV_NAME * finetune experiments * tweak * tweak * reorg wrappers, remove is_local * workdir/local fixes * move finetune experiments * default dir and graphing * more graphing * fix * pooled syncing * tweaks * dir fix * tweak * wrapper mpi fix * wind and turrets * composability cleanup * radius cleanup * composable reorg * laser gates * composable tweaks * soft walls * tweak * begin swamp * more swamp * more swamp * fix * hidden mines * use maze 
layout * tweak * laser gate tweaks * tweaks * tweaks * lure/propel updates * composable midnight * composable coinmaze * composability difficulty * tweak * add step to save_params * composable offsets * composable boxpush * composable combiner * tweak * tweak * always choose correct number of mechanics * fix * rcall local fix * add steps when dump and save parmas * loading rank 1,2,3.. error fix * add experiments.py * fix loading latest weight with no -rest * support more complex run_id and add more examples * fix typo * move post_run_id into experiments.py * add hp_search example * error fix * joint experiments in progress * joint hp finished * typo * error fix * edit experiments * Save experiments set up in code and save weights per step (#319) * add step to save_params * add steps when dump and save parmas * loading rank 1,2,3.. error fix * add experiments.py * fix loading latest weight with no -rest * support more complex run_id and add more examples * fix typo * move post_run_id into experiments.py * add hp_search example * error fix * joint experiments in progress * joint hp finished * typo * error fix * edit experiments * tweaks * graph exp WIP * depth tweaks * move save_all * fix * restore_dir name * restore depth * choose max mechanics * use override mode * tweak frogger * lstm default * fix * patience is composable * hunter is composable * fixed asset seed cleanup * minesweeper is composable * eggcatch is composable * tweak * applesort is composable * chaser game * begin lighter * lighter game * tractor game * boxgather game * plumber game * hitcher game * doorbell game * lawnmower game * connecter game * cannonaim * outrun game * encircle game * spinner game * tweak * tweak * detonator game * driller * driller * mixer * conveyor * conveyor game * joint pcg experiments * fixes * pcg sweep experiment * cannonaim fix * combiner fix * store save time * laseraim fix * lightup fix * detonator tweaks * detonator fixes * driller fix * lawnmower calibration * 
spinner calibration * propel fix * train experiment * print load time * system independent hashing * remove gin configurable * task ids fix * test_pcg experiment * connecter dense reward * hard_pcg * num train comms * mpi splits envs * tweaks * tweaks * graph tweaks * graph tweaks * lint fix * fix tests * load bugfix * difficulty timeout tweak * tweaks * more graphing * graph tweaks * tweak * download file fix * pcg train envs list * cleanup * tweak * manually name impala layers * tweak * expect fps * backend arg * args tweak * workbench cleanup * move graph files * workbench cleanup * split env name by comma * workbench cleanup * ema graph * remove Dict * use tf.io.gfile * comments for auto-killing jobs * lint fix * write latest file when not saving all and load it when step=None * ci/runtests.sh - pass all folders to pytest (#342) * ci/runtests.sh - pass all folders to pytest * mpi_optimizer_test precision 1e-4 * fixes to tests * search for tests in the entire jax folder, also remove unnecessary humor * delete unnecessary stuff (#338) * Add initializer for process-level setup in SubprocVecEnv (#276) * Add initializer for process-level setup in SubprocVecEnv Use case: run logger.configure() in each subprocess * Add option to force dummy vec env * Procgen fixes (#352) * tweak * documentation * rely on log_comm, remove mpi averaging from wrappers * pass comm for ppo2 initialization * ppo2 logging * experiment tweaks * auto launch tensorboard when using local backend * graph tweaks * pass caller to config * configure logger and tensorboard * make parent dir if necessary * parentdir tweak * JRL PPO test with delayed identity env (#355) * add a custom delay to identity_env * min reward 0.8 in delayed identity test * seed the tests, perfect score on delayed_identity_test * delay=1 in delayed_identity_test * flake8 complaints * increased number of steps in fixed_seq_test * seed identity tests to ensure reproducibility * docstrings * (onp, np) -> (np, jp), switch jax code 
to use mark_slow decorator (#363) switch to mark_slow decorator * fix tests - add matplotlib to setup_requires, put mpi4py import in try-except * test fixes | 08 May 2019, 18:36:10 UTC |
3301089 | pzhokhov | 26 April 2019, 23:14:49 UTC | remove bullet extra, constrain gym version to be >= 0.10.0 (#885) * remove bullet extra, constrain gym version to be >= 0.10.0 * constrain gym version from above | 26 April 2019, 23:14:49 UTC |
a07fad9 | pzhokhov | 26 April 2019, 23:14:21 UTC | change rms 2 tfrms switch in vec_normalize to be more explicit (#886) * change rms 2 tfrms switch in vec_normalize to be more explicit * modify the vec_normalize / use_tf logic a little bit * typo * use_tf = False by default | 26 April 2019, 23:14:21 UTC |
5d8041d | Taeyeong Jeong | 19 April 2019, 22:00:09 UTC | Fix indexing LazyFrames (#875) Indexing LazyFrames with index i should return the single channel frame | 19 April 2019, 22:00:09 UTC |
fa37beb | Peter Zhokhov | 07 April 2019, 03:03:32 UTC | fix commit on atari bms page to point to a public commit | 07 April 2019, 03:03:32 UTC |
8a97e0d | Peter Zhokhov | 05 April 2019, 22:23:46 UTC | fix shuffling bug in ppo1 | 05 April 2019, 22:23:46 UTC |
fabbf2c | pzhokhov | 05 April 2019, 22:18:15 UTC | short-circuit framestack wrapper with size 1 (#871) | 05 April 2019, 22:18:15 UTC |
5d285b3 | Xingdong Zuo | 05 April 2019, 22:16:26 UTC | [Update misc_util.py]: clean up unused helper functions (#751) * Update misc_util.py * Update misc_util.py | 05 April 2019, 22:16:26 UTC |
49a99c7 | Tim Zaman | 05 April 2019, 21:46:01 UTC | Add eps to normalization (#797) | 05 April 2019, 21:46:01 UTC |
c79b337 | Peter Zhokhov | 05 April 2019, 21:43:09 UTC | parse colon-separated env_id's | 05 April 2019, 21:43:09 UTC |
6d1c6c7 | Sridhar Thiagarajan | 01 April 2019, 23:24:02 UTC | Interface for U.make_session changed (#865) | 01 April 2019, 23:24:02 UTC |
62a9c76 | JongGyun Kim | 01 April 2019, 22:49:25 UTC | fix the definition of `TfInput.make_feed_dict`. (#812) | 01 April 2019, 22:49:25 UTC |
282c9cc | Hao-Chih, Lin | 01 April 2019, 22:48:35 UTC | fix small bug in plot_results() (#864) Remove the comma behind the last input argument | 01 April 2019, 22:48:35 UTC |
096f4d9 | Peter Zhokhov | 01 April 2019, 22:47:13 UTC | neaten up stacking logic in mujoco_dset in gail | 01 April 2019, 22:47:13 UTC |
16136dd | Mingfei | 01 April 2019, 22:44:31 UTC | fix bugs: obs_ph normalization in adversary.py (#823) * fix bugs: obs_ph normalization in adversary.py * fix bug in reshape obs and acs in Mujoco_Dset | 01 April 2019, 22:44:31 UTC |
b164415 | Darío Hereñú | 01 April 2019, 22:41:52 UTC | Fixed typo on #092 (#824) | 01 April 2019, 22:41:52 UTC |
58541db | Yu Feng | 01 April 2019, 22:38:45 UTC | MPI refer to workers as ranks, not threads. (#833) | 01 April 2019, 22:38:45 UTC |
c02b575 | zlsh80826 | 01 April 2019, 22:37:32 UTC | ppo2: use time.perf_counter() instead of time.time() for time measurement (#847) | 01 April 2019, 22:37:32 UTC |
897fa31 | Pastafarianist | 29 March 2019, 20:25:56 UTC | Avoid using default config while requesting available GPUs (#863) | 29 March 2019, 20:25:56 UTC |
d51f8be | Brett Daley | 28 March 2019, 16:21:48 UTC | Report episode rewards/length in A2C and ACKTR (#856) | 28 March 2019, 16:21:48 UTC |
3f2f45a | Jacob Hilton | 25 March 2019, 21:33:15 UTC | Merge pull request #860 from openai/build-retro-env-framestack-fix run.py framestack bug fix | 25 March 2019, 21:33:15 UTC |
b64974e | Jacob Hilton | 24 March 2019, 19:27:14 UTC | build_env now doesn't apply frame stack to retro games twice | 24 March 2019, 19:27:14 UTC |
1b09243 | pzhokhov | 16 March 2019, 18:54:47 UTC | remove f-strings for python 3.5 compatibility (#854) | 16 March 2019, 18:54:47 UTC |
1259f6a | Peter Zhokhov | 12 March 2019, 00:44:03 UTC | check for environment being vectorized in the play logic in run.py | 12 March 2019, 00:44:03 UTC |
74101a9 | pzhokhov | 12 March 2019, 00:28:51 UTC | fix freeze of ppo2 (#849) * fix freeze of ppo2 * unit test for freeze, updated docstring * more docstring update * set number of threads to 1 in the test | 12 March 2019, 00:28:51 UTC |
90d6677 | JongGyun Kim | 06 March 2019, 23:13:01 UTC | remove one of duplicated lines. (#813) | 06 March 2019, 23:13:01 UTC |
b875fb7 | pzhokhov | 27 February 2019, 23:35:31 UTC | release Internal changes (#800) * joshim5 changes (width and height to WarpFrame wrapper) * match network output with action distribution via a linear layer only if necessary (#167) * support color vs. grayscale option in WarpFrame wrapper (#166) * support color vs. grayscale option in WarpFrame wrapper * Support color in other wrappers * Updated per Peters suggestions * fixing test failures * ppo2 with microbatches (#168) * pass microbatch_size to the model during construction * microbatch fixes and test (#169) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * Peterz joshim5 subclass ppo2 model (#170) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * subclassing the model to make microbatched version of model WIP * made microbatched model a subclass of ppo2 Model * flake8 complaint * mpi-less ppo2 (resolving merge conflict) * flake8 and mpi4py imports in ppo2/model.py * more un-mpying * merge master * updates to the benchmark viewer code + autopep8 (#184) * viz docs and syntactic sugar wip * update viewer yaml to use persistent volume claims * move plot_util to baselines.common, update links * use 1Tb hard drive for results viewer * small updates to benchmark vizualizer code * autopep8 * autopep8 * any folder can be a benchmark * massage games image a little bit * fixed --preload option in app.py * remove preload from run_viewer.sh * remove pdb breakpoints * update bench-viewer.yaml * fixed bug (#185) * fixed bug it's wrong to do the else statement, because no other nodes would start. 
* changed the fix slightly * Refactor her phase 1 (#194) * add monitor to the rollout envs in her RUN BENCHMARKS her * Slice -> Slide in her benchmarks RUN BENCHMARKS her * run her benchmark for 200 epochs * dummy commit to RUN BENCHMARKS her * her benchmark for 500 epochs RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * disable saving of policies in her benchmark RUN BENCHMARKS her * run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch * run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch * launcher refactor wip * wip * her works on FetchReach * her runner refactor RUN BENCHMARKS Fetch1M * unit test for her * fixing warnings in mpi_average in her, skip test_fetchreach if mujoco is not present * pickle-based serialization in her * remove extra import from subproc_vec_env.py * investigating differences in rollout.py * try with old rollout code RUN BENCHMARKS her * temporarily use DummyVecEnv in cmd_util.py RUN BENCHMARKS her * dummy commit to RUN BENCHMARKS her * set info_values in rollout worker in her RUN BENCHMARKS her * bug in rollout_new.py RUN BENCHMARKS her * fixed bug in rollout_new.py RUN BENCHMARKS her * do not use last step because vecenv calls reset and returns obs after reset RUN BENCHMARKS her * updated buffer sizes RUN BENCHMARKS her * fixed loading/saving via joblib * dust off learning from demonstrations in HER, docs, refactor * add deprecation notice on her play and plot files * address comments by Matthias * 1.5 months of codegen changes (#196) * play with resnet * feed_dict version * coinrun prob and more stats * fixes to get_choices_specs & hp search * minor prob fixes * minor fixes * minor * alternative version of rl_algo stuff * pylint fixes * fix bugs, move node_filters to soup * changed how get_algo 
works * change how get_algo works, probably broke all tests * continue previous refactor * get eval_agent running again * fixing tests * fix tests * fix more tests * clean up cma stuff * fix experiment * minor changes to eval_agent to make ppo_metal use gpu * make dict space work * modify mac makefile to use conda * recurrent layers * play with bn and resnets * minor hp changes * minor * got rid of use_fb argument and jtft (joint-train-fine-tune) functionality built test phase directly into AlgoProb * make new rl algos generateable * pylint; start fixing tests * fixing tests * more test fixes * pylint * fix search * work on search * hack around infinite loop caused by scan * algo search fixes * misc changes for search expt * enable annealing, overriding options of Op * pylint fixes * identity op * achieve use_last_output through masking so it automatically works in other distributions * fix tests * minor * discrete * use_last_output to be just a preference, not a hard constraint * pred delay, pruning * require nontrivial inputs * aliases for get_sm * add probname to probs * fixes * small fixes * fix tests * fix tests * fix tests * minor * test scripts * dualgru network improvements * minor * work on mysterious bugs * rcall gpu-usage command for kube * use cache dir that’s not in code folder, so that it doesn’t get removed by rcall code rsync * add power mode to gpu usage * make sure train/test actually different * remove VR for now * minor fixes * simplify soln_db * minor * big refactor of mpi eda * improve mpieda for multitask * - get rid of timelimit hack - add __del__ to cleanup SubprocVecEnv * get multitask working better * fixes * working on atari, various * annotate ops with whether they’re parametrized * minor * gym version * rand atari prob * minor * SolnDb bugfix and name change * pyspy script * switch conv layers * fix roboschool/bullet3 * nenvs assertion * fix rand atari * get rid of blanket exception catching fix soln_db bug * fix rand_atari * dynamic 
routing as cmdline arg * slight modifications to test_mpi_map and pyspy-all * max_tries argument for run_until_successs * dedup option in train_mle * simplify soln_db * increase atari horizon for 1 experiment * start implementing reward increment * ent multiplier * create cc dsl other misc fixes * cc ops * q_func -> qs in rl_algos_cc.py * fix PredictDistr * rl_ops_cc fixes, MakeAction op * augment algo agent to support cc stuff * work on ddpg experiments * fix blocking temporarily change logger * allow layer scaling * pylint fixes * spawn_method * isolate ddpg hacks * improve pruning * use spawn for subproc * remove use of python -c in rcall * fix pylint warning * fix static * maybe fix local backend * switch to DummyVecEnv * making some fixes via pylint * pylint fixes * fixing tests * fix tests * fix tests * write scaffolding for SSL in Codegen * logger fix * fix error * add EMA op to sl_ops * save many changes * save * add upsampler * add sl ops, enhance state machine * get ssl search working — some gross hacking * fix session/graph issue * fix importing * work on mle * - scale embeddings in gru model - better exception handling in sl_prob - use emas for test/val - use non-contrib batch_norm layer * improve logging * option to average before dumping in logger * default arguments, etc * new ddpg and identity test * concat fix * minor * move realistic ssl stuff to third-party (underscore to dash) * fixes * remove realistic_ssl_evaluation * pylint fixes * use gym master * try again * pass around args without gin * fix tests * separate line to install gym * rename failing tests that should be ignored * add data aug * ssl improvements * use fixed time limit * try to fix baselines tests * add score_floor, max_walltime, fiddle with lr decay * realistic_ssl * autopep8 * various ssl - enable blocking grad for simplification - kl - multiple final prediction * fix pruning * misc ssl stuff * bring back linear schedule, don’t use allgather for collecting stats (i’ve been 
getting nondeterministic errors from the old code) * save/load weights in SSL, big stepsize * cleanup SslProb * fix * get rid of kl coef * fix simplification, lower lr * search over hps * minor fixes * minor * static analysis * move files and rename things for improved consistency. still broken, and just saving before making nontrivial changes * various * make tests pass * move coinrun_train to codegen since it depends on codegen * fixes * pylint fixes * improve tests fix some things * improve tests * lint * fix up db_info.py, tests * mostly restore master version of envs directory, except for makefile changes * fix tests * improve printing * minor fixes * fix fixmes * pruning test * fixes * lint * write new test that makes tf graphs of random algos; fix some bugs it caught * add —delete flag to rcall upload-code command * lint * get cifar10 lazily for testing purposes * disable codegen ci tests for now * clean up rl_ops * rename spec classes * td3 with identity test * identity tests without gin files * remove gin.configurable from AlgoAgent * comments about reduction in rl_ops_cc * address @pzhokhov comments * fix tests * more linting * better tests * clean up filtering a bit * fix concat * delayed logger configuration (#208) * delayed logger configuration * fix typo * setters and getters for Logger.DEFAULT as well * do away with fancy property stuff - unable to get it to work with class level methods * grammar and spaces * spaces * use get_current function instead of reading Logger.CURRENT * autopep8 * disable mpi in subprocesses (#213) * lazy_mpi load * cleanups * more lazy mpi * don't pretend that class is a module, just use it as a class * mass-replace mpi4py imports * flake8 * fix previous lazy_mpi imports * silly recursion * try os.environ hack * better prefix test, work with mpich * restored MPI imports * removed commented import in test_with_mpi * restored codegen from master * remove lazy mpi * restored changes from rl-algs * remove extra files * address 
Chris' comments * use spawn for shmem vec env as well (#2) (#219) * lazy_mpi load * cleanups * more lazy mpi * don't pretend that class is a module, just use it as a class * mass-replace mpi4py imports * flake8 * fix previous lazy_mpi imports * silly recursion * try os.environ hack * better prefix test, work with mpich * restored MPI imports * removed commented import in test_with_mpi * restored codegen from master * remove lazy mpi * restored changes from rl-algs * remove extra files * port mpi fix to shmem vec env * increase the mpi test default timeout * change humanoid hyperparameters, get rid of clip_Frac annealing, as it's apparently dangerous * remove clip_frac schedule from ppo2 * more timesteps in humanoid run * whitespace + RUN BENCHMARKS * baselines: export vecenvs from folder (#221) * baselines: export vecenvs from folder * put missing function back in * add missing imports * more imports * longer mpi timeout? * make default logger configuration the same as call to logger.configure() (#222) * Vecenv refactor (#223) * update karl util * restore pvi flag * change rcall auto cpu behavior, move gin.configurable, add os.makedirs * vecenv refactor * aux buf index fix * add num aux obs * reset level with enter * restore high difficulty flag * bugfix * restore train_coinrun.py * tweaks * renaming * renaming * better arguments handling * more options * options cleanup * game data refactor * more options * args for train_procgen * add close handler to interactive base class * use debug build if debug=True, fix range on aux_obs * add ProcGenEnv to __init__.py, add missing imports to procgen.py * export RemoveDictWrapper and build, update train_procgen.py, move assets download into env creation and replace init_assets_and_build with just build * fix formatting issues * only call global init once * fix path in setup.py * revert part of makefile * ignore IDE files and folders * vec remove dict * export VecRemoveDictObs * remove RemoveDictWrapper * remove IDE files * 
move shared .h and .cpp files to common folder, update build to use those, dedupe env.cpp * fix missing header * try unified build function * remove old scripts dir * add comment on build * upload libenv with render fixes * tell qthreads to die when we unload the library * pyglet.app.run is garbage * static fixes * whoops * actually vsync is on * cleanup * cleanup * extern C for libenv interface * parse util rcall arg * high difficulty fix * game type enums * ProcGenEnv subclasses * game type cleanup * unrecognized key * unrecognized game type * parse util reorg * args management * typo fix * GinParser * arg tweaks * tweak * restore start_level/num_levels setting * fix create_procgen_env interface * build fix * procgen args in init signature * fix * build fix * fix logger usage in ppo_metal/run_retro * removed unnecessary OrderedDict requirement in subproc_vec_env * flake8 fix * allow for non-mpi tests * mpi test fixes * flake8; removed special logic for discrete spaces in dummy_vec_env * remove forked argument in front of tests - does not play nicely with subprocvecenv in spawned processes; analog of forked in ddpg/test_smoke * Everyrl initial commit & a few minor baselines changes (#226) * everyrl initial commit * add keep_buf argument to VecMonitor * logger changes: set_comm and fix to mpi_mean functionality * if filename not provided, don't create ResultsWriter * change variable syncing function to simplify its usage. 
now you should initialize from all mpi processes * everyrl coinrun changes * tf_distr changes, bugfix * get_one * bring back get_next to temporarily restore code * lint fixes * fix test * rename profile function * rename gaussian * fix coinrun training script * change random seeding to work with new gym version (#231) * change random seeding to work with new gym version * move seeding to seed() method * fix mnistenv * actually try some of the tests before pushing * more deterministic fixed seq * misc changes to vecenvs and run.py for benchmarks (#236) * misc changes to vecenvs and run.py for benchmarks * dont seed global gen * update more references to assert_venvs_equal * Rl19 (#232) * everyrl initial commit * add keep_buf argument to VecMonitor * logger changes: set_comm and fix to mpi_mean functionality * if filename not provided, don't create ResultsWriter * change variable syncing function to simplify its usage. now you should initialize from all mpi processes * everyrl coinrun changes * tf_distr changes, bugfix * get_one * bring back get_next to temporarily restore code * lint fixes * fix test * rename profile function * rename gaussian * fix coinrun training script * rl19 * remove everyrl dir which appeared in the merge for some reason * readme * fiddle with ddpg * make ddpg work * steps_total argument * gpu count * clean up hyperparams and shape math * logging + saving * configuration stuff * fixes, smoke tests * fix stats * make load_results return dicts -- easier to create the same kind of objects with some other mechanism for passing to downstream functions * benchmarks * fix tests * add dqn to tests, fix it * minor * turned annotated transformer (pytorch) into a script * more refactoring * jax stuff * cluster * minor * copy & paste alec code * sign error * add huber, rename some parameters, snapshotting off by default * remove jax stuff * minor * move maze env * minor * remove trailing spaces * remove trailing space * lint * fix test breakage due to gym 
update * rename function * move maze back to codegen * get recurrent ppo working * enable both lstm and gru * script to print table of benchmark results * various * fix dqn * add fixup initializer, remove lastrew * organize logging stats * fix silly bug * refactor models * fix mpi usage * check sync * minor * change vf coef, hps * clean up slicing in ppo * minor fixes * caching transformer * docstrings * xf fixes * get rid of 'B' and 'BT' arguments * minor * transformer example * remove output_kind from base class until we have a better idea how to use it * add comments, revert maze stuff * flake8 * codegen lint * fix codegen tests * responded to peter's comments * lint fixes * minor changes to baselines (#243) * minor changes to baselines * fix spaces reference * remove flake8 disable comments and fix import * okay maybe don't add spec to vec_env * Merge branch 'master' of github.com:openai/games the commit. * flake8 complaints in baselines/her | 27 February 2019, 23:35:31 UTC |
675b100 | Peter Zhokhov | 27 February 2019, 22:22:24 UTC | raised the tolerance on the test_microbatches test | 27 February 2019, 22:22:24 UTC |
adc4388 | Peter Zhokhov | 27 February 2019, 20:49:40 UTC | fixes to catch changes in gym | 27 February 2019, 20:49:40 UTC |
5b41c92 | Rishav1 | 31 January 2019, 18:23:38 UTC | fix #795: Making tf_util._Function consistent (#796) * fix #795: Making tf_util._Function consistent The fix involves using the placeholder name to crossreference passed kwargs values, just like the tf_util.function expects. Also, the givens are updated before the parameters to make it behave like it's supposed to. * test: Adding test for issue #795 | 31 January 2019, 18:23:38 UTC |
ab02fae | Peter Zhokhov | 31 January 2019, 00:21:57 UTC | fixes related to new gym and new flake8 | 31 January 2019, 00:21:57 UTC |
b55eda1 | ethanwaldie | 23 January 2019, 03:22:28 UTC | Added required arguments to the policy builder in the ACER model to (#784) * Added required arguments to the policy builder in the ACER model to fix the issue #783 * Changed the step model from nbatch to nenvs * Updated nsteps to be 1. | 23 January 2019, 03:22:28 UTC |
57e05eb | pzhokhov | 10 January 2019, 06:30:52 UTC | remove noop code (#781) | 10 January 2019, 06:30:52 UTC |
01ab1d8 | Nikhil Barhate | 09 January 2019, 19:21:53 UTC | fixed typo (#779) | 09 January 2019, 19:21:53 UTC |
7368343 | Alex Ray | 04 January 2019, 23:49:51 UTC | Merge pull request #777 from openai/aray-extra-imports add an argument for importing extra modules from run | 04 January 2019, 23:49:51 UTC |
4d0746b | Alex Ray | 03 January 2019, 19:33:31 UTC | add an argument for importing extra modules from run | 03 January 2019, 19:33:31 UTC |
5115707 | Ankesh Anand | 21 December 2018, 20:47:48 UTC | Recognize nightly tf builds (#763) * Recognize nightly tf builds * Use LooseVersion instead of StrictVersion to recognize nightly build numbers Nightly version numbers are of the form `1.3.0.dev20181215` but it's not a valid version number for `StrictVersion`, while `LooseVersion` still recognizes it. | 21 December 2018, 20:47:48 UTC |
6c44fb2 | pzhokhov | 19 December 2018, 22:44:08 UTC | refactor HER - phase 1 (#767) * joshim5 changes (width and height to WarpFrame wrapper) * match network output with action distribution via a linear layer only if necessary (#167) * support color vs. grayscale option in WarpFrame wrapper (#166) * support color vs. grayscale option in WarpFrame wrapper * Support color in other wrappers * Updated per Peters suggestions * fixing test failures * ppo2 with microbatches (#168) * pass microbatch_size to the model during construction * microbatch fixes and test (#169) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * Peterz joshim5 subclass ppo2 model (#170) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * subclassing the model to make microbatched version of model WIP * made microbatched model a subclass of ppo2 Model * flake8 complaint * mpi-less ppo2 (resolving merge conflict) * flake8 and mpi4py imports in ppo2/model.py * more un-mpying * merge master * updates to the benchmark viewer code + autopep8 (#184) * viz docs and syntactic sugar wip * update viewer yaml to use persistent volume claims * move plot_util to baselines.common, update links * use 1Tb hard drive for results viewer * small updates to benchmark vizualizer code * autopep8 * autopep8 * any folder can be a benchmark * massage games image a little bit * fixed --preload option in app.py * remove preload from run_viewer.sh * remove pdb breakpoints * update bench-viewer.yaml * fixed bug (#185) * fixed bug it's wrong to do the else statement, because no other nodes would start. 
* changed the fix slightly * Refactor her phase 1 (#194) * add monitor to the rollout envs in her RUN BENCHMARKS her * Slice -> Slide in her benchmarks RUN BENCHMARKS her * run her benchmark for 200 epochs * dummy commit to RUN BENCHMARKS her * her benchmark for 500 epochs RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * add num_timesteps to her benchmark to be compatible with viewer RUN BENCHMARKS her * disable saving of policies in her benchmark RUN BENCHMARKS her * run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch * run fetch benchmarks with ppo2 and ddpg RUN BENCHMARKS Fetch * launcher refactor wip * wip * her works on FetchReach * her runner refactor RUN BENCHMARKS Fetch1M * unit test for her * fixing warnings in mpi_average in her, skip test_fetchreach if mujoco is not present * pickle-based serialization in her * remove extra import from subproc_vec_env.py * investigating differences in rollout.py * try with old rollout code RUN BENCHMARKS her * temporarily use DummyVecEnv in cmd_util.py RUN BENCHMARKS her * dummy commit to RUN BENCHMARKS her * set info_values in rollout worker in her RUN BENCHMARKS her * bug in rollout_new.py RUN BENCHMARKS her * fixed bug in rollout_new.py RUN BENCHMARKS her * do not use last step because vecenv calls reset and returns obs after reset RUN BENCHMARKS her * updated buffer sizes RUN BENCHMARKS her * fixed loading/saving via joblib * dust off learning from demonstrations in HER, docs, refactor * add deprecation notice on her play and plot files * address comments by Matthias | 19 December 2018, 22:44:08 UTC |
146bbf8 | Timothy Lee | 30 November 2018, 01:28:09 UTC | Removed code that prevented changes to actor loss when training with demos (#740) | 30 November 2018, 01:28:08 UTC |
f3a5aba | pzhokhov | 27 November 2018, 01:57:25 UTC | added smoke tests of ddpg (#734) | 27 November 2018, 01:57:25 UTC |
97e0391 | pzhokhov | 27 November 2018, 01:56:41 UTC | Fix ppo2 with MPI bug, other minor fixes (#735) * joshim5 changes (width and height to WarpFrame wrapper) * match network output with action distribution via a linear layer only if necessary (#167) * support color vs. grayscale option in WarpFrame wrapper (#166) * support color vs. grayscale option in WarpFrame wrapper * Support color in other wrappers * Updated per Peters suggestions * fixing test failures * ppo2 with microbatches (#168) * pass microbatch_size to the model during construction * microbatch fixes and test (#169) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * Peterz joshim5 subclass ppo2 model (#170) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * subclassing the model to make microbatched version of model WIP * made microbatched model a subclass of ppo2 Model * flake8 complaint * mpi-less ppo2 (resolving merge conflict) * flake8 and mpi4py imports in ppo2/model.py * more un-mpying * merge master * updates to the benchmark viewer code + autopep8 (#184) * viz docs and syntactic sugar wip * update viewer yaml to use persistent volume claims * move plot_util to baselines.common, update links * use 1Tb hard drive for results viewer * small updates to benchmark vizualizer code * autopep8 * autopep8 * any folder can be a benchmark * massage games image a little bit * fixed --preload option in app.py * remove preload from run_viewer.sh * remove pdb breakpoints * update bench-viewer.yaml * fixed bug (#185) * fixed bug it's wrong to do the else statement, because no other nodes would start. * changed the fix slightly | 27 November 2018, 01:56:41 UTC |
25ecb64 | pzhokhov | 27 November 2018, 00:30:37 UTC | fixed issue with wrong output layer variable names in ddpg (#733) | 27 November 2018, 00:30:37 UTC |
7dc6bc7 | Prabhat Nagarajan | 27 November 2018, 00:19:09 UTC | fixes typo (#732) * fixes typo * adds apostrophe | 27 November 2018, 00:19:09 UTC |
7139a66 | Christopher Hesse | 21 November 2018, 23:00:51 UTC | Merge pull request #728 from openai/christopherhesse-patch-1 Update README.md | 21 November 2018, 23:00:51 UTC |
8607dca | Christopher Hesse | 21 November 2018, 22:57:10 UTC | Update README.md | 21 November 2018, 22:57:10 UTC |
9f9835f | pzhokhov | 21 November 2018, 20:51:15 UTC | Update __init__.py | 21 November 2018, 20:51:15 UTC |
d3fed18 | sedand | 14 November 2018, 22:50:59 UTC | Fixed comment on example usage in jupyter-notebook (#396) Cause of error: Import name must be results_plotter, not log_viewer. | 14 November 2018, 22:50:59 UTC |
339d564 | Roman Ring | 14 November 2018, 20:22:42 UTC | add docs for layer_norm param in DQN baseline (#107) | 14 November 2018, 20:22:42 UTC |
a75bc37 | Buck Shlegeris | 14 November 2018, 20:20:55 UTC | fix typo in a comment (#161) | 14 November 2018, 20:20:55 UTC |
87b3a04 | Peter Zhokhov | 14 November 2018, 20:16:53 UTC | autopep8 | 14 November 2018, 20:16:53 UTC |
c5b1a1b | Brent Komer | 13 November 2018, 21:08:32 UTC | typo fix (#230) | 13 November 2018, 21:08:32 UTC |
c59a109 | JohannesAck | 13 November 2018, 21:03:48 UTC | Parameter documentation for tf_util.function (#349) * Added parameter documentation This parameter was thus far not documented and is non-intuitive when unfamiliar with tf. * Added parameter documentation | 13 November 2018, 21:03:48 UTC |
5cd6601 | James Alan Preiss | 13 November 2018, 19:09:11 UTC | case-insensitive sort for human-readable logger (#289) | 13 November 2018, 19:09:11 UTC |
0a13da8 | Xiaoquan Kong | 13 November 2018, 19:08:21 UTC | Change variable name from `inpt` to `input_` (#297) | 13 November 2018, 19:08:21 UTC |
18b6390 | Vladislav Zavadskyy | 13 November 2018, 19:03:55 UTC | Typo fix (#287) | 13 November 2018, 19:03:55 UTC |
52255be | pzhokhov | 09 November 2018, 19:18:05 UTC | microbatches in ppo2, custom frame size in WarpFrame, matching fc layer only when needed (#707) * joshim5 changes (width and height to WarpFrame wrapper) * match network output with action distribution via a linear layer only if necessary (#167) * support color vs. grayscale option in WarpFrame wrapper (#166) * support color vs. grayscale option in WarpFrame wrapper * Support color in other wrappers * Updated per Peters suggestions * fixing test failures * ppo2 with microbatches (#168) * pass microbatch_size to the model during construction * microbatch fixes and test (#169) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * Peterz joshim5 subclass ppo2 model (#170) * microbatch fixes and test * tiny cleanup * added assertions to the test * vpg-related fix * subclassing the model to make microbatched version of model WIP * made microbatched model a subclass of ppo2 Model * flake8 complaint * mpi-less ppo2 (resolving merge conflict) * flake8 and mpi4py imports in ppo2/model.py * more un-mpying | 09 November 2018, 19:18:05 UTC |
d80acbb | AurelianTactics | 08 November 2018, 18:13:07 UTC | Removing print spam from Wrapper (#705) * DDPG has unused 'seed' argument DeepQ, PPO2, ACER, trpo_mpi, A2C, and ACKTR have the code for: ``` from baselines.common import set_global_seeds ... def learn(...): ... set_global_seeds(seed) ``` DDPG has the argument 'seed=None' but doesn't have the two lines of code needed to set the global seeds. * DDPG: duplicate variable assignment variable nb_actions assigned same value twice in space of 10 lines nb_actions = env.action_space.shape[-1] * DDPG: noise_type 'normal_x' and 'ou_x' cause assert noise_type default 'adaptive-param_0.2' works but the arguments that change from parameter noise to actor noise (like 'normal_0.2' and 'ou_0.2' cause an assert message and DDPG not to run. Issue is noise following block: ''' if self.action_noise is not None and apply_noise: noise = self.action_noise() assert noise.shape == action.shape action += noise ''' noise is not nested: [number_of_actions] actions is nested: [[number_of_actions]] Can either nest noise or unnest actions * Revert "DDPG: noise_type 'normal_x' and 'ou_x' cause assert" * DDPG: noise_type 'normal_x' and 'ou_x' cause AssertionError noise_type default 'adaptive-param_0.2' works but the arguments that change from parameter noise to actor noise (like 'normal_0.2' and 'ou_0.2') cause an assert message and DDPG not to run. Issue is the following block: ''' if self.action_noise is not None and apply_noise: noise = self.action_noise() assert noise.shape == action.shape action += noise ''' noise is not nested: [number_of_actions] action is nested: [[number_of_actions]] Hence the shapes do not pass the assert line even though the action += noise line is correct * Removing Print Spam from Wrapper Prints a line every time a video is saved or not saved. Seems unnecessary. | 08 November 2018, 18:13:07 UTC |
556b198 | pzhokhov | 08 November 2018, 18:11:45 UTC | Internal minifixes (#694) * joshim5 changes (width and height to WarpFrame wrapper) * match network output with action distribution via a linear layer only if necessary (#167) * support color vs. grayscale option in WarpFrame wrapper (#166) * support color vs. grayscale option in WarpFrame wrapper * Support color in other wrappers * Updated per Peters suggestions * fixing test failures | 08 November 2018, 18:11:45 UTC |
cc88804 | pzhokhov | 08 November 2018, 01:20:52 UTC | Update viz.ipynb | 08 November 2018, 01:20:52 UTC |
c14d307 | pzhokhov | 08 November 2018, 01:19:42 UTC | move viz docs to a notebook entirely (#704) * viz docs * writing vizualization docs * documenting plot_util * docstrings in plot_util * autopep8 and flake8 * spelling (using default vim spellchecker and ingoring things like dataframe, docstring and etc) * rephrased viz.md a little bit * more examples of viz code usage in the docs * replaced vizualization doc with notebook | 08 November 2018, 01:19:42 UTC |
0b71d4c | pzhokhov | 08 November 2018, 01:19:25 UTC | remove unused args of DDPG class (#702) | 08 November 2018, 01:19:25 UTC |
7bb405c | pzhokhov | 07 November 2018, 22:25:35 UTC | Update viz.md | 07 November 2018, 22:25:35 UTC |
8b95576 | pzhokhov | 07 November 2018, 01:02:20 UTC | more viz + build fixes (#703) * viz docs * writing vizualization docs * documenting plot_util * docstrings in plot_util * autopep8 and flake8 * spelling (using default vim spellchecker and ingoring things like dataframe, docstring and etc) * rephrased viz.md a little bit * more examples of viz code usage in the docs | 07 November 2018, 01:02:20 UTC |
9d4fb76 | Peter Zhokhov | 06 November 2018, 17:58:43 UTC | making num_envs and video length smaller in test_video_recorder to prevent hanging on travis | 06 November 2018, 17:58:43 UTC |
664ec6f | Peter Zhokhov | 06 November 2018, 03:19:39 UTC | catch bugfixes in gym | 06 November 2018, 03:19:39 UTC |
3917321 | Peter Zhokhov | 06 November 2018, 01:00:40 UTC | revert over-spellchecking | 06 November 2018, 01:00:40 UTC |
6e607ef | coord.e | 05 November 2018, 22:32:17 UTC | Add video recorder (#666) * Fix: Return the result of rendering from dummyvecenv * Add: Add a video recorder wrapper for vecenv * Change: Use VecVideoRecorder with --video_monitor flag * Change: Overwrite the metadata only when it isn't defined * Add: Define __del__ to make the file correctly closed in exit * Fix: Bump epidode_id in reset() * Fix: Use hasattr to check the existence of .metadata * Fix: Make directory when it doesn't exist * Change: Kepp recording for `video_length` steps, then close Because reset() is not what it is in normal gym.Env * Add: Enable to specify video_length from command line argument * Delete: Delete default value, None, of video_callable * Change: Use self.recorded_frames and self.recording to manage intervals * Add: Log the status of video recording * Fix: Fix saving path * Change: Place metadata in the base VecEnv * Delete: Delete unused imports * Fix: epidode_id => step_id * Fix: Refine the flag name * Change: Unify the flag name folloing to previous change * [WIP] Add: Add a test of VecVideoRecorder * Fix: Use PongNoFrameskip-v0 because SimpleEnv doesn't have render() * Change; Use TemporaryDirectory * Fix: minimal successful test * Add: Test against parallel environments * Add: Test against different type of VecEnvs * Change: Test against different length and interval of video capture * Delete: Reduce the number of tests * Change: Test if the output video is not empty * Add: Add some comments * Fix: Fix the flag name * Add: Add docstrings * Fix: Install ffmpeg in testing container for VecVideoRecorder's test * Fix: Delete unused things * Fix: Replace `video_callable` with `record_video_trigger` * Fix: Improve the explanation of `record_video_trigger` argument * Fix: Close owning vecenv in VecVideoRecorder.close to resolve memory leak | 05 November 2018, 22:32:17 UTC |
c74ce02 | pzhokhov | 05 November 2018, 22:31:15 UTC | visualization code docs / bugfixes (#701) * viz docs * writing vizualization docs * documenting plot_util * docstrings in plot_util * autopep8 and flake8 * spelling (using default vim spellchecker and ingoring things like dataframe, docstring and etc) * rephrased viz.md a little bit | 05 November 2018, 22:31:15 UTC |
ab59de6 | pzhokhov | 31 October 2018, 18:15:41 UTC | mpi-less baselines (#689) * make baselines run without mpi wip * squash-merged latest master * further removing MPI references where unnecessary * more MPI removal * syntax and flake8 * MpiAdam becomes regular Adam if Mpi not present * autopep8 * add assertion to test in mpi_adam; fix trpo_mpi failure without MPI on cartpole * mpiless ddpg | 31 October 2018, 18:15:41 UTC |
a071fa7 | Mathieu Poliquin | 30 October 2018, 17:17:46 UTC | Add retro to ppo2 defaults (#682) * Adds retro to ppo2 defaults Created defaults for retro, copied from Atari defaults for now. Tested with SuperMarioBros-Nes * ppo2 retro defaults to atari | 30 October 2018, 17:17:46 UTC |
637bf55 | Mathieu Poliquin | 30 October 2018, 17:16:15 UTC | Use deepmind wrapper for retro (#685) * Use deepmind wrapper for retro * moved wrap_deepmind_retro after Monitor wrapper | 30 October 2018, 17:16:15 UTC |
165c622 | AurelianTactics | 30 October 2018, 17:13:39 UTC | DDPG: noise_type 'normal_x' and 'ou_x' cause AssertionError (#680) * DDPG has unused 'seed' argument DeepQ, PPO2, ACER, trpo_mpi, A2C, and ACKTR have the code for: ``` from baselines.common import set_global_seeds ... def learn(...): ... set_global_seeds(seed) ``` DDPG has the argument 'seed=None' but doesn't have the two lines of code needed to set the global seeds. * DDPG: duplicate variable assignment variable nb_actions assigned same value twice in space of 10 lines nb_actions = env.action_space.shape[-1] * DDPG: noise_type 'normal_x' and 'ou_x' cause assert noise_type default 'adaptive-param_0.2' works but the arguments that change from parameter noise to actor noise (like 'normal_0.2' and 'ou_0.2' cause an assert message and DDPG not to run. Issue is noise following block: ''' if self.action_noise is not None and apply_noise: noise = self.action_noise() assert noise.shape == action.shape action += noise ''' noise is not nested: [number_of_actions] actions is nested: [[number_of_actions]] Can either nest noise or unnest actions * Revert "DDPG: noise_type 'normal_x' and 'ou_x' cause assert" * DDPG: noise_type 'normal_x' and 'ou_x' cause AssertionError noise_type default 'adaptive-param_0.2' works but the arguments that change from parameter noise to actor noise (like 'normal_0.2' and 'ou_0.2') cause an assert message and DDPG not to run. Issue is the following block: ''' if self.action_noise is not None and apply_noise: noise = self.action_noise() assert noise.shape == action.shape action += noise ''' noise is not nested: [number_of_actions] action is nested: [[number_of_actions]] Hence the shapes do not pass the assert line even though the action += noise line is correct | 30 October 2018, 17:13:39 UTC |
93c7cc2 | Peter Zhokhov | 29 October 2018, 22:25:38 UTC | Merge branch 'master' of github.com:openai/baselines | 29 October 2018, 22:25:38 UTC |
de36116 | Peter Zhokhov | 29 October 2018, 22:25:31 UTC | update tensorflow version check regex to parse version like 1.2.3rc4 (previously only 1.2.3-rc4) | 29 October 2018, 22:25:31 UTC |
e2b4182 | Mathieu Poliquin | 29 October 2018, 20:30:41 UTC | Set 'cnn' as default network for retro (#683) | 29 October 2018, 20:30:41 UTC |
8e56dde | pzhokhov | 24 October 2018, 18:01:59 UTC | Multidiscrete action space compatibility for policy gradient-based methods (#677) * multidiscrete space compatibility * flake8 and syntax | 24 October 2018, 18:01:59 UTC |
c3bd8ce | Juliano Laganá | 24 October 2018, 17:00:31 UTC | Adds description of param_noise parameter in deepq.learn method (#675) | 24 October 2018, 17:00:31 UTC |
84ea7aa | AurelianTactics | 24 October 2018, 16:59:46 UTC | DDPG has unused 'seed' argument (#676) DeepQ, PPO2, ACER, trpo_mpi, A2C, and ACKTR have the code for: ``` from baselines.common import set_global_seeds ... def learn(...): ... set_global_seeds(seed) ``` DDPG has the argument 'seed=None' but doesn't have the two lines of code needed to set the global seeds. | 24 October 2018, 16:59:46 UTC |
88300ed | Peter Zhokhov | 24 October 2018, 16:57:57 UTC | fix raise NotImplemented() complaints of latest flake8 | 24 October 2018, 16:57:57 UTC |
583ba08 | pzhokhov | 23 October 2018, 18:22:27 UTC | Update cmd_util.py | 23 October 2018, 18:22:27 UTC |
014a559 | pzhokhov | 23 October 2018, 17:01:25 UTC | refactor ACER (#664) * make acer use vecframestack * acer passes mnist test with 20k steps * acer with non-image observations and tests * flake8 * test acer serialization with non-recurrent policies | 23 October 2018, 17:01:25 UTC |
4ed1350 | Isaac Poulton | 23 October 2018, 17:00:09 UTC | Fixed TypeError on creating atari vec envs (#671) | 23 October 2018, 17:00:09 UTC |
8513d73 | Rishabh Jangir | 23 October 2018, 02:04:40 UTC | HER : new functionality, enables demo based training (#474) * Add, initialize, normalize and sample from a demo buffer * Modify losses and add cloning loss * Add demo file parameter to train.py * Introduce new params in config.py for demo based training * Change logger.warning to logger.warn in rollout.py;bug * Add data generation file for Fetch environments * Update README file | 23 October 2018, 02:04:40 UTC |
c28acb2 | Xingdong Zuo | 23 October 2018, 02:01:26 UTC | [Clean-up]: delete `running_stat` and `filters` as they are replaced by `running_mean_std` and not used anymore (#614) * Delete filters.py * Delete running_stat.py | 23 October 2018, 02:01:26 UTC |