Revision history - origin: https://github.com/openai/baselines


Revision	Author	Date	Message	Commit Date
165c622	AurelianTactics	30 October 2018, 17:13:39 UTC	DDPG: noise_type 'normal_x' and 'ou_x' cause AssertionError (#680) * DDPG has unused 'seed' argument DeepQ, PPO2, ACER, trpo_mpi, A2C, and ACKTR have the code for: ``` from baselines.common import set_global_seeds ... def learn(...): ... set_global_seeds(seed) ``` DDPG has the argument 'seed=None' but doesn't have the two lines of code needed to set the global seeds. * DDPG: duplicate variable assignment variable nb_actions assigned same value twice in space of 10 lines nb_actions = env.action_space.shape[-1] * DDPG: noise_type 'normal_x' and 'ou_x' cause assert noise_type default 'adaptive-param_0.2' works but the arguments that change from parameter noise to actor noise (like 'normal_0.2' and 'ou_0.2' cause an assert message and DDPG not to run. Issue is noise following block: ''' if self.action_noise is not None and apply_noise: noise = self.action_noise() assert noise.shape == action.shape action += noise ''' noise is not nested: [number_of_actions] actions is nested: [[number_of_actions]] Can either nest noise or unnest actions * Revert "DDPG: noise_type 'normal_x' and 'ou_x' cause assert" * DDPG: noise_type 'normal_x' and 'ou_x' cause AssertionError noise_type default 'adaptive-param_0.2' works but the arguments that change from parameter noise to actor noise (like 'normal_0.2' and 'ou_0.2') cause an assert message and DDPG not to run. Issue is the following block: ''' if self.action_noise is not None and apply_noise: noise = self.action_noise() assert noise.shape == action.shape action += noise ''' noise is not nested: [number_of_actions] action is nested: [[number_of_actions]] Hence the shapes do not pass the assert line even though the action += noise line is correct	30 October 2018, 17:13:39 UTC
93c7cc2	Peter Zhokhov	29 October 2018, 22:25:38 UTC	Merge branch 'master' of github.com:openai/baselines	29 October 2018, 22:25:38 UTC
de36116	Peter Zhokhov	29 October 2018, 22:25:31 UTC	update tensorflow version check regex to parse version like 1.2.3rc4 (previously only 1.2.3-rc4)	29 October 2018, 22:25:31 UTC
e2b4182	Mathieu Poliquin	29 October 2018, 20:30:41 UTC	Set 'cnn' as default network for retro (#683)	29 October 2018, 20:30:41 UTC
8e56dde	pzhokhov	24 October 2018, 18:01:59 UTC	Multidiscrete action space compatibility for policy gradient-based methods (#677) * multidiscrete space compatibility * flake8 and syntax	24 October 2018, 18:01:59 UTC
c3bd8ce	Juliano Laganá	24 October 2018, 17:00:31 UTC	Adds description of param_noise parameter in deepq.learn method (#675)	24 October 2018, 17:00:31 UTC
84ea7aa	AurelianTactics	24 October 2018, 16:59:46 UTC	DDPG has unused 'seed' argument (#676) DeepQ, PPO2, ACER, trpo_mpi, A2C, and ACKTR have the code for: ``` from baselines.common import set_global_seeds ... def learn(...): ... set_global_seeds(seed) ``` DDPG has the argument 'seed=None' but doesn't have the two lines of code needed to set the global seeds.	24 October 2018, 16:59:46 UTC
88300ed	Peter Zhokhov	24 October 2018, 16:57:57 UTC	fix raise NotImplemented() complaints of latest flake8	24 October 2018, 16:57:57 UTC
583ba08	pzhokhov	23 October 2018, 18:22:27 UTC	Update cmd_util.py	23 October 2018, 18:22:27 UTC
014a559	pzhokhov	23 October 2018, 17:01:25 UTC	refactor ACER (#664) * make acer use vecframestack * acer passes mnist test with 20k steps * acer with non-image observations and tests * flake8 * test acer serialization with non-recurrent policies	23 October 2018, 17:01:25 UTC
4ed1350	Isaac Poulton	23 October 2018, 17:00:09 UTC	Fixed TypeError on creating atari vec envs (#671)	23 October 2018, 17:00:09 UTC
8513d73	Rishabh Jangir	23 October 2018, 02:04:40 UTC	HER : new functionality, enables demo based training (#474) * Add, initialize, normalize and sample from a demo buffer * Modify losses and add cloning loss * Add demo file parameter to train.py * Introduce new params in config.py for demo based training * Change logger.warning to logger.warn in rollout.py;bug * Add data generation file for Fetch environments * Update README file	23 October 2018, 02:04:40 UTC
c28acb2	Xingdong Zuo	23 October 2018, 02:01:26 UTC	[Clean-up]: delete `running_stat` and `filters` as they are replaced by `running_mean_std` and not used anymore (#614) * Delete filters.py * Delete running_stat.py	23 October 2018, 02:01:26 UTC
c5d9c4a	pzhokhov	23 October 2018, 01:36:39 UTC	wrap retro envs correctly for other (non-deepq) algorithms (#669) * wrap retro envs correctly for other (non-deepq) algorithms * flake and csh comments * flake and csh comments	23 October 2018, 01:36:39 UTC
c0fa11a	pzhokhov	22 October 2018, 16:15:04 UTC	minor fixes from internal (#665) * sync internal changes. Make ddpg work with vecenvs * B -> nenvs for consistency with other algos, small cleanups * eval_done[d]==True -> eval_done[d] * flake8 and numpy.random.random_integers deprecation warning * Merge branch 'master' of github.com:openai/games into peterz_track_baselines_branch	22 October 2018, 16:15:04 UTC
bd390c2	Peter Zhokhov	20 October 2018, 00:50:54 UTC	updated docstring for deepq	20 October 2018, 00:50:54 UTC
d0cc325	pzhokhov	19 October 2018, 15:54:21 UTC	store session at policy creation time (#655) * sync internal changes. Make ddpg work with vecenvs * B -> nenvs for consistency with other algos, small cleanups * eval_done[d]==True -> eval_done[d] * flake8 and numpy.random.random_integers deprecation warning * store session at policy creation time * coexistence tests * fix a typo * autopep8 * ... and flake8 * updated todo links in test_serialization	19 October 2018, 15:54:21 UTC
fc7f9ce	pzhokhov	18 October 2018, 23:07:14 UTC	disable gym subpackages in setup.py (#661) * disable gym subpackages in setup.py * include gym[atari] in test requirements * gym[atari] -> atari-py in test requirements	18 October 2018, 23:07:14 UTC
3677dc1	Matthew Rahtz	18 October 2018, 20:54:39 UTC	Set allow_growth=True for MuJoCo session (#643)	18 October 2018, 20:54:39 UTC
ef96f38	Matthew Rahtz	16 October 2018, 23:28:23 UTC	Drop S and M args so that --play works (#636)	16 October 2018, 23:28:23 UTC
a03dacd	pzhokhov	16 October 2018, 23:26:46 UTC	sync internal changes. Make ddpg work with vecenvs (#654) * sync internal changes. Make ddpg work with vecenvs * B -> nenvs for consistency with other algos, small cleanups * eval_done[d]==True -> eval_done[d] * flake8 and numpy.random.random_integers deprecation warning	16 October 2018, 23:26:46 UTC
e57f81b	Tianhong Dai	16 October 2018, 23:22:06 UTC	revise the readme of ddpg (#653)	16 October 2018, 23:22:06 UTC
28aca63	Peter Zhokhov	09 October 2018, 16:48:31 UTC	update benchmark results	09 October 2018, 16:48:31 UTC
7bfbcf1	Erik Doffagne	04 October 2018, 17:31:22 UTC	Fixed typos in README (#635)	04 October 2018, 17:31:22 UTC
394339d	pzhokhov	04 October 2018, 03:53:58 UTC	Update README.md	04 October 2018, 03:53:58 UTC
10c205c	pzhokhov	02 October 2018, 23:33:19 UTC	Debug codegen ppo (#123) * disabled tests, running benchmarks only * dummy commit to RUN BENCHMARKS * benchmark ppo_metal; disable all but Bullet benchmarks * ppo2, codegen ppo and ppo_metal on Bullet RUN BENCHMARKS * run benchmarks on Roboschool instead RUN BENCHMARKS * run ppo_metal on Roboschool as well RUN BENCHMARKS * install roboschool in cron rcall user_config * dummy commit to RUN BENCHMARKS * import roboschool in codegen/contcontrol_prob.py RUN BENCHMARKS * re-enable tests, flake8 * get entropy from a distribution in Pred RUN BENCHMARKS * gin for hyperparameter injection; try codegen ppo close to baselines ppo RUN BENCHMARKS * provide default value for cg2/bmv_net_ops.py * dummy commit to RUN BENCHMARKS * make tests and benchmarks parallel; use relative path to gin file for rcall compatibility RUN BENCHMARKS * syntax error in run-benchmarks-new.py RUN BENCHMARKS * syntax error in run-benchmarks-new.py RUN BENCHMARKS * path relative to codegen/training for gin files RUN BENCHMARKS * another reconcilliation attempt between codegen ppo and baselines ppo RUN BENCHMARKS * value_network=copy for ppo2 on roboschool RUN BENCHMARKS * make None seed work with torch seeding RUN BENCHMARKS * try sequential batches with ppo2 RUN BENCHMARKS * try ppo without advantage normalization RUN BENCHMARKS * use Distribution to compute ema NLL RUN BENCHMARKS * autopep8 * clip gradient norm in algo_agent RUN BENCHMARKS * try ppo2 without vfloss clipping RUN BENCHMARKS * trying with gamma=0.0 - assumption is, both algos should be equally bad RUN BENCHMARKS * set gamma=0 in ppo2 RUN BENCHMARKS * try with ppo2 with single minibatch RUN BENCHMARKS * try with nminibatches=4, value_network=copy RUN BENCHMARKS * try with nminibatches=1 take two RUN BENCHMARKS * try initialization for vf=0.01 RUN BENCHMARKS * fix the problem with min_istart >= max_istart * i have no idea RUN BENCHMARKS * fix non-shared variance between old and new RUN BENCHMARKS * restored baselines.common.policies * 16 minibatches in ppo_roboschool.gin * fixing results of merge * cleanups * cleanups * fix run-benchmarks-new RUN BENCHMARKS Roboschool8M * fix syntax in run-benchmarks-new RUN BENCHMARKS Roboschool8M * fix test failures * moved gin requirement to codegen/setup.py * remove duplicated build_softq in get_algo.py * linting * run softq on continuous action spaces RUN BENCHMARKS Roboschool8M	03 October 2018, 21:38:32 UTC
62fe7c4	pzhokhov	02 October 2018, 22:54:14 UTC	disable async acktr (#129) * disable async acktr * linting * linting * linting	03 October 2018, 21:38:32 UTC
fbdf55f	Xingyou Song	01 October 2018, 18:39:14 UTC	Xsong lqr ddpg (#125) * allows vec_envs to work * allows vec_envs to work * fixed branch with correct ddpg * running experiments jointly now * changed to subproc * changed to subproc * changed to subproc * small fix md * removed placeholder * removed placeholder * added ppotest * probably fixed ddpg hyperparam issues * checkpoint * edited readme * added orthogonal * added orthogonal * added ddpg-vecenv * reverted ddpg to old baselines	03 October 2018, 21:38:32 UTC
9ee804c	Christopher Hesse	01 October 2018, 17:38:07 UTC	minor change to install.py and baselines run.py (#121)	03 October 2018, 21:38:32 UTC
4cf7dc9	John Schulman	30 September 2018, 21:54:44 UTC	Big refactor (#124) * massive revision inspired by soup: algo folder works * porting rl commands, WIP * various * git subrepo push --remote=git@github.com:openai/codegen.git --branch=refactor codegen subrepo: subdir: "codegen" merged: "aa27e069" upstream: origin: "git@github.com:openai/codegen.git" branch: "refactor" commit: "aa27e069" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8" * various * rewrite RL stuff in new framework * fix almost everything * woohoo tests pass * more tests * reformatting * fixes * write tests for embeddings * re-remove cg2 * pylint * minor * move smooth_helpers import; seems to cause nondeterministic failure in parallel pytest	03 October 2018, 21:38:32 UTC
e820b86	Xingyou Song	27 September 2018, 20:11:11 UTC	ppo2 now has eval stats (#120) * ppo2 now has eval stats * fixed spaces * fixed kwargs ordering * whitespace fix	03 October 2018, 21:38:32 UTC
858afa8	pzhokhov	26 September 2018, 22:28:52 UTC	Refactor DDPG (#111) * run ddpg on Mujoco benchmark RUN BENCHMARKS * autopep8 * fixed all syntax in refactored ddpg * a little bit more refactoring * autopep8 * identity test with ddpg WIP * enable test_identity with ddpg * refactored ddpg RUN BENCHMARKS * autopep8 * include ddpg into style check * fixing tests RUN BENCHMARKS * set default seed to None RUN BENCHMARKS * run tests and benchmarks in separate buildkite steps RUN BENCHMARKS * cleanup pdb usage * flake8 and cleanups * re-enabled all benchmarks in run-benchmarks-new.py * flake8 complaints * deepq model builder compatible with network functions returning single tensor * remove ddpg test with test_discrete_identity * make ppo_metal use make_vec_env instead of make_atari_env * make ppo_metal use make_vec_env instead of make_atari_env * fixed syntax in ppo_metal.run_atari	03 October 2018, 21:38:32 UTC
4121d9c	pzhokhov	03 October 2018, 21:37:40 UTC	fix DQN learning bug (#632) * Update run.py * Update utils.py * Update utils.py	03 October 2018, 21:37:40 UTC
34ae319	Peter Zhokhov	27 September 2018, 19:51:43 UTC	add a note about DQN algorithms not performing well	27 September 2018, 19:51:43 UTC
4402b8e	Thomas Simonini	24 September 2018, 16:54:41 UTC	Updated A2C and PPO2 comments (#612) * Updated A2C and PPO2 comments * Fixed format errors to respect PEP 8 style guide	24 September 2018, 16:54:41 UTC
555a5cb	ahuhn	22 September 2018, 00:22:56 UTC	Adding num_env to readme example (#609) * Adding num_env to readme example * Updated readme example fix	22 September 2018, 00:22:56 UTC
8158f35	Thomas Simonini	21 September 2018, 20:12:31 UTC	Wrote some comments to explain the A2C and PPO2 implementation (#607) * added comments in A2C and PPO2 * Fixed format errors to respect PEP 8 style guide	21 September 2018, 20:12:31 UTC
a7fd8a4	cclauss	20 September 2018, 23:40:03 UTC	Run flake8 to find syntax errors and undefined names (#439) __E901,E999,F821,F822,F823__ are the "showstopper" flake8 issues that can halt the runtime with a SyntaxError, NameError, etc. The other flake8 issues are merely "style violations" -- useful for readability but they do not effect runtime safety. This PR therefore recommends a flake8 run of those tests on the entire codebase. * F821: undefined name `name` * F822: undefined name `name` in `__all__` * F823: local variable `name` referenced before assignment * E901: SyntaxError or IndentationError * E999: SyntaxError -- failed to compile a file into an Abstract Syntax Tree	20 September 2018, 23:40:03 UTC
e791565	John Schulman	20 September 2018, 20:31:25 UTC	Codegen more abstract abstract classes 3a (#106) * Soup code, arch search on CIFAR-10 * Oh I understood how choice_sequence() worked * Undo some pointless changes * Some beautification 1 * Some beautification 2 * An attempt to debug test_get_algo_outputs() number 70, unsuccessful. * Code style warning * Code style warnings, more * wip * wip * wip * fix almost everything; soup machine still broken * revert mpi_eda changes * minor fixes	20 September 2018, 23:19:07 UTC
7859f60	XFFXFF	20 September 2018, 23:16:44 UTC	prioritized experience replay bug (#527)	20 September 2018, 23:16:44 UTC
0f4ae2f	pzhokhov	20 September 2018, 23:05:26 UTC	refactor acktr (#560) * refactor acktr * setup.cfg now tests style/syntax in acktr as well * flake8 complaints * added note about continuous action spaces for acktr into the README.md	20 September 2018, 23:05:26 UTC
0e7048b	pzhokhov	19 September 2018, 22:04:54 UTC	Update README.md	19 September 2018, 22:04:54 UTC
75983ba	pzhokhov	19 September 2018, 22:04:01 UTC	Update README.md	19 September 2018, 22:04:01 UTC
85be745	Alfredo Canziani	19 September 2018, 16:43:45 UTC	Add possibility of plotting timesteps vs episodes (#578) * Add possibility of plotting timesteps vs episodes * Remove leftover from personal project patch * Auto plt.tight_layout() on resize window event Calls `plt.tight_layout()` if a `resize_event` is issued. This means that the plot will look good even after the user has resized the plotting window.	19 September 2018, 16:43:45 UTC
115b59d	Geoffrey Irving	18 September 2018, 22:52:57 UTC	Merge pull request #598 from openai/irving-rc Fix setup.py for tensorflow -rc versions	18 September 2018, 22:52:57 UTC
d34049c	Xingdong Zuo	18 September 2018, 21:14:38 UTC	Update running_mean_std.py (#585)	18 September 2018, 21:14:38 UTC
59662ff	pzhokhov	18 September 2018, 21:13:05 UTC	rename entcoeff to ent_coef in trpo_mpi for compatibility with other algos (#581)	18 September 2018, 21:13:05 UTC
a42c4eb	Geoffrey Irving	18 September 2018, 18:35:43 UTC	Fix setup.py for tensorflow -rc versions	18 September 2018, 18:35:43 UTC
68a29d0	R1ckF	17 September 2018, 21:33:39 UTC	--play now works with LSTM (#595)	17 September 2018, 21:33:39 UTC
0c6f357	Xingdong Zuo	17 September 2018, 16:53:34 UTC	Delete identity_env.py (#588)	17 September 2018, 16:53:34 UTC
4dc697e	pzhokhov	14 September 2018, 01:18:45 UTC	codegen test fixes (#95) * fix discovered test failures * autopep8 * test indices up to 123 * testing from index 124 on * add scope to logstd * fix flakiness in test_train_mle * autopep8	14 September 2018, 22:43:50 UTC
e790f52	Peter Zhokhov	13 September 2018, 22:37:04 UTC	define mean for CategoricalPd (as softmax of logits)	14 September 2018, 22:43:50 UTC
fe06c6b	pzhokhov	12 September 2018, 17:14:41 UTC	continuous action spaces for codegen + some benchmarking (#82) * add some docstrings * start making big changes * state machine redesign * sampling seems to work * some reorg * fixed sampling of real vals * json conversion * made it possible to register new commands got nontrivial version of Pred working * consolidate command definitions * add more macro blocks * revived visualization * rename Userdata -> CmdInterpreter make AlgoSmInstance subclass of SmInstance that uses appropriate userdata argument * replace userdata by ci when appropriate * minor test fixes * revamped handmade dir, can run ppo_metal * seed to avoid random test failure * implement AlgoAgent * Autogenerated object that performs all ops and macros * more CmdRecorder changes * move files around * move MatchProb and JtftProb * remove obsolete * fix tests involving AlgoAgent (pending the next commit on ppo_metal code) * ppo_metal: reduce duplication in policy_gen, make sess an attribute of PpoAgent and StochasticPolicy instead of using get_default_session everywhere. * maze_env reformatting, move algo_search script (but stil broken) * move agent.py * fix test on handcrafted agents * tuning/fixing ppo_metal baseline * minor * Fix ppo_metal baseline * Don’t set epcount, tcount unless they’re being used * get rid of old ppo_metal baseline * fixes for handmade/run.py tuning * fix codegen ppo * fix handmade ppo hps * fix test, go back to safe_div * switch to more complex filtering * make sure all handcrafted algos have finite probability * train to maximize logprob of provided samples Trex changes to avoid segfault * AlgoSm also includes global hyperparams * don’t duplicate global hyperparam defaults * create generic_ob_ac_space function * use sorted list of outkeys * revive tsne * todo changes * determinism test * todo + test fix * remove a few deprecated files, rename other tests so they don’t run automatically, fix real test failure * continuous control with codegen * continuous control with codegen * implement continuous action space algodistr * ppo with trex RUN BENCHMARKS * wrap trex in a monitor * dummy commit to RUN BENCHMARKS * adding monitor to trex env RUN BENCHMARKS * adding monitor to trex RUN BENCHMARKS * include monitor into trex env RUN BENCHMARKS * generate nll and predmean using Distribution node * dummy commit to RUN BENCHMARKS * include pybullet into baselines optional dependencies * dummy commit to RUN BENCHMARKS * install games for cron rcall user RUN BENCHMARKS * add --yes flag to install.py in rcall config for cron user RUN BENCHMARKS * both continuous and discrete versions seem to run * fixes to monitor to work with vecenv-like info and rewards RUN BENCHMARKS * dummy commit to RUN BENCHMARKS * removed shape check from one-hot encoding logic in distributions.CategoricalPd * reset logger configuration in codegen/handmade/run.py to be in-line with baselines RUN BENCHMARKS * merged peterz_codegen_benchmarks RUN BENCHMARKS * skip tests RUN BENCHMARKS * working on test failures * save benchmark dicts RUN BENCHMARK * merged peterz_codegen_benchmark RUN BENCHMARKS * add get_git_commit_message to the baselines.common.console_util * dummy commit to RUN BENCHMARKS * merged fixes from peterz_codegen_benchmark RUN BENCHMARKS * fixing failure in test_algo_nll WIP * test_algo_nll passes with both ppo and softq * re-enabled tests * run trex on gpus for 100k total (horizon=100k / 16) RUN BENCHMARKS * merged latest peterz_codegen_benchmarks RUN BENCHMARKS * fixing codegen test failures (logging-related) * fixed name collision in run-benchmarks-new.py RUN BENCHMARKS * fixed name collision in run-benchmarks-new.py RUN BENCHMARKS * fixed import in node_filters.py * test_algo_search passes * some cleanup * dummy commit to RUN BENCHMARKS * merge fast fail for subprocvecenv RUN BENCHMARKS * use SubprocVecEnv in sonic_prob * added deprecation note to shmem_vec_env * allow indexing of distributions * add timeout to pipeline.yaml * typo in pipeline.yml * run tests with --forked option * resolved merge conflict in rl_algs.bench.benchmarks * re-enable parallel tests * fix remaining merge conflicts and syntax * Update trex_prob.py * fixes to ResultsWriter * take baselines/run.py from peterz_codegen branch * actually save stuff to file in VecMonitor RUN BENCHMARKS * enable parallel tests * merge stricter flake8 * merge peterz_codegen_benchmark, resolve conflicts * autopep8 * remove traces of Monitor from trex env, check shapes before encoding in CategoricalPd * asserts and warnings to make q -> distribution change more explicit * fixed assert in CategoricalPd * add header to vec_monitor output file RUN BENCHMARKS * make VecMonitor write header to the output file * remove deprecation message from shmem_vec_env RUN BENCHMARKS * autopep8 * proper shape test in distributions.py * ResultsWriter can take dict headers * dummy commit to RUN BENCHMARKS * replace assert len(qs)==1 with warning RUN BENCHMARKS * removed pdb from ppo2 RUN BENCHMARKS	14 September 2018, 22:43:49 UTC
1f99a56	Peter Zhokhov	11 September 2018, 20:21:52 UTC	autopep8	11 September 2018, 20:21:52 UTC
4e2a888	Peter Zhokhov	11 September 2018, 20:19:39 UTC	Merge commit 'refs/subrepo/baselines/fetch' into subrepo/baselines	11 September 2018, 20:19:39 UTC
c5b2918	Peter Zhokhov	11 September 2018, 19:48:16 UTC	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "2742f819" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "5c5a9f4b" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	11 September 2018, 20:18:43 UTC
3bf31a4	Peter Zhokhov	11 September 2018, 19:42:47 UTC	git subrepo commit (merge) baselines subrepo: subdir: "baselines" merged: "0846932a" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "c5d6f299" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	11 September 2018, 20:18:43 UTC
9070ee7	pzhokhov	11 September 2018, 18:01:51 UTC	tighten flake8, autopep8 to fix trailing whitespaces and blank lines with whitespaces (#87)	11 September 2018, 20:18:43 UTC
e568034	Peter Zhokhov	10 September 2018, 19:50:51 UTC	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "5c6a1fd9" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "23b23332" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	11 September 2018, 20:18:42 UTC
b3bc25d	pzhokhov	10 September 2018, 18:58:22 UTC	add fast failure when calling methods on a closed subprocvecenv (#84)	11 September 2018, 20:18:42 UTC
5183fa9	Peter Zhokhov	11 September 2018, 19:47:50 UTC	autopep8 on deepq/experiments	11 September 2018, 19:47:50 UTC
5c5a9f4	Peter Zhokhov	11 September 2018, 19:47:50 UTC	autopep8 on deepq/experiments	11 September 2018, 19:47:50 UTC
3bf35cb	Peter Zhokhov	11 September 2018, 19:44:51 UTC	added peterz to baselines authorlist	11 September 2018, 19:44:51 UTC
5c62f5c	Peter Zhokhov	11 September 2018, 19:44:51 UTC	added peterz to baselines authorlist	11 September 2018, 19:44:51 UTC
29bf587	Peter Zhokhov	11 September 2018, 19:40:29 UTC	Merge branch 'master' of github.com:openai/baselines	11 September 2018, 19:40:29 UTC
c5d6f29	Peter Zhokhov	11 September 2018, 19:40:29 UTC	Merge branch 'master' of github.com:openai/baselines	11 September 2018, 19:40:29 UTC
06bdc28	Peter Zhokhov	11 September 2018, 19:40:23 UTC	docstrings about vecenvs	11 September 2018, 19:40:23 UTC
23b2333	pzhokhov	10 September 2018, 18:50:59 UTC	baselines issue #564 (#574) * fixes to enjoy_cartpole, enjoy_mountaincar.py * fixed {train,enjoy}_pong, removed enjoy_retro * set number of timesteps to 1e7 in train_pong * flake8 complaints * use synchronous version fo acktr in test_env_after_learn * flake8	10 September 2018, 18:50:59 UTC
adaa8ae	pzhokhov	10 September 2018, 18:50:59 UTC	baselines issue #564 (#574) * fixes to enjoy_cartpole, enjoy_mountaincar.py * fixed {train,enjoy}_pong, removed enjoy_retro * set number of timesteps to 1e7 in train_pong * flake8 complaints * use synchronous version fo acktr in test_env_after_learn * flake8	10 September 2018, 18:50:59 UTC
8614c4d	Peter Zhokhov	10 September 2018, 17:41:29 UTC	flake8	10 September 2018, 17:41:29 UTC
59a7ffb	Peter Zhokhov	10 September 2018, 17:32:42 UTC	fixe tests of test_env_after_learn	10 September 2018, 17:32:42 UTC
58b1021	Daniel Angelov	08 September 2018, 00:04:02 UTC	Add tensorboard start command for convenience (#569)	08 September 2018, 00:04:02 UTC
a60e88b	Peter Zhokhov	07 September 2018, 21:42:29 UTC	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "8785db28" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "35e95ee8" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	07 September 2018, 23:35:00 UTC
75b93b8	pzhokhov	06 September 2018, 23:17:59 UTC	implement pdfromlatent in BernoulliPdType (#81) * implement pdfromlatent in BernoulliPdType * remove env.close() at the end of algorithms * test case for environment after learn * closing env in run.py * fixes for acktr and trpo_mpi * add make_session with new graph for every call in test_env_after_learn * remove extra prints from test_env_after_learn	07 September 2018, 23:35:00 UTC
565b215	John Schulman	06 September 2018, 22:31:30 UTC	Add lots of docstrings (#76) * Add lots of docstrings Change hyperparameter transformations for slightly better efficiency and to avoid circular dependency. Now all parameters are stored in a “human-readable” form. * improve pretty-print of nodes and trees * newlines at end-of-file, return graph in render(), assert_valid() fix * split run_algo_search.py into several simpler scripts * add joint_train option to get_prob * minor changes to soln_db and embedding script * Arguments: -> Args: * fix replay, part 1 * fix behavior when using unpickled algos * re-add retrieve_weights * make training scripts more consistent * lint * lint * lint + remove rendering some rendering functionality from trex env as it’s also elsewhere * get rid of warnings * refactor functionality for getting final q-function and losses. revive code for removing useless terms & tests for simplification. * fix vecenv closing * finish removing algo folder (most useful functionality has been moved out of it) * control verbosity of trex * fix tests * rename spec => choice_spec, some comments, asserts, debug prints * fix some tests	07 September 2018, 23:34:59 UTC
35e95ee	Peter Zhokhov	06 September 2018, 19:00:19 UTC	fix python 3.5 string format compatibility	06 September 2018, 19:00:19 UTC
ad219e2	Isaac Lascasas	06 September 2018, 17:21:50 UTC	VecNormalize: set env. returns to zero on resets. (#556) * VecNormalize: set env. returns to zero on resets. * VecNormalize: returns reset in step_wait after ret_rms.update.	06 September 2018, 17:21:50 UTC
be9118b	Peter Zhokhov	06 September 2018, 17:17:55 UTC	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "f2a9b8f2" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "cc4215ef" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	06 September 2018, 17:18:13 UTC
02a5e7a	pzhokhov	06 September 2018, 17:17:21 UTC	fixes to readme and baselines/run.py (#80) * fixes to readme and baselines/run.py * polish installation section of baselines README * polish installation section of baselines README	06 September 2018, 17:18:13 UTC
87ac8bc	pzhokhov	05 September 2018, 21:03:13 UTC	install roboschool in install.py (#55) * putting instructions from README.md into a script * install roboschool as a part of setup.py * install roboschool from install.py * export pkg_config_path * remove compilation step from roboschool/setup.py * removed roboschool install from games install due to extra compilation step * removed unused import from roboschool/setup.py	06 September 2018, 17:18:13 UTC
cc4215e	Tom	06 September 2018, 17:16:06 UTC	refactor common.models via registering reflection (#565)	06 September 2018, 17:16:06 UTC
1e9051e	Clayton Thorrez	05 September 2018, 22:12:01 UTC	fixed warning (#464)	05 September 2018, 22:12:01 UTC
43ed769	uronce-cc	05 September 2018, 22:06:29 UTC	Fix mean reward per episode after training Pong. (#562) * Fix mean reward per episode after training Pong. * Fix typo.	05 September 2018, 22:06:29 UTC
7f08c67	Peter Zhokhov	04 September 2018, 17:23:29 UTC	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "39f8be8f" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "0a40206c" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	04 September 2018, 17:23:40 UTC
b3f966a	pzhokhov	04 September 2018, 17:22:32 UTC	use env.render in dummy_vec_env.render when num_envs == 1 (#74) * use env.render in dummy_vec_env.render when num_envs == 1 * use shorter super() syntax per Alex's suggestion	04 September 2018, 17:23:40 UTC
51cefc9	pzhokhov	30 August 2018, 22:32:55 UTC	make load_variables compatible with old list format (#71) * make load_variables compatible with old list format * cosmetic fixes	04 September 2018, 17:23:39 UTC
7bccb29	Christopher Hesse	30 August 2018, 22:04:40 UTC	baselines: default logger similar to configure() logger, rcall: don't call logger.configure() for new rl_algs * error if logger looks wrong * check version of logger, call logger.configure() on import * remove changes entry * add version to rl-algs * fix typo * add comment * switch version to string * set logger env variable	04 September 2018, 17:23:39 UTC
0a40206	uronce-cc	31 August 2018, 16:02:18 UTC	ncpu needs to be an integer. (#558)	31 August 2018, 16:02:18 UTC
1937826	Alfredo Canziani	31 August 2018, 00:21:25 UTC	Fix alien syntax and apply PEP 8 style (#554)	31 August 2018, 00:21:25 UTC
b29c802	pzhokhov	30 August 2018, 20:40:40 UTC	remove saving model as a pickle file in ppo2 (tries to pull environment in; bad idea - may need to use constructor argument pickling or somesuch if at all necessary) (#69)	30 August 2018, 20:41:38 UTC
4ec308a	Peter Zhokhov	30 August 2018, 17:27:18 UTC	fixed syntax	30 August 2018, 20:41:38 UTC
3bbf3f3	Peter Zhokhov	30 August 2018, 16:40:42 UTC	allow_early_resets=True in create_vec_env	30 August 2018, 20:41:38 UTC
e5de29a	Joshua Meier	29 August 2018, 22:25:47 UTC	instructions for tensorboard (#61)	30 August 2018, 20:41:37 UTC
2507d33	Joshua Meier	29 August 2018, 22:17:43 UTC	Tensorboard util (#60) * separate_validation_set was not imported * launching tensorboard automatically	30 August 2018, 20:41:37 UTC
bdd4d38	Damien Lancry	29 August 2018, 00:48:56 UTC	Fix result_plotters in vectorized mujoco environments (#533) * I investigated a bit about running a training in a vectorized monitored mujoco env and found out that the 0.monitor.csv file could not be plotted using baselines.results_plotter.py functions. Moreover the seed is the same in every parallel environments due to the particular behaviour of lambda. this fixes both issues without breaking the function in other files (baselines.acktr.run_mujoco still works) * unifies make_atari_env and make_mujoco_env * redefine make_mujoco_env because of run_mujoco in acktr not compatible with DummyVecEnv and SubprocVecEnv * fix if else * Update run.py	29 August 2018, 00:48:56 UTC
0961f5d	Peter Zhokhov	27 August 2018, 23:39:51 UTC	git subrepo pull (merge) baselines subrepo: subdir: "baselines" merged: "95a81e86" upstream: origin: "git@github.com:openai/baselines.git" branch: "master" commit: "c6c0f45c" git-subrepo: version: "0.4.0" origin: "git@github.com:ingydotnet/git-subrepo.git" commit: "74339e8"	27 August 2018, 23:40:14 UTC
337d913	Christopher Hesse	27 August 2018, 19:48:05 UTC	remove reset_task from subproc vec env (#45)	27 August 2018, 23:40:14 UTC
34af61a	Karl Cobbe	27 August 2018, 03:54:38 UTC	baselines: fix dummy vec env render mode (#42)	27 August 2018, 23:40:14 UTC
1ea5ec6	Christopher Hesse	24 August 2018, 22:44:56 UTC	export SimpleEnv and assert_envs_equal, fix minor bug in action space (#46)	27 August 2018, 23:40:14 UTC
2fc7a1c	pzhokhov	23 August 2018, 20:20:01 UTC	Trigger benchmarks from buildkite (#40) * rig buildkite pipeline to run benchmarks when commit ends with RUN BENCHMARKS * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file * fix the buildkite pipeline file - merge test and benchmark steps * fix the buildkite pipeline file - merge test and benchmark steps * fix buildkite pipeline file * fix buildkite pipeline file * dry RUN BENCHMARKS * dry RUN BENCHMARKS * dry not run BENCHMARKS * not run benchmarks * not running benchmarks * no running benchmarks * no running benchmarks * still not running benchmarks * dummy commit to RUN BENCHMARKS * trigger benchmarks from buildkite RUN BENCHMARKS * specifying RCALL_KUBE_CLUSTER RUN BENCHMARKS * remove rl-algs/run-benchmarks-new.py (moved to ci), merged baselines/common/console_util and baselines/common/util.py * added missing imports in console_util * clone subrepo over https	27 August 2018, 23:40:14 UTC
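
Note on revision bdd4d38: its message says the seed was identical in every parallel environment "due to the particular behaviour of lambda". The sketch below assumes this refers to Python's late-binding closures, which the message does not state explicitly. The function names are hypothetical, and the lambdas return a seed instead of building an environment, so this only illustrates the language behaviour and is not the code from the repository.

```python
def make_env_fns_late(num_env, base_seed):
    """Build one factory per worker; every lambda shares the loop variable."""
    fns = []
    for rank in range(num_env):
        # `rank` is looked up when the lambda is called, not when it is created.
        fns.append(lambda: base_seed + rank)
    return fns


def make_env_fns_bound(num_env, base_seed):
    """Build one factory per worker; each lambda gets its own `rank`."""
    def make_fn(rank):
        # `rank` is a parameter here, so each closure captures a distinct value.
        return lambda: base_seed + rank
    return [make_fn(rank) for rank in range(num_env)]


# All factories are called only after the loop has finished.
late_seeds = [fn() for fn in make_env_fns_late(3, 100)]
bound_seeds = [fn() for fn in make_env_fns_bound(3, 100)]
```

With the first variant every worker ends up with the seed of the last rank; wrapping the lambda in a helper function (or binding `rank` as a default argument) gives each worker a distinct seed.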

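Note on revision de36116: its message says the TensorFlow version check was updated to parse versions like 1.2.3rc4 where previously only 1.2.3-rc4 was accepted. A minimal sketch of a pattern with an optional hyphen is shown below. The pattern and the `parse_version` helper are hypothetical and written for illustration; they are not taken from the repository.

```python
import re

# Hypothetical pattern: three numeric components, then an optional
# release-candidate suffix written either as "rc4" or "-rc4".
VERSION_RE = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:-?rc(\d+))?$")


def parse_version(version):
    """Return (major, minor, patch, rc); rc is None for a final release."""
    match = VERSION_RE.match(version)
    if match is None:
        raise ValueError("unrecognized version string: %r" % version)
    major, minor, patch, rc = match.groups()
    rc_number = int(rc) if rc is not None else None
    return (int(major), int(minor), int(patch), rc_number)
```

Making the hyphen optional with `-?` lets both spellings of the release-candidate suffix resolve to the same tuple, while plain releases such as 1.2.3 still parse with no suffix.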