https://github.com/rlworkgroup/garage

Revision Author Date Message Commit Date
22dd7b0 pylint 26 July 2018, 22:21:04 UTC
a3c25cc modifications 18 July 2018, 21:59:27 UTC
e8fa358 pre-commit 12 July 2018, 22:20:20 UTC
ab2cc92 Remove /progress from Tensorboard output paths (#179) There is an argument in scripts/run_experiment that sets log_dir/progress as the tensorboard output dir. Deleting it and setting the tensorboard dir to log_dir removes `/progress` from the end of every run's output path. Also, mkdir_p in TensorBoardOutput is useless and creates an empty dir on every run when record_tensor is not called. Delete it. 11 July 2018, 21:05:26 UTC
975e02a Fix baselines installation (#176) Fix baselines installation in environments.yml. Using `pip install baselines` installs the wrong version of baselines. Change it to install using git. Add mpi dependency in setup_linux.sh and setup_osx.sh. 11 July 2018, 20:00:16 UTC
9615cdf Update pre-commit instructions in CONTRIBUTING 11 July 2018, 19:01:27 UTC
9071c11 Centralize examples in garage (#167) TF and Theano examples are located in examples/ in separate directories. 10 July 2018, 23:54:03 UTC
649037f Sawyer reacher mujoco (#130) Add reacher environment to mujoco. Add task space control to the pick-and-place environment. Fix a collision detection bug for the sawyer model. 10 July 2018, 21:48:47 UTC
3be3b06 Fix spec() implementation (#164) spec(env) now returns garage.Spaces, and the algorithms using env_spec are updated accordingly to treat them as garage.Spaces. 10 July 2018, 17:38:32 UTC
32a36f6 Mock out dm_control in check_imports.py (#160) Some import tests have been failing because dm_control is having trouble finding glew libraries in TravisCI (even though we install them). The best solution to this problem is to control our CI environment more closely using Docker (See #159). In the meantime, I am mocking dm_control imports out of the check_imports.py script, so that dm_control can't break the build. 09 July 2018, 21:35:21 UTC
7103a0a Use tf.variable_scope() in FirstOrderOptimizer (#154) TF optimizers add gradient annotation variables (e.g. foo/Adam:0) to the variable scope of the TF ops associated with the gradient. This creates a problem for Parameterized classes, because constructing the Parameterized class in a new process will *not* construct the optimizer as in the main process, meaning that these annotation parameters will be missing. This causes the class to fail to serialize across processes. Enclosing FirstOrderOptimizer's TF op creation in tf.variable_scope() quarantines the new annotation parameters in a new variable scope, away from subgraphs of Parameterized primitives. 09 July 2018, 18:28:40 UTC
d6d111b allow_multiline_lambdas in YAPF (#157) Otherwise YAPF will fail to wrap lines with lambdas, leading to conflicts with flake8. Despite not being the default, this setting does not appear to conflict with PEP8. 09 July 2018, 17:46:03 UTC
c2ee748 Update .gitignore (#155) Updates .gitignore to catch Sublime hidden files more generally 08 July 2018, 00:21:51 UTC
1174ccc Run some CI checks using pre-commit (#149) Refactored pre-commit to also include yapf and pylint checks. 06 July 2018, 22:12:01 UTC
560a18d Tune the sawyer mujoco envs to match the real setup (#148) - modify sawyer_robot.xml to match real robot - add bin's xmls 06 July 2018, 20:40:52 UTC
73e28de Reraise exceptions caught in run_experiment.py (#153) We catch BaseException in run_experiment so that we can cleanly terminate all of the worker processes when an interrupt occurs. If we don't re-raise the exception after cleaning up, all exceptions which end the program are silently suppressed. This change re-raises the caught exception after clean-up so that it propagates to the user. 06 July 2018, 20:22:58 UTC
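The clean-up-then-re-raise pattern described in this commit can be sketched as follows. This is a minimal illustration and not garage's actual run_experiment code: `run_with_cleanup`, `task`, and `pool` are hypothetical names, and `pool` is assumed to be any object with a `terminate()` method (for example a `multiprocessing.Pool`).

```python
def run_with_cleanup(task, pool=None):
    """Run `task`; on any failure terminate `pool`, then re-raise.

    Hypothetical helper for illustration only.
    """
    try:
        return task()
    except BaseException:
        # BaseException also covers KeyboardInterrupt and SystemExit,
        # which a plain `except Exception` would miss.
        if pool is not None:  # the pool may never have been created
            pool.terminate()
        # A bare `raise` re-raises the caught exception with its
        # original traceback, so the error still reaches the user.
        raise
```

Without the final `raise`, the function would return `None` after cleaning up and the caller would never learn that the run failed.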
43153f3 Override plot parameter of algorithm constructor (#119) When running a training session, there are two places to enable the plotter: the algorithm constructor or the function run_experiment. However, when using run_experiment, if its plot parameter is false, all the algorithms that run under run_experiment have to keep their Plotters disabled as well. A static variable was introduced in the Plotter class (both in the Theano and TensorFlow branch) to disable the Plotter from the run_experiment function. Also, a cleanup of the plotter in the Theano branch was performed, and some legacy PEP8 style errors were fixed. 06 July 2018, 19:30:50 UTC
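A class-level switch of the kind this commit describes might look roughly like the sketch below. The attribute and method names (`enable`, `disable`, `is_active`) and the simplified `run_experiment` signature are assumptions made for illustration; the commit message does not give the real names.

```python
class Plotter:
    """Hypothetical stand-in for a plotter class."""

    # Class-level (static) switch shared by every Plotter instance.
    enable = True

    def __init__(self, plot=True):
        # Per-algorithm request made through the constructor.
        self._plot = plot

    @classmethod
    def disable(cls):
        """Turn plotting off for all instances at once."""
        cls.enable = False

    def is_active(self):
        # Plot only if both the global switch and the local flag allow it.
        return Plotter.enable and self._plot


def run_experiment(plotters, plot=False):
    """Hypothetical launcher: plot=False overrides every plotter."""
    if not plot:
        Plotter.disable()
    return [p.is_active() for p in plotters]
```

The design choice is that the launcher does not need a reference to each algorithm's plotter; flipping one class attribute is enough to override all of them.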
8958515 Move spaces into respective directories (#150) Move garage.spaces.{theano, tf} to garage.{theano, tf}.spaces. 06 July 2018, 18:46:55 UTC
9340ee6 Missing changes to cleanly exit worker processes The worker processes pool is only created when n_parallel is bigger than 1, so n_parallel=1 was creating null exceptions when terminate was called. This is solved by adding a guard to check if the pool was instantiated. Another fix that is required is to rename back the member g in singleton_pool to G. This was done in a previous change to enforce PEP8 style of having non-capital variables/parameters, but the member has to keep the same name as defined in singleton_pool. 06 July 2018, 18:03:10 UTC
f13bb1a Move garage.tf.replay_buffer to garage.replay_buffer (#151) Replay_buffer is a fixed size memory to store experience transitions. It has no tf dependencies. Move it to garage.replay_buffer. Replace SimpleReplayPool in garage.algos.ddpg with garage.replay_buffer. 06 July 2018, 16:12:16 UTC
dcdc6b4 Cleanly terminate worker processes on an interrupt When calling function run_experiment and the parameter n_parallel is bigger than zero, worker processes are created to do parallel sampling during the training. However, when there's a keyboard interrupt, there are certain occasions when processes are not completely killed after the execution has finished. By checking the PID of these processes using the command "ps -fu | grep run_experiment", we noticed they had the status code "Sl", where S means interruptible sleep, so the processes are waiting for a signal to wake up and continue their execution. When executing the command "kill -SIGINT <pid>" with the PID of the sleeping process, we got the Python traceback and found that the processes were sleeping while trying to acquire the lock of the inqueue inside the pool of Python multiprocessing. By default, when a keyboard interrupt occurs all processes get notified, but the notifications are not sent in any order. This can cause a problem when a process is interrupted and finishes without releasing the lock that other processes are waiting for, as happens in this case. To avoid this problem, the parent of the worker processes catches BaseException (to cover cases other than KeyboardInterrupt) and makes sure to terminate the pool of worker processes to avoid deadlocks. Also, the start method of joblib was configured as forkserver. As indicated in joblib documentation, the JOBLIB_START_METHOD environment variable has to be set as "forkserver". In garage, this is now done at the beginning of the run_experiment function, so the child process that runs joblib has the start method configured as forkserver. An assert is added to the child process that runs the experiment to make sure the variable is correctly set. Other changes were required to enforce the PEP8 style in legacy code. 06 July 2018, 01:07:06 UTC
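The environment-variable handling mentioned at the end of this commit message could be sketched as below. Only the variable name JOBLIB_START_METHOD and the value "forkserver" come from the commit message; the two helper functions are hypothetical and do not reproduce garage's code.

```python
import os


def configure_joblib_start_method(method="forkserver"):
    """Set the variable before any child process is launched.

    Child processes inherit the parent's environment, so this must
    run at the very beginning of the launcher.
    """
    os.environ["JOBLIB_START_METHOD"] = method


def check_joblib_start_method(expected="forkserver"):
    """Fail fast in the child if the variable was not propagated."""
    actual = os.environ.get("JOBLIB_START_METHOD")
    assert actual == expected, (
        "JOBLIB_START_METHOD is %r, expected %r" % (actual, expected))
```

Calling `configure_joblib_start_method()` in the parent and `check_joblib_start_method()` in the child mirrors the set-then-assert arrangement the message describes.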
1058f9a Fix render() in mujoco envs (#142) - There were missing overrides imports in mujoco/sawyer. Add them. - NormalizedEnv didn't implement render(), which caused NotImplementedError. Add render() in NormalizedEnv. Same with GridWorldEnv. - Fix multiple errors in EmbeddedViewer. There were wrong param assignments and missing interaction settings, such as setting the default camera. - Fix multiple errors in GatherEnv and GatherViewer. - Reformat with PEP8. - Remove function calls that don't exist in MjViewer. 05 July 2018, 23:37:57 UTC
96733d8 Add top-level base Space class for garage (#140) * Both TensorFlow and Theano spaces inherit from and build off of garage.Spaces 05 July 2018, 22:32:26 UTC
87f538d Fix key space in record_histogram (#141) Enclosing the variable scope outside the record_histogram function in TensorBoardOutput messed up the key space in self._histogram_ds. Fix it by enclosing the variable scope only outside the variable. 05 July 2018, 20:21:49 UTC
c4f9192 Ignore PEP8 rule W503 for flake8 (#126) W503 enforces the break after the operator, which is acceptable by PEP8, but it's preferred to do it before the operator. Since YAPF enforces the preferred style, this rule is ignored to avoid conflicts between both lint tools. 04 July 2018, 01:38:13 UTC
28615c5 Set up mujoco in setup_linux.sh and setup_osx.sh The script setup_mujoco.sh is called now from the scripts to install garage in Linux and OS X, since it's required by the packages installed by conda. Also, the three scripts were formatted with the style defined by Google. 04 July 2018, 01:12:19 UTC
c1c0533 Fix box2d render issue (#136) * Removed all kwargs from render() as box2d_env inherits from gym.Env * Added `mode` parameter to conform to gym.Env 03 July 2018, 23:56:34 UTC
df7add1 Add DDPG to TensorFlow (#66) * Add OU exploration strategy to TensorFlow The Ornstein-Uhlenbeck exploration strategy comes from the Ornstein-Uhlenbeck process, which is a stationary process that describes the velocity of a Brownian particle under the influence of friction. The OU strategy is often used in the DDPG algorithm because in continuous control tasks it is better to have temporally correlated exploration to get smoother transitions, and the OU process is relatively smooth in time. * Add replay buffer to TensorFlow Replay buffer is an important technique in reinforcement learning. It stores transitions in a memory buffer of fixed size. When the buffer is full, the oldest memory will be discarded. At each step, a batch of memories will be sampled from the buffer to update the agent's parameters. In short, the replay buffer breaks temporal correlations and thus benefits RL algorithms. DDPG uses the replay buffer to update parameters in the policy network and the q function network. This commit implements a ReplayBuffer class for TensorFlow. In each step, it adds a transition of (observation, reward, terminal, action, next_observation) into the memory buffer, and it returns a dict containing a batch of transitions when the agent asks for sampled transitions. * Implement actor-critic for DDPG DDPG uses the actor-critic method to optimize the policy and reward prediction. The actor-critic method is a kind of TD method. It consists of an actor, which is a policy network used for action selection, and a critic, which is a q function network used for action value estimation. The critic generates the TD error that drives the optimization of both actor and critic. In DDPG, the actor accepts an observation as input, uses an MLP to fit the mapping from observation to action, then outputs the predicted action. The critic accepts an observation and an action as input and outputs the q value. Here, the policy network is named ContinuousMLPPolicy. The q function network is named ContinuousMLPQFunction. 
* Add DDPG to TensorFlow Deep Deterministic Policy Gradients is an off-policy reinforcement learning algorithm for continuous control tasks. It uses the replay buffer and two target networks, corresponding to the policy and q function networks, to stabilize the training process. This DDPG algorithm uses the primitives in garage/tf to set up the algorithm. * Add launcher file for DDPG to TensorFlow * Add miscellaneous change to DDPG - Fix bugs in DDPG. The last DDPG incorrectly calculates action_loss. - Set output_nonlinearity param to tf.nn.tanh in ddpg_pendulum.py. Without this, the training process will be negatively influenced. - Delete mistakenly pushed files. * Reorder imports and add docstring - Add missed args in docstring - Reorder imports in alphabetical order * Add tf async plotter to DDPG - Add tf async plotter to DDPG - Formatting other files * Improve training process of DDPG Add transitions to DDPG when terminal is True. This proves to improve the average return significantly. * Reformatting ddpg.py to conform with flake8 - Add multiple docstrings to ddpg.py. - Change get_target_ops to a module-level function since it is a static method rather than a class method. * Break lines in q_functions/__init__.py Break lines in garage/tf/q_functions/__init__.py because a line is too long. * Add multiple docstrings to multiple files Several files have missing docstrings. Add them to pass flake8 checks. * Add miscellaneous changes - Fix errors in docstring. - Reorder imports. - Add required third-party lib baselines into environment.yml. * Add benchmark test into tests Add a benchmark script which can run regression tests between baselines and garage algorithms. For different algorithms, imports, params, method run_garage() and run_baselines() need to change. * Add requirement lib for mpi4py - Mpi4py needs libopenmpi-dev to be pre-installed in Linux. 
* Change tf_benchmark_ddpg plotter - Change plotter in tf_benchmark_ddpg.py from one plot per trial to one plot per task, one curve per trial. * Reformat code with YAPF * Add another dep of mpi4py Add openmpi-bin to deps so that the check_imports passes. * Add title into benchmark plotter * Fix syntax error in q_functions/__init__.py * Add miscellaneous change - Use flat_dim to get dimension of space. - Change error comment in the class docstring of DDPG. - Use tf.set_random_seed to get same sequence of values. - Delete trainable=True in the build_type(). * Fix D413 in multiple files There should be a blank line after last section of docstrings. * Add miscellaneous change - Use bounds in garage.envs.util to get action space bound. - Set built_net to private. - Namespace log in DDPG to a relevant component. - Move benchmark.png to the parent directory, more convenient. - Set default params of DDPG to the ones that give the best performance. - Refactor critic_optimizer name in DDPG. 02 July 2018, 23:32:27 UTC
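The fixed-size replay buffer described in the commit above can be illustrated with a small standard-library sketch. This is not garage's ReplayBuffer implementation: the class layout, method names, and the `seed` parameter are assumptions; only the transition fields and the "discard the oldest, sample a batch, return a dict" behaviour come from the commit message.

```python
import random
from collections import deque

# Field order as listed in the commit message.
FIELDS = ("observation", "reward", "terminal", "action", "next_observation")


class ReplayBuffer:
    """Illustrative fixed-size buffer of transitions."""

    def __init__(self, max_size, seed=None):
        # A deque with maxlen silently drops the oldest entry when full.
        self._memory = deque(maxlen=max_size)
        self._rng = random.Random(seed)

    def add_transition(self, observation, reward, terminal, action,
                       next_observation):
        self._memory.append(
            (observation, reward, terminal, action, next_observation))

    def sample(self, batch_size):
        """Return a dict mapping each field name to a list of values."""
        # Uniform sampling without replacement breaks the temporal
        # correlation between consecutive transitions.
        batch = self._rng.sample(list(self._memory), batch_size)
        return {name: [t[i] for t in batch] for i, name in enumerate(FIELDS)}

    def __len__(self):
        return len(self._memory)
```

A practical implementation would usually store arrays rather than Python tuples so that sampled batches can be fed to the networks directly.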
5802207 pre_commit environment fix (#137) 02 July 2018, 23:07:33 UTC
a16706f Fixed tf EnvSpec assert issue (#135) assert() statement now checks for correct class 02 July 2018, 22:45:53 UTC
97a9bbf Add support for alphabetizing logger tabular (#124) * Add support for alphabetizing logger tabular Group each tabular data together by the alphabetical order of their keys. * Update docstrings Update command results in tabulate's docstrings. Format with flake8. 02 July 2018, 18:14:17 UTC
09c9982 Add pre-commit to garage This PR adds pre-commit support to garage, a multi-language package manager for pre-commit hooks. pre-commit helps you adhere to the git commit message guidelines, such as reminding you to keep your subject line to 50 characters and to wrap your body to 72 characters. 30 June 2018, 03:11:51 UTC
5378034 Add name_scope in TensorBoardOutput (#118) * Add name_scope in TensorBoardOutput The record_histogram and get_histogram methods in TensorBoardOutput weren't grouped together, so all the variables created in them polluted the computation graph. By adding a name_scope in the constructor, all the nodes will be grouped under node TensorBoardOutput in the graph. Also, update examples/example_tensorboard_logger.py. * Fix bugs in TensorBoardOutput name scope The last commit has a bug. Using self._name_scope.__enter__() creates a scope block. Consequently, if there are variables created after calling get_histogram_by_type(), the node will be shown as a subnode of TensorBoardOutput, which does not make sense. Instead, set a global name scope for the TensorBoardOutput class, and use enclosing_scope to enclose all the subnodes under this global name. This fixes the above bug. 29 June 2018, 21:07:38 UTC
cc5d462 Add log_diagnostics to normalized_env 29 June 2018, 17:53:44 UTC
dfbe8ae Fixed GLEW import issue In order to correctly import glfw, we must first import mujoco_py. 29 June 2018, 17:29:13 UTC
8eedf58 Add gazebo reacher env (#99) Add gazebo reacher env 27 June 2018, 22:38:53 UTC
797ef6a False positive of flake8 checkers The script check_flake8 was returning a successful status even if flake8 was failing. The exit status of flake8 is obtained and returned in the main script so Travis CI detects the error as well. 26 June 2018, 21:57:43 UTC
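The fix described here amounts to returning the checker's exit status instead of discarding it. The real check_flake8 script is a shell script whose contents are not shown in this log, so the following Python sketch only illustrates the idea; `run_check` is a hypothetical name.

```python
import subprocess
import sys


def run_check(command):
    """Run a lint command and hand back its exit status unchanged.

    `command` is a list such as [sys.executable, "-m", "flake8", "path"].
    """
    completed = subprocess.run(command)
    # Returning the child's status (rather than always 0) is what lets
    # a CI service such as Travis CI see that the check failed.
    return completed.returncode
```

A wrapper script would then end with `sys.exit(run_check(...))` so that a non-zero status from the lint tool becomes the status of the whole script.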
a6f8846 Support different rollout functions TF plotter (#102) 26 June 2018, 20:12:50 UTC
755592c Support multiline imports in check_imports.py (#105) 26 June 2018, 19:56:45 UTC
3fad0ef Fix dm_control installation for linux (#100) * fix dm_control installation for linux (Issue #27) * fix a small bug in normalized env 25 June 2018, 23:02:13 UTC
42f47a3 Fix contrib.ros 'from garage.config_personal' should be 'from garage.config' (#96) - Refer to #95 22 June 2018, 07:14:34 UTC
d2821d6 Update normalized_env (#91) Update normalized_env because all environments in garage now are converted to conform with gym.Env. 18 June 2018, 18:47:36 UTC
f48614c Update README.md 18 June 2018, 16:49:39 UTC
5b61f8c Fix tf distributions (#97) Some minor fixes for issues I found for tf distributions. They should inherit garage.tf.distributions.Distribution because the dist_info_keys method is implemented there. Also an import issue is fixed. 18 June 2018, 07:41:46 UTC
924cc74 Remove quotes in filename variables in pep8 checks (#89) Both the scripts for pylint and flake8 were doing the correct filename substitution but with added quotes that were passed to the corresponding command for pylint and flake8, so no file was checked at all. For example, the bad command produced was: `pylint 'garage/algos/cma_es_lib.py garage/envs/dm_control_env.py'` Instead of: `pylint garage/algos/cma_es_lib.py garage/envs/dm_control_env.py` This commit fixes the problem. Also the grep command is used to filter the python files, because flake8 and pylint may check on all the files passed to them. 16 June 2018, 18:35:26 UTC
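Why the quotes broke the check can be seen by splitting both command lines the way a POSIX shell would. The sketch below uses the standard-library `shlex` module; `count_file_arguments` is a hypothetical helper, and the two command strings are the ones quoted in the commit message.

```python
import shlex


def count_file_arguments(command_line):
    """Count the arguments that follow the program name."""
    return len(shlex.split(command_line)) - 1


BAD = "pylint 'garage/algos/cma_es_lib.py garage/envs/dm_control_env.py'"
GOOD = "pylint garage/algos/cma_es_lib.py garage/envs/dm_control_env.py"

# The quoted form reaches pylint as ONE argument containing a space,
# which names no existing file; the unquoted form yields two paths.
bad_count = count_file_arguments(BAD)
good_count = count_file_arguments(GOOD)
```

Here `bad_count` is 1 and `good_count` is 2, which matches the behaviour the commit describes: the quoted variant never pointed at a real file.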
c005820 Renamed run_experiment_lite to run_experiment 15 June 2018, 23:16:28 UTC
8f8eb3b Remove blank config_personal.py (#94) 15 June 2018, 20:20:21 UTC
70c77a3 Add mujoco sawyer envs (#76) * Add mujoco sawyer envs * fix import order * move from rllab to garage * more refactor * Use short import * refactor goal_distance method * yapf 15 June 2018, 18:36:30 UTC
c54169a Fix tensorboard creates another TF device issue (#88) * Fix tensorboard creates another TF device issue Class TensorBoardOutput used to create a tf.Session in its constructor, but the session seems to launch a TF device. Since the session is only used to dump histograms, and a default session usually already exists when histograms are dumped into tensorboard, I removed it from the constructor and moved it into the method _dump_histogram. There was also a circular import in TensorBoardOutput; deleted it. * Format with PEP8 Missing lines in tensorboard_output.py 15 June 2018, 18:18:17 UTC
9bcb322 Fixed PointEnv for test_env * All tests using PointEnv now pass in test_env 15 June 2018, 17:02:10 UTC
17347a1 Remove all usage of garage's internal tf.space * All algorithms and policies now correctly depend on gym.spaces 15 June 2018, 16:47:57 UTC
cf7505c Sawyer runtime (#67) - Add real sawyer support - refer to #20 15 June 2018, 02:08:10 UTC
76f17e5 Check docstrings only on added files (#83) The docstring checks are only applied to added files to avoid errors when refactoring legacy code. Also, other checks are only applied to modified files to avoid errors on copied, deleted or renamed files. Other miscellaneous changes include: - The name rllab was replaced with garage in option --application-import-names. - Dots were missing in grep expression at check_pylint. 15 June 2018, 01:39:13 UTC
7b1cc1b Fix local package names in flake8 config (#77) Also removes mistakenly included old sandbox files. 14 June 2018, 22:38:28 UTC
16016ec Add PEP8 checks to the CI Two lint tools have been added to Travis CI to enforce the PEP8 format in the new commits pushed to the repository. The tools are pylint and flake8, since the latter does not cover all the rules found in PEP8. Only the error codes for PEP8 have been enabled for the lint tools, and they're defined in the file setup.cfg at the root of the repository. Developers can run flake8 and it will automatically fetch the configuration from setup.cfg, but it's recommended to only do it on the set of changed files, since the project currently has many format errors introduced in the legacy code. Run the following command to check errors with flake8: $ git diff origin/master | flake8 --diff Pylint cannot analyze differences as flake8, so pass only the names of the files to verify and the configuration file like so: $ git diff origin/master --name-only | grep "*.py" | xargs pylint --rcfile=setup.cfg Assign an alias for convenience and run the commands above before pushing the code for review to avoid spending time waiting for Travis to spot the errors. If there's a conflict between pylint or flake8 with yapf, disable yapf only for the lines of code that present this conflict, as exemplified here: https://github.com/google/yapf#why-does-yapf-destroy-my-awesome-formatting 14 June 2018, 21:17:04 UTC
3c58b86 Rename rllab to garage (#62) 14 June 2018, 01:44:13 UTC
f1b2716 Replaced rllab.envs.Env with gym.Env (#58) rllab.envs.Env has become obsolete, as the community has embraced the gym.Env interface. This change replaces rllab.envs.Env to make gym.Env the *only* Environment abstraction in garage. All garage components can communicate with any instance of gym.Env, and all garage environments implement gym.Env (though they may be backed by anything, not just gym). 13 June 2018, 21:18:14 UTC
377d9f8 Update CONTRIBUTING.md 13 June 2018, 19:43:05 UTC
ea26ede Update CONTRIBUTING.md 13 June 2018, 16:42:53 UTC
64831a6 Fix check_imports CI script (#64) The check_imports CI script was not detecting all broken imports, and some snuck in. This change makes the check_imports script more strict, and fixes the problems detected by the stricter check. 13 June 2018, 03:14:36 UTC
9ff2f5b Added asynchronous plotting to TensorFlow (#56) * Converted synchronous to asynchronous plotting of TensorFlow. * Implemented a better naming scheme for variables and functions 13 June 2018, 01:52:13 UTC
6c29cc9 Remove remnants of sandbox.rocky.tf (#65) 13 June 2018, 01:38:31 UTC
e4b6b19 Fix check_imports CI script (#63) It was previously failing because imports "from __future__ ..." must come before all others. This change adds a feature to ignore arbitrary module names, and adds "__future__" as one of those modules. 13 June 2018, 00:43:38 UTC
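An import scan with an ignore list, as this commit describes, could be sketched with the standard-library `ast` module. The actual check_imports.py is not shown in this log and may work differently (for example by text matching), so the function name, the ignore-set variable, and the AST approach are all assumptions.

```python
import ast

# Hypothetical ignore list; "__future__" is the module named in the commit.
IGNORED_MODULES = frozenset({"__future__"})


def imported_modules(source, ignored=IGNORED_MODULES):
    """Return the sorted module names imported by `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports like `from . import x`.
            names = [node.module] if node.module else []
        else:
            continue
        for name in names:
            # Compare the top-level package against the ignore list.
            if name.split(".")[0] not in ignored:
                modules.add(name)
    return sorted(modules)
```

Because the source is parsed rather than matched line by line, parenthesised imports that span several lines are handled without extra work.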
bcef914 Update .gitignore (#61) 12 June 2018, 23:45:20 UTC
438f9df Dynamics randomization for MuJoCo (#51) Add support for dynamics randomization for mujoco environment. Includes a simple data structure consisting of a list of variation objects, a wrapped environment of mujoco to perform dynamics randomization. Each variation object is an instance of the Variation class that works as a container for each of the fields used to randomize a dynamic parameter within the simulation environment. The wrapper class of mujoco performs dynamics randomization on each reset(). The data structure and the wrapper class are tested in test_dynamics_rand.py. Refer to: #14 12 June 2018, 23:28:32 UTC
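The structure described above (a list of variation objects plus a wrapper that re-samples on every reset) can be sketched without MuJoCo. Everything below is hypothetical: the fields of `Variation`, the uniform sampling rule, and the `RandomizedEnv` wrapper are illustrative choices, not the API added by this commit.

```python
import random
from dataclasses import dataclass


@dataclass
class Variation:
    """Container for the fields needed to randomize one parameter."""
    name: str    # which dynamics parameter to change (hypothetical field)
    low: float   # lower bound of the sampling range
    high: float  # upper bound of the sampling range

    def sample(self, rng):
        return rng.uniform(self.low, self.high)


class RandomizedEnv:
    """Wrapper that re-samples the listed parameters on each reset()."""

    def __init__(self, params, variations, seed=None):
        self.params = dict(params)       # stands in for simulator state
        self._variations = list(variations)
        self._rng = random.Random(seed)

    def reset(self):
        for variation in self._variations:
            self.params[variation.name] = variation.sample(self._rng)
        return self.params
```

Parameters that have no matching variation keep their original values, so only the listed dynamics change between episodes.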
38f5f97 Moved sandbox.rocky.tf to rllab.tf (#54) Upgraded tf to first-class citizenship. This means that the sandbox.rocky.tf folder is now located at rllab.tf. 12 June 2018, 22:11:37 UTC
931822d Create CONTRIBUTING.md (#52) * Sets out guidelines and review process for project contributions * Documents git workflow and provides examples * Documents official style and code quality standards 12 June 2018, 16:54:29 UTC
f4fcefd Fix circular imports created by alphabetization (#59) This PR also adds an automated import test to prevent future breakages from circular imports. 12 June 2018, 16:35:04 UTC
fd9a073 Refactor rllab ros (#50) - refactor rllab ros - add worlds 11 June 2018, 18:28:00 UTC
8619820 Update README.md 11 June 2018, 17:03:58 UTC
9205fdd Update README.md 11 June 2018, 16:48:23 UTC
eb43cc8 Add EditorConfig (#49) Facilitates automated configuration of most editors for parts of the style guide. 11 June 2018, 06:03:18 UTC
d94934e Create CODEOWNERS This will ensure that all PRs get at least one review from a maintainer 11 June 2018, 04:47:16 UTC
d6419f2 Formatting tweaks to make the CI green 11 June 2018, 04:25:00 UTC
1cd44ce Add import order support to the CI Also turns off changed-files-only rule for YAPF formatting 11 June 2018, 04:21:11 UTC
a4b5a39 Group and alphabetize imports according to PEP8 flake8 --import-order-style=google --application-import-names=sandbox,rllab,examples,contrib --select=I100,I101,I201,I202 11 June 2018, 04:10:58 UTC
e92860c Update LICENSE 11 June 2018, 00:17:29 UTC
75349b3 Format with yapf and limit line length to 80 char yapf -irp --style=pep8 . Some hand-crafted corrections added 11 June 2018, 00:15:35 UTC
00ec862 Asynchronous plotting for Theano (#124) Added support for asynchronous plotting for Theano. The main caveat is that Linux machines need to use multiprocessing.Process and Mac OS X machines need to use threading.Thread. If Linux machines use Threads, the program slows to a crawl (likely as a result of Python's Global Interpreter Lock, GIL); conversely, if Mac machines use Process, glfw will throw a segmentation fault and fail to draw the window. 07 June 2018, 19:26:04 UTC
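The platform-dependent choice between a process and a thread described here can be sketched as follows. `select_worker_class` is a hypothetical helper; the commit message only states which class each platform needs, not how the selection is coded.

```python
import multiprocessing
import platform
import threading


def select_worker_class(system=None):
    """Pick the concurrency primitive for the plotting worker.

    `system` defaults to platform.system(), which returns "Darwin"
    on Mac OS X and "Linux" on Linux.
    """
    if system is None:
        system = platform.system()
    if system == "Darwin":
        # A separate process makes glfw segfault on Mac OS X.
        return threading.Thread
    # Threads slow training to a crawl on Linux, so use a process.
    return multiprocessing.Process
```

Both classes accept `target=` and `daemon=` keyword arguments and expose `start()`, so a caller can write `select_worker_class()(target=plot_loop, daemon=True).start()` with a hypothetical `plot_loop` function on either platform.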
5c5d3a2 Upgrade gym to v0.10.5 (#122) This is necessary to use mujoco_py>=1.5 05 June 2018, 17:15:02 UTC
c32e6cc Add missing convenience imports and fix circular dependency (#113) Adds missing convenience imports to __init__.py files, and moves some utilities from rllab/algos to rllab/sampler to fix a circular import dependency. 04 June 2018, 23:00:05 UTC
60c6867 Refactor sawyer robot interface (#115) - refactor sawyer robot interface and how it is used. 31 May 2018, 20:28:32 UTC
8a5ee1a Add task object manager interface (#114) - Users use this to manage every object in a task except for robots. 31 May 2018, 20:05:00 UTC
839bfe1 Merge tensorboard summary and tensorboard output (#103) * Merge tensorboard summary and tensorboard output Merge tensorboard_summary.Summary into tensorboard_output.TensorBoardOutput, which supports graph, scalar, tensor, and histogram logging to tensorboard. The private function in TensorBoardOutput starts with '_'. All other functions are public. What this commit does: 1. Optimize imports 2. Merge class 3. Solve the duplicate fields in custom scalars. (https://github.com/ryanjulian/rllab/pull/88#issuecomment-392245839) * Add deps in environment and alphabetize imports Add dependencies jsonmerge and protobuf into environment.yml. Alphabetize imports. 31 May 2018, 19:07:07 UTC
59bcc7f Implement specific env functions for specific task (#106) Make every task-specific env have its own implementation of sample_goal, get_observation, reward, _goal_distance. 31 May 2018, 18:13:42 UTC
e9468f6 Fixed glfw throwing segmentation fault, requires reverting back to synchronous plotting (#112) This PR enables the glfw window to be drawn when plot=True, but requires reverting the implementation of plotter back to synchronous plotting. This is because asynchronous plotting throws a segmentation fault within glfw. Furthermore, every 'import glfw' statement must be preceded with 'import mujoco_py' to ensure that glfw is being imported properly. 31 May 2018, 04:21:04 UTC
06b6de3 Move ros node init to launcher files (#104) I moved ros node init to launcher files, so that I can ensure that everything related to ros works after init. Before this change, ros node init happened in a class constructor, which is not good. 30 May 2018, 19:39:22 UTC
4315906 Add TensorBoard histogram support (#58) Allows users to log histograms to TensorBoard using the new record_histogram API. This PR also starts organizing TensorBoard outputs into their own class. 30 May 2018, 16:43:06 UTC
1cc419f Add robot interface (#101) Add an abstract robot interface so that people can use and add different robots more easily and in a unified way. 30 May 2018, 00:56:09 UTC
2e55295 Refactor rllab from rllab.mujoco_py to mujoco_py (v1.5.0) (#91) In this change, rllab is updated to use the latest version of MuJoCo (i.e. v1.5.0) by replacing all applications of rllab.mujoco_py in the code with mujoco_py, and subsequently remove the rllab.mujoco_py folder. This also updates the setup_mujoco.sh script to support installation using MuJoCo v1.5.0 in lieu of v1.3.0. Most changes are minor, such as replacing applications of self.model with self.sim due to MuJoCo transferring functionality from model to sim in the new version. Instead, sim is instantiated by passing a reference to model: self.sim = MjSim(self.model). However, some changes are major, most notably embedded_viewer.py and gather_env.py. Many functions within these files were using deprecated APIs that did not have a direct and easy complement in the new MuJoCo documentation. Some of these functions were renamed, others were moved around underneath new or pre-existing classes under a new guise, and others were most likely removed. These files are not as of now rigorously tested, but no major functionality has been broken thus far, and so can be used tentatively. 30 May 2018, 00:03:15 UTC
49e7505 Fix missing imports and other small errors (#99) 1. Add missing imports and optimize imports. 2. Delete duplicate parameter name. 3. Delete duplicate parameter initialization. 26 May 2018, 00:51:39 UTC
4d2417e Support for tensor logging to tensorboard (#88) Add a customized tensor scalar to tensorboard by using the custom_scalar plugin in tensorboard. Each line in the scalar corresponds to an element in the tensor. Wrap the tensorboard logging module into a new class `Summary` in file rllab/misc/tensor_summary.py. It supports both the simple value and tensor logging. It also saves the computation graph created by rllab. To record the tensor into tensorboard, use the `record_tensor` function in file rllab/misc/logger.py. Refer to: #39, #38 26 May 2018, 00:10:30 UTC
5c42053 Add variable scope to symbolic operations using TensorFlow (#72) Add variable scope to symbolic operations using TensorFlow A variable scope facilitates the reading of a TensorFlow graph by grouping tensor objects in hierarchies. In rllab, the hierarchies are defined with primitive objects and symbolic operations, where the latter is a member function of a primitive. In this change, the primitives are the algorithms, networks, distributions, optimizers, policies, Q-functions and regressors. An example of a primitive is the DiagonalGaussian class, which implements a probability distribution, and the symbolic function is kl_sym, which implements the Kullback–Leibler divergence. The idea of implementing the variable scope is that all the tensor operations in kl_sym are encapsulated within the hierarchy DiagonalGaussian/kl_sym. Each primitive and symbolic function have the parameter "name", which has a default value equal to the primitive or symbolic function name, but developers can set those parameters as they may find more convenient, so the previous example could be changed to be: distribution/divergence. A context class was added to the file tensor_utils.py to verify if the variable scope of the corresponding primitive is already set, in order to set it in case it's not, and remove it once the symbolic function is exiting. The only caveat with variable scopes is that they work based on the call stack and not with a class or file scope, so even if the same scope is used in two different call stacks, TensorFlow adds an index to the primitive or symbolic function names to make each scope unique. Therefore, not all symbolic operations performed in a primitive will appear under the same primitive scope, but they will be split in different instances of the primitive scope (e.g. DiagonalGaussian and DiagonalGaussian_1). 25 May 2018, 21:41:41 UTC
3ddb23d Upgrade Theano to 1.0.1 (#65) Upgrades Theano to 1.0.1 25 May 2018, 19:02:33 UTC
7586365 Remove convenience imports that generate circular dependencies (#87) When the convenience imports were introduced, they accidentally created a circular dependency between packages. To avoid future crashes, an analysis of the code imports was performed to find all the circular dependencies in rllab. We found no circular dependencies in the TensorFlow tree, or between packages in the TensorFlow and Theano trees. There were circular dependencies in the Theano tree. Circular dependencies in the Theano tree are the following: - The package misc with algos, baselines, core, and viskit. - The package algos with sampler Since misc shares so many circular dependencies, conflicting imports were removed from its file __init__.py. Regarding algos and sampler, the issue is generated by imports in sampler in its file __init__.py, so they were removed as well. Due to these removals, some imports had to be restored to their previous long package paths. 23 May 2018, 22:57:31 UTC
dd3c968 add gazebo sawyer (#49) Design the ros environment support for rllab. Add sawyer simulation support and gazebo environment support. Refer to: #49 23 May 2018, 01:44:49 UTC
d49f039 Fix error in Theano when using GPU and TF The configuration of the GPU device in Theano is set if USE_TF is false and USE_GPU is true. Since USE_TF is only set to True when using AWS EC2, it's now also set as a parameter in the run_experiment_list function. To avoid the conflict with Theano when using TF and GPU, set both use_tf and use_gpu to True when calling run_experiment_list. To enable and disable GPU with TensorFlow, an if statement was inserted to make the GPU invisible when use_gpu=False. Also, an environment variable for use_tf for non-local modes such as local_docker or EC2 was added if it's required to switch between Theano and TensorFlow. Finally, the corresponding documentation was added in experiments.rst and the script instrument.py was formatted with PEP8 using YAPF. 23 May 2018, 01:34:20 UTC
0075a6b Import classes and modules in __init__.py of each package (#66) Imports classes and modules in the __init__.py of each package to make package-level imports shorter. 23 May 2018, 01:11:38 UTC
bb2475f Support std_share_network in all GaussianMLP* classes (#69) Implement std_share_network architecture in the GaussianMLP* classes of TensorFlow. The std_share_network creates a single neural net with output length of 2 * action_dimension. The first half output params are the means params, and the second half params are the log_std params. As a result, the GaussianMLP* classes can support a new architecture. Refer to: #44 22 May 2018, 22:14:15 UTC
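The output layout described for std_share_network (one vector of length 2 * action_dimension, means first, log_std second) can be illustrated with plain Python lists. The function below is a hypothetical sketch of the splitting step only; it is not taken from the GaussianMLP* classes, which operate on TensorFlow tensors.

```python
import math


def split_shared_output(output, action_dim):
    """Split a shared-network output into means and log-stds.

    `output` is a flat sequence of length 2 * action_dim.
    """
    if len(output) != 2 * action_dim:
        raise ValueError(
            "expected %d values, got %d" % (2 * action_dim, len(output)))
    means = list(output[:action_dim])      # first half: mean params
    log_stds = list(output[action_dim:])   # second half: log_std params
    # Exponentiating recovers strictly positive standard deviations.
    stds = [math.exp(value) for value in log_stds]
    return means, log_stds, stds
```

Predicting the log of the standard deviation lets the network output any real number while the resulting standard deviation stays positive.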
92c41f3 Cleanup list of dependencies in conda environment (#56) (#60) It was verified that each of the dependencies removed were not imported or had any package dependants in rllab. The new environment file was tested with a fresh copy of the rllab project by running the setup_linux script, and with the training of the swimmer in Theano and cartpole in TensorFlow with plot=True. The channels of the removed packages were discarded as well. 16 May 2018, 06:27:59 UTC