https://github.com/casperkaae/parmesan
Revision ac48403bed22081758cc905c57145217739942c5 authored by wuaalb on 16 October 2015, 11:07:55 UTC, committed by wuaalb on 16 October 2015, 11:07:55 UTC
With the previous parameters the test ELBO would start going up after approx. 60 epochs. Decreased the learning rate and the number of hidden units in the deterministic layers of the encoder/decoder. Set `analytic_kl_term=True` by default, as it seems to improve results and is what the Kingma et al. paper does in its examples. Changed the non-linearity to `softplus` for the deterministic hidden layers. Also tried `tanh` and `very_leaky_rectify`, but `softplus` seemed to perform best.

Results with these settings:

```
*Epoch: 999 Time: 9.03 LR: 0.00030 LL Train: -90.331 LL test: -93.592
```
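For readers unfamiliar with what `analytic_kl_term=True` refers to, here is a minimal sketch of the idea, not Parmesan's actual implementation. It assumes a diagonal Gaussian posterior and a standard normal prior, and the function names `analytic_kl` and `sampled_kl` are hypothetical. The closed-form KL term is compared with a Monte Carlo estimate of the same quantity, which is noisier.

```python
import math
import random


def analytic_kl(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I)), summed over dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - lv - 1.0 for m, lv in zip(mu, log_var))


def sampled_kl(mu, log_var, n_samples=20000, seed=0):
    """Monte Carlo estimate of the same KL: mean of log q(z) - log p(z) for z ~ q."""
    rng = random.Random(seed)
    log_2pi = math.log(2.0 * math.pi)
    total = 0.0
    for _ in range(n_samples):
        for m, lv in zip(mu, log_var):
            eps = rng.gauss(0.0, 1.0)
            z = m + math.exp(0.5 * lv) * eps  # reparameterized sample from q
            log_q = -0.5 * (log_2pi + lv + eps * eps)  # log N(z; m, exp(lv))
            log_p = -0.5 * (log_2pi + z * z)           # log N(z; 0, 1)
            total += log_q - log_p
    return total / n_samples


if __name__ == "__main__":
    # Hypothetical posterior parameters for a 2-dimensional latent variable.
    mu, log_var = [0.5, -1.0], [-0.5, 0.3]
    print("analytic:", analytic_kl(mu, log_var))
    print("sampled: ", sampled_kl(mu, log_var))
```

The two values should agree up to sampling noise. Using the closed form removes that noise from the KL part of the ELBO, which is one plausible reason the setting helps.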
1 parent a2c1086
Tip revision: ac48403bed22081758cc905c57145217739942c5 authored by wuaalb on 16 October 2015, 11:07:55 UTC
Better default hyper-parameters for vae_vanilla example
File | Mode | Size |
---|---|---|
examples | ||
misc | ||
parmesan | ||
.gitignore | -rw-r--r-- | 129 bytes |
LICENSE | -rwxr-xr-x | 1.4 KB |
MANIFEST.in | -rwxr-xr-x | 139 bytes |
README.rst | -rwxr-xr-x | 3.2 KB |
requirements-dev.txt | -rwxr-xr-x | 121 bytes |
requirements.txt | -rwxr-xr-x | 114 bytes |
setup.py | -rwxr-xr-x | 1.5 KB |