012d521 | Afroz Mohiuddin | 08 April 2019, 18:07:46 UTC | Bump setup.py to 1.13.2 -- Travis is green. PiperOrigin-RevId: 242497797 | 08 April 2019, 18:08:24 UTC |
79b0582 | Afroz Mohiuddin | 08 April 2019, 17:01:21 UTC | Don't require installing trax dependencies (jax, jaxlib) since they aren't available for Windows. PiperOrigin-RevId: 242481869 | 08 April 2019, 17:01:55 UTC |
3fb0eee | T2T Team | 08 April 2019, 15:34:56 UTC | Internal change. PiperOrigin-RevId: 242467951 | 08 April 2019, 15:35:31 UTC |
ef283ac | Afroz Mohiuddin | 07 April 2019, 02:22:20 UTC | Change comments. PiperOrigin-RevId: 242312909 | 08 April 2019, 11:37:16 UTC |
22b964c | Afroz Mohiuddin | 06 April 2019, 04:00:24 UTC | A PPO implementation in JAX. Much work remains to be done. PiperOrigin-RevId: 242240039 | 08 April 2019, 11:35:16 UTC |
5ddd46d | Noam Shazeer | 06 April 2019, 00:18:05 UTC | Move the unfortunately-named examples/transformer_standalone.py to transformer/main.py. Make model and layers hyperparameters gin-configurable so as to cut out a lot of plumbing. Rename "encoder" to "vocabulary" to avoid confusion with the encoder in encoder/decoder architectures. Add functionality to run transformer on t2t datasets. Modify some of the default model hyperparameters to better match previous transformer experiments. Enable layer-postprocess-dropout, which had been mistakenly omitted. Add options for sharing embedding and softmax layer weights. Several changes to speed up input pipeline: - custom op for sequence-packing (borrowed from t2t) (requires custom-built tf binary, so won't work yet on cloud-tpu) - enable parallelism in dataset.map calls - add prefetching - input pipeline seems to run fine now at >1M tokens/sec PiperOrigin-RevId: 242222578 | 06 April 2019, 00:18:46 UTC |
42b2d34 | T2T Team | 05 April 2019, 19:11:02 UTC | Internal change PiperOrigin-RevId: 242169130 | 05 April 2019, 19:11:36 UTC |
3562111 | Dustin Tran | 05 April 2019, 18:11:08 UTC | Fix OSS test for reversible layers. PiperOrigin-RevId: 242157332 | 05 April 2019, 18:11:45 UTC |
d7c9c60 | T2T Team | 05 April 2019, 16:48:02 UTC | Adding Pillow package dependency. PiperOrigin-RevId: 242140706 | 05 April 2019, 16:48:47 UTC |
9c557c2 | Lukasz Kaiser | 05 April 2019, 02:18:14 UTC | Fork stax to allow more experimentation. PiperOrigin-RevId: 242055396 | 05 April 2019, 02:18:51 UTC |
7e06bd6 | Lukasz Kaiser | 05 April 2019, 00:45:22 UTC | Allow hard attention in Universal Transformer. PiperOrigin-RevId: 242044506 | 05 April 2019, 00:46:19 UTC |
f6c024b | T2T Team | 04 April 2019, 22:48:35 UTC | Enable frame resizing for rendered gym environments. PiperOrigin-RevId: 242024877 | 04 April 2019, 22:49:42 UTC |
9dc3d12 | Lukasz Kaiser | 04 April 2019, 21:27:28 UTC | Add recent DeepMind math dataset to T2T. PiperOrigin-RevId: 242008722 | 04 April 2019, 21:28:01 UTC |
2182ee5 | Afroz Mohiuddin | 04 April 2019, 21:06:56 UTC | Add __init__.py to t2t/keras to make Travis happy. PiperOrigin-RevId: 242004246 | 04 April 2019, 21:07:35 UTC |
614794a | T2T Team | 04 April 2019, 15:30:35 UTC | internal change PiperOrigin-RevId: 241934922 | 04 April 2019, 15:31:10 UTC |
64c4e2c | Dustin Tran | 04 April 2019, 00:03:57 UTC | Remove internal implementation's reliance on Bijectors. This cleans up dependencies, showing how TFP Bijectors aren't necessary in the world of Keras layers. Future CLs may: + Move TransformedRandomVariable upstream to Edward2. + Refactor Edward2 to follow Edward1's mix-in approach to wrap Distributions. This lets us implement TransformedRandomVariable without having to implement a new TransformedDistribution. PiperOrigin-RevId: 241835590 | 04 April 2019, 00:04:47 UTC |
4de8254 | cbockman | 03 April 2019, 21:18:17 UTC | fix get_standardized_layers spelling (#1529) | 03 April 2019, 21:18:17 UTC |
c2a89e8 | T2T Team | 03 April 2019, 20:19:25 UTC | Allow skip_eos_postprocess to happen when decoding as well. PiperOrigin-RevId: 241792692 | 03 April 2019, 20:19:57 UTC |
e79589f | Lukasz Kaiser | 03 April 2019, 02:14:52 UTC | Correct rng passing in multi-device mode. ResNet trains on a TPU donut now. PiperOrigin-RevId: 241649966 | 03 April 2019, 02:15:35 UTC |
b6a9bbb | Manoj Kumar | 03 April 2019, 00:26:54 UTC | Add VideoFlow paper to T2T Readme PiperOrigin-RevId: 241637251 | 03 April 2019, 00:27:33 UTC |
0fc87aa | Afroz Mohiuddin | 02 April 2019, 22:31:34 UTC | Minor documentation. PiperOrigin-RevId: 241617014 | 02 April 2019, 22:32:13 UTC |
8b41452 | T2T Team | 02 April 2019, 18:26:25 UTC | Fix wasteful relative attention when using memory PiperOrigin-RevId: 241567826 | 02 April 2019, 18:27:20 UTC |
87a8784 | Lukasz Kaiser | 02 April 2019, 04:49:01 UTC | Decouple TRAX from t2t_trainer (use trax/trainer instead). PiperOrigin-RevId: 241461284 | 02 April 2019, 04:49:35 UTC |
c84eafe | Lukasz Kaiser | 01 April 2019, 23:47:24 UTC | Make 1-device mode not call pmap for now, correct rng handling in TRAX. PiperOrigin-RevId: 241427191 | 01 April 2019, 23:48:04 UTC |
861ead8 | Dustin Tran | 01 April 2019, 21:38:52 UTC | Add T2T constraints, initializers, & regularizers following Keras. PiperOrigin-RevId: 241402234 | 01 April 2019, 21:39:33 UTC |
de7f0b3 | Dustin Tran | 01 April 2019, 20:55:18 UTC | Demonstrate how to use posterior mean (or any other value) on forward pass. PiperOrigin-RevId: 241392931 | 01 April 2019, 20:55:57 UTC |
241a315 | James Martens | 01 April 2019, 20:01:42 UTC | Modifying the Layer Norm parameters to use the new features of K-FAC PiperOrigin-RevId: 241382675 | 01 April 2019, 20:02:17 UTC |
2f1380d | T2T Team | 01 April 2019, 16:42:27 UTC | Enable customizing host_call for eval/train modes. PiperOrigin-RevId: 241341729 | 01 April 2019, 16:43:09 UTC |
bb6440d | T2T Team | 31 March 2019, 15:31:14 UTC | Integrated the neural memory with transformer. PiperOrigin-RevId: 241213370 | 31 March 2019, 15:31:53 UTC |
164b342 | Lukasz Kaiser | 30 March 2019, 01:31:49 UTC | internal PiperOrigin-RevId: 241086998 | 30 March 2019, 01:32:38 UTC |
d691283 | Zi Yang | 29 March 2019, 21:44:23 UTC | Added key checks for "inputs" and "targets". PiperOrigin-RevId: 241054946 | 29 March 2019, 21:44:59 UTC |
de821fc | Lukasz Kaiser | 29 March 2019, 20:43:41 UTC | Correct random seed handling in TRAX, make input pipeline more aligned with T2T and set defaults better for colab ease of use. PiperOrigin-RevId: 241042919 | 29 March 2019, 20:44:32 UTC |
75611fc | T2T Team | 29 March 2019, 03:48:12 UTC | Do not save recurrent memory state in checkpoints. This ensures that checkpoint files are compatible across different chunk and memory sizes. PiperOrigin-RevId: 240912219 | 29 March 2019, 03:48:51 UTC |
dd58574 | T2T Team | 28 March 2019, 21:57:15 UTC | Fix recurrent memory batch size inference PiperOrigin-RevId: 240859652 | 28 March 2019, 21:57:52 UTC |
6be7d5c | Lukasz Kaiser | 28 March 2019, 17:17:16 UTC | Add a basic type of hard attention to Transformer; set hparams="hard_attention_k=16" to try. PiperOrigin-RevId: 240798075 | 28 March 2019, 17:17:56 UTC |
2d2d160 | Lukasz Kaiser | 28 March 2019, 00:47:59 UTC | Correct flat CIFAR modality to not consider 0 as padding. PiperOrigin-RevId: 240682373 | 28 March 2019, 00:48:38 UTC |
211c824 | T2T Team | 27 March 2019, 21:40:16 UTC | enable concrete models to override the default tpu host call. PiperOrigin-RevId: 240643874 | 27 March 2019, 21:41:00 UTC |
0a251ef | T2T Team | 27 March 2019, 14:40:41 UTC | CIFAR-10 flat subpixel generation PiperOrigin-RevId: 240555453 | 27 March 2019, 14:41:12 UTC |
7561ead | T2T Team | 26 March 2019, 23:20:16 UTC | Decouple recurrent memory size from chunk size PiperOrigin-RevId: 240450826 | 26 March 2019, 23:21:07 UTC |
b2cc9f2 | Piotr Milos | 26 March 2019, 22:03:44 UTC | Merge of PR #1511 PiperOrigin-RevId: 240435415 | 26 March 2019, 22:04:22 UTC |
9fa015b | Piotr Milos | 26 March 2019, 22:03:25 UTC | removing datasets for serving data (#1511) | 26 March 2019, 22:03:25 UTC |
36894c6 | T2T Team | 26 March 2019, 16:51:55 UTC | pass tpu_job_name to TPUConfig. PiperOrigin-RevId: 240367576 | 26 March 2019, 17:04:22 UTC |
8521bf7 | Afroz Mohiuddin | 25 March 2019, 23:58:33 UTC | Minor change to README.md PiperOrigin-RevId: 240247338 | 26 March 2019, 00:00:31 UTC |
f28a5e9 | konradczechowski | 25 March 2019, 23:29:11 UTC | Merge of PR #1500 PiperOrigin-RevId: 240242179 | 26 March 2019, 00:00:01 UTC |
3c7f7ca | Lukasz Kaiser | 25 March 2019, 23:23:45 UTC | Correct LayerNorm implementation and remove slax.multiplex to be more basic stax in Transformer (which now trains better). PiperOrigin-RevId: 240241257 | 25 March 2019, 23:59:38 UTC |
151dc27 | T2T Team | 25 March 2019, 22:21:18 UTC | "Adding mixture transformer" PiperOrigin-RevId: 240229309 | 25 March 2019, 23:59:13 UTC |
150aad3 | konradczechowski | 25 March 2019, 23:28:50 UTC | Model-Based RL: batched environments for DQN (#1500) * MBRL: batched dopamine, runner and agent * MBRL: Fix _observation usage in BatchDQNAgent; clean up tests; some minor changes. * MBRL: Perform multiple _train_steps per env_step in batched dqn, to keep the same _train_steps:env_steps ratio as in non-batched version. * MBRL: Use batched dopamine in MBRL pipeline. * Minor fixes, including dopamine 1.0.4 compatibility. * Batched Dopamine: Fix current_rollouts reset in BatchedAgent * Padded BatchEnv, prints. * Assert batch_size=1 for dopamine evaluation. * Enable model-free with dqn. * Move to dopamine 2.0.1. * Remove unused functions, add documentation. * Remove deprecated TODOs. * Fix SimulatedBatchEnv closing. * Fix closing environment in dopamine. * Improve batch size inference. * Add test for model-free and model-based dqn, reduce model-free ppo test time. * Parameter for model-based dqn number of evaluation episodes. * Unify batch_env attribute name for dopamine environment and wrappers. * Remove PaddedBatchEnv from default model-based dqn pipeline. * Linting. * Update tests for batch dqn runner and agent. | 25 March 2019, 23:28:50 UTC |
3744017 | T2T Team | 25 March 2019, 20:56:11 UTC | Add LanguagemodelWikitext103L16k problem PiperOrigin-RevId: 240211764 | 25 March 2019, 20:56:54 UTC |
623f5cb | Piotr Milos | 25 March 2019, 19:38:59 UTC | Merge of PR #1518 PiperOrigin-RevId: 240196506 | 25 March 2019, 19:39:41 UTC |
8c9b80c | Piotr Milos | 25 March 2019, 19:38:43 UTC | rl notebook fixes (#1518) | 25 March 2019, 19:38:43 UTC |
06dafa8 | T2T Team | 25 March 2019, 19:02:06 UTC | Use GFile PiperOrigin-RevId: 240189694 | 25 March 2019, 19:02:48 UTC |
16d59ab | T2T Team | 25 March 2019, 17:55:05 UTC | Queries and values don't need to have the same depth in dot_product_unmasked_self_attention_relative_2d. PiperOrigin-RevId: 240174892 | 25 March 2019, 17:55:47 UTC |
d63dc7f | Lukasz Kaiser | 25 March 2019, 02:56:45 UTC | Use pmap to make trax work in multi-device mode. PiperOrigin-RevId: 240068295 | 25 March 2019, 02:57:19 UTC |
ff1fd68 | Lukasz Kaiser | 22 March 2019, 23:19:15 UTC | Allow autoregressive output frame generation and relu as non-linearity in SD video models (both off by default). PiperOrigin-RevId: 239885106 | 22 March 2019, 23:19:54 UTC |
6bd50f8 | konradczechowski | 22 March 2019, 22:53:22 UTC | Merge of PR #1479 PiperOrigin-RevId: 239880513 | 22 March 2019, 22:54:02 UTC |
560644e | konradczechowski | 22 March 2019, 22:42:16 UTC | Model Based RL: Add sticky actions to model-based and model-free pipelines. (#1479) * Add sticky action wrapper, simplify code for environments wrapping. * Add sticky_actions option to model-based and model-free pipelines. | 22 March 2019, 22:42:16 UTC |
70b54f8 | Lukasz Kaiser | 22 March 2019, 19:20:14 UTC | Add link to paper and website to rl/README. PiperOrigin-RevId: 239841325 | 22 March 2019, 19:20:54 UTC |
765f651 | T2T Team | 22 March 2019, 18:52:25 UTC | Update to Paracrawl release4, and simplify/regularize some problem names. PiperOrigin-RevId: 239836265 | 22 March 2019, 18:53:05 UTC |
012796a | Anudhyan Boral | 22 March 2019, 18:03:05 UTC | Reshape the output explicitly in `sparse_message_pass`. This propagates static shape information that is lost during the `tf.sparse_reduce_sum` operation. This prevents downstream tests from failing after broadcasting support is added in tf.matmul. After this addition, unknown (static) rank on either operand of tf.matmul would result in unknown rank of the output Tensor. PiperOrigin-RevId: 239826458 | 22 March 2019, 18:03:43 UTC |
db2bbe4 | Lukasz Kaiser | 22 March 2019, 17:57:08 UTC | Update RL notebook and add a link to it in rl/README. Great thanks to Piotr Kozakowski for the colab! PiperOrigin-RevId: 239825106 | 22 March 2019, 17:57:57 UTC |
516d50e | RJ Skerry-Ryan | 22 March 2019, 15:45:29 UTC | Use a TPU-compatible approach to matrix inversion, until tf.linalg.inv is supported. PiperOrigin-RevId: 239801671 | 22 March 2019, 15:46:05 UTC |
0d840ee | Lukasz Kaiser | 21 March 2019, 23:02:05 UTC | Make TransformerLM train reasonably well in trax. Adding loss and metric masking and dropout refactor in Transformer. PiperOrigin-RevId: 239692595 | 21 March 2019, 23:02:40 UTC |
eedd6d7 | Afroz Mohiuddin | 21 March 2019, 22:53:27 UTC | Bump setup.py version to 1.13.1 PiperOrigin-RevId: 239690861 | 21 March 2019, 22:54:05 UTC |
0b17e18 | T2T Team | 21 March 2019, 22:31:46 UTC | Fix TransformerMemory model. Fixes a bug with training and enables relative attention. Absolute attention works poorly when the timing signal is reset at the start of each chunk. PiperOrigin-RevId: 239686861 | 21 March 2019, 22:32:27 UTC |
a6f8a00 | Afroz Mohiuddin | 21 March 2019, 21:29:44 UTC | Increase git depth, since there may be a lag between when we trigger a build and other changes getting checked into the repo, in which case git checkout on a commit id (after cloning with the extra changes) fails with "fatal: reference is not a tree". PiperOrigin-RevId: 239673455 | 21 March 2019, 21:30:36 UTC |
f41e517 | Lukas Geiger | 21 March 2019, 21:16:11 UTC | Upgrade ML Engine runtime to 1.13 and remove unused Gcloud class (#1494) This PR upgrades the ML Engine runtime to 1.13 in order to fix #1472. Unfortunately ML Engine doesn't support TPUs in version 1.13 yet: https://cloud.google.com/ml-engine/docs/release-notes This also removes the unused Gcloud class to clean up the code a bit. | 21 March 2019, 21:16:11 UTC |
e26510c | Ashish Vaswani | 21 March 2019, 20:33:04 UTC | query shape = memory shape. Allows for just one block extraction step for faster TPU processing. We also don't do the final linear transformation after attention. Includes tests PiperOrigin-RevId: 239661165 | 21 March 2019, 20:33:42 UTC |
45b3f4f | Afroz Mohiuddin | 21 March 2019, 19:55:53 UTC | Fix Travis breakage for tf != nightly. PiperOrigin-RevId: 239653689 | 21 March 2019, 19:56:30 UTC |
893f14f | Akio Ohta | 21 March 2019, 19:30:18 UTC | Merge of PR #1495 PiperOrigin-RevId: 239648633 | 21 March 2019, 19:31:00 UTC |
b56a5cd | Akio Ohta | 21 March 2019, 19:29:55 UTC | Modify serving utils (#1495) * Add decode logic for model with return_beams=True. * Add print logic for model with return_beams=True. | 21 March 2019, 19:29:55 UTC |
a4071d6 | Afroz Mohiuddin | 21 March 2019, 18:55:52 UTC | Remove unused tf-agents dependency for now, will add once we need to. PiperOrigin-RevId: 239641868 | 21 March 2019, 18:56:32 UTC |
642279b | Yongkeun Hwang | 21 March 2019, 18:46:54 UTC | Merge of PR #1480 PiperOrigin-RevId: 239640123 | 21 March 2019, 18:48:23 UTC |
d5b9ba2 | Afroz Mohiuddin | 21 March 2019, 18:46:03 UTC | Adding pypng package because of RenderedEnvProblems. PiperOrigin-RevId: 239639943 | 21 March 2019, 18:47:33 UTC |
5d345a9 | Yongkeun Hwang | 21 March 2019, 18:46:33 UTC | Allow using user-defined modules in t2t-eval (#1480) | 21 March 2019, 18:46:33 UTC |
1382452 | Piotr Kozakowski | 21 March 2019, 18:36:54 UTC | Merge of PR #1505 PiperOrigin-RevId: 239638087 | 21 March 2019, 18:37:33 UTC |
3dce919 | Piotr Kozakowski | 21 March 2019, 18:22:50 UTC | RL fixes (#1505) | 21 March 2019, 18:22:50 UTC |
3669442 | T2T Team | 21 March 2019, 18:19:41 UTC | Allow Evolved Transformer number of decoder attention heads to exceed 16. PiperOrigin-RevId: 239634336 | 21 March 2019, 18:20:16 UTC |
ba35f3d | T2T Team | 19 March 2019, 18:54:12 UTC | Added hparam for top-k random sampling in Transformer. PiperOrigin-RevId: 239238821 | 19 March 2019, 18:54:54 UTC |
886b774 | T2T Team | 19 March 2019, 18:31:19 UTC | fix shape bugs in dilated attention PiperOrigin-RevId: 239233677 | 19 March 2019, 18:31:57 UTC |
e5e7d4b | T2T Team | 18 March 2019, 22:20:46 UTC | Transformer with memory in the style of Transformer-XL PiperOrigin-RevId: 239072524 | 18 March 2019, 22:21:25 UTC |
191d9ad | T2T Team | 18 March 2019, 21:13:59 UTC | Resolve TODO. PiperOrigin-RevId: 239058553 | 18 March 2019, 21:14:38 UTC |
64eeb9d | Lukasz Kaiser | 18 March 2019, 19:05:58 UTC | Correct eval adjusting schedule and use for resnet in trax (improves training). PiperOrigin-RevId: 239033536 | 18 March 2019, 19:06:34 UTC |
8d93b2e | T2T Team | 16 March 2019, 00:33:43 UTC | Base hparam sets for languagemodel_wikitext103_l4k PiperOrigin-RevId: 238741124 | 16 March 2019, 00:34:22 UTC |
0542447 | T2T Team | 15 March 2019, 23:18:50 UTC | Implemented Neural Turing Machine. PiperOrigin-RevId: 238730108 | 15 March 2019, 23:19:42 UTC |
a187657 | Le Zhang | 15 March 2019, 15:53:47 UTC | Merge of PR #1487 PiperOrigin-RevId: 238649533 | 15 March 2019, 15:54:54 UTC |
3edecf9 | cbockman | 15 March 2019, 15:52:26 UTC | Merge of PR #1485 PiperOrigin-RevId: 238649310 | 15 March 2019, 15:54:29 UTC |
f23f535 | Le Zhang | 15 March 2019, 15:53:25 UTC | Fix step size extraction for checkpoint name with - in it such as avg-model.ckpt-1234 (#1487) | 15 March 2019, 15:53:25 UTC |
fc6037d | cbockman | 15 March 2019, 15:52:06 UTC | (minor) spelling deault -> default (#1485) | 15 March 2019, 15:52:06 UTC |
e84c425 | Ashish Vaswani | 15 March 2019, 15:33:25 UTC | Replaced gather local 2d with splits instead of slices. Slightly better MXU utilization. PiperOrigin-RevId: 238646469 | 15 March 2019, 15:34:04 UTC |
c3b6024 | Shivani Agrawal | 15 March 2019, 05:30:26 UTC | Separated out skip_summary() method from summary_op_util, fixing accordingly. PiperOrigin-RevId: 238585287 | 15 March 2019, 05:31:05 UTC |
6838765 | Trevor Gale | 14 March 2019, 23:32:47 UTC | Fixing dev token count for lm1b32k. PiperOrigin-RevId: 238544184 | 14 March 2019, 23:33:25 UTC |
b79c310 | T2T Team | 13 March 2019, 20:57:25 UTC | Search in base and env registries to create a problem. PiperOrigin-RevId: 238301625 | 13 March 2019, 20:58:01 UTC |
cb1f6a9 | Ashish Vaswani | 13 March 2019, 17:27:56 UTC | Simpler local 2d tpu attention. Algorithm developed by avaswani@ and nikip@. Will only work if memory flange is half of query size. PiperOrigin-RevId: 238252818 | 13 March 2019, 17:28:27 UTC |
b439442 | Ashish Vaswani | 12 March 2019, 22:05:59 UTC | Fold head and batch because tf does not like > 6 dimensions for gradients. Adjusted tests accordingly. PiperOrigin-RevId: 238105879 | 12 March 2019, 22:06:38 UTC |
d0a8ed0 | T2T Team | 12 March 2019, 18:25:41 UTC | Add RenderedEnvProblem. This is the base class for any Gym environment problem with rgb array as observations to behave as a VideoProblem. PiperOrigin-RevId: 238058750 | 12 March 2019, 18:26:25 UTC |
42a4120 | Zi Yang | 12 March 2019, 04:47:10 UTC | Updated Tensor2Tensor's serving_input_fn to correctly pad features and prepare the batch size at export time for use at serving time, if TPU is used. PiperOrigin-RevId: 237947564 | 12 March 2019, 04:47:46 UTC |
c9b8032 | Ashish Vaswani | 12 March 2019, 00:32:54 UTC | Fixed bug in 2d local TPU. PiperOrigin-RevId: 237920758 | 12 March 2019, 00:33:35 UTC |
b2346e1 | Ashish Vaswani | 11 March 2019, 22:53:37 UTC | local 2d attention that works on TPU. PiperOrigin-RevId: 237901376 | 11 March 2019, 22:54:12 UTC |
a1a6404 | T2T Team | 11 March 2019, 22:42:50 UTC | Added hparam for top-k random sampling in Transformer. PiperOrigin-RevId: 237899161 | 11 March 2019, 22:43:36 UTC |
2569b77 | T2T Team | 11 March 2019, 21:02:38 UTC | Added hparam for top-k random sampling in Transformer. PiperOrigin-RevId: 237877301 | 11 March 2019, 21:03:27 UTC |