# Build a Neural Language Model Using Sampled Softmax

This example demonstrates how to use sampled softmax for training a token-based neural language model.
The model predicts the next word in a text given the previous ones, where the probability of the next word is computed with a softmax over the vocabulary.
As the number of distinct words can be very large, this final softmax step can become costly.

Sampled softmax is a technique to reduce this cost at training time. For details, see the [sampled softmax tutorial](https://github.com/Microsoft/CNTK/blob/release/2.7/Tutorials/CNTK_207_Training_with_Sampled_Softmax.ipynb).
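
To illustrate the idea, here is a minimal NumPy sketch, not the CNTK implementation used in this example: `vocab_size`, `sample_size`, `hidden_dim`, and the random model state are illustrative, and the log-prior correction term that an unbiased estimator applies is omitted for brevity.

```python
import numpy as np

np.random.seed(0)

vocab_size = 10000   # distinct words, as in the PTB data used here
sample_size = 32     # illustrative number of random samples
hidden_dim = 128     # illustrative hidden dimension

h = np.random.randn(hidden_dim)                     # hidden state of the RNN
W = np.random.randn(vocab_size, hidden_dim) * 0.01  # output embeddings
target = 42                                         # index of the true next word

# Full softmax: one dot product per vocabulary entry -- O(vocab_size).
full_logits = W @ h
full_loss = -full_logits[target] + np.log(np.exp(full_logits).sum())

# Sampled softmax: score only the target plus a few random samples,
# so the cost per step is O(sample_size) instead of O(vocab_size).
# (Collisions with the target and the sampling-prior correction are
# ignored here for brevity.)
samples = np.random.randint(0, vocab_size, size=sample_size)
rows = np.concatenate(([target], samples))
sampled_logits = W[rows] @ h
sampled_loss = -sampled_logits[0] + np.log(np.exp(sampled_logits).sum())

print(f"full softmax loss:    {full_loss:.4f}")
print(f"sampled softmax loss: {sampled_loss:.4f}")
```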

Note that the provided data set has only 10,000 distinct words. This vocabulary is still fairly small, so sampled softmax does not show significant performance improvements here.
The real performance gains show up with larger vocabularies.

## HOWTO

This example uses the Penn Treebank data, which is not stored in GitHub and must be downloaded first.
To download the data, run `download_data.py` once. This creates a directory `./ptb` that contains all the data needed
to run the example.
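
For example, from the example's directory (assuming a Python environment with the example's dependencies installed):

```
python download_data.py   # one-time setup: creates ./ptb
```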

Run `word-rnn.py` to train a model.
The main section of `word-rnn.py` defines several parameters that control the training; a sketch of these settings follows the list below.

* `use_sampled_softmax` switches between sampled softmax and full softmax.
* `softmax_sample_size` sets the number of random samples used in sampled softmax.
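
A training run with sampled softmax enabled might be configured like this. This is a hedged sketch: the parameter names follow the list above, but the exact code in `word-rnn.py` may differ, and `softmax_sample_size = 100` is only an illustrative value.

```python
# Illustrative settings for the main section of word-rnn.py;
# parameter names follow the README, values are examples only.
use_sampled_softmax = True   # set to False to train with the full softmax
softmax_sample_size = 100    # random samples drawn per training step
```

With `use_sampled_softmax` set to `False`, training falls back to the full softmax over all 10,000 words.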