3296d0b | Frank Seide | 10 February 2018, 02:15:10 UTC | bug fix: overflow check for very large minibatches did not typecast correctly. | 10 February 2018, 02:15:10 UTC |
8326dfa | Frank Seide | 10 February 2018, 01:49:02 UTC | bug fix: relPosition should be relative to epochSize | 10 February 2018, 01:49:02 UTC |
ffc34c2 | Frank Seide | 10 February 2018, 01:20:45 UTC | new option --fromLatest, to allow automatic restart from checkpoint in Philly retries | 10 February 2018, 01:20:45 UTC |
489aa6f | Frank Seide | 10 February 2018, 01:07:45 UTC | checkpointing now saves a latest tag | 10 February 2018, 01:07:45 UTC |
6537f78 | Frank Seide | 10 February 2018, 00:50:52 UTC | changed checkpointing to fractions of epochs | 10 February 2018, 00:50:52 UTC |
dc6bfaa | Frank Seide | 09 February 2018, 23:42:06 UTC | added code to reader API to report current sample position; towards an epochSize-based scheduling, checkpointing, etc. (removed a debug-leftover sleep()) | 09 February 2018, 23:42:06 UTC |
6ed5349 | Frank Seide | 09 February 2018, 22:07:59 UTC | added a comment | 09 February 2018, 22:07:59 UTC |
c78abf7 | Frank Seide | 09 February 2018, 21:48:35 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/cntk into fseide/dynamite | 09 February 2018, 21:48:35 UTC |
d681d42 | Frank Seide | 09 February 2018, 21:46:41 UTC | added an overflow check to Unpack() for the cast of time index into a float32 (which can overflow for large minibatches) | 09 February 2018, 21:46:41 UTC |
70c5eb6 | Frank Seide | 09 February 2018, 21:44:01 UTC | changed pathnames to contain, and be interpolated with, environment variables, for running on Philly | 09 February 2018, 21:44:01 UTC |
442c033 | Frank Seide | 09 February 2018, 21:42:44 UTC | (comment) | 09 February 2018, 21:42:44 UTC |
c0383c4 | Frank Seide | 09 February 2018, 21:42:28 UTC | uncommented some checks | 09 February 2018, 21:42:28 UTC |
bfec707 | Frank Seide | 09 February 2018, 21:40:23 UTC | cleaned up checking code in AutoBatch | 09 February 2018, 21:40:23 UTC |
c609923 | Frank Seide | 08 February 2018, 17:58:25 UTC | bug fix: Makefile lib order should pick up explicit boost lib before generic system path | 08 February 2018, 17:58:25 UTC |
adfc1ea | Frank Seide | 07 February 2018, 05:32:15 UTC | increased bucketingFactor and loss-reporting smoothing time constant; turned dumping model parameters into a new command dump_model (and for that, split out the Marian calls into separate functions) | 07 February 2018, 05:32:15 UTC |
f03a669 | Frank Seide | 03 February 2018, 07:20:34 UTC | (added a little Python script to convert Dynamite parameter dumps to a Marian model readable by marian-decoder) | 03 February 2018, 07:20:34 UTC |
50a5d41 | Frank Seide | 02 February 2018, 06:07:15 UTC | bug fix: PlainTextDeserializer::PlainTextSequenceData should hold a ref count to each chunk's data-index array too, to avoid an incorrect early free; renamed a few variables in SetMatrixFromCSCFormat() in prep of reusing it for CSR | 02 February 2018, 06:07:15 UTC |
66fd02f | Frank Seide | 02 February 2018, 01:19:47 UTC | MT.cpp now prints call stack in case of crash | 02 February 2018, 01:19:47 UTC |
4155605 | Frank Seide | 02 February 2018, 00:53:35 UTC | cleaned up SetMatrixFromCSCFormat(), it should now be the same as SetMatrixFromCSRFormat() | 02 February 2018, 00:53:35 UTC |
a7decff | Frank Seide | 02 February 2018, 00:19:20 UTC | cleaned up resetting of GPUSparseMatrix::m_sliceViewOffset | 02 February 2018, 00:19:20 UTC |
bfe74b4 | Frank Seide | 01 February 2018, 23:49:00 UTC | CopyToCPUSparseMatrix() now makes a compact copy of the actual size, not preserving the allocation size | 01 February 2018, 23:49:00 UTC |
0972e98 | Frank Seide | 01 February 2018, 22:04:51 UTC | bug fix in GatherBatch(): should use correct NzCount | 01 February 2018, 22:04:51 UTC |
1eec80c | Frank Seide | 01 February 2018, 19:08:23 UTC | Matrix::AssignValuesOf() can now copy from GPUSparse to CPUSparse; bug fix: CopyToCPUSparseMatrix() should honor and adjust copied indices for sliceViewOffset | 01 February 2018, 19:08:23 UTC |
1931f11 | Frank Seide | 01 February 2018, 17:40:37 UTC | added CheckSparse(), but not working | 01 February 2018, 17:40:37 UTC |
336b25f | Frank Seide | 01 February 2018, 08:41:49 UTC | (comments) | 01 February 2018, 08:41:49 UTC |
365218c | Frank Seide | 01 February 2018, 08:11:14 UTC | added lots of checks to NDArrayViewArena (seems all correctly working); disabled arena for now, to track down some bug | 01 February 2018, 08:11:14 UTC |
9ff0bb5 | Frank Seide | 01 February 2018, 08:08:39 UTC | new methods MatrixBase::AsPtr(), AsRef(), ElemTypeIs() | 01 February 2018, 08:08:39 UTC |
6e6290d | Frank Seide | 01 February 2018, 07:35:53 UTC | cleaned up the last commit | 01 February 2018, 07:35:53 UTC |
f6561c0 | Frank Seide | 01 February 2018, 07:12:19 UTC | last commit improved by now launching more CUDA threads (more parallelism) | 01 February 2018, 07:12:19 UTC |
6885e3f | Frank Seide | 01 February 2018, 06:58:04 UTC | refactored sparse Convolution with transposed rhs to move the loop into the CUDA kernel, avoiding lots of CUDA launches | 01 February 2018, 06:58:04 UTC |
5a95f30 | Frank Seide | 01 February 2018, 06:42:06 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/CNTK into fseide/dynamite | 01 February 2018, 06:42:06 UTC |
14451bd | Frank Seide | 01 February 2018, 06:42:03 UTC | (comment) | 01 February 2018, 06:42:03 UTC |
aceeac6 | Frank Seide | 01 February 2018, 06:41:08 UTC | clarified some names in sparse Convolution | 01 February 2018, 06:41:08 UTC |
3137490 | Frank Seide | 01 February 2018, 06:39:42 UTC | renamed some variables in sparse kernels for clarity | 01 February 2018, 06:39:42 UTC |
4036e68 | Frank Seide | 01 February 2018, 05:31:32 UTC | cleaned up sparse x dense -> dense, and added a Dynamite test | 01 February 2018, 05:31:32 UTC |
67bbbd5 | Frank Seide | 31 January 2018, 19:49:42 UTC | removed s_recycledDenseArenas, as its role has been taken over by the recycled memory blocks | 31 January 2018, 19:49:42 UTC |
a9d4d30 | Frank Seide | 31 January 2018, 19:39:57 UTC | removed s_currentArenaSize/Used | 31 January 2018, 19:39:57 UTC |
e447f57 | Frank Seide | 31 January 2018, 19:35:19 UTC | removed s_currentArena | 31 January 2018, 19:35:19 UTC |
782e41a | Frank Seide | 31 January 2018, 19:31:31 UTC | changed an indentation (separate commit since diff shows it as one huge changed blob), no code change | 31 January 2018, 19:31:31 UTC |
9d66ffb | Frank Seide | 31 January 2018, 19:30:27 UTC | minor cleanup of previous commit, for better diffing | 31 January 2018, 19:30:27 UTC |
6315819 | Frank Seide | 31 January 2018, 19:25:07 UTC | preparing to delete the arena allocation, moving to using gaps only; factored gap allocation into separate function (no code change, but some renaming) | 31 January 2018, 19:25:07 UTC |
f15bf2f | Frank Seide | 31 January 2018, 18:54:41 UTC | (moved another function, no code change) | 31 January 2018, 18:54:41 UTC |
ffe27b5 | Frank Seide | 31 January 2018, 18:53:42 UTC | (moved a function, no code change) | 31 January 2018, 18:53:42 UTC |
a0e8712 | Frank Seide | 31 January 2018, 18:51:54 UTC | (minor fix to last commit) | 31 January 2018, 18:51:54 UTC |
a10c637 | Frank Seide | 31 January 2018, 18:49:18 UTC | separated s_recycledArenas into one dense and one sparse | 31 January 2018, 18:49:18 UTC |
d8a755b | Frank Seide | 31 January 2018, 18:43:02 UTC | minor cleanup of last commit (in two steps for better diffing); bug fix: missed a lock_guard in previous refactoring | 31 January 2018, 18:43:02 UTC |
9d2d85c | Frank Seide | 31 January 2018, 18:39:24 UTC | refactored NDArrayViewArena::New() a little, no code change otherwise | 31 January 2018, 18:39:24 UTC |
0d230c6 | Frank Seide | 31 January 2018, 18:32:39 UTC | can now pass an allocator to PrimitiveFunction::Forward() and BackwardTo(), to allow temporary objects in the future; NewNDArrayView() is now simply called New() | 31 January 2018, 18:32:39 UTC |
2930c48 | Frank Seide | 31 January 2018, 17:59:43 UTC | bug fix: IsMatrixOfDataType() should use correct type 'double'; bug fix: GetSubBatches_CreateMinibatches() should account for the last sentence of a partial minibatch correctly; bug fix: Train() should not run minibatch source in inference mode! | 31 January 2018, 17:59:43 UTC |
66cfd80 | Frank Seide | 29 January 2018, 23:51:55 UTC | rewrote partial batching, to account for Marian's padding | 29 January 2018, 23:51:55 UTC |
50dbc38 | Frank Seide | 29 January 2018, 21:22:48 UTC | now catches bad_alloc and logs the specific MB | 29 January 2018, 21:22:48 UTC |
b4aa61b | Frank Seide | 29 January 2018, 21:11:39 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/cntk into fseide/dynamite | 29 January 2018, 21:11:39 UTC |
4934c9e | Frank Seide | 29 January 2018, 21:11:30 UTC | TracingGPUMemoryAllocator::AllocateNoTrace() now throws bad_alloc(); simplified the allocator a little (most variables only exist for dense, so no need for arrays); bug fix: sparse allocation should not kill the current dense arena | 29 January 2018, 21:11:30 UTC |
499c3bc | Frank Seide | 29 January 2018, 19:45:36 UTC | improved logging for mem alloc | 29 January 2018, 19:45:36 UTC |
cbbd830 | Frank Seide | 29 January 2018, 19:12:32 UTC | merging now happens inside RecycleMemoryBlock() | 29 January 2018, 19:12:32 UTC |
9aaa78a | Frank Seide | 29 January 2018, 18:40:48 UTC | refactored NewNDArray() quite a bit; bug fix: gap merging should increment j correctly in all cases | 29 January 2018, 18:40:48 UTC |
34c20fe | Frank Seide | 29 January 2018, 16:14:39 UTC | added tracking to mem allocator; changed saveEvery | 29 January 2018, 16:14:39 UTC |
dc4756f | Frank Seide | 28 January 2018, 20:46:45 UTC | super hacks in NewNDArrayView, seems to work, but could be sped up | 28 January 2018, 20:46:45 UTC |
dbf882d | Frank Seide | 28 January 2018, 02:55:14 UTC | added diagnostics for allocator | 28 January 2018, 02:55:14 UTC |
17ee59d | Frank Seide | 28 January 2018, 02:51:45 UTC | towards a gap allocator | 28 January 2018, 02:51:45 UTC |
b4d1930 | Frank Seide | 28 January 2018, 01:36:30 UTC | prototypical implementation of recording recycled gaps | 28 January 2018, 01:36:30 UTC |
a01f79a | Frank Seide | 28 January 2018, 01:11:08 UTC | disabled logging of gaps | 28 January 2018, 01:11:08 UTC |
cd183ed | Frank Seide | 28 January 2018, 00:43:00 UTC | made gcc happy | 28 January 2018, 00:43:00 UTC |
adbaf0f | Frank Seide | 28 January 2018, 00:38:16 UTC | added tracing to the memory allocator, in prep for more proper reuse of memory | 28 January 2018, 00:38:16 UTC |
ceaacce | Frank Seide | 27 January 2018, 18:47:26 UTC | added a custom deleter to Matrix for matrix storage objects that wrap an external buffer. Not tested yet, but want to break this into small commits | 27 January 2018, 18:47:26 UTC |
5c4f1e2 | Frank Seide | 27 January 2018, 02:36:48 UTC | (fixed a few messages); layer_norm() switched to Pow() | 27 January 2018, 02:36:48 UTC |
b5f8124 | Frank Seide | 26 January 2018, 22:44:11 UTC | switched layer_norm to NormalizeDenormalize() | 26 January 2018, 22:44:11 UTC |
25da613 | Frank Seide | 26 January 2018, 22:41:40 UTC | implemented NormalizeDenormalize() and tests | 26 January 2018, 22:41:40 UTC |
f2cd275 | Frank Seide | 26 January 2018, 21:57:06 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/CNTK into fseide/dynamite | 26 January 2018, 21:57:06 UTC |
853baa5 | Frank Seide | 26 January 2018, 21:56:53 UTC | made gcc happy | 26 January 2018, 21:56:53 UTC |
163d024 | Frank Seide | 26 January 2018, 21:53:37 UTC | removed some old, wrong version of matrix-weight backprop | 26 January 2018, 21:53:37 UTC |
f5625b6 | Frank Seide | 26 January 2018, 21:51:32 UTC | added another test for ElementAffine() with an explicit formula, to test opAxBplusC itself | 26 January 2018, 21:51:32 UTC |
55226b4 | Frank Seide | 26 January 2018, 21:49:03 UTC | implemented ElementAffine(), with tests; using it in layer_norm | 26 January 2018, 21:49:03 UTC |
348e73a | Frank Seide | 26 January 2018, 20:09:58 UTC | Expr operators with scalars now use ScaleAndShift(), avoiding the constants | 26 January 2018, 20:09:58 UTC |
97607ad | Frank Seide | 26 January 2018, 19:54:52 UTC | implemented ScaleAndShift(), with tests | 26 January 2018, 19:54:52 UTC |
e7c27aa | Frank Seide | 26 January 2018, 19:14:20 UTC | bug fix: batched backprop into matrix weight can now merge more complex map shapes like found in Marian, by reshaping; changed marian::affine() to use Affine(), saving some memory | 26 January 2018, 19:14:20 UTC |
6c1085c | Frank Seide | 26 January 2018, 19:10:23 UTC | (adapted a test to a new function signature) | 26 January 2018, 19:10:23 UTC |
c89cb67 | Frank Seide | 26 January 2018, 02:00:42 UTC | undid an accidental commit of an intentional test failure | 26 January 2018, 02:00:42 UTC |
ed85c77 | Frank Seide | 26 January 2018, 01:57:47 UTC | tests for variants of Affine() | 26 January 2018, 01:57:47 UTC |
be3d3e9 | Frank Seide | 26 January 2018, 01:53:15 UTC | bug fix: gradient of TransposeAffine should be the same as TransposeTimes; added tests for variants of Affine() | 26 January 2018, 01:53:15 UTC |
9efe860 | Frank Seide | 26 January 2018, 00:47:44 UTC | minor fix in marian::layer_norm | 26 January 2018, 00:47:44 UTC |
6d44662 | Frank Seide | 26 January 2018, 00:46:34 UTC | implemented Affine() and a test. This required quite a few additions in several places; bug fix: back prop into matrix weight should use a || instead of && to match (this fix revealed another bug to be fixed later) | 26 January 2018, 00:46:34 UTC |
85f619f | Frank Seide | 25 January 2018, 22:27:07 UTC | added stubs for new operations such as Affine() and ScaleAndShift(), but not implemented yet | 25 January 2018, 22:27:07 UTC |
6c9b6f9 | Frank Seide | 25 January 2018, 22:10:26 UTC | added new PrimitiveFunction() constructor overload that takes an InputsVectorType | 25 January 2018, 22:10:26 UTC |
081aa04 | Frank Seide | 25 January 2018, 07:08:45 UTC | added runtime stats to BackpropThroughSplice() and BackpropToMatrixWeight() | 25 January 2018, 07:08:45 UTC |
d8951a1 | Frank Seide | 25 January 2018, 06:43:27 UTC | disabled LR decay; disabled reading out partial-MB loss from GPU | 25 January 2018, 06:43:27 UTC |
2c513db | Frank Seide | 25 January 2018, 05:50:39 UTC | bug fixes: when passing something by move(), the same argument list should not access it; added statistics reporting for BackpropToUnbatched() | 25 January 2018, 05:50:39 UTC |
e917e77 | Frank Seide | 25 January 2018, 03:30:43 UTC | towards moving backprop onto bg thread | 25 January 2018, 03:30:43 UTC |
cc7c591 | Frank Seide | 25 January 2018, 03:17:25 UTC | cleaned up error handling in the Memoizer thread | 25 January 2018, 03:17:25 UTC |
8a1f8e1 | Frank Seide | 25 January 2018, 01:42:42 UTC | minor updates of logging | 25 January 2018, 01:42:42 UTC |
f20678a | Frank Seide | 25 January 2018, 01:35:20 UTC | fix in AutoBatch stats-logging counter | 25 January 2018, 01:35:20 UTC |
a2d588a | Frank Seide | 25 January 2018, 01:21:32 UTC | BatchedBackward() now collects init time; control of batching stats fixed w.r.t. backward | 25 January 2018, 01:21:32 UTC |
85d737f | Frank Seide | 25 January 2018, 01:11:25 UTC | (fixed statistics) | 25 January 2018, 01:11:25 UTC |
163353f | Frank Seide | 24 January 2018, 19:54:24 UTC | towards showing stats for Backward | 24 January 2018, 19:54:24 UTC |
9c88fba | Frank Seide | 24 January 2018, 19:39:14 UTC | disabled GPU sync for time measurement in BatchedForward(); removed the hard-coded Marian hyper-parameters in Adam; MT: changed saveEvery to 10000; updated some logging; changed Adam momentum parameters to better match Marian's; added a new end-to-end timer for the non-Update() sub-minibatches | 24 January 2018, 19:39:14 UTC |
e780fd3 | Frank Seide | 24 January 2018, 16:57:30 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/CNTK into fseide/dynamite | 24 January 2018, 16:57:30 UTC |
6c4a2f2 | Frank Seide | 24 January 2018, 16:57:23 UTC | gcc compiler happiness | 24 January 2018, 16:57:23 UTC |
00c5592 | Frank Seide | 24 January 2018, 16:51:09 UTC | bug fix: RAggregateGradientFromAllConsumers() should separate transposed and non-transposed Times for weight gradient, to ensure the dense gradients are done before the sparse ones for the same weight; bug fix: marian DropoutMask() should scale the mask; reenabled dropout in Transformer | 24 January 2018, 16:51:09 UTC |
d7a36ba | Frank Seide | 23 January 2018, 22:10:11 UTC | adjusted the tests for BernoulliRandom() | 23 January 2018, 22:10:11 UTC |
80fa470 | Frank Seide | 23 January 2018, 09:48:06 UTC | Marian dropout now implemented using BernoulliRandom() | 23 January 2018, 09:48:06 UTC |