5aa28a9 | Frank Seide | 11 February 2018, 07:34:03 UTC | made MSVC happy; undid the fake implementation of AssignValuesOf<int>(), instead resolved all needed templates | 11 February 2018, 07:34:03 UTC |
e7a3733 | Frank Seide | 11 February 2018, 06:14:45 UTC | implemented an int version of Unpack()'s underlying functionality, avoiding float overflows for large minibatches. Not working yet, need to move to Windows to debug. Involved changes to template definitions throughout the matrix stack, but no real code change | 11 February 2018, 06:14:45 UTC |
b5a8adc | Frank Seide | 10 February 2018, 03:31:03 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/cntk into fseide/dynamite | 10 February 2018, 03:31:03 UTC |
b7663f9 | Frank Seide | 10 February 2018, 03:30:20 UTC | implemented learning-rate decay; exposed learning-rate warm-up as a parameter | 10 February 2018, 03:30:20 UTC |
8a6d806 | Frank Seide | 10 February 2018, 03:10:12 UTC | minor fix in tag and logging | 10 February 2018, 03:10:12 UTC |
9307403 | Frank Seide | 10 February 2018, 02:54:45 UTC | changed log output to show position relative to epoch; position tag changed to percentage | 10 February 2018, 02:54:45 UTC |
b222d64 | Frank Seide | 10 February 2018, 02:51:23 UTC | GetSubBatches() now breaks loads into small pieces to avoid the float32 overflow for very large batches; removed L=0 from partial log | 10 February 2018, 02:51:23 UTC |
3296d0b | Frank Seide | 10 February 2018, 02:15:10 UTC | bug fix: overflow check for very large minibatches did not typecast correctly. | 10 February 2018, 02:15:10 UTC |
8326dfa | Frank Seide | 10 February 2018, 01:49:02 UTC | bug fix: relPosition should be relative to epochSize | 10 February 2018, 01:49:02 UTC |
ffc34c2 | Frank Seide | 10 February 2018, 01:20:45 UTC | new option --fromLatest, to allow automatic restart from checkpoint in Philly retries | 10 February 2018, 01:20:45 UTC |
489aa6f | Frank Seide | 10 February 2018, 01:07:45 UTC | checkpointing now saves a latest tag | 10 February 2018, 01:07:45 UTC |
6537f78 | Frank Seide | 10 February 2018, 00:50:52 UTC | changed checkpointing to fractions of epochs | 10 February 2018, 00:50:52 UTC |
dc6bfaa | Frank Seide | 09 February 2018, 23:42:06 UTC | added code to reader API to report current sample position; towards an epochSize-based scheduling, checkpointing, etc. (removed a debug-leftover sleep()) | 09 February 2018, 23:42:06 UTC |
6ed5349 | Frank Seide | 09 February 2018, 22:07:59 UTC | added a comment | 09 February 2018, 22:07:59 UTC |
c78abf7 | Frank Seide | 09 February 2018, 21:48:35 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/cntk into fseide/dynamite | 09 February 2018, 21:48:35 UTC |
d681d42 | Frank Seide | 09 February 2018, 21:46:41 UTC | added an overflow check to Unpack() for the cast of time index into a float32 (which can overflow for large minibatches) | 09 February 2018, 21:46:41 UTC |
70c5eb6 | Frank Seide | 09 February 2018, 21:44:01 UTC | changed pathnames to contain environment variables and interpolate them, for running on Philly | 09 February 2018, 21:44:01 UTC |
442c033 | Frank Seide | 09 February 2018, 21:42:44 UTC | (comment) | 09 February 2018, 21:42:44 UTC |
c0383c4 | Frank Seide | 09 February 2018, 21:42:28 UTC | uncommented some checks | 09 February 2018, 21:42:28 UTC |
bfec707 | Frank Seide | 09 February 2018, 21:40:23 UTC | cleaned up checking code in AutoBatch | 09 February 2018, 21:40:23 UTC |
c609923 | Frank Seide | 08 February 2018, 17:58:25 UTC | bug fix: Makefile lib order should pick up explicit boost lib before generic system path | 08 February 2018, 17:58:25 UTC |
adfc1ea | Frank Seide | 07 February 2018, 05:32:15 UTC | increased bucketingFactor and loss-reporting smoothing time constant; turned dumping model parameters into a new command dump_model (and for that, split out the Marian calls into separate functions) | 07 February 2018, 05:32:15 UTC |
f03a669 | Frank Seide | 03 February 2018, 07:20:34 UTC | (added a little Python script to convert Dynamite parameter dumps to a Marian model readable by marian-decoder) | 03 February 2018, 07:20:34 UTC |
50a5d41 | Frank Seide | 02 February 2018, 06:07:15 UTC | bug fix: PlainTextDeserializer::PlainTextSequenceData should hold a ref count to each chunk's data-index array too, to avoid an incorrect early free; renamed a few variables in SetMatrixFromCSCFormat() in prep of reusing it for CSR | 02 February 2018, 06:07:15 UTC |
66fd02f | Frank Seide | 02 February 2018, 01:19:47 UTC | MT.cpp now prints call stack in case of crash | 02 February 2018, 01:19:47 UTC |
4155605 | Frank Seide | 02 February 2018, 00:53:35 UTC | cleaned up SetMatrixFromCSCFormat(), it should now be the same as SetMatrixFromCSRFormat() | 02 February 2018, 00:53:35 UTC |
a7decff | Frank Seide | 02 February 2018, 00:19:20 UTC | cleaned up resetting of GPUSparseMatrix::m_sliceViewOffset | 02 February 2018, 00:19:20 UTC |
bfe74b4 | Frank Seide | 01 February 2018, 23:49:00 UTC | CopyToCPUSparseMatrix() now makes a compact copy of the actual size, not preserving the allocation size | 01 February 2018, 23:49:00 UTC |
0972e98 | Frank Seide | 01 February 2018, 22:04:51 UTC | bug fix in GatherBatch(): should use correct NzCount | 01 February 2018, 22:04:51 UTC |
1eec80c | Frank Seide | 01 February 2018, 19:08:23 UTC | Matrix::AssignValuesOf() can now copy from GPUSparse to CPUSparse; bug fix: CopyToCPUSparseMatrix() should honor and adjust copied indices for sliceViewOffset | 01 February 2018, 19:08:23 UTC |
1931f11 | Frank Seide | 01 February 2018, 17:40:37 UTC | added CheckSparse(), but not working | 01 February 2018, 17:40:37 UTC |
336b25f | Frank Seide | 01 February 2018, 08:41:49 UTC | (comments) | 01 February 2018, 08:41:49 UTC |
365218c | Frank Seide | 01 February 2018, 08:11:14 UTC | added lots of checks to NDArrayViewArena (seems all correctly working); disabled arena for now, to track down some bug | 01 February 2018, 08:11:14 UTC |
9ff0bb5 | Frank Seide | 01 February 2018, 08:08:39 UTC | new methods MatrixBase::AsPtr(), AsRef(), ElemTypeIs() | 01 February 2018, 08:08:39 UTC |
6e6290d | Frank Seide | 01 February 2018, 07:35:53 UTC | cleaned up the last commit | 01 February 2018, 07:35:53 UTC |
f6561c0 | Frank Seide | 01 February 2018, 07:12:19 UTC | last commit improved by now launching more CUDA threads (more parallelism) | 01 February 2018, 07:12:19 UTC |
6885e3f | Frank Seide | 01 February 2018, 06:58:04 UTC | refactored sparse Convolution with transposed rhs to move the loop into the CUDA kernel, avoiding lots of CUDA launches | 01 February 2018, 06:58:04 UTC |
5a95f30 | Frank Seide | 01 February 2018, 06:42:06 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/CNTK into fseide/dynamite | 01 February 2018, 06:42:06 UTC |
14451bd | Frank Seide | 01 February 2018, 06:42:03 UTC | (comment) | 01 February 2018, 06:42:03 UTC |
aceeac6 | Frank Seide | 01 February 2018, 06:41:08 UTC | clarified some names in sparse Convolution | 01 February 2018, 06:41:08 UTC |
3137490 | Frank Seide | 01 February 2018, 06:39:42 UTC | renamed some variables in sparse kernels for clarity | 01 February 2018, 06:39:42 UTC |
4036e68 | Frank Seide | 01 February 2018, 05:31:32 UTC | cleaned up sparse x dense -> dense, and added a Dynamite test | 01 February 2018, 05:31:32 UTC |
67bbbd5 | Frank Seide | 31 January 2018, 19:49:42 UTC | removed s_recycledDenseArenas, as its role has been taken over by the recycled memory blocks | 31 January 2018, 19:49:42 UTC |
a9d4d30 | Frank Seide | 31 January 2018, 19:39:57 UTC | removed s_currentArenaSize/Used | 31 January 2018, 19:39:57 UTC |
e447f57 | Frank Seide | 31 January 2018, 19:35:19 UTC | removed s_currentArena | 31 January 2018, 19:35:19 UTC |
782e41a | Frank Seide | 31 January 2018, 19:31:31 UTC | changed an indentation (separate commit since diff shows it as one huge changed blob), no code change | 31 January 2018, 19:31:31 UTC |
9d66ffb | Frank Seide | 31 January 2018, 19:30:27 UTC | minor cleanup of previous commit, for better diffing | 31 January 2018, 19:30:27 UTC |
6315819 | Frank Seide | 31 January 2018, 19:25:07 UTC | preparing to delete the arena allocation, moving to using gaps only; factored gap allocation into separate function (no code change, but some renaming) | 31 January 2018, 19:25:07 UTC |
f15bf2f | Frank Seide | 31 January 2018, 18:54:41 UTC | (moved another function, no code change) | 31 January 2018, 18:54:41 UTC |
ffe27b5 | Frank Seide | 31 January 2018, 18:53:42 UTC | (moved a function, no code change) | 31 January 2018, 18:53:42 UTC |
a0e8712 | Frank Seide | 31 January 2018, 18:51:54 UTC | (minor fix to last commit) | 31 January 2018, 18:51:54 UTC |
a10c637 | Frank Seide | 31 January 2018, 18:49:18 UTC | separated s_recycledArenas into one dense and one sparse | 31 January 2018, 18:49:18 UTC |
d8a755b | Frank Seide | 31 January 2018, 18:43:02 UTC | minor cleanup of last commit (in two steps for better diffing); bug fix: missed a lock_guard in previous refactoring | 31 January 2018, 18:43:02 UTC |
9d2d85c | Frank Seide | 31 January 2018, 18:39:24 UTC | refactored NDArrayViewArena::New() a little, no code change otherwise | 31 January 2018, 18:39:24 UTC |
0d230c6 | Frank Seide | 31 January 2018, 18:32:39 UTC | can now pass an allocator to PrimitiveFunction::Forward() and BackwardTo(), to allow temporary objects in the future; NewNDArrayView() now simply called New() | 31 January 2018, 18:32:39 UTC |
2930c48 | Frank Seide | 31 January 2018, 17:59:43 UTC | bug fix: IsMatrixOfDataType() should use correct type 'double'; bug fix: GetSubBatches_CreateMinibatches() should account for the last sentence of a partial minibatch correctly; bug fix: Train() should not run minibatch source in inference mode! | 31 January 2018, 17:59:43 UTC |
66cfd80 | Frank Seide | 29 January 2018, 23:51:55 UTC | rewrote partial batching, to account for Marian's padding | 29 January 2018, 23:51:55 UTC |
50dbc38 | Frank Seide | 29 January 2018, 21:22:48 UTC | now catches bad_alloc and logs the specific MB | 29 January 2018, 21:22:48 UTC |
b4aa61b | Frank Seide | 29 January 2018, 21:11:39 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/cntk into fseide/dynamite | 29 January 2018, 21:11:39 UTC |
4934c9e | Frank Seide | 29 January 2018, 21:11:30 UTC | TracingGPUMemoryAllocator::AllocateNoTrace() now throws bad_alloc(); simplified the allocator a little (most variables only exist for dense, so no need for arrays); bug fix: sparse allocation should not kill the current dense arena | 29 January 2018, 21:11:30 UTC |
499c3bc | Frank Seide | 29 January 2018, 19:45:36 UTC | improved logging for mem alloc | 29 January 2018, 19:45:36 UTC |
cbbd830 | Frank Seide | 29 January 2018, 19:12:32 UTC | merging now happens inside RecycleMemoryBlock() | 29 January 2018, 19:12:32 UTC |
9aaa78a | Frank Seide | 29 January 2018, 18:40:48 UTC | refactored NewNDArray() quite a bit; bug fix: gap merging should increment j correctly in all cases | 29 January 2018, 18:40:48 UTC |
34c20fe | Frank Seide | 29 January 2018, 16:14:39 UTC | added tracking to mem allocator; changed saveEvery | 29 January 2018, 16:14:39 UTC |
dc4756f | Frank Seide | 28 January 2018, 20:46:45 UTC | super hacks in NewNDArrayView, seems to work, but could be sped up | 28 January 2018, 20:46:45 UTC |
dbf882d | Frank Seide | 28 January 2018, 02:55:14 UTC | added diagnostics for allocator | 28 January 2018, 02:55:14 UTC |
17ee59d | Frank Seide | 28 January 2018, 02:51:45 UTC | towards a gap allocator | 28 January 2018, 02:51:45 UTC |
b4d1930 | Frank Seide | 28 January 2018, 01:36:30 UTC | prototypical implementation of recording recycled gaps | 28 January 2018, 01:36:30 UTC |
a01f79a | Frank Seide | 28 January 2018, 01:11:08 UTC | disabled logging of gaps | 28 January 2018, 01:11:08 UTC |
cd183ed | Frank Seide | 28 January 2018, 00:43:00 UTC | made gcc happy | 28 January 2018, 00:43:00 UTC |
adbaf0f | Frank Seide | 28 January 2018, 00:38:16 UTC | added tracing to the memory allocator, in prep for more proper reuse of memory | 28 January 2018, 00:38:16 UTC |
ceaacce | Frank Seide | 27 January 2018, 18:47:26 UTC | added a custom deleter to Matrix for matrix storage objects that wrap an external buffer. Not tested yet, but want to break this into small commits | 27 January 2018, 18:47:26 UTC |
5c4f1e2 | Frank Seide | 27 January 2018, 02:36:48 UTC | (fixed a few messages); layer_norm() switched to Pow() | 27 January 2018, 02:36:48 UTC |
b5f8124 | Frank Seide | 26 January 2018, 22:44:11 UTC | switched layer_norm to NormalizeDenormalize() | 26 January 2018, 22:44:11 UTC |
25da613 | Frank Seide | 26 January 2018, 22:41:40 UTC | implemented NormalizeDenormalize() and tests | 26 January 2018, 22:41:40 UTC |
f2cd275 | Frank Seide | 26 January 2018, 21:57:06 UTC | Merge branch 'fseide/dynamite' of https://github.com/Microsoft/CNTK into fseide/dynamite | 26 January 2018, 21:57:06 UTC |
853baa5 | Frank Seide | 26 January 2018, 21:56:53 UTC | made gcc happy | 26 January 2018, 21:56:53 UTC |
163d024 | Frank Seide | 26 January 2018, 21:53:37 UTC | removed some old, wrong version of matrix-weight backprop | 26 January 2018, 21:53:37 UTC |
f5625b6 | Frank Seide | 26 January 2018, 21:51:32 UTC | added another test for ElementAffine() with an explicit formula, to test opAxBplusC itself | 26 January 2018, 21:51:32 UTC |
55226b4 | Frank Seide | 26 January 2018, 21:49:03 UTC | implemented ElementAffine(), with tests; using it in layer_norm | 26 January 2018, 21:49:03 UTC |
348e73a | Frank Seide | 26 January 2018, 20:09:58 UTC | Expr operators with scalars now use ScaleAndShift(), avoiding the constants | 26 January 2018, 20:09:58 UTC |
97607ad | Frank Seide | 26 January 2018, 19:54:52 UTC | implemented ScaleAndShift(), with tests | 26 January 2018, 19:54:52 UTC |
e7c27aa | Frank Seide | 26 January 2018, 19:14:20 UTC | bug fix: batched backprop into matrix weight can now merge more complex map shapes like found in Marian, by reshaping; changed marian::affine() to use Affine(), saving some memory | 26 January 2018, 19:14:20 UTC |
6c1085c | Frank Seide | 26 January 2018, 19:10:23 UTC | (adapted a test to a new function signature) | 26 January 2018, 19:10:23 UTC |
c89cb67 | Frank Seide | 26 January 2018, 02:00:42 UTC | undid an accidental commit of an intentional test failure | 26 January 2018, 02:00:42 UTC |
ed85c77 | Frank Seide | 26 January 2018, 01:57:47 UTC | tests for variants of Affine() | 26 January 2018, 01:57:47 UTC |
be3d3e9 | Frank Seide | 26 January 2018, 01:53:15 UTC | bug fix: gradient of TransposeAffine should be the same as TransposeTimes; added tests for variants of Affine() | 26 January 2018, 01:53:15 UTC |
9efe860 | Frank Seide | 26 January 2018, 00:47:44 UTC | minor fix in marian::layer_norm | 26 January 2018, 00:47:44 UTC |
6d44662 | Frank Seide | 26 January 2018, 00:46:34 UTC | implemented Affine() and a test. This required quite a bit of additions at several places; bug fix: back prop into matrix weight should use a || instead of && to match (this fix revealed another bug to be fixed later) | 26 January 2018, 00:46:34 UTC |
85f619f | Frank Seide | 25 January 2018, 22:27:07 UTC | added stubs for new operations such as Affine() and ScaleAndShift(), but not implemented yet | 25 January 2018, 22:27:07 UTC |
6c9b6f9 | Frank Seide | 25 January 2018, 22:10:26 UTC | added new PrimitiveFunction() constructor overload that takes an InputsVectorType | 25 January 2018, 22:10:26 UTC |
081aa04 | Frank Seide | 25 January 2018, 07:08:45 UTC | added runtime stats to BackpropThroughSplice() and BackpropToMatrixWeight() | 25 January 2018, 07:08:45 UTC |
d8951a1 | Frank Seide | 25 January 2018, 06:43:27 UTC | disabled LR decay; disabled reading out partial-MB loss from GPU | 25 January 2018, 06:43:27 UTC |
2c513db | Frank Seide | 25 January 2018, 05:50:39 UTC | bug fixes: when passing something by move(), the same argument list should not access it; added statistics reporting for BackpropToUnbatched() | 25 January 2018, 05:50:39 UTC |
e917e77 | Frank Seide | 25 January 2018, 03:30:43 UTC | towards moving backprop onto bg thread | 25 January 2018, 03:30:43 UTC |
cc7c591 | Frank Seide | 25 January 2018, 03:17:25 UTC | cleaned up error handling in the Memoizer thread | 25 January 2018, 03:17:25 UTC |
8a1f8e1 | Frank Seide | 25 January 2018, 01:42:42 UTC | minor updates of logging | 25 January 2018, 01:42:42 UTC |
f20678a | Frank Seide | 25 January 2018, 01:35:20 UTC | fix in AutoBatch stats-logging counter | 25 January 2018, 01:35:20 UTC |
a2d588a | Frank Seide | 25 January 2018, 01:21:32 UTC | BatchedBackward() now collects init time; control of batching stats fixed w.r.t. backward | 25 January 2018, 01:21:32 UTC |
85d737f | Frank Seide | 25 January 2018, 01:11:25 UTC | (fixed statistics) | 25 January 2018, 01:11:25 UTC |