5c3f708 | Mark Hillebrand | 18 January 2016, 08:36:30 UTC | License change | 18 January 2016, 08:36:30 UTC |
1143418 | Marko Radmilac | 29 December 2015, 18:59:54 UTC | fix format | 29 December 2015, 18:59:54 UTC |
1893322 | Marko Radmilac | 29 December 2015, 18:47:32 UTC | add outputs | 29 December 2015, 18:47:32 UTC |
d7f0c07 | Frank Seide | 24 December 2015, 11:04:44 UTC | deleted virtual function ComputationNode::InferImageDimsFromInputs() since no longer needed after update of tensor-dim inference. Unary zip ops just copy the layout from their input, and binary zip ops take dimension-wise max (to consider broadcasting) | 24 December 2015, 11:04:44 UTC |
c7a0b62 | Frank Seide | 24 December 2015, 09:48:25 UTC | bug fix: SimpleNetworkBuilder::AddTrainAndEvalCriterionNodes() should not compute 'tinput' for certain training-criterion nodes because 'input' has a different meaning for those; revived BatchSequenceReader, now supports MBLayout::AddSequence() | 24 December 2015, 09:48:25 UTC |
9b8a508 | Frank Seide | 24 December 2015, 07:58:23 UTC | bug fix: ConvolutionNode must not ever change m_sampleLayout numChannels dimension | 24 December 2015, 07:58:23 UTC |
ff0a801 | Frank Seide | 24 December 2015, 07:11:40 UTC | fixed numRows test for 0 dimensions | 24 December 2015, 07:11:40 UTC |
086e62f | Frank Seide | 24 December 2015, 07:02:37 UTC | removed SetDims(size_t,cols) which only set m_numRows. Now, row dimension can only be set together with the sample layout, and in debug builds they are verified against each other | 24 December 2015, 07:02:37 UTC |
9f08610 | Frank Seide | 24 December 2015, 06:29:50 UTC | (minor fix of last commit) | 24 December 2015, 06:29:50 UTC |
bf35c75 | Frank Seide | 24 December 2015, 06:23:20 UTC | inlined all InferImageDimsFromInputs() calls in ConvolutionNodes.h. This allowed using SetDims(TensorShape,cols) instead of setting numRows and sampleLayout separately | 24 December 2015, 06:23:20 UTC |
d5fd84f | Frank Seide | 24 December 2015, 06:12:24 UTC | removed m_inputSampleLayout. It is only used in ConvolutionNodes.h, and is identical to the input's sample layout, so we can get it where we need it from the input directly | 24 December 2015, 06:12:24 UTC |
6a69048 | Frank Seide | 24 December 2015, 05:33:16 UTC | changed former InferImageDimsFromInput(index) to a getter that just reads out the layout from the child, and all calls to this now assign it manually to the (input)SampleLayout variables, in a quest to remove m_inputSampleLayout altogether | 24 December 2015, 05:33:16 UTC |
0864a8c | Frank Seide | 24 December 2015, 05:12:51 UTC | split up InferImageDimsFromInput() into false and true versions | 24 December 2015, 05:12:51 UTC |
db4c5ea | Frank Seide | 24 December 2015, 04:48:08 UTC | SetDims(node) now also clones that node's m_sampleLayout | 24 December 2015, 04:48:08 UTC |
166536b | Frank Seide | 24 December 2015, 04:46:36 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 24 December 2015, 04:46:36 UTC |
df6c25e | Frank Seide | 24 December 2015, 04:45:43 UTC | sorted out various SetDims() calls, towards using SetDims() only to set both m_numRows and m_sampleLayout jointly | 24 December 2015, 04:45:43 UTC |
dfd780a | Frank Seide | 24 December 2015, 04:21:27 UTC | cleaning up order of Validate(): MBLayout is now set first, and SetDim() and InferImageDimsFromInput() are grouped (they will get merged) | 24 December 2015, 04:21:27 UTC |
5829d3d | Frank Seide | 24 December 2015, 03:11:28 UTC | (added a perf comment) | 24 December 2015, 03:11:28 UTC |
6418994 | Frank Seide | 24 December 2015, 02:27:27 UTC | enabled ScaleNode::BackpropTo to use tensor lib; merged with master | 24 December 2015, 02:27:27 UTC |
3f721d1 | Alexey Kamenev | 24 December 2015, 00:59:18 UTC | Updated ResNet sample. | 24 December 2015, 00:59:18 UTC |
87096d4 | Alexey Kamenev | 24 December 2015, 00:44:11 UTC | Added ResNet ImageNet samples. | 24 December 2015, 00:44:11 UTC |
44e7343 | Frank Seide | 23 December 2015, 23:29:34 UTC | merged from master | 23 December 2015, 23:29:34 UTC |
28ffb10 | Frank Seide | 23 December 2015, 23:20:40 UTC | changed TensorShape editing functions to in-place, to avoid more mem copying; disabled the mapping of ScaleNode, RowElementTimesNode, and ColumnElementTimesNode for now because we see a perf hit with Scale. Reenable once that is solved | 23 December 2015, 23:20:40 UTC |
a590e5e | Frank Seide | 23 December 2015, 22:44:24 UTC | more optimizations of PrepareTensorOperands() aimed at reducing memory copies and mallocs | 23 December 2015, 22:44:24 UTC |
fd9e792 | Alexey Kamenev | 23 December 2015, 21:49:20 UTC | Updated ResNet sample. | 23 December 2015, 21:59:31 UTC |
02f1f56 | Alexey Kamenev | 23 December 2015, 21:44:36 UTC | Updated samples, removed bogus restriction from conv and pool nodes. | 23 December 2015, 21:46:10 UTC |
05af287 | Alexey Kamenev | 23 December 2015, 17:44:10 UTC | Updated image samples. | 23 December 2015, 21:46:00 UTC |
0e880c2 | Frank Seide | 23 December 2015, 21:38:13 UTC | some simplification of tensor-shape building, but no measurable speed impact; GetSampleShape() changed to GetAndValidateSampleLayout() which no longer makes up TensorShapes but rather enforces that m_sampleLayout is consistent and plausibly set up. Avoids a copy | 23 December 2015, 21:38:13 UTC |
42a027f | Jasha Droppo | 23 December 2015, 20:45:59 UTC | Merge branch 'master' into CUDA-elementwise-rework Conflicts: Source/Math/GPUMatrix.cu Source/Math/GPUSparseMatrix.cu | 23 December 2015, 20:45:59 UTC |
ecbc649 | Jasha Droppo | 23 December 2015, 20:36:38 UTC | CUDA making sure older sigmoid kernel is used for AssignSigmoidOf() | 23 December 2015, 20:36:38 UTC |
788b1d3 | Frank Seide | 23 December 2015, 19:20:16 UTC | undid last stdafx.h uncomment, did not work | 23 December 2015, 19:20:16 UTC |
7c9a991 | Frank Seide | 23 December 2015, 17:18:07 UTC | commented out all #include "stdafx.h" as it caused build errors without a meaningful error message that would point out which file is wrong | 23 December 2015, 17:18:07 UTC |
81eeff1 | Frank Seide | 23 December 2015, 08:39:38 UTC | added a #ifndef _CRT_SECURE_NO_WARNINGS to several stdafx.h | 23 December 2015, 08:39:38 UTC |
af84f92 | Frank Seide | 23 December 2015, 08:30:59 UTC | bug fix: ElementTimes::BackpropTo() in the TensorView prototype did not mask gaps correctly | 23 December 2015, 08:30:59 UTC |
279a653 | Frank Seide | 23 December 2015, 08:17:01 UTC | added a missing _CRT_SECURE_NO_WARNINGS | 23 December 2015, 08:17:01 UTC |
e551995 | Frank Seide | 23 December 2015, 08:05:09 UTC | updated NoGPU.cpp re TensorOp()/SmallVector | 23 December 2015, 08:05:09 UTC |
8856f49 | Frank Seide | 23 December 2015, 07:58:49 UTC | all tensor ops are now available in three variants: DoOpOf(), AssignOpOf(), and AddOpOf(); SigmoidNode prototype with tensor lib for gradient as well | 23 December 2015, 07:58:49 UTC |
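The three-variant convention this commit describes can be sketched on scalars as follows. This is an illustrative sketch, not CNTK's actual API: the general Do...Of() form takes beta/alpha scaling, and Assign...Of() (beta = 0, overwrite) and Add...Of() (beta = 1, accumulate) fall out as special cases.

```cpp
#include <cassert>

// General form: result = beta * result + alpha * op(input).
// Names (DoOpOf, AssignOpOf, AddOpOf) are illustrative placeholders.
template <typename ElemType, typename Op>
void DoOpOf(ElemType& result, ElemType beta, ElemType input, ElemType alpha, Op op)
{
    result = beta * result + alpha * op(input);
}

// AssignOpOf overwrites the result (beta = 0).
template <typename ElemType, typename Op>
void AssignOpOf(ElemType& result, ElemType input, Op op)
{
    DoOpOf(result, (ElemType)0, input, (ElemType)1, op);
}

// AddOpOf accumulates into the result (beta = 1).
template <typename ElemType, typename Op>
void AddOpOf(ElemType& result, ElemType input, Op op)
{
    DoOpOf(result, (ElemType)1, input, (ElemType)1, op);
}
```

With this layering, only the Do...Of() form needs a real kernel; the other two are thin wrappers.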
5040ff7 | Frank Seide | 23 December 2015, 07:18:24 UTC | reimplemented SmallVector without dynamic memory allocation--10% faster for small minibatches, same speed as without tensor lib | 23 December 2015, 07:18:24 UTC |
beda4ef | Frank Seide | 23 December 2015, 06:25:50 UTC | changed TensorShape to use new class SmallVector<> instead of std::vector, which is meant to avoid dynamic memory allocations | 23 December 2015, 06:25:50 UTC |
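The idea behind the two SmallVector commits above can be sketched like this. This is a minimal illustration, not CNTK's actual class: elements live in an inline fixed-capacity array, so push_back() never allocates from the heap (tensor shapes have few dimensions, so a small fixed capacity suffices).

```cpp
#include <cstddef>
#include <stdexcept>

// Illustrative fixed-capacity vector; capacity value is an assumption.
template <typename T, size_t Capacity = 12>
class SmallVector
{
    T m_data[Capacity]; // inline storage: no dynamic memory allocation
    size_t m_size = 0;

public:
    void push_back(const T& v)
    {
        if (m_size == Capacity)
            throw std::logic_error("SmallVector: capacity exceeded");
        m_data[m_size++] = v;
    }
    size_t size() const { return m_size; }
    T& operator[](size_t i) { return m_data[i]; }
    const T& operator[](size_t i) const { return m_data[i]; }
};
```

Trading a hard capacity limit for allocation-free operation is what made small minibatches ~10% faster per the commit message.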
283f22d | Frank Seide | 23 December 2015, 05:36:54 UTC | new CUDA-efficient Sigmoid() implementation as suggested by Jasha | 23 December 2015, 05:36:54 UTC |
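A common shape for the kind of CUDA-friendly sigmoid this commit alludes to is to branch on the sign of x so that exp() is only ever called with a non-positive argument, avoiding overflow for large |x|. This is a hedged sketch of that general technique, not necessarily the exact formulation used here:

```cpp
#include <cmath>

// Numerically stable sigmoid: both branches keep the exp() argument <= 0,
// so exp() can underflow to 0 but never overflow.
// Name StableSigmoid is illustrative, not CNTK's identifier.
inline double StableSigmoid(double x)
{
    if (x >= 0)
        return 1.0 / (1.0 + exp(-x)); // exp(-x) <= 1 here
    double e = exp(x);                // x < 0, so e is in (0, 1)
    return e / (1.0 + e);
}
```

The sign branch is also what the follow-up commit's thresholding remark is about: where the threshold sits determines how many threads in a warp take each side of the branch.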
fb3dae2 | Frank Seide | 23 December 2015, 04:28:21 UTC | prototypical implementation of Sigmoid ForwardProp() using the tensor lib; special optimization for linear-scan unary tensor operations (most frequent case); adjusted the threshold for the Sigmoid() tensor op to reduce the probability of thread divergence (thresholding at 0 will give a 50:50 split, doubling runtime). Still an overall 10% slower than the old kernel, not clear why | 23 December 2015, 04:28:21 UTC |
cd87741 | Frank Seide | 23 December 2015, 01:37:11 UTC | merged with master | 23 December 2015, 01:37:11 UTC |
d717aa0 | Frank Seide | 23 December 2015, 01:34:10 UTC | undid previous heuristic, and instead changed the condition in RowSliceNode to not fail if the input is really a vector in image disguise (as one would load from old model files); made gcc happy, suddenly it no longer liked to match the template of TensorOpN() | 23 December 2015, 01:34:10 UTC |
0eed21c | Yongqiang Wang | 22 December 2015, 23:49:02 UTC | Fix a bug in RowSlice node (support legacy model format) | 22 December 2015, 23:49:02 UTC |
bdfcf91 | Frank Seide | 22 December 2015, 23:04:52 UTC | added a heuristic to TensorShape::Load() that allows reading old models created when sample layouts were not fully consistent | 22 December 2015, 23:04:52 UTC |
644c470 | Frank Seide | 22 December 2015, 22:35:19 UTC | new class GridDim to centralize computation of grid dimensions | 22 December 2015, 22:35:19 UTC |
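The computation such a class centralizes is the standard round-up of an element count to a whole number of thread blocks. A minimal sketch, assuming a fixed block size (the member names and the 512-thread block size are illustrative, not necessarily what the real GridDim uses):

```cpp
#include <cstddef>

// Illustrative sketch of centralized CUDA launch-dimension math.
struct GridDim
{
    static const int maxThreadsPerBlock = 512; // assumed block size

    int m_blocksPerGrid;
    int m_threadsPerBlock;

    explicit GridDim(size_t N)
    {
        m_threadsPerBlock = maxThreadsPerBlock;
        // ceil(N / threadsPerBlock): round up so every element gets a thread
        m_blocksPerGrid = (int)((N + m_threadsPerBlock - 1) / m_threadsPerBlock);
    }
};
```

Keeping this in one place avoids each kernel launch repeating (and occasionally getting wrong) the same ceiling-division idiom.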
58b8afb | Alexey Kamenev | 22 December 2015, 21:58:25 UTC | Updated AlexNet and VGG sample configs. | 22 December 2015, 21:59:40 UTC |
9602bdd | Alexey Kamenev | 22 December 2015, 21:52:40 UTC | Updated image samples. | 22 December 2015, 21:54:18 UTC |
e8c4e8f | Frank Seide | 22 December 2015, 20:47:01 UTC | minor optimization of tensor CUDA kernel | 22 December 2015, 20:47:01 UTC |
12a0c2e | Frank Seide | 22 December 2015, 09:49:09 UTC | made gcc happy (two-phase name lookup) | 22 December 2015, 09:49:09 UTC |
85c186f | Frank Seide | 22 December 2015, 09:46:50 UTC | (comments) | 22 December 2015, 09:46:50 UTC |
71cb56c | Frank Seide | 22 December 2015, 09:43:59 UTC | all BackpropTo() overloads from derived classes of NonlinearityNodeBase are now removed, code is cleaner and regular; brought back Validate() of DropoutNode, why did it go missing? | 22 December 2015, 09:43:59 UTC |
33e58f3 | Frank Seide | 22 December 2015, 09:32:21 UTC | further unified BackpropToV(), close to being ready to remove the dups | 22 December 2015, 09:32:21 UTC |
3ecbc8a | Frank Seide | 22 December 2015, 09:13:20 UTC | removed ScaleNode, RowElementTimesNode, and ColumnElementTimesNode if ENABLE_TENSORVIEW is #defined | 22 December 2015, 09:13:20 UTC |
b297b6d | Frank Seide | 22 December 2015, 09:01:00 UTC | made NonlinearityNodes.h a little more regular and removed some left-overs | 22 December 2015, 09:01:00 UTC |
5301b18 | Frank Seide | 22 December 2015, 07:43:59 UTC | prototypically implemented ScaleNode, RowElementTimesNode, and ColumnElementTimesNode just as ElementTimesNode with tensor lib (broadcasting), by hacking the name mapping | 22 December 2015, 07:43:59 UTC |
d17a011 | Frank Seide | 22 December 2015, 07:10:52 UTC | implemented MinusNode and ElementTimesNode with tensor lib; changed aggregation in tensor lib to use double instead of ElemType; replaced ValueForToDense() by doing the same thing in BinaryElementwiseNode::BeginForwardProp() | 22 December 2015, 07:10:52 UTC |
90387f0 | Frank Seide | 22 December 2015, 06:36:15 UTC | moved MinusNode and ElementTimesNode to derive from BinaryElementwiseNode; added missing 'override's | 22 December 2015, 06:36:15 UTC |
757509c | Frank Seide | 22 December 2015, 06:14:13 UTC | created a new standard base class BinaryElementwiseNode, to implement PlusNode and friends. PlusNode first got the treatment, others will follow | 22 December 2015, 06:14:13 UTC |
3369234 | Frank Seide | 22 December 2015, 06:00:05 UTC | new #define ENABLE_TENSORVIEW; new methods ComputationNode::ValueTensorFor() and GradientTensorFor(). Makes PlusNode much easier | 22 December 2015, 06:00:05 UTC |
782a100 | U-FAREAST\fseide | 22 December 2015, 03:03:47 UTC | merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 22 December 2015, 03:03:47 UTC |
dfd7b39 | Frank Seide | 22 December 2015, 03:02:27 UTC | first prototype of gradient using tensor lib | 22 December 2015, 03:02:27 UTC |
152054c | Frank Seide | 22 December 2015, 02:10:23 UTC | bug fix in InputValueBase: now initializes and deserializes tensor dimensions correctly; bug fix: RowSliceNode now tests Input(0) layout directly since we no longer infer the input layout | 22 December 2015, 02:10:23 UTC |
24df1c1 | Frank Seide | 22 December 2015, 01:26:25 UTC | partial cleanup of sample layouts: ComputationNodeBase::GetSampleShape() now uses m_sampleLayout; added an overload for ComputationNode::SetDims() that takes a TensorShape and sets both m_sampleLayout and m_numRows from it; changed all sample layouts that were set/inferred to something resembling a vector to a 1-dim tensor. This may break code, as the old code would put the vector dimension into the third index. I presume that the old behavior is the bug; bug fix in RowSlice: it inferred the tensor dimension as if we were slicing the first tensor dimension, which we are not. RowSlice() now only allows actual vectors as inputs | 22 December 2015, 01:26:25 UTC |
49ab831 | Frank Seide | 21 December 2015, 23:54:08 UTC | refactored the GetTensorsForwardBinary() prototype implementation to be more general; prototypical implementation of gap masking with the new tensor lib (not enabled yet) | 21 December 2015, 23:54:08 UTC |
8add7ee | Frank Seide | 21 December 2015, 21:58:50 UTC | cleaned up DropoutNode | 21 December 2015, 21:58:50 UTC |
1527b89 | Philipp Kranen | 21 December 2015, 10:30:43 UTC | Adapted Readme files wrt new directory structure | 21 December 2015, 10:30:43 UTC |
9cf0f3d | Frank Seide | 19 December 2015, 08:15:55 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 19 December 2015, 08:15:55 UTC |
2fc5fa9 | Frank Seide | 19 December 2015, 08:15:32 UTC | bug fix: ComputationNode::GetTensorsForwardBinary() made a copy instead of taking a reference to the matrix | 19 December 2015, 08:15:32 UTC |
835b319 | Amit Agarwal | 19 December 2015, 07:03:20 UTC | Use only 2 threads for CPUMatrix::SetValue as this is a memory bound operation and use of large number of threads does not help but instead hurts in case of parallel training where this causes oversubscription. | 19 December 2015, 07:03:20 UTC |
d1752e1 | Frank Seide | 19 December 2015, 07:01:22 UTC | log messages to see where it fails in Jenkins | 19 December 2015, 07:01:22 UTC |
775458f | Frank Seide | 19 December 2015, 05:36:56 UTC | bug fix: wrong index bounds in TensorOpReduce | 19 December 2015, 05:36:56 UTC |
61df8fe | Frank Seide | 19 December 2015, 04:35:16 UTC | fixed a warning in Windows Release build; fixed CPUONLY build (TensorOp() missing in NoGPU.cpp) | 19 December 2015, 04:35:16 UTC |
4da5273 | Frank Seide | 19 December 2015, 04:02:29 UTC | disabled the prototypical use of the new tensor addition in the PlusNode again | 19 December 2015, 04:02:29 UTC |
b98855a | Frank Seide | 19 December 2015, 03:59:26 UTC | tensor library works in the PlusNode prototype implementation, which reduces ForwardProp() to two lines of code; bug fix: GetTensorsForwardBinary() ignored #samples in FrameRange; bug fix: TensorView cannot hold a reference to a Matrix object since these are often temporaries. Instead, we copy a Matrix object that itself is a reference; deleted orphaned ClassDiagram.cd from the Math project | 19 December 2015, 03:59:26 UTC |
9fe02f9 | Frank Seide | 19 December 2015, 03:59:26 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 19 December 2015, 03:59:26 UTC |
befdf9d | Frank Seide | 19 December 2015, 02:17:41 UTC | GPU version of tensor ops passes the simple tests now | 19 December 2015, 02:17:41 UTC |
57ddd12 | Frank Seide | 19 December 2015, 01:59:44 UTC | CUDA implementation for tensor ops complete, but not computing the right thing | 19 December 2015, 01:59:44 UTC |
ef80d86 | Alexey Kamenev | 18 December 2015, 23:41:59 UTC | Updated Linux baselines. | 18 December 2015, 23:42:35 UTC |
cf7fc9f | Alexey Kamenev | 18 December 2015, 23:17:49 UTC | Updated baselines for cuDNN. | 18 December 2015, 23:42:35 UTC |
b2fb4d0 | Frank Seide | 18 December 2015, 23:00:05 UTC | added plumbing for tensor operations to GPUMatrix.cu, but have not implemented any yet; implemented row-major conversion in GPUMatrix::SetValue() (on the CPU, so not maximally efficient) | 18 December 2015, 23:00:05 UTC |
3a6ea21 | Frank Seide | 18 December 2015, 21:31:26 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 18 December 2015, 21:31:26 UTC |
c343e98 | Frank Seide | 18 December 2015, 21:30:33 UTC | further optimized the most frequent tensor loops (1-stride loops for unary and binary ops), but still not seeing 4-way SSE parallelism | 18 December 2015, 21:30:33 UTC |
9d33fc1 | Frank Seide | 18 December 2015, 18:55:30 UTC | added a specialization of a tensor op for inner dimensions where all strides are 1. Seems not quite enough for really efficient unrolling though | 18 December 2015, 18:55:30 UTC |
1a1bd17 | Frank Seide | 18 December 2015, 18:01:17 UTC | bug fix: ComputationNode::DetermineNumCols() was an outdated pre-refactoring hold-over with a now-incorrect validity check. Can just be removed; should fix the issue reported by user xiaoqing; removed unnecessary and inconsistent use of 'this->' throughout Matrix.cpp, also fixed some bad indentation | 18 December 2015, 18:01:17 UTC |
1f26215 | Alexey Kamenev | 17 December 2015, 20:31:17 UTC | Fixed mbStart in ImageReader for distributed case. | 18 December 2015, 17:29:40 UTC |
b8de2fe | Alexey Kamenev | 17 December 2015, 02:52:17 UTC | Added support for distributed reading in ImageReader. | 18 December 2015, 17:29:30 UTC |
6e20025 | Frank Seide | 18 December 2015, 16:57:49 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 18 December 2015, 16:57:49 UTC |
91eadb0 | Frank Seide | 18 December 2015, 16:54:19 UTC | moved all tensor ops to a new header TensorOps.h so they can be shared between matrix types; also moved the float/double-unified math overloads (e.g. exp_()) there, as well as additional typically needed functions such as Sigmoid() | 18 December 2015, 16:54:19 UTC |
679c3c5 | Mark Hillebrand | 15 December 2015, 12:39:43 UTC | Source/Readers/LMSequenceReader/: also build SequenceWriter on Linux | 18 December 2015, 11:59:40 UTC |
12e312f | Frank Seide | 18 December 2015, 08:17:45 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 18 December 2015, 08:17:45 UTC |
f54e1fe | Frank Seide | 18 December 2015, 08:07:59 UTC | implemented unary and ternary tensor ops. CPU implementation of elementwise tensor ops is feature complete (but may require optimization) | 18 December 2015, 08:07:59 UTC |
7d32cdf | Frank Seide | 18 December 2015, 06:41:19 UTC | implemented all binary tensor operators (don't we love macros!) | 18 December 2015, 06:41:19 UTC |
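The macro trick this commit jokes about typically looks like the following sketch: one macro stamps out a uniform function per elementwise binary op, so the tensor loop can be written once and instantiated per op. The macro and function names here are illustrative, not CNTK's actual identifiers.

```cpp
// One macro definition per op keeps the op table uniform and terse.
#define DefBinaryOp(name, expr)               \
    template <typename ElemType>              \
    ElemType Op##name(ElemType a, ElemType b) \
    {                                         \
        return expr;                          \
    }

DefBinaryOp(Sum, a + b);
DefBinaryOp(Difference, a - b);
DefBinaryOp(ElementwiseProduct, a * b);
DefBinaryOp(Max, a > b ? a : b);
```

A generic tensor kernel can then dispatch on an op enum and call the matching Op...() function, keeping per-op code to a single line.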
83e5bbc | Qiwei Ye | 18 December 2015, 04:38:39 UTC | Revert "Revert "adding an MPI init test in case of that MPI was initialized repeatedly"" This reverts commit 23ebe452a5e35dddfba2d08e8fb3265901bfc8af. | 18 December 2015, 04:38:39 UTC |
928da88 | Frank Seide | 18 December 2015, 01:04:55 UTC | first version of CPU implementation of TensorView::DoSumOf() working now | 18 December 2015, 01:04:55 UTC |
38cb2fa | Frank Seide | 18 December 2015, 00:14:54 UTC | bug fix in MBLayout: We should not guard against all parallel sequences having a gap at a time step, as that happens in truncated BPTT, and it would be much more complex to fix the reader, so we allow it | 18 December 2015, 00:14:54 UTC |
aa5d1a7 | Frank Seide | 17 December 2015, 23:50:00 UTC | implemented plumbing and first shot for TensorView operation with reduction | 17 December 2015, 23:50:00 UTC |
8588af4 | Frank Seide | 17 December 2015, 19:35:54 UTC | Merge branch 'master' of https://git.codeplex.com/cntk into fseide/tensors | 17 December 2015, 19:35:54 UTC |
e6040d0 | Frank Seide | 17 December 2015, 19:35:28 UTC | made Linux build happy (missing explicit method template specialization of CPUMatrix<char>::Resize()) | 17 December 2015, 19:35:28 UTC |
bb6fc1b | Frank Seide | 17 December 2015, 19:33:52 UTC | optimized MBLayout::InitAsFrameMode(), replacing calls to AddSequence() with a much simpler direct initialization for this special case; added editing functions to TensorShape, and rewrote TensorView::DoBinaryOpOf() to use them | 17 December 2015, 19:33:52 UTC |