https://github.com/Microsoft/CNTK
Revision 93cc8c40dbacfabd1d59059da2e1af7d8719a6d9 authored by Alexey Reznichenko on 18 July 2017, 15:56:32 UTC, committed by Alexey Reznichenko on 25 July 2017, 08:11:25 UTC
  * Refactor index data structures and rewrite indexers (with most changes
    in the text index builder).
  * Add best-effort caching: the cache is written out asynchronously in a
    separate thread; on restart, the index builder tries to restore the
    index from the cache (as long as the cache is not older than the input
    file) and falls back to normal indexing if that fails (i.e., the cache
    is corrupt).
  * Refactor and simplify MemoryBuffer (renamed to BufferedFileReader).
  * Use KMP pattern matching to simplify sample counting with a non-empty
    main stream (number of samples in a sequence = number of lines that
    contain the main stream name).
  * Refactor and simplify MLFIndexBuilder (it now also uses
    BufferedFileReader).
  * Use 512KB chunks when loading index from cache for faster reading.
  * Add a number of unit tests for the indexing both with and without
    caching.
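
The sample-counting rule above (samples in a sequence = lines containing the main stream name) could be sketched as follows. This is a minimal illustration only, not the code from this revision: the function names `BuildFailureTable` and `CountLinesContaining` are hypothetical, and the sketch assumes the whole sequence is available as one in-memory string and that the stream name contains no newline.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Builds the KMP failure table: fail[i] is the length of the longest proper
// prefix of pattern[0..i] that is also a suffix of it.
std::vector<size_t> BuildFailureTable(const std::string& pattern)
{
    std::vector<size_t> fail(pattern.size(), 0);
    for (size_t i = 1, k = 0; i < pattern.size(); ++i)
    {
        while (k > 0 && pattern[i] != pattern[k])
            k = fail[k - 1];
        if (pattern[i] == pattern[k])
            ++k;
        fail[i] = k;
    }
    return fail;
}

// Counts the lines of `text` that contain `pattern` at least once.
// Each line contributes at most one to the count, however many matches it has.
size_t CountLinesContaining(const std::string& text, const std::string& pattern)
{
    if (pattern.empty())
        return 0;

    const std::vector<size_t> fail = BuildFailureTable(pattern);
    size_t count = 0;
    size_t matched = 0;       // number of pattern characters matched so far
    bool lineCounted = false; // true once the current line has been counted

    for (char c : text)
    {
        if (c == '\n')
        {
            // A new line starts: reset the matcher state.
            matched = 0;
            lineCounted = false;
            continue;
        }
        if (lineCounted)
            continue; // skip the rest of a line that already matched

        while (matched > 0 && c != pattern[matched])
            matched = fail[matched - 1];
        if (c == pattern[matched])
            ++matched;
        if (matched == pattern.size())
        {
            ++count;
            lineCounted = true;
            matched = 0;
        }
    }
    return count;
}
```

For example, with a hypothetical main stream name `|x`, the three-line input `"0 |x 1 2\n0 |y 3\n1 |x 4 |x 5\n"` has two lines that contain the name, so the call returns 2. The scan visits every character once, so the cost stays linear in the size of the buffer.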
1 parent f217bac
Add index caching
File Mode Size
Dependencies
Documentation
Examples
Manual
Scripts
Source
Tests
Tools
Tutorials
bindings
.clang-format -rw-r--r-- 931 bytes
.gitattributes -rw-r--r-- 3.1 KB
.gitignore -rw-r--r-- 6.1 KB
.gitmodules -rw-r--r-- 211 bytes
CNTK.Common.props -rw-r--r-- 1.5 KB
CNTK.Cpp.props -rw-r--r-- 9.3 KB
CNTK.sln -rw-r--r-- 194.7 KB
CONTRIBUTING.md -rw-r--r-- 210 bytes
CppCntk.vssettings -rw-r--r-- 10.0 KB
LICENSE.md -rw-r--r-- 4.6 KB
Makefile -rw-r--r-- 56.8 KB
README.md -rw-r--r-- 4.7 KB
configure -rwxr-xr-x 34.5 KB
