https://github.com/facebook/rocksdb

sort by:
Revision Author Date Message Commit Date
2c1ef06 Flush before Fsync()/Sync() Summary: Calling Fsync()/Sync() on a file should give the guarantee that whatever you written to the file is now persisted. This is currently not the case, since we might have some data left in application cache as we do Fsync()/Sync(). For example, BuildTable() calls Fsync() without the flush, assuming all sst data is now persisted, but it's actually not. This may result in big inconsistencies. Test Plan: no test Reviewers: sdong, dhruba, haobo, ljin, yhchiang Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D18159 22 April 2014, 01:08:49 UTC
25fef46 RocksDB 2.8 to be able to read files generated by 2.6 Summary: From 2.6 to 2.7, property block name is renamed from rocksdb.stats to rocksdb.properties. Older properties were not able to be loaded. In 2.8, we seem to have added some logic that uses property block without checking null pointers, which create segment faults. In this patch, we fix it by: (1) try rocksdb.stats if rocksdb.properties is not found (2) add some null checking before consuming rep->table_properties Test Plan: make sure a file generated in 2.7 couldn't be opened now can be opened. Reviewers: haobo, igor, yhchiang Reviewed By: igor CC: ljin, xjin, dhruba, kailiu, leveldb Differential Revision: https://reviews.facebook.net/D17961 17 April 2014, 16:57:24 UTC
f01a04e Update HISTORY.md Summary: Update HISTORY.md to make existing items to 2.8 release and add something I think is missing. Test Plan: N/A Reviewers: haobo, igor, ljin, dhruba, yhchiang, xjin Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17517 05 April 2014, 00:30:55 UTC
acdc6a1 relax backupable db rate limit tests 04 April 2014, 23:27:47 UTC
bcd1f15 Remove -Wno-unused-const-variable 04 April 2014, 23:15:47 UTC
ea0198f Create log::Writer out of DB Mutex Summary: Our measurement shows that sometimes new log::Write's constructor can take hundreds of milliseconds. It's unclear why but just simply move it out of DB mutex. Test Plan: make all check Reviewers: haobo, ljin, igor Reviewed By: haobo CC: nkg-, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17487 04 April 2014, 22:46:28 UTC
c90d446 make hash_link_list Node's key space consecutively followed at the end Summary: per sdong's request, this will help processor prefetch on n->key case. Test Plan: make all check Reviewers: sdong, haobo, igor Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D17415 04 April 2014, 22:37:28 UTC
318eace Dynamically choose SSE 4.2 Summary: Otherwise, if we compile on machine with SSE4.2 support and run it on machine without the support, we will fail. Test Plan: compiles, verified that isSse42() gets called. Reviewers: dhruba Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17505 04 April 2014, 21:03:19 UTC
51023c3 Make RocksDB compile for iOS Summary: I had to make number of changes to the code and Makefile: * Add `make lib`, that will create static library without debug info. We need this to avoid growing binary too much. Currently it's 14MB. * Remove cpuinfo() function and use __SSE4_2__ macro. We actually used the macro as part of Fast_CRC32() function. As a result, I also accidentally fixed this issue: https://www.facebook.com/groups/rocksdb.dev/permalink/549700778461774/?stream_ref=2 * Remove __thread locals in OS_MACOSX Test Plan: `make lib PLATFORM=IOS` Reviewers: ljin, haobo, dhruba, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17475 04 April 2014, 20:11:44 UTC
32b2c1a Merge branch 'jni' 04 April 2014, 19:52:40 UTC
99c756f Flush Buffered Info Logs Before Doing Compaction (one line change) Summary: Flushing log buffer earlier to avoid confusion of time holding the locks. Test Plan: Should be safe as long as several related db test passes Reviewers: haobo, igor, ljin Reviewed By: igor CC: nkg-, leveldb Differential Revision: https://reviews.facebook.net/D17493 04 April 2014, 17:58:30 UTC
ef7dc38 Fix some other signed & unsigned comparisons Summary: Fix some signed and unsigned comparisons to make some other build script happy. Test Plan: Build and run those changed tests Reviewers: ljin, igor, haobo Reviewed By: igor CC: yhchiang, dhruba, kailiu, leveldb Differential Revision: https://reviews.facebook.net/D17463 04 April 2014, 03:51:27 UTC
3699fda Merge branch 'master' into jni 04 April 2014, 00:14:10 UTC
040657a Fix MacOS errors 03 April 2014, 23:04:34 UTC
2fa5d41 [RocksDB] make SetPerfLevel affect only the current thread Summary: as title, make it easy to turn on/off profiling at per thread level. Test Plan: make check Reviewers: sdong, ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D17469 03 April 2014, 22:09:17 UTC
f76e402 initialize candidate count 03 April 2014, 18:45:44 UTC
f5469b1 Merge branch 'master' of github.com:facebook/rocksdb into ignore 03 April 2014, 17:55:23 UTC
47ccf71 Include java related output files in .gitignore Summary: Include java related output files in .gitignore Test Plan: make jni git status Reviewers: ljin, igor, sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D17457 03 April 2014, 17:49:40 UTC
b9767d0 Move several more logging inside DB mutex to log buffer Summary: Move several some common logging still in DB mutex to log buffer. Test Plan: make all check Reviewers: haobo, igor, ljin, nkg- Reviewed By: nkg- CC: nkg-, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17439 03 April 2014, 17:47:18 UTC
c0b9fa8 Add script auto_sanity_test.sh to perform auto sanity test Summary: Add script auto_sanity_test.sh to perform auto sanity test usage: auto_sanity_test.sh [new_commit] [old_commit] Running without commit parameter will do the sanity test with the latest and the latest 10 commit. Test Plan: ./auto_sanity_test.sh Reviewers: haobo, igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17397 03 April 2014, 17:21:46 UTC
078365b Merge pull request #108 from tecbot/c-api-enhancements C-API enhancements 03 April 2014, 17:10:54 UTC
98422cb [C-API] implemented more options 03 April 2014, 08:47:37 UTC
3a30b5b [C-API] added "rocksdb_options_set_plain_table_factory" to make it possible to use plain table factory 03 April 2014, 08:47:37 UTC
e351184 [JNI] Avoid a potential byte-array-copy btw c++ and java in RocksDB.get(byte[], byte[]). Summary: Avoid a JNI call to GetByteArrayElements, which may introduce a byte-array-copy. Test Plan: make jtest Reviewers: haobo, sdong, dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17451 03 April 2014, 07:02:01 UTC
92d2766 [JNI] Improve the internal interface between java and c++ for basic db operations. Summary: Improve the internal interface between java and c++ for basic db operations by including the RocksDB native handle (i.e., c++ pointer of rocksdb::DB) as a input parameter of the internal interface. This improvement reduces one JNI call per db operation from c++. Test Plan: make test Reviewers: haobo, sdong, dhruba Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D17445 03 April 2014, 05:23:04 UTC
48bc0c6 [RocksDB] Fix a race condition in GetSortedWalFiles Summary: This patch fixed a race condition where a log file is moved to archived dir in the middle of GetSortedWalFiles. Without the fix, the log file would be missed in the result, which leads to transaction log iterator gap. A test utility SyncPoint is added to help reproducing the race condition. Test Plan: TransactionLogIteratorRace; make check Reviewers: dhruba, ljin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17121 03 April 2014, 05:12:29 UTC
d1d19f5 Fix valgrind error in c_test 03 April 2014, 00:24:30 UTC
158845b Move a info logging out of DB Mutex Summary: As we know, logging can be slow, or even hang for some file systems. Move one more logging out of DB mutex. Test Plan: make all check Reviewers: haobo, igor, ljin Reviewed By: igor CC: yhchiang, nkg-, leveldb Differential Revision: https://reviews.facebook.net/D17427 02 April 2014, 23:48:32 UTC
c9622aa Merge pull request #107 from alberts/fastah crc32: build a whole special Extend function for SSE 4.2. 02 April 2014, 23:00:22 UTC
56ca75e crc32: build a whole special Extend function for SSE 4.2. Disassembling the Extend function shows something that looks much more healthy now. The SSE 4.2 instructions are right there in the body of the function. Intel(R) Core(TM) i7-3540M CPU @ 3.00GHz Before: crc32c: 1.305 micros/op 766260 ops/sec; 2993.2 MB/s (4K per op) After: crc32c: 0.442 micros/op 2263843 ops/sec; 8843.1 MB/s (4K per op) 02 April 2014, 22:15:57 UTC
4af1954 Compaction Filter V1 to use old context struct to keep backward compatible Summary: The previous change D15087 changed existing compaction filter, which makes the commonly used class not backward compatible. Revert the older interface. Use a new interface for V2 instead. Test Plan: make all check Reviewers: haobo, yhchiang, igor CC: danguo, dhruba, ljin, igor, leveldb Differential Revision: https://reviews.facebook.net/D17223 02 April 2014, 21:57:51 UTC
da0887a [JNI] Add java api and java tests for WriteBatch and WriteOptions, add put() and remove() to RocksDB. Summary: * Add java api for rocksdb::WriteBatch and rocksdb::WriteOptions, which are necessary components for running benchmark. * Add java test for org.rocksdb.WriteBatch and org.rocksdb.WriteOptions. * Add remove() to org.rocksdb.RocksDB, and add put() and remove() to RocksDB which take org.rocksdb.WriteOptions. Test Plan: make jtest Reviewers: haobo, sdong, dhruba Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D17373 02 April 2014, 21:49:20 UTC
284c365 Fix valgrind error caused by FileMetaData as two level iterator's index block handle Summary: It is a regression valgrind bug caused by using FileMetaData as index block handle. One of the fields of FileMetaData is not initialized after being contructed and copied, but I'm not able to find which one. Also, I realized that it's not a good idea to use FileMetaData as in TwoLevelIterator::InitDataBlock(), a copied FileMetaData can be compared with the one in version set byte by byte, but the refs can be changed. Also comparing such a large structure is slightly more expensive. Use a simpler structure instead Test Plan: Run the failing valgrind test (Harness.RandomizedLongDB) make all check Reviewers: igor, haobo, ljin Reviewed By: igor CC: yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17409 02 April 2014, 21:38:28 UTC
8c4a3bf Add a java api for rocksdb::Options, currently only supports create_if_missing. Summary: * [java] Add a java api for rocksdb::Options, currently only supports create_if_missing. * [java] Add a test for RocksDBException in RocksDBSample. Test Plan: make jtest Reviewers: haobo, sdong Reviewed By: haobo CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D17385 01 April 2014, 23:59:05 UTC
e0a87c4 DBIter to use static allocated char array for saved_key_ (if it is not too long) Summary: DBIter now uses a std::string for saved_key. Based on some profiling, it could be more expensive than we though. Optimize it with the same technique as LookupKey -- if it is short, we copy it to a static allocated char. Otherwise, dynamically allocate memory for it. Test Plan: make all check Reviewers: haobo, ljin Reviewed By: haobo CC: dhruba, igor, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17289 01 April 2014, 23:43:11 UTC
807b2c2 reduce thread count in ThreadLocalTest.ConcurrentReadWriteTest Summary: to make it less CPU intensive Test Plan: ran it Reviewers: igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17403 01 April 2014, 23:26:07 UTC
d50619a PlainTableIterator::Seek() shouldn't check bloom filter in total order mode Summary: In total order mode, iterator's seek() shouldn't check total order. Also some cleaning up about checking null for shared pointers. I don't know the behavior before it. This bug was reported by @igor. Test Plan: test plain_table_db_test Reviewers: ljin, haobo, igor Reviewed By: igor CC: yhchiang, dhruba, igor, leveldb Differential Revision: https://reviews.facebook.net/D17391 01 April 2014, 22:05:16 UTC
442e1bc Merge pull request #105 from tecbot/c-api-prefix [C-API] support prefix seeks 01 April 2014, 18:41:21 UTC
fa84eb1 Fixed a compile error which tries to check whether a size_t < 0 in env_posix.cc Summary: Fixed a compile error which tries to check whether a size_t < 0 in env_posix.cc util/env_posix.cc:180:16: error: comparison of unsigned expression < 0 is always false [-Werror,-Wtautological-compare] } while (r < 0 && errno == EINTR); ~ ^ ~ 1 error generated. Test Plan: make check all Reviewers: igor, haobo Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17379 01 April 2014, 18:09:06 UTC
38dc5ef [C-API] added the possiblity to create a HashSkipList or HashLinkedList to support prefix seeks 01 April 2014, 10:44:27 UTC
a73383e Minor fix in rocksdb jni library, RocksDB now does not implement Closeable. Summary: * [java] RocksDB now does not implement Closeable. * [java] RocksDB.close() is now synchronized. * [c++] Fix a bug in rocksjni.cc that does not release a java reference before throwing an exception. Test Plan: make jni make jtest Reviewers: haobo, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17355 01 April 2014, 04:46:10 UTC
8e81caf Fix Autoroll logger Summary: If auto roll logger can't create a new LOG file on roll (if, for example, somebody deletes rocksdb directory while rocksdb is running, khm), we'll try to call Logv on invalid address and get a SIGSEGV. This diff will fix the issue Here's the paste of the stack trace: https://phabricator.fb.com/P8276386 (fb-only) Test Plan: make check is fine, although not really testing error condition Reviewers: haobo, ljin, yhchiang Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17367 01 April 2014, 00:18:06 UTC
05080da fix db_sanity_test 01 April 2014, 00:06:53 UTC
726c808 Retry FS system calls on EINTR Summary: EINTR means 'please retry'. We don't do that currenty. We should. Test Plan: make check, although it doesn't really test the new code. we'll just have to believe in the code! Reviewers: haobo, ljin Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17349 31 March 2014, 21:45:26 UTC
577556d Don't store version number in MANIFEST Summary: Talked to <insert internal project name> folks and they found it really scary that they won't be able to roll back once they upgrade to 2.8. We should fix this. Test Plan: make check Reviewers: haobo, ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D17343 31 March 2014, 18:33:09 UTC
5ec38c3 Minor fix in rocksdb jni library. Summary: Fix a bug in RocksDBSample.java and minor code improvement in rocksjni.cc. Test Plan: make jni make jtest Reviewers: haobo, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17325 30 March 2014, 05:00:52 UTC
8a139a0 More valgrind issues! Summary: Fix some more CompactionFilterV2 valgrind issues. Maybe it would make sense for CompactionFilterV2 to delete its prefix_extractor? Test Plan: ran CompactionFilterV2* tests with valgrind. issues before patch -> no issues after Reviewers: haobo, sdong, ljin, dhruba Reviewed By: dhruba CC: leveldb, danguo Differential Revision: https://reviews.facebook.net/D17337 29 March 2014, 17:34:47 UTC
550cca7 dynamicbloom fix: don't offset address when it is already aligned Summary: this causes overflow and asan failure Test Plan: make asan_check Reviewers: igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17301 29 March 2014, 00:30:20 UTC
43a593a Change default value of some Options Summary: Since we are optimizing for server workloads, some default values are not optimized any more. We change some of those values that I feel it's less prone to regression bugs. Test Plan: make all check Reviewers: dhruba, haobo, ljin, igor, yhchiang Reviewed By: igor CC: leveldb, MarkCallaghan Differential Revision: https://reviews.facebook.net/D16995 29 March 2014, 00:09:28 UTC
2d3468c MemTableIterator not to reference Memtable Summary: In one of the perf, I shows 10%-25% CPU costs of MemTableIterator.Seek(), when doing dynamic prefix seek, are spent on checking whether we need to do bloom filter check or finding out the prefix extractor. Seems that more level of pointer checking makes CPU cache miss more likely. This patch makes things slightly simpler by copying pointer of bloom of prefix extractor into the iterator. Test Plan: make all check Reviewers: haobo, ljin Reviewed By: ljin CC: igor, dhruba, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17247 28 March 2014, 23:46:25 UTC
c8bb799 fix the buffer overflow in dynamic_bloom_test Summary: int -> uint64_t Test Plan: it think it is pretty obvious will run asan_check before committing Reviewers: igor, haobo Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17241 28 March 2014, 23:21:42 UTC
96e2c2c Geo spatial support. Summary: Test Plan: Reviewers: CC: Task ID: # Blame Rev: 28 March 2014, 23:12:50 UTC
4031b98 A GIS implementation for rocksdb. Summary: This patch stores gps locations in rocksdb. Each object is uniquely identified by an id. Each object has a gps (latitude, longitude) associated with it. The geodb supports looking up an object either by its gps location or by its id. There is a method to retrieve all objects within a circular radius centered at a specified gps location. Test Plan: Simple unit-test attached. Reviewers: leveldb, haobo Reviewed By: haobo CC: leveldb, tecbot, haobo Differential Revision: https://reviews.facebook.net/D15567 28 March 2014, 22:26:44 UTC
64ae6e9 Don't preallocate log files 28 March 2014, 22:04:43 UTC
0d463a3 Add a jni library for rocksdb which supports Open, Get, Put, and Close. Summary: This diff contains a simple jni library for rocksdb which supports open, get, put and closeusing default options (including Options, ReadOptions, and WriteOptions.) In the usual case, Java developers can use the c++ rocksdb library in the way similar to the following: RocksDB db = RocksDB.open(path_to_db); ... db.put("hello".getBytes(), "world".getBytes(); byte[] value = db.get("hello".getBytes()); ... db.close(); Specifically, this diff has the following major classes: * RocksDB: a Java wrapper class which forwards the operations from the java side to c++ rocksdb library. * RocksDBException: ncapsulates the error of an operation. This exception type is used to describe an internal error from the c++ rocksdb library. This diff also include a simple java sample code calling c++ rocksdb library. To build the rocksdb jni library, simply run make jni, and make jtest will try to build and run the sample code. Note that if the rocksdb is not built with the default glibc that Java uses, java will try to load the wrong glibc during the run time. As a result, the sample code might not work properly during the run time. Test Plan: * make jni * make jtest Reviewers: haobo, dhruba, sdong, igor, ljin Reviewed By: dhruba CC: leveldb, xjin Differential Revision: https://reviews.facebook.net/D17109 28 March 2014, 21:19:21 UTC
0d755ff cache friendly blocked bloomfilter Summary: By constraining the probes within cache line(s), we can improve the cache miss rate thus performance. This probably only makes sense for in-memory workload so defaults the option to off. Numbers and comparision can be found in wiki: https://our.intern.facebook.com/intern/wiki/index.php/Ljin/rocksdb_perf/2014_03_17#Bloom_Filter_Study Test Plan: benchmarked this change substantially. Will run make all check as well Reviewers: haobo, igor, dhruba, sdong, yhchiang Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17133 28 March 2014, 16:21:20 UTC
10cebec Fix the bug in MergeUtil which causes mixing values of different keys. Summary: Fix the bug in MergeUtil which causes mixing values of different keys. Test Plan: stringappend_test make all check Reviewers: haobo, igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17235 27 March 2014, 23:15:25 UTC
a92194e [RocksDB] Add db property "rocksdb.cur-size-active-mem-table" Summary: as title Test Plan: db_test Reviewers: sdong Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D17217 27 March 2014, 22:14:04 UTC
b14c1f9 allow mmap writes 27 March 2014, 19:00:38 UTC
5826f95 Make rate limiting unit test more robust 27 March 2014, 18:53:05 UTC
1c9f8f0 Fix valgrind issues Summary: NewFixedPrefixTransform is leaked in default options. Broken by https://github.com/facebook/rocksdb/commit/b47812fba601e23872349407d565d15f0b41a2fe Also included in the diff some code cleanup Test Plan: valgrind env_test also make check Reviewers: haobo, danguo, yhchiang Reviewed By: danguo CC: leveldb Differential Revision: https://reviews.facebook.net/D17211 27 March 2014, 15:22:59 UTC
d556200 Some small cleaning up to make some compiling environment happy Summary: Compiler complains some errors when building using our internal build settings. Fix them. Test Plan: rebuild Reviewers: haobo, dhruba, igor, yhchiang, ljin Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17199 27 March 2014, 01:11:41 UTC
6a08bc0 Fix no return warning in FileComparator 26 March 2014, 21:46:07 UTC
1e9621d Sort files correctly in Builder::SaveTo Summary: Previously, we used to sort all files by BySmallestFirst comparator and then re-sort level0 files in the Finalize() (recently moved to end of SaveTo). In this diff, I chose the correct comparator at the beginning and sort the files correctly in Builder::SaveTo. I also added a verification that all files are sorted correctly in CheckConsistency() NOTE: This diff depends on D17037 Test Plan: make check. Will also run db_stress Reviewers: dhruba, haobo, sdong, ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D17049 26 March 2014, 20:30:14 UTC
954679b AssertHeld() should do things Summary: AssertHeld() was a no-op before. Now it does things. Also, this change caught a bad bug in SuperVersion::Init(). The method is calling db->mutex.AssertHeld(), but db variable is not initialized yet! I also fixed that issue. Test Plan: make check Reviewers: dhruba, haobo, ljin, sdong, yhchiang Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17193 26 March 2014, 18:24:52 UTC
ad9a39c [RocksDB] Preallocate new MANIFEST files Summary: We don't preallocate MANIFEST file, even though we have an option for that. This diff preallocates manifest file every time we create it Test Plan: make check Reviewers: dhruba, haobo Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17163 26 March 2014, 16:37:53 UTC
6b2e7a2 When Options.max_num_files=-1, non level0 files also by pass table cache Summary: This is the part that was not finished when doing the Options.max_num_files=-1 feature. For iterating non level0 SST files (which was done using two level iterator), table cache is not bypassed. With this patch, the leftover feature is done. Test Plan: make all check; change Options.max_num_files=-1 in one of the tests to cover the codes. Reviewers: haobo, igor, dhruba, ljin, yhchiang Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17001 26 March 2014, 01:40:52 UTC
b9ce156 Add assert to MergeOperator::PartialMergeMulti to check # of operands. Summary: Add assert(operands_list.size() >= 2) in MergeOperator::PartialMergeMulti to ensure it's only be called when we have at least two merge operands. Test Plan: run merge_test and stringappend_test. Reviewers: haobo, igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17169 25 March 2014, 20:39:17 UTC
5c44a8d fallocate_with_keep_size is false for LogWrites 25 March 2014, 19:53:23 UTC
d9ca83d [rocksdb] make init prefix more robust Summary: Currently if client uses kNULLString as the prefix, it will confuse compaction filter v2. This diff added a bool to indicate if the prefix has been intialized. I also added a unit test to cover this case and make sure the new code path is hit. Test Plan: db_test Reviewers: igor, haobo Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D17151 25 March 2014, 18:59:40 UTC
34f9da1 Fix the failure of stringappend_test caused by PartialMergeMulti. Summary: Fix a bug that PartialMergeMulti will try to merge the first operand with an empty slice. Test Plan: run stringappend_test and merge_test. Reviewers: haobo, igor Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17157 25 March 2014, 18:50:09 UTC
ebaff6f fix the HISTORY file to describe change happened in b47812fb 25 March 2014, 17:24:37 UTC
b47812f [rocksdb] new CompactionFilterV2 API Summary: This diff adds a new CompactionFilterV2 API that roll up the decisions of kv pairs during compactions. These kv pairs must share the same key prefix. They are buffered inside the db. typedef std::vector<Slice> SliceVector; virtual std::vector<bool> Filter(int level, const SliceVector& keys, const SliceVector& existing_values, std::vector<std::string>* new_values, std::vector<bool>* values_changed ) const = 0; Application can override the Filter() function to operate on the buffered kv pairs. More details in the inline documentation. Test Plan: make check. Added unit tests to make sure Keep, Delete, Change all works. Reviewers: haobo CCs: leveldb Differential Revision: https://reviews.facebook.net/D15087 25 March 2014, 03:47:53 UTC
cda4006 Enhance partial merge to support multiple arguments Summary: * PartialMerge api now takes a list of operands instead of two operands. * Add min_pertial_merge_operands to Options, indicating the minimum number of operands to trigger partial merge. * This diff is based on Schalk's previous diff (D14601), but it also includes necessary changes such as updating the pure C api for partial merge. Test Plan: * make check all * develop tests for cases where partial merge takes more than two operands. TODOs (from Schalk): * Add test with min_partial_merge_operands > 2. * Perform benchmarks to measure the performance improvements (can probably use results of task #2837810.) * Add description of problem to doc/index.html. * Change wiki pages to reflect the interface changes. Reviewers: haobo, igor, vamsi Reviewed By: haobo CC: leveldb, dhruba Differential Revision: https://reviews.facebook.net/D16815 25 March 2014, 00:57:13 UTC
e6d4b00 Relax backupable RateLimiter unit test for slow environments 24 March 2014, 18:59:42 UTC
b253f24 Rate limiter for BackupableDB Summary: Might be useful if client doesn't want to effect running system during backup too much. Test Plan: added a test case Reviewers: dhruba, haobo, ljin Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17091 24 March 2014, 18:38:44 UTC
83ab62e Fix data corruption by LogBuffer Summary: LogBuffer::AddLogToBuffer() uses vsnprintf() in the wrong way, which might cause buffer overflow when log line is too line. Fix it. Test Plan: Add a unit test to cover most LogBuffer's most logic. Reviewers: igor, haobo, dhruba Reviewed By: igor CC: ljin, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17103 21 March 2014, 22:32:48 UTC
c21ce14 Fix double-free in corruption_test 20 March 2014, 21:37:30 UTC
e67241f Sanity check on Open Summary: Everytime a client opens a DB, we do a sanity check that: * checks the existance of all the necessary files * verifies that file sizes are correct Some of the code was stolen from https://reviews.facebook.net/D16935 Test Plan: added a unit test Reviewers: dhruba, haobo, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17097 20 March 2014, 21:18:29 UTC
7981a43 Consistency Check Function Summary: Added a function/command to check the consistency of live files' meta data Test Plan: Manual test (size mismatch, file not exist). Command test script. Reviewers: haobo Reviewed By: haobo CC: dhruba, leveldb Differential Revision: https://reviews.facebook.net/D16935 20 March 2014, 20:42:45 UTC
8ea3cb6 If paranoid_checks -- Mark DB read-only on any IOError Summary: Whenever we get an IOError from GetImpl() or NewIterator(), we should immediatelly mark the DB read-only. The same check already exists in Write() and Compaction(). This should help with clients that are somehow missing a file. Test Plan: make check Reviewers: dhruba, haobo, sdong, ljin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17061 20 March 2014, 20:10:02 UTC
4cfb0eb Delete rocksdb dir after crashtest Summary: We should clean up after ourselves. Last crashtest failed because the disk on the cibuild machine was full. Also, db_crashtest is supposed to run the test on the same database in every iteration. This isn't the case currently, because every run creates a new tmp directory. It's fixed with this diff. Test Plan: ran it Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D17073 20 March 2014, 18:11:08 UTC
f681030 Fix DBTest.UniversalCompactionTrigger failure caused by D17067 Summary: D17067 breaks DBTest.UniversalCompactionTrigger because of wrong location of the checking. Fix it. Test Plan: Run the test and make sure it passes. Reviewers: igor, haobo Reviewed By: igor CC: dhruba, ljin, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D17079 20 March 2014, 18:10:11 UTC
752ec46 Add a unit test to verify compaction filter context Summary: Add unit tests to make sure CompactionFilterContext::is_manual_compaction_ and CompactionFilterContext::is_full_compaction_ are set correctly. Test Plan: run the new tests. Reviewers: haobo, igor, dhruba, yhchiang, ljin Reviewed By: haobo CC: nkg-, leveldb Differential Revision: https://reviews.facebook.net/D17067 20 March 2014, 01:10:48 UTC
fcd5c5e ComputeCompactionScore in CompactionPicker Summary: As it turns out, we need the call to ComputeCompactionScore (previously: Finalize) in CompactionPicker. The issue caused a deadlock in db_stress: http://ci-builds.fb.com/job/rocksdb_crashtest/290/console The last two lines before a deadlock were: 2014/03/18-22:43:41.481029 7facafbee700 (Original Log Time 2014/03/18-22:43:41.480989) Compaction nothing to do 2014/03/18-22:43:41.481041 7faccf7fc700 wait for fewer level0 files... "Compaction nothing to do" and other thread waiting for fewer level0 files. Hm hm. I moved the pre-sorting to SaveTo, which should fix both the original and the new issue. Test Plan: make check for now, will run db_stress in jenkins Reviewers: dhruba, haobo, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17037 19 March 2014, 23:52:26 UTC
69f6cf4 Fix two bugs in talbe format Summary: Previous code had two bugs: * didn't initialize the table_magic_number_ explicitly -- as a result a random junk number is stored for table_magic_number_, making HasInitializedMagicNumber() always return true. * if condition is inconrrect in set_table_magic_number(), and the return value is not checked. I replace if-else by a stronger requirement enforced by assert(). Test Plan: Previous sst_dump failed to work. After the fix, things back to normal. Reviewers: yhchiang CC: haobo, sdong, igor, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D17055 19 March 2014, 23:18:33 UTC
e493f2f Don't compact with zero input files Summary: We have an issue with internal service trying to run compaction with zero input files: 2014/02/07-02:26:58.386531 7f79117ec700 Compaction start summary: Base version 1420 Base level 3, seek compaction:0, inputs:[ϛ~^Qy^?],[] 2014/02/07-02:26:58.386539 7f79117ec700 Compacted 0@3 + 0@4 files => 0 bytes There are two issues: * inputsummary is printing out junk * it's constantly retrying (since I guess madeProgress is true), so it prints out a lot of data in the LOG file (40GB in one day). I read through the Level compaction picker and added some failure condition if input[0] is empty. I think PickCompaction() should not return compaction with zero input files with this change. I'm not confident enough to add an assertion though :) Test Plan: make check Reviewers: dhruba, haobo, sdong, kailiu Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16005 19 March 2014, 23:01:25 UTC
1ad0c2f add tags to gitignore 19 March 2014, 22:40:33 UTC
22507af Fix compile issue in Mac OS Summary: Compile issues are: * Unused variable env_ * Unused fallocate_with_keep_size_ Test Plan: compiles Reviewers: dhruba, haobo, sdong Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D17043 19 March 2014, 22:40:12 UTC
6dc940d avoid shared_ptr assignment in Version::Get() Summary: This is a 500ns operation while the whole Get() call takes only a few micro! Test Plan: ran db_bench, for a DB with 50M keys, QPS jumps from 5.2M/s to 7.2M/s Reviewers: haobo, igor, dhruba Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D17007 19 March 2014, 17:54:32 UTC
71e6a34 Add a DB property to indicate number of background errors encountered Summary: Add a property to calculate number of background errors encountered to help users build their monitoring Test Plan: Add a unit test. make all check Reviewers: haobo, igor, dhruba Reviewed By: igor CC: ljin, nkg-, yhchiang, leveldb Differential Revision: https://reviews.facebook.net/D16959 18 March 2014, 21:28:30 UTC
1ec72b3 Several easy-to-add properties related to compaction and flushes Summary: To partly address the request @nkg- raised, add three easy-to-add properties to compactions and flushes. Test Plan: run unit tests and add a new unit test to cover new properties. Reviewers: haobo, dhruba Reviewed By: dhruba CC: nkg-, leveldb Differential Revision: https://reviews.facebook.net/D13677 18 March 2014, 21:00:09 UTC
758fa8c Don't Finalize in CompactionPicker Summary: Finalize re-sorts (read: mutates) the files_ in Version* and it is called by CompactionPicker during normal runtime. At the same time, this same Version* lives in the SuperVersion* and is accessed without the mutex in GetImpl() code path. Mutating the files_ in one thread and reading the same files_ in another thread is a bad idea. It caused this issue: http://ci-builds.fb.com/job/rocksdb_crashtest/285/console Long-term, we need to be more careful with method contracts and clearly document what state can be mutated when. Now that we are much faster because we don't lock in GetImpl(), we keep running into data races that were not a problem before when we were slower. db_stress has been very helpful in detecting those. Short-term, I removed Finalize() from CompactionPicker. Note: I believe this is an issue in current 2.7 version running in production. Test Plan: make check Will also run db_stress to see if issue is gone Reviewers: sdong, ljin, dhruba, haobo Reviewed By: sdong CC: leveldb Differential Revision: https://reviews.facebook.net/D16983 18 March 2014, 20:59:59 UTC
63cef90 disable the log_number check in Recover() Summary: There is a chance that an old MANIFEST is corrupted in 2.7 but just not noticed. This check would fail them. Change it to log instead of returning a Corruption status. Test Plan: make Reviewers: haobo, igor Reviewed By: igor CC: leveldb Differential Revision: https://reviews.facebook.net/D16923 18 March 2014, 19:46:29 UTC
7624f43 Fixed a typo in INSTALL.md Summary: Replace "RocskDB" by "RocksDB" in INSTALL.md Test Plan: No code change. Reviewers: ljin, igor Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16977 18 March 2014, 19:05:23 UTC
f26cb0f Optimize fallocation Summary: Based on my recent findings (posted in our internal group), if we use fallocate without KEEP_SIZE flag, we get superior performance of fdatasync() in append-only workloads. This diff provides an option for user to not use KEEP_SIZE flag, thus optimizing his sync performance by up to 2x-3x. At one point we also just called posix_fallocate instead of fallocate, which isn't very fast: http://code.woboq.org/userspace/glibc/sysdeps/posix/posix_fallocate.c.html (tl;dr it manually writes out zero bytes to allocate storage). This diff also fixes that, by first calling fallocate and then posix_fallocate if fallocate is not supported. Test Plan: make check Reviewers: dhruba, sdong, haobo, ljin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D16761 18 March 2014, 04:52:14 UTC
ae25742 Fix race condition in manifest roll Summary: When the manifest is getting rolled the following happens: 1) manifest_file_number_ is assigned to a new manifest number (even though the old one is still current) 2) mutex is unlocked 3) SetCurrentFile() creates temporary file manifest_file_number_.dbtmp 4) SetCurrentFile() renames manifest_file_number_.dbtmp to CURRENT 5) mutex is locked If FindObsoleteFiles happens between (3) and (4) it will: 1) Delete manifest_file_number_.dbtmp (because it's not in pending_outputs_) 2) Delete old manifest (because the manifest_file_number_ already points to a new one) I introduce the concept of prev_manifest_file_number_ that will avoid the race condition. However, we should discuss the future of MANIFEST file rolling. We found some race conditions with it last week and who knows how many more are there. Nobody is using it in production because we don't trust the implementation. Should we even support it? Test Plan: make check Reviewers: ljin, dhruba, haobo, sdong Reviewed By: haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16929 18 March 2014, 04:50:15 UTC
5601bc4 Check starts_with(prefix) in MultiPrefixIterate Summary: We switched to prefix_seek method of seeking. This means that anytime we check Valid(), we also need to check starts_with(prefix) Test Plan: ran db_stress Reviewers: ljin Reviewed By: ljin CC: leveldb Differential Revision: https://reviews.facebook.net/D16953 18 March 2014, 00:02:34 UTC
9caeff5 keep_log_files option in BackupableDB Summary: Added an option to BackupableDB implementation that allows users to persist in-memory databases. When the restore happens with keep_log_files = true, it will *) Not delete existing log files in wal_dir *) Move log files from archive directory to wal_dir, so that DB can replay them if necessary Test Plan: Added an unit test Reviewers: dhruba, ljin Reviewed By: dhruba CC: leveldb Differential Revision: https://reviews.facebook.net/D16941 17 March 2014, 22:39:23 UTC
a5fafd4 Correct the logic of MemTable::ShouldFlushNow(). Summary: Memtable will now be forced to flush if the one of the following conditions is met: 1. Already allocated more than write_buffer_size + 60% arena block size. (the overflowing condition) 2. Unable to safely allocate one more arena block without hitting the overflowing condition AND the unused allocated memory < 25% arena block size. Test Plan: make all check Reviewers: sdong, haobo CC: leveldb Differential Revision: https://reviews.facebook.net/D16893 17 March 2014, 19:20:11 UTC
back to top