https://github.com/facebook/rocksdb

sort by:
Revision Author Date Message Commit Date
641fae6 update history and bump version 11 February 2019, 22:02:52 UTC
b7434c2 Properly set upper bound of subcompaction output (#4879) (#4898) Summary: Fix the ouput overlap bug when using subcompactions, the upper bound of output file was extended incorrectly. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4898 Differential Revision: D13736107 Pulled By: ajkr fbshipit-source-id: 21dca09f81d5f07bf2766bf566f9b50dcab7d8e3 11 February 2019, 22:01:38 UTC
a1774dd Bump version to 5.18.2 31 January 2019, 23:49:35 UTC
65b2298 Use correct FileMeta for atomic flush result install (#4932) Summary: 1. this commit fixes our handling of a combination of two separate edge cases. If a flush job does not pick any memtable to flush (because another flush job has already picked the same memtables), and the column family assigned to the flush job is dropped right before RocksDB calls rocksdb::InstallMemtableAtomicFlushResults, our original code passes a FileMetaData object whose file number is 0, failing the assertion in rocksdb::InstallMemtableAtomicFlushResults (assert(m->GetFileNumber() > 0)). 2. Also piggyback a small change: since we already create a local copy of column family's mutable CF options to eliminate potential race condition with `SetOptions` call, we might as well use the local copy in other function calls in the same scope. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4932 Differential Revision: D13901322 Pulled By: riversand963 fbshipit-source-id: b936580af7c127ea0c6c19ea10cd5fcede9fb0f9 31 January 2019, 23:06:26 UTC
acba14b Make a copy of MutableCFOptions to avoid race condition (#4876) Summary: If we do not do this, then reading MutableCFOptions may have a race condition with SetOptions which modifies MutableCFOptions. Also reserve space in advance for vectors to avoid reallocation changing the address of its elements. Test plan ``` $make clean && make -j32 all check $make clean && COMPILE_WITH_TSAN=1 make -j32 all check $make clean && COMPILE_WITH_ASAN=1 make -j32 all check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4876 Differential Revision: D13644500 Pulled By: riversand963 fbshipit-source-id: 4b8112c5c819d5a2922bb61ad1521b3d2fb2fd47 31 January 2019, 23:06:08 UTC
53f760b Always delete Blob DB files in the background (#4928) Summary: Blob DB files are not tracked by the SFM, so they currently don't get deleted in the background. Force them to be deleted in background so rate limiting can be applied Pull Request resolved: https://github.com/facebook/rocksdb/pull/4928 Differential Revision: D13854649 Pulled By: anand1976 fbshipit-source-id: 8031ce66842ff0af440c715d886b377983dad7d8 31 January 2019, 22:19:04 UTC
35c05bc Deleting Blob files also goes through SstFileManager (#4904) Summary: Right now, deleting blob files is not rate limited, even if SstFileManger is specified. On the other hand, rate limiting blob deletion is not supported. With this change, Blob file deletion will go through SstFileManager too. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4904 Differential Revision: D13772545 Pulled By: siying fbshipit-source-id: bd1b1d0beb26d5167385e00b7ecb8b94b879de84 31 January 2019, 22:18:21 UTC
9ae0528 Use chrono::time_point instead of time_t (#4868) Summary: By convention, time_t almost always stores the integral number of seconds since 00:00 hours, Jan 1, 1970 UTC, according to http://www.cplusplus.com/reference/ctime/time_t/. We surely want more precision than seconds. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4868 Differential Revision: D13633046 Pulled By: riversand963 fbshipit-source-id: 4e01e23a22e8838023c51a91247a286dbf3a5396 23 January 2019, 19:11:13 UTC
4eeb1bf Bump version to 5.18.1 10 January 2019, 00:15:59 UTC
3bcc312 Initialize two members in PerfContext (#4859) Summary: as titled. Currently it's possible to create a local object of type PerfContext since it's part of public API. Then it's safe to initialize the two members to 0. If PerfContext is created as thread-local object, then all members are zero-initialized according to C++ standard. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4859 Differential Revision: D13614504 Pulled By: riversand963 fbshipit-source-id: 406ff548e105a074f379ad1054d56fece5f524a0 10 January 2019, 00:15:31 UTC
e78f5cf Fix point lookup on range tombstone sentinel endpoint (#4829) Summary: Previously for point lookup we decided which file to look into based on user key overlap only. We also did not truncate range tombstones in the point lookup code path. These two ideas did not interact well in cases like this: - L1 has range tombstone [a, c)#1 and point key b#2. The data is split between file1 with range [a#1,1, b#72057594037927935,15], and file2 with range [b#2, c#1]. - L1's file2 gets compacted to L2. - User issues `Get()` for b#3. - L1's file1 is opened and the range tombstone [a, c)#1 is found for b, while no point-key for b is found in L1. - `Get()` assumes that the range tombstone must cover all data in that range in lower levels, so short circuits and returns `NotFound`. The solution to this problem is to not look into files that only overlap with the point lookup at a range tombstone sentinel endpoint. In the above example, this would mean not opening L1's file1 or its tombstones during the `Get()`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4829 Differential Revision: D13561355 Pulled By: ajkr fbshipit-source-id: a13c21c816870a2f5d32a48af6dbd719a7d9d19f 09 January 2019, 01:50:02 UTC
97773d0 Update HISTORY.md 07 January 2019, 18:18:58 UTC
35c950a Refactor atomic flush result installation to MANIFEST (#4791) Summary: as titled. Since different bg flush threads can flush different sets of column families (due to column family creation and drop), we decide not to let one thread perform atomic flush result installation for other threads. Bg flush threads will install their atomic flush results sequentially to MANIFEST, using a conditional variable, i.e. atomic_flush_install_cv_ to coordinate. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4791 Differential Revision: D13498930 Pulled By: riversand963 fbshipit-source-id: dd7482fc41f4bd22dad1e1ef7d4764ef424688d7 07 January 2019, 18:12:51 UTC
e265e08 Avoid switching empty memtable in certain cases (#4792) Summary: in certain cases, we do not perform memtable switching if the active memtable of the column family is empty. Two exceptions: 1. In manual flush, if cached_recoverable_state_empty_ is false, then we need to switch memtable due to requirement of transaction. 2. In switch WAL, we need to switch memtable anyway because we have to seal the memtable if the WAL on which it depends will be closed. This change can potentially delay the occurence of write stalls because number of memtables increase more slowly. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4792 Differential Revision: D13499501 Pulled By: riversand963 fbshipit-source-id: 91c9b17ae753578578039f3851667d93610005e1 07 January 2019, 18:06:49 UTC
663d24f Improve flushing multiple column families (#4708) Summary: If one column family is dropped, we should simply skip it and continue to flush other active ones. Currently we use Status::ShutdownInProgress to notify caller of column families being dropped. In the future, we should consider using a different Status code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4708 Differential Revision: D13378954 Pulled By: riversand963 fbshipit-source-id: 42f248cdf2d32d4c0f677cd39012694b8f1328ca 07 January 2019, 18:02:59 UTC
ec43385 Enable checkpoint of read-only db (#4681) Summary: 1. DBImplReadOnly::GetLiveFiles should not return NotSupported. Instead, it should call DBImpl::GetLiveFiles(flush_memtable=false). 2. In DBImp::Recover, we should also recover the OPTIONS file name and/or number so that an immediate subsequent GetLiveFiles will get the correct OPTIONS name. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4681 Differential Revision: D13069205 Pulled By: riversand963 fbshipit-source-id: 3e6a0174307d06db5a01feb099b306cea1f7f88a 07 January 2019, 17:54:42 UTC
8a643b7 Detect if Jemalloc is linked with the binary (#4844) Summary: Declare Jemalloc non-standard APIs as weak symbols, so that if Jemalloc is linked with the binary, these symbols will be replaced by Jemalloc's, otherwise they will be nullptr. This is similar to how folly detect jemalloc, but we assume the main program use jemalloc as long as jemalloc is linked: https://github.com/facebook/folly/blob/master/folly/memory/Malloc.h#L147 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4844 Differential Revision: D13574934 Pulled By: yiwu-arbug fbshipit-source-id: 7ea871beb1be7d5a1259cc38f9b78078793db2db 04 January 2019, 19:08:52 UTC
de0891e Fix unused member compile error Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4793 Differential Revision: D13509363 Pulled By: abhimadan fbshipit-source-id: 530b4765e3335d6ecd016bfaa89645f8aa98c61f 18 December 2018, 23:23:20 UTC
33564d2 Remove v1 RangeDelAggregator (#4778) Summary: Now that v2 is fully functional, the v1 aggregator is removed. The v2 aggregator has been renamed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4778 Differential Revision: D13495930 Pulled By: abhimadan fbshipit-source-id: 9d69500a60a283e79b6c4fa938fc68a8aa4d40d6 18 December 2018, 23:23:20 UTC
96de211 Add compaction logic to RangeDelAggregatorV2 (#4758) Summary: RangeDelAggregatorV2 now supports ShouldDelete calls on snapshot stripes and creation of range tombstone compaction iterators. RangeDelAggregator is no longer used on any non-test code path, and will be removed in a future commit. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4758 Differential Revision: D13439254 Pulled By: abhimadan fbshipit-source-id: fe105bcf8e3d4a2df37a622d5510843cd71b0401 17 December 2018, 23:33:11 UTC
8522d9c Prepare FragmentedRangeTombstoneIterator for use in compaction (#4740) Summary: To support the flush/compaction use cases of RangeDelAggregator in v2, FragmentedRangeTombstoneIterator now supports dropping tombstones that cannot be read in the compaction output file. Furthermore, FragmentedRangeTombstoneIterator supports the "snapshot striping" use case by allowing an iterator to be split by a list of snapshots. RangeDelAggregatorV2 will use these changes in a follow-up change. In the process of making these changes, other miscellaneous cleanups were also done in these files. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4740 Differential Revision: D13287382 Pulled By: abhimadan fbshipit-source-id: f5aeb03e1b3058049b80c02a558ee48f723fa48c 17 December 2018, 23:33:11 UTC
1d679e3 Update HISTORY.md (#4753) Summary: As titled. Update history to include a recent bug fix in 9be3e6b48884c80733fa791b982a02f62a199e92. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4753 Differential Revision: D13350286 Pulled By: riversand963 fbshipit-source-id: b6324780dee4cb1757bc2209403a08531c150c08 06 December 2018, 00:55:58 UTC
9be3e6b Allow file-ingest-triggered flush to skip waiting for write-stall clear (#4751) Summary: When write stall has already been triggered due to number of L0 files reaching threshold, file ingestion must proceed with its flush without waiting for the write stall condition to cleared by the compaction because compaction can wait for ingestion to finish (circular wait). In order to avoid this wait, we can set `FlushOptions.allow_write_stall` to be true (default is false). Setting it to false can cause deadlock. This can happen when the number of compaction threads is low. Considere the following ``` Time compaction_thread ingestion_thread | num_running_ingest_file_++ | while(num_running_ingest_file_>0){wait} | flush V ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4751 Differential Revision: D13343037 Pulled By: riversand963 fbshipit-source-id: d3b95938814af46ec4c463feff0b50c70bd8b23f 05 December 2018, 22:59:29 UTC
b96fccb Move a function to critical section (#4752) Summary: Test plan ``` $make clean && make -j32 all check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4752 Differential Revision: D13344705 Pulled By: riversand963 fbshipit-source-id: fc3a43174d09d70ccc2b09decd78e1da1b6ba9d1 05 December 2018, 21:12:09 UTC
e58d769 Fix buck dev mode fbcode builds (#4747) Summary: Don't enable ROCKSDB_JEMALLOC unless the build mode is opt and default allocator is jemalloc. In dev mode, this is causing compile/link errors such as - ``` stderr: buck-out/dev/gen/rocksdb/src/rocksdb_lib#compile-pic-malloc_stats.cc.o4768b59e,gcc-5-glibc-2.23-clang/db/malloc_stats.cc.o:malloc_stats.cc:function rocksdb::DumpMallocStats(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*): error: undefined reference to 'malloc_stats_print' clang-7.0: error: linker command failed with exit code 1 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4747 Differential Revision: D13324840 Pulled By: anand1976 fbshipit-source-id: 45ffbd4f63fe4d9e8a0473d8f066155e4ef64a14 05 December 2018, 18:40:31 UTC
2f1ca4e Revert "BaseDeltaIterator: always check valid() before accessing key(… (#4744) Summary: …) (#4702)" This reverts commit 3a18bb3e15e67067d111302d711ae9ac9dc45816. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4744 Differential Revision: D13311869 Pulled By: miasantreble fbshipit-source-id: 6300b12cc34828d8b9274e907a3aef1506d5d553 04 December 2018, 07:38:27 UTC
55479eb Update History for fast-forwarded 5.18 branch Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4704 Differential Revision: D13283300 Pulled By: gfosco fbshipit-source-id: cb4fdaa93137e0bba64b781ba7e8fe31b19e5656 01 December 2018, 00:25:09 UTC
3a18bb3 BaseDeltaIterator: always check valid() before accessing key() (#4702) Summary: Current implementation of `current_over_upper_bound_` fails to take into consideration that keys might be invalid in either base iterator or delta iterator. Calling key() in such scenario will lead to assertion failure and runtime errors. This PR addresses the bug by adding check for valid keys before calling `IsOverUpperBound()`, also added test coverage for iterate_upper_bound usage in BaseDeltaIterator Also recommit https://github.com/facebook/rocksdb/pull/4656 (It was reverted earlier due to bugs) Pull Request resolved: https://github.com/facebook/rocksdb/pull/4702 Differential Revision: D13146643 Pulled By: miasantreble fbshipit-source-id: 6d136929da12d0f2e2a5cea474a8038ec5cdf1d0 30 November 2018, 23:35:13 UTC
6e938c9 Make NewBloomFilterPolicy() use full filter by default (#4735) Summary: Full block (use_block_based_builder=false) Bloom filter has clear CPU saving benefits but with limitation of using temp memory when building an SST file proportional to the SST file size. We reduced the chance of having large SST files with multi-level universal compaction. Now we change to a default with better performance. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4735 Differential Revision: D13266674 Pulled By: siying fbshipit-source-id: 7594a4c3e32568a5a2adce22bb0e46553e55c602 30 November 2018, 21:13:27 UTC
b0f3d9b fix unused param "options" error in jemalloc_nodump_allocator.cc (#4738) Summary: Currently tests are failing on master with the following message: > util/jemalloc_nodump_allocator.cc:132:8: error: unused parameter ‘options’ [-Werror=unused-parameter] Status NewJemallocNodumpAllocator( This PR attempts to fix the issue Pull Request resolved: https://github.com/facebook/rocksdb/pull/4738 Differential Revision: D13278804 Pulled By: miasantreble fbshipit-source-id: 64a6204aa685bd85d8b5080655cafef9980fac2f 30 November 2018, 20:08:55 UTC
f1b0841 WritePrepared: followup fix for snapshot double release issue (#4734) Summary: The fix in #4727 for double snapshot release was incomplete since it does not properly remove the duplicate entires in the snapshot list after finding that a snapshot is still valid. The patch does that and also improves the unit test to show the issue. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4734 Differential Revision: D13266260 Pulled By: maysamyabandeh fbshipit-source-id: 351e2c40cca45a87b757774c11af74182314911e 30 November 2018, 05:01:57 UTC
cf1df5d JemallocNodumpAllocator: option to limit tcache memory usage (#4736) Summary: Add option to limit tcache usage by allocation size. This is to reduce total tcache size in case there are many user threads accessing the allocator and incur non-trivial memory usage. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4736 Differential Revision: D13269305 Pulled By: yiwu-arbug fbshipit-source-id: 95a9b7fc67facd66837c849137e30e137112e19d 30 November 2018, 01:33:40 UTC
7064535 Move FIFOCompactionPicker to a separate file (#4724) Summary: **Summary:** Simplified the code layout by moving FIFOCompactionPicker to a separate file. **Why?:** While trying to add ttl functionality to universal compaction, I found that `FIFOCompactionPicker` class and its impl methods to be interspersed between `LevelCompactionPicker` methods which kind-of made the code a little hard to traverse. So I moved `FIFOCompactionPicker` to a separate compaction_picker_fifo.h/cc file, similar to `UniversalCompactionPicker`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4724 Differential Revision: D13227914 Pulled By: sagar0 fbshipit-source-id: 89471766ea67fa4d87664a41c057dd7df4b3d4e3 30 November 2018, 00:04:52 UTC
8d7bc76 Fix a flaky test DBFlushTest.SyncFail (#4633) Summary: There is a race condition in DBFlushTest.SyncFail, as illustrated below. ``` time thread1 bg_flush_thread | Flush(wait=false, cfd) | refs_before=cfd->current()->TEST_refs() PickMemtable calls cfd->current()->Ref() V ``` The race condition between thread1 getting the ref count of cfd's current version and bg_flush_thread incrementing the cfd's current version makes it possible for later assertion on refs_before to fail. Therefore, we add test sync points to enforce the order and assert on the ref count before and after PickMemtable is called in bg_flush_thread. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4633 Differential Revision: D12967131 Pulled By: riversand963 fbshipit-source-id: a99d2bacb7869ec5d8d03b24ef2babc0e6ae1a3b 29 November 2018, 21:39:56 UTC
7dbee38 db/repair: reset Repair::db_lock_ in ctor (#4683) Summary: there is chance that * the caller tries to repair the db when holding the db_lock, in that case the env implementation might not set the `lock` parameter of Repairer::Run(). * the caller somehow never calls Repairer::Run(). either way, the desctructor of Repair will compare the uninitialized db_lock_ with nullptr, and tries to unlock it. there is good chance that the db_lock_ is not nullptr, then boom. Signed-off-by: Kefu Chai <tchaikov@gmail.com> Pull Request resolved: https://github.com/facebook/rocksdb/pull/4683 Differential Revision: D13260287 Pulled By: riversand963 fbshipit-source-id: 878a119d2e9f10a0fa17ee62cf3fb24b33d49fa5 29 November 2018, 19:26:41 UTC
8d9b4d9 Fix failure of sst_file_reader_test in LITE mode regression test (#4725) Summary: Add a dummy main() in sst_file_reader_test for ROCKSDB_LITE to fix link failure in regression Pull Request resolved: https://github.com/facebook/rocksdb/pull/4725 Differential Revision: D13252885 Pulled By: anand1976 fbshipit-source-id: 0e22b964815e2bf01aff7d03ed4ae59d44fa86f1 29 November 2018, 18:51:41 UTC
1a5a93f WritePrepared: Fix double snapshot release issue (#4727) Summary: Currently the garbage collection of items in old_commit_map_ was done upon ::ReleaseSnapshot. The assumption behind this method was that the sequence number of snapshots are unique, which is incorrect. In the very rare cases that two consecutive snapshot have the same sequence number this could lead the release of the first snapshot affect the old_commit_map_ that is necessary to service the reads of the second snapshot. The bug would be triggered only if i) two snapshot have the same seq, ii) both of them are very old (older than the last ~4m transactions), and iii) there is commit entry overlapping with the snapshot seq number. It is fixed by doing the cleanup of old_commit_map_ in UpdateSnapshot: the new list of snapshots are compared with the old one and the missing sequence numbers are concluded released. If two snapshots have the same seq number, after the release of one of them, the seq number still appears in the snapshot least and thus not cleaned up prematurely. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4727 Differential Revision: D13246495 Pulled By: maysamyabandeh fbshipit-source-id: 93b87a5042afd8060889df245526d3f5d29de9fe 29 November 2018, 03:03:31 UTC
512a5e3 Fix BlockBasedTable not always using memory allocator if available (#4678) Summary: Fix block based table reader not using memory_allocator when allocating index blocks and compression dictionary blocks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4678 Differential Revision: D13054594 Pulled By: yiwu-arbug fbshipit-source-id: 379f25bcc665395662511c4f873f4b7b55104ce2 29 November 2018, 02:01:24 UTC
8fe1e06 Clean up FragmentedRangeTombstoneList (#4692) Summary: Removed `one_time_use` flag, which removed the need for some tests, and changed all `NewRangeTombstoneIterator` methods to return `FragmentedRangeTombstoneIterators`. These changes also led to removing `RangeDelAggregatorV2::AddUnfragmentedTombstones` and one of the `MemTableListVersion::AddRangeTombstoneIterators` methods. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4692 Differential Revision: D13106570 Pulled By: abhimadan fbshipit-source-id: cbab5432d7fc2d9cdfd8d9d40361a1bffaa8f845 28 November 2018, 23:29:02 UTC
7125e24 Add the max trace file size limitation option to Tracing (#4610) Summary: If user do not end the trace manually, the tracing will continue which can potential use up all the storage space and cause problem. In this PR, the max trace file size is added to the TraceOptions and user can set the value if they need or the default is 64GB. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4610 Differential Revision: D12893400 Pulled By: zhichao-cao fbshipit-source-id: acf4b5a6076bb691778bdfbac4864e1006758953 27 November 2018, 22:27:05 UTC
c94f073 Fix Mac build break in casting (#4722) Summary: Mac build is failing with the below error: ``` $ make db_bench -j8 ... ... tools/db_bench_tool.cc:4583:25: error: no matching function for call to 'max' (uint64_t)std::max(0l, seek_pos - FLAGS_max_scan_distance), ^~~~~~~~ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2717:1: note: candidate template ignored: deduced conflicting types for parameter '_Tp' ('long' vs. 'long long') max(const _Tp& __a, const _Tp& __b) ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2727:1: note: candidate template ignored: could not match 'initializer_list<type-parameter-0-0>' against 'long' max(initializer_list<_Tp> __t, _Compare __comp) ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2709:1: note: candidate function template not viable: requires 3 arguments, but 2 were provided max(const _Tp& __a, const _Tp& __b, _Compare __comp) ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/include/c++/v1/algorithm:2735:1: note: candidate function template not viable: requires single argument '__t', but 2 arguments were provided max(initializer_list<_Tp> __t) ^ 1 error generated. make: *** [tools/db_bench_tool.o] Error 1 ``` My compiler version: Mac OS X Mojave ``` $ clang++ --version Apple LLVM version 10.0.0 (clang-1000.11.45.5) Target: x86_64-apple-darwin18.2.0 Thread model: posix InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4722 Differential Revision: D13220196 Pulled By: sagar0 fbshipit-source-id: 01e5e928288a5613027c83a26ad8aedf04438b14 27 November 2018, 21:30:16 UTC
5e72bc1 Add SstFileReader to read sst files (#4717) Summary: A user friendly sst file reader is useful when we want to access sst files outside of RocksDB. For example, we can generate an sst file with SstFileWriter and send it to other places, then use SstFileReader to read the file and process the entries in other ways. Also rename the original SstFileReader to SstFileDumper because of name conflict, and seems SstFileDumper is more appropriate for tools. TODO: there is only a very simple test now, because I want to get some feedback first. If the changes look good, I will add more tests soon. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4717 Differential Revision: D13212686 Pulled By: ajkr fbshipit-source-id: 737593383264c954b79e63edaf44aaae0d947e56 27 November 2018, 21:02:23 UTC
3fa80f0 Remove enable_internal_stats (#4714) Summary: Simple patch to address comments in [statistics.h#L65](https://github.com/facebook/rocksdb/blob/master/monitoring/statistics.h#L65|statistics.h#L65) `TODO(ajkr): clean this up since there are no internal stats anymore` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4714 Differential Revision: D13208093 Pulled By: ajkr fbshipit-source-id: 4468badb850592411147539f859082644f5296f6 27 November 2018, 20:58:58 UTC
e764481 Remove DeleteRange experimental comment (#4709) Summary: DeleteRange is now ready for production use. Change the header comment to reflect this, and update HISTORY.md with the feature's status. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4709 Differential Revision: D13209055 Pulled By: abhimadan fbshipit-source-id: 65423eb1a4927cf593c38254cd87c322f73ae137 27 November 2018, 19:11:35 UTC
1db4a09 Test mapping of Histograms and HistogramsNameMap (#4720) Summary: Adding sanity check test for mapping of `Histograms` and `HistogramsNameMap` ``` [==========] Running 2 tests from 1 test case. [----------] Global test environment set-up. [----------] 2 tests from StatisticsTest [ RUN ] StatisticsTest.SanityTickers [ OK ] StatisticsTest.SanityTickers (0 ms) [ RUN ] StatisticsTest.SanityHistograms [ OK ] StatisticsTest.SanityHistograms (0 ms) [----------] 2 tests from StatisticsTest (0 ms total) [----------] Global test environment tear-down [==========] 2 tests from 1 test case ran. (0 ms total) [ PASSED ] 2 tests. ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4720 Differential Revision: D13217061 Pulled By: ajkr fbshipit-source-id: 6427f4e684c36b2f3c3440808b74fee86a364683 27 November 2018, 18:48:30 UTC
a2dec2e Fix Java to C++ ticker conversions (#4719) Summary: Added back `NO_ITERATORS` and moved `NO_ITERATOR_CREATED` to the end of `toCppTickers`. This is a leftover fix which is needed in addition to a138e351bcc017667560c7ecbb295800b30881c2 to correctly convert java tickers to c++ tickers. a138e351bcc017667560c7ecbb295800b30881c2 only updated `toJavaTickerType` but both `toJavaTickerType` and `toCppTickers` need to be changed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4719 Differential Revision: D13208847 Pulled By: sagar0 fbshipit-source-id: 53a42f3d6ffe04034acfde972d73040b92b4c1af 27 November 2018, 18:17:07 UTC
60deb44 Fix build with ROCKSDB_LITE and -Wunused-private-field (#4715) Summary: The error message of databases/rocksdb-lite (FreeBSD port) is as follows: ``` tools/db_bench_tool.cc:1976:16: error: private field 'trace_options_' is not used [-Werror,-Wunused-private-field] TraceOptions trace_options_; ^ ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4715 Differential Revision: D13207902 Pulled By: ajkr fbshipit-source-id: be3c612eba656aeddb77e35e2f201dd25dc92f7e 27 November 2018, 05:35:38 UTC
f183759 FIX #3278: Move global const object definitions from .h to .cc (#4691) Summary: Summary We should declare constants in headers and define them in source files. But this commit is only aimed at compound types. I don't know if it is necessary to do the same thing to fundamental types. I used this command to find all of the constant definitions in header files. `find . -name "*.h" | xargs grep -e "^const .*=.*"` And here is what I found: ``` ./db/version_edit.h:const uint64_t kFileNumberMask = 0x3FFFFFFFFFFFFFFF; ./include/rocksdb/env.h:const size_t kDefaultPageSize = 4 * 1024; ./include/rocksdb/statistics.h:const std::vector<std::pair<Tickers, std::string>> TickersNameMap = { ./include/rocksdb/statistics.h:const std::vector<std::pair<Histograms, std::string>> HistogramsNameMap = { ./include/rocksdb/table.h:const uint32_t kPlainTableVariableLength = 0; ./include/rocksdb/utilities/transaction_db.h:const uint32_t kInitialMaxDeadlocks = 5; ./port/port_posix.h:const uint32_t kMaxUint32 = std::numeric_limits<uint32_t>::max(); ./port/port_posix.h:const int kMaxInt32 = std::numeric_limits<int32_t>::max(); ./port/port_posix.h:const uint64_t kMaxUint64 = std::numeric_limits<uint64_t>::max(); ./port/port_posix.h:const int64_t kMaxInt64 = std::numeric_limits<int64_t>::max(); ./port/port_posix.h:const size_t kMaxSizet = std::numeric_limits<size_t>::max(); ./port/win/port_win.h:const uint32_t kMaxUint32 = UINT32_MAX; ./port/win/port_win.h:const int kMaxInt32 = INT32_MAX; ./port/win/port_win.h:const int64_t kMaxInt64 = INT64_MAX; ./port/win/port_win.h:const uint64_t kMaxUint64 = UINT64_MAX; ./port/win/port_win.h:const size_t kMaxSizet = UINT64_MAX; ./port/win/port_win.h:const size_t kMaxSizet = UINT_MAX; ./port/win/port_win.h:const uint32_t kMaxUint32 = std::numeric_limits<uint32_t>::max(); ./port/win/port_win.h:const int kMaxInt32 = std::numeric_limits<int>::max(); ./port/win/port_win.h:const uint64_t kMaxUint64 = std::numeric_limits<uint64_t>::max(); ./port/win/port_win.h:const int64_t kMaxInt64 = std::numeric_limits<int64_t>::max(); ./port/win/port_win.h:const size_t kMaxSizet = std::numeric_limits<size_t>::max(); ./port/win/port_win.h:const bool kLittleEndian = true; ./table/cuckoo_table_factory.h:const uint32_t kCuckooMurmurSeedMultiplier = 816922183; ./table/data_block_hash_index.h:const uint8_t kNoEntry = 255; ./table/data_block_hash_index.h:const uint8_t kCollision = 254; ./table/data_block_hash_index.h:const uint8_t kMaxRestartSupportedByHashIndex = 253; ./table/data_block_hash_index.h:const size_t kMaxBlockSizeSupportedByHashIndex = 1u << 16; ./table/data_block_hash_index.h:const double kDefaultUtilRatio = 0.75; ./table/filter_block.h:const uint64_t kNotValid = ULLONG_MAX; ./table/format.h:const int kMagicNumberLengthByte = 8; ./third-party/fbson/FbsonJsonParser.h:const char* const kJsonDelim = " ,]}\t\r\n"; ./third-party/fbson/FbsonJsonParser.h:const char* const kWhiteSpace = " \t\n\r"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const BiggestInt kMaxBiggestInt = ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kDeathTestStyleFlag[] = "death_test_style"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kDeathTestUseFork[] = "death_test_use_fork"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char kInternalRunDeathTestFlag[] = "internal_run_death_test"; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const char* pets[] = {"cat", "dog"}; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const size_t kProtobufOneLinerMaxLength = 50; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const int kMaxStackTraceDepth = 100; ./third-party/gtest-1.7.0/fused-src/gtest/gtest.h:const T* WithParamInterface<T>::parameter_ = NULL; ./util/coding.h:const unsigned int kMaxVarint64Length = 10; ./util/filename.h:const size_t kFormatFileNumberBufSize = 38; ./util/testutil.h:const SliceTransform* RandomSliceTransform(Random* rnd, int pre_defined = -1); ./util/trace_replay.h:const std::string kTraceMagic = "feedcafedeadbeef"; ./util/trace_replay.h:const unsigned int kTraceTimestampSize = 8; ./util/trace_replay.h:const unsigned int kTraceTypeSize = 1; ./util/trace_replay.h:const unsigned int kTracePayloadLengthSize = 4; ./util/trace_replay.h:const unsigned int kTraceMetadataSize = ./utilities/cassandra/serialize.h:const int64_t kCharMask = 0xFFLL; ./utilities/cassandra/serialize.h:const int32_t kBitsPerByte = 8; ``` And these 3 lines are related to this commit: ``` ./include/rocksdb/statistics.h:const std::vector<std::pair<Tickers, std::string>> TickersNameMap = { ./include/rocksdb/statistics.h:const std::vector<std::pair<Histograms, std::string>> HistogramsNameMap = { ./util/trace_replay.h:const std::string kTraceMagic = "feedcafedeadbeef"; ``` Any comments would be appreciated. Thanks. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4691 Differential Revision: D13208049 Pulled By: ajkr fbshipit-source-id: e5ee55fdaec5447fc5798c6721e2821e7cdc0d5b 27 November 2018, 05:32:03 UTC
0d65315 RocksJava: Add the missing FIFO compaction options (#4609) Summary: Make CompactionOptionsFIFO's ttl and allow_compaction options to be available in RocksJava. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4609 Differential Revision: D12849503 Pulled By: sagar0 fbshipit-source-id: 47baa97918d252370f234c36c1af15ff2dad7658 27 November 2018, 01:02:08 UTC
85394a9 Speed up range scans with range tombstones (#4677) Summary: Previously, every range tombstone iterator was seeked on every ShouldDelete call, which quickly degraded performance for long range scans. This PR improves performance by tracking iterator positions and only advancing iterators when necessary. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4677 Differential Revision: D13205373 Pulled By: abhimadan fbshipit-source-id: 80c199dace1e19362a4c61c686bf01913eae87cb 27 November 2018, 00:33:41 UTC
a21cb22 Revert "apply ReadOptions.iterate_upper_bound to transaction iterator… (#4705) Summary: … (#4656)" This reverts commit b76398a82bde58bfcfa3ed5ba3dbfb6168c241de. Will add test coverage for iterate_upper_bound before re-commit b76398 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4705 Differential Revision: D13148592 Pulled By: miasantreble fbshipit-source-id: 4d1ce0bfd9f7a5359a7688bd780eb06a66f45b1f 24 November 2018, 18:46:28 UTC
e9372dc DeleteRange blog post (#4711) Summary: as titled Pull Request resolved: https://github.com/facebook/rocksdb/pull/4711 Differential Revision: D13166391 Pulled By: ajkr fbshipit-source-id: 3a3e537cebe2ba97a7ae6fcc3282db2ea755158e 22 November 2018, 04:28:03 UTC
07cf0ee Fix ticker stat for number files closed (#4703) Summary: We haven't been populating `NO_FILE_CLOSES` since v1.5.8 even though it was never marked as deprecated. Start populating it again. Conveniently `DeleteTableReader` has an unused `void*` argument that we can use... Blame: 63f216ee0a2a6e28f9dfe24913d134d3a7fa3aca Closes #4700. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4703 Differential Revision: D13146769 Pulled By: ajkr fbshipit-source-id: ad8d6fb0493e701f60a165a3bca1787d255be008 22 November 2018, 02:31:34 UTC
05d9d82 Revert "Move MemoryAllocator option from Cache to BlockBasedTableOpti… (#4697) Summary: …ons (#4676)" This reverts commit b32d087dbb3b9d9f2c9597caa650d0ca9d2e2d7f. `MemoryAllocator` needs to be with `Cache`, since cache entry can outlive DB and block based table. The cache needs to hold reference to memory allocator when deleting cache entry. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4697 Differential Revision: D13133490 Pulled By: yiwu-arbug fbshipit-source-id: 8ef7e8a51263bfd929f892fd062665ff4ce9ce5a 21 November 2018, 19:29:57 UTC
457f77b Introduce RangeDelAggregatorV2 (#4649) Summary: The old RangeDelAggregator did expensive pre-processing work to create a collapsed, binary-searchable representation of range tombstones. With FragmentedRangeTombstoneIterator, much of this work is now unnecessary. RangeDelAggregatorV2 takes advantage of this by seeking in each iterator to find a covering tombstone in ShouldDelete, while doing minimal work in AddTombstones. The old RangeDelAggregator is still used during flush/compaction for now, though RangeDelAggregatorV2 will support those uses in a future PR. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4649 Differential Revision: D13146964 Pulled By: abhimadan fbshipit-source-id: be29a4c020fc440500c137216fcc1cf529571eb3 21 November 2018, 18:56:45 UTC
ed5aec5 Fix range tombstone covering short-circuit logic (#4698) Summary: Since a range tombstone seen at one level will cover all keys in the range at lower levels, there was a short-circuiting check in Get that reported a key was not found at most one file after the range tombstone was discovered. However, this was incorrect for merge operands, since a deletion might only cover some merge operands, which implies that the key should be found. This PR fixes this logic in the Version portion of Get, and removes the logic from the MemTable portion of Get, since the perforamnce benefit provided there is minimal. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4698 Differential Revision: D13142484 Pulled By: abhimadan fbshipit-source-id: cbd74537c806032f2bfa564724d01a80df7c8f10 20 November 2018, 21:29:22 UTC
a138e35 Fix compatibility of public ticker stats (#4701) Summary: - Added back the `NO_ITERATORS` that was removed in 5945e16dfc6e99522c607841cfd387eebed256fc. - Marked it as deprecated since it is no longer populated, but kept for API compatibility. - Made sure the new tickers, `NO_ITERATOR_CREATED` and `NO_ITERATOR_DELETED`, are appended at the end of the enum, in case people are relying on the int values. The change where `NO_ITERATOR_CREATED` and `NO_ITERATOR_DELETED` were introduced is unreleased so I believe it is ok to change their ordering. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4701 Differential Revision: D13142887 Pulled By: ajkr fbshipit-source-id: 29a336ce5b46632ce50ad42ccc4a29013f71d6d6 20 November 2018, 21:13:16 UTC
327097c JemallocAllocator: thread-local tcache (#4603) Summary: Add option to support thread-local tcache to reduce mutex contention inside Jemalloc arena. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4603 Differential Revision: D12830738 Pulled By: yiwu-arbug fbshipit-source-id: 59bd25b165b903f23a6a8531b18d72e140d69f65 20 November 2018, 06:39:08 UTC
659d0e6 Run Define codemod in fbcode Summary: Found a callsite for `moo_translate` in the Scuba warnings and realized we have a few calls to `define()` left in fbcode. - I ran the `DefineCodemod` script against fbcode - Fixed broken tests, and ensured that tests that are explicitly testing the behaviour of `define()` were not changed. bypass-lint Reviewed By: kmeht Differential Revision: D12968447 fbshipit-source-id: d8fd3649a2ce9868b8938d293e1bebf1a6d2fad8 19 November 2018, 19:59:15 UTC
13579e8 WriteBufferManger doens't cost to cache if no limit is set (#4695) Summary: WriteBufferManger is not invoked when allocating memory for memtable if the limit is not set even if a cache is passed. It is inconsistent from the comment syas. Fix it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4695 Differential Revision: D13112722 Pulled By: siying fbshipit-source-id: 0b27eef63867f679cd06033ea56907c0569597f4 19 November 2018, 00:55:43 UTC
9d6d486 Fix uninitialized fields in file metadata (#4693) Summary: This is a quick fix for the uninitialized bugs in `LiveFileMetaData` and `SstFileMetaData` that were uncovered in #4686. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4693 Differential Revision: D13113189 Pulled By: ajkr fbshipit-source-id: 18e798d031d2a59d0b55fc010c135e0126f4042d 17 November 2018, 04:49:17 UTC
1476974 Rollback memtable flush upon atomic flush fail (#4641) Summary: This fixes an assertion. An atomic flush can have multiple flush jobs. Some of them may fail. If any of them fails, we need to rollback all of them. For the flush jobs that do fail, we already call `RollbackMemTableFlush` in `FlushJob::Run`. The tricky part is for flush jobs that have completed successfully. We need to call `RollbackMemTableFlush` for them as well. The newly added DBAtomicFlushTest.AtomicFlushRollbackSomeJobs will SigAbort without the corresponding change in AtomicFlushMemTablesToOutputFiles. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4641 Differential Revision: D12943649 Pulled By: riversand963 fbshipit-source-id: c66a4a664a1e0938e938fd41edc5a70c34cdd868 15 November 2018, 04:54:17 UTC
6bee36a Modify FragmentedRangeTombstoneList member layout (#4632) Summary: Rather than storing a `vector<RangeTombstone>`, we now store a `vector<RangeTombstoneStack>` and a `vector<SequenceNumber>`. A `RangeTombstoneStack` contains the start and end keys of a range tombstone fragment, and indices into the seqnum vector to indicate which sequence numbers the fragment is located at. The diagram below illustrates an example: ``` tombstones_: [a, b) [c, e) [h, k) | \ / \ / | | \ / \ / | v v v v tombstone_seqs_: [ 5 3 10 7 2 8 6 ] ``` This format allows binary searching the tombstone list to use less key comparisons, which helps in cases where there are many overlapping tombstones. Also, this format makes it easier to add DBIter-like semantics to `FragmentedRangeTombstoneIterator` in the future. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4632 Differential Revision: D13053103 Pulled By: abhimadan fbshipit-source-id: e8220cc712fcf5be4d602913bb23ace8ea5f8ef0 15 November 2018, 01:52:17 UTC
f5c8cf5 Increase wait time in DBTest.SanitizeNumThreads (#4659) Summary: DBTest.SanitizeNumThreads Sometimes fails. The test waited for 10ms timeout and expect all threads scheduled to be executed. This can be a source of flakiness. Make a check every 1ms and up to 10s. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4659 Differential Revision: D13074174 Pulled By: siying fbshipit-source-id: b1d5ff87a326a4fc9eab8d1cc307bbb940dfe70c 15 November 2018, 00:19:36 UTC
c2a20f1 Fix ignoring params in default impl of GetForUpdate (#4679) Summary: The default implementation of GetForUpdate that receives PinnableSlice was mistakenly dropping column_family and exclusive parameters. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4679 Differential Revision: D13062531 Pulled By: maysamyabandeh fbshipit-source-id: 7625d0c1ba872a5d894b58ced42147d6c8556a6f 14 November 2018, 19:29:38 UTC
0ed738f Add max_scan_distance flag to db_bench (#4660) Summary: The new flag makes it possible to constrain iterator traversal by the upper/lower bound the iterator is expected to pass. This allows seekrandom results to be more easily comparable between DBs with and without deletions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4660 Differential Revision: D13053111 Pulled By: abhimadan fbshipit-source-id: 33e250f2e2d210b54c7726399da30a33f723c33c 14 November 2018, 18:46:12 UTC
de65103 Improve result report of scan (#4648) Summary: When iterator becomes invalid, there are two possibilities. First, all data in the column family have been scanned and there is nothing more to scan. Second, an underlying error has occurred, causing `status()` to be !ok. Therefore, we need to check for both cases when `!iter->Valid()`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4648 Differential Revision: D12959601 Pulled By: riversand963 fbshipit-source-id: 49c9382c9ea9e78f2e2b6f3708f0670b822ca8dd 14 November 2018, 04:03:59 UTC
5cf5f47 Expose underlying Read/Write APIs for avoiding unnecessary memory copy (#2303) Summary: adamretter As you already mentioned at #1247 . Pull Request resolved: https://github.com/facebook/rocksdb/pull/2303 Differential Revision: D10209001 Pulled By: sagar0 fbshipit-source-id: bcbce004112c2edeaff116968d79c6f90aab4b6c 14 November 2018, 01:33:09 UTC
d8df169 release db mutex when calling ApproximateSize (#4630) Summary: `GenSubcompactionBoundaries` calls `VersionSet::ApproximateSize` which gets BlockBasedTableReader for every file and seeks in its index block to find `key`'s offset. If the table or index block aren't in memory already, this involves I/O. This can be improved by releasing DB mutex when calling ApproximateSize. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4630 Differential Revision: D13052653 Pulled By: miasantreble fbshipit-source-id: cae31d46d10d0860fa8a26b8d5154b2d17d1685f 14 November 2018, 01:08:34 UTC
b82e57d Remove two variables from BlockContents class and don't use class Block for compressed block (#4650) Summary: We carry compression type and "cachable" variables for every block in the block cache, while they take well-known values. 8-byte is wasted for each block (2-byte for useful information but it takes 8 bytes because of padding). With this change, these two variables are removed. The cachable information is only useful in the process of reading the block. We use other information to infer from it. For compressed blocks, the compression type is a part of the block content itself so we can get it from there. Some code is slightly refactored so that the cachable information can flow better. Another change is to only use class BlockContents for compressed block, and narrow the class Block to only be used for uncompressed blocks, including blocks in compressed block cache. This can make the Block class less confusing. It also saves tens of bytes for each block in compressed block cache. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4650 Differential Revision: D12969070 Pulled By: siying fbshipit-source-id: 548b62724e9eb66993026429fd9c7c3acd1f95ed 14 November 2018, 01:02:55 UTC
b76398a apply ReadOptions.iterate_upper_bound to transaction iterator (#4656) Summary: Currently transaction iterator does not apply `ReadOptions.iterate_upper_bound` when iterating. This PR attempts to fix the problem by having `BaseDeltaIterator` enforcing the upper bound check when iterator state is changed. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4656 Differential Revision: D13039257 Pulled By: miasantreble fbshipit-source-id: 909eb9f6b4597a4d80418fb139f32ec82c6ec1d1 13 November 2018, 23:44:15 UTC
a2de8e5 optimized the performance of autovector::emplace_back. (#4606) Summary: It called the autovector::push_back simply in autovector::emplace_back. This was not efficient, and then optimazed this function through the perfect forwarding. This was the src and result of the benchmark(using the google'benchmark library, the type of elem in autovector was std::string, and call emplace_back with the "char *" type): https://gist.github.com/monadbobo/93448b89a42737b08cbada81de75c5cd PS: The benchmark's result of previous PR was not accurate, and so I update the test case and result. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4606 Differential Revision: D13046813 Pulled By: sagar0 fbshipit-source-id: 19cde1bcadafe899aa454b703acb35737a1cc02d 13 November 2018, 22:39:03 UTC
b32d087 Move MemoryAllocator option from Cache to BlockBasedTableOptions (#4676) Summary: Per offline discussion with siying, `MemoryAllocator` and `Cache` should be decouple. The idea is that memory allocator handles memory allocation, while cache handle cache policy. It is normal that external cache libraries pack couple the two components for better optimization. If we want to integrate with such library in the future, we can make a wrapper of the library implementing both `Cache` and `MemoryAllocator` interface. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4676 Differential Revision: D13047662 Pulled By: yiwu-arbug fbshipit-source-id: cd42e246d80ab600b4de47d073f7d2db308ce6dd 13 November 2018, 21:48:38 UTC
abb1a8f Add a unit test to assert number of preads (#4657) Summary: We used to have a bug, which caused every block to be read twice, and none of our tests caught it. Add a very simply unit test to make sure that when reading a data block, we only issue one pread against the SST file. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4657 Differential Revision: D13005260 Pulled By: siying fbshipit-source-id: 03167b554ad2451192b1707415536d7d05e9026c 13 November 2018, 20:52:19 UTC
05dab3a BlobDB: use char array instead of string as buffer (#4662) Summary: As pointed out in #4059, we miss use string as buffer for file read. Changing to use char array instead. Closing #4059 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4662 Differential Revision: D13012998 Pulled By: yiwu-arbug fbshipit-source-id: 41234ba17c0bccea65bd647e362a0e979152bd1e 13 November 2018, 20:49:29 UTC
4f0fcb7 Expose num entries and deletions of sst files (#4623) Summary: he ratio of num_deletions to num_entries of a level can be useful to determine if a manual compaction needs to be triggered on a level. Also refer #3980 Pull Request resolved: https://github.com/facebook/rocksdb/pull/4623 Differential Revision: D13045744 Pulled By: sagar0 fbshipit-source-id: 71f3c8e363a8ffd194ec3bb0ed0b69612231f0b3 13 November 2018, 19:52:19 UTC
5945e16 Divide `NO_ITERATORS` into two counters `NO_ITERATOR_CREATED` and `NO_ITERATOR_DELETE` (#4498) Summary: Currently, `Statistics` can record tick by `recordTick()` whose second parameter is an `uint64_t`. That means tick can only increase. If we want to reduce tick, we have to work around like `RecordTick(statistics_, NO_ITERATORS, uint64_t(-1));`. That's kind of a hack. So, this PR divide `NO_ITERATORS` into two counters `NO_ITERATOR_CREATED` and `NO_ITERATOR_DELETE`, making the counters increase only. Fixes #3013 . Pull Request resolved: https://github.com/facebook/rocksdb/pull/4498 Differential Revision: D10395010 Pulled By: sagar0 fbshipit-source-id: cfb523b22a37411c794b4e9da090f1ae30293db2 13 November 2018, 19:46:32 UTC
a478682 Fix #3840: only `SyncClosedLogs` for multiple CFs (#4460) Summary: Call `SyncClosedLogs()` only if there are more than one column families. Update several unit tests (in `fault_injection_test` and `db_flush_test`) correspondingly. See #3840 for more info. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4460 Differential Revision: D12896377 Pulled By: riversand963 fbshipit-source-id: f49afdaec32568f12f001219a3aec1dfde3b32bf 13 November 2018, 19:32:16 UTC
ea94547 Backup engine support for direct I/O reads (#4640) Summary: Use the `DBOptions` that the backup engine already holds to figure out the right `EnvOptions` to use when reading the DB files. This means that, if a user opened a DB instance with `use_direct_reads=true`, then using `BackupEngine` to back up that DB instance will use direct I/O to read files when calculating checksums and copying. Currently the WALs and manifests would still be read using buffered I/O to prevent mixing direct I/O reads with concurrent buffered I/O writes. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4640 Differential Revision: D13015268 Pulled By: ajkr fbshipit-source-id: 77006ad6f3e00ce58374ca4793b785eea0db6269 13 November 2018, 19:17:25 UTC
b313019 use per-level perfcontext for DB::Get calls (#4617) Summary: this PR adds two more per-level perf context counters to track * number of keys returned in Get call, break down by levels * total processing time at each level during Get call Pull Request resolved: https://github.com/facebook/rocksdb/pull/4617 Differential Revision: D12898024 Pulled By: miasantreble fbshipit-source-id: 6b84ef1c8097c0d9e97bee1a774958f56ab4a6c4 13 November 2018, 18:40:49 UTC
2993cd2 Fix RocksDB Lite build (#4675) Summary: Our internal CI test caught RocksDB Lite build failures. The failures are due to a new test introduced in #4665 using `SSTFileWriter` and `IngestExternalFile`, but these is not exposed under lite mode. Fixed by #ifdef'ing out the test. ``` db/db_test2.cc: In member function ‘virtual void rocksdb::DBTest2_TestCompactFiles_Test::TestBody()’: db/db_test2.cc:2907:3: error: ‘SstFileWriter’ is not a member of ‘rocksdb’ rocksdb::SstFileWriter sst_file_writer{rocksdb::EnvOptions(), options}; ^ In file included from ./util/testharness.h:15:0, from ./table/mock_table.h:23, from ./db/db_test_util.h:44, from db/db_test2.cc:13: db/db_test2.cc:2912:13: error: ‘sst_file_writer’ was not declared in this scope ASSERT_OK(sst_file_writer.Open(external_file1)); ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4675 Differential Revision: D13035984 Pulled By: sagar0 fbshipit-source-id: c1ceac550dfac1a85eeea436693dc7dd467519a6 13 November 2018, 03:01:37 UTC
dd742e2 Automatically set LITE=1 on passing OPT="-DROCKSDB_LITE" (#4671) Summary: In #4652 we are setting -Os for lite builds only when LITE=1 is specified. But currently almost all the users invoke lite build via OPT="-DROCKSDB_LITE=1". So this diff tries to set LITE=1 when users already pass in -DROCKSDB_LITE=1 via the command line. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4671 Differential Revision: D13033801 Pulled By: sagar0 fbshipit-source-id: e7b506cee574f9e3f42221ee6647915011c78d78 13 November 2018, 00:58:54 UTC
7d04ef4 Fix flaky DBDynamicLevelTest.DynamicLevelMaxBytesBase2 (#4668) Summary: Part of the test required that a compaction start before a manual flush, but this was not enforced by the test. In some cases, particularly when writing to tmpfs, this could lead to the compaction starting after the flush, which caused the base level to be higher than it was expected to be. Add a sync point in the test to ensure that the flush and compaction happen simultaneously. The test also had some stale comments, so those have been removed or modified, and the test has been simplified so that it no longer uses sleeps and writes uncompressed SSTs. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4668 Differential Revision: D13032440 Pulled By: abhimadan fbshipit-source-id: 3f23b583a096454dafb8d8ea75678605dec80209 13 November 2018, 00:42:16 UTC
050d735 Update history and version for future 5.18 release. Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4669 Differential Revision: D13031522 Pulled By: gfosco fbshipit-source-id: d09e655a9d5556594f195c5d1b786900932145ce 12 November 2018, 22:45:47 UTC
0f88160 Fix `CompactFiles` bug (#4665) Summary: `CompactFiles` gets `SuperVersion` before `WaitForIngestFile`, while `IngestExternalFile` may add files that overlap with `input_file_names` The timeline of execution flow is as follow: Let's say that level N has two file [1,2] and [5,6] ``` timeline user_thread1 user_thread2 t0 | CompactFiles([1, 2], [5, 6]) begin t1 | GetReferencedSuperVersion() t2 | IngestExternalFile([3,4]) to level N begin t3 | CompactFiles resume V ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4665 Differential Revision: D13030674 Pulled By: ajkr fbshipit-source-id: 8be19477fd6e505032267a979d32f3097cc3be51 12 November 2018, 22:32:18 UTC
05dec0c Remove redundant member var and set options (#4631) Summary: In the past, both `DBImpl::atomic_flush_` and `DBImpl::immutable_db_options_.atomic_flush` exist. However, we fail to set `immutable_db_options_.atomic_flush`, but use `DBImpl::atomic_flush_` which is set correctly. This does not lead to incorrect behavior, but is a duplicate of information. Since `immutable_db_options_` is always there and has `atomic_flush`, we should use it as source of truth and remove `DBImpl::atomic_flush_`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4631 Differential Revision: D12928371 Pulled By: riversand963 fbshipit-source-id: f85a811959d3828aad4a3a1b05f71facf19c636d 12 November 2018, 20:24:26 UTC
09426ae Fix `DBImpl::GetColumnFamilyHandleUnlocked` data race (#4666) Summary: Hi, yiwu-arbug, I found that `DBImpl::GetColumnFamilyHandleUnlocked` still have data race condition, because `column_family_memtables_` has a stateful cache `current_` and `column_family_memtables_::Seek` maybe call without the protection of `mutex_` by a write thread check https://github.com/facebook/rocksdb/blob/859dbda6e3cac17416aff48f1760d01707867351/db/write_batch.cc#L1188 and https://github.com/facebook/rocksdb/blob/859dbda6e3cac17416aff48f1760d01707867351/db/write_batch.cc#L1756 and https://github.com/facebook/rocksdb/blob/859dbda6e3cac17416aff48f1760d01707867351/db/db_impl_write.cc#L318 So it's better to use `versions_->GetColumnFamilySet()->GetColumnFamily` instead. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4666 Differential Revision: D13027117 Pulled By: yiwu-arbug fbshipit-source-id: 4e3778eaf8e7f7c8577bbd78129b6a5fd7ce79fb 12 November 2018, 19:52:34 UTC
d761857 Add unique key number changing statistics to Trace_analyzer (#4646) Summary: Changes: 1. in current version, key size distribution is printed out as the result. In this change, the result will be output to a file to make further analyze easier 2. To understand how the unique keys are accessed over time, the total unique key number of each CF of each query type in each second over time is output to a file. In this way, user could know when the unique keys are accessed frequently or accessed rarely. 3. output the total QPS of each CF to a file 4. Add the print result of total queries of each CF of each query type. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4646 Differential Revision: D12968156 Pulled By: zhichao-cao fbshipit-source-id: 6c411c7ec47c7843a70929136efd71a150db0e4c 12 November 2018, 16:26:50 UTC
859dbda Fix DBTest.SoftLimit flakyness (#4658) Summary: The flakyness can be reproduced with the following patch: ``` --- a/db/db_impl_compaction_flush.cc +++ b/db/db_impl_compaction_flush.cc @@ -2013,6 +2013,9 @@ void DBImpl::BackgroundCallFlush() { if (job_context.HaveSomethingToDelete()) { PurgeObsoleteFiles(job_context); } + static int f_count = 0; + printf("clean flush job context %d\n", ++f_count); + env_->SleepForMicroseconds(1000000); job_context.Clean(); mutex_.Lock(); } ``` The issue is that FlushMemtable with opt.wait=true does not wait for `OnStallConditionsChanged` being called. The event listener is triggered on `JobContext::Clean`, which happens after flush result is installed. At the time we check for stall condition after flushing memtable, the job context cleanup may not be finished. To fix the flaykyness, we use sync point to create a custom WaitForFlush that waits for context cleanup. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4658 Differential Revision: D13007301 Pulled By: yiwu-arbug fbshipit-source-id: d98395ee7b0ad4c62e83e8d0e9b6028058c61712 10 November 2018, 00:45:19 UTC
7a2f98a Fix liblua link error when building shared lib under fbcode (#4651) Summary: When running `make shared_lib` under fbcode, there's liblua link error: https://gist.github.com/yiwu-arbug/b796bff6b3d46d90c1ed878d983de50d This is because we link liblua.a when building shared lib. If we want to link with liblua, we need to link with liblua_pic.a instead. Fixing by simply not link with lua. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4651 Differential Revision: D12964798 Pulled By: yiwu-arbug fbshipit-source-id: 18d6cee94afe20748068822b76e29ef255cdb04d 09 November 2018, 22:13:40 UTC
0c4678f Update documentation about dynamic options (#4653) Summary: Updated the comments around all options which are currently dynamic. Without explicitly specifying in the comments around the options, its hard for RocksDB users to know if an option is dynamically changeable or not unless they dig into the implementation details. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4653 Differential Revision: D12966735 Pulled By: sagar0 fbshipit-source-id: 1a01acbf6fe506b989e71629ce223f9803ebae27 09 November 2018, 21:30:49 UTC
dc35280 Update all unique/shared_ptr instances to be qualified with namespace std (#4638) Summary: Ran the following commands to recursively change all the files under RocksDB: ``` find . -type f -name "*.cc" -exec sed -i 's/ unique_ptr/ std::unique_ptr/g' {} + find . -type f -name "*.cc" -exec sed -i 's/<unique_ptr/<std::unique_ptr/g' {} + find . -type f -name "*.cc" -exec sed -i 's/ shared_ptr/ std::shared_ptr/g' {} + find . -type f -name "*.cc" -exec sed -i 's/<shared_ptr/<std::shared_ptr/g' {} + ``` Running `make format` updated some formatting on the files touched. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4638 Differential Revision: D12934992 Pulled By: sagar0 fbshipit-source-id: 45a15d23c230cdd64c08f9c0243e5183934338a8 09 November 2018, 19:19:58 UTC
8ba17f3 Verify restore from backup in db_stress (#4655) Summary: We already exercised backup functionality in `db_stress` according to the `-backup_one_in` flag. This PR verifies the backup can be restored/opened and sanity checks a few keys. Changes in this PR: - Extracted existing backup-related logic to a helper function, `TestBackupRestore` - Added restore logic, which targets a hidden directory named "./.restore\<thread number\>", similar to how backups target hidden directories named "./.backup\<thread number\>". - After restore, check the existence/non-existence of a few keys. - With this PR, backup is no longer compatible with clearing column families. - Also included unrelated fixes to set `ReadOptions::total_order_seek=true` when using `-compare_full_db_state_snapshot` Pull Request resolved: https://github.com/facebook/rocksdb/pull/4655 Differential Revision: D12972496 Pulled By: ajkr fbshipit-source-id: 481a40052d9a38d1bd5c5159aa4d7c5a4b546b80 08 November 2018, 23:15:24 UTC
8c2a487 Use -Os for lite release build (#4652) Summary: Set `-Os` for lite release build to minimize binary size. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4652 Differential Revision: D12965427 Pulled By: yiwu-arbug fbshipit-source-id: c8b647642c24b3e5df6a2cd13112e452a08e8398 08 November 2018, 06:10:28 UTC
fce5994 Add more sync point to fix flaky test GroupCommitTest Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/4637 Differential Revision: D12963727 Pulled By: miasantreble fbshipit-source-id: 76053501afbecc6ef388ddc56542fa0185243e3f 07 November 2018, 22:07:53 UTC
bec59f9 Ensure delete[] and not delete is used on buffer_ (#4647) Summary: Ensure delete[] and not delete is called on buffer_, as it is reset with new char[buffer_size_]. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4647 Differential Revision: D12961327 Pulled By: ajkr fbshipit-source-id: c1af373b98359edfdc291caebe4e0acdfb8afdd8 07 November 2018, 19:59:50 UTC
0148f71 Move `#include` outside of namespace (#4629) Summary: clang modules warns about `#include`s inside of namespaces. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4629 Reviewed By: ajkr Differential Revision: D12927333 Pulled By: andrewjcg fbshipit-source-id: a9e0b069e63d8224f78b7c3be1c3acf09bb83d3f 07 November 2018, 01:18:28 UTC
d7a0438 Include newer RocksDB versions in compat test (#4634) Summary: Include 5.16 and 5.17 in check_format_compatible.sh Pull Request resolved: https://github.com/facebook/rocksdb/pull/4634 Differential Revision: D12947140 Pulled By: riversand963 fbshipit-source-id: 79852b76d5139b2f31db59ed14cb368be01f2c32 06 November 2018, 22:25:39 UTC
566fc8b Black list some valgrind tests (#4642) Summary: valgrind tests with 1 thread run too long. To make it shorter, black list some long tests. These are already blacklisted in parallel valgrind tests, but they are not in non-parallel mode Pull Request resolved: https://github.com/facebook/rocksdb/pull/4642 Differential Revision: D12945237 Pulled By: siying fbshipit-source-id: 04cf977d435996480fe87aa09f14b17975b74f7d 06 November 2018, 22:22:36 UTC
0a53577 Move xxhash64 checksum support to 'Unreleased' section (#4627) Summary: Move the line `Add xxhash64 checksum support` to `Unreleased` section because it has not been released to 5.17. Pull Request resolved: https://github.com/facebook/rocksdb/pull/4627 Differential Revision: D12944123 Pulled By: riversand963 fbshipit-source-id: 2762857065b9e741c64ff8b6116ca62fe31891d8 06 November 2018, 20:06:22 UTC
back to top