https://github.com/facebook/rocksdb
- HEAD
- refs/heads/2.2.fb.branch
- refs/heads/2.3.fb.branch
- refs/heads/2.4.fb.branch
- refs/heads/2.5.fb.branch
- refs/heads/2.6.fb.branch
- refs/heads/2.7
- refs/heads/2.7.fb.branch
- refs/heads/2.8.1.fb
- refs/heads/2.8.fb
- refs/heads/2.8.fb.trunk
- refs/heads/3.0.fb
- refs/heads/3.0.fb.branch
- refs/heads/3.1.fb
- refs/heads/3.10.fb
- refs/heads/3.11.fb
- refs/heads/3.12.fb
- refs/heads/3.13.fb
- refs/heads/3.2.fb
- refs/heads/3.3.fb
- refs/heads/3.4.fb
- refs/heads/3.5.fb
- refs/heads/3.6.fb
- refs/heads/3.7.fb
- refs/heads/3.8.fb
- refs/heads/3.9.fb
- refs/heads/4.0.fb
- refs/heads/4.1.fb
- refs/heads/4.10.fb
- refs/heads/4.11.fb
- refs/heads/4.12.fb
- refs/heads/4.13.fb
- refs/heads/4.2.fb
- refs/heads/4.3.fb
- refs/heads/4.4.fb
- refs/heads/4.5.fb
- refs/heads/4.6.fb
- refs/heads/4.7.fb
- refs/heads/4.8.fb
- refs/heads/4.9.fb
- refs/heads/5.0.fb
- refs/heads/5.1.fb
- refs/heads/5.10.fb
- refs/heads/5.11.fb
- refs/heads/5.12.fb
- refs/heads/5.13.fb
- refs/heads/5.13.fb.myrocks
- refs/heads/5.14.fb
- refs/heads/5.14.fb.myrocks
- refs/heads/5.15.fb
- refs/heads/5.16.fb
- refs/heads/5.17.fb
- refs/heads/5.17.fb.myrocks
- refs/heads/5.18.fb
- refs/heads/5.2.fb
- refs/heads/5.3.fb
- refs/heads/5.4.fb
- refs/heads/5.5.fb
- refs/heads/5.6.fb
- refs/heads/5.7.fb
- refs/heads/5.7.fb.myrocks
- refs/heads/5.8.3
- refs/heads/5.8.fb
- refs/heads/5.9.fb
- refs/heads/5.9.fb.myrocks
- refs/heads/6.0.fb
- refs/heads/6.0.fb.myrocks
- refs/heads/6.1.fb
- refs/heads/6.1.fb.myrocks
- refs/heads/6.1.fb.prod201905
- refs/heads/6.10.fb
- refs/heads/6.11.fb
- refs/heads/6.12.fb
- refs/heads/6.13.fb
- refs/heads/6.13.fb.laser
- refs/heads/6.14.fb
- refs/heads/6.14.fb.laser
- refs/heads/6.15.fb
- refs/heads/6.16.fb
- refs/heads/6.17.fb
- refs/heads/6.17.fb.laser
- refs/heads/6.18.fb
- refs/heads/6.19.fb
- refs/heads/6.2.fb
- refs/heads/6.20.fb
- refs/heads/6.21.fb
- refs/heads/6.22-history.md-fixup
- refs/heads/6.22.fb
- refs/heads/6.23.fb
- refs/heads/6.24.fb
- refs/heads/6.25.fb
- refs/heads/6.26.fb
- refs/heads/6.27.fb
- refs/heads/6.28.fb
- refs/heads/6.29.fb
- refs/heads/6.3.fb
- refs/heads/6.3.fb.myrocks
- refs/heads/6.3.fb.myrocks2
- refs/heads/6.3fb
- refs/heads/6.4.fb
- refs/heads/6.5.fb
- refs/heads/6.6.fb
- refs/heads/6.7.fb
- refs/heads/6.8.fb
- refs/heads/6.9.fb
- refs/heads/7.0.fb
- refs/heads/7.1.fb
- refs/heads/7.10.fb
- refs/heads/7.2.fb
- refs/heads/7.3.fb
- refs/heads/7.4.fb
- refs/heads/7.5.fb
- refs/heads/7.6.fb
- refs/heads/7.7.fb
- refs/heads/7.8.fb
- refs/heads/7.9.fb
- refs/heads/8.0.fb
- refs/heads/8.1.fb
- refs/heads/8.10.fb
- refs/heads/8.11.2_zippydb
- refs/heads/8.11.fb
- refs/heads/8.11.fb_zippydb
- refs/heads/8.2.fb
- refs/heads/8.3.fb
- refs/heads/8.4.fb
- refs/heads/8.5.fb
- refs/heads/8.6.fb
- refs/heads/8.7.fb
- refs/heads/8.8.fb
- refs/heads/8.9.fb
- refs/heads/9.0.fb
- refs/heads/9.1.fb
- refs/heads/9.1.fb.myrocks
- refs/heads/9.2.fb
- refs/heads/9.3.fb
- refs/heads/9.3.fb_exactly_at_hlc_testing
- refs/heads/9.4.fb
- refs/heads/9.5.fb
- refs/heads/9.6.fb
- refs/heads/adaptive
- refs/heads/ajkr-patch-1
- refs/heads/ajkr-patch-2
- refs/heads/blob_shadow
- refs/heads/bottom-pri-level
- refs/heads/bugfix-build-detect
- refs/heads/checksum_readahead_mmap_fix
- refs/heads/draft-myrocks-and-fbcode-8.0.fb
- refs/heads/feature/debug-rocksdbjavastatic
- refs/heads/feature/travis-arm64
- refs/heads/fix-release-notes
- refs/heads/fix-win2022-build
- refs/heads/fix-write-batch-comment
- refs/heads/format_compatible_4
- refs/heads/getmergeops
- refs/heads/gh-pages-old
- refs/heads/history-update
- refs/heads/hotfix/lambda-capture
- refs/heads/improve-support
- refs/heads/jijiew-patch-1
- refs/heads/katherinez-patch-1
- refs/heads/katherinez-patch-2
- refs/heads/main
- refs/heads/master
- refs/heads/mdcallag_benchmark_oct22
- refs/heads/nvm_cache_proto
- refs/heads/pr-sanity-check-as-GHAction
- refs/heads/pr/11267
- refs/heads/pr/6062
- refs/heads/ramvadiv-patch-1
- refs/heads/release_fix
- refs/heads/revert-10606-7.6.1
- refs/heads/ribbon_bloom_hybrid
- refs/heads/scaffold
- refs/heads/siying-patch-1
- refs/heads/siying-patch-10
- refs/heads/siying-patch-2
- refs/heads/siying-patch-3
- refs/heads/siying-patch-4
- refs/heads/siying-patch-5
- refs/heads/siying-patch-6
- refs/heads/siying-patch-7
- refs/heads/siying-patch-8
- refs/heads/skip_memtable_flush
- refs/heads/testing_ppc_build
- refs/heads/tests
- refs/heads/unschedule_issue_test_base
- refs/heads/unused-var
- refs/heads/v6.6.4
- refs/heads/xxhash_merge_base
- refs/heads/yiwu_stackable
- refs/heads/yuslepukhin
- refs/remotes/origin/5.13.fb
- refs/tags/2.5.fb
- refs/tags/2.6.fb
- refs/tags/3.0.fb
- refs/tags/do-not-use-me2
- refs/tags/rocksdb-3.1
- refs/tags/rocksdb-3.10.2
- refs/tags/rocksdb-3.11
- refs/tags/rocksdb-3.11.1
- refs/tags/rocksdb-3.11.2
- refs/tags/rocksdb-3.2
- refs/tags/rocksdb-3.3
- refs/tags/rocksdb-3.4
- refs/tags/rocksdb-3.5
- refs/tags/rocksdb-3.5.1
- refs/tags/rocksdb-3.6.1
- refs/tags/rocksdb-3.6.2
- refs/tags/rocksdb-3.7
- refs/tags/rocksdb-3.8
- refs/tags/rocksdb-3.9
- refs/tags/rocksdb-3.9.1
- refs/tags/rocksdb-4.1
- refs/tags/rocksdb-5.10.2
- refs/tags/rocksdb-5.10.3
- refs/tags/rocksdb-5.10.4
- refs/tags/rocksdb-5.11.2
- refs/tags/rocksdb-5.11.3
- refs/tags/rocksdb-5.14.3
- refs/tags/rocksdb-5.2.1
- refs/tags/rocksdb-5.3.3
- refs/tags/rocksdb-5.3.4
- refs/tags/rocksdb-5.3.5
- refs/tags/rocksdb-5.3.6
- refs/tags/rocksdb-5.4.10
- refs/tags/rocksdb-5.4.5
- refs/tags/rocksdb-5.4.6
- refs/tags/rocksdb-5.5.2
- refs/tags/rocksdb-5.5.3
- refs/tags/rocksdb-5.5.4
- refs/tags/rocksdb-5.5.5
- refs/tags/rocksdb-5.5.6
- refs/tags/rocksdb-5.6.1
- refs/tags/rocksdb-5.6.2
- refs/tags/rocksdb-5.7.1
- refs/tags/rocksdb-5.7.2
- refs/tags/rocksdb-5.7.3
- refs/tags/rocksdb-5.7.5
- refs/tags/rocksdb-5.8.6
- refs/tags/rocksdb-5.8.7
- refs/tags/rocksdb-5.8.8
- refs/tags/rocksdb-5.9.2
- refs/tags/v4.0
- refs/tags/v4.1
- refs/tags/v5.10.2
- refs/tags/v5.10.3
- refs/tags/v5.10.4
- refs/tags/v5.11.2
- refs/tags/v5.11.3
- refs/tags/v5.13.3
- refs/tags/v5.14.3
- refs/tags/v5.15.10
- refs/tags/v5.18.3
- refs/tags/v5.2.1
- refs/tags/v5.3.3
- refs/tags/v5.3.4
- refs/tags/v5.3.5
- refs/tags/v5.3.6
- refs/tags/v5.4.10
- refs/tags/v5.4.5
- refs/tags/v5.4.6
- refs/tags/v5.5.2
- refs/tags/v5.5.3
- refs/tags/v5.5.4
- refs/tags/v5.5.5
- refs/tags/v5.5.6
- refs/tags/v5.6.1
- refs/tags/v5.6.2
- refs/tags/v5.7.1
- refs/tags/v5.7.2
- refs/tags/v5.7.3
- refs/tags/v5.7.5
- refs/tags/v5.8.6
- refs/tags/v5.8.7
- refs/tags/v5.8.8
- refs/tags/v5.9.2
- refs/tags/v6.0.1
- refs/tags/v6.0.2
- refs/tags/v6.1.1
- refs/tags/v6.1.2
- refs/tags/v6.10.1
- refs/tags/v6.10.2
- refs/tags/v6.11.4
- refs/tags/v6.11.6
- refs/tags/v6.12.6
- refs/tags/v6.12.7
- refs/tags/v6.13.2
- refs/tags/v6.13.3
- refs/tags/v6.14.5
- refs/tags/v6.14.6
- refs/tags/v6.15.4
- refs/tags/v6.15.5
- refs/tags/v6.16.3
- refs/tags/v6.16.4
- refs/tags/v6.17.3
- refs/tags/v6.2.2
- refs/tags/v6.2.4
- refs/tags/v6.20.3
- refs/tags/v6.22.1
- refs/tags/v6.25.3
- refs/tags/v6.26.1
- refs/tags/v6.28.2
- refs/tags/v6.29.3
- refs/tags/v6.29.4
- refs/tags/v6.29.5
- refs/tags/v6.3.6
- refs/tags/v6.4.6
- refs/tags/v6.5.2
- refs/tags/v6.5.3
- refs/tags/v6.6.3
- refs/tags/v6.6.4
- refs/tags/v6.7.3
- refs/tags/v6.8.1
- refs/tags/v7.0.1
- refs/tags/v7.0.2
- refs/tags/v7.0.4
- refs/tags/v7.2.0
- refs/tags/v7.2.2
- refs/tags/v7.5.3
- refs/tags/v7.7.2
- refs/tags/v7.9.2
- refs/tags/v8.0.0
- refs/tags/v8.11.4
- refs/tags/v8.3.2
- refs/tags/v8.3.3
- refs/tags/v8.4.4
- refs/tags/v8.5.3
- refs/tags/v8.6.7
- refs/tags/v8.7.3
- refs/tags/v9.0.1
- refs/tags/v9.1.1
- refs/tags/v9.2.1
- refs/tags/v9.3.1
- refs/tags/v9.4.0
- v9.5.2
- v9.1.0
- v9.0.0
- v8.9.1
- v8.8.1
- v8.5.4
- v8.11.3
- v8.10.2
- v8.10.0
- v8.1.1
- v7.8.3
- v7.7.8
- v7.7.3
- v7.6.0
- v7.4.5
- v7.4.4
- v7.4.3
- v7.3.1
- v7.10.2
- v7.1.2
- v7.1.1
- v7.0.3
- v6.27.3
- v6.26.0
- v6.25.1
- v6.24.2
- v6.23.3
- v6.23.2
- v6.19.3
- v6.15.2
- v5.8
- v5.5.1
- v5.4.7
- v5.18.4
- v5.17.2
- v5.16.6
- v5.14.2
- v5.13.4
- v5.13.2
- v5.13.1
- v5.12.5
- v5.12.4
- v5.12.3
- v5.12.2
- v5.1.4
- v5.1.3
- v5.1.2
- v5.0.2
- v5.0.1
- v4.9
- v4.8
- v4.6.1
- v4.5.1
- v4.4.1
- v4.4
- v4.3.1
- v4.3
- v4.2
- v4.13.5
- v4.13
- v4.11.2
- v3.9
- v3.8
- v3.7
- v3.6.1
- v3.5
- v3.4
- v3.3
- v3.2
- v3.13.1
- v3.13
- v3.12.1
- v3.12
- v3.11
- v3.10
- v3.1
- v3.0
- v2.8
- v2.7
- v2.6
- v2.5
- v2.4
- v2.3
- v2.2
- v2.1
- v2.0
- v1.5.9.1
- v1.5.8.2
- v1.5.8.1
- v1.5.8
- v1.5.7
- rocksdb-5.8
- rocksdb-5.4.7
- rocksdb-5.1.4
- rocksdb-5.1.3
- rocksdb-5.1.2
- rocksdb-5.0.2
- rocksdb-5.0.1
- rocksdb-4.9
- rocksdb-4.8
- rocksdb-4.6.1
- rocksdb-4.5.1
- rocksdb-4.4.1
- rocksdb-4.4
- rocksdb-4.3.1
- rocksdb-4.3
- rocksdb-4.2
- rocksdb-4.13.5
- rocksdb-4.13
- rocksdb-4.11.2
- rocksdb-3.10.1
- blob_st_lvl-pre
- 2.8.fb
- 2.7.fb
- 2.4.fb
- 2.3.fb
- 2.2.fb
- 2.1.fb
- 2.0.fb
- 1.5.9.fb
- 1.5.9.2.fb
- 1.5.9.1.fb
- 1.5.8.fb
- 1.5.8.2.fb
- 1.5.8.1.fb
- 1.5.7.fb
Revision | Author | Date | Message | Commit Date |
---|---|---|---|---|
01d5faf | Adam Retter | 11 December 2019, 04:00:57 UTC | Add Visual Studio 2015 to AppVeyor (#5446) Summary: This is required to compile on Windows with Visual Studio 2015, which is used for creating the RocksJava releases. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5446 Differential Revision: D18924811 fbshipit-source-id: a183a62e79a2af5aaf59cd08235458a172fe7dcb | 19 February 2020, 21:11:39 UTC |
6367dee | Peter Dillinger | 30 January 2020, 19:00:08 UTC | Don't download from (unreliable) maven.org (#6348) Summary: I set up a mirror of our Java deps on github so we can download them through github URLs rather than maven.org, which is proving terribly unreliable from Travis builds. Also sanitized calls to curl, so they are easier to read and appropriately fail on download failure. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6348 Test Plan: CI Differential Revision: D19633621 Pulled By: pdillinger fbshipit-source-id: 7eb3f730953db2ead758dc94039c040f406790f3 | 19 February 2020, 21:11:32 UTC |
524f195 | Adam Retter | 29 January 2020, 16:00:16 UTC | Reduce the need to re-download dependencies (#6318) Summary: Both changes are related to RocksJava: 1. Allow dependencies that are already present on the host system due to Maven to be reused in Docker builds. 2. Extend the `make clean-not-downloaded` target to RocksJava, so that libraries needed as dependencies for the test suite are not deleted and re-downloaded unnecessarily. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6318 Differential Revision: D19608742 Pulled By: pdillinger fbshipit-source-id: 25e25649e3e3212b537ac4512b40e2e53dc02ae7 | 19 February 2020, 21:11:25 UTC |
71b3e43 | Levi Tamasi | 16 January 2020, 01:53:23 UTC | Access Maven Central over HTTPS (#6301) Summary: As of 1/15/2020, Maven Central does not support plain HTTP. Because of this, our Travis and AppVeyor builds have started failing during the assertj download step. This patch will hopefully fix these issues. See https://blog.sonatype.com/central-repository-moving-to-https for more info. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6301 Test Plan: Will monitor the builds. ("I don't always test my changes but when I do, I do it in production.") Differential Revision: D19422923 Pulled By: ltamasi fbshipit-source-id: 76f9a8564a5b66ddc721d705f9cbfc736bf7a97d | 19 February 2020, 21:10:45 UTC |
551a110 | Fosco Marotto | 31 January 2020, 21:03:51 UTC | Update version to 6.6.4 | 31 January 2020, 21:03:51 UTC |
f5f46ad | anand76 | 31 January 2020, 20:58:44 UTC | Fix a unit test in error_handler_test.cc | 31 January 2020, 20:58:44 UTC |
07786d9 | anand76 | 30 January 2020, 18:53:46 UTC | Force a new manifest file if append to current one fails (#6331) Summary: Fix for issue https://github.com/facebook/rocksdb/issues/6316. When an append/sync of the manifest file fails due to an IO error such as NoSpace, we don't always put the DB in read-only mode. This is true for flush and compactions, as well as foreground operations such as column family add/drop, CompactFiles etc. Subsequent changes to the DB will be recorded in the same manifest file, which would have a corrupted record in the middle due to the previous failure. On next DB::Open(), it will fail to process the full manifest and data will be lost. To fix this, we reset VersionSet::descriptor_log_ on append/sync failure, which will force a new manifest file to be written on the next append. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6331 Test Plan: Add new unit tests in error_handler_test.cc Differential Revision: D19632951 Pulled By: anand1976 fbshipit-source-id: 68d527cb6e59a94cbbbf9f5a17a7f464381d51e3 | 31 January 2020, 19:45:01 UTC |
ac29858 | anand76 | 24 January 2020, 22:00:58 UTC | Update version to 6.6.3 Summary: Test Plan: Reviewers: Subscribers: Tasks: Tags: | 24 January 2020, 22:01:40 UTC |
f7619b4 | Maysam Yabandeh | 24 January 2020, 21:03:19 UTC | Implement PinnableSlice::remove_prefix (#6330) Summary: The function was left unimplemented. Although we currently don't have a use for it, it was declared with an assert(0) to prevent mistakenly using the remove_prefix of the parent class. A function body with only assert(0), however, causes issues with some compilers' warning levels. The patch implements the function to avoid the warning. It also piggybacks fixes for some minor warnings about unnecessary semicolons after function definitions. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6330 Differential Revision: D19559062 Pulled By: maysamyabandeh fbshipit-source-id: 3a022484f688c9abd4556e5412bcc2628ab96a00 | 24 January 2020, 21:30:13 UTC |
19e2178 | anand76 | 23 January 2020, 21:59:48 UTC | Fix queue manipulation in WriteThread::BeginWriteStall() (#6322) Summary: When there is a write stall, the active write group leader calls ```BeginWriteStall()``` to walk the queue of writers and remove any with the ```no_slowdown``` option set. There was a bug in the code which updated the back pointer but not the forward pointer (```link_newer```), corrupting the list and causing some threads to wait forever. This PR fixes it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6322 Test Plan: Add a unit test in db_write_test Differential Revision: D19538313 Pulled By: anand1976 fbshipit-source-id: 6fbed819e594913f435886606f5d36f74f235c3a | 24 January 2020, 18:08:01 UTC |
1fab610 | Sagar Vemuri | 13 January 2020, 20:28:06 UTC | Update version to 6.6.2 | 13 January 2020, 20:28:06 UTC |
4df4e63 | Sagar Vemuri | 11 January 2020, 03:01:00 UTC | Consider all compaction input files to compute the oldest ancestor time (#6279) Summary: Look at all compaction input files to compute the oldest ancestor time. In https://github.com/facebook/rocksdb/issues/5992 we changed how creation_time (aka oldest-ancestor-time) table property of compaction output files is computed from max(creation-time-of-all-compaction-inputs) to min(creation-time-of-all-inputs). This exposed a bug where, during compaction, the creation_time:s of only the L0 compaction inputs were being looked at, and all other input levels were being ignored. This PR fixes the issue. Some TTL compactions when using Level-Style compactions might not have run due to this bug. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6279 Test Plan: Enhanced the unit tests to validate that the correct time is propagated to the compaction outputs. Differential Revision: D19337812 Pulled By: sagar0 fbshipit-source-id: edf8a72f11e405e93032ff5f45590816debe0bb4 | 13 January 2020, 20:20:11 UTC |
beca3c9 | Yanqin Jin | 02 January 2020, 20:50:59 UTC | Update release date | 02 January 2020, 20:50:59 UTC |
4fc5e6c | Yanqin Jin | 02 January 2020, 20:33:36 UTC | Update HISTORY and bump up version number | 02 January 2020, 20:38:21 UTC |
74b01ac | Mike Kolupaev | 18 December 2019, 04:07:21 UTC | Fix use-after-free and double-deleting files in BackgroundCallPurge() (#6193) Summary: The bad code was: ``` mutex.Lock(); // `mutex` protects `container` for (auto& x : container) { mutex.Unlock(); // do stuff to x mutex.Lock(); } ``` It's incorrect because both `x` and the iterator may become invalid if another thread modifies the container while this thread is not holding the mutex. Broken by https://github.com/facebook/rocksdb/pull/5796 - it replaced a `while (!container.empty())` loop with a `for (auto x : container)`. (RocksDB code does a lot of such unlocking+re-locking of mutexes, and this type of bugs comes up a lot :/ ) Pull Request resolved: https://github.com/facebook/rocksdb/pull/6193 Test Plan: Ran some logdevice integration tests that were crashing without this fix. Differential Revision: D19116874 Pulled By: al13n321 fbshipit-source-id: 9672bc4227c1b68f46f7436db2b96811adb8c703 | 02 January 2020, 20:21:53 UTC |
924bc5f | 解轶伦 | 17 December 2019, 21:20:42 UTC | delete superversions in BackgroundCallPurge (#6146) Summary: I found that CleanupSuperVersion() may block Get() for 30ms+ (per MemTable is 256MB). Then I found "delete sv" in ~SuperVersion() takes the time. The backtrace looks like this DBImpl::GetImpl() -> DBImpl::ReturnAndCleanupSuperVersion() -> DBImpl::CleanupSuperVersion() : delete sv; -> ~SuperVersion() I think it's better to delete in a background thread, please review it。 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6146 Differential Revision: D18972066 fbshipit-source-id: 0f7b0b70b9bb1e27ad6fc1c8a408fbbf237ae08c | 02 January 2020, 20:20:29 UTC |
7168d16 | Levi Tamasi | 20 December 2019, 02:03:24 UTC | BlobDB: only compare CF IDs when checking whether an API call is for the default CF (#6226) Summary: BlobDB currently only supports using the default column family. The earlier code enforces this by comparing the `ColumnFamilyHandle` passed to the `Get`/`Put`/etc. call with the handle returned by `DefaultColumnFamily` (which, at the end of the day, comes from `DBImpl::default_cf_handle_`). Since other `ColumnFamilyHandle`s can also point to the default column family, this can reject legitimate requests as well. (As an example, with the earlier code, the handle returned by `BlobDB::Open` cannot actually be used in API calls.) The patch fixes this by comparing only the IDs of the column family handles instead of the pointers themselves. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6226 Test Plan: `make check` Differential Revision: D19187461 Pulled By: ltamasi fbshipit-source-id: 54ce2e12ebb1f07e6d1e70e3b1e0213dfa94bda2 | 20 December 2019, 02:29:39 UTC |
d848059 | suzanwen | 09 December 2019, 05:33:23 UTC | Isolate building db_bench from tests with `WITH_BENCHMARK_TOOLS` option. (#6098) Summary: Isolate `db_bench` from building tests, out of respect for the related comments. Building tests now depends on both `WITH_TESTS=ON` and `CMAKE_BUILD_TYPE=Debug`, and building `db_bench` depends on `WITH_BENCHMARK_TOOLS=ON`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6098 Test Plan: cmake -DCMAKE_BUILD_TYPE=Debug/Release -DWITH_TESTS=ON/OFF -DWITH_BENCHMARK_TOOLS=ON/OFF -DWITH_TOOLS=ON/OFF && make Differential Revision: D18856891 Pulled By: riversand963 fbshipit-source-id: addbee8ad6abefb877843a313b4630cfab3ce4f0 | 19 December 2019, 22:05:50 UTC |
5929ac8 | Adam Retter | 14 December 2019, 00:11:40 UTC | Env should also load the native library (#6167) Summary: Closes https://github.com/facebook/rocksdb/issues/6118 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6167 Differential Revision: D19053577 Pulled By: pdillinger fbshipit-source-id: 86aca9a5bec0947a641649b515da17b3cb12bdde | 19 December 2019, 19:38:15 UTC |
9ea7363 | Adam Retter | 12 December 2019, 19:58:26 UTC | Add missing mutable DBOptions to RocksJava (#6152) Summary: As requested in https://github.com/facebook/rocksdb/issues/6127 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6152 Differential Revision: D18955608 Pulled By: pdillinger fbshipit-source-id: 3e1367d944e44d5f1675a422f7dd2451c86feb6f | 19 December 2019, 19:38:06 UTC |
137dfbc | 奏之章 | 12 December 2019, 23:16:13 UTC | Fix RangeDeletion bug (#6062) Summary: When reading keys from a snapshot, if a range deletion was added after the snapshot was created and that range deletion is inside an immutable memtable, we get the wrong key set. More detail is in the code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6062 Differential Revision: D18966785 Pulled By: pdillinger fbshipit-source-id: 38a60bb1e2d0a1dbfc8ec641617200b6a02b86c3 | 18 December 2019, 01:09:46 UTC |
3ff4012 | Levi Tamasi | 16 December 2019, 22:09:03 UTC | Update HISTORY.md with recent BlobDB related changes | 17 December 2019, 20:32:20 UTC |
df032f5 | Levi Tamasi | 12 December 2019, 19:29:01 UTC | Do not update SST <-> blob file mapping if compaction failed Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6156 Test Plan: Extended unit tests. Differential Revision: D18943867 Pulled By: ltamasi fbshipit-source-id: b3669d2dd6af08e987ad1a59d6712ae2514da0b1 | 17 December 2019, 20:31:10 UTC |
142f00d | Levi Tamasi | 16 December 2019, 23:15:42 UTC | Update HISTORY.md with the recent memtable trimming fixes Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6194 Differential Revision: D19125292 Pulled By: ltamasi fbshipit-source-id: d41aca2755ec4bec07feedd6b561e8d18606a931 | 17 December 2019, 15:53:12 UTC |
509da20 | Levi Tamasi | 16 December 2019, 21:13:42 UTC | Fix a data race related to memtable trimming (#6187) Summary: https://github.com/facebook/rocksdb/pull/6177 introduced a data race involving `MemTableList::InstallNewVersion` and `MemTableList::NumFlushed`. The patch fixes this by caching whether the current version has any memtable history (i.e. flushed memtables that are kept around for transaction conflict checking) in an `std::atomic<bool>` member called `current_has_history_`, similarly to how `current_memory_usage_excluding_last_` is handled. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6187 Test Plan: ``` make clean COMPILE_WITH_TSAN=1 make db_test -j24 ./db_test ``` Differential Revision: D19084059 Pulled By: ltamasi fbshipit-source-id: 327a5af9700fb7102baea2cc8903c085f69543b9 | 17 December 2019, 15:52:22 UTC |
628786e | Levi Tamasi | 14 December 2019, 03:09:53 UTC | Do not schedule memtable trimming if there is no history (#6177) Summary: We have observed an increase in CPU load caused by frequent calls to `ColumnFamilyData::InstallSuperVersion` from `DBImpl::TrimMemtableHistory` when using `max_write_buffer_size_to_maintain` to limit the amount of memtable history maintained for transaction conflict checking. Part of the issue is that trimming can potentially be scheduled even if there is no memtable history. The patch adds a check that fixes this. See also https://github.com/facebook/rocksdb/pull/6169. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6177 Test Plan: Compared `perf` output for ``` ./db_bench -benchmarks=randomtransaction -optimistic_transaction_db=1 -statistics -stats_interval_seconds=1 -duration=90 -num=500000 --max_write_buffer_size_to_maintain=16000000 --transaction_set_snapshot=1 --threads=32 ``` before and after the change. There is a significant reduction for the call chain `rocksdb::DBImpl::TrimMemtableHistory` -> `rocksdb::ColumnFamilyData::InstallSuperVersion` -> `rocksdb::ThreadLocalPtr::StaticMeta::Scrape` even without https://github.com/facebook/rocksdb/pull/6169. Differential Revision: D19057445 Pulled By: ltamasi fbshipit-source-id: dff81882d7b280e17eda7d9b072a2d4882c50f79 | 17 December 2019, 15:52:22 UTC |
80de900 | Levi Tamasi | 13 December 2019, 20:45:49 UTC | Do not create/install new SuperVersion if nothing was deleted during memtable trim (#6169) Summary: We have observed an increase in CPU load caused by frequent calls to `ColumnFamilyData::InstallSuperVersion` from `DBImpl::TrimMemtableHistory` when using `max_write_buffer_size_to_maintain` to limit the amount of memtable history maintained for transaction conflict checking. As it turns out, this is caused by the code creating and installing a new `SuperVersion` even if no memtables were actually trimmed. The patch adds a check to avoid this. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6169 Test Plan: Compared `perf` output for ``` ./db_bench -benchmarks=randomtransaction -optimistic_transaction_db=1 -statistics -stats_interval_seconds=1 -duration=90 -num=500000 --max_write_buffer_size_to_maintain=16000000 --transaction_set_snapshot=1 --threads=32 ``` before and after the change. With the fix, the call chain `rocksdb::DBImpl::TrimMemtableHistory` -> `rocksdb::ColumnFamilyData::InstallSuperVersion` -> `rocksdb::ThreadLocalPtr::StaticMeta::Scrape` no longer registers in the `perf` report. Differential Revision: D19031509 Pulled By: ltamasi fbshipit-source-id: 02686fce594e5b50eba0710e4b28a9b808c8aa20 | 17 December 2019, 15:52:22 UTC |
1d9eae3 | Yanqin Jin | 17 December 2019, 04:00:43 UTC | Use Env::LoadEnv to create custom Env objects (#6196) Summary: As title. Previous assumption was that the underlying lib can always return a shared_ptr<Env>. This is too strong. Therefore, we use Env::LoadEnv to relax it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6196 Test Plan: make check Differential Revision: D19133199 Pulled By: riversand963 fbshipit-source-id: c83a0c02a42610d077054f2de1acfc45126b3a75 | 17 December 2019, 07:00:35 UTC |
2ba7f1e | anand1976 | 17 December 2019, 04:35:11 UTC | Fix crash in Transaction::MultiGet() when num_keys > 32 Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6192 Test Plan: Add a unit test that fails without the fix and passes now make check Differential Revision: D19124781 Pulled By: anand1976 fbshipit-source-id: 8c8cb6fa16c3fc23ec011e168561a13f76bbd783 | 17 December 2019, 06:17:35 UTC |
d6e1990 | Maysam Yabandeh | 12 December 2019, 21:48:50 UTC | Fix build breakage from lock_guard error (#6161) Summary: This change fixes a source issue that caused compile time error which breaks build for many fbcode services in that setup. The size() member function of channel is a const member, so member variables accessed within it are implicitly const as well. This caused error when clang fails to resolve to a constructor that takes std::mutex because the suitable constructor got rejected due to loss of constness for its argument. The fix is to add mutable modifier to the lock_ member of channel. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6161 Differential Revision: D18967685 Pulled By: maysamyabandeh fbshipit-source-id: 698b6a5153c3c92eeacb842c467aa28cc350d432 | 12 December 2019, 21:54:29 UTC |
92453f2 | Peter Dillinger | 06 December 2019, 18:25:40 UTC | Disable new Bloom filter assertion (#6128) Summary: A longstanding bug in our C interface can trigger this assertion; see issue https://github.com/facebook/rocksdb/issues/6129. Disabling the assertion for now (for 6.6.0) and will re-enable on fix of that bug. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6128 Differential Revision: D18854899 Pulled By: pdillinger fbshipit-source-id: 9eb5294b9f11b208dc1a8cc148aaa31e47ff892b | 06 December 2019, 18:31:04 UTC |
e106a3c | Jim Meyering | 05 December 2019, 19:47:38 UTC | build_tools/precommit_checker.py: don't hard-code a platform-afflicted python path (#6124) Summary: Use `#!/usr/bin/env python2.7` instead. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6124 Test Plan: `J=8 make commit_prereq` Differential Revision: D18834668 Pulled By: ltamasi fbshipit-source-id: cec40266cd5bcae8bf6cbe5a564ae78540deccc4 | 05 December 2019, 20:20:14 UTC |
98c4147 | Yanqin Jin | 03 December 2019, 01:43:37 UTC | Let DBSecondary close files after catch up (#6114) Summary: After secondary instance replays the logs from primary, certain files become obsolete. The secondary should find these files, evict their table readers from table cache and close them. If this is not done, the secondary will hold on to these files and prevent their space from being freed. Test plan (devserver): ``` $./db_secondary_test --gtest_filter=DBSecondaryTest.SecondaryCloseFiles $make check $./db_stress -ops_per_thread=100000 -enable_secondary=true -threads=32 -secondary_catch_up_one_in=10000 -clear_column_family_one_in=1000 -reopen=100 ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6114 Differential Revision: D18769998 Pulled By: riversand963 fbshipit-source-id: 5d1f151567247196164e1b79d8402fa2045b9120 | 03 December 2019, 01:53:24 UTC |
96da9d7 | anand76 | 02 December 2019, 22:58:22 UTC | Remove key length assertion LRUHandle::CalcTotalCharge (#6115) Summary: Inserting an entry in the block cache with 0 length key is a valid use case. Remove the assertion in ```LRUHandle::CalcTotalCharge```. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6115 Differential Revision: D18769693 Pulled By: anand1976 fbshipit-source-id: 34cc159650300dda6d7273480640478f28392cda | 02 December 2019, 23:52:55 UTC |
7e8b4f5 | Peter Dillinger | 27 November 2019, 23:05:32 UTC | Update comment on max_valid_backups_to_open (#6105) Summary: To reflect changes in PR https://github.com/facebook/rocksdb/issues/6072 This comment also implies that a seemingly valid use-case for max_valid_backups_to_open is flawed: even if you only want to add a new backup without trying to delete, you might need to clean up after a backup creation that never finished. To clean up properly requires opening all backups to get proper ref counts on shared files. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6105 Test Plan: code comment only Differential Revision: D18736716 Pulled By: pdillinger fbshipit-source-id: 2447c0000eefe3a4ca606926bfe922a8456b0cb7 | 27 November 2019, 23:17:57 UTC |
ce1abbc | Peter Dillinger | 27 November 2019, 18:22:45 UTC | Update format_version comment for 6.6.0 Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6097 Differential Revision: D18729661 Pulled By: pdillinger fbshipit-source-id: d2e4a9d6803aad8dd61ececd5c2b861e6f2da73b | 27 November 2019, 23:17:45 UTC |
4d26e75 | Adam Retter | 27 November 2019, 21:07:28 UTC | Fix BlobDB compilation on older GCC versions Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6094 Differential Revision: D18731951 Pulled By: ltamasi fbshipit-source-id: 5b73c6009c748f6a2a48d4d880b1259980d801d4 | 27 November 2019, 22:10:33 UTC |
880e30a | John Ericson | 27 November 2019, 05:40:16 UTC | Work around weird unused errors with Mingw (#6075) Summary: From the rest of the code, it looks like this maybe can be unconditionally given the attribute? But I couldn't test with MSVC so I defensively put it under CPP. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6075 Differential Revision: D18723749 fbshipit-source-id: 45fc8732c28dd29aab1644225d68f3c6f39bd69b | 27 November 2019, 17:51:01 UTC |
73c1203 | sdong | 27 November 2019, 05:38:38 UTC | Support options.max_open_files = -1 with periodic_compaction_seconds (#6090) Summary: options.periodic_compaction_seconds isn't supported when options.max_open_files != -1. This is because the file creation time is stored in table properties, which are not guaranteed to be loaded unless options.max_open_files = -1. Relax this constraint by storing the information in the manifest. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6090 Test Plan: Pass all existing tests; Modify an existing test to force the manifest value to take 0 to simulate backward compatibility case; manually open the DB generated with the change by release 4.2. Differential Revision: D18702268 fbshipit-source-id: 13e0bd94f546498a04f3dc5fc0d9dff5125ec9eb | 27 November 2019, 17:50:44 UTC |
496a6ae | anand76 | 27 November 2019, 02:59:24 UTC | Fix HISTORY.md for 6.6.0 (#6096) Summary: Some of the entries were incorrectly listed under 6.5.0. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6096 Differential Revision: D18722801 Pulled By: gfosco fbshipit-source-id: 18d1187deb6a9d69a8feb68b727d2f720a65f2bc | 27 November 2019, 03:04:49 UTC |
ca3b6c2 | Peter Dillinger | 27 November 2019, 02:18:29 UTC | Expose and elaborate FilterBuildingContext (#6088) Summary: This change enables custom implementations of FilterPolicy to wrap a variety of NewBloomFilterPolicy and select among them based on contextual information such as table level and compaction style. * Moves FilterBuildingContext to public API and elaborates it with more useful data. (It would be nice to put more general options-like data, but at the time this object is constructed, we are using internal APIs ImmutableCFOptions and MutableCFOptions and don't have easy access to ColumnFamilyOptions that I can tell.) * Renames BloomFilterPolicy::GetFilterBitsBuilderInternal to GetBuilderWithContext, because it's now public. * Plumbs through the table's "level_at_creation" for filter building context. * Simplified some tests by adding GetBuilder() to MockBlockBasedTableTester. * Adds test as DBBloomFilterTest.ContextCustomFilterPolicy, including sample wrapper class LevelAndStyleCustomFilterPolicy. * Fixes a cross-test bug in DBBloomFilterTest.OptimizeFiltersForHits where it does not reset perf context. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6088 Test Plan: make check, valgrind on db_bloom_filter_test Differential Revision: D18697817 Pulled By: pdillinger fbshipit-source-id: 5f987a2d7b07cc7a33670bc08ca6b4ca698c1cf4 | 27 November 2019, 02:24:10 UTC |
6d58ea9 | Adam Retter | 27 November 2019, 00:55:46 UTC | Fix compilation under MSVC VS2015 (#6081) Summary: **NOTE**: this also needs to be back-ported to 6.4.6 and possibly older branches if further releases from them are envisaged. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6081 Differential Revision: D18710107 Pulled By: zhichao-cao fbshipit-source-id: 03260f9316566e2bfc12c7d702d6338bb7941e01 | 27 November 2019, 02:24:09 UTC |
8ae149e | Patrick Double | 27 November 2019, 00:51:26 UTC | Add shared library for musl-libc (#3143) Summary: Add the jni library for musl-libc, specifically for incorporating into Alpine based docker images. The classifier is `musl64`. I have signed the CLA electronically. Pull Request resolved: https://github.com/facebook/rocksdb/pull/3143 Differential Revision: D18719372 fbshipit-source-id: 6189d149310b6436d6def7d808566b0234b23313 | 27 November 2019, 02:24:09 UTC |
d9314a9 | Levi Tamasi | 27 November 2019, 00:42:44 UTC | Refactor and clean up the code that reads a blob from a file (#6093) Summary: This patch factors out the logic that reads a (potentially compressed) blob from a file into a separate helper method `GetRawBlobFromFile`, and cleans up the code a bit. Also, errors during decompression are now logged/propagated to the user by returning a `Status` code of `Corruption`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6093 Test Plan: `make check` Differential Revision: D18716673 Pulled By: ltamasi fbshipit-source-id: 44144bc064cab616862d5643f34384f2bae6eb78 | 27 November 2019, 00:49:39 UTC |
57f3032 | Peter Dillinger | 26 November 2019, 23:49:16 UTC | Allow fractional bits/key in BloomFilterPolicy (#6092) Summary: There's no technological impediment to allowing the Bloom filter bits/key to be non-integer (fractional/decimal) values, and it provides finer control over the memory vs. accuracy trade-off. This is especially handy in using the format_version=5 Bloom filter in place of the old one, because bits_per_key=9.55 provides the same accuracy as the old bits_per_key=10. This change not only requires refining the logic for choosing the best num_probes for a given bits/key setting, it revealed a flaw in that logic. As bits/key gets higher, the best num_probes for a cache-local Bloom filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a standard Bloom filter. For example, at 16 bits per key, the best num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%). This change fixes and refines that logic (for the format_version=5 Bloom filter only, just in case) based on empirical tests to find accuracy inflection points between each num_probes. Although bits_per_key is now specified as a double, the new Bloom filter converts/rounds this to "millibits / key" for predictable/precise internal computations. Just in case of unforeseen compatibility issues, we round to the nearest whole number bits / key for the legacy Bloom filter, so as not to unlock new behaviors for it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092 Test Plan: unit tests included Differential Revision: D18711313 Pulled By: pdillinger fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a | 26 November 2019, 23:59:34 UTC |
72daa92 | Levi Tamasi | 26 November 2019, 21:16:39 UTC | Refactor blob file creation logic (#6066) Summary: The patch refactors and cleans up the logic around creating new blob files by moving the common code of `SelectBlobFile` and `SelectBlobFileTTL` to a new helper method `CreateBlobFileAndWriter`, bringing the implementation of `SelectBlobFile` and `SelectBlobFileTTL` into sync, and increasing encapsulation by adding new constructors for `BlobFile` and `BlobLogHeader`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6066 Test Plan: Ran `make check` and used the BlobDB mode of `db_bench` to sanity test both the TTL and the non-TTL code paths. Differential Revision: D18646921 Pulled By: ltamasi fbshipit-source-id: e5705a84807932e31dccab4f49b3e64369cea26d | 26 November 2019, 21:28:32 UTC |
771e172 | John Ericson | 26 November 2019, 18:57:29 UTC | Use lowercase for shlwapi.lib rpcrt4.lib (#6076) Summary: This fixes MinGW cross compilation from case-sensitive file systems, at no harm to MinGW builds on Windows. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6076 Differential Revision: D18710554 fbshipit-source-id: a9f299ac3aa019f7dbc07ed0c4a79e19cf99b488 | 26 November 2019, 21:28:32 UTC |
1bf316e | Adam Retter | 26 November 2019, 18:53:23 UTC | Fix naming of library on PPC64LE (#6080) Summary: **NOTE**: This also needs to be back-ported to 6.4.6. Fix a regression introduced in f2bf0b2 by https://github.com/facebook/rocksdb/pull/5674 whereby the compiled library would get the wrong name on PPC64LE platforms. On PPC64LE, the regression caused the library to be named `librocksdbjni-linux64.so` instead of `librocksdbjni-linux-ppc64le.so`. This PR corrects the name back to `librocksdbjni-linux-ppc64le.so` and also corrects the ordering of conditional arguments in the Makefile to match the expected order as defined in the documentation for Make. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6080 Differential Revision: D18710351 fbshipit-source-id: d4db87ef378263b57de7f9edce1b7d15644cf9de | 26 November 2019, 21:28:32 UTC |
7f14519 | Adam Retter | 26 November 2019, 18:52:04 UTC | Small improvements to Docker build for RocksJava (#6079) Summary: * We can reuse downloaded 3rd-party libraries * We can isolate the build to a Docker volume. This is useful for investigating failed builds, as we can examine the volume by assigning it a name during the build. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6079 Differential Revision: D18710263 fbshipit-source-id: 93f456ba44b49e48941c43b0c4d53995ecc1f404 | 26 November 2019, 21:28:31 UTC |
4f17d33 | Peter Dillinger | 26 November 2019, 18:47:25 UTC | Remove unused/undefined ImmutableCFOptions() (#6086) Summary: default constructor not used or even defined Pull Request resolved: https://github.com/facebook/rocksdb/pull/6086 Differential Revision: D18695669 Pulled By: pdillinger fbshipit-source-id: 6b6ac46029f4fb6edf1c11ee6ce1d9f172b2eaf2 | 26 November 2019, 21:28:31 UTC |
382b154 | Adam Retter | 26 November 2019, 18:45:36 UTC | Update 3rd-party libraries used by RocksJava (#6084) Summary: * LZ4 1.8.3 -> 1.9.2 * ZSTD 1.4.0 -> 1.4.4 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6084 Differential Revision: D18710224 fbshipit-source-id: a461ef19a473d3480acdc027f627ec3048730692 | 26 November 2019, 21:28:31 UTC |
77eab5c | sdong | 26 November 2019, 01:11:26 UTC | Make default value of options.ttl to be 30 days when it is supported. (#6073) Summary: By default options.ttl is disabled. We believe a better default is 30 days, which means that, in most cases, deleted data in the database will be removed from SST files slightly after 30 days. Make the default UINT64_MAX - 1 to indicate that it is not overridden by users. Change the default of periodic_compaction_seconds from UINT64_MAX to UINT64_MAX - 1 too, to be consistent. Also fix a small bug in the previous periodic_compaction_seconds default code. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6073 Test Plan: Add unit tests for it. Differential Revision: D18669626 fbshipit-source-id: 957cd4374cafc1557d45a0ba002010552a378cc8 | 26 November 2019, 18:00:32 UTC |
fcd7e03 | Sebastiano Peluso | 25 November 2019, 22:18:10 UTC | Ignore value of BackupableDBOptions::max_valid_backups_to_open when B… (#6072) Summary: This change ignores the value of BackupableDBOptions::max_valid_backups_to_open when a BackupEngine is not read-only. Issue: https://github.com/facebook/rocksdb/issues/4997 Note on tests: I had to remove test case WriteOnlyEngine of BackupableDBTest because it was not consistent with the new semantic of BackupableDBOptions::max_valid_backups_to_open. Maybe, we should think about adding a new interface for append-only BackupEngines. On the other hand, I changed LimitBackupsOpened test case to use a read-only BackupEngine, and I added a new specific test case for the change. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6072 Reviewed By: pdillinger Differential Revision: D18687364 Pulled By: sebastianopeluso fbshipit-source-id: 77bc1f927d623964d59137a93de123bbd719da4e | 26 November 2019, 18:00:31 UTC |
0bc8744 | sdong | 25 November 2019, 20:03:06 UTC | Update HISTORY.md for forward compatibility (#6085) Summary: https://github.com/facebook/rocksdb/pull/6060 broke forward compatibility for releases from 3.10 to 4.2. Update HISTORY.md to mention it. Also remove it from the compatibility tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6085 Differential Revision: D18691694 fbshipit-source-id: 4ef903783dc722b8a4d3e8229abbf0f021a114c9 | 26 November 2019, 18:00:31 UTC |
669ea77 | Sagar Vemuri | 23 November 2019, 06:12:09 UTC | Support ttl in Universal Compaction (#6071) Summary: `options.ttl` is now supported in universal compaction, similar to how periodic compactions are implemented in PR https://github.com/facebook/rocksdb/issues/5970 . Setting `options.ttl` will simply set `options.periodic_compaction_seconds` to execute the periodic compactions code path. Discarded PR https://github.com/facebook/rocksdb/issues/4749 in lieu of this. This is a short term work-around/hack of falling back to periodic compactions when ttl is set. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6071 Test Plan: Added a unit test. Differential Revision: D18668336 Pulled By: sagar0 fbshipit-source-id: e75f5b81ba949f77ef9eff05e44bb1c757f58612 | 23 November 2019, 06:13:35 UTC |
75dfc78 | Levi Tamasi | 23 November 2019, 02:12:35 UTC | Fix the constness issues around autovector::iterator_impl's dereference operators (#6057) Summary: As described in detail in issue https://github.com/facebook/rocksdb/issues/6048, iterators' dereference operators (`*`, `->`, and `[]`) should return `pointer`s/`reference`s (as opposed to `const_pointer`s/`const_reference`s) even if the iterator itself is `const` to be in sync with the standard's iterator concept. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6057 Test Plan: make check Differential Revision: D18623235 Pulled By: ltamasi fbshipit-source-id: 04e82d73bc0c67fb0ded018383af8dfc332050cc | 23 November 2019, 05:23:00 UTC |
d8c28e6 | sdong | 23 November 2019, 00:01:21 UTC | Support options.ttl with options.max_open_files != -1 (#6060) Summary: Previously, options.ttl could not be set with options.max_open_files != -1, because it makes use of the creation_time field in table properties, which is not available unless max_open_files = -1. With this commit, the information is stored in the manifest and, when it is available, is used instead. Note that this change breaks forward compatibility for release 4.2 and older. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6060 Test Plan: Extend existing test case to options.max_open_files != -1, and simulate backward compatibility in one test case by forcing the value to be 0. Differential Revision: D18631623 fbshipit-source-id: 30c232a8672de5432ce9608bb2488ecc19138830 | 23 November 2019, 05:23:00 UTC |
adcf920 | suzanwen | 22 November 2019, 16:18:15 UTC | Compatible changes for cmake (#6045) Summary: `${TESTUTILLIB}` should be linked with targets`${LIBS}`, otherwise it may not find the references. After that, we have to work fine with `${CMAKE_CURRENT_SOURCE_DIR}` in `cmake/modules/ReadVersion.cmake`, while building external projects with `add_subdirectory(/path/to/rocksdb)`. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6045 Differential Revision: D18641791 Pulled By: pdillinger fbshipit-source-id: a56b03b4dda6bae6edce1375324f51340917dddc | 22 November 2019, 16:19:48 UTC |
e50b64b | Little-Wallace | 21 November 2019, 23:22:38 UTC | fix unstable unittest caused by #5958 (#6061) Summary: Signed-off-by: Little-Wallace <bupt2013211450@gmail.com> This PR fixes an unstable unit test added by (https://github.com/facebook/rocksdb/pull/5958). I previously set a SYNC_POINT in PickCompaction. If IntraL0Compaction was triggered, the compaction job that compacts sst files to the base level would start instantly. If the compaction thread runs faster than the unit test's main thread, we may observe the number of files in L0 decrease. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6061 Differential Revision: D18642301 fbshipit-source-id: 3e4da2ee963532b6e142336951ea3f47d46df148 | 21 November 2019, 23:24:01 UTC |
0ce0edb | Yanqin Jin | 21 November 2019, 00:32:17 UTC | Fix a data race between GetColumnFamilyMetaData and MarkFilesBeingCompacted (#6056) Summary: Use db mutex to protect the execution of Version::GetColumnFamilyMetaData() called in DBImpl::GetColumnFamilyMetaData(). Without mutex, GetColumnFamilyMetaData() races with MarkFilesBeingCompacted() for access to FileMetaData::being_compacted. Other than mutex, there are several more alternatives. - Make FileMetaData::being_compacted an atomic variable. This will make FileMetaData non-copy-able. - Separate being_compacted from FileMetaData. This requires re-organizing data structures that are already used in many places. Test Plan (dev server): ``` make check ``` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6056 Differential Revision: D18620488 Pulled By: riversand963 fbshipit-source-id: 87f89660b5d5e2ab4ef7962b7b2a7d00e346aa3b | 21 November 2019, 00:36:29 UTC |
c0983d0 | Cheng Chang | 20 November 2019, 22:17:16 UTC | Add asserts in transaction example (#6055) Summary: The intention of the example for read committed is clearer with these added asserts. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6055 Test Plan: `cd examples && make transaction_example && ./transaction_example` Differential Revision: D18621830 Pulled By: riversand963 fbshipit-source-id: a94b08c5958b589049409ee4fc4d6799e5cbef79 | 20 November 2019, 22:18:51 UTC |
3cd7573 | Stephan T. Lavavej | 20 November 2019, 19:27:05 UTC | Add operator[] to autovector::iterator_impl. (#6047) Summary: This is a required operator for random-access iterators, and an upcoming update for Visual Studio 2019 will change the C++ Standard Library's heap algorithms to use this operator. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6047 Differential Revision: D18618531 Pulled By: ltamasi fbshipit-source-id: 08d10bc85bf2dbc3f7ef0fa3c777e99f1e927ef5 | 20 November 2019, 19:28:41 UTC |
27ec3b3 | sdong | 20 November 2019, 18:35:56 UTC | Sanitize input in DB::MultiGet() API (#6054) Summary: The new DB::MultiGet() doesn't validate input for num_keys > 1 and GCC-9 complains about it. Fix it by returning directly when num_keys == 0. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6054 Test Plan: Build with GCC-9 and see it passes. Differential Revision: D18608958 fbshipit-source-id: 1c279aff3c7fe6e9d5a6d085ed02550ecea4fdb2 | 20 November 2019, 18:38:01 UTC |
0306e01 | Peter Dillinger | 19 November 2019, 23:41:56 UTC | Fixes for g++ 4.9.2 compatibility (#6053) Summary: Taken from merryChris in https://github.com/facebook/rocksdb/issues/6043 Stackoverflow ref on {{}} vs. {}: https://stackoverflow.com/questions/26947704/implicit-conversion-failure-from-initializer-list Note to reader: .clear() does not empty out an ostringstream, but .str("") suffices because we don't have to worry about clearing error flags. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6053 Test Plan: make check, manual run of filter_bench Differential Revision: D18602259 Pulled By: pdillinger fbshipit-source-id: f6190f83b8eab4e80e7c107348839edabe727841 | 19 November 2019, 23:43:37 UTC |
ec3e3c3 | Little-Wallace | 19 November 2019, 23:07:49 UTC | Fix corruption with intra-L0 on ingested files (#5958) Summary: ## Problem Description Our process aborted when it called `CheckConsistency`, and the information in `stderr` showed "`L0 files seqno 3001491972 3004797440 vs. 3002875611 3004524421` ". Here are the causes of the accident that I investigated. * RocksDB calls `CheckConsistency` whenever the `MANIFEST` file is updated. It checks the sequence number interval of every file, except files which were ingested. * When a file is ingested into RocksDB, it is assigned the global sequence number, so the minimum and maximum seqno of this file are equal, both being the global sequence number. * `CheckConsistency` determines whether a file was ingested by checking whether the smallest and largest seqno of the sstable file are equal. * If IntraL0Compaction picks an sst which was just ingested and compacts it into another sst, the `smallest_seqno` of this new file will be smaller than its `largest_seqno`. * If more than one file was ingested before the memtable scheduled a flush, and they were all compacted into one new sstable file by `IntraL0Compaction`, the sequence interval of this new file will be included in the interval of the memtable, so `CheckConsistency` will return a `Corruption`. * If an sstable was ingested after the memtable was scheduled to flush, it would be assigned a larger seqno than the memtable. Then the file was compacted with other files (all flushed before the memtable) in L0 into one file. This compaction started before the flush job of the memtable started, but completed after the flush job finished. So the new file produced by the compaction (we call it s1) would have a larger sequence number interval than the file produced by the flush (we call it s2). 
**But some data in s1 was still written into RocksDB before s2, so it's possible that some data in s2 was covered by old data in s1.** Of course, it would also cause a `Corruption` because of the seqno overlap. This is the relationship between the files: > s1.smallest_seqno < s2.smallest_seqno < s2.largest_seqno < s1.largest_seqno So I skip picking sst files which were ingested in the function `FindIntraL0Compaction` ## Reason Here is my bug report: https://github.com/facebook/rocksdb/issues/5913 There are two situations that can cause the check to fail. ### First situation: - First we ingest five external sst files into RocksDB, and they happen to be ingested in L0. There was already some data in the memtable, which makes the smallest sequence number of the memtable less than that of the sst files we ingest. - If there was a compaction job which compacted sst files from L0 to L1, `LevelCompactionPicker` would trigger an `IntraL0Compaction` which would compact these five sst files from L0 to L0. We call this sst A, which was merged from the five ingested sst files. - Then some data was put into the memtable, and the memtable was flushed to L0. We call this sst B. - RocksDB checks consistency, finds that the `smallest_seqno` of B is less than that of A, and crashes. Because A was merged from five sst files, its smallest sequence number was less than its largest sequence number, so RocksDB could not tell whether A was produced by ingestion. ### Second situation - First we have flushed many sst files in L0; we call them [s1, s2, s3]. - An immutable memtable requests to be flushed, but because the flush thread is busy, it has not been picked. We call it m1. At that moment, one sst is ingested into L0. We call it s4. Because s4 is ingested after m1 became an immutable memtable, it has a larger log sequence number than m1. - m1 is flushed to L0. Because it is small, this flush job finishes quickly. We call the result s5. - [s1, s2, s3, s4] are compacted into one sst in L0 by IntraL0Compaction. We call it s6. 
- compacted 4@0 files to L0 - When s6 is added into the manifest, the corruption happens, because the largest sequence number of s6 is equal to that of s4, and both are larger than that of s5. But because s1 is older than m1, the smallest sequence number of s6 is smaller than that of s5. - s6.smallest_seqno < s5.smallest_seqno < s5.largest_seqno < s6.largest_seqno Pull Request resolved: https://github.com/facebook/rocksdb/pull/5958 Differential Revision: D18601316 fbshipit-source-id: 5fe54b3c9af52a2e1400728f565e895cde1c7267 | 19 November 2019, 23:09:11 UTC |
019eb1f | Levi Tamasi | 19 November 2019, 23:00:47 UTC | Disable blob iterator test with max_sequential_skip_in_iterations==0 in LITE mode (#6052) Summary: The SetOptions API used by the test is not supported in LITE mode, so we should skip the new chunk in this case. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6052 Test Plan: Ran the unit tests both in regular and LITE mode. Differential Revision: D18601763 Pulled By: ltamasi fbshipit-source-id: 883d6882771e0fb4aae72bb77ba4e63d9febec04 | 19 November 2019, 23:02:41 UTC |
4e0dcd3 | sdong | 19 November 2019, 21:15:40 UTC | db_stress sometimes generates keys close to SST file boundaries (#6037) Summary: Recently, a bug was found related to a seek key that is close to an SST file boundary. However, it occurs only rarely in db_stress, because the chance that a random key hits an SST file boundary is small. To boost the chance, with probability 1/16 we pick keys that are close to SST file boundaries. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6037 Test Plan: Did some manual printing and hacking to check that the key generation logic is correct. Differential Revision: D18598476 fbshipit-source-id: 13b76687d106c5be4e3e02a0c77fa5578105a071 | 19 November 2019, 21:17:03 UTC |
20b48c6 | tabokie | 19 November 2019, 19:37:24 UTC | Fix blob context when db_iter uses seek (#6051) Summary: Fix: when `db_iter` falls back to using seek by `FindValueForCurrentKeyUsingSeek`, `is_blob_` flag is not properly set on encountering BlobIndex. Also patch existing test for the mentioned code path. Signed-off-by: tabokie <xy.tao@outlook.com> Pull Request resolved: https://github.com/facebook/rocksdb/pull/6051 Differential Revision: D18596274 Pulled By: ltamasi fbshipit-source-id: 8e4714af263b99dc2c379707d50db88fe6799278 | 19 November 2019, 19:39:02 UTC |
38cc611 | anand76 | 19 November 2019, 18:11:56 UTC | Fix test failure in LITE mode (#6050) Summary: GetSupportedCompressions() is not available in LITE build, so check and use Snappy compression in db_basic_test.cc. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6050 Test Plan: make LITE=1 check make check Differential Revision: D18588114 Pulled By: anand1976 fbshipit-source-id: a193de58c44f91bcc237107f25dbc1b9458eef3d | 19 November 2019, 18:13:24 UTC |
ac498cd | Peter Dillinger | 19 November 2019, 16:18:57 UTC | Remove a few unnecessary includes Summary: Pull Request resolved: https://github.com/facebook/rocksdb/pull/6046 Test Plan: make check, manual inspection Differential Revision: D18573044 Pulled By: pdillinger fbshipit-source-id: 7a5999fc08d798ce3157b56d4b36d24027409fc3 | 19 November 2019, 16:20:42 UTC |
279c488 | Levi Tamasi | 19 November 2019, 00:28:04 UTC | Mark blob files not needed by any memtables/SSTs obsolete (#6032) Summary: The patch adds logic to mark no longer needed blob files obsolete upon database open and whenever a flush or compaction completes. Unneeded blob files are detected by iterating through live immutable non-TTL blob files starting from the lowest-numbered one, and stopping when a blob file used by any SSTs or potentially used by memtables is found. (The latter is determined by comparing the sequence number at which the blob file became immutable with the largest sequence number received in flush notifications.) In addition, the patch cleans up the logic around closing and obsoleting blob files and enforces invariants around this area (blob files are now guaranteed to go through the stages mutable-non-obsolete, immutable-non-obsolete, and immutable-obsolete in this order). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6032 Test Plan: Extended unit tests and tested using the BlobDB mode of `db_bench`. Differential Revision: D18495610 Pulled By: ltamasi fbshipit-source-id: 11825b84af74f3f4abfd9bcae04e80870ae58961 | 19 November 2019, 00:30:06 UTC |
a150604 | sdong | 18 November 2019, 23:00:23 UTC | db_stress to cover total order seek (#6039) Summary: Right now, in db_stress, as long as a prefix extractor is defined, TestIterator always uses it. There is value in covering total_order_seek = true when a prefix extractor is defined. Add a small chance that this flag is turned on. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6039 Test Plan: Run the test for a while. Differential Revision: D18539689 fbshipit-source-id: 568790dd7789c9986b83764b870df0423a122d99 | 18 November 2019, 23:01:38 UTC |
5b9233b | anand76 | 18 November 2019, 17:35:37 UTC | Fix a test failure on systems that don't have Snappy compression libraries (#6038) Summary: The ParallelIO/DBBasicTestWithParallelIO.MultiGet/11 test fails if Snappy compression library is not installed, since RocksDB defaults to Snappy if none is specified. So dynamically determine the supported compression types and pick the first one. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6038 Differential Revision: D18532370 Pulled By: anand1976 fbshipit-source-id: a0a735114d1f8892ea09f7c4af8688d7bcc5b075 | 18 November 2019, 17:37:18 UTC |
f65ec09 | Little-Wallace | 15 November 2019, 21:59:03 UTC | Fix IngestExternalFile's bug with two_write_queue (#5976) Summary: When two_write_queue is enabled, IngestExternalFile performs EnterUnbatched on both write queues. SwitchMemtable also performs EnterUnbatched on the 2nd write queue when this option is enabled. When the call stack includes IngestExternalFile -> FlushMemTable -> SwitchMemtable, this results in a deadlock. The implemented solution is to pass on the existing writes_stopped argument in FlushMemTable to skip EnterUnbatched in SwitchMemtable. Fixes https://github.com/facebook/rocksdb/issues/5974 Pull Request resolved: https://github.com/facebook/rocksdb/pull/5976 Differential Revision: D18535943 Pulled By: maysamyabandeh fbshipit-source-id: a4f9d4964c10d4a7ca06b1e0102ca2ec395512bc | 15 November 2019, 22:00:37 UTC |
0058dae | Maysam Yabandeh | 14 November 2019, 22:39:48 UTC | Disable SmallestUnCommittedSeq in Valgrind run (#6035) Summary: SmallestUnCommittedSeq sometimes takes too long when run under Valgrind. The patch disables it when the tests are run under Valgrind. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6035 Differential Revision: D18509198 Pulled By: maysamyabandeh fbshipit-source-id: 1191443b9fedb6b9c50d6b76f5c92371f5030230 | 14 November 2019, 22:41:52 UTC |
00d58a3 | Peter Dillinger | 14 November 2019, 22:00:58 UTC | Abandon use of folly::Optional (#6036) Summary: Had complications with LITE build and valgrind test. Reverts/fixes small parts of PR https://github.com/facebook/rocksdb/issues/6007 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6036 Test Plan: make LITE=1 all check and ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 make -j24 db_bloom_filter_test && ROCKSDB_VALGRIND_RUN=1 DISABLE_JEMALLOC=1 ./db_bloom_filter_test Differential Revision: D18512238 Pulled By: pdillinger fbshipit-source-id: 37213cf0d309edf11c483fb4b2fb6c02c2cf2b28 | 14 November 2019, 22:04:15 UTC |
6123611 | sdong | 14 November 2019, 21:59:43 UTC | crash_test: use large max_manifest_file_size most of the time. (#6034) Summary: Right now, crash_test always uses 16KB max_manifest_file_size value. It is good to cover logic of manifest file switch. However, information stored in manifest files might be useful in debugging failures. Switch to only use small manifest file size in 1/15 of the time. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6034 Test Plan: Observe command generated by db_crash_test.py multiple times and see the --max_manifest_file_size value distribution. Differential Revision: D18513824 fbshipit-source-id: 7b3ae6dbe521a0918df41064e3fa5ecbf2466e04 | 14 November 2019, 22:01:06 UTC |
e8e7fb1 | Peter Dillinger | 14 November 2019, 14:18:23 UTC | More fixes to auto-GarbageCollect in BackupEngine (#6023) Summary: Production: * Fixes GarbageCollect (and auto-GC triggered by PurgeOldBackups, DeleteBackup, or CreateNewBackup) to clean up backup directory independent of current settings (except max_valid_backups_to_open; see issue https://github.com/facebook/rocksdb/issues/4997) and prior settings used with same backup directory. * Fixes GarbageCollect (and auto-GC) not to attempt to remove "." and ".." entries from directories. * Clarifies contract with users in modifying BackupEngine operations. In short, leftovers from any incomplete operation are cleaned up by any subsequent call to that same kind of operation (PurgeOldBackups and DeleteBackup considered the same kind of operation). GarbageCollect is available to clean up after all kinds. (NB: right now PurgeOldBackups and DeleteBackup will clean up after incomplete CreateNewBackup, but we aren't promising to continue that behavior.) Pull Request resolved: https://github.com/facebook/rocksdb/pull/6023 Test Plan: * Refactors open parameters to use an option enum, for readability, etc. (Also fixes an unused parameter bug in the redundant OpenDBAndBackupEngineShareWithChecksum.) * Fixes an apparent bug in ShareTableFilesWithChecksumsTransition in which old backup data was destroyed in the transition to be tested. That test is now augmented to ensure GarbageCollect (or auto-GC) does not remove shared files when BackupEngine is opened with share_table_files=false. * Augments DeleteTmpFiles test to ensure that CreateNewBackup does auto-GC when an incompletely created backup is detected. Differential Revision: D18453559 Pulled By: pdillinger fbshipit-source-id: 5e54e7b08d711b161bc9c656181012b69a8feac4 | 14 November 2019, 14:20:18 UTC |
f059c7d | Peter Dillinger | 14 November 2019, 00:31:26 UTC | New Bloom filter implementation for full and partitioned filters (#6007) Summary: Adds an improved, replacement Bloom filter implementation (FastLocalBloom) for full and partitioned filters in the block-based table. This replacement is faster and more accurate, especially for high bits per key or millions of keys in a single filter. Speed The improved speed, at least on recent x86_64, comes from * Using fastrange instead of modulo (%) * Using our new hash function (XXH3 preview, added in a previous commit), which is much faster for large keys and only *slightly* slower on keys around 12 bytes if hashing the same size many thousands of times in a row. * Optimizing the Bloom filter queries with AVX2 SIMD operations. (Added AVX2 to the USE_SSE=1 build.) Careful design was required to support (a) SIMD-optimized queries, (b) compatible non-SIMD code that's simple and efficient, (c) flexible choice of number of probes, and (d) essentially maximized accuracy for a cache-local Bloom filter. Probes are made eight at a time, so any number of probes up to 8 is the same speed, then up to 16, etc. * Prefetching cache lines when building the filter. Although this optimization could be applied to the old structure as well, it seems to balance out the small added cost of accumulating 64 bit hashes for adding to the filter rather than 32 bit hashes. Here's nominal speed data from filter_bench (200MB in filters, about 10k keys each, 10 bits filter data / key, 6 probes, avg key size 24 bytes, includes hashing time) on Skylake DE (relatively low clock speed): $ ./filter_bench -quick -impl=2 -net_includes_hashing # New Bloom filter Build avg ns/key: 47.7135 Mixed inside/outside queries... Single filter net ns/op: 26.2825 Random filter net ns/op: 150.459 Average FP rate %: 0.954651 $ ./filter_bench -quick -impl=0 -net_includes_hashing # Old Bloom filter Build avg ns/key: 47.2245 Mixed inside/outside queries... 
Single filter net ns/op: 63.2978 Random filter net ns/op: 188.038 Average FP rate %: 1.13823 Similar build time but dramatically faster query times on hot data (63 ns to 26 ns), and somewhat faster on stale data (188 ns to 150 ns). Performance differences on batched and skewed query loads are between these extremes as expected. The only other interesting thing about speed is "inside" (query key was added to filter) vs. "outside" (query key was not added to filter) query times. The non-SIMD implementations are substantially slower when most queries are "outside" vs. "inside". This goes against what one might expect or would have observed years ago, as "outside" queries only need about two probes on average, due to short-circuiting, while "inside" always have num_probes (say 6). The problem is probably the nastily unpredictable branch. The SIMD implementation has few branches (very predictable) and has pretty consistent running time regardless of query outcome. Accuracy The generally improved accuracy (re: Issue https://github.com/facebook/rocksdb/issues/5857) comes from a better design for probing indices within a cache line (re: Issue https://github.com/facebook/rocksdb/issues/4120) and improved accuracy for millions of keys in a single filter from using a 64-bit hash function (XXH3p). Design details in code comments. Accuracy data (generalizes, except old impl gets worse with millions of keys): Memory bits per key: FP rate percent old impl -> FP rate percent new impl 6: 5.70953 -> 5.69888 8: 2.45766 -> 2.29709 10: 1.13977 -> 0.959254 12: 0.662498 -> 0.411593 16: 0.353023 -> 0.0873754 24: 0.261552 -> 0.0060971 50: 0.225453 -> ~0.00003 (less than 1 in a million queries are FP) Fixes https://github.com/facebook/rocksdb/issues/5857 Fixes https://github.com/facebook/rocksdb/issues/4120 Unlike the old implementation, this implementation has a fixed cache line size (64 bytes). 
At 10 bits per key, the accuracy of this new implementation is very close to the old implementation with 128-byte cache line size. If there's sufficient demand, this implementation could be generalized. Compatibility Although old releases would see the new structure as corrupt filter data and read the table as if there's no filter, we've decided only to enable the new Bloom filter with new format_version=5. This provides a smooth path for automatic adoption over time, with an option for early opt-in. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6007 Test Plan: filter_bench has been used thoroughly to validate speed, accuracy, and correctness. Unit tests have been carefully updated to exercise new and old implementations, as well as the logic to select an implementation based on context (format_version). Differential Revision: D18294749 Pulled By: pdillinger fbshipit-source-id: d44c9db3696e4d0a17caaec47075b7755c262c5f | 14 November 2019, 00:44:01 UTC |
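The "fastrange instead of modulo (%)" point in the commit above can be illustrated with a minimal sketch. It assumes a 32-bit hash; the function name `FastRange32` is illustrative and is not taken from the RocksDB source.

```cpp
#include <cstdint>

// Illustrative helper (hypothetical name): maps a 32-bit hash onto
// [0, range) using a multiply and a shift instead of a modulo.
// The 64-bit product hash * range is always < range * 2^32, so the
// upper 32 bits are always < range.
inline uint32_t FastRange32(uint32_t hash, uint32_t range) {
  uint64_t product = static_cast<uint64_t>(hash) * range;
  return static_cast<uint32_t>(product >> 32);
}
```

Unlike `hash % range`, this avoids an integer division, which is typically much slower than a multiplication on current CPUs. The two mappings distribute hashes differently, so they are not interchangeable within one on-disk format.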
f382f44 | Fatih Şentürk | 13 November 2019, 19:00:57 UTC | fix typo (#6025) Summary: fix a typo at java readme page Pull Request resolved: https://github.com/facebook/rocksdb/pull/6025 Differential Revision: D18481232 fbshipit-source-id: 1c70c2435bcd4b02f25e28cd7e35c42273e07be0 | 13 November 2019, 19:02:28 UTC |
bb23bfe | sdong | 13 November 2019, 18:10:09 UTC | Fix a regression bug on total order seek with prefix enabled and range delete (#6028) Summary: Recent change https://github.com/facebook/rocksdb/pull/5861 mistakenly uses "prefix_extractor_ != nullptr" as the condition to determine whether the prefix bloom filter is used. It fails to consider read_options.total_order_seek, so it is wrong. The result is that an optimization for non-total-order seek is mistakenly applied to total order seek, which introduces a bug in the following corner case: Because of RangeDelete(), a file's largest key is extended. The seek key falls into the range-deleted file, so the level iterator seeks into the previous file without getting any key. The correct behavior is to place the iterator at the first key of the next file. However, an optimization is triggered and invalidates the iterator because it is out of the prefix range, causing wrong results. This behavior is reproduced in the unit test added. Fix the bug by setting prefix_extractor to be null if total order seek is used. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6028 Test Plan: Add a unit test which fails without the fix. Differential Revision: D18479063 fbshipit-source-id: ac075f013029fcf69eb3a598f14c98cce3e810b3 | 13 November 2019, 18:11:34 UTC |
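The fix described in the commit above (treating the prefix extractor as null when total order seek is requested) can be sketched as follows. The type `PrefixExtractor` and the function `EffectivePrefixExtractor` are hypothetical stand-ins for illustration, not the names used in the actual patch.

```cpp
// Hypothetical stand-in for a configured prefix extractor.
struct PrefixExtractor {};

// Returns the extractor that prefix-based optimizations should see.
// With total order seek, no extractor is exposed, so a check such as
// "extractor != nullptr" can no longer enable prefix-only shortcuts.
inline const PrefixExtractor* EffectivePrefixExtractor(
    const PrefixExtractor* configured, bool total_order_seek) {
  return total_order_seek ? nullptr : configured;
}
```

Deciding this once, where the iterator is set up, means later checks cannot forget to consult `total_order_seek`.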
42b5494 | Peter Dillinger | 12 November 2019, 23:27:19 UTC | Fix BloomFilterPolicy changes for unsigned char (ARM) (#6024) Summary: Bug in PR https://github.com/facebook/rocksdb/issues/5941 when char is unsigned that should only affect assertion on unused/invalid filter metadata. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6024 Test Plan: on ARM: ./bloom_test && ./db_bloom_filter_test && ./block_based_filter_block_test && ./full_filter_block_test && ./partitioned_filter_block_test Differential Revision: D18461206 Pulled By: pdillinger fbshipit-source-id: 68a7c813a0b5791c05265edc03cdf52c78880e9a | 12 November 2019, 23:29:15 UTC |
6c7b1a0 | anand76 | 12 November 2019, 21:51:18 UTC | Batched MultiGet API for multiple column families (#5816) Summary: Add a new API that allows a user to call MultiGet specifying multiple keys belonging to different column families. This is mainly useful for users who want to do a consistent read of keys across column families, with the added performance benefits of batching and returning values using PinnableSlice. As part of this change, the code in the original multi-column family MultiGet for acquiring the super versions has been refactored into a separate function that can be used by both, the batching and the non-batching versions of MultiGet. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5816 Test Plan: make check make asan_check asan_crash_test Differential Revision: D18408676 Pulled By: anand1976 fbshipit-source-id: 933e7bec91dd70e7b633be4ff623a1116cc28c8d | 12 November 2019, 21:52:55 UTC |
a19de78 | sdong | 12 November 2019, 01:32:17 UTC | db_stress to cover SeekForPrev() (#6022) Summary: Right now, db_stress doesn't cover SeekForPrev(). Add the coverage, which mirrors what we do for Seek(). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6022 Test Plan: Run "make crash_test". Manually hack the source code to simulate wrong iterator results and confirm they are caught. Differential Revision: D18442193 fbshipit-source-id: 879b79000d5e33c625c7e970636de191ccd7776c | 12 November 2019, 01:33:54 UTC |
03ce7fb | anand76 | 12 November 2019, 00:57:49 UTC | Fix a buffer overrun problem in BlockBasedTable::MultiGet (#6014) Summary: The calculation in BlockBasedTable::MultiGet for the required buffer length for reading in compressed blocks is incorrect. It needs to take the 5-byte block trailer into account. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6014 Test Plan: Add a unit test DBBasicTest.MultiGetBufferOverrun that fails in asan_check before the fix, and passes after. Differential Revision: D18412753 Pulled By: anand1976 fbshipit-source-id: 754dfb66be1d5f161a7efdf87be872198c7e3b72 | 12 November 2019, 00:59:15 UTC |
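The buffer-length issue in the commit above can be sketched as below. This is a simplified illustration that assumes every block read into the shared buffer carries the 5-byte trailer mentioned in the commit message; the names `kBlockTrailerSize` and `RequiredBufferLength` are chosen for this example.

```cpp
#include <cstddef>
#include <vector>

// 5-byte block trailer, as stated in the commit message.
constexpr std::size_t kBlockTrailerSize = 5;

// Illustrative helper (hypothetical name): total bytes needed to read
// all blocks into one buffer. Summing only the block sizes would leave
// the buffer short by kBlockTrailerSize bytes per block.
inline std::size_t RequiredBufferLength(
    const std::vector<std::size_t>& block_sizes) {
  std::size_t total = 0;
  for (std::size_t size : block_sizes) {
    total += size + kBlockTrailerSize;
  }
  return total;
}
```

For two blocks of 100 and 200 bytes, the required length is 310 bytes, not 300.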
f29e6b3 | 蔡渠棠 | 11 November 2019, 23:56:07 UTC | bugfix: MemTableList::RemoveOldMemTables invalid iterator after remov… (#6013) Summary: Fix issue https://github.com/facebook/rocksdb/issues/6012. I found that it may be caused by the following codes in function _RemoveOldMemTables()_ in **db/memtable_list.cc** : ``` for (auto it = memlist.rbegin(); it != memlist.rend(); ++it) { MemTable* mem = *it; if (mem->GetNextLogNumber() > log_number) { break; } current_->Remove(mem, to_delete); ``` The iterator **it** turns invalid after `current_->Remove(mem, to_delete);` Pull Request resolved: https://github.com/facebook/rocksdb/pull/6013 Test Plan: ``` make check ``` Differential Revision: D18401107 Pulled By: riversand963 fbshipit-source-id: bf0da3b868ed70f7aff24cf7b3e2049c0c5c7a4e | 11 November 2019, 23:57:38 UTC |
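The iterator invalidation described in the commit above arises whenever an element is erased from a container while a loop still holds an iterator to it. One common way to avoid it, shown here on a plain `std::list<int>` and not on the real `MemTableList` types, is to re-read the back of the container on every step so no iterator is held across a removal. The function name is hypothetical, and this is not claimed to be the approach taken in the actual patch.

```cpp
#include <list>
#include <vector>

// Illustrative sketch: walks from the back (oldest element), removing
// every element <= limit and stopping at the first larger one.
// Returns the removed elements in removal order.
inline std::vector<int> RemoveOldFromBack(std::list<int>& items, int limit) {
  std::vector<int> removed;
  while (!items.empty()) {
    int value = items.back();  // re-read after each removal
    if (value > limit) {
      break;
    }
    removed.push_back(value);
    items.pop_back();  // no live iterator exists to be invalidated
  }
  return removed;
}
```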
c17384f | Sagar Vemuri | 11 November 2019, 22:07:36 UTC | Cascade TTL Compactions to move expired key ranges to bottom levels faster (#5992) Summary: When users use Level-Compaction-with-TTL by setting `cf_options.ttl`, the ttl-expired data could take n*ttl time to reach the bottom level (where n is the number of levels) due to how the `creation_time` table property was calculated for the newly created files during compaction. The creation time of new files was set to a max of all compaction-input-files-creation-times which essentially resulted in resetting the ttl as the key range moves across levels. This behavior is now fixed by changing the `creation_time` to be based on minimum of all compaction-input-files-creation-times; this will cause cascading compactions across levels for the ttl-expired data to move to the bottom level, resulting in getting rid of tombstones/deleted-data faster. This will help start cascading compactions to move the expired key range to the bottom-most level faster. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5992 Test Plan: `make check` Differential Revision: D18257883 Pulled By: sagar0 fbshipit-source-id: 00df0bb8d0b7e14d9fc239df2cba8559f3e54cbc | 11 November 2019, 22:09:01 UTC |
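The change from maximum to minimum of the input files' creation times can be sketched as follows. The function name and the use of plain `uint64_t` timestamps are assumptions made for this example; the real code works on table properties.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Illustrative helper (hypothetical name): creation time to assign to a
// compaction output file. Taking the minimum keeps the oldest input's
// age, so the TTL clock is not reset as data moves down a level.
// Returns 0 when there are no inputs.
inline uint64_t OutputCreationTime(
    const std::vector<uint64_t>& input_creation_times) {
  if (input_creation_times.empty()) {
    return 0;
  }
  return *std::min_element(input_creation_times.begin(),
                           input_creation_times.end());
}
```

With inputs created at times 300, 100, and 200, the output file gets 100. Under the old maximum-based rule it would get 300 and would need to wait another full TTL before becoming eligible again.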
8e7aa62 | Levi Tamasi | 11 November 2019, 22:00:25 UTC | BlobDB: Maintain mapping between blob files and SSTs (#6020) Summary: The patch adds logic to BlobDB to maintain the mapping between blob files and SSTs for which the blob file in question is the oldest blob file referenced by the SST file. The mapping is initialized during database open based on the information retrieved using `GetLiveFilesMetaData`, and updated after flushes/compactions based on the information received through the `EventListener` interface (or, in the case of manual compactions issued through the `CompactFiles` API, the `CompactionJobInfo` object). Pull Request resolved: https://github.com/facebook/rocksdb/pull/6020 Test Plan: Added a unit test; also tested using the BlobDB mode of `db_bench`. Differential Revision: D18410508 Pulled By: ltamasi fbshipit-source-id: dd9e778af781cfdb0d7056298c54ba9cebdd54a5 | 11 November 2019, 22:01:34 UTC |
aa63abf | Peter Dillinger | 09 November 2019, 03:13:41 UTC | Auto-GarbageCollect on PurgeOldBackups and DeleteBackup (#6015) Summary: Only if there is a crash, power failure, or I/O error in DeleteBackup, shared or private files from the backup might be left behind that are not cleaned up by PurgeOldBackups or DeleteBackup-- only by GarbageCollect. This makes the BackupEngine API "leaky by default." Even if it means a modest performance hit, I think we should make Delete and Purge do as they say, with ongoing best effort: i.e. future calls will attempt to finish any incomplete work from earlier calls. This change does that by having DeleteBackup and PurgeOldBackups do a GarbageCollect, unless (to minimize performance hit) this BackupEngine has already done a GarbageCollect and there have been no deletion-related I/O errors in that GarbageCollect or since then. Rejected alternative 1: remove meta file last instead of first. This would in theory turn partially deleted backups into corrupted backups, but code changes would be needed to allow the missing files and consider it acceptably corrupt, rather than failing to open the BackupEngine. This might be a reasonable choice, but I mostly rejected it because it doesn't solve the legacy problem of cleaning up existing lingering files. Rejected alternative 2: use a deletion marker file. If deletion started with creating a file that marks a backup as flagged for deletion, then we could reliably detect partially deleted backups and efficiently finish removing them. In addition to not solving the legacy problem, this could be precarious if there's a disk full situation, and we try to create a new file in order to delete some files. Ugh. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6015 Test Plan: Updated unit tests Differential Revision: D18401333 Pulled By: pdillinger fbshipit-source-id: 12944e372ce6809f3f5a4c416c3b321a8927d925 | 09 November 2019, 03:15:35 UTC |
72de842 | Yi Wu | 08 November 2019, 21:45:31 UTC | Fix DBFlushTest::FireOnFlushCompletedAfterCommittedResult hang (#6018) Summary: The test fires two flushes to let them run in parallel. Previously it waited for the first job to be scheduled before firing the second. It is possible that the first job has not started before the second job is scheduled, causing the two jobs to be combined into one. Change the test to wait for the first job to be started. Fixes https://github.com/facebook/rocksdb/issues/6017 Pull Request resolved: https://github.com/facebook/rocksdb/pull/6018 Test Plan: ``` while ./db_flush_test --gtest_filter=*FireOnFlushCompletedAfterCommittedResult*; do :; done ``` and let it run for a while. Signed-off-by: Yi Wu <yiwu@pingcap.com> Differential Revision: D18405576 Pulled By: riversand963 fbshipit-source-id: 6ebb6262e033d5dc2ef81cb3eb410b314f2de4c9 | 08 November 2019, 21:47:29 UTC |
f80050f | Levi Tamasi | 07 November 2019, 22:02:16 UTC | Add file number/oldest referenced blob file number to {Sst,Live}FileMetaData (#6011) Summary: The patch exposes the file numbers of the SSTs as well as the oldest blob files they contain a reference to through the GetColumnFamilyMetaData/ GetLiveFilesMetaData interface. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6011 Test Plan: Fixed and extended the existing unit tests. (The earlier ColumnFamilyMetaDataTest wasn't really testing anything because the generated memtables were never flushed, so the metadata structure was essentially empty.) Differential Revision: D18361697 Pulled By: ltamasi fbshipit-source-id: d5ed1d94ac70858b84393c48711441ddfe1251e9 | 07 November 2019, 22:04:16 UTC |
07a0ad3 | Yun Tang | 07 November 2019, 20:49:39 UTC | Download bzip2 packages from sourceforge (#5995) Summary: From bzip2's official [download page](http://www.bzip.org/downloads.html), we could download it from sourceforge. This source would be more credible than previous web archive. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5995 Differential Revision: D18377662 fbshipit-source-id: e8353f83d5d6ea6067f78208b7bfb7f0d5b49c05 | 07 November 2019, 20:51:06 UTC |
9836a1f | anand76 | 07 November 2019, 20:00:45 UTC | Fix MultiGet crash when no_block_cache is set (#5991) Summary: This PR fixes https://github.com/facebook/rocksdb/issues/5975. In ```BlockBasedTable::RetrieveMultipleBlocks()```, we were calling ```MaybeReadBlocksAndLoadToCache()```, which is a no-op if neither uncompressed nor compressed block cache are configured. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5991 Test Plan: 1. Add unit tests that fail with the old code and pass with the new 2. make check and asan_check Cc spetrunia Differential Revision: D18272744 Pulled By: anand1976 fbshipit-source-id: e62fa6090d1a6adf84fcd51dfd6859b03c6aebfe | 07 November 2019, 20:02:21 UTC |
1da1f04 | sdong | 07 November 2019, 19:13:36 UTC | Stress test to relax the iterator verification case for lower bound (#5869) Summary: In the stress test, all iterator verification is turned off if lower bound is enabled. This might be stricter than needed. This PR relaxes the condition and includes the case where the lower bound is lower than both the seek key and the upper bound. It seems to work mostly fine when I run the crash test locally. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5869 Test Plan: Run crash_test Differential Revision: D18363578 fbshipit-source-id: 23d57e11ea507949b8100f4190ddfbe8db052d5a | 07 November 2019, 19:16:59 UTC |
982a753 | sdong | 07 November 2019, 19:12:50 UTC | Add two test cases for single sorted universal periodic compaction (#6002) Summary: It's useful to add test coverage for universal compaction's periodic compaction. Add two tests. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6002 Test Plan: Run the two tests Differential Revision: D18363544 fbshipit-source-id: bbd04b54057315f64f959709006412db1f76d170 | 07 November 2019, 19:14:14 UTC |
f0b469e | sdong | 07 November 2019, 18:56:25 UTC | Turn on periodic compaction in universal by default if compaction filter is used. (#5994) Summary: Recently, periodic compaction got turned on by default for leveled compaction if a compaction filter is used. Since periodic compaction is now supported in universal compaction too, we apply the same default for universal now. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5994 Test Plan: Add a new unit test. Differential Revision: D18363744 fbshipit-source-id: 5093288ce990ee3cab0e44ffd92d8489fbcd6a48 | 07 November 2019, 18:58:10 UTC |
7b3222e | Peter Dillinger | 07 November 2019, 17:49:41 UTC | Partial rebalance of TEST_GROUPs for Travis (#6010) Summary: TEST_GROUP=1 has sometimes been timing out but generally taking 45-50 minutes vs. 20-25 for groups 2-4. Beyond the compilation time, tests in group 1 consist of about 19 minutes of db_test, and 7 minutes of everything else. This change moves most of that "everything else" to group 2. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6010 Test Plan: Travis for this PR, oncall watch Travis Differential Revision: D18373536 Pulled By: pdillinger fbshipit-source-id: 0b3af004c71e4fd6bc01a94dac34cc3079fc9ce1 | 07 November 2019, 17:50:59 UTC |
111ebf3 | sdong | 07 November 2019, 01:37:07 UTC | db_stress: improve TestGet() failure printing (#5989) Summary: Right now, in the TestGet case of db_stress's CF consistency test, if a failure happens, we do normal string printing rather than hex printing, so some text is not printed out, which makes debugging harder. Fix it by printing hex instead. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5989 Test Plan: Build db_stress and see it passes. Differential Revision: D18363552 fbshipit-source-id: 09d1b8f6fbff37441cbe7e63a1aef27551226cec | 07 November 2019, 01:38:25 UTC |
8ea087a | Zhichao Cao | 06 November 2019, 20:50:33 UTC | Workload generator (Mixgraph) based on prefix hotness (#5953) Summary: In the previous PR https://github.com/facebook/rocksdb/issues/4788, the user can use the db_bench mix_graph option to generate a workload modeled on the social graph. The key is generated based on the key access hotness. In this PR, the user can further model the key-range hotness and fit it to a two-term-exponential distribution. First, the user cuts the whole key space into small key ranges (e.g., key-ranges are the same size and the key-range number is the number of SST files). Then, the user calculates the average access count per key of each key-range as the key-range hotness. Next, the user fits the key-range hotness to a two-term-exponential distribution (f(x) = a*exp(b*x) + c*exp(d*x)) and generates the values of a, b, c, and d. They are the parameters in db_bench: prefix_dist_a, prefix_dist_b, prefix_dist_c, and prefix_dist_d. Finally, the user can run db_bench by specifying the parameters. For example: `./db_bench --benchmarks="mixgraph" -use_direct_io_for_flush_and_compaction=true -use_direct_reads=true -cache_size=268435456 -key_dist_a=0.002312 -key_dist_b=0.3467 -keyrange_dist_a=14.18 -keyrange_dist_b=-2.917 -keyrange_dist_c=0.0164 -keyrange_dist_d=-0.08082 -keyrange_num=30 -value_k=0.2615 -value_sigma=25.45 -iter_k=2.517 -iter_sigma=14.236 -mix_get_ratio=0.85 -mix_put_ratio=0.14 -mix_seek_ratio=0.01 -sine_mix_rate_interval_milliseconds=5000 -sine_a=350 -sine_b=0.0105 -sine_d=50000 --perf_level=2 -reads=1000000 -num=5000000 -key_size=48` Pull Request resolved: https://github.com/facebook/rocksdb/pull/5953 Test Plan: run db_bench with different parameters and checked the results. Differential Revision: D18053527 Pulled By: zhichao-cao fbshipit-source-id: 171f8b3142bd76462f1967c58345ad7e4f84bab7 | 06 November 2019, 21:02:20 UTC |
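The two-term exponential model named in the commit above, f(x) = a*exp(b*x) + c*exp(d*x), can be evaluated with a minimal sketch like the one below. The function name `TwoTermExp` is illustrative; how db_bench maps x to a key range internally is not shown in the commit message and is not assumed here.

```cpp
#include <cmath>

// Illustrative helper (hypothetical name): evaluates
// f(x) = a*exp(b*x) + c*exp(d*x) for fitted parameters a, b, c, d.
inline double TwoTermExp(double a, double b, double c, double d, double x) {
  return a * std::exp(b * x) + c * std::exp(d * x);
}
```

With the values from the example command (a=14.18, b=-2.917, c=0.0164, d=-0.08082), both exponents are negative, so the modeled hotness decreases monotonically as x grows, starting from a + c at x = 0.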
5080465 | Maysam Yabandeh | 06 November 2019, 19:11:51 UTC | Enable write-conflict snapshot in stress tests (#5897) Summary: DBImpl extends the public GetSnapshot() with GetSnapshotForWriteConflictBoundary() method that takes snapshots specially for write-write conflict checking. Compaction treats such snapshots differently to avoid GCing a value written after that, so that the write conflict remains visible even after the compaction. The patch extends stress tests with such snapshots. Pull Request resolved: https://github.com/facebook/rocksdb/pull/5897 Differential Revision: D17937476 Pulled By: maysamyabandeh fbshipit-source-id: bd8b0c578827990302194f63ae0181e15752951d | 06 November 2019, 19:13:22 UTC |