f9e0a06 | Andrew Kryczka | 06 June 2018, 18:16:32 UTC | update history and version | 06 June 2018, 18:16:32 UTC |
8d73964 | Huachao Huang | 22 March 2018, 05:56:48 UTC | Ignore empty filter block when data block is empty Summary: Close https://github.com/facebook/rocksdb/issues/3592 Closes https://github.com/facebook/rocksdb/pull/3614 Differential Revision: D7291706 Pulled By: ajkr fbshipit-source-id: 9dd8f40bd7716588e1e3fd6be0c2bc2766861f8c | 06 June 2018, 18:13:18 UTC |
d25ca5f | Fosco Marotto | 25 May 2018, 23:41:35 UTC | Update history for release of 5.13.2 | 25 May 2018, 23:41:35 UTC |
2da2286 | Adam Retter | 25 May 2018, 22:06:47 UTC | Fix an issue with unnecessary capture in lambda expressions Summary: Closes https://github.com/facebook/rocksdb/issues/3900 Replaces https://github.com/facebook/rocksdb/pull/3901 I needed this to build v5.12.4 on Mac OS X (10.13.3). Closes https://github.com/facebook/rocksdb/pull/3904 Differential Revision: D8169357 Pulled By: sagar0 fbshipit-source-id: 85faac42168796e7def9250d0c221a9a03b84476 | 25 May 2018, 23:26:04 UTC |
83973dd | Yanqin Jin | 25 May 2018, 18:45:12 UTC | Fix segfault caused by object premature destruction Summary: Please refer to earlier discussion in [issue 3609](https://github.com/facebook/rocksdb/issues/3609). There was also an alternative fix in [PR 3888](https://github.com/facebook/rocksdb/pull/3888), but the proposed solution requires complex change. To summarize the cause of the problem. Upon creation of a column family, a `BlockBasedTableFactory` object is `new`ed and encapsulated by a `std::shared_ptr`. Since there is no other `std::shared_ptr` pointing to this `BlockBasedTableFactory`, when the column family is dropped, the `ColumnFamilyData` is `delete`d, causing the destructor of `std::shared_ptr`. Since there is no other `std::shared_ptr`, the underlying memory is also freed. Later when the db exits, it releases all the table readers, including the table readers that have been operating on the dropped column family. This needs to access the `table_options` owned by `BlockBasedTableFactory` that has already been deleted. Therefore, a segfault is raised. Previous workaround is to purge all obsolete files upon `ColumnFamilyData` destruction, which leads to a force release of table readers of the dropped column family. However this does not work when the user disables file deletion. Our solution in this PR is making a copy of `table_options` in `BlockBasedTable::Rep`. This solution increases memory copy and usage, but is much simpler. Test plan ``` $ make -j16 $ ./column_family_test --gtest_filter=ColumnFamilyTest.CreateDropAndDestroy:ColumnFamilyTest.CreateDropAndDestroyWithoutFileDeletion ``` Expected behavior: All tests should pass. Closes https://github.com/facebook/rocksdb/pull/3898 Differential Revision: D8149421 Pulled By: riversand963 fbshipit-source-id: eaecc2e064057ef607fbdd4cc275874f866c3438 | 25 May 2018, 22:54:14 UTC |
1c18482 | Andrew Kryczka | 24 May 2018, 04:01:42 UTC | bump version to 5.13.2 and update HISTORY | 24 May 2018, 04:01:48 UTC |
b245e51 | Andrew Kryczka | 24 May 2018, 01:33:00 UTC | Introduce library-independent default compression level Summary: Previously we were using -1 as the default for every library, which was legacy from our zlib options. That worked for a while, but after zstd introduced https://github.com/facebook/zstd/commit/a146ee04ae5866b948be0c1911418e0436d80cb4, it started giving poor compression ratios by default in zstd. This PR adds a constant to RocksDB public API, `CompressionOptions::kDefaultCompressionLevel`, which will get translated to the default value specific to the compression library being used in "util/compression.h". The constant uses a number that appears to be larger than any library's maximum compression level. Closes https://github.com/facebook/rocksdb/pull/3895 Differential Revision: D8125780 Pulled By: ajkr fbshipit-source-id: 2db157a89118cd4f94577c2f4a0a5ff31c8391c6 | 24 May 2018, 03:59:31 UTC |
c60df9d | Andrew Kryczka | 01 May 2018, 01:19:02 UTC | update history and version | 01 May 2018, 01:19:02 UTC |
747c853 | Andrew Kryczka | 21 April 2018, 00:23:34 UTC | Avoid directory renames in BackupEngine Summary: We used to name private directories like "1.tmp" while BackupEngine populated them, and then rename without the ".tmp" suffix (i.e., rename "1.tmp" to "1") after all files were copied. On glusterfs, directory renames like this require operations across many hosts, and partial failures have caused operational problems. Fortunately we don't need to rename private directories. We already have a meta-file that uses the tempfile-rename pattern to commit a backup atomically after all its files have been successfully copied. So we can copy private files directly to their final location, so now there's no directory rename. Closes https://github.com/facebook/rocksdb/pull/3749 Differential Revision: D7705610 Pulled By: ajkr fbshipit-source-id: fd724a28dd2bf993ce323a5f2cb7e7d6980cc346 | 01 May 2018, 00:50:24 UTC |
f5ee207 | Gabriel Wicke | 24 April 2018, 15:38:01 UTC | Support lowering CPU priority of background threads Summary: Background activities like compaction can negatively affect latency of higher-priority tasks like request processing. To avoid this, rocksdb already lowers the IO priority of background threads on Linux systems. While this takes care of typical IO-bound systems, it does not help much when CPU (temporarily) becomes the bottleneck. This is especially likely when using more expensive compression settings. This patch adds an API to allow for lowering the CPU priority of background threads, modeled on the IO priority API. Benchmarks (see below) show significant latency and throughput improvements when CPU bound. As a result, workloads with some CPU usage bursts should benefit from lower latencies at a given utilization, or should be able to push utilization higher at a given request latency target. A useful side effect is that compaction CPU usage is now easily visible in common tools, allowing for an easier estimation of the contribution of compaction vs. request processing threads. As with IO priority, the implementation is limited to Linux, degrading to a no-op on other systems. Closes https://github.com/facebook/rocksdb/pull/3763 Differential Revision: D7740096 Pulled By: gwicke fbshipit-source-id: e5d32373e8dc403a7b0c2227023f9ce4f22b413c | 25 April 2018, 01:18:32 UTC |
f2fd21f | Zhongyi Xie | 16 April 2018, 00:19:57 UTC | fix memory leak in two_level_iterator Summary: this PR fixes a few failed contbuild: 1. ASAN memory leak in Block::NewIterator (table/block.cc:429). the proper destruction of first_level_iter_ and second_level_iter_ of two_level_iterator.cc is missing from the code after the refactoring in https://github.com/facebook/rocksdb/pull/3406 2. various unused param errors introduced by https://github.com/facebook/rocksdb/pull/3662 3. updated comment for `ForceReleaseCachedEntry` to emphasize the use of `force_erase` flag. Closes https://github.com/facebook/rocksdb/pull/3718 Reviewed By: maysamyabandeh Differential Revision: D7621192 Pulled By: miasantreble fbshipit-source-id: 476c94264083a0730ded957c29de7807e4f5b146 | 16 April 2018, 21:23:11 UTC |
7585278 | Maysam Yabandeh | 09 April 2018, 23:17:15 UTC | Fix the memory leak with pinned partitioned filters Summary: The existing unit test did not set the level so the check for pinned partitioned filter/index being properly released from the block cache was not properly exercised as they only take effect in level 0. As a result a memory leak in pinned partitioned filters was hidden. The patch fixes the test as well as the bug. Closes https://github.com/facebook/rocksdb/pull/3692 Differential Revision: D7559763 Pulled By: maysamyabandeh fbshipit-source-id: 55eff274945838af983c764a7d71e8daff092e4a | 16 April 2018, 21:12:49 UTC |
9d2e34e | Sagar Vemuri | 23 March 2018, 21:58:51 UTC | Fix History | 23 March 2018, 21:58:51 UTC |
1f5103d | Sagar Vemuri | 23 March 2018, 19:13:00 UTC | Add Java-API-Changes section to History Summary: We have not been updating our HISTORY.md change log with the RocksJava changes. Going forward, lets add Java changes also to HISTORY.md. There is an old java/HISTORY-JAVA.md, but it hasn't been updated in years. It is much easier to remember to update the change log in a single file, HISTORY.md. I added information about shared block cache here, which was introduced in #3623. Closes https://github.com/facebook/rocksdb/pull/3647 Differential Revision: D7384448 Pulled By: sagar0 fbshipit-source-id: 9b6e569f44e6df5cb7ba06413d9975df0b517d20 | 23 March 2018, 21:56:44 UTC |
163dd4b | Sagar Vemuri | 22 March 2018, 01:31:21 UTC | Shared block cache in RocksJava Summary: Changes to support sharing block cache using the Java API. Previously DB instances could share the block cache only when the same Options instance is passed to all the DB instances. But now, with this change, it is possible to explicitly create a cache and pass it to multiple options instances, to share the block cache. Implementing this for [Rocksandra](https://github.com/instagram/cassandra/tree/rocks_3.0), but this feature has been requested by many java api users over the years. Closes https://github.com/facebook/rocksdb/pull/3623 Differential Revision: D7305794 Pulled By: sagar0 fbshipit-source-id: 03e4e8ed7aeee6f88bada4a8365d4279ede2ad71 | 23 March 2018, 21:56:05 UTC |
1642582 | Sagar Vemuri | 23 March 2018, 00:34:52 UTC | Fsync after writing global seq number in ExternalSstFileIngestionJob Summary: Fsync after writing global sequence number to the ingestion file in ExternalSstFileIngestionJob. Otherwise the file metadata could be incorrect. Closes https://github.com/facebook/rocksdb/pull/3644 Differential Revision: D7373813 Pulled By: sagar0 fbshipit-source-id: 4da2c9e71a8beb5c08b4ac955f288ee1576358b8 | 23 March 2018, 21:52:21 UTC |
8d28083 | Fosco Marotto | 20 March 2018, 23:17:53 UTC | Update history for future 5.13 release | 20 March 2018, 23:17:53 UTC |
d1b2650 | Andrew Kryczka | 19 March 2018, 19:12:41 UTC | fix db_compaction_test when compression disabled Summary: Previously, the compaction in `DBCompactionTestWithParam.ForceBottommostLevelCompaction` generated multiple files in the no-compression use case, and one file in the compression use case. I increased `target_file_size_base` so it generates one file in both use cases. Closes https://github.com/facebook/rocksdb/pull/3625 Differential Revision: D7311885 Pulled By: ajkr fbshipit-source-id: 97f249fa83a9924ac34357a4bb3189c969ecb107 | 19 March 2018, 19:30:05 UTC |
ccb7613 | Tobias Tschinkowitz | 19 March 2018, 19:11:58 UTC | Enable compilation on OpenBSD Summary: I modified the Makefile so that we can compile rocksdb on OpenBSD. The instructions for building have been added to INSTALL.md. The whole compilation process works fine like this on OpenBSD-current Closes https://github.com/facebook/rocksdb/pull/3617 Differential Revision: D7323754 Pulled By: siying fbshipit-source-id: 990037d1cc69138d22f85bd77ef4dc8c1ba9edea | 19 March 2018, 19:30:05 UTC |
1139422 | Yanqin Jin | 19 March 2018, 05:39:18 UTC | Fix the command used to generate ctags Summary: In original $ROCKSDB_HOME/Makefile, the command used to generate ctags is ``` ctags * -R ``` However, this failed to generate tags for me. I did some search on the usage of ctags command and found that it should be ``` ctags -R . ``` or ``` ctags -R * ``` After the change, I can find the tags in vim using `:ts <identifier>`. Closes https://github.com/facebook/rocksdb/pull/3626 Reviewed By: ajkr Differential Revision: D7320217 Pulled By: riversand963 fbshipit-source-id: e4cd8f8a67842370a2343f0213df3cbd07754111 | 19 March 2018, 05:43:18 UTC |
bef95be | Adam Retter | 16 March 2018, 20:24:48 UTC | Improve the output of the RocksJava JUnit runner Summary: This changes the console output when the RocksJava tests are run. It makes spotting the errors and failures much easier; previously the output was malformed with results like "ERun" where the "E" represented an error in the preceding test. Closes https://github.com/facebook/rocksdb/pull/3621 Differential Revision: D7306172 Pulled By: sagar0 fbshipit-source-id: 3fa6f6e1ca6c6ea7ceef55a23ca81903716132b7 | 16 March 2018, 20:27:55 UTC |
cc34026 | zhsj | 16 March 2018, 20:23:36 UTC | fix wrong length in snprintf Summary: Closes https://github.com/facebook/rocksdb/pull/3622 Differential Revision: D7307689 Pulled By: ajkr fbshipit-source-id: b8f52effc63fea06c2058b39c60944c2c1f814b4 | 16 March 2018, 20:27:55 UTC |
ecfca1f | Huachao Huang | 16 March 2018, 17:27:39 UTC | Optimize overlap checking for external file ingestion Summary: If there are a lot of overlapped files in L0, creating a merging iterator for all files in L0 to check overlap can be very slow because we need to read and seek all files in L0. However, in that case, the ingested file is likely to overlap with some files in L0, so if we check those files one by one, we can stop once we encounter overlap. Ref: https://github.com/facebook/rocksdb/issues/3540 Closes https://github.com/facebook/rocksdb/pull/3564 Differential Revision: D7196784 Pulled By: anand1976 fbshipit-source-id: 8700c1e903bd515d0fa7005b6ce9b3a3d9db2d67 | 16 March 2018, 17:43:17 UTC |
da82aab | Niv Dayan | 15 March 2018, 18:46:16 UTC | allowing CompactFiles to return new file names Summary: This is a small API extension to allow the CompactFiles method to return the names of files that were created during the compaction. Closes https://github.com/facebook/rocksdb/pull/3608 Differential Revision: D7275789 Pulled By: siying fbshipit-source-id: 1ec0c3954a0f10cd877efb5f29f9be6c7b59e9ba | 15 March 2018, 18:58:12 UTC |
cc118b0 | Sagar Vemuri | 15 March 2018, 17:29:05 UTC | Update version Summary: We missed updating version.h on master when cutting 5.11.fb and 5.12.fb branches. It should be the same as the version in the latest release branch (or should it be one more?). I noticed this when trying to run some upgrade/downgrade tests from 5.11 to some new code on master. Closes https://github.com/facebook/rocksdb/pull/3611 Differential Revision: D7282917 Pulled By: sagar0 fbshipit-source-id: 205ee75b77c5b6bbcea95a272760b427025a4aba | 15 March 2018, 17:41:48 UTC |
0cdaa1a | Andrew Kryczka | 14 March 2018, 22:59:26 UTC | Fix WAL corruption from checkpoint/backup race condition Summary: `Writer::WriteBuffer` was always called at the beginning of checkpoint/backup. But that log writer has no internal synchronization, which meant the same buffer could be flushed twice in a race condition case, causing a WAL entry to be duplicated. Then subsequent WAL entries would be at unexpected offsets, causing the 32KB block boundaries to be overlapped and manifesting as a corruption. This PR fixes the behavior to only use `WriteBuffer` (via `FlushWAL`) in checkpoint/backup when manual WAL flush is enabled. In that case, users are responsible for providing synchronization between WAL flushes. We can also consider removing the call entirely. Closes https://github.com/facebook/rocksdb/pull/3603 Differential Revision: D7277447 Pulled By: ajkr fbshipit-source-id: 1b15bd7fd930511222b075418c10de0aaa70a35a | 14 March 2018, 23:12:50 UTC |
449627f | Yi Wu | 14 March 2018, 21:20:44 UTC | Blob DB: remove unreachable code Summary: Fixing #3604. Closes https://github.com/facebook/rocksdb/pull/3606 Reviewed By: siying Differential Revision: D7276604 Pulled By: yiwu-arbug fbshipit-source-id: 915c5897b010d28956f369989e49e64785d1161f | 14 March 2018, 21:27:28 UTC |
6f7b7f9 | Dmitri Smirnov | 14 March 2018, 07:55:04 UTC | Optionally create DuplicateDetector Summary: Address issue https://github.com/facebook/rocksdb/issues/3579 Closes https://github.com/facebook/rocksdb/pull/3589 Differential Revision: D7221161 Pulled By: yiwu-arbug fbshipit-source-id: bd875ab0aa0e414dfa98b1bf036ba9b4ed351361 | 14 March 2018, 07:57:25 UTC |
e003d22 | Chinmay Kamat | 14 March 2018, 07:48:11 UTC | Fix FaultInjectionTestEnv to work with DirectIO Summary: Implemented PositionedAppend() and use_direct_io() for TestWritableFile. With these changes, FaultInjectionTestEnv can be used with DirectIO enabled. Closes https://github.com/facebook/rocksdb/pull/3586 Differential Revision: D7244305 Pulled By: yiwu-arbug fbshipit-source-id: f6b7aece53daa0f9977bc684164a0693693e514c | 14 March 2018, 07:57:24 UTC |
09e5d7a | Zhongyi Xie | 14 March 2018, 01:47:38 UTC | add 4th test_group in travis Summary: to overcome the space limitation Closes https://github.com/facebook/rocksdb/pull/3605 Differential Revision: D7262607 Pulled By: miasantreble fbshipit-source-id: 1b1148026f17a7ee4b9f3a17ddc6b4ba9cf7af7f | 14 March 2018, 01:57:29 UTC |
2256dab | Andrew Kryczka | 13 March 2018, 21:46:41 UTC | fix flaky DBSSTTest.DeleteSchedulerMultipleDBPaths Summary: I landed #3544 which made this test flaky. The reason was the files scheduled for deletion sometimes went through the trash-marking process, and sometimes were deleted directly. Our counter only bumped on the former code path, so if the latter code path was used, we'd miss counting a file deleted by deletion scheduler. This PR also bumps the counter in the latter code path. Closes https://github.com/facebook/rocksdb/pull/3593 Differential Revision: D7226173 Pulled By: yiwu-arbug fbshipit-source-id: 81ab44c60834df6ff88db1d73ea34e26c6e93c39 | 13 March 2018, 21:57:26 UTC |
7153153 | Chinmay Kamat | 13 March 2018, 18:50:16 UTC | Fix enable_pipelined_write output in OPTIONS file Summary: enable_pipelined_write was not set in BuildDBOptions() causing its default value to be dumped in the OPTIONS file Closes https://github.com/facebook/rocksdb/pull/3585 Differential Revision: D7226395 Pulled By: yiwu-arbug fbshipit-source-id: 45a659a48d18103ac9ee74bb8805dd0a6ec12474 | 13 March 2018, 18:59:02 UTC |
f6156fb | Javeme Lee | 08 March 2018, 23:59:51 UTC | Support StringAppendOperator(delimiter_char) constructor in java-api Summary: Fixes #3336 Closes https://github.com/facebook/rocksdb/pull/3337 Differential Revision: D7196585 Pulled By: sagar0 fbshipit-source-id: a854f3fc906862ecba685b31946e4ef7c0b421c5 | 09 March 2018, 00:17:47 UTC |
c5302a8 | Adam Retter | 08 March 2018, 19:16:46 UTC | Java wrapper for Native Comparators Summary: This is an abstraction for working with custom Comparators implemented in native C++ code from Java. Native code must directly extend `rocksdb::Comparator`. When the native code comparator is compiled into the RocksDB codebase, you can then create a Java Class, and JNI stub to wrap it. Useful if the C++/JNI barrier overhead is too much for your applications comparator performance. An example is provided in `java/rocksjni/native_comparator_wrapper_test.cc` and `java/src/main/java/org/rocksdb/NativeComparatorWrapperTest.java`. Closes https://github.com/facebook/rocksdb/pull/3334 Differential Revision: D7172605 Pulled By: miasantreble fbshipit-source-id: e24b7eb267a3bcb6afa214e0379a1d5e8a2ceabe | 08 March 2018, 19:27:42 UTC |
e476d0e | Amy Tai | 08 March 2018, 18:39:15 UTC | Adding stat to count cancelled compactions Summary: Added a stat that counts the number of cancelled compactions. Closes https://github.com/facebook/rocksdb/pull/3574 Differential Revision: D7190259 Pulled By: amytai fbshipit-source-id: d5ce82dc9398da6d6d34023ad4ed8cec909852a3 | 08 March 2018, 18:42:28 UTC |
a3a3f54 | Bruce Mitchener | 08 March 2018, 18:18:34 UTC | Fix some typos in comments and docs. Summary: Closes https://github.com/facebook/rocksdb/pull/3568 Differential Revision: D7170953 Pulled By: siying fbshipit-source-id: 9cfb8dd88b7266da920c0e0c1e10fb2c5af0641c | 08 March 2018, 18:27:25 UTC |
a277b0f | Lukas Rist | 08 March 2018, 18:11:18 UTC | Clarification regarding record format Summary: The CRC is actually calculated based on the record type and payload. The wiki should also be updated accordingly and extended with a section on the recyclable record format. Closes https://github.com/facebook/rocksdb/pull/3576 Differential Revision: D7196478 Pulled By: siying fbshipit-source-id: 39f7a0395075cc73e2aa2bfc9e42c85bce35e765 | 08 March 2018, 18:27:25 UTC |
b560fc9 | Siying Dong | 08 March 2018, 18:09:59 UTC | Fix a block pinning regression introduced in b555ed30a4a93b80a3ac4781c6721ab988e03b5b Summary: b555ed30a4a93b80a3ac4781c6721ab988e03b5b introduces a regression, which causes blocks always to be pinned in block based iterators. Fix it. Closes https://github.com/facebook/rocksdb/pull/3582 Differential Revision: D7189534 Pulled By: siying fbshipit-source-id: 117dc7a03d0a0e360424db02efb366e12da2be03 | 08 March 2018, 18:12:23 UTC |
e69f6e8 | Sagar Vemuri | 07 March 2018, 23:17:37 UTC | Fix API name in a comment in db.h Summary: ... so that people are not confused. Closes https://github.com/facebook/rocksdb/pull/3580 Differential Revision: D7187175 Pulled By: sagar0 fbshipit-source-id: bce70093d52e38cd24c9432fd708885d7c2c013e | 07 March 2018, 23:27:17 UTC |
0de710f | Bruce Mitchener | 07 March 2018, 20:39:19 UTC | Use nullptr instead of NULL / 0 more consistently. Summary: Closes https://github.com/facebook/rocksdb/pull/3569 Differential Revision: D7170968 Pulled By: yiwu-arbug fbshipit-source-id: 308a6b7dd358a04fd9a7de3d927bfd8abd57d348 | 07 March 2018, 20:42:12 UTC |
f021f1d | Stuart | 07 March 2018, 04:51:30 UTC | Add rocksdb_open_with_ttl function in C API Summary: Change-Id: Ie6f9b10bce459f6bf0ade0e5877264b4e10da3f5 Signed-off-by: Stuart <Stuart.Hu@emc.com> Closes https://github.com/facebook/rocksdb/pull/3553 Differential Revision: D7144833 Pulled By: sagar0 fbshipit-source-id: 815225fa6e560d8a5bc47ffd0a98118b107ce264 | 07 March 2018, 04:57:20 UTC |
0a3db28 | amytai | 07 March 2018, 00:13:05 UTC | Disallow compactions if there isn't enough free space Summary: This diff handles cases where compaction causes an ENOSPC error. This does not handle corner cases where another background job is started while compaction is running, and the other background job triggers ENOSPC, although we do allow the user to provision for these background jobs with SstFileManager::SetCompactionBufferSize. It also does not handle the case where compaction has finished and some other background job independently triggers ENOSPC. Usage: Functionality is inside SstFileManager. In particular, users should set SstFileManager::SetMaxAllowedSpaceUsage, which is the reference highwatermark for determining whether to cancel compactions. Closes https://github.com/facebook/rocksdb/pull/3449 Differential Revision: D7016941 Pulled By: amytai fbshipit-source-id: 8965ab8dd8b00972e771637a41b4e6c645450445 | 07 March 2018, 00:27:54 UTC |
20c508c | Andrew Kryczka | 06 March 2018, 20:41:37 UTC | Enable subcompactions in manual level-based compaction Summary: This is the simplest way I could think of to speed up `CompactRange`. It works but isn't that optimal because it relies on the same `max_compaction_bytes` and `max_subcompactions` options that are used in other places. If it turns out to be useful we can allow overriding these in `CompactRangeOptions` in the future. Closes https://github.com/facebook/rocksdb/pull/3549 Differential Revision: D7117634 Pulled By: ajkr fbshipit-source-id: d0cd03d6bd0d2fd7ea3fb13cd3b8bf7c47d11e42 | 06 March 2018, 20:43:51 UTC |
3462c94 | Fosco Marotto | 06 March 2018, 20:32:35 UTC | Add dual-license info to README.md Summary: From #3417 and after talking to both GitHub and our open source legal team, the recommended approach was to explicitly state the dual-license in the readme. Changing the license files to accommodate the auto-detection is too much of a pain, would involve editing every code file header. Closes https://github.com/facebook/rocksdb/pull/3541 Differential Revision: D7171111 Pulled By: gfosco fbshipit-source-id: 0ee7b134446015228249efe991fa5e76526ca0b0 | 06 March 2018, 20:43:51 UTC |
6a3eebb | Andrew Kryczka | 06 March 2018, 20:31:25 UTC | support multiple db_paths in SstFileManager Summary: Now that files scheduled for deletion are kept in the same directory, we don't need to constrain deletion scheduler to `db_paths[0]`. Previously this was done because there was a separate trash directory, and this constraint prevented files from being accidentally copied to another filesystem when they're scheduled for deletion. Closes https://github.com/facebook/rocksdb/pull/3544 Differential Revision: D7093786 Pulled By: ajkr fbshipit-source-id: 202f5c92d925eafebec1281fb95bb5828d33414f | 06 March 2018, 20:43:51 UTC |
d518fe1 | Fosco Marotto | 06 March 2018, 20:27:07 UTC | uint64_t and size_t changes to compile for iOS Summary: In attempting to build a static lib for use in iOS, I ran in to lots of type errors between uint64_t and size_t. This PR contains the changes I made to get `TARGET_OS=IOS make static_lib` to succeed while also getting Xcode to build successfully with the resulting `librocksdb.a` library imported. This also compiles for me on macOS and tests fine, but I'm really not sure if I made the correct decisions about where to `static_cast` and where to change types. Also up for discussion: is iOS worth supporting? Getting the static lib is just part one, we aren't providing any bridging headers or wrappers like the ObjectiveRocks project, it won't be a great experience. Closes https://github.com/facebook/rocksdb/pull/3503 Differential Revision: D7106457 Pulled By: gfosco fbshipit-source-id: 82ac2073de7e1f09b91f6b4faea91d18bd311f8e | 06 March 2018, 20:43:51 UTC |
8bc41f4 | Siying Dong | 06 March 2018, 20:19:15 UTC | Update TARGETS Summary: Watch the build Closes https://github.com/facebook/rocksdb/pull/3533 Differential Revision: D7063777 Pulled By: siying fbshipit-source-id: db9cdfc362a8d281dada6513ab034a6d6f0d552e | 06 March 2018, 20:27:28 UTC |
c364eb4 | Dmitri Smirnov | 06 March 2018, 19:47:42 UTC | Windows cumulative patch Summary: This patch addressed several issues. Portability including db_test std::thread -> port::Thread Cc: @ and %z to ROCKSDB portable macro. Cc: maysamyabandeh Implement Env::AreFilesSame Make the implementation of file unique number more robust Get rid of C-runtime and go directly to Windows API when dealing with file primitives. Implement GetSectorSize() and align unbuffered read on the value if available. Adjust Windows Logger for the new interface, implement CloseImpl() Cc: anand1976 Fix test running script issue where $status var was of incorrect scope so the failures were swallowed and not reported. DestroyDB() creates a logger and opens a LOG file in the directory being cleaned up. This holds a lock on the folder and the cleanup is prevented. This fails one of the checkpoint tests. We observe the same in production. We close the log file in this change. Fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose failure where the test attempts to open a directory with NewRandomAccessFile which does not work on Windows. Fix DBTest.SoftLimit as it is dependent on thread timing. CC: yiwu-arbug Closes https://github.com/facebook/rocksdb/pull/3552 Differential Revision: D7156304 Pulled By: siying fbshipit-source-id: 43db0a757f1dfceffeb2b7988043156639173f5b | 06 March 2018, 19:57:43 UTC |
b864bc9 | Yi Wu | 06 March 2018, 19:46:20 UTC | Blob DB: Improve FIFO eviction Summary: Improving blob db FIFO eviction with the following changes, * Change blob_dir_size to max_db_size. Take into account SST file size when computing DB size. * FIFO now only take into account live sst files and live blob files. It is normal for disk usage to go over max_db_size because there are obsolete sst files and blob files pending deletion. * FIFO eviction now also evict TTL blob files that's still open. It doesn't evict non-TTL blob files. * If FIFO is triggered, it will pass an expiration and the current sequence number to compaction filter. Compaction filter will then filter inlined keys to evict those with an earlier expiration and smaller sequence number. So call LSM FIFO. * Compaction filter also filter those blob indexes where corresponding blob file is gone. * Add an event listener to listen compaction/flush event and update sst file size. * Implement DB::Close() to make sure base db, as well as event listener and compaction filter, destruct before blob db. * More blob db statistics around FIFO. * Fix some locking issue when accessing a blob file. Closes https://github.com/facebook/rocksdb/pull/3556 Differential Revision: D7139328 Pulled By: yiwu-arbug fbshipit-source-id: ea5edb07b33dfceacb2682f4789bea61de28bbfa | 06 March 2018, 19:57:42 UTC |
0a2354c | Pooya Shareghi | 06 March 2018, 18:20:52 UTC | Added bytes XOR merge operator Summary: Closes https://github.com/facebook/rocksdb/pull/575 I fixed the merge conflicts etc. Closes https://github.com/facebook/rocksdb/pull/3065 Differential Revision: D7128233 Pulled By: sagar0 fbshipit-source-id: 2c23a48c9f0432c290b0cd16a12fb691bb37820c | 06 March 2018, 18:27:36 UTC |
62277e1 | Maysam Yabandeh | 06 March 2018, 07:48:23 UTC | WritePrepared Txn: Move DuplicateDetector to util Summary: Move DuplicateDetector and SetComparator to its own header file in util. It would also address a complaint in the unity test. Closes https://github.com/facebook/rocksdb/pull/3567 Differential Revision: D7163268 Pulled By: maysamyabandeh fbshipit-source-id: 6ddf82773473646dbbc1284ae601a78c4907c778 | 06 March 2018, 07:57:12 UTC |
9cb4856 | Huachao Huang | 06 March 2018, 01:44:52 UTC | Don't need to UpdateFilesByCompactionPri for kCompactionStyleNone Summary: Closes https://github.com/facebook/rocksdb/pull/3563 Differential Revision: D7154653 Pulled By: ajkr fbshipit-source-id: 4f32fb1b02451a934504c40be22b07fb1f2deb9c | 06 March 2018, 01:57:39 UTC |
5d68243 | Andrew Kryczka | 05 March 2018, 21:08:17 UTC | Comment out unused variables Summary: Submitting on behalf of another employee. Closes https://github.com/facebook/rocksdb/pull/3557 Differential Revision: D7146025 Pulled By: ajkr fbshipit-source-id: 495ca5db5beec3789e671e26f78170957704e77e | 05 March 2018, 21:13:41 UTC |
1ccdc2c | Pengchao Wang | 05 March 2018, 19:54:59 UTC | Fix vagrant build process Summary: https://blog.github.com/2018-02-23-weak-cryptographic-standards-removed/ GitHub dropped support for some weak cryptographic protocols from their website a couple of weeks ago, which caused our vagrant build process to fail on the curl downloading step. This diff forces curl to use the TLS v1.2 protocol if it is supported so that it does not rely on the default protocol on different systems. Closes https://github.com/facebook/rocksdb/pull/3561 Differential Revision: D7148575 Pulled By: wpc fbshipit-source-id: b8cecfdfeb2bc8236de2d0d14f044532befec98c | 05 March 2018, 19:57:41 UTC |
92b1a68 | Zhongyi Xie | 05 March 2018, 19:10:38 UTC | fix FreeBSD build Summary: Currently FreeBSD build is broken in master and possibly some previous releases due to unrecognized symbol `O_DIRECT`. This PR will fix the build on FreeBSD Closes https://github.com/facebook/rocksdb/pull/3560 Differential Revision: D7148646 Pulled By: miasantreble fbshipit-source-id: 95b6c3d310fa531267c086b2cd40a5ab1c042b5a | 05 March 2018, 19:12:28 UTC |
680864a | Maysam Yabandeh | 05 March 2018, 18:48:29 UTC | WritePrepared Txn: Fix bug with duplicate keys during recovery Summary: Fix the following bugs: - During recovery a duplicate key was inserted twice into the write batch of the recovery transaction, once when the memtable returns false (because it was a duplicate) and once for the 2nd attempt. This would result in a different SubBatch count being measured when the recovered transaction is committing. - If a cf is flushed during recovery the memtable is not available to assist in detecting the duplicate key. This could result in not advancing the sequence number when iterating over duplicate keys of a flushed cf and hence inserting the next key with the wrong sequence number. - SubBatchCounter would reset the comparator to the default comparator after the first duplicate key. The 2nd duplicate key hence would have gone through a wrong comparator and not been detected. Closes https://github.com/facebook/rocksdb/pull/3562 Differential Revision: D7149440 Pulled By: maysamyabandeh fbshipit-source-id: 91ec317b165f363f5d11ff8b8c47c81cebb8ed77 | 05 March 2018, 18:57:59 UTC |
15f55e5 | Sagar Vemuri | 03 March 2018, 00:19:56 UTC | Fix TSAN timeout in MergeOperatorPinningTest.Randomized/x test Summary: [FB - Internal] MergeOperatorPinningTest.Randomized/x tests are frequently failing with timeouts when run with tsan, as they are exceeding 10 minute limit for tests. The tests are in turn getting disabled due to frequent failures. I halved the number of rounds to make the test complete sooner. This reduces the number of testing iterations a little, but it still is much better than totally letting the test be disabled. Closes https://github.com/facebook/rocksdb/pull/3523 Differential Revision: D7031498 Pulled By: sagar0 fbshipit-source-id: 9a694f2176b235259920a42bf24bca5346f7cff1 | 03 March 2018, 00:27:21 UTC |
db2445a | Adam Retter | 02 March 2018, 23:33:08 UTC | Brings the Java API for WriteBatch inline with the C++ API Summary: * Exposes status * Corrects some method naming * Adds missing functionality Closes https://github.com/facebook/rocksdb/pull/3550 Differential Revision: D7140790 Pulled By: sagar0 fbshipit-source-id: cbdab6c5a7ae4f3030fb46739e9060e381b26fa6 | 02 March 2018, 23:44:10 UTC |
1209b6d | Yi Wu | 02 March 2018, 20:54:24 UTC | Blob DB: remove existing garbage collection implementation Summary: Red diff to remove the existing implementation of garbage collection. The current approach is a reference-counting kind of approach and requires a lot of effort to get the size counter right on compaction and deletion. I'm going to go with a simple mark-sweep kind of approach and will send another PR for that. CompactionEventListener was added solely for blob db and it adds complexity and overhead to the compaction iterator. Removing it as well. Closes https://github.com/facebook/rocksdb/pull/3551 Differential Revision: D7130190 Pulled By: yiwu-arbug fbshipit-source-id: c3a375ad2639a3f6ed179df6eda602372cc5b8df | 02 March 2018, 20:57:23 UTC |
2ac988c | Adam Retter | 02 March 2018, 18:22:38 UTC | Add TransactionDB and OptimisticTransactionDB to the Java API Summary: Closes https://github.com/facebook/rocksdb/issues/697 Closes https://github.com/facebook/rocksdb/issues/1151 Closes https://github.com/facebook/rocksdb/pull/1298 Differential Revision: D7131402 Pulled By: sagar0 fbshipit-source-id: bcd34ce95ed88cc641786089ff4232df7b2f089f | 02 March 2018, 18:34:13 UTC |
d060421 | Maysam Yabandeh | 02 March 2018, 04:33:41 UTC | Fix a leak in prepared_section_completed_ Summary: The zeroed entries were not removed from the prepared_section_completed_ map. This patch adds a unit test to show the problem and fixes it by refactoring the code. The new code is more efficient since i) it uses two separate mutexes to avoid contention between commit and prepare threads, ii) it uses a sorted vector for maintaining unique log entries with prepare, which avoids a very large heap with many duplicate entries. Closes https://github.com/facebook/rocksdb/pull/3545 Differential Revision: D7106071 Pulled By: maysamyabandeh fbshipit-source-id: b3ae17cb6cd37ef10b6b35e0086c15c758768a48 | 02 March 2018, 04:41:56 UTC |
bf937cf | Yi Wu | 02 March 2018, 01:50:54 UTC | Add "rocksdb.live-sst-files-size" DB property Summary: Add "rocksdb.live-sst-files-size" DB property which only includes files of the latest version. The existing "rocksdb.total-sst-files-size" includes files from all versions and thus includes files that are obsolete but not yet deleted. I'm going to use this new property to cap blob db sst + blob files size. Closes https://github.com/facebook/rocksdb/pull/3548 Differential Revision: D7116939 Pulled By: yiwu-arbug fbshipit-source-id: c6a52e45ce0f24ef78708156e1a923c1dd6bc79a | 02 March 2018, 02:01:10 UTC |
ec5843d | leviathan1995 | 28 February 2018, 17:49:32 UTC | Comment typo Summary: Closes https://github.com/facebook/rocksdb/pull/3546 Differential Revision: D7111708 Pulled By: ajkr fbshipit-source-id: 522a4a00eb3e34c73afcb86c1f75cd2e90e7608d | 28 February 2018, 17:56:45 UTC |
3ae0047 | Andrew Kryczka | 28 February 2018, 01:08:34 UTC | skip CompactRange flush based on memtable contents Summary: CompactRange has a call to Flush because we guarantee that, at the time it's called, all existing keys in the range will be pushed through the user's compaction filter. However, previously the flush was done blindly, so it'd happen even if the memtable does not contain keys in the range specified by the user. This caused unnecessarily many L0 files to be created, leading to write stalls in some cases. This PR checks the memtable's contents, and decides to flush only if it overlaps with `CompactRange`'s range. - Move the memtable overlap check logic from `ExternalSstFileIngestionJob` to `ColumnFamilyData::RangesOverlapWithMemtables` - Reuse the above logic in `CompactRange` and skip flushing if no overlap Closes https://github.com/facebook/rocksdb/pull/3520 Differential Revision: D7018897 Pulled By: ajkr fbshipit-source-id: a3c6b1cfae56687b49dd89ccac7c948e53545934 | 28 February 2018, 01:12:44 UTC |
c287c09 | Siying Dong | 27 February 2018, 20:32:09 UTC | Update comments in DB::Close() Summary: Closes https://github.com/facebook/rocksdb/pull/3543 Differential Revision: D7093251 Pulled By: siying fbshipit-source-id: 4066b82c95ecb65866c5842d68ab13ab9f85d567 | 27 February 2018, 20:42:31 UTC |
d633656 | Istvan Szukacs | 26 February 2018, 23:17:22 UTC | Adding CentOS 7 Vagrantfile & build script Summary: I have updated the Vagrantfile to have an entry for CentOS 7. Also created a simple build script which is pretty similar to the one in Beringei. How to test: ``` vagrant up centos7 ``` Todo: Implement -j X for the build. Closes https://github.com/facebook/rocksdb/pull/3530 Differential Revision: D7090739 Pulled By: ajkr fbshipit-source-id: 9f9eda5b507568993543d08de7ce168dfc12282e | 26 February 2018, 23:27:17 UTC |
ad05cbb | Zhongyi Xie | 26 February 2018, 22:46:12 UTC | DB:Open should fail on tmpfs when use_direct_reads=true Summary: Before: > $ TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32 DB path: [/dev/shm/dbbench] put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument db_bench: tpp.c:84: __pthread_tpp_change_priority: Assertion `new_prio == -1 || (new_prio >= fifo_min_prio && new_prio <= fifo_max_prio)' failed. 
put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument After: > TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32 Initializing RocksDB Options from the specified file Initializing RocksDB Options from command-line flags open error: Not implemented: Direct I/O is not supported by the specified DB. Closes https://github.com/facebook/rocksdb/pull/3539 Differential Revision: D7082658 Pulled By: miasantreble fbshipit-source-id: f9d9c6ec3b5e9e049cab52154940ee101ba4d342 | 26 February 2018, 22:58:06 UTC |
7eb292d | Dmitri Smirnov | 26 February 2018, 21:44:25 UTC | Fix a memory leak in WindowsThread Summary: _endthreadex does not return, and thus destructors for stack objects do not run. This creates a memory leak. We remove the calls since _endthreadex is called automatically after the threadproc returns, i.e. when the thread exits. Closes https://github.com/facebook/rocksdb/pull/3542 Differential Revision: D7088713 Pulled By: ajkr fbshipit-source-id: 749ecafc6a9572f587f76e516547e07734349a54 | 26 February 2018, 21:46:12 UTC |
dfbe52e | Anand Ananthabhotla | 23 February 2018, 21:50:02 UTC | Fix the Logger::Close() and DBImpl::Close() design pattern Summary: The recent Logger::Close() and DBImpl::Close() implementation rely on calling the CloseImpl() virtual function from the destructor, which will not work. Refactor the implementation to have a private close helper function in derived classes that can be called by both CloseImpl() and the destructor. Closes https://github.com/facebook/rocksdb/pull/3528 Reviewed By: gfosco Differential Revision: D7049303 Pulled By: anand1976 fbshipit-source-id: 76a64cbf403209216dfe4864ecf96b5d7f3db9f4 | 23 February 2018, 21:57:26 UTC |
30649dc | Siying Dong | 23 February 2018, 19:30:55 UTC | Have a different function when ROCKSDB_JEMALLOC=0 Summary: Some sanitizer is not happy with parameter name with ROCKSDB_JEMALLOC not set. Use another function instead. Closes https://github.com/facebook/rocksdb/pull/3536 Differential Revision: D7064849 Pulled By: siying fbshipit-source-id: c6ae94e044686176af1259df9172453d52c2f9d5 | 23 February 2018, 19:42:33 UTC |
90eca1e | Maysam Yabandeh | 23 February 2018, 02:05:14 UTC | WritePrepared Txn: optimize SubBatchCnt Summary: Make use of the index in WriteBatchWithIndex to also count the number of sub-batches. This eliminates the need to separately scan the batch to count the number of sub-batches once a duplicate key is detected. Closes https://github.com/facebook/rocksdb/pull/3529 Differential Revision: D7049947 Pulled By: maysamyabandeh fbshipit-source-id: 81cbf12c4e662541c772c7265a8f91631e25c7cd | 23 February 2018, 02:12:26 UTC |
243220d | Zhongyi Xie | 23 February 2018, 00:31:58 UTC | Update HISTORY.md to 5.12.0 Summary: Closes https://github.com/facebook/rocksdb/pull/3532 Differential Revision: D7062828 Pulled By: miasantreble fbshipit-source-id: d36967a1cfbcaeeeb33b9f0e09e15dea85b08b70 | 23 February 2018, 00:47:01 UTC |
4624edc | Siying Dong | 22 February 2018, 21:23:53 UTC | RocksDBOptionsParser::Parse()'s `ignore_unknown_options` argument only ignores options from higher version. Summary: RocksDB should always be able to parse an option file generated using the same or a lower version. An unknown option should only happen if it is from a higher version. Change the behavior of RocksDBOptionsParser::Parse() with ignore_unknown_options=true so that an unknown option from a lower or the same version will never be skipped. Closes https://github.com/facebook/rocksdb/pull/3527 Differential Revision: D7048851 Pulled By: siying fbshipit-source-id: e261caea12f6515611a4a29f39acf2b619df2361 | 22 February 2018, 21:28:12 UTC |
aba3409 | Igor Sugak | 22 February 2018, 20:36:41 UTC | Back out "[codemod] - comment out unused parameters" Reviewed By: igorsugak fbshipit-source-id: 4a93675cc1931089ddd574cacdb15d228b1e5f37 | 22 February 2018, 20:43:17 UTC |
f4a030c | David Lai | 22 February 2018, 17:37:17 UTC | - comment out unused parameters Reviewed By: everiq, igorsugak Differential Revision: D7046710 fbshipit-source-id: 8e10b1f1e2aecebbfb229c742e214db887e5a461 | 22 February 2018, 17:44:23 UTC |
b092977 | Andrew Kryczka | 22 February 2018, 01:33:14 UTC | BackupEngine gluster-friendly file naming convention Summary: Use the rsync tempfile naming convention in our `BackupEngine`. The temp file follows the format, `.<filename>.<suffix>`, which is later renamed to `<filename>`. We fix `tmp` as the `<suffix>` as we don't need to use random bytes for now. The benefit is gluster treats this tempfile naming convention specially and applies hashing only to `<filename>`, so the file won't need to be linked or moved when it's renamed. Our gluster team suggested this will make things operationally easier. Closes https://github.com/facebook/rocksdb/pull/3463 Differential Revision: D6893333 Pulled By: ajkr fbshipit-source-id: fd7622978f4b2487fce33cde40dd3124f16bcaa8 | 22 February 2018, 01:42:07 UTC |
828211e | Maysam Yabandeh | 21 February 2018, 21:40:31 UTC | WritePrepared Txn: fix non-emptied PreparedHeap bug Summary: Under a certain sequence of accessing PreparedHeap, there was a bug that would not successfully empty the heap. This would result in performance issues when the heap content is moved to old_prepared_ after max_evicted_seq_ advances the orphan prepared sequence numbers. The patch fixes the bug and adds more unit tests. It also does more logging when the unlikely scenarios are faced. Closes https://github.com/facebook/rocksdb/pull/3526 Differential Revision: D7038486 Pulled By: maysamyabandeh fbshipit-source-id: f1e40bea558f67b03d2a29131fcb8734c65fce97 | 21 February 2018, 21:42:23 UTC |
8ada876 | Sagar Vemuri | 21 February 2018, 03:05:21 UTC | Add rocksdb.iterator.internal-key property Summary: Added a new iterator property: `rocksdb.iterator.internal-key` to get the internal-key (converted to user key) at which the iterator stopped. Closes https://github.com/facebook/rocksdb/pull/3525 Differential Revision: D7033694 Pulled By: sagar0 fbshipit-source-id: d51e6c00f5e9d766c6276ef79774b81c6c5216f8 | 21 February 2018, 03:12:09 UTC |
e9c31ab | jsteemann | 21 February 2018, 01:34:44 UTC | save redundant key lookup in map of locked keys Summary: In case it is found that a key is already marked as locked in a stripe's map of locked keys, it is not necessary to look it up again using `std::unordered_map<std::string, ...>::at(size_t)`. Instead, we can use the already found position using the iterator produced by the previous `find` operation. Reusing the iterator will avoid having to hash the key again and do additional "random" memory lookups in the map of keys (though the data will very likely sit available in caches here already due to the previous find operation) Closes https://github.com/facebook/rocksdb/pull/3505 Differential Revision: D7036446 Pulled By: sagar0 fbshipit-source-id: cced51547b2bd2d49394f6bc8c5896f09fa80f68 | 21 February 2018, 01:44:44 UTC |
1960e73 | Andrew Kryczka | 21 February 2018, 00:42:06 UTC | fix handling of empty string as checkpoint directory Summary: - made `CreateCheckpoint` properly return `InvalidArgument` when called with an empty directory. Previously it triggered an assertion failure due to a bug in the logic. - made `ldb` set empty `checkpoint_dir` if that's what the user specifies, so that we can use it to properly test `CreateCheckpoint` in the future. Differential Revision: D6874562 fbshipit-source-id: dcc1bd41768261d9338987fa7711444289707ed7 | 21 February 2018, 00:44:00 UTC |
5263da6 | Igor Sugak | 21 February 2018, 00:41:54 UTC | fix shift UBSAN error in col_buf_encoder.cc Summary: Add a static cast to perform the left shift as with an unsigned type. make ubsan_check Closes https://github.com/facebook/rocksdb/pull/3517 Reviewed By: sagar0 Differential Revision: D7016044 Pulled By: igorsugak fbshipit-source-id: baf72f6197edd8f7220d010b15a23d6de6a72c49 | 21 February 2018, 00:44:00 UTC |
ab446dc | Po-Chuan Hsieh | 16 February 2018, 18:34:48 UTC | Fix build with USE_RTTI=0 Summary: utilities/column_aware_encoding_util.cc:61:23: error: cannot use dynamic_cast with -fno-rtti table_reader_.reset(dynamic_cast<BlockBasedTable*>(table_reader.release())); ^ 1 error generated. It was added as a [local patch](https://svnweb.freebsd.org/ports/head/databases/rocksdb/files/patch-utilities-column_aware_encoding_util.cc) on FreeBSD since RocksDB 5.8. It also fixes #2707. Closes https://github.com/facebook/rocksdb/pull/3514 Differential Revision: D7005571 Pulled By: siying fbshipit-source-id: 351a9055d21d0accdd7a932e8e7bfcd3c8e22068 | 16 February 2018, 18:41:49 UTC |
c178da0 | Maysam Yabandeh | 16 February 2018, 16:36:47 UTC | WritePrepared Txn: optimizations for sysbench update_noindex Summary: These are optimizations that we applied to improve sysbench's update_noindex performance. 1. Make use of LIKELY compiler hint 2. Move std::atomic to the subclass 3. Make use of skip_prepared in non-2pc transactions. Closes https://github.com/facebook/rocksdb/pull/3512 Differential Revision: D7000075 Pulled By: maysamyabandeh fbshipit-source-id: 1ab8292584df1f6305a4992973fb1b7933632181 | 16 February 2018, 16:42:31 UTC |
97307d8 | Mike Kolupaev | 16 February 2018, 15:58:18 UTC | Fix deadlock in ColumnFamilyData::InstallSuperVersion() Summary: Deadlock: a memtable flush holds DB::mutex_ and calls ThreadLocalPtr::Scrape(), which locks ThreadLocalPtr mutex; meanwhile, a thread exit handler locks ThreadLocalPtr mutex and calls SuperVersionUnrefHandle, which tries to lock DB::mutex_. This deadlock is hit all the time on our workload. It blocks our release. In general, the problem is that ThreadLocalPtr takes an arbitrary callback and calls it while holding a lock on a global mutex. The same global mutex is (at least in some cases) locked by almost all ThreadLocalPtr methods, on any instance of ThreadLocalPtr. So, there'll be a deadlock if the callback tries to do anything to any instance of ThreadLocalPtr, or waits for another thread to do so. So, probably the only safe way to use ThreadLocalPtr callbacks is to do only simple and lock-free things in them. This PR fixes the deadlock by making sure that local_sv_ never holds the last reference to a SuperVersion, and therefore SuperVersionUnrefHandle never has to do any nontrivial cleanup. I also searched for other uses of ThreadLocalPtr to see if they may have similar bugs. There's only one other use, in transaction_lock_mgr.cc, and it looks fine. Closes https://github.com/facebook/rocksdb/pull/3510 Reviewed By: sagar0 Differential Revision: D7005346 Pulled By: al13n321 fbshipit-source-id: 37575591b84f07a891d6659e87e784660fde815f | 16 February 2018, 16:13:34 UTC |
0454f78 | Andrew Kryczka | 16 February 2018, 03:30:52 UTC | fix advance reservation of arena block addresses Summary: Calling `std::vector::reserve()` causes memory to be reallocated and then data to be moved. It was called prior to adding every block. This reallocation could be done a huge amount of times, e.g., for users with large index blocks. Instead, we can simply use `std::vector::emplace_back()` in such a way that preserves the no-memory-leak guarantee, while letting the vector decide when to reallocate space. Now I see reallocation/moving happen O(logN) times, rather than O(N) times, where N is the final size of vector. Closes https://github.com/facebook/rocksdb/pull/3508 Differential Revision: D6994228 Pulled By: ajkr fbshipit-source-id: ab7c11e13ff37c8c6c8249be7a79566a4068cd27 | 16 February 2018, 03:41:52 UTC |
989d123 | Yi Wu | 16 February 2018, 01:14:08 UTC | Legocastle job to report lite build binary size to scuba Summary: Add a legocastle job to continuously build the last 10 commits every 4 hours and report lite build binary size to scuba. Closes https://github.com/facebook/rocksdb/pull/3511 Differential Revision: D7001730 Pulled By: yiwu-arbug fbshipit-source-id: 7c8ca87c46d663c786a0d32be69ebbe7b19a5eb9 | 16 February 2018, 01:27:24 UTC |
8eb1d44 | Maysam Yabandeh | 16 February 2018, 01:12:48 UTC | Unbreak MemTableRep API change Summary: The MemTableRep API was broken by this commit: 813719e9525f647aaebf19ca3d4bb6f1c63e2648 This patch reverts the changes and instead adds InsertKey (and etc.) overloads to extend the MemTableRep API without breaking the existing classes that inherit from it. Closes https://github.com/facebook/rocksdb/pull/3513 Differential Revision: D7004134 Pulled By: maysamyabandeh fbshipit-source-id: e568d91fe1e17dd76c0c1f6c7dd51a18633b1c4f | 16 February 2018, 01:27:24 UTC |
4e7a182 | jsteemann | 16 February 2018, 00:43:23 UTC | Several small "fixes" Summary: - removed a few unneeded variables - fused some variable declarations and their assignments - fixed right-trimming code in string_util.cc to not underflow - simplified an assertion - moved a non-nullptr check assertion before the dereferencing of that pointer - pass an std::string function parameter by const reference instead of by value (avoiding a potential copy) Closes https://github.com/facebook/rocksdb/pull/3507 Differential Revision: D7004679 Pulled By: sagar0 fbshipit-source-id: 52944952d9b56dfcac3bea3cd7878e315bb563c4 | 16 February 2018, 00:57:37 UTC |
c88c57c | Zhongyi Xie | 15 February 2018, 22:11:08 UTC | Tweak external file ingestion seqno logic under universal compaction Summary: Right now it is possible that a file gets assigned to L0 but also assigned the seqno from a higher level which it doesn't fit. Under the current impl, it is possible that seqno in lower levels (Ln) can be equal to smallest seqno of higher levels (Ln-1), which is undesirable from universal compaction's point of view. This should fix the intermittent failure of `ExternalSSTFileBasicTest.IngestFileWithGlobalSeqnoPickedSeqno` Closes https://github.com/facebook/rocksdb/pull/3411 Differential Revision: D6813802 Pulled By: miasantreble fbshipit-source-id: 693d0462fa94725ccfb9d8858743e6d2d9992d14 | 15 February 2018, 22:13:39 UTC |
6a30b98 | jsteemann | 15 February 2018, 19:01:42 UTC | fix wrong indentation Summary: Somehow the indentation was incorrect in this file. The only change in this PR is to get it right again in order to make the code more readable. Please reject if you think it's not worth it. Closes https://github.com/facebook/rocksdb/pull/3504 Differential Revision: D6996011 Pulled By: miasantreble fbshipit-source-id: 060514a3a8c910d34bad795b36eb4d278512b154 | 15 February 2018, 19:13:37 UTC |
ba6ee1f | Fosco Marotto | 14 February 2018, 19:08:39 UTC | Fix 2 more unused reference errors VS2017 Summary: As in #3425 Closes https://github.com/facebook/rocksdb/pull/3497 Differential Revision: D6979588 Pulled By: gfosco fbshipit-source-id: e9fb32d04ad45575dfe9de1d79348d158e474197 | 14 February 2018, 19:12:36 UTC |
b3c5351 | Siying Dong | 14 February 2018, 00:20:13 UTC | Direct I/O writable file should do fsync in Close() Summary: We don't do fsync() after truncate in the direct I/O writable file (in fact we don't do any fsync ever). This can cause metadata to not be persisted to disk after the file is generated. We now call fsync() in Close(). Closes https://github.com/facebook/rocksdb/pull/3500 Differential Revision: D6981482 Pulled By: siying fbshipit-source-id: 7e2b591b7e5dd1b96fc0775515b8b9e6092980ef | 14 February 2018, 00:27:11 UTC |
d08d05c | Igor Sugak | 13 February 2018, 22:07:48 UTC | fix UBSAN errors in fault_injection_test Summary: This fixes shift and signed-integer-overflow UBSAN checks in fault_injection_test by using a larger and unsigned type. Closes https://github.com/facebook/rocksdb/pull/3498 Reviewed By: siying Differential Revision: D6981116 Pulled By: igorsugak fbshipit-source-id: 3688f62cce570534b161e9b5f42109ebc9ae5a2c | 13 February 2018, 22:12:40 UTC |
dadf016 | Siying Dong | 13 February 2018, 21:44:22 UTC | Rename one of the two LevelIterator Summary: A new LevelIterator was recently created. Rename the old one to make unity build happy. It's also not a good idea to have two classes in the same name anyway. Closes https://github.com/facebook/rocksdb/pull/3499 Differential Revision: D6979325 Pulled By: siying fbshipit-source-id: 3a032d93fe205650a08e92e5262594731ec726bb | 13 February 2018, 21:57:58 UTC |
7474861 | Siying Dong | 13 February 2018, 20:05:36 UTC | Suppress UBSAN error in finer granularity Summary: Currently we suppress the alignment UBSAN error as a whole. Suppressing only 3-way CRC and murmurhash feels like a better idea than turning off the alignment check as a whole. Closes https://github.com/facebook/rocksdb/pull/3495 Differential Revision: D6971273 Pulled By: siying fbshipit-source-id: 080b59fed6df494b9f622ef7cb5d42d39e6a8cdf | 13 February 2018, 20:18:07 UTC |
3c380fd | Fosco Marotto | 13 February 2018, 19:45:24 UTC | Adding blog post for 5.10.2 release Summary: Closes https://github.com/facebook/rocksdb/pull/3464 Differential Revision: D6906184 Pulled By: gfosco fbshipit-source-id: 415934d7b1dd8dd226b6619bfb71781184d55cd9 | 13 February 2018, 19:56:59 UTC |
b555ed3 | Siying Dong | 13 February 2018, 00:57:56 UTC | Customized BlockBasedTableIterator and LevelIterator Summary: Use a customized BlockBasedTableIterator and LevelIterator to replace the current implementations leveraging two-level-iterator. Hopefully the customized logic will make the code easier to understand. As a side effect, BlockBasedTableIterator reduces the allocation for the data block iterator object, and avoids the virtual function call to it, because we can directly reference BlockIter, a final class. Similarly, LevelIterator reduces virtual function calls to the dummy iterator iterating the file metadata. It also enables further optimization. The upper bound check is also moved from index block to data block. This implementation fits this iterator better. After the change, the forward iterator is slightly optimized to ensure we trim those iterators. The two-level-iterator now is only used by partitioned index, so it is simplified. Closes https://github.com/facebook/rocksdb/pull/3406 Differential Revision: D6809041 Pulled By: siying fbshipit-source-id: 7da3b9b1d3c8e9d9405302c15920af1fcaf50ffa | 13 February 2018, 01:12:25 UTC |
8a04ee4 | Maysam Yabandeh | 13 February 2018, 00:27:39 UTC | WritePrepared Txn: use TransactionDBWriteOptimizations (2nd attempt) Summary: TransactionDB::Write can receive some optimization hints from the user. One is to skip the concurrency control mechanism. WritePreparedTxnDB is currently ignoring such hints. This patch optimizes WritePreparedTxnDB::Write for skip_concurrency_control and skip_duplicate_key_check hints. Closes https://github.com/facebook/rocksdb/pull/3496 Differential Revision: D6971784 Pulled By: maysamyabandeh fbshipit-source-id: cbab10ad538fa2b8bcb47e37c77724afe6e30f03 | 13 February 2018, 00:43:40 UTC |
ee1c802 | Andrew Kryczka | 12 February 2018, 23:34:39 UTC | Add delay before flush in CompactRange to avoid write stalling Summary: - Refactored logic for checking write stall condition to a helper function: `GetWriteStallConditionAndCause`. Now it is decoupled from the logic for updating WriteController / stats in `RecalculateWriteStallConditions`, so we can reuse it for predicting whether write stall will occur. - Updated `CompactRange` to first check whether the one additional immutable memtable / L0 file would cause stalling before it flushes. If so, it waits until that is no longer true. - Updated `bg_cv_` to be signaled on `SetOptions` calls. The stall conditions `CompactRange` cares about can change when (1) flush finishes, (2) compaction finishes, or (3) options dynamically change. The cv was already signaled for (1) and (2) but not yet for (3). Closes https://github.com/facebook/rocksdb/pull/3381 Differential Revision: D6754983 Pulled By: ajkr fbshipit-source-id: 5613e03f1524df7192dc6ae885d40fd8f091d972 | 12 February 2018, 23:42:47 UTC |
0a0fad4 | Andrew Kryczka | 12 February 2018, 22:54:50 UTC | db_bench separate options for partition index and filters Summary: Some workloads (like my current benchmarking) may want partitioned indexes without partitioned filters. Particularly, when `-optimize_filters_for_hits=true`, the total index size may be larger than the total filter size, so it can make sense to hold all filters in-memory but not all indexes. Closes https://github.com/facebook/rocksdb/pull/3492 Differential Revision: D6970092 Pulled By: ajkr fbshipit-source-id: b7fa1828e1d13829339aefb90fd56eb7c5337f61 | 12 February 2018, 22:57:13 UTC |