swh:1:snp:5115096b921df712aeb2a08114fede57fb3331fb

Revision Message Commit Date
c60df9d update history and version 01 May 2018, 01:19:02 UTC
747c853 Avoid directory renames in BackupEngine Summary: We used to name private directories like "1.tmp" while BackupEngine populated them, and then rename without the ".tmp" suffix (i.e., rename "1.tmp" to "1") after all files were copied. On glusterfs, directory renames like this require operations across many hosts, and partial failures have caused operational problems. Fortunately we don't need to rename private directories. We already have a meta-file that uses the tempfile-rename pattern to commit a backup atomically after all its files have been successfully copied. So we can copy private files directly to their final location, so now there's no directory rename. Closes https://github.com/facebook/rocksdb/pull/3749 Differential Revision: D7705610 Pulled By: ajkr fbshipit-source-id: fd724a28dd2bf993ce323a5f2cb7e7d6980cc346 01 May 2018, 00:50:24 UTC
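The commit pattern described above (copy private files straight to their final location, then publish the backup through a tempfile-rename of the meta-file) can be sketched roughly as follows. This is an illustrative Python sketch, not BackupEngine code: the function name `commit_backup`, the `private/` and `meta/` directory names, and the meta-file contents are assumptions made for the example.

```python
import os


def commit_backup(backup_root, backup_id, files):
    """Write files directly into their final directory, then commit via one file rename.

    `files` maps file names to bytes. Layout and names are hypothetical.
    """
    private_dir = os.path.join(backup_root, "private", str(backup_id))
    meta_dir = os.path.join(backup_root, "meta")
    os.makedirs(private_dir, exist_ok=True)
    os.makedirs(meta_dir, exist_ok=True)

    # Step 1: data files go straight to the final location (no "<id>.tmp" directory,
    # so no directory rename is needed afterwards).
    for name, data in files.items():
        with open(os.path.join(private_dir, name), "wb") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())

    # Step 2: write the meta-file under a temporary name, then rename it into place.
    meta_path = os.path.join(meta_dir, str(backup_id))
    tmp_path = meta_path + ".tmp"
    with open(tmp_path, "w") as f:
        f.write("\n".join(sorted(files)) + "\n")
        f.flush()
        os.fsync(f.fileno())
    os.replace(tmp_path, meta_path)  # the single-file rename is the commit point
    return meta_path
```

A reader of such a backup would treat it as valid only if the meta-file exists, so a crash before the final rename leaves stray private files but no visible, half-written backup.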
f5ee207 Support lowering CPU priority of background threads Summary: Background activities like compaction can negatively affect latency of higher-priority tasks like request processing. To avoid this, rocksdb already lowers the IO priority of background threads on Linux systems. While this takes care of typical IO-bound systems, it does not help much when CPU (temporarily) becomes the bottleneck. This is especially likely when using more expensive compression settings. This patch adds an API to allow for lowering the CPU priority of background threads, modeled on the IO priority API. Benchmarks (see below) show significant latency and throughput improvements when CPU bound. As a result, workloads with some CPU usage bursts should benefit from lower latencies at a given utilization, or should be able to push utilization higher at a given request latency target. A useful side effect is that compaction CPU usage is now easily visible in common tools, allowing for an easier estimation of the contribution of compaction vs. request processing threads. As with IO priority, the implementation is limited to Linux, degrading to a no-op on other systems. Closes https://github.com/facebook/rocksdb/pull/3763 Differential Revision: D7740096 Pulled By: gwicke fbshipit-source-id: e5d32373e8dc403a7b0c2227023f9ce4f22b413c 25 April 2018, 01:18:32 UTC
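As a rough illustration of the idea (not the RocksDB API added by this commit), a background thread on Linux can lower its own CPU priority by raising its nice value, since Linux treats the nice value as a per-thread attribute. The helper names below are hypothetical, and the sketch degrades to a no-op on other systems, mirroring the behaviour the commit describes.

```python
import os
import sys
import threading


def lower_current_thread_cpu_priority(nice_value=19):
    """Best effort: return the thread's new nice value, or None if unsupported."""
    if not sys.platform.startswith("linux") or not hasattr(threading, "get_native_id"):
        return None  # no-op on non-Linux systems
    tid = threading.get_native_id()
    try:
        # On Linux, PRIO_PROCESS with a thread id affects only that thread.
        os.setpriority(os.PRIO_PROCESS, tid, nice_value)
        return os.getpriority(os.PRIO_PROCESS, tid)
    except OSError:
        return None  # e.g. blocked by a sandbox; treat as unsupported


def run_background_job(job, results):
    """Hypothetical background worker: lower priority first, then do the work."""
    results.append(lower_current_thread_cpu_priority())
    job()
```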
f2fd21f fix memory leak in two_level_iterator Summary: this PR fixes a few failed contbuild: 1. ASAN memory leak in Block::NewIterator (table/block.cc:429). the proper destruction of first_level_iter_ and second_level_iter_ of two_level_iterator.cc is missing from the code after the refactoring in https://github.com/facebook/rocksdb/pull/3406 2. various unused param errors introduced by https://github.com/facebook/rocksdb/pull/3662 3. updated comment for `ForceReleaseCachedEntry` to emphasize the use of `force_erase` flag. Closes https://github.com/facebook/rocksdb/pull/3718 Reviewed By: maysamyabandeh Differential Revision: D7621192 Pulled By: miasantreble fbshipit-source-id: 476c94264083a0730ded957c29de7807e4f5b146 16 April 2018, 21:23:11 UTC
7585278 Fix the memory leak with pinned partitioned filters Summary: The existing unit test did not set the level, so the check that pinned partitioned filters/indexes are properly released from the block cache was not exercised, as pinning only takes effect in level 0. As a result a memory leak in pinned partitioned filters was hidden. The patch fixes the test as well as the bug. Closes https://github.com/facebook/rocksdb/pull/3692 Differential Revision: D7559763 Pulled By: maysamyabandeh fbshipit-source-id: 55eff274945838af983c764a7d71e8daff092e4a 16 April 2018, 21:12:49 UTC
9d2e34e Fix History 23 March 2018, 21:58:51 UTC
1f5103d Add Java-API-Changes section to History Summary: We have not been updating our HISTORY.md change log with the RocksJava changes. Going forward, let's add Java changes to HISTORY.md as well. There is an old java/HISTORY-JAVA.md, but it hasn't been updated in years. It is much easier to remember to update the change log in a single file, HISTORY.md. I added information about shared block cache here, which was introduced in #3623. Closes https://github.com/facebook/rocksdb/pull/3647 Differential Revision: D7384448 Pulled By: sagar0 fbshipit-source-id: 9b6e569f44e6df5cb7ba06413d9975df0b517d20 23 March 2018, 21:56:44 UTC
163dd4b Shared block cache in RocksJava Summary: Changes to support sharing block cache using the Java API. Previously DB instances could share the block cache only when the same Options instance is passed to all the DB instances. But now, with this change, it is possible to explicitly create a cache and pass it to multiple options instances, to share the block cache. Implementing this for [Rocksandra](https://github.com/instagram/cassandra/tree/rocks_3.0), but this feature has been requested by many java api users over the years. Closes https://github.com/facebook/rocksdb/pull/3623 Differential Revision: D7305794 Pulled By: sagar0 fbshipit-source-id: 03e4e8ed7aeee6f88bada4a8365d4279ede2ad71 23 March 2018, 21:56:05 UTC
1642582 Fsync after writing global seq number in ExternalSstFileIngestionJob Summary: Fsync after writing global sequence number to the ingestion file in ExternalSstFileIngestionJob. Otherwise the file metadata could be incorrect. Closes https://github.com/facebook/rocksdb/pull/3644 Differential Revision: D7373813 Pulled By: sagar0 fbshipit-source-id: 4da2c9e71a8beb5c08b4ac955f288ee1576358b8 23 March 2018, 21:52:21 UTC
8d28083 Update history for future 5.13 release 20 March 2018, 23:17:53 UTC
d1b2650 fix db_compaction_test when compression disabled Summary: Previously, the compaction in `DBCompactionTestWithParam.ForceBottommostLevelCompaction` generated multiple files in the no-compression use case, and one file in the compression use case. I increased `target_file_size_base` so it generates one file in both use cases. Closes https://github.com/facebook/rocksdb/pull/3625 Differential Revision: D7311885 Pulled By: ajkr fbshipit-source-id: 97f249fa83a9924ac34357a4bb3189c969ecb107 19 March 2018, 19:30:05 UTC
ccb7613 Enable compilation on OpenBSD Summary: I modified the Makefile so that we can compile rocksdb on OpenBSD. The instructions for building have been added to INSTALL.md. The whole compilation process works fine like this on OpenBSD-current Closes https://github.com/facebook/rocksdb/pull/3617 Differential Revision: D7323754 Pulled By: siying fbshipit-source-id: 990037d1cc69138d22f85bd77ef4dc8c1ba9edea 19 March 2018, 19:30:05 UTC
1139422 Fix the command used to generate ctags Summary: In original $ROCKSDB_HOME/Makefile, the command used to generate ctags is ``` ctags * -R ``` However, this failed to generate tags for me. I did some search on the usage of ctags command and found that it should be ``` ctags -R . ``` or ``` ctags -R * ``` After the change, I can find the tags in vim using `:ts <identifier>`. Closes https://github.com/facebook/rocksdb/pull/3626 Reviewed By: ajkr Differential Revision: D7320217 Pulled By: riversand963 fbshipit-source-id: e4cd8f8a67842370a2343f0213df3cbd07754111 19 March 2018, 05:43:18 UTC
bef95be Improve the output of the RocksJava JUnit runner Summary: This changes the console output when the RocksJava tests are run. It makes spotting the errors and failures much easier; previously the output was malformed with results like "ERun" where the "E" represented an error in the preceding test. Closes https://github.com/facebook/rocksdb/pull/3621 Differential Revision: D7306172 Pulled By: sagar0 fbshipit-source-id: 3fa6f6e1ca6c6ea7ceef55a23ca81903716132b7 16 March 2018, 20:27:55 UTC
cc34026 fix wrong length in snprintf Summary: Closes https://github.com/facebook/rocksdb/pull/3622 Differential Revision: D7307689 Pulled By: ajkr fbshipit-source-id: b8f52effc63fea06c2058b39c60944c2c1f814b4 16 March 2018, 20:27:55 UTC
ecfca1f Optimize overlap checking for external file ingestion Summary: If there are a lot of overlapped files in L0, creating a merging iterator for all files in L0 to check overlap can be very slow because we need to read and seek all files in L0. However, in that case, the ingested file is likely to overlap with some files in L0, so if we check those files one by one, we can stop once we encounter overlap. Ref: https://github.com/facebook/rocksdb/issues/3540 Closes https://github.com/facebook/rocksdb/pull/3564 Differential Revision: D7196784 Pulled By: anand1976 fbshipit-source-id: 8700c1e903bd515d0fa7005b6ce9b3a3d9db2d67 16 March 2018, 17:43:17 UTC
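The early-exit idea in this commit can be sketched as below. This is a simplified illustration that compares only each file's smallest and largest key, not the actual RocksDB implementation; the function names and the `(smallest, largest)` tuple representation are assumptions for the example.

```python
def ranges_overlap(a_smallest, a_largest, b_smallest, b_largest):
    """Two closed key ranges overlap if neither ends before the other begins."""
    return a_smallest <= b_largest and b_smallest <= a_largest


def ingested_file_overlaps_l0(ingested, l0_files):
    """Check L0 files one by one and stop at the first overlap.

    Returns (overlaps, files_checked) so the early exit is visible to the caller.
    """
    checked = 0
    for smallest, largest in l0_files:
        checked += 1
        if ranges_overlap(ingested[0], ingested[1], smallest, largest):
            return True, checked
    return False, checked
```

With many overlapping L0 files the loop typically returns after inspecting only the first few, instead of building an iterator over all of them.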
da82aab allowing CompactFiles to return new file names Summary: This is a small API extension to allow the CompactFiles method to return the names of files that were created during the compaction. Closes https://github.com/facebook/rocksdb/pull/3608 Differential Revision: D7275789 Pulled By: siying fbshipit-source-id: 1ec0c3954a0f10cd877efb5f29f9be6c7b59e9ba 15 March 2018, 18:58:12 UTC
cc118b0 Update version Summary: We missed updating version.h on master when cutting 5.11.fb and 5.12.fb branches. It should be the same as the version in the latest release branch (or should it be one more?). I noticed this when trying to run some upgrade/downgrade tests from 5.11 to some new code on master. Closes https://github.com/facebook/rocksdb/pull/3611 Differential Revision: D7282917 Pulled By: sagar0 fbshipit-source-id: 205ee75b77c5b6bbcea95a272760b427025a4aba 15 March 2018, 17:41:48 UTC
0cdaa1a Fix WAL corruption from checkpoint/backup race condition Summary: `Writer::WriteBuffer` was always called at the beginning of checkpoint/backup. But that log writer has no internal synchronization, which meant the same buffer could be flushed twice in a race condition case, causing a WAL entry to be duplicated. Then subsequent WAL entries would be at unexpected offsets, causing the 32KB block boundaries to be overlapped and manifesting as a corruption. This PR fixes the behavior to only use `WriteBuffer` (via `FlushWAL`) in checkpoint/backup when manual WAL flush is enabled. In that case, users are responsible for providing synchronization between WAL flushes. We can also consider removing the call entirely. Closes https://github.com/facebook/rocksdb/pull/3603 Differential Revision: D7277447 Pulled By: ajkr fbshipit-source-id: 1b15bd7fd930511222b075418c10de0aaa70a35a 14 March 2018, 23:12:50 UTC
449627f Blob DB: remove unreachable code Summary: Fixing #3604. Closes https://github.com/facebook/rocksdb/pull/3606 Reviewed By: siying Differential Revision: D7276604 Pulled By: yiwu-arbug fbshipit-source-id: 915c5897b010d28956f369989e49e64785d1161f 14 March 2018, 21:27:28 UTC
6f7b7f9 Optionally create DuplicateDetector Summary: Address issue https://github.com/facebook/rocksdb/issues/3579 Closes https://github.com/facebook/rocksdb/pull/3589 Differential Revision: D7221161 Pulled By: yiwu-arbug fbshipit-source-id: bd875ab0aa0e414dfa98b1bf036ba9b4ed351361 14 March 2018, 07:57:25 UTC
e003d22 Fix FaultInjectionTestEnv to work with DirectIO Summary: Implemented PositionedAppend() and use_direct_io() for TestWritableFile. With these changes, FaultInjectionTestEnv can be used with DirectIO enabled. Closes https://github.com/facebook/rocksdb/pull/3586 Differential Revision: D7244305 Pulled By: yiwu-arbug fbshipit-source-id: f6b7aece53daa0f9977bc684164a0693693e514c 14 March 2018, 07:57:24 UTC
09e5d7a add 4th test_group in travis Summary: to overcome the space limitation Closes https://github.com/facebook/rocksdb/pull/3605 Differential Revision: D7262607 Pulled By: miasantreble fbshipit-source-id: 1b1148026f17a7ee4b9f3a17ddc6b4ba9cf7af7f 14 March 2018, 01:57:29 UTC
2256dab fix flaky DBSSTTest.DeleteSchedulerMultipleDBPaths Summary: I landed #3544 which made this test flaky. The reason was the files scheduled for deletion sometimes went through the trash-marking process, and sometimes were deleted directly. Our counter only bumped on the former code path, so if the latter code path was used, we'd miss counting a file deleted by deletion scheduler. This PR also bumps the counter in the latter code path. Closes https://github.com/facebook/rocksdb/pull/3593 Differential Revision: D7226173 Pulled By: yiwu-arbug fbshipit-source-id: 81ab44c60834df6ff88db1d73ea34e26c6e93c39 13 March 2018, 21:57:26 UTC
7153153 Fix enable_pipelined_write output in OPTIONS file Summary: enable_pipelined_write was not set in BuildDBOptions() causing its default value to be dumped in the OPTIONS file Closes https://github.com/facebook/rocksdb/pull/3585 Differential Revision: D7226395 Pulled By: yiwu-arbug fbshipit-source-id: 45a659a48d18103ac9ee74bb8805dd0a6ec12474 13 March 2018, 18:59:02 UTC
f6156fb Support StringAppendOperator(delimiter_char) constructor in java-api Summary: Fixes #3336 Closes https://github.com/facebook/rocksdb/pull/3337 Differential Revision: D7196585 Pulled By: sagar0 fbshipit-source-id: a854f3fc906862ecba685b31946e4ef7c0b421c5 09 March 2018, 00:17:47 UTC
c5302a8 Java wrapper for Native Comparators Summary: This is an abstraction for working with custom Comparators implemented in native C++ code from Java. Native code must directly extend `rocksdb::Comparator`. When the native code comparator is compiled into the RocksDB codebase, you can then create a Java Class, and JNI stub to wrap it. Useful if the C++/JNI barrier overhead is too much for your applications comparator performance. An example is provided in `java/rocksjni/native_comparator_wrapper_test.cc` and `java/src/main/java/org/rocksdb/NativeComparatorWrapperTest.java`. Closes https://github.com/facebook/rocksdb/pull/3334 Differential Revision: D7172605 Pulled By: miasantreble fbshipit-source-id: e24b7eb267a3bcb6afa214e0379a1d5e8a2ceabe 08 March 2018, 19:27:42 UTC
e476d0e Adding stat to count cancelled compactions Summary: Added a stat that counts the number of cancelled compactions. Closes https://github.com/facebook/rocksdb/pull/3574 Differential Revision: D7190259 Pulled By: amytai fbshipit-source-id: d5ce82dc9398da6d6d34023ad4ed8cec909852a3 08 March 2018, 18:42:28 UTC
a3a3f54 Fix some typos in comments and docs. Summary: Closes https://github.com/facebook/rocksdb/pull/3568 Differential Revision: D7170953 Pulled By: siying fbshipit-source-id: 9cfb8dd88b7266da920c0e0c1e10fb2c5af0641c 08 March 2018, 18:27:25 UTC
a277b0f Clarification regarding record format Summary: The CRC is actually calculated based on the record type and payload. The wiki should also be updated accordingly and extended with a section on the recyclable record format. Closes https://github.com/facebook/rocksdb/pull/3576 Differential Revision: D7196478 Pulled By: siying fbshipit-source-id: 39f7a0395075cc73e2aa2bfc9e42c85bce35e765 08 March 2018, 18:27:25 UTC
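To make the clarification concrete, the sketch below computes a record checksum over the type byte followed by the payload. It assumes the legacy LevelDB-style layout (4-byte checksum, 2-byte length, 1-byte type, little-endian) and the masked CRC32C used by that format; these details come from general knowledge of the format rather than from this commit, and the recyclable record format is not covered. The function names are hypothetical.

```python
import struct


def _make_crc32c_table():
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0x82F63B78 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_crc32c_table()


def crc32c(data):
    """Plain table-driven CRC-32C (Castagnoli), bit-reflected."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def mask(crc):
    """Masking step assumed from the LevelDB-style log format."""
    return (((crc >> 15) | (crc << 17)) + 0xA282EAD8) & 0xFFFFFFFF


def encode_record(record_type, payload):
    # The checksum covers the type byte AND the payload, not the payload alone.
    checksum = mask(crc32c(bytes([record_type]) + payload))
    return struct.pack("<IHB", checksum, len(payload), record_type) + payload


def verify_record(record):
    stored, length, record_type = struct.unpack("<IHB", record[:7])
    payload = record[7:7 + length]
    return stored == mask(crc32c(bytes([record_type]) + payload))
```

Because the type byte is part of the checksummed data, flipping a record's type without touching its payload is detected as corruption.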
b560fc9 Fix a block pinning regression introduced in b555ed30a4a93b80a3ac4781c6721ab988e03b5b Summary: b555ed30a4a93b80a3ac4781c6721ab988e03b5b introduces a regression, which causes blocks always to be pinned in block based iterators. Fix it. Closes https://github.com/facebook/rocksdb/pull/3582 Differential Revision: D7189534 Pulled By: siying fbshipit-source-id: 117dc7a03d0a0e360424db02efb366e12da2be03 08 March 2018, 18:12:23 UTC
e69f6e8 Fix API name in a comment in db.h Summary: ... so that people are not confused. Closes https://github.com/facebook/rocksdb/pull/3580 Differential Revision: D7187175 Pulled By: sagar0 fbshipit-source-id: bce70093d52e38cd24c9432fd708885d7c2c013e 07 March 2018, 23:27:17 UTC
0de710f Use nullptr instead of NULL / 0 more consistently. Summary: Closes https://github.com/facebook/rocksdb/pull/3569 Differential Revision: D7170968 Pulled By: yiwu-arbug fbshipit-source-id: 308a6b7dd358a04fd9a7de3d927bfd8abd57d348 07 March 2018, 20:42:12 UTC
f021f1d Add rocksdb_open_with_ttl function in C API Summary: Change-Id: Ie6f9b10bce459f6bf0ade0e5877264b4e10da3f5 Signed-off-by: Stuart <Stuart.Hu@emc.com> Closes https://github.com/facebook/rocksdb/pull/3553 Differential Revision: D7144833 Pulled By: sagar0 fbshipit-source-id: 815225fa6e560d8a5bc47ffd0a98118b107ce264 07 March 2018, 04:57:20 UTC
0a3db28 Disallow compactions if there isn't enough free space Summary: This diff handles cases where compaction causes an ENOSPC error. This does not handle corner cases where another background job is started while compaction is running, and the other background job triggers ENOSPC, although we do allow the user to provision for these background jobs with SstFileManager::SetCompactionBufferSize. It also does not handle the case where compaction has finished and some other background job independently triggers ENOSPC. Usage: Functionality is inside SstFileManager. In particular, users should set SstFileManager::SetMaxAllowedSpaceUsage, which is the reference highwatermark for determining whether to cancel compactions. Closes https://github.com/facebook/rocksdb/pull/3449 Differential Revision: D7016941 Pulled By: amytai fbshipit-source-id: 8965ab8dd8b00972e771637a41b4e6c645450445 07 March 2018, 00:27:54 UTC
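The kind of check this commit describes can be illustrated with a small decision function. This is a hypothetical sketch, not the logic inside SstFileManager: the function name, the use of the compaction's input size as an estimate of its output size, and the treatment of a zero limit as "no limit" are all assumptions. The two parameters named after `SetMaxAllowedSpaceUsage` and `SetCompactionBufferSize` stand in for the values a user would configure through those calls.

```python
def should_cancel_compaction(total_files_size, compaction_input_size,
                             max_allowed_space_usage, compaction_buffer_size=0):
    """Return True if running the compaction could exceed the space high-water mark.

    All sizes are in bytes. The estimate is deliberately conservative.
    """
    if max_allowed_space_usage == 0:
        return False  # assumed convention for this sketch: 0 means unlimited
    projected = total_files_size + compaction_input_size + compaction_buffer_size
    return projected > max_allowed_space_usage
```

The buffer term models space reserved for other background jobs (such as flushes) that may run while the compaction is in progress.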
20c508c Enable subcompactions in manual level-based compaction Summary: This is the simplest way I could think of to speed up `CompactRange`. It works but isn't that optimal because it relies on the same `max_compaction_bytes` and `max_subcompactions` options that are used in other places. If it turns out to be useful we can allow overriding these in `CompactRangeOptions` in the future. Closes https://github.com/facebook/rocksdb/pull/3549 Differential Revision: D7117634 Pulled By: ajkr fbshipit-source-id: d0cd03d6bd0d2fd7ea3fb13cd3b8bf7c47d11e42 06 March 2018, 20:43:51 UTC
3462c94 Add dual-license info to README.md Summary: From #3417 and after talking to both GitHub and our open source legal team, the recommended approach was to explicitly state the dual-license in the readme. Changing the license files to accommodate the auto-detection is too much of a pain, would involve editing every code file header. Closes https://github.com/facebook/rocksdb/pull/3541 Differential Revision: D7171111 Pulled By: gfosco fbshipit-source-id: 0ee7b134446015228249efe991fa5e76526ca0b0 06 March 2018, 20:43:51 UTC
6a3eebb support multiple db_paths in SstFileManager Summary: Now that files scheduled for deletion are kept in the same directory, we don't need to constrain deletion scheduler to `db_paths[0]`. Previously this was done because there was a separate trash directory, and this constraint prevented files from being accidentally copied to another filesystem when they're scheduled for deletion. Closes https://github.com/facebook/rocksdb/pull/3544 Differential Revision: D7093786 Pulled By: ajkr fbshipit-source-id: 202f5c92d925eafebec1281fb95bb5828d33414f 06 March 2018, 20:43:51 UTC
d518fe1 uint64_t and size_t changes to compile for iOS Summary: In attempting to build a static lib for use in iOS, I ran in to lots of type errors between uint64_t and size_t. This PR contains the changes I made to get `TARGET_OS=IOS make static_lib` to succeed while also getting Xcode to build successfully with the resulting `librocksdb.a` library imported. This also compiles for me on macOS and tests fine, but I'm really not sure if I made the correct decisions about where to `static_cast` and where to change types. Also up for discussion: is iOS worth supporting? Getting the static lib is just part one, we aren't providing any bridging headers or wrappers like the ObjectiveRocks project, it won't be a great experience. Closes https://github.com/facebook/rocksdb/pull/3503 Differential Revision: D7106457 Pulled By: gfosco fbshipit-source-id: 82ac2073de7e1f09b91f6b4faea91d18bd311f8e 06 March 2018, 20:43:51 UTC
8bc41f4 Update TARGETS Summary: Watch the build Closes https://github.com/facebook/rocksdb/pull/3533 Differential Revision: D7063777 Pulled By: siying fbshipit-source-id: db9cdfc362a8d281dada6513ab034a6d6f0d552e 06 March 2018, 20:27:28 UTC
c364eb4 Windows cumulative patch Summary: This patch addressed several issues. Portability including db_test std::thread -> port::Thread Cc: @ and %z to ROCKSDB portable macro. Cc: maysamyabandeh Implement Env::AreFilesSame Make the implementation of file unique number more robust Get rid of C-runtime and go directly to Windows API when dealing with file primitives. Implement GetSectorSize() and align unbuffered read on the value if available. Adjust Windows Logger for the new interface, implement CloseImpl() Cc: anand1976 Fix test running script issue where $status var was of incorrect scope so the failures were swallowed and not reported. DestroyDB() creates a logger and opens a LOG file in the directory being cleaned up. This holds a lock on the folder and the cleanup is prevented. This fails one of the checkpoint tests. We observe the same in production. We close the log file in this change. Fix DBTest2.ReadAmpBitmapLiveInCacheAfterDBClose failure where the test attempts to open a directory with NewRandomAccessFile which does not work on Windows. Fix DBTest.SoftLimit as it is dependent on thread timing. CC: yiwu-arbug Closes https://github.com/facebook/rocksdb/pull/3552 Differential Revision: D7156304 Pulled By: siying fbshipit-source-id: 43db0a757f1dfceffeb2b7988043156639173f5b 06 March 2018, 19:57:43 UTC
b864bc9 Blob DB: Improve FIFO eviction Summary: Improving blob db FIFO eviction with the following changes, * Change blob_dir_size to max_db_size. Take into account SST file size when computing DB size. * FIFO now only takes into account live sst files and live blob files. It is normal for disk usage to go over max_db_size because there are obsolete sst files and blob files pending deletion. * FIFO eviction now also evicts TTL blob files that are still open. It doesn't evict non-TTL blob files. * If FIFO is triggered, it will pass an expiration and the current sequence number to compaction filter. Compaction filter will then filter inlined keys to evict those with an earlier expiration and smaller sequence number. So call LSM FIFO. * Compaction filter also filters those blob indexes where the corresponding blob file is gone. * Add an event listener to listen for compaction/flush events and update sst file size. * Implement DB::Close() to make sure base db, as well as event listener and compaction filter, destruct before blob db. * More blob db statistics around FIFO. * Fix some locking issues when accessing a blob file. Closes https://github.com/facebook/rocksdb/pull/3556 Differential Revision: D7139328 Pulled By: yiwu-arbug fbshipit-source-id: ea5edb07b33dfceacb2682f4789bea61de28bbfa 06 March 2018, 19:57:42 UTC
0a2354c Added bytes XOR merge operator Summary: Closes https://github.com/facebook/rocksdb/pull/575 I fixed the merge conflicts etc. Closes https://github.com/facebook/rocksdb/pull/3065 Differential Revision: D7128233 Pulled By: sagar0 fbshipit-source-id: 2c23a48c9f0432c290b0cd16a12fb691bb37820c 06 March 2018, 18:27:36 UTC
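As a minimal sketch of what a bytes XOR merge does, the function below combines an existing value with a merge operand byte by byte. It is not the RocksDB operator itself: the function name is hypothetical, and the handling of operands of different lengths (the extra bytes of the longer value are kept unchanged, which is equivalent to zero-padding the shorter one) is an assumption made for the example.

```python
def bytes_xor_merge(existing, operand):
    """XOR `operand` into `existing`; a missing existing value yields the operand."""
    if existing is None:
        return operand
    if len(existing) >= len(operand):
        longer, shorter = existing, operand
    else:
        longer, shorter = operand, existing
    out = bytearray(longer)
    for i, b in enumerate(shorter):
        out[i] ^= b  # bytes beyond the shorter value are left as they are
    return bytes(out)
```

XOR is associative and commutative, which is what lets operands be combined in any grouping before being applied to the base value.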
62277e1 WritePrepared Txn: Move DuplicateDetector to util Summary: Move DuplicateDetector and SetComparator to its own header file in util. It would also address a complaint in the unity test. Closes https://github.com/facebook/rocksdb/pull/3567 Differential Revision: D7163268 Pulled By: maysamyabandeh fbshipit-source-id: 6ddf82773473646dbbc1284ae601a78c4907c778 06 March 2018, 07:57:12 UTC
9cb4856 Don't need to UpdateFilesByCompactionPri for kCompactionStyleNone Summary: Closes https://github.com/facebook/rocksdb/pull/3563 Differential Revision: D7154653 Pulled By: ajkr fbshipit-source-id: 4f32fb1b02451a934504c40be22b07fb1f2deb9c 06 March 2018, 01:57:39 UTC
5d68243 Comment out unused variables Summary: Submitting on behalf of another employee. Closes https://github.com/facebook/rocksdb/pull/3557 Differential Revision: D7146025 Pulled By: ajkr fbshipit-source-id: 495ca5db5beec3789e671e26f78170957704e77e 05 March 2018, 21:13:41 UTC
1ccdc2c Fix vagrant build process Summary: https://blog.github.com/2018-02-23-weak-cryptographic-standards-removed/ GitHub dropped support for some weak cryptographic protocols on their website a couple of weeks ago, which causes our vagrant build process to fail at the curl download step. This diff forces curl to use the TLS v1.2 protocol if it is supported, so that it does not rely on the default protocol on different systems. Closes https://github.com/facebook/rocksdb/pull/3561 Differential Revision: D7148575 Pulled By: wpc fbshipit-source-id: b8cecfdfeb2bc8236de2d0d14f044532befec98c 05 March 2018, 19:57:41 UTC
92b1a68 fix FreeBSD build Summary: Currently FreeBSD build is broken in master and possibly some previous releases due to unrecognized symbol `O_DIRECT`. This PR will fix the build on FreeBSD Closes https://github.com/facebook/rocksdb/pull/3560 Differential Revision: D7148646 Pulled By: miasantreble fbshipit-source-id: 95b6c3d310fa531267c086b2cd40a5ab1c042b5a 05 March 2018, 19:12:28 UTC
680864a WritePrepared Txn: Fix bug with duplicate keys during recovery Summary: Fix the following bugs: - During recovery a duplicate key was inserted twice into the write batch of the recovery transaction, once when the memtable returns false (because it was a duplicate) and once for the 2nd attempt. This would result in a different SubBatch count measured when the recovered transaction is committing. - If a cf is flushed during recovery the memtable is not available to assist in detecting the duplicate key. This could result in not advancing the sequence number when iterating over duplicate keys of a flushed cf and hence inserting the next key with the wrong sequence number. - SubBatchCounter would reset the comparator to the default comparator after the first duplicate key. The 2nd duplicate key hence would have gone through a wrong comparator and not been detected. Closes https://github.com/facebook/rocksdb/pull/3562 Differential Revision: D7149440 Pulled By: maysamyabandeh fbshipit-source-id: 91ec317b165f363f5d11ff8b8c47c81cebb8ed77 05 March 2018, 18:57:59 UTC
15f55e5 Fix TSAN timeout in MergeOperatorPinningTest.Randomized/x test Summary: [FB - Internal] MergeOperatorPinningTest.Randomized/x tests are frequently failing with timeouts when run with tsan, as they are exceeding 10 minute limit for tests. The tests are in turn getting disabled due to frequent failures. I halved the number of rounds to make the test complete sooner. This reduces the number of testing iterations a little, but it still is much better than totally letting the test be disabled. Closes https://github.com/facebook/rocksdb/pull/3523 Differential Revision: D7031498 Pulled By: sagar0 fbshipit-source-id: 9a694f2176b235259920a42bf24bca5346f7cff1 03 March 2018, 00:27:21 UTC
db2445a Brings the Java API for WriteBatch inline with the C++ API Summary: * Exposes status * Corrects some method naming * Adds missing functionality Closes https://github.com/facebook/rocksdb/pull/3550 Differential Revision: D7140790 Pulled By: sagar0 fbshipit-source-id: cbdab6c5a7ae4f3030fb46739e9060e381b26fa6 02 March 2018, 23:44:10 UTC
1209b6d Blob DB: remove existing garbage collection implementation Summary: Red diff to remove existing implementation of garbage collection. The current approach is reference counting kind of approach and require a lot of effort to get the size counter right on compaction and deletion. I'm going to go with a simple mark-sweep kind of approach and will send another PR for that. CompactionEventListener was added solely for blob db and it adds complexity and overhead to compaction iterator. Removing it as well. Closes https://github.com/facebook/rocksdb/pull/3551 Differential Revision: D7130190 Pulled By: yiwu-arbug fbshipit-source-id: c3a375ad2639a3f6ed179df6eda602372cc5b8df 02 March 2018, 20:57:23 UTC
2ac988c Add TransactionDB and OptimisticTransactionDB to the Java API Summary: Closes https://github.com/facebook/rocksdb/issues/697 Closes https://github.com/facebook/rocksdb/issues/1151 Closes https://github.com/facebook/rocksdb/pull/1298 Differential Revision: D7131402 Pulled By: sagar0 fbshipit-source-id: bcd34ce95ed88cc641786089ff4232df7b2f089f 02 March 2018, 18:34:13 UTC
d060421 Fix a leak in prepared_section_completed_ Summary: The zeroed entries were not removed from prepared_section_completed_ map. This patch adds a unit test to show the problem and fixes that by refactoring the code. The new code is more efficient since i) it uses two separate mutexes to avoid contention between commit and prepare threads, ii) it uses a sorted vector for maintaining unique log entries with prepare, which avoids a very large heap with many duplicate entries. Closes https://github.com/facebook/rocksdb/pull/3545 Differential Revision: D7106071 Pulled By: maysamyabandeh fbshipit-source-id: b3ae17cb6cd37ef10b6b35e0086c15c758768a48 02 March 2018, 04:41:56 UTC
bf937cf Add "rocksdb.live-sst-files-size" DB property Summary: Add "rocksdb.live-sst-files-size" DB property which only includes files of the latest version. The existing "rocksdb.total-sst-files-size" includes files from all versions and thus includes files that are obsolete but not yet deleted. I'm going to use this new property to cap blob db sst + blob files size. Closes https://github.com/facebook/rocksdb/pull/3548 Differential Revision: D7116939 Pulled By: yiwu-arbug fbshipit-source-id: c6a52e45ce0f24ef78708156e1a923c1dd6bc79a 02 March 2018, 02:01:10 UTC
ec5843d Comment typo Summary: Closes https://github.com/facebook/rocksdb/pull/3546 Differential Revision: D7111708 Pulled By: ajkr fbshipit-source-id: 522a4a00eb3e34c73afcb86c1f75cd2e90e7608d 28 February 2018, 17:56:45 UTC
3ae0047 skip CompactRange flush based on memtable contents Summary: CompactRange has a call to Flush because we guarantee that, at the time it's called, all existing keys in the range will be pushed through the user's compaction filter. However, previously the flush was done blindly, so it'd happen even if the memtable does not contain keys in the range specified by the user. This caused unnecessarily many L0 files to be created, leading to write stalls in some cases. This PR checks the memtable's contents, and decides to flush only if it overlaps with `CompactRange`'s range. - Move the memtable overlap check logic from `ExternalSstFileIngestionJob` to `ColumnFamilyData::RangesOverlapWithMemtables` - Reuse the above logic in `CompactRange` and skip flushing if no overlap Closes https://github.com/facebook/rocksdb/pull/3520 Differential Revision: D7018897 Pulled By: ajkr fbshipit-source-id: a3c6b1cfae56687b49dd89ccac7c948e53545934 28 February 2018, 01:12:44 UTC
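The flush decision described here can be sketched as a range-overlap test against the memtable's keys. This is an illustration only, not `ColumnFamilyData::RangesOverlapWithMemtables`: it models the memtable as a sorted list of keys, treats the range as inclusive at both ends, and uses `None` for an unbounded side, all of which are assumptions for the example.

```python
import bisect


def memtable_overlaps_range(sorted_keys, begin=None, end=None):
    """Return True if any memtable key falls inside [begin, end]."""
    # Find the first key that is >= begin (or the first key if begin is unbounded).
    idx = 0 if begin is None else bisect.bisect_left(sorted_keys, begin)
    if idx == len(sorted_keys):
        return False  # every key sorts before `begin`, or the memtable is empty
    return end is None or sorted_keys[idx] <= end


def compact_range_needs_flush(sorted_keys, begin=None, end=None):
    """Flush only when the memtable holds keys in the requested range."""
    return memtable_overlaps_range(sorted_keys, begin, end)
```

Skipping the flush when this returns False avoids creating an extra L0 file for a compaction that would not have touched any memtable data.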
c287c09 Update comments in DB::Close() Summary: Closes https://github.com/facebook/rocksdb/pull/3543 Differential Revision: D7093251 Pulled By: siying fbshipit-source-id: 4066b82c95ecb65866c5842d68ab13ab9f85d567 27 February 2018, 20:42:31 UTC
d633656 Adding CentOS 7 Vagrantfile & build script Summary: I have updated the Vagrantfile to have an entry for CentOS 7. Also created a simple build script which is pretty similar to the one in Beringei. How to test: ``` vagrant up centos7 ``` Todo: Implement -j X for the build. Closes https://github.com/facebook/rocksdb/pull/3530 Differential Revision: D7090739 Pulled By: ajkr fbshipit-source-id: 9f9eda5b507568993543d08de7ce168dfc12282e 26 February 2018, 23:27:17 UTC
ad05cbb DB:Open should fail on tmpfs when use_direct_reads=true Summary: Before: > $ TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32 DB path: [/dev/shm/dbbench] put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument db_bench: tpp.c:84: __pthread_tpp_change_priority: Assertion `new_prio == -1 || (new_prio >= fifo_min_prio && new_prio <= fifo_max_prio)' failed. put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument put error: IO error: While open a file for random read: /dev/shm/dbbench/000007.sst: Invalid argument After: > TEST_TMPDIR=/dev/shm ./db_bench -use_direct_reads=true -benchmarks=readrandomwriterandom -num=10000000 -reads=100000 -write_buffer_size=1048576 -target_file_size_base=1048576 -max_bytes_for_level_base=4194304 -max_background_jobs=12 -readwritepercent=50 -key_size=16 -value_size=48 -threads=32 Initializing RocksDB Options from the specified file Initializing RocksDB Options from command-line flags open error: Not implemented: Direct I/O is not supported by the specified DB. Closes https://github.com/facebook/rocksdb/pull/3539 Differential Revision: D7082658 Pulled By: miasantreble fbshipit-source-id: f9d9c6ec3b5e9e049cab52154940ee101ba4d342 26 February 2018, 22:58:06 UTC
7eb292d Fix a memory leak in WindowsThread Summary: _endthreadex does not return, so destructors for stack objects do not run. This creates a memory leak. We remove the calls since _endthreadex is called automatically after the threadproc returns, i.e., when the thread exits. Closes https://github.com/facebook/rocksdb/pull/3542 Differential Revision: D7088713 Pulled By: ajkr fbshipit-source-id: 749ecafc6a9572f587f76e516547e07734349a54 26 February 2018, 21:46:12 UTC
dfbe52e Fix the Logger::Close() and DBImpl::Close() design pattern Summary: The recent Logger::Close() and DBImpl::Close() implementation rely on calling the CloseImpl() virtual function from the destructor, which will not work. Refactor the implementation to have a private close helper function in derived classes that can be called by both CloseImpl() and the destructor. Closes https://github.com/facebook/rocksdb/pull/3528 Reviewed By: gfosco Differential Revision: D7049303 Pulled By: anand1976 fbshipit-source-id: 76a64cbf403209216dfe4864ecf96b5d7f3db9f4 23 February 2018, 21:57:26 UTC
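The pattern described in dfbe52e can be sketched as follows. This is a minimal illustration, not the RocksDB code: `BaseLogger`, `FileLogger`, `CloseHelper` and the `close_count` counter are hypothetical names chosen for the example. It shows why a base-class destructor cannot reach a derived `CloseImpl()`, and how a private non-virtual helper lets both `CloseImpl()` and the derived destructor share the cleanup.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for a logger base class.
class BaseLogger {
 public:
  virtual ~BaseLogger() {
    // Calling CloseImpl() here would run BaseLogger::CloseImpl(), because the
    // derived part of the object has already been destroyed at this point.
  }

  // Public entry point: closes at most once.
  std::string Close() {
    if (closed_) return "already closed";
    closed_ = true;
    return CloseImpl();
  }

 protected:
  virtual std::string CloseImpl() { return "base"; }
  bool closed_ = false;
};

// Hypothetical derived logger that owns something needing cleanup.
class FileLogger : public BaseLogger {
 public:
  explicit FileLogger(int* close_count) : close_count_(close_count) {
    assert(close_count_ != nullptr);
  }

  ~FileLogger() override {
    // The derived destructor calls the helper directly, no virtual dispatch.
    if (!closed_) CloseHelper();
  }

 protected:
  std::string CloseImpl() override { return CloseHelper(); }

 private:
  // Non-virtual helper shared by CloseImpl() and the destructor.
  std::string CloseHelper() {
    ++*close_count_;
    return "file";
  }

  int* close_count_;
};
```

With this shape, destroying a `FileLogger` that was never closed runs the cleanup once, and an explicit `Close()` followed by destruction does not run it twice.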
30649dc Have a different function when ROCKSDB_JEMALLOC=0 Summary: Some sanitizer is not happy with parameter name with ROCKSDB_JEMALLOC not set. Use another function instead. Closes https://github.com/facebook/rocksdb/pull/3536 Differential Revision: D7064849 Pulled By: siying fbshipit-source-id: c6ae94e044686176af1259df9172453d52c2f9d5 23 February 2018, 19:42:33 UTC
90eca1e WritePrepared Txn: optimize SubBatchCnt Summary: Make use of the index in WriteBatchWithIndex to also count the number of sub-batches. This eliminates the need to separately scan the batch to count the number of sub-batches once a duplicate key is detected. Closes https://github.com/facebook/rocksdb/pull/3529 Differential Revision: D7049947 Pulled By: maysamyabandeh fbshipit-source-id: 81cbf12c4e662541c772c7265a8f91631e25c7cd 23 February 2018, 02:12:26 UTC
243220d Update HISTORY.md to 5.12.0 Summary: Closes https://github.com/facebook/rocksdb/pull/3532 Differential Revision: D7062828 Pulled By: miasantreble fbshipit-source-id: d36967a1cfbcaeeeb33b9f0e09e15dea85b08b70 23 February 2018, 00:47:01 UTC
4624edc RocksDBOptionsParser::Parse()'s `ignore_unknown_options` argument only ignores options from higher version. Summary: RocksDB should always be able to parse an option file generated using the same or lower version. Unknown option should only happen if it is from a higher version. Change the behavior of RocksDBOptionsParser::Parse() with ignore_unknown_options=true so that an unknown option from a lower or the same version will never be skipped. Closes https://github.com/facebook/rocksdb/pull/3527 Differential Revision: D7048851 Pulled By: siying fbshipit-source-id: e261caea12f6515611a4a29f39acf2b619df2361 22 February 2018, 21:28:12 UTC
aba3409 Back out "[codemod] - comment out unused parameters" Reviewed By: igorsugak fbshipit-source-id: 4a93675cc1931089ddd574cacdb15d228b1e5f37 22 February 2018, 20:43:17 UTC
f4a030c - comment out unused parameters Reviewed By: everiq, igorsugak Differential Revision: D7046710 fbshipit-source-id: 8e10b1f1e2aecebbfb229c742e214db887e5a461 22 February 2018, 17:44:23 UTC
b092977 BackupEngine gluster-friendly file naming convention Summary: Use the rsync tempfile naming convention in our `BackupEngine`. The temp file follows the format, `.<filename>.<suffix>`, which is later renamed to `<filename>`. We fix `tmp` as the `<suffix>` as we don't need to use random bytes for now. The benefit is gluster treats this tempfile naming convention specially and applies hashing only to `<filename>`, so the file won't need to be linked or moved when it's renamed. Our gluster team suggested this will make things operationally easier. Closes https://github.com/facebook/rocksdb/pull/3463 Differential Revision: D6893333 Pulled By: ajkr fbshipit-source-id: fd7622978f4b2487fce33cde40dd3124f16bcaa8 22 February 2018, 01:42:07 UTC
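The naming convention in b092977 can be illustrated with a small helper. This is only a sketch: `TempFileName` is a hypothetical function, not part of `BackupEngine`; the fixed `tmp` suffix is taken from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: builds the rsync-style temp name `.<filename>.<suffix>`
// that is later renamed to `<filename>`.
std::string TempFileName(const std::string& filename,
                         const std::string& suffix = "tmp") {
  assert(!filename.empty());
  return "." + filename + "." + suffix;
}
```

For a file named `000007.sst` the helper yields `.000007.sst.tmp`, so the part between the leading dot and the suffix is exactly the final name.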
828211e WritePrepared Txn: fix non-emptied PreparedHeap bug Summary: Under a certain sequence of accessing PreparedHeap, there was a bug that would not successfully empty the heap. This would result in performance issues when the heap content is moved to old_prepared_ after max_evicted_seq_ advances the orphan prepared sequence numbers. The patch fixes the bug and adds more unit tests. It also does more logging when the unlikely scenarios are encountered. Closes https://github.com/facebook/rocksdb/pull/3526 Differential Revision: D7038486 Pulled By: maysamyabandeh fbshipit-source-id: f1e40bea558f67b03d2a29131fcb8734c65fce97 21 February 2018, 21:42:23 UTC
8ada876 Add rocksdb.iterator.internal-key property Summary: Added a new iterator property: `rocksdb.iterator.internal-key` to get the internal-key (converted to user key) at which the iterator stopped. Closes https://github.com/facebook/rocksdb/pull/3525 Differential Revision: D7033694 Pulled By: sagar0 fbshipit-source-id: d51e6c00f5e9d766c6276ef79774b81c6c5216f8 21 February 2018, 03:12:09 UTC
e9c31ab save redundant key lookup in map of locked keys Summary: In case it is found that a key is already marked as locked in a stripe's map of locked keys, it is not necessary to look it up again using `std::unordered_map<std::string, ...>::at(size_t)`. Instead, we can use the already found position using the iterator produced by the previous `find` operation. Reusing the iterator will avoid having to hash the key again and do additional "random" memory lookups in the map of keys (though the data will very likely sit available in caches here already due to the previous find operation) Closes https://github.com/facebook/rocksdb/pull/3505 Differential Revision: D7036446 Pulled By: sagar0 fbshipit-source-id: cced51547b2bd2d49394f6bc8c5896f09fa80f68 21 February 2018, 01:44:44 UTC
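A minimal sketch of the idea in e9c31ab, assuming a simplified map of locked keys. `LockInfo` and `LockKey` are hypothetical names for illustration, not the types used in the transaction lock manager.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical per-key lock record.
struct LockInfo {
  int holders;
};

// Returns true if `key` was already locked. When the key is found, the
// iterator from find() is reused instead of hashing the key a second time
// with keys.at(key).
bool LockKey(std::unordered_map<std::string, LockInfo>& keys,
             const std::string& key) {
  auto it = keys.find(key);
  if (it != keys.end()) {
    ++it->second.holders;  // no second lookup
    assert(it->second.holders > 0);
    return true;
  }
  keys.emplace(key, LockInfo{1});
  return false;
}
```

The first call for a key inserts it with one holder; later calls update the existing entry through the iterator.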
1960e73 fix handling of empty string as checkpoint directory Summary: - made `CreateCheckpoint` properly return `InvalidArgument` when called with an empty directory. Previously it triggered an assertion failure due to a bug in the logic. - made `ldb` set empty `checkpoint_dir` if that's what the user specifies, so that we can use it to properly test `CreateCheckpoint` in the future. Differential Revision: D6874562 fbshipit-source-id: dcc1bd41768261d9338987fa7711444289707ed7 21 February 2018, 00:44:00 UTC
5263da6 fix shift UBSAN error in col_buf_encoder.cc Summary: Add a static cast to perform the left shift as with an unsigned type. make ubsan_check Closes https://github.com/facebook/rocksdb/pull/3517 Reviewed By: sagar0 Differential Revision: D7016044 Pulled By: igorsugak fbshipit-source-id: baf72f6197edd8f7220d010b15a23d6de6a72c49 21 February 2018, 00:44:00 UTC
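The kind of fix described in 5263da6 can be sketched like this. `PlaceByte` is a hypothetical helper and not the code in col_buf_encoder.cc. The point is that a narrow operand is promoted to `int` before shifting, so a large shift is undefined unless the operand is first cast to a wide unsigned type.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: moves `byte` into byte position `index` (0..7) of a
// 64-bit word. Without the cast, `byte` would be promoted to int and
// `byte << 56` would be undefined behavior.
inline uint64_t PlaceByte(uint8_t byte, unsigned index) {
  assert(index < 8);
  return static_cast<uint64_t>(byte) << (8 * index);
}
```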
ab446dc Fix build with USE_RTTI=0 Summary: utilities/column_aware_encoding_util.cc:61:23: error: cannot use dynamic_cast with -fno-rtti table_reader_.reset(dynamic_cast<BlockBasedTable*>(table_reader.release())); ^ 1 error generated. It was added as a [local patch](https://svnweb.freebsd.org/ports/head/databases/rocksdb/files/patch-utilities-column_aware_encoding_util.cc) on FreeBSD since RocksDB 5.8. It also fixes #2707. Closes https://github.com/facebook/rocksdb/pull/3514 Differential Revision: D7005571 Pulled By: siying fbshipit-source-id: 351a9055d21d0accdd7a932e8e7bfcd3c8e22068 16 February 2018, 18:41:49 UTC
c178da0 WritePrepared Txn: optimizations for sysbench update_noindex Summary: These are optimizations that we applied to improve sysbench's update_noindex performance. 1. Make use of LIKELY compiler hint 2. Move std::atomic to the subclass 3. Make use of skip_prepared in non-2pc transactions. Closes https://github.com/facebook/rocksdb/pull/3512 Differential Revision: D7000075 Pulled By: maysamyabandeh fbshipit-source-id: 1ab8292584df1f6305a4992973fb1b7933632181 16 February 2018, 16:42:31 UTC
97307d8 Fix deadlock in ColumnFamilyData::InstallSuperVersion() Summary: Deadlock: a memtable flush holds DB::mutex_ and calls ThreadLocalPtr::Scrape(), which locks ThreadLocalPtr mutex; meanwhile, a thread exit handler locks ThreadLocalPtr mutex and calls SuperVersionUnrefHandle, which tries to lock DB::mutex_. This deadlock is hit all the time on our workload. It blocks our release. In general, the problem is that ThreadLocalPtr takes an arbitrary callback and calls it while holding a lock on a global mutex. The same global mutex is (at least in some cases) locked by almost all ThreadLocalPtr methods, on any instance of ThreadLocalPtr. So, there'll be a deadlock if the callback tries to do anything to any instance of ThreadLocalPtr, or waits for another thread to do so. So, probably the only safe way to use ThreadLocalPtr callbacks is to do only simple and lock-free things in them. This PR fixes the deadlock by making sure that local_sv_ never holds the last reference to a SuperVersion, and therefore SuperVersionUnrefHandle never has to do any nontrivial cleanup. I also searched for other uses of ThreadLocalPtr to see if they may have similar bugs. There's only one other use, in transaction_lock_mgr.cc, and it looks fine. Closes https://github.com/facebook/rocksdb/pull/3510 Reviewed By: sagar0 Differential Revision: D7005346 Pulled By: al13n321 fbshipit-source-id: 37575591b84f07a891d6659e87e784660fde815f 16 February 2018, 16:13:34 UTC
0454f78 fix advance reservation of arena block addresses Summary: Calling `std::vector::reserve()` causes memory to be reallocated and then data to be moved. It was called prior to adding every block. This reallocation could be done a huge amount of times, e.g., for users with large index blocks. Instead, we can simply use `std::vector::emplace_back()` in such a way that preserves the no-memory-leak guarantee, while letting the vector decide when to reallocate space. Now I see reallocation/moving happen O(logN) times, rather than O(N) times, where N is the final size of vector. Closes https://github.com/facebook/rocksdb/pull/3508 Differential Revision: D6994228 Pulled By: ajkr fbshipit-source-id: ab7c11e13ff37c8c6c8249be7a79566a4068cd27 16 February 2018, 03:41:52 UTC
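The reallocation behavior described in 0454f78 can be observed with a small experiment. This is a sketch under assumptions: `CountReallocations` is a hypothetical function, it uses a plain `std::vector<int>` rather than arena blocks, and it does not model the no-memory-leak guarantee mentioned in the commit. The exact counts depend on the standard library's growth policy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Appends n items and counts how often the vector's capacity changes.
// With reserve_each_time set, reserve(size() + 1) is called before every
// append, which mimics the old pattern; otherwise the vector decides when to
// grow.
std::size_t CountReallocations(std::size_t n, bool reserve_each_time) {
  std::vector<int> v;
  std::size_t reallocations = 0;
  std::size_t last_capacity = v.capacity();
  for (std::size_t i = 0; i < n; ++i) {
    if (reserve_each_time) v.reserve(v.size() + 1);
    v.emplace_back(static_cast<int>(i));
    if (v.capacity() != last_capacity) {
      ++reallocations;
      last_capacity = v.capacity();
    }
  }
  assert(v.size() == n);
  return reallocations;
}
```

On common implementations, where `reserve` allocates exactly the requested capacity and growth is geometric, the first variant reallocates on nearly every append while the second does so only a logarithmic number of times.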
989d123 Legocastle job to report lite build binary size to scuba Summary: Add a legocastle job to continuously build the last 10 commits every 4 hours and report lite build binary size to scuba. Closes https://github.com/facebook/rocksdb/pull/3511 Differential Revision: D7001730 Pulled By: yiwu-arbug fbshipit-source-id: 7c8ca87c46d663c786a0d32be69ebbe7b19a5eb9 16 February 2018, 01:27:24 UTC
8eb1d44 Unbreak MemTableRep API change Summary: The MemTableRep API was broken by this commit: 813719e9525f647aaebf19ca3d4bb6f1c63e2648 This patch reverts the changes and instead adds InsertKey (and etc.) overloads to extend the MemTableRep API without breaking the existing classes that inherit from it. Closes https://github.com/facebook/rocksdb/pull/3513 Differential Revision: D7004134 Pulled By: maysamyabandeh fbshipit-source-id: e568d91fe1e17dd76c0c1f6c7dd51a18633b1c4f 16 February 2018, 01:27:24 UTC
4e7a182 Several small "fixes" Summary: - removed a few unneeded variables - fused some variable declarations and their assignments - fixed right-trimming code in string_util.cc to not underflow - simplified an assertion - moved non-nullptr check assertion before dereferencing of that pointer - pass an std::string function parameter by const reference instead of by value (avoiding potential copy) Closes https://github.com/facebook/rocksdb/pull/3507 Differential Revision: D7004679 Pulled By: sagar0 fbshipit-source-id: 52944952d9b56dfcac3bea3cd7878e315bb563c4 16 February 2018, 00:57:37 UTC
c88c57c Tweak external file ingestion seqno logic under universal compaction Summary: Right now it is possible that a file gets assigned to L0 but also assigned the seqno from a higher level which it doesn't fit. Under the current impl, it is possible that seqno in lower levels (Ln) can be equal to smallest seqno of higher levels (Ln-1), which is undesirable from universal compaction's point of view. This should fix the intermittent failure of `ExternalSSTFileBasicTest.IngestFileWithGlobalSeqnoPickedSeqno` Closes https://github.com/facebook/rocksdb/pull/3411 Differential Revision: D6813802 Pulled By: miasantreble fbshipit-source-id: 693d0462fa94725ccfb9d8858743e6d2d9992d14 15 February 2018, 22:13:39 UTC
6a30b98 fix wrong indentation Summary: Somehow the indentation was incorrect in this file. The only change in this PR is to get it right again in order to make the code more readable. Please reject if you think it's not worth it. Closes https://github.com/facebook/rocksdb/pull/3504 Differential Revision: D6996011 Pulled By: miasantreble fbshipit-source-id: 060514a3a8c910d34bad795b36eb4d278512b154 15 February 2018, 19:13:37 UTC
ba6ee1f Fix 2 more unused reference errors VS2017 Summary: As in #3425 Closes https://github.com/facebook/rocksdb/pull/3497 Differential Revision: D6979588 Pulled By: gfosco fbshipit-source-id: e9fb32d04ad45575dfe9de1d79348d158e474197 14 February 2018, 19:12:36 UTC
b3c5351 Direct I/O writable file should do fsync in Close() Summary: We don't do fsync() after truncate in direct I/O writable file (in fact we don't do any fsync ever). This can cause metadata not to be persisted to disk after the file is generated. We now call fsync() in Close(). Closes https://github.com/facebook/rocksdb/pull/3500 Differential Revision: D6981482 Pulled By: siying fbshipit-source-id: 7e2b591b7e5dd1b96fc0775515b8b9e6092980ef 14 February 2018, 00:27:11 UTC
d08d05c fix UBSAN errors in fault_injection_test Summary: This fixes shift and signed-integer-overflow UBSAN checks in fault_injection_test by using a larger and unsigned type. Closes https://github.com/facebook/rocksdb/pull/3498 Reviewed By: siying Differential Revision: D6981116 Pulled By: igorsugak fbshipit-source-id: 3688f62cce570534b161e9b5f42109ebc9ae5a2c 13 February 2018, 22:12:40 UTC
dadf016 Rename one of the two LevelIterator Summary: A new LevelIterator was recently created. Rename the old one to make unity build happy. It's also not a good idea to have two classes in the same name anyway. Closes https://github.com/facebook/rocksdb/pull/3499 Differential Revision: D6979325 Pulled By: siying fbshipit-source-id: 3a032d93fe205650a08e92e5262594731ec726bb 13 February 2018, 21:57:58 UTC
7474861 Suppress UBSAN error in finer granularity Summary: Now we suppress alignment UBSAN error as a whole. Suppressing 3-way CRC and murmurhash feels like a better idea than turning off alignment check as a whole. Closes https://github.com/facebook/rocksdb/pull/3495 Differential Revision: D6971273 Pulled By: siying fbshipit-source-id: 080b59fed6df494b9f622ef7cb5d42d39e6a8cdf 13 February 2018, 20:18:07 UTC
3c380fd Adding blog post for 5.10.2 release Summary: Closes https://github.com/facebook/rocksdb/pull/3464 Differential Revision: D6906184 Pulled By: gfosco fbshipit-source-id: 415934d7b1dd8dd226b6619bfb71781184d55cd9 13 February 2018, 19:56:59 UTC
b555ed3 Customized BlockBasedTableIterator and LevelIterator Summary: Use a customized BlockBasedTableIterator and LevelIterator to replace current implementations leveraging two-level-iterator. Hope the customized logic will make code easier to understand. As a side effect, BlockBasedTableIterator reduces the allocation for the data block iterator object, and avoids the virtual function call to it, because we can directly reference BlockIter, a final class. Similarly, LevelIterator reduces virtual function calls to the dummy iterator iterating the file metadata. It also enabled further optimization. The upper bound check is also moved from index block to data block. This implementation fits this iterator better. After the change, forward iterator is slightly optimized to ensure we trim those iterators. The two-level-iterator now is only used by partitioned index, so it is simplified. Closes https://github.com/facebook/rocksdb/pull/3406 Differential Revision: D6809041 Pulled By: siying fbshipit-source-id: 7da3b9b1d3c8e9d9405302c15920af1fcaf50ffa 13 February 2018, 01:12:25 UTC
8a04ee4 WritePrepared Txn: use TransactionDBWriteOptimizations (2nd attempt) Summary: TransactionDB::Write can receive some optimization hints from the user. One is to skip the concurrency control mechanism. WritePreparedTxnDB is currently ignoring such hints. This patch optimizes WritePreparedTxnDB::Write for skip_concurrency_control and skip_duplicate_key_check hints. Closes https://github.com/facebook/rocksdb/pull/3496 Differential Revision: D6971784 Pulled By: maysamyabandeh fbshipit-source-id: cbab10ad538fa2b8bcb47e37c77724afe6e30f03 13 February 2018, 00:43:40 UTC
ee1c802 Add delay before flush in CompactRange to avoid write stalling Summary: - Refactored logic for checking write stall condition to a helper function: `GetWriteStallConditionAndCause`. Now it is decoupled from the logic for updating WriteController / stats in `RecalculateWriteStallConditions`, so we can reuse it for predicting whether write stall will occur. - Updated `CompactRange` to first check whether the one additional immutable memtable / L0 file would cause stalling before it flushes. If so, it waits until that is no longer true. - Updated `bg_cv_` to be signaled on `SetOptions` calls. The stall conditions `CompactRange` cares about can change when (1) flush finishes, (2) compaction finishes, or (3) options dynamically change. The cv was already signaled for (1) and (2) but not yet for (3). Closes https://github.com/facebook/rocksdb/pull/3381 Differential Revision: D6754983 Pulled By: ajkr fbshipit-source-id: 5613e03f1524df7192dc6ae885d40fd8f091d972 12 February 2018, 23:42:47 UTC
0a0fad4 db_bench separate options for partition index and filters Summary: Some workloads (like my current benchmarking) may want partitioned indexes without partitioned filters. Particularly, when `-optimize_filters_for_hits=true`, the total index size may be larger than the total filter size, so it can make sense to hold all filters in-memory but not all indexes. Closes https://github.com/facebook/rocksdb/pull/3492 Differential Revision: D6970092 Pulled By: ajkr fbshipit-source-id: b7fa1828e1d13829339aefb90fd56eb7c5337f61 12 February 2018, 22:57:13 UTC
3f1bb07 make flush_reason_ atomic to keep TSAN happy Summary: Closes https://github.com/facebook/rocksdb/pull/3487 Differential Revision: D6967098 Pulled By: miasantreble fbshipit-source-id: 48e0accf2e3b3f589ddb797ff8083c8520269bf0 12 February 2018, 21:28:18 UTC
ef29d2a Explicitly fail writes if key or value is not smaller than 4GB Summary: Right now, users will encounter unexpected behavior if they use key or value larger than 4GB. We should explicitly fail the queries. Closes https://github.com/facebook/rocksdb/pull/3484 Differential Revision: D6953895 Pulled By: siying fbshipit-source-id: b60491e1af064fc5d52971956661f6c18ceac24f 09 February 2018, 22:57:54 UTC
fe228da WritePrepared Txn: Support merge operator Summary: CompactionIterator invokes MergeHelper::MergeUntil() to do partial merge between snapshot boundaries. Previously it only depended on sequence number to tell snapshot boundary, but we also need to make use of snapshot_checker to verify visibility of the merge operands to the snapshots. For example, say there is a snapshot with seq = 2 that can only see data with seq <= 1. There are three merges, with seq = 1, 2, 3. A correct compaction output would be (1),(2+3). Without taking snapshot_checker into account when generating merge result, compaction will generate output (1+2),(3). By filtering uncommitted keys with read callback, the read path already takes care of merges well and doesn't need additional updates. Closes https://github.com/facebook/rocksdb/pull/3475 Differential Revision: D6926087 Pulled By: yiwu-arbug fbshipit-source-id: 8f539d6f897cfe29b6dc27a8992f68c2a629d40a 09 February 2018, 22:57:54 UTC
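The example in fe228da can be reproduced with a simplified model. Everything here is an assumption made for illustration: `Snapshot`, `max_visible_seq`, `EarliestVisibleSnapshot` and `GroupMergeOperands` are hypothetical names, and the real snapshot_checker is a callback rather than a stored limit. The sketch groups merge operands by the earliest snapshot that can see them, either by sequence number alone or by the modeled visibility check.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of a snapshot: taken at `seq`, but only data with
// sequence number <= `max_visible_seq` is actually visible to it.
struct Snapshot {
  uint64_t seq;
  uint64_t max_visible_seq;
};

// Index of the earliest snapshot that can see `operand_seq`, or
// snapshots.size() if none can. `snapshots` is sorted by seq.
std::size_t EarliestVisibleSnapshot(const std::vector<Snapshot>& snapshots,
                                    uint64_t operand_seq, bool use_checker) {
  for (std::size_t i = 0; i < snapshots.size(); ++i) {
    const uint64_t limit =
        use_checker ? snapshots[i].max_visible_seq : snapshots[i].seq;
    if (operand_seq <= limit) return i;
  }
  return snapshots.size();
}

// Groups consecutive merge operands (ascending seq) that fall between the
// same snapshot boundaries; each group could be merged into one output.
std::vector<std::vector<uint64_t>> GroupMergeOperands(
    const std::vector<Snapshot>& snapshots,
    const std::vector<uint64_t>& operand_seqs, bool use_checker) {
  std::vector<std::vector<uint64_t>> groups;
  std::size_t current = 0;
  for (std::size_t i = 0; i < operand_seqs.size(); ++i) {
    assert(i == 0 || operand_seqs[i - 1] <= operand_seqs[i]);
    const std::size_t idx =
        EarliestVisibleSnapshot(snapshots, operand_seqs[i], use_checker);
    if (groups.empty() || idx != current) {
      groups.emplace_back();  // a snapshot boundary starts a new group
      current = idx;
    }
    groups.back().push_back(operand_seqs[i]);
  }
  return groups;
}
```

With one snapshot `{seq = 2, max_visible_seq = 1}` and operands 1, 2, 3, the visibility-aware grouping gives (1),(2+3), while grouping by sequence number alone gives (1+2),(3), matching the commit message.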
9fc72d6 Compilation fixes for powerpc build, -Wparentheses-equality error and missing header guards Summary: This pull request contains miscellaneous compilation fixes. Thanks, Chinmay Closes https://github.com/facebook/rocksdb/pull/3462 Differential Revision: D6941424 Pulled By: sagar0 fbshipit-source-id: fe9c26507bf131221f2466740204bff40a15614a 09 February 2018, 22:12:43 UTC
d62af7f fix a typo (of a potential vi user) Summary: Closes https://github.com/facebook/rocksdb/pull/3481 Differential Revision: D6939089 Pulled By: siying fbshipit-source-id: ccce3ae10cc5ff50a74b85804afd044b21a3c3e2 09 February 2018, 20:58:07 UTC
945f618 log flush reason for better debugging experience Summary: It's always a mystery from the logs why flush was triggered -- user triggered it manually, WriteBufferManager triggered it, logs were full, write buffer was full, etc. This PR logs Flush reason whenever a flush is scheduled. Closes https://github.com/facebook/rocksdb/pull/3401 Differential Revision: D6788142 Pulled By: miasantreble fbshipit-source-id: a867e54d493c06adf5172bd36a180fb3faae3511 09 February 2018, 20:12:43 UTC
e78715c Eliminate a memcpy for uncompressed blocks Summary: `ReadBlockFromFile` uses a stack buffer to hold small data blocks before passing them to the compression library, which outputs uncompressed data in a heap buffer. In the case of `kNoCompression` there is a `memcpy` to copy from stack buffer to heap buffer. This PR optimizes `ReadBlockFromFile` to skip the stack buffer for files whose blocks are known to be uncompressed. We determine this using the SST file property, "compression_name", if it's available. Closes https://github.com/facebook/rocksdb/pull/3472 Differential Revision: D6920848 Pulled By: ajkr fbshipit-source-id: 5c753e804efc178b9229ae5dbe6a4adc32031f07 07 February 2018, 23:57:37 UTC