https://github.com/facebook/rocksdb

Revision Author Date Message Commit Date
2467be0 Fix casts for MSVC 01 November 2016, 00:47:10 UTC
d5555d9 Fix MSVC compile error in 32 bit compilation Summary: Passing std::atomic<uint64_t> variables to ASSERT_EQ() results in compile error C2718 'const T1': actual parameter with requested alignment of 8 won't be aligned. VS2015 defines std::atomic as specially aligned type ( with 'alignas'), however the compiler does not like declspec(align)ed function arguments. Worked around by casting std::atomic<uint64_t> types to uint64_t in ASSERT_EQ. Closes https://github.com/facebook/rocksdb/pull/1450 Differential Revision: D4106788 Pulled By: yiwu-arbug fbshipit-source-id: 5fb42c3 01 November 2016, 00:24:18 UTC
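The workaround in d5555d9 can be illustrated with a minimal sketch. `SameValue` is a hypothetical stand-in for a templated comparison helper such as ASSERT_EQ, and `CounterEquals` is an illustrative name; neither comes from the RocksDB sources. The point is only that the atomic is read into a plain `uint64_t` before it reaches the helper, so the helper is never instantiated with the specially aligned `std::atomic<uint64_t>` type.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a templated comparison helper such as ASSERT_EQ.
template <typename A, typename B>
bool SameValue(const A& a, const B& b) {
  return a == b;
}

// Loads the atomic into a plain uint64_t first, so SameValue is instantiated
// with uint64_t arguments rather than std::atomic<uint64_t>.
inline bool CounterEquals(const std::atomic<uint64_t>& counter,
                          uint64_t expected) {
  return SameValue(expected, static_cast<uint64_t>(counter.load()));
}
```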
da61f34 Print compression and Fast CRC support info as Header level Summary: Currently the compression support and fast CRC support information is printed as info level. They should be in the same level as options, which is header level. Also add ZSTD to this printing. Closes https://github.com/facebook/rocksdb/pull/1448 Differential Revision: D4106608 Pulled By: yiwu-arbug fbshipit-source-id: cb9a076 31 October 2016, 23:09:13 UTC
f9eb567 db_bench: --dump_malloc_stats takes no effect Summary: Fix the bug that --dump_malloc_stats is set before opening the DB. Closes https://github.com/facebook/rocksdb/pull/1447 Differential Revision: D4106001 Pulled By: siying fbshipit-source-id: 4e746da 31 October 2016, 21:54:26 UTC
6a4faee fix freebsd build include path err and so & jar file name Summary: Closes https://github.com/facebook/rocksdb/pull/1441 Differential Revision: D4103477 Pulled By: yiwu-arbug fbshipit-source-id: 071a0dc 31 October 2016, 16:39:16 UTC
c90c48d Show More DB Stats in info logs Summary: DB Stats now are truncated if there are too many CFs. Extend the buffer size to allow more to be printed out. Also, separate out malloc to another log line. Closes https://github.com/facebook/rocksdb/pull/1439 Differential Revision: D4100943 Pulled By: yiwu-arbug fbshipit-source-id: 79f7218 29 October 2016, 23:09:18 UTC
1b295ac DBTest.GetThreadStatus: Wait for test results for longer Summary: The current 10 millisecond wait for test results may not be sufficient in some test environments. Increase it to 60 seconds and check the results every 1 millisecond. Already reviewed: https://reviews.facebook.net/D65457 Closes https://github.com/facebook/rocksdb/pull/1437 Differential Revision: D4099443 Pulled By: siying fbshipit-source-id: cf1f205 29 October 2016, 23:09:18 UTC
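The "long timeout, short polling interval" pattern described in 1b295ac might look roughly like the sketch below. `WaitUntil` is a hypothetical helper written for illustration, not the code used in the test.

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <thread>

// Polls `done` every `interval` until it returns true or `timeout` elapses.
// Returns true as soon as the condition holds, false if it never does.
inline bool WaitUntil(const std::function<bool()>& done,
                      std::chrono::milliseconds timeout,
                      std::chrono::milliseconds interval) {
  const auto deadline = std::chrono::steady_clock::now() + timeout;
  while (std::chrono::steady_clock::now() < deadline) {
    if (done()) return true;
    std::this_thread::sleep_for(interval);
  }
  return done();  // one last check once the deadline has passed
}
```

With a 60 second timeout and a 1 millisecond interval, a fast environment still returns almost immediately, while a slow one gets far more headroom than a single fixed 10 millisecond sleep.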
25f5742 Update documentation to point at gcc 4.8 Summary: Rocksdb currently has many references to std::map::emplace(), which is not implemented in gcc 4.7 but is available in gcc 4.8. Confirmed that it did not build with gcc 4.7, but builds fine with gcc 4.8 Closes https://github.com/facebook/rocksdb/pull/1272 Differential Revision: D4101385 Pulled By: IslamAbdelRahman fbshipit-source-id: f6af453 29 October 2016, 19:09:17 UTC
b50a81a Add a test for tailing_iterator Summary: Fixes a bug where tailingIterator->Seek(target) skips records. I think the bug is in SeekInternal, starting at line 387, where the comment reads: "search_left_bound > search_right_bound There are only 2 cases this can happen: (1) target key is smaller than left most file (2) target key is larger than right most file". The comment is wrong: there is another possibility, in which the higher level has a gap big enough that the file in the lower level fits completely inside it, and then indexer->GetNextLevelIndex returns search_left_bound > search_right_bound, which I think point at the files after and before the gap. Details: https://github.com/facebook/rocksdb/issues/1372 This bug is fixed, with a test case added. Closes https://github.com/facebook/rocksdb/pull/1436 Reviewed By: IslamAbdelRahman Differential Revision: D4099313 Pulled By: lightmark fbshipit-source-id: 6a675b3 29 October 2016, 01:24:14 UTC
04751d5 L0 compression should follow options.compression_per_level if not empty Summary: Currently, we don't use options.compression_per_level[0] as the compression type for L0 unless it is None. This behavior doesn't look intentional. This diff makes sure L0 is compressed using the style of options.compression_per_level[0]. Reviewed and accepted in: https://reviews.facebook.net/D65607 Closes https://github.com/facebook/rocksdb/pull/1435 Differential Revision: D4099368 Pulled By: siying fbshipit-source-id: cfbbdcd 29 October 2016, 00:39:20 UTC
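The rule stated in 04751d5 can be sketched as a small lookup. The `Compression` enum, the `SketchOptions` struct and `CompressionForLevel` are hypothetical names for illustration; clamping out-of-range levels to the last configured entry is an assumption of this sketch, not something the commit message states.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, reduced set of compression types.
enum class Compression { kNone, kSnappy, kZlib };

// Hypothetical stand-in for the two options named in the commit message.
struct SketchOptions {
  Compression compression = Compression::kSnappy;
  std::vector<Compression> compression_per_level;
};

// If compression_per_level is non-empty, every level (including L0) follows
// it; otherwise the single `compression` setting applies.
inline Compression CompressionForLevel(const SketchOptions& o,
                                       std::size_t level) {
  if (o.compression_per_level.empty()) return o.compression;
  // Assumption: levels beyond the vector reuse its last entry.
  const std::size_t idx = std::min(level, o.compression_per_level.size() - 1);
  return o.compression_per_level[idx];
}
```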
2946cad Improve RangeDelAggregator documentation Summary: as requested in D62259 Closes https://github.com/facebook/rocksdb/pull/1434 Differential Revision: D4099047 Pulled By: ajkr fbshipit-source-id: a258cfb 28 October 2016, 22:54:21 UTC
0a9fd05 Update Vagrant file (test internal phabricator workflow) Summary: Add simple comment to Vagrant file Closes https://github.com/facebook/rocksdb/pull/1433 Differential Revision: D4098740 Pulled By: IslamAbdelRahman fbshipit-source-id: 4903bff 28 October 2016, 22:39:19 UTC
fcd1e0b Make rocksdb work with internal repo fbshipit-source-id: f52d2b6d39668516270c51945fc4e1693e553ff7 28 October 2016, 21:59:50 UTC
0aab5e5 FreeBSD: malloc_usable_size is in <malloc_np.h> (#1428) Signed-off-by: Willem Jan Withagen <wjw@digiware.nl> 28 October 2016, 17:44:52 UTC
9c0bb7f cmake: drop "-march=native" from CXX_FLAGS (#1429) this breaks the cross-compiling, and we can not assume that the building machine and the target machine share the same CPU spec. Signed-off-by: Kefu Chai <kchai@redhat.com> 28 October 2016, 17:40:47 UTC
eeb27e1 Add handy option to turn on direct I/O in db_bench (#1424) 28 October 2016, 17:36:05 UTC
c6168d1 removed some declarations from c.h which resulted in undefined symbols (#1407) 28 October 2016, 17:33:49 UTC
bc429de revert fractional cascading in forward iterator Summary: Per offline discussion with Siying, revert this since it has a bug with seek. Test Plan: make check -j64 Reviewers: yiwu, andrewkr, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65559 28 October 2016, 17:25:39 UTC
b9bc7a2 Use skiplist rep for range tombstone memtable Summary: somehow missed committing this update in D62217 Test Plan: make check Reviewers: sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65361 27 October 2016, 17:07:28 UTC
60a2bbb Makefile: generate util/build_version.cc from .in file (#1384) * util/build_version.cc.in: add this file, so cmake and make can share the template file for generating util/build_version.cc. * CMakeLists.txt: also, cmake v2.8.11 does not support file(GENERATE ...), so we are using configure_file() for creating build_version.cc. * Makefile: use util/build_version.cc.in for creating build_version.cc. Signed-off-by: Kefu Chai <tchaikov@gmail.com> 25 October 2016, 18:31:39 UTC
9ee8406 Disable DBTest.RepeatedWritesToSameKey (#1420) Summary: The verification condition of the test DBTest.RepeatedWritesToSameKey doesn't hold anymore after 3ce3bb3da2486c2c18a332128dda7c05a91abb85. Disable the test for now before we find a way to replace it. Test Plan: Run the test and make sure it is disabled. 25 October 2016, 17:23:50 UTC
f41df30 OptionChangeMigration() to support FIFO compaction Summary: OptionChangeMigration() to support FIFO compaction. If the DB before migration is using FIFO compaction, nothing should be done. If the destination option is FIFO options, compact to one single L0 file if the source has more than one level. Test Plan: Run option_change_migration_test Reviewers: andrewkr, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65289 25 October 2016, 01:04:32 UTC
2e8004e Changing the legocastle run to use valgrind_test instead of _check Summary: valgrind_test is the correct way to run valgrind tests. This is because we need to force DISABLE_JEMALLOC Test Plan: Running sandcastle and contrun Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65451 24 October 2016, 23:23:19 UTC
9de2f75 revert Prev() in MergingIterator to use previous code in non-prefix-seek mode Summary: Siying suggested to keep old code for normal mode prev() for safety Test Plan: make check -j64 Reviewers: yiwu, andrewkr, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65439 24 October 2016, 20:13:01 UTC
2449518 DBSSTTest.RateLimitedDelete: not to use real clock Summary: Using real clock causes failures of DBSSTTest.RateLimitedDelete in some cases. Turn away from the real time. Use fake time instead. Test Plan: Run the tests and all existing tests. Reviewers: yiwu, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65145 24 October 2016, 17:35:00 UTC
1168cb8 Fix a bug that may cause a deleted row to appear again Summary: The previous fix for the reappearance of a deleted row, 0ce258f9b37c8661ea326039372bef8f185615ef, missed a corner case, which can be reproduced using test CompactionPickerTest.OverlappingUserKeys7. Consider such an example: input level file: 1[B E] 2[F H] output level file: 3[A C] 4[D I] 5[I K] First file 2 is picked, which overlaps with file 4. 4 expands to 5. Now the whole range is [D K] with 2 output level files. When we try to expand that, [D K] overlaps with files 1 and 2 in the input level, and 1 and 2 overlap with 3 and 4 in the output level. So we end up picking 3 and 4 in the output level. Without expanding, it also has 2 files, so we determine the output level doesn't change, although they are two different files. The fix is to expand the output level files after we picked 3 and 4. In that case, there will be three output level files so we will abort the expanding. I also added two unit tests related to marked_for_compaction and being_compacted. They have been passing though. Test Plan: Run the new unit test, as well as all other tests. Reviewers: andrewkr, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: yoshinorim, leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65373 24 October 2016, 16:49:07 UTC
99c052a Fix integer overflow in GetL0ThresholdSpeedupCompaction (#1378) 24 October 2016, 01:43:29 UTC
f83cd64 Fix a bug that mistakenly disable regression_test.sh to update commit (#1415) Summary: Fix a bug that mistakenly disable regression_test.sh to update commit Test Plan: regression_test.sh 22 October 2016, 00:26:24 UTC
0e926b8 Passing DISABLE_JEMALLOC=1 to valgrind_check if run locally Summary: Valgrind does not work well with JEMALLOC. If you run a simple make valgrind_check, you will see lots of issues and crashes. When precommit runs, this is taken care of. Here we make sure valgrind_check is passed in DISABLE_JEMALLOC=1 Test Plan: Ran local valgrind_test and noticed the difference Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65379 21 October 2016, 21:57:44 UTC
4dfaa66 Make IsDeadlockDetect() virtual member of Transaction Summary: Make `IsDeadlockDetect()` virtual member of base class `Transaction` for ease of use in MyRocks Test Plan: compiles. compiles into MyRocks call-site. Reviewers: mung Reviewed By: mung Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65385 21 October 2016, 21:47:59 UTC
59a7c03 Change ioptions to store user_comparator, fix bug Summary: change ioptions.comparator to user_comparator instead of internal_comparator. Also change Comparator* to InternalKeyComparator* to make its type explicit. Test Plan: make all check -j64 Reviewers: andrewkr, sdong, yiwu Reviewed By: yiwu Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65121 21 October 2016, 18:31:42 UTC
869ae5d Support IngestExternalFile (remove AddFile restrictions) Summary: Changes in the diff API changes: - Introduce IngestExternalFile to replace AddFile (I think this makes the API clearer) - Introduce IngestExternalFileOptions (This struct will encapsulate the options for ingesting the external file) - Deprecate AddFile() API Logic changes: - If our file overlaps with the memtable we will flush the memtable - We will find the first level in the LSM tree whose keys our file's key range overlaps with - We will find the lowest level in the LSM tree above the level we found in step 2 that our file can fit in and ingest our file in it - We will assign a global sequence number to our new file - Remove AddFile restrictions by using global sequence numbers Other changes: - Refactor all AddFile logic to be encapsulated in ExternalSstFileIngestionJob Test Plan: unit tests (still need to add more) addfile_stress (https://reviews.facebook.net/D65037) Reviewers: yiwu, andrewkr, lightmark, yhchiang, sdong Reviewed By: sdong Subscribers: jkedgar, hcz, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65061 21 October 2016, 00:05:32 UTC
1d9dbef Restrict running condition of UniversalCompactionTrivialMoveTest2 Summary: DBTestUniversalCompaction.UniversalCompactionTrivialMoveTest2 verifies non-trivial move is not triggered if we load data in sequential order. However, if there are multiple compaction threads, this condition may not hold. Restrict the running condition to 1 compaction thread to make the test more robust. Test Plan: Run the test and make sure at least it doesn't regress normally. Reviewers: yhchiang, andrewkr, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65277 20 October 2016, 22:43:00 UTC
4edd39f Implement deadlock detection Summary: Implement deadlock detection. This is done by maintaining a TxnID -> TxnID map which represents the edges in the wait for graph (this is named `wait_txn_map_`). Test Plan: transaction_test Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64491 20 October 2016, 02:45:57 UTC
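The idea in 4edd39f, a TxnID -> TxnID map holding the edges of the wait-for graph, can be sketched as a cycle check that follows those edges. `WouldDeadlock` and its signature are hypothetical; only the map shape and the `wait_txn_map_` name come from the commit message, and the real implementation may differ in how it bounds or locks the traversal.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

using TxnID = uint64_t;

// Returns true if letting `waiter` wait on `holder` would close a cycle in
// the wait-for graph. Each transaction waits on at most one other
// transaction, so the graph is a set of chains that can be walked directly.
inline bool WouldDeadlock(const std::unordered_map<TxnID, TxnID>& wait_txn_map,
                          TxnID waiter, TxnID holder) {
  TxnID next = holder;
  // A chain that does not return to `waiter` within size()+1 hops is either
  // finished or stuck in an unrelated cycle; stop in both cases.
  for (std::size_t hops = 0; hops <= wait_txn_map.size(); ++hops) {
    if (next == waiter) return true;
    auto it = wait_txn_map.find(next);
    if (it == wait_txn_map.end()) return false;  // chain ends: no cycle
    next = it->second;
  }
  return false;
}
```

For example, with edges 2 -> 3 and 3 -> 1 already recorded, transaction 1 asking to wait on transaction 2 would complete the cycle 1 -> 2 -> 3 -> 1.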
48fd619 Minor fixes to RocksJava Native Library initialization (#1287) * [bugfix] Make sure the Native Library is initialized. Closes https://github.com/facebook/rocksdb/issues/989 * [bugfix] Just load the native libraries once 20 October 2016, 01:21:22 UTC
48e4e84 Disable auto compactions in memory_test and re-enable the test (#1408) Summary: Auto-compactions will change memory usage of DB but memory_test didn't take it into account. This PR disables auto compactions in the test and hopefully it fixes its flakiness. Test Plan: UBSAN build used to catch the flakiness. Run `make ubsan_check` and it passes. 20 October 2016, 01:18:42 UTC
fb2e412 column_family_test: disable some tests in LITE Summary: Some tests in column_family_test depend on functions that are not available in LITE build, which sometimes causes flakiness. Disable them. Test Plan: Run those tests in LITE build. Reviewers: yiwu, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65271 19 October 2016, 22:55:56 UTC
5af651d fix data race in compact_files_test Summary: fix data race Test Plan: compact_files_test Reviewers: sdong, yiwu, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65259 19 October 2016, 20:37:51 UTC
a0ba0aa Fix uninitialized variable gcc error for MyRocks Summary: make sure seq_ is properly initialized even if ParseInternalKey() fails. Test Plan: run myrocks release tests Reviewers: lightmark, mung, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65199 19 October 2016, 17:59:46 UTC
b88f8e8 Support SST files with Global sequence numbers [reland] Summary: reland https://reviews.facebook.net/D62523 - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno` - Update TableProperties to be aware of the offset of each property in the file - Update BlockBasedTableReader and Block to be able to honor the sequence number in `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that it's guaranteed that SST files with global seqno will have only one user_key and each key will have seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can actually keep the current logic for these blocks Test Plan: unit tests Reviewers: sdong, yhchiang Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65211 18 October 2016, 23:59:37 UTC
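The ordering argument in b88f8e8 relies on how internal keys sort: by user key ascending, then by sequence number descending, so for one user key the entry with seqno=0 sorts last. A minimal sketch of that comparison follows; `SketchKey` and `CompareSketchKeys` are hypothetical names and the struct omits the value type that real internal keys also carry.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified internal key: user key plus sequence number.
struct SketchKey {
  std::string user_key;
  uint64_t seqno;
};

// Returns <0, 0 or >0. User keys ascend; for equal user keys the larger
// (newer) sequence number sorts first.
inline int CompareSketchKeys(const SketchKey& a, const SketchKey& b) {
  const int c = a.user_key.compare(b.user_key);
  if (c != 0) return c < 0 ? -1 : 1;
  if (a.seqno > b.seqno) return -1;
  if (a.seqno < b.seqno) return 1;
  return 0;
}
```

Under this order a key encoded with seqno=0 compares greater than the same user key with any positive seqno, which is the property the commit message uses to justify leaving the index block untouched.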
08616b4 [db_bench] add filldeterministic (Universal+level compaction) Summary: in db_bench, we can dynamically create a rocksdb database that guarantees the shape of its LSM. universal + level compaction no fifo compaction no multi db support Test Plan: ./db_bench -benchmarks=fillseqdeterministic -compaction_style=1 -num_levels=3 --disable_auto_compactions -num=1000000 -value_size=1000 ``` ---------------------- LSM --------------------- Level[0]: /000480.sst(size: 35060275 bytes) Level[0]: /000479.sst(size: 70443197 bytes) Level[0]: /000478.sst(size: 141600383 bytes) Level[1]: /000341.sst - /000475.sst(total size: 284726629 bytes) Level[2]: /000071.sst - /000340.sst(total size: 568649806 bytes) fillseqdeterministic : 60.447 micros/op 16543 ops/sec; 16.0 MB/s ``` Reviewers: sdong, andrewkr, IslamAbdelRahman, yhchiang Reviewed By: yhchiang Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D63111 18 October 2016, 23:30:57 UTC
52c9808 not split file in compaction on level 0 Summary: we should not split file on level 0 in compaction because it will fail the following verification of seqno order on level 0 Test Plan: check with filldeterministic in db_bench Reviewers: yhchiang, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65193 18 October 2016, 23:30:34 UTC
5e0d6b4 fix db_stress assertion failure Summary: in rocksdb::DBIter::FindValueForCurrentKey(), last_not_merge_type could also be SingleDelete() which is omitted Test Plan: db_iter_test Reviewers: yhchiang, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65187 18 October 2016, 23:07:10 UTC
ab53998 Bump RocksDB version to 4.13 (#1405) Summary: Bump RocksDB version to 4.13 Test Plan: unit tests Reviewers: sdong, IslamAbdelRahman, andrewkr, lightmark Subscribers: leveldb 18 October 2016, 22:39:10 UTC
b4d0712 SamePrefixTest.InDomainTest to clear the test directory before testing Summary: SamePrefixTest.InDomainTest may fail if the previous run of some test cases in prefix_test fail. Test Plan: Run the test Reviewers: lightmark, yhchiang, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: leveldb, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65163 18 October 2016, 21:01:10 UTC
aa09d03 Avoid calling GetDBOptions() inside GetFromBatchAndDB() Summary: MyRocks hit a regression, @mung generated perf reports showing that the reason is the cost of calling `GetDBOptions()` inside `GetFromBatchAndDB()` This diff avoids calling `GetDBOptions` and uses the `ImmutableDBOptions` instead Test Plan: make check -j64 Reviewers: sdong, yiwu Reviewed By: yiwu Subscribers: andrewkr, dhruba, mung Differential Revision: https://reviews.facebook.net/D65151 18 October 2016, 20:19:26 UTC
6fbe96b Compaction Support for Range Deletion Summary: This diff introduces RangeDelAggregator, which takes ownership of iterators provided to it via AddTombstones(). The tombstones are organized in a two-level map (snapshot stripe -> begin key -> tombstone). Tombstone creation avoids data copy by holding Slices returned by the iterator, which remain valid thanks to pinning. For compaction, we create a hierarchical range tombstone iterator with structure matching the iterator over compaction input data. An aggregator based on that iterator is used by CompactionIterator to determine which keys are covered by range tombstones. In case of merge operand, the same aggregator is used by MergeHelper. Upon finishing each file in the compaction, relevant range tombstones are added to the output file's range tombstone metablock and file boundaries are updated accordingly. To check whether a key is covered by range tombstone, RangeDelAggregator::ShouldDelete() considers tombstones in the key's snapshot stripe. When this function is used outside of compaction, it also checks newer stripes, which can contain covering tombstones. Currently the intra-stripe check involves a linear scan; however, in the future we plan to collapse ranges within a stripe such that binary search can be used. RangeDelAggregator::AddToBuilder() adds all range tombstones in the table's key-range to a new table's range tombstone meta-block. Since range tombstones may fall in the gap between files, we may need to extend some files' key-ranges. The strategy is (1) first file extends as far left as possible and other files do not extend left, (2) all files extend right until either the start of the next file or the end of the last range tombstone in the gap, whichever comes first. One other notable change is adding release/move semantics to ScopedArenaIterator such that it can be used to transfer ownership of an arena-allocated iterator, similar to how unique_ptr is used for malloc'd data. 
Depends on D61473 Test Plan: compaction_iterator_test, mock_table, end-to-end tests in D63927 Reviewers: sdong, IslamAbdelRahman, wanning, yhchiang, lightmark Reviewed By: lightmark Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D62205 18 October 2016, 19:04:56 UTC
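The two-level map and the linear intra-stripe scan described in 6fbe96b can be sketched as follows. Everything here is a simplification under stated assumptions: `SketchTombstone`, `StripeMap` and `ShouldDeleteSketch` are hypothetical names, stripes are keyed by an assumed upper-bound sequence number, tombstones cover the half-open range [begin, end), and only the key's own stripe is checked (the compaction case in the message).

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical range tombstone covering user keys in [begin, end).
struct SketchTombstone {
  std::string begin;
  std::string end;
  uint64_t seqno;
};

// Assumed layout: stripe upper-bound seqno -> begin key -> tombstone.
using StripeMap = std::map<uint64_t, std::map<std::string, SketchTombstone>>;

// Returns true if a tombstone in the key's snapshot stripe covers the key
// and is newer than it. The scan inside the stripe is linear, as in the
// commit message.
inline bool ShouldDeleteSketch(const StripeMap& stripes,
                               const std::string& key, uint64_t key_seqno) {
  // First stripe whose upper bound is at or above the key's seqno.
  auto stripe = stripes.lower_bound(key_seqno);
  if (stripe == stripes.end()) return false;
  for (const auto& entry : stripe->second) {
    const SketchTombstone& t = entry.second;
    if (t.begin <= key && key < t.end && t.seqno > key_seqno) return true;
  }
  return false;
}
```

Because the inner map is already ordered by begin key, collapsing overlapping ranges within a stripe would allow the loop to be replaced by a binary search, which is the future improvement the message mentions.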
257de78 Remove "-Xcheck:jni" from Java tests (#1402) Summary: Junit and our code generate lots of warnings if "-Xcheck:jni" is on and force Travis to fail as the logs are too long. Test Plan: "make jtest" and see the warnings go away. 18 October 2016, 13:18:24 UTC
d88dff4 add seeforprev in history Summary: update new feature in history and avoid breaking mongorocks Test Plan: make check Reviewers: sdong, yiwu, andrewkr Reviewed By: andrewkr Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64611 17 October 2016, 22:34:13 UTC
5027dd1 Fix a minor bug in the ldb tool that was not selecting the specified (#1399) column family for compaction. 17 October 2016, 17:40:30 UTC
fea6fdd Fix @see in two Java functions (#1396) 15 October 2016, 06:03:17 UTC
b1031d6 Remove function local statics that interfere with memory pooling (#1392) 14 October 2016, 20:09:18 UTC
f470540 Handle WAL deletion when using avoid_flush_during_recovery Summary: Previously the WAL files that were avoided during recovery would never be considered for deletion. That was because alive_log_files_ was only populated when log files are created. This diff further populates alive_log_files_ with existing log files that aren't flushed during recovery, such that FindObsoleteFiles() can find them later. Depends on D64053. Test Plan: new unit test, verifies it fails before this change and passes after Reviewers: sdong, IslamAbdelRahman, yiwu Reviewed By: yiwu Subscribers: leveldb, dhruba, andrewkr Differential Revision: https://reviews.facebook.net/D64059 14 October 2016, 19:59:51 UTC
e29d3b6 Make max_background_compactions and base_background_compactions dynamically changeable Summary: Add DB::SetDBOptions to dynamically change max_background_compactions and base_background_compactions. I'll add more dynamically changeable options soon. Test Plan: unit test. Reviewers: yhchiang, IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64749 14 October 2016, 19:25:39 UTC
21e8dac fix assertion failure in Prev() Summary: fix assertion failure in db_stress. It happens because of prefix seek key is larger than merge iterator key when they have the same user key Test Plan: ./db_stress --max_background_compactions=1 --max_write_buffer_number=3 --sync=0 --reopen=20 --write_buffer_size=33554432 --delpercent=5 --log2_keys_per_lock=10 --block_size=16384 --allow_concurrent_memtable_write=0 --test_batches_snapshots=0 --max_bytes_for_level_base=67108864 --progress_reports=0 --mmap_read=0 --writepercent=35 --disable_data_sync=0 --readpercent=50 --subcompactions=4 --ops_per_thread=20000000 --memtablerep=skip_list --prefix_size=0 --target_file_size_multiplier=1 --column_families=1 --threads=32 --disable_wal=0 --open_files=500000 --destroy_db_initially=0 --target_file_size_base=16777216 --nooverwritepercent=1 --iterpercent=10 --max_key=100000000 --prefixpercent=0 --use_clock_cache=false --kill_random_test=888887 --cache_size=1048576 --verify_checksum=1 Reviewers: sdong, andrewkr, yiwu, yhchiang Reviewed By: yhchiang Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D65025 14 October 2016, 00:36:48 UTC
b9311aa Implement WinRandomRW file and improve code reuse (#1388) 13 October 2016, 23:36:34 UTC
a249a0b check_format_compatible.sh to use some branch which allows to run with GCC 4.8 (#1393) Summary: Some older tags don't run GCC 4.8 with FB internal setting. Fixed them and created branches. Change the format compatible script accordingly. Also add more releases to check format compatibility. 13 October 2016, 23:15:55 UTC
040328a Remove an assertion for single-delete in MergeHelper::MergeUntil Summary: Previously we had an assertion which triggers when we issue Merges after a single delete. However, merges after a single delete are unrelated to that single delete. Thus this behavior should be allowed. This will address a flakiness of db_stress. Test Plan: db_stress Reviewers: IslamAbdelRahman, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64923 13 October 2016, 21:26:57 UTC
8cbe3e1 Relax the acceptable bias of the RateLimiterTest::Rate test to 25% Summary: In the current implementation of RateLimiter, the difference between the configured rate and the actual rate might be more than 20%, while our test only allows 15% difference. This diff relaxes the acceptable bias of the RateLimiterTest::Rate test to 25% to make the test less flaky. Test Plan: rate_limiter_test Reviewers: IslamAbdelRahman, andrewkr, yiwu, lightmark, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64941 13 October 2016, 21:26:12 UTC
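The effect of moving the tolerance from 15% to 25% in 8cbe3e1 can be shown with a small relative-error check. `WithinBias` is a hypothetical helper; the actual test may express its bounds differently.

```cpp
#include <cassert>
#include <cmath>

// Returns true if `actual` deviates from `configured` by at most
// `max_bias`, expressed as a fraction of the configured rate.
inline bool WithinBias(double configured, double actual, double max_bias) {
  return std::fabs(actual - configured) <= max_bias * configured;
}
```

A measured rate that is 20% off the configured one fails a 15% bound but passes a 25% bound, which is the flakiness the commit aims to remove.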
f26a139 Log successful AddFile Summary: Log successful AddFile Test Plan: visually check LOG file Reviewers: yiwu, andrewkr, lightmark, sdong Reviewed By: sdong Subscribers: andrewkr, jkedgar, dhruba Differential Revision: https://reviews.facebook.net/D65019 13 October 2016, 18:56:27 UTC
5691a1d Fix compaction conflict with running compaction Summary: Issue scenario: (1) We have 3 files in L1 and we issue a compaction that will compact them into 1 file in L2 (2) While compaction (1) is running, we flush a file into L0 and trigger another compaction that decides to move this file to L1 and then move it again to L2 (this file doesn't overlap with any other files) (3) compaction (1) finishes and installs the file it generated in L2, but this file overlaps with the file we generated in (2) so we break the LSM consistency Looks like this issue can be triggered by using non-exclusive manual compaction or AddFile() Test Plan: unit tests Reviewers: sdong Reviewed By: sdong Subscribers: hermanlee4, jkedgar, andrewkr, dhruba, yoshinorim Differential Revision: https://reviews.facebook.net/D64947 13 October 2016, 17:49:06 UTC
017de66 fixup commit Summary: I accidentally left out these changes from my commit of D64053 due to messing up the merge conflict resolution. Test Plan: ./db_wal_test Reviewers: Subscribers: Tasks: Blame Revision: D64053 13 October 2016, 15:48:40 UTC
1b7af5f Redo handling of recycled logs in full purge Summary: This reverts commit https://github.com/facebook/rocksdb/commit/9e4aa798c3d47c6be64324bd9d38f0813c8ead7b, which doesn't handle all cases (see inline comment). I reimplemented the logic as suggested in the initial PR: https://github.com/facebook/rocksdb/pull/1313. This approach has two benefits: - All the parsing/filtering of full_scan_candidate_files is kept together in PurgeObsoleteFiles. - We only need to check whether log file is recycled in one place where we've already determined it's a log file Test Plan: new unit test, verified fails before the original fix, still passes now. Reviewers: IslamAbdelRahman, yiwu, sdong Reviewed By: yiwu, sdong Subscribers: leveldb, dhruba, andrewkr Differential Revision: https://reviews.facebook.net/D64053 13 October 2016, 06:13:09 UTC
27bfe32 Editorial change to README.md 13 October 2016, 03:24:50 UTC
89cc404 A bit of doc restructuring 13 October 2016, 03:23:00 UTC
9e7fda8 Fix arcanist Summary: Set no_proxy to fix arcanist Test Plan: will check if tests are triggered Reviewers: arahut, yiwu, lightmark, andrewkr, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D65001 13 October 2016, 03:11:30 UTC
2e4b5ca Add missing RateLimiter class to the Windows build (#1382) 12 October 2016, 22:00:37 UTC
ce4963f [doc] Document that Visual Studio 2015+ is now required for Windows builds (#1389) Closes https://github.com/facebook/rocksdb/issues/1377 12 October 2016, 20:40:20 UTC
e489270 Fix scoped arena iterator (#1387) 12 October 2016, 18:16:16 UTC
f8d8cf5 Fix log_write_bench -bytes_per_sync option. (#1375)
Hello and thanks for RocksDB,
When log_write_bench is run with the -bytes_per_sync option, the option does not influence any *sync* behaviour.
> strace -e trace=write,sync_file_range ./log_write_bench -record_interval 0 -record_size 1048576 -num_records 11 -bytes_per_sync 2097152 2>&1 | egrep '^(sync|write.*XXXX)'
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
I suspect that this is because the bytes_per_sync option now needs to be using a `WritableFileWriter` and not a `WritableFile`.
With the diff below applied, it changes to:
> strace -e trace=write,sync_file_range ./log_write_bench -record_interval 0 -record_size 1048576 -num_records 11 -bytes_per_sync 2097152 2>&1 | egrep '^(sync|write.*XXXX)'
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
sync_file_range(0x3, 0, 0x200000, 0x2) = 0
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
sync_file_range(0x3, 0x200000, 0x200000, 0x2) = 0
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
sync_file_range(0x3, 0x400000, 0x200000, 0x2) = 0
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
sync_file_range(0x3, 0x600000, 0x200000, 0x2) = 0
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 1048576) = 1048576
sync_file_range(0x3, 0x800000, 0x200000, 0x2) = 0
( Note that the first 1MB is not synced as mentioned in util/file_reader_writer.cc::WritableFileWriter::Flush() )
This diff also includes the fix from https://github.com/facebook/rocksdb/pull/1373
> diff -du util/log_write_bench.cc.orig util/log_write_bench.cc
--- util/log_write_bench.cc.orig 2016-10-04 12:06:29.115122580 -0400
+++ util/log_write_bench.cc 2016-10-05 07:24:09.677037576 -0400
@@ -14,6 +14,7 @@
 #include <gflags/gflags.h>
 #include "rocksdb/env.h"
+#include "util/file_reader_writer.h"
 #include "util/histogram.h"
 #include "util/testharness.h"
 #include "util/testutil.h"
@@ -38,19 +39,21 @@
 env_options.bytes_per_sync = FLAGS_bytes_per_sync;
 unique_ptr<WritableFile> file;
 env->NewWritableFile(file_name, &file, env_options);
+ unique_ptr<WritableFileWriter> writer;
+ writer.reset(new WritableFileWriter(std::move(file), env_options));
 std::string record;
- record.assign('X', FLAGS_record_size);
+ record.assign(FLAGS_record_size, 'X');
 HistogramImpl hist;
 uint64_t start_time = env->NowMicros();
 for (int i = 0; i < FLAGS_num_records; i++) {
 uint64_t start_nanos = env->NowNanos();
- file->Append(record);
- file->Flush();
+ writer->Append(record);
+ writer->Flush();
 if (FLAGS_enable_sync) {
- file->Sync();
+ writer->Sync(false);
 }
 hist.Add(env->NowNanos() - start_nanos);
11 October 2016, 23:45:51 UTC
02b3e39 Make txn->GetState() const Summary: makes Transaction::GetState() a const function. Test Plan: compiles. Reviewers: mung Reviewed By: mung Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64929 11 October 2016, 22:48:50 UTC
447f171 new Prev() prefix support using SeekForPrev() Summary: 1) The previous solution for Prev() prefix support was not clean. Since I added the SeekForPrev() API, Prev() can now be symmetric to Next(), and we no longer need to call SeekToLast() in Prev(). Also, Next() will Seek(prefix_seek_key_) to solve the problem of possible inconsistency between db_iter and merge_iter when there is a merge_operator. prefix_seek_key is only refreshed when the direction changes to forward. 2) This diff also fixes the bug in Iterator::SeekToLast() with iterate_upper_bound_ and a prefix extractor. Add test cases for the above two cases. There are some tests for the SeekToLast() in Prev(); I will clean them up later. Test Plan: make all check Reviewers: IslamAbdelRahman, andrewkr, yiwu, sdong Reviewed By: sdong Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D63933 11 October 2016, 20:54:26 UTC
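The symmetry between Seek() and SeekForPrev() mentioned above can be illustrated with an ordered map. This is only a model, assuming SeekForPrev(target) lands on the last key less than or equal to target and Seek(target) on the first key greater than or equal to it; `Table`, `SeekModel` and `SeekForPrevModel` are hypothetical names, not RocksDB code.

```cpp
#include <cassert>
#include <iterator>
#include <map>
#include <string>

// Hypothetical stand-in for a sorted key/value store.
using Table = std::map<std::string, std::string>;

// Model of Seek(): first entry whose key is >= target, or end() if none.
Table::const_iterator SeekModel(const Table& t, const std::string& target) {
  return t.lower_bound(target);
}

// Model of SeekForPrev(): last entry whose key is <= target, or end() if none.
Table::const_iterator SeekForPrevModel(const Table& t,
                                       const std::string& target) {
  auto it = t.upper_bound(target);  // first key strictly greater than target
  if (it == t.begin()) {
    return t.end();  // every key is greater than target
  }
  return std::prev(it);  // step back onto the last key <= target
}
```

With keys a1, a3 and b2, `SeekModel(t, "a2")` lands on a3 while `SeekForPrevModel(t, "a2")` lands on a1, which is the mirror-image behavior that lets Prev() be written like Next().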
991b585 More block cache tickers Summary: Adding several missing block cache tickers. Test Plan: make all check Reviewers: IslamAbdelRahman, yhchiang, lightmark Reviewed By: lightmark Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64881 11 October 2016, 18:59:05 UTC
d6ae6de Add Statistics::getAndResetTickerCount(). Summary: A convenience method to atomically get and reset ticker count. I want to use it to have a thin wrapper to the statistics object to export ticker counts to ODS for LogDevice (since they don't even use fb303). Test Plan: test in LogDevice shadow cluster. https://fburl.com/461868822 Reviewers: andrewkr, yhchiang, IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, dhruba, leveldb Differential Revision: https://reviews.facebook.net/D64869 11 October 2016, 17:54:11 UTC
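One way to read "atomically get and reset" is a single atomic exchange, so that no increment can slip in between the read and the reset. The sketch below shows that idea with a hypothetical `TickerModel` class; the actual RocksDB implementation may be structured quite differently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical single-counter model; not the RocksDB Statistics class.
class TickerModel {
 public:
  // Add n to the counter.
  void Record(uint64_t n) { count_.fetch_add(n, std::memory_order_relaxed); }

  // Read the current value without modifying it.
  uint64_t Get() const { return count_.load(std::memory_order_relaxed); }

  // Return the current value and set the counter to zero in one atomic step.
  uint64_t GetAndReset() {
    return count_.exchange(0, std::memory_order_relaxed);
  }

 private:
  std::atomic<uint64_t> count_{0};
};
```

A separate Get() followed by a reset could lose any increments recorded between the two calls, which is why an exporter that periodically drains counters benefits from the combined operation.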
aea3ce4 Avoid string CONCAT which is not supported in cmake 2.6 (#1383) Signed-off-by: Bassam Tabbara <bassam.tabbara@quantum.com> 11 October 2016, 00:32:04 UTC
2ad68b9 Support running consistency checks in release mode Summary: We always run consistency checks when compiling in debug mode. Allow users to set Options::force_consistency_checks to true so that such checks also run when compiling in release mode. Test Plan: make check -j64 make release Reviewers: lightmark, sdong, yiwu Reviewed By: yiwu Subscribers: hermanlee4, andrewkr, yoshinorim, jkedgar, dhruba Differential Revision: https://reviews.facebook.net/D64701 08 October 2016, 00:21:45 UTC
67501cf Fix -ve std::string::resize Summary: I saw this exception thrown because we may sometimes call resize with a negative value when the max_bytes_for_level_multiplier_additional vector is empty. Test Plan: run the tests Reviewers: yiwu Reviewed By: yiwu Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64791 08 October 2016, 00:16:13 UTC
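A common way to hit this kind of failure is trimming a trailing separator from a string built out of a vector: when the vector is empty, `size() - 1` wraps around to a huge unsigned value and std::string::resize throws std::length_error. The sketch below assumes that pattern; `JoinWithColons` is a hypothetical helper and is not taken from the RocksDB source.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: renders {1, 2, 3} as "1:2:3".
std::string JoinWithColons(const std::vector<int>& values) {
  std::string out;
  for (int v : values) {
    out += std::to_string(v);
    out += ':';
  }
  // Guard: with an empty vector, out.size() - 1 would wrap around to the
  // maximum size_t value and resize() would throw std::length_error.
  if (!out.empty()) {
    out.resize(out.size() - 1);  // drop the trailing ':'
  }
  return out;
}
```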
04b02dd Testing asset links after config change 07 October 2016, 23:28:44 UTC
8c55bb8 Make Lock Info test multiple column families Summary: Modifies the lock info export test to test multiple column families after I was experiencing a bug while developing the MyRocks front-end for this. Test Plan: is test. Reviewers: mung Reviewed By: mung Subscribers: andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D64725 07 October 2016, 22:04:05 UTC
d062328 Revert "Support SST files with Global sequence numbers" This reverts commit ab01da5437385e3142689077c647a3b13ba3402f. 07 October 2016, 21:05:12 UTC
5cd2883 [RocksJava] Adjusted RateLimiter to 3.10.0 (#1368) Summary: - Deprecated RateLimiterConfig and GenericRateLimiterConfig - Introduced RateLimiter It is now possible to use all C++ related methods also in RocksJava. A notable method is setBytesPerSecond which can change the allowed number of bytes per second at runtime. Test Plan: make rocksdbjava make jtest Reviewers: adamretter, yhchiang, ankgup87 Subscribers: dhruba Differential Revision: https://reviews.facebook.net/D35715 07 October 2016, 19:32:21 UTC
37737c3 Expose Transaction State Publicly Summary: This exposes a transaction's state through a public api rather than through a public member variable. I also do some name refactoring. ExecutionStatus => TransactionState exec_status_ => trx_state_ Test Plan: It compiles and transaction_test passes. Reviewers: IslamAbdelRahman Reviewed By: IslamAbdelRahman Subscribers: andrewkr, mung, dhruba, sdong Differential Revision: https://reviews.facebook.net/D64689 07 October 2016, 18:58:53 UTC
2c1f952 Add facility to write only a portion of WriteBatch to WAL Summary: When constructing a write batch a client may now call MarkWalTerminationPoint() on that batch. No batch operations after this call will be written to the WAL, but they will still be inserted into the Memtable. This facility is used to remove one of the three WriteImpl calls in 2PC transactions. This produces a ~1% perf improvement.
```
RocksDB - unoptimized 2pc, sync_binlog=1, disable_2pc=off
INFO 2016-08-31 14:30:38,814 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2619 seconds. Requests/second = 28628
RocksDB - optimized 2pc , sync_binlog=1, disable_2pc=off
INFO 2016-08-31 16:26:59,442 [main]: REQUEST PHASE COMPLETED. 75000000 requests done in 2581 seconds. Requests/second = 29054
```
Test Plan: Two unit tests added. Reviewers: sdong, yiwu, IslamAbdelRahman Reviewed By: yiwu Subscribers: hermanlee4, dhruba, andrewkr Differential Revision: https://reviews.facebook.net/D64599 07 October 2016, 18:32:10 UTC
043cb62 Fix record_size in log_write_bench, swap args to std::string::assign. (#1373)
Hello and thank you for RocksDB,
I noticed when using log_write_bench that writes were always 88 bytes:
> strace -e trace=write ./log_write_bench -num_records 2 2>&1 | head -n 2
write(3, "\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371"..., 88) = 88
write(3, "\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371\371"..., 88) = 88
> strace -e trace=write ./log_write_bench -record_size 4096 -num_records 2 2>&1 | head -n 2
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 88) = 88
write(3, "\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0"..., 88) = 88
I think this should be:
<< record.assign('X', FLAGS_record_size);
>> record.assign(FLAGS_record_size, 'X');
So fill and not buffer. Otherwise I always see writes of size 88 (the decimal value for chr "X").
string& assign (const char* s, size_t n); buffer - Copies the first n characters from the array of characters pointed by s.
string& assign (size_t n, char c); fill - Replaces the current value by n consecutive copies of character c.
perl -le 'print ord "X"'
88
With the change:
> strace -e trace=write ./log_write_bench -record_size 4096 -num_records 2 2>&1 | head -n 2
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 4096) = 4096
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 4096) = 4096
> strace -e trace=write ./log_write_bench -num_records 2 2>&1 | head -n 2
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 249) = 249
write(3, "XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX"..., 249) = 249
Thanks.
https://github.com/facebook/rocksdb/commit/01c27be5fb42524c5052b4b4a23e05501e1d1421
https://reviews.facebook.net/D16239
06 October 2016, 17:45:31 UTC
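The effect of the swapped arguments can be reproduced in isolation. In the sketch below, `MakeRecordSwapped` and `MakeRecordFixed` are hypothetical helpers standing in for the benchmark code. With a `char` first and an `int` second, the call still compiles because both arguments convert implicitly, so the character 'X' (decimal 88) becomes the count and the record size is narrowed to a single fill byte.

```cpp
#include <cassert>
#include <string>

// Swapped arguments: 'X' (88) is taken as the count, and record_size is
// narrowed to a char and used as the fill byte.
std::string MakeRecordSwapped(int record_size) {
  std::string record;
  record.assign('X', record_size);
  return record;
}

// Intended call: record_size copies of 'X'.
std::string MakeRecordFixed(int record_size) {
  std::string record;
  record.assign(record_size, 'X');
  return record;
}
```

This matches the strace output in the commit message: the swapped form always writes 88 bytes, filled with byte 249 (octal \371) for a record size of 249 and with zero bytes for a record size of 4096, since 4096 truncates to 0 in one byte.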
4985f60 env_mirror: fix a few leaks (#1363) * env_mirror: fix leak from LockFile Signed-off-by: Sage Weil <sage@redhat.com> * env_mirror: instruct EnvMirror whether mirrored Envs should be destroyed The lifecycle rules for Env are frustrating and undocumented. Notably, Env::Default() should *not* be freed, but any Env instances we created should be. Explicitly instruct EnvMirror whether to clean up child Env instances. Default to false so that we do not affect existing callers. Signed-off-by: Sage Weil <sage@redhat.com> 06 October 2016, 17:43:05 UTC
5aded67 update of c.h (#1371) Added rocksdb_options_set_memtable_prefix_bloom_size_ratio function implemented in c.cc but not exported via c.h 06 October 2016, 17:37:19 UTC
912aec1 "Recent Posts" -> "All Posts" Blog sidebar shows all the posts, not just the most recent ones. 05 October 2016, 17:29:11 UTC
7cbb298 Make sure that when contributing we call out creating appropriate directories .... if they do not exist 04 October 2016, 22:38:41 UTC
a06ad47 Add top level doc information to CONTRIBUTING.md 04 October 2016, 22:27:28 UTC
3fdd5b9 A little more generic CONTRIBUTING.md 04 October 2016, 22:22:28 UTC
ed4fc31 Add link to CONTRIBUTING.md to main docs README.md 04 October 2016, 22:21:43 UTC
e4922e1 Forgot to truncate one blog post 04 October 2016, 22:20:15 UTC
6d8cd7e Add CONTRIBUTING.md for rocksdb.org contribution guidance 04 October 2016, 22:19:00 UTC
bd55e5a Fix some formatting of compaction blog post 04 October 2016, 21:33:07 UTC
0f60358 CRLF -> LF mod (including removing trailing whitespace for those files) 04 October 2016, 21:31:36 UTC
b90e29c Truncate posts on the main /blog/ page 04 October 2016, 21:20:26 UTC
0d7acad Add author fields to blog posts Now the author associated with fbid will be shown at top of blog post 04 October 2016, 21:11:04 UTC
01be441 Add GitHub link to the landing page header 04 October 2016, 20:49:13 UTC
9d6c961 Fix Mac build 04 October 2016, 01:25:10 UTC
ab01da5 Support SST files with Global sequence numbers Summary: - Update SstFileWriter to include a property for a global sequence number in the SST file `rocksdb.external_sst_file.global_seqno` - Update TableProperties to be aware of the offset of each property in the file - Update BlockBasedTableReader and Block to be able to honor the sequence number in the `rocksdb.external_sst_file.global_seqno` property and use it to overwrite all sequence numbers in the file. Something worth mentioning is that we don't update the seqno in the index block or when doing a binary search. The reason is that SST files with a global seqno are guaranteed to have only one user_key, and each key will have seqno=0 encoded in it. This means that this key is greater than any other key with seqno > 0, so we can keep the current logic for these blocks. Test Plan: unit tests Reviewers: andrewkr, yhchiang, yiwu, sdong Reviewed By: sdong Subscribers: hcz, andrewkr, dhruba Differential Revision: https://reviews.facebook.net/D62523 03 October 2016, 23:12:39 UTC