https://github.com/facebook/rocksdb

sort by:
Revision Author Date Message Commit Date
004237e fix ASAN for DeleteFilesInRange test case Summary: error message was ``` ==3095==ERROR: AddressSanitizer: stack-use-after-scope on address 0x7ffd18216c40 at pc 0x0000005edda1 bp 0x7ffd18215550 sp 0x7ffd18214d00 ... Address 0x7ffd18216c40 is located in stack of thread T0 at offset 1952 in frame #0 internal_repo_rocksdb/db_compaction_test.cc:1520 rocksdb::DBCompactionTest_DeleteFileRangeFileEndpointsOverlapBug_Test::TestBody() ``` It was unsafe to have slices referring to the temporary string objects' buffers, as those strings were destroyed before the slices were used. Fixed it by assigning the strings returned by `Key()` to local variables. Closes https://github.com/facebook/rocksdb/pull/3238 Differential Revision: D6507864 Pulled By: ajkr fbshipit-source-id: dd07de1a0070c6748c1ab4f3d7bd31f9a81889d0 07 December 2017, 19:47:51 UTC
ea3a279 bump version 07 December 2017, 03:12:48 UTC
077e810 update history 07 December 2017, 03:12:33 UTC
a7aed7c Preserve overlapping file endpoint invariant Summary: Fix for #2833. - In `DeleteFilesInRange`, use `GetCleanInputsWithinInterval` instead of `GetOverlappingInputs` to make sure we get a clean cut set of files to delete. - In `GetCleanInputsWithinInterval`, support nullptr as `begin_key` or `end_key`. - In `GetOverlappingInputsRangeBinarySearch`, move the assertion for non-empty range away from `ExtendFileRangeWithinInterval`, which should be allowed to return an empty range (via `end_index < begin_index`). Closes https://github.com/facebook/rocksdb/pull/2843 Differential Revision: D5772387 Pulled By: ajkr fbshipit-source-id: e554e8461823c6be82b21a9262a2da02b3957881 07 December 2017, 03:11:13 UTC
d1939b0 Remove `import` use from TARGETS Summary: We're moving away from `import`. The equivalent internal construct that gets the directory from `fbcode/` is `package_name()`. This is a Skylark friendly wrapper around [`get_base_path`]. The additional whitespace change is from running `python ./buckifier/buckify_rocksdb.py`. [`get_base_path`]: https://buckbuild.com/function/get_base_path.html Closes https://github.com/facebook/rocksdb/pull/3210 Reviewed By: yiwu-arbug Differential Revision: D6451242 Pulled By: zertosh fbshipit-source-id: 445757261de0ec89d5d332c1ba9af097086326dc 01 December 2017, 00:05:36 UTC
920b7df TARGETS file not include tests in opt mode Summary: Do not build the tests in opt mode, since SyncPoint and other test code will not be included. Closes https://github.com/facebook/rocksdb/pull/3204 Differential Revision: D6431154 Pulled By: yiwu-arbug fbshipit-source-id: c404ef042c1a6f679e5c1dc57600b3d8cb52fc28 30 November 2017, 20:15:53 UTC
7b3bc81 Bump version to 5.9.1 28 November 2017, 20:09:13 UTC
84afb8a Blob DB: Add statistics Summary: Adding a list of blob db counters. Also remove WaStats() which doesn't expose the stats and can be substitute by (BLOB_DB_BYTES_WRITTEN / BLOB_DB_BLOB_FILE_BYTES_WRITTEN). Closes https://github.com/facebook/rocksdb/pull/3193 Differential Revision: D6394216 Pulled By: yiwu-arbug fbshipit-source-id: 017508c8ff3fcd7ea7403c64d0f9834b24816803 28 November 2017, 20:07:49 UTC
3c3f7b1 Fix IOError on WAL write doesn't propagate to write group follower Summary: This is a simpler version of #3097 by removing all unrelated changes. Fixing the bug where concurrent writes may get Status::OK while it actually gets IOError on WAL write. This happens when multiple writes form a write batch group, and the leader get an IOError while writing to WAL. The leader failed to pass the error to followers in the group, and the followers end up returning Status::OK() while actually writing nothing. The bug only affect writes in a batch group. Future writes after the batch group will correctly return immediately with the IOError. Closes https://github.com/facebook/rocksdb/pull/3201 Differential Revision: D6421644 Pulled By: yiwu-arbug fbshipit-source-id: 1c2a455c5b73f6842423785eb8a9dbfbb191dc0e 28 November 2017, 20:06:55 UTC
9debbba Blob DB: Fix GC handling for inlined blob Summary: Garbage collection checks if the offset in blob index matches the offset of the blob value in the file. If it is a mismatch, the value is the current version. However it failed to check if the blob index is an inlined type, which don't even have an offset. Fixing it. Closes https://github.com/facebook/rocksdb/pull/3194 Differential Revision: D6394270 Pulled By: yiwu-arbug fbshipit-source-id: 7c2b9d795f1116f55f4d728086980f9b6e88ea78 28 November 2017, 20:06:48 UTC
3984b44 Fix calculating filter partition target size Summary: block_size_deviation is in percentage while the partition size is in bytes. The current code fails to take that into account resulting into very large target size for filter partitions. Closes https://github.com/facebook/rocksdb/pull/3187 Differential Revision: D6376069 Pulled By: maysamyabandeh fbshipit-source-id: 276546fc68f50e0da32c462abb46f6cf676db9b2 28 November 2017, 20:06:36 UTC
28d93ae Revert "No need for Restart Interval for meta blocks" Summary: See [issue 3169](https://github.com/facebook/rocksdb/issues/3169) for more information This reverts commit 593d3de37171d99a761ce2ab34ffa12654acd055. Closes https://github.com/facebook/rocksdb/pull/3188 Differential Revision: D6379271 Pulled By: miasantreble fbshipit-source-id: 88f9ed67ba52237ad9b6f7251db83672b62d7537 21 November 2017, 00:50:31 UTC
5652b6e Fix TARGETS lint warnings. Summary: Fix buckifier script and regenerate TARGETS file with no lint warnings. Closes https://github.com/facebook/rocksdb/pull/3170 Differential Revision: D6328993 Pulled By: yiwu-arbug fbshipit-source-id: 17d0e4ed92f676f35fed76659386611cc72b00b2 20 November 2017, 03:11:35 UTC
adee219 Blob DB: not using PinnableSlice move assignment Summary: The current implementation of PinnableSlice move assignment have an issue #3163. We are moving away from it instead of try to get the move assignment right, since it is too tricky. Closes https://github.com/facebook/rocksdb/pull/3164 Differential Revision: D6319201 Pulled By: yiwu-arbug fbshipit-source-id: 8f3279021f3710da4a4caa14fd238ed2df902c48 20 November 2017, 03:08:48 UTC
e58d377 Blob DB: Fix race condition between flush and write Summary: A race condition will happen when: * a user thread writes a value, but it hits the write stop condition because there are too many un-flushed memtables, while holding blob_db_impl.write_mutex_. * Flush is triggered and call flush begin listener and try to acquire blob_db_impl.write_mutex_. Fixing it. Closes https://github.com/facebook/rocksdb/pull/3149 Differential Revision: D6279805 Pulled By: yiwu-arbug fbshipit-source-id: 0e3c58afb78795ebe3360a2c69e05651e3908c40 20 November 2017, 03:08:34 UTC
3e29986 Blob DB: Fix release build Summary: `compression` shadow the method name in `BlobFile`. Rename it. Closes https://github.com/facebook/rocksdb/pull/3148 Differential Revision: D6274498 Pulled By: yiwu-arbug fbshipit-source-id: 7d293596530998b23b6b8a8940f983f9b6343a98 20 November 2017, 03:07:56 UTC
c468fd1 Blob DB: use compression in file header instead of global options Summary: To fix the issue of failing to decompress existing value after reopen DB with a different compression settings. Closes https://github.com/facebook/rocksdb/pull/3142 Differential Revision: D6267260 Pulled By: yiwu-arbug fbshipit-source-id: c7cf7f3e33b0cd25520abf4771cdf9180cc02a5f 20 November 2017, 03:07:50 UTC
c3efe60 Fix PinnableSlice move assignment Summary: After move assignment, we need to re-initialized the moved PinnableSlice. Also update blob_db_impl.cc to not reuse the moved PinnableSlice since it is supposed to be in an undefined state after move. Closes https://github.com/facebook/rocksdb/pull/3127 Differential Revision: D6238585 Pulled By: yiwu-arbug fbshipit-source-id: bd99f2e37406c4f7de160c7dee6a2e8126bc224e 20 November 2017, 03:07:43 UTC
a0016dc Blob DB: Fix BlobDBTest::SnapshotAndGarbageCollection asan failure Summary: Fix unreleased snapshot at the end of the test. Closes https://github.com/facebook/rocksdb/pull/3126 Differential Revision: D6232867 Pulled By: yiwu-arbug fbshipit-source-id: 651ca3144fc573ea2ab0ab20f0a752fb4a101d26 20 November 2017, 03:07:34 UTC
49e764a Blob DB: Add compaction filter to remove expired blob index entries Summary: After adding expiration to blob index in #3066, we are now able to add a compaction filter to cleanup expired blob index entries. Closes https://github.com/facebook/rocksdb/pull/3090 Differential Revision: D6183812 Pulled By: yiwu-arbug fbshipit-source-id: 9cb03267a9702975290e758c9c176a2c03530b83 20 November 2017, 03:07:26 UTC
7c27f3d Blob DB: fix snapshot handling Summary: Blob db will keep blob file if data in the file is visible to an active snapshot. Before this patch it checks whether there is an active snapshot has sequence number greater than the earliest sequence in the file. This is problematic since we take snapshot on every read, if it keep having reads, old blob files will not be cleanup. Change to check if there is an active snapshot falls in the range of [earliest_sequence, obsolete_sequence) where obsolete sequence is 1. if data is relocated to another file by garbage collection, it is the latest sequence at the time garbage collection finish 2. otherwise, it is the latest sequence of the file Closes https://github.com/facebook/rocksdb/pull/3087 Differential Revision: D6182519 Pulled By: yiwu-arbug fbshipit-source-id: cdf4c35281f782eb2a9ad6a87b6727bbdff27a45 20 November 2017, 03:07:15 UTC
37db5f3 Blob DB: Evict oldest blob file when close to blob db size limit Summary: Evict oldest blob file and put it in obsolete_files list when close to blob db size limit. The file will be delete when the `DeleteObsoleteFiles` background job runs next time. For now I set `kEvictOldestFileAtSize` constant, which controls when to evict the oldest file, at 90%. It could be tweaked or made into an option if really needed; I didn't want to expose it as an option pre-maturely as there are already too many :) . Closes https://github.com/facebook/rocksdb/pull/3094 Differential Revision: D6187340 Pulled By: sagar0 fbshipit-source-id: 687f8262101b9301bf964b94025a2fe9d8573421 20 November 2017, 03:06:56 UTC
cb9104c Blob DB: option to enable garbage collection Summary: Add an option to enable/disable auto garbage collection, where we keep counting how many keys have been evicted by either deletion or compaction and decide whether to garbage collect a blob file. Default disable auto garbage collection for now since the whole logic is not fully tested and we plan to make major change to it. Closes https://github.com/facebook/rocksdb/pull/3117 Differential Revision: D6224756 Pulled By: yiwu-arbug fbshipit-source-id: cdf53bdccec96a4580a2b3a342110ad9e8864dfe 20 November 2017, 03:04:05 UTC
f1b040c Blob DB: Fix flaky BlobDBTest::GCExpiredKeyWhileOverwriting test Summary: The test intent to wait until key being overwritten until proceed with garbage collection. It failed to wait for `PutUntil` finally finish. Fixing it. Closes https://github.com/facebook/rocksdb/pull/3116 Differential Revision: D6222833 Pulled By: yiwu-arbug fbshipit-source-id: fa9b57a772b92a66cf250b44e7975c43f62f45c5 04 November 2017, 00:11:49 UTC
d070003 dynamically change current memtable size Summary: Previously setting `write_buffer_size` with `SetOptions` would only apply to new memtables. An internal user wanted it to take effect immediately, instead of at an arbitrary future point, to prevent OOM. This PR makes the memtable's size mutable, and makes `SetOptions()` mutate it. There is one case when we preserve the old behavior, which is when memtable prefix bloom filter is enabled and the user is increasing the memtable's capacity. That's because the prefix bloom filter's size is fixed and wouldn't work as well on a larger memtable. Closes https://github.com/facebook/rocksdb/pull/3119 Differential Revision: D6228304 Pulled By: ajkr fbshipit-source-id: e44bd9d10a5f8c9d8c464bf7436070bb3eafdfc9 04 November 2017, 00:11:17 UTC
25fbd9a Remove the experimental notes about partitioning Summary: This patch will remove the existing comments that declare partitioning indexes and filters as experimental. Closes https://github.com/facebook/rocksdb/pull/3115 Differential Revision: D6222227 Pulled By: maysamyabandeh fbshipit-source-id: 6179ec43b22c518494051b674d91c9e1b54d4ac0 02 November 2017, 18:14:30 UTC
60d83df WritePrepared Txn: Move DB class to its own file Summary: Move WritePreparedTxnDB from pessimistic_transaction_db.h to its own header, write_prepared_txn_db.h Closes https://github.com/facebook/rocksdb/pull/3114 Differential Revision: D6220987 Pulled By: maysamyabandeh fbshipit-source-id: 18893fb4fdc6b809fe117dabb544080f9b4a301b 02 November 2017, 18:14:30 UTC
6778690 fix duplicate definition of GetEntryType() Summary: It's also defined in db/dbformat.cc per 7fe3b32896ecbb21d67ec52fccb713cb9bc6a644 Closes https://github.com/facebook/rocksdb/pull/3111 Differential Revision: D6219140 Pulled By: ajkr fbshipit-source-id: 0f2b14e41457334a4665c6b7e3f42f1a060a0f35 02 November 2017, 05:56:17 UTC
cd12421 release 5.9 Summary: updated HISTORY.md and version.h for the release. Closes https://github.com/facebook/rocksdb/pull/3110 Differential Revision: D6218645 Pulled By: ajkr fbshipit-source-id: 99ab8473e9088b02d7596e92351cce7a60a99e93 02 November 2017, 04:26:14 UTC
02693f6 WritePrepared Txn: ValidateSnapshot Summary: Implements ValidateSnapshot for WritePrepared txns and also adds a unit test to clarify the contract of this function. Closes https://github.com/facebook/rocksdb/pull/3101 Differential Revision: D6199405 Pulled By: maysamyabandeh fbshipit-source-id: ace509934c307ea5d26f4bbac5f836d7c80fd240 02 November 2017, 02:11:09 UTC
7fe3b32 Added support for differential snapshots Summary: The motivation for this PR is to add to RocksDB support for differential (incremental) snapshots, as snapshot of the DB changes between two points in time (one can think of it as diff between to sequence numbers, or the diff D which can be thought of as an SST file or just set of KVs that can be applied to sequence number S1 to get the database to the state at sequence number S2). This feature would be useful for various distributed storages layers built on top of RocksDB, as it should help reduce resources (time and network bandwidth) needed to recover and rebuilt DB instances as replicas in the context of distributed storages. From the API standpoint that would like client app requesting iterator between (start seqnum) and current DB state, and reading the "diff". This is a very draft PR for initial review in the discussion on the approach, i'm going to rework some parts and keep updating the PR. For now, what's done here according to initial discussions: Preserving deletes: - We want to be able to optionally preserve recent deletes for some defined period of time, so that if a delete came in recently and might need to be included in the next incremental snapshot it would't get dropped by a compaction. This is done by adding new param to Options (preserve deletes flag) and new variable to DB Impl where we keep track of the sequence number after which we don't want to drop tombstones, even if they are otherwise eligible for deletion. - I also added a new API call for clients to be able to advance this cutoff seqnum after which we drop deletes; i assume it's more flexible to let clients control this, since otherwise we'd need to keep some kind of timestamp < -- > seqnum mapping inside the DB, which sounds messy and painful to support. Clients could make use of it by periodically calling GetLatestSequenceNumber(), noting the timestamp, doing some calculation and figuring out by how much we need to advance the cutoff seqnum. - Compaction codepath in compaction_iterator.cc has been modified to avoid dropping tombstones with seqnum > cutoff seqnum. Iterator changes: - couple params added to ReadOptions, to optionally allow client to request internal keys instead of user keys (so that client can get the latest value of a key, be it delete marker or a put), as well as min timestamp and min seqnum. TableCache changes: - I modified table_cache code to be able to quickly exclude SST files from iterators heep if creation_time on the file is less then iter_start_ts as passed in ReadOptions. That would help a lot in some DB settings (like reading very recent data only or using FIFO compactions), but not so much for universal compaction with more or less long iterator time span. What's left: - Still looking at how to best plug that inside DBIter codepath. So far it seems that FindNextUserKeyInternal only parses values as UserKeys, and iter->key() call generally returns user key. Can we add new API to DBIter as internal_key(), and modify this internal method to optionally set saved_key_ to point to the full internal key? I don't need to store actual seqnum there, but I do need to store type. Closes https://github.com/facebook/rocksdb/pull/2999 Differential Revision: D6175602 Pulled By: mikhail-antonov fbshipit-source-id: c779a6696ee2d574d86c69cec866a3ae095aa900 02 November 2017, 01:56:43 UTC
17731a4 WritePrepared Txn: Optimize for recoverable state Summary: GetCommitTimeWriteBatch is currently used to store some state as part of commit in 2PC. In MyRocks it is specifically used to store some data that would be needed only during recovery. So it is not need to be stored in memtable right after each commit. This patch enables an optimization to write the GetCommitTimeWriteBatch only to the WAL. The batch will be written to memtable during recovery when the WAL is replayed. To cover the case when WAL is deleted after memtable flush, the batch is also buffered and written to memtable right before each memtable flush. Closes https://github.com/facebook/rocksdb/pull/3071 Differential Revision: D6148023 Pulled By: maysamyabandeh fbshipit-source-id: 2d09bae5565abe2017c0327421010d5c0d55eaa7 02 November 2017, 00:26:46 UTC
c1cf94c WritePrepared Txn: sort indexes before batch collapse Summary: The collapse of duplicate keys in write batch needs to sort the indexes of duplicate keys since it only checks the index in the batch with the head of the list of duplicate keys. Closes https://github.com/facebook/rocksdb/pull/3093 Differential Revision: D6186800 Pulled By: maysamyabandeh fbshipit-source-id: abc9ae8c2f1840445a5584f925cf86ecc6f37154 01 November 2017, 15:56:57 UTC
f6082d1 Blob DB: cleanup unused options Summary: * cleanup num_concurrent_simple_blobs. We don't do concurrent writes (by taking write_mutex_) so it doesn't make sense to have multiple non TTL files open. We can revisit later when we want to improve writes. * cleanup eviction callback. we don't have plan to use it now. * rename s/open_simple_blob_files_/open_non_ttl_file_/ and s/open_blob_files_/open_ttl_files_/ to avoid confusion. Closes https://github.com/facebook/rocksdb/pull/3088 Differential Revision: D6182598 Pulled By: yiwu-arbug fbshipit-source-id: 99e6f5e01fa66d31309cdb06ce48502464bac6ad 31 October 2017, 23:42:08 UTC
f5078dd Blob DB: Initialize all fields in Blob Header, Footer and Record structs Summary: Fixing un-itializations caught by valgrind. Closes https://github.com/facebook/rocksdb/pull/3103 Differential Revision: D6200195 Pulled By: sagar0 fbshipit-source-id: bf35a3fb03eb1d308e4c5ce30dee1e345d7b03b3 31 October 2017, 23:42:08 UTC
33c7d4c Make writable_file_max_buffer_size dynamic Summary: The DBOptions::writable_file_max_buffer_size can be changed dynamically. Closes https://github.com/facebook/rocksdb/pull/3053 Differential Revision: D6152720 Pulled By: shligit fbshipit-source-id: aa0c0cfcfae6a54eb17faadb148d904797c68681 31 October 2017, 20:56:35 UTC
c1be8d8 Fix removed structurally dead return statement Summary: There seems to be a typo mistake in env ReuseWritableFile func where status is being returned twice. Closes https://github.com/facebook/rocksdb/pull/3099 Differential Revision: D6196204 Pulled By: ajkr fbshipit-source-id: abb6e3e1c1e772dd485fc39e7f1b9d502fa188fe 31 October 2017, 08:26:13 UTC
4d43c6a db_stress snapshot compatibility with reopens Summary: - Release all snapshots before crashing and reopening the DB. Without this, we may attempt to release snapshots from an old DB using a new DB. That tripped an assertion. - Release multiple snapshots in the same operation if needed. Without this, we would sometimes leak snapshots. Closes https://github.com/facebook/rocksdb/pull/3098 Differential Revision: D6194923 Pulled By: ajkr fbshipit-source-id: b9c89bcca7ebcbb6c7802c616f9d1175a005aadf 31 October 2017, 08:26:08 UTC
b7bc9cc fix tracking oldest snapshot for bottom-level compaction Summary: The assertion was caught by `MySQLStyleTransactionTest/MySQLStyleTransactionTest.TransactionStressTest/5` when run in a loop. The caller doesn't track whether the released snapshot is oldest, so let this function handle that case. Closes https://github.com/facebook/rocksdb/pull/3080 Differential Revision: D6185257 Pulled By: ajkr fbshipit-source-id: 4b3015c11db5d31e46521a00af568546ef4558cd 30 October 2017, 07:55:58 UTC
792ef10 Return Status::InvalidArgument if user request sync write while disabling WAL Summary: write_options.sync = true and write_options.disableWAL is incompatible. When WAL is disabled, we are not able to persist the write immediately. Return an error in this case to avoid misuse of the options. Closes https://github.com/facebook/rocksdb/pull/3086 Differential Revision: D6176822 Pulled By: yiwu-arbug fbshipit-source-id: 1eb10028c14fe7d7c13c8bc12c0ef659f75aa071 29 October 2017, 05:11:18 UTC
6a9335d always drop tombstones compacted to bottommost level Summary: Problem was in bottommost compaction, when an L0->L0 compaction happened and L0 was bottommost. Then we'd preserve tombstones according to `Compaction::KeyNotExistsBeyondOutputLevel`, while zeroing seqnum according to `CompactionIterator::PrepareOutput`, thus triggering the assertion in `PrepareOutput`. To fix, we can just drop tombstones in L0->L0 when the output is "bottommost", i.e., the compaction includes the oldest L0 file and there's nothing at lower levels. Closes https://github.com/facebook/rocksdb/pull/3085 Differential Revision: D6175742 Pulled By: ajkr fbshipit-source-id: 8ab19a2e001496f362e9eb0a71757e2f6ecfdb3b 27 October 2017, 22:56:35 UTC
84a04af TableProperty::oldest_key_time defaults to 0 Summary: We don't propagate TableProperty::oldest_key_time on compaction and just write the default value to SST files. It is more natural to default the value to 0. Also revert db_sst_test back to before #2842. Closes https://github.com/facebook/rocksdb/pull/3079 Differential Revision: D6165702 Pulled By: yiwu-arbug fbshipit-source-id: ca3ce5928d96ae79a5beb12bb7d8c640a71478a0 27 October 2017, 22:00:05 UTC
0599315 Mark files as trash by using .trash extension Summary: SstFileManager move files that need to be deleted into a trash directory. Deprecate this behaviour and instead add ".trash" extension to files that need to be deleted Closes https://github.com/facebook/rocksdb/pull/2970 Differential Revision: D5976805 Pulled By: IslamAbdelRahman fbshipit-source-id: 27374ece4315610b2792c30ffcd50232d4c9a343 27 October 2017, 20:27:12 UTC
3ebb7ba Blob DB: update blob file format Summary: Changing blob file format and some code cleanup around the change. The change with blob log format are: * Remove timestamp field in blob file header, blob file footer and blob records. The field is not being use and often confuse with expiration field. * Blob file header now come with column family id, which always equal to default column family id. It leaves room for future support of column family. * Compression field in blob file header now is a standalone byte (instead of compact encode with flags field) * Blob file footer now come with its own crc. * Key length now being uint64_t instead of uint32_t * Blob CRC now checksum both key and value (instead of value only). * Some reordering of the fields. The list of cleanups: * Better inline comments in blob_log_format.h * rename ttlrange_t and snrange_t to ExpirationRange and SequenceRange respectively. * simplify blob_db::Reader * Move crc checking logic to inside blob_log_format.cc Closes https://github.com/facebook/rocksdb/pull/3081 Differential Revision: D6171304 Pulled By: yiwu-arbug fbshipit-source-id: e4373e0d39264441b7e2fbd0caba93ddd99ea2af 27 October 2017, 20:27:12 UTC
682db81 Enable cacheline_aligned_alloc() to allocate from jemalloc if enabled. Summary: Reuse WITH_JEMALLOC option in preparation for module search unification. Move jemalloc overrides into a separate .cc Remote obsolete JEMALLOC_NOINIT option. Closes https://github.com/facebook/rocksdb/pull/3078 Differential Revision: D6174826 Pulled By: yiwu-arbug fbshipit-source-id: 9970a0289b4490272d15853920d9d7531af91140 27 October 2017, 20:27:12 UTC
d9240b5 Fix coverity uninitialized fields warnings in lru_cache Summary: Coverity uninitialized member variable warnings in lru_cache Closes https://github.com/facebook/rocksdb/pull/3082 Differential Revision: D6173062 Pulled By: sagar0 fbshipit-source-id: 7bcfc653457bd362d46045d06527838c9a6adad6 27 October 2017, 18:26:43 UTC
50e95a6 Fix coverity issues column_family, compaction_db/iterator Summary: db/column_family.h : 79 ColumnFamilyHandleInternal() CID 1322806 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR) 2. uninit_member: Non-static class member internal_cfd_ is not initialized in this constructor nor in any functions that it calls. 80 : ColumnFamilyHandleImpl(nullptr, nullptr, nullptr) {} db/compacted_db_impl.cc: 18CompactedDBImpl::CompactedDBImpl( 19 const DBOptions& options, const std::string& dbname) 20 : DBImpl(options, dbname) { 2. uninit_member: Non-static class member cfd_ is not initialized in this constructor nor in any functions that it calls. 4. uninit_member: Non-static class member version_ is not initialized in this constructor nor in any functions that it calls. CID 1396120 (#1 of 1): Uninitialized pointer field (UNINIT_CTOR) 6. uninit_member: Non-static class member user_comparator_ is not initialized in this constructor nor in any functions that it calls. 21} db/compaction_iterator.cc: 9. uninit_member: Non-static class member current_user_key_sequence_ is not initialized in this constructor nor in any functions that it calls. 11. uninit_member: Non-static class member current_user_key_snapshot_ is not initialized in this constructor nor in any functions that it calls. CID 1419855 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR) 13. uninit_member: Non-static class member current_key_committed_ is not initialized in this constructor nor in any functions that it calls. Closes https://github.com/facebook/rocksdb/pull/3084 Differential Revision: D6172999 Pulled By: sagar0 fbshipit-source-id: 084d73393faf8022c01359cfb445807b6a782460 27 October 2017, 18:26:42 UTC
47166ba Fix coverity uninitialized fields warnings Pulled By: ajkr Differential Revision: D6170448 fbshipit-source-id: 5fd6d1608fc0df27c94d9f5059315ce7f79b8f5c 27 October 2017, 04:11:50 UTC
67b29e2 Fix coverity issue for MutableDBOptions default constructor Summary: 228MutableDBOptions::MutableDBOptions() 229 : max_background_jobs(2), 230 base_background_compactions(-1), 231 max_background_compactions(-1), 232 avoid_flush_during_shutdown(false), 233 delayed_write_rate(2 * 1024U * 1024U), 234 max_total_wal_size(0), 235 delete_obsolete_files_period_micros(6ULL * 60 * 60 * 1000000), 236 stats_dump_period_sec(600), 2. uninit_member: Non-static class member bytes_per_sync is not initialized in this constructor nor in any functions that it calls. CID 1419857 (#1 of 1): Uninitialized scalar field (UNINIT_CTOR) 4. uninit_member: Non-static class member wal_bytes_per_sync is not initialized in this constructor nor in any functions that it calls. 237 max_open_files(-1) {} Closes https://github.com/facebook/rocksdb/pull/3069 Differential Revision: D6170424 Pulled By: ajkr fbshipit-source-id: 1f94e86b87611ad2330b8b1707911150978d68b8 27 October 2017, 03:56:45 UTC
9566738 implement lower bound for iterators Summary: - for `SeekToFirst()`, just convert it to a regular `Seek()` if lower bound is specified - for operations that iterate backwards over user keys (`SeekForPrev`, `SeekToLast`, `Prev`), change `PrevInternal` to check whether user key went below lower bound every time the user key changes -- same approach we use to ensure we stay within a prefix when `prefix_same_as_start=true`. Closes https://github.com/facebook/rocksdb/pull/3074 Differential Revision: D6158654 Pulled By: ajkr fbshipit-source-id: cb0e3a922e2650d2cd4d1c6e1c0f1e8b729ff518 27 October 2017, 00:27:42 UTC
5a2a648 Blob DB: Inline small values in base DB Summary: Adding the `min_blob_size` option to allow storing small values in base db (in LSM tree) together with the key. The goal is to improve performance for small values, while taking advantage of blob db's low write amplification for large values. Also adding expiration timestamp to blob index. It will be useful to evict stale blob indexes in base db by adding a compaction filter. I'll work on the compaction filter in future patches. See blob_index.h for the new blob index format. There are 4 cases when writing a new key: * small value w/o TTL: put in base db as normal value (i.e. ValueType::kTypeValue) * small value w/ TTL: put (type, expiration, value) to base db. * large value w/o TTL: write value to blob log and put (type, file, offset, size, compression) to base db. * large value w/TTL: write value to blob log and put (type, expiration, file, offset, size, compression) to base db. Closes https://github.com/facebook/rocksdb/pull/3066 Differential Revision: D6142115 Pulled By: yiwu-arbug fbshipit-source-id: 9526e76e19f0839310a3f5f2a43772a4ad182cd0 26 October 2017, 19:30:54 UTC
9b18cc2 single-file bottom-level compaction when snapshot released Summary: When snapshots are held for a long time, files may reach the bottom level containing overwritten/deleted keys. We previously had no mechanism to trigger compaction on such files. This particularly impacted DBs that write to different parts of the keyspace over time, as such files would never be naturally compacted due to second-last level files moving down. This PR introduces a mechanism for bottommost files to be recompacted upon releasing all snapshots that prevent them from dropping their deleted/overwritten keys. - Changed `CompactionPicker` to compact files in `BottommostFilesMarkedForCompaction()`. These are the last choice when picking. Each file will be compacted alone and output to the same level in which it originated. The goal of this type of compaction is to rewrite the data excluding deleted/overwritten keys. - Changed `ReleaseSnapshot()` to recompute the bottom files marked for compaction when the oldest existing snapshot changes, and schedule a compaction if needed. We cache the value that oldest existing snapshot needs to exceed in order for another file to be marked in `bottommost_files_mark_threshold_`, which allows us to avoid recomputing marked files for most snapshot releases. - Changed `VersionStorageInfo` to track the list of bottommost files, which is recomputed every time the version changes by `UpdateBottommostFiles()`. The list of marked bottommost files is first computed in `ComputeBottommostFilesMarkedForCompaction()` when the version changes, but may also be recomputed when `ReleaseSnapshot()` is called. - Extracted core logic of `Compaction::IsBottommostLevel()` into `VersionStorageInfo::RangeMightExistAfterSortedRun()` since logic to check whether a file is bottommost is now necessary outside of compaction. Closes https://github.com/facebook/rocksdb/pull/3009 Differential Revision: D6062044 Pulled By: ajkr fbshipit-source-id: 123d201cf140715a7d5928e8b3cb4f9cd9f7ad21 25 October 2017, 23:30:37 UTC
96e3a60 Return write error on reaching blob dir size limit Summary: I found that we continue accepting writes even when the blob db goes beyond the configured blob directory size limit. Now, we return an error for writes on reaching `blob_dir_size` limit and if `is_fifo` is set to false. (We cannot just drop any file when `is_fifo` is true.) Deleting the oldest file when `is_fifo` is true will be handled in a later PR. Closes https://github.com/facebook/rocksdb/pull/3060 Differential Revision: D6136156 Pulled By: sagar0 fbshipit-source-id: 2f11cb3f2eedfa94524fbfa2613dd64bfad7a23c 25 October 2017, 23:30:37 UTC
addfe1e Fix tombstone scans in SeekForPrev outside prefix Summary: When doing a Seek() or SeekForPrev() we should stop the moment we see a key with a different prefix as start if ReadOptions:: prefix_same_as_start was set to true Right now we don't stop if we encounter a tombstone outside the prefix while executing SeekForPrev() Closes https://github.com/facebook/rocksdb/pull/3067 Differential Revision: D6149638 Pulled By: IslamAbdelRahman fbshipit-source-id: 7f659862d2bf552d3c9104a360c79439ceba2f18 25 October 2017, 22:12:00 UTC
386a57e Fix build on OpenBSD Summary: A few simple changes to allow RocksDB to be built on OpenBSD. Let me know if any further changes are needed. Closes https://github.com/facebook/rocksdb/pull/3061 Differential Revision: D6138800 Pulled By: ajkr fbshipit-source-id: a13a17b5dc051e6518bd56a8c5efd1d24dd81b0c 24 October 2017, 20:27:38 UTC
57fcdc2 added missing subcodes and improved error message for missing enum values Summary: Java's `Status.SubCode` was out of sync with `include/rocksdb/status.h:SubCode`. When running out of disc space this led to an `IllegalArgumentException` because of an invalid status code, rather than just returning the corresponding status code without an exception. I added the missing status codes. By this, we keep the behaviour of throwing an `IllegalArgumentException` in case of newly added status codes that are defined in C but not in Java. We could think of an alternative strategy: add in Java another code "UnknownCode" which acts as a catch-all for all those status codes that are not yet mirrored from C to Java. This approach would never throw an exception but simply return a non-OK status-code. I think the current approach of throwing an Exception in case of a C/Java inconsistency is fine, but if you have some opinion on the alternative strategy, then feel free to comment here. Closes https://github.com/facebook/rocksdb/pull/3050 Differential Revision: D6129682 Pulled By: sagar0 fbshipit-source-id: f2bf44caad650837cffdcb1f93eb793b43580c66 23 October 2017, 23:42:07 UTC
66a2c44 Add DB::Properties::kEstimateOldestKeyTime Summary: With FIFO compaction we would like to get the oldest data time for monitoring. The problem is we don't have timestamp for each key in the DB. As an approximation, we expose the earliest of sst file "creation_time" property. My plan is to override the property with a more accurate value with blob db, where we actually have timestamp. Closes https://github.com/facebook/rocksdb/pull/2842 Differential Revision: D5770600 Pulled By: yiwu-arbug fbshipit-source-id: 03833c8f10bbfbee62f8ea5c0d03c0cafb5d853a 23 October 2017, 22:27:27 UTC
d2a65c5 Fix unused var warnings in Release mode Summary: MSVC does not support unused attribute at this time. A separate assignment line fixes the issue probably by being counted as usage for MSVC and it no longer complains about unused var. Closes https://github.com/facebook/rocksdb/pull/3048 Differential Revision: D6126272 Pulled By: maysamyabandeh fbshipit-source-id: 4907865db45fd75a39a15725c0695aaa17509c1f 23 October 2017, 21:27:04 UTC
63822eb Enable two write queues for transactions Summary: Enable concurrent_prepare flag for WritePrepared transactions and extend the existing transaction tests with this config. Closes https://github.com/facebook/rocksdb/pull/3046 Differential Revision: D6106534 Pulled By: maysamyabandeh fbshipit-source-id: 88c8d21d45bc492beb0a131caea84a2ac5e7d38c 23 October 2017, 21:27:04 UTC
a02ed12 Exclude DBTest.DynamicFIFOCompactionOptions test under RocksDB Lite Summary: This test shouldn't be enabled under the lite version; and this fixes the failing contrun test due to #3006. Closes https://github.com/facebook/rocksdb/pull/3056 Differential Revision: D6114681 Pulled By: sagar0 fbshipit-source-id: dc5243549ae6b1353cec7edb820c771d95f66dda 21 October 2017, 00:11:39 UTC
3ef55d2 Split CompactionFilterWithValueChange Summary: The test currently times out when it is run under tsan. This patch split it into 4 tests. Closes https://github.com/facebook/rocksdb/pull/3047 Differential Revision: D6106515 Pulled By: maysamyabandeh fbshipit-source-id: 03a28cdf8b1c097be2361b1b0cc3dc1acf2b5d63 20 October 2017, 22:42:07 UTC
d75793d db_stress support long-held snapshots Summary: Add options to `db_stress` (correctness testing tool) to randomly acquire snapshot and release it after some period of time. It's useful for correctness testing of #3009, as well as other parts of compaction that behave differently depending on which snapshots are held. Closes https://github.com/facebook/rocksdb/pull/3038 Differential Revision: D6086501 Pulled By: ajkr fbshipit-source-id: 3ec0d8666c78ac507f1f808887c4ff759ba9b865 20 October 2017, 22:26:59 UTC
f8b5bb2 remove unused code Summary: fixup 6a541afcc4d1e5b6e6d78e288b9bee3bb2a933b5. This code didn't do anything because (1) `bytes_per_sync` is assigned in `EnvOptions`'s constructor; and (2) `OptimizeForCompactionTableWrite`'s return value was ignored, even though its only purpose is to return something. Closes https://github.com/facebook/rocksdb/pull/3055 Differential Revision: D6114132 Pulled By: ajkr fbshipit-source-id: ea4831770930e9cf83518e13eb2e1934d1f5487c 20 October 2017, 21:11:52 UTC
57fd4a8 update HISTORY with recent changes Summary: We should mention these: - `EventListener::OnStallConditionsChanged()` in 01542400a87ad130aed790bf895c029082745cbe - `DeleteRange()` fix in 966b32b57c673774ea677fe3a40d255e6c8916ba Closes https://github.com/facebook/rocksdb/pull/3054 Differential Revision: D6113989 Pulled By: ajkr fbshipit-source-id: d5e058e1ab07570df22936e8d5939fb30fb4d381 20 October 2017, 20:56:49 UTC
ee2b1ec Fix unstable floating point exception Summary: Fix unstable floating point exception, tested on Windows, 64-bit build. The problem appeared in `SetCapacity()` method at line `high_pri_pool_capacity_ = capacity_ * high_pri_pool_ratio_;` `high_pri_pool_ratio_` was not initialized at that moment, because `SetHighPriorityPoolRatio()` is called after `SetCapacity()`. So, `high_pri_pool_ratio_` contained garbage, which caused "Floating point exception" sometimes. Closes https://github.com/facebook/rocksdb/pull/3052 Differential Revision: D6111161 Pulled By: yiwu-arbug fbshipit-source-id: d170329111ad12b4bf9bbcf37bcb6411523438ae 20 October 2017, 17:12:49 UTC
f0804db Make FIFO compaction options dynamically configurable Summary: ColumnFamilyOptions::compaction_options_fifo and all its sub-fields can be set dynamically now. Some of the ways in which the fifo compaction options can be set are: - `SetOptions({{"compaction_options_fifo", "{max_table_files_size=1024}"}})` - `SetOptions({{"compaction_options_fifo", "{ttl=600;}"}})` - `SetOptions({{"compaction_options_fifo", "{max_table_files_size=1024;ttl=600;}"}})` - `SetOptions({{"compaction_options_fifo", "{max_table_files_size=51;ttl=49;allow_compaction=true;}"}})` Most of the code has been made generic enough so that it could be reused later to make universal options (and other such nested defined-types) dynamic with very few lines of parsing/serializing code changes. Introduced a few new functions like `ParseStruct`, `SerializeStruct` and `GetStringFromStruct`. The duplicate code in `GetStringFromDBOptions` and `GetStringFromColumnFamilyOptions` has been moved into `GetStringFromStruct`. So they become just simple wrappers now. Closes https://github.com/facebook/rocksdb/pull/3006 Differential Revision: D6058619 Pulled By: sagar0 fbshipit-source-id: 1e8f78b3374ca5249bb4f3be8a6d3bb4cbc52f92 19 October 2017, 22:26:36 UTC
ebab2e2 Enable MSVC W4 with a few exceptions. Fix warnings and bugs Summary: Closes https://github.com/facebook/rocksdb/pull/3018 Differential Revision: D6079011 Pulled By: yiwu-arbug fbshipit-source-id: 988a721e7e7617967859dba71d660fc69f4dff57 19 October 2017, 17:57:12 UTC
b749994 Update RocksDB Authors File Summary: Update RocksDB Authors File. Reviewed By: yiwu-arbug Differential Revision: D6075453 fbshipit-source-id: dff52f483aab33c41de391f145a8273acfd6cbde 18 October 2017, 21:42:10 UTC
7deed2b Fix a typo in a comment Summary: instad of for specific level -> instead of a specific level Closes https://github.com/facebook/rocksdb/pull/3040 Differential Revision: D6090811 Pulled By: sagar0 fbshipit-source-id: 499edef0a6f596c448f61791e6aca8f5cce08e9c 18 October 2017, 19:32:28 UTC
7e38238 WritePrepared Txn: Disable GC during recovery Summary: Disables GC during recovery of a WritePrepared txn db to avoid GCing uncommitted key values. Closes https://github.com/facebook/rocksdb/pull/2980 Differential Revision: D6000191 Pulled By: maysamyabandeh fbshipit-source-id: fc4d522c643d24ebf043f811fe4ecd0dd0294675 18 October 2017, 16:11:50 UTC
7891af8 expose a hook to skip tables during iteration Summary: As discussed on the mailing list (["Skipping entire SSTs while iterating"](https://groups.google.com/forum/#!topic/rocksdb/ujHCJVLrHlU)), this patch adds a `table_filter` to `ReadOptions` that allows specifying a callback to be executed during iteration before each table in the database is scanned. The callback is passed the table's properties; the table is scanned iff the callback returns true. This can be used in conjunction with a `TablePropertiesCollector` to dramatically speed up scans by skipping tables that are known to contain irrelevant data for the scan at hand. We're using this [downstream in CockroachDB](https://github.com/cockroachdb/cockroach/blob/master/pkg/storage/engine/db.cc#L2009-L2022) already. With this feature, under ideal conditions, we can reduce the time of an incremental backup in from hours to seconds. FYI, the first commit in this PR fixes a segfault that I unfortunately have not figured out how to reproduce outside of CockroachDB. I'm hoping you accept it on the grounds that it is not correct to return 8-byte aligned memory from a call to `malloc` on some 64-bit platforms; one correct approach is to infer the necessary alignment from `std::max_align_t`, as done here. As noted in the first commit message, the bug is tickled by having a`std::function` in `struct ReadOptions`. That is, the following patch alone is enough to cause RocksDB to segfault when run from CockroachDB on Darwin. ```diff --- a/include/rocksdb/options.h +++ b/include/rocksdb/options.h @@ -1546,6 +1546,13 @@ struct ReadOptions { // Default: false bool ignore_range_deletions; + // A callback to determine whether relevant keys for this scan exist in a + // given table based on the table's properties. The callback is passed the + // properties of each table during iteration. If the callback returns false, + // the table will not be scanned. + // Default: empty (every table will be scanned) + std::function<bool(const TableProperties&)> table_filter; + ReadOptions(); ReadOptions(bool cksum, bool cache); }; ``` /cc danhhz Closes https://github.com/facebook/rocksdb/pull/2265 Differential Revision: D5054262 Pulled By: yiwu-arbug fbshipit-source-id: dd6b28f2bba6cb8466250d8c5c542d3c92785476 18 October 2017, 05:12:00 UTC
eaaef91 Blob DB: Store blob index as kTypeBlobIndex in base db Summary: Blob db insert blob index to base db as kTypeBlobIndex type, to tell apart values written by plain rocksdb or blob db. This is to make it possible to migrate from existing rocksdb to blob db. Also with the patch blob db garbage collection get away from OptimisticTransaction. Instead it use a custom write callback to achieve similar behavior as OptimisticTransaction. This is because we need to pass the is_blob_index flag to DBImpl::Get but OptimisticTransaction don't support it. Closes https://github.com/facebook/rocksdb/pull/3000 Differential Revision: D6050044 Pulled By: yiwu-arbug fbshipit-source-id: 61dc72ab9977625e75f78cd968e7d8a3976e3632 18 October 2017, 00:28:11 UTC
0552029 Blob DB: not writing sequence number as blob record footer Summary: Previously each time we write a blob we write blog_record_header + key + value + blob_record_footer to blob log. The footer only contains a sequence and a crc for the sequence number. The sequence number was used in garbage collection to verify the value is recent. After #2703 we moved to use optimistic transaction and no longer use sequence number from the footer. Remove the footer altogether. There's another usage of sequence number and we are keeping it: Each blob log file keep track of sequence number range of keys in it, and use it to check if it is reference by a snapshot, before being deleted. Closes https://github.com/facebook/rocksdb/pull/3005 Differential Revision: D6057585 Pulled By: yiwu-arbug fbshipit-source-id: d6da53c457a316e9723f359a1b47facfc3ffe090 17 October 2017, 19:13:08 UTC
966b32b fix delete range bug Summary: Fix this [issue](https://github.com/facebook/rocksdb/issues/2989). ajkr PTAL Close #2989 Closes https://github.com/facebook/rocksdb/pull/3017 Differential Revision: D6078541 Pulled By: yiwu-arbug fbshipit-source-id: ef3db87b37b9156f83ca468aa39dea1f6dbde49d 17 October 2017, 18:13:19 UTC
c0208df arena: derive alignment unit from std::max_align_t Summary: As raised in #2265, the arena allocator will return memory that is improperly aligned to store a `std::function` on macOS. Oddly, I'm unable to tickle this bug without adding a `std::function` field to `struct ReadOptions`—but my proposal in #2265 does exactly that. In any case, here's a simple reproduction. Apply this bogus patch to get a `std::function` into `struct ReadOptions` ``` --- a/include/rocksdb/options.h +++ b/include/rocksdb/options.h @@ -1035,6 +1035,8 @@ struct ReadOptions { // Default: 0 uint64_t max_skippable_internal_keys; + std::function<void()> foo; + ReadOptions(); ReadOptions(bool cksum, bool cache); }; ``` then compile `db_properties_test` *with ubsan* and run `ReadLatencyHistogramByLevel`: ``` $ make COMPILE_WITH_UBSAN=1 db_properties_test $ ./db_properties_test --gtest_filter=DBPropertiesTest.ReadLatencyHistogramByLevel ``` ubsan will complain about several misaligned accesses: ``` Note: Google Test filter = DBPropertiesTest.ReadLatencyHistogramByLevel [==========] Running 1 test from 1 test case. [----------] Global test environment set-up. [----------] 1 test from DBPropertiesTest [ RUN ] DBPropertiesTest.ReadLatencyHistogramByLevel util/coding.h:372:12: runtime error: load of misaligned address 0x00010d85516c for type 'const unsigned long', which requires 8 byte alignment 0x00010d85516c: note: pointer points here 01 00 34 57 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 24 82 0a 01 00 00 00 ^ util/coding.h:362:3: runtime error: store to misaligned address 0x7fff5733fac4 for type 'unsigned long', which requires 8 byte alignment 0x7fff5733fac4: note: pointer points here 01 00 00 00 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 80 1d 96 0d 01 00 00 00 ^ util/coding.h:372:12: runtime error: load of misaligned address 0x00010d85516c for type 'const unsigned long', which requires 8 byte alignment 0x00010d85516c: note: pointer points here 01 00 34 57 00 00 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 78 24 82 0a 01 00 00 00 ^ version_set.cc:854: runtime error: constructor call on misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ version_set.cc:512: runtime error: constructor call on misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ version_set.cc:505: runtime error: constructor call on misaligned address 0x00010dbfa5e8 for type 'rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ options.h:931: runtime error: constructor call on misaligned address 0x00010dbfa5e8 for type 'rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ options.h:931: runtime error: constructor call on misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ functional:1583: runtime error: constructor call on misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1585:9: runtime error: member access within misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1585:9: runtime error: store to misaligned address 0x00010dbfa648 for type '__base *' (aka '__base<void ()> *'), which requires 16 byte alignment 0x00010dbfa648: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ db/version_set.cc:864:29: runtime error: upcast of misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:521:12: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:521:12: runtime error: load of misaligned address 0x00010dbfa5d8 for type 'rocksdb::TableCache *', which requires 16 byte alignment 0x00010dbfa5d8: note: pointer points here 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 ^ db/version_set.cc:522:9: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:522:9: runtime error: reference binding to misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ db/version_set.cc:522:24: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:522:38: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:522:57: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:522:57: runtime error: load of misaligned address 0x00010dbfa678 for type 'rocksdb::RangeDelAggregator *', which requires 16 byte alignment 0x00010dbfa678: note: pointer points here 01 00 00 00 d0 a1 bf 0d 01 00 00 00 00 00 00 00 00 00 00 00 f8 db 70 0a 01 00 00 00 00 00 00 00 ^ db/version_set.cc:523:54: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:523:54: runtime error: load of misaligned address 0x00010dbfa668 for type 'rocksdb::HistogramImpl *', which requires 16 byte alignment 0x00010dbfa668: note: pointer points here 01 00 00 00 c8 88 a5 0d 01 00 00 00 00 00 00 00 01 00 00 00 d0 a1 bf 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:524:9: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:524:47: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:524:62: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/table_cache.cc:228:33: runtime error: reference binding to misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ table/block_based_table_reader.cc:1554:41: runtime error: reference binding to misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ table/block_based_table_reader.cc:1396:21: runtime error: reference binding to misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ include/rocksdb/options.h:931:8: runtime error: reference binding to misaligned address 0x00010dbfa628 for type 'const std::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1584:13: runtime error: load of misaligned address 0x00010dbfa648 for type '__base *const' (aka '__base<void ()> *const'), which requires 16 byte alignment 0x00010dbfa648: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c8 a5 97 0d 01 00 00 00 38 36 9b 0d ^ table/block_based_table_reader.cc:1555:24: runtime error: reference binding to misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ db/table_cache.cc:244:54: runtime error: load of misaligned address 0x00010dbfa618 for type 'const bool', which requires 16 byte alignment 0x00010dbfa618: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ db/table_cache.cc:246:49: runtime error: reference binding to misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ db/version_set.cc:532:12: runtime error: member access within misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ db/version_set.cc:532:12: runtime error: member access within misaligned address 0x00010dbfa5e8 for type 'const rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ db/version_set.cc:532:26: runtime error: load of misaligned address 0x00010dbfa5f8 for type 'const rocksdb::Slice *const', which requires 16 byte alignment 0x00010dbfa5f8: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ version_set.cc:493: runtime error: member call on misaligned address 0x00010dbfa5c8 for type 'rocksdb::(anonymous namespace)::LevelFileIteratorState', which requires 16 byte alignment 0x00010dbfa5c8: note: pointer points here 00 00 00 00 a0 db 70 0a 01 00 00 00 00 00 00 00 00 00 00 00 90 14 98 0d 01 00 00 00 00 00 00 00 ^ version_set.cc:493: runtime error: member call on misaligned address 0x00010dbfa5e8 for type 'rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ options.h:931: runtime error: member call on misaligned address 0x00010dbfa5e8 for type 'rocksdb::ReadOptions', which requires 16 byte alignment 0x00010dbfa5e8: note: pointer points here 00 00 00 00 01 01 ff ff ff ff ff ff 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ options.h:931: runtime error: member call on misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ functional:1765: runtime error: member call on misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1766:9: runtime error: member access within misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1766:9: runtime error: load of misaligned address 0x00010dbfa648 for type '__base *' (aka '__base<void ()> *'), which requires 16 byte alignment 0x00010dbfa648: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c8 a5 97 0d 01 00 00 00 38 36 9b 0d ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1766:27: runtime error: member access within misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1768:14: runtime error: member access within misaligned address 0x00010dbfa628 for type 'std::__1::function<void ()>', which requires 16 byte alignment 0x00010dbfa628: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ^ /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/../include/c++/v1/functional:1768:14: runtime error: load of misaligned address 0x00010dbfa648 for type '__base *' (aka '__base<void ()> *'), which requires 16 byte alignment 0x00010dbfa648: note: pointer points here 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 c8 a5 97 0d 01 00 00 00 38 36 9b 0d ^ [ OK ] DBPropertiesTest.ReadLatencyHistogramByLevel (1599 ms) [----------] 1 test from DBPropertiesTest (1599 ms total) [----------] Global test environment tear-down [==========] 1 test from 1 test case ran. (1599 ms total) [ PASSED ] 1 test. ``` So it seems the root cause is that the internal implementation of `std::function` on macOS (and perhaps with libc++ generally?) requires 16-byte aligned memory, but the arena allocator only guarantees that the returned memory will be `sizeof(void*)` aligned, which is only 8-byte alignment on my machine. This patch solves the problem by adjusting the allocator to derive the necessary alignment from `alignof(std::max_align_t)`, which is properly 16 bytes on my machine. As I mentioned in #2265, none of RocksDB's tests will cause this unaligned access to actually abort the process, but, on macOS, linking CockroachDB against a version of RocksDB with the above patch and letting it run for just a few seconds will cause a SIGABRT. ``` Process 19792 stopped * thread #2, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT) frame #0: 0x0000000004f5e78f cockroach`DBNewIter + 95 cockroach`DBNewIter: -> 0x4f5e78f <+95>: callq *0x28(%rax) 0x4f5e792 <+98>: jmp 0x4f5e79e ; <+110> 0x4f5e794 <+100>: movq -0x50(%rbp), %rcx 0x4f5e798 <+104>: movq %rax, %rdi (lldb) bt * thread #2, stop reason = EXC_BAD_ACCESS (code=EXC_I386_GPFLT) * frame #0: 0x0000000004f5e78f cockroach`DBNewIter + 95 ``` I'd get you a backtrace, but [Go doesn't include cgo debug information on macOS](https://github.com/golang/go/issues/6942). I've also tried building against libc++ on Linux, where debug information would be available, but I can't seem to trigger the bug there. In any case, this PR both fixes the segfault in CockroachDB and fixes the warnings reported by ubsan. Closes https://github.com/facebook/rocksdb/pull/2347 Differential Revision: D5108596 Pulled By: yiwu-arbug fbshipit-source-id: bd5e4323b2ce915ed4fe78e123cb8996aec75a00 17 October 2017, 18:13:19 UTC
b8cea7c VersionBuilder: Erase with iterators for better performance Summary: Closes https://github.com/facebook/rocksdb/pull/3007 Differential Revision: D6077701 Pulled By: yiwu-arbug fbshipit-source-id: a6fd5b8a23f4feb1660b9ce027f651a7e90352b3 17 October 2017, 17:12:37 UTC
f7843f3 Move ~Comparator define to comparator.h Summary: When I impl my own comparator, and build in release mode. The following compile error occurs. undefined reference to `typeinfo for rocksdb::Comparator' This fix allows users build with RTTI off when has their own comparator. Closes https://github.com/facebook/rocksdb/pull/3008 Differential Revision: D6077354 Pulled By: yiwu-arbug fbshipit-source-id: 914c26dbab72f0ad1f0e15f8666a3fb2f10bfed8 17 October 2017, 16:58:13 UTC
8e63cad fix lite build Summary: * make `checksum_type_string_map` available for lite * comment out `FilesPerLevel` in lite mode. * travis and legocastle lite build also build `all` target and run tests Closes https://github.com/facebook/rocksdb/pull/3015 Differential Revision: D6069822 Pulled By: yiwu-arbug fbshipit-source-id: 9fe92ac220e711e9e6ed4e921bd25ef4314796a0 17 October 2017, 15:57:09 UTC
7318952 db_bench randomtransaction print throughput Summary: print throughput in MB/s upon finishing randomtransaction benchmark Closes https://github.com/facebook/rocksdb/pull/3016 Differential Revision: D6070426 Pulled By: ajkr fbshipit-source-id: 69df43beed4c374a36d826e761ca3a83e1fdcbf5 17 October 2017, 01:42:25 UTC
60c09f5 print more table_options to info log Summary: print more table_options to info log Closes https://github.com/facebook/rocksdb/pull/3003 Differential Revision: D6054490 Pulled By: yiwu-arbug fbshipit-source-id: 8e6f96e08bdc906077b6c62ade419d7cb739110f 13 October 2017, 21:42:26 UTC
10ba50e Blob DB: Move BlobFile definition to a separate file Summary: simply move BlobFile definition from blob_db_impl.h to blob_file.h. Closes https://github.com/facebook/rocksdb/pull/3002 Differential Revision: D6050143 Pulled By: yiwu-arbug fbshipit-source-id: a8fb6e094fe39bdeace6279569834bc65aa64a34 13 October 2017, 21:42:26 UTC
31d3e41 PinnableSlice move assignment Summary: Allow `std::move(pinnable_slice)`. Closes https://github.com/facebook/rocksdb/pull/2997 Differential Revision: D6036782 Pulled By: yiwu-arbug fbshipit-source-id: 583fb0419a97e437ff530f4305822341cd3381fa 13 October 2017, 01:28:24 UTC
cc67b22 Add OptionsUtil class to java/CMakeLists.txt Summary: Adding OptionsUtil java class and options_util.cc to java/CMakeLists.txt, which were missed accidentally when they were introduced in #2898. Closes https://github.com/facebook/rocksdb/pull/2985 Differential Revision: D6015878 Pulled By: sagar0 fbshipit-source-id: 1abbd46db4aebad1e07ea53523eacbdcb12823e1 12 October 2017, 23:57:05 UTC
560e984 Added CompactionFilterFactory support to RocksJava Summary: This PR also includes some cleanup, bugfixes and refactoring of the Java API. However these are really pre-cursors on the road to CompactionFilterFactory support. Closes https://github.com/facebook/rocksdb/pull/1241 Differential Revision: D6012778 Pulled By: sagar0 fbshipit-source-id: 0774465940ee99001a78906e4fed4ef57068ad5c 12 October 2017, 18:12:16 UTC
8dd0a7e add comment in SuperVersion referencing logic Summary: The referencing logic is super confusing so added a comment at the part that took me longest to figure out. Closes https://github.com/facebook/rocksdb/pull/2996 Differential Revision: D6034969 Pulled By: ajkr fbshipit-source-id: 9cc2e744c1f79d6d57d378f86ed59238a5f583db 11 October 2017, 22:12:31 UTC
fb4ae4d fix DBImpl::NewInternalIterator super-version leak on failure Summary: Close #2955 Closes https://github.com/facebook/rocksdb/pull/2960 Differential Revision: D5962872 Pulled By: yiwu-arbug fbshipit-source-id: a6472d5c015bea3dc476c572ff5a5c90259e6059 11 October 2017, 21:57:43 UTC
019aa70 cmake: pass "-msse4.2" to when building crc32c.cc if HAVE_SSE42 Summary: it turns out that, with older GCC shipped from centos7, the SSE42 intrinsics are not available even with "target" specified. so we need to pass "-msse42" for checking compiler's sse4.2 support and for building crc32c.cc which uses sse4.2 intrinsics for crc32. Signed-off-by: Kefu Chai <tchaikov@gmail.com> Closes https://github.com/facebook/rocksdb/pull/2950 Differential Revision: D6032298 Pulled By: siying fbshipit-source-id: 124c946321043661b3fb0a70b6cdf4c9c5126ab4 11 October 2017, 19:26:46 UTC
38bbc87 fix travis failure for ccache command not found Summary: Travis don't have ccache installed on Mac. Only run the ccache command when it exists. https://docs.travis-ci.com/user/caching/#ccache-cache Closes https://github.com/facebook/rocksdb/pull/2990 Differential Revision: D6028837 Pulled By: yiwu-arbug fbshipit-source-id: 1e2d1c7f37be2c73773258c1fd5f24eebe7a06c6 11 October 2017, 05:26:33 UTC
0f3b369 Fix counter for memtable updates Summary: Right now in `PutCFImpl` we always increment NUMBER_KEYS_UPDATED counter for both in-place update or insertion. This PR fixes this by using the correct counter for either case. Closes https://github.com/facebook/rocksdb/pull/2986 Differential Revision: D6016300 Pulled By: miasantreble fbshipit-source-id: 0aed327522e659450d533d1c47d3a9f568fac65d 11 October 2017, 04:26:11 UTC
70aa942 fix file numbers after repair Summary: The file numbers assigned post-repair were sometimes smaller than older files' numbers due to `LogAndApply` saving the wrong next file number in the manifest. - Mark the highest file seen during repair as used before `LogAndApply` so the correct next file number will be stored. - Renamed `MarkFileNumberUsedDuringRecovery` to `MarkFileNumberUsed` since now it's used during repair in addition to during recovery - Added `TEST_Current_Next_FileNo` to expose the next file number for the unit test. Closes https://github.com/facebook/rocksdb/pull/2988 Differential Revision: D6018083 Pulled By: ajkr fbshipit-source-id: 3f25cbf74439cb8f16dd12af90b67f9f9f75e718 10 October 2017, 20:12:37 UTC
1a61ba1 compaction picker to use max_bytes_for_level_multiplier_additional Summary: Hi, As part of some optimization, we're using multiple DB locations (tmpfs and spindle) to store data and configured max_bytes_for_level_multiplier_additional. But, max_bytes_for_level_multiplier_additional is not used to compute the actual size for the level while picking the DB location. So, even if DB location does not have space, RocksDB mistakenly puts the level at that location. Can someone pls. verify the fix? Let me know any other changes required. Thanks, Jay Closes https://github.com/facebook/rocksdb/pull/2704 Differential Revision: D5992515 Pulled By: ajkr fbshipit-source-id: cbbc6c0e0a7dbdca91c72e0f37b218c4cec57e28 10 October 2017, 05:59:02 UTC
e254836 add GetLiveFiles and GetLiveFilesMetaData for BlobDB Summary: Closes https://github.com/facebook/rocksdb/pull/2976 Differential Revision: D5994759 Pulled By: miasantreble fbshipit-source-id: 985c31dccb957cb970c302f813cd07a1e8cb6438 10 October 2017, 02:56:04 UTC
8c392a3 WritePrepared Txn: Iterator Summary: On iterator create, take a snapshot, create a ReadCallback and pass the ReadCallback to the underlying DBIter to check if key is committed. Closes https://github.com/facebook/rocksdb/pull/2981 Differential Revision: D6001471 Pulled By: yiwu-arbug fbshipit-source-id: 3565c4cdaf25370ba47008b0e0cb65b31dfe79fe 10 October 2017, 00:15:28 UTC
5a38e18 Make some WriteOptions defaults more explicit Summary: Some WriteOptions defaults were not clearly documented. So, added comments to make the defaults more explicit. Closes https://github.com/facebook/rocksdb/pull/2984 Differential Revision: D6014500 Pulled By: sagar0 fbshipit-source-id: a28078818e335e42b303c1fc6fbfec692ed16c7c 09 October 2017, 23:36:52 UTC
17c6325 WritePrepare Txn: Cancel flush/compaction before destruction Summary: On WritePreparedTxnDB destruct there could be running compaction/flush holding a SnapshotChecker, which holds a pointer back to WritePreparedTxnDB. Make sure those jobs finished before destructing WritePreparedTxnDB. This is caught by TransactionTest::SeqAdvanceTest. Closes https://github.com/facebook/rocksdb/pull/2982 Differential Revision: D6002957 Pulled By: yiwu-arbug fbshipit-source-id: f1e70390c9798d1bd7959f5c8e2a1c14100773c3 07 October 2017, 03:55:53 UTC
ec6c538 WritePrepared Txn: end-to-end tests Summary: Enable WritePrepared policy for existing transaction tests. Closes https://github.com/facebook/rocksdb/pull/2972 Differential Revision: D5993614 Pulled By: maysamyabandeh fbshipit-source-id: d1eb53e2920c4e2a56434bb001231c98426f3509 06 October 2017, 21:26:45 UTC
da29eba Enable WAL for blob index Summary: Enabled WAL, during GC, for blob index which is stored on regular RocksDB. Closes https://github.com/facebook/rocksdb/pull/2975 Differential Revision: D5997384 Pulled By: sagar0 fbshipit-source-id: b76c1487d8b5be0e36c55e8d77ffe3d37d63d85b 06 October 2017, 17:59:31 UTC
d1b74b0 WritePrepared Txn: Compaction/Flush Summary: Update Compaction/Flush to support WritePreparedTxnDB: Add SnapshotChecker which is a proxy to query WritePreparedTxnDB::IsInSnapshot. Pass SnapshotChecker to DBImpl on WritePreparedTxnDB open. CompactionIterator use it to check if a key has been committed and if it is visible to a snapshot. In CompactionIterator: * check if key has been committed. If not, output uncommitted keys AS-IS. * use SnapshotChecker to check if key is visible to a snapshot when in need. * do not output key with seq = 0 if the key is not committed. Closes https://github.com/facebook/rocksdb/pull/2926 Differential Revision: D5902907 Pulled By: yiwu-arbug fbshipit-source-id: 945e037fdf0aa652dc5ba0ad879461040baa0320 06 October 2017, 17:41:53 UTC
0154240 Inform caller when rocksdb is stalling writes Summary: Add a new function in Listener to let the caller know when rocksdb is stalling writes. Closes https://github.com/facebook/rocksdb/pull/2897 Differential Revision: D5860124 Pulled By: schischi fbshipit-source-id: ee791606169aa64f772c86f817cebf02624e05e1 06 October 2017, 01:11:43 UTC
cc20ec3 WritePrepared Txn: Test sequence number 0 is visible Summary: Compaction will output keys with sequence number 0, if it is visible to earliest snapshot. Adding a test to make sure IsInSnapshot() report sequence number 0 is visible to any snapshot. Closes https://github.com/facebook/rocksdb/pull/2974 Differential Revision: D5990665 Pulled By: yiwu-arbug fbshipit-source-id: ef50ebc777ff8ca688771f3ab598c7a609b0b65e 05 October 2017, 23:26:44 UTC
back to top