f4ae1ba | Andrew Kryczka | 30 June 2017, 23:45:44 UTC | update history for OnBackgroundError and DeleteRange fix Summary: Mentioned changes: - #2477 - #2503 Closes https://github.com/facebook/rocksdb/pull/2528 Differential Revision: D5360185 Pulled By: ajkr fbshipit-source-id: 59d6ae465bcb0aa0a739317581fa3fc7871c6de6 | 30 June 2017, 23:57:57 UTC |
1cb8c6d | Siying Dong | 30 June 2017, 22:21:02 UTC | Add -enable_pipelined_write to db_bench and add two defaults Summary: Expose pipeline write in db_bench and change the default to parallel memtable inserts Closes https://github.com/facebook/rocksdb/pull/2527 Differential Revision: D5359825 Pulled By: siying fbshipit-source-id: e30755feb07ff19a731c4058acf101e02de4e197 | 30 June 2017, 22:27:03 UTC |
7604b46 | Maysam Yabandeh | 30 June 2017, 17:48:03 UTC | Update the AddDBStats in LITE Summary: Closes https://github.com/facebook/rocksdb/pull/2525 Differential Revision: D5356859 Pulled By: maysamyabandeh fbshipit-source-id: f593adad2a8aab12dcd6ab25db076eca51d30d34 | 30 June 2017, 17:56:50 UTC |
1e34d07 | Maysam Yabandeh | 30 June 2017, 16:30:03 UTC | Simplify and document sync rules for logs_ etc Summary: Adding/correcting inline comments and clarifying the sync rules. To keep them simple to reason about, the rules are a bit general, which leads to some extra synchronization. However, such synchronization is not on the fast path, and it is worth the simplicity. Closes https://github.com/facebook/rocksdb/pull/2517 Differential Revision: D5348239 Pulled By: maysamyabandeh fbshipit-source-id: ff2e59fb1e568c122d2cdbf598310f3613b7d212 | 30 June 2017, 16:42:28 UTC |
d310e0f | Andrew Kryczka | 30 June 2017, 07:00:59 UTC | Regression test for empty dedicated range deletion file Summary: Issue: #2478 Fix: #2503 The bug happened when all of these conditions were satisfied: - A subcompaction generates no keys - `RangeDelAggregator::ShouldAddTombstones()` returns true because there's at least one non-obsoleted range deletion in its map - None of the non-obsolete tombstones overlap with the subcompaction key-range Under those conditions, we were creating a dedicated file for range deletions which was left empty, thus causing an error in VersionEdit. I verified this test case fails before the #2503 fix and passes after. Closes https://github.com/facebook/rocksdb/pull/2521 Differential Revision: D5352568 Pulled By: ajkr fbshipit-source-id: f619cae39984ce9bb9b7a4e7a9ac0f2bb2ce43e9 | 30 June 2017, 07:11:25 UTC |
e9f91a5 | Maysam Yabandeh | 29 June 2017, 23:57:13 UTC | Add a fetch_add variation to AddDBStats Summary: AddDBStats is implemented as two steps, a load followed by a store, which is more efficient than fetch_add but not thread-safe. Currently we have to protect concurrent access to AddDBStats with a mutex, which is less efficient than fetch_add. This patch adds the option to use fetch_add in AddDBStats. The results for my 2PC benchmark on sysbench are: - vanilla: 68618 tps - removing mutex on AddDBStats (unsafe): 69767 tps - fetch_add for all AddDBStats: 69200 tps - fetch_add only for concurrently accessed AddDBStats (this patch): 69579 tps Closes https://github.com/facebook/rocksdb/pull/2505 Differential Revision: D5330656 Pulled By: maysamyabandeh fbshipit-source-id: af64d7bee135b0e86b4fac323a4f9d9113eaa383 | 30 June 2017, 00:12:19 UTC |
c1b375e | zhangjinpeng1987 | 29 June 2017, 22:13:02 UTC | skip generating empty sst Summary: When a compaction job outputs nothing, there is no need to generate an empty sst file, which would cause `VersionEdit::EncodeTo` to fail. ref https://github.com/facebook/rocksdb/issues/2478 Closes https://github.com/facebook/rocksdb/pull/2503 Differential Revision: D5350799 Pulled By: ajkr fbshipit-source-id: df0b4fcf3507fe1c3c435208b762e75478e00143 | 29 June 2017, 22:26:52 UTC |
67b417d | Yi Wu | 29 June 2017, 17:34:22 UTC | fix format compatible test Summary: The comma "," is not a valid separator for bash arrays. Closes https://github.com/facebook/rocksdb/pull/2516 Differential Revision: D5348101 Pulled By: yiwu-arbug fbshipit-source-id: 8f0afdac368e21076eb7366b7df7dbaaf158cf96 | 29 June 2017, 17:42:14 UTC |
afbef65 | Siying Dong | 29 June 2017, 04:37:55 UTC | Bug fix: Fast CRC Support printing is not honest Summary: 11c5d4741a1e11a1315d5ca644ce555e07e91f61 introduced a bug where IsFastCrc32Supported() returns the wrong result. Fix it. Also fix some FB internal scripts. Closes https://github.com/facebook/rocksdb/pull/2513 Differential Revision: D5343802 Pulled By: yiwu-arbug fbshipit-source-id: 057dc7ae3b262fe951413d1190ce60afc788cc05 | 29 June 2017, 04:41:42 UTC |
397ab11 | Mike Kolupaev | 29 June 2017, 04:26:03 UTC | Improve Status message for block checksum mismatches Summary: We've got some DBs where iterators return Status with message "Corruption: block checksum mismatch" all the time. That's not very informative. It would be much easier to investigate if the error message contained the file name - then we would know e.g. how old the corrupted file is, which would be very useful for finding the root cause. This PR adds file name, offset and other stuff to some block corruption-related status messages. It doesn't improve all the error messages, just a few that were easy to improve. I'm mostly interested in "block checksum mismatch" and "Bad table magic number" since they're the only corruption errors that I've ever seen in the wild. Closes https://github.com/facebook/rocksdb/pull/2507 Differential Revision: D5345702 Pulled By: al13n321 fbshipit-source-id: fc8023d43f1935ad927cef1b9c55481ab3cb1339 | 29 June 2017, 04:27:01 UTC |
18c63af | Siying Dong | 28 June 2017, 22:36:11 UTC | Make "make analyze" happy Summary: "make analyze" is reporting some errors. They are complicated to investigate, but it seems to me that they are all false positives. Anyway, I think cleaning them up is a good idea. Some of the changes are hacky but I don't know a better way. Closes https://github.com/facebook/rocksdb/pull/2508 Differential Revision: D5341710 Pulled By: siying fbshipit-source-id: 6070e430e0e41a080ef441e05e8ec827d45efab6 | 28 June 2017, 22:42:27 UTC |
01534db | Maysam Yabandeh | 28 June 2017, 20:05:52 UTC | Fix the reported asan issues Summary: This is to resolve the asan complaints. In the meantime I am working on clarifying/revisiting the sync rules. Closes https://github.com/facebook/rocksdb/pull/2510 Differential Revision: D5338660 Pulled By: yiwu-arbug fbshipit-source-id: ce6f6e0826d43a2c0bfa4328a00c78f73cd6498a | 28 June 2017, 20:13:22 UTC |
1cd45cd | Sagar Vemuri | 28 June 2017, 00:02:20 UTC | FIFO Compaction with TTL Summary: Introducing FIFO compactions with TTL. FIFO compaction is based on size only which makes it tricky to enable in production as use cases can have organic growth. A user requested an option to drop files based on the time of their creation instead of the total size. To address that request: - Added a new TTL option to FIFO compaction options. - Updated FIFO compaction score to take TTL into consideration. - Added a new table property, creation_time, to keep track of when the SST file is created. - Creation_time is set as below: - On Flush: Set to the time of flush. - On Compaction: Set to the max creation_time of all the files involved in the compaction. - On Repair and Recovery: Set to the time of repair/recovery. - Old files created prior to this code change will have a creation_time of 0. - FIFO compaction with TTL is enabled when ttl > 0. All files older than ttl will be deleted during compaction. i.e. `if (file.creation_time < (current_time - ttl)) then delete(file)`. This will enable cases where you might want to delete all files older than, say, 1 day. - FIFO compaction will fall back to the prior way of deleting files based on size if: - the creation_time of all files involved in compaction is 0. - the total size (of all SST files combined) does not drop below `compaction_options_fifo.max_table_files_size` even if the files older than ttl are deleted. This feature is not supported if max_open_files != -1 or with table formats other than Block-based. **Test Plan:** Added tests. 
**Benchmark results:** Base: FIFO with max size: 100MB :: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 readwhilewriting : 1.924 micros/op 519858 ops/sec; 13.6 MB/s (1176277 of 5000000 found) ``` With TTL (a low one for testing) :: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ TEST_TMPDIR=/dev/shm ./db_bench --benchmarks=readwhilewriting --num=5000000 --threads=16 --compaction_style=2 --fifo_compaction_max_table_files_size_mb=100 --fifo_compaction_ttl=20 readwhilewriting : 1.902 micros/op 525817 ops/sec; 13.7 MB/s (1185057 of 5000000 found) ``` Example Log lines: ``` 2017/06/26-15:17:24.609249 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609177) [db/compaction_picker.cc:1471] [default] FIFO compaction: picking file 40 with creation time 1498515423 for deletion 2017/06/26-15:17:24.609255 7fd5a45ff700 (Original Log Time 2017/06/26-15:17:24.609234) [db/db_impl_compaction_flush.cc:1541] [default] Deleted 1 files ... 2017/06/26-15:17:25.553185 7fd5a61a5800 [DEBUG] [db/db_impl_files.cc:309] [JOB 0] Delete /dev/shm/dbbench/000040.sst type=2 #40 -- OK 2017/06/26-15:17:25.553205 7fd5a61a5800 EVENT_LOG_v1 {"time_micros": 1498515445553199, "job": 0, "event": "table_file_deletion", "file_number": 40} ``` SST Files remaining in the dbbench dir, after db_bench execution completed: ``` svemuri@dev15905 ~/rocksdb (fifo-compaction) $ ls -l /dev/shm//dbbench/*.sst -rw-r--r--. 1 svemuri users 30749887 Jun 26 15:17 /dev/shm//dbbench/000042.sst -rw-r--r--. 1 svemuri users 30768779 Jun 26 15:17 /dev/shm//dbbench/000044.sst -rw-r--r--. 1 svemuri users 30757481 Jun 26 15:17 /dev/shm//dbbench/000046.sst ``` Closes https://github.com/facebook/rocksdb/pull/2480 Differential Revision: D5305116 Pulled By: sagar0 fbshipit-source-id: 3e5cfcf5dd07ed2211b5b37492eb235b45139174 | 28 June 2017, 00:11:48 UTC |
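The deletion rule described in the FIFO-with-TTL commit above can be modelled in a few lines. This is a minimal sketch, not the RocksDB implementation: `SstFile` and `pick_fifo_deletions` are hypothetical names, file numbers are assumed to increase with age order, and a `creation_time` of 0 is assumed to mean "unknown" so such files are never expired by TTL.

```python
from dataclasses import dataclass


@dataclass
class SstFile:
    number: int          # file number; lower is assumed to be older
    size: int            # bytes
    creation_time: int   # seconds since epoch; 0 means unknown (pre-TTL file)


def pick_fifo_deletions(files, current_time, ttl, max_table_files_size):
    """Return the file numbers a FIFO compaction would drop (toy model)."""
    total = sum(f.size for f in files)
    if ttl > 0:
        cutoff = current_time - ttl
        # Assumption: files with creation_time == 0 are not TTL candidates.
        expired = [f for f in files if 0 < f.creation_time < cutoff]
        remaining = total - sum(f.size for f in expired)
        # TTL deletion is used only if it brings the total under the size cap.
        if expired and remaining <= max_table_files_size:
            return [f.number for f in expired]
    # Fallback: size-based FIFO, dropping the oldest files until under the cap.
    picked = []
    for f in sorted(files, key=lambda f: f.number):
        if total <= max_table_files_size:
            break
        picked.append(f.number)
        total -= f.size
    return picked
```

With `ttl = 0`, or when every file has `creation_time == 0`, the function reduces to the size-only behaviour, which mirrors the fallback conditions listed in the commit message.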
982cec2 | Yi Wu | 27 June 2017, 21:01:14 UTC | Fix TARGETS file tests list Summary: 1. The buckifier script assumes each test "foo" comes with a .cc file of the same name (i.e. foo.cc). Update cassandra tests to follow this pattern so that the buckifier script can recognize them. 2. Add blob_db_test. Closes https://github.com/facebook/rocksdb/pull/2506 Differential Revision: D5331517 Pulled By: yiwu-arbug fbshipit-source-id: 86f3eba471fc621186ab44cbd073b6162cde8e57 | 27 June 2017, 21:12:02 UTC |
b49b371 | Yi Wu | 27 June 2017, 18:25:00 UTC | allow numa >= 2.0.8 Summary: Allow numa >= 2.0.8 in buck TARGET file. Closes https://github.com/facebook/rocksdb/pull/2504 Differential Revision: D5330550 Pulled By: yiwu-arbug fbshipit-source-id: 8ffb6167b4ad913877eac16a20a91023b31f8d41 | 27 June 2017, 18:27:02 UTC |
e517bfa | Siying Dong | 27 June 2017, 17:49:34 UTC | CLANG Tidy Summary: Closes https://github.com/facebook/rocksdb/pull/2502 Differential Revision: D5326498 Pulled By: siying fbshipit-source-id: 2f0ac6dc6ca5ddb23cecf67a278c086e52646714 | 27 June 2017, 18:00:59 UTC |
dc3d2e4 | Yi Wu | 27 June 2017, 17:48:40 UTC | update compatible test Summary: update compatible test to include 5.5 and 5.6 branch. Closes https://github.com/facebook/rocksdb/pull/2501 Differential Revision: D5325220 Pulled By: yiwu-arbug fbshipit-source-id: 5f5271491e6dd2d7b2cf73a7142f38a571553bc4 | 27 June 2017, 18:00:59 UTC |
89468c0 | Siying Dong | 27 June 2017, 00:14:11 UTC | Fix Windows build broken by 5c97a7c0664d4071768113814e9ba71fe87e18cf Summary: A typo conversion fails Windows build. Fix it. Closes https://github.com/facebook/rocksdb/pull/2500 Differential Revision: D5325962 Pulled By: siying fbshipit-source-id: 2cefdafc9afbc85f856f403af7c876b622400630 | 27 June 2017, 00:28:22 UTC |
5177861 | Ewout Prangsma | 26 June 2017, 23:52:06 UTC | Encryption at rest support Summary: This PR adds support for encrypting data stored by RocksDB when written to disk. It adds an `EncryptedEnv` override of the `Env` class with matching overrides for sequential and random access files. The encryption itself is done through a configurable `EncryptionProvider`. This class is asked to create a `BlockAccessCipherStream` for a file. This is where the actual encryption/decryption is done. Currently there is a Counter mode implementation of `BlockAccessCipherStream` with a `ROT13` block cipher (NOTE the `ROT13` is for demo purposes only!!). The Counter operation mode uses an initial counter and a random initialization vector (IV). Both are created randomly for each file and stored in a 4K (default size) block that is prefixed to that file. The `EncryptedEnv` implementation is such that clients of the `Env` class do not see this prefix (neither in the data nor in the file size). The largest part of the prefix block is also encrypted, and there is room left in it for implementation-specific settings/values/keys. To test the encryption, the `DBTestBase` class has been extended to consider a new environment variable called `ENCRYPTED_ENV`. If set, the test will set up an encrypted instance of the `Env` class to use for all tests. Typically you would run it like this: ``` ENCRYPTED_ENV=1 make check_some ``` There is also an added test that checks that some data inserted into the database is or is not "visible" on disk. With `ENCRYPTED_ENV` active it must not find plain text strings; with `ENCRYPTED_ENV` unset, it must find the plain text strings. Closes https://github.com/facebook/rocksdb/pull/2424 Differential Revision: D5322178 Pulled By: sdwilsh fbshipit-source-id: 253b0a9c2c498cc98f580df7f2623cbf7678a27f | 26 June 2017, 23:56:24 UTC |
7061912 | Siying Dong | 26 June 2017, 22:53:41 UTC | Trivial typo in HISTORY.md Summary: Closes https://github.com/facebook/rocksdb/pull/2499 Differential Revision: D5324914 Pulled By: siying fbshipit-source-id: c69827d4dddafa81e651180633943a3380cdd5bb | 26 June 2017, 22:57:08 UTC |
2a9cd87 | Yi Wu | 26 June 2017, 22:24:13 UTC | Fix jni WriteBatchThreadedTest Summary: WriteBatchThreadedTest is failing, at least on Mac. The problem seems to be that `wb` is garbage collected before we finish the write. Explicitly closing it seems to fix it. Closes https://github.com/facebook/rocksdb/pull/2482 Differential Revision: D5307379 Pulled By: yiwu-arbug fbshipit-source-id: 8ff7f8170451078c941951f5aafae83afffb7933 | 26 June 2017, 22:27:17 UTC |
0025a36 | Aaron Gao | 26 June 2017, 22:13:35 UTC | revert perf_context and io_stats to __thread Summary: https://github.com/facebook/rocksdb/pull/2380 introduces a regression by replacing __thread with ThreadLocalPtr. Revert the thread local implementation back. Closes https://github.com/facebook/rocksdb/pull/2485 Differential Revision: D5308050 Pulled By: lightmark fbshipit-source-id: 2676e9c22edf76e8133d3f4c50e2711e11a95480 | 26 June 2017, 22:27:17 UTC |
5c97a7c | Siying Dong | 26 June 2017, 20:15:55 UTC | Unit Tests for sync, range sync and file close failures Summary: Closes https://github.com/facebook/rocksdb/pull/2454 Differential Revision: D5255320 Pulled By: siying fbshipit-source-id: 0080830fa8eb5da6de25e17ba68aee91018c7913 | 26 June 2017, 20:27:58 UTC |
4cee11f | Andrew Kryczka | 26 June 2017, 20:04:20 UTC | Intra-L0 blog post Summary: as titled Closes https://github.com/facebook/rocksdb/pull/2497 Differential Revision: D5322732 Pulled By: ajkr fbshipit-source-id: 35a648a7af737032949ed99f430f4fd865ac9e9c | 26 June 2017, 20:11:41 UTC |
857e996 | Siying Dong | 26 June 2017, 19:42:21 UTC | Improve the error message for I/O related errors. Summary: Force people to write something other than file name while returning status for IOError. Closes https://github.com/facebook/rocksdb/pull/2493 Differential Revision: D5321309 Pulled By: siying fbshipit-source-id: 38bcf6c19e80831cd3e300a047e975cbb131d822 | 26 June 2017, 19:57:01 UTC |
d757355 | Siying Dong | 26 June 2017, 19:32:52 UTC | Fix bug that flush doesn't respond to fsync result Summary: Due to a regression introduced two years ago by https://github.com/facebook/rocksdb/commit/6e9fbeb27c38329f33ae541302c44c8db8374f8c , we fail to check the return status of the fsync call. This can cause us to miss information from the file system and can potentially lead to corrupted data that we could have detected. Closes https://github.com/facebook/rocksdb/pull/2495 Reviewed By: ajkr Differential Revision: D5321949 Pulled By: siying fbshipit-source-id: c68117914bb40700198fc37d0e4c63163a8a1031 | 26 June 2017, 19:41:48 UTC |
8e6345d | Maysam Yabandeh | 25 June 2017, 01:07:34 UTC | Update rename of ParanoidCheck Summary: Closes https://github.com/facebook/rocksdb/pull/2494 Differential Revision: D5317902 Pulled By: maysamyabandeh fbshipit-source-id: 097330292180816b3d0c9f4cbbdb6f68f0180200 | 25 June 2017, 01:12:03 UTC |
499ebb3 | Maysam Yabandeh | 24 June 2017, 21:06:43 UTC | Optimize for serial commits in 2PC Summary: Throughput: 46k tps in our sysbench settings (filling the details later) The idea is to have the simplest change that gives us a reasonable boost in 2PC throughput. Major design changes: 1. The WAL file internal buffer is not flushed after each write. Instead it is flushed before critical operations (WAL copy via fs) or when FlushWAL is called by MySQL. Flushing the WAL buffer is also protected via mutex_. 2. Use two sequence numbers: last seq, and last seq for write. Last seq is the last visible sequence number for reads. Last seq for write is the next sequence number that should be used to write to WAL/memtable. This allows a memtable write to proceed in parallel with WAL writes. 3. BatchGroup is not used for writes. This means that we can have parallel writers, which changes a major assumption in the code base. To accommodate that i) allow only 1 WriteImpl that intends to write to memtable via mem_mutex_--which is fine since in 2PC almost all of the memtable writes come via group commit phase which is serial anyway, ii) make all the parts in the code base that assumed to be the only writer (via EnterUnbatched) also acquire mem_mutex_, iii) stat updates are protected via a stat_mutex_. Note: the first commit has the approach figured out but is not clean. Submitting the PR anyway to get early feedback on the approach. If we are ok with the approach I will go ahead with these updates: 0) Rebase with Yi's pipelining changes 1) Currently batching is disabled by default to make sure that it will be consistent with all unit tests. Will make this optional via a config. 2) A couple of unit tests are disabled. They need to be updated with the serial commit of 2PC taken into account. 3) Replacing BatchGroup with mem_mutex_ got a bit ugly as it requires releasing mutex_ beforehand (the same way EnterUnbatched does). This needs to be cleaned up.
Closes https://github.com/facebook/rocksdb/pull/2345 Differential Revision: D5210732 Pulled By: maysamyabandeh fbshipit-source-id: 78653bd95a35cd1e831e555e0e57bdfd695355a4 | 24 June 2017, 21:11:29 UTC |
0ac4afb | Maysam Yabandeh | 24 June 2017, 01:18:21 UTC | Sanitize partitioning options Summary: We currently do not support partitioning filters if indexes are not partitioned. The patch makes sure that these two are consistent. Closes https://github.com/facebook/rocksdb/pull/2455 Differential Revision: D5275644 Pulled By: maysamyabandeh fbshipit-source-id: b61701ac8914c2206d06f5e33ff6f67b24406d1d | 24 June 2017, 01:30:01 UTC |
521724b | jsteemann | 23 June 2017, 16:32:11 UTC | fixed wrong type for "allow_compaction" parameter Summary: should be boolean, not uint64_t MSVC complains about it during compilation with error `include\rocksdb\advanced_options.h(77): warning C4800: 'uint64_t': forcing value to bool 'true' or 'false' (performance warning)` Closes https://github.com/facebook/rocksdb/pull/2487 Differential Revision: D5310685 Pulled By: siying fbshipit-source-id: 719a33b3dba4f711aa72e3f229013c188015dc86 | 23 June 2017, 16:41:19 UTC |
71f5bcb | Andrew Kryczka | 23 June 2017, 02:30:39 UTC | Introduce OnBackgroundError callback Summary: Some users want to prevent rocksdb from entering read-only mode in certain error cases. This diff gives them a callback, `OnBackgroundError`, that they can use to achieve it. - call `OnBackgroundError` every time we consider setting `bg_error_`. Use its result to assign `bg_error_` but not to change the function's return status. - classified calls using `BackgroundErrorReason` to give the callback some info about where the error happened - renamed `ParanoidCheck` to something more specific so we can provide a clear `BackgroundErrorReason` - unit tests for the most common cases: flush or compaction errors Closes https://github.com/facebook/rocksdb/pull/2477 Differential Revision: D5300190 Pulled By: ajkr fbshipit-source-id: a0ea4564249719b83428e3f4c6ca2c49e366e9b3 | 23 June 2017, 02:41:50 UTC |
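The callback contract in the `OnBackgroundError` commit above (the callback's result decides what is stored in `bg_error_`, while the failing function still returns its original status) can be illustrated with a toy model. This is a sketch under stated assumptions: `ToyDB`, `handle_background_failure`, and the two enum members are hypothetical stand-ins, and `None` models an OK status.

```python
from enum import Enum


class BackgroundErrorReason(Enum):
    # Illustrative members only; the commit mentions flush and compaction.
    FLUSH = "flush"
    COMPACTION = "compaction"


class ToyDB:
    def __init__(self, on_background_error=None):
        self.on_background_error = on_background_error
        self.bg_error = None  # None models "no background error"

    def handle_background_failure(self, reason, error):
        """Consider setting bg_error; always return the original error."""
        candidate = error
        if self.on_background_error is not None:
            # The callback may suppress (return None) or replace the error.
            candidate = self.on_background_error(reason, error)
        if candidate is not None:
            self.bg_error = candidate
        return error  # the function's own return status is unchanged

    @property
    def read_only(self):
        return self.bg_error is not None


def ignore_compaction_errors(reason, error):
    return None if reason is BackgroundErrorReason.COMPACTION else error
```

A listener such as `ignore_compaction_errors` keeps the toy database writable after a compaction failure, while flush failures still put it into read-only mode.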
88cd2d9 | Siying Dong | 22 June 2017, 23:16:19 UTC | Downgrade option sanity check level for prefix_extractor Summary: With c7004840d2f4ad5fc1bdce042902b822492f3a0e, it's safe to open a DB with a different prefix extractor. So it's safe to skip the prefix extractor check. Closes https://github.com/facebook/rocksdb/pull/2474 Differential Revision: D5294700 Pulled By: siying fbshipit-source-id: eeb500da795eecb29b8c9c56a14cfd4afda12ecc | 22 June 2017, 23:26:36 UTC |
6837a17 | Siying Dong | 22 June 2017, 22:45:42 UTC | Fix Data Race Between CreateColumnFamily() and GetAggregatedIntProperty() Summary: CreateColumnFamily() releases DB mutex after adding column family to the set and install super version (to write option file), so if users call GetAggregatedIntProperty() in the middle, then super version will be null and the process will crash. Fix it by skipping those column families without super version installed. Maybe we should also fix the problem of releasing the lock when reading option file, but it is more risky. so I'm doing a quick and safer fix and we can investigate it later. Closes https://github.com/facebook/rocksdb/pull/2475 Differential Revision: D5298053 Pulled By: siying fbshipit-source-id: 4b3c8f91c60400b163fcc6cda8a0c77723be0ef6 | 22 June 2017, 22:56:47 UTC |
af17467 | Siying Dong | 21 June 2017, 17:28:54 UTC | WriteBufferManager will not trigger flush if much data is already being flushed Summary: Even when the hard limit is hit, flushing more memtables may not help cap memory usage if more than half of the data is already scheduled for flush. In that case, do not trigger a flush. Closes https://github.com/facebook/rocksdb/pull/2469 Differential Revision: D5284249 Pulled By: siying fbshipit-source-id: 8ab7ba1aba56a634dbe72b318fcab2093063972e | 21 June 2017, 17:41:37 UTC |
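The flush decision in the WriteBufferManager commit above can be written as a small predicate. This is a sketch only: `should_trigger_flush` is a hypothetical name, and "more than half of the data" is assumed here to mean more than half of the memory currently tracked, which the commit message does not spell out.

```python
def should_trigger_flush(total_memory, memory_being_flushed, buffer_limit):
    """Toy model: decide whether hitting the limit should schedule a flush."""
    if total_memory < buffer_limit:
        return False  # limit not hit, nothing to do
    # Assumption: if over half of the tracked memory is already scheduled
    # for flush, another flush is unlikely to help, so skip it.
    return memory_being_flushed <= total_memory / 2
```

For example, with a limit of 100 and 120 in use, a flush is triggered when 30 is already being flushed but skipped when 80 is.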
9467eb6 | Andrew Kryczka | 20 June 2017, 23:43:09 UTC | Fix flush assertion with tsan Summary: DBImpl's instance variables should only be accessed with mutex held. I moved an assert later to uphold this rule. DBTest.LastWriteBufferDelay test was sporadically failing TSAN because it tried to flush around the same time the db was destroyed, so the variable was accessed simultaneously by two threads. Closes https://github.com/facebook/rocksdb/pull/2471 Differential Revision: D5286857 Pulled By: ajkr fbshipit-source-id: 435abd84efa601f667c254e320b0bb5a434b971f | 20 June 2017, 23:43:44 UTC |
048446f | Andrew Kryczka | 20 June 2017, 20:16:55 UTC | Fix cassandra ASAN use-after-free Summary: When we create a column based on the `string::c_str()`, we need to make sure that char array doesn't get deleted when calls to `string::append()` cause the string to expand. Closes https://github.com/facebook/rocksdb/pull/2470 Differential Revision: D5285049 Pulled By: ajkr fbshipit-source-id: f918dd426ff3c024e7a293dcb10448f10b6c98e8 | 20 June 2017, 20:27:16 UTC |
a21db16 | Dmitri Smirnov | 20 June 2017, 17:16:24 UTC | Implement ReopenWritableFile on Windows and other fixes Summary: Make the default impl return NotSupported so the db_blob tests exit in a meaningful manner. Replace std::thread with port::Thread Closes https://github.com/facebook/rocksdb/pull/2465 Differential Revision: D5275563 Pulled By: yiwu-arbug fbshipit-source-id: cedf1a18a2c05e20d768c1308b3f3224dbd70ab6 | 20 June 2017, 17:31:13 UTC |
c430d69 | zhangjinpeng1987 | 19 June 2017, 18:39:45 UTC | fix coredump for release nullptr Summary: A coredump is triggered when ingesting an external sst file after a delete range. ref https://github.com/facebook/rocksdb/issues/2398 Closes https://github.com/facebook/rocksdb/pull/2463 Differential Revision: D5275599 Pulled By: ajkr fbshipit-source-id: 0828dbc062ea8c74e913877cd63494fd3478a30d | 19 June 2017, 18:41:38 UTC |
0d27845 | Andrew Kryczka | 18 June 2017, 19:40:21 UTC | default implementation for InRange Summary: it's confusing to implementors of prefix extractor to implement an unused function Closes https://github.com/facebook/rocksdb/pull/2460 Differential Revision: D5267408 Pulled By: ajkr fbshipit-source-id: 2f1fe3131efc978f6098ae7a80e52bc7a0b13571 | 18 June 2017, 19:42:42 UTC |
cbd825d | Chen Shen | 16 June 2017, 21:12:52 UTC | Create a MergeOperator for Cassandra Row Value Summary: This PR implements the MergeOperator for Cassandra Row Values. Closes https://github.com/facebook/rocksdb/pull/2289 Differential Revision: D5055464 Pulled By: scv119 fbshipit-source-id: 45f276ef8cbc4704279202f6a20c64889bc1adef | 16 June 2017, 21:27:00 UTC |
2c98b06 | Maysam Yabandeh | 15 June 2017, 23:08:32 UTC | Remove pin_slice option by making it the default Summary: This would simplify db_bench_tool.cc Closes https://github.com/facebook/rocksdb/pull/2457 Differential Revision: D5259035 Pulled By: maysamyabandeh fbshipit-source-id: 0a9c3abda624070fe2650200b885ad7e1c60182c | 15 June 2017, 23:14:08 UTC |
c80c611 | Maysam Yabandeh | 15 June 2017, 23:06:22 UTC | add db_bench options for partitioning Summary: Closes https://github.com/facebook/rocksdb/pull/2456 Differential Revision: D5259083 Pulled By: maysamyabandeh fbshipit-source-id: 1ed1746da7a8baadf4772d023d927c6c4e6b112a | 15 June 2017, 23:14:08 UTC |
6a3377f | Ben Torfs | 14 June 2017, 23:57:39 UTC | Synchronize statistic enumeration values between statistics.h and java API Summary: Closes https://github.com/facebook/rocksdb/pull/2209 Differential Revision: D5251951 Pulled By: sagar0 fbshipit-source-id: 03a73d025a7b4a322bb8d8d86f5d249fcd7dd00e | 14 June 2017, 23:59:42 UTC |
53dda87 | Sagar Vemuri | 14 June 2017, 21:51:37 UTC | Do not run RateLimiterTest.Rate test on Travis+Mac OSX. Summary: RateLimiterTest.Rate has been failing continuously for many days on Travis in the Mac OSX PLATFORM_DEPENDENT test suite. Check https://travis-ci.org/facebook/rocksdb/pull_requests. Disabling this test for now, so that we can investigate in more depth. Closes https://github.com/facebook/rocksdb/pull/2451 Differential Revision: D5250147 Pulled By: sagar0 fbshipit-source-id: d58476a3c2792d20e875754d1516c4bc7174e86c | 14 June 2017, 21:58:02 UTC |
ae8571f | Yi Wu | 14 June 2017, 20:44:36 UTC | Fix blob db compression bug Summary: `CompressBlock()` will return the uncompressed slice (i.e. `Slice(value_unc)`) if compression ratio is not good enough. This is undesired. We need to always assign the compressed slice to `value`. Closes https://github.com/facebook/rocksdb/pull/2447 Differential Revision: D5244682 Pulled By: yiwu-arbug fbshipit-source-id: 6989dd8852c9622822ba9acec9beea02007dff09 | 14 June 2017, 20:56:42 UTC |
7a380de | Yi Wu | 14 June 2017, 20:08:54 UTC | Update blob_db_test Summary: I'm trying to improve unit test of blob db. I'm rewriting blob db test. In this patch: * Rewrite tests of basic put/write/delete operations. * Add disable_background_tasks to BlobDBOptionsImpl to allow me not running any background job for basic unit tests. * Move DestroyBlobDB out from BlobDBImpl to be a standalone function. * Remove all garbage collection related tests. Will rewrite them in following patch. * Disabled compression test since it is failing. Will fix in a followup patch. Closes https://github.com/facebook/rocksdb/pull/2446 Differential Revision: D5243306 Pulled By: yiwu-arbug fbshipit-source-id: 157c71ad3b699307cb88baa3830e9b6e74f8e939 | 14 June 2017, 20:12:34 UTC |
89ad9f3 | Sagar Vemuri | 13 June 2017, 23:55:08 UTC | Allow ignoring unknown options when loading options from a file Summary: Added a flag, `ignore_unknown_options`, to skip unknown options when loading an options file (using `LoadLatestOptions`/`LoadOptionsFromFile`) or while verifying options (using `CheckOptionsCompatibility`). This will help in downgrading the db to an older version. Also added `--ignore_unknown_options` flag to ldb **Example Use case:** In MyRocks, if copying from newer version to older version, it is often impossible to start because of new RocksDB options that don't exist in older version, even though data format is compatible. MyRocks uses these load and verify functions in [ha_rocksdb.cc::check_rocksdb_options_compatibility](https://github.com/facebook/mysql-5.6/blob/e004fd9f416821d043ccc8ad4a345c33ac9953f0/storage/rocksdb/ha_rocksdb.cc#L3348-L3401). **Test Plan:** Updated the unit tests. `make check` ldb: $ ./ldb --db=/tmp/test_db --create_if_missing put a1 b1 OK Now edit /tmp/test_db/<OPTIONS-file> and add an unknown option. Try loading the options now, and it fails: $ ./ldb --db=/tmp/test_db --try_load_options get a1 Failed: Invalid argument: Unrecognized option DBOptions:: abcd Passes with the new --ignore_unknown_options flag $ ./ldb --db=/tmp/test_db --try_load_options --ignore_unknown_options get a1 b1 Closes https://github.com/facebook/rocksdb/pull/2423 Differential Revision: D5212091 Pulled By: sagar0 fbshipit-source-id: 2ec17636feb47dc0351b53a77e5f15ef7cbf2ca7 | 13 June 2017, 23:58:01 UTC |
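The effect of the `ignore_unknown_options` flag in the commit above can be shown with a toy options loader. This is a minimal sketch, not the RocksDB parser: `load_db_options`, `InvalidArgument`, and the set of known option names are hypothetical, and the file format is reduced to `name=value` lines under a bracketed section header.

```python
KNOWN_DB_OPTIONS = {"create_if_missing", "max_open_files"}  # illustrative subset


class InvalidArgument(Exception):
    pass


def load_db_options(lines, ignore_unknown_options=False):
    """Parse name=value lines, optionally skipping unrecognized names."""
    options = {}
    for raw in lines:
        line = raw.strip()
        # Skip blanks, comments, and section headers such as [DBOptions].
        if not line or line.startswith("#") or line.startswith("["):
            continue
        name, _, value = line.partition("=")
        name, value = name.strip(), value.strip()
        if name not in KNOWN_DB_OPTIONS:
            if ignore_unknown_options:
                continue  # e.g. an option written by a newer version
            raise InvalidArgument(f"Unrecognized option DBOptions:: {name}")
        options[name] = value
    return options
```

Loading a file that contains `abcd=1` raises by default and succeeds with `ignore_unknown_options=True`, matching the two `ldb` invocations in the test plan.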
6b5a5dc | hyunwoo | 13 June 2017, 23:46:17 UTC | fixed typo Summary: fixed typo Closes https://github.com/facebook/rocksdb/pull/2430 Differential Revision: D5242471 Pulled By: IslamAbdelRahman fbshipit-source-id: 832eb3a4c70221444ccd2ae63217823fec56c748 | 13 June 2017, 23:58:01 UTC |
0f228be | haoxiang | 13 June 2017, 23:39:57 UTC | fixed typo in util/dynamic_bloom.h Summary: fixed a typo in util/dynamic_bloom.h Closes https://github.com/facebook/rocksdb/pull/2442 Differential Revision: D5242397 Pulled By: IslamAbdelRahman fbshipit-source-id: c47fd18cc79afff6b022201a0410c0cd47626576 | 13 June 2017, 23:41:36 UTC |
c217e0b | Andrew Kryczka | 13 June 2017, 21:51:22 UTC | Call RateLimiter for compaction reads Summary: Allow users to rate limit background work based on read bytes, written bytes, or sum of read and written bytes. Support these by changing the RateLimiter API, so no additional options were needed. Closes https://github.com/facebook/rocksdb/pull/2433 Differential Revision: D5216946 Pulled By: ajkr fbshipit-source-id: aec57a8357dbb4bfde2003261094d786d94f724e | 13 June 2017, 21:56:46 UTC |
91e2aa3 | Yi Wu | 13 June 2017, 19:37:59 UTC | write exact sequence number for each put in write batch Summary: At the beginning of write batch write, grab the latest sequence from base db and assume sequence number will increment by 1 for each put and delete, and write the exact sequence number with each put. This is assuming we are the only writer to increment sequence number (no external file ingestion, etc) and there should be no holes in the sequence number. Also having some minor naming changes. Closes https://github.com/facebook/rocksdb/pull/2402 Differential Revision: D5176134 Pulled By: yiwu-arbug fbshipit-source-id: cb4712ee44478d5a2e5951213a10b72f08fe8c88 | 13 June 2017, 19:42:36 UTC |
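The sequence-number scheme in the write batch commit above amounts to a running counter. The sketch below is illustrative only: `assign_put_sequences` is a hypothetical helper, and it relies on the commit's own assumption that this writer is the only one advancing the sequence number, so there are no holes.

```python
def assign_put_sequences(latest_sequence, batch):
    """Return (key, sequence) for each put; deletes also consume a number.

    `batch` is a list of (op, key) pairs where op is "put" or "delete".
    """
    assigned = []
    seq = latest_sequence
    for op, key in batch:
        seq += 1  # every put and delete advances the sequence by exactly 1
        if op == "put":
            assigned.append((key, seq))
    return assigned
```

Starting from a latest sequence of 10, the batch put `a`, delete `b`, put `c` records sequence 11 for `a` and 13 for `c`.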
6f4154d | Maysam Yabandeh | 13 June 2017, 17:59:22 UTC | record index partition properties Summary: When index/filter partitioning is enabled, the user might need to check the index block size as well as the top-level index size via sst_dump. This patch records i) the number of partitions and ii) the top-level index size, and makes them accessible through sst_dump. The number of partitions for filters is the same as that of indexes. The top-level index for filters has a similar size to the top-level index for indexes, so it is not repeated. Closes https://github.com/facebook/rocksdb/pull/2437 Differential Revision: D5224225 Pulled By: maysamyabandeh fbshipit-source-id: 5324598c75793523aef1bb7ee225a5475e95a9cb | 13 June 2017, 18:21:32 UTC |
5d5a28a | Siying Dong | 13 June 2017, 11:51:46 UTC | Fix Clang release build broken by 5582123dee8426a5191dfd5e846cea8c676c793c Summary: 5582123dee8426a5191dfd5e846cea8c676c793c broke the Clang release build because of an unexpected change. Fix it. Closes https://github.com/facebook/rocksdb/pull/2443 Differential Revision: D5236297 Pulled By: siying fbshipit-source-id: 1b410adf13ded149c53e8235e9ea9f3130fb5403 | 13 June 2017, 11:56:35 UTC |
0175d58 | Siying Dong | 13 June 2017, 11:34:51 UTC | Make direct I/O write use incremental buffer Summary: Currently for direct I/O, the large maximum buffer is always allocated. This is wasteful if users flush the data in much smaller chunks. This diff fixes this by changing how the incremental buffer works. When we enlarge the buffer, we copy the existing data in the buffer to the enlarged buffer, rather than flushing the buffer first. This ensures that no extra I/O is introduced because of buffer enlargement. Closes https://github.com/facebook/rocksdb/pull/2403 Differential Revision: D5178403 Pulled By: siying fbshipit-source-id: a8fe1e7304bdb8cab2973340022fe80ff83449fd | 13 June 2017, 11:41:37 UTC |
7a27006 | Anirban Rahut | 13 June 2017, 11:33:54 UTC | GNU C library for struct tm has 2 additional fields. Summary: initialize 2 additional fields tm_gmtoff and tm_zone, otherwise under strict warnings for initialization, we get errors in myrocks. Closes https://github.com/facebook/rocksdb/pull/2439 Differential Revision: D5229013 Pulled By: yiwu-arbug fbshipit-source-id: 9fc1615a1919656f36064791706ed41e10e9db84 | 13 June 2017, 11:41:35 UTC |
d713471 | Islam AbdelRahman | 12 June 2017, 23:51:37 UTC | Limit trash directory to be 25% of total DB Summary: Update DeleteScheduler to delete files immediately if trash directory is >= 25% of DB size Closes https://github.com/facebook/rocksdb/pull/2436 Differential Revision: D5230384 Pulled By: IslamAbdelRahman fbshipit-source-id: 5cbda8ac536a3cc72c774641621edc02c8202482 | 12 June 2017, 23:57:21 UTC |
9bb91e9 | Orgad Shaneh | 12 June 2017, 20:08:57 UTC | Dedup release Summary: cc tamird sagar0 Closes https://github.com/facebook/rocksdb/pull/2325 Differential Revision: D5098302 Pulled By: sagar0 fbshipit-source-id: 297c5506b5d9b2ed1d7719c8caf0b96cffe503b8 | 12 June 2017, 20:13:06 UTC |
27b4501 | Sagar Vemuri | 12 June 2017, 19:51:19 UTC | Update HistogramTypes in the Java API Summary: This diff syncs the Histogram Types in the Java API with the ones in C++ API (`statistics.h`), and brings it up-to-date. I also found that the enum ordering between Java and C++ has gotten out-of-sync, a few years back, with the addition of `SUBCOMPACTION_SETUP_TIME`. So updated the order as well. `READ_NUM_MERGE_OPERANDS` added in #2373 is needed for Cassandra-on-RocksDB work. Closes https://github.com/facebook/rocksdb/pull/2429 Differential Revision: D5215623 Pulled By: sagar0 fbshipit-source-id: bd136698c48197e53693275eb52acc9198ee5a4e | 12 June 2017, 19:57:49 UTC |
e97304c | Andrew Kryczka | 12 June 2017, 19:27:54 UTC | update history for 5.6 Summary: - mention range deletion + file ingestion - move post-5.6 stuff into new section Closes https://github.com/facebook/rocksdb/pull/2440 Differential Revision: D5229910 Pulled By: ajkr fbshipit-source-id: 1facfe41993fa1f3b1f6fa7dc77d2b11aa2b317a | 12 June 2017, 19:30:09 UTC |
5582123 | Siying Dong | 12 June 2017, 13:58:25 UTC | Sample number of reads per SST file Summary: We estimate the number of reads per SST file by updating a per-file counter in sampled read requests. This information can later be used to trigger compactions to improve read performance. Closes https://github.com/facebook/rocksdb/pull/2417 Differential Revision: D5193528 Pulled By: siying fbshipit-source-id: b4241c5ad0eaf444b61afb53f8e6290d9f5da2df | 12 June 2017, 14:12:08 UTC |
db818d2 | Siying Dong | 12 June 2017, 13:32:01 UTC | Fix RocksDB Lite build with CLANG Summary: Closes https://github.com/facebook/rocksdb/pull/2419 Differential Revision: D5193976 Pulled By: siying fbshipit-source-id: 62d115edee6043237e9d6ad3c2a05481e162c9eb | 12 June 2017, 13:41:27 UTC |
a472c4a | Aaron Gao | 09 June 2017, 18:06:34 UTC | update 5.5 change log Summary: update bug fixed. Closes https://github.com/facebook/rocksdb/pull/2434 Differential Revision: D5218601 Pulled By: lightmark fbshipit-source-id: 1f86b2c93345673612381081537d464e7d12e434 | 09 June 2017, 18:12:10 UTC |
bc09c8a | Mike Kolupaev | 09 June 2017, 02:54:00 UTC | Fix crash in PosixWritableFile::Close() when fstat() fails Summary: We had a crash in this code: `fstat()` failed; `file_stats` contained garbage, in particular `file_stats.st_blksize == 6`; the expression `file_stats.st_blocks / (file_stats.st_blksize / 512)` divided by zero. Closes https://github.com/facebook/rocksdb/pull/2420 Differential Revision: D5216110 Pulled By: al13n321 fbshipit-source-id: 6d8fc5e7c4f98c1139e68c7829ebdbac68b0fce0 | 09 June 2017, 02:56:22 UTC |
6d0f22e | Yi Wu | 09 June 2017, 00:28:33 UTC | Fix mock_env.cc uninitialized variable Summary: Mingw is complaining about uninitialized variable in mock_env.cc. e.g. https://travis-ci.org/facebook/rocksdb/jobs/240132276 The fix is to initialize the variable. Closes https://github.com/facebook/rocksdb/pull/2428 Differential Revision: D5211306 Pulled By: yiwu-arbug fbshipit-source-id: ee02bf0327dcea8590a2aa087f0176fecaf8621c | 09 June 2017, 00:41:59 UTC |
c2012d4 | Sagar Vemuri | 09 June 2017, 00:11:10 UTC | Java APIs for put, merge and delete in file ingestion Summary: Adding SSTFileWriter's newly introduced put, merge and delete apis to the Java api. The C++ APIs were first introduced in #2361. Add is deprecated in favor of Put. Merge is especially needed to support streaming for Cassandra-on-RocksDB work in https://issues.apache.org/jira/browse/CASSANDRA-13476. Closes https://github.com/facebook/rocksdb/pull/2392 Differential Revision: D5165091 Pulled By: sagar0 fbshipit-source-id: 6f0ad396a7cbd2e27ca63e702584784dd72acaab | 09 June 2017, 00:11:57 UTC |
85dace2 | Yi Wu | 08 June 2017, 19:30:28 UTC | Disable DBRangeDelTest::TailingIteratorRangeTombstoneUnsupported for ubsan Summary: UBSAN crashes when it runs the test. Disabling it for UBSAN. Closes https://github.com/facebook/rocksdb/pull/2427 Differential Revision: D5210897 Pulled By: yiwu-arbug fbshipit-source-id: 2f5a876807c98d8db79ab9581965f7e6b29d4163 | 08 June 2017, 19:43:01 UTC |
d4f7731 | Maysam Yabandeh | 08 June 2017, 17:38:45 UTC | fix travis error with init time in mockenv Summary: /home/travis/build/facebook/rocksdb/env/mock_env.cc: In member function ‘virtual void rocksdb::{anonymous}::TestMemLogger::Logv(const char*, va_list)’: /home/travis/build/facebook/rocksdb/env/mock_env.cc:391:53: error: ‘t.tm::tm_year’ may be used uninitialized in this function [-Werror=maybe-uninitialized] static_cast<int>(now_tv.tv_usec)); Closes https://github.com/facebook/rocksdb/pull/2418 Differential Revision: D5193597 Pulled By: maysamyabandeh fbshipit-source-id: 8801a3ef27f33eb419d534f7de747702cdf504a0 | 08 June 2017, 17:41:18 UTC |
550a1df | Maysam Yabandeh | 06 June 2017, 19:50:56 UTC | Fix clang errors by asserting the precondition Summary: USE_CLANG=1 make -j32 analyze The two errors would disappear after the assertion. Closes https://github.com/facebook/rocksdb/pull/2416 Differential Revision: D5193526 Pulled By: maysamyabandeh fbshipit-source-id: 16a21f18f68023f862764dd3ab9e00ca60b0eefa | 06 June 2017, 19:56:52 UTC |
cc5f933 | Maysam Yabandeh | 06 June 2017, 19:48:46 UTC | Fix concurrency issue with filter_block_set_ Summary: filter_block_set_ access must also be protected with mutex. Closes https://github.com/facebook/rocksdb/pull/2413 Differential Revision: D5193159 Pulled By: maysamyabandeh fbshipit-source-id: 6987fc219d9a65c20b9c7e52151aef4b8e4882e6 | 06 June 2017, 19:56:52 UTC |
2e64f45 | Aaron Gao | 05 June 2017, 22:55:55 UTC | bump version to 5.6 Summary: Bump version to 5.6 beforehand in master Closes https://github.com/facebook/rocksdb/pull/2411 Differential Revision: D5186896 Pulled By: lightmark fbshipit-source-id: 079538e621b1a959c2dc99dada894e9cdb99ef95 | 05 June 2017, 23:15:21 UTC |
afbc2d0 | Yi Wu | 05 June 2017, 22:34:40 UTC | Force travis to build with clang on MacOS Summary: Attempt to force travis to build with clang on MacOS Closes https://github.com/facebook/rocksdb/pull/2408 Differential Revision: D5186635 Pulled By: yiwu-arbug fbshipit-source-id: dbb779eff07b1cb7dbd2092631303cf946316656 | 05 June 2017, 22:41:57 UTC |
b172a3f | Sagar Vemuri | 05 June 2017, 22:16:37 UTC | Fix warnings while generating RocksJava documentation Summary: There are a couple of warnings while building RocksJava, coming from Javadoc generation. ``` Generating target/apidocs/org/rocksdb/RocksDB.html... src/main/java/org/rocksdb/RocksDB.java:2139: warning: no throws for org.rocksdb.RocksDBException public void ingestExternalFile(final List<String> filePathList, ^ src/main/java/org/rocksdb/RocksDB.java:2162: warning: no throws for org.rocksdb.RocksDBException public void ingestExternalFile(final ColumnFamilyHandle columnFamilyHandle, ^ ``` Closes https://github.com/facebook/rocksdb/pull/2396 Differential Revision: D5178388 Pulled By: sagar0 fbshipit-source-id: a0ab6696d6de78d089a9a860a559f64cc320019e | 05 June 2017, 22:28:00 UTC |
52a7f38 | Siying Dong | 05 June 2017, 21:42:34 UTC | WriteOptions.low_pri which can throttle low pri writes if needed Summary: If WriteOptions.low_pri=true and compaction is behind, the write will either return immediately or be slowed down, based on WriteOptions.no_slowdown. Closes https://github.com/facebook/rocksdb/pull/2369 Differential Revision: D5127619 Pulled By: siying fbshipit-source-id: d30e1cff515890af0eff32dfb869d2e4c9545eb0 | 05 June 2017, 22:02:35 UTC |
26a8a80 | Adam Retter | 05 June 2017, 19:16:02 UTC | Switch from CentOS 5 to CentOS 6 for crossbuilding RocksJava Summary: Updates the statically linked libraries from linking against glibc 2.5, to linking against glibc 2.12. Closes https://github.com/facebook/rocksdb/pull/2405 Differential Revision: D5184132 Pulled By: sagar0 fbshipit-source-id: 7a8ad4cf7e737ca62f29e58938bd49fa02114541 | 05 June 2017, 19:27:24 UTC |
dba9f37 | Yi Wu | 05 June 2017, 19:06:23 UTC | Fix db_write_test clang/windows build failure Summary: Fix db_write_test clang/windows build failure. Explicitly cast size_t (unsigned long) to uint32_t (unsigned int). Closes https://github.com/facebook/rocksdb/pull/2407 Differential Revision: D5182995 Pulled By: yiwu-arbug fbshipit-source-id: aba225a9fccb12d5bfbdc2cd6efc11040706a9d2 | 05 June 2017, 19:27:24 UTC |
c7662a4 | hyunwoo | 05 June 2017, 18:23:31 UTC | fixed typo Summary: fixed typo Closes https://github.com/facebook/rocksdb/pull/2376 Differential Revision: D5183630 Pulled By: ajkr fbshipit-source-id: 133cfd0445959e70aa2cd1a12151bf3c0c5c3ac5 | 05 June 2017, 18:27:34 UTC |
7e8d95c | Adam Retter | 05 June 2017, 01:40:37 UTC | Fix the Java build which was broken by a4d9c02 Summary: Closes https://github.com/facebook/rocksdb/pull/2406 Differential Revision: D5181091 Pulled By: ajkr fbshipit-source-id: fd72525da4fb1d50143080a210f8d824cbb968d6 | 05 June 2017, 01:41:33 UTC |
7e5fac2 | Aaron Gao | 03 June 2017, 00:12:52 UTC | remove test dir before exit when current regression is running Summary: clean up the current test dir if the last regression test is still running. Closes https://github.com/facebook/rocksdb/pull/2401 Differential Revision: D5177882 Pulled By: lightmark fbshipit-source-id: 91d899fcc2bde841948eae71af8584d4bdb35468 | 03 June 2017, 00:26:19 UTC |
7f6c02d | Aaron Gao | 03 June 2017, 00:12:39 UTC | using ThreadLocalPtr to hide ROCKSDB_SUPPORT_THREAD_LOCAL from public… Summary: … headers https://github.com/facebook/rocksdb/pull/2199 should not reference RocksDB-specific macros (like ROCKSDB_SUPPORT_THREAD_LOCAL in this case) in the public headers `iostats_context.h` and `perf_context.h`. We shouldn't do that because users would have to provide these compiler flags when building their binary with RocksDB. We should hide the thread-local global variable inside our implementation and just expose a function API to retrieve these variables. It may break some users for now but is good for the long term. make check -j64 Closes https://github.com/facebook/rocksdb/pull/2380 Differential Revision: D5177896 Pulled By: lightmark fbshipit-source-id: 6fcdfac57f2e2dcfe60992b7385c5403f6dcb390 | 03 June 2017, 00:26:19 UTC |
138b87e | Mike Kolupaev | 02 June 2017, 21:56:31 UTC | Fix interaction between CompactionFilter::Decision::kRemoveAndSkipUnt… Summary: Fixes the following scenario: 1. Set prefix extractor. Enable bloom filters, with `whole_key_filtering = false`. Use compaction filter that sometimes returns `kRemoveAndSkipUntil`. 2. Do a compaction. 3. Compaction creates an iterator with `total_order_seek = false`, calls `SeekToFirst()` on it, then repeatedly calls `Next()`. 4. At some point compaction filter returns `kRemoveAndSkipUntil`. 5. Compaction calls `Seek(skip_until)` on the iterator. The key that it seeks to happens to have prefix that doesn't match the bloom filter. Since `total_order_seek = false`, iterator becomes invalid, and compaction thinks that it has reached the end. The rest of the compaction input is silently discarded. The fix is to make compaction iterator use `total_order_seek = true`. The implementation for PlainTable is quite awkward. I've made `kRemoveAndSkipUntil` officially incompatible with PlainTable. If you try to use them together, compaction will fail, and DB will enter read-only mode (`bg_error_`). That's not a very graceful way to communicate a misconfiguration, but the alternatives don't seem worth the implementation time and complexity. To be able to check in advance that `kRemoveAndSkipUntil` is not going to be used with PlainTable, we'd need to extend the interface of either `CompactionFilter` or `InternalIterator`. It seems unlikely that anyone will ever want to use `kRemoveAndSkipUntil` with PlainTable: PlainTable probably has very few users, and `kRemoveAndSkipUntil` has only one user so far: us (logdevice). Closes https://github.com/facebook/rocksdb/pull/2349 Differential Revision: D5110388 Pulled By: lightmark fbshipit-source-id: ec29101a99d9dcd97db33923b87f72bce56cc17a | 02 June 2017, 22:11:38 UTC |
95b0e89 | Siying Dong | 02 June 2017, 21:13:59 UTC | Improve write buffer manager (and allow the size to be tracked in block cache) Summary: Improve write buffer manager in several ways: 1. Size is tracked when arena block is allocated, rather than every allocation, so that it can better track actual memory usage and the tracking overhead is slightly lower. 2. We start to trigger memtable flush when 7/8 of the memory cap hits, instead of 100%, and make 100% much harder to hit. 3. Allow a cache object to be passed into buffer manager and the size allocated by memtable can be costed there. This can help users have one single memory cap across block cache and memtable. Closes https://github.com/facebook/rocksdb/pull/2350 Differential Revision: D5110648 Pulled By: siying fbshipit-source-id: b4238113094bf22574001e446b5d88523ba00017 | 02 June 2017, 21:26:56 UTC |
a4d9c02 | Andrew Kryczka | 02 June 2017, 19:08:01 UTC | Pass CF ID to MemTableRepFactory Summary: Some users want to monitor column family activity in their custom memtable implementations. Previously there was no way to figure out with which column family a memtable is associated. This diff: - adds an overload to MemTableRepFactory::CreateMemTableRep() that provides the CF ID. For compatibility, its default implementation calls the old overload. - updates MemTable to create MemTableRep's using the new overload. Closes https://github.com/facebook/rocksdb/pull/2346 Differential Revision: D5108061 Pulled By: ajkr fbshipit-source-id: 3a1921214a348dd8ea0f54e1cab3b71c3d46d616 | 02 June 2017, 19:12:06 UTC |
f68d88b | Yi Wu | 02 June 2017, 18:41:29 UTC | Fix DBWriteTest::ReturnSequenceNumberMultiThreaded data race Summary: rocksdb::Random is not thread-safe. Have one Random for each thread instead. Closes https://github.com/facebook/rocksdb/pull/2400 Differential Revision: D5173919 Pulled By: yiwu-arbug fbshipit-source-id: 1a99c7b877f3893eb22355af49e321bcad4e53e6 | 02 June 2017, 18:42:11 UTC |
215076e | Andrew Kryczka | 02 June 2017, 05:14:27 UTC | Fix TSAN: avoid arena mode with range deletions Summary: The range deletion meta-block iterators weren't getting cleaned up properly since they don't support arena allocation. I didn't implement arena support since, in the general case, each iterator is used only once and separately from all other iterators, so there should be no benefit to data locality. Anyways, this diff fixes up #2370 by treating range deletion iterators as non-arena-allocated. Closes https://github.com/facebook/rocksdb/pull/2399 Differential Revision: D5171119 Pulled By: ajkr fbshipit-source-id: bef6f5c4c5905a124f4993945aed4bd86e2807d8 | 02 June 2017, 05:26:49 UTC |
3a8a848 | Andrew Kryczka | 02 June 2017, 00:54:06 UTC | account for L0 size in estimated compaction bytes Summary: also changed the `>` in the comparison against `level0_file_num_compaction_trigger` into a `>=` since exactly `level0_file_num_compaction_trigger` can trigger a compaction from L0. Closes https://github.com/facebook/rocksdb/pull/2179 Differential Revision: D4915772 Pulled By: ajkr fbshipit-source-id: e38fec6253de6f9a40e61734615c6670d84038aa | 02 June 2017, 00:56:59 UTC |
0fae3f5 | Andrew Gallagher | 02 June 2017, 00:48:17 UTC | codemod: format TARGETS with buildifier [5/5] (D5092623) Reviewed By: igorsugak fbshipit-source-id: 906b744c179eb932f5a388b39f93209cecd50a80 | 02 June 2017, 00:56:59 UTC |
8721996 | Aaron Gao | 01 June 2017, 22:42:02 UTC | add checkpoint support for single db in regression test Summary: For level_compaction_style regression test. Closes https://github.com/facebook/rocksdb/pull/2397 Differential Revision: D5168545 Pulled By: lightmark fbshipit-source-id: 195e4d84917e7c261d9f4fbe9aee5d104c9cb9a2 | 01 June 2017, 22:56:59 UTC |
5a9b4d7 | Maysam Yabandeh | 01 June 2017, 22:30:27 UTC | Retire memenv https://github.com/facebook/rocksdb/pull/2082 Summary: This is a manual commit of this PR: Retire InMemoryEnv in favor of MockEnv #2082 With MockEnv doing the same yet being more mature, InMemoryEnv is redundant. Reviewed By: IslamAbdelRahman Differential Revision: D5162323 fbshipit-source-id: 59fd0082a891dc99cc531e4da9d68bf891eae3f5 | 01 June 2017, 22:41:20 UTC |
d601965 | Islam AbdelRahman | 01 June 2017, 19:31:13 UTC | sync internal/external TARGETS | 01 June 2017, 19:31:13 UTC |
bbaba51 | Volker Mische | 01 June 2017, 18:26:01 UTC | Add missing index type to C-API Summary: When the `TwoLevelIndexSearch` was introduced, it wasn't added to the C-API. Closes https://github.com/facebook/rocksdb/pull/2395 Differential Revision: D5165127 Pulled By: maysamyabandeh fbshipit-source-id: d077f16ab5646c18158d8202a33b0fd076c6c8ad | 01 June 2017, 18:27:04 UTC |
292edfd | Daniel Black | 01 June 2017, 17:11:23 UTC | travis: test with xcode8.3 (OS X 10.12) Summary: Use later xcode version from https://docs.travis-ci.com/user/osx-ci-environment Closes https://github.com/facebook/rocksdb/pull/2128 Differential Revision: D4907471 Pulled By: yiwu-arbug fbshipit-source-id: debf8e27baef71a5833c845401b1865bc75ac977 | 01 June 2017, 17:11:50 UTC |
0dc3040 | Tamir Duberstein | 01 June 2017, 05:41:44 UTC | db: avoid `#include`ing malloc and jemalloc simultaneously Summary: This fixes a compilation failure on Linux when the system libc is not glibc. jemalloc's configure script incorrectly assumes that glibc is always used on Linux systems, producing glibc-style signatures; when the system libc is e.g. musl, the following error is observed: ``` [ 0%] Building CXX object CMakeFiles/rocksdb.dir/db/db_impl.cc.o In file included from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/table/block.h:19:0, from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/db/db_impl.cc:77: /x-tools/x86_64-unknown-linux-musl/x86_64-unknown-linux-musl/sysroot/usr/include/malloc.h:19:8: error: declaration of 'size_t malloc_usable_size(void*)' has a different exception specifier size_t malloc_usable_size(void *); ^~~~~~~~~~~~~~~~~~ In file included from /go/src/github.com/cockroachdb/cockroach/c-deps/rocksdb.src/db/db_impl.cc:20:0: /go/native/x86_64-unknown-linux-musl/jemalloc/include/jemalloc/jemalloc.h:78:33: note: from previous declaration 'size_t malloc_usable_size(void*) throw ()' # define je_malloc_usable_size malloc_usable_size ^ /go/native/x86_64-unknown-linux-musl/jemalloc/include/jemalloc/jemalloc.h:239:41: note: in expansion of macro 'je_malloc_usable_size' JEMALLOC_EXPORT size_t JEMALLOC_NOTHROW je_malloc_usable_size( ^~~~~~~~~~~~~~~~~~~~~ CMakeFiles/rocksdb.dir/build.make:350: recipe for target 'CMakeFiles/rocksdb.dir/db/db_impl.cc.o' failed ``` This works around the issue by rearranging the sources such that jemalloc's headers are never in the same scope as the system's malloc header. The jemalloc issue has been reported as well, see: https://github.com/jemalloc/jemalloc/issues/778. cc tschottdorf Closes https://github.com/facebook/rocksdb/pull/2188 Differential Revision: D5163048 Pulled By: siying fbshipit-source-id: c553125458892def175c1be5682b0330d80b2a0d | 01 June 2017, 05:43:02 UTC |
9b3ed83 | Aaron Gao | 31 May 2017, 22:33:02 UTC | fix regression test Summary: fix regression test by not reporting stats when building db Closes https://github.com/facebook/rocksdb/pull/2390 Differential Revision: D5159909 Pulled By: lightmark fbshipit-source-id: c3f4b9deb9c6799ff84207fd341c529144f8158d | 31 May 2017, 22:41:45 UTC |
9c9909b | Andrew Kryczka | 31 May 2017, 20:43:25 UTC | Support ingest file when range deletions exist Summary: Previously we returned NotSupported when ingesting files into a database containing any range deletions. This diff adds the support. - Flush if any memtable contains range deletions overlapping the to-be-ingested file - Place to-be-ingested file before any level that contains range deletions overlapping it. - Added support for `Version` to return iterators over range deletions in a given level. Previously, we piggybacked getting range deletions onto `Version`'s `Get()` / `AddIterator()` functions by passing them a `RangeDelAggregator*`. But file ingestion needs to get iterators over range deletions, not populate an aggregator (since the aggregator does collapsing and doesn't expose the actual ranges). Closes https://github.com/facebook/rocksdb/pull/2370 Differential Revision: D5127648 Pulled By: ajkr fbshipit-source-id: 816faeb9708adfa5287962bafdde717db56e3f1a | 31 May 2017, 20:57:19 UTC |
ad19eb8 | Yi Wu | 31 May 2017, 17:45:47 UTC | Fixing blob db sequence number handling Summary: Blob db relies on the base db returning the sequence number through the write batch after DB::Write(). However, after recent changes to the write path, DB::Write() no longer returns the sequence number in some cases. Fix it by having WriteBatchInternal::InsertInto() always encode the sequence number into the write batch. Stacking on #2375. Closes https://github.com/facebook/rocksdb/pull/2385 Differential Revision: D5148358 Pulled By: yiwu-arbug fbshipit-source-id: 8bda0aa07b9334ed03ed381548b39d167dc20c33 | 31 May 2017, 17:56:45 UTC |
51ac91f | Siying Dong | 31 May 2017, 14:27:40 UTC | Histogram of number of merge operands Summary: Add a histogram in statistics to help users understand how many merge operands they merge. Closes https://github.com/facebook/rocksdb/pull/2373 Differential Revision: D5139983 Pulled By: siying fbshipit-source-id: 61b9ba8ca83f358530a4833d68f0103b56a0e182 | 31 May 2017, 14:41:44 UTC |
345878a | Yi Wu | 31 May 2017, 05:16:32 UTC | update blob_db_test Summary: Re-enable blob_db_test with some update: * Commented out delay at the end of GC tests. Will update the logic later with sync point to properly trigger GC. * Added some helper functions. Also update make files to include blob_dump tool. Closes https://github.com/facebook/rocksdb/pull/2375 Differential Revision: D5133793 Pulled By: yiwu-arbug fbshipit-source-id: 95470b26d0c1f9592ba4b7637e027fdd263f425c | 31 May 2017, 05:26:13 UTC |
cbc821c | Aaron Gao | 30 May 2017, 23:33:26 UTC | change regression rebuild to one level Summary: abandon fillseqdeterministic test locally Closes https://github.com/facebook/rocksdb/pull/2290 Differential Revision: D5151867 Pulled By: lightmark fbshipit-source-id: 4c8a24cc937212ffb5ceb9bfaf7288eb8726d0c1 | 30 May 2017, 23:41:21 UTC |
103d069 | Tamir Duberstein | 30 May 2017, 18:05:28 UTC | Avoid unsupported attributes when not building with UBSAN Summary: yiwu-arbug see individual commits. Closes https://github.com/facebook/rocksdb/pull/2318 Differential Revision: D5141520 Pulled By: yiwu-arbug fbshipit-source-id: 7987c92ab4461eef36afce5a133d3a0ee0c96300 | 30 May 2017, 18:13:01 UTC |
5fd0456 | Tamir Duberstein | 30 May 2017, 17:32:16 UTC | travis: reduce the number of travis builders Summary: This collapses all the "platform dependent" tests into a single travis builder in an effort to reduce overall CI times. These builds currently take a combined 21-23 minutes, but each one has to compile the library, so combining them should yield some time savings (5-10 minutes). Unfortunately the other builders don't duplicate work, so combining them is unlikely to provide benefit. Closes https://github.com/facebook/rocksdb/pull/2306 Differential Revision: D5147850 Pulled By: yiwu-arbug fbshipit-source-id: d947dc8b9f49639fe22f3c8ab9a82a8d730ddddf | 30 May 2017, 17:42:01 UTC |