Revision 6f5e6445d9b56208e9f92bfeaa201d54f0c0b943 authored by Maysam Yabandeh on 12 April 2018, 02:59:25 UTC, committed by Facebook Github Bot on 12 April 2018, 03:11:51 UTC
Summary:
We introduced smallest_prep optimization in this commit b225de7e10f02be6d00e96b9fb86dfef880babdf, which enables storing the smallest uncommitted sequence number along with the snapshot. This enables the readers that read from the snapshot to skip further checks and safely assumed the data is committed if its sequence number is less than smallest uncommitted when the snapshot was taken. The problem was that smallest uncommitted and the snapshot must be taken atomically, and the lack of atomicity had led to readers using a smallest uncommitted after the snapshot was taken and hence mistakenly skipping some data.
This patch fixes the problem by i) separating the process of removing of prepare entries from the AddCommitted function, ii) removing the prepare entires AFTER the committed sequence number is published, iii) getting smallest uncommitted (from the prepare list) BEFORE taking a snapshot. This guarantees that the smallest uncommitted that is accompanied with a snapshot is less than or equal of such number if it was obtained atomically.

Tested by running MySQLStyleTransactionTest/MySQLStyleTransactionTest.TransactionStressTest that was failing sporadically.
Closes https://github.com/facebook/rocksdb/pull/3703

Differential Revision: D7581934

Pulled By: maysamyabandeh

fbshipit-source-id: dc9d6f4fb477eba75d4d5927326905b548a96a32
1 parent d42bd04
Raw File
DEFAULT_OPTIONS_HISTORY.md
# RocksDB default options change log
## Unreleased
* delayed_write_rate takes the rate given by rate_limiter if not specified.

## 5.2
* Change the default of delayed slowdown value to 16MB/s and further increase the L0 stop condition to 36 files.

## 5.0 (11/17/2016)
* Options::allow_concurrent_memtable_write and Options::enable_write_thread_adaptive_yield are now true by default
* Options.level0_stop_writes_trigger default value changes from 24 to 32.

## 4.8.0 (5/2/2016)
* options.max_open_files changes from 5000 to -1. It improves performance, but users need to set file descriptor limit to be large enough and watch memory usage for index and bloom filters.
* options.base_background_compactions changes from max_background_compactions to 1. When users set higher max_background_compactions but the write throughput is not high, the writes are less spiky to disks.
* options.wal_recovery_mode changes from kTolerateCorruptedTailRecords to kPointInTimeRecovery. Avoid some false positive when file system or hardware reorder the writes for file data and metadata.

## 4.7.0 (4/8/2016)
* options.write_buffer_size changes from 4MB to 64MB.
* options.target_file_size_base changes from 2MB to 64MB.
* options.max_bytes_for_level_base changes from 10MB to 256MB.
* options.soft_pending_compaction_bytes_limit changes from 0 (disabled) to 64GB.
* options.hard_pending_compaction_bytes_limit changes from 0 (disabled) to 256GB.
* table_cache_numshardbits changes from 4 to 6.
* max_file_opening_threads changes from 1 to 16.
back to top