Revision 57f30322855b6ed38296ec0f4c6fb03b095fe0b3 authored by Peter Dillinger on 26 November 2019, 23:49:16 UTC, committed by Facebook Github Bot on 26 November 2019, 23:59:34 UTC
Summary: There's no technological impediment to allowing the Bloom filter bits/key to be non-integer (fractional/decimal) values, and it provides finer control over the memory vs. accuracy trade-off. This is especially handy in using the format_version=5 Bloom filter in place of the old one, because bits_per_key=9.55 provides the same accuracy as the old bits_per_key=10. This change not only requires refining the logic for choosing the best num_probes for a given bits/key setting, it revealed a flaw in that logic. As bits/key gets higher, the best num_probes for a cache-local Bloom filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a standard Bloom filter. For example, at 16 bits per key, the best num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%). This change fixes and refines that logic (for the format_version=5 Bloom filter only, just in case) based on empirical tests to find accuracy inflection points between each num_probes. Although bits_per_key is now specified as a double, the new Bloom filter converts/rounds this to "millibits / key" for predictable/precise internal computations. Just in case of unforeseen compatibility issues, we round to the nearest whole number bits / key for the legacy Bloom filter, so as not to unlock new behaviors for it. Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092 Test Plan: unit tests included Differential Revision: D18711313 Pulled By: pdillinger fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a
1 parent 72daa92
File | Mode | Size |
---|---|---|
buckifier | ||
build_tools | ||
cache | ||
cmake | ||
coverage | ||
db | ||
docs | ||
env | ||
examples | ||
file | ||
hdfs | ||
include | ||
java | ||
logging | ||
memory | ||
memtable | ||
monitoring | ||
options | ||
port | ||
table | ||
test_util | ||
third-party | ||
tools | ||
trace_replay | ||
util | ||
utilities | ||
.clang-format | -rw-r--r-- | 138 bytes |
.gitignore | -rw-r--r-- | 875 bytes |
.lgtm.yml | -rw-r--r-- | 67 bytes |
.travis.yml | -rw-r--r-- | 3.5 KB |
.watchmanconfig | -rw-r--r-- | 130 bytes |
AUTHORS | -rw-r--r-- | 322 bytes |
CMakeLists.txt | -rw-r--r-- | 38.7 KB |
CODE_OF_CONDUCT.md | -rw-r--r-- | 3.3 KB |
CONTRIBUTING.md | -rw-r--r-- | 706 bytes |
COPYING | -rw-r--r-- | 17.7 KB |
DEFAULT_OPTIONS_HISTORY.md | -rw-r--r-- | 1.5 KB |
DUMP_FORMAT.md | -rw-r--r-- | 763 bytes |
HISTORY.md | -rw-r--r-- | 97.9 KB |
INSTALL.md | -rw-r--r-- | 7.5 KB |
LANGUAGE-BINDINGS.md | -rw-r--r-- | 1.1 KB |
LICENSE.Apache | -rw-r--r-- | 11.1 KB |
LICENSE.leveldb | -rw-r--r-- | 1.5 KB |
Makefile | -rw-r--r-- | 70.3 KB |
README.md | -rw-r--r-- | 1.8 KB |
ROCKSDB_LITE.md | -rw-r--r-- | 1.0 KB |
TARGETS | -rw-r--r-- | 34.4 KB |
USERS.md | -rw-r--r-- | 6.5 KB |
Vagrantfile | -rw-r--r-- | 1017 bytes |
WINDOWS_PORT.md | -rw-r--r-- | 12.5 KB |
appveyor.yml | -rw-r--r-- | 2.7 KB |
defs.bzl | -rw-r--r-- | 1.4 KB |
issue_template.md | -rw-r--r-- | 243 bytes |
src.mk | -rw-r--r-- | 32.8 KB |
thirdparty.inc | -rw-r--r-- | 7.8 KB |
Computing file changes ...