Revision 57f30322855b6ed38296ec0f4c6fb03b095fe0b3 authored by Peter Dillinger on 26 November 2019, 23:49:16 UTC, committed by Facebook Github Bot on 26 November 2019, 23:59:34 UTC
Summary:
There's no technological impediment to allowing the Bloom
filter bits/key to be non-integer (fractional/decimal) values, and it
provides finer control over the memory vs. accuracy trade-off. This is
especially handy in using the format_version=5 Bloom filter in place
of the old one, because bits_per_key=9.55 provides the same accuracy as
the old bits_per_key=10.

This change not only requires refining the logic for choosing the best
num_probes for a given bits/key setting, it revealed a flaw in that logic.
As bits/key gets higher, the best num_probes for a cache-local Bloom
filter is closer to bpk / 2 than to bpk * 0.69, the best choice for a
standard Bloom filter. For example, at 16 bits per key, the best
num_probes is 9 (FP rate = 0.0843%) not 11 (FP rate = 0.0884%).
This change fixes and refines that logic (for the format_version=5
Bloom filter only, just in case) based on empirical tests to find
accuracy inflection points between each num_probes.

Although bits_per_key is now specified as a double, the new Bloom
filter converts/rounds this to "millibits / key" for predictable/precise
internal computations. Just in case of unforeseen compatibility
issues, we round to the nearest whole number bits / key for the
legacy Bloom filter, so as not to unlock new behaviors for it.
Pull Request resolved: https://github.com/facebook/rocksdb/pull/6092

Test Plan: unit tests included

Differential Revision: D18711313

Pulled By: pdillinger

fbshipit-source-id: 1aa73295f152a995328cb846ef9157ae8a05522a
1 parent 72daa92
History
File Mode Size
buckifier
build_tools
cache
cmake
coverage
db
docs
env
examples
file
hdfs
include
java
logging
memory
memtable
monitoring
options
port
table
test_util
third-party
tools
trace_replay
util
utilities
.clang-format -rw-r--r-- 138 bytes
.gitignore -rw-r--r-- 875 bytes
.lgtm.yml -rw-r--r-- 67 bytes
.travis.yml -rw-r--r-- 3.5 KB
.watchmanconfig -rw-r--r-- 130 bytes
AUTHORS -rw-r--r-- 322 bytes
CMakeLists.txt -rw-r--r-- 38.7 KB
CODE_OF_CONDUCT.md -rw-r--r-- 3.3 KB
CONTRIBUTING.md -rw-r--r-- 706 bytes
COPYING -rw-r--r-- 17.7 KB
DEFAULT_OPTIONS_HISTORY.md -rw-r--r-- 1.5 KB
DUMP_FORMAT.md -rw-r--r-- 763 bytes
HISTORY.md -rw-r--r-- 97.9 KB
INSTALL.md -rw-r--r-- 7.5 KB
LANGUAGE-BINDINGS.md -rw-r--r-- 1.1 KB
LICENSE.Apache -rw-r--r-- 11.1 KB
LICENSE.leveldb -rw-r--r-- 1.5 KB
Makefile -rw-r--r-- 70.3 KB
README.md -rw-r--r-- 1.8 KB
ROCKSDB_LITE.md -rw-r--r-- 1.0 KB
TARGETS -rw-r--r-- 34.4 KB
USERS.md -rw-r--r-- 6.5 KB
Vagrantfile -rw-r--r-- 1017 bytes
WINDOWS_PORT.md -rw-r--r-- 12.5 KB
appveyor.yml -rw-r--r-- 2.7 KB
defs.bzl -rw-r--r-- 1.4 KB
issue_template.md -rw-r--r-- 243 bytes
src.mk -rw-r--r-- 32.8 KB
thirdparty.inc -rw-r--r-- 7.8 KB

README.md

back to top