swh:1:snp:5115096b921df712aeb2a08114fede57fb3331fb
Revision 51a8dc6d142af0f6a648d2e0399d76507380b810 authored by Levi Tamasi on 24 November 2020, 05:07:01 UTC, committed by Facebook GitHub Bot on 24 November 2020, 05:08:22 UTC
Summary:
The patch adds basic garbage collection support to the integrated BlobDB
implementation. Valid blobs residing in the oldest blob files are relocated
as they are encountered during compaction. The threshold that determines
which blob files qualify is computed based on the configuration option
`blob_garbage_collection_age_cutoff`, which was introduced in https://github.com/facebook/rocksdb/issues/7661 .
Once a blob is retrieved for the purposes of relocation, it passes through the
same logic that extracts large values to blob files in general. This means that
if, for instance, the size threshold for key-value separation (`min_blob_size`)
got changed or writing blob files got disabled altogether, it is possible for the
value to be moved back into the LSM tree. In particular, one way to re-inline
all blob values if needed would be to perform a full manual compaction with
`enable_blob_files` set to `false`, `enable_blob_garbage_collection` set to
`true`, and `blob_file_garbage_collection_age_cutoff` set to `1.0`.

Some TODOs that I plan to address in separate PRs:

1) We'll have to measure the amount of new garbage in each blob file and log
`BlobFileGarbage` entries as part of the compaction job's `VersionEdit`.
(For the time being, blob files are cleaned up solely based on the
`oldest_blob_file_number` relationships.)
2) When compression is used for blobs, the compression type hasn't changed,
and the blob still qualifies for being written to a blob file, we can simply copy
the compressed blob to the new file instead of going through decompression
and compression.
3) We need to update the formula for computing write amplification to account
for the amount of data read from blob files as part of GC.

Pull Request resolved: https://github.com/facebook/rocksdb/pull/7694

Test Plan: `make check`

Reviewed By: riversand963

Differential Revision: D25069663

Pulled By: ltamasi

fbshipit-source-id: bdfa8feb09afcf5bca3b4eba2ba72ce2f15cd06a
1 parent dd6b7fc
History
Tip revision: 19076c95aa2bcee55c26fcf0960cc844ad86ee9c authored by Levi Tamasi on 21 January 2021, 22:18:25 UTC
Update HISTORY.md for PR 7888 (#7890)
Tip revision: 19076c9
File Mode Size
.circleci
.github
buckifier
build_tools
cache
cmake
coverage
db
db_stress_tool
docs
env
examples
file
fuzz
hdfs
include
java
logging
memory
memtable
monitoring
options
port
table
test_util
third-party
tools
trace_replay
util
utilities
.clang-format -rw-r--r-- 138 bytes
.gitignore -rw-r--r-- 1.0 KB
.lgtm.yml -rw-r--r-- 67 bytes
.travis.yml -rw-r--r-- 9.7 KB
.watchmanconfig -rw-r--r-- 130 bytes
AUTHORS -rw-r--r-- 322 bytes
CMakeLists.txt -rw-r--r-- 45.2 KB
CODE_OF_CONDUCT.md -rw-r--r-- 3.3 KB
CONTRIBUTING.md -rw-r--r-- 706 bytes
COPYING -rw-r--r-- 17.7 KB
DEFAULT_OPTIONS_HISTORY.md -rw-r--r-- 1.5 KB
DUMP_FORMAT.md -rw-r--r-- 763 bytes
HISTORY.md -rw-r--r-- 145.4 KB
INSTALL.md -rw-r--r-- 7.5 KB
LANGUAGE-BINDINGS.md -rw-r--r-- 1.2 KB
LICENSE.Apache -rw-r--r-- 11.1 KB
LICENSE.leveldb -rw-r--r-- 1.5 KB
Makefile -rw-r--r-- 86.0 KB
README.md -rw-r--r-- 2.0 KB
ROCKSDB_LITE.md -rw-r--r-- 1.0 KB
TARGETS -rw-r--r-- 52.5 KB
USERS.md -rw-r--r-- 7.3 KB
Vagrantfile -rw-r--r-- 1017 bytes
WINDOWS_PORT.md -rw-r--r-- 12.5 KB
appveyor.yml -rw-r--r-- 3.3 KB
defs.bzl -rw-r--r-- 1.9 KB
issue_template.md -rw-r--r-- 294 bytes
src.mk -rw-r--r-- 39.6 KB
thirdparty.inc -rw-r--r-- 7.8 KB

README.md

back to top