swh:1:snp:5115096b921df712aeb2a08114fede57fb3331fb
Revision fea2b1dfb2c554a5394e213b101516a311142fa9 authored by Andrew Kryczka on 01 June 2018, 23:46:32 UTC, committed by Facebook Github Bot on 01 June 2018, 23:57:58 UTC
Summary:
For iterator reads, a `SuperVersion` is pinned to preserve a snapshot of SST files, and `Block`s are pinned to allow `key()` and `value()` to return pointers directly into a RocksDB memory region. This works for both non-mmap reads, where the block owns the memory region, and mmap reads, where the file owns the memory region.

For point reads with `PinnableSlice`, only the `Block` object is pinned. This works for non-mmap reads because the block owns the memory region, so even if the file is deleted after compaction, the memory region survives. However, for mmap reads, file deletion causes the memory region to which the `PinnableSlice` refers to be unmapped.   The result is usually a segfault upon accessing the `PinnableSlice`, although sometimes it returned wrong results (I repro'd this a bunch of times with `db_stress`).

This PR copies the value into the `PinnableSlice` when it comes from mmap'd memory. We can tell whether the `Block` owns its memory using `Block::cachable()`, which is unset when reads do not use the provided buffer as is the case with mmap file reads. When that is false we ensure the result of `Get()` is copied.

This feels like a short-term solution as ideally we'd have the `PinnableSlice` pin the mmap'd memory so we can do zero-copy reads. It seemed hard so I chose this approach to fix correctness in the meantime.
Closes https://github.com/facebook/rocksdb/pull/3881

Differential Revision: D8076288

Pulled By: ajkr

fbshipit-source-id: 31d78ec010198723522323dbc6ea325122a46b08
1 parent 88c3ee2
History
Tip revision: 19076c95aa2bcee55c26fcf0960cc844ad86ee9c authored by Levi Tamasi on 21 January 2021, 22:18:25 UTC
Update HISTORY.md for PR 7888 (#7890)
Tip revision: 19076c9
File Mode Size
buckifier
build_tools
cache
cmake
coverage
db
docs
env
examples
hdfs
include
java
memtable
monitoring
options
port
table
third-party
tools
util
utilities
.clang-format -rw-r--r-- 138 bytes
.gitignore -rw-r--r-- 705 bytes
.travis.yml -rw-r--r-- 3.2 KB
AUTHORS -rw-r--r-- 322 bytes
CMakeLists.txt -rw-r--r-- 35.3 KB
CODE_OF_CONDUCT.md -rw-r--r-- 249 bytes
CONTRIBUTING.md -rw-r--r-- 706 bytes
COPYING -rw-r--r-- 17.7 KB
DEFAULT_OPTIONS_HISTORY.md -rw-r--r-- 1.5 KB
DUMP_FORMAT.md -rw-r--r-- 763 bytes
HISTORY.md -rw-r--r-- 61.1 KB
INSTALL.md -rw-r--r-- 7.4 KB
LANGUAGE-BINDINGS.md -rw-r--r-- 904 bytes
LICENSE.Apache -rw-r--r-- 11.1 KB
LICENSE.leveldb -rw-r--r-- 1.5 KB
Makefile -rw-r--r-- 64.4 KB
README.md -rw-r--r-- 1.8 KB
ROCKSDB_LITE.md -rw-r--r-- 1.0 KB
TARGETS -rw-r--r-- 26.9 KB
USERS.md -rw-r--r-- 5.3 KB
Vagrantfile -rw-r--r-- 1017 bytes
WINDOWS_PORT.md -rw-r--r-- 12.5 KB
appveyor.yml -rw-r--r-- 428 bytes
issue_template.md -rw-r--r-- 243 bytes
src.mk -rw-r--r-- 29.4 KB
thirdparty.inc -rw-r--r-- 6.4 KB

README.md

back to top