https://github.com/JuliaLang/julia
Revision 8599e2f1a44f1407ecc2a9d824de7a79a9207085 authored by Valentin Churavy on 07 September 2023, 17:55:25 UTC, committed by GitHub on 07 September 2023, 17:55:25 UTC
The implementation is based on a [Hash Array Mapped Trie
(HAMT)](https://en.wikipedia.org/wiki/Hash_array_mapped_trie)
following [Bagwell
(2000)](http://infoscience.epfl.ch/record/64398/files/idealhashtrees.pdf).

A HAMT uses a fixed branching factor (commonly 32) together with each
node being sparse.
In order to search for an entry we take the hash of the key and chunk it
up into blocks,
with a branching factor of 32 each block is 5 bits. We use those 5 bits
to calculate the
index inside the node and use a bitmap within the node to keep track if
an element is
already set. This makes search a `log(32, n)` operation.

Persistency is implemented by path-copying. When we insert/delete a
value into the HAMT
we copy each node along the path into a new HAMT, all other nodes are
shared with
the previous HAMT.

A noteable implementation choice is that I didn't add a (resizeable)
root table.
Normally this root table is dense and uses the first `t` bits to
calculate an index
within. This makes large HAMT a bit cheaper since the root-table
effectivly folds
multiple lookup steps into one. It does hurt persistent use-cases since
path-copying
means that we also copy the root node/table.

Importantly the HAMT itself is not immutable/persistent, the use of it
as part of the
`PersistentDict` is. Direct mutation of the underlying data breaks the
persistentcy
invariants. One could use the HAMT to implement a non-persistent
dictionary (or
other datastructures). 

As an interesting side-note we could use a related data-structure
[Ctrie](http://lamp.epfl.ch/~prokopec/ctries-snapshot.pdf)
to implement a concurrent lock-free dictionary. Ctrie also support
`O(1)` snapshotting
so we could replace the HAMT used here with a Ctrie.
1 parent 27fa5de
History
Tip revision: 8599e2f1a44f1407ecc2a9d824de7a79a9207085 authored by Valentin Churavy on 07 September 2023, 17:55:25 UTC
add PersistentDict based on a HAMT (#51164)
Tip revision: 8599e2f
File Mode Size
.devcontainer
.github
base
cli
contrib
deps
doc
etc
src
stdlib
test
.buildkite-external-version -rw-r--r-- 5 bytes
.clang-format -rw-r--r-- 3.3 KB
.clangd -rw-r--r-- 114 bytes
.codecov.yml -rw-r--r-- 52 bytes
.git-blame-ignore-revs -rw-r--r-- 371 bytes
.gitattributes -rw-r--r-- 65 bytes
.gitignore -rw-r--r-- 523 bytes
.mailmap -rw-r--r-- 12.7 KB
CITATION.bib -rw-r--r-- 513 bytes
CITATION.cff -rw-r--r-- 940 bytes
CONTRIBUTING.md -rw-r--r-- 23.4 KB
HISTORY.md -rw-r--r-- 372.5 KB
LICENSE.md -rw-r--r-- 1.3 KB
Make.inc -rw-r--r-- 55.7 KB
Makefile -rw-r--r-- 30.5 KB
NEWS.md -rw-r--r-- 1.5 KB
README.md -rw-r--r-- 7.3 KB
THIRDPARTY.md -rw-r--r-- 3.8 KB
VERSION -rw-r--r-- 11 bytes
julia.spdx.json -rw-r--r-- 35.8 KB
pkgimage.mk -rw-r--r-- 6.3 KB
sysimage.mk -rw-r--r-- 4.3 KB

README.md

back to top