https://github.com/JuliaLang/julia
Revision dfab7be1f8bf8fd4400e7f61954b9596eacf4de0 authored by Diogo Netto on 24 January 2023, 19:07:52 UTC, committed by GitHub on 24 January 2023, 19:07:52 UTC
## Previous work

Since https://github.com/JuliaLang/julia/pull/21590, the GC mark-loop was implemented by keeping two manually managed stacks: one of which contained iterator states used to keep track of the object currently being marked. As an example, to mark arrays, we would pop the corresponding iterator state from the stack, iterate over the array until we found an unmarked reference, and if so, we would update the iterator state (to reflect the index we left off), "repush" it into the stack and  proceed with marking the reference we just found.

## This PR

This PR eliminates the need of keeping the iterator states by modifying the object graph traversal code. We keep a single stack of `jl_value_t *` currently being processed. To mark an object, we first pop it from the stack, push all unmarked references into the stack and proceed with marking.

I believe this doesn't break any invariant from the generational GC. Indeed, the age bits are set after marking (if the object survived one GC cycle it's moved to the old generation), so this new traversal scheme wouldn't change the fact of whether an object had references to old objects or not. Furthermore, we must not update GC metadata for objects in the `remset`, and we ensure this by calling `gc_mark_outrefs` in `gc_queue_remset` with `meta_updated` set to 1.

## Additional advantages

1. There are no recursive function calls in the GC mark-loop code (one of the reasons why https://github.com/JuliaLang/julia/pull/21590 was implemented).
2. Keeping a single GC queue will **greatly** simplify work-stealing in the multi-threaded GC we are working on (c.f. https://github.com/JuliaLang/julia/pull/45639).
3. Arrays of references, for example, are now marked on a regular stride fashion, which could help with hardware prefetching.
4. We can easily modify the traversal mode (to breadth first, for example) by only changing the `jl_gc_markqueue_t`(from LIFO to FIFO, for example) methods without touching the mark-loop itself, which could enable further exploration on the GC in the future.

Since this PR changes the mark-loop graph traversal code, there are some changes in the heap-snapshot, though I'm not familiar with that PR.

Some benchmark results are here: https://hackmd.io/@Idnmfpb3SxK98-OsBtRD5A/H1V6QSzvs.
1 parent f9d0473
History
Tip revision: dfab7be1f8bf8fd4400e7f61954b9596eacf4de0 authored by Diogo Netto on 24 January 2023, 19:07:52 UTC
GC mark-loop rewrite (#47292)
Tip revision: dfab7be
File Mode Size
.devcontainer
.github
base
cli
contrib
deps
doc
etc
src
stdlib
test
.buildkite-external-version -rw-r--r-- 5 bytes
.clang-format -rw-r--r-- 3.3 KB
.clangd -rw-r--r-- 114 bytes
.codecov.yml -rw-r--r-- 52 bytes
.git-blame-ignore-revs -rw-r--r-- 294 bytes
.gitattributes -rw-r--r-- 65 bytes
.gitignore -rw-r--r-- 514 bytes
.mailmap -rw-r--r-- 12.1 KB
CITATION.bib -rw-r--r-- 513 bytes
CITATION.cff -rw-r--r-- 940 bytes
CONTRIBUTING.md -rw-r--r-- 23.1 KB
HISTORY.md -rw-r--r-- 363.4 KB
LICENSE.md -rw-r--r-- 1.3 KB
Make.inc -rw-r--r-- 52.9 KB
Makefile -rw-r--r-- 30.1 KB
NEWS.md -rw-r--r-- 1.6 KB
README.md -rw-r--r-- 7.3 KB
THIRDPARTY.md -rw-r--r-- 3.7 KB
VERSION -rw-r--r-- 11 bytes
julia.spdx.json -rw-r--r-- 35.8 KB
sysimage.mk -rw-r--r-- 4.1 KB

README.md

back to top