Revision 719eb4f8cb0831db11184ed7d6667268fbb83720 authored by Sebastian Wicki on 29 May 2024, 16:08:46 UTC, committed by Sebastian Wicki on 10 June 2024, 16:06:10 UTC
This commit implements `CFP-28427: ToFQDN policy performance
improvements`. It is highly recommended to consult the CFP, as it
contains all the high-level design decisions and mechanism found in this
commit. The rest of this commit message therefore only explains the
"what" and "where", and not the "why".

Before this commit, there was circular interaction between the
`SelectorCache` and `NameManager`: `SelectorCache` would tell
`NameManager` about new `ToFQDN` selectors, and `NameManager` would in
turn inform `SelectorCache` about the IPs selected by that `ToFQDN`
selector. This commit simplifies this logic by removing the backlink
from the `NameManager` to the `SelectorCache`. IPs are instead now
labelled with the selector as an `fqdn` identity label in IPCache, thus
not requiring any direct changes to the `SelectorCache` when a new IP is
discovered that shares the identity with an old IP. If there is identity
allocation needed for an observed IP, the `SelectorCache` is still
updated, but only via `IPCache`, and no longer directly from
`NameManager`.

I recommend first looking at the changes to `SelectorCache` in
`pkg/policy`. Note the following changes:

1. The `identityNotifier` interface (implemented by `NameManger`) is
   simplified: We no longer care about IPs selected by a FQDN selector,
   and we no longer need to care about potential deadlocks, as there are no
   calls back from `NameManager` to `SelectorCache` in the invoked
   functions (the indirect backlink from `NameManager` to `SelectorCache`
   via `IPCache` happens in `NameManager.UpdateGenerateDNS`
   - but this function is called by the DNS proxy whenever it observes a
    new DNS lookup and thus is called without the selector cache lock
   held.
2. `UpdateFQDNSelector` (previously invoked by `NameManager`) is removed
   - `SelectorCache` no longer directly needs to know the IPs matched by
   a selector.
3. The `fqdnSelector` type is simplified: Instead of containing the list
   of CIDR identities (one for each selected IP) and checking for the
   CIDR identity in `matches`, we now can simply treat the FQDN selector as
   a label and thus check if the requested identity has the FQDN selector
   label.
4. All the unit test logic around managing the selected IPs is removed,
   as all the responsibility for updating IPs now lies in `NameManager`.

For the `NameManager` in `pkg/fqdn`, the changes are as follows:

1.  Minor changes to for the query functions in `DNSCache`: Instead of
    just listing or checking the existence of an IP, we now want to know
    about `(name, IP)` pairs (needed later for updating `IPCache`).
2.  Similarly, where before we only cared about the mapping between an
    `FQDNSelector` and the selected IPs, we now want to know what
    `(name, IP)` pairs are matched by a particular selector.
    Thus `mapSelectorsToIPsLocked` is replaced with
    `mapSelectorsToNamesLocked` and the unit tests are updated as well.
3.  `RegisterFQDNSelector` now checks if the new selector needs to be
    added to any known `(name, IP)` pairs as an `fqdn` label, and
    `UnregisterFQDNSelector` potentially removes `fqdn` labels from IPs.
4.  `UpdateGenerateDNS` (invoked for DNS lookups) determines the labels
    of any newly discovered IP and now directly spawns the go routine
    to wait for the new `(IP, identity)` pair to be injected into
    `IPCache`. Previously, this waiting was done as part of the call to
    `UpdateSelectors`, previously implemented in `daemon/cmd/fqdn.go`
    (and now removed).
5.  `ForceGenerateDNS` is removed. It was previously called by the
    `NameManager` GC to remove IPs from the `SelectorCache`, but since
    the `SelectorCache` no longer knows about IPs, the function is
    obsolete (note that `IPCache` removals are still performed upon GC)
6.  Changes in `CompleteBootstrap` to deal with the upgrade logic
    when upgrading from Cilium v1.15. See bullet point 9 below for
    details.
7.  `updateDNSIPs` (called from `UpdateGenerateDNS`, i.e. upon new DNS
    lookups) now determines the labels for every newly observed IP based
    on the available FQDN selectors, and no longer upserts CIDR
    identites. Note that we only update the labels matching the looked up
    `dnsName`. If an IP happens to also map to a different domain name and
    uses a different set of selectors for the alternative name, those labels
    in IPCache are unaffected by the call to `updateMetadata`, as every call
    to IPCache uses the DNS name as the resource owner.
8.  The `ipcacheResource`, `updateMetadata`, and `maybeRemoveMetadata`
    contain the calls to `IPCache` to update labels for a given `(name,
    IP)` pair. There are two main differences to before: Instead of
    upserting or removing CIDR prefixes, we now add labels. And instead of
    having one update per prefix, we now have one update per `(name, IP)`
    pair, meaning a single prefix (aka "IP") might have multiple IPCache
    resource owners in the `NameManager` (i.e. one for each `name` mapping
    to that IP).
9.  `RestoreCache` and `CompleteBootstrap` contain the logic to
    initialize `IPCache` when upgrading from Cilium v1.15. This requires
    the previous Cilium instance to have checkpointed the known `ToFQDN`
    selectors, which are read in during upgrade and used to derive and inject the
    `IPCache` labels we expect to have once endpoint regeneration has
    finished. After endpoint regeneration, those restored labels are then
    removed, leaving the real labels in place.
    In contrast to all other `IPCache` updates (where each update to an
    IP is "owned" by the DNS name mapping to that IP, and we rely on
    `IPCache` to merge those labels), the resource owner here is static.
    This is, because they are all added at once (in `RestoreCache`)
    and removed at once (in `CompleteBootstrap`), and no per-name
    tracking is required.
10. Various changes to unit tests. The old unit tests tested the
    interaction between `NameManager` and `SelectorCache`, where as the
    new unit tests now test the interaction between `NameManager` and
    `IPCache`.

Signed-off-by: Sebastian Wicki <sebastian@isovalent.com>
1 parent 625e39f
History
File Mode Size
.devcontainer
.github
.nvim
.vscode
Documentation
api
bpf
bugtool
cilium-dbg
cilium-health
clustermesh-apiserver
contrib
daemon
examples
hack
hubble
hubble-relay
images
install
operator
pkg
plugins
test
tools
vendor
.authors.aux -rw-r--r-- 416 bytes
.clang-format -rw-r--r-- 7.6 KB
.clomonitor.yml -rw-r--r-- 984 bytes
.gitattributes -rw-r--r-- 887 bytes
.gitignore -rw-r--r-- 1.8 KB
.golangci.yaml -rw-r--r-- 4.4 KB
.mailmap -rw-r--r-- 6.9 KB
AUTHORS -rw-r--r-- 51.5 KB
CODEOWNERS -rw-r--r-- 28.2 KB
CODE_OF_CONDUCT.md -rw-r--r-- 2.2 KB
CONTRIBUTING.md -rw-r--r-- 691 bytes
FURTHER_READINGS.rst -rw-r--r-- 6.4 KB
LICENSE -rw-r--r-- 11.1 KB
MAINTAINERS.md -rw-r--r-- 4.6 KB
Makefile -rw-r--r-- 25.3 KB
Makefile.defs -rw-r--r-- 7.5 KB
Makefile.docker -rw-r--r-- 7.1 KB
Makefile.kind -rw-r--r-- 16.8 KB
Makefile.quiet -rw-r--r-- 818 bytes
README.rst -rw-r--r-- 19.6 KB
SECURITY-INSIGHTS.yml -rw-r--r-- 2.1 KB
SECURITY.md -rw-r--r-- 1.0 KB
USERS.md -rw-r--r-- 35.0 KB
VERSION -rw-r--r-- 11 bytes
Vagrantfile -rw-r--r-- 14.9 KB
go.mod -rw-r--r-- 13.7 KB
go.sum -rw-r--r-- 97.3 KB
netlify.toml -rw-r--r-- 92 bytes
stable.txt -rw-r--r-- 8 bytes
vagrant_box_defaults.rb -rw-r--r-- 334 bytes

README.rst

back to top