https://github.com/cilium/cilium

Revision Author Date Message Commit Date
c7456db Update Docker dependency Stops https://github.com/advisories/GHSA-v23v-6jw2-98fq from appearing in container vulnerability scans. This is for tidiness, Cilium is not affected by this issue. Signed-off-by: Feroz Salam <feroz.salam@isovalent.com> 16 August 2024, 07:25:17 UTC
2aa5dca chore(deps): update dependency cilium/cilium-cli to v0.16.15 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 15 August 2024, 11:04:45 UTC
04dcb9e envoy: Switch to image with timestamp tag As renovate is comparing the tag versions from left to right, using the tag with timestamp will enable any updates in cilium/proxy, not just only for envoy version changes like what we have right now. Relates: #34381 Signed-off-by: Tam Mach <tam.mach@cilium.io> 15 August 2024, 08:02:10 UTC
1a9f0aa envoy: Bump golang version This also includes other go package upgrade as well. Signed-off-by: Tam Mach <tam.mach@cilium.io> 14 August 2024, 13:31:37 UTC
d4043b4 install: Update image digests for v1.16.1 Generated from https://github.com/cilium/cilium/actions/runs/10385466378 ## Docker Manifests ### cilium `quay.io/cilium/cilium:v1.16.1@sha256:0b4a3ab41a4760d86b7fc945b8783747ba27f29dac30dd434d94f2c9e3679f39` `quay.io/cilium/cilium:stable@sha256:0b4a3ab41a4760d86b7fc945b8783747ba27f29dac30dd434d94f2c9e3679f39` ### clustermesh-apiserver `quay.io/cilium/clustermesh-apiserver:v1.16.1@sha256:e9c77417cd474cc943b2303a76c5cf584ac7024dd513ebb8d608cb62fe28896f` `quay.io/cilium/clustermesh-apiserver:stable@sha256:e9c77417cd474cc943b2303a76c5cf584ac7024dd513ebb8d608cb62fe28896f` ### docker-plugin `quay.io/cilium/docker-plugin:v1.16.1@sha256:243fd7759818d990a7f9b33df3eb685a9f250a12020e22f660547f9516b76320` `quay.io/cilium/docker-plugin:stable@sha256:243fd7759818d990a7f9b33df3eb685a9f250a12020e22f660547f9516b76320` ### hubble-relay `quay.io/cilium/hubble-relay:v1.16.1@sha256:2e1b4c739a676ae187d4c2bfc45c3e865bda2567cc0320a90cb666657fcfcc35` `quay.io/cilium/hubble-relay:stable@sha256:2e1b4c739a676ae187d4c2bfc45c3e865bda2567cc0320a90cb666657fcfcc35` ### operator-alibabacloud `quay.io/cilium/operator-alibabacloud:v1.16.1@sha256:4381adf48d76ec482551183947e537d44bcac9b6c31a635a9ac63f696d978804` `quay.io/cilium/operator-alibabacloud:stable@sha256:4381adf48d76ec482551183947e537d44bcac9b6c31a635a9ac63f696d978804` ### operator-aws `quay.io/cilium/operator-aws:v1.16.1@sha256:e3876fcaf2d6ccc8d5b4aaaded7b1efa971f3f4175eaa2c8a499878d58c39df4` `quay.io/cilium/operator-aws:stable@sha256:e3876fcaf2d6ccc8d5b4aaaded7b1efa971f3f4175eaa2c8a499878d58c39df4` ### operator-azure `quay.io/cilium/operator-azure:v1.16.1@sha256:e55c222654a44ceb52db7ade3a7b9e8ef05681ff84c14ad1d46fea34869a7a22` `quay.io/cilium/operator-azure:stable@sha256:e55c222654a44ceb52db7ade3a7b9e8ef05681ff84c14ad1d46fea34869a7a22` ### operator-generic `quay.io/cilium/operator-generic:v1.16.1@sha256:3bc7e7a43bc4a4d8989cb7936c5d96675dd2d02c306adf925ce0a7c35aa27dc4` 
`quay.io/cilium/operator-generic:stable@sha256:3bc7e7a43bc4a4d8989cb7936c5d96675dd2d02c306adf925ce0a7c35aa27dc4` ### operator `quay.io/cilium/operator:v1.16.1@sha256:258b28fefc9f3fe1cbcb21a3b2c4c96dcc72f6ee258eed0afebe9b0ac47f462b` `quay.io/cilium/operator:stable@sha256:258b28fefc9f3fe1cbcb21a3b2c4c96dcc72f6ee258eed0afebe9b0ac47f462b` Signed-off-by: Cilium Release Bot <noreply@cilium.io> 14 August 2024, 12:02:50 UTC
6857905 Prepare for release v1.16.1 Signed-off-by: Cilium Release Bot <noreply@cilium.io> 13 August 2024, 13:37:09 UTC
17426bc policy: Add Port Range Precedence Tests [ upstream commit 8991904ee3c4aeed808fb7b6dcfe407ea5a205a8 ] Port range precedence tests specifically tests that port-protocol and deny-allow precedence is correct for all permutations of deny-allow-port precedence. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 13 August 2024, 12:54:16 UTC
1992c02 policy: Port Range Match Label Fix [ upstream commit 0e699f0ce5d1ac1cd9233e8a2e724ff4d7f2ca02 ] The original port range changes for EgressCoversContext inadvertently undid implicit logic underlying using the go builtin map. That is that deny entries would always be correctly identified by label, based on the fact that no more than one port-protocol key could match a given set of labels. After the introduction of port ranges this implicit assumption was false, but no logic to prioritize deny entries over allows was introduced. Instead the code naively looked up the longest prefix match rule by port-protocol. This works most of the time, but fails for deny entries of lesser port-protocol prefix matches when a longer prefix allow is present. This commit introduces a new method on the L4PolicyMap `MatchesLabels` that looks up applicable rules by label using the longest prefix match lookup by port-protocol, but correctly finds deny entries that may have lower port-protocol precedence. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 13 August 2024, 12:54:16 UTC
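The precedence problem described in commit 1992c02 can be illustrated with a minimal sketch. This is not Cilium's `L4PolicyMap` or its `MatchesLabels` method: the `rule` type, `longestMatch`, and `allowed` are hypothetical names, labels are left out entirely, and a port range is modelled as a bit prefix over the 16-bit port number. It only shows why a longest-prefix lookup alone can miss a broader deny entry.

```go
package main

import "fmt"

// rule is a hypothetical, simplified policy entry keyed by a port prefix.
type rule struct {
	port      uint16 // base port of the range
	prefixLen uint8  // leading bits that must match (16 = one port); assumed <= 16
	deny      bool
}

// covers reports whether port p falls inside the rule's port prefix.
func (r rule) covers(p uint16) bool {
	mask := uint16(0xFFFF) << (16 - r.prefixLen)
	return p&mask == r.port&mask
}

// longestMatch mimics the naive lookup: only the most specific entry decides.
func longestMatch(rules []rule, p uint16) (rule, bool) {
	var best rule
	found := false
	for _, r := range rules {
		if r.covers(p) && (!found || r.prefixLen > best.prefixLen) {
			best, found = r, true
		}
	}
	return best, found
}

// allowed applies deny precedence: any covering deny entry wins,
// regardless of how specific its port prefix is.
func allowed(rules []rule, p uint16) bool {
	matched := false
	for _, r := range rules {
		if !r.covers(p) {
			continue
		}
		if r.deny {
			return false
		}
		matched = true
	}
	return matched
}

func main() {
	rules := []rule{
		{port: 8000, prefixLen: 10, deny: true},  // deny ports 8000-8063
		{port: 8001, prefixLen: 16, deny: false}, // allow exactly port 8001
	}
	best, _ := longestMatch(rules, 8001)
	fmt.Println("longest match is a deny entry:", best.deny)
	fmt.Println("allowed with deny precedence:", allowed(rules, 8001))
}
```

In this sketch the naive lookup selects the more specific allow entry for port 8001, while the deny-aware check rejects the port because a shorter-prefix deny also covers it.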
d458088 ci: update docs-builder Signed-off-by: Cilium Imagebot <noreply@cilium.io> 13 August 2024, 11:55:25 UTC
77ba95c ci: update docs-builder [ upstream commit 4e012f493ed7eda2a745be9dd22db5503ed06765 ] Signed-off-by: Cilium Imagebot <noreply@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
96ef822 Documentation: Add support for redirects for moved/renamed pages [ upstream commit 60f21b78ad8b34814f4a86dc1ca772d8d0ce095c ] Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
f0ff5f3 Documentation: Fix update-requirements [ upstream commit b145514ee68a8991bfc1b14f4f338a32260361a2 ] When running on my Mac, update-requirements was failing due to lack of permissions to /.local since it's running with the UID/GID of my Mac, so to work around it, set HOME=/tmp before running pip so that pip has permissions to write its packages somewhere. /tmp is fine because this command is running in an ephemeral container just to record the dependencies in requirements.txt and the packages aren't needed long term anyway. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
0cb3777 gha: Free up Github runner disk space [ upstream commit e553bd23443ef14c8cbbeeafa7ebab3bb30c3fac ] We are having the below failure due to no disk space, it seems like we can remove pre-installed software and language runtimes, which are not used in Cilium, to reclaim more disk space. An alternative option is to bump the runner, but it might not be the best resource and cost utilization. Relates: https://github.com/cilium/cilium/actions/runs/10300396788 Relates: https://github.com/actions/runner-images/issues/2840#issuecomment-2272410832 Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
7d8b60b Remove appArmorProfile from CronJob helm template [ upstream commit 55c44d66eddff116709a51156824f9c79a3d0c24 ] The condition is reversed, and fixing it breaks CI. See https://github.com/cilium/cilium/pull/33077 Signed-off-by: Mathieu Parent <math.parent@gmail.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
642f47f bgpv1: Fix DiffStore to work with multiple server instances [ upstream commit 4a9d01c8c362e7f01a50b320c203e25043817776 ] [ backporter's note: trivial conflicts due to BGPInstance doesn't have CancelCtx in 1.16. ] As the same DiffStore instance is used to diff services when reconciling for multiple server instances (virtual routers), we need to differentiate the diff per instance (and to be future-proof per reconciler as well) for diff to work properly. This introduces a new callerID argument to the Diff method of the DiffStore to provide caller-specific diff as well as InitDiff(callerID) and CleanupDiff(callerID) methods to initiate and cleanup caller-specific diffs. To wire this properly in the service reconcilers, new reconciler methods Init(instance) and Cleanup(instance) have been added to the reconciler API. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> Signed-off-by: gray <gray.liang@isovalent.com> 13 August 2024, 11:55:25 UTC
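The caller-specific diff idea in commit 642f47f can be sketched as follows. The method names `InitDiff`, `Diff`, and `CleanupDiff` come from the commit message, but everything else is an assumption made for illustration: the `diffStore` type, the `Upsert` helper, string keys, and the return type are hypothetical and do not reflect the real Cilium signatures.

```go
package main

import (
	"fmt"
	"sort"
)

// diffStore is a hypothetical store that tracks pending changes per caller.
type diffStore struct {
	pending map[string]map[string]struct{} // callerID -> keys changed since that caller's last Diff
}

func newDiffStore() *diffStore {
	return &diffStore{pending: map[string]map[string]struct{}{}}
}

// InitDiff starts tracking changes for one caller.
func (s *diffStore) InitDiff(callerID string) {
	s.pending[callerID] = map[string]struct{}{}
}

// CleanupDiff drops the tracking state of one caller.
func (s *diffStore) CleanupDiff(callerID string) {
	delete(s.pending, callerID)
}

// Upsert records a changed key for every registered caller.
func (s *diffStore) Upsert(key string) {
	for _, keys := range s.pending {
		keys[key] = struct{}{}
	}
}

// Diff returns and clears the changes pending for this caller only,
// so one caller consuming its diff does not hide changes from another.
func (s *diffStore) Diff(callerID string) []string {
	keys := s.pending[callerID]
	out := make([]string, 0, len(keys))
	for k := range keys {
		out = append(out, k)
		delete(keys, k)
	}
	sort.Strings(out)
	return out
}

func main() {
	s := newDiffStore()
	s.InitDiff("instance-1")
	s.InitDiff("instance-2")
	s.Upsert("default/svc-a")

	fmt.Println(s.Diff("instance-1")) // instance-1 consumes its diff
	fmt.Println(s.Diff("instance-2")) // instance-2 still sees the change
	fmt.Println(s.Diff("instance-1")) // nothing new for instance-1
	s.CleanupDiff("instance-1")
}
```

With a single shared diff, the first reconciling instance would consume the change and the second would see nothing, which is the failure mode the commit describes for multiple virtual routers.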
8b2be4a ci: update docs-builder [ upstream commit c806a85365cbee3eb9c8d9982d74cbe96c2ec40b ] Signed-off-by: Cilium Imagebot <noreply@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
5f8f1df docs: Build the builder before using it [ upstream commit 79cd77ebdb2ae5dd8f1fc23cdd0f7b9c5d9e93ba ] Previously this target could run two dependencies in parallel: Build the builder image, then also use it to run some commands. Fix it by moving the builder dependency as a concrete step under the cilium-build target, before doing the build within that container. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
a4caa4e docs: Fix UndefinedVar warning in update-cmdref [ upstream commit f440d5ce1ee6b3e20457e3c016f6e13752b728ea ] The READTHEDOCS_VERSION variable was not available in ENV during the build process, which would lead to the following extraneous warning to be printed in some cases when building the docs: UndefinedVar: Usage of undefined variable env READTHEDOCS_VERSION Fix this by updating the docs-builder image to receive this also as a build argument, and pass it in when built locally. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
dfc67dd Makefile.docker: Fix extraneous docker complaint [ upstream commit 66467a926a28e27242f673895a53ded48c3fb222 ] When running builds of the main tree within a container, like how some of the Documentation/ make targets do, the build would often fail with a complaint about how the 'docker' binary doesn't exist. This is because within the container, it doesn't. However, docker is not a hard dependency on the build process, so we can weaken the builder detection here in the case where the docker binary is not available. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
894beec Makefile: Fix GIT_VERSION detection for worktrees [ upstream commit f2fa8013530b21b5067a31b7fe9283b133fd744b ] If using git worktrees, '.git' is a file that contains contents referring to the original checkout of the code. For many of the make targets in the tree, they run inside containers, and the default docker flags aren't set up to properly map the original git worktree checkout into the container in order to allow git worktrees to work as expected. Work around this by ensuring that the git detection fails for the case of git worktrees to fall back to the version of these targets that doesn't need git. While we're at it, silence a common warning in this case where the `GIT_VERSION` file doesn't exist in the tree. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
566200b Makefile: Fix docker flags for fast image targets [ upstream commit 64458afc339a13695fb5626abc5dbc5415b877b8 ] When attempting to use the fast image targets from scratch, I experienced the following symptom: $ make kind-image-fast CHECK kind-ready kind is ready GO cilium-dbg/cilium-dbg GO daemon/cilium-agent GO operator/cilium-operator-generic GO clustermesh-apiserver/clustermesh-apiserver the input device is not a TTY make: *** [Makefile.kind:262: kind-image-fast-clustermesh-apiserver] Error 1 make: *** Waiting for unfinished jobs.... Tracing through to these Makefile targets, they are all passing "-t" flag (allocate a pseudo-terminal) and "-i" (attach stdin) to docker. However, the commands being run are pretty basic unix tools, none of which are using advanced terminal functionality or even stdin. Remove the unnecessary flags to resolve the issue. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
4dd904b Documentation: Use tabs for Hubble TLS configuration docs [ upstream commit f9dd080d186402ae29ed6c722c6c52f7227644f8 ] Also add some troubleshooting tips for each method of configuring TLS based on my practical experience with each method. As part of switching to tabs, I have also reordered the list of automated Hubble TLS methods. We want to encourage users to use the best possible method of managing TLS and generally we encourage cert-manager or the CronJob/certgen over the helm and user-provided certificate options, so let's update the list of options and tab ordering to reflect that. The new order is: - cert-manager - CronJob (certgen) - helm - user provided certificates Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
a924a31 iptables: Support Envoy listener chaining [ upstream commit 4c9cf375b38aa90204a1396baf71b584ef7adec5 ] Allow Envoy listener chaining via the loopback device by not routing transparent proxy traffic destined to the loopback device to the cilium_host device. Fixes: #32683, cilium/proxy#742 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
b10ad54 Documentation: Update Hubble user provided TLS certs instructions [ upstream commit e2359977a0bf6d319553f1ea75ca9c98e697f063 ] Instead of suggesting users put TLS private keys into their helm values, instruct them to create secrets themselves and configure their helm values to use these pre-existing secrets. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
6153650 Documentation: Move Hubble TLS User provided certificates section last [ upstream commit 281bdb3b9506638cc87c9b938fb0dd6bfd69eed8 ] Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
209b869 helm: Add existingSecret option for hubble user provided TLS secrets [ upstream commit 5339161bfdcccf1bcb8d3ef3a5b9db803208ef81 ] [ backporter's note: minor conflicts in Documentation/operations/upgrade.rst ] Also deprecate specifying Hubble TLS key/cert values directly in Helm values. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: gray <gray.liang@isovalent.com> 13 August 2024, 11:55:25 UTC
bca3635 docs: Add warning on CRDs requirement for using the Gateway API [ upstream commit aa6f482d96b3e4989b28b74e89fa4ff798f84624 ] Add a warning for the Gateway API, about the necessity to have experimental CRDs installed due to a known issue. [ Quentin: Removed spurious spaces, re-wrapped text, cleaned up commit title and description ] Signed-off-by: Christine Kim <xtineskim@gmail.com> Signed-off-by: Quentin Monnet <qmo@qmon.net> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
3c4e811 Add source IP visibility info to Ingress and Gateway API docs [ upstream commit 5e981cce0209f8838de91f0c55f5d981488adbe8 ] This commit adds information about how best to get access to the source IP address for Ingress and Gateway API configs. As part of this, break out the Reference section recently added to the Ingress config, and bring it into the Gateway API docs as well (for all the areas where the same things apply). Signed-off-by: Nick Young <ynick@cisco.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
8b794c8 Documentation: Update readthedocs configuration [ upstream commit 88e69c49ffed08711292e453255a6ccdff68cbd3 ] Readthedocs used to inject configuration into conf.py at build time, but they're deprecating this functionality in the coming months. In preparation, we need to make a couple of tweaks to ensure that RTD can inject configuration options. The following link contains more details: https://about.readthedocs.com/blog/2024/07/addons-by-default/ Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
fe7177b .github/actions/ginkgo: move 'WireGuard...' test focus to f21. [ upstream commit f4837f8e2904d163f70fc74952d8b90fffaffc67 ] As a final measure, this moves the 'WireGuard encryption strict mode' type tests to f21. These collectively take around 7 minutes, so this should balance f09/f21 more. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
4301b40 .github/actions/ginkgo: move high scale ipcache tests to f21. [ upstream commit 62ca60c71d0cf20d59864c02a1118c9cddaa2739 ] f09 is still quite a bit heavier than f21 in terms of test time. This is because many of the tests moved to f09 are actually skipped. High scale ipcache is currently the longest running test inside f09 so let's move that to f21 as well. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
07c4428 .github: ginkgo: remove duplicate datapath ipv4only test in f09/f21. [ upstream commit 22372be9c6302a76b40cf64d46d6da96b0236c1d ] e150ea1 sought to split these tests into two, however 'K8sDatapathConfig IPv4Only' was in the list twice and so was left in both test suites. This removes this from f09. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
c4b5271 .github: split ginkgo 'f09-datapath-misc-2' into two focus suites. [ upstream commit e150ea17301e00b3e44de25dfc6613026e9ec82e ] This test is taking a while, and is routinely hitting the timeouts. This breaks the 'f09-datapath-misc-2' into a new one 'f21-datapath-misc-3' which runs about half of the previous focuses tests. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
fddba83 ci: multi pool run tests concurrently [ upstream commit 7a348ad934bb45b2fb81ced7f46ef48de0866fe2 ] Conformance multi pool IPAM workflow improved with connectivity test concurrent run to make it faster. Signed-off-by: viktor-kurchenko <viktor.kurchenko@isovalent.com> Signed-off-by: gray <greyschwinger@gmail.com> 13 August 2024, 11:55:25 UTC
d80af8b Revert "fix: support validation of stringToString values in ConfigMap" [ upstream commit c4c3b1c15d98763f6831cfdb5506c144716fb01c ] This reverts commit c57bfb6b9c52f5ef0b80b4ee77653dad485cd675. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: gray <greyschwinger@gmail.com> 12 August 2024, 12:09:01 UTC
a15dcc0 vendor: Bump StateDB to version v0.2.4 [ upstream commit df79ac9f0fd493cd14ce6e65e89df52df8c6459d ] [ backporter's notes: bumped to v0.2.4 and not v0.2.1 ] Update StateDB and fix up the API usage. The new version brings a simpler API and defaults for the reconciler to make its usage easier. Signed-off-by: Jussi Maki <jussi@isovalent.com> 09 August 2024, 11:58:45 UTC
887881f images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 07 August 2024, 17:00:30 UTC
d5107f7 chore(deps): update dependency protocolbuffers/protobuf to v27.3 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 07 August 2024, 17:00:30 UTC
e91007d bgpv2: Render route policies for LB VIPs instead of pools [ upstream commit 9bfc9929f2fb71aeae492084885a096f9bce99af ] BGPv2 API required Service advertisement selector to match both Service as well as CiliumLoadBalancerIPPool from which the service got assigned the LB VIP. This was not documented, confused some users, and could not work properly in some cases (large pool for many services, from which only some should be advertised). To fix all of that, we are removing the requirement to match the pool and render route-policies per LB VIP instead of per LB pool. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
6b4d920 policy: Sanitize DNS Rules to Disallow Port Ranges [ upstream commit 005aed241b0ca39c91b31a14352b299187259af2 ] DNS rules do not (yet) support port ranges. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
a814c56 fqdn: fix TCP shared client refcount [ upstream commit e01842adf66cb15749aaf16a7c886c2b5287b33f ] The reference count for TCP shared clients was not decreased since we skipped the closer and let the closing happen as part of the closing of the downstream TCP connection. However, the shutdown method would only close the upstream connection if the reference count was zero - a left-over from lifting the closer code. This led to the situation that the reference count would be increased by each query, but not decreased until the "shutdown" occurred. Since that decreased only once, however, the upstream connection was not necessarily closed. To fix this, we unconditionally close the shared client when the downstream connection closes. This in effect means that the reference count is only ever increased, which is confusing at best - this area is ripe for a more sweeping improvement. Fixes: 7007334438 ("dnsproxy: avoid multiple upstream DNS TCP conns") Reported-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
2d2a24d fqdn: fix deadlock in dnsproxy shared client [ upstream commit 2a7f19adf546e67b4d1cf3fb2b6dfac86e1bc694 ] Introduced a self-deadlock in the shared client by forgetting to unlock in the case of not finding a client. Also clean up a bad copy-pasta comment. Fixes: 7007334438 ("dnsproxy: avoid multiple upstream DNS TCP conns") Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
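The class of bug fixed in commit 2d2a24d, a lock left held on a not-found return path, can be shown with a small sketch. The `registry` and `client` types and the `lookup` method are hypothetical and are not the dnsproxy shared client code, and the actual commit may have fixed the problem differently. A deferred unlock is one common way to cover every return path.

```go
package main

import (
	"fmt"
	"sync"
)

// client is a hypothetical stand-in for a shared upstream client.
type client struct{ id string }

// registry is a hypothetical lookup table guarded by a mutex.
type registry struct {
	mu      sync.Mutex
	clients map[string]*client
}

// lookup returns the client for key, or nil when none is registered.
// The deferred Unlock runs on every return path, including the
// not-found path where a manual Unlock is easy to forget.
func (r *registry) lookup(key string) *client {
	r.mu.Lock()
	defer r.mu.Unlock()

	c, ok := r.clients[key]
	if !ok {
		return nil // an early return without an unlock would leave mu held
	}
	return c
}

func main() {
	r := &registry{clients: map[string]*client{"a": {id: "a"}}}

	// Two consecutive misses: the second call would block forever
	// if the first one had returned with the mutex still locked.
	fmt.Println(r.lookup("missing") == nil)
	fmt.Println(r.lookup("missing") == nil)
	fmt.Println(r.lookup("a").id)
}
```

Because `sync.Mutex` is not reentrant, the next call on the same registry blocks as soon as one path forgets to unlock, which matches the self-deadlock the commit message describes.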
bec1ce0 bpf: egressgw: don't install allow-all policy in to-netdev tests [ upstream commit 5ec17e8200baf816cffccdc832ce5ac05dc129c2 ] This was probably copy&pasted from the bpf_lxc tests. But to-netdev doesn't have any egress-policy hook, so we don't need to explicitly allow the test traffic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
1382891 dnsproxy: avoid multiple upstream DNS TCP conns [ upstream commit 7007334438cd7a0dc446df46c0d22442a8fb2312 ] Users have reported errors in DNS proxy transparent mode in #31197: Cannot forward proxied DNS lookup: "connect: cannot assign requested address" The error stems from a racy interaction with the kernel and manifests seldomly because a few conditions need to be satisfied for it to occur: 1. DNS over TCP must be used (either by choice or because a UDP DNS reply was marked as truncated, mandating the client to retry using TCP) 2. DNS proxy transparent mode must be enabled 3. the client must send multiple queries over the same TCP connection (e.g. A and AAAA) 4. the kernel-internal TCP state machine must reach FIN_WAIT2 (i.e. receive an ACK, but not a FIN ACK to its FIN) Of these, condition 1 occurs relatively infrequently - 2 is the default, 3 is almost a guarantee due to A/AAAA queries being sent in parallel by most clients. Condition 4 is a wildcard - I happened to reproduce because I am using a VM using the SLIRP network stack which ACKs the FIN before sending it out. The race condition is that the DNS proxy attempts to bind-then-connect with the same source addr/port in rapid succession. It does so because it (mistakenly) opens a new TCP connection for each query received on the client TCP connection. That, in turn, occurs because the cilium/dns.Server used by the proxy exhibits head-of-line blocking in the TCP case. There is infrastructure to share the same socket/"shared client" in the code, but due to the HOL blocking the sharing never actually occurs (as there is no concurrency). The DNS client sharing infrastructure works the same for UDP and TCP. For UDP, we can only keep the socket alive for as long as we have an outstanding request. Therefore, the TCP connection to upstream is torn down for every query, only to be reopened for the next. Since many DNS clients fire off A and AAAA queries in parallel, this occurs almost always. 
The kernel component of this bug is that once FIN_WAIT2 is reached, the SO_REUSEADDR socket option no longer works, and connect fails to bind the address due to a cool-down timer. (Since we cannot change the kernel with a fast turnaround in deployments, we need to work around the kernel bug.) After all of this, the proposed solution is relatively simple, but not perfect. The main idea is to avoid opening multiple TCP connections to the upstream DNS server. To do so, we employ a small hack - for DNS over TCP, we tie the lifecycle of the shared client to the downstream TCP connection: whenever the downstream connection breaks, we also close the upstream connection. The implementation is somewhat non-obvious, since the library's server interface is modeled after the HTTP request/response handling scheme - hence the underlying transport is abstracted away. We need access to the signal that the underlying conn has closed, thus we wrap the listener and TCP connection. The wrappers forward the close signal via the shared client registry, closing the upstream connection. The primary benefit of doing it this way is the minimal amount of changes needed, increasing confidence and possibility of backport. The disadvantages are as follows: 1. This does not solve the head of line blocking issue, which is acceptable for a server, but actually pretty bad for a DNS proxy. (This is likely a cause for high tail latency in the TCP case.) 2. shared clients contain reference counting logic that is not actually used in the TCP case, likely confusing the next person to stare at this code. 3. Following the flow of the implementation has gotten a bit harder still, since the wrapping creates one more layer of indirection. One alternative is a much more invasive rewrite of the proxy: if we take control over the transport aspect (i.e. reimplement dns.Server), we can reduce complexity. 
However, this has a high risk of breaking something in the process, and unfortunately the DNS library doesn't expose all the necessary fine-grained parsing methods - we'd likely have to switch to x/net/dns/dnsmessage or similar, which exposes more low-level DNS primitives. Suggested-by: Sebastian Wicki <sebastian@isovalent.com> Suggested-by: Fabian Fischer <fabian.fischer@isovalent.com> Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
1b944d3 dnsproxy: move shared client implementation in [ upstream commit 0a2010a03aa56d877a331d7d267756ac87b0d29d ] The implementation of sharing upstream DNS clients and associated state lived in our cilium/dns fork. Shared clients are needed only for the agent's transparent DNS proxy mode, making cilium/cilium a better place for the functionality than the library. In addition, upcoming refactors specialise the functionality further to cilium's usecase. (Removing the code from cilium/dns is not possible just yet, as stable branches depend on it.) The commit contains no intentional functional changes, though some massaging of the code was necessary due to unexported library functionality. Specifically, getTimeoutForRequest had to be adapted slightly, sync.Mutexes to be swapped for our own lock.Mutexes as well as stdlib "time" for pkg/time. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 07 August 2024, 08:43:29 UTC
a134e12 gha: Exempt GatewayInfrastructurePropagation Gateway Conformance test [ upstream commit cff3456c7ee560cc88d272fc768f89cbdb45e27e ] This test is not supported in standard channel, hence we need to add to the exclude list. Signed-off-by: Tam Mach <tam.mach@cilium.io> 07 August 2024, 07:51:34 UTC
44dfef8 gateway-api: Add required labels and annotations [ upstream commit 5e6e4af1cf9b541ad27464f409e16375fa126011 ] This commit is to add the required label gateway-name e.g. `gateway.networking.k8s.io/gateway-name`, and propagate all labels and annotations from spec.infrastructure in all generated resources. The main goal is to conform with below GEP. Relates: https://github.com/kubernetes-sigs/gateway-api/pull/1757 Signed-off-by: Tam Mach <tam.mach@cilium.io> 07 August 2024, 07:51:34 UTC
399afcb gha: Add extended features in gateway profile run [ upstream commit 9025fd4d755fef8774a2894e476e8f86e554876b ] Unlike normal run, Gateway API conformance profile didn't work well with exempt features flag, which causes issues of missing extended features in the report. This commit is to use --skip-tests instead. Signed-off-by: Tam Mach <tam.mach@cilium.io> 07 August 2024, 07:51:34 UTC
1b3ecd1 bgpv2: deprecate local port setting [ upstream commit 2193ab83dfe235861347c0ac47ee6f0c6f594329 ] Deprecate setting local port for BGP peer transport configuration. Setting the local port is unnecessary from the user's perspective; it should be an ephemeral port picked by the kernel when establishing the TCP connection. Signed-off-by: harsimran pabla <hpabla@isovalent.com> 06 August 2024, 17:18:24 UTC
bda518c v1.16: Remove leftover backporter state file This file was accidentally committed to the tree. It contains ephemeral state of the `backporter` tool and thus can be deleted. Fixes: 92c110e58a7b ("gateway-api: Enqueue gateway for Reference Grant changes") Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 06 August 2024, 16:31:47 UTC
7a8f4a3 etcd: fix paginated list missing events with parallel operations [ upstream commit 8b210fb7124eef3194378cdfd9971da39b5b5f08 ] Currently, the etcd paginatedList implementation is affected by a bug that can lead to missing events of both upsertions and deletions that are performed between the end of the first Get call and the last Get call. Indeed, the tracked revision is incorrectly updated after every Get call, and etcd always returns the current revision, regardless of whether WithRev is actually requesting a specific revision. In turn, this leads to subsequent Get calls targeting different revisions, hence missing all the events happened in between for the already processed chunks of the prefix, as the watch operation is eventually started from the latest retrieved revision. Let's address this by consistently using the same revision during the entire paginatedList execution, to ensure that we correctly list all entries at that revision, and subsequently start watching events from the next one. Additionally, let's update the associated unit test to additionally cover the parallel operations case and prevent future regressions in this respect. This bug affected both Cilium running in KVStore mode and clustermesh, although the race condition window is sufficiently short to trigger its occurrence only rarely and in high scale environments. Fixes: e33b9f9bff20 ("kvstore: add support for paginated lists in etcd.ListAndWatch") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 06 August 2024, 15:22:48 UTC
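The fix described in commit 7a8f4a3, pinning one revision for the whole paginated list and starting the watch from the next one, can be sketched against a toy in-memory store. Nothing here uses the etcd client: `store`, `event`, `get`, and `paginatedList` are hypothetical names. The store copies only the behaviour the commit message mentions, namely that each read reports the current revision even when an older revision was requested.

```go
package main

import (
	"fmt"
	"sort"
)

// event is one change in the toy store's history.
type event struct {
	rev     int64
	key     string
	deleted bool
}

// store is a hypothetical multi-version key store; it is not the etcd client.
type store struct {
	rev     int64
	history []event
}

func (s *store) put(key string) {
	s.rev++
	s.history = append(s.history, event{rev: s.rev, key: key})
}

// get returns up to limit keys >= from as of revision rev (0 means latest).
// It always reports the store's current revision, whatever rev was requested.
func (s *store) get(from string, limit int, rev int64) ([]string, int64) {
	if rev == 0 {
		rev = s.rev
	}
	live := map[string]bool{}
	for _, e := range s.history {
		if e.rev <= rev {
			live[e.key] = !e.deleted
		}
	}
	var keys []string
	for k, ok := range live {
		if ok && k >= from {
			keys = append(keys, k)
		}
	}
	sort.Strings(keys)
	if len(keys) > limit {
		keys = keys[:limit]
	}
	return keys, s.rev
}

// paginatedList lists all keys page by page at one pinned revision and
// returns the revision from which a watch should start.
func paginatedList(s *store, pageSize int, betweenPages func()) ([]string, int64) {
	var all []string
	var pinned int64 // 0 on the first page: read the latest revision
	from := ""
	for {
		keys, current := s.get(from, pageSize, pinned)
		if pinned == 0 {
			pinned = current // pin once; later pages must not move it
		}
		all = append(all, keys...)
		if len(keys) < pageSize {
			return all, pinned + 1
		}
		from = keys[len(keys)-1] + "\x00" // smallest key after the last one
		betweenPages()
	}
}

func main() {
	s := &store{}
	for _, k := range []string{"a", "b", "c", "d"} {
		s.put(k)
	}
	done := false
	keys, watchFrom := paginatedList(s, 2, func() {
		if !done {
			s.put("aa") // concurrent write into an already listed chunk
			done = true
		}
	})
	fmt.Println(keys, "watch from revision", watchFrom)
}
```

In this sketch the concurrent write lands at revision 5. Because the list stays pinned at revision 4 and the watch starts at 5, the write is delivered as a watch event rather than lost; updating the tracked revision after every page would move the watch start past it.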
4a5dfbd service: Relax protocol matching for L7 Service [ upstream commit ffcfbd75bc305093b72e30eabb4b78b3a7a1ec8b ] Currently, the service datapath maps for both ipv4 and ipv6 didn't use protoc field, hence despite the protocol value is passed from k8s Service spec, the same information is not stored in datapath maps. Upon agent restart, service restoration will just perform no-op for such field (i.e. NONE will be used), this will cause severe failure for L7 service due to the protocol mismatch [^1]. This commit is to make sure that UDP and SCTP are not allowed instead, the reason is to avoid extra dependencies with ongoing work [^2], which will treat lb.NONE, (newly added) lb.ANY and lb.TCP in the same way. [^1]: https://github.com/cilium/cilium/blob/19c74877ffc0d35daf81f4e0cde81623d4246c5c/pkg/service/service.go#L697-L702 [^2]: https://github.com/cilium/cilium/pull/33434 Signed-off-by: Tam Mach <tam.mach@cilium.io> 06 August 2024, 12:36:52 UTC
d72fce1 ciliumenvoyconfig: Qualify internal listener references [ upstream commit a04db7162fd103bf3b178d9b4be31143f4c82e38 ] Qualify internal listener references in endpoint addresses so that they work as expected. If the name is in the same resource, validate that listener is also defined. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
392390a ciliumenvoyconfig: Check for duplicate after name is qualified [ upstream commit d7440a5dc564030276f3694a17bd53d89b3d68e0 ] Parsed resources have qualified names, so to detect duplicates we need to qualify the name first. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
c8eb755 ciliumenvoyconfig: Fix typos [ upstream commit 7e106d793dd8794c97be462d7daa609617106c2b ] Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
5bcd9e3 bugtool: enhance dumping Envoy information [ upstream commit 43dd4ee416a8eec7d4b2d9a3e8fccbc0893d04d7 ] Currently, during a Cilium sysdump the bugtool dumps the full Envoy config and prometheus metrics. In some cases it would be nice to have some information about listeners and clusters more readily at hand. Therefore this commit enriches the bugtool to also dump the listeners, clusters and server_info during a sysdump. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
8bfbcb9 tests-clustermesh-upgrade: Don't hardcode test namespace [ upstream commit b189c578ea4da02467dc636e338683b8b6b4b012 ] https://github.com/cilium/cilium-cli/pull/2680 changed the behavior of --test-namespace flag. It's now treated as a prefix, and the namespaces always contain "-$index" suffix even if --test-concurrency is set to 1. Instead of hardcoding the namespace, use app.kubernetes.io/name label to find the test namespace. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
fe42273 gateway-api: Add HTTP method condition in sortable routes [ upstream commit a3510fe4a92305822aa1a5e08cb6d6c873c8699a ] As per the below, method match should be considered before the largest of header/query param matches. This commit is to consider method attr into sorting rule. Relates: https://gateway-api.sigs.k8s.io/reference/spec/#gateway.networking.k8s.io/v1.HTTPRouteRule Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
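A minimal sketch of the tie-breaking order described in the commit above: a route with a method match sorts ahead of routes that only differ in header or query-parameter counts. The `route` type and its fields are hypothetical, and path-based precedence (which comes earlier in the Gateway API rules) is deliberately left out.

```go
package main

import (
	"fmt"
	"sort"
)

// route is a hypothetical, simplified view of an HTTP route match.
type route struct {
	name        string
	hasMethod   bool
	headers     int
	queryParams int
}

// sortRoutes orders routes by: method match first, then more header
// matches, then more query parameter matches. Ties keep input order.
func sortRoutes(rs []route) {
	sort.SliceStable(rs, func(i, j int) bool {
		a, b := rs[i], rs[j]
		if a.hasMethod != b.hasMethod {
			return a.hasMethod
		}
		if a.headers != b.headers {
			return a.headers > b.headers
		}
		return a.queryParams > b.queryParams
	})
}

func main() {
	rs := []route{
		{name: "many-headers", headers: 3},
		{name: "method-only", hasMethod: true},
		{name: "method-and-header", hasMethod: true, headers: 1},
	}
	sortRoutes(rs)
	for _, r := range rs {
		fmt.Println(r.name)
	}
}
```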
68267e1 lbipam: fixed bug in sharing key logic [ upstream commit e5912040fe2751529c9cfab3856f4bd8634539a6 ] @brb found a bug in the sharing key logic where if you remove a service that is part of a sharing key and then add it back, it would get a new IP while it should have gotten the same IP back. This turns out to be caused by the `sharingKeyToServiceViewIPs` index which maps a sharing key to a list of IPs associated with that sharing key. Each IP can be assigned to multiple services, however, when a service was removed or received a new IP because it was no longer compatible with the rest of the services with the same sharing key, we would always remove the IP from the index. Because of this, upon re-adding the service it seems like the sharing key wasn't in use so a new IP is allocated and added as the sole IP used by the sharing key. The fix is to remove the IP from the index only if it is not used by any other service with the same sharing key. This commit also adds a regression test for this case. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
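The fix described above amounts to reference-tracking the users of each IP under a sharing key. The following is a sketch under that assumption; `sharingIndex` and its methods are hypothetical and much simpler than the real `sharingKeyToServiceViewIPs` index.

```go
package main

import (
	"fmt"
	"sort"
)

// sharingIndex maps sharing key -> IP -> set of services using that IP.
type sharingIndex map[string]map[string]map[string]bool

func (x sharingIndex) assign(key, ip, svc string) {
	if x[key] == nil {
		x[key] = map[string]map[string]bool{}
	}
	if x[key][ip] == nil {
		x[key][ip] = map[string]bool{}
	}
	x[key][ip][svc] = true
}

// release drops svc as a user of ip. The IP leaves the index only when no
// other service with the same sharing key still uses it.
func (x sharingIndex) release(key, ip, svc string) {
	users := x[key][ip]
	delete(users, svc)
	if len(users) > 0 {
		return // still shared by another service
	}
	delete(x[key], ip)
	if len(x[key]) == 0 {
		delete(x, key)
	}
}

// ips returns the IPs currently associated with a sharing key, sorted.
func (x sharingIndex) ips(key string) []string {
	var out []string
	for ip := range x[key] {
		out = append(out, ip)
	}
	sort.Strings(out)
	return out
}

func main() {
	idx := sharingIndex{}
	idx.assign("key-1", "10.0.0.1", "svc-a")
	idx.assign("key-1", "10.0.0.1", "svc-b")
	idx.release("key-1", "10.0.0.1", "svc-a")
	fmt.Println(idx.ips("key-1")) // svc-b still holds the IP
}
```

With this shape, re-adding `svc-a` finds the sharing key still in use and can be given the same IP again.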
914f0d1 bitlpm: Avoid allocs in CIDR trie lookups [ upstream commit 6347815fe587c63def5812f06afe4c8a83d0e4c2 ] CIDR key functions (pkg/container/bitlpm/cidr.go) used in lookups rely on netip.Addr.AsSlice(), which allocs a new slice. This results in many allocs during tree traversal used for all lookups in CIDR tries. Unfortunately netip package does not provide alloc-free access to the bits needed for CIDR trie implementation. It would be preferable to extend the exported interface of bitlpm.Addr to allow for this, but in the meanwhile we can rely on the unsafe package and our own unit tests that break if the underlying implementation of the prefix bits layout changes. This change accesses the prefix bits as an array of uint64 as stored in the underlying netip.Addr implementation. This has the main benefit of removing the need to allocate new slices whenever accessing bits in the prefix, but also making the CommonPrefix more efficient as it now operates on bigger units (uint64 vs byte). If using unsafe with unit testing guarding against breakage is not acceptable, we could "fork" netip.Addr/Prefix implementation into Cilium-specific data types, but this would be cumbersome as well. 
With this change the BenchmarkFindAffectedChildPrefixes in pkg/ipcache becomes upto 13% faster: Before: pkg/ipcache$ go test -bench BenchmarkFindAffectedChildPrefixes -run XYZ -timeout 0 goos: linux goarch: arm64 pkg: github.com/cilium/cilium/pkg/ipcache BenchmarkFindAffectedChildPrefixes/1000000/10000/Sparseness_0.330000/Trie_false-12 91 12609006 ns/op 4245 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/1000000/10000/Sparseness_0.330000/Trie_true-12 191980 7141 ns/op 4541 B/op 56 allocs/op BenchmarkFindAffectedChildPrefixes/100000/1000/Sparseness_0.330000/Trie_false-12 1018 1165072 ns/op 4280 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/100000/1000/Sparseness_0.330000/Trie_true-12 284762 3604 ns/op 4508 B/op 47 allocs/op BenchmarkFindAffectedChildPrefixes/10000/100/Sparseness_0.330000/Trie_false-12 9751 122269 ns/op 4308 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/10000/100/Sparseness_0.330000/Trie_true-12 512391 2047 ns/op 4507 B/op 36 allocs/op BenchmarkFindAffectedChildPrefixes/1000/10/Sparseness_0.330000/Trie_false-12 85987 13713 ns/op 4214 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/1000/10/Sparseness_0.330000/Trie_true-12 802904 1411 ns/op 4376 B/op 26 allocs/op PASS ok github.com/cilium/cilium/pkg/ipcache 18.700s After: pkg/ipcache$ go test -bench BenchmarkFindAffectedChildPrefixes -run XYZ -timeout 0 goos: linux goarch: arm64 pkg: github.com/cilium/cilium/pkg/ipcache BenchmarkFindAffectedChildPrefixes/1000000/10000/Sparseness_0.330000/Trie_false-12 94 12744952 ns/op 4251 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/1000000/10000/Sparseness_0.330000/Trie_true-12 194989 6398 ns/op 4357 B/op 10 allocs/op BenchmarkFindAffectedChildPrefixes/100000/1000/Sparseness_0.330000/Trie_false-12 1018 1174121 ns/op 4280 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/100000/1000/Sparseness_0.330000/Trie_true-12 317127 3222 ns/op 4359 B/op 10 allocs/op BenchmarkFindAffectedChildPrefixes/10000/100/Sparseness_0.330000/Trie_false-12 9727 
121524 ns/op 4308 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/10000/100/Sparseness_0.330000/Trie_true-12 597798 1772 ns/op 4404 B/op 10 allocs/op BenchmarkFindAffectedChildPrefixes/1000/10/Sparseness_0.330000/Trie_false-12 84943 13826 ns/op 4214 B/op 6 allocs/op BenchmarkFindAffectedChildPrefixes/1000/10/Sparseness_0.330000/Trie_true-12 858214 1249 ns/op 4312 B/op 10 allocs/op PASS ok github.com/cilium/cilium/pkg/ipcache 15.177s Performance difference can be bigger with larger CIDR tries, as in when processing deny policies (>30% faster policy computation in a benchmark test). Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
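To illustrate why comparing 64-bit words is cheaper than comparing bytes, here is a sketch of a common-prefix computation over two `uint64` words. Note the substitution: instead of the `unsafe` access to `netip.Addr` internals that the commit describes, this sketch uses the exported `As16()` method, which returns an array by value. The function names are hypothetical.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
	"net/netip"
)

// words returns the address as two big-endian 64-bit words.
// IPv4 addresses are represented in their IPv4-mapped IPv6 form.
func words(a netip.Addr) [2]uint64 {
	b := a.As16()
	return [2]uint64{binary.BigEndian.Uint64(b[:8]), binary.BigEndian.Uint64(b[8:])}
}

// commonPrefixLen counts the leading bits shared by a and b, using one XOR
// and one leading-zero count per 64-bit word instead of a per-byte loop.
func commonPrefixLen(a, b [2]uint64) int {
	if x := a[0] ^ b[0]; x != 0 {
		return bits.LeadingZeros64(x)
	}
	return 64 + bits.LeadingZeros64(a[1]^b[1])
}

// prefixLen is a convenience wrapper over string addresses.
func prefixLen(a, b string) int {
	return commonPrefixLen(words(netip.MustParseAddr(a)), words(netip.MustParseAddr(b)))
}

func main() {
	fmt.Println(prefixLen("2001:db8::1", "2001:db8::2"))
	fmt.Println(prefixLen("10.0.0.0", "10.0.1.0")) // counted in the 128-bit mapped form
}
```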
dbb0a35 Fix workflow telemetry in ci-ipsec-upgrade [ upstream commit 9592c69c27dfe5ee143f5020d4fc0c3e920f4e04 ] Basically, workflow telemetry uses step names for gantt charts which uses Github's native support for mermaidjs for rendering the charts in issues and PRs. However mermaidjs has an issue (https://github.com/mermaid-js/mermaid/issues/2495) when you use mermaid reserved keywords in node names/text. To fix rendering of the ipsec upgrade workflow's telemetry, avoid the use of the "call" keyword. This is a workaround for https://github.com/catchpoint/workflow-telemetry-action/issues/76. If and when https://github.com/catchpoint/workflow-telemetry-action/pull/77 is merged, we do not need to be concerned with workflow step names. Fixes #32241 Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
7bd9539 doc: update slack channel reference [ upstream commit 83fb3812a9bfa5a1f59e17958ae65b271bf1eea7 ] Signed-off-by: Huweicai <i@huweicai.com> 06 August 2024, 11:31:29 UTC
54ba918 wireguard: wait for kvstore synchronization before stale peers GC [ upstream commit e1b180dbebf919d87a5e6135b656823757507ffa ] Currently, temporary connectivity disruption can occur on agent restart when Cilium is configured in kvstore mode, and WireGuard encryption is enabled, in case the removal of obsolete peers is triggered before kvstore synchronization completed. Hence temporarily removing valid WireGuard peers until the corresponding node upsertion event is received again. Let's address this by explicitly waiting for kvstore synchronization before starting the garbage collection of the obsolete peers, to ensure that the full list of existing nodes has been retrieved at that point. Fixes: #31985 Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
f9b91fb endpoint: Init DNS history trigger only when datapath is ready for it [ upstream commit 258819d9c5d645a2d02ddfd3d79ff7da34e778fc ] This fixes a rare crash that can occur when a restored endpoint is doing DNS requests while the first loader Reinitialize() is still not completed (e.g. waiting for node information). Crash: time="2024-07-26T09:54:49Z" level=debug msg="Updated FQDN with new IPs" IPs="[75.2.60.5]" matchName=isovalent.com. subsys=fqdn time="2024-07-26T09:54:49Z" level=debug msg="Waited for endpoints to regenerate due to a DNS response" duration="64.816µs" endpointID=1050 qname=isovalent.com. subsys=daemon ... time="2024-07-26T09:54:49Z" level=debug msg="writing header file with DNSRules" DNSRulesV2="map[]" ciliumEndpointName=default/ubuntu .. panic: runtime error: invalid memory address or nil pointer dereference [signal SIGSEGV: segmentation violation code=0x1 addr=0x90 pc=0x2c65ce7] goroutine 368 [running]: github.com/cilium/cilium/pkg/datapath/types.(*LocalNodeConfiguration).DeviceNames(...) /home/jussi/go/src/github.com/cilium/cilium/pkg/datapath/types/node.go:165 github.com/cilium/cilium/pkg/datapath/linux/config.(*HeaderfileWriter).WriteEndpointConfig(0xc00269ab40, {0x445aaa0, 0xc00067d060?}, 0x0, {0x44df670, 0xc001b28808}) /home/jussi/go/src/github.com/cilium/cilium/pkg/datapath/linux/config/config.go:1045 +0x127 github.com/cilium/cilium/pkg/datapath/loader.(*loader).WriteEndpointConfig(0xc001b28808?, {0x445aaa0?, 0xc00067d060?}, {0x44df670?, 0xc001b28808?}) The issue is due to WriteEndpointConfig being called via the endpoint DNS history trigger when the LocalNodeConfiguration is not yet set. Fix the issue by initializing the trigger from regenerateBPF which is called only after datapath reinitialize has completed and it is ready to process the endpoint config writing. The fix was tested by adding a 5 second sleep into Reinitialize(), both before the compilation lock and before nodeConfig.Store. 
This reliably reproduced the issue and the fix was effective. Adding these sleeps did not uncover other problems. A principled long-term fix for this and similar issues lands in #33023 which gates all requests towards the loader and makes sure all relevant data is present. Fixes: #34019 Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
2b6b826 bgpv2: Avoid duplicate route policy naming [ upstream commit 271faf47147e4ec7b029373e1f872aae9cae6dc2 ] Consolidates route policy naming to always contain advertisement type, to avoid duplicate route policy names if the same name is used for two different resource types (e.g. PodIPPool & LoadBalancerIPPool). Also removes namespace from the policy name of non-namespaced resources (PodIPPool & LoadBalancerIPPool). Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
ef27f86 bugtool: dump cilium_skip_lb{4,6} [ upstream commit 4decd942d9c8423cae7bf845c513e2bd52b3c111 ] Collect cilium_skip_lb{4,6} maps in sysdump fixes: #33901 Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com> 06 August 2024, 11:31:29 UTC
92c110e gateway-api: Enqueue gateway for Reference Grant changes [ upstream commit ed3dfa0aab8b80f7e841a6d49d2a990ac2dca053 ] [ backporter's notes: resolved conflicts due to v1.16 still using `logrus` for logging. ] This commit is to make sure that the reconciliation loop is kicked off for Gateway object if there is any change in ReferenceGrant (mainly for SecretObjectReference). Signed-off-by: Tam Mach <tam.mach@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
8bbdc73 auth: Fix data race in Upsert [ upstream commit b9bb0c28b9bde4e231c2fc1427334ddb9cf41535 ] Fixes: #33899 Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
c760823 bgpv2: use correct path key in path reconciler [ upstream commit 17355b56c638eb199f5517158c77e801a16d165b ] Use passed key in path reconciler instead of assuming it will be NLRI.String(). Change also reverses path withdrawal and advertisement, this is useful in cases where update is required. In which case, path should be first removed and new path added. Signed-off-by: harsimran pabla <hpabla@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
b98e4b5 bgpv1: Reconcile with retry in BGP controller [ upstream commit 7ce560221c119bcff8b874bc256cdb96212d846b ] Run BGP reconciliation with retry to automatically recover from transient errors. If the reconciliation still fails after reaching the defined retry/backoff period, it fails and logs the last reconciliation error. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
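A minimal sketch of a retry loop in the spirit of the commit above: retry on error with a growing delay, and surface the last error once the budget is spent. The function name, the attempt-count budget and the doubling delay are assumptions for illustration; the actual controller may use a different retry helper and limits.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errTransient stands in for a recoverable reconciliation error.
var errTransient = errors.New("transient failure")

// reconcileWithRetry calls reconcile up to attempts times (attempts >= 1),
// sleeping between tries and doubling the delay each time. On exhaustion it
// returns an error wrapping the last reconciliation error.
func reconcileWithRetry(attempts int, backoff time.Duration, reconcile func() error) error {
	var err error
	for i := 1; i <= attempts; i++ {
		if err = reconcile(); err == nil {
			return nil
		}
		if i < attempts {
			time.Sleep(backoff)
			backoff *= 2
		}
	}
	return fmt.Errorf("reconciliation failed after %d attempts: %w", attempts, err)
}

func main() {
	calls := 0
	err := reconcileWithRetry(5, 10*time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errTransient // first two tries fail
		}
		return nil
	})
	fmt.Println(calls, err)
}
```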
6599a56 test: use cgr.dev/chainguard/busybox:1.36.0 instead of docker.io image. [ upstream commit 16ed7db50ac3359d78a175dfa16afb4d5f63c42c ] Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
09db2f6 helm: add config for nat-map-stats-{interval, entries} config. [ upstream commit 7e10a114851ad8b6e6d6db1469413d49fe781c0f ] These configs were introduced in https://github.com/cilium/cilium/pull/32152 and control how often the SNAT map is analyzed for stats, as well as how many of the top connections to store in-memory (i.e. statedb). This exposes them via helm. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
caf26ca bitlpm: Simplify matchPrefix() [ upstream commit cdecfbb7188c80d423ad2007d7a98ab6726ce73e ] matchPrefix is symmetric in its arguments, so we can make it a method instead of a plain function. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
09f88ab helm: Support configuring imagePullSecrets for spire agent/server pods [ upstream commit fd26153ffbeb18040ca6622bb4da322cb8d48cd8 ] Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
e027e68 gha: drop trailing spaces from all files under .github [ upstream commit 83338599965a579ef635f9e7b77f3c53cdfec056 ] [ backporter's notes: ignored all conflicts / changes applied as part of the cherry-pick and re-ran the command below instead. ] Following the introduction of the dedicated linter, in the previous commit, let's address all existing occurrences: $ find .github -type f -exec sed -ri 's/[[:blank:]]+$//' {} \; Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 06 August 2024, 11:31:29 UTC
6e1f218 test: fix empty named It clause in Checks E/W loadbalancing test [ upstream commit f6fae66998bfbe71ce178909e6f3c1991d2ecc10 ] Otherwise, it leads to a trailing space in the main-focus.yaml file, which is now flagged by the linter. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
46e684f gha: lint absence of trailing spaces in workflow files [ upstream commit e7ebc9a4489fb209384d7a85d3f1e43a7b0269a7 ] Trailing spaces in workflow files tend to be a pain during backports, as they cause conflicts due to slight divergences across branches, and require manual intervention. Additionally, depending on the editor settings of each developer, they either lead to unnecessary churn in the code-base, or require extra effort to explicitly ignore them. To prevent these issues, let's introduce a linter that verifies the absence of trailing spaces in all files under `.github`, and prompts a command to remove them if found. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
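The check itself is simple: a line fails if stripping trailing spaces and tabs changes it, which mirrors the `[[:blank:]]+$` pattern in the `sed` command quoted in the clean-up commit. The sketch below expresses that in Go for illustration only; the function name is hypothetical and the actual linter in the repository is a workflow step, not this code.

```go
package main

import (
	"fmt"
	"strings"
)

// trailingBlankLines returns the 1-based numbers of lines that end in a
// space or a tab.
func trailingBlankLines(text string) []int {
	var bad []int
	for i, line := range strings.Split(text, "\n") {
		if strings.TrimRight(line, " \t") != line {
			bad = append(bad, i+1)
		}
	}
	return bad
}

func main() {
	workflow := "name: lint \njobs:\n  build:\t\n"
	fmt.Println(trailingBlankLines(workflow))
}
```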
b30984c bgpv1: Fix data race in bgppSelection [ upstream commit 692791a7b498eb6d3cfb35b47ead62cb29044d50 ] Fixes: #33898 Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 06 August 2024, 11:31:29 UTC
35e926c gha: simplify the call-backport-label-updater workflow [ upstream commit e91378b34f9451c2f9016333efc65b3af8c57cf4 ] Currently, the call-backport-label-updater workflow performs a first step to determine the branch that the backport PR was merged into, based on the backport label. However, the target branch is already known in advance, as it is the one the workflow got triggered on. Hence, let's simplify this mechanism, so that we don't need to update the list of stable branches every time a new one gets added. While being there, let's also slightly generalize the branches filter to allow arbitrary suffixes to the branch name as well. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 06 August 2024, 11:31:29 UTC
277a878 datapath: Skip lxc devices for fallback and prefer selected [ upstream commit 28e9aded7d22466888e36eb6a5e7e52a256f8352 ] The fallback is used for e.g. BPF masquerading when the target device has no address (this is "best effort" for ECMP etc. setups). The selection algorithm for the fallback node address wasn't taking into account whether the device was selected or not, which led to the fallback address being taken from non-selected devices. Add selected as the first criterion for checking if a fallback is better. And to avoid unnecessary churn on updating the fallback, always skip lxc* devices. Signed-off-by: Jussi Maki <jussi@isovalent.com> 05 August 2024, 18:52:57 UTC
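The two rules stated in the commit above (skip `lxc*` devices, prefer selected devices) can be sketched as a small comparison function. Everything else here is an assumption made for illustration: the `device` type is hypothetical, and the lowest-interface-index tie-breaker is invented for the sketch rather than taken from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// device is a hypothetical, simplified view of a network device.
type device struct {
	name     string
	selected bool
	index    int
}

// better reports whether a is a better fallback source than b.
// "Selected" is the first criterion; the index comparison is an assumed
// tie-breaker for this sketch only.
func better(a, b device) bool {
	if a.selected != b.selected {
		return a.selected
	}
	return a.index < b.index
}

// pickFallback returns the best fallback device, never an lxc* device.
func pickFallback(devs []device) (device, bool) {
	var best device
	found := false
	for _, d := range devs {
		if strings.HasPrefix(d.name, "lxc") {
			continue // endpoint devices come and go; skipping them avoids churn
		}
		if !found || better(d, best) {
			best, found = d, true
		}
	}
	return best, found
}

func main() {
	best, ok := pickFallback([]device{
		{name: "lxc1234", selected: true, index: 1},
		{name: "docker0", selected: false, index: 2},
		{name: "eth0", selected: true, index: 5},
	})
	fmt.Println(best.name, ok)
}
```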
b5721d4 datapath: Pick non-loopback addresses assigned to lo as node addresses [ upstream commit a9e9666c4bc583cdb14af5412e34dc93166eeaff ] This fixes a regression where the non-loopback addresses assigned to the loopback device (lo) were not considered host/node addresses. This broke the practice of assigning VIPs to the loopback device to make Cilium consider them. The problem was due to filtering on "ExcludedDevicePrefixes" that included "lo". This filtering is already done in the devices controller that populates the Table[Device] that this code reads, so this filtering can be dropped. In addition to this the fix whitelists the lo device to make it unnecessary to specify "--devices=lo,..." and thus retaining the same semantics as v1.14. Fixes: #33214 Signed-off-by: Jussi Maki <jussi@isovalent.com> 05 August 2024, 18:52:57 UTC
d7d3e1d chore(deps): update all github action dependencies Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 05 August 2024, 05:17:19 UTC
8157cea chore(deps): update gcr.io/etcd-development/etcd docker tag to v3.5.15 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 29 July 2024, 07:56:48 UTC
83a81ce docs: Add upgrade note for CNP empty slices new semantic Following the change in the semantic of an empty non-nil slice in CNPs, an upgrade note is added to the guide for v1.16. The semantic change targeted v1.16 but this release note was inadvertently appended under the "v1.15 Upgrade Notes" section and thus deleted when preparing the documentation for v1.16. Related: e47e295a04 ("docs: cleanup upgrade docs on 1.16") Related: 966757d822 ("docs: add upgrade note for dangling cidrGroupRefs") Fixes: 5f77d50ee3 ("docs: Add upgrade note for CNP empty slices new semantic") Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 25 July 2024, 19:18:41 UTC
0b0e716 install: Update image digests for v1.16.0 Generated from https://github.com/cilium/cilium/actions/runs/10078541381 ## Docker Manifests ### cilium `quay.io/cilium/cilium:v1.16.0@sha256:46ffa4ef3cf6d8885dcc4af5963b0683f7d59daa90d49ed9fb68d3b1627fe058` `quay.io/cilium/cilium:stable@sha256:46ffa4ef3cf6d8885dcc4af5963b0683f7d59daa90d49ed9fb68d3b1627fe058` ### clustermesh-apiserver `quay.io/cilium/clustermesh-apiserver:v1.16.0@sha256:a1597b7de97cfa03f1330e6b784df1721eb69494cd9efb0b3a6930680dfe7a8e` `quay.io/cilium/clustermesh-apiserver:stable@sha256:a1597b7de97cfa03f1330e6b784df1721eb69494cd9efb0b3a6930680dfe7a8e` ### docker-plugin `quay.io/cilium/docker-plugin:v1.16.0@sha256:024a17aa8ec70d42f0ac1a4407ad9f8fd1411aa85fd8019938af582e20522efe` `quay.io/cilium/docker-plugin:stable@sha256:024a17aa8ec70d42f0ac1a4407ad9f8fd1411aa85fd8019938af582e20522efe` ### hubble-relay `quay.io/cilium/hubble-relay:v1.16.0@sha256:33fca7776fc3d7b2abe08873319353806dc1c5e07e12011d7da4da05f836ce8d` `quay.io/cilium/hubble-relay:stable@sha256:33fca7776fc3d7b2abe08873319353806dc1c5e07e12011d7da4da05f836ce8d` ### operator-alibabacloud `quay.io/cilium/operator-alibabacloud:v1.16.0@sha256:d2d9f450f2fc650d74d4b3935f4c05736e61145b9c6927520ea52e1ebcf4f3ea` `quay.io/cilium/operator-alibabacloud:stable@sha256:d2d9f450f2fc650d74d4b3935f4c05736e61145b9c6927520ea52e1ebcf4f3ea` ### operator-aws `quay.io/cilium/operator-aws:v1.16.0@sha256:8dbe47a77ba8e1a5b111647a43db10c213d1c7dfc9f9aab5ef7279321ad21a2f` `quay.io/cilium/operator-aws:stable@sha256:8dbe47a77ba8e1a5b111647a43db10c213d1c7dfc9f9aab5ef7279321ad21a2f` ### operator-azure `quay.io/cilium/operator-azure:v1.16.0@sha256:dd7562e20bc72b55c65e2110eb98dca1dd2bbf6688b7d8cea2bc0453992c121d` `quay.io/cilium/operator-azure:stable@sha256:dd7562e20bc72b55c65e2110eb98dca1dd2bbf6688b7d8cea2bc0453992c121d` ### operator-generic `quay.io/cilium/operator-generic:v1.16.0@sha256:d6621c11c4e4943bf2998af7febe05be5ed6fdcf812b27ad4388f47022190316` 
`quay.io/cilium/operator-generic:stable@sha256:d6621c11c4e4943bf2998af7febe05be5ed6fdcf812b27ad4388f47022190316` ### operator `quay.io/cilium/operator:v1.16.0@sha256:6aaa05737f21993ff51abe0ffe7ea4be88d518aa05266c3482364dce65643488` `quay.io/cilium/operator:stable@sha256:6aaa05737f21993ff51abe0ffe7ea4be88d518aa05266c3482364dce65643488` Signed-off-by: Cilium Release Bot <noreply@cilium.io> 24 July 2024, 14:59:50 UTC
8299999 Prepare for release v1.16.0 Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 14:24:03 UTC
ee9cf15 daemon: Mark --hubble-drop-events as alpha [ upstream commit 0f32f68f1c0c5fe13136ee438cf2a8d3e76333b9 ] Let's stage this feature a bit more gradually and assess its impact in production before marking it beta/stable. Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
e51da39 helm: Update hubble-relay livenessProbe and startupProbe [ upstream commit 45db088d7de70dc0cf8e99f06c5ed455070ad252 ] Add initialDelaySeconds of 10 seconds to both, since hubble-relay may take a bit of time to start, and modify the defaults for the livenessProbe since livenessProbes kill the pod which should be a last resort. We want to give relay time to retry before killing it, and the default livenessProbe only lets it fail for 30 seconds before terminating the pod. Increase this to 2 minutes worth of failures before allowing the pod to be terminated, giving relay time to retry and become healthy again. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
5460b67 hubble: Reduce relay peer manager exponential back-off min/max [ upstream commit 2325bc5afa843f955835e0fe53b3c9ebd314eec4 ] The current back-off settings for the hubble peer manager gRPC retries are way too high. Most users tend to only run 1 replica, or a small number of replicas of hubble-relay, meaning we aren't at high risk of impacting cilium/hubble, even with a large number of retries. This is especially true since each relay pod only connects to a single peer service, thus a single cilium agent/hubble server is being connected to by each replica of relay. Tuning the back-off for hubble-relay's peer manager is especially important since the relay's livenessProbe relies on the health of its peers, meaning we need to be more aggressive in retries to avoid being killed due to the health status failing. Starting at 10 seconds for back-off is pretty long for a retry. In my experience, most of the time back-offs start in the milliseconds and ramp up quickly from there. Given most failures are expected to be transient it doesn't make any sense to start with such a high retry delay, as it just delays the successful connections needlessly, and starting at 10 seconds means we will quickly see the delay increase as the back-off doubles. By setting it to 1 second we'll reach 10 seconds of back-off (the current starting value) after 3 retries and 12 seconds of elapsed time, so it will only add 3 additional requests total in the same period as the current setting's first retry of 10 seconds, which should be negligible. Additionally, 90 minutes is way too long for a maximum back-off, it basically means relay would be unhealthy for an extended period of time despite the fact that the underlying hubble servers might be fine. 1 minute feels reasonably acceptable for a maximum retry rate for connecting to the peer manager. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
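To get a feel for the new bounds, the sketch below prints a capped doubling schedule starting at 1 second with a 1 minute ceiling, the two values named in the commit above. It assumes plain doubling with no jitter or multiplier tuning, so the exact timings produced by the real back-off implementation (and those quoted in the message) may differ. The constant and function names are hypothetical.

```go
package main

import (
	"fmt"
	"time"
)

// Bounds taken from the commit message; the names are illustrative.
const (
	relayMinBackoff = time.Second
	relayMaxBackoff = time.Minute
)

// backoffSchedule returns the delay before each of the first n retries,
// doubling from lo and never exceeding hi.
func backoffSchedule(lo, hi time.Duration, n int) []time.Duration {
	out := make([]time.Duration, 0, n)
	d := lo
	for i := 0; i < n; i++ {
		out = append(out, d)
		d *= 2
		if d > hi {
			d = hi
		}
	}
	return out
}

func main() {
	for i, d := range backoffSchedule(relayMinBackoff, relayMaxBackoff, 8) {
		fmt.Printf("retry %d after %s\n", i+1, d)
	}
}
```

Under these assumptions the delay reaches the one-minute ceiling by the seventh retry and stays there, instead of growing towards the previous 90-minute maximum.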
aa4aedc hubble: Increase the default relay dial timeout to 30 seconds [ upstream commit 275b78671cedd8a2dc3a22fe010e84dac8369ef3 ] When connecting to the peer service sometimes it takes longer than 5 seconds to connect, especially on overburdened nodes or in CI, so change the default to a more reasonable 30 seconds. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
c9b9064 gh: e2e-upgrade: enable setup 7 [ upstream commit 0f3101cd306ceb003098c2f01f1b954df0a47794 ] [ Backporter's notes: Rebased against stable branch using 6.6 ] Reenable setup 7 CI run in upgrade test. Signed-off-by: Robin Gögge <r.goegge@isovalent.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
6a62e70 endpoint: hold build mutex during delete. [ upstream commit 2dad63c8a63701e5c79d2ba063ad9d1537aa3467 ] The endpoint's buildMutex is commented as intending to be held while doing endpoint build as well as delete. However, this lock is not actually held during (*Endpoint).Delete(). As a result, there is a race condition that can occur where an endpoint in the process of doing `e.owner.GetCompilationLock().RLock()` may be deleted while waiting for this lock. This is made more likely as the write lock on the compilation lock can be held for several minutes at times. When the regenerateBPF finally does acquire the lock, it may find that its underlying interface has already been deleted. Specifically, in the case of the health endpoint's lxc_health interface, this is removed after some ping timeout and endpoint-manager is called to delete the endpoint. However, if the endpoint is waiting for the compilation lock at the same time, it will still attempt to regenerate the endpoint once the rlock becomes available, leading to errors such as: ``` msg="Error while reloading endpoint BPF program" ciliumEndpointName=/ containerID= containerInterface= datapathPolicyRevision=0 desiredPolicyRevision=1 endpointID=62 error="retrieving device lxc_health: Link not found" identity=4 ipv4=10.244.0.18 ipv6="fd00:10:244::9647" k8sPodName=/ subsys=endpoint ``` This adds the build mutex locking to (*Endpoint).Delete(), such that the endpoint cannot be deleted while it is being regenerated. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
5606404 cilium: Extend endpoint deletion to wait for any ongoing builds to finish [ upstream commit f59fe4e914e6bde3094749703f99e3202f584ab6 ] Endpoint deletions can race with pending builds as detailed in #32689. This is due to the fact that the Delete() call does not wait for them to finish. Instead, it tears down the endpoint, downs the device, removes tc(x), etc. This is not correct since the inflight from regenerate() can later still attempt to attach something to the given device. This in itself is not problematic as the CNI will just remove the device and the kernel removes all objects along with it. However, this can result in sporadic agent error logs that a given lxc* device has already disappeared when the agent tries to attach, but Delete() or the CNI Del has already finished. We've seen this in particular in the case of lxc_health. The health endpoint is not managed by the CNI plugin, but by the agent instead. What appears to happen is that when the health endpoint is cleaned, it's removed from the ep manager, and subsequently the device is removed. However, any ongoing regenerate() later tries to attach to a non-existing device. Given the Delete() updates the endpoint state, we also need to wrap this with the build mutex lock. This is to wait for any pending regenerate() to finish before setting the new ep state, and any new build requests will bail out early as they see that the endpoint is disconnecting. Kudos to Tom Hadlaw and Robin Goegge! Fixes: #32689 Co-developed-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
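The locking pattern described in the two commits above can be reduced to a few lines: the build and the delete take the same mutex, so a delete waits for an in-flight build, and any build that starts after the delete sees the disconnecting state and bails out. This is a sketch with a hypothetical `endpoint` type; the real Endpoint has more state and additional locks.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var errDisconnecting = errors.New("endpoint is disconnecting")

// endpoint is a hypothetical, stripped-down stand-in for the real type.
type endpoint struct {
	buildMutex    sync.Mutex // held for the whole build and for delete
	disconnecting bool       // guarded by buildMutex in this sketch
}

// regenerate runs build while holding buildMutex, unless the endpoint
// has already been deleted.
func (e *endpoint) regenerate(build func()) error {
	e.buildMutex.Lock()
	defer e.buildMutex.Unlock()
	if e.disconnecting {
		return errDisconnecting // bail out early, nothing to attach to
	}
	build()
	return nil
}

// Delete waits for any in-flight build, then marks the endpoint as
// disconnecting. Device teardown would follow here.
func (e *endpoint) Delete() {
	e.buildMutex.Lock()
	defer e.buildMutex.Unlock()
	e.disconnecting = true
}

func main() {
	ep := &endpoint{}
	fmt.Println(ep.regenerate(func() {})) // build before delete succeeds
	ep.Delete()
	fmt.Println(ep.regenerate(func() {})) // build after delete is refused
}
```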
72b0bb4 cilium: Add a comment on device management [ upstream commit 60e508053cfa2df09ff5bba74f85c32e22df7e74 ] For the health endpoint the agent needs to remove devices since it's not managed by CNI. Add a comment explaining this. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
4b3f236 cilium: Trigger reping after launching the health endpoint [ upstream commit a90b980a466300a287c867a88a414c2c5aa91698 ] Otherwise we might run risk of hitting the timeout too early. Just reping right away upon success. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
31b4ef8 cilium: Reuse common endpoint deletion functionality [ upstream commit f7f455dae376746a5f4d7a70445e1ac98162961b ] Do not open code it in the health code. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
ef15f1e cilium: Add a log for cleaning up health endpoint [ upstream commit 09633b875e1f60f7207fc02bab310363c03fe913 ] We log whenever we start the health endpoint. Also add a log entry whenever we attempt to clean up. Also add a small comment in the cleanup routine for device removal. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
aa783e8 cilium: Remove health ep from ep manager on shutdown [ upstream commit da056334d8e96e80b85281365fbe5baa7ec2efcf ] Currently on shutdown we don't remove the health endpoint from the endpoint manager but do remove the corresponding netlink interface. Note that validateEndpoint() special cases the health endpoint and does remove the old state directory if it existed before. But independent of that it would be better to just streamline cleanup via cleanupHealthEndpoint() for all cases. Related: #32689 Signed-off-by: Robin Gögge <r.goegge@isovalent.com> Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
8acd446 linux/node: reallocate nodeID upon conflict [ upstream commit 50b9a31dd25e5498079528bb534059106b5bbb51 ] NodeIDs and IPsec state suffer from a lack of reconciliation. If the agent misses a node deletion event, stale state is never cleaned up. This is somewhat known (#29822, #26298), but was generally considered not of huge consequence. Stale XFRM states/policies can accumulate, but will not match traffic - the effect is mostly slowing down processing in agent and kernel. nodeIDs can eventually run out if too many node deletions are missed, but the rate at which these are missed is expected to be low. Unfortunately, there are large clusters with high node churn in which rare events become common, and hence the following sequence of events is probable enough to actually observe: 1. a node is deleted while the agent is down (e.g. due to being upgraded) 2. a new node joins the cluster and is allocated IPs which overlap with previously used IPs. If this occurs, the agent can have a partitioned view of what nodeID this node should have - in the BPF map, the k8s internal IP will map to a different nodeID than the cilium internal ip. This breaks IPsec traffic towards this node, as BPF applies a mark based on the BPF map nodeID of the tunnel endpoint, but the xfrm states expect to match the mark based on the cilium internal IP. The result is traffic which doesn't match any xfrm state/policy, falling back to the catch all block policy. To work around this, we enforce that all IPs of a node get the same nodeID - even if an IP was already pointing to an existing nodeID. Since this node update is more current than whatever state we had held, it seems more correct to ensure all IPs point to the same nodeID than avoiding a BPF map write. We do so by forcing the allocation of a new nodeID. It is possible that unmapping the IPs but not deallocating the nodeID can leak the nodeID.
Deallocating unconditionally would be wrong, as the stale mapping might point to a nodeID which represents a different, alive node. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC
0546d66 bpf: lxc: limit nodeport RevDNAT support to IPsec configurations [ upstream commit e964b0ac1e95030dada6c2c087f02357998aaac6 ] bpf_lxc currently marks all nodeport-ish connections in the CT entry, and performs RevDNAT for their replies in from-container. But we typically want to perform RevDNAT for nodeport-ish connections at the node's egress interface, after the traffic has passed through eg. the L7 proxy. This requires a RevDNAT hook in the relevant code path - in either to-overlay, to-netdev or more recently also to-wireguard. IPsec is the exception here, as there is currently no RevDNAT hook in the IPsec path when forwarding traffic to the XFRM layer (see https://github.com/cilium/cilium/issues/32897). Hence we continue to apply RevDNAT in from-container whenever IPsec is enabled, until #32897 has been addressed. But for all other configurations we can disable the RevDNAT support in bpf_lxc, so that new connections use the "native" RevDNAT path. This allows us to eventually phase out the RevDNAT path in bpf_lxc, similar as has been done for DSR in https://github.com/cilium/cilium/pull/32642. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 24 July 2024, 13:55:06 UTC