https://github.com/cilium/cilium

Revision Author Date Message Commit Date
1216e5b test: Use DinD in L4LB tests The DinD for the tests was introduced in [1]. However, it never made it into the v1.11 branch, which made the GHA always fail. Fix this by taking the test.sh files from the main branch. [1]: https://github.com/cilium/cilium/pull/22653 Signed-off-by: Martynas Pumputis <m@lambda.lt> 22 May 2023, 07:04:58 UTC
1689a2c install: Update image digests for v1.11.17 Generated from https://github.com/cilium/cilium/actions/runs/5006290922. `docker.io/cilium/cilium:v1.11.17@sha256:6c3132e34e66734752de798eb8519dafa77b9f0da1033e9bed7f7be30ce10358` `quay.io/cilium/cilium:v1.11.17@sha256:6c3132e34e66734752de798eb8519dafa77b9f0da1033e9bed7f7be30ce10358` `docker.io/cilium/clustermesh-apiserver:v1.11.17@sha256:022f8b23f9e977a74b8da25ac98fbeed65bd9c132362797681264bd13abc0349` `quay.io/cilium/clustermesh-apiserver:v1.11.17@sha256:022f8b23f9e977a74b8da25ac98fbeed65bd9c132362797681264bd13abc0349` `docker.io/cilium/docker-plugin:v1.11.17@sha256:ed49556f92b95ff339e99938bbd5649d5dc90e8378cb67a820df6bac1979ffa2` `quay.io/cilium/docker-plugin:v1.11.17@sha256:ed49556f92b95ff339e99938bbd5649d5dc90e8378cb67a820df6bac1979ffa2` `docker.io/cilium/hubble-relay:v1.11.17@sha256:d880ee0184f1ca0fffbd73374424ae2c4d1c26af14005a58103ef695816a78ff` `quay.io/cilium/hubble-relay:v1.11.17@sha256:d880ee0184f1ca0fffbd73374424ae2c4d1c26af14005a58103ef695816a78ff` `docker.io/cilium/operator-alibabacloud:v1.11.17@sha256:36999e2fefb8f1ce3a791f60c61055b3bdde350dff5128ce3f4a5fbe31c6f341` `quay.io/cilium/operator-alibabacloud:v1.11.17@sha256:36999e2fefb8f1ce3a791f60c61055b3bdde350dff5128ce3f4a5fbe31c6f341` `docker.io/cilium/operator-aws:v1.11.17@sha256:e96a7d34ed9386a00b0c7d73946f92872280f84addcc951780c42a56dfaeae9c` `quay.io/cilium/operator-aws:v1.11.17@sha256:e96a7d34ed9386a00b0c7d73946f92872280f84addcc951780c42a56dfaeae9c` `docker.io/cilium/operator-azure:v1.11.17@sha256:20cf49d57fdccc599cfefc5a6ab0ed152dac52d45d8a2339fd3ad19415aaebba` `quay.io/cilium/operator-azure:v1.11.17@sha256:20cf49d57fdccc599cfefc5a6ab0ed152dac52d45d8a2339fd3ad19415aaebba` `docker.io/cilium/operator-generic:v1.11.17@sha256:f77cf55ebc47174fb64fd8ffd030015e55817ed9a6bfab46d0ee917a7ed198e5` `quay.io/cilium/operator-generic:v1.11.17@sha256:f77cf55ebc47174fb64fd8ffd030015e55817ed9a6bfab46d0ee917a7ed198e5` 
`docker.io/cilium/operator:v1.11.17@sha256:c1cad3137dfa80c1d415dff43f064b91992158ce56899b093b0294382ae57289` `quay.io/cilium/operator:v1.11.17@sha256:c1cad3137dfa80c1d415dff43f064b91992158ce56899b093b0294382ae57289` Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 17 May 2023, 18:24:14 UTC
e86fde3 Prepare for release v1.11.17 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 15 May 2023, 13:12:15 UTC
a96172d images: update cilium-{runtime,builder} Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 15 May 2023, 11:32:12 UTC
0cbf76b Update CNI to 1.3.0 [ upstream commit 4ebf4e7a81a8f60154d693ecf4c844f6bdcb62e6 ] Run `images/scripts/update-cni-version.sh 1.3.0` to update the CNI version. Ref: https://github.com/containernetworking/plugins/releases/tag/v1.3.0 Signed-off-by: Yongkun Gui <ygui@google.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 15 May 2023, 11:32:12 UTC
a4d2a1f Add helm-toolbox image for helm docs, lint [ upstream commit e4ba2aa24bd7f28291a24a67acabd8e2fdbd09e6 ] Use https://github.com/cilium/helm-toolbox as an image for managing all of our helm formatting & linting needs. This implicitly updates: * helm (version from dev environment) -> 3.9.0 * helm-docs (custom build from Bruno) -> 1.10.0 * m2r 0.2.1 -> m2r2 0.3.2 Signed-off-by: Joe Stringer <joe@cilium.io> 12 May 2023, 19:04:57 UTC
7a1ad6d test/provision: Only install bpf mount if not already there Avoid failing VM start due to mount failing: Created symlink /etc/systemd/system/multi-user.target.wants/sys-fs-bpf.mount → /etc/systemd/system/sys-fs-bpf.mount. Failed to restart sys-fs-bpf.mount: Unit sys-fs-bpf.mount has a bad unit file setting. See system logs and 'systemctl status sys-fs-bpf.mount' for details. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 11:14:59 UTC
a9404a1 vagrant: Bump 4.9 Vagrant box (Linux 4.9.326, to fix a kernel bug) [ upstream commit 07e7fb0073ab387108ac6b4c126df1a34e36d5d2 ] We have been hitting a kernel bug on 4.9 for the verifier tests. An underflow on the memlock rlimit counter, caused by the reallocation of BPF programs not updating the charged values, makes the counter go under zero and convert into a huge value, blocking all further loads of BPF objects [0]. This has been fixed in kernel 4.10 [1], and was backported at last in 4.9.326. We generated a new Ubuntu image based on that, let's update. [0] cilium/cilium#20288 [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=5ccb071e97fbd9ffe623a0d3977cc6d013bee93c [ Backport note: Only update the v4.9 image, not the cilium-dev image because version 232 also contains an updated Go version. This is fine for VM images used in tests because they use CI images built by GH actions using the proper Go version for the branch. ] Signed-off-by: Quentin Monnet <quentin@isovalent.com> 12 May 2023, 11:14:59 UTC
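The memlock underflow described in a9404a1 is ordinary unsigned wrap-around. A minimal Go sketch of the effect, assuming hypothetical helpers `uncharge` and `overLimit` and made-up byte counts (this is an illustration, not kernel code):

```go
package main

import "fmt"

// uncharge subtracts released bytes from an unsigned accounting counter.
// If more is released than was ever charged, the result wraps around
// instead of going negative.
func uncharge(charged, released uint64) uint64 {
	return charged - released
}

// overLimit reports whether the counter exceeds the configured limit.
func overLimit(counter, limit uint64) bool {
	return counter > limit
}

func main() {
	const limit = 64 << 20 // hypothetical 64 MiB limit

	// Charged 4096 bytes, but the reallocation path releases 8192.
	counter := uncharge(4096, 8192)
	fmt.Println("counter:", counter)
	fmt.Println("over limit:", overLimit(counter, limit))
}
```

Once the counter has wrapped, every later check against the limit fails, which matches the "blocking all further loads" symptom in the commit message.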
dc187e2 helm chart: v1.11 base : hubble-ui deployment : restore nodeSelector and tolerations Signed-off-by: Bryan Stenson <bryan.stenson@okta.com> 12 May 2023, 11:13:23 UTC
5cf233e ipsec: Install default-drop XFRM policy sooner [ upstream commit 2045d593f7685a63a25338f9eb85a6da4997ce85 ] We currently install the default-drop XFRM policy when we install the XFRM policies and states for the local node. It is however possible for us to start installing XFRM policies and states for remote nodes before we handle the local one. The default-drop XFRM policy is a safety measure for when we move XFRM policies around. Because we don't always have a way to atomically update XFRM policies, it's possible that we end up with a very short time where no encryption XFRM OUT policy is matching a subset of traffic. The default-drop policy ensures that we drop such traffic instead of letting it leave the node as plain-text. We therefore want this default-drop XFRM policy to be installed before we update any other XFRM policy. This commit therefore moves its installation before any other XFRM update instead of before just local-node XFRM updates. Fixes: 7d44f37509 ("ipsec: Catch-default default drop policy for encryption") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
8dc5a04 linux/node_id: do not attempt to map NoID [ upstream commit 9115e05703e92718b1c08e6f8faffa22ac806336 ] We correctly detect that we failed to allocate a new node ID (due to exhaustion of the idpool), but then still go ahead and map it. This leads to spurious errors which include "Failed to map node IP address to allocated ID". Instead, don't try to map NoID and return it directly. Fixes: af88b42bd4 (datapath: Introduce node IDs) Suggested-by: Paul Chaignon <paul@cilium.io> Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
665be40 Delete Cilium monitor verbose mode test [ upstream commit 7335aa38a9c6a148b12b267ff6625a2290bb7d2a ] Another option would be to quarantine the test and find an assignee to make the test more robust, but I assert that we don't need test coverage for monitor verbose output. Fixes: #25178 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
ae0ed5a agent: Handle correctly state when CEP is present in multiple CES. [ upstream commit 71af0a2f4147c7706fb98fef0836b4de5eaa7e8a ] [ Backporter's note: Added '!privileged_tests' to the test ] There are conditions in which a CEP changes CES. This leads to the CEP being present in multiple CESs for some time. In such cases the standard logic may not work, as it always expects a single CEP representation. This commit changes the logic to handle multiple CEPs properly. Signed-off-by: Alan Kutniewski <kutniewski@google.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
307cd56 test: Cover IPsec + VXLAN + endpoint routes [ upstream commit 54fa995c56d914400b977230d0eb34516bf06606 ] [ Backporter's notes: Dropped call to 'helpers.RunsOnAKS()' which does not exist in v1.11 ] The previous commit fixed a connectivity bug affecting the above configuration. We can now extend tests to cover that configuration. Note that these tests are soon going to be removed and replaced by the new GitHub workflow. However, we may need to backport this pull request to stable branches where the GitHub workflow doesn't exist. Therefore, the corresponding extension of the workflow test will be done in a separate pull request. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
a66ebf4 ipsec: Don't match on packet mark for FWD XFRM policy [ upstream commit d39ca10f849060ca90f4a1ddc54734d0e05ba80a ] While extending datapath coverage, Martynas found a new bug affecting IPsec + VXLAN + endpoint routes. In that configuration, cross-node pod connectivity seems to fail and we see the IPsec XfrmInNoPols error counter increasing. By tracing the connection, we can see that the packet disappears on the receiving node, after decryption, between bpf_overlay (second traversal) and the lxc device. On that path, given decryption already happened, we should match the FWD XFRM policy: src 0.0.0.0/0 dst 10.244.0.0/24 uid 0 dir fwd action allow index 106 priority 2975 share any flag (0x00000000) lifetime config: limit: soft (INF)(bytes), hard (INF)(bytes) limit: soft (INF)(packets), hard (INF)(packets) expire add: soft 0(sec), hard 0(sec) expire use: soft 0(sec), hard 0(sec) lifetime current: 0(bytes), 0(packets) add 2023-01-13 14:34:18 use 2023-01-13 14:34:22 mark 0/0xf00 Clearly, given the non-zero XfrmInNoPols, the packet doesn't match the policy. Note XfrmInNoPols is also reported for FWD; there is no XfrmFwdNoPols. Checking the source code [1], we can see that when endpoint routes are enabled, we encode the source security identity into the packet mark. We thus won't match the 0/0xf00 mark on the FWD XFRM policy. We shouldn't need to match on any packet mark for the FWD XFRM policy; we want to allow all packets through. This commit therefore removes the packet mark match for the FWD direction. 1 - https://github.com/cilium/cilium/blob/v1.13.0-rc4/bpf/lib/l3.h#L151-L154 Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
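The mark mismatch in a66ebf4 can be reproduced with plain bit arithmetic. A minimal sketch, assuming the usual "masked bits must equal the value" rule for mark matching; `markMatches` and the packet mark values are hypothetical and only chosen to show the effect:

```go
package main

import "fmt"

// markMatches reports whether a packet mark satisfies a value/mask pair:
// only the bits selected by mask are compared.
func markMatches(pktMark, value, mask uint32) bool {
	return pktMark&mask == value&mask
}

func main() {
	// FWD policy from the commit message: mark 0/0xf00.
	const value, mask = 0x0, 0xf00

	plain := uint32(0x0)
	// Hypothetical mark that carries extra data (such as an encoded
	// identity) overlapping the 0xf00 bits.
	withIdentity := uint32(0x12340f00)

	fmt.Println(markMatches(plain, value, mask))        // bits 0xf00 are clear
	fmt.Println(markMatches(withIdentity, value, mask)) // bits 0xf00 are set

	// Dropping the mark match is modelled here as mask 0: nothing is compared.
	fmt.Println(markMatches(withIdentity, 0, 0))
}
```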
421d611 chore(deps): update hubble cli to v0.11.5 Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 May 2023, 17:16:44 UTC
d3bf9d7 agent: dump stack on stale probes [ backport of d85c0939824ea3508f2a1b0ef56ddbc8d197e588 ] [ upstream commit 87f7a11ecc68b1efdc1454b520abc22470a91d01 ] Most of the time, when we see a stale probe, it's due to a deadlock. So, write a stack dump to disk (since we're probably going to be restarted soon due to a liveness probe). To prevent any sort of excessive resource consumption, only dump stack once every 5 minutes, and always write to the same file. Also, let's make the check lock-free while we're at it. Also, make sure we capture this file in bugtool. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 11 May 2023, 17:15:49 UTC
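The behaviour described in d3bf9d7 (at most one dump every 5 minutes, a lock-free check, always the same file) could look roughly like the sketch below. The names `dumpLimiter`, `allowAt`, `dumpStacks`, and the file name are assumptions for illustration, not the agent's actual code:

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime/pprof"
	"sync/atomic"
	"time"
)

const dumpInterval = 5 * time.Minute

// dumpLimiter rate-limits dumps without taking a lock.
type dumpLimiter struct {
	last int64 // UnixNano of the last dump; 0 means "never"
}

// allowAt returns true if a dump may be written at nowNano. The
// compare-and-swap ensures that only one concurrent caller wins.
func (l *dumpLimiter) allowAt(nowNano int64) bool {
	last := atomic.LoadInt64(&l.last)
	if last != 0 && nowNano-last < int64(dumpInterval) {
		return false
	}
	return atomic.CompareAndSwapInt64(&l.last, last, nowNano)
}

// dumpStacks writes all goroutine stacks to path, truncating any
// previous dump so disk usage stays bounded.
func dumpStacks(path string) error {
	f, err := os.Create(path)
	if err != nil {
		return err
	}
	defer f.Close()
	return pprof.Lookup("goroutine").WriteTo(f, 2)
}

func main() {
	var limiter dumpLimiter
	path := filepath.Join(os.TempDir(), "stale-probe-stacks.txt") // hypothetical location

	for i := 0; i < 3; i++ {
		if !limiter.allowAt(time.Now().UnixNano()) {
			fmt.Println("skipped: dumped recently")
			continue
		}
		if err := dumpStacks(path); err != nil {
			fmt.Println("dump failed:", err)
			continue
		}
		fmt.Println("wrote", path)
	}
}
```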
0834b37 inctimer: fix test flake where timer does not fire within time. [ upstream commit e695e48b171be3e60ecd8ef56c99b1b71ba1316c ] Running the test in a cpu constrained environment, such as: ``` docker run -v $(pwd):$(pwd) -w $(pwd) --cpus=0.1 -it golang:bullseye ./inctimer.test -test.v ``` I can fairly consistently reproduce a flake where the inctimer.After does not fire in time. If I allow it to wait for an additional couple of ms, this seems to be sufficient to prevent failure. It appears that goroutine scheduling latency can be significantly delayed in cpu restricted environments. This seems unavoidable, so to fix the flake I'll allow the test to wait another 2ms to see if the inctimer eventually fires. This will also log an error for delayed test fires, so if there are any other issues we can more easily debug them in the future. Fixed: #25202 Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
8fc726d docs: Add platform support to docs [ upstream commit 9a38aecc71b34c4b6ae95fcadd126f77ebb200ad ] We've been distributing ARM architecture images for Cilium for almost two years, but neglected to mention this up front in the system requirements or the main docs page. Add this to the docs. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
9edbc6e Makefile: use a specific template for mktemp files [ upstream commit db3e0152c6583e0cbec1013e514526adf7229faf ] Before this patch, we would hit a controller-gen[1] bug when the temporary file would be of the form tmp.0oXXXXXX. This patch uses a custom mktemp template that will not trigger the bug. [1]: https://github.com/kubernetes-sigs/controller-tools/issues/734 Signed-off-by: Alexandre Perrin <alex@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
6b920c6 docs: Add matrix version between envoy and cilium [ upstream commit 11e1bcc40fd4ba884191f594b8c6fd7975b1caaf ] This adds a short doc with the version matrix between Cilium and Cilium envoy versions, which is useful for the upcoming work to move the envoy proxy out of the Cilium agent container. Co-authored-by: ZSC <zacharysarah@users.noreply.github.com> Signed-off-by: Tam Mach <sayboras@yahoo.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
7f90b0d helm: add clustermesh nodeport config warning about #24692 [ upstream commit 9e83a6f79940c86a95fff33de89d7ded225da25c ] Cilium is currently affected by a known bug (#24692) when NodePorts are handled by the KPR implementation, which occurs when the same NodePort is used both in the local and the remote cluster. This causes all traffic targeting that NodePort to be redirected to a local backend, regardless of whether the destination node belongs to the local or the remote cluster. This affects also the clustermesh-apiserver NodePort service, which is configured by default with a fixed port. Hence, let's add a warning message to the corresponding values file setting. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
931ebc1 envoy: Upgrade to v1.23.9 This commit is to upgrade envoy to v1.23.9 for security fixes, please find below the details: Build: https://github.com/cilium/proxy/actions/runs/4827955904/jobs/8601231172 Upstream Docs: https://www.envoyproxy.io/docs/envoy/v1.23.9/ Release notes: https://www.envoyproxy.io/docs/envoy/v1.23.9/version_history/v1.23/v1.23.9 Signed-off-by: Tam Mach <tam.mach@cilium.io> 29 April 2023, 12:37:42 UTC
e026bf6 ci: remove `STATUS` commands from upstream tests' Jenkinsfile [ upstream commit 46de5bca9fc77cb2c02ba2873a30649dbc1b78b6 ] These are remnants of a past before GHPRB. At best they create unnecessary noise in the logs, at worst they can interfere with the default behaviour, so let's just remove them. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 28 April 2023, 14:51:34 UTC
bbddee0 bgp: do not advertise ipv6 prefixes via metallb [ upstream commit 922aa1bcabd1c75c568bf56a2ca2a5eb7e624ba9 ] When using metallb as the BGP speaker, if an IPv6 advertisement is made, metallb returns an error because IPv6 is unsupported. This error is logged and returned to the control loop, which keeps retrying, causing log flooding and high CPU usage. This change filters out IPv6 prefixes before sending them to the metallb library and logs a one-time error message. Signed-off-by: harsimran pabla <hpabla@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 28 April 2023, 14:51:34 UTC
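The filtering idea in bbddee0 can be sketched in a few lines of Go. This is an illustration only: `ipv4Only` and the warning text are hypothetical, and the real change operates on the agent's own prefix types rather than strings:

```go
package main

import (
	"fmt"
	"net/netip"
	"sync"
)

// warnIPv6Once makes sure the "unsupported" message is printed a single
// time instead of on every reconciliation pass.
var warnIPv6Once sync.Once

// ipv4Only returns the IPv4 prefixes from cidrs, preserving order.
func ipv4Only(cidrs []string) ([]string, error) {
	kept := make([]string, 0, len(cidrs))
	for _, c := range cidrs {
		p, err := netip.ParsePrefix(c)
		if err != nil {
			return nil, fmt.Errorf("parsing %q: %w", c, err)
		}
		if p.Addr().Is4() {
			kept = append(kept, c)
			continue
		}
		warnIPv6Once.Do(func() {
			fmt.Println("warning: IPv6 prefixes are not advertised by this speaker")
		})
	}
	return kept, nil
}

func main() {
	prefixes := []string{"10.0.0.0/24", "fd00::/64", "192.168.1.0/24", "2001:db8::/32"}
	kept, err := ipv4Only(prefixes)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(kept)
}
```

Filtering before the call, rather than handling the error afterwards, is what stops the retry loop from spinning.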
be630a3 contrib/backporting: Fix main branch reference The "master" branch was renamed to "main" recently. This commit adjusts the branch reference for the cherry-pick script. Signed-off-by: Tam Mach <tam.mach@cilium.io> 26 April 2023, 14:57:01 UTC
bac3efe wireguard: fix issue caused by nodes with the same name in clustermesh [ upstream commit 7398de68ca940a839915107e71c00b9661bf423b ] Currently, the wireguard subsystem in the cilium agent caches information about the known peers by node name only. This can lead to conflicts in case of clustermesh, if nodes in different clusters have the same name, causing in turn connectivity issues. Hence, let's switch to identify peers by full name (i.e., cluster-name/node-name) to ensure uniqueness. This modification does not introduce issues during upgrades, since the node ID is not propagated to the datapath. Fixes: #24227 Reported-by: @oulinbao <oulinbao@163.com> Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
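The name collision behind bac3efe is easy to see with two maps. A minimal sketch, assuming a hypothetical `peerKey` helper and made-up cluster, node, and key names:

```go
package main

import "fmt"

// peerKey builds the full name (cluster-name/node-name) used to identify
// a peer uniquely across clusters.
func peerKey(cluster, node string) string {
	return cluster + "/" + node
}

func main() {
	peers := []struct{ cluster, node, pubKey string }{
		{"cluster-a", "worker-1", "key-A"},
		{"cluster-b", "worker-1", "key-B"},
	}

	byNode := map[string]string{} // old scheme: node name only
	byFull := map[string]string{} // new scheme: cluster-name/node-name

	for _, p := range peers {
		byNode[p.node] = p.pubKey // second peer overwrites the first
		byFull[peerKey(p.cluster, p.node)] = p.pubKey
	}

	fmt.Println("peers cached by node name:", len(byNode))
	fmt.Println("peers cached by full name:", len(byFull))
}
```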
0523a23 daemon: Mark CES feature as beta in agent flag [ upstream commit a6d0142ec8f093bfb41a042e8d2cc38eff9e1cf3 ] This commit marks the CiliumEndpointSlice feature as beta (as per the documentation) in the agent flag description. This is necessary because users don't always read the full documentation before turning agent flags on. While at it, change the flag description to match the wording of other flags. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
a5eb447 pkg/kvstore: Fix for deadlock in etcd status checker [ upstream commit 9bb669b5705f3b283f3fac8f79b760987066d06d ] Etcd quorum checks are falsely reported as failing even though the connection to etcd is intact. This can cause health checks to fail in both the agent and the operator. This happens due to a deadlock in pkg/kvstore/etcd after a prolonged downtime of etcd. Status check errors are being sent into a channel for the purpose of recreating kvstore connections in clustermesh. However, when clustermesh is not used, messages from this channel are never read. The channel uses a buffer of size 128. After etcd has been down long enough to generate 128 errors, we enter a deadlock state. Agent / operator will continue to report etcd quorum failures and in turn health check failures until they're restarted. statusChecker() -> isConnectedAndHasQuorum() -> waitForInitLock() -> goroutine -> for -> ( initLockSucceeded <- err ) -> chan initLockSucceeded returned -> Block on receiving messages from initLockSucceeded channel -> e.statusCheckErrors <- e.latestErrorStatus [Blocked after 128 entries] Blocked goroutines captured from cilium 1.10 operator: goroutine 3309 [chan send, 13456 minutes]: github.com/cilium/cilium/pkg/kvstore.(*etcdClient).statusChecker(0xc00017db30) /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:1171 +0x75a created by github.com/cilium/cilium/pkg/kvstore.connectEtcdClient /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:801 +0x679 goroutine 7838665 [chan send, 13505 minutes]: g.com/c/cilium/pkg/kvstore.(*etcdClient).waitForInitLock.func1(-,-,-,-) /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:433 +0x449 created by github.com/cilium/cilium/pkg/kvstore.(*etcdClient).waitForInitLock /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:425 +0x7f Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
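The failure mode in a5eb447 is a send on a full buffered channel that nobody reads. One common way to avoid it is a non-blocking send, sketched below; this is not necessarily how the upstream fix is written, and `reportStatusError` is a hypothetical helper:

```go
package main

import "fmt"

// reportStatusError queues err for an optional consumer. If the buffer is
// full (for example because no consumer exists), the error is dropped
// instead of blocking the caller forever.
func reportStatusError(ch chan<- error, err error) bool {
	select {
	case ch <- err:
		return true
	default:
		return false
	}
}

func main() {
	// Same buffer size as in the commit message; no reader is attached.
	errs := make(chan error, 128)

	dropped := 0
	for i := 0; i < 200; i++ {
		if !reportStatusError(errs, fmt.Errorf("quorum check %d failed", i)) {
			dropped++
		}
	}
	fmt.Println("queued:", len(errs), "dropped:", dropped)
}
```

With a plain `errs <- err`, the loop above would block on the 129th iteration, which is the state shown in the goroutine dump.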
c884102 .travis: Quieten docker build output [ upstream commit 7f9e0f9b7b9fe26d707217e8248b86dde21dae4d ] The travis logs are frequently polluted with >10K lines of docker pull and build output. While this helps to track the ongoing progress of docker builds that take a long time, it's mostly useless output that developers must scroll past in order to see the useful output. Quieten that output in Travis to just the trigger of building the image plus the final summary that docker outputs. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
d7eef48 .travis: Make output less verbose [ upstream commit 9f7e24fd7e5c355e943528ac3074a2485a14b2ad ] Pass the verbosity parameters --quiet V=0 to quieten Travis output. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
3309e28 Makefile: Fix dirname errors with empty PRIV_TEST_PKGS [ upstream commit 29fe753e475fece540aac3fafd288147515bab08 ] When TESTPKGS only contains unprivileged tests, the PRIV_TEST_PKGS_EVAL evaluation previously filtered down to an empty list of packages that should be tested, and would pass this empty list to dirname, which then reports: dirname: missing operand Try 'dirname --help' for more information. This could happen multiple times during evaluation of the Makefile, and littered the output with no meaning. This could occur even if the privileged tests are not the target being run. Fix this by always adding "." to the list, which evaluates to the root directory of the repository. This causes dirname to succeed. Then, we can filter this root directory back out since there are no privileged tests at this level of the repository. This finally quietens the error. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
44ebf04 test/bpf: Fix compilation with V=0 [ upstream commit 82d5adc7951de3d14bb2406112314d93d3d606bb ] When the quiet mode was enabled, the $(CLANG) var would previously have a '@' at the start, which caused errors while attempting to make in this directory because it would be run in the context of a shell rather than directly as a make instruction. Move the $(QUIET) to the start of individual make instructions to resolve this compilation failure. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
e257f08 contrib/backporting: Fix main branch reference The "master" branch was renamed to "main" recently. Fix this for the older branch here. Signed-off-by: Joe Stringer <joe@cilium.io> 25 April 2023, 13:45:50 UTC
9012f69 docs: Document upgrade impact for IPsec The IPsec upgrade issue mentioned in ede154e27b ("Add IPSec remark for upgrade to v1.11.15") is fixed in v1.11.16. Nonetheless, a small impact remains, with a few packet drops happening during the upgrade. This commit documents that impact. Signed-off-by: Paul Chaignon <paul@cilium.io> 19 April 2023, 17:04:48 UTC
4da9949 jenkins: bump timeout to 210 minutes The net-next test has been timing out at the 2h50m mark. We need to increase its timeout in order to avoid such failures. Signed-off-by: André Martins <andre@cilium.io> 18 April 2023, 18:43:21 UTC
13d1197 install: Update image digests for v1.11.16 Generated from https://github.com/cilium/cilium/actions/runs/4731386331. ## Docker Manifests ### cilium `docker.io/cilium/cilium:v1.11.16@sha256:d2f2632c997a027ee4e540432edb4d8594e78e33315427e7ec3c06b473ec1e4e` `quay.io/cilium/cilium:v1.11.16@sha256:d2f2632c997a027ee4e540432edb4d8594e78e33315427e7ec3c06b473ec1e4e` ### clustermesh-apiserver `docker.io/cilium/clustermesh-apiserver:v1.11.16@sha256:67a051ef38ae113bcf7dc27ebb23a1137ece961ce86f087226ff5a0046099106` `quay.io/cilium/clustermesh-apiserver:v1.11.16@sha256:67a051ef38ae113bcf7dc27ebb23a1137ece961ce86f087226ff5a0046099106` ### docker-plugin `docker.io/cilium/docker-plugin:v1.11.16@sha256:1ee1bae0c2299d94ff162fc2847f9827823ff3d8e055e07da06e4ca28efe9391` `quay.io/cilium/docker-plugin:v1.11.16@sha256:1ee1bae0c2299d94ff162fc2847f9827823ff3d8e055e07da06e4ca28efe9391` ### hubble-relay `docker.io/cilium/hubble-relay:v1.11.16@sha256:c4c12759ba628e64a0f3fada99d2632627e5391ae0b49c3f35da51c3ba9eac9f` `quay.io/cilium/hubble-relay:v1.11.16@sha256:c4c12759ba628e64a0f3fada99d2632627e5391ae0b49c3f35da51c3ba9eac9f` ### operator-alibabacloud `docker.io/cilium/operator-alibabacloud:v1.11.16@sha256:d60aedfabf0957da1d975ee54779172f990366e9fb8bf55184ac31a0d77adc65` `quay.io/cilium/operator-alibabacloud:v1.11.16@sha256:d60aedfabf0957da1d975ee54779172f990366e9fb8bf55184ac31a0d77adc65` ### operator-aws `docker.io/cilium/operator-aws:v1.11.16@sha256:526dab3bee6231f71da44d14f25c17dfb53afba876bfc99374a11c0fb4278e36` `quay.io/cilium/operator-aws:v1.11.16@sha256:526dab3bee6231f71da44d14f25c17dfb53afba876bfc99374a11c0fb4278e36` ### operator-azure `docker.io/cilium/operator-azure:v1.11.16@sha256:0c2da6adf29f521f6d2ffe92794ad598fc99231eba2814b80cf608362cc14a3c` `quay.io/cilium/operator-azure:v1.11.16@sha256:0c2da6adf29f521f6d2ffe92794ad598fc99231eba2814b80cf608362cc14a3c` ### operator-generic 
`docker.io/cilium/operator-generic:v1.11.16@sha256:ea3fbe5ab65efc41228d716a64804b6fca9e2299835c3d39ae1cb248c1594c55` `quay.io/cilium/operator-generic:v1.11.16@sha256:ea3fbe5ab65efc41228d716a64804b6fca9e2299835c3d39ae1cb248c1594c55` ### operator `docker.io/cilium/operator:v1.11.16@sha256:44fb99adbba82605702aa9c41380c1c79ad5565bbd3c9d961f9aab55387be586` `quay.io/cilium/operator:v1.11.16@sha256:44fb99adbba82605702aa9c41380c1c79ad5565bbd3c9d961f9aab55387be586` Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> 18 April 2023, 13:49:46 UTC
9ce6d64 update CHANGELOG update changelog with the PRs that got merged after the last CHANGELOG but before tagging Signed-off-by: André Martins <andre@cilium.io> 17 April 2023, 22:34:46 UTC
54967c6 Avoid clearing objects in conversion funcs [ upstream commit 0d7406af7bf05e1c0c49c098a3b4e025d927c3e7 ] This removes the behavior of mutating the objects received from the client-go library. To begin with there isn't really any benefit from doing so, given we don't store the object afterwards, and it will be ready for gc when it leaves the scope inside client-go. client-go can possibly return the same pointer twice here, to trigger eg. both an object update delta and then a DeletedFinalStateUnknown delta with the same pointer. For more info, see the issue 115658 in the kubernetes/kubernetes repo on github. Follow-up of: 74307f175ceb ("Avoid clearing objects in conversion funcs") Signed-off-by: André Martins <andre@cilium.io> 17 April 2023, 22:28:49 UTC
e80ee2c Avoid clearing objects in conversion funcs [ upstream commit 74307f175cebc3e097eaa4eb4e930bf26ebb8cb1 ] This removes the behavior of mutating the objects received from the client-go library. To begin with there isn't really any benefit from doing so, given we don't store the object afterwards, and it will be ready for gc when it leaves the scope inside client-go. client-go can possibly return the same pointer twice here, to trigger eg. both an object update delta and then a DeletedFinalStateUnknown delta with the same pointer. For more info, see the issue 115658 in the kubernetes/kubernetes repo on github. Signed-off-by: Odin Ugedal <ougedal@palantir.com> Signed-off-by: Odin Ugedal <odin@uged.al> Signed-off-by: André Martins <andre@cilium.io> 17 April 2023, 22:28:49 UTC
fd81a6b envoy: Bump envoy to v1.23.8 https://github.com/cilium/proxy/actions/runs/4698873584/jobs/8331675690 Relates: https://github.com/cilium/proxy/pull/172 Signed-off-by: Tam Mach <tam.mach@cilium.io> 16 April 2023, 00:44:57 UTC
4190a2d envoy: Support more envoy image tag formats [upstream commit afcda947] This commit is to add the support for below image tags Different envoy image tag formats: ``` quay.io/cilium/cilium-envoy:f195a0a836629ceca5d7561f758c9505d9ebaebfa262647a2d4 quay.io/cilium/cilium-envoy:v1.23-f195a0a836629ceca5d7561f758c9505d9ebaebfa262647a2d4 ``` Testing was done as per below, kindly note the existing format should be working as usual. ```bash $ test=quay.io/cilium/cilium-envoy:014ceeb312a4d18dcf0ea219143f099fa91f2f28@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53 $ echo $test | sed -E -e 's/[^/]*\/[^:]*:(.*-)?([^:@]*).*/\2/p;d' 014ceeb312a4d18dcf0ea219143f099fa91f2f28 $ test=quay.io/cilium/cilium-envoy:v1.24-014ceeb312a4d18dcf0ea219143f099fa91f2f28@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53 $ echo $test | sed -E -e 's/[^/]*\/[^:]*:(.*-)?([^:@]*).*/\2/p;d' 014ceeb312a4d18dcf0ea219143f099fa91f2f28 ``` Fixes: #24749 Signed-off-by: Tam Mach <tam.mach@cilium.io> 16 April 2023, 00:44:57 UTC
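For readers who prefer Go over sed, the same extraction as in 4190a2d can be written with the standard `regexp` package. The pattern is the one from the commit message; the function name `envoyCommit` is hypothetical:

```go
package main

import (
	"fmt"
	"regexp"
)

// envoyTagRe mirrors the sed expression: skip the registry and repository,
// optionally skip a "vX.Y-" style prefix, then capture the commit SHA up
// to an optional "@sha256:..." digest.
var envoyTagRe = regexp.MustCompile(`[^/]*/[^:]*:(.*-)?([^:@]*).*`)

// envoyCommit returns the commit SHA embedded in an envoy image reference.
func envoyCommit(image string) (string, bool) {
	m := envoyTagRe.FindStringSubmatch(image)
	if m == nil {
		return "", false
	}
	return m[2], true
}

func main() {
	images := []string{
		"quay.io/cilium/cilium-envoy:014ceeb312a4d18dcf0ea219143f099fa91f2f28@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53",
		"quay.io/cilium/cilium-envoy:v1.24-014ceeb312a4d18dcf0ea219143f099fa91f2f28@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53",
	}
	for _, img := range images {
		sha, ok := envoyCommit(img)
		fmt.Println(sha, ok)
	}
}
```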
cf28bba Prepare for release v1.11.16 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 14 April 2023, 11:41:35 UTC
36eaa38 update k8s dependencies to 1.23.17 The last client-go version contains an important bug fix. See https://github.com/kubernetes/kubernetes/pull/115901 for more info. Signed-off-by: André Martins <andre@cilium.io> 13 April 2023, 23:09:29 UTC
f490339 Remove HTTP header value from debug log Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com> 13 April 2023, 18:13:22 UTC
00630cc ipsec: Remove stale XFRM states and policies [ upstream commit 688dc9ac802b11f6c16a9cbc5d60baaf77bd6ed0 ] We recently changed our XFRM states and policies (IPs and marks). We however failed to remove the stale XFRM states and policies and it turns out that they conflict (e.g., the kernel ends up picking the stale policies for encryption instead of the new one). This commit therefore cleans up those stale XFRM states and policies. We can identify them based on mark values and masks (we switched from 0xFF00 to 0XFFFFFF00). The new XFRM states and policies are added as we receive the information on remote nodes. By removing the stale states and policies before the new ones are installed for all nodes, we could cause plain-text traffic on egress and packet drops on ingress. To ensure we never let plain-text traffic out, we will clean up the stale config only once the catch-all default-drop policy is installed. In that way, if there is a brief moment where, for a connection nodeA -> nodeB, we don't have a policy, traffic will be dropped instead of sent in plain-text. For each connection nodeA -> nodeB, those packet drops on egress and ingress of nodeA will happen between the time we replace the BPF datapath and the time we've installed the new XFRM state and policy corresponding to nodeB. Waiting longer to remove the stale states and policies doesn't impact the drops as they will keep happening until the new states and policies are installed. This is all happening on agent startup, as soon as we have the necessary information from k8s. Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
f76ce36 ipsec: Catch-default default drop policy for encryption [ upstream commit 7d44f37509c6271f7196dcec8edc7c417c609dca ] This commit adds a catch-all XFRM policy for outgoing traffic that has the encryption bit. The goal here is to catch any traffic that may pass through our encryption while we are replacing XFRM policies & states. Those operations cannot always be performed atomically so we may have brief moments where there is no XFRM policy to encrypt a subset of traffic. This policy ensures we drop such traffic and don't let it flow in plain text. We do need to match on the mark because there is also traffic flowing through XFRM that we don't want to encrypt (e.g., hostns traffic). Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
4e18c00 ipsec: Custom check for XFRM state existence [ upstream commit ddd491bd8e100f94ca275c87e81f7e2be042c8db ] UpsertIPsecEndpoint is currently unable to replace stale XFRM states. We use XfrmStateAdd, which fails with EEXIST if a state with the same key (IPs, SPI, and mark) already exists. We can't use XfrmStateUpdate because it fails with ESRCH if no state with the specified key exists. Note we don't have the same issue for XFRM policies because XfrmPolicyUpdate doesn't return ESRCH if no such policy already exists. No idea why the two APIs are not consistent. We therefore need to implement a proper 'update or insert' logic for XFRM states ourselves. To that end, we first check if the state we want to add already exists. If it doesn't, we attempt to add it. If it fails with EEXIST, we know that some other state is conflicting. In that case, we attempt to remove any conflicting XFRM states that are found and then attempt to add the new state again. To find conflicting XFRM states, we use the same logic as the kernel does (cf. __xfrm_state_lookup). Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
198bad2 ipsec: Refactor wildcard IP variables [ upstream commit e802c2985fb673526fb1d00b2713b03827e63354 ] These wildcard variables will be used by a later commit in the IPsec logic. Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
9f553bc loader: Don't compile .asm files by default [ upstream commit 92407a836c85ad046e9ec1d335654ae2bc0bf26b ] Today we always compile .asm files for endpoints, even though we rarely use them. They take a lot of space in the sysdumps and increase the overall compile time. This commit changes it to only compile those files if debugging mode is enabled. Reported-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
1068a3b pkg/service: Handle duplicate backends [ upstream commit 5311f81505d3e8127f311628bffaf205d41c9429 ] In certain error scenarios, backends can be leaked, where they were deleted from the userspace state, but left in the datapath backends map. To reconcile datapath and userspace, identify such backends that were created with different IDs but same L3n4Addr hash. This commit builds up on previous commits that don't bail out on such error conditions (e.g., backend IDs mismatch during restore), and tracks backends that are currently referenced in service entries restored from the lb4_services map to restore backend entries. Furthermore, it uses the tracked state to delete any duplicate backends that were previously leaked. Fixes: b79a4a53 (pkg/service: Gracefully terminate service backends) Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
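The duplicate detection described in 1068a3b (backends that share an address hash with a backend still referenced by a restored service, but carry a different ID) might be modelled as below. The `backend` type, the string `Addr` standing in for the L3n4Addr hash, and `splitLeaked` are hypothetical simplifications, and keeping unreferenced non-duplicate backends is an assumption of this sketch rather than something the commit message states.

```go
package main

import "fmt"

// backend is a hypothetical, simplified datapath backend entry.
type backend struct {
	ID   uint32
	Addr string // stands in for the L3n4Addr hash
}

// splitLeaked separates backends to keep from leaked duplicates, given the
// set of backend IDs referenced by the restored service entries.
func splitLeaked(all []backend, referenced map[uint32]bool) (kept, leaked []backend) {
	// Index the addresses of backends that restored services still reference.
	refByAddr := make(map[string]uint32)
	for _, b := range all {
		if referenced[b.ID] {
			refByAddr[b.Addr] = b.ID
		}
	}
	for _, b := range all {
		// Same address, different ID, and not referenced itself: leaked.
		if id, ok := refByAddr[b.Addr]; ok && id != b.ID && !referenced[b.ID] {
			leaked = append(leaked, b)
			continue
		}
		kept = append(kept, b)
	}
	return kept, leaked
}

func main() {
	all := []backend{{1, "10.0.0.1:80"}, {7, "10.0.0.1:80"}, {2, "10.0.0.2:80"}}
	kept, leaked := splitLeaked(all, map[uint32]bool{7: true, 2: true})
	fmt.Println(len(kept), len(leaked))
}
```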
6b36d9e pkg/service: Restore services prior to backends [ upstream commit ebe2b55e2a71ce2a991bd648387b54d6ff8d76bd ] The restore logic attempts to reconcile datapath state with the userspace post agent restart. Previously, it first restored backends from the `lb4_backends` map before restoring service entries from the `lb4_services` map. If there were error scenarios prior to agent restart (for example, backend map full because of leaked backends), the logic would fail to restore backends currently referenced in the services map (and as a result, selected for load-balancing traffic). This commit prioritizes restoring service entries followed by backend entries. Follow-up commit handles error cases such as leaked backends by keeping track of backends retrieved from restoration of service entries, and then using that to subsequently restore backends. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
c00f9a6 pkg/service: Don't bail out on failures [ upstream commit 89a1936bf6dda7fc6816e304529f57656b57e72c ] The restore code attempts to reconcile datapath state with the userspace state post agent restart. Bailing out early on failures prevents any remediation from happening, so log any errors. Follow-up commits will try to handle leaked backends in the cluster if any. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
d8fe1a6 tests: add exceptions for lease errors due to etcd [ upstream commit e773f7e9b155d1e740af5a458a7e6deb60689ca6 ] Following up on #23334, add more exceptions for errors that seem to not be related to Cilium but rather to etcd. Fixes: #24701 Suggested-by: André Martins <andre@cilium.io> Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
c024a65 docs: Fix upgradeCompatibility references [ upstream commit 2f9850ca3653f16e6c003409c7e2e9e6f246bb4b ] The upgradeCompatibility should always be set to the first version that the user installed in order to assume the Helm defaults that were in place during that release. Tracking each version here initially would provide confirmation for users in order to pick a valid version. Except that we forgot to keep it up to date with each release. Drop the examples to reduce user confusion. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
252083f pkg/bandwidth: add error for bandwidth manager not being enabled [ upstream commit 4aa6911868e943623a082a450a7afd0e81889a6b ] If we can't read "procfs" the user will not know the reason for it. We should log the error as well. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
1e022e7 policy: Do not share same policy for multiple cached selectors [ upstream commit: af83b0efeccdc56b89e167c3bcee634554735da4 ] Do not share the same PerSelectorPolicy object between multiple cached selectors. This makes sure that when rules are merged only the rules for the intended selectors are affected. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 13 April 2023, 11:57:07 UTC
043dba0 chore(deps): update dependency cilium/hubble to v0.11.3 Signed-off-by: renovate[bot] <bot@renovateapp.com> 13 April 2023, 08:53:01 UTC
f35abb0 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 10 April 2023, 16:58:47 UTC
c3dc2f2 chore(deps): update docker.io/library/ubuntu:20.04 docker digest to 24a0df4 Signed-off-by: Renovate Bot <bot@renovateapp.com> 10 April 2023, 16:58:47 UTC
ab6af26 envoy: Bump envoy version to v1.23.7 The image hash is coming from below run https://github.com/cilium/proxy/actions/runs/4615766882/jobs/8159983325 Upstream release https://github.com/envoyproxy/envoy/releases/tag/v1.23.7 Signed-off-by: Tam Mach <tam.mach@cilium.io> 06 April 2023, 09:07:04 UTC
c6378c9 chore(deps): update docker.io/library/alpine docker tag to v3.16.5 Signed-off-by: Renovate Bot <bot@renovateapp.com> 05 April 2023, 09:27:43 UTC
79e029b test: fix race condition of deleting cnp in e2e test [ upstream commit 294bcd1d8f884ce9b7c54c1421032ff0ec2706c9 ] There is a flake in e2e test when a test case starts to proceed before cnp comes to take effect by cilium-agent. The correct way to delete cnp is to run "kubectl delete" followed by "cilium policy wait", and kubectl helper already has such wrappers. Backporting conflicts: * minor conflict due to the renaming of test/k8sT to test/k8st in master Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io> 05 April 2023, 09:26:12 UTC
4f47be0 test: fix race condition of deleting ccnp in e2e test [ upstream commit 22a3743d2695fb2321ccf32abb67a778be28f092 ] There is a flake in e2e test when a test case starts to proceed before ccnp comes to take effect by cilium-agent. The correct way to delete ccnp is to run "kubectl delete" followed by "cilium policy wait", and kubectl helper already has such wrappers. Fixes: #24380 Backporting conflicts: * minor conflict due to the renaming of test/k8sT to test/k8st in master Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io> 05 April 2023, 09:26:12 UTC
ee85241 docs: add upgrade notes for 1.11 regressions Since this behavior will be unexpected for users upgrading from 1.10 to 1.11 we should make it obvious in the upgrade notes about this regression. Signed-off-by: André Martins <andre@cilium.io> 05 April 2023, 09:23:30 UTC
8986d78 docs: Fix mitigation for IPsec upgrade issue The mitigation documented in commit ede154e27b ("Add IPSec remark for upgrade to v1.11.15") is actually incomplete. The XFRM policies also need to be flushed. Fixes: ede154e27b ("Add IPSec remark for upgrade to v1.11.15") Signed-off-by: Paul Chaignon <paul@cilium.io> 03 April 2023, 14:20:23 UTC
ede154e Add IPSec remark for upgrade to v1.11.15 Cilium upgrades to v1.11.15 can cause severe problems when IPSec is enabled. This adds a remark to the docs. Signed-off-by: darox <maderdario@gmail.com> 30 March 2023, 07:27:42 UTC
fe958b5 pkg: add missing xfrm-no-track rules from ipv6 [ upstream commit 788bf37bc84cffae93df8da19c34f5bbe3a20f57 ] By right there should be a rule to let ipsec skb bypass conntrack: -A CILIUM_PRE_raw -m mark --mark 0xd00/0xf00 -m comment --comment "cilium-xfrm-notrack:" -j CT --notrack However ipv6 missed it and this commit adds the rule back. Fixes: #23481 Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
7fb7aaf helm/hubble-ui: use v0.11.0 hubble-ui [ upstream commit e980ca07470707d7fcc8b9a54d7525dbdcc21330 ] Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
58c1f25 hubble-ui: allow ingress from non root `/` urls [ upstream commit e29c8ea8d3f85cabfb19dda2b1c71cadb8cd8e01 ] Support the case when ingress is configured to serve hubble-ui from non default `/` root url (ex. `/service-map`). Related hubble-ui pull request: https://github.com/cilium/hubble-ui/pull/432 Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
b1dd38d docs: note there are two Cilium CLIs [ upstream commit b2bc42a180fbe0c62c0dfbf543d0d2685ffa89b9 ] Signed-off-by: Liz Rice <liz@lizrice.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
af0e4ef renovate: Fix Hubble release digest regex [ upstream commit 30a783f58cbd4a1cb743d426fdab5d131c25dcb6 ] Renovate v35 had a breaking change in how it computes the digest of GitHub releases. Previously, it would use the digest of the release attachments, but now it's just using a git sha as the "digest". This commit changes the data source for the Hubble artifacts to use the data source which preserves the old behavior. Ref: https://github.com/renovatebot/renovate/pull/20178 Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
36b5917 docs: fix typo in operations/troubleshooting.rst [ upstream commit 986daf407ae9643ad72b68a81ff8ca1a7a2758bf ] contrack -> conntrack Fixes: 93ebeb3bae11 ("docs: Update the documentation for the conntrack-gc-interval flag") Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
d243cf6 chore(deps): update quay.io/cilium/hubble docker tag to v0.11.3 Signed-off-by: Renovate Bot <bot@renovateapp.com> 28 March 2023, 13:47:25 UTC
d6cd392 Fix for disabled cloud provider rate limiting [ upstream commit 0557a2f6e9407385679dd5831e4b5646b235a2fe ] Earlier versions of these flags had a prefix of ENI and were later renamed with a prefix of IPAM to be consistent across cloud providers. While removing the deprecated old flags, #12676 also removed lines needed for setting OperatorConfig struct fields from the values read in. This resulted in fields having golang defaults, which caused the rate limiter to be completely bypassed. Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
fb47c2e docs: Update the documentation for the conntrack-gc-interval flag [ upstream commit 93ebeb3bae11475e7a4dc581affcef79d4fb7eaa ] The current documentation is incorrect as the default value is 0 and not 5 minutes. 0 implies a dynamic interval value, which this commit now documents. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
387c11c checker: Fix incorrect checker for ExportedEqual() [ upstream commit 9ac5b5326950f06291b5bc3f65c58198a4aa8dc3 ] When using checker.ExportedEqual(), it was using the standard Equals checker under-the-hood, but this is incorrect. Fix it to use the correct checker. In the commit introducing the bug, there was no direct usage of checker.ExportedEqual(), but rather checker.ExportedEquals (note the plural). Subtle! Discovered while working on improving policy unit tests. Fixes: f4407e7c8f9 ("checker: Add ExportedEquals checker") Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
0e86345 Update clustermesh requirements to mention node InternalIP explicitly [ upstream commit 7dbc63a47fd63a4574ba1cb1a59e4f776ecff07f ] Discovered in community slack that k8s node ExternalIP isn't used by clustermesh even if configured. Let's be explicit about InternalIP in the documented reqs, until such time as use of node ExternalIP is supported. Signed-off-by: Jef Spaleta <jspaleta@gmail.com> 23 March 2023, 23:22:38 UTC
9a24073 Fix duplicated logs for test-output.log [ upstream commit 853ec101d8adee8d482040aa038e5b71ce7775e2 ] This patch avoids duplicated info in the test-output.log Fixes: #18515 Signed-off-by: Roman Ptitcyn <romanspb@yahoo.com> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
ba156b4 In service recovery, don't skip if one of the service recovery fails [upstream commit https://github.com/cilium/cilium/commit/018856602b7637b8ae1c796f4ed02fe1bbeb5905] Reduce the number of connections interrupted due to svc/backend ID changes if restoration fails. Fixed by logging the error and continuing instead of returning it. Also log the number of failures and successes in backend restoration using new variables. Signed-off-by: Gaurav Yadav <gaurav.dev.iiitm@gmail.com> Signed-off-by: Jared Ledvina <jared.ledvina@datadoghq.com> 22 March 2023, 21:07:24 UTC
b416e44 chore(deps): update docker.io/library/alpine:3.16.4 docker digest to 2cf17aa Signed-off-by: Renovate Bot <bot@renovateapp.com> 21 March 2023, 14:07:29 UTC
43c2a75 install: Update image digests for v1.11.15 Generated from https://github.com/cilium/cilium/actions/runs/4446402139. `docker.io/cilium/cilium:v1.11.15@sha256:434ea1ff40b8db76c2be6cabfa1bbd2b887eaabe42e757651ea14757468e3bf4` `quay.io/cilium/cilium:v1.11.15@sha256:434ea1ff40b8db76c2be6cabfa1bbd2b887eaabe42e757651ea14757468e3bf4` `docker.io/cilium/clustermesh-apiserver:v1.11.15@sha256:66071d67f0249909c81cc3f94ad1dd2ae51e1451c400183a9337c04b9c1e076f` `quay.io/cilium/clustermesh-apiserver:v1.11.15@sha256:66071d67f0249909c81cc3f94ad1dd2ae51e1451c400183a9337c04b9c1e076f` `docker.io/cilium/docker-plugin:v1.11.15@sha256:e2d10187f4e31a00fd751b6e5ac56bd3698ab6bd3c404cff06b7b2740d4327df` `quay.io/cilium/docker-plugin:v1.11.15@sha256:e2d10187f4e31a00fd751b6e5ac56bd3698ab6bd3c404cff06b7b2740d4327df` `docker.io/cilium/hubble-relay:v1.11.15@sha256:352a65dde7c324ace5d6442f626f82c19550dd581e17f8f7e7aba30325c96d9e` `quay.io/cilium/hubble-relay:v1.11.15@sha256:352a65dde7c324ace5d6442f626f82c19550dd581e17f8f7e7aba30325c96d9e` `docker.io/cilium/operator-alibabacloud:v1.11.15@sha256:712972b46f592bd80a8e4c66e9b5cdcc73705740bf2cea84a6df131107a01699` `quay.io/cilium/operator-alibabacloud:v1.11.15@sha256:712972b46f592bd80a8e4c66e9b5cdcc73705740bf2cea84a6df131107a01699` `docker.io/cilium/operator-aws:v1.11.15@sha256:3aa776003eee064a6896b6ad712f55293d4e045defbe14d3768d224ce254d5c3` `quay.io/cilium/operator-aws:v1.11.15@sha256:3aa776003eee064a6896b6ad712f55293d4e045defbe14d3768d224ce254d5c3` `docker.io/cilium/operator-azure:v1.11.15@sha256:81e5168c977806a7f310aa57cca74c908fe6ea323518804e15c48bc786b99271` `quay.io/cilium/operator-azure:v1.11.15@sha256:81e5168c977806a7f310aa57cca74c908fe6ea323518804e15c48bc786b99271` `docker.io/cilium/operator-generic:v1.11.15@sha256:1feed1b895b39c7bdcbfe6232536e26edba9beb41c160c66d539de4358275a2e` `quay.io/cilium/operator-generic:v1.11.15@sha256:1feed1b895b39c7bdcbfe6232536e26edba9beb41c160c66d539de4358275a2e` 
`docker.io/cilium/operator:v1.11.15@sha256:97e6df665e10a08b2fbb5aefb183564debe0a0a4108b371a2f4d95f38c56f56c` `quay.io/cilium/operator:v1.11.15@sha256:97e6df665e10a08b2fbb5aefb183564debe0a0a4108b371a2f4d95f38c56f56c` Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 17 March 2023, 11:31:42 UTC
c5577e8 Prepare for release v1.11.15 Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 16 March 2023, 00:35:10 UTC
99df085 cmd/ciliumendpoint: guard against nil indexers. [ upstream commit f5c202d70f815aa5873374f50debca476ab2fb04 ] Make cleanup more resilient to conditions where indexer are not set. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 15 March 2023, 18:13:02 UTC
c73d235 daemon: fix panic when running with etcd with endpoint crd disabled [ upstream commit ee7f12c46be74a58be7f7a878329758e6c03db7b ] When running etcd kvstore, if endpoint CRD is disabled then the stale CEP cleanup init procedure panics due to a nil indexer references returned from k8s watchers. This is because the cep/ces k8s watchers aren't initialized if this option is set to true. Fixes: #24366 Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 15 March 2023, 18:13:02 UTC
b29f401 bpf: Ignore HOST_ID resolved from ipcache for IPv6 case [ upstream commit a74affd5c4851de3adface98a0b81bcca9ffde55 ] This PR adds the check to ignore HOST_ID resolved from ipcache for IPv6 case. resolve_srcid_ipv6 should ignore the HOST_ID resolved from ipcache because only the packets marked MARK_MAGIC_HOST are actually from the host. Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp> Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 15 March 2023, 14:41:28 UTC
a5a3d89 Check stale cilium endpoint flag before cleaning [ upstream commit 449bb80ea6dde45e4b22453b03186f4879cc810a ] [ backporter's note: hive changes create conflict due to change in function names, had to modify to accommodate old K8sEnabled func ] The "--enable-stale-cilium-endpoint-cleanup" flag is never checked before running "cleanStaleCEPs". Signed-off-by: Steven Johnson <sjdot@protonmail.com> Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 14 March 2023, 16:36:01 UTC
9566881 agent: install CNI plugin binary in an InitContainer [ upstream commit e1a46216b4dfba98a13f5b32b1272c137b2cc923 ] This reduces the potential security surface of the agent by removing the bind-mount of /opt/cni/bin. Instead, write the binaries once in an initContainer. There is no currently known vulnerability exploiting this, but it's good practice to remove as many long-running host mounts as possible. This could be a potential further exploit vector if an agent were to be compromised. Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Feroz Salam <feroz.salam@isovalent.com> 13 March 2023, 18:18:04 UTC
b68ff83 images: update cilium-{runtime,builder} Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 13 March 2023, 17:34:01 UTC
29f0185 cni: add option to keep CNI configuration file on agent shutdown. [ upstream commit 64a9cc07fa6d7a5b4d379ab85eaf69cd3694bd43 ] [ backporter's note: Had some minor documentation conflicts ] When enabled, the CNI configuration file and plugin binaries will remain on disk, even when the agent is shut down. Without this, it is impossible to delete pods on the node. That can cause issues in hot clusters, where Cilium is deleted for an upgrade, another pod takes its place, and the node has no more capacity -- plus, none can be freed up since CNI deletion no longer works. When enabled, it does have one cosmetic effect: pods will be scheduled to nodes that cannot immediately start them. CNI ADDs and DELs will fail until the agent comes back. The CNI plugin itself always waits 30 seconds for the agent to start, which should mean that few, if any, deletes will actually fail. Likewise, the kubelet will retry. In the case of DEL, the kubelet will eventually give up, marking the pod as deleted and allowing the node to be recovered. In the case where someone is migrating off of Cilium, the value "cni.uninstall=true" will keep the original behavior. Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 09 March 2023, 04:50:02 UTC
8afc18c docs: Document CONFIG_PERF_EVENTS requirement [ upstream commit 56681fe7386545c9b0d60c5d0da1d695dfbafa9e ] This is needed for Hubble events and connection tracking garbage collection. Furthermore, a user reported that without this option, Cilium crashes on startup with the following message: level=fatal msg="Cannot initialise BPF perf ring buffer sockets" error="failed to create perf ring for CPU 0: can't create perf event: function not implemented" Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 09 March 2023, 04:50:02 UTC
ff7629a bugtool: Add ingress/egress tc filter dump [ upstream commit 60eb3e62d8a3d69f3d371bca38373365b0e62810 ] tc filter show dev <dev> doesn't typically show anything, you need to add ingress/egress direction afterwards to pull the actual output. CC: Tom Hadlaw <tom.hadlaw@isovalent.com> Fixes: b13dc89166e9 ("Bugtool: Add additional tc commands.") Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 09 March 2023, 04:50:02 UTC
e894d8d images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 07 March 2023, 19:23:47 UTC
15a84a9 chore(deps): update docker.io/library/ubuntu:20.04 docker digest to 9fa30fc Signed-off-by: Renovate Bot <bot@renovateapp.com> 07 March 2023, 19:23:47 UTC
1dba780 Enable Google Analytics 4 [ upstream commit 488240f419f40e29875622ae20b98091c0e7e6c7 ] [ Backport notes: - Fixed minor conflicts on Documentation/conf.py. - Have a new docs-builder image built and reference it in the CI workflow (former image didn't have GA plug-in). ] Signed-off-by: Patrice Chalin <chalin@cncf.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 March 2023, 05:39:03 UTC
f81894e chore(deps): update all github action dependencies Signed-off-by: Renovate Bot <bot@renovateapp.com> 02 March 2023, 15:51:40 UTC
f4bf073 ipsec: Per-node XFRM states & policies for EKS & AKS [ upstream commit 3e59b681f9171c3afe1e1f7e43ab18e471f67f91 ] On EKS & AKS, we have only one XFRM state and policy per direction. We install those at startup (vs. when receiving k8s node objects on e.g. GKE). This is a consequence of their IPAM modes (cannot predict node from pod IP as we don't have per-node pod CIDRs). To encode the outer IP addresses to use for IPsec encapsulation, this however needs to change. We need to install one XFRM state and policy per remote node. We therefore need to install those on reception of the k8s node objects, as with the cluster-pool IPAM. This commit implements this change. All XFRM state and policies will be installed by the same function, enableIPsec, regardless of the Kubernetes provider. Existing logic in enableIPsec is therefore pushed behind a !n.subnetEncryption() guard. Existing logic for EKS and AKS' IPAMs is moved from enableSubnetIPsec to enableIPsec. This commit fixes IPsec support on EKS and AKS that was broken by a previous commit ("bpf: Remove IP_POOLS IPsec code") in this series. A lot of code moves around, so here's a summary of code changes: 1. The four helper functions at the beginning of changes are removed because they won't be used anymore. 2. All code to install the kernel's XFRM config for EKS & AKS is moved from enableSubnetIPsec to enableIPsec. This is because this code will now be executed in reaction to nodes being created, as on GKE, and not once and for all. 3. In enableIPsec, we differentiate between the AKS/EKS and the GKE cases with the n.subnetEncryption() condition (true if AKS or EKS). Signed-off-by: Paul Chaignon <paul@cilium.io> 02 March 2023, 06:28:32 UTC
630b08f bpf: Free packet mark on bpf_overlay recirculation [ upstream commit e625cded11d5f624bcbd008dbb241c8644705807 ] IPsec-encrypted packets coming from the overlay are recirculated to the overlay device and bpf_overlay after decryption. On the first pass, we can retrieve the source security identity from the tunnel metadata, but on the second pass, we don't have that metadata anymore. Therefore, today, we pass the security identity from the first bpf_overlay traversal to the second via the packet mark. Unfortunately, in the next commit, we need the packet mark's bits to match against XFRM IN states. This commit therefore frees those mark bits. To that end, we perform an ipcache lookup on the second bpf_overlay traversal to retrieve the security identity for the inner source IP address. That has obviously a higher cost than reading the packet mark, but given we're on the decryption path, it's probably negligible anyway. As a result of this change, we will also be missing the source identity in the very first TRACE_FROM_STACK packet trace in bpf_overlay. I think that's fine as it's usually the case for TRACE_FROM_* traces. Signed-off-by: Paul Chaignon <paul@cilium.io> 02 March 2023, 06:28:32 UTC
60a9ec6 bpf: Write node ID into packet mark for IPsec [ upstream commit dadf041ff93883599b2ebd7477968f194ee4543d ] Previous commits (1) populated the ipcache with the ID of the node hosting each remote endpoint and (2) updated XFRM marks to expect a node ID such that:
    0xXXXXXE00
      ^   ^^
      |   |+-- Set to 0xE to require encryption
      |   +-- Set to the IPsec SPI to use
      +----- Set to the node ID
This commit updates our BPF code to set the packet mark expected by the XFRM policies & states, using information from the ipcache. Signed-off-by: Paul Chaignon <paul@cilium.io> 02 March 2023, 06:28:32 UTC
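The mark layout described in 60a9ec6 can be illustrated with a small encode/decode sketch. The actual change lives in BPF C code; this Go version only demonstrates the bit arithmetic. The function and constant names are hypothetical, and the exact field widths (16 bits of node ID, 4 bits of SPI) are read off the 0xXXXXXE00 pattern above rather than taken from Cilium's sources.

```go
package main

import "fmt"

const (
	markEncrypt     uint32 = 0x0E00 // 0xE in bits 8-11 requests encryption
	markEncryptMask uint32 = 0x0F00
	spiShift               = 12 // one hex digit of SPI above the 0xE nibble
	nodeIDShift            = 16 // node ID occupies the upper four hex digits
)

// encodeMark builds a packet mark of the form 0xNNNNSE00 from a node ID
// and an IPsec SPI (only the low four bits of the SPI are used).
func encodeMark(nodeID uint16, spi uint8) uint32 {
	return uint32(nodeID)<<nodeIDShift | uint32(spi&0xF)<<spiShift | markEncrypt
}

// decodeMark splits a mark back into node ID, SPI, and the encryption flag.
func decodeMark(mark uint32) (nodeID uint16, spi uint8, encrypt bool) {
	nodeID = uint16(mark >> nodeIDShift)
	spi = uint8((mark >> spiShift) & 0xF)
	encrypt = mark&markEncryptMask == markEncrypt
	return nodeID, spi, encrypt
}

func main() {
	mark := encodeMark(0x1234, 3)
	nodeID, spi, encrypt := decodeMark(mark)
	fmt.Printf("%#x %#x %d %v\n", mark, nodeID, spi, encrypt)
}
```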
db73f33 bpf: Pass security ID via skb->cb from lxc to host [ upstream commit 4de20fc2027afe0ce436c1d33147e6dcc943f055 ] We need to free the packet mark between bpf_lxc and bpf_host so that it can carry the node ID when IPsec is enabled, to match against XFRM states. Instead of using the packet mark to pass the source security ID, we thus use a skb->cb field. Signed-off-by: Paul Chaignon <paul@cilium.io> 02 March 2023, 06:28:32 UTC