https://github.com/cilium/cilium

Revision Author Date Message Commit Date
71728fd Pick up the latest startup-script image [ upstream commit 5c9b66ce093520b29e03d2ed36f5a2bd5d1b6db4 ] [ backporter's note: Fixed conflict in the install/kubernetes/Makefile.values and regenerated relevant documents. ] Upgrading this image is not automated yet. Ref: #25773 Ref: https://github.com/cilium/image-tools/pull/218 Ref: https://quay.io/repository/cilium/startup-script?tab=tags Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 09 June 2023, 11:16:01 UTC
186cb98 test: Collect sysdump as part of artifacts [ upstream commit e93fdd87b96328847bfe31ab9343ccfef4843b93 ] Once we have a sysdump in the test artifacts a lot of files we collect will become duplicates. This commit however doesn't remove all those duplicate files from the test artifacts. Let's wait a bit and confirm the sysdump collection always works before cleaning things up. The sysdump collection was tested by making a test fail on purpose. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 09 June 2023, 11:16:01 UTC
fb34e01 Set hostServices=true for smoke test Setting hostServices=false with KPR=partial increases complexity which breaks the SmokeTest on newer kernels. Disable it temporarily. Signed-off-by: Anton Protopopov <aspsk@isovalent.com> 08 June 2023, 14:35:32 UTC
296303b Temporarily disable part of the conformance-kind test Temporarily disable the ipsec part of the conformance-kind test, as it is currently broken on new kernels (most probably, due to the commit [1] which greatly increases complexity). A proper fix would be to split bpf_lxc.c:from-container into more tail calls, but this is work in progress. [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=354e8f1970f821d4952458f77b1ab6c3eb24d530 Signed-off-by: Anton Protopopov <aspsk@isovalent.com> 08 June 2023, 14:35:32 UTC
176a695 chore(deps): update quay.io/cilium/hubble docker tag to v0.11.6 Signed-off-by: renovate[bot] <bot@renovateapp.com> 07 June 2023, 22:06:19 UTC
7dc319e bug: Fix Potential Nil Reference in GetLabels Implementation [ upstream commit bfbe5a26a458e114a5b8b261ed719a85a8ceff35 ] The policyIdentityLabelLookup wrapper for Endpoint implements the GetLabels interface method. This is necessary for constructing the MapState of the policy engine. This implementation incorrectly did not check if the identity returned by LookupIdentityByID was nil. This fixes this bug, which heretofore has not caused any issues. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 02 June 2023, 13:46:30 UTC
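The nil check described in 7dc319e can be illustrated with a minimal Go sketch. This is not the Cilium code: `Identity`, `lookupIdentityByID`, and `labelsFor` are hypothetical stand-ins, and the toy `identities` map is an assumption made only so the example runs on its own.

```go
package main

import "fmt"

// Identity is a hypothetical stand-in for a security identity.
type Identity struct {
	ID     uint32
	Labels []string
}

// identities is a toy identity store used only for this sketch.
var identities = map[uint32]*Identity{
	1: {ID: 1, Labels: []string{"reserved:host"}},
}

// lookupIdentityByID returns nil when the identity is unknown,
// mirroring a lookup that can legitimately find nothing.
func lookupIdentityByID(id uint32) *Identity {
	return identities[id]
}

// labelsFor checks the lookup result for nil before dereferencing it.
// Without the check, an unknown ID would cause a nil pointer panic.
func labelsFor(id uint32) []string {
	ident := lookupIdentityByID(id)
	if ident == nil {
		return nil
	}
	return ident.Labels
}

func main() {
	fmt.Println(labelsFor(1))  // known identity
	fmt.Println(labelsFor(42)) // unknown identity, no panic
}
```

Returning a nil slice for an unknown identity is one possible choice; a caller that needs to distinguish "unknown" from "no labels" would want a second return value instead.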
37a0453 policy: Fix concurrent access of SelectorCache [ upstream commit 52ace8e9ea318fe79e86731bddbc0abc97843311 ] Marco Iorio reports that with previous code, Cilium could crash at runtime after importing a network policy, with the following error printed to the logs: fatal error: concurrent map read and map write The path for this issue is printed also in the logs, with the following call stack: pkg/policy.(*SelectorCache).GetLabels(...) pkg/policy.(*MapStateEntry).getNets(...) pkg/policy.entryIdentityIsSupersetOf(...) pkg/policy.MapState.denyPreferredInsertWithChanges(...) pkg/policy.MapState.DenyPreferredInsert(...) pkg/policy.(*EndpointPolicy).computeDirectionL4PolicyMapEntries(...) pkg/policy.(*EndpointPolicy).computeDesiredL4PolicyMapEntries(...) pkg/policy.(*selectorPolicy).DistillPolicy(...) pkg/policy.(*cachedSelectorPolicy).Consume(...) pkg/endpoint.(*Endpoint).regeneratePolicy(...) ... Upon further inspection, this call path is not grabbing the SelectorCache lock at any point. If we check all of the incoming calls to this function, we can see multiple higher level functions calling into this function. The following tree starts from the deepest level of the call stack and increasing indentation represents one level higher in the call stack. 
INCOMING CALLS
- f GetLabels github.com/cilium/cilium/pkg/policy • selectorcache.go
  - f getNets github.com/cilium/cilium/pkg/policy • mapstate.go
    - f entryIdentityIsSupersetOf github.com/cilium/cilium/pkg/policy • mapstate.go
      - f denyPreferredInsertWithChanges github.com/cilium/cilium/pkg/policy • mapstate.go
        - f DenyPreferredInsert github.com/cilium/cilium/pkg/policy • mapstate.go
          - f computeDirectionL4PolicyMapEntries github.com/cilium/cilium/pkg/policy • resolve.go
            - f computeDesiredL4PolicyMapEntries github.com/cilium/cilium/pkg/policy • resolve.go
              + f DistillPolicy github.com/cilium/cilium/pkg/policy • resolve.go <--- No SelectorCache lock
          - f DetermineAllowLocalhostIngress github.com/cilium/cilium/pkg/policy • mapstate.go
            + f DistillPolicy github.com/cilium/cilium/pkg/policy • resolve.go <--- No SelectorCache lock
        - f consumeMapChanges github.com/cilium/cilium/pkg/policy • mapstate.go
          + f ConsumeMapChanges github.com/cilium/cilium/pkg/policy • resolve.go <--- Already locks the SelectorCache
Read the above tree as "GetLabels() is called by getNets()", "getNets() is called by entryIdentityIsSupersetOf()", and so on. Siblings at the same level of indent represent alternate callers of the function that is one level of indentation less in the tree, i.e. DenyPreferredInsert() and consumeMapChanges() both call denyPreferredInsertWithChanges(). As annotated above, we see that calls through DistillPolicy() do not grab the SelectorCache lock. Given that ConsumeMapChanges() grabs the SelectorCache lock, we cannot introduce a new lock acquisition in any descendant function, otherwise it would introduce a deadlock in goroutines that follow that call path. This provides us the option to lock at some point from the sibling of consumeMapChanges() or higher in the call stack. 
Given that the ancestors of DenyPreferredInsert() are all from DistillPolicy(), we can amortize the cost of grabbing the SelectorCache lock by grabbing it once for the policy distillation phase rather than putting the lock into DenyPreferredInsert() where the SelectorCache could be locked and unlocked for each map state entry. Future work could investigate whether these call paths could make use of the IdentityAllocator's cache of local identities for the GetLabels() call rather than relying on the SelectorCache, but for now this patch should address the immediate locking issue that triggers agent crashes. CC: Nate Sweet <nathanjsweet@pm.me> Fixes: c9f0def587e6 ("policy: Fix Deny Precedence Bug") Reported-by: Marco Iorio <marco.iorio@isovalent.com> Co-authored-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 02 June 2023, 13:46:30 UTC
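The locking strategy argued for in 37a0453 (take the lock once for the whole distillation phase rather than once per map state entry) can be sketched in Go as follows. This is a simplified illustration under stated assumptions: `selectorCache`, `getLabelsLocked`, `update`, and `distill` are hypothetical names, and the real SelectorCache is considerably more involved.

```go
package main

import (
	"fmt"
	"sync"
)

// selectorCache is a hypothetical, simplified cache guarded by an RWMutex.
type selectorCache struct {
	mu     sync.RWMutex
	labels map[uint32][]string
}

// getLabelsLocked reads the map without locking.
// Callers must already hold c.mu (read or write).
func (c *selectorCache) getLabelsLocked(id uint32) []string {
	return c.labels[id]
}

// update takes the write lock, as a concurrent policy import would.
func (c *selectorCache) update(id uint32, lbls []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.labels[id] = lbls
}

// distill takes the read lock once for the whole batch instead of
// locking and unlocking around every single lookup.
func (c *selectorCache) distill(ids []uint32) int {
	c.mu.RLock()
	defer c.mu.RUnlock()
	total := 0
	for _, id := range ids {
		total += len(c.getLabelsLocked(id))
	}
	return total
}

func main() {
	c := &selectorCache{labels: map[uint32][]string{}}
	var wg sync.WaitGroup
	wg.Add(2)
	go func() { // writer: simulates policy imports
		defer wg.Done()
		for i := uint32(0); i < 1000; i++ {
			c.update(i, []string{"k8s:app=demo"})
		}
	}()
	go func() { // reader: simulates endpoint regeneration
		defer wg.Done()
		for i := 0; i < 1000; i++ {
			c.distill([]uint32{1, 2, 3})
		}
	}()
	wg.Wait()
	fmt.Println(c.distill([]uint32{1, 2, 3}))
}
```

Removing the `RLock` in `distill` reproduces the class of failure quoted in the commit message ("concurrent map read and map write") when run with the race detector or under enough load. Note that `getLabelsLocked` must never take the lock itself; doing so on a path that already holds it is the deadlock risk the commit message describes.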
b6970da policy: Fix Deny Precedence Bug [ upstream commit c9f0def587e662c2b2ac4501362c6f44aa62ee71 ] - Add Tests for Deny Precedence Bug: Currently, when a broad "Deny" policy is paired with a specific "Unmanaged CIDR" policy, then the "Unmanaged CIDR" policy will still be inserted into the policy map for an endpoint. This results in "Deny" policies not always taking precedence over "Allow" policies. This test confirms the bug's existence. - Fix Deny Precedence Bug: When the policy map state is created, CIDRs are now checked against one another to ensure that deny-rules supersede allow-rules when they should. `DenyPreferredInsert` has been refactored to use utility methods that make the complex boolean logic of policy precedence more atomic. Add `NetsContainsAny` method to `pkg/ip` to compare cases where one set of networks contains or is equal to any network in another set. - endpoint: Add policy.Identity Implementation A `policy.Identity` implementation is necessary for the incremental update to the endpoint's policy map that can occur with L7 changes. Valid deny-policy entries may prohibit these L7 changes based on CIDR rules, which are only obtainable by looking up all potentially conflicting policies' labels. Thus `l4.ToMapState` needs access to the identity allocator to lookup "random" identity labels. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 02 June 2023, 13:46:30 UTC
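The containment test that b6970da describes for `NetsContainsAny` (one set of networks contains or equals any network in another set) can be sketched in Go. The signature and implementation below are assumptions for illustration only; the commit message does not show the real function in `pkg/ip`, so the helper here is deliberately named `netsContainsAny` and `mustParse` is a hypothetical convenience.

```go
package main

import (
	"fmt"
	"net"
)

// containsNet reports whether outer contains or is equal to inner.
func containsNet(outer, inner *net.IPNet) bool {
	outerOnes, outerBits := outer.Mask.Size()
	innerOnes, innerBits := inner.Mask.Size()
	if outerBits != innerBits { // IPv4 vs IPv6: never comparable
		return false
	}
	// outer must be at least as broad as inner and cover its base address.
	return outerOnes <= innerOnes && outer.Contains(inner.IP)
}

// netsContainsAny reports whether any network in a contains or equals
// any network in b. Hypothetical sketch, not the Cilium implementation.
func netsContainsAny(a, b []*net.IPNet) bool {
	for _, outer := range a {
		for _, inner := range b {
			if containsNet(outer, inner) {
				return true
			}
		}
	}
	return false
}

// mustParse is a helper for the example; it panics on malformed CIDRs.
func mustParse(cidrs ...string) []*net.IPNet {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets = append(nets, n)
	}
	return nets
}

func main() {
	deny := mustParse("10.0.0.0/8")
	allow := mustParse("10.1.2.0/24", "192.168.0.0/16")
	fmt.Println(netsContainsAny(deny, allow)) // true: the broad deny covers 10.1.2.0/24
	fmt.Println(netsContainsAny(allow, deny)) // false: no allow network covers 10.0.0.0/8
}
```

In the scenario from the commit message, a `true` result for (deny, allow) is the signal that a broad deny rule should take precedence over the more specific allow entry.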
ddeaf64 envoy: Never use x-forwarded-for header [ upstream commit e8fcd6bfcd91e0eabe7d4049f84b1f706d68bc38 ] Envoy by default gets the source address from the `x-forwarded-for` header, if present. Always add an explicit `use_remote_address: true` for Envoy HTTP Connection Manager configuration to disable the default behavior. Also set the `skip_xff_append: true` option to retain the old behavior of not adding `x-forwarded-for` headers on cilium envoy proxy. Setting these options is not really needed for admin and metrics listeners, or most of the tests, but we add them there too in case anyone uses them as a source of inspiration for a real proxy configuration. This fixes incorrect hubble flow data when HTTP requests contain an `x-forwarded-for` header. This change has no effect on Cilium policy enforcement where the source security identity is always resolved before HTTP headers are parsed. Fixes: #25630 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 01 June 2023, 07:35:43 UTC
887c79b test/fqdn: Switch from jenkins.cilium.io to cilium.io [ upstream commit f66f4b159d24e1b5c8e4d92a69932ef003cff87d ] jenkins.cilium.io is down since Thursday. We can simply switch to cilium.io. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 24 May 2023, 12:32:25 UTC
a7edc23 test/fqdn: Avoid hardcoding the test FQDN [ upstream commit 4bcfebc11c29e94cc91a6cc5a027562f9ec0be20 ] Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 24 May 2023, 12:32:25 UTC
1216e5b test: Use DinD in L4LB tests The DinD for the tests was introduced in [1]. However, it never made it into the v1.11 branch, which made the GHA always fail. Fix this by taking the test.sh files from the main branch. [1]: https://github.com/cilium/cilium/pull/22653 Signed-off-by: Martynas Pumputis <m@lambda.lt> 22 May 2023, 07:04:58 UTC
1689a2c install: Update image digests for v1.11.17 Generated from https://github.com/cilium/cilium/actions/runs/5006290922. `docker.io/cilium/cilium:v1.11.17@sha256:6c3132e34e66734752de798eb8519dafa77b9f0da1033e9bed7f7be30ce10358` `quay.io/cilium/cilium:v1.11.17@sha256:6c3132e34e66734752de798eb8519dafa77b9f0da1033e9bed7f7be30ce10358` `docker.io/cilium/clustermesh-apiserver:v1.11.17@sha256:022f8b23f9e977a74b8da25ac98fbeed65bd9c132362797681264bd13abc0349` `quay.io/cilium/clustermesh-apiserver:v1.11.17@sha256:022f8b23f9e977a74b8da25ac98fbeed65bd9c132362797681264bd13abc0349` `docker.io/cilium/docker-plugin:v1.11.17@sha256:ed49556f92b95ff339e99938bbd5649d5dc90e8378cb67a820df6bac1979ffa2` `quay.io/cilium/docker-plugin:v1.11.17@sha256:ed49556f92b95ff339e99938bbd5649d5dc90e8378cb67a820df6bac1979ffa2` `docker.io/cilium/hubble-relay:v1.11.17@sha256:d880ee0184f1ca0fffbd73374424ae2c4d1c26af14005a58103ef695816a78ff` `quay.io/cilium/hubble-relay:v1.11.17@sha256:d880ee0184f1ca0fffbd73374424ae2c4d1c26af14005a58103ef695816a78ff` `docker.io/cilium/operator-alibabacloud:v1.11.17@sha256:36999e2fefb8f1ce3a791f60c61055b3bdde350dff5128ce3f4a5fbe31c6f341` `quay.io/cilium/operator-alibabacloud:v1.11.17@sha256:36999e2fefb8f1ce3a791f60c61055b3bdde350dff5128ce3f4a5fbe31c6f341` `docker.io/cilium/operator-aws:v1.11.17@sha256:e96a7d34ed9386a00b0c7d73946f92872280f84addcc951780c42a56dfaeae9c` `quay.io/cilium/operator-aws:v1.11.17@sha256:e96a7d34ed9386a00b0c7d73946f92872280f84addcc951780c42a56dfaeae9c` `docker.io/cilium/operator-azure:v1.11.17@sha256:20cf49d57fdccc599cfefc5a6ab0ed152dac52d45d8a2339fd3ad19415aaebba` `quay.io/cilium/operator-azure:v1.11.17@sha256:20cf49d57fdccc599cfefc5a6ab0ed152dac52d45d8a2339fd3ad19415aaebba` `docker.io/cilium/operator-generic:v1.11.17@sha256:f77cf55ebc47174fb64fd8ffd030015e55817ed9a6bfab46d0ee917a7ed198e5` `quay.io/cilium/operator-generic:v1.11.17@sha256:f77cf55ebc47174fb64fd8ffd030015e55817ed9a6bfab46d0ee917a7ed198e5` 
`docker.io/cilium/operator:v1.11.17@sha256:c1cad3137dfa80c1d415dff43f064b91992158ce56899b093b0294382ae57289` `quay.io/cilium/operator:v1.11.17@sha256:c1cad3137dfa80c1d415dff43f064b91992158ce56899b093b0294382ae57289` Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 17 May 2023, 18:24:14 UTC
e86fde3 Prepare for release v1.11.17 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 15 May 2023, 13:12:15 UTC
a96172d images: update cilium-{runtime,builder} Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 15 May 2023, 11:32:12 UTC
0cbf76b Update CNI to 1.3.0 [ upstream commit 4ebf4e7a81a8f60154d693ecf4c844f6bdcb62e6 ] Run `images/scripts/update-cni-version.sh 1.3.0` to update the CNI version. Ref: https://github.com/containernetworking/plugins/releases/tag/v1.3.0 Signed-off-by: Yongkun Gui <ygui@google.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 15 May 2023, 11:32:12 UTC
a4d2a1f Add helm-toolbox image for helm docs, lint [ upstream commit e4ba2aa24bd7f28291a24a67acabd8e2fdbd09e6 ] Use https://github.com/cilium/helm-toolbox as an image for managing all of our helm formatting & linting needs. This implicitly updates: * helm (version from dev environment) -> 3.9.0 * helm-docs (custom build from Bruno) -> 1.10.0 * m2r 0.2.1 -> m2r2 0.3.2 Signed-off-by: Joe Stringer <joe@cilium.io> 12 May 2023, 19:04:57 UTC
7a1ad6d test/provision: Only install bpf mount if not already there Avoid failing VM start due to mount failing: Created symlink /etc/systemd/system/multi-user.target.wants/sys-fs-bpf.mount → /etc/systemd/system/sys-fs-bpf.mount. Failed to restart sys-fs-bpf.mount: Unit sys-fs-bpf.mount has a bad unit file setting. See system logs and 'systemctl status sys-fs-bpf.mount' for details. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 11:14:59 UTC
a9404a1 vagrant: Bump 4.9 Vagrant box (Linux 4.9.326, to fix a kernel bug) [ upstream commit 07e7fb0073ab387108ac6b4c126df1a34e36d5d2 ] We have been hitting a kernel bug on 4.9 for the verifier tests. An underflow on the memlock rlimit counter, caused by the reallocation of BPF programs not updating the charged values, makes the counter go under zero and convert into a huge value, blocking all further loads of BPF objects [0]. This has been fixed in kernel 4.10 [1], and was backported at last in 4.9.326. We generated a new Ubuntu image based on that, let's update. [0] cilium/cilium#20288 [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?h=5ccb071e97fbd9ffe623a0d3977cc6d013bee93c [ Backport note: Only update the v4.9 image, not the cilium-dev image because version 232 also contains an updated Go version. This is fine for VM images used in tests because they use CI images built by GH actions using the proper Go version for the branch. ] Signed-off-by: Quentin Monnet <quentin@isovalent.com> 12 May 2023, 11:14:59 UTC
dc187e2 helm chart: v1.11 base : hubble-ui deployment : restore nodeSelector and tolerations Signed-off-by: Bryan Stenson <bryan.stenson@okta.com> 12 May 2023, 11:13:23 UTC
5cf233e ipsec: Install default-drop XFRM policy sooner [ upstream commit 2045d593f7685a63a25338f9eb85a6da4997ce85 ] We currently install the default-drop XFRM policy when we install the XFRM policies and states for the local node. It is however possible for us to start installing XFRM policies and states for remote nodes before we handle the local one. The default-drop XFRM policy is a safety measure for when we move XFRM policies around. Because we don't always have a way to atomically update XFRM policies, it's possible that we end up with a very short time where no encryption XFRM OUT policy is matching a subset of traffic. The default-drop policy ensures that we drop such traffic instead of letting it leave the node as plain-text. We therefore want this default-drop XFRM policy to be installed before we update any other XFRM policy. This commit therefore moves its installation before any other XFRM update instead of before just local-node XFRM updates. Fixes: 7d44f37509 ("ipsec: Catch-default default drop policy for encryption") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
8dc5a04 linux/node_id: do not attempt to map NoID [ upstream commit 9115e05703e92718b1c08e6f8faffa22ac806336 ] We correctly detect that we failed to allocate a new node ID (due to exhaustion of the idpool), but then still go ahead and map it. This leads to spurious errors which include "Failed to map node IP address to allocated ID". Instead, don't try to map NoID and return it directly. Fixes: af88b42bd4 (datapath: Introduce node IDs) Suggested-by: Paul Chaignon <paul@cilium.io> Signed-off-by: David Bimmler <david.bimmler@isovalent.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
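The control flow that 8dc5a04 describes (detect pool exhaustion, then return NoID directly instead of trying to map it) can be sketched in Go. Everything below is hypothetical: the `allocator` type, the `noID` sentinel value of 0, and the `mapCalls` counter are assumptions that exist only to make the early return observable.

```go
package main

import (
	"errors"
	"fmt"
)

// noID is a hypothetical sentinel meaning "no node ID available".
const noID uint16 = 0

// allocator is a toy node ID allocator with a bounded pool.
type allocator struct {
	next, max uint16
	idsByIP   map[string]uint16
	mapCalls  int // counts map attempts so the sketch can show the early return
}

// allocateFromPool returns noID once the pool is exhausted.
func (a *allocator) allocateFromPool() uint16 {
	if a.next > a.max {
		return noID
	}
	id := a.next
	a.next++
	return id
}

// mapID records the IP-to-ID mapping and rejects the sentinel.
func (a *allocator) mapID(ip string, id uint16) error {
	a.mapCalls++
	if id == noID {
		return errors.New("failed to map node IP address to allocated ID")
	}
	a.idsByIP[ip] = id
	return nil
}

// allocateNodeID returns the existing ID for ip or allocates a new one.
// On exhaustion it returns noID directly and never calls mapID, which
// avoids the spurious mapping error.
func (a *allocator) allocateNodeID(ip string) uint16 {
	if id, ok := a.idsByIP[ip]; ok {
		return id
	}
	id := a.allocateFromPool()
	if id == noID {
		return noID
	}
	if err := a.mapID(ip, id); err != nil {
		return noID
	}
	return id
}

func main() {
	a := &allocator{next: 1, max: 1, idsByIP: map[string]uint16{}}
	fmt.Println(a.allocateNodeID("10.0.0.1")) // first allocation succeeds
	fmt.Println(a.allocateNodeID("10.0.0.2")) // pool exhausted, sentinel returned
	fmt.Println(a.mapCalls)                   // mapID ran only for the successful allocation
}
```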
665be40 Delete Cilium monitor verbose mode test [ upstream commit 7335aa38a9c6a148b12b267ff6625a2290bb7d2a ] Another option would be to quarantine the test and find an assignee to make the test more robust, but I assert that we don't need test coverage for monitor verbose output. Fixes: #25178 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
ae0ed5a agent: Handle correctly state when CEP is present in multiple CES. [ upstream commit 71af0a2f4147c7706fb98fef0836b4de5eaa7e8a ] [ Backporter's note: Added '!privileged_tests' to the test ] There are conditions in which a CEP changes its CES. This leads to the CEP being present in multiple CESs for some time. In such cases the standard logic may not work, as it always expects a single CEP representation. This commit changes the logic to handle multiple CEPs properly. Signed-off-by: Alan Kutniewski <kutniewski@google.com> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
307cd56 test: Cover IPsec + VXLAN + endpoint routes [ upstream commit 54fa995c56d914400b977230d0eb34516bf06606 ] [ Backporter's notes: Dropped call to 'helpers.RunsOnAKS()' which does not exist in v1.11 ] The previous commit fixed a connectivity bug affecting the above configuration. We can now extend tests to cover that configuration. Note that these tests are soon going to be removed and replaced by the new GitHub workflow. However, we may need to backport this pull request to stable branches where the GitHub workflow doesn't exist. Therefore, the corresponding extension of the workflow test will be done in a separate pull request. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
a66ebf4 ipsec: Don't match on packet mark for FWD XFRM policy [ upstream commit d39ca10f849060ca90f4a1ddc54734d0e05ba80a ] While extending datapath coverage, Martynas found a new bug affecting IPsec + VXLAN + endpoint routes. In that configuration, cross-node pod connectivity seems to fail and we see the IPsec XfrmInNoPols error counter increasing. By tracing the connection, we can see that the packet disappears on the receiving node, after decryption, between bpf_overlay (second traversal) and the lxc device. On that path, given decryption already happened, we should match the FWD XFRM policy: src 0.0.0.0/0 dst 10.244.0.0/24 uid 0 dir fwd action allow index 106 priority 2975 share any flag (0x00000000) lifetime config: limit: soft (INF)(bytes), hard (INF)(bytes) limit: soft (INF)(packets), hard (INF)(packets) expire add: soft 0(sec), hard 0(sec) expire use: soft 0(sec), hard 0(sec) lifetime current: 0(bytes), 0(packets) add 2023-01-13 14:34:18 use 2023-01-13 14:34:22 mark 0/0xf00 Clearly, given the non-zero XfrmInNoPols, the packet doesn't match the policy. Note XfrmInNoPols is also reported for FWD; there is no XfrmFwdNoPols. Checking the source code [1], we can see that when endpoint routes are enabled, we encode the source security identity into the packet mark. We thus won't match the 0/0xf00 mark on the FWD XFRM policy. We shouldn't need to match on any packet mark for the FWD XFRM policy; we want to allow all packets through. This commit therefore removes the packet mark match for the FWD direction. 1 - https://github.com/cilium/cilium/blob/v1.13.0-rc4/bpf/lib/l3.h#L151-L154 Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 12 May 2023, 05:22:54 UTC
421d611 chore(deps): update hubble cli to v0.11.5 Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 May 2023, 17:16:44 UTC
d3bf9d7 agent: dump stack on stale probes [ backport of d85c0939824ea3508f2a1b0ef56ddbc8d197e588 ] [ upstream commit 87f7a11ecc68b1efdc1454b520abc22470a91d01 ] Most of the time, when we see a stale probe, it's due to a deadlock. So, write a stack dump to disk (since we're probably going to be restarted soon due to a liveness probe). To prevent any sort of excessive resource consumption, only dump stack once every 5 minutes, and always write to the same file. Also, let's make the check lock-free while we're at it. Also, make sure we capture this file in bugtool. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 11 May 2023, 17:15:49 UTC
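The behaviour described in d3bf9d7 (dump all goroutine stacks at most once every 5 minutes, always to the same file, with a lock-free check) could look roughly like the Go sketch below. It is an assumption-laden illustration, not the agent's code: `maybeDumpStack`, `lastDumpNanos`, the 1 MiB buffer, and the file name under the temp directory are all made up for the example.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"runtime"
	"sync/atomic"
	"time"
)

// dumpInterval is the minimum time between two stack dumps.
const dumpInterval = 5 * time.Minute

// lastDumpNanos holds the UnixNano timestamp of the last dump (0 = never).
// It is only accessed through sync/atomic, so no mutex is needed.
var lastDumpNanos int64

// maybeDumpStack writes all goroutine stacks to path unless a dump
// happened less than dumpInterval ago. It reports whether it dumped.
func maybeDumpStack(path string, now time.Time) (bool, error) {
	last := atomic.LoadInt64(&lastDumpNanos)
	if last != 0 && now.UnixNano()-last < int64(dumpInterval) {
		return false, nil // rate limited
	}
	// Only one goroutine wins the swap; the others skip the dump.
	if !atomic.CompareAndSwapInt64(&lastDumpNanos, last, now.UnixNano()) {
		return false, nil
	}
	buf := make([]byte, 1<<20) // output is truncated if stacks exceed 1 MiB
	n := runtime.Stack(buf, true)
	// Always the same file, so repeated dumps cannot fill the disk.
	return true, os.WriteFile(path, buf[:n], 0o644)
}

// demo simulates three stale probes at t, t+1m and t+6m.
func demo() []bool {
	atomic.StoreInt64(&lastDumpNanos, 0)
	path := filepath.Join(os.TempDir(), "stale-probe-stack.txt")
	start := time.Now()
	var results []bool
	for _, offset := range []time.Duration{0, time.Minute, 6 * time.Minute} {
		dumped, err := maybeDumpStack(path, start.Add(offset))
		if err != nil {
			panic(err)
		}
		results = append(results, dumped)
	}
	return results
}

func main() {
	fmt.Println(demo()) // [true false true]
}
```

Passing `now` in as a parameter keeps the rate limit testable without sleeping; a production version would more likely call `time.Now()` internally.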
0834b37 inctimer: fix test flake where timer does not fire within time. [ upstream commit e695e48b171be3e60ecd8ef56c99b1b71ba1316c ] Running the test in a cpu constrained environment, such as: ``` docker run -v $(pwd):$(pwd) -w $(pwd) --cpus=0.1 -it golang:bullseye ./inctimer.test -test.v ``` I can fairly consistently reproduce a flake where the inctimer.After does not fire in time. If I allow it to wait for an additional couple of ms, this seems to be sufficient to prevent failure. It appears that goroutine scheduling latency can be significantly delayed in cpu restricted environments. This seems unavoidable, so to fix the flake I'll allow the test to wait another 2ms to see if the inctimer eventually fires. This will also log an error for delayed test fires, so if there are any other issues we can more easily debug them in the future. Fixed: #25202 Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
8fc726d docs: Add platform support to docs [ upstream commit 9a38aecc71b34c4b6ae95fcadd126f77ebb200ad ] We've been distributing ARM architecture images for Cilium for almost two years, but neglected to mention this up front in the system requirements or the main docs page. Add this to the docs. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
9edbc6e Makefile: use a specific template for mktemp files [ upstream commit db3e0152c6583e0cbec1013e514526adf7229faf ] Before this patch, we would hit a controller-gen[1] bug when the temporary file would be of the form tmp.0oXXXXXX. This patch uses a custom mktemp template that will not trigger the bug. [1]: https://github.com/kubernetes-sigs/controller-tools/issues/734 Signed-off-by: Alexandre Perrin <alex@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
6b920c6 docs: Add matrix version between envoy and cilium [ upstream commit 11e1bcc40fd4ba884191f594b8c6fd7975b1caaf ] This adds a short doc with the version matrix between Cilium and Cilium envoy versions, which is useful for the upcoming work to move the envoy proxy out of the Cilium agent container. Co-authored-by: ZSC <zacharysarah@users.noreply.github.com> Signed-off-by: Tam Mach <sayboras@yahoo.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
7f90b0d helm: add clustermesh nodeport config warning about #24692 [ upstream commit 9e83a6f79940c86a95fff33de89d7ded225da25c ] Cilium is currently affected by a known bug (#24692) when NodePorts are handled by the KPR implementation, which occurs when the same NodePort is used both in the local and the remote cluster. This causes all traffic targeting that NodePort to be redirected to a local backend, regardless of whether the destination node belongs to the local or the remote cluster. This affects also the clustermesh-apiserver NodePort service, which is configured by default with a fixed port. Hence, let's add a warning message to the corresponding values file setting. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Jussi Maki <jussi@isovalent.com> Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 11 May 2023, 09:57:54 UTC
931ebc1 envoy: Upgrade to v1.23.9 This commit is to upgrade envoy to v1.23.9 for security fixes, please find below the details: Build: https://github.com/cilium/proxy/actions/runs/4827955904/jobs/8601231172 Upstream Docs: https://www.envoyproxy.io/docs/envoy/v1.23.9/ Release notes: https://www.envoyproxy.io/docs/envoy/v1.23.9/version_history/v1.23/v1.23.9 Signed-off-by: Tam Mach <tam.mach@cilium.io> 29 April 2023, 12:37:42 UTC
e026bf6 ci: remove `STATUS` commands from upstream tests' Jenkinsfile [ upstream commit 46de5bca9fc77cb2c02ba2873a30649dbc1b78b6 ] These are remnants of a past before GHPRB. At best they create unnecessary noise in the logs, at worst they can interfere with the default behaviour, so let's just remove them. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 28 April 2023, 14:51:34 UTC
bbddee0 bgp: do not advertise ipv6 prefixes via metallb [ upstream commit 922aa1bcabd1c75c568bf56a2ca2a5eb7e624ba9 ] When using metallb as the BGP speaker, an IPv6 advertisement makes metallb return an error because IPv6 is unsupported. This error is logged and returned to the control loop, which keeps retrying, causing log flooding and high CPU usage. This change filters out IPv6 prefixes before sending them to the metallb library and logs a one-time error message. Signed-off-by: harsimran pabla <hpabla@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 28 April 2023, 14:51:34 UTC
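The filtering approach in bbddee0 (drop IPv6 prefixes before handing them to the speaker and log the problem only once) can be sketched in Go as below. This is a minimal illustration with hypothetical names (`filterIPv4`, `warnIPv6Once`, `mustParse`); it does not call the metallb library, and the log wording is invented.

```go
package main

import (
	"fmt"
	"log"
	"net"
	"sync"
)

// warnIPv6Once ensures the unsupported-IPv6 message is logged a single
// time instead of on every pass of the control loop.
var warnIPv6Once sync.Once

// filterIPv4 returns only the IPv4 prefixes from the input.
func filterIPv4(prefixes []*net.IPNet) []*net.IPNet {
	out := make([]*net.IPNet, 0, len(prefixes))
	for _, p := range prefixes {
		if p.IP.To4() == nil { // not representable as IPv4
			warnIPv6Once.Do(func() {
				log.Println("IPv6 advertisements are not supported by this speaker; skipping IPv6 prefixes")
			})
			continue
		}
		out = append(out, p)
	}
	return out
}

// mustParse is a helper for the example; it panics on malformed CIDRs.
func mustParse(cidrs ...string) []*net.IPNet {
	nets := make([]*net.IPNet, 0, len(cidrs))
	for _, c := range cidrs {
		_, n, err := net.ParseCIDR(c)
		if err != nil {
			panic(err)
		}
		nets = append(nets, n)
	}
	return nets
}

func main() {
	prefixes := mustParse("10.0.0.0/24", "fd00::/64", "192.0.2.0/24")
	for i := 0; i < 3; i++ { // simulate repeated reconciliation passes
		fmt.Println(filterIPv4(prefixes))
	}
}
```

Filtering before the call, rather than handling the error after it, is what stops the retry loop: the control loop never sees a failure for input it cannot fix.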
be630a3 contrib/backporting: Fix main branch reference The "master" branch was renamed to "main" recently. This commit is to adjust branch reference for cherry-pick script. Signed-off-by: Tam Mach <tam.mach@cilium.io> 26 April 2023, 14:57:01 UTC
bac3efe wireguard: fix issue caused by nodes with the same name in clustermesh [ upstream commit 7398de68ca940a839915107e71c00b9661bf423b ] Currently, the wireguard subsystem in the cilium agent caches information about the known peers by node name only. This can lead to conflicts in case of clustermesh, if nodes in different clusters have the same name, causing in turn connectivity issues. Hence, let's switch to identify peers by full name (i.e., cluster-name/node-name) to ensure uniqueness. This modification does not introduce issues during upgrades, since the node ID is not propagated to the datapath. Fixes: #24227 Reported-by: @oulinbao <oulinbao@163.com> Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
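The keying change in bac3efe (identify peers by `cluster-name/node-name` instead of node name alone) can be illustrated with a short Go sketch. The `peerCache` type, `fullName` helper, and the placeholder key strings are hypothetical; only the `cluster-name/node-name` format comes from the commit message.

```go
package main

import "fmt"

// fullName builds the cluster-qualified peer name.
func fullName(cluster, node string) string {
	return cluster + "/" + node
}

// peerCache is a hypothetical cache of peer public keys.
type peerCache struct {
	byName map[string]string
}

// upsertByNodeName shows the old, conflict-prone keying.
func (c *peerCache) upsertByNodeName(node, pubKey string) {
	c.byName[node] = pubKey
}

// upsertByFullName shows the cluster-qualified keying.
func (c *peerCache) upsertByFullName(cluster, node, pubKey string) {
	c.byName[fullName(cluster, node)] = pubKey
}

func main() {
	// Two clusters in a clustermesh, each with a node called "worker-1".
	old := &peerCache{byName: map[string]string{}}
	old.upsertByNodeName("worker-1", "key-from-cluster-a")
	old.upsertByNodeName("worker-1", "key-from-cluster-b") // overwrites the first peer
	fmt.Println(len(old.byName))

	fixed := &peerCache{byName: map[string]string{}}
	fixed.upsertByFullName("cluster-a", "worker-1", "key-from-cluster-a")
	fixed.upsertByFullName("cluster-b", "worker-1", "key-from-cluster-b")
	fmt.Println(len(fixed.byName))
}
```

With name-only keys the second peer silently replaces the first, which matches the connectivity issues the commit describes; with qualified keys both peers coexist.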
0523a23 daemon: Mark CES feature as beta in agent flag [ upstream commit a6d0142ec8f093bfb41a042e8d2cc38eff9e1cf3 ] This commit marks the CiliumEndpointSlice feature as beta (as per the documentation) in the agent flag description. This is necessary because users don't always read the full documentation before turning agent flags on. While at it, change the flag description to match the wording of other flags. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
a5eb447 pkg/kvstore: Fix for deadlock in etcd status checker [ upstream commit 9bb669b5705f3b283f3fac8f79b760987066d06d ] Etcd quorum checks are falsely reported as failing even though connection to etcd is intact. This can cause health checks to fail in both the agent and the operator. This happens due to a deadlock in pkg/kvstore/etcd after a prolonged downtime of etcd. Status check errors are being sent into a channel for the purpose of recreating kvstore connections in clustermesh. However when clustermesh is not used, messages from this channel are never read. The channel uses a buffer of size 128. After etcd has been down long enough to generate 128 errors, we enter a deadlock state. Agent / operator will continue to report etcd quorum failures and in turn health check failures until they're restarted. statusChecker() -> isConnectedAndHasQuorum() -> waitForInitLock() -> goroutine -> for -> ( initLockSucceeded <- err ) -> chan initLockSucceeded returned -> Block on receiving messages from initLockSucceeded channel -> e.statusCheckErrors <- e.latestErrorStatus [Blocked after 128 entries] Blocked goroutines captured from cilium 1.10 operator: goroutine 3309 [chan send, 13456 minutes]: github.com/cilium/cilium/pkg/kvstore.(*etcdClient).statusChecker(0xc00017db30) /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:1171 +0x75a created by github.com/cilium/cilium/pkg/kvstore.connectEtcdClient /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:801 +0x679 goroutine 7838665 [chan send, 13505 minutes]: g.com/c/cilium/pkg/kvstore.(*etcdClient).waitForInitLock.func1(-,-,-,-) /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:433 +0x449 created by github.com/cilium/cilium/pkg/kvstore.(*etcdClient).waitForInitLock /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:425 +0x7f Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
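The failure mode in a5eb447 is a plain send into a buffered channel that nobody reads: once the 128-slot buffer is full, the sender blocks forever. The commit message does not say how the patch resolves this, so the Go sketch below shows only one common way to avoid that class of blocking, a non-blocking send with `select`/`default`. The `reportStatusError` helper is hypothetical; only the channel name and buffer size echo the commit message.

```go
package main

import (
	"errors"
	"fmt"
)

// reportStatusError tries to queue err without blocking. It returns
// false when the buffer is full and the error was dropped, instead of
// stalling the caller the way a plain `ch <- err` would.
func reportStatusError(ch chan<- error, err error) bool {
	select {
	case ch <- err:
		return true
	default:
		return false
	}
}

func main() {
	// Same buffer size as in the commit message; no goroutine reads from
	// the channel, as when clustermesh is not in use.
	statusCheckErrors := make(chan error, 128)
	errDown := errors.New("etcd quorum check failed")

	queued, dropped := 0, 0
	for i := 0; i < 200; i++ { // simulate a prolonged etcd outage
		if reportStatusError(statusCheckErrors, errDown) {
			queued++
		} else {
			dropped++
		}
	}
	fmt.Println(queued, dropped) // 128 72
}
```

Replacing the helper body with a bare `ch <- err` makes the loop hang on the 129th iteration, which corresponds to the `chan send` goroutines shown in the captured stack traces. Dropping errors is acceptable only when the consumer needs a signal rather than every event.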
c884102 .travis: Quieten docker build output [ upstream commit 7f9e0f9b7b9fe26d707217e8248b86dde21dae4d ] The travis logs are frequently polluted with >10K lines of docker pull and build output. While this helps to track the ongoing progress of docker builds that take a long time, it's mostly useless output that developers must scroll past in order to see the useful output. Quieten that output in Travis to just the trigger of building the image plus the final summary that docker outputs. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
d7eef48 .travis: Make output less verbose [ upstream commit 9f7e24fd7e5c355e943528ac3074a2485a14b2ad ] Pass the verbosity parameters --quiet V=0 to quieten Travis output. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
3309e28 Makefile: Fix dirname errors with empty PRIV_TEST_PKGS [ upstream commit 29fe753e475fece540aac3fafd288147515bab08 ] When TESTPKGS only contains unprivileged tests, the PRIV_TEST_PKGS_EVAL evaluation previously filtered down to an empty list of packages that should be tested, and would pass this empty list to dirname, which then reports: dirname: missing operand Try 'dirname --help' for more information. This could happen multiple times during evaluation of the Makefile, and littered the output with no meaning. This could occur even if the privileged tests are not the target being run. Fix this by always adding "." to the list, which evaluates to the root directory of the repository. This causes dirname to succeed. Then, we can filter this root directory back out since there are no privileged tests at this level of the repository. This finally quietens the error. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
44ebf04 test/bpf: Fix compilation with V=0 [ upstream commit 82d5adc7951de3d14bb2406112314d93d3d606bb ] When the quiet mode was enabled, the $(CLANG) var would previously have a '@' at the start, which caused errors while attempting to make in this directory because it would be run in the context of a shell rather than directly as a make instruction. Move the $(QUIET) to the start of individual make instructions to resolve this compilation failure. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 April 2023, 10:54:26 UTC
e257f08 contrib/backporting: Fix main branch reference The "master" branch was renamed to "main" recently. Fix this for the older branch here. Signed-off-by: Joe Stringer <joe@cilium.io> 25 April 2023, 13:45:50 UTC
9012f69 docs: Document upgrade impact for IPsec The IPsec upgrade issue mentioned in ede154e27b ("Add IPSec remark for upgrade to v1.11.15") is fixed in v1.11.16. Nonetheless, a small impact remains, with a few packet drops happening during the upgrade. This commit documents that impact. Signed-off-by: Paul Chaignon <paul@cilium.io> 19 April 2023, 17:04:48 UTC
4da9949 jenkins: bump timeout to 210 minutes The net-next test has been timing out at the 2h50m mark. We need to increase its timeout in order to avoid such failures. Signed-off-by: André Martins <andre@cilium.io> 18 April 2023, 18:43:21 UTC
13d1197 install: Update image digests for v1.11.16 Generated from https://github.com/cilium/cilium/actions/runs/4731386331. ## Docker Manifests ### cilium `docker.io/cilium/cilium:v1.11.16@sha256:d2f2632c997a027ee4e540432edb4d8594e78e33315427e7ec3c06b473ec1e4e` `quay.io/cilium/cilium:v1.11.16@sha256:d2f2632c997a027ee4e540432edb4d8594e78e33315427e7ec3c06b473ec1e4e` ### clustermesh-apiserver `docker.io/cilium/clustermesh-apiserver:v1.11.16@sha256:67a051ef38ae113bcf7dc27ebb23a1137ece961ce86f087226ff5a0046099106` `quay.io/cilium/clustermesh-apiserver:v1.11.16@sha256:67a051ef38ae113bcf7dc27ebb23a1137ece961ce86f087226ff5a0046099106` ### docker-plugin `docker.io/cilium/docker-plugin:v1.11.16@sha256:1ee1bae0c2299d94ff162fc2847f9827823ff3d8e055e07da06e4ca28efe9391` `quay.io/cilium/docker-plugin:v1.11.16@sha256:1ee1bae0c2299d94ff162fc2847f9827823ff3d8e055e07da06e4ca28efe9391` ### hubble-relay `docker.io/cilium/hubble-relay:v1.11.16@sha256:c4c12759ba628e64a0f3fada99d2632627e5391ae0b49c3f35da51c3ba9eac9f` `quay.io/cilium/hubble-relay:v1.11.16@sha256:c4c12759ba628e64a0f3fada99d2632627e5391ae0b49c3f35da51c3ba9eac9f` ### operator-alibabacloud `docker.io/cilium/operator-alibabacloud:v1.11.16@sha256:d60aedfabf0957da1d975ee54779172f990366e9fb8bf55184ac31a0d77adc65` `quay.io/cilium/operator-alibabacloud:v1.11.16@sha256:d60aedfabf0957da1d975ee54779172f990366e9fb8bf55184ac31a0d77adc65` ### operator-aws `docker.io/cilium/operator-aws:v1.11.16@sha256:526dab3bee6231f71da44d14f25c17dfb53afba876bfc99374a11c0fb4278e36` `quay.io/cilium/operator-aws:v1.11.16@sha256:526dab3bee6231f71da44d14f25c17dfb53afba876bfc99374a11c0fb4278e36` ### operator-azure `docker.io/cilium/operator-azure:v1.11.16@sha256:0c2da6adf29f521f6d2ffe92794ad598fc99231eba2814b80cf608362cc14a3c` `quay.io/cilium/operator-azure:v1.11.16@sha256:0c2da6adf29f521f6d2ffe92794ad598fc99231eba2814b80cf608362cc14a3c` ### operator-generic 
`docker.io/cilium/operator-generic:v1.11.16@sha256:ea3fbe5ab65efc41228d716a64804b6fca9e2299835c3d39ae1cb248c1594c55` `quay.io/cilium/operator-generic:v1.11.16@sha256:ea3fbe5ab65efc41228d716a64804b6fca9e2299835c3d39ae1cb248c1594c55` ### operator `docker.io/cilium/operator:v1.11.16@sha256:44fb99adbba82605702aa9c41380c1c79ad5565bbd3c9d961f9aab55387be586` `quay.io/cilium/operator:v1.11.16@sha256:44fb99adbba82605702aa9c41380c1c79ad5565bbd3c9d961f9aab55387be586` Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> 18 April 2023, 13:49:46 UTC
9ce6d64 update CHANGELOG update changelog with the PRs that got merged after the last CHANGELOG but before tagging Signed-off-by: André Martins <andre@cilium.io> 17 April 2023, 22:34:46 UTC
54967c6 Avoid clearing objects in conversion funcs [ upstream commit 0d7406af7bf05e1c0c49c098a3b4e025d927c3e7 ] This removes the behavior of mutating the objects received from the client-go library. To begin with there isn't really any benefit from doing so, given we don't store the object afterwards, and it will be ready for gc when it leaves the scope inside client-go. client-go can possibly return the same pointer twice here, to trigger eg. both an object update delta and then a DeletedFinalStateUnknown delta with the same pointer. For more info, see the issue 115658 in the kubernetes/kubernetes repo on github. Follow-up of: 74307f175ceb ("Avoid clearing objects in conversion funcs") Signed-off-by: André Martins <andre@cilium.io> 17 April 2023, 22:28:49 UTC
e80ee2c Avoid clearing objects in conversion funcs [ upstream commit 74307f175cebc3e097eaa4eb4e930bf26ebb8cb1 ] This removes the behavior of mutating the objects received from the client-go library. To begin with there isn't really any benefit from doing so, given we don't store the object afterwards, and it will be ready for gc when it leaves the scope inside client-go. client-go can possibly return the same pointer twice here, to trigger eg. both an object update delta and then a DeletedFinalStateUnknown delta with the same pointer. For more info, see the issue 115658 in the kubernetes/kubernetes repo on github. Signed-off-by: Odin Ugedal <ougedal@palantir.com> Signed-off-by: Odin Ugedal <odin@uged.al> Signed-off-by: André Martins <andre@cilium.io> 17 April 2023, 22:28:49 UTC
fd81a6b envoy: Bump envoy to v1.23.8 https://github.com/cilium/proxy/actions/runs/4698873584/jobs/8331675690 Relates: https://github.com/cilium/proxy/pull/172 Signed-off-by: Tam Mach <tam.mach@cilium.io> 16 April 2023, 00:44:57 UTC
4190a2d envoy: Support more envoy image tag formats [upstream commit afcda947] This commit is to add the support for below image tags Different envoy image tag formats: ``` quay.io/cilium/cilium-envoy:f195a0a836629ceca5d7561f758c9505d9ebaebfa262647a2d4 quay.io/cilium/cilium-envoy:v1.23-f195a0a836629ceca5d7561f758c9505d9ebaebfa262647a2d4 ``` Testing was done as per below, kindly note the existing format should be working as usual. ```bash $ test=quay.io/cilium/cilium-envoy:014ceeb312a4d18dcf0ea219143f099fa91f2f28@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53 $ echo $test | sed -E -e 's/[^/]*\/[^:]*:(.*-)?([^:@]*).*/\2/p;d' 014ceeb312a4d18dcf0ea219143f099fa91f2f28 $ test=quay.io/cilium/cilium-envoy:v1.24-014ceeb312a4d18dcf0ea219143f099fa91f2f28@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53 $ echo $test | sed -E -e 's/[^/]*\/[^:]*:(.*-)?([^:@]*).*/\2/p;d' 014ceeb312a4d18dcf0ea219143f099fa91f2f28 ``` Fixes: #24749 Signed-off-by: Tam Mach <tam.mach@cilium.io> 16 April 2023, 00:44:57 UTC
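As a rough illustration of the tag parsing tested in 4190a2d, the sketch below ports the `sed` expression from the commit message to Python's `re` module. The function name `envoy_commit_sha` is hypothetical and not part of Cilium. Python's backtracking engine is not guaranteed to agree with POSIX `sed` on every input, so treat this as an approximation that covers the two tag formats quoted in the commit.

```python
import re

# Python port of the sed pattern quoted in the commit message:
#   s/[^/]*\/[^:]*:(.*-)?([^:@]*).*/\2/p;d
# Group 1 is the optional "<version>-" prefix, group 2 is the commit SHA.
_TAG_RE = re.compile(r"[^/]*/[^:]*:(.*-)?([^:@]*).*")


def envoy_commit_sha(image):
    """Return the commit SHA embedded in an image reference, or None.

    Hypothetical helper for illustration only.
    """
    m = _TAG_RE.match(image)
    return m.group(2) if m else None


if __name__ == "__main__":
    digest = "@sha256:1a3020822e8fb10b5f96bf45554690c411c2f48d8ca8fcf33da871dad1ce6b53"
    sha = "014ceeb312a4d18dcf0ea219143f099fa91f2f28"
    # The two formats tested in the commit message.
    print(envoy_commit_sha("quay.io/cilium/cilium-envoy:" + sha + digest))
    print(envoy_commit_sha("quay.io/cilium/cilium-envoy:v1.24-" + sha + digest))
```

Returning `None` on a non-matching input mirrors the `p;d` suffix of the `sed` command, which prints nothing when the substitution does not apply.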
cf28bba Prepare for release v1.11.16 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 14 April 2023, 11:41:35 UTC
36eaa38 update k8s dependencies to 1.23.17 The last client-go version contains an important bug fix. See https://github.com/kubernetes/kubernetes/pull/115901 for more info. Signed-off-by: André Martins <andre@cilium.io> 13 April 2023, 23:09:29 UTC
f490339 Remove HTTP header value from debug log Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com> 13 April 2023, 18:13:22 UTC
00630cc ipsec: Remove stale XFRM states and policies [ upstream commit 688dc9ac802b11f6c16a9cbc5d60baaf77bd6ed0 ] We recently changed our XFRM states and policies (IPs and marks). We however failed to remove the stale XFRM states and policies and it turns out that they conflict (e.g., the kernel ends up picking the stale policies for encryption instead of the new one). This commit therefore cleans up those stale XFRM states and policies. We can identify them based on mark values and masks (we switched from 0xFF00 to 0XFFFFFF00). The new XFRM states and policies are added as we receive the information on remote nodes. By removing the stale states and policies before the new ones are installed for all nodes, we could cause plain-text traffic on egress and packet drops on ingress. To ensure we never let plain-text traffic out, we will clean up the stale config only once the catch-all default-drop policy is installed. In that way, if there is a brief moment where, for a connection nodeA -> nodeB, we don't have a policy, traffic will be dropped instead of sent in plain-text. For each connection nodeA -> nodeB, those packet drops on egress and ingress of nodeA will happen between the time we replace the BPF datapath and the time we've installed the new XFRM state and policy corresponding to nodeB. Waiting longer to remove the stale states and policies doesn't impact the drops as they will keep happening until the new states and policies are installed. This is all happening on agent startup, as soon as we have the necessary information from k8s. Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
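The ordering constraint described in 00630cc can be sketched as follows. This is a simplified model, not Cilium's implementation: `XfrmEntry` and `stale_entries_to_remove` are hypothetical names, and the sketch identifies stale entries by mask alone, whereas the commit message says the real code uses both mark values and masks.

```python
from typing import NamedTuple

OLD_MARK_MASK = 0xFF00      # mask in use before the change (from the commit message)
NEW_MARK_MASK = 0xFFFFFF00  # mask in use after the change (from the commit message)


class XfrmEntry(NamedTuple):
    """Hypothetical stand-in for an XFRM state or policy."""
    name: str
    mark_mask: int


def stale_entries_to_remove(entries, default_drop_installed):
    """Return the entries that may be deleted right now.

    Nothing is removed until the catch-all default-drop policy exists, so a
    missing policy leads to dropped packets rather than plain-text traffic.
    """
    if not default_drop_installed:
        return []
    return [e for e in entries if e.mark_mask == OLD_MARK_MASK]
```

Gating the cleanup on the default-drop policy is the design point here: the window without a matching policy still exists, but traffic in that window is dropped instead of leaving the node unencrypted.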
f76ce36 ipsec: Catch-default default drop policy for encryption [ upstream commit 7d44f37509c6271f7196dcec8edc7c417c609dca ] This commit adds a catch-all XFRM policy for outgoing traffic that has the encryption bit. The goal here is to catch any traffic that may passthrough our encryption while we are replacing XFRM policies & states. Those operations cannot always be performed atomically so we may have brief moments where there is no XFRM policy to encrypt a subset of traffic. This policy ensures we drop such traffic and don't let it flow in plain text. We do need to match on the mark because there is also traffic flowing through XFRM that we don't want to encrypt (e.g., hostns traffic). Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
4e18c00 ipsec: Custom check for XFRM state existence [ upstream commit ddd491bd8e100f94ca275c87e81f7e2be042c8db ] UpsertIPsecEndpoint is currently unable to replace stale XFRM states. We use XfrmStateAdd, which fails with EEXIST if a state with the same key (IPs, SPI, and mark) already exists. We can't use XfrmStateUpdate because it fails with ESRCH if no state with the specified key exists. Note we don't have the same issue for XFRM policies because XfrmPolicyUpdate doesn't return ESRCH if no such policy already exists. No idea why the two APIs are not consistent. We therefore need to implement a proper 'update or insert' logic for XFRM states ourselves. To that end, we first check if the state we want to add already exists. If it doesn't, we attempt to add it. If it fails with EEXIST, we know that some other state is conflicting. In that case, we attempt to remove any conflicting XFRM states that are found and then attempt to add the new state again. To find conflicting XFRM states, we use the same logic as the kernel does (cf. __xfrm_state_lookup). Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
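The 'update or insert' flow described in 4e18c00 can be sketched against an in-memory table. Everything here is an assumption made for illustration: `FakeXfrmTable`, `XfrmState`, and `upsert_state` are hypothetical, the conflict rule is a simplified stand-in rather than the kernel's actual `__xfrm_state_lookup` logic, and `FileExistsError` plays the role of EEXIST.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class XfrmState:
    src: str
    dst: str
    spi: int
    mark_value: int
    mark_mask: int


class FakeXfrmTable:
    """In-memory stand-in for the kernel's XFRM state table."""

    def __init__(self):
        self.states = []

    def conflicts(self, new):
        # Simplified assumption: an existing state conflicts when it has the
        # same destination and SPI and its masked mark matches the new mark.
        return [s for s in self.states
                if s.dst == new.dst and s.spi == new.spi
                and (new.mark_value & s.mark_mask) == s.mark_value]

    def add(self, new):
        if self.conflicts(new):
            raise FileExistsError("EEXIST")
        self.states.append(new)

    def delete(self, state):
        self.states.remove(state)


def upsert_state(table, new):
    """Check for existence, try to add, and on EEXIST remove conflicts and retry."""
    if new in table.states:
        return "unchanged"
    try:
        table.add(new)
        return "added"
    except FileExistsError:
        for stale in table.conflicts(new):
            table.delete(stale)
        table.add(new)
        return "replaced"
```

For example, upserting a state with mask `0xFF00` and then one with the same destination and SPI but mask `0xFFFFFF00` replaces the first entry in this model; the mark values used for such a test are arbitrary.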
198bad2 ipsec: Refactor wildcard IP variables [ upstream commit e802c2985fb673526fb1d00b2713b03827e63354 ] These wildcard variables will be used by a later commit in the IPsec logic. Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
9f553bc loader: Don't compile .asm files by default [ upstream commit 92407a836c85ad046e9ec1d335654ae2bc0bf26b ] Today we always compile .asm files for endpoints, even though we rarely use them. They take a lot of space in the sysdumps and increase the overall compile time. This commit changes it to only compile those files if debugging mode is enabled. Reported-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
1068a3b pkg/service: Handle duplicate backends [ upstream commit 5311f81505d3e8127f311628bffaf205d41c9429 ] In certain error scenarios, backends can be leaked, where they were deleted from the userspace state, but left in the datapath backends map. To reconcile datapath and userspace, identify such backends that were created with different IDs but same L3n4Addr hash. This commit builds up on previous commits that don't bail out on such error conditions (e.g., backend IDs mismatch during restore), and tracks backends that are currently referenced in service entries restored from the lb4_services map to restore backend entries. Furthermore, it uses the tracked state to delete any duplicate backends that were previously leaked. Fixes: b79a4a53 (pkg/service: Gracefully terminate service backends) Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
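The reconciliation idea in 1068a3b (backends with different IDs but the same address, some of them no longer referenced by any service) can be sketched as below. The function name, the dictionary representation, and the string address keys are hypothetical. The rule of keeping the lowest ID when no duplicate is referenced is an assumption of this sketch, not something the commit message states.

```python
from collections import defaultdict


def leaked_duplicate_backends(backends, referenced_ids):
    """Return the sorted IDs of duplicate backends that can be deleted.

    backends:       dict of backend ID -> address key (stand-in for the
                    L3n4Addr hash mentioned in the commit message)
    referenced_ids: set of backend IDs referenced by restored service entries
    """
    ids_by_addr = defaultdict(list)
    for backend_id, addr in backends.items():
        ids_by_addr[addr].append(backend_id)

    leaked = []
    for ids in ids_by_addr.values():
        if len(ids) < 2:
            continue  # not a duplicate, leave it alone
        # Keep whatever the services still reference; if none is referenced,
        # keep the lowest ID (assumption made for this sketch).
        keep = [i for i in ids if i in referenced_ids] or [min(ids)]
        leaked.extend(i for i in ids if i not in keep)
    return sorted(leaked)
```

Tracking the referenced IDs first is what makes this safe, which is why the earlier commit in the series restores service entries before backend entries.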
6b36d9e pkg/service: Restore services prior to backends [ upstream commit ebe2b55e2a71ce2a991bd648387b54d6ff8d76bd ] The restore logic attempts to reconcile datapath state with the userspace post agent restart. Previously, it first restored backends from the `lb4_backends` map before restoring service entries from the `lb4_services` map. If there were error scenarios prior to agent restart (for example, backend map full because of leaked backends), the logic would fail to restore backends currently referenced in the services map (and as a result, selected for load-balancing traffic). This commit prioritizes restoring service entries followed by backend entries. Follow-up commit handles error cases such as leaked backends by keeping track of backends retrieved from restoration of service entries, and then using that to subsequently restore backends. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
c00f9a6 pkg/service: Don't bail out on failures [ upstream commit 89a1936bf6dda7fc6816e304529f57656b57e72c ] The restore code attempts to reconcile datapath state with the userspace state post agent restart. Bailing out early on failures prevents any remediation from happening, so log any errors. Follow-up commits will try to handle leaked backends in the cluster if any. Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
d8fe1a6 tests: add exceptions for lease errors due to etcd [ upstream commit e773f7e9b155d1e740af5a458a7e6deb60689ca6 ] Following up on #23334, add more exceptions for errors that seem to not be related to Cilium but rather to etcd. Fixes: #24701 Suggested-by: André Martins <andre@cilium.io> Signed-off-by: Gilberto Bertin <jibi@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
c024a65 docs: Fix upgradeCompatibility references [ upstream commit 2f9850ca3653f16e6c003409c7e2e9e6f246bb4b ] The upgradeCompatibility should always be set to the first version that the user installed in order to assume the Helm defaults that were in place during that release. Tracking each version here initially would provide confirmation for users in order to pick a valid version. Except that we forgot to keep it up to date with each release. Drop the examples to reduce user confusion. Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
252083f pkg/bandwidth: add error for bandwidth manager not being enabled [ upstream commit 4aa6911868e943623a082a450a7afd0e81889a6b ] If we cannot read "procfs", the user will not know the reason for it. We should log the error as well. Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 13 April 2023, 14:13:44 UTC
1e022e7 policy: Do not share same policy for multiple cached selectors [ upstream commit: af83b0efeccdc56b89e167c3bcee634554735da4 ] Do not share the same PerSelectorPolicy object between multiple cached selectors. This makes sure that when rules are merged only the rules for the intended selectors are affected. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 13 April 2023, 11:57:07 UTC
043dba0 chore(deps): update dependency cilium/hubble to v0.11.3 Signed-off-by: renovate[bot] <bot@renovateapp.com> 13 April 2023, 08:53:01 UTC
f35abb0 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 10 April 2023, 16:58:47 UTC
c3dc2f2 chore(deps): update docker.io/library/ubuntu:20.04 docker digest to 24a0df4 Signed-off-by: Renovate Bot <bot@renovateapp.com> 10 April 2023, 16:58:47 UTC
ab6af26 envoy: Bump envoy version to v1.23.7 The image hash is coming from below run https://github.com/cilium/proxy/actions/runs/4615766882/jobs/8159983325 Upstream release https://github.com/envoyproxy/envoy/releases/tag/v1.23.7 Signed-off-by: Tam Mach <tam.mach@cilium.io> 06 April 2023, 09:07:04 UTC
c6378c9 chore(deps): update docker.io/library/alpine docker tag to v3.16.5 Signed-off-by: Renovate Bot <bot@renovateapp.com> 05 April 2023, 09:27:43 UTC
79e029b test: fix race condition of deleting cnp in e2e test [ upstream commit 294bcd1d8f884ce9b7c54c1421032ff0ec2706c9 ] There is a flake in the e2e test when a test case proceeds before the cnp has taken effect in cilium-agent. The correct way to delete a cnp is to run "kubectl delete" followed by "cilium policy wait", and the kubectl helper already has such wrappers. Backporting conflicts: * minor conflict due to the renaming of test/k8sT to test/k8st in master Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io> 05 April 2023, 09:26:12 UTC
4f47be0 test: fix race condition of deleting ccnp in e2e test [ upstream commit 22a3743d2695fb2321ccf32abb67a778be28f092 ] There is a flake in the e2e test when a test case proceeds before the ccnp has taken effect in cilium-agent. The correct way to delete a ccnp is to run "kubectl delete" followed by "cilium policy wait", and the kubectl helper already has such wrappers. Fixes: #24380 Backporting conflicts: * minor conflict due to the renaming of test/k8sT to test/k8st in master Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io> 05 April 2023, 09:26:12 UTC
ee85241 docs: add upgrade notes for 1.11 regressions Since this behavior will be unexpected for users upgrading from 1.10 to 1.11 we should make it obvious in the upgrade notes about this regression. Signed-off-by: André Martins <andre@cilium.io> 05 April 2023, 09:23:30 UTC
8986d78 docs: Fix mitigation for IPsec upgrade issue The mitigation documented in commit ede154e27b ("Add IPSec remark for upgrade to v1.11.15") is actually incomplete. The XFRM policies also need to be flushed. Fixes: ede154e27b ("Add IPSec remark for upgrade to v1.11.15") Signed-off-by: Paul Chaignon <paul@cilium.io> 03 April 2023, 14:20:23 UTC
ede154e Add IPSec remark for upgrade to v1.11.15 Cilium upgrades to v1.11.15 can cause severe problems when IPSec is enabled. This adds a remark to the docs. Signed-off-by: darox <maderdario@gmail.com> 30 March 2023, 07:27:42 UTC
fe958b5 pkg: add missing xfrm-no-track rules from ipv6 [ upstream commit 788bf37bc84cffae93df8da19c34f5bbe3a20f57 ] By right there should be a rule to let ipsec skb bypass conntrack: -A CILIUM_PRE_raw -m mark --mark 0xd00/0xf00 -m comment --comment "cilium-xfrm-notrack:" -j CT --notrack However ipv6 missed it and this commit adds the rule back. Fixes: #23481 Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
7fb7aaf helm/hubble-ui: use v0.11.0 hubble-ui [ upstream commit e980ca07470707d7fcc8b9a54d7525dbdcc21330 ] Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
58c1f25 hubble-ui: allow ingress from non root `/` urls [ upstream commit e29c8ea8d3f85cabfb19dda2b1c71cadb8cd8e01 ] Support the case when ingress is configured to serve hubble-ui from non default `/` root url (ex. `/service-map`). Related hubble-ui pull request: https://github.com/cilium/hubble-ui/pull/432 Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
b1dd38d docs: note there are two Cilium CLIs [ upstream commit b2bc42a180fbe0c62c0dfbf543d0d2685ffa89b9 ] Signed-off-by: Liz Rice <liz@lizrice.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
af0e4ef renovate: Fix Hubble release digest regex [ upstream commit 30a783f58cbd4a1cb743d426fdab5d131c25dcb6 ] Renovate v35 had a breaking change in how it computes the digest of GitHub releases. Previously, it would use the digest of the release attachments, but now it's just using a git sha as the "digest". This commit changes the data source for the Hubble artifacts to use the data source which preserves the old behavior. Ref: https://github.com/renovatebot/renovate/pull/20178 Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
36b5917 docs: fix typo in operations/troubleshooting.rst [ upstream commit 986daf407ae9643ad72b68a81ff8ca1a7a2758bf ] contrack -> conntrack Fixes: 93ebeb3bae11 ("docs: Update the documentation for the conntrack-gc-interval flag") Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 29 March 2023, 11:58:00 UTC
d243cf6 chore(deps): update quay.io/cilium/hubble docker tag to v0.11.3 Signed-off-by: Renovate Bot <bot@renovateapp.com> 28 March 2023, 13:47:25 UTC
d6cd392 Fix for disabled cloud provider rate limiting [ upstream commit 0557a2f6e9407385679dd5831e4b5646b235a2fe ] Earlier versions of these flags had a prefix of ENI and were later renamed with a prefix of IPAM to be consistent across cloud providers. While removing the deprecated old flags, #12676 also removed lines needed for setting OperatorConfig struct fields from the values read in. This resulted in fields having golang defaults, which caused the rate limiter to be completely bypassed. Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
fb47c2e docs: Update the documentation for the conntrack-gc-interval flag [ upstream commit 93ebeb3bae11475e7a4dc581affcef79d4fb7eaa ] The current documentation is incorrect as the default value is 0 and not 5 minutes. 0 implies a dynamic interval value, which this commit now documents. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
387c11c checker: Fix incorrect checker for ExportedEqual() [ upstream commit 9ac5b5326950f06291b5bc3f65c58198a4aa8dc3 ] When using checker.ExportedEqual(), it was using the standard Equals checker under-the-hood, but this is incorrect. Fix it to use the correct checker. In the commit introducing the bug, there was no direct usage of checker.ExportedEqual(), but rather checker.ExportedEquals (note the plural). Subtle! Discovered while working on improving policy unit tests. Fixes: f4407e7c8f9 ("checker: Add ExportedEquals checker") Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
0e86345 Update clustermesh requirements to mention node InternalIP explicitly [ upstream commit 7dbc63a47fd63a4574ba1cb1a59e4f776ecff07f ] Discovered in community slack that k8s node ExternalIP isn't used by clustermesh even if configured. Let's be explicit about InternalIP in the documented reqs, until such time as use of node ExternalIP is supported. Signed-off-by: Jef Spaleta <jspaleta@gmail.com> 23 March 2023, 23:22:38 UTC
9a24073 Fix duplicated logs for test-output.log [ upstream commit 853ec101d8adee8d482040aa038e5b71ce7775e2 ] This patch avoids duplicated info in the test-output.log Fixes: #18515 Signed-off-by: Roman Ptitcyn <romanspb@yahoo.com> Signed-off-by: Nikolay Aleksandrov <nikolay@isovalent.com> 23 March 2023, 23:22:38 UTC
ba156b4 In service recovery, don't skip if one of the service recovery fails [upstream commit https://github.com/cilium/cilium/commit/018856602b7637b8ae1c796f4ed02fe1bbeb5905] Reduce the number of connections interrupted due to svc/backend ID changes if restoration fails. Fixed by logging the error and continuing instead of returning it. Also log the number of failures and successes in backend restoration using new variables. Signed-off-by: Gaurav Yadav <gaurav.dev.iiitm@gmail.com> Signed-off-by: Jared Ledvina <jared.ledvina@datadoghq.com> 22 March 2023, 21:07:24 UTC
b416e44 chore(deps): update docker.io/library/alpine:3.16.4 docker digest to 2cf17aa Signed-off-by: Renovate Bot <bot@renovateapp.com> 21 March 2023, 14:07:29 UTC
43c2a75 install: Update image digests for v1.11.15 Generated from https://github.com/cilium/cilium/actions/runs/4446402139. `docker.io/cilium/cilium:v1.11.15@sha256:434ea1ff40b8db76c2be6cabfa1bbd2b887eaabe42e757651ea14757468e3bf4` `quay.io/cilium/cilium:v1.11.15@sha256:434ea1ff40b8db76c2be6cabfa1bbd2b887eaabe42e757651ea14757468e3bf4` `docker.io/cilium/clustermesh-apiserver:v1.11.15@sha256:66071d67f0249909c81cc3f94ad1dd2ae51e1451c400183a9337c04b9c1e076f` `quay.io/cilium/clustermesh-apiserver:v1.11.15@sha256:66071d67f0249909c81cc3f94ad1dd2ae51e1451c400183a9337c04b9c1e076f` `docker.io/cilium/docker-plugin:v1.11.15@sha256:e2d10187f4e31a00fd751b6e5ac56bd3698ab6bd3c404cff06b7b2740d4327df` `quay.io/cilium/docker-plugin:v1.11.15@sha256:e2d10187f4e31a00fd751b6e5ac56bd3698ab6bd3c404cff06b7b2740d4327df` `docker.io/cilium/hubble-relay:v1.11.15@sha256:352a65dde7c324ace5d6442f626f82c19550dd581e17f8f7e7aba30325c96d9e` `quay.io/cilium/hubble-relay:v1.11.15@sha256:352a65dde7c324ace5d6442f626f82c19550dd581e17f8f7e7aba30325c96d9e` `docker.io/cilium/operator-alibabacloud:v1.11.15@sha256:712972b46f592bd80a8e4c66e9b5cdcc73705740bf2cea84a6df131107a01699` `quay.io/cilium/operator-alibabacloud:v1.11.15@sha256:712972b46f592bd80a8e4c66e9b5cdcc73705740bf2cea84a6df131107a01699` `docker.io/cilium/operator-aws:v1.11.15@sha256:3aa776003eee064a6896b6ad712f55293d4e045defbe14d3768d224ce254d5c3` `quay.io/cilium/operator-aws:v1.11.15@sha256:3aa776003eee064a6896b6ad712f55293d4e045defbe14d3768d224ce254d5c3` `docker.io/cilium/operator-azure:v1.11.15@sha256:81e5168c977806a7f310aa57cca74c908fe6ea323518804e15c48bc786b99271` `quay.io/cilium/operator-azure:v1.11.15@sha256:81e5168c977806a7f310aa57cca74c908fe6ea323518804e15c48bc786b99271` `docker.io/cilium/operator-generic:v1.11.15@sha256:1feed1b895b39c7bdcbfe6232536e26edba9beb41c160c66d539de4358275a2e` `quay.io/cilium/operator-generic:v1.11.15@sha256:1feed1b895b39c7bdcbfe6232536e26edba9beb41c160c66d539de4358275a2e` 
`docker.io/cilium/operator:v1.11.15@sha256:97e6df665e10a08b2fbb5aefb183564debe0a0a4108b371a2f4d95f38c56f56c` `quay.io/cilium/operator:v1.11.15@sha256:97e6df665e10a08b2fbb5aefb183564debe0a0a4108b371a2f4d95f38c56f56c` Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 17 March 2023, 11:31:42 UTC
c5577e8 Prepare for release v1.11.15 Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 16 March 2023, 00:35:10 UTC
99df085 cmd/ciliumendpoint: guard against nil indexers. [ upstream commit f5c202d70f815aa5873374f50debca476ab2fb04 ] Make cleanup more resilient to conditions where indexers are not set. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 15 March 2023, 18:13:02 UTC
c73d235 daemon: fix panic when running with etcd with endpoint crd disabled [ upstream commit ee7f12c46be74a58be7f7a878329758e6c03db7b ] When running etcd kvstore, if endpoint CRD is disabled then the stale CEP cleanup init procedure panics due to nil indexer references returned from k8s watchers. This is because the cep/ces k8s watchers aren't initialized if this option is set to true. Fixes: #24366 Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 15 March 2023, 18:13:02 UTC
b29f401 bpf: Ignore HOST_ID resolved from ipcache for IPv6 case [ upstream commit a74affd5c4851de3adface98a0b81bcca9ffde55 ] This PR adds the check to ignore HOST_ID resolved from ipcache for the IPv6 case. resolve_srcid_ipv6 should ignore the HOST_ID resolved from ipcache because the packets marked MARK_MAGIC_HOST are actually from the host. Signed-off-by: Yusuke Suzuki <yusuke-suzuki@cybozu.co.jp> Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 15 March 2023, 14:41:28 UTC
a5a3d89 Check stale cilium endpoint flag before cleaning [ upstream commit 449bb80ea6dde45e4b22453b03186f4879cc810a ] [ backporter's note: hive changes create conflict due to change in function names, had to modify to accommodate old K8sEnabled func ] The "--enable-stale-cilium-endpoint-cleanup" flag is never checked before running "cleanStaleCEPs". Signed-off-by: Steven Johnson <sjdot@protonmail.com> Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 14 March 2023, 16:36:01 UTC
9566881 agent: install CNI plugin binary in an InitContainer [ upstream commit e1a46216b4dfba98a13f5b32b1272c137b2cc923 ] This reduces the potential security surface of the agent by removing the bind-mount of /opt/cni/bin. Instead, write the binaries once in an initContainer. There is no currently known vulnerability exploiting this, but it's good practice to remove as many long-running host mounts as possible. This could be a potential further exploit vector if an agent were to be compromised. Signed-off-by: Casey Callendrello <cdc@isovalent.com> Signed-off-by: Feroz Salam <feroz.salam@isovalent.com> 13 March 2023, 18:18:04 UTC
b68ff83 images: update cilium-{runtime,builder} Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 13 March 2023, 17:34:01 UTC