Revision Author Date Message Commit Date
278ec19 Reconcile qdiscs accurately when using BW manager Current logic bails out without updating leaf qdiscs when first item in the qdisc list is of type mq. Other qdiscs could be pfifo_fast for example. Encountered this in my local testing. The second qdisc was never replaced with fq. qdisc mq 0: dev enX0 root qdisc pfifo_fast 0: dev enX0 parent :1 bands 3 priomap 1 2 2 2 1 2 0 0 1 1 1 1 1 1 1 1 Co-developed-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Hemanth Malla <hemanth.malla@datadoghq.com> 18 June 2024, 08:19:17 UTC
d08aa3d chore(deps): update cilium/scale-tests-action digest to 511e3d9 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 18 June 2024, 08:00:51 UTC
112a16d build-images: fetch artifacts with specific pattern It seems that docker/build-push-action started to store artifacts on GitHub. This affected the digests of the image build process, as it timed out while trying to download these artifacts. To fix this issue we will only download the artifacts with the pattern "*image-digest *" which are the only artifacts relevant for the image digests. Fixes: b86d5fc1aa64 ("chore(deps): update docker/build-push-action action to v6") Signed-off-by: André Martins <andre@cilium.io> 18 June 2024, 06:56:36 UTC
5179460 docs: add troubleshoot clustermesh command clarification note Let's explicitly mention that the output of the cilium-dbg troubleshoot command refers to the connections from the agents to the local clustermesh-apiserver when KVStoreMesh is enabled, as potentially confusing. In this case, it is expected that the output is the same for all the clusters the agents are connected to. Differently, the connectivity to the remote clusters can be troubleshooted using the dedicated kvstoremesh-dbg command. Suggested-by: Bruno M. Custódio <bruno@isovalent.com> Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 18 June 2024, 05:58:31 UTC
c46e6be cilium-dbg: improve troubleshoot clustermesh output for local cluster Users may additionally configure a clustermesh entry for the local cluster as well, to reuse the same configuration in all clusters, as Cilium then automatically ignores it. Let's improve the output of the cilium-dbg troubleshoot clustermesh (and kvstoremesh-dbg troubleshoot) commands in this situation, removing the usage of the term "remote", and displaying a note for the entry matching the local cluster name. The retrieval of the local cluster name is performed in a best effort fashion, and may not always work. Suggested-by: Bruno M. Custódio <bruno@isovalent.com> Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 18 June 2024, 05:58:31 UTC
ab57923 cilium-dbg: minor clarifications to the clustermesh status output Add the term remote to clarify that the number of clusters reported by the cilium-dbg and kvstoremesh-dbg status commands do not include the local one, regardless of whether it is included in the clustermesh configuration or not. Similarly, let's replace the term failures with reconnections, as failures has a negative connotation, but they are actually expected to happen when the clustermesh-apiserver in the given remote cluster is restarted. Suggested-by: Bruno M. Custódio <bruno@isovalent.com> Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 18 June 2024, 05:58:31 UTC
919e207 Handle nil service retrieved from Resource store Signed-off-by: Nick Young <nick@isovalent.com> 18 June 2024, 01:51:16 UTC
24061be Fix CiliumEnvoyConfig Nodeport handling Adds additional service redirect handling for Services with Nodeports set, which will automatically include the Nodeport in the set of redirected ports if ports to redirect are specified. Also removes a hack in the Dedicated Ingress code that was introduced to solve this problem previously. Signed-off-by: Nick Young <nick@isovalent.com> 18 June 2024, 01:51:16 UTC
1defd4c ipam: remove ipam from cilium-dbg debuginfo Currently, IPAM gets registered as debuginfo statusobject before it gets initialized. This is visible when displaying the debuginfo from within an agent pod. ``` cilium-dbg statusinfo ... <nil> ... ``` After moving the initialization of IPAM into its own cell, IPAM is fully initialized at that point in time. The problem is that with the added dependencies the output seems to be far too big and results in memory issues (it also results in failing tests due to OOM on GitHub). Therefore, this commit removes the registration of IPAM to the debuginfo (and the related implementation of the interface in IPAM). This shouldn't be an issue as it seems that this was no longer part of the debuginfo output for quite some time. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 17 June 2024, 16:35:54 UTC
4a36fe9 ipam: set metadata manager during initialization Currently, there's an extra method `setMetadata` to set the optional metadata manager for IPAM. With the introduction of the cell, it's possible to treat this as internal and set it during the initialization of the IPAM struct. This way we can get rid of the exported method. In addition, the metadata manager cell defaults to a new defaultIPPoolManager in case IPAM Multi Pool is disabled. This way, a metadata manager implementation is always provided and prevents from the need for additional (and potential enhanced) nil checks. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 17 June 2024, 16:35:54 UTC
2302e9c ipam: treat IPAM metadata manager as IPAM internal With the introduction of the IPAM cell, the IPAM metadata manager can be treated and registered as internal cell. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 17 June 2024, 16:35:54 UTC
9dd9790 ipam: move rest api implementation into cell Currently, the IPAM REST API is implemented by the daemon. With the extraction of the IPAM initialization into its own cell it's also possible to extract the IPAM REST API handler into the same cell. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 17 June 2024, 16:35:54 UTC
86a5c88 ipam: provide ipam via cell Currently, IPAM is initialized and configured during the agent/daemon initialization. This commit moves the initialization of the IPAM struct (with all its dependencies) into its own cell. The provider initialization is still triggered from the agent initialization code. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 17 June 2024, 16:35:54 UTC
2e2f553 chore(deps): update dependency renovatebot/renovate to v37.410.1 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 16:35:17 UTC
c22eceb Revert "Prepare for release v1.16.0-rc.0" This reverts commit e462433c4ee3685fd8c3c5867128889acf01d886. Signed-off-by: André Martins <andre@cilium.io> 17 June 2024, 14:55:51 UTC
14c2b3d Prepare for release v1.16.0-rc.0 Signed-off-by: André Martins <andre@cilium.io> 17 June 2024, 14:55:51 UTC
447c92d update AUTHORS and Documentation Signed-off-by: André Martins <andre@cilium.io> 17 June 2024, 14:55:51 UTC
2fc54dd IPAM: Adds IPv6 Prefix Delegation Config Option - `operator/option/config.go`: Adds an option for enabling AWS IPv6 prefix delegation (PD). - `*_test.go`: Updates IPAM implementation unit tests to call `NewNodeManager()` with IPv6 PD config option. - `pkg/ipam/node.go`: Adds `ipv6Alloc` field to `Node` type to represent IPv6-specific allocation node attributes. - `pkg/ipam/node_manager.go`: Adds IPv6 PD field to the `NodeManager` type and associated `NewNodeManager()`. Supports: #30684 Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io> 17 June 2024, 14:16:33 UTC
51ff384 Documentation for decoupling k8s-heartbeat-timeout Signed-off-by: Dorde Lapcevic <dordel@google.com> 17 June 2024, 14:06:02 UTC
95536f1 Decouple k8s client heartbeat from connection timeout and keep alive The main reason to introduce this is to be able to increase the heartbeat interval without affecting the k8s client connection settings. It also makes it possible to disable the heartbeat, by setting `k8s-heartbeat-timeout` to 0, without disabling the k8s client itself. ```release-note When upgrading, users can experience a change to their configuration if they were overriding the `k8s-heartbeat-timeout` flag. k8s client timeout and keep alive are no longer getting values from the `k8s-heartbeat-timeout` flag, but rather would have default values (30 seconds). ``` Signed-off-by: Dorde Lapcevic <dordel@google.com> 17 June 2024, 14:06:02 UTC
c228c5e chore(deps): update dependency renovatebot/renovate to v37.409.2 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 13:39:07 UTC
d6a9529 gha: Grant write status permission This is to fix the below issue when the step is trying to update status. > Error: Resource not accessible by integration Sample run with failure https://github.com/cilium/cilium/actions/runs/9534715391/job/26279677470?pr=33092 Signed-off-by: Tam Mach <tam.mach@cilium.io> 17 June 2024, 12:35:44 UTC
8ce020c pkg/identitybackend: Make sanitizeK8sLabels method public The method will be used by operator managing CIDs. Related #27752 Signed-off-by: Ovidiu Tirla <otirla@google.com> 17 June 2024, 12:02:04 UTC
a2388a6 gha: Add more flags for Ingress Conformance test This is to add enable-http-debug flag to capture more information in case of failure. Additionally, the stop-on-failure flag is added for faster feedback. Signed-off-by: Tam Mach <tam.mach@cilium.io> 17 June 2024, 11:20:51 UTC
7e7e77e fix(deps): Bump ingress-controller-conformance This is to mainly pick up the below changes https://github.com/cilium/ingress-controller-conformance/pull/2 Signed-off-by: Tam Mach <tam.mach@cilium.io> 17 June 2024, 11:20:51 UTC
b7237b1 clustermesh: add namespace index to service resource Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> 17 June 2024, 10:48:51 UTC
14f5764 clustermesh: fix reconciliation when a remote service gets deleted This commit fixes the case when a remote service gets deleted while the local service remains. Previously the Reconcile on the controller wasn't called because we weren't returning the deleted Service in the service informer List method, which may leave some EndpointSlices in the cluster that should be deleted. In a usual Kubernetes environment this is not an issue since the OwnerReference should be deleting the EndpointSlices as well. In our case the actual Service still exists because we have this mechanism of creating a "virtual" service by adding the cluster name as a suffix to trigger a reconcile on each remote cluster. To fix that we are now returning the combination of all the local Services and the remote clusters instead of returning all the remote services. This allows triggering a reconcile on all the possible services, including some that don't exist, which makes the Get method of the Service informer return a not found error, which in turn triggers a deletion via our cleanup hook. Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> 17 June 2024, 10:48:51 UTC
6f9b8fd clustermesh: use EventuallyWithT in endpointslicemeshsync Use EventuallyWithT instead of waiting for the controller queue to be empty. The queue being empty does not signify that the reconciliation is done as the controller pops elements when the reconciliation starts and not when it ends. Suggested-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> 17 June 2024, 10:48:51 UTC
b86d5fc chore(deps): update docker/build-push-action action to v6 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 10:46:11 UTC
7511f74 scripts: Add linter for logrus usage To catch reintroduction of logrus after a package has been converted over to slog, add a whitelist-based linter for finding imports of logrus from already converted packages. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
981c761 ipsec: Switch to slog Refactor the ipsec code to use slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
53fc716 sysctl: Switch to slog Refactor to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
698026d linux: Switch linuxNodeHandler to use slog Refactor the linuxNodeHandler to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
59fb337 devices: Switch to slog Refactor the DeviceManager to use slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
5c57430 config: Switch to slog Refactor to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
949f983 bigtcp: Switch to slog Refactor to use the slog logging Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
1b57a7a bandwidth: Switch to slog Refactor to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
63e90a8 fix(deps): update all go dependencies main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 10:31:06 UTC
7de5df0 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 17 June 2024, 09:58:04 UTC
fd4134b chore(deps): update docker.io/library/golang:1.22.4 docker digest to c2010b9 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 09:58:04 UTC
304b7fd operator: ignore identity delete conflicts Ignore CiliumIdentity delete conflicts during the gc run (by skipping deletion and emitting a warning), allowing gc to continue if a subset of identities are conflicted. Prior to this change conflicts would cause gc to error, which could lead to an unexpected accumulation of stale CiliumIdentity objects. Signed-off-by: Jacob Henner <henner@arcesium.com> 17 June 2024, 09:39:38 UTC
6a63598 doc: Update doc for CRD CiliumNodeConfig from v2alpha1 to v2 Signed-off-by: Donia Chaiehloudj <donia.cld@isovalent.com> 17 June 2024, 09:18:28 UTC
77ea8e9 hubble/cli: add --node-label Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
0de97e7 hubble: add node label filter Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
9b6ee33 hubble: wire the localNodeWatcher in the observer setup Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
6db7004 hubble/observer: add a local node watcher This commit introduces the observer LocalNodeWatcher, which caches the local node information to be filled in Hubble flows. Because the labels representation differs between the internal node.LocalNode struct and Hubble flows (a map and a key=val slice, respectively), we need to maintain a cache in order to avoid re-building the labels slice for each flow. The LocalNodeWatcher aims to solve this and can be hooked to the observer's OnGetFlows. Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
d6fb43b hubble/api: add node_labels to Hubble flows This is the first commit of a set introducing local node labels to Hubble flows. Filtering flows emitted from nodes having particular labels can be useful to debug the Egress Gateway feature: combined with the recently added network interface filter and/or SNAT IP filter one could then see egress flows related to a given CiliumEgressGatewayPolicy. Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
09ac42f chore(deps): update all lvh-images main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 09:07:11 UTC
badf925 Add appArmorProfile to the securityContext as well Signed-off-by: Aurelien Benoist <aurelien.larcin@gmail.com> 17 June 2024, 08:35:00 UTC
3790121 add securityContext for cronjob & disable hostNetwork Signed-off-by: Aurelien Benoist <aurelien.larcin@gmail.com> 17 June 2024, 08:35:00 UTC
ba713d0 fix(deps): update module github.com/aws/aws-sdk-go-v2/service/ec2 to v1.164.1 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 08:26:04 UTC
f5129a2 ui: v0.13.1 release Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> 17 June 2024, 08:25:55 UTC
26df509 vendor: pin StateDB to version v0.1.0 Time to introduce versioning to StateDB as there's some API cleanups coming and we want to control when renovate tries to bump StateDB. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 08:14:33 UTC
fe05a39 make: explicitly specify default target to build Cilium's go binaries The blamed commit introduced an include statement for Makefile.override in Makefiles specific to building Cilium's go binaries, to enable optionally overriding variables. However, this changed the behavior of executing `make -C folder` (i.e., without specifying an explicit target, as we do in the clustermesh-apiserver Dockerfile for instance) in case Makefile.override contains any target. Indeed, make executes by default the first target that it sees. Let's address this discrepancy and avoid unexpected changes by explicitly configuring the default target to be executed to `all`. Fixes: 811cb7f0273e ("make: Add include to Makefile.override within binary-specific makefiles") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 17 June 2024, 08:28:59 UTC
e25047a make: drop leftover test statement in cilium-dbg Makefile Fixes: 811cb7f0273e ("make: Add include to Makefile.override within binary-specific makefiles") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 17 June 2024, 08:28:59 UTC
0b29687 operator/dockerfile: correctly propagate modifiers The blamed commit replaced several docker build arguments with a single, generic one named MODIFIERS. However, it didn't update the operator Dockerfile to correctly use it. Let's fix it. Fixes: c4aebae89528 ("docker, ci: Create generalized MODIFIERS build arg") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 17 June 2024, 08:28:59 UTC
caadba1 helm: loadBalancerClass for Cluster Mesh APIserver * Added `loadBalancerClass` Helm value for the Cluster Mesh APIserver Kubernetes Service. * Refactored the existing `loadBalancerIP` Helm value so it's clearer that it exists (instead of having it commented out in the values.yaml file). Signed-off-by: Philip Schmid <phisch@cisco.com> 17 June 2024, 07:47:23 UTC
df6609f test: Add check in tunnelMapInit to only set tunnel map if nil The function tunnel.TunnelMap calls a sync.Once to set the variable tunnel.tunnelMap before returning it. This conflicts with the behavior of the tunnel.SetTunnelMap function however, for if tunnel.SetTunnelMap is called to manually set the tunnel map in a test, and the test then calls tunnel.TunnelMap to grab a reference to the map, tunnel.TunnelMap will overwrite the map created by tunnel.SetTunnelMap. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 17 June 2024, 07:39:56 UTC
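The interaction described in df6609f can be reproduced with a small sketch. The `Map` type and the package-level variables below are simplified, hypothetical stand-ins for the tunnel package; the nil check inside the `sync.Once` body is the guard the commit title refers to.

```go
package main

import (
	"fmt"
	"sync"
)

// Map is a hypothetical stand-in for the tunnel map type.
type Map struct{ name string }

var (
	tunnelMap     *Map
	tunnelMapOnce sync.Once
)

// SetTunnelMap lets a test install its own map.
func SetTunnelMap(m *Map) { tunnelMap = m }

// TunnelMap lazily creates the default map, but only if nothing has been
// installed yet. Without the nil check, the first call would overwrite a
// map previously installed through SetTunnelMap.
func TunnelMap() *Map {
	tunnelMapOnce.Do(func() {
		if tunnelMap == nil {
			tunnelMap = &Map{name: "default"}
		}
	})
	return tunnelMap
}

func main() {
	SetTunnelMap(&Map{name: "test"})
	fmt.Println(TunnelMap().name)
}
```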
1395179 test: Attempt to unpin existing tunnel map in SetTunnelMap The function tunnel.SetTunnelMap is used in go tests to manually set the tunnel map, however this function relies on the caller to clean up the map when it is no longer needed. This commit modifies this function to try and unpin any existing tunnel map before setting a new one, in order to address cases where a leftover map may still be pinned. See https://github.com/cilium/cilium/actions/runs/8193113970/job/22406307012 for an example flake that may be related to this issue. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 17 June 2024, 07:39:56 UTC
0c54e45 test: Check error from unpinning tunnel map in TestClusterAwareAddressing This commit modifies the test TestClusterAwareAddressing in pkg/maps/tunnel/tunnel_test.go to check the error returned by the tunnel map when it is unpinned. Before, this error was ignored. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 17 June 2024, 07:39:56 UTC
c0a8f30 chore(deps): update dependency renovatebot/renovate to v37.409.1 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 07:30:56 UTC
c2b8f48 datapath: Add support for skipping direct routes on different L2 networks Previously, when a cluster ran with native-networking and had multiple zones, it wasn't possible to enable auto direct routes. This caused a bottleneck for same-zone traffic as it always had to be routed through the gw. With this new flag, any direct routes for nodes on different L2 networks will be skipped. Cilium will add routes for nodes on the same L2 and not exit. Fixes: #31124 Signed-off-by: Jonny <jonny@linkpool.io> 17 June 2024, 07:30:25 UTC
72ffe0f clustermesh: extract operator logic into a specific package This commit extracts all the clustermesh specific code into a package called operator. The aim of this is to provide clustermesh specific features to other package launched in the operator. The first user for this besides endpointslicesync would be the mcsapi package. Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> 17 June 2024, 07:09:53 UTC
de9a02a .github/workflows: pin renovate version To avoid unwanted breakages from renovate, we should also pin renovate version and let it be updated on a weekly basis like the other dependencies. Signed-off-by: André Martins <andre@cilium.io> 16 June 2024, 08:39:19 UTC
664cffe workflows: e2e-upgrade: fix EXTRA parameters use an array for EXTRA to allow extending it properly Signed-off-by: Gilberto Bertin <jibi@cilium.io> 16 June 2024, 08:07:13 UTC
412a46c renovate: run post upgrade tasks on Makefile.values Run postUpgradeTasks after modifying install/kubernetes/Makefile.values file. Signed-off-by: André Martins <andre@cilium.io> 15 June 2024, 10:10:52 UTC
ffb8443 ci: fix ces migration test trigger and conn-disrupt usage PR #32930 introduced a change to the conn-disrupt test that caused this migration test to fail. This PR updates the test to work with the updated test format. The inconsistency was not caught because of the path filters used in the test, so the path filters have been updated to exclude only Documentation/ and test/. Fixes: #32268 Signed-off-by: jshr-w <shjayaraman@microsoft.com> 15 June 2024, 08:59:05 UTC
5e305af images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 15 June 2024, 09:00:23 UTC
1a06642 chore(deps): update docker.io/library/golang:1.22.4 docker digest to 0f76912 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 15 June 2024, 09:00:23 UTC
3a0a3fc Choose optimal algorithm depending on input size This adds a check to ImmSet methods Insert and Delete, whether there is a single or multiple elements being inserted or deleted. Depending on that, two different algorithms are used. For a single element, both algorithms are linear in the size of the ImmSet, but we choose the one that benchmarking shows to be faster. Signed-off-by: Damian Sawicki <dsawicki@google.com> 15 June 2024, 08:22:04 UTC
304f8b1 Replace existing ImmSet methods with proposed ones The rationale is given in the previous commit message. Signed-off-by: Damian Sawicki <dsawicki@google.com> 15 June 2024, 08:22:04 UTC
8f5bbad Add and benchmark alternative immset methods This commit adds alternative implementations of methods of ImmSet: * InsertNew(xs ...T) * DeleteNew(xs ...T) * UnionNew(s2 ImmSet[T]) * DifferenceNew(s2 ImmSet[T]) and benchmarks these implementations against the existing ones. Benchmarking results: * for Insert, the proposed method becomes faster already with the container of size 1000, and then it performed 10x faster for size 10,000 and 100x faster for size 100,000; * for Delete, the proposed method becomes faster already with the container of size 1000, and then it performed ~5x faster for size 10,000; * for Difference, the proposed method was already 4x faster for size 100, and then it performed 7x faster for size 1000, 35x faster for size 10,000, and 193x faster for size 100,000; * for Union, the proposed method performs slightly faster, but gains do not visibly grow with increasing size. Theoretically, the proposed solutions have improved computational complexity: * the complexity of Insert is O(len(s.xs)*len(xs)), and the complexity of InsertNew is O(len(s.xs)+len(xs)); * the complexity of Delete is O(len(s.xs)*len(xs)), and the complexity of DeleteNew is O(len(s.xs)+len(xs)); * the complexity of Difference is O(len(s.xs)*len(s2.xs)) because it uses Delete internally, and the complexity of DifferenceNew is O(len(s.xs)+len(s2.xs)); * the complexity of Union is harder to estimate: it involves sorting a slice of size n=len(s.xs)+len(s2.xs), but this slice is a concatenation of two sorted slices, so most likely this does not lead to the usual O(n*log(n)) complexity; of course, it is at least O(n); the complexity of UnionNew is O(n). Signed-off-by: Damian Sawicki <dsawicki@google.com> 15 June 2024, 08:22:04 UTC
5aa52b0 feat: Configure static cilium network policy Cilium reads the CNP yaml if `static-cnp-path` is specified in the cilium config. It converts it to rules and adds those rules to the policy engine. This allows an admin to configure policy to not allow traffic to certain secure infrastructure endpoints from pods running in the cloud. Signed-off-by: Tamilmani <tamanoha@microsoft.com> 15 June 2024, 08:17:50 UTC
4195bdc chore: Bump spire agent and server versions Ideally, this should be taken care of by renovate bot with postUpgradeTask; however, there are some issues with the configuration. Signed-off-by: Tam Mach <tam.mach@cilium.io> 15 June 2024, 08:14:22 UTC
268d28e dev-doctor: update hint links Signed-off-by: renyunkang <rykren1998@gmail.com> 15 June 2024, 07:56:46 UTC
cf63069 k8s: Fix usage of assert in TestWaitForCacheSyncWithTimeout TestWaitForCacheSyncWithTimeout is relying on subtests, so a new assert object should be derived from each subtest testing.T, in order to report the failing one correctly. Temporarily changing the code to force a failure in the "Not invoking BlockWaitGroupToSyncResources should cause wait to succeed immediately" case, it can be seen that no subtest is reported as failing: --- FAIL: TestWaitForCacheSyncWithTimeout (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Waiting_for_no_resources_should_always_sync (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Not_invoking_BlockWaitGroupToSyncResources_should_cause_wait_to_succeed_immediately (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_timeout_due_to_watched_resource_exceeding_timeout (0.20s) --- PASS: TestWaitForCacheSyncWithTimeout/Any_one_timeout_should_cause_error (0.60s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_complete_due_to_event_causing_timeout_to_be_extended_past_initial_timeout (0.70s) While after this change the expected subtest is correctly reported as the offending one: --- FAIL: TestWaitForCacheSyncWithTimeout (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Waiting_for_no_resources_should_always_sync (0.00s) --- FAIL: TestWaitForCacheSyncWithTimeout/Not_invoking_BlockWaitGroupToSyncResources_should_cause_wait_to_succeed_immediately (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_timeout_due_to_watched_resource_exceeding_timeout (0.20s) --- PASS: TestWaitForCacheSyncWithTimeout/Any_one_timeout_should_cause_error (0.60s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_complete_due_to_event_causing_timeout_to_be_extended_past_initial_timeout (0.70s) Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 14 June 2024, 20:16:13 UTC
33d64e1 gateway-api: Update docs for v1.1.0 This is the follow-up item for #32233. Relates: https://github.com/cilium/cilium/pull/32233 Signed-off-by: Tam Mach <tam.mach@cilium.io> 14 June 2024, 16:59:43 UTC
2fdd17b policy: Fix Rare Deny Merge Bug While fixing the distillery tests I noticed that the test "broad_allow_is_a_portproto_subset_of_a_specific_deny" was failing intermittently. This revealed that in very rare circumstances the policy engine would merge a deny entry into a pre-existing allow that was set to be deleted completely (not just have an ownership entry removed). Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
2710e71 policy: Fix Deny Precedence Distillery Tests Deny precedence tests have not been using correct comparison logic for mapstates. This updates it to use the correct logic. It makes the named-port sub-test and port-range tests more legible. Some test data has been renamed because it was incorrectly named. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
9b57c0b policy: Fix Proxy Port Tests The default proxy port is 1. The dummy proxy port value of 4242 should no longer be the default. Additionally, almost all distillery tests have had their comparison logic fixed to compare mapstates. The last distillery test to be fixed will be fixed in another commit. Co-authored-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
8458e5c policy: Fix InvertedPortMask Logic Inverted port masks were introduced in a previous commit, but they did not take into account all of the wildcard policy keys that are hardcoded both in tests and, unfortunately, in production code. This fixes all InvertedPortMask logic so that wild cards are correctly instantiated and accounted for. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
f174b5e recorder: move API handler into cell Currently, the recorder API handler is implemented by the daemon. This commit extracts the implementation and moves it into the recorder hive cell. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 16:32:04 UTC
248a9be recorder: use hive injected logger This commit replaces the static logger with the one that has been injected via Hive Framework. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 16:32:04 UTC
58dfad2 recorder: introduce cell for recorder This commit introduces a hive cell for the pcap recorder. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 16:32:04 UTC
736af68 bpf: extract ethertype in to-netdev / to-overlay just once Remove some duplicated logic. Ideally we would only do the extraction if at least *one* feature is enabled that actually requires the ethertype. But managing these dependencies doesn't seem worth the hassle. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 14:43:48 UTC
ae4ffd4 bpf: eth: extract eth_is_supported_ethertype() helper In preparation for a subsequent patch. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 14:43:48 UTC
aa10df3 kvstore: correctly assign permissions to single key, rather than prefix Fix the recently introduced rangeForKey function leveraged by the etcdinit logic to only grant permissions to the given key, and not the full prefix starting with that key, when appropriate. Fixes: cb6a58bef00b ("clustermesh: granular etcd permissions for kvstoremesh cached data") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 14 June 2024, 13:03:22 UTC
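The distinction in aa10df3 between a single key and a prefix maps onto etcd's range convention, where an empty range end denotes exactly one key. The sketch below is an illustration only: `rangeFor` and its `isPrefix` parameter are hypothetical, and how Cilium's rangeForKey decides whether a key is a prefix is not stated in the commit message.

```go
package main

import "fmt"

// prefixRangeEnd returns the smallest key that is greater than every key
// starting with prefix, following the etcd convention: increment the last
// byte below 0xff and truncate after it.
func prefixRangeEnd(prefix string) string {
	end := []byte(prefix)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return string(end[:i+1])
		}
	}
	// Empty or all-0xff prefix: "\x00" means "all keys >= prefix" in etcd.
	return "\x00"
}

// rangeFor returns the (key, rangeEnd) pair to grant permissions on.
// isPrefix is an assumption of this sketch; an empty rangeEnd restricts
// the grant to the single key.
func rangeFor(key string, isPrefix bool) (string, string) {
	if !isPrefix {
		return key, ""
	}
	return key, prefixRangeEnd(key)
}

func main() {
	k, end := rangeFor("cilium/state/", true)
	fmt.Printf("prefix: [%q, %q)\n", k, end)
	k, end = rangeFor("cilium/config", false)
	fmt.Printf("single key: %q (range end %q)\n", k, end)
}
```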
96336a0 lrp: move api handler from daemon to lrp hive cell Currently, the daemon implements the LRP API handler. With it comes the last dependency to the LRP manager from the daemon. Therefore, this commit moves the LRP API handler from the daemon to the existing LRP hive cell and removes the dependency from the daemon to the LRP manager. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 13:03:05 UTC
d21f654 bpf: test Wireguard with ENCRYPTION_STRICT_MODE Run compile & complexity tests for Wireguard with Strict-mode enabled. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 12:21:24 UTC
4cd716c bpf: clean up compile testing for ENABLE_WIREGUARD Wireguard completely lives in bpf_host. Remove the config from bpf_overlay, bpf_xdp and bpf_lxc. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 12:21:24 UTC
572bca4 bpf: l3: remove unused loopback code in local-delivery The `hairpin_flow` parameter was previously needed so that loopback replies could bypass the ingress network policy. But now that all callers set this parameter to `false`, we can safely remove the corresponding logic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 11:45:10 UTC
34337f0 bpf: lxc: simplify RevNAT path for loopback replies

The usual flow for handling service traffic to a local backend is as follows:
* requests are load-balanced in from-container. This entails selecting a backend (and caching the selection in a CT_SERVICE entry), DNATing the packet, creating a CT_EGRESS entry for the resulting `client -> backend` flow, applying egress network policy, and local delivery to the backend pod. As part of the local delivery, we also create a CT_INGRESS entry and apply ingress network policy.
* replies bypass the backend's egress network policy (because the CT lookup returns CT_REPLY), and pass to the client via local delivery. In the client's ingress path they bypass ingress network policy (the packets match as reply against the CT_EGRESS entry), and we apply RevDNAT based on the `rev_nat_index` in the CT_EGRESS entry.

For a loopback connection (where the client pod is selected as backend for the connection) this looks slightly more complicated:
* As we can't establish a `client -> client` connection, the requests are also SNATed with IPV4_LOOPBACK. Network policy in forward direction is explicitly skipped (as the matched CT entries have the `.loopback` flag set).
* In reply direction, we can't deliver to IPV4_LOOPBACK (as that's not a valid IP for an endpoint lookup). So a reply already gets fully RevNATed by from-container, using the CT_INGRESS entry's `rev_nat_index`. But this means that when passing into the client pod (either via to-container, or via the ingress policy tail-call), the packet doesn't match as reply to the CT_EGRESS entry - and so we don't benefit from automatic network policy bypass.

We ended up with two workarounds for this aspect:
(1) when to-container is installed, it contains custom logic to match the packet as a loopback reply, and skip ingress policy (see https://github.com/cilium/cilium/pull/27798).
(2) otherwise we skip the ingress policy tailcall, and forward the packet straight into the client pod.

The downside of these workarounds is that we bypass the *whole* ingress program, not just the network policy part. So the CT_EGRESS entry doesn't get updated (lifetime, statistics, observed packet flags, ...), and we have the hidden risk that when we add more logic to the ingress program, it doesn't get executed for loopback replies.

This patch aims to eliminate the need for such workarounds. At its core, it detects loopback replies in from-container and overrides the packet's destination IP. Instead of attempting an endpoint lookup for IPV4_LOOPBACK, we can now look up the actual client endpoint - and deliver to the ingress policy program, *without* needing to early-RevNAT the packet. Instead the replies follow the usual packet flow, match the CT_EGRESS entry in the ingress program, naturally bypass ingress network policy, and are *then* RevNATed based on the CT_EGRESS entry's `rev_nat_index`.

Consequently we follow the standard datapath, without needing to skip over policy programs. The CT_EGRESS entry is updated for every reply.

Thus we can also remove the manual policy bypass for loopback replies, when using per-EP routing. It's no longer needed and in fact the replies will no longer match the lookup logic, as they haven't been RevNATed yet. This effectively reverts e2829a061a53 ("bpf: lxc: support Pod->Service->Pod hairpinning with endpoint routes").

Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 11:45:10 UTC
b66862a endpoint: Fix Policy Sync Method The endpoint package had the last code in the Cilium repository that was updating mapstate during iteration. Updating mapstate during iteration will result in corrupting the mapstate. Fixes: #32959 Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 11:27:46 UTC
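A common way to avoid mutating a collection while iterating over it is to collect the intended changes first and apply them once the loop has finished. The sketch below illustrates that general pattern on a plain Go map; the `expand` function and its "derived entry" rule are invented for illustration and do not reflect Cilium's actual mapstate type or the fix in this commit.

```go
package main

import "fmt"

// expand adds one derived entry for every entry whose value is > 1.
// Hypothetical example: writes are queued during iteration and applied
// afterwards, so the loop never observes a half-updated map.
func expand(state map[string]int) {
	type change struct {
		key string
		val int
	}
	var pending []change

	// Phase 1: read-only iteration, record what should change.
	for k, v := range state {
		if v > 1 {
			pending = append(pending, change{key: k + "/derived", val: v - 1})
		}
	}

	// Phase 2: apply the recorded changes after iteration is complete.
	for _, c := range pending {
		state[c.key] = c.val
	}
}

func main() {
	state := map[string]int{"a": 3, "b": 1}
	expand(state)
	fmt.Println(len(state), state["a/derived"])
}
```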
42ee1f6 egressgw: skip egressgw handling if the packet is from host The egress gateway handling code at bpf_host only cares about packets from the egress proxy, so we can ignore the packets from the host. Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com> 14 June 2024, 11:14:28 UTC
7f6df2d prefilter: move api handler from daemon to prefilter hive cell Currently, the daemon implements the Prefilter API handler. With it comes the last dependency from the daemon to the prefilter itself. Therefore, this commit moves the prefilter API handler from the daemon to the existing prefilter hive cell and removes the dependency from the daemon to the prefilter. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 11:08:23 UTC
9f210b5 proxy: Persists proxy ports in /var/run/cilium/state Write proxy ports into /var/run/cilium/state/proxy_ports_state.json and restore from there. If the file is not available, only then restore from iptables rules. Restoration from iptables is still needed for upgrades. This fixes the issue with bpf TPROXY, as it does not rely on iptables rules, so the proxy ports cannot be recovered from them. Note: File IO is patterned after similar methods for cache.CachingIdentityAllocator. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
d11e4d2 proxy: Reuse proxy ports from datapath on restart Previously we only reused the DNS proxy port from the datapath. Change to reuse the old proxy ports for all proxy redirects from datapath, if possible. This reduces Listener churn on daemonset Envoy on agent restart. With this change the listener update in Envoy logs after agent restart looks like this: begin add/update listener: name=cilium-http-egress:13563 hash=11560876577076369351 duplicate/locked listener 'cilium-http-egress:13563'. no add/update lds: add/update listener 'cilium-http-egress:13563' skipped Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
8149af7 proxy: Remove deprecated "localOnly" We now only ever have 'localOnly = true' so that we always bind the proxy listeners to localhost address. Remove support for all-hosts addresses. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
ee1cf1c proxy: Fix AllocateProxyPort Rename AllocateProxyPort as AllocateCRDProxyPort to make it clear that it is only to be used for CEC/CCEC CRD listeners. Also enforce the current practice where CRD proxy ports are only accessed by name, and 'ingress' member is always false for CRD ports. Fix the unit test to verify that the proxy port was not reallocated for the same listener name. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
bfbd3d6 proxy: Try previously used port first "rulesPort" is non-zero when we have datapath (iptables) rules with a specific port. Try re-creating a redirect with that port if no other port is configured. This reduces churn on the datapath, as the existing rule can be reused without a need for an update. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
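The "try the previous port first" idea can be illustrated with a small allocator sketch. Only the name `rulesPort` comes from the commit message; `pickPort`, the `inUse` set, and the linear scan over a port range are hypothetical simplifications and not the proxy's real allocation logic.

```go
package main

import (
	"errors"
	"fmt"
)

// pickPort returns rulesPort when it is non-zero and still free, so the
// existing datapath rule can be reused as-is. Otherwise it falls back to
// the first free port in [min, max]. Hypothetical sketch.
func pickPort(rulesPort uint16, inUse map[uint16]bool, min, max uint16) (uint16, error) {
	if rulesPort != 0 && !inUse[rulesPort] {
		return rulesPort, nil
	}
	// Iterate with int to avoid uint16 wrap-around when max is 65535.
	for p := int(min); p <= int(max); p++ {
		if !inUse[uint16(p)] {
			return uint16(p), nil
		}
	}
	return 0, errors.New("no free proxy port in range")
}

func main() {
	inUse := map[uint16]bool{10000: true}
	port, err := pickPort(13563, inUse, 10000, 20000)
	fmt.Println(port, err)
}
```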