https://github.com/cilium/cilium

d355027 envoy: Bump envoy minor version to 1.30.x Relates: https://github.com/cilium/proxy/pull/831 Signed-off-by: Tam Mach <tam.mach@cilium.io> 06 July 2024, 02:50:56 UTC
fc69880 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 05 July 2024, 19:51:02 UTC
261af51 chore(deps): update all-dependencies Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 05 July 2024, 19:51:02 UTC
59e734b dev: support for an additional kind values file Currently the Cilium Helm values that are used for all of the Make kind-* targets are provided by static values files or an optional kind-custom.yaml file that is under git ignore (for local dev purposes). This commit introduces the possibility to pass an additional values file as a make argument. ``` ADDITIONAL_KIND_VALUES_FILE=contrib/testing/kind-feature-x.yaml make kind-debug ``` This provides the possibility to use feature-specific values files during development and reduces the need for a specific make target. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 05 July 2024, 12:17:24 UTC
2271d1e dev: add option to delete docker containers when deleting kind cluster Currently deleting a kind cluster with `make kind-down` fails if there are docker containers attached to the kind docker network. ``` ❯ make kind-down ./contrib/scripts/kind-down.sh Deleting cluster "kind" ... Deleted nodes: ["kind-worker" "kind-control-plane"] Error response from daemon: error while removing network: network kind-cilium id be5f3e19dd958de25745635986363284e14b38504af10329f69f5176779cab3a has active endpoints ``` In some cases adding docker containers to the same network is part of the test- / dev-setup. Therefore it would be great to automatically delete these docker containers before deleting the network. This commit introduces the possibility to delete the docker containers by passing the env var `DELETE_CONTAINERS` to the make target. ``` DELETE_CONTAINERS=true make kind-down ``` Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 05 July 2024, 12:17:16 UTC
d36d992 build(install,dashboards): update cilium-agent Grafana dashboard In order to make the dashboard compatible with Grafana 11, the panels got upgraded. Issue: https://github.com/cilium/cilium/issues/31850 Signed-off-by: Sebastian Gaiser <sebastiangaiser@users.noreply.github.com> 05 July 2024, 06:31:47 UTC
4709707 bgpv2: Fix description of Selector behavior in CiliumBGPAdvertisement CRD The actual implementation as well as the intention is to not advertise any prefixes if the Selector CiliumBGPAdvertisement is not specified. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 05 July 2024, 05:57:35 UTC
ecdf16d fqdn-perf: allow to inject additional metrics measurements Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 04 July 2024, 18:59:39 UTC
cd2f736 renovate: stop wireguard updates Renovate has been failing recently to check for the wireguard updates. Also this dependency hasn't received updates for months so it's safe to ignore it temporarily. Signed-off-by: André Martins <andre@cilium.io> 04 July 2024, 18:51:20 UTC
9cb4ae5 .github: Clean up cilium-cli action usages - Install cilium-cli in the default location to be consistent with other workflows. When cilium/design-cfps#9 eventually gets implemented, we might want to put the cilium-cli code under cilium-cli/ directory. - For conformance-ginkgo.yaml, use the cilium-cli action to install cilium-cli in /host directory. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 04 July 2024, 17:51:47 UTC
4906db5 ipsec: Delete old, deprioritized XFRM OUT rules In commit a11d088154 ("ipsec: Deprioritize old XFRM OUT policy for dropless upgrade"), we added special logic to handle the upgrade to v1.15 and previous versions, particularly around the replacement of XFRM OUT policies. All users are now expected to have upgraded, so we can remove this logic. ...or more precisely, update it. Instead of deprioritizing old XFRM OUT policies, the same logic will now remove them. This may seem strange because, in the previous commit, I advocated for not removing stale XFRM states. The difference here is two-fold. First, the logic to remove is almost the same as to deprioritize so I think the risk of introducing a regression in this way is low. Second, we know that a large number of XFRM policies on the system can have an impact on performance so there is an incentive (even if small) to remove stale policies. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 04 July 2024, 17:27:54 UTC
2d8c746 ipsec: Remove stale upgrade logic around XFRM states In commits f1c4a6e593 ("ipsec: Allow old and new XFRM IN states to coexist for upgrade") and c0d9b8c9e7 ("ipsec: Allow old and new XFRM OUT states to coexist for upgrade"), we added special logic to handle upgrades to v1.15.0 and previous versions. This logic was required because we needed to change the structure of our XFRM states and Linux doesn't offer a way to replace them atomically. So we had to detect conflicts and temporarily remove conflicting XFRM states while we add the new ones. Painful times... With the v1.17 development cycle starting, these upgrades are now clearly behind us. All IPsec users should have the new XFRM states in place. We can therefore remove this special logic. Note that we never cleaned up the old XFRM states. It's unclear that we should. They are now unused and not causing problems. They will naturally disappear as users rotate nodes and clusters. Adding logic to specifically remove them may carry more risk than benefit. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 04 July 2024, 17:27:54 UTC
e357fdb ipsec: Remove stale code from v1.15 Commit a4c43f358ee ("ipsec: Do not use AllocCIDR with subnet encryption") added code to remove an old bogus route in v1.15. In v1.17, we can now assume this route was removed for all users and the related code can be removed. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 04 July 2024, 17:27:54 UTC
d2c1411 bpf: Remove old mark logic for IPsec upgrades Commit 420d7faea3 ("bpf, daemon: Have bpf_host support both values for skb->cb[4]") introduced a special logic to handle the upgrades to v1.15.0 and previous versions. This logic was needed because the way we use skb->cb[4] changed. In v1.17, we can assume that all users have now gone through the upgrade and this logic isn't needed anymore. This commit therefore deletes it. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 04 July 2024, 17:27:54 UTC
b855b25 node/manager: synthesize node deletion events When the cilium agent is down (due to a crash or an upgrade), it can miss node events. Upon startup, live nodes are upserted, but when deletions are missed, the agent fails to clean up node-related system state. Examples of such state include bpf map entries, xfrm states or routes. In particular, the agent fails to clean up node IP to nodeID mappings in the nodeid bpf map. Since K8s will happily recycle such IPs, this can lead to breakage, as the agent associates the wrong nodeID with IPs. To avoid leaking this state, the node manager now dumps its view of the current set of nodes to a file in the runtime state directory, which can be read on restart of an agent. This is similar to how we restore other state upon restart. When reading this file, it's important to avoid resurrecting long-gone nodes (as we don't know for how long the agent was down) - instead, we merely take note of which nodes we knew of in the past, compare that to the nodes we consider live (once synced to k8s), and delete the ones which seem to have disappeared. The motivation to build this reconciliation based on full state dumps to disk is that downstream code generally assumes to have access to a full node object in the deletion callbacks. This makes it infeasible to base the pruning on just the information available in bpf maps. In an alternative design, downstream subsystems are responsible for cleaning up their own state based on just a node identifier, but current code doesn't allow for this. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 04 July 2024, 14:53:36 UTC
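The reconciliation described in b855b25 boils down to a set difference between the nodes restored from the state file and the nodes seen live after the k8s sync. A minimal sketch, assuming nodes are identified by name only; `staleNodes` and the plain string identifiers are hypothetical and are not taken from the node manager's code:

```go
package main

import (
	"fmt"
	"sort"
)

// staleNodes returns the nodes that were known before the restart (restored
// from the state file) but are absent from the live set synced from k8s.
// Restored nodes are only used for this comparison and are never re-added,
// so long-gone nodes are not resurrected.
func staleNodes(restored, live []string) []string {
	liveSet := make(map[string]struct{}, len(live))
	for _, n := range live {
		liveSet[n] = struct{}{}
	}
	var stale []string
	for _, n := range restored {
		if _, ok := liveSet[n]; !ok {
			stale = append(stale, n)
		}
	}
	sort.Strings(stale) // deterministic order for the deletion events
	return stale
}

func main() {
	restored := []string{"node-a", "node-b", "node-c"}
	live := []string{"node-a", "node-d"}
	// Each entry would get a synthesized deletion event.
	fmt.Println(staleNodes(restored, live))
}
```

The real implementation restores full node objects rather than names, because the deletion callbacks expect a complete node object.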
545fbc8 controlplane: clear environment after test Clearing the environment in the middle of the test can cause failures related to state being deleted, as the "environment" being cleared is simply the StateDir of the agent. Fixes: 940b186ab4 ("test/controlplane: Fix tests after removal of global hives") Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 04 July 2024, 14:53:36 UTC
f5f1e5a policy: Fix mapstate.Diff() used in tests Use the actual unexpected value, rather than the one that was not found. Remove the import of unused "testing" from production code. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 04 July 2024, 12:55:04 UTC
aa44dd1 Track node labels propagated to the endpoint manager correctly This fixes a race condition during agent startup where k8s node label updates are rejected indefinitely by the host endpoint. The current endpoint label update logic rejects a request if any of the labels on the old node are not present in the endpoint manager state. For host endpoints, the k8s node label add/update events may be missed if the k8s node watcher initializes before the host endpoint is created. As a result, the host endpoint labels are outdated, and all subsequent node label updates are rejected due to the aforementioned precondition check. To resolve the issue, this commit updates the old labels only if the current node label update is successfully propagated to the endpoint manager. Fixes cilium#29649 Signed-off-by: Satish Matti <smatti@google.com> 04 July 2024, 12:52:34 UTC
fdb9770 helm: possibility to control creation of GatewayClass Signed-off-by: Petr Baloun <petr.baloun@firma.seznam.cz> 04 July 2024, 12:23:09 UTC
ccbf2d7 pkg/identity: Add basic identity allocator Basic identity allocator will be used by operator to manage global identities CID and agent to manage locally created identities that do not require complex features like `pkg/allocator/allocator.go`. Related: #30356 Signed-off-by: Ovidiu Tirla <otirla@google.com> 04 July 2024, 11:50:14 UTC
a1014b6 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 04 July 2024, 11:55:29 UTC
a8c321d renovate: add fix grpc-go autodetection As grpc-go has a tag specifically for each project, we need to adjust renovate accordingly so that it detects the versioning of the grpc-go only. Fixes: dc683b9ea770 ("images/builder: let renovate update proto plugins") Signed-off-by: André Martins <andre@cilium.io> 04 July 2024, 11:55:29 UTC
5f80dd7 daemon: add agent-runtime-config backup files to gitignore Executing the unit tests in `pkg/option` generates the newly introduced `agent-runtime-config.json` backup files. This commit adds the config files to the gitignore file to prevent them from being added to version control. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 04 July 2024, 11:20:58 UTC
d4a4faf Fix too many open Unix sockets Fixes: #33542 Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 04 July 2024, 10:45:47 UTC
d06b684 bpf: Clean-up clang version check The current codebase is with clang 17+, so we can remove these checks for old versions. Signed-off-by: Tam Mach <tam.mach@cilium.io> 04 July 2024, 10:33:19 UTC
2a16206 build: update cilium proxy go dependency to latest version This commit updates the go dependency `github.com/cilium/proxy` to the latest version of `main` to pull in some API changes. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 04 July 2024, 09:57:41 UTC
984bac9 envoy: update Cilium Proxy image to latest version This commit updates the Cilium Proxy image to the latest version from `main`. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 04 July 2024, 09:57:41 UTC
d9cbb03 cilium, test: Extend service tests with health check callbacks Extend the service tests to make use of the health checker callback infra to simulate that when health check for one frontend fails, we don't take down the whole backend through this facility but just the backend corresponding to the given service. Co-developed-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Aditi Ghag <aditi@cilium.io> Co-developed-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 July 2024, 09:56:48 UTC
0c310c3 cilium, api-handler: Do not exclude APIs in lb-only mode Just install all API handlers in lb-only mode. The reason back then to disable API parts was due to b052272d550a ("daemon: Disable parts of Cilium API in LB mode") which was for a case where the Cilium CNI plugin was installed, a K8s cluster still had some endpoints and then the agent in lb-only mode tried to restore them but with errors. This is fine since Docker-only mode is targeted for non- K8s anyway, and also two Cilium instances in hostns (CNI + lb-only) would step on each other and is not supported at the moment. Also, Marco added that this daemon function only adds the API handlers that are still implemented "in the daemon", but there are also other API handlers that are already provided by the respective cells directly. More of the latter will be migrated in future. Leaving this here as separate commit for context. Suggested-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 July 2024, 09:56:48 UTC
9c7351e cilium: Add bpf-lb-external-control-plane switch for old-style docker mode Docker style mode is still in use, therefore add a new --bpf-lb-external-control-plane switch (default is false) where users can opt into the old behavior when it is being deployed in non-k8s environments. Adopt the CI tests accordingly by adding --bpf-lb-external-control-plane=true to their cmdline. "bpf-lb-external-control-plane" because it needs to be provided externally since the only thing that the agent listens on is the swagger API for installing frontends and backends. Co-developed-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Aditi Ghag <aditi@cilium.io> Co-developed-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 July 2024, 09:56:48 UTC
f4c91e8 cilium: Add service infrastructure for per-service health This adds new infrastructure for building custom health checkers for services. The current backend health infrastructure has the limitation in that it is global to the backend. This is fine if a backend is generally taken out of service for maintenance updates, however, this approach will not work if a L4 backend contains multiple L7 services. In this case the backend health property/state needs to be more fine-grained and tied to the specific service instead of being global. We achieve this by: 1) Changing UpdateBackendsState() to call into UpdateBackendsStateMultiple(). The latter is reworked into a reusable function, taking a custom service map to update and a boolean whether to also quarantine the whole backend. The per-service quarantining is called via UpdateBackendStateServiceOnly(). 2) The Backend Go object is not shared between services, so state transition can be updated via UpdateBackendStateServiceOnly() and not affecting other services. The latter skips updating the actual backend state. So the per- service state can be quarantined but the global state still active. 3) The BPF datapath in this case still "takes out" the backend from the BPF service map, so that new connections do not pick the given backend to redirect traffic to. It does so by reducing the struct lb{4,6}_service count field. Also, we reuse the pad[2] fields to explicitly add the qcount field to denote a quarantine count. These backends are present in the service map via struct lb{4,6}_key with a backend_slot lookup range of [svc->count, svc->count + svc->qcount). 4) Restoration takes the BPF map information into account in order to rebuild the internal per-service quarantine representation before a sync with the kube-apiserver happens. DumpServiceMaps() and svcBackend() set the per- service backend state. 
5) This work also adds a registration mechanism for building custom health checkers via HealthChecker interface with hooks for UpsertService() and corresponding DeleteService() as well as a mechanism to register callbacks via SetCallback(). The HealthCheckCallback() API can then be used to update a per-service backend state. See also the TestHealthCheckCB() unit test in the subsequent patch. 6) The Service object also plumbs through annotations from the Kubernetes service. This helps for the health checkers to plumb through configuration data when registering the service at the health checker plugin. Co-developed-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Aditi Ghag <aditi@cilium.io> Co-developed-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 July 2024, 09:56:48 UTC
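The per-service quarantine layout from f4c91e8 (active backends counted by `count`, quarantined ones in the half-open range `[count, count + qcount)`) can be modelled in user space roughly as follows. This is a simplified sketch only: the `service` type, its 0-based slice, and the swap strategy are assumptions for illustration and do not reproduce the BPF map layout or the actual Go service manager.

```go
package main

import "fmt"

// service is a hypothetical stand-in for one service entry. Backends are
// kept in a single slice: active ones first, quarantined ones right after.
type service struct {
	slots  []string // backend identifiers
	count  int      // number of active backends
	qcount int      // number of quarantined backends
}

// active returns the backends that new connections may be sent to.
func (s *service) active() []string { return s.slots[:s.count] }

// quarantined returns the backends in the range [count, count+qcount).
func (s *service) quarantined() []string {
	return s.slots[s.count : s.count+s.qcount]
}

// quarantine moves one active backend into the quarantined range by
// swapping it with the last active slot and shrinking count. Only this
// service is affected; other services keep their own slot ranges.
func (s *service) quarantine(id string) bool {
	for i := 0; i < s.count; i++ {
		if s.slots[i] == id {
			last := s.count - 1
			s.slots[i], s.slots[last] = s.slots[last], s.slots[i]
			s.count--
			s.qcount++
			return true
		}
	}
	return false
}

func main() {
	svc := &service{slots: []string{"b1", "b2", "b3"}, count: 3}
	svc.quarantine("b1")
	fmt.Println("active:", svc.active(), "quarantined:", svc.quarantined())
}
```

Keeping quarantined backends adjacent to the active range is what lets a restart rebuild the per-service state from the map contents alone, as item 4 of the commit message describes.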
d26009c cilium: Enable load-balancer mode to connect to kube-apiserver Allow connecting to the kube-apiserver in lb-only datapath mode. This allows to utilize the K8s control plane for orchestration as in CNI mode and enables persistent storage of services and backends. Note that in this mode having local Pods in the cluster is currently not supported and therefore the BPF code generation is skipped in endpoints package. Co-developed-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: Aditi Ghag <aditi@cilium.io> Co-developed-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 July 2024, 09:56:48 UTC
83fa353 bpf: lxc: fix up reporting of drop reason in drop_for_direction() DROP_* reasons are negative values. Reported-by: Nikita V. Shirokov <tehnerd@tehnerd.com> Relates: https://github.com/cilium/cilium/issues/32473 Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 04 July 2024, 08:04:32 UTC
99adddc bpf: lxc: use THIS_INTERFACE_IFINDEX instead of CB_IFINDEX When bpf_lxc's ingress path is called via the policy tailcall map, it takes a CB_IFINDEX parameter. This is used to redirect the packet into the endpoint, after applying policy. But now that we bake the endpoint's ifindex into the bpf_lxc program, this can be replaced by THIS_INTERFACE_IFINDEX. Allowing us to eventually condense all the boolean flags (CB_IFINDEX / CB_FROM_HOST / CB_FROM_TUNNEL) into a single skb->cb slot. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 04 July 2024, 03:30:19 UTC
6236f38 docs: Add Port Range Information - Port Ranges are now supported so it is removed from the unsupported features table. - DNS rules and L7 rules do not support port ranges yet. Notes are added to call attention to this. - Add a Port Range example. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 04 July 2024, 00:39:53 UTC
4c336dd option: Fix merge conflict in test expectations Commit 369e927307 ("daemon: Check that DaemonConfig is not changed after being published") added a new test case that needs to be updated when adding new configs. Commit b7a26eb8a2 ("Add rate limiting on BPF events map") added new configs but the pull request branch didn't include commit 369e927307, thus tests passed in the pull request CI and started failing once the pull request was merged. This commit fixes it by updating the test expected output. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 03 July 2024, 21:43:10 UTC
e64311d Add kernel version limitation to multicast Doc The minimum kernel version that can enable multicast differs between AMD64 and AArch64 because tail calls from eBPF subprograms were enabled at different times on each architecture. This commit documents the difference. ref: The commits that allow tail calls in BPF subprograms on each architecture are listed below. AMD64: https://github.com/torvalds/linux/commit/e411901c0b775 This commit is included in kernel 5.10 and newer. AArch64: https://github.com/torvalds/linux/commit/d4609a5d8c70d21b4a3f801cf896a3c16c613fe1 This commit is included in kernel 6.0 and newer. This PR is a small part of the solution to https://github.com/cilium/cilium/issues/33408 Signed-off-by: Yusho Yamaguchi <yusho.yamaguchi@sony.com> 03 July 2024, 20:17:41 UTC
be112ac helm: `loadBalancerSourceRanges` for clustermesh-apiserver service This commit updates the Helm template for the clustermesh-apiserver to allow users to define `loadBalancerSourceRanges` which will be used to determine which IP ranges users can connect to the clustermesh-apiserver load balancer from. Signed-off-by: Matthieu Antoine <matthieu.antoine@jumo.world> 03 July 2024, 18:15:39 UTC
c9aeefb envoy: Update envoy 1.29.x to v1.29.7 This is mainly to pick up the below CVE fix from the upstream. Related CVE: https://github.com/envoyproxy/envoy/security/advisories/GHSA-fp35-g349-h66f Relates: https://github.com/cilium/proxy/pull/817 Relates: https://github.com/envoyproxy/envoy/releases/tag/v1.29.7 Signed-off-by: Tam Mach <tam.mach@cilium.io> 03 July 2024, 18:11:46 UTC
b6b318e docs: Document plus sign in IPsec secret The plus sign in the IPsec secret forces the use of per-tunnel keys. It will be mandatory from version 1.16. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 03 July 2024, 15:08:17 UTC
c3a0d6c images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 03 July 2024, 12:25:17 UTC
9e334c2 chore(deps): update go to v1.22.5 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 03 July 2024, 12:25:17 UTC
50371db renovate: do not update major golang in v1.16 branch Similar to other stable branches we shouldn't update golang to major versions automatically. Fixes: da63c10eb308 ("Prepare for v1.17 development cycle") Signed-off-by: André Martins <andre@cilium.io> 03 July 2024, 11:56:18 UTC
c707781 ipsec: Deprecate global IPsec keys Using a single IPsec key for all IPsec tunnels is insecure. It was only preserved to allow for a smooth switch to per-tunnel keys. Per-tunnel keys have been released in v1.13, v1.14, and v1.15, so we can now deprecate the insecure alternative for v1.16. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 03 July 2024, 11:53:01 UTC
b7a26eb Add rate limiting on BPF events map This change introduces a rate limiter that prevents high CPU utilization by cilium-agent. This is particularly important on smaller VMs where consuming ~3vCPU can starve other processes on the node. At the same time, a CPU limit on the whole pod is not a good solution, as it may affect network programming latency. The rate limiter operates at the BPF level, preventing an event from being generated and written to the BPF events map. It utilizes the rate limiter already used for ICMPv6 messages, whose implementation is based on a token system. The rate limiter is configured with 2 values - rate limit (how many tokens are refreshed every second) and burst limit (the maximum number of tokens). Each call to the rate limiter uses 1 token if a token is available. The option should be used when you either need to limit cilium-agent CPU usage (without disabling the hubble subsystem entirely) or you observe many lost events in the observer queue (as this wastes CPU time on processing events that won't be consumed anyway). This is a follow-up to https://github.com/cilium/cilium/pull/25385, which implemented similar behaviour, but in user space. This implementation provides rate limiting of events at the earliest stage of the event lifecycle, in kernel space, before the event is written to the BPF events map. Signed-off-by: Michal Siwinski <siwy@google.com> 03 July 2024, 09:54:59 UTC
9343a62 ipsec: do not nil out EncryptInterface when using IPAM ENI netlink.LinkList() can return a transient kernel interrupt error. This commit adds a retry when this occurs in loader.reinitializeIPSec() to prevent nilling out or misconfiguring EncryptInterface. Additionally, it will now surface an error instead of swallowing it. Signed-off-by: Jason Aliyetti <jaliyetti@gmail.com> 03 July 2024, 09:23:49 UTC
4379c98 fix link in node-ipam.rst Signed-off-by: Dean <22192242+saintdle@users.noreply.github.com> 03 July 2024, 09:00:39 UTC
2173bb6 experimental: Add fuzz test to validate serializability The "cilium-dbg statedb" commands require the objects to be JSON serializable, which is easy to break accidentally. Add a fuzz test to generate arbitrary Service, Frontend and Backend objects and validate that the TableRow() output is the same across JSON serialization. Signed-off-by: Jussi Maki <jussi@isovalent.com> 03 July 2024, 08:52:33 UTC
f0a7908 loadbalancer/experimental: Add experimental API for load-balancing Add experimental Services API for managing load-balancing frontends and backends. This is added as a new experimental package to avoid confusing it with the production implementation. When the hidden "--enable-experimental-services" flag is set: * K8s Service and Endpoints are reflected to service, frontend and backend tables * A mock reconciler is started that logs mock operations to reconcile the frontends. The tables can be inspected with "cilium-dbg statedb experimental" commands, e.g. "cilium-dbg statedb experimental frontends". Signed-off-by: Jussi Maki <jussi@isovalent.com> 03 July 2024, 08:52:33 UTC
62ac5e0 container: Add JSON marshalling support for ImmSet[T] For dumping StateDB objects with ImmSet[T] fields, add support for JSON marshalling and unmarshalling. The default implementation does not work as ImmSet[T] has private fields only. Signed-off-by: Jussi Maki <jussi@isovalent.com> 03 July 2024, 08:52:33 UTC
7add37f clustermesh: Add JSON marshalling to AddrCluster Implement JSON marshalling for AddrCluster type so it can be used in StateDB objects and dumped with cilium-dbg. Signed-off-by: Jussi Maki <jussi@isovalent.com> 03 July 2024, 08:52:33 UTC
1a33ff0 renovate: remove concurrency group from renovate's Base Image Release Build The "Base Image Release Build - Renovate" workflow doesn't need a concurrency group as it will use the concurrency group of the workflow that it uses, the "./.github/workflows/build-images-base.yaml". Using the concurrency groups on both workflows will result in the following error: Canceling since a deadlock for concurrency group 'Base Image Release Build - Renovate-refs/heads/renovate/main-all-dependencies' was detected between 'top level workflow' and 'build-base-images-from-renovate' Fixes: f054f94b24b9 (".github: add workflow for renovate to build base images") Signed-off-by: André Martins <andre@cilium.io> 03 July 2024, 08:49:45 UTC
979f335 bpf: wireguard: simplify overlay path Pull the goto statement up to clean up an extra level of indirection. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 03 July 2024, 07:45:19 UTC
26085bd bpf: wireguard: use overlay mark to detect tunnel traffic Instead of peeking into the packet headers, rely on the mark that is set by to-overlay. https://github.com/cilium/cilium/pull/31082 landed in v1.16, so for v1.17 it's safe to assume that all overlay traffic is marked accordingly. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 03 July 2024, 07:45:19 UTC
8e7f0a0 bgpv2: Skip reconcile while BGPNodeConfig is not initialized Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 03 July 2024, 07:33:06 UTC
99bf976 renovate: remove group for Makefile.values This group is redundant given that the Makefile.values is already defined under the paths for renovate to search for dependencies. Fixes: 99846fd67db8 ("renovate: add all dependencies of Makefile.values") Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 22:48:12 UTC
abdaf95 renovate: fix wireguard's regex Fixes: 7e440fe3f9de ("add versioning schema for WireGuard in Renovate") Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 22:48:12 UTC
2c6b521 renovate: do not update spire 1.10 This version has a regression that we need to fix so we will be skipping its automatic updates from renovate. Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 22:48:12 UTC
5b9102a renovate: fix detection for dependencies on install-protoplugins.sh Fixes: dc683b9ea770 ("images/builder: let renovate update proto plugins") Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 22:48:12 UTC
04a21fc chore(deps): update all lvh-images main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 02 July 2024, 22:16:34 UTC
515e89a gha: Allow CRD mismatch for Gateway API conformance This is to fix the below failure, and allow more flexibility running latest upstream tests. ``` gateway-api/conformance_test.go:36 Error: Received unexpected error: the installed CRDs version is different from the suite version Test: TestConformance Messages: error initializing conformance suite ``` Fixes: d34119caadb00de25066ac91dfa3c33d2e159a6a Signed-off-by: Tam Mach <tam.mach@cilium.io> 02 July 2024, 15:31:41 UTC
b72f57a add kind/cfp label to feature request Features should also have the kind/cfp associated to it. Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 12:45:48 UTC
07c138c ipsec: Remove deprecated secret parsing code These format options for the IPsec secret were never documented and have been deprecated since v1.12. We postponed the removal since a lot was going on in IPsec land, but it's now time to remove it. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 02 July 2024, 11:59:33 UTC
8668292 Update ipsec to handle larger psk values For psk values <= 32 bytes use SHA256 to compute the node key. Otherwise use SHA512. This is needed to support GCM-256-AES since a PSK for this would require 36 bytes as per RFC 4106. Fixes: #33457 Fixes: c28e046d4c6 ("ipsec: Compute per-node-pair IPsec keys") Signed-off-by: Jason Aliyetti <jaliyetti@gmail.com> 02 July 2024, 11:05:30 UTC
d34119c gateway-api: Bump version to latest upstream This is mainly to pick up new conformance tests for GRPCRoute such as exact method matching, listener hostname, etc. Signed-off-by: Tam Mach <tam.mach@cilium.io> 02 July 2024, 09:33:47 UTC
6cb07c1 daemon: Allow DNS transparent mode to be turned off with encryption DNS transparent mode was introduced to make sure that DNS traffic is always encrypted if the user is running with transparent encryption. If DNS proxy transparent mode is turned off, proxied DNS traffic will be leaked. However, DNS transparent mode is suffering from various bugs, e.g. - https://github.com/cilium/cilium/issues/31535 - https://github.com/cilium/cilium/issues/31197 - https://github.com/cilium/cilium/issues/33144 While we are working on addressing these bugs, some users might be fine with proxied DNS traffic being leaked. Therefore, this commit introduces a hidden and undocumented flag which requires DNS proxy transparent mode to be enabled with IPSec. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 02 July 2024, 09:18:56 UTC
99846fd renovate: add all dependencies of Makefile.values Now we can let renovate update the dependencies of all images from Makefile.values. Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 09:11:46 UTC
89bff35 renovate: run renovate on Sunday Instead of running it on Monday, renovate should run on Sunday so that it can use the CI outside of the hours that developers use CI as well. Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 09:11:46 UTC
0ea2232 renovate: update noisy dependencies once a month These dependencies are always being updated every week which can be noisy. For example, in case we merge a PR on Monday, Renovate will open another PR trying to update the same dependency again. Thus, we will only update them on the first Sunday of the month. Signed-off-by: André Martins <andre@cilium.io> 02 July 2024, 09:11:46 UTC
c3c6d79 aws: use LimitsNotFound This change makes use of LimitsNotFound error type in ResyncInterfacesAndIPs. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 02 July 2024, 09:05:32 UTC
70bdeba alibabacloud: use LimitsNotFound This change makes use of LimitsNotFound error type in ResyncInterfacesAndIPs. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 02 July 2024, 09:05:32 UTC
0dbddf4 ipam: Add LimitsNotFound error type This change adds new error type for differentiating errors in ResyncInterfacesAndIPs calls that were caused by limits for instance types not being available. The new error is checked for and a more helpful log message is emitted. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 02 July 2024, 09:05:32 UTC
f5f2c54 Docs: Add `lbipam-require-lb-class` docs to LB-IPAM page Updated the LB-IPAM documentation to include the new `lbipam-require-lb-class` option. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com> 02 July 2024, 09:04:37 UTC
b18b5b0 install/kubernetes: Add helm option for lbipam-require-lb-class flag This commit adds a new helm option for the `lbipam-require-lb-class` flag. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com> 02 July 2024, 09:04:37 UTC
2cfeb0a LB-IPAM: Add flag to ignore services without LBClass set The existing behavior of LB-IPAM was to assume LB-IPAM is responsible for services without an LBClass set. This made sense since LB-IPAM is aimed at environments where traditional LB controllers are not available and there were still supported k8s deployments that did not support LBClass. All supported k8s deployments now support LBClass, and users have reported situations where the default LB controller has to be assigned to a controller other than LB-IPAM due to the other controller not respecting LBClasses. Therefore, it is advantageous to have the option to tell LB-IPAM to ignore services without an LBClass set and only manage services with an LBClass that we own. This commit adds this functionality; by default we still have the old behavior, but the `--lbipam-require-lb-class` flag can be set to true to make LB-IPAM ignore services without an LBClass set. Signed-off-by: Dylan Reimerink <dylan.reimerink@isovalent.com> 02 July 2024, 09:04:37 UTC
c9ce9b0 docs: remove beta from local redirect policy page LRP has moved to stable. #33032 Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com> 02 July 2024, 08:51:20 UTC
c1afd58 chore(deps): update dependency renovatebot/renovate to v37.421.5 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 02 July 2024, 08:10:59 UTC
a4dfe22 chore(deps): update kindest/node docker tag to v1.30.2 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 02 July 2024, 07:56:51 UTC
9d8db39 Remove unnecessary hubble port-forward commands Remove "cilium hubble port-forward" command from workflows that do not perform flow validation. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 02 July 2024, 07:53:23 UTC
b1ea3e3 fix(deps): update all go dependencies main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 02 July 2024, 05:54:47 UTC
1c1382d Cleanup: no need to deactivate l7proxy when activating EgressGateway Cleaning up a minor oversight in the documentation following this merge https://github.com/cilium/cilium/pull/32828 Signed-off-by: cdtzabra <22188574+cdtzabra@users.noreply.github.com> 02 July 2024, 05:09:09 UTC
fb55ad6 gh: ipsec: clarify check for leaked proxy traffic during key rotation Add a comment to explain why we need to disable the check for proxy traffic when running the bpftrace leak detection during key rotation. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 02 July 2024, 04:57:42 UTC
1267ff9 Documentation: accept ORG and REPO By default, the check-crd-compat-table script will get the remote from cilium/cilium. This script won't work if there isn't a remote under these names. As a workaround, and to avoid extensive refactoring, the script will detect if ORG and / or REPO environment variables are set and use those as inputs to get the remote name. Signed-off-by: André Martins <andre@cilium.io> 01 July 2024, 22:59:25 UTC
66b67f4 chore(deps): update dependency renovatebot/renovate to v37.421.4 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 01 July 2024, 16:30:31 UTC
94abcdf bpf: lxc: clean up stray IPv6 revalidation The corresponding IPv4 part was removed with 34337f0257d5 ("bpf: lxc: simplify RevNAT path for loopback replies"). We likely never even needed it for IPv6. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 01 July 2024, 16:29:56 UTC
dc9b047 bpf: lxc: stop setting rev_nat_index in CT_INGRESS entry In v1.16 we moved the RevDNAT step for loopback connections into the client's ingress path, aligning it with the standard code path (see 34337f0257d5 ("bpf: lxc: simplify RevNAT path for loopback replies")). This uses the .rev_nat_index in the CT_EGRESS entry. Thus we no longer need to propagate this rev_nat_index into the CT_INGRESS entry. Remove the relevant code in v1.17. Fixes: https://github.com/cilium/cilium/issues/33154 Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 01 July 2024, 16:29:56 UTC
91c6365 chore(deps): update cilium/cilium-cli action to v0.16.11 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 01 July 2024, 16:22:44 UTC
7f2c92b rest-api: move config modify rest api handler into rest-api cell Currently, the REST API handler that handles config modifications is implemented in the Daemon struct. This commit moves the handler and all its dependencies into the rest api cell. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 July 2024, 15:53:36 UTC
a5b7e6a docs: add upgrade note for dangling cidrGroupRefs When a CNP selects _only_ dangling cidrGroupRefs in a to/fromCIDRSet selector, it resulted in an allow-all prior to 1.16. Since 1.16 changes the semantics of the empty, but non-nil to/fromCIDRSet selector, the behaviour of dangling cidrGroupRefs was adjusted to match. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 01 July 2024, 15:53:34 UTC
5c09c92 gh/workflows: Skip no-frag in IPsec for some key rotation https://github.com/cilium/cilium/issues/29480 Signed-off-by: Martynas Pumputis <m@lambda.lt> 01 July 2024, 13:17:08 UTC
09ba861 gh/workflows: Bump Cilium CLI to v0.16.11 Signed-off-by: Martynas Pumputis <m@lambda.lt> 01 July 2024, 13:17:08 UTC
2123732 envoy: Avoid short circuit BE filtering The same service can be used with multiple port types (e.g. number and name), so we should continue matching port values for both. Signed-off-by: Tam Mach <tam.mach@cilium.io> 01 July 2024, 09:55:04 UTC
385b214 .github/workflows: change GH runners GitHub now uses 4 CPU cores for default runners, up from 2. This allows us to use these runners instead of self-hosted ones, benefiting from features like nested virtualization. Signed-off-by: André Martins <andre@cilium.io> 01 July 2024, 09:09:01 UTC
74a025c chore(deps): update all github action dependencies Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 01 July 2024, 07:40:13 UTC
2716ec9 kvstoremesh: correctly remove cached data upon cluster disconnection Currently, a comment attached to the Remove method claims that the cached data would have disappeared due to lease expiration. However, the lease is never expected to expire, as we use a single local client for all remote clusters. Additionally, depending on lease expiration would leave a significantly long race condition window in which stale data would still be present if the same cluster were to be reconnected. In most cases, stale cached data is not a problem, as the etcd instance is stateless. However, it can create problems if the same cluster is first disconnected and then subsequently reconnected, without restarting the clustermesh-apiserver pod in between. Indeed, in this case we would have leftover entries that would not be cleaned up before reconnecting. Let's address this issue by explicitly deleting the cached data upon cluster disconnection. This operation is performed synchronously (i.e., blocking other connection/disconnection operations) to prevent race conditions causing the deletion of incorrect data. Yet, to avoid blocking forever in case of errors, we only allow a limited number of retries, and give up with a loud error if they all fail. As a safety measure, the removal of cached data can also be disabled via the hidden "disable-drain-on-disconnection" flag. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 01 July 2024, 07:30:30 UTC
057b2fc clustermesh: allow aborting cluster removal via dedicated context The removal of a cluster can take a significant amount of time, as in the case of the kvstoremesh logic introduced in subsequent commits. Hence, let's propagate a context that gets canceled upon clustermesh shutdown, to allow immediately interrupting ongoing removal processes if not yet terminated at that point. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 01 July 2024, 07:30:30 UTC
1e6136c clustermesh: make cluster disconnection asynchronous Currently, both cluster configuration upsertions and removals are handled synchronously. However, while upsertions are negligible as they only trigger the restart of a controller, removals can take a significant amount of time, possibly blocking any other parallel operations. This applies both to the agents logic, due to the draining of all previously known entries, and to kvstoremesh, as the subsequent commits will implement the removal of the cached data upon disconnection. Let's introduce the support for processing the configurations of separate clusters in parallel by running the removal logic asynchronously. Still, we need to ensure that events for the same cluster continue to be processed in order, to prevent issues in case the configuration of a given cluster is first removed and then subsequently added back again while the removal is still in progress. In this case, we delay the addition event and replay it only once the removal terminated. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 01 July 2024, 07:30:30 UTC
b780df6 check-ipsec-leak: additionally output TCP flags Let's additionally output the TCP flags in case of leaked traffic, as potentially useful while troubleshooting possible flakes. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 01 July 2024, 07:08:39 UTC
483b009 check-ipsec-leak: output whether detected traffic matched an override Normally, the script only flags traffic whose source and destination IP addresses belong to the PodCIDR and, when encapsulation is enabled, don't match the CiliumInternalIPs specified as parameters. However, this filter is overridden when the traffic comes from a proxy, so that it gets flagged even in case it is subsequently masqueraded. Let's additionally output whether displayed traffic got actually flagged due to this reason, to simplify troubleshooting possible flakes. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 01 July 2024, 07:08:39 UTC
40a4df7 check-ipsec-leak: output DNS query information when leaked We have been recently witnessing a few conformance ipsec runs reporting leaked packets, with some referring to DNS answers from a coredns pod to a CiliumInternalIP. To simplify troubleshooting these issues, and figure out whether they are legitimate or flakes, let's additionally print information about the DNS message itself, so that we can trace down which component performed the request. The output is along the lines of: [10:27:49:245997] 10.244.1.67:49662 -> 10.244.0.10:53 (proto: 17, encap: 1, ifindex: 43, netns: f0000000) [10:27:49:246003] Detected DNS message, ID: 17ef, Flags 120, QD: 1, AN: 0, NS: 0, AR: 1, query googlecom [10:27:49:246315] 10.244.0.10:53 -> 10.244.1.67:49662 (proto: 17, encap: 1, ifindex: 45, netns: f0000000) [10:27:49:246317] Detected DNS message, ID: 17ef, Flags 8580, QD: 1, AN: 1, NS: 0, AR: 1, query googlecom Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 01 July 2024, 07:08:39 UTC