Revision Author Date Message Commit Date
981c761 ipsec: Switch to slog Refactor the ipsec code to use slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
53fc716 sysctl: Switch to slog Refactor to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
698026d linux: Switch linuxNodeHandler to use slog Refactor the linuxNodeHandler to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
59fb337 devices: Switch to slog Refactor the DeviceManager to use slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
5c57430 config: Switch to slog Refactor to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
949f983 bigtcp: Switch to slog Refactor to use the slog logging Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
1b57a7a bandwidth: Switch to slog Refactor to use the slog logging. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 10:39:39 UTC
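The "Switch to slog" commits above move components to Go's structured `log/slog` package. As a rough illustration of what that style of logging looks like (not the actual Cilium code), the sketch below uses only the standard library. The names `newLogger`, `logSysctl` and `render`, and the sysctl name used with them, are hypothetical.

```go
package main

import (
	"bytes"
	"log/slog"
)

// newLogger returns a text-format slog.Logger writing to buf.
// The time attribute is dropped so that the output is deterministic.
func newLogger(buf *bytes.Buffer) *slog.Logger {
	h := slog.NewTextHandler(buf, &slog.HandlerOptions{
		ReplaceAttr: func(groups []string, a slog.Attr) slog.Attr {
			if len(groups) == 0 && a.Key == slog.TimeKey {
				return slog.Attr{} // a zero Attr is discarded by the handler
			}
			return a
		},
	})
	return slog.New(h)
}

// logSysctl (hypothetical) logs a message with structured key/value
// attributes instead of a pre-formatted string.
func logSysctl(log *slog.Logger, name, value string) {
	log.Info("Setting sysctl", "name", name, "value", value)
}

// render (hypothetical) logs one record through a fresh logger and
// returns the resulting text line.
func render(name, value string) string {
	var buf bytes.Buffer
	logSysctl(newLogger(&buf), name, value)
	return buf.String()
}
```

Calling `render("net.ipv4.ip_forward", "1")` yields a single line of `key=value` pairs, which is what makes slog output easy to filter by attribute.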
63e90a8 fix(deps): update all go dependencies main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 10:31:06 UTC
7de5df0 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 17 June 2024, 09:58:04 UTC
fd4134b chore(deps): update docker.io/library/golang:1.22.4 docker digest to c2010b9 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 09:58:04 UTC
304b7fd operator: ignore identity delete conflicts Ignore CiliumIdentity delete conflicts during the gc run (by skipping deletion and emitting a warning), allowing gc to continue if a subset of identities are conflicted. Prior to this change conflicts would cause gc to error, which could lead to an unexpected accumulation of stale CiliumIdentity objects. Signed-off-by: Jacob Henner <henner@arcesium.com> 17 June 2024, 09:39:38 UTC
6a63598 doc: Update doc for CRD CiliumNodeConfig from v2alpha1 to v2 Signed-off-by: Donia Chaiehloudj <donia.cld@isovalent.com> 17 June 2024, 09:18:28 UTC
77ea8e9 hubble/cli: add --node-label Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
0de97e7 hubble: add node label filter Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
9b6ee33 hubble: wire the localNodeWatcher in the observer setup Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
6db7004 hubble/observer: add a local node watcher This commit introduces the observer LocalNodeWatcher, which caches the local node information to be filled in Hubble flows. Because the label representation differs between the internal node.LocalNode struct and Hubble flows (a map and a key=val slice, respectively), we need to maintain a cache in order to avoid rebuilding the labels slice for each flow. The LocalNodeWatcher aims to solve this and can be hooked to the observer's OnGetFlows. Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
d6fb43b hubble/api: add node_labels to Hubble flows This is the first commit of a set introducing local node labels to Hubble flows. Filtering flows emitted from nodes having particular labels can be useful to debug the Egress Gateway feature: combined with the recently added network interface filter and/or SNAT IP filter one could then see egress flows related to a given CiliumEgressGatewayPolicy. Signed-off-by: Alexandre Perrin <alex@isovalent.com> 17 June 2024, 09:15:27 UTC
09ac42f chore(deps): update all lvh-images main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 09:07:11 UTC
badf925 Add appArmorProfile to the securityContext as well Signed-off-by: Aurelien Benoist <aurelien.larcin@gmail.com> 17 June 2024, 08:35:00 UTC
3790121 add securityContext for cronjob & disable hostNetwork Signed-off-by: Aurelien Benoist <aurelien.larcin@gmail.com> 17 June 2024, 08:35:00 UTC
ba713d0 fix(deps): update module github.com/aws/aws-sdk-go-v2/service/ec2 to v1.164.1 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 08:26:04 UTC
f5129a2 ui: v0.13.1 release Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> 17 June 2024, 08:25:55 UTC
26df509 vendor: pin StateDB to version v0.1.0 Time to introduce versioning to StateDB as there are some API cleanups coming and we want to control when renovate tries to bump StateDB. Signed-off-by: Jussi Maki <jussi@isovalent.com> 17 June 2024, 08:14:33 UTC
fe05a39 make: explicitly specify default target to build Cilium's go binaries The blamed commit introduced an include statement for Makefile.override in Makefiles specific to building Cilium's go binaries, to enable optionally overriding variables. However, this changed the behavior of executing `make -C folder` (i.e., without specifying an explicit target, as we do in the clustermesh-apiserver Dockerfile for instance) in case Makefile.override contains any target. Indeed, make executes by default the first target that it sees. Let's address this discrepancy and avoid unexpected changes by explicitly configuring the default target to be executed to `all`. Fixes: 811cb7f0273e ("make: Add include to Makefile.override within binary-specific makefiles") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 17 June 2024, 08:28:59 UTC
e25047a make: drop leftover test statement in cilium-dbg Makefile Fixes: 811cb7f0273e ("make: Add include to Makefile.override within binary-specific makefiles") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 17 June 2024, 08:28:59 UTC
0b29687 operator/dockerfile: correctly propagate modifiers The blamed commit replaced several docker build arguments with a single, generic one named MODIFIERS. However, it didn't update the operator Dockerfile to correctly use it. Let's fix it. Fixes: c4aebae89528 ("docker, ci: Create generalized MODIFIERS build arg") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 17 June 2024, 08:28:59 UTC
caadba1 helm: loadBalancerClass for Cluster Mesh APIserver * Added `loadBalancerClass` Helm value for the Cluster Mesh APIserver Kubernetes Service. * Refactored the existing `loadBalancerIP` Helm value so it's clearer that it exists (instead of having it commented out in the values.yaml file). Signed-off-by: Philip Schmid <phisch@cisco.com> 17 June 2024, 07:47:23 UTC
df6609f test: Add check in tunnelMapInit to only set tunnel map if nil The function tunnel.TunnelMap calls a sync.Once to set the variable tunnel.tunnelMap before returning it. This conflicts with the behavior of the tunnel.SetTunnelMap function however, for if tunnel.SetTunnelMap is called to manually set the tunnel map in a test, and the test then calls tunnel.TunnelMap to grab a reference to the map, tunnel.TunnelMap will overwrite the map created by tunnel.SetTunnelMap. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 17 June 2024, 07:39:56 UTC
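The interaction described in df6609f can be sketched in a few lines of standalone Go. This is a simplified model under assumed names (`tunnelMap`, `SetMap`, `Map`), not the real `pkg/maps/tunnel` code: the lazy initializer guarded by `sync.Once` only creates a default map when nothing has been set manually.

```go
package main

import "sync"

// tunnelMap is a hypothetical stand-in for the real map type.
type tunnelMap struct{ name string }

var (
	once    sync.Once
	current *tunnelMap
)

// SetMap manually installs a map, as a test might do.
func SetMap(m *tunnelMap) { current = m }

// Map lazily initializes the default map exactly once, but only if no
// map has been installed beforehand. Without the nil check, the first
// call to Map would overwrite whatever SetMap installed.
func Map() *tunnelMap {
	once.Do(func() {
		if current == nil {
			current = &tunnelMap{name: "default"}
		}
	})
	return current
}
```

With the nil check in place, a test that calls `SetMap` first and `Map` afterwards gets its own map back rather than a freshly created default.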
1395179 test: Attempt to unpin existing tunnel map in SetTunnelMap The function tunnel.SetTunnelMap is used in go tests to manually set the tunnel map, however this function relies on the caller to clean up the map when it is no longer needed. This commit modifies this function to try and unpin any existing tunnel map before setting a new one, in order to address cases where a leftover map may still be pinned. See https://github.com/cilium/cilium/actions/runs/8193113970/job/22406307012 for an example flake that may be related to this issue. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 17 June 2024, 07:39:56 UTC
0c54e45 test: Check error from unpinning tunnel map in TestClusterAwareAddressing This commit modifies the test TestClusterAwareAddressing in pkg/maps/tunnel/tunnel_test.go to check the error returned by the tunnel map when it is unpinned. Before, this error was ignored. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 17 June 2024, 07:39:56 UTC
c0a8f30 chore(deps): update dependency renovatebot/renovate to v37.409.1 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 17 June 2024, 07:30:56 UTC
c2b8f48 datapath: Add support for skipping direct routes on different L2 networks Previously, when a cluster ran with native-networking and had multiple zones, it wasn't possible to enable auto direct routes. This caused a bottleneck for same-zone traffic as it always had to be routed through the gw. With this new flag, any direct routes for nodes on different L2 networks will be skipped. Cilium will add routes for nodes on the same L2 and not exit. Fixes: #31124 Signed-off-by: Jonny <jonny@linkpool.io> 17 June 2024, 07:30:25 UTC
72ffe0f clustermesh: extract operator logic into a specific package This commit extracts all the clustermesh specific code into a package called operator. The aim of this is to provide clustermesh specific features to other package launched in the operator. The first user for this besides endpointslicesync would be the mcsapi package. Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr> 17 June 2024, 07:09:53 UTC
de9a02a .github/workflows: pin renovate version To avoid unwanted breakages from renovate, we should also pin renovate version and let it be updated on a weekly basis like the other dependencies. Signed-off-by: André Martins <andre@cilium.io> 16 June 2024, 08:39:19 UTC
664cffe workflows: e2e-upgrade: fix EXTRA parameters use an array for EXTRA to allow extending it properly Signed-off-by: Gilberto Bertin <jibi@cilium.io> 16 June 2024, 08:07:13 UTC
412a46c renovate: run post upgrade tasks on Makefile.values Run postUpgradeTasks after modifying install/kubernetes/Makefile.values file. Signed-off-by: André Martins <andre@cilium.io> 15 June 2024, 10:10:52 UTC
ffb8443 ci: fix ces migration test trigger and conn-disrupt usage PR #32930 introduced a change to the conn-disrupt test that caused this migration test to fail. This PR updates the test to work with the updated test format. The inconsistency was not caught because of the path filters used in the test, so the path filters have been updated to exclude only Documentation/ and test/. Fixes: #32268 Signed-off-by: jshr-w <shjayaraman@microsoft.com> 15 June 2024, 08:59:05 UTC
5e305af images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 15 June 2024, 09:00:23 UTC
1a06642 chore(deps): update docker.io/library/golang:1.22.4 docker digest to 0f76912 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 15 June 2024, 09:00:23 UTC
3a0a3fc Choose optimal algorithm depending on input size This adds a check to the ImmSet methods Insert and Delete for whether a single element or multiple elements are being inserted or deleted. Depending on that, two different algorithms are used. For a single element, both algorithms are linear in the size of the ImmSet, but we choose the one that benchmarking shows to be faster. Signed-off-by: Damian Sawicki <dsawicki@google.com> 15 June 2024, 08:22:04 UTC
304f8b1 Replace existing ImmSet methods with proposed ones The rationale is given in the previous commit message. Signed-off-by: Damian Sawicki <dsawicki@google.com> 15 June 2024, 08:22:04 UTC
8f5bbad Add and benchmark alternative immset methods This commit adds alternative implementations of methods of ImmSet: * InsertNew(xs ...T) * DeleteNew(xs ...T) * UnionNew(s2 ImmSet[T]) * DifferenceNew(s2 ImmSet[T]) and benchmarks these implementations against the existing ones. Benchmarking results: * for Insert, the proposed method becomes faster already with the container of size 1000, and then it performed 10x faster for size 10,000 and 100x faster for size 100,000; * for Delete, the proposed method becomes faster already with the container of size 1000, and then it performed ~5x faster for size 10,000; * for Difference, the proposed method was already 4x faster for size 100, and then it performed 7x faster for size 1000, 35x faster for size 10,000, and 193x faster for size 100,000; * for Union, the proposed method performs slightly faster, but gains do not visibly grow with increasing size. Theoretically, the proposed solutions have improved computational complexity: * the complexity of Insert is O(len(s.xs)*len(xs)), and the complexity of InsertNew is O(len(s.xs)+len(xs)); * the complexity of Delete is O(len(s.xs)*len(xs)), and the complexity of DeleteNew is O(len(s.xs)+len(xs)); * the complexity of Difference is O(len(s.xs)*len(s2.xs)) because it uses Delete internally, and the complexity of DifferenceNew is O(len(s.xs)+len(s2.xs)); * the complexity of Union is harder to estimate: it involves sorting a slice of size n=len(s.xs)+len(s2.xs), but this slice is a concatenation of two sorted slices, so most likely this does not lead to the usual O(n*log(n)) complexity; of course, it is at least O(n); the complexity of UnionNew is O(n). Signed-off-by: Damian Sawicki <dsawicki@google.com> 15 June 2024, 08:22:04 UTC
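The linear bounds quoted in 8f5bbad are what a single merge pass over two sorted slices gives. The sketch below is an assumption about the general technique, not the actual ImmSet implementation. The function names `insertMerged` and `deleteMerged` are hypothetical, and both inputs are assumed to be sorted ascending and free of duplicates.

```go
package main

import "cmp"

// insertMerged returns a new sorted, duplicate-free slice holding the
// elements of both inputs. It walks each input once, so the cost is
// O(len(sorted)+len(added)).
func insertMerged[T cmp.Ordered](sorted, added []T) []T {
	out := make([]T, 0, len(sorted)+len(added))
	i, j := 0, 0
	for i < len(sorted) && j < len(added) {
		switch {
		case sorted[i] < added[j]:
			out = append(out, sorted[i])
			i++
		case sorted[i] > added[j]:
			out = append(out, added[j])
			j++
		default: // equal: keep a single copy
			out = append(out, sorted[i])
			i++
			j++
		}
	}
	out = append(out, sorted[i:]...)
	return append(out, added[j:]...)
}

// deleteMerged returns the elements of sorted that do not occur in
// removed, again in one pass over each input.
func deleteMerged[T cmp.Ordered](sorted, removed []T) []T {
	out := make([]T, 0, len(sorted))
	j := 0
	for _, x := range sorted {
		// Skip candidates for removal that are smaller than x.
		for j < len(removed) && removed[j] < x {
			j++
		}
		if j < len(removed) && removed[j] == x {
			continue // x is to be removed
		}
		out = append(out, x)
	}
	return out
}
```

A loop that inserts or deletes elements one at a time shifts the tail of the slice on every step, which is where the multiplicative O(len(s.xs)*len(xs)) cost of the older methods comes from.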
5aa52b0 feat: Configure static cilium network policy Cilium reads the CNP YAML if `static-cnp-path` is specified in the cilium config. It converts the YAML to rules and adds those rules to the policy engine. This allows an admin to configure policy that disallows traffic to certain secure infrastructure endpoints from pods running in the cloud. Signed-off-by: Tamilmani <tamanoha@microsoft.com> 15 June 2024, 08:17:50 UTC
4195bdc chore: Bump spire agent and server versions Ideally, this should be taken care of by renovate bot with postUpgradeTask; however, there are some issues with the configuration. Signed-off-by: Tam Mach <tam.mach@cilium.io> 15 June 2024, 08:14:22 UTC
268d28e dev-doctor: update hint links Signed-off-by: renyunkang <rykren1998@gmail.com> 15 June 2024, 07:56:46 UTC
cf63069 k8s: Fix usage of assert in TestWaitForCacheSyncWithTimeout TestWaitForCacheSyncWithTimeout is relying on subtests, so a new assert object should be derived from each subtest testing.T, in order to report the failing one correctly. Temporarily changing the code to force a failure in the "Not invoking BlockWaitGroupToSyncResources should cause wait to succeed immediately" case, it can be seen that no subtest is reported as failing: --- FAIL: TestWaitForCacheSyncWithTimeout (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Waiting_for_no_resources_should_always_sync (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Not_invoking_BlockWaitGroupToSyncResources_should_cause_wait_to_succeed_immediately (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_timeout_due_to_watched_resource_exceeding_timeout (0.20s) --- PASS: TestWaitForCacheSyncWithTimeout/Any_one_timeout_should_cause_error (0.60s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_complete_due_to_event_causing_timeout_to_be_extended_past_initial_timeout (0.70s) While after this change the expected subtest is correctly reported as the offending one: --- FAIL: TestWaitForCacheSyncWithTimeout (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Waiting_for_no_resources_should_always_sync (0.00s) --- FAIL: TestWaitForCacheSyncWithTimeout/Not_invoking_BlockWaitGroupToSyncResources_should_cause_wait_to_succeed_immediately (0.00s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_timeout_due_to_watched_resource_exceeding_timeout (0.20s) --- PASS: TestWaitForCacheSyncWithTimeout/Any_one_timeout_should_cause_error (0.60s) --- PASS: TestWaitForCacheSyncWithTimeout/Should_complete_due_to_event_causing_timeout_to_be_extended_past_initial_timeout (0.70s) Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 14 June 2024, 20:16:13 UTC
33d64e1 gateway-api: Update docs for v1.1.0 This is the follow-up item for #32233. Relates: https://github.com/cilium/cilium/pull/32233 Signed-off-by: Tam Mach <tam.mach@cilium.io> 14 June 2024, 16:59:43 UTC
2fdd17b policy: Fix Rare Deny Merge Bug While fixing the distillery tests I noticed that the test "broad_allow_is_a_portproto_subset_of_a_specific_deny" was failing intermittently. This revealed that in very rare circumstances the policy engine would merge a deny entry into a pre-existing allow that was set to be deleted completely (not just have an ownership entry removed). Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
2710e71 policy: Fix Deny Precedence Distillery Tests Deny precedence tests have not been using correct comparison logic for mapstates. This updates it to use the correct logic. It makes the named-port sub-test and port-range tests more legible. Some test data has been renamed because it was incorrectly named. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
9b57c0b policy: Fix Proxy Port Tests The default proxy port is 1. The dummy proxy port value of 4242 should no longer be the default. Additionally, almost all distillery tests have had their comparison logic fixed to compare mapstates. The last distillery test to be fixed will be fixed in another commit. Co-authored-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
8458e5c policy: Fix InvertedPortMask Logic Inverted port masks were introduced in a previous commit, but they did not take into account all of the wildcard policy keys that are hardcoded both in tests and, unfortunately, in production code. This fixes all InvertedPortMask logic so that wild cards are correctly instantiated and accounted for. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 16:29:33 UTC
f174b5e recorder: move API handler into cell Currently, the recorder API handler is implemented by the daemon. This commit extracts the implementation and moves it into the recorder hive cell. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 16:32:04 UTC
248a9be recorder: use hive injected logger This commit replaces the static logger with the one that has been injected via Hive Framework. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 16:32:04 UTC
58dfad2 recorder: introduce cell for recorder This commit introduces a hive cell for the pcap recorder. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 16:32:04 UTC
736af68 bpf: extract ethertype in to-netdev / to-overlay just once Remove some duplicated logic. Ideally we would only do the extraction if at least *one* feature is enabled that actually requires the ethertype. But managing these dependencies doesn't seem worth the hassle. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 14:43:48 UTC
ae4ffd4 bpf: eth: extract eth_is_supported_ethertype() helper In preparation for a subsequent patch. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 14:43:48 UTC
aa10df3 kvstore: correctly assign permissions to single key, rather than prefix Fix the recently introduced rangeForKey function leveraged by the etcdinit logic to only grant permissions to the given key, and not the full prefix starting with that key, when appropriate. Fixes: cb6a58bef00b ("clustermesh: granular etcd permissions for kvstoremesh cached data") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 14 June 2024, 13:03:22 UTC
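The difference between a single-key grant and a prefix grant in aa10df3 comes down to the range end passed to etcd: an empty range end addresses exactly one key, while a prefix is expressed as the half-open range from the key to the key with its last byte incremented. The sketch below shows that convention in isolation. The helper `rangeEndFor` and the key used with it are hypothetical; this is not Cilium's `rangeForKey`, whose rule for deciding when a prefix is appropriate is not shown in the commit message.

```go
package main

// rangeEndFor returns an etcd-style range end for key.
// For a single key it is empty; for a prefix it is the smallest key
// that sorts after every key starting with that prefix.
func rangeEndFor(key string, prefix bool) string {
	if !prefix {
		return "" // empty range end: permission applies to this key only
	}
	end := []byte(key)
	for i := len(end) - 1; i >= 0; i-- {
		if end[i] < 0xff {
			end[i]++
			return string(end[:i+1])
		}
	}
	// Empty key or all bytes 0xff: "\x00" denotes "all keys >= key".
	return "\x00"
}
```

For example, `rangeEndFor("cilium/state/nodes/v1/", true)` returns `"cilium/state/nodes/v10"`, because `'/'` is immediately followed by `'0'` in byte order. With `prefix` set to `false`, the same key gives an empty range end.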
96336a0 lrp: move api handler from daemon to lrp hive cell Currently, the daemon implements the LRP API handler. With it comes the last dependency to the LRP manager from the daemon. Therefore, this commit moves the LRP API handler from the daemon to the existing LRP hive cell and removes the dependency from the daemon to the LRP manager. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 13:03:05 UTC
d21f654 bpf: test Wireguard with ENCRYPTION_STRICT_MODE Run compile & complexity tests for Wireguard with Strict-mode enabled. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 12:21:24 UTC
4cd716c bpf: clean up compile testing for ENABLE_WIREGUARD Wireguard completely lives in bpf_host. Remove the config from bpf_overlay, bpf_xdp and bpf_lxc. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 12:21:24 UTC
572bca4 bpf: l3: remove unused loopback code in local-delivery The `hairpin_flow` parameter was previously needed so that loopback replies could bypass the ingress network policy. But now that all callers set this parameter to `false`, we can safely remove the corresponding logic. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 11:45:10 UTC
34337f0 bpf: lxc: simplify RevNAT path for loopback replies The usual flow for handling service traffic to a local backend is as follows: * requests are load-balanced in from-container. This entails selecting a backend (and caching the selection in a CT_SERVICE entry), DNATing the packet, creating a CT_EGRESS entry for the resulting `client -> backend` flow, applying egress network policy, and local delivery to the backend pod. As part of the local delivery, we also create a CT_INGRESS entry and apply ingress network policy. * replies bypass the backend's egress network policy (because the CT lookup returns CT_REPLY), and pass to the client via local delivery. In the client's ingress path they bypass ingress network policy (the packets match as reply against the CT_EGRESS entry), and we apply RevDNAT based on the `rev_nat_index` in the CT_EGRESS entry. For a loopback connection (where the client pod is selected as backend for the connection) this looks slightly more complicated: * As we can't establish a `client -> client` connection, the requests are also SNATed with IPV4_LOOPBACK. Network policy in forward direction is explicitly skipped (as the matched CT entries have the `.loopback` flag set). * In reply direction, we can't deliver to IPV4_LOOPBACK (as that's not a valid IP for an endpoint lookup). So a reply already gets fully RevNATed by from-container, using the CT_INGRESS entry's `rev_nat_index`. But this means that when passing into the client pod (either via to-container, or via the ingress policy tail-call), the packet doesn't match as reply to the CT_EGRESS entry - and so we don't benefit from automatic network policy bypass. We ended up with two workarounds for this aspect: (1) when to-container is installed, it contains custom logic to match the packet as a loopback reply, and skip ingress policy (see https://github.com/cilium/cilium/pull/27798). (2) otherwise we skip the ingress policy tailcall, and forward the packet straight into the client pod. 
The downside of these workarounds is that we bypass the *whole* ingress program, not just the network policy part. So the CT_EGRESS entry doesn't get updated (lifetime, statistics, observed packet flags, ...), and we have the hidden risk that when we add more logic to the ingress program, it doesn't get executed for loopback replies. This patch aims to eliminate the need for such workarounds. At its core, it detects loopback replies in from-container and overrides the packet's destination IP. Instead of attempting an endpoint lookup for IPV4_LOOPBACK, we can now look up the actual client endpoint - and deliver to the ingress policy program, *without* needing to early-RevNAT the packet. Instead the replies follow the usual packet flow, match the CT_EGRESS entry in the ingress program, naturally bypass ingress network policy, and are *then* RevNATed based on the CT_EGRESS entry's `rev_nat_index`. Consequently we follow the standard datapath, without needing to skip over policy programs. The CT_EGRESS entry is updated for every reply. Thus we can also remove the manual policy bypass for loopback replies, when using per-EP routing. It's no longer needed and in fact the replies will no longer match the lookup logic, as they haven't been RevNATed yet. This effectively reverts e2829a061a53 ("bpf: lxc: support Pod->Service->Pod hairpinning with endpoint routes"). Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 11:45:10 UTC
b66862a endpoint: Fix Policy Sync Method The endpoint package had the last code in the Cilium repository that was updating mapstate during iteration. Updating mapstate during iteration will result in corrupting the mapstate. Fixes: #32959 Signed-off-by: Nate Sweet <nathanjsweet@pm.me> 14 June 2024, 11:27:46 UTC
42ee1f6 egressgw: skip egressgw handling if the packet is from host The egress gateway handling code at bpf_host only cares about packets from the egress proxy, so we can ignore the packets from the host. Signed-off-by: Yusuke Suzuki <yusuke.suzuki@isovalent.com> 14 June 2024, 11:14:28 UTC
7f6df2d prefilter: move api handler from daemon to prefilter hive cell Currently, the daemon implements the Prefilter API handler. With it comes the last dependency from the daemon to the prefilter itself. Therefore, this commit moves the prefilter API handler from the daemon to the existing prefilter hive cell and removes the dependency from the daemon to the prefilter. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 11:08:23 UTC
9f210b5 proxy: Persist proxy ports in /var/run/cilium/state Write proxy ports into /var/run/cilium/state/proxy_ports_state.json and restore from there. Only if the file is not available, restore from iptables rules. Restoration from iptables is still needed for upgrades. This fixes the issue with bpf TPROXY, as it does not rely on iptables rules, so the proxy ports cannot be recovered from them. Note: File IO is patterned after similar methods for cache.CachingIdentityAllocator. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
d11e4d2 proxy: Reuse proxy ports from datapath on restart Previously we only reused the DNS proxy port from the datapath. Change to reuse the old proxy ports for all proxy redirects from datapath, if possible. This reduces Listener churn on daemonset Envoy on agent restart. With this change the listener update in Envoy logs after agent restart looks like this: begin add/update listener: name=cilium-http-egress:13563 hash=11560876577076369351 duplicate/locked listener 'cilium-http-egress:13563'. no add/update lds: add/update listener 'cilium-http-egress:13563' skipped Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
8149af7 proxy: Remove deprecated "localOnly" We now only ever have 'localOnly = true' so that we always bind the proxy listeners to localhost address. Remove support for all-hosts addresses. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
ee1cf1c proxy: Fix AllocateProxyPort Rename AllocateProxyPort as AllocateCRDProxyPort to make it clear that it is only to be used for CEC/CCEC CRD listeners. Also enforce the current practice where CRD proxy ports are only accessed by name, and 'ingress' member is always false for CRD ports. Fix the unit test to verify that the proxy port was not reallocated for the same listener name. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
bfbd3d6 proxy: Try previously used port first "rulesPort" is non-zero when we have a datapath (iptables) rules with a specific port. Try re-creating a redirect with that port if no other port is configured. This reduces churn on the datapath, as the existing rule can be reused without a need for an update. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
00b7717 proxy: Clear the proxy port on failure Clear proxy port on failure so that we'll try another random port next time. We have two failure cases for creating new redirects: - synchronous, for DNS proxy. Simply zero the proxy port in the retry loop. - asynchronous, for Envoy proxy. Clear proxy port state on revert callback. Noticed the need for this change when accidentally trying to use port 1 for Envoy when developing another feature. In this case Envoy can't bind the port, and all further tries were also trying the same port, as the proxy port was not cleared on revert. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 14 June 2024, 10:12:59 UTC
b30a3a9 ipcache: Fix orphaned ipcache entries when mixing Upsert and Inject When a prefix is initially created by the synchronous Upsert() API, it is flagged as such so that InjectLabels() knows it is shared. However, this flag is not removed if the legacy caller releases all references to this prefix. Thus, the timeline 1. AllocateCIDRs("1.1.1.1/32") 2. UpsertPrefixes("1.1.1.1/32") 3. ReleaseCIDRIdentities("1.1.1.1/32") 4. RemovePrefixes("1.1.1.1/32") leaves us with the prefix still in the ipcache, but the identity fully released. This leads to traffic drops, as the identity is unknown to the policy system and thus not present in the BPF policymaps. The fix is to forcibly remove the prefix if the identity reference reaches zero and the prefix is not in the metadata layer. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 14 June 2024, 09:55:50 UTC
e879425 ctmap: dump CT entry's BackendID Service connections store their selected backend ID in the SVC-type CT entry. Dump this field on `cilium-dbg bpf ct list global`. This then looks like: TCP SVC 10.244.0.62:55394 -> 10.96.0.1:443 expires=158116 ... BackendID=1 Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 07:25:53 UTC
fd29852 fqdn: Exit go routines early if datapath update times out This commit changes the `UpdateGenerateDNS` function to return a errgroup instead of a waitgroup. This allows the caller to determine after `Wait` if the datapath update timed out or not. This then allows `UpdateGenerateDNS` to stop the errgroup, allowing the go routine spawned in `notifyOnDNSMsg` to exit early as well upon cancellation. This way, the go routine in `notifyOnDNSMsg` does not need to linger until the datapath update has finished. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 June 2024, 07:16:43 UTC
b9152ae ipcache: Allow WaitForRevision to be cancelled This adds a context argument to WaitForRevision, allowing callers to abort the call. The logic for the cancellation of the `Cond` var using `context.AfterFunc` was taken from the Go docs: https://pkg.go.dev/context#AfterFunc The commit also fixes an issue with the existing unit test where it waited for revision 0 instead of revision 1. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 June 2024, 07:16:43 UTC
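The `context.AfterFunc` pattern referenced in b9152ae can be sketched as follows. This is a minimal standalone model following the example in the Go documentation, not the ipcache code itself. The type `revisionTracker` and its methods, including the `waitAtMost` convenience wrapper, are hypothetical.

```go
package main

import (
	"context"
	"sync"
	"time"
)

// revisionTracker (hypothetical) holds a monotonically increasing revision.
type revisionTracker struct {
	mu   sync.Mutex
	cond *sync.Cond
	rev  uint64
}

func newRevisionTracker() *revisionTracker {
	r := &revisionTracker{}
	r.cond = sync.NewCond(&r.mu)
	return r
}

// Bump increments the revision and wakes all waiters.
func (r *revisionTracker) Bump() {
	r.mu.Lock()
	r.rev++
	r.mu.Unlock()
	r.cond.Broadcast()
}

// WaitForRevision blocks until the revision reaches want or ctx is done.
func (r *revisionTracker) WaitForRevision(ctx context.Context, want uint64) error {
	// Take the lock before registering the callback, so the wake-up
	// broadcast cannot fire before this goroutine is inside cond.Wait.
	r.mu.Lock()
	defer r.mu.Unlock()

	stop := context.AfterFunc(ctx, func() {
		r.mu.Lock()
		defer r.mu.Unlock()
		r.cond.Broadcast() // wake waiters so they can observe ctx.Err()
	})
	defer stop()

	for r.rev < want {
		r.cond.Wait()
		if err := ctx.Err(); err != nil {
			return err
		}
	}
	return nil
}

// waitAtMost bounds the wait with a timeout derived from d.
func (r *revisionTracker) waitAtMost(want uint64, d time.Duration) error {
	ctx, cancel := context.WithTimeout(context.Background(), d)
	defer cancel()
	return r.WaitForRevision(ctx, want)
}
```

`sync.Cond` has no built-in cancellation, so the callback registered with `context.AfterFunc` broadcasts on the condition variable when the context ends, and the waiter checks `ctx.Err()` each time it wakes up.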
8868b39 renovate: Add the configuration for spire images Signed-off-by: Tam Mach <tam.mach@cilium.io> 14 June 2024, 07:03:14 UTC
8482b03 docs: egressgw: remove stale enable-l7-proxy option This option was suggested to deal with an incompatibility between EGW and L7 policies. The incompatibility has been addressed by https://github.com/cilium/cilium/pull/32828. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 06:50:40 UTC
b6f6867 daemon: remove unnecessary method DebugEnabled The method `Daemon.DebugEnabled` is used in two cases. 1. In an optionChanged callback that can directly access the config property 2. As hubble observer option that is actually no longer used Therefore, this commit removes the unnecessary method from the daemon. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 June 2024, 06:11:51 UTC
5e27f67 bgpv2: pass types.Router in path and policy reconcilers This change passes only required field to Policy and Path reconcilers. Instead of passing BGPInstance, we pass only the Router interface which is required by underlying implementation. Signed-off-by: harsimran pabla <hpabla@isovalent.com> 14 June 2024, 05:33:40 UTC
3e30619 bpf: move tunnel map to encap.h Declutter the maps.h header, and reduce the usage of HAVE_ENCAP. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 04:25:14 UTC
dfb6b94 bpf: move throttle map to edt.h Declutter the maps.h header, and reduce the usage of ENABLE_BANDWIDTH_MANAGER Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 04:25:14 UTC
50c38aa bpf: move egressgw map to egress_gateway.h Declutter the maps.h header, and reduce the usage of ENABLE_EGRESS_GATEWAY. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 04:25:14 UTC
2a0bc76 bpf: encap: fix ifindex in TO_OVERLAY trace notification The encap helpers were meant to abstract from differences between TC and XDP. Therefore ctx_set_encap_info() provides the ifindex for TC, and setting the ifindex also indicates that a redirect to encap interface is possible (rather than manually adding the overlay headers + FIB lookup). The downside is that __encap_with_nodeid() for TC currently emits a trace notification without the ifindex set to ENCAP_IFINDEX. Fix this up manually by moving the `ifindex` initialization up. Reported-by: Tomasz Tarczyński <tomasz.tarczynski@isovalent.com> Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 June 2024, 04:22:26 UTC
d107719 gateway-api: Check for matching controller name We should do nothing for a given Gateway resource until it is confirmed that its GatewayClass is managed by Cilium. Note that other watched and owned resources can bypass the predicate function NewPredicateFuncs(hasMatchingControllerFn), so the explicit check in the reconcile method is required. This commit performs the controller name check for the Gateway resource to avoid unnecessary and wrong reconciliation. Fixes: #31978 Signed-off-by: Tam Mach <tam.mach@cilium.io> 14 June 2024, 03:44:56 UTC
81adc5c remove tracking of backports with MLH With the sunset of GH projects by GH [1], we will now create organization-projects to track which PR is available on which release after a CHANGELOG of a release is performed. Thus, we can also sunset this feature from MLH. [1] https://github.blog/changelog/2024-05-23-sunset-notice-projects-classic/ Signed-off-by: André Martins <andre@cilium.io> 13 June 2024, 20:03:54 UTC
4b00124 fix(deps): update module github.com/aws/aws-sdk-go-v2/service/ec2 to v1.164.0 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 13 June 2024, 20:03:13 UTC
ef50a9a chore(deps): update all github action dependencies Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 13 June 2024, 17:25:01 UTC
bb0c48a daemon: remove unused method GetOptions This commit removes the unused method `GetOptions() *option.IntOptions` from the daemon. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 13 June 2024, 15:57:10 UTC
183f0fa iptables: Run an initial full reconciliation Run an initial full reconciliation before listening on partial reconciliation request channels like the ones related to proxy rules and no track pod rules. This avoids spurious errors at startup when a partial reconciliation request was seen by the reconciler before the 200 ms interval needed for the first full reconciliation. In that case, the partial reconciliation failed due to missing chains installed by the first full reconciliation. Reported-by: Dylan Reimerink <dylan.reimerink@isovalent.com> Suggested-by: Dylan Reimerink <dylan.reimerink@isovalent.com> Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 13 June 2024, 15:11:45 UTC
0b8f9e3 iptables: Fix usage of firstInit flag The firstInit flag is meant to avoid partial reconciliations (like the ones for proxy rules and no track pod rules) until the first full reconciliation is successfully completed. This is done to avoid trying a partial reconciliation before all the required chains have been created. The commit fixes the if conditions checking the flag, which turned out to be inverted. Though the reconciler was already able to recover with the next full reconciliation, this led to spurious errors during Cilium startup where partial reconciliations were attempted too soon. Reported-by: Dylan Reimerink <dylan.reimerink@isovalent.com> Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 13 June 2024, 15:11:45 UTC
38278b5 chore(deps): update dependency grpc-ecosystem/grpc-health-probe to v0.4.27 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 13 June 2024, 14:59:54 UTC
4358595 fix(deps): update all go dependencies main Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 13 June 2024, 14:55:36 UTC
f88fade chore(deps): update cilium/cilium-cli action to v0.16.10 Signed-off-by: cilium-renovate[bot] <134692979+cilium-renovate[bot]@users.noreply.github.com> 13 June 2024, 14:35:04 UTC
2fd9150 bpf: introduce CILIUM_PIN_REPLACE map pinning flag This commit adds support for the custom CILIUM_PIN_REPLACE pinning flag. It signals to the loader that a map should be pinned without being reused in subsequent ELF loads. This replaces bespoke cilium_calls_-specific logic in the loader with a generic map flag, opening it up for other use cases as well. The reasons for this behaviour are widely documented and present in the code for posterity. Also, give each netdev its own instance of cilium_calls. Sharing a tail call map across all XDP entry points causes multiple netdevs' programs to clobber the shared cilium_calls_xdp bpffs pin. Signed-off-by: Timo Beckers <timo@isovalent.com> 13 June 2024, 13:51:55 UTC
8f5e88f bpf: replace bpf map migration with commit mechanism At its inception, Cilium had an external ebpf loader (iproute2) that didn't deal with changes to map properties (type/k/v/maxentries/flags). To allow the agent to upgrade/downgrade maps, a 'map migration' system was introduced that would take the new ELF and compare its maps against their pinned counterparts on the system's bpffs. Incompatible maps were renamed using a ':pending' suffix to allow the loader to create and pin a new instance of the map at its old path. If all went well, the :pending map was removed. Even though it served us for many years, this system wasn't without its drawbacks, primarily the many moving parts (files) to manage on bpffs, as well as its obscuring of subtle bugs in managing tail call map lifecycle. This commit replaces the map migration system with a commit-based system that doesn't modify any bpffs-related resources until all of an ELF's entrypoints are attached and all cross-ELF tail calls (policy progs) have been inserted. After commit() has run for a Collection, only one copy of each map pin will be present on bpffs. This removes all possibility of previous ELF generations being partially attached somewhere, still handling traffic using an old tail call map. Such cases will now fail loudly with the 'missed tail call' metric increasing due to the old tail call map pins being removed. Signed-off-by: Timo Beckers <timo@isovalent.com> 13 June 2024, 13:51:55 UTC
8fae0eb bpf: deprecate legacy PIN_* constants for map definitions Treat the bpf_elf_map.pinning field just like BTF map definitions do, and replace PIN_GLOBAL_NS with LIBBPF_PIN_BY_NAME. Pass the values through directly when parsing MapSpec.Extra on the Go side. A future commit will assign meaning to a value higher than LIBBPF_PIN_BY_NAME. Signed-off-by: Timo Beckers <timo@isovalent.com> 13 June 2024, 13:51:55 UTC
0c0800c ci: make runtime privileged tests not run in parallel There was significant flakiness in IPSec-related privileged tests because tests in different packages were modifying xfrm states/policies concurrently. While increasing the test timeout and making the tests run longer is non-ideal, the benefit of less flaky tests outweighs it. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 13 June 2024, 13:35:20 UTC
c464e66 helm: mount kvstoremesh-specific certificate into cilium agents Let's additionally mount the kvstoremesh-specific certificate into cilium agents, so that it can be used to authenticate against the local etcd instance storing the cached data. The secret entry is always configured (although marked as optional), regardless of whether KVStoreMesh is actually enabled or not, so that it can be automatically mounted in case it gets subsequently enabled, without requiring a restart of the agents. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 13 June 2024, 13:01:41 UTC
9ffeba1 helm: generate dedicated certificate for kvstoremesh access Extend the helm chart to additionally generate the "local" certificate with the common name matching the newly introduced "local" etcd user, when kvstoremesh is enabled. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 13 June 2024, 13:01:41 UTC
cb6a58b clustermesh: granular etcd permissions for kvstoremesh cached data Currently, the same etcd user (i.e., remote) is granted permissions to read the whole content of the clustermesh-apiserver's sidecar etcd instance, including the data cached by kvstoremesh, when enabled. In an effort to harden the overall clustermesh posture, let's introduce a separate and dedicated user for local access, to ensure that remote clusters cannot access cached data, as it may include information that they would not normally have access to. Specifically, the remote user is intended to have access only to the information regarding the local cluster, while the local user can access cached data about remote clusters only. For backward compatibility purposes, however, the remote user retains access to cached data as well in this release. The reason is that there would otherwise be a time window upon upgrade in which Cilium Agents would lose access to the kvstoremesh data (especially in large clusters). Indeed, the new certificate would be mounted by the agents only upon rollout, but the configuration would be immediately reloaded (thus targeting the new, not yet mounted, certificate), hence breaking the access to the information cached by kvstoremesh. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 13 June 2024, 13:01:41 UTC