https://github.com/cilium/cilium

Revision Author Date Message Commit Date
35f206a Remove deadlock from AuthMap Endpoint GC This change keeps a copy of the endpoints map inside the Authentication GC code. It is updated on subscribe events and is then used in the AuthMap GC code instead of making a call that caused a deadlock. Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com> 06 December 2023, 12:06:14 UTC
d1128cf chore(deps): update dependency cilium/cilium-cli to v0.15.17 Signed-off-by: renovate[bot] <bot@renovateapp.com> 05 December 2023, 21:48:01 UTC
ec4e22c test: remove probes-test.sh probes-test.sh is a relic from the past that tests a script that is no longer executed and was removed (fbe98184) from the code base. Signed-off-by: Robin Gögge <r.goegge@isovalent.com> 05 December 2023, 19:48:36 UTC
2dc6aa7 guestbook: update example with leader/follower naming The guestbook in version v5 fails trying to connect to `redis-leader`. The reason is that the deployment is still named `redis-master`. Therefore, this commit renames the guestbook example deployments to use the leader/follower naming. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 05 December 2023, 19:48:10 UTC
9b3ab90 release image: Allow arbitrary pre-release identifiers Make the filter patterns for release image tags slightly more lenient to allow arbitrary pre-release identifiers beyond pre and rc. Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 05 December 2023, 19:02:56 UTC
654d92f ci-e2e: Use lvh-kind in secure way Otherwise, an external PR can inject malicious cmds in the action. Signed-off-by: Martynas Pumputis <m@lambda.lt> 05 December 2023, 15:46:22 UTC
cd93d37 ci-ipsec-upgrade: Use lvh-kind Reduces the GH workflow complexity, and will allow us to reuse the workflow for non-Kind based tests. Signed-off-by: Martynas Pumputis <m@lambda.lt> 05 December 2023, 15:46:22 UTC
f8e61f3 ci-ipsec-e2e: Use lvh-kind Signed-off-by: Martynas Pumputis <m@lambda.lt> 05 December 2023, 15:46:22 UTC
29515a7 daemon: Fix incorrect node and ciliumnode resource type in annotations Fix incorrect node and ciliumnode resource types in annotations, along with a typo. Signed-off-by: Huagong Wang <wanghuagong@kylinos.cn> 05 December 2023, 14:20:12 UTC
8691035 codeowners: use new teams cilium/envoy & cilium/fqdn This commit updates the codeownership for FQDN- & Envoy proxy related code with the new teams. * Core / Common -> `cilium/proxy` * FQDN (and integration) -> `cilium/fqdn` * Envoy (and integration) -> `cilium/envoy` Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 05 December 2023, 14:19:53 UTC
c01a860 examples: update guestbook example with new image registry The GCP Kubernetes Engine Samples migrated their image registry from Google Container Registry to Google Artifact Registry. Hence, the image gb-frontend from the guestbook example is no longer available. Therefore, this commit changes the example to use the new registry. Issue: https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/issues/209 Guestbook PR: https://github.com/GoogleCloudPlatform/kubernetes-engine-samples/pull/194 Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 05 December 2023, 12:24:10 UTC
3e44799 DNS Proxy: Adds UDP checksum to IPv6 Responses Fixes #28678 Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io> 05 December 2023, 12:22:50 UTC
24257ff clustermesh: add nodes to wait for ipcache synchronization 8de7707a706c ("endpoint: wait for clustermesh IPs/identities sync before regeneration") modified the endpoint regeneration logic to explicitly wait for ipcache and identities synchronization from all remote clusters (in addition to the local one) before starting the regeneration process, to avoid disrupting long running connections. Yet, that fix is not enough in case of pod-to-node connectivity, because the ipcache entries corresponding to the addresses of remote nodes (as well as the health and ingress IPs) are configured upon reception of the relevant node entry. Hence, let's extend the wait function to also wait for nodes synchronization in addition to IPs and identities, in order to ensure that the ipcache is fully synchronized before triggering the endpoint regeneration process. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 05 December 2023, 12:22:07 UTC
8aba384 ci: disable preemptible VM & GKE clusters on tests based on GKE This commit introduces the usage of normal VMs and GKE clusters for tests based on GCP running on a schedule basis. Preemptible machines & clusters are still for PR tests. This should reduce the flakiness. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 05 December 2023, 12:20:49 UTC
ea093d8 docs: specify which further release for fqdn option removal. This should have been made explicit. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 05 December 2023, 12:24:43 UTC
34ade5c bpf/Makefile: remove bear URL from gen_compile_commands make target The presence of the GitHub project URL for Bear got flagged in a GPL license check. Remove the bear URL from gen_compile_commands make target so the GPL license check can pass. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com> 05 December 2023, 11:31:18 UTC
7c7b723 health/server: Fix stale references to old nodes during health probe Given the order of operations in prober.OnIdle, it is possible for the health probe to have stale references to deleted nodes. When that occurs, node connectivity metrics which were previously deleted [1] would be brought back, causing confusion. If users defined alerts for node connectivity health checks metrics (see example below), then this would erroneously trigger because the old nodes would appear in the metric labels as a failing health check. Example given deletion of "kind-worker2" node: ``` cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-control-plane" target_node_type="remote_intra_cluster" type="endpoint" 1.000000 cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-control-plane" target_node_type="remote_intra_cluster" type="node" 1.000000 cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-worker" target_node_type="local_node" type="endpoint" 1.000000 cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-worker" target_node_type="local_node" type="node" 1.000000 cilium_node_connectivity_status source_cluster="kind-kind" source_node_name="kind-worker" target_cluster="kind-kind" target_node_name="kind-worker2" target_node_type="remote_intra_cluster" type="endpoint" 0.000000 ``` Fixes: d9e1ff897d ("cilium-health: Remove unnecessary goroutine") [1]: e9f97cd0e3 ("Ensures prometheus metrics associated with a deleted node are no longer reported.") Signed-off-by: Chris Tarazi <chris@isovalent.com> 04 December 2023, 23:31:55 UTC
4787f8e node/manager: Add info logs for added and deleted nodes Similar to how useful log msgs are when endpoints created and deleted, this log is useful for understanding when nodes are added and deleted in production clusters. Signed-off-by: Chris Tarazi <chris@isovalent.com> 04 December 2023, 23:31:55 UTC
773cdca mutual-auth: Bump SPIRE image version to 1.8.5 This bumps the SPIRE version we use to v1.8.5 which includes several bug fixes. Relates: https://github.com/spiffe/spire/releases/tag/v1.8.5 Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com> 04 December 2023, 22:30:47 UTC
40088a2 Docs: Adds Webhook Limitation to EKS Install Doc For EKS installs that use overlay mode, webhook servers must be exposed outside the cluster so they are reachable from the managed control plane. Fixes #29454 Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io> 04 December 2023, 20:49:52 UTC
905416a README: Update releases Signed-off-by: André Martins <andre@cilium.io> 04 December 2023, 20:24:38 UTC
7abb00f endpoint: fix panic in RunMetadataResolver due to send on closed channel This commit fixes a "send to closed channel" panic during execution of `Endpoint.RunMetadataResolver`. Whether the regenTriggeredCh channel has already been closed by a previous controller run can't be checked in the same select statement that writes to that channel. The execution order of case statements within a select isn't guaranteed by the Go language spec. Therefore, this commit fixes the check by introducing a variable `callerBlocked`. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 04 December 2023, 20:16:30 UTC
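The idea in 7abb00f, tracking with a plain variable whether the caller is still waiting instead of probing the channel inside the same `select` that sends on it, can be sketched roughly as follows. This is a minimal illustration and not the Cilium code: the `resolver` type, `newResolver`, and `notify` are hypothetical names, and only `regenTriggeredCh` and `callerBlocked` are taken from the commit message.

```go
package main

import "fmt"

// resolver is a hypothetical stand-in for the endpoint state named in the
// commit message. It is not safe for concurrent use; the sketch assumes
// notify is only ever called from one goroutine at a time.
type resolver struct {
	regenTriggeredCh chan bool // buffered so the send never blocks
	callerBlocked    bool      // true while the caller still waits for a result
}

func newResolver() *resolver {
	return &resolver{regenTriggeredCh: make(chan bool, 1), callerBlocked: true}
}

// notify reports the result to the waiting caller at most once. Later runs
// see callerBlocked == false and never touch the closed channel, so no
// "send on closed channel" panic can occur.
func (r *resolver) notify(regenTriggered bool) bool {
	if !r.callerBlocked {
		return false
	}
	r.regenTriggeredCh <- regenTriggered
	close(r.regenTriggeredCh)
	r.callerBlocked = false
	return true
}

func main() {
	r := newResolver()
	fmt.Println(r.notify(true)) // first controller run delivers the result
	fmt.Println(r.notify(true)) // second run is a no-op instead of a panic
	v, ok := <-r.regenTriggeredCh
	fmt.Println(v, ok)
}
```

The flag is checked before the send, so the outcome no longer depends on which ready `select` case the runtime happens to pick.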
57a6089 helm: Added support for existing Cilium SPIRE NS Added additional `existingNamespace` bool flag to configure if the SPIRE `namespace` already exists or not. If it's an existing one, we should not include our Cilium SPIRE namespace template as this causes the Helm installation to fail. Helm then complains about `Namespace "xxx" in namespace "" exists and cannot be imported into the current release`. Signed-off-by: Philip Schmid <philip.schmid@isovalent.com> 04 December 2023, 19:56:04 UTC
5985553 envoy: perform version check directly on envoy binary (not starter) With the introduction of the envoy starter (#27498), the Envoy version check of the embedded mode calls out to the starter binary instead of the envoy binary directly. Depending on the permissions the agent runs with, the capabilities check within the starter might fail. To prevent unexpected errors, this commit re-introduces that the Envoy version check is performed on the Envoy binary directly. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 04 December 2023, 15:42:18 UTC
95d7c88 ipset: ignore not found errors on entry removal Configure the extra flag to prevent the ipset command from returning an error if the entry that we are trying to remove is already not present. This change brings consistency with the add command (which already sets the same flag), and enables us to attempt to remove an entry even if it may not exist, e.g., to cleanup possible stale entries. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 04 December 2023, 14:12:09 UTC
d5d3f5b Makefile: Fix variable override not working in all cases Commit b256fbd8 introduced the ability to override certain Makefile variables, such as `SUBDIRS_CILIUM_CONTAINER`, which controls which targets are built for the Cilium container. However, due to 1fcbe96bf88e ("make: avoid building plugins/cni twice"), the Makefile would then still append additional entries to the variable, thus not allowing the override to e.g. override the CNI plugin build. This commit addresses this issue by moving any modification to `SUBDIRS_CILIUM_CONTAINER` to before the override Makefile is included. This ensures that the override Makefile has full control over the variables it overrides. To compensate for the fact that SUBDIRS would then cause the CNI plugin to be built twice (e.g. once via "make -C plugins/cilium-cni", but then once again via "make -C plugins"), a new filter is introduced which filters out any subdirectories already covered by parent directories. This incidentally also fixes a bug where certain tools (e.g. "tools/mount") were built twice. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 04 December 2023, 14:11:36 UTC
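The filter described in d5d3f5b, which drops any subdirectory already covered by a parent directory in the same list, is implemented in Make in the repository. The logic itself can be sketched in Go as below; `filterCovered` is a hypothetical name and the directory list is only an example based on the paths mentioned in the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// filterCovered returns dirs without the entries that lie below another
// entry of the same list, preserving the original order.
func filterCovered(dirs []string) []string {
	var out []string
	for _, d := range dirs {
		covered := false
		for _, p := range dirs {
			// "plugins/cilium-cni" is covered by "plugins", but
			// "plugins-extra" would not be, hence the trailing slash.
			if p != d && strings.HasPrefix(d, p+"/") {
				covered = true
				break
			}
		}
		if !covered {
			out = append(out, d)
		}
	}
	return out
}

func main() {
	subdirs := []string{"plugins/cilium-cni", "plugins", "tools/mount", "tools", "daemon"}
	fmt.Println(filterCovered(subdirs))
}
```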
332bf33 cilium, iptables: Fix default SNAT rule under masq-to-route-source Fix the default route masquerading `--enable-masquerade-to-route-source` option in two aspects: - Output devices should be ! -o cilium_+ - Destination must not be 0.0.0.0/0 but rather ! -d snatDstExclusionCIDR The fixes have been validated in the user's environment that they address the connectivity issue they were experiencing under mentioned agent option. Fixes: 0d10aca58f44 ("cilium, iptables: Extend to cover default route in enable-masquerade-to-route-source") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 December 2023, 14:09:24 UTC
223c39d cilium, iptables: Do not install catch-all masq rule under masq-to-route-source When `--enable-masquerade-to-route-source` option is used to support more fine-grained masquerading for advanced use cases, then do not install the regular catch-all masquerading iptables rule. In a user environment, we've seen that this messes with the source selection and picks the wrong IP. Since `--enable-masquerade-to-route-source` also handles default routes, it replaces the catch-all masquerading one already, hence move this into an else branch. No other functional changes. Fixes: 0d10aca58f44 ("cilium, iptables: Extend to cover default route in enable-masquerade-to-route-source") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 December 2023, 14:09:24 UTC
0a6ba7e cilium: Clean any unrelated ingress qdiscs Before loading the main datapath, check if there is any ingress qdisc and if so, then remove it, so that clsact qdisc can make forward progress. We have seen situations where phys devices had an empty ingress one attached which then led Cilium to fail attaching bpf_host. Fix this by removing it and dumping an info log message about it. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 December 2023, 14:09:24 UTC
83dac33 cilium: Fix Makefile to force ln /usr/bin/cilium Fixes the following build error when installing locally: # make && make install [...] for i in cilium-dbg daemon cilium-health bugtool tools/mount tools/sysctlfix operator plugins tools hubble-relay bpf; do make -C $i install; done make[1]: Entering directory '/root/go/src/github.com/cilium/cilium/cilium-dbg' install -m 0755 -d /usr/bin install -m 0755 cilium-dbg /usr/bin ln -rs /usr/bin/cilium-dbg /usr/bin/cilium ln: failed to create symbolic link '/usr/bin/cilium': File exists make[1]: *** [Makefile:29: install-binary] Error 1 make[1]: Leaving directory '/root/go/src/github.com/cilium/cilium/cilium-dbg' make: *** [Makefile:177: install] Error 2 [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 04 December 2023, 14:09:24 UTC
ea09e90 Fix exporting results to gs bucket. test_name was not set causing both tests to export results to the same gs bucket directory. Fixes #29214 Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 04 December 2023, 13:05:14 UTC
b36de40 Revert "Prepare for release v1.15.0-pre.3" This reverts commit 27b3dc83c631b0aa229993d9ee17e54afe9578f7. Signed-off-by: André Martins <andre@cilium.io> 04 December 2023, 13:04:57 UTC
ab99077 Prepare for release v1.15.0-pre.3 Signed-off-by: André Martins <andre@cilium.io> 04 December 2023, 13:04:57 UTC
b5a0cf5 update AUTHORS and Documentation Signed-off-by: André Martins <andre@cilium.io> 04 December 2023, 13:04:57 UTC
95a6cbe chore(deps): update anchore/scan-action action to v3.3.8 Signed-off-by: renovate[bot] <bot@renovateapp.com> 04 December 2023, 11:42:55 UTC
7a8be93 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 04 December 2023, 11:42:31 UTC
f7510c0 chore(deps): update docker.io/library/ubuntu:22.04 docker digest to 8eab65d Signed-off-by: renovate[bot] <bot@renovateapp.com> 04 December 2023, 11:42:31 UTC
1f40a11 fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 04 December 2023, 11:16:02 UTC
7772523 endpointmanager: unmap ip for lookup In case of an IPv4-mapped IPv6 address we'd lookup the address as an IPv4 address with a ::ffff: prefix, leading to endpoint lookup errors such as cannot find endpoint with IP ::ffff:10.0.1.19 Fix this by explicitly unmapping the address before lookup which leads to 10.0.1.19 to be looked up (and found) in the above case. Fixes: 54a896c5ab9a ("endpointmanager: Use netip.Addr instead of net.IP in LookupIP") Signed-off-by: Tobias Klauser <tobias@cilium.io> 04 December 2023, 10:36:03 UTC
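The effect described in 7772523 can be reproduced with the standard library alone. The sketch below assumes a plain Go map as the lookup table, which is a simplification of the endpoint manager; `normalize` is a hypothetical helper, while `netip.Addr.Unmap` is the standard-library call that strips the `::ffff:` prefix.

```go
package main

import (
	"fmt"
	"net/netip"
)

// normalize parses an address and unmaps it, so that an IPv4-mapped IPv6
// address such as ::ffff:10.0.1.19 becomes the plain IPv4 address 10.0.1.19.
func normalize(s string) (string, error) {
	addr, err := netip.ParseAddr(s)
	if err != nil {
		return "", err
	}
	return addr.Unmap().String(), nil
}

func main() {
	// Hypothetical lookup table keyed by endpoint IP.
	endpoints := map[netip.Addr]string{
		netip.MustParseAddr("10.0.1.19"): "endpoint-a",
	}

	mapped := netip.MustParseAddr("::ffff:10.0.1.19")

	_, foundRaw := endpoints[mapped] // the mapped form is a different key
	name, foundUnmapped := endpoints[mapped.Unmap()]

	fmt.Println(foundRaw, foundUnmapped, name) // false true endpoint-a
}
```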
5ef0f10 Increase client-go qps and burst to 10/20 for k8s 1.27+ Default QPS for kubelet has been increased in 1.27 k8s to 50 QPS. https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG/CHANGELOG-1.27.md under "Bump default API QPS limits for Kubelet". Before, kubelet had 5 QPS, the same as the agent. Still, when the agent needs to create a CEP and an Identity for a pod, it has to make two API calls, so pod startup latency was potentially higher than expected if a node was experiencing high churn of pods with new identities. Now, with the recent kubelet change, there is an even higher chance of that happening; therefore we should increase QPS. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 04 December 2023, 03:00:30 UTC
4f515fa gateway: Ignore loadbalancer class for Gateway service This is to align with Ingress implementation Relates: #29327 Fixes: #28949 Co-authored-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 04 December 2023, 02:50:23 UTC
2f6994f gateway: Simplify ensure resources with controllerutil This is to leverage the existing controllerutil package to ensure the resources. Signed-off-by: Tam Mach <tam.mach@cilium.io> 04 December 2023, 02:50:23 UTC
cf76f89 ipam: Fix bug where IP lease did not expire The `AllocateNextWithExpiration`[1] function is used to allocate an IP via API from the CNI plugin. Such IPs are always allocated with an expiration timer, which means that if the CNI ADD fails later on and is never retried, the IP is automatically released. Only once an endpoint is created, we then stop the expiration timer during the endpoint creation request [2], making the allocation of the IP permanent until it is explicitly freed. The current expiration implementation however has a bug: Instead of releasing the IP back into the IPAM pool from where the IP was actually allocated from, we forwarded the desired pool, which can be empty and is later overwritten with the actual pool. Because we passed in an empty pool into `StartExpirationTimer`, this led to IP expiration being broken in almost all cases: ``` 2023-11-24T06:24:37.089657953Z level=warning msg="Unable to release IP after expiration" error="no IPAM pool provided for IP release of 10.0.1.41" ip=10.0.1.41 subsys=ipam uuid=2320c5c1-b4c0-4a2e-8f3d-2b906330ab55 ``` This commit fixes that by using the realized pool (from the result) instead of the desired pool from the request. In addition, the unit tests are also adapted to cover this case to ensure we don't regress. [1] https://github.com/cilium/cilium/blob/0fcd1c86e347b2701880c9034e7ea3a74cd6b13e/daemon/cmd/ipam.go#L46 [2] https://github.com/cilium/cilium/blob/95a7d1288d5a13a5a216dcdb09383f1f483e5ac1/daemon/cmd/endpoint.go#L536 Reported-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 04 December 2023, 02:07:10 UTC
be9e853 ipam: Improve expiration timer memory footprint This commit addresses two problems with the IPAM expiration timer: 1. Before this commit, each timer consisted of a Go routine calling `time.Sleep` to wait for expiration to occur. The default expiration timeout is 10 minutes. This meant, that for every IP allocated via CNI ADD, we had a Go routine unconditionally sleeping for 10 minutes, only to (in most cases) wake up and learn that the expiration timer was stopped. This commit improves that situation by having the expiration Go routine wake up and exit early if it was stopped (either via IP Release or `StopExpirationTimer`). 2. In CI, we set the hidden `max-internal-timer-delay` option to 5 seconds (see cilium/cilium#27253). This meant that the `time.Sleep` expiration timer would effectively be 5 seconds instead of 10 minutes. 5 seconds however is not enough for an endpoint to be created via CNI ADD and complete its first endpoint regeneration. This therefore led to endpoint IPs being released while the endpoint was still being created. Due to another bug (fixed in the next commit) the expiration timer failed to actually release the IP, which is why this bug was not discovered earlier when we introduced the 5 second limit. This commit addresses this issue by adding an escape hatch to `pkg/time`, allowing the creation of a timer which is not subject to the `max-internal-timer-delay`. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 04 December 2023, 02:07:10 UTC
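The first point of be9e853, replacing an unconditional `time.Sleep` with a goroutine that wakes up and exits as soon as the timer is stopped, can be sketched with a `time.Timer` and a stop channel. This is an assumption-laden illustration, not the `pkg/ipam` code: `startExpiration`, `runDemo`, and the `release` callback are hypothetical names.

```go
package main

import (
	"fmt"
	"time"
)

// startExpiration calls release after timeout unless stop is closed first.
// The returned channel yields true if the lease expired and false if the
// timer was stopped; in both cases the goroutine exits right away.
func startExpiration(timeout time.Duration, stop <-chan struct{}, release func()) <-chan bool {
	done := make(chan bool, 1)
	go func() {
		timer := time.NewTimer(timeout)
		defer timer.Stop()
		select {
		case <-timer.C:
			release()
			done <- true
		case <-stop:
			// Stopped early: no need to sleep for the rest of the timeout.
			done <- false
		}
	}()
	return done
}

// runDemo exercises both paths and reports what happened.
func runDemo(stopEarly bool) (expired, released bool) {
	stop := make(chan struct{})
	timeout := 10 * time.Millisecond
	if stopEarly {
		timeout = time.Hour // would block for an hour with time.Sleep
		close(stop)
	}
	expired = <-startExpiration(timeout, stop, func() { released = true })
	return expired, released
}

func main() {
	fmt.Println(runDemo(true))  // stopped before expiry
	fmt.Println(runDemo(false)) // allowed to expire
}
```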
103077d chore(deps): update dependency eksctl-io/eksctl to v0.165.0 Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 22:40:21 UTC
c749169 ci-clustermesh-upgrade: Increment timeout between rollouts to 5min Currently, the ClusterMesh upgrade test sets an explicit timeout of 1min to wait for the Cilium Agent DaemonSet to become ready between the rollouts. In some cases, the Pods aren't ready after 1min. Therefore, this commit increases the timeout to 5min. I think the most important part is that we set an explicit timeout on the command `kubectl rollout status` - as the default is wait forever. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 22:37:54 UTC
d618a70 pkg/endpoint: do not remove controller upon stopping If we execute 'RunMetadataResolver' more than once, we could end up deleting the new controller from an older run, since both share the same name. Since controllers are never executed if their 'RunInterval' is set to 0, we don't need to remove them from the list of controllers. Signed-off-by: André Martins <andre@cilium.io> 01 December 2023, 22:18:21 UTC
e43b759 pkg/endpoint: keep endpoint labels for their original sources Fix two Cilium bugs related to label handling: 1. Addressed an issue during endpoint restoration where Cilium would incorrectly replace labels not sourced from Kubernetes. Previously, labels set on an endpoint outside of Kubernetes were wiped out upon restoration, as all labels were overwritten with those fetched from Kubernetes. 2. Resolved a bug that occurred when a user added or removed a label from a pod or namespace while the Cilium agent was inactive. Upon Cilium restart, the affected endpoint failed to reflect these changes, leading to synchronization issues in label management. Signed-off-by: André Martins <andre@cilium.io> 01 December 2023, 22:18:21 UTC
82e8849 pkg/endpoint: do not run metadata resolver for eps without pods If an endpoint does not contain a pod nor a namespace then don't resolve its metadata. Signed-off-by: André Martins <andre@cilium.io> 01 December 2023, 22:18:21 UTC
99609bf pkg/labels: do not replace labels that come from a different source Cilium shouldn't replace labels that come from a different source even if they have the same key. In order for a label to be replaced, the new label should have the same source as the old label. Signed-off-by: André Martins <andre@cilium.io> 01 December 2023, 22:18:21 UTC
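The rule stated in 99609bf, that a label is only replaced by a new label with the same key when both come from the same source, can be sketched as below. The `label` type and `upsert` function are hypothetical and much simpler than the types in `pkg/labels`; the source names are examples only.

```go
package main

import "fmt"

// label is a hypothetical, simplified label with an explicit source.
type label struct {
	key, value, source string
}

// upsert stores n unless a label with the same key but a different source
// already exists. It reports whether the map was changed.
func upsert(existing map[string]label, n label) bool {
	if old, ok := existing[n.key]; ok && old.source != n.source {
		return false // keep the label that came from the other source
	}
	existing[n.key] = n
	return true
}

func main() {
	labels := map[string]label{
		"app": {key: "app", value: "web", source: "api"},
	}
	// Same key, different source: the existing label is kept.
	fmt.Println(upsert(labels, label{key: "app", value: "db", source: "k8s"}), labels["app"].value)
	// Same key, same source: the label is replaced.
	fmt.Println(upsert(labels, label{key: "app", value: "cache", source: "api"}), labels["app"].value)
}
```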
9551482 pkg/endpoint: specify 'sourceFilter' when replacing endpoint labels When replacing the endpoint labels we want to keep all labels that are part of the source for which we are replacing the labels. For example, labels added through the API should not be replaced when a K8s label update is received. Signed-off-by: André Martins <andre@cilium.io> 01 December 2023, 22:18:21 UTC
474650f scale-test-100-gce: Use CILIUM_CLI_VERSION - Remove cilium_cli_version variable and use CILIUM_CLI_VERSION instead. - Remove CILIUM_CLI_MODE environment variable. The Helm mode became the default in v0.15. Fixes: 42e1a4a129b9 ("workflows: move cilium_cli_version definition to set-env-variables action") Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 01 December 2023, 19:24:11 UTC
995cc1f cilium: Update bwm doc with note about bpf host routing Update the doc with a note that it is strongly recommended to use it only in combination with BPF host routing. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 01 December 2023, 19:00:43 UTC
067668a cilium: Slightly tweak bwm parameters Align the qdisc drop horizon with the one we use in our BPF code, and also bump the buckets_log given they can potentially cause scalability issue. The rest remains with the defaults. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 01 December 2023, 19:00:43 UTC
e2a5128 vendor: Update vishvananda/netlink/ Pull in recent additions to the netlink go library, that is, support for setting TCA_FQ_PLIMIT and managing netkit driver (future work). Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Link: https://github.com/vishvananda/netlink/pull/929 Link: https://github.com/vishvananda/netlink/pull/930 01 December 2023, 19:00:43 UTC
2835b06 Delete deprecated CNPStatusUpdates and K8sEventHandover Also, we introduce hidden option "legacy-turn-off-k8s-event-handover". What was happening before in setups with CEP CRD Disabled and kvstore: When K8sEventHandover was disabled: - We were opening informer for all pods - even though we were watching endpoints from kvstore too When K8sEventHandover was enabled: - We were opening informer for all pods - Once connected to kvstore, we were closing this informer - We were opening node's local pods informer Now the second option is the default, and the hidden option "legacy-turn-off-k8s-event-handover" allows us to fall back to the first behaviour - not recommended, but a failsafe in case we need mitigation Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 01 December 2023, 17:21:38 UTC
f51bc04 Introduce dynamic hubble flow logs exporters based on config file. Signed-off-by: Marek Chodor <mchodor@google.com> 01 December 2023, 16:38:46 UTC
55f21e3 chore(deps): update all github action dependencies Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 16:26:06 UTC
a477c02 fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 16:22:49 UTC
3a93b00 bpf: work around scrubbing of skb->mark during veth transition Previously we set skb->mark in from_host@cilium_host, expect the mark to remain unchanged after kernel transmits skb from cilium_host to cilium_net. The skb->mark is for instance used to transport IPsec-related information. However, as of 2023-10-19, kernel 5.10 still misses the backport patch[1] to fix a bug in skb_scrub_packet() which clears skb->mark for veth_xmit even if the veth pair is under the same netns: https://elixir.bootlin.com/linux/v5.10.198/source/include/linux/netdevice.h#L3975 To avoid hitting this issue, this patch sets metadata in skb->cb to survive skb_scrub_packet(), then to_host@cilium_net can retrieve this info and set proper mark. Only from_host bpf is setting cb, while from_lxc bpf is still using mark. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=ff70202b2d1a ("dev_forward_skb: do not scrub skb mark within the same name space") Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> 01 December 2023, 16:20:14 UTC
e78ff16 bpf_host can handle packets passed from L7 proxy Previously https://github.com/cilium/cilium/pull/25440 removed bpf_host's logic for host-to-remote-pod packets. However, we recently realized such host-to-remote-pod traffic can also be pod-to-pod traffic passing through L7 proxy. This commit made bpf_host capable of handling these host-to-remote-pod packets as long as they are originated from L7 proxy. Fixes: cilium/cilium#25440 Suggested-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> 01 December 2023, 16:20:14 UTC
217ae4f Re-introduce 2005 route table This commit re-introduced the 2005 routes that were removed by https://github.com/cilium/cilium/commit/9dd6cfcdf4406938c35c6ce2e8cc38fb5f2e9ea8 (datapath: remove 2005 route table for ipv6 only) and https://github.com/cilium/cilium/commit/c1a0dba3c0c79dc773ed9a9f75d5aa87b30f44f0 (datapath: remove 2005 route table for ipv4 only). Signed-off-by: Robin Gögge <r.goegge@gmail.com> Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> 01 December 2023, 16:20:14 UTC
ac63856 Allow proxy replies to WORLD_ID This is an alternative approach to fix cilium/cilium#21954, so that we can re-introduce the 2005 from-proxy routing rule in following patches to fix L7 proxy issues. This commit simply allows packets to WORLD as long as they are from ingress proxy. This was one of the solution suggested by Martynas, as recorded in commit message cilium/cilium@c534bb7: One fix was to extend the troublesome check https://github.com/cilium/cilium/blob/v1.12.3/bpf/bpf_host.c#L626 by allowing proxy replies to `WORLD_ID`. To tell if an skb is originated from ingress proxy, the commit extends the semantic of existing flags `TC_INDEX_F_SKIP_{INGRESS,EGRESS}_PROXY`, renames flags to clarify the changed meaning. Fixes: cilium/cilium#21954 (Reply from pod to outside is dropped when L7 ingress policy is used) Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> 01 December 2023, 16:20:14 UTC
d491891 Follow-up nits from etcd init script pull request This is a follow on from the review of https://github.com/cilium/cilium/pull/29109, and is a collection of minor changes: - Remove unused variables in install/kubernetes Makefile values file - Remove etcd image from check-docker-images.sh script - Remove now-unused external dependencies block from check-docker-images.sh script - Clarify doc comment for ClusterMeshEtcdInit function, to correctly state the purpose of the hasConfig key - defer etcdClient.Close() after creating etcdClient - Use errors.Join to handle errors in deferred cleanup code, where the main function may have also returned an error - Correct typo in comment: "it's" to "its" Signed-off-by: James Laverack <james@isovalent.com> 01 December 2023, 16:13:30 UTC
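The `errors.Join` item in d491891 refers to a common Go pattern: a deferred cleanup that may fail is combined with whatever error the function body already returned. A minimal sketch, assuming Go 1.20 or later; `process`, `closerFunc`, `okCloser`, and `failingCloser` are hypothetical names and not part of the etcd init code.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// closerFunc adapts a function to io.Closer for the demo.
type closerFunc func() error

func (f closerFunc) Close() error { return f() }

var (
	okCloser      = closerFunc(func() error { return nil })
	failingCloser = closerFunc(func() error { return errors.New("close failed") })
)

// process uses a named return value so the deferred Close error can be
// joined with the error of the function body. errors.Join drops nil
// errors and returns nil when every argument is nil.
func process(c io.Closer, fail bool) (err error) {
	defer func() {
		err = errors.Join(err, c.Close())
	}()
	if fail {
		return errors.New("work failed")
	}
	return nil
}

func main() {
	fmt.Println(process(okCloser, false))     // no error at all
	fmt.Println(process(failingCloser, true)) // both errors are reported
}
```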
7ae917a chore(deps): update dependency go to v1.21.4 Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 16:05:07 UTC
815da3e bpf: nat: clean up ICMP identifier extraction When loading the ICMP identifier field, store it in-place instead of going through a temporary variable. Also as the `tuple` is zero-initialized, we don't need to do this again for the port field(s). Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 01 December 2023, 15:15:01 UTC
4ff4a0e bpf: nat: use rewrite helper for snat_v6_rewrite_ingress() Use the common snat_v6_rewrite_headers() helper from the IPv6 RevSNAT path. This is essentially a 1-to-1 replacement, except that we need to ensure that the L4 port rewrite is skipped for ICMPV6_PKT_TOOBIG packets. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 01 December 2023, 15:15:01 UTC
863b70e bpf: nat: use rewrite helper for snat_v4_icmp_rewrite_egress_embedded() When a local endpoint sends an ICMP_FRAG_NEEDED message, it contains some inner packet that was originally addressed to the endpoint. The endpoint's address/port in this inner packet potentially needs to be SNATed again, to return it to its pre-revSNAT state. snat_v4_icmp_rewrite_egress_embedded() handles the .daddr / .dport rewrite of such an inner packet, based on an SNAT entry. Replace it with the common snat_v4_rewrite_headers() helper. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 01 December 2023, 15:15:01 UTC
e8249d5 bpf: nat: fix csum logic in snat_v6_rewrite_headers() Similar to 495af0717b1d ("bpf: Fix csum logic in snat_v4_rewrite_headers"), we mustn't mix the diff from the L4 port-rewrite into the diff used for the pseudo-header update. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 01 December 2023, 15:15:01 UTC
c414b77 ci-gke: remove duplicated wait for cilium This commit removes the duplicated step to wait for Cilium to become ready. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 14:46:30 UTC
ffe3302 Correct Regex sorting in Ingress model translation This commit makes a few changes: Adds a new test for ingestion, that tests how multiple path matches in the same Ingress work Updates translation logic to sort into the order Exact, Regex, Prefix, which means that regex matches that overlap with prefixes will now route traffic. Adds more cases to the unit tests for SortableRoute.Less(), which ensures that more options are covered. Also refactors the TestSharedIngressTranslator_getEnvoyHTTPRouteConfiguration test to also check the contents of the path and cluster matches, as well as just the listener name and virtual host name. This was quite a bit more complex, because just comparing protobufs produces very unhelpful failure outputs. This test design makes it more straightforward to correct errors in the expected results. Signed-off-by: Nick Young <nick@isovalent.com> 01 December 2023, 13:55:56 UTC
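The ordering described in ffe3302 (Exact first, then Regex, then Prefix) can be sketched with a stable sort on the match kind. This is only the kind ordering under assumed types; the real `SortableRoute.Less()` is not reproduced here and presumably compares further fields. `matchKind`, `route`, and `sortRoutes` are hypothetical names.

```go
package main

import (
	"fmt"
	"sort"
)

// matchKind is declared in the desired precedence order.
type matchKind int

const (
	exact matchKind = iota
	regex
	prefix
)

// route is a hypothetical, simplified path match.
type route struct {
	kind matchKind
	path string
}

// sortRoutes orders routes Exact, Regex, Prefix and keeps the input order
// among routes of the same kind.
func sortRoutes(routes []route) {
	sort.SliceStable(routes, func(i, j int) bool {
		return routes[i].kind < routes[j].kind
	})
}

func main() {
	routes := []route{
		{prefix, "/"},
		{regex, "/api/.*"},
		{exact, "/api/health"},
	}
	sortRoutes(routes)
	for _, r := range routes {
		fmt.Println(r.kind, r.path)
	}
}
```

With this order, a regex match that overlaps a broader prefix is evaluated before the prefix and therefore gets a chance to route the traffic.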
351784a chore(deps): update docker.io/library/alpine docker tag to v3.18.5 Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 13:54:00 UTC
00ab252 log: Do not export InitializeDefaultLogger function The DefaultLogger variable in the logging module serves as a parent logger which all other loggers can be derived from. It is initialized using the InitializeDefaultLogger function and then adjusted on startup based on user configuration. Users should not call InitializeDefaultLogger to create a parent logger for their logger, since the logger returned by InitializeDefaultLogger will always use the hardcoded defaults. For example, the logger returned will always be of level INFO, even if a user has enabled debug logging. To make this clear, this commit renames InitializeDefaultLogger to initializeDefaultLogger to signal that it should not be used outside of the logging module. Fixes: #29215 Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 01 December 2023, 13:52:06 UTC
b175548 test: Remove unnecessary call to InitializeDefaultLogger in cp suite The function InitializeDefaultLogger creates a new logger with the default settings and returns it. This commit removes a call to this function that doesn't save the return value, essentially calling it for no reason. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 01 December 2023, 13:52:06 UTC
45c2ed7 chore(deps): update all github action dependencies to v2 Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 13:48:37 UTC
22c8acd ci: increase disk size for GKE clusters (ci-gke & ci-external-workloads) Some tests are failing due to PodEviction due to DiskPressure. Therefore, this commit is increasing the disk size of a GKE node from 10GB to 20GB. Fixes: #29312 Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 13:44:30 UTC
6613f95 datapath: Fix primary flag in NodeAddress Since the "NodePort" and "Primary" flag computation is mixed together we should do it for non-selected devices as well, so mark address for NodePort use only if the device is selected. Additionally, only consider addresses on selected devices when filtering with --nodeport-addresses. Simplify the special-handling for the addresses of cilium_host by just adjusting the maximum scope. Finally add an assertion to tests to check that primary is correctly set. Fixes: 5342d0104f ("datapath/tables: Add Table[NodeAddress]") Signed-off-by: Jussi Maki <jussi@isovalent.com> 01 December 2023, 13:43:43 UTC
e263ddf ci: bypass proxy.golang.org in Go toolchain installation Directly access golang.org/dl to avoid errors such as > go: golang.org/dl/go1.21.4@latest: module golang.org/dl/go1.21.4: Get "https://proxy.golang.org/golang.org/dl/go1.21.4/@v/list": dial tcp: lookup proxy.golang.org on 127.0.0.53:53: server misbehaving > Error: Process completed with exit code 1. Reported-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 01 December 2023, 13:50:39 UTC
13da17a chore(deps): update actions/checkout action to v4 Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 13:49:29 UTC
93a71f5 ci-e2e-upgrade: Remove setting CLI vsn It's controlled by set-env-variables. Signed-off-by: Martynas Pumputis <m@lambda.lt> 01 December 2023, 13:46:43 UTC
5cc3b6d ci-e2e{,-upgrade}: Use kernel 6.1 instead of 6.0 The former is the long term stable [1]. [1]: https://kernel.org/ Signed-off-by: Martynas Pumputis <m@lambda.lt> 01 December 2023, 13:46:25 UTC
83425f2 ci: fix dns issue when pulling cilium-docker-plugin in ci-runtime It seems that even though we're setting the nameserver at the top of `/etc/resolv.conf`, in some cases docker still uses 127.0.0.53 to resolve names while pulling the cilium docker plugin in the ci-runtime test. It seems that docker tries to use nameserver information from `systemd-resolved`. Therefore, this commit tries to force docker to use the nameserver 1.1.1.1, by removing the resolv.conf symlink, deleting the resolv.conf from systemd-resolved and restarting the docker service after applying the changes. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 13:39:05 UTC
f908909 iptables: Use cell config in iptables tests iptables manager has been modularized into a cell, so the tests should config it through the fields in the manager struct itself, not through the global config options. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 01 December 2023, 13:37:02 UTC
a53fc45 iptables: Move prepend-iptables-chains option into the cell prepend-iptables-chains is used to enable prepending iptables chains instead of appending, and it is relevant only for the iptables manager. Since the iptables manager has been modularized, let's move this into the cell "private" config. The option used to have an env var alias named `CILIUM_PREPEND_IPTABLES_CHAIN`, for backward compatibility reasons. The alias has been deprecated and a warning is printed if that env var is found to be non-empty at iptables cell startup. The env var deprecation is also described in the v1.15 upgrade notes. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 01 December 2023, 13:37:02 UTC
2e1065b iptables: Remove usage of global option EnableIPv6 Since iptables manager has been modularized, it should not depend directly on the global daemon config. Instead, it should read the configuration from its cell. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 01 December 2023, 13:37:02 UTC
f19faac iptables: Copy config at cell startup Copy the cell specific config parameters passed through dependency injection into the manager struct. Fixes: a6a2b73f4f ("iptables: Add a cell for iptables config manager") Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 01 December 2023, 13:37:02 UTC
0261f30 ci: increase cilium wait timeout to 10m on cloud providers There are cases where image pulling isn't successful within 5 minutes (default wait time of `cilium status --wait`). Especially on public cloud providers, it might take longer in some cases. Therefore, this commit increases the wait timeout to 10m. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 13:28:02 UTC
1e71a19 test/verifier: improve log output Include the command invoked to compile the BPF and the output of the command in the test output by default. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 01 December 2023, 13:07:17 UTC
735807f test/verifier: fix complexity tests not being recompiled TestVerifier is accidentally reusing the result of previous subtest compilations. This means that only the first set of configurations was tested. Invoke clean for every new compilation. The generated object files are moved to the test directory to make them accessible as artifacts from CI. Fixes: d3ef5b2ac8 ("test/verifier: Avoid pruning object files before testing the next file") Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 01 December 2023, 13:07:17 UTC
ab329d2 bpf: prevent clang from reordering address calculations before NULL check Compiling the BPF complexity tests currently bails out in tail_nodeport_nat_egress_ipv4 and tail_nodeport_nat_egress_ipv6. The error for the v4 case is the following: ; return map_lookup_elem(map, tuple); 169: (18) r1 = 0xffff9dfb4d0f9800 ; R1_w=map_ptr(off=0,ks=14,vs=40,imm=0) 171: (85) call bpf_map_lookup_elem#1 172: (bf) r5 = r0 173: (07) r5 += 32 R5 pointer arithmetic on map_value_or_null prohibited, null-check it first The problem is easier to spot when looking at the disassembly of the function: ; return map_lookup_elem(map, tuple); 169: r1 = 0x0 ll 171: call 0x1 172: r5 = r0 173: r5 += 0x20 174: r3 = r0 175: r3 += 0x24 ; if (*state) 176: if r0 != 0x0 goto +0x6da <LBB9_167> It seems like clang has decided that it can hoist the address calculation based on r0 before checking that r0 != 0. Work around this issue by inserting a barrier_data() call. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 01 December 2023, 13:07:17 UTC
4d8c8b5 complexity-tests: remove Egress Gateway from IPv6-only tests Egress Gateway only supports IPv4 so enabling it on IPv6-only tests doesn't make a lot of sense. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 01 December 2023, 13:07:17 UTC
a8e5ee7 chore(deps): update all lvh-images main Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 11:42:38 UTC
00d4836 pkg/l2announcer: only create leases for services that are being announced Previously leases were created for all services in the cluster regardless of them having an announceable IP or not. This change ensures that these services are skipped and only services with an external and/or LB IP - depending on the policy - will have a lease. Fixes: https://github.com/cilium/cilium/issues/28752 Signed-off-by: Filip Nikolic <oss.filipn@gmail.com> 01 December 2023, 11:13:37 UTC
91ccdcd Remove subsystem name from datapath metrics Commit f8e9472 adds subsystem information to cilium_[forward|drop]_[count_bytes]_total metrics. This change was not intended. This commit removes subsystem info from metric names. Fixes #29213 Signed-off-by: Boris Petrovic <carnerito.b@gmail.com> 01 December 2023, 10:58:37 UTC
23efe81 fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 01 December 2023, 10:56:37 UTC
9a5998f hubble-relay: Add support for peers joining during requests With this change, if the request is in follow mode, we periodically re-list peers and send the request to newly joined or reconnected peers. Signed-off-by: Fabian Fischer <fabian.fischer@isovalent.com> 01 December 2023, 10:36:32 UTC
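The re-listing behaviour described in 9a5998f can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not hubble-relay's code: `joinedPeers` and the peer names are hypothetical, each loop iteration in `main` stands in for one periodic re-list, and reconnect handling is left out.

```go
package main

import (
	"fmt"
	"sort"
)

// joinedPeers returns the peers in listed that are not yet in known and
// records them in known, so a later re-list does not report them again.
// It is a hypothetical helper for illustration only.
func joinedPeers(known map[string]struct{}, listed []string) []string {
	var joined []string
	for _, p := range listed {
		if _, ok := known[p]; ok {
			continue // request is already running against this peer
		}
		known[p] = struct{}{}
		joined = append(joined, p)
	}
	sort.Strings(joined) // deterministic order for the caller
	return joined
}

func main() {
	// Peers the follow-mode request was originally sent to.
	known := map[string]struct{}{"node-a": {}}

	// Each slice stands in for the result of one periodic re-list.
	relists := [][]string{
		{"node-a"},
		{"node-a", "node-b"},
		{"node-c", "node-b", "node-a"},
	}
	for _, listed := range relists {
		for _, p := range joinedPeers(known, listed) {
			fmt.Println("forwarding request to newly joined peer", p)
		}
	}
}
```

Tracking the known set keeps each peer from receiving the same follow request twice; a real implementation would drive the re-list from a timer and stop it when the request ends.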
98dfc7d k8s: remove slim k8s model for Ingress & IngressClass This commit removes the unused slim k8s model for Ingress & IngressClass. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 10:27:35 UTC
7f0531c k8s: remove ingress converter function Remove unused Ingress status converter function. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 10:27:35 UTC
8dfb20d operator: remove IngressClass resource This commit removes the unused IngressClass resource. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 01 December 2023, 10:27:35 UTC
936c8ae ci-e2e: Use lvh-kind action instead of c/little-vm-helper Signed-off-by: Martynas Pumputis <m@lambda.lt> 01 December 2023, 10:11:51 UTC