https://github.com/cilium/cilium

3de85a1 pkg/k8s: use a deep copy of CNP in UpdateStatus to avoid race condition We modified the UpdateStatus function to ensure that the CNP object is deep-copied before passing it as an argument. This change was necessary because the UpdateStatus function was modifying the CNP object, specifically clearing the LastAppliedConfiguration key from the annotations map. By deep-copying the CNP object, we ensure that the original object remains unmodified which fixes the following race condition: ``` Write at 0x00c002a98510 by goroutine 119834: runtime.mapassign_faststr() /usr/local/go/src/runtime/map_faststr.go:203 +0x0 github.com/cilium/cilium/pkg/k8s.(*CNPStatusUpdateContext).updateViaAPIServer.func1() ./pkg/k8s/cnp.go:215 +0x53 runtime.deferreturn() /usr/local/go/src/runtime/panic.go:477 +0x30 github.com/cilium/cilium/pkg/k8s.(*CNPStatusUpdateContext).updateStatus() ./pkg/k8s/cnp.go:78 +0x2c7 github.com/cilium/cilium/pkg/k8s.(*CNPStatusUpdateContext).UpdateStatus() ./pkg/k8s/cnp.go:146 +0x786 github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).addCiliumNetworkPolicyV2.func1() ./pkg/k8s/watchers/cilium_network_policy.go:352 +0x86 github.com/cilium/cilium/pkg/controller.(*controller).runController() ./pkg/controller/controller.go:251 +0x171 github.com/cilium/cilium/pkg/controller.(*Manager).createControllerLocked.func1() ./pkg/controller/manager.go:111 +0xa4 Previous read at 0x00c002a98510 by goroutine 1205: runtime.mapiterinit() /usr/local/go/src/runtime/map.go:816 +0x0 github.com/cilium/cilium/pkg/comparator.MapStringEqualsIgnoreKeys() ./pkg/comparator/comparator.go:82 +0xb1 github.com/cilium/cilium/pkg/k8s/apis/cilium.io/v2.objectMetaDeepEqual() ./pkg/k8s/apis/cilium.io/v2/cnp_types.go:65 +0xb0 github.com/cilium/cilium/pkg/k8s/apis/cilium.io/v2.(*CiliumNetworkPolicy).DeepEqual() ./pkg/k8s/apis/cilium.io/v2/cnp_types.go:54 +0x177 github.com/cilium/cilium/pkg/k8s/types.(*SlimCNP).DeepEqual() ./pkg/k8s/types/zz_generated.deepequal.go:82 +0xbd 
github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).onUpsert() ./pkg/k8s/watchers/cilium_network_policy.go:238 +0x170 github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).ciliumNetworkPoliciesInit.func1() ./pkg/k8s/watchers/cilium_network_policy.go:175 +0xc64 Goroutine 119834 (running) created at: github.com/cilium/cilium/pkg/controller.(*Manager).createControllerLocked() ./pkg/controller/manager.go:111 +0x757 github.com/cilium/cilium/pkg/controller.(*Manager).updateController() ./pkg/controller/manager.go:84 +0x44f github.com/cilium/cilium/pkg/controller.(*Manager).UpdateController() ./pkg/controller/manager.go:52 +0xe6f github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).addCiliumNetworkPolicyV2() ./pkg/k8s/watchers/cilium_network_policy.go:348 +0xc75 github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).onUpsert() ./pkg/k8s/watchers/cilium_network_policy.go:271 +0x744 github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).ciliumNetworkPoliciesInit.func1() ./pkg/k8s/watchers/cilium_network_policy.go:175 +0xc64 Goroutine 1205 (running) created at: github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).ciliumNetworkPoliciesInit() ./pkg/k8s/watchers/cilium_network_policy.go:91 +0x27c github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).enableK8sWatchers.func1() ./pkg/k8s/watchers/watcher.go:578 +0x59 sync.(*Once).doSlow() /usr/local/go/src/sync/once.go:74 +0xf0 sync.(*Once).Do() /usr/local/go/src/sync/once.go:65 +0x44 github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).enableK8sWatchers() ./pkg/k8s/watchers/watcher.go:578 +0xa24 github.com/cilium/cilium/pkg/k8s/watchers.(*K8sWatcher).InitK8sSubsystem() ./pkg/k8s/watchers/watcher.go:508 +0x104 github.com/cilium/cilium/daemon/cmd.newDaemon() ./daemon/cmd/daemon.go:1001 +0x9070 github.com/cilium/cilium/daemon/cmd.newDaemonPromise.func1() ./daemon/cmd/daemon_main.go:1687 +0xa4 github.com/cilium/cilium/pkg/hive.Hook.Start() ./pkg/hive/lifecycle.go:34 +0x70 
github.com/cilium/cilium/pkg/hive.(*Hook).Start() <autogenerated>:1 +0x1f github.com/cilium/cilium/pkg/hive.(*DefaultLifecycle).Start() ./pkg/hive/lifecycle.go:103 +0x3f1 github.com/cilium/cilium/pkg/hive.(*Hive).Start() ./pkg/hive/hive.go:291 +0x152 github.com/cilium/cilium/pkg/hive.(*Hive).Run() ./pkg/hive/hive.go:191 +0xc4 github.com/cilium/cilium/daemon/cmd.NewAgentCmd.func1() ./daemon/cmd/root.go:39 +0x264 github.com/spf13/cobra.(*Command).execute() ./vendor/github.com/spf13/cobra/command.go:944 +0xcb8 github.com/spf13/cobra.(*Command).ExecuteC() ./vendor/github.com/spf13/cobra/command.go:1068 +0x5c4 github.com/spf13/cobra.(*Command).Execute() ./vendor/github.com/spf13/cobra/command.go:992 +0x2e github.com/cilium/cilium/daemon/cmd.Execute() ./daemon/cmd/root.go:79 +0x2f main.main() ./daemon/main.go:14 +0xa9 ``` Signed-off-by: André Martins <andre@cilium.io> 10 October 2023, 12:06:41 UTC
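The fix described in 3de85a1 can be illustrated with a minimal sketch. The `Policy` type, the `prepareStatusUpdate` helper, and the annotation key below are assumptions made for illustration and are not taken from the Cilium code. The point is only that the status update mutates a private deep copy, so a concurrent reader of the shared object never sees a map write.

```go
package main

import "fmt"

// Policy is a hypothetical stand-in for the CNP object; only the
// annotations map matters for this illustration.
type Policy struct {
	Name        string
	Annotations map[string]string
}

// lastApplied is assumed to be the annotation key that the status
// update clears.
const lastApplied = "kubectl.kubernetes.io/last-applied-configuration"

// DeepCopy returns a copy whose Annotations map shares no storage
// with the receiver.
func (p *Policy) DeepCopy() *Policy {
	out := &Policy{Name: p.Name}
	if p.Annotations != nil {
		out.Annotations = make(map[string]string, len(p.Annotations))
		for k, v := range p.Annotations {
			out.Annotations[k] = v
		}
	}
	return out
}

// prepareStatusUpdate (hypothetical helper) clears the annotation on a
// private copy instead of on the object shared with other goroutines.
func prepareStatusUpdate(shared *Policy) *Policy {
	local := shared.DeepCopy()
	delete(local.Annotations, lastApplied)
	return local
}

func main() {
	shared := &Policy{
		Name:        "demo",
		Annotations: map[string]string{lastApplied: "{}", "owner": "team-a"},
	}
	local := prepareStatusUpdate(shared)
	fmt.Println(len(shared.Annotations), len(local.Annotations))
}
```

Because the watcher goroutine only ever reads `shared`, the map iteration in the comparison path and the map write in the update path no longer touch the same memory.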
9dcc5d1 bpf: overlay: fix missing DBG_DECAP for Inter-Cluster-SNAT For Inter-Cluster-SNAT traffic we tail-call in handle_ipv4() before even reaching the DBG_DECAP message. Pull up the cilium_dbg() call a bit so that it also applies to inter-cluster-SNAT traffic. While at it also clarify which parameters are relevant for the path that handles decrypted packets. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 10 October 2023, 08:57:40 UTC
4cbe8a7 contrib: Add ContainerLab-based BGP CPlane development environment It is not always easy for developers to prepare an environment in which to deploy and test the BGP Control Plane, since it requires an unusual network topology to establish BGP peering and to test reachability from an external node that doesn't run Cilium. This commit adds BGP Control Plane development environments based on ContainerLab and Kind with IPv4/IPv6 single-stack and dual-stack. The topology looks like below. Note that server0-2 are directly connected to router0 with veth. |---server0(Cilium, ASN: 65001) | router0(FRRouting, ASN: 65000)---|---server1(Cilium, ASN: 65002) | |---server2(netshoot, No BGP) router0 has FRR on it. You can use `docker exec -it <container name> vtysh` to do any manual configuration. server0 and server1 are netshoot containers that share a network namespace with a Kind node. You can use the netshoot container through `docker exec` (useful for debugging/troubleshooting) and use Cilium through `kubectl exec`. server2 is a netshoot container you can use for testing external connectivity. The development workflow looks like the following. 1. Deploy lab $ make kind-bgp-<v4|v6|dual> 2. Install Cilium with your favorite tool. Mandatory Helm values are in contrib/containerlab/bgp-cplane-dev-<v4|v6|dual>/values.yaml. $ cilium install -f contrib/containerlab/bgp-cplane-dev-<v4|v6|dual>/values.yaml 3. Do basic peering $ make kind-bgp-<v4|v6|dual>-apply-policy 4. Validate peering state $ cilium bgp peers 5. Enjoy development 6. Cleanup lab $ make kind-bgp-<v4|v6|dual>-down Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> Co-authored-by: Ryan Drew <ryan.drew@isovalent.com> Co-authored-by: Daneyon Hansen <daneyon.hansen@solo.io> 10 October 2023, 08:37:20 UTC
c7287f7 Allow specifying Kind cluster name in make kind-image Allow specifying Kind cluster name with KIND_CLUSTER_NAME environment variable in kind-image-agent, kind-image-operator, and kind-image. So that we can load image to the target cluster with the name other than `kind`. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> Suggested-by: Daneyon Hansen <daneyon.hansen@solo.io> 10 October 2023, 08:37:20 UTC
2554edd Docs: Updates L2 Announce for LB Class Support Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io> 10 October 2023, 08:34:43 UTC
a608db6 gha: add clustermesh upgrade and downgrade tests This commit introduces a new GitHub actions workflow which validates the Cilium upgrade and downgrade paths when clustermesh is enabled, ensuring that this process does not disrupt long living connections. More specifically, the test initially deploys a mixed version cluster mesh, with one cluster running Cilium from the tip of the latest stable branch, and the other the current tip of main. It performs a subset of connectivity tests and deploys the application sensitive to connection disruption. At this point, the first cluster is upgraded to the tip of main, enabling kvstoremesh at the same time. Connectivity tests are again executed, checking that no long living connection was dropped. As an additional stress test, the clustermesh-apiserver deployment in both clusters is scaled to zero replicas, all agents restarted, and then scaled back to one replica, while checking that no long living connection was dropped. Finally, the first cluster is downgraded again to the tip of the latest stable version, disabling kvstoremesh. Connectivity tests and connection disruption checks are executed one more time. To reduce the total amount of time required by this workflow, only a limited subset of tests is enabled, while the full suite is run in the conformance-clustermesh workflow. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 10 October 2023, 08:32:52 UTC
3d749c8 .github/actions/helm-default: make chart-dir configurable Let's allow to override the default chart-dir, in case it is located outside of the standard path. This can happen, for instance, when checking out multiple Cilium versions in upgrade/downgrade workflows. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 10 October 2023, 08:32:52 UTC
5aa4329 gateway-api: Add conformance profile test The report output will be posted to upstream repo for the status and badge. Signed-off-by: Tam Mach <tam.mach@cilium.io> 10 October 2023, 08:28:46 UTC
0f947fe Docs: Updates for Deprecation of CNI network-plugin Flag - Updates minikube install steps for deprecated CLI flag. - Updates K8s requirements for kubelet command-line parameters. Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io> 10 October 2023, 07:39:27 UTC
60f0d40 chore(deps): update dependency cilium/cilium-cli to v0.15.10 Signed-off-by: renovate[bot] <bot@renovateapp.com> 10 October 2023, 00:24:01 UTC
9aa5068 build: Remove envoy from Makefile target As mentioned in envoy/Makefile, the envoy binary is now part of cilium/proxy instead. This commit is to remove unnecessary step from Makefile targets. Signed-off-by: Tam Mach <tam.mach@cilium.io> 09 October 2023, 21:46:38 UTC
ec8ef70 envoy: Import Health check sink API Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 09 October 2023, 10:43:41 UTC
eefa12b daemon: Skip Ingress Endpoint on BPF watchdog Skip endpoints without BPF datapath (currently only the Ingress Endpoint) when checking for Endpoint's bpf programs in `checkEndpointBPFPrograms()`. This helps avoid errors like this: level=error msg="Unable to assert if endpoint BPF programs need to be reloaded" endpoint= error="unable to find endpoint link by name: Link not found" subsys=daemon Fixes: #28126 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 09 October 2023, 08:51:24 UTC
414814f bpf: lxc: remove stale ENABLE_IDENTITY_MARK ifdefs 4de20fc2027a ("bpf: Pass security ID via skb->cb from lxc to host") switched these parts from using set_identity_mark() to set_identity_meta(). Thus we no longer have to respect the opt-in that controls whether the skb mark may be used to store the identity. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 09 October 2023, 07:43:53 UTC
73afc5f Remove daemon health from being reported via the CLI The daemon module currently does not support health rollups and hence reports the default status aka Unknown. This is less than ideal and may lead to confusion. This PR removes daemon health from being reported via the cilium health command. Signed-off-by: Fernand Galiana <fernand.galiana@isovalent.com> 06 October 2023, 17:56:30 UTC
ec63f51 Remove deprecated policy_import_errors_total metric The policy_import_errors_total metric was deprecated in Cilium 1.14 in favor of the policy_change_total metric, see commit 7ab330310ed8 ("Expand agent metric Policy Import Errors to count all policy changes") for details. Signed-off-by: Tobias Klauser <tobias@cilium.io> 06 October 2023, 15:12:51 UTC
0fe7f67 pkg/lbmap: fix incorrect Backend4MapV3Name comment Signed-off-by: yylt <yang8518296@163.com> 06 October 2023, 13:52:40 UTC
24053dd gha: restore kvstoremesh.enabled setting in conformance-clustermesh This setting has been incorrectly dropped in a previous commit, let's restore it, so that we keep testing kvstoremesh. Fixes: 4498ec908b02 (".github: re-use common helm values from a single action") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 06 October 2023, 13:44:19 UTC
bd099f4 .github/actions/helm-default: fix kvstoremesh image The kvstoremesh image repository was incorrectly set to `clustermesh-apiserver-ci`. Let's fix it. Fixes: 4498ec908b02 (".github: re-use common helm values from a single action") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 06 October 2023, 13:44:19 UTC
82d1c37 gha: explicit branch and trigger in ariane-scheduled workflow Let's use explicit bash variables to construct the trigger phrase, instead of directly using the matrix value. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 06 October 2023, 13:44:01 UTC
e50b3c3 Add hubble_relay_pool_peer_connection_status metric This change adds a new gauge metric to hubble-relay measuring the connection status to all peers. The metric keeps track of the number of peers for each possible connection status. The current set of metrics is not enough to accurately measure the availability of hubble-relay. They measure the status of gRPC calls, but, for instance, in case all peers are unreachable when GetFlows is called, even though the gRPC call will succeed and return "OK" status, the response will come with no flows gathered, rendering it useless. This new metric is introduced to cover such cases. Signed-off-by: Michal Siwinski <siwy@google.com> 06 October 2023, 13:43:11 UTC
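As a rough sketch of what a per-status gauge like the one in e50b3c3 computes, the snippet below counts peers by connection status using plain maps. The function name `peerStatusGauge`, the peer names, and the list of status strings (modeled on gRPC connectivity states) are assumptions for illustration, not the hubble-relay implementation.

```go
package main

import "fmt"

// knownStatuses is an assumed set of connection states, modeled on the
// gRPC connectivity states.
var knownStatuses = []string{"IDLE", "CONNECTING", "READY", "TRANSIENT_FAILURE", "SHUTDOWN"}

// peerStatusGauge (hypothetical) returns the number of peers in each
// status. Every known status is present in the result, so a status
// that no peer currently has is reported as 0 rather than left stale.
func peerStatusGauge(peers map[string]string) map[string]int {
	gauge := make(map[string]int, len(knownStatuses))
	for _, s := range knownStatuses {
		gauge[s] = 0
	}
	for _, status := range peers {
		gauge[status]++
	}
	return gauge
}

func main() {
	// Hypothetical peer pool: every peer is unreachable.
	peers := map[string]string{
		"node-a": "TRANSIENT_FAILURE",
		"node-b": "TRANSIENT_FAILURE",
	}
	gauge := peerStatusGauge(peers)
	for _, s := range knownStatuses {
		fmt.Printf("%s=%d\n", s, gauge[s])
	}
}
```

In this situation a GetFlows call could still return "OK" with no flows, while the gauge shows zero peers in the `READY` state, which is the availability signal the commit message asks for.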
7646b69 envoy: Set enforce_policy_on_l7lb on l7 LB listener filters Turn on ingress policy enforcement on L7 LB. With this cilium-envoy starts dropping Ingress traffic unless Cilium Agent configures it with a passing policy via the Ingress endpoint (with the reserved:ingress identity). Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 06 October 2023, 13:42:29 UTC
b035460 k8s: Use reserved:ingress identity also for Gateway API Tell envoy package to use the IPs allocated for reserved:ingress identity also when a CEC/CCEC is owned by Gateway API in addition to Cilium Ingress. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 06 October 2023, 13:42:29 UTC
04f19e9 daemon: Add ingress endpoint Add Cilium Endpoint representing Ingress. It is defined without a veth interface and no bpf programs or maps are created for it. Ingress endpoint is needed so that the network policy is computed and configured to Envoy, so that ingress/egress network policy defined for Ingress can be enforced. Cilium Ingress is implemented as L7 LB, which is an Envoy redirect on the egress packet path. Egress CNP policies are already enforced when defined. Prior to this commit, however, CNPs defined for the reserved:ingress identity were not computed, and all traffic passed through Cilium Ingress was allowed to egress towards the backends. When the backends receive such packets, they are identified as coming from Cilium Ingress, so any ingress policies at the backends cannot discern the original source of the traffic. This commit adds a Cilium endpoint for the reserved:ingress identity, which makes the Cilium node compute and pass policies whose endpoint selector selects this identity (e.g., by selecting all entities) to Envoy, so that they can be enforced. Envoy listener will then enforce not just the egress policy but also the ingress policy for the original incoming source security identity. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 06 October 2023, 13:42:29 UTC
d4543a8 envoy: Update to cilium-envoy with enhanced L7 LB policy enforcement Update cilium-envoy image to main that now has enhanced support for L7 LB policy enforcement. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 06 October 2023, 13:42:29 UTC
9959153 envoy: Fix support for allow-all network policies Generate the ingress/egress network policy also when l4 filter is nil. This enables creating allow-all rules when policy is not enforced. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 06 October 2023, 13:42:29 UTC
833245f labels/cidr: Fix slice preallocation size The number of CIDR labels returned from GetCIDRLabels are equal to the sum of: - ones+1 labels coming from the prefixes generated in the loop - an additional one for the reserved "world" label To improve performance, we should then allocate a slice of size ones+2. benchstat analysis confirms this reporting 1 allocation less for each operation, except for the "0.0.0.0/0" corner case. name old time/op new time/op delta GetCIDRLabels/0.0.0.0/0-8 306ns ±36% 299ns ±32% ~ (p=0.251 n=20+20) GetCIDRLabels/10.16.0.0/16-8 4.42µs ± 5% 4.17µs ± 3% -5.65% (p=0.000 n=19+19) GetCIDRLabels/192.0.2.3/32-8 8.43µs ± 5% 8.02µs ± 5% -4.88% (p=0.000 n=20+20) GetCIDRLabels/192.0.2.3/24-8 6.79µs ± 5% 6.49µs ± 5% -4.41% (p=0.000 n=20+20) GetCIDRLabels/192.0.2.0/24-8 6.78µs ± 4% 6.46µs ± 4% -4.68% (p=0.000 n=19+20) GetCIDRLabels/::/0-8 329ns ± 2% 338ns ± 3% +2.54% (p=0.000 n=19+20) GetCIDRLabels/fdff::ff/128-8 53.6µs ± 4% 52.1µs ± 6% -2.66% (p=0.000 n=19+17) GetCIDRLabels/f00d:42::ff/128-8 56.8µs ± 4% 55.1µs ± 2% -3.04% (p=0.000 n=20+18) GetCIDRLabels/f00d:42::ff/96-8 42.4µs ± 3% 40.8µs ± 6% -3.82% (p=0.000 n=20+20) name old alloc/op new alloc/op delta GetCIDRLabels/0.0.0.0/0-8 640B ± 0% 656B ± 0% +2.50% (p=0.000 n=20+20) GetCIDRLabels/10.16.0.0/16-8 3.75kB ± 0% 3.17kB ± 0% -15.37% (p=0.000 n=18+20) GetCIDRLabels/192.0.2.3/32-8 8.03kB ± 0% 6.88kB ± 0% -14.34% (p=0.000 n=16+20) GetCIDRLabels/192.0.2.3/24-8 7.21kB ± 0% 6.32kB ± 0% -12.42% (p=0.000 n=20+19) GetCIDRLabels/192.0.2.0/24-8 7.21kB ± 0% 6.32kB ± 0% -12.42% (p=0.000 n=19+20) GetCIDRLabels/::/0-8 640B ± 0% 656B ± 0% +2.50% (p=0.000 n=20+20) GetCIDRLabels/fdff::ff/128-8 30.8kB ± 0% 25.9kB ± 0% -15.80% (p=0.000 n=20+19) GetCIDRLabels/f00d:42::ff/128-8 33.7kB ± 0% 28.8kB ± 0% -14.45% (p=0.000 n=19+20) GetCIDRLabels/f00d:42::ff/96-8 28.3kB ± 0% 25.1kB ± 0% -11.31% (p=0.000 n=20+20) name old allocs/op new allocs/op delta GetCIDRLabels/0.0.0.0/0-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) 
GetCIDRLabels/10.16.0.0/16-8 38.0 ± 0% 37.0 ± 0% -2.63% (p=0.000 n=20+20) GetCIDRLabels/192.0.2.3/32-8 70.0 ± 0% 69.0 ± 0% -1.43% (p=0.000 n=20+20) GetCIDRLabels/192.0.2.3/24-8 54.0 ± 0% 53.0 ± 0% -1.85% (p=0.000 n=20+20) GetCIDRLabels/192.0.2.0/24-8 54.0 ± 0% 53.0 ± 0% -1.85% (p=0.000 n=20+20) GetCIDRLabels/::/0-8 3.00 ± 0% 3.00 ± 0% ~ (all equal) GetCIDRLabels/fdff::ff/128-8 450 ± 0% 449 ± 0% -0.22% (p=0.000 n=20+20) GetCIDRLabels/f00d:42::ff/128-8 450 ± 0% 449 ± 0% -0.22% (p=0.000 n=20+20) GetCIDRLabels/f00d:42::ff/96-8 296 ± 0% 295 ± 0% -0.34% (p=0.000 n=20+20) Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 06 October 2023, 13:38:48 UTC
ed6ec52 ci: Avoid using deprecated "tunnel" flag The tunnel option is deprecated and will be removed in Cilium v1.15. This commit fixes the remaining uses I have found where the CI explicitly set the old `tunnel` flag. Note that the Cilium CLI also still sets the flag some times in our CI, this is addressed by cilium/cilium-cli#1993. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 06 October 2023, 13:37:40 UTC
f6abd9d bgpv1: Fix BGP component tests using the same VirtualRouter config As some of the BGP component tests mutate the VirtualRouter config, we need to make sure that mutated config is not used by the following tests. That was the case before this fix, as multiple tests are using the same baseBGPPolicy. Now each test creates a deep copy of the base config, which is fine to mutate within the test. This change also ensures that fixture.config.policy always contains the most up-to-date policy. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 06 October 2023, 13:37:17 UTC
9ae4269 Setup renovate for SPIRE deployment This adds a check for the spire-server and spire-agent Docker images for Renovate. It also fixes the busybox image so that versions newer than 1.35 are allowed. Signed-off-by: Maartje Eyskens <maartje.eyskens@isovalent.com> 06 October 2023, 13:31:59 UTC
d93659c Resiliency: Checks endpoints BPF programs remain loaded We've seen instances where BPF programs are deleted by third-party software (aka Ddog). When this occurs, either agents could fail to restart or connectivity may be compromised. This PR attempts to reconcile BPF program state with the associated endpoints by ensuring endpoint programs are reloaded when necessary. - Add a watchdog to ensure endpoint BPF programs remain loaded - Check for tc qdisc/ingress filter presence - Add configuration to enable/disable this feature and configure watchdog rate - Add call to terminate controllers upon daemon shutting down Signed-off-by: Fernand Galiana <fernand.galiana@isovalent.com> 06 October 2023, 13:31:32 UTC
b3ae437 endpoint: correct stats for prepareBuild A previous change relocated the CT cleanup blocker, but left the prepareBuild stats calculated after that. Afterwards, some modifications were also made to the code related to prebuild. So the stats related to prepareBuild should be handled within runPreCompilationSteps. Fixes: eaa486d787ef ("endpoint: Wait for CT cleanup to complete before BPF compilation") Signed-off-by: Li Chun <chun2.li@intel.com> 06 October 2023, 13:21:51 UTC
4e3da76 ipam: Remove unused mock function This removes an unused mock function. It should have been removed as part of a previous cleanup commit. Fixes: ae25fca59c20 ("ipam: Remove cluster-pool-v2beta agent implementation") Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 06 October 2023, 13:20:38 UTC
4b980e1 helm: add jobLabel to cilium-{agent,operator} and hubble serviceMonitor Signed-off-by: Ralph Bankston <ralph.bankston@isovalent.com> 06 October 2023, 12:56:09 UTC
2d00e56 resource: Improve documentation on Events() The semantics around retrying and the Sync event are subtle. Spell out the properties in a comment for Events(). Signed-off-by: Jussi Maki <jussi@isovalent.com> 06 October 2023, 11:00:38 UTC
a7f1eef resource: Fix double upserts on subscribe and retrying of delete events Fix double upserts that were caused by store being manipulated without synchronization with the subscriber queues by processing the deltas under the resource read-lock and doing the initial key listing for new subscriber with a write-lock. This way we cannot accidentally see a key in the store and process it just before the key is queued. As shown by test case in previous commit, the delete events are retried with an old incorrect version of the object causing a recreated object to be deleted. Fix the deletion retrying by always queueing upserts and deletes by key and keeping the last known state of objects emitted to the subscriber. Only emit a delete event if the subscriber has seen its creation and only use a version of the object that the subscriber has observed. Fixes: 4101e2c768 ("k8s: Add resource package") Signed-off-by: Jussi Maki <jussi@isovalent.com> 06 October 2023, 11:00:38 UTC
75927f7 resource: Add test for repeated deletions Resource[T] does not correctly handle the events: Upsert -> Delete with Done(not-nil) -> Upsert (recreate) -> Delete (retry) The retried delete event carries the old initial version of the object causing the recreated object to be incorrectly deleted. Signed-off-by: Jussi Maki <jussi@isovalent.com> 06 October 2023, 11:00:38 UTC
262d59e ci: update docs-builder Manual update for docs-builder tag, because the related PR's branch was not opened in the Cilium repository (but from a fork). Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 October 2023, 10:29:58 UTC
0397bf4 Update docs theme Signed-off-by: Raphaël Pinson <raphael@isovalent.com> 06 October 2023, 10:29:58 UTC
547d6a7 mapstate: optimize denyPreferredInsert Since the denyPreferredInsert logic cleanly splits on whether the to be inserted entry is a deny entry, we can avoid iterating over some entries: If we are inserting a deny, we are mostly interested in iteration over existing allows, and vice versa. In the case where we have a FQDN policy for a hostname which resolves to many IPs (for example with a fqdn policy covering Amazon's S3), this becomes a hot function (without performing a lot of useful work). The S3 workload is expected to be heavily skewed towards allow entries for the many CIDR identities allocated for S3. When we insert such an entry, we only need to iterate deny entries, of which there are expected to be much fewer. Now that the tracking is split, we can employ this fact and, instead of iterating all entries, iterate only the deny entries. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 06 October 2023, 10:02:14 UTC
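The idea behind 547d6a7 and c54289b can be sketched as follows, under heavily simplified assumptions: the `key` type, the wildcard-on-zero `covers` rule, and the `insert` method are hypothetical and much simpler than Cilium's real precedence logic. What the sketch does show is that, once allows and denies live in separate maps, inserting an allow only has to visit the (few) deny entries.

```go
package main

import "fmt"

// key is a simplified, hypothetical policy-map key; 0 acts as a wildcard.
type key struct {
	identity uint32
	port     uint16
}

// mapState tracks allow and deny entries separately.
type mapState struct {
	allows map[key]struct{}
	denies map[key]struct{}
}

func newMapState() *mapState {
	return &mapState{allows: map[key]struct{}{}, denies: map[key]struct{}{}}
}

// covers reports whether a matches everything b matches (assumed rule).
func covers(a, b key) bool {
	return (a.identity == 0 || a.identity == b.identity) &&
		(a.port == 0 || a.port == b.port)
}

// insert adds k and returns how many existing entries were visited.
// A deny only inspects allows; an allow only inspects denies.
func (ms *mapState) insert(k key, deny bool) int {
	visited := 0
	if deny {
		for ak := range ms.allows {
			visited++
			if covers(k, ak) {
				delete(ms.allows, ak) // the deny takes precedence
			}
		}
		ms.denies[k] = struct{}{}
		return visited
	}
	blocked := false
	for dk := range ms.denies {
		visited++
		if covers(dk, k) {
			blocked = true
		}
	}
	if !blocked {
		ms.allows[k] = struct{}{}
	}
	return visited
}

func main() {
	ms := newMapState()
	ms.insert(key{port: 22}, true) // one deny entry
	for id := uint32(1); id <= 1000; id++ {
		ms.insert(key{identity: id, port: 443}, false) // many allows
	}
	fmt.Println("visited:", ms.insert(key{identity: 2000, port: 443}, false))
}
```

With a single map, the last insert would have walked all 1001 existing entries; with the split it visits only the one deny entry.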
c54289b mapstate: split allows and denies Now that mapState is a struct, we can split the tracking of allows and denies. This commit should not introduce functional changes, but prepares us for an optimization in a later commit. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 06 October 2023, 10:02:14 UTC
3159946 policy: Make MapState an interface `MapState` has many methods, some of which are public. In order to keep state to accomplish creating a cohesive map state that can keep state besides what will be put into an ebpf map, `MapState` can no longer be a type declaration for a golang map, but must be a struct that keeps state. The endpoint package must also be refactored to use `MapState` as an interface rather than an instantiated type of a map. Signed-off-by: Nate Sweet <nathanjsweet@pm.me> Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 06 October 2023, 10:02:14 UTC
6ae7515 docs: Update Kubernetes Gateway-API version to v0.8.1 Synchronize the Cilium Gateway API version support description to align with the commit d4d7ff4282a2 ("gateway-api: Bump the version to v0.8.1"). Signed-off-by: Haiyue Wang <haiyue.wang@intel.com> 06 October 2023, 09:40:42 UTC
b825f4e Gateway API double slash when stripping path prefix fix Path prefixes could not be stripped, which led to double slashes. Fixes: #28270 Signed-off-by: Dawid Danieluk <lolnoxy@gmail.com> 06 October 2023, 07:13:20 UTC
c338925 bpf: encap: clean up usage of __encap_and_redirect_with_nodeid() Several callers pass src_ip = 0 and vni = NOT_VTEP_DST, and can thus be switched to the simpler encap_and_redirect_with_nodeid() variant. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 06 October 2023, 05:51:39 UTC
2c5fa7b bpf: don't sign-extend IPv6 address components Unqualified integers are treated as `int` in C, but the behaviour when doing binary operations on them against unsigned integers is surprising, to say the least. Consider the following scenario, before this patch: ``` #define DEFINE_IPV6(NAME, a1, a2, a3, a4, a5, a6, a7, a8, a9, a10, a11, a12, a13, a14, a15, a16) \ DEFINE_U64_I(NAME, 1) = bpf_cpu_to_be64( \ (__u64)(a1) << 56 | (__u64)(a2) << 48 | (__u64)(a3) << 40 | \ (__u64)(a4) << 32 | (a5) << 24 | (a6) << 16 | (a7) << 8 | (a8)) .. <elip> DEFINE_IPV6(ROUTER_IP, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x80, 0x0, 0x0, 0x0); ``` Both a5 and a13 are set to a byte with a most-significant bit of 1. This results in the following output in the .data section during compilation: ``` 0000000000000038 <ROUTER_IP_1>: 7: ff ff ff ff 80 00 00 00 <unknown> 0000000000000040 <ROUTER_IP_2>: 8: ff ff ff ff 80 00 00 00 <unknown> ``` Since the `<< 24` pushed the 1 bit to now be the most-significant bit of an int, equivalent to an int32 on most 64-bit architectures, the subsequent OR operation against a __u64 causes the value to become sign-extended, setting the upper 32 bits of the resulting (unsigned..) integer to 1. Ouch. This commit treats each member as a u64, and additionally truncates each macro argument to u8 to prevent any of the arguments overlapping as a defensive measure. Signed-off-by: Timo Beckers <timo@isovalent.com> 05 October 2023, 20:52:49 UTC
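The sign-extension effect described in 2c5fa7b is a property of C integer promotion, but it can be emulated in Go by doing the low-word arithmetic in `int32` and then widening through `int64`, which is what the C expression effectively does when the `int` result is ORed with a `__u64`. The function names below are hypothetical, and this is an emulation under that assumption, not the macro itself.

```go
package main

import "fmt"

// buggyLow32 mimics the old macro: a5..a8 are combined as 32-bit signed
// integers, and the result is then widened to 64 bits. Widening a
// negative int32 sign-extends it, setting the upper 32 bits.
func buggyLow32(a5, a6, a7, a8 int32) uint64 {
	low := a5<<24 | a6<<16 | a7<<8 | a8
	return uint64(int64(low))
}

// fixedLow32 mimics the fix: each argument is truncated to 8 bits and
// treated as an unsigned 64-bit value before shifting.
func fixedLow32(a5, a6, a7, a8 int32) uint64 {
	return uint64(uint8(a5))<<24 | uint64(uint8(a6))<<16 |
		uint64(uint8(a7))<<8 | uint64(uint8(a8))
}

func main() {
	fmt.Printf("%016x\n", buggyLow32(0x80, 0, 0, 0))
	fmt.Printf("%016x\n", fixedLow32(0x80, 0, 0, 0))
}
```

The first value has all of its upper 32 bits set, matching the `ff ff ff ff 80 00 00 00` bytes shown in the commit message, while the second keeps them clear.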
e773f96 gha: do not hardcode AWS VPC CNI plugin version in conformance-aws-cni Currently, we are installing a fixed version of the AWS VPC CNI plugin (i.e., v1.11) regardless of the Kubernetes version. Yet, this means that in certain cases we upgrade it, while in others we downgrade it, introducing unnecessary churn. Let's instead use the default one that gets installed with the given k8s version. Specifically, for the k8s versions currently tested: * k8s: v1.23 - AWS VPC CNI: v1.10.4-eksbuild.1 * k8s: v1.24 - AWS VPC CNI: v1.11.4-eksbuild.1 * k8s: v1.25 - AWS VPC CNI: v1.12.2-eksbuild.1 * k8s: v1.26 - AWS VPC CNI: v1.12.5-eksbuild.2 * k8s: v1.27 - AWS VPC CNI: v1.12.6-eksbuild.2 Retrieved through: for MINOR in $(seq 23 27) do echo -n "k8s: v1.$MINOR - AWS VPC CNI: " aws eks describe-addon-versions --addon-name vpc-cni \ --kubernetes-version 1.$MINOR --output yaml | \ yq '.addons[].addonVersions[] | select(.compatibilities[].defaultVersion == true) | .addonVersion'; done Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 05 October 2023, 15:29:50 UTC
4dd47d4 chore(deps): update dependency cilium/cilium-cli to v0.15.9 Signed-off-by: Joe Stringer <joe@cilium.io> 05 October 2023, 12:50:13 UTC
4bed3ef .github/actions/helm-default: use the derived SHA as image tag This commit ensures that the correct image tag, derived from the GitHub SHA, is used in the Helm chart when workflows are triggered by events other than 'workflow_dispatch' or 'pull_request'. Fixes: 4498ec908b02 (".github: re-use common helm values from a single action") Reported-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: André Martins <andre@cilium.io> 05 October 2023, 11:37:44 UTC
d50a525 ipcache: aggregate labels from all IPs with local host identity The `reserved:host` identity is special: the numeric identity is fixed and the set of labels is mutable. (The datapath requires this.) So, we need to determine all prefixes that have the `reserved:host` label and capture their labels. Then, we must aggregate *all* labels from all IPs and insert them as the `reserved:host` identity labels. However, the code as written has a race condition whenever the local node has more than one IP address. This can happen when, for example, vxlan or ipv6 is enabled. The basic sequence is this: 1. Insert IP A as `reserved:host` into the ipcache. ID 1 now has labels `reserved:host` 2. Insert IP A as `reserved:kube-apiserver` into the ipcache. ID 1 is updated with labels `reserved:host, reserved:kube-apiserver` 3. Insert IP B as `reserved:host` into the ipcache. ID 1 is updated with labels `reserved:host`. And now policies that select `reserved:kube-apiserver` are broken. Likewise, we need to always update the SelectorCache; we cannot short-circuit if the ipcache already has that identity. Again, this is needed because the identity is mutable. So this bug can take another form: 1. Insert IP A as `reserved:host` into the ipcache. Because IP A is not known to the ipcache, treat ID 1 as a new identity and update the selector cache 2. Insert IP A as `reserved:kube-apiserver`. Mutate the labels of ID 1. But, because IP A already has ID 1, short-circuit the update to the selector cache (if the Source is the same, which it _may_ be). 3. Now the selector cache has incorrect labels for ID 1. Without this, when there are multiple IPs with the host label, the identity may flap and the SelectorCache may be missing updates. Fixes: #28259 Fixes: e0d403adc Fixes: 308c14225 Signed-off-by: Casey Callendrello <cdc@isovalent.com> 05 October 2023, 10:34:50 UTC
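The aggregation rule described in d50a525 can be sketched with plain maps. The function `hostIdentityLabels`, the string-based label representation, and the example addresses are assumptions for illustration and do not mirror the ipcache data structures. The sketch shows why the host identity must be computed from every prefix carrying `reserved:host` rather than from whichever IP was inserted last.

```go
package main

import (
	"fmt"
	"sort"
)

const hostLabel = "reserved:host"

// hostIdentityLabels (hypothetical) returns the union of all labels
// found on prefixes that carry the reserved:host label.
func hostIdentityLabels(byPrefix map[string][]string) []string {
	union := map[string]struct{}{}
	for _, lbls := range byPrefix {
		isHost := false
		for _, l := range lbls {
			if l == hostLabel {
				isHost = true
				break
			}
		}
		if !isHost {
			continue // prefix does not belong to the local host
		}
		for _, l := range lbls {
			union[l] = struct{}{}
		}
	}
	out := make([]string, 0, len(union))
	for l := range union {
		out = append(out, l)
	}
	sort.Strings(out) // deterministic order for comparison
	return out
}

func main() {
	// Hypothetical state after the three inserts from the commit message.
	byPrefix := map[string][]string{
		"192.0.2.1/32": {hostLabel, "reserved:kube-apiserver"}, // IP A
		"192.0.2.2/32": {hostLabel},                            // IP B
		"192.0.2.9/32": {"reserved:remote-node"},               // not local
	}
	fmt.Println(hostIdentityLabels(byPrefix))
}
```

Recomputing the union on every insert means adding IP B cannot drop the `reserved:kube-apiserver` label contributed by IP A.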
6ea3ed5 Makefile fix kind-install-cilium-fast target The kind-install-cilium-fast should install Cilium on all available kind clusters. Fixes: 5f88bbffeb50 ("docs: Add Makefile and documentation for "fast" development targets") Signed-off-by: André Martins <andre@cilium.io> 05 October 2023, 08:05:34 UTC
5741606 Makefile: fix 'fast' make targets These targets failed to select the nodes of a kind cluster. This change selects the right variables that are used to check for the cluster nodes. Fixes: 5f88bbffeb50 ("docs: Add Makefile and documentation for "fast" development targets") Signed-off-by: André Martins <andre@cilium.io> 05 October 2023, 08:05:34 UTC
4498ec9 .github: re-use common helm values from a single action A lot of our GH workflows use the same helm values to deploy Cilium. Thus, it makes sense to re-use one file for that purpose and use it as a GH composite action. Signed-off-by: André Martins <andre@cilium.io> 05 October 2023, 07:35:17 UTC
d53d6ba CODEOWNERS: assign .github/actions to github-sec and ci-structure similarly to what we are already doing with .github/workflows Signed-off-by: Gilberto Bertin <jibi@cilium.io> 05 October 2023, 07:31:19 UTC
9865f8c bpf: egressgw: make ct_status an enum Allow for a tiny bit more type safety. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 04 October 2023, 17:34:41 UTC
425254d Remove dependencies on linux probes for Windows builds We provide CLI binaries for Windows in `cilium/hubble` and `cilium/cilium-cli`. These repositories import parts of cilium, including the `pkg/metrics` package. So we need to make sure that `pkg/metrics` builds on Windows. Signed-off-by: Fabian Fischer <fabian.fischer@isovalent.com> 04 October 2023, 15:34:55 UTC
a85e6f5 gha: Disable HTTPRouteRequestMultipleMirrors test Temporarily disable this test; de-flaking will be tracked under the issue below. Relates: https://github.com/cilium/cilium/issues/28374 Signed-off-by: Tam Mach <tam.mach@cilium.io> 04 October 2023, 12:38:44 UTC
d031248 fix: Remove the latest image tag from docs as latest tag is not published Signed-off-by: Vipul Singh <vipul21sept@gmail.com> 04 October 2023, 12:08:53 UTC
ec91f1c cilium ingress support set the number of trusted loadbalancer hops (as per Envoy config xff_num_trusted_hops) Fixes: #24292 Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 04 October 2023, 10:53:23 UTC
bfb7736 BGP CP: API Helper Functions Cleanup - Moves functions next to types. - Renames functions to ParseXXX - Adds godocs. Signed-off-by: Daneyon Hansen <daneyon.hansen@solo.io> 04 October 2023, 09:00:35 UTC
9d0c201 watchers: remove ciliumnodechain Previous commits in the series have removed all subscribers to the CiliumNodeChain by moving them over to use the LocalCiliumNodeResource. This commit reaps the benefits by deleting all of the now unused code. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 04 October 2023, 08:57:42 UTC
3f3e019 ipam: multipool: use resource for node events As part of a refactoring effort to remove the CiliumNodeChain, remove the multipool IPAM dependency on it. Since IPAM isn't yet part of the hive, we need to pass the resource down to IPAM through the daemon, in `startIPAM`. Once IPAM code is modularised, this is no longer necessary. In order to keep the tests largely unchanged, we reimplement the fakeNodeK8sAPI testing infrastructure in terms of the resource API. This leads to code duplication, but a commit later in this series removes the duplication again, by also converting the other usages to the resource style. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 04 October 2023, 08:57:42 UTC
dff52b8 bgp/speaker: use Resource[T] instead of NodeChains As part of a refactoring effort to remove the CiliumNodeChain, remove the dependency of the MetalLBSpeaker on it, by directly using the LocalCiliumNodeResource. Note that the speaker always discarded events not pertaining to the local Node, hence no functional change results from the fact that the CiliumNodeChain propagated events from all cluster CiliumNodes. As an added benefit, the CiliumNode events now don't stop coming once we connect to KVStore, since the resource doesn't stop. Signed-off-by: David Bimmler <david.bimmler@isovalent.com> 04 October 2023, 08:57:42 UTC
9ac6721 pkg/pprof: add CODEOWNER The package is currently owned by tophat. Let's give it to the last team that touched it. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 04 October 2023, 08:56:44 UTC
eee2414 Fix data race during Hubble setup in the daemon This commit fixes a data race where the setup of the hubbleObserver is being initialized in a new goroutine while the StatusCollector already tries to check Hubble's status. We fix the data race by switching to an atomic pointer to access the `hubbleObserver` in the Daemon struct. Fixing the data race by simply waiting to start the status collector until hubble is initialized does not work, as `launchHubble` could take up to 30s to return while waiting for a TLS certificate and we don't want to block agent startup for this. Fixes: #28291 Signed-off-by: Fabian Fischer <fabian.fischer@isovalent.com> 04 October 2023, 08:56:17 UTC
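The atomic-pointer approach from eee2414 can be sketched as follows. This is an illustration under assumptions, not the daemon code: the `observer` and `daemon` types and the `status` method are hypothetical stand-ins, and the sketch relies on `atomic.Pointer[T]` from the standard library (Go 1.19 or later).

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// observer is a hypothetical stand-in for the Hubble observer.
type observer struct{ flows int }

// daemon holds the observer behind an atomic pointer so that a status
// reader and the setup goroutine never race on the field.
type daemon struct {
	hubbleObserver atomic.Pointer[observer]
}

// status can be called at any time; a nil pointer means setup has not
// finished yet, which is reported instead of blocking.
func (d *daemon) status() string {
	obs := d.hubbleObserver.Load()
	if obs == nil {
		return "hubble: not ready"
	}
	return fmt.Sprintf("hubble: ok (%d flows)", obs.flows)
}

func main() {
	d := &daemon{}
	fmt.Println(d.status()) // setup has not run yet

	done := make(chan struct{})
	go func() {
		// Slow setup (for example waiting for a TLS certificate) happens
		// here, off the startup path; the result is published atomically.
		d.hubbleObserver.Store(&observer{flows: 42})
		close(done)
	}()
	<-done
	fmt.Println(d.status())
}
```

The status reader never waits for setup, which matches the commit's reason for not simply delaying the status collector.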
8e318cb ci: update docs-builder Signed-off-by: Cilium Imagebot <noreply@cilium.io> 04 October 2023, 08:53:17 UTC
07b3e70 build(deps): bump urllib3 from 2.0.4 to 2.0.6 in /Documentation Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.0.4 to 2.0.6. - [Release notes](https://github.com/urllib3/urllib3/releases) - [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst) - [Commits](https://github.com/urllib3/urllib3/compare/2.0.4...2.0.6) --- updated-dependencies: - dependency-name: urllib3 dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> 04 October 2023, 08:53:17 UTC
db9a2a9 chore(deps): update all lvh-images main Signed-off-by: renovate[bot] <bot@renovateapp.com> 04 October 2023, 08:52:25 UTC
43e8447 workflows: cilium-config: parametrize egressgw helm values Signed-off-by: Gilberto Bertin <jibi@cilium.io> 04 October 2023, 08:50:29 UTC
5e5c3aa bpf: clean up CB_NAT CB_NAT is no longer used. Instead just define a generic CB_3 placeholder. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 04 October 2023, 07:33:12 UTC
235211a fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 03 October 2023, 22:41:37 UTC
b4da4e6 metrics: Add map pressure metric for auth map This commit is to make sure that we are having map pressure metric for auth map to improve observability, which can be useful for adjusting map size if required. Relates: https://github.com/cilium/cilium/issues/24617 Signed-off-by: Tam Mach <tam.mach@cilium.io> 03 October 2023, 21:43:14 UTC
30d4a33 docs: Add policymap pressure debugging guide Signed-off-by: Chris Tarazi <chris@isovalent.com> 03 October 2023, 17:23:20 UTC
059586a docs: Fix typo for NameManager Signed-off-by: Chris Tarazi <chris@isovalent.com> 03 October 2023, 17:23:20 UTC
33541b3 contrib: Fix missing function in post-release.sh Commit 4afed72cde71 ("contrib: Move github release to post-release") introduced a usage of version_is_prerelease() in post-release.sh, but this was defined in start-release.sh so the function couldn't be found. Move it to common.sh so it can be reused across scripts. Fixes: 4afed72cde71 ("contrib: Move github release to post-release") Signed-off-by: Joe Stringer <joe@cilium.io> 03 October 2023, 16:20:23 UTC
97b8d1e docs: Remove bare URLs from Flow gRPC API Reference The Flow gRPC API Reference has bare URLs to Cilium's source code on Github. It also has a bare URL to the W3C Trace Context specification. This updates the non-tabled bare URLs to render as links. This also adds punctuation in several places for consistency. Signed-off-by: Stacy Kim <stacy.kim@ucla.edu> 03 October 2023, 13:53:12 UTC
93f4011 backporting: Revert changes until the new workflow will be in place The new GH workflow "Update labels of backported PRs", called from each stable branch, updates each `backport-pending` PR to mark it as `backport-done` when the related backport is merged. So, there should be no need to use contrib/backporting/set-labels.py to do that anymore. However, there are still some in-flight backport PRs that rely on the script to do that manually. Thus, this commit reverts the changes in 2966b03469 to restore the old script code. Once all the new GH workflows are in place for the stable branches and the backport PRs are created following the new format documented in the backporting docs, the script may be updated. Fixes: 2966b03469 ("backporting: Update docs after introduction of Label Updater") Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 03 October 2023, 13:46:50 UTC
79f07df gateway-api: Add support for multiple request mirrors Signed-off-by: Tam Mach <tam.mach@cilium.io> 03 October 2023, 12:12:03 UTC
def1770 ipsec: Clear node ID and SPI with output-mark TL;DR. This commit works around a Cilium iptables bug that can affect IPsec. The root of the issue for IPsec is a conflict between the packet mark we use and some iptables-nft rules. The workaround changes the packet mark to avoid the conflict. Cilium's IPsec implementation relies on the packet mark to match packets against XFRM rules. In particular, the packet mark holds the node ID (0xffff0000), the SPI (0xf000), and whether to encrypt or decrypt (0xf00). On egress, the packet mark is written in bpf_lxc before the packet is sent to the stack for encryption. The encryption layer retains the packet mark, except for the encryption bit (0xf00) in some cases. bpf_host then processes the encrypted packet and clears the mark. In some corner cases (what causes this isn't clear yet), some OSes can end up with kube-proxy rules in both iptables-nft and iptables-legacy. Cilium is currently unable to handle that situation properly: it should install iptables rules to skip some of kube-proxy's rules, but only does that for iptables-legacy. We can therefore end up with the following rules matching outgoing packets: [31061:1736670] -A POSTROUTING -m comment --comment "kubernetes postrouting rules" -j KUBE-POSTROUTING [31059:1736390] -A KUBE-POSTROUTING -m mark ! --mark 0x4000/0x4000 -j RETURN [2:280] -A KUBE-POSTROUTING -j MARK --set-xmark 0x4000/0x0 [2:280] -A KUBE-POSTROUTING -m comment --comment "kubernetes service traffic requiring SNAT" -j MASQUERADE --random-fully We can see that kube-proxy installs a KUBE-POSTROUTING chain in the POSTROUTING netfilter chain. That chain skips all packets not marked with 0x4XXX. Other packets see the mark removed and are then masqueraded. In the above example, 2 packets had the 0x4XXX mark and were masqueraded. These rules can end up matching the mark of our encrypted packets, if the SPI is 4. Encrypted packets are then incorrectly masqueraded. Depending on the node and network configuration, they may end up dropped. The SPI is incremented on key rotation (from 1 to 15, then going back to 1 again). Users performing key rotations are therefore highly likely to hit this bug if they have iptables-nft kube-proxy rules installed. We can work around this bug by clearing all packet mark bits we don't need anymore after the encryption. Once the packet is encrypted, bpf_host only needs the 0xf00 part of the mark to determine if a packet is encrypted or not. The node ID and SPI can be cleared from the mark. This commit therefore clears those bits right after encryption, with output-mark, on egress and ingress. Note this only avoids the IPsec impact of this bug, but doesn't fix the underlying issue with iptables-nft. More work is needed to have the agent correctly handle this situation. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 03 October 2023, 09:26:09 UTC
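The mark arithmetic in def1770 can be checked with a few bit operations. The masks (0xffff0000, 0xf000, 0xf00) and the kube-proxy match 0x4000/0x4000 come from the commit message; the helper names, the node ID 0x1234, and the encryption value 0xe00 are assumptions chosen for illustration. The real clearing is done by the kernel via the XFRM output-mark, not by Go code.

```go
package main

import "fmt"

const (
	nodeIDMask  uint32 = 0xffff0000 // node ID bits of the packet mark
	spiMask     uint32 = 0x0000f000 // SPI bits
	encryptMask uint32 = 0x00000f00 // encrypt/decrypt bits

	// kube-proxy's KUBE-POSTROUTING chain masquerades packets whose mark
	// matches 0x4000/0x4000.
	kubeProxyMasqBit uint32 = 0x4000
)

// buildMark (hypothetical helper) packs node ID, SPI and the encryption
// bits into one mark, following the layout given in the commit message.
func buildMark(nodeID uint16, spi uint8, encBits uint32) uint32 {
	return uint32(nodeID)<<16 | (uint32(spi)<<12)&spiMask | encBits&encryptMask
}

// clearAfterEncryption keeps only the bits bpf_host still needs once the
// packet has been encrypted.
func clearAfterEncryption(mark uint32) uint32 {
	return mark & encryptMask
}

// matchesKubeProxyMasq reports whether the KUBE-POSTROUTING rules would
// masquerade a packet carrying this mark.
func matchesKubeProxyMasq(mark uint32) bool {
	return mark&kubeProxyMasqBit == kubeProxyMasqBit
}

func main() {
	mark := buildMark(0x1234, 4, 0xe00) // SPI 4 sets the 0x4000 bit
	fmt.Printf("before: %#x masq=%v\n", mark, matchesKubeProxyMasq(mark))

	cleared := clearAfterEncryption(mark)
	fmt.Printf("after:  %#x masq=%v\n", cleared, matchesKubeProxyMasq(cleared))
}
```

With SPI 4 the SPI nibble is exactly 0x4000, so the uncleared mark matches kube-proxy's rule; after masking down to 0xf00 it no longer can. Under this layout the same holds for any SPI with that bit set (4 to 7 and 12 to 15).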
ae30c5c ipam/multipool: Introducing specific IP-family annotations. Now, it will first check if the pod has annotations ipam.cilium.io/ipv4-pool or ipam.cilium.io/ipv6-pool; if so, it will return the pool from the specific IP family. If these annotations are not present, it will check if the pod has the annotation ipam.cilium.io/ip-pool, and if found, it will return the pool as usual. If the pod doesn't have any annotations, it will then check if the namespace in which the pod resides has these annotations in the same order. If neither the pod nor the namespace has these annotations, it will return the default IP pool. Signed-off-by: Huagong Wang <wanghuagong@kylinos.cn> 03 October 2023, 08:47:48 UTC
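The lookup order described in ae30c5c can be sketched as two passes (pod, then namespace) over the same annotation precedence. The annotation keys are taken from the commit message; the function names, the `"default"` pool name, and the choice to fall through to `ipam.cilium.io/ip-pool` when only the other family's annotation is set are assumptions of this sketch.

```go
package main

import "fmt"

const (
	annoIPv4Pool = "ipam.cilium.io/ipv4-pool"
	annoIPv6Pool = "ipam.cilium.io/ipv6-pool"
	annoIPPool   = "ipam.cilium.io/ip-pool"

	// defaultPool is an assumed name for the fallback pool.
	defaultPool = "default"
)

// poolFromAnnotations (hypothetical helper) checks the family-specific
// annotation first, then the generic one.
func poolFromAnnotations(annotations map[string]string, ipv6 bool) (string, bool) {
	familyKey := annoIPv4Pool
	if ipv6 {
		familyKey = annoIPv6Pool
	}
	if pool, ok := annotations[familyKey]; ok {
		return pool, true
	}
	if pool, ok := annotations[annoIPPool]; ok {
		return pool, true
	}
	return "", false
}

// selectPool applies the same precedence to the pod first, then to its
// namespace, and finally falls back to the default pool.
func selectPool(podAnnotations, nsAnnotations map[string]string, ipv6 bool) string {
	if pool, ok := poolFromAnnotations(podAnnotations, ipv6); ok {
		return pool
	}
	if pool, ok := poolFromAnnotations(nsAnnotations, ipv6); ok {
		return pool
	}
	return defaultPool
}

func main() {
	pod := map[string]string{annoIPv4Pool: "blue-v4"}
	ns := map[string]string{annoIPPool: "team-pool"}

	fmt.Println(selectPool(pod, ns, false)) // pod-level IPv4 annotation
	fmt.Println(selectPool(pod, ns, true))  // falls back to the namespace
	fmt.Println(selectPool(nil, nil, true)) // no annotations at all
}
```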
809ae2a annotation: Add IP-family pool specific annotations This adds two annotations, "ipam.cilium.io/ipv4-pool" and "ipam.cilium.io/ipv6-pool". The former can be added to pods or namespaces to specify an IPv4 pool for workloads, while the latter corresponds to specifying an IPv6 pool for workloads. Signed-off-by: Huagong Wang <wanghuagong@kylinos.cn> 03 October 2023, 08:47:48 UTC
77232f4 egressgw: clean up stale IP rules and routes this commit adds logic to ensure the manager cleans up any stale IP rules and routes from previous Cilium versions which still use those to steer egress gateway traffic to the correct interface Signed-off-by: Gilberto Bertin <jibi@cilium.io> 03 October 2023, 08:01:53 UTC
334fdbc egressgw: mark --install-egress-gateway-routes as deprecated this flag has now no effect and will be removed in v1.16 Signed-off-by: Gilberto Bertin <jibi@cilium.io> 03 October 2023, 08:01:53 UTC
ae47521 egressgw: remove IP rules/and routes logic Signed-off-by: Gilberto Bertin <jibi@cilium.io> 03 October 2023, 08:01:53 UTC
4a2f222 bpf: bpf_host: skip host firewall in cil_to_netdev if packet is SNATed when egress gateway traffic gets masqueraded, it can get redirected to a different interface that is also running bpf_host. When this happens, the host firewall logic should be skipped as it would be incorrect to enforce again any policy: after being masqueraded and redirected, the original pod identity is lost, and the host firewall would incorrectly identify this traffic as host or world Signed-off-by: Gilberto Bertin <jibi@cilium.io> 03 October 2023, 08:01:53 UTC
b1b415d bpf: skip tail_handle_snat_fwd_ipv4 trace event if packet was redirected Don't emit a trace event if the packet has been redirected to another interface. This can happen for egress gateway traffic that needs to egress from the interface to which the egress IP is assigned. Suggested-by: Julian Wiedmann <jwi@isovalent.com> Signed-off-by: Gilberto Bertin <jibi@cilium.io> 03 October 2023, 08:01:53 UTC
369a6fd bpf: egressgw: add support for fib_lookup When a packet matching an egress gateway policy needs to leave the node, it can do so through the wrong interface. This can happen when the interface with the route for that packet is different from the one to which the egress IP is assigned. The current solution to this is to install additional IP rules and routes that match egress gateway traffic and steer it to the correct interface with the egress IP. The drawback of this approach is that installing and maintaining these rules is getting more and more complex and consumes resources on the host. This commit introduces a new approach to solve this, based on fib lookups and redirects in the datapath. In practice, after a packet matching an egress gateway policy has been masqueraded, we run a fib lookup on the packet with the new source (egress) IP and, if needed, redirect the packet to the correct interface, the one to which the egress IP is assigned. Fixes: #23504 Signed-off-by: Gilberto Bertin <jibi@cilium.io> 03 October 2023, 08:01:53 UTC
9d3976c docs: Mention `RouteTableInterfacesOffset` in system requirements ENI mode creates routing tables with index `10 + eni-index`. This commit documents that and mentions that those indices are not taken by the system. Suggested-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 03 October 2023, 07:53:54 UTC
1e41011 api, cmd, policy: Show rule labels in `policy selectors` output Use the functionality implemented in previous commits to output the rule labels in the selector output. This is useful for understanding which policy the selector comes from, which makes debugging issues much easier. Example output: ``` root@kind-worker:/home/cilium# cilium policy selectors SELECTOR LABELS USERS IDENTITIES &LabelSelector{MatchLabels:map[string]string{k8s.io.kubernetes.pod.namespace: kube-system,k8s.k8s-app: kube-dns,},MatchExpressions:[]LabelSelectorRequirement{},} default/tofqdn-dns-visibility 1 16500 &LabelSelector{MatchLabels:map[string]string{reserved.none: ,},MatchExpressions:[]LabelSelectorRequirement{},} default/tofqdn-dns-visibility 1 MatchName: , MatchPattern: * default/tofqdn-dns-visibility 1 16777217 16777218 16777219 ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> 03 October 2023, 06:13:42 UTC
c863e26 policy: Plumb rule labels through selector cache In order to make investigating the output of `cilium policy selectors` easier to understand, plumb the rule labels from KNP, CNP, and CCNPs. This makes it so that the user can very easily see which policy the selector came from. Benchmark results before and after given that selectorcache can be a hot path in the policy engine: ``` $ go test -v ./pkg/policy -run '^$' -bench 'BenchmarkRegenerateL3EgressPolicyRules' -test.benchtime 100x -test.benchmem -test.count 10 ... $ benchstat old.txt new.txt name old time/op new time/op delta RegenerateL3EgressPolicyRules-16 8.13ms ±13% 8.01ms ±11% ~ (p=0.912 n=10+10) name old alloc/op new alloc/op delta RegenerateL3EgressPolicyRules-16 1.75MB ± 0% 1.94MB ± 0% +10.78% (p=0.000 n=10+9) name old allocs/op new allocs/op delta RegenerateL3EgressPolicyRules-16 24.1k ± 0% 25.1k ± 0% +4.15% (p=0.000 n=10+10) ``` Signed-off-by: Chris Tarazi <chris@isovalent.com> 03 October 2023, 06:13:42 UTC
9486e7b Resiliency: Node manager reconciliation path yields unchecked errors > NOTE: Initial pass! Found several places in the node manager control loop where we are ignoring potential errors and either not logging them or, worse, not bubbling them up to their respective call sites. > NOTE! Please go through this PR with a fine-tooth comb as I don't have the necessary understanding on whether some of these should be hard errors or not. Thank you!! Signed-off-by: Fernand Galiana <fernand.galiana@isovalent.com> 02 October 2023, 23:29:55 UTC
3e8059d Add errorset to track unique errors Some calls may return multiple non unique errors. Leveraging an error set in those situations will trim down repetitions while reporting errors. Signed-off-by: Fernand Galiana <fernand.galiana@isovalent.com> 02 October 2023, 23:29:55 UTC
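The idea in 3e8059d, collecting errors while dropping repeats, might look roughly like the sketch below. It is not the package added by the commit: the `errorSet` type, its methods, and the choice to treat two errors as duplicates when their messages are equal are all assumptions made for illustration.

```go
package main

import "fmt"

// nodeError is a tiny error type used only to keep the example free of
// extra imports.
type nodeError string

func (e nodeError) Error() string { return string(e) }

// errorSet (hypothetical) keeps the first occurrence of each distinct
// error message, in insertion order.
type errorSet struct {
	seen map[string]struct{}
	errs []error
}

// Add records err unless it is nil or an error with the same message has
// already been recorded.
func (s *errorSet) Add(err error) {
	if err == nil {
		return
	}
	if s.seen == nil {
		s.seen = map[string]struct{}{}
	}
	msg := err.Error()
	if _, dup := s.seen[msg]; dup {
		return
	}
	s.seen[msg] = struct{}{}
	s.errs = append(s.errs, err)
}

// Errors returns a copy of the unique errors collected so far.
func (s *errorSet) Errors() []error {
	return append([]error(nil), s.errs...)
}

func main() {
	var set errorSet
	// The same failure reported once per node collapses to one entry.
	for range []string{"node-a", "node-b", "node-c"} {
		set.Add(nodeError("route install failed"))
	}
	set.Add(nodeError("neighbor refresh failed"))

	for _, err := range set.Errors() {
		fmt.Println(err)
	}
}
```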
207eb7a cmd: Disable local node routes when endpoint routes are enabled The command-line description of `enable-endpoint-routes` states: "Use per endpoint routes instead of routing via cilium_host". This description however has not been true: Even with per-endpoint routes enabled, Cilium would still install the local node route unless explicitly disabled. Not only is the local node route (which installs a single route based on the local pod CIDR) redundant to per-endpoint routes (which installs a route per endpoint), it also does not work on IPAM modes that do not have a local pod CIDR, such as ENI or Azure. On Azure, GKE, and in the MultiPool CI, we already disabled the local node route explicitly when enabling per-endpoint routes, but in many other cases (ENI, as well as most of our CI (see github/actions/cilium-config)) we forgot to disable local node routes explicitly when per-endpoint routes are enabled. Therefore, this commit improves UX by automatically disabling the local node route if per-endpoint routes are enabled. This approach was discussed at the Cilium Community meeting on June 7, 2023. As an alternative to this change, we could also introduce a new tri-mode flag (i.e. "per-endpoint routes", "local node route", "none"), but such a change might introduce unnecessary churn if we decide to remove the local node routes further down the road. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 02 October 2023, 15:05:34 UTC
81adc9c Add flows_rate field to hubble status We add a `flows_rate` field to the Hubble `ServerStatus` that returns the approximate rate of seen flows per second over the last minute. It's "approximate" as we calculate the rate by counting all flow events in the ring buffer that happened in the last minute. If all events in the ring buffer happened in the last minute, we can't calculate the rate over the last minute, so we calculate the rate since the oldest flow event in the ring buffer. Signed-off-by: Fabian Fischer <fabian.fischer@isovalent.com> 02 October 2023, 11:26:14 UTC
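The rate calculation described in 81adc9c can be sketched over a slice of event timestamps. This is an approximation of the described behaviour, not the Hubble implementation: the function name `flowsRate`, the use of Unix seconds in a plain slice instead of the ring buffer, and returning 0 when no usable time span exists are assumptions.

```go
package main

import "fmt"

// flowsRate (hypothetical helper) returns the approximate number of flows
// per second over the last minute. eventTimes and now are Unix seconds.
func flowsRate(eventTimes []int64, now int64) float64 {
	const window = 60 // seconds

	count := 0
	oldest := now
	for _, ts := range eventTimes {
		if ts > now || now-ts > window {
			continue // outside the last minute
		}
		count++
		if ts < oldest {
			oldest = ts
		}
	}
	if count == 0 {
		return 0
	}

	span := int64(window)
	if count == len(eventTimes) {
		// Every buffered event is younger than a minute, so the buffer
		// does not cover the full window: measure since the oldest event.
		span = now - oldest
	}
	if span <= 0 {
		return 0 // no measurable time span
	}
	return float64(count) / float64(span)
}

func main() {
	now := int64(1000)

	// One event is older than a minute, so the full 60s window applies.
	fmt.Println(flowsRate([]int64{900, 950, 970, 990}, now))

	// All events are recent: the rate is taken over the last 10 seconds.
	fmt.Println(flowsRate([]int64{990, 992, 994, 996, 998}, now))
}
```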
6d16e5c doc: add traffic shifting example for service mesh Signed-off-by: chentanjun <tanjunchen20@gmail.com> 02 October 2023, 09:34:47 UTC
2966b03 backporting: Update docs after introduction of Label Updater Update the documentation and the backporting scripts to take into account the new Label Updater workflow that will automatically update the labels of each backported PR. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 02 October 2023, 09:34:23 UTC
c07bfe6 ci: Add a workflow to update labels of backported PRs Add a reusable workflow to be called from stable branches whenever a backport PR is merged. The workflow scans the body of the backport PR to get the list of the original PRs, and updates the label "backport-pending/<version>" to "backport-done/<version>" for each of them. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 02 October 2023, 09:34:23 UTC
980590e documentation: add documentation for bpf map capacity metric. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 02 October 2023, 09:33:47 UTC
9d513d9 metrics: add bpf_map_capacity metric which provides max size of maps It is useful to have a bpf map capacity metric, which is the defined bpf max size, for writing general Prometheus queries that can detect various issues with map usage (for example, providing an upper bound on FQDN IPs). This is especially helpful in the face of configurable map sizes, where you cannot assume the default value is always used. bpf_map_capacity metrics are emitted per "group" of maps, that is, any group of maps that share a common purpose and attributes. This avoids having excessive redundant cardinality due to similar map types each having their capacity emitted in a separate metric (e.g. endpoint policy maps are all always the same size). Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 02 October 2023, 09:33:47 UTC
8a6e83d docs: Use host port for serving docs Previously, the sphinx-autobuild process running in a Docker container reported: [I 230927 13:54:55 server:335] Serving on http://0.0.0.0:8000 This could have confused users into thinking that the server is accessible through tcp/8000 on the host. However, it was accessible at https://0.0.0.0:9081. Change the server's listen port to 9081 to avoid the confusion. Signed-off-by: Martynas Pumputis <m@lambda.lt> 02 October 2023, 09:33:23 UTC