Revision Author Date Message Commit Date
003fb4a .github: Trigger ci-verifier and lint-bpf-checks on image changes Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> 15 March 2024, 11:15:29 UTC
ed4e650 k8s/utils: filter out cilium-owned labels on pod update Currently `io.cilium.k8s.*` pod labels are only filtered out on pod creation. On pod update, they are currently not filtered which leads to a situation where no pod label update is reflected in the endpoint anymore in case of a `io.cilium.k8s.*` label set on the pod: $ cat <<EOF | kubectl apply -f - apiVersion: v1 kind: Pod metadata: name: foo namespace: default labels: app: foobar io.cilium.k8s.something: bazbar spec: containers: - name: nginx image: nginx:1.25.4 ports: - containerPort: 80 EOF $ kubectl -n kube-system exec -it cilium-nnnn -- cilium-dbg endpoint list ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS ENFORCEMENT ENFORCEMENT 252 Disabled Disabled 50316 k8s:app=foobar fd00:10:244:1::8b69 10.244.1.78 ready k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=kind-kind k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default $ kubectl label pods foo app=nothing --overwrite $ kubectl describe pod foo [...] Labels: app=nothing io.cilium.k8s.something=bazbar [...] $ kubectl describe cep foo [...] Labels: app=foobar io.cilium.k8s.something=bazbar [...] 
$ kubectl -n kube-system exec -it cilium-nnnn -- cilium-dbg endpoint list ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS ENFORCEMENT ENFORCEMENT 252 Disabled Disabled 50316 k8s:app=foobar fd00:10:244:1::8b69 10.244.1.78 ready k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=kind-kind k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default 1285 Disabled Disabled 1 reserved:host ready 1297 Disabled Disabled 4 reserved:health fd00:10:244:1::ebfb 10.244.1.222 ready Note that the `app` label didn't change from `foobar` to `nothing` in the endpoint and the CiliumEndpoint CRD. This is because the filtered labels are wrongly passed to `(*Endpoint).ModifyIdentityLabels`, which in turn calls `e.OpLabels.ModifyIdentityLabels`, which checks whether all of the deleted labels (which contain the filtered label on pod update for the example above) were present before, i.e. on pod creation. This check fails, however, because the labels were filtered out on pod creation. Fix this issue by also filtering out the labels on pod update, thus allowing the label update to complete successfully in the presence of filtered labels. After this change, the labels are correctly updated on the endpoint and the CiliumEndpoint CRD: $ kubectl label pods foo app=nothing --overwrite $ kubectl describe pod foo [...] Labels: app=nothing io.cilium.k8s.something=bazbar [...] $ kubectl describe cep foo [...] Labels: app=nothing io.cilium.k8s.something=bazbar [...] 
$ kubectl -n kube-system exec -it cilium-x2x5r -- cilium-dbg endpoint list ENDPOINT POLICY (ingress) POLICY (egress) IDENTITY LABELS (source:key[=value]) IPv6 IPv4 STATUS ENFORCEMENT ENFORCEMENT 57 Disabled Disabled 56486 k8s:app=nothing fd00:10:244:1::71b7 10.244.1.187 ready k8s:io.cilium.k8s.namespace.labels.kubernetes.io/metadata.name=default k8s:io.cilium.k8s.policy.cluster=kind-kind k8s:io.cilium.k8s.policy.serviceaccount=default k8s:io.kubernetes.pod.namespace=default 201 Disabled Disabled 4 reserved:health fd00:10:244:1::c8de 10.244.1.221 ready 956 Disabled Disabled 1 reserved:host ready Fixes: 599dde3b91b3 ("k8s: Filter out cilium owned from pod labels") Signed-off-by: Tobias Klauser <tobias@cilium.io> 15 March 2024, 09:07:16 UTC
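The failure mode described in ed4e650 can be reproduced in a few lines. The following is a minimal sketch, not the actual Cilium code: `filterCiliumLabels`, `endpointLabels` and `modify` are hypothetical stand-ins, and only the `io.cilium.k8s.` prefix and the "deleted labels must have been present" check are taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// ciliumLabelPrefix is the prefix of the Cilium-owned pod labels named in the commit message.
const ciliumLabelPrefix = "io.cilium.k8s."

// filterCiliumLabels returns a copy of labels without Cilium-owned keys.
// Hypothetical helper, not the real function in the k8s/utils package.
func filterCiliumLabels(labels map[string]string) map[string]string {
	out := make(map[string]string, len(labels))
	for k, v := range labels {
		if !strings.HasPrefix(k, ciliumLabelPrefix) {
			out[k] = v
		}
	}
	return out
}

// endpointLabels is a toy stand-in for the identity labels stored on an endpoint.
type endpointLabels map[string]string

// modify removes del and applies add, but refuses to change anything
// if a label in del is not currently present on the endpoint.
func (e endpointLabels) modify(add, del map[string]string) error {
	for k := range del {
		if _, ok := e[k]; !ok {
			return fmt.Errorf("label %q to delete is not present", k)
		}
	}
	for k := range del {
		delete(e, k)
	}
	for k, v := range add {
		e[k] = v
	}
	return nil
}

func main() {
	oldPod := map[string]string{"app": "foobar", "io.cilium.k8s.something": "bazbar"}
	newPod := map[string]string{"app": "nothing", "io.cilium.k8s.something": "bazbar"}

	// On pod creation the endpoint only receives the filtered labels.
	ep := endpointLabels(filterCiliumLabels(oldPod))

	// Before the fix: unfiltered labels on update, so the presence check fails.
	if err := ep.modify(newPod, oldPod); err != nil {
		fmt.Println("unfiltered update failed:", err)
	}
	// After the fix: both sides are filtered and the update goes through.
	if err := ep.modify(filterCiliumLabels(newPod), filterCiliumLabels(oldPod)); err == nil {
		fmt.Println("filtered update applied, app =", ep["app"])
	}
}
```

Filtering both the old and the new label set keeps the update path consistent with what was stored at creation time, which is the property the fix relies on.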
5508746 k8s/watchers: set unfiltered pod labels on CEP on pod update The labels on the CEP are set to the unfiltered pod labels on CEP creation, see [1]. On any label update where labels contain filtered labels, e.g. io.cilium.k8s.* labels or labels filtered out by the user by means of the --label and/or --label-prefix-file agent options the current logic would wrongly remove the filtered labels from the CEP labels. Fix this by always using the unfiltered pod labels. [1] https://github.com/cilium/cilium/blob/b58125d885edbb278f11f84303c0e7c934ca7ea4/pkg/endpointmanager/endpointsynchronizer.go#L185-L187 Signed-off-by: Tobias Klauser <tobias@cilium.io> 15 March 2024, 09:07:16 UTC
2309805 k8s: move filterPodLabels to k8s/utils package for SanitizePodLabels Currently GetPodMetadata is the only caller of SanitizePodLabels but other callers will be introduced in successive changes. This change ensures the io.cilium.k8s.* labels are filtered for these callers as well. Signed-off-by: Tobias Klauser <tobias@cilium.io> 15 March 2024, 09:07:16 UTC
9a26446 k8s/watchers: warn when endpoint label update fails on pod update Currently, failure to update endpoint labels based on pod labels on pod update is silently ignored by the callers or only reflected in error count metrics. Report a warning to clearly indicate that pod and endpoint labels might be out of sync. Signed-off-by: Tobias Klauser <tobias@cilium.io> 15 March 2024, 09:07:16 UTC
bba0ff5 k8s/watchers: inline single-use updateEndpointLabels The function updateEndpointLabels is only used in one place. Inline it to improve readability and simplify changes in successive commits. Signed-off-by: Tobias Klauser <tobias@cilium.io> 15 March 2024, 09:07:16 UTC
8c23fa8 gh: workflows: clarify reference to issue #23283 Clarify that while the issue was closed as resolved, this actually only applies to scenarios where the kind.sh script is used. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 15 March 2024, 09:02:08 UTC
f61651f operator: fix errors/warnings metric. This was broken during the transition of pkg/metrics to integrate with Hive, where relevant operator metrics were never initialized. This adds an init func specific to the operator and cleans up the "flush" logic used as a workaround for errors/warnings emitted prior to agent starting (in the case of the operator). Addresses: #29525 Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 15 March 2024, 06:34:11 UTC
e929947 labelsfilter: Ensure entity relevant labels are always applied Entities are special selectors used by network policies. The Cluster entity relies on the `io.cilium.k8s.policy.cluster` label, which is removed by Cilium if a strict identity label configuration is applied. This PR adds the relevant Cilium policy label to the list of default labels so it will always be applied regardless of configuration, and adds this label to the associated test file. Fixes: #18878 Signed-off-by: soggiest <nicholas@isovalent.com> 15 March 2024, 06:32:55 UTC
4ba7e6a datapath: Remove unnecessary IPsec code Commit 891fa78474 ("bpf: Delete obsolete do_netdev_encrypt_pools()") removed the special code we had to rewrite the IPsec outer header. The code removed in the present commit is therefore not required anymore. Fixes: 891fa78474 ("bpf: Delete obsolete do_netdev_encrypt_pools()") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 15 March 2024, 06:32:12 UTC
a5eafe0 ci: Bump lvh-kind ssh-startup-wait-retries Recently, CI has frequently failed at lvh-kind startup with exit code 41. This indicates a timeout of the task waiting for the SSH startup. Bump the timeout (retry) to 600 (10min) as a workaround. Fixes: #31336 Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 15 March 2024, 02:43:29 UTC
65bbf3f policy: Fix missing labels from SelectorCache selectors During the refactor in the commit below, it seems the labels were left out inadvertently, breaking the `cilium policy selectors` command that displays the labels/name of the policy from which the selectors originate. Fixes: 501944c35d ("policy/selectorcache: invert identitySelector interface") Signed-off-by: Chris Tarazi <chris@isovalent.com> 14 March 2024, 23:41:28 UTC
787858c bgpv2/ci: added watch reactor for bgp cluster config Signed-off-by: harsimran pabla <hpabla@isovalent.com> 14 March 2024, 22:12:53 UTC
95c916d test: add ginkgo default-allow tests Add some tests that create various mixtures of default-allow and default-deny policies. It is important that default-deny policies always take precedence over default-allow. It is also important that Deny rules take precedence over default-allow. These need to be integration tests, since they rely on specific interactions between the userspace and bpf policy engines. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 14 March 2024, 19:32:03 UTC
ace9740 policy: implement non-default-deny policies This adjusts the policy rule generation to take into account non-default-deny policies. As before, an endpoint is normally in a policy-disabled state. If any policies select this endpoint, then policy is enabled and all non-allowed traffic is dropped. If, however, an endpoint is only selected by default-allow policies, then policy is enabled, but a special wildcard allow policy is inserted. Since wildcard policies have very low precedence, this ensures that any Deny or L7-proxy rules will still take effect. This commit also fixes tests that incorrectly failed to sanitize rules before adding to the policy repository, leading to a nil pointer exception. Production code *always* sanitizes rules before adding. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 14 March 2024, 19:32:03 UTC
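The three endpoint states described in ace9740 can be written down as a small decision function. This is an illustrative sketch for a single traffic direction, not the policy engine itself: the `policy` struct, `enforcementFor` and the `enforcement` constants are hypothetical names, and a nil `enableDefaultDeny` is assumed to mean "not specified", which per 75ac5d0 keeps the existing default-deny behavior.

```go
package main

import "fmt"

// policy is a hypothetical, simplified view of a policy selecting an endpoint.
type policy struct {
	name string
	// enableDefaultDeny is nil when unspecified (existing behavior: default-deny).
	enableDefaultDeny *bool
}

type enforcement int

const (
	policyDisabled       enforcement = iota // no policy selects the endpoint
	defaultDeny                             // non-allowed traffic is dropped
	defaultAllowWildcard                    // policy enabled, wildcard allow inserted
)

func (e enforcement) String() string {
	switch e {
	case policyDisabled:
		return "policy disabled"
	case defaultDeny:
		return "default-deny"
	default:
		return "default-allow (wildcard allow inserted)"
	}
}

// enforcementFor returns the state of an endpoint given the policies selecting it.
// Any default-deny policy takes precedence over default-allow policies.
func enforcementFor(selecting []policy) enforcement {
	if len(selecting) == 0 {
		return policyDisabled
	}
	for _, p := range selecting {
		if p.enableDefaultDeny == nil || *p.enableDefaultDeny {
			return defaultDeny
		}
	}
	return defaultAllowWildcard
}

func boolPtr(b bool) *bool { return &b }

func main() {
	monitoring := policy{name: "allow-monitoring", enableDefaultDeny: boolPtr(false)}
	app := policy{name: "app-policy"} // unspecified, so default-deny

	fmt.Println(enforcementFor(nil))
	fmt.Println(enforcementFor([]policy{monitoring}))
	fmt.Println(enforcementFor([]policy{monitoring, app}))
}
```

The sketch covers only the mode selection; Deny and L7-proxy rules still take effect in the default-allow case because, as the commit message notes, the wildcard allow has very low precedence.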
315dc38 policy/api: change Rule.Sanitize() to pointer receiver Methods called by Sanitize() may alter the underlying structures. For example, selectors are aggregated and address families are upper-cased. However, top-level fields can't be written by Sanitize. In the future, we'd like to do that. So, give it a pointer receiver. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 14 March 2024, 19:32:03 UTC
75ac5d0 policy/api: add EnableDefaultDeny field This adds a new field, EnableDefaultDeny, to CiliumNetworkPolicy and CiliumClusterwideNetworkPolicy, that controls whether or not the subject endpoints of this policy should drop unselected peer traffic. By default, endpoints are in a default-allow mode. When the first policy applies to an endpoint, it flips it to a default-deny mode. This option allows disabling that behavior. If not specified, the existing behavior remains: for each direction, traffic is allowed unless at least one policy rule applies to that endpoint. If multiple policies select an endpoint, then default-deny takes precedence. This is useful in heterogeneous environments, where it may not be desirable to implicitly drop all non-matching traffic. Consider, for example, the case where an administrator wishes to ensure a monitoring service can access all namespaces. If they create a cluster-wide policy allowing access from the monitoring service, it may create a deny policy where none existed previously, unexpectedly dropping traffic. See: https://github.com/cilium/design-cfps/blob/main/cilium/CFP-30572-non-default-deny-policies.md This commit contains only the API changes; a subsequent commit will introduce the implementation. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 14 March 2024, 19:32:03 UTC
b741a58 bpf: add node_key to alignchecker This struct is also used by the agent. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 14 March 2024, 16:53:16 UTC
b58125d kind: reset sysctl net.ipv4.ip_unprivileged_port_start to 1024 Currently, kind clusters running with docker have sysctl `net.ipv4.ip_unprivileged_port_start` set to `0`. This is the default of docker. ``` root@kind-worker:/home/cilium# sysctl net.ipv4.ip_unprivileged_port_start net.ipv4.ip_unprivileged_port_start = 0 ``` This can lead to wrong assumptions and differ from the default of most k8s setups - where binding to privileged ports (<1024) requires the capability `NET_BIND_SERVICE`. Therefore, this commit resets the sysctl `net.ipv4.ip_unprivileged_port_start` to `1024`. This way the dev environment matches the default on most k8s environments. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 March 2024, 15:00:51 UTC
4f4b0a7 bpf, maps: Don't propagate nodeID to bpf map when allocation fails. When we ran out of IDs to allocate for nodes, we were propagating a zero ID to the bpf map. Now we simply return an error and do not modify the bpf map. Also clean up incorrectly mapped node IDs on startup in case that happened. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 14 March 2024, 10:26:47 UTC
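The behavior change in 4f4b0a7 amounts to "fail without side effects". A minimal sketch of that pattern follows, assuming a plain Go map as a stand-in for the bpf map; `nodeIDAllocator`, `allocate` and `cleanupZeroIDs` are hypothetical names, not the agent's real types.

```go
package main

import (
	"errors"
	"fmt"
)

var errNoNodeIDs = errors.New("no node IDs left to allocate")

// nodeIDAllocator hands out IDs in [1, max]; nodeMap stands in for the bpf map.
type nodeIDAllocator struct {
	next, max int
	nodeMap   map[string]uint16
}

func newNodeIDAllocator(max int) *nodeIDAllocator {
	return &nodeIDAllocator{next: 1, max: max, nodeMap: map[string]uint16{}}
}

// allocate returns the ID for ip, allocating one if needed.
// When the ID space is exhausted it returns an error and leaves nodeMap untouched,
// instead of writing a zero ID into it.
func (a *nodeIDAllocator) allocate(ip string) (uint16, error) {
	if id, ok := a.nodeMap[ip]; ok {
		return id, nil
	}
	if a.next > a.max {
		return 0, errNoNodeIDs
	}
	id := uint16(a.next)
	a.next++
	a.nodeMap[ip] = id
	return id, nil
}

// cleanupZeroIDs removes entries that were mapped to the invalid zero ID,
// mirroring the startup cleanup mentioned in the commit message.
func (a *nodeIDAllocator) cleanupZeroIDs() int {
	removed := 0
	for ip, id := range a.nodeMap {
		if id == 0 {
			delete(a.nodeMap, ip)
			removed++
		}
	}
	return removed
}

func main() {
	a := newNodeIDAllocator(1)
	a.nodeMap["192.0.2.99"] = 0 // stale entry left behind by the old behavior
	fmt.Println("stale entries removed:", a.cleanupZeroIDs())

	id, err := a.allocate("192.0.2.1")
	fmt.Println(id, err)
	_, err = a.allocate("192.0.2.2")
	fmt.Println(err, len(a.nodeMap))
}
```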
29931c8 daemon: Remove `ip-allocation-timeout` flag This removes the `ip-allocation-timeout` flag from cilium-agent. It was used by the now removed `AllocateCIDRs` function. As the timeout does not apply to the asynchronous allocation mechanism, it is removed. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 March 2024, 06:20:36 UTC
2fddd09 ipcache: Remove `asyncPrefixReleaser` This removes the asynchronous prefix garbage collector. This is now possible because the `enqueue` function is no longer called. It was previously called by `ReleaseCIDRIdentitiesByCIDR`, but that function has been removed. Therefore, all the dequeue logic can also be removed, as the gc queue will always be empty. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 March 2024, 06:20:36 UTC
be63f16 ipcache: Remove `ReleaseSlice` helper This function is no longer used now that `ReleaseCIDRIdentitiesByCIDR` was removed in a previous commit. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 March 2024, 06:20:36 UTC
7693063 ipcache: Remove synchronous CIDR identity allocation This commit removes the `AllocateCIDRs` and `ReleaseCIDRIdentitiesByCIDR` functions from the IPCache. Their last usage was removed when the `ToServices` and `ToFQDN` implementations were moved to use the asynchronous API (using `{Upsert,Remove}Prefixes`). This commit removes the public functions, as well as the unit test checking the interaction between the synchronous and asynchronous methods. The other unit tests are changed to simulate `UpsertPrefixes` instead. A subsequent commit removes further now-dead code. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 March 2024, 06:20:36 UTC
696a4fd bgpv1: fix Test_PodIPPoolAdvert flakiness Since the test updates multiple k8s resources in each test step, and each step depends on the previous one, we should be careful about how many k8s resources we are actually changing in each step, so as not to trigger multiple reconciliations with different results (advertisements) in a single step. This change ensures we change only one value that affects the advertisement in each test step. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 14 March 2024, 06:20:03 UTC
74119be gateway-api: RequestRedirect picks wrong port with multiple listeners Fixes: 29099 If RequestRedirect does not specify a port and the scheme is empty, the port of the listener is used by default. Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 14 March 2024, 04:22:53 UTC
e0cd2a6 lxcmap: Fix comment about byte-order It was many and many a year ago, In a kingdom by the BEES, That a MAP there lived whom you may know By the name of LXC; And this MAP it lived with no other thought Than to ROUTE and ROUTE efficiently. In GIT logs I discover when it was created, A field--SECURITY IDENTITY-- In a style that has yet to be dated, An ENDIANNESS known as "BE". Yet some time later the field had faded, leaving A remark with one guarantee: The field so related, no longer updated, is formed "this" way. Most certainly! As years and years and oh they passed, This remark it remained silently, Until a day no-one would forecast, The field was restored--and rightfully. And so set the trap, set with the MAP! Whenceforth I was deceived: For the utterance was renewed at last, In a way that lied to me. Amidst a review of some code so UPDATED, I a-raised an enquiry, To understand what I had conflated, That elderly comment about "BE". And so we translated, And demarcated, All CODE related to SEC_ID.. Until we were persuaded, Perhaps frustrated, The field was little endian! And so we regale you, with our tale, To state but this to thee, Care of comments that may be stale-- Truth is not always what you see! Related: a15be077a34ad9d2a8f208e4b736c25ce3a15306 Related: 0d513f3ae2a29119c864309cd1198468e70a13c2 Related: bd7b7af6322182e952d8be134d47a4a55ecad2c4 Reported-by: Louis DeLosSantos <louis@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 13 March 2024, 22:29:15 UTC
44aeb53 helm: Add pod affinity for cilium-envoy This commit is to avoid cilium-envoy running on the node without cilium agent. Two main changes are: - nodeAffinity to make sure that cilium-envoy will not be scheduled on node without cilium agent - podAffinity with requiredDuringSchedulingIgnoredDuringExecution to cilium agent Relates: #25081, #30034 Fixes: #31149 Signed-off-by: Tam Mach <tam.mach@cilium.io> 13 March 2024, 20:38:16 UTC
76c85c5 NodeAddress: Add wildcard device for fallback addresses Some setups that use e.g. ECMP may not have addresses assigned to all devices. For BPF masquerading one can use DeriveMasqIPFromDevice to work around this, but we don't have a similar mechanism for the direct routing address. To fix this and retain the semantics of v1.14, add a "wildcard device" (*) to the NodeAddress table that picks suitable fallback addresses from all devices in the system. The criteria (in order) is to prefer public over private, lower scope, lower ifindex and finally lower address. To allow overlapping addresses in the NodeAddress table, use a primary key that combines address with device name. Signed-off-by: Jussi Maki <jussi@isovalent.com> 13 March 2024, 17:08:20 UTC
0e44f30 node: Replace ipv[46]MasqAddrs with Table[NodeAddress] Replace the global map of BPF masquerade addresses with a query to Table[NodeAddress] in the loader's patchHostNetdevDatapath and remove the now unused code around masquerade addresses in pkg/node. Signed-off-by: Jussi Maki <jussi@isovalent.com> 13 March 2024, 17:08:20 UTC
9fbe73c helm: Update the example value for EKS load balancer internal annotation Signed-off-by: Oshan Galwaduge <oshan304@gmail.com> 13 March 2024, 12:44:44 UTC
513a570 bpf: xdp: remove unused set_encrypt_dip() 4c7cce1bf044 ("bpf: Remove IP_POOLS IPsec code") removed the skb helper. Also clean up the (unused) XDP variant. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 13 March 2024, 11:17:56 UTC
531d798 enable IPv6 system setting for devcontainer environment. Fixes: #31263 Signed-off-by: Tomoya Fujita <Tomoya.Fujita@sony.com> 13 March 2024, 10:46:41 UTC
1321e03 doc: Clarified GwAPI KPR prerequisites Previously, the GwAPI doc listed KPR mode as a prerequisite. However, it's actually only required to enable BPF nodePort support. Signed-off-by: Philip Schmid <philip.schmid@isovalent.com> 13 March 2024, 09:57:05 UTC
54c01e2 bgpv1: ExternalIP advertisement with BGP Control Plane Fixes: #29990 Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 13 March 2024, 07:25:07 UTC
687e4f8 remove individual Signed-off-by: Bill Mulligan <billmulligan516@gmail.com> 12 March 2024, 19:51:05 UTC
bd4ee16 fix: Make it clear USERS.md should be production Signed-off-by: Bill Mulligan <billmulligan516@gmail.com> 12 March 2024, 19:51:05 UTC
ebf5e74 images: update cilium-{runtime,builder} Signed-off-by: André Martins <andre@cilium.io> 12 March 2024, 15:56:35 UTC
ef31cfd images: bump cni plugins to v1.4.1 The result of running ``` images/scripts/update-cni-version.sh 1.4.1 ``` Signed-off-by: André Martins <andre@cilium.io> 12 March 2024, 15:56:35 UTC
a5824bd k8s: Remove k8s.RuleTranslator logic With the previous commit, this logic is no longer needed. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
1ff9be2 policy/k8s: Move ToServices translation to policy package This commit moves the logic which translates `ToServices` rules to `ToCIDRSet` rules into `pkg/policy/k8s`. In doing so, we also change how translation is performed: Previously, translation was split over multiple packages: Initial policy ingestion (in `pkg/policy`) invoked `k8s.PreprocessRules`, which populated the initial version of the policy with `ToCIDRSet` rules. Service updates were handled incrementally by the service watcher, which invoked a `k8s.RuleTranslator` for each update and allocated CIDR identities (via IPCache) accordingly. The new approach centralizes all the logic into `pkg/policy/k8s`. We mirror the approach from the existing `CiliumCIDRGroup` logic which also translates references to `CiliumCIDRGroups` to `ToSericeSet`s. Because the new approach modifies the CNP/CCNP before it is added to the policy repository, we no longer perform incremental updates and no longer need to manage CIDR identities ourselves. This simplifies the code and allows us to remove one of the last users of the "synchronous IPCache API". As a side-effect of this change, `ToServices` rules for policies ingested via the non-persistent API (i.e. `cilium-dbg policy import`) are no longer supported. We assume that this was not a commonly used feature for two reasons: First, importing non-K8s policies that refer to K8s resources (i.e. services) is likely rather rare. And secondly, the previous implementation was slightly broken: Because policies imported via API were not preprocessed, the policy was defunct until one of the matched services generated an update event. Note that this commit only removes the conflicting parts of the old infrastructure to keep this commit reviewable. The remaining code of the old infrastructure is removed in a subsequent commit. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
5f3e3ab policy/k8s: Extract helper for generating resource ID This commit moves the logic to generate an IPCache resource ID into a separate function. It will be used by additional call sites in a subsequent commit. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
a2619f2 policy/k8s: Reduce number of CNP copies This commit changes the `CiliumCIDRGroup` translation logic to not create a copy of the CNP/CCNP before it is translated. Instead, the passed-in parameter is mutated in place. While it would arguably be nice to have a functional approach with one input parameter and one output result, creating two copies for every CNP change is expensive. This matters in particular because an upcoming change will add additional translation steps, and we want to avoid creating unnecessary copies for every translation step. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
81f0490 policy/api: Allow generated ToCIDRSets with other L3 targets This commit relaxes the CNP/CCNP policy validation to allow `ToCIDRSet` rules to coexist with other L3 members, such as `ToServices`. This has no impact on user-submitted policies, since it is not possible for policies imported via K8s/API to set the `Generated` field. The `Generated` field is currently only set by the `ToServices` implementation, which extracts service endpoints and translates them as `ToCIDRSet` rules. Currently, this is done after CNP validation, however a subsequent commit will perform that translation step earlier. For this reason, we need to relax the rule validation to allow policy rules to contain both a `ToServices` and generated `ToCIDRSet` rules. This change has no user-facing impact. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
e99521a k8s/service: Introduce service cache iterator This commit introduces a new helper function to the service cache which allows users outside of the package to search for services based on arbitrary conditions (such as e.g. matching labels). The helper will be used in a subsequent commit. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
f1a0888 k8s/service: Introduce service cache notifications This introduces a new notification mechanism to the K8s service cache. In contrast to the existing static single-consumer `Events` channel, this new `Notifications` mechanism allows multiple dynamic subscribers to be informed when the service cache changes. Ideally, we would not need two different mechanisms and would replace the `Events` consumer with a `Notifications` consumer. The difficulty comes from the fact that the current `Events` channel is guaranteed to deliver all events, including events which occurred before the consumer started listening on the channel. Whereas with the new dynamic `Notifications` mechanism, we have to deal with consumers that register late. Guaranteeing delivery of early events in such a scenario is much harder to achieve, i.e. we would have to replay past events based on the current cache state. Since replaying past events would significantly complicate the implementation, this commit opts for a simpler solution and instead requires subscribers to register early, i.e. pushing the complexity to consumers of the new `Notifications` mechanism. The static consumers of the existing `Events` mechanism are however unaffected. There is an upcoming CFP to re-implement the ServiceCache on top of StateDB. StateDB does have a "watch" mechanism which will give late subscribers a full view of the cache, thus making custom event mechanisms obsolete. Until then, the approach implemented by this commit should serve as a stop-gap solution. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
74614bc policy/k8s: Deduplicate CIDR group resolution logic This commit moves all code which resolves CIDR group references into a single `resolveCiliumNetworkPolicyRefs` function. This is done in preparation for the upcoming sequence of commits, which will resolve services references in the CNP. By having a single point where we resolve references, we ensure that all different kinds of references are updated together and consistently. This commit does not contain any functional changes. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
314e5d2 policy/k8s: Remove double validation in CNP update This commit removes the `updateCiliumNetworkPolicyV2` and uses the renamed `addCiliumNetworkPolicyV2` (which is now called `upsertCiliumNetworkPolicyV2`) in its place. The removed `updateCiliumNetworkPolicyV2` did not provide any value: - It validated the old version of the CNP object, even though that old version was always coming from the `PolicyWatcher.cnpCache`, to which CNPs are only added after they previously passed validation. So the `oldRuleCpy.Parse()` call could never fail in practice. - It called `newRuleCpy.Parse()` followed by a call to `addCiliumNetworkPolicyV2` which also would then call `Parse()` again, thereby performing the validation twice. The only meaningful logic was the "Modified CiliumNetworkPolicy" debug log line, which this commit moves to the `onUpsert` call site. The log line is also simplified, as there is no way the name or namespace of the CNP can change (as we're retrieving the old object from the `cnpCache` which is indexed by name and namespace). Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 12 March 2024, 14:01:38 UTC
cc82eff ingress: Create FromGroups resource Duplicates the structures in place to evaluate the toGroups resource into the ingress section, allowing the creation of FromGroups. This means AWS SG groups can be included as ingress resources and directly translated into fromCIDR rules. Fixes: #30032 Signed-off-by: Alex Waring <ajmwaring@gmail.com> 12 March 2024, 11:47:50 UTC
04c64e6 abstract out ExtractCidrSet As I am going to be copying the Create Derivative functions, this commit abstracts out some of the logic into a helper function, to make the code more DRY. Signed-off-by: Alex Waring <ajmwaring@gmail.com> 12 March 2024, 11:47:50 UTC
0d4bdf4 ingress: Rename FromGroups resource In preparation for making the groups resource applicable to both ingress and egress rules, this commit changes the name of the ToGroups struct to Groups. Signed-off-by: Alex Waring <ajmwaring@gmail.com> 12 March 2024, 11:47:50 UTC
96e01ad wireguard: Improve L7 proxy traffic detection Use marks set by the proxy instead of assuming that each pkt from HOST_ID w/o MARK_MAGIC_HOST belongs to the proxy. In addition, in the tunneling mode the mark might get reset before entering wg_maybe_redirect_to_encrypt(), as the proxy packets are instead routed to from_host@cilium_host. The latter calls inherit_identity_from_host() which resets the mark. In this case, rely on the TC index. Suggested-by: Gray Lian <gray.liang@isovalent.com> Signed-off-by: Martynas Pumputis <m@lambda.lt> 12 March 2024, 11:41:17 UTC
c8c2899 test: Allow setting multiple tests to FOCUS Use quotes around $(FOCUS) to support syntax like: FOCUS='K8sAgentFQDNTest|K8sAgentPerNodeConfigTest' make -C test k8s-kind Without the quotes the pipe character in the variable's value is interpreted by the shell as a pipe. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> 12 March 2024, 11:25:38 UTC
835c38d test: Set image.pullPolicy=IfNotPresent for kind Running tests locally on a kind cluster is supposed to work with the following commands: make kind make kind-image FOCUS=... make -C test k8s-kind The last one has apparently been broken for a while, resulting in ImagePullBackOff due to a failure to connect to localhost:5000. It happens because these images are preloaded by `kind load`, and there is no real registry at localhost:5000, but image.pullPolicy is set to Always, therefore k8s ignores the preloaded image, attempts to pull it anyway and fails in ImagePullBackOff. Setting image.pullPolicy to IfNotPresent resolves this problem by allowing the preloaded image to be used. Signed-off-by: Maxim Mikityanskiy <maxim@isovalent.com> 12 March 2024, 11:25:38 UTC
1a1c2e9 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 12 March 2024, 11:22:16 UTC
28ae991 chore(deps): update go to v1.22.1 Signed-off-by: renovate[bot] <bot@renovateapp.com> 12 March 2024, 11:22:16 UTC
f5362cf config: Remove unused ENCRYPT_IFACE macro This macro is not used by the datapath since commit 3a650c32f2 ("bpf: Remove FIB lookup for IPsec"); there's no point writing it to the header file. Fixes: 3a650c32f2 ("bpf: Remove FIB lookup for IPsec") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 12 March 2024, 10:56:53 UTC
60ca00d Makefiles: Allow external input for go build/test/clean flags. Previously, the go build/test/clean flags were hardcoded inside the makefile, which did not allow additional flags such as the verbose output flag (-v) for the `go test` command. This change allows `GO_BUILD_FLAGS`, `GO_TEST_FLAGS`, `GO_CLEAN_FLAGS` and `GO_BUILD_LDFLAGS` to take external input if specified. For `GO_TAGS_FLAGS`, if `GO_TAGS_FLAGS` is specified, `osusergo` is appended to the externally supplied values. Signed-off-by: Wanlin Du <wanlindu@google.com> 12 March 2024, 10:45:51 UTC
dda18fc ingress: Allow strict kube-proxy-replacement Relates: https://github.com/cilium/cilium/pull/30592 Suggested-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 12 March 2024, 10:29:55 UTC
5b57f7d fix(deps): update google.golang.org/genproto/googleapis/rpc digest to c811ad7 Signed-off-by: renovate[bot] <bot@renovateapp.com> 12 March 2024, 10:22:26 UTC
42d9399 doc,bgpv1: Add some failure scenarios Add following new failure scenarios: 1. Peering Link Down 2. Cilium Operator Down 3. Service Losing All Backends Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> Co-authored-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 12 March 2024, 09:40:40 UTC
6fe3725 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 12 March 2024, 07:52:11 UTC
4775b1f chore(deps): update all-dependencies Signed-off-by: renovate[bot] <bot@renovateapp.com> 12 March 2024, 07:52:11 UTC
e6576ec bgpv2: fix cluster_test informer race Change require.EventuallyWith approach in test cases. Also, fixes race between node watcher registration and addition of node objects. Fixes: #31237 #31235 Signed-off-by: harsimran pabla <hpabla@isovalent.com> 12 March 2024, 01:17:28 UTC
6716a9c gha: checkout target branch instead of the default one Currently, the GHA workflows running tests triggered on pull_request and/or push events initially checkout the default branch to configure the environment variables, before retrieving the PR head. However, this is problematic on stable branches, as we then end up using the variables from the default (i.e., main) branch (e.g., Kubernetes version, Cilium CLI version), which may not be appropriate here. Hence, let's change the initial checkout to retrieve the target (i.e., base) branch, falling back to the commit in case of push events. This ensures that we retrieve the variables from the correct branch, and matches the behavior of Ariane triggered workflows. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 11 March 2024, 17:17:30 UTC
b284170 fqdn: prevent conntrack GC from reaping newly-added IPs A bug was found where a low-TTL name was incorrectly reaped despite being part of an active connection. After looking at logs, and reproducing locally, it was determined that there is an unfortunate interleaving between the DNS and CT GC loops. The code attempts to prevent this issue by ensuring that names inserted after CT GC has started are exempt from reaping. However, we don't actually track the insertion time, we track the DNS TTL expiration time, which is strictly in the past. In fact, it can be up to a minute in the past. We shouldn't rely on timestamps anyway, as the scheduler can always play tricks on us. So, if a CT GC run has started and finished in the time between name expiration and insertion into zombies, the IP address is immediately considered dead and unnecessarily reaped. Timeline: T1. name expires T2. CT GC starts and finishes T3. Zombies.SetCTGCTime(T2) T4. Zombies.Upsert(name, T1) T5. Zombies.GC() At T5, zombies.GC will remove IPs associated with name, because T2 > T1. The solution is to use an explicit serial number to ensure that CTGC has completed a full run before we are allowed to delete an IP. We actually need to let CT GC run twice, as it may have started before this zombie was added and thus not marked it alive. Additionally, we already have a grace period, the idle connection timeout, that gives applications a chance to re-use an expired IP. However, we did not respect this grace period if the IP in question did not have an entry in conntrack. So, pad deletion time by this grace period as well, just to be sure this grace period applies to all possible deletions. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 11 March 2024, 17:01:45 UTC
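The serial-number idea in b284170 can be illustrated with a minimal Go sketch. This is not Cilium's implementation: the `zombies` type, its fields, and the methods `CTGCCompleted` and `MarkAlive` are hypothetical names chosen for illustration, and the idle-connection grace period mentioned in the commit is left out. The sketch only shows why counting completed CT GC runs, and requiring two of them after insertion, avoids the timestamp interleaving in the timeline above.

```go
package main

import (
	"fmt"
	"sort"
)

// zombie is a hypothetical record for an IP whose DNS name has expired.
type zombie struct {
	addedAtSerial uint64 // completed CT GC runs at the time the IP was added
	alive         bool   // set when a CT GC run finds a live connection
}

// zombies is a simplified, hypothetical tracker (not Cilium's actual type).
type zombies struct {
	completedRuns uint64
	ips           map[string]*zombie
}

func newZombies() *zombies { return &zombies{ips: map[string]*zombie{}} }

// Upsert records an expired IP together with the current GC serial number.
func (z *zombies) Upsert(ip string) {
	if _, ok := z.ips[ip]; !ok {
		z.ips[ip] = &zombie{addedAtSerial: z.completedRuns}
	}
}

// MarkAlive is called when CT GC sees a connection that still uses the IP.
func (z *zombies) MarkAlive(ip string) {
	if zb, ok := z.ips[ip]; ok {
		zb.alive = true
	}
}

// CTGCCompleted bumps the serial number after a full CT GC run.
func (z *zombies) CTGCCompleted() { z.completedRuns++ }

// GC reaps IPs that were not marked alive and for which at least two CT GC
// runs have completed since insertion. The first run may have started before
// the IP was added, so only the second is guaranteed to have seen it.
func (z *zombies) GC() []string {
	var dead []string
	for ip, zb := range z.ips {
		if !zb.alive && z.completedRuns >= zb.addedAtSerial+2 {
			dead = append(dead, ip)
			delete(z.ips, ip)
		}
	}
	sort.Strings(dead)
	return dead
}

func main() {
	z := newZombies()
	z.CTGCCompleted()            // a CT GC run finishes before the insertion
	z.Upsert("10.0.0.1")         // name expired, IP becomes a zombie
	fmt.Println(len(z.GC()))     // earlier run does not count
	z.CTGCCompleted()            // may have started before the insertion
	fmt.Println(len(z.GC()))     // still not eligible
	z.CTGCCompleted()            // first run guaranteed to have seen the IP
	fmt.Println(z.GC())          // now eligible for reaping
}
```

Because eligibility depends only on a counter that advances when a run completes, a CT GC run that finished before the insertion can never make the new entry look stale.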
03f8c85 conntrack: only bump FQDN GC time when CT GC successful The FQDN GC subsystem waits for a successful CT GC run before marking IPs as stale. However, we were erroneously marking CT GC as successful even on failure, or when only run for a single family. So, only notify FQDN when we've done a successful GC pass for all configured families. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 11 March 2024, 17:01:45 UTC
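A rough Go sketch of the rule described in 03f8c85: the FQDN side is notified only if the GC pass for every configured family succeeded. The names `gcPass`, `allFamiliesSucceeded`, and `errGCFailed` are hypothetical and exist only for this illustration; the real code paths in Cilium differ.

```go
package main

import (
	"errors"
	"fmt"
)

// errGCFailed is a hypothetical error used to simulate a failed pass.
var errGCFailed = errors.New("conntrack GC failed")

// gcPass is a hypothetical stand-in for one CT GC pass over one address family.
type gcPass func() error

// allFamiliesSucceeded runs every configured pass and returns true only if
// none of them failed. It keeps going after a failure so that the remaining
// families are still collected.
func allFamiliesSucceeded(passes []gcPass) bool {
	ok := true
	for _, pass := range passes {
		if err := pass(); err != nil {
			ok = false
		}
	}
	return ok
}

func main() {
	ipv4 := func() error { return nil }
	ipv6 := func() error { return errGCFailed }

	if allFamiliesSucceeded([]gcPass{ipv4, ipv6}) {
		fmt.Println("notify FQDN GC")
	} else {
		fmt.Println("skip FQDN notification")
	}
}
```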
a84a18b fix(deps): update google.golang.org/genproto/googleapis/rpc digest to a219d84 Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 March 2024, 16:39:50 UTC
d6fbccf gateway-api: shorten the length of the value of the svc's label. Fixes #31285 When creating a gateway-api with a name exceeding 64 characters, it is impossible to create svc. This is because the label of svc references the name of gateway-api. Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 11 March 2024, 11:45:11 UTC
c095caa chore(deps): update all lvh-images main to bpf-next-20240309.012251 Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 March 2024, 11:33:38 UTC
3a26d31 bpf: nodeport: simplify CT entry validation in nodeport_lb*() Move the validation of the .rev_nat_index all the way into the CT lookup, so that a stale CT entry returns CT_NEW and gets re-created by the normal code flow. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 11 March 2024, 11:15:44 UTC
3dc32c8 Fix helm template for hubble-relay prometheus annotations We introduced a templating bug when adding prometheus annotations for hubble-relay. `{ { .Values.hubble.relay.prometheus.port | quote } }` will be rendered as is because of the spaces between the braces, which results in a helm templating error when enabling relay metrics without service monitors. ``` Error: YAML parse error on cilium/templates/hubble-relay/deployment.yaml: error converting YAML to JSON: yaml: invalid map key: map[interface {}]interface {}{".Values.hubble.relay.prometheus.port | quote":interface {}(nil)} ``` Fixes: e4abda5aba37 ("enable Prometheus metrics for Hubble Relay") Signed-off-by: Fabian Fischer <fabian.fischer@isovalent.com> 11 March 2024, 10:54:40 UTC
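The effect described in 3dc32c8 can be reproduced with Go's `text/template` package, on which Helm templates are built. The sketch below is only an illustration: the `render` helper and the `Port` key are made up for this example, and Helm's `quote` function is omitted because it is not part of the standard library.

```go
package main

import (
	"bytes"
	"fmt"
	"text/template"
)

// render parses src as a Go template and executes it against data.
// It is a hypothetical helper for this example only.
func render(src string, data any) (string, error) {
	t, err := template.New("annotation").Parse(src)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.Execute(&buf, data); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	data := map[string]any{"Port": 9966}

	// With spaces between the braces, "{ {" is not an action delimiter,
	// so the text passes through unchanged.
	broken, _ := render(`port: { { .Port } }`, data)
	fmt.Println(broken)

	// With "{{" and "}}" the action is evaluated.
	fixed, _ := render(`port: {{ .Port }}`, data)
	fmt.Println(fixed)
}
```

The unrendered braces are then read by the YAML parser as a nested flow mapping, which matches the "invalid map key" error quoted in the commit message.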
f4452d3 bpf: nodeport: add nodeport_rev_dnat_ingress_ipv4_hook infra This commit adds a hooking point to nodeport_rev_dnat_ingress_ipv4 in nodeport.h that can be used by cilium plugins to extend the functionality of this function. Signed-off-by: Gilberto Bertin <jibi@cilium.io> 11 March 2024, 09:33:48 UTC
edcb897 bgpv2: remove BGP peering policy translation from operator This change removes automatic translation of BGPv1 to BGPv2 CRDs from the operator. Separate CLI will be provided for doing this translation - similar to kubernetes-sigs/ingress2gateway CLI. Signed-off-by: harsimran pabla <hpabla@isovalent.com> 11 March 2024, 08:23:11 UTC
94c8b80 fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 March 2024, 07:23:51 UTC
9f2d021 chore(deps): update all github action dependencies Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 March 2024, 07:07:23 UTC
c3a3f07 chore(deps): update cilium/cilium-cli action to v0.16.0 Signed-off-by: renovate[bot] <bot@renovateapp.com> 11 March 2024, 07:07:16 UTC
a052869 bugfix: hostname config in httproute and gateway This fixes a bug where the hostname config isn't respected when set on a Gateway Listener and on an HTTPRoute's spec. Fixes: #30685 Signed-off-by: CJ Virtucio <cjv287@gmail.com> 10 March 2024, 13:36:49 UTC
26f8349 wireguard: Encrypt L7 proxy pkts to remote pods Marco reported that the following L7 proxy traffic is leaked (bypasses the WireGuard encryption): 1. WG: tunnel, L7 egress policy: forward traffic is leaked 2. WG: tunnel, DNS: all DNS traffic is leaked 3. WG: native routing, DNS: all DNS traffic is leaked This was reported before the introduction of the --wireguard-encapsulate [1]. The tunneling leak cases are obvious. The L7 proxy traffic got encapsulated by Cilium's tunneling device. This made it bypass the redirection to Cilium's WireGuard device. However, [1] fixed this behavior. For Cilium v1.15 (upcoming) nothing needs to be configured. Meanwhile, for v1.14.4 users need to set --wireguard-encapsulate=true. The native routing case is more tricky. The L7 proxy traffic got a src IP of a host instead of a client pod. So, the redirection was bypassed. To fix this, we extended the redirection check to identify L7 proxy traffic. [1]: https://github.com/cilium/cilium/pull/28917 Reported-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Martynas Pumputis <m@lambda.lt> 09 March 2024, 09:08:29 UTC
02d04b3 endpointmanager: Improve health reporter messages when stopped The health reporter messages could use a bit more clarity and another was a copy-paste mistake. Fixes: bb957f3821 ("endpointmanager: add modular health checks to epm componenets.") Signed-off-by: Chris Tarazi <chris@isovalent.com> 08 March 2024, 18:46:10 UTC
5d28c64 chore(deps): update module github.com/go-jose/go-jose/v3 to v3.0.3 [security] Signed-off-by: renovate[bot] <bot@renovateapp.com> 08 March 2024, 13:26:44 UTC
7e4ad4a bgpv1: ClusterIP advertisement with BGP Control Plane Fixes: #30875 Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 08 March 2024, 01:19:59 UTC
81f14bb bpf: Enable monitor aggregation for all events in bpf_network.c This commit adjusts the usage of send_trace_notify in bpf_network.c to enable monitor aggregation for all events emitted at this observation point in the datapath. This change helps improve resource usage by reducing the overall number of events that the datapath emits, while still enabling packet observability with Hubble. The events in bpf_network.c enable observability into the IPSec processing of the datapath. Before this commit, multiple other efforts have been made to increase the aggregation of events related to IPSec to reduce resource usage, see #29616 and #27168. These efforts were related to packets that were specifically marked as encrypted or decrypted by IPSec and did not include events in bpf_network.c that were emitted when either: (a) a plaintext packet has been received from the network, or (b) a packet was decrypted and reinserted into the stack by XFRM. Both of these events are candidates for aggregation because similar to-stack events will be emitted down the line in the datapath anyways. Additionally, these events are mainly useful for root-cause analysis or debugging and are not necessarily helpful from an overall observability standpoint. Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 08 March 2024, 01:19:36 UTC
8d525fe Add node neighbor link updater controller This controller is responsible for monitoring nodes that request a neighbor link update. It provides a single place in the application where all node neighbor link updates are performed, as insertNeighbor calls are currently sprinkled across the code base. Additionally, given the central location, we can provide additional module health signals should neighbor link updates fail. Signed-off-by: Fernand Galiana <fernand.galiana@gmail.com> 08 March 2024, 01:06:53 UTC
bee9ed4 Make insertNeighbor return err In order to surface module health signals, we need insertNeighbor to return an error instead of simply logging to the console. Signed-off-by: Fernand Galiana <fernand.galiana@gmail.com> 08 March 2024, 01:06:53 UTC
2bf8924 Queue implementation to track nodes neighbor link requests. Signed-off-by: Fernand Galiana <fernand.galiana@gmail.com> 08 March 2024, 01:06:53 UTC
2b0d111 envoy: Bump golang version to 1.21.8 This is to pick up the new image with updated golang version, and other dependency bump. Related build: https://github.com/cilium/proxy/actions/runs/8182679554/job/22374391003 Signed-off-by: Tam Mach <tam.mach@cilium.io> 08 March 2024, 00:31:10 UTC
46713b9 chore(deps): update all lvh-images main Signed-off-by: renovate[bot] <bot@renovateapp.com> 07 March 2024, 22:26:49 UTC
5c2ed45 agent: define new flags to control Cilium's datapath events notifications This commit introduces three new configuration flags for the Cilium agent, allowing users to choose the bpf event types they want to expose to Cilium monitor and Hubble. - `--bpf-events-drop-enabled` Expose 'drop' events for Cilium monitor and/or Hubble (default true) - `--bpf-events-policy-verdict-enabled` Expose 'policy verdict' events for Cilium monitor and/or Hubble (default true) - `--bpf-events-trace-enabled` Expose 'trace' events for Cilium monitor and/or Hubble (default true) The default values for these flags remain set to `true`, not changing the current behaviour. In our case, we found it particularly useful to disable the TraceNotification in order to reduce the CPU overhead on some of our nodes when Hubble is enabled, as we were mostly interested in dropped packets. Signed-off-by: Maxime Visonneau <maxime.visonneau@gmail.com> 07 March 2024, 18:48:02 UTC
2764994 bgpv1: Disable PodCIDR Reconciler for unsupported IPAM modes PodCIDR shouldn't take any effect for the unsupported IPAM modes. Modify ExportPodCIDRReconciler's constructor to not provide ConfigReconciler for unsupported IPAMs. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 07 March 2024, 17:03:29 UTC
726cde2 cilium: Enable plain IPIP/IP6IP6 termination Add a simple --enable-ipip-termination option for the agent which creates the cilium_ipip{4,6} devices similarly as with lb-only mode, but for the purpose that this does a straight-forward ipip decap for incoming packets. All are in remote any local any. bpf_netdev pushes these packets up the stack into the respective ipip devices which do plain decap, and then travel further up into a corresponding socket. [...] 5159: cilium_ipip4@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN mode DEFAULT group default link/ipip 0.0.0.0 brd 0.0.0.0 promiscuity 0 minmtu 0 maxmtu 0 ipip external ipip remote any local any ttl inherit pmtudisc addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 5160: cilium_ip6tnl@NONE: <NOARP> mtu 1452 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/tunnel6 :: brd :: permaddr 7e74:1189:d86c:: promiscuity 0 minmtu 68 maxmtu 65407 ip6tnl ip6ip6 remote any local any hoplimit inherit encaplimit 0 tclass 0x00 flowlabel 0x00000 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 5161: cilium_ipip6@NONE: <NOARP,UP,LOWER_UP> mtu 1452 qdisc noqueue state UNKNOWN mode DEFAULT group default link/tunnel6 :: brd :: permaddr a28:8495:68b8:: promiscuity 0 minmtu 68 maxmtu 65407 ip6tnl external any remote any local any hoplimit inherit encaplimit 0 tclass 0x00 flowlabel 0x00000 addrgenmode eui64 numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 4994: cilium_tunl@NONE: <NOARP> mtu 1480 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ipip 0.0.0.0 brd 0.0.0.0 promiscuity 0 minmtu 0 maxmtu 0 ipip any remote any local any ttl inherit nopmtudisc numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 [...] Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 07 March 2024, 17:01:59 UTC
61076b0 docs: Fix various typos in the readme Signed-off-by: Brian Payne <payne.in.the.brian@gmail.com> 07 March 2024, 13:52:58 UTC
0262c28 patches: Call upstream callbacks via UpstreamFilterManager Envoy has moved the encodeHeaders() call to a new call path in upstream decoder filter. Move the upstream callbacks iteration call there to be just before the encodeHeaders() call, and call the iteration via UpstreamFilterManager so that the callbacks registered in the downstream filter manager are used. Call sendLocalReply also via the UpstreamFilterManager to have its local state updated properly for upstream processing. Relates: https://github.com/envoyproxy/envoy/pull/26916/files#r1176556258 Related commit: https://github.com/cilium/proxy/commit/21100f00630d912181a9b8893f6782c156d4535f Related build: https://github.com/cilium/proxy/actions/runs/8156758279/job/22294915980 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> Signed-off-by: Tam Mach <tam.mach@cilium.io> 07 March 2024, 13:15:31 UTC
bb0db0d chore(deps): update all lvh-images main Signed-off-by: renovate[bot] <bot@renovateapp.com> 07 March 2024, 11:17:54 UTC
1dadae3 bpf: Don't skip local delivery for plain-text packets The current IPsec decryption handling in bpf_host is buggy: when executing do_decrypt, we skip all subsequent logic in the same BPF program and return to stack. That subsequent logic includes (1) all service handling logic including NAT, (2) the host policies enforcement, and (3) delivery to the destination pods. (1) and (2) are not an issue today because KPR and the host firewall are incompatible with IPsec. (3) is also not really an issue because packets will flow through cilium_host where they will run through the local delivery logic again; that's however less efficient. This commit changes the logic a bit, to not skip the subsequent BPF processing for plain-text ingressing packets. If the packet is encrypted and needs decryption, we should send it to the stack, but any other packet should be fully processed by bpf_host. Fixes: 7ba0e83acc ("cilium: from-netdev and from-network BPF programs conflicting hooks") Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 07 March 2024, 10:42:55 UTC
777b580 bgpv1: Adjust ConnectionRetryTimeSeconds to 1 in component tests Set ConnectionRetryTimeSeconds to 1s in the component tests unless it is specified explicitly. Otherwise, when the initial connect fails, we need to wait 120s for the next connection by default, which may be longer than the timeout of the test itself. Fixes: #31217 Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 07 March 2024, 10:35:07 UTC
77a0c6b gha: additionally test host firewall + KPR disabled in E2E tests Let's additionally enable host firewall on a couple of existing matrix entries associated with KPR disabled, so that we can additionally cover this configuration and prevent regressions. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 07 March 2024, 09:34:32 UTC
3ce296b doc,bgpv1: Bootstrapping BGP CPlane failure scenario doc Bootstrap a document outlining the failure scenarios of a BGP Control Plane with an initial focus on how the BGP Control Plane behaves when the agent is down, as a part of Operation Guide. The main distinction from the troubleshooting documents is that this document emphasizes issues that occur involuntarily, while the troubleshooting documents focus more on voluntary issues. Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 07 March 2024, 02:13:40 UTC
d923b81 contrib: don't use sysctl to check if IPv6 is enabled The sysctl command may not be installed, or in the default path (e.g., in case of Debian), causing errors like: ./contrib/scripts/kind.sh: line 12: sysctl: command not found ./contrib/scripts/kind.sh: line 12: [: ==: unary operator expected Instead, let's just directly access the content of the corresponding proc file, so that we don't depend on external tools. Fixes: f7fdeef2cc19 ("ipfamily should be set by platform configuration.") Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 07 March 2024, 01:33:12 UTC
db7b3ef Downgrade L2 Neighbor Discovery failure log to Debug Suppress the "Unable to determine next hop address" logs. While it shows the L2 neighbor resolution failure, it does not always indicate a datapath connectivity issue. For example, when "devices=eth+" is specified and the device naming/purposing is not consistent across the nodes in the cluster, in some nodes "eth1" is a device reachable to other nodes, but in some nodes, it is not. As a result, L2 Discovery generates an "Unable to determine next hop address". Another example is ENI mode with automatic device detection. When secondary interfaces are added, they are used for L2 Neighbor Discovery as well. However, nodes can only be reached via the primary interface through the default route in the main routing table. Thus, L2 Neighbor Discovery generates "Unable to determine next hop address" for secondary interfaces. In both cases, it does not always mean the datapath has an issue for KPR functionality. However, the logs appear repeatedly, are noisy, and the message is error-ish, causing confusion. This log started to appear for some users who did not see it before from v1.14.3 (#28858) and v1.15.0 (in theory). For v1.14.3, it affects KPR + Tunnel users because of f2dcc86. Before the commit, we did not perform L2 Neighbor Discovery in tunnel mode, so even if users had an interface unreachable to other nodes, the log did not appear. For v1.15.0, it affects users who already had an unreachable interface. 2058ed6a made it visible. Before the commit, some errors such as EHOSTUNREACH and ENETUNREACH were not caught because the FIBMatch option wasn't specified. After v1.15.0, users started to see the log. Fixes: #28858 Signed-off-by: Yutaro Hayakawa <yutaro.hayakawa@isovalent.com> 07 March 2024, 00:46:22 UTC