https://github.com/cilium/cilium

Revision Author Date Message Commit Date
39db9aa [wip] Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 04 January 2024, 05:29:35 UTC
1d1449d [wip] Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 02 January 2024, 04:20:26 UTC
ffa8775 wip 02 January 2024, 04:11:01 UTC
c8e49a1 -s 02 January 2024, 03:55:30 UTC
f9ab759 [wip] Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 02 January 2024, 03:36:39 UTC
e69ac8a [wip] Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 02 January 2024, 03:29:00 UTC
1d04347 [wip] Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 02 January 2024, 01:46:23 UTC
e13f6ff wip 29 December 2023, 07:25:05 UTC
7e8b388 wip 29 December 2023, 06:45:07 UTC
a14ff80 wip 28 December 2023, 22:42:35 UTC
cc9b33b wi0p 28 December 2023, 21:24:13 UTC
c1c10ce [WIP] Enable and port-forward Hubble for conformance e2e. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 28 December 2023, 20:50:03 UTC
2daca4c removed metal LB references Signed-off-by: Nico Vibert <nicolas.vibert@isovalent.com> 27 December 2023, 14:45:38 UTC
dc84a75 Add a description to the default GatewayClass Signed-off-by: chaunceyjiang <chaunceyjiang@gmail.com> 27 December 2023, 12:04:50 UTC
3d73117 Add tests for enforceHTTPS behavior in operator model translation This adds mechanisms and tests for how the translator works differently when enforceHTTPS is enabled. Signed-off-by: Nick Young <nick@isovalent.com> 27 December 2023, 11:19:42 UTC
79d5029 Refactor getEnvoyHTTPRouteConfiguration test This commit refactors the getEnvoyHTTPRouteConfiguration test in `pkg/operator/model/translation` by: - moving the expected configs into the `fixture_test.go` file - adding a new test for using the multiple hostnames in one Route functionality inside the model.Model. This is to prepare for adding additional test cases that test what happens when `enforceHTTPS` is set to `true` - currently we only test what happens when it is `false`, but `true` is actually the default. Signed-off-by: Nick Young <nick@isovalent.com> 27 December 2023, 11:19:42 UTC
9f27354 spire: Add new method SyncAuthorizedEntries in test Signed-off-by: Tam Mach <tam.mach@cilium.io> 27 December 2023, 11:16:39 UTC
b3ce0a3 fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 27 December 2023, 11:16:39 UTC
04da6bf contrib: Autodetect GITHUB_TOKEN during release If the GITHUB_TOKEN variable is not set, try to create it via local CLI. Signed-off-by: Joe Stringer <joe@cilium.io> 21 December 2023, 15:28:51 UTC
9dc0c5e contrib: Require github authentication for release Adding this check at the start of the release process, and in the prep-changelog script which will later execute $(gh auth token) if a token is not available. Signed-off-by: Joe Stringer <joe@cilium.io> 21 December 2023, 15:28:51 UTC
f7416e8 chore(deps): update all lvh-images main Signed-off-by: renovate[bot] <bot@renovateapp.com> 21 December 2023, 11:09:46 UTC
1571cce bgpv1: set running flag in manager BGP Manager to unset running flag when Stop is called. This is to fix flaky tests where reconcile gets called after Stop, which recreates BGP servers at shutdown stage. Signed-off-by: harsimran pabla <hpabla@isovalent.com> 21 December 2023, 10:18:02 UTC
066dee8 egressgw: remove deleteStaleIPRulesAndRoutes() This reverts commit 77232f47b3b01872990ffa644faaeecd84eca4c8. This code was only needed to facilitate a clean upgrade from v1.14 to v1.15. Now that we have a v1.15 branch, remove it from the development branch. Fixes: https://github.com/cilium/cilium/issues/29441 Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 21 December 2023, 10:10:37 UTC
0bf036e hive: Unify parsing of string slices Hive uses pflag and viper to parse configuration flags from multiple sources. If a flag is set via command-line then the pflag parser is invoked to get to the destination type as defined in the FlagSet ("flags.StringSlice" [1]), however if the flag comes from environment or config-map, then the parsing was done by a mapstructure hook [2]. This is all well and good as long as these two ways of parsing into say []string are aligned with each other. And they were. Unfortunately these were not aligned with the pre-Hive way of parsing which used viper.GetStringSlice and similar methods. Specifically viper.GetStringSlice is implemented ([3]) via cast.ToStringSlice, which uses strings.Fields that splits by whitespace instead of by commas. So to summarize the different ways a StringSlice can be parsed: - [1]: flags.StringSlice: parses with csv.Reader (split by comma) - [2]: stringToSliceHookFunc: splits by comma - [3]: viper.GetStringSlice: splits by whitespaces So while arguably the first two are more consistent, we can't just flip from splitting by spaces to splitting by commas as that creates a huge foot-gun when fields are moved from option.Config to individual hive.Config structs. To solve this we allow splitting both ways by using two mapstructure hooks that process the values before they're pushed to the config struct: - mapstructure.StringToSliceHookFunc(",") splits first string by commas. This only impacts input coming from environment, configmap or flags.String and going to []string. - fixupStringSliceHookFunc takes []string coming from flags.StringSlice or from the StringToSliceHookFunc and resplits it by whitespace if it was of length 1. 
With this, we have unified the parsing of []string across all the config input methods: "foo,bar,baz" => []string{"foo", "bar", "baz"} "foo bar baz" => []string{"foo", "bar", "baz"} "foo bar,baz" => []string{"foo bar", "baz"} [1]: https://github.com/spf13/pflag/blob/master/string_slice.go#L27 [2]: https://github.com/mitchellh/mapstructure/blob/main/decode_hooks.go#L104 [3]: https://github.com/spf13/viper/blob/9154b900c34ad9d88897f7e5288ce43f457f698b/viper.go#L1067 Fixes: #29210 Fixes: b407ffce15 ("hive: Reimplement on top of dig") Signed-off-by: Jussi Maki <jussi@isovalent.com> 21 December 2023, 09:29:40 UTC
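The two-step splitting described in this commit can be sketched in plain Go. This is only an illustration: the function names `splitByComma`, `fixupStringSlice`, and `parseStringSlice` are hypothetical stand-ins, and the actual change is implemented as mapstructure decode hooks rather than direct function calls.

```go
package main

import (
	"fmt"
	"strings"
)

// splitByComma stands in for the first hook: a single string is split on commas.
func splitByComma(s string) []string {
	if s == "" {
		return []string{}
	}
	return strings.Split(s, ",")
}

// fixupStringSlice stands in for the second hook: a slice of length 1 is
// re-split on whitespace; longer slices are left untouched.
func fixupStringSlice(in []string) []string {
	if len(in) == 1 {
		return strings.Fields(in[0])
	}
	return in
}

// parseStringSlice chains both steps, as the commit message describes.
func parseStringSlice(s string) []string {
	return fixupStringSlice(splitByComma(s))
}

func main() {
	for _, in := range []string{"foo,bar,baz", "foo bar baz", "foo bar,baz"} {
		fmt.Printf("%q => %q\n", in, parseStringSlice(in))
	}
}
```

Because the whitespace re-split only applies to single-element slices, an input that already contains a comma keeps its embedded spaces, which matches the third example in the commit message.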
5f77d50 docs: Add upgrade note for CNP empty slices new semantic Following the changes in previous commits, where the semantic of an empty non-nil slice in CNPs has been changed, an upgrade note is added to the upgrade guide. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 21 December 2023, 07:19:50 UTC
0d1bf68 policy: Add unit test for empty ToEndpoints egress CNP Add a unit test to check that a non-nil empty ToEndpoints slice in an egress CiliumNetworkPolicy does not select any identity. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 21 December 2023, 07:19:50 UTC
6ccd044 policy: Do not select any identity with egress empty slices In case of L4 egress policies with an explicit (non-nil) empty slice in one of: - toEndpoints - toCIDR - toCIDRSet - toEntities no identities should be selected, thus falling back to default deny for an allow policy. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 21 December 2023, 07:19:50 UTC
e97df7b policy: Do not select any identity with ingress empty slices In case of L4 ingress policies with an explicit (non-nil) empty slice in either: - fromEndpoints - fromCIDR - fromCIDRSet - fromEntities no identities should be selected, thus falling back to default deny for an allow policy. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 21 December 2023, 07:19:50 UTC
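The distinction these policy commits rely on, a nil slice versus an explicit non-nil empty slice, can be shown with a small Go sketch. The function `selectPeers` and the label strings are hypothetical, and treating a nil list as "no restriction" is an assumption made for the illustration, not a statement about the actual policy code.

```go
package main

import "fmt"

// selectPeers returns the identities selected by a peer list.
// Assumption for illustration: a nil list means the field was omitted and
// imposes no restriction, while a non-nil empty list selects nothing.
func selectPeers(peers []string, known []string) []string {
	if peers == nil {
		return known
	}
	selected := []string{}
	for _, id := range known {
		for _, p := range peers {
			if id == p {
				selected = append(selected, id)
			}
		}
	}
	return selected
}

func main() {
	known := []string{"app=frontend", "app=backend"}
	fmt.Println(selectPeers(nil, known))                     // field omitted
	fmt.Println(selectPeers([]string{}, known))              // explicit empty slice
	fmt.Println(selectPeers([]string{"app=backend"}, known)) // one explicit peer
}
```

The key point is that `peers == nil` and `len(peers) == 0` are different checks in Go; only the first one distinguishes an omitted field from an explicitly empty one.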
c084bc0 metrics/metric: improve logging when label values are violated. When a metric declared with XWithLabels(...) is updated with a vector label value that is outside the expected range of values for that label, an error log is emitted (with the metric still being emitted). This commit improves this by: * Switching the error log to a warning, as these logs do not indicate a serious runtime error but are meant to help developers catch use of undeclared label values. * Adding information about what metric/label/value caused the violation. This also DRYs up the label value checking code. Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 21 December 2023, 05:33:52 UTC
9f845ab metrics: revert changes to pre-init kubernetes events metrics. Commit bd5ec0b3070fb9536761ccb8245388624fdaffa9 introduced pre-declaring metrics label value ranges for k8s watcher metrics. This also performs a check when the metric is updated to check whether the metric label value is within the (optional) declared range. However, k8s watchers would require declaring all possible CRD types in the scope. We could make this change, but this list would be hard to maintain, and more importantly, we probably don't want to initialize metrics for CRDs that are not necessarily going to be used (i.e. CES if disabled). This reverts this change, until a better solution for this type of metric is developed. Reported-by: Marco Iorio <marco.iorio@isovalent.com> Signed-off-by: Tom Hadlaw <tom.hadlaw@isovalent.com> 21 December 2023, 05:33:52 UTC
2afcb61 bpf: implement multicast delivery This commit implements replication and delivery of multicast packets. This commit also enables the Cilium datapath to access both `bpf_clone_redirect` and `bpf_map_for_each_elem` helpers. The datapath flow is illustrated below: [Diagram: on the sending node, the sender pod's bpf_lxc program (cil_from_container) performs eBPF replication and redirection to local receivers (veth to pod) and to the vxlan device; on the remote node, the vxlan device hands packets to from_overlay, which performs eBPF replication and redirection to remote receivers (veth to pod).] A multicast sender sends a multicast packet. The sender's bpf_lxc program does a lookup in the multicast group map to discover who has subscribed to the group. The program then clones and redirects the packets to the subscriber's ingress device on the host namespace. If the subscriber is remote the packet is cloned and redirected to a vxlan device for encapsulation. Once the host stack forwards the vxlan encap'd packet to the receiving vxlan device on the remote host a similar "clone and redirect" process is performed once the vxlan driver decaps the packet. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com> 20 December 2023, 20:21:43 UTC
8c488dd bpf,igmp: igmpv2 join and leave parsing This commit adds parsing of IGMPv2 messages in a similar fashion as IGMPv3 messages. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com> 20 December 2023, 20:21:43 UTC
d7e580f bpf,mcast: initial IGMPv3 parsing in bpf_lxc This commit introduces IGMPv3 detection and parsing. When bpf_lxc recognizes IGMP messages egressing the Pod we attempt to parse them. The parsing logic is as follows: 1. Determine if traffic is IGMP 2. Determine the IGMP message type 3. If the type is not a membership report simply drop it (for now) 4. Parse each group record in the membership report 5. For any group records which indicate a join add a subscriber to the multicast subscriber map, if it exists. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com> 20 December 2023, 20:21:43 UTC
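The parsing steps listed in this commit can be approximated in userspace Go. This is a rough sketch, not the eBPF code from the commit: the message layout follows the IGMPv3 membership report format from RFC 3376, the function name `parseJoins` is hypothetical, the checksum is not verified, and treating only a CHANGE_TO_EXCLUDE_MODE record with an empty source list as a join is an assumption made to keep the example short.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"net/netip"
)

const (
	igmpV3MembershipReport = 0x22 // IGMPv3 membership report type (RFC 3376)
	recChangeToExclude     = 4    // CHANGE_TO_EXCLUDE_MODE record type (RFC 3376)
)

var errNotReport = errors.New("not an IGMPv3 membership report")

// parseJoins walks the group records of an IGMPv3 membership report and
// returns the groups that this sketch treats as joins.
func parseJoins(msg []byte) ([]netip.Addr, error) {
	if len(msg) < 8 {
		return nil, errors.New("short IGMP message")
	}
	if msg[0] != igmpV3MembershipReport {
		return nil, errNotReport
	}
	numRecords := int(binary.BigEndian.Uint16(msg[6:8]))
	off := 8
	var joins []netip.Addr
	for i := 0; i < numRecords; i++ {
		if len(msg) < off+8 {
			return nil, errors.New("truncated group record header")
		}
		recType := msg[off]
		auxWords := int(msg[off+1])
		numSrc := int(binary.BigEndian.Uint16(msg[off+2 : off+4]))
		var g [4]byte
		copy(g[:], msg[off+4:off+8])
		// Record length: fixed header, source list, auxiliary data (32-bit words).
		recLen := 8 + 4*numSrc + 4*auxWords
		if len(msg) < off+recLen {
			return nil, errors.New("truncated group record")
		}
		// Assumption: an any-source join is CHANGE_TO_EXCLUDE_MODE with no sources.
		if recType == recChangeToExclude && numSrc == 0 {
			joins = append(joins, netip.AddrFrom4(g))
		}
		off += recLen
	}
	return joins, nil
}

func main() {
	report := []byte{
		0x22, 0, 0, 0, 0, 0, 0, 2, // header: type, reserved, checksum, reserved, 2 records
		4, 0, 0, 0, 239, 1, 1, 1, // CHANGE_TO_EXCLUDE_MODE, no sources
		3, 0, 0, 0, 239, 2, 2, 2, // CHANGE_TO_INCLUDE_MODE, no sources
	}
	joins, err := parseJoins(report)
	fmt.Println(joins, err)
}
```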
b7822a3 bpf,multicast: add map infrastructure This commit adds the eBPF map used to implement the synthetic multicast feature. A `BPF_MAP_TYPE_HASH_OF_MAPS`, which employs a `BPF_MAP_TYPE_HASH` inner map, is added to the datapath. The outer eBPF map is keyed by IPv4 multicast group addresses in big endian format and the values are `BPF_MAP_TYPE_HASH` maps. The inner hash map associates IPv4 source addresses with their subscriber multicast metadata. Each key/value in the inner hash map is a subscriber of the owning multicast group. Signed-off-by: Louis DeLosSantos <louis.delos@isovalent.com> 20 December 2023, 20:21:43 UTC
4135f5b bpf: add improved helper for program-internal tail-call The naming of ep_tail_call() is very misleading - it *doesn't* call to an endpoint's policy tail-call (in POLICY_CALL_MAP), and it's also used by programs that are *not* associated with an endpoint (eg. bpf_xdp and bpf_overlay). Instead it's used for a tail-call in the program's internal tail-call map. So start off with introducing a new helper with improved naming. Then let this helper return DROP_MISSED_TAIL_CALL (instead of every caller open-coding the same value), and also setting the index of the missed tail-call in `ext_err` where available. Finally steal a bit of compiler magic from the kernel, and enforce that callers check for returned errors. We've had too many cases in the past where we forgot to return the DROP_MISSED_TAIL_CALL. Then convert a few initial callers to demonstrate the usage. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 20 December 2023, 16:04:09 UTC
7ebaf63 test: remove warning ignore for missing identity delete This should be fixed now. Fixes: #29681 Signed-off-by: Casey Callendrello <cdc@isovalent.com> 20 December 2023, 14:10:54 UTC
afc3b5f identity: don't force-notify policy subsystem for local ids AllocateIdentity() takes a parameter, `notifyOwner`, that when true, tells the allocator to propagate those updates to the policy subsystem. However, the local (i.e. CIDR) identity allocator ignores this field, and *always* propagates identity updates to the policy subsystem as well as **always regenerating all endpoints**. This is needless, since the ipcache always updates the policy system as well. This change is safe since the ipcache is the only caller that passes `notifyOwner = false`, and the ipcache updates the policy system itself. All other callers to `AllocateIdentity()` set `notifyOwner = true`. Fixes: #29681 Signed-off-by: Casey Callendrello <cdc@isovalent.com> 20 December 2023, 14:10:54 UTC
70ec61c ipcache: always pass identity updates to the policy engine With the newer, asynchronous APIs, the ipcache takes on the responsibility of updating the policy engine (i.e. the SelectorCache) whenever it allocates an identity. However, the legacy APIs don't do that. This commit changes that. By centralizing this in the ipcache, we can (in a subsequent commit) stop regenerating all endpoints whenever an identity is allocated. This is wasteful, since local identities can only ever be allocated by the ipcache now. Signed-off-by: Casey Callendrello <cdc@isovalent.com> 20 December 2023, 14:10:54 UTC
da3b48f chore(deps): update actions/setup-go action to v5 Signed-off-by: renovate[bot] <bot@renovateapp.com> 20 December 2023, 14:04:57 UTC
ef98491 images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 20 December 2023, 11:08:42 UTC
733af3d chore(deps): update docker.io/library/ubuntu:22.04 docker digest to 6042500 Signed-off-by: renovate[bot] <bot@renovateapp.com> 20 December 2023, 11:08:42 UTC
b26d9be docs: Fix keyid derivation in IPsec docs Previously, when determining a keyid before the rotation, the doc suggested running "cut -c 1". This returns only the first digit (e.g., if keyid is "15", then "1" is returned). This breaks the rotation 15=>1. Fixes: 42ef7f3f814 ("docs: Update IPsec key rotation command") Reported-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Martynas Pumputis <m@lambda.lt> 20 December 2023, 10:57:08 UTC
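The difference between taking the first character and taking the whole first field can be sketched in Go. The key string layout used here (keyid as the first whitespace-delimited field) and the wrap from 15 back to 1 are assumptions inferred from the commit message; the function names and the sample key value are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// firstChar reproduces the "cut -c 1" behaviour: only the first character.
func firstChar(key string) string {
	if key == "" {
		return ""
	}
	return key[:1]
}

// keyID parses the whole first whitespace-delimited field as the key ID.
func keyID(key string) (int, error) {
	fields := strings.Fields(key)
	if len(fields) == 0 {
		return 0, errors.New("empty key string")
	}
	return strconv.Atoi(fields[0])
}

// nextKeyID assumes IDs run from 1 to 15 and wrap back to 1.
func nextKeyID(id int) int {
	if id >= 15 {
		return 1
	}
	return id + 1
}

func main() {
	key := "15 example-algo 0123abcd 128" // hypothetical key string
	id, err := keyID(key)
	fmt.Println(firstChar(key), id, err, nextKeyID(id))
}
```

With a two-digit keyid the first-character approach reports 1 instead of 15, so the next ID would be computed as 2 rather than wrapping to 1.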
0bed4d6 chore(deps): update all github action dependencies Signed-off-by: renovate[bot] <bot@renovateapp.com> 20 December 2023, 10:31:55 UTC
babba79 helm: enforce routing-mode when {gke,aksbyocni}.enabled is set. Historically, the Cilium helm chart allowed to override the routing mode leveraged in combination with {gke,aksbyocni}.enabled. This is no longer possible since aff16b2e404d ("Change routing-mode and tunnel interaction."). According to the Cilium documentation [1,2], this appears to be the correct behavior, as the routing mode must be respectively set to native and tunnel in these cases. Hence, let's validate that users didn't configure a different routing mode, to avoid falling back silently, which may be confusing. [1]: https://docs.cilium.io/en/stable/network/concepts/routing/#id6 [2]: https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/#install-cilium (AKS tab) Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 20 December 2023, 08:29:25 UTC
d24fa08 endpoint: fix endpoint suite tests duration regression d9a4594b2048 ("etcd: rework lease manager to additionally create sessions") appears to have caused a regression in the duration of the endpoint tests, due to a slightly different implementation of the isConnectedAndHasQuorum() etcd client method, which is in turn executed before starting a Watcher (through Connected()). The regression is the combination of multiple causes: * Incorrect endpoint tests tear down ordering, due to the interplay of TearDownTest and Cleanup; specifically, the identity allocator manager (which internally starts the watch operation) is stopped after closing the kvstore client; * The identity allocator does not propagate a context to the watch operation, but instead relies on an external channel to stop it; * The etcd Connected() function no longer terminates immediately when the client is closed (i.e., when the session gets closed). The reason is that the session is no longer guaranteed to be unique. * The etcd client appears to retry certain operations multiple times if it is closing but the given context is not canceled (as in this case, where the given context is never canceled). This is the reason for the extra delay, and comes with 100 repetitions of the warning: {"level":"warn","ts":"...","logger":"etcd-client","caller":"v3/retry_interceptor.go:62", "msg":"retrying of unary invoker failed","target":"...","attempt":0, "error":"rpc error: code = Canceled desc = grpc: the client connection is closing"} Let's adopt the easiest fix here, and make sure that the client is closed after closing the manager. This ensures correct tear down ordering and prevents the etcd client retry process. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 20 December 2023, 08:28:04 UTC
0ff8bdb fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 19 December 2023, 22:20:26 UTC
4ad5718 fix(deps): update module golang.org/x/crypto to v0.17.0 [security] Signed-off-by: renovate[bot] <bot@renovateapp.com> 19 December 2023, 22:13:43 UTC
18533aa chore(deps): update gcr.io/distroless/static-debian11:nonroot docker digest to 112a87f Signed-off-by: renovate[bot] <bot@renovateapp.com> 19 December 2023, 21:36:56 UTC
89c3789 bpf: xdp: use bpf_xdp_get_buff_len() when available This helper was introduced in kernel 5.18 with 0165cc817075 ("bpf: introduce bpf_xdp_get_buff_len helper"). Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 19 December 2023, 19:28:07 UTC
3d3f69c policy: Fix mapstate changes error in entry change comparison Datapath equalness of the old and new entry must be evaluated before entries are merged, afterwards the entries will be the same and the key is not added in ChangeState.Adds, which may lead to a policy map entry not being updated during incremental updates. Fixes: #26331 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 19 December 2023, 18:56:41 UTC
e95f2e0 complexity-tests: add bpf_network configuration Add an initial bpf_network configuration. The defines are taken from a sysdump for a failing CI run that seems to exhibit verifier problems. Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 19 December 2023, 18:45:19 UTC
a495abd ipsec: ensure all trace events are discarded when mon. agg. is enabled This commit modifies calls to `send_trace_notify` in the datapath to drop trace events related to encrypted packets when monitor aggregation is enabled. More specifically, this commit ensures that whenever `send_trace_notify` is called with a `trace_reason` of `TRACE_REASON_ENCRYPTED`, the `monitor` argument is set to zero. A Coccinelle script is provided in this commit to add a build-time check for this requirement moving forward. This change helps to reduce the overall CPU usage of Cilium Agents when IPSec encryption is enabled, by reducing the number of trace events emitted by the datapath. Normally monitor aggregation can be used in order to reduce the number of trace events, however IPSec-related trace events are not able to be aggregated since they lack the necessary connection tracking information. See the function `emit_trace_notify` in `bpf/lib/trace.h` for more information. This same change was applied in `bpf/bpf_lxc.c` in commit 3e52822. Thank you to Lorenz who added a workaround for passing the verifier with Clang 10. See also: cilium/cilium#27168 Co-authored-by: Lorenz Bauer <lmb@isovalent.com> Signed-off-by: Ryan Drew <ryan.drew@isovalent.com> 19 December 2023, 18:45:19 UTC
3942bea MAINTAINERS: Add Yutaro Voting results (courtesy of Joe): YES: 25 (52%) NO: 0 (0%) ABSTAIN: 23 (48%) With the Company Block Vote Limit applied: YES: (23 / (23/6)) + 2 = 8 votes NO: 0 votes Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 19 December 2023, 18:42:13 UTC
7f0e373 external workloads: fix node identity error During external workloads initialization, the clustermesh-apiserver allocates a new identity for the external workload, which is then propagated through etcd to the agent running on the external workload; yet, the agent eventually overrides it with the default value (1). While this appears to be harmless from a functional point of view, as the clustermesh-apiserver configures the original identity as part of the associated CiliumNode and CiliumEndpoint resources (which are then watched by the other agents), it also triggers an error, due to the unexpected mismatch: CEW: Invalid identity 1 in ... Let's fix this by not overriding the originally assigned identity from the external workload agent. Signed-off-by: Marco Iorio <marco.iorio@isovalent.com> 19 December 2023, 18:16:35 UTC
a573bb4 endpoint: Use resolved named port also in the proxy stats Commit 10f04fd98a5fb17c745aa4363aa331128ab0698c (endpoint: Resolve named ports for redirects) fixed redirect creation for L7 policies using a named port, but failed to use the resolved destination port also in proxy stats. This commit does that. Fixes: #29023 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 19 December 2023, 18:16:15 UTC
e5b1de4 k8s/watchers: Remove GetIndexer and SetIndexer GetIndexer and SetIndexer were meant to ease the testability of the previous non-modularized implementation of the agent endpoint GC. Since they are not needed anymore, it is possible to safely remove them. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 19 December 2023, 18:15:51 UTC
b825850 endpointcleanup: Modularize stale CE cleanup init procedure Modularize the agent stale CiliumEndpoint init procedure in an independent cell. The logic is functionally equivalent to the previous implementation. The tests have been refactored to use the hive and cell framework, thus implementing unit-style integration tests. As an additional benefit, this removes the direct dependency on the CEP and CES stores managed in the related k8s watchers. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 19 December 2023, 18:15:51 UTC
1e73c1a bpf: alignchecker: add encrypt_config and world_cidrs_key4 These structs are also used by the agent. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 19 December 2023, 18:13:23 UTC
1cf0570 monitor/payload: remove bitrotted benchmark Benchmarks written against gocheck / checkmate can't be invoked easily and lack integration with the normal go test runner. They also aren't executed by CI and therefore tend to rot away. Remove the gocheck benchmark. Updates #29546 Signed-off-by: Lorenz Bauer <lmb@isovalent.com> 19 December 2023, 17:00:17 UTC
5c756c0 fix(deps): update all go dependencies main Signed-off-by: renovate[bot] <bot@renovateapp.com> 19 December 2023, 10:14:30 UTC
1bf697f workflows: Increase IPsec upgrade test's timeout The IPsec upgrade test started timing out from time to time on main's CI. This commit bumps the timeout a bit to avoid such failures. Signed-off-by: Paul Chaignon <paul.chaignon@gmail.com> 19 December 2023, 09:16:21 UTC
ffa9aee k8s: Update to final v1.29.0 Signed-off-by: Chris Tarazi <chris@isovalent.com> 19 December 2023, 08:59:56 UTC
08017c2 bgpv1: Avoid creating resource.Store in Start() of BGP store wrappers This changes event handling in BGP CP resource.Store wrappers (DiffStore, BGPCPResourceStore) from hive hooks and a goroutine to a hive job, mainly to avoid creating resource.Store in the Start() hive hook. This also makes them wrappers rather than a super set of the resource.Store, exposing only API that is actually used within the BGP CP, and allowing all methods to return an error if the store has not yet initialized. Since creating resource.Store() blocks until the store is initialized, we should avoid calling it in the Start() hive hook to not slow down the startup process and to allow progressing with the startup even when the CRD is not yet installed. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 19 December 2023, 08:31:03 UTC
cab1568 bgpv1: Avoid creating resource.Store in Start() of BGP controller This changes BGP Controller's event handling from hive hooks and workerpool to hive jobs, mainly to avoid creating resource.Store in the Start() hive hook of the BGP controller. Since creating resource.Store() blocks until the store is initialized, we should avoid calling it in the Start() hive hook to not slow down the startup process and to allow progressing with the startup even when the CRD is not yet installed. Signed-off-by: Rastislav Szabo <rastislav.szabo@isovalent.com> 19 December 2023, 08:31:03 UTC
afccb83 ci: continue container scanning on error Stopping scanning if a single image shows CVEs stops us from seeing the entire results and being able to resolve multiple issues at the same time. Signed-off-by: Feroz Salam <feroz.salam@isovalent.com> 19 December 2023, 08:29:40 UTC
d4b81c0 bpf: host: skip from-proxy handling in from-netdev from-proxy traffic gets redirected to cilium_host. Skip the proxy paths when handle_ipv*_cont() is included by from-netdev. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 19 December 2023, 06:23:56 UTC
0d35af0 bpf: ipv4: always return drop reason from ipv4_handle_fragmentation() To make it easy for callers, ipv4_handle_fragmentation() should always return a DROP_* reason on error. But for errors from l4_load_ports() we're currently just propagating those raw errors back. Return a drop reason instead. This also makes us consistent with the non-fragment path in ipv4_load_l4_ports(). Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 19 December 2023, 06:23:36 UTC
2f1b375 chore(deps): update dependency cilium/cilium-cli to v0.15.19 Signed-off-by: renovate[bot] <bot@renovateapp.com> 18 December 2023, 21:48:38 UTC
263e689 nodediscovery: Fix bug where CiliumInternalIP was flapping This fixes a bug in `UpdateCiliumNodeResource` where the `CiliumInternalIP` (aka `cilium_host` IP, aka router IP) was flapping in the node manager during restoration (i.e. during cilium-agent restarts). In particular in `cluster-pool` mode, `UpdateCiliumNodeResource` is called before the `cilium_host` IP has been restored, as there are some circular dependencies: The restored IP can only be fully validated after the IPAM subsystem is ready, but that in turn can only happen if the `CiliumNode` object has been created. The `UpdateCiliumNodeResource` function however will only announce the `cilium_host` IP if it has been restored. This commit attempts to break that cycle by not overwriting any already existing `CiliumInternalIP` in the CiliumNode resource. Overall, this change is rather hacky, in particular it does not address the fact that other less crucial node information (like the health IP) also flaps. But since we want to backport this bugfix to older stable branches too, this change is intentionally kept as minimal as possible. Example node event (as observed by other nodes) before this change: ``` 2023-12-18T12:58:20.070330814Z level=debug msg="Received node update event from custom-resource" node="{\"Name\":\"kind-worker\",\"Cluster\":\"default\",\"IPAddresses\":[{\"Type\":\"InternalIP\",\"IP\":\"172.18.0.4\"},{\"Type\":\"InternalIP\",\"IP\":\"fc00:c111::4\"}],..." subsys=nodemanager 2023-12-18T12:58:20.208082226Z level=debug msg="Received node update event from custom-resource" node="{\"Name\":\"kind-worker\",\"Cluster\":\"default\",\"IPAddresses\":[{\"Type\":\"InternalIP\",\"IP\":\"172.18.0.4\"},{\"Type\":\"InternalIP\",\"IP\":\"fc00:c111::4\"},{\"Type\":\"CiliumInternalIP\",\"IP\":\"10.0.1.245\"}],..." 
subsys=nodemanager ``` After this change (note the `CiliumInternalIP` present in both events): ``` 2023-12-18T15:38:23.695653876Z level=debug msg="Received node update event from custom-resource" node="{\"Name\":\"kind-worker\",\"Cluster\":\"default\",\"IPAddresses\":[{\"Type\":\"CiliumInternalIP\",\"IP\":\"10.0.1.245\"},{\"Type\":\"InternalIP\",\"IP\":\"172.18.0.4\"},{\"Type\":\"InternalIP\",\"IP\":\"fc00:c111::4\"}],..." subsys=nodemanager 2023-12-18T15:38:23.838604573Z level=debug msg="Received node update event from custom-resource" node="{\"Name\":\"kind-worker\",\"Cluster\":\"default\",\"IPAddresses\":[{\"Type\":\"InternalIP\",\"IP\":\"172.18.0.4\"},{\"Type\":\"InternalIP\",\"IP\":\"fc00:c111::4\"},{\"Type\":\"CiliumInternalIP\",\"IP\":\"10.0.1.245\"}],...}" subsys=nodemanager ``` Reported-by: Paul Chaignon <paul.chaignon@gmail.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 18 December 2023, 21:03:23 UTC
7084c17 node/manager: Improve debug logging This commit improves the debug logging of node update events by using the JSON representation instead of the Go syntax representation of the node. This makes it easier to parse the log message, as IP addresses are now printed as strings instead of byte arrays. Before: ``` level=debug msg="Received node update event from custom-resource: types.Node{Name:\"kind-worker\", Cluster:\"default\", IPAddresses:[]types.Address{types.Address{Type:\"InternalIP\", IP:net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xac, 0x12, 0x0, 0x3}}, types.Address{Type:\"InternalIP\", IP:net.IP{0xfc, 0x0, 0xc1, 0x11, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x3}}, types.Address{Type:\"CiliumInternalIP\", IP:net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xa, 0x0, 0x0, 0xd2}}}, IPv4AllocCIDR:(*cidr.CIDR)(0xc000613180), IPv4SecondaryAllocCIDRs:[]*cidr.CIDR(nil), IPv6AllocCIDR:(*cidr.CIDR)(nil), IPv6SecondaryAllocCIDRs:[]*cidr.CIDR(nil), IPv4HealthIP:net.IP{0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0xff, 0xff, 0xa, 0x0, 0x0, 0x30}, IPv6HealthIP:net.IP(nil), IPv4IngressIP:net.IP(nil), IPv6IngressIP:net.IP(nil), ClusterID:0x0, Source:\"custom-resource\", EncryptionKey:0x0, Labels:map[string]string{\"beta.kubernetes.io/arch\":\"amd64\", \"beta.kubernetes.io/os\":\"linux\", \"kubernetes.io/arch\":\"amd64\", \"kubernetes.io/hostname\":\"kind-worker2\", \"kubernetes.io/os\":\"linux\"}, Annotations:map[string]string(nil), NodeIdentity:0x0, WireguardPubKey:\"\"}" subsys=nodemanager ``` After: ``` level=debug msg="Received node update event from custom-resource" 
node="{\"Name\":\"kind-worker\",\"Cluster\":\"default\",\"IPAddresses\":[{\"Type\":\"InternalIP\",\"IP\":\"172.18.0.3\"},{\"Type\":\"InternalIP\",\"IP\":\"fc00:c111::3\"},{\"Type\":\"CiliumInternalIP\",\"IP\":\"10.0.1.245\"}],\"IPv4AllocCIDR\":{\"IP\":\"10.0.1.0\",\"Mask\":\"////AA==\"},\"IPv4SecondaryAllocCIDRs\":null,\"IPv6AllocCIDR\":null,\"IPv6SecondaryAllocCIDRs\":null,\"IPv4HealthIP\":\"10.0.1.120\",\"IPv6HealthIP\":\"\",\"IPv4IngressIP\":\"\",\"IPv6IngressIP\":\"\",\"ClusterID\":0,\"Source\":\"custom-resource\",\"EncryptionKey\":0,\"Labels\":{\"beta.kubernetes.io/arch\":\"amd64\",\"beta.kubernetes.io/os\":\"linux\",\"kubernetes.io/arch\":\"amd64\",\"kubernetes.io/hostname\":\"kind-worker\",\"kubernetes.io/os\":\"linux\"},\"Annotations\":null,\"NodeIdentity\":0,\"WireguardPubKey\":\"\"}" subsys=nodemanager ``` Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 18 December 2023, 21:03:23 UTC
1409a37 loader: refactor/cleanup replaceNetworkDatapath replaceNetworkDatapath() is only called from one place and adds an additional loop over the encryption devices. This commit removes the function and calls replaceDatapath() from reinitializeIPSec() directly. There are no functional changes. Signed-off-by: Robin Gögge <r.goegge@isovalent.com> 18 December 2023, 18:54:32 UTC
4196935 hubble: Rate limit "stale identities observed" debug message This limits the amount of "stale identities observed" messages to one every 30 seconds. This is particularly important if monitor aggregation is disabled, as otherwise any unknown discrepancy fills the log buffers and thus makes it hard to e.g. debug CI flakes. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 18 December 2023, 18:53:45 UTC
caef2d3 hubble: Do not report stale identities for IPSec encap packets This was reported by Paul in [1]. After investigating the sysdump, it is clear that Hubble should not complain about such packets. [1] https://github.com/cilium/cilium/issues/15283#issuecomment-1858820397 Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 18 December 2023, 18:53:45 UTC
8c2dfda hubble: Improve "stale identity observed" log context This commit changes the log fields for the "stale identity observed" message to include the full context (i.e. trace observation point, source and destination addresses etc) for identifying which flow caused the message to be emitted. In addition, the "old" and "new" identity field names are replaced with the better fitting "userspace" and "datapath" terminology. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 18 December 2023, 18:53:45 UTC
7661353 ci: check kvstoremesh only in v1.14 Currently, vulnerability scanning fails because kvstoremesh no longer exists on v1.15. It has been removed with #28961 There are already kvstoremesh exclusions for v1.12 & v1.13. This commit inverts the exclusion to an inclusion for v1.14, as this is the only branch that contains the kvstoremesh. Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 18 December 2023, 18:51:06 UTC
758b3a4 contrib: Fix post-release.sh for branch candidates When we prepare release candidates on branches, we need to update the helm values and submit a PR to update the branch. Make sure that happens. Signed-off-by: Joe Stringer <joe@cilium.io> 18 December 2023, 18:30:07 UTC
1288473 policy: expand "world" entity selector to select all address families Previously, we had a single label, `reserved:world`, applied to all CIDR (non-cluster) identities. With #22625, this is no longer the case; CIDR identities get either `reserved:world-ipv4` or `reserved:world-ipv6` labels **in dual stack clusters**. This PR updates the `toEntities: world` selector to select *all* world entities, as it worked previously. It does so by expanding the set of selectors underlying the `world` entity. No other magic in the SelectorCache or policy engine is required. Fixes: #29666 Fixes: a94fa56f Signed-off-by: Casey Callendrello <cdc@isovalent.com> 18 December 2023, 14:29:33 UTC
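The idea of expanding the `world` entity into several underlying selectors can be illustrated as a lookup table. This is a simplified sketch, assuming a hypothetical `entitySelectors` map and `selects` helper; only the three label strings are taken from the commit message, and the real selector machinery is not shown.

```go
package main

import "fmt"

// entitySelectors is a hypothetical table mapping an entity name to the
// reserved labels it selects. The labels for "world" come from the commit
// message; the table itself is an illustration.
var entitySelectors = map[string][]string{
	"world": {"reserved:world", "reserved:world-ipv4", "reserved:world-ipv6"},
}

// selects reports whether the entity matches an identity carrying label.
func selects(entity, label string) bool {
	for _, l := range entitySelectors[entity] {
		if l == label {
			return true
		}
	}
	return false
}

func main() {
	for _, label := range []string{"reserved:world", "reserved:world-ipv4", "reserved:world-ipv6"} {
		fmt.Printf("world selects %s: %v\n", label, selects("world", label))
	}
}
```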
fa00376 Update AUTHORS Signed-off-by: Joe Stringer <joe@cilium.io> 15 December 2023, 18:12:26 UTC
446cf56 bpf: host: clean up duplicated send_trace_notify() code Use a single send_trace_notify() statement, with parameters that can be trivially optimized out in the from-netdev path. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 15 December 2023, 11:48:16 UTC
feaf6c7 bpf: host: simplify MARK_MAGIC_PROXY_EGRESS_EPID handling inherit_identity_from_host() already knows how to handle MARK_MAGIC_PROXY_EGRESS_EPID and extract the endpoint ID from the mark. Use it to condense the code path, similar to how the to-container path looks. Also fix up the drop notification, which currently uses the endpoint ID in place of the source security identity. Signed-off-by: Julian Wiedmann <jwi@isovalent.com> 15 December 2023, 11:48:16 UTC
e788a21 Bump readme, MLH for v1.15.0-rc.0 Signed-off-by: Joe Stringer <joe@cilium.io> 15 December 2023, 01:58:11 UTC
0363a2e bump release versions references by readme, stable.txt, and MLH v1.14.5 v1.13.10 v1.12.17 Signed-off-by: Andrew Sauber <2046750+asauber@users.noreply.github.com> 14 December 2023, 19:32:35 UTC
56b4329 ci: Fix Artifact Creation Failure Due to Invalid Character in Name Resolved an issue where the creation of tar archives was failing. The failure was caused by the presence of an invalid '|' character in the archive name. This update modifies the archive naming process to only use 'matrix.focus', ensuring the name is valid and the archive creation succeeds. Additionally, implemented a check for the existence of the test results directory. Signed-off-by: Birol Bilgin <birol@cilium.io> 14 December 2023, 13:15:50 UTC
244a5e9 iptables: filter table accepts from-proxy packets GKE has DROP policy for filter table, so we have to explicitly accept proxy traffic. Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> 14 December 2023, 12:47:52 UTC
9fbd5a8 proxy: opt-out from SNAT for L7 + Tunnel for some scenarios Currently the L7 proxy performs SNAT for traffic when tunnel routing is enabled, even for cluster-internal traffic. This prevents cilium_host from detecting pod-level traffic, and we thus can't apply features. Modify SupportsOriginalSourceAddr(), so that the proxy doesn't SNAT such traffic when some conditions are met. Signed-off-by: Zhichuan Liang <gray.liang@isovalent.com> 14 December 2023, 12:47:52 UTC
4838ca2 identity: deflake test TestGetIdentity Currently, the test `TestGetIdentity` fails with the following error. ``` --- FAIL: TestGetIdentity (0.85s) --- FAIL: TestGetIdentity/Multiple_identities (0.26s) identity_test.go:226: Identity not found in the store ``` The reason is that the watch of the backend (started in its own goroutine) might not be ready when the test starts to create test identities. Therefore, this commit makes the test wait until the watch is ready before it continues. Fixes: #23856 Signed-off-by: Marco Hofstetter <marco.hofstetter@isovalent.com> 14 December 2023, 12:47:25 UTC
f8e0342 helm: Support unsupported K8s versions for now Commit 9b016904f8aa7db90ab8fc3539933562d80c022f bumped the `kubeVersion` in the Helm chart to K8s 1.26, which is effectively the oldest Kubernetes version we officially support. However, this change now prevents the installation of Cilium against Kubernetes versions which we don't officially support, but are still used in CI (e.g. ci-eks, ci-aks, ci-gke, ext-workload). To unbreak CI, let's relax that restriction for now. Reported-by: Marco Hofstetter <marco.hofstetter@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 14 December 2023, 12:46:29 UTC
8297366 ci-e2e: Do not check for missed tail calls with host-fw It's a known flake / bug [1]. [1]: https://github.com/cilium/cilium/issues/28088 Signed-off-by: Martynas Pumputis <m@lambda.lt> 14 December 2023, 05:19:18 UTC
16fe166 gh/workflows: Bump CLI to v0.15.18 Signed-off-by: Martynas Pumputis <m@lambda.lt> 14 December 2023, 05:19:18 UTC
6b4a032 Makefile: Refactor hubble-relay target Instead of using a fixed directory, use a variable to define the hubble-relay directory, just like the other cilium binaries. Signed-off-by: Chance Zibolski <chance.zibolski@gmail.com> 14 December 2023, 02:27:41 UTC
9b01690 Prepare for v1.16 development cycle Signed-off-by: Joe Stringer <joe@cilium.io> 14 December 2023, 00:10:14 UTC
65286ce images: update cilium-{runtime,builder} Signed-off-by: Cilium Imagebot <noreply@cilium.io> 14 December 2023, 00:07:47 UTC
780f5b4 chore(deps): update docker.io/library/golang:1.21.5 docker digest to 2ff79bc Signed-off-by: renovate[bot] <bot@renovateapp.com> 14 December 2023, 00:07:47 UTC
23ef3c0 preflight: fix overriding node name env variable Cilium-agent is overriding K8S_NODE_NAME env variable based on node name. Cilium-preflight was not overriding it so in case that hostname did not match node name, preflight was stuck waiting for node information. Signed-off-by: Marcel Zieba <marcel.zieba@isovalent.com> 13 December 2023, 23:09:54 UTC
0d1ee39 Revert "cilium: Ensure xfrm state is initialized for route IP before publish" This reverts commit c9ea7a52bd59c167c6e7611d4976e3c041f4e7f0. The reverted commit works around a condition where restarting the agent uses a new IP for the Cilium Internal IP. But it turns out this is because of an incorrectly set Helm chart option in our reproducer. When configured correctly, we require CiliumInternalIP to be reused, so this patch is not necessary. In fact, it complicates the code, so let's drop it. The Helm option is cleanState. It must be set to false (cleanState=false). Note that cleanState="false" is a string type and will default to true because of bool typing, creating a subtle and broken config. Signed-off-by: John Fastabend <john.fastabend@gmail.com> 13 December 2023, 19:16:53 UTC
5e5d4d8 gha: sig-servicemesh owns Ingress or Gateway API related workflows The servicemesh team may be interested in reviewing changes of these workflows. Signed-off-by: Tam Mach <tam.mach@cilium.io> 13 December 2023, 17:05:41 UTC
16ec0b2 .github: Only create Go cache for main branch push This commit extends the conditional logic for running this logic against the main branch, so that we can keep the steps in place on older branches without causing the cache to get thrashed by stable branch versions of the cache. This allows us to have fewer lines of diff between the main and older (eg v1.14) branches. Co-authored-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 13 December 2023, 17:12:33 UTC
723ea5d .github: Remove branch names from step ids This simplifies the branching steps for stable branches when we create them. Co-authored-by: Michi Mutsuzaki <michi@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 13 December 2023, 17:12:33 UTC
5d7d63f fix: Send node add when using local-router-ipv4 When local-router-ipv4 is set in cilium-config, all nodes have NodeCiliumInternalIP set to the same value. When upserting such a node, the node manager detects a collision and expects to send a node_update message, but because the node is new it ends up skipping the node_add message. Bug: b/284134355 Signed-off-by: Aleksander Mistewicz <amistewicz@google.com> 13 December 2023, 16:51:16 UTC
94e6596 k8s: Rename CiliumCIDRGroups resource To be consistent with other Cilium resources, rename CIDRGroups to CiliumCIDRGroups. Signed-off-by: Fabio Falzoi <fabio.falzoi@isovalent.com> 13 December 2023, 16:20:30 UTC