https://github.com/cilium/cilium

5b1ecad Exclude local interface Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 18 March 2022, 21:16:19 UTC
51dee98 More logs Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 18 March 2022, 17:00:01 UTC
2474de6 Add more logs to daemon initialization logic Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 17 March 2022, 23:32:10 UTC
3e77756 Prepare for release v1.10.7 Signed-off-by: Joe Stringer <joe@cilium.io> 19 January 2022, 00:48:33 UTC
58f5aee build(deps): bump docker/build-push-action from 2.7.0 to 2.8.0 Bumps [docker/build-push-action](https://github.com/docker/build-push-action) from 2.7.0 to 2.8.0. - [Release notes](https://github.com/docker/build-push-action/releases) - [Commits](https://github.com/docker/build-push-action/compare/a66e35b9cbcf4ad0ea91ffcaf7bbad63ad9e0229...1814d3dfb36d6f84174e61f4a4b05bd84089a4b9) --- updated-dependencies: - dependency-name: docker/build-push-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 18 January 2022, 20:32:18 UTC
42df137 bpf: Reset Pod's queue mapping in host veth to fix phys dev mq selection [ upstream commit ecdff123780dcc50599e424cbbc77edf2c70e396 ] Fix TX queue selection problem on the phys device as reported by Laurent. At high throughput, they noticed a significant amount of TCP retransmissions that they tracked back to qdisc drops (fq_codel was used). Suspicion is that kernel commit edbea9220251 ("veth: Store queue_mapping independently of XDP prog presence") caused this due to its unconditional skb_record_rx_queue() which sets queue mapping to 1, and thus this gets propagated all the way to the physical device hitting only a single queue in a mq device. Let's have bpf_lxc reset it as a workaround until we have a kernel fix. Doing this unconditionally is good anyway in order to avoid Pods messing with TX queue selection. Kernel will catch up with fix in 710ad98c363a ("veth: Do not record rx queue hint in veth_xmit"). Fixes: #18311 Reported-by: Laurent Bernaille <laurent.bernaille@datadoghq.com> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Tested-by: Laurent Bernaille <laurent.bernaille@datadoghq.com> Link (Bug): https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=edbea922025169c0e5cdca5ebf7bf5374cc5566c Link (Fix): https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=710ad98c363a66a0cd8526465426c5c5f8377ee0 Signed-off-by: Aditi Ghag <aditi@cilium.io> 18 January 2022, 16:02:50 UTC
e131335 test: bump l4lb kind in Vagrantfile to 0.11.1 [ upstream commit 018c94536f27c868a7795c6ba66c50559692ce28 ] The 0.11.1 release bumps the base ubuntu image to 21.04 [1], which should fix the issue we are seeing with the current test: ++ docker exec -i kind-control-plane /bin/sh -c 'echo $(( $(ip -o l show eth0 | awk "{print $1}" | cut -d: -f1) ))' [..] Reading package lists... E: The repository 'http://security.ubuntu.com/ubuntu groovy-security Release' does not have a Release file. E: The repository 'http://archive.ubuntu.com/ubuntu groovy Release' does not have a Release file. E: The repository 'http://archive.ubuntu.com/ubuntu groovy-updates Release' does not have a Release file. E: The repository 'http://archive.ubuntu.com/ubuntu groovy-backports Release' does not have a Release file. Error: Process completed with exit code 100. [1] https://github.com/kubernetes-sigs/kind/releases/tag/v0.11.1 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> Signed-off-by: Aditi Ghag <aditi@cilium.io> 18 January 2022, 16:02:50 UTC
d1c416a Fix possible IP leak in case ENI's are not present in the CN yet [ upstream commit aea1b9f24ade9068711a1a555c7345188b9f736b ] buildAllocationResult may return an error in case of inconsistencies found in the local CN's status. For example, there are situations where an IP is already part of spec.ipam.pool (including the resource/ENI where the IP comes from), while the corresponding ENI is not part of status.eni.enis yet. If that is the case, the IP would be allocated (e.g. by allocateNext) and then marked as allocated (via a.markAllocated). Shortly after that, a.buildAllocationResult() would fail and then NOT undo the changes done by a.markAllocated(). This will then result in the IP never being freed up again. At the same time, kubelet will keep scheduling PODs onto the same node without knowing that IPs run out and thus causing new PODs to never get an IP. Why exactly this inconsistency between the spec and the status arise is a different topic and should maybe be investigated further. This commit/PR fixes this issue by simply moving a.markAllocated() after the a.buildAllocationResult() result, so that the function is bailed out early enough. Some additional info on how I encountered this issue and maybe how to reproduce it. We have a cluster running that does automatic downscaling of all deployments at night and then relies on cluster-autoscaler to also shut down nodes. Next morning, all deployments are upscaled again, causing cluster-autoscaler to also start many nodes at once. This causes many nodes to appear in k8s at the same time, all being `NotReady` at the beginning. Cilium agents are then started on each node. When cilium agents start to get ready, the node are also marked `Ready`, causing the k8s scheduler to immediately schedule dozens of PODs onto the `Ready` nodes, long before cilium-operator had a chance to attach new ENIs and IPs to the fresh nodes. This means that all PODs scheduled to the fresh nodes run into a temporary state where the CNI plugin reports that there are no more IPs available. All this is expected and normal until this point. After a few seconds, cilium-operator finishes attaching new ENIs to the fresh nodes and then tries to update the CN. The update to the spec.pool seems to be successful then, causing the agent to allocate the IP. But as the update to the status seems to fail, the agent then bails out with the IP being marked as used and thus causing the leak. This is only happening with very high load on the apiserver. At the same time, I can observe errors like these happening in cilium-operator: ``` level=warning msg="Failed to update CiliumNode" attempt=1 error="Operation cannot be fulfilled on ciliumnodes.cilium.io \"ip-100-66-62-168.eu-central-1.compute.internal\": the object has been modified; please apply your changes to the latest version and try again" instanceID=i-009466ca3d82a1ec0 name=ip-100-66-62-168.eu-central-1.compute.internal subsys=ipam updateStatus=true ``` Please note the `attempt=1` in the log line, it indicates that the first attempt also failed and that no further attempt is done (looking at the many `for retry := 0; retry < 2; retry++` loops found in the code). I assume (without 100% knowing) that this is the reason for the inconsistency in spec vs status. Signed-off-by: Alexander Block <ablock84@gmail.com> Signed-off-by: Aditi Ghag <aditi@cilium.io> 18 January 2022, 16:02:50 UTC
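A minimal Go sketch of the ordering change described in this commit, with purely illustrative names rather than the actual pkg/ipam code: the allocation result is validated first, and the IP is only marked as allocated once that step succeeds, so the error path can no longer leak it.

```go
package main

import (
	"errors"
	"fmt"
)

// allocator is a hypothetical stand-in for the IPAM allocator described in
// the commit message; field and method names are illustrative only.
type allocator struct {
	allocated map[string]bool
}

var errENINotReady = errors.New("ENI for IP not present in CiliumNode status yet")

// buildAllocationResult fails when the CiliumNode spec and status are
// inconsistent (IP already in spec.ipam.pool, matching ENI missing from status).
func (a *allocator) buildAllocationResult(ip string, eniReady bool) error {
	if !eniReady {
		return errENINotReady
	}
	return nil
}

func (a *allocator) markAllocated(ip string) { a.allocated[ip] = true }

// allocateNext reflects the fixed ordering: validate first, mark second.
func (a *allocator) allocateNext(ip string, eniReady bool) error {
	// Bail out before touching any allocation state, so a failure here
	// cannot leak the IP.
	if err := a.buildAllocationResult(ip, eniReady); err != nil {
		return err
	}
	a.markAllocated(ip)
	return nil
}

func main() {
	a := &allocator{allocated: map[string]bool{}}
	fmt.Println(a.allocateNext("100.66.62.10", false), a.allocated) // error, nothing marked
	fmt.Println(a.allocateNext("100.66.62.10", true), a.allocated)  // nil, IP marked
}
```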
67545c3 images: update cilium-{runtime,builder} for Go 1.16.13 While at it also bump the ubuntu:20.04 image to latest version to pick up updated systemd packages. Signed-off-by: Tobias Klauser <tobias@cilium.io> 17 January 2022, 16:40:26 UTC
b59c73f Update Go to 1.16.13 Signed-off-by: Tobias Klauser <tobias@cilium.io> 17 January 2022, 16:40:26 UTC
6d1bc27 egressgateway: fix initial reconciliation [ upstream commit ab9bfd71c9cb552445555167375b27c610ee19c6 ] When a new egress gateway manager is created, it will wait for the k8s cache to be fully synced before running the first reconciliation. Currently the logic is based on the WaitUntilK8sCacheIsSynced method of the Daemon object, which waits on the k8sCachesSynced channel to be closed (which indicates that the cache has been indeed synced). The issue with this approach is that Daemon object is passed to the NewEgressGatewayManager method _before_ its k8sCachesSynced channel is properly initialized. This in turn causes the WaitUntilK8sCacheIsSynced method to never return. Since NewEgressGatewayManager must be called before that channel is initialized, we need to switch to a polling approach, where the k8sCachesSynced is checked periodically. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 17 January 2022, 16:11:56 UTC
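A hedged Go sketch of the polling approach this commit switches to (names are illustrative, not the real NewEgressGatewayManager code): rather than blocking on a channel that may not have been initialized yet, the manager periodically checks a "cache synced" predicate.

```go
package main

import (
	"fmt"
	"time"
)

// waitForCacheSync polls the supplied predicate until it reports that the
// k8s cache is synced, then returns. This avoids depending on a channel
// that might not have been initialized when the caller was constructed.
func waitForCacheSync(synced func() bool, interval time.Duration) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for !synced() {
		<-ticker.C
	}
}

func main() {
	start := time.Now()
	// Hypothetical predicate: pretend the cache becomes synced after 300ms.
	synced := func() bool { return time.Since(start) > 300*time.Millisecond }

	waitForCacheSync(synced, 50*time.Millisecond)
	fmt.Println("cache synced, running first reconciliation")
}
```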
d75468f Revert "test: Add Error Log Exceptions" This reverts commit b73c5b19cdf8583dfded17db75cad79c0e62c972. Rationale: - The reverted commit on v1.10[1] was a backport of a master commit[2], itself building upon another master commit[3], both adding `level=error` checking to the CI. - [3] was not backported to v1.10, hence conflicts from backporting [2] to v1.10 were manually fixed, resulting in [1]'s description not matching its contents and also losing proper history tracing due to [3] not being present in the tree. - [1] introduced issue #18285 in v1.10 CI tests. - It also appears [2] was not intended for backport in the first place. - Due to all of these, we revert [1] in v1.10. [1] b73c5b19cdf8583dfded17db75cad79c0e62c972 [2] 82d44229e61d066b37d13ca34d0e30e5e2b0e9b0 [3] 11fb42adc2c53f38b66bc5f1271d370b6c1e8f9a Note: a more thorough discussion can be found at https://github.com/cilium/cilium/issues/18285#issuecomment-1009970348 Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 13 January 2022, 21:19:10 UTC
52a0400 install: add mountPropagation directive to bpf-maps volume in cilium DS The original backport was missing the "mountPropagation: Bidirectional" directive for the bpf-maps volume, causing the bpffs to not get mounted in the host. Fixes: d2217045cb ("install/kubernetes: use bidirectional mounts to mount bpf fs") Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 12 January 2022, 19:36:23 UTC
82fdfc6 ci: use python3 instead of python [ upstream commit dddbbe709e2827873420fef9b635152340f37f91 ] Our CI nodes no longer have `python` binary, python3 is available instead. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> Signed-off-by: Paul Chaignon <paul@cilium.io> 11 January 2022, 18:36:32 UTC
4d1fa4d docs: Replace janitors team with tophat team [ upstream commit 05815f79561da1f5029633c08c9b8373cf79a32e ] This commit replaces a few references to the janitors team, which was recently renamed to tophat. Signed-off-by: Paul Chaignon <paul@cilium.io> 11 January 2022, 18:36:32 UTC
4f8061a docs: Fix incorrect mention of bpf.masquerade's default value [ upstream commit 82fdb528bd64f8904b5379e44114deead97ae683 ] bpf.masquerade is not enabled by default since commit dc40b2cb ("helm: Disable BPF masquerading in v1.10+"). Fixes: dc40b2cb ("helm: Disable BPF masquerading in v1.10+") Reported-by: Stevo Slavic via GitHub Signed-off-by: Paul Chaignon <paul@cilium.io> 11 January 2022, 18:36:32 UTC
7e4ac38 docs: improve Kubespray installation guide [ upstream commit d8577ff9a9fe2dee7a684130346d695e092025a1 ] Previously, the Kubespray documentation recommended changing the role variables. However, changing the role files in an Ansible playbook could lead to problems. So, with this commit, the documentation recommends using the extra variables or editing the group_vars files. Co-authored-by: Yasin Taha Erol <yasintahaerol@gmail.com> Signed-off-by: necatican <necaticanyildirim@gmail.com> Signed-off-by: Paul Chaignon <paul@cilium.io> 11 January 2022, 18:36:32 UTC
dffd23c docs: Fix `first-interface-index` documentation [ upstream commit 5e3a7b3c9f01f0c4014675269fabc9d18380757d ] This fixes the `first-interface-index` section in the ENI docs where we introduced a new default value and wanted to document that new default, but by doing that accidentally changed a value in the examples. This commit actually fixes the default value and reverts the example to its proper meaning. Ref: https://github.com/cilium/cilium/pull/14801 Fixes: 231a217ea99d ("docs: first-interface-index new ENI default") Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Paul Chaignon <paul@cilium.io> 11 January 2022, 18:36:32 UTC
b472c0e docs: Replace 'micro version' with 'patch version' [ upstream commit 5a5755abb8a230bf54e50b2b883c31b2ddac216e ] In a version vX.Y.Z, X is called the major version, Y the minor version, and Z the patch version with this commit. Before this commit, Z was the micro version, but that's a bit confusing given we already have 'minor'. The patch terminology also matches semantics documented at semver.org. Signed-off-by: Paul Chaignon <paul@cilium.io> 11 January 2022, 18:36:32 UTC
bf2f83d bpf: remove local EP check on egress gw SNAT logic [ upstream commit a38aababe1a01fc187c4e550837fd0afd488f55d ] In snat_v4_needed(), one of the conditions to determine if a packet should be SNATed with an egress IP is: !local_ep || is_cluster_destination() The intent of the first check (!local_ep) was to tell us that traffic was redirected by an egress gateway policy to a different node to be masqueraded, but in practice it's not needed: as long as the packet is destined to outside the cluster, is not reply traffic and it's matched by an egress NAT policy, it should be SNATed with the egress gw IP (moreover we should not assume that there's no local EP since it's possible that the node where the client pod is running is the same node that will act as egress gateway). Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 11 January 2022, 13:27:50 UTC
3a834f1 bpf: egressgw: sync logic to determine if destination is outside cluster [ upstream commit cdfc30d530fb87baf4b11f092b3aa41356b9147f ] In the context of egress gateway, when traffic is leaving the cluster we need to check twice if it is a match for an egress NAT policy: * first time in handle_ipv4_from_lxc(), on the node where the client pod is running (to determine if it should be forwarded to a gateway node) * second time in snat_v4_needed(), on the actual gateway node (to determine if it should be SNATed) Currently the 2 checks are slightly diverging wrt how traffic destined to outside the cluster is identified: * in the first case we use is_cluster_destination(), which uses the information stored on the ipcache and EP maps * in the second case we just rely on the IPV4_SNAT_EXCLUSION_DST_CIDR The issue with the IPV4_SNAT_EXCLUSION_DST_CIDR logic is that we may incorrectly exclude from egress gw SNAT traffic that is supposed to be SNATed: case in point an EKS environment where the primary VPC is shared between the cluster and some other EC2 nodes that don't belong to the cluster. To fix this, this commit changes the snat_v4_needed() logic to match the one we use in handle_ipv4_from_lxc() and executes it before the IPV4_SNAT_EXCLUSION_DST_CIDR check. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 11 January 2022, 13:27:50 UTC
f711d16 bpf: rename ep and info to local_ep and remote_ep in snat_v4_needed [ upstream commit e3dca631d51bf6adcbf7b0225b52be16b85accb2 ] to make their purpose more explicit Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 11 January 2022, 13:27:50 UTC
282030c build(deps): bump 8398a7/action-slack from 3.12.0 to 3.13.0 Bumps [8398a7/action-slack](https://github.com/8398a7/action-slack) from 3.12.0 to 3.13.0. - [Release notes](https://github.com/8398a7/action-slack/releases) - [Commits](https://github.com/8398a7/action-slack/compare/c9ff874f8549f97317ec9f6162d5449ee77bc984...a74b761b4089b5d730d813fbedcd2ec5d394f3af) --- updated-dependencies: - dependency-name: 8398a7/action-slack dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 11 January 2022, 11:19:21 UTC
5d97e28 daemon: Fix multi-dev XDP check [ upstream commit 227424a49838ca047a6fdda98be7cb32bec9f078 ] The 0ffa7c60e1 commit didn't remove all guards which previously were used to check that XDP is enabled only with |--devices| == 1. The last guard was not visible on v1.11 due to https://github.com/cilium/cilium/pull/18304, while on v1.10 it failed on the multi-dev XDP setups. Fixes: 0ffa7c60e1 ("datapath,daemon: Enable multi-dev XDP") Reported-by: Tobias Klauser <tobias@cilium.io> Reported-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Chris Tarazi <chris@isovalent.com> 10 January 2022, 09:57:31 UTC
6584f79 ci: set PR base and ref for CodeQL workflow This fixes the following error on scheduled runs: "This action requires 'base' input to be configured or 'repository.default_branch' to be set in the event payload" This also makes sure only code changed by the CL is flagged by CodeQL, but not existing issues in the repo. Signed-off-by: Tobias Klauser <tobias@cilium.io> 07 January 2022, 09:44:37 UTC
2e8e161 ci: run CodeQL action only on PRs against v1.10 branch This fixes the following reported issue [1]: 1 issue was detected with this workflow: Please make sure that every branch in on.pull_request is also in on.push so that Code Scanning can compare pull requests against the state of the base branch. [1] https://github.com/cilium/cilium/actions/runs/1591471691 Note that each release branch and master have their own version of this workflow and on.{pull_request,push}.branches needs to be changed accordingly. Signed-off-by: Tobias Klauser <tobias@cilium.io> 07 January 2022, 09:44:37 UTC
fd0d371 CODEOWNERS: janitors renamed to tophat The janitors team was renamed to tophat so we need to update the code owners accordingly. Signed-off-by: Paul Chaignon <paul@cilium.io> 03 January 2022, 23:02:11 UTC
c910cea build(deps): bump actions/setup-go from 2.1.4 to 2.1.5 Bumps [actions/setup-go](https://github.com/actions/setup-go) from 2.1.4 to 2.1.5. - [Release notes](https://github.com/actions/setup-go/releases) - [Commits](https://github.com/actions/setup-go/compare/331ce1d993939866bb63c32c6cbbfd48fa76fc57...424fc82d43fa5a37540bae62709ddcc23d9520d4) --- updated-dependencies: - dependency-name: actions/setup-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 29 December 2021, 17:44:11 UTC
5cd7077 datapath: don't attempt deleting old tunnel map entries on node add [ upstream commit 386029bb42155b37b425edd20f2e02f5ba0be749 ] The datapath.LocalNodeConfig.EnableEncapsulation field is immutable at runtime [1], i.e. it is defined once at agent startup. The tunnel map is created as non-persistent, meaning that any potentially pinned map would be deleted on startup [2], [3]. In combination this means that there cannot possibly be a case that there are left over old tunnel map entries in the tunnel map if encapsulation is disabled. [1] https://github.com/cilium/cilium/blob/6c169f63ec254de7777483b6f01c261215f9ec9c/pkg/datapath/node.go#L59-L64 [2] https://github.com/cilium/cilium/blob/6c169f63ec254de7777483b6f01c261215f9ec9c/pkg/maps/tunnel/tunnel.go#L48 [3] https://github.com/cilium/cilium/blob/6c169f63ec254de7777483b6f01c261215f9ec9c/pkg/bpf/map_linux.go#L104-L106 Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
a0b9acf daemon: open or create tunnel map cache only if needed [ upstream commit 8b9e890e6a989f66c62d975341ffc3c1335765b7 ] The tunnel map and thus its user space cache are only needed if either tunneling or the IPv4 egress gateway is enabled. Currently, the user space cache of the map is created regardless of whether it is actually used, leading to e.g. the bpf-map-sync-cilium_tunnel_map controller being spawned unnecessarily. This controller will e.g. show up in `cilium status --all-controllers` and might lead to confusion in setups where tunneling and the egress gateway feature are disabled. In some reported cases (see #16488) with tunneling disabled that controller also ended up in an error state, not being able to recover. Thus, only create the user space cache map and thus the sync controller in case it is actually needed to avoid this class of errors. Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
69665c0 datapath, tunnel: correctly ignore ENOENT on tunnel mapping delete [ upstream commit c0839256feec7841724970d6b92ae919038ad5d7 ] Currently, when deleteTunnelMapping(oldCIDR *cidr.CIDR, quietMode bool) is called with quietMode = true for an inexistent entry - as could be the case on initial node addition - we would get ENOENT on the underlying map operation in pkg/bpf/(*map).deleteMapEntry. This would lead to the bpf-map-sync-cilium_tunnel_map controller being started to reconcile the error. However, in some reported cases (see #16488), the controller seems to stay in an error state failing to delete the non-existent entry. Avoid this situation entirely by ignoring any map delete error in case deleteTunnelMapping is called with quietMode = true. Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
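A small Go sketch of the quiet-mode behaviour described in this commit, using hypothetical helpers rather than the actual pkg/datapath code: in quiet mode any delete error, including ENOENT for a missing entry, is swallowed instead of spawning a reconciliation controller.

```go
package main

import (
	"fmt"
	"os"
)

// deleteMapEntry is a hypothetical stand-in for the underlying BPF map
// delete; it returns os.ErrNotExist to play the role of ENOENT.
func deleteMapEntry(key string, entries map[string]struct{}) error {
	if _, ok := entries[key]; !ok {
		return os.ErrNotExist
	}
	delete(entries, key)
	return nil
}

// deleteTunnelMapping mirrors the behaviour described in the commit: when
// called in quiet mode (e.g. a speculative delete on initial node add), any
// delete error, including ENOENT for a non-existent entry, is ignored so
// that no error-reconciliation controller gets started for it.
func deleteTunnelMapping(key string, entries map[string]struct{}, quietMode bool) error {
	if err := deleteMapEntry(key, entries); err != nil {
		if quietMode {
			return nil
		}
		return err
	}
	return nil
}

func main() {
	entries := map[string]struct{}{}
	fmt.Println(deleteTunnelMapping("10.0.0.0/24", entries, true))  // <nil>: ignored
	fmt.Println(deleteTunnelMapping("10.0.0.0/24", entries, false)) // error surfaced
}
```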
57a7ab9 maps/tunnel: mark TunnelMap as NonPersistent on creation [ upstream commit 2ab15e9bb3f16ff6f7137b0216f5f2fb4a058098 ] Create the tunnel map as non-persistent instead of marking it as such in the package level init func. Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
6afa135 option: correct godoc for (*DaemonConfig).TunnelingEnabled [ upstream commit 979cae5488ae3c59083955339a0c71909f3288d8 ] Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
515cc97 docs: Warn against Helm's `--reuse-values` in Cilium upgrades [ upstream commit 9f50a91c8dcbbc6f9642673bb069ccb884f587d1 ] Using it to upgrade to a new minor Cilium version has never been supported and may break the Helm template in subtle ways due to the lack of default values. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
21c9134 Fix proxy nil check [ upstream commit 13a7125fda14b19898fb3b09706bfecde30f4ad6 ] Signed-off-by: chaosbox <ram29@bskyb.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
e177091 bpf: update DSR flag if CT entry is found [ upstream commit f6c78254c7c4c28b24e01b1c5c38975bb1ea1268 ] Currently, we check for the DSR IP option and then create the CT entry with DSR flag only for new connections (CT_NEW). If a stale CT entry exists (without DSR flag), then the DSR is not handled for the new flow, which leads to the rev-DNAT xlation not being applied. This commit fixes this problem and updates the DSR flag for connections in the CT_REOPENED state. Signed-off-by: ivan <i.makarychev@tinkoff.ru> Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
a7d8b55 monitor: TraceToNetwork does not populate ct state [ upstream commit 5c0eb01f80bf39591be441231ee200ca56c317bf ] This fixes a bug where Hubble wrongly populated the `is_reply` field for `to-network` trace events, as it assumed these events populated the `TraceNotify.Reason` field with connection tracking state. This however turned out to be an error, and thus TO_NETWORK needs to be removed from the list of observation points with CT state. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
a14d1be bpf: Use enum for send_trace_notify reason [ upstream commit 3d788376f4cd2747366780984fbe916ce2d0284e ] This commit introduces an enum for the `TRACE_REASON_*` values, to ensure callers of `send_trace_notify` do not accidentally pass in wrong values. Suggested-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
cd80b22 bpf: Fix invalid trace reason in bpf_host [ upstream commit 1cf3ef3fa99f1ae8426882419a84e9a67bde36de ] The `reason` argument of `send_trace_notify` is intended to contain connection tracking state (see TRACE_REASON_*). It is used by user-space to filter out reply packets where possible and thus should be zero if no ct state is available, to avoid misclassification. The code in bpf_host erroneously populated this value with the verdict instead. This commit removes those values and adds documentation of what value may be passed in `send_trace_notify`. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 December 2021, 18:16:33 UTC
257c046 build(deps): bump docker/login-action from 1.10.0 to 1.12.0 Bumps [docker/login-action](https://github.com/docker/login-action) from 1.10.0 to 1.12.0. - [Release notes](https://github.com/docker/login-action/releases) - [Commits](https://github.com/docker/login-action/compare/f054a8b539a109f9f41c372932f1ae047eff08c9...42d299face0c5c43a0487c477f595ac9cf22f1a7) --- updated-dependencies: - dependency-name: docker/login-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 20 December 2021, 19:04:59 UTC
9b94d7c .github: stop pushing last stable image from v1.10 branches Signed-off-by: Joe Stringer <joe@cilium.io> 16 December 2021, 10:51:41 UTC
6354c69 build(deps): bump actions/upload-artifact from 2.3.0 to 2.3.1 Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2.3.0 to 2.3.1. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/da838ae9595ac94171fa2d4de5a2f117b3e7ac32...82c141cc518b40d92cc801eee768e7aafc9c2fa2) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 15 December 2021, 15:15:28 UTC
b73c5b1 test: Add Error Log Exceptions [ upstream commit 82d44229e61d066b37d13ca34d0e30e5e2b0e9b0 ] [ Backport notes: had to resolve conflicts manually due to #16395 previously introducing exceptions not having been backported to v1.10. The changes in this PR completely supersede #16395 so there should be no need to backport it first. ] Occasionally the cilium-operator will run into a transient issue where it cannot get/update/release the leaselock with K8s that it uses to adjudicate its leader election. This error message is part and parcel of this failure and can be ignored. cf. https://github.com/cilium/cilium/issues/16402 Signed-off-by: Nate Sweet <nathanjsweet@pm.me> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 14 December 2021, 16:31:05 UTC
7f3c916 github: Collect cilium-sysdump in L4LB upon failure [ upstream commit 95cba0aa05c5b8ba77ef4e4e9dd84bed78832672 ] [ Backport notes: dropped conflicts in .github/workflows/ files since they do not exist in stable branches ] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 14 December 2021, 16:31:05 UTC
287bafc images: update cilium-{runtime,builder} To pull in Go 1.16.12. Signed-off-by: Tobias Klauser <tobias@cilium.io> 13 December 2021, 11:26:48 UTC
400959a Update Go to 1.16.12 Signed-off-by: Tobias Klauser <tobias@cilium.io> 13 December 2021, 11:26:48 UTC
fba16c3 install: Update image digests for v1.10.6 Generated from https://github.com/cilium/cilium/actions/runs/1561734971. `docker.io/cilium/cilium:v1.10.6@sha256:cf52b14bf9bc62e4eb1967661a51e5f5482cbb05b784c0a0e38ee16d66f85773` `quay.io/cilium/cilium:v1.10.6@sha256:cf52b14bf9bc62e4eb1967661a51e5f5482cbb05b784c0a0e38ee16d66f85773` `docker.io/cilium/cilium:stable@sha256:cf52b14bf9bc62e4eb1967661a51e5f5482cbb05b784c0a0e38ee16d66f85773` `quay.io/cilium/cilium:stable@sha256:cf52b14bf9bc62e4eb1967661a51e5f5482cbb05b784c0a0e38ee16d66f85773` `docker.io/cilium/clustermesh-apiserver:v1.10.6@sha256:07e0ba11f74b8ea00303a3705457994f99e64e423b0ebe7f0e1bfda7a3493dec` `quay.io/cilium/clustermesh-apiserver:v1.10.6@sha256:07e0ba11f74b8ea00303a3705457994f99e64e423b0ebe7f0e1bfda7a3493dec` `docker.io/cilium/clustermesh-apiserver:stable@sha256:07e0ba11f74b8ea00303a3705457994f99e64e423b0ebe7f0e1bfda7a3493dec` `quay.io/cilium/clustermesh-apiserver:stable@sha256:07e0ba11f74b8ea00303a3705457994f99e64e423b0ebe7f0e1bfda7a3493dec` `docker.io/cilium/docker-plugin:v1.10.6@sha256:c48995fe2666cb73f12dc51200d6d05fa11ecb566d9cf978db4cac47ec77746b` `quay.io/cilium/docker-plugin:v1.10.6@sha256:c48995fe2666cb73f12dc51200d6d05fa11ecb566d9cf978db4cac47ec77746b` `docker.io/cilium/docker-plugin:stable@sha256:c48995fe2666cb73f12dc51200d6d05fa11ecb566d9cf978db4cac47ec77746b` `quay.io/cilium/docker-plugin:stable@sha256:c48995fe2666cb73f12dc51200d6d05fa11ecb566d9cf978db4cac47ec77746b` `docker.io/cilium/hubble-relay:v1.10.6@sha256:4d8de723d64e5aecb9de2e12b624e50c0a4388d3e43f697f8e5781be33f7e888` `quay.io/cilium/hubble-relay:v1.10.6@sha256:4d8de723d64e5aecb9de2e12b624e50c0a4388d3e43f697f8e5781be33f7e888` `docker.io/cilium/hubble-relay:stable@sha256:4d8de723d64e5aecb9de2e12b624e50c0a4388d3e43f697f8e5781be33f7e888` `quay.io/cilium/hubble-relay:stable@sha256:4d8de723d64e5aecb9de2e12b624e50c0a4388d3e43f697f8e5781be33f7e888` `docker.io/cilium/operator-alibabacloud:v1.10.6@sha256:16ba99f0ac71562883d45760cb85957249a4f7f1238841ad3cee40a9b5f3a03c` `quay.io/cilium/operator-alibabacloud:v1.10.6@sha256:16ba99f0ac71562883d45760cb85957249a4f7f1238841ad3cee40a9b5f3a03c` `docker.io/cilium/operator-alibabacloud:stable@sha256:16ba99f0ac71562883d45760cb85957249a4f7f1238841ad3cee40a9b5f3a03c` `quay.io/cilium/operator-alibabacloud:stable@sha256:16ba99f0ac71562883d45760cb85957249a4f7f1238841ad3cee40a9b5f3a03c` `docker.io/cilium/operator-aws:v1.10.6@sha256:e78b6e2904b694ca08635d2485d5dcd342d06ee3d6a7ef6c5f31cd2901a8fd67` `quay.io/cilium/operator-aws:v1.10.6@sha256:e78b6e2904b694ca08635d2485d5dcd342d06ee3d6a7ef6c5f31cd2901a8fd67` `docker.io/cilium/operator-aws:stable@sha256:e78b6e2904b694ca08635d2485d5dcd342d06ee3d6a7ef6c5f31cd2901a8fd67` `quay.io/cilium/operator-aws:stable@sha256:e78b6e2904b694ca08635d2485d5dcd342d06ee3d6a7ef6c5f31cd2901a8fd67` `docker.io/cilium/operator-azure:v1.10.6@sha256:3c7e7a9e23d721e4845793ece54bcd1393ebcb9b3fdf3581a90796c95f356cc0` `quay.io/cilium/operator-azure:v1.10.6@sha256:3c7e7a9e23d721e4845793ece54bcd1393ebcb9b3fdf3581a90796c95f356cc0` `docker.io/cilium/operator-azure:stable@sha256:3c7e7a9e23d721e4845793ece54bcd1393ebcb9b3fdf3581a90796c95f356cc0` `quay.io/cilium/operator-azure:stable@sha256:3c7e7a9e23d721e4845793ece54bcd1393ebcb9b3fdf3581a90796c95f356cc0` `docker.io/cilium/operator-generic:v1.10.6@sha256:6bd47edc4d8f8b5b984509c68f5625a4141c0f5a4c8931f012b0453d9b62bd92` `quay.io/cilium/operator-generic:v1.10.6@sha256:6bd47edc4d8f8b5b984509c68f5625a4141c0f5a4c8931f012b0453d9b62bd92` 
`docker.io/cilium/operator-generic:stable@sha256:6bd47edc4d8f8b5b984509c68f5625a4141c0f5a4c8931f012b0453d9b62bd92` `quay.io/cilium/operator-generic:stable@sha256:6bd47edc4d8f8b5b984509c68f5625a4141c0f5a4c8931f012b0453d9b62bd92` `docker.io/cilium/operator:v1.10.6@sha256:037441989e5b3b69893bd1112f5b79684758a1de4c5b793fd16011cbf7e0523b` `quay.io/cilium/operator:v1.10.6@sha256:037441989e5b3b69893bd1112f5b79684758a1de4c5b793fd16011cbf7e0523b` `docker.io/cilium/operator:stable@sha256:037441989e5b3b69893bd1112f5b79684758a1de4c5b793fd16011cbf7e0523b` `quay.io/cilium/operator:stable@sha256:037441989e5b3b69893bd1112f5b79684758a1de4c5b793fd16011cbf7e0523b` Signed-off-by: Joe Stringer <joe@cilium.io> 10 December 2021, 18:15:12 UTC
17d3d15 Prepare for release v1.10.6 Signed-off-by: Joe Stringer <joe@cilium.io> 10 December 2021, 03:00:16 UTC
43df21b ui: v0.8.5 [ upstream commit 7d748630fd965639d0baa43fe977afef98e436d2 ] Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 10 December 2021, 00:08:10 UTC
7233121 test: use stable zookeeper image [ upstream commit 1305bab3035d83cee014bbaf2aa2a14093cd84cc ] Commit f66515219c21 ("test: Use stable tags instead of :latest") switched most use of image tags to stable tags, but omitted one occurrence of the zookeeper image in the runtime kafka tests. Switch it to the stable docker.io/cilium/zookeeper:1.0 as well. Fixes: f66515219c21 ("test: Use stable tags instead of :latest") Signed-off-by: Tobias Klauser <tobias@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io> 10 December 2021, 00:08:10 UTC
b9093ef docs: fix link to signoff / certificate of origin section [ upstream commit 6c169f63ec254de7777483b6f01c261215f9ec9c ] Signed-off-by: Timo Reimann <ttr314@googlemail.com> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
3e3f17a docs: fix eksctl ClusterConfig to allow copy [ upstream commit 00275427db4addae523c17fc5424bab63cacc029 ] This commit fixes the eksctl ClusterConfig to allow for copy. It is merely a workaround for now until a proper fix is available. Fixes: 706c9009dc39 ("docs: re-write docs to create clusters with tainted nodes") Signed-off-by: André Martins <andre@cilium.io> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
4dc6fa0 service: Always allocate higher ID for svc/backend [ upstream commit 33bd95c6375c4a494b47fc3634a1eb0a8892660a ] Previously, it was possible that a backend or a service would get allocated an ID such that ID_backend_A < ID < ID_backend_B. This could have happened after a cilium-agent restart, as the nextID was not advanced upon the restoration of IDs. This could have led to situations in which the per-packet LB could select a backend which did not belong to a requested service when the following was fulfilled in chronological order: 1. Previously the same client made the request to the service and the backend with ID_x was chosen. 2. The service endpoint (backend) with ID_x was removed. 3. cilium-agent was restarted. 4. A new service backend which does not belong to the initial service was created and got the ID_x allocated. 5. The CT_SERVICE entry for the old connection was not removed by the CT GC. 6. The same client made a new connection to the same service from the same src port. The above led lb{4,6}_local() to select the wrong backend, as it found the CT_SERVICE entry with the backend ID_x. The advancement of the nextID upon the restoration only partly mitigates the issue. The real fix would be to introduce a match map whose key would be (svc_id, backend_id), and it would be populated by the agent. The lb{4,6}_local() routines would consult the map to detect whether the backend belongs to the service. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
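An illustrative Go sketch of the ID-allocation behaviour this commit aims for (not the actual service ID allocator): after restoring IDs on agent restart, nextID is advanced past the highest restored ID, so a freshly created backend can never reuse a lower, stale ID that old CT_SERVICE entries might still reference.

```go
package main

import "fmt"

// idAllocator is an illustrative sketch: after restoring previously
// allocated IDs, nextID sits past the highest restored ID so new backends
// never reuse a lower, stale ID.
type idAllocator struct {
	nextID    uint32
	allocated map[uint32]struct{}
}

func newIDAllocator(restored []uint32) *idAllocator {
	a := &idAllocator{nextID: 1, allocated: map[uint32]struct{}{}}
	for _, id := range restored {
		a.allocated[id] = struct{}{}
		if id >= a.nextID {
			a.nextID = id + 1 // advance past every restored ID
		}
	}
	return a
}

func (a *idAllocator) acquire() uint32 {
	// Skip any IDs that are already taken (e.g. restored after restart).
	for {
		id := a.nextID
		a.nextID++
		if _, taken := a.allocated[id]; !taken {
			a.allocated[id] = struct{}{}
			return id
		}
	}
}

func main() {
	// IDs 3 and 7 were restored after an agent restart; the next allocation
	// must be 8, not 4, so stale CT_SERVICE entries cannot match it.
	a := newIDAllocator([]uint32{3, 7})
	fmt.Println(a.acquire(), a.acquire()) // 8 9
}
```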
0244ec9 test: Extend coredns clusterrole with additional resource permissions [ upstream commit 854bb8601e420f2087f2f54e1890aae976f464da ] Commit 398d55cd didn't add permissions for the `endpointslices` resource to the coredns `clusterrole` on k8s < 1.20. As a result, core-dns deployments failed on these versions with the error - `2021-11-30T14:09:43.349414540Z E1130 14:09:43.349292 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.EndpointSlice: failed to list *v1beta1.EndpointSlice: endpointslices.discovery.k8s.io is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpointslices" in API group "discovery.k8s.io" at the cluster scope` Fixes: 398d55cd Signed-off-by: Aditi Ghag <aditi@cilium.io> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
ba90edd test: Fix incorrect selector for netperf-service [ upstream commit 8002a50acba951b810b37b3748ec5ba90218fc63 ] Caught by random chance when using this manifest to test something locally. Might as well fix it in case someone uses this in the future and the service is not working as expected. AFAICT, no CI failures occurred from this typo because the Chaos test suite (only suite which uses this manifest) doesn't assert any traffic to the service, but rather to the netperf-server directly. Fixes: b4a3cf6abc6 ("Test: Run netperf in background while Cilium pod is being deleted") Signed-off-by: Chris Tarazi <chris@isovalent.com> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
c653b89 docs: KUBECONFIG for cilium-cli with k3s [ upstream commit 606b5fe9f49f1734d15fcd2d914e56ffa59a82e1 ] Clarify how cilium-cli can work with k3s Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
e420989 bpf: Add WireGuard to complexity and compile tests [ upstream commit 04bf74c8444cc0b3aa5de380d2c542df540543a7 ] ENABLE_WIREGUARD was missing from the compile tests in bpf/Makefile and from the complexity tests in bpf/complexity-tests. We could therefore have missed new complexity issues or compilation errors occurring only when WireGuard is enabled. Fixes: 8930bebe ("daemon: Configure Wireguard for local node") Reported-by: Joe Stringer <joe@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
938a095 test/contrib: Bump CoreDNS version to 1.8.3 [ upstream commit 398d55cd94c0e16dc19b03c53f7b5040c1dd8f13 ] As reported in [1], Go's HTTP2 client < 1.16 had some serious bugs which could result in lost connections to kube-apiserver. Worse than this was that the client couldn't recover. In the case of CoreDNS the loss of connectivity to kube-apiserver was not even logged. I have validated this by adding the following rule on the node which was running the CoreDNS pod (6443 port as the socket-lb was doing the service xlation): iptables -I FORWARD 1 -m tcp --proto tcp --src $CORE_DNS_POD_IP \ --dport=6443 -j DROP After upgrading CoreDNS to the one which was compiled with Go >= 1.16, the pod was not only logging the errors, but also was able to recover from them in a fast way. An example of such an error: W1126 12:45:08.403311 1 reflector.go:436] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: watch of *v1.Endpoints ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding To determine the min vsn bump, I was using the following: for i in 1.7.0 1.7.1 1.8.0 1.8.1 1.8.2 1.8.3 1.8.4; do docker run --rm -ti "k8s.gcr.io/coredns/coredns:v$i" \ --version done CoreDNS-1.7.0 linux/amd64, go1.14.4, f59c03d CoreDNS-1.7.1 linux/amd64, go1.15.2, aa82ca6 CoreDNS-1.8.0 linux/amd64, go1.15.3, 054c9ae k8s.gcr.io/coredns/coredns:v1.8.1 not found: manifest unknown: k8s.gcr.io/coredns/coredns:v1.8.2 not found: manifest unknown: CoreDNS-1.8.3 linux/amd64, go1.16, 4293992 CoreDNS-1.8.4 linux/amd64, go1.16.4, 053c4d5 Hopefully, the bumped version will fix the CI flakes in which a service domain name is not available after 7min. In other words, CoreDNS is not able to resolve the name which means that it hasn't received an update from the kube-apiserver for the service. [1]: https://github.com/kubernetes/kubernetes/issues/87615#issuecomment-803517109 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
a847d0e ci: Restart pods when toggling KPR switch [ upstream commit 06d9441d49b0b25e86af58bac16281a6950cbc27 ] Previously, in the graceful backend termination test we switched to KPR=disabled and we didn't restart CoreDNS. Before the switch, CoreDNS@k8s2 -> kube-apiserver@k8s1 was handled by the socket-lb, so the outgoing packet was $CORE_DNS_IP -> $KUBE_API_SERVER_NODE_IP. The packet should have been BPF masq-ed. After the switch, the BPF masq is no longer in place, so the packets from CoreDNS are subject to the iptables' masquerading (they can be either dropped by the invalid rule or masqueraded to some other port). Combined with CoreDNS unable to recover from connectivity errors [1], the CoreDNS was no longer able to receive updates from the kube-apiserver, thus NXDOMAIN errors for the new service name. To avoid such flakes, forcefully restart the DNS pods if the KPR setting change is detected. [1]: https://github.com/cilium/cilium/pull/18018 Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: nathanjsweet <nathanjsweet@pm.me> 10 December 2021, 00:07:49 UTC
421bd59 test: Add e2e test for FQDN policy update [ upstream commit 80ff05aec2fc1458b7f8fb9b3c4c8d159b39cfe7 ] Test that when FQDN policy is updated to a new policy which still selects the same old FQDN destination, connectivity continues to work. Validated to fail without the previous commit: K8sFQDNTest /home/joe/git/cilium/test/ginkgo-ext/scopes.go:473 Validate that FQDN policy continues to work after being updated [It] /home/joe/git/cilium/test/ginkgo-ext/scopes.go:527 Can't connect to to a valid target when it should work Expected command: kubectl exec -n default app2-58757b7dd5-rh7dd -- curl --path-as-is -s -D /dev/stderr --fail --connect-timeout 5 --max-time 20 --retry 5 http://vagrant-cache.ci.cilium.io -w "time-> DNS: '%{time_namelookup}(%{remote_ip})', Connect: '%{time_connect}', Transfer '%{time_starttransfer}', total '%{time_total}'" To succeed, but it failed: Exitcode: 28 Err: exit status 28 Stdout: time-> DNS: '0.000016()', Connect: '0.000000',Transfer '0.000000', total '5.000415' Stderr: command terminated with exit code 28 Signed-off-by: Joe Stringer <joe@cilium.io> 09 December 2021, 16:14:29 UTC
710bb9b policy: Fix selector identity release for FQDN [ upstream commit 018be31d6026a7816436a9b7fc587b70f29ac199 ] Alexander reports in GitHub issue 18023 that establishing a connection via an FQDN policy, then modifying that FQDN policy, would cause subsequent traffic to the FQDN to be dropped, even if the new policy still allowed the same traffic via a toFQDN statement. This was caused by overzealous release of CIDR identities while generating a new policy. Although the policy calculation itself keeps all selectorcache entries alive during the policy generation phase (see cachedSelectorPolicy.setPolicy() ), after the new policy is inserted into the PolicyCache, the distillery package would clean up the old policy. As part of that cleanup, it would call into the individual selector to call the RemoveSelectors() function. The previous implementation of this logic unintentionally released the underlying identities any time a user of a selector was released, rather than only releasing the underlying identities when the number of users reached zero and the selector itself would be released. This meant that rather than the selectorcache retaining references to the underlying identities when a policy was updated, instead the references would be released (and all corresponding BPF resources cleaned up) at the end of the process. This then triggered subsequent connectivity outages. Fix it by only releasing the identity references once the cached selector itself is removed from the SelectorCache. Fixes: f559cf1cecb0 ("selectorcache: Release identities on selector removal") Reported-by: Alexander Block <ablock84@gmail.com> Suggested-by: Jarno Rajahalme <jarno@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io> 09 December 2021, 16:14:29 UTC
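A simplified Go sketch of the reference-counting fix described in this commit (hypothetical types, not the actual SelectorCache): identity references held by a cached selector are released only when its last user is removed and the selector itself leaves the cache, not on every user removal.

```go
package main

import "fmt"

// cachedSelector holds identity references that must outlive any single
// policy using the selector.
type cachedSelector struct {
	users      int
	identities []int // numeric IDs standing in for CIDR identities
}

type selectorCache struct {
	selectors map[string]*cachedSelector
	refcounts map[int]int // identity ID -> reference count
}

func (sc *selectorCache) removeUser(name string) {
	sel, ok := sc.selectors[name]
	if !ok {
		return
	}
	sel.users--
	if sel.users > 0 {
		// Other policies still use this selector: keep identity references.
		return
	}
	// Last user gone: remove the selector and release its identities.
	for _, id := range sel.identities {
		sc.refcounts[id]--
	}
	delete(sc.selectors, name)
}

func main() {
	sc := &selectorCache{
		selectors: map[string]*cachedSelector{
			"toFQDN:example.com": {users: 2, identities: []int{16777217}},
		},
		refcounts: map[int]int{16777217: 1},
	}
	sc.removeUser("toFQDN:example.com") // old policy removed; new one still selects it
	fmt.Println(sc.refcounts[16777217]) // still 1: identity kept alive
	sc.removeUser("toFQDN:example.com") // selector fully unused now
	fmt.Println(sc.refcounts[16777217]) // 0: released
}
```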
7105e9a docs: Update the minimum required Minikube version [ upstream commit f4d59d188cb39c0f3360117a3f93046f0a3592b0 ] Minikube 1.12.0 or later is required to use the --cni flag [1]. 1 - https://github.com/kubernetes/minikube/commit/9e95435e0020eed065ee0229a6a54a7e54530a6d Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io> 09 December 2021, 16:14:29 UTC
e0af79d workflows: Run CodeQL workflow if the workflow is edited [ upstream commit 3bd4ad63556a5fb538312e6c655e878ff090932c ] We use path filters in the CodeQL workflow to avoid running it for unrelated changes. We're however missing the workflow file itself in the path filters. As a result, the CodeQL workflow isn't run when the GitHub Actions it uses are updated by dependabot. This commit fixes it. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io> 09 December 2021, 16:14:29 UTC
d7972e4 build(deps): bump actions/download-artifact from 2.0.10 to 2.1.0 Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 2.0.10 to 2.1.0. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/3be87be14a055c47b01d3bd88f8fe02320a9bb60...f023be2c48cc18debc3bacd34cb396e0295e2869) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 09 December 2021, 16:04:32 UTC
d84cac1 build(deps): bump actions/upload-artifact from 2.2.4 to 2.3.0 Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2.2.4 to 2.3.0. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/27121b0bdffd731efa15d66772be8dc71245d074...da838ae9595ac94171fa2d4de5a2f117b3e7ac32) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 08 December 2021, 17:53:36 UTC
bd30dbe Update Go to 1.16.11 Signed-off-by: Tobias Klauser <tobias@cilium.io> 06 December 2021, 21:15:24 UTC
0024250 bugtool: fix IP route debug gathering commands [ upstream commit e38e3c44f712b5f0ecf33efd1867c0ae16b241f7 ] Commit 8bcc4e5dd830 ("bugtool: avoid allocation on conversion of execCommand result to string") broke the `ip route show` commands because the change from `[]byte` to `string` causes the `%v` formatting verb to emit the raw byte slice, not the string. Fix this by using the `%s` formatting verb to make sure the argument gets interpreted as a string. Also fix another instance in `writeCmdToFile` where `fmt.Fprint` is now invoked with a byte slice. Grepping for `%v` in bugtool sources and manually inspecting all changes from commit 8bcc4e5dd830 showed no other instances where a byte slice could potentially end up being formatted in a wrong way. Fixes: 8bcc4e5dd830 ("bugtool: avoid allocation on conversion of execCommand result to string") Signed-off-by: Tobias Klauser <tobias@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 December 2021, 21:13:10 UTC
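A tiny Go example of the formatting-verb difference this fix relies on: `%v` renders a `[]byte` as raw bytes, while `%s` renders it as text.

```go
package main

import "fmt"

func main() {
	out := []byte("default via 10.0.0.1 dev eth0")

	// %v on a []byte prints the raw byte slice, which is what broke the
	// bugtool output once the []byte -> string conversion was removed.
	fmt.Printf("%v\n", out) // [100 101 102 97 117 ...]

	// %s interprets the argument as a string, restoring readable output.
	fmt.Printf("%s\n", out) // default via 10.0.0.1 dev eth0
}
```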
bb0357f docs: add registry (quay.io/) for pre-loading images for kind [ upstream commit 4758bef62869d60df45a383c4be813ebed1343c8 ] The docs recommend pulling the image with: docker pull cilium/cilium:|IMAGE_TAG| which downloads from docker.io. However, the operator loads images from quay.io. We should keep the two consistent, otherwise we download for nothing. Signed-off-by: adamzhoul <adamzhoul186@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 December 2021, 21:13:10 UTC
7829db3 docs: correct ec2 modify net iface action [ upstream commit ce45bc36946120ee5495be23ccc753d5e1910c8c ] `ModifyNetworkInterface` -> `ModifyNetworkInterfaceAttribute` see: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_ModifyNetworkInterfaceAttribute.html Signed-off-by: austin ce <austin.cawley@gmail.com> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 December 2021, 21:13:10 UTC
ec29e35 daemon: Deprecate --prefilter-{device,mode} [ upstream commit da26d0ae6f25c1fa28ba1d686e5505eb245e4908 ] Replace the enablement by the dedicated flag "--enable-xdp-prefilter", and the mode with the existing "--bpf-lb-acceleration". Ideally, we should deprecate the filter, but we cannot do it until the host-fw is supported by the XDP prog. Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 December 2021, 21:13:10 UTC
1b72316 test: Test XDP_REDIRECT in L4LB suite [ upstream commit f5f0537b6a8ac64f2f5b1fbf78cfb2ea02fc9596 ] To test that, add another veth pair to the LB node (aka "kind-control-plane"), and steer the LB request through it (done by installing ip route: $LB_VIP via $ANOTHER_VETH_PAIR_IP). After receiving the request, the LB node will do XDP_REDIRECT to the previous veth pair (identified by the --direct-routing-device) which is used to connect with the kind-worker which is running the service backend. [ Backport note: Created upgrade notes for v1.10.6 instead of v1.11. ] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 December 2021, 21:13:10 UTC
cff593b datapath,daemon: Enable multi-dev XDP [ upstream commit 0ffa7c60e1eb1991a263ddad5ac14a9c3c1ec3a9 ] This commit enables XDP (bpf_xdp) on all devices specified by --devices. Previously, only --direct-routing-device (or --devices if dev count = 1) could have been used to run bpf_xdp. To do so, we rely on the bpf_redirect() helper, whose performance (on Linux kernels >= 5.5) recently became on par with bpf_redirect_map() [1]. Therefore, it does not make sense to use the latter. The side effect of the enablement is that the --prefilter-device no longer makes sense. This will be addressed in a following commit. [1]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=1d233886dd904edbf239eeffe435c3308ae97625 [ Backport note: minor conflict with commit db5300dc0f26 ("choir: normalize error handling in kube_proxy_replacement.go"). Solved by deleting the whole chunk anyway. ] Signed-off-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 06 December 2021, 21:13:10 UTC
f08ea1d egressgateway: refactor manager logic [ upstream commit ed73a3174c868dde427c6a11194adc5f59f4a0f1 ] This commit refactors the egress gateway manager in order to provide a single `reconcile()` method which will be invoked on all events received by the manager. This method is responsible for adding and removing entries to and from the egress policy map. In addition to this, the manager will now wait for the k8s cache to be fully synced before running its first reconciliation, in order to always have the egress_policy map in a consistent state with the k8s configuration. Fixes: #17380 Fixes: #17753 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 06 December 2021, 16:32:00 UTC
b26520e daemon: add WaitUntilK8sCacheIsSynced method [ upstream commit d9b60f7102777c84f4917a6953be5e3538084c65 ] which will block the caller until the agent has fully synced its k8s cache. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 06 December 2021, 16:32:00 UTC
319d797 docs: add a note on egress gateway upgrade impact for 1.10.6 [ upstream commit cdb4b461560565f13cc574ab0f70bf40d4876c0c ] Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 06 December 2021, 16:32:00 UTC
5342799 bpf: rename egress policy map and its fields [ upstream commit 2b079593b04e5fb1fe2dbc6095921b15415e4a1f ] to make it clearer that it's related to the egress gateway policies Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 06 December 2021, 16:32:00 UTC
e1f8ef0 maps: switch egressmap to cilium/ebpf package [ upstream commit 3ba8e6e481fe6601747c62369790bcc9d79fa0b6 ] Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 06 December 2021, 16:32:00 UTC
d61ce85 daemon, node: Remove old, discarded router IPs from `cilium_host` [ upstream commit fcd00390c30c6eeffbe2fefa81d5b22e59397297 ] In the previous commit (referenced below), we forgot to remove the old router IPs from the actual interface (`cilium_host`). This caused connectivity issues in user environments where the discarded, stale IPs were reassigned to pods, causing the ipcache entries for those IPs to have `remote-node` identity. To fix this, we remove all IPs from the `cilium_host` interface that weren't restored during the router IP restoration process. This step correctly finalizes the restoration process for router IPs. Fixes: ff63b0775c0 ("daemon, node: Fix faulty router IP restoration logic") Signed-off-by: Chris Tarazi <chris@isovalent.com> 06 December 2021, 01:55:43 UTC
9f91ec3 node: Add missing fallback to router IP from CiliumNode for restoration [ upstream commit 02fa124f73e44cde4124c9f37325ce66c338aa98 ] Previously in the case that both router IPs from the filesystem and the CiliumNode resource were available, we missed a fallback to the CiliumNode IP, if the IP from the FS was outside the provided CIDR range. In other words, we returned early that the FS IP does not belong to the CIDR, without checking if the IP from the CiliumNode was a valid fallback. This commit adds the missing case logic and also adds more documentation to the function. Signed-off-by: Chris Tarazi <chris@isovalent.com> 06 December 2021, 01:55:43 UTC
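An illustrative Go sketch of the restoration fallback described in this commit (hypothetical function, not the actual node package): the router IP restored from the filesystem is preferred when it falls inside the allocation CIDR; otherwise the IP from the CiliumNode resource is used if it does.

```go
package main

import (
	"fmt"
	"net"
)

// chooseRouterIP sketches the fallback order: filesystem IP first, then the
// CiliumNode IP, and only if both are unusable is a new IP allocated.
func chooseRouterIP(fromFS, fromCiliumNode net.IP, allocCIDR *net.IPNet) net.IP {
	if fromFS != nil && allocCIDR.Contains(fromFS) {
		return fromFS
	}
	if fromCiliumNode != nil && allocCIDR.Contains(fromCiliumNode) {
		return fromCiliumNode
	}
	return nil // no usable router IP; a new one must be allocated
}

func main() {
	_, cidr, _ := net.ParseCIDR("10.20.0.0/24")
	fs := net.ParseIP("10.99.0.5") // stale IP outside the CIDR
	cn := net.ParseIP("10.20.0.7") // valid fallback from the CiliumNode resource
	fmt.Println(chooseRouterIP(fs, cn, cidr)) // 10.20.0.7
}
```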
46c3432 selectorcache: Release identities on selector removal [ upstream commit f559cf1cecb0780eb3a815c951fe599f4918e474 ] FQDN selectors hold references to CIDR identities, one for each 'fqdnSelector.cachedSelections' entry. Previously, these would be associated with the fqdnSelector during creation of the selector but they were not released when the fqdnSelector is deleted from the cache. Commit de10e82639cc ("fqdn: Move identity allocation to FQDN selector") fixed a similar bug recently during the update of existing selectors; this code is balancing the release for identities allocated during the initial creation of each FQDN selector. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
3d37fba fqdn: Handle duplicate ids during selector caching [ upstream commit ff85fcdc6033b34352d02a86e21a55601756208e ] Previously this code didn't consider that this function could be called against a selector that is already cached (and therefore has identities associated with its internal 'cachedSelections' list). If a selector with the same string representation was added twice into the cache and identities associated with it in between, then that could potentially cause those identities to have a reference allocated without eventually being freed. Account for this case by detecting duplicates here and releasing any identities that are already referenced by this selector. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
511993a fqdn: Move identity allocation to FQDN selector [ upstream commit b10797c7d87b5b9e0c85e4a113fe302167b48933 ] [ Backporter's notes: Minor conflict in FakeIdentityAllocator rename ] Previously, the CIDR identity alloction logic was tied directly into the NameManager in RegisterForIdentityUpdatesLocked(), however these identities were never cleaned up. In order to properly release CIDR identities that are allocated upon creation of an FQDN selector, we need to track them and eventually call into release functions. Similar to recent changes that looked at identity leak during updates of FQDN selectors, we have two options - continue to tie the allocation into the NameManager (and hence teach UnregisterForIdentityUpdatesLocked() about identities and their associations), or we can tie them more closely to the cached FQDN selectors that require the allocations to occur. Given that not all cached selectors need this treatment (in particular, cached label selectors do not hold identity references), this code opts to refactor the allocation routines up one layer into the selectorcache's 'fqdnSelector' initialization handling logic. This will pave the way for fqdnSelector cleanup logic to then release those references, thereby balancing the allocation and release of identities. This commit should have no functional changes. Technically the identity allocation is moved inside the SelectorCache critical section while holding the mutex, but CIDR identities are only ever locally allocated within memory anyway so this is not expected to block for a long time. Regarding the dropped test, by name it seemed like it's intended to test the identity notifier, except that at its core it is using a mock implementation of the identity notifier. It was also depending on the assumption that the identity notifier is responsible for identity allocation, whereas this patch removes that responsibility and pushes it back to the FQDN selector. We perhaps could have expanded the way that it prepares the identity allocator and the identity notifier in order to validate that when adding a new FQDN selector, existing IPs from the DNS cache are properly propagated into the selector via the mock selection user that's also present in this test file. However, by this time we mock out this many implementation details around the DNS cache, FQDN selectors, cached selection users, Identity allocation and Identity notifier, it seemed like the test is getting too deep into implementation details to be able to robustly validate the functionality it's intending to validate. Rather than fight this structure and attempt to build yet more implementation-specific testing, it seemed like a simpler path to just remove the test. If this selection notification doesn't occur properly here, then I'd hope we can pick this up during later integration tests rather than solely relying on this existing test. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
476c006 identity: Remove disbalanced SelectorCache notify code [ upstream commit eaaa14f573fa0d90b1b98b6482d99465aa07d217 ] [ Backporter's notes: Minor conflict due to lack of ipcache/metadata.go ] While attempting to address identity garbage collection issues in upcoming commits, I hit the stack trace below. This is not currently possible to hit, because CIDR identity references from the SelectorCache are never released. However, after introducing the proper identity reference count release implementation, we observe that the SelectorCache attempts to release a CIDR identity, which then triggers a call back into the SelectorCache to notify it about the release of the identities. Since the outer calling code must hold the SelectorCache mutex in order to protect the writes against the SelectorCache itself, when the release is executed and attempts to call back into the SelectorCache, UpdateIdentities() attempts to re-acquire the mutex that the goroutine is already holding. This causes deadlocks which then propagate through to other subsystems, for instance locking Endpoints out from applying policy changes and preventing CLI calls for commands such as `cilium policy selectors list` from gathering and returning the internal state of the system. Upon further investigation, the notification of the "owner", ie SelectorCache, is also disbalanced here between the AllocateIdentity() call and the Release() call. The AllocateIdentity() call will only perform the notification when it is called from endpoint (restore or identity update) or clustermesh-apiserver packages. It does not notify the SelectorCache when called from CIDR allocation routines. On the other hand, the Release function inexplicably only notifies the SelectorCache when CIDR identities are released. When this functionality was initially introduced, it seems that we did not recognize and address this disbalance. The CIDR identity allocation routines would not notify the owner (because the callers take on that responsibility), and yet the release _would_ notify the owner. To fix this, this commit disables the owner notification for the Release() function so that it matches the AllocateIdentity() function. This will also prevent the caller from notifying itself again and triggering a deadlock when we apply the fixes in upcoming commits. sync.runtime_SemacquireMutex(0x44, 0x18, 0xc000a9a2d8) /usr/local/go/src/runtime/sema.go:71 +0x25 sync.(*Mutex).lockSlow(0xc000532d90) /usr/local/go/src/sync/mutex.go:138 +0x165 sync.(*Mutex).Lock(...) 
/usr/local/go/src/sync/mutex.go:81 sync.(*RWMutex).Lock(0x580) /usr/local/go/src/sync/rwmutex.go:111 +0x36 github.com/cilium/cilium/pkg/policy.(*SelectorCache).UpdateIdentities(0xc000532d90, 0x0, 0xc0016540f0, 0x24bc800) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:950 +0x65 github.com/cilium/cilium/daemon/cmd.(*Daemon).UpdateIdentities(0xc0000c5440, 0xc0016540f0, 0xc001000002) /go/src/github.com/cilium/cilium/daemon/cmd/policy.go:88 +0x5a github.com/cilium/cilium/pkg/identity/cache.(*CachingIdentityAllocator).Release(0xc0007a18c0, {0x2b86420, 0xc0009e7140}, 0xc001c42540) /go/src/github.com/cilium/cilium/pkg/identity/cache/allocator.go:423 +0x324 github.com/cilium/cilium/pkg/ipcache.releaseCIDRIdentities({0x2b86420, 0xc0009e7140}, 0xc0006b08c5) /go/src/github.com/cilium/cilium/pkg/ipcache/cidr.go:141 +0x296 github.com/cilium/cilium/pkg/ipcache.ReleaseCIDRIdentitiesByID({0x2b86420, 0xc0009e7140}, {0xc001597450, 0x41de7a0, 0x1bf08eb000}) /go/src/github.com/cilium/cilium/pkg/ipcache/cidr.go:200 +0x4af github.com/cilium/cilium/daemon/cmd.cachingIdentityAllocator.ReleaseCIDRIdentitiesByID({0x2b863e8}, {0x2b86420, 0xc0009e7140}, {0xc001597450, 0x3, 0x10}) /go/src/github.com/cilium/cilium/daemon/cmd/identity.go:117 +0x37 github.com/cilium/cilium/pkg/policy.(*fqdnSelector).releaseIdentityMappings(0xc001c24e10, {0x7f0588917fb8, 0xc0007a18c0}) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:526 +0x1aa github.com/cilium/cilium/pkg/policy.(*SelectorCache).removeSelectorLocked(0xc000532d90, {0x2ba5660, 0xc001c24e10}, {0x2b33100, 0xc000953880}) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:901 +0xd7 github.com/cilium/cilium/pkg/policy.(*SelectorCache).RemoveSelectors(0xc000532d90, {0xc000f255c0, 0x2, 0xc0026e65c8}, {0x2b33100, 0xc000953880}) /go/src/github.com/cilium/cilium/pkg/policy/selectorcache.go:919 +0xb4 github.com/cilium/cilium/pkg/policy.(*L4Filter).removeSelectors(0xc000953880, 0xc0026e6690) /go/src/github.com/cilium/cilium/pkg/policy/l4.go:624 +0x185 github.com/cilium/cilium/pkg/policy.(*L4Filter).detach(0x0, 0x0) /go/src/github.com/cilium/cilium/pkg/policy/l4.go:631 +0x1e github.com/cilium/cilium/pkg/policy.L4PolicyMap.Detach(0x0, 0xc000a80160) /go/src/github.com/cilium/cilium/pkg/policy/l4.go:801 +0x7e github.com/cilium/cilium/pkg/policy.(*L4Policy).Detach(0xc000635c80, 0x1) /go/src/github.com/cilium/cilium/pkg/policy/l4.go:1011 +0x45 github.com/cilium/cilium/pkg/policy.(*selectorPolicy).Detach(...) /go/src/github.com/cilium/cilium/pkg/policy/resolve.go:107 github.com/cilium/cilium/pkg/policy.(*cachedSelectorPolicy).setPolicy(0xc000532e70, 0xc0009e63c0) /go/src/github.com/cilium/cilium/pkg/policy/distillery.go:188 +0x3b github.com/cilium/cilium/pkg/policy.(*PolicyCache).updateSelectorPolicy(0xc000131410, 0xc0009e63c0) /go/src/github.com/cilium/cilium/pkg/policy/distillery.go:124 +0x195 github.com/cilium/cilium/pkg/policy.(*PolicyCache).UpdatePolicy(...) 
/go/src/github.com/cilium/cilium/pkg/policy/distillery.go:153 github.com/cilium/cilium/pkg/endpoint.(*Endpoint).regeneratePolicy(0xc00078b500) /go/src/github.com/cilium/cilium/pkg/endpoint/policy.go:230 +0x22b github.com/cilium/cilium/pkg/endpoint.(*Endpoint).runPreCompilationSteps(0xc00078b500, 0xc000363400) /go/src/github.com/cilium/cilium/pkg/endpoint/bpf.go:815 +0x2dd github.com/cilium/cilium/pkg/endpoint.(*Endpoint).regenerateBPF(0xc00078b500, 0xc000363400) /go/src/github.com/cilium/cilium/pkg/endpoint/bpf.go:584 +0x19d github.com/cilium/cilium/pkg/endpoint.(*Endpoint).regenerate(0xc00078b500, 0xc000363400) /go/src/github.com/cilium/cilium/pkg/endpoint/policy.go:405 +0x7b3 github.com/cilium/cilium/pkg/endpoint.(*EndpointRegenerationEvent).Handle(0xc000d8e060, 0x40568a) /go/src/github.com/cilium/cilium/pkg/endpoint/events.go:53 +0x32c github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).run.func1() /go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:245 +0x13b sync.(*Once).doSlow(0x1, 0x43f325) /usr/local/go/src/sync/once.go:68 +0xd2 sync.(*Once).Do(...) /usr/local/go/src/sync/once.go:59 github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).run(0x0) /go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:233 +0x45 created by github.com/cilium/cilium/pkg/eventqueue.(*EventQueue).Run /go/src/github.com/cilium/cilium/pkg/eventqueue/eventqueue.go:229 +0x7b Fixes: 83e576e6d00a ("identity: create `CachingIdentityAllocator` type") Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
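The deadlock in the stack trace above boils down to a release callback re-entering a mutex the caller already holds. A schematic Go reproduction of that pattern (not Cilium code):

package main

import (
	"fmt"
	"sync"
	"time"
)

// cache stands in for the SelectorCache: its mutex protects internal state,
// and UpdateIdentities is the owner-notification entry point.
type cache struct {
	mu sync.RWMutex
}

func (c *cache) UpdateIdentities() {
	c.mu.Lock() // re-acquires the lock the caller already holds -> deadlock
	defer c.mu.Unlock()
	fmt.Println("identities updated")
}

// removeSelector holds the cache mutex and then releases identities; if the
// release path notifies the owner (the cache itself), it blocks forever.
func (c *cache) removeSelector(release func()) {
	c.mu.Lock()
	defer c.mu.Unlock()
	release()
}

func main() {
	c := &cache{}
	done := make(chan struct{})
	go func() {
		// Release callback that notifies the owner, as the old code did.
		c.removeSelector(func() { c.UpdateIdentities() })
		close(done)
	}()
	select {
	case <-done:
		fmt.Println("no deadlock")
	case <-time.After(500 * time.Millisecond):
		fmt.Println("deadlocked: release re-entered the cache mutex")
	}
}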
4d6ad68 endpoint: Document SelectorCache notification [ upstream commit c094644c0797dbff0509ec1a81eed5b5a80e2393 ] Document why the endpoint's AllocateIdentity() function triggers notification into the SelectorCache and yet the Release does not. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
ed13051 identity: Add notifyOwner flag to identity Release [ upstream commit 6275eb8ee65f0c6be389eb22e2bd6e4c138c11c2 ] [ Backporter's notes: Minor conflict in testutils ] The AllocateIdentity() method accepts a 'notifyOwner' boolean to determine whether to notify the SelectorCache about the change in identities. To balance the APIs into this structure, this commit adds the corresponding notifyOwner flag to the Release function. No other changes are made at this time; all existing cases which notify the owner will continue to notify the owner, and all cases which do not notify the owner will continue to not notify the owner. The next commit will change the behaviour and argue why that is correct. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
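To show the shape of the balanced API described here, a small Go sketch follows; the signatures are simplified assumptions for illustration and do not mirror the exact Cilium function signatures.

package main

import "fmt"

// owner is notified about identity changes (e.g. the SelectorCache).
type owner interface {
	UpdateIdentities(added, deleted []string)
}

type allocator struct {
	o owner
}

// AllocateIdentity optionally notifies the owner, controlled by the caller.
func (a *allocator) AllocateIdentity(id string, notifyOwner bool) {
	if notifyOwner && a.o != nil {
		a.o.UpdateIdentities([]string{id}, nil)
	}
}

// Release mirrors AllocateIdentity: the same notifyOwner flag decides whether
// the owner hears about the removal, keeping the two paths symmetric.
func (a *allocator) Release(id string, notifyOwner bool) {
	if notifyOwner && a.o != nil {
		a.o.UpdateIdentities(nil, []string{id})
	}
}

type printOwner struct{}

func (printOwner) UpdateIdentities(added, deleted []string) {
	fmt.Println("owner notified:", added, deleted)
}

func main() {
	a := &allocator{o: printOwner{}}
	a.AllocateIdentity("cidr:10.0.0.0/24", true)
	a.Release("cidr:10.0.0.0/24", false) // release without notifying, avoiding re-entrancy
}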
91c32bf test: Make the FQDN+HTTP policy test more lenient Currently, the test "RuntimeFQDNPolicies toFQDNs populates toCIDRSet (data from proxy) L3-dependent L7/HTTP with toFQDN updates proxy policy" is relying on a bug in the cilium-agent where identities allocated during DNS packet handling are leaked indefinitely. Specifically: 1. Previous tests run some DNS requests through the agent, causing the agent to allocate a local CIDR identity for the corresponding IP address, which is then propagated into the ipcache. 2. After those previous tests, the identity is not correctly cleaned up internally due to the bug that is about to be fixed. 3. This test then runs, and creates the relevant policies. When evaluating the policy, the identity from the previous test is taken into account and propagated through to Envoy. 4. Finally, this test establishes a connection through the combination of FQDN policy and HTTP policy. 5. This connection is established by first sending a DNS request to learn about the IP corresponding to the FQDN (and to notify Cilium's policy logic, which notifies Envoy). 6. Then, as part of handling the request, Cilium will synchronously plumb the identity into the ipcache, but it will not synchronously notify Envoy about the IP->Identity mapping. 7. Crucially, on kernels 4.9 and 4.10, Envoy relies on the async ipcache listener logic to learn about these identities. On kernels 4.11 or newer, Envoy directly references the ipcache to fetch this information. 8. Given the lack of cleanup in step (2) above, Envoy does not need to learn about the IP->Identity mapping to correctly apply the policy even if the logic is async in (7) above. Upcoming commits will fix step (2) above, thereby releasing the CIDR identity corresponding to the FQDN from this test prior to running the following test. This means that we now rely on the async execution in step (6) completing quickly enough before the application sends the HTTP request. Due to (7), this is not important on most supported kernels. However, if the user (or, in this case, the test) is running on a 4.9 kernel, for instance, then there is a window where Envoy may not have been notified of the information necessary for it to make the correct policy decision. Now, we could resolve this in one of several ways: * Teach Envoy to always pull the IP->Identity mapping from the ipcache, even on kernels 4.9/4.10 (requires a bit of work on the proxy side); * Teach Cilium to synchronously wait until the Envoy notification mechanism in step (7) completes (requires a bit of work on Cilium); * This approach: just retry. If many users were running on these kernels, then one of the first two approaches above would likely make sense, as otherwise it's possible that users would start to notice occasional policy issues when combining FQDN and HTTP policies together. However, Linux 4.9 is rarely used these days and it's not likely worth the effort to implement one of the first two solutions here. So, go for a simple fix to allow the test to pass on the second try. Thanks to Jarno Rajahalme for the discussion and pointers on the underlying issues with state propagation between Cilium and Envoy. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 20:32:12 UTC
73475b7 mlh: switch runtime from kernel 4.9 to net-next This backports #17186 to `v1.10`. Rationale: the original PR was required on `master` for #17813. #18082 is backporting #17813 to `v1.10`, thus it also requires similar changes. As we previously did on `master`, we switch the runtime tests to run on net-next instead of 4.9 due to new unit tests requirements. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 02 December 2021, 20:12:29 UTC
8417a82 bpf: rename reason to ct_ret in handle_ipv4_from_lxc [ upstream commit c11736280ff210c249a098e9816a88cc41f47dce ] to make its purpose more explicit Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
6c3eb84 bpf: exclude pod's reply traffic from egress gateway logic [ upstream commit fcd19dca42d55602b00cdbd0763e5ee38ffd99bb ] Currently, all pod traffic matching the source IP and destination CIDR of an egress policy is forwarded to an egress gateway. This means we also incorrectly forward to an egress gateway all reply traffic from connections destined to a pod, breaking those connections. This commit fixes this by adding an additional check to make sure reply traffic (i.e. connections not originating from a pod) is excluded from the egress gateway logic. Fixes: #17866 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
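The actual check lives in the BPF C datapath, but the decision it adds can be illustrated with a short Go sketch; the types, names and conntrack states below are invented for the example.

package main

import "fmt"

// ctState is a stand-in for the conntrack verdict on a packet.
type ctState int

const (
	ctNew   ctState = iota // connection originated by the pod
	ctReply                // reply traffic for a connection initiated elsewhere
)

type policy struct {
	sourceIP string
	destCIDR string
}

// shouldApplyEgressGateway mirrors the idea in the commit: a packet is only
// redirected to the egress gateway if it matches a policy *and* is not reply
// traffic, so replies to connections destined to the pod are left alone.
func shouldApplyEgressGateway(srcIP, dstCIDR string, state ctState, policies []policy) bool {
	if state == ctReply {
		return false
	}
	for _, p := range policies {
		if p.sourceIP == srcIP && p.destCIDR == dstCIDR {
			return true
		}
	}
	return false
}

func main() {
	policies := []policy{{sourceIP: "10.0.1.5", destCIDR: "192.0.2.0/24"}}
	fmt.Println(shouldApplyEgressGateway("10.0.1.5", "192.0.2.0/24", ctNew, policies))   // true
	fmt.Println(shouldApplyEgressGateway("10.0.1.5", "192.0.2.0/24", ctReply, policies)) // false
}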
02a0d13 helm: remove the masquerade check in poststart [ upstream commit e7de7f255b438022325c59e1b03bd79e4bcd8896 ] The current understanding is that we want to remove these rules all the time, as they can cause issues with features such as egress gateway. Signed-off-by: Bruno M. Custódio <brunomcustodio@gmail.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
07df79e daemon: Fatal on BPF masquerade + IPv6 masquerade [ upstream commit d60cdef32ec150efb42ac386545fac33addd1332 ] BPF masquerading for IPv6 isn't supported yet, so we should fatal early if the user asks for both BPF and IPv6 masquerade. They can use iptables-based masquerading for IPv6 instead. Since we enable BPF-based masquerading in all tests with 4.19+ kernels, we also need to disable IPv6 masquerading there. That should be fine since we rarely rely on IPv6 masquerading anyway. Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
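A minimal sketch of this kind of early configuration validation, assuming hypothetical flag names on an illustrative config struct rather than the agent's real option handling:

package main

import "log"

// daemonConfig is an illustrative subset of agent configuration flags.
type daemonConfig struct {
	EnableBPFMasquerade  bool
	EnableIPv6Masquerade bool
}

// validate fails fast on the unsupported combination: BPF masquerading does
// not cover IPv6, so asking for both is a configuration error.
func (c *daemonConfig) validate() {
	if c.EnableBPFMasquerade && c.EnableIPv6Masquerade {
		log.Fatal("BPF masquerading is not supported for IPv6; " +
			"use iptables-based masquerading for IPv6 instead")
	}
}

func main() {
	cfg := &daemonConfig{EnableBPFMasquerade: true, EnableIPv6Masquerade: false}
	cfg.validate()
	log.Println("configuration OK")
}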
b0b4c08 bugtool: fix data race occurring when running commands [ upstream commit 690c11201d9e48c0210c8dc644cc281f40e896ad ] This is a version of the classic gotcha of capturing a loop variable in a closure that runs concurrently. Bind the value of cmd in the loop to a new variable to address the issue. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
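The gotcha and its fix look roughly like this in Go; this is a schematic example, not the bugtool source.

package main

import (
	"fmt"
	"sync"
)

func main() {
	commands := []string{"uname -a", "ip a", "tc qdisc show"}
	var wg sync.WaitGroup

	// Buggy version (for illustration): every goroutine captures the same
	// loop variable, so they may all observe the last command.
	//
	// for _, cmd := range commands {
	//     go func() { run(cmd) }() // data race on cmd
	// }

	// Fixed version: rebind the loop variable so each closure gets its own copy.
	for _, cmd := range commands {
		cmd := cmd
		wg.Add(1)
		go func() {
			defer wg.Done()
			fmt.Println("running:", cmd)
		}()
	}
	wg.Wait()
}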
10b339a helm: Disable BPF masquerading in v1.10+ [ upstream commit dc40b2cbe0468aa7389cdafa6c0ae0387f165eb8 ] In Cilium v1.10, we disabled kube-proxy-replacement by default but left BPF masquerading enabled. Since the latter requires the former, the default installation results in a warning. This commit fixes the warning by disabling BPF masquerading as well on new v1.10+ deployments. Fixes: 54121427 ("install: Disable kube-proxy-replacement by default") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
8480765 ipsec: fix source template in skip rule [ upstream commit b385f0f1c0e20e87af4315477d78266b9261d8d1 ] This patch modifies a forward policy update introduced by 0b52fd76c0101e966d701c07ca174517948739e4 so that the template source matches the source, which is 0.0.0.0/0 (a wildcard). The above modification addresses an issue of intermittent packet drops, as discussed in detail below. During an investigation of intermittent dropped packets in AKS (kernel 5.4.0-1061-azure) with IPSec enabled, there was an increase in XfrmInTmplMismatch errors in /proc/net/xfrm_stat as packets were dropped. Tracing revealed that the packets were dropped due to an __xfrm_policy_check failure when the packet was redirected from eth0 (after decryption) to the LXC device of a pod. Further investigation attributed the drops to changes in the forwarding policy. Specifically, the forwarding policy would change as: src 0.0.0.0/0 dst 10.240.0.0/16 dir fwd priority 2975 - tmpl src 0.0.0.0 dst 10.240.0.19 + tmpl src 10.240.0.19 dst 10.240.0.61 proto esp reqid 1 mode tunnel level use And back: src 0.0.0.0/0 dst 10.240.0.0/16 dir fwd priority 2975 - tmpl src 10.240.0.19 dst 10.240.0.61 + tmpl src 0.0.0.0 dst 10.240.0.19 proto esp reqid 1 mode tunnel level use The above change was caused by: func (n *linuxNodeHandler) enableIPsec(newNode *nodeTypes.Node) in pkg/datapath/linux/node.go. Modifying the code to avoid changing the policy eliminated the packet drops. There are two places where the xfrm policy is updated in enableIPsec(): (1) inside UpsertIPsecEndpoint() when an IN policy is specified (as happens if newNode.IsLocal()) (2) in enableIPsec() itself, as introduced by 0b52fd76c0101e966d701c07ca174517948739e4 For example, adding log messages to IpSecReplacePolicyFwd and UpsertIPsecEndpoint produced: level=info msg="IpSecReplacePolicyFwd: src=0.0.0.0/0 dst=10.240.0.61/16 tmplSrc=10.240.0.19/16 tmplDst=10.240.0.61/16" subsys=ipsec level=info msg="UpsertIPsecEndpoint: local:10.240.0.19/16 remote:0.0.0.0/0 fowrard:10.240.0.19/16" subsys=ipsec level=info msg="IpSecReplacePolicyFwd: src=0.0.0.0/0 dst=10.240.0.19/16 tmplSrc=0.0.0.0/0 tmplDst=10.240.0.19/16" subsys=ipsec level=info msg="UpsertIPsecEndpoint: exit" subsys=ipsec Additional testing revealed that the update that resulted in a template with tmplSrc=10.240.0.19/16 was the culprit for the packet drops. Making the source template in update (2) match the source, which is a wildcard, eliminated the packet drops. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
2e652a4 ipsec: Fix L7 with endpoint routes [ upstream commit 3ffe49e181727c1c964e0b4f4d89e6c511b9e44e ] With the previous patch, when IPsec and endpoint routes are enabled, packets flow directly from bpf_network to bpf_lxc via the Linux stack, instead of going through bpf_host. However, we noticed that, when L7 policies are applied, connectivity fails between the proxy and the destination: 43.808: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-endpoint FORWARDED (TCP Flags: SYN) 43.808: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 UNKNOWN 5 (TCP Flags: SYN) 43.808: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 to-proxy FORWARDED (TCP Flags: SYN) 43.808: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 from-proxy FORWARDED (TCP Flags: SYN, ACK) 43.808: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 to-endpoint FORWARDED (TCP Flags: SYN, ACK) 43.808: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-endpoint FORWARDED (TCP Flags: ACK) 43.808: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 to-proxy FORWARDED (TCP Flags: ACK) 43.810: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-endpoint FORWARDED (TCP Flags: ACK, PSH) 43.810: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 to-proxy FORWARDED (TCP Flags: ACK, PSH) 43.810: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 from-proxy FORWARDED (TCP Flags: ACK) 43.810: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 to-endpoint FORWARDED (TCP Flags: ACK) 43.810: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 http-request FORWARDED (HTTP/1.1 GET http://10.240.0.55:8080/) 43.812: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-network FORWARDED (TCP Flags: SYN) 43.812: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-stack FORWARDED (TCP Flags: SYN) 43.812: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 to-endpoint FORWARDED (TCP Flags: SYN) 43.812: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 from-endpoint FORWARDED (TCP Flags: SYN, ACK) 43.812: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 to-stack FORWARDED (TCP Flags: SYN, ACK) 43.813: cilium-test/echo-other-node-697d5d69b7-lglxf:8080 -> cilium-test/client2-6dd75b74c6-9nxhx:45704 from-network FORWARDED (TCP Flags: SYN, ACK) 44.827: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-network FORWARDED (TCP Flags: SYN) 44.827: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 from-stack FORWARDED (TCP Flags: SYN) 44.827: cilium-test/client2-6dd75b74c6-9nxhx:45704 -> cilium-test/echo-other-node-697d5d69b7-lglxf:8080 to-endpoint FORWARDED (TCP Flags: SYN) We can see the TCP handshake between the client and proxy, followed by an attempt to perform the TCP handshake 
between the proxy and server. That second part fails, as the SYN+ACK packets sent by the server never seem to reach the proxy; they are dropped somewhere after the from-network observation point. At the same time, we can see the IPsec error counter XfrmInNoPols increasing on the client node. This indicates that SYN+ACK packets were dropped after decryption, because the XFRM state used for decryption doesn't match any XFRM policy. The XFRM state used for decryption is: src 0.0.0.0 dst 10.240.0.18 proto esp spi 0x00000003 reqid 1 mode tunnel replay-window 0 mark 0xd00/0xf00 output-mark 0xd00/0xf00 aead rfc4106(gcm(aes)) 0x3ec3e84ac118c3baf335392e7a4ea24ee3aecb2b 128 anti-replay context: seq 0x0, oseq 0x0, bitmap 0x00000000 sel src 0.0.0.0/0 dst 0.0.0.0/0 And it should match the following XFRM policy: src 0.0.0.0/0 dst 10.240.0.0/16 dir in priority 0 mark 0xd00/0xf00 tmpl src 0.0.0.0 dst 10.240.0.18 proto esp reqid 1 mode tunnel After the packet is decrypted however, we hit the following rule in iptables because we're going to the proxy. -A CILIUM_PRE_mangle -m socket --transparent -m comment --comment "cilium: any->pod redirect proxied traffic to host proxy" -j MARK --set-xmark 0x200/0xffffffff As a result, the packet mark is set to 0x200 and we don't match the 0xd00 packet mark of the XFRM policy anymore. The packet is therefore dropped with XfrmInNoPols. To avoid this, we can simply mark the XFRM policy optional when endpoint routes are enabled, in the same way we do for tunneling. Fixes: 287f49c2 ("cilium: encryption, fix redirect when endpoint routes enabled") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
35594d1 config: Fix incorrect packet path with IPsec and endpoint routes [ upstream commit 573159dffc68aeb3a48280661063a97f0dd9cc55 ] Note: this commit was previously applied as 7ef59aa9 and reverted due to a broken test in commit 346713b2. When endpoint routes are enabled, we attach a BPF program on the way to the container and add a Linux route to the lxc interface. So when coming from bpf_network with IPsec, we should use that route to go directly to the lxc device and its attached BPF program. In contrast, when endpoint routes are disabled, we run the BPF program for ingress pod policies from cilium_host, via a tail call in bpf_host. Therefore, in that case, we need to jump from bpf_network to cilium_host first, to follow the correct path to the lxc interface. That's what commit 287f49c2 ("cilium: encryption, fix redirect when endpoint routes enabled") attempted to implement for the case where endpoint routes are enabled. Its goal was to go directly from bpf_network to the stack in that case, to use the per-endpoint Linux routes to the lxc device. That commit, however, implements a no-op change: ENABLE_ENDPOINT_ROUTES is defined as a per-endpoint setting, but then used in bpf_network, which is not tied to any endpoint. In practice, that means the macro is defined in the ep_config.h header files used by bpf_lxc, whereas bpf_network (from which the macro is used) relies on the node_config.h header file. The fix is therefore simple: we need to define ENABLE_ENDPOINT_ROUTES as a global config, written in node_config.h. To reproduce the bug and validate the fix, I deployed Cilium on GKE (where endpoint routes are enabled by default) with: helm install cilium ./cilium --namespace kube-system \ --set nodeinit.enabled=true \ --set nodeinit.reconfigureKubelet=true \ --set nodeinit.removeCbrBridge=true \ --set cni.binPath=/home/kubernetes/bin \ --set gke.enabled=true \ --set ipam.mode=kubernetes \ --set nativeRoutingCIDR=$NATIVE_CIDR \ --set nodeinit.restartPods=true \ --set image.repository=docker.io/pchaigno/cilium-dev \ --set image.tag=fix-ipsec-ep-routes \ --set operator.image.repository=quay.io/cilium/operator \ --set operator.image.suffix="-ci" \ --set encryption.enabled=true \ --set encryption.type=ipsec I then deployed the below manifest and attempted a curl request from pod client to the service echo-a. 
metadata: name: echo-a labels: name: echo-a spec: template: metadata: labels: name: echo-a spec: containers: - name: echo-a-container env: - name: PORT value: "8080" ports: - containerPort: 8080 image: quay.io/cilium/json-mock:v1.3.0 imagePullPolicy: IfNotPresent readinessProbe: timeoutSeconds: 7 exec: command: - curl - -sS - --fail - --connect-timeout - "5" - -o - /dev/null - localhost:8080 selector: matchLabels: name: echo-a replicas: 1 apiVersion: apps/v1 kind: Deployment --- metadata: name: echo-a labels: name: echo-a spec: ports: - name: http port: 8080 type: ClusterIP selector: name: echo-a apiVersion: v1 kind: Service --- apiVersion: "cilium.io/v2" kind: CiliumNetworkPolicy metadata: name: "l3-rule" spec: endpointSelector: matchLabels: name: client ingress: - fromEndpoints: - matchLabels: name: echo-a --- apiVersion: v1 kind: Pod metadata: name: client labels: name: client spec: affinity: podAntiAffinity: requiredDuringSchedulingIgnoredDuringExecution: - labelSelector: matchExpressions: - key: app.kubernetes.io/name operator: In values: - echo-a topologyKey: kubernetes.io/hostname containers: - name: netperf args: - sleep - infinity image: cilium/netperf Fixes: 287f49c2 ("cilium: encryption, fix redirect when endpoint routes enabled") Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
d47a408 bpf: Refactor egressgw code into is_cluster_destination [ upstream commit 4a10eaa4bb457cc9a2f0fa4f7623ae9f72bbf101 ] Move several check from bpf_lxc into a helper function is_cluster_destination(). Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
74f5c08 bpf: Move egress gateway code to its own lib/ file [ upstream commit f6434eb5c389bcd0b66853244c209a8b2df6f5e7 ] Signed-off-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
2bc36e3 test: Disable unreliable K8sBookInfoDemoTest test [ upstream commit ef16ed7e38075166db22701cd685bf6f9bd0a530 ] Refs https://github.com/cilium/cilium/issues/17401. Signed-off-by: Tom Payne <tom@isovalent.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 13:29:12 UTC
d7e0934 nodediscovery: Fix local host identity propagation [ upstream commit 7bf60a59f0720e5546c68dccf4e4fa133407355f ] The local NodeDiscovery implementation was previously informing the rest of the Cilium agent that the local node's identity is "Remote Node" because of the statically initialized "identity.GetLocalNodeID" value. However, that value should only ever be used for external workloads cases in order to prepare the source identity used for transmitting traffic to other Cilium nodes. It should not be used for locally determining the identity of traffic coming from the host itself. Fix this by hardcoding the identity to "Host" identity. Fixes: c864fd3cf5cd ("daemon: Split IPAM bootstrap, join cluster in between") Signed-off-by: Joe Stringer <joe@cilium.io> Signed-off-by: Timo Beckers <timo@isovalent.com> 23 November 2021, 16:03:49 UTC
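A simplified Go sketch of the idea behind the fix: classify traffic from the local node as the reserved host identity rather than a statically initialized node ID. The numeric values and names below are illustrative only.

package main

import "fmt"

// numericIdentity is a stand-in for Cilium's numeric security identities.
type numericIdentity uint32

const (
	identityHost       numericIdentity = 1 // reserved identity for the local host
	identityRemoteNode numericIdentity = 6 // reserved identity for other nodes
)

// localNodeIdentity returns the identity the agent should advertise internally
// for traffic originating from the node it runs on. The bug effectively
// returned the remote-node value here; the fix pins it to the host identity.
func localNodeIdentity() numericIdentity {
	return identityHost
}

func main() {
	fmt.Println("local host traffic classified as identity", localNodeIdentity())
}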