https://github.com/cilium/cilium

Revision Author Date Message Commit Date
43f2f74 workflows: fix concurrency group names Improve concurrency group names uniqueness, in particular for allowing testing workflow changes via `pull_request` events. In the previous version, all `pull_request`-triggered runs would end up in the same concurrency group as the `scheduled` events, due to not having a `github.event.issue` object. This new version proposes a new structure that should be unique for all types of testing, while still allowing runs of the same type to override each other: Structure: - Workflow name - Event type - A unique identifier depending on event type: - schedule: SHA - issue_comment: PR number + trigger phrase - pull_request: PR number + label name Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 01 July 2021, 09:24:29 UTC
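The three-part structure described in this commit (workflow name, event type, event-specific identifier) can be sketched as a small helper. This is only an illustration: the real workflows build the name with GitHub Actions expressions, and the function name, parameter names, and `_` separator below are assumptions, not taken from the commit.

```python
def concurrency_group(workflow, event_name, *, sha=None, pr_number=None,
                      trigger_phrase=None, label=None):
    """Build a group name: workflow name, event type, unique identifier.

    Hypothetical helper; the separator and argument names are assumptions.
    """
    if event_name == "schedule":
        ident = sha                                # schedule: SHA
    elif event_name == "issue_comment":
        ident = f"{pr_number}_{trigger_phrase}"    # PR number + trigger phrase
    elif event_name == "pull_request":
        ident = f"{pr_number}_{label}"             # PR number + label name
    else:
        raise ValueError(f"unsupported event type: {event_name}")
    return f"{workflow}_{event_name}_{ident}"
```

Because the event type is part of the name, a `pull_request` run can no longer land in the same group as a `schedule` run, while two runs of the same type for the same PR still share a group and override each other.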
ca94866 DO NOT MERGE: TESTING ONLY Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 01 July 2021, 09:24:29 UTC
676ee45 ci/conformance: Wait for hubble-relay-ci image This commit extends the conformance tests (all of which are using `cilium hubble enable` to deploy Hubble Relay) to also wait for the `hubble-relay-ci` image to be built. In the multicluster test, we also wait for the `clustermesh-apiserver-ci` image. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 01 July 2021, 09:24:29 UTC
4f76408 ci-multicluster: Pull in correct image version Pull in the correct image version built as part of CI when deploying `clustermesh-apiserver`. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 01 July 2021, 09:24:29 UTC
929c28f k8s: Fix External Workloads service access NewClusterService() and EqualsClusterService() were inadvertently broken when support for ipFamilies was added, leaving the 'Ports' in service empty. This made all cluster services inaccessible to External Workloads. Fix this by collecting the ports across all front-ends. This assumes that each front-end will be serving the same ports. Fixes: #14914 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 01 July 2021, 04:45:03 UTC
dc7df4d daemon: Add Azure IPAM mode for setting the native routing CIDR This will allow the router IP restoration logic to pick up the correct pod CIDR to validate the router IP. This also fixes the issue where upon Cilium restart, additional IPs were erroneously assigned to `cilium_host`. Signed-off-by: Chris Tarazi <chris@isovalent.com> 01 July 2021, 04:29:47 UTC
8d8a7f8 azure, ipam, k8s: Derive primary / VPC CIDR of Azure interface To align with other CRD-backed IPAM modes such as ENI and Alibaba, derive the VPC CIDR from the Azure API and set it as the native routing CIDR. This enables the subsequent commit to use the CIDR to validate the router IPs upon restoration. Signed-off-by: Chris Tarazi <chris@isovalent.com> 01 July 2021, 04:29:47 UTC
fc06cbc ipam: Fix return inside deriveVpcCIDR() The `return` statement wasn't placed in the correct place, as the code should return as soon as a valid result is found. Signed-off-by: Chris Tarazi <chris@isovalent.com> 01 July 2021, 04:29:47 UTC
eb9a5c4 docs: update the version specific notes table Updates the table in the "Version Specific Notes" subsection of the "Upgrade" page in order to be explicit about the supported upgrade paths. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> 30 June 2021, 18:54:22 UTC
42da6c2 workflows: update Kind version to 0.11.1 This is necessary to work around a probable GH infrastructure issue where 0.9.0 suddenly started not to work in GH Actions: https://github.com/helm/kind-action/issues/42 Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 30 June 2021, 17:27:47 UTC
b9977ff build(deps): bump helm/kind-action from 1.1.0 to 1.2.0 Bumps [helm/kind-action](https://github.com/helm/kind-action) from 1.1.0 to 1.2.0. - [Release notes](https://github.com/helm/kind-action/releases) - [Commits](https://github.com/helm/kind-action/compare/7a937c0fb648064a83b8b9354151e5e543d9fcec...94729529f85113b88f4f819c17ce61382e6d8478) --- updated-dependencies: - dependency-name: helm/kind-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 30 June 2021, 17:27:47 UTC
a2a2496 test: Add klog lock error to allow-list This error message happened in CI and seems to be a less frequent variation of known klog error messages [1]. 1 - https://github.com/cilium/cilium/issues/16402#issuecomment-871155492 Signed-off-by: Paul Chaignon <paul@cilium.io> 30 June 2021, 17:07:14 UTC
f8e794b Revert "docs: deprecate native-routing-cidr from v1.10" This reverts commit 4bd21885ad8b3aaa1486db4c5412f123f95e95a1. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 30 June 2021, 14:23:21 UTC
ffe255e Pick up cilium-cli v0.8.3 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 30 June 2021, 14:23:04 UTC
8e71e30 .github: Fix image digest job printing Commit 044afab2ecf3 ("ci: Set up qemu in images workflow and build cilium-test") introduced a new dependency for the "Display image digests" job which provides a nice single page that developers can use to find all of the CI image artifacts generated from the PR. Unfortunately the new dependency is conditionally skipped, and when it is skipped, the display digests job is also skipped because the dependency is skipped. This is a known behaviour of GitHub actions/runner, issue 491. This commit works around the issue by explicitly running the job if the dependency was skipped, allowing the "display image digests" job to run even when the qemu job is skipped. Signed-off-by: Joe Stringer <joe@cilium.io> 30 June 2021, 14:22:27 UTC
a12f4e4 build(deps): bump docker/setup-buildx-action from 1.3.0 to 1.4.1 Bumps [docker/setup-buildx-action](https://github.com/docker/setup-buildx-action) from 1.3.0 to 1.4.1. - [Release notes](https://github.com/docker/setup-buildx-action/releases) - [Commits](https://github.com/docker/setup-buildx-action/compare/0d135e0c2fc0dba0729c1a47ecfcf5a3c7f8579e...a1c666d855a037f439ebb7bf701ee144fcadd307) --- updated-dependencies: - dependency-name: docker/setup-buildx-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 29 June 2021, 21:57:35 UTC
4bd2188 docs: deprecate native-routing-cidr from v1.10 Given that the change to deprecate the native-routing-cidr option in favor of ipv4-native-routing-cidr was supposed to be backported to v1.10, update the docs to move the deprecation notice under the v1.10 section. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 29 June 2021, 21:57:20 UTC
9324ba0 feat: generate tls certs for ui on helm install Signed-off-by: Renat Tuktarov <yandzeek@gmail.com> 29 June 2021, 21:55:57 UTC
ff63b07 daemon, node: Fix faulty router IP restoration logic When running in ENI or Alibaba IPAM mode, or any CRD-backed IPAM mode ("crd") and upon Cilium restart, it was very likely that `cilium_host` was assigned an additional IP. Below is a case where Cilium was restarted 3 times, hence getting 3 additional router IPs: ``` 4: cilium_host@cilium_net: <BROADCAST,MULTICAST,NOARP,UP,LOWER_UP> mtu 9001 qdisc noqueue state UP group default qlen 1000 link/ether 66:03:3c:07:8c:47 brd ff:ff:ff:ff:ff:ff inet 192.168.35.9/32 scope link cilium_host valid_lft forever preferred_lft forever inet 192.168.34.37/32 scope link cilium_host valid_lft forever preferred_lft forever inet 192.168.57.107/32 scope link cilium_host valid_lft forever preferred_lft forever inet6 fe80::6403:3cff:fe07:8c47/64 scope link valid_lft forever preferred_lft forever ``` This was because in CRD-backed IPAM modes, we wait until we fully sync with K8s in order to derive the VPC CIDR, which becomes the pod CIDR on the node. Since the router IP restoration logic was using a different pod CIDR during the router IP validation check, it was erroneously discarding it. This was observed with: ``` 2021-06-25T13:59:47.816069937Z level=info msg="The router IP (192.168.135.3) considered for restoration does not belong in the Pod CIDR of the node. Discarding old router IP." cidr=10.8.0.0/16 subsys=node ``` This is problematic because the extraneous router IPs could be also assigned to pods, which would break pod connectivity. The fix is to break up the router IP restoration process into 2 parts. The first is to attempt a restoration of the IP from the filesystem (`node_config.h`). We also fetch the router IPs from Kubernetes resources since they were already retrieved prior inside k8s.WaitForNodeInformation(). Then after the CRD-backed IPAM is initialized and started (*Daemon).startIPAM() is called, we attempt the second part. 
This includes evaluating which IPs (either from filesystem or from K8s) should be set as the router IPs. The IPs from the filesystem take precedence. In case the node was rebooted, the filesystem will be wiped so then we'd rely on the IPs from the K8s resources. At this point in the daemon initialization, we have the correct CIDR range as the pod CIDR range to validate the chosen IP. Fixes: beb8bdea3 ("k8s, node: Restore router IPs (`cilium_host`) from K8s resource") Signed-off-by: Chris Tarazi <chris@isovalent.com> 29 June 2021, 20:33:50 UTC
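The second part of the restoration (pick a candidate, then validate it against the now-correct pod CIDR) can be sketched as follows. This is a simplified Python illustration of the decision described above, not the daemon's Go code; the function name and arguments are hypothetical.

```python
import ipaddress

def choose_router_ip(fs_ip, k8s_ip, pod_cidr):
    """Pick the router IP to restore, or None if no candidate is valid.

    fs_ip:   IP restored from the filesystem (None if wiped, e.g. after reboot)
    k8s_ip:  IP retrieved from the Kubernetes resource (None if absent)
    pod_cidr: CIDR known only after the CRD-backed IPAM has started
    """
    # The filesystem IP takes precedence; fall back to the K8s one.
    chosen = fs_ip if fs_ip is not None else k8s_ip
    if chosen is None:
        return None
    # Validate the chosen IP against the correct pod CIDR.
    if ipaddress.ip_address(chosen) in ipaddress.ip_network(pod_cidr):
        return chosen
    return None
```

With the wrong CIDR, as in the log message quoted above (`192.168.135.3` checked against `10.8.0.0/16`), the same check returns `None`, which is why the validation has to wait until the correct CIDR is available.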
504e19e config: add validation for IPv4NativeRoutingCIDR Make sure the provided range is actually a v4 CIDR. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 29 June 2021, 06:27:56 UTC
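A check of this kind can be sketched with Python's standard `ipaddress` module. The function name is hypothetical and the agent's actual Go validation may differ in detail (for example, this sketch also accepts a bare address as a /32).

```python
import ipaddress

def validate_ipv4_cidr(value):
    """Return the parsed network if `value` is an IPv4 CIDR, else raise ValueError."""
    try:
        # strict=False tolerates host bits being set, e.g. "10.0.0.1/8".
        net = ipaddress.ip_network(value, strict=False)
    except ValueError as exc:
        raise ValueError(f"invalid CIDR {value!r}") from exc
    if net.version != 4:
        raise ValueError(f"{value!r} is not an IPv4 CIDR")
    return net
```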
792ed5a daemon: rename native-routing-cidr option to ipv4-native-routing-cidr This commit renames the native-routing-cidr option to ipv4-native-routing-cidr-option, to make it more clear that the flag expects an IPv4 range. In addition to that, also rename the nativeRoutingCIDR Helm option to ipv4NativeRoutingCIDR (marking nativeRoutingCIDR as deprecated). Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 29 June 2021, 06:27:56 UTC
28e7e39 endpoint: Do not panic in Finalize() Panicking in Finalize functions may leave the endpoint locked and brick the whole agent. Better to avoid it and log errors instead, and unlock the Endpoint in a defer if it still happens. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 29 June 2021, 01:09:22 UTC
5839d23 iptables: Keep old rules while adding new ones Keep old iptables rules by renaming Cilium chains so that new rules can be added while old are still in use. Copy old TPROXY rules from the renamed old rules. Remove the backups only after new rules have been successfully added. This change makes it possible to keep old rules in effect while adding new ones without special consideration for transient rules. On first initialization only copy over the DNS proxy TPROXY rules, as other proxies can't reuse old proxy ports across restarts. Pick the last applicable proxy port from iptables, if multiple are present. Remove stale TPROXY rules once the current port is known. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 29 June 2021, 01:09:22 UTC
537715a iptables: Add rudimentary unit testing Wrap "iptables" and "ip6tables" programs with iptablesInterface so that unit testing can mock up the executables. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 29 June 2021, 01:09:22 UTC
9d4e99d tests: rework custom calls's AfterEach/AfterAll blocks to skip if needed The AfterAll() and AfterEach() blocks in the test file for custom calls run every time, even if the Context block for the actual tests is skipped. In that case, running the final blocks results in an attempt to remove deployments that have never been set up in the first place. This may lead to the blocks failing when the tests were in fact skipped, and may produce test artifacts even though Jenkins does not consider the test failed. Let's reorganise those blocks, to make sure they are called only when necessary. Note that we do need to keep both DeleteCilium() and DeleteAll(), even if they are now in the same block, as calling only DeleteAll() would not remove the Cilium ConfigMap. Fixes: 37f6192c9e77 ("test: add CI test for tail calls hooks for custom programs") Fixes: #13191 Fixes: #16633 Reported-by: Paul Chaignon <paul@cilium.io> Signed-off-by: Quentin Monnet <quentin@isovalent.com> 28 June 2021, 23:12:58 UTC
d1a3b70 logging: split syslog functionality into separate file This package is a transitive dependency in cilium-cli, but currently it fails to build on platforms without syslog support (e.g. windows). However, we still want to be able to build it for some of these platforms, see e.g. https://github.com/cilium/cilium-cli/issues/231. Fix the build by moving all functionality specific to syslog into a separate file, protected by build tags. The functionality will be stubbed out on platforms without syslog. Signed-off-by: Tobias Klauser <tobias@cilium.io> 28 June 2021, 23:09:58 UTC
1174622 logging: factor complete syslog setup into setupSyslog Move all syslog setup steps into function setupSyslog. No functional changes, this is in preparation for moving syslog specific functionality to a build tag protected file for non-linux build support. Signed-off-by: Tobias Klauser <tobias@cilium.io> 28 June 2021, 23:09:58 UTC
51df515 .github: Set commit status to error when workflow are cancelled GitHub jobs are usually set to status 'error' when cancelled. We should do the same for ci-xxx jobs when they are cancelled. Having the state appear as an error clarifies that the author, janitor, and reviewers should take notice of that workflow's result. Signed-off-by: Paul Chaignon <paul@cilium.io> 28 June 2021, 23:06:22 UTC
f263235 docs: Add troubleshooting steps to the kube-proxy free guide Document the requirement that Cilium agent needs to be able to attach BPF cgroup programs at the host cgroup root, in order for socket-based load balancing (aka host-reachable services) to be effective for other pods and host processes. More details in the PR - https://github.com/cilium/cilium/pull/16259 Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
ca8456c docs: Document failure scenario for kind deployment Deploying a kind cluster in an environment where Cilium is already running (for example, in the Cilium development VM) can lead to Cilium pods crashing. This can also happen if there are other BPF cgroup programs attached to the parent ``cgroup`` hierarchy of the kind container nodes. Relevant Linux kernel code reference - https://elixir.bootlin.com/linux/latest/source/kernel/bpf/cgroup.c#L457. Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
0c166f6 Revert "cgroups: Determine cgroup v2 hierarchy root for Kind" This reverts commit e9ce8306400bf416087046b8d5b013b23ebdcb3e. This logic is no longer needed as we mount cgroup v2 filesystem from the underlying kubernetes node. This will enable cilium to correctly attach BPF programs at every `kind` node's cgroup root. Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
8265314 cilium-daemonset: Host cgroup root mount as alternative to auto-mount Cilium agent daemonset auto mounts cgroup2 filesystem on the host by default. However, it needs to mount host's `/proc` inside an init container in order to do that. To disable this auto-mount behavior, we introduce a helm option. When auto-mount is disabled, users can specify the mount point on the underlying host where cgroup v2 fs is already mounted. We then volume mount this directory inside the cilium agent pod. The reason why we don't set the host cgroup2 mount point to a hard-coded path such as `/sys/fs/cgroup`, is because cgroup2 filesystem mount point can be platform dependent. See this note in the cgroup manpage [1] - >Note that on many modern systems, systemd(1) automatically mounts the cgroup2 filesystem at /sys/fs/cgroup/unified during the boot process. [1] https://man7.org/linux/man-pages/man7/cgroups.7.html Suggested-by: Kornilios Kourtis <kornilios@isovalent.com>. Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
fa8bea4 cilium-daemonset: Fix ineffective socket-lb caused by incorrect cgroup2 fs mount If container runtimes are run with cgroup v2, Cilium agent pod would be deployed in a separate cgroup namespace. For example, Docker container runtime with cgroupv2 support switched to private cgroup namespace mode as the default [1]. Due to cgroup namespaces [2], the cgroup fs mounted by the Cilium pod points to a virtualized cgroup hierarchy instead of the host cgroup root. As a result, BPF programs are attached to the nested cgroup root, and socket-lb isn't effective for other pods. Fix: Mount cgroup2 fs from the host so that BPF programs are attached at the host cgroup root. A new init container is added to the Cilium Daemonset that mounts cgroup2 fs on the host. The `/proc/1/ns/` directory on the host is required to be mounted so that cgroup and mount namespaces are enabled as enterable namespaces while running the `nsenter` command. Additionally, cgroup2 fs can be attached to different paths so let's mount it on the host at a cilium-specific custom location. Cilium can thus have control over the location (e.g., create the directory if it doesn't exist). This also helps in effectively identifying if a cgroup2 mount already exists at the custom location. [1] https://docs.docker.com/config/containers/runmetrics/#running-docker-on-cgroup-v2 [2] https://man7.org/linux/man-pages/man7/cgroup_namespaces.7.html Reported-By: Kornilios Kourtis <kornilios@isovalent.com> Fixes: #15137 Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
8b9bc2e defaults: Update default cgroup root `/var/run` is a symlink to `/run` on most platforms, and may not always be present. Also, this is consistent with the `DefaultMapRootFallback` currently configured in the agent. Example - $ sudo mount -t cgroup2 none /var/run/cilium/cgroupv2 $ mount | grep cgroup none on /run/cilium/cgroupv2 type cgroup2 (rw,relatime) Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
afb1c9c pkg/cgroup: Fix a typo Signed-off-by: Aditi Ghag <aditi@cilium.io> 28 June 2021, 12:42:25 UTC
8162de8 .github: Test IPsec with high value for keyID We want to use the highest possible value for the key ID in end-to-end tests to be sure to catch packet mark conflicts if any arises. By using the value corresponding to all bits set to 1, any conflict will be caught. Signed-off-by: Paul Chaignon <paul@cilium.io> 28 June 2021, 12:41:53 UTC
76ab456 ci: use git status instead of git diff to check for a clean state Before this patch, git diff was used to ensure a clean state with respect to the git repository. While it could catch a modified or removed file, it would not fail when a file was not checked in git (i.e. untracked). This patch uses git status instead of git diff, effectively catching untracked files as well. Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 25 June 2021, 17:22:15 UTC
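The difference can be illustrated with `git status --porcelain`, which lists untracked files (prefixed with `??`) in addition to modified and removed ones. The sketch below is an assumption about how such a check could look in Python; the commit does not say which `git status` flags the CI uses, and the helper names are hypothetical.

```python
import subprocess

def is_clean(porcelain_output):
    """True if `git status --porcelain` reported nothing at all."""
    return porcelain_output.strip() == ""

def working_tree_is_clean(repo_dir="."):
    """Run git in `repo_dir` and report whether the tree is clean."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo_dir, capture_output=True, text=True, check=True,
    )
    return is_clean(result.stdout)
```

An untracked file yields a line such as `?? new_file.txt`, so `is_clean` returns `False`, whereas a plain `git diff` would print nothing for it.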
40b88a6 test: Don't skip encapsulation tests on GKE Signed-off-by: Paul Chaignon <paul@cilium.io> 25 June 2021, 14:25:19 UTC
776af64 Fix typo. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> 25 June 2021, 09:02:34 UTC
2b77f96 Add missing descriptions to 'Helm Reference'. Some fields appear in the 'Helm Reference' page without an associated description. This commit aims at fixing that. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> 25 June 2021, 09:02:34 UTC
ea0820e Include codegen warning in 'helm-values.rst'. Include a comment in 'helm-values.rst' indicating that the file is generated automatically. This will hopefully limit the risk of having contributors opening PRs to edit it directly. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> 25 June 2021, 09:02:34 UTC
93392dd Add 'helm-values.rst' to '.gitattributes'. Adding so that GitHub automatically folds its diff by default for reviews given that it is a generated file. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> 25 June 2021, 09:02:34 UTC
8d4f1ea Use a fork of 'helm-docs'. 'helm-docs' has a bug which causes it to include comments belonging to previously-appearing but commented-out fields. A fix has been proposed in https://github.com/norwoodj/helm-docs/pull/99, but hasn't been reviewed yet. While said PR doesn't get merged it's preferable to switch to a fork containing the fix so we can have a proper description for our Helm chart fields. Signed-off-by: Bruno Miguel Custódio <brunomcustodio@gmail.com> 25 June 2021, 09:02:34 UTC
70bd2ac Pick up cilium-cli v0.8.2 This release fixes cilium v1.10 install on AKS. Ref: https://github.com/cilium/cilium-cli/releases/tag/v0.8.2 Signed-off-by: Michi Mutsuzaki <michi@isovalent.com> 25 June 2021, 07:46:24 UTC
448dc50 build(deps): bump docker/login-action from 1.9.0 to 1.10.0 Bumps [docker/login-action](https://github.com/docker/login-action) from 1.9.0 to 1.10.0. - [Release notes](https://github.com/docker/login-action/releases) - [Commits](https://github.com/docker/login-action/compare/28218f9b04b4f3f62068d7b6ce6ca5b26e35336c...f054a8b539a109f9f41c372932f1ae047eff08c9) --- updated-dependencies: - dependency-name: docker/login-action dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 24 June 2021, 17:50:06 UTC
4ddb158 contrib: Identify upstream commits by author and date When listing the commits of pull requests to backport, GitHub doesn't offer a way to find the corresponding commits merged in master. We therefore have to do it manually. To that end, we first retrieve a candidate commit by matching on the exact commit title. Several commits can have the same title however, so we need another check to confirm the candidate commit is the same commit as the pull request's. We currently use 'git patch-id' for the second check. That command computes a unique ID for a patch. It can however have false negatives. For example, 9515d1e ("docs: add a reference of helm values") and de62fa3 ("docs: add a reference of helm values") refer to the same patch, the first being from the pull request and the second from master (i.e., once merged). Nevertheless, when we run 'git patch-id', we get two different IDs: $ git show 9515d1e | git patch-id 5d928411d72fcdb5c9c24ab2138896e6709e578c 9515d1ea37f1d1122ece73cf061cf47590e90f9e $ git show de62fa3 | git patch-id de14f63774d0f56ecc1e22db615987bedffe1e4b de62fa37c9ac679fd45bb617e8759dd7a4918ccb Comparing the two commits shows that the difference is actually due to changes not introduced by this commit: $ diff <(git show 9515d1e) <(git show de62fa3) [...] 1997,1998c1997,1998 < @@ -118,7 +118,7 @@ contributors across the globe, there is almost always someone available to help. < | debug.enabled | bool | `false` | Enable debug logging | --- > @@ -119,7 +119,7 @@ contributors across the globe, there is almost always someone available to help. > | disableEndpointCRD | string | `"false"` | Disable the usage of CiliumEndpoint CRD | [...] We however don't need to use 'git patch-id'. Using the author's email address and date (+ commit title) is usually enough to uniquely identify commits on master. If someone sends two commits with the same title and author date (to the second), then they are definitely trying to game the system. 
In that unlikely event, we have two rounds of reviews (original pull request and backport pull request) to catch it. This commit implements that change. "%ae%at" (author email followed by author date without spaces) is used as the commit ID instead of the ID generated by git patch-id. Signed-off-by: Paul Chaignon <paul@cilium.io> 24 June 2021, 09:29:22 UTC
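The matching scheme (title first, then author email plus author date as the commit ID) can be sketched as below. This is a Python illustration rather than the contrib script itself; the dictionary layout is hypothetical and the email, timestamp, and SHA values in the usage example are made up.

```python
def commit_id(author_email, author_timestamp):
    """Mimic git's "%ae%at": author email followed by author date, no space."""
    return f"{author_email}{author_timestamp}"

def find_upstream(pr_commit, master_commits):
    """Return the SHA of the master commit matching `pr_commit`, or None.

    A candidate must have the same title and the same author email + date.
    """
    wanted = commit_id(pr_commit["email"], pr_commit["timestamp"])
    for candidate in master_commits:
        if (candidate["title"] == pr_commit["title"]
                and commit_id(candidate["email"], candidate["timestamp"]) == wanted):
            return candidate["sha"]
    return None

# Usage with made-up data: same title twice, different author dates.
pr = {"title": "docs: fix typo", "email": "dev@example.com", "timestamp": 1600000000}
master = [
    {"sha": "aaaaaaa", "title": "docs: fix typo", "email": "dev@example.com", "timestamp": 1500000000},
    {"sha": "bbbbbbb", "title": "docs: fix typo", "email": "dev@example.com", "timestamp": 1600000000},
]
print(find_upstream(pr, master))
```

Unlike `git patch-id`, this ID does not depend on the diff context, so it stays the same when surrounding lines change between the pull request and master.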
04def89 images: Remove trailing newlines from before computing SHA256 Some versions of buildx may add a newline at the end of the output for "buildx imagetools inspect". That additional newline leads to a different SHA256 hash: $ docker buildx version github.com/docker/buildx v0.3.1-tp-docker 6db68d029599c6710a32aa7adcba8e5a344795a7 $ docker buildx imagetools inspect "quay.io/cilium/cilium-runtime:e5902a650726387b39d080ce77a9ef6ccb89eabc" --raw 2>/dev/null | sha256sum | cut -d " " -f 1 7b0efa641ec89ee9860abfcce9a699765fee0f6f6337d4fe2d5cef09f60eb88c $ docker buildx imagetools inspect "quay.io/cilium/cilium-runtime:e5902a650726387b39d080ce77a9ef6ccb89eabc" --raw 2>/dev/null | perl -0pe 's/\n\Z//' | sha256sum | cut -d " " -f 1 38995ce0cf801983fb3706ed76fa3df03572d4cd7c0d2c4281fe622e7cd77e51 This in turn means the local and CI runs of the update-xxx-image make targets may lead to different image tags. This commit fixes it by removing any trailing newline from the output. Co-authored-by: Quentin Monnet <quentin@isovalent.com> Signed-off-by: Paul Chaignon <paul@cilium.io> 24 June 2021, 09:28:51 UTC
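The effect of the trailing newline on the digest is easy to reproduce. The sketch below mirrors the `perl` one-liner by dropping one trailing newline before hashing; it is a Python illustration with a hypothetical function name, not the Makefile logic itself.

```python
import hashlib

def manifest_digest(raw):
    """SHA256 hex digest of `raw` (bytes), ignoring one trailing newline."""
    if raw.endswith(b"\n"):
        raw = raw[:-1]
    return hashlib.sha256(raw).hexdigest()
```

Two outputs that differ only by that final newline now hash to the same value, so local and CI runs would derive the same tag.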
81b1db7 test: Comment to warn against adding new level=err exceptions Suggested-by: Joe Stringer <joe@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 23 June 2021, 17:55:42 UTC
e68d43f test: Update list of allowed error logs https://github.com/cilium/cilium/pull/16477 was merged and a new error, https://github.com/cilium/cilium/issues/16402#issuecomment-861544964 was discovered since the PR disallowing level=error in CI was merged. Signed-off-by: Paul Chaignon <paul@cilium.io> 23 June 2021, 17:55:42 UTC
4d8a86e Add Form3 to users 23 June 2021, 17:51:40 UTC
edf76fb vagrant: Bump all Vagrant box versions Mostly to pick up the latest commits on bpf-next, which fix vulnerabilities but may increase complexity. Signed-off-by: Paul Chaignon <paul@cilium.io> 22 June 2021, 19:36:16 UTC
ba4acfe daemon: Warn on disabling iptables I'm looking forward to a time when we no longer need to configure iptables. However, for the moment there's a couple of minor features we use to handle policy and forwarding correctly which rely on iptables. Furthermore, even if all of this is implemented in eBPF, the user's environment may still have iptables configured and this can then interfere with the Cilium traffic handling, depending on how Cilium is configured. For now, it likely makes sense to warn users that disabling this flag could lead to unexpected policy and forwarding behaviour. Once we've resolved the linked issue, maybe we can think about reverting this to an info message to account for the compatibility case mentioned above. Signed-off-by: Joe Stringer <joe@cilium.io> 22 June 2021, 19:35:35 UTC
d9eff9a ci: Bump cilium-cli version Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 22 June 2021, 18:59:21 UTC
f9abe64 ci: Enable flow validation Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 22 June 2021, 18:59:21 UTC
2c10568 test: re-enable K8sDatapathConfig Host firewall tests This commit re-enables the "K8sDatapathConfig Host firewall tests With native routing" and "K8sDatapathConfig Host firewall tests With native routing and endpoint routes" tests to run with kube-proxy Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 22 June 2021, 12:42:56 UTC
31927a2 bpf: fix iptables masquerading for node -> remote pod traffic When Cilium runs with KPR, host-firewall or bandwidth manager, it will try to auto-derive one or more devices to which the bpf_host program is attached. This program will, among other things, redirect ingress traffic destined to a pod to the pod's lxc device using `bpf_redirect()`. This causes the traffic to bypass the nf_conntrack table, leading to a situation where traffic leaving the pod after the connection's been established will be (incorrectly) masqueraded in case Iptables masquerading is enabled, since the connection is not tracked by netfilter. This commit fixes this by skipping `bpf_redirect()` when we detect this case (i.e. traffic is flowing through bpf_host attached to a physical device and Cilium has installed Iptables rules which require conntrack). Fixes: #14859 Suggested-by: Martynas Pumputis <m@lambda.lt> Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 22 June 2021, 12:42:56 UTC
037a1d0 build(deps): bump github.com/aliyun/alibaba-cloud-sdk-go Bumps [github.com/aliyun/alibaba-cloud-sdk-go](https://github.com/aliyun/alibaba-cloud-sdk-go) from 1.61.1095 to 1.61.1153. - [Release notes](https://github.com/aliyun/alibaba-cloud-sdk-go/releases) - [Changelog](https://github.com/aliyun/alibaba-cloud-sdk-go/blob/master/ChangeLog.txt) - [Commits](https://github.com/aliyun/alibaba-cloud-sdk-go/compare/v1.61.1095...v1.61.1153) --- updated-dependencies: - dependency-name: github.com/aliyun/alibaba-cloud-sdk-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 22 June 2021, 12:35:14 UTC
4e5272b docs: run GitHub action when Charts are touched to check Helm values ref PR #16238 added a reference for the Helm values in the Charts to the documentation. A number of these values are not common words from the dictionary, and need to be added to the list of acceptable words in the spelling list as we update the charts. The GitHub action for documentation is supposed to help with the task, catching omitted keywords. But it is only run when a number of documentation-related files are changed, and this does not currently include the Charts! Let's fix this in order to catch spelling mistakes (or omitted spelling list updates). Fixes: #16238 Signed-off-by: Quentin Monnet <quentin@isovalent.com> 22 June 2021, 09:37:22 UTC
8b83fd3 .github/workflows: install ginkgo for test suite build test The test suite build check for the individual commits of a PR currently fails due to missing ginkgo binary. Install ginkgo v1.12.1 (as per go.mod). Fixes: e260ba9a08cf (".github/workflows: verify that each commit builds for test suite changes") Signed-off-by: Tobias Klauser <tobias@cilium.io> 22 June 2021, 09:37:01 UTC
876e9db fix: missing update verb in hubble-generate-certs If hubble-ca-secret already exists, then certgen is going to update it. To let certgen do its job, we need to configure the update verb in the bound ClusterRole, otherwise it will fail with a cannot update resource \"secrets\" in API group message. Fixes: #16508 Signed-off-by: Alex Szakaly <alex.szakaly@gmail.com> 22 June 2021, 04:50:37 UTC
a75599d policy: Make selectorcache callbacks lock-free Make IdentitySelectionUpdated() callbacks lock-free by queueing them while still holding selectorcache lock (to keep FIFO order) and calling from a goroutine not holding any locks. This prevents deadlocks caused by the implementation of IdentitySelectionUpdated() taking locks such as endpoint or selectorcache locks. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 22 June 2021, 04:48:49 UTC
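The pattern described here (enqueue the notification while holding the lock to preserve FIFO order, then invoke it from a separate worker that holds no locks) can be sketched as follows. This is a Python analogue with hypothetical names, not the selectorcache's Go implementation.

```python
import queue
import threading

class CallbackQueueingCache:
    """Hypothetical cache that defers callbacks to a lock-free worker."""

    def __init__(self):
        self._lock = threading.Lock()
        self._pending = queue.Queue()  # FIFO of (callback, selection)
        threading.Thread(target=self._run, daemon=True).start()

    def update(self, callback, selection):
        with self._lock:
            # State would be mutated here; the callback is only queued,
            # so queue order matches the order of updates.
            self._pending.put((callback, selection))

    def _run(self):
        while True:
            callback, selection = self._pending.get()
            try:
                callback(selection)  # called without holding self._lock
            finally:
                self._pending.task_done()

    def flush(self):
        """Block until all queued callbacks have run."""
        self._pending.join()
```

A callback that itself takes the cache lock would deadlock if it were invoked from inside `update`; with the queue it runs safely once the lock has been released.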
92d851d lrp: Refactor logic executed on policy delete The `deletePolicyService` function was previously common to both delete policy and delete service callbacks. Refactor the logic to pass the policy config directly, thereby skip config look up. Signed-off-by: Aditi Ghag <aditi@cilium.io> 22 June 2021, 04:43:55 UTC
a7d73e4 lrp: Skip restoring service on delete operation Previously, we were restoring the original clusterIP service even when the service was deleted. Signed-off-by: Aditi Ghag <aditi@cilium.io> 22 June 2021, 04:43:55 UTC
d42614e ipsec: Fix logging of SPI after key rotations Five minutes after IPsec key rotations, we cleanup the old IPsec state and print the following message: level=info msg="New encryption keys reclaiming SPI" spi=0 subsys=ipsec Unfortunately, due to a bug the SPI was always 0 in that log message. This commit changes it and also logs the old SPI value if we have it: level=info msg="New encryption keys reclaiming SPI" SPI=7 oldSPI=0 subsys=ipsec Fixes: 3f12fb6 ("cilium: ipsec, add cleanup xfrm routine") Signed-off-by: Paul Chaignon <paul@cilium.io> 22 June 2021, 04:42:35 UTC
4c4a5dc node-neigh: Use arping ts in last ping hashmap The change is probably a no-op, but it should improve the last ping timestamp precision. Signed-off-by: Martynas Pumputis <m@lambda.lt> 22 June 2021, 04:38:58 UTC
8260f9d node-neigh: Add retry for concurrent arping test case The test became notoriously flaky. It seems that some goroutines were lagging behind with the updates and they were overwriting the new MAC addr entry with the obsolete one. To fix this, retry multiple times until the correct entry is found. Signed-off-by: Martynas Pumputis <m@lambda.lt> 22 June 2021, 04:38:58 UTC
128f0f8 testutils: Add WaitUntilWithSleep Because in some cases WaitUntil() is a DoS tool. Signed-off-by: Martynas Pumputis <m@lambda.lt> 22 June 2021, 04:38:58 UTC
27122d4 bpf: fix hw_csum issue for icmp probe packets Example trace seen in dmesg: [...] [ 7710.165608] enp10s0f0np0: hw csum failure [ 7710.165621] skb len=84 headroom=78 headlen=84 tailroom=30 mac=(64,14) net=(78,20) trans=98 shinfo(txflags=0 nr_frags=0 gso(size=0 type=0 segs=0)) csum(0x0 ip_summed=2 complete_sw=0 valid=0 level=0) hash(0x14006e3a sw=0 l4=0) proto=0x0800 pkttype=0 iif=4 [ 7710.165631] dev name=enp10s0f0np0 feat=0x0x0032b18217514ba9 [ 7710.165635] skb headroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7710.165638] skb headroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7710.165641] skb headroom: 00000020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7710.165644] skb headroom: 00000030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7710.165646] skb headroom: 00000040: b8 ce f6 05 e7 62 b8 ce f6 05 e7 76 08 00 [ 7710.165649] skb linear: 00000000: 45 00 00 54 8a 07 00 00 40 01 84 e8 c0 a8 a0 04 [ 7710.165652] skb linear: 00000010: 0a 9a 00 73 00 00 23 57 00 f8 15 db cd 74 d0 60 [ 7710.165654] skb linear: 00000020: 00 00 00 00 5c 2d 0d 00 00 00 00 00 10 11 12 13 [ 7710.165657] skb linear: 00000030: 14 15 16 17 18 19 1a 1b 1c 1d 1e 1f 20 21 22 23 [ 7710.165660] skb linear: 00000040: 24 25 26 27 28 29 2a 2b 2c 2d 2e 2f 30 31 32 33 [ 7710.165663] skb linear: 00000050: 34 35 36 37 [ 7710.165665] skb tailroom: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7710.165668] skb tailroom: 00000010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 [ 7710.165672] CPU: 26 PID: 0 Comm: swapper/26 Not tainted 5.13.0-rc3+ #174 [ 7710.165674] Hardware name: Gigabyte Technology Co., Ltd. X570 AORUS MASTER/X570 AORUS MASTER, BIOS F22 08/20/2020 [ 7710.165676] Call Trace: [ 7710.165677] <IRQ> [ 7710.165680] dump_stack+0x7d/0x9c [ 7710.165683] netdev_rx_csum_fault.part.0+0x41/0x45 [ 7710.165686] netdev_rx_csum_fault.cold+0xb/0x10 [ 7710.165687] __skb_checksum_complete+0xdd/0xf0 [ 7710.165690] ? 
skb_send_sock_locked+0x20/0x20 [ 7710.165692] ? reqsk_fastopen_remove+0x190/0x190 [ 7710.165693] nf_ip_checksum+0x5b/0x120 [ 7710.165697] nf_conntrack_icmpv4_error+0x112/0x160 [nf_conntrack] [ 7710.165706] nf_conntrack_in.cold+0x1d/0x74 [nf_conntrack] [ 7710.165714] ? nft_do_chain_inet_ingress+0x280/0x2e0 [nf_tables] [ 7710.165722] ipv4_conntrack_in+0x14/0x20 [nf_conntrack] [ 7710.165731] nf_hook_slow+0x44/0xb0 [ 7710.165733] nf_hook_slow_list+0x71/0xf0 [ 7710.165735] ip_sublist_rcv+0x1d1/0x1f0 [ 7710.165737] ? ip_sublist_rcv+0x1f0/0x1f0 [ 7710.165739] ip_list_rcv+0xf5/0x120 [ 7710.165741] __netif_receive_skb_list_core+0x228/0x250 [ 7710.165745] netif_receive_skb_list_internal+0x1a1/0x2b0 [ 7710.165747] napi_complete_done+0x7a/0x1b0 [ 7710.165749] mlx5e_napi_poll+0x16e/0x730 [mlx5_core] [ 7710.165795] __napi_poll+0x31/0x170 [ 7710.165796] net_rx_action+0x22f/0x280 [ 7710.165798] __do_softirq+0xce/0x281 [ 7710.165800] irq_exit_rcu+0xa2/0xd0 [ 7710.165803] common_interrupt+0x8d/0xa0 [ 7710.165805] </IRQ> [ 7710.165806] asm_common_interrupt+0x1e/0x40 [ 7710.165808] RIP: 0010:cpuidle_enter_state+0xcc/0x360 [...] The trace was only reproducible with NICs using CHECKSUM_COMPLETE as csum type for inbound packets. It has been observed with mlx5, for example. The hw csum failure was only reproducible under the following conditions: - Protocol is ICMP, e.g. triggered by Cilium health probe packets - Pod from one node was pinging a remote node address - BPF based masquerading was used to SNAT Pod IP to node IP - BPF NAT engine found a collision in the NAT table such that it was forced to select a different ICMP id, and hence caused L4 rewrites In the case of ICMPv4 the bug was that BPF_F_PSEUDO_HDR was used for updating the L4 checksum. However, ICMPv4 does not have a pseudo header, only ICMPv6. The packet based csum was okay either way, but the flag caused to have a buggy skb->csum. Setting flag to 0 for ICMPv4 stopped the hw csum traces. 
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> Co-developed-by: Kornilios Kourtis <kornilios@isovalent.com> Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> 21 June 2021, 20:49:28 UTC
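To illustrate why no pseudo header is involved for ICMPv4, the sketch below computes the checksum over the ICMP message bytes only, then rewrites the echo id and patches the checksum incrementally (RFC 1624). It is a user-space illustration with hypothetical helper names, not the BPF code touched by 27122d4.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// fold reduces a 32-bit ones' complement sum to 16 bits.
func fold(sum uint32) uint16 {
	for sum>>16 != 0 {
		sum = sum&0xffff + sum>>16
	}
	return uint16(sum)
}

// checksum is the Internet checksum (RFC 1071) over b. For ICMPv4, b is
// the ICMP message alone: no IP addresses or other pseudo-header fields
// are mixed in, unlike TCP, UDP or ICMPv6.
func checksum(b []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(b); i += 2 {
		sum += uint32(b[i])<<8 | uint32(b[i+1])
	}
	if len(b)%2 == 1 {
		sum += uint32(b[len(b)-1]) << 8
	}
	return ^fold(sum)
}

// updateChecksum patches a checksum after one 16-bit field changed,
// following RFC 1624: HC' = ~(~HC + ~m + m').
func updateChecksum(old, oldField, newField uint16) uint16 {
	return ^fold(uint32(^old) + uint32(^oldField) + uint32(newField))
}

// echoRequest builds an ICMPv4 echo request with a valid checksum.
func echoRequest(id, seq uint16, payload []byte) []byte {
	msg := make([]byte, 8+len(payload))
	msg[0] = 8 // type 8: echo request, code 0
	binary.BigEndian.PutUint16(msg[4:], id)
	binary.BigEndian.PutUint16(msg[6:], seq)
	copy(msg[8:], payload)
	binary.BigEndian.PutUint16(msg[2:], checksum(msg))
	return msg
}

// rewriteID mimics a NAT engine selecting a different ICMP id: only the
// id field and the ICMP checksum change.
func rewriteID(msg []byte, newID uint16) {
	oldID := binary.BigEndian.Uint16(msg[4:])
	oldSum := binary.BigEndian.Uint16(msg[2:])
	binary.BigEndian.PutUint16(msg[4:], newID)
	binary.BigEndian.PutUint16(msg[2:], updateChecksum(oldSum, oldID, newID))
}

func main() {
	msg := echoRequest(0x1234, 1, []byte("ping"))
	rewriteID(msg, 0x00f8)
	// A valid message sums to zero when the checksum field is included.
	fmt.Println("valid after rewrite:", checksum(msg) == 0)
}
```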
2e0427a daemon: Remove agent options deprecated in v1.10 Signed-off-by: Paul Chaignon <paul@cilium.io> 21 June 2021, 17:14:13 UTC
cb66c98 test: Remove outdated version checks Signed-off-by: Paul Chaignon <paul@cilium.io> 21 June 2021, 17:14:13 UTC
2d91924 Revert "eni: Fix compatibility when operator <= 1.7 is running" This reverts commit 439b142b049e6f4371d7cdac778ecc0ad15d7e85. That code is now obsolete and can be removed. Signed-off-by: Paul Chaignon <paul@cilium.io> 21 June 2021, 17:14:13 UTC
abf2a19 service: Remove outdated map deletion code This code was added in commit a7a841e ("Add temporary package for service management"). Signed-off-by: Paul Chaignon <paul@cilium.io> 21 June 2021, 17:14:13 UTC
9faa347 endpoint: Remove outdated code for endpoint restoration In v1.8, we renamed the C header file for endpoints from lxc_config.h to ep_config.h. That file is used to restore endpoints, so we had to add some special code to handle upgrades and downgrades, from v1.7 to v1.8 and back. However, in versions v1.8+, we continued to use the old C header filename to check for the presence of endpoints. We therefore still can't completely remove that code to handle up/downgrades. We therefore just update the code to use the new C header file to check for the presence of endpoints. We will be able to remove all that code once v1.11 is the oldest supported version (as there is then no risk that a user tries to downgrade to a version of Cilium that uses the old C header file). Signed-off-by: Paul Chaignon <paul@cilium.io> 21 June 2021, 17:14:13 UTC
8d94a1b daemon: Remove deprecated code to handle stale map Remove deprecated code in agent that was added to handle the upgrade path and remove the stale map cilium_policy. This code was added in commit bb615ea ("daemon: Fix handling of policy call map on upgrades"). Signed-off-by: Paul Chaignon <paul@cilium.io> 21 June 2021, 17:14:13 UTC
44866f3 fix: conditionally change hubble relay port in hubble-ui When TLS is enabled for Hubble Relay, hubble-ui must follow the service port change. Fixes #16510 Signed-off-by: Alex Szakaly <alex.szakaly@gmail.com> 21 June 2021, 13:17:18 UTC
fe01c7c pkg/kvstore: fix TestRunLocksGC unit test The unit test had a couple of bugs that are fixed by this commit. Fixes: 440539b7604a ("garbage collect stale distributed locks") Signed-off-by: André Martins <andre@cilium.io> 21 June 2021, 12:06:53 UTC
879f9eb maps: switch maglev to cilium/ebpf package Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 21 June 2021, 11:52:43 UTC
9fb9c33 test: Add nodesInfo struct and pass to helpers Otherwise, some helpers are going to explode in the number of args they accept. Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
33d12a6 test: Move common consts of K8sServicesTest So that the helper functions could access them instead of passing via args. Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
d3dd7e8 test: Move Bookinfo Demo to a separate test suite Same motivation as in the previous commit. Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
412e299 test: Move LRP tests to a separate suite To increase sanity of K8sServicesTest size. Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
e7ffc91 test: Remove unused externalIPs test cases We already have coverage for externalIPs functionality (e.g. "Tests externalIPs"). Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
880e8fb test: Move K8sServices test helpers to separate file This commit only moves the helper functions. No functional changes. Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
91d6364 test: Make Services test helpers stateless For the sake of readability, in the next commit we are going to move the helpers to another file. Therefore, we need to free them from any global var. For now it's just a dummy move of arguments. In future commits, some helper functions with many args will accept a struct. Signed-off-by: Martynas Pumputis <m@lambda.lt> 21 June 2021, 08:01:44 UTC
e260ba9 .github/workflows: verify that each commit builds for test suite changes Make sure the individual commits of a PR build without errors for test suite changes as well. Noticed during review of #16470, see https://github.com/cilium/cilium/pull/16470#discussion_r651638453 Signed-off-by: Tobias Klauser <tobias@cilium.io> 21 June 2021, 08:01:29 UTC
5d16459 docs: rename maintainers team to cilium-maintainers As there are going to be more than one maintainers team, we should rename the current one. The remaining teams will have the following format: '<repository>-maintainers' Signed-off-by: André Martins <andre@cilium.io> 18 June 2021, 21:34:27 UTC
db06a64 k8s: Fix logging Log the correct field for HostIP. Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 18 June 2021, 03:02:20 UTC
41e830e test: Fix missing artifacts for tests with parentheses When tests with parentheses in their name fail, the artifacts are missing. This is happening because we run: bash -c "zip -qr test_name.zip test_directory" That therefore fails with: /bin/bash: -c: line 0: syntax error near unexpected token `(' We need to add double quotes for this command to work properly with parentheses. Fixes: b4bfb40 ("Test: Add test result in Jenkins Junit") Signed-off-by: Paul Chaignon <paul@cilium.io> 17 June 2021, 22:42:22 UTC
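The quoting fix in 41e830e can be illustrated with the sketch below. The `zipCommand` helper is a hypothetical name, not the function used in Cilium's test framework. Wrapping both arguments in double quotes keeps bash from parsing `(` and `)` as syntax.

```go
package main

import "fmt"

// zipCommand builds the shell command that archives dir into zipName.
// Both arguments are wrapped in double quotes so that parentheses and
// spaces in test names reach zip as part of the argument instead of
// being interpreted by bash. Double quotes do not neutralize `$`,
// backticks or embedded double quotes, so this is only safe for names
// without those characters.
func zipCommand(zipName, dir string) string {
	return `zip -qr "` + zipName + `" "` + dir + `"`
}

func main() {
	// The resulting string would be passed to: bash -c <command>
	fmt.Println(zipCommand("Test (with parentheses).zip", "Test (with parentheses)"))
}
```

Where no shell features are needed, running `zip` directly through `os/exec` with separate arguments avoids shell parsing altogether.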
a93c0ed .github: Rename maintainer's little helper's config file This commit renames the config file to better clarify its purpose. Signed-off-by: Paul Chaignon <paul@cilium.io> 17 June 2021, 12:35:33 UTC
8b3f009 docs: Fix typo in BGP GSG Fixes: https://github.com/cilium/cilium/issues/16549 Signed-off-by: Chris Tarazi <chris@isovalent.com> 17 June 2021, 12:12:49 UTC
d790f8f test: Wait for kube-dns before starting test Wait for kube-dns to become reachable before running test in fqdn.go. Fixes: #16409 Signed-off-by: Jarno Rajahalme <jarno@isovalent.com> 17 June 2021, 12:12:35 UTC
0c9d55e vendor: Update go.universe.tf/metallb Following https://github.com/cilium/metallb/pull/4, Cilium is now tracking the code from the v0.9.6 branch of cilium/metallb: https://github.com/cilium/metallb/tree/v0.9.6 This was done in a backwards-compatible way to ensure that older versions of Cilium can still build by avoiding the invalidation of the previous commit SHA (40d425d20241). Signed-off-by: Chris Tarazi <chris@isovalent.com> 17 June 2021, 12:09:53 UTC
06658e8 test: Remove uptime reporting Based on the sparse description of the PR that introduced the uptime reporting [1], it was used to detect CPU discrepancies. It's been a while since we had any. Also, we have "lscpu" to detect them. So, let's remove this unnecessary noise from the tests. [1]: https://github.com/cilium/cilium/pull/4901 Signed-off-by: Martynas Pumputis <m@lambda.lt> 17 June 2021, 12:09:38 UTC
1dd477d ci: Disable NFS locking This is an attempt to fix the recent issues with NFS locking in CI, e.g. issue #16551 From the nfs(5) manpage: > When using the nolock option, applications can lock files, but such > locks provide exclusion only against other applications running on > the same client. Remote applications are not affected by these locks. Since in CI, we do not have any remote applications accessing the shared folder, only using local locks should be safe and more robust than using distributed locking. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 17 June 2021, 12:09:18 UTC
1bfc3b2 build(deps): bump actions/download-artifact from 2.0.9 to 2.0.10 Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 2.0.9 to 2.0.10. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/158ca71f7c614ae705e79f25522ef4658df18253...3be87be14a055c47b01d3bd88f8fe02320a9bb60) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 17 June 2021, 11:22:50 UTC
6234ad8 build(deps): bump actions/upload-artifact from 2.2.3 to 2.2.4 Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2.2.3 to 2.2.4. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/ee69f02b3dfdecd58bb31b4d133da38ba6fe3700...27121b0bdffd731efa15d66772be8dc71245d074) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 17 June 2021, 11:22:40 UTC
67b946d pkg/option: Fix default assignment of EnableWellKnownIdentities Fixes: 09d9e1e0e2d9 ("policy: Disable well-known identities for non-managed etcd") Signed-off-by: Mauricio Vásquez <mauricio@accuknox.com> Signed-off-by: Mauricio Vásquez <mauricio@kinvolk.io> 17 June 2021, 11:10:38 UTC
a38470c cicd: skip codesql on forks Introduces an if statement into the codesql lint action which skips the work if it is taking place on a fork. Signed-off-by: ldelossa <louis.delos@isovalent.com> 17 June 2021, 09:11:42 UTC
99230d2 build(deps): bump github.com/aws/aws-sdk-go-v2/feature/ec2/imds Bumps [github.com/aws/aws-sdk-go-v2/feature/ec2/imds](https://github.com/aws/aws-sdk-go-v2) from 1.1.0 to 1.1.1. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.1.0...config/v1.1.1) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2/feature/ec2/imds dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 17 June 2021, 01:20:33 UTC
0681343 Removes CEP subresource. This is part 2/2 of trimming CEP subresource to improve scalability. Part 1/2 is PR #15230. This will bump cilium CRD schema version and is only backward-compatible with agent that has part 1/2. Signed-off-by: Weilong Cui <cuiwl@google.com> 17 June 2021, 01:19:01 UTC