https://github.com/cilium/cilium

Revision Author Date Message Commit Date
6b2e2bd test: Fix --dry-run usage in ip-masq-agent tests Since the test is going to be run on k8s 1.23, we need to pass the "client" value to the "--dry-run" flag which was changed on k8s > 1.18. Signed-off-by: Martynas Pumputis <m@lambda.lt> 08 December 2021, 20:47:48 UTC
11c17f0 Revert "test/Services: Quarantine 'Checks service on same node'" This reverts commit ca4ed8dac7f7 ("test/Services: Quarantine 'Checks service on same node'"). The original intention was to quarantine the test from #17919. However, ca4ed8dac7f7 quarantined the /wrong/ test case which was not the failing one since the code in mentioned commit only covers the IPv6 one and in the test the IPv4 one was failing. Meanwhile, the IPv4 test for 'Checks service on same node' was still left active on master, and looking at CI dashboard from last few days on master there has been no failure related to any of the 'Checks service on same node' tests. Closes: #17919 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 08 December 2021, 18:39:39 UTC
89176a7 images: update gops binary in images to v0.3.22 The vendored version of gops was already bumped to v0.3.22 in commit fdd9731a2a6b ("update go.mod dependencies"). Update the binary in the docker images as well to match the version of the gops agent. Fixes: fdd9731a2a6b ("update go.mod dependencies") Signed-off-by: Tobias Klauser <tobias@cilium.io> 08 December 2021, 17:53:55 UTC
a836fec build(deps): bump actions/upload-artifact from 2.2.4 to 2.3.0 Bumps [actions/upload-artifact](https://github.com/actions/upload-artifact) from 2.2.4 to 2.3.0. - [Release notes](https://github.com/actions/upload-artifact/releases) - [Commits](https://github.com/actions/upload-artifact/compare/27121b0bdffd731efa15d66772be8dc71245d074...da838ae9595ac94171fa2d4de5a2f117b3e7ac32) --- updated-dependencies: - dependency-name: actions/upload-artifact dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 08 December 2021, 17:53:48 UTC
be152b3 build(deps): bump actions/download-artifact from 2.0.10 to 2.1.0 Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 2.0.10 to 2.1.0. - [Release notes](https://github.com/actions/download-artifact/releases) - [Commits](https://github.com/actions/download-artifact/compare/3be87be14a055c47b01d3bd88f8fe02320a9bb60...f023be2c48cc18debc3bacd34cb396e0295e2869) --- updated-dependencies: - dependency-name: actions/download-artifact dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 08 December 2021, 17:53:27 UTC
f4d59d1 docs: Update the minimum required Minikube version Minikube 1.12.0 or later is required to use the --cni flag [1]. 1 - https://github.com/kubernetes/minikube/commit/9e95435e0020eed065ee0229a6a54a7e54530a6d Signed-off-by: Paul Chaignon <paul@cilium.io> 08 December 2021, 17:52:33 UTC
05fd42f build(deps): bump azure/login from 1.4.1 to 1.4.2 Bumps [azure/login](https://github.com/azure/login) from 1.4.1 to 1.4.2. - [Release notes](https://github.com/azure/login/releases) - [Commits](https://github.com/azure/login/compare/89d153571fe9a34ed70fcf9f1d95ab8debea7a73...66d2e78565ab7af265d2b627085bc34c73ce6abb) --- updated-dependencies: - dependency-name: azure/login dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 08 December 2021, 17:52:26 UTC
4b16aee identity: fix incorrect maximum identity when ClusterID > 0 Calculating MaximumAllocationIdentity during module initialization is not enough, as the user-specified 'ClusterID' hasn't been parsed during this phase. This patch fixes the problem by recalculating the max identity during runtime initialization when `ClusterID > 0`. Signed-off-by: ArthurChiao <arthurchiao@hotmail.com> 08 December 2021, 17:52:08 UTC
3aea7d1 build(deps): bump google-github-actions/setup-gcloud from 0.2.1 to 0.3 Bumps [google-github-actions/setup-gcloud](https://github.com/google-github-actions/setup-gcloud) from 0.2.1 to 0.3. - [Release notes](https://github.com/google-github-actions/setup-gcloud/releases) - [Changelog](https://github.com/google-github-actions/setup-gcloud/blob/master/CHANGELOG.md) - [Commits](https://github.com/google-github-actions/setup-gcloud/compare/daadedc81d5f9d3c06d2c92f49202a3cc2b919ba...a45a0825993ace67ae6e11cf3011b3e7d6795f82) --- updated-dependencies: - dependency-name: google-github-actions/setup-gcloud dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 08 December 2021, 17:51:36 UTC
32b889b workflows: disable rollback on CLI install By default, `cilium install` tries to roll back the installation if something goes wrong. We explicitly disable this behaviour in CI, as it prevents us from retrieving Cilium's state if something goes wrong at install time. Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 08 December 2021, 17:51:28 UTC
fef5928 build(deps): bump golang.org/x/tools from 0.1.7 to 0.1.8 Bumps [golang.org/x/tools](https://github.com/golang/tools) from 0.1.7 to 0.1.8. - [Release notes](https://github.com/golang/tools/releases) - [Commits](https://github.com/golang/tools/compare/v0.1.7...v0.1.8) --- updated-dependencies: - dependency-name: golang.org/x/tools dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 08 December 2021, 17:51:18 UTC
17f42b3 feat(helm): add ingressClassName to ingress spec Signed-off-by: Cyril Corbon <corboncyril@gmail.com> 08 December 2021, 17:51:09 UTC
4075e9a bpf: Clean up warnings from NAT mock tests Fixes: #17991 Signed-off-by: Clément Delzotti <elk1ns@outlook.fr> 08 December 2021, 17:49:46 UTC
6c169f6 docs: fix link to signoff / certificate of origin section Signed-off-by: Timo Reimann <ttr314@googlemail.com> 06 December 2021, 21:18:24 UTC
fdee17f Update Go to 1.17.4 Signed-off-by: Tobias Klauser <tobias@cilium.io> 06 December 2021, 21:14:36 UTC
444c58c test/helpers: fix kubectl version detection for RCs It seems that in some cases (e.g. for RCs) the minor version reported by `kubectl version --client -o json` contains trailing non-numeric characters, e.g. ``` { "clientVersion": { "major": "1", "minor": "23+", "gitVersion": "v1.23.0-rc.0", "gitCommit": "a117a51aa497a516106a3115963437218493c8d2", "gitTreeState": "clean", "buildDate": "2021-11-24T05:31:49Z", "goVersion": "go1.17.3", "compiler": "gc", "platform": "linux/amd64" } } ``` This breaks the check where the reported version is compared against the current k8s env version, leading to `kubectl` being downloaded by each test and creating flakes such as ``` cmd: "curl --output /tmp/kubectl/1.23/kubectl https://storage.googleapis.com/kubernetes-release/release/v1.23.0-rc.0/bin/linux/amd64/kubectl && chmod +x /tmp/kubectl/1.23/kubectl" exitCode: 23 duration: 90.516369ms stdout: err: exit status 23 stderr: % Total % Received % Xferd Average Speed Time Time Time Current Dload Upload Total Spent Left Speed ^M 0 0 0 0 0 0 0 0 --:--:-- --:--:-- --:--:-- 0Warning: Failed to create the file /tmp/kubectl/1.23/kubectl: Text file busy ^M 0 44.4M 0 1177 0 0 14530 0 0:53:25 --:--:-- 0:53:25 14530 curl: (23) Failed writing body (0 != 1177) FAIL: failed to ensure kubectl version: failed to download kubectl ``` Where `kubectl` from another test is still in use and attempted to be overwritten by the current test. Fix this by stripping any trailing non-numeric characters from the k8s minor version for the purpose of version comparison. Fixes: 75fbebbfbb5d ("test/helpers: use rc.0 as the default version of kubectl") Signed-off-by: Tobias Klauser <tobias@cilium.io> 06 December 2021, 17:38:35 UTC
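The stripping step described in 444c58c can be sketched as follows. This is a minimal illustration, not the code from the commit: `trimMinor` is a hypothetical helper name, and the sketch only assumes that the minor version arrives as a string such as "23+".

```go
package main

import (
	"fmt"
	"strings"
)

// trimMinor is a hypothetical helper (not taken from the Cilium tree).
// It drops any trailing non-digit characters from a Kubernetes minor
// version string, so "23+" and "23" compare as equal.
func trimMinor(minor string) string {
	return strings.TrimRightFunc(minor, func(r rune) bool {
		return r < '0' || r > '9'
	})
}

func main() {
	fmt.Println(trimMinor("23+"))
	fmt.Println(trimMinor("23"))
}
```

Only trailing characters are removed, so a string with non-digits in the middle would be left for the caller to reject.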
37c3499 Chore: Change ip address used in testing to RFC1918 address space Fixes: #17701 Signed-off-by: Aniruddha Amit Dutta <duttaaniruddha31@gmail.com> 06 December 2021, 01:59:11 UTC
f458639 build(deps): bump gopkg.in/ini.v1 from 1.66.0 to 1.66.2 Bumps [gopkg.in/ini.v1](https://github.com/go-ini/ini) from 1.66.0 to 1.66.2. - [Release notes](https://github.com/go-ini/ini/releases) - [Commits](https://github.com/go-ini/ini/compare/v1.66.0...v1.66.2) --- updated-dependencies: - dependency-name: gopkg.in/ini.v1 dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 06 December 2021, 01:57:53 UTC
1802445 docs: prevent search engines from indexing old branches To keep users from landing on outdated results in search engines, this commit adds the configuration required to stop old branches from being indexed. Based on https://stackoverflow.com/questions/63542354/readthedocs-robots-txt-and-sitemap-xml Signed-off-by: André Martins <andre@cilium.io> 04 December 2021, 00:46:05 UTC
0027542 docs: fix eksctl ClusterConfig to allow copy This commit fixes the eksctl ClusterConfig to allow for copy. It is merely a workaround for now until a proper fix is available. Fixes: 706c9009dc39 ("docs: re-write docs to create clusters with tainted nodes") Signed-off-by: André Martins <andre@cilium.io> 03 December 2021, 22:18:50 UTC
cc1ded8 docs: Clarify deprecated "prefilter-devices" Make it clear how users can select devices for the prefiltering. Reported-by: André Martins <andre@cilium.io> Signed-off-by: Martynas Pumputis <m@lambda.lt> 03 December 2021, 21:48:10 UTC
6bd3833 images: Bump Hubble CLI to v0.9.0 This bumps the Hubble CLI to the recently released version 0.9.0. Hubble CLI v0.9.0 has been released to include the Hubble protobuf API changes present in Cilium v1.11-rc3 and thus is intended to be bundled with the final Cilium v1.11 release. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 03 December 2021, 21:47:31 UTC
ce68d37 docs: cleanup and tidy up the 1.11 upgrade guide This upgrade guide contained all other versions in it. To prevent users from mistakenly reading an old upgrade guide, we should remove those leftovers. Signed-off-by: André Martins <andre@cilium.io> 03 December 2021, 21:47:15 UTC
b0ab425 doc: add upgrade note about nativeRoutingCIDR deprecation Missed by e03bfffd55466366289944dd087b9ae18593355f Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 03 December 2021, 21:46:59 UTC
2273b04 docs: clarify upgrade impact for clients using an egress gateway Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 03 December 2021, 21:46:50 UTC
915a7f5 helm: Fix operator cloud image digests Tested by applying this patch to v1.11 branch and validating that the digest matches the correct cloud image vs. the v1.11.0-rc3 images on Quay.io: $ helm template cilium ./install/kubernetes/cilium/ --version 1.10.0-rc3 \ --namespace kube-system --set eni.enabled=true --set ipam.mode=eni \ --set egressMasqueradeInterfaces=eth0 --set tunnel=disabled \ | grep operator.*sha image: quay.io/cilium/operator-aws:v1.11.0-rc3@sha256:5ea0ccb6a866a5fb13f4bdfcf1ed8bce12a1355cb10a0914ea52af25f3a8f931 Signed-off-by: Joe Stringer <joe@cilium.io> 03 December 2021, 21:34:34 UTC
33bd95c service: Always allocate higher ID for svc/backend Previously, it was possible for a backend or a service to be allocated an ID such that ID_backend_A < ID < ID_backend_B. This could happen after a cilium-agent restart, as the nextID was not advanced upon the restoration of IDs. This could lead to situations in which the per-packet LB selected a backend which did not belong to the requested service when the following occurred in chronological order: 1. The same client previously made a request to the service and the backend with ID_x was chosen. 2. The service endpoint (backend) with ID_x was removed. 3. cilium-agent was restarted. 4. A new service backend which does not belong to the initial service was created and got the ID_x allocated. 5. The CT_SERVICE entry for the old connection was not removed by the CT GC. 6. The same client made a new connection to the same service from the same src port. The above led lb{4,6}_local() to select the wrong backend, as it found the CT_SERVICE entry with the backend ID_x. Advancing the nextID upon restoration only partly mitigates the issue. The real fix would be to introduce a match map whose key would be (svc_id, backend_id), populated by the agent. The lb{4,6}_local() routines would consult the map to detect whether the backend belongs to the service. Signed-off-by: Martynas Pumputis <m@lambda.lt> 03 December 2021, 18:42:56 UTC
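The idea in 33bd95c of advancing nextID during restoration can be illustrated with a simplified allocator. The type and method names below (`idAllocator`, `restore`, `acquire`) are hypothetical and are not Cilium's actual allocator; the sketch also ignores ID wraparound and release.

```go
package main

import "fmt"

// idAllocator is a hypothetical, simplified ID allocator used only to
// illustrate the behaviour described in the commit message.
type idAllocator struct {
	nextID uint32
	used   map[uint32]struct{}
}

func newIDAllocator() *idAllocator {
	return &idAllocator{nextID: 1, used: make(map[uint32]struct{})}
}

// restore marks a previously allocated ID as in use and advances nextID
// past it, so that later allocations are always higher than any
// restored ID.
func (a *idAllocator) restore(id uint32) {
	a.used[id] = struct{}{}
	if id >= a.nextID {
		a.nextID = id + 1
	}
}

// acquire hands out the next ID and moves nextID forward.
func (a *idAllocator) acquire() uint32 {
	id := a.nextID
	a.used[id] = struct{}{}
	a.nextID++
	return id
}

func main() {
	a := newIDAllocator()
	a.restore(3)
	a.restore(7)
	// Without the advancement in restore, this would return 1, an ID
	// lower than the restored ones.
	fmt.Println(a.acquire())
}
```

As the commit message notes, this only partly mitigates the problem: a stale CT_SERVICE entry can still reference an ID that is reused later.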
a1fdcb9 Makefile.docker: replace hardcoded "docker" command with $(CONTAINER_ENGINE) Fix image builds that break when a custom container engine or a privileged command is specified in some environments, such as: $ CONTAINER_ENGINE="sudo docker" DOCKER_IMAGE_TAG=xxx make docker-cilium-image -j4 Signed-off-by: ArthurChiao <arthurchiao@hotmail.com> 02 December 2021, 18:49:02 UTC
0c7fe95 aws: Disable flaky test This test has been flaky for well over a year now, see issue 11560. Track re-enablement in https://github.com/cilium/cilium/projects/173 Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 18:33:11 UTC
2d7602e test: Quarantine Secondary nodeport device tests See issue 18072 for more details about the flaky test. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 18:32:59 UTC
a7855a5 .github: Disable EKS encryption tests These tests are flaky, see issue 16938. Tracking to fix them in https://github.com/cilium/cilium/projects/173. Signed-off-by: Joe Stringer <joe@cilium.io> 02 December 2021, 18:32:20 UTC
854bb86 test: Extend coredns clusterrole with additional resource permissions Commit 398d55cd didn't add permissions for the `endpointslices` resource to the coredns `clusterrole` on k8s < 1.20. As a result, core-dns deployments failed on these versions with the error - `2021-11-30T14:09:43.349414540Z E1130 14:09:43.349292 1 reflector.go:138] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: Failed to watch *v1beta1.EndpointSlice: failed to list *v1beta1.EndpointSlice: endpointslices.discovery.k8s.io is forbidden: User "system:serviceaccount:kube-system:coredns" cannot list resource "endpointslices" in API group "discovery.k8s.io" at the cluster scope` Fixes: 398d55cd Signed-off-by: Aditi Ghag <aditi@cilium.io> 02 December 2021, 18:03:33 UTC
af6b795 dependabot: disable all AWS package updates This will prevent dependabot from updating subpackages of github.com/aws/aws-sdk-go-v2 as well as github.com/aws/smithy-go Fixes: 4762a4abae5a ("dependabot: disable cloud provider SDK updates") Signed-off-by: Tobias Klauser <tobias@cilium.io> 02 December 2021, 14:44:33 UTC
75fbebb test/helpers: use rc.0 as the default version of kubectl Since we only update the Kubernetes version tested on our CI when the first RC is announced we should use that binary instead of the `.0` as the `.0` is not available at the time the rc.0 is released. Fixes: 61812551f659 ("test: ensure kubectl version is available for test run") Signed-off-by: André Martins <andre@cilium.io> 02 December 2021, 11:24:51 UTC
6c432fb Revert "test/helpers: fix ensure kubectl version to work for RCs" This reverts commit bb6ef27c7c3628e5cd22072caaae5e0c399a31a5. Signed-off-by: André Martins <andre@cilium.io> 02 December 2021, 11:24:51 UTC
1987b67 test: Replace `WaitUntilMatch` with `Eventually` The library function provides the same functionality. Signed-off-by: Aditi Ghag <aditi@cilium.io> 02 December 2021, 09:50:11 UTC
8986930 test: Fix graceful termination test flake The graceful termination test apps [1] are updated to fix flakes in the test logic. Specifically, read and write deadlines are added to the socket calls on the server side. This way the server doesn't block on the socket calls when a `SIGTERM` event is received on termination. While at it, the test logic is also updated to validate that connectivity between client and server is intact for at least the configured `terminationGracePeriodInSeconds` duration. [1] https://github.com/cilium/graceful-termination-test-apps Signed-off-by: Aditi Ghag <aditi@cilium.io> 02 December 2021, 09:50:11 UTC
32b5bb2 Revert "test/Services: Quarantine 'Checks graceful termination'" This reverts commit cbbea398 Signed-off-by: Aditi Ghag <aditi@cilium.io> 02 December 2021, 09:50:11 UTC
8fd2bb9 bpf/Makefile: Remove unnecessary shell references These Makefiles were sprinkled with semi-colons, causing the overall statement to be run as a series of commands in a shell and to ignore the result. Remove them and rely on regular Makefile statements. Signed-off-by: Joe Stringer <joe@cilium.io> 01 December 2021, 14:05:36 UTC
47dab33 bpf: Quieten mock targets Make the bpf mock testing framework targets respect the user's verbosity flag. Signed-off-by: Joe Stringer <joe@cilium.io> 01 December 2021, 14:05:36 UTC
a663671 docs: Remove manual installation instruction for `kind` clustermesh The clustermesh guide has installation instructions using cilium CLI so let's use that. Signed-off-by: Aditi Ghag <aditi@cilium.io> 01 December 2021, 09:41:44 UTC
6334f98 health: Use signal.NotifyContext This is a cleanup commit with no functional change. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 23:18:49 UTC
cfd9da2 ci: Set ClusterHealthPort in K8sHealth This sets a custom value for `cluster-health-port` in the K8sHealth test suite, to ensure we support setting a custom health port (e.g. used in OpenShift, which we do not test in our CI at the moment). Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 23:18:49 UTC
c640c71 health: Fix cluster-health-port for health endpoint To determine cluster health, Cilium exposes an HTTP server both on each node, as well as on the artificial health endpoint running on each node. The port used for this HTTP server is the same and can be configured via `cluster-health-port` (introduced in #16926) and defaults to 4240. This commit fixes a bug where the port specified by `cluster-health-port` was not passed to the Cilium health endpoint responder, which meant that `cilium-health-responder` was always listening on the default port instead of the one configured by the user, while the probe tried to connect via `cluster-health-port`. This resulted in the cluster being reported as unhealthy whenever `cluster-health-port` was set to a non-default value (which is the case for our OpenShift OLM for v1.11): ``` Nodes: gandro-7bmc2-worker-2-blgxf.c.cilium-dev.internal (localhost): Host connectivity to 10.0.128.2: ICMP to stack: OK, RTT=634.746µs HTTP to agent: OK, RTT=228.066µs Endpoint connectivity to 10.128.11.73: ICMP to stack: OK, RTT=666.83µs HTTP to agent: Get "http://10.128.11.73:9940/hello": dial tcp 10.128.11.73:9940: connect: connection refused ``` Fixes: e624868e165d ("health: Add a flag to set HTTP port") Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 23:18:49 UTC
420028f .github: add workflow to build beta images With this new workflow, developers will be able to release beta features that are created on top of an existing release. The workflow to create a new beta image is as follows: 1. Push a branch into Cilium's repository with the name: `feature/<stable-branch>/<feature-name>` where `<stable-branch>` represents the branch where the feature is based on and `<feature-name>` represents the name of the feature being released. 2. Trigger the workflow by going into [1], use the workflow from `feature/<stable-branch>/<feature-name>` branch and write an image tag name. The tag name should be in the format `vX.Y.Z-<feature-name>` where `vX.Y.Z` is the version on which the branch is built on, and `<feature-name>` the name of the feature. 3. Ping one of the maintainers or anyone from the cilium-build team to approve the build and release process of this feature. [1] https://github.com/cilium/cilium/actions/workflows/build-images-beta.yaml Signed-off-by: André Martins <andre@cilium.io> 30 November 2021, 22:43:38 UTC
fcd0039 daemon, node: Remove old, discarded router IPs from `cilium_host` In the previous commit (referenced below), we forgot to remove the old router IPs from the actual interface (`cilium_host`). This caused connectivity issues in user environments where the discarded, stale IPs were reassigned to pods, causing the ipcache entries for those IPs to have `remote-node` identity. To fix this, we remove all IPs from the `cilium_host` interface that weren't restored during the router IP restoration process. This step correctly finalizes the restoration process for router IPs. Fixes: ff63b0775c0 ("daemon, node: Fix faulty router IP restoration logic") Signed-off-by: Chris Tarazi <chris@isovalent.com> 30 November 2021, 21:32:22 UTC
02fa124 node: Add missing fallback to router IP from CiliumNode for restoration Previously in the case that both router IPs from the filesystem and the CiliumNode resource were available, we missed a fallback to the CiliumNode IP, if the IP from the FS was outside the provided CIDR range. In other words, we returned early that the FS IP does not belong to the CIDR, without checking if the IP from the CiliumNode was a valid fallback. This commit adds the missing case logic and also adds more documentation to the function. Signed-off-by: Chris Tarazi <chris@isovalent.com> 30 November 2021, 21:32:22 UTC
1a49543 install: add tolerations for the certgen cronjob Enable pod tolerations for the certgen cronjob pods to allow jobs on tainted nodes. Signed-off-by: David Wolffberg <davidwolffberg@gmail.com> 30 November 2021, 21:31:18 UTC
0fc1188 test/DatapathConfiguration: Quarantine 'Encapsulation' CC: Thomas Graf <thomas@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 18:28:14 UTC
f77a8d8 test/Services: Quarantine 'IPv6 masquerading across K8s nodes' CC: Deepesh Pathak <deepshpathak@gmail.com> Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 18:28:14 UTC
cbbea39 test/Services: Quarantine 'Checks graceful termination' CC: Aditi Ghag <aditi@cilium.io> Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 18:28:14 UTC
dea1343 test/Services: Quarantine 'Tests with direct routing' CC: Martynas Pumputis <m@lambda.lt> Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 18:28:14 UTC
ca4ed8d test/Services: Quarantine 'Checks service on same node' CC: Sebastian Wicki <sebastian@isovalent.com> Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 18:28:14 UTC
34c1d6e contrib: Add quarantine commit creation script usage: ./contrib/scripts/quarantine.sh "<focus-phrase>" This will generate a commit that quarantines the tests that match the specified focus phrase. It mostly works, but if the declarations for tests are made across multiple lines then it will be unable to locate the line to execute the quarantine. There's also a bit of a trick in selecting the right phrase to quarantine; often it will make sense to use the last set of words in a test name for a failing test. Typically these start with something like 'Checks ...' or 'Tests ...' so that only the inner-most 'It' or 'Context' statement is quarantined. However, if a more widespread issue is present then it may make sense to quarantine something using a phrase in the middle or even at the start of the test name. Other hints may be gathered by studying the Jenkins UI, the CI dashboard, and/or the GitHub issues page for issues labeled with 'ci/flake' which have been recently updated. Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 18:28:14 UTC
8002a50 test: Fix incorrect selector for netperf-service Caught by random chance when using this manifest to test something locally. Might as well fix it in case someone uses this in the future and the service is not working as expected. AFAICT, no CI failures occurred from this typo because the Chaos test suite (only suite which uses this manifest) doesn't assert any traffic to the service, but rather to the netperf-server directly. Fixes: b4a3cf6abc6 ("Test: Run netperf in background while Cilium pod is being deleted") Signed-off-by: Chris Tarazi <chris@isovalent.com> 30 November 2021, 18:05:08 UTC
606b5fe docs: KUBECONFIG for cilium-cli with k3s Clarify how cilium-cli can work with k3s Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> 30 November 2021, 16:49:06 UTC
04bf74c bpf: Add WireGuard to complexity and compile tests ENABLE_WIREGUARD was missing from the compile tests in bpf/Makefile and from the complexity tests in bpf/complexity-tests. We could therefore have missed new complexity issues or compilation errors occurring only when WireGuard is enabled. Fixes: 8930bebe ("daemon: Configure Wireguard for local node") Reported-by: Joe Stringer <joe@cilium.io> Signed-off-by: Paul Chaignon <paul@cilium.io> 30 November 2021, 16:40:38 UTC
2f749ab build(deps): bump gopkg.in/ini.v1 from 1.64.0 to 1.66.0 Bumps [gopkg.in/ini.v1](https://github.com/go-ini/ini) from 1.64.0 to 1.66.0. - [Release notes](https://github.com/go-ini/ini/releases) - [Commits](https://github.com/go-ini/ini/compare/v1.64.0...v1.66.0) --- updated-dependencies: - dependency-name: gopkg.in/ini.v1 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 30 November 2021, 16:39:05 UTC
e9b28d8 build(deps): bump github.com/aws/aws-sdk-go-v2/config Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.10.0 to 1.10.3. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Changelog](https://github.com/aws/aws-sdk-go-v2/blob/main/CHANGELOG.md) - [Commits](https://github.com/aws/aws-sdk-go-v2/compare/v1.10.0...config/v1.10.3) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2/config dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 30 November 2021, 16:38:39 UTC
4762a4a dependabot: disable cloud provider SDK updates The cloud provider SDKs are updated too frequently, often times without any code changes affecting Cilium. However, these PRs still require developer's time reviewing/approving such PRs and increase CI cost. Thus, exclude these dependencies from automatic updates and instead update them manually once every month. Signed-off-by: Tobias Klauser <tobias@cilium.io> 30 November 2021, 16:24:08 UTC
aef5002 test: temporarily increase Hubble buffer size to 64k Temporarily increase the Hubble buffer size in order to capture more flows. This will hopefully help us understand why the K8sEgressGatewayTest is occasionally failing (#18012) Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 30 November 2021, 14:12:05 UTC
e38e3c4 bugtool: fix IP route debug gathering commands Commit 8bcc4e5dd830 ("bugtool: avoid allocation on conversion of execCommand result to string") broke the `ip route show` commands because the change from `[]byte` to `string` causes the `%v` formatting verb to emit the raw byte slice, not the string. Fix this by using the `%s` formatting verb to make sure the argument gets interpreted as a string. Also fix another instance in `writeCmdToFile` where `fmt.Fprint` is now invoked with a byte slice. Grepping for `%v` in bugtool sources and manually inspecting all changes from commit 8bcc4e5dd830 showed no other instances where a byte slice could potentially end up being formatted in a wrong way. Fixes: 8bcc4e5dd830 ("bugtool: avoid allocation on conversion of execCommand result to string") Signed-off-by: Tobias Klauser <tobias@cilium.io> 30 November 2021, 13:31:05 UTC
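The formatting difference behind e38e3c4 is easy to reproduce in isolation. The sketch below uses two hypothetical helpers, `asV` and `asS`, that are not taken from the bugtool sources; it only shows how `fmt` treats a byte slice under each verb.

```go
package main

import "fmt"

// asV formats a byte slice with %v, which prints the numeric value of
// each byte inside square brackets.
func asV(b []byte) string {
	return fmt.Sprintf("%v", b)
}

// asS formats a byte slice with %s, which interprets the bytes as text.
func asS(b []byte) string {
	return fmt.Sprintf("%s", b)
}

func main() {
	out := []byte("ip")
	fmt.Println(asV(out)) // prints "[105 112]"
	fmt.Println(asS(out)) // prints "ip"
}
```

With a `string` argument both verbs print the same text, which is why a change of type can alter output without any change to the format string.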
7376df3 neigh, test: Bump max timeout for tests There has been a report that the neighbor tests took slightly longer than expected and, while there was nothing wrong with them, the timeout kicked in and led to failure. Slightly bump it to avoid flakes like these. Fixes: #18013 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 30 November 2021, 12:20:37 UTC
98697f3 neigh, test: Also retry upon temporary NUD_FAILED state Wasn't able to reproduce the flake even after running the test overnight. The only explanation I'd have is that there is a small/rare flake due to a temporary NUD_FAILED state where we won't retry again. Closes: #18004 Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 30 November 2021, 12:20:37 UTC
fc0045f docs: Mention how to build images for local CI testing Signed-off-by: Martynas Pumputis <m@lambda.lt> 30 November 2021, 11:52:57 UTC
1b42f7a contrib: Fix backport submission for own PRs On GitHub, one cannot request oneself to review one's own PR. This results in the following problem when submitting a backport PR: $ submit-backport Using GitHub repository joestringer/cilium (git remote: origin) Sending PR for branch v1.10: v1.10 backports 2021-11-23 * #17788 -- Additional FQDN selector identity tracking fixes (@joestringer) Once this PR is merged, you can update the PR labels via: ```upstream-prs $ for pr in 17788; do contrib/backporting/set-labels.py $pr done 1.10; done ``` Sending pull request... remote: remote: Create a pull request for 'pr/v1.10-backport-2021-11-23' on GitHub by visiting: remote: https://github.com/joestringer/cilium/pull/new/pr/v1.10-backport-2021-11-23 remote: Error requesting reviewer: Unprocessable Entity (HTTP 422) Review cannot be requested from pull request author. Signal ERR caught! Traceback (line function script): 58 main /home/joe/git/cilium/contrib/backporting/submit-backport Fix this by excluding one's own username from the reviewers list. Signed-off-by: Joe Stringer <joe@cilium.io> 30 November 2021, 11:50:28 UTC
816849a build(deps): bump github.com/Azure/azure-sdk-for-go Bumps [github.com/Azure/azure-sdk-for-go](https://github.com/Azure/azure-sdk-for-go) from 59.3.0+incompatible to 59.4.0+incompatible. - [Release notes](https://github.com/Azure/azure-sdk-for-go/releases) - [Changelog](https://github.com/Azure/azure-sdk-for-go/blob/main/CHANGELOG.md) - [Commits](https://github.com/Azure/azure-sdk-for-go/compare/v59.3.0...v59.4.0) --- updated-dependencies: - dependency-name: github.com/Azure/azure-sdk-for-go dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> 30 November 2021, 11:46:00 UTC
71381e8 elf: skip TestWrite if ELF file wasn't built This will skip the test when running the tests standalone (i.e. via `go test` and not via Makefile). See #17536 for more details about this particular file, which applied the same principle to the benchmark in that test suite. See also #16914 Reported-by: Hemanth Malla <hemanth.malla@datadoghq.com> Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 30 November 2021, 11:45:46 UTC
54bd57b docs: update k8s instructions on how to update k8s libraries Signed-off-by: André Martins <andre@cilium.io> 30 November 2021, 01:59:49 UTC
65a46b5 Prometheus lint errors in operator metrics Promtool identified the following lint errors when run against the operator metrics: 1) cilium_operator_identity_gc_entries_total: non-counter metrics should not have "_total" suffix 2) cilium_operator_identity_gc_runs_total: non-counter metrics should not have "_total" suffix. Fix both non-counter metrics and add the relevant changes to the upgrade documentation for 1.10 and 1.11. Signed-off-by: Gobinath Krishnamoorthy <gobinathk@google.com> 29 November 2021, 18:34:02 UTC
398d55c test/contrib: Bump CoreDNS version to 1.8.3 As reported in [1], Go's HTTP2 client < 1.16 had some serious bugs which could result in lost connections to kube-apiserver. Worse than this was that the client couldn't recover. In the case of CoreDNS the loss of connectivity to kube-apiserver was not even logged. I have validated this by adding the following rule on the node which was running the CoreDNS pod (6443 port as the socket-lb was doing the service xlation): iptables -I FORWARD 1 -m tcp --proto tcp --src $CORE_DNS_POD_IP \ --dport=6443 -j DROP After upgrading CoreDNS to the one which was compiled with Go >= 1.16, the pod was not only logging the errors, but was also able to recover from them quickly. An example of such an error: W1126 12:45:08.403311 1 reflector.go:436] pkg/mod/k8s.io/client-go@v0.20.2/tools/cache/reflector.go:167: watch of *v1.Endpoints ended with: an error on the server ("unable to decode an event from the watch stream: http2: client connection lost") has prevented the request from succeeding To determine the minimum version bump, I was using the following: for i in 1.7.0 1.7.1 1.8.0 1.8.1 1.8.2 1.8.3 1.8.4; do docker run --rm -ti "k8s.gcr.io/coredns/coredns:v$i" \ --version done CoreDNS-1.7.0 linux/amd64, go1.14.4, f59c03d CoreDNS-1.7.1 linux/amd64, go1.15.2, aa82ca6 CoreDNS-1.8.0 linux/amd64, go1.15.3, 054c9ae k8s.gcr.io/coredns/coredns:v1.8.1 not found: manifest unknown: k8s.gcr.io/coredns/coredns:v1.8.2 not found: manifest unknown: CoreDNS-1.8.3 linux/amd64, go1.16, 4293992 CoreDNS-1.8.4 linux/amd64, go1.16.4, 053c4d5 Hopefully, the bumped version will fix the CI flakes in which a service domain name is not available after 7min. In other words, CoreDNS is not able to resolve the name, which means that it hasn't received an update from the kube-apiserver for the service. [1]: https://github.com/kubernetes/kubernetes/issues/87615#issuecomment-803517109 Signed-off-by: Martynas Pumputis <m@lambda.lt> 29 November 2021, 18:17:01 UTC
e03bfff doc: use ipv4NativeRoutingCIDR instead of nativeRoutingCIDR As the latter has been deprecated in favor of the former. Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 29 November 2021, 17:25:55 UTC
04c29ba Fix unhelpful error emitted when we try to setup base devices Signed-off-by: kerthcet <kerthcet@gmail.com> 29 November 2021, 16:11:32 UTC
06d9441 ci: Restart pods when toggling KPR switch Previously, in the graceful backend termination test we switched to KPR=disabled and we didn't restart CoreDNS. Before the switch, CoreDNS@k8s2 -> kube-apiserver@k8s1 was handled by the socket-lb, so the outgoing packet was $CORE_DNS_IP -> $KUBE_API_SERVER_NODE_IP. The packet should have been BPF masq-ed. After the switch, the BPF masq is no longer in place, so the packets from CoreDNS are subject to the iptables' masquerading (they can be either dropped by the invalid rule or masqueraded to some other port). Combined with CoreDNS unable to recover from connectivity errors [1], the CoreDNS was no longer able to receive updates from the kube-apiserver, thus NXDOMAIN errors for the new service name. To avoid such flakes, forcefully restart the DNS pods if the KPR setting change is detected. [1]: https://github.com/cilium/cilium/pull/18018 Signed-off-by: Martynas Pumputis <m@lambda.lt> 29 November 2021, 16:10:49 UTC
1f71f4e build(deps): bump github.com/aliyun/alibaba-cloud-sdk-go Bumps [github.com/aliyun/alibaba-cloud-sdk-go](https://github.com/aliyun/alibaba-cloud-sdk-go) from 1.61.1340 to 1.61.1357. - [Release notes](https://github.com/aliyun/alibaba-cloud-sdk-go/releases) - [Changelog](https://github.com/aliyun/alibaba-cloud-sdk-go/blob/master/ChangeLog.txt) - [Commits](https://github.com/aliyun/alibaba-cloud-sdk-go/compare/v1.61.1340...v1.61.1357) --- updated-dependencies: - dependency-name: github.com/aliyun/alibaba-cloud-sdk-go dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> 29 November 2021, 14:13:50 UTC
6c1dae8 CODEOWNERS: clean-up entries for deleted files Remove from CODEOWNERS the patterns for which we no longer have any entry in the repository. List obtained with: while read i; do case "$i" in /*) # Remove leading slash and use ls LIST+=" ${i#/}" ;; *) # No leading slash: maybe not at the root, use find [[ -n $(find . -name "$i" -print -quit) ]] || echo "$i" ;; esac done <<< $(awk '/^[^#]/ {print $1}' CODEOWNERS) ls -- $LIST 2>&1 >/dev/null | sed "s=.*'\(.*\)':.*=/\1=" Fixes: b8401aa2edd6 ("checkpatch: remove checkpatch-related files from the repository") Fixes: 72e7740245c5 ("doc: Move goverance documentation to a more visible") Fixes: bf6039b99c33 ("doc: Remove obsolete Docker getting started guide") Fixes: db9b6f71453c ("docs: add cilium-operator technical overview documentation") Fixes: 26a80c381696 ("jenkinsfile: Remove stale symlinks") Fixes: b667f010fb81 ("pkg/k8s: self contain CRDs in common directory") Signed-off-by: Quentin Monnet <quentin@isovalent.com> 29 November 2021, 10:50:29 UTC
e5d84ac CODEOWNERS: fix wildcard patterns for files under daemon/cmd/ The syntax for wildcards in the CODEOWNERS file is a simple asterisk "*", which should not be preceded by a dot (".*") like other languages may use for regular expressions. For most entries, this means that only files starting with e.g. "ipcache.", such as "ipcache.go", are covered, but not "ipcache_test.go". For "kube_proxy.*", this even means there are no related entries, given that all files follow the pattern "kube_proxy_*". Let's replace the use of ".*" by "*" (or simply ".go" where relevant) in the CODEOWNERS file. References: - https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/about-code-owners#codeowners-syntax - https://git-scm.com/docs/gitignore#_pattern_format Fixes: a6677888fe04 ("CODEOWNERS: Attach entries to root of repository") Signed-off-by: Quentin Monnet <quentin@isovalent.com> 29 November 2021, 10:50:29 UTC
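The difference between ".*" and "*" described in e5d84ac can be reproduced with a small glob check. This is only a sketch: it uses Go's `path.Match` as an approximation of CODEOWNERS matching (the real matcher follows gitignore-style rules), and the helper `coveredFiles` and the file list, including `kube_proxy_healthz.go`, are illustrative assumptions rather than anything taken from the commit.

```go
package main

import (
	"fmt"
	"path"
)

// coveredFiles returns the file names matched by a shell-style glob pattern.
// Assumption: path.Match stands in for CODEOWNERS matching, which is close
// enough for single-directory patterns like the ones discussed here.
func coveredFiles(pattern string, files []string) []string {
	var matched []string
	for _, f := range files {
		// The error is ignored because the patterns used here are well-formed.
		if ok, _ := path.Match(pattern, f); ok {
			matched = append(matched, f)
		}
	}
	return matched
}

func main() {
	// Hypothetical file names following the naming schemes in the commit message.
	files := []string{"ipcache.go", "ipcache_test.go", "kube_proxy_healthz.go"}
	for _, p := range []string{"ipcache.*", "ipcache*", "kube_proxy.*", "kube_proxy*"} {
		fmt.Printf("%-14s -> %v\n", p, coveredFiles(p, files))
	}
}
```

With these inputs, "ipcache.*" covers only `ipcache.go`, and "kube_proxy.*" covers nothing, which is the gap the commit closes.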
4758bef docs: add registry (quay.io/) for pre-loading images for kind The docs recommend pre-pulling the image with the command: docker pull cilium/cilium:|IMAGE_TAG| which downloads from docker.io. However, the operator loads images from quay.io. The two should match; otherwise the pre-pulled image is downloaded for nothing. Signed-off-by: adamzhoul <adamzhoul186@gmail.com> 29 November 2021, 10:14:05 UTC
ce45bc3 docs: correct ec2 modify net iface action `ModifyNetworkInterface` -> `ModifyNetworkInterfaceAttribute` see: https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_ModifyNetworkInterfaceAttribute.html Signed-off-by: austin ce <austin.cawley@gmail.com> 26 November 2021, 21:43:29 UTC
3650544 Adds a locked function to do ipcache delete on metadata match Fixes a potential race condition introduced in PR #17161. Suggested-by: Joe Stringer <joe@cilium.io> Signed-off-by: Weilong Cui <cuiwl@google.com> 26 November 2021, 21:42:02 UTC
93d4a62 mlh: update Jenkins jobs following 1.23 support Following merge of #18008, we now support K8s 1.23 and have rotated the Jenkins test jobs as follows: - Changed: Kernel 4.9 testing on K8s 1.23 (instead of 1.22) - Changed: Kernel 4.19 testing on K8s 1.22 (instead of 1.21) - Changed: Kernel 5.4 testing on K8s 1.21 (instead of 1.20) - Added: Kernel 4.9 testing on K8s 1.21 See the Table of Truth:tm: for up to date status on all trigger phrases: https://docs.google.com/spreadsheets/d/1TThkqvVZxaqLR-Ela4ZrcJ0lrTJByCqrbdCjnI32_X0/edit#gid=0 Signed-off-by: Nicolas Busseneau <nicolas@isovalent.com> 26 November 2021, 21:39:51 UTC
ff8a7e6 ui: v0.8.3 Signed-off-by: Dmitry Kharitonov <dmitry@isovalent.com> 26 November 2021, 21:39:05 UTC
bb6ef27 test/helpers: fix ensure kubectl version to work for RCs Fixes: 61812551f659 ("test: ensure kubectl version is available for test run") Signed-off-by: André Martins <andre@cilium.io> 26 November 2021, 17:08:30 UTC
c56075d Update k8s tests and libraries to v1.23.0-rc.0 Signed-off-by: André Martins <andre@cilium.io> 26 November 2021, 17:08:30 UTC
18b10b4 ipam/crd: Fix spurious CiliumNode update status failures When running in CRD-based IPAM modes (Alibaba, Azure, ENI, CRD), it is possible to observe spurious "Unable to update CiliumNode custom resource" failures in the cilium-agent. The full error message is as follows: "Operation cannot be fulfilled on ciliumnodes.cilium.io <node>: the object has been modified; please apply your changes to the latest version and try again". It means that the Kubernetes `UpdateStatus` call has failed because the local `ObjectMeta.ResourceVersion` of the submitted CiliumNode version is out of date. In the presence of races, this error is expected and will resolve itself once the agent receives a more recent version of the object with the new resource version. However, it is possible that the resource version of a `CiliumNode` object is bumped even though the `Spec` or `Status` of the `CiliumNode` remains the same. This happens, for example, when `ObjectMeta.ManagedFields` is updated by the Kubernetes apiserver. Unfortunately, `CiliumNode.DeepEqual` does _not_ consider any `ObjectMeta` fields (including the resource version). Therefore two objects with different resource versions are considered the same by the `CiliumNode` watcher used by IPAM. But to be able to successfully call `UpdateStatus` we need to know the most recent resource version. Otherwise, `UpdateStatus` will always fail until the `CiliumNode` object is updated externally for some reason. Therefore, this commit modifies the logic to always store the most recent version of the `CiliumNode` object, even if `Spec` or `Status` has not changed. This in turn allows `nodeStore.refreshNode` (which invokes `UpdateStatus`) to always work on the most recently observed resource version. Signed-off-by: Sebastian Wicki <sebastian@isovalent.com> 26 November 2021, 13:39:41 UTC
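The idea behind 18b10b4 can be sketched in a few lines: keep the newest observed object even when a metadata-blind comparison reports no change. The types and names below (`node`, `store`, `sameContent`, `onUpdate`) are hypothetical stand-ins, not the actual Cilium `nodeStore` or `CiliumNode` code.

```go
package main

import "fmt"

// node is a hypothetical stand-in for a CiliumNode object.
type node struct {
	ResourceVersion string
	Spec            string
	Status          string
}

// sameContent mimics a DeepEqual that ignores metadata such as the
// resource version and compares only Spec and Status.
func sameContent(a, b *node) bool {
	return a.Spec == b.Spec && a.Status == b.Status
}

// store keeps the most recently observed node.
type store struct {
	latest *node
}

// onUpdate always records the newest object, so that a later status update
// is submitted with the current resource version. It reports whether Spec or
// Status changed, which a caller could use to decide on further processing.
func (s *store) onUpdate(n *node) (changed bool) {
	changed = s.latest == nil || !sameContent(s.latest, n)
	s.latest = n // stored unconditionally, even if only metadata changed
	return changed
}

func main() {
	s := &store{}
	s.onUpdate(&node{ResourceVersion: "1", Spec: "a", Status: "b"})
	// Only the resource version is bumped, e.g. after a metadata-only update.
	changed := s.onUpdate(&node{ResourceVersion: "2", Spec: "a", Status: "b"})
	fmt.Println(changed, s.latest.ResourceVersion)
}
```

Had the store skipped updates whenever `sameContent` returned true, it would still hold resource version "1" and every subsequent status update built from it would be rejected as stale.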
ed73a31 egressgateway: refactor manager logic This commit refactors the egress gateway manager in order to provide a single `reconcile()` method which will be invoked on all events received by the manager. This method is responsible for adding and removing entries to and from the egress policy map. In addition to this, the manager will now wait for the k8s cache to be fully synced before running its first reconciliation, in order to always have the egress_policy map in a consistent state with the k8s configuration. Fixes: #17380 Fixes: #17753 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 26 November 2021, 08:05:35 UTC
d9b60f7 daemon: add WaitUntilK8sCacheIsSynced method which will block the caller until the agent has fully synced its k8s cache. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 26 November 2021, 08:05:35 UTC
cdb4b46 docs: add a note on egress gateway upgrade impact for 1.11 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 26 November 2021, 08:05:35 UTC
2b07959 bpf: rename egress policy map and its fields to make it more clear it's related to the egress gateway policies Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 26 November 2021, 08:05:35 UTC
3ba8e6e maps: switch egressmap to cilium/ebpf package Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 26 November 2021, 08:05:35 UTC
0b27f80 docs: Mention service topology in KPR guide Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
545d94c helm: Add loadBalancer.serviceTopology This enables k8s service topology aware hints. Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
ed9c7ce k8s: Add unit tests for topology aware hints Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
8442d6e k8s: Fix endpoints returned by update routine Previously, the function returned all passed endpoints instead of the ones which were filtered and correlated by correlateEndpoints(). The change is a no-op, as nobody was consuming the return value of UpdateEndpoint*(). Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
6ddfbd2 k8s: Implement svc topology aware hints This commit implements the topology aware hints for k8s services described in [1]. The idea of the feature is to provision service endpoints only if their zone hints match the self node's "topology.kubernetes.io/zone" label value. The main benefit is that it allows service traffic to prefer zone-local endpoints which could be used e.g., to avoid costs associated with crossing cloud network zones. Also, it might yield better performance for service traffic, as the nearer endpoints are preferred. The hints for endpoints are set by kube-controller-manager. The heuristics are described in [1]. The hints are set in the EndpointsliceV1 object (this is the reason why we don't implement the hints parsing for other endpoint object types). I considered implementing the feature in "pkg/service" instead of "pkg/k8s". The main reasons for choosing the latter are (1) that this feature is k8s specific and (2) that in the near future we probably will merge "pkg/service" with "pkg/maps/lbmap", as both deal with the low-level datapath specific details. [1]: https://kubernetes.io/docs/concepts/services-networking/topology-aware-hints/ Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
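The selection rule described in 6ddfbd2 (keep an endpoint only if one of its zone hints equals the node's own zone) might look roughly like the sketch below. The `endpoint` type and `filterByZone` function are hypothetical and much simpler than the real slim Endpoint objects; any fallback behavior, for example when hints are missing, is not covered by the commit message and is deliberately left out here.

```go
package main

import "fmt"

// endpoint is a hypothetical, simplified service backend.
type endpoint struct {
	Addr     string
	ForZones []string // zone hints, assumed to come from the EndpointSlice
}

// filterByZone keeps only the endpoints whose hints include selfZone, i.e.
// the value of the node's "topology.kubernetes.io/zone" label.
func filterByZone(eps []endpoint, selfZone string) []endpoint {
	var out []endpoint
	for _, ep := range eps {
		for _, z := range ep.ForZones {
			if z == selfZone {
				out = append(out, ep)
				break // one matching hint is enough
			}
		}
	}
	return out
}

func main() {
	eps := []endpoint{
		{Addr: "10.0.0.1", ForZones: []string{"zone-a"}},
		{Addr: "10.0.0.2", ForZones: []string{"zone-b"}},
		{Addr: "10.0.0.3", ForZones: []string{"zone-a", "zone-b"}},
	}
	for _, ep := range filterByZone(eps, "zone-a") {
		fmt.Println(ep.Addr)
	}
}
```

For a node in "zone-a", only the first and third endpoints would be provisioned, so service traffic stays zone-local.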
14b70ad k8s: Extend Node subscriber to accept swg The swg (stoppable wait group) is used by the service_cache.go when syncing k8s caches upon the agent startup. Until now, service_cache was consuming only Service and Endpoint* objects. However, for the upcoming service topology aware hints feature we need to add (self) Node object as well to the list. This is because the feature needs to get the "topology.kubernetes.io/zone" of the self Node. Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
2ddf5e7 daemon: Add --enable-service-topology It's going to be used by the k8s service topology aware hints feature to be implemented in the next commit. Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
2ac1403 k8s: Add Hints.ForZone field to slim Endpoint This is going to be used by the upcoming (service) topology aware hints feature. Signed-off-by: Martynas Pumputis <m@lambda.lt> 25 November 2021, 17:34:05 UTC
c46a028 docs: Add cilium "managed pods" example This example demonstrates the case where all pods are managed by Cilium. Signed-off-by: Joe Stringer <joe@cilium.io> 25 November 2021, 15:58:28 UTC
4ce5cef docs: Document recent feature deprecations Signed-off-by: Joe Stringer <joe@cilium.io> 25 November 2021, 15:58:28 UTC