https://github.com/cilium/cilium

Revision Author Date Message Commit Date
3cebd6e Revert version check change for priority class name Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 12 October 2020, 10:13:01 UTC
bfb6509 Helm: full refactor of helm charts, default values implemented, tests updated, kind cni integration Signed-off-by: Sean Winn <sean@isovalent.com> Fixes: #13210 ```release-note Helm charts have been fully re-structured to a single chart for cilium with no dependency on sub-charts. More than 170 global values have been properly scoped to the cilium chart with sane defaults defined. Users upgrading from prior versions of cilium should be sure to read the upgrade guide for specific instructions. ``` 09 October 2020, 21:16:45 UTC
0ea21ae helm: bump hubble-ui patch version in chart values Use v0.7.2 by default for hubble-ui Signed-off-by: Sergey Generalov <sergey@genbit.ru> 08 October 2020, 08:42:30 UTC
37e22d6 USERS: Add Alibaba Cloud usage Reference: https://www.alibabacloud.com/blog/how-does-alibaba-cloud-build-high-performance-cloud-native-pod-networks-in-production-environments_596590 Signed-off-by: Thomas Graf <thomas@cilium.io> 08 October 2020, 08:39:08 UTC
2ce7d67 vagrant script: create a restart script You can restart vagrant VMs by passing the same env variables to the start script. Save the variables and create a restart script to make this easier. Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> 08 October 2020, 08:35:10 UTC
834aa57 vagrant script: add routing suggestion Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> 08 October 2020, 08:35:10 UTC
9daa71f vagrant script: fix typo Signed-off-by: Kornilios Kourtis <kornilios@isovalent.com> 08 October 2020, 08:35:10 UTC
b6df008 test: fix ipv6 addr configuration for vm interfaces * This commit fixes ipv6 configuration of VM interfaces. The Vagrantfile assigns an ipv6 address to both enp0s8 and enp0s9 interfaces. This assignment is not propagated properly as both provisioning steps have the same name `ipv6-config`. This results in only one of the interfaces getting the IPv6 address (enp0s9). This commit changes the name for provisioning steps so that both the steps are executed and both interfaces get IPv6 address assigned. Signed-off-by: Deepesh Pathak <deepshpathak@gmail.com> 08 October 2020, 07:35:17 UTC
d068162 connectivity-check: Improve CLI help and docs Improve the help text and documentation for listing and generating connectivity checks. Fixes: #12714 Signed-off-by: Joe Stringer <joe@cilium.io> 07 October 2020, 17:21:55 UTC
9d08cc6 connectivity-check: Fix misc cosmetic bits Rearrange the flags in the CLI and command output alphabetically, and fix a spelling error in the README. Signed-off-by: Joe Stringer <joe@cilium.io> 07 October 2020, 17:21:55 UTC
2864f48 docs: Add initial Hubble contributing guide Signed-off-by: Tom Payne <tom@isovalent.com> 07 October 2020, 16:38:10 UTC
3239a09 docs: Add note about git commit -s to contributing guide Signed-off-by: Tom Payne <tom@isovalent.com> 07 October 2020, 16:38:10 UTC
2c42c8c pkg/hubble/filters: Add HTTP path filters Signed-off-by: Tom Payne <tom@isovalent.com> 07 October 2020, 14:59:48 UTC
ca601cd k8s: Set upper limit when backing off from node retrieval failure Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
5ce94ef clusterpool: Error out if IPv6 is enabled but ClusterPool CIDR is not set The existing logic reported an error if both IPv4 and IPv6 CIDRs were not provided. This is insufficient when both IPv4 and IPv6 are enabled but only an IPv4 ClusterPool CIDR is specified. It led to a situation where agents won't start up while waiting for the CiliumNode to provide an IPv6 address. Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
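The per-family check described in 5ce94ef can be sketched as follows. This is a minimal illustration, not the operator's code: the `poolConfig` type, its field names, and the error strings are hypothetical. It contrasts the old "both CIDRs missing" check with a check that validates each enabled address family on its own.

```go
package main

import (
	"errors"
	"fmt"
)

// poolConfig is a hypothetical stand-in for the ClusterPool settings.
type poolConfig struct {
	EnableIPv4 bool
	EnableIPv6 bool
	IPv4CIDR   string
	IPv6CIDR   string
}

// validateOld mirrors the logic the commit message calls insufficient:
// it only fails when neither CIDR is provided.
func validateOld(c poolConfig) error {
	if c.IPv4CIDR == "" && c.IPv6CIDR == "" {
		return errors.New("no ClusterPool CIDR set")
	}
	return nil
}

// validate checks each enabled address family separately.
func validate(c poolConfig) error {
	if c.EnableIPv4 && c.IPv4CIDR == "" {
		return errors.New("IPv4 is enabled but no IPv4 ClusterPool CIDR is set")
	}
	if c.EnableIPv6 && c.IPv6CIDR == "" {
		return errors.New("IPv6 is enabled but no IPv6 ClusterPool CIDR is set")
	}
	return nil
}

func main() {
	// Dual-stack enabled, but only an IPv4 pool configured.
	cfg := poolConfig{EnableIPv4: true, EnableIPv6: true, IPv4CIDR: "10.0.0.0/8"}
	fmt.Println("old check:", validateOld(cfg))
	fmt.Println("new check:", validate(cfg))
}
```

With the dual-stack configuration in `main`, the old check passes silently while the per-family check reports the missing IPv6 CIDR up front instead of leaving agents waiting.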
c3d8ca0 ipam: Log parameters when initializing ClusterPool Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
e613b59 ipam: Move pkg/ipam/allocator/operator to pkg/ipam/allocator/clusterpool Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
96ca106 operator: Log when initializing the IPAM system Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
71a6094 k8s: Improve error messages when retrieving the CiliumNode resource Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
17aabf4 ipam: Derive PodCIDR requirements from enable-ipv[46] setting There is no need to require special Helm logic to set `k8s-require-ipv4-pod-cidr` correctly in combination with `ipam: cluster-pool`. The existing `enable-ipv4` and `enable-ipv6` options are more user friendly and ensure that cluster-pool also functions properly without Helm. Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
3af54a1 k8s: Improve error message when PodCIDR is unavailable When using ClusterPool IPAM, the reference to Kubernetes node is misleading to users. Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
d011b3b k8s: Rename GetNodeSpec() to WaitForNodeInformation() The function has grown and supports both CiliumNode and Kubernetes Node and thus the original name is misleading at this point. Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
d292904 nodediscovery: Log attempt to create or update CiliumNode resource This helps detect instances when creating or updating the resource is delayed or hangs. It also helps validate the correct behavior of the cluster-pool IPAM implementation. Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
dff42c6 ipam: Rename IPAMOperatorV[46]CIDR to ClusterPoolIPv[46]CIDR Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
2c759a2 ipam: Rename IPAMOperator to IPAMClusterPool The operator is involved in various IPAM modes, a more specific name for the ClusterPool mode makes the code more readable. Signed-off-by: Thomas Graf <thomas@cilium.io> 07 October 2020, 13:31:58 UTC
2155815 docs: Small improvement to OpenShift OKD guide Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com> 07 October 2020, 11:55:34 UTC
43e6a4a docs: Document OpenShift OKD installation Co-authored-by: Vadim Rutkovsky <roignac@gmail.com> Co-authored-by: Joe Stringer <joe@cilium.io> Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com> 07 October 2020, 10:24:08 UTC
4a66961 contexthelpers: Fix deadlock when nobody recvs on success channel It has been observed that the "sessionSuccess <- true" statement can block forever. E.g. Cilium v1.7.9: goroutine 742 [chan send, 547 minutes]: github.com/cilium/cilium/pkg/kvstore.(*etcdClient).renewLockSession(0xc000f66000, 0x2790b40, 0xc000e247c0, 0x0, 0x0) /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:657 +0x3e2 github.com/cilium/cilium/pkg/kvstore.connectEtcdClient.func6(0x2790b40, 0xc000e247c0, 0x3aae820, 0x2) /go/src/github.com/cilium/cilium/pkg/kvstore/etcd.go:819 +0x3e github.com/cilium/cilium/pkg/controller.(*Controller).runController(0xc000f42500) /go/src/github.com/cilium/cilium/pkg/controller/controller.go:205 +0xa2a created by github.com/cilium/cilium/pkg/controller.(*Manager).updateController /go/src/github.com/cilium/cilium/pkg/controller/manager.go:120 +0xb09 This can happen when the context cancellation timer fires after a function which consumes the context has returned, but before "sessionSuccess <- true" is executed. After the timeout, the receiving goroutine is closed, making the sending block forever. Fix this by making the sessionSuccess channel buffered. Note that after sending we don't check whether the context has been cancelled, as we expect that any subsequent functions which consume the context will check for the cancellation. Fixes: 02628547 ("etcd: Fix incorrect context usage in session renewal") Signed-off-by: Martynas Pumputis <m@lambda.lt> 07 October 2020, 09:40:46 UTC
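The race described in 4a66961 can be reproduced in miniature. The sketch below is not the contexthelpers code: `senderFinishes` and the timings are made up for illustration. A receiver gives up when its context times out, and a sender then tries to signal success. With an unbuffered channel the send has no receiver and blocks; with capacity 1 the send completes immediately.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// senderFinishes reports whether a goroutine that sends on a success
// channel after the receiver has timed out is able to complete.
// capacity is the buffer size of the success channel.
func senderFinishes(capacity int) bool {
	success := make(chan bool, capacity)
	done := make(chan struct{})

	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Millisecond)
	defer cancel()

	go func() {
		defer close(done)
		// The work outlives the receiver's timeout.
		time.Sleep(20 * time.Millisecond)
		success <- true // blocks forever if unbuffered and nobody receives
	}()

	// Receiver: wait for success or for the context to expire, then leave.
	select {
	case <-success:
	case <-ctx.Done():
	}

	// Give the sender a grace period to finish.
	select {
	case <-done:
		return true
	case <-time.After(100 * time.Millisecond):
		return false // sender is still parked on the channel send
	}
}

func main() {
	fmt.Println("unbuffered sender finished:", senderFinishes(0))
	fmt.Println("buffered sender finished:  ", senderFinishes(1))
}
```

In the unbuffered case the sender goroutine is deliberately leaked to show the hang. The buffered variant trades a possibly unread value for a sender that can never block, which matches the reasoning in the commit message.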
c27faa6 test: Rework no-specs policy test Following the previous commit, asserting on whether we are able to apply the policy is no longer relevant. The previous commit allowed empty policies to be applied (at the K8s apiserver level), but a parsing error will be triggered which can be observed in the agent logs. Applying an empty policy will not overwrite the existing policy, it'll simply continue enforcing the previous policy. Note, the CNP on the K8s side will be out-of-sync with the actual policy in the datapath now. Signed-off-by: Chris Tarazi <chris@isovalent.com> 07 October 2020, 09:36:31 UTC
40ee882 daemon, k8s: Reject empty policies For context, an empty policy is deemed "empty" if it has no `spec` and no `specs`. When a policy is applied by the user, it will potentially go through two levels of validation: K8s apiserver (CRD schema) and agent. The agent-level validation is only reached if it passes the K8s CRD validation. Agent-level validation tends to be stricter than K8s CRD validation. This commit rejects empty policies at the agent level. An empty policy can still be admitted by the K8s apiserver, which was not the case since K8s 1.15 and later, due to CRD pruning [1]. Note that empty policies are only allowed by the apiserver if the policy already exists with the same name (update operation), because of the previous commit. New policies that are empty do not pass K8s apiserver validation, and therefore are rejected (this has always been the case since K8s 1.15). Previously, empty policies were allowed by the agent. However, they were allowed / rejected at the K8s apiserver, depending on the K8s version. Versions 1.14 and earlier admitted empty policies because they did not have CRD pruning as mentioned in [1]. Versions 1.15 and later rejected empty policies. (We are referring to policies here, but this applies to any K8s resource.) The consequence of this is evident in the next commit which reworks the end-to-end tests. The summary is again dependent on the K8s version: - K8s 1.14 and earlier: empty policies admitted; datapath was configured with an "empty" rule, meaning allow all traffic. - K8s 1.15 and later: empty policies _not_ admitted; datapath continues to be configured by the previous policy. With this commit, we no longer allow the difference in behavior described above. This means that regardless of the K8s version, the datapath will always be configured with the previous policy, upon "admission" of an empty policy. Essentially, empty policies are no longer allowed to have an empty rule, **at the agent level**. 
However, all empty policies will _still_ be permitted by the K8s apiserver (as long as it's an update operation). The previous commit is responsible for that. The reason for that is in Cilium 1.8 and earlier, we set `x-kubernetes-preserve-unknown-fields` in v1 (or `preserveUnknownFields` in v1beta1). This forced the apiserver to keep a policy's (or CR's) empty `spec` or `specs`, which had the consequence of bypassing the detection of "unknown" fields in the policy (or CR), and therefore allowed unknown fields. (Unknown fields are fields that we do not expect to be present in a policy; they can manifest themselves as user typos.) Another piece of information to note is that when an empty policy is applied over an existing policy (same name), then the policy resource on the K8s side (CNP or CCNP) will be out-of-sync with the actual policy in the datapath. This is new behavior introduced by this commit. [1]: https://github.com/kubernetes/kubernetes/pull/77333 Signed-off-by: Chris Tarazi <chris@isovalent.com> 07 October 2020, 09:36:31 UTC
fa00ef7 k8s: Disallow unknown fields in CNP & CCNP Previously, the `x-kubernetes-preserve-unknown-fields` field was set to true because it was incorrectly thought to have disallowed an empty spec in a CNP / CCNP (empty rule). In reality, it is not needed at all, and actually allows additional unknown fields to be permitted. Policies such as the following would be allowed, bypassing the schema validation of the CRD (note the `toFQDNs2`). ``` apiVersion: "cilium.io/v2" kind: CiliumClusterwideNetworkPolicy metadata: name: "denylist" spec: endpointSelector: matchLabels: k8s-app.guestbook: web egress: - toEndpoints: - matchLabels: "k8s:io.kubernetes.pod.namespace": kube-system "k8s:k8s-app": kube-dns toPorts: - ports: - port: "53" protocol: ANY rules: dns: - matchPattern: "*" - toFQDNs2: - matchName: "www.google.com" ``` Fixes: 691f831ba8 ("k8s, examples: Preserve unknown fields in {C,CC}NP") Revert "k8s, examples: Preserve unknown fields in {C,CC}NP" This reverts commit 691f831ba80467827e49cffb359d151120ccf9a9. Reported-by: André Martins <andre@cilium.io> Signed-off-by: Chris Tarazi <chris@isovalent.com> 07 October 2020, 09:36:31 UTC
a03e1a6 bpf: properly handle IPv4 fragmented packets in host firewall This commit fixes a typo in the IPV4_FRAGMENTS constant in the BPF host firewall, which should have been named ENABLE_IPV4_FRAGMENTS. As IPV4_FRAGMENTS was never defined, this bug caused the `is_untracked_fragment` variable in `ipv4_host_policy_ingress` to be set to `true` even when IPv4 fragment tracking was effectively enabled. This in turn caused the host firewall to always fail the L4 policy lookup for all IPv4 fragments and to always fall back to the L3 policy lookup, potentially returning the incorrect DROP_FRAG_NOSUPPORT error in case no policy allowing the traffic was in place. Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 06 October 2020, 21:17:29 UTC
a919220 docs: Remove "Experimental" from GKE E2E test section Since we're now running this in CI and several of us are using it, I'm guessing we can't really call it "Experimental" anymore. Signed-off-by: Paul Chaignon <paul@cilium.io> 06 October 2020, 19:22:19 UTC
ada53a9 Add the return code CT_REOPENED for conntrack table lookup. When a tcp SYN packet hits a conntrack entry that is in close-wait state, the entry timeout will be modified and the entry will be reopened. But because the return code is CT_ESTABLISHED, a policy-verdict event will not be generated even though this is a new connection. The patch fixes the issue by returning the code CT_REOPENED instead in this case. Except for policy verdict events, all other handling is the same as if CT_ESTABLISHED is returned. Signed-off-by: Zang Li <zangli@google.com> 06 October 2020, 17:30:32 UTC
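The behavior in ada53a9 can be modeled in a few lines. The real lookup lives in the BPF datapath (C); the Go sketch below is only an illustration, and `ctEntry`, `lookup`, and `emitsPolicyVerdict` are hypothetical names. A SYN that hits a closing entry reopens it and returns a distinct state, so that a policy-verdict event can be emitted as for a new connection.

```go
package main

import "fmt"

type ctState int

const (
	ctNew ctState = iota
	ctEstablished
	ctReopened
)

// ctEntry is a simplified conntrack entry; closing models the
// close-wait state mentioned in the commit message.
type ctEntry struct {
	closing bool
}

// lookup returns the connection state for a packet. A SYN hitting a
// closing entry reopens it and reports ctReopened instead of
// ctEstablished.
func lookup(table map[string]*ctEntry, key string, syn bool) ctState {
	e, ok := table[key]
	if !ok {
		return ctNew
	}
	if syn && e.closing {
		e.closing = false // entry is reused for the new connection
		return ctReopened
	}
	return ctEstablished
}

// emitsPolicyVerdict models the only place where ctReopened differs
// from ctEstablished: verdict events are produced for new connections.
func emitsPolicyVerdict(s ctState) bool {
	return s == ctNew || s == ctReopened
}

func main() {
	table := map[string]*ctEntry{"10.0.0.1:1234->10.0.0.2:80": {closing: true}}
	s := lookup(table, "10.0.0.1:1234->10.0.0.2:80", true)
	fmt.Println("reopened:", s == ctReopened, "verdict event:", emitsPolicyVerdict(s))
}
```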
e505dad docs: Add connectivity-check to Hubble getting started guide Signed-off-by: Tom Payne <tom@isovalent.com> 06 October 2020, 12:49:42 UTC
8a1d6d0 hubble/server: tls certificate hot reloading support Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 06 October 2020, 12:47:00 UTC
713b06b hubble/relay: tls certificate hot reloading support Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 06 October 2020, 12:47:00 UTC
c0c5c33 hubble/relay: extract the logger outside of the server Preparation for tls certificates hot reloading. Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 06 October 2020, 12:47:00 UTC
89e80d9 initial crypto/certloader package implementation The certloader package aims to provide a facility to ease dynamic tls.Config handling. The main goal is to "smoothly" handle TLS certificate rotation, i.e. without service interruption and without severing ongoing connections (aka "hot reloading"). This initial implementation brings support for file-backed TLS configuration watched for changes through fsnotify. The server and client configurations are separated into different interfaces to avoid misuse. Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 06 October 2020, 12:47:00 UTC
803f283 docs: Create connectivity test in separate namespace in getting started guide Signed-off-by: Tom Payne <tom@isovalent.com> 06 October 2020, 12:10:15 UTC
e22ee43 docs: Add demonstration that second example is additive This makes it clear that the L7 filter builds on the L3/L4 filter. Signed-off-by: Tom Payne <tom@isovalent.com> 06 October 2020, 12:10:15 UTC
edd0552 docs: Correct app name in getting started guide Signed-off-by: Tom Payne <tom@isovalent.com> 06 October 2020, 12:10:15 UTC
503b073 test: Restart unmanaged pods in log gatherer ns Log gatherer pods were not restarted. If they become unmanaged during Cilium restarts, they may lose connectivity, causing tests to fail. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 06 October 2020, 08:44:11 UTC
d1f2ddc bump Alpine base image to 3.12 Note: bumping `Documentation/Dockerfile` to 3.12 breaks the image build (some python related errors) so it was simply bumped to latest 3.7 python instead. Signed-off-by: Robin Hahling <robin.hahling@gw-computing.net> 05 October 2020, 15:51:36 UTC
7938177 pkg/option: re-introduce conntrack-gc-interval flag This flag was accidentally removed, re-adding it. Fixes: 1cb5e185f301 ("daemon: remove deprecated conntrack-garbage-collector-interval option") Signed-off-by: André Martins <andre@cilium.io> 05 October 2020, 15:15:43 UTC
4a445eb docs: Clarify when GITHUB_TOKEN is needed Add GITHUB_TOKEN=xxx to all backport commands that require the GitHub token. This is to help contributors who blindly copy-paste commands, such as myself... Signed-off-by: Paul Chaignon <paul@cilium.io> 05 October 2020, 13:40:55 UTC
0c5758d doc: Request user to restart unmanaged pods in Azure guide When following the AKS installation instructions, there will be existing pods which are not managed by Cilium. Ask the user to restart them. Fixes: #13154 Signed-off-by: Thomas Graf <thomas@cilium.io> 05 October 2020, 11:13:01 UTC
3c482a5 azure: Don't fail if gateway IP is not known The commit dd99958bdd introduced support for multiple interfaces and added a new requirement for the gateway IP to be known. This is not always the case. Make it optional. This resulted in errors like these: ``` Warning FailedCreatePodSandBox 3m46s (x264 over 8m38s) kubelet, aks-nodepool1-10706209-vmss000001 (combined from similar events): Failed to create pod sandbox: rpc error: code = Unknown desc = failed to set up sandbox container "600afaf8013aac81bdd001e12f4ee115ef07a9c1520dc72b7c9c449a8f3f3b86" network for pod "coredns-869cb84759-jrxdd": networkPlugin cni failed to set up pod "coredns-869cb84759-jrxdd_kube-system" network: unable to setup interface datapath: unable to parse routing info: invalid ip: ``` Fixes: #13154 Fixes: dd99958bdd ("azure: support multiple pods subnets") Signed-off-by: Thomas Graf <thomas@cilium.io> 05 October 2020, 11:13:01 UTC
3a95492 test: don't check logs if kubectl is nil Added a nil check in log check function since it can be run after test failed due to kubectl failing to set up, resulting in a panic. Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 05 October 2020, 08:16:10 UTC
419e9be Add Azure as prefix for UserAssignedIdentityID Signed-off-by: Vlad Ungureanu <vladu@palantir.com> 05 October 2020, 08:11:41 UTC
efdacac Rename azure-user-assigned-identity-name to azure-user-assigned-identity-id Signed-off-by: Vlad Ungureanu <vladu@palantir.com> 05 October 2020, 08:11:41 UTC
30ecb48 common: move single-use consts to using package These consts are used only in a single package each. Move them to the package that uses them and un-export them. The ServicesKeyPath is unused and can be removed altogether Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 05 October 2020, 07:47:56 UTC
ca51dbf common: move single-use function MoveNewFilesTo to pkg/endpoint This function is only used in pkg/endpoint, so move it there and unexport it. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 05 October 2020, 07:47:56 UTC
3a7375f cilium/cmd: look up known identities from cache in non-numeric mode PR #13304 switched `cilium ip list` output to show names for reserved identities. It is easy enough to look up any other numeric ID for known labels, so do that in the non-numeric mode. Example output: ``` $ cilium ip list IP IDENTITY SOURCE 0.0.0.0/0 world ::/0 world 10.0.2.15/32 host 10.11.0.78/32 k8s:io.cilium.k8s.policy.cluster=default k8s k8s:k8s-app=kube-dns k8s:io.kubernetes.pod.namespace=kube-system k8s:io.cilium.k8s.policy.serviceaccount=coredns 10.11.0.90/32 host 10.11.0.116/32 health 10.11.1.123/32 host 10.11.1.251/32 health 192.168.33.11/32 host 192.168.34.11/32 host fd00::b/128 host fd00::2d7c/128 health fd00::9998/128 k8s:k8s-app=kube-dns k8s k8s:io.kubernetes.pod.namespace=kube-system k8s:io.cilium.k8s.policy.serviceaccount=coredns k8s:io.cilium.k8s.policy.cluster=default fd00::e57e/128 host fd00::1:e9b/128 host fd00::1:8742/128 health fd01::b/128 host fe80::ccf4:b3ff:fe16:de79/128 host $ cilium ip list -n IP IDENTITY SOURCE 0.0.0.0/0 2 ::/0 2 10.0.2.15/32 1 10.11.0.78/32 104 k8s 10.11.0.90/32 1 10.11.0.116/32 4 10.11.1.123/32 1 10.11.1.251/32 4 192.168.33.11/32 1 192.168.34.11/32 1 fd00::b/128 1 fd00::2d7c/128 4 fd00::9998/128 104 k8s fd00::e57e/128 1 fd00::1:e9b/128 1 fd00::1:8742/128 4 fd01::b/128 1 fe80::ccf4:b3ff:fe16:de79/128 1 ``` Address https://github.com/cilium/cilium/pull/13304#issuecomment-700314189 Suggested-by: Joe Stringer <joe@cilium.io> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 05 October 2020, 07:45:55 UTC
40411ed cilium/cmd: print ID argument in fatal message Print the ID passed as argument in the fatal message, not the returned (possibly nil) id. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 05 October 2020, 07:45:55 UTC
6af5b70 bugtool: collect IP cache list with numeric identities PR #13304 switched `cilium ip list` output to show names for reserved identities. In bugtool output we might want the underlying integer values when debugging, so add the `-n` option. There's still the `cilium identity list` output to correlate the numeric identities to names. Address https://github.com/cilium/cilium/pull/13304#issuecomment-700314867 Suggested-by: Joe Stringer <joe@cilium.io> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 05 October 2020, 07:45:55 UTC
31c3bab test: update runtime test descriptions after removal of DNS poller Follow-up for #13229 Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 05 October 2020, 07:44:43 UTC
6584046 datapath/linux/route: use upstream function to list rules Signed-off-by: Lehner Florian <dev@der-flo.net> 03 October 2020, 19:23:11 UTC
6083b06 docs: add deny policies documentation Add documentation and examples for policy denies. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
cce9093 test: add some e2e coverage for deny policies Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
61e6a67 pkg/policy: add unit tests for deny policies Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
f1d0f7f pkg/policy: refactor manual unit tests into automatically generated These changes introduce automatic capability of performing unit tests by adding an algorithm used to derive map states and test cases. This commit keeps the same behavior used in the manual unit tests to check if the new algorithm used to derive map states breaks the old expectations. In follow up changes the manual tests will be removed. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
2f64377 pkg/policy: use PortDenyRules for deny policies Since not all functionalities will be available in the IngressDeny and EgressDeny fields of Rule, it was necessary to create a new structure, IngressDeny and EgressDeny. These 2 structures share common fields with the Ingress and Egress except the implementation of L7 Policies, which is part of the PortRule but not of the PortDenyRules. This requirement requires the introduction of some interfaces to avoid code duplication in the policy calculation. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
2bf6797 pkg/{endpoint,policy}: add policy deny implementation Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
63f29cc policy/api: add IngressDeny and EgressDeny fields to API These changes only add the fields into the API for both Cilium and Kubernetes interactions. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
75a2b47 bpf: add deny policy enforcement This change adds the deny policy enforcement in the datapath. When checking the entry of the policy map, we will check the dedicated bit for policy denies. If that bit is set, which we will consider unlikely to avoid performance penalties for the allow case, the traffic will be dropped. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
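The deny check described in 75a2b47 amounts to testing one flag bit on the policy map entry. The datapath itself is BPF C; the Go sketch below only illustrates the idea. The `policyEntry` layout, the bit position of `flagDeny`, and the "no entry means drop" default are assumptions made for the example.

```go
package main

import "fmt"

// flagDeny is a hypothetical bit marking a policy map entry as a deny rule.
const flagDeny uint8 = 1 << 0

// policyEntry is a simplified stand-in for a policy map value.
type policyEntry struct {
	ProxyPort uint16
	Flags     uint8
}

type verdict string

const (
	allow verdict = "allow"
	drop  verdict = "drop"
)

// lookup returns the verdict for a source identity. An entry with the
// deny bit set drops the traffic; this is the less common path, so the
// allow case stays a plain map hit plus one bit test.
func lookup(policyMap map[uint32]policyEntry, identity uint32) verdict {
	e, ok := policyMap[identity]
	if !ok {
		return drop // assumption: no matching entry means default deny
	}
	if e.Flags&flagDeny != 0 {
		return drop
	}
	return allow
}

func main() {
	m := map[uint32]policyEntry{
		100: {},                // allow entry
		200: {Flags: flagDeny}, // deny entry
	}
	for _, id := range []uint32{100, 200, 300} {
		fmt.Println(id, lookup(m, id))
	}
}
```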
23e2506 cilium/cmd: dump policy deny maps With these changes users will have the ability to dump deny policy maps using 'cilium bpf policy map'. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
e6fce27 api/v1: add policy deny specific fields Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
0ba1967 pkg/endpoint: remove unused functions Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
607431f pkg/policy: add function to retrieve identities of policyMapState Create function to de-duplicate code and have the ability to re-use code. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
12e8d11 refactor desiredPolicyAllowsIdentity out of pkg/endpoint This function can be refactored out of the pkg/endpoint. With this change it is also possible to unit test this function. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
0fa3561 k8s/apis: refactor parseToCiliumRule functions This will allow to reuse functions. Signed-off-by: André Martins <andre@cilium.io> 03 October 2020, 16:00:39 UTC
28d9889 node: Join a k8s cluster via kvstore Add Cilium agent option `--join-cluster`, which has the following effects: - agent registers the node in a special `register` key in the kv store - agent waits until kv store updates the registration with a numeric identifier allocated for the node labels Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
a8c83d3 node: Add NodeIdentity field Add an optional NodeIdentity field to Node and CiliumNode. This can be used to pass the Cilium numeric identity of a node in the cluster. If missing, either NODE or REMOTE_NODE is assumed depending on the Cilium agent configuration. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
b6f37a1 bpf: Add support for LOCAL_NODE_ID Add a bpf macro for LOCAL_NODE_ID that is only used when setting the source security ID in encap headers for packets originating from the local node. This defaults to 6, the same value as REMOTE_NODE_ID. Any other value may be used, if set during Cilium agent bootstrap before the node_config.h is written. This allows representing the Cilium node as an endpoint in the cluster, allowing endpoint selector based network policy enforcement for traffic originating from Cilium nodes. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
b34f99c operator: Make CEP GC aware of the type of the owner CEP GC used to delete CEPs whenever a Pod resource with the same name no longer exists. Change this to inspect the OwnerReferences of the CEP resource, and keep the existing behavior if CEP is owned by a Pod. If the CEP is owned by a CiliumNode, then check the existence of that instead. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
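The owner-aware garbage collection decision in b34f99c can be sketched as below. This is an illustration under assumptions, not the operator's code: `ownerRef`, `cep`, and `shouldDelete` are hypothetical, existence is modeled with plain maps instead of Kubernetes clients, and the fallback for CEPs without a recognized owner is a guess at "keep the existing behavior".

```go
package main

import "fmt"

// ownerRef is a simplified stand-in for a Kubernetes OwnerReference.
type ownerRef struct {
	Kind string
	Name string
}

// cep is a simplified CiliumEndpoint with only the fields needed here.
type cep struct {
	Name   string
	Owners []ownerRef
}

// shouldDelete decides whether a CEP is garbage. pods and nodes hold the
// names of Pods and CiliumNodes that currently exist.
func shouldDelete(c cep, pods, nodes map[string]bool) bool {
	for _, o := range c.Owners {
		switch o.Kind {
		case "Pod":
			return !pods[o.Name]
		case "CiliumNode":
			return !nodes[o.Name]
		}
	}
	// Assumption: without a recognized owner, fall back to the earlier
	// rule of looking for a Pod with the same name as the CEP.
	return !pods[c.Name]
}

func main() {
	pods := map[string]bool{"web-0": true}
	nodes := map[string]bool{"vm-1": true}

	podCEP := cep{Name: "web-0", Owners: []ownerRef{{Kind: "Pod", Name: "web-0"}}}
	nodeCEP := cep{Name: "vm-1", Owners: []ownerRef{{Kind: "CiliumNode", Name: "vm-1"}}}
	staleCEP := cep{Name: "vm-2", Owners: []ownerRef{{Kind: "CiliumNode", Name: "vm-2"}}}

	fmt.Println(shouldDelete(podCEP, pods, nodes))
	fmt.Println(shouldDelete(nodeCEP, pods, nodes))
	fmt.Println(shouldDelete(staleCEP, pods, nodes))
}
```

The point of the owner check is visible with `nodeCEP`: a name-only Pod lookup would find no Pod called `vm-1` and delete an endpoint that is still backed by a live CiliumNode.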
8a5bc17 build: Suppress warnings about invalid image names When pre-pulling images, detect special characters not allowed in image names to avoid docker warnings about invalid image names. These were caused by using environment variables for image names. Those images will not be pre-pulled. Pre-pulling of docker images used with buildkit builds is a workaround for a docker bug causing images sometimes not being pulled properly. Therefore it is possible that when image names are specified indirectly via an environment variable, and the image is not pre-pulled, then it is possible that the build will fail due to the docker bug. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
a8b4f16 vagrant: Share parent of cilium directory if SHARE_PARENT is set Share the parent of the cilium directory if SHARE_PARENT is set. This shares all the cilium repos with one mount, but requires Cilium directory to be named 'cilium'. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
28154fa test: Launch runtime VM in the same network as k8s for the same job Configure cilium-k8s#{$BUILD_NUMBER}-#{$JOB_NAME}-#{$K8S_VERSION} private network also for the runtime VM so that it has connectivity to k8s nodes when run at the same time. This is needed for VM support testing. Document previously undocumented options for Vagrantfiles. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 02 October 2020, 21:46:34 UTC
d315ec3 doc: Document API rate limiting Signed-off-by: Thomas Graf <thomas@cilium.io> 02 October 2020, 16:43:28 UTC
6333eaa agent: Add rate limiting to endpoint API calls Add default rate limiting for all endpoint related API calls with automatic adjustment based on estimated processing duration. Metrics are provided to monitor the rate limiting system: ``` cilium_api_limiter_adjustment_factor api_call="endpoint-create" 0.695787 cilium_api_limiter_processed_requests_total api_call="endpoint-create" outcome="success" 7.000000 cilium_api_limiter_processing_duration_seconds api_call="endpoint-create" value="estimated" 2.000000 cilium_api_limiter_processing_duration_seconds api_call="endpoint-create" value="mean" 2.874443 cilium_api_limiter_rate_limit api_call="endpoint-create" value="burst" 4.000000 cilium_api_limiter_rate_limit api_call="endpoint-create" value="limit" 0.347894 cilium_api_limiter_requests_in_flight api_call="endpoint-create" value="in-flight" 0.000000 cilium_api_limiter_requests_in_flight api_call="endpoint-create" value="limit" 0.000000 cilium_api_limiter_wait_duration_seconds api_call="endpoint-create" value="max" 15.000000 cilium_api_limiter_wait_duration_seconds api_call="endpoint-create" value="mean" 0.000000 cilium_api_limiter_wait_duration_seconds api_call="endpoint-create" value="min" 0.000000 ``` Signed-off-by: Thomas Graf <thomas@cilium.io> 02 October 2020, 16:43:28 UTC
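The metric values quoted in 6333eaa can be related with a little arithmetic. The sketch below assumes the adjustment factor is the estimated processing duration divided by the observed mean. That assumption is consistent with the numbers shown (2.000000 / 2.874443 is roughly 0.695787) but is an inference from the sample output, not a statement about the implementation. The base rate of 0.5 requests/s is likewise inferred from the reported limit and is not stated in the commit message.

```go
package main

import "fmt"

// adjustmentFactor is an assumed formula: how far the observed mean
// processing time is from the estimate. Values below 1 mean requests
// take longer than estimated, so limits are scaled down.
func adjustmentFactor(estimatedSeconds, meanSeconds float64) float64 {
	return estimatedSeconds / meanSeconds
}

// adjustedLimit scales a base rate limit (requests per second) by the factor.
func adjustedLimit(baseLimit, factor float64) float64 {
	return baseLimit * factor
}

func main() {
	// Values taken from the sample metrics in the commit message.
	estimated := 2.0
	mean := 2.874443

	f := adjustmentFactor(estimated, mean)
	fmt.Printf("adjustment factor: %.6f\n", f)

	// 0.5 requests/s is a hypothetical base rate inferred from the
	// reported limit of 0.347894.
	fmt.Printf("adjusted limit:    %.6f\n", adjustedLimit(0.5, f))
}
```

Because the published metrics are rounded, the computed limit may differ from the reported 0.347894 in the last digit.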
3141e65 rate: Add API rate limiting system The API rate limiting system is capable of enforcing both rate limiting and maximum parallel requests. Instead of enforcing static limits, the system is capable of automatically adjusting rate limits and allowed parallel requests by comparing the provided estimated processing duration with the mean processing duration observed. Usage: ``` var requestLimiter = rate.NewAPILimiter("myRequest", rate.APILimiterParameters{ rate.SkipInitial: 5, rate.RateLimit: 1.0, // 1 request/s rate.ParallelRequests: 2, }, nil) func myRequestHandler() error { req, err := requestLimiter.Wait(context.Background()) if err != nil { // request timed out while waiting return err } defer req.Done() // Signal that request has been processed // process request .... return nil } ``` Configuration parameters: - EstimatedProcessingDuration time.Duration EstimatedProcessingDuration is the estimated duration an API call will take. This value is used if AutoAdjust is enabled to automatically adjust rate limits to stay as close as possible to the estimated processing duration. - AutoAdjust bool AutoAdjust enables automatic adjustment of the values ParallelRequests, RateLimit, and RateBurst in order to keep the mean processing duration close to EstimatedProcessingDuration - MeanOver int MeanOver is the number of entries to keep in order to calculate the mean processing and wait duration - ParallelRequests int ParallelRequests is the parallel requests allowed. If AutoAdjust is enabled, the value will adjust automatically. - MaxParallelRequests int MaxParallelRequests is the maximum parallel requests allowed. If AutoAdjust is enabled, then the ParallelRequests will never grow above MaxParallelRequests. - MinParallelRequests int MinParallelRequests is the minimum parallel requests allowed. If AutoAdjust is enabled, then the ParallelRequests will never fall below MinParallelRequests. 
- RateLimit rate.Limit RateLimit is the initial number of API requests allowed per second. If AutoAdjust is enabled, the value will adjust automatically. - RateBurst int RateBurst is the initial allowed burst of API requests allowed. If AutoAdjust is enabled, the value will adjust automatically. - MinWaitDuration time.Duration MinWaitDuration is the minimum time an API request always has to wait before the Wait() function returns an error. - MaxWaitDuration time.Duration MaxWaitDuration is the maximum time an API request is allowed to wait before the Wait() function returns an error. - Log bool Log enables info logging of processed API requests. This should only be used for low frequency API calls. Example: ``` level="info" msg="Processing API request with rate limiter" maxWaitDuration=10ms name=foo parallelRequests=2 subsys=rate uuid=933267c5-01db-11eb-93bb-08002720ea43 level="info" msg="API call has been processed" name=foo processingDuration=10.020573ms subsys=rate totalDuration=10.047051ms uuid=933265c7-01db-11eb-93bb-08002720ea43 waitDurationTotal="18.665µs" level=warning msg="Not processing API request. Wait duration for maximum parallel requests exceeds maximum" maxWaitDuration=10ms maxWaitDurationParallel=10ms name=foo parallelRequests=2 subsys=rate uuid=933269d2-01db-11eb-93bb-08002720ea43 ``` - DelayedAdjustmentFactor float64 DelayedAdjustmentFactor is percentage of the AdjustmentFactor to be applied to RateBurst and MaxWaitDuration defined as a value between 0.0..1.0. This is used to steer a slower reaction of the RateBurst and ParallelRequests compared to RateLimit. - SkipInitial int SkipInitial is the number of API calls to skip before applying rate limiting. This is useful to define a learning phase in the beginning to allow for auto adjustment before imposing wait durations on API calls. - MaxAdjustmentFactor float64 MaxAdjustmentFactor is the maximum adjustment factor when AutoAdjust is enabled. Base values will not adjust more than by this factor. 
The configuration of API rate limiters is typically provided as-code to establish defaults. A string based configuration option can then be used to adjust defaults. This allows to expose configuration of rate limiting using a single option flag: ``` go l, err = NewAPILimiterSet(map[string]string{ "foo": "rate-limit:2/m,rate-burst:2", }, map[string]APILimiterParameters{ "foo": { RateLimit: rate.Limit(1.0 / 60.0), AutoAdjust: true, }, }, nil) ``` Signed-off-by: Thomas Graf <thomas@cilium.io> 02 October 2020, 16:43:28 UTC
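To illustrate how a string option such as `rate-limit:2/m,rate-burst:2` could be turned into numeric values, here is a minimal, self-contained sketch. It is not the parser used by the rate package: the helper names `parseOptions` and `parseRate`, the supported units (`s` and `m`), and the treatment of a missing unit as per-second are all assumptions made for this example.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseOptions (hypothetical) splits "key:value,key:value" into a map.
func parseOptions(s string) (map[string]string, error) {
	opts := map[string]string{}
	for _, pair := range strings.Split(s, ",") {
		kv := strings.SplitN(strings.TrimSpace(pair), ":", 2)
		if len(kv) != 2 || kv[0] == "" {
			return nil, fmt.Errorf("malformed option %q", pair)
		}
		opts[kv[0]] = kv[1]
	}
	return opts, nil
}

// parseRate (hypothetical) converts a value such as "2/m" into requests per
// second. Only the units "s" and "m" are handled; a bare number is assumed
// to mean requests per second.
func parseRate(s string) (float64, error) {
	parts := strings.SplitN(s, "/", 2)
	n, err := strconv.ParseFloat(parts[0], 64)
	if err != nil {
		return 0, fmt.Errorf("invalid rate %q: %w", s, err)
	}
	if len(parts) == 1 {
		return n, nil
	}
	switch parts[1] {
	case "s":
		return n, nil
	case "m":
		return n / 60.0, nil
	default:
		return 0, fmt.Errorf("unknown rate unit %q", parts[1])
	}
}

func main() {
	opts, err := parseOptions("rate-limit:2/m,rate-burst:2")
	if err != nil {
		panic(err)
	}
	limit, err := parseRate(opts["rate-limit"])
	if err != nil {
		panic(err)
	}
	burst, err := strconv.Atoi(opts["rate-burst"])
	if err != nil {
		panic(err)
	}
	fmt.Printf("limit=%.4f req/s burst=%d\n", limit, burst)
}
```

Keeping the override as a flat `key:value` string means one command-line flag per API call name is enough, while the typed defaults stay in code.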
1010757 pkg/hubble/filters: Add HTTP method filters Signed-off-by: Glib Smaga <code@gsmaga.com> 02 October 2020, 13:49:06 UTC
cbccac4 api/v1: Add http method filter entry Signed-off-by: Glib Smaga <code@gsmaga.com> 02 October 2020, 13:49:06 UTC
98356e2 docs: Document make target for operator Docker image Signed-off-by: Paul Chaignon <paul@cilium.io> 02 October 2020, 12:12:21 UTC
2779bfd Restores ClusterIP service entry upon LRP removal. Deleting an LRP shadowing a ClusterIP service today deletes that service entry entirely. This is problematic in cases where the original service is still needed, e.g., NodeLocalDNS. This change restores the ClusterIP service when the corresponding LRP is removed. We acquire the original service info from Cilium's service cache and enforce an update event to restore the service entry.

With LRP, `cilium service list`:
ID   Frontend            Service Type    Backend
1    10.91.240.10:53     LocalRedirect   1 => 10.88.1.242:53
2    10.91.241.27:53     ClusterIP       1 => 10.88.1.117:53
                                         2 => 10.88.0.46:53
3    10.91.240.1:443     ClusterIP       1 => 35.193.66.178:443
4    10.91.254.119:443   ClusterIP       1 => 10.88.1.127:443

After removing the LRP, `cilium service list`:
ID   Frontend            Service Type    Backend
2    10.91.241.27:53     ClusterIP       1 => 10.88.1.117:53
                                         2 => 10.88.0.46:53
3    10.91.240.1:443     ClusterIP       1 => 35.193.66.178:443
4    10.91.254.119:443   ClusterIP       1 => 10.88.1.127:443
5    10.91.240.10:53     ClusterIP       1 => 10.88.1.117:53
                                         2 => 10.88.0.46:53

Signed-off-by: Weilong Cui <cuiwl@google.com> 02 October 2020, 07:57:17 UTC
d6ad56b test: further increase range of accepted values for bandwidth test Our current range for the 25Mbps target is [18; 32]. We seem to always fall short of the 18 bound. Expected cases of regressions are likely to be either a lack of connectivity or a lack of rate limiting. So with a range [1; 30] we're likely to catch most regression cases without missing cases where there's no rate limiting (which we could miss if we keep increasing the whole range). Fixes: #13062 Co-authored-by: Daniel Borkmann <daniel@iogearbox.net> Signed-off-by: Paul Chaignon <paul@cilium.io> 02 October 2020, 07:34:33 UTC
ca9992e cleanup: remove unused code Signed-off-by: Lehner Florian <dev@der-flo.net> 01 October 2020, 19:05:50 UTC
55209b7 docs: Move performance guide under Operations Signed-off-by: Paul Chaignon <paul@cilium.io> 01 October 2020, 18:31:41 UTC
e4b3689 docs: Move scalability guide under Operations Signed-off-by: Paul Chaignon <paul@cilium.io> 01 October 2020, 18:31:41 UTC
b769c64 docs: operations/ dir to match displayed structure Signed-off-by: Paul Chaignon <paul@cilium.io> 01 October 2020, 18:31:41 UTC
175c7da datapath/connector: move CheckLink to daemon/cmd This function is only used in daemon/cmd in a single place, so move it there and unexport it. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 01 October 2020, 18:29:39 UTC
96648d0 datapath/connector: remove unused funcs The last remaining user of DeriveEndpointFrom was removed by commit 532ad9d44a6f ("rm pkg/workloads"). GetNetInfoFromPID and GetVethInfo were only used by DeriveEndpointFrom, so remove them as well. Signed-off-by: Tobias Klauser <tklauser@distanz.ch> 01 October 2020, 18:29:39 UTC
7257d4a install: RBAC permissions for finalizers subresources Since Cilium sets ownership references on pods, it needs permission to delete pods via finalizers and for that purpose it also needs permissions to set the finalizers on pods. This change is required for OpenShift, however it's based on the GC admission controller that was introduced in Kubernetes 1.5 (https://github.com/kubernetes/kubernetes/pull/34829). Also add explicit permissions for finalizers on all CRs, to ensure that agent and operator can set finalizers on their own resources. Signed-off-by: Ilya Dmitrichenko <errordeveloper@gmail.com> 01 October 2020, 16:42:55 UTC
b6d0054 test: enable operator metrics in stresspolicy suite Signed-off-by: Maciej Kwiek <maciej@isovalent.com> 01 October 2020, 16:20:30 UTC
657171f vagrant: bump bpf-next vagrant box version Pull in latest BPF kernel features from bpf-next [0]. [0] https://lore.kernel.org/bpf/cover.1601477936.git.daniel@iogearbox.net/ Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 01 October 2020, 12:29:36 UTC
18dc9f2 envoy: Stop using deprecated filter names Stop using deprecated Envoy filter names in order to get rid of deprecation warning logs. Signed-off-by: Jarno Rajahalme <jarno@covalent.io> 01 October 2020, 09:27:28 UTC
28b4c96 fsnotify: correctly check for event operation fsnotify Event.Op is a bit mask and testing for strict equality might not detect the event operation correctly. This patch makes the checks for fsnotify event operations consistent with the usage documented at https://github.com/fsnotify/fsnotify. Signed-off-by: Alexandre Perrin <alex@kaworu.ch> 01 October 2020, 08:56:44 UTC
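The difference between a strict equality test and a bit-mask test can be sketched as follows. To stay self-contained, this example defines a local stand-in `Op` type whose constants mirror the bit layout of `fsnotify.Op`; the `has` helper is a name introduced here for illustration and is not part of the patch.

```go
package main

import "fmt"

// Op is a local stand-in that mirrors the bit-mask layout of fsnotify.Op.
// It is defined here only so the example runs without external packages.
type Op uint32

const (
	Create Op = 1 << iota
	Write
	Remove
	Rename
	Chmod
)

// has reports whether every bit of want is set in op.
func has(op, want Op) bool {
	return op&want == want
}

func main() {
	// A single event may carry several operations at once.
	op := Write | Chmod

	// Strict equality misses the write because Chmod is also set.
	fmt.Println(op == Write) // false

	// Masking detects the write regardless of other bits.
	fmt.Println(has(op, Write)) // true
}
```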
b71cf0d contrib: Improve start-release.sh script Due to an extra `v` in the branch name, this script would fail with: $ ~/git/cilium/contrib/release/start-release.sh v1.6.12 128 fatal: 'origin/vv1.6' is not a commit and a branch 'pr/prepare-v1.6.12' cannot be created from it Signal ERR caught! Traceback (line function script): 62 main /home/joe/git/cilium/contrib/release/start-release.sh Fix it. While we're at it, update the instructions at the end for next steps, since there's also now a `submit-backport.sh` script to send the PR from the CLI. Signed-off-by: Joe Stringer <joe@cilium.io> 01 October 2020, 08:11:45 UTC
57d3473 bugtool: get bpffs mountpoint from /proc/self/mounts Rather than hardcoding the /sys/fs/bpf value in bugtool, use the `mountinfo` package (which exposes the information in /proc/self/mounts) to determine the correct mountpoint for the BPF filesystem. Fixes: #13218 Signed-off-by: Gilberto Bertin <gilberto@isovalent.com> 01 October 2020, 07:50:19 UTC