e731065 dev VM: bump k8s version to v1.14.0, CNI to 0.7.5 and etcd to v3.3.12 Signed-off-by: André Martins <andre@cilium.io> 27 March 2019, 15:12:14 UTC
c6cdb7d vendor: update github.com/containernetworking/cni to v0.7.0-rc2 Signed-off-by: André Martins <andre@cilium.io> 27 March 2019, 14:09:25 UTC
996cba0 update loopback CNI plugin to v0.7.5 in runtime docker image Signed-off-by: André Martins <andre@cilium.io> 27 March 2019, 14:09:25 UTC
f62cc7b vendor: update github.com/containernetworking/plugins to v0.7.5 This is the same version used by k8s so we should be using the same one Signed-off-by: André Martins <andre@cilium.io> 27 March 2019, 14:09:25 UTC
3c16e51 remove .* from .dockerignore Ignoring .* causes the .git directory to be ignored, and therefore it's not possible to derive the commit ID when building docker images. Fixes: f332375063be ("Add dockerignore file.") Signed-off-by: André Martins <andre@cilium.io> 27 March 2019, 12:38:36 UTC
3a570b9 linux/route: use replace route instead of lookup As lookups can be expensive for a large number of routes, we can use route replace directly instead. Signed-off-by: André Martins <andre@cilium.io> 27 March 2019, 10:53:37 UTC
e956cfb linux/route: use RouteListFiltered to filter routes The netlink library provides a function to filter routes based on specific filters; to avoid fetching all routes every time we want to perform a lookup, we can use RouteListFiltered instead. Signed-off-by: André Martins <andre@cilium.dev> 27 March 2019, 10:53:37 UTC
0a38063 test: Fix copyright header for scopes ginkgo-ext Signed-off-by: Joe Stringer <joe@cilium.io> 27 March 2019, 10:47:58 UTC
90d861b test: Fix some misc spelling mistakes Signed-off-by: Joe Stringer <joe@cilium.io> 27 March 2019, 10:47:58 UTC
443b989 bpf: Remove cidr_addr from policy check The CIDR parameters to this function are no longer used and can be removed from the function parameters as well as callers. Fixes: #6178 Signed-off-by: Umesh Keerthy B S <umesh.freelance@gmail.com> 27 March 2019, 10:47:20 UTC
90a36a9 Revert "policy: Simplify l7 rule generation for l4-only rules" This reverts commit 13c705070d71e44b11a1c0f57b8de89a89c93e6b. The wildcardL3L4Rules() function is used from L3-only rules as well as L3-dependent L4 and L4-only rules, so in these cases the empty endpoint set has different meaning: In L3-only, it means "select nothing at all"; in any L4 type of rule, it means "select all endpoints with this port" - ie a wildcard. Shift this code back to where it used to be, and add a comment to describe why it's there. Signed-off-by: Joe Stringer <joe@cilium.io> 27 March 2019, 05:16:00 UTC
13c7050 policy: Simplify l7 rule generation for l4-only rules This logic is almost equivalent to the version introduced in the previous patch, however now if the wildcard selector is explicitly defined in the rules then we won't end up adding yet another wildcard selector to iterate through. In cases where the wildcard selector is early on in the set of `toEndpoints`/`fromEndpoints`, this will reduce iteration over those endpoint selector slices. Signed-off-by: Joe Stringer <joe@cilium.io> 26 March 2019, 22:07:08 UTC
1ef4ec5 policy: Generate L7 allow-all for L4-only rules Previously, if an L4-only rule shadowed an L7 rule on the same port/protocol, we would redirect the traffic to the proxy, but we would *not* generate xDS filters to allow that traffic. This would result in unexpectedly dropping traffic that should otherwise be allowed. The included test previously failed, with the functional change in this commit it now passes. Fixes: #7438 Signed-off-by: Joe Stringer <joe@cilium.io> 26 March 2019, 22:07:08 UTC
b731c6b daemon/policy: Share labels declarations in tests Signed-off-by: Joe Stringer <joe@cilium.io> 26 March 2019, 22:07:08 UTC
cd8d72e daemon/policy: Refactor test endpoint initialization Consolidate the initialization of the test endpoints to simplify the core logic of individual tests. Signed-off-by: Joe Stringer <joe@cilium.io> 26 March 2019, 22:07:08 UTC
f63302b daemon/policy: Consolidate policy testing primitives Reuse the same CNP, PNP primitives for policy testing. Signed-off-by: Joe Stringer <joe@cilium.io> 26 March 2019, 22:07:08 UTC
5ca566c test update k8s to 1.11.9, 1.12.7, 1.13.5 and 1.14.0 Signed-off-by: André Martins <andre@cilium.io> 26 March 2019, 21:04:44 UTC
3947585 Test: Wait for endpoints to be deleted. In the following CI failure, toServices failed because pods were scheduled to be deleted. This PR ensures that all pods are deleted when reaching the BeforeAll. Link: https://jenkins.cilium.io/job/cilium-ginkgo/job/cilium/job/master/2605/testReport/junit/k8s-1/14/K8sServicesTest_External_services_To_Services_first_endpoint_creation/ Error: ``` /home/jenkins/workspace/cilium-ginkgo_cilium_master/src/github.com/cilium/cilium/test/ginkgo-ext/scopes.go:366 Pods are not ready after timeout Expected <*errors.errorString | 0xc4200c3a20>: { s: "There are some pods with filter that are marked to be deleted", } to be nil /home/jenkins/workspace/cilium-ginkgo_cilium_master/src/github.com/cilium/cilium/test/k8sT/Services.go:248 ``` Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 26 March 2019, 15:59:24 UTC
dd18958 bpf: suppress needless CT generation whose saddr/daddr are target addr When a connection is newly created against a service, we generate two CT entries in both loopback and non-loopback cases. Running the same logic as the one for the loopback scenario, where reverse SNAT is required, against the non-loopback scenario creates a useless CT entry whose saddr and daddr are both the same target addr. Regardless of other fields in the tuple, it's harmless since it's considered martian and will never be utilized in an unexpected way in any place, but let's not waste CT map space. Signed-off-by: Koichiro Den <den@klaipeden.com> 26 March 2019, 01:32:06 UTC
14f4053 bpf: clean up unused args Signed-off-by: Koichiro Den <den@klaipeden.com> 26 March 2019, 01:32:06 UTC
e20b7c9 docs: Fix triage steps enumeration The wrong indent messed up the #. numbering scheme. Signed-off-by: Joe Stringer <joe@cilium.io> 25 March 2019, 23:01:27 UTC
3fa3ce9 docs: Fix up triage pipeline links Signed-off-by: Joe Stringer <joe@cilium.io> 25 March 2019, 23:01:27 UTC
03b6a12 bpf: improve ethertype validation The main motivation here is to suppress the misleading DROP notification from handle_xgress() which says "reason Invalid source ip" when the frame is not Ethernet II; e.g., an LLC frame whose skb->protocol is set to ETH_P_IP or ETH_P_IPV6 leads to the aforementioned message. Let's directly validate the ethertype instead of checking skb->protocol. For clarity this patch introduces the error code DROP_UNSUPPORTED_L2. Fixes: #6305 - LLC frames are dropped due to "Invalid source ip" Signed-off-by: Koichiro Den <den@klaipeden.com> 25 March 2019, 22:42:31 UTC
f2dc377 CNP Status: Fix issues on invalid CNP. If the cnp is not correct the status is not correctly updated because `cnp.updateStatus` is never called, and this is an unexpected behaviour, the status field should have the following information: ``` { "nodes": { "k8s1": { "error": "Invalid CiliumNetworkPolicy specs: L7 rules can only apply to TCP (not ANY) except for DNS rules", "lastUpdated": "2019-03-22T14:37:26.711811883Z" } } } ``` An invalid rule can be the following: ``` { "apiVersion": "cilium.io/v2", "kind": "CiliumNetworkPolicy", "metadata": { "annotations": { "kubectl.kubernetes.io/last-applied-configuration": "{\"apiVersion\":\"cilium.io/v2\",\"kind\":\"CiliumNetworkPolicy\",\"metadata\":{\"annotations\":{},\"labels\":{\"test\":\"policygen\"},\"name\":\"bc6028dc\",\"namespace\":\"default\"},\"specs\":[{\"endpointSelector\":{\"matchLabels\":{\"any:id\":\"bc6028dc-8d411be1\"}},\"ingress\":[{\"fromEndpoints\":[{}]}]},{\"egress\":[{\"toPorts\":[{\"ports\":[{\"port\":\"80\"}],\"rules\":{\"http\":[{\"method\":\"GET\",\"path\":\"/private\"}]}}]}],\"endpointSelector\":{\"matchLabels\":{\"any:id\":\"bc6028dc-636ccb26\"}}}]}\n" }, "creationTimestamp": "2019-03-22T00:16:00Z", "generation": 1, "labels": { "test": "policygen" }, "name": "bc6028dc", "namespace": "default", "resourceVersion": "10003", "selfLink": "/apis/cilium.io/v2/namespaces/default/ciliumnetworkpolicies/bc6028dc", "uid": "ac3f2a26-4c37-11e9-95e5-080027051dad" }, "specs": [ { "endpointSelector": { "matchLabels": { "any:id": "bc6028dc-8d411be1" } }, "ingress": [ { "fromEndpoints": [ {} ] } ] }, { "egress": [ { "toPorts": [ { "ports": [ { "port": "80" } ], "rules": { "http": [ { "method": "GET", "path": "/private" } ] } } ] } ], "endpointSelector": { "matchLabels": { "any:id": "bc6028dc-636ccb26" } } } ] } ``` This commit fix the update on failed rules. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 25 March 2019, 15:14:14 UTC
dbb2ef1 endpoint: only start EventQueue once endpoint is exposed via endpointmanager This ensures that goroutines are not leaked for endpoints which cannot be inserted into the endpointmanager / are not inserted into the endpointmanager due to some error before insertion. Fixes: 04a49b0898ed008f6f332ae55ba4acfcc396a40f ("endpoint: add EventQueue field") Signed-off by: Ian Vernon <ian@cilium.io> 23 March 2019, 23:15:05 UTC
5585765 prometheus/templates: update grafana dashboard Signed-off-by: André Martins <andre@cilium.io> 23 March 2019, 23:14:33 UTC
2be6419 k8s: ignore etcd-operator and gcp-controller-manager endpoints by default etcd-operator and gcp-controller-manager endpoints are only used for leader election, which causes those endpoints to have frequent updates. As those endpoints are not used in a k8s service we can safely ignore them by default. Signed-off-by: André Martins <andre@cilium.io> 23 March 2019, 23:00:31 UTC
ada4e28 k8s: set endpoint watcher selector as a CLI option As certain environments require different tweaks, the endpoint watcher selector needs to be an option so users can set it according to their environment. Signed-off-by: André Martins <andre@cilium.io> 23 March 2019, 23:00:31 UTC
e504986 daemon: fix namespace controller to watch for correct object type This should be `v1.Namespace`, not `v1.Node`. Fixes: a71dfb49bb55e879dc7449c95f47c527fbea70a8 ("pkg/k8s: use official k8s library to watch events from kube-apiserver") Signed-off by: Ian Vernon <ian@cilium.io> 23 March 2019, 04:10:24 UTC
b8a726b Fixes: #7440 - remove deprecated PolicyRegenerationTimeSquare and PolicyRegenerationTime metrics Signed-off-by: Mark deVilliers markdevilliers@gmail.com 22 March 2019, 23:29:16 UTC
889b7fa optimize processing of endpoints when reacting to rule updates The set of endpoints which need to be regenerated by a given policy update and the set which need to have their revision bumped directly are always disjoint. Furthermore, when evaluating whether a set of rules selects a given endpoint, the set of endpoints which needs to be regenerated can be updated, and no more rules need to be evaluated for that endpoint, as the set of endpoints to be regenerated has already been updated for it. This means that said endpoint can be removed from the original set containing all endpoints. The current flow when reacting to rule updates is as follows: * Add all endpoints to the set of endpoints which need to have their revision bumped. * If an endpoint needs to be regenerated because rules which were added, deleted, or updated select the endpoint, then remove the endpoint from the set of all endpoints and add it to the set of endpoints which need to be regenerated. No further processing needs to be performed for this endpoint apart from regenerating it later. * After all endpoints are processed in relation to all rule changes, the set of endpoints which formerly contained all endpoints will eventually contain only the endpoints whose revision needs to be bumped. This makes reacting to rule updates more performant, as the iteration of rules for a given endpoint is no longer always `O(n)`, but in the best case may be `O(1)`. Furthermore, an endpoint is never evaluated again for rule matching (which is costly) after it is determined that it needs to be regenerated. A side-effect of this is that the updating of whether rules select the endpoint is deferred to when the endpoint is being regenerated, but this will only happen once per rule per endpoint. This is OK because the caches within the rules are lazily updated. Signed-off by: Ian Vernon <ian@cilium.io> 22 March 2019, 23:24:01 UTC
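The disjoint-set flow above can be sketched as a small, self-contained Go snippet (hypothetical names, not Cilium's actual API): every endpoint starts in the revision-bump set, and moves to the regeneration set as soon as one rule change selects it, after which no more rules are evaluated for it.

```go
package main

import "fmt"

// partition starts with every endpoint in the revision-bump set and moves an
// endpoint into the regeneration set as soon as one rule change selects it;
// no further rule changes are evaluated for that endpoint.
func partition(all []uint16, ruleChanges []func(uint16) bool) (regen, bump map[uint16]bool) {
	bump = make(map[uint16]bool, len(all))
	for _, id := range all {
		bump[id] = true
	}
	regen = make(map[uint16]bool)
	for id := range bump {
		for _, selects := range ruleChanges {
			if selects(id) {
				delete(bump, id) // remove from the bump set...
				regen[id] = true // ...and regenerate instead
				break            // best case O(1) instead of O(n) rule evaluations
			}
		}
	}
	return regen, bump
}

func main() {
	selectsEven := func(id uint16) bool { return id%2 == 0 }
	regen, bump := partition([]uint16{1, 2, 3, 4}, []func(uint16) bool{selectsEven})
	fmt.Println(len(regen), len(bump)) // endpoints 2 and 4 regenerate; 1 and 3 get a revision bump
}
```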
0d1f22b dnsproxy: Return DNS response before cache update We may have slow update behaviour when running the proxy on a loaded system. In these cases it is more reliable to return the DNS response to the requesting pod before we block. This assumes that it is better to absorb a delay in regenerating in a TCP connection setup than to delay the DNS response beyond its timeout. Signed-off-by: Ray Bejjani <ray@covalent.io> 22 March 2019, 23:21:19 UTC
3f12fb6 cilium: ipsec, add cleanup xfrm routine After a timeout delete xfrm rules not matching current version. In this way we can perform non-disruptive updates to the version. This still requires a cilium restart however until we attach an inotify event to the file. xfrm state cleanup is still an over-approximation because states do not use mark values yet. A follow up series can clean this up. Signed-off-by: John Fastabend <john.fastabend@gmail.com> 22 March 2019, 22:21:14 UTC
b698972 cilium: ipsec, support rolling updates Currently, rolling updates may get stuck due to a time period in which some set of nodes has started with encryption enabled while another set exists without encryption enabled. At this point these two sets of nodes can only communicate from the non-encrypted to the encrypted set. The set with encryption enabled will encrypt traffic that will in turn be dropped by the set that has yet to enable encryption. To resolve this, we make encryption a property of the endpoint id, keeping the key identifier with the endpoint to inform cilium which key should be used during an upgrade. Because we use the mark space to identify keys, we currently limit the key space to two keys. After this, key secrets will need to be updated to include an id field as follows, keys: "1 hmac(sha256) 0123456789abcdef0123456789abcdef cbc(aes) 0123456789abcdef0123456789abcdef" where '1' is the id here. IDs are enforced to be less than 16. This is a bit arbitrary; we could go as high as 256 without hitting mark bit limits. However, 16 feels sufficient, and we can't take bits back later, so start low and bump up if needed. The id '0' is special and should not be used; it represents the absence of keys in the datapath. Signed-off-by: John Fastabend <john.fastabend@gmail.com> 22 March 2019, 22:21:14 UTC
506ccb8 cilium: ipsec, wildcard out rules and remove localhost rules Currently, OUT xfrm rules use the full (src,dst,spi) tuple. The original thinking on this was that we wanted to ensure matches only on relevant IP addresses. However, now that both state and policy are further restricted by mark values, we can drop the src piece without worrying about having unintended matches. Signed-off-by: John Fastabend <john.fastabend@gmail.com> 22 March 2019, 22:21:14 UTC
f4d2f69 cilium: ipsec, move ipv4 rules before feeders It is possible that ipv6 updates may fail due to missing kernel modules or ip6tables filters. In this case we still want the ipv4 rules to be in place, so reorder ipv4 before ipv6 setup. Signed-off-by: John Fastabend <john.fastabend@gmail.com> 22 March 2019, 22:21:14 UTC
8ac4d10 FQDN: Avoid a data race on FQDN policyAdd This commit changes slightly how policyAdd handles the case of FQDN rules. The code checks that the FQDN rule UUID is still present in the fqdn.RuleGen; if it is not present, it means that `daemon.policyAdd` has already updated the rule, so the rule is no longer valid. There is a case where the rule does not have a UUID; this means that it is a new rule and will be sent to policyAdd. Checking RuleGen.allRules is O(1) since it is a map; hopefully this will be replaced in the future when policyRepo has a key-value store and things can be retrieved without looping over it. This is a new commit, and it closes PR#7220 where the issue was discussed in more depth. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 22 March 2019, 17:56:39 UTC
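The key-with-id secret format described in this commit can be sketched as a tiny parser (a hypothetical helper, not Cilium's actual loader): the leading field is the id, which must be non-zero and below 16 since id 0 is reserved for "no key in the datapath".

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// maxKeyID reflects the mark-bit budget described in the commit message;
// id 0 is reserved to mean "no key" and must never be assigned.
const maxKeyID = 16

// parseKeyID extracts the leading id from a secret line such as
// "1 hmac(sha256) <hex> cbc(aes) <hex>" and validates its range.
func parseKeyID(line string) (uint8, error) {
	fields := strings.Fields(line)
	if len(fields) < 2 {
		return 0, fmt.Errorf("malformed key line")
	}
	id, err := strconv.ParseUint(fields[0], 10, 8)
	if err != nil {
		return 0, fmt.Errorf("invalid key id %q: %v", fields[0], err)
	}
	if id == 0 || id >= maxKeyID {
		return 0, fmt.Errorf("key id %d out of range (1-%d)", id, maxKeyID-1)
	}
	return uint8(id), nil
}

func main() {
	id, err := parseKeyID("1 hmac(sha256) 0123456789abcdef0123456789abcdef cbc(aes) 0123456789abcdef0123456789abcdef")
	fmt.Println(id, err) // 1 <nil>
}
```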
d28a883 make selective regeneration of endpoints configurable Provide an option to end-users of Cilium to always regenerate all endpoints upon policy changes instead of selectively regenerating endpoints based on the content of the policy change. The option is called "enable-selective-regeneration", and is hidden by default. Signed-off by: Ian Vernon <ian@cilium.io> 22 March 2019, 17:55:32 UTC
d1890f8 Test: Delete KVStore event on CIDR identities To validate that a delete event does not affect local identities. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 22 March 2019, 17:36:33 UTC
6b3ffa8 Test: Added runtime test for delete events on KVstore. Add a new test where we validate that a delete event happens in the KVStore but the identity is not released if it is in use. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 22 March 2019, 17:36:33 UTC
62f43aa test: run k8s 1.14.0-rc.1 by default on all PRs k8s 1.14.0-rc.1 is a stable enough release that we can start using it to test all PRs Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 17:27:34 UTC
e50045a test: set coredns deployment closer to the upstream version Users will most likely use a coredns deployment based on the upstream version available, so our CI should use a configuration closer to the one available upstream. Configuration changed from the upstream version: - replaced [0] with [1] - removed [2] to avoid caching entries for more than 30 seconds, as tests can take less than 30 seconds to complete. [0] ``` forward . /etc/resolv.conf ``` [1] ``` proxy . /etc/resolv.conf { fail_timeout 10s max_fails 0 } ``` [2] ``` cache 30 ``` Signed-off-by: André Martins <andre@cilium.io> 22 March 2019, 17:27:34 UTC
24f1bbd k8s: generate code from k8s 1.14.0-rc.1 Generated code based on the new k8s 1.14.0-rc.1 k8s code-generator Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 17:27:34 UTC
83caf91 vendor: update dependencies to k8s 1.14.0-rc.1 k8s 1.14.0-rc.1 is a stable enough release, so we can update the k8s libraries to this version. Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 17:27:34 UTC
b59acba ipcache: Allow CIDR ipcache overwrite from all sources The existing logic prevented any ipcache update of an entry that was triggered by a CIDR policy entry. This meant that a CIDR policy covering yet-unused PodIPs in the PodCIDR could prevent the ipcache from being updated to map to PodIPs used later on. Fixes: f3bbcd8e886 ("identity: Use local identities to represent CIDR") Signed-off-by: Thomas Graf <thomas@cilium.io> 22 March 2019, 14:18:06 UTC
d1c6f2f pkg/k8s: add informer benchmark tests for Node type ``` $ go test --check.vv --check.b --check.bmem --check.f="Benchmark_.*Informer" PASS: informer_test.go:541: K8sIntegrationSuite.Benchmark_Informer 1000000 1504 ns/op 1106 B/op 5 allocs/op PASS: informer_test.go:550: K8sIntegrationSuite.Benchmark_K8sInformer 100000 13991 ns/op 12995 B/op 75 allocs/op ``` Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 12:01:16 UTC
984e34e k8s: switch to own k8s types implementation This commit switches all code from using k8s types directly to using cilium's representation of the k8s types. From this commit forward we can expect the following structures to have lower memory consumption: - Node - Pod - CiliumNetworkPolicy, only in k8s >= 1.13, since we don't need to keep the CNP Status of all the other nodes locally. Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 12:01:16 UTC
027ccdc pkg/k8s: add converter functions from k8s to cilium types These converter functions will be used in the NewInformer. The informer will receive events with k8s types, and these functions will convert those types into cilium types, copying only the minimum information Cilium needs. Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 12:01:16 UTC
19fb03e pkg/k8s: add Informer for cilium's k8s types This new informer will allow objects to be converted into another representation of the same object and store this new representation into the local cache. Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 12:01:16 UTC
8477e5b k8s/types: add internal representation of k8s types As cilium does not make use of all fields from the k8s types, we need to create new types that cilium can use internally for better memory efficiency. Signed-off-by: André Martins <andre@cilium.dev> 22 March 2019, 12:01:16 UTC
763329d policy: correctly do closure on variables being iterated over Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
c3e5781 endpoint: make per-endpoint queue size configurable The size of the EventQueue per-endpoint can now be configured with the "endpoint-queue-size" option. The default size is 25. Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
1739c1a endpoint: remove useless comment Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
b0a5a9a endpoint: clarify need for synchronous enqueueing of regeneration event Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
55b0fd3 selectively regenerate endpoints on policy change Previously, when policy updates occurred, all endpoints were regenerated, regardless of whether or not the rules which were updated, added, or deleted actually selected all endpoints. This commit removes this kitchen-sink approach to policy calculation when the repository is modified, and instead regenerates only the endpoints that are selected by the rules that are updated, added, or deleted. If an endpoint is not selected by the changes in the policy repository, then its policy revision number is bumped. These events must be synchronized / mutually exclusive in their operation, so the endpoint's EventQueue is utilized to enqueue revision bump events and regeneration events. This is because an Endpoint's policy revision should not be bumped while it is being regenerated, as this could signal that the Endpoint is realizing a policy revision which it hasn't yet (as it is still regenerating!). Now that endpoints are selectively regenerated per change in the policy repository, the ordering of when all work associated with a given policy update occurs is important. Take the following example: * Time X: Policy imported with rules that select endpoint A. Eventually, this endpoint will be regenerated by having a regeneration event pushed onto its event queue, and this may be done after the policy has been added, as it is done in an asynchronous manner. * Time X+1: Policy imported with rules that do not select endpoint A. Eventually, this endpoint will not be regenerated, and instead will have its policy revision incremented, after the API has returned, similar to Time X. However, because the regeneration triggered at Time X and the policy revision incrementing at Time X+1 are launched as goroutines, there is no guarantee about when either event will be enqueued for endpoint A. 
This may mean that the policy revision event will get enqueued *before* the regeneration event for the endpoint, which means that the endpoint will falsely report that it is implementing that policy revision, while it has not yet, because the regeneration for the rules from Time X has not occurred yet. This means that there needs to be a way to ensure that updating of the policy repository, and the caches within the policy repository rules indicating which endpoints the rules select, is done in-order. The "RuleReactionQueue" within the policy repository is used to do this; it serializes the 'reacting' to rule updates (i.e., it manages the enqueuing of events for endpoints in order). Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
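The per-endpoint serialization described above can be sketched as a minimal channel-backed queue (a simplified illustration, not Cilium's actual EventQueue implementation): a single drain goroutine runs handlers strictly in enqueue order, so a revision-bump event enqueued after a regeneration can never be observed before it.

```go
package main

import (
	"fmt"
	"sync"
)

// eventQueue serializes endpoint events: one drain goroutine runs the
// handlers in strict FIFO order, so a revision bump enqueued at Time X+1
// cannot overtake a regeneration enqueued at Time X.
type eventQueue struct {
	events chan func()
	wg     sync.WaitGroup
}

func newEventQueue(size int) *eventQueue {
	q := &eventQueue{events: make(chan func(), size)}
	q.wg.Add(1)
	go func() {
		defer q.wg.Done()
		for ev := range q.events {
			ev() // events run one at a time, in enqueue order
		}
	}()
	return q
}

func (q *eventQueue) enqueue(ev func()) { q.events <- ev }

// close stops accepting events and waits for queued ones to finish.
func (q *eventQueue) close() {
	close(q.events)
	q.wg.Wait()
}

func main() {
	var order []string
	q := newEventQueue(25)
	q.enqueue(func() { order = append(order, "regenerate") }) // Time X
	q.enqueue(func() { order = append(order, "bump") })       // Time X+1
	q.close()
	fmt.Println(order) // [regenerate bump]
}
```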
f71d87a endpointmanager: signal when work is done Provide new ways of signaling when various operations are completed by the endpointmanager for the following cases: * Removal of endpoint: when an endpoint is removed, all references to it must also be removed from the policy repository in the Daemon as well. Block on releasing the endpoint ID until it is removed from all caches in the policy repository, but do so in an asynchronous manner so that `endpointmanager.Remove` does not get stuck for a long period of time waiting for said work to complete. * Provide new functions to indicate when regenerations for a set of endpoints have been enqueued for each endpoint. This is useful for ensuring that a given regeneration is queued up before some other event, e.g., incrementing the policy revision for the endpoint for a later revision than the regeneration which is being queued. Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
04a49b0 endpoint: add EventQueue field This will be used to schedule events for a given Endpoint which cannot run concurrently, such as policy revision being increased and regeneration. When an endpoint is deleted, its event queue is closed. Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
d81fdbc policy: repository updates caches within rules * Update `ResolvePolicy` to take in a `SecurityIdentity` instead of just the `LabelArray` within a `SecurityIdentity` so that the `EndpointsSelected` field within each `rule` can be kept up-to-date in relation to this endpoint. Update several related, unexported functions within the policy package to account for this as well. * Update `AddListLocked`, `DeleteByLabelsLocked`, to return the slice of `rule` which was added. This slice will be utilized to update the caches within each `rule` which specify what endpoints the rule selects. The analysis is performed in relation to these rules so that it can be determined which set of endpoints on the node need to be regenerated, as the rules which were added or deleted may not select all endpoints. Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
800e260 policy: cache which endpoints a given rule selects Add a new field within the rule type in `pkg/policy`, `ruleMetadata`. This new field contains two fields: * EndpointsSelected: tracks which node-local endpoints are selected by this rule. It maps each endpoint ID to its corresponding security identity. If a rule does not select a given endpoint, it will never be added to this set. * AllEndpoints: tracks which node-local endpoints have been analyzed in relation to this rule. It is implemented as a set of uint16, with each entry corresponding to an endpoint ID. After all endpoints have their policy calculated in relation to this rule, this set will contain the IDs of all endpoints. Determining whether rules select a given endpoint is performed via lazy-evaluation of the caches within the rules in the following manner. 1. If rule is in AllEndpoints a. Check whether it is in EndpointsSelected. i. If not, rule does not select the endpoint. b. If it is in EndpointsSelected check that the identity in EndpointsSelected mapping to this endpoint ID matches the one for which we are evaluating. This handles the case of endpoint identity change. 2. Else, determine whether rule selects given endpoint by falling back to label-based matching. a. After matching, update AllEndpoints as this endpoint has been analyzed in relation to this rule. b. If label-based matching returns true, update EndpointsSelected. This commit also provides means of updating a given list of rules in relation to a given endpoint and signaling when it is done via use of a WaitGroup. This will be utilized when policy rules are deleted / added from the policy repository. Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
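The lazy-evaluation steps above can be sketched as a small Go snippet (hypothetical field and function names, mirroring but not reproducing Cilium's `ruleMetadata`): caches are consulted first, and costly label matching only runs on a cache miss or when the endpoint's identity has changed.

```go
package main

import "fmt"

// ruleMetadata caches, per rule, which node-local endpoints it selects.
type ruleMetadata struct {
	endpointsSelected map[uint16]uint32   // endpoint ID -> security identity at match time
	allEndpoints      map[uint16]struct{} // endpoint IDs already analyzed for this rule
}

// selects consults the caches first and only falls back to (costly)
// label matching on a miss or when the endpoint's identity has changed.
func (m *ruleMetadata) selects(epID uint16, identity uint32, labelMatch func() bool) bool {
	if _, analyzed := m.allEndpoints[epID]; analyzed {
		cached, ok := m.endpointsSelected[epID]
		if ok && cached == identity {
			return true // cache hit, identity unchanged
		}
		if !ok {
			return false // analyzed before, known not to be selected
		}
		// identity changed: fall through and re-evaluate below
	}
	m.allEndpoints[epID] = struct{}{} // mark as analyzed for this rule
	if labelMatch() {
		m.endpointsSelected[epID] = identity
		return true
	}
	delete(m.endpointsSelected, epID)
	return false
}

func main() {
	m := &ruleMetadata{endpointsSelected: map[uint16]uint32{}, allEndpoints: map[uint16]struct{}{}}
	calls := 0
	match := func() bool { calls++; return true }
	m.selects(1, 100, match) // miss: label matching runs
	m.selects(1, 100, match) // hit: answered from the cache
	fmt.Println(calls) // 1
}
```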
b7a8dfb policy: add new types for tracking identifiers of consumers of policy These types will be used in an upcoming commit to track which rules select which endpoints within policy rules. The two types are: * IDSet: is a wrapper around a set of numeric IDs, with a mutex to protect access. * IdentityConsumer is an interface which describes anything which has a unique node-local identifier, has a security identity, and has a policy revision which can be incremented. This is currently implemented by the Endpoint type in `pkg/endpoint`. Signed-off by: Ian Vernon <ian@cilium.io> 21 March 2019, 22:00:50 UTC
546bfa5 pkg/k8s: update copyright headers of k8s generated code copyright headers are still pointing to 2017, updating date to 2018 Signed-off-by: André Martins <andre@cilium.dev> 21 March 2019, 19:06:25 UTC
5eb05f6 test/health: Check that peers are discovered Previously, this test would only check that the "status" field reports no issues. However, if the peer was never discovered, this would still be the case even though something has gone wrong. First, check if the peer field is there. If it is not there, fail. Then, check that the status indicates success. The test should only pass if the peer has been probed AND there is no error response. Signed-off-by: Joe Stringer <joe@cilium.io> 21 March 2019, 18:56:15 UTC
83dc10b node: Fix health endpoint IP fetch with IP disable If either IPv4 or IPv6 was disabled, the health endpoint IP fetch would previously completely fail (silently), which would cause health endpoint connectivity probing to be disabled. Fixes: #7456 Signed-off-by: Joe Stringer <joe@cilium.io> 21 March 2019, 18:56:15 UTC
fcc2a2c k8s: Fix node equality function for health IPs Fix up the node equality function which caused Cilium to ignore updates for health IPs, which could cause cilium-health IP changes to be ignored. Signed-off-by: Joe Stringer <joe@cilium.io> 21 March 2019, 18:56:15 UTC
9ab5e64 Documentation: Add Kubernetes 1.14 support. - Add Kubernetes 1.14 support in documentation. - Disable 1.8 and 1.9 documentation since those versions are no longer supported. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 21 March 2019, 14:20:01 UTC
b7361fa Kubernetes examples, delete 1.{8,9} version. Since they are no longer supported, delete the example manifest files. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 21 March 2019, 14:20:01 UTC
f5d4c2d Test: Delete 1.9 and 1.8 releases from testing. Since they are no longer supported, delete them for the new Cilium version. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 21 March 2019, 14:20:01 UTC
eacd41c Examples: Added kubernetes 1.14 manifest Add 1.14 manifests files Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 21 March 2019, 14:20:01 UTC
3825904 Test: Add Kubernetes 1.14-rc.1 to the build system. Add kubernetes 1.14 to the build. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 21 March 2019, 14:20:01 UTC
d751742 Add golang.org/x/time/rate to dependencies Signed-off-by: Maciej Kwiek <maciej@covalent.io> 21 March 2019, 13:56:23 UTC
b4e2180 Add rate limiting for etcd kvstore operations Signed-off-by: Maciej Kwiek <maciej@covalent.io> 21 March 2019, 13:56:23 UTC
635a5f8 k8s: decrease the log level of some messages Some messages are not useful in production environments, so we can downgrade those to debug level. Signed-off-by: André Martins <andre@cilium.dev> 21 March 2019, 13:52:30 UTC
ae4979f policy: Simplify generation of L4 BPF keys An L4 policy, with or without L7 policy, can be represented as a single BPF key which wildcards the identity (by specifying it as 0), and specifying the port/protocol/direction for the key. Previously this code synthesised L3-dependent L4 keys for every identity in the system, which could cause an explosion of BPF map keys when L4-only or L4+L7 policies are in use, and would inject delays for allowing traffic on the L4 port when new identities appear on remote nodes. Improve this by only generating the single L4-only BPF key when L4 policy is in use. This will reduce the number of keys generated (saving policymap key space and CPU for iteration), and also make the policy more resilient during creation and deletion of identities, as the policy will apply regardless of whether identities have propagated across the cluster. Signed-off-by: Joe Stringer <joe@cilium.io> 20 March 2019, 23:47:57 UTC
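The wildcard-key idea above can be illustrated with a short Go sketch (a hypothetical key layout, not the actual BPF policymap struct): identity 0 matches any remote identity, so an L4-only allow needs a single key instead of one key per known identity.

```go
package main

import "fmt"

// policyKey illustrates a policymap key where identity 0 acts as a wildcard
// for the remote security identity (hypothetical layout).
type policyKey struct {
	identity uint32 // 0 == any remote identity
	dstPort  uint16
	nexthdr  uint8 // L4 protocol, e.g. 6 for TCP
	ingress  bool
}

// l4OnlyKeys returns the keys needed for an L4-only allow: a single
// identity-wildcard key, instead of one key per known identity. The policy
// therefore keeps applying even as identities appear and disappear.
func l4OnlyKeys(port uint16, proto uint8, ingress bool) []policyKey {
	return []policyKey{{identity: 0, dstPort: port, nexthdr: proto, ingress: ingress}}
}

// perIdentityKeys shows the old approach for contrast: one key synthesized
// for every identity currently known in the cluster.
func perIdentityKeys(identities []uint32, port uint16, proto uint8, ingress bool) []policyKey {
	keys := make([]policyKey, 0, len(identities))
	for _, id := range identities {
		keys = append(keys, policyKey{identity: id, dstPort: port, nexthdr: proto, ingress: ingress})
	}
	return keys
}

func main() {
	ids := []uint32{1001, 1002, 1003}
	fmt.Println(len(perIdentityKeys(ids, 80, 6, true))) // grows with the cluster: 3
	fmt.Println(len(l4OnlyKeys(80, 6, true)))           // always 1
}
```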
96fc70f endpoint: metrics: Add policy regeneration time stats This change deprecates the following metrics (which are going to be removed in Cilium 1.6): - `policy_regeneration_seconds_total` - `policy_regeneration_square_seconds_total` And introduces the new metric, `policy_regeneration_time_stats_seconds`, which is a labeled histogram with duration of each step of policy regeneration. Ref: #7229 Signed-off-by: Michal Rostecki <mrostecki@opensuse.org> 20 March 2019, 23:13:44 UTC
bf784ae eventqueue: correct case in documentation Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
9fd88ee daemon: remove useless comment Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
ce2f17e daemon: remove superfluous `select` statements No functional change is intended here. Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
9243fc3 policy: make queue sizes in policy repository configurable The size of the queues for policy-related events and reaction events can now be customized by end-users with the "policy-queue-size" option. The default value is 100. Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
c25b2a4 daemon: fix race in unit tests Now that `PolicyAdd` signals that it is done before updating of prefix lengths is finished, update the unit test accordingly to use `testutils.WaitUntil` to wait for prefix lengths to be correct instead of performing the assertion once and bailing out before the operation may have completed. Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
4e2f9e0 daemon: update documentation for PolicyAdd Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
bd20143 serialize policy repository update events The ordering of when all work associated with a given policy update occurs is important, as the policy repository is modified by the DNS proxy. Two EventQueues are added: one which serializes the processing of policy repository changes (add, delete, update), and one which serializes the 'reacting' to rule updates (will be used in the future). Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
add0d65 add eventqueue package This package implements a generic queue-based system for event processing in first-in, first-out order. Only one event can be consumed at a time per event queue. This is useful for ensuring that certain types of events do not run concurrently. An EventQueue may be closed, in which case all events which are queued up but have not been processed are cancelled (i.e., not run). It is guaranteed that no events will be scheduled onto an EventQueue after it has been closed; any event scheduled onto a closed EventQueue is cancelled immediately. For any event to be processed by the EventQueue, it must implement the `EventHandler` interface. This allows different types of events to be processed by anything which chooses to utilize an `EventQueue`. Signed-off by: Ian Vernon <ian@cilium.io> 20 March 2019, 22:26:46 UTC
57cdaf9 Bugtool: Fix header size on Tarball files Currently in the bugtool the tar file is not created because the size for elf objects [1] is not correct in the fileInfo, so the bugtool fails and no data can be retrieved. With this change we make sure that we send the correct size for the file. Error: ``` vagrant@k8s1:~$ kubectl -n kube-system exec -ti cilium-mvzwp -- bash root@k8s1:~# cilium-bugtool Deleted empty directory /tmp/cilium-bugtool-20190320-150406.345+0000-UTC-372992147/cmd Deleted empty directory /tmp/cilium-bugtool-20190320-150406.345+0000-UTC-372992147/conf Failed to create archive archive/tar: write too long root@k8s1:~# ``` [1] File that fails: /var/run/cilium/state/templates/93e0f422d723d70b202357a3c412f26374f8e233/bpf_lxc.o Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 20 March 2019, 21:37:20 UTC
80912fc test: Fix skipping of ipvlan test suite Previously, AfterEach statements were executed regardless of whether the ipvlan suites were skipped, which made the whole suite fail. Fixes: 314a6d3fd6 ("test: Extend runtime connectivity tests to include ipvlan") Signed-off-by: Martynas Pumputis <m@lambda.lt> 19 March 2019, 23:33:16 UTC
f332375 Add dockerignore file. The following error [0] has been seen on CI when building the operator. To avoid that kind of issue, do not include test/*.json files, because they can be added/deleted at any time during a run and things start to fail randomly. [0] CI error ``` 00:37:26.858 k8s1-1.13: e45cfbc98a50: Pushed 00:37:26.858 k8s1-1.13: d60e01b37e74: Pushed 00:37:37.746 k8s1-1.13: e8dbf417700a: Pushed 00:37:38.878 k8s1-1.13: 762d8e1a6054: Pushed 00:37:40.622 k8s1-1.13: can't load package: package ./test/bpf: C source files not allowed when not using cgo or SWIG: bpf-event-test.c elf-demo.c unit-test.c 00:37:41.355 k8s1-1.13: docker build --build-arg LOCKDEBUG=1 -f cilium-operator.Dockerfile -t "cilium/operator:"latest"" . 00:37:43.069 k8s1-1.13: error checking context: 'file ('/home/vagrant/go/src/github.com/cilium/cilium/test/policy_ace3612d.json') not found or excluded by .dockerignore'. 00:37:43.069 k8s1-1.13: Makefile:187: recipe for target 'docker-operator-image' failed 00:37:43.069 k8s1-1.13: make: *** [docker-operator-image] Error 1 00:37:52.144 k8s1-1.13: bde303456f0c: Pushed 00:37:52.144 k8s1-1.13: latest: digest: sha256:d0d55c183ac79bd9611e259ab33105900e8f53ba7e8e3eed59f50f544e415300 size: 3665 ``` Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 19 March 2019, 18:57:35 UTC
4eb58b8 cmd: mark tests as privileged Since BPF functionality is imported into the `cmd` package, we need to mark tests within the package as privileged for them to pass on Travis. Signed-off by: Ian Vernon <ian@cilium.io> 19 March 2019, 16:24:18 UTC
1ab31cf Makefile: fix grep command for getting directories to run unit tests This fixes an issue where no unit tests are run when `make unit tests` is invoked, as the parameters provided to grep to acquire the list of directories for which tests should be run were broken. When run, it would produce the following error: ``` grep: the -P option only supports a single pattern ``` Fixes: `4dbb8fdfc5`: Create testable logutils package in test helpers Signed-off by: Ian Vernon <ian@cilium.io> 19 March 2019, 16:24:18 UTC
6ddc7a5 test: Bump ubuntu-next to v21 The VM image includes a >= 5.0.0 kernel and two patches required for NAT46 and for large BPF map allocations under memory pressure to work. Signed-off-by: Martynas Pumputis <m@lambda.lt> 19 March 2019, 15:28:43 UTC
314a6d3 test: Extend runtime connectivity tests to include ipvlan Signed-off-by: Martynas Pumputis <m@lambda.lt> 19 March 2019, 15:28:43 UTC
c8bcae8 CODEOWNERS: add `daemon/policy.go` to be owned by @cilium/policy Signed-off by: Ian Vernon <ian@cilium.io> 19 March 2019, 10:04:14 UTC
714ebac k8s: Use node resource from Update() when annotating Instead of fetching the latest node information from the apiserver on each retry, use the node resource as returned by Update(). Signed-off-by: Thomas Graf <thomas@cilium.io> 18 March 2019, 22:48:13 UTC
d3f73fd istio: Add nslookup timeout during container injection Aborts the DNS lookup if still unsuccessful when timeout elapses Fixes #6230 Signed-off-by: ifeanyi <ify1992@yahoo.com> 18 March 2019, 21:12:22 UTC
93e0035 docs, bpf: Remove struct padding with aligning members Currently, BPF doesn't work with a padded structure due to data alignment. This commit adds a description to the BPF document of how to remove struct padding by aligning members and by using #pragma pack. Signed-off-by: Daniel T. Lee <danieltimlee@gmail.com> 18 March 2019, 21:12:06 UTC
2cb6aa4 FQDN: Only regenerate rule if DNS response has IPs. There are cases where a DNS response returns NOERROR but contains no IPs. In that case Cilium regenerates the policy because it is a legitimate response. With this change, regeneration does not happen for an empty response. Signed-off-by: Eloy Coto <eloy.coto@gmail.com> 18 March 2019, 20:48:51 UTC
238f3c4 k8s: Use cilium status --brief in liveness/readiness probe * Make sure the output of `--brief` selects specific sub-component errors (like kvstore) to print over generic ones (like cilium). * Terminate `cilium status --brief` with non-zero exit code if cilium status is unhealthy, similar to `cilium status`. * Update yaml files to use the `--brief` flag in liveness/readiness probes. Fixes #6574 Signed-off-by: ifeanyi <ify1992@yahoo.com> 18 March 2019, 20:27:05 UTC
fe124a6 k8s: Add metric to count number of raw Kubernetes events received Counting the processed events is insufficient as it may hide the majority of events being received. Also count the number of events received in order to gain visibility into how many events the apiserver is sending out. Signed-off-by: Thomas Graf <thomas@cilium.io> 18 March 2019, 19:47:45 UTC
608189c ipsec, daemon: reject unsupported config options This avoids obscure startup errors in the Cilium daemon and/or false user expectations, so let's error out early with a clear error message. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 18 March 2019, 19:45:10 UTC
52eb6da ipsec, doc: remove note on 1.4.1 release Replace it with 'upcoming release' since it hasn't been merged into 1.4.1 or 1.4.2 at this point and to avoid confusion for users following the guide. Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 18 March 2019, 19:45:10 UTC
e012ff0 ipsec, bpf: fix build error when tunneling is disabled If we don't have encap index defined (ENCAP_IFINDEX), then we also cannot redirect to it. This compilation error is thrown instead: 2019-03-16T08:38:33.30268603Z level=warning msg="/var/lib/cilium/bpf/bpf_netdev.c:548:11: warning: implicit declaration of function '__encap_and_redirect_with_nodeid' is invalid in C99 [-Wimplicit-function-declaration]" subsys=daemon 2019-03-16T08:38:33.302694714Z level=warning msg=" return __encap_and_redirect_with_nodeid(skb, tunnel_endpoint, seclabel, TRACE_PAYLOAD_LEN);" subsys=daemon 2019-03-16T08:38:33.302702926Z level=warning msg=" ^" subsys=daemon 2019-03-16T08:38:33.302708262Z level=warning msg="1 warning generated." subsys=daemon Fixes: 3b6245843aef ("cilium: ipsec, add BPF datapath encryption direction") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 18 March 2019, 19:45:10 UTC
3c58577 daemon: fix conntrack map dump wrt addresses The CT dump currently shows swapped src/dst address entries even though it's correctly using src address resp. dst address as data. Issue is that 7afe903203c3 ("bpf: Global 5 tuple conntrack.") did not swap the initial tuple for the lookup when converting from local to global table, and all the current code right now is doing workarounds in order to not break the CT table during version upgrade. Thus the same needs to be done here for the dump. The issue became more apparent after aaf6ba39ad4e ("ctmap: Fix order of CtKey{4,6} struct fields"), which might have been swapped on purpose but without further comments in the code on why it was swapped on the daemon side. In this case, reverting aaf6ba39ad4e doesn't fully fix it either, since then the direction also needs to be swapped. Instead, make it less confusing and only swap what needs to be swapped, that is, the address parts, since in the datapath this is the only thing that should have been done but was missed back then. For the next major version upgrade (aka 2.0), this will be properly fixed (at the cost of a disruptive upgrade). Fixes: 7afe903203c3 ("bpf: Global 5 tuple conntrack.") Fixes: aaf6ba39ad4e ("ctmap: Fix order of CtKey{4,6} struct fields") Signed-off-by: Daniel Borkmann <daniel@iogearbox.net> 18 March 2019, 16:23:55 UTC