Revision 3eee19a17a23489bb9461e0526f42d7f71da35ed authored by Quentin Monnet on 16 January 2023, 13:56:21 UTC, committed by Quentin Monnet on 17 January 2023, 09:41:25 UTC
Once upon a time, Cilium docs used the openapi Sphinx add-on to generate
its API reference based on the code. And things were good.

One day, Dependabot raised a security alert, stating that Mistune v2.0.2
was vulnerable to catastrophic backtracking [0] - this is a regex
parsing thing. Mistune was a dependency of m2r, an add-on to parse
Markdown in Sphinx, which in turn was a dependency of openapi.

The easy path would have been to update m2r to use the latest, fixed
Mistune version; but m2r was incompatible with Mistune >= 2.0.0, and
also it was no longer in development.

There was a fork, m2r2, which had little activity, and would avoid the
security issue by very simply pinning the Mistune version to 0.8.4
(which would either fail to build Cilium's reference correctly, or bring
some incompatibilities with other dependencies, at this point the
narrator does not remember for sure).

There was a fork of the fork, sphinx-mdinclude. We could use that
project to update openapi, except that it was not compatible with recent
versions of docutils, and that this would cause openapi's test suite to
fail to pass.

... So we ended up forking the openapi repository to update the
dependency to sphinx-mdinclude locally, and this is what we've been
using since last summer. And things were good again.

But things are even better when they go upstream [citation needed]. We
also filed the issue for docutils compatibility in sphinx-mdinclude [1].
It was fixed (thanks!). We submitted a PR to have openapi switch to
sphinx-mdinclude [2]. It was adjusted (thanks!), merged, and a new tag
was created.

Now at last, we can switch back to the upstream version of openapi!
[And the build system lived happily ever after.]

[0]: https://github.com/advisories/GHSA-fw3v-x4f2-v673
[1]: https://github.com/omnilib/sphinx-mdinclude/issues/8
[2]: https://github.com/sphinx-contrib/openapi/pull/127

I did _not_ run `make -C Documentation update-requirements`, because the
resulting changes seemed to break the Netlify preview [3]. I stuck to
openapi and bumped sphinx-mdinclude to >= 0.5.2, as required by openapi.

[3]: https://app.netlify.com/sites/docs-cilium-io/deploys/63c55fcc5531c6000838b87c

Signed-off-by: Quentin Monnet <quentin@isovalent.com>
1 parent 4ec82bb
README.md
# API server for Cilium ClusterMesh

Cilium uses clustermesh-apiserver when multiple clusters are connected in a clustermesh, or
when external workloads are connected to the Cilium cluster. If neither feature is used,
clustermesh-apiserver is not required.

Since etcd is used in a clustermesh for data synchronization, an etcd server container
is deployed within the clustermesh-apiserver pod.

When used in an External Workloads setup, it also creates CiliumNode and
CiliumEndpoint resources for each workload name and allocates the workload's identity.

Note: `ipv4-alloc-cidr` set in the CiliumExternalWorkload object spec is currently unused.
The IP address tied to the CiliumEndpoint and CiliumNode is the one registered by
cilium-agent (the IP address of the external workload).

The API server itself performs the following operations:

### K8s synchronization

It synchronizes CiliumIdentities, CiliumEndpoints, CiliumNodes and Kubernetes
services from the k8s datastore to the KVStore (etcd).

### Heartbeat update

Cilium's heartbeat path key stored in the KVStore is periodically updated by
the API server with the current time so that Cilium Agents can correctly
validate KVStore updates. The key for this heartbeat is
`cilium/.heartbeat`.
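As a rough illustration of why a timestamped heartbeat is useful, the sketch below shows how a consumer could decide whether a heartbeat is recent enough to trust. It is not Cilium's implementation: the function name, the use of Unix epoch seconds, and the maximum age are all assumptions made for the example.

```shell
# Hypothetical check, not taken from Cilium: succeed only if the heartbeat
# is no older than max_age seconds and is not in the future.
# Arguments: current time, heartbeat time (both Unix epoch seconds),
# and the maximum accepted age in seconds.
heartbeat_is_fresh() {
  now="$1"
  last="$2"
  max_age="$3"
  age=$((now - last))
  [ "$age" -ge 0 ] && [ "$age" -le "$max_age" ]
}
```

For example, `heartbeat_is_fresh "$(date +%s)" "$last_seen" 60` would succeed only if `last_seen` (a hypothetical variable holding the last observed heartbeat) is at most a minute old.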

## Deploy the clustermesh-apiserver

Clustermesh-apiserver is automatically deployed when External
Workloads support or clustermesh is enabled using either Helm or the cilium-cli tool.

Users are required to set both `cluster.name` and a non-zero `cluster.id` in Helm, or to
pass them with `cilium install --cluster-name <name> --cluster-id <id>`. Otherwise,
clustermesh will not be correctly established.
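Since a missing name or a zero ID only shows up later as a broken clustermesh, it can help to check both values before installing. The following is a minimal sketch; the function name is hypothetical, and it only enforces what is stated above (a non-empty name and a non-zero numeric ID), not any upper bound Cilium may impose.

```shell
# Hypothetical pre-flight check: reject an empty cluster name and any
# cluster ID that is not a non-zero, non-negative integer.
validate_cluster_identity() {
  name="$1"
  id="$2"
  if [ -z "$name" ]; then
    echo "cluster name must not be empty" >&2
    return 1
  fi
  case "$id" in
    ''|*[!0-9]*)
      echo "cluster id must be a positive integer" >&2
      return 1
      ;;
  esac
  if [ "$id" -eq 0 ]; then
    echo "cluster id must be non-zero" >&2
    return 1
  fi
  return 0
}
```

It could be chained in front of the install command, for example `validate_cluster_identity cluster1 1 && cilium install --cluster-name cluster1 --cluster-id 1`.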

The `clustermesh-apiserver` service type defaults to `NodePort`. Depending on
your k8s provider, it may be beneficial to change this to `LoadBalancer`.

### Deploy using cilium-cli:

   ```
   $ cilium clustermesh enable
   ```

#### Connect Cilium clusters into a clustermesh

   ```
   $ cilium --context "${CONTEXT1}" clustermesh connect --destination-context "${CONTEXT2}"
   ```
   Note: the `clustermesh connect` command needs to be run for every new cluster (context) that joins the clustermesh.

#### Wait for clustermesh status to be ready

   ```
   $ cilium --context "${CONTEXT1}" clustermesh status --wait
   ```
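The cilium-cli steps above can be chained for a pair of clusters. This is a sketch rather than an official workflow: the `run` and `connect_two_clusters` helpers and the `DRY_RUN` variable are hypothetical, and it assumes clustermesh should be enabled in both contexts before connecting them.

```shell
# Hypothetical helper: print the command instead of running it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Enable clustermesh in both contexts, connect them, then wait until the
# clustermesh status reports ready. Stops at the first failing command.
connect_two_clusters() {
  ctx1="$1"
  ctx2="$2"
  run cilium --context "$ctx1" clustermesh enable &&
  run cilium --context "$ctx2" clustermesh enable &&
  run cilium --context "$ctx1" clustermesh connect --destination-context "$ctx2" &&
  run cilium --context "$ctx1" clustermesh status --wait
}
```

Running `DRY_RUN=1 connect_two_clusters "${CONTEXT1}" "${CONTEXT2}"` prints the four commands without touching either cluster, which is a cheap way to review them first.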

### Deploy using helm:

   ```
   $ helm install cilium ... \
     --set clustermesh.useAPIServer=true
   ```

Additionally, if your load balancer can give you a static IP address, it may be
specified like so:

   ```
   $ helm install cilium ... \
     --set clustermesh.apiserver.service.loadBalancerIP=xxx.xxx.xxx.xxx
   ```
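When the static IP address is optional, the two Helm settings above can be assembled conditionally. The helper below is a hypothetical sketch: it only emits the two `--set` flags shown in this README and leaves the rest of the `helm install` invocation to the caller.

```shell
# Hypothetical helper: print the clustermesh-related --set flags.
# The optional first argument is a static load balancer IP address.
clustermesh_helm_flags() {
  lb_ip="${1:-}"
  flags="--set clustermesh.useAPIServer=true"
  if [ -n "$lb_ip" ]; then
    flags="$flags --set clustermesh.apiserver.service.loadBalancerIP=$lb_ip"
  fi
  echo "$flags"
}
```

The output is meant to be appended, unquoted, to the rest of the Helm command, for example `$(clustermesh_helm_flags 192.0.2.10)`, where `192.0.2.10` is a placeholder address.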

Clustermesh-apiserver is deployed as a standard k8s deployment with multiple
containers. You can check that both the clustermesh-apiserver and etcd server containers are present:

   ```
   $ kubectl get pods -l k8s-app=clustermesh-apiserver \
     -o jsonpath='{range .items[*].spec.containers[*]}{.image}{"\n"}{end}'
   quay.io/coreos/etcd:v3.4.13
   quay.io/cilium/clustermesh-apiserver:v1.10.2
   ```
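The same check can be scripted. The sketch below reads the image list produced by the `kubectl` command above on standard input and succeeds only if both images appear; the function name and the match patterns (an image path ending in `/etcd` or `/clustermesh-apiserver`, followed by a tag or digest) are assumptions based on the sample output.

```shell
# Hypothetical helper: read image names (one per line) on stdin and succeed
# only if both an etcd image and a clustermesh-apiserver image are listed.
has_expected_containers() {
  images="$(cat)"
  echo "$images" | grep -q '/etcd[:@]' &&
  echo "$images" | grep -q '/clustermesh-apiserver[:@]'
}
```

Piping the `kubectl get pods ... -o jsonpath=...` output into `has_expected_containers` then gives an exit status that a deployment script can act on.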

#### Connect Cilium clusters into a clustermesh

With a Helm installation, clusters have to be connected in two steps:

1. Extract a `cilium-clustermesh` secret from each cluster, to be applied in the other clusters:

   ```
   $ contrib/k8s/k8s-extract-clustermesh-nodeport-secret.sh > cluster1-secret.json
   ```

   Repeat this step in all your clusters, storing the outputs into different files.

2. Apply secrets from all other clusters in each of your clusters, e.g., on cluster1:

   ```
   $ contrib/k8s/k8s-import-clustermesh-secrets.sh cluster2-secret.json cluster3-secret.json ...
   ```
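With more than a couple of clusters, working out which secret files to import on each cluster is easy to get wrong. The helper below is a hypothetical sketch that lists every secret file except the current cluster's own, assuming the `<cluster>-secret.json` naming used in the steps above.

```shell
# Hypothetical helper: print the secret files of every cluster except the
# current one. First argument: current cluster; remaining arguments: all
# cluster names (the current one may be included).
other_secret_files() {
  current="$1"
  shift
  files=""
  for cluster in "$@"; do
    if [ "$cluster" != "$current" ]; then
      files="${files:+$files }${cluster}-secret.json"
    fi
  done
  echo "$files"
}
```

On cluster1, the import step could then be written as `contrib/k8s/k8s-import-clustermesh-secrets.sh $(other_secret_files cluster1 cluster1 cluster2 cluster3)`.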