https://gitlab.com/tezos/tezos
Tip revision: 64b890ecccf11f7b4238fddbb36d1e883f134952 authored by marcbeunardeau on 14 November 2023, 17:18:18 UTC
***********
Documenting
***********

The documentation is available online at `tezos.gitlab.io <http://tezos.gitlab.io/>`_,
and always up to date with branch ``master`` on `GitLab <https://gitlab.com/tezos/tezos>`_.

Building the documentation
==========================

You can build the documentation locally or in the CI.

.. _build_doc_ci:

Building the documentation in the CI
------------------------------------

When reviewing a merge request (MR) in the GitLab interface, you may build the documentation in the CI without checking out the source branch of the MR, and without installing Python locally. Proceed as follows:

+ trigger the CI if needed in the home page of the MR, and make sure that job ``documentation:build_all`` under the CI stage ``build`` is being executed
+ once the whole CI is finished, check the built documentation in the exposed artifacts on the home page of the MR

If you cannot wait for the whole CI to finish, the artifacts are not yet exposed on the home page of the MR. In that case, click on the CI job ``documentation:build_all`` and, once the job has finished, click on ``Browse`` in the job's page to retrieve only the doc artifacts.

In both cases, open the file ``docs/_build/index.html`` in a browser.

Building the documentation locally
----------------------------------

To build the documentation locally, you need to install the Python package
manager `Poetry <https://python-poetry.org/>`_. For instructions on
how to obtain Python and Poetry, see :doc:`the installation
instructions for the Python environment<developer/python_environment>`.

Another prerequisite for building the documentation is that the Octez sources on your branch are compiled, because part of the documentation is generated by Octez executables.
This involves executing ``make`` in the parent directory (the repository root).
If this step results in errors, you usually have to restart the :ref:`compiling procedure <compiling_with_make>` from ``make clean`` onwards.

Once this is done, you can do:

.. code-block:: bash

    make -C docs

The output is generated and available in ``docs/_build``. It is built by
Sphinx, and uses the Read The Docs theme.


OCaml documentation
-------------------

As part of the above procedure,
Odoc is used for OCaml API generation. You can install Odoc with:

.. code-block:: bash

    opam install odoc

Octez generates the API documentation for all libraries in HTML format. The
generated HTML pages are put in ``_build/<context>/_doc``.
The build creates one sub-directory
per public library and generates an ``index.html`` file in each sub-directory.

The documentation is not installed on the system by Octez. It is meant to be
read locally while developing and then published on the web when releasing
packages.

Writing documentation
=====================

Online documentation is written in reStructuredText format, also known as RST.
reStructuredText is the default plaintext markup language used by
`Sphinx <https://www.sphinx-doc.org/>`_, which
is the tool used to compile this format into plain web pages in HTML format.

For the RST syntax, see the `Sphinx RST primer <https://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html>`_ and also the `Sphinx extensions`_ below.

Sphinx extensions
-----------------

Some ad-hoc reference kinds are supported.

- ``:package:`name``` or ``:package:`text<name>``` points
  to the ``odoc`` page of the package, checking that the page exists
- ``:package-name:`name``` or ``:package-name:`text<name>``` just
  displays the package name (no link), checking that the package
  exists
- ``:package-api:`path/to/api-page.html``` or
  ``:package-api:`text<path/to/api-page.html>```
  points to an API page generated by odoc, checking that the page exists.
  The path is relative to the root of the odoc-generated pages (normally,
  ``_build/api/odoc/_html``). It must start with a package name, optionally
  followed by a library name, then by a series of nested module names,
  and ended by a page name, usually ``index.html``.
  It may optionally be suffixed by a section
  name, using the standard HTML ``#section`` suffix. This role is meant
  to point to APIs that do not correspond to a whole package (for that case,
  prefer to use the ``:package:`` role).
- ``:src:`/path/to/file/or/dir``` or
  ``:src:`text</path/to/file/or/dir>``` points to the GitLab source
  tree viewer. It is not possible to refer to a particular line in a file using
  a line number suffix of the form ``#Lnnn``, because such links are usually
  too fragile to be used in documentation.
- ``:opam:`package``` or ``:opam:`text<package>``` points to the
  package page on ``opam.ocaml.org``; a version number is supported
  (``package.version``)
- ``:gl:`[special gitlab reference]``` or ``:gl:`text <[special gitlab
  reference]>``` expands and links `GitLab special references
  <https://docs.gitlab.com/ee/user/markdown.html#gitlab-specific-references>`_,
  like for
  merge requests :gl:`tezos/tezos!123` (``:gl:`tezos/tezos!123```),
  issues :gl:`tezos/tezos#999` (``:gl:`tezos/tezos#999```)
  and
  commits :gl:`28309c81` (``:gl:`28309c81```).
  The default project and namespace is
  ``tezos/tezos``. In other words, ``tezos/tezos#999``, ``tezos#999`` and
  ``#999`` all refer to the same thing. Currently supports usernames,
  projects, issues, merge requests, snippets, milestone ids, commits
  and commit ranges. The implementation of this role is in
  :src:`docs/_extensions/gitlab_custom_role.py`.
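
As an illustration of the defaulting rule for ``:gl:`` references, the following Python sketch expands issue and merge request references into URLs.
It is not the code of ``docs/_extensions/gitlab_custom_role.py``: the function name, the regular expression, and the restriction to issues (``#``) and merge requests (``!``) are assumptions made for this example.

.. code-block:: python

    import re

    # Defaults taken from the text above: bare references resolve to tezos/tezos.
    DEFAULT_NAMESPACE = "tezos"
    DEFAULT_PROJECT = "tezos"
    BASE_URL = "https://gitlab.com"

    # Hypothetical pattern: optional "namespace/", optional "project", then #N or !N.
    _REF = re.compile(
        r"^(?:(?:(?P<ns>[\w.-]+)/)?(?P<proj>[\w.-]+))?(?P<kind>[!#])(?P<num>\d+)$"
    )

    def expand_gitlab_ref(ref):
        """Return the URL of an issue (#) or merge request (!) reference."""
        m = _REF.match(ref)
        if m is None:
            raise ValueError(f"unsupported reference: {ref!r}")
        namespace = m.group("ns") or DEFAULT_NAMESPACE
        project = m.group("proj") or DEFAULT_PROJECT
        section = "issues" if m.group("kind") == "#" else "merge_requests"
        return f"{BASE_URL}/{namespace}/{project}/-/{section}/{m.group('num')}"

    # The three spellings below resolve to the same issue URL.
    for ref in ("tezos/tezos#999", "tezos#999", "#999"):
        print(expand_gitlab_ref(ref))

Under these assumptions, ``tezos/tezos!123`` expands to the merge request page of the same project in the same way.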

Style guidelines
----------------

Currently there are no enforced guidelines on documentation writing style.
In particular, the choice of American, British, Canadian, ... English (alphabetical, non-exhaustive list!) is up to each contributor.
So is the capitalization convention of section names, and other typesetting aspects.
The focus should be on the contents: on the logical structure of documents, on the uniform use of terms, on avoiding inconsistencies between pages, and so on.

However, when adding a new page or modifying an existing one, you should check that your text displays correctly and introduces no new problems.
For that, you should build the documentation (by running ``make`` in the ``docs`` directory), address any new error message, and check the generated pages (``docs/_build/index.html``) in a browser.

Links
~~~~~

When introducing cross-references between documentation pages as well as references to external resources, please consider using the most appropriate kind of link:

- When referring to a whole documentation page, you should use a ``:doc:`` role rather than introducing a label at the start of the page.
  Indeed, labels incur an overhead, especially when pages get duplicated for different protocol versions.
  In particular, when referring to a page of the currently active protocol, consider using ``active/`` as the directory of that page, instead of a hardcoded protocol name.
- When referring to an artifact in the code repository (source file, commit, etc.), you may use an appropriate custom or GitLab role (see `Sphinx extensions`_) instead of a plain HTML link.
  Indeed, specific roles are checked for correctness more effectively and more efficiently than HTML links.

Line breaking
~~~~~~~~~~~~~

When writing documentation in text formats such as RST, it is not required to respect a maximal line width, such as 80 columns.
Therefore, you may choose between the different line breaking policies your text editor proposes.
However, you should be aware that file comparison tools such as ``diff`` tend to output large differences for a paragraph that has been reformatted after only a small change in one phrase.
Also, reviewing tools such as the one in the GitLab user interface associate comments and change suggestions to lines, while these comments and suggestions are usually logically associated with whole phrases.

For such reasons:

- Some contributors use one line per complete phrase. This makes it easier to suggest a rephrasing in the GitLab interface, attached to this (possibly long) line, and allows ``diff`` to isolate modified phrases instead of showing the whole containing paragraph as modified.
- Other contributors, whose editor breaks lines at a fixed width, introduce an extra line break at the end of each phrase. This also allows ``diff`` to isolate modified phrases.

Thus, you may choose your own formatting style, while tolerating different styles from other contributors.
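
The effect on ``diff`` can be checked with a small Python sketch that uses the standard ``difflib`` and ``textwrap`` modules.
The sample paragraph, the width of 36 columns, and the helper name ``changed_lines`` are assumptions chosen for this illustration.

.. code-block:: python

    import difflib
    import textwrap

    def changed_lines(before, after):
        """Count the lines a line-based diff marks as removed or added."""
        diff = difflib.ndiff(before.splitlines(), after.splitlines())
        return sum(1 for line in diff if line.startswith(("- ", "+ ")))

    # Hypothetical paragraph; only the first phrase is edited.
    old = [
        "The node stores blocks.",
        "The baker creates new blocks on top of the current head.",
        "The client sends commands to the node.",
    ]
    new = ["The node validates and stores blocks."] + old[1:]

    # Style 1: one phrase per line.
    per_phrase = changed_lines("\n".join(old), "\n".join(new))

    # Style 2: the whole paragraph refilled at a fixed width after the edit.
    refilled = changed_lines(
        textwrap.fill(" ".join(old), width=36),
        textwrap.fill(" ".join(new), width=36),
    )

    print(per_phrase, refilled)

With one phrase per line, the diff reports one removed and one added line; after refilling, the edit shifts the line breaks of the rest of the paragraph, so more lines are reported as changed.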


Writing executable documentation
--------------------------------

When you are writing documentation containing executable parts, such as sequences of instructions to install, configure, or launch some tool, there is sometimes a better way than copying those instructions from a terminal (where you presumably tried them before!) to a documentation page.
This better way is to write "executable documentation".
The idea is to write such executable scripts separated from the documentation, and to automatically copy them in the documentation whenever it is (re)generated.
Executable documentation allows one to test those scripts, e.g. in CI (continuous integration), ensuring they work and are up to date with the code and with its environment.

Typically, Octez installation scripts not only have to evolve with the Octez codebase, but also with various other evolving resources, such as OPAM packages, package managers, Linux distributions, and so on.
By continuously testing such installation scripts, executable documentation allows one to detect problems and fix obsolete instructions as early as possible, avoiding headaches and frustration, for new end users and experienced developers alike.

Technically, executable documentation can be created by using the Sphinx directive `literalinclude <https://www.sphinx-doc.org/en/master/usage/restructuredtext/directives.html#directive-literalinclude>`_, which may include whole scripts or parts of them.
For example, the following directive includes a script fragment detailing a step in compiling the Octez sources::

  .. literalinclude:: compile-sources.sh
    :language: shell
    :start-after: [install packages]
    :end-before: [test executables]

Whenever appropriate, in addition to including the script (fragment) in the documentation as above, make sure it is regularly tested, manually and/or within a CI job.
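
To see which lines such a directive selects, the following Python sketch approximates the ``:start-after:`` and ``:end-before:`` options: it keeps the lines strictly between the first line containing the start marker and the next line containing the end marker.
The script contents shown here are hypothetical placeholders (the real ``compile-sources.sh`` may differ), and ``select_fragment`` is a helper written for this example, not part of Sphinx.

.. code-block:: python

    def select_fragment(text, start_after, end_before):
        """Keep the lines between the start marker line and the end marker line."""
        lines = text.splitlines()
        # First line containing the start marker; the fragment begins just after it.
        start = next(i for i, line in enumerate(lines) if start_after in line) + 1
        # First later line containing the end marker; the fragment stops just before it.
        end = next(i for i in range(start, len(lines)) if end_before in lines[i])
        return "\n".join(lines[start:end])

    # Hypothetical script: the markers are ordinary shell comments.
    script = """\
    #!/bin/sh
    # [install packages]
    echo "commands installing packages go here"
    # [test executables]
    echo "commands testing executables go here"
    """

    print(select_fragment(script, "[install packages]", "[test executables]"))

Because the markers are plain comments, the script remains executable as a whole, which is what allows it to be tested in a CI job.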

Writing protocol documentation
------------------------------

Writing protocol documentation is a special case because protocol-related
documentation pages are duplicated for several protocol versions (under directories named after the protocols, e.g., ``alpha/``), and possibly
also in a protocol-independent part (typically under directory
``shell/``).

Besides the need to maintain several versions of these pages, this
duplication introduces the need to carefully handle documentation
cross-references, in particular to avoid duplicate labels (i.e., multiple labels with the same name in different pages) and wrong references (i.e.,
escaping from one protocol version into another).

The following rules promote a systematic way of handling documentation
cross-references that avoids introducing such errors.

Definitions
~~~~~~~~~~~

First let us introduce the following definitions:

- A *label* is an identifier defining a specific position in a documentation page (typically, before a section name). A *reference* is a link to a label, in the same or another page. In Sphinx, labels are written ``.. _label:`` and references are written ``:ref:`textual description <label>``` or ``:ref:`label```. Labels and references are case-insensitive.
- A *versioned* label is suffixed by the protocol name (e.g. ``label_alpha``); an *unversioned* label is not (i.e. just ``label``).
- A *local* reference is a link from a protocol-specific page to the same page or to another protocol-specific page. An *external* reference is a reference from a protocol-independent page to a label in a protocol-specific page.

Rules
~~~~~

The following simple rules are proposed for safely managing cross-references:

1. In all but the **current** protocol, any defined label must be versioned::

    .. _<label>_<proto>:

2. In the **current** protocol, labels may be versioned (as targets of local references), unversioned (as targets of external references), or both. The last case is done by defining *two* labels for such a location::

    ..  _<label>:
    ..  _<label>_<proto>:

3. Any local reference in protocol ``<proto>`` must be versioned ``<proto>``. This includes references appearing in the currently active protocol.

4. External references must be unversioned.

The rationale of the above rules:

- Any label defined in a protocol-specific page must be versioned to avoid name conflicts (as by definition the containing page is duplicated).
- External references must be unversioned to avoid modifying protocol-independent pages when the current protocol is changed.
- Local references in the current protocol could also work if unversioned, but when the protocol is changed, they should be rewritten as versioned. It is much simpler to enforce the rule that all local references in a page for any protocol ``<proto>`` must be versioned ``<proto>``.
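
The rules above can be sketched as a small Python check on the text of one protocol-specific page.
This is only an illustration, not the implementation of the scripts under ``docs/scripts``: the function name and regular expressions are assumptions, and the sketch treats every ``:ref:`` in the page as a local reference.

.. code-block:: python

    import re

    # A label definition on its own line, e.g. ".. _accounts_alpha:".
    LABEL = re.compile(r"^\.\.[ \t]+_([^:`\n]+):[ \t]*$", re.MULTILINE)
    # A reference, either :ref:`text <label>` or :ref:`label`.
    REF = re.compile(r":ref:`(?:[^`<]*<([^`>]+)>|([^`<]+))`")

    def check_protocol_page(text, proto, is_current=False):
        """Return the rule violations found in one protocol-specific page."""
        suffix = "_" + proto
        problems = []
        for label in LABEL.findall(text):
            # Rules 1 and 2: only the current protocol may define unversioned labels.
            if not label.endswith(suffix) and not is_current:
                problems.append(f"unversioned label: {label}")
        for explicit, bare in REF.findall(text):
            target = (explicit or bare).strip()
            # Rule 3: local references must be versioned with the page's protocol.
            if not target.endswith(suffix):
                problems.append(f"unversioned reference: {target}")
        return problems

    # Hypothetical page fragment with one unversioned label and one unversioned reference.
    page = """\
    .. _accounts:
    .. _accounts_alpha:

    Accounts
    --------

    See :ref:`the glossary <glossary_alpha>` and :ref:`accounts`.
    """

    print(check_protocol_page(page, "alpha"))
    print(check_protocol_page(page, "alpha", is_current=True))

In this sketch the unversioned label is accepted only when ``is_current`` is true, while the unversioned local reference is reported in both cases.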

Protocol changes
~~~~~~~~~~~~~~~~

When a new protocol is adopted, its pages must be "linked" with the protocol-independent pages:

- remove in the old protocol all the unversioned labels (this operation is unnecessary if the pages of the old protocol are removed altogether)
- add in the new protocol an unversioned label before each versioned label

**NB** no rewriting of any reference is needed on protocol changes.
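
The two label operations above can be sketched in Python as follows.
The function names and the protocol names ``oldproto`` and ``newproto`` are hypothetical, and the sketch is not the code of the scripts under ``docs/scripts``; in particular, it is not idempotent, so applying the first function twice would duplicate the unversioned labels.

.. code-block:: python

    import re

    def add_unversioned_labels(text, proto):
        """Insert ".. _label:" before every ".. _label_<proto>:" line."""
        versioned = re.compile(
            r"^\.\. _(.+)_" + re.escape(proto) + r":[ \t]*$", re.MULTILINE
        )
        return versioned.sub(lambda m: f".. _{m.group(1)}:\n{m.group(0)}", text)

    def remove_unversioned_labels(text, proto):
        """Drop every label line that is not suffixed by "_<proto>"."""
        suffix = f"_{proto}:"
        kept = []
        for line in text.splitlines():
            stripped = line.rstrip()
            is_label = stripped.startswith(".. _") and stripped.endswith(":")
            if is_label and not stripped.endswith(suffix):
                continue  # unversioned label: unlink it
            kept.append(line)
        return "\n".join(kept) + ("\n" if text.endswith("\n") else "")

    # Hypothetical pages of the newly adopted and of the old protocol.
    print(add_unversioned_labels(".. _accounts_newproto:\n\nAccounts\n", "newproto"))
    print(remove_unversioned_labels(".. _accounts:\n.. _accounts_oldproto:\n\nAccounts\n", "oldproto"))

Only label definitions are touched; references are left unchanged, in line with the note above.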

When creating a new protocol proposal version ``<proto>`` out of alpha:

- rename all versioned labels and references suffixed ``_alpha`` in its pages to the suffix ``_<proto>``
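
As a rough illustration of this renaming, the following Python sketch rewrites the ``_alpha`` suffix in label definitions and in ``:ref:`` targets.
The function name, the regular expressions, and the protocol name ``newproto`` are assumptions for this example; the sketch does not claim to reproduce what ``scripts/snapshot_alpha.sh`` does.

.. code-block:: python

    import re

    def rename_alpha(text, proto):
        """Rewrite "_alpha" label definitions and :ref: targets to "_<proto>"."""
        # Label definitions: ".. _label_alpha:" becomes ".. _label_<proto>:".
        text = re.sub(
            r"^(\.\. _.+)_alpha(:[ \t]*)$",
            rf"\g<1>_{proto}\g<2>",
            text,
            flags=re.MULTILINE,
        )
        # References: ":ref:`label_alpha`" and ":ref:`text <label_alpha>`".
        text = re.sub(r"(:ref:`[^`]*?)_alpha(>?`)", rf"\g<1>_{proto}\g<2>", text)
        return text

    # Hypothetical page fragment copied from the alpha documentation.
    page = (
        ".. _accounts_alpha:\n"
        "\n"
        "See :ref:`the glossary <glossary_alpha>` and :ref:`accounts_alpha`.\n"
    )
    print(rename_alpha(page, "newproto"))

Plain occurrences of the word "alpha" in the text are left untouched, since only the label suffix is matched.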

Rules automation
~~~~~~~~~~~~~~~~

To help enforce the above cross-referencing rules in protocol-specific pages, the following scripts are provided under ``docs/scripts``:

- ``check_proto_xrefs.py``: checks the references, and optionally the labels, in all pages of a given protocol version

  + can be used at any time, e.g. when changing a protocol-specific page
- ``add_labels_without_proto.py``: adds unversioned labels before each versioned label in a protocol-specific page

  + can be used when a new protocol is adopted, to "link" its documentation into protocol-independent pages
- ``remove_labels_without_proto.py``: removes unversioned labels in a protocol-specific page

  + can be used when a new protocol is adopted for "unlinking" the pages of the old protocol, only if those pages are not removed altogether

Moreover, the script ``scripts/snapshot_alpha.sh``, used to create a new protocol proposal version ``<proto>`` out of alpha, integrates the renaming of labels and references.

Documenting protocols
~~~~~~~~~~~~~~~~~~~~~

Due to the duplication of the documentation for multiple protocol versions, the following extra guidelines should be observed.

- In principle, protocol-independent pages should only refer to the currently active protocol. Indeed, until newer protocols are adopted, there is no guarantee that their features will be part of Tezos someday.
  Note that there is a symbolic link called ``active`` within the documentation folder pointing to the currently active protocol directory.
  Use it whenever appropriate to avoid introducing hardcoded protocol numbers.

- When modifying the pages of a given protocol version, you might have to also modify them for later versions. Otherwise, when newer protocols are adopted, your changes will vanish! In particular, when fixing a problem in the documentation of the current protocol (e.g. adding a term in the glossary), you might have to fix it also for the candidate protocol (if there is one under the voting procedure) and for the Alpha protocol under development (assuming that the features of the candidate protocol will be inherited by or proposed in another form in Alpha).

- As there is a considerable overhead for maintaining protocol-specific pages, think twice before duplicating a page as protocol-specific. Does this page really refer to the protocol? If yes, does *all* of the page refer to the protocol? If the answer to the last question is "no", consider splitting the page in two parts, respectively protocol-specific and protocol-independent.
  This kind of splitting is however inadvisable when there are many local cross-references between the parts; in this case, keeping everything in the same page may avoid introducing many labels (this is why the glossary pages are not split into shell and protocol pages).