Revision - b41d4f3 - refactor natgrads to be more efficient (#1443) - origin: https://github.com/GPflow/GPflow

Revision b41d4f38436e4a090c940dbd3bc7e2afd39a283e authored by st-- on 23 April 2020, 18:17:42 UTC, committed by GitHub on 23 April 2020, 18:17:42 UTC

refactor natgrads to be more efficient (#1443)

Previously, GPflow's NaturalGradient optimizer would call the loss_function once for each (q_mu, q_sqrt) set in the var_list. This is a light refactor that separates out applying the natural gradient step from computing the gradients (`_natgrad_apply_gradients`), and changes `_natgrad_steps` to only evaluate the loss function once, computing the gradients for all (q_mu, q_sqrt) tuples passed in the var_list.
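The difference between the two call patterns can be illustrated with a toy sketch. This is not GPflow's implementation: `make_counting_loss`, `step_per_set`, and `step_all_at_once` are hypothetical names, the loss is a plain quadratic with analytic gradients, and the update is an ordinary gradient step rather than a natural-gradient one. It only shows how many times the loss is evaluated in each pattern.

```python
from typing import Callable, Dict, List, Tuple

Params = Dict[str, float]
LossAndGrads = Callable[[Params], Tuple[float, Params]]


def make_counting_loss() -> Tuple[LossAndGrads, Dict[str, int]]:
    """Return a toy quadratic loss with analytic gradients and a call counter."""
    calls = {"n": 0}

    def loss_and_grads(params: Params) -> Tuple[float, Params]:
        calls["n"] += 1
        loss = sum(v * v for v in params.values())
        grads = {k: 2.0 * v for k, v in params.items()}
        return loss, grads

    return loss_and_grads, calls


def step_per_set(f: LossAndGrads, params: Params,
                 var_list: List[Tuple[str, str]], lr: float) -> None:
    """Old pattern (illustrative): one loss evaluation per (q_mu, q_sqrt) pair."""
    for pair in var_list:
        _, grads = f(params)
        for name in pair:
            params[name] -= lr * grads[name]


def step_all_at_once(f: LossAndGrads, params: Params,
                     var_list: List[Tuple[str, str]], lr: float) -> None:
    """New pattern (illustrative): evaluate once, then update every pair."""
    _, grads = f(params)
    for pair in var_list:
        for name in pair:
            params[name] -= lr * grads[name]


var_list = [("q_mu_1", "q_sqrt_1"), ("q_mu_2", "q_sqrt_2")]
start = {"q_mu_1": 1.0, "q_sqrt_1": 2.0, "q_mu_2": 3.0, "q_sqrt_2": 4.0}

old_loss, old_calls = make_counting_loss()
old_params = dict(start)
step_per_set(old_loss, old_params, var_list, lr=0.1)

new_loss, new_calls = make_counting_loss()
new_params = dict(start)
step_all_at_once(new_loss, new_params, var_list, lr=0.1)
```

With two parameter sets, the first pattern evaluates the loss twice and the second evaluates it once. In this separable toy loss both end at the same parameter values; that equality is a property of the toy example, not a claim about the real optimizer.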

Other changes:
- The no-longer-used `_natgrad_step` method was removed.
- NaturalGradient now takes an `xi_transform` argument that is used for all parameter sets without an explicitly specified xi transform (i.e. those passed as tuples rather than triplets).
- XiTransform now uses static methods.
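The tuple-versus-triplet rule in the list above could be sketched as follows. This is an assumption about the behaviour described, not GPflow's code: `resolve_var_list`, `DefaultXi`, and `CustomXi` are hypothetical names, and the `describe` method exists only to make the example checkable.

```python
from typing import Any, List, Sequence, Tuple


class DefaultXi:
    """Hypothetical stand-in for a xi transform that exposes static methods."""

    @staticmethod
    def describe() -> str:
        return "default"


class CustomXi:
    """Hypothetical stand-in for an explicitly chosen xi transform."""

    @staticmethod
    def describe() -> str:
        return "custom"


def resolve_var_list(var_list: Sequence[Sequence[Any]],
                     default_xi_transform: Any) -> List[Tuple[Any, Any, Any]]:
    """Turn every entry into a (q_mu, q_sqrt, xi_transform) triplet.

    Pairs receive the default transform; triplets keep their own.
    """
    resolved = []
    for entry in var_list:
        if len(entry) == 2:
            q_mu, q_sqrt = entry
            resolved.append((q_mu, q_sqrt, default_xi_transform))
        elif len(entry) == 3:
            resolved.append(tuple(entry))
        else:
            raise ValueError(
                f"expected 2 or 3 elements per entry, got {len(entry)}"
            )
    return resolved


resolved = resolve_var_list(
    [("q_mu_1", "q_sqrt_1"), ("q_mu_2", "q_sqrt_2", CustomXi)],
    default_xi_transform=DefaultXi,
)
```

Because the transform's methods are static in this sketch, the class itself can be passed around without instantiating it.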

None of this should affect any downstream code; this PR is backwards-compatible.

1 parent c7550ce

GLOSSARY.md

## Glossary

GPflow does not always follow standard Python naming conventions,
and instead tries to apply the notation in the relevant GP papers.\
The following is the convention we aim to use in the code.

---

<dl>
  <dt>GPR</dt>
  <dd>Gaussian process regression</dd>

  <dt>SVGP</dt>
  <dd>stochastic variational inference for Gaussian process models</dd>

  <dt>Shape constructions [..., A, B]</dt>
  <dd>the notation used to describe tensor shapes in docstrings and comments. Example: <i>[..., N, D, D]</i> describes a tensor with an arbitrary number of leading dimensions (indicated by the ellipsis), followed by a dimension of size <i>N</i> and two trailing dimensions of equal size <i>D</i></dd>

  <dt>X</dt>
  <dd>(and variations like Xnew) refers to input points; always of rank 2, e.g. shape <i>[N, D]</i>, even when <i>D=1</i></dd>

  <dt>Y</dt>
  <dd>(and variations like Ynew) refers to observed output values, potentially with multiple output dimensions; always of rank 2, e.g. shape <i>[N, P]</i>, even when <i>P=1</i></dd>

  <dt>Z</dt>
  <dd>refers to inducing points</dd>

  <dt>M</dt>
  <dd>stands for the number of inducing features (e.g. length of Z)</dd>

  <dt>N</dt>
  <dd>stands for the number of data points, or the minibatch size, in docstrings and shape constructions</dd>

  <dt>P</dt>
  <dd>stands for the number of output dimensions in docstrings and shape constructions</dd>

  <dt>D</dt>
  <dd>stands for the number of input dimensions in docstrings and shape constructions</dd>
</dl>
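As an illustration of how the shape constructions above can be read, here is a minimal sketch of a checker that matches a concrete shape against a specification such as `[..., N, D, D]`. The helper `match_shape` is hypothetical and not part of GPflow; it assumes that a leading `"..."` stands for any number of leading dimensions and that a repeated symbol must have the same size each time.

```python
from typing import Dict, Sequence


def match_shape(shape: Sequence[int], spec: Sequence[str]) -> Dict[str, int]:
    """Match `shape` against `spec` and return the size bound to each symbol.

    Raises ValueError if the rank is wrong or a repeated symbol
    would need two different sizes.
    """
    names = list(spec)
    if names and names[0] == "...":
        # The ellipsis absorbs any number of leading dimensions.
        names = names[1:]
        if len(shape) < len(names):
            raise ValueError(f"shape {tuple(shape)} has too few dimensions")
        dims = list(shape)[len(shape) - len(names):]
    else:
        if len(shape) != len(names):
            raise ValueError(f"shape {tuple(shape)} does not have rank {len(names)}")
        dims = list(shape)

    bindings: Dict[str, int] = {}
    for name, size in zip(names, dims):
        # setdefault returns the existing binding if the symbol was seen before.
        if bindings.setdefault(name, size) != size:
            raise ValueError(f"symbol {name} bound to both {bindings[name]} and {size}")
    return bindings


# X is always rank 2, even with a single input dimension.
x_bindings = match_shape((10, 1), ("N", "D"))

# Arbitrary leading dimensions, last two dimensions equal.
cov_bindings = match_shape((7, 3, 5, 2, 2), ("...", "N", "D", "D"))
```

Here `x_bindings` is `{"N": 10, "D": 1}` and `cov_bindings` is `{"N": 5, "D": 2}`; a shape such as `(4, 2, 3)` would be rejected against `[..., N, D, D]` because the last two dimensions differ.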
