https://github.com/facebookresearch/pythia
Revision c9ab34925cd318cc0bdf616ecac66ce631672c4a authored by Ayush Thakur on 23 November 2021, 17:59:22 UTC, committed by Facebook GitHub Bot on 23 November 2021, 18:00:14 UTC
Summary:
🚀 I have extended the `WandbLogger` with the ability to log the `current.pt` checkpoint as W&B Artifacts. Note that this PR is based on top of this [PR](https://github.com/facebookresearch/mmf/pull/1129).

### What is W&B Artifacts?

> W&B Artifacts was designed to make it effortless to version your datasets and models, regardless of whether you want to store your files with us or whether you already have a bucket you want us to track. Once you've tracked your dataset or model files, W&B will automatically log each and every modification, giving you a complete and auditable history of changes to your files.

Through this PR, W&B Artifacts can help save and organize machine learning models throughout a project's lifecycle. More details in the documentation [here](https://docs.wandb.ai/guides/artifacts/model-versioning).

### Modification

This PR adds a `log_model_checkpoint` method to the `WandbLogger` class in `utils/logger.py`. This method is invoked from `utils/checkpoint.py` whenever a checkpoint is saved.
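The core of such a method can be sketched with the public `wandb` Artifacts API. This is an illustrative outline, not the actual MMF implementation; the artifact-naming convention is inferred from the logged artifact page linked below (`run_ey9xextf_model`):

```python
# Hypothetical sketch of a checkpoint-logging helper built on the public
# wandb Artifacts API. The method name matches the PR; the body is
# illustrative only.

def artifact_name(run_id: str) -> str:
    """Build a per-run artifact name, e.g. 'run_ey9xextf_model'."""
    return f"run_{run_id}_model"

def log_model_checkpoint(wandb_module, checkpoint_path: str) -> None:
    """Upload a checkpoint file as a new version of the run's model artifact."""
    run = wandb_module.run  # assumes wandb.init() was already called
    artifact = wandb_module.Artifact(artifact_name(run.id), type="model")
    artifact.add_file(checkpoint_path)  # e.g. the path to current.pt
    run.log_artifact(artifact)
```

Each call to `run.log_artifact` with the same artifact name creates a new version (`v0`, `v1`, ...), which is what gives the version history shown in the screenshot.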

### Usage

To use this, set `training.wandb.enabled=true` and `training.wandb.log_checkpoint=true` in `config/defaults.yaml`.
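Assuming the flags live under `training.wandb` as described, the relevant portion of `config/defaults.yaml` would look like:

```yaml
training:
  wandb:
    # Turn on W&B logging for the run
    enabled: true
    # Also upload current.pt checkpoints as W&B Artifacts
    log_checkpoint: true
```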

### Result

The screenshot shows the `current.pt` checkpoints saved at intervals defined by `training.checkpoint_interval`. You can check out the logged artifacts page [here](https://wandb.ai/ayut/mmf/artifacts/model/run_ey9xextf_model/0dc64164acbdc300fd01/api).

![image](https://user-images.githubusercontent.com/31141479/139390462-d5c8445e-5c20-4fdd-85d0-51ef64846bf0.png)

### Superpowers

With this small addition, one can easily track different versions of a model, download a checkpoint of interest via the API tab, share checkpoints with teammates, and more.

### Requests

This is a draft PR as there are a few more things that can be improved here.

* Is there a better way to access the path to the `current.pt` checkpoint? In other words, is the modification made to `utils/checkpoint.py` an acceptable way of approaching this?

* When logging a file as a W&B artifact, we can also attach metadata to it. In this case, we could add the current iteration, training metrics, etc. I would love suggestions on which data points to log as metadata alongside the checkpoints.

* How should we determine whether a checkpoint is the best one so far? If it is, I can add `best` as an alias for that checkpoint's artifact.
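The metadata and `best`-alias ideas from the last two points could be wired up as below. The field names (`iteration`, the metric keys) and the helper names are illustrative, not keys or functions used by MMF itself:

```python
# Illustrative sketch: assemble metadata and aliases for a checkpoint
# artifact. Field names are examples, not actual MMF keys.

def checkpoint_metadata(iteration: int, metrics: dict) -> dict:
    """Collect data points worth storing alongside the checkpoint."""
    return {"iteration": iteration, **metrics}

def checkpoint_aliases(is_best: bool) -> list:
    """Always tag 'latest'; add 'best' when this checkpoint beats prior ones."""
    aliases = ["latest"]
    if is_best:
        aliases.append("best")
    return aliases

# These would then be passed to the wandb Artifacts API, roughly:
#   artifact = wandb.Artifact(name, type="model",
#                             metadata=checkpoint_metadata(iteration, metrics))
#   run.log_artifact(artifact, aliases=checkpoint_aliases(is_best))
```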

Pull Request resolved: https://github.com/facebookresearch/mmf/pull/1137

Test Plan:
Imported from GitHub, without a `Test Plan:` line.

**Static Docs Preview: mmf**
|[Full Site](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/)|

|**Modified Pages**|
|[docs/notes/logger](https://our.intern.facebook.com/intern/staticdocs/eph/D32402090/V6/mmf/docs/notes/logger/)|

Reviewed By: apsdehal

Differential Revision: D32402090

Pulled By: ebsmothers

fbshipit-source-id: 94b881ec55c4197301331d571bc926521e2feecc
[feat] Model version control using W&B Artifacts (#1137)