https://github.com/open-mmlab/Amphion
09 September 2024, 06:46:44 UTC
Tip revision: a4c23e2e1f15e4be0b0c7194e6b69a82a4bb4a07 authored by Xueyao Zhang on 18 December 2023, 14:14:33 UTC
Amphion v0.1 Release (#39)
# Amphion Evaluation Recipe

## Supported Evaluation Metrics

Amphion Evaluation currently supports the following objective metrics:

- **F0 Modeling**:
  - F0 Pearson Coefficients (FPC)
  - F0 Periodicity Root Mean Square Error (PeriodicityRMSE)
  - F0 Root Mean Square Error (F0RMSE)
  - Voiced/Unvoiced F1 Score (V/UV F1)
- **Energy Modeling**:
  - Energy Root Mean Square Error (EnergyRMSE)
  - Energy Pearson Coefficients (EnergyPC)
- **Intelligibility**:
  - Character Error Rate (CER) based on [Whisper](https://github.com/openai/whisper)
  - Word Error Rate (WER) based on [Whisper](https://github.com/openai/whisper)
- **Spectrogram Distortion**:
  - Frechet Audio Distance (FAD)
  - Mel Cepstral Distortion (MCD)
  - Multi-Resolution STFT Distance (MSTFT)
  - Perceptual Evaluation of Speech Quality (PESQ)
  - Short Time Objective Intelligibility (STOI)
  - Scale Invariant Signal to Distortion Ratio (SISDR)
  - Scale Invariant Signal to Noise Ratio (SISNR)
- **Speaker Similarity**:
  - Cosine similarity based on [RawNet3](https://github.com/Jungjee/RawNet)
  - Cosine similarity based on [WeSpeaker](https://github.com/wenet-e2e/wespeaker) (👨‍💻 developing)

We provide a recipe to demonstrate how to objectively evaluate your generated audio. There are three steps in total:

1. Pretrained Models Preparation
2. Audio Data Preparation
3. Evaluation

## 1. Pretrained Models Preparation

To calculate `RawNet3`-based speaker similarity, first download the pretrained model, as described [here](../../pretrained/README.md).

## 2. Audio Data Preparation

Prepare the reference audio and the generated audio in two separate folders: `ref_dir` contains the reference audio and `gen_dir` contains the generated audio. For example:

```plaintext
 ┣ {ref_dir}
 ┃ ┣ sample1.wav
 ┃ ┣ sample2.wav
 ┣ {gen_dir}
 ┃ ┣ sample1.wav
 ┃ ┣ sample2.wav
```

Make sure that each **reference audio file and its generated counterpart have the same file name**, as illustrated above (sample1 to sample1, sample2 to sample2).
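
A quick pre-flight check can catch naming mismatches before the evaluation runs. The following is a minimal sketch in POSIX shell and is not part of Amphion: the helper names `missing_in` and `check_pairs` are hypothetical, and it assumes flat folders of `.wav` files as in the layout above.

```sh
# Hypothetical helper (not part of Amphion): print each .wav file name
# that exists in directory $1 but has no same-named file in directory $2.
missing_in() {
    for f in "$1"/*.wav; do
        [ -e "$f" ] || continue    # the glob matched nothing
        name=$(basename "$f")
        [ -e "$2/$name" ] || echo "$name"
    done
}

# Succeed only if every .wav file in either folder has a counterpart in the other.
check_pairs() {
    unpaired=$(missing_in "$1" "$2"; missing_in "$2" "$1")
    if [ -n "$unpaired" ]; then
        echo "Unpaired files:" >&2
        echo "$unpaired" >&2
        return 1
    fi
    echo "All files are paired."
}
```

Calling `check_pairs "$ref_dir" "$gen_dir"` returns a non-zero status and lists the offending names on standard error if any file lacks a counterpart, so it can gate the evaluation step.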

## 3. Evaluation

Run `run.sh`, specifying the reference folder, generated folder, dump folder, and metrics.

```bash
cd Amphion
sh egs/metrics/run.sh \
	--reference_folder [Your path to the reference audios] \
	--generated_folder [Your path to the generated audios] \
	--dump_folder [Your path to dump the objective results] \
	--metrics [The metrics you need] \
	--fs [Optional. To calculate all metrics in the specified sampling rate]
```

The `--metrics` argument takes the metric keys as a single quoted, space-separated string, for example:

```bash
--metrics "mcd pesq fad"
```
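
Putting the flags together, a complete call might look like the sketch below. The wrapper name `run_eval` and the `DRY_RUN` variable are hypothetical and not part of Amphion; only the flags themselves come from the recipe above. The sketch assumes it is run from the Amphion root directory and that `--fs` takes a sampling rate as its value.

```sh
# Hypothetical wrapper (not part of Amphion). Run it from the Amphion root.
# Usage: run_eval REF_DIR GEN_DIR DUMP_DIR "METRICS" [SAMPLING_RATE]
run_eval() {
    # If DRY_RUN is set to a non-empty value, the command is printed
    # instead of executed. --fs is passed only when a fifth argument is given.
    ${DRY_RUN:+echo} sh egs/metrics/run.sh \
        --reference_folder "$1" \
        --generated_folder "$2" \
        --dump_folder "$3" \
        --metrics "$4" \
        ${5:+--fs "$5"}
}
```

For instance, with `DRY_RUN=1` set, `run_eval data/ref data/gen results "mcd pesq fad" 16000` prints the command it would run; the paths and the value 16000 are placeholders, not recommended settings.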

All currently available metric keys are listed below:

| Keys                  | Description                                |
| --------------------- | ------------------------------------------ |
| `fpc`                 | F0 Pearson Coefficients                    |
| `f0_periodicity_rmse` | F0 Periodicity Root Mean Square Error      |
| `f0rmse`              | F0 Root Mean Square Error                  |
| `v_uv_f1`             | Voiced/Unvoiced F1 Score                   |
| `energy_rmse`         | Energy Root Mean Square Error              |
| `energy_pc`           | Energy Pearson Coefficients                |
| `cer`                 | Character Error Rate                       |
| `wer`                 | Word Error Rate                            |
| `speaker_similarity`  | Cosine Similarity based on RawNet3         |
| `fad`                 | Frechet Audio Distance                     |
| `mcd`                 | Mel Cepstral Distortion                    |
| `mstft`               | Multi-Resolution STFT Distance             |
| `pesq`                | Perceptual Evaluation of Speech Quality    |
| `si_sdr`              | Scale Invariant Signal to Distortion Ratio |
| `si_snr`              | Scale Invariant Signal to Noise Ratio      |
| `stoi`                | Short Time Objective Intelligibility       |
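
Because the keys are plain strings, a typo in `--metrics` is easy to make. The sketch below checks a metrics string against the keys in the table before the recipe is called. The names `VALID_METRICS` and `validate_metrics` are hypothetical, and this README does not say how `run.sh` itself handles an unknown key.

```sh
# Keys copied from the table above.
VALID_METRICS="fpc f0_periodicity_rmse f0rmse v_uv_f1 energy_rmse energy_pc cer wer speaker_similarity fad mcd mstft pesq si_sdr si_snr stoi"

# Hypothetical helper (not part of Amphion): succeed only if every
# space-separated word in $1 is one of the listed keys.
validate_metrics() {
    for m in $1; do
        case " $VALID_METRICS " in
            *" $m "*) ;;    # known key, keep checking
            *) echo "Unknown metric: $m" >&2; return 1 ;;
        esac
    done
}
```

A call such as `validate_metrics "mcd pesq fad"` succeeds silently, while a misspelled key is reported on standard error with a non-zero exit status.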
