0aec0b9 | Tanel Alumäe | 04 August 2022, 08:39:43 UTC | added --do-language-id option | 04 August 2022, 08:39:43 UTC |
928adff | Tanel Alumäe | 29 June 2022, 09:43:19 UTC | Skip very short segments | 29 June 2022, 09:43:19 UTC |
adfa8de | Tanel Alumäe | 27 May 2022, 12:06:00 UTC | Merge branch 'master' of github.com:alumae/kaldi-offline-transcriber | 27 May 2022, 12:06:00 UTC |
0ab0929 | Tanel Alumäe | 27 May 2022, 12:05:49 UTC | Avoid 0-length segments | 27 May 2022, 12:05:49 UTC |
9e0142b | Tanel Alumäe | 26 May 2022, 06:21:46 UTC | Update README.md | 26 May 2022, 06:21:46 UTC |
8815439 | Tanel Alumäe | 26 May 2022, 06:21:16 UTC | Update README.md | 26 May 2022, 06:21:16 UTC |
9028b47 | Tanel Alumäe | 25 May 2022, 11:39:16 UTC | Removed SAD | 25 May 2022, 11:39:16 UTC |
1c2068f | Tanel Alumäe | 24 May 2022, 13:19:08 UTC | Now uses Silero VAD for speech activity detection | 24 May 2022, 13:19:08 UTC |
51bfdb3 | Tanel Alumäe | 15 June 2021, 10:50:12 UTC | Latest updates | 15 June 2021, 10:50:12 UTC |
c88870b | Tanel Alumäe | 14 June 2021, 07:10:31 UTC | Now works with recent changes | 14 June 2021, 07:10:31 UTC |
dbe5ad4 | Tanel Alumäe | 11 June 2021, 21:15:11 UTC | Some fixes | 11 June 2021, 21:15:11 UTC |
7dee822 | Tanel Alumäe | 11 June 2021, 14:29:23 UTC | Updating to newer versions, adding language ID | 11 June 2021, 14:29:23 UTC |
2ddfbb0 | Tanel Alumäe | 04 June 2021, 15:54:41 UTC | Bug fix | 04 June 2021, 15:54:41 UTC |
caadc22 | Tanel Alumäe | 22 February 2021, 13:30:07 UTC | Docker image now inherits from kaldi's Docker image | 22 February 2021, 13:30:07 UTC |
d3b8372 | Tanel Alumäe | 28 January 2021, 12:47:05 UTC | Update README.md Update name of the models file | 28 January 2021, 12:47:05 UTC |
14398f8 | Tanel Alumäe | 05 June 2019, 15:28:03 UTC | small fixes to speaker ID server stuff | 05 June 2019, 15:28:03 UTC |
9b74727 | Tanel Alumäe | 05 June 2019, 13:39:43 UTC | Merge branch 'master' of /home/tanel/devel/kaldi-offline-transcriber | 05 June 2019, 13:39:43 UTC |
82036ff | Tanel Alumäe | 05 June 2019, 13:39:39 UTC | reverted some changes | 05 June 2019, 13:39:39 UTC |
2059fe1 | Tanel Alumäe | 05 June 2019, 13:38:37 UTC | Small changes related to words-to-numbers | 05 June 2019, 13:38:37 UTC |
918806d | Tanel Alumäe | 05 June 2019, 13:28:58 UTC | Merge branch 'master' of /home/tanel/devel/kaldi-offline-transcriber | 05 June 2019, 13:28:58 UTC |
19dea83 | Tanel Alumäe | 05 June 2019, 13:19:47 UTC | Can now use external speaker ID server | 05 June 2019, 13:19:47 UTC |
e2dbf72 | Tanel Alumäe | 05 June 2019, 06:50:56 UTC | Riigikogu-specific changes | 05 June 2019, 06:50:56 UTC |
d64a8e7 | Tanel Alumäe | 08 January 2019, 14:11:05 UTC | Added Sync elements after each Turn beginning to make TSAB happy | 08 January 2019, 14:11:05 UTC |
dc16cd0 | Tanel Alumäe | 29 November 2018, 11:06:58 UTC | Added words-to-numbers conversion to appropriate places | 29 November 2018, 11:06:58 UTC |
a4b93d3 | Tanel Alumäe | 27 November 2018, 10:20:20 UTC | Program that converts words to numbers, using Pynini | 27 November 2018, 10:20:20 UTC |
6e84dbf | Tanel Alumäe | 27 November 2018, 10:19:37 UTC | Program that postprocesses JSON-formatted transcript using an external program | 27 November 2018, 10:19:37 UTC |
79f0d38 | Tanel Alumäe | 22 November 2018, 15:33:13 UTC | added confidence scores to CTM and JSON result files | 22 November 2018, 15:33:13 UTC |
b980ff6 | Tanel Alumäe | 22 November 2018, 15:32:03 UTC | script for converting words to numbers using FST | 22 November 2018, 15:32:03 UTC |
ad6ff46 | Tanel Alumäe | 14 November 2018, 14:43:41 UTC | Avoid small gaps between turns | 14 November 2018, 14:43:41 UTC |
b1d10b8 | Tanel Alumäe | 14 November 2018, 14:11:35 UTC | Merge branch 'master' of github.com:alumae/kaldi-offline-transcriber | 14 November 2018, 14:11:35 UTC |
b03a80a | Tanel Alumäe | 14 November 2018, 14:10:58 UTC | Fixed a Transcriber file format issue | 14 November 2018, 14:10:58 UTC |
142f0aa | Tanel Alumäe | 01 November 2018, 15:14:47 UTC | Update README.md | 01 November 2018, 15:14:47 UTC |
97152d5 | Tanel Alumäe | 31 October 2018, 16:25:30 UTC | minor fixes | 31 October 2018, 16:25:30 UTC |
b67c07b | Tanel Alumäe | 31 October 2018, 14:59:35 UTC | Some minor fixes | 31 October 2018, 14:59:35 UTC |
b656610 | Tanel Alumäe | 31 October 2018, 14:58:01 UTC | fix typo | 31 October 2018, 14:58:01 UTC |
f7dad1a | Tanel Alumäe | 31 October 2018, 11:28:12 UTC | Refactored and introduced a new JSON format that holds all information about word and segment timings | 31 October 2018, 11:28:12 UTC |
fe0ab0f | Tanel Alumäe | 31 October 2018, 11:04:16 UTC | Refactored and introduced a new JSON format that holds all information about word and segment timings | 31 October 2018, 11:04:16 UTC |
623d5c9 | Tanel Alumäe | 28 October 2018, 12:47:57 UTC | Update README.md | 28 October 2018, 12:47:57 UTC |
bcec748 | Tanel Alumäe | 21 October 2018, 18:27:20 UTC | Merge branch 'master' of github.com:alumae/kaldi-offline-transcriber | 21 October 2018, 18:27:20 UTC |
df8d5e8 | Tanel Alumäe | 21 October 2018, 18:26:43 UTC | much faster compounding now | 21 October 2018, 18:26:43 UTC |
39a19df | Tanel Alumäe | 21 October 2018, 18:26:21 UTC | clean up temp file in src-audio | 21 October 2018, 18:26:21 UTC |
4962ba9 | Tanel Alumäe | 09 October 2018, 12:00:31 UTC | Update README.md | 09 October 2018, 12:00:31 UTC |
4b02ef5 | Tanel Alumäe | 09 October 2018, 11:55:44 UTC | Update README.md Changed public image URL. | 09 October 2018, 11:55:44 UTC |
80256af | Tanel Alumäe | 04 October 2018, 07:15:32 UTC | delete wav from src-audio when cleaning up | 04 October 2018, 07:15:32 UTC |
c2c99b9 | Tanel Alumäe | 12 September 2018, 18:30:23 UTC | Updated SID models | 12 September 2018, 18:30:23 UTC |
19541b7 | Tanel Alumäe | 31 August 2018, 08:58:59 UTC | Added Dockerfile for pre-building Estonian system | 31 August 2018, 08:58:59 UTC |
09bc2a8 | Tanel Alumäe | 31 August 2018, 08:54:49 UTC | Added Dockerfile for pre-building Estonian system | 31 August 2018, 08:54:49 UTC |
9be252a | Tanel Alumäe | 31 August 2018, 08:23:57 UTC | Added Dockerfile for pre-building Estonian system | 31 August 2018, 08:23:57 UTC |
61b16fd | Tanel Alumäe | 31 August 2018, 08:22:32 UTC | Added Dockerfile for pre-building Estonian system | 31 August 2018, 08:22:32 UTC |
a532032 | Tanel Alumäe | 24 August 2018, 13:08:43 UTC | cleanup after creating decoding graph | 24 August 2018, 13:08:43 UTC |
7d511f1 | Tanel Alumäe | 24 August 2018, 09:04:38 UTC | cleanup after init | 24 August 2018, 09:04:38 UTC |
64b71d0 | Tanel Alumäe | 23 August 2018, 05:36:34 UTC | Some minor improvements | 23 August 2018, 05:36:34 UTC |
b2a457d | Tanel Alumäe | 21 August 2018, 13:52:45 UTC | Speaker ID system now uses Kaldi's native i-vector scoring | 21 August 2018, 13:52:45 UTC |
c8fb6fe | Tanel Alumäe | 09 August 2018, 12:52:10 UTC | Titlecase after questions | 09 August 2018, 12:52:10 UTC |
924aca4 | Tanel Alumäe | 09 August 2018, 11:01:01 UTC | removed redundant option | 09 August 2018, 11:01:01 UTC |
25a5116 | Tanel Alumäe | 08 August 2018, 11:42:44 UTC | doc updates | 08 August 2018, 11:42:44 UTC |
8a71397 | Tanel Alumäe | 08 August 2018, 11:34:19 UTC | doc updates | 08 August 2018, 11:34:19 UTC |
3f3f731 | Tanel Alumäe | 08 August 2018, 10:33:23 UTC | Updates related to segments file | 08 August 2018, 10:33:23 UTC |
11bfbd9 | Tanel Alumäe | 08 August 2018, 08:56:53 UTC | Refactored -- now segments file is used, instead of splitting the wav physically into pieces | 08 August 2018, 08:56:53 UTC |
194e124 | Tanel Alumäe | 07 August 2018, 11:20:06 UTC | Fixes related to init | 07 August 2018, 11:20:06 UTC |
d5982b1 | Tanel Alumäe | 07 August 2018, 10:32:54 UTC | Now uses LM with special unk handling, and <unk> words can be reconstructed from pronunciation | 07 August 2018, 10:32:54 UTC |
5995e98 | Tanel Alumäe | 28 June 2017, 08:27:39 UTC | Use ffmpeg for decoding mp4 files | 28 June 2017, 08:27:39 UTC |
2cc26c6 | Tanel Alumäe | 15 June 2017, 06:58:07 UTC | Added support for SubRip subtitle files (.srt) | 15 June 2017, 06:58:07 UTC |
b4395b6 | Tanel Alumäe | 30 May 2017, 11:47:10 UTC | Now uses DNN-based speaker ID, trained in a weakly unsupervised manner. Requires Keras | 30 May 2017, 11:47:10 UTC |
59f935b | Tanel Alumäe | 29 May 2017, 13:38:30 UTC | Now uses DNN-based speaker ID, trained in a weakly unsupervised manner. Requires Keras | 29 May 2017, 13:38:30 UTC |
8e5c513 | Tanel Alumäe | 29 May 2017, 13:37:00 UTC | mistakenly committed | 29 May 2017, 13:37:00 UTC |
81f77c6 | Tanel Alumäe | 29 May 2017, 13:15:27 UTC | Now uses DNN-based speaker ID, trained in a weakly unsupervised manner. Requires Keras | 29 May 2017, 13:15:27 UTC |
8050527 | Tanel Alumäe | 02 May 2017, 12:21:39 UTC | Replaced pyfst in compounder.py with OpenFst's native extension | 02 May 2017, 12:21:39 UTC |
c9cdfb6 | Tanel Alumäe | 02 May 2017, 12:20:29 UTC | Replaced pyfst in compounder.py with OpenFst's native extension | 02 May 2017, 12:20:29 UTC |
50cf17b | Tanel Alumäe | 26 April 2017, 12:55:54 UTC | Fixes for chain model CTM generation, too long subtitles in SBV file, and out-of-memory errors for diarization of long audio files | 26 April 2017, 12:55:54 UTC |
c745dce | Tanel Alumäe | 20 February 2017, 14:36:07 UTC | Fixes titlecasing bug | 20 February 2017, 14:36:07 UTC |
a36fa2d | Tanel Alumäe | 16 February 2017, 08:13:54 UTC | Fixes for some cases when transcribing is restarted | 16 February 2017, 08:13:54 UTC |
323fa1a | Tanel Alumäe | 13 February 2017, 12:34:52 UTC | Compatibility with user-set LD_LIBRARY_PATH | 13 February 2017, 12:34:52 UTC |
87ff3c1 | Tanel Alumäe | 13 February 2017, 12:19:12 UTC | Migrated to chain models, made python3 compatible | 13 February 2017, 12:19:12 UTC |
708bcfa | Tanel Alumäe | 13 February 2017, 12:14:35 UTC | Compatibility with user-set LD_LIBRARY_PATH | 13 February 2017, 12:14:35 UTC |
e12288a | Tanel Alumäe | 13 February 2017, 11:59:12 UTC | Migrated to chain models, made python3 compatible | 13 February 2017, 11:59:12 UTC |
17d5ec5 | Tanel Alumäe | 13 February 2017, 10:21:06 UTC | Migrated to chain models, made python3 compatible | 13 February 2017, 10:21:06 UTC |
066a5db | Tanel Alumäe | 13 February 2017, 10:19:38 UTC | Migrated to chain models, made python3 compatible | 13 February 2017, 10:19:38 UTC |
86e3c32 | Tanel Alumäe | 30 December 2015, 01:06:21 UTC | Updated models | 30 December 2015, 01:06:21 UTC |
b3e0552 | Tanel Alumäe | 30 December 2015, 01:03:48 UTC | Updated models | 30 December 2015, 01:03:48 UTC |
780eae1 | Tanel Alumäe | 29 December 2015, 22:11:24 UTC | Compatibility fix | 29 December 2015, 22:11:24 UTC |
a90f62d | Tanel Alumäe | 04 December 2015, 14:34:08 UTC | Cosmetic fixes | 04 December 2015, 14:34:08 UTC |
eb00e3b | Tanel Alumäe | 25 November 2015, 17:01:06 UTC | mp3 files are converted now using ffmpeg which is more robust to exotic formats | 25 November 2015, 17:01:06 UTC |
bb7a3b4 | Tanel Alumäe | 08 September 2015, 15:20:47 UTC | Fix for an issue that caused speaker ID to be executed when it was turned off | 08 September 2015, 15:20:47 UTC |
6730923 | Tanel Alumäe | 14 May 2015, 11:28:27 UTC | About memory reqs | 14 May 2015, 11:28:27 UTC |
2bede6f | Tanel Alumäe | 14 May 2015, 10:10:09 UTC | fix in .init | 14 May 2015, 10:10:09 UTC |
cb389ed | Tanel Alumäe | 14 May 2015, 10:03:14 UTC | Cosmetic fixes | 14 May 2015, 10:03:14 UTC |
343405f | Tanel Alumäe | 14 May 2015, 09:32:24 UTC | Removed option to decode with old-style nnet2 models, made decoding with online nnet2 models multithreaded | 14 May 2015, 09:32:24 UTC |
14f9ddf | Tanel Alumäe | 16 March 2015, 12:59:10 UTC | bug fix | 16 March 2015, 12:59:10 UTC |
48ef75c | Tanel Alumäe | 11 March 2015, 14:03:41 UTC | Updated LM | 11 March 2015, 14:03:41 UTC |
5300497 | Tanel Alumäe | 10 March 2015, 15:45:35 UTC | Clarified that the system is currently for Estonian | 10 March 2015, 15:45:35 UTC |
2feec82 | Tanel Alumäe | 06 March 2015, 09:01:59 UTC | small utf-8 related fix | 06 March 2015, 09:01:59 UTC |
babf185 | Tanel Alumäe | 05 March 2015, 10:52:18 UTC | titlecases words after . | 05 March 2015, 10:52:18 UTC |
9c4b7f2 | Tanel Alumäe | 03 March 2015, 18:59:42 UTC | small bug fix | 03 March 2015, 18:59:42 UTC |
594b21a | Tanel Alumäe | 03 March 2015, 13:35:28 UTC | txt files now have (optional) punctuation | 03 March 2015, 13:35:28 UTC |
0b3f213 | Tanel Alumäe | 03 March 2015, 13:25:59 UTC | small bug fix | 03 March 2015, 13:25:59 UTC |
722dc37 | Tanel Alumäe | 03 March 2015, 13:10:25 UTC | bug fix | 03 March 2015, 13:10:25 UTC |
56db315 | Tanel Alumäe | 03 March 2015, 11:38:33 UTC | Integrated punctuation insertion module | 03 March 2015, 11:38:33 UTC |
425267f | Tanel Alumäe | 29 December 2014, 14:47:55 UTC | Added optional support for automatic punctuation (NB! uses SRILM currently -- not for commercial use) | 29 December 2014, 14:47:55 UTC |
dc2107e | Tanel Alumäe | 29 December 2014, 14:36:32 UTC | Added optional support for automatic punctuation (NB! uses SRILM currently -- not for commercial use) | 29 December 2014, 14:36:32 UTC |