e9a1b20 | Achraf | 06 December 2022, 15:16:44 UTC | add some training data with examples with docs containing highlights | 06 December 2022, 15:16:44 UTC |
de6016d | Patrice Lopez | 05 December 2022, 17:58:58 UTC | more log debug; model update | 05 December 2022, 17:58:58 UTC |
c5ed7bc | Patrice Lopez | 03 December 2022, 19:50:01 UTC | update models/scores | 03 December 2022, 19:50:01 UTC |
ac6146c | lopez | 03 December 2022, 14:42:13 UTC | update logs | 03 December 2022, 14:42:13 UTC |
7d717b8 | lopez | 03 December 2022, 12:12:33 UTC | Merge branch 'master' of github.com:kermitt2/grobid | 03 December 2022, 12:12:33 UTC |
579939b | lopez | 03 December 2022, 12:12:23 UTC | review shared libraries got lin-64 JNI | 03 December 2022, 12:12:23 UTC |
bdefc2c | Patrice Lopez | 02 December 2022, 20:39:52 UTC | update model | 02 December 2022, 20:39:52 UTC |
2ecfd38 | Patrice Lopez | 02 December 2022, 18:33:07 UTC | review incremental training | 02 December 2022, 18:33:07 UTC |
27d2917 | lopez | 30 November 2022, 13:35:42 UTC | update lin64 wapiti binaries built with release flags | 30 November 2022, 13:35:42 UTC |
cbbc09a | Patrice Lopez | 25 November 2022, 17:02:42 UTC | try to fix circleci config | 25 November 2022, 17:02:42 UTC |
2bfadfd | Patrice Lopez | 25 November 2022, 16:47:34 UTC | Merge pull request #971 from kermitt2/incremental-training Incremental training for Deep Learning and Wapiti models | 25 November 2022, 16:47:34 UTC |
fc701f7 | Patrice Lopez | 25 November 2022, 15:06:14 UTC | document the incremental training option in the web api training service | 25 November 2022, 15:06:14 UTC |
c6b33a7 | Patrice Lopez | 25 November 2022, 14:50:33 UTC | add option incremental in REST web training service | 25 November 2022, 14:50:33 UTC |
b166763 | Patrice Lopez | 25 November 2022, 13:25:02 UTC | ensure compatibility with older modules/versions | 25 November 2022, 13:25:02 UTC |
b61eb87 | Patrice Lopez | 25 November 2022, 08:09:52 UTC | some missing steps in JNI training (although not used) | 25 November 2022, 08:09:52 UTC |
3209bbb | Patrice Lopez | 24 November 2022, 16:26:24 UTC | add incremental training for Wapiti | 24 November 2022, 16:26:24 UTC |
b1e4fd1 | Patrice Lopez | 24 November 2022, 10:38:54 UTC | document incremental training call | 24 November 2022, 10:38:54 UTC |
2dce9b9 | Patrice Lopez | 24 November 2022, 10:38:26 UTC | support incremental training for DL models | 24 November 2022, 10:38:26 UTC |
b9d92c6 | Patrice Lopez | 21 November 2022, 18:08:42 UTC | support training with transformer parameter | 21 November 2022, 18:08:42 UTC |
cb39dac | lopez | 14 November 2022, 02:00:57 UTC | ensure tests are fine | 14 November 2022, 02:00:57 UTC |
9ff77cc | lopez | 14 November 2022, 02:00:19 UTC | cleaning and removing empty blocks | 14 November 2022, 02:00:19 UTC |
5e6745c | lopez | 13 November 2022, 20:41:07 UTC | fix possible missing start token position in block | 13 November 2022, 20:41:07 UTC |
71d5559 | lopez | 13 November 2022, 17:33:24 UTC | cleaning and safety checks | 13 November 2022, 17:33:24 UTC |
95a637c | lopez | 10 November 2022, 10:28:41 UTC | retesting PWD image 0.7.2 | 10 November 2022, 10:28:41 UTC |
d394e6c | lopez | 10 November 2022, 10:16:04 UTC | test docker image version | 10 November 2022, 10:16:04 UTC |
6d97f19 | lopez | 10 November 2022, 09:44:01 UTC | update doc | 10 November 2022, 09:44:01 UTC |
f0397d7 | lopez | 10 November 2022, 09:18:16 UTC | update PWD parameters | 10 November 2022, 09:18:16 UTC |
9f66c15 | lopez | 10 November 2022, 09:09:54 UTC | Merge branch 'master' of github.com:kermitt2/grobid | 10 November 2022, 09:09:54 UTC |
36cf9f0 | lopez | 10 November 2022, 09:09:49 UTC | add link to datastet | 10 November 2022, 09:09:49 UTC |
45ced10 | Patrice Lopez | 10 November 2022, 09:08:35 UTC | Merge pull request #962 from kurdi-dev/master Adding PWD (Play with Docker) as another demo for Grobid | 10 November 2022, 09:08:35 UTC |
4dee6f7 | lopez | 07 November 2022, 16:42:32 UTC | fix null model path case | 07 November 2022, 16:42:32 UTC |
11e08cb | Walid R. Rashed | 31 October 2022, 09:36:38 UTC | Merge branch 'kermitt2:master' into master | 31 October 2022, 09:36:38 UTC |
9b06879 | lopez | 31 October 2022, 06:41:08 UTC | move to 0.7.3-SNAPSHOT | 31 October 2022, 06:41:08 UTC |
5d104b1 | Walid | 30 October 2022, 22:30:10 UTC | docs: adding Play with Docker button to the README file | 30 October 2022, 22:30:10 UTC |
05c0110 | Walid R. Rashed | 30 October 2022, 22:00:53 UTC | feat: adding docker compose file based on lfoppiano/grobid CRF model container | 30 October 2022, 22:00:53 UTC |
b189a64 | lopez | 30 October 2022, 14:21:42 UTC | fix crossref consolidation default | 30 October 2022, 14:21:42 UTC |
fcaf667 | lopez | 30 October 2022, 09:17:21 UTC | prepare release 0.7.2 | 30 October 2022, 09:17:21 UTC |
9ddc9f6 | lopez | 29 October 2022, 20:10:22 UTC | try to fix exit-code 137 for circleci test | 29 October 2022, 20:10:22 UTC |
2b8affc | lopez | 29 October 2022, 19:32:40 UTC | review default parameter DL reference-segmenter model | 29 October 2022, 19:32:40 UTC |
3c7629b | lopez | 29 October 2022, 19:29:37 UTC | update new eval | 29 October 2022, 19:29:37 UTC |
4821754 | lopez | 29 October 2022, 16:14:58 UTC | for safety. better cleaning of reference labels | 29 October 2022, 16:14:58 UTC |
f53cac7 | lopez | 29 October 2022, 16:13:51 UTC | refine segment recombination | 29 October 2022, 16:13:51 UTC |
cec5341 | lopez | 29 October 2022, 16:13:03 UTC | model updates | 29 October 2022, 16:13:03 UTC |
773db26 | lopez | 29 October 2022, 11:17:05 UTC | update training data | 29 October 2022, 11:17:05 UTC |
6705cb0 | lopez | 28 October 2022, 15:34:09 UTC | update benchmarking | 28 October 2022, 15:34:09 UTC |
a165cea | lopez | 27 October 2022, 12:12:34 UTC | review exception message | 27 October 2022, 12:12:34 UTC |
fa7e6b0 | Patrice Lopez | 27 October 2022, 11:32:24 UTC | Merge pull request #940 from kermitt2/improvement/jep-init-error-display Improve JEP initialisation and DL model loading error logging | 27 October 2022, 11:32:24 UTC |
033703f | lopez | 27 October 2022, 11:29:45 UTC | remove old model | 27 October 2022, 11:29:45 UTC |
011ca2c | lopez | 27 October 2022, 11:06:47 UTC | more training data fix | 27 October 2022, 11:06:47 UTC |
779ecf4 | Patrice Lopez | 27 October 2022, 10:35:52 UTC | typo in training | 27 October 2022, 10:35:52 UTC |
758b0b7 | lopez | 26 October 2022, 15:27:41 UTC | update model | 26 October 2022, 15:27:41 UTC |
621ed5d | lopez | 26 October 2022, 12:28:05 UTC | update crf models | 26 October 2022, 12:28:05 UTC |
a876a5e | lopez | 25 October 2022, 13:05:46 UTC | update models | 25 October 2022, 13:05:46 UTC |
88be98b | lopez | 24 October 2022, 12:41:36 UTC | update header model | 24 October 2022, 12:41:36 UTC |
c64ee30 | lopez | 23 October 2022, 14:12:17 UTC | small addition training citation | 23 October 2022, 14:12:17 UTC |
95ed80c | lopez | 23 October 2022, 14:09:53 UTC | addition training data segmentation | 23 October 2022, 14:09:53 UTC |
691a3d9 | lopez | 23 October 2022, 14:08:27 UTC | small addition training header | 23 October 2022, 14:08:27 UTC |
ebfc3ea | lopez | 23 October 2022, 13:58:34 UTC | ensure cleaning of extracted bib labels | 23 October 2022, 13:58:34 UTC |
4cc727d | lopez | 23 October 2022, 13:56:52 UTC | update training data reference segmenter | 23 October 2022, 13:56:52 UTC |
b58ad5e | lopez | 22 October 2022, 18:59:57 UTC | add training data | 22 October 2022, 18:59:57 UTC |
c910f98 | Patrice Lopez | 21 October 2022, 18:32:40 UTC | Merge pull request #742 from kermitt2/option-consolidate-with-doi-only added option to consolidate using doi only. | 21 October 2022, 18:32:40 UTC |
4168261 | lopez | 21 October 2022, 17:39:41 UTC | minor rephrase/typos | 21 October 2022, 17:39:41 UTC |
aac0ce4 | lopez | 21 October 2022, 13:07:33 UTC | support long sequences for reference segmenter RNN model and batch process | 21 October 2022, 13:07:33 UTC |
c0b27cf | Achraf | 20 October 2022, 11:23:41 UTC | Merge branch 'master' into option-consolidate-with-doi-only | 20 October 2022, 11:23:41 UTC |
1837b3b | Achraf | 20 October 2022, 11:18:44 UTC | Merge branch 'master' into option-consolidate-with-doi-only | 20 October 2022, 11:18:44 UTC |
3440cf8 | lopez | 20 October 2022, 09:46:11 UTC | training typos | 20 October 2022, 09:46:11 UTC |
a62fc12 | lopez | 20 October 2022, 07:50:58 UTC | Merge branch 'master' of github.com:kermitt2/grobid | 20 October 2022, 07:50:58 UTC |
0136530 | lopez | 20 October 2022, 07:50:41 UTC | clean/correct training | 20 October 2022, 07:50:41 UTC |
02b6e2c | Luca Foppiano | 20 October 2022, 02:09:04 UTC | Unit tests I forgot to commit | 20 October 2022, 02:09:04 UTC |
da7bb0c | lopez | 19 October 2022, 20:00:19 UTC | quick refresh authors in references | 19 October 2022, 20:00:19 UTC |
dab259e | Patrice Lopez | 19 October 2022, 07:36:47 UTC | Merge pull request #959 from kermitt2/feature/funding-statement Add funding statement in TEI output | 19 October 2022, 07:36:47 UTC |
6c8b888 | Luca Foppiano | 19 October 2022, 02:37:37 UTC | remove field from PMC - make a general method for it | 19 October 2022, 02:37:37 UTC |
5f08df2 | Luca Foppiano | 19 October 2022, 02:04:23 UTC | Merge branch 'master' into feature/funding-statement | 19 October 2022, 02:04:23 UTC |
9fdec9c | lopez | 18 October 2022, 14:06:12 UTC | cleaning useless hack; generalizing post-processing for short texts | 18 October 2022, 14:06:12 UTC |
b528647 | Luca Foppiano | 18 October 2022, 08:26:13 UTC | update name and add unit test | 18 October 2022, 08:26:13 UTC |
4ac0339 | Luca Foppiano | 18 October 2022, 06:47:50 UTC | add funding XPaths in end to end evaluation | 18 October 2022, 06:47:50 UTC |
8e53a7d | lopez | 17 October 2022, 18:15:06 UTC | additional training data and model update | 17 October 2022, 18:15:06 UTC |
1836680 | Luca Foppiano | 17 October 2022, 10:56:59 UTC | apply post processing of tei sections of text without considering tables and figures labels | 17 October 2022, 10:56:59 UTC |
0bd482c | lopez | 16 October 2022, 16:56:53 UTC | skip availability statement eval for e2e PMC set | 16 October 2022, 16:56:53 UTC |
134ac7a | Patrice Lopez | 15 October 2022, 13:41:21 UTC | Merge pull request #838 from kermitt2/prioritize_crossref_author_meta prefer author meta from consolidation, crossref is considered more r… | 15 October 2022, 13:41:21 UTC |
197ecd0 | Patrice Lopez | 15 October 2022, 13:35:30 UTC | Merge pull request #961 from kermitt2/martin-citation-annotations Corrected ref data from PR #864 | 15 October 2022, 13:35:30 UTC |
f82daa5 | lopez | 15 October 2022, 13:22:50 UTC | corrected ref data from PR #864 | 15 October 2022, 13:22:50 UTC |
ad1ee7f | lopez | 15 October 2022, 13:03:49 UTC | add back training data generation for raw reference strings | 15 October 2022, 13:03:49 UTC |
52352a8 | lopez | 15 October 2022, 08:42:59 UTC | add new hack | 15 October 2022, 08:42:59 UTC |
27a25ad | lopez | 15 October 2022, 08:42:49 UTC | roll back first hack | 15 October 2022, 08:42:49 UTC |
53ace9d | lopez | 15 October 2022, 07:04:36 UTC | cleaning; review basic keyword segmentation | 15 October 2022, 07:04:36 UTC |
14f5c5a | Luca Foppiano | 13 October 2022, 05:43:08 UTC | avoid losing text when processShort tags text as figure or table | 13 October 2022, 05:43:08 UTC |
b1c0cfd | Luca Foppiano | 13 October 2022, 03:24:43 UTC | output funding statement in the back of the TEI output | 13 October 2022, 03:24:43 UTC |
06a526f | Luca Foppiano | 12 October 2022, 05:14:39 UTC | Add mention to WSL mode in documentation related to #954 | 12 October 2022, 05:14:39 UTC |
0f6c1b4 | lopez | 08 October 2022, 06:56:21 UTC | try to rephrase more clearly :) | 08 October 2022, 06:56:21 UTC |
2a16015 | Patrice Lopez | 07 October 2022, 06:25:50 UTC | Merge pull request #951 from kermitt2/feature/data-availability-statement Data and code availability statement zone | 07 October 2022, 06:25:50 UTC |
1925296 | lopez | 06 October 2022, 19:46:09 UTC | set default timeout and max blocks higher | 06 October 2022, 19:46:09 UTC |
7bb0c8f | lopez | 06 October 2022, 19:38:15 UTC | update segmentation model prior to merge | 06 October 2022, 19:38:15 UTC |
08b176f | lopez | 05 October 2022, 18:44:20 UTC | more training data for segmentation model | 05 October 2022, 18:44:20 UTC |
bd3de67 | lopez | 05 October 2022, 15:49:42 UTC | fix errors in latest training data | 05 October 2022, 15:49:42 UTC |
0f48242 | lopez | 03 October 2022, 18:35:22 UTC | new training data segmentation for availability statements | 03 October 2022, 18:35:22 UTC |
3c8c9e5 | lopez | 03 October 2022, 11:57:15 UTC | add a script to select interesting training cases from JATS/PDF pairs | 03 October 2022, 11:57:15 UTC |
74e29c2 | lopez | 03 October 2022, 08:44:07 UTC | fix test | 03 October 2022, 08:44:07 UTC |
332daf1 | lopez | 02 October 2022, 18:05:47 UTC | better foot note identifier | 02 October 2022, 18:05:47 UTC |
15e8565 | lopez | 02 October 2022, 17:55:23 UTC | fix xml id for foot notes | 02 October 2022, 17:55:23 UTC |