https://github.com/kermitt2/grobid

Revision Message Commit Date
fc701f7 document the incremental training option in the web api training service 25 November 2022, 15:06:14 UTC
c6b33a7 add option incremental in REST web training service 25 November 2022, 14:50:33 UTC
b166763 ensure compatibility with older modules/versions 25 November 2022, 13:25:02 UTC
b61eb87 some missing steps in JNI training (although not used) 25 November 2022, 08:09:52 UTC
3209bbb add incremental training for Wapiti 24 November 2022, 16:26:24 UTC
b1e4fd1 document incremental training call 24 November 2022, 10:38:54 UTC
2dce9b9 support incremental training for DL models 24 November 2022, 10:38:26 UTC
b9d92c6 support training with transformer parameter 21 November 2022, 18:08:42 UTC
cb39dac ensure tests are fine 14 November 2022, 02:00:57 UTC
9ff77cc cleaning and removing empty blocks 14 November 2022, 02:00:19 UTC
5e6745c fix possible missing start token position in block 13 November 2022, 20:41:07 UTC
71d5559 cleaning and safety checks 13 November 2022, 17:33:24 UTC
95a637c retesting PWD image 0.7.2 10 November 2022, 10:28:41 UTC
d394e6c test docker image version 10 November 2022, 10:16:04 UTC
6d97f19 update doc 10 November 2022, 09:44:01 UTC
f0397d7 update PWD parameters 10 November 2022, 09:18:16 UTC
9f66c15 Merge branch 'master' of github.com:kermitt2/grobid 10 November 2022, 09:09:54 UTC
36cf9f0 add link to datastet 10 November 2022, 09:09:49 UTC
45ced10 Merge pull request #962 from kurdi-dev/master Adding PWD (Play with Docker) as another demo for Grobid 10 November 2022, 09:08:35 UTC
4dee6f7 fix null model path case 07 November 2022, 16:42:32 UTC
11e08cb Merge branch 'kermitt2:master' into master 31 October 2022, 09:36:38 UTC
9b06879 move to 0.7.3-SNAPSHOT 31 October 2022, 06:41:08 UTC
5d104b1 docs: adding Play with Docker button to the README file 30 October 2022, 22:30:10 UTC
05c0110 feat: adding docker compose file based on lfoppiano/grobid CRF model container 30 October 2022, 22:00:53 UTC
b189a64 fix crossref consolidation default 30 October 2022, 14:21:42 UTC
fcaf667 prepare release 0.7.2 30 October 2022, 09:17:21 UTC
9ddc9f6 try to fix exit-code 137 for circleci test 29 October 2022, 20:10:22 UTC
2b8affc review default parameter DL reference-segmenter model 29 October 2022, 19:32:40 UTC
3c7629b update new eval 29 October 2022, 19:29:37 UTC
4821754 for safety. better cleaning of reference labels 29 October 2022, 16:14:58 UTC
f53cac7 refine segment recombination 29 October 2022, 16:13:51 UTC
cec5341 model updates 29 October 2022, 16:13:03 UTC
773db26 update training data 29 October 2022, 11:17:05 UTC
6705cb0 update benchmarking 28 October 2022, 15:34:09 UTC
a165cea review exception message 27 October 2022, 12:12:34 UTC
fa7e6b0 Merge pull request #940 from kermitt2/improvement/jep-init-error-display Improve JEP initialisation and DL model loading error logging 27 October 2022, 11:32:24 UTC
033703f remove old model 27 October 2022, 11:29:45 UTC
011ca2c more training data fix 27 October 2022, 11:06:47 UTC
779ecf4 typo in training 27 October 2022, 10:35:52 UTC
758b0b7 update model 26 October 2022, 15:27:41 UTC
621ed5d update crf models 26 October 2022, 12:28:05 UTC
a876a5e update models 25 October 2022, 13:05:46 UTC
88be98b update header model 24 October 2022, 12:41:36 UTC
c64ee30 small addition training citation 23 October 2022, 14:12:17 UTC
95ed80c addition training data segmentation 23 October 2022, 14:09:53 UTC
691a3d9 small addition training header 23 October 2022, 14:08:27 UTC
ebfc3ea ensure cleaning of extracted bib labels 23 October 2022, 13:58:34 UTC
4cc727d update training data reference segmenter 23 October 2022, 13:56:52 UTC
b58ad5e add training data 22 October 2022, 18:59:57 UTC
c910f98 Merge pull request #742 from kermitt2/option-consolidate-with-doi-only added option to consolidate using doi only. 21 October 2022, 18:32:40 UTC
4168261 minor rephrase/typos 21 October 2022, 17:39:41 UTC
aac0ce4 support long sequences for reference segmenter RNN model and batch process 21 October 2022, 13:07:33 UTC
c0b27cf Merge branch 'master' into option-consolidate-with-doi-only 20 October 2022, 11:23:41 UTC
1837b3b Merge branch 'master' into option-consolidate-with-doi-only 20 October 2022, 11:18:44 UTC
3440cf8 training typos 20 October 2022, 09:46:11 UTC
a62fc12 Merge branch 'master' of github.com:kermitt2/grobid 20 October 2022, 07:50:58 UTC
0136530 clean/correct training 20 October 2022, 07:50:41 UTC
02b6e2c Unit tests I forgot to commit 20 October 2022, 02:09:04 UTC
da7bb0c quick refresh authors in references 19 October 2022, 20:00:19 UTC
dab259e Merge pull request #959 from kermitt2/feature/funding-statement Add funding statement in TEI output 19 October 2022, 07:36:47 UTC
6c8b888 remove field from PMC - make a general method for it 19 October 2022, 02:37:37 UTC
5f08df2 Merge branch 'master' into feature/funding-statement 19 October 2022, 02:04:23 UTC
9fdec9c cleaning useless hack ; generalizing post-processing for short texts 18 October 2022, 14:06:12 UTC
b528647 update name and add unit test 18 October 2022, 08:26:13 UTC
4ac0339 add funding XPaths in end to end evaluation 18 October 2022, 06:47:50 UTC
8e53a7d additional training data and model update 17 October 2022, 18:15:06 UTC
1836680 apply post processing of tei sections of text without considering tables and figures labels 17 October 2022, 10:56:59 UTC
0bd482c skip availability statement eval for e2e PMC set 16 October 2022, 16:56:53 UTC
134ac7a Merge pull request #838 from kermitt2/prioritize_crossref_author_meta prefer author meta from consolidation , crossref is considered more r… 15 October 2022, 13:41:21 UTC
197ecd0 Merge pull request #961 from kermitt2/martin-citation-annotations Corrected ref data from PR #864 15 October 2022, 13:35:30 UTC
f82daa5 corrected ref data from PR #864 15 October 2022, 13:22:50 UTC
ad1ee7f add back training data generation for raw reference strings 15 October 2022, 13:03:49 UTC
52352a8 add new hack 15 October 2022, 08:42:59 UTC
27a25ad roll back first hack 15 October 2022, 08:42:49 UTC
53ace9d cleaning; review basic keyword segmentation 15 October 2022, 07:04:36 UTC
14f5c5a avoid losing text when processShort tag text as figure or table 13 October 2022, 05:43:08 UTC
b1c0cfd output funding statement in the back of the TEI output 13 October 2022, 03:24:43 UTC
06a526f Add mention to WSL mode in documentation related to #954 12 October 2022, 05:14:39 UTC
0f6c1b4 try to rephrase more clearly :) 08 October 2022, 06:56:21 UTC
2a16015 Merge pull request #951 from kermitt2/feature/data-availability-statement Data and code availability statement zone 07 October 2022, 06:25:50 UTC
1925296 set default timeout and max blocks higher 06 October 2022, 19:46:09 UTC
7bb0c8f update segmentation model prior to merge 06 October 2022, 19:38:15 UTC
08b176f more training data for segmentation model 05 October 2022, 18:44:20 UTC
bd3de67 fix errors in latest training data 05 October 2022, 15:49:42 UTC
0f48242 new training data segmentation for availability statements 03 October 2022, 18:35:22 UTC
3c8c9e5 add a script to select interesting training cases from JATS/PDF pairs 03 October 2022, 11:57:15 UTC
74e29c2 fix test 03 October 2022, 08:44:07 UTC
332daf1 better foot note identifier 02 October 2022, 18:05:47 UTC
15e8565 fix xml id for foot notes 02 October 2022, 17:55:23 UTC
2605750 remove unstable integration test 27 September 2022, 14:43:32 UTC
2278be1 fix conflict 27 September 2022, 14:21:55 UTC
4e96757 minor optional diff report 27 September 2022, 13:37:08 UTC
f9dc68f Merge pull request #944 from kermitt2/features/footnotes Link footnotes in the text 27 September 2022, 13:35:08 UTC
2f2241e minor for trace 26 September 2022, 17:11:30 UTC
278ee1d review eval, remove redundant normalize-space 26 September 2022, 16:17:57 UTC
7b0aa0c avoid regression; cleaning 26 September 2022, 10:59:12 UTC
68efda8 better field name for reporting 25 September 2022, 16:00:06 UTC
14cc516 remove non-Grobid TEI path 25 September 2022, 15:18:00 UTC
98aface do not restrict availability statement to data availability 25 September 2022, 14:39:09 UTC
e098b50 write header availability statement in the final TEI 25 September 2022, 14:31:37 UTC