https://github.com/kermitt2/grobid

Revision Message Commit Date
c1927ec improving error message 27 February 2020, 09:03:15 UTC
67a2863 adding some debugging information 27 February 2020, 08:51:47 UTC
53044fd Merge pull request #538 from kermitt2/change-port-server-unittests Change local unit tests server ports 29 January 2020, 16:24:30 UTC
40da6de change the port of the local service to a less likely value 29 January 2020, 04:46:36 UTC
8dc5de2 update fulltext model 22 January 2020, 06:52:30 UTC
cfd2400 some training data for the fulltext model 21 January 2020, 14:55:41 UTC
ea6a0cd add dummy viewer page 20 January 2020, 15:56:26 UTC
5ad1761 add software heritage badge 18 January 2020, 14:08:45 UTC
d163175 update segmentation model 18 January 2020, 11:31:17 UTC
b37c855 a bit more training data for the segmentation model 18 January 2020, 07:59:07 UTC
331d27b fix <p> opening tag for generated fulltext training data 13 January 2020, 10:32:44 UTC
0b2d68b doc typo 10 January 2020, 04:37:41 UTC
66de8d5 correcting wrong parameters assignments 07 January 2020, 23:19:25 UTC
ab36463 Merge pull request #527 from kermitt2/cors-configuration Cors configuration 07 January 2020, 21:26:38 UTC
3d758a0 some date updates 05 January 2020, 05:01:21 UTC
8ac0eb1 cleaning references 27 December 2019, 11:30:46 UTC
8ad022d adding JsonProperty in configuration 25 December 2019, 06:57:27 UTC
cce38c8 Adding documentation 18 December 2019, 19:49:25 UTC
162c283 implement configurable cors headers 18 December 2019, 19:16:28 UTC
f0457c8 Merge pull request #525 from kant/patch-1 Fixed duplicate word on paragraph 29 15 December 2019, 10:00:47 UTC
585f40d Fixed duplicate word on paragraph 29 13 December 2019, 17:15:43 UTC
2e70ade set rights in docker image in case it will not run as user-root (e.g. kubernetes) 05 December 2019, 09:43:32 UTC
0b6d1d7 typo 26 November 2019, 16:08:31 UTC
82c0181 a bit more explanation on building local docker image 26 November 2019, 15:54:27 UTC
cfa50af bug fixing for affiliation block fragments 21 November 2019, 15:21:37 UTC
63885c9 remove non TEI-valid attribute in table 20 November 2019, 15:17:52 UTC
e520000 consistent formatting of ref in doc 20 November 2019, 15:17:21 UTC
5d3ad03 typo in ref 16 November 2019, 18:22:34 UTC
4d4b8d8 add one more ref. 15 November 2019, 22:52:23 UTC
7a4fa7d update appinfo TEI output 13 November 2019, 13:57:51 UTC
a9a0c54 add robustness for pdfalto recent serialization problems 10 November 2019, 18:25:26 UTC
b4644c2 last issue with #519 09 November 2019, 23:15:55 UTC
16c432c tackle the case of doi introduced with string Doi (was missed so far) 08 November 2019, 16:44:26 UTC
bf79e8f Merge pull request #519 from kermitt2/fixes-n-fold-evaluation Fixes n fold evaluation 08 November 2019, 07:36:08 UTC
7f0a04f Update/Fix minor doc errors 07 November 2019, 19:50:51 UTC
5a38058 Computing average as the macro average of the micro averages of all the results from each fold #516 06 November 2019, 07:06:23 UTC
3951dc0 Compute best and worst model using micro F1 score #516 04 November 2019, 23:45:21 UTC
9ad861e fixing pdfviewer demo Former-commit-id: 4a7e4f3b93e46caa81e7507e659631551d8e7dd8 17 October 2019, 09:49:45 UTC
a4d35d2 update doc with new version number Former-commit-id: 41ab93618f9515e8638c0237f9b91234f18b7b6b 17 October 2019, 08:22:28 UTC
c0806fa [Gradle Release Plugin] - new version commit: '0.6.0-SNAPSHOT'. Former-commit-id: 00c0dfc2ff970dbb77f0af11d9d34775dbda4f75 16 October 2019, 13:58:40 UTC
4cfebc8 [Gradle Release Plugin] - pre tag commit: '0.5.6'. Former-commit-id: c395489b14dae2df9e4fc0b6dbdb8da1f7e7a5ba 16 October 2019, 13:56:28 UTC
00831b0 Revert "[Gradle Release Plugin] - pre tag commit: '0.5.6'." This reverts commit cc73bcaefd32a7f0ace4d0eae532a92df59e3f21 [formerly 0820ec13889c1a21eda9b0c546535be5df1b1bc2]. Former-commit-id: c12c0002e17960ff11b6f5851a5c97465ff291e0 16 October 2019, 13:50:57 UTC
dd5ffc1 Revert "[Gradle Release Plugin] - new version commit: '0.6.0'." This reverts commit 95580d901760f1cdd971fbcb302a70f8db71aa31 [formerly b41aa84a09d57167cf32a014a733d7615d029410]. Former-commit-id: 9b5fa3643cbbad3b06f7edeb02f02bd6c5184e62 16 October 2019, 13:49:58 UTC
95580d9 [Gradle Release Plugin] - new version commit: '0.6.0'. Former-commit-id: b41aa84a09d57167cf32a014a733d7615d029410 16 October 2019, 13:40:15 UTC
cc73bca [Gradle Release Plugin] - pre tag commit: '0.5.6'. Former-commit-id: 0820ec13889c1a21eda9b0c546535be5df1b1bc2 16 October 2019, 13:38:17 UTC
48cda84 update of linux pdfalto binaries, release benchmark Former-commit-id: 388f888255d132aeb35c619618cc74cc334197b4 16 October 2019, 08:09:48 UTC
bd2b272 Merge pull request #496 from kermitt2/pdfalto_parser_fixes pdfalto binaries update pdfalto parser updates Fix for #509 #152 Doc update Former-commit-id: f85acd5afe094d1959b3faff6705505de0ef47c6 14 October 2019, 18:05:16 UTC
2ea9d7e cleaning Former-commit-id: 0c8863b508ed9343af7d19e663336dcd8c5c8a49 14 October 2019, 17:46:00 UTC
1622c58 fix #509 #152 and state to preserve spaces in xml Former-commit-id: 369de2ceda78238cdf38f41c0d67a31da1bf08ca 14 October 2019, 17:43:06 UTC
e10741b update readme for new release 0.5.6 Former-commit-id: 3eed2d6611d0a9a9a04aba9c7745dc1d0d37c539 14 October 2019, 17:41:15 UTC
60b8535 adding pdfalto for windows Former-commit-id: c5d98478999b03e62a22f3367104c25c2a38e36d 13 October 2019, 14:41:45 UTC
52a80b4 adding pdfalto for mac-64 Former-commit-id: 0fb990455f0daae30401b431bc9796dd437a28fa 29 September 2019, 05:47:13 UTC
cd449d4 fix test, update lin-64 pdfalto for bold/italic capture Former-commit-id: 7066d20119f433bffbf79ce9f37e57e0b12ea55b 28 September 2019, 20:04:12 UTC
7f3e08e Remove the font name test for bold/italic because it is done in pdfalto now Former-commit-id: 26efb67bb9ac323c99c4d369ad15a61c77b9e4ba 28 September 2019, 19:16:39 UTC
b6ac573 Merge branch 'master' into pdfalto_parser_fixes Former-commit-id: 2eaed4516e1e90cb78ce9e8883cd97d5454d0dc5 28 September 2019, 19:13:23 UTC
1adc351 Merge pull request #498 from kermitt2/improved-dehypenisation Improved dehypenisation Former-commit-id: 472324ac14e4b6489972fd6e086360525d6736d4 28 September 2019, 19:06:28 UTC
794297e fix test Former-commit-id: 3212b61d1eb3d99b118dae35074c26003c8d639d 28 September 2019, 17:40:36 UTC
f1ddb17 support unicode strings Former-commit-id: f395e2104f9b3d61157e11e65af1ce85b901deff 28 September 2019, 17:39:22 UTC
845ecbb do not use anymore deprecated dehyphenization methods in grobid core Former-commit-id: 883b3cbf327afced5f3ee88e1454c948e5f07c69 28 September 2019, 17:21:03 UTC
df4e01f Merge branch 'master' into improved-dehypenisation Former-commit-id: 5f4af225ee68a7656172dddd9063e2c49b552cf1 28 September 2019, 14:58:48 UTC
8d5afd2 doc update Former-commit-id: 24e6f0ee4834a3d2e79340747688b69b7003e40e 27 September 2019, 18:49:11 UTC
9d0344b Fix #505 Former-commit-id: ce45a96c3e3318e6a54c234041f2e0a435734498 23 September 2019, 07:08:24 UTC
71e3e17 Do not use the XMP embedded metadata for the moment; cleaning Former-commit-id: bc18bed3e6b963113f24c0fac8b872ccf6423471 14 September 2019, 05:14:21 UTC
bc0d325 Review usage of XMP PDF embedded metadata Former-commit-id: 8086c8cb0f4e3955896a4af28c1c612e1e17ca5e 13 September 2019, 19:07:34 UTC
0ce0cf3 cleaning Former-commit-id: bf7e1de523e2d5a0b5034888dcc28196c89f61d1 13 September 2019, 05:32:39 UTC
1ff82c4 merge with master for benchmark Former-commit-id: c71f8791a24670dd519cb8c5e32c98473f72c07a 13 September 2019, 05:02:01 UTC
10be7af fix merging issue with master Former-commit-id: bc6bd9ad70f4be00f7154ca8237510d9242b89ac 13 September 2019, 05:01:07 UTC
ef8a54e Merge pull request #486 from kermitt2/duplicated-body-parts-476 Avoid duplicated body part in the abstract Former-commit-id: f1845462e4b0e926cab7c8fc229153a259937014 12 September 2019, 19:49:20 UTC
6ece1fb Add a cleaning method for abstract working with layout tokens Former-commit-id: 06da47f7ebebc2b4e5b97b06121d882fc2a8a15a 12 September 2019, 19:32:31 UTC
4fb374f fix #424, fix labeled abstract mapping Former-commit-id: 6a9e16768ff3a794aad0acd005814e9b9bbbfe14 12 September 2019, 12:25:59 UTC
78fb889 review processShort; fix bug for DocumentPiece handling in feature generation Former-commit-id: 345c6ae5c19b97e6df5c7ee2eb71d67d1630a26a 12 September 2019, 07:34:55 UTC
0308066 add model declaration for dataseer Former-commit-id: 1a1653b19e349f6e6a5aa976d268d8cdf02e2497 11 September 2019, 08:30:23 UTC
43ad93e Merge pull request #280 from kermitt2/check-evaluation Improvement in evaluation framework Former-commit-id: 38489ed87ca296f77451a339ceb8ccc8c0ba0bb7 11 September 2019, 08:25:14 UTC
6fc4f4b ignore submodule grobid-keyterm Former-commit-id: 12e392c56d2e9c1898facb711cf82357b2b0936b 11 September 2019, 08:21:18 UTC
3fad113 Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: db33786b3518e17ddeb6ecec36c333692c402b71 07 September 2019, 23:59:42 UTC
2ae6076 avoid that the python.virtualEnv property breaks the modules performing checkProperties Former-commit-id: c534c417caee94eed1a8bf6b6b109c3f026a7321 07 September 2019, 23:58:59 UTC
b7c15e9 Implementing suggestions and move code into methods + adding some unit tests Former-commit-id: 02612ffbd6c00a6d7e3460d08ff221b66f527052 07 September 2019, 09:19:09 UTC
f7f1030 adding subList by Offset for layout tokens Former-commit-id: a22098e68133785975aa9524b9755161ddcfa601 06 September 2019, 08:46:42 UTC
dc2d834 getting instance of GrobidProperties before running tests Former-commit-id: d6b1d0e55a41f0f77ed65f7984608abb72edd022 06 September 2019, 06:49:08 UTC
655d8b1 cosmetics Former-commit-id: 4b14c67a1b74a40d0fe1949887541fc99bc13ce7 06 September 2019, 06:47:22 UTC
5a4e921 avoiding going out of bounds Former-commit-id: f6b243425177751d762e2f124260b9d99ef84f77 06 September 2019, 05:55:28 UTC
0040cf7 improving dehypenisation using coordinates to check breakline Former-commit-id: 3ccfd89e4381cf5488fb7d9d44061c96e77e8ac6 06 September 2019, 05:29:46 UTC
1cd280e cleanup dehypenisation Former-commit-id: fc0cafa787d5d318c00e9bccea91975fd51839dd 06 September 2019, 05:29:32 UTC
74932ca improving naming Former-commit-id: 9e44d1c007d5fa8cad0450796a695692343343f7 04 September 2019, 00:07:41 UTC
6b1cd96 fixing extraction of font styles from ALTO format #495 Former-commit-id: 920113efb4697aa73f04fba1fa0dae04135a5209 03 September 2019, 23:54:34 UTC
79e9869 adding more tests on pdfalto parser and trying to fix issues with bold/italic and subscript/superscript Former-commit-id: 8c2d2d504d7a0c18bbfe4f3bc82eb80e59043d98 03 September 2019, 08:24:44 UTC
a14e427 extra explanations on grobid-home for the batch mode to avoid any confusions Former-commit-id: 64a2a46eff406920b2162f9d9d467b2521d4afc2 31 August 2019, 19:18:35 UTC
a88d851 correct spelling in new doc Former-commit-id: 27cde82e3439428364ab61c375879031af810a65 28 August 2019, 02:13:35 UTC
c7b922a documentation for n-folds evaluation Former-commit-id: 2adac376ee56023a2a4ffef85bee42e2229acf61 28 August 2019, 01:44:55 UTC
ec7cfa2 Implementing review remarks #453 Former-commit-id: 2a6ab0988339c0ecd4e1a382ee72ce844e3eb686 27 August 2019, 23:26:44 UTC
813de2a use previous processShort for all short texts Former-commit-id: 2087e78a0cacc959f93a47c2dc492616ae3e9a47 23 August 2019, 11:48:20 UTC
76c8f01 rollback Former-commit-id: 377ad90264ea41c9919f5bad7cf43f8637d59aa8 22 August 2019, 20:35:03 UTC
b1ee6b1 update processShort for applying the fulltext model to short piece of texts like the abstract Former-commit-id: 626ad60fea3093ae56dc08f8477b7eec0417e9b9 22 August 2019, 15:36:34 UTC
7fa9f4a better PMID and PMC ID recognition, update citation model with some PMID examples Former-commit-id: 25258577dcc3b409b20341f19eea56360191ad3d 22 August 2019, 14:02:31 UTC
08e5064 cleaning remaining bin/ Former-commit-id: 17d83dd87b2299f421d0a80d722dde13bff87ca6 20 August 2019, 19:01:36 UTC
ea612a0 Merge pull request #488 from elifesciences/added-bin-to-gitignore added bin to .gitignore Former-commit-id: 01b92384a2bb3df0a45bf4f59ed5e56901602d87 20 August 2019, 18:59:15 UTC
9b28884 added bin to .gitignore Former-commit-id: 5f71cd663458528834ef0d8f19ce9f2088f5eaae 20 August 2019, 13:22:35 UTC
4316344 create valid DocumentPiece for further structuring abstract Former-commit-id: c6d29308a36c34fe1dcfed7c3db224a0153578c7 20 August 2019, 12:32:41 UTC
df92b5a saved by a test :-) Former-commit-id: 2b0915310596aec9fa5edde8aa1b366a6d412e11 20 August 2019, 09:11:26 UTC
979c1f9 Adding more tests and moving code around Former-commit-id: ce933aa5ad7079deb10cc282e2e2d9ed5a4fa9b4 20 August 2019, 09:06:02 UTC