https://github.com/kermitt2/grobid

Revision Message Commit Date
5ff63c6 last doc updates before release Former-commit-id: 5411f72593fcceeba43f62ffe0e9283028f77750 05 August 2017, 17:10:52 UTC
276c7fa last doc updates before release Former-commit-id: 45655e9a97fac2ac12c53a75ab21955c3301a63d 05 August 2017, 17:10:22 UTC
ba245ab released version of pdf2xml 2.0 for windows (thanks a lot to @flydutch for the help) Former-commit-id: 75536cd4ea965fbcf7d26e12b6f43fc01a085c64 03 August 2017, 15:04:58 UTC
0d0d80b Stopping end 2 end evaluation when the parameters are not correctly set Former-commit-id: 09a20aab54f7236ecf75a67379f4c52b5217829d 03 August 2017, 11:18:49 UTC
8bd78d7 Adding log implementation Former-commit-id: 2acc9c24c1f7275d92681ba3515dac39f2a52fe0 03 August 2017, 11:07:46 UTC
db5dc3b Small correction in the check if there are or not pdf files in the directory of the evaluation Former-commit-id: 867343a6b443172fb08b88493beddfdc29799a05 02 August 2017, 20:48:53 UTC
9fb3beb Merge pull request #215 from aoboturov/fix/language_detection Fix other missing language-specific checks. Former-commit-id: 80b8991504d85134c974e06233785f1d47c2a53b 02 August 2017, 17:12:52 UTC
f41573b Fix other missing language-specific checks. Former-commit-id: 1651622d675564dd932431b918b272e8ac5c6904 02 August 2017, 16:24:07 UTC
cb78b77 Merge pull request #212 from aoboturov/fix/slf4j-remove-implementation Remove slf4j implementation from the Grobid Core. Former-commit-id: 185043d8446cf8cec6fa16c5b8b91bd05b5ebb94 02 August 2017, 16:01:38 UTC
0c37e41 Still a smal type in parameter description Former-commit-id: c7c02c88134afc659082603264996415192bea84 02 August 2017, 14:38:08 UTC
549a919 typo in parameter for end2end evaluation, correcting typos in documentation Former-commit-id: e8221c820a60e009757c8499a5c4b6526136d0bb 02 August 2017, 14:32:49 UTC
5bc433b Fix parameter name Former-commit-id: 37cafac7867d4212074c652f49a8ef479d70f482 02 August 2017, 14:06:32 UTC
01277e6 Merge pull request #211 from aoboturov/fix/shade_lucene shade lucene in grobid-core, fix #77. Former-commit-id: 44d7dc4f9af08e68bc5785fe23870b3023cfa4ef 02 August 2017, 13:46:23 UTC
b733642 Remove slf4j implementation from the Grobid Core. Former-commit-id: 2c3d23c192e1541cc7b99fb2d8c4039193ac099a 02 August 2017, 10:48:03 UTC
c773929 shade lucene in grobid-core, fix #77. Former-commit-id: 7cdc199988dd99faa9a40844044b665b4aa7e91c 02 August 2017, 08:24:41 UTC
89c7938 fix for #77, update wipo-analysers as indicated by @aoboturov Former-commit-id: a35706a2db27cc7d1b24167bbcee1494d9c26246 01 August 2017, 20:10:11 UTC
37ff742 Merge pull request #208 from aoboturov/fix/pddocument_resource_leak Problematic PDFs could leak resources. Former-commit-id: fb7c8e38d00560a31ab02ebfbe27518316845e61 01 August 2017, 17:20:32 UTC
a9ff1ff Merge pull request #207 from aoboturov/fix/add_zh_tw_locale Fix 'zh-tw' language detection. Former-commit-id: 28ab4d30be7f359145c300b4bcd22ed443309d4e 01 August 2017, 17:16:29 UTC
3cf0c7f Problematic PDFs could leak resources. Former-commit-id: 5d971ba0655ed05f2250654077465fdd101c2cdb 01 August 2017, 16:00:02 UTC
1ba1154 Fix 'zh-tw' language detection. Former-commit-id: d698a1ecd6dee793ff826c881dbedc5cf047c969 01 August 2017, 15:22:40 UTC
f35fd35 removing duplicated entries from .gitignore Former-commit-id: 49e8fe036f24c80d11a257eee3edda5c4c26e45c 31 July 2017, 07:33:44 UTC
a81dd40 update pom.xml & add new module in .gitignore Former-commit-id: 9ab76e4d87ae4e4616a0e8523e28c33b20303f97 31 July 2017, 07:09:49 UTC
4f17ba8 Migration of the SegmentationLabel out of the enum Former-commit-id: b1b0b816fcc29d3d23fbc6c57a48503b6d4a31ec 19 July 2017, 16:22:58 UTC
1144149 Throwing an exception GrobidResourceException instead of an NPE when the input data is not correctly read #197 Former-commit-id: 69b4272482e4e2336cc971ad44244689a9b0fc34 19 July 2017, 16:04:44 UTC
c114df5 Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 96ee043ba6a59c766647aee5d2da2bbe89feecc5 19 July 2017, 14:06:02 UTC
773f4ed reapply the blob on the patent download and call the visibility check from the script #201 Former-commit-id: d14fecf96a742f0e91c3fe8b4f2079cad53adfae 19 July 2017, 14:04:31 UTC
8a3eccb Specified the UTF8 charset when parsing XML in order to avoid encoding problems on Windows #195 Former-commit-id: 18419a57db4f9effabc721c03c65a02079183f3e 19 July 2017, 12:46:41 UTC
8c82350 Correct typos in the doc Former-commit-id: bdf29fa08d5a16c8638bd3891b75999f1e150eb4 19 July 2017, 11:13:29 UTC
55fe718 Cleaning doc Former-commit-id: 64a1627e56758edaa34342b3d6ac800fe6c217aa 19 July 2017, 08:50:03 UTC
f84c48b Add explanation on the directory structure for end-to-end evaluation Former-commit-id: 43a941465c970030cacf8acc1aa2dd362faeba07 19 July 2017, 08:46:53 UTC
047e03a Fixing doc Former-commit-id: 91c800542d487aee533ea2a6e02f9be28525eae3 19 July 2017, 08:29:30 UTC
49ad35a Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 09273543001e809cc409c71c3eabd57eec461836 19 July 2017, 08:18:54 UTC
848977e Documentation for end-to-end evaluation Former-commit-id: 9f11b088eaa7b7d6b246c388057a12a862bd217d 19 July 2017, 08:18:35 UTC
b02cbf5 Merge pull request #204 from michamos/document-article-IDs Documention of article IDs for reference training Former-commit-id: fbffa18fd70adec6fbf5a4aad22037488e9393aa 18 July 2017, 17:15:28 UTC
a550d68 Fix for remaining problems for issue #201 Former-commit-id: 6a51a572f7c3b98844c8c965025b7d39f3c43768 18 July 2017, 15:11:13 UTC
0225659 Documention of article IDs for reference training Former-commit-id: bf6d0d6cfab20aa987ca71ab3d1988e6f0580e2b 18 July 2017, 11:36:49 UTC
327cb42 End to end evaluation based on Pub2TEI output Former-commit-id: 073114f3b529afde4d9ec67dd58b41e53dd945b2 18 July 2017, 10:06:22 UTC
d171c89 fix the declaration of output variable #201 Former-commit-id: ba81d1f2f75803a0454a5d89fd8078a802fd7620 17 July 2017, 17:30:45 UTC
8952795 Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 4c277569b2144270eb6c0a0681444452b0aa1305 17 July 2017, 17:11:12 UTC
302dbab add the download button to the TEI and Patent views and manage their visibility in case of new inputs #201 Former-commit-id: 28fc14ec7db05b6a52abc37b7ee13dd86c85845c 17 July 2017, 17:06:42 UTC
a3f9f54 Updating documentation Former-commit-id: 2e9e727f25485ff5597cc61be21480314f9f91e1 17 July 2017, 12:01:35 UTC
1301440 again a file forgotten: sample location Former-commit-id: 2714bf3be9c459d561ef1d439db705c939a4933a 17 July 2017, 08:25:57 UTC
87ab5e0 Removing models already specified in the underlying modules Former-commit-id: 8def144ac83eb33811074afd1d5dbdf84c288e8f 17 July 2017, 08:07:01 UTC
48b4140 Adding additional variant methods to lookup match in gazetteer for location, person, organisations Former-commit-id: 181d9a2c35c632e2104d36217c7b25431c1875e4 17 July 2017, 08:03:18 UTC
163a3e1 Fix missing suffix in header author name Former-commit-id: b738f1f862c269582efe0ffe953c0268561130c4 17 July 2017, 08:01:35 UTC
43f073c Make method signatures more consistent Former-commit-id: 416e49a379abf168b16ffd585602791ff538750c 16 July 2017, 12:24:19 UTC
759dc2e Forgotten file Former-commit-id: 5fdcaa2338ba2c60834cdfc81311c80187f6ad2f 13 July 2017, 21:59:19 UTC
392bb00 Disable badge versioneye for the moment Former-commit-id: 41ed1bc3de693ec76f53535ce97abe6a268e6ce8 13 July 2017, 21:55:43 UTC
3fbb5e3 Adding the versioneye badge Former-commit-id: 1341c5921162bf5f6f9f7df9e3bb8ec7861a3320 13 July 2017, 21:51:58 UTC
94c2842 moved SegmentationLabel under labels, removed not used class using xerces Former-commit-id: d3e45a0009c7631a2824ddb9c8fd05f58d5bb525 13 July 2017, 21:51:58 UTC
b8b8bbc Add some missing offset position in LayouTokens Former-commit-id: 2bdd9e43dd1ac13572cd52e5150b9cd3a3ab0bba 13 July 2017, 20:43:17 UTC
1568589 Improve evaluation structures Former-commit-id: 221dcab53cf6a7994aeb72d8275f5614011a2e7d 02 July 2017, 14:20:05 UTC
16a28c7 Merge pull request #194 from Vi-dot/master api.crossref.org request pool Former-commit-id: 3717d23a859e5b2a1f56f912b7c071d19bdad22f 24 June 2017, 18:16:32 UTC
aabfc6d Some progress on evaluation utilities Former-commit-id: c334bb433ae260be71fb82e3b73be1188a78c88f 24 June 2017, 16:53:03 UTC
1fe060d Replace java.time (java8) with joda-time Former-commit-id: 96be221eced79ccd1e85fd7abf6d2aca39a38147 23 June 2017, 12:54:27 UTC
2cce501 clean printf Former-commit-id: 25ec5d7b7414b7c69636833469894b1141e10e29 19 June 2017, 13:00:21 UTC
d553ea7 Crossref calls can be synchronous also Former-commit-id: 2228489bd866909305c59f2c62edbd7fb33a1517 19 June 2017, 12:30:55 UTC
2ce9ee8 Request pool to get data from api.crossref.org without exceeding limits. Former-commit-id: 1bd1b24b0d4adb368897e26f59cdbf9ccdc22c31 19 June 2017, 08:52:05 UTC
b580c71 Merge branch 'master' of https://github.com/kermitt2/grobid into HEAD Former-commit-id: dd420337dae92a710dbd0cd5c9ad0b409f9c4907 16 June 2017, 13:34:40 UTC
db24592 Merge branch 'zedomel-master': Add flag to signal the end of doi data. Former-commit-id: 3939c3bb1a528efaa3af1d0a56fe620c6d2672ee 06 June 2017, 16:40:58 UTC
8bb5ab9 Fixing minor bug Former-commit-id: e560e5f977d0a683610c2a010040fdd1959ee3e9 06 June 2017, 16:40:25 UTC
406d449 Merge branch 'master' of https://github.com/zedomel/grobid into zedomel-master Former-commit-id: 42e0a616c27253c70194da6807f56865fb4fe4b7 06 June 2017, 16:28:50 UTC
8b2125e adding more tests to cover the crossref unix ref sax parser Former-commit-id: 1fd2efcda93a38d1ebe68fff0d1e4cd35f08037f 06 June 2017, 16:27:59 UTC
1f0969b On-demand access to full text clusters Former-commit-id: 7cab59b1483478d39fd265df0dd0185e244e65d0 25 May 2017, 04:04:16 UTC
2452d0c Adding more junit tests Former-commit-id: 537a652dfc45c9497b330d608a03acc75db6310e 23 May 2017, 10:22:14 UTC
d400745 Adding assert in a dateParser test Former-commit-id: 0884228233f1c66840b1cb0e23f64d6352ebe8aa 23 May 2017, 10:22:14 UTC
c8f78fa Removing system.out printing during the test run Former-commit-id: 9883a3398541dcc5c3a353c78e037f891bfd7bd7 23 May 2017, 09:45:04 UTC
03005d6 Adding setter for images in document Former-commit-id: 9f2e5cd78bb6b23e98bc4d1840afd536cd6a392a 22 May 2017, 15:54:47 UTC
07447be When required, add images in the images list when the document is processed Former-commit-id: 1b0adc44a0d34ee62e12d8e90b8f6c3236890b5f 22 May 2017, 15:42:34 UTC
73700eb Adding test to verify whether the doi is deleted when parsing from crossref Small cleanup on other tests Former-commit-id: ea1d58644cf84d6092af68b302003d34a2d8e0d9 22 May 2017, 15:22:26 UTC
deaa6da Merge pull request #189 from aoboturov/wip/git-revision-info Provide git info in the MANIFEST.MF Former-commit-id: cf695678feecd2fb370d6607f62c113506d08824 18 May 2017, 02:33:53 UTC
a628176 Provide git info in the MANIFEST.MF Former-commit-id: a8bcfefff1cb0dd02dad68d36ceb1677a8a2f8de 17 May 2017, 10:44:07 UTC
ad7a930 Merge pull request #183 from aoboturov/fix/may-march-ambiguity Fix for the May/March ambiguity. Former-commit-id: 2ab2b23456c9a8cc21c6f3bfbae8989bfc99eba5 16 May 2017, 19:28:07 UTC
0f573ef Add flag to sinalize the end of doi data. Without that flag the document DOI is overwrited by the doi of citations list. Former-commit-id: 579b51916e8f62f0af121c76db62b5202dfe554c 12 May 2017, 00:10:49 UTC
3fcb91a Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 2774a17cac4bc1de5c3ab0d26578df0acdaa34a3 10 May 2017, 13:52:50 UTC
4f3edd4 Complete training split functionality for fulltext model Former-commit-id: 54f4d8a36226bc1946864ef57f640804987e7587 10 May 2017, 13:52:05 UTC
2bbc740 Fix for the May/March ambiguity. Former-commit-id: d30d203c386151c0f664f3130c564c39edfd7efe 09 May 2017, 10:56:00 UTC
3f783ee Merge pull request #170 from leeN/mkdir-error Fixing recursive processing with empty subfolders. Former-commit-id: c50c78a2a5da8e99ee90d38618085e44e365418d 03 May 2017, 16:13:17 UTC
c8af4ef Spacing in the case of paragraph cut by a figure or a table, in relation to issue #177 Former-commit-id: ef87048674c8ae0a40b3a0085b7e2504db1d002d 03 May 2017, 07:16:24 UTC
fccf1ea Fixing space problems around inline XML elements, issue #177 Former-commit-id: f5d5cd600ebf6f29b7dd7eff93692dc01911aeaa 27 April 2017, 19:31:26 UTC
cbf613e Fix labels in formulas Former-commit-id: 938813018d879af49db6a57533e5252802b59ba5 23 April 2017, 21:01:17 UTC
da26f0e Add equation/formula representation, coordinates and equation markers/labels Former-commit-id: d718337bdd0d59b53e5c99fadc883eb365abff50 20 April 2017, 17:55:20 UTC
10b7c2f Merge branch 'master' of https://github.com/kermitt2/grobid into HEAD Former-commit-id: 598aa7d772060890c5bfd98e945ef9a62f56a4bc 20 April 2017, 15:06:17 UTC
1d0a08e Extend equation support Former-commit-id: da8b4cc996b9677ef0b4cb36136075319ca3298e 18 April 2017, 11:33:04 UTC
df09c65 Update url of PMC sample on our storage Former-commit-id: e5e62e546da5067910631d0e1fd2611eb4c041f3 14 April 2017, 13:26:28 UTC
fe42888 correct the documentation index Former-commit-id: cf8c54dc5da86c8810e53930bcf86fd260489b11 13 April 2017, 03:45:37 UTC
50121ef update documenation of coordinates Former-commit-id: c65c232162e2387f5c016d38043ab29440e9f38f 13 April 2017, 03:42:52 UTC
030c11a update figure model Former-commit-id: 458227674ef2f891db4702374a344d82cd5c84c1 13 April 2017, 03:25:18 UTC
15c9e64 coordinates as parameter in service and batch; documentation for coordinates; update figure and table models Former-commit-id: cd47fa52c3ba9582a3ed8b740352b5d850a78cef 13 April 2017, 03:19:34 UTC
94a4dd2 Doc correction and maintenance Former-commit-id: 120f7dff728e70e9ac1a0fc5e4d92da88d2addba 10 April 2017, 16:52:10 UTC
061f0d7 correcting index Former-commit-id: e48282c8938ff0b52f2bd220b582dd44338cf2f2 10 April 2017, 16:51:33 UTC
43345c5 some corrections and adding the page to the index Former-commit-id: 1adb56b6e0861b6cced94237c97d5097105a4366 10 April 2017, 16:41:16 UTC
e543f4f Adding a documentation page with troubleshooting: known issues and solutions/workarounds Former-commit-id: e22f0a9b0bf867fcf823755665aff2aad328904e 10 April 2017, 16:27:29 UTC
efc0c76 Fixing recursive processing with empty subfolders. Currently grobid fails to create intermediate output subfolders during recursive batch processing if the corresponding input subfolder does not contain any PDFs. This leads to FileNotFound Exceptions for PDFs deeper in the directory tree, as the directory tree on the output side is incomplete. Changing File.mkdir() to File.mkdirs() creates all potential missing intermediate folders. Former-commit-id: 832e0735d03e168faf9d7c58509c59d8ae46866f 09 April 2017, 19:16:34 UTC
1b34587 Update segmentation and full text training process Former-commit-id: 2916deeee34525640ddd9fde38afae2be4f7fa27 06 April 2017, 00:26:08 UTC
851d70b Update of training data and model for segmentation Former-commit-id: dabb1d56009edd6949d83db187677dfbae2666c2 27 March 2017, 19:36:30 UTC
6b39967 Fix failing test Former-commit-id: a255d5039a20c9e10d72e4d74f59e5bbdc4c2379 25 March 2017, 18:05:38 UTC
7a47006 revert fulltext model update Former-commit-id: 05013b6cc4d2532c3e7afa31622d97f58521e9a1 25 March 2017, 18:00:38 UTC
6997e48 Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: eab4d672e30ae5cb6c6d163d3fb723ee1fa52d08 25 March 2017, 17:55:41 UTC
766d30d Debug body structure learning Former-commit-id: f9fb2dcabcacefafdc9e6e82600b2a58bc070b37 25 March 2017, 17:55:16 UTC
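
Note on revision efc0c76: the message for that commit describes replacing File.mkdir() with File.mkdirs() so that missing intermediate output folders are created. The difference can be sketched as follows. This is a minimal illustration only, not Grobid's actual code: the class name MkdirsDemo, both helper methods, and the folder names are hypothetical.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;

public class MkdirsDemo {

    // Hypothetical helper: File.mkdir() creates only the last path element,
    // so it returns false when a parent directory is missing.
    static boolean createWithMkdir(File dir) {
        return dir.mkdir();
    }

    // Hypothetical helper: File.mkdirs() also creates any missing parents.
    static boolean createWithMkdirs(File dir) {
        return dir.mkdirs();
    }

    public static void main(String[] args) throws IOException {
        // Stand-in for an output root; the real batch layout is not shown in the log.
        File outputRoot = Files.createTempDirectory("grobid-out").toFile();

        // "emptySub" plays the role of an input subfolder without PDFs:
        // nothing has created its counterpart on the output side yet.
        File nested = new File(outputRoot, "emptySub" + File.separator + "deeper");

        System.out.println("mkdir:  " + createWithMkdir(nested));  // prints "mkdir:  false"
        System.out.println("mkdirs: " + createWithMkdirs(nested)); // prints "mkdirs: true"
    }
}
```

With mkdir() the nested folder is never created, so writing a result file into it would fail; mkdirs() builds the whole missing chain in one call.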