https://github.com/kermitt2/grobid

Revision Message Commit Date
853c8a7 [Gradle Release Plugin] - pre tag commit: '0.5.3'. Former-commit-id: b61951322f5647fddfe70ef03c0d23ab1feaab63 25 November 2018, 19:05:54 UTC
2f998e9 Remove eval limit for DOI matching Former-commit-id: a4f5f251c26da7c3daedd7bf27606e120f0c5cad 24 November 2018, 20:02:43 UTC
e677663 Fixing again proxy... in the case there is no proxy Former-commit-id: b05734fc0e5fdb08e2e2df49e653a2f915fdfec9 24 November 2018, 19:38:02 UTC
4e02dc7 complete consolidation Former-commit-id: f0e5bf4072a1bc8ff890c64121da72d557a1362e 24 November 2018, 11:43:44 UTC
d4713ac make httpclient working with proxy; improve calls to crossref api Former-commit-id: d58c820291df7a28b6bd92e067aefab03f4720b2 21 November 2018, 13:39:31 UTC
0f7ffb4 new iteration on DOI matching evaluation Former-commit-id: 78082b0c0b024da6ef5830efc3b4362489ab2780 19 November 2018, 03:06:44 UTC
b383db7 Add first version of DOI matching evaluation Former-commit-id: 502d541b2417f12f602d4c6567eb2552959a7525 18 November 2018, 21:47:24 UTC
46b2ec7 Merge pull request #355 from kermitt2/standalone-figure-extraction Figure extraction improvements Former-commit-id: 35c758fd71accce888949e52dc5f67a4e598deaa 18 November 2018, 02:09:45 UTC
872e22c fix tests Former-commit-id: 8a77d8e4b3810efb01d4c80762835fe136fe87c6 18 November 2018, 01:59:47 UTC
3e40d0e Add new consolidation option Former-commit-id: 34b1ae5603b17c6464b73b9652f8335f54d9338e 18 November 2018, 01:43:13 UTC
c634767 Add a dedicated batch creation of blank training files Former-commit-id: 35bd1242113dc1a4b6048c3f4c2099f18d38d369 16 November 2018, 22:56:56 UTC
a77e0f4 adding list of layout tokens for the caption of figures Former-commit-id: 673673635e68dea138960e863c89d4afb4419060 09 November 2018, 16:13:44 UTC
e884e2e adapting to new PDFBox Former-commit-id: e6b0ae8031ca30dacd74db49455681a6d68c0762 29 October 2018, 15:45:33 UTC
164fab2 - fixing reference tokens (they were sharing the same List object) - fixing running crossref client - refactoring of the main page area detection - a bit more heuristics to detect figures Former-commit-id: 38cb0627f6bee3011bffac01a0be1f342fb7641d 29 October 2018, 15:41:17 UTC
3aed958 put on hold output of collaboration in the header (no training data for it for the moment) Former-commit-id: 1b2338a42a6fef799ffdf7587a7a3136f3a6d40f 23 October 2018, 10:27:51 UTC
6bdc140 Update CrossRef REST API usage for consolidation with full reference string query Former-commit-id: 90ad311de0c6d85ad7ecd41c8dbb67d47fa32955 21 October 2018, 18:00:39 UTC
5bcbfb0 minor typos in the doc Former-commit-id: 11e23a48d5076eebfd4720f717e9123b23cd6d65 17 October 2018, 17:34:41 UTC
d7e0c42 correction wrong documentation file Former-commit-id: eabb3cbf18c19779da403ad4d8429d42ce715424 17 October 2018, 17:15:24 UTC
17453fb Update documentation after release Former-commit-id: e1f95a9be8266d58d30cdf9574d3272a42674e9f 17 October 2018, 17:13:48 UTC
064635e [Gradle Release Plugin] - new version commit: '0.5.3-SNAPSHOT'. Former-commit-id: c4eb7d44e929dde6ffacbb99b93d632d07b19f34 17 October 2018, 15:47:34 UTC
5aca6fb [Gradle Release Plugin] - pre tag commit: '0.5.2'. Former-commit-id: ce7d2670250f2ed5c166bf8095833a2a57cee7cf 17 October 2018, 15:47:15 UTC
861a2ae Avoiding NPE when processing execution command is missing from command line #39 Former-commit-id: a231c2bf85dec9789379ec941e9d0a02c5e8554b 17 October 2018, 14:01:26 UTC
91750c5 Updating libraries Former-commit-id: a221103f02dc98da045aca56c24fe200529dbdc7 17 October 2018, 10:58:55 UTC
48f8524 Adding metrics at the REST API + documentation Former-commit-id: df4f8d3cbc012ed3222d9511fc758c1a49f015fc 17 October 2018, 09:42:09 UTC
b91aa6d Some more styling for the documentation Former-commit-id: 286f04f64c7694db677c50f8a8599369c24857f0 14 October 2018, 23:19:33 UTC
a485764 Add more links to python, java and node.js clients in the doc Former-commit-id: 071948bf92d0ca9975c41add257b76c958fff853 14 October 2018, 23:15:50 UTC
3021dc8 Add links to python, java and node.js clients in the doc Former-commit-id: 6fbfd4120e5c016439378f991d9d4cb76e950942 14 October 2018, 23:05:45 UTC
788a029 add counter for crossref REST API; try to fix the doc theme Former-commit-id: c5ef2c278e9c18bc36ab886519f440d631123a72 08 October 2018, 15:07:07 UTC
31529ef Merge pull request #350 from kermitt2/updated-dependencies Updated dependencies Former-commit-id: ba8e2e42f0c470be6279f682bd725bef3a27bfab 03 October 2018, 08:38:11 UTC
69b86f9 Removing unused imports Former-commit-id: b254a8cab13bd3f12c9a87d735d9647f0f78a986 03 October 2018, 05:44:47 UTC
d21814c Updated dependencies - dropwizard (to latest version) - pdfbox (to latest minor version) Former-commit-id: 3f05357a26410f719ffd3454fd59cafa14b4d06e 03 October 2018, 05:34:30 UTC
58f4811 some more documentation on using multi-threaded service Former-commit-id: 597d86656371a9dfd48e50860de1f176be58b7ac 25 September 2018, 16:47:23 UTC
ba60b09 Restore homogeneous service response status codes, complete the service documentation Former-commit-id: c199c59f8635e3754abc436b231b989a34f42979 25 September 2018, 14:13:48 UTC
ce7a170 Update end-to-end evaluation Former-commit-id: 688ff81666e97fdf5a5add91f7eefd49cf9fd5aa 22 September 2018, 18:37:33 UTC
79e30fd correct dev version in doc Former-commit-id: df5f92bc4f3b7f2b53af628e61376aa6e529eb52 19 September 2018, 14:19:30 UTC
d00df6c Document how to build through a proxy Former-commit-id: 4c3dfa0461dd820386b075db10b08ade52423a86 19 September 2018, 14:02:13 UTC
3d35713 Update gradle version Former-commit-id: ea843ccb49ae9bd71ec08e7d13abf9631696d0d8 19 September 2018, 13:50:24 UTC
2558e65 Add before test class to init properties. Former-commit-id: 59d26b993308a2e968bd9af4a77f867be85ff51d 11 September 2018, 16:23:05 UTC
9bf6a76 Fixed #325 uniforming parameter names Former-commit-id: 5d68f9506fef3bf2362f0272ad500c1b04943089 11 September 2018, 01:49:53 UTC
4319fb1 Add Grobid Factory reset method. * Static fields need to be cleared after each test class (otherwise any modification will impact all the rest of test cases) Former-commit-id: 2c6c16226392419bc96f5f487b451d741b047104 10 September 2018, 13:19:21 UTC
542fe56 Add JAXB api dependency Former-commit-id: 20d0727535236b87226c2c36b134195e1333db10 07 September 2018, 20:36:21 UTC
4fd893b Fix after and before properties settings. Former-commit-id: 7973effcd46d6f30407aff2ce26262bc3e9c874e 06 September 2018, 13:55:10 UTC
1095dc9 Fix test Former-commit-id: 31d46bed7d147df9d12a416fd23d44e8c76504fa 22 August 2018, 15:14:06 UTC
bcc6369 Fix issue #339, more robustness for patent number parsing Former-commit-id: 639cb4e5b2b130338975e335765bde43c2c6e0c6 22 August 2018, 02:18:26 UTC
9774692 styling bibtex entry Former-commit-id: 58be7f1689617fd0cdea8a4e3ba43880b09d3168 16 August 2018, 18:44:51 UTC
57ca386 add bibtex entry Former-commit-id: 55183a4a8f9f695c856a56372d3bbdb1f7d22259 16 August 2018, 17:41:06 UTC
7c06689 update doc and license with correct dates and references Former-commit-id: 3a276bc5a3eb0d2c11781379987bf4c4d78dd4f7 16 August 2018, 15:53:16 UTC
2e26e6e filtering resources to add correct version #322 Former-commit-id: 99ea296658c96ea45fc57652d6de533bb98d81c0 16 August 2018, 14:56:10 UTC
bab9e47 updating shadow gradle plugin Former-commit-id: 9c414cd91661fbcad5540d65100134bdc8b1433c 16 August 2018, 12:02:44 UTC
6eb507c use valueOf instead of the deprecated constructor for Integer Former-commit-id: af01d0b9c2d89d0ca6ae0761867e350020936916 14 August 2018, 09:15:11 UTC
4ae7089 create model directory if it doesn't exists when training with automatic split Former-commit-id: f0762f6df75baab58e47f956bb15b3f9c5ba998a 26 July 2018, 10:11:03 UTC
6ed0467 Merge pull request #331 from kermitt2/docu-fix Update documentation and fix build on readthedocs Former-commit-id: fcb3480a7c55f57fa6fd0773824785381f82ad17 26 July 2018, 09:12:06 UTC
864dfd3 update documentation format Former-commit-id: eb2881ba785c95b374577b30359aeffb2e3545e2 26 July 2018, 08:58:22 UTC
5976b9f Minor update documentation: grobid as a java library Former-commit-id: e9ae3f2d715affb30635f4807b671681525f77c6 25 July 2018, 06:45:59 UTC
b363e80 Add case sensitiveness option in fast matcher; do not produce outline by default for pdf Former-commit-id: 75482c815750769be0ed2882208438920ba3eff4 24 July 2018, 21:57:56 UTC
ed7f0d8 Remove unnecessary conservative exceptions for assets; review logs for CrossRef consolidation Former-commit-id: 7ca86a2826c364f231ed1d9b57a14c87757c69bd 14 July 2018, 16:45:20 UTC
6ab34cb Fix broken asset generation in batch mode Former-commit-id: 8ed6dd5835432f70582cced07bf4ab360b7c7cf3 14 July 2018, 15:22:31 UTC
fbd049a Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 52cde4bd427f90b9a636daa567067975c391f6f4 14 July 2018, 13:53:05 UTC
ea58f0b Janitor mode Former-commit-id: f6021135164fa5da377e471f76722e722c86417e 14 July 2018, 13:52:25 UTC
41603a0 Merge pull request #326 from contentinnovation/saxon-he-9.6 Update Saxon 9 to a more recent version to remove dependency on saxon9-dom.jar Former-commit-id: 582c34ee1b03e2d7f7a93185f1116765558a9afa 06 July 2018, 08:18:20 UTC
ca91bf2 Merge pull request #327 from contentinnovation/sanitized-emails Assign sanitizedEmails to authorEmailAssigner Former-commit-id: 0082794b3c4bd823c91c7322577c6f7b1c8a5157 06 July 2018, 08:16:34 UTC
f149543 Update BiblioItem.java Former-commit-id: 3bc08942cb61151e1b32f502358bf0e2a4074426 05 July 2018, 13:58:11 UTC
f33261e Update Saxon to more recent version This removes the downstream dependency on saxon9-dom.jar - see: https://stackoverflow.com/a/15441957/9098739 Former-commit-id: ad3d869d7c22c307353f4a5ea3b3bbdec936ca82 05 July 2018, 12:42:28 UTC
1c05244 Add option for case sensitive lexicon/FastMatcher Former-commit-id: 77bcd81ea123255c315a0c38877c1f47d60bf99f 03 July 2018, 06:19:21 UTC
afd53c0 register software model Former-commit-id: b186452052b3cfcae92f0c2105161e65198cf302 02 July 2018, 09:49:25 UTC
cf5dd48 Merge pull request #311 from de-code/add-no-daemon-flag-to-docker-gradle-build added no-daemon flag to docker gradle build Former-commit-id: 4162176eda6b188d884a16cc80e0c083e0680088 30 May 2018, 06:23:44 UTC
d52931d Update documentation and dockerfile to include the init process when running docker images #312 Former-commit-id: a55e326aceadbc5f82474f2a2858438cf5222e85 29 May 2018, 23:20:54 UTC
491155c Fixing uppercase text utilities for checking acronym tokens Former-commit-id: 7b7087b4cd70081991dff30793e7cb9dd8fafecb 29 May 2018, 16:07:36 UTC
3bb791d Update documentation to include the init process when running docker images #312 Former-commit-id: eb9538a1d90b35a87027ff9737a0786f653734c1 29 May 2018, 15:50:03 UTC
d26e7da Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 13a6cfb9f720f946a3f7575940e76354179bf8e6 22 May 2018, 22:37:22 UTC
574701f fix invalid lexicon matching for affiliation-address model Former-commit-id: 0258068d4ca83e2d45684089389e4510ef9a42a5 22 May 2018, 22:37:05 UTC
d319c01 Merge pull request #318 from csw/ref-annot-fix Fix JSON generation for reference annotations Former-commit-id: 6d195f024ad5245f15abe13dd0fc9c1b718fb72d 18 May 2018, 17:42:49 UTC
3aa29dd Merge pull request #317 from csw/bibdata-quoting Fix BibDataSetContextExtractor to quote replacement text Former-commit-id: 31936a3424f287a289cd4ffa1ea6335f85970aa2 18 May 2018, 17:39:42 UTC
b2fa9e4 Fix bug where JSON strings weren't being escaped; use Jackson Backslashes in URLs were being passed through verbatim into JSON for the reference annotations, resulting in invalid JSON output. This was because the JSON was being built via string concatenation, without any escaping. This switches to using Jackson instead, to ensure the JSON is valid and properly escaped. Former-commit-id: 17fb626bb9e87f8a6d356d6e3dea4b8299379e17 17 May 2018, 02:44:26 UTC
732af07 Fix BibDataSetContextExtractor to quote replacement text Former-commit-id: 1ec624a0788e6bcf2fd2ec5a71838c4a604f48fd 17 May 2018, 02:27:30 UTC
fd0c965 Add unit test for getJsonAnnotations Former-commit-id: ba56600d55d2cdb69235a3eef5bc06f661bb62af 14 May 2018, 19:29:03 UTC
2451a58 Complete tagset Former-commit-id: aa8652b86127ae427eb520241d3b31b8e24e5a49 08 May 2018, 10:40:26 UTC
9db5c42 added no-daemon flag to docker gradle build Former-commit-id: 5b58b6b34b0bd23bf2914b3cd1a6d262317dcdd8 24 April 2018, 16:48:58 UTC
c12899a disable consolidation also when DOI available #300 Former-commit-id: f6a3eda0b3bec9c1ed6052c809944b9c30c433eb 24 April 2018, 14:24:57 UTC
c3591a3 Merge pull request #306 from de-code/local-dockerfile [wip] build docker from local source Former-commit-id: 1d4f3393562aa45958cb3c6fa0f610d90a9c4cb4 24 April 2018, 13:33:44 UTC
3e45465 build docker from local source Former-commit-id: 9338013f24e98321deeb6ef84a9a0a85470b5431 19 April 2018, 07:20:13 UTC
f5ee604 dehypenisation chapter 2: bug fixing and algorithm improvement #180 Former-commit-id: a529f40b60ad6a0f82793753e620fdbc1c0e466f 19 April 2018, 06:13:44 UTC
8f8b174 First implementation of the dehypenisation using layout tokens #180 Former-commit-id: 4e4b6f9c3618c6c876f6948e5c678418b704bbd5 14 April 2018, 20:13:47 UTC
92750cf updating grobidAnalisers to consider break line in tokenizeToLayoutToken, when a \n is encountered #180 Former-commit-id: 27ffd3bd1fd62e003926e92ce6ea5b0d45d1c348 14 April 2018, 20:13:15 UTC
c114f42 minor corrections: typos in comments, imports, code shortcuts Former-commit-id: 34fba0a8ededec65acee0968a784d7d2840c92e7 14 April 2018, 20:13:15 UTC
444429f Rearranged tests to test separate files and read using resources Former-commit-id: 9b5d2e3300860ee7994d8fc53f82d02a823ab6af 14 April 2018, 10:03:18 UTC
bc9c661 updating documentation: grobid as standalone application #298 Former-commit-id: f2291aed0716ca0a149e19bea64058f47e6848d0 11 April 2018, 10:57:23 UTC
bd1f396 Merge pull request #296 from tantikristanti/master update the links for INIST and TEI in the documentation Former-commit-id: 154c530aa9b4a694a7a377399b499df67b8669cb 19 March 2018, 18:13:03 UTC
0a535e7 update the links for INIST and TEI Former-commit-id: 0a56fa5f9d276084ed0d79ae539f5b65a91a3ed6 19 March 2018, 17:46:33 UTC
bc36c19 update the links for INIST and TEI Former-commit-id: b37580cbf3e50ccbccf11982b003bfef329d2840 19 March 2018, 17:40:30 UTC
de7bb80 Merge pull request #285 from kermitt2/standalone-figure-extraction - extracting standalone figures (for which we didn't detect captions,… Former-commit-id: 4efac54ba03271686d4f3bd8e781dd030bca5aa3 18 March 2018, 19:23:13 UTC
90aab21 Update doc for #286 Former-commit-id: 29de83ed570c2711250d1a658ffc9cfa1b5c09cb 20 February 2018, 06:54:04 UTC
0876e22 - extracting standalone figures (for which we didn't detect captions, but pretty sure that they are proper figures) - making sure that CrossrefClient does not prevent JVM from exiting Former-commit-id: 5d2ed241c5882105a50c0af6c1580a6eaf79fe0d 12 February 2018, 11:13:14 UTC
d0847cb updating documentation Former-commit-id: 461633dd21851b2e21656237182efa43d46086da 29 January 2018, 09:30:00 UTC
8ab80de updating documentation Former-commit-id: ac863de3099992344d2c12f448f7a58bb98f16b0 28 January 2018, 23:50:34 UTC
3fd5637 version bump Former-commit-id: ef0de093de422f4f0d7f896624f44881551397f9 28 January 2018, 22:50:11 UTC
43ff4ab remove forgotten war plugin - can be re-added after working out how to add the rest of the war packaging components Former-commit-id: c18f2c687d66b7bdb97c989346450b0cc9a232c4 24 January 2018, 09:59:22 UTC
208dcda Minor cosmetic changes on dockerfile Former-commit-id: 972a93595daa4df6f29294ba76b255f02e18892a 22 January 2018, 10:43:21 UTC
ff2b945 minor fixes: - removing dockerfile workaround for version 0.5.0 - updating docker dev references Former-commit-id: 1f42ac7c1f95d8eda1ada289b0d3c3716788828b 22 January 2018, 09:52:54 UTC
57b4790 updating docker documentation Former-commit-id: 5a17684d44571179753ce886cb20f895b345b466 22 January 2018, 08:43:14 UTC