https://github.com/kermitt2/grobid

Revision Message Commit Date
4904f42 [maven-release-plugin] prepare release grobid-parent-0.3.8 Former-commit-id: 165f9601a9c48c2036b1685d6570301e0163d5d0 20 August 2015, 13:28:08 UTC
2ce77e1 preparing for release Former-commit-id: ab01f5e8c2ca9567d1ac813f1782651c909c6f6b 20 August 2015, 13:23:57 UTC
59e3d5d [maven-release-plugin] rollback the release of grobid-parent-0.3.7 Former-commit-id: dbb154226e1bc6fcaf72a6a2df528bc6eb0bf94c 20 August 2015, 13:23:19 UTC
f1b35a1 [maven-release-plugin] prepare release grobid-parent-0.3.7 Former-commit-id: c9fec76c9feb548b031f974430e97a8af142793c 20 August 2015, 13:22:16 UTC
e9c7169 [maven-release-plugin] rollback the release of grobid-parent-0.3.7 Former-commit-id: 2fc86c3050ed062329457790c2e53b18a660fde0 20 August 2015, 13:18:05 UTC
bb72f08 [maven-release-plugin] prepare release grobid-parent-0.3.7 Former-commit-id: 57189881f7eb2f21eec652da8b7269f42909eacc 20 August 2015, 13:16:55 UTC
8b15dcd trying to fix release plugin Former-commit-id: 6929ba51952955016630e6c98a6c11f7dcb10f11 20 August 2015, 12:24:20 UTC
d1fa258 preparing for release Former-commit-id: 9fb0eb23ad881c096b478e6901badbe7f51e9071 20 August 2015, 12:09:50 UTC
a67cee9 fixing maven release plugin version regenerated ant files Former-commit-id: 90d60b374373b2bf4bd08013432e104660e3055d 20 August 2015, 11:52:42 UTC
f5209f5 Fix for issue #65 Former-commit-id: 4582d9c5a02705e239cbeeed7d201e0fa22634bc 20 August 2015, 10:57:59 UTC
ded4887 pom/xml Former-commit-id: 900cba03f03a3fa2842be0c9530e5026d3773841 19 August 2015, 18:30:55 UTC
3829166 Complete commit bae6cea (oops) Former-commit-id: 775777e997b9e5f78f93a3748c7e091271cdb27b 19 August 2015, 14:23:46 UTC
e9b144d Two parallel commits I committed offline in the train! Former-commit-id: 6680788de703a7f2fabbe035ddb5d72cac30b576 19 August 2015, 13:53:55 UTC
9f98811 Careful introduction of a class LayoutTokenization This tokenization in particular propagates the layout information when producing the full text TEI and allows to indicate the coordinates of some structural elements in the original PDF (e.g. the reference markers) Former-commit-id: c919e1de2ee330bfac3499ad89af6a68d532ece8 19 August 2015, 13:37:35 UTC
f89b18c Test for issue #65 Former-commit-id: 8b066bbfb66a193d376ea8d2b64836a9f7f21f45 19 August 2015, 13:37:35 UTC
d320abd Default sha1 for console admin password It seems the default one was not anymore admin! Former-commit-id: d28ed8279747fdb1470bc8f01c3c8e11cb40e6e7 19 August 2015, 13:37:35 UTC
97529db Add back isalive REST call Former-commit-id: e1e205cec26628dfd07c6a9253bd821e807de746 19 August 2015, 13:37:35 UTC
f4f058d Correct extension for temporary pdf files Former-commit-id: 0d6c9c774b16b98ab6161dc1e71df60df5fbaca2 19 August 2015, 13:37:34 UTC
ef954ab Case where tabulations are used as usual space separator in the PDF Former-commit-id: f1bda574f0e1c97382e363ea07ce13e8b6a8c91a 19 August 2015, 13:37:34 UTC
88979ae Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: bd565d2eea4e23009e7fa1310caad0d799f899c8 19 August 2015, 09:47:13 UTC
7cc4ae2 adjusting epsilon for BoundingBoxCalculator + counting pages from 1 Former-commit-id: 8d5ff512fa5a6ddcb37c894b63f4dff24e9fed1a 19 August 2015, 09:45:50 UTC
b9005f8 Add missing isalive REST path declaration #67 Former-commit-id: 124fad7dfa9323501c05e771bcd93cf9183c33eb 19 August 2015, 08:21:19 UTC
21b28bf a small bug Former-commit-id: f56d27b5047455960710af851d24d724e1d5109b 18 August 2015, 18:19:58 UTC
1b5d83d Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 9d93fee2775f1d81e6567865347ba33f2d68034c 18 August 2015, 16:05:55 UTC
efc5573 bounding box detection for coordinates Former-commit-id: d6729619019c4296a9c2017632ae31c00683fb67 18 August 2015, 16:05:48 UTC
c48c7eb Show failure information from RESTful interface. Former-commit-id: bae6ceaee6816878d5bc58c6725cf877e25d44fa 18 August 2015, 16:03:21 UTC
a302935 cosmetic Former-commit-id: 80b7e97dce761bd0977b6fb4135f0c89e6d45942 18 August 2015, 14:59:59 UTC
87a0a69 Merge pull request #66 from kermitt2/element_pdf_coordinates_20150603 Element pdf coordinates 20150603 Former-commit-id: 80fea0129a41a34cff6bb8be1de2fc889a1e087d 18 August 2015, 11:27:43 UTC
5f20101 merging master branch Former-commit-id: 6874c1e2b7a03ca928a2c52e712bc58903c82349 18 August 2015, 10:45:07 UTC
eab6c1c Merge pull request #64 from chrismattmann/allow-logging-jetty Allow specification of logging in Jetty Grobid Service Former-commit-id: 2c5ca8b86363f849629d71e880865a8790da8bbc 17 August 2015, 22:35:56 UTC
a3e2aa1 Allow specification of log4j logging in Jetty. Former-commit-id: 6772ebb6e580e81353debcedde3a47ac4c628aa7 17 August 2015, 16:14:17 UTC
e1677ca Merge pull request #63 from fmux/master improve handling of HTML entities Former-commit-id: b471ef82e89ac1db8ac870f365cb99295872a627 10 August 2015, 12:08:21 UTC
f578b06 fix error preventing double encoding of " entities (also tidy up some whitespace) Former-commit-id: 6f9e0943a0a0db83fcd3ee237258a3b167603599 10 August 2015, 11:40:17 UTC
f6ef235 make sure xml:id attribute is encoded properly Former-commit-id: a3d12e3e2c68a161e1b8df3a03f163c0a5a0e6e0 10 August 2015, 11:39:36 UTC
9efac05 Merge pull request #62 from sujen1412/GROBID-59 Fix for Grobid #59 - Publishing to Maven Central Former-commit-id: 999c43ac09e14ae483de6a4848da44ad4152ab7d 07 August 2015, 20:52:40 UTC
928bdbc Removed connection url from maven release plugin Former-commit-id: 7b8dd9e98f5f9a2c6cd8ccac8131ee978a3517c8 05 August 2015, 20:24:21 UTC
b4d941e Added release profile for Maven Central in pom.xml Former-commit-id: ebff4ad905b28f7d2d5db7d07a3e42b117b78c7d 05 August 2015, 18:10:17 UTC
49e0694 Some additional robustness related to pdf2xml Former-commit-id: 3399c8c85624b4ede3fb3f8ca859a3ef8d9ae216 03 August 2015, 00:47:27 UTC
db9d449 Generate additional files with training raw texts Additional files with raw text of citations are created when executing createTrainingSegmentation or createTrainingReferenceSegmentation Former-commit-id: 6a658bff5bbf77b74f1fef4626bbe316a881cdcb 06 July 2015, 17:30:41 UTC
5a52b38 Fix issue #57 and some minor typos Former-commit-id: da8843f2757f1f52e1b0c8d8eba82c130558aa28 06 July 2015, 15:57:39 UTC
b184890 preserving coordinates for reference markers Former-commit-id: 90ce22da7b340b12a6214d1fb4ca47e414b4374f 03 June 2015, 12:27:12 UTC
94aa3bb Person name suffix outputted under <genName> in TEI Former-commit-id: 82e23b2f5d1f4fb8874a070400df94b3d930658d 20 May 2015, 22:40:02 UTC
e8a2215 In the batch mode, correcting incorrect TEI file names when the PDF file has a .PDF extension Former-commit-id: 9370901d02899090143011a8bef8984277a60a90 16 May 2015, 06:17:21 UTC
4052371 Simplify the code for patent processing Former-commit-id: 60e12dbda0b6510e01e2ce4fa0b0474fc568b27a 14 May 2015, 08:20:04 UTC
79dde17 Typo Former-commit-id: 03969f415ea7fe628d5762ca93f49b5f3490b74b 13 May 2015, 01:06:48 UTC
ac9d32b Refer to the GROBID ant example project and some updates. Former-commit-id: 933b85c8b2dd9f7484920c4e80f81becdc33a391 13 May 2015, 01:03:02 UTC
d3f13da Update ant build files Former-commit-id: f5187a1b9693639c5d23c016517cac4c0294545c 12 May 2015, 23:46:27 UTC
2600e4c Improve handling of extracted keywords in header Former-commit-id: baefe86a8bdf8247d6f5c4f05695dc3845ee33fb 12 May 2015, 03:44:00 UTC
c61c15f Make documentation notations more consistent Former-commit-id: 93f8f3bf4f20da0e725c3f82e394939b14685548 12 May 2015, 03:41:51 UTC
9337dec Context window parameter for patent training data Former-commit-id: ed5b8e10a25a0f7a2da59925a4f130ea1cf9feff 12 May 2015, 03:41:14 UTC
a6d834a Integrate the new documentation Former-commit-id: 8449d8279d5907c9fbc8bf34ee65952f2af02d30 09 May 2015, 20:59:16 UTC
f92dc7a Move all the doc to mkdocs Former-commit-id: 012e82dde7ea2293ae2295087044e21d9deca9a9 09 May 2015, 20:43:17 UTC
2a6a98f It seems that ReadTheDocs does not use the latest version of mkdocs It does not help ;) Former-commit-id: bbee116dae2f97610f56fccf54768303afb128e6 09 May 2015, 04:34:42 UTC
01411fb Try to solve the section issue observed with ReadTheDocs mkdocs locally works fine Former-commit-id: 8488c381dea712e7723dfd3a261f0e0381759de5 09 May 2015, 04:29:06 UTC
a22ca18 Try to get the doc section correctly Former-commit-id: d162566eaeabd54ecf66f5d39ed405d252e55541 09 May 2015, 04:20:23 UTC
5a9b0c8 New attempt to have the doc building via readthedocs.org Former-commit-id: 998971043d2a0e9a0bf763b3db5e69a2b0ec5d1d 09 May 2015, 04:09:03 UTC
eb462ef Using mkdocs and ReadTheDocs for the doc... Former-commit-id: fce8472d28a92062c78095bc7ffafca18f07e47e 09 May 2015, 03:50:26 UTC
d309a6b New attempt to get the doc built Former-commit-id: 61d67e14bb8a392b8a4a6f1c8bd65d669794cb8a 09 May 2015, 01:04:58 UTC
92f11ef (Re)try to config the doc Former-commit-id: afedce032a42831762da072b2a39c796e27a3a14 09 May 2015, 01:01:26 UTC
6f8cb70 Test doc config Former-commit-id: 45d4d0615b12b4c5f055c9b6bb188131f6010548 09 May 2015, 00:18:37 UTC
ea744b8 Try more serious documentations with mkDocs Former-commit-id: 0d757130a8e0ac7244745bf3daa2d57eaf73146a 08 May 2015, 23:33:43 UTC
7b5ae60 Additional patent training data Former-commit-id: 74f109cab6c88411cce7dee9d177f8b098b6f9b2 06 May 2015, 03:02:35 UTC
2fb6ce5 Move GrobidTimer Former-commit-id: 2a10557374639bf551f1b145b54843b4192b84d5 03 May 2015, 18:01:04 UTC
acdd951 Yet another attempt to make the coveralls maven plugin working Former-commit-id: d2087558c898509d93056f7d0901c5e61038be93 03 May 2015, 00:50:28 UTC
dcf8f7a Still testing the coveralls maven plugin Former-commit-id: 8501c75d36d0dcbcb5e9116f2b3945d6c2ba5b65 02 May 2015, 21:41:08 UTC
1fdfa58 New try for getting coveralls reports Former-commit-id: 9d2c2a375cdfbcba17fba3942f685c972f055c58 02 May 2015, 20:24:33 UTC
cf091ea Still trying coveralls maven plugin Former-commit-id: 36ec55e11f713f4a92dc1137ec4f420a88429607 02 May 2015, 20:02:32 UTC
ac1134a Testing coveralls maven plugin Former-commit-id: bcbd586ae5ddfcbfbef915af6d5075b7b7a1b413 02 May 2015, 19:18:54 UTC
6a723ad Trying https://coveralls.io service Former-commit-id: 7e4893afb92dfaf0a560b36208ce49b23f8646f4 02 May 2015, 18:54:46 UTC
54fbc97 Goodbye Jenkins and welcome travis-ci Former-commit-id: 0e0358a8db22f98a52b4d600de6323ab3334bf11 02 May 2015, 17:57:04 UTC
14c8898 Right version of the sax parser Former-commit-id: 2d858ee11824a38cc017786e9fcfbd2d1926b153 02 May 2015, 17:48:23 UTC
faf2821 Add training data for patents Former-commit-id: 6fab24848a576d7ed580c14bed6bc9d7febd781e 02 May 2015, 17:18:21 UTC
596cbac Try to get back the build status icon... Former-commit-id: 1f22d212c8c55be428c9242e705e009df1389a1e 30 April 2015, 15:08:23 UTC
a15e762 Fix a problem with language code Former-commit-id: 9f88c36253f66e2d399f2c6f4b3bc0206460eeef 30 April 2015, 13:20:19 UTC
500525c Extraction of the kind codes and output of the original patent number in addition to epodoc format Former-commit-id: aa4cf4fd8f4b0b035bd299765dc8ecba620e8f9e 30 April 2015, 01:31:42 UTC
72f7570 Add tests for CJK patent processing Former-commit-id: b5570a1d859b4f79aeba7516f033676aee02b286 26 April 2015, 21:43:56 UTC
5515ca9 Use of CJK analyzers for patent processing Former-commit-id: b2fcdf2db86adb37eb0e3eabc59a9d718df4a2b6 26 April 2015, 18:39:43 UTC
c600181 Use the new unique analyzer class for tokenization All the tokenizations use the default tokenizer for the moment. Former-commit-id: e02125fe51eee1ee2c87ccbb985f92f6b47fd863 26 April 2015, 03:18:03 UTC
3dd1165 Add support classes for CJK and Arabic languages Former-commit-id: 4fb295822d48a76801e524e0d2423c879035625d 26 April 2015, 03:15:11 UTC
7416511 Cleaning Former-commit-id: b9a9fd9d74512463e0eb188c0fa65d25756f1081 23 April 2015, 18:33:01 UTC
aa77caa caching tei ids of citations in Document Former-commit-id: f5de8928972753d296f1b56db2dd82515611df5d 23 April 2015, 15:24:25 UTC
c575641 Merge branch 'master' of https://github.com/kermitt2/grobid Former-commit-id: 30b6ca6335bf7a13e8be68a519aba5982a860cdf 23 April 2015, 09:35:53 UTC
7f7d013 embedding citations into Document when doing fulltext processing Former-commit-id: abb32972ad826edad59dae4135098847af031fc4 23 April 2015, 09:18:02 UTC
e1cc654 Try to serialize Wapiti parameters in a more portable way Former-commit-id: 0824aa8cd630d49d850bbf6850fb18941bbac259 20 April 2015, 23:07:07 UTC
7cf5d65 Minor Readme.md correction Former-commit-id: 4d0c9e1e955b65a2c495eb9efe24db2e348fa015 20 April 2015, 23:06:18 UTC
d6cc8a0 Add link to PubMedCentral evaluation wiki page Former-commit-id: 1b34841f63c1f4d90d99f856226151ccf1a844b6 19 April 2015, 05:46:56 UTC
0025280 Reference markers are back in the TEI results XML well-formedness tested on 2000 PDF Former-commit-id: 241ad2a33a8bb59983f3ed6f845b8f2beaacdd62 19 April 2015, 05:19:02 UTC
ff4377a Add build status icon Former-commit-id: 1a80f05f0c7b858fe07ca81a7c9437709a807b7e 19 April 2015, 05:17:56 UTC
70ecab8 Start text body evaluation against PubMedCentral Former-commit-id: 15d0558917f2935bedb74c6e73ec2af664a76878 19 April 2015, 05:17:40 UTC
95f1690 Update reference-segmenter resources Former-commit-id: 5d9d2c89434731fe2a3280c34c067668a158a1e2 19 April 2015, 05:16:27 UTC
261a5c6 Generalize the way the training data for reference-segmenter is generated Also make the method more similar to the other parsers. Former-commit-id: 8a77d4b6c3c4b37d9426caab8d982031f6a37aed 11 April 2015, 19:05:13 UTC
eb1dfe3 Update resources for referenceSegmenter Test is OK again Former-commit-id: 32e410fffc9939401970c86e4a130e461f886789 11 April 2015, 05:36:42 UTC
91cfe71 Skipping the reference parser test for the moment... Test is now failing :( Former-commit-id: 2a9f2d844aa98ee2ce319ab548b99cb8485c0290 10 April 2015, 18:02:45 UTC
ed0fb35 Update the reference-segmenter resources with the indent line feature Former-commit-id: b56204d22a10a58a92a6c9cbef32847044d9525a 10 April 2015, 17:50:32 UTC
553f750 Add mvn exec command for PMC evaluation Former-commit-id: 0d42ef4f689c8cd0d0f8060419313d5d4d23a502 10 April 2015, 17:49:23 UTC
cf3168f Merge pull request #52 from rodneykinney/master Indentation feature for ReferenceSegmenter Former-commit-id: e04487fbcf4cfd279514df14ef3338528c62ee1f 09 April 2015, 23:33:54 UTC
ecbc420 Fix issue #47 Address the second identified bug related to accent and diaeresis character recomposition Former-commit-id: 38c95704beee62f82d4ba21916fd7ea21c9f57f4 09 April 2015, 23:30:31 UTC
32d4bd9 Indentation feature complete Former-commit-id: ef7dda195351755f1c76e3919787e06a15286109 09 April 2015, 23:26:33 UTC
0f66b96 Merge branch 'master' of https://github.com/kermitt2/grobid into featurize Former-commit-id: 8656719f7f240e75105df8687c74853e8e44a5a5 09 April 2015, 15:01:54 UTC
db147fd Fix a token inconsistency in case of accent/diaeresis Former-commit-id: 9e6c365df054c948bb2f396c17027f6aa125af8f 09 April 2015, 00:44:50 UTC