https://github.com/ContentMine/quickscrape

Revision Message Commit Date
27dce16 Merge pull request #78 from chreman/master: added CONTRIBUTING.md 23 September 2016, 09:06:08 UTC
b1348b9 Create BUGS.md 19 August 2016, 10:37:31 UTC
afc882f Merge pull request #84 from larsgw/patch-1: Added missing comma 21 June 2016, 09:24:10 UTC
f2bdd85 Merge pull request #86 from tarrow/santiseFileNames: sanitize creation of folder 21 June 2016, 09:23:19 UTC
0be76de sanitize creation of folder 20 June 2016, 08:34:51 UTC
e1d22b1 Added missing comma 19 June 2016, 17:31:07 UTC
8198895 added CONTRIBUTING.md 16 May 2016, 11:30:41 UTC
fbfd07e added CONTRIBUTING.md 16 May 2016, 11:22:59 UTC
43b7516 Resolve directory paths before changing directory -_- (fixes #60 #67) 15 April 2016, 14:13:15 UTC
f4f47f7 Fix directory numbering string bug 15 April 2016, 13:34:51 UTC
9997dfd Implement simple output subdirectory numbering (#61) 15 April 2016, 13:29:50 UTC
19cefd9 Release v0.4.7. 12 October 2015, 10:28:55 UTC
90d7e6f Get thresher with follow bug fix 12 October 2015, 10:28:46 UTC
d38d37b Add optional saving of logs (closes #51) 08 August 2015, 04:47:22 UTC
2d6d655 Change short arg for ratelimit to avoid conflict (fixes #53) 08 August 2015, 04:35:26 UTC
cb18b31 Release v0.4.6. 08 August 2015, 04:33:08 UTC
c61837f Update license field to current NPM spec 08 August 2015, 04:32:59 UTC
0cd2b75 Bump thresher for DOI resolve fix (fixes #54) 08 August 2015, 04:32:36 UTC
7f5a4a6 Release v0.4.5. 14 June 2015, 13:22:30 UTC
651ceb0 Handle invalid attributes and log all failed captures in debug 14 June 2015, 13:22:22 UTC
24866c1 Print version in first line of log 14 June 2015, 13:02:50 UTC
8e141ed Release v0.4.4. 14 June 2015, 12:31:01 UTC
e0b694f Bump thresher for new scraper validation 14 June 2015, 12:30:56 UTC
99a0983 Validate scraper(dir)s before running (fixes #48) 14 June 2015, 12:30:08 UTC
04223ae Release v0.4.3. 14 June 2015, 12:01:41 UTC
f44d8dc Refactor CLI code. Rate-limited loops now avoid using recursion. Argument paths are expanded before use. Element capture statistics are reported for each URL (closes #7) 14 June 2015, 11:58:48 UTC
bbc729a Bump thresher dependency 14 June 2015, 11:57:14 UTC
6978162 Merge branch 'master' of github.com:ContentMine/quickscrape 14 May 2015, 15:05:47 UTC
baa1dcb Release v0.4.2. 14 May 2015, 15:05:39 UTC
2e7b22f Bump thresher dependency 14 May 2015, 15:04:41 UTC
e6d064c Export version globally for logging 14 May 2015, 14:59:18 UTC
5eef420 Clarify installation instructions (see #41) 06 May 2015, 10:14:53 UTC
8448f0b No OS specific instructions any more 04 May 2015, 14:20:56 UTC
f087ce1 Make it clear that headless scraping is optional 11 April 2015, 12:00:43 UTC
e971ae5 Load version number from package.json. This avoids accidentally forgetting to update the version number in multiple places when releasing. 11 April 2015, 11:57:38 UTC
1b183ba Release v0.4.1. 11 April 2015, 11:51:03 UTC
f65967a Fix version number reporting 11 April 2015, 11:50:56 UTC
d7ee835 Simpler install using cross-platform NVM 11 April 2015, 11:49:29 UTC
6ce34fd Release v0.4.0. 11 April 2015, 11:01:06 UTC
677ee3d Update README for v0.4.0 11 April 2015, 11:00:58 UTC
167be4c Print help and exit when run with no arguments (fixes #36) 11 April 2015, 10:57:22 UTC
3b064cd Trim empty lines from URLlist files (fixes #29). Many text editors add a terminal newline to files on save. This was previously being interpreted as an invalid URL. Fixed by filtering the URLs loaded from `--urllist` to remove empty entries. 11 April 2015, 10:26:00 UTC
8662a1e Clear event listeners on thresher after each URL (fixes #33). When the URLlist is iterated over using recursive setTimeOuts, the Thresher object is staying in scope for all the nested calls. This is leading to event listeners accumulating on the Thresher object, and multiple identical handlers being called for the same event. As a shim, I have simply cleared all the event listeners after each URL finishes processing. 11 April 2015, 09:38:16 UTC
ac23ddb Update dependencies 11 April 2015, 09:36:52 UTC
1cbc298 Release v0.3.7. 10 April 2015, 21:37:42 UTC
f8b16d9 Update README for v0.3.7 10 April 2015, 21:37:37 UTC
4bd505b Update thresher dependency to v0.1.3 10 April 2015, 21:36:36 UTC
539f00a Fix dates in bibJSON 10 April 2015, 21:36:06 UTC
b6983a8 remove spurious files 10 April 2015, 20:14:17 UTC
71695d6 Merge branch 'master' of github.com:ContentMine/quickscrape 10 April 2015, 20:12:17 UTC
cc3879e fix error when missing log message 31 March 2015, 09:06:28 UTC
9e36288 tidy bibjson html capture 31 March 2015, 09:06:02 UTC
f92e632 readme tidy 22 January 2015, 09:38:34 UTC
192cb1c no longer need unsafe-perms option 22 January 2015, 09:35:42 UTC
7012cb6 removed files 12 January 2015, 16:14:39 UTC
980f10e added bmc scraper 12 January 2015, 15:58:11 UTC
3722aea Merge branch 'master' of https://github.com/ContentMine/quickscrape: added scrapers for BMC 12 January 2015, 15:52:56 UTC
ab740ce converted existing MDPI scraper to BMC trials 12 January 2015, 15:52:29 UTC
c86dabe update README 11 January 2015, 16:05:05 UTC
f4a4587 Release v0.3.6. 11 January 2015, 15:34:54 UTC
a615c09 bump patch 11 January 2015, 15:34:42 UTC
e5764d8 typo 11 January 2015, 15:34:08 UTC
4f37372 Release v0.3.5. 11 January 2015, 14:28:11 UTC
a35be25 bump thresher dependency version; bump version 11 January 2015, 14:27:53 UTC
fa2c3ec Release v0.3.4. 11 January 2015, 12:38:18 UTC
531ea16 prep for v0.3.4 11 January 2015, 12:38:14 UTC
9105b8b fix ref/table/fig output key 11 January 2015, 12:37:52 UTC
97dfcca Release v0.3.3. 10 January 2015, 14:58:24 UTC
96f659f add --outformat option 10 January 2015, 14:54:52 UTC
6ad698c Release v0.3.2. 06 October 2014, 21:20:14 UTC
46be3a9 prep for v0.3.2 06 October 2014, 21:20:02 UTC
7d6870f Release v0.3.1. 02 October 2014, 21:13:53 UTC
678a0d5 prep for v0.3.1 02 October 2014, 21:13:43 UTC
8dcf3c3 Release v0.3.0. 02 October 2014, 20:59:35 UTC
173ee67 prep for v0.3.0 02 October 2014, 20:59:14 UTC
12688ba write out structured JSON correctly 02 October 2014, 20:35:38 UTC
c7cfa03 add mac DS_Store to gitignore 02 October 2014, 18:49:02 UTC
600a567 tidy up logging 22 September 2014, 10:28:56 UTC
4942aeb integrate thresher updates 21 September 2014, 13:33:27 UTC
1d1b53d use new thresher interface 08 September 2014, 16:24:24 UTC
76f5bce Release v0.2.8. 14 August 2014, 08:27:11 UTC
2abea18 prepare for version bump 14 August 2014, 08:27:01 UTC
c62e8ce Merge pull request #26 from Mec-iS/patch-1: Typo at line 10 02 August 2014, 15:10:06 UTC
458b833 Typo at line 10: the correct property's name for `thresher` object at line 10 is `ScraperBox` 02 August 2014, 14:05:47 UTC
715d6e4 Release v0.2.7. 22 July 2014, 16:03:15 UTC
9d44283 prep for another version bump 22 July 2014, 16:02:50 UTC
69e1e8c Release v0.2.6. 22 July 2014, 15:57:23 UTC
81dc2f0 prep for version bump 22 July 2014, 15:57:14 UTC
bbaa653 Release v0.2.6. 22 July 2014, 15:56:02 UTC
38e9720 integrate latest thresher API changes 22 July 2014, 15:55:52 UTC
996d5d6 Reflect changes to Thresher API 22 July 2014, 15:27:48 UTC
334acaf flush work 21 July 2014, 22:50:15 UTC
3d189fd add libfontconfig to ubuntu instructions 18 July 2014, 09:07:55 UTC
9ed63c3 Merge pull request #21 from scraperdragon/master: Correct path in `quickscrape --scraper` 17 July 2014, 08:59:44 UTC
9fb6690 Correct path in `quickscrape --scraper`: `peerj.json` isn't in the root directory of `journal-scrapers` 16 July 2014, 20:19:07 UTC
cd5af47 Release v0.2.5. 10 July 2014, 15:34:07 UTC
764e94f dependency bump 10 July 2014, 15:33:53 UTC
d44a87d fix thresher link 10 July 2014, 15:26:06 UTC
9ade74f Release v0.2.4. 10 July 2014, 15:24:31 UTC
72f320c prep for version bump 10 July 2014, 15:24:23 UTC