https://github.com/grangier/python-goose

Revision Message Commit Date
6e76dc3 python 2.6 doesn't support assertIsInstance 24 March 2013, 22:16:57 UTC
48f03c5 python 2.6 doesn't support assertIsNotNone 24 March 2013, 22:15:28 UTC
1b72703 update THANKS file 23 March 2013, 18:33:36 UTC
fbd1e42 update THANKS file 23 March 2013, 18:31:46 UTC
612aded missing import 23 March 2013, 18:26:16 UTC
aa86d4b cross platform filepath handeling 23 March 2013, 18:24:58 UTC
374e00b cross platform filepath handeling 23 March 2013, 18:17:09 UTC
ca73bbf Missing parentheses added in isOkToBoost 22 March 2013, 19:05:26 UTC
eeeb68b Merge branch 'master' of github.com:xgdlm/python-goose 22 March 2013, 19:02:18 UTC
de5ce5a add sup tag to replaceTagsWithText methode 22 March 2013, 19:01:56 UTC
133fc3e Merge pull request #10 from timjurka/master Removing set() Usage in Content Clusterer 24 February 2013, 20:05:51 UTC
690663d Don't want to use set() in content extractor, because DOM elements get reordered. 13 February 2013, 01:00:39 UTC
e36ff1e Clean only current thread tmp files, add timestamp for multithreading 10 December 2012, 16:20:53 UTC
ba41090 Update README.md 10 December 2012, 15:58:03 UTC
39384a5 Missing jieba package 10 December 2012, 12:25:42 UTC
f1064ad Adds Chinese stopwords analyser, and enable to pass a stopword analyser to config object 10 December 2012, 11:26:37 UTC
ff20ac2 Add v0idnull to contributors 10 December 2012, 10:17:48 UTC
a932d96 Merge pull request #8 from v0idnull/master Debug flag 10 December 2012, 10:16:29 UTC
ef18ab2 HTTP Debug mode now dependent on configuration 10 December 2012, 00:40:45 UTC
bb83da5 Added debug flag to configuration 10 December 2012, 00:40:07 UTC
1db85f9 Merge branch 'master' of github.com:xgdlm/python-goose 08 December 2012, 14:21:37 UTC
f431d51 release resources 08 December 2012, 14:20:36 UTC
d9bbb46 Merge pull request #5 from dzen/master Fix missing dependancy thanks 01 November 2012, 10:18:04 UTC
758243f setup.py: Missing dependancy on cssselect 31 October 2012, 09:42:39 UTC
94f0c62 Merge branch 'master' of github.com:xgdlm/python-goose 30 October 2012, 22:35:55 UTC
68ca62b hashlib.md5 doesn't support unicode, use str instead 30 October 2012, 22:33:52 UTC
e2a0962 Typo thanks to brutasse 29 October 2012, 10:42:52 UTC
5e24b7d Better configuration and usage instruction 28 October 2012, 12:08:48 UTC
dc3f762 Missing commit for Language support 28 October 2012, 12:07:56 UTC
e28fff4 Use the correct stopword file in regard of meta language stopswords are really important to check words and paragraphe density goose will now try to fetch the correct stop word file. It's also possible to force the target language using configuration 28 October 2012, 11:50:23 UTC
5ef43c5 adds Parsely 27 October 2012, 20:23:18 UTC
e6cb957 Cache the list of stop words (per language). This avoids re-reading all of the stop words from disk continuously. Tanks to Parsely 27 October 2012, 20:22:25 UTC
fbd1269 Only create the trans table once and reuse it. 27 October 2012, 20:18:27 UTC
3cb0f4b pep8 27 October 2012, 20:30:17 UTC
7261316 Make the precompiled PUNCTUATION regex actually reusable. 27 October 2012, 19:41:07 UTC
16de610 Better MANIFEST.in 27 October 2012, 19:32:43 UTC
c60bb53 Remove useless module 27 October 2012, 19:23:49 UTC
7d88d01 Remove "facebook_broadcasting" junk Remove the "facebook-broadcasting" div which would lead to 'Click "Add to Timeline" to publish what you read to Facebook' becoming the article text. This issue was surfacing while looking at CBS Local affiliate site support. 27 October 2012, 19:15:52 UTC
e5e2869 Update include better OS X installation instructions 27 October 2012, 19:09:28 UTC
8fc6a0b adds thank you file 27 October 2012, 21:09:43 UTC
ec58eeb adds OXs setup instructions thanks to litso 27 October 2012, 21:08:06 UTC
628886b add .gitignore file thanks litso 27 October 2012, 21:03:23 UTC
4c3c4f0 add cssslect to requirements.txt 27 October 2012, 20:55:24 UTC
670152b Merge pull request #3 from aidenbell/master Add MANIFEST for setup.py thanks aidenbell 27 October 2012, 18:36:05 UTC
9bd503a Added MANIFEST.in to copy across the resources folder containing the various non-python files required (stopword lists etc) 23 October 2012, 15:59:16 UTC
b6b061e cleanup whitespaces 26 February 2012, 12:49:24 UTC
78ed2b2 useless import 26 February 2012, 11:37:40 UTC
5afede8 utf-8 special comment 26 February 2012, 11:32:10 UTC
7896459 add license header 26 February 2012, 11:28:51 UTC
acd069a remove useless param in FileHelper loadResourceFile 26 February 2012, 11:24:31 UTC
b4bef65 import Article form Article 25 February 2012, 16:20:47 UTC
4a76daf fix some imports 25 February 2012, 16:19:01 UTC
0ce2005 Fix some imports 25 February 2012, 16:17:19 UTC
86dd4b8 Remove unused variables 25 February 2012, 16:13:43 UTC
f05bbad move modules around 25 February 2012, 16:08:49 UTC
c7922e7 move modules around 25 February 2012, 16:04:03 UTC
cf68389 move modules around 25 February 2012, 13:12:56 UTC
95cec0e add title meta description exemples 22 February 2012, 14:48:40 UTC
e096ea7 typo spotted by cyp on irc, thanks 22 February 2012, 14:42:58 UTC
f263354 TODO list 22 February 2012, 14:36:32 UTC
9f311ba markdown README only 22 February 2012, 14:29:48 UTC
6f65a78 markdown readme file 22 February 2012, 14:24:44 UTC
b26cd6e usage info in readme file 22 February 2012, 14:24:05 UTC
fc094d8 basic setup file 22 February 2012, 14:23:07 UTC
1eb40dc pip requirements 22 February 2012, 14:05:04 UTC
0f16b72 enable image fetching 22 February 2012, 14:04:04 UTC
fe00737 enable tests 22 February 2012, 11:48:17 UTC
73f475a initial import 22 February 2012, 11:43:41 UTC
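
Commit 690663d above says set() was dropped because it reorders DOM elements. The snippet below is a minimal sketch of that general concern, not the change made in that commit: it removes duplicates while keeping first-seen order. The function name unique_in_order and the sample node labels are hypothetical and do not come from python-goose.

```python
def unique_in_order(items):
    """Return items with duplicates dropped, keeping first-seen order."""
    seen = set()      # membership checks only; never iterated
    ordered = []      # carries the original document order
    for item in items:
        if item not in seen:
            seen.add(item)
            ordered.append(item)
    return ordered


# Hypothetical stand-ins for DOM nodes, listed in document order.
nodes = ["p#intro", "div#body", "p#intro", "p#outro", "div#body"]
print(unique_in_order(nodes))
```

Iterating over set(nodes) directly would give no ordering guarantee, which matters when the order of the elements is part of the extracted content.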