https://github.com/grangier/python-goose

Revision Message Commit Date
71ea7b4 Merge branch 'release/v1.0.6' 30 December 2013, 15:01:04 UTC
35d5652 bump version 30 December 2013, 15:00:35 UTC
bce32c3 Merge branch 'release/v1.0.5' 30 December 2013, 14:51:20 UTC
c928ced Merge branch 'feature/norvegian-69' into develop 30 December 2013, 14:49:04 UTC
231e434 Merge branch 'develop' of https://github.com/svenanders/python-goose into svenanders-develop 30 December 2013, 14:45:15 UTC
40652c8 Merge branch 'feature/korean-67' into develop 30 December 2013, 14:39:42 UTC
b5b99ac #67 - pep8 30 December 2013, 14:36:03 UTC
60dd742 #67 - add target language 30 December 2013, 14:34:56 UTC
a3fe0f9 #67 - remove empty lines 30 December 2013, 14:16:22 UTC
79ee503 #67 - correct readme file 30 December 2013, 14:14:30 UTC
e770d0a #67 - remove unwanted committed files 30 December 2013, 14:11:24 UTC
1fec34b Merge branch 'develop' of https://github.com/sp576/python-goose into sp576-develop 30 December 2013, 14:10:14 UTC
04f444e Merge branch 'feature/image-width-height-44' into develop 30 December 2013, 14:02:49 UTC
bd505cc #44 - correct indent 30 December 2013, 14:00:10 UTC
1bbf48b Fixing bigimage width and height basic image test 30 December 2013, 04:46:07 UTC
64e6301 korean test case 17 December 2013, 21:12:26 UTC
3563aa9 Added stopwords for norwegian 16 December 2013, 15:04:46 UTC
51c5de8 Goose in Korean example 12 December 2013, 06:00:49 UTC
f8751c3 Goose in Korean example 12 December 2013, 05:59:21 UTC
a88a6b2 Goose in Korean example 12 December 2013, 05:58:16 UTC
fdc3b34 added korean support 12 December 2013, 05:41:39 UTC
bbe9ec3 #58 - update travis build image src 08 December 2013, 07:30:54 UTC
3804752 #43 - add cnn specifc classes 07 December 2013, 22:25:00 UTC
203952e Merge branch 'release/v1.0.4' 07 December 2013, 10:42:14 UTC
b695ba0 added list of german stopwords 05 December 2013, 22:16:01 UTC
05e66b4 Merge branch 'release/v1.0.3' 13 October 2013, 18:57:43 UTC
31e51e8 #50 - add italian stopwords 13 October 2013, 18:42:37 UTC
1bb368e Handle missing src attribute values for known image elements. Calls to UpgradedImageIExtractor.get_image in check_known_elements were not checking the validity of the extracted src before trying to use it to build the image. This caused an exception when build_image tried to parse the src URL. 10 October 2013, 20:43:22 UTC
050b11f Merge pull request #38 from grangier/imageextractor known image css work at element level 20 August 2013, 06:57:49 UTC
67f5ab4 known image css work at element level 19 August 2013, 19:21:50 UTC
e008ac1 add test for known_image_css, known_image_name, opengraph_tag 18 August 2013, 23:41:59 UTC
692a80d move assert_top_image to image test file 18 August 2013, 23:40:43 UTC
2b1ac2a return image even if local_image is None 18 August 2013, 23:39:23 UTC
0c0a3bf pep8 18 August 2013, 22:30:03 UTC
9cd3417 no need for OSX setup info 18 August 2013, 21:08:44 UTC
cf9af14 updated README 18 August 2013, 20:53:49 UTC
22391da Add cookie handeling exemple in known issues 18 August 2013, 18:29:20 UTC
40a1993 add gizmodo tests 18 August 2013, 17:54:39 UTC
6f77d7e add clean_article_tags method, clean classes, ids and name attribute from article tags 18 August 2013, 17:54:09 UTC
68100e4 add delAttribute method to parser 18 August 2013, 17:37:29 UTC
7c989ba better handeling of object videos 14 August 2013, 12:54:43 UTC
2bd252c Update README.rst 13 August 2013, 10:18:02 UTC
79ecfac Merge pull request #33 from grangier/video video extraction of youtube, vimeo, dailymotion and kewego 12 August 2013, 23:56:09 UTC
505f3b1 imports first 12 August 2013, 23:52:12 UTC
7be02d1 bump to version 1.0.2 12 August 2013, 23:49:05 UTC
c1fe968 video exemple 12 August 2013, 23:47:41 UTC
e8bfbca video iframe and embed tests 12 August 2013, 23:42:43 UTC
de01b00 check if video candidate has a src attribute 12 August 2013, 21:36:56 UTC
a8e242c basic video extraction 12 August 2013, 21:28:12 UTC
00dfcaa useless functions 11 August 2013, 19:58:45 UTC
1ddbf2d use loadConfig instead of getArticle in images tests 11 August 2013, 19:33:51 UTC
c88994a use correct version for browser_user_agent configuration 10 August 2013, 17:16:49 UTC
9e2b2be more image detail/utils tests 10 August 2013, 17:11:01 UTC
61041d3 basic ImageUtilsTests test cases 10 August 2013, 17:01:23 UTC
8a9c563 rename ImageTests to ImageExtractionTests 10 August 2013, 16:13:38 UTC
f3161e0 remove useless imports 10 August 2013, 15:04:03 UTC
f611627 remove useless comment 10 August 2013, 15:02:22 UTC
a88448f ignore ._* 10 August 2013, 14:55:42 UTC
190d38b ignore ._.DS_Store files 10 August 2013, 14:54:44 UTC
9ed832b link contributors to github repository 10 August 2013, 12:09:05 UTC
2bafd86 add a TODO notice 10 August 2013, 11:03:16 UTC
3484135 uncomment callback class 10 August 2013, 11:03:04 UTC
0f91505 tags tests are back to extractors 10 August 2013, 10:53:06 UTC
23c0a5f mocked tags tests 10 August 2013, 10:51:03 UTC
5fdbc82 rename tags tests files to reflect testcase name 10 August 2013, 10:11:04 UTC
7e78bef rename article_tags folder to tags 10 August 2013, 10:07:25 UTC
2adfb5e remove useless print 09 August 2013, 21:59:23 UTC
3435aeb extractors tests now used mocked urllib2 09 August 2013, 21:49:34 UTC
4d1ccaf use urlib2.Request instrand of HTTPhandler 09 August 2013, 20:28:26 UTC
f6eb9e2 directories don't existes anymore 09 August 2013, 20:27:15 UTC
afb201f remove useless comments 08 August 2013, 21:18:00 UTC
aa35144 basic mocked handler for images 08 August 2013, 21:09:06 UTC
7f17c0b basic mocked handler and response 08 August 2013, 21:07:35 UTC
2dff006 use ReStructuredText for README instead of markdown 05 August 2013, 07:09:00 UTC
32f4dae fall back for long description 04 August 2013, 23:19:40 UTC
0032233 bump version 1.0.1 04 August 2013, 23:13:39 UTC
1f36c95 package name is goose-extractor since Goose is already a package name in pypi 04 August 2013, 23:04:53 UTC
a36192c add classifiers and long description to setup file 04 August 2013, 22:58:37 UTC
807d992 Update README.md 04 August 2013, 16:12:07 UTC
f31d786 Merge pull request #30 from grangier/testsplit Split tests in sub modules 04 August 2013, 16:09:26 UTC
d002a95 missing blank line 04 August 2013, 16:03:48 UTC
7cd0633 remove tests.py file 04 August 2013, 16:02:51 UTC
5d4dc6d useless imports 04 August 2013, 16:00:53 UTC
5615ade remove print 04 August 2013, 15:58:21 UTC
6a0a034 split test module in subs modules 04 August 2013, 15:55:33 UTC
1aa6e83 correct test case name 04 August 2013, 15:25:28 UTC
65811d7 correct test case name 04 August 2013, 15:24:11 UTC
f8521e5 basic testcase for arabic content 04 August 2013, 15:23:09 UTC
737f09e typo 04 August 2013, 15:17:19 UTC
fae9d53 missing nltk in requirements.txt 03 August 2013, 23:04:46 UTC
39f90f5 Merge pull request #29 from grangier/isri #24 Goose now handles Arabic content 03 August 2013, 23:02:14 UTC
f846577 How to use Goose in Arabic 03 August 2013, 22:57:11 UTC
7026e6d basic arabic handeling 03 August 2013, 22:47:35 UTC
bbfc682 Merge pull request #28 from litso/topic_tags Add support for tags in a /topic/ link 03 August 2013, 22:13:26 UTC
7879711 Add support for tags in a /topic/ link 03 August 2013, 22:04:02 UTC
100e661 typo 01 August 2013, 01:53:31 UTC
89ff5b1 StopWord class refactor 01 August 2013, 01:52:07 UTC
4790fc7 StopWord class refactor 01 August 2013, 01:51:10 UTC
a24cf18 useless code in StopWordsChinese class 01 August 2013, 01:18:10 UTC
9ba4b5b Merge pull request #27 from psilva261/handle_br let br tags create newlines; fixes issue #25 in grangier/python-goose 01 August 2013, 00:47:23 UTC