451d94f | Radim Řehůřek | 06 July 2015, 19:34:31 UTC | Merge branch 'release-0.12.0' | 06 July 2015, 19:34:31 UTC |
7476ed2 | Radim Řehůřek | 06 July 2015, 19:23:19 UTC | re #385: default to no vocab pruning | 06 July 2015, 19:30:10 UTC |
a06d075 | Radim Řehůřek | 06 July 2015, 16:56:24 UTC | minor formatting fix | 06 July 2015, 16:56:24 UTC |
43a7ffd | Radim Řehůřek | 06 July 2015, 15:57:22 UTC | Merge branch 'develop' of github.com:piskvorky/gensim into develop | 06 July 2015, 15:57:22 UTC |
142a599 | Radim Řehůřek | 06 July 2015, 15:56:04 UTC | up version: 0.12.0 | 06 July 2015, 15:56:04 UTC |
f9afd3b | Radim Řehůřek | 06 July 2015, 15:45:55 UTC | minor doc fixes | 06 July 2015, 15:55:57 UTC |
723443b | Radim Řehůřek | 06 July 2015, 14:28:21 UTC | fix doc2vec unit test | 06 July 2015, 15:48:31 UTC |
61d16bb | Radim Řehůřek | 06 July 2015, 10:11:50 UTC | fix log report during word2vec vocab building | 06 July 2015, 15:48:31 UTC |
37381f7 | Radim Řehůřek | 05 July 2015, 23:49:13 UTC | add max_vocab_size param to doc2vec too | 06 July 2015, 15:48:30 UTC |
7d8eba6 | Radim Řehůřek | 05 July 2015, 23:25:35 UTC | prune vocab during doc2vec vocab building too | 06 July 2015, 15:48:30 UTC |
ae243b3 | Radim Řehůřek | 05 July 2015, 22:53:35 UTC | prune word2vec vocab automatically if too large | 06 July 2015, 15:48:30 UTC |
aa72ffa | Radim Řehůřek | 06 July 2015, 14:58:00 UTC | Merge pull request #385 from piskvorky/prune_vocab Prune vocab | 06 July 2015, 14:58:00 UTC |
6a8faa2 | Radim Řehůřek | 06 July 2015, 14:28:21 UTC | fix doc2vec unit test | 06 July 2015, 14:28:21 UTC |
e0a8e7d | Radim Řehůřek | 06 July 2015, 14:24:12 UTC | improve word2vec unit tests | 06 July 2015, 14:24:12 UTC |
0094de5 | Radim Řehůřek | 06 July 2015, 10:50:31 UTC | Merge branch 'release-0.12.0rc1' | 06 July 2015, 10:50:31 UTC |
0d65960 | Radim Řehůřek | 06 July 2015, 10:48:18 UTC | bump up version: 0.12.0rc1 | 06 July 2015, 10:48:18 UTC |
599b6ae | Radim Řehůřek | 06 July 2015, 10:11:50 UTC | fix log report during word2vec vocab building | 06 July 2015, 10:11:50 UTC |
1c2d0b6 | Radim Řehůřek | 05 July 2015, 23:49:13 UTC | add max_vocab_size param to doc2vec too | 05 July 2015, 23:49:13 UTC |
882fc9e | Radim Řehůřek | 05 July 2015, 23:25:35 UTC | prune vocab during doc2vec vocab building too | 05 July 2015, 23:25:35 UTC |
78fea7f | Radim Řehůřek | 05 July 2015, 22:53:35 UTC | prune word2vec vocab automatically if too large | 05 July 2015, 22:53:35 UTC |
77d4def | Gordon Mohr | 05 July 2015, 21:16:13 UTC | detailed d2v change bullets; ".docvecs" API note | 05 July 2015, 21:16:13 UTC |
aae80ba | Radim Řehůřek | 05 July 2015, 19:56:25 UTC | update CHANGELOG | 05 July 2015, 19:56:25 UTC |
a9fad67 | Radim Řehůřek | 05 July 2015, 18:02:42 UTC | remove flaky unittest for LDA topic seeding | 05 July 2015, 18:02:42 UTC |
d0e5e74 | Radim Řehůřek | 05 July 2015, 18:01:45 UTC | Merge pull request #324 from summanlp/develop Module for automatic summarization | 05 July 2015, 18:01:45 UTC |
eb5d5fb | Gordon Mohr | 05 July 2015, 14:57:50 UTC | Merge pull request #384 from gojomo/doc_progress_pr nicer progress-logging during vocab scan | 05 July 2015, 14:57:50 UTC |
24b4cc9 | Radim Řehůřek | 05 July 2015, 14:10:58 UTC | Merge branch 'develop' of github.com:piskvorky/gensim into develop | 05 July 2015, 14:10:58 UTC |
443985c | Radim Řehůřek | 05 July 2015, 14:10:28 UTC | parametrize min/max token length in utils.lemmatize | 05 July 2015, 14:10:28 UTC |
a55b47b | Gordon Mohr | 05 July 2015, 14:06:56 UTC | nicer progress-logging during vocab scan | 05 July 2015, 14:06:56 UTC |
ebf4d8a | Radim Řehůřek | 05 July 2015, 13:13:11 UTC | Merge pull request #380 from gojomo/build_train_pr Split build_vocab to scan, scale, finalize; train() loop/locking refactor; downsampling into cython | 05 July 2015, 13:13:11 UTC |
ef8a12c | Federico Barrios | 05 July 2015, 05:59:52 UTC | Adding summarization ratio test. | 05 July 2015, 05:59:52 UTC |
da382e9 | Federico Barrios | 05 July 2015, 05:37:21 UTC | Adding test for the corpus summarization feature. | 05 July 2015, 05:37:21 UTC |
dad4670 | Gordon Mohr | 05 July 2015, 04:40:22 UTC | many pep8 fixes | 05 July 2015, 04:40:22 UTC |
2268d20 | Gordon Mohr | 05 July 2015, 02:09:49 UTC | Merge remote-tracking branch 'upstream/develop' into build_train_pr | 05 July 2015, 02:09:49 UTC |
0feb366 | Gordon Mohr | 05 July 2015, 01:32:05 UTC | .gitignore cython_debug | 05 July 2015, 01:32:05 UTC |
60d35e8 | Federico Barrios | 04 July 2015, 19:06:29 UTC | Adding documentation. Fixing bug with the word_count parameter. | 04 July 2015, 19:06:29 UTC |
4083b89 | Federico Barrios | 04 July 2015, 18:40:02 UTC | Fixing bug that generated the graph two times. Changed method name. | 04 July 2015, 18:40:02 UTC |
b3e07ff | Gordon Mohr | 03 July 2015, 23:39:09 UTC | progress_per to control logged progress; fix cum_table err on empty vocab | 03 July 2015, 23:54:08 UTC |
42ea4d0 | Gordon Mohr | 03 July 2015, 23:36:43 UTC | fix: repeat str doctags trigger bad indexes | 03 July 2015, 23:36:43 UTC |
e9e6246 | Gordon Mohr | 03 July 2015, 23:13:17 UTC | failing test: repeat str doctags trigger bad indexes | 03 July 2015, 23:27:00 UTC |
0f7ae51 | Radim Řehůřek | 03 July 2015, 22:58:21 UTC | re #383: fix topic seeding test | 03 July 2015, 22:58:21 UTC |
096a505 | Radim Řehůřek | 03 July 2015, 21:25:40 UTC | regenerate cython extensions | 03 July 2015, 21:25:40 UTC |
d148911 | Radim Řehůřek | 03 July 2015, 21:25:05 UTC | Merge branch 'develop' of github.com:piskvorky/gensim into develop | 03 July 2015, 21:25:05 UTC |
4905635 | Radim Řehůřek | 03 July 2015, 20:41:54 UTC | Merge pull request #383 from piskvorky/pr_281 Speed up doc2bow | 03 July 2015, 20:41:54 UTC |
fa8fe10 | Christopher Corley | 03 July 2015, 18:41:57 UTC | Fix testTopicSeeding for LDA models | 03 July 2015, 18:41:57 UTC |
306d0ca | Radim Řehůřek | 03 July 2015, 14:12:00 UTC | checking what's wrong with topic seeding test | 03 July 2015, 18:21:39 UTC |
e7944ae | Radim Řehůřek | 03 July 2015, 13:15:49 UTC | fix similarity test due to different dictionary order | 03 July 2015, 18:21:39 UTC |
765ab14 | Radim Řehůřek | 03 July 2015, 12:44:16 UTC | fix LSI test | 03 July 2015, 18:21:39 UTC |
9ca9d92 | Christopher Corley | 02 July 2015, 02:25:41 UTC | Fix 2.6 syntax issue with doc2ow | 03 July 2015, 18:21:38 UTC |
bd6bb57 | Christopher Corley | 02 July 2015, 02:15:02 UTC | Fixes issue when document given to doc2bow is a generator. Fixes test cases for similarities. | 03 July 2015, 18:21:38 UTC |
282c797 | Lars Buitinck | 15 January 2015, 15:32:56 UTC | speed up doc2bow by ~40% | 03 July 2015, 18:21:38 UTC |
c59013c | Radim Řehůřek | 02 July 2015, 17:29:14 UTC | Merge pull request #358 from TaddyLab/deepir Sentence likelihood scores | 02 July 2015, 17:29:14 UTC |
381d45a | mataddy | 02 July 2015, 15:01:44 UTC | oops; forgot to add log inport from numpy | 02 July 2015, 15:01:44 UTC |
d4a3e75 | mataddy | 02 July 2015, 14:59:42 UTC | merged latest changes | 02 July 2015, 14:59:42 UTC |
3d5180d | Radim Řehůřek | 02 July 2015, 12:44:51 UTC | make matutils.argsort accept any sequence of numbers | 02 July 2015, 12:44:51 UTC |
84370a5 | Gordon Mohr | 02 July 2015, 12:28:27 UTC | plausible trained_item() behavior | 02 July 2015, 12:40:07 UTC |
d392f6f | Gordon Mohr | 02 July 2015, 12:08:10 UTC | refactor loop: keep logging progress when pushing jobs done | 02 July 2015, 12:40:07 UTC |
f8260c1 | Gordon Mohr | 02 July 2015, 10:41:09 UTC | build_vocab split to scan, scale, finalize scale_vocab() offers 'dry_run' w/ estimated effects on vocab, corpus, memory | 02 July 2015, 12:40:07 UTC |
390c35b | Gordon Mohr | 01 July 2015, 07:35:14 UTC | downsampling into train_*/cython; train_* take word lists also: small pure-py fixes; no redundant None-checking; no codelens[i]==0 skip-convention | 02 July 2015, 12:40:06 UTC |
7650a0e | Gordon Mohr | 30 June 2015, 23:32:15 UTC | ignore *.npz | 02 July 2015, 12:37:30 UTC |
2f711fb | Gordon Mohr | 30 June 2015, 13:15:46 UTC | tally calls to train(), total_train_time | 02 July 2015, 12:37:30 UTC |
626225b | Gordon Mohr | 30 June 2015, 07:25:43 UTC | refactor worker_train for less-locking & 0-worker mode | 02 July 2015, 12:37:30 UTC |
d2f48bb | Radim Řehůřek | 02 July 2015, 12:26:07 UTC | use matutils.argsort consistently everywhere * makes a big performance difference when doing partial sorts (the `topn` parameter) * ditch the bottleneck code path, rely only on numpy * fixes #379 | 02 July 2015, 12:26:07 UTC |
3fc6482 | Radim Řehůřek | 02 July 2015, 11:52:15 UTC | add GA to github README | 02 July 2015, 11:52:15 UTC |
342f10a | fedelopez77 | 01 July 2015, 23:33:54 UTC | Fix in test for python 2.6 compatibility | 01 July 2015, 23:33:54 UTC |
92e437d | mataddy | 01 July 2015, 23:05:19 UTC | update prepare_sentences -> prepare_items | 01 July 2015, 23:05:19 UTC |
7caf535 | fedelopez77 | 01 July 2015, 23:00:06 UTC | Merge from gensim/develop to the fork | 01 July 2015, 23:00:06 UTC |
8dae4a4 | mataddy | 01 July 2015, 20:37:00 UTC | merge with upstream/develop and resolve conflicts | 01 July 2015, 20:37:00 UTC |
ad12ec9 | Radim Řehůřek | 01 July 2015, 19:02:41 UTC | split C word2vec text format only on space (was: any whitespace) fixes #344 | 01 July 2015, 19:02:41 UTC |
73d8167 | Radim Řehůřek | 01 July 2015, 18:11:46 UTC | py3k fix: remove forgotten iteritems in distributed LSI | 01 July 2015, 18:11:46 UTC |
570f08a | Radim Řehůřek | 30 June 2015, 13:53:36 UTC | remove inline from word2vec pxd | 30 June 2015, 13:53:36 UTC |
4e98e68 | Radim Řehůřek | 30 June 2015, 12:49:34 UTC | Merge branch 'develop' of github.com:piskvorky/gensim into develop | 30 June 2015, 12:49:34 UTC |
3cdc43c | Gordon Mohr | 30 June 2015, 09:39:31 UTC | Merge pull request #373 from gojomo/bdv_followups_pr smaller&faster neg-sampling table; reduce cython duplication; feedback tweaks | 30 June 2015, 09:39:31 UTC |
b31e94f | Gordon Mohr | 30 June 2015, 02:08:54 UTC | super(); LabeledSentence deprecation; intersect error louder | 30 June 2015, 02:08:54 UTC |
1a393b8 | Gordon Mohr | 26 June 2015, 09:57:17 UTC | cumulative table for neg-samples; local RandomState | 30 June 2015, 01:44:52 UTC |
b8dc13a | Gordon Mohr | 26 June 2015, 09:15:19 UTC | share declarations from word2vec_inner.[pyx|pxd] | 30 June 2015, 01:44:26 UTC |
1d5bd88 | Radim Řehůřek | 28 June 2015, 22:33:21 UTC | Merge pull request #356 from gojomo/bigdocvec_pr big doc-vector refactor/enhancements | 28 June 2015, 22:33:21 UTC |
b558262 | Gordon Mohr | 28 June 2015, 20:43:02 UTC | Merge pull request #6 from piskvorky/bigdocvec_pr pep8 & python2 fixes to doc2vec notebook | 28 June 2015, 20:43:02 UTC |
356c53a | Radim Řehůřek | 28 June 2015, 19:37:58 UTC | pep8 & python2 fixes to doc2vec notebook | 28 June 2015, 19:37:58 UTC |
5ecb6e2 | Radim Řehůřek | 27 June 2015, 14:53:26 UTC | Merge pull request #369 from S-Eugene/develop Fix utils.is_corpus removing the first item from a streaming corpus | 27 June 2015, 14:53:26 UTC |
71cd37d | Eugene S | 27 June 2015, 12:47:51 UTC | Fix utils.is_corpus removing the first item from a streaming corpus in python 3 | 27 June 2015, 12:47:51 UTC |
739fe31 | Gordon Mohr | 24 June 2015, 13:39:26 UTC | _lockf support in cython; test | 24 June 2015, 13:39:26 UTC |
1ed5e49 | Gordon Mohr | 24 June 2015, 11:48:26 UTC | for cbow-sum, divide error over all contributing vectors | 24 June 2015, 13:32:19 UTC |
d02b574 | Gordon Mohr | 24 June 2015, 11:22:22 UTC | only swap dot/saxpy – reduce redundancy | 24 June 2015, 11:22:22 UTC |
a1ed490 | Gordon Mohr | 24 June 2015, 10:52:24 UTC | reorder to respect ignores; move mmap_error (fixes unit tests) | 24 June 2015, 10:52:24 UTC |
19faaab | Gordon Mohr | 24 June 2015, 10:29:51 UTC | don't (try to) share __doc__ | 24 June 2015, 10:29:51 UTC |
f88beab | Gordon Mohr | 24 June 2015, 09:47:49 UTC | recursive SaveLoad for DocvecsArray numpys | 24 June 2015, 09:47:49 UTC |
46c81a3 | Gordon Mohr | 24 June 2015, 06:27:35 UTC | Merge remote-tracking branch 'upstream/develop' into bigdocvec_pr catchup SaveLoad before changes for DocvecsArray | 24 June 2015, 06:27:35 UTC |
548d94b | Radim Řehůřek | 21 June 2015, 21:11:08 UTC | Merge pull request #363 from ccwang002/pickle_py2k_compat Add pickle protocol as option in utils.SaveLoad | 21 June 2015, 21:11:08 UTC |
7bf380b | Liang Bo Wang | 21 June 2015, 13:31:42 UTC | Add pickle protocol as option in utils.SaveLoad Users now can specify desired pickle version when saving their models. Before the change, pickle version is set to the highest version available, now defaults to 2 to make models compatible across Python 2 and 3. For full report see issue #359. Close #359. | 21 June 2015, 13:31:42 UTC |
e970c70 | Radim Řehůřek | 21 June 2015, 12:47:58 UTC | Merge pull request #362 from ccwang002/pickle_py2k_compat Set default pickle protocol version to 2 | 21 June 2015, 12:47:58 UTC |
da8015b | Liang Bo Wang | 21 June 2015, 10:13:12 UTC | Set default pickle protocol version to 2 To make pickled objects compatible across Python 2 and 3, max pickle protocol version should be set at 2. For full report see issue #359. | 21 June 2015, 10:20:36 UTC |
09a30b3 | Gordon Mohr | 16 June 2015, 08:44:28 UTC | comments; sentence->document; ipynb tweaks | 16 June 2015, 08:44:28 UTC |
4e470af | mataddy | 15 June 2015, 20:15:06 UTC | re-org functions within files | 15 June 2015, 20:15:06 UTC |
aec4633 | mataddy | 15 June 2015, 20:08:57 UTC | clean up doc for score | 15 June 2015, 20:08:57 UTC |
5b63d49 | mataddy | 15 June 2015, 20:05:05 UTC | grab piskvorky latest doc2vec | 15 June 2015, 20:05:05 UTC |
c8a8e83 | mataddy | 15 June 2015, 20:04:21 UTC | grab piskvorky latest doc2vec | 15 June 2015, 20:04:21 UTC |
537e23b | mataddy | 15 June 2015, 19:57:51 UTC | Merge remote-tracking branch 'upstream/develop' into deepir | 15 June 2015, 19:57:51 UTC |
d3b6ca0 | mataddy | 15 June 2015, 19:57:22 UTC | upstream merge | 15 June 2015, 19:57:22 UTC |
564956a | mataddy | 15 June 2015, 19:56:48 UTC | Merge remote-tracking branch 'upstream/master' into deepir | 15 June 2015, 19:56:48 UTC |
ebed5af | mataddy | 15 June 2015, 14:42:25 UTC | cleanup while on airplane | 15 June 2015, 14:42:25 UTC |