Revision history - None - origin: https://github.com/RaRe-Technologies/gensim

Revision	Author	Date	Message	Commit Date
451d94f	Radim Řehůřek	06 July 2015, 19:34:31 UTC	Merge branch 'release-0.12.0'	06 July 2015, 19:34:31 UTC
7476ed2	Radim Řehůřek	06 July 2015, 19:23:19 UTC	re #385: default to no vocab pruning	06 July 2015, 19:30:10 UTC
a06d075	Radim Řehůřek	06 July 2015, 16:56:24 UTC	minor formatting fix	06 July 2015, 16:56:24 UTC
43a7ffd	Radim Řehůřek	06 July 2015, 15:57:22 UTC	Merge branch 'develop' of github.com:piskvorky/gensim into develop	06 July 2015, 15:57:22 UTC
142a599	Radim Řehůřek	06 July 2015, 15:56:04 UTC	up version: 0.12.0	06 July 2015, 15:56:04 UTC
f9afd3b	Radim Řehůřek	06 July 2015, 15:45:55 UTC	minor doc fixes	06 July 2015, 15:55:57 UTC
723443b	Radim Řehůřek	06 July 2015, 14:28:21 UTC	fix doc2vec unit test	06 July 2015, 15:48:31 UTC
61d16bb	Radim Řehůřek	06 July 2015, 10:11:50 UTC	fix log report during word2vec vocab building	06 July 2015, 15:48:31 UTC
37381f7	Radim Řehůřek	05 July 2015, 23:49:13 UTC	add max_vocab_size param to doc2vec too	06 July 2015, 15:48:30 UTC
7d8eba6	Radim Řehůřek	05 July 2015, 23:25:35 UTC	prune vocab during doc2vec vocab building too	06 July 2015, 15:48:30 UTC
ae243b3	Radim Řehůřek	05 July 2015, 22:53:35 UTC	prune word2vec vocab automatically if too large	06 July 2015, 15:48:30 UTC
aa72ffa	Radim Řehůřek	06 July 2015, 14:58:00 UTC	Merge pull request #385 from piskvorky/prune_vocab Prune vocab	06 July 2015, 14:58:00 UTC
6a8faa2	Radim Řehůřek	06 July 2015, 14:28:21 UTC	fix doc2vec unit test	06 July 2015, 14:28:21 UTC
e0a8e7d	Radim Řehůřek	06 July 2015, 14:24:12 UTC	improve word2vec unit tests	06 July 2015, 14:24:12 UTC
0094de5	Radim Řehůřek	06 July 2015, 10:50:31 UTC	Merge branch 'release-0.12.0rc1'	06 July 2015, 10:50:31 UTC
0d65960	Radim Řehůřek	06 July 2015, 10:48:18 UTC	bump up version: 0.12.0rc1	06 July 2015, 10:48:18 UTC
599b6ae	Radim Řehůřek	06 July 2015, 10:11:50 UTC	fix log report during word2vec vocab building	06 July 2015, 10:11:50 UTC
1c2d0b6	Radim Řehůřek	05 July 2015, 23:49:13 UTC	add max_vocab_size param to doc2vec too	05 July 2015, 23:49:13 UTC
882fc9e	Radim Řehůřek	05 July 2015, 23:25:35 UTC	prune vocab during doc2vec vocab building too	05 July 2015, 23:25:35 UTC
78fea7f	Radim Řehůřek	05 July 2015, 22:53:35 UTC	prune word2vec vocab automatically if too large	05 July 2015, 22:53:35 UTC
77d4def	Gordon Mohr	05 July 2015, 21:16:13 UTC	detailed d2v change bullets; ".docvecs" API note	05 July 2015, 21:16:13 UTC
aae80ba	Radim Řehůřek	05 July 2015, 19:56:25 UTC	update CHANGELOG	05 July 2015, 19:56:25 UTC
a9fad67	Radim Řehůřek	05 July 2015, 18:02:42 UTC	remove flaky unittest for LDA topic seeding	05 July 2015, 18:02:42 UTC
d0e5e74	Radim Řehůřek	05 July 2015, 18:01:45 UTC	Merge pull request #324 from summanlp/develop Module for automatic summarization	05 July 2015, 18:01:45 UTC
eb5d5fb	Gordon Mohr	05 July 2015, 14:57:50 UTC	Merge pull request #384 from gojomo/doc_progress_pr nicer progress-logging during vocab scan	05 July 2015, 14:57:50 UTC
24b4cc9	Radim Řehůřek	05 July 2015, 14:10:58 UTC	Merge branch 'develop' of github.com:piskvorky/gensim into develop	05 July 2015, 14:10:58 UTC
443985c	Radim Řehůřek	05 July 2015, 14:10:28 UTC	parametrize min/max token length in utils.lemmatize	05 July 2015, 14:10:28 UTC
a55b47b	Gordon Mohr	05 July 2015, 14:06:56 UTC	nicer progress-logging during vocab scan	05 July 2015, 14:06:56 UTC
ebf4d8a	Radim Řehůřek	05 July 2015, 13:13:11 UTC	Merge pull request #380 from gojomo/build_train_pr Split build_vocab to scan, scale, finalize; train() loop/locking refactor; downsampling into cython	05 July 2015, 13:13:11 UTC
ef8a12c	Federico Barrios	05 July 2015, 05:59:52 UTC	Adding summarization ratio test.	05 July 2015, 05:59:52 UTC
da382e9	Federico Barrios	05 July 2015, 05:37:21 UTC	Adding test for the corpus summarization feature.	05 July 2015, 05:37:21 UTC
dad4670	Gordon Mohr	05 July 2015, 04:40:22 UTC	many pep8 fixes	05 July 2015, 04:40:22 UTC
2268d20	Gordon Mohr	05 July 2015, 02:09:49 UTC	Merge remote-tracking branch 'upstream/develop' into build_train_pr	05 July 2015, 02:09:49 UTC
0feb366	Gordon Mohr	05 July 2015, 01:32:05 UTC	.gitignore cython_debug	05 July 2015, 01:32:05 UTC
60d35e8	Federico Barrios	04 July 2015, 19:06:29 UTC	Adding documentation. Fixing bug with the word_count parameter.	04 July 2015, 19:06:29 UTC
4083b89	Federico Barrios	04 July 2015, 18:40:02 UTC	Fixing bug that generated the graph two times. Changed method name.	04 July 2015, 18:40:02 UTC
b3e07ff	Gordon Mohr	03 July 2015, 23:39:09 UTC	progress_per to control logged progress; fix cum_table err on empty vocab	03 July 2015, 23:54:08 UTC
42ea4d0	Gordon Mohr	03 July 2015, 23:36:43 UTC	fix: repeat str doctags trigger bad indexes	03 July 2015, 23:36:43 UTC
e9e6246	Gordon Mohr	03 July 2015, 23:13:17 UTC	failing test: repeat str doctags trigger bad indexes	03 July 2015, 23:27:00 UTC
0f7ae51	Radim Řehůřek	03 July 2015, 22:58:21 UTC	re #383: fix topic seeding test	03 July 2015, 22:58:21 UTC
096a505	Radim Řehůřek	03 July 2015, 21:25:40 UTC	regenerate cython extensions	03 July 2015, 21:25:40 UTC
d148911	Radim Řehůřek	03 July 2015, 21:25:05 UTC	Merge branch 'develop' of github.com:piskvorky/gensim into develop	03 July 2015, 21:25:05 UTC
4905635	Radim Řehůřek	03 July 2015, 20:41:54 UTC	Merge pull request #383 from piskvorky/pr_281 Speed up doc2bow	03 July 2015, 20:41:54 UTC
fa8fe10	Christopher Corley	03 July 2015, 18:41:57 UTC	Fix testTopicSeeding for LDA models	03 July 2015, 18:41:57 UTC
306d0ca	Radim Řehůřek	03 July 2015, 14:12:00 UTC	checking what's wrong with topic seeding test	03 July 2015, 18:21:39 UTC
e7944ae	Radim Řehůřek	03 July 2015, 13:15:49 UTC	fix similarity test due to different dictionary order	03 July 2015, 18:21:39 UTC
765ab14	Radim Řehůřek	03 July 2015, 12:44:16 UTC	fix LSI test	03 July 2015, 18:21:39 UTC
9ca9d92	Christopher Corley	02 July 2015, 02:25:41 UTC	Fix 2.6 syntax issue with doc2ow	03 July 2015, 18:21:38 UTC
bd6bb57	Christopher Corley	02 July 2015, 02:15:02 UTC	Fixes issue when document given to doc2bow is a generator. Fixes test cases for similarities.	03 July 2015, 18:21:38 UTC
282c797	Lars Buitinck	15 January 2015, 15:32:56 UTC	speed up doc2bow by ~40%	03 July 2015, 18:21:38 UTC
c59013c	Radim Řehůřek	02 July 2015, 17:29:14 UTC	Merge pull request #358 from TaddyLab/deepir Sentence likelihood scores	02 July 2015, 17:29:14 UTC
381d45a	mataddy	02 July 2015, 15:01:44 UTC	oops; forgot to add log inport from numpy	02 July 2015, 15:01:44 UTC
d4a3e75	mataddy	02 July 2015, 14:59:42 UTC	merged latest changes	02 July 2015, 14:59:42 UTC
3d5180d	Radim Řehůřek	02 July 2015, 12:44:51 UTC	make matutils.argsort accept any sequence of numbers	02 July 2015, 12:44:51 UTC
84370a5	Gordon Mohr	02 July 2015, 12:28:27 UTC	plausible trained_item() behavior	02 July 2015, 12:40:07 UTC
d392f6f	Gordon Mohr	02 July 2015, 12:08:10 UTC	refactor loop: keep logging progress when pushing jobs done	02 July 2015, 12:40:07 UTC
f8260c1	Gordon Mohr	02 July 2015, 10:41:09 UTC	build_vocab split to scan, scale, finalize scale_vocab() offers 'dry_run' w/ estimated effects on vocab, corpus, memory	02 July 2015, 12:40:07 UTC
390c35b	Gordon Mohr	01 July 2015, 07:35:14 UTC	downsampling into train_/cython; train_ take word lists also: small pure-py fixes; no redundant None-checking; no codelens[i]==0 skip-convention	02 July 2015, 12:40:06 UTC
7650a0e	Gordon Mohr	30 June 2015, 23:32:15 UTC	ignore *.npz	02 July 2015, 12:37:30 UTC
2f711fb	Gordon Mohr	30 June 2015, 13:15:46 UTC	tally calls to train(), total_train_time	02 July 2015, 12:37:30 UTC
626225b	Gordon Mohr	30 June 2015, 07:25:43 UTC	refactor worker_train for less-locking & 0-worker mode	02 July 2015, 12:37:30 UTC
d2f48bb	Radim Řehůřek	02 July 2015, 12:26:07 UTC	use matutils.argsort consistently everywhere * makes a big performance difference when doing partial sorts (the `topn` parameter) * ditch the bottleneck code path, rely only on numpy * fixes #379	02 July 2015, 12:26:07 UTC
3fc6482	Radim Řehůřek	02 July 2015, 11:52:15 UTC	add GA to github README	02 July 2015, 11:52:15 UTC
342f10a	fedelopez77	01 July 2015, 23:33:54 UTC	Fix in test for python 2.6 compatibility	01 July 2015, 23:33:54 UTC
92e437d	mataddy	01 July 2015, 23:05:19 UTC	update prepare_sentences -> prepare_items	01 July 2015, 23:05:19 UTC
7caf535	fedelopez77	01 July 2015, 23:00:06 UTC	Merge from gensim/develop to the fork	01 July 2015, 23:00:06 UTC
8dae4a4	mataddy	01 July 2015, 20:37:00 UTC	merge with upstream/develop and resolve conflicts	01 July 2015, 20:37:00 UTC
ad12ec9	Radim Řehůřek	01 July 2015, 19:02:41 UTC	split C word2vec text format only on space (was: any whitespace) fixes #344	01 July 2015, 19:02:41 UTC
73d8167	Radim Řehůřek	01 July 2015, 18:11:46 UTC	py3k fix: remove forgotten iteritems in distributed LSI	01 July 2015, 18:11:46 UTC
570f08a	Radim Řehůřek	30 June 2015, 13:53:36 UTC	remove inline from word2vec pxd	30 June 2015, 13:53:36 UTC
4e98e68	Radim Řehůřek	30 June 2015, 12:49:34 UTC	Merge branch 'develop' of github.com:piskvorky/gensim into develop	30 June 2015, 12:49:34 UTC
3cdc43c	Gordon Mohr	30 June 2015, 09:39:31 UTC	Merge pull request #373 from gojomo/bdv_followups_pr smaller&faster neg-sampling table; reduce cython duplication; feedback tweaks	30 June 2015, 09:39:31 UTC
b31e94f	Gordon Mohr	30 June 2015, 02:08:54 UTC	super(); LabeledSentence deprecation; intersect error louder	30 June 2015, 02:08:54 UTC
1a393b8	Gordon Mohr	26 June 2015, 09:57:17 UTC	cumulative table for neg-samples; local RandomState	30 June 2015, 01:44:52 UTC
b8dc13a	Gordon Mohr	26 June 2015, 09:15:19 UTC	share declarations from word2vec_inner.[pyx|pxd]	30 June 2015, 01:44:26 UTC
1d5bd88	Radim Řehůřek	28 June 2015, 22:33:21 UTC	Merge pull request #356 from gojomo/bigdocvec_pr big doc-vector refactor/enhancements	28 June 2015, 22:33:21 UTC
b558262	Gordon Mohr	28 June 2015, 20:43:02 UTC	Merge pull request #6 from piskvorky/bigdocvec_pr pep8 & python2 fixes to doc2vec notebook	28 June 2015, 20:43:02 UTC
356c53a	Radim Řehůřek	28 June 2015, 19:37:58 UTC	pep8 & python2 fixes to doc2vec notebook	28 June 2015, 19:37:58 UTC
5ecb6e2	Radim Řehůřek	27 June 2015, 14:53:26 UTC	Merge pull request #369 from S-Eugene/develop Fix utils.is_corpus removing the first item from a streaming corpus	27 June 2015, 14:53:26 UTC
71cd37d	Eugene S	27 June 2015, 12:47:51 UTC	Fix utils.is_corpus removing the first item from a streaming corpus in python 3	27 June 2015, 12:47:51 UTC
739fe31	Gordon Mohr	24 June 2015, 13:39:26 UTC	_lockf support in cython; test	24 June 2015, 13:39:26 UTC
1ed5e49	Gordon Mohr	24 June 2015, 11:48:26 UTC	for cbow-sum, divide error over all contributing vectors	24 June 2015, 13:32:19 UTC
d02b574	Gordon Mohr	24 June 2015, 11:22:22 UTC	only swap dot/saxpy – reduce redundancy	24 June 2015, 11:22:22 UTC
a1ed490	Gordon Mohr	24 June 2015, 10:52:24 UTC	reorder to respect ignores; move mmap_error (fixes unit tests)	24 June 2015, 10:52:24 UTC
19faaab	Gordon Mohr	24 June 2015, 10:29:51 UTC	don't (try to) share __doc__	24 June 2015, 10:29:51 UTC
f88beab	Gordon Mohr	24 June 2015, 09:47:49 UTC	recursive SaveLoad for DocvecsArray numpys	24 June 2015, 09:47:49 UTC
46c81a3	Gordon Mohr	24 June 2015, 06:27:35 UTC	Merge remote-tracking branch 'upstream/develop' into bigdocvec_pr catchup SaveLoad before changes for DocvecsArray	24 June 2015, 06:27:35 UTC
548d94b	Radim Řehůřek	21 June 2015, 21:11:08 UTC	Merge pull request #363 from ccwang002/pickle_py2k_compat Add pickle protocol as option in utils.SaveLoad	21 June 2015, 21:11:08 UTC
7bf380b	Liang Bo Wang	21 June 2015, 13:31:42 UTC	Add pickle protocol as option in utils.SaveLoad Users now can specify desired pickle version when saving their models. Before the change, pickle version is set to the highest version available, now defaults to 2 to make models compatible across Python 2 and 3. For full report see issue #359. Close #359.	21 June 2015, 13:31:42 UTC
e970c70	Radim Řehůřek	21 June 2015, 12:47:58 UTC	Merge pull request #362 from ccwang002/pickle_py2k_compat Set default pickle protocol version to 2	21 June 2015, 12:47:58 UTC
da8015b	Liang Bo Wang	21 June 2015, 10:13:12 UTC	Set default pickle protocol version to 2 To make pickled objects compatible across Python 2 and 3, max pickle protocol version should be set at 2. For full report see issue #359.	21 June 2015, 10:20:36 UTC
09a30b3	Gordon Mohr	16 June 2015, 08:44:28 UTC	comments; sentence->document; ipynb tweaks	16 June 2015, 08:44:28 UTC
4e470af	mataddy	15 June 2015, 20:15:06 UTC	re-org functions within files	15 June 2015, 20:15:06 UTC
aec4633	mataddy	15 June 2015, 20:08:57 UTC	clean up doc for score	15 June 2015, 20:08:57 UTC
5b63d49	mataddy	15 June 2015, 20:05:05 UTC	grab piskvorky latest doc2vec	15 June 2015, 20:05:05 UTC
c8a8e83	mataddy	15 June 2015, 20:04:21 UTC	grab piskvorky latest doc2vec	15 June 2015, 20:04:21 UTC
537e23b	mataddy	15 June 2015, 19:57:51 UTC	Merge remote-tracking branch 'upstream/develop' into deepir	15 June 2015, 19:57:51 UTC
d3b6ca0	mataddy	15 June 2015, 19:57:22 UTC	upstream merge	15 June 2015, 19:57:22 UTC
564956a	mataddy	15 June 2015, 19:56:48 UTC	Merge remote-tracking branch 'upstream/master' into deepir	15 June 2015, 19:56:48 UTC
ebed5af	mataddy	15 June 2015, 14:42:25 UTC	cleanup while on airplane	15 June 2015, 14:42:25 UTC