Revision history - refs/heads/release-4.1.1 - origin: https://github.com/RaRe-Technologies/gensim

Revision	Author	Date	Message	Commit Date
1d2af3a	Michael Penkov	14 September 2021, 12:40:53 UTC	patch conf.py	14 September 2021, 12:40:53 UTC
4c751f1	Michael Penkov	14 September 2021, 12:39:30 UTC	updated CHANGELOG.md for version 4.1.1	14 September 2021, 12:39:38 UTC
5e258c8	Michael Penkov	14 September 2021, 12:35:58 UTC	bumped version to 4.1.1	14 September 2021, 12:35:58 UTC
96add7f	Michael Penkov	11 September 2021, 23:34:01 UTC	fix typo in build matrix	11 September 2021, 23:34:01 UTC
83a67af	Michael Penkov	11 September 2021, 14:07:28 UTC	tweak push.sh release script	11 September 2021, 14:07:28 UTC
fae805d	Michael Penkov	11 September 2021, 14:07:20 UTC	always upload wheels, if possible	11 September 2021, 14:07:20 UTC
3de23ce	Michael Penkov	11 September 2021, 12:49:41 UTC	expand build matrix to specify numpy version explicitly	11 September 2021, 12:49:41 UTC
7494910	Michael Penkov	10 September 2021, 11:44:58 UTC	ok, maybe not _that_ much verbosity	10 September 2021, 11:44:58 UTC
6388245	Michael Penkov	10 September 2021, 11:25:24 UTC	increase verbosity of commands	10 September 2021, 11:25:24 UTC
8373b86	Michael Penkov	09 September 2021, 12:39:39 UTC	updated multibuild to 4e30a05	09 September 2021, 12:39:39 UTC
fb0c722	Michael Penkov	08 September 2021, 13:26:40 UTC	more tweaks to build-wheels.yml for windows	08 September 2021, 13:26:40 UTC
a6080ed	Michael Penkov	08 September 2021, 13:17:25 UTC	fixup in install_wheel.py to correctly find dist subdir	08 September 2021, 13:17:25 UTC
e106ac3	Michael Penkov	08 September 2021, 13:11:33 UTC	ffs windows	08 September 2021, 13:11:33 UTC
201f4db	Michael Penkov	08 September 2021, 13:01:21 UTC	more appveyor -> github actions syntax tweaks	08 September 2021, 13:01:21 UTC
e2ab5af	Michael Penkov	08 September 2021, 12:58:27 UTC	fix github actions syntax	08 September 2021, 12:58:27 UTC
18bc464	Michael Penkov	08 September 2021, 12:56:40 UTC	s/runner/matrix/	08 September 2021, 12:56:40 UTC
df4292a	Michael Penkov	08 September 2021, 12:53:16 UTC	copy windows build script from gensim-wheels repo	08 September 2021, 12:53:16 UTC
d9313f4	Michael Penkov	08 September 2021, 12:31:58 UTC	simplify ubuntu-latest part of the build matrix	08 September 2021, 12:31:58 UTC
cf246ff	Michael Penkov	08 September 2021, 12:08:45 UTC	enable windows for build-wheels workflow	08 September 2021, 12:08:45 UTC
ce14d10	Michael Penkov	08 September 2021, 12:04:13 UTC	use oldest-supported-numpy in github workflow	08 September 2021, 12:04:13 UTC
919b415	Michael Penkov	29 August 2021, 22:18:57 UTC	bump version to next dev release	29 August 2021, 22:18:57 UTC
c0f384c	Michael Penkov	28 August 2021, 12:22:12 UTC	Fix wheel building on Travis CI (#3219) * misc updates to travis.yml - Copy some comment wisdom from older gensim-wheels repo - Use oldest-supported-numpy - Disable email notifications, I never read them anyway * bump scipy version for py3.9 aarch64 build * use newer scipy version where available	28 August 2021, 12:22:12 UTC
cf29beb	Michael Penkov	28 August 2021, 11:09:57 UTC	update build-wheels.yml, include py3.6 build in matrix	28 August 2021, 11:09:57 UTC
b3e820b	Michael Penkov	18 August 2021, 12:55:44 UTC	add credentials for wheelhouse uploader	18 August 2021, 12:55:44 UTC
c055e29	Michael Penkov	18 August 2021, 12:25:47 UTC	update build-wheels.yml bump manylinux version to 2010 build wheels for py3.9	18 August 2021, 12:26:03 UTC
ede3c2a	Michael Penkov	18 August 2021, 12:13:08 UTC	dummy commit to trigger build-wheels actions workflow	18 August 2021, 12:13:08 UTC
9db34f0	Michael Penkov	18 August 2021, 08:10:36 UTC	update circleCI image to python:3.8.11 (#3215) * update circleCI image to python:3.8.11 * s/3.7/3.8/	18 August 2021, 08:10:36 UTC
fa8d707	Michael Penkov	14 August 2021, 22:33:18 UTC	Merge branch 'master' into develop	14 August 2021, 22:33:18 UTC
109c88e	Michael Penkov	14 August 2021, 22:33:06 UTC	Merge branch 'release-4.1.0'	14 August 2021, 22:33:06 UTC
1bb426a	Michael Penkov	14 August 2021, 22:31:45 UTC	patch docs/src/conf.py	14 August 2021, 22:31:45 UTC
6542080	Michael Penkov	14 August 2021, 22:29:14 UTC	update changelog for version 4.1.0	14 August 2021, 22:29:14 UTC
4c48968	Michael Penkov	14 August 2021, 12:59:03 UTC	bumped version to 4.1.0	14 August 2021, 12:59:03 UTC
15fb53f	Michael Penkov	14 August 2021, 10:55:46 UTC	Update CHANGELOG.md (#3214) * Update CHANGELOG.md * Update CHANGELOG.md Co-authored-by: Radim Řehůřek <radimrehurek@seznam.cz>	14 August 2021, 10:55:46 UTC
247da47	Simon Wiles	13 August 2021, 12:00:09 UTC	Tidy up KeyedVectors.most_similar() API (#3000) * Allow supplying a string-key as the negative arg. to most_similar() * Allow a single vector as a positive or negative arg. to most_similar() * Update comments * Accept single arguments when positive and negative are both supplied * Update most_similar_cosmul to match most_similar I'm not sure if this fully addresses the `# TODO: Update to better match & share code with most_similar()` at line #981 or not, so I've left it in. * minor code cleanup * add unit tests * Update CHANGELOG.md * remove redundant variable declaration * enforce consistency * respond to review feedback * Update keyedvectors.py Co-authored-by: Michael Penkov <misha.penkov@gmail.com> Co-authored-by: Michael Penkov <m@penkov.dev>	13 August 2021, 12:00:09 UTC
6129a24	Ayan Saha	12 August 2021, 02:11:23 UTC	Isolate generic preprocessing functions (#3180) * Move preprocessing functions from textcourpus module * Move preprocessing functions from lowcorpus module * Add test cases for preprocessing functions * Fix styling issues * Refactor remove_stopwords() and strip_short() * make tests pass * rm unused import Co-authored-by: Michael Penkov <m@penkov.dev>	12 August 2021, 02:11:23 UTC
266a014	Radim Řehůřek	07 August 2021, 11:45:38 UTC	Merge pull request #3203 from RaRe-Technologies/changelog42 [MRG] Update CHANGELOG for the Gensim 4.1.0 release	07 August 2021, 11:45:38 UTC
3d1c7b8	Radim Řehůřek	07 August 2021, 07:47:27 UTC	avoid github API throttling	07 August 2021, 07:47:27 UTC
11c3fba	Radim Řehůřek	07 August 2021, 07:42:33 UTC	hanging indent + improve docs	07 August 2021, 07:42:33 UTC
cfa4f58	Michael Penkov	27 July 2021, 06:15:22 UTC	removed KEY_TYPES description from change log	27 July 2021, 06:15:22 UTC
8cd5ad3	Michael Penkov	27 July 2021, 03:31:00 UTC	Mention backward incompatibilities	27 July 2021, 03:31:00 UTC
982aa1f	Radim Řehůřek	26 July 2021, 13:35:11 UTC	update CHANGELOG	26 July 2021, 13:35:11 UTC
6cdcb01	Radim Řehůřek	26 July 2021, 12:56:16 UTC	reduce MAX_WORD_LENGTH in FastSS Levenshtein	26 July 2021, 12:56:16 UTC
fbf776b	Radim Řehůřek	26 July 2021, 12:56:05 UTC	improve docs	26 July 2021, 12:56:05 UTC
4837683	Michael Penkov	23 July 2021, 07:12:33 UTC	git add release/hijack_pr.py	23 July 2021, 07:12:33 UTC
76579b3	sezanzeb	22 July 2021, 12:32:19 UTC	EnsembleLda (#2980) * added EnsembleLda * improvements to add_model, various small changes to comments and code * pandas -> numpy: group by label and mean * pandas -> numpy: generate_stable_topics * pandas -> numpy: distance matrix creation * pandas -> numpy: CBDBSCAN * fixes for automated checks * improvements on logs, comments and variable naming. Changed save function to simply pickling the whole thing. * minor fix in log message format * added tests * fixed test * removed some dead leftover pandas code from test * removed pathlib from test * tests work in python2 locally now * updated ensemble test reference model * passing tox8 * improved determinism of methods * improved order of assertions * trying to achieve higher precision with float64 to avoid some sorting differences across architectures * better approach for comparing with pretrained model * potentially fixing the tests on windows * potentially fixing the tests on windows * changed citation of opinosis * tox8 test passing after small change on opinosis comments/citation * Moving max_random_state inside the model as a private variable. * removed whitespace * docstring width * sphinx udpate * fixed urls to sphinx notation * changed doc strings, number --> int + some sphinx * Removed hanging indents. * improved topic_model_kind type checking * Sphinx and docstring updates. * review stuff * removed unneccessary comments * Update gensim/models/ensemblelda.py Co-Authored-By: Michael Penkov <m@penkov.dev> * removed paranthesis * review * refactor private, hanging indent * typo * Clarifications to ttda in docstrings and in method docstrings. 
* docstrings, masks explained and mask warning removed * created internal variable for cosine distance calculations * cbdbscan docstring * moved validate_core outside * added citation note * moved more stuff outside of _generate_stable_topics * typos * explained CBDBSCAN * added extra explanation: * using none instead of nan for unchecked core * updated docs * refactored kind to class, fixed check how to proceed with topic_model_class * reverted change that accidentally broke things * fixed tests locally * fix code style * added _is_easy_valid_cluster * updated thesis reference * updated notebook example, typo * docstring styles, renaming, cleanup, stuff I need to discuss first * tox * fixed stuff in CBDBSCAN * removed unused results column and only CB-Distance to other cores * tox whitespace * cleaned obsolete stuff from cbdbscan * idk * updated doc-strings to be clearer and better reflect the truth * make flake8 happy * fix trailing whitespace * reverted some changes * comma, newline, comment * whitespace * citation, reference, authors * potential fix for utils saveload when a class is in __dict__ * commented out eLDA tests, tox8 * saving the topic_model_class using a string instead * reverted utils * saving the topic_model_class using a string instead fixes * tox * quotes in logger.error * multiline string * python 3.5 format strings * ModuleNotFoundError: No module named 'numpy.random._pickle' * ModuleNotFoundError: No module named 'numpy.random._pickle' x2 * fixed inference * removed print asdf * added spec for inference * tox * lazy loading topic_model_class * tox * removed debug thing * Documents now compile * escape sequence thing indent fix * Better document rendering and added opinosiscorpus to apirefs * citation opinosis * missing opinosiscorpus.rst file committed * p names refactored to be descriptive, now using append for appending singletons to list * Changing to hanging indents where they were not used before * Adding :meth: and `` `` styling for RST 
* a bunch of reviews * * Changed ensemblelda default to use ldamulticore instead of old lda model. * Added more :meth: and `` `` styling for RST, fixed a few typos * More docstring polish * removing some camel-case vars for pep8 compliance. * a bunch of reviews * fixed linter * less precision for windows * hanging indents * somehow recognized some old versions of unrelated files as current changes * fixed autoformat of IDEA * same thing for tox.ini * hanging indents in opinosis notebook * I hate windows * test for LdaMulcitore ensemble similarity * no * fixed wrong max calculation in loop * improved comment for sorting clusters * added ensemblelda tutorial * added test that starts with an empty ensemble * some logging of auto parameters * update auto_examples * changed eps in example * added to tutorials * rebuilt * update docs * attempt at fixing that pickle problem * merge * idk * review * stream documents instead of loading all into memory * streaming docs instead of loading all into memory * .format replaced with f-string * docstring * tuple variable expansion * removed duplicate string in module docstring * removed obsolete parent connection comment * reliable -> stable * topic model intro v1 * topic model intro v2 * topic model intro v3 * elda introduction with references * Update gensim/corpora/opinosiscorpus.py Co-authored-by: Michael Penkov <m@penkov.dev> * Update gensim/models/ensemblelda.py Co-authored-by: Michael Penkov <m@penkov.dev> * update docstrings * static functions * module-level constant * assert and no return * static _calculate_asymmetric_distance_matrix_chunk * better variable names and data types * refactoring * pythonic varnames + pytest style asserts * better var names * better var names * simplified some function calls to use attributes instead of parameters * sort key function * more efficient tests with better case names * new reference model * updated opinosis example * tox * using dataclasses * updated type syntax for docstring * 
unused import * update sbt install step * minor refactoring Decoupled multiprocessing code from EnsembleLda class. This reduces the length of the class by several hundred lines, making it slightly easier to understand. Added _generate_topic_models_worker function to clarify distinction between single-process and multi-process code. Fixed flake8 problem (l is an ambiguous variable name) Adjusted _teardown function (removed i parameter, it's only for logs) Moved _MAX_RANDOM_STATE to module level * roll back change to docs/src/Makefile * re-raise caught exception instead of raising a new one we don't want to hide the details of the problem * add docstring Co-authored-by: Alex Loosley <aloosley@alumni.brown.edu> Co-authored-by: sezanzeb <proxmia@hip70890bb.de> Co-authored-by: sezanzeb <proxmia@hip70890b.de> Co-authored-by: Michael Penkov <m@penkov.dev> Co-authored-by: aloosley <a.loosley@rubinstein-schmiedel.com> Co-authored-by: Radim Řehůřek <radimrehurek@seznam.cz> Co-authored-by: Michael Penkov <misha.penkov@gmail.com>	22 July 2021, 12:32:19 UTC
b287fd8	M-Demay	19 July 2021, 06:50:25 UTC	Vectorize word2vec.predict_output_word for speed (#3153) * [Fix] gensim/models/word2vec.py: in method predict_output_word, changed a call to sum to numpy.sum to gain performance. * [Feat] gensim.models.word2vec.Word2Vec.predict_output_word: added possibility for the user to input a list of word indices as parameter 'context' instead of a list of words. * Word2Vec.predict_output_word: Changed handling of ints and strs, trying to trying to make it more compact and versatile. * Fixed docstring of predict_output_word. * Simplified `predict_output_word` changes. * Retained the suggested `sum`->`np.sum` replacement, which has been tested to yield significant runtime gains. * Dropped unnecessary type/value checks that are already run when calling the `KeyedVectors.__isin__` dunder method. * Corrected the docstring to accurately document the supported inputs (which were already compatible prior to the PR this commit is a part of). * Added tests for gensim.Word2Vec.predict_output_word() when context contains ints. * Update CHANGELOG.md * update sbt install step Co-authored-by: Mathis <mathis.demay@protonmail.com> Co-authored-by: Paul Andrey <paul.andrey@hotmail.fr> Co-authored-by: Mathis Demay <mathis.demay.etu@univ-lille.fr> Co-authored-by: Michael Penkov <m@penkov.dev>	19 July 2021, 06:50:25 UTC
a93067d	Vít Novotný	29 June 2021, 09:09:24 UTC	New KeyedVectors.vectors_for_all method for vectorizing all words in a dictionary (#3157) * Add KeyedVectors.vectors_for_all * Add examples for KeyedVectors.vectors_for_all * Support Dictionary in KeyedVectors.vectors_for_all * Don't sort keys in KeyedVectors.vectors_for_all, just deduplicate * Use docstrings in imperative mode (PEP8) Co-authored-by: Radim Řehůřek <me@radimrehurek.com> * Guard against KeyError in KeyedVectors.vectors_for_all * Unit-test dictionary parameter of KeyedVectors.vectors_for_all * Order dictionary by decreasing cfs in KeyedVectors.vectors_for_all * Add allow_inference parameter to KeyedVectors.vectors_for_all * Add copy_vecattrs parameter to KeyedVectors.vectors_for_all * Move copy_vecattrs tests for KeyedVectors.vectors_for_all * Fix translation of term ids to terms in KeyedVectors.vectors_for_all * Fix a typo in KeyedVectors.vectors_for_all unit test * Do not make assumptions about fake counts in _add_word_to_kv * Document that KeyedVectors.vectors_for_all allows arbitrary keys * Add notes about the behavior of KeyedVectors.vectors_for_all * Properly reference Dictionary in KeyedVectors.vectors_for_all docstring * Make deduplication in KeyedVectors.vectors_for_all a oneliner * Remove an unnecessary temporary variable in KeyedVectors.vectors_for_all * Make deduplication in KeyedVectors.vectors_for_all a oneliner (cont.) * Add Dictionary.most_common * Remove test_vectors_for_all_dictionary unit test * Remove a trailing bracket in an example * Fix unit tests for Dictionary.most_common * Update an example for SparseTermSimilarityMatrix * Remove Gensim downloader from KeyedVectors.vectors_for_all example * Remove include_counts parameter from Dictionary.most_common * Shorten the KeyedVectors.vectors_for_all example * Remove include_counts parameter from Dictionary.most_common (cont.) 
* Use pytest assertion syntax in unit tests * Remove an unnecessary comment in KeyedVectors.vectors_for_all * Remove an unnecessary comment in KeyedVectors.vectors_for_all Co-authored-by: Michael Penkov <m@penkov.dev> * Remove an unnecessary variable in KeyedVectors.vectors_for_all * Make the creation of new vocab in KeyedVectors.vectors_for_all explicit * Make AnnoyIndexer use the correct word-vectors in example * Apply suggestions from code review * Apply suggestions from code review * Update CHANGELOG.md Co-authored-by: Radim Řehůřek <me@radimrehurek.com> Co-authored-by: Michael Penkov <m@penkov.dev>	29 June 2021, 09:09:24 UTC
2a41200	Michael Penkov	29 June 2021, 06:11:16 UTC	polishing up after #3169 The repo wasn't accepting maintainer commits, so I'm taking care of this here.	29 June 2021, 06:13:27 UTC
2b9b1b3	M-Demay	29 June 2021, 06:09:41 UTC	Implement `shrink_windows` argument for Word2Vec. (#3169) * Implemented `reduced_windows` argument for Word2Vec. Co-Authored-By: Mathis Demay <mathis.demay.md@gmail.com> * Improve the way `reduced_windows` is passed around and used. * Renamed `reduced_windows` to `shrink_windows`. * Removed `shrink_windows` argument from `Word2Vec.train`. * Aesthetic fix. * Fixed old word2vec models' reloading. * Fixed undue docstring. * Added `shrink_windows` argument to Doc2Vec. * Added `shrink_windows` argument to FastText. * Fixed and optimized `shrink_windows` backend use. * `c.reduced_windows[:] = 0` syntax is not supported; as a consequence, all zero-value assignments due to `shrink_windows=False` have been rewritten as `for` loops. * Those changes have now been tested (cf. next commit). * NOTE: another way to proceed could be to initialize the `reduced_windows` array with zeros as values; it would then only be altered when `shrink_windows=True`. * Added tests for `shrink_windows=False` in Word2Vec-based models. * Added docstring mentions of `shrink_window` being experimental. * Rolled back some purely aesthetic changes. Co-authored-by: Paul Andrey <paul.andrey@hotmail.fr>	29 June 2021, 06:09:41 UTC
a164685	Vít Novotný	29 June 2021, 05:23:41 UTC	Fix Unicode string incompatibility in gensim.similarities.fastss.editdist (#3178) * Do not expect the same Unicode type in editdist * Unit-test editdist * Use pytest assertion syntax in unit tests * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	29 June 2021, 05:23:41 UTC
52fade6	Pietro	29 June 2021, 05:00:11 UTC	Fixed KeyError in coherence model (#2830) * Fixed coherence model issue #2711 * Handled token or id formatting of topics * Raised error with wrong formatting * removed blank lines * updated code * updated code * revision on coherencemodel.py * added new tests * rm trailing whitespace * more flake8 fixes * still more flake8 fixes * update changelog Co-authored-by: Michael Penkov <misha.penkov@gmail.com>	29 June 2021, 05:00:11 UTC
b378b1b	Takanori Hayashi	29 June 2021, 01:47:51 UTC	Optimize word mover distance (WMD) computation (#3163) * Faster WMD computation by removing a nested loop * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	29 June 2021, 01:47:51 UTC
2f23566	sciatro	29 June 2021, 01:44:30 UTC	Remove strip_punctuation2 alias of strip_punctuation (#2965) * Remove strip_punctuation2 Re Issue 2961 * Remove strip_punctuation2 alias Re Issue 2961, remove strip_punctuation alias strip_punctuation2 which makes a mess of docs * Move strip_punctuation2 to strip_punctuation Re Issue 2961, remove use of strip_punctuation2 function which was an alias of strip_punctuation * reorganize imports * make flake8 happy * Update CHANGELOG.md Co-authored-by: Michael Penkov <misha.penkov@gmail.com> Co-authored-by: Michael Penkov <m@penkov.dev>	29 June 2021, 01:44:30 UTC
dab0369	sciatro	29 June 2021, 01:42:38 UTC	Document that preprocessing.strip_punctuation is limited to ASCII (#2964) * Clarifying strip_punctuation limited to ASCII Add ASCII as qualification on `strip_punctuation` doc string. This is "option 1" fix for issue #2962 * Added code comment pointing to issue 2962 Code comment added linking to issue #2962 as a reminder of enhancement possibilities. * update CHANGELOG.md Co-authored-by: Michael Penkov <misha.penkov@gmail.com>	29 June 2021, 01:42:38 UTC
d59a241	Ayan Saha	29 June 2021, 00:55:42 UTC	Eliminate obsolete step parameter from doc2vec infer_vector and similarity_unseen_docs (#3176) * Fix: eliminate step params * Fix: typo in doc2vec.infer_vector() documentation * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	29 June 2021, 00:55:42 UTC
bdcd100	Emgu	16 June 2021, 08:42:33 UTC	Fix a bug when upgrading phraser from gensim 3.x to 4.0 (#3174) * Fix a bug when upgrading phraser from gensim 3.x to 4.0 * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	16 June 2021, 08:42:33 UTC
57b3af3	Paul Wise	01 June 2021, 01:56:45 UTC	Allow newer versions of the Morfessor module for the tests (#2952) The tests that use Morfessor also pass with version 2.0.6, which is the latest version at this time.	01 June 2021, 01:56:45 UTC
3b36c8a	Rohit K Bharadwaj	01 June 2021, 01:46:13 UTC	fix broken link in documentation (#3148) * fixes broken link in documentation * make -C docs/src html * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	01 June 2021, 01:46:13 UTC
f3f6d2d	Michael Penkov	01 June 2021, 01:45:07 UTC	replace _mul function with explicit casts (#3143) * replace _mul function with explicit casts * git rm 16	01 June 2021, 01:45:07 UTC
383fc48	Bisola Olasehinde	25 May 2021, 01:54:31 UTC	Correct parameter name in documentation of fasttext.py (#3155) * Documentation: fix #3151 Correct parameter name in documentation * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	25 May 2021, 01:54:31 UTC
82a2968	Radim Řehůřek	24 May 2021, 12:46:45 UTC	Merge pull request #3156 from PrimozGodec/develop Update Numpy minimum version to 1.17.0	24 May 2021, 12:46:45 UTC
8104ff0	Primož Godec	24 May 2021, 07:34:27 UTC	Update Numpy minimum version to 1.17.0	24 May 2021, 07:34:27 UTC
29fecbf	Radim Řehůřek	20 May 2021, 10:14:11 UTC	Merge pull request #3146 from Witiko/levenshtein-ball-tree Use FastSS for fast kNN over Levenshtein distance	20 May 2021, 10:14:11 UTC
7655d75	Radim Řehůřek	19 May 2021, 19:16:36 UTC	rewrite the editdist function (Levenshtein) in C	20 May 2021, 09:58:34 UTC
86e8a25	Radim Řehůřek	19 May 2021, 00:09:44 UTC	FastSS cleanup	19 May 2021, 00:09:44 UTC
ae91204	Radim Řehůřek	18 May 2021, 21:51:18 UTC	make max_distance=2 the default in LevenshteinSimilarityIndex	18 May 2021, 21:51:18 UTC
7054f90	Radim Řehůřek	18 May 2021, 21:08:09 UTC	update Cython to 0.29.23	18 May 2021, 21:17:16 UTC
9381965	Radim Řehůřek	18 May 2021, 19:48:02 UTC	remove dead code	18 May 2021, 19:52:57 UTC
05284d1	Radim Řehůřek	18 May 2021, 19:22:15 UTC	cythonize FastSS	18 May 2021, 19:41:58 UTC
6c4abc5	Radim Řehůřek	17 May 2021, 20:40:59 UTC	clean up FastSS & logging	17 May 2021, 20:40:59 UTC
de2ec13	Radim Řehůřek	16 May 2021, 22:32:14 UTC	minor doc fixes	16 May 2021, 22:32:14 UTC
da501b5	Radim Řehůřek	16 May 2021, 22:09:12 UTC	update docs	16 May 2021, 22:09:12 UTC
9e614c0	Radim Řehůřek	16 May 2021, 21:42:33 UTC	Merge branch 'levenshtein-ball-tree' of github.com:Witiko/gensim into levenshtein-ball-tree	16 May 2021, 21:42:33 UTC
18fe2a2	Radim Řehůřek	16 May 2021, 21:41:25 UTC	clarify + add comments	16 May 2021, 21:41:25 UTC
c91bda5	Vít Novotný	16 May 2021, 21:32:24 UTC	Eagerly filter out zero similarities	16 May 2021, 21:32:24 UTC
80cdb7e	Vít Novotný	16 May 2021, 21:21:22 UTC	Suggest max_distance < 3	16 May 2021, 21:21:22 UTC
e2e1d9f	Vít Novotný	16 May 2021, 21:16:59 UTC	Suggest max_distance <= 2	16 May 2021, 21:17:31 UTC
6350a22	Vít Novotný	16 May 2021, 21:13:56 UTC	Merge branch 'levenshtein-ball-tree' of github.com:Witiko/gensim into levenshtein-ball-tree	16 May 2021, 21:13:56 UTC
22a0221	Vít Novotný	16 May 2021, 21:13:02 UTC	Silence flake8 about fastss	16 May 2021, 21:13:02 UTC
362f458	Radim Řehůřek	16 May 2021, 21:09:47 UTC	reintroduce max_dist to FastSS query	16 May 2021, 21:09:47 UTC
80b99d0	Vít Novotný	16 May 2021, 21:08:47 UTC	Merge branch 'levenshtein-ball-tree' of github.com:Witiko/gensim into levenshtein-ball-tree	16 May 2021, 21:08:47 UTC
6c2d033	Vít Novotný	16 May 2021, 21:08:04 UTC	Return only similarities greater than zero	16 May 2021, 21:08:04 UTC
b675507	Radim Řehůřek	16 May 2021, 21:05:26 UTC	clean up FastSS	16 May 2021, 21:05:26 UTC
567b0a4	Vít Novotný	16 May 2021, 20:36:11 UTC	Replace DAWG with TinyFastSS	16 May 2021, 20:51:27 UTC
ba36d01	Vít Novotný	16 May 2021, 17:51:31 UTC	Apply suggestions from code review Co-authored-by: Radim Řehůřek <me@radimrehurek.com>	16 May 2021, 18:09:15 UTC
48cb664	Vít Novotný	16 May 2021, 14:43:50 UTC	Implement a back-off strategy for max_distance	16 May 2021, 15:14:46 UTC
40b96bc	Vít Novotný	16 May 2021, 13:17:56 UTC	Improve unit tests for the levenshtein module	16 May 2021, 13:46:04 UTC
80ec65f	Vít Novotný	16 May 2021, 12:29:30 UTC	Remove itertools import	16 May 2021, 12:29:30 UTC
fb98a43	Vít Novotný	15 May 2021, 23:44:14 UTC	Use DAWG for fast approximate kNN over Levenshtein distance	16 May 2021, 00:23:20 UTC
af5833d	Vít Novotný	15 May 2021, 16:08:54 UTC	Use VP-Tree for fast kNN over Levenshtein distance	15 May 2021, 23:50:03 UTC
5a116db	dymil	14 May 2021, 14:11:12 UTC	Use more permanent pdf link and update code link (#3142)	14 May 2021, 14:11:12 UTC
69ba51e	dymil	14 May 2021, 03:11:28 UTC	Update link for online LDA paper (#3141) * Update link for online LDA paper * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	14 May 2021, 03:11:28 UTC
064c2a8	Michael Penkov	13 May 2021, 02:10:24 UTC	Update CHANGELOG.md	13 May 2021, 02:10:24 UTC
583be07	Michael Penkov	13 May 2021, 01:28:26 UTC	consolidate formatting of changelog entries	13 May 2021, 01:28:26 UTC
ec95fe4	Jinhyuk Yun	13 May 2021, 01:16:18 UTC	fix indexing error in word2vec_inner.pyx (#3136) * Add explicit typecasting for the W2V cython code * Add explicit typecasting for the W2V cython code * Unifying the types * Fix minor code style issues * Code refactoring * Update CHANGELOG.md Co-authored-by: Jinhyuk Yun <jhyun@ssu> Co-authored-by: Michael Penkov <m@penkov.dev>	13 May 2021, 01:16:18 UTC
52281eb	horpto	12 May 2021, 14:20:22 UTC	Optimize performance of Author-Topic model (#2978) * take out common part of expression in ATModel * Update atmodel.py * Update atmodel.py * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	12 May 2021, 14:20:22 UTC
69b8cb5	Jonathan Schneider	12 May 2021, 13:33:34 UTC	Update link to Hoffman paper (online VB LDA) (#3133) * Update link to Hoffman paper (online VB LDA) The previous link was Matthew Hoffman's Google Scholar profile or not the official one. Use full author names in the first occurrence and first author only afterward. * Refactor links to Hoffman paper and use latex symbols for original parameter names * Fix flake8 * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	12 May 2021, 13:33:34 UTC
351456b	aloknayak29	08 May 2021, 05:25:33 UTC	Fix bug where saved Phrases model did not load its connector_words (#3116) * fixed bug of connector_words not loading, while loading saved phrases model of version >= 4 Added tests for asserting persistence of phrases connector_words * Update test_phrases.py * Update phrases.py * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	08 May 2021, 05:25:33 UTC
8d70657	Graham Arthur Mackenzie	08 May 2021, 05:24:37 UTC	Added import to Nmf docs, and to models/__init__.py (#3131) * Added import to Nmf docs, and to models/__init__.py * Update CHANGELOG.md * Update CHANGELOG.md Co-authored-by: Graham Arthur Mackenzie <gmackenzie3@gatech.edu> Co-authored-by: Michael Penkov <m@penkov.dev>	08 May 2021, 05:24:37 UTC
0be9891	Jonathan Schneider	05 May 2021, 07:03:33 UTC	Improve & unify docs for dirichlet priors (#3125) * Make docs of Dirichlet parameters consistent * Improve Dirichlet prior initialization * Update CHANGELOG.md Co-authored-by: Michael Penkov <m@penkov.dev>	05 May 2021, 07:03:33 UTC
