775e07e | IlyaKot | 14 July 2021, 09:52:55 UTC | parallelized computation of distances (#28) Parallelized the computation of distances for some of the metrics available in cdist. Covers the existing euclidean and cosine metrics, used for knn and mknn respectively. Significantly accelerates the computation. If users want a different metric (instead of euclidean and cosine for knn and mknn respectively), have enough memory, and the metric is in distance_metrics, they can replace the whole loop; that choice is left to them. | 14 July 2021, 09:52:55 UTC |
6e9c5e8 | Sohil Shah | 25 March 2020, 06:58:26 UTC | Update DCCComputation.py Fix a bug in pruning an epsilon array | 25 March 2020, 06:58:26 UTC |
d918a89 | Johnson Zhong | 11 August 2019, 16:59:23 UTC | Adding visual end-to-end example and python3 compatibility (#15) * Ignore generated and cached files * Use new pytorch syntax for extracting python scalar * Fix space formatting * Make easy to visualize 2D data * Include edge construction in this repo to allow updates The old edgeConstruction's unsuited for use on new datasets. * Adapt edge construction to new dataset format * Pretrain 2D visualizable dataset * Fix code snippet annotation language (shell not python) * Add image showing demo of generated data * Start factoring out dataset specific parameters Reduce repetition in various use cases (SDAE and extractSDAE) of suitable parameters for each dataset. * Extract pretrained features for easy dataset * Merge mkNN graph with pretrained features * Run DCC on easy to visualize dataset * Make sure indices are integral * Factor out logic of pretraining to allow calling in code Just have to create an args-like data structure such as an EasyDict to pass in. 
* Start example of doing all pretraining and training steps in code only * Return index and network from pretraining * Allow copy graph to be called from code * Return extracted features * Extract features and merge pretraining data * Allow DCC to be called from code * Record first 2 dimensions of representatives over epochs * Do full DCC * Prevent error from trying to reconfigure logger * Document running everything in one script and do visual debugging * Ignore generated data files * Use the same figure to save memory * Use separate loggers Can't reconfigure loggers and they are supposed to log to different directories * Separate data loaded into multiple steps for debugging * Move k-nearest neighbour param outside * Make pickling python 3 compatible * Normalize generated easy to visualize data * Add option to clean log folders Prevent tensorboard plotting all previous iteration's logs together with the current one if you don't want to keep previous iteration's data. * Visualize raw generated data * Update documentation for python 3 compatibility and how to visualize results * Fix early termination during fine tuning * Pass custom nets in to DCC and centralize network definition Instead of the default/pretrained autoencoder, can pass something else. Choose a central location to define problem specific network architecture (data_params). * Clean up code * Fix undefined unique_rows for knn * Link criterion 2 to objective function in paper * Use identity net and knn graph to achieve 100% accuracy * Treat edge case of too little difference after pretraining * Try training with autoencoder (also works) Basically just needs knn instead of mknn for distance graph * Add video to readme of representatives shifting * Revert stopping threshold modification See discussion in #14 * Keep original epsilon and move sqrt only to selection * Execute main if run as script | 11 August 2019, 16:59:23 UTC |
7bf5257 | Sohil Shah | 31 March 2019, 04:57:09 UTC | Update README.md | 31 March 2019, 04:57:09 UTC |
f17dc8f | Sohil Shah | 31 March 2019, 04:42:41 UTC | Merge pull request #11 from LemonPi/master Consistify hyphen usage to enable copy pasting code snippet | 31 March 2019, 04:42:41 UTC |
5ef1794 | Johnson Zhong | 26 March 2019, 00:09:38 UTC | Consistify hyphen usage to enable copy pasting code snippet | 26 March 2019, 00:09:38 UTC |
7eeabcf | Sohil Shah | 08 March 2019, 12:03:32 UTC | added info regarding pretrained.mat | 08 March 2019, 12:03:32 UTC |
63f3851 | Sohil Shah | 02 March 2019, 14:36:36 UTC | Added external link to dataset | 02 March 2019, 14:36:36 UTC |
2059572 | Sohil Atul Shah | 01 April 2018, 23:32:13 UTC | Merging master Merge branch 'master' of github.com:shahsohil/DCC | 01 April 2018, 23:32:13 UTC |
d52de08 | Sohil Atul Shah | 01 April 2018, 23:32:06 UTC | added support for extracting cluster assignment | 01 April 2018, 23:32:06 UTC |
19dc2a4 | Sohil Shah | 28 March 2018, 21:05:40 UTC | Update README.md | 28 March 2018, 21:05:40 UTC |
00d420d | Sohil Shah | 06 March 2018, 02:39:29 UTC | Update README.md | 06 March 2018, 02:39:29 UTC |
eaa767e | Sohil Shah | 05 March 2018, 01:29:21 UTC | Update README.md | 05 March 2018, 01:29:21 UTC |
df2c0d8 | Sohil Atul Shah | 04 March 2018, 03:55:38 UTC | changes to readme.md | 04 March 2018, 03:55:38 UTC |
4b545cd | Sohil Atul Shah | 04 March 2018, 03:47:33 UTC | small change | 04 March 2018, 03:47:33 UTC |
73d7406 | Sohil Atul Shah | 04 March 2018, 03:34:04 UTC | removed DS_Store files | 04 March 2018, 03:34:04 UTC |
47b8b0b | Sohil Atul Shah | 04 March 2018, 03:28:53 UTC | added git ignore | 04 March 2018, 03:28:53 UTC |
ae8cf9a | Sohil Atul Shah | 04 March 2018, 03:22:43 UTC | initial commit | 04 March 2018, 03:22:43 UTC |
0dba2ac | Sohil Shah | 01 March 2018, 17:29:12 UTC | Initial commit | 01 March 2018, 17:29:12 UTC |