be33e2d | Sam Vohr | 06 October 2023, 00:46:11 UTC | Fix index error for report top proportions (#24) | 06 October 2023, 00:46:11 UTC |
f854cf2 | Sam Vohr | 30 October 2020, 22:18:29 UTC | Remove an xrange left over from python 2.7 (#23) | 30 October 2020, 22:18:29 UTC |
954fcf4 | Alex Hübner | 30 May 2020, 00:49:56 UTC | Python3 (#20) * Fixed import of SeqRecord and Seq from biopython * Lift to Python3 using 2to3 * Update reading of binary data and module structure of scipy - Decode binary to UTF-8 string of CSV files because pkg_resources.resource_stream reads data in binary mode - Import scipy.special instead of scipy.misc to get access to logsumexp in Scipy >= v1.1 * Remove refs to Python 2.7, change pkg_resources use Remove references to Python 2.7 in mixemt executable and .travis.yml. Change pkg_resources use to get Phylotree as a string to avoid conversion from binary stream and preserve existing tests. Update tests to avoid float comparison. * Change pkg_resource call to get phylotree filename Co-authored-by: svohr <shvohr@gmail.com> | 30 May 2020, 00:49:56 UTC |
397cceb | Sam Vohr | 03 May 2019, 01:55:08 UTC | Add option to set seed for random number generation (#19) * Add option to set seed for random number generation * Add README description of --seed option | 03 May 2019, 01:55:08 UTC |
05746fb | Jerrythafast | 03 May 2019, 01:19:37 UTC | Add option for percentage-of-coverage in addition to -R (#17) * added -p option for percentage next to minimum * PEP8 * Fix assemble_test * Refactor to frac_var_reads and add to README.md * Add README.md changes | 03 May 2019, 01:19:37 UTC |
436ca02 | Jerrythafast | 03 May 2019, 01:17:54 UTC | Allow setting -i/--init to infinity for equal priors (#18) * Allow setting -i/--init to infinity for equal priors * Add -i/--init change to README.md | 03 May 2019, 01:17:54 UTC |
e5198aa | Sam Vohr | 20 February 2018, 00:33:02 UTC | Fixed biopython imports for consensus calling to work (#14) | 20 February 2018, 00:33:02 UTC |
8ff7409 | Sam Vohr | 13 November 2017, 01:57:43 UTC | Added variants to plot_hap_coverage plots. (#13) Using new invocation of plot_hap_coverage, phylotree and novel variants will be added to the plot that is produced. | 13 November 2017, 01:57:43 UTC |
cb14285 | Sam Vohr | 13 November 2017, 00:55:19 UTC | Citation (#12) * Added citation information to README.md * Added complete citation including volume and page numbers. | 13 November 2017, 00:55:19 UTC |
fde2fa7 | Sam Vohr | 05 June 2017, 17:35:33 UTC | Added citation information to README.md (#10) | 05 June 2017, 17:35:33 UTC |
51c8ced | Sam Vohr | 08 May 2017, 23:28:42 UTC | Merge pull request #8 from svohr/ref-fais Added fasta indexes for reference sequences | 08 May 2017, 23:28:42 UTC |
44d493c | svohr | 08 May 2017, 23:25:49 UTC | Added fasta indexes to package_data in setup.py | 08 May 2017, 23:25:49 UTC |
c451c8f | svohr | 08 May 2017, 23:18:56 UTC | Changed ref/README.txt to include more builds of Phylotree. | 08 May 2017, 23:18:56 UTC |
6814283 | svohr | 08 May 2017, 23:06:22 UTC | Added fasta indexes for mtDNA references. User may not always have permission to build fasta indexes (*.fai files) if the package file is installed into the system python package directory. We'll keep the index files here since they're small and they shouldn't change from system to system. | 08 May 2017, 23:06:22 UTC |
e42b503 | Sam Vohr | 05 April 2017, 23:19:34 UTC | Merge pull request #6 from svohr/ref-phy-pack-defaults Use Package Ref. Sequence and Phylotree files by default | 05 April 2017, 23:19:34 UTC |
5bee254 | svohr | 05 April 2017, 23:10:27 UTC | Added verbose mesg for reading Phylotree input. Changed order of initial data input so that alignments are read last. This streamlines the verbose output and lets us test the defaults for reference sequence and Phylotree more easily. | 05 April 2017, 23:10:27 UTC |
e202704 | svohr | 05 April 2017, 21:28:02 UTC | minor spelling fixes in README.md | 05 April 2017, 21:28:02 UTC |
a00a892 | svohr | 05 April 2017, 21:25:47 UTC | Minor spelling fixes in help output. | 05 April 2017, 21:25:47 UTC |
187fc9c | svohr | 05 April 2017, 18:43:16 UTC | Minor re-word in README.md | 05 April 2017, 18:43:16 UTC |
2885078 | svohr | 05 April 2017, 18:24:53 UTC | Added output option descriptions | 05 April 2017, 18:24:53 UTC |
0d6ce99 | svohr | 04 April 2017, 23:28:05 UTC | Added descriptions of contributor and assembly options. | 04 April 2017, 23:28:05 UTC |
3a33904 | svohr | 04 April 2017, 22:18:14 UTC | Fixed incorrectly formatted verbatim/code block | 04 April 2017, 22:18:14 UTC |
af7cf66 | svohr | 04 April 2017, 22:11:31 UTC | Added description of EM options to README.md | 04 April 2017, 22:11:31 UTC |
446d9b5 | svohr | 04 April 2017, 21:49:57 UTC | Added descriptions of customization and quality filtering options | 04 April 2017, 21:49:57 UTC |
06be425 | svohr | 04 April 2017, 18:46:17 UTC | Added descriptions of basic options to README.md | 04 April 2017, 18:46:17 UTC |
bf45e00 | svohr | 03 April 2017, 23:14:26 UTC | README.md updated for installation and new usage. mixemt now uses the reference sequence and Phylotree representation included in the repository by default and the README has been updated to reflect that. | 03 April 2017, 23:14:26 UTC |
ed30a70 | Sam Vohr | 14 March 2017, 18:01:13 UTC | Updated .gitignore with new path to ref/ | 14 March 2017, 18:01:13 UTC |
e7a39c0 | Sam Vohr | 14 March 2017, 17:59:46 UTC | Phylotree input now read from package data by default. | 14 March 2017, 17:59:46 UTC |
bc12c11 | Sam Vohr | 13 March 2017, 21:03:33 UTC | cmdline args Ref and Phylotree now options. Reference sequence now read from package data by default. | 13 March 2017, 21:03:33 UTC |
4c4d92b | Sam Vohr | 10 March 2017, 21:56:36 UTC | Turn off Travis CI email notifications. | 10 March 2017, 21:56:36 UTC |
c222910 | Sam Vohr | 10 March 2017, 19:51:21 UTC | Changed shebang line to use env for portability. | 10 March 2017, 19:51:21 UTC |
d745cba | svohr | 08 March 2017, 01:59:00 UTC | Removed osx from Travis config. | 08 March 2017, 01:59:00 UTC |
98900dd | svohr | 07 March 2017, 23:14:07 UTC | Added Travis CI badge to README.md | 07 March 2017, 23:14:07 UTC |
d8644bc | svohr | 07 March 2017, 22:41:42 UTC | Changed testing call from pytest to py.test | 07 March 2017, 22:41:42 UTC |
1ae25ba | svohr | 07 March 2017, 22:34:11 UTC | Upgrade pip to use --only-binary. | 07 March 2017, 22:34:11 UTC |
3e726c6 | svohr | 07 March 2017, 21:30:23 UTC | Using pre-compiled numpy/scipy with Travis. | 07 March 2017, 21:30:23 UTC |
52ce279 | svohr | 07 March 2017, 21:22:22 UTC | Only test with Python 2.7 | 07 March 2017, 21:22:22 UTC |
75d9ff6 | svohr | 07 March 2017, 19:46:45 UTC | Added config file for Travis CI. | 07 March 2017, 19:46:45 UTC |
99cdd9c | Sam Vohr | 07 March 2017, 00:00:34 UTC | Merge pull request #4 from svohr/re-org-dirs Reorganized Directories, added mixemt package, setup.py | 07 March 2017, 00:00:34 UTC |
46941ca | svohr | 06 March 2017, 23:54:53 UTC | Added description to setup.py | 06 March 2017, 23:54:53 UTC |
8d130e5 | svohr | 06 March 2017, 23:41:39 UTC | Updated test script to use new paths and pytest. | 06 March 2017, 23:41:39 UTC |
54a86c7 | svohr | 06 March 2017, 23:22:12 UTC | Added docstring for mixemt/__init__.py | 06 March 2017, 23:22:12 UTC |
45f4e41 | svohr | 03 March 2017, 19:47:22 UTC | order of imports changed to fit convention | 03 March 2017, 19:47:22 UTC |
0489c63 | svohr | 03 March 2017, 19:24:00 UTC | Removed R scripts (not used). Added requirements. | 03 March 2017, 19:24:00 UTC |
1bc3fac | svohr | 03 March 2017, 19:23:33 UTC | Changed mixemt imports to use full path. | 03 March 2017, 19:23:33 UTC |
3722941 | svohr | 03 March 2017, 17:58:40 UTC | Setuptools works in setup.py now. | 03 March 2017, 17:58:40 UTC |
bbb22fc | svohr | 03 March 2017, 17:57:49 UTC | Fixed imports for new mixemt package. | 03 March 2017, 17:57:49 UTC |
46c5cfc | svohr | 03 March 2017, 17:47:10 UTC | Added entries for additional files to setup.py | 03 March 2017, 17:47:10 UTC |
9e240a8 | svohr | 03 March 2017, 17:46:19 UTC | Renamed ref/README as it does not use markdown. | 03 March 2017, 17:46:19 UTC |
0cedff9 | svohr | 01 March 2017, 23:54:00 UTC | Added basic info to setup.py Still untested. | 01 March 2017, 23:54:00 UTC |
9a1b650 | svohr | 01 March 2017, 23:37:40 UTC | Added empty setup.py file | 01 March 2017, 23:37:40 UTC |
8aa50b0 | svohr | 01 March 2017, 23:35:49 UTC | Added __init__.py Hope this works! | 01 March 2017, 23:35:49 UTC |
57600c4 | svohr | 01 March 2017, 23:14:57 UTC | Re-organized the directory structure This will make it easier to write the setup.py | 01 March 2017, 23:14:57 UTC |
7c20b04 | svohr | 01 March 2017, 22:46:56 UTC | style: removed escaped \n | 01 March 2017, 22:46:56 UTC |
11fd1f0 | svohr | 28 February 2017, 23:19:33 UTC | Style-fixes after linting. Mostly fixing indentation for continued lines. | 28 February 2017, 23:19:33 UTC |
d1dbfad | svohr | 09 February 2017, 18:44:44 UTC | changed multi-em logaddexp to in-place operation | 09 February 2017, 18:44:44 UTC |
e2686d4 | svohr | 08 February 2017, 22:54:47 UTC | Refactored em_step() for speed. em_step() now uses in place operations when possible and no longer uses for loops. | 08 February 2017, 22:54:47 UTC |
87aa436 | svohr | 08 February 2017, 00:38:21 UTC | EM multi run now works the same as before. Multi run could probably be refactored to be more memory efficient and it would be better if it kept track of the variance in mixture proportions as well as the average. | 08 February 2017, 00:38:21 UTC |
9319479 | svohr | 07 February 2017, 21:41:11 UTC | Updated README requirements and links | 07 February 2017, 21:41:11 UTC |
551d42b | svohr | 07 February 2017, 21:27:08 UTC | Updated read assignment to use log-probs, and tests. | 07 February 2017, 21:27:08 UTC |
ada5da7 | svohr | 07 February 2017, 20:59:34 UTC | Updated docstring for em functions. | 07 February 2017, 20:59:34 UTC |
6fdbb28 | svohr | 07 February 2017, 19:19:12 UTC | Tests for em.py updated to work with log-probs. Multi EM run is still not fixed. | 07 February 2017, 19:19:12 UTC |
2d83fd6 | svohr | 07 February 2017, 18:24:07 UTC | run_em() now returns mixture props (linear) For handing around the results of em, mixture proportions are stored as linear while the read probabilities remain in log form. We only need to do math with the proportions and read probability matrix one more time so this will avoid lots of exp(props) in the post-em functions. | 07 February 2017, 18:24:07 UTC |
d23e0e4 | svohr | 04 February 2017, 02:36:31 UTC | EM now performed with log-probs. Need to fix multi-run EM and could reduce redundant exp calls. | 04 February 2017, 02:36:31 UTC |
af544e7 | svohr | 04 February 2017, 01:35:20 UTC | Refactored EM single step to log-probs. This change introduces a dependency on scipy in order to do the log(sum(exp())) operation. | 04 February 2017, 01:35:20 UTC |
01ed649 | svohr | 03 February 2017, 18:41:13 UTC | preprocess now produces log-prob input matrix Updated preprocess tests to match new output | 03 February 2017, 18:41:13 UTC |
5834464 | svohr | 01 February 2017, 18:09:54 UTC | corrected README headings | 01 February 2017, 18:09:54 UTC |
93ccd5f | svohr | 31 January 2017, 22:58:27 UTC | README edits | 31 January 2017, 22:58:27 UTC |
00446a6 | svohr | 30 January 2017, 23:48:13 UTC | Merge branch 'master' of https://github.com/svohr/mixemt Fixed conflict in title. | 30 January 2017, 23:48:13 UTC |
41971dd | svohr | 30 January 2017, 23:45:45 UTC | Added usage and preparing input to README.md | 30 January 2017, 23:45:45 UTC |
0c5344f | Sam Vohr | 27 January 2017, 22:12:58 UTC | Added check and error message for 0 contributors detected. | 27 January 2017, 22:12:58 UTC |
be889c2 | Sam Vohr | 24 January 2017, 17:18:15 UTC | Update README.md spelling: deconvolving rather than deconvoluting | 24 January 2017, 17:18:15 UTC |
a05cacb | svohr | 13 January 2017, 22:04:31 UTC | Added overview of program | 13 January 2017, 22:04:31 UTC |
8acb375 | Sam Vohr | 31 December 2016, 02:11:30 UTC | Create LICENSE | 31 December 2016, 02:11:30 UTC |
3ede038 | Sam Vohr | 05 October 2016, 00:16:14 UTC | Merge branch 'em-refine' | 05 October 2016, 00:16:14 UTC |
69bbf78 | Sam Vohr | 05 October 2016, 00:07:22 UTC | Added option to skip contribution estimate refinement Also added backwards compatibility for old results where the initial EM input matrix was not saved. | 05 October 2016, 00:07:22 UTC |
d413dbe | Sam Vohr | 22 September 2016, 20:32:55 UTC | Fixed issue with diagnostic variant check and custom haplogroups Ran into a problem with the steps that fill in ancestral bases after a haplogroup has been detected and the variant bases it carries are removed from consideration as evidence for another haplogroup. In the case that a variant that defines a custom haplogroup is not at a known variant position, we previously tried to delete that position from the list of ancestral pos/bases. We now check to see if this position is included in the list before removing it. We may need to revisit this later. | 22 September 2016, 20:32:55 UTC |
8041df7 | Sam Vohr | 20 September 2016, 00:27:17 UTC | Changed save/load to also include the input matrix for EM. In order to implement the second EM run to refine the contribution estimates, we need to know the original input matrix for EM. We could rebuild it but it would take a long time and not necessarily match. Instead, the input matrix is now output with the -s flag, and read in using the -l flag. This is a little annoying since it means that any previous results will not be usable, unless I add in some backwards compatibility checks. | 20 September 2016, 00:27:17 UTC |
8acf899 | Sam Vohr | 19 September 2016, 23:33:43 UTC | Changed contribution estimate refinement to replace initial values. | 19 September 2016, 23:33:43 UTC |
e6bc240 | Sam Vohr | 16 September 2016, 04:19:16 UTC | Output table now reports only final contribution estimate. | 16 September 2016, 04:19:16 UTC |
307d35d | Sam Vohr | 15 September 2016, 18:57:22 UTC | Tweaks to get estimate refinement working. Changed when verbose mode initial results are reported. Changed main output to include both initial and refined estimates. | 15 September 2016, 18:57:22 UTC |
e216722 | Sam Vohr | 15 September 2016, 04:59:43 UTC | Added parts for refined contribution estimates Added function to update contributor table and updated main workflow to include second EM run to refine contribution estimates. Output updated to include both contribution estimates. Not running just yet. | 15 September 2016, 04:59:43 UTC |
7570552 | Sam Vohr | 15 September 2016, 03:51:38 UTC | moved reduce_em_matrix to preprocess.py | 15 September 2016, 03:51:38 UTC |
d9f795c | Sam Vohr | 15 September 2016, 03:38:05 UTC | Added function to reduce EM input matrix to only identified contributors | 15 September 2016, 03:38:05 UTC |
d0dd278 | Sam Vohr | 07 September 2016, 22:46:22 UTC | write_variants() output overhauled to be more useful. write_variants used to report a list of polymorphic positions from phylotree and whether they were expected to be polymorphic or fixed in the sample, given the haplogroups that were detected. This has been changed to be more useful in interpreting the results of mixemt. write_variants now reports for each position in the reference, the number of A,C,G,Ts, whether this position is expected to be polymorphic, whether this position would be called a variant position under our scheme (i.e. whether there exist more than 1 base that pass the minimum variant count cutoff) and, finally, the variants that are known for this position from our detected haplogroups. This should make it much easier to see how the results are affected by what bases are found in the sample. | 07 September 2016, 22:46:22 UTC |
42ff20e | Sam Vohr | 07 September 2016, 18:47:57 UTC | Formatting of the verbose output for _check_contrib_phy_vars() now cleaner Verbose output for _check_contrib_phy_vars() is now sorted on reference position and is justified to line up a little more nicely. | 07 September 2016, 18:47:57 UTC |
48e40bb | Sam Vohr | 07 September 2016, 18:28:28 UTC | Added much more verbose output to haplogroup diagnostic checks. The verbose output for the haplogroup diagnostic position checks now includes a report for every position considered so that weird things can be picked out more easily. Added a method to ObservedBases to get the total number of observed bases at a reference position more easily. | 07 September 2016, 18:28:28 UTC |
a83ca2e | Sam Vohr | 08 August 2016, 22:54:09 UTC | Minor spelling fix. | 08 August 2016, 22:54:09 UTC |
abfaa5f | Sam Vohr | 06 August 2016, 23:17:31 UTC | For coverage plots, Y axis always starts at 0 now. | 06 August 2016, 23:17:31 UTC |
dbbfa35 | Sam Vohr | 03 August 2016, 21:39:45 UTC | Fixed missing empty-line for consistent formatting. | 03 August 2016, 21:39:45 UTC |
2d4b926 | Sam Vohr | 03 August 2016, 16:42:55 UTC | Merge branch 'master' of https://github.com/svohr/mixemt Conflicts: preprocess.py Removed TODO note for precomputing mutation weights because it is now done. | 03 August 2016, 16:42:55 UTC |
223377a | svohr | 03 August 2016, 04:24:31 UTC | HapVarBaseMatrix now pre-computes site instability score. Big win. We now compute the site-specific instability frequency score for every variant position and store it a dictionary at initialization. The values are used later to find the probability of observing a read from a haplogroup. This change cuts the time needed to build the initial EM input matrix in half. | 03 August 2016, 04:24:31 UTC |
1e8a09d | Sam Vohr | 02 August 2016, 22:36:12 UTC | Added TODO to speed up building initial matrix. | 02 August 2016, 22:36:12 UTC |
49ad26c | Sam Vohr | 02 August 2016, 22:34:27 UTC | Updated docstring for _check_contrib_phy_vars | 02 August 2016, 22:34:27 UTC |
768c446 | Sam Vohr | 28 July 2016, 21:33:03 UTC | Fix for problem in using ancestral bases as evidence for haplogroup. Previously, we only checked positions with defining mutations to see if a haplogroup was present in the sample. This led to problems where the defining mutations on a branch were actually back mutations to the ancestral base. We did not account for the fact that these ancestral bases were likely introduced by haplogroups at higher proportions and that they should only be used as evidence if the previous haplogroups did not carry the ancestral base. To fix this, we now keep track of ancestral bases (positions where no mutation is inferred to have occurred) so that they can be ignored when checking for the defining mutations for a haplogroup/contributor. Added, modified 2 tests to match new behavior and added 2 new ones. | 28 July 2016, 21:33:03 UTC |
ee0d877 | Sam Vohr | 28 July 2016, 20:30:54 UTC | Added reference sequence to phylotree object at init. stats.write_statistics() now uses the reference sequence stored in phylotree rather than passing it separately. PhyloTree.polymorphic_sites() now uses the internal reference sequence by default. New unit tests for PhyloTree.get_ancestral() | 28 July 2016, 20:30:54 UTC |
a479494 | Sam Vohr | 28 July 2016, 18:27:13 UTC | Added new method get_ancestral() Phylotree now stores a reference to the reference sequence that the tree is based on. This is used by get_ancestral() to return the ancestral bases for all sites not affected by a mutation in a lineage. We can refactor some of the code now so we do not have to pass a phylotree object and a reference sequence string around all of the time. | 28 July 2016, 18:27:13 UTC |
f02efd0 | Sam Vohr | 28 July 2016, 18:09:33 UTC | Upped default max number of iterations in EM to 10,000 | 28 July 2016, 18:09:33 UTC |
33d6eb2 | Sam Vohr | 15 July 2016, 21:56:37 UTC | Adjusted coverage plot so lines stop at 0 when coverage reaches 0 Lines in coverage plot now drop to 0 when coverage reaches 0 but lines do not extend across the entirety of 0 coverage stretch. Plot size also adjusted to not look so stretched out. | 15 July 2016, 21:56:37 UTC |
1c29a82 | Sam Vohr | 09 July 2016, 23:41:19 UTC | Added correction for 0 to 1 based coords to unassigned coverage too. | 09 July 2016, 23:41:19 UTC |