https://github.com/brettc/partitionfinder
Name Target Message Date
HEAD 4201050 remove download stats for reasons I don't understand, it doesn't work. 10 March 2021, 21:10:50 UTC
refs/heads/buffer2 c92c486 Don't use stderr in ExternalProgramError because it’s empty now. 19 April 2017, 09:13:03 UTC
refs/heads/combined-speedup e0c2f2d remove pre-made task lists no evidence that they speed things up from empirical tests 14 October 2016, 01:33:49 UTC
refs/heads/develop 47d2328 fix bug in rclusterf this bug meant that if the median change was zero, we got stuck in an infinite loop. 03 December 2015, 22:24:34 UTC
refs/heads/feature-krmeans 3378c1c implement krmeans this is an idea for an algorithm in which the zero entropy sites are reassigned the entropy of their nearest physical non-zero entropy site in the alignment. Works fine so far. 05 May 2016, 06:29:00 UTC
refs/heads/feature/1kite-bugfix2 ceaef5d fixed bug in rcluster This is really fixed now. The issue was that the previous bug fix wasn’t bulletproof. It left the door open for a second bug, in which a single subset had an identical improvement score with >1 other subset. The new fix addresses this bug, as well as making sure that the original bug is fixed. 25 August 2015, 00:14:20 UTC
refs/heads/feature/DBSCAN 7d0e1a2 proto_DBSCAN 19 August 2015, 13:10:55 UTC
refs/heads/feature/complete_alignments 5ba17e0 improved user output for kmeans 15 September 2015, 05:19:47 UTC
refs/heads/feature/fabricated_subsets 0eb9964 fixed fabricated subset dealings at the end of the kmeans algorithm 27 February 2015, 03:32:50 UTC
refs/heads/feature/fastercluster e394e8e add two spaces 13 November 2015, 02:46:55 UTC
refs/heads/feature/fasttree 1f628e8 added write_fasta alignment function * FastTree requires interleaved phylip or fasta alignments. It is probably easier to write a fasta alignment so this function does that. 02 September 2014, 15:29:48 UTC
refs/heads/feature/fix-tests-pf2 55493fb add init to make tests run 26 February 2015, 06:45:24 UTC
refs/heads/feature/garli_output f9836c0 Merge branch 'develop' into feature/garli_output 30 April 2013, 01:14:54 UTC
refs/heads/feature/greedy-speedy 57715e4 new version of greedy algorithm that borrows from the cluster algorithm, and is now a whole lot quicker and more efficient. 07 September 2015, 22:25:35 UTC
refs/heads/feature/importcheck a0ea4df some very minor changes 18 September 2016, 23:35:19 UTC
refs/heads/feature/iqtree 77a3347 first attempt at a whole bunch of IQtree model commandlines including R4-R8, R10, R12, R15, R20. will require some empirical tests to see which of the R’s are really needed. Ultimately, a progressive algorithm like that in IQtree (keep adding R cats until the AICc starts dropping) would be better. 21 March 2017, 00:00:52 UTC
refs/heads/feature/kmeans-manyparts a4a5718 make RAxML fall back on standard raxml with one CPU preparation for making the ML tree the default option 25 July 2016, 22:54:19 UTC
refs/heads/feature/krmeans2 6f760ad new krmeans algorithm the previous version was naive. I reassigned invariant sites at every step, which just got the algorithm stuck early on. This version waits until the end of the kmeans algorithm to reassign sites, which is a much better idea. It appears to work well (in terms of AICc scores) on empirical datasets. 12 May 2016, 07:58:51 UTC
refs/heads/feature/lie-markov d9ae785 include category for lie markov models in models.csv These models have the attractive and possibly important property that you can multiply them together along branches and still have lie markov models. I don’t know of any evidence that inferences go wrong if you don’t use these models, but it’s possible. For a full description see e.g.: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC4468350/ 13 November 2015, 03:00:40 UTC
refs/heads/feature/merge-little-subsets 5851b59 change scheme name when cleaning schemes so that it’s very obvious in the best_scheme.txt that you are using a cleaned scheme. 11 November 2015, 22:02:54 UTC
refs/heads/feature/model_csv 7bc895f implemented models.csv file, which is working in principle. 26 February 2015, 06:38:55 UTC
refs/heads/feature/morph_tiger 0fd45b4 Experimental morpho tiger rates This is a preliminary implementation to estimate tiger rates from a morphology alignment. 25 November 2015, 19:22:30 UTC
refs/heads/feature/morph_tiger_rates 28a0cf1 cleaned up print statements 13 June 2016, 15:27:58 UTC
refs/heads/feature/morphology e0d9e80 added dummy morphology models for phyml 01 July 2013, 19:58:00 UTC
refs/heads/feature/morphology2 dcc88c3 clean up model list checking still needs more work, but this is a good start 21 November 2013, 21:31:59 UTC
refs/heads/feature/new_clustering a172fb3 new relaxed clustering algorithm complete contains some much more efficient routines, including only making schemes once per step, and keeping a more efficient running tally of subset improvements. 14 December 2013, 07:36:42 UTC
refs/heads/feature/no-sleep 50dd23a remove sleep condition I suspect this is slowing us down a lot… 29 September 2016, 04:12:00 UTC
refs/heads/feature/phyml-external 27137d2 Update saved results files for latest phyml 07 August 2015, 04:46:46 UTC
refs/heads/feature/profiling 2739914 PEP8: Newline at end of file 04 December 2013, 19:29:37 UTC
refs/heads/feature/pytables 61804e4 Merge branch 'develop' into feature/pytables 10 December 2013, 22:42:19 UTC
refs/heads/feature/raxml-external de2d12a Fix raxml build 07 August 2015, 07:41:11 UTC
refs/heads/feature/test-tiger-arrays d40f4d3 Fix silly bugs after merge. 09 March 2015, 06:47:49 UTC
refs/heads/gh-pages 0dfbde4 github generated gh-pages branch 04 July 2011, 10:14:21 UTC
refs/heads/gui_test c957068 basic gui working 30 June 2012, 03:57:23 UTC
refs/heads/h5-bug a874845 ignore pyenv cruft 03 October 2017, 19:55:50 UTC
refs/heads/master 4201050 remove download stats for reasons I don't understand, it doesn't work. 10 March 2021, 21:10:50 UTC
refs/heads/paul_develop d0cf8a8 removed confusing log statement *log.info() statement read the number of sites from each codon position being split. This was for testing and, in reality, doesn’t work for most datasets. 16 February 2015, 19:39:57 UTC
refs/heads/release/1.1.0 27309ff Handle expected failure of DNA_Clustering3 16 May 2013, 03:25:16 UTC
refs/heads/speedup-threadpool 3c5bfd0 Create list of correct size for all tasks Should speed things up a little 13 October 2016, 22:59:29 UTC
refs/tags/h5-bugfix-1 a874845 ignore pyenv cruft 03 October 2017, 19:55:50 UTC
refs/tags/v0.9.1 eccc508 Use md5 to generate consistent length names 07 March 2012, 11:43:38 UTC
refs/tags/v2.0-pre1 a554d7f Better user output for kmeans 14 August 2015, 11:13:37 UTC
refs/tags/v2.0-pre2 dba0794 Merge pull request #63 from brettc/feature/1kitebugfix1 Feature/1kitebugfix1 16 August 2015, 02:09:03 UTC
refs/tags/v2.0.0 41a5ef0 update PF2 citation the ppr is now accepted 22 November 2016, 04:50:43 UTC
refs/tags/v2.0.0-pre10 b6fcd69 Merge pull request #80 from brettc/feature/fastercluster Feature/fastercluster we now have the search option rclusterf, which is a faster version of the rcluster algorithm. I do not yet know exactly how well it compares to rcluster, though it should be quite a bit faster in certain situations (especially where the number of models is << than the number of processors you have). 13 November 2015, 05:14:57 UTC
refs/tags/v2.0.0-pre11 47d2328 fix bug in rclusterf this bug meant that if the median change was zero, we got stuck in an infinite loop. 03 December 2015, 22:24:34 UTC
refs/tags/v2.0.0-pre12 2d28e48 remove unused test 14 March 2016, 04:59:39 UTC
refs/tags/v2.0.0-pre13 a307ab8 remove old debugging statement Embarrassing. Thanks to Ben Anderson for pointing this out. https://groups.google.com/forum/#!topic/partitionfinder/MSdcgxJ415w 18 March 2016, 20:29:32 UTC
refs/tags/v2.0.0-pre14 acb84f8 Merge pull request #104 from brettc/feature/morph Feature/morph 31 May 2016, 22:39:13 UTC
refs/tags/v2.0.0-pre15 a06b857 updated citation for PF2 18 September 2016, 23:41:24 UTC
refs/tags/v2.0.0-pre16 7f70beb fix windows bug reported here: https://groups.google.com/forum/#!topic/partitionfinder/4pAkDOHB5FM the bug was a hangover from the TIGER days. 21 September 2016, 05:55:35 UTC
refs/tags/v2.0.0-pre17 e561bfa update raxml version to https://github.com/stamatak/standard-RAxML/commit/5d9558ac18ddb2c69dd75a9dc971bcf541bbfeb2 22 September 2016, 06:29:18 UTC
refs/tags/v2.0.0-pre3 e7529ea updated gitignore 04 May 2015, 07:21:07 UTC
refs/tags/v2.0.0-pre4 97b68ef updated manual contents 25 August 2015, 03:55:24 UTC
refs/tags/v2.0.0-pre5 8bd784c changed user output for cluster 25 August 2015, 05:17:09 UTC
refs/tags/v2.0.0-pre6 fb4fcd3 remove -U option for RAxML it might be causing issues, and won’t work with morphology data. 28 August 2015, 23:52:38 UTC
refs/tags/v2.0.0-pre7 b106624 update kmeans test since we now disallow multiple subsets as input 12 September 2015, 07:55:45 UTC
refs/tags/v2.0.0-pre8 83be0bd Merge pull request #70 from brettc/feature/complete_alignments Feature/complete alignments 15 September 2015, 05:22:05 UTC
refs/tags/v2.0.0-pre9 a2d3b33 updated manual added in --all-states and --min-subset-size 02 October 2015, 04:17:43 UTC
refs/tags/v2.1.0 19d7fe4 Disable k-means for all but morphology
# Why?
A paper came out yesterday (http://www.sciencedirect.com/science/article/pii/S1055790316302780) that raises some serious concerns about the k-means algorithm, suggesting that it might lead to bad inferences on empirical datasets. I had spoken to the authors of the paper when they were revising it, but wasn’t aware until yesterday of the details of the problems they’d uncovered. Given how odd the inferences from k-means look, we decided to disable the method for all but morphological analyses (see below).
# But there was a warning before, why disable it now?
Our previous concerns came from our own realisation about one aspect of the method (that it lumps together all invariant sites) and some concerns raised by folks in Brian Moore’s lab this year. Specifically, we put in the warning when we learned that some simulated datasets that were analysed with k-means partitioning schemes led to bad inferences. I was hopeful that these simulations would be corner cases, and/or that one aspect of the simulations where k-means was misleading (that you got implausibly long trees) would mean that it would be trivial to diagnose cases in which there were issues. In addition, we had tried the method on lots of empirical datasets, and never seen any issues. Indeed, on at least one dataset the k-means tree seemed much more reasonable than trees we were getting from other methods. (I note that Brian Moore and co were less optimistic, and suggested from the start that we should consider disabling the method.) The empirical results in the recent paper suggest otherwise, and suggest that the best we can say of k-means for now is: ‘you should try other methods too, and if the methods disagree, we’d suggest ignoring the k-means tree'. On this basis, there seems little point keeping k-means as an available method: if you can't trust the results, why bother.
# I liked/used it, what should I do?
Use standard methods, e.g. partitioning by codon position and locus, instead. Even better (if your dataset is small enough) use the automatic partitioning solutions in BEAST2 and/or MrBayes (google AutoParts). If you have used k-means to make an inference, it would be worthwhile to check that the inference is robust when you use a standard partitioning scheme too.
# What’s the problem?
We don’t know for sure, but it’s likely to be related to the fact that k-means separates out all invariant sites into a single subset. I presented on this at SMBE in July this year. This has a couple of downstream effects. First, it makes AIC/AICc/BIC scores look really great, because when you have all the invariant sites together, you can estimate a rate of zero and get likelihoods of 1 for all of those sites. That’s a bit silly, and something I wish we’d realised earlier. Second, and more seriously for inference, putting all the invariant sites into one subset means that the other subsets have NO invariant sites. If you then analyse these without a model that accounts for this (e.g. with some kind of ascertainment bias) this is likely to mess with estimates of rates, branch lengths, and topologies. It’s not totally obvious yet how common the problem is, but now we’ve seen it in simulated and empirical datasets, it seems wise to can the method until we completely understand the problem and can fix it.
# Are you going to fix it?
We're working on it, but it will take a while. Apart from anything else, we are going to be exceptionally cautious in proposing more new methods related to this one.
# But why is it still available for morphology?
We’ve kept it in there for morphological datasets as an experimental method, and provide lots of warnings when you run the code and in the output that it’s experimental, untested etc. We did this because morphological datasets are different: they tend to have no invariant sites, and people tend to use models that correct for ascertainment bias. Because of that, it seems worthwhile to leave it in. We are working on testing it as exhaustively as possible for these datasets.
# I want to use it anyway
If you want to use it for empirical inferences, just don’t. But if you want to use it to try and figure out why it doesn’t work, and how you might improve it, then all you need to do is edit out the line that raises the error.
# I have questions…
Post on the google group or raise an issue on GitHub. 02 December 2016, 05:08:47 UTC
refs/tags/v2.1.1 63d5af1 bump version number 06 December 2016, 01:45:13 UTC