Revision history - refs/tags/v9.9.1 - origin: https://github.com/ekg/freebayes

Revision	Author	Date	Message	Commit Date
a994e78	Erik Garrison	02 July 2013, 08:32:27 UTC	Setting Release-Version v9.9.1	02 July 2013, 08:32:27 UTC
e73c3e5	Erik Garrison	01 July 2013, 12:39:16 UTC	partial haplotype observations Use all read evidence, even when calling haplotypes, by utilizing equivalencies between partial observations of a haplotype and the putative alleles at the site. Observations which partially support a number of haplotypes have their probability mass divided amongst the alleles they support when calculating genotype likelihoods. This provision resolves sensitivity issues caused by increased spanning coverage required to call small variants when using larger --haplotype-length values. The adjustment to genotype likelihood calculations currently requires the use of the new GL calculation routine provided when enabling --prob-contamination 0, or providing a per-sample contamination estimate file via --contamination-estimates.	01 July 2013, 12:39:16 UTC
bea983b	Erik Garrison	31 May 2013, 09:55:36 UTC	per-read group contamination estimates This is a checkpoint. This functionality is relatively stable.	31 May 2013, 09:55:36 UTC
d549975	Erik Garrison	25 April 2013, 18:48:12 UTC	addition of contamination estimates into GLs This is a first pass solution, and should probably not be used in production. This commit is for future reference.	25 April 2013, 18:48:12 UTC
296a0fa	Erik Garrison	23 April 2013, 15:39:50 UTC	resolve read group header parsing bug, empty alignment bug When read groups had colons in them, they were not parsed properly. When reads were aligned as wholly soft-clipped, the allele parsing segfaulted.	23 April 2013, 15:39:50 UTC
b0ee6e1	Erik Garrison	19 April 2013, 23:05:46 UTC	set correct merge order using new bamtools method	19 April 2013, 23:05:46 UTC
e0f8a94	Erik Garrison	19 April 2013, 15:32:49 UTC	avoid errors with soft clipped sequence at the beginning of reference This leads to errors typically in chrM, as the circular nature of this chromosome means many reads are mapped with soft clips at position 0.	19 April 2013, 15:32:49 UTC
6989b9e	Erik Garrison	15 April 2013, 16:36:24 UTC	set the haplotype calling window with --haplotype-length This is a synonym for --max-complex-gap, but I wanted to ensure that users were clear on the meaning of this parameter. If you want to call haplotypes, then increase --haplotype-length. (It's 3bp by default.)	15 April 2013, 16:36:24 UTC
cd21fa4	Erik Garrison	09 April 2013, 10:58:54 UTC	indicate that 0.9.9 is new stable revision	09 April 2013, 10:58:54 UTC
c993c5c	Erik Garrison	23 March 2013, 05:12:04 UTC	allow detection of long indels Long deletions were filtered out by legacy code which considered gaps as mismatches.	23 March 2013, 05:12:04 UTC
d0c1f12	Erik Garrison	30 January 2013, 00:03:43 UTC	turn off genotype qualities by default Genotype qualities (reported as GQ in the output) are marginal likelihoods of the specific genotype for a specific sample given the data and Bayesian model. They may be helpful for filtering or assessing genotyping accuracy, but they take a lot of time to compute because the current method for estimating them is O(N^2) in the number of samples. For more than 10 low-coverage samples, GQ estimation becomes the dominant use of compute time. For 1000 samples, GQ estimation is 90% of compute time. Prior to this commit it was possible to disable them using --no-marginals. I've removed this cryptically named parameter, set GQ estimation off by default, and added --genotype-qualities, which turns them back on. Users who wish to fill out the GQ field (old behavior) must provide --genotype-qualities.	30 January 2013, 00:03:43 UTC
cc16f93	Erik Garrison	29 January 2013, 20:13:29 UTC	--no-marginals now means "no-GQ's"	29 January 2013, 20:13:29 UTC
e85de7b	Erik Garrison	28 January 2013, 23:16:01 UTC	version 0.9.9, set as default "mappability" priors freebayes can estimate the probability that the locus in question is accurately mapped using a number of features extracted from read placement and distribution among samples. This framework effectively extends the basic Bayesian formulation from P(genotypes | reads) to P(genotypes, properly-mapped alleles | reads). As such, the QUAL value must be understood to incorporate expectations about mappability derived from observation features such as allele balance, strand bias, and read placement relative to the allele. This commit sets this model on by default. To turn OFF this behavior, use -wVa or: --hwe-priors-off \ --binomial-obs-priors-off \ --allele-balance-priors-off Extensive testing showed that this combination of parameters provided excellent sensitivity and specificity at all levels of genomic coverage and numbers of samples. The largest improvement in performance is for low-coverage resequencing (<5x coverage, >1000 samples) experiments. Higher-coverage experiments, where data tends to overwhelm priors, should not be affected.	28 January 2013, 23:16:01 UTC
d01982a	Erik Garrison	28 January 2013, 22:37:32 UTC	pooled frequency-based calling (and nan guard) Separate --pooled into --pooled-discrete (old behavior) and --pooled-continuous. In the continuous case, allele observation characteristics are reported for all alleles which passed the input filters (default -F 0.2 -C 2). Pooled continuous calling does not modify the Bayesian algorithm, and is effectively orthogonal to other parameters. For instance, --pooled-discrete and --pooled-continuous can be specified together. The called genotypes will be affected by the --ploidy setting and --pooled-discrete flags, but the output will reflect all observed alleles passing input filters at the site. Also, guard against nan's in output (Utility.cpp).	28 January 2013, 22:37:32 UTC
36fe0be	Erik Garrison	28 January 2013, 20:53:23 UTC	use of big number library for improved numerical precision (ttmath) Removes the QUAL limit of 50000, more experimentation may be required to apply the method to the marginal genotype quality calculations.	28 January 2013, 20:53:23 UTC
0595cc5	Erik Garrison	28 January 2013, 02:59:19 UTC	change to help text to reflect region spec change	28 January 2013, 02:59:19 UTC
8bb6181	Erik Garrison	28 January 2013, 02:57:24 UTC	resolve targeting issue The last position in a region was being excluded. This ensures that the entire target is processed. The documentation is updated to reflect this change. scripts/fasta_generate_regions.py will now make completely covering regions.	28 January 2013, 02:57:24 UTC
d84ba44	Erik Garrison	27 January 2013, 18:01:46 UTC	track total genotyping iterations, change iteration defaults	27 January 2013, 18:01:46 UTC
f3e5186	Erik Garrison	06 January 2013, 18:15:29 UTC	improve scaling of probabilities, resolve #42 When using --allele-balance-priors, --binomial-obs-priors, scale probabilities according to the number of possible observation permutations. Resolve #42 by preventing use of soft-clipped sequence at the beginning of the reference.	06 January 2013, 18:15:29 UTC
8c2bb94	Erik Garrison	04 January 2013, 17:12:11 UTC	resolve #43, add segfault handler In #43, challisd reports a segfault when using input alleles. This was caused by 0-length allele artifacts generated when parsing the input VCF. Additionally, when compiled in debug mode, the segfault handler will now print a stacktrace.	04 January 2013, 17:12:11 UTC
eda4b69	Erik Garrison	20 December 2012, 12:08:19 UTC	actually set defaults in Parameters.cpp Sets -C 2 -F 0.2 by default.	20 December 2012, 12:08:19 UTC
f8d78ff	Erik Garrison	19 December 2012, 17:28:13 UTC	set default input filters (-C 2 -F 0.2) In testing, these input filters on the minimum support for a given allele have been found to provide a very good balance between sensitivity and specificity, reducing the need for users to place complex filters on their VCF output. We use them by default in our work in the 1000 Genomes project low-coverage (4-6x) data. However, they may not be ideal in polyploid or pooled systems or low-frequency somatic variant detection, so users working in such contexts should set them to a level appropriate for their needs.	19 December 2012, 17:28:13 UTC
48962c8	Erik Garrison	18 December 2012, 13:10:38 UTC	version 0.9.8	18 December 2012, 13:10:38 UTC
84bf532	Erik Garrison	18 December 2012, 13:01:23 UTC	add empirical allele observation bias adjustment table By specifying --observation-bias users may provide a table which describes the empirical mapping bias against alleles given the number of bases subtracted or added between the allele and the reference. This is intended to improve genotype likelihood (GL) estimates and downstream imputation and processing of these likelihoods.	18 December 2012, 13:01:23 UTC
61b7bbc	Erik Garrison	06 December 2012, 15:15:35 UTC	use cast to get correct call to max(...) To resolve issue reported by C here: http://blog.gkno.me/post/29962850248/getting-started-with-gkno#disqus_thread	06 December 2012, 15:15:35 UTC
8af4379	Erik Garrison	02 December 2012, 22:25:14 UTC	guard against soft-clip edge cases Soft clips can occur where there is no reference sequence. When generating the allele do not process the reference sequence.	02 December 2012, 22:25:14 UTC
0e3f75b	Erik Garrison	10 October 2012, 15:58:28 UTC	cache only needed sequence, cleanup repeat detection Indeed, freebayes was holding onto unneeded reference sequence. This closes a long-standing issue. Also, cleans up repeat edge detection issues.	10 October 2012, 15:58:28 UTC
65e689c	Erik Garrison	10 October 2012, 12:27:00 UTC	bump, attempting to fix github state The last commit is not reflected in github, but is possible to obtain by cloning. This is an attempt to resolve the mismatch between github's overview and the repository.	10 October 2012, 12:27:00 UTC
b950138	Erik Garrison	09 October 2012, 14:29:03 UTC	reference most recent stable revision Users encountering bugs with the development version can revert to the most recent stable version. This version will be updated in the README as development continues.	09 October 2012, 14:29:03 UTC
f16e2bb	Erik Garrison	05 October 2012, 15:25:09 UTC	remove errant debugging messages and force exit	05 October 2012, 15:25:09 UTC
0995bd8	Erik Garrison	02 October 2012, 07:33:20 UTC	build haplotypes across repeats (version 0.9.7) When an indel is based on underlying repeat structure, record the right boundary of the repeat in the reference (technically, the first base past the repeat) in the indel's Allele structure. When building haplotype alleles during genotyping, assemble across the repeat, requiring, for instance, reference-matching reads to cover the entire repeat sequence.	02 October 2012, 07:33:20 UTC
03cb231	Erik Garrison	26 September 2012, 12:40:20 UTC	add freebayes parallelization script	26 September 2012, 12:40:20 UTC
39e625c	Erik Garrison	18 September 2012, 17:15:15 UTC	resolution of issues related to directed haplotyping	18 September 2012, 17:15:15 UTC
8351d54	Erik Garrison	09 September 2012, 15:54:21 UTC	updated vcflib	09 September 2012, 15:54:21 UTC
0f20f17	Erik Garrison	09 September 2012, 15:48:05 UTC	bugfix for haplotype basis alleles	09 September 2012, 15:48:05 UTC
9608597	Erik Garrison	20 August 2012, 12:59:48 UTC	example pipeline script This script is suitable for large (>1000 sample) processing jobs that are broken down by genomic region.	20 August 2012, 12:59:48 UTC
7677631	Erik Garrison	15 August 2012, 22:49:55 UTC	resolve #22 In this case the problem was that the cached reference sequence window is not updated before the first time that the "current reference base" is acquired. This leads to garbage in the allele tags used internally, resulting in an out-of-range error.	15 August 2012, 22:49:55 UTC
20fd465	Erik Garrison	26 July 2012, 20:25:39 UTC	reference to arXiv:1207.3907	26 July 2012, 20:25:39 UTC
0132d1e	Erik Garrison	26 July 2012, 20:10:49 UTC	remove assertion (and thus exit) on alt == ref However, the warning will still be triggered.	26 July 2012, 20:10:49 UTC
5b17936	Erik Garrison	30 May 2012, 19:10:29 UTC	ignore (and signal errors) when out-of-order alignments are detected	30 May 2012, 19:10:29 UTC
88afddf	Erik Garrison	18 May 2012, 01:34:24 UTC	resolve #30, multiple alternates with same sequence This appears to be caused by Ns in the read sequence, which were not being parsed properly in some cases. Proper handling of these bases should resolve the issue.	18 May 2012, 01:34:24 UTC
ba4fb65	Erik Garrison	16 May 2012, 19:30:12 UTC	remove default mapping quality and base quality restrictions The mis-estimation of mapping quality causes a lot of problems for users. Largely, these issues can be resolved by removing the default input filters in freebayes. A better method of incorporating mapping quality into the analysis is to generate genotype likelihoods using the minimum of base quality and mapping quality. This can be enabled by providing the --use-mapping-quality flag on the command line.	16 May 2012, 19:30:12 UTC
9696d0c	Erik Garrison	30 April 2012, 19:35:47 UTC	fix allele misclassification bug	30 April 2012, 19:35:47 UTC
3f0ae56	Erik Garrison	27 April 2012, 19:37:23 UTC	haplotype basis alleles By specifying a set of haplotype basis alleles, phasing information can be established in long reads even in the presence of high error rates. The haplotype basis allele input is used to select the alleles which will be phased. Other allelic primitives will be ignored by replacement with the reference allele.	27 April 2012, 19:37:23 UTC
a464833	Erik Garrison	27 March 2012, 20:04:22 UTC	really resolve #17 Use eof() to check for string/variable conversion in convert.h instead of tellg(), which behaves correctly according to the C++ spec as of gcc 4.6.2, returning -1 when eof() is set and when there is an error.	27 March 2012, 20:04:22 UTC
81c3ec5	Erik Garrison	26 March 2012, 18:48:53 UTC	resolve #17 Per http://stackoverflow.com/questions/6552876/file-stream-tellg-tellp-and-gcc-4-6-is-this-a-bug tellg(): (27.6.1.3) After constructing a sentry object, if fail() != false, returns pos_type(-1) to indicate failure. Otherwise, returns rdbuf()->pubseekoff(0, cur, in).	26 March 2012, 18:48:53 UTC
df23b3f	Erik Garrison	07 February 2012, 22:39:17 UTC	resolve bug #25 During non-targeted analysis of an entire reference sequence, freebayes would fail to process positions after the first reference sequence. This resolves the issue.	07 February 2012, 22:39:17 UTC
a6943d9	Erik Garrison	03 February 2012, 00:25:04 UTC	bamtools update	03 February 2012, 00:25:04 UTC
8c35b17	Erik Garrison	03 February 2012, 00:05:55 UTC	update documentation to describe variant input behavior	03 February 2012, 00:05:55 UTC
32b9693	Erik Garrison	31 January 2012, 22:41:29 UTC	remove requirement of PL (sequencing technology) tag	31 January 2012, 22:41:29 UTC
a2db81c	Erik Garrison	19 January 2012, 18:21:05 UTC	Revert "update of bamtools" This reverts commit 3cb41894c3850a863a8d66cca51d3c3da4d4961e.	19 January 2012, 18:21:05 UTC
2cdddfc	Erik Garrison	19 January 2012, 18:20:41 UTC	Revert "update submodules, vcflib and bamtools" This reverts commit a3b707ef2a9ef4174a7b16a61a996b822973219f. Conflicts: bamtools	19 January 2012, 18:20:41 UTC
3cb4189	Erik Garrison	18 January 2012, 23:57:52 UTC	update of bamtools	18 January 2012, 23:57:52 UTC
a3b707e	Erik Garrison	18 January 2012, 23:05:57 UTC	update submodules, vcflib and bamtools	18 January 2012, 23:05:57 UTC
31cffd8	Erik Garrison	05 January 2012, 23:53:24 UTC	resolve https://github.com/ekg/freebayes/issues/26 This issue arose due to a split indel allele generated in the haplotype creation step. The issue is resolved by removing such alleles from analysis at a prior stage.	05 January 2012, 23:53:24 UTC
094879e	Erik Garrison	04 January 2012, 20:47:10 UTC	resolves https://github.com/ekg/freebayes/issues/22 This involved errors when producing VCF output with complex alleles.	04 January 2012, 20:47:10 UTC
4fd14bb	Erik Garrison	15 December 2011, 22:04:34 UTC	minor adjustments to handle BAMs produced by Complete Genomics Some CG BAM records are not processable using our system, and so they must be ignored. This commit ensures proper handling of these cases.	15 December 2011, 22:04:34 UTC
5693c69	Erik Garrison	14 December 2011, 23:15:09 UTC	ignore indel alleles which are not flanked by invariant sequence In its present design, the detection model used by freebayes cannot handle ambiguous alleles. One notable class of these are indels described at the beginning and end of alignments, as it is not guaranteed that these are fully defined. This commit excludes these alleles from analysis.	14 December 2011, 23:15:09 UTC
d149da7	Erik Garrison	07 December 2011, 21:32:27 UTC	resolve segfault when enumerating polyploid genotype likelihoods For the time being, I am removing the genotype likelihood output for the polyploid model. The ordering of genotypes for polyploid data is not specified in the VCF 4.1 spec.	07 December 2011, 21:32:27 UTC
152bf35	Erik Garrison	07 December 2011, 15:51:34 UTC	allele frequency input priors	07 December 2011, 15:51:34 UTC
474c9e1	Erik Garrison	18 November 2011, 16:34:46 UTC	use intervaltree for in-target detection Eventually this will allow the selection of a set of (possibly overlapping) targets when reading data from stdin.	18 November 2011, 16:34:46 UTC
2f4c924	Erik Garrison	17 November 2011, 14:25:05 UTC	add intervaltree submodule	17 November 2011, 14:25:05 UTC
9b42dc8	Erik Garrison	16 November 2011, 19:06:03 UTC	output bug with adjusted haplotypes Errant overwriting of refbase caused breakage of downstream GT and GL output functions.	16 November 2011, 19:06:03 UTC
0a912f9	Erik Garrison	16 November 2011, 18:04:08 UTC	resolves segfault in the context of a read N matching a reference N	16 November 2011, 18:04:08 UTC
64c2a9a	Erik Garrison	10 November 2011, 21:48:16 UTC	retain a flanking base when reporting adjusted biallelic indels This bug produced "SEQ"/"" calls with empty alternate sequences, which violates the VCF spec.	10 November 2011, 21:48:16 UTC
47a4513	Erik Garrison	07 November 2011, 01:45:22 UTC	remove errant -1 Causes truncation error with haplotype allele printing.	07 November 2011, 01:45:22 UTC
f4207a3	Erik Garrison	27 October 2011, 23:46:39 UTC	clean up reporting of ref/alt pairs with matching start and end sequence Prevents reporting lots of extra matching sequence on haplotype-based alleles.	27 October 2011, 23:46:39 UTC
f47c3da	Erik Garrison	27 October 2011, 21:19:18 UTC	version 0.9.4 Haplotype calling cleanup.	27 October 2011, 21:19:18 UTC
6543587	Erik Garrison	27 October 2011, 15:55:29 UTC	homogenize alternate alleles at haplotype loci Depending on sequence context, a complex allele which is 1M5D1M1D is potentially the same as a deletion allele 2M6D. This equivalence can be established by comparing the alternate sequences for a given reference-relative haplotype. The most-commonly-observed alignment is used to adjust the cigars for identical but differentially described alternate alleles.	27 October 2011, 15:55:29 UTC
d404615	Erik Garrison	26 October 2011, 23:10:11 UTC	fix VCF fields, AA -> AO, RA -> RO AA is reserved for another use. Also, resolves mistake with previous bugfix.	26 October 2011, 23:10:11 UTC
8115c27	Erik Garrison	26 October 2011, 21:54:38 UTC	inbreeding coefficient calculations in python	26 October 2011, 21:54:38 UTC
7776445	Erik Garrison	26 October 2011, 21:48:02 UTC	fix haplotype breakage across MNPs	26 October 2011, 21:48:02 UTC
addf5a0	Erik Garrison	19 October 2011, 22:31:37 UTC	allow unsetting the genotyping max banddepth This is done via "--genotyping-max-banddepth 0".	19 October 2011, 22:31:37 UTC
75ef778	Erik Garrison	19 October 2011, 14:45:48 UTC	combine homozygous combos across populations This is required for proper normalization of site QUAL, as it depends on the present homozygous genotypings by definition.	19 October 2011, 14:45:48 UTC
3bfe993	Erik Garrison	18 October 2011, 01:24:10 UTC	fix bugs with population subdivision	18 October 2011, 01:24:10 UTC
f37c427	Erik Garrison	17 October 2011, 23:08:45 UTC	population subdivisions These changes allow the subdivision of the input samples into sub-populations. The sub-populations are assumed to be inbreeding, selectively neutral, random samples. Provided this, the model is evaluated for each population independently. The sub-population model assumes independence among the populations. At present, mutual information is shared between populations only in the sense that alleles and genotypes evaluated for one population are evaluated for all. Populations may be specified using a file mapping sample names to populations. The command-line flag is --populations.	17 October 2011, 23:08:45 UTC
8ee5c9d	Erik Garrison	14 October 2011, 22:28:30 UTC	minor README update	14 October 2011, 22:28:30 UTC
642d44f	Erik Garrison	14 October 2011, 05:07:48 UTC	fix bugs related to the input of complex alleles And, version 0.9.3!	14 October 2011, 05:07:48 UTC
a3e256f	Erik Garrison	12 October 2011, 15:53:50 UTC	add discrete HWE sampling probability of genotyping to VCF Also, rationalize het sample all observation count, used in some other VCF INFO fields.	12 October 2011, 15:53:50 UTC
dd97e73	Erik Garrison	10 October 2011, 22:38:07 UTC	fix AB, add MEANALT, fix haplotype generation bug	10 October 2011, 22:38:07 UTC
900d8e5	Erik Garrison	10 October 2011, 04:49:40 UTC	resolves haplotype allele construction bug Inappropriate amplification.	10 October 2011, 04:49:40 UTC
623efe4	Erik Garrison	05 October 2011, 00:01:50 UTC	exclude alleles with no reference sequence These are generated in the process of haplotype construction. Call them chaff; the alleles which they have been carved out of could not pass relatively minimal filter cutoffs, and as such are carved up. It's very unlikely that they are significant, and reconstructing them properly would require a lot of code adjustment and probably would not result in better performance.	05 October 2011, 00:01:50 UTC
83f24b6	Erik Garrison	04 October 2011, 22:43:19 UTC	re-enable DPRA (depth reference alternate ratio)	04 October 2011, 22:43:19 UTC
d5d90a3	Erik Garrison	30 September 2011, 02:43:43 UTC	version 0.9.2	30 September 2011, 02:43:43 UTC
8e9d51c	Erik Garrison	30 September 2011, 02:23:48 UTC	code cleanup, performance enhancement Add an upper bound for the depth of integration (default 6 best genotypes, sorted by data likelihood) for each sample. This caps the amount of computation at complex multiallelic sites.	30 September 2011, 02:23:48 UTC
648e845	Erik Garrison	29 September 2011, 18:33:12 UTC	fix AB calculations for multiallelics	29 September 2011, 18:33:12 UTC
74f7171	Erik Garrison	28 September 2011, 05:24:12 UTC	remove broken ts/tv tagging	28 September 2011, 05:24:12 UTC
c370968	Erik Garrison	27 September 2011, 23:38:27 UTC	fix bug with homozygous convergence case	27 September 2011, 23:38:27 UTC
32a7145	Erik Garrison	27 September 2011, 16:37:54 UTC	use the null allele and genotype when excluding unobserved genotypes Only attempt to add the null allele in the case of --exclude-unobserved-genotypes.	27 September 2011, 16:37:54 UTC
d0a70df	Erik Garrison	26 September 2011, 23:13:37 UTC	performance improvements 1) Introduce the concept of a null allele. This is used in the place of all the other potential alleles at the site when calculating genotype likelihoods for a given sample. If the sample does not have any observations for a given alternate allele, we just ignore it. The likelihoods for such genotypes are then provided by matching the genotype to one in which the missing allele is replaced by a null allele. The benefit of this is that we dramatically reduce the number of potential genotype combinations which we have to evaluate when searching the posterior space for the maximum likelihood solution. This is done without any serious change to the algorithm design, and allows marginals to be calculated without issue. 2) Cache binomial calculations. This provides a 15% speedup when using --binomial-obs-priors. 3) Don't store intermediate genotype combination results. Doing so causes severe memory blowups. (I'm also considering changing the GenotypeCombo to a vector<short>, and including some kind of genotype ptr -> short int mapping for the combo.)	26 September 2011, 23:13:37 UTC
61bffa9	Erik Garrison	22 September 2011, 15:01:38 UTC	limit size of factorial cache	22 September 2011, 15:01:38 UTC
8a0e27b	Erik Garrison	22 September 2011, 00:12:29 UTC	add null alleles to handle N's in reads Additionally, this adjusts the way that some complex alleles are generated, such as those flanking N bases in reads. Also, cleanup of some logic in the AlleleParser::getNextAlleles code.	22 September 2011, 00:12:29 UTC
4957b5c	Erik Garrison	21 September 2011, 15:29:48 UTC	fix haplotype allele generation bug Resolves an off-by-one error in the haplotype generation code.	21 September 2011, 15:29:48 UTC
42eb578	Erik Garrison	20 September 2011, 17:58:56 UTC	haplotype-based detection This commit enables correct evaluation of variant loci with multi-base alleles by combining variant alleles into dynamically-sized haplotype alleles. These haplotypes, or phased sets of alleles, are tagged as "complex" in the VCF output. Some minor issues remain following this commit: 1) the reported CIGAR strings for SNPs are sometimes incorrect, as when the SNP lies at the first base in a multi-base allele, 2) freebayes cannot yet take complex alleles as --variant-input; they will be broken into their constituent alleles.	20 September 2011, 17:58:56 UTC
0f42d54	Erik Garrison	15 September 2011, 21:29:03 UTC	ensure proper future use of allLocalGenotypeCombinations This check allows the use of this function in the case that it is used to add to a previous set of genotype combinations.	15 September 2011, 21:29:03 UTC
e55cd26	Erik Garrison	15 September 2011, 21:26:36 UTC	resolve haploid genotyping bug Due to a recent change in the way that the reference allele is handled, in some cases it was possible that the best genotype combination was not evaluated when calculating marginal genotype likelihoods. As a result, genotypes were frequently mis-called in the case of two haploid samples. This resolves the bug by ensuring that the best genotype combination is added to the set of combinations that are evaluated.	15 September 2011, 21:26:36 UTC
fd4184c	Erik Garrison	09 September 2011, 00:01:16 UTC	resolve ref allele bug	09 September 2011, 00:01:16 UTC
47df665	Erik Garrison	08 September 2011, 17:18:37 UTC	allow complex alleles to have embedded matching sequence (v0.9.0) With this change, complex alleles are generated for cases where two small variants in the same read occur at most --max-complex-gap bases apart (3bp, by default). This allows for the detection of MNPs (multi-nucleotide polymorphisms) with embedded matching bases. The behavior can be disabled by setting --max-complex-gap 0. Also, this commit adds a new tag to the VCF output, CIGAR, which provides the CIGAR strings of the variants, allowing for post-hoc filtering of certain classes of complex variants.	08 September 2011, 17:18:37 UTC
3d097bf	Erik Garrison	30 August 2011, 15:49:54 UTC	more cleanup for bamtools API integration	30 August 2011, 15:49:54 UTC
4b641f4	Erik Garrison	30 August 2011, 15:29:20 UTC	add libbamtools.a to build commands Without this, users would need libbamtools.so in their shared object search path.	30 August 2011, 15:29:20 UTC
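
Note on commit 6543587 (homogenize alternate alleles at haplotype loci): the message says that a 1M5D1M1D allele and a 2M6D allele can be the same allele, depending on sequence context, and that this is established by comparing alternate sequences. The following is a minimal Python sketch of that comparison, not the freebayes implementation. The helper name apply_cigar and the example reference strings are hypothetical, and only M and D operations are handled.

```python
import re


def apply_cigar(reference: str, cigar: str) -> str:
    """Return the alternate sequence implied by an M/D-only CIGAR.

    Hypothetical helper for illustration: M copies reference bases into
    the alternate sequence, D skips them. Both consume the reference.
    """
    alt = []
    pos = 0
    for length, op in re.findall(r"(\d+)([MD])", cigar):
        n = int(length)
        if op == "M":
            alt.append(reference[pos:pos + n])
        pos += n  # M and D both advance along the reference
    return "".join(alt)


# Made-up 8 bp reference window in which base 2 and base 7 are both C,
# so the two descriptions yield the same alternate sequence.
same_context = "ACGTTTCG"
equivalent = apply_cigar(same_context, "1M5D1M1D") == apply_cigar(same_context, "2M6D")

# Made-up window in which base 7 is G instead, so the descriptions differ.
other_context = "ACGTTTGG"
distinct = apply_cigar(other_context, "1M5D1M1D") != apply_cigar(other_context, "2M6D")

print(equivalent, distinct)
```

In the first window both CIGARs reduce the 8 reference bases to the same 2-base alternate, which is the kind of equivalence the commit describes. Choosing the most commonly observed alignment as the canonical CIGAR, as the commit does, is not shown here.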
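
Note on commit e73c3e5 (partial haplotype observations): the message says that observations which partially support several haplotypes have their probability mass divided amongst the alleles they support. The Python sketch below illustrates that idea under a loud assumption: the mass is split equally among the supported alleles. The commit message does not state the weighting freebayes uses, and the function name and example alleles are hypothetical.

```python
from collections import defaultdict


def divide_partial_support(observations):
    """Accumulate per-allele support from full and partial observations.

    Each observation is the set of haplotype alleles a read is consistent
    with. ASSUMPTION: a read's unit of mass is split equally among those
    alleles; the actual freebayes weighting may differ.
    """
    weights = defaultdict(float)
    for supported in observations:
        if not supported:
            continue  # a read consistent with no allele contributes nothing
        share = 1.0 / len(supported)
        for allele in supported:
            weights[allele] += share
    return dict(weights)


# Made-up reads: one spans the whole haplotype and supports only ACG,
# one is partial and consistent with both alleles, two support only ATG.
reads = [{"ACG"}, {"ACG", "ATG"}, {"ATG"}, {"ATG"}]
support = divide_partial_support(reads)
print(support)
```

Under this equal-split assumption the total mass equals the number of reads, so partial reads add evidence without being counted twice.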
