Revision history - refs/tags/v0.9.14 - origin: https://github.com/ekg/freebayes

Revision	Author	Date	Message	Commit Date
698098e	Erik Garrison	03 March 2014, 14:43:15 UTC	Setting Release-Version v0.9.14	03 March 2014, 14:43:15 UTC
c04d877	Erik Garrison	03 March 2014, 14:35:11 UTC	resolve bug in implementation of Ewens' Sampling Formula Thanks to Severine Catreux for the catch. This should have the largest effect on sites that are multiallelic. The quality of these will be diminished slightly, in line with the bug. This may reduce the overall power to detect variants at multiallelic loci. It is possible that this is not the most-correct way to utilize the ESF in freebayes, as the assumption of a constant mutation rate for each locus in the genome is inadequate. At some loci the mutation rate is orders of magnitude higher than elsewhere. Problematically, these are the same places where Illumina data (and any PCR-based library) tends to present artifacts.	03 March 2014, 14:35:11 UTC
a830efd	Erik Garrison	19 February 2014, 17:35:48 UTC	fix targeting issue in vcflib, add some more info to allele obs debugging	19 February 2014, 17:35:48 UTC
0e8c2f2	Erik Garrison	19 February 2014, 14:41:27 UTC	add warning that region cannot be set on haplotype basis file	19 February 2014, 14:41:27 UTC
dc0f063	Erik Garrison	19 February 2014, 00:37:29 UTC	Merge pull request #66 from andersje/git_to_http changed git:// to https://	19 February 2014, 00:37:29 UTC
4e21366	andersje	18 February 2014, 22:10:54 UTC	changed git:// to https://	18 February 2014, 22:10:54 UTC
c807ef8	Erik Garrison	14 February 2014, 22:00:52 UTC	another update to version Make sure it sticks.	14 February 2014, 22:00:52 UTC
f2efa6d	Erik Garrison	14 February 2014, 21:57:37 UTC	Setting Release-Version v0.9.13	14 February 2014, 21:57:37 UTC
c5c8aa8	Erik Garrison	14 February 2014, 21:56:19 UTC	add check that submodules are downloaded to makefile	14 February 2014, 21:56:19 UTC
d298b4e	Erik Garrison	14 February 2014, 21:40:37 UTC	updated version	14 February 2014, 21:40:37 UTC
21a2951	Erik Garrison	14 February 2014, 21:39:33 UTC	Setting Release-Version v9.9.13	14 February 2014, 21:39:33 UTC
dfddc43	Erik Garrison	24 January 2014, 17:12:04 UTC	Merge pull request #62 from pmarks/master fix crasher in homogenizeAllele. use map.rbegin() to the entry with the...	24 January 2014, 17:12:04 UTC
e1203fc	Erik Garrison	24 January 2014, 17:05:52 UTC	resolve #63 by update to documentation	24 January 2014, 17:05:52 UTC
dda84cb	Patrick Marks	23 January 2014, 06:00:35 UTC	fix crasher in homogenizeAllele. use map.rbegin() to the entry with the largest key	23 January 2014, 06:00:35 UTC
a97dbf8	Erik Garrison	16 January 2014, 23:29:42 UTC	Setting Release-Version v9.9.11	16 January 2014, 23:29:42 UTC
e8e862c	Erik Garrison	16 January 2014, 23:29:18 UTC	Setting Release-Version v9.9.11	16 January 2014, 23:29:18 UTC
53e147a	Erik Garrison	16 January 2014, 23:28:16 UTC	handle = and X in alignment cigars The handling at present isn't intelligent, but by treating this in the same way as 'M' we can properly parse CIGARs which have them. Better would be to skip some of the comparison logic during parsing.	16 January 2014, 23:28:16 UTC
d15f1e6	Erik Garrison	16 January 2014, 23:27:47 UTC	Setting Release-Version v9.9.11	16 January 2014, 23:27:47 UTC
90b7027	Erik Garrison	15 January 2014, 22:48:14 UTC	output ALT='.' in --report-monomorphic	15 January 2014, 22:48:14 UTC
ffbc611	Erik Garrison	15 January 2014, 22:42:01 UTC	resolve inconsistencies in vcflib from previous commit	15 January 2014, 22:42:01 UTC
103d814	Erik Garrison	15 January 2014, 22:32:06 UTC	resolve #59 Set an internal reference sample name to avoid polluting the CNV map table with the reference sequence names, which can sometimes intersect with actual sequence names (e.g. 1, 2, 3... can both be sequence names and sample names).	15 January 2014, 22:32:06 UTC
ff98393	Erik Garrison	25 December 2013, 21:37:49 UTC	... done	25 December 2013, 21:37:49 UTC
0e8e881	Erik Garrison	25 December 2013, 21:37:13 UTC	completes previous commit	25 December 2013, 21:37:13 UTC
350ab0b	Erik Garrison	25 December 2013, 21:36:20 UTC	removed horizontal lines from README.md github already has these! Cool.	25 December 2013, 21:36:20 UTC
294c1f0	Erik Garrison	25 December 2013, 21:33:39 UTC	extensive updates to documentation/manual (README.md) freebayes now has a much better manual! Please provide feedback and keep the questions coming! Happy holidays to all.	25 December 2013, 21:33:39 UTC
3c54afe	Erik Garrison	18 December 2013, 23:19:58 UTC	remove errant warning message "What is this?" ... shouldn't happen.	18 December 2013, 23:19:58 UTC
47a713e	Erik Garrison	18 December 2013, 20:17:53 UTC	change behavior of --report-genotype-likelihood-max This does not affect typical operation vs. the previous commit.	18 December 2013, 20:17:53 UTC
6aa6591	Erik Garrison	18 December 2013, 16:11:45 UTC	avoid and resolve #6 Some aligners will place reads past the ends of short reference sequences that aren't flanked by Ns. Before this commit, freebayes would choke on this kind of issue, and say it was "Unable to read reference sequence base past end of current reference sequence." Now, it just registers the error and continues, as it probably always should have.	18 December 2013, 16:11:45 UTC
9ed353c	Erik Garrison	10 December 2013, 00:25:15 UTC	version 0.9.10 This begins the v1 release candidate series. The change in version numbering style recognizes the need to move the version past 0.9.9! However, version 1.0 is meant to coincide with paper publication or submission. Soon! A large number of small changes land in this version, including changes to the genotype likelihood calculations: * Support for correct genotyping of large deletions. This fix involved correcting handling of "partial" observations of the reference allele. * Default use of mapping quality in genotype likelihood calculations (this is disabled via the --standard-gls flag). Mapping qualities are incorporated via the Effective Base Depth metric from snpTools (Baylor). * Exclusion of mapping quality 0 mappings. (This can be disabled with --min-mapping-quality -1 --standard-gls). The change in genotype likelihood calculations means these are not considered meaningful. These three changes improved the ROC-AUC for indels in simulation by about 1%, which is substantial given that it now stands at 0.95. No difference was recorded for SNPs. The commit also includes a change relevant to the output of variant records. * Removal of variant parsing routines meant to standardize VCF records by removing redundant information from the ALT/REF pairs. These manipulation routines would, for example, remove portions of the REF and ALTs which were always matching. The canonical case would be a haplotype call in which only a SNP was ultimately called. The call would occur over several bp of reference. The manipulation is complex and caused errors in many instances, which leads to many maintenance issues and sad users. At present, VCF normalization is best-handled by external utilities, such as vcfallelicprimitives from vcflib. Retaining the haplotype information in the call in this way clarifies which calls were attempted haplotype calls and which weren't, regardless of output state.	10 December 2013, 00:25:15 UTC
9755e43	Erik Garrison	10 December 2013, 00:24:08 UTC	Setting Release-Version v0.9.10	10 December 2013, 00:24:08 UTC
5d5b8ac	Erik Garrison	15 November 2013, 20:56:30 UTC	updated vcflib, fix ordering of flags to linker	15 November 2013, 20:56:30 UTC
8a98f11	Erik Garrison	13 November 2013, 22:24:14 UTC	ensure allele homogenization works when homogenizing to reference allele	13 November 2013, 22:24:14 UTC
4f2fe5e	Erik Garrison	03 November 2013, 17:02:17 UTC	SAR and SAF should be Number=A in the header Thanks to Rajgopal Srinivasan for catching this.	03 November 2013, 17:02:17 UTC
78714b8	Erik Garrison	31 October 2013, 23:32:02 UTC	ignore un-flanked deletions at the beginning and end of alignments Some aligners will report deletions at the beginning of alignments. It is not clear how to interpret these cases because, without flanking sequence in the read, the indication of a relative deletion in the read is meaningless. For instance, a cigar for a 70bp read could be 1D70M. What base in the read has been deleted? This resolves some bugs which generate errors like this: deletion... alt is empty	31 October 2013, 23:32:02 UTC
7e198dc	Erik Garrison	16 October 2013, 19:37:49 UTC	avoid mixing true full and partial observations Resolves common error mode causing "ref is same as alt" bug.	16 October 2013, 19:37:49 UTC
c283d6d	Erik Garrison	10 October 2013, 13:15:04 UTC	proper positional progression, resolves issue #52 A flag which indicated change in target region was not respected by the previous commit.	10 October 2013, 13:15:04 UTC
e315866	Erik Garrison	09 October 2013, 02:26:40 UTC	further improvements to indel (and SNP) detection A further 1% improvement in AUC for indels, and 0.5% improvement in AUC for SNPs. Changes in this commit are not yet well-optimized, and users should be aware that runtime performance here will be slower than in previous commits. (3x slower for 100 10X samples.) Subsequent commits will focus on reducing runtime.	09 October 2013, 02:26:40 UTC
60622b5	Erik Garrison	08 October 2013, 12:57:46 UTC	checkpoint, resolution of haplotype breakage issue (a->empty() errors) There are still some outstanding concerns with this commit, but a checkpoint is necessary as this code state has excellent performance against indels, another 1% better than the previous commit.	08 October 2013, 12:57:46 UTC
011561f	Erik Garrison	03 October 2013, 20:08:37 UTC	handling of large variants Maintain the registered alleles set correctly.	03 October 2013, 20:08:37 UTC
10ac8d4	Erik Garrison	30 September 2013, 23:18:45 UTC	resolve #51 Slicing and dicing haplotype observations could lead to (erroneous) divided indels. Avoid these using a suitable guard in the haplotype observation generation process (fithaplotype).	30 September 2013, 23:18:45 UTC
8e44a20	Erik Garrison	28 September 2013, 06:03:14 UTC	indel genotype likelihoods This commit resolves a number of issues with indel genotype likelihoods. First, although haplotype detection has proceeded across repeat structures in the reference, the same extension was not applied to repeats which were not represented in the reference, and only in reads. This allowed reads supporting the alternate to appear as if they supported the reference, leading to false heterozygote calls for homozygous alternates. Second, in some cases, it may be desirable to call haplotypes across regions of low complexity. --min-repeat-entropy provides this facility, requiring a given number of bits per base of any reference-relative haplotype window built around a repeat structure. Thirdly, improvements in performance were provided by careful handling of partial observations, specifically checking if apparent full-length haplotype observations in fact support alternate haplotypes. Fourthly, "null" observations (portions of reads which are N, variants that are not specified in --haplotype-basis-alleles), are now correctly treated as non-observations. Previously, these were incorrectly coerced to be reference. In all, these changes yield a 2% improvement in the area under the curve for detection in 100 simulated 10x genomes (0.917 -> 0.937). They also eliminate a number of pathological errors in 1000 Genomes. Also! This commit includes bugfixes for invalid memory access errors detected by valgrind.	28 September 2013, 06:03:14 UTC
a290229	Erik Garrison	13 September 2013, 22:02:47 UTC	strand observation counts In response to popular demand, the raw counts of forward and reverse observation count return to the freebayes output. SAF: forward alternate observations SAR: reverse alternate observations SRF: forward reference observations SRR: reverse reference observations A number of infrequently-used variables have been removed (X* mismatch variables, CPG, discrete HWE sampling prob).	13 September 2013, 22:02:47 UTC
c2ea1aa	Erik Garrison	13 September 2013, 21:12:15 UTC	fix reference handling, correctly switch targets Resolves failing updates to cached reference sequence. These generated "alt is the same as the ref" type errors, and monomorphic calls in the VCF. They also could generate spurious haplotype calls. Also affected by this change were "subsequence of zero length or negative offset" errors. Bounds checking is now integrated into the routines in Fasta.cpp, and out-of-bounds returns the null string instead of blowing up. When processing entire BAM files, rather than targets or specified regions, freebayes would wait for the condition that there were no more alignments in order to switch targets. This would cause problems specifically when 1) an alignment reached the end of a sequence, 2) there were still more alignments, 3) no targets were specified. A clarification of the position and target switching logic resolves this issue.	13 September 2013, 21:12:15 UTC
3e07445	Erik Garrison	29 August 2013, 23:54:15 UTC	resolve "...is out of order! expected after..." bug This error message was generated when attempting to gather partial support for haplotypes. If an alignment did not span the haplotype (which is unlikely, because the haplotype calls tend to be short, but happens with reads that are heavily soft-clipped), then this warning would be triggered and the read would be ignored. The adjustment allows for the use of such alignments when generating partial support.	29 August 2013, 23:54:15 UTC
ad718f9	Erik Garrison	23 August 2013, 20:00:20 UTC	fix default reporting of (some) monomorphic loci If you want to report monomorphic loci, use --report-monomorphic.	23 August 2013, 20:00:20 UTC
c3e485d	Erik Garrison	05 August 2013, 14:17:03 UTC	correctly retain observations during partial observation generation, allele balance fix A coding error resulted in observation loss after assembling partial observations. Limit the "probe" length required of all observations when approximating correct allele balance for indels (specifically insertions). This limits the probe length to 50bp, not-configurable. It will be updated to a measure based on the average read length for future-correctness.	05 August 2013, 14:17:03 UTC
cbe555b	Erik Garrison	01 August 2013, 18:06:42 UTC	--report-monomorphic Optionally report all loci for which there are any observations. Also report failing considered alternates with AC=0.	01 August 2013, 18:06:42 UTC
576bc70	Erik Garrison	22 July 2013, 16:17:04 UTC	pooled variant detection improvements * enable use-best-n-alleles The --use-best-n-alleles parameter was previously disabled for non-SNP variation, but this prevents its use as a bound on computational complexity. This occurs readily in the case of multiple --ploidy 20 pools and --pooled-discrete, leading to a combinatorial explosion in possible genotype states. A common result of --pooled-diploid would be the exhaustion of system memory. Users can now safely use --pooled-discrete provided they also use a suitable setting for --use-best-n-alleles. In practice, setting to 5 or lower should be sufficient to prevent memory blowup in most situations. For the time being, I suggest testing with progressively lower settings, or simply setting it as low as you think reasonable. (For pooled experiments focused on SNPs, this would be 2.) This update includes two other fixes: * uppercase reference sequence Uppercasing the reference allele properly resolves an error with FastaReference::getSubSequence (negative length). * no hwe priors for pooled samples The HWE component of the mappability estimate should be turned off in the case of pooled-discrete runs, so it is now turned off when --pooled-discrete is specified.	22 July 2013, 16:17:04 UTC
fbf46fc	Erik Garrison	11 July 2013, 09:38:39 UTC	fix header, set new GL calculations as default "Int" should be "Integer" in the VCF header. The new GL calculations take mapping quality into account by default, so observation probability is given by (1-MQ)*(1-BQ).	11 July 2013, 09:38:39 UTC
64f41db	Erik Garrison	09 July 2013, 17:19:50 UTC	quick fix to previous commit Variable casting issue.	09 July 2013, 17:19:50 UTC
9cd5548	Erik Garrison	09 July 2013, 17:18:10 UTC	LUT for factorials, tweaks to input filtering, QUAL < 0 bugfix And generally, better performance than previous commit.	09 July 2013, 17:18:10 UTC
dce0cb0	Erik Garrison	07 July 2013, 17:24:08 UTC	fix bounding on genotyping iterations	07 July 2013, 17:24:08 UTC
c0cca0b	Erik Garrison	06 July 2013, 12:02:29 UTC	help text update	06 July 2013, 12:02:29 UTC
aa5282c	Erik Garrison	06 July 2013, 11:43:15 UTC	reset version to 0.9.9.2 (rather than v9...)	06 July 2013, 11:43:15 UTC
69f09a5	Erik Garrison	06 July 2013, 11:42:45 UTC	Setting Release-Version v0.9.9.2	06 July 2013, 11:42:45 UTC
0603a6e	Erik Garrison	05 July 2013, 15:56:05 UTC	allow maximum search iterations to be == genotypingMaxIterations	05 July 2013, 15:56:05 UTC
c8bbba1	Erik Garrison	04 July 2013, 16:45:15 UTC	update version_git.h	04 July 2013, 16:45:15 UTC
be06174	Erik Garrison	04 July 2013, 16:41:50 UTC	Setting Release-Version v9.9.2	04 July 2013, 16:41:50 UTC
c2483d7	Erik Garrison	04 July 2013, 16:36:09 UTC	6x performance improvement Remove duplicated genotype search, as the algorithm now always searches deeply. The default --pvar of 0 means "run gradient descent on everything". Fix incorrect (too large type) usage of ttmath. This resolves a major performance bug (30% of runtime) in the previous builds. Remove indel mask vector<bool>, whose aligned copy was occupying a large fraction of runtime (50%). Users employing this method for the removal of artifacts are suggested to look at samtools BAQ, which applies an HMM to incorporate local alignment quality into base quality. Together, these changes increase processing speed in the 1000G release set (2500 samples) by around 6x over the previous commit.	04 July 2013, 16:36:09 UTC
29fa45a	Erik Garrison	03 July 2013, 11:02:58 UTC	remove partial component from RO in VCF output	03 July 2013, 11:02:58 UTC
296164d	Erik Garrison	03 July 2013, 09:59:01 UTC	updated bamtools	03 July 2013, 09:59:01 UTC
67805dd	Erik Garrison	03 July 2013, 09:55:24 UTC	remove spurious N alleles from output Now that haplotype construction occurs at every base, the removal of part-null alleles needs to occur also in the genotypeAlleles routine (which establishes which alleles should be used for genotyping).	03 July 2013, 09:55:24 UTC
7f88f0a	Erik Garrison	02 July 2013, 12:06:38 UTC	parameter changes, --bam-list and --region Allow a bam file list to be provided on the command line. Allow the use of '-' as a region separator in region strings.	02 July 2013, 12:06:38 UTC
3b29e57	Erik Garrison	02 July 2013, 09:21:04 UTC	updated .gitignore, added CNV BED example	02 July 2013, 09:21:04 UTC
5736bf1	Erik Garrison	02 July 2013, 08:55:04 UTC	versioning fix (add version_git.h to dependencies)	02 July 2013, 08:55:04 UTC
d291f99	Erik Garrison	02 July 2013, 08:54:03 UTC	include autoversion in default build path Now, the version will include the git commit id from the repo!	02 July 2013, 08:54:03 UTC
a5a4300	Erik Garrison	02 July 2013, 08:51:22 UTC	properly include Contamination.*	02 July 2013, 08:51:22 UTC
2873687	Erik Garrison	02 July 2013, 08:49:53 UTC	add version_git.h	02 July 2013, 08:49:53 UTC
5f36d03	Erik Garrison	02 July 2013, 08:45:37 UTC	balanced observations for indels (indels) For insertions (currently), require that the reference observations at the haplotype containing the insertion flank the insertion by a combined number of bases that is the same as the length of the insertion. This normalizes the likelihood calculations for insertions. Normalization is (currently) provided implicitly for deletions in that reference bias prevents the mapping of longer deletions. This effect is stronger than the sum of bases effect driving bias problems for insertions.	02 July 2013, 08:45:37 UTC
a994e78	Erik Garrison	02 July 2013, 08:32:27 UTC	Setting Release-Version v9.9.1	02 July 2013, 08:32:27 UTC
e73c3e5	Erik Garrison	01 July 2013, 12:39:16 UTC	partial haplotype observations Use all read evidence, even when calling haplotypes, by utilizing equivalencies between partial observations of a haplotype and the putative alleles at the site. Observations which partially support a number of haplotypes have their probability mass divided amongst the alleles they support when calculating genotype likelihoods. This provision resolves sensitivity issues caused by increased spanning coverage required to call small variants when using larger --haplotype-length values. The adjustment to genotype likelihood calculations currently requires the use of the new GL calculation routine provided when enabling --prob-contamination 0, or providing a per-sample contamination estimate file via --contamination-estimates.	01 July 2013, 12:39:16 UTC
bea983b	Erik Garrison	31 May 2013, 09:55:36 UTC	per-read group contamination estimates This is a checkpoint. This functionality is relatively stable.	31 May 2013, 09:55:36 UTC
d549975	Erik Garrison	25 April 2013, 18:48:12 UTC	addition of contamination estimates into GLs This is a first pass solution, and should probably not be used in production. This commit is for future reference.	25 April 2013, 18:48:12 UTC
296a0fa	Erik Garrison	23 April 2013, 15:39:50 UTC	resolve read group header parsing bug, empty alignment bug When read groups had colons in them, they were not parsed properly. When reads were aligned as wholly soft-clipped, the allele parsing segfaulted.	23 April 2013, 15:39:50 UTC
b0ee6e1	Erik Garrison	19 April 2013, 23:05:46 UTC	set correct merge order using new bamtools method	19 April 2013, 23:05:46 UTC
e0f8a94	Erik Garrison	19 April 2013, 15:32:49 UTC	avoid errors with soft clipped sequence at the beginning of reference This leads to errors typically in chrM, as the circular nature of this chromosome means many reads are mapped with soft clips at position 0.	19 April 2013, 15:32:49 UTC
6989b9e	Erik Garrison	15 April 2013, 16:36:24 UTC	set the haplotype calling window with --haplotype-length This is a synonym for --max-complex-gap, but I wanted to ensure that users were clear on the meaning of this parameter. If you want to call haplotypes, then increase --haplotype-length. (It's 3bp by default.)	15 April 2013, 16:36:24 UTC
cd21fa4	Erik Garrison	09 April 2013, 10:58:54 UTC	indicate that 0.9.9 is new stable revision	09 April 2013, 10:58:54 UTC
c993c5c	Erik Garrison	23 March 2013, 05:12:04 UTC	allow detection of long indels Long deletions were filtered out by legacy code which considered gaps as mismatches.	23 March 2013, 05:12:04 UTC
d0c1f12	Erik Garrison	30 January 2013, 00:03:43 UTC	turn off genotype qualities by default Genotype qualities (reported as GQ in the output) are marginal likelihoods of the specific genotype for a specific sample given the data and Bayesian model. They may be helpful for filtering or assessing genotyping accuracy, but they take a lot of time to compute because the current method for estimating them is O(N^2) in the number of samples. For more than 10 low-coverage samples, GQ estimation becomes the dominant use of compute time. For 1000 samples, GQ estimation is 90% of compute time. Prior to this commit it was possible to disable them using --no-marginals. I've removed this cryptically named parameter, set GQ estimation off by default, and added --genotype-qualities, which turns them back on. Users who wish to fill out the GQ field (old behavior) must provide --genotype-qualities.	30 January 2013, 00:03:43 UTC
cc16f93	Erik Garrison	29 January 2013, 20:13:29 UTC	--no-marginals now means "no-GQ's"	29 January 2013, 20:13:29 UTC
e85de7b	Erik Garrison	28 January 2013, 23:16:01 UTC	version 0.9.9, set as default "mappability" priors freebayes can estimate the probability that the locus in question is accurately mapped using a number of features extracted from read placement and distribution among samples. This framework effectively extends the basic Bayesian formulation from P(genotypes | reads) to P(genotypes, properly-mapped alleles | reads). As such, the QUAL value must be understood to incorporate expectations about mappability derived from observation features such as allele balance, strand bias, and read placement relative to the allele. This commit sets this model on by default. To turn OFF this behavior, use -wVa or: --hwe-priors-off \ --binomial-obs-priors-off \ --allele-balance-priors-off Extensive testing showed that this combination of parameters provided excellent sensitivity and specificity at all levels of genomic coverage and numbers of samples. The largest improvement in performance is for low-coverage resequencing (<5x coverage, >1000 samples) experiments. Higher-coverage experiments, where data tends to overwhelm priors, should not be affected.	28 January 2013, 23:16:01 UTC
d01982a	Erik Garrison	28 January 2013, 22:37:32 UTC	pooled frequency-based calling (and nan guard) Separate --pooled into --pooled-discrete (old behavior) and --pooled-continuous. In the continuous case, allele observation characteristics are reported for all alleles which passed the input filters (default -F 0.2 -C 2). Pooled continuous calling does not modify the Bayesian algorithm, and is effectively orthogonal to other parameters. For instance, --pooled-discrete and --pooled-continuous can be specified together. The called genotypes will be affected by the --ploidy setting and --pooled-discrete flags, but the output will reflect all observed alleles passing input filters at the site. Also, guard against nan's in output (Utility.cpp).	28 January 2013, 22:37:32 UTC
36fe0be	Erik Garrison	28 January 2013, 20:53:23 UTC	use of big number library for improved numerical precision (ttmath) Removes the QUAL limit of 50000, more experimentation may be required to apply the method to the marginal genotype quality calculations.	28 January 2013, 20:53:23 UTC
0595cc5	Erik Garrison	28 January 2013, 02:59:19 UTC	change to help text to reflect region spec change	28 January 2013, 02:59:19 UTC
8bb6181	Erik Garrison	28 January 2013, 02:57:24 UTC	resolve targeting issue The last position in a region was being excluded. This ensures that the entire target is processed. The documentation is updated to reflect this change. scripts/fasta_generate_regions.py will now make completely covering regions.	28 January 2013, 02:57:24 UTC
d84ba44	Erik Garrison	27 January 2013, 18:01:46 UTC	track total genotyping iterations, change iteration defaults	27 January 2013, 18:01:46 UTC
f3e5186	Erik Garrison	06 January 2013, 18:15:29 UTC	improve scaling of probabilities, resolve #42 When using --allele-balance-priors, --binomial-obs-priors, scale probabilities according to the number of possible observation permutations. Resolve #42 by preventing use of soft-clipped sequence at the beginning of the reference.	06 January 2013, 18:15:29 UTC
8c2bb94	Erik Garrison	04 January 2013, 17:12:11 UTC	resolve #43, add segfault handler In #43, challisd reports a segfault when using input alleles. This was caused by 0-length allele artifacts generated when parsing the input VCF. Additionally, when compiled in debug mode, the segfault handler will now print a stacktrace.	04 January 2013, 17:12:11 UTC
eda4b69	Erik Garrison	20 December 2012, 12:08:19 UTC	actually set defaults in Parameters.cpp Sets -C 2 -F 0.2 by default.	20 December 2012, 12:08:19 UTC
f8d78ff	Erik Garrison	19 December 2012, 17:28:13 UTC	set default input filters (-C 2 -F 0.2) In testing, these input filters on the minimum support for a given allele have been found to provide a very good balance between sensitivity and specificity, reducing the need for users to place complex filters on their VCF output. We use them by default in our work in the 1000 Genomes project low-coverage (4-6x) data. However, they may not be ideal in polyploid or pooled systems or low-frequency somatic variant detection, so users working in such contexts should set them to a level appropriate for their needs.	19 December 2012, 17:28:13 UTC
48962c8	Erik Garrison	18 December 2012, 13:10:38 UTC	version 0.9.8	18 December 2012, 13:10:38 UTC
84bf532	Erik Garrison	18 December 2012, 13:01:23 UTC	add empirical allele observation bias adjustment table By specifying --observation-bias users may provide a table which describes the empirical mapping bias against alleles given the number of bases subtracted or added between the allele and the reference. This is intended to improve genotype likelihood (GL) estimates and downstream imputation and processing of these likelihoods.	18 December 2012, 13:01:23 UTC
61b7bbc	Erik Garrison	06 December 2012, 15:15:35 UTC	use cast to get correct call to max(...) To resolve issue reported by C here: http://blog.gkno.me/post/29962850248/getting-started-with-gkno#disqus_thread	06 December 2012, 15:15:35 UTC
8af4379	Erik Garrison	02 December 2012, 22:25:14 UTC	guard against soft-clip edge cases Soft clips can occur where there is no reference sequence. When generating the allele do not process the reference sequence.	02 December 2012, 22:25:14 UTC
0e3f75b	Erik Garrison	10 October 2012, 15:58:28 UTC	cache only needed sequence, cleanup repeat detection Indeed, freebayes was holding onto unneeded reference sequence. This closes a long-standing issue. Also, cleans up repeat edge detection issues.	10 October 2012, 15:58:28 UTC
65e689c	Erik Garrison	10 October 2012, 12:27:00 UTC	bump, attempting to fix github state The last commit is not reflected in github, but is possible to obtain by cloning. This is an attempt to resolve the mismatch between github's overview and the repository.	10 October 2012, 12:27:00 UTC
b950138	Erik Garrison	09 October 2012, 14:29:03 UTC	reference most recent stable revision Users encountering bugs with the development version can revert to the most recent stable version. This version will be updated in the README as development continues.	09 October 2012, 14:29:03 UTC
f16e2bb	Erik Garrison	05 October 2012, 15:25:09 UTC	remove errant debugging messages and force exit	05 October 2012, 15:25:09 UTC
0995bd8	Erik Garrison	02 October 2012, 07:33:20 UTC	build haplotypes across repeats (version 0.9.7) When an indel is based on underlying repeat structure, record the right boundary of the repeat in the reference (technically, the first base past the repeat) in the indel's Allele structure. When building haplotype alleles during genotyping, assemble across the repeat, requiring, for instance, reference-matching reads to cover the entire repeat sequence.	02 October 2012, 07:33:20 UTC
