https://github.com/galaxyproject/galaxy
Tip revision: ce362e60abbafdffa65acaf68f9492cec7196a9d authored by Martin Cech on 23 May 2018, 18:44:54 UTC
Update version to 18.05
add_scores.xml
<tool id="hgv_add_scores" name="phyloP" version="1.0.0">
<description>interspecies conservation scores</description>
<requirements>
<requirement type="package">add_scores</requirement>
</requirements>
<command>
python '$__tool_directory__/add_scores.py' '$input1' '$out_file1' '${GALAXY_DATA_INDEX_DIR}/add_scores.loc' '${input1.metadata.dbkey}' '${input1.metadata.chromCol}' '${input1.metadata.startCol}'
</command>
<inputs>
<param format="interval" name="input1" type="data" label="Dataset">
<validator type="unspecified_build"/>
<validator type="dataset_metadata_in_file" filename="add_scores.loc" metadata_name="dbkey" metadata_column="0" message="Data is currently not available for the specified build."/>
</param>
</inputs>
<outputs>
<data format_source="input1" name="out_file1" />
</outputs>
<tests>
<test>
<param name="input1" value="add_scores_input1.interval" ftype="interval" dbkey="hg18" />
<output name="out_file1" file="add_scores_output1.interval" />
</test>
<test>
<param name="input1" value="add_scores_input2.bed" ftype="interval" dbkey="hg18" />
<output name="out_file1" file="add_scores_output2.interval" />
</test>
</tests>
<help>
.. class:: warningmark
This currently works only for builds hg18 and hg19.
-----
**Dataset formats**
The input can be any interval_ format dataset. The output is also in interval format.
(`Dataset missing?`_)
.. _interval: ${static_path}/formatHelp.html#interval
.. _Dataset missing?: ${static_path}/formatHelp.html
-----
**What it does**
This tool adds a column that measures interspecies conservation at each SNP
position, using conservation scores for primates pre-computed by the
phyloP program. PhyloP performs an exact P-value computation under a
continuous-time Markov substitution model.
The chromosome and start position
are used to look up the scores, so if a larger interval is in the input,
only the score for the first nucleotide is returned.
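The lookup described above can be sketched in a few lines of Python. This is only an illustration, not the code in add_scores.py: the ``SCORES`` table, the ``add_score`` function, and the zero-based column defaults are all hypothetical names chosen for this sketch, and the real tool reads pre-computed phyloP data rather than an in-memory dictionary.

```python
# Hypothetical in-memory score table keyed by (chromosome, start position).
# The real tool reads pre-computed phyloP data; that is not modeled here.
SCORES = {
    ("chr22", 14611956): -2.142,
    ("chr22", 14612076): 0.369,
}


def add_score(line, scores, chrom_col=0, start_col=1):
    """Return the tab-separated line with a score column appended.

    Only the chromosome and start columns are consulted, so an interval
    longer than one base gets the score of its first nucleotide.
    Column indices are zero-based in this sketch (an assumption).
    """
    fields = line.rstrip("\r\n").split("\t")
    key = (fields[chrom_col], int(fields[start_col]))
    score = scores.get(key)
    # "NA" marks positions with no available score.
    fields.append("NA" if score is None else "%.3f" % score)
    return "\t".join(fields)


if __name__ == "__main__":
    print(add_score("chr22\t14611956\t14611957\tG/T", SCORES))
    print(add_score("chr22\t14494911\t14494912\tA/T", SCORES))
```

Because the end column is never read, widening an interval does not change the value that is appended.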
-----
**Example**
- input file, with SNPs::
chr22 14440426 14440427 C/T
chr22 14494851 14494852 A/G
chr22 14494911 14494912 A/T
chr22 14550435 14550436 A/G
chr22 14611956 14611957 G/T
chr22 14612076 14612077 A/G
chr22 14668537 14668538 C
chr22 14668703 14668704 A/T
chr22 14668775 14668776 G
chr22 14680074 14680075 A/T
etc.
- output file, showing conservation scores for primates::
chr22 14440426 14440427 C/T 0.509
chr22 14494851 14494852 A/G 0.427
chr22 14494911 14494912 A/T NA
chr22 14550435 14550436 A/G NA
chr22 14611956 14611957 G/T -2.142
chr22 14612076 14612077 A/G 0.369
chr22 14668537 14668538 C 0.419
chr22 14668703 14668704 A/T -1.462
chr22 14668775 14668776 G 0.470
chr22 14680074 14680075 A/T 0.303
etc.
"NA" means that the phyloP score was not available.
</help>
<citations>
<citation type="doi">10.1007/11732990_17</citation>
</citations>
</tool>