Revision be9977925e9e842cc755f14ced72bbee5c5d6d77 authored by Alexey Sergushichev on 02 December 2019, 17:22:45 UTC, committed by Alexey Sergushichev on 02 December 2019, 17:22:45 UTC
fixes for htseq & macs2 upgrades
1 parent 0aa4fbf
revision: swh:1:rev:be9977925e9e842cc755f14ced72bbee5c5d6d77
directory: swh:1:dir:5b653c90d686c5fd21bed29c1f8bafb2420ece4f
content: swh:1:cnt:6589f424e2f51774b5634084a9938272b6827056

quant3p
#!/usr/bin/env bash

#### Set up

SCRIPTNAME=$(basename "$0")
function error() 
{
    local PARENT_LINENO="$1"
    local MESSAGE="$2"
    local CODE="${3:-1}"
    if [[ -n "$MESSAGE" ]] ; then
        echo "$SCRIPTNAME: Error on or near line ${PARENT_LINENO}: ${MESSAGE}; exiting with status ${CODE}" >&2
    else
        echo "$SCRIPTNAME: Error on or near line ${PARENT_LINENO}; exiting with status ${CODE}" >&2
    fi
    exit "${CODE}"
}

trap 'error ${LINENO}' ERR
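# exit on error, let functions and subshells inherit the ERR trap,
# and fail a pipeline if any command in it fails (not only the last one)
set -eE -o pipefail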

die()
{
    # "$@" inside [ ] breaks with zero or several arguments; test the count instead
    if [ $# -gt 0 ]; then
        echo "$*" >&2
    else
        echo "FAILED" >&2
    fi
    exit 1
}


#### Arguments

GSIZE="3e9"
QVALUE="0.01"
NPROC="2"
KEEP_TEMP=""

# help message
help()
{
    echo "usage: $SCRIPTNAME <options> <bam-file>+"
    echo ""
    echo "mandatory arguments:"
    echo "-n/--name NAME name of the experiment; mandatory"
    echo "-g/--gtf GTF annotation; mandatory"
    echo ""
    echo "optional arguments:"
    echo "-p NPROC number of processes to do in parallel; default = $NPROC"
    echo "--keep-temp keep temporary files"
    echo "--genome GSIZE approximate size of the genome (for MACS); default = $GSIZE"
    echo "--qvalue QVALUE q-value cutoff (for MACS); default = $QVALUE"
    echo "-h/--help show this message and exit"
}


if [ $# -eq 0 ]; then
    help
    exit
fi

while true; do
    case "$1" in
        -n|--name) NAME="$2"; shift 2;;
        -g|--gtf) GTF="$2"; shift 2;;
        --genome) GSIZE="$2"; shift 2;;
        --qvalue) QVALUE="$2"; shift 2;;
        -p) NPROC="$2"; shift 2;;
        --keep-temp) KEEP_TEMP="--keep-temp"; shift 1;;
        -h|--help) help; exit 0;;
        --) shift 1; break;;
        -*) die "unrecognized option: $1";;
        *) break;;
    esac
done

if [ -z "$NAME" ]; then
    die "NAME is not specified (try -h for details)"
fi

if [ -z "$GTF" ]; then
    die "GTF is not specified (try -h for details)"
fi

BAMS=("$@")

# ${#BAMS[@]} is the number of elements; ${#BAMS} is only the length of the first one
if [ "${#BAMS[@]}" -eq 0 ]; then
    die "no bam files specified (try -h for details)"
fi

#### Run

echo "Calling peaks..."
macs2-stranded $KEEP_TEMP -n "${NAME}" -g "${GSIZE}" -q "${QVALUE}" "${BAMS[@]}" | sed "s/^/  /"
peaks="${NAME}"_peaks.narrowPeak

echo "Extending annotation..."
fixed_gtf="${NAME}.$(basename "${GTF}" .gtf).fixed.gtf"
gtf-extend -g "${GTF}" -p "${peaks}" -o "${fixed_gtf}" | sed "s/^/  /"


export htseq_dir="${NAME}.htseq"
mkdir -p "${htseq_dir}"

count_bam() {
    local BAM="$1"
    local TAG
    TAG=$(basename "${BAM}" .bam)
    fix-mm -g "${GTF}" "${BAM}" -o - | \
        samtools view -h - | \
        htseq-count --secondary-alignments score -s yes -t exon - "${fixed_gtf}" > "${htseq_dir}/${TAG}.htseq.out"
}


export -f count_bam
export GTF
export fixed_gtf
export KEEP_TEMP

find "${BAMS[@]}" -print0 | xargs -0 -n 1 -P "${NPROC}" bash -c 'count_bam "$@"' _ 

if [ -z "${KEEP_TEMP}" ]; then
    rm "${fixed_gtf}"
    rm "${peaks}"
fi

tags=()
htseq_outs=()

header=""

fields="1"

i=0

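# paste lays the htseq-count outputs side by side as (ID, count) pairs,
# so keep column 1 (feature IDs) plus every even column (the counts).
for bam in "${BAMS[@]}"
do
    tag=$(basename "${bam}" .bam)
    htseq_out="${htseq_dir}/${tag}.htseq.out"
    tags+=("${tag}")
    htseq_outs+=("${htseq_out}")
    header="${header}\t${tag}"
    i=$((i + 2))
    fields="${fields},${i}"
done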

cat <(echo -e "${header}") \
    <(paste "${htseq_outs[@]}" | cut -f "${fields}") \
    > "${NAME}.cnt"

if [ -z "${KEEP_TEMP}" ]; then
    for htseq_out in "${htseq_outs[@]}"
    do
        rm "${htseq_out}"
    done

    rmdir "${htseq_dir}"
fi

echo "Done"
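
The final merge step in the script above (paste the per-sample htseq-count outputs side by side, then keep column 1 and every even column) can be tried in isolation. The following is a minimal sketch, not part of quant3p itself: the two input files, the sample names s1 and s2, and the gene IDs are made up for illustration, and it assumes every htseq-count output lists the same features in the same order.

```shell
#!/usr/bin/env bash
# Sketch of the count-table merge with two hypothetical htseq-count outputs.
set -e

tmp=$(mktemp -d)
# Each file has two tab-separated columns: feature ID and read count.
printf 'geneA\t5\ngeneB\t0\n' > "${tmp}/s1.htseq.out"
printf 'geneA\t7\ngeneB\t3\n' > "${tmp}/s2.htseq.out"

# paste yields: ID, count, ID, count. Keep the first ID column and the
# count columns, i.e. fields 1,2,4 for two samples.
merged=$(paste "${tmp}/s1.htseq.out" "${tmp}/s2.htseq.out" | cut -f 1,2,4)

# The header starts with a tab so the sample names sit above the count columns.
header=$(printf '\ts1\ts2')
table=$(printf '%s\n%s' "${header}" "${merged}")
printf '%s\n' "${table}"

rm -r "${tmp}"
```

If the files listed features in different orders, paste would silently misalign the counts, so a join on the ID column would be the safer choice in that case.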