Revision d64c7115b852e3a269ed9e3c069a3485b34bbea5 authored by Eric Sanford on 10 October 2019, 18:04:36 UTC, committed by Eric Sanford on 10 October 2019, 18:04:36 UTC
added location of hg19 reference files as comments to setEnvironmentVariables.sh script
1 parent c5b88eb
README.md
# Snakemake workflow: rna-seq-star-deseq2

[![Snakemake](https://img.shields.io/badge/snakemake-≥5.2.1-brightgreen.svg)](https://snakemake.bitbucket.io)
[![Build Status](https://travis-ci.org/snakemake-workflows/rna-seq-star-deseq2.svg?branch=master)](https://travis-ci.org/snakemake-workflows/rna-seq-star-deseq2)
[![Snakemake-Report](https://img.shields.io/badge/snakemake-report-green.svg)](https://cdn.rawgit.com/snakemake-workflows/rna-seq-star-deseq2/master/.test/report.html)

This workflow performs a differential expression analysis with STAR and DESeq2.

## Authors of base pipeline

* Johannes Köster (@johanneskoester), https://koesterlab.github.io
* Sebastian Schmeier (@sschmeier), https://sschmeier.com
* Jose Maturana (@matrs)

## Usage

### Snakemake-bulkRNAseq-pipeline

Last updated: 08/27/2019, by Phil

The purpose of this document is to set up Snakemake and a differential expression pipeline for analyzing bulk RNA-sequencing data.

The pipeline was created by Johannes Köster; the GitHub repo is here.


¡BEFORE STARTING!
Make sure you have installed Miniconda (Python 3.7 version) and Snakemake.
Download and install STAR version 2.5.3a.

Installing the pipeline
Clone the git/bitbucket repository.

Create a branch in case you make edits.

Make sure you are in the directory with the pipeline.

Place your raw FASTQ files in the data/ folder. Ex:

	rsync -av /path/to/data/SRX5725609.fastq.gz data/


Make sure you have the following components of your reference:
* Reference FASTA
* Reference GTF

Index your reference genome using the following command (this will take several hours to a day, but only needs to be done once):	
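The indexing command itself is not given in this README. A minimal sketch, assuming STAR's standard `genomeGenerate` mode, could look like the following. The paths, the thread count, and the function name `star_index` are hypothetical placeholders, not values taken from this repository.

```shell
# Hypothetical locations; replace with your own reference files.
GENOME_DIR="${GENOME_DIR:-star_index}"   # output directory for the index
FASTA="${FASTA:-reference/genome.fa}"    # reference FASTA
GTF="${GTF:-reference/genes.gtf}"        # reference GTF
THREADS="${THREADS:-4}"                  # threads used for indexing

# Build a STAR index from the reference FASTA and GTF.
star_index() {
    mkdir -p "$GENOME_DIR"
    STAR --runMode genomeGenerate \
         --runThreadN "$THREADS" \
         --genomeDir "$GENOME_DIR" \
         --genomeFastaFiles "$FASTA" \
         --sjdbGTFfile "$GTF"
}
```

Calling `star_index` once should be enough; the resulting directory is the STAR index location that config.yaml needs to point to.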

Edit the file samples.tsv to appropriately reflect your sample names and conditions. Ex:

	sample	condition
	SRX5725609	mock
	SRX5725612	infected
	…

Edit the file units.tsv to appropriately reflect your sample characteristics. This is important, for instance, to mark FASTQ files that come from the same sample but different sequencing lanes if you don't combine them beforehand. If you only have single-end reads, leave the fq2 column blank. Ex:

	sample	unit	fq1	fq2
	SRX5725609	sample	data/SRX5725609.fastq.gz
	SRX5725610	rep1	data/SRX5725610.fastq.gz
	…
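A mistyped path in units.tsv typically only surfaces once the workflow starts. As a rough pre-flight check, the sketch below lists every FASTQ path in a units file that does not exist on disk. It assumes a tab-separated file with the header row `sample unit fq1 fq2`; the function name `missing_fastqs` is hypothetical.

```shell
# Print "sample<TAB>unit<TAB>path" for every listed FASTQ that is missing.
# Assumes a tab-separated units file with a single header row.
missing_fastqs() {
    tail -n +2 "$1" | while IFS="$(printf '\t')" read -r sample unit fq1 fq2 || [ -n "$sample" ]; do
        for fq in "$fq1" "$fq2"; do
            # Blank fq2 (single-end reads) is skipped.
            if [ -n "$fq" ] && [ ! -f "$fq" ]; then
                printf '%s\t%s\t%s\n' "$sample" "$unit" "$fq"
            fi
        done
    done
}
```

Running `missing_fastqs units.tsv` from the pipeline directory should print nothing when every listed file is in place.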

Edit the configuration file, config.yaml, to reflect:
* The appropriate adapter sequence
* The PCA label
* The comparison of conditions
* Any parameters you want to include
* The location of your STAR index directory and the GTF file (do not skip this one)

Once this is complete, note that on macOS the command “zcat” will not work. To work around this without editing the wrapper, make the following changes:

	sed -i.bak 's/\.fastq\.gz/.fastq/g' rules/align.smk

	sed -i.bak 's/\.fastq\.gz/.fastq/g' rules/trim.smk

The -i.bak option edits each file in place and keeps a backup copy with a .bak suffix; redirecting sed's output back onto its own input file would empty the file instead. The change effectively removes the compression of all intermediate FASTQ files and bypasses the need to gunzip them.

If you are running on a Linux workstation, the above step is not necessary.

You’re almost there! Now we run snakemake:

Test that the configuration works with:

	snakemake --use-conda -n

If this works, run the following command to start the pipeline ($N is the number of cores):

	snakemake --use-conda --cores $N

Once it’s done running you can create a report with the following command:

	snakemake --report report.html

If there is an error, its type is printed in red, and you can check the logs/ folder for the sample and step where the error occurred.
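To narrow down which log to open, a simple search can help. The sketch below lists the log files that mention "error" in any letter case. It assumes the failing tool actually writes that word to its log, which may not hold for every step; the function name `logs_with_errors` is hypothetical.

```shell
# List files under a log directory (default: logs) that contain "error",
# matched case-insensitively, in sorted order.
logs_with_errors() {
    grep -ril "error" "${1:-logs}" 2>/dev/null | sort
}
```

For example, `logs_with_errors logs` prints one path per matching file and nothing if no log matches.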

The pipeline should produce two useful .html files, two .svg files, and two .pdf files:

* ./report.html (overview of basic properties of your data)
* qc/multiqc_report.html (more detailed overview of the sequencing data, including the number of intronic reads, spliced reads, etc.)
* results/pca.svg (a plot of the first two principal components of your samples based on gene expression)
* results/diffexp/{condition2-vs-condition1}.ma-plot.svg (an MA plot comparing the log ratio with the mean expression for the two conditions)
* results/diffexp/{condition2-vs-condition1}.diffexp.volcano.pdf
* results/diffexp/{condition2-vs-condition1}.diffexp.heatmap.pdf
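A quick way to confirm that a run finished completely is to test for each of the files listed above. The sketch below does this for one contrast. The function name `check_outputs` and the contrast name used in the usage note are hypothetical; substitute the comparison defined in your config.yaml.

```shell
# Report which expected outputs exist for one contrast
# (e.g. "condition2-vs-condition1"); returns non-zero if any is missing.
check_outputs() {
    contrast="$1"
    status=0
    for f in report.html qc/multiqc_report.html results/pca.svg \
        "results/diffexp/${contrast}.ma-plot.svg" \
        "results/diffexp/${contrast}.diffexp.volcano.pdf" \
        "results/diffexp/${contrast}.diffexp.heatmap.pdf"; do
        if [ -f "$f" ]; then
            echo "ok       $f"
        else
            echo "MISSING  $f"
            status=1
        fi
    done
    return "$status"
}
```

From the pipeline directory, `check_outputs infected-vs-mock` would print one status line per file for a contrast with that name.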