# bio-cd-hit-report

[![Build Status](https://secure.travis-ci.org/georgeG/bioruby-cd-hit-report.png)](http://travis-ci.org/georgeG/bioruby-cd-hit-report)

Clustering sequences with CD-HIT produces a cluster file (`.clstr`)
that lists the sequence names grouped by the cluster they belong to.
This plugin provides methods for parsing that file.
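To make the input concrete, here is a minimal standalone sketch of what parsing a `.clstr` file involves. It assumes the usual CD-HIT layout: a `>Cluster N` header line, followed by member lines in which the sequence name sits between `>` and `...`, with the representative sequence marked by a trailing `*`. The `parse_clstr` helper and the sample data are hypothetical and only for illustration; they are not part of this gem and do not reflect how it is implemented.

```ruby
# Illustrative sketch only: parse_clstr is a hypothetical helper,
# not part of the bio-cd-hit-report API.
def parse_clstr(text)
  clusters = {}
  current = nil
  text.each_line do |line|
    line = line.strip
    next if line.empty?
    if line.start_with?('>Cluster')
      # Header line such as ">Cluster 0": start a new cluster
      current = line.split.last.to_i
      clusters[current] = { members: [], rep_seq: nil }
    elsif current && (m = line.match(/>(.+?)\.\.\./))
      # Member line: the name sits between ">" and "..."
      clusters[current][:members] << m[1]
      # A trailing "*" marks the representative sequence
      clusters[current][:rep_seq] = m[1] if line.end_with?('*')
    end
  end
  clusters
end

# Made-up sample data in the assumed .clstr layout
sample = <<~CLSTR
  >Cluster 0
  0\t120aa, >seq1... *
  1\t118aa, >seq2... at 97.46%
  >Cluster 1
  0\t95aa, >seq3... *
CLSTR

clusters = parse_clstr(sample)
puts clusters.size                  # number of clusters found
puts clusters[0][:members].inspect  # member names of cluster 0
puts clusters[0][:rep_seq]          # representative sequence of cluster 0
```

In practice, prefer the gem's own `Bio::CdHitReport` class shown under Usage; the sketch is only meant to show the shape of the data being parsed.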

Note: this plugin is under active development!

## Installation

```sh
gem install bio-cd-hit-report
```

## Usage

```ruby
require 'bio-cd-hit-report'

cluster_file = "cluster95.clstr"
report = Bio::CdHitReport.new(cluster_file)

# print the total number of clusters in the report
puts report.total_clusters

# print the members of the cluster with id 1
puts report.get_cluster(1)

# print information for each cluster
report.each_cluster do |c|
  puts c.name        # the full cluster name
  puts c.members     # the sequence names in the cluster
  puts c.cluster_id  # the cluster id only
  puts c.size        # the total number of entries in the cluster
  puts c.rep_seq     # the name of the representative sequence in this cluster
end
```
        
## Project home page

For information on the source tree, documentation, examples, issues, and
how to contribute, see

  http://github.com/georgeG/bioruby-cd-hit-report

The BioRuby community is on IRC: server irc.freenode.org, channel #bioruby.

## Cite

If you use this software, please cite one of:
  
* [BioRuby: bioinformatics software for the Ruby programming language](http://dx.doi.org/10.1093/bioinformatics/btq475)
* [Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics](http://dx.doi.org/10.1093/bioinformatics/bts080)

## Biogems.info

This Biogem is published at [#bio-cd-hit-report](http://biogems.info/index.html)

## Copyright

Copyright (c) 2013 George Githinji. See LICENSE.txt for further details.