Summarizing SNPs
A simple script to count and save SNP data from an alignment, because counting SNPs is an irritating but common task.
Usage is:
sumsnps.rb [options] FILE1 [FILE2] [FILE3] ...
where the options are:
| -h, --help | show the help screen |
| -t, --threshold FLOAT | the threshold to apply for identifying SNPs |
| -o, --overwrite | if saving output, overwrite pre-existing files |
| -s, --save FILE | save results as CSV in this file |
The input is an alignment file in any format that BioRuby can read. Sites with a residue frequency less than or equal to the threshold are reported, so with a threshold of 1.0 (the default) all sites are reported. Results are saved in the CSV file as <position>, <residue>, <frequency>, one residue per line, so a single SNP occupies multiple lines.
A small gotcha: the results from each input file are saved to separate files, but all of them use the name passed to --save. When several input files are given, the outputs therefore overwrite each other. This should be easy to fix.
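The threshold rule and CSV layout described above can be sketched in plain Ruby. This is an illustration only, not the script's actual code: the helper names `site_frequencies` and `snp_rows` are hypothetical, the alignment is given as an array of equal-length strings rather than read through BioRuby, positions are assumed to be 1-based, and a site is assumed to be reported when its most common residue has a frequency at or below the threshold.

```ruby
# Hypothetical helper: for each alignment column, map residue => frequency.
# Assumes all sequences have the same length (as in an alignment).
def site_frequencies(seqs)
  return [] if seqs.empty?
  (0...seqs.first.length).map do |i|
    column = seqs.map { |s| s[i] }
    counts = column.each_with_object(Hash.new(0)) { |r, h| h[r] += 1 }
    counts.map { |residue, n| [residue, n.to_f / column.size] }.to_h
  end
end

# Hypothetical helper: build [position, residue, frequency] rows for every
# site whose most common residue is at or below the threshold (assumption).
def snp_rows(seqs, threshold = 1.0)
  rows = []
  site_frequencies(seqs).each_with_index do |freqs, i|
    next unless freqs.values.max <= threshold
    # One row per residue, so a single site spans several rows.
    freqs.sort.each { |residue, freq| rows << [i + 1, residue, freq] }
  end
  rows
end

alignment = %w[ACGT ACGA ACTA ACGA]
rows = snp_rows(alignment, 0.9)
csv_lines = rows.map { |row| row.join(',') }
puts csv_lines
```

With a threshold of 0.9, only the two variable columns (positions 3 and 4) produce rows; with the default of 1.0, every column does.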
The usual caveats apply: this is a quick hack with little error-checking.
Requirements
Ruby, BioRuby

