Full online documentation at:
https://wiki-bsse.ethz.ch/display/ShoRAH/Documentation

ShoRAH consists of several programs:

shorah.py    - runs the following programs
dec.py       - local error correction based on diri_sampler
diri_sampler - Gibbs sampling for Dirichlet process mixture
contain.cc   - eliminate redundant reads
mm.py        - maximum matching haplotype construction
freqEst.cc   - EM algorithm for haplotype frequency
fas2read.pl  - translates between two formats for read data files
bam2msa.py   - extracts a multiple sequence alignment from a BAM file
snv.py       - extracts SNVs from locally reconstructed windows
b2w.c        - creates windows from a bam alignment file
fil.c        - performs a strand bias test on predicted SNVs

Copyright 2007, 2008, 2009, 2010, 2012
Niko Beerenwinkel, Arnab Bhattacharya, Nicholas Eriksson, Moritz Gerstung,
Lukas Geyrhofer, Osvaldo Zagordi, Kerensa McElroy, ETH Zurich
under GPL. See file LICENSE for details.

The program also includes other software written by third parties. This has
been distributed according to the licenses provided by the respective authors.

GENERAL USAGE:
--------------------------------------------------
Install:
From within the samtools directory, type 'make'. Then, from within the shorah
directory, type 'make' to build the C++ and C programs; shorah can then be run.
See INSTALL for additional information.

Run:
Type shorah.py -h for command line instructions.

If shorah is run within a directory in which it has already been run, only SNV
calling will be performed. This facilitates experimenting with different values
of sigma for the strand bias test.

SHORAH.PY:
--------------------------------------------------
Options:
  -h, --help            show this help message and exit
  -b B, --bam=B         sorted bam format alignment file.
  -f F, --fasta=F       reference genome in fasta format.
  -a A, --alpha=A       alpha in dpm sampling <0.01>
  -w W, --windowsize=W  window size <201>
  -s S, --winshifts=S   number of window shifts <3>
  -i I, --sigma=I       value of sigma to use when calling SNVs
  -x X, --maxcov=X      approximate maximum coverage allowed
  -r R, --region=R      region in format 'chr:start-stop', eg 'ch3:1000-3000'
  -k, --keep_files      keep intermediate files (Gibbs sampling)
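
Example:
The options above can be combined into a single command line. The following is
a minimal sketch of a wrapper that assembles such a command, not part of
ShoRAH itself. The helper name build_shorah_command, the file names reads.bam
and ref.fasta, and the region check (a name, a colon, and two non-negative
integers separated by a dash) are assumptions made for illustration. Only the
flags listed in the options table are used, and optional flags are passed only
when a value is given, so that shorah.py applies its own defaults otherwise.

```python
import re

# Assumed reading of the documented 'chr:start-stop' format:
# a name without spaces or colons, then two non-negative integers.
REGION_RE = re.compile(r"^[^:\s]+:\d+-\d+$")


def build_shorah_command(bam, fasta, alpha=None, windowsize=None,
                         winshifts=None, sigma=None, maxcov=None,
                         region=None, keep_files=False):
    """Return an argument list for shorah.py (hypothetical helper)."""
    # -b and -f take the sorted BAM alignment and the FASTA reference.
    cmd = ["shorah.py", "-b", bam, "-f", fasta]

    # Optional numeric settings; omitted flags fall back to shorah.py defaults.
    optional = [("-a", alpha), ("-w", windowsize), ("-s", winshifts),
                ("-i", sigma), ("-x", maxcov)]
    for flag, value in optional:
        if value is not None:
            cmd += [flag, str(value)]

    if region is not None:
        if not REGION_RE.match(region):
            raise ValueError("region must look like 'chr:start-stop'")
        cmd += ["-r", region]

    # -k keeps the intermediate files of the Gibbs sampling step.
    if keep_files:
        cmd.append("-k")
    return cmd


if __name__ == "__main__":
    # File names are placeholders; the region is the one from the help text.
    print(" ".join(build_shorah_command("reads.bam", "ref.fasta",
                                        region="ch3:1000-3000",
                                        keep_files=True)))
```

The returned list can be passed to subprocess.run from the directory in which
the analysis should take place. Because a second run in the same directory
performs only SNV calling, repeating the call with a different sigma is a cheap
way to explore the strand bias test.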