--^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^--
------------------Run and understand SOAP!-----------------
-v v v v v v v v v v v v v v v v v v v v v v v v v v v v v-

Program URL: http://soap.genomics.org.cn/

SOAP has evolved from a single alignment tool into a tool package that
provides a full solution for next-generation sequencing data analysis.
Currently, it consists of a new alignment tool (SOAPaligner/soap2), a
re-sequencing consensus sequence builder (SOAPsnp), an indel finder
(SOAPindel), a structural variation scanner (SOAPsv) and a de novo
short-read assembler (SOAPdenovo). A GPU-accelerated alignment tool
(SOAP3/GPU) is being implemented.

ORDER OF THINGS
1. pregraph or sparse_pregraph
2. contig
3. map
4. scaff

--^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^ ^--
-----------------------------------------------------------
-v v v v v v v v v v v v v v v v v v v v v v v v v v v v v-

~~~~~~~~~~~~~~
~            ~
~ SOAPdenovo ~
~            ~
~~~~~~~~~~~~~~

SOAPdenovo is a novel short-read assembly method that can build a de novo
draft assembly for human-sized genomes. The program is specially designed
to assemble Illumina GA short reads. It creates new opportunities for
building reference sequences and carrying out accurate analyses of
unexplored genomes in a cost-effective way.

A new version, SOAPdenovo2, is now available. Its new algorithm design
reduces memory consumption in graph construction, resolves more repeat
regions in contig assembly, increases coverage and length in scaffold
construction, improves gap closing, and is optimized for large genomes.

##################
# example.config #
##################

#maximal read length
max_rd_len=100

[LIB]
#average insert size
avg_ins=200
#if sequence needs to be reversed
reverse_seq=0
#in which part(s) the reads are used
asm_flags=3
#use only first 100 bps of each read
rd_len_cutoff=100
#in which order the reads are used while scaffolding
rank=1
# cutoff of pair number for a reliable connection (at least 3 for short insert size)
pair_num_cutoff=3
#minimum aligned length to contigs for a reliable read location (at least 32 for short insert size)
map_len=32
#a pair of fastq files, read 1 file should always be followed by read 2 file
q1=/path/**LIBNAMEA**/fastq1_read_1.fq
q2=/path/**LIBNAMEA**/fastq1_read_2.fq
#another pair of fastq files, read 1 file should always be followed by read 2 file
q1=/path/**LIBNAMEA**/fastq2_read_1.fq
q2=/path/**LIBNAMEA**/fastq2_read_2.fq
#a pair of fasta files, read 1 file should always be followed by read 2 file
f1=/path/**LIBNAMEA**/fasta1_read_1.fa
f2=/path/**LIBNAMEA**/fasta1_read_2.fa
#another pair of fasta files, read 1 file should always be followed by read 2 file
f1=/path/**LIBNAMEA**/fasta2_read_1.fa
f2=/path/**LIBNAMEA**/fasta2_read_2.fa
#fastq file for single reads
q=/path/**LIBNAMEA**/fastq1_read_single.fq
#another fastq file for single reads
q=/path/**LIBNAMEA**/fastq2_read_single.fq
#fasta file for single reads
f=/path/**LIBNAMEA**/fasta1_read_single.fa
#another fasta file for single reads
f=/path/**LIBNAMEA**/fasta2_read_single.fa
#a single fasta file for paired reads
p=/path/**LIBNAMEA**/pairs1_in_one_file.fa
#another single fasta file for paired reads
p=/path/**LIBNAMEA**/pairs2_in_one_file.fa
#bam file for single or paired reads, read 1 in a paired reads file should always be followed by read 2
# NOTE: If a read in the bam file fails platform/vendor quality checks (the flag field 0x0200 is set), the read and its paired read are ignored.
b=/path/**LIBNAMEA**/reads1_in_file.bam
#another bam file for single or paired reads
b=/path/**LIBNAMEA**/reads2_in_file.bam

[LIB]
avg_ins=2000
reverse_seq=1
asm_flags=2
rank=2
# cutoff of pair number for a reliable connection (at least 5 for large insert size)
pair_num_cutoff=5
#minimum aligned length to contigs for a reliable read location (at least 35 for large insert size)
map_len=35
q1=/path/**LIBNAMEB**/fastq_read_1.fq
q2=/path/**LIBNAMEB**/fastq_read_2.fq
f1=/path/**LIBNAMEA**/fasta_read_1.fa
f2=/path/**LIBNAMEA**/fasta_read_2.fa
p=/path/**LIBNAMEA**/pairs_in_one_file.fa
b=/path/**LIBNAMEA**/reads_in_file.bam

#########################
# Example command lines #
#########################

$ SOAPdenovo-63mer all -s my.config -o graphOutput -K 63 -k 63 -p 32 -a 16 -d 1 -R -D 1 -M 1 -e 0 -z 9000000000 1>soap.log 2>soap.err
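
The "all" command runs the four stages from ORDER OF THINGS in one go. When
one stage fails or needs different settings, it can help to run the stages
one at a time. The script below is only a sketch of that. The binary name,
config file and output prefix are reused from the example above. DRY_RUN and
run_stage are hypothetical helpers made up for this note. The per-stage flags
are the ones I recall from the SOAPdenovo2 usage text, so check each stage's
usage message before relying on them. By default the script only prints the
commands it would run.

```shell
#!/bin/sh
# Sketch only: run the four SOAPdenovo stages separately instead of "all".
# Assumption: per-stage flags follow the SOAPdenovo2 usage text; verify locally.

BIN=${BIN:-SOAPdenovo-63mer}   # binary from the example above
CONFIG=${CONFIG:-my.config}    # config file from the example above
PREFIX=${PREFIX:-graphOutput}  # output prefix from the example above
DRY_RUN=${DRY_RUN:-1}          # 1 = only print the commands, 0 = execute them

# Hypothetical helper: print one stage's command line, or run it and
# write that stage's stdout/stderr to <stage>.log and <stage>.err.
run_stage() {
    stage=$1
    shift
    if [ "$DRY_RUN" = "1" ]; then
        echo "$BIN $stage $*"
    else
        "$BIN" "$stage" "$@" 1>"$stage.log" 2>"$stage.err" || exit 1
    fi
}

run_stage pregraph -s "$CONFIG" -K 63 -R -o "$PREFIX"   # 1. build the k-mer graph
run_stage contig -g "$PREFIX" -R                        # 2. build contigs
run_stage map -s "$CONFIG" -g "$PREFIX"                 # 3. map reads back to contigs
run_stage scaff -g "$PREFIX" -F                         # 4. scaffold (-F: fill gaps)
```

Set DRY_RUN=0 to actually execute the stages. Each stage reads the files the
previous one wrote under the same output prefix, so the order matters.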