$ gt shredder U89959_genomic.fas > fragments.fas
NAME
gt-shredder - Shred sequence file(s) into consecutive pieces of random length.
SYNOPSIS
gt shredder [option …] [sequence_file …]
DESCRIPTION
-coverage [value]
    set the number of times the sequence_file is shredded (default: 1)
-minlength [value]
    set the minimum length of the shredded fragments (default: 300)
-maxlength [value]
    set the maximum length of the shredded fragments (default: 700)
-overlap [value]
    set the overlap between consecutive pieces (default: 0)
-sample [value]
    take samples of the generated sequence pieces with the given probability (default: 1.000000)
-clipdesc [yes|no]
    clip descriptions after the first space (fooled by \t, \n etc.); the offset and length are added to ensure a unique identifier (default: no)
-width [value]
    set output width for FASTA sequence printing (0 disables formatting) (default: 0)
-o [filename]
    redirect output to specified file (default: undefined)
-gzip [yes|no]
    write gzip compressed output file (default: no)
-bzip2 [yes|no]
    write bzip2 compressed output file (default: no)
-force [yes|no]
    force writing to output file (default: no)
-help
    display help and exit
-version
    display version information and exit
Each sequence given in sequence_file is shredded into consecutive pieces of random length (between -minlength and -maxlength) until it is consumed. As a result, the last fragment of a given sequence can be shorter than the argument to option -minlength. To get rid of such fragments, use gt seqfilter (see example below).
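The procedure described above can be sketched in Python as follows. This is an illustration only, not the gt shredder implementation: the function name shred, its parameters, and the rule of always advancing by at least one position when the overlap is large are assumptions made for the sketch.

```python
import random


def shred(sequence, minlength=300, maxlength=700, overlap=0, rng=None):
    """Yield (offset, fragment) pairs that cover `sequence` from left to right.

    Hypothetical helper for illustration; not part of GenomeTools.
    """
    if not 0 < minlength <= maxlength:
        raise ValueError("need 0 < minlength <= maxlength")
    if overlap < 0:
        raise ValueError("overlap must not be negative")
    if rng is None:
        rng = random.Random()

    pos = 0
    while pos < len(sequence):
        # Draw a random fragment length between minlength and maxlength.
        fraglen = rng.randint(minlength, maxlength)
        # Slicing truncates at the end of the sequence, so the last
        # fragment may be shorter than minlength.
        fragment = sequence[pos:pos + fraglen]
        yield pos, fragment
        if pos + len(fragment) >= len(sequence):
            break  # the sequence is consumed
        # Step back by `overlap`, but always move at least one position
        # forward (an assumption of this sketch).
        pos += max(len(fragment) - overlap, 1)


# A 2000 bp stand-in sequence; the real input would come from a FASTA file.
bac = "ACGT" * 500
for offset, piece in shred(bac, rng=random.Random(42)):
    print(offset, len(piece))
```

With an overlap of 0, concatenating the fragments in order reproduces the input sequence, and only the final fragment can fall below the minimum length.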
Examples:
Shred a given BAC:
$ gt shredder U89959_genomic.fas > fragments.fas
Shred an EST collection into pieces between 50 and 100 bp and remove all (terminal) fragments shorter than 50 bp:
$ gt shredder -minlength 50 -maxlength 100 U89959_ests.fas \
  | gt seqfilter -minlength 50 - > fragments.fas
# 130 out of 1260 sequences have been removed (10.317%)
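The length filter applied in the second stage of this pipeline can be pictured with a short Python sketch. It is not how gt seqfilter is implemented; read_fasta and filter_min_length are hypothetical helper names, and the three records are made-up toy data.

```python
def read_fasta(text):
    """Parse FASTA text into a list of (description, sequence) pairs.

    Hypothetical helper for illustration; not part of GenomeTools.
    """
    records = []
    desc, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if desc is not None:
                records.append((desc, "".join(chunks)))
            desc, chunks = line[1:], []
        elif desc is not None:
            chunks.append(line)
    if desc is not None:
        records.append((desc, "".join(chunks)))
    return records


def filter_min_length(records, minlength):
    """Keep records of at least `minlength` bases; also return the number dropped."""
    kept = [(d, s) for d, s in records if len(s) >= minlength]
    return kept, len(records) - len(kept)


# Toy data: the second record plays the role of a short terminal fragment.
example = """>frag1
ACGTACGTAC
>frag2
ACG
>frag3
ACGTACGTACGT
"""
kept, removed = filter_min_length(read_fasta(example), minlength=10)
print(f"{removed} out of {len(kept) + removed} sequences have been removed")
# prints: 1 out of 3 sequences have been removed
```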
Shred an EST collection and show only a random sample of roughly 10% of the resulting fragments:
$ gt shredder -sample 0.1 U89959_ests.fas
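Because -sample is described as a probability, the number of fragments shown need not be exactly 10% of the total. The Python sketch below illustrates this under the assumption that each fragment is kept independently with the given probability; sample_fragments is a hypothetical name, and the sketch does not claim to match the tool's internals.

```python
import random


def sample_fragments(fragments, probability, rng=None):
    """Keep each fragment independently with the given probability.

    Hypothetical helper for illustration; independent sampling per
    fragment is an assumption of this sketch.
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if rng is None:
        rng = random.Random()
    # rng.random() returns a float in [0.0, 1.0), so probability 1.0
    # keeps everything and probability 0.0 keeps nothing.
    return [f for f in fragments if rng.random() < probability]


fragments = [f"fragment_{i}" for i in range(1000)]
sampled = sample_fragments(fragments, 0.1, rng=random.Random(7))
# The count varies from run to run around 100 unless the seed is fixed.
print(len(sampled), "of", len(fragments), "fragments kept")
```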
REPORTING BUGS
Report bugs to https://github.com/genometools/genometools/issues.