gtscript documentation
This document describes the gtscript API. gtscript is essentially Lua (an embeddable scripting language) plus parts of the GenomeTools C libraries exported to Lua. Because the GenomeTools binary gt contains an embedded Lua interpreter, gtscript files can be executed with the gt binary. The parts of the GenomeTools C libraries exported to Lua (the gtscript API) are described here. For documentation of Lua itself and its APIs, please refer to the Lua reference manual.
Notes
- You have to add require 'gtlua' to your script in order to load the parts of the gtscript API which are implemented in gtscript itself.
- By default, all functions of the gtscript API are contained in the gt table. That is, you have to prepend all function calls with ``gt.''. For example, write gt.show() to call the show() function documented below. You can use gt.export() or gt.re() to export the gt table to the global environment, which makes the prepending unnecessary (but clutters your global environment).
- If a function is documented as ``returns an array'', this means that the function returns a table where only consecutive integer keys from 1 on are used and which can be traversed with the ipairs() function.
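The array convention from the last note can be illustrated in plain Lua. This is only a sketch: is_array is a hypothetical helper that is not part of the gtscript API, and the sequence IDs are made-up values.

```lua
-- Hypothetical helper (not part of the gtscript API): returns true if tbl
-- uses only the consecutive integer keys 1..n, that is, if it is an
-- "array" in the sense of the note above.
function is_array(tbl)
  local n = 0
  for _ in pairs(tbl) do n = n + 1 end      -- count all keys
  for i = 1, n do
    if tbl[i] == nil then return false end  -- a hole or a non-integer key
  end
  return true
end

-- An array can be traversed in index order with ipairs().
local seqids = { "chr1", "chr2", "chr3" }   -- made-up values
for i, seqid in ipairs(seqids) do
  print(i, seqid)
end
```

Any value documented below as an array can be passed to ipairs() in the same way.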
Classes
- Alpha
- Bittab
- CDSStream
- CSAStream
- Diagram
- FeatureIndex
- FeatureStream
- FeatureVisitor
- GenomeNode
- GenomeNodeIterator
- GenomeStream
- Range
- RegionMapping
- Render
- ScoreMatrix
- StreamEvaluator
Sole functions
genome_feature_new(type, range, strand)
Returns a new genome feature of the given type, spanning range on strand.
sequence_region_new(seqid, range)
Returns a new sequence region for sequence ID seqid spanning range.
genome_feature:get_strand()
Returns the strand of genome_feature.
genome_feature:get_source()
Returns the source of genome_feature.
genome_feature:set_source(source)
Set the source of genome_feature to source.
genome_feature:output_leading()
Show the leading part of the GFF3 output for genome_feature.
genome_feature:get_type()
Returns the type of genome_feature as a string.
genome_feature:extract_sequence(type, join, region_mapping)
Extract the sequence of genome_feature. If join is false and genome_feature has type type, the sequence is returned (using region_mapping to get it). If join is true and genome_feature has children of type type, their joined sequences are returned. If none of the above applies, nil is returned.
gff3_in_stream_new_sorted(filename)
Returns a new GFF3 input stream object for filename. The file filename has to be a sorted GFF3 file.
gff3_out_stream_new(genome_stream)
Returns a new GFF3 output stream which pulls its features from genome_stream.
gff3_visitor_new()
Returns a new GFF3 visitor.
ranges_sort(range_array)
Returns an array containing the ranges from array range_array in sorted order.
ranges_are_sorted(range_array)
Returns true if the ranges in array range_array are sorted, false otherwise.
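The two range functions above combine naturally. The following sketch assumes a hypothetical helper named ensure_sorted, which is not part of the gtscript API. The gt table is passed in explicitly, so the helper works whether or not gt.export() has been called.

```lua
-- Hypothetical helper (not part of the gtscript API): returns range_array
-- itself if it is already sorted, and the array returned by ranges_sort()
-- otherwise.
function ensure_sorted(gt, range_array)
  if gt.ranges_are_sorted(range_array) then
    return range_array
  end
  return gt.ranges_sort(range_array)
end
```

Inside a gtscript file this would be called as ensure_sorted(gt, ranges), where ranges is an array of range objects.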
translate_dna(dna)
Returns translated dna.
reload()
Reload the gt module.
features_contain_marked(features)
Returns true if the given array of features contains a marked feature, false otherwise.
features_show(features)
Print the given array of features to stdout.
features_get_marked(features)
Return all marked features from features (an array) as an array, or nil if features contains no marked features.
features_show_marked(features)
Print all marked features (an array) to stdout.
features_mRNAs2genes(in_features)
Return an array of genome features which contains a separate gene feature for each mRNA in in_features.
features_extract_sequences(features, type, join, region_mapping)
Return an array with the sequences of the given features.
export()
Export the content of the gt table to the global environment.
display(filename)
Call the external 'display' program for file filename.
show_table(tbl)
Show all keys and values of table tbl.
show(all)
Show content of the gt table.
re()
Reload the gt module and export its content to the global environment.
Class Alpha
alpha_new_protein()
Returns a new protein alphabet.
alpha:decode(code)
Returns a string containing the decoded character of the code number.
alpha:size()
Returns the size of alpha as a number.
Class Bittab
bittab_new(num_of_bits)
Returns a bittab with num_of_bits many bits.
bittab:set_bit(bit)
Set bit in bittab.
bittab:unset_bit(bit)
Unset bit in bittab.
bittab:complement(src)
Store the complement of bittab src in bittab. bittab and src must have the same size.
bittab:equal(src)
Set bittab equal to bittab src. bittab and src must have the same size.
bittab:and_equal(src)
Set bittab equal to the bitwise AND of bittab and src. bittab and src must have the same size.
bittab:bit_is_set(bit)
Returns true if bit is set in bittab, false otherwise.
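As a rough sketch of how bit_is_set() might be used, the hypothetical helper below (not part of the gtscript API) collects the set bit positions within a caller-supplied interval. The bounds are passed in because this document does not state how bit positions are numbered.

```lua
-- Hypothetical helper (not part of the gtscript API): returns an array
-- with all positions in first..last whose bit is set in bittab. The
-- caller chooses the bounds, so no numbering scheme is assumed here.
function collect_set_bits(bittab, first, last)
  local set = {}
  for bit = first, last do
    if bittab:bit_is_set(bit) then
      set[#set + 1] = bit
    end
  end
  return set
end
```

For a bittab created with gt.bittab_new(8) on which set_bit(2) and set_bit(5) were called, collect_set_bits(bittab, 2, 5) would be expected to contain the positions 2 and 5.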
Class CDSStream
cds_stream_new(region_mapping)
Returns a new CDS (coding sequence) stream object (a genome stream) which uses genome stream in_stream as input. The CDS stream adds CDS features to exon features in in_stream. The given region_mapping is used to map the sequence regions given in in_stream to the actual sequence files necessary for computing the coding sequences.
Class CSAStream
csa_stream_new(in_stream, join)
Returns a new CSA (consensus spliced alignment) stream object (a genome stream) which uses genome stream in_stream as input. The CSA stream replaces spliced alignments with computed consensus spliced alignments. The optional join parameter sets the length for the spliced alignment clustering (default: 300).
Class Diagram
diagram_new(feature_index, range, seqid)
Return a diagram object which contains the genome nodes given in feature_index in the given range of the sequence region with sequence ID seqid.
Class FeatureIndex
feature_index_new()
Returns a new feature_index object.
feature_index:add_sequence_region(sequence_region)
Add sequence_region to feature_index.
feature_index:add_genome_feature(genome_feature)
Add genome_feature to feature_index.
feature_index:get_features_for_seqid(seqid)
Returns the genome features for sequence ID seqid in an array.
feature_index:get_features_for_range(seqid, range)
Returns the genome features for sequence ID seqid within range in an array.
feature_index:get_first_seqid()
Returns the first sequence ID stored in feature_index.
feature_index:get_seqids()
Returns an array containing all sequence IDs stored in feature_index.
feature_index:get_range_for_seqid(seqid)
Returns the range covered by features of sequence ID seqid in feature_index.
feature_index:get_coverage(seqid, maxdist)
Computes the coverage for the sequence ID seqid. The optional maxdist parameter denotes the maximal distance two features can be apart without creating a new Range. Returns an array of Ranges denoting the parts of seqid covered by features.
feature_index:get_marked_regions(seqid, maxdist)
Returns an array of Ranges denoting parts of seqid which are covered by at least one marked feature. Internally, get_coverage() is called and the maxdist is passed along.
feature_index:render_to_png(seqid, range, png_file, width)
Render to PNG file png_file for seqid in range with optional width. If no png_file is given, os.tmpname() is called to create one. Returns the name of the written PNG file.
feature_index:show_seqids()
Show all sequence IDs.
feature_index:get_all_features()
Returns all features from feature_index.
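The accessors above can be combined to summarize an index. The sketch below assumes a hypothetical helper named count_features_per_seqid, which is not part of the gtscript API. It relies only on get_seqids() and get_features_for_seqid() returning arrays as documented.

```lua
-- Hypothetical helper (not part of the gtscript API): returns a table
-- mapping each sequence ID stored in feature_index to the number of
-- features indexed for it.
function count_features_per_seqid(feature_index)
  local counts = {}
  for _, seqid in ipairs(feature_index:get_seqids()) do
    local features = feature_index:get_features_for_seqid(seqid)
    -- Be defensive in case no array is returned for a sequence ID.
    counts[seqid] = features and #features or 0
  end
  return counts
end
```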
Class FeatureStream
feature_stream_new(in_stream, feature_index)
Returns a new feature stream object (a genome stream) over feature_index which uses the genome stream in_stream as input. That is, all genome nodes which are pulled through the feature stream are added to the feature_index.
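Because nodes reach the feature_index only while they are pulled through the feature stream, the stream has to be drained before the index is queried. A minimal sketch follows, assuming a hypothetical helper named index_gff3_file (not part of the gtscript API) that receives the gt table as its first argument.

```lua
-- Hypothetical helper (not part of the gtscript API): reads the sorted
-- GFF3 file `filename` and returns a feature index holding its nodes.
function index_gff3_file(gt, filename)
  local in_stream = gt.gff3_in_stream_new_sorted(filename)
  local feature_index = gt.feature_index_new()
  local feature_stream = gt.feature_stream_new(in_stream, feature_index)
  -- Pull every node through the feature stream; next_tree() returns nil
  -- once the input is exhausted.
  local node = feature_stream:next_tree()
  while node do
    node = feature_stream:next_tree()
  end
  return feature_index
end
```

Inside a gtscript file, index_gff3_file(gt, "annotation.gff3") would then return a filled index; the file name is a placeholder.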
Class FeatureVisitor
feature_visitor_new(feature_index)
Returns a new feature visitor object over feature_index. That is, all genome nodes which are visited by the feature visitor are added to the feature_index.
Class GenomeNode
genome_node:get_filename()
Returns the filename of genome_node.
genome_node:get_range()
Returns the range of genome_node.
genome_node:get_seqid()
Returns the sequence ID of genome_node.
genome_node:set_seqid(seqid)
Set the sequence ID of genome_node to seqid.
genome_node:accept(genome_visitor)
Accept genome_visitor.
genome_node:is_part_of_genome_node(child_node)
Make genome_node the parent of child_node.
genome_node:mark()
Mark genome_node.
genome_node:is_marked()
Returns true if genome_node is marked, false otherwise.
genome_node:contains_marked()
Returns true if genome_node contains a marked node, false otherwise.
genome_node:show(gff3_visitor)
Show genome node on stdout (using the optional gff3_visitor).
genome_node:show_marked()
Show marked parts of genome node on stdout.
Class GenomeNodeIterator
genome_node_iterator_new(genome_node)
Returns a new genome node iterator which performs a depth-first traversal of genome_node (including genome_node itself).
genome_node_iterator_new_direct(genome_node)
Returns a new genome node iterator which iterates over all direct children of genome_node (without genome_node itself).
genome_node_iterator:next()
Returns the next genome node for genome_node_iterator or nil.
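Since next() returns nil when the traversal is finished, an iterator can be drained with a simple loop. The sketch below assumes a hypothetical helper named collect_nodes, which is not part of the gtscript API.

```lua
-- Hypothetical helper (not part of the gtscript API): returns an array
-- with all genome nodes delivered by genome_node_iterator, in the order
-- in which next() returns them.
function collect_nodes(genome_node_iterator)
  local nodes = {}
  local node = genome_node_iterator:next()
  while node do
    nodes[#nodes + 1] = node
    node = genome_node_iterator:next()
  end
  return nodes
end
```

For example, collect_nodes(gt.genome_node_iterator_new_direct(genome_node)) would yield the direct children of genome_node as an array.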
Class GenomeStream
genome_stream:next_tree()
Returns the next genome node for genome_stream or nil.
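Because next_tree() signals the end of the stream with nil, a genome stream fits Lua's generic for loop. The adapter below is a hypothetical convenience and not part of the gtscript API.

```lua
-- Hypothetical adapter (not part of the gtscript API): wraps a genome
-- stream in an iterator function for use in a generic for loop. The loop
-- ends as soon as next_tree() returns nil.
function trees(genome_stream)
  return function()
    return genome_stream:next_tree()
  end
end
```

With this adapter, a loop of the form "for node in trees(stream) do node:show() end" would show every genome node delivered by stream.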
Class Range
range_new(startpos, endpos)
Returns a new range object with start startpos and end endpos. startpos must be less than or equal to endpos.
range:get_start()
Returns start of range.
range:get_end()
Returns end of range.
range:overlap(other_range)
Returns true if range and other_range overlap, false otherwise.
range:show()
Show range on stdout.
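As an illustration of the range accessors, the following sketch computes the outer bounds of an array of ranges. The helper bounding_positions is hypothetical and not part of the gtscript API. It uses only get_start() and get_end().

```lua
-- Hypothetical helper (not part of the gtscript API): returns the
-- smallest start and the largest end found in range_array, or nil for
-- both if the array is empty.
function bounding_positions(range_array)
  local min_start, max_end
  for _, range in ipairs(range_array) do
    local s, e = range:get_start(), range:get_end()
    if min_start == nil or s < min_start then min_start = s end
    if max_end == nil or e > max_end then max_end = e end
  end
  return min_start, max_end
end
```

For a non-empty array, the two results can be passed to gt.range_new() to obtain a range that spans all input ranges.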
Class RegionMapping
region_mapping_new_seqfile(seqfile)
Returns a new region mapping which maps everything onto sequence file seqfile.
Class Render
render_new()
Returns a new render object.
render:to_png(diagram, filename, width)
Uses render to render the given diagram as PNG to filename. The optional width parameter sets the width of the PNG (default: 800).
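The Diagram and Render classes are typically used together with a feature index. The sketch below assumes a hypothetical helper named render_seqid (not part of the gtscript API) that receives the gt table as its first argument. It draws the full range covered by one sequence ID.

```lua
-- Hypothetical helper (not part of the gtscript API): renders all
-- features stored in feature_index for seqid to the PNG file `filename`
-- and returns that file name.
function render_seqid(gt, feature_index, seqid, filename, width)
  local range = feature_index:get_range_for_seqid(seqid)
  local diagram = gt.diagram_new(feature_index, range, seqid)
  local render = gt.render_new()
  if width then
    render:to_png(diagram, filename, width)
  else
    -- Omit the argument entirely so that the documented default applies.
    render:to_png(diagram, filename)
  end
  return filename
end
```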
Class ScoreMatrix
score_matrix_new_read_protein(path)
Returns a new protein score matrix object which has been read from file path.
score_matrix:get_dimension()
Returns the dimension of the score_matrix as a number.
score_matrix:get_score(idx1, idx2)
Returns the score for idx1, idx2 as a number.
Class StreamEvaluator
stream_evaluator_new(reality_stream, prediction_stream)
Returns a new stream evaluator object for the two genome streams reality_stream and prediction_stream.
stream_evaluator:evaluate(genome_visitor)
Run evaluation of stream_evaluator. All evaluated features are visited by the optional genome_visitor.
stream_evaluator:show()
Show result of stream_evaluator on stdout.
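A complete evaluation run can be sketched as follows. The helper evaluate_gff3_files is hypothetical and not part of the gtscript API. It takes the gt table as its first argument and assumes that both files are sorted GFF3 files, as gff3_in_stream_new_sorted() requires.

```lua
-- Hypothetical helper (not part of the gtscript API): compares the
-- annotation in prediction_file against the one in reality_file, shows
-- the result on stdout and returns the stream evaluator.
function evaluate_gff3_files(gt, reality_file, prediction_file)
  local reality_stream = gt.gff3_in_stream_new_sorted(reality_file)
  local prediction_stream = gt.gff3_in_stream_new_sorted(prediction_file)
  local evaluator = gt.stream_evaluator_new(reality_stream, prediction_stream)
  evaluator:evaluate()   -- no genome_visitor is passed; it is optional
  evaluator:show()
  return evaluator
end
```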