gtscript documentation
This document describes the gtscript API. gtscript is essentially Lua (an embeddable scripting language) plus the parts of the GenomeTools C libraries that are exported to Lua. Because the GenomeTools binary gt contains an embedded Lua interpreter, gtscript files can be executed with the gt binary. Only the exported parts of the GenomeTools C libraries (the gtscript API) are described here; for documentation of Lua itself and its APIs, please refer to the Lua reference manual.
See the index for an alphabetical list of all available interfaces.
Notes
- You have to add require 'gtlua' to your script in order to load the parts of the gtscript API which are implemented in gtscript itself.
- By default, all functions of the gtscript API are contained in the gt table. That is, you have to prepend ``gt.'' to all function calls. For example, write gt.show() to call the show() function documented below. You can use gt.export() or gt.re() to export the gt table to the global environment, which makes the prefix unnecessary (but clutters your global environment).
- If a function is documented as ``returns an array'', this means that the function returns a table where only consecutive integer keys from 1 on are used and which can be traversed with the ipairs() function.
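As a minimal sketch of the three notes above, the following script loads gtlua, calls functions through the gt table, traverses a returned array with ipairs(), and finally exports the table. It assumes the script is executed with the gt binary; the variable names are illustrative only.

```lua
-- Load the parts of the gtscript API that are implemented in gtscript itself.
require 'gtlua'

-- Without export(), every API function is reached through the gt table.
local ranges = { gt.range_new(50, 60), gt.range_new(1, 10) }

-- ranges_sort() is documented as returning an array, so ipairs() applies.
local sorted = gt.ranges_sort(ranges)
for i, range in ipairs(sorted) do
  print(i, range:get_start(), range:get_end())
end

-- After export(), the gt. prefix becomes optional (at the cost of a
-- cluttered global environment).
gt.export()
local r = range_new(1, 100)
print(r:get_start(), r:get_end())
```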
Classes
- Alphabet
- Bittab
- CDSStream
- CSAStream
- Canvas
- CommentNode
- Diagram
- FeatureIndex
- FeatureNode
- FeatureNodeIterator
- FeatureStream
- FeatureVisitor
- GFF3InStream
- GFF3OutStream
- GFF3Visitor
- GenomeNode
- GenomeStream
- Imageinfo
- Layout
- MetaNode
- Range
- RegionMapping
- RegionNode
- ScoreMatrix
- SequenceNode
- StreamEvaluator
- translate
Modules
Sole functions
rand_max(val)
Returns a random number between 0 and val
.
reload()
Reload gt
module.
export()
Export the content of gt
table to the global environment.
display(filename)
Call external 'display' program for file filename
.
show_table(tbl)
Show all keys and values of table tbl
.
show(all)
Show content of the gt
table.
re()
Reload the gt
module and export its content to the global environment.
features_contain_marked(features)
Returns true if the given array of features
contains a marked feature, false otherwise.
features_show(features)
Print the given array of features
to stdout.
features_get_marked(features)
Return all marked features
(an array) as an array or nil if features
contains no marked features.
features_show_marked(features)
Print all marked features
(an array) to stdout.
features_mRNAs2genes(in_features)
Return an array of genome features which contains a separate gene feature for each mRNA in in_features
.
features_extract_sequences(features, type, join, region_mapping)
Return an array with the sequences of the given features.
Class Alphabet
alphabet_new_protein()
Return a new protein alphabet.
alphabet_new_dna()
Return a new DNA alphabet.
alphabet_new_empty()
Return an empty alphabet.
alphabet:add_mapping(characters)
Add the mapping of all given characters
to the given alphabet
. The first character is the result of subsequent alphabet:decode()
calls.
alphabet:add_wildcard(characters)
Add wildcard
to alphabet
.
alphabet:decode(code)
Return a string containing the decoded character of the code
number.
alphabet:size()
Return the size of alphabet
as a number.
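The alphabet functions can be tried out with a short script such as the one below. It is a sketch only: the character groups passed to add_mapping() are made up for illustration, and the script assumes that character codes start at 0, which this document does not state.

```lua
require 'gtlua'

-- A predefined alphabet.
local dna = gt.alphabet_new_dna()
print(dna:size())

-- A custom alphabet built from an empty one. Each add_mapping() call maps
-- all given characters to one code; the first character of the group is
-- what decode() returns for that code.
local alpha = gt.alphabet_new_empty()
alpha:add_mapping("aA")
alpha:add_mapping("cC")
print(alpha:size())

-- Assumption: codes are assigned starting from 0.
print(alpha:decode(0))
```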
Class Bittab
bittab_new(num_of_bits)
Returns a bittab with num_of_bits
many bits.
bittab:set_bit(bit)
Set bit
in bittab
.
bittab:unset_bit(bit)
Unset bit
in bittab
.
bittab:complement(src)
Store the complement of bittab src
in bittab
. bittab
and src
must have the same size.
bittab:equal(src)
Set bittab
equal to bittab src
. bittab
and src
must have the same size.
bittab:and_equal(src)
Set bittab
equal to the bitwise AND of bittab
and src
. bittab
and src
must have the same size.
bittab:bit_is_set(bit)
Returns true if bit
is set in bittab
, false otherwise.
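The following sketch exercises the Bittab methods described above. Whether bit positions are counted from 0 or from 1 is not stated in this document, so the example only uses positions 2 and 3 of an 8-bit bittab, which are valid under either convention.

```lua
require 'gtlua'

local a = gt.bittab_new(8)
a:set_bit(3)

-- Store the complement of a in b; both bittabs must have the same size.
local b = gt.bittab_new(8)
b:complement(a)

print(a:bit_is_set(3), b:bit_is_set(3))
print(a:bit_is_set(2), b:bit_is_set(2))

-- Clearing the bit again.
a:unset_bit(3)
print(a:bit_is_set(3))
```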
Class CDSStream
cds_stream_new(in_stream, region_mapping)
Returns a new CDS (coding sequence) stream object (a genome stream) which uses genome stream in_stream
as input. The CDS stream adds CDS features to exon features in in_stream
. The given region_mapping
is used to map the sequence regions given in in_stream
to the actual sequence files necessary for computing the coding sequences.
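A CDS stream is typically placed between an input stream and an output stream. The sketch below shows one possible chain, using only constructors documented here; the helper name add_cds and both file arguments are hypothetical, and the GFF3 file must be sorted.

```lua
require 'gtlua'

-- Hypothetical helper: add CDS features to the exon features read from
-- gff3_file, using seq_file to look up the underlying sequences.
local function add_cds(gff3_file, seq_file)
  local region_mapping = gt.region_mapping_new_seqfile(seq_file)
  local in_stream = gt.gff3_in_stream_new_sorted(gff3_file)
  local cds_stream = gt.cds_stream_new(in_stream, region_mapping)
  local out_stream = gt.gff3_out_stream_new(cds_stream)

  -- Streams are lazy: nodes only move through the chain when they are
  -- pulled from its end. next_tree() returns nil when the input is exhausted.
  local node = out_stream:next_tree()
  while node do
    node = out_stream:next_tree()
  end
end
```

Pulling from the last stream in the chain is what drives all earlier streams, so no explicit call on in_stream or cds_stream is needed.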
Class CSAStream
csa_stream_new(in_stream, join)
Returns a new CSA (consensus spliced alignment) stream object (a genome stream) which uses genome stream in_stream
as input. The CSA stream replaces spliced alignments with computed consensus spliced alignments. The optional join
parameter sets the length for the spliced alignment clustering (default: 300).
Class Canvas
canvas_cairo_file_new_png(width, imageinfo)
Return a Canvas object which acts as a PNG drawing surface of width width
to be passed to rendering functions as a visitor. An imageinfo
object is filled with coordinate information if given. If not needed, pass nil as imageinfo
.
canvas_cairo_file_new_pdf(width, imageinfo)
Return a Canvas object which acts as a PDF drawing surface of width width
to be passed to rendering functions as a visitor. An imageinfo
object is filled with coordinate information if given.
canvas_cairo_file_new_ps(width, imageinfo)
Return a Canvas object which acts as a PS drawing surface of width width
to be passed to rendering functions as a visitor. An imageinfo
object is filled with coordinate information if given.
canvas_cairo_file_new_svg(width, imageinfo)
Return a Canvas object which acts as an SVG drawing surface of width width
to be passed to rendering functions as a visitor. An imageinfo
object is filled with coordinate information if given.
canvas:to_file(filename)
Creates an image file with the given filename
which contains the contents of the canvas (only for objects created with canvas_cairo_file_new_*()
).
Class CommentNode
comment_node_new(comment)
Returns a new comment node with comment text comment
.
comment_node:get_comment()
Return comment string of comment_node
.
Class Diagram
diagram_new(feature_index, range, seqid)
Return a diagram object which contains the genome nodes given in feature_index
in the given range
of the sequence region with sequence ID seqid
.
diagram_new_from_array(array, startpos, endpos)
Return a diagram object which contains the genome nodes given in array
. The range from startpos
to endpos
determines the visible region and should include the nodes in array
.
Class FeatureIndex
feature_index_memory_new()
Returns a new FeatureIndex object storing the index in memory.
feature_index:add_gff3file(gff3file)
Add all features from all sequence regions contained in gff3file
to feature_index
.
feature_index:add_region_node(region_node)
Add region_node
to feature_index
.
feature_index:add_feature_node(feature_node)
Add feature_node
to feature_index
, implicitly creating the sequence region if it is not already present.
feature_index:get_features_for_seqid(seqid)
Returns the feature nodes for sequence ID seqid
in an array.
feature_index:get_features_for_range(seqid, range)
Returns the genome features for sequence ID seqid
within range
in an array.
feature_index:get_first_seqid()
Returns the first sequence ID stored in feature_index
.
feature_index:get_seqids()
Returns an array containing all sequence IDs stored in feature_index
.
feature_index:get_range_for_seqid(seqid)
Returns the range covered by features of sequence ID seqid
in feature_index
.
feature_index:get_coverage(seqid, maxdist)
Computes the coverage for the sequence ID seqid
. The optional maxdist
parameter denotes the maximal distance two features can be apart without creating a new Range. Returns an array of Ranges denoting the parts of seqid
covered by features.
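Because get_seqids() and get_coverage() both return arrays, they combine naturally with ipairs(). The helper below is a sketch under that reading; the name show_coverage is hypothetical, and feature_index is expected to be an already filled FeatureIndex.

```lua
require 'gtlua'

-- Hypothetical helper: print every covered Range for every sequence ID.
-- maxdist is optional and is passed through to get_coverage() unchanged.
local function show_coverage(feature_index, maxdist)
  for _, seqid in ipairs(feature_index:get_seqids()) do
    for _, range in ipairs(feature_index:get_coverage(seqid, maxdist)) do
      print(seqid, range:get_start(), range:get_end())
    end
  end
end
```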
feature_index:get_marked_regions(seqid, maxdist)
Returns an array of Ranges denoting parts of seqid
which are covered by at least one marked feature. Internally, get_coverage()
is called and the maxdist
is passed along.
feature_index:render_to_png(seqid, range, png_file, width)
Render to PNG file png_file
for seqid
in range
with optional width
. If no png_file
is given os.tmpname()
is called to create one. Returns name of written PNG file.
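A possible way to render everything stored for the first sequence ID is sketched below. The helper name and the width of 800 are illustrative choices, and the sketch assumes that the gt binary in use was built with the graphics support needed for PNG output.

```lua
require 'gtlua'

-- Hypothetical helper: render the full range of the first sequence ID.
-- If png_file is nil, render_to_png() picks a temporary file name itself.
local function render_first_seqid(feature_index, png_file)
  local seqid = feature_index:get_first_seqid()
  local range = feature_index:get_range_for_seqid(seqid)
  -- The name of the written PNG file is returned to the caller.
  return feature_index:render_to_png(seqid, range, png_file, 800)
end
```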
feature_index:show_seqids()
Show all sequence IDs.
feature_index:get_all_features()
Returns all features from feature_index
.
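A FeatureIndex can also be filled by hand, which is convenient for small experiments. The sketch below adds a single feature node and queries the index again; the sequence ID, coordinates and the "+" strand string are made-up example values, and passing the strand as a one-character string is an assumption.

```lua
require 'gtlua'

local index = gt.feature_index_memory_new()

-- add_feature_node() implicitly creates the sequence region "chr1".
local gene = gt.feature_node_new("chr1", "gene", 100, 900, "+")
index:add_feature_node(gene)

print(index:get_first_seqid())

-- get_features_for_seqid() returns an array, so ipairs() can be used.
local features = index:get_features_for_seqid("chr1")
for _, feature in ipairs(features) do
  local range = feature:get_range()
  print(feature:get_type(), range:get_start(), range:get_end())
end
```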
Class FeatureNode
feature_node_new(seqid, type, startpos, endpos, strand)
Create a new feature node on sequence with ID seqid
and type type
which lies from startpos
to endpos
on strand strand
. startpos
and endpos
always refer to the forward strand, therefore startpos
has to be smaller than or equal to endpos
.
feature_node:get_strand()
Returns the strand of feature_node
.
feature_node:get_source()
Returns the source of feature_node
.
feature_node:set_source(source)
Set the source of feature_node
to source
.
feature_node:get_score()
Returns the score of feature_node
.
feature_node:get_phase()
Returns the phase of feature_node
.
feature_node:get_type()
Return type of feature_node
as string.
feature_node:set_type(type)
Sets type of feature_node
to be type
.
feature_node:get_attribute(attrib)
Returns the attrib
attribute of feature_node
.
feature_node:set_attribute(attrib, value)
Sets the attrib
attribute of feature_node
to value
.
feature_node:add_attribute(attrib, value)
Adds the new attrib
attribute of feature_node
with value value
.
feature_node:remove_attribute(attrib)
Removes the attrib
attribute of feature_node
.
feature_node:attribute_pairs()
Returns a Lua iterator over all attributes of feature_node
, which delivers pairs of key/value strings. Similar to the Lua pairs()
function, applied to a table with string keys.
feature_node:get_exons()
Returns an array containing the exons of feature_node
.
feature_node:children()
Returns a depth-first Lua iterator over all children of feature_node
(including feature_node
itself).
feature_node:get_children()
Returns a depth-first Lua iterator over all children of feature_node
(including feature_node
itself).
feature_node:direct_children()
Returns a Lua iterator over all direct children of feature_node
(not including feature_node
itself).
feature_node:get_direct_children()
Returns a Lua iterator over all direct children of feature_node
(not including feature_node
itself).
feature_node:has_child_of_type(type)
Returns true
if feature_node
has a child node of type type
.
feature_node:add_child(child)
Adds child
as a child node of feature_node
.
feature_node:remove_leaf(leaf)
Removes leaf
as a child node in the subgraph beneath feature_node
, if it is a leaf.
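The feature node methods above can be combined as in the following sketch, which builds a small gene with two exons, attaches an attribute, and walks the tree. All values are invented for illustration, and the strand is passed as the string "+", which is an assumption about the expected argument form.

```lua
require 'gtlua'

-- Children must lie on the same sequence as their parent.
local gene = gt.feature_node_new("chr1", "gene", 100, 900, "+")
local exon1 = gt.feature_node_new("chr1", "exon", 100, 300, "+")
local exon2 = gt.feature_node_new("chr1", "exon", 600, 900, "+")
gene:add_child(exon1)
gene:add_child(exon2)
gene:set_source("example")
gene:add_attribute("Name", "example_gene")

-- attribute_pairs() delivers key/value strings, much like pairs().
for key, value in gene:attribute_pairs() do
  print(key, value)
end

-- children() traverses depth-first and includes the gene itself.
for node in gene:children() do
  local range = node:get_range()
  print(node:get_type(), range:get_start(), range:get_end())
end
```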
feature_node:output_leading()
Show leading part of GFF3 output for feature_node.
feature_node:extract_sequence(type, join, region_mapping)
Extract the sequence of feature_node
. If join
is false and feature_node
has type type
the sequence is returned (using region_mapping
to get it). If join
is true and feature_node
has children of type type
their joined sequences are returned. If none of the above applies nil is returned.
feature_node:extract_and_translate_sequence(type, join, region_mapping)
Extract the translated sequence of feature_node
. If join
is false and feature_node
has type type
the sequence is returned (using region_mapping
to get it). If join
is true and feature_node
has children of type type
their joined sequences are returned. If none of the above applies nil is returned.
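Both extraction methods return nil when neither case applies, so callers should check the result. The sketch below prints the joined exon sequence of every feature stored for one sequence ID; the helper name is hypothetical, and feature_index and region_mapping are expected to have been created beforehand.

```lua
require 'gtlua'

-- Hypothetical helper: print the joined exon sequences of all features
-- stored in feature_index for the given seqid.
local function print_joined_exons(feature_index, seqid, region_mapping)
  local features = feature_index:get_features_for_seqid(seqid)
  for _, feature in ipairs(features) do
    -- join = true: the sequences of all "exon" children are concatenated.
    local sequence = feature:extract_sequence("exon", true, region_mapping)
    if sequence then
      print(sequence)
    end
  end
end
```

Replacing extract_sequence() with extract_and_translate_sequence() in this loop would be the analogous call for translated output.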
Class FeatureNodeIterator
feature_node_iterator_new(node)
Returns a new feature node iterator which performs a depth-first traversal of node
(including node
itself).
feature_node_iterator_new_direct(node)
Returns a new feature node iterator which iterates over all direct children of node
(without node
itself).
feature_node_iterator:next()
Returns the next node for feature_node_iterator
or nil.
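An explicit iterator is driven by calling next() until it returns nil. The sketch below counts the direct children of a hand-built gene; the node values are made up, and the "+" strand string is an assumption about the expected argument form.

```lua
require 'gtlua'

local gene = gt.feature_node_new("chr1", "gene", 100, 900, "+")
gene:add_child(gt.feature_node_new("chr1", "exon", 100, 300, "+"))
gene:add_child(gt.feature_node_new("chr1", "exon", 600, 900, "+"))

-- Iterate over the direct children only (the gene itself is skipped).
local iterator = gt.feature_node_iterator_new_direct(gene)
local count = 0
local child = iterator:next()
while child do
  count = count + 1
  print(child:get_type())
  child = iterator:next()
end
print(count)
```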
Class FeatureStream
feature_stream_new(in_stream, feature_index)
Returns a new feature stream object (a genome stream) over feature_index
which uses the genome stream in_stream
as input. That is, all genome nodes which are pulled through the feature stream are added to the feature_index
.
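Since nodes are only added to the index while they are pulled through the feature stream, the stream has to be drained explicitly. A minimal sketch, with a hypothetical helper name and a file argument that must point to a sorted GFF3 file:

```lua
require 'gtlua'

-- Hypothetical helper: read a sorted GFF3 file into a new in-memory index.
local function index_gff3(filename)
  local index = gt.feature_index_memory_new()
  local in_stream = gt.gff3_in_stream_new_sorted(filename)
  local feature_stream = gt.feature_stream_new(in_stream, index)

  -- Pull all nodes through the feature stream; each one is added to index.
  local node = feature_stream:next_tree()
  while node do
    node = feature_stream:next_tree()
  end
  return index
end
```

For the common case of indexing a whole file, feature_index:add_gff3file() is the shorter route; the explicit stream is useful when further streams are chained in between.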
Class FeatureVisitor
feature_visitor_new(feature_index)
Returns a new feature visitor object over feature_index
. That is, all genome nodes which are visited by the feature visitor are added to the feature_index
.
Class GFF3InStream
gff3_in_stream_new_sorted(filename)
Returns a new GFF3 input stream object for filename
. The file filename
has to be a sorted GFF3 file. If filename
is omitted or nil
, input will be read from standard input.
Class GFF3OutStream
gff3_out_stream_new(genome_stream)
Returns a new GFF3 output stream which pulls its features from genome_stream
.
Class GFF3Visitor
gff3_visitor_new()
Returns a new GFF3 visitor.
Class GenomeNode
genome_node:get_filename()
Returns the filename of genome_node
.
genome_node:get_line_number()
Returns the line number of genome_node
.
genome_node:get_range()
Returns the range of genome_node
.
genome_node:set_range(range)
Sets the range of genome_node
to range
.
genome_node:get_seqid()
Returns the sequence id of genome_node
.
genome_node:accept(genome_visitor)
Accept genome_visitor
.
genome_node:is_part_of_genome_node(child_node)
Make genome_node
the parent of child_node
.
genome_node:mark()
Mark genome_node
.
genome_node:is_marked()
Returns true if genome_node
is marked, false otherwise.
genome_node:contains_marked()
Returns true if genome_node
contains a marked node, false otherwise.
genome_node:show(gff3_visitor)
Show genome node on stdout (using the optional gff3_visitor
).
genome_node:show_marked()
Show marked parts of genome node on stdout.
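Marking can be tried on a hand-built node, as in the sketch below. It marks the node itself so that the result does not depend on how marks on child nodes are treated; the feature values and the "+" strand string are illustrative assumptions.

```lua
require 'gtlua'

local gene = gt.feature_node_new("chr1", "gene", 100, 900, "+")
local marked_before = gene:is_marked()

gene:mark()
print(marked_before, gene:is_marked())

-- The array-level helper from the sole functions reports the same mark.
print(gt.features_contain_marked({ gene }))
```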
Class GenomeStream
genome_stream:next_tree()
Returns the next genome node for genome_stream
or nil.
Class Imageinfo
imageinfo_new()
Returns a new Imageinfo object.
imageinfo:get_recmaps()
Returns an array of tables, one per element drawn, with the fields "nw_x", "nw_y", "se_x", "se_y" and "feature_ref". The coordinate fields give the top left and bottom right corners in pixels or points, and "feature_ref" holds a reference to the corresponding GenomeNode.
Class Layout
layout_new(diagram, width)
Return a Layout object which represents a layout of the content of diagram
with width width
.
layout:get_height(layout)
Return the height of the resulting layout
.
layout:sketch(layout, canvas)
Draw the content of the layout
on a given canvas
.
Class MetaNode
meta_node_new(directive, data)
Returns a new meta node with key directive
and data string data
.
meta_node:get_directive()
Return directive of meta_node
as string.
meta_node:get_data()
Return data of meta_node
as string.
Class Range
range_new(startpos, endpos)
Returns a new range object with start startpos
and end endpos
. startpos
must be smaller than or equal to endpos
.
range:get_start()
Returns start of range
.
range:get_end()
Returns end of range
.
range:overlap(other_range)
Returns true if range
and other_range
overlap, false otherwise.
range:overlap_delta(other_range, delta)
Returns true if range
and other_range
overlap with delta delta
, false otherwise.
range:length()
Returns the length of range
.
range:contains(other_range)
Returns true if range
contains other_range
, false otherwise.
range:join(other_range)
Returns a new range consisting of range
and other_range
joined.
range:within(point)
Returns true if point
lies within range
, false otherwise.
ranges_sort(range_array)
Returns an array containing the ranges from array range_array
in sorted order.
ranges_are_sorted(range_array)
Returns true if the ranges in array range_array
are sorted, false otherwise.
range:show()
Show range on stdout.
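The Range methods are easy to explore interactively. The following sketch uses two overlapping example ranges; the coordinates are arbitrary.

```lua
require 'gtlua'

local a = gt.range_new(1, 10)
local b = gt.range_new(5, 20)

print(a:overlap(b))   -- the two ranges share positions 5 to 10
print(a:contains(b))  -- b extends beyond the end of a
print(a:within(7))    -- the point 7 lies inside a

-- join() returns a new range and leaves a and b untouched.
local joined = a:join(b)
print(joined:get_start(), joined:get_end(), joined:length())
```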
Class RegionMapping
region_mapping_new_seqfile(seqfile)
Returns a new region mapping which maps everything onto sequence file seqfile
.
region_mapping:get_sequence(seqid, start, end)
Use region_mapping
to extract the sequence from start
to end
of the given sequence ID seqid
.
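A region mapping can also be used on its own to fetch a subsequence. In the sketch below the helper name and all arguments are hypothetical; note that end is a reserved word in Lua, so the last parameter is called endpos here.

```lua
require 'gtlua'

-- Hypothetical helper: return the sequence from startpos to endpos of the
-- sequence region seqid, looked up in the sequence file seq_file.
local function fetch_subsequence(seq_file, seqid, startpos, endpos)
  local region_mapping = gt.region_mapping_new_seqfile(seq_file)
  return region_mapping:get_sequence(seqid, startpos, endpos)
end
```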
Class RegionNode
region_node_new(seqid, range)
Returns a new region node for sequence id seqid
spanning range
.
Class ScoreMatrix
score_matrix_new_read_protein(path)
Returns a new protein score matrix object which has been read from file path
.
score_matrix_new_read(path, alphabet)
Read in score matrix from path
over given alphabet
and return it.
score_matrix:get_dimension()
Returns the dimension of the score_matrix
as number.
score_matrix:get_score(idx1, idx2)
Returns the score for idx1
,idx2
as number.
score_matrix:show()
Show score_matrix
on stdout.
Class SequenceNode
sequence_node_new(desc, sequence)
Returns a new sequence node with description desc
and (unencoded) sequence sequence
.
sequence_node:get_description()
Returns description of sequence_node
.
sequence_node:get_sequence()
Returns sequence of sequence_node
.
sequence_node:get_sequence_length()
Returns length of the sequence of sequence_node
.
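A short sketch of the SequenceNode accessors, using an invented description and sequence:

```lua
require 'gtlua'

local node = gt.sequence_node_new("example sequence", "acgtacgt")
print(node:get_description())
print(node:get_sequence())
print(node:get_sequence_length())
```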
Class StreamEvaluator
stream_evaluator_new(reality_stream, prediction_stream)
Returns a new stream evaluator object for the two genome streams reality_stream
and prediction_stream
.
stream_evaluator:evaluate(genome_visitor)
Run evaluation of stream_evaluator
. All evaluated features are visited by the optional genome_visitor
.
stream_evaluator:show()
Show result of stream_evaluator
on stdout.
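Put together, an evaluation run might look like the sketch below. The helper name and both file arguments are hypothetical; both files are read with gff3_in_stream_new_sorted() and therefore have to be sorted GFF3 files.

```lua
require 'gtlua'

-- Hypothetical helper: compare a predicted annotation against a reference.
local function evaluate_prediction(reality_file, prediction_file)
  local reality_stream = gt.gff3_in_stream_new_sorted(reality_file)
  local prediction_stream = gt.gff3_in_stream_new_sorted(prediction_file)
  local evaluator = gt.stream_evaluator_new(reality_stream, prediction_stream)

  -- The genome visitor argument of evaluate() is optional and omitted here.
  evaluator:evaluate()
  evaluator:show()
end
```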
Class translate
translate_dna(dna)
Returns translated dna
.
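A minimal call could look as follows. The input string is an arbitrary example, and the sketch assumes that the result is returned as a Lua string.

```lua
require 'gtlua'

-- Translate a short DNA string (two codons) and print the result.
local protein = gt.translate_dna("atggcc")
print(protein)
```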
Index
alphabet:add_mapping
alphabet:add_wildcard
alphabet:decode
alphabet:size
alphabet_new_dna
alphabet_new_empty
alphabet_new_protein
bittab:and_equal
bittab:bit_is_set
bittab:complement
bittab:equal
bittab:set_bit
bittab:unset_bit
bittab_new
canvas:to_file
canvas_cairo_file_new_pdf
canvas_cairo_file_new_png
canvas_cairo_file_new_ps
canvas_cairo_file_new_svg
cds_stream_new
comment_node:get_comment
comment_node_new
csa_stream_new
diagram_new
diagram_new_from_array
display
export
feature_index:add_feature_node
feature_index:add_gff3file
feature_index:add_region_node
feature_index:get_all_features
feature_index:get_coverage
feature_index:get_features_for_range
feature_index:get_features_for_seqid
feature_index:get_first_seqid
feature_index:get_marked_regions
feature_index:get_range_for_seqid
feature_index:get_seqids
feature_index:render_to_png
feature_index:show_seqids
feature_index_memory_new
feature_node:add_attribute
feature_node:add_child
feature_node:attribute_pairs
feature_node:children
feature_node:direct_children
feature_node:extract_and_translate_sequence
feature_node:extract_sequence
feature_node:get_attribute
feature_node:get_children
feature_node:get_direct_children
feature_node:get_exons
feature_node:get_phase
feature_node:get_score
feature_node:get_source
feature_node:get_strand
feature_node:get_type
feature_node:has_child_of_type
feature_node:output_leading
feature_node:remove_attribute
feature_node:remove_leaf
feature_node:set_attribute
feature_node:set_source
feature_node:set_type
feature_node_iterator:next
feature_node_iterator_new
feature_node_iterator_new_direct
feature_node_new
feature_stream_new
feature_visitor_new
features_contain_marked
features_extract_sequences
features_get_marked
features_mRNAs2genes
features_show
features_show_marked
genome_node:accept
genome_node:contains_marked
genome_node:get_filename
genome_node:get_line_number
genome_node:get_range
genome_node:get_seqid
genome_node:is_marked
genome_node:is_part_of_genome_node
genome_node:mark
genome_node:set_range
genome_node:show
genome_node:show_marked
genome_stream:next_tree
gff3_in_stream_new_sorted
gff3_out_stream_new
gff3_visitor_new
imageinfo:get_recmaps
imageinfo_new
layout:get_height
layout:sketch
layout_new
meta_node:get_data
meta_node:get_directive
meta_node_new
rand_max
range:contains
range:get_end
range:get_start
range:join
range:length
range:overlap
range:overlap_delta
range:show
range:within
range_new
ranges_are_sorted
ranges_sort
re
region_mapping:get_sequence
region_mapping_new_seqfile
region_node_new
reload
score_matrix:get_dimension
score_matrix:get_score
score_matrix:show
score_matrix_new_read
score_matrix_new_read_protein
sequence_node:get_description
sequence_node:get_sequence
sequence_node:get_sequence_length
sequence_node_new
show
show_table
stream_evaluator:evaluate
stream_evaluator:show
stream_evaluator_new
translate_dna