Translation helpers
- class co.translation.MutableTranslationTable(size)
A mutable version of TranslationTable with insert, delete and substitute methods for updating the translation table with the corresponding mutations.
Parameters: size – the length of the source sequence
- delete(position, size, strict=True)
Insert a gap in the target sequence.
Parameters: - position (int) – start of gap site
- size (int) – length of gap
- strict (bool) – use strict mode
Raises OverlapError: in various edge cases involving overlapping mutations, particularly in strict mode.
- freeze()
Return an immutable version of this translation table.
Return type: TranslationTable
- classmethod from_mutations(sequence, mutations, strict=True)
Parameters: - sequence – the source sequence
- mutations – iterable of Mutation objects
- strict (bool) – use strict mode
- classmethod from_sequences(reference, query, algorithm=None)
Raises NotImplementedError:
- insert(position, size, strict=True)
Insert a gap in the source sequence.
Parameters: - position (int) – start of gap site
- size (int) – length of gap
- strict (bool) – use strict mode
Raises OverlapError: in various edge cases involving overlapping mutations, particularly in strict mode.
- substitute(position, size, strict=True)
Insert two gaps of equal length in the source and target sequences.
Parameters: - position (int) – start of gap site
- size (int) – length of gap
- strict (bool) – use strict mode
Raises OverlapError: in various edge cases involving overlapping mutations, particularly in strict mode.
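As an illustration of what the three methods record, the following minimal sketch builds the chain for a single mutation applied to an otherwise unmodified table. It is not the co implementation: single_mutation_chain and target_length are hypothetical helpers, the tuple layout (size, source_skip, target_skip) is an assumed reading of the chain format, and overlapping mutations and strict mode are not modelled.

```python
def single_mutation_chain(source_size, kind, position, size):
    """Return a toy chain for one mutation on a fresh table.

    Assumed tuple layout (not confirmed by the co docs):
    (ungapped size, source positions skipped, target positions skipped).
    """
    if not 0 <= position <= source_size:
        raise IndexError("position outside the source sequence")
    if kind == "delete":        # source bases with no target counterpart
        source_skip, target_skip = size, 0
    elif kind == "insert":      # target bases with no source counterpart
        source_skip, target_skip = 0, size
    elif kind == "substitute":  # equal-length gaps on both sides
        source_skip, target_skip = size, size
    else:
        raise ValueError("unknown mutation kind: %r" % kind)
    if position + source_skip > source_size:
        raise IndexError("mutation runs past the end of the source")
    remaining = source_size - position - source_skip
    return [(position, source_skip, target_skip), (remaining, 0, 0)]


def target_length(chain):
    """Length of the target sequence implied by a toy chain."""
    return sum(size + target_skip for size, _, target_skip in chain)


deletion = single_mutation_chain(10, "delete", 4, 2)
insertion = single_mutation_chain(10, "insert", 4, 3)
substitution = single_mutation_chain(10, "substitute", 4, 2)
```

Under these assumptions a deletion shortens the target, an insertion lengthens it, and a substitution leaves its length unchanged.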
- exception co.translation.OverlapError
OverlapError is raised when a mutation is applied to a position in a sequence that has been altered by a previous mutation.
In strict mode, OverlapError is raised in more cases, such as when a deletion is applied to a range that has previously been modified by an insertion.
- class co.translation.TranslationTable(source_size, target_size, source_start, source_end, target_start, target_end, chain)
This class is inspired by the UCSC chain format for pairwise alignments, documented at http://genome.ucsc.edu/goldenPath/help/chain.html
TranslationTable encodes an alignment between two sequences, source and target.
The alignment is encoded as a chain of tuples of the form (ungapped_size, ds, dt), where ungapped_size is the length of a region that aligns without gaps, and ds and dt are the lengths of the gaps in the source and target respectively, each corresponding to a region present only in the other sequence.
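To make the chain encoding concrete, the sketch below expands a chain into the start coordinates of its ungapped blocks and sums their lengths. It is a simplified model rather than the library's code: aligned_blocks is a hypothetical helper, and the tuple is read as (size, source_skip, target_skip), where each skip counts positions passed over in the named sequence before the next block. Whether ds and dt in co follow this orientation is an assumption.

```python
def aligned_blocks(chain, source_start=0, target_start=0):
    """Yield (source_pos, target_pos, size) for every ungapped block.

    Assumed tuple layout: (size, source_skip, target_skip).
    """
    s, t = source_start, target_start
    for size, source_skip, target_skip in chain:
        if size > 0:
            yield (s, t, size)
        # Move past the block, then past the unaligned positions on each side.
        s += size + source_skip
        t += size + target_skip


chain = [(3, 2, 0), (4, 0, 1), (2, 0, 0)]
blocks = list(aligned_blocks(chain))
total_ungapped = sum(size for _, _, size in blocks)
```

Here the three blocks start at source positions 0, 5 and 9 and at target positions 0, 3 and 8, and the total ungapped length is 9.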
- source_start
The first position in the source sequence that aligns with the target sequence.
- source_end
The last position in the source sequence that aligns with the target sequence.
- alignment()
Returns an iterator yielding tuples in the form (source, target).
Returns: an iterator over all coordinates in both the source and target sequence.
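A coordinate iterator of this kind could look roughly as follows. This is a sketch under stated assumptions, not the behaviour of alignment() itself: alignment_pairs is a hypothetical function, the chain tuples are read as (size, source_skip, target_skip), and positions without a counterpart are paired with None, which the documentation above does not specify.

```python
def alignment_pairs(chain, source_start=0, target_start=0):
    """Yield (source, target) pairs; None marks a missing counterpart.

    Assumed tuple layout: (size, source_skip, target_skip).
    """
    s, t = source_start, target_start
    for size, source_skip, target_skip in chain:
        for _ in range(size):          # aligned positions advance together
            yield (s, t)
            s += 1
            t += 1
        for _ in range(source_skip):   # source-only positions
            yield (s, None)
            s += 1
        for _ in range(target_skip):   # target-only positions
            yield (None, t)
            t += 1


pairs = list(alignment_pairs([(2, 1, 0), (1, 0, 2), (1, 0, 0)]))
```

Every source and target coordinate appears exactly once in the output, so the pairs can be used to print or check a small alignment by hand.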
- alignment_str()
Returns a string representation of the alignment between source and target coordinates.
Warning
This function should only be used for debugging purposes.
- ge(position)
See also
The le() function.
Raises IndexError: if position does not exist in the source or if it maps to a coordinate after the end of the target sequence alignment.
Returns: the first position equal to or greater than position that exists in the query sequence.
- invert()
Creates a copy of the table where source and target are inverted.
Returns: a new TranslationTable object.
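Conceptually, inverting a table amounts to swapping every source-side value with its target-side counterpart. The sketch below shows this on a toy structure. ToyTable is a hypothetical stand-in and not the co class, it carries only a subset of the real constructor's fields, and the tuple layout (size, source_skip, target_skip) is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ToyTable:
    """Hypothetical stand-in for a translation table (not co's class)."""
    source_size: int
    target_size: int
    chain: Tuple[Tuple[int, int, int], ...]  # assumed (size, source_skip, target_skip)

    def invert(self) -> "ToyTable":
        # Swap the two gap fields in every tuple and swap the sizes.
        flipped = tuple(
            (size, target_skip, source_skip)
            for size, source_skip, target_skip in self.chain
        )
        return ToyTable(self.target_size, self.source_size, flipped)


table = ToyTable(11, 10, ((3, 2, 0), (4, 0, 1), (2, 0, 0)))
inverted = table.invert()
```

Because the operation only swaps fields, inverting twice gives back a table equal to the original.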
- le(position)
le() attempts to return the coordinate in the target sequence that corresponds to the position parameter in the source sequence. If position falls into a gap in the target sequence, it will instead return the last coordinate before that gap.
Raises IndexError: if position does not exist in the source or if it maps to a coordinate before the start of the target sequence alignment.
Returns: the first position equal to or lower than position that exists in the query sequence.
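The lookup rules for le() and ge() can be sketched over a list of ungapped blocks. The functions below are hypothetical, standalone versions that take the blocks and source size as arguments. They assume chain tuples of the form (size, source_skip, target_skip) and 0-based coordinates, neither of which is confirmed by the documentation above.

```python
def build_blocks(chain, source_start=0, target_start=0):
    """Return [(source_pos, target_pos, size)] for each ungapped block."""
    blocks, s, t = [], source_start, target_start
    for size, source_skip, target_skip in chain:
        if size > 0:
            blocks.append((s, t, size))
        s += size + source_skip
        t += size + target_skip
    return blocks


def le(blocks, source_size, position):
    """Target coordinate for position, or the last one before its gap."""
    if not 0 <= position < source_size:
        raise IndexError("position outside the source sequence")
    candidates = [b for b in blocks if b[0] <= position]
    if not candidates:
        raise IndexError("position precedes the aligned region")
    s, t, size = candidates[-1]
    # Inside the block: exact match. Past its end: clamp to its last position.
    return t + min(position - s, size - 1)


def ge(blocks, source_size, position):
    """Target coordinate for position, or the first one after its gap."""
    if not 0 <= position < source_size:
        raise IndexError("position outside the source sequence")
    for s, t, size in blocks:
        if position < s + size:
            # Inside the block: exact match. Before it: clamp to its start.
            return t + max(position - s, 0)
    raise IndexError("position follows the aligned region")


blocks = build_blocks([(3, 2, 0), (4, 0, 1), (2, 0, 0)])
```

With these blocks, source position 3 has no counterpart in the target, so le() falls back to target coordinate 2 and ge() moves forward to target coordinate 3. For aligned positions the two functions agree.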
- total_ungapped_size
The total length of the alignment between source and target.