API Reference

Bindings to tantan, a method for finding repeats in biological sequences.

References

Frith, Martin C. A new repeat-masking method enables specific detection of homologous sequences. Nucleic Acids Research, vol. 39, no. 4 (2011): e23. doi:10.1093/nar/gkq1212.

Functions

pytantan.mask_repeats(sequence, *, protein=False, scoring_matrix=None, match_score=None, mismatch_cost=None, repeat_start=0.005, repeat_end=0.05, repeat_period=None, decay=0.9, mask=None, threshold=0.5)

Mask regions predicted as repeats in the given sequence.

Parameters:

sequence (str or bytes-like object) – The sequence containing the repeats to mask.
protein (bool) – Set to True to treat the input sequence as a protein sequence.
scoring_matrix (str or ScoringMatrix) – A scoring matrix to use for scoring character matches and mismatches. Either pass a matrix name (such as BLOSUM62) to load a built-in matrix, or a pre-initialized ScoringMatrix object.
match_score (int) – The score awarded for character matches. Must be set along with mismatch_cost. Incompatible with the scoring_matrix option.
mismatch_cost (int) – The penalty applied for character mismatches. Must be set along with match_score. Incompatible with the scoring_matrix option.
repeat_start (float) – The probability of a repeat starting per position.
repeat_end (float) – The probability of a repeat ending per position.
decay (float) – The probability decay per period.
threshold (float) – The probability threshold above which to mask sequence characters.
mask (str or None) – A single character used to replace masked positions. If None is given, masked positions are written as the lowercase letters of the original sequence.
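
The interaction between threshold and mask can be sketched in plain Python. This is only an illustration of the masking step, not pytantan's implementation: the helper apply_mask and the per-position probabilities are hypothetical, and the strict "greater than" comparison is an assumption based on the wording "above which" in the threshold description.

```python
from typing import Optional, Sequence


def apply_mask(
    sequence: str,
    probabilities: Sequence[float],
    threshold: float = 0.5,
    mask: Optional[str] = None,
) -> str:
    """Mask every position whose repeat probability exceeds `threshold`.

    Hypothetical helper: the probabilities would normally come from the
    repeat model; here they are supplied by hand.
    """
    if len(sequence) != len(probabilities):
        raise ValueError("need exactly one probability per sequence position")
    if mask is not None and len(mask) != 1:
        raise ValueError("mask must be a single character")

    masked = []
    for char, probability in zip(sequence, probabilities):
        if probability > threshold:
            # Soft masking (lowercase) when no mask character is given,
            # hard masking (replacement) otherwise.
            masked.append(char.lower() if mask is None else mask)
        else:
            masked.append(char)
    return "".join(masked)


# Made-up probabilities for illustration only.
sequence = "ACGTATATATGC"
probabilities = [0.01, 0.02, 0.10, 0.60, 0.90, 0.95, 0.95, 0.90, 0.70, 0.40, 0.05, 0.01]

soft = apply_mask(sequence, probabilities)            # mask=None: lowercase
hard = apply_mask(sequence, probabilities, mask="N")  # mask="N": replace
print(soft)
print(hard)
```

With pytantan installed, the corresponding options would be passed directly, for example pytantan.mask_repeats(sequence, threshold=0.5, mask="N").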

pytantan.default_scoring_matrix(protein=False, match_score=None, mismatch_cost=None)

Get the default Tantan scoring matrix for the given parameters.
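
To make the relationship between match_score, mismatch_cost and a scoring matrix concrete, the sketch below builds a simple match/mismatch matrix in plain Python. It is an assumption about what such a matrix generally looks like, not the library's code: the helper simple_matrix, the "ACGT" alphabet and the example values are all hypothetical, and the actual defaults used by pytantan are not shown here.

```python
from typing import Dict, Tuple


def simple_matrix(
    alphabet: str, match_score: int, mismatch_cost: int
) -> Dict[Tuple[str, str], int]:
    """Build a match/mismatch matrix as a dictionary keyed by letter pairs.

    Identical letters score `match_score`; different letters score
    `-mismatch_cost` (the cost is given as a positive penalty).
    """
    return {
        (a, b): (match_score if a == b else -mismatch_cost)
        for a in alphabet
        for b in alphabet
    }


# Hypothetical values chosen only for illustration.
matrix = simple_matrix("ACGT", match_score=2, mismatch_cost=7)
print(matrix[("A", "A")], matrix[("A", "C")])
```

Because two numbers fully determine such a matrix, it makes sense that match_score and mismatch_cost must be given together and cannot be combined with an explicit scoring_matrix.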

Classes

`pytantan.Alphabet`	An alphabet used for encoding sequences with ordinal encoding.
`pytantan.LikelihoodMatrix`	A likelihood ratio matrix derived from a scoring matrix.
`pytantan.RepeatFinder`	A repeat finder using the Tantan method.