pyprobound.alphabets.RNA

class RNA

Bases: Alphabet

Stores the RNA encoding of sequences into tensors.

Three sequence characters are reserved: ‘ ’ (a single space) is -infinity (not scored), ‘*’ is the IUPAC wildcard character N, and ‘-’ is zero.

alphabet

(‘A’, ‘C’, ‘G’, ‘U’).

Type:

tuple[str]

get_index

A mapping of monomers in the alphabet to indices in the embedding matrix.

Type:

dict[str, int]

get_encoding

IUPAC encoding of monomers to tuples of indices in the embedding matrix; for example, ‘N’ maps to (0, 1, 2, 3).

Type:

dict[str, tuple[int, ...]]
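The three attributes above fit together roughly as follows. This is a minimal sketch in plain Python, not the library's source: the names `iupac`, `get_index`, and `get_encoding` as module-level variables are illustrative, and the IUPAC table is the standard nucleotide code set written with U in place of T. Whether the real `get_index` also holds entries beyond the four monomers is not stated above, so the sketch covers only the monomers.

```python
# Illustrative reconstruction only; not pyprobound's implementation.
alphabet = ("A", "C", "G", "U")

# Monomer -> index in the embedding matrix.
get_index = {monomer: i for i, monomer in enumerate(alphabet)}

# Standard IUPAC nucleotide codes, with U in place of T.
iupac = {
    "A": "A", "C": "C", "G": "G", "U": "U",
    "R": "AG", "Y": "CU", "S": "CG", "W": "AU", "K": "GU", "M": "AC",
    "B": "CGU", "D": "AGU", "H": "ACU", "V": "ACG", "N": "ACGU",
}

# IUPAC code -> tuple of monomer indices it stands for.
get_encoding = {
    code: tuple(get_index[m] for m in monomers)
    for code, monomers in iupac.items()
}

print(get_index["U"])     # 3
print(get_encoding["N"])  # (0, 1, 2, 3)
print(get_encoding["R"])  # (0, 2)
```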

__init__()

Initializes the alphabet.

Methods

embed(seqs)

Embeds sequences from a dense to a one-hot representation.

pairwise_embed(seqs, dist)

Embeds sequences into a one-hot pairwise representation.

translate(sequence)

Translates a sequence into a tensor.
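To make the three methods concrete, the sketch below mimics them on plain Python strings and lists. Everything in it is an assumption for illustration: the real methods work on tensors, their exact signatures and return shapes are not given above, and the pairwise layout (`a * 4 + b` for the monomers at positions `i` and `i + dist`) and the all-ones row for ‘*’ are guesses at one reasonable convention, not the library's documented behavior.

```python
import math

# Hypothetical stand-ins; not pyprobound's implementation.
ALPHABET = ("A", "C", "G", "U")
INDEX = {m: i for i, m in enumerate(ALPHABET)}


def translate(sequence):
    """Dense form: one integer per monomer (reserved characters not handled)."""
    return [INDEX[ch] for ch in sequence]


def embed_char(ch):
    """One row of the embedded form for a single character."""
    n = len(ALPHABET)
    if ch == " ":  # reserved: -infinity, position is not scored
        return [-math.inf] * n
    if ch == "-":  # reserved: zero
        return [0.0] * n
    if ch == "*":  # reserved: wildcard N (assumed: every monomer is set)
        return [1.0] * n
    row = [0.0] * n
    row[INDEX[ch]] = 1.0
    return row


def embed(sequence):
    """One-hot form: one row of length 4 per character."""
    return [embed_char(ch) for ch in sequence]


def pairwise_embed(sequence, dist):
    """One-hot over ordered monomer pairs at positions (i, i + dist)."""
    n = len(ALPHABET)
    rows = []
    for i in range(len(sequence) - dist):
        row = [0.0] * (n * n)
        row[INDEX[sequence[i]] * n + INDEX[sequence[i + dist]]] = 1.0
        rows.append(row)
    return rows


print(translate("GAUC"))               # [2, 0, 3, 1]
print(embed("G-"))                     # [[0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
print(len(pairwise_embed("GAUC", 1)))  # 3
```

A sequence of length L yields L rows from `embed` and L - dist rows from `pairwise_embed` in this sketch, since the last `dist` positions have no partner.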

Non-Inherited Members