pyprobound.layers.conv1d.Conv1d

class Conv1d(psam, input_shape, min_input_length, max_input_length, train_posbias=False, bias_mode='channel', bias_bin=1, length_specific_bias=True, out_channel_indexing=None, one_hot=False, unfold=False, normalize=False)

Bases: Layer

1d convolution with a PSAM filter and position bias modeling.

Since the weight \(\beta_\phi\) of feature \(\phi\) is defined as \(-\Delta\Delta G_\phi/RT\), the output of the convolution is the \(-\log K^{rel}_{\text{D}}\) of each sliding window.

\[\log \frac{1}{K^{rel}_{\text{D}, a} (S_{i, x})} = \omega(x) + \sum_{\phi} \beta_\phi \mathbb{1}_\phi(S_{i, x})\]

where \(\mathbb{1}_\phi(S_{i, x})\) is the indicator function of when window \(x\) of sequence \(i\) contains feature \(\phi\).
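As a rough illustration of the formula above, the following sketch scores every sliding window of a short sequence using only mononucleotide features. The `beta` table, the `omega` values, and the helper `window_scores` are made up for this example and are not part of pyprobound.

```python
# Hypothetical mononucleotide weights beta[position][base] for a 2-bp motif.
beta = [
    {"A": 0.0, "C": -1.0, "G": -2.0, "T": -0.5},
    {"A": -1.5, "C": 0.0, "G": -0.5, "T": -2.0},
]


def window_scores(seq, beta, omega):
    """Return omega(x) plus the summed weights of matching features per window."""
    width = len(beta)
    scores = []
    for x in range(len(seq) - width + 1):
        window = seq[x : x + width]
        # The indicator function selects exactly one base per motif position.
        feature_sum = sum(beta[pos][base] for pos, base in enumerate(window))
        scores.append(omega[x] + feature_sum)
    return scores


omega = [0.0, 0.25, 0.0]  # assumed position bias, one value per window
scores = window_scores("ACGT", beta, omega)
```

Each entry of `scores` plays the role of \(-\log K^{rel}_{\text{D}}\) for one window under these toy weights.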

log_posbias

The bias \(\omega(x)\) for each output position and channel.

Type: Tensor

__init__(psam, input_shape, min_input_length, max_input_length, train_posbias=False, bias_mode='channel', bias_bin=1, length_specific_bias=True, out_channel_indexing=None, one_hot=False, unfold=False, normalize=False)

Initializes the 1d convolution layer.

Parameters:

psam (PSAM) – The specification of the 1d convolution layer.
input_shape (int) – The number of elements in an input sequence.
min_input_length (int) – The minimum number of finite elements in an input sequence.
max_input_length (int) – The maximum number of finite elements in an input sequence.
train_posbias (bool) – Whether to train a bias \(\omega(x)\) for each output position and channel.
bias_mode (Literal['channel', 'same', 'reverse']) – Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.
bias_bin (int) – Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).
length_specific_bias (bool) – Whether to train a separate bias parameter for each input length.
out_channel_indexing (Sequence[int] | None) – Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].
one_hot (bool) – Whether to use one-hot scoring instead of dense.
unfold (bool) – If one_hot is set, whether to score using unfold instead of conv1d.
normalize (bool) – Whether to mean-center log_posbias over all windows.
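The mean-centering described for `normalize` can be pictured with a minimal sketch, assuming a single channel and a single input length. The helper `mean_center` is hypothetical, and how the library treats padding or multiple channels is not shown here.

```python
def mean_center(log_posbias):
    """Subtract the mean so the biases over all windows average to zero."""
    mean = sum(log_posbias) / len(log_posbias)
    return [w - mean for w in log_posbias]


# Toy bias values, one per window (illustrative only).
centered = mean_center([0.5, 1.5, 1.0, -1.0])
```

Centering leaves the differences between windows unchanged and only removes the overall offset.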

Methods

`cache`(fun)	Decorator for a function to cache its output.
`check_length_consistency`()	Checks that input lengths of Binding components are consistent.
`components`()	Iterator of child components.
`forward`(seqs)	Calculates the log score of each window.
`freeze`()	Turns off gradient calculation for all parameters.
`from_psam`(psam, prev[, train_posbias, ...])	Creates a new instance from a PSAM and an input component.
`get_log_posbias`()	The bias \(\omega(x)\) for each output position and channel.
`in_len`(length[, mode])	Calculates the receptive field.
`lengths`(seqs)	Counts the number of finite elements in each sequence.
`max_embedding_size`()	The maximum number of bytes needed to encode a sequence.
`optim_procedure`([ancestry, current_order])	The sequential optimization procedure for all Binding components.
`out_len`(length[, mode])	Calculates the number of elements in the output length dimension.
`reload`(checkpoint)	Loads the model from a checkpoint file.
`reload_from_state_dict`(state_dict)	Loads the model from a state dict.
`save`(checkpoint[, flank_lengths])	Saves the model to a file with "state_dict" and "metadata" fields.
`score_dense`(seqs)	Calculates the log score of each window using indexing.
`score_onehot`(seqs)	Calculates the log score of each window using convolutions.
`unfreeze`([parameter])	Turns on gradient calculation for the specified parameter.
`update_binding_optim`(binding_optim)	Updates a BindingOptim with the specification's optimization steps.
`update_input_length`([left_shift, ...])	Updates input shapes, called by a child PSAM after its update.

Attributes

`bias_bin`	Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).
`bias_mode`	Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.
`in_channels`	The number of input channels.
`input_shape`	The number of elements in an input sequence.
`length_specific_bias`	Whether to train a separate bias parameter for each input length.
`max_input_length`	The maximum number of finite elements in an input sequence.
`min_input_length`	The minimum number of finite elements in an input sequence.
`out_channel_indexing`	Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].
`out_channels`	The number of output channels.
`unfreezable`	alias of `Literal`['all', 'posbias']

Non-Inherited Members

unfreezable: alias of Literal[‘all’, ‘posbias’]

classmethod from_psam(psam, prev, train_posbias=False, bias_mode='channel', bias_bin=1, length_specific_bias=True, out_channel_indexing=None, one_hot=False, unfold=False, normalize=False)

Creates a new instance from a PSAM and an input component.

Parameters:

psam (PSAM) – The specification of the 1d convolution layer.
prev (Union[Table[Any], Layer]) – If used as the first layer, the table that will be passed as an input; otherwise, the layer that precedes it.
train_posbias (bool) – Whether to train a bias \(\omega(x)\) for each output position and channel.
bias_mode (Literal['channel', 'same', 'reverse']) – Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.
bias_bin (int) – Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).
length_specific_bias (bool) – Whether to train a separate bias parameter for each input length.
out_channel_indexing (Sequence[int] | None) – Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].
one_hot (bool) – Whether to use one-hot scoring instead of dense.
unfold (bool) – If one_hot is set, whether to score using unfold instead of conv1d.
normalize (bool) – Whether to mean-center log_posbias over all windows.

Return type:

Self

property bias_mode: Literal['channel', 'same', 'reverse']: Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.
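One way to read the three `bias_mode` options is sketched below for a forward and a reverse output channel. The function `channel_biases` and the two-channel layout are assumptions made for illustration. The library's actual channel ordering may differ.

```python
def channel_biases(params, bias_mode):
    """Return [forward_bias, reverse_bias] from per-channel parameter lists."""
    if bias_mode == "channel":
        # Each output channel keeps its own independent bias.
        return [list(params[0]), list(params[1])]
    if bias_mode == "same":
        # Both channels share the first channel's bias.
        return [list(params[0]), list(params[0])]
    if bias_mode == "reverse":
        # The reverse channel sees the forward bias flipped along positions.
        return [list(params[0]), list(reversed(params[0]))]
    raise ValueError(f"unknown bias_mode: {bias_mode!r}")


params = [[0.1, 0.2, 0.3], [0.0, 0.0, 0.5]]  # toy values
by_channel = channel_biases(params, "channel")
shared = channel_biases(params, "same")
flipped = channel_biases(params, "reverse")
```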

property bias_bin: int: Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).
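The `bias_bin` constraint ties together consecutive positions in blocks of width `bias_bin`. A minimal sketch, assuming one free value per bin that is expanded to every output position (the helper `expand_binned_bias` is hypothetical):

```python
def expand_binned_bias(bin_values, bias_bin, out_length):
    """Give every position x the value of its bin, x // bias_bin."""
    return [bin_values[x // bias_bin] for x in range(out_length)]


# Three bins of width 2 cover five output positions.
omega = expand_binned_bias([0.1, -0.2, 0.3], bias_bin=2, out_length=5)
```

With `bias_bin=1` every position gets its own value, which matches the default.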

property length_specific_bias: bool: Whether to train a separate bias parameter for each input length.

property out_channel_indexing: list[int] | None: Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].
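The stated equivalence to `Conv1d(seqs)[:,out_channel_indexing]` amounts to selecting and reordering channels along the second axis. The sketch below imitates this with nested lists standing in for a tensor. The values are illustrative only.

```python
# Toy output of shape (minibatch=2, out_channels=3, out_length=2).
output = [
    [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1]],
    [[3.0, 3.1], [4.0, 4.1], [5.0, 5.1]],
]

out_channel_indexing = [2, 0]

# Equivalent of output[:, out_channel_indexing] for nested lists.
selected = [[sample[c] for c in out_channel_indexing] for sample in output]
```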

property out_channels: int: The number of output channels.

check_length_consistency()

Checks that input lengths of Binding components are consistent.

Raises: RuntimeError – There is an input mismatch between components.
Return type: None

unfreeze(parameter='all')

Turns on gradient calculation for the specified parameter.

Parameters: parameter (Literal['all', 'posbias']) – Parameter to be unfrozen, defaults to all parameters.
Return type: None

update_binding_optim(binding_optim)

Updates a BindingOptim with the specification’s optimization steps.

Parameters: binding_optim (BindingOptim) – The parent BindingOptim to be updated.
Return type: BindingOptim
Returns: The updated BindingOptim.

max_embedding_size()

The maximum number of bytes needed to encode a sequence.

Used for splitting calculations to avoid GPU limits on tensor sizes.

Return type: int
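To illustrate how a per-sequence byte count could drive batch splitting, here is a minimal sketch. The helper `split_batch` and the byte budget are hypothetical. The library's own splitting logic is not documented on this page.

```python
def split_batch(n_sequences, bytes_per_sequence, byte_budget):
    """Return (start, stop) index pairs whose chunks fit within byte_budget."""
    # Always process at least one sequence per chunk.
    per_chunk = max(1, byte_budget // bytes_per_sequence)
    return [
        (start, min(start + per_chunk, n_sequences))
        for start in range(0, n_sequences, per_chunk)
    ]


# Assumed numbers: 300 bytes per sequence and a 1000-byte budget.
chunks = split_batch(n_sequences=10, bytes_per_sequence=300, byte_budget=1000)
```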

update_input_length(left_shift=0, right_shift=0, min_len_shift=0, max_len_shift=0, new_min_len=None, new_max_len=None)

Updates input shapes, called by a child PSAM after its update.

Parameters:

left_shift (int) – The change in size on the left side of the sequence.
right_shift (int) – The change in size on the right side of the sequence.
min_len_shift (int) – The change in the number of short input lengths.
max_len_shift (int) – The change in the number of long input lengths.
new_min_len (int | None) – The new min_input_length.
new_max_len (int | None) – The new max_input_length.

Return type:

None

get_log_posbias()

The bias \(\omega(x)\) for each output position and channel.

Return type: Tensor
Returns: A tensor with the bias of each output position and channel of shape \((\text{input_lengths},\text{out_channels}, \text{out_length})\).
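The returned shape can be pictured with nested lists, assuming that the first axis holds one entry per input length from `min_input_length` to `max_input_length` and is indexed by `length - min_input_length`. That indexing convention and the helper `bias_for_length` are assumptions for illustration, not confirmed library behavior.

```python
min_input_length, max_input_length = 18, 20  # assumed example values
out_channels, out_length = 2, 5              # assumed example values

n_lengths = max_input_length - min_input_length + 1

# Zero-initialized stand-in for a (input_lengths, out_channels, out_length) tensor.
log_posbias = [
    [[0.0] * out_length for _ in range(out_channels)]
    for _ in range(n_lengths)
]


def bias_for_length(log_posbias, length, min_input_length):
    """Pick the bias slice that belongs to sequences of a given length."""
    return log_posbias[length - min_input_length]


shape = (len(log_posbias), len(log_posbias[0]), len(log_posbias[0][0]))
```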

score_onehot(seqs)

Calculates the log score of each window using convolutions.

Parameters: seqs (Tensor) – A sequence tensor of shape \((\text{minibatch},\text{length})\) or \((\text{minibatch},\text{in_channels},\text{in_length})\).
Return type: Tensor
Returns: A tensor with the log score of each window of shape \((\text{minibatch},\text{out_channels},\text{out_length})\).

score_dense(seqs)

Calculates the log score of each window using indexing.

Parameters: seqs (Tensor) – A sequence tensor of shape \((\text{minibatch},\text{length})\).
Return type: Tensor
Returns: A tensor with the log score of each window of shape \((\text{minibatch},\text{out_channels},\text{out_length})\).
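Scoring by indexing and scoring by convolution over a one-hot encoding should give the same window scores. The pure-Python sketch below shows this for a toy 2-bp mononucleotide motif. The weights and both helper functions are hypothetical stand-ins, not the library's implementation.

```python
ALPHABET = "ACGT"

# Hypothetical weights[position][base_index] for a 2-bp motif.
weights = [[0.0, -1.0, -2.0, -0.5], [-1.5, 0.0, -0.5, -2.0]]


def score_by_indexing(encoded, weights):
    """Look up the weight of each observed base directly (dense style)."""
    width = len(weights)
    return [
        sum(weights[pos][encoded[x + pos]] for pos in range(width))
        for x in range(len(encoded) - width + 1)
    ]


def score_by_onehot(encoded, weights):
    """Multiply weights with a one-hot encoding and sum (convolution style)."""
    n = len(ALPHABET)
    onehot = [[1.0 if i == base else 0.0 for i in range(n)] for base in encoded]
    width = len(weights)
    return [
        sum(
            weights[pos][i] * onehot[x + pos][i]
            for pos in range(width)
            for i in range(n)
        )
        for x in range(len(onehot) - width + 1)
    ]


encoded = [ALPHABET.index(base) for base in "ACGT"]
dense_scores = score_by_indexing(encoded, weights)
onehot_scores = score_by_onehot(encoded, weights)
```

The indexing variant touches one weight per motif position, while the one-hot variant multiplies through the whole alphabet.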

forward(seqs)

Calculates the log score of each window.

\[\log \frac{1}{K^{rel}_{\text{D}, a} (S_{i, x})} = \omega(x) + \sum_{\phi} \beta_\phi \mathbb{1}_\phi(S_{i, x})\]

Parameters: seqs (Tensor) – A sequence tensor of shape \((\text{minibatch},\text{length})\) or \((\text{minibatch},\text{in_channels},\text{in_length})\).
Return type: Tensor
Returns: A tensor with the log score of each window of shape \((\text{minibatch},\text{out_channels},\text{out_length})\).
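For intuition about the output shape, the sketch below assumes a plain sliding window with no padding or dilation, so that a kernel of size k over an input of length L yields L - k + 1 windows. This is an assumption for illustration. The value computed by `out_len` may differ.

```python
def valid_out_length(in_length, kernel_size):
    """Number of full windows of size kernel_size in a length-in_length input."""
    return max(in_length - kernel_size + 1, 0)


def output_shape(minibatch, out_channels, in_length, kernel_size):
    """Shape (minibatch, out_channels, out_length) under the assumption above."""
    return (minibatch, out_channels, valid_out_length(in_length, kernel_size))


# Assumed example: 8 sequences of length 20, a 6-bp kernel, 2 output channels.
shape = output_shape(minibatch=8, out_channels=2, in_length=20, kernel_size=6)
```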