pyprobound.layers.conv1d.Conv1d

class Conv1d(psam, input_shape, min_input_length, max_input_length, train_posbias=False, bias_mode='channel', bias_bin=1, length_specific_bias=True, out_channel_indexing=None, one_hot=False, unfold=False, normalize=False)

Bases: Layer

1d convolution with a PSAM filter and position bias modeling.

Since the weight \(\beta_\phi\) of feature \(\phi\) is defined as \(-\Delta\Delta G_\phi/RT\), the output of the convolution is the \(-\log K^{rel}_{\text{D}}\) of each sliding window.

\[\log \frac{1}{K^{rel}_{\text{D}, a} (S_{i, x})} = \omega(x) + \sum_{\phi} \beta_\phi \mathbb{1}_\phi(S_{i, x})\]

where \(\mathbb{1}_\phi(S_{i, x})\) is the indicator function that equals 1 when window \(x\) of sequence \(i\) contains feature \(\phi\) and 0 otherwise.
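
The sum above can be illustrated with a small pure-Python sketch. It is not the library's implementation: it assumes the only features are single letters at each filter position, and the names `window_log_scores`, `betas`, and `omega` are made up for this example.

```python
# Illustrative sketch only; not PyProBound's implementation.
# Assumption: every feature phi is a (filter position, letter) pair.

def window_log_scores(seq, betas, omega):
    """Return omega(x) + sum_phi beta_phi * 1_phi(S_x) for each window x.

    seq   -- string over the alphabet used by `betas`
    betas -- one dict per filter position, mapping letter -> beta
    omega -- position bias, one value per window
    """
    width = len(betas)
    n_windows = len(seq) - width + 1
    scores = []
    for x in range(n_windows):
        window = seq[x:x + width]
        # The indicator selects exactly one letter feature per position.
        total = sum(betas[k][letter] for k, letter in enumerate(window))
        scores.append(omega[x] + total)
    return scores


# Hypothetical weights for a filter of width 2.
betas = [
    {"A": 0.0, "C": -1.0, "G": -2.0, "T": -0.5},
    {"A": -1.5, "C": 0.0, "G": -0.5, "T": -2.0},
]
omega = [0.0, 0.0, 0.25]
scores = window_log_scores("ACGT", betas, omega)
```

Each entry of `scores` corresponds to one sliding window, so a sequence of length 4 scored with a width-2 filter yields three values.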

log_posbias

The bias \(\omega(x)\) for each output position and channel.

Type:

Tensor

__init__(psam, input_shape, min_input_length, max_input_length, train_posbias=False, bias_mode='channel', bias_bin=1, length_specific_bias=True, out_channel_indexing=None, one_hot=False, unfold=False, normalize=False)

Initializes the 1d convolution layer.

Parameters:
  • psam (PSAM) – The specification of the 1d convolution layer.

  • input_shape (int) – The number of elements in an input sequence.

  • min_input_length (int) – The minimum number of finite elements in an input sequence.

  • max_input_length (int) – The maximum number of finite elements in an input sequence.

  • train_posbias (bool) – Whether to train a bias \(\omega(x)\) for each output position and channel.

  • bias_mode (Literal['channel', 'same', 'reverse']) – Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.

  • bias_bin (int) – Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).

  • length_specific_bias (bool) – Whether to train a separate bias parameter for each input length.

  • out_channel_indexing (Sequence[int] | None) – Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].

  • one_hot (bool) – Whether to use one-hot scoring instead of dense.

  • unfold (bool) – If one_hot is set, whether to score using unfold rather than conv1d.

  • normalize (bool) – Whether to mean-center log_posbias over all windows.
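
The bias_bin constraint in the list above ties consecutive output positions to one shared value. A minimal sketch of that tying, assuming one stored value per bin and assuming a trailing partial bin simply reuses its bin's value (the helper `expand_binned_bias` is hypothetical, not part of the library):

```python
# Illustrative sketch only; the helper name and the handling of a
# trailing partial bin are assumptions.

def expand_binned_bias(bin_values, bias_bin, out_length):
    """Give every output position x the value stored for bin x // bias_bin."""
    return [bin_values[x // bias_bin] for x in range(out_length)]


# Three bins of width 2 cover five output positions.
omega = expand_binned_bias([0.5, -0.25, 0.0], bias_bin=2, out_length=5)
```

With bias_bin=1 every position keeps its own value, which matches the default in the signature.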

Methods

cache(fun)

Decorator for a function to cache its output.

check_length_consistency()

Checks that input lengths of Binding components are consistent.

components()

Iterator of child components.

forward(seqs)

Calculates the log score of each window.

freeze()

Turns off gradient calculation for all parameters.

from_psam(psam, prev[, train_posbias, ...])

Creates a new instance from a PSAM and an input component.

get_log_posbias()

The bias \(\omega(x)\) for each output position and channel.

in_len(length[, mode])

Calculates the receptive field.

lengths(seqs)

Counts the number of finite elements in each sequence.

max_embedding_size()

The maximum number of bytes needed to encode a sequence.

optim_procedure([ancestry, current_order])

The sequential optimization procedure for all Binding components.

out_len(length[, mode])

Calculates the number of elements in the output length dimension.

reload(checkpoint)

Loads the model from a checkpoint file.

reload_from_state_dict(state_dict)

Loads the model from a state dict.

save(checkpoint[, flank_lengths])

Saves the model to a file with "state_dict" and "metadata" fields.

score_dense(seqs)

Calculates the log score of each window using indexing.

score_onehot(seqs)

Calculates the log score of each window using convolutions.

unfreeze([parameter])

Turns on gradient calculation for the specified parameter.

update_binding_optim(binding_optim)

Updates a BindingOptim with the specification's optimization steps.

update_input_length([left_shift, ...])

Updates input shapes, called by a child PSAM after its update.

Attributes

bias_bin

Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).

bias_mode

Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.

in_channels

The number of input channels.

input_shape

The number of elements in an input sequence.

length_specific_bias

Whether to train a separate bias parameter for each input length.

max_input_length

The maximum number of finite elements in an input sequence.

min_input_length

The minimum number of finite elements in an input sequence.

out_channel_indexing

Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].

out_channels

The number of output channels.

unfreezable

alias of Literal['all', 'posbias']

Non-Inherited Members

unfreezable

alias of Literal['all', 'posbias']

classmethod from_psam(psam, prev, train_posbias=False, bias_mode='channel', bias_bin=1, length_specific_bias=True, out_channel_indexing=None, one_hot=False, unfold=False, normalize=False)

Creates a new instance from a PSAM and an input component.

Parameters:
  • psam (PSAM) – The specification of the 1d convolution layer.

  • prev (Union[Table[Any], Layer]) – If used as the first layer, the table that will be passed as an input; otherwise, the layer that precedes it.

  • train_posbias (bool) – Whether to train a bias \(\omega(x)\) for each output position and channel.

  • bias_mode (Literal['channel', 'same', 'reverse']) – Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.

  • bias_bin (int) – Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).

  • length_specific_bias (bool) – Whether to train a separate bias parameter for each input length.

  • out_channel_indexing (Sequence[int] | None) – Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].

  • one_hot (bool) – Whether to use one-hot scoring instead of dense.

  • unfold (bool) – If one_hot is set, whether to score using unfold rather than conv1d.

  • normalize (bool) – Whether to mean-center log_posbias over all windows.

Return type:

Self
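
The normalize option listed among the parameters mean-centers log_posbias over all windows. The sketch below shows mean-centering for a single bias vector; which tensor axes the library averages over is not stated here, so treating one channel and one input length in isolation is an assumption, and `mean_center` is a hypothetical helper.

```python
# Illustrative sketch only; assumes a single channel and input length.

def mean_center(log_posbias):
    """Subtract the mean so the biases sum to zero across windows."""
    mean = sum(log_posbias) / len(log_posbias)
    return [w - mean for w in log_posbias]


centered = mean_center([1.0, 2.0, 3.0, 6.0])
```

Centering removes a constant offset from the position bias without changing the differences between windows.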

property bias_mode: Literal['channel', 'same', 'reverse']

Whether to train a separate bias for each output channel, use the same bias across all output channels, or (if score_reverse) flip it for the reverse output channels.
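
A rough sketch of how the 'same' and 'reverse' modes could differ. It assumes that "flip" means reversing the bias along the position axis and that the caller knows which output channels are reverse channels; both are assumptions for illustration, and `broadcast_bias` is not a library function. The 'channel' mode is omitted because each channel then has its own independent values.

```python
# Illustrative sketch only. Assumptions: "flip" reverses the position
# axis, and `reverse_channels` lists the reverse output channels.

def broadcast_bias(shared_bias, n_channels, bias_mode, reverse_channels=()):
    """Build one bias row per output channel from a shared bias vector."""
    if bias_mode not in ("same", "reverse"):
        raise ValueError("this sketch only covers 'same' and 'reverse'")
    rows = []
    for c in range(n_channels):
        if bias_mode == "reverse" and c in reverse_channels:
            rows.append(shared_bias[::-1])  # flipped copy for reverse channels
        else:
            rows.append(list(shared_bias))  # unchanged copy
    return rows


same = broadcast_bias([0.1, 0.2, 0.3], 2, "same")
flipped = broadcast_bias([0.1, 0.2, 0.3], 2, "reverse", reverse_channels=(1,))
```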

property bias_bin: int

Applies the constraint \(\omega(x_{i\times\text{bias_bin}}) = \cdots = \omega(x_{(i+1)\times\text{bias_bin}-1})\).

property length_specific_bias: bool

Whether to train a separate bias parameter for each input length.

property out_channel_indexing: list[int] | None

Output channel indexing, equivalent to Conv1d(seqs)[:,out_channel_indexing].
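
The indexing expression Conv1d(seqs)[:,out_channel_indexing] selects and reorders output channels. The sketch below mimics that selection on nested lists shaped (minibatch, out_channels, out_length); `select_channels` is a hypothetical helper, and the library operates on tensors rather than lists.

```python
# Illustrative sketch only; nested lists stand in for a tensor of shape
# (minibatch, out_channels, out_length).

def select_channels(scores, out_channel_indexing):
    """Keep the listed channels, in the listed order, for every sequence."""
    return [
        [seq_scores[c] for c in out_channel_indexing]
        for seq_scores in scores
    ]


scores = [
    [[0.1, 0.2], [1.1, 1.2], [2.1, 2.2]],
    [[3.1, 3.2], [4.1, 4.2], [5.1, 5.2]],
]
selected = select_channels(scores, [2, 0])
```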

property out_channels: int

The number of output channels.

check_length_consistency()

Checks that input lengths of Binding components are consistent.

Raises:

RuntimeError – There is an input mismatch between components.

Return type:

None

unfreeze(parameter='all')

Turns on gradient calculation for the specified parameter.

Parameters:

parameter (Literal['all', 'posbias']) – Parameter to be unfrozen, defaults to all parameters.

Return type:

None

update_binding_optim(binding_optim)

Updates a BindingOptim with the specification's optimization steps.

Parameters:

binding_optim (BindingOptim) – The parent BindingOptim to be updated.

Return type:

BindingOptim

Returns:

The updated BindingOptim.

max_embedding_size()

The maximum number of bytes needed to encode a sequence.

Used for splitting calculations to avoid GPU limits on tensor sizes.

Return type:

int
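
One way such a byte count might be used to split a calculation is sketched below. The chunking rule, the helper `split_batch`, and the byte budget are all assumptions for illustration; the library's own splitting logic is not shown in this page.

```python
# Illustrative sketch only; `split_batch` and its arguments are hypothetical.

def split_batch(n_seqs, bytes_per_seq, max_bytes):
    """Return (start, stop) ranges whose encoded size stays within max_bytes."""
    # Always process at least one sequence per chunk.
    per_chunk = max(1, max_bytes // bytes_per_seq)
    return [
        (start, min(start + per_chunk, n_seqs))
        for start in range(0, n_seqs, per_chunk)
    ]


chunks = split_batch(n_seqs=10, bytes_per_seq=300, max_bytes=1000)
```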

update_input_length(left_shift=0, right_shift=0, min_len_shift=0, max_len_shift=0, new_min_len=None, new_max_len=None)

Updates input shapes, called by a child PSAM after its update.

Parameters:
  • left_shift (int) – The change in size on the left side of the sequence.

  • right_shift (int) – The change in size on the right side of the sequence.

  • min_len_shift (int) – The change in the number of short input lengths.

  • max_len_shift (int) – The change in the number of long input lengths.

  • new_min_len (int | None) – The new min_input_length.

  • new_max_len (int | None) – The new max_input_length.

Return type:

None

get_log_posbias()

The bias \(\omega(x)\) for each output position and channel.

Return type:

Tensor

Returns:

A tensor with the bias of each output position and channel of shape \((\text{input_lengths},\text{out_channels}, \text{out_length})\).

score_onehot(seqs)

Calculates the log score of each window using convolutions.

Parameters:

seqs (Tensor) – A sequence tensor of shape \((\text{minibatch},\text{length})\) or \((\text{minibatch},\text{in_channels},\text{in_length})\).

Return type:

Tensor

Returns:

A tensor with the log score of each window of shape \((\text{minibatch},\text{out_channels},\text{out_length})\).

score_dense(seqs)

Calculates the log score of each window using indexing.

Parameters:

seqs (Tensor) – A sequence tensor of shape \((\text{minibatch},\text{length})\).

Return type:

Tensor

Returns:

A tensor with the log score of each window of shape \((\text{minibatch},\text{out_channels},\text{out_length})\).
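
Scoring by indexing and scoring by convolution should give the same window scores. The pure-Python sketch below demonstrates that equivalence for a single sequence with single-letter features. It is a simplification: the function names are hypothetical, the one-hot layout here is (length, letters) rather than the (minibatch, in_channels, in_length) layout described above, and the position bias is left out.

```python
# Illustrative sketch only; not PyProBound's implementation.
ALPHABET = "ACGT"


def score_by_indexing(seq_idx, weights):
    """Dense scoring: look up weights[k][letter index] at each position."""
    width = len(weights)
    return [
        sum(weights[k][seq_idx[x + k]] for k in range(width))
        for x in range(len(seq_idx) - width + 1)
    ]


def one_hot(seq_idx, n_letters):
    """Encode each letter index as a 0/1 vector of length n_letters."""
    return [[1.0 if a == i else 0.0 for a in range(n_letters)] for i in seq_idx]


def score_by_convolution(seq_onehot, weights):
    """One-hot scoring: multiply-and-sum the filter against each window."""
    width = len(weights)
    n_letters = len(weights[0])
    return [
        sum(
            weights[k][a] * seq_onehot[x + k][a]
            for k in range(width)
            for a in range(n_letters)
        )
        for x in range(len(seq_onehot) - width + 1)
    ]


# Hypothetical filter of width 2; columns follow ALPHABET order.
weights = [[0.0, -1.0, -2.0, -0.5], [-1.5, 0.0, -0.5, -2.0]]
seq_idx = [ALPHABET.index(c) for c in "ACGT"]
dense = score_by_indexing(seq_idx, weights)
conv = score_by_convolution(one_hot(seq_idx, len(ALPHABET)), weights)
```

The indexing version touches one weight per filter position, while the convolution version multiplies by the zeros of the one-hot encoding as well, which is the trade-off behind offering both paths.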

forward(seqs)

Calculates the log score of each window.

\[\log \frac{1}{K^{rel}_{\text{D}, a} (S_{i, x})} = \omega(x) + \sum_{\phi} \beta_\phi \mathbb{1}_\phi(S_{i, x})\]

Parameters:

seqs (Tensor) – A sequence tensor of shape \((\text{minibatch},\text{length})\) or \((\text{minibatch},\text{in_channels},\text{in_length})\).

Return type:

Tensor

Returns:

A tensor with the log score of each window of shape \((\text{minibatch},\text{out_channels},\text{out_length})\).