pyprobound.optimizer.Optimizer

class Optimizer(model, train_tables, val_tables=None, epochs=200, patience=10, greedy_threshold=0.0002, batch_size=None, checkpoint='model.pt', output=sys.stdout, device=None, sampler=None, optimizer=torch.optim.LBFGS, optim_args=None, sampler_args=None)

Bases: Generic[T]

Optimizer of a BaseLoss.

__init__(model, train_tables, val_tables=None, epochs=200, patience=10, greedy_threshold=0.0002, batch_size=None, checkpoint='model.pt', output=sys.stdout, device=None, sampler=None, optimizer=torch.optim.LBFGS, optim_args=None, sampler_args=None)

Initializes the optimizer.

Parameters:

model (BaseLoss[TypeVar(T, bound= Batch)]) – The loss module to be optimized.
train_tables (Sequence[Table[TypeVar(T, bound= Batch)]]) – The tables to be trained against.
val_tables (Sequence[Table[TypeVar(T, bound= Batch)]] | None) – The tables to be validated against for early stopping.
epochs (int) – The maximum number of epochs taken until convergence.
patience (int) – The maximum number of epochs taken without improvement of the train loss (or validation loss if available).
greedy_threshold (float) – The change in loss necessary to accept a Step with attribute greedy set to True.
batch_size (int | None) – The number of sequences used to optimize the model at a time.
checkpoint (str | PathLike[str]) – The file where the model will be checkpointed to.
output (str | PathLike[str] | TextIOBase) – The file where the optimization output will be written to.
device (str | None) – The device on which to perform optimization.
sampler (type[Sampler[TypeVar(T, bound= Batch)]] | None) – The sampler used when creating the dataloader.
optimizer (type[Optimizer]) – The optimizer class used to update the parameters (LBFGS by default).
optim_args (MutableMapping[str, Any] | None) – Parameters passed to the optimizer. (Defaults to {"line_search_fn": "strong_wolfe"} if available).
sampler_args (MutableMapping[str, Any] | None) – Parameters passed to the sampler.
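
The interaction between epochs and patience can be illustrated with a minimal early-stopping sketch. This is not the library's implementation: `train_with_patience` and `run_epoch` are hypothetical names, and the exact stopping rule used by Optimizer may differ.

```python
import math
from typing import Callable, List


def train_with_patience(
    run_epoch: Callable[[int], float],
    epochs: int = 200,
    patience: int = 10,
) -> float:
    """Run at most `epochs` epochs; stop after `patience` epochs without improvement.

    `run_epoch` is a hypothetical callback returning the monitored loss
    (validation loss if available, otherwise train loss).
    """
    best_loss = math.inf
    epochs_without_improvement = 0
    for epoch in range(epochs):
        loss = run_epoch(epoch)
        if loss < best_loss:
            # Improvement: record it and reset the patience counter.
            best_loss = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break
    return best_loss


# Toy loss curve: improves for three epochs, then plateaus.
losses = [1.0, 0.5, 0.4, 0.4, 0.4, 0.4, 0.4]
calls: List[int] = []


def fake_epoch(epoch: int) -> float:
    calls.append(epoch)
    return losses[min(epoch, len(losses) - 1)]


best = train_with_patience(fake_epoch, epochs=200, patience=3)
```

With this toy curve the loop stops well before the epoch limit, because the plateau exhausts the patience budget first.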

Methods

`check_length_consistency`()	Checks that input lengths of Binding components are consistent.
`get_parameter_string`()	A prettified representation of the parameters of the model.
`get_setup_string`()	A description of the current model and tables.
`get_train_sequential`([order])	A description of the sequential optimization steps to be taken.
`greedy_search`(step[, best_loss])	Run an optimization step repeatedly until the loss stops improving.
`print`(*objects[, sep, mode, end])	Prints to self.output.
`reload`([checkpoint])	Loads the model from a checkpoint file.
`run_one_epoch`(train, dataloader)	Runs one training epoch.
`run_step`(step)	Run all calls specified in a step.
`save`(checkpoint)	Saves the model to a file with "state_dict", "metadata" fields.
`train_sequential`([maintain_loss, order])	Train parameters according to the sequential optimization procedure.
`train_simultaneous`([best_loss])	Train all parameters in a model simultaneously.
`train_until_convergence`([checkpoint, best_loss])	Repeat optimization steps until the loss stops improving.
`update_read_length`(calls)	Combines all update_read_length calls across count tables.

Non-Inherited Members

get_parameter_string()

A prettified representation of the parameters of the model.

Return type: str

get_setup_string()

A description of the current model and tables.

Return type: str

get_train_sequential(order=None)

A description of the sequential optimization steps to be taken.

Parameters: order (Iterable[Iterable[tuple[Spec, ...]]] | None) – An iterable encoding the training order, where each element is an iterable of binding keys to be trained simultaneously.
Return type: str

print(*objects, sep=' ', mode='at', end='\n')

Prints to self.output.

Return type: None
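
Since output accepts either a path or a text stream, a method like this has to handle both cases. The sketch below shows one way to do that under stated assumptions: `write_output` is a hypothetical helper, not part of pyprobound, and it assumes that mode is only used when output is a path.

```python
import io
import os
import tempfile
from typing import Any, Union

Output = Union[str, os.PathLike, io.TextIOBase]


def write_output(output: Output, *objects: Any, sep: str = " ",
                 mode: str = "at", end: str = "\n") -> None:
    """Write objects to a text stream, or to a path opened with `mode`."""
    if isinstance(output, io.TextIOBase):
        # Already an open text stream (e.g. sys.stdout or io.StringIO).
        print(*objects, sep=sep, end=end, file=output)
    else:
        # A path: open it for the duration of the write ('at' appends text).
        with open(output, mode, encoding="utf-8") as handle:
            print(*objects, sep=sep, end=end, file=handle)


# Usage with an in-memory stream.
buffer = io.StringIO()
write_output(buffer, "epoch", 1, "loss", 0.5)

# Usage with a path: two calls append to the same file.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "train.log")
    write_output(log_path, "first")
    write_output(log_path, "second")
    with open(log_path, encoding="utf-8") as handle:
        logged = handle.read()
```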

save(checkpoint)

Saves the model to a file with "state_dict", "metadata" fields.

Parameters: checkpoint (Union[str, PathLike, BinaryIO, IO[bytes]]) – The file where the model will be checkpointed to.
Return type: None

reload(checkpoint=None)

Loads the model from a checkpoint file.

Parameters: checkpoint (Union[str, PathLike, BinaryIO, IO[bytes], None]) – The file where the model state_dict was written to.
Return type: dict[str, Any]
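
The two-field checkpoint layout written by save and read by reload can be sketched with the standard library. This is only an illustration of the layout: it substitutes `pickle` for whatever serialization the library actually uses, and `save_checkpoint` and `load_checkpoint` are hypothetical helpers.

```python
import io
import pickle
from typing import Any, BinaryIO, Dict


def save_checkpoint(handle: BinaryIO, state_dict: Dict[str, Any],
                    metadata: Dict[str, Any]) -> None:
    """Write a checkpoint with "state_dict" and "metadata" fields."""
    pickle.dump({"state_dict": state_dict, "metadata": metadata}, handle)


def load_checkpoint(handle: BinaryIO) -> Dict[str, Any]:
    """Read a checkpoint and verify that both expected fields are present."""
    checkpoint = pickle.load(handle)
    missing = {"state_dict", "metadata"} - set(checkpoint)
    if missing:
        raise KeyError(f"checkpoint is missing fields: {sorted(missing)}")
    return checkpoint


# Round trip through an in-memory binary buffer (placeholder values).
buffer = io.BytesIO()
save_checkpoint(buffer, {"weight": [0.1, 0.2]}, {"epoch": 7})
buffer.seek(0)
restored = load_checkpoint(buffer)
```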

check_length_consistency()

Checks that input lengths of Binding components are consistent.

Raises: RuntimeError – There is an input mismatch between components.
Return type: None

run_one_epoch(train, dataloader)

Runs one training epoch.

Parameters:

train (bool) – Whether to take an optimization step before returning Loss.
dataloader (MultitaskLoader[TypeVar(T, bound= Batch)]) – A multi-dataset loader.

Return type:

Loss

Returns:

The loss calculated from the provided dataloader.

train_until_convergence(checkpoint=None, best_loss=tensor(inf))

Repeat optimization steps until the loss stops improving.

Parameters:

checkpoint (Union[str, PathLike, BinaryIO, IO[bytes], None]) – The file where the model will be checkpointed to.
best_loss (Tensor) – The previous loss used to determine if loss improved.

Return type:

Tensor

Returns:

The best loss calculated during the optimization procedure.

train_simultaneous(best_loss=tensor(inf))

Train all parameters in a model simultaneously.

Parameters: best_loss (Tensor) – The previous loss used to determine if loss improved.
Return type: Tensor
Returns: The best loss calculated during the optimization procedure.

run_step(step)

Run all calls specified in a step.

Groups all update_read_length calls to be dispatched separately.

Return type: None

greedy_search(step, best_loss=tensor(inf))

Run an optimization step repeatedly until the loss stops improving.

Parameters:

step (Step) – The step to be repeated until the loss stops improving.
best_loss (Tensor) – The previous loss used to determine if loss improved.

Return type:

Tensor

Returns:

The best loss calculated during the optimization procedure.
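
The idea of repeating a step until the loss stops improving can be sketched as follows. This is a simplified illustration, not the library's code: `greedy_repeat`, `apply_step`, and `undo_step` are hypothetical, and it assumes greedy_threshold is compared against the absolute decrease in loss.

```python
import math
from typing import Callable


def greedy_repeat(
    apply_step: Callable[[], float],
    undo_step: Callable[[], None],
    best_loss: float = math.inf,
    greedy_threshold: float = 0.0002,
) -> float:
    """Repeat a step while each repetition lowers the loss by more than the threshold."""
    while True:
        loss = apply_step()
        if best_loss - loss > greedy_threshold:
            # Accept the step and try it again.
            best_loss = loss
        else:
            # Reject the last repetition and stop.
            undo_step()
            return best_loss


# Toy example: the third repetition improves the loss by only about 0.0001.
history = iter([1.0, 0.9, 0.8999])
state = {"applied": 0}


def apply_step() -> float:
    state["applied"] += 1
    return next(history)


def undo_step() -> None:
    state["applied"] -= 1


result = greedy_repeat(apply_step, undo_step)
```

In this toy run the first two repetitions are kept and the third is rolled back, so the returned loss is the one from the second repetition.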

update_read_length(calls)

Combines all update_read_length calls across count tables.

Return type: None

train_sequential(maintain_loss=True, order=None)

Train parameters according to the sequential optimization procedure.

Parameters:

maintain_loss (bool) – Whether to keep the best_loss from one step to the next when determining if the loss has improved.
order (Iterable[Iterable[tuple[Spec, ...]]] | None) – An iterable encoding the training order, where each element is an iterable of binding keys to be trained simultaneously.

Return type:

Tensor

Returns:

The best loss calculated during the optimization procedure.
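
The effect of maintain_loss on a nested training order can be sketched as below. This is one possible reading of the parameter, not the library's implementation: `run_sequential` and `train_group` are hypothetical, and plain strings stand in for the binding keys.

```python
import math
from typing import Callable, Iterable, List, Sequence


def run_sequential(
    order: Iterable[Sequence[str]],
    train_group: Callable[[Sequence[str], float], float],
    maintain_loss: bool = True,
) -> float:
    """Train each group of keys in turn, optionally carrying best_loss forward."""
    best_loss = math.inf
    for group in order:
        # With maintain_loss, each step must beat the previous step's best loss;
        # without it, each step starts from an infinite baseline.
        baseline = best_loss if maintain_loss else math.inf
        best_loss = train_group(group, baseline)
    return best_loss


# Toy trainer: each key has a fixed achievable loss (placeholder values).
group_losses = {"a": 0.8, "b": 0.6, "c": 0.5}
baselines: List[float] = []


def train_group(group: Sequence[str], baseline: float) -> float:
    baselines.append(baseline)
    candidate = min(group_losses[key] for key in group)
    return min(candidate, baseline)


# Two steps: train "a" alone, then "b" and "c" simultaneously.
order = [("a",), ("b", "c")]
final = run_sequential(order, train_group, maintain_loss=True)
```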