torchtraining.pytorch
This module provides standard PyTorch operations (like backward) in a functional manner.
Note
IMPORTANT: This module is used almost all the time so be sure to understand how it works.
It lets users compose a single step, for both training and evaluation, from operations such as PyTorch's optimizer step, backward pass, or gradient zeroing. For example:
class Step(tt.steps.Step):
    def forward(self, module, sample):
        # Your forward step here
        ...
        return loss, predictions


training = (
    Step(criterion, gradient=True, device=device)
    ** tt.Select(loss=0)
    ** tt.pytorch.ZeroGrad(network)
    ** tt.pytorch.Backward()
    ** tt.pytorch.Optimize(optimizer)
    ** tt.pytorch.Detach()
)

evaluation = (
    Step(criterion, gradient=False, device=device)
    ** tt.Select(predictions=1)
    ** tt.callbacks.Log(writer, "Predicted")
)
Some other operations are also simplified (e.g. gradient accumulation); see torchtraining.pytorch.Optimize.
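For orientation, the training pipeline above corresponds roughly to the plain PyTorch step below. This is a minimal sketch rather than torchtraining's implementation: the toy model, the random data, and the `train_step` helper are assumptions made purely for illustration.

```python
import torch

torch.manual_seed(0)

# Hypothetical toy setup; none of these names come from torchtraining.
network = torch.nn.Linear(4, 1)
optimizer = torch.optim.SGD(network.parameters(), lr=0.01)
criterion = torch.nn.MSELoss()


def train_step(inputs, targets):
    """One training step written out by hand."""
    predictions = network(inputs)           # forward step
    loss = criterion(predictions, targets)
    network.zero_grad()                     # roughly tt.pytorch.ZeroGrad(network)
    loss.backward()                         # roughly tt.pytorch.Backward()
    optimizer.step()                        # roughly tt.pytorch.Optimize(optimizer)
    return loss.detach()                    # roughly tt.pytorch.Detach()


inputs, targets = torch.randn(8, 4), torch.randn(8, 1)
first = train_step(inputs, targets)
second = train_step(inputs, targets)
```

The pipeline form expresses the same sequence as composable operations, so individual pieces (for example the optimizer step) can be swapped or dropped for evaluation.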
class torchtraining.pytorch.Backward(scaler=None, accumulate: int = 1, gradient: torch.Tensor = None)
Run backpropagation on output tensor.
- Parameters
    scaler (torch.cuda.amp.GradScaler, optional) – Gradient scaler used for automatic mixed precision mode.
    accumulate (int, optional) – Divide loss by accumulate if gradient accumulation is used. This approach averages gradients from multiple batches. Default: 1 (no accumulation)
    gradient (torch.Tensor, optional) – Tensor used as the initial value for backpropagation. If unspecified, torch.tensor([1.0]) is used as the default value (just like a tensor.backward() call).
- Returns
    Tensor after backward (possibly scaled by accumulate)
- Return type
    torch.Tensor

forward(data)
- Parameters
    data (torch.Tensor) – Tensor on which backward will be run (possibly accumulated). Usually the loss value.
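The effect of dividing the loss by accumulate can be checked in plain PyTorch. The sketch below is not torchtraining code; `loss_fn`, `weight`, and the random batches are hypothetical stand-ins. It shows that summing the gradients of losses scaled by 1 / accumulate gives the same result as the gradient of the mean loss over all batches.

```python
import torch

torch.manual_seed(0)

weight = torch.randn(3, requires_grad=True)
batches = [torch.randn(5, 3) for _ in range(4)]
accumulate = len(batches)


def loss_fn(batch):
    # Hypothetical scalar loss used only for this demonstration.
    return (batch @ weight).pow(2).mean()


# Gradient accumulation: scale each loss by 1 / accumulate before backward.
# Successive backward calls add into weight.grad.
for batch in batches:
    (loss_fn(batch) / accumulate).backward()
accumulated = weight.grad.clone()

# Reference: gradient of the mean loss over all batches at once.
weight.grad = None
torch.stack([loss_fn(batch) for batch in batches]).mean().backward()
reference = weight.grad.clone()

matches = torch.allclose(accumulated, reference, atol=1e-6)
```

Without the division, the accumulated gradient would be accumulate times larger, which effectively multiplies the learning rate.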
class torchtraining.pytorch.Detach
Returns a new Tensor, detached from the current graph.
Note
IMPORTANT: This operation should be used before accumulating values after iteration in order not to grow the backpropagation graph.
- Returns
    Detached tensor
- Return type
    torch.Tensor
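The note above about graph growth can be illustrated with plain `torch.Tensor.detach()`. This is a small sketch with hypothetical variable names, not torchtraining code: a running total built from attached losses keeps a reference to every step's graph, while a total built from detached losses stores values only.

```python
import torch

weight = torch.ones(2, requires_grad=True)

tracked_total = torch.zeros(())
detached_total = torch.zeros(())
for _ in range(3):
    loss = (weight * 2).sum()
    # Adding the attached loss chains each step's graph onto the total.
    tracked_total = tracked_total + loss
    # Adding the detached loss keeps only the numeric value.
    detached_total = detached_total + loss.detach()

still_tracked = tracked_total.grad_fn is not None
graph_free = detached_total.grad_fn is None
```

Both totals hold the same number; only the detached one lets earlier graphs be freed.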
class torchtraining.pytorch.Optimize(optimizer, accumulate: int = 1, closure=None, scaler=None, *args, **kwargs)
Perform optimization step on parameters stored by optimizer.
Currently, closure and scaler are mutually exclusive.
- Parameters
    optimizer (torch.optim.Optimizer) – Instance of an optimizer-like object with an interface aligned with torch.optim.Optimizer.
    accumulate (int, optional) – Divide loss by accumulate if gradient accumulation is used. This approach averages gradients from multiple batches. Default: 1 (no accumulation)
    closure (Callable, optional) – A closure that reevaluates the model and returns the loss. Optional for most optimizers. Default: None
    scaler (torch.cuda.amp.GradScaler, optional) – Gradient scaler used for automatic mixed precision mode. Default: None
    *args – Arguments passed to either scaler.step (if specified) or optimizer.step
    **kwargs – Keyword arguments passed to either scaler.step (if specified) or optimizer.step
- Returns
    Anything passed to forward.
- Return type
    Any
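To make the closure parameter concrete, here is how a closure is used with a bare `torch.optim.Optimizer`. This is a sketch in plain PyTorch with a hypothetical model and data, and it does not show how Optimize forwards the closure internally.

```python
import torch

torch.manual_seed(0)

model = torch.nn.Linear(2, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
inputs, targets = torch.randn(16, 2), torch.randn(16, 1)


def closure():
    # Re-evaluate the model and return the loss, as optimizer.step expects.
    optimizer.zero_grad()
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
    loss.backward()
    return loss


before = model.weight.detach().clone()
loss = optimizer.step(closure)  # returns whatever the closure returned
after = model.weight.detach().clone()
```

Optimizers such as `torch.optim.LBFGS` require a closure because they evaluate the loss several times per step; for most others it is optional.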
class torchtraining.pytorch.Schedule(scheduler, use_data: bool = False)
Run a single step of the given scheduler.
Usually placed after each step or iteration (depending on the provided scheduler instance).
- Parameters
    scheduler (torch.optim.lr_scheduler._LRScheduler) – Instance of a scheduler-like object with an interface aligned with the torch.optim.lr_scheduler._LRScheduler base class.
    use_data (bool) – Whether input data should be used when stepping scheduler.
- Returns
    Value passed to function initially
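The two stepping styles that use_data distinguishes can be seen with PyTorch's own schedulers. The sketch below uses plain PyTorch only; that use_data=True amounts to passing the incoming value to `scheduler.step` is an assumption based on the parameter description, and the metric values are made up.

```python
import torch

param = torch.nn.Parameter(torch.zeros(1))

# Scheduler stepped without data: the learning rate follows a fixed schedule.
optimizer = torch.optim.SGD([param], lr=0.1)
step_lr = torch.optim.lr_scheduler.StepLR(optimizer, step_size=1, gamma=0.5)
optimizer.step()  # step the optimizer first to avoid PyTorch's ordering warning
step_lr.step()
fixed_lr = optimizer.param_groups[0]["lr"]

# Scheduler stepped with data: the metric decides whether the rate drops.
optimizer = torch.optim.SGD([param], lr=0.1)
plateau = torch.optim.lr_scheduler.ReduceLROnPlateau(
    optimizer, mode="min", factor=0.1, patience=0
)
for validation_loss in (1.0, 2.0):  # the second value is worse than the first
    plateau.step(validation_loss)
plateau_lr = optimizer.param_groups[0]["lr"]
```

Here StepLR halves the rate after one step, while ReduceLROnPlateau cuts it by a factor of ten once the supplied metric stops improving.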
class torchtraining.pytorch.UpdateGradScaler(scaler)
Update gradient scaler used with automatic mixed precision.
- Parameters
    scaler (torch.cuda.amp.GradScaler) – Gradient scaler used for automatic mixed precision mode.
- Returns
    Anything passed to forward.
- Return type
    Any
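For context, the scaler handled by Backward, Optimize, and UpdateGradScaler follows PyTorch's usual scale, step, update sequence. The sketch below is plain PyTorch with a hypothetical model and data, not torchtraining code; scaling is enabled only when CUDA is available, so on CPU every scaler call simply passes through.

```python
import torch

torch.manual_seed(0)

use_amp = torch.cuda.is_available()  # the scaler is a pass-through when disabled
device = torch.device("cuda" if use_amp else "cpu")

model = torch.nn.Linear(4, 1).to(device)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
scaler = torch.cuda.amp.GradScaler(enabled=use_amp)
inputs = torch.randn(8, 4, device=device)
targets = torch.randn(8, 1, device=device)

optimizer.zero_grad()
with torch.autocast(device_type=device.type, enabled=use_amp):
    loss = torch.nn.functional.mse_loss(model(inputs), targets)
scaler.scale(loss).backward()  # backward on the (possibly) scaled loss
scaler.step(optimizer)         # unscales gradients, then steps if they are finite
scaler.update()                # adjusts the scale factor for the next iteration
```

The `update()` call is the part that UpdateGradScaler would be expected to cover, once per optimizer step.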
class torchtraining.pytorch.ZeroGrad(obj, accumulate: int = 1)
Zero model or optimizer gradients.
Function zero_grad() will be run on the provided object. Usually called after every step (or after multiple steps, see the accumulate argument).
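Why gradients need zeroing at all is easy to see in plain PyTorch: each backward call adds into `.grad` instead of overwriting it. A minimal sketch with hypothetical names, not torchtraining code:

```python
import torch

weight = torch.ones(3, requires_grad=True)
optimizer = torch.optim.SGD([weight], lr=0.1)

(weight * 2).sum().backward()
(weight * 2).sum().backward()
doubled = weight.grad.clone()  # gradients from both calls were summed

optimizer.zero_grad()
# Depending on the PyTorch version, gradients are now None or all zeros.
cleared = weight.grad is None or bool((weight.grad == 0).all())
```

Deferring the zeroing for several steps is what makes gradient accumulation possible, which is the case the accumulate argument addresses.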