torchtraining.loss
-
class torchtraining.loss.BinaryFocal(gamma: float, weight=None, pos_weight=None, reduction: Callable[torch.Tensor, torch.Tensor] = None)
Binary focal loss working with raw network outputs (logits).
See original research paper: Focal Loss for Dense Object Detection
Down-weights the loss of easy examples while leaving the loss of harder examples mostly intact (they are dampened far less).
The higher the gamma parameter, the greater the “focusing” effect.
- Parameters
gamma (float) – Scale of the focal loss effect. Set it to 0.0 to obtain binary cross-entropy. The range 0.5 - 2.5 was used in the original research paper and seemed robust.
weight (Tensor, optional) – Manual rescaling weight; if provided, it is repeated to match the input tensor shape.
pos_weight (Tensor, optional) – Weight of positive examples. Must be a vector with length equal to the number of classes. In general, pos_weight should be decreased slightly as gamma is increased (for gamma=2, pos_weight=0.25 was found to work best in the original paper).
reduction (typing.Callable(torch.Tensor) -> torch.Tensor, optional) – Specifies the reduction to apply to the output. For no reduction, pass lambda loss: loss. For summation, pass torch.sum. By default, lambda loss: loss.sum() / loss.shape[0] is used (mean across examples).
-
forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor
- Parameters
outputs (torch.Tensor) – (N, *) where * means any number of additional dimensions. Usually of shape (N, H, W), where H is image height and W is its width.
targets (torch.Tensor) – (N, *), same shape as the input.
- Returns
If reduction is not specified, the mean across samples is taken. Otherwise, whatever shape reduction returns.
- Return type
torch.Tensor
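The focusing effect of gamma can be checked by hand. The following is a minimal plain-Python sketch of the per-example formula from the focal loss paper, FL = -(1 - p_t)^gamma * log(p_t); it is not the library's implementation, and the helper names (sigmoid, binary_focal, mean_over_examples) are hypothetical. It ignores weight and pos_weight.

```python
import math


def sigmoid(x: float) -> float:
    """Map a raw logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def binary_focal(logit: float, target: int, gamma: float) -> float:
    """Focal loss for a single logit/target pair (formula from the paper)."""
    p = sigmoid(logit)
    p_t = p if target == 1 else 1.0 - p  # probability given to the true class
    return -((1.0 - p_t) ** gamma) * math.log(p_t)


def mean_over_examples(losses: list) -> float:
    """Plain-Python analogue of lambda loss: loss.sum() / loss.shape[0]."""
    return sum(losses) / len(losses)


logits = [4.0, -0.5]  # first example is easy (confident and correct), second is hard
targets = [1, 1]

# gamma=0.0 removes the focusing factor, leaving binary cross-entropy.
bce = [binary_focal(x, t, gamma=0.0) for x, t in zip(logits, targets)]
focal = [binary_focal(x, t, gamma=2.0) for x, t in zip(logits, targets)]

# Fraction of the cross-entropy loss that survives focusing, per example.
kept = [f / b for f, b in zip(focal, bce)]
print(kept)
print(mean_over_examples(focal))
```

With gamma=2.0 the easy example keeps well under a thousandth of its cross-entropy loss, while the hard example keeps more than a third, which is the behaviour described above.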
-
class torchtraining.loss.MulticlassFocal(gamma: float, weight=None, ignore_index: int = -100, reduction: Callable[torch.Tensor, torch.Tensor] = None)
Multiclass focal loss working with raw network outputs (logits).
See original research paper: Focal Loss for Dense Object Detection
Down-weights the loss of easy examples while leaving the loss of harder examples mostly intact (they are dampened far less).
The higher the gamma parameter, the greater the “focusing” effect.
- Parameters
gamma (float) – Scale of the focal loss effect. Set it to 0.0 to obtain standard cross-entropy. The range 0.5 - 2.5 was used in the original research paper and seemed robust.
weight (Tensor, optional) – Manual rescaling weight; if provided, it is repeated to match the input tensor shape.
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. Default: -100
reduction (typing.Callable(torch.Tensor) -> torch.Tensor, optional) – Specifies the reduction to apply to the output. For no reduction, pass lambda loss: loss. For summation, pass torch.sum. By default, lambda loss: loss.sum() / loss.shape[0] is used (mean across examples).
-
forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor
- Parameters
outputs (torch.Tensor) – (N, C) where C = number of classes, or (N, C, d_1, d_2, ..., d_K) with K ≥ 1 in the case of K-dimensional loss. Usually of shape (N, C, H, W), where H is image height and W is its width.
targets (torch.Tensor) – (N) where each value is 0 ≤ targets[i] ≤ C - 1, or (N, d_1, d_2, ..., d_K) with K ≥ 1 in the case of K-dimensional loss. Usually of shape (N, H, W), where H is image height and W is its width, and elements are indices of the C classes.
- Returns
If reduction is not specified, the mean across samples is taken. Otherwise, whatever shape reduction returns.
- Return type
torch.Tensor
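As an illustration of the multiclass case, here is a hedged plain-Python sketch that applies the same focal factor to the softmax probability of the target class and skips targets equal to ignore_index. The function names are hypothetical, per-class weight is omitted, and how the library itself treats ignored entries during reduction is an assumption (here they simply contribute 0.0).

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of class logits."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def multiclass_focal(outputs, targets, gamma, ignore_index=-100):
    """Per-example focal loss.

    outputs: list of N rows, each with C logits; targets: list of N class indices.
    """
    losses = []
    for logits, target in zip(outputs, targets):
        if target == ignore_index:
            losses.append(0.0)  # assumption: ignored targets contribute nothing
            continue
        p_t = softmax(logits)[target]
        losses.append(-((1.0 - p_t) ** gamma) * math.log(p_t))
    return losses


outputs = [[3.0, 0.0, -1.0], [0.2, 0.1, 0.0], [1.0, 2.0, 3.0]]
targets = [0, 2, -100]  # the last example is ignored

cross_entropy = multiclass_focal(outputs, targets, gamma=0.0)
focal = multiclass_focal(outputs, targets, gamma=2.0)
print(cross_entropy)
print(focal)
```

The returned list is unreduced, so any reduction (sum, mean across examples, or none) can be applied afterwards by the caller.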
-
class torchtraining.loss.QuadrupletLoss(alpha1: float = 1.0, alpha2: float = 0.5, metric: Callable[[torch.Tensor, torch.Tensor], torch.Tensor] = <function pairwise_distance>, weight=None, reduction: str = 'sum')
Quadruplet loss pushing away samples belonging to different classes.
See original research paper Beyond triplet loss: a deep quadruplet network for person re-identification for more information.
It is an extension of torch.nn.TripletMarginLoss, where samples from two different negative classes (negative and negative2) should be pushed further apart in space than those belonging to the same class (anchor and positive).
The loss function for each sample in the mini-batch is:
- Parameters
alpha1 (float, optional) – Margin of the standard triplet loss. Default: 1.0
alpha2 (float, optional) – Margin of the second part of the loss (pushing negative and negative2 samples further apart than positive and anchor). Default: 0.5
metric (Callable(torch.Tensor, torch.Tensor) -> torch.Tensor, optional) – Metric used to rate the distance between samples. A fully connected neural network with one output and sigmoid could be used (as in the original paper), or anything else adhering to the API. Default: Euclidean distance.
weight (Tensor, optional) – Manual rescaling weight; if provided, it is repeated to match the input tensor shape. Default: None (no weighting)
reduction (typing.Callable(torch.Tensor) -> torch.Tensor, optional) – Specifies the reduction to apply to the output. For no reduction, pass lambda loss: loss. For summation, pass torch.sum. By default, lambda loss: loss.sum() / loss.shape[0] is used (mean across examples).
-
forward(anchor: torch.Tensor, positive: torch.Tensor, negative: torch.Tensor, negative2: torch.Tensor) → torch.Tensor
- Parameters
anchor (torch.Tensor) – (N, *) where * means any number of additional dimensions. For images, usually of shape (N, C, H, W).
positive (torch.Tensor) – Same as anchor
negative (torch.Tensor) – Same as anchor
negative2 (torch.Tensor) – Same as anchor
- Returns
If reduction is not specified, the mean across samples is taken. Otherwise, whatever shape reduction returns.
- Return type
torch.Tensor
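One way to read the two margins is as two hinge terms: a triplet term with margin alpha1 and a second term with margin alpha2 that compares the anchor-positive distance with the distance between the two negatives. The sketch below is an assumption-laden illustration in plain Python, not the library's code. It uses unsquared distances in the style of torch.nn.TripletMarginLoss (the cited paper squares them), and the names euclidean and quadruplet_loss are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two equally sized vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def quadruplet_loss(anchor, positive, negative, negative2,
                    alpha1=1.0, alpha2=0.5, metric=euclidean):
    """Quadruplet loss for one sample (unsquared distances, an assumption)."""
    d_ap = metric(anchor, positive)
    # Standard triplet term: anchor should be closer to positive than to negative.
    triplet_term = max(d_ap - metric(anchor, negative) + alpha1, 0.0)
    # Second term: the two negatives should be further apart than anchor and positive.
    pair_term = max(d_ap - metric(negative, negative2) + alpha2, 0.0)
    return triplet_term + pair_term


anchor, positive = [0.0, 0.0], [0.0, 1.0]

# Well-separated negatives: both hinge terms are inactive.
separated = quadruplet_loss(anchor, positive, [3.0, 0.0], [3.0, 4.0])

# Negatives that are too close: both margins are violated.
violating = quadruplet_loss(anchor, positive, [1.0, 0.0], [1.0, 0.5])
print(separated, violating)
```

Passing a different callable as metric (for example one that squares the distance) changes the formulation without touching the rest of the function.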
-
class torchtraining.loss.SmoothBinaryCrossEntropy(alpha: float, weight=None, pos_weight: int = None, reduction: Callable[torch.Tensor, torch.Tensor] = None)
Run binary cross-entropy with boolean targets smoothed by alpha.
See When Does Label Smoothing Help? for more details.
targets will be transformed to one-hot encoding and modified according to the formula:
y = y(1 - alpha) + alpha / 2
where 2 is the total number of classes in the binary case.
- Parameters
alpha (float) – Smoothing parameter in the range [0, 1).
weight (Tensor, optional) – Manual rescaling weight; if provided, it is repeated to match the input tensor shape. Default: None (no weighting)
pos_weight (Tensor, optional) – Weight of positive examples. Must be a vector with length equal to the number of classes.
reduction (typing.Callable(torch.Tensor) -> torch.Tensor, optional) – Specifies the reduction to apply to the output. For no reduction, pass lambda loss: loss. For summation, pass torch.sum. By default, lambda loss: loss.sum() / loss.shape[0] is used (mean across examples).
-
forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor
- Parameters
outputs (torch.Tensor) – (N, *) where * means any number of additional dimensions
targets (torch.Tensor) – (N, *), same shape as the input
- Returns
If reduction is not specified, the mean across samples is taken. Otherwise, whatever shape reduction returns.
- Return type
torch.Tensor
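A short worked example may help show what smoothing does to binary targets. The sketch below assumes the standard label smoothing rule from the cited paper, y * (1 - alpha) + alpha / 2 for two classes, and evaluates binary cross-entropy against the softened targets. It is plain Python with hypothetical helper names, not the library's implementation, and it leaves out weight and pos_weight.

```python
import math


def smooth_binary_targets(targets, alpha):
    """Apply y * (1 - alpha) + alpha / 2 to 0/1 targets."""
    return [t * (1.0 - alpha) + alpha / 2.0 for t in targets]


def binary_cross_entropy_with_logits(logit, target):
    """Binary cross-entropy for one logit and one (possibly soft) target."""
    p = 1.0 / (1.0 + math.exp(-logit))
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


targets = [1, 0]
logits = [2.0, -1.0]  # both predictions are on the correct side

smoothed = smooth_binary_targets(targets, alpha=0.2)  # approximately [0.9, 0.1]

hard_losses = [binary_cross_entropy_with_logits(x, y) for x, y in zip(logits, targets)]
soft_losses = [binary_cross_entropy_with_logits(x, y) for x, y in zip(logits, smoothed)]
print(smoothed)
print(hard_losses, soft_losses)
```

For predictions that are already correct, the smoothed targets give a larger loss than the hard targets, which discourages the network from becoming overconfident.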
-
class torchtraining.loss.SmoothCrossEntropy(alpha: float, weight=None, ignore_index: int = -100, reduction: Callable[torch.Tensor, torch.Tensor] = None)
Run cross-entropy with non-integer labels smoothed by alpha.
See When Does Label Smoothing Help? for more details.
targets will be transformed to one-hot encoding and modified according to the formula:
y = y(1 - alpha) + alpha / C
where C is the total number of classes.
- Parameters
alpha (float) – Smoothing parameter in the range [0, 1).
weight (Tensor, optional) – Manual rescaling weight; if provided, it is repeated to match the input tensor shape. Default: None (no weighting)
ignore_index (int, optional) – Specifies a target value that is ignored and does not contribute to the input gradient. When size_average is True, the loss is averaged over non-ignored targets. Default: -100
reduction (typing.Callable(torch.Tensor) -> torch.Tensor, optional) – Specifies the reduction to apply to the output. For no reduction, pass lambda loss: loss. For summation, pass torch.sum. By default, lambda loss: loss.sum() / loss.shape[0] is used (mean across examples).
-
forward(outputs: torch.Tensor, targets: torch.Tensor) → torch.Tensor
- Parameters
outputs (torch.Tensor) – (N, C) where C = number of classes, or (N, C, d_1, d_2, ..., d_K) with K ≥ 1 in the case of K-dimensional loss.
targets (torch.Tensor) – (N) where each value is 0 ≤ targets[i] ≤ C - 1, or (N, d_1, d_2, ..., d_K) with K ≥ 1 in the case of K-dimensional loss.
- Returns
If reduction is not specified, the mean across samples is taken. Otherwise, whatever shape reduction returns.
- Return type
torch.Tensor
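The multiclass case follows the same idea with C classes. The following plain-Python sketch assumes the standard rule y * (1 - alpha) + alpha / C applied to a one-hot vector, then takes cross-entropy against the log-softmax of the logits. Function names are hypothetical, class weight is omitted, and returning 0.0 for an ignored target is an assumption about how ignore_index behaves.

```python
import math


def smooth_one_hot(target, num_classes, alpha):
    """One-hot encode a class index, then apply y * (1 - alpha) + alpha / C."""
    one_hot = [1.0 if c == target else 0.0 for c in range(num_classes)]
    return [y * (1.0 - alpha) + alpha / num_classes for y in one_hot]


def log_softmax(logits):
    """Numerically stable log-softmax over one row of class logits."""
    m = max(logits)
    log_total = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_total for v in logits]


def smooth_cross_entropy(logits, target, alpha, ignore_index=-100):
    """Cross-entropy of one example against its smoothed one-hot target."""
    if target == ignore_index:
        return 0.0  # assumption: ignored targets contribute nothing
    soft = smooth_one_hot(target, len(logits), alpha)
    return -sum(y * lp for y, lp in zip(soft, log_softmax(logits)))


logits = [2.0, 0.5, -1.0, 0.0]
soft_target = smooth_one_hot(0, num_classes=4, alpha=0.1)
print(soft_target)  # the true class gets about 0.925, every other class about 0.025
print(smooth_cross_entropy(logits, 0, alpha=0.1))
```

Setting alpha to 0.0 leaves the one-hot vector unchanged, so the function reduces to ordinary cross-entropy.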