torchdata.maps
This module provides functions that can be used with the torchdata.Dataset.map method.
The following dataset object is used throughout this documentation for brevity (unless defined explicitly otherwise):
# Example dataset yielding consecutive integers
import torchdata as td

class Example(td.Dataset):
    def __init__(self, max: int):
        super().__init__()  # initialize the td.Dataset base class
        self.values = list(range(max))

    def __getitem__(self, index):
        return self.values[index]

    def __len__(self):
        return len(self.values)

dataset = Example(100)
The maps below are general and can be used in various scenarios.
class torchdata.maps.After(samples: int, function: Callable)
Apply function after the specified number of samples has passed.
Useful for introducing data augmentation after an initial warm-up period. If you want direct control over when function is applied to a sample, please use torchdata.maps.OnSignal.
Example:
# After 10 samples apply lambda mapping
dataset = dataset.map(td.maps.After(10, lambda x: -x))
- Parameters
samples (int) – After how many samples function will start being applied.
function (Callable) – Function to apply to sample.
- Returns
Either unchanged sample or function(sample)
- Return type
Union[sample, function(sample)]
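The counting behavior described above can be illustrated without the library. The sketch below is a hypothetical pure-Python stand-in (AfterSketch is not part of torchdata). It assumes that the first `samples` calls return the sample untouched; the library's exact boundary may differ.

```python
from typing import Any, Callable


class AfterSketch:
    """Hypothetical stand-in for td.maps.After; not the library's implementation."""

    def __init__(self, samples: int, function: Callable[[Any], Any]):
        self.samples = samples
        self.function = function
        self._seen = 0  # number of samples processed so far

    def __call__(self, sample: Any) -> Any:
        self._seen += 1
        # Assumption: the first `samples` samples pass through unchanged.
        if self._seen > self.samples:
            return self.function(sample)
        return sample


negate_after_two = AfterSketch(2, lambda x: -x)
results = [negate_after_two(value) for value in [1, 2, 3, 4]]
print(results)  # [1, 2, -3, -4]
```

Because the counter lives on the object, one instance should be reused for the whole run if the warm-up is meant to happen only once.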
class torchdata.maps.Drop(*indices)
Return sample without selected elements.
Sample has to be an indexable object (one with the __getitem__ method implemented).
Important:
- Negative indexing is supported if the sample object supports it.
- This function is slower than Select, which should be preferred.
- If you want to select a sample from a nested tuple, please use Flatten first.
- Returns a single element if only one element is left.
- Returns None if all elements are dropped.
Example:
# Sample-wise concatenate dataset three times
new_dataset = dataset | dataset | dataset
# Zeroth and last samples dropped
selected = new_dataset.map(td.maps.Drop(0, 2))
- Parameters
*indices (int) – Indices of objects to remove from the sample. If left empty, tuple containing all elements will be returned.
- Returns
Tuple without selected elements
- Return type
Tuple[samples]
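The return rules listed for Drop (tuple, single element, or None) are easy to mix up, so here is a minimal sketch that mirrors them on a plain tuple. The function drop_sketch is a hypothetical helper written for illustration only, not the library's code, and it assumes the sample supports len().

```python
from typing import Any, Sequence


def drop_sketch(sample: Sequence[Any], *indices: int) -> Any:
    """Hypothetical helper mirroring the documented behavior of td.maps.Drop."""
    size = len(sample)
    # Normalize negative indices so that -1 refers to the last element.
    dropped = {index if index >= 0 else size + index for index in indices}
    kept = tuple(
        item for position, item in enumerate(sample) if position not in dropped
    )
    if not kept:
        return None  # every element was dropped
    if len(kept) == 1:
        return kept[0]  # a single survivor is returned unwrapped
    return kept


sample = (10, 20, 30)
print(drop_sketch(sample, 0, 2))     # 20
print(drop_sketch(sample, -1))       # (10, 20)
print(drop_sketch(sample, 0, 1, 2))  # None
print(drop_sketch(sample))           # (10, 20, 30)
```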
class torchdata.maps.Except(function: Callable, *indices)
Apply function to all elements of sample except the ones specified.
Sample has to be an iterable object.
Important:
If you want to apply function to all nested elements (e.g. in a nested tuple), please use the torchdata.maps.Flatten object first.
Example:
# Sample-wise concatenate dataset three times
new_dataset = dataset | dataset | dataset
# Every element increased by one except the first one
selected = new_dataset.map(td.maps.Except(lambda x: x+1, 0))
- Parameters
function (Callable) – Function to apply to chosen elements of sample.
*indices (int) – Indices of objects to which function will not be applied. If left empty, function will be applied to every element of sample.
- Returns
Tuple with subsamples where some have the function applied.
- Return type
Tuple[function(subsample)]
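To make the "all except" semantics concrete, the following sketch applies a function to every position of a plain tuple other than the listed ones. The helper except_sketch is hypothetical and only imitates the documented behavior; it assumes non-negative indices.

```python
from typing import Any, Callable, Iterable, Tuple


def except_sketch(
    function: Callable[[Any], Any], sample: Iterable[Any], *indices: int
) -> Tuple[Any, ...]:
    """Hypothetical helper mirroring the documented behavior of td.maps.Except."""
    skipped = set(indices)  # assumption: indices are non-negative
    return tuple(
        item if position in skipped else function(item)
        for position, item in enumerate(sample)
    )


sample = (5, 5, 5)
print(except_sketch(lambda x: x + 1, sample, 0))  # (5, 6, 6)
print(except_sketch(lambda x: x + 1, sample))     # (6, 6, 6)
```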
class torchdata.maps.Flatten(types: Tuple = (list, tuple))
Flatten arbitrarily nested sample.
Example:
# Nest elements
dataset = dataset.map(lambda x: (x, (x, (x, x), x),))
# Flatten no matter how deep
dataset = dataset.map(td.maps.Flatten())
- Parameters
types (Tuple[type], optional) – Types to be considered non-flat. Those will be recursively flattened. Default: (list, tuple)
- Returns
Tuple with elements flattened
- Return type
Tuple[samples]
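The role of the types parameter is easiest to see on a small nested value. The recursive sketch below is a hypothetical stand-in (flatten_sketch is not the library's implementation) that treats only the given types as containers to descend into.

```python
from typing import Any, Iterator, Tuple


def flatten_sketch(
    sample: Any, types: Tuple[type, ...] = (list, tuple)
) -> Tuple[Any, ...]:
    """Hypothetical helper mirroring the documented behavior of td.maps.Flatten."""

    def walk(item: Any) -> Iterator[Any]:
        if isinstance(item, types):
            # Containers of the listed types are descended into recursively.
            for child in item:
                yield from walk(child)
        else:
            yield item

    return tuple(walk(sample))


print(flatten_sketch((1, (2, (3, 4), 5))))          # (1, 2, 3, 4, 5)
print(flatten_sketch((1, [2, 3]), types=(tuple,)))  # (1, [2, 3])
```

In the second call the list is left intact because only tuple is considered non-flat.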
class torchdata.maps.OnSignal(signal: Callable[[…], bool], function: Callable)
Apply function based on boolean output of signalling function.
Useful for introducing data augmentation after an initial warm-up period. You can use it to turn a specific augmentation on or off in response to external state, for example turning on image rotations after 5 epochs and turning them off 5 epochs before the end in order to fine-tune your network.
Example:
import torch
from PIL import Image

import torchdata as td
import torchvision

# Image loading dataset
class ImageDataset(td.datasets.Files):
    def __getitem__(self, index):
        return Image.open(self.files[index])

class Handle:
    def __init__(self):
        self.value: bool = False

    def __call__(self):
        return self.value

# you can change handle.value to switch whether mapping should be applied
handle = Handle()
dataset = (
    ImageDataset.from_folder("./data")
    .map(torchvision.transforms.ToTensor())
    .cache()
    # If handle returns True, mapping will be applied
    .map(
        td.maps.OnSignal(
            handle, lambda image: image + torch.rand_like(image)
        )
    )
)
- Parameters
signal (Callable) – No argument callable returning boolean, indicating whether to apply function.
function (Callable) – Function to apply to sample.
- Returns
Either unchanged sample or function(sample)
- Return type
Union[sample, function(sample)]
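The epoch-window scenario mentioned in the description (augmentation on after 5 epochs, off 5 epochs before the end) can be sketched with a signal that reads shared state. Everything here is hypothetical: OnSignalSketch is a pure-Python stand-in, and total_epochs and state stand for values a training loop would maintain.

```python
from typing import Any, Callable


class OnSignalSketch:
    """Hypothetical stand-in for td.maps.OnSignal; not the library's implementation."""

    def __init__(self, signal: Callable[[], bool], function: Callable[[Any], Any]):
        self.signal = signal
        self.function = function

    def __call__(self, sample: Any) -> Any:
        # The signal is queried for every sample, so it can change between calls.
        return self.function(sample) if self.signal() else sample


total_epochs = 20     # hypothetical training length
state = {"epoch": 0}  # hypothetical state updated by the training loop

# Augment only from epoch 5 until 5 epochs before the end.
augment = OnSignalSketch(
    lambda: 5 <= state["epoch"] < total_epochs - 5,
    lambda x: x * 10,
)

outputs = []
for epoch in (0, 5, 14, 15):
    state["epoch"] = epoch
    outputs.append(augment(1))
print(outputs)  # [1, 10, 10, 1]
```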
class torchdata.maps.Repeat(n: int, function: Callable)
Apply function repeatedly to the sample.
Example:
import torchdata as td

# Creating td.Dataset instance
...
# Increase each value by 10 * 1
dataset = dataset.map(td.maps.Repeat(10, lambda x: x+1))
- Parameters
n (int) – How many times the function will be applied.
function (Callable) – Function to apply.
- Returns
Result of applying function to the sample n times.
- Return type
function(sample)
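Repeated application means the output of each call is fed into the next one. A minimal sketch of that idea, using a hypothetical helper repeat_sketch rather than the library class:

```python
from typing import Any, Callable


def repeat_sketch(n: int, function: Callable[[Any], Any], sample: Any) -> Any:
    """Hypothetical helper mirroring the documented behavior of td.maps.Repeat."""
    for _ in range(n):
        sample = function(sample)  # feed the previous result back in
    return sample


print(repeat_sketch(10, lambda x: x + 1, 0))  # 10
print(repeat_sketch(3, lambda x: x * 2, 1))   # 8
```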
class torchdata.maps.Select(*indices)
Select elements from sample.
Sample has to be an indexable object (one with the __getitem__ method implemented).
Important:
- Negative indexing is supported if the sample object supports it.
- This function is faster than Drop and should be used if possible.
- If you want to select a sample from a nested tuple, please use Flatten first.
- Returns a single element if only one element is left.
Example:
# Sample-wise concatenate dataset twice
new_dataset = dataset | dataset
# Only the second element (index 1) will be taken
selected = new_dataset.map(td.maps.Select(1))
- Parameters
*indices (int) – Indices of objects to select from the sample. If left empty, empty tuple will be returned.
- Returns
Tuple with selected elements
- Return type
Tuple[samples]
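The sketch below imitates the documented return rules of Select on a plain tuple. The helper select_sketch is hypothetical; it delegates index handling (including negative indices) to the sample's own __getitem__.

```python
from typing import Any


def select_sketch(sample: Any, *indices: int) -> Any:
    """Hypothetical helper mirroring the documented behavior of td.maps.Select."""
    # Indexing is delegated to the sample, so negative indices work
    # whenever the sample's __getitem__ supports them.
    selected = tuple(sample[index] for index in indices)
    if len(selected) == 1:
        return selected[0]  # a single element is returned unwrapped
    return selected


sample = (10, 20, 30)
print(select_sketch(sample, 1))      # 20
print(select_sketch(sample, 0, -1))  # (10, 30)
print(select_sketch(sample))         # ()
```

Only the requested positions are touched, which is consistent with the note that Select is the faster choice compared to Drop.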
class torchdata.maps.To(function: Callable, *indices)
Apply function to specified elements of sample.
Sample has to be an iterable object.
Important:
If you want to apply function to all nested elements (e.g. in a nested tuple), please use the torchdata.maps.Flatten object first.
Example:
# Sample-wise concatenate dataset three times
new_dataset = dataset | dataset | dataset
# Zeroth and first subsamples will be increased by one, last one left untouched
selected = new_dataset.map(td.maps.To(lambda x: x+1, 0, 1))
- Parameters
function (Callable) – Function to apply to specified elements of sample.
*indices (int) – Indices to which function will be applied. If left empty, function will not be applied to anything.
- Returns
Tuple consisting of subsamples with some having the function applied.
- Return type
Tuple[function(subsample)]
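A small sketch of the index-targeted behavior, using a hypothetical helper to_sketch on a plain tuple. It is not the library's implementation and assumes non-negative indices.

```python
from typing import Any, Callable, Iterable, Tuple


def to_sketch(
    function: Callable[[Any], Any], sample: Iterable[Any], *indices: int
) -> Tuple[Any, ...]:
    """Hypothetical helper mirroring the documented behavior of td.maps.To."""
    chosen = set(indices)  # assumption: indices are non-negative
    return tuple(
        function(item) if position in chosen else item
        for position, item in enumerate(sample)
    )


sample = (5, 5, 5)
print(to_sketch(lambda x: x + 1, sample, 0, 1))  # (6, 6, 5)
print(to_sketch(lambda x: x + 1, sample))        # (5, 5, 5)
```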
class torchdata.maps.ToAll(function: Callable)
Apply function to each element of sample.
Sample has to be an iterable object.
Important:
If you want to apply function to all nested elements (e.g. in a nested tuple), please use the torchdata.maps.Flatten object first.
Example:
# Sample-wise concatenate dataset three times
new_dataset = dataset | dataset | dataset
# Each concatenated sample will be increased by 1
selected = new_dataset.map(td.maps.ToAll(lambda x: x+1))
- Parameters
function (Callable) – Function to apply to each element of sample.
- Returns
Tuple consisting of subsamples with function applied.
- Return type
Tuple[function(subsample)]
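The note about nested elements matters because the function only sees top-level elements. The sketch below shows why flattening first helps; to_all_sketch and flatten_sketch are hypothetical helpers standing in for ToAll and Flatten, not the library's code.

```python
from typing import Any, Callable, Iterable, Tuple


def to_all_sketch(
    function: Callable[[Any], Any], sample: Iterable[Any]
) -> Tuple[Any, ...]:
    """Hypothetical helper: apply function to each top-level element."""
    return tuple(function(item) for item in sample)


def flatten_sketch(sample: Iterable[Any]) -> Tuple[Any, ...]:
    """Hypothetical helper: recursively flatten nested lists and tuples."""
    flat = []
    for item in sample:
        if isinstance(item, (list, tuple)):
            flat.extend(flatten_sketch(item))
        else:
            flat.append(item)
    return tuple(flat)


nested = (1, (2, 3))

try:
    to_all_sketch(lambda x: x + 1, nested)
    failed = False
except TypeError:
    failed = True  # the inner tuple (2, 3) cannot be added to an int

print(failed)                                                  # True
print(to_all_sketch(lambda x: x + 1, flatten_sketch(nested)))  # (2, 3, 4)
```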