torchdata.maps
This module provides functions that can be used with the torchdata.Dataset.map method.
The following dataset object is used throughout this documentation for brevity (unless defined explicitly):
    # Simple dataset of consecutive integers
    import torchdata as td

    class Example(td.Dataset):
        def __init__(self, max: int):
            self.values = list(range(max))

        def __getitem__(self, index):
            return self.values[index]

        def __len__(self):
            return len(self.values)

    dataset = Example(100)
The maps below are general and can be used in various scenarios.
-
class torchdata.maps.After(samples: int, function: Callable)
Apply function after the specified number of samples has passed.
Useful for introducing data augmentation after an initial warm-up period. If you want direct control over when function is applied to a sample, please use torchdata.maps.OnSignal.
Example:
    # After 10 samples apply lambda mapping
    dataset = dataset.map(td.maps.After(10, lambda x: -x))
- Parameters
samples (int) – After how many samples function will start being applied.
function (Callable) – Function to apply to sample.
- Returns
Either unchanged sample or function(sample)
- Return type
Union[sample, function(sample)]
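The counting behaviour described above can be sketched in plain Python. `AfterSketch` is a hypothetical stand-in written for illustration, not the library's implementation; in particular, the assumption that exactly the first `samples` calls pass through unchanged is mine.

```python
from typing import Any, Callable


class AfterSketch:
    """Hypothetical stand-in: applies `function` once `samples` samples have passed."""

    def __init__(self, samples: int, function: Callable[[Any], Any]):
        self.samples = samples
        self.function = function
        self._seen = 0  # number of samples that passed through unchanged

    def __call__(self, sample: Any) -> Any:
        # Assumption: the first `samples` calls return the sample untouched
        if self._seen < self.samples:
            self._seen += 1
            return sample
        return self.function(sample)


negate_after_three = AfterSketch(3, lambda x: -x)
results = [negate_after_three(x) for x in range(1, 7)]
print(results)
```

Because the object keeps an internal counter, the warm-up period is measured in calls to the map, not in epochs.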
-
class torchdata.maps.Drop(*indices)
Return sample without selected elements.
Sample has to be an indexable object (one that implements __getitem__).
Important:
- Negative indexing is supported if the sample object supports it.
- This function is slower than Select, and the latter should be preferred.
- If you want to select elements from a nested tuple, please use Flatten first.
- Returns a single element if only one element is left.
- Returns None if all elements are dropped.
Example:
    # Sample-wise concatenate dataset three times
    new_dataset = dataset | dataset | dataset
    # Zeroth and last samples dropped
    selected = new_dataset.map(td.maps.Drop(0, 2))
- Parameters
*indices (int) – Indices of objects to remove from the sample. If left empty, a tuple containing all elements will be returned.
- Returns
Tuple without selected elements
- Return type
Tuple[samples]
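The return rules listed for Drop (tuple, single element, or None) can be illustrated with a small pure-Python helper. `drop_sketch` is a hypothetical function that only mirrors the documented behaviour; the way negative indices are normalised here is an assumption.

```python
from typing import Any, Sequence


def drop_sketch(sample: Sequence[Any], *indices: int) -> Any:
    """Hypothetical helper mirroring the documented return rules of Drop."""
    # Assumption: negative indices count from the end, as for tuples and lists
    to_drop = {i if i >= 0 else len(sample) + i for i in indices}
    kept = tuple(item for i, item in enumerate(sample) if i not in to_drop)
    if not kept:
        return None  # every element was dropped
    if len(kept) == 1:
        return kept[0]  # a single remaining element is unwrapped
    return kept


sample = (10, 20, 30)
print(drop_sketch(sample, 0, 2))   # one element left
print(drop_sketch(sample, -1))     # negative index
print(drop_sketch(sample, 0, 1, 2))
```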
-
class torchdata.maps.Except(function: Callable, *indices)
Apply function to all elements of sample except the ones specified.
Sample has to be an iterable object.
Important:
If you want to apply function to all nested elements (e.g. in a nested tuple), please use torchdata.maps.Flatten first.
Example:
    # Sample-wise concatenate dataset three times
    new_dataset = dataset | dataset | dataset
    # Every element increased by one except the first one
    selected = new_dataset.map(td.maps.Except(lambda x: x+1, 0))
-
function – Function to apply to chosen elements of sample.
- Type
Callable
-
*indices – Indices of objects to which function will not be applied. If left empty, function will be applied to every element of sample.
- Type
int
- Returns
Tuple with subsamples where some have the function applied.
- Return type
Tuple[function(subsample)]
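As a rough illustration of the semantics above, the following hypothetical `except_sketch` applies a function to every position that is not listed. It is a sketch of the described behaviour only and makes no claim about how the library implements it.

```python
from typing import Any, Callable, Iterable, Tuple


def except_sketch(
    function: Callable[[Any], Any], sample: Iterable[Any], *indices: int
) -> Tuple[Any, ...]:
    """Hypothetical helper: apply `function` everywhere except at `indices`."""
    skipped = set(indices)
    return tuple(
        item if i in skipped else function(item)
        for i, item in enumerate(sample)
    )


print(except_sketch(lambda x: x + 1, (5, 5, 5), 0))
print(except_sketch(lambda x: x + 1, (5, 5, 5)))  # no indices: applied to all
```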
-
-
class torchdata.maps.Flatten(types: Tuple = (list, tuple))
Flatten arbitrarily nested sample.
Example:
    # Nest elements
    dataset = dataset.map(lambda x: (x, (x, (x, x), x),))
    # Flatten no matter how deep
    dataset = dataset.map(td.maps.Flatten())
- Parameters
types (Tuple[type], optional) – Types to be considered non-flat. Those will be recursively flattened. Default: (list, tuple)
- Returns
Tuple with elements flattened
- Return type
Tuple[samples]
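Recursive flattening with a configurable set of "non-flat" types can be sketched as follows. `flatten_sketch` is a hypothetical helper, assuming that anything which is not an instance of `types` is treated as a leaf.

```python
from typing import Any, Iterator, Tuple


def flatten_sketch(
    sample: Any, types: Tuple[type, ...] = (list, tuple)
) -> Tuple[Any, ...]:
    """Hypothetical helper: recursively flatten instances of `types`."""

    def walk(item: Any) -> Iterator[Any]:
        if isinstance(item, types):
            for child in item:
                yield from walk(child)  # descend into nested containers
        else:
            yield item  # leaf element, kept as-is

    return tuple(walk(sample))


nested = (1, (2, (3, 4), 5))
print(flatten_sketch(nested))
# Lists are left intact when only tuples are considered non-flat
print(flatten_sketch((1, [2, 3]), types=(tuple,)))
```

Restricting `types` is useful when some containers (for example a list of labels) should survive as a single element.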
-
class torchdata.maps.OnSignal(signal: Callable[[…], bool], function: Callable)
Apply function based on boolean output of signalling function.
Useful for introducing data augmentation after an initial warm-up period. You can use it to turn a specific augmentation on or off in response to external state, for example turning on image rotations after 5 epochs and turning them off 5 epochs before the end in order to fine-tune your network.
Example:
    import torch
    from PIL import Image

    import torchdata as td
    import torchvision


    # Image loading dataset
    class ImageDataset(td.datasets.Files):
        def __getitem__(self, index):
            return Image.open(self.files[index])


    class Handle:
        def __init__(self):
            self.value: bool = False

        def __call__(self):
            return self.value


    # you can change handle.value to switch whether mapping should be applied
    handle = Handle()
    dataset = (
        ImageDataset.from_folder("./data")
        .map(torchvision.transforms.ToTensor())
        .cache()
        # If handle returns True, mapping will be applied
        .map(
            td.maps.OnSignal(
                handle, lambda image: image + torch.rand_like(image)
            )
        )
    )
- Parameters
signal (Callable) – No argument callable returning boolean, indicating whether to apply function.
function (Callable) – Function to apply to sample.
- Returns
Either unchanged sample or function(sample)
- Return type
Union[sample, function(sample)]
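The "on after 5 epochs, off 5 epochs before the end" scenario mentioned above could be driven by a signal object like the one below. Both `OnSignalSketch` and `EpochWindow` are hypothetical names introduced for this sketch, and the training loop is assumed to update `window.epoch` itself.

```python
from typing import Any, Callable


class OnSignalSketch:
    """Hypothetical stand-in: apply `function` only while `signal()` is True."""

    def __init__(self, signal: Callable[[], bool], function: Callable[[Any], Any]):
        self.signal = signal
        self.function = function

    def __call__(self, sample: Any) -> Any:
        return self.function(sample) if self.signal() else sample


class EpochWindow:
    """Hypothetical signal: True only for epochs in [start, stop)."""

    def __init__(self, start: int, stop: int):
        self.start = start
        self.stop = stop
        self.epoch = 0  # assumed to be updated by the training loop

    def __call__(self) -> bool:
        return self.start <= self.epoch < self.stop


# 20 epochs in total: augmentation active from epoch 5 up to (not including) 15
window = EpochWindow(start=5, stop=15)
augment = OnSignalSketch(window, lambda x: x * 10)

outputs = []
for epoch in (0, 5, 14, 15):
    window.epoch = epoch
    outputs.append(augment(1))
print(outputs)
```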
-
class torchdata.maps.Repeat(n: int, function: Callable)
Apply function repeatedly to the sample.
Example:
    import torchdata as td

    # Creating td.Dataset instance
    ...
    # Increase each value by 10 * 1
    dataset = dataset.map(td.maps.Repeat(10, lambda x: x+1))
- Parameters
n (int) – How many times the function will be applied.
function (Callable) – Function to apply.
- Returns
Function(sample) applied n times.
- Return type
function(sample)
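Repeated application means the output of one call is fed into the next, i.e. function composition. A minimal sketch, with `repeat_sketch` as a hypothetical helper name:

```python
from typing import Any, Callable


def repeat_sketch(n: int, function: Callable[[Any], Any], sample: Any) -> Any:
    """Hypothetical helper: feed the sample through `function` n times."""
    for _ in range(n):
        sample = function(sample)
    return sample


print(repeat_sketch(10, lambda x: x + 1, 0))  # ten increments
print(repeat_sketch(3, lambda x: x * 2, 1))   # doubling composes: 1 -> 2 -> 4 -> 8
```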
-
class torchdata.maps.Select(*indices)
Select elements from sample.
Sample has to be an indexable object (one that implements __getitem__).
Important:
- Negative indexing is supported if the sample object supports it.
- This function is faster than Drop and should be used if possible.
- If you want to select elements from a nested tuple, please use Flatten first.
- Returns a single element if only one element is left.
Example:
    # Sample-wise concatenate dataset two times
    new_dataset = dataset | dataset
    # Only second (first index) element will be taken
    selected = new_dataset.map(td.maps.Select(1))
- Parameters
*indices (int) – Indices of objects to select from the sample. If left empty, an empty tuple will be returned.
- Returns
Tuple with selected elements
- Return type
Tuple[samples]
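Selection by index, including the single-element and empty cases described above, might look like this in plain Python. `select_sketch` is a hypothetical helper and relies on the sample's own `__getitem__` for negative indices.

```python
from typing import Any, Sequence


def select_sketch(sample: Sequence[Any], *indices: int) -> Any:
    """Hypothetical helper mirroring the documented return rules of Select."""
    picked = tuple(sample[index] for index in indices)
    # A single selected element is returned unwrapped
    return picked[0] if len(picked) == 1 else picked


sample = (10, 20, 30)
print(select_sketch(sample, 1))
print(select_sketch(sample, 0, -1))
print(select_sketch(sample))  # no indices: empty tuple
```

Direct indexing touches only the requested positions, which is one plausible reason for Select being the faster of the two.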
-
class torchdata.maps.To(function: Callable, *indices)
Apply function to specified elements of sample.
Sample has to be an iterable object.
Important:
If you want to apply function to all nested elements (e.g. in a nested tuple), please use torchdata.maps.Flatten first.
Example:
    # Sample-wise concatenate dataset three times
    new_dataset = dataset | dataset | dataset
    # Zero and first subsamples will be increased by one, last one left untouched
    selected = new_dataset.map(td.maps.To(lambda x: x+1, 0, 1))
-
function – Function to apply to specified elements of sample.
- Type
Callable
-
*indices – Indices to which function will be applied. If left empty, function will not be applied to anything.
- Type
int
- Returns
Tuple consisting of subsamples with some having the function applied.
- Return type
Tuple[function(subsample)]
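A common use of index-targeted mapping is transforming the feature part of a `(feature, label)` pair while leaving the label alone. The sketch below uses a hypothetical `ToSketch` class and made-up data purely to illustrate that pattern.

```python
from typing import Any, Callable, Iterable, Tuple


class ToSketch:
    """Hypothetical stand-in: apply `function` only at the given positions."""

    def __init__(self, function: Callable[[Any], Any], *indices: int):
        self.function = function
        self.indices = set(indices)

    def __call__(self, sample: Iterable[Any]) -> Tuple[Any, ...]:
        return tuple(
            self.function(item) if i in self.indices else item
            for i, item in enumerate(sample)
        )


# Illustrative (feature, label) pairs, not taken from the library
pairs = [(1.0, "cat"), (2.0, "dog")]
scale_feature = ToSketch(lambda value: value * 2, 0)
scaled = [scale_feature(pair) for pair in pairs]
print(scaled)
```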
-
-
class torchdata.maps.ToAll(function: Callable)
Apply function to each element of sample.
Sample has to be an iterable object.
Important:
If you want to apply function to all nested elements (e.g. in a nested tuple), please use torchdata.maps.Flatten first.
Example:
    # Sample-wise concatenate dataset three times
    new_dataset = dataset | dataset | dataset
    # Each concatenated sample will be increased by 1
    selected = new_dataset.map(td.maps.ToAll(lambda x: x+1))
-
function – Function to apply to each element of sample.
- Type
Callable
- Returns
Tuple consisting of subsamples with function applied.
- Return type
Tuple[function(subsample)]
-