torchdata.modifiers
This module lets you modify the behaviour of torchdata.cachers.
To cache only the first 20 samples in memory, you could do the following (assuming you have already created a
torchdata.Dataset instance named dataset):
dataset.cache(td.modifiers.UpToIndex(20, td.cachers.Memory()))
Modifiers can also be combined using the logical operators | (or) and
& (and).
Example (cache in memory the first 20 samples or samples with index 1000 and upwards):
dataset.cache(
td.modifiers.UpToIndex(20, td.cachers.Memory())
| td.modifiers.FromIndex(1000, td.cachers.Memory())
)
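To make the selection above concrete, the following sketch lists which indices such a combination would send to the cacher. It does not use torchdata: each modifier is modelled as a plain predicate on the sample index, the helper names `up_to_index`, `from_index` and `either` are hypothetical, and the exact bounds (strict `<` for the first and inclusive `>=` for the second) are assumptions not confirmed by this page.

```python
from typing import Callable

Condition = Callable[[int], bool]


def up_to_index(limit: int) -> Condition:
    # Assumption: "first 20 samples" means indices 0..19 (strict upper bound).
    return lambda index: index < limit


def from_index(start: int) -> Condition:
    # Assumption: "index 1000 and upwards" includes index 1000 itself.
    return lambda index: index >= start


def either(first: Condition, second: Condition) -> Condition:
    # Mirrors the documented meaning of `|`: cache if any condition holds.
    return lambda index: first(index) or second(index)


should_cache = either(up_to_index(20), from_index(1000))

# Pretend the dataset has 1005 samples and collect the indices to cache.
cached = [index for index in range(1005) if should_cache(index)]
print(cached[:3], cached[19:22], len(cached))
```

Under these assumptions everything between index 20 and index 999 is left uncached and would be produced by the dataset on every access.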
You can combine the provided modifiers or extend them by inheriting from Modifier
and implementing the condition method (interface described below).
In most cases the Lambda modifier should be sufficient, for example:
# Only elements up to the `25`th whose index is also divisible by `2`
dataset = dataset.cache(
td.modifiers.UpToIndex(25, cacher)
& td.modifiers.Lambda(lambda index: index % 2 == 0, cacher)
)
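The way `&` and `|` compose conditions can be sketched with ordinary operator overloading. The `Condition` class below is a hypothetical stand-in written for illustration, not torchdata's `Modifier`, and it ignores the cacher entirely; the strict `index < 25` bound is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Condition:
    """Hypothetical stand-in for a modifier: wraps a predicate on the index."""

    predicate: Callable[[int], bool]

    def __call__(self, index: int) -> bool:
        return self.predicate(index)

    def __and__(self, other: "Condition") -> "Condition":
        # Both conditions must hold for the sample to be cached.
        return Condition(lambda index: self(index) and other(index))

    def __or__(self, other: "Condition") -> "Condition":
        # At least one condition must hold for the sample to be cached.
        return Condition(lambda index: self(index) or other(index))


up_to_25 = Condition(lambda index: index < 25)  # assumed strict bound
even = Condition(lambda index: index % 2 == 0)

combined = up_to_25 & even
selected = [index for index in range(40) if combined(index)]
print(selected)
```

Because each operator returns a new `Condition`, longer expressions such as `(a & b) | c` compose without extra code.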
-
class torchdata.modifiers.All(*modifiers)
Return True if all modifiers return True on the given sample.
- Parameters
*modifiers (List[torchdata.modifiers.Modifier]) – List of modifiers
-
class torchdata.modifiers.Any(*modifiers)
Return True if any modifier returns True on the given sample.
- Parameters
*modifiers (List[torchdata.modifiers.Modifier]) – List of modifiers
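The variadic behaviour of All and Any corresponds to Python's built-in `all` and `any` applied over several conditions. The sketch below assumes each modifier can be reduced to a predicate on the sample index; `all_of` and `any_of` are hypothetical helpers, not part of torchdata.

```python
from typing import Callable

Condition = Callable[[int], bool]


def all_of(*conditions: Condition) -> Condition:
    # True only when every wrapped condition accepts the index.
    return lambda index: all(condition(index) for condition in conditions)


def any_of(*conditions: Condition) -> Condition:
    # True as soon as one wrapped condition accepts the index.
    return lambda index: any(condition(index) for condition in conditions)


def below_ten(index: int) -> bool:
    return index < 10


def even(index: int) -> bool:
    return index % 2 == 0


def multiple_of_three(index: int) -> bool:
    return index % 3 == 0


strict = all_of(below_ten, even, multiple_of_three)
loose = any_of(below_ten, even, multiple_of_three)

strict_indices = [index for index in range(20) if strict(index)]
loose_indices = [index for index in range(20) if loose(index)]
print(strict_indices)
print(loose_indices)
```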
-
class torchdata.modifiers.FromIndex(index: int, cacher)
Cache samples from the specified index onwards, leaving the rest untouched.
- Parameters
index (int) – Index of sample
cacher (torchdata.cacher.Cacher) – Instance of cacher
-
class torchdata.modifiers.FromPercentage(p: float, length: int, cacher)
Cache samples from the specified percentage of the dataset onwards, leaving the rest untouched.
- Parameters
p (float) – Percentage specified as a float in [0, 1].
length (int) – How many samples are in the dataset. You can pass len(dataset).
cacher (torchdata.cacher.Cacher) – Instance of cacher
-
class torchdata.modifiers.Indices(cacher, *indices)
Cache a sample if its index is one of those specified.
- Parameters
cacher (torchdata.cacher.Cacher) – Instance of cacher
*indices (int) – Indices of the samples to cache
-
class torchdata.modifiers.Lambda(function: Callable, cacher)
Cache samples if the specified function returns True.
- Parameters
function (Callable) – Single-argument callable; if it returns True, the sample is cached. The index of the sample is passed as the argument.
cacher (torchdata.cacher.Cacher) – Instance of cacher
-
class torchdata.modifiers.Modifier
Interface for all modifiers.
Most methods are pre-configured, so the user should not override them. In fact, only
condition has to be overridden and __init__ implemented. The constructor should assign cacher to self in order for everything to work, see the example below.
Example implementation of a modifier caching only elements with index below 100 of any td.cacher.Cacher:
import torchdata as td

class ExampleModifier(td.modifiers.Modifier):
    # You have to assign cacher to self.cacher so the modifier works.
    def __init__(self, cacher):
        self.cacher = cacher

    def condition(self, index):
        return index < 100  # Cache if index is smaller than 100
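The division of labour between condition and the wrapped cacher can be sketched without torchdata installed. Everything below is a hypothetical stand-in: `DictCacher` and `SketchModifier` are illustrative names, and the way `__contains__` and `__setitem__` consult `condition` is an assumption about how such a proxy might work, not the library's actual implementation.

```python
from typing import Any, Dict


class DictCacher:
    """Hypothetical in-memory cacher backed by a plain dict."""

    def __init__(self) -> None:
        self.storage: Dict[int, Any] = {}

    def __contains__(self, index: int) -> bool:
        return index in self.storage

    def __getitem__(self, index: int) -> Any:
        return self.storage[index]

    def __setitem__(self, index: int, data: Any) -> None:
        self.storage[index] = data


class SketchModifier:
    """Stand-in base class: forwards to the cacher only when condition holds."""

    cacher: DictCacher

    def condition(self, index: int) -> bool:
        raise NotImplementedError

    def __contains__(self, index: int) -> bool:
        return self.condition(index) and index in self.cacher

    def __getitem__(self, index: int) -> Any:
        return self.cacher[index]

    def __setitem__(self, index: int, data: Any) -> None:
        # Samples that fail the condition are silently left uncached.
        if self.condition(index):
            self.cacher[index] = data


class FirstHundred(SketchModifier):
    def __init__(self, cacher: DictCacher) -> None:
        self.cacher = cacher  # the base class relies on this attribute

    def condition(self, index: int) -> bool:
        return index < 100


modifier = FirstHundred(DictCacher())
for index in (5, 99, 100, 250):
    modifier[index] = f"sample-{index}"

print(sorted(modifier.cacher.storage))
```

Only the subclass knows which indices qualify; the base class handles the forwarding, which is why the constructor has to store the cacher on `self`.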
-
__and__(other)
If both self and other return True, then use cacher.
Important: self and other should wrap the same cacher. The cacher of the first modifier is used no matter what.
-
__contains__(index: int) → bool
Acts as an invisible proxy for the cacher's __contains__ method.
User should not override this method. For more information check the torchdata.cacher.Cacher interface.
- Parameters
index (int) – Index of sample
-
__getitem__(index: int)
Acts as an invisible proxy for the cacher's __getitem__ method.
User should not override this method. For more information check the torchdata.cacher.Cacher interface.
- Parameters
index (int) – Index of sample
-
__or__(other)
If either self or other returns True, then use cacher.
User should not override this method.
Important: self and other should wrap the same cacher. Otherwise an exception is thrown. The cacher of the first modifier is used in such a case.
-
__setitem__(index: int, data: Any) → None
Acts as an invisible proxy for the cacher's __setitem__ method.
User should not override this method. For more information check the torchdata.cacher.Cacher interface.
- Parameters
index (int) – Index of sample
data (typing.Any) – Data generated by dataset.
-
-
class torchdata.modifiers.UpToIndex(index: int, cacher)
Cache samples up to the specified index, leaving the rest untouched.
- Parameters
index (int) – Index of sample
cacher (torchdata.cacher.Cacher) – Instance of cacher
-
class torchdata.modifiers.UpToPercentage(p: float, length: int, cacher)
Cache samples up to the specified percentage of the dataset, leaving the rest untouched.
- Parameters
p (float) – Percentage specified as a float in [0, 1].
length (int) – How many samples are in the dataset. You can pass len(dataset).
cacher (torchdata.cacher.Cacher) – Instance of cacher
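Both percentage-based modifiers need length because a fraction has to be turned into an index boundary. The sketch below shows one plausible conversion; `percentage_threshold` is a hypothetical helper, and the choice of floor(p * length) as the boundary, with "up to" taken as strictly below it and "from" as at or above it, is an assumption that the library may not share.

```python
def percentage_threshold(p: float, length: int) -> int:
    """Index separating the first `p` fraction of samples from the rest.

    Assumption: the boundary is floor(p * length); the library may round
    differently.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    return int(p * length)


length = 200  # stands in for len(dataset)
threshold = percentage_threshold(0.25, length)

# Indices an "up to 25%" rule would cache versus a "from 25%" rule.
up_to_part = [index for index in range(length) if index < threshold]
from_part = [index for index in range(length) if index >= threshold]
print(threshold, len(up_to_part), len(from_part))
```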