torchdata.modifiers
This module allows you to modify the behaviour of torchdata.cachers.
To cache only the first 20 samples in memory, you could do the following (assuming you have already created a torchdata.Dataset instance named dataset):
dataset.cache(td.modifiers.UpToIndex(20, td.cachers.Memory()))
Modifiers can also be combined using the logical operators | (or) and & (and).
Example (cache in memory the first 20 samples or samples with index 1000 and upwards):
dataset.cache(
td.modifiers.UpToIndex(20, td.cachers.Memory())
| td.modifiers.FromIndex(1000, td.cachers.Memory())
)
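The composition rule can be pictured with a small stand-in that does not depend on torchdata. The `Condition` class below is hypothetical and only mimics how `|` and `&` might combine index predicates. It assumes that `UpToIndex(20, ...)` means `index < 20` and `FromIndex(1000, ...)` means `index >= 1000`, as the prose above suggests.

```python
class Condition:
    """Hypothetical stand-in for a modifier: wraps a predicate on the sample index."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __call__(self, index):
        return self.predicate(index)

    def __or__(self, other):
        # Cache when either condition holds.
        return Condition(lambda index: self(index) or other(index))

    def __and__(self, other):
        # Cache only when both conditions hold.
        return Condition(lambda index: self(index) and other(index))


# Assumed semantics, taken from the prose above rather than from the library source.
up_to_20 = Condition(lambda index: index < 20)
from_1000 = Condition(lambda index: index >= 1000)

either = up_to_20 | from_1000
cached = [index for index in (0, 19, 20, 999, 1000, 5000) if either(index)]
print(cached)
```

Under these assumptions, indices 20 and 999 fall between the two ranges and would not be cached.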
You can mix the provided modifiers or extend them by inheriting from Modifier and implementing the condition method (interface described below).
In most cases the Lambda modifier should be sufficient, for example:
# Only elements with index up to `25` that are also divisible by `2`
dataset = dataset.cache(
td.modifiers.UpToIndex(25, cacher)
& td.modifiers.Lambda(lambda index: index % 2 == 0, cacher)
)
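To see which samples such a combination would select, the same condition can be evaluated in plain Python. This is only an illustration: the helper `would_cache` is hypothetical, and it assumes `UpToIndex(25, ...)` matches indices strictly below 25, which the text does not state explicitly.

```python
def would_cache(index, limit=25):
    # Mirrors UpToIndex(limit, ...) & Lambda(lambda index: index % 2 == 0, ...)
    # under the assumption that UpToIndex is exclusive of `limit`.
    return index < limit and index % 2 == 0


selected = [index for index in range(30) if would_cache(index)]
print(selected)
```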
- class torchdata.modifiers.All(*modifiers)
Return True if all modifiers return True on the given sample.
- Parameters
*modifiers (List[torchdata.modifiers.Modifier]) – List of modifiers
- class torchdata.modifiers.Any(*modifiers)
Return True if any modifier returns True on the given sample.
- Parameters
*modifiers (List[torchdata.modifiers.Modifier]) – List of modifiers
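In plain Python terms, these two classes behave like the built-ins `all` and `any` applied to the wrapped conditions. The helper names `all_of` and `any_of` below are hypothetical, and plain predicates stand in for modifiers.

```python
def all_of(*conditions):
    # True only if every condition accepts the index.
    return lambda index: all(condition(index) for condition in conditions)


def any_of(*conditions):
    # True if at least one condition accepts the index.
    return lambda index: any(condition(index) for condition in conditions)


below_ten = lambda index: index < 10
is_even = lambda index: index % 2 == 0

both = all_of(below_ten, is_even)
either = any_of(below_ten, is_even)
print(both(4), either(11))
```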
- class torchdata.modifiers.FromIndex(index: int, cacher)
Cache samples from the specified index onwards, leaving the rest untouched.
- Parameters
index (int) – Index of sample
cacher (torchdata.cacher.Cacher) – Instance of cacher
- class torchdata.modifiers.FromPercentage(p: float, length: int, cacher)
Cache samples from the specified percentage onwards, leaving the rest untouched.
- Parameters
p (float) – Percentage specified as a float in [0, 1].
length (int) – How many samples are in the dataset. You can pass len(dataset).
cacher (torchdata.cacher.Cacher) – Instance of cacher
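How `p` and `length` translate into an index cutoff is not spelled out here. One plausible reading, shown as a sketch, is that the fraction is multiplied by the dataset length and samples at or beyond that cutoff are cached. The helper name `from_percentage_condition`, the inclusive cutoff, and the rounding by truncation are all assumptions.

```python
def from_percentage_condition(p, length):
    """Hypothetical helper: index predicate for 'cache from fraction p onwards'."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    threshold = int(p * length)  # assumed rounding: truncate towards zero
    return lambda index: index >= threshold


condition = from_percentage_condition(0.5, 10)
second_half = [index for index in range(10) if condition(index)]
print(second_half)
```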
- class torchdata.modifiers.Indices(cacher, *indices)
Cache samples if their index is one of those specified.
- Parameters
cacher (torchdata.cacher.Cacher) – Instance of cacher
*indices (int) – Indices of samples to cache
- class torchdata.modifiers.Lambda(function: Callable, cacher)
Cache samples if the specified function returns True.
- Parameters
function (Callable) – Single-argument callable; if it returns True, the sample is cached. The index of the sample is passed as the argument.
cacher (torchdata.cacher.Cacher) – Instance of cacher
- class torchdata.modifiers.Modifier
Interface for all modifiers.
Most methods are pre-configured, so the user should not override them. In fact, only condition has to be overridden and __init__ implemented. The constructor should assign cacher to self in order for everything to work, see the example below.
Example implementation of a modifier caching only the first 100 elements (indices 0 to 99) of any td.cacher.Cacher:
import torchdata as td

class ExampleModifier(td.modifiers.Modifier):
    # You have to assign cacher to self.cacher so the modifier works.
    def __init__(self, cacher):
        self.cacher = cacher

    def condition(self, index):
        return index < 100  # Cache if index is smaller than 100
- __and__(other)
If self and other both return True, then use cacher.
Important: self and other should have the same cacher wrapped. The cacher of the first modifier is used no matter what.
- __contains__(index: int) → bool
Acts as an invisible proxy for cacher’s __contains__ method.
User should not override this method. For more information check the torchdata.cacher.Cacher interface.
- Parameters
index (int) – Index of sample
- __getitem__(index: int)
Acts as an invisible proxy for cacher’s __getitem__ method.
User should not override this method. For more information check the torchdata.cacher.Cacher interface.
- Parameters
index (int) – Index of sample
- __or__(other)
If self or other returns True, then use cacher.
User should not override this method.
Important: self and other should have the same cacher wrapped. Otherwise an exception is thrown. The cacher of the first modifier is used in such case.
- __setitem__(index: int, data: Any) → None
Acts as an invisible proxy for cacher’s __setitem__ method.
User should not override this method. For more information check the torchdata.cacher.Cacher interface.
- Parameters
index (int) – Index of sample
data (typing.Any) – Data generated by dataset.
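The proxy methods above can be pictured with a self-contained stand-in. `SketchModifier` and the plain `dict` used as a cacher are hypothetical, and the gating of `__setitem__` and `__contains__` by `condition` is an assumption about how the interface behaves, not a copy of the library code.

```python
class SketchModifier:
    """Hypothetical stand-in for the Modifier interface (not the library class)."""

    def __init__(self, cacher):
        self.cacher = cacher  # any mapping-like object works for this sketch

    def condition(self, index):
        return index < 100  # cache only indices below 100

    def __contains__(self, index):
        # Assumed: an index counts as cached only if the condition allows it.
        return self.condition(index) and index in self.cacher

    def __setitem__(self, index, data):
        # Assumed: samples failing the condition are silently skipped.
        if self.condition(index):
            self.cacher[index] = data

    def __getitem__(self, index):
        return self.cacher[index]


storage = {}
modifier = SketchModifier(storage)
modifier[5] = "sample 5"
modifier[500] = "sample 500"  # skipped, because condition(500) is False
print(sorted(storage))
```

In this sketch only `condition` carries the caching policy, which matches the advice above to override nothing else.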
- class torchdata.modifiers.UpToIndex(index: int, cacher)
Cache samples up to the specified index, leaving the rest untouched.
- Parameters
index (int) – Index of sample
cacher (torchdata.cacher.Cacher) – Instance of cacher
- class torchdata.modifiers.UpToPercentage(p: float, length: int, cacher)
Cache samples up to the specified percentage, leaving the rest untouched.
- Parameters
p (float) – Percentage specified as a float in [0, 1].
length (int) – How many samples are in the dataset. You can pass len(dataset).
cacher (torchdata.cacher.Cacher) – Instance of cacher