torchtraining.accelerators¶
Accelerators enabling distributed (multi-GPU/multi-node) training.
Accelerators should be instantiated only once and used on the top-most module (in the following order):
- epoch (if exists)
- iteration (if exists)
- step
Those are the only objects which can be “piped” into producers, for example:
tt.accelerators.Horovod(...) ** tt.iterations.Iteration(...)
They should be used in this way (although it is not always necessary).
See the horovod module for an example.
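A minimal sketch of this ordering is shown below. Argument lists are elided (as in the snippet above) and the variable names are illustrative only, not part of the documented API; consult the horovod module for a complete recipe.

import torchtraining as tt

# Instantiate the accelerator exactly once (arguments elided, see the class below)...
accelerator = tt.accelerators.Horovod(...)

# ...and pipe it into the single top-most object you actually use
# (an epoch if you have one, otherwise an iteration, otherwise a step):
accelerator ** tt.iterations.Iteration(...)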
class torchtraining.accelerators.Horovod(model, rank: int = 0, per_worker_threads: int = None, comm=None)[source]¶
Accelerate training using Uber’s Horovod framework.

See the torchtraining.accelerators.horovod package for more information.

Note

IMPORTANT: This object needs the horovod Python package to be visible. You can install it with pip install -U torchtraining[horovod]. You should also export the CUDA_HOME variable, e.g. CUDA_HOME=/opt/cuda pip install -U torchtraining[horovod] (your path may vary).

Parameters
- module (torch.nn.Module) – Module to be broadcast to all processes.
- rank (int, optional) – Root process rank. Default: 0
- per_worker_threads (int, optional) – Number of threads which can be utilized by each process. Default: PyTorch’s default
- comm (List, optional) – List specifying ranks for the communicator, relative to the MPI_COMM_WORLD communicator, OR the MPI communicator to use. The given communicator will be duplicated. If None, Horovod will use the MPI_COMM_WORLD communicator. Default: None
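A hedged usage sketch follows. The network is a placeholder, the keyword arguments simply restate the defaults documented above, and the horovodrun launcher mentioned in the comments is Horovod’s standard launcher rather than anything specific to torchtraining.

import torch
import torchtraining as tt

# Placeholder network; any torch.nn.Module works here.
net = torch.nn.Sequential(
    torch.nn.Linear(784, 128),
    torch.nn.ReLU(),
    torch.nn.Linear(128, 10),
)

# Broadcast the module from rank 0 to every worker; the keyword arguments
# below just restate the documented defaults.
accelerator = tt.accelerators.Horovod(
    net,
    rank=0,                   # root process rank
    per_worker_threads=None,  # fall back to PyTorch's default thread count
    comm=None,                # use the MPI_COMM_WORLD communicator
)

# The accelerator is then piped into the top-most object, as described at the
# top of this page, e.g.:
#     accelerator ** tt.iterations.Iteration(...)
# and the script is typically launched with Horovod's launcher, e.g.:
#     horovodrun -np 4 python train.py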
 
 
Submodules