What is: PipeDream-2BW?
Source | Memory-Efficient Pipeline-Parallel DNN Training |
Year | 2020 |
Data Source | CC BY-SA - https://paperswithcode.com |
PipeDream-2BW is an asynchronous, memory-efficient pipeline-parallel training method: a hybrid form of parallelism that combines data and model parallelism with input pipelining. PipeDream-2BW uses a novel pipelining and weight-gradient coalescing strategy, combined with double buffering of weights, to ensure high throughput, a low memory footprint, and weight update semantics similar to data parallelism. In addition, PipeDream-2BW automatically partitions the model over the available hardware resources while respecting hardware constraints such as the memory capacities of accelerators and the topologies and bandwidths of interconnects. PipeDream-2BW also determines when to employ existing memory-saving techniques, such as activation recomputation, that trade extra computation for a lower memory footprint.
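The double-buffered weight update boils down to a one-version-stale rule, W(t+1) = W(t) - lr * grad f(W(t-1)): gradients from a fixed number of microbatches are coalesced into a single update, and each update is computed against the previous weight version, so a worker only ever keeps two versions alive. Below is a minimal, framework-agnostic sketch of that rule for a single stage with NumPy parameters; the class name `TwoBWStage`, the `forward_backward` interface, and the averaging of coalesced gradients are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

class TwoBWStage:
    """Sketch of 2BW semantics for one pipeline stage (illustrative only)."""

    def __init__(self, params, lr=0.1, microbatches_per_update=4):
        self.current = params                 # version t   (newest weights)
        self.shadow = params.copy()           # version t-1 (kept for in-flight work)
        self.lr = lr
        self.m = microbatches_per_update      # gradients coalesced per update
        self.grad_buffer = np.zeros_like(params)
        self.seen = 0

    def forward_backward(self, grad_fn):
        # Pipelining makes gradients one version stale; the sketch models that
        # by evaluating every gradient against the shadow (t-1) version and
        # coalescing it instead of applying it immediately.
        self.grad_buffer += grad_fn(self.shadow)
        self.seen += 1
        if self.seen == self.m:
            self._apply_update()

    def _apply_update(self):
        # W(t+1) = W(t) - lr * avg coalesced grad computed on W(t-1).
        new_version = self.current - self.lr * self.grad_buffer / self.m
        # Retire the oldest buffer: only two weight versions are ever alive.
        self.shadow, self.current = self.current, new_version
        self.grad_buffer[:] = 0.0
        self.seen = 0

# Toy usage: minimize ||W||^2 (gradient 2W) with 4 microbatches per update.
stage = TwoBWStage(np.ones(3), lr=0.1, microbatches_per_update=4)
for _ in range(16):
    stage.forward_backward(lambda w: 2.0 * w)
print(stage.current)
```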
Its two main features, a double-buffered weight update (2BW) scheme and flush mechanisms, ensure high throughput. PipeDream-2BW splits the model into stages over multiple workers, and each stage is replicated an equal number of times (with data-parallel updates across replicas of the same stage), forming parallel pipelines; a worker-placement sketch follows below. Such parallel pipelines work well for models in which the same layer is repeated many times (e.g., transformer models).
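A hypothetical sketch of the layout these parallel pipelines imply, assuming the model is a stack of identical blocks, the worker count is divisible by the pipeline depth, and stages take contiguous, equal-sized layer ranges; the function name `build_parallel_pipelines` and the rank-to-stage mapping are illustrative, not the paper's automatic planner.

```python
def build_parallel_pipelines(num_layers, num_workers, pipeline_depth):
    """Map layers to stages and workers to (stage, replica) pairs."""
    assert num_workers % pipeline_depth == 0, "every stage must be replicated equally"
    assert num_layers % pipeline_depth == 0, "stages must hold equal layer counts"
    layers_per_stage = num_layers // pipeline_depth

    # Each stage holds a contiguous block of identical layers.
    stage_layers = {
        s: list(range(s * layers_per_stage, (s + 1) * layers_per_stage))
        for s in range(pipeline_depth)
    }
    # Replicas of the same stage form a data-parallel group.
    placement = {
        rank: (rank % pipeline_depth, rank // pipeline_depth)
        for rank in range(num_workers)
    }
    return stage_layers, placement

stages, placement = build_parallel_pipelines(num_layers=24, num_workers=8, pipeline_depth=4)
print(stages[0])     # layers held by stage 0: [0, 1, 2, 3, 4, 5]
print(placement[5])  # worker 5 -> (stage 1, replica 1)
```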