What is: Blink Communication?
Source | Blink: Fast and Generic Collectives for Distributed ML |
Year | 2019 |
Data Source | CC BY-SA - https://paperswithcode.com |
Blink is a communication library for inter-GPU parameter exchange that achieves near-optimal link utilization. To handle topology heterogeneity, which arises from differing hardware generations or from partial allocations by cluster schedulers, Blink dynamically generates optimal communication primitives for a given topology. At runtime, Blink probes the set of links available to a job and builds a topology with appropriate link capacities. Given that topology, Blink achieves the optimal communication rate by packing spanning trees, which can utilize more links than rings can (Lovász, 1976; Edmonds, 1973). The authors use an approximation algorithm based on multiplicative weight updates to quickly compute the maximal packing, and they extend the algorithm to further reduce the number of trees generated. Blink's collectives also extend across multiple machines, effectively utilizing all available network interfaces.
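
The tree-packing idea can be illustrated with a small, simplified sketch. It is not Blink's implementation. It assumes an undirected topology, uses Kruskal's algorithm as the minimum-weight tree oracle, and runs a fixed number of multiplicative-weight rounds before rescaling the result so that no link exceeds its capacity. The names `min_spanning_tree` and `pack_spanning_trees` and the parameters `eps` and `rounds` are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def min_spanning_tree(nodes, edges, length):
    """Kruskal's algorithm. Returns the tree as a sorted tuple of edges."""
    parent = {v: v for v in nodes}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = []
    for e in sorted(edges, key=lambda e: length[e]):
        ru, rv = find(e[0]), find(e[1])
        if ru != rv:  # the edge joins two components, so keep it
            parent[ru] = rv
            tree.append(e)
    return tuple(sorted(tree))


def pack_spanning_trees(nodes, capacity, eps=0.1, rounds=200):
    """Approximate fractional spanning-tree packing (illustrative only).

    capacity maps an undirected edge (u, v) to its bandwidth.
    Returns {tree: rate} where no edge carries more than its capacity.
    """
    edges = list(capacity)
    weight = {e: 1.0 for e in edges}
    rate = defaultdict(float)

    for _ in range(rounds):
        # Price each edge per unit of bandwidth; heavily used edges cost more.
        length = {e: weight[e] / capacity[e] for e in edges}
        tree = min_spanning_tree(nodes, edges, length)
        bottleneck = min(capacity[e] for e in tree)
        rate[tree] += bottleneck
        # Multiplicative update: raise the weight of every edge just used.
        for e in tree:
            weight[e] *= 1.0 + eps * bottleneck / capacity[e]

    # Rescale so the most congested edge is exactly at capacity.
    load = defaultdict(float)
    for tree, r in rate.items():
        for e in tree:
            load[e] += r
    congestion = max(load[e] / capacity[e] for e in load)
    return {tree: r / congestion for tree, r in rate.items()}


# Assumed example topology: four GPUs, fully connected, unit-capacity links.
nodes = [0, 1, 2, 3]
capacity = {(u, v): 1.0 for u in nodes for v in nodes if u < v}
packing = pack_spanning_trees(nodes, capacity)
total_rate = sum(packing.values())
```

For this four-node example the best possible total rate is 2, since the six unit-capacity links can support at most two spanning trees of three edges each. The sketch's `total_rate` stays at or below that bound because of the final rescaling step. A production system would also need directed links, a root per collective, and a step that reduces the number of trees, none of which are modeled here.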