Seminars


HyCache: Hybrid Caching for Accelerating DNN Training Input Pipelines

Series: M.Tech (Research) Colloquium

Speaker: Keshav Vinayak Jha

Date/Time: Mar 08 10:30:00

Location: CSA Lecture Hall (Room No. 112, Ground Floor)

Faculty Advisor: Arkaprava Basu

Abstract:
The training performance of deep neural networks (DNNs) depends on both the compute latency and the latency of fetching the input data needed to train the model. Advances in GPUs and software have delivered substantial gains in GPU compute performance. However, these improvements have shifted the bottleneck to the CPU-based input pipeline, which preprocesses and transforms data before it is fed into the accelerator for training. As the input data progresses through a series of preprocessing steps, each step generates intermediate tensors. Previous works cache intermediate data in either memory or storage, which requires all tensors from an intermediate step to fit within the available memory or disk budget. This may not be possible, because the tensors produced at intermediate steps are often much larger than the raw dataset. Given that modern training systems are equipped with substantial memory and storage, exploiting and optimizing both capacities is crucial. In this paper, we propose Hybrid Cache (HyCache), a technique that automatically caches a subset of tensors from any intermediate step across both memory and disk, without requiring programmer involvement. HyCache targets memory and storage capacities concurrently to maximize resource utilization. The approach offers a user-friendly library interface that determines the optimal tradeoff between recomputing a preprocessing step and caching across memory and storage. HyCache outperforms previous approaches, delivering a performance improvement ranging from 1.11X to 5.3X over the regular preprocessing pipeline.
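
To make the recompute-versus-cache tradeoff concrete, here is a minimal illustrative sketch in Python. It is not HyCache's actual algorithm or interface, neither of which the abstract specifies. The names `Sample` and `plan_hybrid_cache`, the per-byte read costs, and the greedy ranking are all assumptions chosen for illustration. Each intermediate tensor is assigned to memory, disk, or recomputation depending on which budget still has room and whether reading it back is cheaper than regenerating it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    """Hypothetical description of one intermediate tensor."""
    key: str
    size_bytes: int      # size of the tensor if cached (assumed > 0)
    recompute_s: float   # CPU time to regenerate it from raw data


def plan_hybrid_cache(samples, mem_budget, disk_budget,
                      mem_read_s_per_byte, disk_read_s_per_byte):
    """Greedily assign each tensor to 'memory', 'disk', or 'recompute'.

    Tensors are ranked by recompute time saved per cached byte, so the
    tensors that are most expensive to regenerate relative to their size
    claim the faster tier first. This is an illustrative heuristic, not
    an optimal planner.
    """
    placement = {}
    mem_left, disk_left = mem_budget, disk_budget
    ranked = sorted(samples,
                    key=lambda s: s.recompute_s / s.size_bytes,
                    reverse=True)
    for s in ranked:
        # Time saved by reading the cached tensor instead of recomputing it.
        mem_gain = s.recompute_s - s.size_bytes * mem_read_s_per_byte
        disk_gain = s.recompute_s - s.size_bytes * disk_read_s_per_byte
        if mem_gain > 0 and s.size_bytes <= mem_left:
            placement[s.key] = "memory"
            mem_left -= s.size_bytes
        elif disk_gain > 0 and s.size_bytes <= disk_left:
            placement[s.key] = "disk"
            disk_left -= s.size_bytes
        else:
            # Caching does not fit or does not pay off; regenerate instead.
            placement[s.key] = "recompute"
    return placement
```

In this sketch, a tensor falls through to disk only once memory is full, and is recomputed when neither tier has room or when reading it back would take longer than regenerating it. A real system would also need to measure these costs at runtime rather than take them as inputs.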

Speaker Bio:
Keshav Vinayak Jha is an M.Tech (Research) student in the Department of Computer Science and Automation at IISc, Bangalore. He is part of the Computer Systems Lab (CSL) and works under the guidance of Prof. Arkaprava Basu. His primary research area is identifying and alleviating system bottlenecks to speed up AI training applications.

Host Faculty: Arkaprava Basu