Analyzing the Influence of File Formats on I/O Patterns in Deep Learning

Research output: Chapter in Book › Chapter › Research › peer-review

Abstract

In recent years, deep learning applications have become an important tool for analyzing massive amounts of data and making predictions from them. However, these applications place significant input/output (I/O) loads on computer systems. Moreover, when executed on distributed systems or parallel distributed-memory systems, they must read large volumes of data during training. This persistent, continuous file access can overwhelm file systems and degrade application performance. A file format defines how information is stored, and the choice of format depends on the use case. It is therefore important to analyze how the file format influences the training stage when the dataset is loaded and read, since opening and reading many small files can affect application performance. This paper analyzes the I/O patterns of different file formats used in deep learning applications to characterize their behavior.
Original language: English
Title of host publication: Communications in Computer and Information Science. CSCE 2024.
Pages: 130-136
Number of pages: 7
Volume: 2256
ISBN (Electronic): 978-3-031-85638-9
Publication status: Published - 26 Mar 2025

Publication series

Name: Communications in Computer and Information Science
Volume: 2256 CCIS

Keywords

  • Distributed Deep Learning
  • I/O Analysis
  • Parallel I/O
