DeepSeek Presents 3FS, Its Distributed File System for Artificial Intelligence

China's progress in artificial intelligence continues despite United States restrictions on access to advanced chips and manufacturing technology. DeepSeek AI, one of the most innovative companies in the field, has introduced the Fire-Flyer File System (3FS), a distributed file system designed for AI training and inference workloads.

This open-source system is built to take full advantage of modern SSDs and RDMA networks, delivering high aggregate read throughput and fast access to large volumes of data in high-performance environments.


3FS: a file system for the new era of artificial intelligence

As artificial intelligence models become more complex, the need for storage systems that allow for quick and efficient access to data has become fundamental. 3FS has been developed with this purpose in mind, providing a distributed storage solution that enhances performance and scalability in advanced computing environments.

Some of its main features include:

  • Disaggregated architecture: Combines the throughput of thousands of SSDs with the network bandwidth of hundreds of storage nodes, so applications can access data efficiently regardless of where it is physically stored.
  • Strong consistency: Implements Chain Replication with Apportioned Queries (CRAQ), which keeps replicas consistent and makes application code simpler to write and reason about.
  • Standard file interface: Applications use the familiar file API rather than learning a new storage API, while metadata is managed in a transactional key-value store such as FoundationDB.

Thanks to these features, 3FS emerges as an efficient option for data management in computing centers dedicated to training artificial intelligence models.
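
In practice, a standard file interface means existing code can read training data with ordinary file calls. The sketch below assumes nothing about 3FS itself: `read_record` is a hypothetical helper that uses only Python's built-in file API, and the demo reads a temporary local file as a stand-in for a file on a mounted 3FS volume.

```python
import os
import tempfile


def read_record(path, offset, length):
    """Random read of `length` bytes at `offset` using only standard file calls."""
    with open(path, "rb") as f:
        f.seek(offset)
        return f.read(length)


def demo():
    # Stand-in for a dataset file; on a real deployment `path` would point
    # into wherever the file system is mounted (mount location is an assumption).
    with tempfile.TemporaryDirectory() as directory:
        path = os.path.join(directory, "samples.bin")
        with open(path, "wb") as f:
            f.write(bytes(range(256)))
        return read_record(path, offset=16, length=4)
```

The same `read_record` function would work unchanged on a local disk, an NFS share, or any file system exposed through a mount point, which is the practical benefit of not introducing a new storage API.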


Performance that redefines storage in HPC environments

DeepSeek has tested 3FS under various heavy load conditions, achieving remarkable results:

  • In a cluster of 180 storage nodes, each equipped with sixteen 14 TiB NVMe SSDs and two 200 Gbps InfiniBand NICs, it reached an aggregate read throughput of about 6.6 TiB/s (roughly 7.26 TB/s) under stress tests.
  • In the GraySort benchmark, which measures performance in sorting large volumes of data, 3FS processed 110.5 TiB in 30 minutes and 14 seconds, achieving a rate of 3.66 TiB/minute on a 25-node cluster.
  • In inference tasks with language models, KVCache on 3FS achieved a peak read throughput of over 40 GiB/s per node, allowing key-value caches to be stored on SSDs rather than relying heavily on DRAM.
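
The arithmetic behind these figures is easy to check. The short sketch below uses only the numbers quoted in the list above (the helper names are made up for illustration) to convert binary units to decimal ones, break the aggregate read throughput down per node and per SSD, and recompute the GraySort rate.

```python
TIB = 2**40   # bytes in a tebibyte
GIB = 2**30   # bytes in a gibibyte
TB = 10**12   # bytes in a decimal terabyte


def tib_to_tb(tib):
    """Convert tebibytes to decimal terabytes."""
    return tib * TIB / TB


def per_node_gib_s(aggregate_tib_s, nodes):
    """Average read throughput per storage node, in GiB/s."""
    return aggregate_tib_s * TIB / GIB / nodes


def sort_rate_tib_per_min(total_tib, minutes, seconds):
    """Average sort throughput in TiB per minute."""
    return total_tib / (minutes + seconds / 60)


if __name__ == "__main__":
    node_rate = per_node_gib_s(6.6, 180)
    print(f"aggregate read: {tib_to_tb(6.6):.2f} TB/s")
    print(f"per node:       {node_rate:.1f} GiB/s")
    print(f"per SSD:        {node_rate / 16:.2f} GiB/s")
    print(f"GraySort:       {sort_rate_tib_per_min(110.5, 30, 14):.2f} TiB/min")
```

The recomputed GraySort rate agrees with the reported 3.66 TiB/minute to within rounding, and the per-SSD figure shows that the aggregate number corresponds to each drive sustaining a few GiB/s of reads.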

These results reflect 3FS’s capability to overcome traditional storage bottlenecks and improve efficiency in artificial intelligence and high-performance computing (HPC) tasks.


Impact on the industry and advantages over traditional solutions

Efficient data storage is a key challenge in developing artificial intelligence models. DeepSeek has been using 3FS internally since 2019, integrating it into its infrastructure to enhance model training with lower resource consumption.

According to the company, its storage system has achieved 80% of the performance of an NVIDIA DGX-A100 server at only 50% of the cost and 60% of the energy consumption. This represents a significant competitive advantage for companies looking to reduce costs without sacrificing much performance.
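
Taking the company's percentages at face value, the implied efficiency gains follow from simple ratios. The snippet below is only that arithmetic, with the DGX-A100 baseline normalized to 1.0 and a hypothetical helper name.

```python
def relative_efficiency(performance, resource):
    """Performance delivered per unit of resource, relative to a baseline of 1.0."""
    return performance / resource


# Figures quoted by the company, as fractions of the DGX-A100 baseline.
PERFORMANCE = 0.80
COST = 0.50
ENERGY = 0.60

perf_per_cost = relative_efficiency(PERFORMANCE, COST)
perf_per_energy = relative_efficiency(PERFORMANCE, ENERGY)

if __name__ == "__main__":
    print(f"performance per unit cost:   {perf_per_cost:.2f}x baseline")
    print(f"performance per unit energy: {perf_per_energy:.2f}x baseline")
```

In other words, the claim amounts to about 1.6 times the performance per unit of cost and about 1.33 times the performance per unit of energy compared with the baseline.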

Another crucial aspect is its accessibility, as 3FS has been released as open-source software, allowing researchers and companies to leverage this technology for their own artificial intelligence applications.

The source code and official documentation for the file system can be found on GitHub:
Official 3FS repository on GitHub


Conclusion: 3FS sets a new standard in storage for AI

The development of the Fire-Flyer File System (3FS) positions DeepSeek as one of the most innovative companies in the field of artificial intelligence. By providing a scalable, efficient, and high-performance storage solution, the company demonstrates that China has not only caught up with its competitors in AI but is also leading advances in the technological infrastructure the sector will need.

With the growing demand for optimized solutions for model training, 3FS could become a key tool for data centers, research institutions, and companies looking to enhance their capabilities in artificial intelligence without relying on proprietary technologies.

via: AI News
