The Art of File Reduction: History, Evolution, and Comparison of the Most Used Compression Formats

From ZIP to Zstandard, including extinct formats like ARJ, LZH or Quantum, this is how data compression has evolved across different platforms and operating systems.

File compression has accompanied the development of computing since its beginnings. Its objective has always been the same: reduce the size of data to save space, accelerate network transmission, and facilitate storage or backup. From the 1980s to the present, dozens of algorithms and formats have been developed, some now obsolete, others that remain current and even in constant evolution.

This article covers the history of the main compression formats, including some lesser-known ones, their extension, compatibility, advantages, limitations, and curiosities.

From ARC to Quantum: The Prehistory of Compression

In the 1980s, compression was a critical necessity for 360 KB floppy disks and slow modem connections. Some of the first formats included:

FormatExtensionYearCurrent StatusNotes
ARC.arc1985Obsolete since ~1990Caused legal controversy; replaced by ZIP
ARJ.arj1990Obsolete since ~2000Widely used in BBS and floppy disks, great efficiency for its time
LZH/LHA.lzh1988In disuse since ~2010Popular in Japan and AmigaOS, used in some games
Q (Quantum).q1991Discontinued in 1994High compression in MS-DOS, replaced by ZIP
Zoo.zoo1986Obsolete since ~1995Compression format for Unix

Extended Table of Modern and Ancient Formats

FormatExtensionCompressionAdvantagesDisadvantagesCompatibilityStatus
ZIP.zipMediumVery compatible, fastLower compression than othersWindows, macOS, Linux, mobileActive
RAR.rarHighGood ratio, error recoveryProprietaryWindows, macOS, Linux (limited), mobile appsActive (WinRAR 6.x)
7z.7zVery highOpen source, strong encryption, solid compressionSlow in some casesWindows, macOS, Linux, mobileActive
TAR.GZ.tar.gzHigh (gzip)Linux standard, good speedNo direct navigationLinux, Unix, Windows with toolsActive
XZ.xzVery highGreat compressionSlow, intensive CPU usageLinux, Windows, macOSActive
Zstandard.zstHighVery fast, modern, ideal for backupsNot yet widely adoptedLinux, macOS, Windows (new tools)Active
LZMA.lzmaVery highHigh compressionLess support than .7zLinux, WindowsActive
BZIP2.bz2HighBetter than gzip, losslessVery slowLinux, Windows, macOSActive
ARJ.arjHighEfficient in its timeObsolete, limited supportOnly old CLI or retro softwareObsolete (~2000)
LZH.lzhMediumUsed in video games and Japanese softwareLow compression, no supportJapan, retrocomputingObsolete (~2010)
ZOO.zooMediumCross-platform in its timeAlmost no current supportRetro UnixObsolete (~1995)
CAB.cabMediumFormat used by Microsoft for installersWindows only, no encryptionWindowsActive (limited use)
ACE.aceHighGood ratio, was popular in the 2000sAbandoned, vulnerabilitiesOnly old versions of WinRARAbandoned (since ~2007)

Platforms and Tools by Operating System

Windows

  • Native support: ZIP, CAB
  • Key tools: 7-Zip, WinRAR, Bandizip, PeaZip

macOS

  • Native support: ZIP, TAR, GZ, XZ
  • Tools: Keka, The Unarchiver, BetterZip

Linux

  • Complete terminal support: ZIP, 7z, TAR, GZ, BZ2, XZ, Zstandard
  • Utilities: tar, gzip, xz, zstd, p7zip

Android and iOS

  • Apps: ZArchiver, RAR for Android, WinZip, iZip
  • Limitations: Some encrypted or solid formats require paid apps

Curious Cases and Compression Records

ENWIK8 (Wikipedia in text, 100 MB):

  • ZIP: ~30 MB
  • 7z: ~20 MB
  • Zstandard: ~22 MB
  • PAQ8px: 13.3 MB
  • CMIX: up to 12.7 MB (but takes several hours)

Compression challenges: There are competitions like the Hutter Prize, which reward the algorithm that manages to best compress a copy of Wikipedia, seeking theoretical limits of lossless compression.

Compression Formats by Use Context

ContextIdeal FormatReason
General filesZIP, 7zCompatibility and balanced compression
Mass backupsTAR.ZST, 7zSolid and efficient compression
Linux distributionsTAR.GZ, TAR.XZTraditional and universally supported
Multimedia contentRAR, 7zSolid compression, volume splitting
Mobile appsZIP, RARApp limitations
Extreme compressionPAQ, CMIX, ZPAQOnly for testing or historical archiving

What Does the Future Hold?

Data growth doesn’t stop. Although technologies like real-time compression of file systems (ZFS, Btrfs, ReFS) or deduplication have changed part of the paradigm, classic formats remain alive. Projects like Zstandard or ZPAQ point to a future with smarter, adaptive, and faster algorithms, leveraging multi-core CPUs and abundant memory.

Additionally, content-specific lossless compression (such as PNG images, AV1 video, FLAC audio) continues to evolve in parallel, optimizing resources for each medium.

Conclusion

Far from being relegated by hardware advances or cheaper storage, compression formats remain a key tool for computing efficiency. Whether saving a backup, sending documents by email, or distributing software, choosing the right format can make a big difference.

From the ancient ARJ to the modern Zstandard, compression is a technical art that has evolved with each generation of computers. And, judging by current advances, it hasn’t said its last word yet.

  • https://7-zip.org
  • https://www.rarlab.com
  • https://facebook.github.io/zstd
  • https://peazip.github.io/
  • https://www.keka.io/
  • Wikipedia: «Comparison of file archivers»
  • Archive.org: Technical documentation of old formats

Via: Social Media News

Scroll to Top