January 28, 2022

it-cooking

Success is just one script away

2018 Comparison of Popular Archive Utilities

(Last Updated On: May 14, 2018)

A 2018 comparison of popular archive, backup and compression utilities: tar, ZIP, gzip, RAR and 7-zip

Definition of Backup vs Archive

An archive is a collection of historical records that are kept for long-term retention and used for future reference. Typically, archives contain data that is not actively used.

A backup is a copy of a data set, while an archive holds original data that has been removed from its original location (Dorion 2008).

The two are close in practice; the distinguishing point of an archive is that it holds data meant to be removed from its active state.
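To make the distinction concrete, here is a minimal sketch using tar alone: the backup leaves its source in place, while the archive step removes the source once the archive has been checked. The directory and file names (project, old-logs) are made up for illustration.

```shell
#!/bin/sh
# Sketch: backup (copy) versus archive (move). All names are hypothetical.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir project old-logs
echo "active data" > project/notes.txt
echo "stale data"  > old-logs/2017.log

# Backup: make a copy, the original stays where it is.
tar -cf project-backup.tar project

# Archive: store the data, check that the archive lists it, then remove the original.
tar -cf old-logs-archive.tar old-logs
if tar -tf old-logs-archive.tar | grep -q '2017.log'; then
    rm -r old-logs
fi

ls
```

After the run, project still exists next to its backup, while old-logs only survives inside old-logs-archive.tar.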

 

TAR

Tar is an archive utility that simply stacks files sequentially, given as arguments or read from a list, into a single file. Each file is preceded by a 512-byte header block that stores its name, size, permissions and other metadata, and the archive ends with blocks of zero bytes. It was developed for UNIX in 1979 for tape archive recorders (Deutsch and Aladdin Enterprises 1996). The original tar cannot span files, cannot compress, cannot encrypt, and cannot back up unnamed pipes. Compression is achieved on the fly by piping the output to compress or gzip. Modern versions of tar can call local Unix compressors such as compress or gzip directly, and there are several implementations such as gtar (GNU tar). This is confusing because most of the time the command keeps the same name, and one cannot guess its capabilities without printing its usage with tar --help.

Option arguments do not always require dashes (“Tar(5) — Format Of Tape Archive Files” 2004).
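A quick way to check this on a given system is to build the same archive twice, once without the dash and once with it. The sketch below also prints the archive size, since tar writes its output in 512-byte blocks. File names are hypothetical, and any reasonably recent GNU tar or bsdtar is assumed.

```shell
#!/bin/sh
# Sketch: old-style (no dash) and short-option (dash) invocations are equivalent.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

echo "hello" > foo

tar cf  old-style.tar  foo    # bundled option letters, no dash
tar -cf dash-style.tar foo    # same options with a dash

# Both archives should list the same content.
old_listing=$(tar tf old-style.tar)
dash_listing=$(tar -tf dash-style.tar)
echo "old style:  $old_listing"
echo "dash style: $dash_listing"

# tar writes 512-byte blocks, so the archive size is a multiple of 512.
size=$(wc -c < old-style.tar)
echo "archive size: $size bytes, remainder mod 512: $((size % 512))"
```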

tar usage (truncated)

Usage: tar [OPTION...] [FILE]...
GNU 'tar' saves many files together into a single tape or disk archive, and can
restore individual files from the archive.

Examples:
  tar -cf archive.tar foo bar  # Create archive.tar from files foo and bar.
  tar -tvf archive.tar         # List all files in archive.tar verbosely.
  tar -xf archive.tar          # Extract all files from archive.tar.

 Main operation mode:

  -A, --catenate, --concatenate   append tar files to an archive
  -c, --create               create a new archive
  -d, --diff, --compare      find differences between archive and file system
      --delete               delete from the archive (not on mag tapes!)
  -r, --append               append files to the end of an archive
  -t, --list                 list the contents of an archive
      --test-label           test the archive volume label and exit
  -u, --update               only append files newer than copy in archive
  -x, --extract, --get       extract files from an archive

 Operation modifiers:

      --check-device         check device numbers when creating incremental
                             archives (default)
  -g, --listed-incremental=FILE   handle new GNU-format incremental backup
  -G, --incremental          handle old GNU-format incremental backup
      --ignore-failed-read   do not exit with nonzero on unreadable files
      --level=NUMBER         dump level for created listed-incremental archive
  -n, --seek                 archive is seekable
      --no-check-device      do not check device numbers when creating
                             incremental archives
      --no-seek              archive is not seekable
      --occurrence[=NUMBER]  process only the NUMBERth occurrence of each file
                             in the archive; this option is valid only in
                             conjunction with one of the subcommands --delete,
                             --diff, --extract or --list and when a list of
                             files is given either on the command line or via
                             the -T option; NUMBER defaults to 1
      --sparse-version=MAJOR[.MINOR]
                             set version of the sparse format to use (implies
                             --sparse)
  -S, --sparse               handle sparse files efficiently

 

tar examples

Pipe a file list to tar (-T - reads the file names from standard input):

<command that produces file list> | tar cf backupfile.tar -T -

List the content of a tar file:

tar tvf backupfile.tar

Back up a directory recursively:

tar cf backupfile.tar /path/to/backup

Back up a directory recursively with gzip at maximum compression:

tar cf - /path/to/compress | gzip -9 > backupfile.tar.gz

Back up with gzip and rename directories inside the backup file:

tar zcvf backupfile.tar.gz --transform=s/path2rename/newName/ path2backup
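The commands above can be chained into a small round trip. This is only a sketch under a few assumptions: a tar that supports -T (GNU tar and bsdtar do), gzip on the PATH, and a throwaway directory layout invented for the example.

```shell
#!/bin/sh
# Sketch: feed a file list to tar on stdin, compress with gzip, then restore.
# Assumes a tar that supports -T; all names are hypothetical.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p data/sub
echo "one" > data/a.txt
echo "two" > data/sub/b.txt
echo "skip me" > data/c.log

# Only the .txt files are listed by find; "-T -" makes tar read names from stdin.
find data -name '*.txt' | tar -cf - -T - | gzip -9 > backup.tar.gz

# List the content without extracting.
listing=$(gzip -dc backup.tar.gz | tar -tf - | sort)
echo "$listing"

# Restore into a separate directory and compare with the source.
mkdir restore
gzip -dc backup.tar.gz | tar -xf - -C restore
cmp data/a.txt restore/data/a.txt && echo "a.txt restored intact"
```

Since the list comes from find, data/c.log never enters the archive and is absent from restore.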

 

ZIP

Zip was created in 1989 by Phil Katz to replace competing formats such as ARC, and traditionally uses DEFLATE compression (just like gzip). It has been extensively developed, maintained, and openly documented by PKWARE (PKWARE 2017). It is a de facto industry standard that now handles more algorithms and supports file spanning and encryption (ZipCrypto and AES). It is the most popular archive format in the world, in part because DEFLATE is so fast when creating archives. It has been integrated into operating systems such as macOS and Windows, whose file explorers can create compressed folders on the fly. Its primary behavior makes it a backup utility (Adler 2008).

Algorithms:

  • Deflate Standard LZ77-based algorithm
  • Deflate64 Standard LZ77-based algorithm
  • BZip2 Standard BWT algorithm
  • PPMD Dmitry Shkarin’s PPMdH with small changes
  • LZMA Improved and optimized version of LZ77 algorithm

zip usage

On Unix, the zip utility is split into two binaries: zip to compress and unzip to decompress:

Copyright (c) 1990-2008 Info-ZIP - Type 'zip "-L"' for software license.
Zip 3.0 (July 5th 2008). Usage:
zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
  The default action is to add or replace zipfile entries from list, which
  can include the special name - to compress standard input.
  If zipfile and list are omitted, zip compresses stdin to stdout.
  -f   freshen: only changed files  -u   update: only changed or new files
  -d   delete entries in zipfile    -m   move into zipfile (delete OS files)
  -r   recurse into directories     -j   junk (don't record) directory names
  -0   store only                   -l   convert LF to CR LF (-ll CR LF to LF)
  -1   compress faster              -9   compress better
  -q   quiet operation              -v   verbose operation/print version info
  -c   add one-line comments        -z   add zipfile comment
  -@   read names from stdin        -o   make zipfile as old as latest entry
  -x   exclude the following names  -i   include only the following names
  -F   fix zipfile (-FF try harder) -D   do not add directory entries
  -A   adjust self-extracting exe   -J   junk zipfile prefix (unzipsfx)
  -T   test zipfile integrity       -X   eXclude eXtra file attributes
  -y   store symbolic links as the link instead of the referenced file
  -e   encrypt                      -n   don't compress these suffixes
  -h2  show more help

 

unzip usage

UnZip 6.00 of 20 April 2009, by Info-ZIP.  Maintained by C. Spieler.  Send
bug reports using http://www.info-zip.org/zip-bug.html; see README for details.

Usage: unzip [-Z] [-opts[modifiers]] file[.zip] [list] [-x xlist] [-d exdir]
  Default action is to extract files in list, except those in xlist, to exdir;
  file[.zip] may be a wildcard.  -Z => ZipInfo mode ("unzip -Z" for usage).

  -p  extract files to pipe, no messages     -l  list files (short format)
  -f  freshen existing files, create none    -t  test compressed archive data
  -u  update files, create if necessary      -z  display archive comment only
  -v  list verbosely/show version info       -T  timestamp archive to latest
  -x  exclude files that follow (in xlist)   -d  extract files into exdir
modifiers:
  -n  never overwrite existing files         -q  quiet mode (-qq => quieter)
  -o  overwrite files WITHOUT prompting      -a  auto-convert any text files
  -j  junk paths (do not make directories)   -aa treat ALL files as text
  -U  use escapes for all non-ASCII Unicode  -UU ignore any Unicode fields
  -C  match filenames case-insensitively     -L  make (some) names lowercase
  -X  restore UID/GID info                   -V  retain VMS version numbers
  -K  keep setuid/setgid/tacky permissions   -M  pipe through "more" pager
See "unzip -hh" or unzip.txt for more help.  Examples:
  unzip data1 -x joe   => extract all files except joe from zipfile data1.zip
  unzip -p foo | more  => send contents of foo.zip via pipe into program more
  unzip -fo foo ReadMe => quietly replace existing ReadMe if archive file newer

     

    zip examples

    Create an archive:

    zip archive.zip file(s)

    Create an archive recursively

    zip -r archive.zip directory(s)

    List archive content:

    unzip -l archive.zip

    Check integrity of an archive:

    zip -T archive.zip

    unzip -t archive.zip

    Extract an archive:

    unzip archive.zip

    Update (refresh) files inside an archive:

    zip -u archive.zip file(s)

    Zip list of files returned from the find command:

    find . -name "pattern" -print | zip archive.zip -@
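Put together, the zip examples give a round trip along these lines. It is a sketch that assumes Info-ZIP's zip and unzip are installed and does nothing otherwise; the docs directory and the zip_status variable are invented for the example.

```shell
#!/bin/sh
# Sketch: zip round trip with Info-ZIP's zip and unzip; skipped when not installed.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p docs/drafts
echo "chapter one" > docs/intro.txt
echo "rough notes" > docs/drafts/todo.txt

if command -v zip >/dev/null 2>&1 && command -v unzip >/dev/null 2>&1; then
    # Build the file list with find and hand it to zip through -@.
    find docs -type f | zip -q archive.zip -@

    # Test integrity, then extract into a separate directory.
    unzip -tq archive.zip
    unzip -q archive.zip -d restore
    cmp docs/intro.txt restore/docs/intro.txt && zip_status="ok"
else
    zip_status="skipped"
    echo "zip or unzip not found, nothing done"
fi
echo "zip round trip: ${zip_status:-failed}"
```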

     

    gzip

    Gzip is an inline compression utility that compresses data streams piped through it, or compresses files in place. It uses the DEFLATE algorithm and was developed in 1992 to replace the compress program from early UNIX systems (J.-L. Gailly 2003). The “G” stands for GNU, the project at the heart of the free software movement (Free Software Foundation 2018). Because it can compress on the fly, the format is used today as the standard HTTP compression offered by virtually every web server (J. Gailly 2017). It does not support file spanning. Its primary behavior (in-place compression) makes it an archive utility.

    Algorithms:

    • Deflate Standard LZ77-based algorithm

     

    gzip Usage

    Usage: gzip [OPTION]... [FILE]...
    Compress or uncompress FILEs (by default, compress FILES in-place).
    
    Mandatory arguments to long options are mandatory for short options too.
    
      -c, --stdout      write on standard output, keep original files unchanged
      -d, --decompress  decompress
      -f, --force       force overwrite of output file and compress links
      -h, --help        give this help
      -k, --keep        keep (don't delete) input files
      -l, --list        list compressed file contents
      -L, --license     display software license
      -n, --no-name     do not save or restore the original name and time stamp
      -N, --name        save or restore the original name and time stamp
      -q, --quiet       suppress all warnings
      -r, --recursive   operate recursively on directories
      -S, --suffix=SUF  use suffix SUF on compressed files
      -t, --test        test compressed file integrity
      -v, --verbose     verbose mode
      -V, --version     display version number
      -1, --fast        compress faster
      -9, --best        compress better
      --rsyncable       Make rsync-friendly archive
    
    With no FILE, or when FILE is -, read standard input.
    

     

    gzip Examples

    Compress a file in place and delete original file (produces file.gz):

    gzip file

    Compress file to file.gz and keep original:

    gzip -k file

    gzip -c file > file.gz

    Decompress an archive and remove it:

    gzip -d file.gz

    Compress the output of mysqldump into a gzipped file:

    mysqldump --opt <database> | gzip -c > database.sql.gz
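The stream mode and the in-place mode can be tried side by side without a database. In this sketch, fake_dump is a made-up stand-in for any command that writes to standard output, and the file names are hypothetical.

```shell
#!/bin/sh
# Sketch: gzip in stream mode and in in-place mode; file names are hypothetical.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Stand-in for any command that writes a dump to stdout (mysqldump, for example).
fake_dump() {
    echo "CREATE TABLE t (id INT);"
    echo "INSERT INTO t VALUES (1);"
}

# Stream mode: compress whatever arrives on stdin.
fake_dump | gzip -c > dump.sql.gz

# -t checks integrity; -dc decompresses to stdout and leaves the .gz file in place.
gzip -t dump.sql.gz && echo "integrity check passed"
first_line=$(gzip -dc dump.sql.gz | head -n 1)
echo "first line: $first_line"

# In-place mode: plain.sql is replaced by plain.sql.gz.
fake_dump > plain.sql
gzip plain.sql
ls
```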

     

    RAR

    RAR is another backup compression utility, developed by a Russian engineer in 1993, with a proprietary, licensed algorithm (win.rar GmbH 2018). It has been updated over time and can now decompress a variety of formats. It supports encryption and file spanning (win.rar 2018). On Unix systems, the rar utilities are split in two: while unrar is publicly available, installing rar requires an additional proprietary repository, which few users bother with since xz and 7zip are free and preferred. Because of its license, rar usage appears to be declining overall.

    Algorithms:

    • RAR proprietary
    • LZSS Lempel-Ziv
    • Deflate Standard LZ77-based algorithm

     

    Unrar usage

    UNRAR 5.30 beta 4 freeware      Copyright (c) 1993-2015 Alexander Roshal
    
    Usage:     unrar <command> -<switch 1> -<switch N> <archive> <files...>
                   <@listfiles...> <path_to_extract\>
    
    <Commands>
      e             Extract files without archived paths
      l[t[a],b]     List archive contents [technical[all], bare]
      p             Print file to stdout
      t             Test archive files
      v[t[a],b]     Verbosely list archive contents [technical[all],bare]
      x             Extract files with full path
    
    <Switches>
      -             Stop switches scanning
      @[+]          Disable [enable] file lists
      ad            Append archive name to destination path
      ag[format]    Generate archive name using the current date
      ai            Ignore file attributes
      ap<path>      Set path inside archive
      c-            Disable comments show
      cfg-          Disable read configuration
      cl            Convert names to lower case
      cu            Convert names to upper case
      dh            Open shared files
      ep            Exclude paths from names
      ep3           Expand paths to full including the drive letter
      f             Freshen files
      id[c,d,p,q]   Disable messages
      ierr          Send all messages to stderr
      inul          Disable all messages
      kb            Keep broken extracted files
      n<file>       Additionally filter included files
      n@            Read additional filter masks from stdin
      n@<list>      Read additional filter masks from list file
      o[+|-]        Set the overwrite mode
      ol[a]         Process symbolic links as the link [absolute paths]
      or            Rename files automatically
      ow            Save or restore file owner and group
      p[password]   Set password
      p-            Do not query password
      r             Recurse subdirectories
      sc<chr>[obj]  Specify the character set
      sl<size>      Process files with size less than specified
      sm<size>      Process files with size more than specified
      ta<date>      Process files modified after <date> in YYYYMMDDHHMMSS format
      tb<date>      Process files modified before <date> in YYYYMMDDHHMMSS format
      tn<time>      Process files newer than <time>
      to<time>      Process files older than <time>
      ts<m,c,a>[N]  Save or restore file time (modification, creation, access)
      u             Update files
      v             List all volumes
      ver[n]        File version control
      vp            Pause before each volume
      x<file>       Exclude specified file
      x@            Read file names to exclude from stdin
      x@<list>      Exclude files listed in specified list file
      y             Assume Yes on all queries
    

     

    Winrar examples

    On Windows, installing the WinRAR software also gives access to command line utilities that can compress and decompress using the RAR algorithm. The unrar command takes the same arguments as on Unix systems:

    Create a zip backup:

    winrar a -afzip backup.zip file(s)

    Create a rar backup:

    winrar a backup.rar file(s)

    Create a rar backup recursively:

    winrar a -r backup.rar path

    Test backup integrity:

    unrar t backup.rar

    List backup content:

    unrar v backup.rar

    Extract specific files in a backup file + directory structure:

    unrar x backup.rar *.ext [extractfolder\]

    Extract backup without directory structure:

    unrar e backup.rar [extractfolder\]

     

    7-zip

    7zip is a modern, free (Pavlov 2018a) backup compression utility developed in 1999 by another Russian engineer. It uses a variety of algorithms to compress and decompress many backup formats, including tar. Due to licensing, it can extract RAR archives but cannot create them (Pavlov 2018b). Its flagship algorithm was LZMA and is now LZMA2 (Pavlov 2018c). It offers one of the best compression ratios available, at the cost of speed. It uses a very large dictionary with newer coding techniques, which explains both the ratio and the slow compression. Surprisingly, decompression is almost as fast as DEFLATE. It supports file spanning and AES encryption.

    One can install p7zip on Linux, but a more common utility called xz also uses LZMA/LZMA2 compression (Collin 2016). Just like gzip, xz compresses data streams and cannot store multiple files on its own (Himanshu 2012).

    Algorithms:

    • LZMA2 Improved version of LZMA
    • LZMA Improved and optimized version of LZ77 algorithm
    • PPMD Dmitry Shkarin’s PPMdH with small changes
    • BZip2 Standard BWT algorithm
    • BCJ Converter for 32-bit x86 executables
    • BCJ2 Converter for 32-bit x86 executables
    • Deflate Standard LZ77-based algorithm

    p7zip: Differences between 7z, 7za and 7zr binaries

    The package p7zip usually includes three binaries, 7z, 7za, and 7zr (Vinet and Griffin 2017). Their differences are:

    • 7z: uses plugins to handle archives.
    • 7za: a stand-alone executable. 7za handles fewer archive formats than 7z, but does not need any plugin.
    • 7zr: a stand-alone executable. 7zr does not need any plugin either, and is a light version of 7za that only handles 7z archives.
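Since a given machine may ship only some of these binaries, a script can probe for them in order of capability. A minimal sketch, where the variable name sevenzip is my own choice:

```shell
#!/bin/sh
# Sketch: pick the first p7zip binary found, from most to least capable.
sevenzip=""
for candidate in 7z 7za 7zr; do
    if command -v "$candidate" >/dev/null 2>&1; then
        sevenzip=$candidate
        break
    fi
done

if [ -n "$sevenzip" ]; then
    echo "using $sevenzip"
else
    echo "no p7zip binary found"
fi
```

Later commands can then call "$sevenzip" instead of hard-coding one of the three names.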

     

    7zip Usage

    7-Zip (a) 9.38 beta  Copyright (c) 1999-2014 Igor Pavlov  2015-01-03
    p7zip Version 9.38.1 (locale=C,Utf16=off,HugeFiles=on,2 CPUs)
    
    Usage: 7za <command> [<switches>...] <archive_name> [<file_names>...]
           [<@listfiles...>]
    
    <Commands>
      a : Add files to archive
      b : Benchmark
      d : Delete files from archive
      e : Extract files from archive (without using directory names)
      h : Calculate hash values for files
      l : List contents of archive
      rn : Rename files in archive
      t : Test integrity of archive
      u : Update files to archive
      x : eXtract files with full paths
    <Switches>
      -- : Stop switches parsing
      -ai[r[-|0]]{@listfile|!wildcard} : Include archives
      -ax[r[-|0]]{@listfile|!wildcard} : eXclude archives
      -bd : Disable percentage indicator
      -i[r[-|0]]{@listfile|!wildcard} : Include filenames
      -m{Parameters} : set compression Method
      -o{Directory} : set Output directory
      -p{Password} : set Password
      -r[-|0] : Recurse subdirectories
      -scs{UTF-8|UTF-16LE|UTF-16BE|WIN|DOS|{id}} : set charset for list files
      -sfx[{name}] : Create SFX archive
      -si[{name}] : read data from stdin
      -slt : show technical information for l (List) command
      -so : write data to stdout
      -ssc[-] : set sensitive case mode
      -t{Type} : Set type of archive
      -u[-][p#][q#][r#][x#][y#][z#][!newArchiveName] : Update options
      -v{Size}[b|k|m|g] : Create volumes
      -w[{path}] : assign Work directory. Empty path means a temporary directory
      -x[r[-|0]]]{@listfile|!wildcard} : eXclude filenames
      -y : assume Yes on all queries
    

     

    Examples

    Create a backup:

    7z a backup.7z archiveDir

    List backup content:

    7z l backup.7z

    Check integrity of a backup:

    7z t backup.7z

    Extract a backup, keeping the directory structure:

    7z x backup.7z

    Update (refresh) files inside a backup:

    7z u backup.7z backupDir

    Create a max compression LZMA backup with tar and xz on the fly:

    tar -cf - path/ | xz -9 -c - > archive.tar.xz
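The tar and xz pipeline can be verified end to end as follows. This is a sketch that assumes xz is installed and does nothing otherwise; the path directory and the xz_status variable are invented for the example.

```shell
#!/bin/sh
# Sketch: tar + xz round trip; skipped when xz is not installed.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

mkdir -p path/logs
echo "first entry"  > path/logs/app.log
echo "second entry" > path/readme.txt

if command -v xz >/dev/null 2>&1; then
    tar -cf - path/ | xz -9 -c > archive.tar.xz

    # -t tests integrity; -dc decompresses to stdout for tar to read.
    xz -t archive.tar.xz
    mkdir restore
    xz -dc archive.tar.xz | tar -xf - -C restore
    cmp path/readme.txt restore/path/readme.txt && xz_status="ok"
else
    xz_status="skipped"
    echo "xz not found, nothing done"
fi
echo "xz round trip: ${xz_status:-failed}"
```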

     

    Comparison

    A thorough comparison can be found on Wikipedia, but here is a quick comparison table based on my personal experience:

    Name            Recommended extension  Algorithm  Cost      Compression speed  Compression ratio
    Tar (original)  tar                    None       Free      Fastest            None
    Zip             zip                    DEFLATE    Free      Fast               Average
    Gzip            gz                     DEFLATE    Free      Fast               Average
    Rar             rar                    RAR        Licensed  Slow               Best
    7-zip / xz      7z / xz                LZMA2      Free      Slowest            Bestest
    Notes:

    • Compression ratios depend on the data compressed: none of these tools does well on media files, for instance.
    • The comparison above is valid only between the utilities mentioned.
    • tar and, to some extent, any compression software do not actually require a file extension. However, it is best practice to use one for good file management.
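Because ratios depend so much on the data, it is worth measuring on a sample of your own. The sketch below generates a made-up text sample and prints the compressed sizes for gzip and, when installed, xz; it only shows how to measure and says nothing general about which tool wins.

```shell
#!/bin/sh
# Sketch: compare compressed sizes on one sample; results depend on the data.
workdir=$(mktemp -d)
trap 'rm -rf "$workdir"' EXIT
cd "$workdir" || exit 1

# Build a text sample of a few hundred kilobytes (hypothetical data).
i=0
while [ "$i" -lt 5000 ]; do
    echo "record $i: status=ok user=guest path=/var/www/html/index.html"
    i=$((i + 1))
done > sample.txt

raw_size=$(wc -c < sample.txt)
gzip_size=$(gzip -9 -c sample.txt | wc -c)
echo "raw:     $raw_size bytes"
echo "gzip -9: $gzip_size bytes"

# xz is optional on some systems, so only measure it when present.
if command -v xz >/dev/null 2>&1; then
    xz_size=$(xz -9 -c sample.txt | wc -c)
    echo "xz -9:   $xz_size bytes"
fi
```

Replacing sample.txt with a real file (a log, a database dump, a video) gives numbers that matter for your own case.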

     

    Works Cited

    Adler, Mark. 2008. “Zip: Package And Compress (Archive) Files.” PTC Inc. https://www.mkssoftware.com/docs/man1/zip.1.asp.
    Collin, Lasse. 2016. “The .Xz File Format.” Tukaani. December 30. https://tukaani.org/xz/format.html.
    Deutsch, P, and Inc. Aladdin Enterprises. 1996. “DEFLATE Compressed Data Format Specification Version 1.3.” Request for Comments. May. https://tools.ietf.org/html/rfc1951.
    Dorion, Pierre. 2008. “Backup vs. Archive.” TechTarget. October. https://searchdatabackup.techtarget.com/tip/Backup-vs-archive.
    Free Software Foundation. 2018. “GNU Operating System.” Free Software Foundation Inc. March. https://www.gnu.org/.
    Gailly, Jean-Loup. 2003. “The Gzip Home Page.” Gzip.Org. July 27. http://www.gzip.org/.
    Gailly, Jean-Loup. 2017. “Zlib: A Massively Spiffy Yet Delicately Unobtrusive Compression Library.” Zlib. January 15. https://www.zlib.net/.
    Pavlov, Igor. 2018a. “7-Zip Copyright (C) 1999-2018 Igor Pavlov.” 7-Zip. https://www.7-zip.org/license.txt.
    ———. 2018b. “7-Zip Format.” 7-Zip. https://www.7-zip.org/7z.html.
    ———. 2018c. “History of the 7-Zip.” 7-Zip. March 4. https://www.7-zip.org/history.txt.
    PKWARE, Inc. 2017. “ZIP File Format (PKWARE).” Digital Preservation at the Library of Congress. July 27. https://www.loc.gov/preservation/digital/formats/fdd/fdd000354.shtml.
    “Tar(5) — Format Of Tape Archive Files.” 2004. FreeBSD File Formats Manual. May 20. https://www.freebsd.org/cgi/man.cgi?query=tar&sektion=5&manpath=FreeBSD+7.0-RELEASE.
    Vinet, Judd, and Aaron Griffin. 2017. “P7zip.” Arch Linux. December 5. https://wiki.archlinux.org/index.php/P7zip.
    win.rar GmbH. 2018. “RAR 5.0 Archive Format.” Win.Rar GmbH. https://www.rarlab.com/technote.htm.
    win.rar GmbH. 2018. “Rar And Winrar End User License Agreement (Eula).” Rarlab GmbH. https://www.rarlab.com/license.htm.