Info Node: (tar.info)sparse

8.1.2 Archiving Sparse Files
----------------------------

Files in the file system occasionally have "holes".  A "hole" in a file
is a section of the file's contents which was never written.  The
contents of a hole read as all zeros.  On many operating systems,
actual disk storage is not allocated for holes, but they are counted in
the length of the file.  If you archive such a file, `tar' could create
an archive that takes more space than the original occupies on disk.
To have `tar' attempt to recognize the holes in a file, use `--sparse'
(`-S').  When you use this option, then, for any file using less disk
space than would be expected from its length, `tar' searches the file
for consecutive stretches of zeros.  It then records in the archive
where those stretches are, and archives only the "real contents" of
the file.  On extraction (using `--sparse' is not needed on
extraction) any such files have holes created wherever the consecutive
stretches of zeros were found.  Thus, if you use `--sparse', `tar'
archives won't take more space than the original.
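
   The behavior described above can be observed with a minimal sketch
like the following.  It assumes GNU `tar', the GNU coreutils `truncate'
utility, and a file system that supports holes; the file and directory
names are hypothetical.

```shell
# Work in a scratch directory (all names here are illustrative).
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a 10 MiB file without writing any data, so it is all hole.
truncate -s 10M holes.dat

# Compare the file's length with the disk space actually allocated.
ls -l holes.dat
du -k holes.dat

# Archive the file once without and once with --sparse.
tar -cf plain.tar holes.dat
tar -cSf sparse.tar holes.dat
ls -l plain.tar sparse.tar

# Extract the sparse archive; --sparse is not needed here.
mkdir out
tar -xf sparse.tar -C out
cmp holes.dat out/holes.dat
```

   On a file system that allocates no storage for holes, `sparse.tar'
should come out much smaller than `plain.tar', while the extracted copy
compares equal to the original.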

`-S'
`--sparse'
     This option instructs `tar' to test each file for sparseness
     before attempting to archive it.  If the file is found to be
     sparse, it is treated specially, which reduces the amount of
     space used by its image in the archive.

     This option is meaningful only when creating or updating archives.
     It has no effect on extraction.

   Consider using `--sparse' when performing file system backups, to
avoid archiving the expanded forms of files stored sparsely in the
system.

   Even if your system has no sparse files currently, some may be
created in the future.  If you use `--sparse' while making file system
backups as a matter of course, you can be assured the archive will
never take more space on the media than the files take on disk
(otherwise, archiving a disk filled with sparse files might take
hundreds of tapes).  See Incremental Dumps.
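
   A backup command that always passes `--sparse' might look like the
sketch below.  It assumes GNU `tar' and the GNU coreutils `truncate'
utility; the temporary directories and file names are hypothetical
stand-ins for a real file system and backup destination.

```shell
# Hypothetical tree standing in for a file system to back up.
src=$(mktemp -d)
dest=$(mktemp -d)
mkdir "$src/data"
printf 'ordinary contents\n' > "$src/data/notes.txt"
# A 5 MiB file created without writing data (sparse where supported).
truncate -s 5M "$src/data/image.raw"

# Pass --sparse as a matter of course, whether or not sparse files
# are known to exist in the tree.
tar --create --sparse --file="$dest/backup.tar" -C "$src" data

# List the archive to confirm that both files were stored.
tar --list --file="$dest/backup.tar"
```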

   However, be aware that the `--sparse' option has a serious
drawback.  In order to determine whether a file is sparse, `tar' has
to read it before trying to archive it, so in total the file is read
*twice*.  Bear in mind that the time needed to process all files with
this option is roughly twice the time needed to archive them without
it.

   When using the `POSIX' archive format, GNU `tar' is able to store
sparse files in three distinct ways, called "sparse formats".  A sparse
format is identified by its "number", consisting, as usual, of two
decimal numbers delimited by a dot.  By default, format `1.0' is used.
If, for some reason, you wish to use an earlier format, you can select
it using the `--sparse-version' option.

`--sparse-version=VERSION'
     Select the format to store sparse files in.  Valid VERSION values
     are: `0.0', `0.1' and `1.0'.  See Sparse Formats for a detailed
     description of each format.

   Using the `--sparse-version' option implies `--sparse'.
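
   Selecting each sparse format in turn can be sketched as follows.
This assumes a GNU `tar' recent enough to accept `--sparse-version'
and the GNU coreutils `truncate' utility; the file and directory names
are hypothetical.

```shell
# Scratch directory with one file created without writing any data.
work=$(mktemp -d)
cd "$work" || exit 1
truncate -s 1M holes.dat

for v in 0.0 0.1 1.0; do
    # --sparse-version selects the sparse format; --sparse is implied.
    tar --create --format=posix --sparse-version="$v" \
        --file="sparse-$v.tar" holes.dat

    # Extract into a separate directory and compare with the original.
    mkdir "out-$v"
    tar --extract --file="sparse-$v.tar" -C "out-$v"
    cmp holes.dat "out-$v/holes.dat" && echo "format $v: round trip OK"
done
```

   Whichever format is chosen, the extracted file should compare equal
to the original; the formats differ only in how the archive records
where the data and holes lie.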

