SPY HILL Research
Spy-Hill.net

Poughkeepsie, New York

How to Back Up your Data with tar

The Unix tar utility can be used to make off-line tape copies of files or collections of files (even entire directory trees). These notes tell you how you can use tar to save copies of your files to a personal backup tape, and how you can put more than one ``tar archive'' on a single tape.

Instead of writing your files to a tape, you can have tar save them to a floppy disk or to a ZIP disk. All you need to do is specify the device name for the floppy or disk drive instead of specifying a tape device. However, you cannot easily add multiple tar archives to a single floppy disk the way you can add multiple tar files to a magnetic tape.

It is also possible to save multiple files or directories into another disk file, called a ".tar" file (or sometimes called a "tarball" :-), instead of writing them to tape. This file can be compressed with the GNU gzip program to create a compressed tar archive, or ".tar.gz" file.


Tape device name

Both the tar command and the mt command (which is used to control a magnetic tape device) use the TAPE environment variable as the name of the device or file on which to operate when none is given on the command line. Thus it is useful to begin by setting this variable with the command
setenv TAPE /dev/rmt/0mn
It is a good idea to put this in your .login or .cshrc file (you may find that it is already there).
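The setenv command is csh/tcsh syntax. If your login shell is a Bourne-type shell (sh, ksh or bash), a rough equivalent is sketched below. The device name is simply the example used in these notes, and the last line only prints the value so you can check it. In such a shell the first two lines would normally go in your .profile, which is an assumption about how your account is set up.

```shell
# Bourne-shell equivalent of "setenv TAPE /dev/rmt/0mn" (sh, ksh, bash).
# The device name is the example from the text; substitute your own.
TAPE=/dev/rmt/0mn
export TAPE

# Show the value that tar and mt will pick up when no device is given.
echo "TAPE is set to $TAPE"
```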

The device listed here, /dev/rmt/0mn, is the name of the tape drive on our HP's, but if you are working on a NeXT or a Sun with an attached tape drive then you will usually want to specify the device /dev/nrst0.

Under Linux the floppy drive is usually device /dev/fd0, so if you say

setenv TAPE /dev/fd0
then all tar commands that don't specify a device on the command line will default to the floppy drive.

You can also specify a tape device on another computer, as long as you also have an account on that computer. Simply preface the device name with the computer name, separated by a colon. Thus to use the tape device on the machine dirac you would say

setenv TAPE dirac:/dev/rmt/0mn
This probably won't work with a floppy disk.


Saving a directory to a tar tape

To create a ``tar archive,'' which writes a directory and all of its contents to the tape, give the tar command with the ``c'' flag, meaning ``create.'' For example, if you want to save to tape the directory projectX (and all of the files and directories it contains), the command is simply:
%  tar c projectX
(The ``%'' here represents the Unix command prompt.)

If you want to save the contents of the ``current'' directory (perhaps your home directory, and all of the files and subdirectories it contains), then you can use a dot (``.'') as the name of the directory, as in

%  tar c .
You can also put several files or directories into a single tar archive by listing them all on the command line, as in
%  tar c file1 file2 directory1 file3 directory2


Watching what goes on

You can cause tar to list the files it adds to the archive by adding the "v" flag, meaning "verbose". So for example:

%  tar cv .
will save all the files in the current directory and its subdirectories, and list each file by name as it is added to the archive.
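Since the verbose listing is a record of exactly what went into an archive, it can be handy to keep it in a file. The following is only a minimal sketch: it builds a small stand-in projectX directory in a scratch area (the file contents are made up) and writes the archive to a disk file with the "f" flag, described under "Tar files on disk", so that it can be tried without a tape drive. Some versions of tar print the verbose listing on standard error rather than standard output, so both are redirected to the log.

```shell
# Scratch area with a small stand-in for projectX (hypothetical contents).
work=$(mktemp -d)
cd "$work" || exit 1
mkdir projectX
echo "notes" > projectX/README
echo "      end" > projectX/hello.f

# Create the archive verbosely and keep the listing in a log file.
# Both streams are redirected because tar versions differ on where
# the "v" output is written.
tar cvf projectX.tar projectX > projectX.log 2>&1

# Display the saved listing.
cat projectX.log
```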


Listing files in an archive

If you give the tar command with the ``t'' flag it will list a ``table of contents'' of an existing tar archive. The position of the tape determines which archive is read. So if you want to list the contents of the tar archive you just wrote to a tape you could simply give the command ``tar t'', like so:

%  mt rew
%  tar t
projectX/
projectX/CVstate.f
projectX/avgvar.f
projectX/curve.f
projectX/dsec.f
projectX/heapsort.f
projectX/hello.f
projectX/ranlist.f
projectX/realist.f
projectX/second.f
projectX/RCS/
projectX/RCS/curve.f,v
projectX/README


Extracting files from an archive

You can easily extract the entire contents of a tar archive, or you can selectively extract a single file or a list of files. To do so, give the tar command with the ``x'' flag. If you name the files to be extracted on the command line, then only these will be extracted. For example,
%  mt rew
%  tar x  projectX/README
will only extract the file README. It will be placed in the directory projectX, which will be created if it does not already exist. To extract the entire contents of the archive from the tape the command is simply:
%  mt rew
%  tar x  
You can add the "v" flag to cause the file names to be displayed as they are extracted, as:
%  mt rew
%  tar xv
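Selective extraction can also be tried without a tape by using an archive kept in a disk file (the "f" flag is described under "Tar files on disk"). The sketch below uses made-up file contents and a hypothetical scratch directory named restore. It extracts only README and shows that the projectX directory is created to hold it.

```shell
# Build a small stand-in archive in a scratch area.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir projectX
echo "read me first" > projectX/README
echo "      end" > projectX/hello.f
tar cf projectX.tar projectX

# Extract a single member into an otherwise empty directory.
mkdir restore
cd restore || exit 1
tar xf ../projectX.tar projectX/README

# List what was extracted; hello.f should not be among the files.
find . -type f
```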


Tar files on disk

The tar utility can also be used to make an on-line ``tar archive'' contained in a disk file. In fact, the tar format is one of the main ways software is packaged on the Internet. (Since tar does no compression, tar files are often compressed using gzip, so the resulting files often have names ending in ``.tar.gz'' or ``.tgz''.)

To create a tar archive in a disk file, rather than writing the tar archive to tape, simply specify the ``f'' flag in the tar command followed by the name of the file (instead of the name of a tape or floppy disk device). For example,

%  tar  cf projectX.tar   projectX
will create a tar archive in the file projectX.tar, and put in it all the files in the directory projectX (and the files and directories it contains). You can also use the ``t'' and ``x'' flags, with the ``f'' flag, to list the contents of a tar file or extract files from the tar file. For example to extract all the files from the file projectX.tar you could simply say:
%  tar  xf projectX.tar  
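Since compressed tar files come up so often, here is a minimal sketch of compressing one with gzip and unpacking it again. It assumes gzip is installed and uses made-up scratch files and a hypothetical directory named unpack. The pipe form is used for reading the archive because it works even with a tar that has no built-in gzip support.

```shell
# Build a small stand-in archive in a scratch area.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir projectX
echo "read me first" > projectX/README
tar cf projectX.tar projectX

# Compress it; gzip replaces projectX.tar with projectX.tar.gz.
gzip projectX.tar

# List the contents without unpacking: decompress to standard output
# and let tar read the archive from standard input ("-").
gzip -dc projectX.tar.gz | tar tf -

# Unpack into a separate directory the same way.
mkdir unpack
cd unpack || exit 1
gzip -dc ../projectX.tar.gz | tar xf -
```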


Multiple archives on a single tape

It is fairly easy to write several tar archives to a single tape, one after the other, by using the Unix mt command to position the tape. You do need to be somewhat careful about where you position the tape, or you will overwrite data already on the tape. It is important to remember that whenever you write data on a tape, it effectively erases all data on the tape following that point. (It may be possible to recover files farther along on the tape, but don't count on it. If you have an accident, stop writing to the tape as soon as possible and come see me.) You cannot selectively ``cut and paste'' files onto a tape the way you can with a disk (even if you are using tar to write to a disk).

Note: The device names ``0mn'' or ``nrst0'' are for ``no rewind'' tape devices. It is very important that you use ``no rewind'' devices when you are trying to put more than one tar archive on a tape. In contrast, the devices ``0m'' or ``rst0'' are ``rewind'' devices---which means that the tape will be rewound automatically after a single command is executed, whether it is a tar command or an mt command. This can cause you to overwrite data already written on the tape, so be careful to always use a ``no rewind'' device.

Suppose you have already written the directory projectX to a tape using tar, and now you want to write projectY to the tape as the second tape file on the tape. To be sure about the tape position you can first rewind it, with the command

%  mt rew
You should then space the tape forward past the first tape file, with the ``forward space file'' command of mt,
%  mt fsf 1
You can omit the ``1'' to space forward by one tape file, but you can also specify some larger number of tape files to skip over all at once. On some machines (such as Suns) you can verify the tape position with the command ``mt status'', but this does not work on HP's.

Once the tape is positioned, simply give the appropriate tar command:

%  tar c projectY
It is a good idea to write the name of the archive and the date on both the tape label and the tape box. There is no easy way to make a ``directory'' of what is on a magnetic tape, since each archive is just a raw tape file, so this is the only way to record what is on the tape. If you really don't know what is on the tape you can position the tape at each archive in turn and list its contents with ``tar t'' to find out (see ``Listing files in an archive'' above).

As another example, now suppose that you want to write projectZ to the tape as the third tape file. You would skip over the first two tape files and then write the archive with tar, like so

%  mt rew
%  mt fsf 2
%  tar c projectZ
Although it may take a little longer, it is easiest to rewind the tape before spacing forward a given number of tape files, so that you don't get confused about where the tape is positioned.
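The rewind-then-skip routine can be wrapped in a small shell function. This is a minimal sketch in Bourne shell syntax (not csh); tape_goto and DRY_RUN are made-up names, and it assumes TAPE is set to a ``no rewind'' device. With DRY_RUN set to 1 it only prints the mt commands, so the positioning can be rehearsed without a tape in the drive.

```shell
# Position the tape at the start of tape file number N (1 = first).
# Assumes TAPE names a "no rewind" device.
# With DRY_RUN=1 the mt commands are printed instead of executed.
tape_goto() {
    n=$1
    run=""
    [ "${DRY_RUN:-0}" = 1 ] && run="echo"
    # Always start from the beginning of the tape.
    $run mt rew || return 1
    # Skip over the N-1 tape files that come before the one we want.
    if [ "$n" -gt 1 ]; then
        $run mt fsf $((n - 1)) || return 1
    fi
}

# Rehearse positioning for the third archive without touching a tape.
DRY_RUN=1
tape_goto 3

# For real use, unset DRY_RUN and follow the call with a tar command,
# for example:  tape_goto 3 && tar c projectZ
```

The function always rewinds first, for the reason given above: it is slower, but the tape position never depends on what was done before.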


Tar across the net

Tar can copy files which are mounted remotely via NFS, so the easiest way to copy your home directory on another machine is to log onto dirac (the tape drive is connected to dirac) and cd to your home directory on the other machine (in /net/feynman/users or /net/walden/users and so forth), then give a tar ``create'' command.

However, it is also possible to save files to tape from a filesystem which is not mounted via NFS. To do so you must have permission to execute remote commands on dirac, either by having a proper entry in your .shosts file (see the man pages for ssh(1)) or by using public key authentication with SSH. You log in on your home machine and then give an incantation like the following:

% tar cvfb  -  20 projectX | ssh dirac dd of=/dev/rmt/0mn  obs=20b
This creates a tar archive on your home machine, but the "-" directs it to standard output rather than a file, and this is piped to the remote process on dirac. There the dd command simply copies the archive from standard input directly to the tape. The extra "b" and "obs" flags are needed to get the blocking factor correct on the tape.
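To read such a tape back, the pipeline runs the other way around: dd reads the tape with a matching input block size, and tar extracts the archive from standard input. The sketch below rehearses both directions locally. An ordinary file (tape.img, a made-up name) stands in for the tape device and the ssh hop is left out, so only the tar and dd flags carry over to the real thing. For real use one would presumably put ``ssh dirac'' back in front of each dd command and give the tape device name instead of the file; treat that as an assumption to check on your own system.

```shell
# Build a small stand-in directory in a scratch area.
work=$(mktemp -d)
cd "$work" || exit 1
mkdir projectX
echo "read me first" > projectX/README

# Write: tar sends the archive to standard output with a blocking
# factor of 20, and dd copies it to the stand-in "tape" using the
# same output block size (20 blocks of 512 bytes).
tar cfb - 20 projectX | dd of=tape.img obs=20b 2>/dev/null

# Read back: dd reads with the matching input block size and tar
# extracts from standard input into a separate directory.
mkdir restore
cd restore || exit 1
dd if=../tape.img ibs=20b 2>/dev/null | tar xfb - 20
```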


Further Information

The on-line man pages for tar explain the command syntax and flags (they are a bit unusual, even for Unix) and list many examples. Just say ``man tar'' to see them. If you are using GNU tar then the most current documentation will be in the "info" help system. Say "info tar" instead.
  Copyright © 2005 by Spy Hill Research http://www.Spy-Hill.net/~myers/help/HowTar.html (served by Islay.spy-hill.com) Last modified: 19 January 2005