Using HTAR to transfer files


The HPSS Tape Archiver (HTAR) packages large numbers of files into a single archive file for efficient transfer to our High Performance Storage System (HPSS).

We highly recommend using HTAR rather than HSI if you need to archive large numbers of individual files that are smaller than 1 MB. Failure to follow this guideline can result in suspension of your access to HPSS.

Transferring hundreds or thousands of small files with HSI commands significantly slows the entire system for all users. HTAR is more efficient for archiving and reduces the time it takes to retrieve files later.
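
To judge whether a directory falls under this guideline, it can help to count how many of its files are smaller than 1 MB. The following is a minimal sketch rather than an official tool: the function name count_small_files is hypothetical, and it assumes a find command that accepts the POSIX -size option with the c (bytes) suffix.

```shell
# Count regular files smaller than 1 MB (1048576 bytes) under a directory.
# count_small_files is a hypothetical helper name, not part of HTAR or HSI.
count_small_files() {
    # -size -1048576c matches files strictly smaller than 1048576 bytes.
    # File names that contain newlines would be counted more than once.
    find "$1" -type f -size -1048576c | wc -l | tr -d ' '
}
```

For example, count_small_files MyData prints the number of small files under MyData. A large count suggests packaging the directory with htar instead of transferring the files individually.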

How HTAR works:

  1. A single command (examples below) packages multiple individual "member" files on your file system into a single archive file and sends it to HPSS.
  2. The archive file ensures that all the member files are stored on a single tape, which is efficient and convenient.
  3. HTAR creates and saves an index file on HPSS (extension .idx) along with the archive file.
  4. You can later retrieve individual member files without copying the whole archive file back to your file system.

See File size guidelines for HPSS for additional information that will help you use this system efficiently.

Concurrent transfer limits

HPSS is shared by a large number of users, and there are individual and global limits on file actions that can be executed concurrently. See Use and storage policies for details regarding these limits.


How to use HTAR

Creating an archive file

Caution: When you create an archive file using a file name that already exists in HPSS, HTAR will overwrite the older file without warning.

To create an archive file named myncfile.tar with all .nc files in the current directory and store the file in your HPSS home directory, follow this example.

htar -cv -f myncfile.tar *.nc 

To create an archive file named mydata.tar with the directory MyData and store the file in your HPSS home directory, follow this example.

htar -cv -f mydata.tar MyData 
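
Because HTAR overwrites an existing archive file without warning, one possible safeguard is to try listing the archive before creating it. This is only a sketch: htar_create_noclobber is a hypothetical helper, and it assumes that htar -t exits with a non-zero status when the archive file does not exist. If the listing fails for another reason (for example, a missing index file), the check cannot tell the difference.

```shell
# Create an archive only if no archive of that name can already be listed.
# htar_create_noclobber is a hypothetical helper name.
# Assumption: "htar -t" exits non-zero when the archive does not exist.
htar_create_noclobber() {
    noclobber_archive=$1
    shift
    if htar -t -f "$noclobber_archive" >/dev/null 2>&1; then
        echo "refusing to overwrite existing archive: $noclobber_archive" >&2
        return 1
    fi
    htar -cv -f "$noclobber_archive" "$@"
}
```

With this function defined, htar_create_noclobber mydata.tar MyData would create mydata.tar only when the listing step fails.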

Submitting a job to create an archive file

You can submit a job to the hpss queue to create an archive file.

bsub -n 1 -q hpss -W 2:00 -P project_code htar -cv -f mydata.tar MyData

Also see Confirming HPSS transfers.
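
The same job can be described in a script file instead of on the bsub command line. The sketch below writes such a script using the options from the example above. The helper name write_hpss_job and the file name archive_job.sh are hypothetical, and it assumes that your LSF installation reads embedded #BSUB directives from a script supplied on standard input.

```shell
# Write an LSF job script that archives a directory with htar.
# write_hpss_job and its arguments are hypothetical; replace project_code
# with your own project code before submitting the script.
write_hpss_job() {
    job_script=$1    # path of the job script to write
    job_archive=$2   # archive file name on HPSS
    job_source=$3    # local directory or file to archive
    cat > "$job_script" <<EOF
#!/bin/sh
#BSUB -n 1
#BSUB -q hpss
#BSUB -W 2:00
#BSUB -P project_code
htar -cv -f $job_archive $job_source
EOF
}
```

For example, write_hpss_job archive_job.sh mydata.tar MyData writes the script, which you could then submit with bsub < archive_job.sh.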

Listing archive file contents

Use the HTAR options shown here to list the contents of an archive file.

htar -tv -f myncfile.tar 
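
A listing can also be used to check whether a particular member file is in an archive before you try to retrieve it. The following sketch makes two assumptions that you should verify on your system: that htar writes the listing to standard output, and that each member's path is the last whitespace-separated field on its line. The helper name htar_has_member is hypothetical, and the check does not handle member names that contain spaces.

```shell
# Succeed if the listing of an archive contains the given member name.
# htar_has_member is a hypothetical helper name.
# Assumption: the member path is the last field on each listing line.
htar_has_member() {
    htar -t -f "$1" 2>/dev/null |
        awk -v member="$2" '$NF == member { found = 1 } END { exit !found }'
}
```

For example, htar_has_member myncfile.tar my_output0.nc && echo "present" prints a message only if the member is listed.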

Retrieving files

You can retrieve individual files or an entire archive file using the htar command.

Caution: HTAR will overwrite local files without warning if they have the same names as the files that you are retrieving.

When fetching files from HPSS, make sure you have sufficient room in your GLADE file space. If you exceed your GLADE quota, the transfer will fail.

Also, if you are extracting files to your /glade/scratch directory, use the -m option as shown in the following examples to prevent the files from being purged prematurely because of an old date. The -m option sets the files' modification time to the time of extraction rather than the time the archive file was created.

In this example, you are retrieving my_output0.nc from the myncfile.tar archive file.

htar -xv -m -f myncfile.tar my_output0.nc

You would retrieve all member files from myncfile.tar like this:

htar -xv -m -f myncfile.tar 
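
To see why the -m option matters for purge policies, you can reproduce the effect locally with the ordinary tar command, which has an analogous -m option. This is an illustration under the assumption that htar's -m behaves like tar's, as described above. It does not contact HPSS, and the file and directory names are made up for the demonstration.

```shell
# Illustrate the effect of -m using the local tar command as a stand-in.
# Assumption: htar's -m behaves like tar's -m.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1

# Create a file whose modification time is in the year 2000, then archive it.
touch -t 200001010000 old_output.nc
tar -cf demo.tar old_output.nc

mkdir keep_mtime new_mtime
tar -xf demo.tar -C keep_mtime    # restores the old modification time
tar -xmf demo.tar -C new_mtime    # -m: modification time is the extraction time

# Count extracted files last modified more than 30 days ago in each directory.
old_without_m=$(find keep_mtime -type f -mtime +30 | wc -l | tr -d ' ')
old_with_m=$(find new_mtime -type f -mtime +30 | wc -l | tr -d ' ')
echo "older than 30 days without -m: $old_without_m"
echo "older than 30 days with -m:    $old_with_m"
```

The file extracted without -m keeps its old date and is counted as older than 30 days, while the file extracted with -m is not. A purge policy based on file age would treat the two copies differently.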

Submitting a job to retrieve files

You can retrieve an archive file, or retrieve individual files from the archive, by submitting a job to the hpss queue. This example shows how to retrieve an archive file.

bsub -n 1 -q hpss -W 2:00 -P project_code htar -xv -m -f mydata.tar

HTAR help

See Getting help with HPSS.

Also see the HTAR web page for more information, including links to all of the man pages and a user guide.


Advanced use

Soft delete

To soft delete the member file my_output0.nc from the archive file myncfile.tar, follow this example. Run the htar command in the second line to see the updated list of file contents.

htar -Dv -f myncfile.tar my_output0.nc
htar -tv -f myncfile.tar 

Undelete

To undelete the member file my_output0.nc in the archive file myncfile.tar, follow this example. Run the htar command in the second line to see the updated list of file contents.

htar -U -f myncfile.tar my_output0.nc
htar -tv -f myncfile.tar 

Build a new index file

Follow this example to build an HTAR index (or overwrite the index) for an archive file. You may want to do this if you created a large archive file using tar rather than htar and then ran an HSI command such as cput to put it on HPSS. Creating an index for it with HTAR will facilitate retrieval of individual files.

htar -Xv -f myncfile.tar 
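
The scenario described above (an archive made with tar, copied with HSI, then indexed with HTAR) could look like the following sketch. The helper name index_plain_tar is hypothetical, and the sketch assumes that hsi cput with a single file name stores the file under the same name in your current HPSS directory and that the tar file is in a format HTAR can index.

```shell
# Package files with ordinary tar, copy the archive to HPSS with HSI, then
# ask HTAR to build the index. index_plain_tar is a hypothetical helper name.
index_plain_tar() {
    plain_archive=$1                  # for example, myncfile.tar
    shift
    tar -cf "$plain_archive" "$@" &&  # local archive made with tar, not htar
    hsi cput "$plain_archive" &&      # copy the archive file to HPSS
    htar -Xv -f "$plain_archive"      # build (or overwrite) the index file
}
```

Each step runs only if the previous one succeeded, so the index is not built for an archive that failed to transfer.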

Most important options

-f archive   Specifies archive file name (required option).

-c      Creates a new archive file.

-t      Lists contents of archive file, using the index file.

-x      Extracts files from the archive file to local file(s).

-D      Soft-deletes member files from the archive file (flags files as <deleted> in the index file).

-U      Undeletes soft-deleted member files in the archive file.

-X      Builds a new index file by reading an existing archive file.

-K      Verifies the contents of an existing archive file.
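
As an illustration of the -K option, the sketch below verifies several archive files in turn and reports the result for each. The helper name htar_verify_all is hypothetical, and the sketch assumes that htar exits with a non-zero status when verification fails.

```shell
# Verify each named archive with -K and report the ones that fail.
# htar_verify_all is a hypothetical helper name.
# Assumption: htar exits with a non-zero status when verification fails.
htar_verify_all() {
    verify_failed=0
    for verify_archive in "$@"; do
        if htar -K -f "$verify_archive" >/dev/null 2>&1; then
            echo "verified: $verify_archive"
        else
            echo "FAILED:   $verify_archive"
            verify_failed=1
        fi
    done
    return "$verify_failed"
}
```

For example, htar_verify_all myncfile.tar mydata.tar prints one line per archive and returns a non-zero status if any check failed.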


Other useful options

-?      Displays help message.

-d debuglevel   Sets debug level (0-5). 0 = no debug, 5 = highest debug level.

-h      Follow symbolic links as if they were normal files or directories.

-I index_name   Specifies the index file name or suffix.  If the first character of the index_name is a period, then a suffix is assumed (.idx, for example).

-L InputList    Use the list of files and directories listed in the InputList variable for this operation.

-m      Use the time of extraction as the modification time.

-n time   For creates, includes only files that were modified within the time period. Time is of the form "days", ":hours", or "days:hours".

-O      Extract files to stdout.

-q      Run in "quiet" mode.

-T threads  Specifies the maximum number of threads to use for copying local files to the archive file.

-V      "Slightly verbose" option. Displays transfer statistics.

-v      Verbose mode. Displays -V info plus list of files operated upon.

-w      Wait for interactive OK.

You can type htar -? to check the full list and description of HTAR options. If you are using csh or tcsh, type the command htar -\?.
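
Several of these options can be combined. As one example, the sketch below builds an input list with find and passes it to htar with -L. The helper name archive_recent_nc and the default list file name htar_input.list are hypothetical, and the sketch assumes that -L takes the path of a text file containing one path per line.

```shell
# Build an input list of .nc files modified within the last 7 days, then
# archive exactly those files. archive_recent_nc is a hypothetical helper.
# Assumption: -L takes the path of a text file with one path per line.
archive_recent_nc() {
    list_archive=$1                    # archive file name on HPSS
    list_file=${2:-htar_input.list}    # where to write the input list
    find . -type f -name '*.nc' -mtime -7 | sort > "$list_file"
    if [ -s "$list_file" ]; then
        htar -cv -f "$list_archive" -L "$list_file"
    else
        echo "no matching files; nothing archived" >&2
        return 1
    fi
}
```

For example, archive_recent_nc recent.tar would archive the recently modified .nc files found under the current directory. Writing the list to a file first lets you inspect it before anything is sent to HPSS.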
