Managing files with HSI

The HSI interface described on this page is one of two primary tools for transferring data to and from HPSS.

We highly recommend using HTAR rather than HSI if you need to archive large numbers of individual files that are smaller than 1 MB. Failure to follow this guideline can result in suspension of your access to HPSS.

Transferring hundreds or thousands of small files with HSI commands significantly slows the entire system for all users. HTAR is more efficient for archiving and reduces the time it takes to retrieve files later.
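
One way to judge whether a directory falls under this guideline is to count its small files before transferring anything. The following is a minimal sketch, not a CISL-provided tool: the function name count_small_files is hypothetical, and it treats 1 MB as 1,048,576 bytes, which is an assumption.

```shell
# Count regular files smaller than 1 MB (assumed here to mean 1,048,576 bytes)
# under the directory given as $1. The size is expressed in bytes (suffix c)
# because find rounds larger units up, which would give misleading matches.
count_small_files() {
    find "$1" -type f -size -1048576c -print | wc -l | tr -d ' '
}
```

For example, count_small_files mydir prints the number of files under mydir that are below the threshold. A large count suggests archiving the directory with HTAR rather than sending the files individually with HSI.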

This page provides information on using HSI commands. Also see HSI Help.

Important considerations

File deletion is permanent

Files that are deleted from or overwritten on HPSS cannot be recovered. To avoid inadvertent data loss, carefully review the section below on Special HSI commands.

Concurrent transfer limits

HPSS is shared by a large number of users, and there are individual and global limits on file actions that can be executed concurrently. See Use and storage policies for details regarding these limits.

Also see Optimizing HPSS file retrieval. Configuring your requests based on tape location can result in quicker retrievals in some cases.

Bulk file operations

If you need to perform an operation on 100,000 or more of your HPSS files (with commands such as chacct, chcos, chgrp, chown, cp, mv, and rm, among others), contact the CISL Help Desk or Consulting Services Group for assistance so we can avoid slowing the system for all users.

Best practices

Review and follow CISL best practices for managing your files and data transfers. They will help you make the most efficient use of your computing and storage allocations.

Special HSI commands

Warning: Some commands, including mv, put, and cp, can overwrite data at their targets. Files deleted from or overwritten on HPSS cannot be recovered, so establish directory permissions carefully to help prevent inadvertently overwriting your HPSS files with these commands.

The cp command also resets a file's project code to your default code, so take particular care if you can charge to more than one project. If the file is associated with a non-default group, cp resets its group ID as well.

Use cput and cget to avoid data loss. While put unconditionally clobbers its target, the conditional cput command will not overwrite a file with the same name. Similarly, using cget rather than get will prevent you from inadvertently overwriting a file on your local drive when retrieving an HPSS file with the same name.

Use put and get only when you know that you want to overwrite existing data.

When fetching files from HPSS (with either get or cget), make sure you have sufficient room in your GLADE file space. If you exceed your GLADE quota, the transfer will fail.
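
A rough pre-check before a large retrieval might look like the sketch below. It is only an illustration: the function name fits_in_quota is hypothetical, you must supply your own quota figure, and du measures only the directory you name, so the result is an estimate rather than what the quota system itself reports.

```shell
# Succeed if the space already used under directory $1, plus the size of the
# planned retrieval ($2, in KB), stays within a quota you supply ($3, in KB).
# All three values come from the caller; nothing here queries GLADE itself.
fits_in_quota() {
    dir=$1
    need_kb=$2
    quota_kb=$3
    used_kb=$(du -sk "$dir" | awk '{print $1}')
    [ $((used_kb + need_kb)) -le "$quota_kb" ]
}
```

For example, fits_in_quota . 5000000 10000000 && hsi cget mydata would start the retrieval only if about 5 GB more fits under a 10 GB limit in the current directory.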

Common HSI commands

HSI uses common UNIX-like commands that generally work the same way as their UNIX counterparts.

For example:

  • ls lists the contents of a directory
  • rm permanently removes a file
  • mkdir creates a directory
  • rmdir deletes a directory

To start HSI after logging in to Yellowstone, just enter hsi on your command line. (To use HSI on NCAR systems that are outside of the Yellowstone environment, see Kerberos and HSI.)

Below are other common commands and examples of how to use them. A printable quick reference sheet is also available.

cput command

This command writes the local file named file from your current working directory to HPSS as /home/username/file:

 [HSI]/home/username-> cput file 

To write the file to HPSS with a different name, follow this example:

 [HSI]/home/username-> cput file : newfile 

Absolute path names for local or HPSS files also are acceptable. The local file always comes before the colon, with both the cput and cget commands.

To put a set of files into a target directory, change to that directory and execute the cput command.

 [HSI]/home/username-> cd /home/user/targetdir; cput file-pattern 

UNIX users sometimes try to do the following, where targetdir is an existing directory (or a directory that is to be created with the -P option). HSI does not support this:

 [HSI]/home/username-> cput file-pattern : /home/user/targetdir 

cput and the -R option

You can specify a source with the -R (recursive) option in HSI, but you cannot specify a target with it. If you try, the command will interpret your source, your target, and the token “:” all as sources. This can produce unexpected and even damaging results.

Correct: The proper way to use this command is to change to the target directory first, then execute your cput command with the -R option. For example, to put the local directory mydir into the HPSS target directory /home/username/test:

 hsi "cd /home/username/test; cput -R mydir" 

The result is a directory /home/username/test/mydir that contains all the files and directories from your local mydir directory.

Incorrect: A common mistake is to try to write the source file tree rooted at mydir to /home/username/test/mydir like this:

 hsi cput mydir : /home/username/test/mydir 

You will get an error message recognizing that mydir is a directory and telling you that you need to use the -R option. Take care to execute this correctly, as shown above, or you risk overwriting valuable data.
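
In a script, one way to keep to the correct form is to build the command string in a single place so that a target is never passed after a colon. This is a minimal sketch. The helper name build_recursive_cput is hypothetical, and it only prints the command text without contacting HPSS.

```shell
# Build the HSI command text for a recursive conditional put:
# change to the HPSS target directory first, then cput -R the local source.
# $1 = HPSS target directory, $2 = local source directory.
build_recursive_cput() {
    target_dir=$1
    source_dir=$2
    printf 'cd %s; cput -R %s\n' "$target_dir" "$source_dir"
}

# Show the full command line instead of running it.
echo "hsi \"$(build_recursive_cput /home/username/test mydir)\""
```

After checking the printed line, you could run the transfer with hsi "$(build_recursive_cput /home/username/test mydir)". The sketch assumes that neither path contains spaces or shell metacharacters.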

cget command

Use cget to retrieve an HPSS file into your current working directory on your local machine:

 [HSI]/home/username-> cget filename 

To read the HPSS file into your current working directory with a different name, follow this example:

 [HSI]/home/username-> cget newname.file : hpss.file 

HSI modes

HSI can be invoked in several modes:

  • As an interactive command interface
  • As a prefix to one or more commands to be executed from the UNIX command line
  • By submitting a job to the "hpss" queue

Interactive command interface

On our HPC, analysis, or visualization systems*, you can start an HSI session by entering hsi on the command line.

To exit the HSI environment, enter quit.

Non-interactive (batch) mode

HSI commands can also be entered from the command line without first starting an HSI session. Type the command like this:

hsi cput xxx : yyy 

The command will be executed, then control will return to the shell. This is how you would put HSI commands in a script, or in a “system” call from a running program. Another batch mode option is to create a file containing the desired HSI commands and execute one of the following:

hsi in filename 
hsi < filename 
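
As an illustration of the command-file approach, the sketch below writes a small file of HSI commands and prints the line that would run it. The file name hsi_commands.txt, the directory, and the data file names are placeholders, and the one-command-per-line layout is an assumption about how the file is read.

```shell
# Write a command file with one HSI command per line (assumed layout).
# The HPSS directory and the local file names are placeholders.
cmdfile=hsi_commands.txt
cat > "$cmdfile" <<'EOF'
cd /home/username/test
cput file1
cput file2
EOF

# Print the command to run rather than executing it, so the file
# can be reviewed before anything is written to HPSS.
echo "Review $cmdfile, then run: hsi in $cmdfile"
```

Using cput rather than put in the file keeps the batch from overwriting HPSS files that already exist under the same names.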

Submit job to "hpss" queue

Batch (LSF) and cron jobs can use HSI in the same way as the interactive and non-interactive jobs described above. On Yellowstone, use the hpss queue for these jobs.

Here is one example of how to submit a job that will execute a transfer:

bsub -n 1 -q hpss -W 2:00 -P project_code hsi cget mydata
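
The same request can also be kept in a job script, which is easier to reuse than a long command line. The sketch below generates such a script with the options from the example above expressed as #BSUB directives. The script name hpss_transfer.lsf, the job name, and the output file pattern are illustrative additions that do not appear in the example, and project_code is a placeholder for your own project code.

```shell
# Generate an LSF job script equivalent to the bsub command line above.
# -J (job name) and -o (output file, %J = job ID) are optional additions.
cat > hpss_transfer.lsf <<'EOF'
#!/bin/bash
#BSUB -n 1
#BSUB -q hpss
#BSUB -W 2:00
#BSUB -P project_code
#BSUB -J hpss_cget
#BSUB -o hpss_cget.%J.out

hsi cget mydata
EOF

# LSF reads #BSUB directives when the script is passed on standard input.
echo "Submit with: bsub < hpss_transfer.lsf"
```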

* UCAR users: To use HSI on NCAR systems that are outside of the Yellowstone environment, see Kerberos and HSI.