Use and storage policies


Recommended use

HPSS is best used to store data sets larger than 100 MB for periods of 30 days or more. It is not for short-term storage of temporary data or for use as a disk-based file server.

See File size guidelines for HPSS for additional information that will help you use this system efficiently.

Users are encouraged to work within the GLADE disk-based storage environment for short-term storage of data that are being created or analyzed, and for supporting data workflows across CISL resources. Your GLADE file spaces are accessible from our HPC, data analysis, and visualization systems.

UCAR/NCAR staff: Do not use HPSS to back up your laptops or desktops.
Please contact your desktop support team about backup options.

Concurrent transfer limits

Each user should have no more than five (5) file actions executing concurrently, whether through HSI or HTAR and regardless of the client location.

There is also a global limit on concurrent file actions, so users sometimes receive "EIO" errors when the system is especially busy. To reduce the incidence of such errors, follow these recommendations:

  • Be aware that HPSS may still be busy executing an action after it appears to you to have been completed. By running commands in quick succession, you may inadvertently reach your limit because some actions remain in progress. Also, once HPSS begins to execute an action, it may continue to completion even if you cancel the HSI or HTAR command.
  • Don't submit numerous HSI transfer commands (cput or cget, for example), whether in a single session or across multiple sessions. Each one constitutes a file action request.
  • If you are encountering EIO errors, submit fewer HTAR commands simultaneously or in rapid succession. An HTAR read opens both an index file and a tar file, so you might reach your limit sooner than you expect.

These limits help ensure that all users have reasonable access to HPSS.
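
One way to stay within the per-user limit from a script is to funnel every transfer through a fixed-size worker pool. The following Python sketch is illustrative only: run_file_actions and the placeholder action are hypothetical names, not part of HSI or HTAR, and each callable stands in for whatever command you would actually run. Because HPSS may still be working on an action after the client returns, a pool of five does not guarantee that you stay under the limit; a smaller pool is the more cautious choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Per-user limit stated in the policy above.
MAX_CONCURRENT_ACTIONS = 5


def run_file_actions(actions, max_concurrent=MAX_CONCURRENT_ACTIONS):
    """Run zero-argument callables with at most `max_concurrent` in flight.

    Results are returned in the same order as `actions`. An exception raised
    by any action is re-raised when its result is collected.
    """
    if max_concurrent < 1:
        raise ValueError("max_concurrent must be at least 1")
    with ThreadPoolExecutor(max_workers=max_concurrent) as pool:
        futures = [pool.submit(action) for action in actions]
        return [future.result() for future in futures]


if __name__ == "__main__":
    def placeholder_action(name):
        # Hypothetical stand-in for a real transfer (for example, a
        # subprocess call that runs one HSI or HTAR command).
        def action():
            time.sleep(0.01)
            return name
        return action

    names = [f"file{i:02d}" for i in range(12)]
    done = run_file_actions([placeholder_action(n) for n in names], max_concurrent=3)
    print(len(done), "actions completed")
```

Submitting all actions up front and letting the pool throttle them keeps the calling code simple; the pool size is the only knob that needs to change if you want to be more conservative.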

Use HTAR for efficient transfers

We highly recommend using HTAR rather than HSI if you need to archive large numbers of individual files that are smaller than 1 MB. Failure to follow this guideline can result in suspension of your access to HPSS.

Transferring hundreds or thousands of small files with HSI commands significantly slows the entire system for all users. HTAR is more efficient for archiving and reduces the time it takes to retrieve files later.
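
As a rough illustration of this guideline, the Python sketch below separates a set of files into those small enough to bundle with HTAR and those that could reasonably be transferred individually. The function names are hypothetical, the 1 MB threshold is read here as 1,000,000 bytes (an assumption), and the htar -cvf invocation should be checked against the HTAR documentation before use. The sketch only builds the command; it does not run it.

```python
# Assumption: "1 MB" is interpreted as 10**6 bytes.
SMALL_FILE_BYTES = 1_000_000


def plan_archive(file_sizes, threshold=SMALL_FILE_BYTES):
    """Split {name: size_in_bytes} into (bundle, individual) name lists.

    `bundle` holds files below the threshold, which the guideline says to
    archive together with HTAR. `individual` holds the remaining files.
    """
    bundle = sorted(n for n, size in file_sizes.items() if size < threshold)
    individual = sorted(n for n, size in file_sizes.items() if size >= threshold)
    return bundle, individual


def htar_command(archive_name, members):
    """Build (but do not run) an HTAR create command as an argument list."""
    return ["htar", "-cvf", archive_name, *members]


if __name__ == "__main__":
    sizes = {"run1.log": 4_200, "notes.txt": 900, "output.nc": 250_000_000}
    small, large = plan_archive(sizes)
    print(" ".join(htar_command("small_files.tar", small)))
    print("transfer individually:", large)
```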

HPSS and outside systems

Transferring data between HPSS and computers outside of the UCAR security perimeter is a two-stage process: copy the data between HPSS and disk, then use one of our data transfer methods (for example, Globus, SCP, or SFTP) to move the data between that disk and the outside system.


Data storage policies

HPSS creates one copy of each file by default. These files are not backed up within HPSS or in any other repository. You are responsible for replicating any data that you feel should be stored at an additional location.

You can have additional copies created in HPSS at additional charge. Please contact the CISL Help Desk if you would like more information.

Warning: Files deleted from or overwritten on HPSS cannot be recovered.

Retention

Files in HPSS have no expiration date, and we do not alter or remove them except as described below or in the case of violations of UCAR’s policies for access and use of computer systems. In return, you are required to be actively engaged in the disposition of the data that you store in HPSS. You also are responsible for managing your own data files and ensuring that charges are applied to the correct projects.

Requesting file deletions

In a few exceptional situations, you may need to request our assistance in modifying or deleting files from HPSS. Here are two examples:

  • If you have files in HPSS but no longer have access to CISL resources (and have no need for access), simply ask that we delete or change the ownership of the files.
  • A project lead may ask us to delete or reassign ownership of files that were created by a former user. If the former user still has an active login and another active project on CISL resources, please work directly with the user to make the necessary changes. Otherwise, contact us with a compelling reason to make an exception to the standard procedure.

In both cases, CISL staff will verify your identity before deleting any data.

Curation

If data sets that you no longer need are of special community interest or have historical importance, contact the CISL Data Support Section (rdahelp@ucar.edu) to inquire about transferring them to the NCAR Research Data Archive (RDA) for curation, preservation, and storage.

RDA generally does not preserve the results of computational analyses. Most often, it focuses on observational and similarly irreproducible data sets.

Orphaned and abandoned data

Files that are stored in HPSS are considered orphaned when they are no longer associated with an active project that can be charged. This may occur if a project expires or is otherwise closed. Files also may be orphaned when a student user leaves.

CISL will notify you by email about orphaned files and retain them for a grace period of eight (8) months. Ensure that we have your current email address; otherwise, we can try to find you only as staff time permits.

To check for orphaned files yourself, log in to Cheyenne and run: myorphans

If you have orphaned files in HPSS, please do one of the following:

  • Delete the files that you no longer need; associate others with an active project. (See instructions below.)
  • Move your files to a non-NCAR storage location.
  • Request a University Small Allocation (qualified university researchers only) to retain them in HPSS.

Orphaned files are considered abandoned after the eight-month grace period. They are then moved to a location accessible only with CISL assistance and retained for four (4) additional months, during which CISL will make no further attempts to contact the files' owners. The data are deleted automatically at the end of the period of abandonment.
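
The timeline above can be worked through with a short calculation. This Python sketch assumes calendar-month arithmetic with the day clamped to the end of shorter months; how CISL actually computes the dates is not stated here, so treat the results as estimates. The function names are hypothetical.

```python
import calendar
from datetime import date

GRACE_MONTHS = 8      # orphaned files are retained this long before abandonment
ABANDONED_MONTHS = 4  # abandoned files are retained this much longer


def add_months(start, months):
    """Return `start` shifted forward by whole calendar months.

    The day is clamped to the last day of the target month when needed
    (for example, January 31 plus one month gives the last day of February).
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def orphan_timeline(orphaned_on):
    """Estimate (abandoned_on, deleted_on) for files orphaned on a given date."""
    abandoned_on = add_months(orphaned_on, GRACE_MONTHS)
    deleted_on = add_months(abandoned_on, ABANDONED_MONTHS)
    return abandoned_on, deleted_on


if __name__ == "__main__":
    abandoned, deleted = orphan_timeline(date(2019, 3, 15))
    print(abandoned, deleted)  # prints: 2019-11-15 2020-03-15
```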

Delete an orphaned file

To delete a file from HPSS, use the HSI rm or delete command.

 [HSI]/home/username-> rm filename 

To delete all of the files in a directory, use the HSI rm or delete command and the path name.

 [HSI]/home/username-> rm directory/* 

Check a file's project code

To identify a file’s associated project code, use ls -U followed by the name of the file.

 [HSI]/home/username-> ls -U filename 

To identify associated project codes for all of the files in a directory, use a wildcard.

 [HSI]/home/username-> ls -U * 

Change project codes

To associate an orphaned file with a different project code, use the HSI chacct command with the new project code (for example, UUOM0001) and the filename.

 [HSI]/home/username-> chacct UUOM0001 filename 

To associate all of the files in a directory with a different project code, use the HSI chacct command with the -R option, the new project code, and the directory path.

 [HSI]/home/username-> chacct -R UUOM0001 directory 

For more information about using the chacct command, please see the HSI Reference Manual.

Data integrity

While HPSS is a highly reliable storage system, NCAR does not guarantee that your data are immune to loss or damage. In the highly unlikely event that your files are affected by breakage of a storage tape, we may not be able to restore them entirely.

Any potential loss is more likely to be the result of mistakenly removing or overwriting stored files. To avoid such losses, please review our documentation on permissions and creating additional copies.
