File size guidelines for HPSS

Archiving files in the robotic tape libraries of the High Performance Storage System (HPSS) is very different from writing files to disk storage such as the GLADE file spaces. Individual use cases and workflows help determine when it is appropriate to archive files to HPSS for long-term storage.

When you do need to archive files, be aware of the impact that file size can have on storing and retrieving files, as described in these guidelines.

Preferred file size range

File sizes in the gigabyte range are preferred for storing in HPSS. The system is used most efficiently when data is stored as a few files of hundreds of gigabytes each.

Considerations for very large files

Transferring files that are 1 TB or larger increases the risk of poor system performance. It also increases the risk, though still very low, of losing a file that contains a large amount of data. Although the file size limit in HPSS is 5 TB, we recommend keeping files to 1 TB or smaller.
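As an illustration, the two thresholds above can be checked before a transfer is started. The following is a minimal sketch, not an HPSS tool: the function name and return strings are hypothetical, and it assumes decimal units (1 TB = 10^12 bytes) and that a file of exactly 1 TB still falls within the recommendation.

```python
TB = 10**12  # assumption: decimal terabytes

RECOMMENDED_MAX_BYTES = 1 * TB  # recommended upper bound from the guideline
HARD_LIMIT_BYTES = 5 * TB       # file size limit stated for HPSS


def check_archive_size(size_bytes: int) -> str:
    """Classify a file size against the 1 TB and 5 TB thresholds."""
    if size_bytes < 0:
        raise ValueError("size_bytes must be non-negative")
    if size_bytes > HARD_LIMIT_BYTES:
        return "exceeds limit"
    if size_bytes > RECOMMENDED_MAX_BYTES:
        return "allowed, not recommended"
    return "ok"


for size in (300 * 10**9, 2 * TB, 6 * TB):
    print(size, check_archive_size(size))
```

For a file on disk, the size could be obtained with os.path.getsize() and passed to the same function.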

Also keep in mind that the peak transfer rate for HPSS tape drives is approximately 160 MBps, so retrieving a 5 TB file from tape may take nine hours or more.
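The estimate above follows from dividing file size by transfer rate. The sketch below works through that arithmetic, assuming decimal units (1 TB = 10^12 bytes, 160 MBps = 160 x 10^6 bytes per second) and a transfer that sustains the peak rate throughout. The function name is hypothetical, and real retrievals also include tape mount and positioning time, so the result is a lower bound.

```python
TB = 10**12        # assumption: decimal terabytes
PEAK_RATE = 160e6  # bytes per second, from the approximate 160 MBps peak rate


def transfer_hours(size_bytes: float, rate_bytes_per_s: float = PEAK_RATE) -> float:
    """Return the minimum transfer time in hours at a sustained rate."""
    if size_bytes < 0 or rate_bytes_per_s <= 0:
        raise ValueError("size must be >= 0 and rate must be > 0")
    return size_bytes / rate_bytes_per_s / 3600


print(f"5 TB: {transfer_hours(5 * TB):.1f} hours")  # 5 TB: 8.7 hours
print(f"1 TB: {transfer_hours(1 * TB):.1f} hours")  # 1 TB: 1.7 hours
```

At the peak rate a 5 TB file needs roughly 8.7 hours, which is consistent with the figure of nine hours or more once overhead is included.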

Avoid small files

Avoid transferring many small files, meaning those in the megabyte range or smaller. Moving numerous individual files to and from tape is inefficient. It can become very time-consuming and can slow the system for all users.

When you need to store many small files, use one of these two approaches:

  • Use HTAR to transfer them together as a single archive file. HTAR can bundle individual “member” files as large as 68 GB into one archive file and store it on HPSS.

  • If you need to create an archive with any member files that are larger than 68 GB, use a tar command to bundle the member files and then transfer the resulting tar file with the HSI cput command.
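The choice between the two approaches above depends only on the size of the largest member file. The sketch below shows one way to make that decision and build the corresponding command lines without running them. The function name, file names, and HPSS path are hypothetical. The 68 GB limit is taken as 68 x 10^9 bytes, which is an assumption about units, and the exact htar, tar, and hsi invocations should be checked against your site's documentation before use.

```python
import posixpath
import shlex
from typing import Dict, List

HTAR_MEMBER_LIMIT = 68 * 10**9  # assumption: 68 GB in decimal bytes


def plan_commands(member_sizes: Dict[str, int], hpss_archive: str) -> List[List[str]]:
    """Return the commands (as argument lists) needed to archive the members.

    member_sizes maps each local file path to its size in bytes.
    hpss_archive is the destination path of the archive file on HPSS.
    """
    if not member_sizes:
        raise ValueError("no member files given")
    members = sorted(member_sizes)
    if max(member_sizes.values()) <= HTAR_MEMBER_LIMIT:
        # Every member fits within the HTAR limit: bundle and store in one step.
        return [["htar", "-cvf", hpss_archive, *members]]
    # At least one member is too large for HTAR: build a local tar file,
    # then store it with the HSI cput command.
    local_tar = posixpath.basename(hpss_archive)
    return [
        ["tar", "-cvf", local_tar, *members],
        ["hsi", "cput", local_tar, ":", hpss_archive],
    ]


# Hypothetical example: two small files bound for a hypothetical HPSS path.
sizes = {"run1/a.nc": 2 * 10**6, "run1/b.nc": 3 * 10**6}
for cmd in plan_commands(sizes, "/home/username/run1.tar"):
    print(shlex.join(cmd))
```

Returning argument lists rather than running the commands keeps the sketch safe to try on any machine. A list can be passed to subprocess.run() once the commands have been verified.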