CISL best practices


The practices described below will help you make the most of your computing and storage allocations.


Sharing resources

Limit your use of shared licenses

Users of the resources that CISL manages share a limited number of licenses for running IDL, MATLAB, Mathematica, and some other applications. Be familiar with and follow the license-use guidelines that have been established to ensure fair access for all users. CISL reserves the right to kill jobs/tasks of users who monopolize these licenses.

Don't monopolize compute resources

Be considerate of other users when planning and scheduling jobs, particularly jobs on shared clusters such as Geyser and Caldera. For example, to keep from monopolizing these resources, avoid writing job submission scripts that rapidly fill the scheduler with requests for compute resources that could run concurrently. Contact the Consulting Services Group for guidance if your workload will require you to submit numerous jobs in a short timeframe. CISL monitors the use of these resources and will kill jobs when necessary to ensure fair access.


Managing files

Use HPSS only for long-term storage

Use the HPSS tape archive to store only the data that you need to save long-term. Rather than routinely copying output to HPSS right after you complete a simulation run, for example, use the large GLADE scratch space and save the data to HPSS only after post-processing.

Using the tape archive only for long-term storage helps conserve your storage allocation and allows the HPSS system to run more efficiently for everyone.

See HPSS: Using your terabyte-years wisely

Use scratch space for temporary files

The GLADE scratch file space is a temporary space for data that will be analyzed and removed within a short amount of time. It is also the recommended space for temporary files that would otherwise reside in small /tmp or /var/tmp directories that many users share. See Storing temporary files with TMPDIR for more information.
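As a rough illustration of the TMPDIR approach, the Python sketch below creates a temporary file in whatever directory the TMPDIR environment variable names, instead of falling back to a shared /tmp. The helper name scratch_tempfile is hypothetical, and the sketch assumes you have already set TMPDIR to a directory in your scratch space.

```python
import os
import tempfile


def scratch_tempfile(suffix=".tmp"):
    """Open a named temporary file in the directory that TMPDIR points to.

    Hypothetical helper: it refuses to fall back to /tmp or /var/tmp so that
    temporary data never lands in the small shared directories.
    """
    tmpdir = os.environ.get("TMPDIR")
    if not tmpdir:
        raise RuntimeError("TMPDIR is not set; point it at a directory in your scratch space")
    os.makedirs(tmpdir, exist_ok=True)
    # The file is deleted automatically when it is closed.
    return tempfile.NamedTemporaryFile(dir=tmpdir, suffix=suffix)
```

Used as a context manager (with scratch_tempfile(".dat") as tmp: ...), the file is removed as soon as the block exits, so nothing is left behind in the scratch space.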

Store large files

Storing large files, such as tar files, is more efficient than storing numerous small files. This is because the system allocates a minimum amount of space for each file, no matter how small. On /glade/scratch, where the block size is 4 MB, the smallest amount of space the system can allocate to a file, including directories and symlinks, is 128 KB (the "sub-block" size). Any files smaller than 128 KB are still allocated 128 KB, so they require more space than you might expect. The same applies to the work and project spaces in /glade/p, while /glade/u has a block size of 512 KB and a minimum sub-block allocation of 16 KB.
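The arithmetic behind this can be sketched in a few lines of Python. The function below is only an illustrative model: it assumes space is handed out in whole sub-blocks (128 KB by default, per the figures above) and that 1 KB is 1,024 bytes. The function name and the example file counts are made up for the illustration.

```python
import math

KB = 1024


def allocated_bytes(file_size, sub_block=128 * KB):
    """Estimate the space one file consumes when allocation is in whole sub-blocks."""
    # Even an empty file, a directory, or a symlink occupies one sub-block.
    return max(1, math.ceil(file_size / sub_block)) * sub_block


# Example: 10,000 files of 4 KB each on a file system with 128 KB sub-blocks.
n_files, size = 10_000, 4 * KB
used = n_files * allocated_bytes(size)
print(f"data: {n_files * size / KB**2:.1f} MB, allocated: {used / KB**2:.1f} MB")
# prints: data: 39.1 MB, allocated: 1250.0 MB
```

Under these assumptions, about 39 MB of data occupies 1,250 MB of disk, whereas the same data bundled into a single tar file would occupy roughly its actual size. Passing sub_block=16 * KB models the /glade/u case.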

Also see File size guidelines for HPSS.

Configure jobs to avoid massive directories

Ensemble runs, data assimilation runs, and other jobs can generate tens or hundreds of thousands of output files, log files, and others over time. Such large numbers of files can be difficult to manage and remove from GLADE file spaces when they are no longer needed. Configuring jobs to place no more than 2,000 to 3,000 files in a single directory will make them easier to manage. See Removing large numbers of files for how to remove massive accumulations of files.
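One way to stay under that limit is to spread output across numbered subdirectories as it is written. The Python sketch below is a minimal example under assumed conventions: the batch_NNNNN directory naming, the function names, and the default of 2,000 files per directory are illustrative choices, not a CISL requirement.

```python
import os


def bucketed_path(root, index, filename, max_per_dir=2000):
    """Return a path that puts at most max_per_dir files in each subdirectory.

    index is the running number of the output file (0, 1, 2, ...).
    """
    bucket = index // max_per_dir
    return os.path.join(root, f"batch_{bucket:05d}", filename)


def write_output(root, index, filename, data):
    """Write data (bytes) to its bucketed location, creating the directory if needed."""
    path = bucketed_path(root, index, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        f.write(data)
    return path
```

With this layout, files 0 through 1,999 go to batch_00000, files 2,000 through 3,999 go to batch_00001, and so on, so each directory can later be listed, archived, or removed as a unit.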

Avoid sharing storage spaces

If you have an account for using the supercomputers and the analysis and visualization systems that CISL manages, you have your own home directory in the GLADE environment. Other users have their own home directories, too, so it isn't necessary for you to share yours by giving others write permission.

Sharing often leads to unnecessary confusion over file ownership as your work progresses. If you and your colleagues need to write files to a common space, consider using a work space or project space, and plan to back those files up in our High Performance Storage System. (Do not share your HPSS home directory.)

Set permissions when you create files

Set permissions when you create a file. While file ownership and permissions can be changed after the fact, establishing them when you create the file will simplify your life and save you time and effort later.
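In a script, this can be done by passing the permission bits to the call that creates the file. The following Python sketch assumes a POSIX system; the helper name create_with_mode and the 0o640 mode (owner read/write, group read) are illustrative choices only.

```python
import os
import stat
import tempfile


def create_with_mode(path, mode=0o640):
    """Create a new file with the given permission bits and return it opened for writing."""
    # O_EXCL makes the call fail rather than silently reuse an existing file,
    # whose permissions would not be changed by the mode argument.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_EXCL, mode)
    return os.fdopen(fd, "w")


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "results.txt")
    with create_with_mode(path, 0o640) as f:
        f.write("example\n")
    print(oct(stat.S_IMODE(os.stat(path).st_mode)))
```

The mode that is actually applied is the requested mode with the bits in your process umask removed, so the printed value depends on your umask setting.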

Always specify project codes

Always specify a project code for charging purposes when using HPSS—even if it is your default project code. Why? Your default may change. By always specifying the project code, you make it easier to change in your jobs and scripts when necessary. It also helps you manage HPSS charges and avoid surprises down the road.

Organize for efficiency

Organize your files and keep them that way. Arrange them in same-purpose trees, for example. Say you have 20 TB of Mount Pinatubo volcanic aerosols data. Keep the files in a subdirectory such as /glade/u/home/username/pinatubo rather than scattered among unrelated files or in multiple directories. Specialized trees are easier to share with other users and to transfer to other users or projects as necessary.

Back up critical files

Back up files that are critical to your project. Our HPSS is a highly reliable system, but files stored there are not backed up. In the unlikely event that your files are affected by breakage of a storage tape, we may not be able to restore them quickly or entirely. Consider storing two copies of your most critical files in HPSS or keeping one in HPSS and one in another repository. You are responsible for replicating any data that you feel should be stored at an additional location.

Remove unneeded data

Periodically examine your HPSS holdings and remove files that you no longer need. This reduces charges against your storage allocation and makes the HPSS system more efficient for everyone.

Don't leave orphaned files

Don't leave orphaned files behind. Before your involvement in a project ends, transfer your files or arrange for someone else to take ownership of the files.


Managing allocations

Monitor usage charges

Check your usage charges frequently to help ensure that you are using CISL resources as efficiently as possible. Understand how your choice of queues affects charges against your allocation, and be aware of other allocation-related policies. See Managing allocations and charges.

If you are authorized to charge your work against multiple projects, check your usage for each project on a regular basis to ensure that you are charging jobs correctly. This will help you avoid overrunning one of your accounts needlessly. Also make sure that others who are authorized to charge against your allocation understand how to use resources efficiently.

Contact CISL consultants

Before you run a set of jobs that will consume a large portion of your allocation—a long experiment, for example—ask the Consulting Services Group to review your job configuration. One of our consultants may be able to suggest an economical workflow that will help you conserve computing resources. This is especially important if you are unfamiliar with job configuration or with how to manage your allocation efficiently.

Optimize on a single processor

Minimize your use of computing resources and conserve your allocation by optimizing your code on a single processor before running larger jobs in production. Use optimizing libraries if your code lends itself to that.


Transferring data

Use HTAR to archive large numbers of files

We highly recommend using HTAR rather than HSI if you need to archive large numbers of individual files that are smaller than 1 MB. Failure to follow this guideline can result in suspension of your access to HPSS.

Transferring hundreds or thousands of small files with HSI commands significantly slows the entire system for all users. HTAR is more efficient for archiving and reduces the time it takes to retrieve files later.
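Before archiving a directory tree, it can help to check how many of its files fall below the 1 MB threshold. The Python sketch below only inspects local files and does not call HTAR or HSI. The function name small_files is hypothetical, and treating 1 MB as 1,048,576 bytes is an assumption.

```python
import os

MB = 1024 * 1024


def small_files(root, threshold=MB):
    """Yield paths of regular files under root that are smaller than threshold bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getsize(path) < threshold:
                yield path
```

If sum(1 for _ in small_files(some_directory)) turns out to be in the hundreds or thousands, that tree is a candidate for archiving with HTAR rather than transferring the files individually with HSI.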

Also see File size guidelines for HPSS.

Use Globus to transfer files

We recommend using Globus to transfer large files or data sets between our GLADE centralized file service and remote destinations such as XSEDE facilities. Globus provides a convenient, easy-to-use interface and offers a feature called Globus Connect Personal that enables users to move files easily to and from laptop or desktop computers and other systems. Secure Copy Protocol (SCP) works well for transferring a few relatively small files between systems.