Removing large numbers of files

The recommended way to remove thousands or hundreds of thousands of files from a GLADE directory is by running a batch job.

Removing large numbers of files can take several hours, so you will need to provide enough wall-clock time in the job to accommodate this. You can use the sample script below with your own project code, job name, and other customizations.

Caution: Before removing large numbers of files, create a "play" directory in /glade/scratch/$USER and try the batch job on some throwaway files and subdirectories to make sure it does what you want (see the example below). Carefully check the path and file pattern you specify before submitting a job like this.
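
For example, a test setup might look like the following. The directory and file names shown are placeholders only; adjust them to match your own paths.

### Create a play directory with a few throwaway files (names are examples only)
mkdir -p /glade/scratch/$USER/play/subdir1
touch /glade/scratch/$USER/play/test_file_1 /glade/scratch/$USER/play/test_file_2
touch /glade/scratch/$USER/play/subdir1/test_file_3
### Then point the rm command in the job script below at the play directory:
### rm -fv /glade/scratch/$USER/play/test_file*
### Note: a non-recursive rm like this leaves files inside subdir1 untouched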

Create job script and submit

Create a job script using the following as an example. When your script is ready, submit it with the sbatch command (see the submission example after the script).

Specifying the hpss partition (-p hpss) tells the Slurm scheduler to run your job on special non-compute nodes that provide fast I/O for data-management tasks such as removing a large number of files from GLADE or retrieving files from HPSS.

#!/bin/tcsh
### bash users replace /tcsh with /bash
#SBATCH -J job_name
#SBATCH -n 1
#SBATCH --ntasks-per-node=1
#SBATCH -t 24:00:00
#SBATCH -A project_code
#SBATCH -p hpss
#SBATCH -e job_name.err.%J
#SBATCH -o job_name.out.%J
#SBATCH --export=TERM,HOME,SHELL

source /glade/u/apps/opt/slurm_init/hpss.csh
### bash users replace /hpss.csh with /hpss.sh

setenv TMPDIR /glade/scratch/$USER/temp
### bash users: export TMPDIR=/glade/scratch/$USER/temp
mkdir -p $TMPDIR

### See "man rm" for explanation of options f and v
rm -fv /glade/scratch/$USER/directory_name/files_to_remove*
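
Save the script and submit it from the command line with sbatch. The script file name below is just an example.

sbatch remove_files_job.sh

You can monitor the job with squeue -u $USER. Output and error messages are written to the files named by the -o and -e options in the script.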

