Storage Services in CHPC's Protected Environment (PE)

In the PE, the CHPC currently offers four types of storage: home directories, project space, scratch file systems, and an archive storage system. All storage types except the archive storage system are accessible from all CHPC PE resources. File services on Mammoth, the CHPC PE storage system, are encrypted. Data on the archive storage system must be moved to one of the other spaces in order to be accessible as POSIX files. See the Data Transfer Services page for information on moving data to and from CHPC PE storage.

  Please remember that you should always have an additional copy, and possibly multiple copies, of any critical data on independent storage systems. While storage systems built with data resiliency mechanisms (such as RAID and erasure coding mentioned in the offerings listed below or other, similar technologies) allow for multiple component failures, they do not offer any protection against large-scale hardware failures, software failures leading to corruption, or the accidental deletion or overwriting of data. Please take the necessary steps to protect your data to the level you deem necessary.

Home Directories

The CHPC provides all PE users with a default 50 GB home directory space on the Mammoth storage system.

The size of a user's home directory space is enforced with a quota. There is a soft quota of 50 GB and a hard quota of 75 GB. Once a user's directory exceeds the soft quota, they have seven days to clean up and return to below the soft quota amount. After seven days, they will no longer be able to write in their home directory until they clean up so that they are under the soft quota. Immediately upon exceeding the hard quota, a user will no longer be able to write any data to their home directory until they clean up so they are no longer over this quota. When over quota, you will not be able to start a FastX or Open OnDemand session, as those tasks write to your home directory, but an SSH session can be used to connect and free up space. To find which files are taking up space in your home directory, run the command du -h --max-depth=1 from your home directory.
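
For example, from an SSH session the following commands show where the space is going (a sketch; the sort step assumes GNU coreutils, which is typical on Linux systems):

    cd ~                                # start in your home directory
    du -h --max-depth=1 | sort -h       # space used by each top-level directory, largest last
    du -sh ~                            # total usage to compare against the 50 GB soft quota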

  This space is backed up; for details on the backup schedule, see 3.1 File Storage Policies.

The CHPC does not offer larger home directories in the PE. Instead, users should make use of project spaces to store data.

Project Level Storage File System

The CHPC offers project space, equivalent to group space in the General Environment, for groups to store research data. Each project has its own project space. Access to this space is controlled by extended access control lists (ACLs) such that only users who are part of the project are allowed access to the space. For IRB-governed projects, the users given access must be listed on the IRB. For non-IRB-governed projects, the PI of the project will be consulted before the CHPC gives any user access to the space. Project space is on shared hardware and is not designed for running jobs that have high IO requirements; running such jobs on project space can bog down the system and cause issues for other groups on the hardware. Please refer to the scratch file system information below. If interested, a more detailed description of this storage offering is available.
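
For illustration, members of a project can view the extended ACLs that control access to their project space with the standard getfacl command (the path below is a placeholder for your group's actual project directory):

    # List owner, group, and extended ACL entries on a project directory
    getfacl /path/to/your/project_space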

The CHPC offers 250 GB of group space to all PIs in the protected environment. PIs who require more than the 250 GB allocation can purchase larger group space. The CHPC purchases the hardware for this storage in bulk and sells it to individual groups in TB increments, so depending on how much space you need, the CHPC may already have storage on hand to meet your request. The current pricing is $150/TB for the lifetime of the hardware, without backups. Hardware is purchased with a 5-year warranty, and we are usually able to obtain an additional 2 years of warranty after purchase. The CHPC provides a backup option at $450/TB. If you are interested in purchasing project space or backing up existing project space, please contact us at helpdesk@chpc.utah.edu to discuss options and implement a backup plan. For details on the current backup policy of the PE project space, see 3.1 File Storage Policies.

Scratch File Systems

Scratch space is a high-performance temporary file system for files being accessed and operated on during jobs. It is recommended to transfer data from project space to scratch when running IO-intensive jobs, as the scratch systems are designed for better performance and this prevents project spaces from getting bogged down. These scratch file systems are not backed up. Files that have not been accessed for 60 days are automatically scrubbed. The CHPC provides two scratch file systems available free of charge in the protected environment.

  Scratch space is not intended for long-term file storage. Files in scratch spaces are deleted automatically after a period of inactivity.

If you have questions about using the scratch file systems or IO-intensive jobs, please contact us at helpdesk@chpc.utah.edu.

The current scratch file systems are:

  • /scratch/general/pe-nfs1, a 280 TB NFS system accessible from all protected environment CHPC resources
  • /scratch/general/pevast, a 100 TB flash-based file system accessible from all protected environment CHPC resources
    • There is a per-user quota of 10 TB on this scratch file system
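
As a sketch of staging data to scratch for an IO-intensive job, using the pevast file system listed above (the project-space path and run directory name are placeholders):

    # Stage input data from project space to scratch before the job
    mkdir -p /scratch/general/pevast/$USER/myrun
    rsync -a /path/to/your/project_space/inputs/ /scratch/general/pevast/$USER/myrun/inputs/

    # ... run the job against the copy in scratch ...

    # Copy results back to project space and clean up when finished
    rsync -a /scratch/general/pevast/$USER/myrun/results/ /path/to/your/project_space/results/
    rm -rf /scratch/general/pevast/$USER/myrun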

Temporary File Systems

/scratch/local

Each node on the cluster has a local disk mounted at /scratch/local that can also be used for storing intermediate files during calculation. Because it is local to the node, this will have lower-latency file access; however, be aware that these files are only accessible on the node and should be moved to another shared file system (home, group, scratch) before the end of the job if they are needed after job completion.

Access permissions to /scratch/local have been set such that users cannot create directories in the top-level /scratch/local directory. Instead, as part of the Slurm job prolog (before the job is started), a job level directory, /scratch/local/$USER/$SLURM_JOB_ID, will be created. Only the job owner will have access to this directory. At the end of the job, in the Slurm job epilog, this job level directory will be removed.

All Slurm scripts that make use of /scratch/local must be adapted to accommodate this change. Additional updated information is provided on the CHPC Slurm page.
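
For example, a job script can use the prolog-created directory rather than creating its own top-level directory under /scratch/local (a sketch; the input file, output file, and application command are placeholders):

    #!/bin/bash
    #SBATCH --nodes=1
    #SBATCH --time=01:00:00

    # The Slurm prolog has already created this job-level directory;
    # only the job owner can access it, and the epilog removes it at job end.
    SCRDIR=/scratch/local/$USER/$SLURM_JOB_ID

    # Stage input from the submission directory to node-local scratch
    cp $SLURM_SUBMIT_DIR/input.dat $SCRDIR/
    cd $SCRDIR

    my_program input.dat > output.dat    # placeholder application command

    # Copy results back to a shared file system before the job ends
    cp output.dat $SLURM_SUBMIT_DIR/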

/scratch/local is now software-encrypted. Each time a node is rebooted, this software encryption is set up again, purging all contents of this space. There is also a cron job in place to scrub /scratch/local of content that has not been accessed for over 2 weeks. This scrub policy can be adjusted on a per-host basis; a group can opt to have us disable it on a group-owned node, and it will not run on that host.

/tmp and /var/tmp

Linux defines temporary file systems at /tmp and /var/tmp. CHPC cluster nodes set up these temporary file systems as a RAM disk with limited capacity. All interactive and compute nodes also have spinning-disk local storage at /scratch/local. If a user program is known to need temporary storage, it is advantageous to set the environment variable TMPDIR to point to /scratch/local (within a Slurm job, the job-level directory described above). Local disk drives range from 40 to 500 GB depending on the node, which is much more than the default /tmp size.
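
For example, a job script can redirect temporary files to the job-level local scratch directory described above (a sketch):

    # Direct temporary files to node-local disk instead of the small RAM-backed /tmp
    export TMPDIR=/scratch/local/$USER/$SLURM_JOB_ID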

Archive Storage

Elm

The CHPC offers an archive storage solution based on object storage, specifically Ceph, a distributed object store suite developed at UC Santa Cruz. With the current cluster configuration we offer $150/TB for the 7-year lifetime of the hardware. In alignment with our current project space offering, we operate this space in a condominium-style model, reselling it in TB chunks. If interested, a more detailed description of this storage offering is available.

One of the key features of the archive system is that users can manage the archive directly. Users can move data in and out of the archive storage as needed: they can archive milestone moments in their research, store an additional copy of crucial instrument data, and retrieve data as needed. Ceph presents the storage as an S3 endpoint, which allows the archive storage solution to be accessed via applications that use Amazon’s S3 API, such as s3cmd and rclone.
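
As an illustration, a minimal rclone remote for an S3-compatible Ceph endpoint could be defined in ~/.config/rclone/rclone.conf along the following lines; the remote name, endpoint URL, bucket name, and credentials shown here are placeholders, and the CHPC supplies the actual values when archive space is provisioned:

    # ~/.config/rclone/rclone.conf (remote name, keys, and endpoint are placeholders)
    [elm-archive]
    type = s3
    provider = Ceph
    access_key_id = YOUR_ACCESS_KEY
    secret_access_key = YOUR_SECRET_KEY
    endpoint = https://your-chpc-s3-endpoint.example

Data can then be moved in and out with standard rclone commands, for example:

    rclone copy /path/to/local/data elm-archive:mybucket/data   # upload
    rclone ls elm-archive:mybucket                              # list contents
    rclone copy elm-archive:mybucket/data /path/to/restore      # retrieve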

This space is a standalone entity and is not mounted on other CHPC PE resources. Elm is currently the backend storage used for CHPC-provided automatic backups (e.g., backed-up project or home space); as such, groups looking for additional data resiliency that already have spaces backed up by the CHPC may want to look for other options.

User-Driven Backup Options

Campus-level options for a backup location include Box and Microsoft OneDrive.

  There is a UIT Knowledge Base article with information on the suitability of the campus level options for different types of data (public/sensitive/restricted). Please follow these university guidelines to determine a suitable location for your data.

Owner backup to University of Utah Box: This is an option suitable for sensitive/restricted data. See the link above to get more information about the limitations. If using rclone, the credentials expire and have to be reset periodically.

Owner backup to University of Utah Microsoft OneDrive: As with Box, this option is suitable for sensitive/restricted data. See the link above to get more information about the limitations.

Owner backup to CHPC archive storage (Elm in the Protected Environment): This choice, mentioned in the archive storage section above, requires that the group purchase space on the CHPC's archive storage.

Owner backup to other storage external to CHPC: Some groups have access to other storage resources, external to the CHPC, whether at the University of Utah or at other sites. The tools that can be used for doing this are dependent on the nature of the target storage. It is the researcher's responsibility to ensure data is stored in a location appropriate for the type of data being stored.

There are a number of tools, mentioned on our Data Transfer Services page, that can be used to transfer data for backup. The tool best suited for transfers to object storage file systems is rclone. Other tools include fpsync, a parallel version of rsync suited for transfers between typical Linux "POSIX-like" file systems, and Globus, best suited for transfers to and from resources outside of the CHPC.
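
As a sketch, once a remote for the backup target has been configured with rclone config (for example a Box, OneDrive, or S3 remote), a one-way backup of a project directory might look like the following; the local path and remote name are placeholders:

    # Mirror a project directory to a configured remote (path and remote name are placeholders)
    rclone sync /path/to/your/project_space box-backup:project_space_backup --progress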

If you are considering a user-driven backup option for your data, CHPC staff are available for consultation at helpdesk@chpc.utah.edu.

Additional Information

For more information on CHPC data policies, visit the File Storage Policies page.

Last Updated: 1/13/25