
3.1 File Storage Policies

  1. CHPC Home/Group/Project Directory File Systems
    Many of the CHPC file systems are based on NFS (Network File System), and proper management of files is critical to the performance of applications and of the entire network. All files in home directories are NFS mounted from a fileserver, so every request for data must go over the network. Therefore, it is advised that all executables and input files be copied to a scratch directory before running a job on the clusters (see Scratch Disk Space below).
    1. Default home directory space in the general environment
      1. The general environment CHPC home directory file system (hpc_home) is available to users who have a CHPC account and do not have a department or group home directory file system maintained by CHPC (see item 1-2 below).
      2. This file system enforces quotas set at 50 GB per user. If you need a temporary increase on this limit, please let us know (helpdesk@chpc.utah.edu) and we may be able to provide the increase. A rough usage-estimate sketch appears at the end of this subsection.
      3. This file system is not backed up 
      4. Users are encouraged to move important data back to a file system that is backed up, such as a department file server.
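      The following is a minimal, illustrative Python sketch for estimating how much of the 50 GB home directory quota is in use. It sums apparent file sizes under the home directory, which is only an approximation; the file system's own quota accounting (block usage, hard links, etc.) is authoritative.

        import os

        # Illustrative only: estimate home directory usage against the 50 GB quota
        # by summing apparent file sizes. Real quota accounting (block usage, hard
        # links) may differ, so treat this as a rough estimate.
        QUOTA_GB = 50

        total_bytes = 0
        for root, dirs, files in os.walk(os.path.expanduser("~")):
            for name in files:
                try:
                    total_bytes += os.lstat(os.path.join(root, name)).st_size
                except OSError:
                    pass  # skip files that vanish or cannot be read

        print(f"Approximate usage: {total_bytes / 1e9:.1f} GB of {QUOTA_GB} GB")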
    2. Group-purchased home space in the general environment
      1. Departments or PIs with sponsored research projects who wish to have home directories larger than the default 50 GB can purchase chunks of space on the home system to be used for a group.
      2. Users in the group will have their home space provisioned in the group home rather than the shared HPC home space. 
      3. Home directory space purchases include full backup as described in the Backup Policies below.
      4. Any backups of owner home directory space run regularly by CHPC have a two-week retention period (see Backup Policies below).
      5. Quotas
        1. User and/or group quotas can be used to control usage
        2. The quota layer will be enabled, allowing usage reporting even if quota limits are not set
    3. Department or group space in the general environment
      1. Departments or PIs with sponsored research projects can work with CHPC to procure storage to be used as CHPC Home Directory or Group Storage
      2. Group Space does not come with backup by default, but the owner of the group storage can arrange for archival backup as described in the Backup Policies below.
      3. Usage Policies of this storage will be set by the owning department/group.
      4. When using shared infrastructure to support this storage it is still expected that all groups be 'good citizens'.
        1. Groups should utilize Scratch File Systems when running read- or write-heavy jobs.
        2. Groups should keep I/O utilization on group space reasonable and be mindful of frequent file access (e.g., cron jobs, automated scripts, file system traversal, etc.).
      5. Quotas
        1. User and/or group quotas can be used to control usage
        2. The quota layer will be enabled, allowing usage reporting even if quota limits are not set
      6. Any backups of owner home directory space run regularly by CHPC have a two-week retention period (see Backup Policies below).
      7. Life Cycle
        1. CHPC will support storage for the duration of the warranty period of the storage hardware that supports the group space.
        2. Shortly before the end of the warranty period, CHPC will reach out to let groups know of the upcoming retirement of the file system, giving the groups time to either purchase new group space or move their data outside of CHPC.
    4. Archive storage in the general environment
      1. CHPC maintains a Ceph object storage system in the general environment.
      2. The archive storage is NOT mounted on any of the CHPC systems
      3. This space is used for storage of CHPC-run backups
      4. Groups can purchase storage on this file system to use for owner-driven backups as well as for sharing of data
    5. Home directory space in the protected environment (PE)
      1. A PE home directory is provided for all users in the PE
      2. This file system enforces quotas set at 50 GB per user. CHPC will NOT increase the quota for any PE home directory space.
      3. CHPC provides backup of the PE home directories as described in the Backup Policies below.
    6. Project space in the protected environment (PE)
      1. Each PE project is provided with 250 GB of project space 
      2. Groups can purchase additional project space as needed
      3. The project space is not backed up by default; however, groups can arrange for archival backup as described in the Backup Policies below.
      4. Life Cycle
        1. CHPC will support storage for the duration of the warranty period of the storage hardware that supports the project space.
        2. Shortly before the end of the warranty period, CHPC will reach out to let groups know of the upcoming retirement of the file system, giving the groups time to either purchase new project space or move their data outside of CHPC.
    7. Archive storage in the protected environment (PE)
      1. CHPC maintains a Ceph object storage system in the protected environment (PE)
      2. The archive storage is NOT mounted on any of the CHPC systems
      3. This space is used for storage of CHPC-run backups of PE file systems
      4. Groups can purchase storage on this file system to use for owner-driven backups as well as for sharing of data within the PE
    8. Web Support from home directories
      1. Place HTML files in a public_html directory in your home directory, as shown in the sketch below
      2. The published URL is "http://home.chpc.utah.edu/~<uNID>"
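      As a minimal sketch of the workflow above (assuming the login name matches the uNID; otherwise substitute your uNID in the URL), the following Python snippet creates public_html in the home directory, writes a page into it, and prints the resulting URL:

        import getpass
        import os
        import stat

        # Create ~/public_html and publish a simple page there.
        public_html = os.path.expanduser("~/public_html")
        os.makedirs(public_html, exist_ok=True)

        page = os.path.join(public_html, "index.html")
        with open(page, "w") as f:
            f.write("<html><body><h1>Hello from CHPC</h1></body></html>\n")

        # The web server needs read access to the page; the directory itself
        # may also need world execute permission (o+x).
        os.chmod(page, stat.S_IRUSR | stat.S_IWUSR | stat.S_IRGRP | stat.S_IROTH)

        # Assumes the login name is the uNID.
        print(f"Published at: http://home.chpc.utah.edu/~{getpass.getuser()}/index.html")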
  2. Backup Policies
    1. Scratch file systems are not backed up
    2. The default HPC home file system in the general environment is not backed up
    3. Owned home directory space in the general environment and all home directories in the Protected Environment (PE): 
      1. The backup of this space is included in the price and occurs on the schedule of weekly full backups with daily incremental backups.  The retention window is two weeks.
      2. The backup is done to the archive storage (pando in the general environment, elm in the PE) 
    4. Group spaces in the general environment and project spaces in the protected environment are not backed up by CHPC unless the group requests backup and purchases the necessary space on the CHPC archive storage system of that environment (see 2-5 below). Note that CHPC documentation also provides information about user driven backup options on our storage page.
    5. Backup Service: 

       The time it takes to back up a space depends on several factors, including the size of the space and the number of files. With incremental backups, the time required and the space the incremental backup consumes depend on the turnover rate of the data, i.e., the amount of the space that has changed since the last backup. When a given group space takes longer than six days to back up, CHPC will work with the group to develop a feasible backup strategy. Typically, this involves determining which files are static (such as raw data files) and therefore only need to be backed up once and then saved (never overwritten in the backup location), which files regularly change and need to be backed up on a regularly scheduled basis, and which files do not need to be backed up at all. Note, this policy was updated 31-Jan-2024, effective 1-April-2024, changing the frequency of backups for group/project spaces that are under 5 TB from a monthly full backup to a quarterly full backup.

      1. For group or project spaces, CHPC will perform a quarterly full backup with weekly incremental backups. Once a new full backup is completed, the previous period's backup is deleted.
      2. For group or project spaces that cannot be backed up within six days, CHPC will also reach out to the group to determine a feasible backup strategy.
      3. Groups interested in a different backup schedule should reach out via helpdesk@chpc.utah.edu to discuss.
    6. To schedule this service, please:
      1. Send email to helpdesk@chpc.utah.edu
      2. Purchase the necessary archive space
      3. CHPC will perform the archive backup
      4. The archive space must be twice the capacity of the group space being archived, so that a copy of the previous backup is still available for protection if a disaster were to happen mid-archive run (for example, a 10 TB group space requires 20 TB of archive space).
  3. Scratch Disk Space: Scratch space for each HPC system is architected differently. CHPC offers no guarantee on the amount of /scratch disk space available at any given time.
    1. Local Scratch (/scratch/local):
      1. This space is on the local hard drive of the node and therefore is unique to each individual node and is not accessible from any other node. 
      2. This space is encrypted; each time the node is rebooted the encryption is reset from scratch, which in effect purges the contents of this space.
      3. /scratch/local on compute nodes is set such that users cannot create a directory under the top level /scratch/local space.
        1. As part of the Slurm job prolog (before the job is started), a job-level directory, /scratch/local/$USER/$SLURM_JOB_ID, is created and set such that only the job owner has access to it. At the end of the job, in the Slurm job epilog, this directory is deleted. A usage sketch follows at the end of this subsection.
      4. There is no access to /scratch/local outside of a job. 
      5. This space will be the fastest, but not necessarily the largest. 
      6. Users should use this space at their own risk.
      7. This space is not backed up
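      The following is a minimal Python sketch of using the per-job directory from inside a Slurm job: stage a (hypothetical) input file from the home directory into /scratch/local/$USER/$SLURM_JOB_ID, work there, and copy results back before the epilog deletes the directory. The file names and home directory paths are placeholders.

        import os
        import shutil

        # Per-job local scratch directory created by the Slurm prolog;
        # $USER and $SLURM_JOB_ID are set inside the job.
        job_scratch = os.path.join(
            "/scratch/local", os.environ["USER"], os.environ["SLURM_JOB_ID"]
        )

        # Stage input from the NFS-mounted home directory (placeholder path).
        shutil.copy2(os.path.expanduser("~/myproject/input.dat"), job_scratch)

        # ... run the computation with job_scratch as the working directory ...

        # Copy results back before the job ends; the epilog removes job_scratch.
        shutil.copy2(os.path.join(job_scratch, "output.dat"),
                     os.path.expanduser("~/myproject/"))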
    2. NFS Scratch:
      1. /scratch/general/pe-nfs1 is mounted on protected environment interactive and compute nodes
      2. /scratch/general/nfs1 is mounted on all general environment interactive and compute nodes
      3. /scratch/general/pe-vast is mounted on protected environment interactive and compute nodes
      4. /scratch/general/vast is mounted on all general environment interactive and compute nodes
      5. Scratch file systems are not intended for use as storage beyond the data's use in batch jobs
      6. Scratch file systems are scrubbed weekly of files that have not been accessed for over 60 days; a sketch for identifying at-risk files follows this subsection
      7. Each user will be responsible for creating directories and cleaning up after their jobs 
      8. None of the scratch file systems are backed up
      9. Each scratch file system may be subject to quotas per user depending on utilization and free space available on the system
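      The following is a minimal Python sketch for spotting files that are candidates for the weekly scrub, i.e., files not accessed in over 60 days. The scratch path is an example; point it at the directory you actually use, and copy anything you still need elsewhere before it is scrubbed.

        import os
        import time

        # Example path; substitute the scratch directory you actually use.
        scratch_dir = os.path.join("/scratch/general/vast", os.environ.get("USER", ""))
        cutoff = time.time() - 60 * 24 * 3600  # 60 days in seconds

        for root, dirs, files in os.walk(scratch_dir):
            for name in files:
                path = os.path.join(root, name)
                try:
                    if os.lstat(path).st_atime < cutoff:
                        print(path)  # candidate for the next scrub
                except OSError:
                    pass  # file may disappear while walking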
    3. Owner Scratch Storage
      1. It is configured and made available per the owner group's requirements.
      2. It is not subject to the general scrub policies that CHPC enforces on CHPC-provided scratch space.
      3. Owners/groups can request automatic scrub scripts to be run per their specifications on their scratch spaces.
      4. It is not backed up.
      5. The quota layer is enabled to facilitate usage reporting.
      6. Quota limits can be configured per the owner/group's needs.
  4. File Transfer Services
    1. Globus: The CHPC has a campus license for the web-based file transfer system Globus. See Globus for more information.
    2. DTNs: The CHPC offers the use of Data Transfer Nodes optimized for transfer of large datasets to CHPC resources. See Data Transfer Services for more information.
    3. Guest File Transfer: The CHPC offers a temporary guest transfer service to facilitate transfer of data from external collaborators to CHPC resources. See Guest Transfer Policy for more information.
Last Updated: 1/13/25