
Singularity

The advantage of Singularity over other container solutions is its focus on HPC environments, which includes support for parallel execution and GPUs. It is also more feature-rich and user-friendly.

CHPC provides containers for some applications. Users can also bring in their own containers, provided they include a few provisions for our systems, such as mount points for the home and scratch file systems. Finally, Singularity also allows importing Docker containers, most commonly from container repositories such as DockerHub.

Importing Docker Containers

Singularity has direct support for Docker containers. The Singularity and Docker page provides a good overview of Singularity's Docker container support. Below we list some basics along with some local caveats.

Running Docker container directly in Singularity

To start a shell in a Docker container using Singularity, simply point to the DockerHub container URL, e.g.

singularity shell docker://ubuntu:latest

Singularity scans the host file systems and mounts them into the container automatically, which allows CHPC's non-standard /uufs and /scratch file systems to be visible in the container as well. This obviates the need to create mount points for these file systems manually in the container and makes DockerHub containers very easy to deploy with Singularity.
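
For example, after starting the shell one can check that the CHPC file systems are visible inside the container (the Singularity> prompt is shown for illustration):

singularity shell docker://ubuntu:latest
Singularity> ls /uufs/chpc.utah.edu /scratch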

Similarly, we can run a program that's in a DockerHub container as

singularity exec docker://biocontainers/blast:2.2.31 blastp -help

Note that the Biocontainers repositories require the version number tag (following the colon) for Singularity to pull them correctly. The version can be found among the tags listed on the container's DockerHub page.

A good strategy for finding a container for a needed program is to go to hub.docker.com and search for the program name.

Converting a Docker image to Singularity

This approach is useful for speeding up container startup: at each pull or exec from DockerHub, Singularity builds a new Singularity container file, which may take a while if the container is large. The drawback of this approach is that the Singularity container has to be rebuilt manually whenever the DockerHub image is updated.

The process is also described on the Singularity and Docker page. For example, we can build a local bioBakery container by running:

singularity build bioBakery.sif docker://biobakery/workflows

This newly created bioBakery.sif container can then be run as:

singularity exec bioBakery.sif humann2 --help

This command will execute much faster than executing from the DockerHub pulled image:

singularity exec docker://biobakery/workflows humann2 --help
Checking if the container already exists

The container pull and build can be automated with a shell script that we wrote, update-container-from-dockerhub.sh. This script can be run before each container run to ensure that the latest container version is used, without pulling unnecessarily if no newer version exists.
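
To illustrate the idea, here is a simplified sketch; unlike the actual script, which checks whether a newer image exists on DockerHub, this version simply re-pulls when the local sif file is missing or older than a week:

#!/bin/bash
# simplified sketch, not the actual update-container-from-dockerhub.sh
# usage: ./update-sketch.sh <dockerhub-image> <sif-file>
IMAGE=$1
SIF=$2
# re-pull if the sif file does not exist or has not been refreshed in 7 days
if [ ! -f "$SIF" ] || [ -n "$(find "$SIF" -mtime +7)" ]; then
    singularity pull --force "$SIF" "docker://$IMAGE"
fi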

The approach described above can be wrapped into a SLURM script that checks if the sif file exists, or if there is an updated container on DockerHub. The SLURM script may then look like this:

#!/bin/bash
#SBATCH -N 1
#SBATCH -p ember
#SBATCH -t 1:00:00

# load the Singularity module
module load singularity
# check if the container exists or is newer and pull if needed
/uufs/chpc.utah.edu/sys/installdir/singularity3/update-container-from-dockerhub.sh biobakery/workflows bioBakery.sif
# run a program from the container
singularity exec bioBakery.sif humann2 --help
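
This script can then be submitted to the scheduler as usual, e.g. (the file name is arbitrary):

sbatch run_biobakery.slurm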
Example: Finding and running a Docker container

Frequently we get requests to install complex programs that may not even run on CentOS 7. Before writing to CHPC support, consider following the example below with your application.

A user wants to install a program called guppy, whose installation and use are described in a blog post. He wants to run it on a GPU since it's faster. From the blog post we know the program's name, have a hint about the provider of the program, and know how to install it on Ubuntu Linux. After some web searching we find out that the program is mainly available commercially, so it has no publicly available download section and likely no CentOS version. That leaves us needing an Ubuntu-based container.

We could build the container ourselves based on the instructions in the blog post, but we would need to either build with Docker or Singularity on a local machine with root access, or use a DockerHub automated build through a GitHub repository. This can be time consuming and cumbersome, so we leave it as a last resort.

We do some more web searching to see if guppy has a container. First we search for "guppy dockerhub"; we get lots of hits like this one, but none for the GPU (looking at the Dockerfile, there is no mention of GPU in the base image or in what's being installed). Next we try "guppy gpu" dockerhub and find this container. We don't know yet if it does indeed support the GPU, and since the Dockerfile is missing, we suspect that it is hosted on GitHub. So, we search "guppy-gpu" github and find this repository, which based on the repository name and source looks like a match to the DockerHub image. Examining the Dockerfile we see that the container is based on nvidia/cuda 9.0, which means it's being set up for a GPU. This is looking hopeful, so we get the container and try to run it.

$ ml singularity
$ singularity pull docker://aryeelab/guppy-gpu
$ singularity shell --nv guppy-gpu_latest.sif
$ nvidia-smi          # check that the GPU works
...
| NVIDIA-SMI 418.67 Driver Version: 418.67 CUDA Version: 10.1
...
$ guppy_basecaller --help          # check that the program is there
: Guppy Basecalling Software, (C) Oxford Nanopore Technologies, Limited.
Version 2.2.2

Above we have loaded the Singularity module and used Singularity to pull the Docker container. This downloaded the Docker container image layers and created the Singularity container file guppy-gpu_latest.sif. Then we opened a shell in this container (using the --nv flag to bring the host GPU stack into the container), tested the GPU visibility with nvidia-smi, and ran guppy_basecaller --help to verify that the program is there. With these positive outcomes, we can proceed to run the program with our data, which can be done directly with

$ singularity exec --nv guppy-gpu_latest.sif guppy_basecaller -i <fast5_dir> -o <output_folder> -c dna_r9.4.1_450bps -x "cuda:0"

As mentioned above, the singularity pull command creates a Singularity container based on a Docker container image. To guarantee that we will always get the latest version, we can use the shell script we have described above, e.g.

$ /uufs/chpc.utah.edu/sys/installdir/singularity3/update-container-from-dockerhub.sh aryeelab/guppy-gpu guppy-gpu_latest.sif
$ singularity exec --nv guppy-gpu_latest.sif guppy_basecaller -i <fast5_dir> -o <output_folder> -c dna_r9.4.1_450bps -x "cuda:0"

If we want to make this even easier to use, we can build an Lmod module and wrap the commands to be run in the container in this module. First we set up a user module directory, then copy our template into it:

mkdir $HOME/MyModules/guppy
cd $HOME/MyModules/guppy
cp /uufs/chpc.utah.edu/sys/modulefiles/templates/container-template.lua 3.2.2.lua

and edit the new module file, 3.2.2.lua, to modify the container name, the command(s) to call from the container, and the module file metadata:

-- required path to the container sif file
local CONTAINER="/uufs/chpc.utah.edu/common/home/u0123456/containers/guppy-gpu_latest.sif"
-- required text array of commands to alias from the container
local COMMANDS = {"guppy_basecaller"}
-- these optional lines provide more information about the program in this module file
whatis("Name : Guppy")
whatis("Version : 3.2.2")
whatis("Category : genomics")
whatis("URL : https://nanoporetech.com/nanopore-sequencing-data-analysis")
whatis("Installed on : 10/05/2021")
whatis("Installed by : Your Name")

When we have the module file created, we can activate the user modules and load the guppy module:

module use $HOME/MyModules
module load guppy/3.2.2

This way we can use just the guppy_basecaller command to run this program inside of the container.
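
For example, with the module loaded, the earlier command reduces to the following (the input and output paths are placeholders, and we assume the module's wrapper adds the --nv GPU flag behind the scenes):

guppy_basecaller -i <fast5_dir> -o <output_folder> -c dna_r9.4.1_450bps -x "cuda:0"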

Running CHPC-provided containers

We provide containers for applications that are difficult to build natively on CentOS 7, which our clusters run. Most of these applications are developed on Debian-based Linux systems (Ubuntu and Debian) and rely on their software stack. Some containers are simply DockerHub images converted to the Singularity sif format, while others are built manually by CHPC staff.

Running a CHPC-provided container is as simple as running the application command itself. We provide an environment module that sets up an alias for this command, which calls the container behind the scenes. If the container provides more commands, we provide a command to start a shell in the container, from which the user can call the commands needed to run their processing pipeline.

In the containers, the user can access storage in their home directories or on the scratch file servers.

Below is a sample of the containers that we provide. A complete list can be found by running module -r spider 'singularity$'.

bioBakery

bioBakery is a set of tools that can be used separately or in a pipeline. After loading the module file with module load bioBakery, the shortcut runbioBakery runs a single command, e.g. runbioBakery humann2 parameters. To start a shell in the container and run multiple commands there, run startbioBakery.
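
For example, a typical session might look like this (the humann2 --help invocation is just an illustration):

module load bioBakery
runbioBakery humann2 --help
# or start a shell in the container and run commands from there
startbioBakery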

samviewer

Samviewer is an electron microscopy image and analysis program. After loading the module file with module load samviewer, we define a shortcut, sv, which maps to the sv command in the container. We also have a shortcut sam which executes any command inside the container, e.g. sam python runs Python from the container.
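
For example (the python invocation is just an illustration):

module load samviewer
sv
# run an arbitrary command from the container
sam python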

Bringing in your own Singularity container

One can build a Singularity container on their own machine and scp it to CHPC's systems. Singularity runs on Linux; on macOS or Windows one can create a Linux VM using e.g. VirtualBox and install Singularity in it. For details on how this can be done, see our Building Singularity containers locally page.

For security reasons we don't allow building containers on CHPC systems, as building a container requires sudo access. However, we do have a standalone Linux machine, singularity.chpc.utah.edu, where we allow users to build containers.

Singularity only supports Linux containers, so we do not support importing Windows or macOS containers.

To ensure portability of your Singularity container to CHPC systems, create mount points for the CHPC file systems during the build process:

mkdir /uufs /scratch
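
In a Singularity definition file this can be done in the %post section; below is a minimal sketch, assuming an Ubuntu base image:

Bootstrap: docker
From: ubuntu:20.04

%post
    # create mount points for CHPC file systems
    mkdir -p /uufs /scratch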

Then scp the container to a CHPC file server, e.g. your home directory, and run it, for example:

module load singularity
singularity shell my_container.sif

Or, if you have defined the %runscript section in your container, then simply execute the container file, or use singularity run:

./my_container.sif
singularity run my_container.sif
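
For reference, the %runscript section is defined in the container definition file; a minimal sketch might look like this, where my_program stands for a hypothetical command installed in the container:

%runscript
    # executed by "./my_container.sif" or "singularity run my_container.sif";
    # arguments given after the container name are passed along in "$@"
    exec my_program "$@"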

 Note that in our singularity module, we define two environment variables:

  • SINGULARITY_SHELL=/bin/bash - this sets the container shell to bash (easier to use than default sh)
  • SINGULARITY_BINDPATH=/scratch,/uufs/chpc.utah.edu - this binds mount points to all the /scratch file servers and to /uufs file servers (sys branch, group spaces). 

If you prefer to use a different shell, or not to bind these file servers, set these variables differently or unset them.
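
For example, before running the container one could switch to the default shell or stop binding the CHPC file systems:

export SINGULARITY_SHELL=/bin/sh
unset SINGULARITY_BINDPATH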

Last Updated: 10/26/21