Squash File System and Singularity
View available datasets on the Datasets page.
Working with Datasets
Writable ext3 overlay images hold the conda environments installed inside them. For fixed datasets, Singularity can also mount read-only squashFS files, such as the COCO datasets:
/scratch/work/public/ml-datasets/coco/coco-2014.sqf
/scratch/work/public/ml-datasets/coco/coco-2015.sqf
/scratch/work/public/ml-datasets/coco/coco-2017.sqf
Each squashFS file is passed to Singularity as a separate read-only overlay, alongside the ext3 overlay:
singularity exec \
--overlay /scratch/wang/zzz/pytorch1.8.0-cuda11.1.ext3:ro \
--overlay /scratch/work/public/ml-datasets/coco/coco-2014.sqf:ro \
--overlay /scratch/work/public/ml-datasets/coco/coco-2015.sqf:ro \
--overlay /scratch/work/public/ml-datasets/coco/coco-2017.sqf:ro \
/scratch/work/public/singularity/cuda11.1-cudnn8-devel-ubuntu18.04.sif /bin/bash
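When several datasets are mounted at once, the repeated --overlay options can be generated from a list of files. The sketch below shows one way to do that. Note that overlay_args is a hypothetical helper name, not part of Singularity, and the sketch assumes the paths contain no whitespace.

```shell
#!/bin/sh
# overlay_args FILE...: print one "--overlay FILE:ro" line per squashFS file.
# Hypothetical helper for illustration; it only formats arguments and does
# not check that the files exist.
overlay_args() {
    for sqf in "$@"; do
        printf '%s %s:ro\n' --overlay "$sqf"
    done
}

# Possible usage (assumes paths without spaces, since the output is
# word-split by the shell):
#   singularity exec \
#       --overlay /scratch/wang/zzz/pytorch1.8.0-cuda11.1.ext3:ro \
#       $(overlay_args /scratch/work/public/ml-datasets/coco/coco-2014.sqf \
#                      /scratch/work/public/ml-datasets/coco/coco-2017.sqf) \
#       /scratch/work/public/singularity/cuda11.1-cudnn8-devel-ubuntu18.04.sif /bin/bash
```

Generating the options this way keeps the :ro suffix consistent, which matters because squashFS files are mounted read-only.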
If your fixed dataset consists of many tiny files, pack them into a squashFS file for use with Singularity. Here is an example:
- Make a temporary folder in /state/partition1, which is on the local hard drive of each compute node
mkdir -p /state/partition1/sw77
cd /state/partition1/sw77
- Extract the files there, for example
tar -vxzf /scratch/work/public/examples/squashfs/imagenet-example.tar.gz
- Change access permissions in case the files will be shared with others
find imagenet-example -type d -exec chmod 755 {} \;
find imagenet-example -type f -exec chmod 644 {} \;
- Convert to a single squashFS file on the host
mksquashfs imagenet-example imagenet-example.sqf -keep-as-directory
For more details on working with squashFS, please see the SquashFS documentation.
- Copy this file to /scratch
cp -rp /state/partition1/sw77/imagenet-example.sqf /scratch/sw77/.
- To test, start a container with the squashFS file as an overlay; the files appear under /imagenet-example inside the Singularity container
singularity exec --overlay /scratch/sw77/imagenet-example.sqf:ro /scratch/work/public/singularity/ubuntu-20.04.1.sif /bin/bash
Singularity> find /imagenet-example | wc -l
1303
Singularity> find /state/partition1/sw77/imagenet-example | wc -l
1303
- To delete the temporary folder on the host
rm -rf /state/partition1/sw77
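The permission and verification steps above can be wrapped in small shell functions for reuse. This is a minimal sketch, not part of the documented workflow: normalize_perms, count_entries, and same_count are hypothetical names, and only the standard find, chmod, and wc utilities are used.

```shell
#!/bin/sh
# normalize_perms DIR: make directories 755 and regular files 644 under DIR,
# matching the two find/chmod commands in the steps above.
normalize_perms() {
    find "$1" -type d -exec chmod 755 {} +
    find "$1" -type f -exec chmod 644 {} +
}

# count_entries DIR: number of entries under DIR, including DIR itself.
# This is the figure that "find DIR | wc -l" reports.
count_entries() {
    find "$1" | wc -l | tr -d ' '
}

# same_count A B: succeed only if both trees hold the same number of entries.
# A matching count is a quick sanity check, not proof that contents are equal.
same_count() {
    [ "$(count_entries "$1")" -eq "$(count_entries "$2")" ]
}
```

For example, running normalize_perms imagenet-example after extracting the archive prepares the tree for mksquashfs. Inside the container, same_count /imagenet-example /state/partition1/sw77/imagenet-example mirrors the comparison shown above, assuming both paths are visible in the container and the temporary folder has not yet been deleted.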