KU Leuven storage#
Storage on the clusters hosted at KU Leuven is organized similarly to storage on other VSC clusters. Currently, we use an NFS filesystem for home and data, complemented with Lustre and IBM Storage Scale (GPFS) parallel filesystems for scratch storage.
Note
We use environment variables to point at your different storage locations, as demonstrated in the tables below. These paths are constructed based on the 5 digits in your VSC ID. As mentioned in Storage locations, you should always (try to) use the environment variables rather than the paths shown in the tables.
VSC home and data storage#
The table below describes the VSC home and data storage locations, which can be accessed on all VSC clusters. These are intended for long-term storage of files that are not accessed at high frequency by compute jobs.
| Variable | Path | Type | Access | Backup | Default quota |
|---|---|---|---|---|---|
| $VSC_HOME | | NFS | All VSC clusters | Yes | 3 GiB |
| $VSC_DATA | | NFS | All VSC clusters | Yes | 75 GiB |
Note that for $VSC_HOME and $VSC_DATA:

- data is protected by snapshots, which means it is possible to recover data that was accidentally deleted or modified,
- quotas for non-vsc3* users are determined by the policy of the user's home institution.
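As a quick illustration of the recommendation above, refer to these locations through the environment variables rather than hard-coding paths. The file names below are placeholders:

```bash
# Store results that must be kept long term under $VSC_DATA
# (results.tar.gz is just a placeholder name).
cp results.tar.gz ${VSC_DATA}/

# Keep small scripts and configuration files in $VSC_HOME.
ls ${VSC_HOME}
```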
Parallel scratch storage#
For workflows requiring frequent read or write operations (especially those involving intermediate files), it is recommended to use scratch storage. Compared to NFS, the GPFS and Lustre filesystems are better designed to handle intensive serial and parallel input/output (IO) operations. Genius and wICE share the same (Lustre-based) scratch storage, while Mindwell comes with its own (GPFS-based) scratch storage:
| Variable | Path | Type | Access | Backup | Default quota |
|---|---|---|---|---|---|
| $VSC_SCRATCH | | Lustre | Genius, wICE | No | 500 GiB |
| $VSC_SCRATCH | | GPFS | Mindwell | No | 500 GiB |
On each node, the $VSC_SCRATCH environment variable will point to the
scratch storage associated with the node (GPFS scratch on the Mindwell nodes,
Lustre scratch on the nodes of Genius and wICE).
Warning
It is crucial that intensive IO operations in your compute jobs are done on the scratch storage associated with the cluster where the job is running. In other words, compute jobs running on Genius and wICE have to use Lustre and jobs running on Mindwell have to use GPFS. Compute jobs that do not comply can be cancelled by the system administrators without prior notice.
Note
Non-vsc3* users need to contact the servicedesk
to receive scratch storage, as it is not set up by default.
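In practice, a compute job complies with this policy automatically by addressing its working directory through $VSC_SCRATCH. A minimal Slurm job script sketch, where the account, resource values, application and file names are placeholders:

```bash
#!/bin/bash
#SBATCH --account=lp_myproject   # placeholder project account
#SBATCH --nodes=1
#SBATCH --ntasks=4
#SBATCH --time=01:00:00

# Run the IO-intensive part of the workflow on the scratch file system
# of the cluster the job is running on.
WORKDIR=${VSC_SCRATCH}/${SLURM_JOB_ID}
mkdir -p "${WORKDIR}"
cd "${WORKDIR}"

cp ${VSC_DATA}/input.dat .      # stage input from data storage (placeholder file)
./my_application input.dat      # placeholder application
cp output.dat ${VSC_DATA}/      # copy final results back to data storage
```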
Transferring data between Lustre and GPFS#
To facilitate data transfers between the Lustre and GPFS storage, Lustre is accessible from Mindwell and GPFS is accessible from wICE.
Note
GPFS can currently not be accessed from the login nodes and the Genius
compute nodes. This will change when the nodes have been migrated to Rocky 9,
which is scheduled for the beginning of June. In the meantime you may carry
out your data transfers using (interactive or batch) jobs on, for example,
the `interactive` partitions of
wICE or
Mindwell.
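For instance, such an interactive transfer session could be started as follows; the cluster, account and resource values are assumptions/placeholders to adapt to your situation:

```bash
# Request a small interactive session on the wICE interactive partition.
srun --clusters=wice --partition=interactive --ntasks=1 --time=00:30:00 \
     --account=lp_myproject --pty bash
```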
Two more environment variables ($VSC_SCRATCH_LUSTRE1 and
$VSC_SCRATCH_GPFS1) have been defined for this purpose, so that you can
easily find the mount location of your scratch directory on the “other”
parallel file system.
For instance, to copy a file from your Lustre scratch to your GPFS scratch, you could go about it as follows:
```bash
# If initiating the transfer from Genius or wICE:
cp ${VSC_SCRATCH}/myfile ${VSC_SCRATCH_GPFS1}

# If initiating the transfer from Mindwell:
cp ${VSC_SCRATCH_LUSTRE1}/myfile ${VSC_SCRATCH}
```
As a best practice, data transfers between Lustre and GPFS should be performed through
‘transfer’ jobs, in which you for example only request a few cores on an interactive partition.
Short transfers that take no more than a couple of minutes can also
be performed from a Genius login node.
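A batch variant of such a transfer job could look like the sketch below, assuming submission to the wICE `interactive` partition; the account and file names are placeholders:

```bash
#!/bin/bash
#SBATCH --clusters=wice          # assumption: submitting to wICE
#SBATCH --partition=interactive
#SBATCH --account=lp_myproject   # placeholder project account
#SBATCH --ntasks=1
#SBATCH --time=00:30:00

# Copy a (placeholder) dataset from the Lustre scratch to the GPFS scratch.
cp -r ${VSC_SCRATCH}/my_dataset ${VSC_SCRATCH_GPFS1}/
```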
Globus endpoints are available for Lustre, but not yet for GPFS. It is therefore not yet possible to use Globus for these data transfers.
Automatic scratch cleanup#
$VSC_SCRATCH at KU Leuven is not meant for long-term storage: files that have not been
accessed for more than 30 days are automatically removed.
The reasoning behind this is that as long as you are actively using a file
(i.e. accessing it), the file will not be removed.
This policy can however cause confusion when files are initially transferred
to a scratch directory. If you use for instance the `mv` command, the file
is not actually accessed. As a result, if the last access timestamp of the
original file is a long time in the past, the file on scratch will be considered
inactive and automatically removed. A similar thing happens when using
`rsync` with the option to preserve timestamps. To avoid this problem, use the
`cp` command (without the `-a` option) to copy files to the scratch directory,
and then remove the sources (if needed) with the `rm` command once the transfer
has completed successfully.
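For instance, a transfer that avoids stale access times could look like this (file names are placeholders); `stat` can be used to verify the access time that the cleanup policy considers:

```bash
# Copy instead of moving, so the copy on scratch gets a fresh access time.
cp ${VSC_DATA}/big_input.dat ${VSC_SCRATCH}/

# Remove the source only after verifying the transfer succeeded.
rm ${VSC_DATA}/big_input.dat

# Show the last access time of the file on scratch.
stat -c '%x' ${VSC_SCRATCH}/big_input.dat
```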
Node scratch#
If your jobs require temporary storage that does not need to be shared across compute nodes, you may also consider using the local node disks:
| Variable | Path | Type | Access | Backup | Default quota |
|---|---|---|---|---|---|
| $VSC_SCRATCH_NODE | | ext4 | Genius | No | 200 GiB |
| $VSC_SCRATCH_NODE | | ext4 | wICE | No | 600 GiB |
| $VSC_SCRATCH_NODE | | ext4 | Mindwell | No | 600 GiB |
Though limited in storage capacity, $VSC_SCRATCH_NODE has the advantage
that no network traffic is involved. The contents of this temporary storage
location are always removed when the job ends. Therefore, results need to be
copied elsewhere if needed.
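A minimal job script sketch using the node-local scratch, copying the results back before the job ends; the account, application and file names are placeholders:

```bash
#!/bin/bash
#SBATCH --account=lp_myproject   # placeholder project account
#SBATCH --nodes=1
#SBATCH --ntasks=1
#SBATCH --time=02:00:00

# Work on the local disk of the compute node: fast, but wiped when the job ends.
cd ${VSC_SCRATCH_NODE}
cp ${VSC_DATA}/input.dat .       # stage input (placeholder file)
./my_application input.dat       # placeholder application

# Copy results to permanent storage before the job (and the local copy) ends.
cp output.dat ${VSC_DATA}/
```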