vsc-Rproject#

Introduction#

vsc-Rproject is a command-line tool that facilitates the setup, management and use of RStudio Projects on VSC HPC clusters. Two main advantages of using vsc-Rproject are that the project settings and R libraries are kept separate for each project, and that compiled extensions are easier to manage in a cluster with heterogeneous hardware.

Note

In what follows, the term ‘vsc-Rproject environment’ refers to the environment created by vsc-Rproject. This environment enables the use of an associated RStudio Project.

How to use#

vsc-Rproject can be used by simply loading the corresponding module.

$ module load vsc-Rproject

Then, use the vsc-rproject command for its different functionalities.

$ vsc-rproject --help
$ vsc-rproject --version

The vsc-rproject command provides four sub-commands that can be used to configure default behaviour or create, activate, or deactivate a vsc-Rproject environment.

Creating a project#

The command vsc-rproject create allows you to create a new RStudio Project together with a vsc-Rproject environment.

When creating a new project, the only required argument is a project name.

We recommend always providing a “modules file” when creating new projects. This is a simple text file that lists one module (full name and version) per line. With a modules file, vsc-Rproject ensures that these modules are loaded whenever the corresponding vsc-Rproject environment is activated. If no modules file is provided, the default R module is used instead.

The following command will create a modules.txt file in your data directory, containing the following modules:

  • R/4.4.1-gfbf-2023b

  • R-bundle-CRAN/2024.06-foss-2023b

$ printf "R/4.4.1-gfbf-2023b\nR-bundle-CRAN/2024.06-foss-2023b\n" > $VSC_DATA/modules.txt

Note

When you specify a modules file, it should always contain the R module.
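A quick sanity check before passing a modules file to vsc-rproject create could look like the sketch below. The helper name has_r_module is hypothetical and not part of vsc-Rproject, and the check assumes that the R module line starts with R/, as in the example above. The file is written to a temporary location so that no existing modules.txt is overwritten.

```shell
# Hypothetical helper (not part of vsc-Rproject): succeeds if the given
# modules file has at least one line starting with "R/".
has_r_module() {
    grep -q '^R/' "$1"
}

# Write the example modules list to a temporary file.
modules_file=$(mktemp)
printf 'R/4.4.1-gfbf-2023b\nR-bundle-CRAN/2024.06-foss-2023b\n' > "$modules_file"

if has_r_module "$modules_file"; then
    echo "OK: an R module is listed"
else
    echo "No R module found in $modules_file" >&2
fi
```

Note that a line such as R-bundle-CRAN/... does not count as the R module in this check, since it does not start with R/.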

To create a new RStudio Project and vsc-Rproject environment using this modules file, run the following command:

$ vsc-rproject create MyProject --modules="$VSC_DATA/modules.txt"

This will create a new RStudio Project named ‘MyProject’ at the default location: $VSC_DATA/Rprojects. The modules.txt file will be used when creating the project, and is stored in $VSC_DATA/Rprojects/.vsc-rproject/modules.env.

Note

If you wish to update the modules list of an existing project, you should manually add the new modules to the $VSC_DATA/Rprojects/.vsc-rproject/modules.env file.

The project folder will contain .Renviron, .Rprofile and .R/Makevars configuration files, which are therefore specific to the project.
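To see whether these files are present in a given project, a small check along the following lines may help. The helper check_project_files is hypothetical, and the example call assumes that the project folder is created as a sub-directory named after the project inside the default location.

```shell
# Hypothetical helper (not part of vsc-Rproject): report which of the
# project-specific configuration files exist under a project root.
check_project_files() {
    root=$1
    for f in .Renviron .Rprofile .R/Makevars; do
        if [ -f "$root/$f" ]; then
            echo "found   $f"
        else
            echo "missing $f"
        fi
    done
}

# Example call; the <location>/<project name> layout is an assumption.
check_project_files "${VSC_DATA:-$HOME}/Rprojects/MyProject"
```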

The .Renviron file will set the R_LIBS_USER variable to point to the project’s R package library. This can be found at the root of the project, under /library/<OS>/R.

The .Rprofile file will be configured to set the CRAN mirror to "https://cloud.r-project.org" (default) and set the R_MAKEVARS_USER variable to point to the project’s .R/Makevars file.

The .R/Makevars file can be used to control the compilation process when installing new R packages by modifying the compiler flags. vsc-Rproject’s default behaviour is to change the -march compiler flag for all relevant compilers from native to x86-64-v4.

Note

Compared to -march=native, the -march=x86-64-v4 compiler flag will discard certain microarchitecture-specific optimizations (potentially with a minor performance impact) to allow for a more generic installation which will run on any AVX-512-capable x86-64 CPU (e.g. Skylake and newer for Intel CPUs and Zen 4 and newer for AMD CPUs). For most users this will be the more desirable option, as it makes switching between different types of compute nodes a lot easier. If some of the node types you want to use do not support this microarchitecture level, you can choose a lower level instead, e.g. -march=x86-64-v3.
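One way to find out whether a node can run x86-64-v4 code is to look for the AVX-512 feature flags that this level requires (AVX512F, AVX512BW, AVX512CD, AVX512DQ and AVX512VL). The sketch below is a rough heuristic rather than an official check: the helper supports_x86_64_v4 is hypothetical, and reading the flags line of /proc/cpuinfo assumes a Linux x86-64 node.

```shell
# Hypothetical helper: succeeds if the space-separated CPU flag list in $1
# contains all AVX-512 subsets required by the x86-64-v4 level.
supports_x86_64_v4() {
    for f in avx512f avx512bw avx512cd avx512dq avx512vl; do
        case " $1 " in
            *" $f "*) ;;          # flag present, keep checking
            *) return 1 ;;        # flag missing, not a v4 CPU
        esac
    done
    return 0
}

# Check the node this snippet runs on (Linux only).
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    if supports_x86_64_v4 "$cpu_flags"; then
        echo "x86-64-v4 code should run on this node"
    else
        echo "x86-64-v4 is not supported here; consider a lower level such as x86-64-v3"
    fi
fi
```

Run it on each node type you plan to use, since a login node may differ from the compute nodes.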

Warning

Compiler options such as -march=x86-64-v3 and -march=x86-64-v4 are only supported in GCC version 11 and later. If you are using an older version of R that relies on an earlier GCC version, -march=x86-64-v... will not be recognized. In such cases, you can run gcc --target-help to view the list of supported -march values and choose a more appropriate setting.
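The version requirement can be checked with a short script such as the one below. It is only a sketch: the helper gcc_has_march_levels is hypothetical, and it assumes that the gcc found on your PATH (after loading your R module) is the compiler that R will use.

```shell
# Hypothetical helper: succeeds if the GCC version string in $1
# (e.g. "13.2.0" or "11") has a major version of 11 or higher.
gcc_has_march_levels() {
    major=${1%%.*}
    [ "$major" -ge 11 ]
}

if command -v gcc >/dev/null 2>&1; then
    version=$(gcc -dumpversion)
    if gcc_has_march_levels "$version"; then
        echo "GCC $version: -march=x86-64-v3 and -march=x86-64-v4 are available"
    else
        echo "GCC $version: pick a value from 'gcc --target-help' instead"
    fi
fi
```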

If you want to enable git within the RStudio Project you can add the --enable-git flag. To automatically activate the vsc-Rproject environment after creating it, use --activate.

If the default behaviour does not suit your needs, you can change how vsc-rproject create behaves with additional command-line arguments. Use --location to create your project in a different location, --cran to set a specific CRAN mirror for your project, and --march to choose a different microarchitecture optimization level.

For more information, see:

$ vsc-rproject create --help
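As an illustration of how several of these options might be combined, the sketch below assembles a create command and prints it for review instead of running it. The --option=value form is assumed to work for --location and --march in the same way as shown for --modules above, and the location used here is a hypothetical example; check vsc-rproject create --help for the exact syntax.

```shell
project="MyProject"
location="${VSC_DATA:-$HOME}/Rprojects-test"   # hypothetical location
march="x86-64-v3"

# Assemble the command line; the --option=value form is an assumption.
create_cmd="vsc-rproject create $project --location=$location --march=$march --enable-git --activate"

# Print the command for review instead of running it.
echo "$create_cmd"
```

If the printed command looks right, you can copy it to your prompt. Paths containing spaces would need additional quoting.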

Note

Alternatively, you may also want to modify your default settings more permanently via vsc-rproject configure. See Default project configuration for more details.

Activating a project#

The activate sub-command can be used to activate an already existing vsc-Rproject environment.

$ vsc-rproject activate MyProject

Activating a vsc-Rproject environment will load all the modules listed in the modules file and set the $VSC_RPROJECT environment variable, which can be used to access the root directory of the project.

Deactivating a project#

The deactivate sub-command deactivates the active vsc-Rproject environment. Doing so will purge all loaded modules except for the cluster module and the vsc-Rproject module itself. Additionally, it will unset the $VSC_RPROJECT variable.

$ vsc-rproject deactivate
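Because $VSC_RPROJECT is set on activation and unset on deactivation, scripts can use it to tell whether an environment is active. The following is a minimal sketch; the helper project_status is hypothetical and not part of vsc-Rproject.

```shell
# Hypothetical helper: report whether a vsc-Rproject environment is active,
# based only on the $VSC_RPROJECT variable described above.
project_status() {
    if [ -n "${VSC_RPROJECT:-}" ]; then
        echo "active: $VSC_RPROJECT"
    else
        echo "no active vsc-Rproject environment"
    fi
}

project_status
```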

Default project configuration#

If you wish to change the default behaviour of vsc-Rproject, you can configure your personal default settings with the configure sub-command.

Note

You can check your current default settings at any time with vsc-rproject --defaults.

vsc-rproject configure allows you to set your preferred default R module with --default-r. You may also set a new default location for your RStudio Projects with --location. Finally, you can configure your preferred default CRAN mirror with --cran and the default -march compiler setting with --march.

These personal configurations will be stored in $VSC_HOME/.vsc-rproject-config.

You can provide an alternative path for this configuration file by setting the $VSC_RPROJECT_CONFIG environment variable. This allows you, for example, to apply different defaults on different clusters. If $VSC_RPROJECT_CONFIG is set, vsc-rproject will use the file it points to if possible. If it is not set (the default), vsc-rproject will use the default configuration file, $VSC_HOME/.vsc-rproject-config.
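The lookup order described above can be sketched in shell as follows. This only mirrors the documented behaviour and is not the tool's actual implementation; the helper resolve_config_path is hypothetical, and the fallback to $HOME when $VSC_HOME is unset is an assumption added so that the snippet also runs outside a VSC cluster.

```shell
# Hypothetical helper: print the configuration file path, preferring
# $VSC_RPROJECT_CONFIG over the default file in $VSC_HOME.
resolve_config_path() {
    if [ -n "${VSC_RPROJECT_CONFIG:-}" ]; then
        echo "$VSC_RPROJECT_CONFIG"
    else
        # Falling back to $HOME is an assumption for non-VSC systems.
        echo "${VSC_HOME:-$HOME}/.vsc-rproject-config"
    fi
}

echo "Configuration file: $(resolve_config_path)"
```

Exporting a different $VSC_RPROJECT_CONFIG value in the shell startup file of each cluster would then select a per-cluster configuration.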

If at any point you wish to reset your configuration to the original default settings, simply run:

$ vsc-rproject configure --reset

vsc-Rproject and RStudio Server#

When launching a new session via the RStudio Server app in the Open OnDemand portal, you can use the pre-run scriptlet to load the vsc-Rproject environment:

module load vsc-Rproject; vsc-rproject activate MyProject

Warning

The R module selected in the OnDemand form must match the R module that was used to create the project! Otherwise, dependency conflicts may arise, as RStudio Server will replace the modules loaded via the pre-run scriptlet.

Once inside the RStudio session, you still need to open the RStudio Project via the interface.