Connect Supervisor script examples


Overview

For very advanced configurations, Connect supports the use of a supervisor script to modify the environment available to deployed content. The vast majority of Connect deployments don’t need to use supervisor scripts, but in some situations they can be an important tool that helps to unlock additional flexibility around the execution of deployed content.

Creating supervisor scripts has a steep learning curve, but it offers an extremely powerful way to manipulate the environment that your deployed content runs in and also allows you, the Connect admin, to add support for features not previously envisaged or implemented by the development team.

Note

Throughout this document we use the term “deployed content” to refer to any supported content type that has been deployed to Connect. This could be a Shiny app, an R Markdown document, a Jupyter Notebook, an API or anything else that can be hosted on Connect. The supervisor script runs immediately prior to the execution of all content types.

Below are some examples of things you can do with supervisor scripts as well as example script code to support them. It’s unlikely you’d be able to run any of the examples as-is in your environment, but they will hopefully provide some insights into the power of this administrator-level tool. It’s also worth reading through all the examples, even if they don’t seem immediately relevant, as they illustrate different ideas that can be implemented with a supervisor script.

Remember, getting a supervisor script to work exactly the way you want it to can be a tricky process. When developing your own supervisor, it is best to work iteratively and ensure you have prepared for appropriate downtime on your Connect system or are working in a sandboxed environment.
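One way to iterate without touching a Connect server is to exercise a candidate script from an ordinary shell first. The sketch below is only an illustration: the scratch directory, the file name candidate-supervisor.sh and the variable SUPERVISOR_CHECK are all made up for this example and mean nothing to Connect. It parses the script with bash -n to catch syntax errors, then runs a harmless command through it to confirm that the child process sees the modified environment.

```shell
#!/usr/bin/env bash

# Hypothetical scratch location; nothing here touches a Connect installation.
workdir="$(mktemp -d)"
candidate="$workdir/candidate-supervisor.sh"

# A throwaway supervisor to experiment with.
cat > "$candidate" <<'EOF'
#!/usr/bin/env bash
echo arguments: "$@" >&2
export SUPERVISOR_CHECK=ok   # illustrative variable, not used by Connect
exec "$@"
EOF
chmod +x "$candidate"

# Step 1: parse the script without executing it to catch syntax errors.
if bash -n "$candidate"; then
    syntax_result=ok
else
    syntax_result=failed
fi
echo "syntax check: $syntax_result" >&2

# Step 2: run a harmless target command through the script, in place of
# the real command line that Connect would supply.
child_value="$("$candidate" sh -c 'printf %s "$SUPERVISOR_CHECK"')"
echo "child process saw SUPERVISOR_CHECK=$child_value" >&2

# Remove the scratch directory again.
rm -rf "$workdir"
```

A check like this only shows that the script parses and passes its environment on. It says nothing about how the script behaves with the real command lines, user accounts and file systems on your Connect server, so a sandboxed Connect instance is still the safer place for final testing.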

Running all content with a custom umask

Sometimes, external features of your infrastructure may make it necessary to run deployed content in an environment other than the default.

For instance, mounted fileshares might work more smoothly if Connect were to read and write files with different permissions than the defaults. The following example shows how to run deployed content with a custom umask.

#!/usr/bin/env bash

# echo informational messages to standard error to
# prevent Connect from processing them.
echo arguments: "$@" >&2
echo >&2

# Set our custom umask value
umask 027

# Execute the target process after the environment is established.
# All customization must happen before this "exec".
exec "$@"
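To check the effect of a script like this before installing it, you could run a harmless command through a copy of it and look at the resulting umask and file permissions. This is a rough sketch that assumes a local shell and a scratch directory. The file names are placeholders, and the copy of the script omits the informational echo lines for brevity.

```shell
#!/usr/bin/env bash

# Hypothetical scratch location and file names, for illustration only.
workdir="$(mktemp -d)"
supervisor="$workdir/supervisor.sh"

# A trimmed copy of the umask supervisor.
cat > "$supervisor" <<'EOF'
#!/usr/bin/env bash
umask 027
exec "$@"
EOF
chmod +x "$supervisor"

# Ask a child shell, started through the supervisor, for its umask.
effective_umask="$("$supervisor" sh -c 'umask')"
echo "umask seen by the child process: $effective_umask" >&2

# Create a file through the supervisor and record its permissions.
"$supervisor" touch "$workdir/sample.txt"
listing="$(ls -l "$workdir/sample.txt")"
echo "$listing" >&2

# Remove the scratch directory again.
rm -rf "$workdir"
```

With a umask of 027, new regular files lose write access for the group and all access for other users, so the listing should start with -rw-r-----.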

Using Connect with environment modules

Environment modules (and similar tools like Lmod) are popular in academic computing, HPC and other settings where interactive command line access to servers by data scientists is common. They provide a way for users to switch between different configurations of default tools. For instance, module A might have R 3.6.3 and Python 3.5 as the defaults, while module B has R 4.0.1 and Python 3.9.

Connect does not provide native support for environment modules, which can make publishing to Connect somewhat problematic. Under some circumstances, however, a supervisor script can load the required modules before the deployed content starts.

In situations where you have a one-to-one mapping between R versions and modules you might try something like this:

#!/usr/bin/env bash

set -x

echo arguments: "$@" >&2
echo >&2

# Make module command available, path might need adjustment
. /etc/profile.d/modules.sh

# load specific module based on the content's R version
case "$1" in
    "/opt/R/3.6.2/bin/R")
        module load R/3.6.2
        ;;
    "/opt/R/4.0.2/bin/R")
        module load R/4.0.2
        ;;
    *)
        export TEST_VAR=DEFAULT_VALUE
        ;;
esac

exec "$@"

This script will inspect the path to the R binary that is provided by Connect and load an appropriate module accordingly. If the path to the R binary does not match any of the configured options it will instead set an environment variable that can be accessed from within the deployed content.
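If you have many R versions, listing each path in the case statement becomes repetitive. As a variation, the module name could be derived from the path itself. The sketch below assumes that every R installation lives at /opt/R/<version>/bin/R and that the matching module is named R/<version>. Both are assumptions about your site, and module_for_r_binary is a hypothetical helper, not something Connect provides. To keep the example runnable on a machine without environment modules, it prints the decision instead of calling module load.

```shell
#!/usr/bin/env bash

# module_for_r_binary is a hypothetical helper, not part of Connect.
# It assumes R lives at /opt/R/<version>/bin/R and that the matching
# module is named R/<version>.
module_for_r_binary() {
    case "$1" in
        /opt/R/*/bin/R)
            r_version="${1#/opt/R/}"         # strip the leading directory
            r_version="${r_version%/bin/R}"  # strip the trailing /bin/R
            # Reject empty versions and paths with extra directory levels.
            case "$r_version" in
                ''|*/*) return 1 ;;
            esac
            printf 'R/%s\n' "$r_version"
            ;;
        *)
            return 1
            ;;
    esac
}

# Demonstration only: print the decision instead of loading a module.
for candidate in /opt/R/3.6.2/bin/R /opt/R/4.0.2/bin/R /usr/bin/python3; do
    if module_name="$(module_for_r_binary "$candidate")"; then
        echo "$candidate: would run 'module load $module_name'" >&2
    else
        echo "$candidate: no matching module" >&2
    fi
done
```

In a real supervisor script, the demonstration loop would be replaced by a call to the helper with "$1", followed by module load for the returned name and the final exec "$@".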

An alternative approach for modules

In cases where there is not a one-to-one mapping between R versions and environment modules, it’s still possible to use them via an environment variable configured with your content.

With the following supervisor script, you choose a module for a piece of deployed content by using the Connect content settings dashboard to set an environment variable called ENVIRONMENT_MODULE to the name of the module you wish to use. Content that does not set the variable runs without loading a module.

#!/usr/bin/env bash

set -x

echo arguments: "$@" >&2
echo >&2

# Make module command available, path might need adjustment
. /etc/profile.d/modules.sh

# load module based on an environment variable defined for the content
if [ -n "$ENVIRONMENT_MODULE" ]
then
    module load "$ENVIRONMENT_MODULE"
fi

exec "$@"
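Because the value of ENVIRONMENT_MODULE is set per content item, you may want to check it before handing it to module load. The sketch below shows one possible guard, an allowlist of module names. The helper is_allowed_module and the names in its list are placeholders invented for this example, and the loop prints what it would do rather than loading anything.

```shell
#!/usr/bin/env bash

# is_allowed_module is a hypothetical helper; the module names listed
# here are placeholders for whatever your site actually provides.
is_allowed_module() {
    case "$1" in
        R/3.6.2|R/4.0.2) return 0 ;;
        *) return 1 ;;
    esac
}

# Demonstration only: print the decision instead of loading a module.
for requested in "R/4.0.2" "R/4.0.2 other/module" ""; do
    if is_allowed_module "$requested"; then
        echo "'$requested': would run module load" >&2
    else
        echo "'$requested': ignored, not on the allowlist" >&2
    fi
done
```

In a supervisor script, the check would wrap the module load call, so that an unexpected value is skipped (or logged to standard error) and the content still starts through exec "$@".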