Running RStudio Products in Containers
Many organizations run RStudio Professional products in Docker (or other) containers. Containerization is entirely compatible with RStudio products, and many RStudio administrators successfully run our products in containers and on Kubernetes.
RStudio products are designed to run on long-lived Linux servers for multiple users and projects. Therefore, administrators who want to run RStudio products using Docker or Kubernetes have two separate questions, which this article aims to address:
- How and why do I put the RStudio products themselves in containers?
- How do I use containerization and a cluster manager like Kubernetes to scale computational load for data science jobs?
In general, we do not recommend putting RStudio Professional products themselves in short-lived per-user or per-session containers or Kubernetes pods.
Putting RStudio products in containers
RStudio products are designed to live on long-running Linux servers. They are entirely compatible with treating a container as that underlying Linux server, which helps encapsulate dependencies and reduce server statefulness.
In this model, each RStudio product is placed in its own long-running container and treated as a standalone instance of the product. Multiple containers can be load-balanced and treated as a cluster. These containers can be managed by a Kubernetes cluster, should you wish.
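As a rough illustration of this model, the sketch below builds a Kubernetes Deployment manifest for one product running as a long-lived container, with the replica count controlling how many instances sit behind a load balancer. The function name `product_deployment`, the image reference, the instance name, and the port are all placeholders chosen for this example, not values taken from RStudio's documentation. A real deployment would also need licensing, persistent storage, and a Service or Ingress, which are omitted here.

```python
import json


def product_deployment(name: str, image: str, port: int, replicas: int = 1) -> dict:
    """Build a Deployment manifest for a long-running product container.

    All arguments are supplied by the caller; nothing here is an
    RStudio-specific default.
    """
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "Deployment",
        "metadata": {"name": name, "labels": dict(labels)},
        "spec": {
            # Each replica is a standalone, long-lived instance of the product.
            "replicas": replicas,
            "selector": {"matchLabels": dict(labels)},
            "template": {
                "metadata": {"labels": dict(labels)},
                "spec": {
                    "containers": [
                        {
                            "name": name,
                            "image": image,
                            "ports": [{"containerPort": port}],
                        }
                    ]
                },
            },
        },
    }


if __name__ == "__main__":
    # Hypothetical image name and port, for illustration only.
    manifest = product_deployment(
        name="rstudio-product",
        image="registry.example.com/rstudio-product:latest",
        port=8787,
        replicas=2,
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON can be saved to a file and passed to `kubectl apply -f`, since Kubernetes accepts manifests in JSON as well as YAML.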
There are some specific considerations for running RStudio products in containers, which are detailed in this article.
RStudio also provides images, Dockerfiles, and instructions for deployment here.
Managing load with containerization
Some administrators wish to use containerization to manage the load on their data science infrastructure. If this is the case for your RStudio Workbench (formerly RStudio Server Pro) installation, we recommend using RStudio Workbench's launcher feature to use a Kubernetes cluster as the computation engine for RStudio Workbench. With the launcher, all user sessions, both interactive sessions and ad-hoc jobs, are moved from the RStudio Workbench instance to the Kubernetes cluster.
In this configuration, RStudio Workbench makes requests for resources to the Kubernetes cluster, so standard Kubernetes tooling can be used to autoscale the nodes underlying the cluster. The RStudio Workbench machine or pod itself can be relatively small, as all of the computational load will be directed elsewhere.
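To make the sizing intuition concrete, here is a back-of-envelope sketch of how the cluster, rather than the RStudio Workbench host, absorbs session load. It computes a lower bound on the number of worker nodes needed to satisfy a set of session resource requests. The `SessionRequest` type, the `min_nodes_needed` function, and the example figures are hypothetical and only for illustration. A real Kubernetes scheduler and autoscaler also account for system overhead and bin-packing, so actual node counts may be higher.

```python
import math
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class SessionRequest:
    """Resources one session or job asks the cluster for (hypothetical type)."""
    cpu_cores: float
    memory_gib: float


def min_nodes_needed(
    sessions: Iterable[SessionRequest],
    node_cpu_cores: float,
    node_memory_gib: float,
) -> int:
    """Return a lower bound on the nodes required to fit all requests.

    The bound is driven by whichever resource (CPU or memory) is scarcer.
    It ignores fragmentation and per-node overhead.
    """
    sessions = list(sessions)
    total_cpu = sum(s.cpu_cores for s in sessions)
    total_mem = sum(s.memory_gib for s in sessions)
    by_cpu = math.ceil(total_cpu / node_cpu_cores)
    by_mem = math.ceil(total_mem / node_memory_gib)
    return max(by_cpu, by_mem)


if __name__ == "__main__":
    # Ten sessions, each requesting 1 core and 8 GiB, on 8-core / 32 GiB nodes.
    requests = [SessionRequest(cpu_cores=1, memory_gib=8) for _ in range(10)]
    print(min_nodes_needed(requests, node_cpu_cores=8, node_memory_gib=32))
```

In this example memory is the limiting resource: the CPU total fits on two nodes, but the memory total needs three.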
Please see this article for more information on how the launcher works and on configuration.