Updating Data in a Shiny App on RStudio Connect

Shiny apps are often interfaces that allow users to slice, dice, view, visualize, and upload data. The data can be stored in a variety of ways, including a database or CSV, RDS, or Arrow files.

Many Shiny apps are developed using local data files that are bundled with the app code when it’s sent to RStudio Connect. This can be a good architecture for apps whose data is infrequently updated. However, this architecture turns out to be very brittle when the data needs to be updated on a regular basis, because every data refresh requires re-deploying the entire app.

In general, it’s a good idea to separate the data from the app code if the data is frequently updated.

How do I update the data?

For apps that only consume data, the most common pattern is scheduling an R Markdown document or Jupyter notebook on RStudio Connect. This document should update just the data, not re-deploy the entire app.
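For example, the body of such a scheduled R Markdown document might be little more than a chunk that re-queries the source and publishes the result somewhere the app can read it. The sketch below assumes the data ends up in a pin (covered later on this page); the DSN, query, and pin name are placeholders for your own values.

  library(DBI)
  library(pins)

  # Re-query the source system (the DSN and query are hypothetical)
  con <- DBI::dbConnect(odbc::odbc(), dsn = "warehouse")
  latest <- DBI::dbGetQuery(con, "SELECT * FROM sales WHERE sale_date >= CURRENT_DATE - 30")
  DBI::dbDisconnect(con)

  # Publish the refreshed data; the deployed Shiny app reads this pin instead of
  # bundling the data with its code (assumes an 'rsconnect' board is registered)
  pins::pin(latest, name = "sales_latest", board = "rsconnect")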

If your app also gives users the ability to upload data, consider calling a plumber API from within your Shiny app to update your data.
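As a rough sketch, the plumber API on the receiving end might expose a single endpoint that accepts the uploaded rows and writes them to shared storage. The endpoint path, pin name, and the choice of a pin as the storage layer are illustrative, not prescribed.

  # plumber.R
  library(plumber)
  library(pins)

  #* Append uploaded rows to the shared dataset
  #* @post /append-data
  function(req) {
    new_rows <- jsonlite::fromJSON(req$postBody)   # parse the JSON request body
    # assumes an 'rsconnect' pins board has been registered for this process
    existing <- pins::pin_get("shared_data", board = "rsconnect")
    pins::pin(rbind(existing, new_rows), name = "shared_data", board = "rsconnect")
    list(rows_added = nrow(new_rows))
  }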

Where should the data live?

Bundle the Data with the App Code

You can add data or other files to the deployment bundle when you deploy your app. This is a good option for reasonably small data files that are seldom updated. If you’re finding that your data files are getting large or that you’re frequently updating the data but not the app code, another strategy will probably work better.

Database

Databases are a great option for storing the data for your Shiny app.

If you want to configure your Shiny app to connect to a database, two of the top priorities are deciding how you will establish the connection and how you will protect your credentials. We have recommendations for these topics and more at db.rstudio.com.
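For example, a common pattern is to connect with DBI and odbc and keep the credentials out of the app code by reading them from environment variables, which you can set per piece of content in RStudio Connect. The driver, host, database, and variable names below are placeholders for your own setup.

  library(DBI)
  library(odbc)

  con <- DBI::dbConnect(
    odbc::odbc(),
    Driver   = "PostgreSQL",
    Server   = Sys.getenv("DB_HOST"),
    Database = "app_data",
    UID      = Sys.getenv("DB_USER"),
    PWD      = Sys.getenv("DB_PASSWORD")
  )

  # Close the connection when the Shiny app shuts down
  shiny::onStop(function() DBI::dbDisconnect(con))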

If you find that just pulling data from the database and processing it in the Shiny app is too slow, you may want to consider adopting a design pattern for using big data from R.
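One such pattern is to push the heavy filtering and aggregation into the database with dplyr and dbplyr, so that only a small summarized result is pulled into R; the table and column names here are made up.

  library(dplyr)

  # con is a DBI connection like the one above. dplyr translates this pipeline to
  # SQL, so the filtering and aggregation run in the database and only the small
  # summary table comes back to the Shiny app.
  cutoff <- Sys.Date() - 30
  regional_totals <- tbl(con, "sales") %>%
    filter(sale_date >= !!cutoff) %>%
    group_by(region) %>%
    summarise(total = sum(amount, na.rm = TRUE)) %>%
    collect()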

Pins

Pins is an R package that allows for easy storage and retrieval of data, models, and other R objects. Pins can be a good choice when you don’t have write access to a database or when the data you’re trying to save is something like a model that won’t fit nicely into most databases.

Pins is easy to use from both the development environment and the deployed environment. You can create a pin with the pins::pin function and retrieve the data with pins::pin_get. The pins page has more details on how to use pins.
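A minimal sketch of that round trip, assuming the pins API named above; the server URL, API key environment variable, and pin name are placeholders for your own values.

  library(pins)

  # Register the RStudio Connect board once per session
  board_register("rsconnect",
                 server = "https://connect.example.com",
                 key    = Sys.getenv("CONNECT_API_KEY"))

  # In the scheduled report (or upload API): save the object
  pin(sales_data, name = "sales_data", board = "rsconnect")

  # In the Shiny app: retrieve it
  sales_data <- pin_get("sales_data", board = "rsconnect")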

Here is an example of how to use pins with either a Shiny app or a Plumber API.

Persistent Storage

Shiny apps on RStudio Connect can use the server’s file system to store data. In general, using a database or a pin is going to be a less fragile workflow than using persistent storage on RStudio Connect. The main reason you might consider persistent storage over a pin or database is that it may be faster.

If you’re using persistent storage, you must manually create the directory tree to the location you want to use, and you must ensure that permissions are set correctly.1 Additionally, unless you’re using a directory mounted at the same location in both the development environment and RStudio Connect, it can be hard to test your code outside of production.

You can check which user a piece of content runs as on the content’s Access tab in RStudio Connect, under Who runs this content on the server.

Here are some instructions for configuring a Shiny app to use persistent storage on RStudio Connect.
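As a sketch of what that can look like inside the app, assume your administrator has created /shared-data/my-app and given the content’s run-as user permission to write there; the path and file name are hypothetical.

  data_file <- "/shared-data/my-app/data.rds"   # absolute path on the Connect server

  # Read the current data at startup, falling back to an empty data frame if the
  # file hasn't been written yet
  app_data <- if (file.exists(data_file)) readRDS(data_file) else data.frame()

  # ... later, after the user uploads or edits data ...
  saveRDS(app_data, data_file)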

How does the Shiny app get or update data?

Shiny apps work entirely on a “pull” model, so once the data is updated at the source, your Shiny app needs a way to find out about the change and pull in the updated data.

Read From Disk

For apps where the data is bundled and uploaded with the app or lives on persistent storage, you can read the data from disk.

If the data is bundled with the app at deployment time, you can use a relative file path. For example, if your app directory looks like this:

my_app/
  |--app.R
  |--data/
    |--data.csv

You could load the data with a relative file path like read.csv('./data/data.csv').

If your data is loaded to persistent storage elsewhere on the RStudio Connect server, you should access the data with an absolute file path.

Live Connections

It’s often possible to architect Shiny apps to use live connections to other resources. For example, if your app lets users select a data sample, filter it, and visualize their selection, you could architect the app to (1) pull all of the data it might need at startup and filter based on user input, or (2) use a live connection to pull only the data it needs based on user input.

If the amount of data pulled for any one request is small enough, the second option has the advantage of ensuring the data in the Shiny app is always up-to-date, along with reducing app startup times.

Most live connections are either directly to a database or to a plumber API that does the data filtering.
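For example, a live database connection inside the app might be a reactive that issues a parameterized query whenever the user changes their selection. The table, column, and input names are invented, and the ? placeholder syntax varies by database driver.

  library(shiny)
  library(DBI)

  # Inside the server function; con is a DBI connection established at app startup
  filtered_sales <- reactive({
    req(input$region)
    DBI::dbGetQuery(
      con,
      "SELECT sale_date, amount FROM sales WHERE region = ?",
      params = list(input$region)
    )
  })

  output$sales_plot <- renderPlot({
    plot(filtered_sales()$sale_date, filtered_sales()$amount, type = "l")
  })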

If you’re using your Shiny app to do something that takes a while, like saving a big file or running a model, it’s probably a good idea to use the Shiny app as a trigger for a Plumber API. Doing so avoids freezing the Shiny session while the computation runs. Shiny async is also an option, but it generally makes app code harder to read than using a Plumber API does.
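A rough sketch of that trigger pattern, assuming a plumber endpoint like the /append-data example above is deployed on Connect; the URL, the user_edits() reactive, and the API key variable are placeholders. Note that httr::POST itself is synchronous, so the endpoint should accept the request and return quickly (for example by queuing the heavy work) if you want the Shiny session to stay responsive.

  library(shiny)
  library(httr)

  # Inside the server function: fire the request when the user clicks Save; the
  # heavy computation happens in the plumber process, not the Shiny process
  observeEvent(input$save, {
    resp <- httr::POST(
      "https://connect.example.com/my-data-api/append-data",   # placeholder URL
      body = jsonlite::toJSON(user_edits()),
      httr::add_headers(Authorization = paste("Key", Sys.getenv("CONNECT_API_KEY"))),
      httr::content_type_json()
    )
    showNotification(
      if (httr::status_code(resp) == 200) "Update submitted" else "Update failed"
    )
  })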

Shiny Data Reactive

If your app is a data consumer and you’re not using a live connection, your app will need to refresh itself.

If you want to write a general data-pulling function, shiny::reactivePoll lets you periodically check a resource for changes and re-run an arbitrary function when it has changed. You can also use shiny::invalidateLater to invalidate a reactive on a schedule.
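For instance, reactivePoll can watch a database table and only re-pull the full data when a cheap check shows that something changed; the interval, queries, and connection here are illustrative.

  # Inside the server function; con is a DBI connection
  app_data <- reactivePoll(
    intervalMillis = 60 * 1000,      # re-check every minute
    session        = session,
    # cheap check: has anything been updated?
    checkFunc = function() {
      DBI::dbGetQuery(con, "SELECT MAX(updated_at) AS ts FROM sales")$ts
    },
    # expensive pull: only runs when checkFunc's result changes
    valueFunc = function() {
      DBI::dbGetQuery(con, "SELECT * FROM sales")
    }
  )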

There are also several useful functions that are specially designed to schedule data refreshes. If you’re using a pin, pins::pin_reactive allows you to check for updates to a pin on a schedule. Similarly, shiny::reactiveFileReader allows you to check for updates to a file on persistent storage.
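Both are short calls inside the server function, and each returns a reactive that you call like a function; the pin name, file path, and intervals below are placeholders.

  # Re-check a pin every 60 seconds; downstream reactives re-run only when the
  # pin's contents actually change
  pinned_data <- pins::pin_reactive("sales_data", board = "rsconnect", interval = 60000)

  # Re-read a file on persistent storage whenever it changes on disk
  file_data <- reactiveFileReader(
    intervalMillis = 60000,
    session        = session,
    filePath       = "/shared-data/my-app/data.csv",
    readFunc       = read.csv
  )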


  1. The user who runs the content will need access.