Model Management with RStudio
Python users: To learn about model management workflows with Python, Jupyter, Flask, and Plotly Dash, refer to the Model Management with Python and RStudio version of this page.
Model management is a workflow within the overall model lifecycle that can be used to manage multiple versions of deployed models in production. RStudio helps you develop, deploy, and manage models in production environments within enterprise organizations.
Different components of a model deployment pipeline can be developed using RStudio professional products to:
- Deploy multiple versions of a model as REST APIs using Plumber
- Retain a history of model revisions for traceability using R Markdown
- Route API traffic between deployed models for live testing using Plumber
- Interact with models in production for verification using Shiny
RStudio Server Pro can be used with machine learning packages to develop, train, and score models during development. RStudio Connect can be used to deploy models and API routers as REST APIs and host published notebooks with details on model training.
Example: A/B Testing Multiple Credit Payment Risk Models
The following example demonstrates a full model lifecycle for different versions of a model that were developed in RStudio Server Pro and deployed to RStudio Connect.
The data set used in this example contains demographic information and payment history for a set of customers, along with whether each customer defaulted on (missed) a payment on their credit account.
The goals of this model deployment pipeline are to:
- Train multiple classification models to predict the probability of a new customer defaulting on their credit payment
- Serve predictions via a REST API
- Route API traffic between two different models as part of an A/B testing framework
- Run interactive diagnostics to verify that the model routing is performing as expected
The following sections walk through each stage of the model lifecycle in this example.
Training a Model
You can train a model in a notebook and retain all of the information used to develop the trained model in a published notebook for reproducibility and traceability.
View on RStudio Connect: /model-management/model-a-train/
In this example, the published notebook includes a record of the library, algorithm, and parameters used to train the model. Refer to the RStudio Connect User Guide for more information on publishing documents.
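As a rough illustration of what such a notebook might record, the sketch below trains a classifier and stores the library, algorithm, and parameters next to it. It is not the code from the published notebook: the data is simulated, and the column names (`age`, `credit_limit`, `late_payments`, `default`) and the `model_a_info` structure are assumptions made for this example.

```r
# Simulated stand-in for the credit data set (column names are hypothetical).
set.seed(42)
n <- 500
credit <- data.frame(
  age           = round(runif(n, 21, 70)),
  credit_limit  = round(runif(n, 1000, 20000)),
  late_payments = rpois(n, 1)
)
# In this simulation, default risk rises with the number of late payments.
p_default <- plogis(-2 + 0.8 * credit$late_payments - 0.00005 * credit$credit_limit)
credit$default <- rbinom(n, 1, p_default)

# Train a logistic regression classifier with base R.
model_a <- glm(default ~ age + credit_limit + late_payments,
               data = credit, family = binomial(link = "logit"))

# Record what was used to train the model, for traceability.
model_a_info <- list(
  library    = "stats",
  algorithm  = "glm",
  parameters = list(family = "binomial", link = "logit"),
  r_version  = R.version.string,
  trained_on = Sys.Date()
)
str(model_a_info)
```

Rendering a record like `model_a_info` in the notebook output keeps the training details visible alongside the published document.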
You can also train models on external clusters using background jobs with RStudio Server Pro and Launcher, and even schedule model training notebooks in RStudio Connect to retrain the model on a recurring basis (e.g., daily or weekly).
Serving Model Predictions
Once you’ve trained a model, you can serialize the model to a file, which contains the corresponding trained model weights. You can then serve model predictions via a REST API by adding a few lines of code.
View on RStudio Connect: /model-management/model-a-predict/
In this example, we deployed the model as a REST API and used a custom API route so that the model is served at a dedicated endpoint. Refer to the RStudio Connect User Guide for more information on deploying REST APIs.
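The serialize-then-serve pattern can be sketched in a single Plumber file. This is a minimal sketch under stated assumptions, not the deployed API from this example: the model is a small simulated stand-in, and the file name `model-a.rds`, the `score()` helper, the `late_payments` parameter, and the `/predict` route are all hypothetical.

```r
# Train a small stand-in model on simulated data (hypothetical columns).
set.seed(1)
train <- data.frame(late_payments = rpois(200, 1))
train$default <- rbinom(200, 1, plogis(-2 + 0.8 * train$late_payments))
model <- glm(default ~ late_payments, data = train, family = binomial)

# Serialize the trained model to a file, then restore it as an API would.
model_path <- file.path(tempdir(), "model-a.rds")  # hypothetical file name
saveRDS(model, model_path)
model_a <- readRDS(model_path)

# Scoring helper: returns the predicted probability of default.
score <- function(late_payments) {
  newdata <- data.frame(late_payments = as.numeric(late_payments))
  unname(predict(model_a, newdata = newdata, type = "response"))
}

#* Return the predicted probability of default for one customer
#* @param late_payments Number of late payments (hypothetical input)
#* @post /predict
function(late_payments) {
  list(probability = score(late_payments))
}
```

Saved as `plumber.R`, a file like this could be run locally with `plumber::plumb("plumber.R")$run(port = 8000)`. In a real deployment, the training step would live in the notebook and the API would only read the serialized file.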
Tuning a Model
Once you’ve trained a model, you can tune it based on feedback from either the model evaluation stage or the performance of the model in production. You can select and train a different model by changing the library, algorithm, or parameters.
View on RStudio Connect: /model-management/model-b-train/
In this example, we changed one of the model parameters and will observe the impact on the importance of the factors in the model.
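A comparison of this kind can be sketched as follows. The parameter that the example notebooks actually change is not stated here, so this sketch swaps the link function of a logistic regression purely for illustration, and uses the absolute z-statistic of each coefficient as a rough proxy for factor importance. The simulated data and column names are assumptions.

```r
# Simulated stand-in for the credit data set (column names are hypothetical).
set.seed(42)
n <- 500
credit <- data.frame(
  age           = round(runif(n, 21, 70)),
  late_payments = rpois(n, 1)
)
credit$default <- rbinom(n, 1, plogis(-2 + 0.8 * credit$late_payments))

# Model A and model B differ in a single parameter (here, the link function).
model_a <- glm(default ~ age + late_payments, data = credit,
               family = binomial(link = "logit"))
model_b <- glm(default ~ age + late_payments, data = credit,
               family = binomial(link = "probit"))

# Rough importance proxy: absolute z-statistic of each non-intercept term.
importance <- function(model) {
  z <- summary(model)$coefficients[, "z value"]
  abs(z[names(z) != "(Intercept)"])
}

# Side-by-side view of how the change affects each factor.
terms <- c("age", "late_payments")
comparison <- data.frame(
  model_a = importance(model_a)[terms],
  model_b = importance(model_b)[terms]
)
print(comparison)
```

For tree-based or regularized models, the importance measure provided by the modeling package would take the place of the z-statistic used here.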
Deploying a New Version of a Model
After you’ve developed a new version of a model, you can deploy it as a separate application and REST API.
View on RStudio Connect: /model-management/model-b-predict/
In this example, we deployed a second version of the model as a separate REST API and used a custom content URL so that the second version of the model is served at its own endpoint.
Managing Multiple Versions of a Model
You can manage multiple versions of a model using different methods in RStudio Connect:
1) Versioned API deployments - Each time you deploy an updated version of a REST API, a new application bundle is created. You can access a history of application bundles for each deployment. You can also roll back to any previously deployed version.
2) Custom API routes - You can use custom content URLs to create custom routes to deployed REST APIs. Custom content URLs can be configured and/or swapped between models.
3) Separate API deployments - You can also deploy a REST API as a separate application with a separate version history and API route.
A/B Testing Different Models
You can implement an additional REST API endpoint that routes API traffic between various deployed models to implement different testing strategies.
View on RStudio Connect: /model-management/model-router/
In this example, we’ve deployed an API router that splits traffic between two models, which can be used as a framework for A/B testing.
You can also change the logic in the model router to implement different routing schemes such as champion-challenger testing.
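The routing logic itself can be small. The sketch below shows one possible way to split traffic by weight, with a champion-challenger variant built on the same helper. It is not the deployed router: the URLs in `model_urls`, the function names, and the `/predict` route are hypothetical, and forwarding the request to the chosen model is only indicated in a comment.

```r
# Hypothetical upstream endpoints for the two deployed models.
model_urls <- c(
  a = "https://connect.example.com/model-a/predict",
  b = "https://connect.example.com/model-b/predict"
)

# Pick a model for one request according to the traffic weights.
choose_model <- function(weights = c(a = 0.5, b = 0.5)) {
  stopifnot(all(weights >= 0), sum(weights) > 0)
  sample(names(weights), size = 1, prob = weights)
}

# Champion-challenger variant: the champion (a) takes most of the traffic.
choose_champion_challenger <- function(challenger_share = 0.1) {
  choose_model(c(a = 1 - challenger_share, b = challenger_share))
}

#* Route a prediction request to one of the deployed models
#* @post /predict
function(req) {
  target <- choose_model()
  # Forwarding is left out of this sketch. With the httr package it could be:
  #   httr::POST(model_urls[[target]], body = req$postBody,
  #              httr::content_type_json())
  list(routed_to = target, url = model_urls[[target]])
}
```

Switching from an even A/B split to champion-challenger testing then only requires calling `choose_champion_challenger()` in the handler instead of `choose_model()`.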
Verifying Model Predictions
Once the models, REST API endpoints, and API routers are deployed to production, they are ready to receive traffic and serve predictions.
You can use a deployed application to simulate API traffic, load test your API endpoints, verify the behavior of the API router, and compare the resulting model predictions.
View on RStudio Connect: /model-management/model-dashboard/
In this example, we deployed an interactive dashboard to simulate continuous API traffic, verify the behavior of the API router, and compare the resulting model predictions.
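The verification idea behind such a dashboard can be sketched without a running server: simulate a batch of requests, record which model served each one, and compare the observed split with the configured one. The `route_request()` function below is a hypothetical local stand-in for a call to the API router, and the 50/50 split is an assumption.

```r
set.seed(123)

# Local stand-in for a call to the API router (hypothetical).
route_request <- function(share_b = 0.5) {
  if (runif(1) < share_b) "b" else "a"
}

# Simulate a batch of requests and record which model served each one.
n_requests <- 1000
routed_to <- vapply(seq_len(n_requests),
                    function(i) route_request(0.5),
                    character(1))

# Observed traffic share per model.
observed_share <- table(factor(routed_to, levels = c("a", "b"))) / n_requests
print(observed_share)

# Exact binomial test of the observed split against the configured 50/50 split.
check <- binom.test(sum(routed_to == "b"), n_requests, p = 0.5)
print(check$p.value)
```

Against a live router, `route_request()` would be replaced by an HTTP request to the router endpoint, and the predictions returned by each model could be collected and compared in the same loop.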
The purpose of this example is to demonstrate how model management and various stages of the model lifecycle can be mapped to functionality in RStudio Connect.
This example is simplified and can be used as a starting point for model management. There are additional considerations when deploying and managing multiple versions of models:
- Saving models and parameters on external persistent storage systems
- Encapsulating models with dedicated packages
- Managing package versions with RStudio Package Manager
The source code for all of the model management components described here is available in the sol-eng/model-management repository on GitHub.