In 2015, Microsoft acquired Revolution Analytics. Microsoft R was a rebranding of Revolution R. Since the R landscape at Microsoft can be a bit confusing, I want to try to lay it out simply. First, Microsoft R Server has been rebranded as Microsoft Machine Learning Server. At the time of writing, ML Server 9.2 was available.
So, what are the different ways to use R from Microsoft?
Microsoft R Open
- This is the enhanced, open source distribution of R from Microsoft.
- It is based on, and extends, the R language. It includes the R language and is compatible with all R packages, scripts, and applications that work with the underlying version of R.
- Contains a set of specialized packages to enhance the R experience, including multi-threaded math libraries and enhanced performance optimizations.
Microsoft R Open in Azure ML
- You can execute R scripts as part of Azure Machine Learning Studio experiments.
- This supports Microsoft R Open and CRAN.
- Note that this is currently a couple of versions behind the latest R releases: it supports CRAN R 3.1.0.
RevoScaleR
- Ships as part of Microsoft Machine Learning Server and Microsoft R Client.
- A collection of portable, scalable, and distributable R functions for importing, transforming, and analyzing data at scale.
- You can run it locally or in a remote compute context (for scale-out, for example).
- The remote compute context can be Machine Learning Server, Spark, Hadoop, or SQL Server.
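To make the local versus remote idea concrete, here is a minimal sketch. It assumes RevoScaleR is available (that is, you are on Microsoft R Client or Machine Learning Server); on plain CRAN R the guarded section is simply skipped. The model, the `mtcars` data, and the placeholder connection string are my own illustrative choices.

```r
# Formula shared by the base R fit and the RevoScaleR fit.
model_formula <- mpg ~ wt + hp

# Plain base R fit, which runs anywhere.
baseline <- lm(model_formula, data = mtcars)

# RevoScaleR is only present on Microsoft R Client / Machine Learning Server.
if (requireNamespace("RevoScaleR", quietly = TRUE)) {
  library(RevoScaleR)

  # Local compute context: the work happens in this R session.
  rxSetComputeContext("local")
  fit <- rxLinMod(model_formula, data = mtcars)
  print(summary(fit))

  # Switching to a remote context (not run here). The connection string
  # is a placeholder, and the data would then need to be a matching
  # data source such as RxSqlServerData rather than a local data frame.
  # sql_cc <- RxInSqlServer(connectionString = "<your connection string>")
  # rxSetComputeContext(sql_cc)
} else {
  message("RevoScaleR is not installed; skipping the RevoScaleR part.")
}
```

The point of the design is that the modelling call stays the same; only the compute context (and the data source that goes with it) changes.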
MicrosoftML
- R package that adds state-of-the-art data transforms, machine learning algorithms, and pre-trained models to R and Python functionality.
- Installed as part of Machine Learning Server, Microsoft R Client, and SQL Server Machine Learning Services.
- Works in tandem with RevoScaleR.
mrsdeploy
- R package for establishing a remote session and for publishing and managing a web service that is backed by R.
- It comes installed and loaded with Microsoft R Client. On ML Server and SQL Server it is installed but not loaded by default.
- It makes it easy to run your jobs on a remote server and to publish your models as a web service to Machine Learning Server.
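As a rough sketch of what publishing looks like, the snippet below trains a small model with base R and wraps the mrsdeploy calls in a helper that is defined but not called. The helper name `publish_to_ml_server`, the service name `mtService`, and the endpoint and credentials you would pass in are all hypothetical. It assumes a Machine Learning Server configured for operationalization.

```r
# Train a small model locally with base R.
cars_model <- glm(am ~ hp + wt, data = mtcars, family = binomial)

# Scoring function that the web service would expose.
manual_transmission <- function(hp, wt) {
  new_data <- data.frame(hp = hp, wt = wt)
  predict(cars_model, new_data, type = "response")
}

# Check the function locally before publishing anything.
local_answer <- manual_transmission(120, 2.8)
print(local_answer)

# Hypothetical helper: log in, publish the service, then log out.
# Nothing here runs until you call it with a real endpoint.
publish_to_ml_server <- function(endpoint, username, password) {
  mrsdeploy::remoteLogin(endpoint,
                         username = username,
                         password = password,
                         session = FALSE)
  on.exit(mrsdeploy::remoteLogout())

  mrsdeploy::publishService(
    "mtService",
    code = manual_transmission,
    model = cars_model,
    inputs = list(hp = "numeric", wt = "numeric"),
    outputs = list(answer = "numeric"),
    v = "v1.0.0"
  )
}
```

Keeping the scoring function testable on its own means you can validate it locally first and only then hand it to `publishService`.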
Microsoft R Client
- This is a free data science tool built on top of Microsoft R Open.
- Allows you to work with data locally and then offload to a remote compute context for more power.
- You can use the RevoScaleR package as part of this.
- Its aim is to enable local development and exploration.
Microsoft Machine Learning Server
- A standalone server, installed on a computer that is not running SQL Server.
- Enterprise data analysis at scale - providing high performance and enterprise robustness.
- Supports R and Python.
- Secure environment for deploying and operationalizing machine learning models.
- Makes it easy to deploy your models as a web service.
- Ability to scale out using either Spark, Hadoop, SQL Server, or multiple nodes of ML Server.
- Microsoft Machine Learning Server stand-alone for Linux or Windows is licensed core-for-core as SQL Server 2017.
- All customers who have purchased Software Assurance for SQL Server Enterprise Edition are entitled to use 5 nodes of Microsoft Machine Learning Server for Hadoop/Spark for each core of SQL Server 2017 Enterprise Edition under SA. In addition, the per-node core limit is being removed; customers can have unlimited cores per node of Machine Learning Server for Hadoop/Spark.
Microsoft SQL Server 2017 Machine Learning Services
- Builds on R support in SQL Server 2016
- Integrates Machine Learning Services into the database, with R and Python support.
- Can perform far better than conventional R because you can use server resources and RevoScaleR for scale out.
- This is built into the database engine (vs. the standalone server described above).
- Execute R scripts via sp_execute_external_script
- Supports in-database package management
- Supports native scoring via the T-SQL PREDICT function - you can predict without needing to load the R environment.
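To give a feel for what the R side of `sp_execute_external_script` looks like, here is a small sketch. By default the script receives the rows from the input query as a data frame named `InputDataSet` and returns whatever it assigns to `OutputDataSet`. The stand-in data and the model are my own assumptions so that the snippet also runs outside SQL Server.

```r
# Stand-in for the rows SQL Server would pass in from the input query.
# Inside SQL Server this data frame is provided for you.
InputDataSet <- data.frame(wt = mtcars$wt, mpg = mtcars$mpg)

# --- The part below is what would go into the @script parameter ---
fit <- lm(mpg ~ wt, data = InputDataSet)

# Whatever is assigned to OutputDataSet is returned as the result set.
OutputDataSet <- data.frame(
  term = names(coef(fit)),
  estimate = unname(coef(fit)),
  stringsAsFactors = FALSE
)
# --- End of the @script body ---

print(OutputDataSet)
```

Developing the script body against a local stand-in like this makes it easier to debug before embedding it in T-SQL.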
Power BI and R
- The Power BI service supports viewing and interacting with visuals created with R scripts.
- Note that the service does not support all R packages.
- R visuals that are created in Power BI Desktop, and then published to the Power BI service, for the most part behave like any other visual in the Power BI service; you can interact, filter, slice, and pin them to a dashboard, or share them with others.
Azure Databricks
- Can create notebooks and workflows using R or SparkR
- Supports CRAN packages.
- Leverage SparkR to take advantage of Spark (scale out etc.) for R jobs.
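Here is a minimal SparkR sketch of the kind of thing you might run in a notebook. It assumes SparkR and a working Spark installation are available (a Databricks notebook normally has a session set up already); elsewhere the guarded section is skipped. The `faithful` dataset and the aggregation are just illustrative.

```r
# Base R version for comparison: count eruptions per waiting time.
local_counts <- as.data.frame(table(waiting = faithful$waiting))

# The SparkR part only runs where SparkR is installed.
if (requireNamespace("SparkR", quietly = TRUE)) {
  library(SparkR)

  # Start a Spark session, or reuse the one the notebook already has.
  sparkR.session()

  # Turn the local data frame into a distributed SparkDataFrame.
  df <- as.DataFrame(faithful)

  # The same aggregation, executed by Spark.
  waiting_counts <- summarize(groupBy(df, df$waiting),
                              count = n(df$waiting))
  print(head(arrange(waiting_counts, desc(waiting_counts$count))))
}
```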
R with HDInsight
- HDInsight includes an option to spin up a Machine Learning Server (previously called R Server) to integrate with your HDI cluster.
- Execute R scripts with a Spark/Hadoop compute context to distribute the job across the cluster.
- Use the ScaleR functions from the RevoScaleR package to ensure R functions run across the cluster.
R in Azure Batch
- doAzureParallel is a lightweight R package that allows you to use Azure Batch directly from your R session.
- Built on top of the R foreach package - takes each iteration of the foreach loop and submits it as an Azure Batch task.
- Leverage low-priority VMs to significantly reduce the cost.
- Azure Batch allows you to create a pool of VMs that you can use to run jobs in parallel, achieving better scale-out and more efficiency.
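The foreach pattern can be sketched as follows. The loop runs sequentially and locally if the foreach package is installed, and the Azure Batch version sits in a helper that is defined but not called. The helper name `run_on_azure_batch`, the toy `simulate_mean` task, and the two JSON file names you would pass in are my own assumptions. It also assumes you have doAzureParallel installed and have filled in its credentials and cluster config files.

```r
# Toy task: one independent simulation per loop iteration.
simulate_mean <- function(seed) {
  set.seed(seed)
  mean(rnorm(1000))
}

# Local, sequential run with %do% (only if foreach is installed).
if (requireNamespace("foreach", quietly = TRUE)) {
  library(foreach)
  local_results <- foreach(seed = 1:4, .combine = c) %do% simulate_mean(seed)
  print(local_results)
}

# Hypothetical helper: the same loop with Azure Batch as the backend.
# Nothing here runs until you call it with real config files.
run_on_azure_batch <- function(credentials_file, cluster_file) {
  `%dopar%` <- foreach::`%dopar%`

  doAzureParallel::setCredentials(credentials_file)
  cluster <- doAzureParallel::makeCluster(cluster_file)
  on.exit(doAzureParallel::stopCluster(cluster))

  # Register the Azure Batch pool as the foreach parallel backend.
  doAzureParallel::registerDoAzureParallel(cluster)

  # Each iteration is submitted as a separate task.
  foreach::foreach(seed = 1:4, .combine = c,
                   .export = "simulate_mean") %dopar% simulate_mean(seed)
}
```

The only difference between the two loops is `%do%` versus `%dopar%` plus the registered backend, which is what makes moving an existing foreach loop to Azure Batch fairly painless.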