You can use Azure SQL Data Warehouse as part of your Azure Data Factory pipeline, which is great, but you probably don't want the data warehouse running at its maximum Data Warehouse Units (DWU) setting all the time, especially if the pipeline doesn't run frequently. I want to share some steps to scale SQL Data Warehouse up and down right within your Data Factory pipeline. For scaling up you will need:
- A linked service pointing to the master database on your SQL Data Warehouse's logical server - it has to be master so we can poll the scaling operation for success.
- A dataset of type SQL Table referencing the master database.
- A dataset to act as a fake output (e.g. an Azure Blob dataset), since every copy activity needs an output.
- A pipeline with a copy activity whose sqlReaderQuery modifies the Service Objective and then polls until the change has completed (the scaling is async) - a sketch of such a query follows this list.
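Here is a sketch of the sort of query that works as the sqlReaderQuery - the data warehouse name MyDW and the target service objective DW400 are placeholders, so substitute your own:

```sql
-- Request the new service objective; this returns immediately because the
-- scale operation is applied asynchronously.
ALTER DATABASE [MyDW] MODIFY (SERVICE_OBJECTIVE = 'DW400');

-- Poll master until the data warehouse reports the requested service objective.
WHILE (
    SELECT dso.service_objective
    FROM sys.database_service_objectives AS dso
    JOIN sys.databases AS d ON d.database_id = dso.database_id
    WHERE d.name = 'MyDW'
) <> 'DW400'
BEGIN
    WAITFOR DELAY '00:00:10';
END;

-- The copy activity expects a result set, so return a single confirmation row.
SELECT 'Scale up complete' AS ScaleStatus;
```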
The copy activity then looks something like this:
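This is a sketch in Data Factory v1 JSON - the dataset names MasterDbDataset and DummyBlobOutput are placeholders, and the sqlReaderQuery is the polling query above collapsed onto one line:

```json
{
    "name": "ScaleUpDataWarehouse",
    "type": "Copy",
    "inputs": [ { "name": "MasterDbDataset" } ],
    "outputs": [ { "name": "DummyBlobOutput" } ],
    "typeProperties": {
        "source": {
            "type": "SqlSource",
            "sqlReaderQuery": "ALTER DATABASE [MyDW] MODIFY (SERVICE_OBJECTIVE = 'DW400'); WHILE (SELECT dso.service_objective FROM sys.database_service_objectives dso JOIN sys.databases d ON d.database_id = dso.database_id WHERE d.name = 'MyDW') <> 'DW400' BEGIN WAITFOR DELAY '00:00:10'; END; SELECT 'Scale up complete' AS ScaleStatus;"
        },
        "sink": { "type": "BlobSink" }
    },
    "policy": {
        "timeout": "01:00:00",
        "retry": 1
    },
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    }
}
```

Give the activity a timeout long enough to cover the scale operation, since it only succeeds once the polling loop sees the new service objective.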
For scaling down you need the following:
- A linked service pointing to another db/dw on the same logical server (not master, since you can't create stored procedures there) - a sample definition follows this list.
- A fake dataset for output.
- A stored proc in this db/dw for scaling down.
- A pipeline with a stored proc activity.
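That linked service is just an ordinary Azure SQL Database linked service; in this sketch the server name, database name and credentials are placeholders:

```json
{
    "name": "ScalingHelperDbLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Server=tcp:<yourserver>.database.windows.net,1433;Database=<helperdb>;User ID=<user>;Password=<password>;Encrypt=True;Connection Timeout=30;"
        }
    }
}
```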
The stored proc could look like this:
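A sketch, where the procedure name, data warehouse name and target DW100 are placeholders:

```sql
-- Lives in the helper database on the same logical server as the data warehouse.
CREATE PROCEDURE dbo.ScaleDownDataWarehouse
AS
BEGIN
    -- Request the lower service objective. The change is asynchronous and,
    -- unlike the scale-up, we don't poll for completion here.
    ALTER DATABASE [MyDW] MODIFY (SERVICE_OBJECTIVE = 'DW100');
END;
```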
The stored procedure activity looks something like this:
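Again a sketch in Data Factory v1 JSON - the dataset name DummyScaleDownOutput is a placeholder, and that fake output dataset should be defined against the helper database's linked service so the activity runs the procedure there:

```json
{
    "name": "ScaleDownDataWarehouse",
    "type": "SqlServerStoredProcedure",
    "outputs": [ { "name": "DummyScaleDownOutput" } ],
    "typeProperties": {
        "storedProcedureName": "dbo.ScaleDownDataWarehouse"
    },
    "policy": {
        "timeout": "00:30:00",
        "retry": 1
    },
    "scheduler": {
        "frequency": "Day",
        "interval": 1
    }
}
```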
This should allow you to incorporate scaling a SQL Data Warehouse up and down within a Data Factory pipeline. You can access the full end-to-end pipeline JSON here on GitHub.