In this article we will learn how to use Azure Data Factory to Extract, Transform, Load (ETL) data especially for the data warehousing purposes.
If you are new to Azure Data warehouse, I would like to suggest the below prerequisites.
- SQL Server on-premises installed
- Your Enterprise or Demo database available on SQL Server on-premises
- Let’s Move ‘On-Prem’ Data Warehouse To ‘Azure SQL Data Warehouse’ With Redgate Data Platform Studio (DPS)
- An Azure SQL Data Warehouse account, if you still don’t have one please create ‘Azure SQL Data warehouse here’
If you have some questions on Azure SQL Data Warehouse such as
Why Enterprises Should Now Consider ASDWH Seriously please read here
This is a real time example to schedule ETL from beginning.
Most recently, I got an opportunity to migrate and schedule ETL job from ‘On-Prem’ DWH to ‘Azure SQL Data Warehouse (ASDWH)’.
I am sure that you already have an Azure SQL Data Warehouse (ASDWH), so please LOGIN there and go to the ASDWH
Under Common Task of ‘Azure Data Warehouse’, please click on ‘Load Data’ as shown below:
After you click the load data the below screen will open, please click Azure Data Factory:-
Click on the Create Button.
Provide the required details, please note currently 2 versions of Azure Data Factory are available
However, I will choose V1 – Version 1, V2 Version which is currently in preview will be used in next article as shown below :-
Select location nearest to geography as per your business and click ‘Create’
And Azure Data Factory is ready
Now we need to schedule ‘Azure Data Factory’ to update the ‘Data warehouse database’, so click ‘Copy Data’,
As you click on the Copy Data, the below screen will open:-
Please note ‘Task cadence on Task schedule’ has the following options:-
- Run once now
- Run regularly on schedule
I’m considering my ‘AzureDataFactory’ as ETL to repeat every day at 8:25 PM until end date time of ‘12/31/2099’. If you want to copy only once, similar to ‘Redgate- Data Copy’ as described here Let’s Move ‘On-Prem’ Data Warehouse To ‘Azure SQL Data Warehouse’ With Redgate Data Platform Studio (DPS)
As my ‘On-Prem’ data warehouse source is SQL Server, so select ‘SQL Server’ as shown above, and click next
Here you will be landed to the page and it will ask to Configure Integration. Click at ‘Launch express setup on this computer’ as shown below,
And it will download the gateway as shown below
And Click Next
If no changes are required click Next. As my destination is Azure SQL Data warehouse, I will select it and click next
Below screen will appear
Below Mapping will be displayed to verify
Click Next For Final Deployment
And with this, you just deployed your first ETL Job- ‘Azure Data Factory’. Just to recall, this job will execute daily. You could change the frequency anytime as per business need.
Thanks again for reading the blog.
In case of any doubt or question, feel free to write message or connect by any social media, I will reply ASAP.
Until next time, keep learning….