Part 1: Setup Your Environment
Author: Eric Rouach, DBA at Madeira Data Solutions
Date: 2022-02
Description: The following will guide you through setting up an environment in preparation for loading data into Azure SQL Databases using Azure Data Factory.
Prerequisite: a Microsoft Azure subscription
For this demo, you need to create the following objects using the Azure portal:
-4 Azure SQL Databases (all using the same logical SQL Server, which will be created along with the 1st Azure SQL Database)
-An Azure Blob Storage
-An Azure Data Factory
Also create a simple .csv file called InstitutesData:
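The file contents appear in a screenshot in the original article. As an illustration only, a file matching the columns of the dbo.StagingTable created later in this part might look like this (the row values below are placeholders, not the original data):

```
InstituteId,InstituteName,City
1,Institute_A,Tel Aviv
2,Institute_B,Haifa
3,Institute_C,Jerusalem
```

Save it as InstitutesData.csv; you will upload it to the Blob Storage container created below.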
Let’s get started!
First, in your Microsoft Azure subscription, create 4 Azure SQL databases:
*StagingDB
*Institute_A
*Institute_B
*Institute_C
In the Azure portal, click “Create a resource”:
Then, in the search bar, type “SQL database”:
Click “Create”.
Before creating the databases, you’ll need to create a resource group:
Click “Create new”, enter a name for your resource group and click “OK”:
Then, enter a name for your database:
If this is the first time you are creating a database, you probably don’t have a server to choose yet. Click “Create new”:
Choose a Server Name and a Location:
Then, choose the authentication method as well as a login and a strong password, and click OK:
To create the database according to your needs, enter the “Configure database” section:
Choose the Basic compute tier and click “Apply”:
Leave other settings as they are and click “Review+create” and then “Create”. You should get the following screen:
Wait a few moments until your new resource is created. You should get the following screen:
You should then set the server firewall settings to allow your own IP address to connect to the database:
Enter a new rule name and your IP address, then click “Save”:
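If you prefer T-SQL over the portal, the same firewall rule can be created with the sp_set_firewall_rule system stored procedure, run while connected to the server's master database. The rule name and IP address below are placeholders; substitute your own:

```
-- Run in the master database of your logical SQL Server.
-- Replace the IP addresses with your own public IP.
EXECUTE sp_set_firewall_rule
    @name = N'AllowMyIP',
    @start_ip_address = '203.0.113.42',
    @end_ip_address = '203.0.113.42';
```

A single-address rule uses the same value for the start and end of the range.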
Repeat the previous steps to create the 3 other databases. Don’t forget that at this point you already have a resource group and a SQL Server you can reuse for the remaining databases.
Our next step is the creation of a Storage Account:
Enter “storage account” in the search bar and press Enter:
Click Create.
On the next page, select your subscription and resource group:
The validation should pass, and your resource should be created within a few moments.
Go to the storage account. On the left pane, click “Containers”:
Then add a new container:
This container will store the .csv file you need to load data from into your databases.
The next step will be the creation of an Azure Data Factory.
You may search for the relevant resource from the home page:
Create a new Data Factory:
Click “Create data factory”.
Fill in the requested fields, then click “Next: Git configuration”.
Check the “Configure Git later” box, then click “Review + create”. When you get a “validation passed” message, click “Create”. Your data factory will be created within a few moments.
At this point, you should be able to see the following resources from the Azure portal’s home page:
Before proceeding to build your first ADF pipeline, you’ll first create some objects in the Azure SQL Databases using T-SQL commands:
In every Institute database, create a dbo.Institute table:
CREATE TABLE dbo.Institute
(
Id INT NOT NULL,
InstituteName NVARCHAR(50) NOT NULL,
City NVARCHAR(50) NOT NULL
)
GO
In the Staging database, create a dbo.StagingTable table:
CREATE TABLE dbo.StagingTable
(
InstituteId INT NOT NULL,
InstituteName NVARCHAR(50) NOT NULL,
City NVARCHAR(50) NOT NULL
)
GO
In every "Institute" database also create the following objects:
-a dbo.StagingTable_Ext external table reflecting the dbo.StagingTable table in the StagingDB database.
(Also create all the required external objects: a database master key, a database scoped credential, and an external data source.)
For more information about cross-database queries in Azure SQL, please refer to this article: https://www.madeiradata.com/post/how-to-transfer-data-between-databases-in-azure-sql
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Strongpassword123'
--SELECT * FROM sys.symmetric_keys
CREATE DATABASE SCOPED CREDENTIAL DBScopedCredential
WITH IDENTITY = 'adf-demo-eric', --the IDENTITY is the Login Name we used to login to the Server.
SECRET = 'AzureDataFactory12345!@#'; --the SECRET is the Password we used to login to the Server.
GO
CREATE EXTERNAL DATA SOURCE Staging
WITH
(
TYPE = RDBMS, -- Relational Database Management System
LOCATION = 'adf-demo-eric.database.windows.net', -- Our server name.
DATABASE_NAME = 'StagingDB', -- The database we use as our data source.
CREDENTIAL = DBScopedCredential -- the name we gave to our DATABASE SCOPED CREDENTIAL.
);
GO
CREATE EXTERNAL TABLE dbo.StagingTable_Ext
(
InstituteId INT NOT NULL,
InstituteName NVARCHAR(50) NOT NULL,
City NVARCHAR(50) NOT NULL
)
WITH
(
DATA_SOURCE = Staging,
SCHEMA_NAME = 'dbo',
OBJECT_NAME = 'StagingTable'
);
GO
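To confirm the external objects work, you can run a quick query against the external table from any Institute database. This assumes at least one row has already been loaded into StagingDB's dbo.StagingTable; until then, the query simply returns an empty result set:

```
-- Run in an Institute database after creating the external objects.
SELECT TOP (10)
    InstituteId,
    InstituteName,
    City
FROM dbo.StagingTable_Ext;
```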
In the StagingDB database, create a dbo.GetDatabaseName stored procedure accepting an institute id parameter and an output parameter, which retrieves the relevant database name according to the institute id:
CREATE OR ALTER PROC [dbo].[GetDatabaseName]
@InstituteId INT,
@InstituteName NVARCHAR(50) OUTPUT -- sized to match the InstituteName column and avoid truncation
AS
BEGIN
SELECT
@InstituteName = [InstituteName]
FROM
[dbo].[StagingTable]
WHERE
[InstituteId] = @InstituteId
SELECT @InstituteName AS InstituteName
END
GO
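A quick way to test the procedure is to capture the output parameter into a local variable. The institute id below is an example value; it assumes a matching row exists in dbo.StagingTable:

```
DECLARE @DbName NVARCHAR(50);

EXEC dbo.GetDatabaseName
    @InstituteId = 1, -- example id; assumes a matching row in dbo.StagingTable
    @InstituteName = @DbName OUTPUT;

SELECT @DbName AS DatabaseName;
```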
In every Institute DB, create a dbo.MergeInstituteTable stored procedure using the dbo.StagingTable_Ext external table:
CREATE OR ALTER PROC [dbo].[MergeInstituteTable]
@InstituteId INT
AS
BEGIN
MERGE [dbo].[Institute] t
USING
(
SELECT
[InstituteId],
[InstituteName],
[City]
FROM
[dbo].[StagingTable_Ext]
WHERE
[InstituteId] = @InstituteId
) s
ON t.[Id] = s.[InstituteId]
WHEN NOT MATCHED BY TARGET
THEN
INSERT
(
[Id],
[InstituteName],
[City]
)
VALUES
(
s.[InstituteId],
s.[InstituteName],
s.[City]
)
;
END
GO
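Once the procedure exists, you can test it from the relevant Institute database. The id below is an example value; rows for that id must first be present in the staging table for the MERGE to insert anything:

```
-- Example: merge the staging rows for institute id 1 into dbo.Institute.
EXEC dbo.MergeInstituteTable @InstituteId = 1;

-- Verify the result.
SELECT * FROM dbo.Institute;
```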
In the StagingDB database, create a dbo.TruncateStagingTable stored procedure to empty the staging table between loads:
CREATE OR ALTER PROC dbo.TruncateStagingTable
AS
BEGIN
TRUNCATE TABLE [dbo].[StagingTable]
END
GO
--------End of part 1--------
In Part 2, we will build an Azure Data Factory pipeline.