
Get Started with Azure Data Factory

Updated: Feb 12, 2022




Part 1: Set Up Your Environment


Author: Eric Rouach, DBA at Madeira Data Solutions

Date: 2022-02

Description: The following will guide you through setting up an environment in preparation for loading data into Azure SQL Databases using Azure Data Factory.


Prerequisite: a Microsoft Azure subscription


For this demo, you need to create the following objects using the Azure portal:


-4 Azure SQL Databases (all using the same logical SQL Server, which will be created along with the 1st Azure SQL Database)

-An Azure Storage account (Blob storage)

-An Azure Data Factory


Also create a simple .csv file called InstitutesData (a sample is shown below):
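The exact contents are not critical; any few rows holding an institute id, an institute name and a city will do (these match the columns of the staging table you will create later). Here is a minimal sketch, assuming a header row, institute names that match the databases you are about to create, and placeholder cities:

InstituteId,InstituteName,City
1,Institute_A,Tel Aviv
2,Institute_B,Haifa
3,Institute_C,Jerusalem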


Let’s get started!


First, in your Microsoft Azure subscription, create 4 Azure SQL databases:


*StagingDB

*Institute_A

*Institute_B

*Institute_C


In the Azure portal click on “Create a resource”:

Then, in the search bar, type “SQL database”:



Click “Create”.


Before creating the databases, you’ll need to create a resource group:


Click “Create new”, enter a name for your resource group and click “OK”:


Then, enter a name for your database:

If this is the first time you are creating a database, you probably don’t have a server to choose from. Click “Create new”:


Choose a Server Name and a Location:


Then, choose the authentication method as well as a login and a strong password, and click OK:


To configure the database according to your needs, open the “Configure database” section:

Choose the Basic compute tier and click “Apply”:


Leave the other settings as they are, click “Review + create”, and then click “Create”. You should get the following screen:

Wait a few moments until your new resource is created. You should get the following screen:

You should then set the server firewall settings to allow your own IP address to connect to the database:

Enter a new rule name and your IP address, then click “Save”:
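If you prefer scripting, server-level firewall rules can also be managed with T-SQL by connecting to the master database of the logical server (from a client that can already reach it) and calling sp_set_firewall_rule. A minimal sketch, where the rule name and IP addresses are placeholders you should replace with your own:

EXECUTE sp_set_firewall_rule
    @name = N'AllowMyIP',               -- placeholder rule name
    @start_ip_address = '203.0.113.10', -- placeholder: your public IP address
    @end_ip_address = '203.0.113.10';
GO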


Repeat the previous steps to create the 3 other databases. Don’t forget that at this point you already have a resource group and a logical SQL Server that you can reuse when creating the other databases.
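Once the firewall rule is in place, a quick way to confirm you can reach each new database is to connect to it (for example with SSMS or Azure Data Studio) and run a trivial query such as:

SELECT @@SERVERNAME AS ServerName, DB_NAME() AS DatabaseName;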


Our next step is the creation of a Storage Account:

Enter “storage account” in the search bar and press enter:


Click Create.


On the next page, select your subscription and resource group:


The validation should pass, and your resource should be created within a few moments.


Go to the storage account. On the left pane, click “Containers”:


Then add a new container:


This container will store the .csv file from which you will load data into your databases.


The next step will be the creation of an Azure Data Factory.

You may search for the relevant resource from the home page:

Create a new Data Factory:

Click “Create data factory”.

Fill in the requested fields and then click “Next: Git configuration”.

Check the “Configure Git later” box, then click “Review + create”. When you get a “Validation passed” message, click “Create”. Your data factory will be created within a few moments.


At this point, you should be able to see the following resources from the Azure portal’s home page:


Before proceeding to build your first ADF pipeline, you’ll create some objects in the Azure SQL databases using T-SQL commands:


In every Institute database, create a dbo.Institute table:

CREATE TABLE dbo.Institute
(
    Id INT NOT NULL,
    InstituteName NVARCHAR(50) NOT NULL,
    City NVARCHAR(50) NOT NULL
)
GO


In the Staging database, create a dbo.StagingTable table:


CREATE TABLE dbo.StagingTable
(
    InstituteId INT NOT NULL,
    InstituteName NVARCHAR(50) NOT NULL,
    City NVARCHAR(50) NOT NULL
)
GO
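If you would like a few rows in the staging table to test the stored procedures created below (before the Part 2 pipeline loads it from the .csv file), you can insert sample data manually; the values are placeholders matching the sample .csv shown earlier:

INSERT INTO dbo.StagingTable (InstituteId, InstituteName, City)
VALUES
    (1, N'Institute_A', N'Tel Aviv'),
    (2, N'Institute_B', N'Haifa'),
    (3, N'Institute_C', N'Jerusalem');
GO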


In every "Institute" database also create the following objects:

-a dbo.StagingTable_Ext external table reflecting the dbo.StagingTable from the StagingDB database.

(Also create all needed external objects).

For more information about cross-database queries in Azure SQL, please refer to this article: https://www.madeiradata.com/post/how-to-transfer-data-between-databases-in-azure-sql



CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'Strongpassword123'
--SELECT * FROM sys.symmetric_keys

CREATE DATABASE SCOPED CREDENTIAL DBScopedCredential
WITH IDENTITY = 'adf-demo-eric', --the IDENTITY is the Login Name we used to login to the Server.
SECRET = 'AzureDataFactory12345!@#'; --the SECRET is the Password we used to login to the Server.
GO


CREATE EXTERNAL DATA SOURCE Staging
WITH
(
    TYPE = RDBMS, --Relational Database Management System
    LOCATION = 'adf-demo-eric.database.windows.net', -- Our server name.
    DATABASE_NAME = 'StagingDB', -- The database we use as our data source.
    CREDENTIAL = DBScopedCredential -- the name we gave to our DATABASE SCOPED CREDENTIAL.
);
GO


CREATE EXTERNAL TABLE dbo.StagingTable_Ext
(
    InstituteId INT NOT NULL,
    InstituteName NVARCHAR(50) NOT NULL,
    City NVARCHAR(50) NOT NULL
)
WITH
(
    DATA_SOURCE = Staging,
    SCHEMA_NAME = 'dbo',
    OBJECT_NAME = 'StagingTable'
);
GO
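As a quick sanity check, you can query the external table from the Institute database; it should return whatever currently exists in dbo.StagingTable in the StagingDB database (possibly no rows yet):

SELECT TOP (10) * FROM dbo.StagingTable_Ext;
GO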


In the StagingDB database, create a dbo.GetDatabaseName stored procedure accepting an institute id parameter and an output parameter, used to retrieve the relevant database name (i.e. the institute name) for a given institute id:



CREATE OR ALTER PROC [dbo].[GetDatabaseName]
    @InstituteId INT,
    @InstituteName NVARCHAR(25) OUTPUT
AS
BEGIN
    SELECT
        @InstituteName = [InstituteName]
    FROM
        [dbo].[StagingTable]
    WHERE
        [InstituteId] = @InstituteId

    SELECT @InstituteName AS InstituteName
END
GO
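A quick way to try the procedure (assuming the staging table already contains a row for the institute id you pass; the id below is a placeholder):

DECLARE @DatabaseName NVARCHAR(25);
EXEC dbo.GetDatabaseName
    @InstituteId = 1,                      -- placeholder institute id
    @InstituteName = @DatabaseName OUTPUT; -- receives the matching institute/database name
SELECT @DatabaseName AS DatabaseName;
GO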


In every Institute DB, create a dbo.MergeInstituteTable stored procedure that merges data from the dbo.StagingTable_Ext external table into the dbo.Institute table.



CREATE OR ALTER PROC [dbo].[MergeInstituteTable]
    @InstituteId INT
AS
BEGIN
    MERGE [dbo].[Institute] t
    USING
    (
        SELECT
            [InstituteId],
            [InstituteName],
            [City]
        FROM
            [dbo].[StagingTable_Ext]
        WHERE
            [InstituteId] = @InstituteId
    ) s
    ON t.[Id] = s.[InstituteId]
    WHEN NOT MATCHED BY TARGET
    THEN
        INSERT
        (
            [Id],
            [InstituteName],
            [City]
        )
        VALUES
        (
            s.[InstituteId],
            s.[InstituteName],
            s.[City]
        );
END
GO
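For example, in the Institute_A database you could call it like this (the id is a placeholder and should match a staging row belonging to Institute_A):

EXEC dbo.MergeInstituteTable @InstituteId = 1; -- placeholder institute id
SELECT * FROM dbo.Institute;                   -- verify the row was merged in
GO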


In the StagingDB database, create a dbo.TruncateStagingTable stored procedure.



CREATE OR ALTER PROC dbo.TruncateStagingTable
AS
BEGIN
    TRUNCATE TABLE [dbo].[StagingTable]
END
GO
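If you inserted test rows earlier, you can clear the staging table again simply by calling the procedure:

EXEC dbo.TruncateStagingTable;
GO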


--------End of part 1--------


In Part 2, we will build an Azure Data Factory pipeline.
