top of page

CETAS in SQL Server 2022

Writer's picture: Eric RouachEric Rouach

Updated: May 7, 2023



Written by: Eric Rouach, Madeira Data Solutions - May 2023


Create External Table As Select or "CETAS" has finally become available on SQL Server with the release of the 2022 version.

After a short setup, we can create various formats files containing any query's result set. The created file/s must be kept on an Azure storage solution i.e. Azure Blob Storage.

The process also creates an external table reflecting the updated file's content.


Pre-requisites:

-An instance of SQL Server 2022

-An Azure storage solution i.e. Azure Blob Storage

-Basic understanding of external tables, external data sources, cross-database queries in Azure SQL Database

-In case you're new to external tables, I recommend reading this article:


The following demo includes 2 parts:

1) Setup steps


2) An example stored procedure that creates a .parquet file with the content of a query from the AdventureWorks2019

database. The file is created in a folder located at an Azure Blob Storage container.

An external table reflecting this file is also created and, as we might not need to keep it,

the stored procedure also deletes any external table older than 15 minutes.


Useful documentation links:





1) One-time set up:


In case you haven't selected Polybase while installing SQL Server 2022, go to SQL Server Installation Center and add the missing feature.


Make sure Polybase is installed:

In case the above query returned 0, run the script below:

You will also need to enable the "allow polybase export" configuration option:










The next step is the creation of 4 external objects:

  • master key

  • database scoped credential

  • external data source

  • external file format


Create MASTER KEY

Create DATABASE SCOPED CREDENTIAL:

*when using a blob storage as a target location, use SAS as the identity

*before generating the SAS, make sure to grant the READ, WRITE and CREATE permissions via Azure Portal:




























Create EXTERNAL DATA SOURCE

Create EXTERNAL FILE FORMAT

Check objects creation:


2) Example Stored Procedure:



USE AdventureWorks2019;

GO


CREATE OR ALTER PROC dbo.CreateDailyParquetFile


AS


BEGIN


BEGIN TRY

BEGIN TRAN

--Create the parquet file and external table

DECLARE @TableName NVARCHAR(255),

@Location NVARCHAR(255),

@SqlCmd NVARCHAR(MAX)


SET @TableName = 'ext_sales_'+FORMAT(GETDATE(),'yyyyMMddHHmmss')

--LOCATION must point to a folder and have a trailing /

SET @Location = '/sales_'+FORMAT(GETDATE(),'yyyyMMddHHmmss')

SET @SqlCmd =

N'

CREATE EXTERNAL TABLE '+@TableName+'

WITH (

LOCATION = '+''''+@Location+''''+','+'

DATA_SOURCE = Test1Container1,

FILE_FORMAT = ParquetFileFormat

)

AS


SELECT

TOP 1000 *

FROM

AdventureWorks2019.[Sales].[SalesOrderDetail]

GO

'

EXEC sp_executesql @SqlCmd

--Delete "old" external tables

DECLARE @TableToDelete VARCHAR(50)

DECLARE @DeleteCmd NVARCHAR(MAX)


DECLARE DeletionCursor CURSOR LOCAL FAST_FORWARD

FOR

SELECT

name

FROM

sys.external_tables

WHERE

create_date < DATEADD(MINUTE,-15,GETDATE())


OPEN DeletionCursor

FETCH NEXT FROM DeletionCursor INTO @TableToDelete


WHILE @@FETCH_STATUS = 0

BEGIN

SET @DeleteCmd = N'DROP EXTERNAL TABLE '+@TableToDelete

EXEC sp_executesql @DeleteCmd

FETCH NEXT FROM DeletionCursor INTO @TableToDelete

END


CLOSE DeletionCursor

DEALLOCATE DeletionCursor

COMMIT

END TRY

BEGIN CATCH

INSERT INTO dbo.ErrorLog

(

[ErrorTime],

[UserName],

[ErrorNumber],

[ErrorSeverity],

[ErrorState],

[ErrorProcedure],

[ErrorLine],

[ErrorMessage]

)

SELECT

GETDATE(),

SUSER_SNAME(),

ERROR_NUMBER(),

ERROR_SEVERITY(),

ERROR_STATE(),

ERROR_PROCEDURE(),

ERROR_LINE(),

ERROR_MESSAGE()


PRINT 'The error was: '+ERROR_MESSAGE()

IF @@TRANCOUNT > 0

ROLLBACK

END CATCH


END;

GO


After executing the procedure, 3 things will happen:

  • A .parquet file will be created in your Azure storage

  • An external table reflecting the .parquet file's content will be created in the database














  • Any external table older than 15 minutes will be deleted from the database's External Tables folder.



*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*







2 comments

2 Comments


Guest
May 04, 2023

Wouldn't the ROLLBACK after logging the error roll it back?

Like
Eric Rouach
Eric Rouach
May 07, 2023
Replying to

Hi, that's a good question: actually the rollback affects the creation and deletion of the external tables. If the error occurs after the file creation, then the rollback won't delete the file. This post of course presents a quick example of how to use CETAS; I guess I would run the whole thing in a PowerShell script for greater control of the process. Thanks for your comment.

Like

STAY IN TOUCH

Get New posts delivered straight to your inbox

Thank you for subscribing!

bottom of page