Press "Enter" to skip to content

Category: Cloud

Azure SQL Analytics

Arun Sirpal gives an introduction to Azure SQL Analytics:

Please see the prerequisites section within this document – YOU MUST do this or else you will not be able to use this feature. https://docs.microsoft.com/en-us/azure/log-analytics/log-analytics-azure-sql#prerequisites

Once set up, it should take approximately 15 minutes to start capturing and rendering back some data. Don't be surprised if it takes a little longer, as was the case for me.

My biggest complaint is about the visuals; otherwise, this looks like the beginning of a solid monitoring solution within Azure SQL Database.
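
The prerequisite work boils down to having a Log Analytics workspace and turning on diagnostics for each database you want monitored. As a rough sketch only (AzureRM-era cmdlets; the resource group, workspace, server, and database names are placeholders, and this is my assumption of the simplest route, not Arun's exact steps):

# Log in with the AzureRM module
Login-AzureRmAccount

# The Log Analytics workspace that will back Azure SQL Analytics (placeholder names)
$workspace = Get-AzureRmOperationalInsightsWorkspace -ResourceGroupName "rg-monitoring" -Name "la-sqlanalytics"

# Full resource ID of the Azure SQL Database to monitor (placeholder subscription/server/database)
$dbResourceId = "/subscriptions/<subscription-id>/resourceGroups/rg-data/providers/Microsoft.Sql/servers/myserver/databases/mydb"

# Route the database's diagnostic telemetry to the workspace
Set-AzureRmDiagnosticSetting -ResourceId $dbResourceId -WorkspaceId $workspace.ResourceId -Enabled $true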

Row Counts From Statistics In Azure DW

Derik Hammer has a script to estimate row counts in an Azure SQL Data Warehouse table:

Azure SQL Data Warehouse is a massively parallel processing (MPP) architecture designed for large-scale data warehouses. An MPP system creates logical / physical slices of the data. In SQL Data Warehouse’s case, the data has 60 logical slices, at all performance tiers. This means that a single table can have up to 60 different object_ids. This is why, in SQL Data Warehouse, there is the concept of physical and logical object_ids along with physical names.

Below is a query for finding row counts of tables in SQL Data Warehouse which accounts for the differences in architecture between my earlier script, written for SQL Server, and SQL Data Warehouse.

Click through for the script.
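
Derik's script is the one to use, but for a feel of the mechanics, a minimal sketch which sums the per-distribution counts from the pdw DMVs might look like this (the server, database, and credential values are placeholders):

# Estimate row counts per table in Azure SQL Data Warehouse by summing across all 60 distributions
$query = @"
SELECT  t.name            AS table_name,
        SUM(ps.row_count) AS estimated_rows
FROM    sys.tables AS t
JOIN    sys.pdw_table_mappings AS m
        ON t.object_id = m.object_id
JOIN    sys.pdw_nodes_tables AS nt
        ON nt.name = m.physical_name
JOIN    sys.dm_pdw_nodes_db_partition_stats AS ps
        ON  ps.object_id       = nt.object_id
        AND ps.pdw_node_id     = nt.pdw_node_id
        AND ps.distribution_id = nt.distribution_id
WHERE   ps.index_id < 2   -- heap or clustered index only, so rows are not double-counted
GROUP BY t.name;
"@

Invoke-Sqlcmd -ServerInstance "mydw.database.windows.net" -Database "MyDataWarehouse" `
    -Username "sqladmin" -Password "<password>" -Query $query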

Failover Groups In Azure SQL Database

Jim Donahoe shows off Failover Groups in Azure SQL Database.  Part 1 involves setting up a Failover Group:

In my former company, we had 22 web applications that all had connections to various databases.  We had all of our databases configured for Geo-Replication already, but still if we had to failover, we had to update each connection string for the web apps along with others which became a tedious process. In came Failover Groups to the rescue!  With a Failover Group, I was able to create two endpoints that stayed the same no matter which server was primary/secondary.  I liked to think of these as my Availability Group Listeners as they kinda serve the same functionality: Route traffic to a node depending on whether it's read-only or not.  Best part?  It’s configured through the Azure Portal SO EASILY!  You can use PowerShell as well, but for this blog post, I will walk through the creation via the Portal.  I will make a separate post or attach a script at some point for the PowerShell deployment.

Before we start the configuration portion of this though, let’s take a look at how Microsoft defines what a Failover Group is.  I found this definition here:  “Azure SQL Database auto-failover groups (in-preview) is a SQL Database feature designed to automatically manage geo-replication relationship, connectivity, and failover at scale.”  Sounds pretty interesting, right? Let’s make one!

In Part 2, Jim shows how to connect to SQL Server using the Failover Group listener:

Well, now that the easy stuff is out of the way, let’s talk about how you connect to these groups via SSMS.  This is where some of the confusion happens.  When I first configured a Failover Group, the first thing I tried to do was connect to the Primary server via SSMS thinking it will work just like an Always On Listener in traditional SQL Server…NEWP!

If you’re running a production database on Azure SQL Database, you might want to look at Failover Groups.
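
Jim's walkthrough uses the portal; for the PowerShell route he mentions, the shape of it is roughly this (AzureRM cmdlets; all names are placeholders and the parameter choices are my best guess rather than his script):

# Create the failover group between two existing Azure SQL servers
New-AzureRmSqlDatabaseFailoverGroup -ResourceGroupName "rg-data" `
    -ServerName "sql-primary" -PartnerServerName "sql-secondary" `
    -FailoverGroupName "fg-myapp" -FailoverPolicy Automatic

# Add an existing database (already on the primary) to the group
$db = Get-AzureRmSqlDatabase -ResourceGroupName "rg-data" -ServerName "sql-primary" -DatabaseName "mydb"
Add-AzureRmSqlDatabaseToFailoverGroup -ResourceGroupName "rg-data" -ServerName "sql-primary" `
    -FailoverGroupName "fg-myapp" -Database $db

# Applications then connect to the stable listener endpoints:
#   read-write: fg-myapp.database.windows.net
#   read-only:  fg-myapp.secondary.database.windows.net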

Creating A Linked Server To Amazon Athena

Maria Zakourdaev shows that you can create a linked server connection in SQL Server to query data using Amazon Athena:

I will show you today how you can use Management Studio or any stored procedure to query data stored in a CSV file located in S3 storage. I am using the CSV file format as an example here; columnar Parquet gives much better performance.

I am going to:

1. Put a simple CSV file on S3 storage

2. Create an external table in the Athena service over the data file bucket

3. Create a linked server to Athena inside SQL Server

4. Use OPENQUERY to query the data.

The Athena service is built on top of Presto, a distributed SQL engine, and also uses Apache Hive to create, alter, and drop tables. You can run ANSI SQL statements in the Athena query editor, launching it from the AWS web services UI. You can use complex joins, window functions, and many other great SQL language features. Using Athena eliminates the need for ETL because it projects your schema onto the data files at the time of the query.

Standard linked server warnings apply, but sometimes you need to bridge a couple of technologies.
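
Maria's post has the full setup (the Athena ODBC driver plus a system DSN); the SQL Server side of it is essentially a linked server over MSDASQL and an OPENQUERY call, something like this sketch (the DSN, schema, and table names are invented for illustration):

# T-SQL run against the SQL Server that hosts the linked server (here via Invoke-Sqlcmd)
$setup = @"
EXEC master.dbo.sp_addlinkedserver
    @server     = N'ATHENA',
    @srvproduct = N'',
    @provider   = N'MSDASQL',     -- OLE DB provider for ODBC
    @datasrc    = N'AthenaDSN';   -- ODBC DSN configured with the Athena driver
"@

$query = @"
SELECT *
FROM OPENQUERY(ATHENA, 'SELECT order_id, order_total FROM sampledb.orders LIMIT 10');
"@

Invoke-Sqlcmd -ServerInstance "localhost" -Query $setup
Invoke-Sqlcmd -ServerInstance "localhost" -Query $query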

Creating An Azure Chat Bot

Dustin Ryan shows how to build a QnA bot:

After you’ve created your knowledge base, you can then edit and update it. There are a few different ways to update your knowledge base.

a. Manually edit the knowledge base directly within QnAMaker.ai. You can do this by editing the questions and modifying the text of your knowledge base.

b. Edit the source of your knowledge base. Click the Settings tab on the left to edit the URL of your FAQs or upload a new document.

Building a bot is pretty easy, and Dustin shows you just how to do it.

Data Lake Archive Tier

Ust Oldfield looks at an important part of a data lake:

The Archive access tier in blob storage was made generally available today (13th December 2017) and with it comes the final piece in the puzzle to archiving data from the data lake.

Where Hot and Cool access tiers can be applied at a storage account level, the Archive access tier can only be applied to a blob storage container. To understand why the Archive access tier can only be applied to a container, you need to understand the features of the Archive access tier. It is intended for data that has no or low SLAs for availability within an organisation and the data is stored offline (Hot and Cool access tiers are online). Therefore, it can take up to 15 hours for data to be made online and available. Bringing Archive data online is a process called rehydration (fitting for the data lake). If you have lots of blob containers in a storage account, you can archive them and rehydrate them as required, rather than having to rehydrate the entire storage account.

Read on for more details, including a pattern for archiving data lake data.
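
In practice the tier is flipped blob by blob, so archiving a whole container means looping over its contents. A hedged sketch with the AzureRM-era storage cmdlets (account, key, container, and blob names are placeholders):

# Storage context for the blob storage account (placeholders)
$ctx = New-AzureStorageContext -StorageAccountName "mydatalakearchive" -StorageAccountKey "<storage-key>"

# Push every blob in the container down to the Archive tier
Get-AzureStorageBlob -Container "lake-archive" -Context $ctx |
    ForEach-Object { $_.ICloudBlob.SetStandardBlobTier("Archive") }

# Rehydration later is the same call with "Hot" or "Cool" -- and can take hours to complete
# Get-AzureStorageBlob -Container "lake-archive" -Blob "2017/12/sales.csv" -Context $ctx |
#     ForEach-Object { $_.ICloudBlob.SetStandardBlobTier("Hot") }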

When Data Factory Flows Don’t

Emma Stewart points out an issue that might vex newcomers to Azure Data Factory:

The data within the Data Lake store was organised into a Year and Month hierarchy for the folders, and each day's transactions were stored in a file which was named after the day within the relevant month folder. The task then was to create a pipeline which copies the dataset in the Data Lake Store over to the dbo.Orders table in Azure SQL DB every day within the scheduled period (Q1 2016).

After creating all the json scripts and deploying them (with no errors), I clicked on the ‘Monitor and Manage’ tile to monitor the activities, check everything was working as it should be and monitor the progress. After waiting for at least 10 minutes, I started to get frustrated.

Click through for the fix and an explanation.

Using The Command Line To Migrate To Azure SQL Database

Arun Sirpal shows how to use SqlPackage.exe to migrate a database to Azure SQL Database:

I have moved many databases to Azure via different methods, but I recently came across a new way. Well, technically it's not new; I should say, newly found. The migration was done via the command line, which is not exactly groundbreaking, but it's nice to have another option.

The idea behind this is simple. Create the bacpac via command line using sqlpackage.exe with the action as export then do an import action into Azure.

Read on for the demo.
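
The two commands look roughly like this (server names, paths, and credentials are placeholders; adjust the SqlPackage.exe location to wherever it lives on your machine):

# Export the source database to a bacpac
& "C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe" `
    "/Action:Export" `
    "/SourceServerName:localhost" `
    "/SourceDatabaseName:MyDatabase" `
    "/TargetFile:C:\temp\MyDatabase.bacpac"

# Import the bacpac into Azure SQL Database, picking an edition and service objective
& "C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe" `
    "/Action:Import" `
    "/SourceFile:C:\temp\MyDatabase.bacpac" `
    "/TargetServerName:myazureserver.database.windows.net" `
    "/TargetDatabaseName:MyDatabase" `
    "/TargetUser:sqladmin" `
    "/TargetPassword:<password>" `
    "/p:DatabaseEdition=Standard" `
    "/p:DatabaseServiceObjective=S2"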

Fetching U-SQL Job Input And Output Paths

Matthew Hicks shows how to retrieve information on U-SQL input and output paths using PowerShell:

Each time you submit a U-SQL job, a job folder is created in your Azure Data Lake Store account. This folder contains useful debugging information about the job, including a file called the U-SQL algebra file. This is an XML file containing information about your job graph, the list of input and output files, and other key U-SQL job metadata.

We’ve just published a sample script that reads the U-SQL algebra file for a specified job and returns the input or output files. Give it a try!

Read on for more.

Larger Azure SQL Database Standard Tier Sizes

Tim Radney reports on a new Standard tier preview for Azure SQL Database:

Previously, the Standard tier only offered 4 levels: 15, 30, 50, and 100 DTUs, with a database size limit of 250GB on standard disk. If you had a database that was larger than 250GB but did not need more than 100 DTUs for CPU, memory, or I/O, you were stuck paying a Premium price just for database size. With the new changes, you can now have up to a 1TB database in the Standard tier; you just have to pay for the extra storage. Currently, storage is billed at $0.085/GB during the preview. Increasing from the included size of 250GB to 1TB adds 774GB at a cost of $65.79 per month.

The new Standard preview DTU sizes support 200, 400, 800, 1,600, and 3,000 DTU options. If you have a SQL Server database workload that is more CPU-bound than I/O, these Standard tier options have the potential to save you a lot of money; however, if your workload is I/O bound, the Premium tier is going to outperform the Standard tier.

Tim follows this up with a couple of quick demos.
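
If you'd rather try the preview sizes from PowerShell than the portal, the scale operation is a single cmdlet call. A sketch, assuming the S4 (200 DTU) service objective name, a region with the preview enabled, and placeholder resource names:

# Scale an existing database to the new Standard preview objective and raise the size cap to 1TB
Set-AzureRmSqlDatabase -ResourceGroupName "rg-data" -ServerName "myserver" -DatabaseName "mydb" `
    -Edition "Standard" -RequestedServiceObjectiveName "S4" -MaxSizeBytes 1099511627776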
