Press "Enter" to skip to content

Category: Cloud

Spatial Workaround In Azure SQL Data Warehouse

Rolf Tesmer has you covered if you want to perform spatial queries against data in Azure SQL Data Warehouse:

Recently we had a requirement to perform SQL Spatial functions on data that was stored in Azure SQL DW.  Seems simple enough as spatial has been in SQL for many years, but unfortunately, SQL Spatial functions are not natively supported in Azure SQL DW (yet)!

If interested – this is the link to the Azure Feedback feature request to make this available in Azure SQL DW – https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/10508991-support-for-spatial-data-type

AND SO — to use spatial data in Azure SQL DW we need to look at alternative methods.  Luckily a recent new feature in Azure SQL DB  in the form of Elastic Query to Azure SQL DW now gives us the ability to perform these SQL Spatial functions on data within Azure SQL DW via a very simple method!

Check out that Azure Feedback item if you’d like to see native spatial support rather than using elastic query.  In the meantime, click through to see Rolf’s workaround.

Comments closed

Configuring Azure SQL Analytics

Esat Erkec has a guide showing how to configure and use Azure SQL Analytics on Azure SQL Database:

The most important and challenging responsibility of a database administrator is monitoring performance metrics. Because monitoring performance and troubleshooting performance issues are considered to be difficult. For this reason, we need diagnostic and monitoring tools to measure performance counters and metrics. For Azure SQL there is a tool which is named SQL Analytics. With this tool, we can measure and monitor Azure SQL databases and elastic pools. At the same time, we can create alerts for notifications. SQL Analytics offers performance metrics in graphical form. In this article, we will learn how to enable Azure SQL Analytics.

This is a long and screenshot-filled post, which is helpful if you’re getting started.

Comments closed

DTUs Or vCores For Azure SQL DB

Denny Cherry looks at a new Azure SQL Database announcement:

Today Microsoft has announced that there is a new way to buy Azure SQL DB. If DTUs aren’t making sense to you, you’ll be happy to know that you can now simply select how many vCores you want for your SQL DB workload.  Now this will still require that you have an understanding on your workload to use this new vCore based way to buy Azure SQL DB, but Cores are a concept that is easy for people to talk about and wrap their heads around. Now this new model is only in preview at the moment, but I’m guessing that it’ll be around for a while in preview, then it’ll go GA as this new model makes sense.

I definitely prefer this model, as IT departments already understand the idea, whereas DTUs were nebulous at best.

Comments closed

Changing The SQL Server Port On Azure Container Services

Andrew Pruski shows how to change off of the default port for SQL Server when running Azure Container Services:

So, how do you do it when running SQL Server in Azure Container Services?

Well there’s a couple of options available.

The first one is to change the port that SQL is listening on in the container, open that port on the container, and direct to that port from the service.

The second one is to leave SQL Server listening on the default port and direct a non-default port to port 1433 from the service.

Read on to see Andrew try out both of these methods.

Comments closed

Azure Analysis Services Parallelism And Scale

Teo Lachev has a quick note on Azure Analysis Services and parallelism:

Anyone who’s done parallel programming knows that it’s not easy. Safe access and locks need to be used to synchronize access and protect shared resources. The AS xVelocity engine (aka Vertipaq) is a multi-threaded application that accesses RAM as a shared resource. As each thread competes for a global lock for memory allocation/deallocation, scalability decreases. Because processor cores reserve memory in chunks, called cache lines, additional synchronization is required when threads access memory location that are close to each other. Before SQL Server 2016 SP1, this was a serious issue with Tabular. As we’ve learned, starting with SQL Server 2016 SP1, Microsoft switched to using the Intel Threading Building Clocks (TBB) C++ library. Specifically, Tabular now uses the TBB scalable memory allocator. Mind you that TBB is not CPU-specific extensions, so any modern Intel CPU should get these benefits. “Analysis Services SP1 uses an Intel TBB-based scalable allocator that provides separate memory pools for every core. As the number of cores increases, the system can scale almost linearly” and “The Intel TBB-based scalable allocator is also expected to help mitigate performance problems due to heap fragmentation that have been shown to occur with the Windows Heap”. Further, starting with SP1, Tabular is NUMA aware with 4-node NUMA system but I don’t know if the AAS VMs are preconfigured for NUMA.

Teo doesn’t have any firm conclusions at this point, but his initial testing looks positive.

Comments closed

“Failed To Delete Database” In Azure

Arun Sirpal walks us through a reason why you might not be able to delete an Azure SQL Database:

Failed to delete the database: TestDB. ErrorCode: 400 ErrorMessage: Database ‘TestDB’ cannot be deleted because it is used as a sync metadata database which still contains sync groups and/or sync agents.

After some digging around I remembered that I tried to setup a data sync group where this database was involved in bi-directional synchronization with another Azure SQL Database. The other database was deleted a long time ago and clicking on the sync option within the portal confirmed that nothing was actually ever connected or ready.

Arun walks us through the wrong way before getting us down the right path.  I recommend the latter.

Comments closed

Using Cosmos DB For Graph Data

Jose Mendes has an introduction to the Cosmos DB graph engine:

Gremlin is the graph traversal language of Apache TinkerPop, an open source Graph Computing Framework. Gremlin allows the users to write complex queries to traverse their graphs by using a composed sequence of steps, with each step performing an operation on the data stream (further details here). There are 4 fundamental steps:

· transform: transform the objects in the stream

· filter: remove objects from the stream

· sideEffect: pass the object, but yield some side effect

· branch: decide which step to take

Click through for a quick example showing how to create and populate a graph.

Comments closed

A Frugal Stretch Database Alternative

Chris Bell shares a version of Stretch databases for people with budgets:

Stretch databases were going to provide “Cost-effective” availability for cold data, and unlike typical cold data storage,  our data would always be online and available to query. Applications would not need to be modified to work with the seamless design of the stretch database. Run a query, and the data was there being pulled from the cloud when needed. Streamlining on-premises data maintenance by reducing the local footprint of the data files as well as the size of backups! It was even going to be possible to keep data secure via encrypted connections to the cloud and in theory, make a migration to the cloud even easier.

It was destined to be a major win!

Then the price was mentioned.

Do you know anyone using stretch databases today?

Yeah, me neither.

It’s an interesting workaround with several moving parts.

Comments closed

Resuming Azure SQL Data Warehouse With Powershell

Arun Sirpal shows how to unpause an Azure SQL Data Warehouse instance using Powershell:

I totally forgot that with Azure SQL DWH you can pause and resume compute, to save money because it is expensive. Question is how do you go about resuming compute? TSQL is not possible and sure you can do the change via Azure portal but what about PowerShell?

This makes it easy to script out an overnight data load and then pausing the Azure Data Warehouse until the morning when those analysts come in, so that you can save a bit of cash (or a lot, depending upon your DWU utilization).

Comments closed

Comparing Hadoop On EC2 To EMR

Mark Litwintschik looks at EC2 versus EMR in terms of performance, specifically targeting a solution at less than $3 per hour:

The $3.00 price point was driven by the first method: running a single-node Hadoop installation. I wanted to make sure the dataset used in this benchmark could easily fit into memory.

The price set the limit for the second method: AWS EMR. This is Amazon’s Hadoop Platform offering. It has a huge feature set but the key one is that it lets you setup Hadoop clusters with very little instruction. The $3.00 price limit includes the service fee for EMR.

Note I’ll be running everything in AWS’ Irish region and the prices mentioned are region- and in the case of spot prices, time-specific.

I was a bit surprised at which service won.

Comments closed