Press "Enter" to skip to content

Category: Cloud

Cloud Savings And TCO

James Serra argues that moving to the cloud can be a net cost savings:

I often tell clients that if you have your own on-premises data center, you are in the air conditioning business.  Wouldn’t you rather focus all your efforts on analyzing data?  You could also try to “save money” by doing your own accounting, but wouldn’t it make more sense to off-load that to an accounting company?  Why not also off-load the costly, up-front investment of hardware, software, and other infrastructure, and the costs of maintaining, updating, and securing an on-premises system?

And when dealing with my favorite topic, data warehousing, a conventional on-premises data warehouse can cost millions of dollars in the following: licensing fees, hardware, and services; the time and expertise required to set up, manage, deploy, and tune the warehouse; and the costs to secure and back up the data.  These are all items that a cloud solution eliminates or greatly minimizes.

When estimating hardware costs for a data warehouse, consider the costs of servers, additional storage devices, firewalls, networking switches, data center space to house the hardware, a high-speed network (with redundancy) to access the data, and the power and redundant power supplies needed to keep the system up and running.  If your warehouse is mission-critical, then you also need to add the costs to configure a disaster recovery site, effectively doubling the cost.

I don’t think this story plays quite as well.  For small and mid-sized companies, yes, the cloud is often a net savings.  For companies whose products were designed to be cloud-first and take advantage of burstiness and spot markets, yes, you can drive cost savings that way.  But for most mid-to-large companies, the calculus shifts: sometimes cloud options work better, but often they don’t.  Need a few hundred SQL Server instances with microsecond-level latency running SQL Server Enterprise Edition 24/7?  That’s not going to be cheaper.


The Shuffling Operator And Azure SQL DW

Arun Sirpal is ready to deal:

For the purposes of this post the T-SQL shown is elementary (don’t be surprised by that); the point is really about SHUFFLE.  So, I select the estimated plan for the following code.

SELECT SOD.[SalesOrderID], SOD.[ProductID], SOH.[TotalDue]
FROM [SalesLT].[SalesOrderDetail] SOD
JOIN [SalesLT].[SalesOrderHeader] SOH
  ON SOH.[SalesOrderID] = SOD.[SalesOrderID]
WHERE SOH.[TotalDue] > 1000;

Shuffle me once, why not shuffle me twice. If you REALLY want to see the EXPLAIN command output, then it looks like this snippet below.

The DSQL operation clearly states SHUFFLE_MOVE. Why am I getting this? What does it mean?

Shuffling data isn’t the worst thing in the world, but it is a fairly expensive operation all things considered.  Ideally, your warehouse architecture limits the number of shuffle operations, but because you can hash-distribute a table on only one key, sometimes a shuffle is inevitable.
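
To make the hashing point concrete, here is a sketch of mine (not from Arun’s post) of one way to eliminate the shuffle: hash-distribute both tables on the join key via CTAS so that matching rows land on the same distribution.  The _Hash table names are hypothetical.

-- Hypothetical CTAS copies: hash-distribute both tables on the join key
-- so that the join becomes distribution-local.
CREATE TABLE [SalesLT].[SalesOrderDetail_Hash]
WITH (DISTRIBUTION = HASH([SalesOrderID]), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM [SalesLT].[SalesOrderDetail];

CREATE TABLE [SalesLT].[SalesOrderHeader_Hash]
WITH (DISTRIBUTION = HASH([SalesOrderID]), CLUSTERED COLUMNSTORE INDEX)
AS SELECT * FROM [SalesLT].[SalesOrderHeader];

-- EXPLAIN returns the distributed plan as XML; with both tables distributed
-- on SalesOrderID, the plan should no longer contain a SHUFFLE_MOVE step.
EXPLAIN
SELECT SOD.[SalesOrderID], SOD.[ProductID], SOH.[TotalDue]
FROM [SalesLT].[SalesOrderDetail_Hash] SOD
JOIN [SalesLT].[SalesOrderHeader_Hash] SOH
  ON SOH.[SalesOrderID] = SOD.[SalesOrderID]
WHERE SOH.[TotalDue] > 1000;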


Spatial Workaround In Azure SQL Data Warehouse

Rolf Tesmer has you covered if you want to perform spatial queries against data in Azure SQL Data Warehouse:

Recently we had a requirement to perform SQL Spatial functions on data that was stored in Azure SQL DW.  Seems simple enough as spatial has been in SQL for many years, but unfortunately, SQL Spatial functions are not natively supported in Azure SQL DW (yet)!

If interested – this is the link to the Azure Feedback feature request to make this available in Azure SQL DW – https://feedback.azure.com/forums/307516-sql-data-warehouse/suggestions/10508991-support-for-spatial-data-type

And so, to use spatial data in Azure SQL DW, we need to look at alternative methods.  Luckily, a recent new feature in Azure SQL DB, Elastic Query to Azure SQL DW, now gives us the ability to perform these SQL Spatial functions on data within Azure SQL DW via a very simple method!

Check out that Azure Feedback item if you’d like to see native spatial support rather than using elastic query.  In the meantime, click through to see Rolf’s workaround.
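
In sketch form, the trick looks something like this (my illustration rather than Rolf’s exact code, assuming a DW table that stores coordinates as plain floats): expose the DW table to an Azure SQL Database through an elastic query external table, and the spatial functions then run in the database rather than the warehouse.

-- Run these in the Azure SQL Database, not in the DW.
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

CREATE DATABASE SCOPED CREDENTIAL DwCred
WITH IDENTITY = '<sql login>', SECRET = '<password>';

CREATE EXTERNAL DATA SOURCE DwSource WITH
(
    TYPE = RDBMS,
    LOCATION = '<server>.database.windows.net',
    DATABASE_NAME = '<dw database>',
    CREDENTIAL = DwCred
);

-- External table mirroring a (hypothetical) DW table of float coordinates.
CREATE EXTERNAL TABLE dbo.Locations
(
    LocationId INT,
    Latitude FLOAT,
    Longitude FLOAT
)
WITH (DATA_SOURCE = DwSource);

-- The spatial work happens here, in Azure SQL DB, over data living in the DW.
SELECT LocationId,
       geography::Point(Latitude, Longitude, 4326).STDistance(
           geography::Point(47.6062, -122.3321, 4326)) AS MetersFromSeattle
FROM dbo.Locations;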


Configuring Azure SQL Analytics

Esat Erkec has a guide showing how to configure and use Azure SQL Analytics on Azure SQL Database:

The most important and challenging responsibility of a database administrator is monitoring performance metrics, because monitoring performance and troubleshooting performance issues are difficult.  For this reason, we need diagnostic and monitoring tools to measure performance counters and metrics.  For Azure SQL, there is a tool named SQL Analytics.  With this tool, we can measure and monitor Azure SQL databases and elastic pools, and we can also create alerts for notifications.  SQL Analytics presents performance metrics in graphical form.  In this article, we will learn how to enable Azure SQL Analytics.

This is a long and screenshot-filled post, which is helpful if you’re getting started.


DTUs Or vCores For Azure SQL DB

Denny Cherry looks at a new Azure SQL Database announcement:

Today Microsoft announced that there is a new way to buy Azure SQL DB.  If DTUs aren’t making sense to you, you’ll be happy to know that you can now simply select how many vCores you want for your SQL DB workload.  This still requires that you have an understanding of your workload, but cores are a concept that is easy for people to talk about and wrap their heads around.  The new model is only in preview at the moment, but I’m guessing that it’ll stay in preview for a while and then go GA, as this new model makes sense.

I definitely prefer this model, as IT departments already understand the idea, whereas DTUs were nebulous at best.
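
As an aside, moving an existing database onto a vCore-based objective is a single T-SQL statement as well as a portal action.  A minimal sketch, using a hypothetical database name and the General Purpose Gen5 8-vCore service objective:

-- Switch from a DTU-based tier to a vCore-based service objective.
-- GP_Gen5_8 = General Purpose edition, Gen5 hardware, 8 vCores.
ALTER DATABASE [MyDatabase]
MODIFY (EDITION = 'GeneralPurpose', SERVICE_OBJECTIVE = 'GP_Gen5_8');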


Changing The SQL Server Port On Azure Container Services

Andrew Pruski shows how to change off of the default port for SQL Server when running Azure Container Services:

So, how do you do it when running SQL Server in Azure Container Services?

Well there’s a couple of options available.

The first one is to change the port that SQL is listening on in the container, open that port on the container, and direct to that port from the service.

The second one is to leave SQL Server listening on the default port and direct a non-default port to port 1433 from the service.

Read on to see Andrew try out both of these methods.


Azure Analysis Services Parallelism And Scale

Teo Lachev has a quick note on Azure Analysis Services and parallelism:

Anyone who’s done parallel programming knows that it’s not easy. Safe access and locks need to be used to synchronize access and protect shared resources. The AS xVelocity engine (aka Vertipaq) is a multi-threaded application that accesses RAM as a shared resource. As each thread competes for a global lock for memory allocation/deallocation, scalability decreases. Because processor cores reserve memory in chunks, called cache lines, additional synchronization is required when threads access memory locations that are close to each other. Before SQL Server 2016 SP1, this was a serious issue with Tabular. As we’ve learned, starting with SQL Server 2016 SP1, Microsoft switched to using the Intel Threading Building Blocks (TBB) C++ library. Specifically, Tabular now uses the TBB scalable memory allocator. Mind you, TBB is not a set of CPU-specific extensions, so any modern Intel CPU should get these benefits. “Analysis Services SP1 uses an Intel TBB-based scalable allocator that provides separate memory pools for every core. As the number of cores increases, the system can scale almost linearly” and “The Intel TBB-based scalable allocator is also expected to help mitigate performance problems due to heap fragmentation that have been shown to occur with the Windows Heap”. Further, starting with SP1, Tabular is NUMA-aware on 4-node NUMA systems, but I don’t know if the AAS VMs are preconfigured for NUMA.

Teo doesn’t have any firm conclusions at this point, but his initial testing looks positive.


“Failed To Delete Database” In Azure

Arun Sirpal walks us through a reason why you might not be able to delete an Azure SQL Database:

Failed to delete the database: TestDB. ErrorCode: 400 ErrorMessage: Database ‘TestDB’ cannot be deleted because it is used as a sync metadata database which still contains sync groups and/or sync agents.

After some digging around, I remembered that I had tried to set up a data sync group in which this database was involved in bi-directional synchronization with another Azure SQL Database.  The other database was deleted a long time ago, and clicking on the sync option within the portal confirmed that nothing was actually ever connected or ready.

Arun walks us through the wrong way before getting us down the right path.  I recommend the latter.
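
If you hit the same error, a quick catalog query (my sketch, not from Arun’s post) shows what Data Sync left behind; SQL Data Sync keeps its bookkeeping in the dss and TaskHosting schemas in the sync metadata database, plus a DataSync schema in member databases:

-- List leftover SQL Data Sync objects, grouped by schema.
SELECT s.name AS schema_name, t.name AS table_name
FROM sys.tables AS t
JOIN sys.schemas AS s
    ON t.schema_id = s.schema_id
WHERE s.name IN ('dss', 'TaskHosting', 'DataSync')
ORDER BY s.name, t.name;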


Using Cosmos DB For Graph Data

Jose Mendes has an introduction to the Cosmos DB graph engine:

Gremlin is the graph traversal language of Apache TinkerPop, an open-source graph computing framework.  Gremlin allows users to write complex queries to traverse their graphs by using a composed sequence of steps, with each step performing an operation on the data stream.  There are four fundamental steps:

· transform: transform the objects in the stream

· filter: remove objects from the stream

· sideEffect: pass the object, but yield some side effect

· branch: decide which step to take

Click through for a quick example showing how to create and populate a graph.
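
As a quick taste of how those steps compose (my example, not one from Jose’s post), here is a one-line Gremlin traversal that chains filters and transforms over a hypothetical social graph:

// filter: start from all vertices, keep only those labeled 'person'
// filter: keep only people older than 30
// transform: walk outgoing 'knows' edges to their neighbors
// transform: project each neighbor's name property
g.V().hasLabel('person').has('age', gt(30)).out('knows').values('name')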


A Frugal Stretch Database Alternative

Chris Bell shares a version of Stretch databases for people with budgets:

Stretch databases were going to provide “cost-effective” availability for cold data, and unlike typical cold data storage, our data would always be online and available to query. Applications would not need to be modified to work with the seamless design of the stretch database. Run a query, and the data was there, being pulled from the cloud when needed. It streamlined on-premises data maintenance by reducing the local footprint of the data files as well as the size of backups! It was even going to be possible to keep data secure via encrypted connections to the cloud and, in theory, make a migration to the cloud even easier.

It was destined to be a major win!

Then the price was mentioned.

Do you know anyone using stretch databases today?

Yeah, me neither.

It’s an interesting workaround with several moving parts.
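
For a rough flavor of the general pattern (my sketch with hypothetical names, not necessarily Chris’s exact design): keep hot rows local, move cold rows to a cheap Azure SQL Database, and stitch the two back together with a view over a linked server.

-- Hot rows stay local; cold rows live in a cheaper Azure SQL Database
-- reached through a (hypothetical) linked server named AZUREDB.
CREATE VIEW dbo.Orders
AS
SELECT [OrderId], [OrderDate], [Amount]
FROM dbo.Orders_Hot
UNION ALL
SELECT [OrderId], [OrderDate], [Amount]
FROM AZUREDB.ColdStore.dbo.Orders_Cold;  -- four-part name via the linked server

Queries against dbo.Orders see all of the data; only the ones that actually touch cold rows pay the trip to Azure.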
