Category: Cloud

Scaling Azure Analysis Services

Chris Seferlis helps us make the decision between scaling up or out with Azure Analysis Services:

Some of you may not know when or how to scale up your queries or scale out your processing. Today I’d like to help you understand when and how to do each with Azure Analysis Services. First, you need to decide which tier you should be using. You can do that by looking at the QPUs (Query Processing Units) of each tier on Azure. Here’s a quick breakdown:

  • Developer Tier – gives you up to 20 QPUs

  • Basic Tier – is a mid-scale tier, not meant for heavy loads

  • Standard Tier (currently the highest available) – allows you more capability and flexibility

Read on for some pointers.

Comments closed

Clustered Columnstore Index Online Rebuild

Niko Neugebauer looks at a feature which will pop up in SQL Server vNext:

The current state of the Clustered Columnstore Index ONLINE rebuild suggests an unfinished version, which will surely be vastly improved before it is released and supported in SQL Server. I have seen a couple of deadlocks and canceled transactions, so I will update this blog post as soon as there is an official announcement of this feature.
If you are still looking to start working on this feature, then I would suggest trying it on smaller tables. Like really, really small ones.
Oh, and for online rebuild operations, focus on partition-level rebuilds – you are using partitioning, right? 🙂

Niko gave this a try in Azure SQL Database, as there is no publicly available version of SQL Server which supports this.  I’ve been waiting for this feature for 3 years now, so I’ll be happy to see it in production.

Column-Level Security In Azure SQL Data Warehouse

Kavitha Jonnakuti announces a new feature for Azure SQL Data Warehouse:

Access to the table columns can be controlled based on the user’s execution context or their group membership with the standard GRANT T-SQL statement. To secure your data, you simply define a security policy via the GRANT statement to your table columns. For example, if you would like to limit access to PII data in your customers table, you can simply GRANT SELECT permissions on specific columns to the ContractEmp role:

GRANT SELECT ON dbo.Customers (CustomerId, FirstName, LastName) TO ContractEmp;

This capability is available now in all Azure regions with no additional charge.

This has been in regular SQL Server for a long time, so it’s good to see it make its way into Azure SQL Data Warehouse, and in a manner which doesn’t involve creating user-defined functions for predicates like Row-Level Security.
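
As a rough sketch of how that grant behaves, assume dbo.Customers also holds a sensitive column that was left out of the grant (the name SSN is hypothetical and not from the announcement) and that ContractEmp has no broader SELECT permission on the table. A session connected as a member of ContractEmp could then read only the granted columns:

```sql
-- Grant from the announcement: only three columns are exposed to the role.
GRANT SELECT ON dbo.Customers (CustomerId, FirstName, LastName) TO ContractEmp;

-- Run the following while connected as a member of ContractEmp.

-- Succeeds: every referenced column is covered by the grant.
SELECT CustomerId, FirstName, LastName
FROM dbo.Customers;

-- Expected to fail with a permission-denied error, because SSN
-- (a hypothetical column) was not included in the grant.
SELECT CustomerId, SSN
FROM dbo.Customers;
```

A SELECT * against the table should fail for the same reason, so queries written for that role need explicit column lists.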

Thoughts On Spending Money In The Cloud

Andy Leonard has a few thoughts on spending money once you’ve migrated to the cloud:

In a previous consulting life, a customer contacted us and asked for an evaluation of their architecture. They were attempting to scale the business and encountering… obstacles. A team looked over their software and database designs and recommended a rewrite of their custom code. We supplied a proposal (including an expensive estimate) to deliver the re-architecture, redesign, and rewriting of their code.

They felt the price was too high.

We understood, so we countered with a proposal to coach their team to deliver the same result. This would cost less, the outcome would be the same, and their team would grok the solution (because their team would build the solution). The catch? It would take longer.

They felt this solution would take too long.

Andy has some great thoughts on the subject.  One area where I’d push further is to say that the best way to take advantage of cloud services is not the best way to take advantage of on-prem services, so even if you have a well-architected on-prem solution, it might not be ideal for running in AWS or Azure.

Executing SSIS From Azure Data Factory

Andy Leonard shows us how to execute an SSIS package from Azure Data Factory:

The good people who work on Azure Data Factory recently added an Execute SSIS Package activity. It’s pretty cool. Let’s tinker with it some, shall we?

First, you will need to create an Azure Data Factory SSIS Integration Runtime. If you don’t know how, that’s ok – I’ve written a post titled Lift and Shift SSIS Part 0: Creating the ADF Integration Runtime that describes one way to set up ADFIR.

Read on for an example.

Backing Up SQL Server To S3

David Fowler shows how to back up SQL Server directly to an AWS S3 bucket:

I’ve been having a little play around with AWS recently and was looking at S3 (AWS’ cloud storage) when I thought to myself, I wonder if it’s possible to back up an on-premises SQL Server database directly to S3?

When we want to back up directly to Azure, we can use the ‘TO URL’ clause in our backup statement.  Although S3 buckets can also be accessed via a URL, I couldn’t find a way to back up directly to that URL.  Most of the solutions on the web have you backing up your databases locally and then a second step of the job uses PowerShell to copy those backups up to your S3 buckets.  I didn’t really want to do it that way; I wanted to back up directly to S3 with no middle steps.  We like to keep things as simple as possible here at SQL Undercover: the more moving parts you’ve got, the more chance for things to go wrong.

So I needed a way for SQL Server to be able to directly access my buckets.  I started to wonder if it’s possible to map a bucket as a network drive.  A little hunting around and I came across this lovely tool, TNTDrive.  TNTDrive will let us do exactly that and with the bucket mapped as a local drive, it was simply a case of running the backup to that local drive.

Quite useful if your servers are in a disk crunch.  In general, I’d probably lean toward keeping on-disk backups and creating a job to migrate those backups to S3.
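
To make the mapped-drive approach concrete, here is a minimal sketch. It assumes TNTDrive has mapped the bucket as drive S: and that the account running the SQL Server service can see that drive; the drive letter, folder, and database name are all placeholders rather than details from David’s post.

```sql
-- Assumption: the S3 bucket is mapped as S:\ and is visible to the
-- account that runs the SQL Server service.
BACKUP DATABASE [SalesDB]                      -- hypothetical database name
TO DISK = N'S:\Backups\SalesDB_Full.bak'       -- hypothetical path on the mapped bucket
WITH COMPRESSION,   -- smaller backup file, so less data to push to the bucket
     CHECKSUM,      -- verify page checksums and checksum the backup itself
     STATS = 10;    -- report progress every 10 percent

-- Optional sanity check that the file written to the bucket is readable.
RESTORE VERIFYONLY
FROM DISK = N'S:\Backups\SalesDB_Full.bak'
WITH CHECKSUM;
```

Because the write goes over the network to S3, a failed or slow connection fails or slows the backup itself, which is part of why a local backup plus a copy job can be the safer default.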

Auditing Options With Azure SQL Data Warehouse

Janusz Rokicki explores what is available in Azure SQL Data Warehouse when it comes to auditing:

Auditing is disabled by default and the UI experience depends on the region to which the logical server is deployed. For instance, in UK South, the portal offers no options to manage auditing.

In North Europe, the portal allows Table Auditing (table-storage based) to be enabled on the SQL Data Warehouse scope, but it isn’t possible to enable Blob Auditing.

On top of that, Blob Auditing behaves differently when enabled on a logical server level in different regions. In locations that support Table Auditing, turning on Blob Auditing automatically enables it in all databases, including SQL Data Warehouses, which is expected. In other regions, Blob Auditing is not automatically enabled and has to be turned on programmatically by calling the ARM REST API.

I imagine the plan is to support this across the board but it’s rolling out region by region.

User-Defined Restore Points In Azure SQL DW

Kevin Ngo announces a new feature in Azure SQL Data Warehouse:

Previously, SQL DW supported only automated snapshots guaranteeing an eight-hour recovery point objective (RPO). While this snapshot policy provided high levels of protection, customers asked for more control over restore points to enable more efficient data warehouse management capabilities leading to quicker times of recovery in the event of any workload interruptions or user errors.

Now, with user-defined restore points, in addition to the automated snapshots, you can initiate snapshots before and after significant operations on your data warehouse. With more granular restore points, you ensure that each restore point is logically consistent and limit the impact and reduce recovery time of restoring the data warehouse should this be needed. User-defined restore points can also be labeled so they are easy to identify afterwards.

Creating a user-defined restore point is a one-liner in PowerShell, and it’s something you could do after each warehouse load, for example.

Azure Data Lake Analytics Updates

Michael Rys has a boatload of new updates for Azure Data Lake:

The top items include expanding our built-in support for standard file formats with native Parquet support for extractors and outputters (in public preview) and ORC (in private preview)!

In addition, since the fast file set feature now has been generally released, we can consume hundreds of thousands of such files in bulk in a single EXTRACT statement. We will publish a blog at a later date to give you much more detailed information on how this capability helps you to process so many files efficiently in a scalable way.

Important aspects of processing files at scale include:

  1. the ability to generate many files from a rowset in a single statement, providing a way to dynamically partition the data for future use with Hadoop or Spark, or to provide individual files for customers. This has been our top customer ask on the ADL Feedback forum, and now it is in private preview!

  2. the ability to handle many small files. We recommend that you make your files large enough for the processing to be efficient (300MB to 4GB is a good range), but often, your file formats (e.g., images) or data ingestion pipelines (e.g., EventHub archives) are not able to reach that size. Thus, we are adding the ability to group several files into a vertex to increase efficiency and lower cost of your job (we have seen 10 to 30 times improvement in some customer jobs!).

Read on for the full changelog.
