Press "Enter" to skip to content

Month: March 2019

When Deletes Increase Data Size

Brent Ozar shows a case where indexes can grow in size as you delete data:

Here’s one way that deletes can cause a table to grow:
– The rows were originally written when the database didn’t have Read Committed Snapshot (RCSI), Snapshot Isolation (SI), or Availability Groups (AGs) enabled
– RCSI or SI was enabled, or the database was added into an Availability Group
– During the deletions, a 14-byte timestamp was added to each deleted row to support RCSI/SI/AG reads

Click through for a demo and takeaways.
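
If you want to reproduce the effect yourself, here is a minimal T-SQL sketch of the idea (the database, table, and column names are made up for illustration, not taken from Brent's demo): capture the index's page counts, enable RCSI, delete some of the pre-RCSI rows, and compare.

```sql
-- Hypothetical database and table whose rows were written before RCSI was enabled
ALTER DATABASE TestDb SET READ_COMMITTED_SNAPSHOT ON WITH ROLLBACK IMMEDIATE;

-- Page counts before the delete
SELECT index_id, alloc_unit_type_desc, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.BigTable'), NULL, NULL, 'SAMPLED');

-- Deleting rows stamps each ghost record with a 14-byte versioning tag,
-- which can force page splits on pages that were already full
DELETE FROM dbo.BigTable
WHERE CreateDate < '2018-01-01';

-- Page counts after the delete: they can come out higher than before
SELECT index_id, alloc_unit_type_desc, page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), OBJECT_ID('dbo.BigTable'), NULL, NULL, 'SAMPLED');
```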


Querying CosmosDB with PolyBase

Hasan Savran shows how you can query CosmosDB with T-SQL statements:

SQL Server Polybase Services lets us pull data from other data sources by using T-SQL queries. SQL Server 2019 introduces new connectors for Polybase services like Oracle, Teradata, and MongoDB. In one of the SQL Server 2019 presentations from Bob Ward, I saw the CosmosDB logo when he was talking about the new connectors of SQL Server 2019. CosmosDB already has an ODBC driver and you can use it as a data source for your Power BI, SSIS, or SSMS. SQL Server 2019 makes this connection easier by using the Polybase services.

In this post, I am going to show you how to configure SQL Server 2019 Polybase to connect to Azure CosmosDB Mongo databases.

I have a bit of a vested interest in this, so it is heartening to see people trying out PolyBase. It’s more heartening when they use it after they try it out.
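
The broad shape of that configuration in SQL Server 2019 looks something like the sketch below. The account, credential, table, and collection names are placeholders rather than Hasan's actual setup, so treat it as an outline, not a copy-paste script.

```sql
-- PolyBase must be installed and enabled on the instance
EXEC sp_configure 'polybase enabled', 1;
RECONFIGURE;

-- A database master key is required before creating a scoped credential
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<strong password>';

-- Credential holding the CosmosDB account name and key (Mongo API)
CREATE DATABASE SCOPED CREDENTIAL CosmosDbCredential
WITH IDENTITY = 'mycosmosaccount', SECRET = '<account key>';

-- External data source pointing at the CosmosDB MongoDB endpoint
CREATE EXTERNAL DATA SOURCE CosmosDbMongo
WITH
(
    LOCATION = 'mongodb://mycosmosaccount.documents.azure.com:10255',
    CREDENTIAL = CosmosDbCredential,
    CONNECTION_OPTIONS = 'ssl=true'
);

-- External table mapped to a collection; column names and types must line up with the documents
CREATE EXTERNAL TABLE dbo.Orders
(
    CustomerName NVARCHAR(200),
    OrderTotal FLOAT
)
WITH (LOCATION = 'SalesDb.Orders', DATA_SOURCE = CosmosDbMongo);

-- From here it's plain T-SQL
SELECT TOP (10) * FROM dbo.Orders;
```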


Deep Thoughts on SLAs

Ben Slater has a great post on service level agreements:

It should set the customer’s expectations, be realistic and be crystal clear, with no scope for misinterpretation. Well, that’s how they are meant to be. But unfortunately, in the quest for more sales, some MSPs tend to commit themselves to unrealistic SLAs. It’s tempting to buy into a service when an MSP offers you 100% availability. It is even more tempting when you see a compensation clause that gives you confidence in going ahead with that MSP. But hold on! Have you checked out the exclusion clauses? Have you checked out your responsibilities in order to get what you are entitled to in the SLA? Just as it is the MSP’s responsibility to define a crystal-clear SLA, it is the customer’s responsibility to thoroughly understand the SLA and be aware of every clause in it. That is how you will notice the shades of gray!

We have put together a list of things to look for in an SLA so that customers are aware of the nuances involved and avoid unpleasant surprises after signing on.

This is important to know when you’re using a managed service but it also works internally: as a developer or manager on a product that people (presumably) rely upon, how you deal with outages and issues matters a lot.


Making Stored Procedure Changes With Limited Downtime

I continue my series on database development in a (near) zero downtime environment:

Versioning a procedure is pretty simple: you create a new procedure with alterations you want. Corporate naming standards where I’m at have you add a number to the end of versioned procedures, so if you have dbo.SomeProcedure, the new version would be dbo.SomeProcedure01. Then, the next time you version, you’ll have dbo.SomeProcedure02 and so on. For frequently-changing procedures, you might get up to version 05 or 06, but in practice, you’re probably not making that many changes to a procedure’s signature. For example, looking at a directory with exactly 100 procedures in it, I see 7 with a number at the end. Two of those seven procedures are old versions of procedures I can’t drop quite yet, so that means that there are only five “unique” procedures that we’ve versioned in a code base which is two years old. Looking at a different part of the code with 879 stored procedures, 95 have been versioned at least once in the 15 or so years of that code base’s existence. The real number is a bit higher than that because we’ve renamed procedures over time and renamings tend to start the process over as we might go from dbo.SomeProcedure04 to dbo.SomeNewProcedure when we redesign underlying tables or make other drastic architectural changes.

The secret is, I’m always versioning.
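
To make the pattern concrete, here is a hedged sketch (the procedure, table, column, and parameter names are invented, not from the actual code base): the old signature stays in place while the new numbered version ships alongside it.

```sql
-- Existing procedure that callers currently reference
CREATE OR ALTER PROCEDURE dbo.SomeProcedure
    @CustomerID INT
AS
BEGIN
    SELECT c.CustomerID, c.CustomerName
    FROM dbo.Customer c
    WHERE c.CustomerID = @CustomerID;
END;
GO

-- New version with a changed signature, deployed alongside the old one
CREATE OR ALTER PROCEDURE dbo.SomeProcedure01
    @CustomerID INT,
    @IncludeInactive BIT = 0
AS
BEGIN
    SELECT c.CustomerID, c.CustomerName, c.IsActive
    FROM dbo.Customer c
    WHERE c.CustomerID = @CustomerID
      AND (c.IsActive = 1 OR @IncludeInactive = 1);
END;
GO

-- Callers migrate to dbo.SomeProcedure01 release by release;
-- once nothing references the old name, drop it:
-- DROP PROCEDURE dbo.SomeProcedure;
```

The old version keeps answering calls the whole time, which is what buys you the near-zero downtime.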


Power BI AutoML

Teo Lachev takes a look at AutoML in Power BI:

Let’s see how AutoML works based on what’s in the private preview (the usual disclaimer is that things will probably change). To start with, AutoML requires a dataflow (a note to Microsoft here is that AutoML will become more pervasive if it’s available in Power BI Desktop and it doesn’t require a premium capacity). In the private preview, AutoML requires the following steps. Presumably, the first (and most difficult) step, preparing the dataset and cleansing the data, is already done and available as a dataflow entity.

It looks like Microsoft’s taking what they learned from Azure ML and trying to port it over to Power BI.


Big SSAS News In SQL Server 2019 CTP 2.3

Chris Webb is excited about what’s in SQL Server 2019 CTP 2.3:

With the release of CTP 2.3 of SQL Server 2019 today there was big news for Analysis Services Tabular developers: Calculation Groups. You can read all about them in detail in this blog post:

https://blogs.msdn.microsoft.com/analysisservices/2019/03/01/whats-new-for-sql-server-2019-analysis-services-ctp-2-3/

In my opinion this is the most important new feature in DAX since… well, forever. It allows you to create a new type of calculation – which in most cases will be a time intelligence calculation like a year-to-date or a previous period growth – that can be applied to multiple measures; basically the same thing that we have been doing in SSAS Multidimensional for years with the time utility/shell/date tool dimension technique. It’s certainly going to solve a lot of problems for a lot of SSAS Tabular implementations, many of which have hundreds or even thousands of measures for every combination of base measure and calculation type needed.

Click through for more of Chris’s thoughts and how calculation groups will make your life easier.


The Pain of DST and the Lessened Pain with AT TIME ZONE

Bert Wagner shows how you can use AT TIME ZONE, available as of SQL Server 2016, to make dealing with Daylight Saving Time a little less painful:

The fallacy above is that I said our two datetime2’s are in UTC, but SQL Server doesn’t actually know this. The datetime2 (and datetime) datatype doesn’t allow for time zone offsets so SQL Server really doesn’t know what time zone the data is in.

Using AT TIME ZONE on a datetime2 without offset information causes SQL Server to “…[assume] that [the datetime] is in the target time zone”. That explains why the two datetime2s above, intended to be in UTC, are actually seen as Eastern Daylight Time by SQL Server.

Read the whole thing. Dates and times are a lot more difficult than they first appear. And then they turn out to be a lot more difficult than that.
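
As a quick illustrative sketch (not necessarily Bert's exact example; the times and zone names are made up), the trick is to declare the stored value to be UTC first and then convert, rather than applying the target zone directly:

```sql
DECLARE @StoredUtc DATETIME2 = '2019-03-10 06:30:00';  -- stored as UTC, but SQL Server doesn't know that

-- Misleading: SQL Server assumes the value is already Eastern time
SELECT @StoredUtc AT TIME ZONE 'Eastern Standard Time' AS AssumedEastern;

-- Intended: mark it as UTC first, then convert; the offset handles DST automatically
SELECT (@StoredUtc AT TIME ZONE 'UTC') AT TIME ZONE 'Eastern Standard Time' AS ConvertedEastern;
```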


Failing A PowerShell Step In SQL Agent

Stuart Moore shows us how we can get a SQL Agent job running a PowerShell step to recognize failure:

You might want the same response to make sure your monitoring is letting you know when jobs fail. On that note, it would also be nice if you could raise an error or failure message in your PowerShell step and have that propagate back up to SQL Server.

Unfortunately the usual scripting standbys of returning 0 or $false don’t work.

Stuart does have a solution, though, so read on to learn what it is.
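
Without spoiling the post, one commonly used pattern (sketched here with invented job, step, and script names, and not necessarily Stuart's exact approach) is to raise a terminating error from the PowerShell step, which the Agent subsystem treats as a step failure:

```sql
-- Hypothetical Agent job step: the embedded PowerShell re-throws any error
-- as a terminating error so the step (and the job) report failure
EXEC msdb.dbo.sp_add_jobstep
    @job_name = N'Nightly Export',
    @step_name = N'Run export script',
    @subsystem = N'PowerShell',
    @command = N'
        try {
            & "C:\Jobs\Export-Data.ps1"
        }
        catch {
            throw "Export failed: $($_.Exception.Message)"
        }',
    @on_fail_action = 2;  -- quit the job reporting failure
```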


Multi-Server Real-Time Scoring with R

David Smith points out a reference architecture for real-time scoring with R distributed through Kubernetes:

Let’s say you’ve developed a predictive model in R, and you want to embed predictions (scores) from that model into another application (like a mobile or Web app, or some automated service). If you expect a heavy load of requests, R running on a single server isn’t going to cut it: you’ll need some kind of distributed architecture with enough servers to handle the volume of requests in real time.

This reference architecture for real-time scoring with R, published in Microsoft Docs, describes a Kubernetes-based system to distribute the load to R sessions running in containers.

Looks interesting.
