Press "Enter" to skip to content

Curated SQL Posts

Overview: U-SQL Database Projects

Zach Stagers gives us an overview of the new U-SQL Database Project structure:

Source Control

The project integrates much more nicely with TFS than the older “U-SQL Project” does.

It actually gives you the icons (padlock, check mark, etc.) in the solution explorer, so it looks like it’s under source control!

Something that I’d really hoped had been fixed, but hasn’t: when copying and renaming an existing item, it doesn’t recognize the rename. You have to undo the checkout of the non-existent object (the copy, before being renamed).

Read on for more improvements.


The Intersection Of Multiple Averages Is The Empty Set

Adi Gaskell argues that we shouldn’t get too wrapped up in “average” behaviors:

I’ve written extensively about the tremendous potential for big data in healthcare to drive enormous changes in how we keep people healthy for longer. It goes without saying however that all data is not created equal, and just having a large sample is not always sufficient to get the best insights.

If we needed reminding, a reminder comes via a recent study from the University of California, Berkeley. It suggests that things like emotion, behavior, and physiology vary hugely between individuals, therefore having an average over a large dataset can still produce a ‘norm’ that is wide of the mark for individuals.

“If you want to know what individuals feel or how they become sick, you have to conduct research on individuals, not on groups,” the researchers say. “Diseases, mental disorders, emotions, and behaviors are expressed within individual people, over time. A snapshot of many people at one moment in time can’t capture these phenomena.”

Variance is important.


Building An Extended Events Session

Aamir Syed gives us a simple example of using the Extended Events UI to create a new session:

Many of us have not made the effort to switch from Profiler to Extended Events. It’s 2018; if you haven’t found a few hours to learn about this incredibly powerful tool, I urge you to do so now.

I’m going to provide a quick means of tracking queries with Extended Events. This is not an example of how comprehensive the tool is, but I hope that it at least spurs some interest.

One of the main reasons we use Profiler is to quickly capture some real-time data. I’m going to not only show you how to do that with Extended Events, but also how this same session can serve as a historical view, since it’s so easy to sift through and filter the data. (No, you don’t have to create a table for the result sets à la Profiler.)
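
If you’d rather script the session than build it through the UI, a minimal sketch of a query-tracking session might look like the following. The event, action, and target names are real Extended Events objects; the session name, duration filter, and file settings are illustrative assumptions, not Aamir’s exact setup:

    -- Capture completed batches and RPC calls, with enough context to be useful.
    CREATE EVENT SESSION [TrackQueries] ON SERVER
    ADD EVENT sqlserver.rpc_completed
    (
        ACTION (sqlserver.database_name, sqlserver.sql_text, sqlserver.username)
        WHERE duration > 1000000   -- microseconds: only calls over one second
    ),
    ADD EVENT sqlserver.sql_batch_completed
    (
        ACTION (sqlserver.database_name, sqlserver.sql_text, sqlserver.username)
        WHERE duration > 1000000
    )
    ADD TARGET package0.event_file   -- rolling history on disk; no table required
    (
        SET filename = N'TrackQueries.xel', max_file_size = 50
    )
    WITH (MAX_DISPATCH_LATENCY = 5 SECONDS);

    ALTER EVENT SESSION [TrackQueries] ON SERVER STATE = START;

Watching the live data feed in Management Studio gives you the Profiler-style real-time view, while the event_file target quietly accumulates the historical record.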

Click through for step-by-step instructions.


Azure SQL Database Service By Purchase Model

Glenn Berry explains the two purchase models available with Azure SQL Database, as well as the various service tiers within each model:

The older pricing option is the DTU-based SQL purchase model, where a fixed set of resources is assigned to the database from three service tiers, which are Basic, Standard, and Premium.

For Standard and Premium, there are multiple performance levels, which are classified according to how many Database Transaction Units (DTUs) they provide (along with their included storage and maximum available storage). The Premium tier is designed for I/O-intensive workloads, and is fault-tolerant.

The Database Transaction Unit (DTU) is based on a blended measure of CPU and memory, along with storage reads and writes. The DTU-based performance levels represent preconfigured bundles of compute, memory, and storage resources designed to drive different levels of application performance. If you do not want to worry about the underlying resources and prefer the simplicity of a preconfigured resource bundle while paying a fixed amount each month, you may find the DTU-based model more suitable for your needs and easier to understand.
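
As a practical aside (not from Glenn’s post): if you’re trying to judge whether a DTU tier fits an existing workload, the sys.dm_db_resource_stats DMV in Azure SQL Database reports recent resource consumption as a percentage of your current service objective’s limits:

    -- One row per 15-second interval, retained for roughly an hour.
    -- Values are percentages of the current service tier's limits.
    SELECT TOP (30)
        end_time,
        avg_cpu_percent,
        avg_data_io_percent,
        avg_log_write_percent
    FROM sys.dm_db_resource_stats
    ORDER BY end_time DESC;

If those percentages hover near 100, the workload has likely outgrown the tier; if they idle in the single digits, you may be paying for more than you need.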

Glenn does a good job clearing up some of the complications around pricing for Azure SQL Database.


Configuring An Azure Runbook For Index Maintenance

Jim Donahoe explains how to perform index and statistics maintenance for Azure SQL Database, where you don’t have SQL Agent available:

I had a lot of issues when I created my first one, and after discussing with some folks, they had the same issues. I searched for the best blog posts that I could find on the subject, and the one I LOVED the most was here: Arctic DBA. He broke it down so simply that I finally created my own pseudo-installer, and I wanted to share it with all of you. Please bear in mind that these code snippets may fail at any time due to changes in Azure.

**IMPORTANT**

These next steps assume the following:

  • You have created/configured your Azure Automation Account and the credential used to execute this runbook.

Read on for a reasonably short PowerShell script and a modified version of Ola Hallengren’s index maintenance procedures.
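
For context, the heart of such a runbook is ultimately a call to Ola Hallengren’s IndexOptimize procedure, along these lines. The parameters are real, but the values below are illustrative defaults rather than Jim’s modified version:

    -- Illustrative IndexOptimize call; thresholds are examples only.
    EXECUTE dbo.IndexOptimize
        @Databases = 'USER_DATABASES',
        @FragmentationLow = NULL,            -- below 5% fragmentation: leave alone
        @FragmentationMedium = 'INDEX_REORGANIZE,INDEX_REBUILD_ONLINE',
        @FragmentationHigh = 'INDEX_REBUILD_ONLINE',
        @FragmentationLevel1 = 5,
        @FragmentationLevel2 = 30,
        @UpdateStatistics = 'ALL';           -- refresh statistics as well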


Using The system_health Extended Event Session

Matthew McGiffen walks us through what the system_health Extended Events session gives us:

When Microsoft introduced Extended Events (XE) in 2008, they also gave us a built-in XE session called system_health.

This is a great little tool. I mainly use it for troubleshooting deadlocks as it logs all the information for any deadlocks that occur. No more having to mess about making sure specific trace flags are enabled to ensure deadlock information is captured in the error log.

It also captures the SQL text and Session Id (along with other relevant data) in a number of other scenarios you may need to troubleshoot:

  • Where an error over severity 20 is encountered

  • Where a session has waited on a latch for over 15 seconds

  • Where a session has waited on a lock for over 30 seconds

  • Sessions that have encountered other long waits (the threshold varies by wait type)
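
As an example of cashing in on that deadlock capture, a query along these lines (a common pattern, not taken from Matthew’s post) pulls the deadlock graphs back out of the session’s ring buffer target:

    -- Extract deadlock reports from the system_health ring buffer.
    SELECT
        xed.value('@timestamp', 'datetime2') AS event_time,
        xed.query('(data[@name="xml_report"]/value/deadlock)[1]') AS deadlock_graph
    FROM
    (
        SELECT CAST(st.target_data AS xml) AS target_data
        FROM sys.dm_xe_session_targets AS st
        INNER JOIN sys.dm_xe_sessions AS s
            ON s.address = st.event_session_address
        WHERE s.name = N'system_health'
          AND st.target_name = N'ring_buffer'
    ) AS src
    CROSS APPLY src.target_data.nodes
        ('RingBufferTarget/event[@name="xml_deadlock_report"]') AS x(xed);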

Simply knowing what this session includes can give you a leg up on troubleshooting, especially when it’s a machine you haven’t seen before.


Wanted: Per-Database Wait Stat Collection Built In

Erik Darling wants configurable wait stat collections on a database level built into SQL Server:

I’m hoping that a feature like this could solve some intermediate problems that Query Store doesn’t.

Namely, being lower overhead, not collecting any PII, and not taking up a lot of disk space — after all, we’re not storing any massive stored proc text or query plans, here, just snapshots of wait stats.

This will help even if you’re already logging wait stats on your own. You still don’t have a clear picture of which database the problem is coming from. If you’ve got a server with lots of databases on it, figuring that out can be tough.

Understanding what waits (and perhaps bottlenecks) a single database is experiencing can also help admins figure out what kind of instance size they’d need as part of a migration.
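
Until something like that ships, the roll-your-own logging Erik mentions usually amounts to snapshotting the instance-wide DMV on a schedule and diffing the snapshots. A hedged sketch (the table and column names here are mine):

    -- Snapshot instance-wide waits so they can be compared over time.
    -- Note the limitation Erik calls out: sys.dm_os_wait_stats cannot
    -- attribute any of these waits to a particular database.
    CREATE TABLE dbo.WaitStatsSnapshot
    (
        capture_time   datetime2    NOT NULL DEFAULT SYSUTCDATETIME(),
        wait_type      nvarchar(60) NOT NULL,
        waiting_tasks  bigint       NOT NULL,
        wait_time_ms   bigint       NOT NULL,
        signal_wait_ms bigint       NOT NULL
    );

    INSERT dbo.WaitStatsSnapshot (wait_type, waiting_tasks, wait_time_ms, signal_wait_ms)
    SELECT wait_type, waiting_tasks_count, wait_time_ms, signal_wait_time_ms
    FROM sys.dm_os_wait_stats
    WHERE wait_time_ms > 0;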

It’s an interesting approach.  If you agree with Erik, go vote.


Performance Test: Loading CSV Versus Loading Excel In Power Query

Chris Webb lays out a performance test which shows how quickly Power Query can read data from a CSV versus from an Excel spreadsheet:

The black line in the graph above is the amount of data read (actually the offset values showing where in the file the data is read from, which is the same thing as a running total when Power Query is reading all the data) from the Excel file; the green line is the amount of data read from the CSV file (the same data shown in the first graph above). A few things to mention:

  • Running Process Monitor while this second query was refreshing had a noticeable impact on its performance – in fact it was almost 20 seconds slower

  • The initial values of 80 million bytes seem to be where data is read from the end of the Excel file. Maybe this is Power Query reading some file metadata? Anyway, it seems as though it takes 5 seconds before it starts to read the data needed by the query.

  • There’s a plateau between the 10 and 20 second mark where not much is happening; this didn’t happen consistently and may have been connected to the fact that Process Monitor was running

The results were remarkable; check them out.


Highlighting Data With gghighlight

Laura Ellis shows off the gghighlight package, which allows you to selectively highlight certain sets of data in ggplot:

While the above methodology is quite easy, it can be a bit of a pain at times to create and add the new data frame.  Further, you have to tinker more with the labelling to really call out the highlighted data points.

Thanks to Hiroaki Yutani, we now have the gghighlight package, which does most of the work for us with a small function call! Please note that a lot of this code was created by looking at examples in her introduction document.

The new school way is even simpler:

  1. Using ggplot2, create a plot with your full data set.

  2. Add the gghighlight() function to your plot with the conditions set to identify your subset.

  3. Celebrate! This was one less step AND we got labels!

That’s a very cool package.  H/T R-Bloggers


Stoppable, Async Shiny Interfaces

Ian at Fells Stats wants to make a long-running Shiny app a bit more user-friendly:

Shiny operates in a reactive programming framework. Fundamentally, this means that any time any UI element that affects the result changes, so does the result. This happens automatically, with your analysis code running every time a widget is changed. In a lot of cases this is exactly what you want, and it makes Shiny programs concise and easy to make; however, in the case of long-running processes, this can lead to frozen UI elements and a frustrating user experience.

The easiest solution is to use an Action Button and only run the analysis code when the action button is clicked. Another important component is to provide your user with feedback as to how long the analysis is going to take. Shiny has nice built-in progress indicators that allow you to do this.

There are a couple of false starts in there but by the time you reach the third act, the story makes sense.  H/T R-Bloggers
