Press "Enter" to skip to content

Curated SQL Posts

Normalization Code Smells

Erik Darling has a few tips to see if you’re not properly normalizing your tables:

First Sign Of Problems: Prefixed Columns

Do you have columns with similar prefixes?

If you have naming patterns like this, it’s time to look at splitting those columns out.

I took the Users and Posts tables from Stack Overflow and mangled them a bit to look like this.

You may not have tables with this explicit arrangement, but it could be implied all over the place.

One great way to tell is to look at your indexes. If certain groups of columns are always indexed together, or if there are lots of missing index requests for certain groups of columns, it may be time to look at splitting them out into different tables.

These things tend to happen, but they can have serious negative consequences for database performance, not to mention the risk of bad data sneaking in.

Comments closed

HA/DR With Azure SQL Database

James Serra explains the High Availability and Disaster Recovery options available when using Azure SQL Database:

High Availability (HA) – Keeping your database up 100% of the time with no data loss during common problems.  Redundancy at system level, focus on failover, addresses single predictable failure, focus is on technology.  SQL Server IaaS would handle this with:

  • Always On Failover cluster instances
  • Always On Availability Groups (in same Azure region)
  • SQL Server data files in Azure

Disaster Recovery (DR) – Protection if major disaster or unusual failure wipes out your database.  Use of alternate site, focus on re-establishing services, addresses multiple failures, includes people and processes to execute recovery.  Usually includes HA also.  SQL Server IaaS would handle this with:

  • Log Shipping

  • Database Mirroring

  • Always On Availability Groups (different Azure regions)

  • Backup to Azure

Click through for more details.

Comments closed

Customizing Live Extended Events Data

Grant Fritchey shows how to make the Extended Events UI a lot more useful:

WHAT!!!!?!!

You mean I can make that upper window useful? Yes. Not only that, but, what ever columns you add to the upper window, your copy of Management Studio remembers what you did. The next time you open this Extended Events session in the the Live Data window, the columns you selected will be there. However, your ability to customize Live Data is even easier than that. If you notice, in the upper part of your SSMS window is the Live Data tool bar. Near the end is a button innocuously titled “Choose Columns.” Click on that and a window similar to this will open:

And you can share the results with others.

Comments closed

Read Event Stream Function Missing When Viewing Extended Events File

Tibor Karaszi points out an annoying bug which can happen when viewing an Extended Events file:

Here’s the relevant text in the message box, to facilitate searching:

The storage failed to initialize using the provided parameters. (Microsoft.SqlServer.XEventStorage)

Cannot view the function ‘fn_MSXe_read_event_stream’, because it does not exist or you do not have permission. (Microsoft SQL Server, Error: 15151)

Read on for the solution, and if you’re interested in changing that behavior, vote up the UserVoice submission.

Comments closed

Examples Of Charts In Different Languages

David Smith points out a great repository of information on generating different types of charts in different libraries:

The visualization tools include applications like Excel, Power BI and Tableau; languages and libraries including R, Stata, and Python’s matplotlib); and frameworks like D3. The data visualizations range from the standard to the esoteric, and follow the taxonomy of the book Data Visualisation (also by Andy Kirk). The chart categories are color coded by row: categorical (including bar charts, dot plots); hierarchical (donut charts, treemaps); relational (scatterplots, sankey diagrams); temporal (line charts, stream graphs) and spatial (choropleths, cartograms).

Check out the Chartmaker Directory.

Comments closed

Using MLFlow For Binary Classification In Keras

Jules Damji walks us through classifying movie reviews as positive or negative reviews, building a neural network via Keras on MLFlow along the way:

François’s code example employs this Keras network architectural choice for binary classification. It comprises of three Dense layers: one hidden layer (16 units), one input layer (16 units), and one output layer (1 unit), as show in the diagram. “A hidden unit is a dimension in the representation space of the layer,” Chollet writes, where 16 is adequate for this problem space; for complex problems, like image classification, we can always bump up the units or add hidden layers to experiment and observe its effect on accuracy and loss metrics (which we shall do in the experiments below).

While the input and hidden layers use relu as an activation function, the final output layer uses sigmoid, to squash its results into probabilities between [0, 1]. Anything closer to 1 suggests positive, while something below 0.5 can indicate negative.

With this recommended baseline architecture, we train our base model and log all the parameters, metrics, and artifacts. This snippet code, from module models_nn.py, creates a stack of dense layers as depicted in the diagram above.

The overall accuracy is pretty good—I ran through a sample of 2K reviews from the set with Naive Bayes last night for a presentation and got 81% accuracy, so the neural network getting 93% isn’t too surprising.  Seeing the confusion matrix in this demo would have been a nice addition.

Comments closed

Power BI Versus Power BI Report Server

Lars Bouwens contrasts the normal Power BI service with Power BI Report Server:

The differences between Power BI Service and Power BI Report Server are well documented in the Planning a Power BI Enterprise Deployment whitepaper and the online documentation. The whitepaper by Chris Webb and Melissa Coates also includes a comparison between Power BI Service and Power BI Report Server (Jun 2017 version). Because this is a lot of information, I’ve summarized the most important topics.

Power BI Report Server – two options
Power BI Report Server is the Power BI on-premises alternative to the cloud-based Power BI Service. There are two ways of acquiring Power BI Report Server: 1) Purchase a Power BI Premium Subscription or 2) by making use of your SQL Server Enterprise Software Assurance license.

I use Power BI Report Server.  It’s not perfect, but it does what I need it to do and it doesn’t cost the company anything extra (due to Enterprise Edition + Software Assurance).

Comments closed

Building A URL With DAX

Kasper de Jonge shows us how to generate a hyperlink in DAX based on filters in a report:

Thanks to two recent Power BI features it now possible to generate a link on the fly using DAX to go to a new report and pass in any filters. (Alsop imagine linking to your favorite SSRS report !)

In this blog post we will create a DAX formula that generate a hyperlink based on the filters in the report. The measure contains 2 parts, the first part is generating the right url to the target report using DAX and the second part is passing the filters to that report.

We will be using variables a lot in these expressions, more on variables in DAX can be found here: https://msdn.microsoft.com/en-us/query-bi/dax/var-dax

Read on for an example.

Comments closed

Confirmation In Powershell

Shane O’Neill shows off how easy it is to add confirmation checkpoints into Powershell code:

Finally it works… BUT…

  • It’s 26 lines long for this piece of code. If we have multiple then this is going to blow up size wise, and
  • There’s definitely more but I think I’ve gotten the point across (or I hope I have).

So let’s try and use what’s built in to PowerShell.

The built-in version is 3 lines of code and provides more functionality.  You probably want to use that version; click through to see this all in action.

Comments closed

Finding The First Non-NULL Value In A Window

Bert Wagner shows off the FIRST_VALUE window function and walks us through a case it struggles with:

The SQL Server FIRST_VALUE function makes it easy to return the “first value in an ordered set of values.”

The problem is that if that first value happens to be a NULL, there is no easy, built-in way to skip it.

While a UserVoice item exists to add the ability to ignore nulls (go vote!), today, we’re going accomplish that end result with some alternative queries.

Click through for the demo, as well as a video version of the post.

Comments closed