Press "Enter" to skip to content

Curated SQL Posts

Error Handling in Scala

Ashish Chaudhary gives us three different ways of handling errors in Scala:

Error handling is the process of handling the possibility of failure. For example, failing to read a file and then continuing to use that bad input would clearly be problematic. Noticing and explicitly managing these errors saves the rest of the program from various pitfalls.

Exceptions in Scala work the same way as in C++ or Java. When an exception occurs, say an Arithmetic Exception then the current operation is aborted, and the runtime system looks for an exception handler that can accept an Arithmetic Exception. Control resumes with the innermost such handler. If no such handler exists, the program terminates.

Or, another way to put it is, structural programming with try/catch or functional programming via monads (Option and Either).

Comments closed

Unique Constraints vs Unique Indexes

Erik Darling calls out unique key constraints:

I do love appropriately applied uniqueness. It can be helpful not just for keeping bad data out, but also help the optimizer reason about how many rows might qualify when you join or filter on that data.

The thing is, I disagree a little bit with how most people set them up, which is by creating a unique constraint.

Data modeling Kevin wants to use unique key constraints because that’s the correct thing to do. Implementation Kevin uses unique nonclustered indexes for the reasons Erik describes. Not mentioned in Erik’s post but potentially relevant is that operations on unique nonclustered indexes can be done online, whereas unique key constraint operations (creation and alteration via drop+create) are offline.

Comments closed

Tips on using Included Columns on Indexes

Chad Callihan shares some advice:

In my previous blog post, we saw how using INCLUDE to add a column to an index can make a difference compared to a key column. Let’s do a quick overview of INCLUDE and when it should be used.

Included columns are columns that can added to an index as non-key columns. They are only added to the leaf nodes of an index and have a bit more flexibility. Having trouble adding a particular data type to an index? Included columns can be data types unable to be added as key columns. Are you possibly maxed out on index key columns? Use INCLUDE to add any necessary columns.

Read on for an example and note the warning that you shouldn’t just add all of the columns to the INCLUDE clause.

Comments closed

Create and Update Columns on Tables

Kenneth Fisher adds four columns to a table:

One of the easiest ways to collect information about table activity is to add a series of audit columns to the table. The most common set of column consists of four columns.

– When was the row created?

– Who created it?

– When was the row last updated?

– Who last updated it?

Read on to see how to create these, but definitely read the comments. Joe Celko has a good reason to avoid this style, and there’s another good reason as well: if you update the columns and the updated values are longer than what they replace, you can end up generating a lot of page splits as full pages get broken up to handle those in-page updates.

If you do need something like this, I’m growing more and more fond of an activity log table which is append-only and tracks these operations separately. That removes the page splits, allows you to deal with deletions, and gives a better idea of what happened over time for a third party. It does come at a cost if you frequently need to show the create and last updated dates (or users), but my experience has been that a vast majority of the time, we include those for internal purposes, not to display to users.

Comments closed

The Alt Key in SSMS

Barney Lawrence shows off vertical selection in SQL Server Management Studio:

I went years without knowing how useful the Alt key was, it’s not well documented but I can guarantee that once you know about it and give it a little practice you won’t be able to live without it.

While I’m filing this under SSMS Tips and Tricks but it works equally in Visual Studio, VS Code, Azure Data Studio and even Notepad ++ (but not plain old vanilla notepad). It’s worth a test in other places too.

As a quick note, it does not work quite the same way in Visual Studio Code or Azure Data Studio as in Visual Studio or SSMS: Alt+Shift+Down arrow copies the current row and pastes it in the row below. Holding Alt and Shift while using the mouse does work, but if you’re big on keyboard commands, you’ll be a bit disappointed.

Comments closed

SCD Type 2 with Delta Lake

Chris Williams continues a series on slowly changing dimensions in Delta Lake:

Type 2 SCD is probably one of the most common examples to easily preserve history in a dimension table and is commonly used throughout any Data Warehousing/Modelling architecture. Active rows can be indicated with a boolean flag or a start and end date. In this example from the table above, all active rows can be displayed simply by returning a query where the end date is null.

Read on to see how you can implement this pattern using Delta Lake’s capabilities.

Comments closed

Rolling Means with MazamaRollUtils

Jonathan Callahan has an interesting R package for us:

The initial release of MazmaRollUtils provides all the basic rolling functions with features like alignment and missing value removal along with additional capabilities for smoothing, damping and outlier detection — all common activities in time series analysis.

Click through for an explanation of the process, and then check out the package itself on GitHub. H/T R-Bloggers.

Comments closed

Tools Are Not Documentation

Ray Kim has a good reminder for us:

I spoke to the form’s owner (this is where the political part comes in). I explained what I was doing. However, he keeps insisting: “it doesn’t have to be documented, because the form is the documentation.”

He showed me a screen shot of text on the form that explains how the particular request works. The text made a lot of sense, and it would have been ideal to fulfill at least part (if not most) of my needs. I decided that I would create a reference to it. So I looked around the form for it… and could not find it anywhere.

He finally told me that “you had to click a specific button on the form to view the text.”

If you’ve developed a process or product, it’s easy to forget that you have a fuller mental model than anybody else, and so things which are blindingly obvious to you probably aren’t to users.

Comments closed

Displaying Blob Storage-Based Images in Power Apps

Paulina Nowinska has a tutorial for us:

Today, I explain how to create a simple app in Microsoft Power Apps where:

– the data are located in Excel,

– the table contains the path to the images from public Azure Blob Storage,

– the app displays images directly from Blob Storage based on the path defined in the database (Excel file).

If you haven’t used Power Apps before, I recommend checking it out. It’s not perfectly intuitive, but it does offer a much lower-code experience than classic app development.

1 Comment

Using Powershell in Azure Cloud Shell

Hope Foley shows how you can set up Powershell to be your Azure Cloud Shell language of choice:

Part of my job is doing POCs with customers to help with Azure Data Services.  Anything that helps me move quicker is helpful so I’m a tad bit obsessed with automating things.  I have used PowerShell for more years than I’m willing to admit to help me automate what I can.  There are a lot of ways to automate things like ARM templates and DevOps, but PoSH has been my preferred hammer.  As much as I love it, I’ve ran into issue sometimes with installing modules locally on folks machines and not to mention if they have a Mac.  I wondered recently if Azure Cloud Shell would help make things easier, and it very much did and I’m super pumped to share!  This post will help run through how to get setup to run PowerShell scripts in Azure.

For people who prefer Powershell to bash, check it out.

Comments closed