Press "Enter" to skip to content

Author: Kevin Feasel

What’s New with Delta Lake

Denny Lee and Tathagata Das announce Delta Lake 0.5.0:

With the following pull requests, you can now run even more Delta Lake operations concurrently. With finer grain conflict detection, these updates make it easier to run complex workflows on Delta tables such as:

– Running deletes (e.g. for GDPR compliance) concurrently on older partitions while newer partitions are being appended.
– Running file compactions concurrently with appends.
– Running updates and merges concurrently on disjoint sets of partitions.

Click through for the full changelog.

Comments closed

Chaos Engineering with SQL Server

Andrew Pruski is excited about Chaos Engineering:

Chaos Engineering is a practice in which we run experiments against a system to see if it reacts the way we expect it to when it encounters a failure.

We’re not trying to break things here…Chaos Engineering is not breaking things in production.

If I said to my boss that we’re now going to be experiencing an increased amount of outages because, “I AM AN ENGINEER OF CHAOS”, I’d be marched out the front door pretty quickly.

On the plus side, we will know Andrew’s supervillain origin story.

2 Comments

The Cost of Sorting in Stored Procedures

Monica Rathbun wants us to think about whether we really need that ORDER BY clause:

We know that sorting can be one of the most expensive things in an execution plan as shown below. However,  we continue to do ORDER BYs repeatedly. Yes, I 100% agree that there is a need to sort a results set and that this should be done in the procedure for good reason, but my concern is having  multiple sorts, erroneous sorts, and the sorts that can be done elsewhere. These are the ones that waste resources and can stifle performance.

Click through for a demo showing that this does make a difference.

Comments closed

An Intro to Power BI Premium

Gilbert Quevauvilliers gives us an overview of what Power BI Premium is and what you need to know before using it:

A great place to start is to first explain “What is Power BI Premium?”

At the very basic level Power BI Premium allows you to buy dedicated capacity with additional features. 

Read on to see what that means, what you get with Premium, and how it can benefit your organization. Gilbert also covers the pricing model, which is important because this isn’t cheap.

Comments closed

What Uses tempdb?

Jason Hall takes us through several SQL Server features which use tempdb behind the scenes:

Since SQL Server 2005, triggers use the version store, which is kept in tempdb. So, if you’re using triggers, they are implicitly using tempdb. Remember that triggers are set-based, and you’ll get version data for every affected row. For really big batches, this can turn out to be quite a bit of version data.

Click through for several more examples.

Comments closed

Ghost Records and Availability Groups

Aaron Bertrand ran into an interesting problem:

I recently came across an issue in a SQL Server Availability Group scenario where queries against a heavily-used queue table were taking longer and longer over time. The symptoms of the query were that logical reads were increasing rapidly, and we ultimately tracked it down to ghost records (and version ghost records) that were being created as rows from the queue table were consumed and deleted. Because the database was being used in a readable secondary, the ghost cleanup process simply wasn’t able to keep up with the volume of deletes against the table. The first workaround implemented was to suspend the queue consumers and rebuild the table.

Read on to see how Aaron used filtered indexes and soft deletes to mitigate some of the pain.

Comments closed

A Script for Peer Reviews of Code

Paul Andrew shares with us a script which is useful for peer reviewing code before check-in:

Does the code include good comments? Things that explain the reason why logic has been implemented to assist future developers looking at the same code.

All to often I see code comments written that just translate from code to English and tell me what is happening. What should be fairly obvious to the reviewer as they read the code. Why is so much more important.

It’s a 20-point checklist, but worth reviewing and adapting for your own purposes.

Comments closed

R for Systems Administration

Ian Flores Siaca says hey, why not use R for systems administration:

jrDroplet is a package designed specifically to manage Virtual Machines in Digital Ocean for our training courses. The idea is that with a single line we are able to create a Digital Ocean droplet with the packages installed for our courses, hiding all of the background complexities related to infrastructure.

It’s an interesting use for this DSL.

Comments closed

An Introduction to gRPC

Munander Singh explains why you might want to use gRPC for cross-process communications:

When the client invokes the service, the client-side gRPC library uses the protocol buffer and marshals the remote procedure call, which is then sent over HTTP2. Server un-marshal the request and executes the respective procedure invocation using protocol buffers. The response follows a similar execution flow from the server to the client.

The main advantage of developing services and clients with gRPC is that your service code or client side code doesn’t need to worry about parsing JSON or similar text-based message formats (within the code or implicitly inside the underlying libraries such as Jackson, which is hidden from service code). What comes in the wire is a binary format, which is unmarshalled into an object. Also, having first-class support for defining a service interface via an IDL is a powerful feature when we have to deal with multiple microservices and ensure and maintain interoperability.

The “yes, but” here is that based on your language of choice, gRPC can be a bit tricky to get going.

Comments closed

Moving tempdb on a SQL Server Instance

Drew Skwiers-Koballa gives us the step-by-step process for moving tempdb from one folder to another on a machine:

Not only can the size of TempDB files be unpredictable (unless the workload is completely predictable or a size limit is placed), but it is full of old Tupperware. That is, if TempDB is destroyed, your SQL Server will create a new one as soon as the service restarts. The whole migration is summarized in these 4 steps:

1. Create a new location for TempDB
2. Use TSQL to change the TempDB file location(s)
3. Restart the SQL Server service during a maintenance window
4. Verify and clean up

This is one of the easier things to move, but it does require server downtime.

Comments closed