Press "Enter" to skip to content

Curated SQL Posts

R Checkpoint Package Update Now in Beta

Hong Ooi announces that a revamp of the checkpoint package is now in beta:

Checkpoint has been around for nearly 6 years now, helping R users solve the reproducible research puzzle. In that time, it’s seen many changes, new features, and, inevitably, bug reports. Some of these bugs have been fixed, while others remain outstanding in the too-hard basket.

Many of these issues spring from the fact that it uses only base R functions, in particular install.packages, to do its work. The problem is that install.packages is meant for interactive use, and as an API, is very limited. For starters, it doesn’t return a result to the caller—instead, checkpoint has to capture and parse the printed output to determine whether the installation succeeded. This causes a host of problems, since the printout will vary based on how R is configured. Similarly, install.packages refuses to install a package if it’s in use, which means checkpoint must unload it first—an imperfect and error-prone process at best.

In addition to these, checkpoint’s age means that it has accumulated a significant amount of technical debt over the years. For example, there is still code to handle ancient versions of R that couldn’t use HTTPS, even though the MRAN site (in line with security best practice) now accepts HTTPS connections only.

Click through to see what’s in the new checkpoint package.

Comments closed

Fun with tempdb

Andy Mallon walks us through setting up tempdb:

There are three problems I’ve got to fix. I need to (1) remove those two extra files, (2) grow the tempdb log file, and (3) even out the size of the data files (and shrink them a little to make room for the larger log file. We’re going to tackle these in the reverse order than I listed them–partially out of necessity, and partially because it’s going to be easier.

Click through to see how Andy sets up tempdb. This is a good way to set up tempdb.

Comments closed

The Pain of Code Noise

Chris Johnson talks about a concept dear to me:

Basically code noise is anything that pulls your attention away from what the code is supposed to be doing, or obscures the true nature of the code in some way. It’s not something we consider enough when writing T-SQL code, but I think there is a lot to be said for writing code the next person will be able to read.

As a small example, I was debugging something recently and found that all of the insert statements had ORDER BY clauses. I couldn’t work out why these were making me so angry, after all it’s not doing anything to hurt performance, and in fact isn’t doing anything at all, until one of the other devs in the office pointed out that it’s one example of the code noise that the whole code base is filled with.

Chris provides us a couple examples of noise. My bottom line on this is, develop to the minimum required standards of what the computer needs (i.e., accurate data, fast enough, etc.) and give the humans maintaining the code a fighting chance. Spend more time making it easy for humans and make everybody’s life easier.

Comments closed

Ownership Chaining in SQL Server

K. Brian Kelley walks us through the concept of ownership chaining in SQL Server:

Ownership chaining is a security feature in SQL Server which occurs when all of the following conditions are true:

– A user (which could be an app through a login/service account) tries to access an object that makes a reference to another object. For instance, the user tries to execute a stored procedure that accesses other objects or a SELECT from a view that accesses other tables.
– The user has access to the first object, such as EXECUTE rights on the stored procedure or SELECT rights on the view.
– Both objects have the same owner.

In this case, SQL Server will see the chain between the object the user called and the object being referenced. SQL Server will also determine that the owner for both objects is the same. When those conditions are met, SQL Server will create the ownership chain.

Read on for an in-depth example of ownership chaining and how it solves certain problems around managing database rights.

Comments closed

Tempdb Issues You Might Have

Fabiano Amorim walks through a few of the tricky issues around tempdb:

One of many performance improvements that came with SQL Server 2014 is that it doesn’t flush dirty pages created in a minimally logged operation on tempdb. This gives you the benefit of having faster (compared to prior versions) inserts, but it caused another problem as those allocated pages may take a lot of time to be removed from the buffer pool data cache. Before discussing the problem, quickly look at the benefit and then understand some important concepts of flush dirty pages on tempdb.

Click through to learn more about the fix, and then a fix to the fix…which opened up a new avenue to fix. This kind of thing is why operating at scale is so difficult: the solution to one problem often becomes the avenue to a new problem.

Comments closed

Refreshing Selected Tables in Power BI Desktop

Gilbert Quevauvilliers answers one of the great mysteries in life:

When working with Power BI datasets I often want to refresh multiple tables, BUT not all the tables in my model. (This often happens when I made a change to underlying data sources and I want to make sure it is working as expected in Power BI Desktop)

I recently found that I could do it by doing the following steps below.

When you have one or two big tables and you don’t make any changes to those tables, it can be a pain waiting for them to refresh. This is a great alternative.

Comments closed

Operations Testing with Pester

Sheldon Hull takes us through using Pester to automate operations tasks:

In my example, let’s start small and say you just have PowerShell, and some servers.

What I’ve discovered is that to actual validate DevOps oriented work is completed, you typically go through the equivalent of what a Cucumber test would have. This “checklist” of validations is often manually performed, lacking consistency and the ability to scale or repeat with minimal effort.

Consider an alternative approach to helping solve this issue, and expanding your ability to automate the tedious testing and validation of changes made.

Read on for an example as well as some additional thoughts from Sheldon.

Comments closed

Memory Management in Flink 1.10

Andrey Zagrebin walks us through some memory management improvements in the most recent version of Apache Flink:

Apache Flink 1.10 comes with significant changes to the memory model of the Task Managers and configuration options for your Flink applications. These recently-introduced changes make Flink more adaptable to all kinds of deployment environments (e.g. Kubernetes, Yarn, Mesos), providing strict control over its memory consumption. In this post, we describe Flink’s memory model, as it stands in Flink 1.10, how to set up and manage memory consumption of your Flink applications and the recent changes the community implemented in the latest Apache Flink release.

Click through to learn about the current model and methods to control memory utilization.

Comments closed

Calculation Groups in Analysis Services and Power BI

Kasper de Jonge walks us through calculation groups:

In most cases the business wants to see some common calculations done over these measures like YTD, QTD, YoY% etc. That means you must make a new measure for each of the calculation over the base measure. This leads to measure explosion in your model as you need to add a new measure for each variation. Very quickly this leads to not four measures but sixteen and more creating a management headache.

In come calculation groups. With calculation groups you can create a group that can apply common calculation over base measures. It’s hard to explain so let’s just dive in on how that works.

Click through for a demo of how to enable and use calculation groups.

Comments closed

Getting a Better Windows Terminal

Sheldon Hull takes us through the Windows Terminal experience:

I’ve long been a Cmder/ConEmu user for Windows as it’s provided a much-needed improvement of the standard Windows terminal.

I’ve started trying to use the newer Windows Terminal project to benefit from the improved performance and support, and found getting it up and running with any customizations was a little time consuming and confusing. This wasn’t something I’d hand off to someone who wasn’t looking for experimentation.

Click through for a few scripts and screenshots.

Comments closed