Press "Enter" to skip to content

Author: Kevin Feasel

The Basics of tSQLt

Jess Pomfret walks us through the basics of tSQLt:

Getting started with tSQLt is really easy- you download a zip file, unzip the contents and then run the tSQLt.class.sql script against your development database.

There are a couple of requirements, including CLR integration must be enabled and your database must be set to trustworthy.  These could open up some security considerations, but for my development environment it’s no issue.

This is where I’d say putting the database in a container would be extremely helpful, as then you can destroy it afterward. I’m not sure if that’d work, as SQL Server on Linux doesn’t support unsafe or external access assemblies (and I’m not sure what tSQLt requires there).

Comments closed

Using Trello as a Power BI Data Source

Gilbert Quevauvilliers shows how we can use Trello as an input for Power BI:

Where I am consulting, they use Trello boards which enables them to keep track of what tasks are being done, getting done and how things are progressing.

An interesting question came up asking how much work has been done. And I thought this could be done via Trello and looking at the number of tasks in the boards that have gone from To Do to Completed.

Below are the steps of how I completed this.

It’s a mashable world.

Comments closed

Installing SSMS on Servers Running SQL Server?

Andy Mallon says yes, install SQL Server Management Studio on those servers running SQL Server instances:

“But wait, Andy. That’s not a best practice!” you say?

The pseudo best practice of “don’t install SSMS” is a misguided one–advice that I even fell into repeating in the past. However, that’s actually proposed solution to a best practice, rather than being itself a best practice.

I agree with Andy wholeheartedly on this.

Comments closed

The Importance of Unit Testing Database Code

Chris Johnson shares some thoughts on unit testing database code:

This is a topic that is quite close to me heart. I don’t come from a computing background before I started working with SQL Server, so I was quite ignorant when it came to a lot of best practices that other developers who have worked with other languages are aware of. Because of this, I had no idea about unit testing until I attended a talk at a SQL Saturday all about tSQLt. If anyone isn’t aware (as I wasn’t) tSQLt is a free open source unit testing framework for use in SQL Server databases. It is the basis of Redgate’s SQL Test software, and is the most used framework for writing unit tests in SQL Server.

Since then I’ve worked to try and get employers to adopt this as part of a standard development life cycle, with mixed success at best. My current employer is quite keen, but there are two major problems. First, we have a huge amount of legacy code that obviously has no unit tests in place; and second, the way people code is not conducive to unit testing.

Click through for additional thoughts on writing good tests and an example of modularizing code to make it more testable. I’m still in the camp of “test what you can, but you can’t test everything” with databases. There’s just too much state dependency.

Comments closed

Changing the Graphics Device in RMarkdown Docs

Colin Gillespie shows us how to change PDF and PNG output settings within knitr:

In many workflows, function calls to graphic devices are not explicit. Instead, the call is made by another package, such as knitr.

When kniting an Rmarkdown document, the default graphics device when creating PDF documents is grDevices::pdf() and for HTML documents it’s grDevices::png(). As we demostrated, these are the worst possible choices!

Click through to see what you can do about it.

Comments closed

Tidy Simulation of Stochastic Processes in R

David Robinson shows off my favorite distribution:

The Riddler puzzle describes a Poisson process, which is one of the most important stochastic processes. A Poisson process models the intuitive concept of “an event is equally likely to happen at any moment.” It’s named because the number of events occurring in a time interval of length is distributed according to , for some rate parameter (for this puzzle, the rate is described as one per day, ).

How can we simulate a Poisson process? This is an important connection between distributions. The waiting time for the next event in a Poisson process has an exponential distribution, which can be simulated with rexp().

Read on to learn about the Poisson distribution and Yule processes.

Comments closed

Clarifying Nomenclature around Azure Synapse Analytics

James Serra clears a few things up:

I see a lot of confusion among many people on what features are available today in Azure Synapse Analytics (formally called Azure SQL Data Warehouse) and what features are coming in the future. Below is a picture (click to zoom) that I describe below that hopefully clears things up:

I tend to just say “Azure Synapse Analytics SQL Pools” for the product formerly known as Azure SQL Data Warehouse and save “Azure Synapse Analytics” to include Spark + hyperscale (James’s v3).

Comments closed

Why Unit Testing in the Database Is Tough

Rob Farley talks about a couple of reasons why database unit testing can be difficult to do:

Hamish wants to develop a conversation about unit testing within database because he recognises that the lack of unit testing is a significant problem. It’s quite commonplace in the world of iterative code, of C#, Java, and those kinds of languages, but a lot less commonplace in the world of data. I’m going to look at two of the reasons why I think this is.

Read Rob’s thoughts in their entirety. I fully agree that we need to test, but get wishy-washy on the topic of automated testing. The reason for that is that tooling is quite limited, and many of those limitations are inherent limits in the database platform itself. For the types of things you most need to test (like hefty stored procedures), the number of test cases spirals out of control quickly. And unlike functional or structured programming languages, T-SQL performance gets markedly worse as you modularize, which makes it so difficult to get down to an easily testable block of code.

Comments closed

Incremental Refresh with Power BI

Chris Webb talks about a special use case for Power BI incremental refresh:

Power BI incremental refresh is a very powerful feature and now it’s available in Shared capacity (not just Premium) everyone can use it. It’s designed for scenarios where you have a data warehouse running on a relational database but with a little thought you can make it do all kinds of other interesting things; Miguel Escobar’s recent blog post on how to use incremental refresh for files in a folder is a great example of this. In this post I’m going to show you how to use incremental refresh to solve another very common problem – namely how to get Power BI to keep the data that’s already in your dataset and add new data to it.

Click through for the details.

Comments closed