TDE + AG = Higher CPU Utilization

Ginger Keys has an analysis stress testing CPU load when Transparent Data Encryption is on and a database is in an Availability Group:

Microsoft says that turning on TDE (Transparent Data Encryption) for a database will result in a 2-4% performance penalty, which is actually not too bad given the benefits of having your data more secure. There is even more of a performance hit when enabling cell level or column level encryption. When encrypting any of your databases, keep in mind that the tempdb database will also be encrypted. This could have a performance impact on your other non-encrypted databases on the same instance.

In a previous post I demonstrated how to add an encrypted database to an AlwaysOn group in SQL2016. In this article I will demonstrate the performance effects of having an encrypted database in your AlwaysOn Group compared to the same database not-encrypted.

The results aren’t surprising, though the magnitude of the results might be.

The Importance Of A Test Environment

Randolph West explains why it’s important to have a test environment separate from your development and production environments:

Some companies I’ve worked with have different forms of testing environments, including QA (Quality Assurance), IAT (Internal Acceptance Testing), and UAT (User Acceptance Testing). What they are called doesn’t matter, so long as they exist.

In a typical IT deployment, whether using Waterfall, Agile, or other development methodologies of the month, it pays to have a basic developmenttestproduction deployment path.

Randolph explains it in some detail but one of the big benefits for me is that you can make sure that deployment process works before deployment time.  Knowing that your checked-in scripts won’t break the deployment (because they didn’t break the CI build and release) makes the release process a lot less stressful.

Using OStress

Nikhilesh Patel explains how to use OStress to generate artificial database loads for stress testing:

OStress is a Microsoft tool comes with RML utilities package and it uses to stress SQL Server. This is especially useful when you want to troubleshoot SQL Server while SQL Server is under heavy load.

It is a free tool for SQL Server developers and DBAs. It is designed to assist with performance stress testing of T-SQL queries and routines. The tool automatically collects metrics to help you determine whether your queries will perform under load, and what kind of resource strain they put on a server. In short, it also allows putting a serious load on your database.

OStress isn’t the easiest thing in the world to set up, but it works well.

BDD In Spark

Aaron Colcord and Zachary Nanfelt explain how to use Cucumber to create behavior-driven development tests on Apache Spark:

Cucumber allows us to write a portion of our software in a simple, language-based approach that enables all team members to easily read the unit tests. Our focus is on detailing the results we want the system to return. Non-Technical members of the team can easily create, read, and validate the testing of the system.

Often Apache Spark is one component among many in processing data and this can encourage multiple testing frameworks. Cucumber can help us provides a consistent unit testing strategy when the project may extend past Apache Spark for data processing. Instead of mixing the different unit testing strategies between sub-projects, we create one readable agile acceptance framework. This is creating a form of ‘Automated Acceptance Testing’.

Best of all, we are able to create ‘living documentation’ produced during development. Rather than a separate Documentation process, the Unit Tests form a readable document that can be made readable to external parties. Each time the code is updated, the Documentation is updated. It is a true win-win.

It’s an interesting mix.  I’m not the biggest fan of BDD but I’m happy that this information is out there.

Kinesis Data Generation

Allan MacInnis shows off a new data generation tool for Amazon’s Kinesis:

Amazon Kinesis Streams and Amazon Kinesis Firehose enable you to continuously capture and store terabytes of data per hour from hundreds of thousands of sources. Amazon Kinesis Analytics gives you the ability to use standard SQL to analyze and aggregate this data in real-time. It’s easy to create an Amazon Kinesis stream or Firehose delivery stream with just a few clicks in the AWS Management Console (or a few commands using the AWS CLI or Amazon Kinesis API). However, to generate a continuous stream of test data, you must write a custom process or script that runs continuously, using the AWS SDK or CLI to send test records to Amazon Kinesis. Although this task is necessary to adequately test your solution, it means more complexity and longer development and testing times.

Wouldn’t it be great if there were a user-friendly tool to generate test data and send it to Amazon Kinesis? Well, now there is—the Amazon Kinesis Data Generator (KDG).

Check it out if you’re using Kinesis and need to do some load testing.

Designing A Data Warehouse Test Plan

Koos van Strien walks through some of the high-level concepts when automating data warehouse tests:

In my current project, I’ve got a database containing everything to perform these tests:

  • Tables with identical structure to the ones in the staging area (plus two columns “TestSuiteName” and “TestName”)
  • A table containing the mapping from test-input table to target database, schema and table
  • A stored procedure to purge the DWH (all layers) in the test environment
  • A stored procedure to insert the data for a specific testsuite / name

When preparing a specific test case (the “insert rows for test case” step from the diagram above), the rows needed for that case are copied into the DWH:

Testing warehouses is certainly not a trivial exercise but given how complex warehouse ETL tends to be, having good tests reduces the number of 3 AM pages.

Pester For Presentations

Rob Sewell takes Pester to the edge:

If you have PowerShell version 5 then you will have Pester already installed although you should update it to the latest version. If not you can get Pester from the PowerShell Gallery follow the instructions on that page to install it. This is a good post to start learning about Pester

What can you test? Everything. Well, specifically everything that you can write a PowerShell command to check. So when I am setting up for my presentation I check the following things. I add new things to my tests as I think of them or as I observe things that may break my presentations. Most recently that was ensuring that my Visual Studio Code session was running under the correct user. I did that like this

Rob’s scenario is around giving presentations, but while reading this, think about those services which should be running on your SQL Server instance—the same concept applies.

Load Testing Kafka

Satish Bhor shows off Pepper-Box, a load generator which can stress test Apache Kafka:

Pepper-Box is a Kafka load generator application that can be used as a plugin for JMeter or standalone utility. It allows sending plain text Kafka messages (JSON, XML, CSV, or any other custom format), as well as Java serialized objects. Pepper-Box includes a template engine and random data generation function which helps to design message in any format. If we use it with JMeter then we can use all JMeter features. Pepper-Box is very useful in streaming analytics and data pipelines implementation, where input data format is tightly coupled with business problems.

Pepper-Box includes four main components.

I’m going to keep an eye on this tool.

Ignoring LoadGeneratorLocationError

Melissa Connors shows how to ignore LoadGeneratorLocationError errors in Visual Studio load tests:

I use Visual Studio for performance testing and overhead analysis with the SentryOne products. Currently, I have Microsoft Visual Studio Enterprise 2015 Version 14.0.25431.01 Update 3 installed. Since the first edition of 2015 (possibly even Visual Studio 2013), I’ve received a LoadGeneratorLocationError during each Load Test execution.

Since I am running the test locally, this error is noise. Furthermore, no one wants to see an error in an otherwise successful test. It simply ruins the final results report. In addition, when the Load Test was created, “On-premise Load Test” was selected, which makes this frustrating. Possibly more frustrating is that it’s called “On-premise” when you get started in the New Load Test Wizard.

Read on for the answer.


Kevin Feasel


R, Testing

David Smith discusses a new service to test packages on multiple platforms:

If you’re developing a package for R to share with others — on CRAN, say — you’ll want to make sure it works for others. That means testing it on various platforms (Windows, Mac, Linux, and all the versions thereof), and on various versions of R (current, past, and future). But it’s likely you only have access to one platform, and installing and managing multiple R versions can be a pain.

R-hub, the online package-building service now in public beta, aims to solve this problem by making it easy to build and test your package on a variety of platforms and R versions. Using the rhub R package, you can with a single command upload your package to the cloud-based R-hub service, and build and test your package on the current, prior, and in-development versions of R, using any or all of these platforms

This looks like an interesting service for package developers and companies with a broad distribution of R installations.


August 2017
« Jul