Generating Workloads with Powershell

Rob Sewell wants to generate a workload against AdventureWorks using Powershell:

For a later blog post I have been trying to generate some workload against an AdventureWorks database.

I found this excellent blog post by Pieter Vanhove t which references this 2011 post by Jonathan Kehayias t

Rob turns these into multi-threaded workload generators. If you’re looking at generating stress on servers, you might also look at PigDog, developed by Mark Willkinson (one of my co-workers, so I have seen the look of joy on his face when he brings SQL Server to its knees).

Unit Testing R Code

Kevin Feasel


R, Testing

John Mount points out that you don’t need special infrastructure to perform unit testing in R:

There seems to be a general (false) impression among non R-core developers that to run tests, R package developers need a test management system such as RUnit or testthat. And a further false impression that testthat is the only R test management system. This is in fact not true, as R itself has a capable testing facility in “R CMD check” (a command triggering R checks from outside of any given integrated development environment).

By a combination of skimming the R-manuals ( ) and running a few experiments I came up with a description of how R-testing actually works. And I have adapted the available tools to fit my current preferred workflow. This may not be your preferred workflow, but I have and give my reasons below.

Food for thought for any R developer.

Testing Cosmos DB’s REST API

Hasan Savran shows how we can test Cosmos DB’s REST API using Postman:

        You have many options to access to CosmosDB. Rest API is one of these options and it is the low level access way to Cosmos DB. You can customize all options of CosmosDB by using REST API. To customize the calls, and pass the required authorization information, you need to use http headers. There are many headers you can set depending on the operation you want to run in CosmosDB.  I am going to cover only the required headers here.

      In the following example, I am going to try to create a database in CosmosDB emulator by using the REST API. First let’s look at the required header fields for this request. These requirement applies to all other REST API calls too.

It’s a little more complicated than just posting to a URL and Hasan has you covered.

The Anatomy of a Pester Test

Shane O’Neill takes us through using Pester to test self-contained scripts:

Where things differ…
…could be when you try to accommodate different people and create a .ps1 file that both defines and calls a function. Self Contained scripts, if you would call them that.
Normally the reason that I’ve heard from this is you’re trying to help a non-technical minded person and they just want a file that they can open, hit “run”, and everything is done for them.
Have you ever tried to Pester test those files though? It’s not recommended, especially if your function removes or modifies objects.

Click through for a solution and read Shane’s update as well for a scenario where it doesn’t quite work as hoped.

Regression Testing With Pester

Ust Oldfield continues a series on Pester testing:

In a previous post, I gave an overview to regression tests. In this post, I will give a practical example of developing and performing regression tests with the Pester framework for PowerShell. The code for performing regression tests is written in PowerShell using the Pester Framework. The tests are run through Azure DevOps pipelines and are designed to test regression scenarios. The PowerShell scripts, which contain the mechanism for executing tests, rely upon receiving the actual test definitions from a metadata database. The structure of the metadata database will be exactly the same as laid out in the Integration Test post.

There’s a hefty test script here too, so check it out.

An Overview of Regression Testing

Ust Oldfield gives us a primer on regression testing:

There are a variety of methods and techniques that can be used in the design and execution of regression tests. These are:
– Retest All
– Test Selection
– Test Case Prioritisation

Regression testing is really nice to have in place because it keeps you from looking like a buffoon for accidentally reintroducing a bug you’ve previously quashed.

Generating Fake Data

Rich Benner shows us how to use the Faker library in Python to generate test data:

There are far more options when using Faker. Looking at the official documentation you’ll see the list of different data types you can generate as well as options such as region specific data.

Go have fun trying this, it’s a small setup for a large amount of time saved.

These types of tools can be great for generating a bunch of data but come with a couple of risks. One is that in the fake addresses Rich shows, ZIP codes don’t match their states at all, so if your application needs valid combos, it can cause issues. The other problem comes from distributions: generated data often gets created off of a uniform distribution, so you might not find skewness-related problems (e.g., parameter sniffing issues) strictly in your test data.

That said, easily generating test data is powerful and I don’t want to let the good be the enemy of the great.

Database Integration Testing With Pester

Ust Oldfield shows us an example of how we can perform database integration testing using Pester:

In a previous post, I gave an overview to integration tests and documenting integration points. In this post, I will give a practical example of developing and performing integration tests with the Pester framework for PowerShell. With a data platform, especially one hosted in Azure, it’s important to test that the Azure resources in your environment have been deployed and configured correctly. After we’ve done this, we can test the integration points on the platform, confident that all the components have been deployed.

The code for performing integration tests is written in PowerShell using the Pester Framework. The tests are run through Azure DevOps pipelines and are designed to test documented integration points. The PowerShell scripts, which contain the mechanism for executing tests, rely upon receiving the actual test definitions from a metadata database.

Click through for the script.

Getting CSV Row Counts

Dave Mason shares a few techniques for getting row counts of CSV files:

I was reminded of this recently as I was working with R, trying to read a nearly 2 GB data file. I wanted to read in 5% of the data and output it to a smaller file that would make the test code run faster. The particular function I was working with needed a row count as one of its parameters. For me, that meant I had to determine the number of rows in the source file and multiply by 0.05. I tied the code for all of those tasks into one script block.
Now, none to my surprise, it was slow. In my short experience, I’ve found R isn’t particularly snappy–even when the data can fit comfortably in memory. I was pretty sure I could beat R’s record count performance handily with C#. And I did. I found some related questions on StackOverflow. A small handful of answers discussed the efficiency of various approaches. I only tried two C# variations: my original attempt, and a second version that was supposed to be faster (the improvement was nominal).

To fact-check Dave (because this blog is about nothing other than making sure Dave is right), I checked the source code to the wc command. That command also streams through the entire file, so Dave’s premise looks good.

Testing Kafka Streams Applications

Yeva Byzek continues her series on testing Kafka-based streaming applications:

When you create a stream processing application with Kafka’s Streams API, you create a Topologyeither using the StreamsBuilder DSL or the low-level Processor API. Normally, the topology runs with the KafkaStreams class, which connects to a Kafka cluster and begins processing when you call start(). For testing though, connecting to a running Kafka cluster and making sure to clean up state between tests adds a lot of complexity and time.

Instead, developers can unit test their Kafka Streams applications with utilities provided by kafka-streams-test-utils. Introduced in KIP-247, this artifact was specifically created to help developers test their code, and it can be added into your continuous integration and continuous delivery (CI/CD) pipeline.

Streaming applications need tested just like any other.


April 2019
« Mar