Press "Enter" to skip to content

Category: Testing

Creating Repeatable Test Data

Louis Davidson repeats himself:

In order to test graph structures, I needed a large set of random data. In some ways, this data will resemble the IMDB database I will include later in this chapter, but to make it one, controllable in size and two, random, I created this random dataset. I loaded a set of values for an account into a table and a set of interests. I wanted then to be able to load a set of random data into edge, related one account to another (one follows the other), and then an account to a random set of interests.

In this article, I will discuss a few techniques I used, starting with the simplest method using ordering by NEWID(), then using RAND() to allow the code to generate a fixed set of data.

There’s a lot of code needed to do it but if it’s something you’ve got to do, that’s the cost of doing business.

Leave a Comment

Testing API Packages in R

Jamie Owen does some testing:

This blog post is a follow on to our API as a package series, which looks to expand on the topic of testing {plumber} API applications within the package structure leveraging {testthat}. As a reminder of the situation, so far we have an R package that defines functions that will be used as endpoints in a {plumber} API application. The API routes defined via {plumber} decorators in inst simply map the package functions to URLs.

Jamie covers a lot of testing ground in that post as well, so check it out.

Leave a Comment

Running Postman Tests in GitLab

Rahul Kumar automates Postman tests:

Hi folks, In this brief blog post, we’ll learn more about Gitlab CI and Postman, the API testing tool we use the most frequently. This article’s goal is to provide a quick process for automatically testing the service API response. The solution makes use of the capabilities provided by the Gitlab-integrated Continuous Integration tool.

Click through for the tutorial.

Comments closed

Testing Powershell Scripts

David Wilson provides an introduction to Pester:

Most of you probably know that I’m a big fan of automated testing and especially testing during the development process. It significantly improves the design of the code by encouraging loose coupling and high cohesion. It also provides great documentation and increases the confidence of anyone who needs to change the code in the future (this includes future you)!

Testing does tend to get the short end of the stick when it comes to development time. Some of that is design problems, like David mentions, but I think a lot of it is the “This is a solved problem” mentality we (and I am definitely part of “we” here) end up in: I proved that the solution work because the code compiled and the two scenarios I tried out worked; therefore, why do I need to “waste” the extra time by writing all of these tests when I can move on to something more interesting?

Comments closed

Creating a SQL Server 2022 Learning Environment

Marlon Ribunal gets us started with a Docker container:

Maybe you want to get your hands dirty with the bells and whistles of the latest iteration of SQL Server, but you don’t have an extra bare metal or Azure or GCP based VM. Well, you’re in luck because Microsoft just released container images for SQL Server 2022.

Here are few steps to get you started with SQL Server 2022:

At this point, it’s quite easy to give new versions of SQL Server a try, even when they’re in preview. That said, some of the features make it to containers later so you might want to spin up a virtual machine and install it if there’s something you can’t get right now in the container.

Comments closed

Testing Azure Synapse Link for SQL Server 2022

Kevin Chant gives Synapse Link for SQL Server a try:

Azure Synapse Link for SQL Server 2022 allows you to replicate your data from a SQL Server 2022 database to an Azure Synapse Analytics dedicated SQL Pool.

It is one of the options for the new Azure Synapse Link for SQL feature that was announced during Microsoft Build. You can read more about this in the Microsoft post which also announced the Public Preview of Azure Synapse Link for SQL.

Click through to see what Kevin has found so far. I think by the time this rolls out GA, it should be pretty good.

Comments closed

Creating Reproducible Examples with CI

Colin Gillespie and Jack Walton tackle a common training problem:

As the number of courses we offer increased, so did the maintenance burden of our associated training materials (lecture notes, slides, exercises, and more). To ease this burden, and to assist in ensuring that our training materials build consistently, we developed an R package called {jrNotes2}. Amongst other things, this package ensures that all courses:

– have identical “template files”: .gitlab-ci.yml.gitignoreMakefiles, index.Rmd, …;

– have the same directory structure, and

– pass a set of quality-assurance checks.

This is smart but read on to see why it’s still a challenge. This is especially true in the R and Python worlds, where breaking changes seem to be so common.

Comments closed

Unit Testing ADX Functions

David Giard builds some tests:

Our application contains many functions that return data stored in Azure Data Explorer (ADX). We wrote these functions in Kusto Query Language (KQL) and each function returns a set of data based on the arguments passed. Although developers tested these functions as they wrote them, we needed a way to validate that the functions continued to work as the code and the data changed.

Automated Unit testing is an essential part of any application development life cycle. It validates that code works properly and minimizes the risk that future code changes will break existing functionality.

In this article, I will discuss the approach we took in automating the testing of ADX functions.

Click through to see how to use the assert() function and build some tests.

Comments closed

Finding Duplicates in Type 2 SCDs

Dinesh Asanka wants to verify some Type 2 slowly changing dimension results:

As we discussed in a previous article, Implementing Slowly Changing Dimensions (SCDs) in Data Warehouses, there are three main types of slowly changing dimensions, such as Type 1, Type 2, and Type 3. Out of these Type 1 is the simple dimension where you will simply maintain only the latest version of the attribute. For example, if the employee got promoted to Senior Software Engineer from Software Engineer, you will simply overwrite the existing value to the new value so that the historical aspect is lost.

Type 2 Slowly Changing Dimensions are used to track historical data in a data warehouse. This is the most common approach in dimension. This article uses a sample database of AdventureworksDW which is the sample database for the data warehouse.

Click through for one way to compare, one which you could build using dynamic SQL.

Comments closed

Stress Testing Power BI Embedded

Kristyna Hughes puts Power BI to the test:

For example, one instance may have a very large data model that takes a lot of memory and CPU time to refresh, 20 users at peak viewing times, hourly refreshes, and the queries are all very simple and allow for query folding. Another business may have six smaller data models, 950 users at peak viewing times, daily refreshes, and the queries populating the data model are all very very complex. All of these elements impact the usage at any given time, making predicting overall CPU needs nearly impossible. Thankfully, stress testing your capacity gives us an option that is not purely reactionary.

This blog will walk through how to stress test your capacity, the elements of capacity planning, and how to understand the results of the stress test.

Read on to see how, using a step-by-step guide.

Comments closed