Press "Enter" to skip to content

Category: Testing

Finding a Good Cost Threshold for Parallelism

Jared Westover goes on a quest:

Given modern hardware, you might hear that the default setting of 5 for the Cost Threshold for Parallelism (CTFP) is far too low. However, people are left with a decision: Should they change it or leave it alone? If I change it and the performance gets worse, I’ll be left with egg on my face. What exactly is the benefit of increasing it, especially for smaller-cost queries?

Read on to learn more about what Cost Threshold for Parallelism is, how you can set it, and a simple example of how the setting can affect you. Jared also has some links to great resources that I highly recommend you check out.

Comments closed

Testing Kafka Messages with RecordCaptor

Anton Belyaev shows off an open-source utility:

Let’s take a Telegram bot that forwards requests to the OpenAI API and returns the result to the user as an example. If the request to OpenAI violates the system’s security rules, the client will be notified. Additionally, a message will be sent to Kafka for the behavioral control system so that the manager can contact the user, explain that their request was too sensitive even for our bot, and ask them to review their preferences.

The interaction contracts with services are described in a simplified manner to emphasize the core logic. Below is a sequence diagram demonstrating the application’s architecture. I understand that the design may raise questions from a system architecture perspective, but please approach it with understanding — the main goal here is to demonstrate the approach to writing tests.

Read on to see how it all works, as well as links to Anton’s GitHub repo for testing in Kafka.

Comments closed

Building a Test Data Generator for PostgreSQL

Mika Sutinen builds some data:

I recently had a project where I needed quickly to generate some realistic looking test data to PostgreSQL database. While I often like to go for ready-made solutions, this felt like a good opportunity to stretch my coding muscles and develop it myself. Moreover, this seemed like a fun puzzle to solve, and I could probably use the same solution later on elsewhere.

Click through for a description of the generator, as well as a link to Mika’s GitHub repo. Taking a quick peek at it, it does appear that you could probably use this for other data platforms like SQL Server with very limited modification.

Comments closed

Testing Python Code with pytest

Aida Gjoka builds some tests:

Testing code using automated tools is common throughout the software development industry. This technique can improve the quality of the code you write as a data scientist. Testing helps refine your code, supports redesign, prevents errors, and makes it harder to write single-use code.

Here, we introduce the pytest framework and show how it can be used to test Python functions. If you don’t use a testing framework as part of your daily workflow, try experimenting with the techniques presented here the next time you write or extend a function.

I am a big fan of pytest because it strikes what I consider to be a great balance between convention and customization. There’s very little administrative overhead to creating test classes and test cases, so tests are easy to build and can it’s trivial to run a test suite or a specific part of one.

Comments closed

Creating Test Classes and Unit Tests with tSQLt

Olivier Van Steenlandt continues a series on database testing:

We have set up our tSQLt Database Project in the previous data recipe, Create a SSDT Project Template based on your Database Project. Now it’s time to dive into the wonderful world of tSQLt Unit Testing. In the meantime, I have added my data warehouse to my SSDT Solution and added this project as a Database Reference to my Unit Testing Database Project. If you are unsure how to do this, you can find all the information you need in my previous data cookbook which you can access via the following link: Getting Started With Database Projects & Azure DevOps.

Read on for a walkthrough of how to do this.

Comments closed

Adding tSQLt to a Database Project

Olivier Van Steenlandt provides an overview of adding tSQLt to a Visual Studio database project:

As a first step in the process, we’re going to create a new Database Project, in my case, I will be calling my Database Project AdventureWorksDW_UnitTesting and my solution AdventureWorks.

If you are not sure how to set up a Database Project in Visual Studio from scratch, don’t worry, you can follow the step-by-step data recipe I released a while ago, Getting Started with Database Projects and Version Control

Read on to learn more about how to add the tSQLt objects and eliminate cross-database reference issues.

Comments closed

The Importance of Exploratory Testing

Thuy covers why exploratory testing is important:

Exploratory Testing is a software testing method that testers use to explore, find, and test features, bugs, or issues in an application freely and without the need for a prior testing plan. In exploratory testing, the tester will focus on freely working with the app as a real user and trying to find bugs and issues without following a specific test scenario.

Exploratory testing is a type of testing in which test cases are not created beforehand, but testers can test the system quickly. They can jot down ideas about what needs to be checked before performing the test. The focus of exploratory testing focuses more on testing as a “thinking” activity that explores new cases that do not follow the mainstream activity.

It’s amazing (and dismaying) how many bugs you can find simply by clicking around. The tricky part about exploratory testing is not actually finding bugs, but keeping track of your actions so that a developer knows how to fix the bugs you’ll inevitably find.

Comments closed

Unit Testing a Database

Olivier Van Steenlandt builds some tests:

In the past few years, I learned much about collaborative data warehouse development and deployment automatization by using Database Projects (SSDT) and Azure DevOps (and other tools).

I had my fair share of learning curves, making mistakes, and having great learning opportunities. Lately, I started my next journey to learn about Unit Testing for data warehousing/database development.

In this data cookbook (blog post series), we will discover the wonderful world and different flavors of unit testing from a data perspective. In the coming weeks/months, new data recipes (blog posts) will be released bi-weekly.

This first post provides an overview of the topic and includes links to three tools, though SQL Test is an implementation of tSQLt. Of the three, Visual Studio tests are the best of the bunch, though they’re more integration tests than unit tests.

Comments closed

Test Isolation with Kafka

Anton Belyaev builds some tests:

The experience of running Kafka in test scenarios has reached a high level of convenience thanks to the use of Test containers and enhanced support in Spring Boot 3.1 with the @ServiceConnection annotation. However, writing and maintaining integration tests with Kafka remains a challenge. This article describes an approach that significantly simplifies the testing process by ensuring test isolation and providing a set of tools to achieve this goal. With the successful implementation of isolation, Kafka tests can be organized in such a way that at the stage of result verification, there is full access to all messages that have arisen during the test, thereby avoiding the need for forced waiting methods such as Thread.sleep().

This method is suitable for use with Test containers, Embedded Kafka, or other methods of running the Kafka service (e.g., a local instance).

Click through for that approach.

Comments closed