Press "Enter" to skip to content

Category: Testing

Generating Test Data with ChatGPT

Daniel Janik builds fake data:

Have you ever been tasked with creating test data for an application and then ran into performance problems once the application moves to production?

Many of us manage databases or applications that contain regulated data that can’t leave a production environment. This means that we need to “clean” the data if it’s going to be used in QA or development work and one common way to de-identify the data is to simply update columns like firstname and lastname with a simple format “firstname” + counter; however, this results in all the data being unique and sequential. Firstname1, firstname2, firstname3, …
This isn’t good for getting like for like results with a production database and can lead to questions we’ve heard before in the workplace like “Why didn’t we catch this in QA?”

This works reasonably well, though you’d want to be sure to seed in edge cases and the like. But if you just need to generate some realistic-ish data pretty quickly, this is one option that can work.

Comments closed

Trying out Azure Load Testing

Dieu Phan takes us through the Azure Load Testing service:

Azure Load Testing is a fully managed load-testing service that enables you to generate high-scale loads. The service simulates traffic for your applications, regardless of where they’re hosted so I would like to share a walkthrough Azure Load Testing in this post.

Okay, this post isn’t very data platform-centric, but I do like the Load Testing service and think more companies and people should use it.

Comments closed

Automated Power BI Visual Testing with PBI Inspector

Chris Webb phones a friend:

This week, one of my colleagues at Microsoft, Nat Van Gulck, showed me a cool new open-source tool he’s been working on to make VisOps for Power BI much easier: PBI Inspector. What is VisOps? I’ll admit I didn’t really know either, so being lazy I asked Nat to write a few paragraphs describing the project and why it will be useful:

Read on for Nat’s description and an example of PBI Inspector in action.

Comments closed

Unit Testing Dynamic SQL

Jay Robinson lays out a pattern:

Dynamic SQL (aka Ad Hoc SQL) is SQL code that is generated at runtime. It’s quite common. Nearly every system I’ve supported in the past 30 years uses it to some degree, some more than others.

It can also be a particularly nasty pain point in a lot of systems. It can be a security vulnerability. It can be difficult to troubleshoot. It can be difficult to document. And it can produce some wickedly bad results.

Click through for Jay’s process as well as recommendations and an example. It’s certainly worth thinking about.

Comments closed

Verifying a Backup in SQL Server

Chad Callihan knows your last backup is only as good as your last restore:

Is the process of testing your backups something you know you should do but never get around to? Do you find yourself assuming all is well with backups while putting out other fires? Test-DbaLastBackup, part of the beloved dbatools, can solve your dilemma.

There are many options available when using Test-DbaLastBackup. Let’s explore a few of these options and see some examples of how to use them.

Click through to learn more about this. And you could easily put together Powershell scripts to stagger your restorations over a time frame (such as, 15% of your databases each day, so that you get to 100% by the end of the week).

Comments closed

Thoughts on Testing Stored Procedures

Erik Darling has a plan:

While most of it isn’t a security concern, though it may be if you’re using Row Level Security, Dynamic Data Masking, or Encrypted Columns, you should try executing it as other users to make sure access is correct.

When I’m writing stored procedures for myself or for clients, here’s what I do.

Click through for Erik’s guidance. The premise is couched in security testing, though much of this is functionality and performance testing Regardless, it’s a good plan.

Comments closed

Managing Database Test Data

Phil Factor maintains some tests:

When learning about relational databases, we all tend to use ‘toy’ databases such as PubsAdventureWorksNorthWind, or ClassicModels. This is fine, but it is too easy to assume that one can then do real-world database development in the same way. You have your database full of data and just cut code that you then test. From a distance, it all seems so easy.

In fact, rapid and effective database development usually requires a much more active approach to data. You need to work out how to test your work as you go, and to test continuously. For that, you need appropriate data with the right characteristics, in the suitable quantity. You also need to plan how to ensure that, when you make changes to the database, or even minor changes to its settings, all business processes continue to work correctly. In Agile terms you need a test-first methodology, fast feedback loop, and iterative development. You should never cut some SQL Code and only then think to yourself “I wonder how I’ll be able to test this?“.

This is something I’ve historically been pretty lazy about, to my detriment. Phil does an outstanding job of making the case for why generating and working with your own test data (versus live data) is important, as well as categorizing the purposes of this test data and the types of tests you’ll want to have.

Comments closed

Load Testing in Power BI

Chris Webb gives us the why and the how:

If you’re about to put a big Power BI project into production, have you planned to do any load testing? If not, stop whatever you’re doing and read this!

In my job on the Power BI CAT team at Microsoft it’s common for us to be called in to deal with poorly performing reports. Often, these performance problems only become apparent after the reports have gone live: performance was fine when it was just one developer testing it, but as soon as multiple end-users start running reports they complain about slowness and timeouts. At this point it’s much harder to change or fix anything because time is short, everyone’s panicking and blaming each other, and the users are trying to do their jobs with a solution that isn’t fit for purpose. Of course if the developers had done some load testing then these problems would have been picked up much earlier.

With that in mind, Chris explains some of the things we can do to help with load testing in Power BI.

Comments closed

Data Validation with Great Expectations and Azure Functions

Eduard van Valkenburg does a bit of data validation:

Great Expectations (Great Expectations Home Page • Great Expectations) is a popular Python-based OSS tool to validate data that comes into your data estate. And for me, validating incoming data is best done file by file, as the files arrive! On Azure there is no better platform for that then Azure Functions. I love the serverless nature of Functions and with the triggers available for arriving blobs, as well as HTTP, event grid events, queues and others. There are some great patterns that allow you to build event-driven data architectures. We also now have the Python v2 framework for Azure Functions available, which makes the developer experience even better. So, let’s go through how to get it running.

This looks really interesting and tying it in to Azure Functions is a good idea assuming that the checks don’t run for too long.

Comments closed