Reporting On Unit Tests In R With covrpage

Maelle Salmon recaps Locke Data’s involvement with the covrpage package:

To read more about getting started with covrpage in your own package in a few lines of code only, we recommend checking out the “get started” vignette. It explains more how to setup the Travis deploy, mentions which functions power the covrpage report, and gives more motivation for using covrpage.
And to learn how the information provided by covrpage should be read, read the “How to read the covrpage report” vignette.

Check it out.

Test Data Generation In SQL Server

Ahmad Yaseen walks through a couple techniques for creating test data in SQL Server:

Generating test data to fill the development database tables can also be performed easily and without wasting time for writing scripts for each data type or using third party tools. You can find various tools in the market that can be used to generate testing data. One of these wonderful tools is the dbForge Data Generator for SQL Server . It is a powerful GUI tool for a fast generation of meaningful test data for the development databases. dbForge data generation tool includes 200+ predefined data generators with sensible configuration options that allow you to emulate column-intelligent random data. The tool also allows generating demo data for SQL Server databases already filled with data and creating your own custom test data generators. dbForge Data Generator for SQL Server can save your time and effort spent on demo data generation by populating SQL Server tables with millions of rows of sample data that look just like real data. dbForge Data Generator for SQL Server helps to populate tables with most frequently used data types such as Basic, Business, Health, IT, Location, Payment and Person data types.

I have a love-hate relationship with test data generation tools, as they tend not to create reasonable data, where reasonable is a combination of domain (hi, birth date in the early 1800s!) and distribution.

Testing Package Properties With ssisUnit

Bartosz Ratajczyk shows how you can test certain properties on an Integration Services package using ssisUnit:

The command is simple. You can get or set the property using the value for given property path. As usual – when you get the value, you leave the value blank. The path – well – is the path to the element in the package or the project. You use backslashes to separate elements in the package tree, and at the end, you use .Properties[PropertyName] to read the property. If you use the elements collection – like connection managers – you can pick a single element using square brackets and the name of this element.

Read on for more, including limitations and useful testing scenarios.

Automated Testing With Power Query

Fred Kaffenberger walks us through query failure with Power Query:

I loved Nar’s post on Automated Testing using DAX. I especially like the rule of always including controls so that business readers can share responsibility for data quality. For my part, I sometimes use hidden pages in Power BI reports to assure myself of data quality. I also set alerts on testing dashboards in the Power BI Service to notify me if something is not right. Sometimes, however, a more proactive approach is needed. So, we’ll be doing automated testing with Power Query.

If the query can’t connect to the data source, it will fail. When this happens, the report in Power BI Service is stale, but accurate. I’m fine with thisIt can also happen that the query succeeds but is incomplete. In this case, the result is that the report is wrong. Why does this happen? It can happen because of an overtaxed transactional data source. The ERP or CRM or work order system just can’t deliver the amount of data. Maybe it’s linked SQL tables using ODBC. For whatever reason, the query succeeds, but data is missing. I’m NOT fine with this. The long-term solution is to move to a more reliable data source (data warehouse, anybody?). In the short run, refreshes must be stopped. Stale data is better than bad data.

Also check out the comments.

Packages For Testing R Packages

Kevin Feasel


R, Testing

Maelle Salmon shows us how to test our R packages within R:

If you’re brand-new to unit testing your R package, I’d recommend reading this chapter from Hadley Wickham’s book about R packages.

There’s an R package called RUnit for unit testing, but in the whole post we’ll mention resources around the testthat package since it’s the one we use in our packages, and arguably the most popular one. testthat is great! Don’t hesitate to reads its docs again if you started using it a while ago, since the latest major release added the setup() and teardown() functions to run code before and after all tests, very handy.

To setup testing in an existing package i.e. creating the test folder and adding testthat as a dependency, run usethis::use_testthat(). In our WIP pRojects package, we set up the tests directory for you so you don’t forget. Then, in any case, add new tests for a function using usethis::use_test().

The testthis package might help make your testing workflow even smoother. In particular, test_this() “reloads the package and runs tests associated with the currently open R script file.”, and there’s also a function for opening the test file associated with the current R script.

This is an area where I know I need to get better, and Maelle gives us a plethora of tooling for tests.

Generating Load For Kafka With JMeter

Anup Shirolkar shows us a way to use JMeter to generate load for Apache Kafka clusters:

The Anomalia Machina is going to require (at least!) one more thing as stated in the intro, loading with lots of data! Kafka is a log aggregation system and operates on a publish-subscribe mechanism. The Kafka cluster in Anomalia Machina will be accumulating a lot of events which are to be processed to discover anomalies. The exact sequence of processing is still being prototyped at this point in time, but there is a solid requirement of a tool/mechanism to load the Kafka cluster with lots of data in a hurry.

The requirements pointed me in direction of looking for ‘Kafka Load Testing’. Firstly thinking of load testing, one tool comes to mind which is used very widely for load testing of Java based systems: ‘Jmeter’. Jmeter has rich toolset to perform various types of testing. It also comes with many advantages viz. Open source, easy to use, platform independent, distributed testing etc. I can use Jmeter and test its ability to perform cluster loading.

Read on for the demonstration.

Testing TDE Performance

Eduardo Pivaral tests the performance of a database with Transparent Data Encryption versus that same database without encryption:

Transparent data encryption (TDE) helps you to secure your data at rest, this means the data files and related backups are encrypted, securing your data in case your media is stolen.
This technology works by implementing real-time I/O encryption and decryption, so this implementation is transparent for your applications and users.

However, this type of implementation could lead to some performance degradation since more resources must be allocated in order to perform the encrypt/decrypt operations.

On this post we will compare how much longer take some of the most common DB operations, so in case you are planning to implement it on your database, you can have an idea on what to expect from different operations.

These results fit in reasonably well with what I’d heard, but it’s nice to have someone run the numbers.

Building Observable Distributed Systems

Kevin Sookocheff has some thoughts on building observable systems:

Given the shortcomings of monitoring and testing, we should shift focus to building observable systems. This means treating observability of system behaviour as a primary feature of the system being built, and integrating this feature into how we design, build, test, and maintain our systems. This also means acknowledging that the ease with which we can debug our production environment will be a key indicator of system reliability, scalability, and ultimately customer experience. Designing a system to be observable requires effort from three disciplines of software development: development, testing, and operations. None of these disciplines is more important than the others, and the sum of them is greater than the value of the individual parts. Let’s take some time to look at each discipline in more detail, with a focus on observability.

My struggle has never been with the concept, but rather with getting the implementation details right.  “Make everything observable” is great until you run out of disk space because you’re logging everything.

Combining Parameters And Prompts In Powershell

Shane O’Neill compromises with a co-worker on building a Powershell function:

As you may have guessed…

…this did not go down well with my co-worker.

That?! How are we supposed to know to write “!?” there? How does that even mean help anyway? Can we not just get the help message to show up on it’s own!

To which I said…

We can but it’s going to create a massive bottleneck! If we try and put in the help messages then it’s going to slow things down eventually because we’re not going to be able to run this without someone babysitting it! Do you want to be the person that has to sit there watching this every single time it’s run?

Compromise is a wonderful thing though and we eventually managed to merge the best of both worlds…

Shane gives us a solution and has a couple of updates for simpler solutions to boot.

An Update To ssisUnit

Bartosz Ratajczyk has added some functionality to ssisUnit:

Second – you can get and set the properties of the project and its elements. Like – overwriting project connection managers (I designed it with this particular need on my mind). You can now set the connection string the different server (or database) – in the PropertyPath of the PropertyCommand use \Project\ConnectionManagers, write the name of the connection manager with the extension, and use one of the Properties. You can do it during the Test setup (or all tests setup), but not during the test suite setup, as ssisUnit is not aware of the project until it loads it into the memory.

Good on Bartosz for resurrecting a stable but moribund project and adding some enhancements.


December 2018
« Nov