Press "Enter" to skip to content

Author: Kevin Feasel

The JOIN Function in SSRS

Tim Mitchell explains what the JOIN() function does in SQL Server Reporting Services:

The JOIN() function in SQL Server Reporting Services is a handy tool that allows you to turn a list into a delimited string value. This function accepts two parameters, a list and a delimiter with which to separate the output, and returns a string with that list separated by the specified delimiter character.

It’s not an inaccurate name, but I’m surprised it’s not named CONCAT_WS() or something like that given how the term “join” has such a strong connotation in the relational database world.

Comments closed

TempDB Usage and WhoIsActive

Josh Darnell takes us through a weird scenario where WhoIsActive can’t catch the real culprit:

Let’s say you are informed that tempdb is getting hammered on a production SQL Server instance (in the “lots of reads and writes” sense, not the “lots of shots of tequila” sense), and it’s disrupting other workloads on the system. You may have found this out through the power of monitoring (tempdb files are growing or full), or your favorite DMV queries, or just from being really smart.

You spring into action to find the offending query, and run EXEC sp_WhoIsActive but get…nothin’:

I did not successfully guess why this might be, but Josh explains it well.

Comments closed

Diagnosing Power Query Steps

Chris Webb takes us through the Diagnose Step button in Power Query:

As you might have guessed, it’s closely related to the Query Diagnostics functionality that was introduced back in October. Whereas the existing Query Diagnostics functionality allows you to see what happens inside the Power Query when a query is executed, this new feature does something similar but allows you to run a query up to a specific step. This is useful in scenarios where you want to reduce the diagnostics data you are collecting to a subset of the steps in the query without having to comment out a lot of M code.

It looks pretty useful.

Comments closed

Testing Kafka Streams

Jukka Karvanen takes us through TopologyTestDriver in Apache Kafka Streams:

Similarly to testing a single record, it is possible to pipe Value lists into a TestInputTopic. Validating the output can be done record by record like before, or by using the readValueToList() method to see the big picture when validating the whole collection at the same time. For our example topology, when the test pipes in the values, it needs to validate the keys and use the readKeyValueToList() method.

Testing streams of data (regardless of the product) is enough of a mental shift that it’s not an easy problem. For that reason, I welcome any tool which simplifies the process.

Comments closed

How Spark Runs on YARN with HDFS

Sarfaraz Hussain explains how some of the pieces of the Hadoop ecosystem fit together:

Once it verifies that everything is in place, it will assign a Job ID to the Job and then allocate the Job ID into a Job Queue.

So, in Job Queue there can be multiple jobs waiting to get processed.

As soon as a job is assigned to the Job Queue, it’s corresponding information about the Job like Input/Output Path, the location of the Jar, etc. are written into the temp location of HDFS.

Read the whole thing.

Comments closed

Columnstore Versus Page Compression

Aaron Bertrand compares columnstore and page compression on a specific table:

Recently someone at work asked for more space to accommodate a rapidly growing table. At the time it had 3.75 billion rows, presented on 143 million pages, and occupying ~1.14TB. Of course we can always throw more disk at a table, but I wanted to see if we could scale this more efficiently than the current linear trend. Sounds like a great job for compression, right? But I also wanted to try out some other solutions, including columnstore – which people are surprisingly reluctant to try. I am no Niko, but I wanted to make an effort to see what it could do for us here.

Note that I am not focusing on reporting workload or other read query performance at this time – I merely want to see what impact I can have on storage (and memory) footprint of this data.

Here is the original table. I’ve changed table and column names to protect the innocent, but everything else is relatively accurate.

Page compression won, and I’ve got a pretty good idea why (though some of the diagnostic info is gone): Aaron has several VARCHAR and NVARCHAR columns, and those blow up the columnstore dictionary pretty fast. Aaron has more to go in this series, so stay tuned.

Comments closed

Auditing the Power BI Activity Log

Brett Powell shows how to query the Power BI Activity Log:

The new Power BI Activity Log makes it much easier for Power BI administrators to access audit log data to monitor and analyze activities occurring within the tenant they support. This blog post describes one solution for accessing, processing, and loading Power BI activity log data as part of an automated process to support ongoing Power BI administrative analysis and reporting.

Brett has a very nicely defined solution and lays it out step by step for us.

Comments closed

Costs and Benefits of Recompilation

Erik Darling takes us through the pros and cons of slapping that RECOMPILE label on a query:

It’s been a while since SQL Server has had a real RECOMPILE problem. And if you put it up against the performance problems that you can hit with parameter sniffing, I’d have a hard time telling someone strapped for time and knowledge that it’s the worst idea for them.

Obviously, you can run into problems if you (“you” includes Entity Framework, AKA the Database Demolisher) author the kind of queries that take a very long time to compile. But as I list them out, I’m kinda shrugging.

My rule of thumb is that WITH RECOMPILE isn’t the first answer, but I won’t mess around too long before going to it.

Comments closed

Auditing Index Changes

Jason Brimhall gives us a solution to track index changes:

In the previous article, I showed a quick and dirty method to try and capture the indexes that may have been just created. We all know that in the hectic life of the DBA, it is easy enough to forget the indexes created more than just a few hours in the past.

In that article, I mentioned that the solution provided was far from a complete solution to the entire problem. A bigger part of the picture involves audits to know when and by whom indexes were created, altered or dropped.

Keeping track of this kind of thing is important, particularly in an environment where performance suddenly changes even though the devs totally didn’t touch anything, promise.

Comments closed

Recovering From Parsing NaN Values

Slava Murygin shows how you can recover from a nasty TRY_PARSE error:

In my previous post I outlined very dangerous SQL Server problem, caused by usually not very harmful commands “PARSE” and “TRY_PARSE“: Having “NaN” value for REAL and FLOAT producing severe error.

This post will be about a major issue it causes and on how to fight it.

Will start from generating the problem.

Attention: Do not run that in production!!!

Slava’s serious about not running the code in a real environment.

Comments closed