Press "Enter" to skip to content

Day: August 26, 2021

Creating a Kafka Producer and Consumer with C#

Jim Galasyn shows how to use the Confluent.Kafka NuGet package to connect to a Kafka cluster from C#:

Sometimes you’d like to write your own code for producing data to an Apache Kafka® topic and connecting to a Kafka cluster programmatically. Confluent provides client libraries for several different programming languages that make it easy to code your own Kafka clients in your favorite dev environment.

One of the most popular dev environments is .NET and Visual Studio (VS) Code. This blog post shows you step by step how to use .NET and C# to create a client application that streams Wikipedia edit events to a Kafka topic in Confluent Cloud. Also, the app consumes a materialized view from ksqlDB that aggregates edits per page. The application runs on Linux, macOS, and Windows, with no code changes.
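To give a flavor of what the client library looks like, here is a minimal sketch of producing to and consuming from a topic with Confluent.Kafka. The broker address, topic name, and message contents are placeholders for illustration rather than anything from Jim's demo, which layers Confluent Cloud credentials and the Wikipedia edit stream on top of this.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Confluent.Kafka;

class Program
{
    static async Task Main()
    {
        const string topic = "wikipedia.edits";          // hypothetical topic name
        const string broker = "localhost:9092";          // swap for your cluster's bootstrap servers

        // Produce a single keyed message.
        var producerConfig = new ProducerConfig { BootstrapServers = broker };
        using (var producer = new ProducerBuilder<string, string>(producerConfig).Build())
        {
            var result = await producer.ProduceAsync(topic,
                new Message<string, string> { Key = "Apache_Kafka", Value = "{\"editor\":\"someone\"}" });
            Console.WriteLine($"Delivered to {result.TopicPartitionOffset}");
        }

        // Consume messages from the same topic for a few seconds.
        var consumerConfig = new ConsumerConfig
        {
            BootstrapServers = broker,
            GroupId = "wikipedia-edit-reader",
            AutoOffsetReset = AutoOffsetReset.Earliest
        };
        using (var consumer = new ConsumerBuilder<string, string>(consumerConfig).Build())
        {
            consumer.Subscribe(topic);
            using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(10));
            try
            {
                while (true)
                {
                    var msg = consumer.Consume(cts.Token);
                    Console.WriteLine($"{msg.Message.Key}: {msg.Message.Value}");
                }
            }
            catch (OperationCanceledException) { /* done reading */ }
            finally { consumer.Close(); }
        }
    }
}
```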

Now, if only the .NET package supported a bunch of stuff which has come out over the past few years (the big one being Streams)… That’s no knock on the maintainers, mind you—they’ve done a good job given available resources—but it’s still unfortunate. At least there’s an unofficial implementation and hey, the original Confluent.Kafka .NET package started out as one of those too.


Compressing JSON in SQL Server

Randolph West has a recommendation:

I’ll also pre-emptively note that if this table was simply an append-only archive table, the row size would not really matter. Unfortunately, this table participates in thousands of transactions per day, and as the original developers used Entity Framework and didn’t think much of using NVARCHAR(MAX), the entire row is coming over the wire into the application each time it is queried.

As I’ve written previously about this kind of thing, this is not a good design pattern. Using the VARBINARY(MAX) data type with COMPRESS in the INSERT/UPDATE queries — and DECOMPRESS in the SELECT queries — is a much better design pattern and dramatically reduces the amount of data transferred over the network. Additionally, SQL Server needs significantly less space to store, maintain, and back up this compressed data.
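For reference, the pattern looks roughly like this; the table and payload below are made up for illustration, not taken from Randolph's post. COMPRESS gzips the JSON into VARBINARY(MAX) on the way in, and DECOMPRESS plus a cast back to NVARCHAR(MAX) restores it on the way out.

```sql
-- Hypothetical table: store the JSON payload compressed instead of as NVARCHAR(MAX).
CREATE TABLE dbo.OrderPayload
(
    OrderID            int NOT NULL PRIMARY KEY,
    PayloadCompressed  varbinary(max) NOT NULL
);

-- COMPRESS gzips the JSON in the INSERT...
INSERT dbo.OrderPayload (OrderID, PayloadCompressed)
VALUES (1, COMPRESS(N'{"customer":"Contoso","lines":[{"sku":"A-100","qty":3}]}'));

-- ...and DECOMPRESS (plus a cast) restores it in the SELECT.
SELECT OrderID,
       CAST(DECOMPRESS(PayloadCompressed) AS nvarchar(max)) AS Payload
FROM dbo.OrderPayload;
```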

Read on to see the likely benefits of doing this. I’d say that if your main purpose in storing the JSON is just to pass a blob back and forth, then yes, do compress. If you’re frequently shredding these sorts of large documents within SQL Server…well, it’s probably time for a better data model.


SQL Server Monitoring via Zabbix

Reitse Eskens digs into using Zabbix to monitor SQL Server:

In one of the projects I’m working on, we needed to have some sort of monitoring solution for SQL Server, but there wasn’t budget for a commercial monitoring solution. There’s a small number of freeware, open-source solutions, but these are all difficult to get working. In this blog I’ll show you what Zabbix has on offer as a default and what you can add yourself.
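As one example of the "add it yourself" side, a custom Zabbix item can simply poll a DMV query like the following and graph the result as a numeric value. This is a generic sketch rather than anything from Reitse's post.

```sql
-- Page Life Expectancy from the performance-counter DMV: a common candidate
-- for a hand-rolled monitoring check when there's no commercial tooling.
SELECT cntr_value AS page_life_expectancy_seconds
FROM sys.dm_os_performance_counters
WHERE counter_name = N'Page life expectancy'
  AND object_name LIKE N'%Buffer Manager%';
```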

I’m not the biggest fan of Zabbix, but if it’s what you have, it’s better to use the tools available than to go without monitoring.


Optimizing String Split and Search

Daniel Hutmacher needs things to go faster:

One of the things that sp_ctrl3 does is plaintext database search. If you pass a string to the procedure that does not match an existing object, it’ll just perform a plaintext search of all SQL modules (procedures, views, triggers, etc.) for that string. The search result includes line numbers for each result, so it needs to split each module into lines.

I’ve found that this takes a very long time to run in a database with large stored procedures, so here’s how I tuned it to run faster.
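To picture the workload, a naive version of that search looks something like the sketch below (my illustration, not Daniel's tuned code). The CROSS APPLY split of every module definition into lines is exactly the part that gets expensive, and this version assumes SQL Server 2022's ordinal argument to STRING_SPLIT.

```sql
DECLARE @search nvarchar(200) = N'GETDATE()';

-- Split each module definition on line feeds so matches can report a line number.
SELECT OBJECT_SCHEMA_NAME(m.object_id) AS [schema],
       OBJECT_NAME(m.object_id)        AS [object],
       s.ordinal                        AS line_number,
       s.value                          AS line_text
FROM sys.sql_modules AS m
CROSS APPLY STRING_SPLIT(m.definition, CHAR(10), 1) AS s
WHERE s.value LIKE N'%' + @search + N'%';
```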

Read the whole thing.


Seeking SARG

Erik Darling leaves no man behind and is seeking Sarge. On day one, Erik briefs the crew:

SARGability is the in-club way of saying that a search predicate(s) can be used to seek through the key(s) of an index.
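In other words, it’s the difference between a predicate the optimizer can turn into an index seek and one it can’t. A generic illustration, using a hypothetical Users table with an index on CreationDate (not from Erik’s post):

```sql
-- Not SARGable: wrapping the indexed column in a function forces a scan,
-- because every row's CreationDate must be evaluated before the comparison.
SELECT COUNT(*) FROM dbo.Users WHERE YEAR(CreationDate) = 2021;

-- SARGable: comparing the bare column to a constant range lets the
-- optimizer seek on the CreationDate index instead.
SELECT COUNT(*)
FROM dbo.Users
WHERE CreationDate >= '20210101'
  AND CreationDate <  '20220101';
```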

On day two, the crew use the power of math to get past a blockade:

Let’s say we’re doing this to audit short questions and answers for quality.

Since SQL Server doesn’t retain any precise data about string column lengths, we don’t have an effective way to implement this search.

Worse, since the Body column is a max datatype, no expression (SARGable or not) can be pushed to the index scan.

On day three, they enter the fetid jungles of tempdb:

In all, the query runs for about 50 seconds. This can be avoided by hinting a hash join, of course, for reasons explained here.

But good luck figuring out why this thing runs for 50 seconds looking at a cached or estimated execution plan, which doesn’t show you spills or operator times.
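For reference, the join hint Erik mentions can be applied at the query level like this; the tables here are made up for illustration and this is not his actual query.

```sql
SELECT p.Id, COUNT(v.Id) AS VoteCount
FROM dbo.Posts AS p
JOIN dbo.Votes AS v
  ON v.PostId = p.Id
GROUP BY p.Id
OPTION (HASH JOIN);   -- forces hash joins for every join in the statement
```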

Stay tuned for the thrilling conclusion to Seeking SARG.


Dealing with Non-Yielding Schedulers

Sean Gallardy breaks up the party:

One of the most common items that will cause a memory dump in SQL Server is a non-yielding scheduler (generally referred to as NYS). What the heck does that mean? Why would it cause a memory dump? Is there anything that can be investigated? Good questions, let’s take a look.

Read on to learn what these are, why they’re not something you want to deal with on a regular basis, and how you can get more information about what happened out of a dump file. That information is also going to be helpful for Microsoft support staff as they diagnose and, if possible, correct the underlying issue.
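If you want a quick check on whether an instance has been producing dumps at all, this DMV query (a general-purpose aside, not from Sean's post) lists them with timestamps and file locations:

```sql
-- Recent memory dumps the instance has generated, newest first.
SELECT creation_time,
       filename,
       size_in_bytes / 1048576.0 AS size_in_mb
FROM sys.dm_server_memory_dumps
ORDER BY creation_time DESC;
```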
