Curated SQL Posts

Estimating the Likelihood of an Underdog Winning at Soccer

Holger von Jouanne-Diedrich lays out the math for us:

The Bundesliga is Germany’s primary football league. It is one of the most important football leagues in the world, broadcast on television in over 200 countries.

If you want to get your hands on a tool to forecast the result of any game (and perform some more statistical analyses), read on!

What I would like is a tool which has SC Freiburg utterly dominating Bayern. Said tool may be more mythological than scientific (or at least a copy of Football Manager and a little bit of save scumming…), but I’ll take it.

Security Breach in Cosmos DB: ChaosDB

Nir Ohfeld and Sagi Tzadik discovered a flaw in Azure Cosmos DB:

Nearly everything we do online these days runs through applications and databases in the cloud. While leaky storage buckets get a lot of attention, database exposure is the bigger risk for most companies because each one can contain millions or even billions of sensitive records. Every CISO’s nightmare is someone getting their access keys and exfiltrating gigabytes of data in one fell swoop.

So you can imagine our surprise when we were able to gain complete unrestricted access to the accounts and databases of several thousand Microsoft Azure customers, including many Fortune 500 companies. Wiz’s security research team (that’s us) constantly looks for new attack surfaces in the cloud, and two weeks ago we discovered an unprecedented breach that affects Azure’s flagship database service, Cosmos DB.

Read on for details about the attack. Microsoft has already mitigated the issue by disabling the functionality necessary to pull off the attack. H/T Ben Stegink.

Multi-Cloud Pros and Cons

James Serra lays out some of the benefits and drawbacks of using multiple cloud providers:

A discussion I have seen many companies have is if they should be single-cloud (using only one cloud company) or multi-cloud (using more than one cloud company). The three major Cloud Service Providers (CSPs) that companies use for nearly all use cases are Microsoft Azure, Amazon Web Services (AWS), and Google Cloud Platform (GCP).

Without spoiling it too much, James is not really sold on the idea.

When in Doubt, Stop Counting

Chad Callihan looks at the SET NOCOUNT ON option:

You may have a stored procedure that completes in an acceptable amount of time for the dozen or so times a day it gets called. Maybe it returns results in a few seconds and that makes the users calling it happy enough that you can move onto more pressing matters. But what about a stored procedure being called millions of times a day? The definition of acceptable can be drastically different when you consider the speed and traffic that type of stored procedure produces. When every millisecond matters, it’s worth checking to see what your setting is for SET NOCOUNT.

Click through for a demo and what you can realistically expect from SET NOCOUNT ON. The savings show up mostly in big loops. Incidentally, one pattern I like is to combine SET NOCOUNT ON with an occasional RAISERROR('%i iterations run...', 10, 1, @loopvar) WITH NOWAIT. That way, you can still see progress on the screen, but instead of getting a message after every single iteration, you might see one every 100 iterations.
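Here is a minimal sketch of that pattern, assuming a simple WHILE loop. The loop bound and the placeholder for the per-iteration work are hypothetical; only the SET NOCOUNT ON and RAISERROR ... WITH NOWAIT pieces come from the note above.

```sql
SET NOCOUNT ON;  -- suppress the "(n rows affected)" message after each statement

DECLARE @loopvar int = 1;
DECLARE @maxIterations int = 1000;  -- hypothetical loop bound

WHILE @loopvar <= @maxIterations
BEGIN
    -- ... per-iteration work goes here (hypothetical) ...

    -- Every 100th iteration, send a progress message right away.
    -- Severity 10 is informational, so this does not raise an error.
    IF @loopvar % 100 = 0
        RAISERROR('%i iterations run...', 10, 1, @loopvar) WITH NOWAIT;

    SET @loopvar += 1;
END;
```

The modulo check is what throttles the output: the loop stays quiet except for one message per 100 iterations, and WITH NOWAIT pushes each message to the client without waiting for the batch to finish.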

Paginated Reports in Power BI

Elayne Jones dives into paginated reports in Power BI:

Paginated Reports for Power BI offer pixel-perfect control over the format of each element of a report. They allow users to fine-tune each field of the report, such as text size, colors, spacing, and print layout, in a more precise way than using regular visuals in Power BI Desktop. Users can access Paginated Reports directly from workspaces in Power BI Service. Additionally, users can embed Paginated Reports directly onto a Power BI report page with the new visual option. This article will explain how to create a Paginated Report and how to configure the new Paginated Reports visual in Power BI Desktop. Please note that Paginated Reports require a Premium subscription. This tutorial is based on a fictional Sales Report.

If you’re familiar with SQL Server Reporting Services, you’ll find Power BI paginated reports simultaneously comfortable and confining—it’s much the same functionality as SSRS, but doesn’t feel as complete.

Creating a Kafka Producer and Consumer with C#

Jim Galasyn shows how to use the Confluent.Kafka NuGet package to connect to a Kafka cluster from C#:

Sometimes you’d like to write your own code for producing data to an Apache Kafka® topic and connecting to a Kafka cluster programmatically. Confluent provides client libraries for several different programming languages that make it easy to code your own Kafka clients in your favorite dev environment.

One of the most popular dev environments is .NET and Visual Studio (VS) Code. This blog post shows you step by step how to use .NET and C# to create a client application that streams Wikipedia edit events to a Kafka topic in Confluent Cloud. Also, the app consumes a materialized view from ksqlDB that aggregates edits per page. The application runs on Linux, macOS, and Windows, with no code changes.

Now, if only the .NET package supported a bunch of stuff which has come out over the past few years (the big one being Streams)… That’s no knock on the maintainers, mind you—they’ve done a good job given available resources—but it’s still unfortunate. At least there’s an unofficial implementation and hey, the original Confluent.Kafka .NET package started out as one of those too.

Compressing JSON in SQL Server

Randolph West has a recommendation:

I’ll also pre-emptively note that if this table was simply an append-only archive table, the row size would not really matter. Unfortunately, this table participates in thousands of transactions per day, and as the original developers used Entity Framework and didn’t think much of using NVARCHAR(MAX), the entire row is coming over the wire into the application each time it is queried.

As I’ve written previously about this kind of thing, this is not a good design pattern. Using the VARBINARY(MAX) data type with COMPRESS in the INSERT/UPDATE queries — and DECOMPRESS in the SELECT queries — is a much better design pattern and dramatically reduces the amount of data transferred over the network. Additionally, SQL Server needs significantly less space to store, maintain, and back up this compressed data.

Read on to see the likely benefits from doing this. I’d say that if your main purpose of storing the JSON is just to pass a blob back and forth, then yes, do compress. If you’re frequently shredding these sorts of large documents within SQL Server…well, probably time for a better data model.
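As a rough illustration of the COMPRESS and DECOMPRESS pattern described in the excerpt, here is a small sketch. The table name, column names, and sample JSON are all hypothetical and are not taken from the linked post; COMPRESS and DECOMPRESS themselves are built-in functions in SQL Server 2016 and later.

```sql
-- Hypothetical table: the JSON lives in a compressed VARBINARY(MAX) column
CREATE TABLE dbo.DocumentArchive
(
    DocumentId int IDENTITY(1,1) PRIMARY KEY,
    PayloadCompressed varbinary(max) NOT NULL
);

-- Hypothetical sample document
DECLARE @json nvarchar(max) = N'{"customer":"Contoso","items":[1,2,3]}';

-- Compress on the way in
INSERT INTO dbo.DocumentArchive (PayloadCompressed)
VALUES (COMPRESS(@json));

-- Decompress on the way out. DECOMPRESS returns VARBINARY(MAX), so cast it
-- back to the type that was originally compressed (NVARCHAR here).
SELECT
    DocumentId,
    CAST(DECOMPRESS(PayloadCompressed) AS nvarchar(max)) AS Payload,
    DATALENGTH(PayloadCompressed) AS CompressedBytes
FROM dbo.DocumentArchive;
```

A document this small may not shrink at all because of the GZIP overhead; the savings appear on larger payloads. If the application can decompress GZIP itself, it could select the raw VARBINARY column instead, so that the compressed bytes are what travel over the wire.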

SQL Server Monitoring via Zabbix

Reitse Eskens digs into using Zabbix to monitor SQL Server:

In one of the projects I’m working in, we needed to have some sort of monitoring solution on SQL Server, but there wasn’t budget for a commercial monitoring solution. There’s a small number of freeware, open-source solutions but these are all difficult to get working. In this blog I’ll show you what Zabbix has on offer as a default and what you can add yourself.

I’m not the biggest fan of Zabbix, but if it’s what you have, it’s better to use the tool you have than to go without monitoring.

Optimizing String Split and Search

Daniel Hutmacher needs things to go faster:

One of the things that sp_ctrl3 does is plaintext database search. If you pass a string to the procedure that does not match an existing object, it’ll just perform a plaintext search of all SQL modules (procedures, views, triggers, etc.) for that string. The search result includes line numbers for each result, so it needs to split each module into lines.

I’ve found that this takes a very long time to run in a database with large stored procedures, so here’s how I tuned it to run faster.

Read the whole thing.
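To make the problem concrete, here is a naive sketch of a plaintext module search with line numbers. This is not Daniel’s tuned solution and not how sp_ctrl3 is implemented; it is just one straightforward way to do it, assuming SQL Server 2022 or Azure SQL Database, where STRING_SPLIT accepts the enable_ordinal argument. The search term is hypothetical.

```sql
DECLARE @search nvarchar(200) = N'SomeTableName';  -- hypothetical search term

SELECT
    OBJECT_SCHEMA_NAME(m.object_id) AS schema_name,
    OBJECT_NAME(m.object_id) AS object_name,
    s.ordinal AS line_number,                      -- 1-based position of the line
    REPLACE(s.value, NCHAR(13), N'') AS line_text  -- drop the CR left over from CRLF
FROM sys.sql_modules AS m
-- Split each module definition on line feeds; the third argument (1)
-- turns on the ordinal column, which serves as the line number.
CROSS APPLY STRING_SPLIT(m.definition, NCHAR(10), 1) AS s
WHERE CHARINDEX(@search, s.value) > 0
ORDER BY schema_name, object_name, line_number;
```

This splits every line of every module before filtering, which is the kind of work that gets expensive in a database with large stored procedures.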
