
Month: April 2023

A Thought on Query Granularity

Chris Johnson shares some thoughts:

This query is pretty simple: we want to return a set of data about the orders placed, including the total cost of each order. However, the orders in the database are split into an Orders table that holds the order-level information and an OrderLines table that holds information on each product ordered in each order, and we need that OrderLines table to get the cost.

So we write our query, and we join to the OrderLines table, and we sum the cost. But then, because we have an aggregate, we need to group by every other column we’re returning. And to me that just doesn’t look right, and it doesn’t convey the intention of the query properly.

In Chris’s simple example, I’m not sure I’d push it very much, but Chris does have a good point in terms of explaining query intent. Also, depending on how many order lines there are relative to orders (the next step in the chain for that query), aggregation in a common table expression could be faster than waiting until after the join to aggregate on all of the columns. In reality, that’s the most likely reason I’d make this change, assuming that it made a big enough performance difference. But if you take a much more complicated query of this sort, then I’d be more amenable to the argument.
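
To make the two shapes concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for SQL Server. The column names (CustomerName, OrderDate, Quantity, UnitPrice) and the sample rows are my own assumptions for illustration, not Chris's schema. Both queries should return the same rows; the difference is where the aggregation happens.

```python
import sqlite3

# Hypothetical schema and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Orders (OrderID INTEGER PRIMARY KEY, CustomerName TEXT, OrderDate TEXT);
    CREATE TABLE OrderLines (OrderID INTEGER, Quantity INTEGER, UnitPrice REAL);
    INSERT INTO Orders VALUES (1, 'Ann', '2023-04-01'), (2, 'Bob', '2023-04-02');
    INSERT INTO OrderLines VALUES (1, 2, 5.0), (1, 1, 10.0), (2, 3, 4.0);
""")

# Aggregate after the join: every non-aggregated column lands in GROUP BY.
group_by_all = conn.execute("""
    SELECT o.OrderID, o.CustomerName, o.OrderDate,
           SUM(ol.Quantity * ol.UnitPrice) AS TotalCost
    FROM Orders AS o
    JOIN OrderLines AS ol ON ol.OrderID = o.OrderID
    GROUP BY o.OrderID, o.CustomerName, o.OrderDate
    ORDER BY o.OrderID
""").fetchall()

# Aggregate first, at order granularity, in a CTE; then join one row per order.
cte_first = conn.execute("""
    WITH OrderTotals AS (
        SELECT OrderID, SUM(Quantity * UnitPrice) AS TotalCost
        FROM OrderLines
        GROUP BY OrderID
    )
    SELECT o.OrderID, o.CustomerName, o.OrderDate, ot.TotalCost
    FROM Orders AS o
    JOIN OrderTotals AS ot ON ot.OrderID = o.OrderID
    ORDER BY o.OrderID
""").fetchall()

print(group_by_all == cte_first)
```

Whether the CTE version is actually faster depends on the optimizer and the data, so I'd treat this as a way to compare the two forms rather than as a performance claim.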

Comments closed

Getting Last-Used Report Parameters for SSRS

Brian Bønk shares some history:

A couple of years back I needed to find the latest used parameters from a Reporting Services (SSRS, and yes, it is old!) report. The functionality was to find the latest used values for all parameters dynamically and use them as the default values for each parameter in the report when the user opened the report. If the user had not used the report before, it should ask for the parameters.

This was a lot of fun for me to do, as I needed to query the internal tables from the Reporting Services database. The tables ExecutionLogStorage and Catalog were in play, because I needed the active user’s latest used parameter values for each specific report, at runtime, when the report was opened.

Click through to see how Brian did it.
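
As a rough sketch of the general idea, and not Brian's actual solution, the lookup amounts to "latest log row for this user and this report." The example below mocks the two tables in SQLite. The column names (ItemID, Name, ReportID, UserName, Parameters, TimeStart), the query-string format of Parameters, and the helper name last_used_parameters are all assumptions on my part.

```python
import sqlite3
from urllib.parse import parse_qs

# Mock tables; column names and the Parameters format are assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Catalog (ItemID TEXT PRIMARY KEY, Name TEXT);
    CREATE TABLE ExecutionLogStorage (
        ReportID TEXT, UserName TEXT, Parameters TEXT, TimeStart TEXT);
    INSERT INTO Catalog VALUES ('r1', 'Sales Report');
    INSERT INTO ExecutionLogStorage VALUES
        ('r1', 'alice', 'Region=West&Year=2022', '2023-04-01T09:00:00'),
        ('r1', 'alice', 'Region=East&Year=2023', '2023-04-20T14:30:00'),
        ('r1', 'bob',   'Region=North&Year=2021', '2023-04-10T08:00:00');
""")

def last_used_parameters(conn, report_name, user_name):
    """Return the most recent parameter values for a user and report, or None."""
    row = conn.execute("""
        SELECT els.Parameters
        FROM ExecutionLogStorage AS els
        JOIN Catalog AS c ON c.ItemID = els.ReportID
        WHERE c.Name = ? AND els.UserName = ?
        ORDER BY els.TimeStart DESC
        LIMIT 1
    """, (report_name, user_name)).fetchone()
    if row is None:
        return None  # no history: the report should prompt for parameters
    # Each parameter maps to a list so multi-value parameters survive.
    return parse_qs(row[0] or "")

print(last_used_parameters(conn, "Sales Report", "alice"))
```

A user with no log rows gets None back, which corresponds to the "ask for the parameters" case in the excerpt.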


Building an Azure Stream Analytics Query

Alex Lin takes us through the process:

As a developer, your journey with Azure Stream Analytics (ASA) can be divided into several stages, each with its own set of challenges and requirements. In this blog post, we’ll walk you through the typical developer journey in ASA, from the initial setup to production deployment. Along the way, we’ll explore the various development tools and best practices that will help you build a Stream Analytics job. 

Click through for the demonstration.


Spatial Data and GIS in DuckDB

Mark Litwintschik tries out a new extension to DuckDB:

DuckDB’s Spatial extension is made up of ~7K lines of C++ excluding its dependencies.

There are three major GIS projects this extension uses to provide its functionality. The first is GDAL, a 1.7M-line C++ project that provides file conversion functionality for both raster and vector-based GIS file formats. The second is PROJ, a 150K-line C++ project that provides coordinate transformations. The third is GEOS, a 150K-line C++ project that provides geometry manipulation functionality.

All of these dependency projects were started more than 20 years ago, have been extensively battle-tested and seem to underpin almost every GIS project and service I’ve come across.

In this post, I’ll walk through some example GIS workflows with the DuckDB Spatial extension.

Click through for a dive into what the extension can do.


In-Place Upgrades or New Builds?

Ben DeBow asks the tough questions:

An in-place upgrade involves upgrading the existing SQL Server to a newer version without creating a new instance. In other words, the new version of SQL Server is installed on the same server as the existing version, and the databases are upgraded to the new version.

On the other hand, a side-by-side upgrade involves creating a new instance of SQL Server and migrating the databases to the new instance. This approach requires more planning and preparation than an in-place upgrade but can be less risky.

Click through for the comparison. Historically, I’ve always been on Team New Build, though more recent versions of SQL Server do a better job of in-place upgrades, to the point where I’m fairly comfortable suggesting one to a client.


Cost Optimization with Azure SQL Managed Instance

Niko Neugebauer makes a few recommendations:

Optimizing costs in SQL Managed Instance isn’t a one-size-fits-all situation—there are several cost-saving benefits and capabilities to take advantage of depending on your unique business needs. By taking advantage of these benefits, or better yet, using two or more simultaneously, you can achieve significant cost savings while optimizing your business operations.

Click through for the list. None of it is earth-shattering, but they are still things to keep in mind along the way.


xp_cmdshell Is Not the Problem

David Fowler gives it to us straight:

xp_cmdshell is an extended SQL stored proc that allows users to run Windows command prompt commands from within SQL. Sound scary? It might, but is xp_cmdshell really a security risk?

Well, a lot of people think so. Many DBAs and IT departments will insist that it’s always disabled, and many auditors and pen testers will raise it as a significant vulnerability if they see it enabled on any of your SQL Servers.

But is it really that much of a security risk?

Click through for David’s thoughts, which match my own quite well here. Either xp_cmdshell is not the problem because you explicitly needed to make bad decisions in order for it to hurt you, or xp_cmdshell is not the problem because a bad person got access to a sysadmin account and hurt you. In neither case was xp_cmdshell the proximate cause.


Manual Halloween Problem Protection

Jared Poche takes us through Halloween problem protection and builds out his own method, with blackjack and hookers:

Well, if SQL Server is trying to separate the read from the write, why don’t I just do that myself? I had the idea to read the data I needed in an INSERT…SELECT statement, writing into a memory-optimized table variable (motv). I could make sure the read included all the columns I need to calculate the new value, including the CASE statement I had in the SET clause of the UPDATE.

Click through for Jared’s explanation of how to implement it, as well as the circumstances in which it might be faster than what you get by letting a single T-SQL statement handle the job.
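
The read-then-write split can be sketched roughly as follows. This is an illustration in SQLite via Python, not Jared's code: a temp table stands in for the memory-optimized table variable, and the Accounts table, its columns, and the adjustment rules are hypothetical.

```python
import sqlite3

# Hypothetical table and rules, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Accounts (AccountID INTEGER PRIMARY KEY, Balance INTEGER, Tier TEXT);
    INSERT INTO Accounts VALUES (1, 100, 'gold'), (2, 200, 'silver'), (3, 50, 'gold');
    -- Staging table standing in for the memory-optimized table variable.
    CREATE TEMP TABLE Staged (AccountID INTEGER PRIMARY KEY, NewBalance INTEGER);
""")

# Read phase: decide which rows qualify and compute each new value up front,
# including the CASE logic that would otherwise sit in the UPDATE's SET clause.
conn.execute("""
    INSERT INTO Staged (AccountID, NewBalance)
    SELECT AccountID,
           CASE WHEN Tier = 'gold' THEN Balance + 10 ELSE Balance + 5 END
    FROM Accounts
    WHERE Balance >= 100
""")

# Write phase: apply the staged values by key. Which rows change was already
# fixed by the read phase, so the predicate is not re-evaluated against new data.
conn.execute("""
    UPDATE Accounts
    SET Balance = (SELECT NewBalance FROM Staged
                   WHERE Staged.AccountID = Accounts.AccountID)
    WHERE AccountID IN (SELECT AccountID FROM Staged)
""")
conn.commit()

balances = conn.execute(
    "SELECT AccountID, Balance FROM Accounts ORDER BY AccountID"
).fetchall()
print(balances)
```

SQLite handles this hazard internally on its own, so the sketch only shows the shape of the pattern; whether it beats a single UPDATE on SQL Server is the question Jared's post addresses.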


Security Replication in Power BI

Marc Lelijveld and Vytautas Kraujalis lock things down, over and over and over:

Imagine you have everything set up and well secured in your data lakehouse, or data warehouse if you will. Then a user starts consuming the data in Power BI and imports all data according to that user’s permissions. Once the data is imported, it can easily be shared with others who might have different permissions on the same dataset. Potentially, this leads to a breach of data to people who should not have accessed this data at all. Ideally, you replicate the security from the source into Power BI, but without setting up everything by hand.

In this blog post, you will learn how you can automate the replication of security from source to your Power BI data model in just a few steps. A blog post based on an actual client case and written by Vytautas Kraujalis and myself.

Click through for an explanation and a link to the GitHub repo containing all of the scripts.


Finding the Max (or Min) Value across Columns

Greg Dodd tries out some new syntax:

Have you ever had 2 columns in a table that you need to select the maximum value from? Not the maximum value from each column, but the biggest value after comparing the columns?

Usually where I have this problem, I’d use a CASE statement, but there’s another way. 

I do like GREATEST() and LEAST() (or argmax/argmin if you’re used to those terms), though Greg does include how you can get the same functionality in versions prior to SQL Server 2022.
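
Here is a small sketch comparing the CASE approach with a built-in function. SQLite has no GREATEST() or LEAST(), so its multi-argument max() and min() stand in for them here; the Readings table and its columns are assumptions for illustration. NULL handling can differ between engines, so the sample data deliberately avoids NULLs.

```python
import sqlite3

# Hypothetical table and data, invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Readings (ID INTEGER PRIMARY KEY, ValueA INTEGER, ValueB INTEGER);
    INSERT INTO Readings VALUES (1, 10, 20), (2, 7, 3), (3, 5, 5);
""")

rows = conn.execute("""
    SELECT ID,
           -- The traditional approach: compare the columns by hand.
           CASE WHEN ValueA >= ValueB THEN ValueA ELSE ValueB END AS BiggestViaCase,
           -- SQLite's scalar max()/min(), standing in for GREATEST()/LEAST().
           max(ValueA, ValueB) AS BiggestViaFunction,
           min(ValueA, ValueB) AS SmallestViaFunction
    FROM Readings
    ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

With two columns the CASE expression is manageable, but it grows quickly as columns are added, which is where a function that takes a list of arguments earns its keep.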
