Press "Enter" to skip to content

Curated SQL Posts

The Most Recent Issues You’ve Closed

Brent Ozar wraps up this month’s T-SQL Tuesday:

So when I’m meeting a new team and learning what they do,  I’ve found it helpful to ask, “What specifically was the last issue you closed?” Note that I don’t ask, “What are you working on now?” because that tends to lead to long-term projects that people want to do, but not necessarily what they’re paid to do. If you ask them about the last specific task they checked off, that’s usually related to something the company demands that they do because it’s urgent. It leads to fun discoveries about what people think they do, versus why managers really keep them around on the payroll.

Click through for this month’s list of respondents.

Comments closed

Combining Kafka and Flink

Gautam Goswami shares some thoughts:

In short, the process of collecting data in real-time as streams of events from event sources such as databases, sensors, and software applications is known as event streaming. With real-time data processing and analytics in mind, Apache Flink is a potent open-source program. For situations where quick insights and minimal processing latency are critical, it offers a consistent and effective platform for managing continuous streams of data. 

I’ve found it interesting that Confluent people have spent a lot of time the past several months talking up Apache Flink and Kafka+Flink combinations.

Comments closed

Checking for the Existence of Multiple Values in SQL

Lukas Eder does a performance test:

But what if you want to check if there are at least 2 (or N) rows? In that case, you cannot use EXISTS, but have to revert to using COUNT(*). However, instead of just counting all matches, why not add a LIMIT clause as well? So, if you want to check if actors called WAHLBERG have played in at least 2 films, instead of this:

Lukas compares performance between two query options in Postgres, Oracle, SQL Server, and MySQL. Because Oracle has weird behavior in the test, Lukas shares a third option that works well for it.

Comments closed

Feature Toggles for Databases

Phil Factor discusses feature toggles for the database:

In software development the concept of feature toggles are used to selectively turn on and off features. They are, for example, used to restrict some newly introduced features to a select group to see how these features work. While this concept has been long used for user-facing application code, it is also a practice that is useful for database code.

In databases, feature toggles are used for several loosely related purposes. One common use is to separate, or decouple, rollout of database features from deployment of new versions of the application code. This allows developers to release software faster and with less risk.

The idea of using roles to determine whether a given user should see a particular feature makes sense, though I was a bit confused at first because I was thinking about feature flags: behavioral changes you slowly turn on for groups of people. This is a bit different.

Comments closed

Notes on Data Engineering in Microsoft Fabric

John Miner shares some notes. Part 1 looks at getting started and tables, both managed and unmanaged:

The architectural diagram shows how information flows from a source system, into a delta lake house, transformed by programs, and used by end users. To get source data into the lake, we can use any of the three methods to retrieve the data as files: pipelines – traditional Azure Data Factory components, dataflows – wrangling data flows based on Power Query and shortcuts – the ability to link external storage to the lake. Once the data is in the lake, there are two types of programs that can transform the data files: spark notebooks and data flows.

Part 2 covers file and folder management:

In practice, I have seen an additional quality zone called raw be used to stage files in their native format before converting to a delta file format. Please note, the lake house uses either shortcuts or pipelines to get files into the lake. We will talk more about bronze, silver and gold zones when I cover full and incremental loading later in this article.

Read on for John’s thoughts.

Comments closed

Environmental Deployment in Microsoft Fabric

Kevin Chant takes us through deployment pipelines in Microsoft Fabric:

One question that I get frequently asked is how many workspaces are required? In reality, the answer is that it depends.

However, if you want your solution to be flexible and loosely coupled I do recommend at the very least one Microsoft Fabric workspace per environment.

That’s also required if you’re using deployment pipelines, as each stage in the pipeline pushes to a unique workspace.

Comments closed

The apply() Functions in R

Steven Sanderson talks about a series of functions:

Welcome, fellow R warriors! Today, we delve into the heart of vectorized operations with R’s “apply” family: apply()lapply()sapply(), and tapply(). These functions are your secret weapons for efficiency and elegance, so buckle up and prepare to be amazed!

But first, the “why”: Loops are great, but for repetitive tasks on data structures, vectorization reigns supreme. It’s faster, cleaner, and lets you focus on the “what” instead of the “how” of your analysis. Enter the apply family, each member offering a unique twist on applying functions to your data.

The trickiest part about the apply() series is remembering which one does what. This is where purrr’s map() function does a better job, I think.

Comments closed

Maximizing the Savings for SQL Developers

Bob Ward is speaking my language:

Whether you build applications for SQL Server on-premises or in Azure, there are several options for you to develop or test for free, or with substantial cost savings. Some of these options you may be familiar with, but this post will help explain details and answer some questions Microsoft often gets to provide you a clear story.

The part in particular I want to focus on is Azure Dev/Test pricing. I give a talk on saving money in the cloud and this is one of the bits most people know very little about. With a Dev/Test subscription in Azure (not just a regular subscription that you call “dev” or “test”!), you can avoid Windows and SQL Server licensing fees, get discounts on certain services like Azure Application Services, and do a bit more with it. It’s something most people don’t know about and the documentation is a little confusing, but Bob does a good job of clarifying things.

Comments closed

Enabling SQL Agent after Upgrading from Express Edition

Warwick Rudd ponies up the cash:

With the above limitations, this product can provide all of the functionality for your application to operate as required. Because there is no SQL Agent service in this edition this makes it more difficult for you to implement scheduled database maintenance. We are not going to discuss ways to implement database maintenance on an Express edition of SQL Server in this post.

But at some point in time you may find yourself in the position that you need to perform an upgrade of editions. This may or may not include and upgrade of versions. An upgrade to the version can be achieved and can bring benefits of its own. Normally, I would not be writing about and recommending performing in-place upgrades. However, sometimes due to restrictions of vendor applications it is a requirement to perform an in-place upgrade. 

Read on to see what you’d need to do, knowing that Express Edition does not come with a SQL Server Agent and does not perform the appropriate setup for one.

Comments closed