Press "Enter" to skip to content

Curated SQL Posts

Refreshing a View in SQL Server

Chad Callihan hits the reset button:

I recently encountered a question related to views: what happens when you make a change to the table that a view is based on? For example, if you change a column from VARCHAR(8) to VARCHAR(20), how does the view handle that change? You might expect the view to update, but it won’t do it on its own. You have to refresh the view.

Let’s look at a mocked up example.

Click through for that example. You’d think it could do so on its own, but nope. I suppose the reason you can’t is probably related to linked server and external object references in views, where that remote resource can change schema and SQL Server wouldn’t know about it. Perhaps also the idea that a person may be authorized to change a table in one database or schema but shouldn’t be authorized to modify (even to refresh?) a particular view that references the table.

Comments closed

Regex Support in Azure SQL DB

Abhiman Tiwari has a big announcement:

We are pleased to announce the private preview of regular expressions (regex) support in Azure SQL Database. Regex is a powerful tool that allows you to search, manipulate, and validate text data in flexible ways. With regex support, you can enhance your SQL queries with pattern matching, extraction, replacement, and more. You can also combine them with other SQL functions and operators to create complex expressions and logic.

This is something I’ve wanted to see in SQL Server for years, and I’m excited that there’s official support now. Prior to that, you could use SQL# to perform some regular expression operations using the CLR, but as long as performance is reasonable on these, it’s a huge feature to include.

Comments closed

Sorting by Large Columns in SQL Server

Aaron Bertrand has a clever trick:

In the most basic cases, we can implement this functionality in SQL Server by using OFFSET/FETCH. The problem is that anything that uses TOP or OFFSET will potentially have to scan everything in the index up until the page requested, which means that queries become slower and slower for higher page numbers. To achieve anything close to linear performance, you need to have a narrow, covering index for each sort option, or use columnstore as Erik Darling recommends here, or concede that some searches are just going to be slow. Throw in additional filtering, pulling data from other tables, and letting users dictate any sort order they want, and it becomes that much harder to tune for all cases.

I have a lot that I want to say about paging, and I will follow up with more content soon. Several years ago, I wrote about some ways to reduce the pain here, and it is long overdue for a refresh. For today’s post, though, I wanted to talk specifically about pagination when you have to order by large values. By “large” I mean any data type that can’t fit in an index key, like nvarchar(4000) or, really, anything that can’t lead in an index and/or would push the key past 1,700 bytes.

Read on for the scenario and how it all works.

Comments closed

An Overview of Postgres’s Planner

Cary Huang digs into one phase of Postgres query processing:

When you send a query to PostgreSQL, it normally would go through stages of query processing and return you the results at the end. These stages are known as:

  • Parse
  • Analyze
  • Rewrite
  • Plan
  • Execute

I wrote another blog to briefly explain the responsibility of each query processing stage. You can find it here. In this blog, we will only focus on the “plan” stage or the “planner” module as this is perhaps the most interesting or complex stage if you will. I will share my understanding of the planner module as I investigate its internal workings to handle a simple sequential scan. This will be based on PostgreSQL 16.

Read on to learn what the planner does and how it works, at a high level.

Comments closed

ggbrick in CRAN

Dan Oehm notes another brick in the wall:

If you’re looking for something a little different, ggbrick creates a ‘waffle’ style chart with the aesthetic of a brick wall. The usage is similar to geom_col where you supply counts as the height of the bar and a fill for a stacked bar. Each whole brick represents 1 unit. Two half bricks equal one whole brick.

It has been available on Git for a while, but recently I’ve made some changes and it now has CRAN’s tick of approval.

Click through to see how you can use it. This style of waffle chart, in the right scenario, can be quite useful, providing a high-level view and also giving you some idea of fine-grained magnitudes. H/T R-Bloggers.

Comments closed

Mock Tests for R Packages

Maelle Salmon does a bit of mocking:

This blog featured a post on mocking, the art of replacing a function with whatever fake we need for testing, years ago. Since then, we’ve entered a new decade, the second edition of Hadley Wickham’s and Jenny Bryan’s R packages book was published, and mocking returned to testthat, so it’s time for a new take/resources roundup!

Click through to see how you can create mocks in R as well as some practical examples of mocks in action.

Comments closed

Copilot in Azure SQL Database

Joe Sack gets in the cockpit:

We are excited to announce that Copilot in Azure SQL Database is now in limited public preview! Copilot in Azure SQL Database experiences are designed to streamline the design, operation, optimization, and health of Azure SQL Database-driven applications. It improves productivity in the Azure portal by offering natural language to SQL conversion and self-help for database administration.

Click through to see what you can do with it right now. I’d expect there will be more capabilities over time, too.

Comments closed

An Example of an MD5 Hash Collision

John Cook shares an example of a hash collision:

Marc Stevens gave an example of two alphanumeric strings that differ in only one byte that have the same MD5 hash value. It may seem like beating a dead horse to demonstrate weaknesses in MD5, but it’s instructive to study the flaws of broken methods. And despite the fact that MD5 has been broken for years, lawyers still use it.

Click through for the example.

Comments closed

Postgres Licensing in Light of Redis

Jonathan Katz share some thoughts:

(Disclosure: I’m on the PostgreSQL Core Team, but what’s written in this post are my personal views and not official project statements…unless I link to something that’s an official project statement 😉

I was very sad to learn today that the Redis project will no longer be released under an open source license. Sad for two reasons: as a longtime Redis user and pretty early adopter, and as an open source contributor. I’ll preface that I’m empathetic to the challenges of building businesses around open source, having been on multiple sides of this equation. I’m also cognizant of the downstream effects of these changes that can completely flip how a user adopts and uses a piece of technology.

Read on to learn about why Jonathan believes (with good reason) Postgres’s license is a lot more sticky.

Comments closed