Category: Syntax

Thoughts on R’s New Pipe

John Mount has thoughts on the upcoming pipe operator in R:

There is a current active discussion on this prototype and some interesting points come up. Note the current proposal appears to disallow a |> f -> f(a), a currently popular transform.

1. This is a language feature presented as a soon-to-be-user-visible prototype, not an RFC.
2. Some are objecting to the term “pipe.”
3. Some call this sort of pipe function composition.
4. It is noticed that this sort of substitution is generally thought of as a “macro.”
5. There is a claim the proposed pipe seems to violate the beta-reduction rule of the lambda calculus: variables should be substitutable for values.

Read on for John’s take on this. I particularly appreciate his response to point number 2: other functional languages have pipes (in fact, |> is the F# pipe operator). Pipes are not unique to UNIX. John has a lot of interesting comments, so check them out.

First Thoughts on Amazon Babelfish

Ryan Booz shares some first thoughts on Amazon’s Babelfish offering:

The impetus for creating the tool is clear for AWS. Provide a way for customers to easily connect a SQL Server app to Aurora Postgres, saving big on licensing fees and reducing total cost of ownership. Assuming the tool is successful at some level, I’m sure it will provide a revenue boost for Amazon and some customers might (initially) feel a win. No harm, no foul on Amazon for leading the effort. Free markets, baby!

No matter how clever Babelfish is, however, I just can’t see how this is ultimately a win for SQL Server or PostgreSQL… or the developers that will ultimately need to support these “hybrid” apps.

I think Ryan makes good points and hits upon the crux of the problem. I’d add a secondary problem that Ryan hints at: a query may be sufficiently fast in one database variant but perform horridly in another. A classic example is taking a solution built on cursors in Oracle and bringing it to T-SQL.

Aggregate Functions in SQL Server

Hugo Kornelis takes us through the concept of aggregate functions:

SQL Server currently supports three operators that can compute aggregations: Hash Match, Stream Aggregate, and Window Aggregate. These operators all use the same basic principle of maintaining internal counters as rows are processed, so that the final value of those internal counters is the expected value.

Read on to see the full list, as well as how they operate.
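
To see two of these operators side by side, here is a minimal sketch that is not taken from Hugo's post. The table variable @Sales and its columns are made up for illustration. The HASH GROUP and ORDER GROUP query hints ask the optimizer for a hash-based or an order-based aggregation; both queries should return the same values, and comparing the actual execution plans would show which operator did the work.

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @Sales TABLE (
    Region varchar(10)    NOT NULL,
    Amount decimal(10, 2) NOT NULL
);

INSERT INTO @Sales (Region, Amount)
VALUES ('East', 10.00), ('East', 30.00), ('West', 5.00);

-- Request a hash-based aggregation (typically a Hash Match operator).
SELECT Region,
       COUNT(*)    AS RowCnt,
       SUM(Amount) AS Total,
       AVG(Amount) AS Average
FROM @Sales
GROUP BY Region
OPTION (HASH GROUP);

-- Request an order-based aggregation (typically a Sort feeding a Stream Aggregate).
SELECT Region,
       COUNT(*)    AS RowCnt,
       SUM(Amount) AS Total,
       AVG(Amount) AS Average
FROM @Sales
GROUP BY Region
OPTION (ORDER GROUP);
```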

JSON Basics with SQL Server

Steve Jones takes us through querying straightforward JSON data in SQL Server:

Recently I saw Jason Horner do a presentation on JSON at a user group meeting. I’ve lightly looked at JSON in some detail, and I decided to experiment with this.

All in all, I’ve been pretty happy with the syntax for JSON manipulation in T-SQL. I’m not the biggest user of JSON around, but when I’ve needed to slice or build JSON, even when I needed to build it in a certain way to emulate an old application, it has worked for me.
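
For a flavor of that syntax, here is a small sketch of my own rather than an example from Steve's post. The JSON document and column aliases are invented; JSON_VALUE, OPENJSON, and FOR JSON PATH are the built-in pieces, and OPENJSON assumes a database compatibility level of 130 or higher.

```sql
-- Hypothetical JSON document for illustration.
DECLARE @json nvarchar(max) = N'{
  "name": "Widget",
  "price": 9.99,
  "tags": ["blue", "small"]
}';

-- JSON_VALUE pulls out a single scalar value as text; cast it when a type matters.
SELECT JSON_VALUE(@json, '$.name') AS ProductName,
       CAST(JSON_VALUE(@json, '$.price') AS decimal(10, 2)) AS Price;

-- OPENJSON shreds an array (or object) into rows with key, value, and type columns.
SELECT [key]   AS Position,
       [value] AS Tag
FROM OPENJSON(@json, '$.tags');

-- FOR JSON PATH goes the other way and builds JSON from a result set.
SELECT 'Widget' AS name,
       9.99     AS price
FOR JSON PATH;
```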

Reducing CTE Re-Scans with APPLY

Daniel Hutmacher shows another good use of the APPLY operator:

You can tell by the plan why this is an inefficient query: the SQL expression in the common table expression is executed once for every time that it’s referenced in the code.

Better living through CROSS APPLY

You could store the results of the CTE in a temp table, but where’s the fun in that? Instead, why not use the CTE once, and then return four rows for each row that the CTE spits out? That’s exactly what CROSS APPLY does.

Read the whole thing and appreciate all the more the nice things you can do with APPLY.
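
A rough sketch of the pattern, using made-up data rather than Daniel's actual query: the CTE Totals is referenced only once, and CROSS APPLY over a VALUES list turns each of its rows into four rows. The alternative of four SELECT statements glued together with UNION ALL would reference the CTE four times. The names @Orders, Totals, and QuarterNo are assumptions for illustration.

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @Orders TABLE (
    CustomerId int   NOT NULL,
    Q1         money NOT NULL,
    Q2         money NOT NULL,
    Q3         money NOT NULL,
    Q4         money NOT NULL
);

INSERT INTO @Orders (CustomerId, Q1, Q2, Q3, Q4)
VALUES (1, 10, 20, 30, 40),
       (1,  5,  5,  5,  5),
       (2,  7,  0,  3,  9);

WITH Totals AS (
    SELECT CustomerId,
           SUM(Q1) AS Q1, SUM(Q2) AS Q2, SUM(Q3) AS Q3, SUM(Q4) AS Q4
    FROM @Orders
    GROUP BY CustomerId
)
-- One reference to the CTE; each CTE row fans out into four rows.
SELECT t.CustomerId, x.QuarterNo, x.Amount
FROM Totals AS t
CROSS APPLY (VALUES (1, t.Q1), (2, t.Q2), (3, t.Q3), (4, t.Q4)) AS x (QuarterNo, Amount);
```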

GREATEST and LEAST in Azure SQL Database

Arun Sirpal shows off some missing functionality in SQL Server:

Being in the cloud does have many benefits, from lower administration to fast scaling but another “side effect” of operating in Azure SQL Database is the cloud first nature of changes. By this I basically mean new features always get pushed to Azure first before the classic on-premises version so some gems come to light.

Here is one for you. Have you ever wanted MySQL’s functionality to apply LEAST() and GREATEST() to a set of arguments? Well, you can now, in Azure.

I can’t say that I would use this every day or anything, but I have felt the pain of not having it. There are workarounds, though nothing as convenient as built-in syntax. Hopefully this shows up on-prem in the next version of SQL Server.
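
Here is a sketch of both approaches, with invented data (@Scores and its columns are hypothetical, not from Arun's post). The first query assumes a platform where GREATEST and LEAST are available; the second is one common workaround that takes MAX and MIN over a VALUES list inside CROSS APPLY.

```sql
-- Hypothetical sample data; names are illustrative only.
DECLARE @Scores TABLE (
    PlayerId int NOT NULL,
    Round1   int NOT NULL,
    Round2   int NOT NULL,
    Round3   int NOT NULL
);

INSERT INTO @Scores (PlayerId, Round1, Round2, Round3)
VALUES (1, 70, 85, 60),
       (2, 90, 40, 75);

-- Where GREATEST and LEAST are supported (remove this query elsewhere).
SELECT PlayerId,
       GREATEST(Round1, Round2, Round3) AS Best,
       LEAST(Round1, Round2, Round3)    AS Worst
FROM @Scores;

-- Workaround: aggregate over the columns presented as rows.
SELECT s.PlayerId, v.Best, v.Worst
FROM @Scores AS s
CROSS APPLY (
    SELECT MAX(r.Score) AS Best,
           MIN(r.Score) AS Worst
    FROM (VALUES (s.Round1), (s.Round2), (s.Round3)) AS r (Score)
) AS v;
```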

Uncommon SQL Tricks

Shane O’Neill has a bandolier of SQL tricks to show off:

Recently the DBA Team Lead and I were reviewing some SQL code, and we came across some SQL that neither of us had frequently encountered before. This led to a brief watercooler moment where we shared some uncommon SQL that we had seen. Perfect blog post material, I think.

I had previously learned about ODBC date functions from Shane and also learned about CURRENT in this post, so check it out.

Using the READPAST Query Hint

Rajendra Gupta looks at a lesser-used query hint:

If we specify the READPAST hint in the SQL queries, the database engine ignores the rows locked by other transactions while reading data. Suppose you have a transaction that blocked a few rows in a table for updating the information in those rows. Now, if another user starts a transaction and specifies the READPAST query hint, the query engine ignores these rows and returns the remaining rows satisfying the data requirement of the query. It might return incorrect data as well.

There are some very limited uses for this hint, though they are out there.
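
One of those uses is a table acting as a work queue, where several workers each want the next unclaimed row without waiting on one another. The sketch below is my own and not from Rajendra's article; dbo.WorkQueue and its columns are hypothetical, and it assumes the default READ COMMITTED isolation level with row-level locks.

```sql
-- Hypothetical queue table; names are illustrative only.
CREATE TABLE dbo.WorkQueue (
    QueueId int IDENTITY(1, 1) PRIMARY KEY,
    Payload nvarchar(200) NOT NULL,
    Status  varchar(20)   NOT NULL DEFAULT ('Pending')
);

INSERT INTO dbo.WorkQueue (Payload)
VALUES (N'job A'), (N'job B'), (N'job C');

-- Each worker runs this. READPAST skips rows another worker has locked,
-- UPDLOCK claims the chosen row, and ROWLOCK asks for row-level locking.
BEGIN TRANSACTION;

WITH NextItem AS (
    SELECT TOP (1) QueueId, Payload, Status
    FROM dbo.WorkQueue WITH (READPAST, UPDLOCK, ROWLOCK)
    WHERE Status = 'Pending'
    ORDER BY QueueId
)
UPDATE NextItem
SET Status = 'Processing'
OUTPUT inserted.QueueId, inserted.Payload;

COMMIT TRANSACTION;
```

Skipping locked rows is the desired behavior here. In a report or a total, the same skipping would silently leave rows out, which is the incorrect-data risk the quote mentions.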

Using Hints in SQL Server

Jared Poche is flirting with the dark side:

I work on hundreds of databases with the same schema. They have different data sets and distributions, different sizes, and their statistics are going to update at different times. But if one of them chooses a bad plan, I have to push aside whatever other work to research the high CPU on database xyz.

Consistency is really valuable to me. And in this case, the answer is simple. Yes, I want to scan the fast, small memory-optimized table variable first, and use it to filter the larger, slower table. Adding a join hint or a force order to this query should keep its plan and performance consistent.

Click through for a few examples of where query hints can be useful, but also where they can fail you in unexpected ways.
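
As a rough illustration of the two options named in the quote (not Jared's actual code): @Filter below is an ordinary table variable standing in for his memory-optimized one, and dbo.Orders is a hypothetical large table assumed to exist with these columns.

```sql
-- Small filter set; an ordinary table variable used here as a stand-in.
DECLARE @Filter TABLE (CustomerId int NOT NULL PRIMARY KEY);

INSERT INTO @Filter (CustomerId)
VALUES (1), (2), (3);

-- Option 1: FORCE ORDER keeps the join order as written,
-- so the small table variable is accessed first.
SELECT o.OrderId, o.CustomerId, o.OrderDate
FROM @Filter AS f
INNER JOIN dbo.Orders AS o          -- hypothetical table
    ON o.CustomerId = f.CustomerId
OPTION (FORCE ORDER);

-- Option 2: a join hint fixes the physical join type. A local join hint
-- also enforces the written join order for the query.
SELECT o.OrderId, o.CustomerId, o.OrderDate
FROM @Filter AS f
INNER LOOP JOIN dbo.Orders AS o     -- hypothetical table
    ON o.CustomerId = f.CustomerId;
```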

Optimizing Multiple CTEs

Itzik Ben-Gan continues a series on table expressions:

Last month I explained and demonstrated that CTEs get unnested, whereas temporary tables and table variables actually persist data. I provided recommendations in terms of when it makes sense to use CTEs versus when it makes sense to use temporary objects from a query performance standpoint. But there’s another important aspect of CTE optimization, or physical processing, to consider beyond the solution’s performance—how multiple references to the CTE from an outer query are handled. It’s important to realize that if you have an outer query with multiple references to the same CTE, each gets unnested separately. If you have nondeterministic calculations in the CTE’s inner query, those calculations can have different results in the different references.

Say for instance that you invoke the SYSDATETIME function in a CTE’s inner query, creating a result column called dt. Generally, assuming no change in the inputs, a built-in function is evaluated once per query and reference, irrespective of the number of rows involved. If you refer to the CTE only once from an outer query, but interact with the dt column multiple times, all references are supposed to represent the same function evaluation and return the same values. However, if you refer to the CTE multiple times in the outer query, be it with multiple subqueries referring to the CTE or a join between multiple instances of the same CTE (say aliased as C1 and C2), the references to C1.dt and C2.dt represent different evaluations of the underlying expression and could result in different values.

Definitely worth the read.
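
A small sketch of the behavior described above, written for this post rather than copied from Itzik's article. NEWID is added alongside SYSDATETIME because two separate evaluations of SYSDATETIME can land on the same value, while two evaluations of NEWID would be expected to differ. The column aliases are my own.

```sql
-- One reference to the CTE: both columns come from the same evaluation.
WITH C AS (
    SELECT SYSDATETIME() AS dt, NEWID() AS id
)
SELECT dt AS dt1, dt AS dt2, id AS id1, id AS id2
FROM C;

-- Two references to the CTE: each is unnested separately, so C1 and C2
-- represent different evaluations and their values can differ.
WITH C AS (
    SELECT SYSDATETIME() AS dt, NEWID() AS id
)
SELECT C1.dt AS dt1, C2.dt AS dt2,
       C1.id AS id1, C2.id AS id2,
       CASE WHEN C1.id = C2.id THEN 'same' ELSE 'different' END AS IdComparison
FROM C AS C1
CROSS JOIN C AS C2;
```

If the values need to match across references, persisting the inner query's result in a temporary table or table variable first is the usual way to get a single evaluation.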
