Press "Enter" to skip to content

Category: Syntax

Join Operations in BigQuery

Rathish Kumar joins a few tables together:

SQL joins are used to combine columns from multiple tables to get desired result set. In a typical Relational model we use normalized tables, each table represents an entity (example: employee, department, etc) and its relationships and when we need to get data from more than one tables, for example employee name and employee department, we use joins to combine employee name column from employee table, department name column from department table based on employee number key column, which is available on both the tables.

Similarly, typical data warehouse setup follows Star or Snowflake schema consisting of a primary fact table and satellite dimension tables. Fact tables represents events (example: orders table in a ecommerce business) and dimension table represents attributes and slowly changing information (example: customer, product tables).

The syntax is rather similar to most database engines, though there are a few physical join operators which differ from typical relational database management systems. Also, I’ll take this moment to say thank you to Rathish for not using Venn diagrams to show joins and instead using a proper technique.

Comments closed

Rewriting Scalar UDFs and NULL Results

Erik Darling jumps out of the time machine to warn us, but we have no idea what he’s talking about so we ignore the warning and end up doing the thing he warned us not to do, causing us to need to send him into a time machine to warn us not to do it:

I think I have probably spent 500 hours of my life rewriting T-SQL Scalar UDFs to avoid all the performance problems associated with them.

The obvious choice is the Inline Table Valued Function, which has fewer performance issues baked in. For the kids out there: they don’t spill trauma.

But getting the rewrite right can be tricky, especially when it’s possible for the function to return NULL values.

Click through for the example.

Comments closed

Concatenation per Group in MySQL

Rahul Mehta asks the big questions:

In this article, we are going to discuss how to aggregate the article at a row level. MySQL provides a function called “group_concat” to perform row-level concatenation. Before we go ahead and learn how to do so, let us first understand:

  • Why do we need it?
  • Where do we need it?
  • When to use it?

Fortunately, Rahul then answers these questions; otherwise, it’d be a pretty short article. The T-SQL analog to this is STRING_AGG(), though the syntax and behavior is not precisely the same.

Comments closed

Checking XML Validity

Kevin Wilkie doesn’t like misshapen XML data:

Sometimes you’ll find that you will have XML in your database. This could be for various reasons – from storing the XML after receiving an API response to keeping it in a table because a web developer couldn’t figure out another way to store their data. Sometimes – no matter how much you trust your source – you should question if the XML is well-formed. Let’s work out a few ways you can do that in a database.

Read on for a few tests. The more concerned you are about XML data quality, the more you’d want to push in the direction of having an XSLT defined as well.

Comments closed

Mnemonics for Remembering SQL Clause Order

Bob Pusateri keeps it all straight in his head:

Ooh! A mnemonic! And a pretty good one at that. The idea being that the first letter of each word of the sentence helps you remember something else, like the order of the major parts of a SELECT statement:

SELECT
FROM
WHERE
GROUP BY
HAVING
ORDER BY

Click through for Bob’s best attempts. Oddly enough, now I want some Rally’s fries. Or maybe Checkers—I can’t decide.

Comments closed

DBCC CHECKALLOC

Steve Stedman helps us understand a DBCC command:

DBCC CHECKALLOC is a database console command (DBCC) in Microsoft SQL Server that can be used to check the allocation and structural integrity of the data and index pages in a database. Checking the allocation and structural integrity of the pages can be useful for identifying and correcting issues with the database that could cause errors or performance issues.

Read on to learn more and see a couple examples of it in action.

Comments closed

End of Month in Snowflake and SQL Server

Kevin Wilkie is ready for that end-of-month paycheck:

When you work with data, you’ll probably need to work with dates at least once a month. That is the nature of the beast. Today, let’s compare working with them in SQL Server and Snowflake. I want to focus only on adding and subtracting months when provided with a specific day.

Along the way, I would also push for a calendar table, so that you can remove some of the more difficult (or even most common) date calculations.

Comments closed

Building Your Own TRY_PARSE Function

Aaron Bertrand gives it a go:

A while back, I wrote a couple of tips about simulating TRY_CONVERT – a function added in SQL Server 2012 – in earlier versions (see Part 1 and Part 2). Recently, someone in the community had a similar requirement: they wanted to make TRY_PARSE work for dates in SQL Server 2008. Now, I’m not one to help people stay on versions of SQL Server that are now 15 years old, but I do acknowledge that hands may be tied, and they can’t always control the version they’re stuck with.

Aaron does this community member a solid, though it’s probably something you’d never want to use if you’re on SQL Server 2012 or later.

Comments closed

Lateral Joins in Snowflake

Kevin Wilkie makes a lateral move:

In Snowflake, you can do something akin to an APPLY, but not. That, my friends, is a LATERAL join. According to Snowflake documentation, this type of join “allows an inline view to reference columns from the table expression that precedes the inline view.”

Yes, that’s gibberish. Let’s show what you’re doing.

I’m such an APPLY apologist that the description made sense to me.

Comments closed

Tips for People Moving to Postgres

Ryan Booz shares a few bits of advice about people learning about Postgres:

I was much further in my SQL Server career than I should have been before I understood the usefulness of APPLY in T-SQL. I didn’t realize that APPLY isn’t a SQL standard and so when I started using PostgreSQL and needed similar functionality, I was perplexed trying to find the equivalent. Fortunately, I wasn’t the only one searching.

My recollection here is that Microsoft introduced the APPLY operator before there was an ANSI standard, so as long as my memory isn’t faulty here, this wasn’t a case of a vendor just deciding to name something differently for the sake of it.

Regardless, Ryan has a lot of good advice for learners of PL/SQL.

Comments closed