Category: Syntax

Using SUBSTRING() in PostgreSQL

Published 2024-09-23 by Kevin Feasel

The PostgreSQL SUBSTRING function returns part of the input string, based on the supplied starting position and length. You can use SUBSTRING to extract data either by fixed position and length or by matching a regular expression.

Syntax:

SUBSTRING (String, Start_Position, length)

Read on for some common uses, as well as cases that are a bit more esoteric.
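
As a quick illustration of both uses mentioned above, here is a minimal PostgreSQL sketch. The literal values are made up for the example and are not taken from the linked post.

```sql
-- Positional form: start at character 1 and take 8 characters.
SELECT SUBSTRING('PostgreSQL', 1, 8);                     -- PostgreS

-- The same extraction using the SQL-standard FROM ... FOR spelling.
SELECT SUBSTRING('PostgreSQL' FROM 1 FOR 8);              -- PostgreS

-- Regular-expression form: returns the first match of the POSIX pattern.
SELECT SUBSTRING('Order #12345 shipped' FROM '[0-9]+');   -- 12345
```

Positions are 1-based, so a start position of 1 refers to the first character.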

Comments closed

Generating Embeddings in Oracle from a Function or Trigger

Published 2024-09-20 by Kevin Feasel

Brendan Tierney continues a series on generative AI in Oracle:

In my previous post, I gave examples of using Cohere to create vector embeddings using SQL and of using a Trigger to populate a Vector column. This post extends those concepts, and in this post, we will use OpenAI.

Warning: At the time of writing this post there is a bug in Oracle 23.5 and 23.6 that limits the OpenAI key to a maximum of 130 characters. The newer project-based API keys can be longer than 130 characters. You might get lucky and get a key of an appropriate length, or you might have to generate several. An alternative is to create a Legacy (or User) Key, but there is no guarantee how long these will be available.

Assuming you have an OpenAI API key of 130 characters or less, you can follow the remaining steps. This is now a known bug for the Oracle Database (23.5, 23.6) and it should be fixed in the not-too-distant future. Hopefully!

Read on to learn more.

Comments closed

Hierarchical Data Types in Postgres

Published 2024-09-20 by Kevin Feasel

Florent Jardin builds a hierarchy:

The SQL standard defines a set of rules so that database systems can be interchangeable, but there are small singularities in the wild. In this regard, the hierarchyid data type provided by SQL Server is a striking example. If you are switching to PostgreSQL, two solutions are available to you.

A first and simpler solution consists in linking each node to its parent using a new parentid column and applying a foreign key constraint. Another, more complete approach consists in using the ltree extension. This article deals with the latter case.

Read on to learn more.
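
For a rough idea of what the ltree approach looks like, here is a minimal PostgreSQL sketch. The org_node table and its labels are hypothetical and do not come from Florent's article; only the ltree type, the <@ operator, and the nlevel() function are part of the extension.

```sql
-- The ltree type ships as a contrib extension.
CREATE EXTENSION IF NOT EXISTS ltree;

-- Hypothetical table: each row stores its full path from the root.
CREATE TABLE org_node (
    id   integer PRIMARY KEY,
    path ltree NOT NULL
);

INSERT INTO org_node (id, path) VALUES
    (1, 'Company'),
    (2, 'Company.Engineering'),
    (3, 'Company.Engineering.Database'),
    (4, 'Company.Sales');

-- <@ means "is a descendant of, or equal to", so this returns
-- the Engineering node and everything beneath it (ids 2 and 3).
SELECT id, path, nlevel(path) AS depth
FROM org_node
WHERE path <@ 'Company.Engineering'
ORDER BY path;
```

Because the whole ancestry lives in one column, subtree queries need no recursion, which is the main contrast with the parentid approach.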

Comments closed

Data Analysis with Window Functions in Postgres

Published 2024-09-18 by Kevin Feasel

Elizabeth Christensen dives into window functions:

SQL makes sense when it’s working on a single row, or even when it’s aggregating across multiple rows. But what happens when you want to compare between rows of something you’ve already calculated? Or make groups of data and query those? Enter window functions.

Window functions tend to confuse people – but they’re a pretty awesome tool in SQL for data analytics. The best part is that you don’t need charts, fancy BI tools or AI to get some actionable and useful data for your stakeholders.

Read on for several demonstrations. Most of this you can also do with SQL Server 2012 or later, though the DATE_TRUNC() example has a direct equivalent only in SQL Server 2022 or Azure SQL DB / Managed Instance, where the function is named DATETRUNC(). Prior to that, you’d need to use a different mechanism, such as CAST(o.order_date AS DATE), to get it working.
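
To make the idea concrete, here is a small PostgreSQL sketch of a window function layered on top of an aggregate. The orders table and its total_amount column are assumptions for illustration; only the o.order_date reference echoes the text above.

```sql
-- Hypothetical table: orders(order_date timestamp, total_amount numeric).
-- Aggregate to one row per day, then compute a running total across days.
SELECT
    DATE_TRUNC('day', o.order_date) AS order_day,
    SUM(o.total_amount)             AS daily_total,
    SUM(SUM(o.total_amount)) OVER (
        ORDER BY DATE_TRUNC('day', o.order_date)
    )                               AS running_total
FROM orders AS o
GROUP BY DATE_TRUNC('day', o.order_date)
ORDER BY order_day;
```

In SQL Server 2022, the grouping expression would become DATETRUNC(day, o.order_date); on earlier versions, CAST(o.order_date AS DATE) gives the same day-level grouping.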

Comments closed

ISNULL vs COALESCE in SQL Server

Published 2024-09-13 by Kevin Feasel

Erik Darling has a video for us:

A Difference Between ISNULL And COALESCE You Might Care About In SQL Server

There’s nothing for me to snip as the graf. I don’t often link to videos without any sort of text accompaniment, but it’s been too long since I’ve linked to Erik and this was an interesting topic.

Bonus points for using “case expression” instead of the more common but technically incorrect “case statement.”

Comments closed

Understanding the EXISTS Keyword in SQL

Published 2024-08-30 by Kevin Feasel

Eric Blinn probably exists:

I’ve seen the EXISTS keyword in Microsoft SQL Server T-SQL code and don’t understand it well. What does it do? How do I use it? Are there best practices around SQL EXISTS?

This SQL tutorial will explain what the keyword EXISTS does and show several different use cases.

Read on to see how you can use EXISTS and its complement, NOT EXISTS, in a variety of use cases. One important part of why EXISTS can be useful compared to other ways of writing a particular query is that the performance profile of an EXISTS clause is a semi-join: we proceed until we find the first result matching our clause. If that happens to be in the first row, we can stop there as we’ve fulfilled the requirement. By contrast, an alternative using IN or something else like using SELECT COUNT(*) would likely need to read more pages of data than EXISTS.
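
A minimal T-SQL sketch of the pattern, assuming hypothetical dbo.Customers and dbo.Orders tables that are not part of Eric's article:

```sql
-- Customers with at least one order. The subquery only has to find
-- a single matching row per customer before it can stop looking.
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE EXISTS
(
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerID = c.CustomerID
);

-- The complement: customers with no orders at all.
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerID = c.CustomerID
);

-- A count-based alternative for comparison. As written, it asks for
-- every matching order to be counted, even though only "more than
-- zero" matters; whether the optimizer simplifies it depends on the query.
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE
(
    SELECT COUNT(*)
    FROM dbo.Orders AS o
    WHERE o.CustomerID = c.CustomerID
) > 0;
```

The select list inside EXISTS is irrelevant to the result, which is why SELECT 1 is a common convention.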

Comments closed

Eliminating Unexpected Whitespace or Null Characters in Columns

Published 2024-08-28 by Kevin Feasel

Kevin Wilkie has fun with an unexpected character:

The query ran just fine so I happily completed my work for the day. I sent it to my QA team and had them check it out before pushing it to Production. They sent it back with a note saying “We expect a lot more ‘Insert Garbage Data Here’ to show.”

Being a good data developer – or someone who just wants to show QA up (take your pick) – I decided to dig deep into what FieldName really shows. I found a lot of the following:

In Kevin’s case, he tried to use TRIM() and it didn’t fix anything. That’s because TRIM(), by default, only removes the space character (up to the first non-space character), not all whitespace and not the null character CHAR(0) that Kevin found.

An alternative version of TRIM() that would have worked in this case, plus adding in tabstops as well, would be:

SELECT
    TRIM(' '+char(0)+char(9) FROM FieldName) AS TrimmedFieldName
FROM dbo.SomeTable;
GO

And you could also extend that to include carriage returns, line feeds, vertical tabs, the line separator character, and whatever else you need.
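
As a sketch of that extension, the version below adds a few more control characters by their ASCII codes. It reuses the same FieldName and dbo.SomeTable placeholders as the query above and assumes a non-Unicode (varchar) column.

```sql
SELECT
    TRIM(
        ' '          -- space
        + CHAR(0)    -- null character
        + CHAR(9)    -- horizontal tab
        + CHAR(10)   -- line feed
        + CHAR(11)   -- vertical tab
        + CHAR(13)   -- carriage return
        FROM FieldName
    ) AS TrimmedFieldName
FROM dbo.SomeTable;
GO
```

For an nvarchar column, the Unicode line separator (U+2028) could be appended to the list as NCHAR(8232).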

Comments closed

Searching for Wildcard Characters in LIKE

Published 2024-08-28 by Kevin Feasel

Andy Brownsword is looking for a discount:

Performing a wildcard search by throwing a % into a LIKE expression is bread and butter. How do we handle this when we actually want to search for the wildcard though?

This was an issue I first saw early in my career before I was even building database solutions. The business had a back office solution where you could search through offers on the UI. Unfortunately it threw up some strange results.

Searching for ‘30%’ offers would return more than expected.

Read on for an example of the problem, as well as how you can resolve it.
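
For reference, T-SQL offers two common ways to treat a wildcard as a literal character. This is a minimal sketch against a hypothetical dbo.Offers table, not necessarily the approach Andy takes in his post.

```sql
-- The problem: % here is a wildcard, so this matches anything that
-- starts with 30, such as '300 loyalty points'.
SELECT OfferName
FROM dbo.Offers
WHERE OfferName LIKE '30%';

-- Option 1: wrap the wildcard in square brackets to make it literal.
SELECT OfferName
FROM dbo.Offers
WHERE OfferName LIKE '%30[%]%';

-- Option 2: declare an escape character and put it before the wildcard.
SELECT OfferName
FROM dbo.Offers
WHERE OfferName LIKE '%30!%%' ESCAPE '!';
```

In the last two queries, the outer % characters are still wildcards; only the bracketed or escaped one is matched literally.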

Comments closed

Storing and Parsing JSON in SQL Server

Published 2024-08-23 by Kevin Feasel

Ed Pollack talks JSON:

Like XML, JSON is an open standard storage format for data, metadata, parameters, or other unstructured or semi-structured data. Because of its heavy usage in applications today, it inevitably will make its way into databases where it will need to be stored, compressed, modified, searched, and retrieved.

Even though a relational database is not the ideal place to store and manage less structured data, application requirements can oftentimes override an “optimal” database design. There is a convenience in having JSON data close to related relational data and architecting its storage effectively from the start can save significant time and resources in the future.

Read on for plenty of examples and tips. Ideologically, I have no problem parsing JSON to load data into SQL Server. I have no real problem storing data in JSON if the calling application takes that JSON as-is and does not expect the database to modify or shred that JSON. I have no problem taking relational data and creating JSON structures to send out to calling applications. My problem comes when you store the data as JSON but then expect the database to manage data. Treat the JSON blob as atomic and we’re fine; otherwise, I want to make that data relational, as befits a relational database.
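
To illustrate the "parse JSON on the way in" case, here is a minimal T-SQL sketch using built-in functions available since SQL Server 2016. The JSON document and its property names are invented for the example.

```sql
DECLARE @OrderJson nvarchar(max) = N'{
    "orderId": 42,
    "customer": { "name": "Ada" },
    "items": [
        { "sku": "A-1", "qty": 2 },
        { "sku": "B-7", "qty": 1 }
    ]
}';

-- Check that the text is well-formed JSON before doing anything with it.
SELECT ISJSON(@OrderJson) AS IsValidJson;

-- Pull a single scalar value out by path.
SELECT JSON_VALUE(@OrderJson, '$.customer.name') AS CustomerName;

-- Shred the items array into relational rows with typed columns.
SELECT i.sku, i.qty
FROM OPENJSON(@OrderJson, '$.items')
WITH
(
    sku varchar(20) '$.sku',
    qty int         '$.qty'
) AS i;
```

OPENJSON requires database compatibility level 130 or higher.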

2 Comments

Data Integrity and the LIKE Operator

Published 2024-08-22 by Kevin Feasel

Joe Celko provides some context:

The LIKE operator has been in SQL since the very beginning. It’s a very weak form of pattern matching that has only two simple wildcards. The wildcards are ‘_’ (underscore) and ‘%’ (percent). They respectively stand for any single character and for zero or more characters in a substring. The syntax is pretty straightforward:

Read on for examples of LIKE in action, including in check constraints.
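
As a rough sketch of LIKE used for data integrity, the predicate has the general shape expression [NOT] LIKE pattern [ESCAPE escape_character], and it can sit inside a CHECK constraint. The Parts table and its code format below are hypothetical and are not taken from Joe's article.

```sql
-- Hypothetical rule: a part code is 'P-' followed by exactly four characters.
CREATE TABLE Parts
(
    part_code VARCHAR(6) NOT NULL
        CONSTRAINT part_code_format
        CHECK (part_code LIKE 'P-____'),
    part_name VARCHAR(50) NOT NULL
);

-- Accepted: matches the pattern.
INSERT INTO Parts (part_code, part_name) VALUES ('P-0042', 'Widget');

-- Rejected by the constraint: only two characters follow the hyphen.
INSERT INTO Parts (part_code, part_name) VALUES ('P-42', 'Gadget');
```

Each underscore matches any single character, so this constraint enforces the length and prefix but not that the last four characters are digits.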

Comments closed
