Category: Syntax

Vector Search in Oracle

Brendan Tierney shows a few queries:

It can be very straightforward to use Vectors with SQL. It’s just a simple SQL query, with some additional Vector-related requirements. The examples given below are a collection of simple examples. These aren’t my examples; they come either from documentation or from examples other people have come up with. I’ve tried to include references back to the original sources for these, and if I’ve missed any or referred to the wrong people, just let me know and I’ll correct the links.

In my next post on Vector Databases, I’ll explore a slightly more complex data set. I’ll use the Wine dataset used in a previous post and Vector Search to see if I can find a suitable wine. Some years ago, I had posts and presentations on machine learning to recommend wine. Using Vector Search should give us better recommendations (hopefully)!

Click through for those queries.
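
As a taste of what those queries look like, here is a minimal sketch of a similarity search in Oracle 23ai. The table, column, and bind variable names are invented for illustration; the shape of the query (VECTOR_DISTANCE() plus FETCH FIRST) is the important part.

-- Hypothetical table with a VECTOR column named embedding
-- :query_vector is a bind variable holding the embedding of the search phrase
SELECT p.product_name
FROM products p
ORDER BY VECTOR_DISTANCE(p.embedding, :query_vector, COSINE)
FETCH FIRST 5 ROWS ONLY;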

Using SUBSTRING() in PostgreSQL

Rajendra Gupta builds a substring:

The PostgreSQL substring function returns a subset of the input string based on the supplied starting position and length. You can use the SUBSTRING function to extract data by fixed string length or by regular expression.

Syntax:

SUBSTRING (String, Start_Position, length)

Read on for some common uses, as well as cases that are a bit more esoteric.
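
As a quick sketch of the two forms (the table and column names here are invented):

-- Positional form: five characters starting at position 1
SELECT SUBSTRING(phone_number, 1, 5) AS prefix
FROM customers;

-- Regular expression form: pull the digits out of a mixed string
SELECT SUBSTRING('Order-12345' FROM '[0-9]+') AS order_number;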

Generating Embeddings in Oracle from a Function or Trigger

Brendan Tierney continues a series on generative AI in Oracle:

In my previous post, I gave examples of using Cohere to create vector embeddings using SQL and of using a Trigger to populate a Vector column. This post extends those concepts, and in this post, we will use OpenAI.

Warning: At the time of writing this post, there is a bug in Oracle 23.5 and 23.6 that limits the OpenAI key to a maximum of 130 characters. The newer project-based API keys can generate keys which are greater than 130 characters. You might get lucky with getting a key of appropriate length, or you might have to generate several. An alternative is to create a Legacy (or User) Key, but there is no guarantee how long these will be available.

Assuming you have an OpenAI API key of 130 characters or less, you can follow the remaining steps. This is now a known bug for the Oracle Database (23.5, 23.6) and it should be fixed in the not-too-distant future. Hopefully!

Read on to learn more.
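
To give a sense of the trigger approach, here is a rough sketch of populating a vector column on insert. The table, column, credential, and parameter values below are placeholders; Brendan's post and the Oracle documentation cover the exact DBMS_VECTOR setup, including creating the credential that stores the OpenAI key.

-- Illustrative only: object names and parameter values are placeholders
CREATE OR REPLACE TRIGGER trg_documents_embed
BEFORE INSERT OR UPDATE OF doc_text ON documents
FOR EACH ROW
BEGIN
  :NEW.doc_vector := DBMS_VECTOR.UTL_TO_EMBEDDING(
    :NEW.doc_text,
    JSON('{"provider"        : "openai",
           "credential_name" : "OPENAI_CRED",
           "url"             : "https://api.openai.com/v1/embeddings",
           "model"           : "text-embedding-3-small"}'));
END;
/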

Hierarchical Data Types in Postgres

Florent Jardin builds a hierarchy:

The SQL standard defines a set of rules so that database systems can be interchangeable, but there are small singularities in the wild. In this regard, the hierarchyid data type provided by SQL Server is a striking example. If you are switching to PostgreSQL, two solutions are available to you.

A first, simpler solution consists of linking each node to its parent using a new parentid column and applying a foreign key constraint. Another, more complete approach consists of using the ltree extension. This article deals with the latter case.

Read on to learn more.
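
As a flavor of the ltree approach (the table and rows below are invented; the operators come straight from the extension):

CREATE EXTENSION IF NOT EXISTS ltree;

CREATE TABLE categories (
    id   int PRIMARY KEY,
    path ltree NOT NULL
);

INSERT INTO categories VALUES
    (1, 'Top'),
    (2, 'Top.Science'),
    (3, 'Top.Science.Astronomy');

-- Everything at or below Top.Science
SELECT * FROM categories WHERE path <@ 'Top.Science';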

Data Analysis with Window Functions in Postgres

Elizabeth Christensen dives into window functions:

SQL makes sense when it’s working on a single row, or even when it’s aggregating across multiple rows. But what happens when you want to compare between rows of something you’ve already calculated? Or make groups of data and query those? Enter window functions.

Window functions tend to confuse people – but they’re a pretty awesome tool in SQL for data analytics. The best part is that you don’t need charts, fancy BI tools or AI to get some actionable and useful data for your stakeholders.

Read on for several demonstrations. Most of this you can also do with SQL Server 2012 or later, though the DATE_TRUNC() example will only work in SQL Server 2022 or Azure SQL DB / Managed Instance. Prior to that, you’d need to use a different mechanism, such as CAST(o.order_date AS DATE), to get it working.
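
For example, here is a day-level rollup in each flavor (dbo.Orders and order_date are stand-ins):

-- SQL Server 2022 and later (or Azure SQL)
SELECT DATE_TRUNC(day, o.order_date) AS order_day, COUNT(*) AS order_count
FROM dbo.Orders AS o
GROUP BY DATE_TRUNC(day, o.order_date);

-- Earlier versions: cast to DATE for the same day-level grouping
SELECT CAST(o.order_date AS DATE) AS order_day, COUNT(*) AS order_count
FROM dbo.Orders AS o
GROUP BY CAST(o.order_date AS DATE);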

ISNULL vs COALESCE in SQL Server

Erik Darling has a video for us:

A Difference Between ISNULL And COALESCE You Might Care About In SQL Server

There’s nothing for me to snip as the graf. I don’t often link to videos without any sort of text accompaniment, but it’s been too long since I’ve linked to Erik and this was an interesting topic.

Bonus points for using “case expression” instead of the more common but technically incorrect “case statement.”
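
I won't spoil which difference Erik highlights, but one well-known gap between the two (which may or may not be his) is how the result's data type gets determined: ISNULL takes the type of its first argument, while COALESCE follows data type precedence. A quick sketch:

DECLARE @s varchar(4) = NULL;

-- ISNULL returns varchar(4), so the replacement value is silently truncated;
-- COALESCE returns the longer type and keeps the full string
SELECT ISNULL(@s, 'truncated')   AS isnull_result,    -- 'trun'
       COALESCE(@s, 'truncated') AS coalesce_result;  -- 'truncated'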

Understanding the EXISTS Keyword in SQL

Eric Blinn probably exists:

I’ve seen the EXISTS keyword in Microsoft SQL Server T-SQL code and don’t understand it well. What does it do? How do I use it? Are there best practices around SQL EXISTS?

This SQL tutorial will explain what the keyword EXISTS does and show several different use cases.

Read on to see how you can use EXISTS and its complement, NOT EXISTS, in a variety of use cases. One important part of why EXISTS can be useful compared to other ways of writing a particular query is that the performance profile of an EXISTS clause is a semi-join: we proceed until we find the first result matching our clause. If that happens to be in the first row, we can stop there, as we’ve fulfilled the requirement. By contrast, an alternative using IN or something like SELECT COUNT(*) would likely need to read more pages of data than EXISTS.
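
As a simple illustration (table names invented), the semi-join pattern and its complement look like this:

-- Customers with at least one order: the subquery can stop at the first match
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE EXISTS
(
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerID = c.CustomerID
);

-- Customers with no orders at all
SELECT c.CustomerID, c.CustomerName
FROM dbo.Customers AS c
WHERE NOT EXISTS
(
    SELECT 1
    FROM dbo.Orders AS o
    WHERE o.CustomerID = c.CustomerID
);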

Eliminating Unexpected Whitespace or Null Characters in Columns

Kevin Wilkie has fun with an unexpected character:

The query ran just fine so I happily completed my work for the day. I sent it to my QA team and had them check it out before pushing it to Production. They sent it back with a note saying, “We expect a lot more ‘Insert Garbage Data Here’ to show.”

Being a good data developer – or someone who just wants to show QA up (take your pick) – I decided to dig deep into what FieldName really shows. I found a lot of the following:

In Kevin’s case, he tried to use TRIM() and it didn’t fix anything. That’s because TRIM(), by default, only removes the space character (up to the first non-space character), not all whitespace and not the null character CHAR(0) that Kevin found.

An alternative version of TRIM() that would have worked in this case, plus adding in tabstops as well, would be:

SELECT
    TRIM(' ' + CHAR(0) + CHAR(9) FROM FieldName) AS TrimmedFieldName
FROM dbo.SomeTable;
GO

And you could also extend that to include carriage returns, line feeds, vertical tabs, the line separator character, and whatever else you need.
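
For instance, folding carriage returns and line feeds into the same pattern (still against the hypothetical dbo.SomeTable above):

SELECT
    TRIM(' ' + CHAR(0) + CHAR(9) + CHAR(10) + CHAR(13) FROM FieldName) AS TrimmedFieldName
FROM dbo.SomeTable;
GO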

Searching for Wildcard Characters in LIKE

Andy Brownsword is looking for a discount:

Performing a wildcard search by throwing a % into a LIKE expression is bread and butter. How do we handle this when we actually want to search for the wildcard though?

This was an issue I first saw early in my career before I was even building database solutions. The business had a back office solution where you could search through offers on the UI. Unfortunately it threw up some strange results.

Searching for ‘30%’ offers would return more than expected.

Read on for an example of the problem, as well as how you can resolve it.
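
Two standard ways to handle it in T-SQL (which may or may not match Andy's exact approach; the offer table and column names here are made up) are square brackets and an explicit ESCAPE character:

-- Square brackets make % a literal character inside the pattern
SELECT OfferName
FROM dbo.Offers
WHERE OfferName LIKE '%30[%]%';

-- Or declare your own escape character
SELECT OfferName
FROM dbo.Offers
WHERE OfferName LIKE '%30!%%' ESCAPE '!';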

Storing and Parsing JSON in SQL Server

Ed Pollack talks JSON:

Like XML, JSON is an open standard storage format for data, metadata, parameters, or other unstructured or semi-structured data. Because of its heavy usage in applications today, it inevitably will make its way into databases where it will need to be stored, compressed, modified, searched, and retrieved.

Even though a relational database is not the ideal place to store and manage less structured data, application requirements can oftentimes override an “optimal” database design. There is a convenience in having JSON data close to related relational data and architecting its storage effectively from the start can save significant time and resources in the future.

Read on for plenty of examples and tips. Ideologically, I have no problem parsing JSON to load data into SQL Server. I have no real problem storing data in JSON if the calling application takes that JSON as-is and does not expect the database to modify or shred that JSON. I have no problem taking relational data and creating JSON structures to send out to calling applications. My problem comes when you store the data as JSON but then expect the database to manage data. Treat the JSON blob as atomic and we’re fine; otherwise, I want to make that data relational, as befits a relational database.
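
For a taste of the parsing side in T-SQL (the document below is invented), JSON_VALUE() handles scalars and OPENJSON() shreds arrays into rows:

DECLARE @json nvarchar(max) =
    N'{"orderId": 42, "lines": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}]}';

-- Pull a scalar value out of the document
SELECT JSON_VALUE(@json, '$.orderId') AS OrderId;

-- Shred the array into relational rows
SELECT sku, qty
FROM OPENJSON(@json, '$.lines')
    WITH (sku varchar(10) '$.sku', qty int '$.qty');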
