Press "Enter" to skip to content

Curated SQL Posts

Building a Data Serving API in Azure

Justice Zishanhi has some recommendations for serving data in Azure:

Data is an important asset to all organizations big and small. As these organizations mature, building an end-to-end data platform to enable BI and AI at scale has become part of that journey. Some organizations, have the requirement to expose modelled data in a data warehouse or data lake (Azure Data Lake Storage Gen2) to downstream consumer applications (mobile or web apps) where access patterns can be unpredictable in respect to frequency of access and/or type of data that is requested.

Data warehouse engines and data lakes are not designed for singleton transactional (request / response) interactions.  To serve these requests at scale and to meet the different SLAs and access pattern unpredictability, data needs to be offloaded to a suitable database engine (i.e., a caching layer) that is built to serve such queries.  

The “Design Patterns” section of this article highlights a generalized pattern for implementing a data serving API which meets this requirement – consisting of a Data Platform component and an API component. For implementing the API, two patterns are commonly adopted – a synchronous pattern or an asynchronous pattern. Both are explored in the “API Implementation Patterns” section of this article.

The example focuses on Cosmos DB and provides quite a bit of helpful guidance.

Comments closed

Converting a Referenced Query Column to a Table

Erik Svensen does a bit of magic:

Let’s imagine I have a query with a column called Reportid – and I want to create another query with the unique values from the column.

Normally I would create a new blank query and than reference the column with – queryName[ColumnName]

And then there are a bunch of other steps which eventually lead you to the result. Or you can go straight to easy mode. That sounds like a pretty good idea to do instead.

Comments closed

SQL Server 2022 and S3 Object Integration

Anthony Nocentino does some PolyBase work:

In this blog post, I’ve implemented two example environments for using SQL Server 2022’s s3 object integration. One for backup and restore to s3 compatible object storage and the other for data virtualization using Polybase connectivity to s3 compatible object storage. This work aims to get you up and running as quickly as possible to work with these new features. I implemented this in Docker Compose since that handles all the implementation and configuration steps for you. The complete code for this is available on my GitHub repo…I’m walking you through the implementation here in this post.

In my post Setting up MinIO for SQL Server 2022 s3 Object Storage Integration we did this step by step at the command line. Using Docker Compose it will do all the hard work for you and you can get up and running fast.

Let’s walk through what you’ll get in each environment.

Read the whole thing, especially the note at the bottom about minimum RAM requirements on the Docker container.

Comments closed

Custom Infix Functions in R

Dominik Rafacz loves infix functions:

Custom infix functions are one of my favorite features in R. This article is my love letter to them. But first, a quick recap.

For those unfamiliar with the terminology, infix function is a function fun which is called using infix notation, e.g., x fun y instead of fun(x, y). Those functions are also called infix operators by base R, and I will use those terms and name infixes interchangeably. There are a lot of infix operators in base R used very frequently, i.e., arithmetic or logical operators. We use them so often that we usually forget that they are functions. And that we can call them just like regular functions.

Infix functions are something I tend to forget entirely about developing on my own but they can be extremely useful, as Dominik shows. H/T R-Bloggers.

Comments closed

An Intro to Key Word Analysis

Lewis Prince continues a series on natural language processing:

Here we are with part 2 of this blog series on web scraping and natural language processing (NLP). In the first part I discussed what web scraping was, why it’s done and how it can be done. In this part I will give you details on what NLP is at a high level, and then go into detail of an application of NLP called key word analysis (KWA).

Read on for a high-level overview of the topic and how to do it in Cognitive Services. But not the topic model—that’d be a different post.

Comments closed

Purview Access Policies and SQL Server 2022

Srdan Bozovic links Purview and SQL Server 2022:

The focus of this article is on using Microsoft Purview to enable access to user data as well as specific system metadata in SQL Server 2022 running on Azure Arc–enabled servers.

With the SQL Server 2022 release, the goal is to enable three main scenarios:

– Browsing data in user-defined tables and views.

– Performance monitoring with system commands, functions, and views.

– Security auditing with security-related system functions and views.

If Azure Arc-enabled servers are required for Purview to work, I think that will seriously hinder uptake.

Comments closed

Resolving tempdb Issues in Azure SQL DB

Holger Linke troubleshoots some problems:

The tempdb system database is a global resource available to users who are connected to Azure SQL Database or any instance of SQL Server. It holds temporary user objects that are explicitly created by a user or application, and internal objects that are created by the SQL Server database engine itself. The most common tempdb issue is running out of space, either regarding tempdb’s overall size quota or the transaction log.

The available tempdb space in Azure SQL Database depends on two factors: the service tier (pricing tier) that the database is configured with, and the type of workload that is executed against the database. These are also the main factors to control if you are running out of tempdb space.

Click through for several error cases and how we can resolve them.

Comments closed

When Estimated and Actual Plans Differ

Brent Ozar notes that estimated plans are not guarantees:

A reader posted a question for Office Hours:

Hi Brent, What is your take on Hugo Kornelis’s explanation of execution plan naming. As her his explanation, estimated exec plan is simply an execution plan whereas actual execution plan = execution plan+run-time stats. Do you agree that the naming is flawed and confusing? – Yourbiggestfan

Click through to see examples of when estimated plans might look different from actual plans.

Comments closed