
Curated SQL Posts

Ad Hoc Data Exploration with Azure Data Explorer

Michal Bar introduces a new feature:

We are excited to introduce the new Data Exploration feature, designed to enhance your ability to delve deeper into the data presented on any Dashboard.

If the information you’re seeking isn’t readily available on the dashboard, this feature allows you to extend your exploration beyond the data displayed in the tiles, potentially uncovering new insights.

Directly from a dashboard, you can refine your exploration using a user-friendly, form-like interface. This intuitive and dynamic experience is tailored for insights explorers seeking insights based on high volumes of data in near real time.

Click through to see the new feature in action.


The Performance Impact of Local Variables

Jared Westover talks performance:

Often, developers use local variables when writing ad hoc queries or stored procedures for many reasons. You might hear “never repeat code” or “avoid using magic numbers.” While writing a lengthy stored procedure, I might include a few. However, did you know that local variables can hurt the performance of your queries? How can you keep local variables from negatively affecting performance? Keep reading to find those answers and more.

This is the kind of performance issue that you can easily forget about. Jared includes two methods for resolving the issue if you run into performance problems on a specific query or stored procedure.


Tracking Task History in Snowflake

Kevin Wilkie is interested in just one thing:

By the table function name, you’re probably wondering “Sherpa, why on Earth do I really need to look at query history based on the session? I have Query_History_By_User which gives me a broader look at my data. Who really cares?”

Great question, my friend! You’re right. This new Query_History_By_Session table function isn’t going to give you a lot of data that you can’t get more easily – and more helpfully – with Query_History_By_User. Why did Snowflake provide me this “useless” function?

Read on for the value of this function and an example of querying task history.


A Review of Useful pg_stat_statements

Umair Shahid tracks some statements:

pg_stat_statements is an extension for PostgreSQL that tracks execution statistics of SQL statements. It is designed to provide insight into the performance characteristics of database queries by collecting data on various metrics such as execution time, number of calls, and I/O operations. This extension is immensely useful for database administrators and developers looking to optimize their SQL queries and improve overall database performance.

Click through to learn more about pg_stat_statements, including how to install and configure it, as well as some of the things you can do with it.


Adding tSQLt to a Database Project

Olivier Van Steenlandt provides an overview of adding tSQLt to a Visual Studio database project:

As a first step in the process, we’re going to create a new Database Project. In my case, I will be calling my Database Project AdventureWorksDW_UnitTesting and my solution AdventureWorks.

If you are not sure how to set up a Database Project in Visual Studio from scratch, don’t worry: you can follow the step-by-step data recipe I released a while ago, Getting Started with Database Projects and Version Control.

Read on to learn more about how to add the tSQLt objects and eliminate cross-database reference issues.


Updates in Apache Kafka 3.8

Josep Prat announces a slew of changes:

We are proud to announce the release of Apache Kafka 3.8.0. This release contains many new features and improvements. This blog post will highlight some of the more prominent features. For a full list of changes, be sure to check the release notes.

See the Upgrading to 3.8.0 from any version 0.8.x through 3.7.x section in the documentation for the list of notable changes and detailed upgrade steps.

This also puts Kafka one step closer to getting rid of its ZooKeeper dependency altogether.


Using Semantic Model Scale Out as Part of Power BI Refresh

Chris Webb keeps the lights on during a refresh:

In my recent posts on the Command Memory Limit error and the partialBatch mode for Power BI semantic model refresh, I mentioned that one way to avoid memory errors when refreshing large semantic models was to use refresh type clearValues followed by a full refresh – but that the downside of doing this was that your model would not be queryable until the full refresh had completed. Immediately afterwards some of my colleagues (thank you Alex and Akshai) pointed out that there was in fact a way to ensure a model remained queryable while using this technique: using Semantic Model Scale Out. How? Let me explain…

Click through for that explanation.


A Visual Explanation of Row Context in DAX

Marco Russo and Alberto Ferrari get visual:

Row context is the second fundamental concept in writing DAX code. In a previous article, we introduced the first concept – the filter context – using a visual approach. In this article, we rely on graphical visualization to describe a row context.

This article provides a different perspective on a topic already discussed in other row context articles: read them to get more insights about this important concept for DAX.

Click through for a great primer on the topic.


Linux Memory Overcommit and PostgreSQL

Laurenz Albe shares a warning:

Linux tries to conserve memory resources. When you request a chunk of memory from the kernel, Linux does not immediately reserve that memory for you. All you get is a pointer and the promise that you can use the memory at the destination. The kernel allocates the memory only when you actually use it. That way, if you request 1MB of memory, but use only half of it, the other half is never allocated and is available for other processes (or the kernel page cache).

Overbooking is a concept that airlines all over the world have been using for a long time. They sell more seats than are actually in the plane. From experience, airlines know that some passengers don’t show up for the flight, so overbooking allows them to make more profit. By default, Linux does the same: it deals out (“commits”) more memory than is actually available in the machine, in the hope that not all processes will use all the memory they allocate. This memory overcommit is great for using resources efficiently, but it has one problem: what if all the flight passengers show up, that is, what if the processes actually use more memory than is available? After all, you cannot offer a computer process a refund or a free overnight hotel room.
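
The "request 1MB, use only half" scenario can be sketched in a few lines of Python. This is an illustration rather than anything taken from Laurenz's post: the function name touch_half is made up for the example, and it assumes a Linux system where the pages of an anonymous mapping are typically only backed by physical memory once they are first written.

```python
import mmap

PAGE = mmap.PAGESIZE  # size of one memory page on this machine


def touch_half(size_bytes: int = 1024 * 1024) -> int:
    """Map size_bytes of anonymous memory, write to the first half only.

    Returns the number of pages that were written to. The function name
    and default size are illustrative assumptions, not from the post.
    """
    # Anonymous mapping: the kernel hands out address space up front.
    buf = mmap.mmap(-1, size_bytes)
    touched = 0
    try:
        # Write one byte per page across the first half of the mapping.
        # On Linux, the first write to a page is typically what makes the
        # kernel actually back that page with memory.
        for offset in range(0, size_bytes // 2, PAGE):
            buf[offset] = 1
            touched += 1
    finally:
        buf.close()
    return touched


if __name__ == "__main__":
    pages = touch_half()
    print(f"wrote to {pages} pages of {PAGE} bytes each")
```

Under these assumptions, the second half of the mapping is promised but never used, which is exactly the slack that overcommit counts on.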

Read on to learn more about memory overcommit and what you can do about it.


Security Options in Microsoft Fabric Warehouses

Koen Verbeeck locks things down:

We are implementing a data analytics solution in Microsoft Fabric. A warehouse is used for the gold layer, and we want to give users access to the data. However, by sharing the warehouse, they can read all the data in all the tables. Some data is sensitive, and only users with the correct permissions should be able to view it. Is it possible to implement more granular access control to the data?

Read on for the answer, as well as an important note on how users might be able to circumvent your permissions settings.
