Press "Enter" to skip to content

Category: Warehousing

Temp Table Bugs in Microsoft Fabric Warehouses

Jared Westover runs into a wall:

I was excited when Microsoft announced the ability to create session-scoped temporary tables in a Fabric warehouse. However, after using Microsoft Fabric temporary tables, I quickly felt disappointed. When will they be ready for prime time, and in the meantime, what other options are available?

Click through for Jared’s experience, although it might already be fixed.

Leave a Comment

Comparing Microsoft Fabric to Snowflake

Evanjalin Joseph lays out a comparison:

Take ShopSmart, a global retail chain that operates both online and offline. The company wants to combine its sales, inventory, and customer data in order to facilitate real-time reporting and predictive analytics. Two top platforms are being assessed by the IT team for this change.

Azure, Power BI, and Microsoft 365 are already widely used by ShopSmart, which is in line with Fabric’s integrated ecosystem. The alternative, however, provides more multi-cloud flexibility and strong performance on structured data. The group has to choose between selecting a more specialized warehousing solution with more deployment options or making use of its current Microsoft investments.

Let’s examine the differences between the two platforms.

Click through for an overview of each platform and how they stack up against one another.

Leave a Comment

Creating a Dashboard in Snowflake

Kevin Wilkie doesn’t just do e-mails:

If you haven’t had a chance to mess around in Snowflake, there are internal dashboards that you can provide to your users (or just use yourself) so that you can get the latest data.

Today, I want to quickly show a use case for these Dashboards and one way that you can get data quickly and easily in a nice format.

And right above Dashboards is Streamlit, one of my favorite libraries for building rapid application prototypes and even small data applications.

Leave a Comment

Behind the Scenes in Developing a Cassandra Password Validator

Stefan Miklosovic walks us through a new feature in Apache Cassandra:

Here’s the problem: while users have always had the ability to create whatever password they wanted in Cassandra–from straightforward to incredibly complex and everything in between–this ultimately created a noticeable security vulnerability.

While organizations might have internal processes for generating secure passwords that adhere to their own security policies, Cassandra itself did not have the means to enforce these standards. To make the security vulnerability worse, if a password initially met internal security guidelines, users could later downgrade their password to a less secure option simply by using “ALTER ROLE” statements.

Read on to see how CEP-24 helps with this. It looks like CEP-24 will be released in Apache Cassandra 5.1.

Leave a Comment

Snowflake Query Tags in Power BI

Chris Webb takes some of the shine off of things:

Since the November 2024 Power BI release blog post announced that queries sent to Snowflake by Power BI include a query tag I’ve had a lot of questions from people who couldn’t see this happening or wanted to know what the query tags contained, so in this blog I thought I would outline the current status.

It turns out that the query tag isn’t as far along as the blog post indicated, and there are some pretty big limitations in the cases in which there actually is tagging.

Leave a Comment

Loading Data from Pandas into Snowflake

Anil Kumar Moka loads some data:

Loading data into Snowflake is a common need. Using Python and pandas is a common go-to solution for data professionals. Whether you’re pulling data from a relational database, wrangling a CSV file, or prototyping a new pipeline, this combination leverages pandas’ intuitive data manipulation and Snowflake’s cloud-native scalability. But let’s be real—data loading isn’t always a simple task.

Files go missing, connections drop, and type mismatches pop up when you least expect them. That’s why robust error handling isn’t just nice-to-have; it’s essential for anything you’d trust in production. In this guide, we’ll walk through the fundamentals of getting data into Snowflake, explore practical examples with pandas and SQLAlchemy, and equip you with the tools to build a dependable, real-world-ready pipeline. Let’s dive in and make your data loading process as smooth as possible!

Read on for a quick primer around data loading and some of the sanity checking we should be doing along the way.

Comments closed

Foreign Key Relationships in Microsoft Fabric Data Warehouses

Jared Westover looks at key constraints:

In late 2024, I noticed a comment on the Microsoft Learn site stating that foreign keys could improve query performance on tables in a Fabric warehouse. That claim immediately caught my attention. I wanted to answer a simple question: Do relationships help, hurt, or have no effect when added to tables in a Fabric warehouse?

Let’s get more specific—do foreign keys improve query performance when reading data (not loading)? In other words, do they make queries run faster?

Sadly, the answer is not as promising as with SQL Server. But this also makes sense considering the distributed nature of Fabric data warehouses.

Comments closed

Asynchronous SQL Statement Execution in Snowflake

Koen Verbeeck doesn’t want to wait for an answer:

It’s been a while since I blogged about Snowflake, but a recent LinkedIn post caught my attention: the ability to add asynchronous execution of SQL statements in a stored procedure. In other words: parallel execution of SQL statements. This got me excited, because in my opinion this is something that has been missing in T-SQL since forever. Every time you want to do something in parallel, you need to use external tools to accomplish this in SQL Server (or Azure SQL DB, or Fabric Warehouse, or Fabric SQL DB, or … you get the point). You needed to use SQL Server Agent Jobs, or SSIS packages, or Azure Data Factory and so on.

Snowflake introduces the ASYNC and AWAIT keywords, which can be used to trigger asynchronous execution. 

Read on for a very simple example and some thoughts from Koen. Aside from possibly making data modifications faster (assuming there are no constraint checks), I’m not quite sure what the major benefit to this is. I’d generally use asynchronous calls to support UI operations, letting a calling application respond to user input while some background thread processes data. But I’m not positive what you get from pushing async/await logic into the database itself.

Comments closed

Spatial Queries in Fabric Data Warehouse

Jovan Popvic reads a map:

Spatial data has become increasingly important in various fields, from urban planning and environmental monitoring to transportation and logistics. Fabric Data Warehouse offers spatial functionalities that enable you to query and analyze spatial data efficiently.

In this blog post, we will delve into the spatial capabilities in the Fabric Data Warehouse and demonstrate how to use the spatial functions in your queries.

This looks a bit like the way we perform spatial operations in SQL Server. Jovan shows off some examples of functionality, so check that out.

Comments closed