Warehousing – Curated SQL

Writing Back to a Fabric Data Warehouse via UDF

In that article, we took advantage of some of the built-in sample code from the User Data Function editor, as well as some great code examples from Sujata (Example User data functions for Translytical task flows, on GitHub).

The problem? All of these samples use SQL Databases in Fabric as the backend item.

Jon switches this from a SQL database into a Fabric Data Warehouse, and notes some of the challenges along the way.

Mapping Snowflake Warehouse Usage by Time of Day

Published 2025-07-02 by Kevin Feasel

It’s ten o’clock. Do you know where your Kevin Wilkie is?:

In our last few posts, we’ve looked at how to:

Identify warehouses that haven’t been used in the past 30 days

Monitor daily credits burn.

Today, we’re diving into a different, but equally powerful question:

When are my Snowflake warehouses actually running?

Click through for the query and an example of it in action.
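
To make the idea concrete, here is a minimal sketch of the time-of-day bucketing, written in plain Python rather than SQL. It is not Kevin Wilkie's query. It assumes you have already pulled query start times for a warehouse into Python datetime values, and the function name and sample data are hypothetical.

```python
from collections import Counter
from datetime import datetime


def queries_by_hour(start_times):
    """Count how many queries started in each hour of the day (0-23)."""
    counts = Counter(ts.hour for ts in start_times)
    # Return all 24 hours so that idle hours show up as zero.
    return [counts.get(hour, 0) for hour in range(24)]


# Hypothetical sample: query start times for a single warehouse.
sample_starts = [
    datetime(2025, 7, 1, 9, 15),
    datetime(2025, 7, 1, 9, 45),
    datetime(2025, 7, 1, 14, 5),
    datetime(2025, 7, 2, 9, 30),
]

hourly = queries_by_hour(sample_starts)
# The hour with the most query starts across all days in the sample.
busiest_hour = max(range(24), key=lambda h: hourly[h])
```

Keeping the zero-count hours in the result is deliberate: the quiet hours are often the interesting part when deciding whether a warehouse could be suspended or resized.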

Tracking Unused Snowflake Warehouses

Published 2025-06-25 by Kevin Feasel

Kevin Wilkie doesn’t want to spend that extra money:

Now that we’ve loaded our Costs Dashboard with all sorts of goodness, let’s take it a step further and make it even more useful. The goal: track unused warehouses to help manage spend and reduce clutter.

Read on to see how you can do this.
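
As a rough illustration of the "unused in the past 30 days" check, here is a small Python sketch. It is not the approach from the linked post. It assumes you already have a mapping of warehouse name to last-used timestamp (with None for a warehouse that has never run), and every name in it is hypothetical.

```python
from datetime import datetime, timedelta


def unused_warehouses(last_used, as_of, days=30):
    """Return the names of warehouses with no activity in the last `days` days."""
    cutoff = as_of - timedelta(days=days)
    return sorted(
        name
        for name, ts in last_used.items()
        # None means the warehouse has no recorded activity at all.
        if ts is None or ts < cutoff
    )


# Hypothetical sample data: warehouse name -> last time it ran a query.
last_used = {
    "REPORTING_WH": datetime(2025, 6, 24),
    "ETL_WH": datetime(2025, 5, 1),
    "SANDBOX_WH": None,
}

stale = unused_warehouses(last_used, as_of=datetime(2025, 6, 25))
```

Passing the as-of date in explicitly, instead of calling datetime.now() inside the function, keeps the check repeatable when you rerun it against historical data.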

Loading Data into Snowflake via Python

Published 2025-06-13 by Kevin Feasel

Anil Kumar Moka does a bit of data loading:

In our ongoing exploration of Snowflake data loading strategies, we’ve previously examined how to use pandas with SQLAlchemy to efficiently move data into Snowflake tables. That approach leverages pandas’ intuitive DataFrame handling and works well for many common scenarios where you’re already manipulating data in Python before loading it to Snowflake.

In this article, we’re diving deeper into the Snowflake toolbox by exploring the native Snowflake Connector for Python. While pandas offers simplicity and familiarity, the native connector provides a different set of capabilities focused on precision control and Snowflake-specific optimizations. This article explains when and how to use this more direct approach for everything from small CSV files to massive datasets that would overwhelm pandas.

Click through for the full article.

Optimizing a Snowflake Data Warehouse

Published 2025-06-12 by Kevin Feasel

Harshavardhan Yedla gives us some guidance:

Optimizing a Snowflake data warehouse (DWH) is crucial for ensuring high performance, cost-efficiency, and long-term effectiveness in data processing and analytics. The following outlines the key reasons optimization is essential:

Read on for some tips around optimizing Snowflake warehouses. A lot of this stays at a pretty high level and doesn’t provide detailed guidance, but it’s a good checklist for thinking about your own situation.

Analyzing Snowflake Costs

Published 2025-05-21 by Kevin Feasel

Kevin Wilkie watches a moth fly out of his wallet and wonders where all of the money went:

Last time, in Dashboard Dreams and Snowflake Schemes, we talked a little about showing how much Snowflake really costs in a dashboard internal to Snowflake itself instead of having to push it to Power BI, Tableau, Looker, or a myriad of other tools.

This time, let’s take it a step further: instead of sticking with the basic bar charts or exploding pie charts, we’ll explore how to better highlight usage trends by adding a Rolling 7-Day Average to our visualizations. This helps us more easily spot patterns and anomalies within our warehouses.

Read on for a pair of queries and a neat chart.
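
For a sense of what a rolling 7-day average does to a series, here is a minimal Python sketch. It is not one of the queries from the linked post. It assumes one credit total per consecutive day with no gaps, and the function name and sample numbers are made up for illustration.

```python
def rolling_average(daily_values, window=7):
    """Trailing average over the last `window` days, including the current day.

    For the first few days, the average covers only the days seen so far.
    """
    averages = []
    running_total = 0.0
    for i, value in enumerate(daily_values):
        running_total += value
        if i >= window:
            # Drop the day that just fell out of the window.
            running_total -= daily_values[i - window]
        days_in_window = min(i + 1, window)
        averages.append(running_total / days_in_window)
    return averages


# Hypothetical daily credit totals: a steady week, then a spike on day 8.
daily_credits = [7, 7, 7, 7, 7, 7, 7, 14]
smoothed = rolling_average(daily_credits)
```

In this sample the spike doubles the daily figure but moves the 7-day average only from 7 to 8, which is the smoothing effect that makes trends easier to read than raw daily bars.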

Real-Time Data Streaming in Snowflake

Published 2025-05-09 by Kevin Feasel

Anil Kumar Moka streams some data:

Real-time data ingestion has become essential for modern analytics and operational intelligence. Organizations across industries need to process data streams from IoT sensors, financial transactions, and application events with minimal latency. Snowflake offers two robust approaches to meet these real-time data needs: Snowpipe for near-real-time file-based streaming and Direct Streaming via Snowpark API for true real-time data integration.

This guide explores both options in depth, providing detailed implementations with explanations of code parameters, performance comparisons, and practical recommendations to help you choose the right approach for your specific use case.

Click through to see how it works. I’ll only make one semi-snarky comment that ‘real-time’ doesn’t mean “takes several seconds” but I realize I’m the one tilting at windmills here.

Choosing a Warehousing Data Architecture

Published 2025-05-07 by Kevin Feasel

James Serra compares and contrasts OLAP architectures:

As discussed in my blog and book “Deciphering Data Architectures: Choosing Between a Modern Data Warehouse, Data Fabric, Data Lakehouse, and Data Mesh” (Amazon), organizations are often challenged with choosing the right data architecture to meet their business goals—especially as AI and data-driven decision-making take center stage. To help clarify, here’s a quick review of the four core architectures, followed by guidance on when to use each. Each architecture includes five stages of data movement – ingest, store, transform, model, and visualize (described here).

Click through for James’s take on how each of them works and when you might choose one over the other.

Temp Table Bugs in Microsoft Fabric Warehouses

Published 2025-04-29 by Kevin Feasel

Jared Westover runs into a wall:

I was excited when Microsoft announced the ability to create session-scoped temporary tables in a Fabric warehouse. However, after using Microsoft Fabric temporary tables, I quickly felt disappointed. When will they be ready for prime time, and in the meantime, what other options are available?

Click through for Jared’s experience, although it might already be fixed.

Comparing Microsoft Fabric to Snowflake

Published 2025-04-22 by Kevin Feasel

Evanjalin Joseph lays out a comparison:

Take ShopSmart, a global retail chain that operates both online and offline. The company wants to combine its sales, inventory, and customer data in order to facilitate real-time reporting and predictive analytics. Two top platforms are being assessed by the IT team for this change.

Azure, Power BI, and Microsoft 365 are already widely used by ShopSmart, which is in line with Fabric’s integrated ecosystem. The alternative, however, provides more multi-cloud flexibility and strong performance on structured data. The group has to choose between selecting a more specialized warehousing solution with more deployment options or making use of its current Microsoft investments.

Let’s examine the differences between the two platforms.

Click through for an overview of each platform and how they stack up against one another.
