
Category: Warehousing

Exposing Kafka Data in Iceberg using Tableflow

Marc Selwan announces a new product:

We’re excited to talk about our vision for Tableflow, which makes it push-button simple to take Apache Kafka® data and feed it directly into your data lake, warehouse, or analytics engine as Apache Iceberg® tables. Making operational data accessible to the analytical world is traditionally a complex, expensive, and brittle process and we believe we can do better to unify the operational and analytical estates.

Tableflow removes all this erroneous, duplicative work and helps convert Kafka topics and associated schemas to Iceberg tables in one click. This is central to Confluent’s vision to build the world’s leading data streaming platform that fuels any operational and analytical workload with real-time data products.

It looks like this is currently in early access, but you can see where Confluent intends to take the product.
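
Once a topic lands as an Iceberg table, any Iceberg-aware engine should be able to query it with plain SQL. A minimal sketch, assuming a Spark SQL session with an Iceberg catalog named tableflow and a table prod.orders (all names here are hypothetical, not from the announcement):

    -- Hypothetical: query a Kafka topic that Tableflow has exposed as an Iceberg table.
    -- Catalog, namespace, table, and column names are placeholders.
    SELECT order_id,
           customer_id,
           order_total
    FROM   tableflow.prod.orders          -- Iceberg table backed by the topic's schema
    WHERE  order_ts >= TIMESTAMP '2024-01-01 00:00:00'
    ORDER  BY order_ts DESC
    LIMIT  100;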


Dropping Objects in SQL Server and Snowflake

Kevin Wilkie gets the drop on us:

When you’re working between SQL Server and Snowflake, there can be a lot of crossover that may make you forget what system you’re working in. Sometimes it’s close, but not close enough.

Today, let’s go over something that should be rather simple – removing old objects that we shouldn’t need any longer.

Read on to see how the two data platform technologies differ in this regard.
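
As a rough illustration of the kind of divergence involved: both platforms accept IF EXISTS on DROP, but Snowflake additionally keeps dropped tables recoverable via UNDROP within the Time Travel retention window. Object names below are placeholders:

    -- SQL Server (2016+): DROP ... IF EXISTS avoids an error when the object is already gone.
    DROP TABLE IF EXISTS dbo.StagingOrders;
    DROP PROCEDURE IF EXISTS dbo.LoadStagingOrders;

    -- Snowflake: same IF EXISTS syntax, but the table stays recoverable during retention.
    DROP TABLE IF EXISTS STAGING.ORDERS;
    UNDROP TABLE STAGING.ORDERS;   -- restores it if it was dropped by mistake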


Visualizing Genomics Data with Streamlit in Snowflake

Rebecca O’Connor builds an app:

The blog describes how this data is complemented with the following two additional data sets:

  • An Annotation dataset
  • A Panel Dataset

Simple SQL queries are then used to gain answers to a multitude of questions held within the vast amount of data.

I utilised the same datasets in order to create a Streamlit app.

Click through for the code. This is the reason why I like Streamlit so much: you can build an interactive data-centric application very easily. Granted, you can abuse Streamlit pretty hard, but it is powerful.
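
As a flavor of the “simple SQL queries” the post mentions, something like the following sketch would count variants per gene by joining genotype calls to the annotation dataset. Every table and column name here is a hypothetical stand-in, not taken from the article:

    -- Hypothetical schema: variants per gene on one chromosome,
    -- joining genotype calls to the annotation dataset.
    SELECT a.gene_name,
           COUNT(*) AS variant_count
    FROM   genotypes g
    JOIN   annotations a
           ON  g.chromosome = a.chromosome
           AND g.position BETWEEN a.start_position AND a.end_position
    WHERE  g.chromosome = '22'
    GROUP  BY a.gene_name
    ORDER  BY variant_count DESC;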


Data Vault 2.0 Models in Microsoft Fabric

Michael Olschimke and Dmytro Polishchuk continue a series:

The last article in this blog series discussed the basic entity types in Data Vault 2.0: hubs, links and satellites. While it would be theoretically possible to limit a model to just these three basic entity types, the resulting Data Vault model would be inefficient: it would most likely consume too much storage, be less efficient due to the many joins, and require a number of grain shifts during information delivery. This is due to certain characteristics in the data that require special treatment.

For these characteristics, Data Vault 2.0 provides special entity types that deal with the specialities. This article focuses on two of them: the non-historized link, which is used to capture transactions and events, and the multi-active satellite, which is used to model multiple active descriptions for the same parent hub or link in the same load.

Read on for an example of how to implement this in a Microsoft Fabric warehouse.
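
As a rough T-SQL sketch of the two entity types (table names, hash keys, and attributes are illustrative, not from the article): the non-historized link carries its descriptive attributes directly and is never end-dated, while the multi-active satellite allows several current rows per parent key, distinguished by a subsequence attribute.

    -- Non-historized link: one row per transaction/event, so no change tracking or end-dating.
    CREATE TABLE dv.link_payment_t (
        link_payment_hk   CHAR(32)      NOT NULL,   -- hash key of the transaction
        hub_customer_hk   CHAR(32)      NOT NULL,
        hub_account_hk    CHAR(32)      NOT NULL,
        load_date         DATETIME2(3)  NOT NULL,
        record_source     VARCHAR(100)  NOT NULL,
        payment_amount    DECIMAL(18,2) NOT NULL,   -- descriptive attributes live on the link
        payment_ts        DATETIME2(3)  NOT NULL
    );

    -- Multi-active satellite: several active rows per parent hub key in the same load,
    -- distinguished by a subsequence attribute (here, phone_type).
    CREATE TABLE dv.sat_customer_phone_ma (
        hub_customer_hk   CHAR(32)      NOT NULL,
        load_date         DATETIME2(3)  NOT NULL,
        phone_type        VARCHAR(20)   NOT NULL,   -- multi-active key
        phone_number      VARCHAR(50)   NOT NULL,
        hashdiff          CHAR(32)      NOT NULL,
        record_source     VARCHAR(100)  NOT NULL
    );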


Create and Connect to a Fabric Data Warehouse

Olivier Van Steenlandt builds a warehouse:

In this data recipe series, Microsoft Fabric – Data Warehouse will be explored. As a starting point, a blank Fabric workspace is used. You can sign up for a free Fabric trial by using the following URL: Data Analytics | Microsoft Fabric

In this data recipe, we will create a brand-new Data Warehouse in Fabric. Once created, we will connect to our Data Warehouse using Azure Data Studio.

Click through for the step-by-step process.
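
Once Azure Data Studio is connected to the warehouse’s SQL connection string, a quick sanity check is ordinary T-SQL; the table below is purely illustrative:

    -- Run against the new Fabric warehouse after connecting from Azure Data Studio.
    CREATE TABLE dbo.ConnectionTest
    (
        TestId      INT          NOT NULL,
        TestMessage VARCHAR(100) NOT NULL
    );

    INSERT INTO dbo.ConnectionTest (TestId, TestMessage)
    VALUES (1, 'Connected to the Fabric Data Warehouse');

    SELECT TestId, TestMessage
    FROM   dbo.ConnectionTest;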


Common Warehouse Load Patterns

Ben Johnston continues a series on warehouse load patterns:

This continues and finishes my two-part series on warehouse load patterns. There are many methods to transfer rows between systems from a basic design perspective. This isn’t specific to any ETL tool but rather covers the basic patterns for moving data. The most difficult part in designing a pattern is efficiency. It has to be accurate and not adversely impact the source system, but this is all intertwined and dependent on efficiency. You only want to move the rows that have changed or been added since the previous ETL execution (the deltas). This reduces the network load, the source system load (I/O, CPU, locking, etc.), and the destination system load. Being efficient also improves speed and, as a direct result, increases the potential frequency of each ETL run, which has a direct impact on business value.

The pattern you select depends on many things. The previous part of the series covers generic design patterns and considerations for warehouse loads that can be applied to most of the ETL designs presented below. This section covers patterns I have used in various projects. I’m sure there are some patterns I have missed, but these cover the most used types that I have seen. These are not specific to any data engine or ETL tool, but the examples use SQL Server as a base for functionality considerations. Design considerations, columns available, administrative support, DevOps practices, reliability of systems, and cleanliness of data all come into consideration when choosing your actual ETL pattern.

Click through for a compendium of common patterns you can use to indicate that a row should go into a warehouse.
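
One of the most common of those patterns is a watermark-driven delta load: pull only the rows whose change-tracking column is newer than the last successful run, then merge them in. A simplified T-SQL sketch, with control, staging, and fact table names invented for illustration:

    -- 1. Get the high-water mark from the previous successful load.
    DECLARE @LastLoad DATETIME2(3) =
        (SELECT MAX(LastLoadedModifiedDate)
         FROM etl.LoadControl
         WHERE TableName = 'Sales.Orders');

    -- 2. Pull only the delta from the source into staging.
    INSERT INTO stg.Orders (OrderId, CustomerId, OrderTotal, ModifiedDate)
    SELECT OrderId, CustomerId, OrderTotal, ModifiedDate
    FROM   Sales.Orders
    WHERE  ModifiedDate > @LastLoad;

    -- 3. Merge the delta into the warehouse table.
    MERGE dw.FactOrders AS tgt
    USING stg.Orders    AS src
        ON tgt.OrderId = src.OrderId
    WHEN MATCHED THEN
        UPDATE SET tgt.OrderTotal   = src.OrderTotal,
                   tgt.ModifiedDate = src.ModifiedDate
    WHEN NOT MATCHED THEN
        INSERT (OrderId, CustomerId, OrderTotal, ModifiedDate)
        VALUES (src.OrderId, src.CustomerId, src.OrderTotal, src.ModifiedDate);

    -- 4. Record the new high-water mark for the next run.
    UPDATE etl.LoadControl
    SET    LastLoadedModifiedDate = (SELECT MAX(ModifiedDate) FROM stg.Orders)
    WHERE  TableName = 'Sales.Orders';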


Warehousing and Power BI in Microsoft Fabric

Tomaz Kastrun continues a series on Microsoft Fabric. Day 15 covers building a warehouse:

I have named mine “Advent2023_DWH”.

You can create a warehouse using T-SQL scripts, from Dataflow Gen2, from data pipelines, and from the sample data. Let’s select the sample data and grab a coffee.

Day 16 looks at data pipelines:

With the Fabric warehouse created and explored, let’s see how we can use pipelines to get data into the Fabric warehouse.

In the existing data warehouse, we will introduce new data. By clicking “new data”, two options will be available: pipelines and dataflows. Select pipelines and give it a name.

And Day 17 provides a primer on how Power BI can read Fabric assets:

Within Power BI in Fabric, you will find many of the components that can be used to create a final report. Here are the components:


ML Models and Data Warehouses in Microsoft Fabric

Tomaz Kastrun continues a series on Microsoft Fabric. First up is creating ML models:

Protip: Both experiments and the ML model version look similar, and you can intuitively switch between both of them. But do not get confused, as the ML Model version applies the best-selected model from the experiment and can be used for inference.

Then we switch context to data warehousing:

Today we will start exploring the Fabric Data Warehouse.

With its data lake-centric logic, the data warehouse in Fabric is built on a distributed processing engine that enables automated scaling. The SaaS experience creates a segue to easier analysis and reporting, and at the same time gives the ability to run heavy workloads against open data formats simply by using Transact-SQL (T-SQL). Microsoft OneLake holds a single copy of the data for all of the services, and that copy can be consumed in a data warehouse, data lake, or SQL analytics endpoint.
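
That single copy in OneLake is what makes cross-item querying possible; from a Fabric warehouse you can, for example, reference a lakehouse table in the same workspace with three-part naming. A small sketch, where the lakehouse name and all tables are hypothetical (only the Advent2023_DWH name comes from the series):

    -- Hypothetical names: join a warehouse dimension to a table that physically lives in a
    -- lakehouse, without copying data, because both are backed by the same OneLake storage.
    SELECT d.CustomerName,
           SUM(f.SalesAmount) AS TotalSales
    FROM   Advent2023_DWH.dbo.DimCustomer      AS d
    JOIN   Advent2023_Lakehouse.dbo.FactSales  AS f
           ON f.CustomerKey = d.CustomerKey
    GROUP  BY d.CustomerName;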


Constraints in Microsoft Fabric Data Warehouses

Brian Bønk slips out of the constraints:

When working with data and building data models, I personally seldom use the constraints feature on a database. Call me lazy – but I think constraints add unnecessary complexity when building data models for reporting. Especially if you are working with some of the new platforms – like Microsoft Fabric, where you are using stateless compute, i.e. data storage is separated from the compute layer.

I understand the need for constraints on other database systems, like OLTP systems.

In reporting models it can be somewhat useful to have constraints between tables, as they help/force you into some level of governance in your data model.

But how can we use them in Microsoft Fabric, and are they easy to work with?

Read on for those answers. I will note that I’m a stickler about constraints in transactional systems, though I agree that constraints in warehouses are not critical—assuming, at least, that you’re following the Kimball approach and have one and only one mechanism to write data, and that you have other mechanisms for vetting data quality.
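
For reference, the constraints a Fabric warehouse does accept are informational only: primary key and unique constraints must be declared NONCLUSTERED and NOT ENFORCED, and foreign keys NOT ENFORCED, so they document the model without the engine validating rows. A small sketch with illustrative table names:

    -- Fabric Warehouse: constraints are metadata only, so they must be created as NOT ENFORCED.
    ALTER TABLE dbo.DimCustomer
        ADD CONSTRAINT PK_DimCustomer PRIMARY KEY NONCLUSTERED (CustomerKey) NOT ENFORCED;

    ALTER TABLE dbo.FactSales
        ADD CONSTRAINT FK_FactSales_DimCustomer
        FOREIGN KEY (CustomerKey) REFERENCES dbo.DimCustomer (CustomerKey) NOT ENFORCED;

    -- The engine will not reject duplicates or orphaned keys; data quality checks stay in the load process.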


Data Warehouse ETL Patterns

Ben Johnston starts a new series:

No matter the ETL tool used, there are some basic patterns to follow when transferring data between systems. There are many data tools and platforms, but the basic patterns remain the same. This focuses on SQL Server, but most of these methods work in any data platform. Even if you are using a virtualization layer, you likely need to prepare the data before exposing it to that engine, which means ETL and data transfers.

Warehouse is defined very loosely here as a data warehouse, but the same process applies to other systems. This includes virtualization layers and, to a smaller degree, bulk transfers between transactional systems.

Read on for a few things Ben recommends you have in place before beginning the project, as well as several warehouse loading patterns.
