Arun Sirpal explains a common architectural pattern:
Let’s go back to data platforms today and I want to talk about a very common integration I see nowadays, Azure Blob Storage linked to Snowflake via a storage integration which then we can access semi structured files via external tables, it is a good combination of technology I have to say.
Click through for an architecture diagram and example of the code you’d need.
Arun Sirpal reads the fine print:
This is the managed instance link feature; I really like this, if you know about Distributed AGs then you may know they are tricky to setup (well I found this) but Microsoft takes care of this out of the box.
The point of this quick blog is not how to set this up but the benefit of enabling the Managed Instance as “ license free “ via the hybrid failover rights option – do not forget about this.
Read on for the list of requirements.
Arun Sirpal gives us an overview:
Chaos engineering is fun but especially important when building solutions in the cloud. It is great leveraging the cloud to build something, whether that’s a globally distributed website with lots of traffic or an internal 3 tier application for a business – the question is – what happens is there is an unexpected fault / disruption? Can your system / app withstand the issue?
Click through for the overview, as well as some additional resources you can use to try it out.
Arun Sirpal pushes the big red button:
We all would like to save money when operating in the cloud, Microsoft has released a stop / start concept for SQL Managed Instances – preview mode!
At the time of writing, the Managed Instance needs to be built from the November wave where you will see the functionality in the overview section and it has to be in the General purpose tier. If you have managed links or failover groups then you cant use this feature.
There is a kicker, however, which makes this a less-than-pleasant option. Arun has more detail.
Arun Sirpal doesn’t have time to create indexes:
I will use a clone of the table to compare it to when search optimisation is on. I will make sure no caching in on which could affect the test.
I activate the feature via:
ALTER TABLE data_staging ADD SEARCH OPTIMIZATION;
This takes time! If you run something like the below to confirm 100% completion. This is because there is a maintenance service that runs in the background responsible for creating and maintaining the search access path:
Click through to see what happens and the kinds of performance gains Arun realized.
Arun Sirpal is looking for a query plan:
Whatever your database background is you will have heard of an execution / explain plan. Snowflake is no different. Coming from a MS SQL background I was used to reading plans from right to left. Now it is bottom to the top!
Read on to see how you can find the query plan for an operation.
Arun Sirpal sets up Snowflake:
Now let’s start the process of creating a snowflake account in the Azure Cloud. You can sign up for a free trial from here – https://signup.snowflake.com/ I am going to bypass this and go straight to the setup screens. (This is slightly different because as an org-admin I have the power to create accounts)
Select the cloud provider and edition you require; we have already discussed these options before. You know me, its going to be Azure but feel free to dive into AWS or GCP.
Read on for some step-by-step installation instructions.
Arun Sirpal lays out the foundation of Snowflake DB’s architecture:
At the most basic level, Snowflake has 3 important components. The Cloud services layer, centralised storage layer and the compute layer.
Cloud services – they call this the “brains” of snowflake. This is where infrastructure management takes place, the optimiser is based (cost-based), metadata management and security (authentication and access control) are handled.
Read on to learn about the other two layers and how they meet.
Arun Sirpal helps us make a choice:
You have decided that snowflake is the technology you want to use to build your next gen data platform, you have decided your cloud provider (Azure, AWS, GCP) then next you need to think about what edition of snowflake suits your business needs?
Read on to see what the major differences are between editions.
Arun Sirpal has a list:
Storage Accounts are pretty much integrated into so many different designs in Azure, whether you are using Azure Synapse, 3rd party product like Snowflake, or Event Streaming designs – we need it.
When you create a storage account there are 5 different replication types you should know about. These are LRS, ZRS, GRS, RA-GRS and GZRS. Lots of abbreviations here, lets explain further.
Read on for the explanation.
Can't find what you're looking for? Try refining your search: