
Month: September 2021

Synapse vs Snowflake

Travis Manning has a throw-down:

Data warehousing has become a hot topic for most organizations as data volume grows exponentially, and yet the capacity to manually manage it all but diminishes. The ecosystem is replete with options, each with a host of features and integrations. In this article, we will discuss two of the most common (and commonly discussed!) data warehousing services, Azure Synapse and Snowflake Data Warehouse (DW). For this article, we will try to focus on use cases, and which option is appropriate in that context.

Click through for the product comparison. One big difference it doesn’t cover is pricing uncertainty. If you have a good understanding of how many queries you will run, how computationally complex they are, and how much data you hold, Snowflake can be very competitively priced. Without that understanding, the competitive price can turn into a much-less-competitive one by the time you’re fully up to speed.


Deploying Synapse Artifacts to a Managed vNet Workspace

Rui Cunha takes us through an Azure Synapse Analytics deployment scenario:

In my previous article, I demonstrated how we could easily use the Synapse Workspace Deployment extension to accomplish this second stage of the process. I’m now coming back to this topic as I realized that many of our customers were reporting difficulties in completing this second stage of their Synapse CICD process because they were failing to deploy Synapse artifacts to a Managed VNET Synapse Workspace.

In this particular scenario, the deployment was failing because their target workspace was not allowing access from public networks.

Fortunately, the answer isn’t “Allow access from public networks.” Click through to see what you can do instead.


Using Azure Queue Storage as a Trigger for Function Apps

Aveek Das shows how you can use Azure Queue Storage as a way to trigger an Azure Function App:

In this article, we are going to learn how to trigger Function Apps from Queue Storage in Azure. Function Apps has been one of the most popular cloud services of Microsoft Azure. Function Apps allow users to write code in any language and then execute the code in the cloud. There is no infrastructure to be managed and hence is very flexible for writing and building applications on the go. Every Function App can be triggered in multiple ways, for example, by calling the function URL using an HTTP endpoint or from some other functions in Azure. In this article, we are going to trigger the Function App from Queue Storage in Azure and see how to pass a message from the queue to the Function App.

Queue Storage in Azure is another service in Azure that allows users to store multiple messages in it. Users can use a queue to create a list of items that need to be processed one by one. Messages to Queue Storage in Azure can be added by using the HTTP or HTTPS endpoints. Usually, a queue can store data up to 64 KB in size. We can add millions of messages in a queue if it is supported by the storage account.

Click through to see how. Though now I wonder why I might use Queue Storage instead of an Event Hub or an Event Grid. But I suppose that’s a question for a different article.
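To make the idea in the excerpt concrete, here is a minimal sketch of the pattern it describes: messages go onto a queue, each one is capped at 64 KB, and a handler processes them one by one. This is a toy in-memory simulation, not the Azure SDK or the Functions runtime; `InMemoryQueue`, `send`, `drain`, and `handle_order` are hypothetical names made up for illustration.

```python
from collections import deque
from typing import Callable, Deque, List

# The per-message size limit mentioned in the excerpt (64 KB).
MAX_MESSAGE_BYTES = 64 * 1024


class InMemoryQueue:
    """Toy stand-in for a storage queue. Hypothetical; not the Azure SDK."""

    def __init__(self) -> None:
        self._messages: Deque[str] = deque()

    def send(self, text: str) -> None:
        """Add a message, rejecting anything over the size limit."""
        size = len(text.encode("utf-8"))
        if size > MAX_MESSAGE_BYTES:
            raise ValueError(
                f"message is {size} bytes; limit is {MAX_MESSAGE_BYTES}"
            )
        self._messages.append(text)

    def drain(self, handler: Callable[[str], None]) -> int:
        """Hand each queued message to the handler in arrival order.

        This plays the role of the trigger: one handler call per message.
        Returns the number of messages processed.
        """
        processed = 0
        while self._messages:
            handler(self._messages.popleft())
            processed += 1
        return processed


# Hypothetical handler standing in for the function body.
results: List[str] = []


def handle_order(message: str) -> None:
    results.append(message.upper())


queue = InMemoryQueue()
queue.send("order-1")
queue.send("order-2")
count = queue.drain(handle_order)
print(count, results)
```

In the real service the runtime does the polling and invokes your function for you; the sketch only shows the shape of the contract, which is a size-limited message in and one handler call out.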


Capturing SQL Server Login Details with Extended Events

Jack Vamvas shows how to track SQL Server logins:

I have to capture logon information details for a specific logon on a SQL Server. Specifically – the client_hostname, nt_username & username. What I’m looking for is a log recording a successful connection made to the server. The event should be triggered a) when a connection is made & b) from a connection pool.

Click through to see how.


From Azure Data Factory to Synapse Pipelines

Kevin Chant copies and pastes:

In this post I want to share an alternative way to copy an Azure Data Factory pipeline to Synapse Studio. Because I think it can be useful.

For those who are not aware, Synapse Studio is the frontend that comes with Azure Synapse Analytics. You can find out more about it in another post I did, which was a five minute crash course about Synapse Studio.

By the end of this post, you will know one way to copy objects used for an Azure Data Factory pipeline to Synapse Studio. Which works as long as both are configured to use Git.

Click through to see how.


Choosing between Power BI Pro and Premium

Marc Lelijveld has an image for us:

Often I got the question from customers: “Can you assign my workspace to a premium capacity?” But frequently they actually do not really need Power BI Premium. It remains to be a difficult topic to decide whether someone needs Power BI Premium or not. Therefore, I decided to setup a decision tree that helps to decide if you need Power BI Premium or not.

This decision tree highlights a bunch of Premium specific requirements and features like breaking the data size limits, XMLA Endpoints, unlimited content sharing and much more!

Click through to see that decision tree, though note that it does not differentiate between Premium and Premium per User.


Comparing CPU Activity and Diagnosing the Cause

Joe Obbish has a tutorial for us:

Sometimes I have a need to run a quick CPU comparison test between two different SQL Server instances. For example, I might be switching from old hardware to new hardware and I want to immediately see a faster query to know that I got my money’s worth. Sometimes I get a spider sense while working with virtualized SQL Server instances and want to check for problems. Yesterday, I was doing a sort of basic health check on a few servers that I hadn’t worked with much and I wanted to verify that they got the same performance for a very simple query.

Click through for an easy test script and a good amount of diagnosis to understand why there is a significant difference between two instances.


De Moivre’s Equation and Sample Size-Based Variance

Holger von Jouanne-Diedrich demonstrates de Moivre’s equation:

Over one billion dollars have been spent in the US to split up big schools into smaller ones because small schools regularly show up in rankings as top performers.

In this post, I will show you why that money was wasted because of a widespread (but not so well known) statistical artifact, so read on!

Do read on to learn more about this paradox.
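For a quick feel for the artifact, de Moivre’s equation says the standard error of a sample mean is the population standard deviation divided by the square root of the sample size: SE = σ / √n. The sketch below is my own simulation, not the author’s code. It assumes every student’s score comes from the same normal distribution (mean 100, standard deviation 15, both arbitrary choices) and compares schools of 25 students against schools of 400. The function name and all parameters are hypothetical.

```python
import random
import statistics
from typing import List


def simulate_school_means(
    school_size: int,
    n_schools: int,
    rng: random.Random,
    mean: float = 100.0,
    sd: float = 15.0,
) -> List[float]:
    """Average score per school, with every student drawn from one distribution."""
    return [
        statistics.fmean([rng.gauss(mean, sd) for _ in range(school_size)])
        for _ in range(n_schools)
    ]


rng = random.Random(42)  # fixed seed so runs are repeatable

small = simulate_school_means(school_size=25, n_schools=500, rng=rng)
large = simulate_school_means(school_size=400, n_schools=500, rng=rng)

# Spread of school averages; de Moivre predicts 15/sqrt(25) and 15/sqrt(400).
spread_small = statistics.stdev(small)
spread_large = statistics.stdev(large)

print(f"small schools: spread {spread_small:.2f}, predicted {15 / 25 ** 0.5:.2f}")
print(f"large schools: spread {spread_large:.2f}, predicted {15 / 400 ** 0.5:.2f}")
print(f"best small {max(small):.1f} vs best large {max(large):.1f}")
print(f"worst small {min(small):.1f} vs worst large {min(large):.1f}")
```

No school in this simulation is actually better than any other, yet the small schools should land at both the top and the bottom of the ranking simply because their averages vary more.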


Visualization in Spark with Drsti

Jean-Georges Perrin shows off a Spark library:

I was looking for an effortless data visualization that would interface easily with Apache Spark. I found a few interesting tools, but nothing that would not require some complex interfacing, setup, or infrastructure. In a good geek way, I then decided to write the tool. This lack of simple tools is how Drsti (pronounced drishti) was born.

Aren’t you tired of looking at dataframes that looked like they came straight from a 1980 VT100? Sure, if you use notebooks, either standalone or hosted (IBM Watson Studio, Databricks…), you are not (or less) confronted with the issue. However, if you are building pipelines outside of the Data Science toys, oops, tools, you may need to visualize data in a graph.

Read on to see how it works and some of what you can do with Drsti.


An Intro to Dapr

Steve Jones tries out Dapr:

I’ve heard about Dapr a few times from developer friends, but hadn’t really understood it that well. I had a webinar coming up, so I decided to spend a bit of time working with it to understand how it might function with an application.

I went to https://dapr.io/, and saw the basic outline of Dapr is in this video from their site. I also found this getting started video from Donovan Brown.

Note that Dapr is totally different from Dapper.
