Press "Enter" to skip to content

Month: December 2021

Automatically Stopping Data Explorer Clusters

Gabi Lehner has good news for us:

Azure Data explorer team is constantly focused on reducing COGS and making sure users are paying only for value they are getting.

As part of this initiative, we’re now adding a new automatic capability to stop unused clusters.

In case, you created a cluster and did not ingest any data to it or even if you ingested data and later, you’re not running any queries or ingesting new data for days, we will automatically stop that cluster.

I have two thoughts on this. First, good. Frankly, every cloud service should have automatic pausing unless it makes sense not to—that is, pausing should be the default, not a feature you add later. This is especially true for expensive data processing services.

Second, based on the description, I think I’d like a little more control over this, in terms of how long we go before auto-stop kicks off. It’s ten days, which is a reasonable + large number, but other numbers could make just as much sense for a given user. I like the idea that we see in Databricks and in Azure Synapse Analytics Spark pools: give me a reasonable default, but let me change it in case the reasonable default can’t cut it for some reason.

Comments closed

Tower of Hanoi in T-SQL

Tomaz Kastrun would like to play a game:

T-SQL Code for the popular game of “Tower of Hanoi”, that can be played in Microsoft SQL Server, Azure Data Studio or any other T-SQL editor with support of query execution.

Given that this is the game you use to teach students recursion, I figured a T-SQL based solution would be interesting. Well, Tomaz has the solution and the workspace to play it yourself.

Comments closed

Quantifier Predicates in SQL

Joe Celko takes us through quantifier predicates:

SQL is based on set theory and logic. The only bad news is that many programmers have never had a class on either of those topics. They muddle through using the Boolean operators in their programming language and think that’s all there is to formal logic.

Let’s flashback to the early days of logic and play catch up. We need to start with syllogisms. Syllogisms are logical forms made up of combinations of two statements about classes of things that lead to a conclusion. They were invented by the Greeks and written up by Aristotle in Prior Analytics. You might have run into them, If you had a philosophy class that included lectures on formal fallacies. The three forms of statements allowed are:

Click through to receive a brief primer on formal logic and learn more about how SQL implements these concepts.

Comments closed

Creating a Synapse Workspace with Data Exfiltration Protection

I have a post on creating a new Azure Synapse Analytics workspace:

As a quick upshot, having a managed VNet set up means that any Spark pools you create will have subnet segregation, meaning that the Spark machines will be in their own subnet, away from everything else. This provides a bit of cross-pool protection for you automatically. It also performs similar network isolation for your Synapse workspace, keeping it separated from other workspaces. The other big thing it does is create managed private endpoints to the serverless and dedicated SQL pools, which means that any network traffic between these pools and resources in the Synapse workspace will be guaranteed to transit over Azure networks and not the public internet, at least until it gets to you hitting the web.azuresynapse.net URL (and there are additional methods to lock down that part of it that we won’t cover today).

By default, the portal will not create a managed virtual network, so you’ll need to enable it at creation time. You cannot enable or disable the managed virtual network setting after a workspace has been created, so if you make a mistake, you’d need to rebuild the workspace, though you can at least use the same storage account.

One last thing that managed virtual networks offer you is the ability to enable data exfiltration protection.

Click through to see how it all works. Data exfiltration protection can limit you a bit, and that can be quite frustrating, but it does what it says…in the same way that Draconis did what he said.

Comments closed

Building Data Mesh Edges in Azure

Paul Andrew continues a series on data mesh in Azure:

Anyway, moving on. In part 2 of this blog series, keeping the same focus from part 1, with the first data mesh principal. Let’s take our nodes and start thinking about the edges. The data mesh – data product interfaces… Enter my Azure Resource Group with arms/antenna type things, seen on the right 

Caveat: as you may have already gathered, I’m going to use the terms edge and interface a lot in this post. The meaning in the context of the data mesh is the same. Nodes with edges, nodes with interfaces.

Click through for more detail.

Comments closed

Reference to Database Not Supported in this Version of SQL Server

Diego Nieto-Arroyo troubleshoots an issue:

In this article I will show how to resolve and overcome errors while attempting to query a system tables or any table for that matter, via Linked Servers. The error seen below is the result of the issue we encountered.

Msg 40515, Level 16, State 2, Line 1

Reference to database and/or server name in ‘xxxx.sys.sp_tables_info_90_rowset_64’ is not supported in this version of SQL Server.

Read on to see what causes this issue and how you can resolve it.

Comments closed

Black-Scholes Pricing in Power Query

Imke Feldmann has a formula for us:

The Black Scholes formula returns the value of European put and call options. The version I’m sharing here uses the standard normal cumulative distribution function from my previous blogpost.

Click through for an R version followed by the same function in M. I was going to comment on how much more code the M version was, but about half of that difference was Imke kindly adding in documentation and the other half was the inclusion of the normal CDF generator. Otherwise, it’s roughly the same number of lines.

Comments closed

Reporting Options in Power BI

Soheil Bakhshi summarizes the set of reporting options available in Power BI:

Power BI Service

Power BI Service is a SaaS (Software as a Service) offering from Microsoft in the cloud. The users within an organisation, depending on their access rights, may create reports directly in Power BI Service. The users can also securely share and distribute those reports. While creating or editing reports is possible in Power BI Service, it is strongly recommended to avoid this method for several reasons. The most obvious one is that the changes we make to a report may soon get overwritten by someone else that republishes the same report from Power BI Desktop. Check this blog post from SQLChick to see why you should avoid creating or editing reports directly from Power BI Service. The reports are downloadable in PBIX format. 

You can access Power BI Service here.

Click through for the full list and lots of detail.

Comments closed

Functional Programming in Python

Mehreen Saeed tries to tempt me into liking Python:

Python is a fantastic programming language. It is likely to be your first choice for developing a machine learning or data science application. Python is interesting because it is a multi-paradigm programming language that can be used for both object-oriented and imperative programming. It has a simple syntax that is easy to read and comprehend.

In computer science and mathematics, the solution of many problems can be more easily and naturally expressed using the functional programming style. In this tutorial, we’ll discuss Python’s support for the functional programming paradigm, and Python’s classes and modules that help you program in this style.

Click through to see how you can write functional-friendly programs in Python.

Comments closed