Press "Enter" to skip to content

Curated SQL Posts

How SQL Server Stores sql_variant Data

Randolph West answers one of the mysteries in life:

By a show of hands, who uses the sql_variant data type? In my experience the answer is “no one,” and that’s not necessarily a bad thing. There’s a general philosophy in our industry where we assume that columns are a specific data type. This predictability makes writing queries and applications against those columns easier. It’s difficult enough dealing with NULL values, so adding confusion with handling data conversions is an invitation to introduce bugs.

The sql_variant data type is a bit of a relic. I think the people who used that were also big variant data type users in Visual Basic. Nonetheless, Randolph clarifies how SQL Server stores this data.

Comments closed

PolyBase and Availability Groups

Rajendra Gupta has a detailed article on working with PolyBase in an Availability Group:

In this 28th article for SQL Server Always On Availability Group series, we explore the high-availability for the SQL Server PolyBase(SSB) external tables using AG groups.

There’s a lot of detail in the article and it’s worth reading in conjunction with Nathan Schoenack’s post. Someday I’ll get to the blog post on my backlog around PolyBase and AGs, especially with scale-out clusters. Someday.

Comments closed

Recursive Metadata Discovery in Azure Data Factory

Richard Swinbank gives us one method to perform recursive metadata discovery in Azure Data Factory:

Azure Data Factory’s Get Metadata activity returns metadata properties for a specified dataset. In the case of a blob storage or data lake folder, this can include childItems array – the list of files and folders contained in the required folder. If you want all the files contained at any level of a nested a folder subtree, Get Metadata won’t help you – it doesn’t support recursive tree traversal. In this post I try to build an alternative using just ADF.

But before you get too invested in this technique, please read Richard’s spoiler.

Comments closed

Using Pester to Test Cluster Resource Owners

Jess Pomfret has a check for who owns specific failover cluster resources:

If we are going to test that we’re in our expected configuration, we need to record what that configuration looks like.  I have a hard coded list of cluster names. However, you could easily pull them from a text file, or a database.  Once we have the list of clusters we can use Get-ClusterGroup to determine the cluster roles and their current owners.

To persist this owner information I’m using ConvertTo-Json and then outputting it to a file. This creates a file that can easily be read back into PowerShell as an object using ConvertFrom-Json.

It’s also probably worth mentioning that this ideal configuration can be stored in source control. That’ll keep the file safe and you can easily keep track of any changes that are made to it.

Read on for the full set of steps.

Comments closed

Adding Northwind to a SQL Server Instance

Doug Kline brings back a blast from the past:

 This post shows how to run a SQL Server Instance on about any computer using Docker Containers. Your next step might be to get a sample database into that SQL Server Instance. 

Thanks to Microsoft, you can get their sample databases as T-SQL scripts. You can use these to install these databases on whatever server you are connected to, including your “containerized” SQL Server instance.

It’s been a while since I’ve used Northwind, but sometimes you just need a simple database.

Comments closed

Working with Active Power BI Sessions

Marc Lelijveld provides us insights:

Getting more insights in the telemetry of your Power BI dataset is always interesting. Especially if you share your Power BI dataset for self-service purposes to build new reports on top of your managed dataset, you might want to know who is actually using in and what queries are they executing against your datasets.

Besides that, there might be a whole lot of other valid use cases where you want to have more insights in currently running sessions on your dataset. An example where I recently ran into, was canceling a Power BI Dataset refresh. As there is no button in the Power BI Service to do this, I had to find a different way to do this.

Read on to see how, using DAX Studio.

Comments closed

Building a Hadoop Cluster with Spark in Kubernetes

Gopal takes us through building up a Hadoop cluster via Kubernetes:

In our current scenario, we have 4 Node cluster where one is master node (HDFS Name node and YARN resource manager) and other three are slave nodes (HDFS data node and YARN Node manager)

In this cluster, we have implemented Kerberos, which makes this cluster more secure.

Kerberos services are already running in the different server which would be treated as KDC server.

In all of the nodes, we have to do a client configuration for Kerberos which I have already written in my previous blog. please go through below kerberos authentication links for more info.

kerberos authentication

Read on for the walkthrough.

Comments closed

The Dangers of the securityadmin Role

Tibor Karaszi warns us to be wary of the securityadmin role:

Over the last week, I’ve seen two discussions where the securityadmin server role was suggested.

The scenario is typically that we don’t want to make this person a sysadmin, but we want this to manage logins and stuff like that.

But Tibor explains why you should watch out—my rule of thumb is to treat securityadmin as just as important as sysadmin. It’s not the same, but someone with securityadmin can do a lot of damage to a server.

Comments closed

Building a Simple Date Table in Power BI

Marco Russo has a simple method for building a date dimension in Power BI:

With the recent introduction of custom format strings in Power BI, we can use a different approach that no longer requires additional columns, and that can also remove the need to use the Sort by Column feature. The basic idea is that we always store a date for a period like Year, Month, or Week. The custom format string displays the text we want to see in the user interface; however, because the underlying data is a number we automatically get the right sort order and the ability to use the Continuous Type in the X-Axis of a line chart.

Read the whole thing.

Comments closed