
Category: Naming

A Data Governance by any other Name

Matthew Roche wants a re-naming:

To successfully implement managed self-service business intelligence at any non-trivial scale, you need data governance. To build and nurture a successful data culture, data governance is an essential part of the success.

Despite this fact, and despite the obvious value that data governance can provide, data governance has a bad reputation. Many people – likely including the leaders you need to be your ally if you’re working to build a data culture in your organization – have had negative experiences with data governance in the past, and now react negatively when the topic of data governance is raised.

They now treat data governance as a four-letter word.

Read the whole thing, though I do disagree with Matthew. Changing the name does not change the underlying problems; all it does is make the new name just as hated as the old one because the problems are still there. Call it Data Enablement if you’d like, but if the process is the same and the tools are the same, the outcome is the same, regardless of the name.


Updates in Azure Synapse Analytics

Saveen Reddy shows how the Synapse product team has been busy this year:

Previously, Synapse workspaces had a kind of database called a Spark Database. Spark databases had two key characteristics:

– Tables in Spark databases kept their underlying data in Azure Storage accounts (i.e. data lakes)

– Tables in Spark databases could be queried by both Spark pools and by serverless SQL pools.

To help make it clear that these databases are supported by both Spark and SQL and to clarify their relationship to data lakes, we have renamed Spark databases to Lake databases. Lake databases work just like Spark databases did before. They just have a new name.

Okay, this is the kind of change I can do without. That’s a really dumb name. “Spark database” tells you what the thing is: a database which lives in Apache Spark. What do Lake databases run on? Apache Spark. If anything should be called a Lake database, it’d be a serverless SQL pool’s database, because everything in there is built on top of the data lake: it’s all external tables pointing to a lake. So calling a Spark database a Lake database brings more confusion than elucidation.

Most of the other changes on that list? Really cool. This one? Not at all.


Dynamic Column Rename in Power BI with XMLA and TOM

Kristyna Hughes solves a problem:

For the TOM and XMLA experts, imagine this. Your customer wants to dynamically rename columns without using the Power BI Desktop and would prefer all existing report visuals not get broken by the new name. Impossible? Not with TOM, XMLA, and translations within Power BI.

If you’ve ever tried to change a column name in a Power BI source, you’ve likely run into this error on any visuals that contained the renamed column. And when you hit that “See Details”, it will tell you the column that you simply renamed is no longer available for your visual.

So how do we get around that?

Read on to see how.


Renaming Multiple Columns at Once in Power BI

Matt Allington wants to change a bunch of column names at once with Power BI:

This is not the first time I have shared this concept.  In my previous article I showed how it is possible to add a prefix to every column in a table. This article today is slightly different. Today I am removing text from multiple columns all at once using some M code. The trick you need to learn to solve this problem is “how to create a list of lists”.

Click through for a video to see it in action.
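
Matt's solution is written in M, where Table.RenameColumns accepts a list of {old, new} pairs. To show the shape of that "list of lists" idea outside of Power Query, here is a minimal Python sketch. The column names and the "Sales." text being removed are made up for illustration and are not taken from his article.

```python
def build_rename_pairs(columns, text_to_remove):
    """Build a list of [old, new] pairs for columns containing the text."""
    pairs = []
    for old in columns:
        new = old.replace(text_to_remove, "")
        if new != old:
            # Only columns that actually change need a pair.
            pairs.append([old, new])
    return pairs


def rename_columns(columns, pairs):
    """Apply the [old, new] pairs; untouched columns keep their names."""
    lookup = {old: new for old, new in pairs}
    return [lookup.get(name, name) for name in columns]


# Hypothetical column names, for illustration only.
columns = ["Sales.Amount", "Sales.Quantity", "OrderDate"]
pairs = build_rename_pairs(columns, "Sales.")
renamed = rename_columns(columns, pairs)

print(pairs)
print(renamed)
```

The useful part is that the pairs are generated from the column list rather than typed by hand, so the same step keeps working when columns are added later.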


Naming Azure Purview Scans

Daniel Janik treats Azure Purview scans like pets rather than cattle:

If you’ve ever been a DBA and seen the mess that you get with SQL Agent Jobs without a clean naming standard for your job schedules and job names then you’ll appreciate this tip.

If you haven’t been a DBA that’s OK too. Years ago I came up with my own naming standard for SQL Agent artifacts and I’ve always felt better when the messy room was clean. No Really! That’s exactly what this is like. A messy room where you are pretty sure you put the item you’re looking for in but you just can’t seem to find it until you clean 95% of the mess and then you’re so exhausted that you don’t have time to do what you wanted to in the first place. Ever been there?

Read on to see what the scans look like by default, as well as some thoughts Daniel has regarding a better way to do things.


Unique Resource Names and Azure

Meagan Longoria gives us a warning:

Each resource type in Azure has a naming scope within which the resource name must be unique. For PaaS resources such as Azure SQL Server (server for Azure SQL DB) and Azure Data Factory, the name must be globally unique within the resource type. This means that you can’t have two data factories with the same name, but you can have a data factory and a SQL server with the same name. Virtual machine names must be unique within the resource group. Azure Storage accounts must be globally unique. Azure SQL Databases should be unique within the server.

Since Azure allows you to create a data factory and a SQL server with the same resource name, you may think this is fine. But you may want to avoid this, especially if you plan on using system-defined managed identities or using Azure PowerShell/CLI. And if you aren’t planning on using these things, you might want to reconsider.

Click through for a demonstration of how you might get into trouble with this.
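
To make the naming scopes concrete, here is a rough Python sketch that models only the rules in the quoted passage. The resource type keys, the resource names, and both helper functions are hypothetical. This is not an Azure API and not a complete list of Azure naming rules.

```python
from collections import defaultdict

# Scopes paraphrased from the quoted passage; not an official or complete list.
NAMING_SCOPE = {
    "data_factory": "global",
    "sql_server": "global",
    "storage_account": "global",
    "virtual_machine": "resource_group",
    "sql_database": "server",
}


def find_conflicts(resources):
    """Return (type, container, name) keys that repeat within their scope."""
    seen = set()
    conflicts = []
    for resource_type, name, parent in resources:
        # Globally scoped names ignore the parent container entirely.
        container = None if NAMING_SCOPE[resource_type] == "global" else parent
        key = (resource_type, container, name)
        if key in seen:
            conflicts.append(key)
        seen.add(key)
    return conflicts


def names_shared_across_types(resources):
    """Return names used by more than one resource type (allowed, but risky)."""
    types_by_name = defaultdict(set)
    for resource_type, name, _parent in resources:
        types_by_name[name].add(resource_type)
    return {
        name: sorted(types)
        for name, types in types_by_name.items()
        if len(types) > 1
    }


# Hypothetical inventory: (resource type, name, parent container).
resources = [
    ("data_factory", "contoso-etl", None),
    ("sql_server", "contoso-etl", None),
    ("virtual_machine", "vm01", "rg-a"),
    ("virtual_machine", "vm01", "rg-b"),
    ("data_factory", "contoso-etl", None),
]

conflicts = find_conflicts(resources)
shared = names_shared_across_types(resources)
print(conflicts)
print(shared)
```

The second function is the one that matches Meagan's warning: a data factory and a SQL server sharing a name is not a conflict as far as Azure is concerned, but it is worth flagging before it becomes ambiguous somewhere else.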


Mapping New Column Names with Power Query

Soheil Bakhshi reminds me of DB2:

So, here is my scenario. I received about 10 files, including 15 tables. Some tables are quite small, so I didn’t bother. But some of them are really wide like having between 150 to 208 columns. Nice!

Looking at the column names, they cannot be more difficult to read than they are, and I have multiple tables like that. So I have to rename those columns to something more readable, more on this side of the story later.

Fortunately, there’s a way to fix this; click through for that way.
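
Soheil's fix is in Power Query, but the general idea of a mapping table from cryptic names to readable ones can be sketched in a few lines of Python. The column names and the mapping below are invented for illustration, and the helper is hypothetical rather than anything from his post.

```python
def apply_name_mapping(columns, mapping):
    """Rename columns via a lookup table and report any that have no mapping."""
    renamed = [mapping.get(name, name) for name in columns]
    unmapped = [name for name in columns if name not in mapping]
    return renamed, unmapped


# Hypothetical source columns and mapping table.
columns = ["CUSTNO", "ORDDT", "XQTY1"]
mapping = {
    "CUSTNO": "Customer Number",
    "ORDDT": "Order Date",
}

renamed, unmapped = apply_name_mapping(columns, mapping)
print(renamed)
print(unmapped)
```

Returning the unmapped names separately matters on a table with 150 or more columns, since it tells you which entries are still missing from the mapping instead of silently leaving them cryptic.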


Renaming Cached DataFrames in Spark

Landon Robinson works around an annoyance:

But DataFrames have not been given the same, clear route to convenient renaming of cached data. It has, however, been attempted and requested by the community:

https://forums.databricks.com/questions/6525/how-to-setname-on-a-dataframe.html
https://issues.apache.org/jira/browse/SPARK-8480

However, with the below approach, you can start naming your DataFrames all you want. It’s very handy.

Read on to see the solution in action.
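
I won't reproduce Landon's code here, but one common way to get a named entry for cached DataFrame data is to register the DataFrame as a temporary view and cache that view by name. The sketch below uses PySpark's createOrReplaceTempView, spark.catalog.cacheTable, and spark.table. The helper name cache_with_name and the view name in the usage comment are my own, and this may or may not match the approach in his post.

```python
def cache_with_name(spark, df, name):
    """Cache a DataFrame under a chosen name by way of a temp view.

    Assumes `spark` is a SparkSession and `df` is a DataFrame.
    """
    # Give the DataFrame a name the catalog knows about.
    df.createOrReplaceTempView(name)
    # Cache the view by name; as far as I know, this name is what
    # shows up for the cached data in the Spark UI.
    spark.catalog.cacheTable(name)
    # Hand back a DataFrame that reads from the cached view.
    return spark.table(name)


# Usage, assuming an existing SparkSession and DataFrame:
#   orders = cache_with_name(spark, orders_df, "daily_orders")
#   orders.count()  # caching is lazy; an action materializes it
```

When the data is no longer needed, spark.catalog.uncacheTable with the same name releases it.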


Good Practices for Naming Things in Power BI

Chris Webb shares some thoughts on the power of names:

What’s wrong with this picture? Look at the names:

– The tables and columns have the same names that they had in the data source, in this case a SQL Server database. Note the table name prefixes of “Dim” for dimensions and “Fact” for fact tables.
– The column and measure names either don’t have spaces or use underscores instead of spaces.
– What on earth does the measure name _PxSysF even mean?

Chris mentions that some of the ideas in the post may be controversial, but to be honest, I don’t think any of them are. The important thing here is to keep your audience in mind.
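
As a rough illustration of the first two bullet points, here is a small Python sketch that turns source-style names into something friendlier. The “Dim” and “Fact” prefixes come from the quoted list, while the function and the sample names are hypothetical. This is not Chris's method, and it is no substitute for choosing names with your audience in mind.

```python
import re


def friendly_name(name, prefixes=("Dim", "Fact")):
    """Strip warehouse-style prefixes and add spaces between words."""
    for prefix in prefixes:
        rest = name[len(prefix):]
        # Only strip when the prefix is followed by a capitalized word,
        # so a name like "Dimension" is left alone.
        if name.startswith(prefix) and rest[:1].isupper():
            name = rest
            break
    # Underscores become spaces.
    name = name.replace("_", " ")
    # Insert a space where a lowercase letter or digit meets an uppercase letter.
    name = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", name)
    # Collapse any repeated whitespace.
    return " ".join(name.split())


# Hypothetical source names, for illustration only.
for source_name in ["DimProduct", "FactInternetSales", "Sales_Amount"]:
    print(source_name, "->", friendly_name(source_name))
```

A helper like this covers the mechanical cases. A measure called _PxSysF still needs a human to decide what it should be called.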
