Month: September 2023

TINYINT Casts in Spark SQL vs T-SQL

Bill Fellows runs into an interesting oddity:

Yet another thing that has bitten me working in SparkSQL in Databricks—this time it’s data types.

In SQL Server, a tinyint ranges from 0 to 255, while in SparkSQL it ranges from -128 to 127, but both of them allow for 256 total values. If you attempt to cast a value that doesn’t fit in that range, you’re going to raise an error.

SQL Server’s TINYINT data type is an unsigned one-byte number, whereas TINYINT in Spark SQL is a signed one-byte number. But that’s not the biggest difference Bill finds, so check out the post to learn more.
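To make the range difference concrete, here is a minimal Python sketch of the two one-byte ranges described above. It only checks whether a value fits; it does not reproduce what either engine does when a cast falls out of range. The names `TINYINT_RANGES` and `fits_tinyint` are hypothetical and exist only for this illustration.

```python
# Hypothetical illustration: one-byte TINYINT ranges in each dialect.
TINYINT_RANGES = {
    "tsql": (0, 255),      # SQL Server: unsigned one-byte number
    "spark": (-128, 127),  # Spark SQL: signed one-byte number
}


def fits_tinyint(value: int, dialect: str) -> bool:
    """Return True if value lies inside the dialect's TINYINT range."""
    low, high = TINYINT_RANGES[dialect]
    return low <= value <= high


# Both ranges hold exactly 256 distinct values, but they only overlap on 0..127.
for candidate in (-1, 127, 128, 255, 256):
    print(candidate, fits_tinyint(candidate, "tsql"), fits_tinyint(candidate, "spark"))
```

A value such as 200 fits a T-SQL TINYINT but not a Spark SQL one, and -1 is the reverse, so a column typed TINYINT on both sides is not automatically portable.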

Comments closed

Controlling Power BI Chart Ranges with DAX

Marco Russo and Alberto Ferrari control the horizontal, Marco Russo and Alberto Ferrari control the vertical:

DAX is a powerful tool in the hands of a Power BI developer. Using simple DAX formulas, you can not only compute interesting metrics but also customize the behavior of Power BI visuals. In this article, we use DAX to control the range of charts to obtain more coherent visualizations.

Read on to see how.

Documenting Power BI Workspaces with Fabric Notebooks

Prathy Kamasani shares a use case for notebooks in Microsoft Fabric:

If you are a consultant like me, you know how hard it can be to access Power BI Admin API or Service Principal. Sometimes, you need to see all the workspaces you have permission for and what’s inside them. Well, I found with MS Fabric, we can use notebooks and achieve it with a few steps:

Read on for an enumeration of those four steps, as well as detailed instructions for each.

Sending Azure Cost Management Data to Azure Data Explorer

Brad Watts writes out some cost data:

Understanding your Azure Spend is one of the most important things you do as an Azure customer. Azure Cost Management is built into the platform to provide you insights. But we live in a world of data and looking at the Azure Cost Management data in a silo may not meet your organization’s needs. In those situations, we can solve that need by putting your Cost Management data into an analytical platform like Azure Data Explorer or Microsoft Fabric KQL Database. Here we can bring in or join additional data that’s useful, run ad-hoc queries and build visualization tying it all together.

Using the below repository, you’ll be able to utilize Azure Cost Management exports to set up an automated process that ingests the cost data into ADX or Fabric KQL Database.

There are several steps involved, but as Brad points out, you can do this either with Microsoft Fabric or with classic Azure Data Factory + Azure Data Explorer. I’d also throw in Azure Synapse Analytics, but that’s not as in vogue anymore.

Werner Zirkel also has a great comment showing how you can cut out most of the steps with Event Grid.

Thoughts on NOLOCK

Erik Darling has some thoughts:

And generally, the more NOLOCK hints I see, the more money I know I’m going to make.

It shows me four things right off the bat:

  • The developers need a lot of training
  • The code needs a lot of tuning
  • The indexes need a lot of adjusting
  • There are probably some serious bugs in the software

Perhaps the only other thing that signals just how badly someone needs a lot of help is hearing “we’re an Entity Framework only shop”.

Cha-ching.

I have to admit, even being a consultant doesn’t soften the pain of walking into a place and seeing people use NOLOCK like they picked up a fresh pallet of it from Costco and need to use it up before it goes bad.

Statistical Tests in R

Adrian Tam tries out a couple of tests:

R as a data analytics platform is expected to have a lot of support for various statistical tests. In this post, you are going to see how you can run statistical tests using the built-in functions in R. Specifically, you are going to learn:

  • What is t-test and how to do it in R
  • What is F-test and how to do it in R

Statistical testing is one of the things that R does better than any other language. R has support for an enormous number of statistical functions, either built into the base language or available as packages.
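For readers who want to see what these two tests compute, here is a minimal sketch of the underlying statistics in plain Python. It is not the linked post's R code. In R, the built-in `t.test` and `var.test` functions report these statistics together with p-values; that the post uses exactly those functions is an assumption. The function names and sample data below are made up for illustration.

```python
from math import sqrt
from statistics import mean, variance


def welch_t_statistic(a, b):
    """Welch's two-sample t statistic (does not assume equal variances)."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


def f_statistic(a, b):
    """Ratio of sample variances, the statistic behind an F-test of equal variances."""
    return variance(a) / variance(b)


# Made-up sample data, for illustration only.
group_a = [1.0, 2.0, 3.0, 4.0, 5.0]
group_b = [2.0, 4.0, 6.0, 8.0, 10.0]

t_stat = welch_t_statistic(group_a, group_b)
f_stat = f_statistic(group_a, group_b)
print(f"t = {t_stat:.3f}, F = {f_stat:.3f}")
```

Turning either statistic into a p-value requires the corresponding t or F distribution, which is the part R's built-in functions handle for you.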

Plotting a Subset of Data in R

Steven Sanderson doesn’t need all of those data points:

Data visualization is a powerful tool for gaining insights from your data. In R, you have a plethora of libraries and functions at your disposal to create stunning and informative plots. One common task is to plot a subset of your data, which allows you to focus on specific aspects or trends within your dataset. In this blog post, we’ll explore various techniques to plot subsets of data in R, and I’ll explain each step in simple terms. Don’t worry if you’re new to R – by the end of this post, you’ll be equipped to create customized plots with ease!

Click through for several techniques for subsetting data, as well as reasons why you might want to do it.

Microsoft Fabric Presentations

Wolfgang Strasser opens a vault:

Are you searching for Microsoft Fabric Presentations? You want to learn more about the new unified analytics solution?

There are plenty of presentations available around the internet – some only as recordings, some as PDFs only.

BUT – last week, I found a (now no longer) hidden gem of Microsoft Fabric content on the internet – the Microsoft Fabric Readiness repository

Click through for the link to those presentations.

A SQL Server Security Checklist

Hemantgiri Goswami has a list and checks it twice:

Last week, in my previous article on How to Secure SQL Server, I discussed a few points that can help you secure SQL Server. In this post, as promised, I will share a SQL Server Security Checklist that I have used for many of my clients to help them achieve PCI compliance.

As you are aware, PCI is a global payment security standards council. Following their standards helps an organization achieve a compliance certificate stating that all the card data that is processed, stored, and transmitted is maintained in a secure environment.

The good news is that you can use the dbachecks suite to check many of these items.

PGSQL Phriday 012 Roundup

Ryan Booz goes beyond a short summary:

I think, due to a number of people attending a PostgreSQL conference during the week blogs would have been written and the ongoing runup to a pending release, participation this month was lower than normal. But the blog posts (and audio podcast) that we did receive were top-notch and I’m genuinely thrilled to see people make the effort. Keep an eye on these blogs for other content, because the quality of their work is excellent and you’ll surely learn new things with anything new they produce!

Read on for Ryan’s review of three blog posts and one podcast.
