Press "Enter" to skip to content

Month: December 2021

Matching Supply with Demand using Batch Mode

Joe Obbish has a solution:

Itzik Ben-Gan posted an interesting T-SQL challenge on SQLPerformance.com. I’m writing up my solution in my own blog post because I have a lot to say and getting code formatting right can be tricky in blog post comments. For reference, my test machine is using SQL Server 2019 CU14 with an i7-9700K CPU @ 3.60 GHz. The baseline cursor solution completes in 8465 ms on my machine.

Click through for a bit of formal logic and a lot of tuning.
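
If you want a feel for why this is an interesting problem, here is a minimal sketch of one interval-based way to attack it. The schema (dbo.Auctions with ID, Code, and Quantity columns) is an assumption based on the challenge description, not Joe’s actual code:

-- A sketch of the interval idea, not Joe's solution: give each row a
-- running-total interval, then match demand and supply where intervals overlap.
-- Assumed schema: dbo.Auctions(ID INT, Code CHAR(1), Quantity INT),
-- where Code is 'D' for demand and 'S' for supply.
WITH Positioned AS
(
    SELECT ID, Code,
           SUM(Quantity) OVER (PARTITION BY Code ORDER BY ID
                               ROWS UNBOUNDED PRECEDING) - Quantity AS StartPos,
           SUM(Quantity) OVER (PARTITION BY Code ORDER BY ID
                               ROWS UNBOUNDED PRECEDING) AS EndPos
    FROM dbo.Auctions
)
SELECT d.ID AS DemandID,
       s.ID AS SupplyID,
       -- the overlap of the two running-total intervals is the matched quantity
       CASE WHEN d.EndPos < s.EndPos THEN d.EndPos ELSE s.EndPos END
     - CASE WHEN d.StartPos > s.StartPos THEN d.StartPos ELSE s.StartPos END AS TradeQuantity
FROM Positioned AS d
JOIN Positioned AS s
  ON s.StartPos < d.EndPos
 AND d.StartPos < s.EndPos
WHERE d.Code = 'D'
  AND s.Code = 'S'
ORDER BY DemandID, SupplyID;

Getting something like this to run efficiently with batch mode is where the real work happens, and that’s the meat of Joe’s post.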

Sparklines and Filter Context

Ed Hansberry riffs on the new sparkline functionality in Power BI:

This isn’t a tutorial on how to create sparklines, as Microsoft has you covered there. What I want to discuss, though, is how sparklines are impacted by the filter context in a visual. First, though, you need to enable this feature. It is in preview mode at launch, so go to File, Options and Settings, Options, and look for the Preview section. Click Sparklines, then close and restart Power BI Desktop.

In the visual above, I added the “Month” field to the Sparkline settings and that created an additional filter on the visual. Before I added the Sparkline, Year was the only filter affecting the numbers. By adding the month to the sparkline, shown below, it breaks out the Total Net Sales figure by month.

But what if I don’t want the sparkline to be the full range of data? For example, I just want the sparkline to show the last 3 months of the year, so 3 data points, not 12.

Click through to see how it’d look as a measure and what you need to do to make sparklines look right.

Copy Logins between SQL Server Instances

David Alcock wants to move a login:

Migrating SQL databases is fun, depending on your definition of fun, that is. The process can involve having to move things such as login details that have been around for so long that nobody has a clue what they are anymore.

With domain accounts, that’s pretty straightforward: the passwords are managed in Active Directory and not held in SQL Server, and it’s just a case of recreating the account on the new instance.

SQL authentication is different, and migrating an account as-is means you also have to recreate the password as-is, which could be difficult if you don’t know what the password should be. It’s worth saying at this point that the preference should always be to use domain accounts: they’re more secure and much more manageable, and migrations are ideal opportunities to refactor things to be better. But for the sake of this article, let’s proceed with the scenario of recreating a SQL authentication login with an unknown password, and we’ll need to get creative.

And as you’d expect, dbatools makes a dramatic appearance.
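
If you’re curious what the underlying T-SQL trick looks like, the usual approach is to script the login out with its password hash and SID intact. A minimal sketch, assuming a hypothetical login named app_login (run the SELECT on the source instance, then execute the statement it generates on the target):

-- Script out a SQL login without knowing its password. The password
-- travels as a hash (the HASHED keyword), and copying the SID keeps
-- database users on the target mapped to the login correctly.
SELECT 'CREATE LOGIN ' + QUOTENAME(name)
     + ' WITH PASSWORD = '
     + CONVERT(VARCHAR(514), CONVERT(VARBINARY(256), LOGINPROPERTY(name, 'PasswordHash')), 1)
     + ' HASHED, SID = '
     + CONVERT(VARCHAR(200), sid, 1)
     + ';' AS CreateLoginStatement
FROM sys.sql_logins
WHERE name = N'app_login';

dbatools bundles this sort of scripting, along with permissions and server roles, into cmdlets like Copy-DbaLogin.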

Troubleshooting External Command Timeouts in Power BI

Chris Webb continues a series on troubleshooting timeouts:

In the first post in this series I showed how any Power BI dataset refresh started via the Power BI portal or API is limited to 2 hours in Shared capacity and 5 hours in Premium capacity, and how you could work around that by running a refresh via Premium’s XMLA endpoint feature. In the second post in this series I showed how some M functions allow you to set timeouts. However, even if you initiate a refresh via the XMLA endpoint you may still get a timeout error and in this post I’ll discuss another reason why: the External Command Timeout.

Read on to see what the external command timeout is and when it might strike.

Data Mesh and Event Streaming

Adam Bellemare takes us through an example of implementing data mesh ideas in Confluent Cloud:

Data mesh. This oft-talked-about architecture has no shortage of blog posts, conference talks, podcasts, and discussions. One thing that you may have found lacking is a concrete guide on precisely how to get started building your own data mesh implementation. We have you covered. In this blog post, we’ll show you how to build a data mesh using event streams powered by Confluent Cloud, highlighting our design decisions and the key benefits and challenges you’ll need to consider along the way. In fact, we’ll go one better: we’ve built a data mesh prototype for you to check out on your own to see what this would look like in action, or fork to bootstrap a data mesh for your own organization.

Read on for the example.

Bullet Charts

Amy Esselman is number one with a bullet:

Our November challenge was inspired by a recent workshop question about bullet graphs. We invited the community to find some data of interest and build their own bullet graph. More than three dozen members tackled this less familiar chart type in a variety of tools, including Excel, Tableau, Python, Illustrator, Datawrapper, PowerBI, and R.

For many challenge participants, this was their first attempt at creating a bullet graph. They discovered, as you may find yourself, that even though bullet graphs are not a typical chart type, they can be powerful in the right situation. 

Click through for examples of bullet charts in action.

Save Money with Spot Instances

I have a post on using spot instances in the cloud:

Spot instances are an idea that came out of Amazon Web Services. Specifically, the people at AWS realized that they had excess capacity on servers, and in the cloud, excess capacity is typically a bad thing, as you’re paying for resources not in use. Going back to basic economics, when you have excess capacity, you have a surplus. There are two ways to deal with a surplus: decrease supply (shift the supply curve back) or decrease prices (move down the demand curve).

There are some complicating factors here which make it tough for AWS or other cloud vendors to do either.

Of course I wasn’t going to let a discussion of spot instances go without hitting a bit of economic theory. Just be happy I didn’t break out the supply and demand curve visuals…

Understanding SQL Server I/O Size

Anthony Nocentino dives into a topic:

The Master File Table (MFT) is the data structure that describes files and directories on NTFS. In Figure 1, you can see an MFT record has several sections describing the metadata about the file and pointers to blocks that make up the file. A block, also referred to as a cluster in Windows, is an abstraction over one or more physical structures (sectors or pages, depending on the media) presented by the underlying disk. A block/cluster is also the atomic allocation unit of a file system and has a configurable size. On NTFS, this is referred to as the NTFS Allocation Unit Size and is a configurable attribute of the file system. By default, it is 4KB and can be as large as 2MB. Since a block is a unit of allocation, if a file is between 1 byte and the file system’s allocation unit size, it will take up exactly one block/cluster on the file system. As the file grows, more blocks/clusters are allocated to represent the file. The MFT data structure tracks which blocks make up a file. The block allocator of the file system will try to ensure blocks are physically adjacent on the disk and groups them together in runs.

A 4KB NTFS Allocation Unit size is considered best practice on general-purpose file systems. And 64KB is considered best practice for SQL Server…but why? Let’s keep digging…
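
To put some arithmetic on the quoted passage: with the default 4KB allocation unit, a 10KB file occupies three clusters (12KB on disk), while a 64KB allocation unit would hold the whole file in one cluster. SQL Server, for its part, organizes data files into 8KB pages and 64KB extents (eight pages each), so a 64KB allocation unit lines one file system cluster up with exactly one extent. Keep that alignment in mind as Anthony digs into the answer.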

I was sitting at a lunch table with Anthony when it all clicked and that was fun to see.

Microsoft.DataFactory and Storage Event Triggers in Synapse

Cathrine Wilhelmsen troubleshoots an Azure issue:

I ran into an issue today while trying to publish a storage event trigger in Azure Synapse Analytics. After publishing, I got error messages that said “failed to subscribe” and “failed to activate”. The storage event trigger had been published, but it wouldn’t start. Help!

Click through for some resources on documentation, a few things which didn’t work, and what finally resolved the issue.

Building a Data Mesh in Azure

Paul Andrew starts a new series:

The concepts and principles of a data mesh architecture have been around for a while now, and I’ve yet to see anyone else apply/deliver such a solution in Azure. I’m wondering if the concepts are so abstract that it’s hard to translate the principles into real-world requirements, and maybe even harder to think about what technology you might actually need to deploy in your Azure environment.

Given this context (and certainly no fear of going first with an idea and being wrong), here’s what I think we could do to build a data mesh architecture in the Microsoft cloud platform – Azure.

Click through for Paul’s take on the first data mesh principle.
