Press "Enter" to skip to content

Day: January 16, 2020

Solving the Spelling Bee Honeycomb Puzzle

David Robinson has fun with puzzle-solving:

Solving this puzzle in R is interesting enough, but it’s particularly challenging to do so in a computationally efficient way. As much as I love the tidyverse, this, like the “lost boarding pass” puzzle and Emily Robinson’s evaluation of the best Pokémon team, serves as a great example of using R’s matrix operations to work efficiently with data.

I’ve done a lot of puzzles recently, and I realized that showing the end result isn’t a representation of my thought process. I don’t show all the dead ends and bugs, or explain why I ended up choosing a particular path. So in the same spirit as my Tidy Tuesday screencasts, I recorded myself solving this puzzle (though not the process of turning it into a blog post).

Most of the post is analysis around the problem, but you do get a viable solution as well.

Comments closed

Cluster-Based Image Analysis and Reduction

Sebastian Sauer takes an image and reduces it to a group of colors:

This post is a remake of this casestudy:

brought to you by Karsten Lübke.

The main purpose is to replace the base R command that Karsten used with a more tidyverse-friendly style. I think that’s easier (for me).

We will compute a cluster analysis to find the typical RGB color per cluster.

Click through for quite a bit of R code and a couple interesting turns.

Comments closed

Problems Installing SQL Server from a Mounted ISO

Randolph West recommends against installing SQL Server from a mounted ISO file:

In the old days, software used to be distributed on physical media like diskette, CD, and finally DVD. Nowadays we download and install software as standalone files. Even enterprise software like SQL Server is available for download in this way, usually distributed as an ISO file. Once you’ve downloaded it, you can mount that ISO file as a virtual drive and access it as though it was on physical media.

Or you can do the right and proper thing, and extract the ISO file to a network drive or local storage first, using a tool like 7-Zip.

I don’t think I’ve ever seen the problems Randolph refers to, though I’ll readily admit that Randolph has quite a few more installations done than I.

Comments closed

Understanding Implicit Conversions

Jared Poche takes us through the concept of implicit conversions:

A quick search will tell you that implicit conversions are pretty awful for performance, and in particular drive CPU usage. That’s not news. There is an aspect of this I think a lot of engineers don’t understand; why does it cause performance issues?

An implicit conversion occurs when you try to compare between or assign data across two different data types, without converting one yourself. If you used a CAST or CONVERT, it would be an explicit conversion. When SQL Server does it for you, it’s an implicit conversion, and these can have a real impact on your execution plans. Not all combinations of types can be converted implicitly, for a full list look here.

What’s interesting in this is that there were some cases where Jared expected implicit conversion and the rules indicate that there should be implicit conversion, but the database optimizer saw through his ruse.

Comments closed

Getting Folder Names in Power BI

Reza Rad shows us something new in Power Query:

There are times that you need to get the list of FOLDERS and not just files. Of course folder itself (without considering files in it), doesn’t contain data to be used for a report. However, sometimes, even the folder name might contain some useful information. The Get Data From Folder option in Power BI will just give you a list of files. There is a little trick that can help you to get a list of folders. Let me show you how.

There’s a fair bit to it, so click through and read the whole thing.

Comments closed

Release Flow Branching and Database DevOps

Kendra Little explains why the Azure DevOps Release Flow model can work well for database activity:

But how do you use branches? It’s helpful to pick a strategy. There are many fine Git branching strategies out there, things like GitFlow and GitHub Flow and more — enough that it’s overwhelming to learn about these when you are just starting out.

The strategy that I recommend for folks who are starting out with database DevOps and Git is the Azure DevOps team Release Flow model with dedicated development databases. (Why dedicated development databases? Read more here.)

Read on to learn why.

Comments closed

Reporting Queries and User-Defined Functions

Erik Darling tells a story of two things which do not get along very well:

The OLTP part was working fine, but there was a reporting element that was dog slow, and would cause all sorts of problems on the server.

When we got into things, I noticed something rather funny: All of their reporting queries had very high estimated costs, and all the plans were totally serial.

The problem came down to two functions that were used in the OLTP portion, which were reused in the reporting portion.

Read on to understand why SQL Server 2019 and its function inlining capability would not have helped here.

Comments closed

Refreshing Power BI from Your Outlook Calendar

Chris Webb has a nice use for Power Automate and Outlook:

The ideal way to view when multiple events are scheduled is a calendar and we’ve got great calendar functionality in Outlook. What if you could schedule refresh of your datasets from a calendar in Outlook? It turns out to be easier than you might think! Here’s how.

Read the whole thing, including Chris’s warning not to put it into production. You wouldn’t want the person with all of those calendar entries to leave your company and have things suddenly break, after all.

Comments closed