Press "Enter" to skip to content

Day: June 4, 2024

Fuzzy Search and Levenshtein Distance

Hoen Nguyen explains a couple of terms:

In the world of search engines and data retrieval, achieving high accuracy and relevance in the results is a constant challenge. One of the techniques used to improve search results is Fuzzy Search.

This blog post will delve into the concept of fuzzy search, its implementation using the Levenshtein Distance, and how to test its effectiveness.

Levenshtein distance is also one of the techniques spell checkers use, comparing word not in its dictionary to other words within a certain distance.

Comments closed

Building a Custom Color Palette for ggplot2

Tomaz Kastrun pulls out the color swatches:

A simple, yet effective way to set your colour palette in R using ggplot library.

Click through for the demonstration. Tomaz keeps the text very light in this post, so I’ll do a little vamping of my own. Creating a custom palette is neat, but do make sure that your custom palette works for users with color vision deficiency (CVD). Taking Tomaz’s bar chart into Coblis (an amazing tool I continue to use quite regularly), here’s what it looks like for people with protanopia—that is, no red cones in their eyes:

It’s not awful, particularly because Tomaz changed the fill but not the border color, so you get a funky striation effect.

But the real kicker is if you switch to the monochromatic option in Coblis.

Granted, I know of exactly one person with monochromacy, so if you want to be fair, this isn’t one I’d check for on a webpage. But the large majority of technical books have grayscale images because it saves money on printing, so if this were your sweet-looking color scheme and you’re adding the image into a book, readers would need to focus particularly hard on the bars to figure anything out.

Comments closed

The CONTROL SERVER Permission in SQL Server

Jeff Iannucci has a series on security in SQL Server:

The CONTROL SERVER permission has been around since SQL Server 2005, and is the most powerful permission granted as part of membership in the sysadmin role. What many folks don’t realize is that this permission can be granted to a login or group without including them in the sysadmin role. And that can become problematic if, as an administrator, you aren’t aware of logins or groups that don’t have this permission.

Jeff points out how CONTROL SERVER isn’t quite the same as sysadmin, but why you should still treat it that way.

Comments closed

Better Reports with the 3-30-300 Rule

Kurt Buhler comes up with a good rule:

Effective reports and dashboards should enable users to quickly answer their data questions so that they can focus on their primary business tasks and responsibilities. To help you design effective reports, we introduce the 3-30-300 rule for information design. The 3-30-300 rule is a straightforward and practical approach for you to produce efficient report layouts by structuring reports in a functionally hierarchical way. This rule concisely paraphrases the visual information-seeking mantra from Ben Schneidermann (1996). To make it easier to understand for Power BI developers, we express this rule with respect to approximately how long it should take users to get certain information or perform certain tasks in a report.

It’s a clever mnemonic and Kurt does a good job of showing how you could implement it.

Comments closed

Granting Developers Query Store Access

Josephine Bush wants to allow developers to solve their own problems:

Let’s have devs look at their own query performance. Yes, please, sign me up for that! Sometimes, it’s hard for me to know the best course of action, especially when they are using Entity Framework, but it’s a great start for them to use Query Store to see how impactful their queries are. I’m happy to help them decipher results if they are confused, but I really like performance tuning being a team sport. I was giving them a list of queries with, for example, high CPU usage, but it was even better when they could go in there and use Query Store for themselves on a regular basis.

The actual granting of rights takes a couple lines of T-SQL, and Josephine also provides an overview of Query Store along the way. Erik Darling’s sp_QuickieStore plays a prominent role in this post and I agree that it’s extremely helpful. I’d also be remiss not bringing up QDS Toolbox as well, as it’s a rather good solution in its own right.

Comments closed

Dynamic Cursors in SQL Server

Hugo Kornelis continues a series on cursors:

We’re already at part 31 of the plansplaining series. And this is also the third part in my discussion of execution plans for cursors. After explaining the basics, and after diving into static cursors, it is now time to investigate dynamic cursors. As a quick reminder, recall that a static cursor presents data as it was when the cursor was opened (and does so by simply saving a snapshot of that data in tempdb), whereas a dynamic cursor is supposed to see all changes that are committed while the cursor is open. Let’s see how this change in semantics affects the execution plan.

Read on as Hugo gives it the college try and also admits he might be missing something in the explanation.

Comments closed

GRANT Operations in Postgres

Shaun Thomas takes us through GRANT operations and roles in Postgres:

Not every database-backed application needs to be locked down like Fort Knox. Sometimes there are even roles that leverage blanket access to large swathes of available data, if not every table, simply for auditing or monitoring purposes. Normally this would require quite a bit of preparation or ongoing privilege management, but Postgres came up with a unique solution starting with version 14: predefined roles.

This topic comes up relatively frequently in Postgres chats like Discord, Slack, and IRC. Usually it’s along the lines of: “We have a low security application but have separated read and write access from the table owner to avoid accidents. That user should still be able to read or write any table in the database. What do I do?”

This is an area where Postgres and SQL Server are using the same terms but aren’t quite speaking the same language.

Comments closed

Power BI Command Memory Limit

Chris Webb is overdrawn at the memory bank:

Continuing my series on Power BI model memory errors (see part 1 and part 2), in this post I will look at the Command Memory Limit which restricts the amount of memory that XMLA commands like Create, Alter and most importantly Refresh and can use.

If you’ve ever been told that your semantic model should consume less than half the amount of memory available to it because memory consumption can double during a full refresh, then that is because of the Command Memory Limit. 

Read on to learn more about the Command Memory Limit and why this advice exists.

Comments closed