Press "Enter" to skip to content

Things to Ignore: SQL Server I/O Affinity

Sean Gallardy recommends you not touch this:

I honestly have no idea how or why people tend to use this configuration option, if you know please drop me a line and let me know or put a comment below, I’m genuinely curious. When I ask people why this is set when I see/find it, I normally get a “well that’s how the last server was” or “I don’t know”. Awesome. I always like to just change random settings for no particular reason. Some days you wake up decide you’re going to go change a bunch of settings on your computer because why not, it’ll be fun.

There probably is a reason, though Sean’s speculation of “so the benchmark scores for SQL Server testing would be higher” is just as likely the cause as anything else. My fallback alternative is “one very large customer threw a lot of money Microsoft’s way to add a setting that works for them but nobody else.” There are a couple of those in the product, too.
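
If you ever want to verify what a server is actually doing here, sp_configure will tell you. The default of 0 for both options means SQL Server manages I/O scheduling on its own, which is what you want:

```sql
-- Both options are advanced, so surface them first.
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;

-- On almost every server, config_value and run_value should both be 0,
-- the default. Anything else means somebody changed it.
EXEC sp_configure 'affinity I/O mask';
EXEC sp_configure 'affinity64 I/O mask';
```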

Differentiating Physical and Logical Reads in SQL Server

Jose Manuel Jurado Diaz explains a concept:

In the realm of Azure SQL Database, query performance is a paramount concern for database administrators and developers alike. A critical aspect of this performance is understanding how SQL Server interacts with data, particularly through physical and logical reads. This article delves into these two fundamental concepts, providing insights into their impact on database performance and a practical lab to observe these metrics in action.

Read on for the difference, as well as a demonstration. With slow disks and insufficient RAM, it’s really important to know this difference. But as you add more RAM and move to storage like NVMe, I’d argue that it’s less of an issue. The additional RAM, in particular, matters because frequently accessed data will remain in the buffer pool for longer, so you’re more likely to see logical reads in action. Of course, poor indexing and bad decisions can ruin that idea, so don’t do that, okay?
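
If you want to watch the difference yourself, SET STATISTICS IO is the easiest tool. Here’s a minimal sketch, assuming a hypothetical dbo.SomeTable and a test instance where clearing the buffer pool is safe:

```sql
SET STATISTICS IO ON;

-- Empty the buffer pool so the next query must hit disk.
-- Requires elevated permissions; never run this against production.
DBCC DROPCLEANBUFFERS;

-- First run: the Messages tab reports physical reads (and read-ahead reads).
SELECT COUNT(*) FROM dbo.SomeTable;

-- Second run: the pages are now cached, so you see only logical reads.
SELECT COUNT(*) FROM dbo.SomeTable;
```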

Simulating a Bivariate Normal Distribution in R

Steven Sanderson isn’t content with a univariate normal distribution:

Imagine two variables, like height and weight, that exhibit a joint distribution. The bivariate normal distribution captures the relationship between these variables, describing how their values tend to cluster around certain means and how they vary together. It’s like a two-dimensional bell curve, where the peak represents the most likely combination of values for both variables.

Click through to learn a bit more about bivariate normal distributions, including ways to plot one and show its density function.
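
For reference, the two-dimensional bell curve Steven describes has this density, where μ and σ are each variable’s mean and standard deviation and ρ is the correlation between the two:

```
f(x, y) = \frac{1}{2\pi \sigma_x \sigma_y \sqrt{1-\rho^2}}
          \exp\left( -\frac{1}{2(1-\rho^2)} \left[
          \left( \frac{x-\mu_x}{\sigma_x} \right)^2
          - 2\rho \left( \frac{x-\mu_x}{\sigma_x} \right) \left( \frac{y-\mu_y}{\sigma_y} \right)
          + \left( \frac{y-\mu_y}{\sigma_y} \right)^2 \right] \right)
```

When ρ = 0, the cross term drops out and the density factors into the product of two independent univariate normals.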

Failure Modes of Sending Pre-Read Materials for Meetings

Alex Velez sends us documents in advance:

This could be a hot take, but I’m not a fan of pre-reads and will respectfully decline most requests to share content before a meeting. 

Before I elaborate on why, let’s start by exploring what pre-reads are, why they often fail, and some more effective alternatives.

I think the viable but difficult alternative is to do what Jeff Bezos did at Amazon: for each meeting, there is a 2-page primer covering all of the relevant context. Once people are in the room, you distribute the 2-pager and everybody spends the first 5 minutes reading it. That serves the intent of the pre-read, but with strong social cues to actually do the reading, something that doesn’t exist with pre-reads. It also prevents people from going around in circles because they have different subsets of information and don’t realize it.

Of course, this is a challenge to pull off in practice and requires more effort from meeting hosts, but I’m also of the belief that there should be some level of pain involved in scheduling a meeting, as that will cut out many of the “This could have been an e-mail” types of meetings.

DAX Time Intelligence with a Fiscal Year Differing from Calendar Year

Olivier Van Steenlandt covers a common case:

Many companies don’t follow the regular Calendar as we know it (January 1st – December 31st). They follow their own Financial Calendar (often called a Fiscal Calendar), which can start at any time of the year.

Because of this, writing Year-To-Date calculations in DAX for your Tabular Model might seem challenging.

In the step-by-step example, we are working for a company that starts its Financial Year on July 1st.

Read on to see one way to do it. It doesn’t quite solve the problem Olivier brought up, but I’ll also note that having a calendar table with fiscal + calendar year information in it works remarkably well. It can even handle multiple fiscal year concepts; as an example, a state agency I worked for had a fiscal year beginning July 1, but the US federal government’s fiscal year begins October 1, so it was just a matter of having StateFiscalYear and FederalFiscalYear columns.
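
As a rough sketch of that approach (the table and column names are mine, as is the convention of naming a fiscal year after the calendar year in which it ends):

```sql
CREATE TABLE dbo.Calendar
(
    TheDate date NOT NULL PRIMARY KEY,
    CalendarYear AS YEAR(TheDate),
    -- The state fiscal year begins July 1, so July-December dates roll forward.
    StateFiscalYear AS CASE WHEN MONTH(TheDate) >= 7
                            THEN YEAR(TheDate) + 1
                            ELSE YEAR(TheDate) END,
    -- The federal fiscal year begins October 1.
    FederalFiscalYear AS CASE WHEN MONTH(TheDate) >= 10
                              THEN YEAR(TheDate) + 1
                              ELSE YEAR(TheDate) END
);
```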

Also, check out Olivier’s new theming, under the Data Cuisine motif.

Time Series Data in Postgres with TimescaleDB

Semab Tariq keeps track of time:

TimescaleDB is an open-source time-series database extension for PostgreSQL. It is designed to efficiently manage and query time-series data, offering features such as automatic data partitioning, data retention policies, and specialized time-series functions. 

This extension provides scalability, improved performance, and seamless integration with PostgreSQL, making it a powerful choice for applications dealing with large volumes of time-stamped data, including IoT, monitoring, and analytics.

Read on to learn how to install it (on Linux), some of the tuning parameters available, and how to create time series hypertables and chunk tables.
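
The basic workflow is only a few statements. A minimal sketch, with hypothetical table and column names, assuming the extension is already installed on the server:

```sql
-- Enable the extension in the current database.
CREATE EXTENSION IF NOT EXISTS timescaledb;

-- Start with an ordinary Postgres table that has a timestamp column.
CREATE TABLE sensor_readings
(
    recorded_at timestamptz NOT NULL,
    sensor_id   integer     NOT NULL,
    reading     double precision
);

-- Convert it into a hypertable partitioned by time; TimescaleDB manages
-- the underlying chunks automatically from here.
SELECT create_hypertable('sensor_readings', 'recorded_at');

-- Optional retention policy: automatically drop chunks older than 90 days.
SELECT add_retention_policy('sensor_readings', INTERVAL '90 days');
```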

SQL Compare at the Command Line

Steve Jones is resting his mouse-clicking finger:

Recently a customer was looking to automate some of their SQL Compare checks, but they wanted to do this in a dynamic way, since they needed to do this at scale. Their idea was to not have a SQL Compare project, but build a mapping at the command line.

This post addresses a part of their issue. This will look at how to set up a basic SQL Compare command line.

Read on to see how. This is something I did quite a while back as well (say, 8-9 years ago), and it required some DLL weirdness at the time. Once we got it working, however, it did a good job.

Plotting a Cumulative Distribution Function in R

Steven Sanderson builds a plot:

Before delving into the world of R programming, let’s first grasp the fundamental concept of a CDF. Imagine a group of students eagerly awaiting their exam results. The CDF for their scores would depict the probability of encountering a student with a score less than or equal to a specific value. For instance, if the CDF indicates a value of 0.7 at 80%, it implies that there’s a 70% chance of finding a student with a score of 80 or lower.

Read on to see how you can calculate this in a dataset and then plot the CDF.
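
In symbols: for a random variable X with density f, the CDF is

```
F(x) = P(X \le x) = \int_{-\infty}^{x} f(t) \, dt
```

so the quoted example is just F(80) = 0.7.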

Functions to Deal with Object Names in R

Maelle Salmon has a sticky note:

Interestingly the docs for setNames() sound as if it were created just for this use case!

“This is a convenience function that sets the names on an object and returns the object. It is most useful at the end of a function definition where one is creating the object to be returned and would prefer not to store it under a name just so the names can be assigned.”

For the opposite operation, removing the names of an object, we can use unname().

Read on for an overview of several of these functions. H/T R-Bloggers.
