Curated SQL – Page 401 – A Fine Slice Of SQL Server

Differentiating Physical and Logical Reads in SQL Server

Published 2023-11-13 by Kevin Feasel

Jose Manuel Jurado Diaz explains a concept:

In the realm of Azure SQL Database, query performance is a paramount concern for database administrators and developers alike. A critical aspect of this performance is understanding how SQL Server interacts with data, particularly through physical and logical reads. This article delves into these two fundamental concepts, providing insights into their impact on database performance and a practical lab to observe these metrics in action.

Read on for the difference, as well as a demonstration. With slow disks and insufficient RAM, it’s really important to know this difference. But as you have more RAM and move to formats like NVMe for storage, I’d argue that it’s less of an issue. The additional RAM, in particular, is important because the idea is that data access frequently will remain in the buffer pool for longer, so you’re more likely to see logical reads in action. Of course, poor indexing and bad decisions can ruin that idea, so don’t do that, okay?

Comments closed

Simulating a Bivariate Normal Distribution in R

Published 2023-11-10 by Kevin Feasel

Steven Sanderson isn’t content with a univariate normal distribution:

Imagine two variables, like height and weight, that exhibit a joint distribution. The bivariate normal distribution captures the relationship between these variables, describing how their values tend to cluster around certain means and how they vary together. It’s like a two-dimensional bell curve, where the peak represents the most likely combination of values for both variables.

Click through to learn a bit more about bivariate normal distributions, including ways to plot one and show its density function.

Comments closed

Failure Modes of Sending Pre-Read Materials for Meetings

Published 2023-11-10 by Kevin Feasel

Alex Velez sends us documents in advance:

This could be a hot take, but I’m not a fan of pre-reads and will respectfully decline most requests to share content before a meeting.

Before I elaborate on why, let’s start by exploring what pre-reads are, why they often fail, and some more effective alternatives.

I think the viable but difficult alternative is to do what Jeff Bezos did at Amazon: for each meeting, there is a 2-page primer covering all of the relevant context for the meeting. After people are in the room, you distribute the 2-pager and everybody spends 5 minutes reading it first. That serves the intent of the pre-read but there are strict social cues to do the reading, something that does not exist with pre-reads. It also prevents people from going around in circles because they have different subsets of information and don’t realize it.

Of course, this is a challenge to pull off in practice and requires more effort from the standpoint of meeting hosts, but I’m also of the belief that there should be some level of pain involved in scheduling a meeting, as that will cut out many of the “This could have been an e-mail” types of meetings.

Comments closed

Microsoft Fabric Decision Guides

Published 2023-11-10 by Kevin Feasel

Wolfgang Strasser has a pair of links:

As Microsoft Fabric is a brand-new technology (and still in preview – as of 2023-11-09), there are hardly any best practices published and available.
But today, I found some great catches in the Fabric documentation – two decision guides

Click through for more details on those two guides.

Comments closed

DAX Time Intelligence with a Fiscal Year Differing from Calendar Year

Published 2023-11-10 by Kevin Feasel

Olivier Van Steenlandt covers a common case:

Many companies don’t follow the regular Calendar as we know (January 1st – December 31st). They follow their own Financial Calendar (often called Fiscal Calendar) which can start at any time of the year.

Because of this, writing Year-To-Date calculations in DAX for your Tabular Model might seem challenging.

In the step-by-step example, we are working for a company that starts its Financial Year on July 1st.

Read on to see one way to do it. It doesn’t quite solve the problem Olivier brought up, but I’d also make note that having a calendar table with fiscal + calendar year information in it helps remarkably well. It can even handle multiple fiscal year concepts; as an example, a state agency I worked for had a fiscal year on July 1 but the US federal government’s fiscal year begins October 1, so it was just a matter of having StateFiscalYear and FederalFiscalYear columns.

Also, check out Olivier’s new theming, under the Data Cuisine motif.

1 Comment

Time Series Data in Postgres with TimescaleDB

Published 2023-11-10 by Kevin Feasel

Semab Tariq keeps track of time:

TimescaleDB is an open-source time-series database extension for PostgreSQL. It is designed to efficiently manage and query time-series data, offering features such as automatic data partitioning, data retention policies, and specialized time-series functions.

This extension provides scalability, improved performance, and seamless integration with PostgreSQL, making it a powerful choice for applications dealing with large volumes of time-stamped data, including IoT, monitoring, and analytics.

Read on to learn how to install it (on Linux), some of the tuning parameters available, and how to create time series hypertables and chunk tables.

Comments closed

SQL Compare at the Command Line

Published 2023-11-10 by Kevin Feasel

Steve Jones is resting his mouse-clicking finger:

Recently a customer was looking to automate some of their SQL Compare checks, but they wanted to do this in a dynamic way, since they needed to do this at scale. Their idea was to not have a SQL Compare project, but build a mapping at the command line.

This post addresses a part of their issue. This will look at how to set up a basic SQL Compare command line.

Read on to see how. This is something I did a while back as well, though it was quite a while ago (say, 8-9 years ago) and required us to do some DLL weirdness back then. Once we got it working, however, it did a good job.

Comments closed

Plotting a Cumulative Distribution Function in R

Published 2023-11-09 by Kevin Feasel

Steven Sanderson builds a plot:

Before delving into the world of R programming, let’s first grasp the fundamental concept of a CDF. Imagine a group of students eagerly awaiting their exam results. The CDF for their scores would depict the probability of encountering a student with a score less than or equal to a specific value. For instance, if the CDF indicates a value of 0.7 at 80%, it implies that there’s a 70% chance of finding a student with a score of 80 or lower.

Read on to see how you can calculate this in a dataset and then plot the CDF.

Comments closed

Functions to Deal with Object Names in R

Published 2023-11-09 by Kevin Feasel

Maelle Salmon has a sticky note:

Interestingly the docs for setNames() sound as if it were created just for this use case!

“This is a convenience function that sets the names on an object and returns the object. It is most useful at the end of a function definition where one is creating the object to be returned and would prefer not to store it under a name just so the names can be assigned.”

For the opposite operation, removing the names of an object, we can use unname().

Read on for an overview of several of these functions. H/T R-Bloggers.

Comments closed

Using VS Code Extensions in Azure Data Studio

Published 2023-11-09 by Kevin Feasel

I have a quick tip blog post:

Something that a lot of people don’t know is, you can import some Visual Studio Code extensions into Azure Data Studio and have them work.

I emphasize the word “some” here because I have seen cases of extensions failing to install. I don’t know exactly why certain extensions fail or what percentage of them will work, so give it a whirl and see what you get.

Comments closed

M	T	W	T	F	S	S
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30

Curated SQL Posts