Press "Enter" to skip to content

Month: September 2023

ORMs and Mapping Requirements

Mark Seemann is not a big fan of Entity Framework:

When I evaluate whether or not to use an ORM in situations like these, the core application logic is my main design driver. As I describe in Code That Fits in Your Head, I usually develop (vertical) feature slices one at a time, utilising an outside-in TDD process, during which I also figure out how to save or retrieve data from persistent storage.

Thus, in systems like these, storage implementation is an artefact of the software architecture. If a relational database is involved, the schema must adhere to the needs of the code; not the other way around.

To be clear, then, this article doesn’t discuss typical CRUD-heavy applications that are mostly forms over relational data, with little or no application logic. If you’re working with such a code base, an ORM might be useful. I can’t really tell, since I last worked with such systems at a time when ORMs didn’t exist.

Read on for a thoughtful argument. The only critique I have is I’d prefer stored procedures over saving SQL queries in the code.

1 Comment

Incremental Sort in Postgres

Umair Shahid takes us through the concept of incremental sort in PostgreSQL:

Incremental sort is a database optimization feature, introduced in PostgreSQL 13, that allows sorting to be done incrementally during the query execution process. Sorting is a common operation in database queries, often necessary when retrieving data in a specific order. PostgreSQL’s query planner uses incremental sort to improve query performance, particularly for large datasets. This feature is enabled by default in PostgreSQL 13 and above. 

Read on to see how it works and some good practices which help maximize the likelihood that you can take advantage of the feature.

Comments closed

Building a Flink Application in Java

Wade Waldron talks about a (free) new course:

Recently, I got my hands dirty working with Apache Flink®. The experience was a little overwhelming. I have spent years working with streaming technologies but Flink was new to me and the resources online were rarely what I needed. Thankfully, I had access to some of the best Flink experts in the business to provide me with first-class advice, but not everyone has access to an expert when they need one. 

To share what I learned, I created the Building Flink Applications in Java course on Confluent Developer. It provides you with hands-on experience in building a Flink application from the ground up. I also wrote this blog post to walk through an example of how to do dataflow programming with Flink. I hope these two resources will make the experience less overwhelming for others.

Click through for the blog post and check out the full course if you’re so inclined.

Comments closed

Working with Histogram Breaks in R

Steven Sanderson divvies out buckets for a histogram:

Histograms divide data into bins, or intervals, and then count how many data points fall into each bin. The breaks parameter in R allows you to control how these bins are defined. By specifying breaks thoughtfully, you can highlight specific patterns and nuances in your data.

Click through to see how you can use the breaks parameter in a few different ways to customize your histogram. The default breaks in R are often reasonable, but trying a few different breaks can help you get a better understanding of the actual distribution of the data.

Comments closed

An Overview of Sankey Diagrams

Simon Rowe explains what a Sankey diagram is:

In the image, you can see how Sankey used arrows to show the flow of energy with the widths of the shaded areas proportional to the amount of heat loss as it progresses through the engine’s cycle. This series of complex relationships would be difficult for a reader to understand at a glance were they simply presented in text and data tables. Making just such a sophisticated system easier to understand is the purpose of a Sankey diagram, which visually summarises the volume and direction of flows through the stages of a process or system.

Click through for several good examples, some advice on when a Sankey diagram could make sense, and the times in which you should not use this visual.

Comments closed

Azure Blob Storage Operating System Error 86

Jose Manuel Jurado Diaz 86’d that option:

Today, I worked on a service request that our customer got the following error message: Cannot open backup device ‘https://XXX.blob.core.windows.net/NNN/YYY.bak‘. Operating system error 86(The specified network password is not correct.). RESTORE HEADERONLY is terminating abnormally. (Microsoft SQL Server, Error: 3201). Following I would like to share with you some details why this issue and the activities done to resolve it. 

Read on to get a better understanding of what this error actually means and how you can fix it.

Comments closed

Azure AD (or Entra ID) Authentication for SQL Server VMs

Deepthi Goguri enables Azure Entra ID security on a SQL Server VM in Azure:

To enable the SQL Server 2022 on a virtual machine to use Azure AD, we need below things:

Deepthi then includes the list of requirements, starting with a managed identity and moving on to permissions and other configuration. It looks like a fair number of steps, but it’s not that onerous a task once you get to it.

And this also gives me a chance to rant about Microsoft marketing a bit, as they are in the process of switching the name Azure Active Directory to Azure Entra ID. Granted, Azure Active Directory isn’t Active Directory (although it does perform a very similar task in a fairly similar way). So to remove confusion that I don’t think really existed, they changed the name to Entra ID. Except that most of the Microsoft documentation still says Azure Active Directory, and we have about a decade’s worth of blog content talking about Azure Active Directory, so when you go searching for the resolution to a problem, you’ll have to search for Azure Entra ID as well as its former name, which means people will still link the product to Azure Active Directory—at least, until the point when there’s enough blog content and documentation in place to replace the large majority of those existing blog posts—and so you renamed a product for no reason. Plus, they picked an ambiguous name that people will pronounce multiple ways: is the “ent” in Entra like “enter the dungeon” or Entra like “a delicious entrée”?

But then again, considering how many pronunciations of Azure there are, maybe this is the plan…

Comments closed

A Review of DataVeil for SQL Server Users

Brian Kelley tries out a product:

My organization typically moves production data to other environments. There are a variety of use cases:

  • Testing with the amount and frequency of production data.
  • Performing analytics on said data.
  • Delivering production-like data to a third party for their use.

We do not want to move production data around. Instead, we want to deliver “production-like” data for these use cases. Sometimes, we work with multiple systems integrated with each other, and in those cases, we need the data to match up. In other instances, we need the sensitive data, such as personal identifiable information (PII), to be altered so it’s no longer sensitive, but there’s no requirement for it to be consistent across systems.

Read on for Brian’s full review. I should also note that this is most definitely a paid product.

Comments closed