Press "Enter" to skip to content

Day: July 11, 2025

Decision Trees and Non-Tabular Data

Ivan Palomares Carrascosa explains that you can use more than standard structured data against decision trees:

Versatile, interpretable, and effective for a variety of use cases, decision trees have been among the most well-established machine learning techniques for decades, widely used for classification and regression tasks. Yet, they are still widely used — whether as standalone models or as components of more powerful ensemble methods like random forests and gradient boosting machines.

And there is one more attractive feature that pushes the boundaries of their versatility even further: they can accommodate data in diverse formats, beyond just fully structured, tabular data. This article examines this facet of decision trees from a balanced theoretical and practical approach.

Click through for an example.

Leave a Comment

Event Notification via LISTEN/NOTIFY in PostgreSQL Doesn’t Scale

Elliot Levin takes us through a performance issue:

We love Postgres and it lives at the heart of our service! But this extremely concurrent, write-heavy workload resulted in a stalled-out Postgres. This is the story of what happened, how we ended up discovering a bottleneck in the LISTEN/NOTIFY feature of Postgres (the event notifier that runs based on triggers when something changes in a row), and what we ended up doing about it.

Click through for details, as well as what the team there did to migrate away from this feature.

Leave a Comment

Replication Types and Modes in PostgreSQL

Semab Tariq lays out three types of replication available in PostgreSQL:

PostgreSQL also supports replication to keep standby servers in sync with the primary server using Write-Ahead Log (WAL) files. Every change made to the database is first recorded in these WAL files on the primary server. These logs are then continuously streamed to the standby server, which applies them to stay up to date. This method ensures that all standby servers stay in sync with the primary and are ready to be promoted in case the primary server fails.

In this blog, we will explore the different types and modes of replication available in PostgreSQL to help you understand which option best fits your business needs.

Click through for the list, as well as an overview of each replication type.

Leave a Comment

Capture Long-Running Queries via Extended Events

Tom Collins has another extended events session for us:

A SQL Server Extended Event to track SQL queries taking longer than 100 seconds to complete. Adjust accoring to your requriements.

There is also a query below to extract the column details from the xel file 

Click through for the code. This kind of extended events session is rather useful for performance tuning and finding issues before customers e-mail.

Leave a Comment

Information Disclosure Vulnerability in SQL Server

Mike Walsh takes us through a recent CVE entry:

On patch Tuesday this week, Microsoft released an Important severity security update (a CVSS base score of 7.5)

The details of this 0-day exploit are available to read at the NIST site, and the Microsoft security update site.

In short, the exploit that Microsoft has discovered and subsequently fixed can allow information disclosure.

The NIST entry is kind of a joke right now, and the Microsoft security update info is basically what they submitted to NIST plus links to download the patches. Still, this is worth patching and it’s an issue that goes back at least to 2016—probably earlier, but 2016 is the last version of SQL Server that still gets security updates.

Leave a Comment

Accessing Delta Lake Tables as Iceberg Data

Matthew Hicks makes an announcement:

We’re thrilled to announce an exciting new Preview capability in OneLake: you can now automatically read Delta Lake tables using Apache Iceberg compatible readers, with no need for migration, copying, or manual conversion. This enhancement gives data engineers and analytics teams unprecedented flexibility in how they access and interact with their data.

This is pretty neat, given that Iceberg is the other popular format for data lakes.

Leave a Comment