Day: October 7, 2025

Choosing a Time Series Forecast Model

Ivan Palomares Carrascosa builds a matrix:

Time series data have the added complexity of temporal dependencies, seasonality, and possible non-stationarity.

Arguably, the most frequent predictive problem to address with time series data is forecasting, i.e., predicting future values of a variable like temperature or stock price based on historical observations up to the present. With so many different models for time series forecasting, practitioners might sometimes find it difficult to choose the most suitable approach.

This article is designed to help, through the use of a decision matrix accompanied by explanations on when and why to employ different models depending on data characteristics and problem type.

Ivan breaks it out into two dimensions, data complexity and univariate/multivariate, and explains which types of algorithms might work best in each.

Optimized Compaction in Microsoft Fabric Spark

Miles Cole crunches things down:

Compaction is one of the most necessary but also challenging aspects of managing a Lakehouse architecture. Similar to file systems and even relational databases, unless closely managed, data will get fragmented over time, and can lead to excessive compute costs. The OPTIMIZE command exists to solve for this challenge: small files are grouped into bins targeting a specific ideal file size and then rewritten to blob storage. The result is the same data, but contained in fewer files that are larger.

However, imagine this scenario: you have a nightly OPTIMIZE job which runs to keep your tables, all under 1GB, nicely compacted. Upon inspection of the Delta table transaction log, you find that most of your data is being rewritten after every ELT cycle, leading to expensive OPTIMIZE jobs, even though you are only changing a small portion of the overall data every night. Meanwhile, as business requirements lead to more frequent Delta table updates, in between ELT cycles, it appears that jobs get slower and slower until the next scheduled OPTIMIZE job is run. Sound familiar?

Read on to see what’s new and how you can enable it in your Fabric workspace.
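
The bin-grouping step described in the quote above can be illustrated with a small sketch. This is not how OPTIMIZE is actually implemented in Delta Lake or Fabric; it is a plain first-fit-decreasing packing, and the function name `plan_bins`, the 128 MB target, and the `min_files` threshold are all assumptions chosen for illustration.

```python
from typing import List


def plan_bins(file_sizes_mb: List[int], target_mb: int = 128,
              min_files: int = 2) -> List[List[int]]:
    """Group small files into bins of at most target_mb each.

    Hypothetical helper: target_mb and min_files are illustrative values,
    not defaults taken from any real engine.
    """
    # Files already at or above the target size are left untouched.
    small = sorted((s for s in file_sizes_mb if s < target_mb), reverse=True)

    bins: List[List[int]] = []
    totals: List[int] = []
    for size in small:
        # First-fit decreasing: place the file in the first bin with room.
        for i, total in enumerate(totals):
            if total + size <= target_mb:
                bins[i].append(size)
                totals[i] += size
                break
        else:
            # No existing bin has room, so open a new one.
            bins.append([size])
            totals.append(size)

    # Rewriting a bin that holds a single file would not reduce file count.
    return [b for b in bins if len(b) >= min_files]


if __name__ == "__main__":
    # One 200 MB file is skipped; the lone 40 MB bin is dropped.
    print(plan_bins([200, 60, 50, 40, 10, 5]))  # [[60, 50, 10, 5]]
```

Even in this toy version, any file that lands in a bin gets rewritten in full, which hints at why a table made up of many small files can see most of its data rewritten on each run.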

Error Handling in PowerShell with ErrorAction

Patrick Gruenauer decides to continue:

Error handling is an important part of scripting and automation, and PowerShell provides robust tools for managing errors efficiently. One of the key features for error management in PowerShell is the ErrorAction parameter. This blog post will dive into the ErrorAction parameter, explaining how it works and providing practical examples of its use.

Read on to see the five available options and a pair of examples.

Value Filter Behavior and SUMMARIZECOLUMNS in DAX

Alberto Ferrari and Marco Russo provide an introduction:

Value filter behavior controls how SUMMARIZECOLUMNS applies filters to the measure evaluation. This is mostly relevant when developers use the filter arguments in SUMMARIZECOLUMNS.

The topic is very broad, and we will just be able to scratch the surface here. However, the good news is that most developers do not need to learn the intricacies of value filter behavior. This property has three settings: Automatic, Independent, and Coalesced. The safest and most correct setting is the one introduced in 2025: Independent. Coalesced was the default setting before 2025, whereas Automatic retains Coalesced for older models, and it sets Independent for new models.

Read on to learn more about these behaviors and what they mean for your queries.

Microsoft Fabric Spark Connector for SQL Databases

Arshad Ali makes an announcement:

Fabric Spark connector for SQL databases (Azure SQL databases, Azure SQL Managed Instances, Fabric SQL databases and SQL Server in Azure VM) in the Fabric Spark runtime is now available. This connector enables Spark developers and data scientists to access and work with data from SQL database engines using a simplified Spark API. The connector will be included as a default library within the Fabric Runtime, eliminating the need for separate installation.

This is a preview feature and works with Scala and Python code against SQL Server-ish databases in Azure (Azure SQL DB, Azure SQL Managed Instance, and virtual machines running SQL Server in Azure).
