Press "Enter" to skip to content

Day: November 20, 2023

New Features in healthyR.ts 0.3

Steven Sanderson lays out some updates:

One of the standout additions is the introduction of util_log_ts(). This function seems like a game-changer, providing a streamlined way to log time series data. This is incredibly useful, especially when dealing with extensive datasets, making the whole process more efficient and user-friendly. This is a helper function for auto_stationarize().

There’s a lot in this update and the blog post also includes several examples of automating stationarity and ARIMA.

Comments closed

Choosing the Right Technology in the Modern Azure Data Warehouse

Josephine Bush has some advice:

Here’s a quick description of the options we explored:

  • Azure Data Factory – Orchestrates and automates data movement and transformation. You can create workflows, pipelines, and ETL (Extract, Transform, Load) processes using it.
  • Databricks – A unified data science, engineering, and analytics platform based on Apache Spark. It simplifies data exploration, preparation, and machine learning workflows, allowing teams to collaborate efficiently. Interactive notebooks make Databricks a versatile tool for scalable data analysis and processing.
  • Synapse – Integration of big data and data warehousing in the cloud. It facilitates collaborative analytics and AI-driven insights using serverless and provisioned resources across various data sources. Integrated analytics, warehousing, and data integration are part of Synapse’s unified experience.
  • Fabric – An all-in-one analytics solution for enterprises that offers data movement, data lakes, data engineering, data integration, data science, and real-time analytics.

Read on for pros and cons of different options Josephine & crew reviewed, as well as the option they landed on and why.

Comments closed

Many-to-Many Power BI Relationships and Table Refreshes

Dany Hoter gives us a reason to minimize use of many-to-many relationships in Power BI:

I must admit that in the last two years I’ve told many Power BI/Kusto customers not to worry about relationships that are created as M:M.

I was pretty sure that with Direct Query, such relationships are fine,

Indeed, the generated queries looked fine and performed as expected.

I recently became aware that the number of queries generated for some visuals e.g. Matrix and tables can be affected by the type of relationships between the participating tables.

Read on for a description of why you shouldn’t load your Power BI semantic models with many-to-many relationships, especially once Kusto is involved.

Comments closed

A Pattern for DAX Time Intelligence Functions

Allison Kennedy checks the calendar:

Time Intelligence functions in DAX change the filter context on the Date table of your model. 

Step 0: Prepare 

Before using Time Intelligence functions, you should:

Read on for a four-step process covering how to apply a time intelligence function like DATEADD() or DATESYTD() in DAX.

Comments closed

The Importance of Asking the Right Question

Brian Kelley offers some advice:

This is not a “clickbait” title, but an important consideration when it comes to developing technical solutions. Let me give you an example between two questions for SQL Server on-premises running on Windows.

Question 1: Does SQL Server allow you to set things like password complexity, password length, and the number of failed login attempts before the account is locked?

Question 2: Does SQL Server support things like password complexity, password length, and locking the account after a number of failed login attempts?

Betteridge’s Law of Headlines also applies to Brian’s post.

It’s so easy to get locked into answering the question without that additional context, and it’s also hard to tell if a person is asking question 1 because they don’t know the answer in general, or if they’re asking because they know you can do it in Windows but aren’t sure if there is a separate mechanism for SQL Server.

Comments closed

Power BI Datasets? Semantic Models!

Chris Webb shares some thoughts:

The name change proved to be surprisingly uncontroversial. Of course it’s very disruptive – trust me, I know, I have around 500 blog posts that I need to do a search-and-replace on at some point – so I have a lot of sympathy for people with books or training courses that need updating or who are getting calls from confused end users who are wondering where their datasets have gone. But there was a general consensus that the change was the right thing to do:

Read on for a bit more of the story, as well as some great links to get additional information on semantic modeling.

Comments closed

Primer on Indexing and Partitioning in Postgres

Salman Ahmed gives us a 10,000 foot view of two topics:

When it comes to managing large and complex databases in PostgreSQL, an important decision you’ll face is how to optimize your data storage and retrieval strategies. Two common techniques for improving database performance and manageability are indexing and partitioning in PostgreSQL.

Read on for a quick overview of each topic, including the variety of index types and partitioning strategies available.

Comments closed