Day: March 14, 2023

Thoughts on Linear Regression

John Mount shares some thoughts:

I want to spend some time thinking out loud about linear regression.

As a data science consultant and teacher I spend a lot of time using linear regression and teaching linear regression. I have found each of these pursuits can degenerate into mere doctrine or instructions: “do this,” “expect this,” “don’t do that,” “you should know,” and so on. What I want to do here is take a step back and think out loud about linear regression from first principles. To attempt this I am going to start with the problem linear regression solves, and try to delay getting to the things so important that “everybody should know them without question.” So let’s think about a few things in a particular order.

For thinking out loud, this is laid out rather well, so give it a read.
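
As a small illustration of "the problem linear regression solves," here is a minimal sketch of one common framing: pick the intercept and slope that minimize the squared error of a straight-line prediction. This is not taken from John's post; the function name `fit_line` and the data are made up for illustration, and it handles only a single predictor.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y ~ intercept + slope * x (one predictor).

    Hypothetical helper for illustration; not from the linked post.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values are all identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


# Made-up data lying exactly on a line, so the fit recovers that line.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]
intercept, slope = fit_line(xs, ys)
print(intercept, slope)
```

With noisy data the same formulas apply; the fitted line simply no longer passes through every point.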

Understanding Azure Cognitive Search Costs

Matt Eland doesn’t want to break the bank:

Let’s continue my recent trend in exploring pricing tips for the various parts of AI and Machine Learning on Azure with a dive into Azure Cognitive Search.

Sometimes confused with the AI offerings of Azure Cognitive Services, the entirely different Azure Cognitive Search is a rich service that allows you to index a variety of files and documents, extract meaning from those documents, and provide rich search results to users.

In this article we’ll explore the pricing structure of Azure Cognitive Search and highlight some things you should be aware of as you plan and develop your Cognitive Search resources.

Read the whole thing if you’re thinking of using Azure Cognitive Search. It’s a good service and I think the pricing model is fairly straightforward, though there are always nuances to these things.

Object Tagging in Snowflake

Warner Chaves tags a table:

A tag is a user-defined label that can be attached to a Snowflake object, such as a database, table, or column. Tags can categorize objects based on any criteria that you choose, such as sensitivity, business unit, project, or owner. Once tags have been applied, you can use them to control access to the tagged objects, track usage and costs, and apply policies and rules.

Now let’s apply tagging to a specific use case: identifying sensitive customer data. For example, let’s assume that you have a table in Snowflake called “customers” that contains customer information, including their addresses. We want to categorize the “address” column as sensitive so that we can apply data protection policies and controls.

Click through for a few examples of how to create tags, apply tags to database objects, and review tagged objects.
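
For a rough sense of what the use case above might look like, here is a minimal sketch that builds the two Snowflake statements involved: creating a tag and setting it on a column. The tag name `sensitivity`, the value `high`, and the helper names are assumptions for illustration, not Warner's examples. The sketch only builds SQL text; running it would require a Snowflake session, and identifiers are not validated here.

```python
def quote_literal(value):
    """Wrap a string as a SQL literal, doubling any embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def tag_column_statements(table, column, tag, value):
    """Return SQL text that creates a tag and sets it on one column.

    Hypothetical helper: identifiers are inserted as-is, so pass only
    trusted names.
    """
    return [
        f"CREATE TAG IF NOT EXISTS {tag};",
        f"ALTER TABLE {table} MODIFY COLUMN {column} "
        f"SET TAG {tag} = {quote_literal(value)};",
    ]


# The "customers" table and "address" column come from the excerpt;
# the tag name and value are assumed.
statements = tag_column_statements("customers", "address", "sensitivity", "high")
for stmt in statements:
    print(stmt)
```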

A Review of Postgres Memory Parameters

Henrietta Dombrovskaya takes a look at memory parameters in Postgres:

Ordinary PostgreSQL users often do not know that PostgreSQL configuration parameters exist, let alone what they are and what they mean. There is a good reason for such ignorance since, in real life, ordinary users don’t have any say in how these parameters are set. Configuration parameters are set not just for a database but for the whole instance, which may have multiple databases, so any individual user will get the same as others get. To be completely transparent, in some cases, the said ordinary users can specify some parameters just for their own uses, but let’s hold our horses for now.

There are over three hundred PostgreSQL configuration parameters, so no wonder that even experienced DBAs often do not know what each of these parameters does. That is perfectly fine; however, there is a widespread belief that somewhere, in the secret vaults of many consulting companies, there is a treasure chest of perfect PostgreSQL parameter settings.

Read on for more information about config parameters in general, followed by several memory-related parameters you can tweak and some guidance on where to begin with them.

Building a Dimension and Measure Matrix for Power BI

Olivier Van Steenlandt does some documentation:

In this blog post, I will guide you through all the required steps to get a Data Model Relationship Matrix in Power BI.

If you don’t know what I mean, I would like to have a straightforward overview where I can see which attribute groups and measure groups I can combine from my Tabular Model in (SQL Server) Analysis Services.

The first thing I thought of was “this is very much like a bus matrix in the Kimball model.” It’s a little different, though, as the rows in the axis pertain to measure groups rather than business units.

Scaling Multiple Azure SQL DBs on a Single Server

Laith Ayesh has a script for us:

In a few scenarios, you might need to scale multiple databases on a logical server (not part of an elastic pool) at once, but the Azure portal only allows you to scale each database individually. This can be achieved using the following PowerShell script:

Just modify the parameters such as SubID, the resource group, and the server name, then pick the service tier you want and run the script:

Click through for the PowerShell script and an important note.

Saving Incremental PBIX File Backups

Matt Allington saves PBIX Final Report V2 No Wait V3 Draft Demo 2 Copy Copy Copy 3.pbix:

I do a lot of Power BI model and report development; maybe you do too. There’s nothing worse than spending an hour or so developing your model only to have something go wrong and you lose your work. Things that can go wrong include:

  • Your PC/App crashes. Power BI does have auto save, but I prefer that to be the last thing I rely on to save the day rather than the only thing.
  • You make a significant mistake in the approach and need to undo your work (autosave won’t save you with this problem).
  • You make a big mistake in Power Query. There is no undo in Power Query, so if you spend an hour inside Power Query, don’t save, and then make a mistake, there is no way to recover your work.

At the time of writing, there are no version control tools built into Power BI, so as a result it is up to you to manage backups yourself.

Read on for a few tips around backups and file management.
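
One way to manage backups yourself is to copy the file into a backup folder under a timestamped name each time you reach a good stopping point. The sketch below is an assumption about how that could look, not Matt's method; `backup_file`, the folder layout, and the file name are hypothetical.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_file(source, backup_dir, now=None):
    """Copy `source` into `backup_dir` under a timestamped name.

    Hypothetical helper. `now` can be passed in to make the name
    predictable; by default the current local time is used.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    if not source.is_file():
        raise FileNotFoundError(source)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    # Example result: "Sales Report_20230314-093000.pbix"
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    return target
```

Calling `backup_file("Sales Report.pbix", "backups")` leaves the original in place and adds one new copy per call. Two calls within the same second produce the same name, so the later copy overwrites the earlier one.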

Tips for Debugging DAX Code

Ed Hansberry has no bugs, but just in case:

When trying to get your DAX measures to work correctly, there are a number of tools you can use to debug, which I will briefly mention, but not go into depth on. Instead, this post is about how to think about debugging your code. I am focusing on DAX here, but many of the methods would apply to any type of code.

Read on for a series of tips around built-in capabilities, process, and the power of conversation.
