Month: March 2023

Understanding Azure Cognitive Search Costs

Matt Eland doesn’t want to break the bank:

Let’s continue my recent trend in exploring pricing tips for the various parts of AI and Machine Learning on Azure with a dive into Azure Cognitive Search.

Sometimes confused with the AI offerings of Azure Cognitive Services, the entirely different Azure Cognitive Search is a rich service that allows you to index a variety of files and documents, extract meaning from those documents, and provide rich search results to users.

In this article we’ll explore the pricing structure of Azure Cognitive Search and highlight some things you should be aware of as you plan and develop your Cognitive Search resources.

Read the whole thing if you’re thinking of using Azure Cognitive Search. It’s a good service and I think the pricing model is fairly straightforward, though there are always nuances to these things.

Object Tagging in Snowflake

Warner Chaves tags a table:

A tag is a user-defined label that can be attached to a Snowflake object, such as a database, table, or column. Tags can categorize objects based on any criteria that you choose, such as sensitivity, business unit, project, or owner. Once tags have been applied, you can use them to control access to the tagged objects, track usage and costs, and apply policies and rules.

Now let’s apply tagging to a specific use case: identifying sensitive customer data. For example, let’s assume that you have a table in Snowflake called “customers” that contains customer information, including their addresses. We want to categorize the “address” column as sensitive so that we can apply data protection policies and controls.

Click through for a few examples of how to create tags, apply tags to database objects, and review tagged objects.
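
To make the address example a bit more concrete, here is a minimal sketch in Python that builds the kind of statements involved. The tag name `sensitivity` and the value `high` are my own placeholders rather than anything from Warner's post, and the helper only assembles SQL text; running it against Snowflake is left to whatever client you already use.

```python
import re

# Plain (unquoted) identifiers only, optionally dotted, e.g. db.schema.customers.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_$]*(\.[A-Za-z_][A-Za-z0-9_$]*)*$")


def _check_identifier(name: str) -> str:
    """Reject anything that is not a simple identifier before building SQL."""
    if not _IDENT.match(name):
        raise ValueError(f"unsupported identifier: {name!r}")
    return name


def _quote_literal(value: str) -> str:
    """Single-quote a string literal, doubling any embedded quotes."""
    return "'" + value.replace("'", "''") + "'"


def tag_column_statements(table: str, column: str, tag: str, value: str) -> list[str]:
    """Build statements to create a tag, attach it to a column, and read it back.

    The tag name and value are hypothetical placeholders chosen by the caller.
    """
    table, column, tag = map(_check_identifier, (table, column, tag))
    return [
        f"CREATE TAG IF NOT EXISTS {tag};",
        f"ALTER TABLE {table} MODIFY COLUMN {column} SET TAG {tag} = {_quote_literal(value)};",
        f"SELECT SYSTEM$GET_TAG('{tag}', '{table}.{column}', 'COLUMN');",
    ]


if __name__ == "__main__":
    for statement in tag_column_statements("customers", "address", "sensitivity", "high"):
        print(statement)
```

The identifier check is deliberately strict: it keeps the sketch from pasting arbitrary text into a statement, at the cost of not supporting quoted identifiers.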

A Review of Postgres Memory Parameters

Henrietta Dombrovskaya takes a look at memory parameters in Postgres:

Ordinary PostgreSQL users often do not know that PostgreSQL configuration parameters exist, let alone what they are and what they mean. There is a good reason for such ignorance since, in real life, ordinary users don’t have any say in how these parameters are set. Configuration parameters are set not just for a database but for the whole instance, which may have multiple databases, so any individual user will get the same as others get. To be completely transparent, in some cases, the said ordinary users can specify some parameters just for their own uses, but let’s hold our horses for now.

There are over three hundred PostgreSQL configuration parameters, so no wonder that even experienced DBAs often do not know what each of these parameters does. That is perfectly fine; however, there is a widespread belief that somewhere, in the secret vaults of many consulting companies, there is a treasure chest of perfect PostgreSQL parameter settings.

Read on for more information about config parameters in general, followed by several memory-related parameters you can tweak and some guidance on where to begin with them.

Building a Dimension and Measure Matrix for Power BI

Olivier Van Steenlandt does some documentation:

In this blog post, I will guide you through all the required steps to get a Data Model Relationship Matrix in Power BI.

If you don’t know what I mean, I would like to have a straightforward overview where I can see which attribute groups and measure groups I can combine from my Tabular Model in (SQL Server) Analysis Services.

The first thing I thought of was “this is very much like a bus matrix in the Kimball model.” It’s a little different, though, as the rows pertain to measure groups rather than business processes.

Scaling Multiple Azure SQL DBs on a Single Server

Laith Ayesh has a script for us:

In a few scenarios, you might need to scale multiple databases on a logical server (not part of an elastic pool) at once, but the Azure portal only allows you to scale each database individually. This can be achieved using the following PowerShell script:

Just modify the parameters like SubID, the resource group, and the server name, then pick the service tier you want and run the script:

Click through for the PowerShell script and an important note.
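
This is not Laith's script, but the general shape of the idea can be sketched in Python: loop over the databases and emit one scaling command per database. I'm assuming the Azure CLI's `az sql db update` command with its `--service-objective` option here, and the function name, parameter names, and example values are all hypothetical. The sketch only builds the commands; it does not run them.

```python
import shlex


def build_scale_commands(
    subscription_id: str,
    resource_group: str,
    server: str,
    databases: list[str],
    service_objective: str,
) -> list[list[str]]:
    """Build one `az sql db update` argument list per database.

    The system database `master` is skipped. Nothing is executed here.
    """
    commands = []
    for database in databases:
        if database.lower() == "master":
            continue
        commands.append([
            "az", "sql", "db", "update",
            "--subscription", subscription_id,
            "--resource-group", resource_group,
            "--server", server,
            "--name", database,
            "--service-objective", service_objective,
        ])
    return commands


if __name__ == "__main__":
    # Placeholder values for illustration only.
    for command in build_scale_commands(
        "00000000-0000-0000-0000-000000000000",
        "my-resource-group",
        "my-sql-server",
        ["master", "sales", "inventory"],
        "S1",
    ):
        print(shlex.join(command))
```

Printing the commands first gives you a chance to review what would change before anything is scaled, which seems prudent given the cost implications.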

Saving Incremental PBIX File Backups

Matt Allington saves PBIX Final Report V2 No Wait V3 Draft Demo 2 Copy Copy Copy 3.pbix:

I do a lot of Power BI model and report development; maybe you do too. There’s nothing worse than spending an hour or so developing your model only to have something go wrong and you lose your work. Things that can go wrong include:

  • Your PC/App crashes. Power BI does have autosave, but I prefer that to be the last thing I rely on to save the day rather than the only thing.
  • You make a significant mistake in the approach and need to undo your work (autosave won’t save you with this problem).
  • You make a big mistake in Power Query. There is no undo in Power Query, so if you spend an hour inside Power Query, don’t save, and then make a mistake, there is no way to recover your work.

At the time of writing, there are no version control tools built into Power BI, so as a result it is up to you to manage backups yourself.

Read on for a few tips around backups and file management.
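
If you want to automate the manual side of this, one possible approach is a small script that copies the current file into a backup folder with a timestamp in the name. This is my own minimal sketch rather than anything from Matt's post; the function name, the naming pattern, and the example paths are assumptions.

```python
import shutil
from datetime import datetime
from pathlib import Path
from typing import Optional


def backup_pbix(source, backup_dir, now: Optional[datetime] = None) -> Path:
    """Copy `source` into `backup_dir` as <name>_<YYYYMMDD-HHMMSS><ext>.

    `now` can be supplied to make the timestamp predictable, e.g. in tests.
    """
    source = Path(source)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    stamp = (now or datetime.now()).strftime("%Y%m%d-%H%M%S")
    target = backup_dir / f"{source.stem}_{stamp}{source.suffix}"

    # copy2 also preserves file metadata such as the modification time.
    shutil.copy2(source, target)
    return target


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever your report lives.
    report = Path("Sales Report.pbix")
    if report.exists():
        print(backup_pbix(report, "backups"))
```

Because the timestamp sorts lexicographically, the newest backup is always last in a name-ordered folder listing, which beats a pile of "Copy Copy Copy" files.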

Tips for Debugging DAX Code

Ed Hansberry has no bugs, but just in case:

When trying to get your DAX measures to work correctly, there are a number of tools you can use to debug, which I will briefly mention, but not go into depth on. Instead, this post is about how to think about debugging your code. I am focusing on DAX here, but many of the methods would apply to any type of code.

Read on for a series of tips around built-in capabilities, process, and the power of conversation.

The apply() Family in R

Steven Sanderson operates over a list of operators over lists:

In this post I will talk about the use of the R functions apply(), lapply(), sapply(), tapply(), and vapply() with examples.

These functions are all designed to help users apply a function to a set of data in R, but they differ in their input and output types, as well as in the way they handle missing values and other complexities. By using the right function for your particular problem, you can make your code more efficient and easier to read.

I do prefer the purrr syntax because it’s a little easier to remember its function names versus keeping the variants of apply() straight in your mind. Even so, there’s a lot you can do with a judicious use of apply().

The Story behind Benford’s Law

John Cook gives us a dose of history and math:

In 1881, astronomer Simon Newcomb noticed something curious. The first pages in books of logarithms were dirty on the edge, while the pages became progressively cleaner in later pages. He inferred from this that people more often looked up the logarithms of numbers with small leading digits than with large leading digits.

Why might this be? One might reasonably expect the numbers that came up in work to be uniformly distributed. But as is often the case, it helps to ask “Uniform on what scale?”

Read on for a bit more of the story behind Newcomb’s Benford’s law and a just-so story about differing bases.
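
For a quick feel of the numbers, here is a small sketch using the standard statement of Benford's law, which is not spelled out in the excerpt above: if values are uniform on a logarithmic scale, the leading digit d shows up with probability log10(1 + 1/d). The function name is my own.

```python
import math


def benford_probability(digit: int) -> float:
    """Probability that the leading decimal digit equals `digit` under Benford's law."""
    if not 1 <= digit <= 9:
        raise ValueError("leading digit must be between 1 and 9")
    # log10(d + 1) - log10(d): the width of the digit's slice on a log scale.
    return math.log10(1 + 1 / digit)


if __name__ == "__main__":
    for d in range(1, 10):
        print(d, f"{benford_probability(d):.3f}")
```

A leading 1 comes out at roughly 30% and a leading 9 at under 5%, which lines up with the dirtier early pages Newcomb noticed.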

Tips for AKS Storage Provisioning

Joji Varghese gives us a hand:

In an Azure Kubernetes Service (AKS) cluster, Pods can access physical storage resources such as disks or volumes using Persistent Volumes (PV). To use these resources, Pods need to make a Persistent Volume Claim (PVC), which requests a specific amount of storage from a storage class. This claim can then be matched to an available Persistent Volume. Azure offers several storage solutions that can be used to provision Persistent Volumes in an AKS cluster.

This article will provide real-world guidance on securely using Container Storage Interface (CSI) drivers to provision Azure File Shares and Azure Blob storage in an AKS cluster.

If you’re looking at setting up Azure Kubernetes Service, give this a review.
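
To show what a claim looks like in practice, here is a minimal sketch that builds a PVC manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f` accepts alongside YAML. I'm assuming the cluster has the built-in `azurefile-csi` storage class; the claim name and size are placeholders, and the sketch does not cover any of the security guidance that Joji's article focuses on.

```python
import json


def azure_file_pvc(name: str, size_gib: int, storage_class: str = "azurefile-csi") -> dict:
    """Build a PersistentVolumeClaim manifest requesting `size_gib` GiB.

    The default storage class name is an assumption about the target cluster.
    """
    if size_gib <= 0:
        raise ValueError("size_gib must be positive")
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": name},
        "spec": {
            # Azure Files shares can be mounted read-write by many Pods at once.
            "accessModes": ["ReadWriteMany"],
            "storageClassName": storage_class,
            "resources": {"requests": {"storage": f"{size_gib}Gi"}},
        },
    }


if __name__ == "__main__":
    # Placeholder claim name and size for illustration.
    print(json.dumps(azure_file_pvc("shared-files", 5), indent=2))
```

The claim only asks for storage from the class; the CSI driver behind that class is what actually provisions the matching Persistent Volume.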
