Press "Enter" to skip to content

Category: Cloud

Refreshing a Power BI Dataset via HTTPS URL

Gilbert Quevauvilliers presses the big red button:

I have found that sometimes there are other systems that are loading data, and once they are complete they then want to refresh the Power BI Dataset.

Another way to do this is to use Power Automate, in which a system or user can request a HTTPS URL and once called that will then refresh the Power BI dataset.

I explain how to do this in the steps below.

Click through to see how to set up that job.

Comments closed

Data Validation with Great Expectations and Azure Functions

Eduard van Valkenburg does a bit of data validation:

Great Expectations (Great Expectations Home Page • Great Expectations) is a popular Python-based OSS tool to validate data that comes into your data estate. And for me, validating incoming data is best done file by file, as the files arrive! On Azure there is no better platform for that then Azure Functions. I love the serverless nature of Functions and with the triggers available for arriving blobs, as well as HTTP, event grid events, queues and others. There are some great patterns that allow you to build event-driven data architectures. We also now have the Python v2 framework for Azure Functions available, which makes the developer experience even better. So, let’s go through how to get it running.

This looks really interesting and tying it in to Azure Functions is a good idea assuming that the checks don’t run for too long.

Comments closed

Contrasting Azure IoT Hub and Event Hub

Brian Bønk lays out a quick comparison:

When working with Azure Data Explorer and loading data to the storage engine, you might have some streaming devices or services that should land in the engine.

Azure provides two out-of-the-box services:

  1. Azure IoT Hub
  2. Azure Event Hub

At first glance it seems like teh two services are doing the exact same thing – sending events through to other services in Azure. But there are some differences.

Read on to see what these differences are.

Comments closed

Best Practices Assessment for Azure Arc-Enabled SQL Server Instances

Ganapathi Varma Chekuri takes us through an assessment:

Best practices assessment provides a mechanism to evaluate the configuration of your SQL Server. Once the best practices assessment feature is enabled, your SQL Server instance and databases are scanned to provide recommendations for things like SQL Server and database configurations, index management, deprecated features, enabled or missing trace flags, statistics, etc. Assessment run time depends on your environment (number of databases, objects, and so on), with a duration from a few minutes, up to an hour.

If you’re familiar with the assessment on Azure VMs, this is quite similar, though it extends to on-premises machines or VMs running in other cloud providers. This does require installing the agent and paying for an Arc-Enabled SQL Server instance, so it’s not free.

Comments closed

Estimating and Managing Pod Spread in AKS

Joji Varghese talks pod distribution in Azure Kubernetes Service:

In Azure Kubernetes Service (AKS), the concept of pod spread is important to ensure that pods are distributed efficiently across nodes in a cluster. This helps to optimize resource utilization, increase application performance, and maintain high availability.

This article outlines a decision-making process for estimating the number of Pods running on an AKS cluster. We will look at pod distribution across designated node pools, distribution based on pod-to-pod dependencies and distribution where pod or node affinities are not specified. Finally, we explore the impact of pod spread on scaling using replicas and the role of the Horizontal Pod Autoscaler (HPA). We will close with a test run of all the above scenarios.

Read on for tips, as well as a few web tools, which you can use to estimate and control pod spread in AKS.

Comments closed

Role-Based Access Controls in Amazon OpenSearch

Scott Chang and Muthu Pitchaimani show how to assign rights in Amazon OpenSearch to IAM groups:

Amazon OpenSearch Service is a managed service that makes it simple to secure, deploy, and operate OpenSearch clusters at scale in the AWS Cloud. AWS IAM Identity Center (successor to AWS Single Sign-On) helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. To build a strong least-privilege security posture, customers also wanted fine-grained access control to manage dashboard permission by user role. In this post, we demonstrate a step-by-step procedure to implement IAM Identity Center to OpenSearch Service via native SAML integration, and configure role-based access control in OpenSearch Dashboards by using group attributes in IAM Identity Center. You can follow the steps in this post to achieve both authentication and authorization for OpenSearch Service based on the groups configured in IAM Identity Center.

Click through for the process.

Comments closed

Using the Log Replay Service to Migrate to Azure SQL MI

Rob Carrol makes a move:

The Log Replay Service (LRS) is a new Azure service that allows you to migrate your databases from SQL Server on-premises, SQL Server on Azure Virtual Machines, Amazon EC2, Amazon RDS for SQL Server, or Google Compute Engine to Azure SQL Managed Instance. LRS is a free cloud service that uses log shipping technology to enable custom migrations of databases from SQL Server 2008 through 2022.

Read on for some configuration options and tips on how to use the service.

Comments closed

Tips for Power BI Modeling with ADX

Dany Hoter shares some tips on creating star schema models with Azure Data Explorer:

Relationships between DQ tables are created as M:M by default. This is not a problem and even recommended with single direction.

Read on for several tips. What’s interesting as I read this is just how radically different the advice is for ADX utilization versus Power BI utilization, such as using strings to join dimensions to facts. That would be heresy in a Kimball-style model and is a common cause for slow-down in Power BI. Yet that’s the recommendation here for working with ADX, unless I’m misunderstanding Dany’s post.

Comments closed

Understanding Azure Cognitive Search Costs

Matt Eland doesn’t want to break the bank:

Let’s continue my recent trend in exploring pricing tips for the various parts of AI and Machine Learning on Azure with a dive into Azure Cognitive Search.

Sometimes confused with the AI offerings of Azure Cognitive Services, the entirely different Azure Cognitive Search is a rich service that allows you to index a variety of files and documents, extract meaning from those documents, and provide rich search results to users.

In this article we’ll explore the pricing structure of Azure Cognitive Search and highlight some things you should be aware of as you plan and develop your Cognitive Search resources.

Read the whole thing if you’re thinking of using Azure Cognitive Search. It’s a good service and I think the pricing model is fairly straightforward, though there are always nuances to these things.

Comments closed

Object Tagging in Snowflake

Warner Chaves tags a table:

A tag is a user-defined label that can be attached to a Snowflake object, such as a database, table, or column. Tags can categorize objects based on any criteria that you choose, such as sensitivity, business unit, project, or owner. Once tags have been applied, you can use them to control access to the tagged objects, track usage and costs, and apply policies and rules.

Now let’s apply tagging to a specific use case: identifying sensitive customer data. For example, let’s assume that you have a table in Snowflake called “customers” that contains customer information, including their addresses. We want to categorize the “address” column as sensitive so that we can apply data protection policies and controls.

Click through for a few examples of how to create tags, apply tags to database objects, and review tagged objects.

Comments closed