Press "Enter" to skip to content

Month: March 2023

Working with Remote Jupyter Books in Azure Data Studio

Steve Hughes reaches across the internet:

When working with Azure Data Studio and its support of Jupyter books, you will find there is an option for remote Jupyter books. As shown in the image below, you can open that Jupyter book and follow through the dialogue for a couple of Microsoft books that are readily available.

Click through to see how this option differs from standard Jupyter books (which are themselves different from Jupyter notebooks) and how you can create one.

Leave a Comment

Data Pipelines and Data Mesh

Jean-Georges Perrin answers a burning question:

I keep having questions about data pipelines. Data pipelines in Data Mesh is a topic I should tackle. So… Is the data pipeline the root of all evil?

Jean-Georges’s answer is quite in line with one of my favorite phrases: “Short answer: no, with an ‘if’; long answer: yes, with a ‘but.'” Read on for some thoughts on data pipelines and what the data mesh concept does to minimize harm.

Leave a Comment

Creating an Elasticsearch Pipeline

The Big Data in Real World team builds a pipeline:

A pipeline is a definition of a series of processors that are to be executed in the same order as they are declared. 

Think of a processor as a series of instructions that will be executed.

In this post we are going to create a pipeline to add a field named doc_timestamp to all the documents that are added to the index.

Click through for the process. In Elasticsearch, ingest pipelines aren’t for moving data but rather for performing some common operations or tasks prior to indexing the data.

Leave a Comment

Role-Based Access Controls in Amazon OpenSearch

Scott Chang and Muthu Pitchaimani show how to assign rights in Amazon OpenSearch to IAM groups:

Amazon OpenSearch Service is a managed service that makes it simple to secure, deploy, and operate OpenSearch clusters at scale in the AWS Cloud. AWS IAM Identity Center (successor to AWS Single Sign-On) helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. To build a strong least-privilege security posture, customers also wanted fine-grained access control to manage dashboard permission by user role. In this post, we demonstrate a step-by-step procedure to implement IAM Identity Center to OpenSearch Service via native SAML integration, and configure role-based access control in OpenSearch Dashboards by using group attributes in IAM Identity Center. You can follow the steps in this post to achieve both authentication and authorization for OpenSearch Service based on the groups configured in IAM Identity Center.

Click through for the process.

Leave a Comment

SQL Server 2022 CU2 Released

Srinivas Kandibanda shares the news:

The 2nd cumulative update release for SQL Server 2022 RTM is now available for download at the Microsoft Downloads site. Please note that registration is no longer required to download Cumulative updates.

Click through for a link to get the latest CU, as well as a link leading to notes on what’s in it. One interesting PolyBase-related note is that SQL Server 2022 CU2 finally supports using TNS files when connecting to Oracle databases. That was the norm the last time I semi-seriously used Oracle (quite a while ago), but for PolyBase, you had to specify all connection details separately.

Leave a Comment

Tips for Power BI Modeling with ADX

Dany Hoter shares some tips on creating star schema models with Azure Data Explorer:

Relationships between DQ tables are created as M:M by default. This is not a problem and even recommended with single direction.

Read on for several tips. What’s interesting as I read this is just how radically different the advice is for ADX utilization versus Power BI utilization, such as using strings to join dimensions to facts. That would be heresy in a Kimball-style model and is a common cause for slow-down in Power BI. Yet that’s the recommendation here for working with ADX, unless I’m misunderstanding Dany’s post.

Leave a Comment

Using the Log Replay Service to Migrate to Azure SQL MI

Rob Carrol makes a move:

The Log Replay Service (LRS) is a new Azure service that allows you to migrate your databases from SQL Server on-premises, SQL Server on Azure Virtual Machines, Amazon EC2, Amazon RDS for SQL Server, or Google Compute Engine to Azure SQL Managed Instance. LRS is a free cloud service that uses log shipping technology to enable custom migrations of databases from SQL Server 2008 through 2022.

Read on for some configuration options and tips on how to use the service.

Leave a Comment

Power BI Group By Columns

Marco Russo and Alberto Ferrari bundle things together:

In Power BI you can specify the unique identifier of a column value by using another column or another set of columns. This feature is currently used by the Fields Parameter feature in Power BI, but it may also be used for other purposes in a model. However, there are several limitations – such as the incompatibility with MDX queries – that reduce one’s ability to use Group By Columns property in many scenarios, so it cannot be used with Excel as a client.

Read on to learn more about how grouping works in Power BI and some of the limitations.

Leave a Comment

QA Refreshes via CI/CD

Hiram Fleitas rebuilds the QA environment:

In this post I am going to cover how to automatically refresh a lower environment commonly used for testing as part of your release (CD) pipeline.

Well, why? – you may be asking.

  1. In some cases, developers and testers need to test their application code-changes against a fresh copy of production-like data. This helps them do validations prior to publishing their changes to production where their apps are bombarded by end-user live workloads.
  2. Also, the lower environment may be used for testing, and we can’t overwrite the test data constantly. It needs to be a hot-standby refresh, made available when necessary.

Click through for notes on the process.

Leave a Comment

Investigating Powershell Object Members

Jeffrey Hicks wants to know what he can do:

A few weeks ago, I was working on content for a new PowerShell course for Pluralsight. The subject was objects. We all know the importance of working with objects in PowerShell. Hopefully, you also know that the output you get on your screen from running a PowerShell command is not the whole story. Formatted presentation is separate from the underlying objects in the pipeline. That’s why it is important to know how to use Get-Member to discover how an object is defined.

But, as Jeffrey points out, this doesn’t work for static members. Read on to learn what does.

Leave a Comment