
Day: March 16, 2023

Data Pipelines and Data Mesh

Jean-Georges Perrin answers a burning question:

I keep having questions about data pipelines. Data pipelines in Data Mesh is a topic I should tackle. So… Is the data pipeline the root of all evil?

Jean-Georges’s answer is quite in line with one of my favorite phrases: “Short answer: no, with an ‘if’; long answer: yes, with a ‘but.’” Read on for some thoughts on data pipelines and what the data mesh concept does to minimize harm.


Creating an Elasticsearch Pipeline

The Big Data in Real World team builds a pipeline:

A pipeline is a definition of a series of processors that are to be executed in the same order as they are declared. 

Think of a processor as a series of instructions that will be executed.

In this post we are going to create a pipeline to add a field named doc_timestamp to all the documents that are added to the index.

Click through for the process. In Elasticsearch, ingest pipelines aren’t for moving data but rather for performing some common operations or tasks prior to indexing the data.
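For a rough idea of what that looks like, here is a minimal sketch of my own rather than the code from the linked post. It assumes a `set` processor that copies the ingest timestamp into doc_timestamp; the pipeline name add_doc_timestamp and the localhost URL are placeholders. It only builds the PUT request and does not send it.

```python
import json
from urllib.request import Request


def build_pipeline_request(base_url, pipeline_name):
    """Build (but do not send) a PUT request that defines an ingest pipeline."""
    body = {
        "description": "Add doc_timestamp to each incoming document",
        "processors": [
            # Assumption: a `set` processor fed by the ingest metadata timestamp.
            {"set": {"field": "doc_timestamp", "value": "{{_ingest.timestamp}}"}}
        ],
    }
    return Request(
        url=f"{base_url}/_ingest/pipeline/{pipeline_name}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


# Placeholder host and pipeline name; swap in your own cluster details.
request = build_pipeline_request("http://localhost:9200", "add_doc_timestamp")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

To have the pipeline run for every document added to an index, one option is to set it as that index's `index.default_pipeline`; the linked post may take a different route.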


Role-Based Access Controls in Amazon OpenSearch

Scott Chang and Muthu Pitchaimani show how to assign rights in Amazon OpenSearch to IAM groups:

Amazon OpenSearch Service is a managed service that makes it simple to secure, deploy, and operate OpenSearch clusters at scale in the AWS Cloud. AWS IAM Identity Center (successor to AWS Single Sign-On) helps you securely create or connect your workforce identities and manage their access centrally across AWS accounts and applications. To build a strong least-privilege security posture, customers also wanted fine-grained access control to manage dashboard permission by user role. In this post, we demonstrate a step-by-step procedure to implement IAM Identity Center to OpenSearch Service via native SAML integration, and configure role-based access control in OpenSearch Dashboards by using group attributes in IAM Identity Center. You can follow the steps in this post to achieve both authentication and authorization for OpenSearch Service based on the groups configured in IAM Identity Center.

Click through for the process.
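As a hedged illustration of the authorization half, the sketch below builds the kind of role-mapping payload the OpenSearch security plugin accepts, assuming the IAM Identity Center group IDs arrive as backend roles through the SAML assertion. The role name and group ID are hypothetical placeholders, and this is my own sketch rather than the authors' procedure.

```python
import json


def build_role_mapping(role_name, group_ids):
    """Return the REST path and body that map backend roles to a security role."""
    path = f"/_plugins/_security/api/rolesmapping/{role_name}"
    # De-duplicate and sort so the payload is stable across runs.
    body = {"backend_roles": sorted(set(group_ids))}
    return path, body


# Hypothetical role name and group ID, for illustration only.
path, body = build_role_mapping(
    "dashboards_read_only",
    ["00000000-0000-0000-0000-000000000000"],
)
print("PUT", path)
print(json.dumps(body, indent=2))
```

The same mapping can also be made through the Security pages in OpenSearch Dashboards.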


SQL Server 2022 CU2 Released

Srinivas Kandibanda shares the news:

The 2nd cumulative update release for SQL Server 2022 RTM is now available for download at the Microsoft Downloads site. Please note that registration is no longer required to download Cumulative updates.

Click through for a link to get the latest CU, as well as a link leading to notes on what’s in it. One interesting PolyBase-related note is that SQL Server 2022 CU2 finally supports using TNS files when connecting to Oracle databases. That was the norm the last time I semi-seriously used Oracle (quite a while ago), but for PolyBase, you had to specify all connection details separately.


Tips for Power BI Modeling with ADX

Dany Hoter shares some tips on creating star schema models with Azure Data Explorer:

Relationships between DQ tables are created as M:M by default. This is not a problem and even recommended with single direction.

Read on for several tips. What’s interesting as I read this is just how radically different the advice is for ADX versus what I’d normally expect for Power BI, such as using strings to join dimensions to facts. That would be heresy in a Kimball-style model and is a common cause of slowdowns in Power BI. Yet that’s the recommendation here for working with ADX, unless I’m misunderstanding Dany’s post.


Using the Log Replay Service to Migrate to Azure SQL MI

Rob Carrol makes a move:

The Log Replay Service (LRS) is a new Azure service that allows you to migrate your databases from SQL Server on-premises, SQL Server on Azure Virtual Machines, Amazon EC2, Amazon RDS for SQL Server, or Google Compute Engine to Azure SQL Managed Instance. LRS is a free cloud service that uses log shipping technology to enable custom migrations of databases from SQL Server 2008 through 2022.

Read on for some configuration options and tips on how to use the service.
