Press "Enter" to skip to content

July 22, 2021

ksqlDB 0.19.0 Released

Tom Nguyen announces a new version of ksqldb:

ksqlDB 0.19.0 adds support for foreign-key joins between tables. Data decomposition into multiple tables (i.e., schema normalization) is a key strength of the relational data model and often requires joining tables based on a foreign key. So far, we have been able to provide tools for normalizing data, provided the rows in each of the tables followed a one-to-one relationship (i.e., had the same primary key).

Providing built-in support for foreign-key joins, which was previously only possible to do through workarounds, unlocks many new use cases where you’d like to have a many-to-one relationship between your tables. This is a highly demanded feature, and we are excited to finally make it available.

Click through to see what else they’ve included.
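
To give a sense of what the new syntax looks like, here’s a minimal sketch in Python that submits a foreign-key table-table join through ksqlDB’s REST /ksql endpoint. The orders and customers tables, their columns, and the server URL are assumptions for illustration, not something taken from the release notes.

# Submit a foreign-key join (new in ksqlDB 0.19.0) via the /ksql REST endpoint.
# Assumes hypothetical ORDERS and CUSTOMERS tables already exist, with ORDERS
# keyed on ORDER_ID, CUSTOMERS keyed on ID, and ORDERS carrying a CUSTOMER_ID
# foreign-key column.
import requests

KSQLDB_URL = "http://localhost:8088/ksql"  # assumed local ksqlDB server

statement = """
CREATE TABLE orders_enriched AS
  SELECT o.order_id, o.total, c.name
  FROM orders o
  JOIN customers c ON o.customer_id = c.id;
"""

response = requests.post(
    KSQLDB_URL,
    json={"ksql": statement, "streamsProperties": {}},
    headers={"Accept": "application/vnd.ksql.v1+json"},
)
response.raise_for_status()
print(response.json())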

Feeding Data from Kafka into Splunk

Guy Shilo performs a bit of data migration:

Kafka Connect is a framework that uses Kafka topics for collecting data from various sources and distributing it to different sinks. It comes bundled with the Kafka installation but can run independently from the Kafka brokers and access them remotely. Here is an explanation of what Kafka Connect is and its architecture. It is also a good candidate for running on Kubernetes, since it only uses outgoing communication.

The framework uses plugins to be able to talk to different sources and sinks. There are many ready-made plugins for a variety of systems. Some of them are free and some are licensed by companies like Confluent or Debezium. Many of them can be found here. Some systems can be a source of data, some can be a sink, and some can be both. Basically, a source adapter polls the source system for changes, pulls the data, and publishes it to a Kafka topic. A sink adapter subscribes to a Kafka topic, gets incoming events, and exports them into the target system.

As I mentioned, there are several dozen supported adapters. Just for the demonstration, we will capture events from a Kafka topic and store them in Splunk for visualization and investigation.

Click through to see how it all fits together.
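
If you want a rough idea of what the Splunk side looks like in code, here’s a hedged sketch that registers a Splunk sink connector through the Kafka Connect REST API. The connector class and splunk.hec.* settings follow Splunk Connect for Kafka’s documented configuration as I understand it, but the hostnames, token, topic, and index are placeholders; verify the property names against the plugin version you install.

# Register a Splunk sink connector with a Kafka Connect worker's REST API.
import requests

CONNECT_URL = "http://connect:8083/connectors"  # assumed Connect worker address

connector = {
    "name": "splunk-sink",
    "config": {
        "connector.class": "com.splunk.kafka.connect.SplunkSinkConnector",
        "tasks.max": "1",
        "topics": "events",                       # Kafka topic to export
        "splunk.hec.uri": "https://splunk:8088",  # Splunk HTTP Event Collector
        "splunk.hec.token": "<hec-token>",
        "splunk.indexes": "main",
    },
}

resp = requests.post(CONNECT_URL, json=connector)
resp.raise_for_status()
print(resp.json())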

Querying Private Blob Storage Containers with Azure Synapse Analytics

Dennes Torres looks at some private information:

The queries from the previous article were made against the public container in the blob storage. However, if the container is private, you will need to authenticate with the container. In this article, you’ll learn how to query private blob storage with SQL.

NOTE: Be sure that the Azure Synapse Workspace and the storage account with the sample files are set up before following along with this article. You will also need to substitute your own storage account URL wherever a storage account URL is used in the article.

There are three possible authentication methods, and these methods may have some variation according to the type of storage account and the access configuration. I will not dig into details about storage here and leave that for a future article.

Read on for the three authorization methods and a lot of detail on using SAS tokens (the preferred method) to access this data.
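
As a rough illustration of the SAS-token route, here’s a Python sketch driving Synapse serverless SQL through pyodbc: create a database scoped credential from a SAS token, point an external data source at the private container, and query a file with OPENROWSET. The workspace, storage account, container, file, and object names are placeholders, and the database needs a master key before a scoped credential can be created.

# Query a private blob container from Synapse serverless SQL using a SAS token.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<workspace>-ondemand.sql.azuresynapse.net;"
    "Database=<database>;"
    "Authentication=ActiveDirectoryInteractive;",
    autocommit=True,
)
cursor = conn.cursor()

# Credential holding the SAS token (requires an existing database master key).
cursor.execute("""
    CREATE DATABASE SCOPED CREDENTIAL SasCredential
    WITH IDENTITY = 'SHARED ACCESS SIGNATURE',
         SECRET = '<sas-token-without-leading-question-mark>';
""")

# External data source pointing at the private container, using the credential.
cursor.execute("""
    CREATE EXTERNAL DATA SOURCE PrivateBlob
    WITH (
        LOCATION = 'https://<storageaccount>.blob.core.windows.net/<container>',
        CREDENTIAL = SasCredential
    );
""")

# Read a file from the now-accessible container.
for row in cursor.execute("""
    SELECT TOP 10 *
    FROM OPENROWSET(
        BULK 'sample.csv',
        DATA_SOURCE = 'PrivateBlob',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        HEADER_ROW = TRUE
    ) AS rows;
"""):
    print(row)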

Combining Change Data Capture with Azure Data Factory

Reitse Eskens continues a series on learning Azure Data Factory:

In my last blog, I pulled all the data from my table to my data lake storage. But when data changes, I don’t want to perform a full load every time. Because it’s a lot of data, it takes time, and somewhere down the line I’ll have to separate the changed rows from the identical ones. Instead of doing full loads every night or day or hour, I want to use a delta load. My pipeline should transfer only the new and changed rows. Very recently, Azure SQL DB finally added the option to enable Change Data Capture. This means that after a full load, I can get only the changed records. And by changed records, I mean the new ones, the updated ones, and the deleted ones.

Let’s find out how that works.

Read on for the article and demonstration.
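
For context on the Change Data Capture half of this pattern, here’s a minimal sketch in Python: enable CDC on an Azure SQL DB table and read all changes between two log sequence numbers. The dbo.Sales table is a placeholder, and in practice an ADF pipeline would sit on top of a query like the last one rather than a script.

# Enable CDC and read the changed rows since the earliest captured change.
import pyodbc

conn = pyodbc.connect(
    "Driver={ODBC Driver 17 for SQL Server};"
    "Server=<server>.database.windows.net;Database=<database>;"
    "UID=<user>;PWD=<password>;",
    autocommit=True,
)
cursor = conn.cursor()

# One-time setup: enable CDC at the database level, then per table.
cursor.execute("EXEC sys.sp_cdc_enable_db;")
cursor.execute("""
    EXEC sys.sp_cdc_enable_table
         @source_schema = N'dbo',
         @source_name   = N'Sales',
         @role_name     = NULL;
""")

# Incremental read: inserts, updates, and deletes between two LSNs.
rows = cursor.execute("""
    DECLARE @from_lsn binary(10) = sys.fn_cdc_get_min_lsn('dbo_Sales');
    DECLARE @to_lsn   binary(10) = sys.fn_cdc_get_max_lsn();
    SELECT *
    FROM cdc.fn_cdc_get_all_changes_dbo_Sales(@from_lsn, @to_lsn, N'all');
""").fetchall()
print(len(rows), "changed rows since the last full load")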

Differences in Logging between Azure Analysis Services and Power BI PPU

Gilbert Quevauvilliers continues a series on migrating from Azure Analysis Services to Power BI Premium Per User:

Another important aspect of managing datasets is being able to log and monitor performance. In this blog post I am going to compare the logging between Azure Analysis Services (AAS) and Power BI Premium Per User (PPU).

With the recent release of Log Analytics integration for PPU, it is a lot easier to compare the logging options between AAS and PPU.

This is an area where there’s still a bit of a gap. Click through to see what the differences look like today.
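
If you do wire PPU up to Log Analytics, queries against the logs can be automated as well. Here’s a hedged sketch using the azure-monitor-query SDK; the workspace ID is a placeholder, and the PowerBIDatasetsWorkspace table and column names should be verified against what the integration actually writes in your tenant.

# Pull recent Power BI dataset query activity out of a Log Analytics workspace.
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import LogsQueryClient

client = LogsQueryClient(DefaultAzureCredential())

kql = """
PowerBIDatasetsWorkspace
| where OperationName == 'QueryEnd'
| summarize avg(DurationMs), count() by ArtifactName
| order by count_ desc
"""

result = client.query_workspace(
    workspace_id="<log-analytics-workspace-id>",
    query=kql,
    timespan=timedelta(days=1),
)

for table in result.tables:
    for row in table.rows:
        print(row)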

Data Platform Deployments via Azure Test Plans

Kevin Chant shows off the power of Azure Test Plans:

In this post I want to cover using Azure Test Plans for Data Platform deployments, because using it to manage test plans can be very useful.

By the end of this post, you will know what Azure Test Plans is and how it can be useful for Data Platform deployments.

Click through to see how this feature in Azure DevOps works and how you can use it to test your deployments.
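
One way to fold this into automation (not something the article covers, just a sketch) is the Azure DevOps Test Plans REST API, so that each data platform release can get its own plan. The organization, project, personal access token, and api-version here are assumptions to check against your own setup.

# Create a test plan for a release via the Azure DevOps Test Plans REST API.
import requests

ORG = "<organization>"
PROJECT = "<project>"
PAT = "<personal-access-token>"

url = f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/testplan/plans?api-version=6.0"

resp = requests.post(
    url,
    json={"name": "Data platform release - deployment checks"},
    auth=("", PAT),  # PAT supplied as the password for basic auth
)
resp.raise_for_status()
plan = resp.json()
print(plan["id"], plan["name"])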

Content Sharing with Power BI

Marc Lelijveld continues a series on going from small-scale to enterprise with Power BI:

Let’s start with the most important feature of the Power BI Service: sharing content! At the same time, this can be one of the most challenging, especially since there are many ways to share content in Power BI. In my experience in enterprise organizations, I have seen various ways of sharing content. Below, I explain the different options and conclude with my personal best practice.

But why is the way we share content so important in relation to large enterprise solutions? Well, I believe that all centrally managed solutions should match (organizational) best practices. The way the content is made available to users is one of these best practices. It helps end users find the content they are looking for, always in the same consistent location.

Read on for several techniques.
