Month: July 2024

Updates in Apache Kafka 3.8

Josep Prat announces a slew of changes:

We are proud to announce the release of Apache Kafka 3.8.0. This release contains many new features and improvements. This blog post will highlight some of the more prominent features. For a full list of changes, be sure to check the release notes.

See the Upgrading to 3.8.0 from any version 0.8.x through 3.7.x section in the documentation for the list of notable changes and detailed upgrade steps.

This also puts Kafka one step closer to getting rid of its ZooKeeper dependency altogether.

Comments closed

Using Semantic Model Scale Out as Part of Power BI Refresh

Chris Webb keeps the lights on during a refresh:

In my recent posts on the Command Memory Limit error and the partialBatch mode for Power BI semantic model refresh, I mentioned that one way to avoid memory errors when refreshing large semantic models was to run a refresh of type clearValues followed by a full refresh – but that the downside of doing this was that your model would not be queryable until the full refresh had completed. Immediately afterwards some of my colleagues (thank you Alex and Akshai) pointed out that there was in fact a way to ensure a model remained queryable while using this technique: using Semantic Model Scale Out. How? Let me explain…

Click through for that explanation.

A Visual Explanation of Row Context in DAX

Marco Russo and Alberto Ferrari get visual:

Row context is the second fundamental concept in writing DAX code. In a previous article, we introduced the first concept – the filter context – using a visual approach. In this article, we rely on graphical visualization to describe a row context.

This article provides a different perspective on a topic already discussed in other row context articles: read them to get more insights about this important concept for DAX.

Click through for a great primer on the topic.

Linux Memory Overcommit and PostgreSQL

Laurenz Albe shares a warning:

Linux tries to conserve memory resources. When you request a chunk of memory from the kernel, Linux does not immediately reserve that memory for you. All you get is a pointer and the promise that you can use the memory at the destination. The kernel allocates the memory only when you actually use it. That way, if you request 1MB of memory, but use only half of it, the other half is never allocated and is available for other processes (or the kernel page cache).

Overbooking is a concept that airlines all over the world have been using for a long time. They sell more seats than are actually in the plane. From experience, airlines know that some passengers don’t show up for the flight, so overbooking allows them to make more profit. By default, Linux does the same: it deals out (“commits”) more memory than is actually available in the machine, in the hope that not all processes will use all the memory they allocate. This memory overcommit is great for using resources efficiently, but it has one problem: what if all the flight passengers show up, that is, what if the processes actually use more memory than is available? After all, you cannot offer a computer process a refund or a free overnight hotel room.

Read on to learn more about memory overcommit and what you can do about it.
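As a rough illustration of the overbooking idea, the sketch below compares two counters that Linux exposes in `/proc/meminfo`: `Committed_AS` (memory promised to processes) and `CommitLimit` (the ceiling the kernel enforces only when `vm.overcommit_memory` is set to 2). It is not taken from the linked post; the helper names and the sample numbers are assumptions made for the example.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer kB values."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        # Keep only lines of the form "Name:   12345 kB".
        if fields and fields[0].isdigit():
            values[name.strip()] = int(fields[0])
    return values


def commit_ratio(text):
    """Return committed memory as a fraction of the commit limit."""
    info = parse_meminfo(text)
    return info["Committed_AS"] / info["CommitLimit"]


# Made-up sample values, used so the example runs on any platform.
SAMPLE = """MemTotal:       16384000 kB
CommitLimit:     8192000 kB
Committed_AS:   12288000 kB
"""

ratio = commit_ratio(SAMPLE)
```

On a Linux machine you could pass the contents of `/proc/meminfo` instead of `SAMPLE`. A ratio above 1 means the kernel has promised more memory than the limit it would enforce in strict mode, which is the "more tickets than seats" situation described above.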

Security Options in Microsoft Fabric Warehouses

Koen Verbeeck locks things down:

We are implementing a data analytics solution in Microsoft Fabric. A warehouse is used for the gold layer, and we want to give users access to the data. However, by sharing the warehouse, they can read all the data in all the tables. Some data is sensitive, and only users with the correct permissions should be able to view it. Is it possible to implement more granular access control to the data?

Read on for the answer, as well as an important note on how users might be able to circumvent your permissions settings.

Microsoft Fabric GitHub Integration Security Considerations

Kevin Chant covers a bit of security:

I know the option to work with GitHub has got a lot of people excited. Which is why I wanted to share my initial thoughts about security with you all. Because a lot of things have come to mind whilst testing this.

I want to highlight immediate implications and options before you all get too involved with testing. To make sure you test working with GitHub safely.

Plus, this post is really useful for those of you looking to test this in a regulated GitHub Enterprise environment. Because it will allow you to explain things to your GitHub administrators better, and/or forward them this post. To explain what you want to achieve.

Read on for Kevin’s thoughts on the matter.

Optimistic Locking in Postgres

Semab Tariq explains how optimistic locking works in PostgreSQL:

Concurrency control in databases ensures that multiple transactions can occur simultaneously without causing data errors. It’s essential because, without it, two people updating the same information at the same time could lead to incorrect or lost data. There are different ways to manage this, including optimistic locking and pessimistic locking. Optimistic locking assumes that conflicts are rare and only checks for them when updating data. In contrast, pessimistic locking assumes conflicts are likely and locks data early to prevent issues. Optimistic locking allows for more concurrent transactions and better performance in systems with fewer conflicts.

Read on to learn more about it, including some patterns for best use and what to avoid.
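One common way to implement optimistic locking is a version column that every writer checks at update time. The sketch below is a minimal illustration and not necessarily the approach in the linked article: it uses Python's built-in `sqlite3` module so that it runs without a server, and the `accounts` table and function names are hypothetical. The same `UPDATE ... WHERE version = ?` pattern applies in PostgreSQL.

```python
import sqlite3


def make_db():
    """Create an in-memory database with one versioned row."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "id INTEGER PRIMARY KEY, "
        "balance INTEGER NOT NULL, "
        "version INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO accounts VALUES (1, 100, 1)")
    conn.commit()
    return conn


def read_account(conn, account_id):
    """Return (balance, version) without taking any lock."""
    return conn.execute(
        "SELECT balance, version FROM accounts WHERE id = ?",
        (account_id,),
    ).fetchone()


def update_balance(conn, account_id, new_balance, expected_version):
    """Write only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE accounts SET balance = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_balance, account_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when the version no longer matches: a conflict.
    return cur.rowcount == 1
```

Two writers that both read version 1 can then race: the first update succeeds and bumps the version to 2, while the second matches no rows and returns `False`, at which point the caller would typically re-read the row and retry.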

AutoML in Python with TPOT

Abid Ali Awan gives us a primer on TPOT:

AutoML is a tool designed for both technical and non-technical experts. It simplifies the process of training machine learning models. All you have to do is provide it with the dataset, and in return, it will provide you with the best-performing model for your use case. You don’t have to code for long hours or experiment with various techniques; it will do everything on its own for you.

In this tutorial, we will learn about AutoML and TPOT, a Python AutoML tool for building machine learning pipelines. We will also learn to build a machine learning classifier, save the model, and use it for model inference.

Click through to see an example of how to use the library.

FabricRestClient and Long-Running Operations

Sandeep Pawar has a public service announcement:

I want to thank Michael Kovalsky for pointing out that FabricRestClient in Semantic Link supports (since v 0.7.5) Long Running Operation (LRO).

LRO support allows the client to wait for the request to process without being blocked. Without LRO support, you will get a 202 response code saying the request is being processed. You need to submit another request based on the url returned to get the result. With LRO support, FabricRestClient will wait 20s and give you the result back.

Click through to see what you’d need to do to enable it, as well as the benefit you can receive.
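To make the "without LRO support" case concrete, here is a generic sketch of the manual pattern: keep requesting the returned URL while the server answers 202. It does not use Semantic Link or any real Fabric endpoint. The dictionary-shaped responses, the `fetch` callable, and the fake server are all hypothetical stand-ins for real HTTP calls.

```python
import time


def wait_for_result(first_response, fetch, poll_seconds=1.0,
                    max_polls=20, sleep=time.sleep):
    """Poll the returned URL until the status is no longer 202.

    Responses are plain dicts with a "status" key and, while the
    operation is pending, a "location" key holding the URL to poll.
    """
    response = first_response
    polls = 0
    while response["status"] == 202:
        if polls >= max_polls:
            raise TimeoutError("operation still pending after polling")
        sleep(poll_seconds)
        response = fetch(response["location"])
        polls += 1
    return response


def make_fake_fetch(pending_polls):
    """Build a stand-in for an HTTP GET that stays pending for a while."""
    state = {"calls": 0}

    def fetch(url):
        state["calls"] += 1
        if state["calls"] <= pending_polls:
            return {"status": 202, "location": url}
        return {"status": 200, "body": "done"}

    return fetch, state
```

A client with built-in LRO support runs a loop like this on the caller's behalf, which is why a single call can hand back the final result instead of the initial 202.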
