
Author: Kevin Feasel

Cluster-Based Image Analysis and Reduction

Sebastian Sauer takes an image and reduces it to a group of colors:

This post is a remake of this casestudy: https://fallstudien.netlify.com/fallstudie_bildanalyse/bildanalyse

brought to you by Karsten Lübke.

The main purpose is to replace the base R command that Karsten used with a more tidyverse-friendly style. I think that’s easier (for me).

We will compute a cluster analysis to find the typical RGB color per cluster.

Click through for quite a bit of R code and a couple of interesting turns.
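To make the idea concrete, here is a minimal sketch of finding one typical RGB color per cluster. It is written in Python rather than R, it is not Sebastian's or Karsten's code, and it assumes plain k-means as the clustering method (the excerpt only says "cluster analysis"). The function name `kmeans_colors`, its parameters, and the sample pixels are all made up for illustration.

```python
import random

def kmeans_colors(pixels, k, iterations=20, seed=0):
    """Group (r, g, b) tuples into k clusters and return each cluster's mean color.

    Hypothetical helper: plain k-means, assumed here for illustration only.
    """
    rng = random.Random(seed)
    centers = rng.sample(pixels, k)  # start from k randomly chosen pixels
    for _ in range(iterations):
        # Assignment step: each pixel joins the cluster with the nearest center.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(
                range(k),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centers[i])),
            )
            clusters[nearest].append(p)
        # Update step: each center moves to the mean color of its members.
        new_centers = []
        for i, members in enumerate(clusters):
            if members:
                new_centers.append(
                    tuple(sum(channel) / len(members) for channel in zip(*members))
                )
            else:
                new_centers.append(centers[i])  # keep an empty cluster's center
        if new_centers == centers:
            break  # assignments have stabilized
        centers = new_centers
    return centers

# Made-up sample data: three reddish and three bluish pixels.
sample_pixels = [
    (250, 0, 0), (255, 5, 0), (245, 0, 5),
    (0, 0, 250), (5, 0, 255), (0, 5, 245),
]
typical_colors = kmeans_colors(sample_pixels, k=2)
```

With a real image, `pixels` would be the flattened list of every pixel's RGB values, and the returned centers would be the reduced palette.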


Solving the Spelling Bee Honeycomb Puzzle

David Robinson has fun with puzzle-solving:

Solving this puzzle in R is interesting enough, but it’s particularly challenging to do so in a computationally efficient way. As much as I love the tidyverse, this, like the “lost boarding pass” puzzle and Emily Robinson’s evaluation of the best Pokémon team, serves as a great example of using R’s matrix operations to work efficiently with data.

I’ve done a lot of puzzles recently, and I realized that showing the end result isn’t a representation of my thought process. I don’t show all the dead ends and bugs, or explain why I ended up choosing a particular path. So in the same spirit as my Tidy Tuesday screencasts, I recorded myself solving this puzzle (though not the process of turning it into a blog post).

Most of the post is analysis around the problem, but you do get a viable solution as well.
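For a feel of why this kind of puzzle rewards set-style arithmetic, here is a small Python sketch. It is not David's R solution and it swaps his matrix operations for letter bitmasks. It assumes the rules as they are usually stated: a word counts only if it uses no letter outside the seven in the honeycomb and includes the center letter. The minimum length of four, the function names, and the sample words are assumptions for illustration.

```python
def letter_mask(word):
    """Encode the set of letters in a lowercase a-z word as a 26-bit integer."""
    mask = 0
    for ch in word.lower():
        mask |= 1 << (ord(ch) - ord("a"))
    return mask

def valid_words(words, center, outer, min_length=4):
    """Return the words playable on a honeycomb (hypothetical helper).

    center: the single required letter; outer: the six other letters.
    min_length is an assumed rule, not something stated in the excerpt.
    """
    allowed = letter_mask(center + outer)
    required = letter_mask(center)
    result = []
    for word in words:
        mask = letter_mask(word)
        uses_only_allowed = (mask & ~allowed) == 0
        has_center = (mask & required) != 0
        if len(word) >= min_length and uses_only_allowed and has_center:
            result.append(word)
    return result

# Made-up word list and honeycomb.
playable = valid_words(["gale", "game", "apple", "leg"], center="g", outer="aelmpx")
```

Each word is reduced to one integer once, so checking it against a honeycomb costs two bitwise operations regardless of word length.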


Understanding Implicit Conversions

Jared Poche takes us through the concept of implicit conversions:

A quick search will tell you that implicit conversions are pretty awful for performance, and in particular drive CPU usage. That’s not news. There is an aspect of this I think a lot of engineers don’t understand; why does it cause performance issues?

An implicit conversion occurs when you try to compare between or assign data across two different data types, without converting one yourself. If you used a CAST or CONVERT, it would be an explicit conversion. When SQL Server does it for you, it’s an implicit conversion, and these can have a real impact on your execution plans. Not all combinations of types can be converted implicitly, for a full list look here.

What’s interesting in this is that there were some cases where Jared expected implicit conversion and the rules indicate that there should be implicit conversion, but the database optimizer saw through his ruse.


Problems Installing SQL Server from a Mounted ISO

Randolph West recommends against installing SQL Server from a mounted ISO file:

In the old days, software used to be distributed on physical media like diskette, CD, and finally DVD. Nowadays we download and install software as standalone files. Even enterprise software like SQL Server is available for download in this way, usually distributed as an ISO file. Once you’ve downloaded it, you can mount that ISO file as a virtual drive and access it as though it was on physical media.

Or you can do the right and proper thing, and extract the ISO file to a network drive or local storage first, using a tool like 7-Zip.

I don’t think I’ve ever seen the problems Randolph refers to, though I’ll readily admit that Randolph has done quite a few more installations than I have.


Release Flow Branching and Database DevOps

Kendra Little explains why the Azure DevOps Release Flow model can work well for database activity:

But how do you use branches? It’s helpful to pick a strategy. There are many fine Git branching strategies out there, things like GitFlow and GitHub Flow and more — enough that it’s overwhelming to learn about these when you are just starting out.

The strategy that I recommend for folks who are starting out with database DevOps and Git is the Azure DevOps team Release Flow model with dedicated development databases. (Why dedicated development databases? Read more here.)

Read on to learn why.


Getting Folder Names in Power BI

Reza Rad shows us something new in Power Query:

There are times that you need to get the list of FOLDERS and not just files. Of course folder itself (without considering files in it), doesn’t contain data to be used for a report. However, sometimes, even the folder name might contain some useful information. The Get Data From Folder option in Power BI will just give you a list of files. There is a little trick that can help you to get a list of folders. Let me show you how.

There’s a fair bit to it, so click through and read the whole thing.


Refreshing Power BI from Your Outlook Calendar

Chris Webb has a nice use for Power Automate and Outlook:

The ideal way to view when multiple events are scheduled is a calendar and we’ve got great calendar functionality in Outlook. What if you could schedule refresh of your datasets from a calendar in Outlook? It turns out to be easier than you might think! Here’s how.

Read the whole thing, including Chris’s warning not to put it into production. You wouldn’t want the person with all of those calendar entries to leave your company and have things suddenly break, after all.


Reporting Queries and User-Defined Functions

Erik Darling tells a story of two things which do not get along very well:

The OLTP part was working fine, but there was a reporting element that was dog slow, and would cause all sorts of problems on the server.

When we got into things, I noticed something rather funny: All of their reporting queries had very high estimated costs, and all the plans were totally serial.

The problem came down to two functions that were used in the OLTP portion, which were reused in the reporting portion.

Read on to understand why SQL Server 2019 and its function inlining capability would not have helped here.


Installing .NET Core on a Raspberry Pi 4

Hasan Savran continues a series on Microsoft + Pi:

I have been writing about Azure IOT Hub and Raspberry Pi 4. So far, I bought a Raspberry Pi 4. I registered it as Azure IOT Edge device. Now, I am ready to write some code in Raspberry Pi. In this post, I will show you how install .NET Core 3.1 to Raspberry Pi so we can write some code to generate some data and push this data to Azure IOT Hub.

First, you need to go to the .NET Core homepage to get the latest version’s url. Following page lists all .NET Core version, 3.1 was the latest when I was writing this blog. Pick the latest one from this list.

Another route might be to install Docker on your Pi.


Streams and Tables in Apache Kafka

Michael Noll has started a four-part series. Part one serves as a primer:

In my daily work as a member of Confluent’s Office of the CTO and as the former product manager for ksqlDB and Kafka Streams, I interact with many users of Apache Kafka—be it developers, operators, or architects. Some have a stream processing or Kafka background, some have their roots in relational databases like Oracle and MySQL, and some have neither. But many of them have the same set of technical questions, such as: What’s the difference between an event stream and a database table? Is a Kafka topic the same as a stream? How can I best leverage all these pieces when I want to put my data in Kafka to use?

By the end of this series, you will have answers to each of these common questions and many more. If you are interested to learn about Kafka, I invite you to join me on this journey through Kafka’s core fundamentals!
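One of those questions, the difference between an event stream and a table, can be sketched in a few lines of Python. This is a toy model and not Kafka's API: the stream is an ordered list of (key, value) events, and the table is what you get by replaying them and keeping the latest value per key. The function name and the sample events are made up for illustration.

```python
def table_from_stream(events):
    """Replay (key, value) events in order; the latest value for each key wins.

    Toy model only: a list stands in for the stream, a dict for the table.
    """
    table = {}
    for key, value in events:
        table[key] = value
    return table

# Made-up events: alice moves from Berlin to Rome.
location_events = [
    ("alice", "Berlin"),
    ("bob", "Lima"),
    ("alice", "Rome"),
]
current_locations = table_from_stream(location_events)
```

The stream keeps the full history, including alice's time in Berlin, while the table holds only the current state for each key.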

Part 2 looks at storage fundamentals:

Part 1 of this series discussed the basic elements of an event streaming platform: events, streams, and tables. We also introduced the stream-table duality and learned why it is a crucial concept for an event streaming platform like Apache Kafka®. Here in part 2, we will take a deep dive into Kafka’s storage fundamentals. Notably, we will explore topics and—in my opinion, the most important concept in Kafka: partitions.

We’ll start with the most basic storage question: how do I store data in Kafka?
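As a rough picture of what a partitioned topic looks like, here is a Python sketch under simplifying assumptions: a topic is a list of append-only lists, and an event's key decides which partition it lands in. The names `partition_for` and `append_event` are hypothetical, and `zlib.crc32` is only a stand-in for whatever hashing scheme a real Kafka client uses.

```python
import zlib

def partition_for(key, num_partitions):
    """Map a key to a partition number (crc32 is a stand-in hash, an assumption)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def append_event(topic, key, value):
    """Append an event to the partition chosen by its key.

    Returns (partition, offset), where offset is the event's position
    within that partition.
    """
    partition = partition_for(key, len(topic))
    topic[partition].append((key, value))
    return partition, len(topic[partition]) - 1

# A toy topic with three empty partitions.
toy_topic = [[] for _ in range(3)]
first = append_event(toy_topic, "alice", "Berlin")
second = append_event(toy_topic, "alice", "Rome")
```

Because the partition depends only on the key, every event for the same key ends up in the same partition, in the order it was written.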

I’m looking forward to parts 3 and 4.
