Author: Kevin Feasel

Testing Message Ordering in Kafka

Francesco Tisiot puts a claim to the test:

One of Apache Kafka®’s best-known mantras is “it preserves the message ordering per topic-partition”, but is it always true? In this blog post we’ll analyze a few real scenarios where accepting the dogma without questioning it could result in unexpected, and erroneous, sequences of messages.

There’s a lot more to this than I realized, and Francesco does a great job of explaining it.
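
A toy model (not the Kafka client) can make one commonly cited reordering scenario concrete: several batches are in flight at once, and a batch that fails is retried after a later batch has already been appended. This is a sketch under those assumptions and is not necessarily one of the scenarios the linked post covers; the function name, the `max_in_flight` parameter, and the set of failing batches are all hypothetical.

```python
def append_with_retries(batches, fail_first_attempt, max_in_flight):
    """Simulate appends to a single partition log (toy model, not Kafka itself).

    Batches are sent in windows of `max_in_flight`. A batch listed in
    `fail_first_attempt` fails once and is retried only after the other
    batches in its window have been appended.
    """
    log = []
    for start in range(0, len(batches), max_in_flight):
        window = batches[start:start + max_in_flight]
        retries = []
        for batch in window:
            if batch in fail_first_attempt:
                retries.append(batch)   # first attempt fails, retry later
            else:
                log.append(batch)       # first attempt succeeds
        log.extend(retries)             # retried batches land after the rest
    return log


if __name__ == "__main__":
    batches = ["m1", "m2", "m3", "m4"]
    print(append_with_retries(batches, {"m1"}, max_in_flight=2))
    print(append_with_retries(batches, {"m1"}, max_in_flight=1))
```

With two batches in flight, the retried `m1` ends up behind `m2`; with one in flight, order is kept. In the real producer, the related settings are `max.in.flight.requests.per.connection`, `retries`, and `enable.idempotence`.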

Alt Text in R

Nicola Rennie looks at different ways to incorporate alt text in R-based images:

Alt text (short for alternative text) is text that describes the appearance and purpose of an image. Alt text has multiple purposes, the main one being that it aids visually impaired users to better understand your content when the alt text is read aloud by screen readers. Alt text is also used in place of an image if it fails to load, which means that users with poor internet connection are more likely to be able to engage with your content.

The ggplot2 example was an interesting one, as I hadn’t ever added alt text to an image there.

Model Deployment using Azure Functions

Alexander Billington needs to get that new model out:

Deploying machine learning (ML) models into production can be challenging, as it requires careful consideration of various factors such as scalability, reliability, and maintainability. While developing an ML model is an exciting process, deploying it into production can be a daunting task. The challenges faced in productionising data science projects can range from infrastructure to version control, model monitoring to integration with other systems. This blog will take a look at how Azure Functions can simplify the deployment process, getting models into production quickly and robustly to maximise their value.

I like this approach and find it interesting, as most of the time, the MLOps model Microsoft recommends has you scheduling Azure DevOps pipelines / GitHub Actions periodically or when new training data hits a specific folder. If you have some non-standard trigger for an action, this is a good way to get you going.

Collaborating with External Individuals in Power BI

Marc Lelijveld talks to the outside world:

Let’s imagine you’re running a (fictive) company, and you’re short on data & analytics experts. Therefore, you decide to hire in expertise, and they will help you build your Power BI reports. As an employee of this organization, you would rather have them start sooner than later. But… if you need to request accounts for them at your IT organization, it might take weeks, if not a month, to properly set up and run through this process. But what alternatives do you have?

In this blog I will further elaborate on the important things you should think about when working with Externals in Power BI. This blog is based on the session I’ve presented at the Dutch Power BI community day and at SQL Bits 2023 on the same topic together with my colleague Odeta Jankaitienė.

Click through for some of the important decisions you’ll need to make along the way.

Updating SQL Server Containers on Kubernetes

Amit Khandelwal rolls out some updates:

I’m sure you’ve thought about how to update SQL Server containers running on a Kubernetes cluster at some point. So, this blog attempts to answer the question. According to the Kubernetes documentation, there are two Update strategies for statefulset workloads. For your convenience, I’m quoting the summary below:

  1. OnDelete update: When a StatefulSet’s .spec.updateStrategy.type is set to OnDelete, the StatefulSet controller will not automatically update the Pods in a StatefulSet. Users must manually delete Pods to cause the controller to create new Pods that reflect modifications made to a StatefulSet’s .spec.template.
  2. Rolling update: When a StatefulSet’s .spec.updateStrategy.type is set to RollingUpdate, the StatefulSet controller will delete and recreate each Pod in the StatefulSet. It will proceed in the same order as Pod termination (from the largest ordinal to the smallest), updating each Pod one at a time. This is the default update strategy.

Read the whole thing to learn how these two strategies of updating containers work.
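
To make the difference between the two strategies concrete, here is a small simulation of which Pods get recreated and in what order. It is a sketch of the behavior described in the quoted summary, not a call into Kubernetes; the function name, the `mssql-N` Pod names, and the `deleted` argument are hypothetical.

```python
def simulate_update(pod_names, strategy, deleted=()):
    """Return the Pods that get recreated from the new template, in order.

    RollingUpdate: every Pod, from the largest ordinal down to the smallest.
    OnDelete: only the Pods a user deleted by hand, in the order deleted.
    """
    def ordinal(name):
        # StatefulSet Pods are named <set-name>-<ordinal>
        return int(name.rsplit("-", 1)[1])

    if strategy == "RollingUpdate":
        return sorted(pod_names, key=ordinal, reverse=True)
    if strategy == "OnDelete":
        return [name for name in deleted if name in pod_names]
    raise ValueError(f"unknown update strategy: {strategy}")


if __name__ == "__main__":
    pods = ["mssql-0", "mssql-1", "mssql-2"]
    print(simulate_update(pods, "RollingUpdate"))
    print(simulate_update(pods, "OnDelete", deleted=["mssql-1"]))
```

Under `OnDelete`, Pods that nobody deletes keep running the old template, which is what lets you choose when each SQL Server instance restarts.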

Building a Shiny App to Show Star Maps

Benjamin Smith builds a UI:

Recently, I released an R package called starBliss that aimed to replicate the output of an e-commerce site called MapsForMoments – a site which lets users order custom prints of the night sky on the date of their choosing (usually a special occasion such as a birthday, first date, wedding etc.) and allows them to choose a style, and add some custom text. It was a great experience getting to build the package which replicated the MapsForMoments product and I was shocked to see how well it was received when I posted about it, with the GitHub repository receiving over 30 stars at the time of writing this blog!

I decided to take this to the next level by trying to build a similar UI in shiny which allows the user to create a custom star map and not need to use the R console. In this blog I share my experience constructing it and showcase the “free alternative” to MapsForMoments – starBlissGUI!

Click through to see how you can run the app, as well as a sample output.

Estimating Simulation Variance when Running Stan Models in R

Sebastian Sauer takes a look at an interesting question:

stan_glm() allows for setting a seed value thereby eliminating the variance induced by random numbers. However, in case a seed is not used, how much variance is to be expected? This is the research question of this analysis.

Let’s choose n=100 repetitions in our simulation.

Click through for the demonstration, including a summary table and notes on installed packages for the sake of reproducibility.
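
The shape of that experiment can be sketched without Stan at all. The example below swaps in a simple Monte Carlo mean as a stand-in for one model fit (it is not `stan_glm()`), repeats it 100 times, and reports the spread of the estimates with and without a fixed seed. All function names, the target distribution, and the number of draws are assumptions made for illustration.

```python
import random
import statistics


def noisy_estimate(rng, draws=1000):
    """Stand-in for one model fit: a Monte Carlo estimate of the mean of Normal(2, 1)."""
    return statistics.fmean([rng.gauss(2.0, 1.0) for _ in range(draws)])


def simulation_spread(n_reps=100, draws=1000, seed=None):
    """Repeat the estimate n_reps times; return (mean, sd) of the estimates.

    seed=None gives each repetition a fresh, unseeded generator.
    A fixed seed makes every repetition identical, so the sd is zero.
    """
    estimates = [noisy_estimate(random.Random(seed), draws) for _ in range(n_reps)]
    return statistics.fmean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    print("unseeded:", simulation_spread())
    print("seeded:  ", simulation_spread(seed=42))
```

The unseeded standard deviation is the quantity of interest: it measures how much the answer moves purely because of random number generation.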

Checking XML Validity

Kevin Wilkie doesn’t like misshapen XML data:

Sometimes you’ll find that you will have XML in your database. This could be for various reasons – from storing the XML after receiving an API response to keeping it in a table because a web developer couldn’t figure out another way to store their data. Sometimes – no matter how much you trust your source – you should question if the XML is well-formed. Let’s work out a few ways you can do that in a database.

Read on for a few tests. The more concerned you are about XML data quality, the more you’d want to push in the direction of having an XSD (XML Schema) defined as well.
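
Outside the database, the same well-formedness question can be asked with a parser. This is a minimal sketch using Python's standard library rather than the in-database tests from the linked post; the function name `is_well_formed` is hypothetical. It checks only that the text parses, not that it conforms to any schema.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
    except ET.ParseError:
        return False
    return True


if __name__ == "__main__":
    print(is_well_formed("<order><item/></order>"))   # properly nested
    print(is_well_formed("<order><item></order>"))    # <item> is never closed
```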

Using a Trigger to Auto-Refresh View Metadata

Aaron Bertrand keeps metadata in sync:

As much as we tell people to use SCHEMABINDING and avoid SELECT *, there is still a wide range of reasons people do not. A well-documented problem with SELECT * in views, specifically, is that the system caches the metadata about the view from the time the view was created, not when the view is queried. If the underlying table later changes, the view doesn’t reflect the updated schema without refreshing, altering, or recreating the view. Wouldn’t it be great if you could stop worrying about that scenario and have the system automatically keep the metadata in sync?

It’s almost entirely not apropos, but the first thing I thought of when I read the title and Problem statement was Goethe’s line about Mephistopheles: “Oft evil will shall evil mar.” Make of that what you will.
