Press "Enter" to skip to content

Author: Kevin Feasel

Stream Processing with Flink and Kafka

Konstantin Knauf starts a new series:

There was a huge amount of buzz about Apache Flink® at this year’s Kafka Summit London. From an action-packed keynote to standing-room only breakout sessions, it’s clear that the Apache Kafka® community is hungry to learn more about Flink and how the stream processing framework fits into the modern data streaming stack.

That’s why we’re excited to introduce our new “Inside Flink” blog series that takes a deeper look at why developers and organizations everywhere are shifting their stream processing technologies to Flink. Our first blog post explains what Flink is and how it can enhance your streaming use cases running on Kafka. Future topics will include common Flink use cases, an inside look at Flink SQL, and much more.

Click through for the first post in the series, which covers what Flink is and how the two products can interoperate.

Comments closed

Solving Systems of Equations in R

Steven Sanderson needs a solution:

In mathematical modeling and data analysis, it is often necessary to solve systems of equations to find the values of unknown variables. R provides the solve() function, which is a powerful tool for solving systems of linear equations. In this blog post, we will explore the purpose of solving systems of equations, explain the syntax of the solve() function, and provide three examples of increasing complexity to demonstrate its usage.

This post got me thinking about linear programming, which is a different topic but still pretty easy to do in R.

Comments closed

Creating a Calendar View in Power BI

Martin Schoombee needs a calendar:

It’s pretty sad that we don’t have a built-in calendar visualization in Power BI, and the custom visuals in the marketplace don’t have everything I need/want for my own internal reporting…so I decided to experiment a little and see how close I could get with the standard graphs that are available.

Read on to see how close Martin could get. It’s actually more calendar-looking than I would have expected, though also frustratingly limited.

Comments closed

Don’t Start Stored Procedure Names with sp_

Chad Callihan provides solid advice:

Everyone has an opinion on naming stored procedures. One opinion that can be agreed upon is that starting a stored procedure with “sp_” is not the way to go. Using “sp_” can only do harm. Let’s take a look at why that is.

Read on for Chad’s reasoning.

I don’t like prefixes at all for stored procedures (or tables). It’s a common misunderstanding of Hungarian notation, as you don’t provide any new and relevant information in the object name: of course it’s a stored procedure; I’m using “EXEC” to execute it, so what else could it be?

The exception to a very good rule is something which you intend to put in the master database and execute from any database context. The best example of this is sp_whoisactive.

Comments closed

ANSI_PADDING and Its Relevance

Vitaly Bruk won’t be late to meetings anymore:

Today, I have late for the team meeting … 🙁

When I joined a Teams meeting, the team discussed ANSI_PADDING. One of the teammates fires at me with the teacher’s voice: “What are we talking about? What is ANSI_PADDING? Shoot!”.

That was a joke, but … just in case, let’s see what is this.

Read on to learn what the ANSI_PADDING attribute is and why it’s relevant.

Comments closed

Dense Rank in KQL

Robert Cain continues a series on KQL:

The Kusto Windowing Function row_rank_dense is an interesting function. It lets you get a unique count of a specific column in a dataset. Unlike other methods of getting counts, row_rank_dense allows you to see each individual row of data.

First though, if you’ve not read the introductory post on Windowing Functions, Fun With KQL Windowing Functions – Serialize and Row_Number, you should do so now as it introduced several important concepts needed to understand how these Windowing Functions work.

Read on to see how this window function works.

Comments closed

Adaptive Query Execution in Spark 3.0

The Big Data in Real World team talks on-the-fly execution plan changes:

Apache Spark, the popular distributed computing framework, has been widely adopted for processing large-scale data. With the release of Apache Spark 3.0, a groundbreaking feature called Adaptive Query Execution (AQE) was introduced. AQE addresses the limitations of traditional static execution plans by dynamically optimizing query execution based on runtime statistics. In this blog post, we will explore how AQE works and how it significantly improves the performance of Spark applications.

This sounds pretty similar to adaptive query processing in SQL Server, though a look at the Spark documentation shows that there are some practical differences in implementation versus what SQL Server does.

Comments closed

Thoughts on Parameter Sensitive Plan Optimization

Jared Poche shares some thoughts:

First, I wanted to point out that there are fixes to issues specific to Parameter Sensitive Plan Optimization in SQL Server 2022. The issue itself is documented at the link below.

https://learn.microsoft.com/en-us/sql/relational-databases/performance/parameter-sensitive-plan-optimization?view=sql-server-ver16#known-issues

When I heard about the issue, I heard that it could cause SQL Server 2022 to cause stack dumps every 15 minutes. The time interval made me think it was related to the default interval for Query Store, and that seems to be the case.

Jared has several thoughts on the topic as he puts together a talk on PSPO.

Comments closed

Tracking Historical Database Snapshots

David Fowler goes sleuthing:

This is a question that’s come up twice this morning, firstly where can we find a history of database snapshots and secondly where can we find a history of restores from snapshot?

Frustratingly, SQL doesn’t make this at all easy for us and if this is something that you want to record, you’re going to have a do a little extra work.

Let’s take a look at each part in turn.

Read on for a messy solution.

Comments closed

Microsoft Fabric Notebooks and Compute Limits

Reitse Eskens hits a wall:

In this case, my notebook threw an error at me but the command seemed to finish without any issue. Sounds vague? It did to me. The notebookcell I tried to run had a lot of stuff happening at the same time.

As you can see in the above screenshot, the status shows green checkmarks but there’s an error as well. The error message was not really clear to me, but that can really be me lack of deep level experience. So, I logged a call with Microsoft Support and see what they could come up with.

I’ve had enough experience with Spark to see the issue and figure the response, but click through for the screenshot and what Reitse did to resolve the issue.

Comments closed