Press "Enter" to skip to content

Day: June 13, 2024

Combining Flink SQL, Streamlit, and Kafka

Lucia Cerchie has a pair of posts. First up, Lucia sets the stage:

In part 1 of this series, we’ll make an app, hosted on Streamlit, that allows a user to select a stock, in this case SPY, or the SPDR S&P 500 ETF Trust. Upon selection, a live chart of the stock’s bid prices, calculated every five seconds, will appear.

What are the pieces that go into making this work? The source of the data is the Alpaca Market Data API. We’ll hook up a Kafka producer to the websocket stream and send data to a Kafka topic in Confluent Cloud. Then we’ll use Flink SQL within Confluent Cloud’s Flink SQL workspace to tumble an average bid price every five seconds. Finally, we’ll use a Kafka consumer to receive that data and populate it to a Streamlit component in real time. This frontend component will be deployed on Streamlit as well.
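To make the windowing step concrete, here is a rough sketch of the kind of tumbling-window aggregation Flink SQL supports; the table name (stock_trades) and columns (symbol, bid_price, trade_time) are placeholders for illustration, not the schema from Lucia's project.

-- Average bid price per symbol over five-second tumbling windows.
-- Table and column names are assumed for this sketch.
SELECT
    symbol,
    window_start,
    window_end,
    AVG(bid_price) AS avg_bid_price
FROM TABLE(
    TUMBLE(TABLE stock_trades, DESCRIPTOR(trade_time), INTERVAL '5' SECONDS)
)
GROUP BY symbol, window_start, window_end;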

Part 2 then closes the trap:

In part one of this series, we walked through how to use Streamlit, Apache Kafka®, and Apache Flink® to create a live data-driven user interface for a market data application to select a stock (e.g., SPY) and discussed the structure of the app at a high level. First, data with information on stock bid prices is moved via an Alpaca websocket; then it’s produced to a Kafka topic in Confluent Cloud, where it is also processed with Flink SQL.

Now comes the tricky part: running the Kafka consumer and producer in the same application.

Click through for a good demonstration of a practical solution. Lucia also has a GitHub repo with all of the code, a demo of the site in action, and some links to additional resources.


An Auditing Oddity with SQL Audit

Rod Edwards runs into legal troubles:

This is a finger-pointing situation that I’ve witnessed in the past regarding native SQL Auditing, and the potential for edge case false positives. Something really not helpful when it comes to any security-related topic.

This post is just to highlight a potential gotcha with the native SQL Auditing functionality, dependent on its configuration. It’s certainly not a best practice guide on setting up Auditing, or access controls, or even the intent someone may have in falling foul of any Audit. There are many awesome guides out there on how to do exactly that.

Despite this post not being any of those things, it is still quite useful in pointing out an edge case in auditing, one to which I don’t have a good answer.
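For reference, the moving parts involved look roughly like this: a server audit plus a database audit specification that records reads against a schema. The names, file path, database, and audited action here are purely illustrative and not Rod's configuration.

-- Hypothetical audit setup for illustration only.
USE master;
CREATE SERVER AUDIT Audit_Sensitive
    TO FILE (FILEPATH = N'C:\SQLAudit\');
ALTER SERVER AUDIT Audit_Sensitive WITH (STATE = ON);
GO

USE SalesDb;  -- stand-in database name
CREATE DATABASE AUDIT SPECIFICATION AuditSpec_SensitiveReads
    FOR SERVER AUDIT Audit_Sensitive
    ADD (SELECT ON SCHEMA::dbo BY public)
    WITH (STATE = ON);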


Role Checks: Access Admin, Security Admin, DDL Admin

David Seis looks at three roles:

Understanding SQL Server roles is crucial for managing permissions and ensuring SQL Server security. In this post, we will delve into three specific roles: db_accessadmin, db_securityadmin, and db_ddladmin, discussing when each should be used and considerations for least privilege and security. We’ll also include a script you can use to audit your database roles.

Read on to see what each of those three do. I’m not sure I’ve ever worked in an environment that required use of any of these three roles. Typically, the person or set of people responsible for doing the activities associated with one of those three roles needed to do all three (and more).
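If you just want to see who holds these roles today, a membership query along these lines does the trick; this is a generic sketch rather than David's audit script.

-- List members of the three roles in the current database.
SELECT
    r.name AS role_name,
    m.name AS member_name
FROM sys.database_role_members AS drm
    JOIN sys.database_principals AS r
        ON drm.role_principal_id = r.principal_id
    JOIN sys.database_principals AS m
        ON drm.member_principal_id = m.principal_id
WHERE r.name IN (N'db_accessadmin', N'db_securityadmin', N'db_ddladmin')
ORDER BY role_name, member_name;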


Show Top N and Bottom N Records in One Power BI Visual

Kenneth Omorodion burns the candle from both ends:

Recently, I wrote an article, Rank and Sort Data Based on Multiple Columns in Power BI Using DAX. However, it is very common for business users to request the ability to dynamically view the Top N and Bottom N values of a measure, like Total Sales, on the same visual. This requirement is simple to implement on either the Top or Bottom N options. But, the challenge is when we need to represent the two options on the same chart simultaneously.

Read on for an example of how to do this.


Fun with Query Timeouts

Forrest McDaniel gets my most coveted category:

I love how there are lots of little things to SQL Server – mostly ignored details that only sometimes matter but make sense when you stare at them. Actually, I’m not sure I love them. I forget about them, and then stub my toe on them. Occasionally though, there’s a quirky combination that leads to bloggable shenanigans.

Let’s start with Detail Number One, which has most definitely tripped me up: queries that are returning rows won’t time out. You might be familiar with connections using a default 30s timeout, but as long as the query is returning a row every 29s, the client won’t time it out. You can even test this in SSMS.

Read on to see how Forrest takes advantage of this, uh, capability.
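If you want to see the behavior for yourself, a crude way to simulate a slow-but-alive query is a loop that emits a row every few seconds. With a typical 30-second command timeout, the batch below keeps running well past 30 seconds because the client keeps receiving results. This is just an illustration, not Forrest's demo.

-- Emits one small result set every five seconds for about 100 seconds.
DECLARE @i int = 1;
WHILE @i <= 20
BEGIN
    SELECT @i AS BatchNumber, SYSDATETIME() AS EmittedAt;
    WAITFOR DELAY '00:00:05';
    SET @i += 1;
END;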


Three Partitioning Options in Postgres

Semab Tariq shows how to perform three types of partitioning in PostgreSQL:

PostgreSQL is renowned for its exceptional performance in managing data. One of its standout features is partitioning, a technique that divides large datasets into smaller, more manageable segments. Partitioning provides several benefits, including improved query performance, streamlined data management, and enhanced scalability. By organizing data into partitions, PostgreSQL can execute searches more efficiently and handle tasks with greater ease. 

In this blog, we will delve into the details of partitioning in PostgreSQL, exploring its various types, advantages, and drawbacks. We’ll uncover how partitioning can revolutionize data management and decision-making processes in database environments.

Click through for demonstrations of range, list, and hash partitioning.
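As a taste of the declarative syntax, here is a minimal range-partitioning sketch; list and hash partitioning follow the same pattern with PARTITION BY LIST and PARTITION BY HASH. Table and column names are made up for illustration.

-- Parent table partitioned by a timestamp range.
CREATE TABLE measurements (
    device_id   int         NOT NULL,
    recorded_at timestamptz NOT NULL,
    reading     numeric
) PARTITION BY RANGE (recorded_at);

-- Child partitions, each covering one quarter.
CREATE TABLE measurements_2024_q1 PARTITION OF measurements
    FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
CREATE TABLE measurements_2024_q2 PARTITION OF measurements
    FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');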


Build a Custom Semantic Model for Microsoft Fabric

Reza Rad offers up some advice:

The Lakehouse or Warehouse comes with a default Power BI Semantic model, which can be used for reporting and analytics. However, you can also build and use a customized semantic model. There are significant differences when using the semantic model in real-world analytics projects. In this article, I’ll explain the difference between these two, which one is recommended, and why.

Click through for the video, as well as the article.
