Press "Enter" to skip to content

Author: Kevin Feasel

Reviewing Table-Level Statistics in Postgres

Muhammad Ali explains what information you can find regarding tables in Postgres:

Database monitoring, performance tuning and query optimization are critical operations for maintaining an efficient database system. A key component in PostgreSQL for this purpose is the pg_stat_all_tables view.

pg_stat_all_tables offers real-time statistics on table activities such as the number of sequential and index scans performed on a table, updates, deletes, inserts, and more. It also offers information on dead tuples along with vacuum and analyze stats, which enables DB admins to make data-driven decisions. Here’s a table detailing the columns in the pg_stat_all_tables view:

Click through for that table about tables.
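
If you want a quick feel for what the view exposes before clicking through, here is a minimal example query. The columns referenced (seq_scan, idx_scan, n_live_tup, n_dead_tup, and the autovacuum/autoanalyze timestamps) are standard pg_stat_all_tables columns; the filtering and ordering choices are just mine for illustration.

-- Scan activity and dead tuple counts for user tables,
-- ordered by the tables accumulating the most dead rows.
SELECT schemaname,
       relname,
       seq_scan,
       idx_scan,
       n_live_tup,
       n_dead_tup,
       last_autovacuum,
       last_autoanalyze
FROM pg_stat_all_tables
WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
ORDER BY n_dead_tup DESC
LIMIT 20;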


Leaving Good Comments in a Stored Procedure

Erik Darling comments on your comments:

Possibly the least helpful, but most humorous, way of leaving comments is a large block of green text up at the top of a module.

There are all sorts of helpful insights buried in those comments to help me as a consultant understand my audience.

But… 

I agree with a lot of where Erik is going with his thoughts. The area where we probably have some daylight is that I’d rather limit comments to statements of why rather than what. Sure, when I’m pseudo-coding out a procedure, I’ll have a bunch of little “do this thing here” types of comments, but I remove those as I build out the code. Instead, explain why you’re doing something if it isn’t patently obvious: you rewrote a query in a more complicated-to-read fashion because it performs much better, that kind of thing.

But in fairness, as long as your comments actually reflect the code, it’s really hard to say any code base is ever over-commented. It’s way easier to go the opposite direction.
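
To make the why-versus-what distinction concrete, here is a small, hypothetical T-SQL fragment. The tables, columns, and the performance claim in the second comment are all invented for illustration.

DECLARE @CustomerID int = 42;

-- A "what" comment just restates the code:
-- Get the customer's orders.
SELECT OrderID, OrderDate
FROM dbo.Orders
WHERE CustomerID = @CustomerID;

-- A "why" comment explains a decision the code can't explain on its own:
-- EXISTS instead of a join here because the join version duplicated rows
-- when a customer had multiple addresses and performed far worse at scale.
SELECT o.OrderID, o.OrderDate
FROM dbo.Orders AS o
WHERE o.CustomerID = @CustomerID
      AND EXISTS (SELECT 1
                  FROM dbo.CustomerAddresses AS ca
                  WHERE ca.CustomerID = o.CustomerID);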


Documenting a Tabular Model

Olivier Van Steenlandt builds the docs:

A few months ago, I chatted with colleagues about our Tabular Model. More specifically the lack of Tabular Model documentation. Since we were thinking about replacing our current model, I started to think about how to integrate documentation easily.

Having documentation is one thing; making sure it’s used is something completely different. And then we’re not even talking about keeping it up to date. My initial idea was to include the documentation task during the development phase. That said, it’s time to put those thoughts into practice.

Read on to see what Olivier did.


All about Lakehouses in Microsoft Fabric

Tomaz Kastrun gives us the skinny with multiple posts in his Advent of Microsoft Fabric. Day 3 introduces the lakehouse:

A lakehouse is cost-effective, optimised storage that supports all types of data and file formats, structured and unstructured, and helps you govern the data, giving you better data governance. With optimised, concurrent reads and writes, it delivers outstanding performance while also reducing data movement and minimising redundant copy operations. Furthermore, it gives you a user-friendly multitasking experience in the UI that retains your context, so you can work on multiple things without losing running operations or accidentally stopping others.

Day 4 covers Delta format:

Yesterday we looked into the lakehouse and learned that Delta tables are the storage format. So let’s explore how to understand and work with Delta tables. But first, we must understand Delta Lake.

Day 5 covers data ingest:

We have learned about Delta Lake and Delta tables. But since we have uploaded the file directly, let’s explore how we can also get the data into the lakehouse.

Click through for all three posts.
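
As a small illustration of the Delta point: in a Fabric lakehouse notebook, Spark SQL creates tables in Delta format by default, so even a plain CREATE TABLE plus INSERT lands Delta data in the lakehouse. The table name, columns, and values below are invented for the example, not taken from Tomaz's posts.

-- Spark SQL in a Fabric lakehouse notebook; tables default to Delta format.
CREATE TABLE IF NOT EXISTS sales_orders
(
    order_id   INT,
    order_date DATE,
    amount     DECIMAL(10, 2)
);

INSERT INTO sales_orders
VALUES (1, DATE '2023-12-01', 199.99);

SELECT * FROM sales_orders;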


Joining on Overlapping Date Ranges in T-SQL

Daniel Hutmacher crosses the streams:

You can get into a situation where you have two tables with values associated with date ranges. What’s worse, those date ranges don’t necessarily align, which can make joining them seem like a complex task. It is surprisingly simple, though, once you learn how to think about overlapping date ranges, along with this relatively simple T-SQL join pattern.

This problem gets even more challenging if you have the possibility of multiple overlaps and you want to find the combination with the biggest overlap for each individual item.
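
Daniel walks through the pattern properly, but the core overlap test is compact enough to sketch here: two ranges overlap when each one starts on or before the other one ends. The tables and columns below are hypothetical, and this is the general pattern rather than necessarily Daniel's exact formulation.

-- Two [StartDate, EndDate] ranges overlap when each starts on or before
-- the other ends. dbo.PriceHistory and dbo.PromotionHistory are made up.
SELECT a.ItemID,
       a.StartDate AS price_start, a.EndDate AS price_end,
       b.StartDate AS promo_start, b.EndDate AS promo_end,
       -- The overlapping window itself: the later start, the earlier end.
       CASE WHEN a.StartDate > b.StartDate THEN a.StartDate ELSE b.StartDate END AS overlap_start,
       CASE WHEN a.EndDate   < b.EndDate   THEN a.EndDate   ELSE b.EndDate   END AS overlap_end
FROM dbo.PriceHistory AS a
JOIN dbo.PromotionHistory AS b
    ON b.ItemID = a.ItemID
   AND a.StartDate <= b.EndDate
   AND b.StartDate <= a.EndDate;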


Producing Messages with librdkafka

Jakub Korab dives into a Kafka library:

In a previous blog post (How To Survive an Apache Kafka® Outage) I outlined the effects on applications during partial or total Kafka cluster outages and proposed some architectural strategies to handle these types of service interruptions. The applications most heavily impacted by this type of outage are external interfaces that receive data, do not control request flow, and possibly perform some form of business transaction with the outside world before producing to Kafka. These applications are most commonly found in finance and written in languages other than Java—mostly C and C++. 

librdkafka is the main underlying client library used in non-JVM environments and has wrapper libraries for Python, .Net, Go, and an ever-expanding list of clients. It has not been written about to the same extent as the Java client, and it is worth examining as its interface and underlying mechanics are fundamentally different. 

This library is quite useful and versatile.


Integrating Azure ML and Power BI

I have a new video:

In this video, I show off how easy it is to integrate Azure ML and Power BI, at least once you get past all of the trouble trying to integrate them.

I expected this to be easy. It turns out that the “make it look easy” part depends on having several things in place already and using the correct (by which I mean “old”) deployment type.


Finding the Local Port Number for Power BI Desktop

Soheil Bakhshi updates an older post:

In March 2018, I wrote a blogpost called Four Different Ways to Find Your Power BI Desktop Local Port Number. Last week, Zoe Doughlas from Microsoft left a comment reminding me of a fifth method to get the port which encouraged me to write this quick tip. Thanks to Zoe!

As the name suggests, the blog was about finding Power BI Desktop’s local port number. If you do not have any clue what I mean by local port number, I strongly suggest reading that blog.

Read on to see what that fifth method is.


Optimizing Sort Operators in Window Functions

Andy Brownsword talks about window function query tuning:

We’re on quite a roll with window functions these past few weeks. Last week we looked at the operators we’d see in execution plans when using a window function. This week I wanted to tackle one of the more troublesome ones specifically: the Sort operator.

We know that sort operators are expensive in our queries. To use a window function our data needs to be sorted. How about if we need multiple functions? What if we’d like the output sorted too? Can we optimise any of those out of the execution plan?

Read on for several tips.
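
One general idea worth keeping in mind while reading, offered as a sketch of the principle rather than a summary of Andy's specific tips: if every window function, and ideally the final ORDER BY, agrees on the same PARTITION BY and ORDER BY, the optimizer can often satisfy them with a single Sort, or with none at all when a supporting index already delivers rows in that order. The table below is made up.

-- Both window functions share one PARTITION BY / ORDER BY, and the final
-- ORDER BY matches it, so a single sort order can serve all three.
-- With a supporting index leading on (CustomerID, SaleDate), the Sort
-- operator can disappear entirely. dbo.Sales is a hypothetical table.
SELECT SaleID,
       CustomerID,
       SaleDate,
       ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY SaleDate) AS SaleSequence,
       SUM(Amount)  OVER (PARTITION BY CustomerID ORDER BY SaleDate) AS RunningTotal
FROM dbo.Sales
ORDER BY CustomerID, SaleDate;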
