Press "Enter" to skip to content

Author: Kevin Feasel

PFS Contention and Heaps

Uwe Ricken continues a series on heaps in SQL Server:

The PFS page “can” become a bottleneck for a heap if many data records are entered in the heap in the shortest possible time. How often the PFS page has to be updated depends mostly on the data record’s size to be saved.

This procedure does not apply to clustered indexes since data records in an index must ALWAYS be “sorted” into the data volume according to the defined index value. Therefore, the search for a “free” space is not carried out via the PFS page but via the value of the key attribute!

Read on for more detail.

Comments closed

Wrapping up the Azure Databricks Advent

Tomaz Kastrun laughs at 24-day advent calendars:

In the last two days we have focused on understanding Apache Spark through performance tuning and through troubleshooting. Both require some deeper understanding of Spark and Azure Databricks, but gives also a great insight to all who will need to improve performance and work with Spark.

Today, I would like to list couple of additional Learning material, documentation and any other additional resources for further exploration on Azure Databricks.

Click through for links to additional resources on Apache Spark and Databricks, as well as the other 30 entries in the series.

Comments closed

Non-Kimball Relationships in Power BI

Paul Turley continues a series on relationship modeling in Power BI:

So far, you’ve seen that the essential components of a data model include tables related to each other. Filtering records in one table prorogates the filter to related tables(s), causing records in the related table to be filtered. This dimensional model schema is the foundation of the reporting solution. Once the fact and dimension tables are in-place, you can create more advanced solutions by working outside of this standard relationship pattern.

Read on for the full story.

Comments closed

The Complexity of Adding Simple Features

Chris Webb answers a timeless question:

One question I get asked all the time is this:

Why don’t you add [insert feature idea here] to Power BI?

It’s sometimes followed up by one or more of the following comments:

It would be so easy for you to do
I can’t believe you haven’t done it already
Power BI is unusable without it
[insert competitor name here] has had this feature for years

…and a real or virtual exasperated sigh.

Read on for the answer. This isn’t special to Power BI or even Microsoft—once you start to have customers with competing interests, these decisions get a lot harder.

Comments closed

Incremental Backoff with Powershell

Shane O’Neill implements a linear backoff strategy:

Pushups are hard! Even when I’m not trying to rep out as many as I can, they still take a toll on the body. Soon a five-minute break is not enough, and I’m taking longer and longer rests.

Fine, if that’s the way we’re going to do this, then I’m going to go with the flow.

I can confirm that pushups are hard. Also, click through for a link to the backoff script.

Comments closed

Powershell Quick Hits

Jess Pomfret shares a few Powershell tips:

My goal was to gain more stars than last year, which I succeeded at. I only got 6 total stars last year. Now my goal for next year will be to beat this year’s performance.  I did learn several neat things while working on these puzzles and those I thought were worth sharing.

Read on for those things.

Comments closed

Bayesian Modeling of Holiday Behavior

Daniel Marthaler and Brian Coffey have an interesting post:

As the year unfolds, our demand fluctuates. Two big drivers of that fluctuation are seasonality and holidays. With the holiday season upon us, it’s a great time to describe how both seasonality and holiday effects can be estimated, and how you can use this formulation in a predictive time series model.

In this post, we describe the difference between seasonality and holiday effects, posit a general Bayesian Holiday Model, and show how that model performs on some Google Trends data.

Read the whole thing.

Comments closed

Deleting Messages and Topics in Kafka

The Hadoop in Real World team has a pair of related posts. The first is on how to remove messages in a Kafka topic:

The easiest way to purge or delete messages in a Kafka topic is by setting the retention.ms to a low value. retention.ms configuration controls how long messages should be kept in a topic. Once the age of the message in a topic hits the retention time the message will be removed from the topic.

Note the below steps delete or purge messages in your topic. Use precaution when executing the below.

Because Kafka is an immutable log rather than “final” storage, the ideal scenario has you never deleting data. But sometimes you just run low on disk space. You can also set the max retention size as another option. But note that these aren’t going to let you delete a single message—that’s not a good thing to do with a log; rather, you offset or cancel out the message and submit a new one.

The second post covers deletion of a Kafka topic:

In this post we will see how to delete a Kafka topic and get the details of the topic before deleting it.

Comments closed