Kevin Feasel – Page 284

Vacuuming in PostgreSQL

Published 2023-10-09 by Kevin Feasel

If you’re a PostgreSQL user, you’ve undoubtedly come across the term “vacuum”. This operation plays a pivotal role in maintaining the optimal performance of your database while preventing unnecessary data bloat. In this blog, we’ll understand how vacuum works at a high level, its significance, types, server parameters that influence autovacuum operations, and general FAQs on vacuum.

Read on to learn more about what vacuuming does and why it is important. It also turns out that there are multiple types of vacuuming.

Workaround for Primary Keys in Fabric Data Warehouses

Published 2023-10-09 by Kevin Feasel

Gilbert Quevauvilliers needs a key:

When I started looking into using the data warehouses feature in Fabric, I did see that there were limitations on Primary Key columns.

Below is my blog post on how I still use keys in my data warehouse, instead of using GUIDs, which to me are long and hard to use.

In my example, I am going to create a simple data warehouse consisting of two dimension tables (Date and Country) and a fact table with the Sales amounts.

This seems sub-optimal, though at least Gilbert shows us a workaround.

In-Memory OLTP and Memory Allocation

Published 2023-10-09 by Kevin Feasel

Tanayankar Chakraborty explains an error:

We recently encountered a support case where a customer using in-memory tables in an Azure SQL DB received an error message while trying to insert data into a table that also had a clustered columnstore index. The customer then deleted all of the data from the in-memory tables (with the clustered columnstore index); however, it appeared that the Index Unused memory was still not released. Here’s the memory allocation the customer could see:

Error

In addition to the error above, here is the error text:

Msg 41823, Level 16, State 109, Line 1

Could not perform the operation because the database has reached its quota for in-memory tables. This error may be transient. Please retry the operation. See 'http://go.microsoft.com/fwlink/?LinkID=623028' for more information

In this case, the error ends up being a “didn’t read the manual” type of error.

Debugging an Unresponsive Elasticsearch Cluster

Published 2023-10-06 by Kevin Feasel

Derric Gilling troubleshoots an Elasticsearch cluster:

Because of this sharding, a read or write request to an Elasticsearch cluster requires coordinating between multiple nodes as there is no “global view” of your data on a single server. While this makes Elasticsearch highly scalable, it also makes it much more complex to set up and tune than other popular databases like MongoDB or PostgreSQL, which can run on a single server.

When reliability issues come up, firefighting can be stressful if your Elasticsearch setup is buggy or unstable. Your incident could be impacting customers, which could negatively impact revenue and your business reputation. Fast remediation steps are important, yet spending a large amount of time researching solutions online during an incident or outage is not a luxury most engineers have. This guide is intended to be a cheat sheet of common issues that engineers running Elasticsearch may hit, along with what to look for.

Read on for several helpful tips.

Radar Charts in R

Published 2023-10-06 by Kevin Feasel

Steven Sanderson has radar love:

Radar charts, also known as spider, web, polar, or star plots, are a useful way to visualize multivariate data. In R, we can create radar charts using the fmsb library. Here are several examples of how to create radar charts in R using the fmsb library:

Radar charts are a guilty pleasure of mine. They are rarely the right choice, but when they are, I love it so much.

Tightening up Dashboards

Published 2023-10-06 by Kevin Feasel

Rita Fainshtein improves that dashboard:

One of our challenges as dashboard developers is effectively presenting all the necessary information to decision-makers while working within the constraints of limited ‘real estate’ on the dashboard. To tackle this challenge, I’ve compiled a list of 5 tips that will help you complete the task without the need for excessive buttons or constant screen switching.

I heartily agree with four of the five and agree with caveats concerning the tooltip example. The only reason I might disagree with moving information into tooltips is that dashboards are intended to be glanceable, meaning you can get all relevant information by looking at the dashboard without needing to click, drag, scroll, drill, or otherwise manipulate it. I like tooltips for ancillary information, which, in fairness, is also the point Rita drives at.

RCSI and ID-Driven ETL

Published 2023-10-06 by Kevin Feasel

Michael J. Swart shares a warning:

Yesterday, Kendra Little talked a bit about Lost Updates under RCSI. It’s a minor issue that can pop up after turning on RCSI as the default behavior for the Read Committed isolation level. But she doesn’t want to dissuade you from considering the option and I agree with that advice.

In fact, even though we turned RCSI on years ago, by a bizarre coincidence, we only came across our first RCSI-related issue very recently. But it wasn’t update related. Instead, it has to do with an ETL process. To explain it better, consider this demo:

Michael has one example solution. I could also see a “windback” run, where, instead of starting at the very end of the line for ETL, you start a few hundred rows earlier. That way, you can pick up any stragglers. It would add some overhead to the ETL task, but given that ETL jobs should be idempotent, it shouldn’t affect the end results.
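To make the windback idea concrete, here is a minimal sketch in Python. It is not Michael's solution or his demo: the function names, the in-memory lists standing in for tables, and measuring the windback in ID units rather than rows are all assumptions made for illustration. The scenario assumed is that a row with a lower ID commits after a row with a higher ID, so an ID-driven extract that only looks past its high-water mark never sees it.

```python
def extract_batch(visible_rows, last_loaded_id, windback=0):
    """Return visible rows past the high-water mark, rewound by `windback` IDs."""
    start_after = max(last_loaded_id - windback, 0)
    return sorted(
        (row for row in visible_rows if row["id"] > start_after),
        key=lambda row: row["id"],
    )


def load_batch(target, batch):
    """Idempotent load: keyed by id, so re-reading a row overwrites it."""
    for row in batch:
        target[row["id"]] = row


def run_etl(visible_rows, target, last_loaded_id, windback=0):
    """Run one ETL pass and return the new high-water mark."""
    batch = extract_batch(visible_rows, last_loaded_id, windback)
    load_batch(target, batch)
    return max([last_loaded_id] + [row["id"] for row in batch])


# Hypothetical data: during run 1, id 2 exists but is not yet committed,
# so the reader only sees ids 1 and 3. By run 2, everything is visible.
run1 = [{"id": 1}, {"id": 3}]
run2 = [{"id": 1}, {"id": 2}, {"id": 3}, {"id": 4}]

# No windback: the mark moves to 3 after run 1, so id 2 is never picked up.
target_plain = {}
mark = run_etl(run1, target_plain, 0)
mark = run_etl(run2, target_plain, mark)
print(sorted(target_plain))  # [1, 3, 4]

# Windback of 2 IDs: run 2 starts after id 1 and catches the straggler.
target_windback = {}
mark = run_etl(run1, target_windback, 0, windback=2)
mark = run_etl(run2, target_windback, mark, windback=2)
print(sorted(target_windback))  # [1, 2, 3, 4]
```

The windback only helps if it is wider than the longest gap a late-committing row can leave, and it relies on the load step being idempotent so that re-read rows do no harm.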

Auto-Failover Groups in Azure SQL DB

Published 2023-10-06 by Kevin Feasel

Etienne Lopes wraps up a series:

So, first of all, what are auto-failover groups?

“The auto-failover groups feature allows you to manage the replication and failover of databases to another Azure region. You can include a group of databases or all user databases in a logical server to be replicated to another logical server. It is a declarative abstraction on top of the active geo-replication feature, designed to simplify deployment and management of geo-replicated databases at scale.”

Read on to see some of the benefits of this, as well as how to enable it.

Lost Updates with RCSI

Published 2023-10-06 by Kevin Feasel

Kendra Little shares a warning:

There are two isolation levels in SQL Server that use optimistic locking for disk-based tables:

Read Committed Snapshot Isolation (RCSI), which changes the implementation of the default Read Committed Isolation level and enables statement-based consistency.

Snapshot Isolation, which provides high consistency for transactions (which often contain multiple statements). Snapshot Isolation also provides support for identifying update conflicts.

Many folks get pretty nervous about RCSI when they learn that certain timing effects can happen with data modifications that don’t happen under Read Committed. The irony is that RCSI does solve many OTHER timing risks in Read Committed, and overall is more consistent, so sticking with the pessimistic implementation of Read Committed is not a great solution, either.

I don’t recall getting any kind of update error with RCSI and I’ve used it in some pretty heavy workloads.

An Intro to Databricks Asset Bundles

Published 2023-10-05 by Kevin Feasel

Dustin Vannoy covers one technique for CI/CD in Databricks:

Databricks Asset Bundles provides a way to version and deploy Databricks assets – notebooks, workflows, Delta Live Tables pipelines, etc. This is a great option to let data teams set up CI/CD (Continuous Integration / Continuous Deployment). Some of the common approaches in the past have been Terraform, REST API, Databricks command line interface (CLI), or dbx. You can watch this video to hear why I think Databricks Asset Bundles is a good choice for many teams and see a demo of using it from your local environment or in your CI/CD pipeline.

Click through for a video and some sample scripts.

Author: Kevin Feasel