Press "Enter" to skip to content

Author: Kevin Feasel

Tips for Query Tuning in Postgres

Gabrielle Roth shares some advice:

For PGSQL Phriday #016, Ryan’s asked us to “…discuss your process for tuning difficult queries. Specifically, try to focus on that one problematic query that really challenged you and you always use it as an example when helping or teaching others your methods.”

Here are the generic steps I take, mostly in order.

Click through for those tips, as well as an example of using join_collapse_limit in practice to tame an unruly query.

Comments closed

partialBatch Commit Mode in Power BI API

Chris Webb provides an explanation:

I have always wondered what the partialBatch option for the commitMode parameter in the Enhanced Refresh API does exactly. There is some documentation here and here but I was curious to find out more as part of the research I’m doing for my ongoing series on Power BI refresh memory errors, in case it was useful for reducing overall memory usage (spoiler: it may be). In this post I’ll share what I found out after running some tests.

Read on for the demonstration and explanation, as well as tips on when you might want to use it.

Comments closed

Dynamic Unpivoting of Columns in T-SQL

Kristyna Ferris does a bit of twisting:

Picture this, your data ingestion team has created a table that has the sales for each month year split into different columns. At first glance, you may think “what’s the big deal? Should be pretty easy, right? All I need to do is unpivot these columns in Power BI and I’m good to go.” So you go that route, and the report works for one month. Next month, you get an urgent email from your stakeholders saying they can’t see this month’s numbers. That’s when you realize that this table will grow with new columns every month. That means that any report you make needs a schema refresh every single month. Unfortunately, Power BI will not grab new columns from a table once it’s published into the online service. The only way for the Power Query to pivot the new columns is for you to open the report in your desktop, go to Power Query, and refresh the preview to get all the columns in that table.

Which is quite the pain. But Kristyna has a solution using the UNPIVOT operator in T-SQL.

Comments closed

Collapsing or Concatenating Text in R

Steven Sanderson builds a list:

When working with data frames in R, you may often encounter scenarios where you need to collapse or concatenate text values based on groups within your dataset. This could involve combining text from multiple rows into a single row per group, which can be useful for summarizing data or preparing it for further analysis. In this post, we’ll explore how to achieve this task using different methods in R—specifically using base R, the dplyr package, and the data.table package.

This is the R equivalent of T-SQL’s STRING_AGG() function, or the STUFF() + FOR XML PATH approach if you’re still on an older version of SQL Server.

Comments closed

Quick Takes on Logistic Regression

John Cook talks about my favorite form of regression that serves to solve classification problems:

Logistic regression models the probability of a yes/no event occurring. It gives you more information than a model that simply tries to classify yeses and nos. I advised a client to move from an uninterpretable classification method to logistic regression and they were so excited about the result that they filed a patent on it.

It’s too late to patent logistic regression, but they filed a patent on the application of logistic regression to their domain. I don’t know whether the patent was ever granted.

Read on for a few more thoughts on and around logistic regression and logits from a mathematician.

Comments closed

Indexing for Read-Scale Databases

Jose Manuel Jurado Diaz shares a customer case:

Today, I worked on a service request that our customer has a Business Critical database with 4 vCores and Read-Scale Out enabled. Our customer noticed several performance issues using Read-Scale Out database and I would like to explain several lessons learned found during the troubleshooting steps.

Click through for notes on troubleshooting and improving performance.

Comments closed

Arbitrary Intervals for Partitioning in Postgres

Keith Fiske does a bit of interval math:

Whether you are managing a large table or setting up automatic archiving, time based partitioning in Postgres is incredibly powerful. pg_partman’s newest versions support a huge variety of custom time internals. Marco just published a post on using pg_partman with our new database product for doing analytics with PostgresCrunchy Bridge for Analytics. So I thought this would be a great time to review the basic and complex options for the time based partitioning.

Read on for a note of how pg_partman works and interval management, especially for versions earlier than 5.0.

Comments closed

Dangling Images with Oracle 23ai Free Edition

Kellyn Gorman runs into an issue:

When I tried to connect via SQLPlus as SYSDBA, I received an EXTPROC error. It pointed clearly to the listener.ora file, which I discovered a path listed still to ora23c for the extproc, corrected it, started the Listener, but to no avail- an ORA-12547 error, realizing I had a make file issue on the binaries for Oracle.

I contacted Geral Venzl, who was very gracious and after some quick research, he came back that his folks said everything was fine with the images, so I thanked him and dug into the issue deeper.  I quickly discovered this problem could happen to others, so decided I better document here for anyone who does happen upon it.

Click through for the high-level explanation and a bit more detail on dangling images.

Comments closed

Multi-Master Architecture in PostgreSQL

Semab Tariq describes a scale-out technique for Postgres:

Multi-master architecture has gained significant traction in the world of database management, offering a solution to traditional limitations in scalabilityfault tolerance, and high availability. By allowing multiple nodes to operate as master, this architecture promises a more flexible and robust database system. However, along with these benefits come certain challenges, including data consistency, resource demands, and conflict resolution.

In this blog, we will explore what multi-master architecture is, delve into its key advantages, and discuss the potential drawbacks that come with its implementation. Also in our upcoming blogs, we will see how you can setup your first multi-master architecture with a tool called PGD (Postgres Distributed) by EnterpriseDB (EDB).

Read on to learn how it works, as well as some of the pros and cons of using it.

Comments closed