Press "Enter" to skip to content

Day: June 5, 2023

Order, Sort, and Rank in R

Steven Sanderson compares three functions in R:

In the realm of data analysis and programming, organizing and sorting data efficiently is crucial. In R, a programming language renowned for its data manipulation capabilities, we have three powerful functions at our disposal: order()sort(), and rank(). In this blog post, we will delve into the intricacies of these functions, explore their applications, and understand their parameters. These R functions are all used to sort data, however, they each have different purposes and use different methods to sort the data.

Coming at R from a SQL background, the idea of order() and sort() behaving so differently was strange at first, especially because the verbs are synonymous and it’s the noun form of “order” which we get back, rather than performing the ordering.

Comments closed

Thoughts on Postgres File Layout and Migration

Dian Fay shares some advice:

I’ve used several migration frameworks in my time. Most have been variations on a common theme dating back lo these past fifteen-twenty years: an ordered directory of SQL scripts with an in-database registry table recording those which have been executed. The good ones checksum each script and validate them every run to make sure nobody’s trying to change the old files out from under you. But I’ve run into three so far, and used two in production, that do something different. Each revolves around a central idea that sets it apart and makes developing and deploying changes easier, faster, or better-organized than its competition — provided you’re able to work within the assumptions and constraints that idea implies.

Read on for thoughts about three tools: sqitch, graphile-migrate, and migra.

Comments closed

Postgres Change Management Rollbacks

Grant Fritchey explains why stateful systems are difficult to roll back:

The invitation this month for #PGSqlPhriday comes from Dian Fay. The topic is pretty simple, database change management. Now, I may have, once or twice, spoken about database change management, database DevOpsautomating deployments, and all that sort of thing. Maybe. Once or twice.

OK. This is my topic.

I’ve got some great examples on taking changes from the schema on your PostgreSQL databases and then deploying them. All the technical stuff you could want. However, I don’t want to talk about that today. Instead, I want to talk about something really important, the concept of rollbacks when it comes to database deployments.

I completely agree with Grant’s description of the pain and his recommendation. With stateful systems, roll forward, not backward.

Comments closed

Managing a Technical Project

Jeff Mlakar has some advice:

Over the years, in various positions, I’ve participated in many projects as a developer, lead developer, architect, jack-of-all-trades administrator, etc. I’ve also had the opportunity to lead technical projects as well.

This post focuses on techniques I have employed to successfully manage technical projects. Read on for tips regarding meetingscommunication, and building your confidence.

Read on for Jeff’s tips and recommendations.

Comments closed

Thoughts on Fabric OneLake

Teo Lachev shares some thoughts:

In a previous post, I shared my overall impression of Fabric. In this post, I’ll continue exploring Fabric, this time sharing my thoughts on OneLake. If you need a quick intro to Fabric OneLake, the Josh Caplan’s “Build 2023: Eliminate data silos with OneLake, the OneDrive for Data” presentation provides a great overview of OneLake, its capabilities, and the vision behind it from a Microsoft perspective. If you prefer a shorter narrative, you can find it in the “Microsoft OneLake in Fabric, the OneDrive for data” post. As always, we are all learning and constructive criticism would be appreciated if I missed or misinterpreted something.

I think some of Teo’s criticism comes from the idea that OneLake should also mean one lakehouse or one data lake, but the abstraction is one level higher than that. I would like to see some of Teo’s ideas make it into GA, though.

Comments closed

Source Control and Change Management for Postgres

Ryan Booz relives an older story:

For those of you that don’t know, those ER tools were really expensive (probably still are for the ones that exist) and only a few developers had access to the tool. They didn’t have a great DX either.

Aside from the lack of automation and ability of our developers to be more integrated into the process, there was always the one looming issue that we just couldn’t reconcile.

If Joe left and joined the circus (see, I got you there), we were stuck.

We knew this was a bottleneck for some time and we had tried multiple times to change the process. Our ability to iterate on new feature development went through one person and a set of 15-year-old scripts. It didn’t match our otherwise Agile process of front-end code and data analysis projects.

Read on for Ryan’s thoughts on database change management. Some of the tools mentioned work with multiple database platforms, whereas others are specific to Postgres.

Comments closed

Enabling Postgres Auditing

Athar Ishteyaque enables an extension:

The PostgreSQL Audit Extension (or pgaudit) provides detailed session and/or object audit logging via the standard logging facility provided by PostgreSQL.

The goal of a PostgreSQL audit is to provide the tools needed to produce audit logs. These logs are often required to pass certain government, financial, or ISO certification audits.

I am kind of curious what the performance impact of this extension is.

Comments closed