Press "Enter" to skip to content

Category: Source Control

Thoughts on Postgres File Layout and Migration

Dian Fay shares some advice:

I’ve used several migration frameworks in my time. Most have been variations on a common theme dating back lo these past fifteen-twenty years: an ordered directory of SQL scripts with an in-database registry table recording those which have been executed. The good ones checksum each script and validate them every run to make sure nobody’s trying to change the old files out from under you. But I’ve run into three so far, and used two in production, that do something different. Each revolves around a central idea that sets it apart and makes developing and deploying changes easier, faster, or better-organized than its competition — provided you’re able to work within the assumptions and constraints that idea implies.

Read on for thoughts about three tools: sqitch, graphile-migrate, and migra.

Leave a Comment

Source Control and Change Management for Postgres

Ryan Booz relives an older story:

For those of you that don’t know, those ER tools were really expensive (probably still are for the ones that exist) and only a few developers had access to the tool. They didn’t have a great DX either.

Aside from the lack of automation and ability of our developers to be more integrated into the process, there was always the one looming issue that we just couldn’t reconcile.

If Joe left and joined the circus (see, I got you there), we were stuck.

We knew this was a bottleneck for some time and we had tried multiple times to change the process. Our ability to iterate on new feature development went through one person and a set of 15-year-old scripts. It didn’t match our otherwise Agile process of front-end code and data analysis projects.

Read on for Ryan’s thoughts on database change management. Some of the tools mentioned work with multiple database platforms, whereas others are specific to Postgres.

Leave a Comment

Feature Branching and Hotfixes for Azure DevOps

Vytas Suopys covers a bit of source control strategy:

Have you ever deployed a release to production only to find out a bug has escaped your testing process and now users are being severely impacted? In this post, I’ll discuss how to deploy a fix from your development Synapse Workspace into a production Synapse Workspace without adversely affecting ongoing development projects.

This example uses Azure DevOps for CICD along with a Synapse extension for Azure DevOps: Synapse Workspace Deployment. In this example, I assume Synapse is already configured for source control with Azure DevOps Git and Build and Release pipelines are already defined in Azure DevOps. Instructions on how to apply this this can be found in the Azure Synapse documentation for continuous integration and delivery.

The specific example covers Synapse, though the general principle applies no matter what you’re deploying.

Comments closed

SQLCMD Variables in Database Projects

Olivier Van Steenlandt can’t live in this static world:

When I started to explore and use Database Projects, I ran into a specific situation quite fast where I was required to use SQLCMD variables. In this blog post, I will describe what they are, how you can use SQLCMD variables in Database Projects and where this might become very useful for you.

Click through for a scenario, a primer on using SQLCMD variables, and some basic details on how to use them in database projects.

Comments closed

Pull Requests and Database Projects

Olivier Van Steenlandt continues building out a database project:

A Pull Request is an alternative way of merging branches. Instead of executing the merge yourself, you will create a Pull Request for someone else to revise your development and approve and merge when ok. By doing this you are introducing peer reviewing into your development process. From my experience, using Pull Requests will improve your development quality since someone needs to validate your changes before they can be deployed to a certain environment.

I wouldn’t advise using Pull Requests to get changes in a development environment but I would advise using it to get changes to an Acceptance- or Production environment. By doing this, you can already find issues in an earlier stage than in production.

Read on to see how they work, using Azure DevOps as the example. GitHub pull requests are very similar in nature.

Comments closed

Thoughts on Software Development and Postgres

Ryan Lambert thinks about software development:

Question: Do you store your SQL code in GitHub (or other source control, e.g. GitLab)?

Yes! All mission critical SQL I am involved with is saved in a Git repo, either public or private depending on the project. The use of source control for mission critical code (SQL, Python, etc.) is non-negotiable. A good portion of my “not trivial” code is also stored in source control, with my trend leaning towards more code in source control. The trivial code isn’t worth the effort of putting it into source control or the bloat it creates in those projects.

Read on for more.

Comments closed

Isolated Spark Testing with lakeFS

Adi Polak demonstrates lakeFS:

This tutorial demonstrates how to build a development and testing environment for validating your logic on a full-blown production data volume and variety, working with lakeFS and Spark. You will walk through the journey of creating a repository and building a Spark application while using lakeFS capabilities. You will learn how to data changes, revert them in cases of mistakes or other hiccups, and lately merge separate branches to reflect data changes from the isolated environments.

Not too long ago, I had a couple conversations with developers and data engineers about decentralized development and devs having their own environments and data. This seems like it would be a good approach to that common problem, and it works for Azure Synapse Analytics as well.

Comments closed

Partial Database Projects

Olivier Van Steenlandt doesn’t get the whole cookie:

In this blog post, I will describe how you can get a database in source control partially. You might be wondering why you would do that. Well, let’s start by explaining the use case.

A couple of years ago, I was working for a company where a third-party vendor owned the OLTP system. At that point in time, we were not allowed to change any existing objects or create any new objects in the existing schemas. Though, we were required to be able to transfer the data from the OLTP system to the staging environment of our Data Warehouse. To do so, the third-party vendor created a schema in the database where we were allowed to create views and stored procedures to be able to get the data we needed.

Read on for an example of how this might work, as well as important database project settings you’ll want to change in that case.

Comments closed