Press "Enter" to skip to content

Category: Source Control

CI/CD for the Synapse Serverless SQL Pool

Kevin Chant has some links for us:

Last week the January 2023 video of the Azure Synapse Analytics and Microsoft MVP series was released. Where I covered how to do CI/CD for dedicated SQL Pools in Azure Synapse Analytics.

Since the release of that video the most popular question raised has been how to perform CI/CD for serverless SQL Pools within Azure Synapse Analytics?

Read on for links galore.

Comments closed

SQLCMD Variables in Database Projects

Olivier Van Steenlandt can’t live in this static world:

When I started to explore and use Database Projects, I ran into a specific situation quite fast where I was required to use SQLCMD variables. In this blog post, I will describe what they are, how you can use SQLCMD variables in Database Projects and where this might become very useful for you.

Click through for a scenario, a primer on using SQLCMD variables, and some basic details on how to use them in database projects.

Comments closed

Pull Requests and Database Projects

Olivier Van Steenlandt continues building out a database project:

A Pull Request is an alternative way of merging branches. Instead of executing the merge yourself, you will create a Pull Request for someone else to revise your development and approve and merge when ok. By doing this you are introducing peer reviewing into your development process. From my experience, using Pull Requests will improve your development quality since someone needs to validate your changes before they can be deployed to a certain environment.

I wouldn’t advise using Pull Requests to get changes in a development environment but I would advise using it to get changes to an Acceptance- or Production environment. By doing this, you can already find issues in an earlier stage than in production.

Read on to see how they work, using Azure DevOps as the example. GitHub pull requests are very similar in nature.

Comments closed

Thoughts on Software Development and Postgres

Ryan Lambert thinks about software development:

Question: Do you store your SQL code in GitHub (or other source control, e.g. GitLab)?

Yes! All mission critical SQL I am involved with is saved in a Git repo, either public or private depending on the project. The use of source control for mission critical code (SQL, Python, etc.) is non-negotiable. A good portion of my “not trivial” code is also stored in source control, with my trend leaning towards more code in source control. The trivial code isn’t worth the effort of putting it into source control or the bloat it creates in those projects.

Read on for more.

Comments closed

Isolated Spark Testing with lakeFS

Adi Polak demonstrates lakeFS:

This tutorial demonstrates how to build a development and testing environment for validating your logic on a full-blown production data volume and variety, working with lakeFS and Spark. You will walk through the journey of creating a repository and building a Spark application while using lakeFS capabilities. You will learn how to data changes, revert them in cases of mistakes or other hiccups, and lately merge separate branches to reflect data changes from the isolated environments.

Not too long ago, I had a couple conversations with developers and data engineers about decentralized development and devs having their own environments and data. This seems like it would be a good approach to that common problem, and it works for Azure Synapse Analytics as well.

Comments closed

Partial Database Projects

Olivier Van Steenlandt doesn’t get the whole cookie:

In this blog post, I will describe how you can get a database in source control partially. You might be wondering why you would do that. Well, let’s start by explaining the use case.

A couple of years ago, I was working for a company where a third-party vendor owned the OLTP system. At that point in time, we were not allowed to change any existing objects or create any new objects in the existing schemas. Though, we were required to be able to transfer the data from the OLTP system to the staging environment of our Data Warehouse. To do so, the third-party vendor created a schema in the database where we were allowed to create views and stored procedures to be able to get the data we needed.

Read on for an example of how this might work, as well as important database project settings you’ll want to change in that case.

Comments closed

The Benefits of Stacking Pull Requests

Vivian Qu explains why stacking pull requests can make sense:

Becoming proficient in version control and change management is a necessary part of any software engineer’s job. However, I think that basic proficiency alone is not sufficient to be truly effective when working on complex production-ready software with a team of engineers. Stacking pull requests (PRs) is a key skill that should be learned early in a junior engineer’s career.

Stacking PRs is an advanced git technique that allows an engineer to break down one large change into a series of dependent changes that can be turned into smaller pull requests and reviewed separately.

Read on to learn more. It’s a skill I definitely don’t have, so time to add that to my to-learn list.

Comments closed

Merging Database Project Changes

Olivier Van Steenlandt feeds changes into different branches:

When you start development, you create a feature branch, which is a living copy of your main branch where you apply changes during the development phase. As soon as you finalize your development, you want to get these changes to your development environment.

This process is called merging. During the merge process, 2 branches will be combined. At this point, we want our feature branch to be combined with the development branch.

Click through to learn how it all works.

Comments closed

One Repo for Every Environment

Meagan Longoria explains an important part of source control repositories:

I’ve seen a few people start Azure Data Factory (ADF) projects assuming that we would have one source control repo per environment, meaning that you would attach a Git repo to Dev, and another Git repo to Test and another to Prod.

Microsoft recommends against this, saying:

Read on for the citation as well as the practical reason why we don’t want multiple repos. This is true not only for Azure Data Factory but for every development project. You have one repository with branches. Certain branches represent checkpoints where code goes out to a specific environment via use of a release tool (e.g., Azure DevOps release pipelines, GitHub actions, etc.).

Comments closed