Press "Enter" to skip to content

Category: Source Control

Storing Container Images in GitHub Package Registry

Andrew Pruski shows how we can use GitHub Package Registry to store private container images:

The GitHub Package Registry is available for beta testing and allows us to store container images in it, basically giving us the same functionality as the Docker Hub.

However the Docker Hub only allows for one private repository per free account whereas the Github package registry is completely private! Let’s run through a simple demo to create a registry and upload an image.

It’s pretty easy to set up, so check it out.

Comments closed

Version Control and Power BI Desktop

Gilbert Quevauvilliers takes us through version control with PBIX files:

In the second part of my blog post I am going to detail how to use the version control with Power BI Desktop files.

This will include adding files, checking files in and out, viewing previous versions and reverting to previous versions.

If this is the first time you are reading this blog post, I would highly suggest reading Setting up Version Control for my Power BI Desktop Files (PBIX) with no additional Cost * | Part 1

In short, Gilbert treats PBIX file as any other data file. These can get kind of beefy, though, so I’ve also saved them as templates—that way, you get the structure without pulling in all of the data.

Comments closed

Reverting a Git Push

Stuart Moore takes us through backing out a commit in Git when you pushed to the wrong branch:

We’ve all done it. Working for ages tracking down that elusive bug in a project. Diligently committing away on our local repo as we make small changes. We’ve found the convoluted 50 lines of tortured logic, replaced it with 5 simple easy to read lines of code and all the test have passed. So we push it backup to github and wander off to grab some a snack as a reward

Halfway to the snacks you suddenly have a nagging doubt in the back of your mind that you don’t remember starting a new branch before starting on the bug hunt.

Read on for the process.

Comments closed

Finding Three-Part or Four-Part Names in SQL Server

Louis Davidson shows how we can find three-part or four-part naming in T-SQL code:

In order to make this work, one of the considerations is to eliminate cross database dependencies, as you can’t reference objects that don’t exist in views, and even in stored procedures, which offer delayed resolution of objects, you can’t test the code without the database it is referencing.

In addition, and somewhat more important to the process, is dealing with three part names that reference the name of the database your object is in. During the comparison process, the database can be created with a name that is different from your target database to compare to (referred to as a shadow database.) So if you are in database X and have references to X.schema.table, but the database is generated as X_Shadow, the X. is now a cross database reference rather than the local reference you are desiring.

Four part names to linked servers are a different sort of nightmare, but one that is (hopefully) exceedingly rare. The queries presented will help with this as well.

Louis has a few scripts to help you find these. If your code is in source control already, you could also build a regular expression to search through it.

Comments closed

Storing SQL Server Helm Charts in GitHub

Andrew Pruski shows how we can use GitHub to store Helm charts and access them easily:

In a previous post I ran through how to create a custom SQL Server Helm chart.

Now that the chart has been created, we need somewhere to store it.

We could keep it locally but what if we wanted to use our own Helm chart repository? That way we wouldn’t have to worry about deleting the chart on our local machine.

I use Github to store all my code to guard against accidentally deleting it (I’ve done that more than once) so why not use Github to store my Helm charts?

Cluster configurations are still code, and code belongs in source control.

Comments closed

Integrating Azure Data Factory With GitHub

Rayis Imayev shows us how to tie Azure Data Factory pipelines with GitHub, allowing automatic check-in based on ADF pipeline changes:

Working with Azure Data Factory (ADF) enables me to build and monitor my Extract Transform Load (ETL) workflows in Azure. My ADF pipelines is a cloud version of previously used ETL projects in SQL Server SSIS.

And prior to this point, all my sample ADF pipelines were developed in so-called “Live Data Factory Mode” using my personal workspace, i.e. all changes had to be published in order to be saved. This hasn’t been the best practice from my side, and I needed to start using a source control tool to preserve and version my development code.

Click through for a detailed demo.

Comments closed

Integrating Azure Data Studio With GitHub

Eduardo Pivaral shows how to use Azure Data Studio to push to a Git repository on GitHub:

There are a lot of source control applications and software, everyone has its pros and cons, but personally, I like to use GitHub, since it is free to use and since it was recently acquired by Microsoft, support for other products is easier (SQL Server for this case).

On this post, I will show you how to implement a source control for a database using GitHub and Azure Data Studio (ADS).

Click through for the step-by-step instructions.

Comments closed

Installing Azure DevOps Server 2019

Kevin Chant takes a look at Azure DevOps Server 2019 RC1:

To clarify, Azure DevOps Server 2019 is the new name for TFS. Furthermore, it’s now rebranded and updated to look and feel like the Azure DevOps service. Which is the new name for Visual Studio team Services (VSTS) in Azure.
Therefore, I decided to do this because I am currently involved with Azure DevOps and using it with SQL Server deployments. Hence, I decided I wanted a local copy of my own to test things.

Click through for a couple of interesting findings, especially around required (and disallowed) versions of prerequisites like SQL Server versions.

Comments closed

Using Source Control For Those Database Queries

Caitlin Hudon shares a few SQL Truths and explains why tracking database queries in source control is important:

If I could teach SQL to analysts who plan to work in industry data science, I’d start by sharing a few SQL Truths I’ve learned, and why I recommend tracking SQL queries in git. Here goes:

  1. You will *always* need that query again
  2. Queries are living artifacts that change over time
  3. If it’s useful to you, it’s useful to others (and vice versa)

Focusing on these points has led to my continuous adoption of a query library — a git repository for saving and sharing commonly (and uncommonly) used queries, all while tracking any changes made to these queries over time.

This is separate from keeping database objects (like table or procedure definitions) in source control.

Comments closed

Your R Code Should Be In Source Control Too

Lindsay Carr explains the importance of storing your R code in source control:

But wait, I would need to learn an additional tool?

Yes, but don’t panic! Git is a tool with various commands that you can use to help track your changes. Luckily, you don’t need to know too many commands in Git to use the basic functionality. As an added bonus, using Git with RStudio takes away some of the burden of knowing Git commands by including buttons for common actions.

As with any tool that you pick up to help your scientific workflows, there is some upfront work before you can start seeing the benefits. Don’t let that deter you. Git can be very easy once you get the gist. Think about the benefits of being able to track changes: you can make some changes, have a record of that change and who made it, and you can tie that change to a specific problem that was reported or feature request that was noted.

It’s still code, and you gain a lot by keeping code in source control.

Comments closed