Category: Source Control

Distributing Notebooks

Grant Fritchey wants to know where to buy notebooks and notebook accessories:

I’m myopically focused at the moment on Azure Data Studio, but there are a lot of other places and ways to create or consume notebooks. However, I’m going to keep my focus.

The issue I’m running into is distributing the notebooks.

There are a lot of great comments. Before reading them, here’s my answer:

  • GitHub repos, like Grant mentions. They’re good, though I have the same feeling about a production notebook that I do about an SSIS package: notebooks are binaries (after a fashion). For pedagogical purposes, I’ll absolutely slap notebooks into GitHub, typically without data. But for a real data science project, those notebooks can get hefty when you store all of the data in them, and it’s really hard to diff the JSON to understand what changed.
  • Binder and Azure Notebooks are services which let you host notebooks remotely. Binder reads from a GitHub repo and spins up a virtual environment for you. Azure Notebooks lets you run notebooks (including F# notebooks) against free VMs in Azure, or you can use your own VM for more power. Azure Notebooks also lets you fork projects pretty easily. I haven’t used Google Colab, but it looks pretty similar to Azure Notebooks.
  • When you start up Jupyter Notebooks, you’re really starting a server. You can have a server running in your environment with your team’s notebooks. I’d probably still drop them in source control as well.
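
On the size and diffing problem from the first bullet, one option is to strip cell outputs before committing a notebook. Here is a minimal sketch, assuming the standard nbformat 4 JSON layout (a top-level `cells` list whose code cells carry `outputs` and `execution_count`). The function name `strip_outputs` and the sample notebook are my own illustrations, not something from Grant's post.

```python
import json


def strip_outputs(notebook: dict) -> dict:
    """Return a copy of a notebook dict with code-cell outputs removed.

    Assumes the nbformat 4 layout: notebook["cells"] is a list of cells,
    and code cells have "outputs" and "execution_count" keys.
    """
    cleaned = json.loads(json.dumps(notebook))  # deep copy via a JSON round trip
    for cell in cleaned.get("cells", []):
        if cell.get("cell_type") == "code":
            cell["outputs"] = []            # drop rendered results and embedded data
            cell["execution_count"] = None  # avoid noisy diffs from re-running cells
    return cleaned


if __name__ == "__main__":
    # Hypothetical in-memory notebook, used only for illustration.
    sample = {
        "nbformat": 4,
        "nbformat_minor": 5,
        "metadata": {},
        "cells": [
            {"cell_type": "markdown", "metadata": {}, "source": ["# Demo"]},
            {
                "cell_type": "code",
                "metadata": {},
                "source": ["1 + 1"],
                "execution_count": 7,
                "outputs": [{"output_type": "execute_result", "data": {"text/plain": ["2"]}}],
            },
        ],
    }
    print(json.dumps(strip_outputs(sample), indent=1, sort_keys=True))
```

The stripped copy keeps the code and markdown, which is the part worth diffing; the original dict is left untouched so you can still keep the full version locally.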

Forking GitHub Repos and Contributing to Open Source Projects

Rob Sewell takes us through the process of contributing to an open source project:

– Fork the repository into your own GitHub

– Clone the repository to your local machine

– Create a new branch for your changes

– Make some changes and commit them with useful messages

– Push the changes to your repository

– Create a Pull Request from your repository back to the original one

You will need to have git.exe available, which you can download and install from https://git-scm.com/downloads if required.
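
The command-line part of those steps can be sketched as a list of Git invocations. This is a rough illustration rather than Rob's exact commands: the function name `contribution_commands`, the URL, the directory, and the branch name are all hypothetical. Forking and opening the pull request happen on GitHub itself, so they are not in the list, and you make your edits between the `checkout` and `add` steps.

```python
import shlex
from typing import List


def contribution_commands(fork_url: str, repo_dir: str, branch: str, message: str) -> List[List[str]]:
    """Build the Git commands for the clone, branch, commit, and push steps.

    Nothing is executed here; the caller decides whether to run or just print them.
    """
    return [
        ["git", "clone", fork_url, repo_dir],                                  # clone your fork locally
        ["git", "-C", repo_dir, "checkout", "-b", branch],                     # new branch for the change
        ["git", "-C", repo_dir, "add", "--all"],                               # stage your edits
        ["git", "-C", repo_dir, "commit", "-m", message],                      # commit with a useful message
        ["git", "-C", repo_dir, "push", "--set-upstream", "origin", branch],   # push the branch to your fork
    ]


if __name__ == "__main__":
    # Hypothetical fork URL and names, for illustration only.
    commands = contribution_commands(
        "https://github.com/example-user/example-repo.git",
        "example-repo",
        "fix-readme-typo",
        "Fix typo in README",
    )
    for cmd in commands:
        print(" ".join(shlex.quote(part) for part in cmd))
```

Building the commands as argument lists rather than one shell string means a commit message with spaces or quotes stays intact if you later hand each list to `subprocess.run`.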

For bonus points, we learn that Shane O’Neill doesn’t use the Oxford comma.

A Git Cheat Sheet

Kendra Little has a cheat sheet for working with Git:

I created a cheat sheet for the Git Command Line Interface to go along with my Git tutorial for SQL Change Automation video. I find the Git CLI to be very friendly and easier to learn than a GUI interface.

Given the number of “How do I extricate myself from this Git mess?” messages in my company chat, I’m not sure I’d call the Git CLI friendly. Nonetheless, Kendra does a great job of putting together most of the common commands in an easy guide.

SQL Server Trends Worth Watching

Grant Fritchey follows up on a Kevin Hill tweet:

There are a million things to learn about in our rapidly shifting technological landscape, but I think this assessment, especially the way it was put, “no longer justify ignoring” really nails some of the fundamentals.

Let’s talk about why you can no longer ignore Docker, Git and DBATools either.

If you’re a DBA and aren’t familiar with Docker, Git, or DBATools, that’s a pretty good trio of things to spend some time learning. You can survive without them, but you’re more likely to thrive if you know them.

Storing Container Images in GitHub Package Registry

Andrew Pruski shows how we can use GitHub Package Registry to store private container images:

The GitHub Package Registry is available for beta testing and allows us to store container images in it, basically giving us the same functionality as the Docker Hub.

However, the Docker Hub only allows for one private repository per free account whereas the GitHub Package Registry is completely private! Let’s run through a simple demo to create a registry and upload an image.

It’s pretty easy to set up, so check it out.

Version Control and Power BI Desktop

Gilbert Quevauvilliers takes us through version control with PBIX files:

In the second part of my blog post I am going to detail how to use the version control with Power BI Desktop files.

This will include adding files, checking files in and out, viewing previous versions and reverting to previous versions.

If this is the first time you are reading this blog post, I would highly suggest reading Setting up Version Control for my Power BI Desktop Files (PBIX) with no additional Cost * | Part 1

In short, Gilbert treats PBIX files like any other data file. These can get kind of beefy, though, so I’ve also saved them as templates; that way, you get the structure without pulling in all of the data.

Reverting a Git Push

Stuart Moore takes us through backing out a commit in Git when you pushed to the wrong branch:

We’ve all done it. Working for ages tracking down that elusive bug in a project. Diligently committing away on our local repo as we make small changes. We’ve found the convoluted 50 lines of tortured logic, replaced it with 5 simple, easy-to-read lines of code, and all the tests have passed. So we push it back up to GitHub and wander off to grab a snack as a reward.

Halfway to the snacks you suddenly have a nagging doubt in the back of your mind that you don’t remember starting a new branch before starting on the bug hunt.

Read on for the process.

Finding Three-Part or Four-Part Names in SQL Server

Louis Davidson shows how we can find three-part or four-part naming in T-SQL code:

In order to make this work, one of the considerations is to eliminate cross database dependencies, as you can’t reference objects that don’t exist in views, and even in stored procedures, which offer delayed resolution of objects, you can’t test the code without the database it is referencing.

In addition, and somewhat more important to the process, is dealing with three part names that reference the name of the database your object is in. During the comparison process, the database can be created with a name that is different from your target database to compare to (referred to as a shadow database.) So if you are in database X and have references to X.schema.table, but the database is generated as X_Shadow, the X. is now a cross database reference rather than the local reference you are desiring.

Four part names to linked servers are a different sort of nightmare, but one that is (hopefully) exceedingly rare. The queries presented will help with this as well.

Louis has a few scripts to help you find these. If your code is in source control already, you could also build a regular expression to search through it.
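
As a rough sketch of that regular expression idea (this is my own illustration, not one of Louis's scripts), the pattern below looks for three-part and four-part names in T-SQL text. The names `IDENT`, `MULTIPART_NAME`, and `find_multipart_names` are hypothetical. It assumes identifiers are plain, bracketed, or double-quoted, and it knows nothing about comments, string literals, or `schema.table.column` references, so expect some false positives.

```python
import re

# One identifier: [bracketed], "quoted", or a plain name.
IDENT = r'(?:\[[^\]]+\]|"[^"]+"|[A-Za-z_][\w@#$]*)'

# An identifier, then one or two middle parts (which may be empty, as in
# db..table), then a final dotted identifier: three or four parts in total.
# The lookbehind stops a match from starting partway through a longer name.
MULTIPART_NAME = re.compile(
    rf'(?<![\w\]".]){IDENT}(?:\.(?:{IDENT})?){{1,2}}\.{IDENT}'
)


def find_multipart_names(sql: str) -> list:
    """Return every three-part or four-part name found in the SQL text."""
    return [match.group(0) for match in MULTIPART_NAME.finditer(sql)]


if __name__ == "__main__":
    # Made-up query for illustration.
    query = (
        "SELECT t.col FROM X.dbo.Orders AS t "
        "JOIN [Srv1].[Y].[dbo].[Customers] c ON c.id = t.id "
        "JOIN Z..Items i ON i.id = t.id"
    )
    for name in find_multipart_names(query):
        print(name)
```

Two-part references such as `t.col` are skipped because the pattern requires at least three parts. Pointing this at the files in a repository gives you a quick list of candidates to check by hand.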

Storing SQL Server Helm Charts in GitHub

Andrew Pruski shows how we can use GitHub to store Helm charts and access them easily:

In a previous post I ran through how to create a custom SQL Server Helm chart.

Now that the chart has been created, we need somewhere to store it.

We could keep it locally but what if we wanted to use our own Helm chart repository? That way we wouldn’t have to worry about deleting the chart on our local machine.

I use GitHub to store all my code to guard against accidentally deleting it (I’ve done that more than once), so why not use GitHub to store my Helm charts?

Cluster configurations are still code, and code belongs in source control.

Integrating Azure Data Factory With GitHub

Rayis Imayev shows us how to tie Azure Data Factory pipelines to GitHub, allowing automatic check-in based on ADF pipeline changes:

Working with Azure Data Factory (ADF) enables me to build and monitor my Extract Transform Load (ETL) workflows in Azure. My ADF pipelines are a cloud version of the ETL projects I previously built in SQL Server SSIS.

And prior to this point, all my sample ADF pipelines were developed in so-called “Live Data Factory Mode” using my personal workspace, i.e. all changes had to be published in order to be saved. This hasn’t been the best practice from my side, and I needed to start using a source control tool to preserve and version my development code.

Click through for a detailed demo.
