In the second part of my blog post I am going to detail how to use the version control with Power BI Desktop files.
This will include adding files, checking files in and out, viewing previous versions and reverting to previous versions.
If this is the first time you are reading this blog post, I would highly suggest reading Setting up Version Control for my Power BI Desktop Files (PBIX) with no additional Cost * | Part 1
In short, Gilbert treats PBIX file as any other data file. These can get kind of beefy, though, so I’ve also saved them as templates—that way, you get the structure without pulling in all of the data.
We’ve all done it. Working for ages tracking down that elusive bug in a project. Diligently committing away on our local repo as we make small changes. We’ve found the convoluted 50 lines of tortured logic, replaced it with 5 simple easy to read lines of code and all the test have passed. So we push it backup to github and wander off to grab some a snack as a reward
Halfway to the snacks you suddenly have a nagging doubt in the back of your mind that you don’t remember starting a new branch before starting on the bug hunt.
Read on for the process.
In order to make this work, one of the considerations is to eliminate cross database dependencies, as you can’t reference objects that don’t exist in views, and even in stored procedures, which offer delayed resolution of objects, you can’t test the code without the database it is referencing.
In addition, and somewhat more important to the process, is dealing with three part names that reference the name of the database your object is in. During the comparison process, the database can be created with a name that is different from your target database to compare to (referred to as a shadow database.) So if you are in database X and have references to X.schema.table, but the database is generated as X_Shadow, the X. is now a cross database reference rather than the local reference you are desiring.
Four part names to linked servers are a different sort of nightmare, but one that is (hopefully) exceedingly rare. The queries presented will help with this as well.
Louis has a few scripts to help you find these. If your code is in source control already, you could also build a regular expression to search through it.
In a previous post I ran through how to create a custom SQL Server Helm chart.
Now that the chart has been created, we need somewhere to store it.
We could keep it locally but what if we wanted to use our own Helm chart repository? That way we wouldn’t have to worry about deleting the chart on our local machine.
I use Github to store all my code to guard against accidentally deleting it (I’ve done that more than once) so why not use Github to store my Helm charts?
Cluster configurations are still code, and code belongs in source control.
Working with Azure Data Factory (ADF) enables me to build and monitor my Extract Transform Load (ETL) workflows in Azure. My ADF pipelines is a cloud version of previously used ETL projects in SQL Server SSIS.
And prior to this point, all my sample ADF pipelines were developed in so-called “Live Data Factory Mode” using my personal workspace, i.e. all changes had to be published in order to be saved. This hasn’t been the best practice from my side, and I needed to start using a source control tool to preserve and version my development code.
Click through for a detailed demo.
There are a lot of source control applications and software, everyone has its pros and cons, but personally, I like to use GitHub, since it is free to use and since it was recently acquired by Microsoft, support for other products is easier (SQL Server for this case).
On this post, I will show you how to implement a source control for a database using GitHub and Azure Data Studio (ADS).
Click through for the step-by-step instructions.
To clarify, Azure DevOps Server 2019 is the new name for TFS. Furthermore, it’s now rebranded and updated to look and feel like the Azure DevOps service. Which is the new name for Visual Studio team Services (VSTS) in Azure.
Therefore, I decided to do this because I am currently involved with Azure DevOps and using it with SQL Server deployments. Hence, I decided I wanted a local copy of my own to test things.
Click through for a couple of interesting findings, especially around required (and disallowed) versions of prerequisites like SQL Server versions.
If I could teach SQL to analysts who plan to work in industry data science, I’d start by sharing a few SQL Truths I’ve learned, and why I recommend tracking SQL queries in git. Here goes:
- You will *always* need that query again
- Queries are living artifacts that change over time
- If it’s useful to you, it’s useful to others (and vice versa)
Focusing on these points has led to my continuous adoption of a query library — a git repository for saving and sharing commonly (and uncommonly) used queries, all while tracking any changes made to these queries over time.
This is separate from keeping database objects (like table or procedure definitions) in source control.
But wait, I would need to learn an additional tool?
Yes, but don’t panic! Git is a tool with various commands that you can use to help track your changes. Luckily, you don’t need to know too many commands in Git to use the basic functionality. As an added bonus, using Git with RStudio takes away some of the burden of knowing Git commands by including buttons for common actions.
As with any tool that you pick up to help your scientific workflows, there is some upfront work before you can start seeing the benefits. Don’t let that deter you. Git can be very easy once you get the gist. Think about the benefits of being able to track changes: you can make some changes, have a record of that change and who made it, and you can tie that change to a specific problem that was reported or feature request that was noted.
It’s still code, and you gain a lot by keeping code in source control.
I almost always have trouble remembering which option is for use with a non-empty folder of “here’s a bunch of files that I want to dump into the repo to start with”, vs. “here’s an empty folder where I want to pull down the contents of an existing repo”. Fortunately, Tortoise yells at you if you try to do the latter — which is
Export— into a non-empty folder. So we want to
Import. Assuming you have a folder where all your SQL scripts live already, right-clicky and say “Tortoise SVN .. Import.”
Check it out. The only concern I have is that this source control is just local source control. That’s very helpful in situations where you accidentally mess something up, but my preference is to put my code in the same source control system the developers are using. And if the developers aren’t using source control, get that institution in place as soon as possible because that’s begging for trouble.