Contributing To Open Source: Understanding GitHub

Andy Levy has a great guide showing how to pull the dbatools repo from GitHub:

I’m putting this together here for my own reference and to hopefully write it up in a way that helps things “click” for some people who need that extra nudge to get into “aha!” territory. A number of the examples I’ve seen elsewhere have mixed the command-line and GUI clients, but the more I use git GUIs, the less I like them for the basic workflow. You only need to know a handful of commands to be productive and for that, the command line beats the GUI in my opinion.

So, here we go. My GitHub workflow for working on dbatools, with as much command-line work as possible. This walk-through assumes basic familiarity with source control concepts.

This is a great guide for people who are not familiar with Git.

Database Code Analysis

William Brewer has an interesting article on performing code analysis on database objects:

In general, code analysis is not just a help to the individual developer but can be useful to the entire team. This is because it makes the state and purpose of the code more visible, so that it allows everyone who is responsible for delivery to get a better idea of progress and can alert them much earlier to potential tasks and issues further down the line. It also makes everyone more aware of whatever coding standards are agreed, and what operational, security and compliance constraints there are.

Database Code analysis is a slightly more complicated topic than static code analysis as used in Agile application development. It is more complicated because you have the extra choice of dynamic code analysis to supplement static code analysis, but also because databases have several different types of code that have different conventions and considerations. There is DML (Data Manipulation Language), DDL (Data Definition Language), DCL (Data Control Language) and TCL (Transaction Control Language).  They each require rather different analysis.

William goes on to include a set of good resources, though I think database code analysis, like database testing, is a difficult job in an under-served area.
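
To make the four categories concrete, here is a minimal sketch of sorting statements into DML, DDL, DCL and TCL by their leading keyword. The keyword map and the classify_statement name are my own assumptions for illustration, not anything from William's article, and a real analyzer would parse the SQL rather than look at the first word.

```python
# Hypothetical keyword-to-category map (an assumption for illustration).
# SELECT is counted as DML here, matching the four-category split above.
CATEGORIES = {
    "SELECT": "DML", "INSERT": "DML", "UPDATE": "DML", "DELETE": "DML", "MERGE": "DML",
    "CREATE": "DDL", "ALTER": "DDL", "DROP": "DDL", "TRUNCATE": "DDL",
    "GRANT": "DCL", "REVOKE": "DCL", "DENY": "DCL",
    "COMMIT": "TCL", "ROLLBACK": "TCL",
}


def classify_statement(sql: str) -> str:
    """Return DML, DDL, DCL, TCL, or UNKNOWN based on the leading keyword."""
    words = sql.strip().rstrip(";").split()
    if not words:
        return "UNKNOWN"
    first = words[0].upper()
    if first == "BEGIN":
        # BEGIN only counts as TCL when it opens a transaction;
        # a bare BEGIN ... END block is control flow, so it stays UNKNOWN.
        second = words[1].upper() if len(words) > 1 else ""
        return "TCL" if second in ("TRAN", "TRANSACTION") else "UNKNOWN"
    return CATEGORIES.get(first, "UNKNOWN")


if __name__ == "__main__":
    for stmt in ("select * from t", "CREATE TABLE t (id int)", "BEGIN TRANSACTION"):
        print(stmt, "->", classify_statement(stmt))
```

Once statements are bucketed this way, each bucket can be handed to its own set of checks, which is the point of the quote: the four kinds of code call for rather different analysis.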

Stored Procedures Are Code

Rob Farley hates hearing that stored procedures don’t need to go into source control:

Hearing this is one of those things that really bugs me.

And it’s not actually about stored procedures, it’s about the mindset that sits there.

I hear this sentiment in environments where there are multiple developers. Where they’re using source control for all their application code. Because, you know, they want to make sure they have a history of changes, and they want to make sure two developers don’t change the same piece of code, maybe they even want to automate builds, all those good things.

But checking out code and needing it to pass all those tests is a pain. So if there’s some logic that can be put in a stored procedure, then that logic can be maintained outside the annoying rigmarole of source control. I guess this is appealing because developers are supposed to be creative types, and should fight against the repression, fight against ‘the man’, fight against control.

When I come across this mindset, I worry a lot.

Read on for Rob’s set of worries, and hie thee to the source control repository.  It really doesn’t matter which source control product you use (ideally, the same one that developers use for their app code), just as long as it’s in source control.

CI With SQL Server And Jenkins

Chris Adkin shows how to auto-deploy SQL Server Data Tools projects to a SQL Server instance using Jenkins:

The aim of this blog post is twofold, it is to explain how:

  • A “Self building pipeline” for the deployment of a SQL Server Data Tools project can be implemented using open source tools
  • A build pipeline can be augmented using PowerShell

What You Will Need

  • Jenkins automation server
  • cURL
  • SQL Server 2016 (any edition will suffice)
  • Visual Studio 2015 Community edition
  • A Windows server, physical or virtual, to install all of the above on; I will be using Windows Server 2012 R2 as the operating system

Automated integration via CI is extremely helpful, and Chris makes it look easy in this post.

Temporal Tables For R Source Control

Tomaz Kastrun shares an unorthodox way of collecting historical R code changes:

I will not comment on the solution Bob provided, since I don’t know how their infrastructure, roles, and security are set up. At this point, I am grateful for his comment. What I will say is that there is no straightforward way or out-of-the-box solution. Furthermore, if your R code requires any additional packages, storing the packages with your R code is not that bad an idea, regardless of traffic or disk overhead. And versioning the R code is something that is for sure needed.

To continue from the previous post: getting or capturing R code once it gets to Launchpad is tricky. So storing R code in a database table or on the file system seems a better idea.

It’s an interesting concept.  My preference is to use R Tools for Visual Studio and a more traditional source control mechanism.  It involves keeping source control up to date, but that’s a good practice to follow in any case.

Local Filesystem Source Control

Steve Jones shares a tale of woe related to source control neglect:

One of the first things I found was that all our stored procedures in the production server were encrypted. I wasn’t sure why, since we hosted our machines, but that wasn’t a big deal.

Until it was.

One day we had an issue on one of our SQL Server 2000 servers (we had two, supposedly identical). In troubleshooting and putting some sample data in both systems for a fake customer, we got different results. Hmmm, not what I wanted to see.

I checked the VCS (SourceSafe at the time) and checked out the code. I then loaded my test data and … got a third, different result. Now I was concerned as this was a production bug that was delaying work for a customer.

I’ve become convinced over the past few years that having all of your code in source control (including database scripts!) is a key differentiator between a good work situation and a bad work situation.

Databases In Source Control

Robert Sheldon walks through core concepts of source control:

The source control system can’t merge the two file versions until Barb resolves the conflict between black bear and brown bear (the additions of wolf and fox still cause no problem).

When conflicts of this nature arise, someone must examine the comparison and determine which version of bear should win out. In this case, Barb decides to go with black bear.

It’s worth considering the risk associated with this merge process. Barb’s commit fails, so she can’t save her changes to the repository until she can successfully perform a merge. If something goes wrong with the merge operation, she risks losing her changes entirely. This might be a minor problem for small textual changes like these, but a big problem if she’s trying to merge in substantial and complex changes to application logic. This is why the source control mantra is: commit small changes often.

The article is more of an intro to source control, but if you aren’t familiar with how source control works, it’s a great read.  Regardless, the best thing you can do for yourself is to get your database code in source control.  That opens up the possibility for safer refactoring of code.
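
For anyone who wants to see why the bear edit conflicts while wolf and fox do not, here is a rough sketch of the three-way merge rule. It assumes each file is a dictionary of named entries rather than lines of text, and the three_way_merge function and sample data are hypothetical; real source control systems work on text hunks and are considerably more involved.

```python
def three_way_merge(base, mine, theirs):
    """Merge two edited copies against their common ancestor.

    Returns (merged, conflicts). An entry conflicts only when both
    sides changed it and they changed it differently.
    """
    merged, conflicts = {}, []
    for key in sorted(set(base) | set(mine) | set(theirs)):
        b, m, t = base.get(key), mine.get(key), theirs.get(key)
        if m == t:        # both sides agree (or neither touched it)
            value = m
        elif m == b:      # only their side changed it
            value = t
        elif t == b:      # only my side changed it
            value = m
        else:             # both changed it differently
            conflicts.append(key)
            continue
        if value is not None:  # None means the entry was deleted
            merged[key] = value
    return merged, conflicts


if __name__ == "__main__":
    # Hypothetical data loosely modeled on the bear example above.
    base = {"bear": "bear"}
    barb = {"bear": "black bear", "wolf": "wolf"}
    other = {"bear": "brown bear", "fox": "fox"}
    merged, conflicts = three_way_merge(base, barb, other)
    print(merged)
    print(conflicts)
    # Someone has to pick a winner for each conflicting entry.
    merged["bear"] = "black bear"
```

The smaller each commit is, the fewer entries end up in that conflicts list, which is the practical argument behind committing small changes often.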

Git Introduction

Allison Tharp has an introduction to Git:

Git is a version control system (VCS), which is just what it sounds like: a system to help keep track of different versions of software.  Git isn’t the only VCS out there (others include CVS, SVN, and Fossil), but it is one of the more popular systems, particularly for open source projects.  You’ve certainly used software that was developed using Git (Firefox and Chrome are two big ones!).

Version control is really helpful when you are working with other people.  Without version control, if I send you a file I’m working on and you make changes to it, we would suddenly have two versions.  If I integrate your changes into my file, then we’d only have one file but no history!  Even when working alone, version control is really helpful for us to keep track of how the project is moving along.

Understanding at least one source control platform is vital for software development.  Git can be like pulling teeth (and then there are the times when it gets really painful), but if you are developing software (even personal scripts!) and don’t have source control in place, you’re walking a tightrope without a net.

Database Project Basics

James Anderson gives a basic overview of database projects within Visual Studio:

SSDT is a VS plugin that can script out a database into individual files so that you can use a VCS (I use Git) to version control them. Once those scripts are in my Git repo, I can use it as the single source of truth to generate my releases from. This is the basis of getting our databases into our CI process. ReadyRoll will be used to further improve this process and to add our migration/upgrade scripts to our repo. SSDT is required by ReadyRoll and can be found here.

Before we can start with ReadyRoll, we need to learn some Visual Studio basics.

I’ve used database projects for the better part of a decade.  They aren’t perfect but in most environments, they’re quite helpful…if other people use them as well…

Use Source Control

James Anderson wants you to use source control:

SSC and SSDT require the use of compare tools to build deployment scripts. This is referred to as a state based migration. I’d done deployments like this in the past and saw that people reviewing the release found it difficult to review these scripts when the changes were more than trivial. For this reason, I decided to look at some migration based solutions. Migration solutions generate scripts during the development process that will be used to deploy changes to production. This allows the developer to break the changes down into small manageable individual scripts which in turn makes code reviews easier and deployments feel controlled. These scripts sit in the VS project and are therefore source controlled in the same way as the database.

James recommends Git here.  I’m not Git’s biggest fan, but it’s much, much better than not having any source control at all.
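
As a minimal sketch of the migration-based idea, the snippet below applies small, ordered scripts and records which ones have already run. It uses Python's built-in sqlite3 module purely so it is runnable anywhere; the schema_version table, the MIGRATIONS list, and the apply_migrations function are assumptions of mine, not how ReadyRoll or any particular tool does it.

```python
import sqlite3

# Hypothetical migrations as (version, script) pairs. In a real project
# each script would be a small file kept in source control.
MIGRATIONS = [
    (1, "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"),
    (2, "ALTER TABLE customer ADD COLUMN email TEXT"),
]


def apply_migrations(conn, migrations):
    """Run every migration not yet recorded, in version order.

    Returns the list of versions applied during this call.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER PRIMARY KEY)"
    )
    applied = {row[0] for row in conn.execute("SELECT version FROM schema_version")}
    newly_applied = []
    for version, script in sorted(migrations):
        if version in applied:
            continue  # already deployed to this database
        with conn:  # commits after each migration is recorded
            conn.execute(script)
            conn.execute(
                "INSERT INTO schema_version (version) VALUES (?)", (version,)
            )
        newly_applied.append(version)
    return newly_applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    print(apply_migrations(conn, MIGRATIONS))  # first run applies everything
    print(apply_migrations(conn, MIGRATIONS))  # second run has nothing to do
```

Each script is small enough to review on its own, and the same scripts that ran in development are the ones that run in production, which is the contrast with generating one large diff script from a compare tool.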
