I will not comment on the solution Bob provided, since I don’t know how their infrastructure, roles, security is set up. At this point, I am grateful for his comment. But what I will comment, is that there is no straightforward way or any out-of-the-box solution. Furthermore, if your R code requires any additional packages, storing the packages with your R code is not that bad idea, regardless of traffic or disk overhead. And versioning the R code is something that is for sure needed.
To continue from previous post, getting or capturing R code, once it gets to Launchpad, is tricky. So storing R code it in a database table or on file system seems a better idea.
It’s an interesting concept. My preference is to use R Tools for Visual Studio and a more traditional source control mechanism. It involves keeping source control up to date, but that’s a good practice to follow in any case.
One of the first things I found was that all our stored procedures in the production server were encrypted. I wasn’t sure why, since we hosted our machines, but that wasn’t a big deal.
Until it was.
One day we had an issue on one of our SQL Server 2000 servers (we had two, supposedly identical). In troubleshooting and putting some sample data in both systems for a fake customer, we got different results. Hmmm, not what I wanted to see.
I checked the VCS (SourceSafe at the time) and checked out the code. I then loaded my test data and … got a third, different result. Now I was concerned as this was a production bug that was delaying work for a customer.
I’ve become convinced over the past few years that having all of your code in source control (including database scripts!) is a key differentiator between a good work situation and a bad work situation.
The source control system can’t merge the two file versions until Barb resolves the conflict between black bear and brown bear (the additions of wolf and fox still cause no problem).
When conflicts of this nature arise, someone must examine the comparison and determine which version of bear should win out. In this case, Barb decides to go with black bear.
It’s worth considering the risk associated with this merge process. Barb’s commit fails, so she can’t save her changes to the repository until she can successfully perform a merge. If something goes wrong with the merge operation, she risks losing her changes entirely. This might be a minor problem for small textual changes like these, but a big problem if she’s trying to merge in substantial and complex changes to application logic. This is why the source control mantra is: commit small changes often.
The article is more of an intro to source control, but if you aren’t familiar with how source control works, it’s a great read. Regardless, the best thing you can do for yourself is to get your database code in source control. That opens up the possibility for safer refactoring of code.
Git is a version control system (VCS), which is just what it sounds like: a system to help keep track of different versions of software. Git isn’t the only VCS out there (others include CVS, SVN, and Fossil), but it is one of the more popular systems, particularly for open source projects. You’ve certainly used software that was developed using Git (Firefox and Chrome are two big ones!).
Version control is really helpful when you are working with other people. Without version control, if I send you a file I’m working on and you make changes to it, we would suddenly have two versions. If I integrate your changes into my file, then we’d only have one file but no history! Even when working alone, version control is really helpful for us to keep track of how the project is moving along.
Understanding at least one source control platform is vital for software development. Git can be like pulling teeth (and then there are the times when it gets really painful), but if you are developing software (even personal scripts!) and don’t have source control in place, you’re walking a tightrope without a net.
SSDT is a VS plugin that can script out a database into individual files so that you can us a VCS (I use Git) to version control them. Once those scripts are in my Git repo, I can use it as the single source of truth to generate my releases from. This is the basis of getting our databases into our CI process. ReadyRoll will be used to further improve this process and to add our migration/upgrade scripts to our repo. SSDT is required by ReadyRoll and can be found here.
Before we can start with ReadyRoll, we need to learn some Visual Studio basics.
I’ve used database projects for the better part of a decade. They aren’t perfect but in most environments, they’re quite helpful…if other people use them as well…
SSC and SSDT require the use of compare tools to build deployment scripts. This is referred to as a state based migration. I’d done deployments like this in the past and saw that people reviewing the release found it difficult to review these scripts when the changes were more than trivial. For this reason, I decided to look at some migration based solutions. Migration solutions generate scripts during the development process that will be used to deploy changes to production. This allows the developer to break the changes down into small manageable individual scripts which in turn makes code reviews easier and deployments feel controlled. These scripts sit in the VS project and are therefore source controlled in the same way as the database.
James recommends Git here. I’m not Git’s biggest fan, but it’s much, much better than not having any source control at all.
If you are migrating, but want to keep the databases on SQL 2012 Express, then you can skip this part. I wanted them moved to my SQL 2014 instance. So I did a traditional backup/restore from SQL 2012 Express to SQL 2014. I took new backups of the SQL 2014 databases, and then uninstalled SQL 2012 Express. Then I had to configure TFS to connect to a different SQL instance. Within the web.config file (%ProgramFiles%\Microsoft Team Foundation Server 12.0\Application Tier\Web Services\web.config), I found an application setting named “applicationDatabase”. I made a backup copy of web.config first, then I changed the “applicationDatabase” value. It should be in a recognizable format if you’re familiar with SQL Server connection strings. You can also make this change within IIS. It was there that I noticed a few other settings that contained SQL connection strings. Check out the following in IIS and change settings as needed:
Dave has lots of screen shots to make the process easier to understand, but my main takeaway is that for the most part, migrating TFS is a huge pain…
I’ve always advocated a dev/main/release process, but I’ll admit this has weaknesses, not least that testing will usually only take place properly in one branch, and that bugs found in one branch may not find there way “back” or “forward”, but to go with one branch means that you are forced to keep the quality at production-code quality and make use of feature switches. Certainly it’s an ambitious way of working, and Microsoft’s ALM documentation suggests that no branches is reserved for smaller teams, but surely if the SQL team at Microsoft are able to do it then certainly it’s a branching strategy worth considering?
Read the linked article as well. This is an interesting look from the inside of how SQL Server gets developed.
In our environment, changes made in the Test branch have to travel through the Main branch and into the Release branch to be deployed into production. Sometimes changes need to move through quickly without regard to other changes, especially in an environment where there may be a single coder. Note: the following code will merge all checked in code regardless of who checked it in. Be careful in multi-coder environments.
This is your daily public service announcement saying that if you don’t have your database code in source control, you really should get your database code into source control.
While the choice of version control system is a very subjective and personal one, there are a number of reasons why I think Mercurial is an excellent choice, especially for someone just venturing into the world of version control. Here are a number of those reasons.
I prefer Mercurial to Git because I think the Mercurial commands are a bit easier to understand (although the proliferation of non-terrible Git GUIs has mitigated this somewhat). Nevertheless, if you love Git, use Git; if you love Mercurial, use Mercurial; if you love SVN, use SVN; if you love Visual Source Safe…use Mercurial or Git…