Rolf Tesmer explains that machine learning and DevOps aren’t oil and water (or maybe they are and we just need to stir harder):
In talking with various development teams, customers, and DevOps engineers, a lot of the potential problems of meshing ML development into an enterprise DevOps process can be boiled down to a few different areas, which this post aims to address…
– ML stack might be different from rest of the application stack
– Testing accuracy of ML model
– ML code is not always version controlled
– Hard to reproduce models (i.e., explainability)
– Need to re-write featurizing + scoring code into different languages
– Hard to track breaking changes
– Difficult to monitor models & determine when to retrain
So DevOps helps with this, right? Right?
Well er, some of them yes, but not all.
DevOps is not a panacea but it can solve certain types of problems well.
When I first started with VSTS and ultimately Azure DevOps, I went through many failed builds because the jobs in a pipeline don’t necessarily run in the order in which you’ve built them, or in the order you’d logically expect them to run. The image below shows two Build Pipeline jobs, but when the build is queued, whether manually or via CI, the second job runs before Job #1. In this example the build will fail because Job #2 deploys a dacpac to a SQL Server on a Linux Docker container (using an Ubuntu agent host), but obviously this cannot happen until the dacpac has been created in Job #1, which runs on a VS2017 agent host:
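The usual cure is to give the deploy job an explicit dependency on the build job so it cannot start first. To see why the ordering matters, here is a minimal PowerShell sketch of what a Job #2 deploy step might look like; the artifact path, database name, and credentials are all assumptions for illustration, not the post’s actual script:

```powershell
# Hypothetical Job #2 step: deploy a dacpac produced by Job #1.
# If Job #2 runs first, this guard is exactly where the build fails.
$dacpac = Join-Path $env:BUILD_ARTIFACTSTAGINGDIRECTORY 'MyDatabase.dacpac'

if (-not (Test-Path $dacpac)) {
    throw "dacpac not found at $dacpac - Job #1 has not produced it yet."
}

# Publish the dacpac to the SQL Server instance running in the Linux container
& SqlPackage.exe /Action:Publish `
    "/SourceFile:$dacpac" `
    /TargetServerName:'localhost,1433' `
    /TargetDatabaseName:'MyDatabase' `
    /TargetUser:'sa' `
    "/TargetPassword:$env:SA_PASSWORD"
```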
Click through to see how it’s done.
There is so much that is wrong with that conversation.
We could talk about the bottlenecks and the large amount of work in progress backed up in test – and the ways that could be fixed,
We could talk about the infrequent ‘big bang’ release in three months and the manual, error-prone heroics that will probably be required to deliver it – and the ways that could be fixed,
We could talk about the misguided approach regarding branching strategies or the shared development database – and the ways they could be fixed,
We could talk about testing silos and the likelihood of drift and inconsistencies between different environments – and the ways they could be fixed,
We could talk about the word “DevOps-ing” – and why it should be burned along with anyone who uses it un-ironically. (And anyone who uses the word “irony” inappropriately or puts their commas at the end of the line.),
But I’m not going to talk about any of those things. I’m not going to talk about any of the things the customer said. I’m not going to talk about any of the technical issues or the possible solutions to those problems.
I’m going to talk about something much, much more important.
Read on to see what is much, much more important: culture.
Problem 1: Image Tag
There is no image tag specified for the microsoft/mssql-server-linux image; therefore, if Microsoft pushes a newer version of the image to Docker Hub, that newer version will be pulled down when the build pipeline runs. This is easily fixed by tagging the image with an explicit version, e.g., microsoft/mssql-server-linux:2017-GA.
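As a minimal illustration of that fix, pinning both the pull and the run to the explicit tag named in the post (the container name and SA password here are placeholder assumptions):

```powershell
# Pull the explicitly tagged image rather than implicitly taking :latest
docker pull microsoft/mssql-server-linux:2017-GA

# Run the container from the pinned tag so the build is repeatable
docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=yourStrong(!)Password' `
    -p 1433:1433 --name sql2017 -d microsoft/mssql-server-linux:2017-GA
```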
Click through for the starting code, two additional issues, and the corrected code.
The base return is the TSqlFragment object, which in turn has a Batches object, which in turn holds… well, it can hold a lot of different things. When the text is parsed, the parser determines what type of object to return based on what kind of statement it finds. For example, an insert statement yields a certain type of object with a given set of properties and methods, while something like a create index statement gives you different properties, such as which table or view is getting the index, along with the indexed columns and included columns. It really is interesting.
But interesting can be a double-edged sword: since the statement object that gets returned can be different for each parsed piece of code, building any sort of intelligence around the statements we’re dealing with means checking for very specific object types.
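To make that concrete, here is a minimal PowerShell sketch of parsing a statement and testing for one specific fragment type. The assembly path and the TSql140Parser version are assumptions that depend on your SSDT/DacFx installation:

```powershell
# Load ScriptDom - the path is an assumption; adjust to your SSDT/DacFx install
Add-Type -Path 'C:\Program Files\Microsoft SQL Server\140\DAC\bin\Microsoft.SqlServer.TransactSql.ScriptDom.dll'

$parser = New-Object Microsoft.SqlServer.TransactSql.ScriptDom.TSql140Parser($true)
$reader = New-Object System.IO.StringReader('CREATE INDEX IX_Bar ON dbo.Foo (Bar) INCLUDE (Baz);')

$errors = $null
$fragment = $parser.Parse($reader, [ref]$errors)

foreach ($batch in $fragment.Batches) {
    foreach ($statement in $batch.Statements) {
        # The concrete type differs per statement, so we test for the one we care about
        if ($statement -is [Microsoft.SqlServer.TransactSql.ScriptDom.CreateIndexStatement]) {
            'Index {0} on {1}' -f $statement.Name.Value, $statement.OnName.BaseIdentifier.Value
        }
        else {
            'Some other statement type: {0}' -f $statement.GetType().Name
        }
    }
}
```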
Unfortunately, I never got past the first animated GIF, whose subtitle was wrong. You, however, should read the whole thing.
The answer is via Azure Automation.
At a high level, this is what I did:
Create an Automation Account.
Create a credential.
Create a PowerShell Runbook which has the code for index rebuilds (a minimal sketch follows this list).
Create a schedule and link it to the above.
Configure parameters within the schedule (if any).
Configure logging level (if desired).
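Here’s a rough, hypothetical sketch of what the runbook in step 3 might look like. The credential name, server, database, and the use of Invoke-Sqlcmd (which requires the SqlServer module to be imported into the Automation account) are all assumptions, not the author’s actual code:

```powershell
# A hypothetical index-rebuild runbook body
param (
    [string] $SqlServer      = 'myserver.database.windows.net',
    [string] $Database       = 'MyDatabase',
    [string] $CredentialName = 'SqlCredential'
)

# Fetch the credential created in step 2 from the Automation account
$cred = Get-AutomationPSCredential -Name $CredentialName

# Build one REBUILD statement per index (heaps excluded) and execute the batch
$query = @'
DECLARE @sql nvarchar(max) = N'';
SELECT @sql += N'ALTER INDEX ' + QUOTENAME(i.name) + N' ON '
            + QUOTENAME(s.name) + N'.' + QUOTENAME(t.name) + N' REBUILD;'
FROM sys.indexes i
JOIN sys.tables  t ON t.object_id = i.object_id
JOIN sys.schemas s ON s.schema_id = t.schema_id
WHERE i.type > 0;
EXEC sp_executesql @sql;
'@

# Invoke-Sqlcmd comes from the SqlServer module, imported into the Automation account
Invoke-Sqlcmd -ServerInstance $SqlServer -Database $Database `
    -Username $cred.UserName -Password $cred.GetNetworkCredential().Password `
    -Query $query -QueryTimeout 3600
```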
Click through for the detailed steps.
Monitoring changes a bit with DevOps. It’s less about a single tier and more about the entire infrastructure: monitoring the application, the host, the database, and the availability between each is essential. As these different tiers rarely come from one vendor, and many may even be proprietary, you end up monitoring with multiple tools, scripts, and interfaces.
Two of the main monitoring products recognized in the DevOps community are New Relic and AppDynamics. Monitoring can be as simple as a suite of scripts that report the health and status of processes and orchestration, notifying someone if there is any failure. That choice normally has a scaling limit: at some point a more robust solution is required, gaps are felt in the monitoring process, or failures occur at certain tiers. More enterprise-grade solutions, such as New Relic and AppDynamics, are enhanced by logging suites like Splunk and Sumo Logic.
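As a trivial, hypothetical example of the “suite of scripts” end of that spectrum (the service name, health endpoint, and mail settings below are all assumptions):

```powershell
# Check a local service and an HTTP health endpoint; alert on any failure
$failures = @()

$service = Get-Service -Name 'MSSQLSERVER' -ErrorAction SilentlyContinue
if ($null -eq $service -or $service.Status -ne 'Running') {
    $failures += 'SQL Server service is not running'
}

try {
    $response = Invoke-WebRequest -Uri 'http://myapp.example.com/health' `
        -UseBasicParsing -TimeoutSec 10
    if ($response.StatusCode -ne 200) {
        $failures += "App health check returned $($response.StatusCode)"
    }
}
catch {
    $failures += "App health check failed: $($_.Exception.Message)"
}

if ($failures.Count -gt 0) {
    Send-MailMessage -To 'oncall@example.com' -From 'monitor@example.com' `
        -Subject 'Health check failure' -Body ($failures -join "`n") `
        -SmtpServer 'smtp.example.com'
}
```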
Read the whole thing.
I’ve had most builds work really well. I tried a number of things but kept getting a few failures in the build: login errors or network errors, both of which bothered me since I could manually log in with SSMS from the same machine as my build agent.
I suspected a few things here, one of which was the use of named pipes for the Shadow database and TCP for Azure SQL Database.
Eventually, I decided to fall back to msbuild, bypassing VSTS, and make sure all my parameters were correct.
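A local fallback like that might look something like the following sketch; the msbuild path, project name, and target server are assumptions for illustration:

```powershell
# Build the database project locally with msbuild to rule out pipeline issues
& 'C:\Program Files (x86)\Microsoft Visual Studio\2017\Enterprise\MSBuild\15.0\Bin\MSBuild.exe' `
    '.\MyDatabase\MyDatabase.sqlproj' /t:Build /p:Configuration=Release /v:minimal

# Then publish the resulting dacpac directly, with every parameter spelled out
& 'C:\Program Files\Microsoft SQL Server\140\DAC\bin\SqlPackage.exe' `
    /Action:Publish `
    '/SourceFile:.\MyDatabase\bin\Release\MyDatabase.dacpac' `
    /TargetServerName:'myserver.database.windows.net' `
    /TargetDatabaseName:'MyDatabase' `
    /TargetUser:'builduser' `
    "/TargetPassword:$env:SQL_PASSWORD"
```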
Read on for the rest of the story.
As I travel to events focused on the numerous platforms to which the database is crucial, I’m faced with peers frustrated by DevOps and considerable conversation dedicated to how it’s the end of the database administrator. It may be my imagination, but I’ve been hearing this same story, with the blame assigned elsewhere: it’s Agile, DevOps, the cloud, or even the latest release of the database platform itself. The story’s the same: the end of the database administrator.
The most alarming and obvious pain point is that in each of these scenarios, the database administrator ended up more of a focal point than they were at the beginning. When it comes to DevOps, the specific challenges of the goal needed the DBA more than any of these storylines would suggest. As development hurtled at top speed to deliver what the business required, the DBA, and operations as a whole, delivered the security, the stability, and the methodologies to build automation at the level that the other groups simply never needed previously.
This is a useful rejoinder to fears of imminent job loss.