Author: Kevin Feasel

Continuous Integration Is A Process

Derik Hammer makes the vital point that continuous integration isn’t a tool; it’s a process:

SQL Server Data Tools (SSDT) is a tool that I am particularly familiar with and will become the subject of my examples. SSDT database projects shift the source of truth from your database to your source control. The intent is that the project and its build artifact, the dacpac, is the desired state of your database. SSDT will then generate the code necessary for you to migrate from your current state to the desired state.

The problem with my description is that it is similar to saying, “hammers drive nails into wood,” and then expecting that you won’t have to learn how to swing the hammer, aim at the head of the nail, or regulate how hard you hit it. Tools like SSDT are not magic and they can have problems. A solid understanding of how they work can mitigate or completely avoid these issues, however.

Click through for Derik’s rant.

PowerShell Gallery And The Linux Model

Chrissy LeMaire explains the Linux packaging model and the long-term vision for PowerShell:

So Joey comes up and says “Chrissy, Aaron Nelson has pretty much required me to talk to you. The SQL Community has the #1 PowerShell UserVoice request. We see that – we’ve heard you, The People want Out-DataTable and we agree. Would you be happy if we added it to the PowerShell Gallery first?”

“Uh, no! I want Out-DataTable to be a first class citizen like Out-GridView.”

“But where we’re going with PowerShell — we’re going smaller – to just core files, then you add on from the Gallery as desired.”

“Oh dang, like Linux! I’m liking it, keep talking.”

“To be clear, this is post 6.0. In the 6.0 timeframe, but we want to decouple as many release trains as possible, like PowerShellGet and PSReadline. But we’ll still very well package the ‘uber-complete, awesome devops tool edition’ of PowerShell. In the meantime, you could set up a metapackage for just your database stuff.”

“So it is like Linux patterns! PowerShell Gallery does that? I’m sold.”

Chrissy goes on to explain what a PowerShell Gallery metapackage module is, how to create one, and even how to publish one yourself.

Saving Your Biml Outputs

Tim Cost shows how to save the Biml which gets generated behind the scenes when you go to generate a set of files:

One of the first things I started wondering about as I got used to reading OPC (other people’s code) is just EXACTLY what is BIML doing at any given point in the code. You can make some educated guesses based on the SSIS packages (in my case I’m exclusively interested in BIML for SSIS but of course it can do a lot more than that), but it’s easy to get lost, especially when there’s a lot of BIML script and some of it is only used to establish a data model in memory or to create / fill variables that will be used in SSIS. I was delighted to discover the following piece of code that can show you exactly what BIML is doing based on the code you are writing.

If you don’t have BimlStudio, this trick is vital for figuring out what’s going wrong.

Dealing With Database Changes

Vladimir Oselsky walks through his database deployment workflow:

When it comes to actual deployment to Test and production servers, it is handled by an application update program that runs scripts on the target server one by one in alphabetical order. Since we have clients running different versions, scripts always have to be applied in order, for example, if the customer is on version 1.5 before they could get 2.5 they need 2.0. This ensures that database changes are applied in correct order, and I don’t have to worry about something breaking.

One last problem that I have to deal with on a regular basis is Version-drift. This is caused when I manually patch a client for a fix without going through the proper build process. In those cases, I just have to manually merge changes into development to guarantee that it will make it out to other clients. Once in a while, it becomes quite complicated to keep track of different clients running different versions and how to ensure that if they need a fix, it is not something that could be resolved through update versus manual code changes.

Version drift can be a big pain, but check out Vlad’s workflow.
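
To make the ordering idea concrete, here is a minimal sketch of picking the scripts a client still needs. It is not Vlad's update program: the file names, the `version_key` and `scripts_to_apply` helpers, and the convention of a dotted version prefix on each file name are all assumptions for illustration. It also sorts by numeric version rather than strictly alphabetically, because a plain alphabetical sort would place a hypothetical 10.0 script ahead of 2.0.

```python
import re


def version_key(filename):
    """Parse a leading dotted version, e.g. '2.0' in '2.0_add_column.sql'."""
    match = re.match(r"(\d+(?:\.\d+)*)", filename)
    if match is None:
        raise ValueError(f"no version prefix in {filename!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def scripts_to_apply(filenames, current, target):
    """Return scripts with current < version <= target, in version order."""
    keyed = sorted((version_key(name), name) for name in filenames)
    return [name for key, name in keyed if current < key <= target]


# Hypothetical script names; a client on 1.5 is upgrading to 2.5.
scripts = [
    "2.5_new_report.sql",
    "10.0_rewrite.sql",
    "2.0_add_column.sql",
    "1.5_baseline.sql",
]
pending = scripts_to_apply(scripts, current=(1, 5), target=(2, 5))
print(pending)  # ['2.0_add_column.sql', '2.5_new_report.sql']
```

The client on 1.5 gets 2.0 and then 2.5, matching the example in the quote, and anything beyond the target version is left alone.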

47 Incorrect Deployment Assumptions

Brent Ozar has a list of 47 assumptions regarding database deployments that turn out not always to be true:

30. The deployment person wouldn’t dream of only highlighting some of it and running it.

31. The staff who were supposed to work with you during the deployment will be available.

32. The staff, if available at the start of the call, will be available during the entire call.

33. The staff won’t come down with food poisoning halfway through the deployment call, forget to mute their home office phone, step into the bathroom, and leave the bathroom door open.

I’ve never had item #33 happen to me, but that’s a pretty solid list of stuff that can go wrong.

Database Deployment: Growing Up

Ryan Booz uses schooling as an extended metaphor for database deployment:

In general, the biggest issues we hit continue to be client customizations to the database (even ones we sanction) and an ever growing set of core-pop data that we manage and have to proactively defend against client changes.  This is an area we just recently admitted we need to take a long, hard look at and figure out a new paradigm.

I should mention that it was also about this time that we were finally able to proactively get our incremental changes into source control. All of our final scripts were in source somewhere, but the ability to use SQL Compare and SQL Source Control allowed our developers to finally be a second set of eyes on the upgrade process. No longer were we weeding through 50K lines of SQL upgrade just to try and find what changed. Diffing whole scripts doesn’t really provide any good context… especially when we couldn’t guarantee that the actions in the script were in the same order from release to release. This has been another huge win for us.

This is a view from someone in the middle of the process.  Ryan’s group isn’t pushing everything automatically, but they’re building out to that.

Breakpoint Extended Event

Arun Sirpal is a dangerous man of mystery and danger, but mostly danger:

I did a dangerous thing, and I want to make sure that YOU DO NOT do the same.

I was creating a couple of extended events sessions and was playing around with some actions. I ended up with the following code where I was after a guy called Shane:

The probability that you intend to set a breakpoint in SQL Server via Extended Events is quite low (low enough that if you’re doing it, you should already know what you’re doing), but click through to see exactly what damage you can do.

EF Core Merge Statements

Richie Rump looks at SQL that Entity Framework Core generates when inserting a batch of records:

If you’re an experienced SQL tuner, you’ll notice some issues with this statement. First off the query has not one but two table variables. It’s generally better to use temp tables because table variables don’t have good statistics by default. Secondly, the statement uses a MERGE statement. The MERGE statement has had more than its fair share of issues. See Aaron Bertrand’s post “Use Caution with SQL Server’s MERGE Statement” for more details on those issues.

But that got me wondering, why would the EF team use SQL features that perform so poorly? So I decided to take a closer look at the SQL statement. Just so you know the code that was used to generate the SQL saves three entities (Katana, Kama, and Tessen) to the database in batch. (Julie used a Samurai theme so I just continued with it.)

Yeah…I’m not liking the MERGE statement very much here.

Genomic Analysis In Spark

Tom White and Jonathan Keebler show off Hail, a package to allow you to perform genomic analysis in Apache Spark:

One of the most important downstream analyses is finding genetic trait associations. Association studies look for statistical associations between genetic variation and phenotypic traits, that is, an observable characteristic of an individual, such as hair color or disease. With the increasing availability of whole-genome sequence data, it’s possible to look for variants from across the whole genome that may be associated with a disease, rather than heavily relying only on commonly known variants as in a traditional genome-wide association study (GWAS).

The challenge for downstream processing is scale. Tools that can cope with a few hundred or even a few thousand genomes, such as the well-known 1000 Genomes dataset, can’t handle datasets that are one or more orders of magnitude larger. These datasets are now becoming commonplace, thanks to the multiple sequencing efforts taking place around the world like the 100,000 Genomes Project in the UK and the Precision Medicine Initiative in the US.

Genomic analysis has been right in Hadoop’s wheelhouse for a while.
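
For a sense of what a single association test looks like, here is a rough sketch of a Pearson chi-square statistic on a 2x2 table of variant carriers versus non-carriers among cases and controls. This is plain Python rather than Hail or Spark, the `chi_square_2x2` helper is a name I made up, and the counts are invented purely for illustration.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square statistic (no continuity correction) for the table

                  carriers   non-carriers
        cases        a            b
        controls     c            d
    """
    n = a + b + c + d
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    if denominator == 0:
        raise ValueError("every row and column needs at least one observation")
    return n * (a * d - b * c) ** 2 / denominator


# Made-up counts: 30 of 100 cases carry the variant versus 10 of 100 controls.
statistic = chi_square_2x2(30, 70, 10, 90)
print(statistic)  # 12.5
```

A whole-genome study repeats a test along these lines for millions of variants across many thousands of samples, which is where the scale problem in the quote comes from.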

Grid Features In SQL Prompt

Derik Hammer shows off some of the grid functionality in Red Gate’s SQL Prompt:

Even more common than scripting out INSERT statements, I may need to copy a set of values and format them for an IN clause. Normally I would use a text editor such as Notepad++ to reformat the multiple lines of values. SSMS can also be used but I find Notepad++’s find/replace features better.

Now I do not have to worry about copying/pasting the values and making changes. SQL Prompt delivers a direct conversion from values to IN clause.

Click through for some animated GIFs showing how to use this functionality.
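
If you don't have SQL Prompt handy, the values-to-IN-clause conversion is easy to approximate in a few lines. This is only a sketch of the idea, not how SQL Prompt does it; `sql_literal` and `to_in_clause` are names I made up, and it is meant for ad hoc queries in a query window (application code should use parameters instead).

```python
def sql_literal(value):
    """Render a number as-is and anything else as a quoted string literal."""
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return str(value)
    # Double any embedded single quotes, as T-SQL string literals require.
    return "'" + str(value).replace("'", "''") + "'"


def to_in_clause(values):
    """Format copied values as an IN (...) list."""
    return "IN (" + ", ".join(sql_literal(v) for v in values) + ")"


print(to_in_clause([3, 7, 42]))            # IN (3, 7, 42)
print(to_in_clause(["O'Brien", "Smith"]))  # IN ('O''Brien', 'Smith')
```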
