No one can deny how large the online support community for data science is. Today, it’s possible to teach yourself Python and other programming languages in a matter of weeks. And if you’re ever in doubt, there’s a StackOverflow thread or something similar waiting to give you the perfect piece of code to help you.
But when it came to pushing it to production, we found very little documentation online. Most data scientists seem to work on Python notebooks in a silo. They process large volumes of data and analyze it — but within the confines of Jupyter Notebooks. And most of the resources we’ve found while growing as data scientists revolve around Jupyter Notebooks.
Another option might be to use JupyterHub.
There’s two things going on here (and one hidden thing):
The first two messages point out that a procedure is referencing the column
ColdRoomSensorNumberwith schemabinding. The reason it’s using schemabinding is because it’s a natively compiled stored procedure. And that tells me that the table
Warehouse.ColdRoomTemperaturesis an In-Memory table. That’s not all. I noticed another wrinkle. The procedure takes a table-valued parameter whose table type contains a column called
ColdRoomSensorLabel. We’re going to have to replace that too. Ugh. Part of me wanted to look for another example.
The last message tells me that the table is a system versioned table. So there’s a corresponding archive table where history is maintained. That has to be dealt with too. Luckily Microsoft has a great article on Changing the Schema of a System-Versioned Temporal Table.
One last thing to worry about is a index on
ColdRoomSensorNumber. That should be replaced with an index on
ColdRoomSensorLabel. SSDT didn’t warn me about that because apparently, it can deal with that pretty nicely.
I’m glad that Michael went with a more complex example—it’s easy to tell this story with a simple procedure versioning, but in seeing a larger change, you can see the rhythm in the process—it’s all the same pattern of steps over and over.
There are two main kinds of SQL queries. SELECT/INSERT/UPDATE/DELETE statements are examples of Data Manipulation Language (DML). CREATE/ALTER/DROP statements are examples of Data Definition Language (DDL).
With schema changes – DDL – we have the added complexity of the SCH-M lock. It’s a kind of lock you don’t see with DML statements. DML statements take and hold schema stability locks (SCH-S) on the tables they need. This can cause interesting blocking chains between the two types where new queries can’t start until the schema change succeeds
Click through for suggestions with regard to schema locks, as well as a few tips for modifying large tables.
Procedures are very easy to Blue-Green. Brand new procedures are added during the pre-migration phase. Obsolete procedures are dropped during the post-migration phase.
If the procedure is changing but is logically the same, then it can be altered during the pre-migration phase. This is common when the only change to a procedure is a performance improvement.
But if the procedure is changing in other ways. For instance, when a new parameter is added, or dropped, or the resultset is changing. Then use the Blue-Green method to replace it: During the pre-migration phase, create a new version of the procedure. It must be named differently and the green version of the application has to be updated to call the new procedure. The original blue version of the procedure is deleted during the post-migration phase. It’s not always elegant calling a procedure something like
s_USERS_Create_v2but it works.
This has been a great series so far, and the way he does deployments matches very closely to the way we do them.
When using the Blue-Green method, basically nothing gets changed. Instead everything gets replaced. We start by setting up a new environment – the green environment – and then cut over to it when we’re ready. Once we cut over to the new environment successfully, we’re free to remove the original blue environment. The technique is all about replacing components rather than altering components.
Check it out for a great explanation, not only of how true blue-green doesn’t jibe well with databases, but how to get it to work well enough.
There are 3 modules included. I’m going to write a bigger piece about these functions elsewhere, but for now here is a link to the documentation –
Read on for more information and a link to the source code if you’re interested in learning more.
I have moved many databases to Azure via different methods but I recently came across a new way. Well technically it’s not new, I should say, newly found. The migration was done via the command line which is not exactly ground breaking but it’s nice to have another option.
The idea behind this is simple. Create the bacpac via command line using sqlpackage.exe with the action as export then do an import action into Azure.
Read on for the demo.
Problem 1 Image Tag
There is no image tag specified for the microsoft/mssql-server-linux image, therefore, if Microsoft push a newer version of the image to docker hub, this will be pulled down from docker hub when the build pipeline runs. This is easily fixed by tagging the image with a tag for an explicit version, e.g. microsoft/mssql-server-linux:2017-GA.
Click through for the starting code, two additional issues, and the corrected code.
The base return is the TSQLFragment object, which in turn has a Batches object, which in turn holds… well it can hold a lot of different things. When the text is parsed, it will determine what type of object to return based on the statement it determines it is. For example, if it’s an insert statement it will be a certain type of object with a given set of properties and methods, and if it’s, say, a create index statement you’ll get different properties, such as which table or view is getting the index along with the indexed columns and included columns. It really is interesting.
But interesting can a double-edged sword: since the statement object that gets returned can be different for each parsed piece of code, that means to set up any type of intelligence around the stuff we’re dealing with, we need to check for very specific objects.
Unfortunately, I never got past the first animated GIF, whose subtitle was wrong. You, however, should read the whole thing.
In my project, we have also integrated SonarQube with our TFS CI/CD build and have configured the Quality Gates.
For example – If I try to inject a security threat or a known coding issue — the TFS build will fail, the check in will get rejected, the quality gate fails and SonarQube points me to the exact issue – which I can rectify and do another check-in. So it will basically stop you from checking in code with potential issues.
Currently the only way to catch such issues is during manual coding reviews. SonarQube will help in automating that process. You can write your own rules to look for known issues in the code and stop it before the code gets checked in to source control.
So overall you can ensure good quality code going to Production and less regression defects coming up at a later point of time.
Read on for an example where a SonarQube rule can find a SQL injection vulnerability and thereby fail the build.