Press "Enter" to skip to content

Author: Kevin Feasel

Building A Disaster Recovery Checklist

Robert Davis has a checklist for you:

If a disaster struck tomorrow, are you ready for it? Are you sure you thought of everything? I cannot count the number of times something happened, and I thought that I should have been able to anticipate that. It can be a big benefit to get another viewpoint on your disaster plan. No matter how well you think you’ve covered yourself, someone who thinks about things differently than you do can often see something you don’t.

Here’s a checklist of things you might not have considered. This can be your other set of eyes.

This is not a to-do list or a process guide when disaster strikes; rather, it’s a checklist of things you should have in place before a disaster occurs.


Minimizing Problem Reproductions

Lonny Niederstadt explains the value of a minimalistic repro:

Often among the hardest of my decisions is whether I should spend more time trying to simplify a given problem by eliminating additional factors, shrinking the data needed for a repro, etc… or just put all that effort into investigation purely aimed at understanding the behavior. I expect that to be a long-term challenge 🙂

I was quite happy with the way this particular one worked out, though. It started as a maze… access violations generated on a SQL Server 2016 instance with dozens of databases. The access violation came from an insert query using a synonym – with the underlying table in another database! (I didn’t know that was possible – or *ever* desirable – until I saw it in this context.) The AV was occurring during a multi-step data transfer process and it was hard to isolate the data flowing into and out of the two databases in question. But after some finagling, I got the problem repro pretty doggone small. Reproduced the AVs on several CUs of SQL Server 2016 and on SQL Server 2017 RC2.
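
If you haven't seen a cross-database synonym before, the pattern is simple enough. Here's a minimal sketch (database and table names are mine, not Lonny's actual repro):

CREATE DATABASE SourceDb;
CREATE DATABASE TargetDb;
GO
CREATE TABLE TargetDb.dbo.LoadTarget
(
    Id INT NOT NULL,
    Payload VARCHAR(50) NULL
);
GO
USE SourceDb;
GO
-- The synonym lives in SourceDb but points at a table in TargetDb.
CREATE SYNONYM dbo.LoadTarget FOR TargetDb.dbo.LoadTarget;
GO
-- Callers insert through the synonym as though it were a local table.
INSERT INTO dbo.LoadTarget (Id, Payload)
VALUES (1, 'row routed to TargetDb');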

If you think you’re going to enlist the help of someone outside your organization, then you definitely want a minimalistic repro.  That will reduce the risk of red herrings, reduce the burden of assistance, and make it much more likely that some poor sap in support can actually fix your problem if it turns out to be a bug.


Basics Of PowerShell Modules

Tracy Boggiano shows you how to create a PowerShell module:

The first file you need to create is a manifest file.  This file tells who wrote the module, which version of PowerShell should be used, and the module version.  You will need to open either PowerShell ISE or PowerShell under Run as Administrator to run this command.  You change the path to match the name you want your module to have, along with the name of the module and the author.  You may have to create the folder in your Modules folder (DBAFunctions) first, depending on the security on your machine.
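
As a rough sketch, the manifest creation looks something like this (path, author, and module name are placeholders):

# Run from PowerShell ISE or a PowerShell session started as administrator.
# The DBAFunctions folder must already exist under your Modules folder.
New-ModuleManifest `
    -Path "$HOME\Documents\WindowsPowerShell\Modules\DBAFunctions\DBAFunctions.psd1" `
    -RootModule "DBAFunctions.psm1" `
    -ModuleVersion "1.0.0" `
    -Author "Your Name"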

There isn’t too much to the process, but there is a big benefit:  code portability.


Working With Rowversion Data Types

Louis Davidson walks through some of the properties of rowversion data types:

For years, I had thought (and was probably taught in SQL.AlongTimeAgoInAPlaceFarFarAway) that the timestamp column (well before rowversion was a thing) was not guaranteed to be an ever-increasing value. But this is not the case.

In BOL (https://docs.microsoft.com/en-us/sql/t-sql/data-types/rowversion-transact-sql)  it states:

“The rowversion data type is just an incrementing number…”

This makes it useful for determining rows that have changed, because it is automatic and the user cannot override the value in the column. However, there is a major concern when you use rowversions: what happens when you change the structure of the table and expect the consumer to see that change? The problem is that when you change the structure of the table, the rowversion will not be set (except when adding a new rowversion column).
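
A quick sketch to see this behavior for yourself (table and column names are mine):

CREATE TABLE dbo.RowversionDemo
(
    Id INT NOT NULL PRIMARY KEY,
    Value VARCHAR(20) NULL,
    RV rowversion
);
INSERT INTO dbo.RowversionDemo (Id, Value) VALUES (1, 'original');
SELECT Id, RV FROM dbo.RowversionDemo;

-- Any data modification bumps RV to a new, higher value automatically;
-- you cannot insert or update RV yourself.
UPDATE dbo.RowversionDemo SET Value = 'changed' WHERE Id = 1;
SELECT Id, RV FROM dbo.RowversionDemo;

-- But a structural change, such as adding a column, leaves RV untouched.
ALTER TABLE dbo.RowversionDemo ADD AddedColumn INT NULL;
SELECT Id, RV FROM dbo.RowversionDemo;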

I’m not much of a fan of rowversion and tend not to use it, but my biases don’t have to be yours.


Setting Data Types When Building Table Values In Power BI

Chris Webb shows how to use a parameter on the Table.AddColumn function in M to define specific data types:

In the last two screenshots, the ABC123 icon in the column headers shows that they are set to use the Any data type; the columns returned by calling the function have lost their data types.

The key to solving this problem is using the optional fourth parameter of the Table.AddColumn() function, which allows you to set a data type for the column that function adds to a table. Altering the Invoked Custom Function step of the previous query to do this, setting the new column to be a table type like so:
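
Chris's actual code is in the post; as a rough illustration of the pattern, with hypothetical step and function bodies:

let
    Source = #table(type table [Input = number], {{1}, {2}}),
    // Without the fourth argument, the new column would be type any.
    // Declaring "type table" here preserves the data type downstream.
    InvokedCustomFunction = Table.AddColumn(
        Source,
        "Output",
        each #table(type table [Value = number], {{[Input] * 10}}),
        type table
    )
in
    InvokedCustomFunction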

Worth reading in its entirety.


Live Query Doesn’t Show Blocking

Erik Darling points out that the Live Query Plan view does not show wait events:

What the live query plan doesn't tell you about your query:

  1. That it’s being blocked
  2. What it’s waiting on

No seriously, nothing turns red or beeps or starts checking its watch, and there’s no Clippy asking you to hang on a moment. It just sits there.

And since the query hasn’t finished running yet, we don’t have any wait stats in the plan XML yet.

Erik does note that you can get this blocking information from the DMVs at the session or query level, so assuming it's fairly easy for Management Studio to surface the same data at the operator level, it'd be nice to see it added to the live query plan.
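
In the meantime, a quick sketch of what the session-level DMVs will tell you while the query just sits there:

-- Shows live requests that are currently blocked, and by whom.
SELECT
    r.session_id,
    r.blocking_session_id,  -- non-zero means another session holds the resource
    r.wait_type,
    r.wait_time,
    r.wait_resource
FROM sys.dm_exec_requests AS r
WHERE r.session_id <> @@SPID
    AND r.blocking_session_id <> 0;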


Getting Started With TensorFlow

Vivek Kalyanrangan shows us how to install TensorFlow:

Installing TensorFlow with GPU support requires an NVIDIA GPU; AMD video cards are not supported with TensorFlow. NVIDIA uses a low-level GPU computing system called CUDA, which is proprietary NVIDIA software.

One can go the OpenCL route with AMD, but as of now it won't work with TensorFlow.

Also, not all NVIDIA devices are supported. Here is a list from the NVIDIA documentation of the supported GPUs.
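
If you want to confirm TensorFlow actually sees your GPU after installing, something like this works against the 1.x API that was current at the time (my sketch, not from Vivek's post):

import tensorflow as tf
from tensorflow.python.client import device_lib

# A working CUDA setup will show a /device:GPU:0 entry in this list.
print(device_lib.list_local_devices())

# A trivial session to confirm the runtime itself is healthy.
hello = tf.constant('Hello, TensorFlow!')
with tf.Session() as sess:
    print(sess.run(hello))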

By the end of it, Vivek also shows us a simple trained model.


Visualizing Networks With R

Arthur Charpentier shows off some of the functionality of igraph:

The good thing is that a lot of functions are available. For instance, we can get shortest paths between two specific nodes. And we can give appropriate colors to the nodes that we’ll cross:

> AP=all_shortest_paths(iflo,
+ from="Peruzzi",
+ to="Ginori")
> L=AP$res[[1]]
> V(iflo)$color="yellow"
> V(iflo)$color[L[2:4]]="light blue"
> V(iflo)$color[L[c(1,5)]]="blue"
> plot(iflo)

Click through for a demo-heavy example.


Poorly-Performing Parallel Queries

Joe Obbish shows off how skewed data can cause SQL Server parallelism to perform poorly in certain scenarios:

The query above is designed to not be able to take advantage of parallelism. The useless repartition streams step and the spill to tempdb suggest that the query might perform better with a MAXDOP 1 hint. With a MAXDOP 1 hint the query runs with an average time of 2473 ms. There is no longer a spill to tempdb.

What happens if the query is run with MAXDOP 3? Earlier I said that the hashing function or thread boundaries can change based on DOP. With MAXDOP 3 I get a much more even row distribution on threads:

I think the number of cases where it makes sense to use a specific, non-1 MAXDOP hint is pretty small, but here’s one of them.  The problem is that if this data changes regularly, the skewness of the data could change along with it, making your brilliant optimization unnecessary or even harmful.
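
For reference, the hint is per-query, so you can confine the fix to the one statement that needs it (table name hypothetical):

SELECT SomeColumn, COUNT(*) AS RowsPerValue
FROM dbo.SkewedTable
GROUP BY SomeColumn
OPTION (MAXDOP 1);  -- or MAXDOP 3, as in Joe's test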


Machine Learning Services Updates

Umachandar Jayachandran and team have been busy.  First, they announced a preview of SQL Server ML Services in Azure SQL Database:

In-database Machine Learning support was added in SQL Server 2016 and we are now bringing the same functionality to Azure SQL Database. You can now train and score machine learning models in Azure SQL Database and the predictions can be exposed to any application using your database, easily and seamlessly.

The preview functionality allows you to train and score machine learning models using data that fits in memory (in an R data frame). Please note that the amount of memory available for R script execution depends on the edition of the Azure SQL database and cannot be modified.
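
The mechanism is the same sp_execute_external_script call we've had since SQL Server 2016; a toy sketch of in-database R, assuming the feature is enabled:

EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        # InputDataSet arrives as an R data frame; fit a toy model on it.
        model <- lm(y ~ x, data = InputDataSet)
        OutputDataSet <- data.frame(coefficient = coef(model))
    ',
    @input_data_1 = N'SELECT x = CAST(1 AS FLOAT), y = CAST(2.1 AS FLOAT)
                      UNION ALL SELECT 2, 3.9
                      UNION ALL SELECT 3, 6.2'
WITH RESULT SETS ((coefficient FLOAT));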

No Python support there yet, but it’s upcoming.  Second, we can use the PREDICT function in Azure SQL Database:

Today we are announcing the general availability of the native PREDICT Transact-SQL function in Azure SQL Database. The PREDICT function allows you to perform scoring in real-time using certain RevoScaleR or revoscalepy models in a SQL query without invoking the R or Python runtime.

The PREDICT function support was added in SQL Server 2017. It is a table-valued function that takes a RevoScaleR or revoscalepy model & data (in the form of a table or view or query) as inputs and generates predictions based on the machine learning model. More details of the PREDICT function can be found here.
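
Usage looks roughly like this, assuming a serialized RevoScaleR model stored in a table (table and model names here are hypothetical):

DECLARE @model VARBINARY(MAX) =
    (SELECT model FROM dbo.Models WHERE model_name = N'customer_churn_linmod');

-- Native scoring: no R or Python runtime is invoked at query time.
SELECT d.*, p.Score
FROM PREDICT(MODEL = @model, DATA = dbo.NewObservations AS d)
WITH (Score FLOAT) AS p;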

There are a limited number of models which support PREDICT—things like linear and logistic regression, RevoScaleR’s fast decision trees, etc.  If you have this type of model, however, the predictions stay within SQL Server and end up being much faster than going out to R.
