Visualizing Linear Regression Results

Bernardo Lares gives us a few ways to interpret a linear regression result visually in R:

The most obvious plot to study for a linear regression model, you guessed it, is the regression itself. If we plot the predicted values vs the real values we can see how close they are to our reference line of 45° (intercept = 0, slope = 1). If we had a very scattered plot with no clear tendency along that line, then we have a bad regression. On the other hand, if all our points fall on the line, I bet you gave the model the results you wanted!

Then, the Adjusted R² on the plot gives us an easy parameter for comparing models and how well each fits our reference line. The nearer this value gets to 1, the better. Without getting too technical, if you add more and more useless variables to a model, this value will decrease; but, if you add useful variables, the Adjusted R-Squared will improve.
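To make the penalty for useless variables concrete, here is a minimal sketch of the textbook Adjusted R² formula in plain Python. This is not Bernardo's R code; the function name and the sample numbers are my own assumptions for illustration.

```python
def adjusted_r_squared(r2, n_obs, n_predictors):
    """Textbook formula: 1 - (1 - R^2) * (n - 1) / (n - p - 1)."""
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical numbers: the same raw R^2 of 0.90 on 50 observations,
# once with 3 predictors and once with 10.
few_predictors = adjusted_r_squared(0.90, n_obs=50, n_predictors=3)
many_predictors = adjusted_r_squared(0.90, n_obs=50, n_predictors=10)
print(round(few_predictors, 4), round(many_predictors, 4))
```

If the extra seven predictors do not raise the raw R², the adjusted value drops, which is the behavior the excerpt describes.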

We also get the RMSE and MAE (Root-Mean Squared Error and Mean Absolute Error) for our regression’s results. MAE measures the average magnitude of the errors in a set of predictions, without considering their direction. On the other hand we have RMSE, which is a quadratic scoring rule that also measures the average magnitude of the error. It’s the square root of the average of squared differences between prediction and actual observation. Both metrics can range from 0 to ∞ and are indifferent to the direction of errors. They are negatively-oriented scores, which means lower values are better.
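Both definitions fit in a few lines. The following is a small sketch in plain Python rather than the R from the original post, and the sample values are made up for illustration.

```python
import math


def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root-Mean Squared Error: square root of the mean squared difference."""
    squared = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Made-up data: three small errors (1, 0, -1) and one large error (-4).
actual = [3.0, 5.0, 7.0, 9.0]
predicted = [2.0, 5.0, 8.0, 13.0]
print(mae(actual, predicted))   # 1.5
print(rmse(actual, predicted))  # about 2.12, i.e. sqrt(4.5)
```

Because RMSE squares each error before averaging, the single large miss pulls it well above the MAE, while neither metric cares about the sign of the error.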

I like this approach to explaining models.

Test-Driven Database Development

Haroon Ashraf walks us through a simplified example of test-driven database development:

In TDDD, business requirements are encapsulated in database unit tests.

If the requirement is to add a new category to the Category table, TDDD proceeds according to the following steps:

  1. Create a database unit test to check that the AddNewCategory database object exists.

  2. The unit test fails because the database object is absent.

  3. Create the AddNewCategory object so that the unit test passes.

  4. A second unit test checks whether the AddNewCategory stored procedure actually adds a new category.

  5. That unit test also fails.

  6. Change the AddNewCategory procedure code to add a new category, then rerun the unit test, which now passes.
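The six steps above can be sketched as a red-green cycle. This is only an illustration under heavy assumptions: it uses Python with an in-memory SQLite database instead of SQL Server, a plain Python function stands in for the AddNewCategory stored procedure, a dictionary stands in for the catalog of database objects, and every helper name here is hypothetical.

```python
import sqlite3


def make_database():
    """Fresh in-memory database with an empty Category table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Category ("
        "CategoryId INTEGER PRIMARY KEY, Name TEXT NOT NULL UNIQUE)"
    )
    return conn


def object_exists(procedures, name):
    """Test 1: does the named object exist at all?"""
    return name in procedures


def adds_category(conn, procedure):
    """Test 2: does calling the procedure actually insert a row?"""
    procedure(conn, "Books")
    count = conn.execute(
        "SELECT COUNT(*) FROM Category WHERE Name = ?", ("Books",)
    ).fetchone()[0]
    return count == 1


def add_new_category_stub(conn, name):
    """Step 3: an empty object, just enough for test 1 to pass."""
    return None


def add_new_category(conn, name):
    """Step 6: the real implementation."""
    conn.execute("INSERT INTO Category (Name) VALUES (?)", (name,))
    conn.commit()


procedures = {}  # stand-in for the catalog of database objects
step_2 = object_exists(procedures, "AddNewCategory")            # fails
procedures["AddNewCategory"] = add_new_category_stub
step_3 = object_exists(procedures, "AddNewCategory")            # passes
step_5 = adds_category(make_database(), procedures["AddNewCategory"])  # fails
procedures["AddNewCategory"] = add_new_category
step_6 = adds_category(make_database(), procedures["AddNewCategory"])  # passes
print(step_2, step_3, step_5, step_6)  # False True False True
```

Each test runs against a fresh database so that one run cannot leave rows behind for the next, which is exactly the side-effect problem that makes this harder in a real database.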

Laying out my biases, I’m not a fan of TDD for application development and definitely not a fan of it for database development.  “Unit testing” inside a database is extremely limited, particularly when there are so many side effects and encapsulation tends to be actively harmful.

Improving AG Database Level Failover

Sourabh Agarwal announces improvements to Availability Groups when it comes to database level failover:

In addition to the existing checks, the new implementation has the following additional checks.

  1. The new implementation stores and uses a historical snapshot of the database state information to decide if a failover should be initiated. The health check routine caches the database state and associated error information, for the last two executions, which is then compared with the state information from the current execution of the health detection routine. If the same error condition (for the below mentioned error codes) exists in three consecutive runs of the health detection routine, a failover is initiated. This implementation is intended to provide safeguards against transient errors and issues which can be fixed by the auto page repair capabilities of the availability groups.

  2. The new implementation checks for the following additional errors. The majority of these errors are indicative of a hardware issue on the server. Please note that this is not an exhaustive list of errors which could impact database availability. There is an outstanding item to include error 824 in this list.
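The "three consecutive runs" rule from the first item can be sketched as follows. This is my own illustration in Python, not the SQL Server implementation: the class name is hypothetical, and the error code 1001 is a placeholder because the excerpt does not include the actual list of error codes.

```python
from collections import deque


class DatabaseHealthTracker:
    """Caches the last two health-check results and compares them
    with the current one (hypothetical sketch, not SQL Server code)."""

    def __init__(self, watched_errors):
        self.watched_errors = set(watched_errors)
        self.history = deque(maxlen=2)  # results of the last two runs

    def record_run(self, error_code):
        """error_code is None for a healthy run.
        Returns True when a failover should be initiated."""
        should_fail_over = (
            error_code in self.watched_errors
            and len(self.history) == 2
            and all(previous == error_code for previous in self.history)
        )
        self.history.append(error_code)
        return should_fail_over


# 1001 is a placeholder code, not a real SQL Server error number.
tracker = DatabaseHealthTracker(watched_errors={1001})
runs = [1001, None, 1001, 1001, 1001]
decisions = [tracker.record_run(code) for code in runs]
print(decisions)  # [False, False, False, False, True]
```

The healthy second run resets the streak, so a transient error does not trigger a failover; only the third consecutive identical error does.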

Great news from the Tiger Team.

Managing SQL Logins Across Different Instances

Raul Gonzalez shows us how to maintain the same login across different SQL Server instances—it’s all about the SIDs:

Most servers out there have both enabled, so sooner or later we DBAs need to deal with SQL logins, and there is more to it than providing a name and a password (a strong one, of course).

If you also have different environments, most likely you want to create different logins to avoid DEV apps or users connecting to LIVE or vice-versa.

But when you have different logins and, by default, database users, each time you refresh your DEV (TEST, QA…) environment you need to reapply all the granted permissions to the right user, because the login does not exist in that environment. Does it sound familiar?

In this post I will show you how you can handle this problem in a very simple way.

Click through to read the whole thing.

It’s 10 O’Clock; Do You Know Where Your Backups Are?

Adrian Buckman has a script which makes sure your backups are where msdb says they are:

Here is the information that the script provides:

  1. Warn of restores over the top of the database since its last FULL backup

  2. Show database snapshots currently against the database

  3. Show the last FULL, DIFF and LOG backup for the database including the backup durations and backup age.

  4. Backup file information such as backup start/finish time, file path, first LSN, last LSN, a status column which states whether the log chain is intact based on first and last LSN but also if the file exists on disk, and finally a file exists column which will tell you if the file still exists on disk.
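The idea behind the status and file-exists columns in the last item can be sketched like this. It is not Adrian's script, which queries msdb in T-SQL; this is a Python illustration that assumes you already have the backup history as a list of records, and it assumes the usual rule that each log backup's first LSN should equal the previous log backup's last LSN. The LogBackup type and function name are hypothetical.

```python
import os
from dataclasses import dataclass


@dataclass
class LogBackup:
    path: str       # where the history says the backup file lives
    first_lsn: int
    last_lsn: int


def check_log_backups(backups):
    """Return (path, chain_ok, file_exists) per backup, ordered by first LSN.

    chain_ok is False when a backup does not start where the previous
    one ended, which suggests a gap in the log chain.
    """
    results = []
    previous_last_lsn = None
    for backup in sorted(backups, key=lambda b: b.first_lsn):
        chain_ok = previous_last_lsn is None or backup.first_lsn == previous_last_lsn
        results.append((backup.path, chain_ok, os.path.exists(backup.path)))
        previous_last_lsn = backup.last_lsn
    return results
```

A backup can have a perfectly good chain on paper and still be useless if the file has been moved or deleted, which is why the two checks are reported separately.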

This is a great script if you take transaction log backups frequently (typically a good idea).

SSAS Tabular Deployment Fails: Newtonsoft.Json Missing

Alex Whittles walks us through an error deploying a SQL Server Analysis Services tabular model:

Deploying an Analysis Services Tabular model to SSAS Azure using the Analysis Services Deployment Wizard. Both Visual Studio 2017 & SQL Server 2017 installed on the client.

Trying to click on the ellipsis to change the data source connection string or impersonation information results in a Newtonsoft.Json error:

“Could not load file or assembly ‘Newtonsoft.Json, Version, Culture=neutral, ……”

As I like to joke, every single .NET project in existence includes Newtonsoft.Json.  As Alex shows, sometimes they don’t reference the right version.

Connecting GitHub To Azure Container Registry

Andrew Pruski automates the generation of SQL Server Docker images in Azure Container Registry, generating a new image with each GitHub repo check-in:

Fantastic, one build task created! How easy was that??

Let’s test by running:

az acr build-task run --registry TestContainerRegistry01 --name buildsqlimage

And the progress of the build task can be monitored:

az acr build-task logs --registry TestContainerRegistry01

Andrew gives us the step-by-step details, so check it out.

Parsing T-SQL Scripts With Pester

Rob Sewell shows us how to use Pester to ensure that a set of SQL scripts are valid T-SQL:

This is a quick Pester test I wrote to ensure that some SQL scripts in a directory would parse, so there was some guarantee that they were valid T-SQL. It uses the SQLParser.dll, and because it was running on a build server without SQL Server, I had to load the required DLLs from the dbatools module (Thank you, dbatools 🙂)

It simply runs through all of the .sql files, runs the parser against each one, and checks the errors. If a script fails to parse, the error message in the failed Pester result also shows where it failed.

This particular example doesn’t ensure that the scripts do what you want them to do, but hey, Pester was built for that as well.

