
Day: September 3, 2019

Calculating Consistency of Ratings

Sebastian Sauer looks at computing reliability between raters:

Computing inter-rater reliability is a well-known, albeit maybe not very frequent task in data analysis. If there’s only one criterion and two raters, the procedure is straightforward; Cohen’s Kappa is the most widely used coefficient for that purpose. It is more challenging to compare multiple raters on one criterion; Fleiss’ Kappa is one way to get a coefficient. If there are multiple criteria, one way is to compute the mean of multiple Fleiss’ coefficients.

However, a different way, and the way presented in this post, consists of checking if all raters agree on one given item (and repeating that for all items). If rater A assigns two tags/criteria (tag1, tag2) to item A, then the other raters may not assign different tags (e.g., tag3, tag4) to that item, if a match should be scored. Note that this procedure allows for different numbers of tags/criteria for the items (e.g., item 1 with only 1 tag, but item 2 with 3 tags etc.). However, our grading should give some points if, say, rater1 assigns tag1 and tag2, but raters 2 and 3 only assign tag1.

Read the whole thing.
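
To make the partial-credit idea concrete, here is a minimal Python sketch. It is not Sebastian's method (his post works in R and may score agreement differently); it assumes a Jaccard-style score in which an item earns the share of tags that every rater assigned out of all tags any rater assigned. The function names and the sample ratings are hypothetical.

```python
from typing import Dict, List, Set


def item_agreement(tag_sets: List[Set[str]]) -> float:
    """Score one item: tags shared by ALL raters / tags used by ANY rater.

    Assumption: this Jaccard-style ratio is just one plausible way to
    award partial credit; it is not taken from the linked post.
    """
    if not tag_sets:
        raise ValueError("need at least one rater for the item")
    union = set().union(*tag_sets)
    if not union:
        # Nobody assigned a tag; treated here as full agreement (assumption).
        return 1.0
    common = set.intersection(*tag_sets)
    return len(common) / len(union)


def mean_agreement(ratings: Dict[str, List[Set[str]]]) -> float:
    """Average the per-item scores across all items."""
    scores = [item_agreement(sets) for sets in ratings.values()]
    return sum(scores) / len(scores)


# Hypothetical ratings: one tag set per rater, per item.
ratings = {
    "item1": [{"tag1", "tag2"}, {"tag1"}, {"tag1"}],   # partial match
    "item2": [{"tag3"}, {"tag3"}, {"tag3"}],           # full match
    "item3": [{"tag1", "tag2"}, {"tag3", "tag4"}],     # no match
}

print(mean_agreement(ratings))
```

Under this scoring, item1 gets half credit because only tag1 is shared out of the two tags in play, which matches the quoted requirement that such a case should earn some points but not all of them.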


Update to ggraph

Thomas Lin Pedersen has an update to ggraph:

If you are new to ggraph, a short description follows: It is an extension of ggplot2 that implements an extended grammar for relational data (e.g. trees and networks). It provides a huge variety of geoms for drawing nodes and edges, along with an assortment of layouts making it possible to produce a very wide range of network visualization types. It is to my knowledge the most feature-packed network visualization framework available in R (and potentially in other languages as well), all building on top of the familiar ggplot2 API. If you want to learn more I invite you to browse the new pkgdown website that has been made available.

It looks really nice.


Rolling Windows Upgrades with AGs + WSFC

Allan Hirt shows how you can combine Availability Groups with Windows Server Failover Clusters and upgrade the operating system version while keeping your SQL Servers running:

The configuration for a cluster rolling upgrade allows for mixed Windows Server versions to coexist in the same WSFC. This is NOT a deployment method. It is an upgrade method. DO NOT use this for normal use. Unfortunately, Microsoft did not put a time limit on how long you can run in this condition, so you could be stupid and do something like have a mixed Windows Server 2012 R2/2016 WSFC. Fire, ready, aim. The WSFC knows about this and you’ll see a warning with an Event ID of 1548.

Read on for a summary of what Allan has learned in doing this.


Comparing CAST and CONVERT Performance

Max Vernon runs a performance test of CAST versus CONVERT:

This post is a follow-up to my prior post inspecting the performance of PARSE vs CAST & CONVERT, where we see that PARSE is an order of magnitude slower than CONVERT. In this post, we’ll check if there is a similar difference between using CAST or CONVERT. But just to be clear, CONVERT offers a lot more functionality than CAST; this post will not help you decide which of these functions to use for a specific use-case – I leave that to the reader to decide for themselves.

Max gets slightly different numbers but under the covers they both call the same CONVERT() function. The difference in numbers is noise: both of them have standard deviations of ~200ms, so a t-test can’t distinguish the two. The big choice is whether you’d rather have ANSI standard code (if so, use CAST()) or if you’d prefer additional functionality around dates and times (like CONVERT() offers).
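
For a sense of what "a t-test can’t distinguish the two" means, here is a rough Python sketch of Welch's t statistic using only the standard library. The timing samples are made up for illustration and are not Max's measurements; the function name is hypothetical as well. As a rule of thumb, an absolute t value well below 2 means the difference in means is small relative to the run-to-run variation.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def welch_t(a: Sequence[float], b: Sequence[float]) -> Tuple[float, float]:
    """Return Welch's t statistic and its approximate degrees of freedom."""
    na, nb = len(a), len(b)
    # Sample variances (n - 1 denominator) scaled by sample size.
    sa, sb = variance(a) / na, variance(b) / nb
    t = (mean(a) - mean(b)) / sqrt(sa + sb)
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (sa + sb) ** 2 / (sa ** 2 / (na - 1) + sb ** 2 / (nb - 1))
    return t, df


# Hypothetical run times in milliseconds, NOT taken from the linked post.
cast_ms = [5100, 4900, 5300, 4800, 5200, 5000]
convert_ms = [5000, 5250, 4850, 5150, 4950, 5040]

t_stat, dof = welch_t(cast_ms, convert_ms)
print(f"t = {t_stat:.3f}, df = {dof:.1f}")
```

With samples this noisy, a gap of a few milliseconds between the means produces a t statistic close to zero, which is the kind of result you would expect when both functions do the same work under the covers.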


Migrating to SQL Managed Instances with dbatools

Jovan Popovic shows how we can perform an offline migration from on-prem/IaaS SQL Server to a SQL Managed Instance using dbatools:

Typically, the offline migration process looks like:

– You need to create an Azure Blob Storage account that will be used to temporarily hold the database backups that will be moved from SQL Server to Managed Instance.
– You need to back up the databases to Azure Blob Storage and restore them from Azure Blob Storage to Managed Instance.
– You need to migrate server-level objects such as logins and agent jobs from the source to the destination instance.

In this article, I will use Azure PowerShell to create and manage the necessary Azure resources, and the DBATools PowerShell library to initiate the migration.

Read on for the process, including the PowerShell scripts and dbatools calls needed.


Verifying Database Backups

Lori Brown reminds us to perform checksums and verify backups on completion:

I found out that I have been missing something from our regular database backups that I had no idea that I should have been using all along.  I know about verifying your backup file and have incorporated into our standard maintenance routines one that will periodically test backups by restoring using VERIFYONLY.  However, I totally missed also having CHECKSUM specified when creating backup files.  Ugh!!  Not sure how that happened but I am totally onboard with it now.  Better late than never!

Lori does explain what the consequences are in terms of time and CPU utilization so that you’re aware of the tradeoffs when enabling these options.


Breaking Out PowerShell Functions with PowerShell

Shane O’Neill shows us how we can use PowerShell to break PowerShell functions out into their own files:

The stupid thing that I was doing was that I was manually, visually scanning the script, copying out the function definitions, and pasting them into their own function files.

This was long, this was tedious, and this was not an efficient use of my time.

Especially since the scripts were not laid out as logically as I would have liked.

Click through to see how Shane solved this.
