Press "Enter" to skip to content

Month: April 2018

Faceting With R And SQL Server ML Services

Marlon Ribunal has a quick example showing how to build faceted plots with SQL Server ML Services and ggplot2:

In my previous post, I have demonstrated how easy it is to create a bar graph in SQL Server 2017 In-Database Machine Learning using  R.

We’re going to build upon that basic graph.

Sometimes doing data analysis would require us to look at an overview of our data across specific partitions, say a year. For example, we want to see how our product groups fare on month-to-month basis across the last 4 years.

In a data analytics perspective, there are quite a handful of data points in this requirement – data aggregate (quantity), monthly periods, and year partitions.

One of the approaches to handle such requirement is by using a facet. Faceting is a way of plotting subsets of data into a matrix of panels based on one or more variables – or facets.

Click through for the example and code.  Facets are quite useful, but they run the risk of misleading if you squeeze too many onto the screen.  The same line can look quite different with a “tall” facet versus a “wide” facet, and that can change how people interpret your visual.

Comments closed

Tools For Various SQL Server Stacks

Warren Estes breaks out some tooling recommendations by stacks—that is, common use cases:

The Admin stack is probably the most important stack here. You still using maintenance tasks via SSMS? stop doing that. Rebuilding indexes every night? Maybe rethink that.

How you keep track of, monitor and do basics DBA tasks?

CMS server 
Ok so this can involve SSMS, but a feature not a lot of people may not use. We use it to keep track of all of our instances and push things..oh baby baaaby!  It also allows me to combine PoSh to do work against instances, gather data (historical, dmv…etc) and do a boat load of admin stuff without pointing and clicking. Heck I don’t even have to open SSMS to use my CMS server at all.

SentryOne SQLSentry
SQLSentry can automatically defrag indexes for you and update stats. You could use this instead of the below choices for this aspect if desired. Although not free, it’s an option we have in our environment and I love me some options.

Ola hallengren/Minionware
Both amazing options for backup, reindexing and checkdb. Although most places i’ve worked use Ola’s scripts by default. HOWEVER…. Minion has some pretty nice options that are FAR more configurable than Ola’s. We have mitigated some large DB issues by rolling our own code on top of Ola’s scripts. We could avoid this by simply using Minionware!

I’m a huge fan of the Minionware suite.  And several other things Warren mentions.

Comments closed

Restoration And That CHECKDB Message

Mike Fal investigates an interesting message in the SQL Server error log after a database restoration:

Recently I was doing some work with a friend around some database restores. It was pretty routine stuff. However, after one restore my friend came across something in the SQL Error Log that caught him by surprise. As part of the restore, there was a CHECKDB message for the restored database:

My friend’s first reaction was “why is SQL Server doing a DBCC CHECKDB as part of the restore?” He was concerned, because CHECKDB is a pretty hefty operation and this could really impact the restore time if he had to wait on a CHECKDB to complete. But the other confusing thing was that the date for the CHECKDB didn’t match up with the restore timing.

Click through to learn the answer.

Comments closed

Essential SQL Server Tools

Tracy Boggiano has a top 5 list of tools she uses on a day-to-day basis:

This T-SQL Tuesday is brought to us by Jens Vestergaard (b |  t), and we are asked to share our favorite SQL Server tools. Hint Profiler will not be on the list. But where do you start there are so many tools out there. In alphabetical order here are my top 5 tools because I can’t pick which one is better than other.

Click through to see Tracy’s top 5 list.

Comments closed

SQL Server Powershell Module On PowerShell 6 Core

Drew Furgiuele is ready to retire to his fainting couch:

So, I bit: the tweet he referenced was announcing a new version of the SQL Server module (21.0.17240). Here’s a quick list of the updates included:

  • Added Get-SqlBackupHistory cmdlet
  • Ported PS Provider to .NET Core for PowerShell 6 support 
  • Ported a subset of cmdlets to .NET Core for PowerShell 6 support 
  • Powershell 6 support on macOS and Linux in Preview. 
  • To use SqlServer provider on macOS and Linux mount it using a new PSDrive. Examples in documentation.
  • Removed restriction of 64-bit OS for this module. Note: Invoke-Sqlcmd cmdlet is the only cmdlet not supported on 32-bit OS.

The bold lines are my emphasis: with PowerShell 6 support for Linux and macOS, that opens up new avenues for connecting to and automating SQL Server from any platform. This is exciting stuff. I couldn’t wait to take it for a spin, so I set up a quick demo environment to test it out.

It’s not perfect but it did give Drew the vapors, which is a good sign that they’re on the right track.

Comments closed

Collecting Login Details With dbatools

Chrissy LeMaire shows us several ways to track who has connected to your SQL Server instance:

Using the default trace is pretty lightweight and backwards compatible. While I generally try to avoid traces, I like this method because it doesn’t require remote access, it works on older SQL instances, it’s accurate and reading from the trace isn’t as CPU-intensive as it would be with an Extended Event.

Click through for details on this as well as three other methods, along with the dbatools glue to make it work.

Comments closed

A Basic Explanation Of Associative Rule Learing

Akshansh Jain has some notes on associative rules:

Support tells us that how frequent is an item, or an itemset, in all of the data. It basically tells us how popular an itemset is in the given dataset. For example, in the above-given dataset, if we look at Learning Spark, we can calculate its support by taking the number of transactions in which it has occurred and dividing it by the total number of transactions.

Support{Learning Spark} = 4/5
Support{Programming in Scala} = 2/5
Support{Learning Spark, Programming in Scala} = 1/5

Support tells us how important or interesting an itemset is, based on its number of occurrences. This is an important measure, as in real data there are millions and billions of records, and working on every itemset is pointless, as in millions of purchases if a user buys Programming in Scala and a cooking book, it would be of no interest to us.

Read the whole thing.

Comments closed

Building Forest Plots With ggplot2

Faisal Atakora shows how to build a forest plot using ggplot2:

To build a Forest Plot often the forestplot package is used in R. However, I find the ggplot2 to have more advantages in making Forest Plots, such as enable inclusion of several variables with many categories in a lattice form. You can also use any scale of your choice such as log scale etc. In this post, I will introduce how to plot Risk Ratios and their Confidence Intervals of several conditions.

Click through for the script.  You might also want to compare it to the forestplot package to see how these differ.

Comments closed

Working With Dates And Times In Logstash

Mike Hillwig continues his Logstash series:

So far, I’ve done a decent job getting the data into shape. My biggest challenge, though, was the dates and times. Dates are in one field, and the times are in another. Dates look like 2014-02-26 and times look like 0852 Using a traditional datetime datatype would be nice to have, so I’ll have to do it myself. In order to turn a date and time into a datetime, I need to abut the two fields and then convert it.

I accomplished this by using a mutate filter, employing by several add_field commands. Notice how I simply abut the two times.

Read on to see how Mike does it.

Comments closed