Press "Enter" to skip to content

Curated SQL Posts

Checking Availability Group Status With Powershell

Tracy Boggiano shows off a script which checks Availability Group status of selected servers:

My favorite thing to automate using PowerShell is checking on the status of things on multiple servers.  For example, after patching your environment, running a quick query to make sure the version number is the same everywhere.  In this example, we will use a cmdlet my coworker wrote in combination with my cmdlet to check the health of all the Availability Groups across our landscape, or you could use it to check just one.  After all, I do consider myself to be an HA/DR nut.

I’ve blogged about my coworker’s Get-CmsHosts cmdlet before, but now he has written about it himself and shared it on GitHub, so you can read more about it here.

In my cmdlet I use the same code that is used in the SSMS AG dashboard to check the status of my Availability Groups.

Tracy includes her cmdlet as well as several example calls.
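
To give a flavor of the approach, here is a rough sketch of my own (not Tracy’s code; the cmdlet and server names are hypothetical, and it assumes the SqlServer module for Invoke-Sqlcmd). It queries the same DMVs the SSMS dashboard reads:

# Sketch: check AG synchronization health on every instance registered with a CMS
$query = @"
SELECT ag.name, ags.synchronization_health_desc
FROM sys.dm_hadr_availability_group_states AS ags
JOIN sys.availability_groups AS ag
    ON ag.group_id = ags.group_id;
"@

Get-CmsHosts -CmsInstance 'srv-mycms-01' |
    ForEach-Object { Invoke-Sqlcmd -ServerInstance $_ -Query $query }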


Considerations With Third-Party Backup Tools

Gianluca Sartori gives us a laundry list of potential problems with third-party database backup solutions:

2. Potentially dangerous separation of duties
Backup tools are often run and controlled by Windows admins, who may or may not be the same people responsible for taking care of databases. Well, surprise: if you’re taking backups you’re responsible for them, and backups are the main task of the DBA, so… congrats: you’re the DBA now, like it or not.
If your Windows admins are not OK with being the DBA, but at the same time are OK with taking backups, make sure you discuss who is accountable for data loss when things go south. Don’t get fooled: you must not be responsible for restores (which are, ultimately, the reason you take backups) if you don’t have control over the backup process. Period.

There is plenty of sound advice in this post.  These points apply to roll-your-own solutions as well, but the main focus is on enterprise backup tools, which are in many cases surprisingly shoddy.


Powershell And CMS

Mark Wilkinson loves Powershell and he loves Central Management Servers and he loves combining the two:

Get-CmsHosts is a function I wrote as part of a custom PowerShell module we maintain internally at my employer. It is simple to use, but is the base of most automation projects I work on.

Simple Example

PS> Get-CmsHosts -SqlInstance 'srv-' -CmsInstance srv-mycms-01

This example will connect to srv-mycms-01 and return a distinct list of instance host names registered with that CMS server that start with the string srv-. This output can then be piped to other commands:
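
For example (my own illustration rather than Mark’s, assuming the SqlServer module is installed), the post-patching version check might look like this:

PS> Get-CmsHosts -SqlInstance 'srv-' -CmsInstance srv-mycms-01 | ForEach-Object { Invoke-Sqlcmd -ServerInstance $_ -Query 'SELECT @@VERSION;' }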

Read on for more examples and details, and then grab the script at the end of Mark’s post.


Disabling Named Pipes Using Powershell

Brian Carrig shows how to disable the Named Pipes protocol using Powershell:

Windows and POSIX systems both support something referred to as “named pipes”, although they are different concepts. For the purposes of this post I am referring only to the Windows version. By default on most editions of SQL Server (every edition except Express Edition), there are three supported and enabled protocols for SQL Server to listen on – Shared Memory, TCP/IP and Named Pipes. The inclusion of named pipes has always confused me somewhat. In theory, named pipes allow communication between applications without the overhead of going through the network layer. This advantage disappears when you want to communicate over the network using named pipes. In all modern versions of SQL Server, named pipes do not support Kerberos, so most shops likely will not be using, or should not be using, named pipes to communicate with SQL Server.

Security best practices dictate that if you are not using a particular protocol, you should disable it. There is an option to disable it in the GUI in Configuration Manager, but since this T-SQL Tuesday blog post is about using PowerShell, it does not make sense to cover it here. Nor is it particularly easy to use the GUI to make a configuration change across hundreds of SQL instances. Unfortunately, I have not found a good way to make this change that does not involve using WMI; if anybody is aware of a better method, I welcome your feedback.

Read the whole thing.  You should have Named Pipes enabled if you’re running a NetBIOS network.  But because it’s not 1997 anymore, you probably shouldn’t be running a NetBIOS network.
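
For reference, the WMI route goes through SMO’s ManagedComputer object. A minimal sketch (the machine and instance names are hypothetical; requires the SQL Server SMO assemblies):

# Disable the Named Pipes protocol via SMO's WMI provider
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SqlWmiManagement')
$wmi = New-Object Microsoft.SqlServer.Management.Smo.Wmi.ManagedComputer 'MYSERVER'
$np = $wmi.ServerInstances['MSSQLSERVER'].ServerProtocols['Np']
$np.IsEnabled = $false
$np.Alter()   # takes effect after the SQL Server service restarts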


Spinning Up Hyper-V VMs With Powershell

Andrew Pruski shows us how to enable Hyper-V and create a Windows VM using Powershell:

I’m constantly spinning up VMs and then blowing them away. OK, using the Hyper-V GUI isn’t too bad, but when I’m creating multiple machines it can be a bit of a pain.

So here’s the details on the script I’ve written, hopefully it could be of some use to you too.

Click through for the script; it’s ultimately just a few lines of code.
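
For a sense of how few, here is a bare-bones sketch (mine, not Andrew’s; the VM name, switch, and paths are made up):

# Enable the Hyper-V feature (one-time; requires a reboot)
Enable-WindowsOptionalFeature -Online -FeatureName Microsoft-Hyper-V -All

# Create, configure, and start a Generation 2 VM
New-VM -Name 'SQLVM01' -Generation 2 -MemoryStartupBytes 4GB `
    -NewVHDPath 'D:\VMs\SQLVM01.vhdx' -NewVHDSizeBytes 60GB -SwitchName 'External'
Set-VMProcessor -VMName 'SQLVM01' -Count 2
Start-VM -Name 'SQLVM01'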


Naive PCA With R

Pablo Bernabeu gives us a naive method for performing a Principal Component Analysis:

STAGE 1.  Determine whether PCA is appropriate at all, considering the variables

  • Variables should be inter-correlated enough but not too much. Field et al. (2012) provide some thresholds, suggesting that no variable should have many correlations below .30, or any correlation at all above .90. Thus, in the example here, variable Q06 should probably be excluded from the PCA.

  • Bartlett’s test, on the nature of the intercorrelations, should be significant. Significance suggests that the correlation matrix is not an identity matrix, that is, that the correlations are not mere sampling error.

  • KMO (Kaiser-Meyer-Olkin), a measure of sampling adequacy based on common variance (so similar in purpose to Bartlett’s). As Field et al. review, ‘values between .5 and .7 are mediocre, values between .7 and .8 are good, values between .8 and .9 are great and values above .9 are superb’ (p. 761). There’s a general score as well as one per variable. The general one will often be good, whereas the individual scores are more likely to fail. Any variable with a score below .5 should probably be removed, and the test should be run again.

  • Determinant: a check on multicollinearity. The result should preferably stay above .00001; a determinant below that threshold signals excessive multicollinearity.
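
One connection worth noting (my gloss, not Pablo’s): Bartlett’s test is computed from that same determinant. A standard form of the statistic is

    chi-squared = -(n - 1 - (2p + 5)/6) * ln|R|

with p(p - 1)/2 degrees of freedom, where n is the sample size, p is the number of variables, and |R| is the determinant of the correlation matrix. A determinant collapsing toward zero is thus exactly what drives the test to significance.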

PCA is a powerful tool in several fields, including clinical testing.


Serverless Lambda Architecture

Laith Al-Saadoon shows off a new Amazon Web Services product, AWS Glue, which allows you to build a data processing system on the Lambda architecture without directly provisioning any EC2 instances:

With the launch of AWS Glue, AWS provides a portfolio of services to architect a Big Data platform without managing any servers or clusters. AWS Glue is a fully managed extract, transform, and load (ETL) service that makes it easy for customers to prepare and load their data for analytics. You can create and run an ETL job with a few clicks in the AWS Management Console. You simply point AWS Glue to your data stored on AWS, and AWS Glue discovers your data and stores the associated metadata (for example, the table definition and schema) in the AWS Glue Data Catalog. After it’s cataloged, your data is immediately searchable, queryable, and available for ETL.

AWS Glue generates the code to execute your data transformations and data loading processes. Furthermore, AWS Glue provides a managed Spark execution environment to run ETL jobs against a data lake in Amazon S3. In short, you can now run a Lambda Architecture in AWS in a completely serverless fashion!

“Serverless” applications allow you to build and run applications without thinking about servers. What this means is that you can now stream data in real-time, process huge volumes of data in S3, and run SQL queries and visualizations against that data without managing server provisioning, installation, patching, or capacity scaling. This frees up your users to spend more time interpreting the data and deriving business value for your organization.
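
To make the “point AWS Glue at your data” step concrete, creating and starting a crawler from the AWS CLI looks roughly like this (my sketch; the crawler, role, database, and bucket names are placeholders):

# Catalog everything under an S3 prefix, then kick off the crawl
# (on Windows PowerShell 5.1 you may need to escape the embedded quotes in the JSON)
aws glue create-crawler --name 'demo-crawler' `
    --role 'AWSGlueServiceRole-demo' `
    --database-name 'demo_db' `
    --targets '{"S3Targets":[{"Path":"s3://my-bucket/landing/"}]}'
aws glue start-crawler --name 'demo-crawler'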

Laith has a working demo of the process available as well.


Trigram Search In SQL Server

Paul White shows how to implement trigram wildcard searches in SQL Server:

The basic idea of a trigram search is quite simple:

  1. Persist three-character substrings (trigrams) of the target data.
  2. Split the search term(s) into trigrams.
  3. Match search trigrams against the stored trigrams (equality search).
  4. Intersect the qualified rows to find strings that match all trigrams.
  5. Apply the original search filter to the much-reduced intersection.

We will work through an example to see exactly how this all works, and what the trade-offs are.
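
To make step 2 concrete, here is the trigram split in miniature (PowerShell for consistency with the rest of this roundup; Paul’s actual implementation is in T-SQL):

# Split a search term into its distinct three-character substrings
function Get-Trigram {
    param([string]$Text)
    $t = $Text.ToLowerInvariant()
    if ($t.Length -lt 3) { return }
    0..($t.Length - 3) |
        ForEach-Object { $t.Substring($_, 3) } |
        Sort-Object -Unique
}

Get-Trigram 'aardvark'   # -> aar, ard, ark, dva, rdv, var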

A must-read.  N-gram searching in SQL Server is an example of a non-obvious data architecture which performs much better than the obvious alternative, at least when the conditions are right.


Transit Data Visualization In R

Goncalo Trincao Cunha shows us how to plot General Transit Feed Specification data in R:

GTFS (General Transit Feed Specification) is a specification that defines a data format for public transportation routes, stops, schedules, and associated geographic information.

In this post, we’ll use R with ggplot2 and ggmap to visualize GTFS route and schedule information on a map.

This post uses a GTFS feed from CARRIS, a bus public transport operator in the city of Lisbon.

Click through for code and a few interesting maps of Lisbon, Portugal.
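
Since a GTFS feed is just a zip archive of plain CSV files, you can peek at one before ever opening R. A quick look in PowerShell (the feed path is hypothetical; the column names are standard GTFS fields):

# Unzip the feed and inspect the first few stops
Expand-Archive -Path '.\carris-gtfs.zip' -DestinationPath '.\gtfs'
Import-Csv '.\gtfs\stops.txt' |
    Select-Object stop_id, stop_name, stop_lat, stop_lon -First 5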


Deleting SSAS Cube Partitions With Powershell

Richie Lee shows how to remove Analysis Services cube partitions using Powershell:

One such example of an ad-hoc DBA task was when I had to delete about 600 partitions from a measure group that had thousands of partitions. Doing this manually would be ridiculous, so at the time I created a SQL script that used some dynamic T-SQL to create the delete commands in XMLA. XMLA has no “delete if exists” type syntax, so if I needed to run this again, this dynamic SQL output wouldn’t work. And so I decided that if I had to run the same task again I would write a PowerShell script that would run DSC-style and drop the partitions that were no longer required. And funnily enough, that is exactly what I had to do.

I knew I would be able to create a PowerShell script that used AMO to check if a partition exists and drop it if it did. I also wanted the script to take into account any other partitions in other measure groups that might also need to be dropped. So I made sure the script uses PowerShell switches that can be included when calling the function, and if they are included then the pertinent partitions in that measure group will be deleted. So you can run the script for one, some, or all of the measure groups in a cube.

Click through for the script.
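
The heart of the “delete if exists” check with AMO looks something like this (my sketch, not Richie’s script; the server, database, cube, measure group, and partition names are hypothetical):

# Connect via AMO and drop a partition only if it exists
[void][System.Reflection.Assembly]::LoadWithPartialName('Microsoft.AnalysisServices')
$server = New-Object Microsoft.AnalysisServices.Server
$server.Connect('MYSSAS01')
$mg = $server.Databases.FindByName('SalesDW').Cubes.FindByName('Sales').MeasureGroups.FindByName('Internet Sales')
$partition = $mg.Partitions.FindByName('Internet_Sales_2012')
if ($partition) { $partition.Drop() }   # the "delete if exists" that XMLA lacks
$server.Disconnect()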
