Press "Enter" to skip to content

Curated SQL Posts

Fraud Detection With R And Azure

David Smith shows us an online fraud detection template:

Detecting fraudulent transactions is a key applucation of statistical modeling, especially in an age of online transactions. R of course has many functions and packages suited to this purpose, including binary classification techniques such as logistic regression.

If you’d like to implement a fraud-detection application, the Cortana Analytics gallery features an Online Fraud Detection Template. This is a step-by step guide to building a web-service which will score transactions by likelihood of fraud, created in five steps

Read through for the five follow-up articles.  This is a fantastic series and I plan to walk through it step by step myself.

Comments closed

Running Out Of In-Memory Disk Space

Jack Li shows us what happens when we run out of disk space for checkpoint files on a memory-optimized table:

The question is what happens if the disk that host the In-Memory checkpoint files runs out of disk space?  So I decided to do some testing and document the symptoms and recovery steps here in case you run into such issue.  With our Azure, test was really easy.  All I had to do was to spawn a VM and attach a very small disk to simulate out of disk space condition.

If your disk runs out of space, you will see various errors below though your database stays online

It looks like you can work your way out of a full drive scenario.  Hopefully, however, you won’t get into this scenario too often.

Comments closed

Indexing JSON

Jovan Popovic answers a question which has been on my mind:  how are we supposed to index JSON data in SQL Server 2016?

In this post I will show how you can add indexes on JSON properties in product catalog. In SQL Server 2016, you can use two type of indexes on JSON text:

  1. Index on computed column that index some specific properties in JSON.
  2. Full text search index that can index all key:value pairs in JSON objects.

This is the downside to JSON not being an official type:  indexing is somewhat limited.  In comparison, you could create XML indexes which were specially-designed to do the job of searching for text within an XML field.

Comments closed

Transaction Log Is Full

Andy Galbraith diagnoses an “ACTIVE_BACKUP_OR_RESTORE” issue:

The way FULL backups work in SQL Server, the transaction log is not released for re-use during a FULL backup, even if regular LOG backups are occurring or the database is in SIMPLE recovery.  This is due to the fact that the portion of the LOG that is used during the FULL has to be persisted during the FULL in order to be backed up at the end of the FULL – that is, the FULL backup includes the data at the start of the FULL (23:30) *plus* the LOG used until the end of the FULL (in the case of the 12/04-12/05 backup, the LOG used from 23:30 to 07:11).  This is the meaning of the ACTIVE_BACKUP_OR_RESTORE message – the LOG is waiting for the end of the active FULL backup before it can be released for re-use, which in this case was causing the LOG/LDF file to grow to fill its mount point.

This is interesting analysis and a reminder that even though you’re in Simple recovery mode, SQL Server still uses a transaction log and it’s just as important.

Comments closed

Parameterizing Procedures

Monica Rathbun shows how to combine several report queries:

I try to parameterize as many stored procedures as possible. This not only minimizes the amount of procedures I need to maintain, it in my opinion is a much cleaner way to code. It disturbs me when I see multiple stored procedures that pull the exact same data, but may have slight differences between them. Whether it be a sort, a where clause, or even just an extra field or two that makes it different, some developers think you need a different procedure for each one . Why not consolidate and parameterize?

The next step might be using dynamic SQL to build a query if there’s as much overlap as we see in Monica’s example.

Comments closed

Network Load Testing

Tim Radney uses iperf to perform network load testing:

Poor network performance can be a silent killer for application performance and my personal experience has shown this to be the case on many occasions. Often an application would start having performance issues and the application engineer would say that the application server looks good and starts to point their finger at the database. I would get a call to look at the database server and all indications showed that the database server was in good health (and this is where monitoring for key performance indicators and having a baseline helps!). Since the application and database teams were saying everything was good, we would ask the network team to check things out. The network team would look at a few things and give the all clear on their side as well. Each team troubleshooting and reviewing their respective systems took time, meanwhile the application performance was still suffering. The issue would then get escalated until all the teams would be asked to join a conference bridge to troubleshoot together. Eventually someone would start a deeper network test and determine that we either had a port saturation, routing, or some other complex networking issue. A few clicks or changing something on their end would eventually resolve the application slowness.

iperf is a nice tool for checking to see if your network throughput looks reasonable.

Comments closed

Sentiment Analysis

Dustin Ryan and Patrick Leblanc used Azure ML and Power BI to do sentiment analysis:

Using Azure ML and a free subscription to the Text Analytics API, I’m going to show you how to perform sentiment analysis and key phrase extraction on tweets with the hashtag #Colts (after this past Sunday’s 51-16 beat down of the Colts at the hands of the Jacksonville Jaguars, I’m bathing in the tears of Colts fans. Watch the highlights! ). Although my example here is somewhat humorous, the steps can be used to perform sentiment analysis and key phrase extraction on any text data as long as you can get the data into Power Query.

This is a fantastic example of how Azure ML can be used.  Read the whole thing.

Comments closed

Powershell To Modify Team Foundation Version Control

Kevin Eckart makes use of TFS commands to maintain source:

In our environment, changes made in the Test branch have to travel through the Main branch and into the Release branch to be deployed into production. Sometimes changes need to move through quickly without regard to other changes, especially in an environment where there may be a single coder. Note: the following code will merge all checked in code regardless of who checked it in. Be careful in multi-coder environments.

This is your daily public service announcement saying that if you don’t have your database code in source control, you really should get your database code into source control.

Comments closed