Press "Enter" to skip to content

Month: May 2016

Predicting ER Deaths

Konur Unyelioglu uses a neural network to predict emergency department deaths:

In this article we used an artificial neural network (ANN) from the Spark machine learning library as a classifier to predict emergency department deaths due to heart disease. We discussed a high-level process for feature selection, choosing the number of hidden layers of the network, and choosing the number of computational units. Based on that process, we found a model that achieved very good performance on test data. We observed that the Spark MLlib API is simple and easy to use for training the classifier and calculating its performance metrics. In reference to Hastie et al., we have some final comments.

Articles like this are what got me interested in data analysis to begin with.


Try-Catch Doesn’t Handle Everything

Tara Kizer notes that there are limits in what TRY/CATCH blocks handle in SQL Server:

It’s well documented in Books Online (BOL). If you’re like me, then tl;dr. Are we even calling it Books Online these days? I still say “bookmark lookup” instead of “key lookup”. I suppose I’ll be saying Books Online for quite some time too. At least these days it really is online.

Here’s a shortened version:

  • Warnings or informational messages that have a severity of 10 or lower

  • Errors that have a severity of 20 or higher that stop the session

  • Attentions

  • When a session is KILLed

It’s important to know that not everything gets caught, particularly major issues.
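To see the first case in action, here’s a minimal sketch (runnable against any database, no tables needed) showing that a severity-10 message raised inside a TRY block never transfers control to the CATCH block, while a severity-16 error does:

BEGIN TRY
    -- Severity 10 is informational, so control never transfers to the CATCH block
    RAISERROR(N'Just an informational message', 10, 1);
    PRINT 'Still in the TRY block';

    -- Severity 16 is a genuine error, so this does land in the CATCH block
    RAISERROR(N'A real error', 16, 1);
    PRINT 'Never reached';
END TRY
BEGIN CATCH
    PRINT 'Caught: ' + ERROR_MESSAGE();
END CATCH;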


Monitoring MapR With ELK

Mathieu Dumoulin shows how to feed MapR metrics into ElasticSearch and monitor with Kibana:

There are several ways to keep the data updated: a cron job, a Linux daemon running as a service, or a streaming tool such as StreamSets.

The easiest way might be to run the task as a cron job with an interval of one to thirty seconds, depending on monitoring needs. This may be suitable for a proof of concept, a small test cluster, or even a production cluster. The main drawback of using cron is that control over the execution is limited to running the script, and resources aren’t shared, meaning we open and close a connection to Elasticsearch, as well as do the work of calling the REST endpoint, on every invocation.

Kibana makes for some pretty dashboards.


Azure SQL Database Threat Detection

Warner Chaves has a video on Azure SQL Database Threat Detection:

As I mentioned, right now the tool is more of a reactive tool as it only lets you know after it has detected the anomaly. In the future, I would love to see a preventive configuration where one can specify a policy to completely prevent suspicious SQL from running. Sure, there can always be false alarms; however, if all the application query patterns are known, this number should be very low. If the database is open to ad hoc querying, then a policy could allow preventing only the suspicious queries, or even shutting down the database after several different alerts have been generated. The more flexible the configuration, the better, but in the end what I want to see is a move from alerting me to preventing the injection to begin with.

In the demo, I’m going to go through enabling Azure SQL threat detection, some basic injection patterns and what the alerts look like. Let’s check it out!

This looks interesting.  I’ll have to give it a try on a test database.
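For context, the kind of basic injection pattern the demo presumably exercises is classic string concatenation. Here’s a rough sketch, assuming a hypothetical dbo.Customers table: the first half is the vulnerable concatenation shape that detection tools flag, and the second is the parameterized version that avoids it.

-- Vulnerable shape: user input concatenated straight into the statement.
-- A payload like ' OR 1=1 -- turns the filter into "return everything".
DECLARE @UserInput NVARCHAR(100) = N''' OR 1=1 --';
DECLARE @sql NVARCHAR(MAX) =
    N'SELECT * FROM dbo.Customers WHERE LastName = ''' + @UserInput + N'''';
EXEC (@sql);

-- Safer shape: the input travels as a parameter and is never parsed as SQL.
EXEC sys.sp_executesql
    N'SELECT * FROM dbo.Customers WHERE LastName = @LastName',
    N'@LastName NVARCHAR(100)',
    @LastName = @UserInput;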


Introduction To ElasticSearch

Hasan Rahhal gives a good introduction to ElasticSearch:

At the top level, hits.total is the total number of docs returned by an empty search query, and max_score is the maximum score a document can take for a specific query. In our case it’s one, since no query was specified.

In _shards.total the value is the number of Lucene indexes that Elasticsearch created for that index. The default number is always 5 unless we specify otherwise at index creation time. More details about shards are explained here.

ElasticSearch is designed to store things like logs and monitoring metrics, and the interface is JSON.  This makes it very useful for certain tasks and infuriatingly difficult for others (like advanced aggregations).  Still, in a medium-sized or larger environment, this is probably a technology you either are using today or want to use.


Minion CheckDB Beta

Sean McCown has released a beta of Minion CheckDB:

We’ve had many of you asking to be part of the Minion CheckDB beta and now is the time. We’re putting the finishing touches on the 1st beta and it’s looking great with some fabulous features.

So this is the open call for beta users. If you’d like to meet Codex before anyone else then send me an email.

We have some requirements though. We don’t want dead beta users. This is your chance to shape the product and we want to hear from you. So if you’re serious about putting the product through its paces then we definitely want you. You should be ready to provide real feedback, report bugs as you find them, and work with us to fix them.

Considering that I’ve bothered Sean about this at every SQL Saturday the two of us have been at this year, I’d better get moving and join the beta.


Building Blocks Of Cortana Intelligence Suite

Melissa Coates has put together a new presentation on the building blocks of the Cortana Intelligence Suite:

Each section will wrap up with an example of the ‘building blocks’ to formulate a solution. Although these ‘building blocks’ examples are greatly simplified, my hope is it will generate ideas for how the different Azure components can fit together for formulating hybrid solutions.

Check it out, as there are a lot of pieces.


Cortana Intelligence Suite

Buck Woody discusses various components of the Cortana Intelligence Suite:

It’s not a simple matter of “choose one from column B and two from column A” – you have to learn the processes, and then the tools, and then think about your situation. In other words, some things are complicated because they are…complicated. However:

There are some things you can consider out of the box. So I spoke with my friend Romit Girdhar while we were co-teaching in London last week, and he put together a great visualization. You can see them here, and download the PDF below. Thanks, Romit!

And of course they had to change the name—it wouldn’t be a Microsoft product if the name didn’t change every six months…


Forcing Query Store Plans

Grant Fritchey wonders: if you force a plan using Query Store but the plan ages out of cache, do you still use the forced plan?

To start with, a small stored procedure that I use all the time for bad parameter sniffing demos that reliably gets different plans with different values due to statistics skew:

CREATE PROC dbo.spAddressByCity @City NVARCHAR(30)
AS
SELECT  a.AddressID,
        a.AddressLine1,
        a.AddressLine2,
        a.City,
        sp.Name AS StateProvinceName,
        a.PostalCode
FROM    Person.Address AS a
JOIN    Person.StateProvince AS sp
        ON a.StateProvinceID = sp.StateProvinceID
WHERE   a.City = @City;

If this procedure is called for a value of ‘London’ it gets a plan with a Merge Join. For most other values it gets a plan with a Loops Join. Here’s an example of the ‘London’ plan:

It’s a good question with a good answer.
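If you want to replicate the experiment yourself, the mechanics look roughly like this; it’s a sketch, and the query and plan IDs are placeholders you’d replace with the real values returned by the Query Store catalog views:

-- Find the query and its plans in Query Store (the IDs returned will vary)
SELECT qsq.query_id, qsp.plan_id, qsqt.query_sql_text
FROM sys.query_store_query AS qsq
JOIN sys.query_store_plan AS qsp
    ON qsq.query_id = qsp.query_id
JOIN sys.query_store_query_text AS qsqt
    ON qsq.query_text_id = qsqt.query_text_id
WHERE qsqt.query_sql_text LIKE N'%spAddressByCity%';

-- Force the Merge Join plan for that query (42 and 142 are placeholder IDs)
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 142;

-- Evict the plan from cache to simulate it aging out, then re-run the procedure
DBCC FREEPROCCACHE;

-- Remove the forcing once the experiment is done
EXEC sys.sp_query_store_unforce_plan @query_id = 42, @plan_id = 142;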


Threat Modeling

Michael Howard discusses threat modeling in Azure:

Many conversations I have with customers go like this:

Customer: “We cannot deploy on Azure until we know that appropriate defenses are in place.”

Me: “I agree 100%, so let’s build a threat model for the proposed design and see what you need to do, and what Microsoft provides.”

A couple of days pass as we build and iterate on the threat model.

Now here’s when the customer has an “a-ha” moment. At the end of the process we have a list of defenses for each part of the architecture and we all agree that the defenses are correct and appropriate.

It’s at that point the customer realizes that they can deploy a cloud-based solution securely.

My tongue-in-cheek response is that of course a customer can deploy a cloud-based solution securely if they have Michael Howard walking them through it.  Michael does include some links on Azure security configuration and threat modeling resources, so check those out.
