Author: Kevin Feasel

Installing SQL Server R Services Packages

Julie Koesmarno shows how to install an R package on a SQL Server 2016 instance which has SQL Server R Services installed:

When you start playing with R in SQL Server, sooner or later you would need to install some packages, for example ggplot2. You may run into a problem that sounds like this: “Error in library("ggplot2") : there is no package called ‘ggplot2’”.

The following script is used in the iris_demo.sql (SQLServer2016CTP3Samples\Advanced Analytics\iris_demo.sql), and would cause a missing library error if you don’t have the packages installed on SQL Server R Services yet.

Julie shows two methods, one a Good Idea and the other a Bad(?) Idea.

MDM Is Hard

Knut Juergensen gives an overview of Master Data Management:

The sad reality in many companies is that there is no MDM, or that it exists but is implemented and managed poorly. Often, this is due to lack of managerial-level understanding of its real value and, subsequently, a lack of investment.

I’ll recount some of the problems that we encountered with our MDM system, at least partly due to this lack of understanding and investment from management. Although the example is specific to engineering manufacturing, I know that similar fundamental flaws affect other MDM systems in other environments.

The primary master data in this case comprises the parts and products used in our assembly lines, which are provided and created by our in-house and external design engineers. A core issue with our MDM system is the source of this master data.

Knut gives a good explanation of what MDM is, how it works, and then an example of how it doesn’t work.  Read the whole thing.

Faster Extended Events Reader

The CSS SQL Server Engineers note that with SQL Server 2016, we’ll get faster Extended Events readers:

SQL Server 2016 improves the XEvent Linq reader scalability and performance. The XEvent UI in SQL Server Management Studio uses the XEvent Linq reader to process the events for display. Careful study of the XEvent Linq reader revealed opportunities for scalability and performance improvements.

I don’t know if this will push anyone in the direction of using Extended Events who isn’t already doing so, but I like the performance improvement here.

POCs As A Problem

Bill Vorhies argues that data science proofs of concept fall short of the mark:

If you do a quick read through of some of the Gartner or O’Reilly studies you’ll quickly see that a lack of executive sponsorship is one of the major barriers to adoption.  So isn’t the POC a good way to get the attention of the C-level?  Yes and no.

If as we described above it leads to the adoption of a series of stand alone ‘technology projects’, then no.  If it was really necessary to start with little firecracker POCs to demonstrate the explosive strategic value of becoming data-driven, then maybe so.

Here’s a simple change of mindset (borrowed from John Weathington, referenced above): instead of focusing on Proof of Concept, we should create projects to demonstrate Proof of Value.  By focusing on value we change the orientation so that any projects are aligned with value to the company.  In other words, they are aligned with the company’s strategic objectives.

This is an interesting argument which goes against my inclinations.  Check it out.

Downgrading Databases

Stephen West shows how to migrate a database to an earlier version of SQL Server:

The error occurs because SQL Server database files and backups are not backward compatible, which prevents restoring a database created on a higher SQL Server version to a lower version. Below are some of the steps to migrate a SQL Server database from a higher version to a lower version:

1. Use the Generate Scripts wizard of SQL Server Management Studio on the higher version

In this step, we will first script the schema of the desired database on the SQL Server 2012 instance, using the Generate Scripts wizard of SQL Server Management Studio, so that we can migrate the database to SQL Server 2008 R2.

There’s no easy way to do this; database upgrades are generally a one-way action.

Dave Mason Interviews Chrissy LeMaire

Dave Mason recently interviewed Chrissy LeMaire on PowerShell topics:

[Dave]: It’s natural to have bias toward the tools and technology we know, which can lead to spirited debate. Most of the time, it’s friendly and thoughtful. I’ve been getting a sense of “us and them” regarding T-SQL vs PowerShell. Do you get that sense too?

[Chrissy]: Yes, which has been pretty surprising to me. As a PowerShell MVP, it sometimes feels like fellow DBAs may see me as an invader of SQL territory, when in fact, I’ve been a DBA for 17 of the 20 years that I’ve been in IT. I even updated my Twitter profile to make it clear that I’ve been a SQL Server DBA since 1999. I believe this issue will resolve itself as DBAs begin to see how PowerShell can make their jobs way easier. I’m also hoping that my SQL Server Migration script, which has no T-SQL (or even C#) equivalent, will be as persuasive as it is useful.

I remember reading an article in SQL Server Magazine back around 2002 that made the case for DBAs to learn T-SQL and other scripting languages. My first thought was “Wait, there are DBAs that don’t know T-SQL?” I always thought T-SQL was part of the job description, and it’s the same now with PowerShell. This belief was further enforced by the fact that when I was getting started with PowerShell and SQL, Simple Talk’s Phil Factor and Laerte Junior already had a ton of stuff out there and a few books about SQL Server and PowerShell had already been written. I thought I was late to the party.

This is a fun read; check it out.

Predicting ER Deaths

Konur Unyelioglu uses a neural network to predict emergency department deaths:

In this article we used an artificial neural network (ANN) from the Spark machine learning library as a classifier to predict emergency department deaths due to heart disease. We discussed a high-level process for feature selection and for choosing the number of hidden layers of the network and the number of computational units. Based on that process, we found a model that achieved very good performance on test data. We observed that the Spark MLlib API is simple and easy to use for training the classifier and calculating its performance metrics. In reference to Hastie et al., we have some final comments.

Articles like this are what got me interested in data analysis to begin with.

Try-Catch Doesn’t Handle Everything

Tara Kizer notes that there are limits in what TRY/CATCH blocks handle in SQL Server:

It’s well documented in Books Online (BOL). If you’re like me, then tl;dr. Are we even calling it Books Online these days? I still say “bookmark lookup” instead of “key lookup”. I suppose I’ll be saying Books Online for quite some time too. At least these days it really is online.

Here’s a shortened version:

  • Warnings or informational messages that have a severity of 10 or lower

  • Errors that have a severity of 20 or higher that stop the session

  • Attentions

  • When a session is KILLed

It’s important to know that not everything gets caught, particularly major issues.
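
To make the shortened list a bit more concrete, here is a minimal Python sketch that encodes only the four cases Tara lists. The `RaisedEvent` class and `uncaught_reason` function are hypothetical names made up for illustration; they are not part of SQL Server or any driver, and because this is the shortened list, a result of `None` only means none of these four rules applies, not that the CATCH block is guaranteed to fire.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RaisedEvent:
    """Hypothetical description of something raised inside a TRY block."""
    severity: int = 16
    stops_session: bool = False    # True when the error terminates the session
    is_attention: bool = False     # an attention, such as a client cancel request
    session_killed: bool = False   # the session was ended with KILL


def uncaught_reason(event: RaisedEvent) -> Optional[str]:
    """Return which rule from the shortened list applies, or None if none does."""
    if event.session_killed:
        return "session was KILLed"
    if event.is_attention:
        return "attention"
    if event.severity <= 10:
        return "warning or informational message (severity 10 or lower)"
    if event.severity >= 20 and event.stops_session:
        return "severity 20 or higher and the session stopped"
    return None


if __name__ == "__main__":
    samples = (
        RaisedEvent(severity=10),
        RaisedEvent(severity=16),
        RaisedEvent(severity=20, stops_session=True),
    )
    for ev in samples:
        print(ev.severity, uncaught_reason(ev))
```

The point of the sketch is the shape of the rules: the mild end (severity 10 or lower) and the catastrophic end (session-ending errors, attentions, KILL) both bypass CATCH.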

Monitoring MapR With ELK

Mathieu Dumoulin shows how to feed MapR metrics into Elasticsearch and monitor with Kibana:

There are several ways to keep the data updated: a cron job, a Linux daemon running as a service, or a stream tool such as StreamSets.

The easiest way might be to run the task as a cron job with an interval of one to thirty seconds depending on monitoring needs. This may be suitable for a proof of concept or a small test cluster or even a production cluster. The main drawback of using cron is that the control over the execution is limited to running the script and resources aren’t shared, meaning we are opening and closing a connection to Elasticsearch as well as doing the work to call the REST endpoint for each invocation.

Kibana makes for some pretty dashboards.
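
The cron drawback Mathieu describes, reopening the connection on every invocation, is what a long-running poller avoids. Here is a rough Python sketch of that daemon-style alternative, under stated assumptions: `fetch` and `open_sink` are hypothetical callables standing in for the call to the metrics REST endpoint and the Elasticsearch connection, and `InMemorySink` is a placeholder rather than a real client. None of this comes from the original post.

```python
import time
from typing import Callable, Dict, List

Metric = Dict[str, float]


class InMemorySink:
    """Hypothetical stand-in for an Elasticsearch connection."""

    def __init__(self) -> None:
        self.docs: List[Metric] = []

    def send(self, batch: List[Metric]) -> None:
        self.docs.extend(batch)


def run_poller(
    fetch: Callable[[], List[Metric]],
    open_sink: Callable[[], InMemorySink],
    interval_seconds: float,
    iterations: int,
    sleep: Callable[[float], None] = time.sleep,
) -> InMemorySink:
    """Poll `fetch` on a fixed interval, reusing one sink for every batch."""
    sink = open_sink()  # opened once, unlike a cron job that reconnects each run
    for i in range(iterations):
        batch = fetch()
        if batch:
            sink.send(batch)
        if i + 1 < iterations:
            sleep(interval_seconds)  # wait between polls, not after the last one
    return sink


if __name__ == "__main__":
    result = run_poller(lambda: [{"cpu": 0.5}], InMemorySink, 1.0, 3)
    print(len(result.docs))
```

In a real service the loop would run until stopped rather than for a fixed number of iterations; the bounded count and the injectable `sleep` are only there to keep the sketch easy to run and test.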

Azure SQL Database Threat Detection

Warner Chaves has a video on Azure SQL Database Threat Detection:

As I mentioned, right now the tool is more of a reactive tool as it only lets you know after it has detected the anomaly. In the future, I would love to see a preventive configuration where one can specify a policy to completely prevent suspicious SQL from running. Sure, there can always be false alarms; however, if all the application query patterns are known, this number should be very low. If the database is open to ad-hoc querying, then a policy could prevent only the queries, or even shut down the database after several different alerts have been generated. The more flexible the configuration, the better, but in the end what I want to see is a move from alerting me to preventing the injection to begin with.

In the demo, I’m going to go through enabling Azure SQL threat detection, some basic injection patterns and what the alerts look like. Let’s check it out!

This looks interesting.  I’ll have to give it a try on a test database.
