2018-05-23 – Curated SQL

Combining Keras With Apache MXNet

Distributed training with Keras 2 and MXNet

This article shows how to install Keras-MXNet and demonstrates how to train a CNN and an RNN. If you tried distributed training with other deep learning engines before, you know that it can be tedious and difficult. Let us show you what it’s like now, with Keras-MXNet.

Installation is only a few steps

1. Deploy an AWS Deep Learning AMI
2. Install Keras-MXNet
3. Configure Keras-MXNet

The Deep Learning AMI is already set up for trial, so it should be easy to follow along.

There Is No Easy Button With Predictive Analytics

Published 2018-05-23 by Kevin Feasel

Scott Mutchler dispels some myths:

There are a couple of myths that I see more and more these days. Like many myths, they seem plausible on the surface, but experienced data scientists know that the reality is more nuanced (and sadly requires more work).

Myths:

Deep learning (or Cognitive Analytics) is an easy button. You can throw massive amounts of data and the algorithm will deliver a near optimal model.

Big data is always better than small data. More rows of data always result in a significantly better model than fewer rows of data.

Both of these myths lead some (lately it seems many) people to conclude that data scientists will eventually become superfluous. With enough data and advanced algorithms maybe we don’t need these expensive data scientists…

Read on for a dismantling of these myths. There’s a lot more to it than “collect all of the data and throw it at an algorithm” (and even then, “all” the data rarely really means all, which I think deserves to be a third myth). H/T R-bloggers

Multi-Layered Security With Docker Containers

Published 2018-05-23 by Kevin Feasel

Jessie Frazelle points out the advancements in security that Docker has made over the past couple of years:

Container runtimes have security layers defined by Seccomp, AppArmor, kernel namespaces, cgroups, capabilities, and an unprivileged Linux user. All the layers don’t perfectly overlap, but a few do.

Let’s go over some of the ones that do overlap. I could do them all, but I would be here all day. The mount syscall is prevented by the default AppArmor profile, default Seccomp profile, and CAP_SYS_ADMIN. This is a neat example as it is literally three layers. Wow.

Everyone’s favorite thing to complain about in containers or to prove that they know something is creating a fork bomb. Well this is actually easily preventable. With the PID cgroup you can set a max number of processes per container.

Interesting reading from an insider.

The Semantics Of GraphQL

Published 2018-05-23 by Kevin Feasel

Adrian Colyer reviews a paper on the mathematical properties behind GraphQL:

The authors study the computational complexity of GraphQL looking at three central questions:

The evaluation problem: what can we say about the complexity of GraphQL query evaluation?

The enumeration problem: how efficiently can we enumerate the results of a query in practice?

The response size problem: how large can responses get, and what can we do to avoid obtaining overly large response objects?

In order to do this, they need to find some solid ground to use for reasoning. So the starting point is a formalisation of the semantics of GraphQL.

This is a review of a published academic paper rather than a how-to guide, so it’s math-heavy. I am enjoying seeing the development of normal forms for graph processing languages; it’s the beginning of a new generation of normalization purists.

nUnit Tests And Spatial Data Types

Published 2018-05-23 by Kevin Feasel

David Wilson shows how to build integration tests in nUnit when you’re using spatial data types:

I was recently working on a .NET 4.6 based project that was using EF 6 and nUnit for unit testing. While setting up some integration tests against a local SQL database I was receiving this error:

Spatial types and functions are not available for this provider because the assembly ‘Microsoft.SqlServer.Types’ version 10 or higher could not be found.

We had recently been using SQL Server spatial types for tracking geographic locations, and the tests which performed updates and inserts against these fields were failing.

Read on for the setup instructions.

Concatenating Multiple SQL Files

Published 2018-05-23 by Kevin Feasel

Steve Stedman has a quick PowerShell one-liner to concatenate multiple files:

I come across the need occasionally to deploy a set of sql files that are all checked into source control in different files with a file hierarchy like this:

Database Name
    Type of object (proc, table, view, etc.)
        Name of object

When I go to deploy the scripts I need to manually combine all the SQL files into one to move to production, QA or test for deployment. After getting annoyed at lots of copy and paste I finally discovered an easy PowerShell script to combine all the files into one.

Steve points out at the end that if a file does not end with “GO”, then combining multiple objects, like stored procedures, might result in unexpected behavior. I’ve done something similar to Steve’s script, except that as I stream each file’s content, I append a newline, “GO”, and another newline.
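
The append-a-“GO” variation described above can be sketched in Python. This is a minimal sketch and not Steve’s PowerShell script: the function name combine_sql_files, the alphabetical ordering, and the folder and output paths are all hypothetical choices made for illustration.

```python
from pathlib import Path


def combine_sql_files(root: Path, output: Path) -> int:
    """Concatenate every .sql file under root into output, ending each with GO.

    Returns the number of files combined. All names here are illustrative.
    """
    # Build the full list before opening the output file, and skip the output
    # itself in case it lives under root. Sorting only makes the order
    # repeatable; a real deployment may need dependency order instead.
    target = output.resolve()
    sql_files = sorted(p for p in root.rglob("*.sql") if p.resolve() != target)

    with output.open("w", encoding="utf-8") as out:
        for path in sql_files:
            out.write(path.read_text(encoding="utf-8"))
            # Terminate the batch whether or not the file already ends with GO.
            out.write("\nGO\n")
    return len(sql_files)
```

Calling combine_sql_files(Path("MyDatabase"), Path("deploy.sql")) would walk the whole hierarchy and write one combined script. A file that already ends with “GO” picks up a second one, which should only produce an empty batch, but that is worth checking against whatever tool runs the script.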

Deferred Name Resolution In SQL Server

Published 2018-05-23 by Kevin Feasel

Kendra Little explains the concept of deferred name resolution in SQL Server:

In this case, I’m creating a temporary stored procedure (out of laziness, it means I don’t have to clean up a quick demo):

CREATE OR ALTER PROCEDURE #test
AS
IF 1=0
    EXECUTE dbdoesnotexist.dbo.someproc;
GO

The database dbdoesnotexist does NOT exist, but I’m still allowed to create the procedure.

When I do so, I get an informational message:

The module ‘#test’ depends on the missing object ‘dbdoesnotexist.dbo.someproc’. The module will still be created; however, it cannot run successfully until the object exists.

This can be useful in some cases where you’ll be querying a table or procedure that may not exist all the time, but which will exist when a certain code block is run.

But, as Kendra points out, deferred name resolution doesn’t work everywhere, so it’s important to know the rules around when it will or will not work.

Don’t Forget Those Paused Indexes

Published 2018-05-23 by Kevin Feasel

Arun Sirpal tries to create a new index on his Azure SQL Database:

I was creating some demo non-clustered indexes in one of my Azure SQL Databases and received the following warning when I executed this code:

CREATE NONCLUSTERED INDEX [dbo.NCI_Time]
ON [dbo].[Audit] ([UserId])
INCLUDE ([DefID],[ShopID])

Msg 10637, Level 16, State 3, Line 7
Cannot perform this operation on ‘object’ with ID 1093578934 as one or more indexes are currently in resumable index rebuild state. Please refer to sys.index_resumable_operations for more details.

How intriguing!

Fortunately, the error message is clear and helpful, two terms which rarely go in conjunction with “error message.”
