# Day: November 17, 2017

Adrian Colyer has a two-parter summarizing an interesting academic paper regarding deep neural networks.  Part one introduces the theory:

Section 2.4 contains a discussion on the crucial role of noise in making the analysis useful (which sounds kind of odd on first reading!). I don’t fully understand this part, but here’s the gist:

The learning complexity is related to the number of relevant bits required from the input patterns $X$ for a good enough prediction of the output label $Y$, or the minimal $I(X; \hat{X})$ under a constraint on $I(\hat{X}; Y)$ given by the IB.

Without some noise (introduced for example by the use of sigmoid activation functions) the mutual information is simply the entropy $H(Y)$independent of the actual function we’re trying to learn, and nothing in the structure of the points $p(y|x)$ gives us any hint as to the learning complexity of the rule. With some noise, the function turns into a stochastic rule, and we can escape this problem. Anyone with a lay-person’s explanation of why this works, please do post in the comments!

The different colours in the chart represent the different hidden layers (and there are multiple points of each colour because we’re looking at 50 different runs all plotted together). On the x-axis is $I(X;T)$, so as we move to the right on the x-axis, the amount of mutual information between the hidden layer and the input $X$ increases. On the y-axis is $I(T;Y)$, so as we move up on the y-axis, the amount of mutual information between the hidden layer and the output $Y$ increases.

I’m used to thinking of progressing through the network layers from left to right, so it took a few moments for it to sink in that it’s the lowest layer which appears in the top-right of this plot (maintains the most mutual information), and the top-most layer which appears in the bottom-left (has retained almost no mutual information before any training). So the information path being followed goes from the top-right corner to the bottom-left traveling down the slope.

This is worth a careful reading.

In the previous blog post In-Memory OLTP Indexes – Part 1: Recommendations, we gave you an update on the latest features of In-Memory OLTP technology. We also summarized the key characteristics of memory-optimized indexes and shared some guidelines and recommendations on how to best choose and configure an index for your memory-optimized table. At this point, if you haven’t read through the previous blog post, we strongly recommend you do so. In this blog post we continue onwards; take the learnings from the previous blog (Part 1) and using some sample examples, apply them in practice. The learnings from this blog post (Part 2) will be particularly useful if you are experiencing query performance issues with memory-optimized tables; either after migration from disk-based tables or in general, with your production workload leveraging memory-optimized tables.

To summarize this blog post covers the following:

• Common mistakes and pitfalls to avoid when working with memory-optimized indexes.

• Best practices to follow when configuring your memory-optimized indexes for optimal performance.

• Troubleshooting and Mitigating your query performance issues with memory-optimized indexes.

• Monitoring your query performance with memory-optimized indexes.

There’s a lot of detail in this post, and tuning these types of indexes isn’t quite the same as normal, disk-based indexes.

I keep things simple and like to look at certain performance based metrics but before talking about what metrics are available let’s step through an example.

For this post I want to setup an alert for CPU percentage utilised that when it is greater than 50% over the last 5 minutes I would like to know about it. First step is to navigate to your Azure SQL Database.

Click through for a screenshot-driven guided tour.

Larger numbers of indices create exponentially more query plan possibilities. When too many choices exist, the Optimizer will give up partway through and just pick the best plan thus far. As more indices are added the problem worsens and compilation times, i.e., processor times, increase to a point. This can be illustrated best by reviewing an actual customer example. In this case, one table had 144 indices attached to it and several others had between 20 and 130 indices. The queries were quite complex with as many as fifteen joins, many of which were outer joins. Query and index tuning were impossible because query performance was often counterintuitive and sometimes nonsensical. Adding an index that addressed a specific query need often made the query run worse one time and better the next. Note: Cached query plan issues, e.g., parameter sniffing or plan reuse were not problems in this case. The only solution was to tear down ENTIRE indexing structure and rebuild it with SQL Server’s guidance and nine days’ worth of production queries. Table 5 summarizes the results of the index restructuring project. The performance of 98 percent of the queries was comparable to or better than it was when the large numbers of indices were present.

Don’t be that company.

Microsoft Connect(); is a developer event from Nov 15-17, where plenty of announcements are made.  Here is a summary of the data platform related announcements:

• Azure Databricks: In preview, this is a fast, easy, and collaborative Apache Spark based analytics platform optimized for Azure. It delivers one-click set up, streamlined workflows, and an interactive workspace all integrated with Azure SQL Data Warehouse, Azure Storage, Azure Cosmos DB, Azure Active Directory, and Power BI.  More info

• Azure Cosmos DB with Apache Cassandra API: In preview, this enables Cassandra developers to simply use the Cassandra API in Azure Cosmos DB and enjoy the benefits of Azure Cosmos DB with the familiarity of the Cassandra SDKs and tools, with no code changes to their application.  More info.  See all Cosmos DB announcements

So how is Azure Databricks put together? At a high level, the service launches and manages worker nodes in each Azure customer’s subscription, letting customers leverage existing management tools within their account.

Specifically, when a customer launches a cluster via Databricks, a “Databricks appliance” is deployed as an Azure resource in the customer’s subscription.   The customer specifies the types of VMs to use and how many, but Databricks manages all other aspects. In addition to this appliance, a managed resource group is deployed into the customer’s subscription that we populate with a VNet, a security group, and a storage account. These are concepts Azure users are familiar with. Once these services are ready, users can manage the Databricks cluster through the Azure Databricks UI or through features such as autoscaling. All metadata (such as scheduled jobs) is stored in an Azure Database with geo-replication for fault tolerance.

I’ve been a huge fan of the Databricks Community Edition.  We’ll see if there will be a Community Edition version for Azure as well.

Before we go on, now’s your chance to name your widget. In my code above, I highlighted some of the changes I made. Also notice that this widget has a path to the query file we created; if this file doesn’t exist (or you can’t read from the path it exists on), it’ll stop working. Just a heads-up.

I named it something meaningful (but hey, do you), and I changed the vertical and horizontal size of the widget to be a little easier to read. Once you’re done, it’s time to add your widget to the dashboard.

I think people are going to like this product once it matures a bit.  This kind of flexibility without having to drop into .NET is great for DBAs for whom C# is a little intimidating.

It’s also worth noting that transport security is only needed in a distributed environment. Just like if the IRS agent lived with you, you wouldn’t need to go through the USPS. But that’s just weird.

This wraps up Colleen’s Service Broker series.  If you do find yourself interested in Service Broker, this is a great way to get your feet wet.