SQL Server On VMware Guide

David Klee announces an update to VMware’s SQL Server best practices guide:

I am proud to announce that we contributed to the latest revision of the Microsoft SQL Server on VMware best practices guide, freely available at this address. This document outlines some of the common VM-level tweaks and adjustments that are made when running enterprise SQL Server VMs on VMware platforms. This guide is considered a must-read if you manage these sorts of SQL Servers, which cannot be treated as general purpose virtual machines.

This guide was recently updated for vSphere 6.5, and we consider it an absolute must for your enterprise management library!

If you manage SQL Server instances on VMware, it’s definitely worth the read.

Entity Framework Slow, News At 11

Jovan Popovic shows that Entity Framework is slow and Dapper is fast:

To setup test, you can go to StackExchange/Dapper GitHub an download source code. Tests are created as C# solution (Dapper.sln). When you open this solution you can find Dapper.Tests project. You might need to change two things:

  1. Connection strings are hardcoded in Tests.cs file with values like “Server=(local)\SQL2014;Database=tempdb;User ID=sa;Password=Password12!”. You might need to change this and put your connection info.
  2. Project is compiled using dotnet sdk 1.0.0-preview2-003121, so you might get compilation errors if you don’t have a matching framework. I have removed line: “sdk”: { “version”: “1.0.0-preview2-003121” } from global.json to fix this.

Now you will be able to build project and run tests.

Nothing’s going to be faster than hand-crafted, well-tuned statements from people who know what they’re doing.  Micro-ORMs like Dapper and FSharp.Data.SqlClient will trade a little bit of a speed hit for developer niceties.  Heavier frameworks like Entity Framework and NHibernate add a lot more, but tend to be significantly slower.

Supersized Tables

Deborah Melkin tells a story of a design battle she lost:

The programmers came to me and said we need to add a large number of columns to this table for one piece of functionality. It would more than double the total number of columns on the table. Oh, and all of the new columns would be NULL since we would only need to populate them if they were using that functionality and even then, not all of them would require data. The final result would be that 65-75% of the table would end up having nullable fields with the majority of those having NULL for the value.

I said what I think any sane DBA would say to this request: No.

Click through for the rest of the tale.

Serverless Azure

Christos Matskas has an article on Azure Functions, Service Fabric, and Batch:

This service is the hidden gem of HPC (high performance computing) within the Azure Compute service family. As the name implies, Azure Batch is designed to run large-scale and high-performance computing applications efficiently in the cloud. When you’re faced with large workloads, all you have to do is to use Azure Batch to define compute resources to execute your applications in parallel and at the desired scale. A good use-case for Azure Batch would be to perform financial risk modelling, climate data analysis or stress testing. What makes Batch so useful is the fact that you don’t need to manually manage the node cluster, virtual networks or scheduling because all this is handled by the service. You need to define a job, any associated data and the number of nodes you want to utilise. It makes no difference if you need to run on one, a hundred or even thousands of nodes. The service is designed to scale according to the workload needs.

The cheapest server may very well be no server, and we’re at the point where relatively simple services could just run as Azure Functions or AWS Lambda functions.

That 53rd Week

Jens Vestergaard notes that you can sometimes have a 53rd week in the year:

There are a lot of great examples out there on how to build your own custom Time Intelligence into Analysis Services (MD). Just have a look at this, this, this, this and this. All good sources for solid Time Intelligence in SSAS.
One thing they have in common though, is that they all make the assumption that there is and will always be 52 weeks in a year. The data set I am currently working with is built on ISO 8601 standard. In short, this means that there is an (re-) occurrence of a 53rd full week as opposed to only 52 in the Gregorian version which is defined by: 1 Gregorian calendar year = 52 weeks + 1 day (2 days in a leap year).

The 53rd occurs approximately every five to six years, though this is not always the case. The last couple of times  we saw 53 weeks in a year was in 1995, 2000, 2006, 2012 and 2015. Next time will be in 2020. This gives you enough time to either forget about the hacks and hard-coded fixes in place to mitigate the issue OR bring your code in a good state, ready for the next time.

Dates and currency are hard problems.

Finding Candidate Keys

Daniel Hutmacher explains ways to find candidate keys:

Let’s assume we have a temp table heap called #table, with 9 columns and no indexes at all. Some columns are integers, one is a datetime and few are numeric. As I’m writing this post, my test setup has about 14.4 million rows.

In the real world, when you’re investigating a table for primary key candidates, there are a few things you’ll be looking for that are beyond the scope of this post. For instance, it’s a fair assumption that a numeric or float column is not going to be part of a primary key, varchar columns are less probable candidates than integer columns, and so on. Other factors you would take into consideration are naming conventions; column names ending with “ID” and/or columns that you can tell are foreign keys would also probably be good candidates.

It’s useful to think of all the candidate keys, as getting to Boyce-Codd Normal Form or 4th/5th NF involves dealing with all potential primary keys, not just the one you selected.  Daniel’s post gives you several different methods of searching existing data; combine that with domain knowledge and a bit of logic and you have a pretty decent start at finding candidate keys.

Leading Wildcard Seek Triggers

Aaron Bertrand demonstrates the triggers you could use if you wanted to build leading wildcard seek tables:

In my last post, “One way to get an index seek for a leading wildcard,” I mentioned that you would need triggers to deal with maintaining the fragments I recommended. A couple of people have contacted me to ask if I could demonstrate those triggers.

To simplify from the previous post, let’s assume we have the following tables – a set of Companies, and then a CompanyNameFragments table that allows for pseudo-wildcard searching against any substring of the company name

Read on for the triggers.

IoT Versus Event Hub

James Serra clarifies the differences between Azure’s IoT Hub and its Event Hub:

The majority of the time, if the data is coming directly from the devices, either directly or via a field-based gateway, IoT Hub will be the more appropriate choice.  Event Hub will generally be the more appropriate choice if either the data will not be coming to Azure directly from the devices, but rather either cloud-to-cloud through another provider, intra-cloud, or if the data is already landing on-premise and needs to be streamed to the cloud from a small number of endpoints internally.  There are exceptions to both conditions, of course.

Both solutions offer very high throughput data ingestion and can handle tremendous streaming data volumes.  In fact, today, IoT Hub is primarily a set of additional services that wrap an underlying Event Hub.

Read on for more scenarios and limitations in each.  They definitely serve different use cases.

Thinking About Availability Group Outages

Brent Ozar reminds us to think about graceful degradation of applications:

There’s a gray bar across the top that says, “This site is currently in read-only mode; we’ll return with full functionality soon.”

That’s not a hidden feature of Always On Availability Groups. Rather, it’s a hidden feature of really dedicated developers whose application:

This is where a bit of foresight and hard work can really pay off.  Read the whole thing.

Thinking Post-DRAM

Joe Chang argues that we may benefit more from a hardware architecture which uses lower-latency, lower-capacity RAM:

There are different types of SRAM. High-performance SRAM has 6 transistors, 6T. Intel may use 8T Intel Labs at ISSCC 2012 or even 10T for low power? (see real world tech NTV). It would seem that SRAM should be about six times less dense than DRAM, depending on the number of transistors in SRAM, and the size of the capacitor in DRAM.

There is a Micron slide in Micro 48 Keynote III that says SRAM does not scale on manufacturing process as well as DRAM. Instead of 6:1, or 0.67Gbit SRAM at the same die size as 4Gbit DRAM, it might be 40:1, implying 100Mbit in equal area? Another source says 100:1 might be appropriate.

Eye-balling the Intel Broadwell 10-core (LCC) die, the L3 cache is 50mm2, listed as 25MB. It includes tags and ECC on both data and tags? There could be 240Mb or more in the 25MB L3? Then 1G could fit in a 250mm2 die, plus area for the signals going off-die.

There is a lot of depth in this blog post.


March 2017
« Feb