Press "Enter" to skip to content

Month: August 2017

Parameter Sniffing Explained

Bert Wagner looks at how parameter sniffing works, why it’s (mostly) a good thing, and how to gently explain to the server when it’s a bad thing:

All subsequent executions of that same query will go to the query cache to reuse that same initial query plan; this saves SQL Server from having to generate a new query plan every time.

Note: A query with different values passed as parameters still counts as the “same query” in the eyes of SQL Server.

In the case of the examples above, the first time the query was executed was with the parameter for “Costa Rica”. Remember when I said this dataset was heavily skewed? Let’s look at some counts:

Check it out for a clear depiction of the topic.  One solution Bert doesn't mention, but which I sometimes use, is to create local variables in a procedure and set their values equal to the input parameters.  That way, the optimizer makes no assumption about the value of the local variable.  There are several ways to get around parameter sniffing when it's an issue.
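For reference, a minimal sketch of that local-variable technique might look like this; the procedure, table, and column names here are hypothetical:

```sql
-- Hypothetical procedure illustrating the local-variable workaround.
CREATE PROCEDURE dbo.GetOrdersByCountry
    @Country NVARCHAR(60)
AS
BEGIN
    -- Copy the parameter into a local variable.  The optimizer cannot
    -- sniff a variable's value at compile time, so it builds the plan
    -- from average density statistics rather than from the first
    -- caller's parameter value.
    DECLARE @LocalCountry NVARCHAR(60) = @Country;

    SELECT OrderID, OrderDate, TotalDue
    FROM dbo.Orders
    WHERE Country = @LocalCountry;
END;
```

The trade-off is that the plan is now built for the "average" value every time, which can be worse than sniffing when the skewed value is the common case.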

Creating A WSFC Witness On Azure

Jim Donahoe shows how to create a cloud witness for a Windows cluster using Azure:

Well, that is easy to fix, right?  Let’s just spin up a VM in Azure, and host the FSW on that machine.  Problem solved!  Technically yes, that is a viable option.  But, let’s consider the cost of that scenario in the breakdown below:

  1. VM with OS licensed and Disk space allocated for FSW
  2. NSG/Firewall to protect said resource from outside access
  3. VNET

Also, you have to figure in the man-hours to configure all of those things (let's say 4 hours total; insert your hourly rate here: rate x 4 = setup fee for the VM in Azure).

Now, here is where Cloud Witness saves the day!  The Cloud Witness WSFC Quorum type will utilize BLOB Storage in Azure to act as the point of arbitration.  Not sure what that means?

There’s a good walkthrough, and it does look quite easy to do; a simple blob is going to be a lot cheaper than a VM.

Event Sourcing And Kafka

Ben Stopford gives an architectural overview of how Apache Kafka can act as an event sourcing system:

At a high level, event sourcing is really just the observation that in an event driven architecture, the events are facts. So if you keep them around, you can use them as a datasource.

One subtlety comes from the way the events are modelled. They can be values: whole facts (an Order, in its entirety) or they can be a set of ‘deltas’ that must be re-combined (a whole Order message, followed by messages denoting just the state changes: “amount updated to $5”, “order cancelled” etc).

As an analogy, imagine you are building a version control system. When a user commits a file for the first time, you save it. Subsequent commits might only save the ‘delta’: just the lines that were added, changed or removed. Then, when the user performs a checkout, you open the version-0 file and apply all the deltas to derive the current state.

The alternate approach is to simply store the whole file, exactly as it was at the time it was changed. This makes pulling out a version quick and easy, but to compare different versions you would have to perform a ‘diff’.

The series Ben has been working through is very helpful in wrapping your mind around what Kafka can do, and this post was no exception.
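Kafka aside, the value-versus-delta distinction is easy to see with a toy event log. Here is a hypothetical sketch in T-SQL (not from Ben's post): each row is a delta, and the order's current state is derived by replaying the rows in sequence.

```sql
-- Hypothetical event log: each row is a delta applied to an order.
CREATE TABLE #OrderEvents
(
    OrderID   INT,
    Seq       INT,            -- replay order
    EventType VARCHAR(20),    -- 'Created', 'AmountUpdated', 'Cancelled'
    Amount    DECIMAL(10, 2)  -- NULL for events that carry no amount
);

INSERT INTO #OrderEvents
VALUES (1, 1, 'Created',       10.00),
       (1, 2, 'AmountUpdated',  5.00),
       (1, 3, 'Cancelled',      NULL);

-- Replay the deltas: the current amount comes from the latest
-- amount-bearing event, and the order is cancelled if any
-- cancellation delta exists anywhere in its history.
SELECT TOP (1)
       OrderID,
       Amount AS CurrentAmount,
       CASE WHEN EXISTS (SELECT 1 FROM #OrderEvents
                         WHERE OrderID = 1
                           AND EventType = 'Cancelled')
            THEN 'Cancelled' ELSE 'Active' END AS OrderStatus
FROM #OrderEvents
WHERE OrderID = 1
  AND Amount IS NOT NULL
ORDER BY Seq DESC;

DROP TABLE #OrderEvents;
```

Storing whole facts instead would mean one row per version of the entire order, which makes reading the latest state a single lookup at the cost of larger messages.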

Parallel Processing In R

Chaitanya Sagar shows a few methods for parallelizing code in R:

Parallel programming may seem a complex process at first, but the amount of time saved after executing tasks in parallel makes it worth the try. Functions such as lapply() and sapply() are great alternatives to time-consuming looping functions, while the parallel, foreach, and doParallel packages are great starting points for running tasks in parallel. These parallel processes are based on functions and are also modular. However, with great power comes a risk of code crashes, so it is necessary to be careful and to be aware of ways to control memory usage and error handling. It is not necessary to parallelize every piece of code that you write; you can always write sequential code and decide to parallelize the parts which take significant amounts of time. This will help in further reducing out-of-memory instances and in writing robust, fast code. The use of parallel programming methods is growing, and many packages now have parallel implementations available. With this article, one can dive deep into the world of parallel programming and make full use of vast memory and processing power to generate output quickly.

If you’re using Microsoft R Server, there are additional parallelism options.  H/T R-Bloggers.

What’s New In Ambari 2.5

Paul Codding tells us what’s coming in the next version of Ambari:

Ambari Log Search (Tech Preview) has been one of our most popular features, and in this release it has seen UI and backend refreshes based on customer feedback.  Log Search is planned for GA with the next major release, Ambari 3.0, in which the UI will be simplified and the backend will gain more robust log retention and scaling capabilities.

There are some interesting changes, so read the whole thing.

Breaking A Database Into Smaller Files

Jana Sattainathan shows how to break a SQL Server database into smaller files, as well as giving some reasons why you might want to do that:

You are probably reading this post because you have experienced the pain yourself, and I don’t want to waste any more of your time, so let’s get right to the steps involved in breaking up a huge database or datafile:

  1. Check the space situation on your host
  2. Get the space usage by file for the big database/datafile in question
  3. Decide on the number of files to add and their locations
  4. Add multiple secondary datafiles
  5. Distribute data from the big datafile into the new datafiles using the EMPTYFILE option
  6. Shrink the big datafile and set a maximum size
  7. Change the default filegroup

Read on for Jana’s step-by-step approach.
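For steps 4 through 6, a rough T-SQL sketch might look like the following; the database, logical file, and path names are hypothetical, not Jana's exact script:

```sql
USE master;
GO

-- Step 4: add a secondary datafile (repeat for each new file).
ALTER DATABASE BigDB
ADD FILE
(
    NAME       = N'BigDB_Data2',
    FILENAME   = N'E:\Data\BigDB_Data2.ndf',
    SIZE       = 50GB,
    FILEGROWTH = 1GB
);
GO

-- Step 5: push pages out of the big file into the other files in the
-- same filegroup.  Note that the first file of the primary filegroup
-- can never be emptied completely (it holds system pages), but the
-- bulk of the data will still move.
USE BigDB;
DBCC SHRINKFILE (N'BigDB_Data', EMPTYFILE);
GO

-- Step 6: shrink the now mostly-empty big file and cap its size.
DBCC SHRINKFILE (N'BigDB_Data', 51200);  -- target size in MB
ALTER DATABASE BigDB
MODIFY FILE (NAME = N'BigDB_Data', MAXSIZE = 100GB);
GO
```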

Partial Text Slicer In Power BI

Mitchell Pearson shows how to slice results based on text input:

In this installment of the Problem, Design, Solution series, we are going to show you how to perform a text search using slicers in Power BI, simulating a “LIKE”-style search. In the following screenshot, you can see that when “Tax” is selected, all records in the table that have “Tax” anywhere in the record are returned; likewise, whenever “IT” is selected from the slicer, all records in the table that have IT in them are returned. Hope you enjoy this post!

Click through for the explanation, followed by a video that walks you through the process.

Working With UTC And Local Times

Jo Douglass shows how to use the DATETIMEOFFSET data type and AT TIME ZONE syntax to convert between UTC and local times:

Run select SysDateTimeOffset(); and you should see a date and time which mirrors your server’s current time, plus a time zone offset showing its current offset from UTC; this includes any time zone offset, plus any daylight savings time offset.

If I were to run this (from the UK) on August 15th, 2017 while my clock is showing that it’s noon exactly, I would get 2017-08-15 12:00:00.0000000 +01:00; the +01:00 offset is because the UK is offset by one hour from UTC during daylight savings. The datetime2 portion of a datetimeoffset is in local time, not UTC.

My normal practice is to store everything in UTC and let the application convert to local times.  That makes comparing dates much easier and reduces confusion around daylight saving time.
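As a quick illustration of that pattern, here is a minimal sketch, assuming SQL Server 2016 or later (AT TIME ZONE is not available earlier) and using Windows time zone names:

```sql
-- Capture the moment in UTC as a plain datetime2 (no offset attached).
DECLARE @StoredUtc DATETIME2(7) = SYSUTCDATETIME();

-- The first AT TIME ZONE stamps the naive value as UTC, producing a
-- datetimeoffset; the second converts it to UK local time, with
-- daylight savings handled automatically.
SELECT
    @StoredUtc AS StoredUtc,
    @StoredUtc AT TIME ZONE 'UTC'
               AT TIME ZONE 'GMT Standard Time' AS UkLocalTime;
```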

Forecasting Versus Predicting

Rob Collie explains that there are two different concepts which use similar names:

Once you’ve digested the illustration at the top of this article, yeah, you’ve kind of already got it.

  • Forecasting is when we anticipate the behavior of “Lots” of people (customers, typically) on “Long” timelines.
  • Predictive Analytics anticipates the behavior of One person (again, typically a customer) on a “Short” timeline.

So…  Macro versus Micro.

But let’s delve just a little bit deeper, in order to “cement” the concepts.

There’s a very useful distinction here and Rob does well to flesh out the details.  I highly recommend this if you’re curious about micro- versus macro-level predictions.

SSMS Performance Dashboard

Pedro Lopes announces that the SQL Server Performance Dashboard is now built into SQL Server Management Studio:

Back in 2007, we released the Microsoft SQL Server 2005 Performance Dashboard Reports, which were designed to provide fast insight into performance issues from some newly created system views – DMVs (Dynamic Management Views). These were updated for SQL Server 2008 and later for SQL Server 2012, and while they were very helpful, they had a significant drawback: they required a separate download and install. This meant that, when needed, they most probably were not installed on a specific SQL Server and therefore were unusable when they were needed the most.

With the new SSMS 17.2, we are releasing the Performance Dashboard embedded as a built-in Standard Report. This means that it is available for any SQL Server instance starting with SQL Server 2008, without any extra downloads or running any extra scripts. Just connect to your server in SSMS and open the Performance Dashboard.

Aside from building the dashboard into Management Studio, they’ve also added a few helpful things to the product, so it is worth checking out.
