John Mount has some good advice for R users running Spark queries:
For some time we have been teaching R users “when working with wide tables on Spark or on databases: narrow to the columns you really want to work with early in your analysis.” The idea behind the advice is: working with fewer columns makes for quicker queries.
The issue arises because wide tables (200 to 1000 columns) are quite common in big-data analytics projects. Often these are “denormalized marts” that are used to drive many different projects. For any one project only a small subset of the columns may be relevant in a calculation.
Some wonder whether this is really an issue, or whether it is something one can ignore in the hope that the downstream query optimizer fixes the problem. In this note we will show the effect is real.
This is good advice for more than just dealing with R on Spark.
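As a minimal sketch of the technique in sparklyr (the table name wide_mart and its column names are hypothetical), the idea is to select the columns you need before any joins or aggregations, so the rest of the pipeline stays narrow:

    library(sparklyr)
    library(dplyr)

    sc <- spark_connect(master = "local")

    # "wide_mart" stands in for a wide denormalized table (hundreds of columns)
    wide_tbl <- tbl(sc, "wide_mart")

    # Narrow to the columns the analysis actually needs, as early as possible
    narrow_tbl <- wide_tbl %>%
      select(customer_id, order_date, revenue)

    # Downstream steps now move far less data per row
    result <- narrow_tbl %>%
      group_by(customer_id) %>%
      summarise(total_revenue = sum(revenue, na.rm = TRUE)) %>%
      collect()

Because dplyr verbs on a Spark table are lazy, the select() is folded into the Spark SQL that eventually runs, so the unused columns never have to be carried through later stages.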