SQL Server R Services 2016 Bug

Kevin Feasel

2018-12-28

Bugs, R

Gonzalo Bissio takes us through a bug in SQL Server R Services 2016 (which has already been fixed in 2017):

2 Weeks ago I was working in a very interesting case in SQL 2016. I received an email from one of my customers saying that they were having intermitent issues within their app that was executing some SP_Execute_External_script SP calls to the database.

[…]


We also restarted the launchpad service but with no luck….. The biggest challenge was that sometimes the service responded fine and sometimes it showed the issue that I paste above (And this for me was absolutelly new…). From SQL Side we executed an extended events session with all R services counters but nothing appeared… From sys.dm_Exec_session_Wait_stats we just observed that the session was waiting for SATELLITE_SERVICE_SETUP wait which points that SQL Was waiting an answer from the R service itself.

Click through for the solution.

Using Hive For Data Virtualization

Kevin Feasel

2018-12-28

Hadoop

Gunther Hagleitner, et al, walk us through some reasons why we might want to use Apache Hive for data virtualization:

Assume you want to execute a Hive query that accesses data from an external RDBMS behind a JDBC connection. A possible naïve way of doing this would treat the JDBC source as a “dumb” storage system, reading all the raw data over JDBC and processing it in Hive. In this case you would ignore the query capabilities of the RDBMS and pull too much data over the JDBC link, thus ending up with poor performance and an overloaded system.
For that reason, Hive implements smart push-down to other systems by relying on its storage handler interfaces and cost-based optimizer (CBO) powered by Apache Calcite. In particular, Calcite provides rules that match a subset of operators in the logical representation of the query and generates a new equivalent representation with more operations executed in the external system. Hive includes those rules that push computation to the external systems in its query planner, and then relies on Calcite to generate a valid query in the language that those systems support. The storage handler implementations are responsible to send the generated query to the external system, retrieve its results, and transform the incoming data into Hive internal representation so it can be processed further if needed.

A lot of platforms are moving toward data virtualization (e.g., SQL Server with its Big Data Clusters). That appears to be the next product battleground.

$null In Powershell

Kevin Marquette goes into great detail on Powershell’s $nullconcept:

When a $null value is used in a numeric equation then your results will be invalid if they don’t give an error. Sometimes the $null will evaluate to 0 and other times it will make the whole result $null. Here is an example with multiplication that gives 0 or $nulldepending on the order of the values.

Nulls are tricky to handle in any language, making their nuances important to understand.

Becoming An Expert

Adrian Colyer wraps up The Morning Paper for the year by reviewing a big picture paper on developer expertise:

You’ll know an expert programmer by the quality of the code that they write. Experts have good communication skills, both sharing their own knowledge and soliciting input from others. They are self-aware, understanding the kinds of mistakes they can make, and reflective. They are also fast (but not at the expense of quality).
Experience should be measured not just on its quantity (i.e., number of years in the role), but on its quality. For example, working on a variety of different code bases, shipping significant amounts of code to production, and working on shared code bases. The knowledge of an expert is T-shaped with depth in the programming language and domain at hand, and a broad knowledge of algorithms, data structures, and programming paradigms.

Click through for the full review.

The Ultimate Powershell Telemetry Prompt

Jeffery Hicks might have taken things a bit too far:

Well, I knew I wouldn’t be satisfied. The other day I shared a PowerShell prompt function that could display telemetry like information for a few remote servers. One of the drawbacks was the limited amount of information I could display. I’ve revised that function and have a new version that displays additional information via a few performance counters. I’ve also reorganized the function to make it a bit more efficient. Want to see it?

My jokey lede aside, this is really cool. Click through for details and to get a link to the code.

Displaying Human-Readable Month Sets With DAX

Alberto Ferrari wants to show sets of contiguous months using DAX:

Today I woke up with an interesting question, about how to show a selection of months in a nice way, detecting contiguous selection. You can easily understand the desired solution from the following figure:

I enjoyed writing a quick solution, which is worth sharing. The code is somewhat verbose, but this is mainly for educational purposes (meaning I did not want to spend time optimizing it). I will likely write a full article on it, for now, just enjoy some DAX code:

I removed the image, but to get the gist (and get you to click through to see it in its beauty), it reads “January, March-April, August-December”

Click through for Alberto’s quick-and-dirty solution and then Chris Webb’s improvement.

The Bitmap Operator

Hugo Kornelis describes a new operator:

The Bitmap operator is used to build a bitmap that, based on a hash, represents which values may be present in a data flow. Due to the chance of hash collisions in the hash function used, the Bitmap process can produce false positives but not false negatives – so a match based on a bitmap is not guaranteed to be a match to the actual data, but a non-match based on a bitmap is guaranteed to not be a match in the actual data.
The generated bitmap is typically used in other operators to remove rows for which there is no match in the bitmap, and hence guaranteed no match in the original set of data processed by the Bitmap operator. The use of Bitmap operators is most common in execution plans for star join queries in large data warehouses. An example can be seen here.

Click through for details on how it works and plenty of good information on it.

Tips When Writing Extended Events To Files

Jason Brimhall has some tips to help you use the file target in Extended Events:

This first little tip comes from a painful experience. It is common sense to only try and create files in a directory that exists, but sometimes that directory has to be different on different systems. Then comes a little copy and paste of the last code used that worked. You think you are golden but forgot that one little tweak for the directory to be used. Oops.

Read on to see how SQL Server exposes that error, and then Jason shows us a different how-not-to with file targets.

Categories

December 2018
MTWTFSS
« Nov Jan »
 12
3456789
10111213141516
17181920212223
24252627282930
31