Press "Enter" to skip to content

Day: October 4, 2019

Fun with [ as a Function in R

John Mount definitely quotes Dr. Ian Malcolm correctly in this one:

How about defining a new [-based function call notation? The ideas is: we could write sin[5] in place of sin(5), thus unifying the notations for function call and array access. Some languages do in fact have unified function call and array access (though often using “(” for both). Examples languages include Fortran and Matlab.

Let’s add R to the list of such languages.

I love the flexibility in the language, almost as much as I would enjoy taking away production rights from the person who ships this in my code base…

Comments closed

An Introduction to Apache Livy

Guy Shilo explains why we should care about Apache Livy:

Apache Livy is an open source server that exposes Spark as a service. Its backend connects to a Spark cluster while the frontend enables REST API. This enables running it as the organization’s Spark gateway and even run in in docker containers.

Not only it enables running Spark jobs from anywhere, but it also enables shared Spark context and a shared RDD cache among all it’s users which is time and memory saving.

I will demonstrate here how to setup Apache Livy on one of the cluster’s nodes and on a separate server.

Click through for the demonstration.

Comments closed

SQL Server on Azure: Performance Optimized Storage Config

Mine Tokus announces a new feature when using Azure to host IaaS SQL Server instances:

Today, we are excited to announce Performance Optimized Storage Configuration capabilities for the VM’s registered with SQL VM RP. This feature automates storage configuration according to performance best practices for SQL Server on Azure virtual machines through Azure Portal or Azure Quick start Templates when creating a SQL VM. Automated performance best practices include separating Data and Log filescache configuration for premium disks hosting data and log filessupport for Temp DB on local disksupport for Ultra disks to host data, log or Temp DB files and database engine only images. In this article, we will discuss each automated performance best practice in detail.

Read on for the description and check out those links for additional information.

Comments closed

Multi-Parent Hierarchies in Power BI

Imke Feldmann has another way to solve the multi-parent problem in Power BI:

If you have parent-child-hierarchy with multiple parents, my function will a table like below, where the children with multiple parents still reside in different rows:

Due to this, the table cannot directly be connected with the FactTable, as NodeKey is not unique. Solution is to create DimNode-table that contains only unique values from the NodeKeys. Use it as a bridge between the 2 tables and implement a bidirectional filter to the Nodes-table:

Read on for the complete answer and to grab a copy of the PBIX file.

Comments closed

A Stored Procedure to Check for Agent Job Completion

Brian Hansen has a stored procedure which can help you synchronize those asynchronous SQL Agent job calls:

This is a stored procedure that I have found useful in a number of circumstances. It came into being because there are times that I need to run a job via T-SQL and wait for it to complete before continuing the script. The problem is that sp_start_job does just what it says: it starts the job and then immediately continues. There is no option to wait for it to finish.

I needed some way to pause script execution pending the completion (hopefully successfully) of the job.

One important note: this procedure using the undocumented xp_sqlagent_enum_jobs system procedure. While this has been around for ages, it is unsupported. I get why may bother some, but this procedure is the only way that I know of to reliably determine the current run status of a job.

Read on to learn more about the procedure and grab a copy.

Comments closed

Power BI Column Concatenation

Alexander Arvidsson shows how we can concatenate columns in Power BI using DAX:

Finally we can tackle the last hurdle – the column that shows both the current number of certifications and the requested goal. Had this been Excel it would have been dead easy – we just create a new cell that concatenates two other cells like this:

Then we copy the formula to all the rows. Easy. But this is not Excel. The “goal” part is simple – that’s just another column. The trick is to count all the other rows in the table with the same key. Let’s add the key column to the table so we see what we’re working with. “CompKey” is the concatenated key we created in the previous blog post. “Number of certs” is a count of the rows in the table, and because of row context it gets evaluated per key.

Read on for the solution.

Comments closed

Azure Data Studio October 2019 Release

Alan Yu announces the October 2019 release of Azure Data Studio:

While fixing a bug involving copying rows and columns from the results grid, we ended up creating an innovative copy/paste experience with the results grid.

Typically, when you select random cells individually, pasting this into a grid like in Excel will prevent the pasting. However, we’ve implemented a logical pattern that would support this type of pasting. This works by ignoring columns and rows that are not selected, and nulling columns and rows that were not included in the copy parameters.

I really like that copy-paste functionality. Time to go update…

Comments closed