Press "Enter" to skip to content

Curated SQL Posts

MR3: Hive on Kubernetes

Alex Woodie reports on a DataMonad production:

MR3 is a software product developed by a team led by Sungwoo Park. The software, which is not open source, is sold by a Delaware-based software company called DataMonad. After prototyping a Java-based execution engine called MR2 in the 2013 timeframe, development of Scala-based MR3 began in 2015. The first release of MR3 was delivered in early 2018, and version 1.0 was released yesterday.

According to DataMonad, MR3 is an execution engine for big data processing, and Hive is the first and main application that’s been configured to run on it (Tez is also supported). The company says MR3 offers comparable performance to the latest release of Hive, dubbed LLAP, but without the technical complexity.

The closed-sourcedness is a bit of a downer, but I like having more competition in the space.

Comments closed

PolyBase and Excel

I have a post on setting up PolyBase to work with Microsoft Excel:

If you tried to use Microsoft’s Excel driver prior to 2019 CU2, you’d get the following error:

Msg 105082, Level 16, State 1, Line LineNumber
105082;Generic ODBC error: [Microsoft][ODBC Excel Driver]Optional feature not implemented

To this point, I recommended in PolyBase Revealed that you use a different driver, like CData’s, which did work. CData’s driver still works (I assume…PolyBase ODBC support is a fluid situation, it seems), but now I can officially say that PolyBase supports the Microsoft Access Database Engine Redistributable driver for Microsoft Excel. Let’s go to the tape.

Click through for the instructions.

Comments closed

Creating a Power BI Report Book

Teo Lachev shows us how to build a Power BI report book:

Scenario: Management has requested an easy way to view a subset of strategic reports located in different Power BI workspaces. You can ask the users to mark reports and dashboards as favorites so they can access pertinent content in the Favorites menu, but you’re looking for an easier configuration, such as to create a book of reports with a built-in navigation that organizes reports in groups (like a table of contents), such as the screenshot below demonstrates.

Click through for the instructions as well as a discussion on why you wouldn’t necessarily want to build a Power BI app for this.

Comments closed

Executing Azure Data Factory Pipelines with Azure Functions

Paul Andrew wants to execute an Azure Data Factory pipeline via an Azure Function call:

For the function itself, hopefully this is fairly intuitive once you’ve created your DataFactoryManagementClient and authenticated.

The only thing to be careful of is not using the CreateOrUpdateWithHttpMessagesAsync method by mistake. Make sure its Create Run. Sounds really obvious, but when you get code drunk names blur together and the very different method overloads will have you confused for hours!…. According to a friend 🙂

Read the whole thing.

Comments closed

Sorting When Columns are Ordered the Same Way

Erik Darling thinks outside the column:

Sorts aren’t just for the column(s) in your order by — if you SELECT *, you need to all the columns in the * by all the columns in the order by.

I know I basically repeated myself. That’s for emphasis. It’s something professional writers do.

Dig it.

The principle Erik is talking about is having a monotonic relationship (that is, as one column increases, the other always increases; and as one column decreases, the other always decreases). If you can guarantee that, and if one of those happens to be indexed already, you can get a nice performance boost.

But wait, I was told there would be no math.

Comments closed

Creating a Delimited List in .NET

Bill Fellows wants you to use a built-in library method:

There are various ways to concatenate values together. A common approach I see is that people will add a delimiter and then the value and loop until they finish. Then they take away the first delimiter. Generally, that’s easier coding, prepending a delimiter, than to append the delimiter and not do it for the final element of a list. Or add it and then remove the final delimiter from the resulting string.

Gentle reader, there is a better way. And has been for quite some time but if you weren’t looking for it, you might not know it exists. 

You do get bonus points if you knew this existed.

Comments closed

Digging Into Bar Charts

Alex Velez takes us through the humble bar chart:

Our eyes start at the base and scan towards the end of each bar. We measure the lengths relative to both the baseline and the other bars, so it’s a straightforward process to identify the smallest or the largest bar. We can also see the negative space between varying heights of bars to compare the incremental difference between them. 

Not only are these graphs easy to read, but they are also widely recognized. Chances are, you’ve already encountered a standard horizontal or vertical bar chart. But bars come in many shapes and sizes. I’ll list below a few of the most common variations, with links to examples.

Click through for some good information on bar charts, including design tips.

Comments closed

Reading Azure DevOps Results in Powershell

Mark Broadbent doesn’t let the lack of an official Powershell module get in the way:

In my post Using Azure CLI to query Azure DevOps I explained how you can use the Azure CLI to query Azure DevOps so you can obtain useful information on builds, releases, and other useful information. The solution required a certain level of skill with JMESPath to manipulate your result sets -which as explained can be a little confusing.

However once you have a bare bones result set, it is likely that you will want to consume these results in a more user-friendly environment such as PowerShell so that you can build upon these data sets. I thought this would be an easy thing to do, but as you will see below it was anything but.

Read on for some thoughts and a sample script.

Comments closed