Press "Enter" to skip to content

Author: Kevin Feasel

Things to Stop Doing

Tom LaRock has a list of life-altering recommendations:

RESPONDING IMMEDIATELY

Stop doing that. Trust me on this, your response to that email/text/slack can wait. Don’t believe me? Try an experiment. For one month do not reply immediately to your emails. At the end of the month add up the number of emails you received, and the number of emails that required an immediate response. I’m willing to bet that the number is quite low, much lower than you realize. And once you realize just how few require an immediate reply, you’ll never look at email in the same way again.

It’s a fun list, and I definitely agree in most circumstances on at least 3/4 of them.

Comments closed

Azure Data Factory Notifications

Rayis Imayev walks us through three different techniques for sending notifications in Azure Data Factory:

While working on data integration projects and using Azure Data Factory as your main orchestration tool will help you to develop strategic forward thinking about your development tasks: to ponder on what your data sources are, point of destinations to land this information into a new data model and transformation steps to shape data from the source to its destination. Just like when you play chess and have to plan ahead several of your next moves.

Along with this structural thinking to develop and execute your data flows, timely notifications of when something goes left or right would give you additional peace of mind.

Something I appreciate in this post is that Rayis contrasts the Azure Data Factory techniques with SSIS methods, giving you a solid base for comparison.

Comments closed

Power BI Year in Review

Kasper de Jonge walks us through the biggest changes to Power BI in 2019:

As the end of the year closes I was reminiscing on what a huge year it has been for Power BI. I work mostly with large organisations so my view will be slightly skewed towards that.

For me 2019 has been the year where Power BI got massive adoption as the standard BI platform in an organisation, it went from self serve to also contain corporate BI. 

The Power BI development model does have its downsides, but without a doubt, they get stuff done.

Comments closed

Short Substrings and Computed Columns

Erik Darling gives us a story about computed columns that turns out not to be about computed columns at all (having thereby subverted our expectations):

The problem is that when I tried to index it:

CREATE INDEX dummy
    ON dbo.Users(DisplayNameComputed);

I got this error:

Msg 537, Level 16, State 3, Line 21
Invalid length parameter passed to the LEFT or SUBSTRING function.

And when I tried to select data from the table, the same error.

Click through to find the real query killer.

Comments closed

Diagnosing PolyBase Errors

Niels Berglund takes us through an odd “incorrect syntax” error with PolyBase:

What we see in Figure 1incorrect syntax exception, is strange, as I have executed the same code in a SQL Server 2019 Big Data Cluster, (BDC), without any issues, and the forum poster executed the same in SQL Server 2019 Enterprise Edition also without any issues.

Ok, but what about creating an external table against a relational data source – where we do not need to define an external file format?

There is a straightforward answer as to why the specific error message pops up, but I agree with Niels that it’d be nice to have “here’s the problem and here’s the solution” types of error messages. The deeper you get into the product—especially the older Hadoop external data source—the worse the error messages get.

Comments closed

Mapping in R with mapply

Andrew Treadway shows how to use the Map() and mapply() functions in R:

An older post on this blog talked about several alternative base apply functions. This post will talk about how to apply a function across multiple vectors or lists with Map and mapply in R. These functions are generalizations of sapply and lapply, which allow you to more easily loop over multiple vectors or lists simultaneously.

The idea of Map in functional programming takes a bit of time to really wrap your head around, but once you do, it becomes extremely powerful. H/T R-bloggers

Comments closed

Finding the Largest Profit or Loss with R

David Robinson takes us through a brainteaser:

I recently came across an interview problem from A Cool SQL Problem: Avoiding For-Loops . Avoiding loops is a topic I always enjoy reading about, and the blog post didn’t disappoint. I’ll quote that post’s description of the problem:

You have a table of trading days (with no gaps) and close prices for a stock.

Find the highest and lowest profits (or losses) you could have made if you bought the stock at one close price and sold it at another close price, i.e. a total of exactly two transactions.

You cannot sell a stock before it has been purchased. Your solution can allow buying and selling on the same trading_date (i.e. profit or loss of $0 is always, by definition, an available option); however, for some bonus points, you may write a more general solution for this problem that requires you to hold the stock for at least N days.

The SQL solution is another scenario where window functions save the day. In R, there are a couple straightforward options (if you happen to know about the functions!) which have radically different performance profiles.

Comments closed

When OBJECT_NAME() Can Block

Andy Mallon shows us that the OBJECT_NAME() function does not care about your READ UNCOMMITTED transaction isolation level:

That’s just going to sit there and wait, and wait, and wait, and wait. Because the table was created inside a transaction, the metadata about the table is uncommitted, and thus unavailable to the second session. The result is that my second session waits & waits & waits until the first session is committed (or rolled back).

OK, so it’s being blocked due to an uncommitted transaction. We could try doing dirty reads, right? I’ll kill my second session that’s been blocked, and I’ll throw in SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;. That’s like just like using a nolock hint:

USE AM2_WTF;
GO
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT OBJECT_NAME(581577110);

What the… It’s still being blocked by the uncommitted CREATE TABLE

Read on to understand why, and what you can do about it.

Comments closed

Creating a Custom Power BI Filter Pane

Tomaz Kastrun walks us through creating a custom filter pane in Power BI:

Having as much possible Power BI estate when it comes to putting the visuals for data exploration, is everyone’s dream. But the slicers for “slicing and dicing” the data is also very important. Having all the slicers hidden has always been mine go-to design.

Several similar ideas have been shown, Guy in a cube (Adam and Patric) have both done similar ideas over past years, many questions have also been answered on Power BI community website. And mine requirements were similar:

1. have the ability to hide the filter pane
2. have the selected slicer items listed

Read on to see how Tomaz took care of the issue.

Comments closed

Query Concurrency + Azure Synapse Analytics

James Serra takes us through query concurrency with Azure Synapse Analytics:

A common question I here from customers is because of the performance of Azure Synapse Analytics (formally called Azure SQL Data Warehouse or SQL DW), can they run Power BI dashboards against it using DirectQuery (and not have to use Azure Analysis Services (AAS), Import the data into Power BI, or use Power BI aggregation tables), avoiding having another copy of the data (saving money), and having data “real time” (as of the last refresh of the data warehouse)?

There are two things to think of in considering an answer this question. The first is if you will get the performance you need (discussed in my last blog), the second is if a certain amount of concurrent queries or connections will cause a problem (the subject of this blog).

Read the whole thing.

Comments closed