Press "Enter" to skip to content

Month: August 2017

Dynamic Date Dimensions In Power BI

Ginger Grant shows how to build a dynamic date dimension in Power BI using either DAX or M:

Recently I needed to create a date dimension for a Power BI model as there was not one in the source database. There are two different ways that I could do this, using DAX from the Modeling Tab within the Data View or using M via the Query Editor window.  As a general rule, when it is possible data manipulation should be done in M as it offers a greater level of compression.  In this case though I am using a function in DAX, which is not the same as creating a calculated column.

Read on to see code examples for each method, as well as Ginger’s analysis.

Comments closed

More Annoying MSDTC Problems

Jeff Mlakar walks through some advanced MSDTC troubleshooting:

Sometimes when a VM is cloned from a template (VMware or Hyper-V) the CID can be duplicated between the two machines. Evidence of this can be seen in the Event Viewer (Event ID 4101). This means the two MSDTC services will not be able to communicate with each other.

A possible error message is:

The Microsoft Distributed Transaction Coordinator (MS DTC) has cancelled the distributed transaction

The MSDTC feature of the Windows operating system requires unique CID values to ensure that MSDTC functionality between computers works correctly. Disk duplicate images of Windows installations must have unique CID values or MSDTC functionality may be impaired. This can occur when using virtual hard disks to deploy an operating system to a virtual machine.

This burned me in the past.  Jeff has several scenarios that he walks us through, so if you’re using the Distributed Transaction Coordinator, definitely check this out.

Comments closed

Limiting Color Usage On Dashboard Charts

Jesse Gorter explains why you shouldn’t overwhelm your dashboard chart users with colors:

In this example we use a signal color for the past too. Do you notice how the usage of green distracts from the current week which is a red? This suggest we are doing great overall even though at this time, we are doing not so great. It is up to you to decide what you want to communicate. If you are a sports team showing the rank during the season, only the current position would be important. In sales, having 30 weeks of outstanding sales above the target and the current week selling slightly under, it would make sense to show the signal color for the past.

Not to mention making it easier for people with CVD to read your report, something with which the red-green scheme does not do great.

Comments closed

Left Versus Right Joins

Denis Gobo doesn’t like RIGHT JOIN:

Do you use RIGHT JOINs? I myself rarely use a RIGHT JOIN, I think in the last 17 years or so I have only used a RIGHT JOIN once or twice. I think that RIGHT JOINs confuse people who are new to databases, everything that you can do with a RIGHT JOIN, you can also do with a LEFT JOIN, you just have to flip the query around

So why did I use a RIGHT JOIN then?

Don’t be lazy; switch out those right joins.  The trick is that for every RIGHT JOIN statement, there is an equivalent statement which does not use RIGHT JOIN.  The percentage of the time that you might benefit from RIGHT JOIN is so low that the fixed costs of mentally processing what’s going on tend to overwhelm the slight benefit of that style of join.

Comments closed

Going Back From The Cloud

Arun Sirpal notes that you can take a cloud database back to on-premises:

The Challenge: I am going to write about a way to move from Azure SQL Database (Platform as a service) back to a local SQL Server. I did encounter errors on the way but more importantly I have written how to avoid/solve them.

Another key point I made sure that there were no connections to the database when doing the below as I didn’t want in-flight data movement whilst doing it. If you can’t do this, then you should create a copy of the database and work from that.

It’s not a trivial operation, but Arun does walk us through the steps.

Comments closed

ARITHABORT And ANSI_WARNINGS

Shane O’Neill looks at what the ARITHABORT and ANSI_WARNINGS settings do in SQL Server:

So, like a dog when it sees a squirrel, when I found out about the problems with ARITHABORT and ANSI_WARNINGS I got distracted and started checking out what else I could break with it. Reading through the docs, because I found that it does help even if I have to force myself to do it sometimes, I found a little gem that I wanted to try and replicate. So here’s a reason why you should care about setting ARITHABORT and ANSI_WARNINGS on.

These are two settings where the default value makes a lot of sense.

Comments closed

Context Switches In SQL Server

Ewald Cress continues his journey to the center of the SQLOS:

The SQLOS scheduler exists in the cracks between user tasks. As we’re well aware, in order for scheduling to happen at all, it is necessary for tasks to run scheduler-friendly code every now and again. In practice this means either calling methods which have the side effect of checking your quantum mileage and yielding if needed, or explicitly yielding yourself when the guilt gets too much.

Now from the viewpoint of the user task, the experience of yielding is no different than the experience of calling any long-running CPU-intensive function: You call a function and it eventually returns. The real difference is that the CPU burned between the call and its return was spent on one or more other threads, while the current thread went lifeless for a bit. But you don’t know that, because you were asleep at the time!

Definitely read the whole thing.

Comments closed

Clippy Lives: In Scala

Akhil Vijayan explains Scala Clippy:

Now you may be wondering how these errors are identified and we get advice related to it.

Simple, these are provided by the Scala community. If you visit their official website Scala Clippy where you can find a tab “Contribute”. Under that, we can post our own errors. These errors are parsed first, and when successful we can add our advice which will be reviewed and if accepted it will be added to their database which will, in turn, be beneficial to others.

Take a close look at the screenshots; I missed it at first, but there’s helpful advice above the error message.

Comments closed

More On S3Guard

Aaron Fabbri describes how S3Guard works:

Although Apache Hadoop has support for using Amazon Simple Storage Service (S3) as a Hadoop filesystem, S3 behaves different than HDFS.  One of the key differences is in the level of consistency provided by the underlying filesystem.  Unlike HDFS, S3 is an eventually consistent filesystem.  This means that changes made to files on S3 may not be visible for some period of time.

Many Hadoop components, however, depend on HDFS consistency for correctness. While S3 usually appears to “work” with Hadoop, there are a number of failures that do sometimes occur due to inconsistency:

  • FileNotFoundExceptions. Processes that write data to a directory and then list that directory may fail when the data they wrote is not visible in the listing.  This is a big problem with Spark, for example.

  • Flaky test runs that “usually” work. For example, our root directory integration tests for Hadoop’s S3A connector occasionally fail due to eventual consistency. This is due to assertions about the directory contents failing. These failures occur more frequently when we run tests in parallel, increasing stress on the S3 service and making delayed visibility more common.

  • Missing data that is silently dropped. Multi-step Hadoop jobs that depend on output of previous jobs may silently omit some data. This omission happens when a job chooses which files to consume based on a directory listing, which may not include recently-written items.

Worth reading if you’re looking at using S3 to store data for Hadoop.  Also check out an earlier post on the topic.

Comments closed