August 2017 – Page 16

Recently I needed to create a date dimension for a Power BI model, as there was not one in the source database. There are two different ways that I could do this: using DAX from the Modeling tab within the Data View, or using M via the Query Editor window. As a general rule, when it is possible, data manipulation should be done in M, as it offers a greater level of compression. In this case, though, I am using a function in DAX, which is not the same as creating a calculated column.

Read on to see code examples for each method, as well as Ginger’s analysis.
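
Ginger's DAX and M code is in her post and is not reproduced here. As a language-neutral sketch of what a date dimension holds, here is a small Python version that emits one row per calendar day. The function name and column names (`date_key`, `is_weekend`, and so on) are my own assumptions for illustration, not the columns from her model.

```python
from __future__ import annotations

from datetime import date, timedelta


def build_date_dimension(start: date, end: date) -> list[dict]:
    """Return one row per calendar day from start to end, inclusive."""
    if end < start:
        raise ValueError("end must not precede start")
    rows = []
    current = start
    while current <= end:
        rows.append(
            {
                # Integer surrogate key in YYYYMMDD form (an assumed convention).
                "date_key": current.year * 10000 + current.month * 100 + current.day,
                "date": current,
                "year": current.year,
                "quarter": (current.month - 1) // 3 + 1,
                "month": current.month,
                "day": current.day,
                # ISO numbering: Monday = 1 through Sunday = 7.
                "day_of_week": current.isoweekday(),
                "is_weekend": current.isoweekday() >= 6,
            }
        )
        current += timedelta(days=1)
    return rows


# Hypothetical usage: build the rows for August 2017.
august = build_date_dimension(date(2017, 8, 1), date(2017, 8, 31))
```

The point of a dedicated table like this is that every attribute is computed once per day, so reports can slice by quarter or weekday without repeating date arithmetic.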

Limiting Color Usage On Dashboard Charts

Published 2017-08-07 by Kevin Feasel

Jesse Gorter explains why you shouldn’t overwhelm your dashboard chart users with colors:

In this example we use a signal color for the past too. Do you notice how the usage of green distracts from the current week, which is red? This suggests we are doing great overall even though, at this time, we are doing not so great. It is up to you to decide what you want to communicate. If you are a sports team showing the rank during the season, only the current position would be important. In sales, having 30 weeks of outstanding sales above the target and the current week selling slightly under, it would make sense to show the signal color for the past.

Restraint with color also makes your report easier to read for people with color vision deficiency (CVD), who tend to struggle with red-green schemes.

Left Versus Right Joins

Published 2017-08-07 by Kevin Feasel

Denis Gobo doesn’t like RIGHT JOIN:

Do you use RIGHT JOINs? I myself rarely use a RIGHT JOIN; I think in the last 17 years or so I have only used a RIGHT JOIN once or twice. I think that RIGHT JOINs confuse people who are new to databases. Everything that you can do with a RIGHT JOIN, you can also do with a LEFT JOIN; you just have to flip the query around.

So why did I use a RIGHT JOIN then?

Don’t be lazy; switch out those right joins. The trick is that for every RIGHT JOIN statement, there is an equivalent statement which does not use RIGHT JOIN. The percentage of the time that you might benefit from RIGHT JOIN is so low that the fixed costs of mentally processing what’s going on tend to overwhelm the slight benefit of that style of join.
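
To make the "flip the query around" claim concrete, here is a minimal sketch in plain Python rather than SQL. It implements both outer joins over lists of dictionaries and checks that a right join gives the same pairs as a left join with the inputs swapped. The function names and the sample `customers` and `orders` rows are hypothetical, and real database engines do not execute joins with nested loops like this.

```python
def left_join(left, right, key):
    """Keep every left row; pair it with None when no right row matches."""
    result = []
    for l_row in left:
        matches = [r_row for r_row in right if r_row[key] == l_row[key]]
        if matches:
            result.extend((l_row, r_row) for r_row in matches)
        else:
            result.append((l_row, None))
    return result


def right_join(left, right, key):
    """Keep every right row; pair it with None when no left row matches."""
    result = []
    for r_row in right:
        matches = [l_row for l_row in left if l_row[key] == r_row[key]]
        if matches:
            result.extend((l_row, r_row) for l_row in matches)
        else:
            result.append((None, r_row))
    return result


# Hypothetical sample data: Bob has no orders.
customers = [{"customer_id": 1, "name": "Ann"}, {"customer_id": 2, "name": "Bob"}]
orders = [{"customer_id": 1, "total": 50}]

# orders RIGHT JOIN customers ...
via_right = right_join(orders, customers, "customer_id")
# ... is customers LEFT JOIN orders with each pair's sides swapped back.
via_left = [(o, c) for c, o in left_join(customers, orders, "customer_id")]
```

Both `via_right` and `via_left` keep Bob with a missing order, which is why the rewrite is always available.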

Going Back From The Cloud

Published 2017-08-07 by Kevin Feasel

Arun Sirpal notes that you can take a cloud database back to on-premises:

The Challenge: I am going to write about a way to move from Azure SQL Database (Platform as a service) back to a local SQL Server. I did encounter errors on the way but more importantly I have written how to avoid/solve them.

Another key point: I made sure that there were no connections to the database when doing the below, as I didn’t want in-flight data movement whilst doing it. If you can’t do this, then you should create a copy of the database and work from that.

It’s not a trivial operation, but Arun does walk us through the steps.

Hypnotizing Your Users: Drawing Spirals In SQL Server

Published 2017-08-07 by Kevin Feasel

Slava Murygin shows how to draw spirals in SQL Server using spatial data types:

In this script you can play with the total number of iterations (@i), with the increment value of @R, or with the width of a line (STBuffer), but generally you will always have the same “Archimedean” type of spiral.

Slava shows us how to build a half-dozen different types of spirals, providing sample code for each.
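
Slava's scripts are T-SQL and live in his post. As a rough sketch of the underlying geometry only, the Python below generates points on an Archimedean spiral, where the radius grows by a fixed amount per angular step, and formats them as a WKT `LINESTRING`. The function names, the `angle_step` parameter, and the defaults are my assumptions; they loosely mirror the iteration count and radius increment mentioned in the excerpt but are not his code.

```python
import math


def archimedean_spiral(iterations, radius_step, angle_step=0.1):
    """Return (x, y) points whose radius grows linearly with the angle."""
    if iterations < 1:
        raise ValueError("need at least one iteration to form a line")
    points = []
    for i in range(iterations + 1):
        theta = i * angle_step
        radius = i * radius_step  # fixed increment per step: the Archimedean property
        points.append((radius * math.cos(theta), radius * math.sin(theta)))
    return points


def to_wkt_linestring(points, digits=4):
    """Format points as well-known text, rounding to the given decimal places."""
    coords = ", ".join(f"{x:.{digits}f} {y:.{digits}f}" for x, y in points)
    return f"LINESTRING ({coords})"


# Hypothetical usage: 500 steps, radius growing by 0.02 per step.
wkt = to_wkt_linestring(archimedean_spiral(500, 0.02))
```

In SQL Server, text like this could presumably be loaded with `geometry::STGeomFromText` and thickened with `STBuffer`, which is the line-width knob the excerpt mentions.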

ARITHABORT And ANSI_WARNINGS

Published 2017-08-07 by Kevin Feasel

Shane O’Neill looks at what the ARITHABORT and ANSI_WARNINGS settings do in SQL Server:

So, like a dog when it sees a squirrel, when I found out about the problems with ARITHABORT and ANSI_WARNINGS I got distracted and started checking out what else I could break with it. Reading through the docs, because I found that it does help even if I have to force myself to do it sometimes, I found a little gem that I wanted to try and replicate. So here’s a reason why you should care about setting ARITHABORT and ANSI_WARNINGS on.

These are two settings where the default value makes a lot of sense.

Context Switches In SQL Server

Published 2017-08-04 by Kevin Feasel

Ewald Cress continues his journey to the center of the SQLOS:

The SQLOS scheduler exists in the cracks between user tasks. As we’re well aware, in order for scheduling to happen at all, it is necessary for tasks to run scheduler-friendly code every now and again. In practice this means either calling methods which have the side effect of checking your quantum mileage and yielding if needed, or explicitly yielding yourself when the guilt gets too much.

Now from the viewpoint of the user task, the experience of yielding is no different than the experience of calling any long-running CPU-intensive function: You call a function and it eventually returns. The real difference is that the CPU burned between the call and its return was spent on one or more other threads, while the current thread went lifeless for a bit. But you don’t know that, because you were asleep at the time!

Definitely read the whole thing.
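
If cooperative scheduling is new to you, a toy model may help before diving into Ewald's post. The sketch below is an analogy only: it uses Python generators as stand-ins for tasks, and it is in no way how SQLOS is implemented. The names `task` and `run`, and measuring the quantum in "work units", are assumptions for illustration.

```python
from collections import deque


def task(name, work_units, quantum, log):
    """Do some work, yielding whenever the quantum is used up."""
    used = 0
    for unit in range(work_units):
        log.append((name, unit))  # stand-in for real work
        used += 1
        if used >= quantum:
            used = 0
            # Voluntarily hand control back. From inside the task this
            # looks like a call that eventually returns.
            yield


def run(tasks):
    """Round-robin scheduler that only regains control when a task yields or ends."""
    runnable = deque(tasks)
    while runnable:
        current = runnable.popleft()
        try:
            next(current)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished, so drop it
        runnable.append(current)  # still has work: back of the queue


# Hypothetical usage: two tasks of three units each, quantum of two units.
log = []
run([task("A", 3, 2, log), task("B", 3, 2, log)])
```

Note that the scheduler never interrupts a task; if a task never yielded, nothing else would run, which is the "scheduler-friendly code" obligation the excerpt describes.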

Clippy Lives: In Scala

Published 2017-08-04 by Kevin Feasel

Akhil Vijayan explains Scala Clippy:

Now you may be wondering how these errors are identified and we get advice related to it.

Simple: these are provided by the Scala community. If you visit their official website, Scala Clippy, you can find a tab called “Contribute”. Under that, we can post our own errors. These errors are parsed first, and when parsing succeeds we can add our advice, which will be reviewed and, if accepted, added to their database, which will in turn be beneficial to others.

Take a close look at the screenshots; I missed it at first, but there’s helpful advice above the error message.

More On S3Guard

Published 2017-08-04 by Kevin Feasel

Aaron Fabbri describes how S3Guard works:

Although Apache Hadoop has support for using Amazon Simple Storage Service (S3) as a Hadoop filesystem, S3 behaves differently than HDFS. One of the key differences is in the level of consistency provided by the underlying filesystem. Unlike HDFS, S3 is an eventually consistent filesystem. This means that changes made to files on S3 may not be visible for some period of time.

Many Hadoop components, however, depend on HDFS consistency for correctness. While S3 usually appears to “work” with Hadoop, there are a number of failures that do sometimes occur due to inconsistency:

FileNotFoundExceptions. Processes that write data to a directory and then list that directory may fail when the data they wrote is not visible in the listing. This is a big problem with Spark, for example.
Flaky test runs that “usually” work. For example, our root directory integration tests for Hadoop’s S3A connector occasionally fail due to eventual consistency. This is due to assertions about the directory contents failing. These failures occur more frequently when we run tests in parallel, increasing stress on the S3 service and making delayed visibility more common.
Missing data that is silently dropped. Multi-step Hadoop jobs that depend on output of previous jobs may silently omit some data. This omission happens when a job chooses which files to consume based on a directory listing, which may not include recently-written items.

Worth reading if you’re looking at using S3 to store data for Hadoop. Also check out an earlier post on the topic.
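
The third failure mode in that list is the sneakiest, so here is a toy simulation of it in Python. This is not the S3 API and not S3Guard itself: the class, its `listing_delay` parameter, and the tick-based clock are all assumptions made up to show how a job that trusts a directory listing can silently miss a recently written file.

```python
class EventuallyConsistentStore:
    """Toy object store: writes succeed at once, but listings lag behind."""

    def __init__(self, listing_delay):
        self.listing_delay = listing_delay  # ticks before a new key is listable
        self.clock = 0
        self._objects = {}  # key -> (data, tick at which the key becomes listable)

    def tick(self, steps=1):
        """Advance simulated time."""
        self.clock += steps

    def put(self, key, data):
        self._objects[key] = (data, self.clock + self.listing_delay)

    def list_keys(self, prefix):
        """Return only the keys whose listing delay has already elapsed."""
        return sorted(
            key
            for key, (_, visible_at) in self._objects.items()
            if key.startswith(prefix) and visible_at <= self.clock
        )


# Hypothetical two-step job: step one writes two parts, step two lists them.
store = EventuallyConsistentStore(listing_delay=2)
store.put("out/part-0", b"first")
store.tick(3)
store.put("out/part-1", b"second")
seen_immediately = store.list_keys("out/")  # part-1 is not listable yet
store.tick(2)
seen_later = store.list_keys("out/")  # now both parts appear
```

Nothing raises an error in the first listing; the downstream step simply processes one file instead of two, which is why the excerpt calls the data "silently dropped".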
