Month: May 2018

Updated DILM Catalog Browser

Andy Leonard announces a new version of his Catalog Browser utility:

Catalog Browser first displays the reference mapping in the context of the environment named DEV_Person. DEV_Person is a Catalog Environment that contains a collection of Catalog Environment Variables.

Catalog Browser next displays the reference mapping in the context of the SSIS Connection Manager named AdventureWorks2014.OLEDB that consumes the Reference between the DEV_Person environment and the Load_Person.dtsx SSIS package. Note that this Reference Mapping is displayed as <Property Name> –> <Environment Variable Name>, or “ConnectionString –> SourceConnectionString”. Why? Catalog Browser is displaying the Reference Mapping from the perspective of the Connection Manager property.

The third instance of Values Everywhere is shown in the Package Connection References node. Remember, a reference “connects” a package or project to an SSIS Environment Variable (learn more at SSIS Catalog Environments – Step 20 of the Stairway to Integration Services).  From the perspective of the reference, the reference mapping is displayed as <Environment Variable Name> –> <Property Name>, or “SourceConnectionString –> ConnectionString”. Why? Catalog Browser is displaying the Reference Mapping from the perspective of the Reference.

Check it out.
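
If you want to see the same metadata without the utility, here is a rough T-SQL sketch against the SSISDB catalog views, which are presumably what Catalog Browser reads. The view and column names are real; the join is simplified (full resolution of references goes through catalog.environment_references), and the example values in the comments come from Andy's scenario.

```sql
-- Sketch: reference mappings between environment variables and object
-- parameters (connection manager properties show up as CM.<name>.<property>).
SELECT
    env.name          AS environment_name,           -- e.g. DEV_Person
    ev.name           AS environment_variable_name,  -- e.g. SourceConnectionString
    op.object_name    AS object_name,                -- e.g. Load_Person.dtsx
    op.parameter_name AS property_name               -- e.g. CM.AdventureWorks2014.OLEDB.ConnectionString
FROM SSISDB.catalog.environments AS env
INNER JOIN SSISDB.catalog.environment_variables AS ev
    ON ev.environment_id = env.environment_id
INNER JOIN SSISDB.catalog.object_parameters AS op
    ON op.referenced_variable_name = ev.name
WHERE op.value_type = 'R';  -- 'R' = the value is a reference to an environment variable
```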

Update Statistics After An Upgrade

Erin Stellato gives us some good life advice:

There are a variety of methods we use for helping customers upgrade to a new SQL Server version, and one question we often get asked is whether or not statistics need to be updated as part of the upgrade process.

tl;dr

Yes.  Update statistics after an upgrade.  Further, if you’re upgrading to 2012 or higher from an earlier version, you should rebuild your indexes (which will update index statistics, so then you just need to update column statistics).

Make this one of your “Not too long; did read” posts of the day.
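
For a starting point, here is a minimal T-SQL sketch of Erin's advice; dbo.SomeTable is a placeholder, and in practice you would drive this from metadata or your existing maintenance solution.

```sql
-- Post-upgrade statistics maintenance, per Erin's advice.

-- Rebuilding the indexes also updates their statistics with a full scan.
ALTER INDEX ALL ON dbo.SomeTable REBUILD;

-- Then refresh just the column statistics; the COLUMNS option skips the
-- index statistics the rebuild already took care of.
UPDATE STATISTICS dbo.SomeTable WITH FULLSCAN, COLUMNS;
```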

Choosing Between SQL, M, And DAX

Paul Turley chooses the form of his destructor:

A Date dimension table is an essential component in most any data warehouse or reporting database, so techniques to generate these tables have been around for a long time.  The foundation of a Date dimension table is a table containing one row per contiguous date in a range that includes every possible transaction date or fact record.  To make reporting easier, it is common practice to have multiple date dimensions in the semantic model.  For example, if sales transaction facts have an Order Date and a Delivery Date, and both are used independently for reporting, there may be an Order Date dimension and a Delivery Date dimension in the model.

A common practice for building the dimension table is to just populate a single Date type column with the sequential date values.  After these rows are inserted, date part functions may be used to populate additional columns by referencing the Date value in an expression.  Most every language includes, for example, a MONTH() and YEAR() function to convert a date value into these date parts.
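
A minimal T-SQL sketch of that pattern might look like the following; the date range and column names are illustrative.

```sql
-- Generate one row per contiguous date, then derive the date parts from it.
DECLARE @start date = '2015-01-01', @end date = '2020-12-31';

WITH dates AS
(
    SELECT @start AS TheDate
    UNION ALL
    SELECT DATEADD(DAY, 1, TheDate)
    FROM dates
    WHERE TheDate < @end
)
SELECT
    TheDate,
    YEAR(TheDate)              AS TheYear,
    MONTH(TheDate)             AS TheMonth,
    DATENAME(MONTH, TheDate)   AS TheMonthName,
    DATEPART(QUARTER, TheDate) AS TheQuarter
FROM dates
OPTION (MAXRECURSION 0);  -- the range exceeds the default limit of 100 recursions
```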

I’m hoping that Paul puts together several of these types of post, where he contrasts building something in SQL, M, and DAX so we can see which language helps most where.

Error Processing SSIS Task When TargetServerVersion Is SQL Server 2016

Shabnam Watson diagnoses an error condition when trying to run an Analysis Services processing task inside SQL Server Integration Services:

I ran into this problem a while ago at a client. They upgraded from Visual Studio 2013 to 2015 and the SSAS processing tasks started to error out immediately. The solution turned out to be setting the TargetServerVersion to anything but SQL Server 2016. In this case, it was set to 2014 and that fixed the error.

Recently I ran into this post https://twitter.com/SQLKohai/status/994335086425399297 by Matt Cushing (@SQLKohai) and decided to dig in more.  Initially when I tested it, all was working fine. After I installed SSDT 2015 to test, I started getting the same error in SSDT 2017.  I played around with a DLL and got SSDT 2017 to work with all TargetServerVersions again. In the end I managed to break it again after I went through an uninstall and reinstall of all versions of SSDT. The reason I reinstalled SSDT was that I thought I might have had a broken registry entry that the installation would fix. This did not work!

Read on for the solution and a detailed dive into the problem.

Non-Blocking Aggregations

Daniel Hutmacher tilts at windmills:

It’s not entirely uncommon to want to group by a computed expression in an aggregation query. The trouble is, whenever you group by a computed expression, SQL Server considers the ordering of the data to be lost, and this will turn your buttery-smooth Stream Aggregate operation into a Hash Match (aggregate) or create a corrective Sort operation, both of which are blocking.

Is there anything we can do about this? Yes, sometimes, like when those computed expressions are YEAR() and MONTH(), there is. But you should probably get your nerd on for this one.

There are many ways to solve a problem, and sometimes the best method is indirect.
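
This is not necessarily Daniel's exact technique, but here is one illustration of the indirect route: instead of grouping by YEAR() and MONTH() directly, seek each month's range so the aggregate never needs a hash or a sort. It assumes an index on dbo.Orders(OrderDate); all object names are hypothetical.

```sql
-- One indirect route: turn GROUP BY YEAR(), MONTH() into one ordered range
-- seek per month, avoiding the Hash Match (aggregate) and corrective Sort.
WITH months AS
(
    SELECT DATEFROMPARTS(y.n, m.n, 1) AS MonthStart
    FROM (VALUES (2016), (2017), (2018)) AS y(n)
    CROSS JOIN (VALUES (1),(2),(3),(4),(5),(6),
                       (7),(8),(9),(10),(11),(12)) AS m(n)
)
SELECT m.MonthStart, agg.Total
FROM months AS m
CROSS APPLY
(
    -- With an index on OrderDate, each probe is a seek over a contiguous range.
    SELECT SUM(o.Amount) AS Total
    FROM dbo.Orders AS o
    WHERE o.OrderDate >= m.MonthStart
      AND o.OrderDate <  DATEADD(MONTH, 1, m.MonthStart)
) AS agg;
```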

Building A Neural Network With TensorFlow

Julien Heiduk gives us an example of building a neural network with TensorFlow:

To use TensorFlow, we need to transform our data (features) into a special format. As a reminder, we have only continuous features. So the first function used is tf.contrib.layers.real_valued_column. The other cells allow us to create a train set and a test set from our training dataset. The sampling is not the most rigorous, but that is not the goal of this article. So be careful! The 67-33 split is not a hard rule!

It’s probably an indicator that I’m a casual, but I prefer to use Keras as an abstraction layer rather than working directly with TensorFlow.

Scientific Debt

David Robinson gives us the data science analog to technical debt:

In my new job as Chief Data Scientist at DataCamp, I’ve been thinking about the role of data science within a business, and discussing this with other professionals in the field. On a panel earlier this year, I realized that data scientists have a rough equivalent to this concept: “scientific debt.”

Scientific debt is when a team takes shortcuts in data analysis, experimental practices, and monitoring that could have long-term negative consequences. When you hear a statement like:

  • “We don’t have enough time to run a randomized test, let’s launch it”
  • “To a first approximation this effect is probably linear”
  • “This could be a confounding factor, but we’ll look into that later”
  • “It’s directionally accurate at least”

you’re hearing a little scientific debt being “borrowed”.

Read the whole thing.  I strongly agree with the premise.

Is Query Folding Taking Place?

Chris Webb has a quick way of seeing if query folding is taking place when importing data from Analysis Services into Power Query (either Power BI or Excel):

As a quick follow-on from last week’s post on how to detect whether query folding is taking place when importing from OData data sources, if you’re importing data from Analysis Services you have a similar problem: how do you know whether query folding is taking place? Ensuring that query folding takes place for as many of the steps in your query as possible – especially those that filter or otherwise reduce the amount of data returned – is very important for data refresh performance.

Although the Power Query engine generates MDX queries when importing from Analysis Services in the same way it generates SQL queries when it imports from a relational database, the View Native Query option doesn’t work for Analysis Services data sources. You can of course use a Profiler trace or xEvents to see the MDX, but for most users that will not be an option for security reasons.

Read on for a better alternative.
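
For contrast, when a relational source does fold, the View Native Query option shows the generated query. The shape is something like the following; this is illustrative only, with invented table and column names, not output captured from Power Query.

```sql
-- Illustrative only: the style of folded T-SQL that View Native Query reveals
-- for a relational source when a filter step is pushed back to the database.
select [_].[SalesOrderID],
       [_].[OrderDate],
       [_].[TotalDue]
from [dbo].[SalesOrderHeader] as [_]
where [_].[OrderDate] >= convert(datetime2, '2018-01-01 00:00:00')
```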

Resetting Slices In Power BI With Bookmarks

Reza Rad shows how to use a bookmark to set the default state of a Power BI dashboard and return to it with a single button click:

Using bookmarks for clearing all slicers in Power BI is not a new function; I have been using it for many months and advising many people to do it that way. However, I still get a lot of questions in my presentations about how to do that. That is why I ended up writing this post. This post shows you a very quick trick for having a button to clear all slicers, and the magic all happens with bookmarks. Bookmarks store the state of a Power BI page and can be used in many scenarios; in this post, I only show you the ability to clear all slicers on a page. To learn more about Power BI, read the Power BI book from Rookie to Rock Star.

Click through for an example of how to set this up.

Constraints On Temp Tables

Kenneth Fisher argues that you should let SQL Server generate default names for temp table constraints:

You should be able to create a #temp in every session. That’s the idea, right? It’s one of the things that differentiates a global temp table from a local temp table. But there can be some difficulties with that.

If you are working with reusable code that uses temp tables (a stored procedure, for example), sometimes you need to create a constraint. The thing about constraints is that their names must be unique, just like the names of tables, stored procedures, etc.; a constraint name can only be used once. You can’t have two tables with the same constraint name. In fact, you can’t even have a constraint name that matches the name of a table, stored procedure, or other object.

There’s some solid advice in this post.
