Press "Enter" to skip to content

Author: Kevin Feasel

Testing BI Projects With NBi: Hierarchies And Levels

Cedric Charlier shows us a potential pain point when testing an Analysis Services cube with hierarchies defined:

When executing the following query (on the Adventure Works 2012 sample database/cube), you’ll see two columns in the result displayed by SSMS. It’s probably what you’re expecting, you’re only selecting one specific level of the hierarchy [Date].[Calendar Date] and one measure.

You’re probably expecting that NBi will also consider two columns. Unfortunately, it’s not the case: NBi will consider 4 columns! What are the additional and unexpected columns? The [Date].[Calendar].[Calendar Year] and [Date].[Calendar].[Calendar Semester] are also returned. In reality, this is not something specific to NBi, it’s just the standard behaviour of the ADOMD library and SSMS is cheating when only displaying one column for the level!

Click through for the solution.  And if NBi sounds interesting, check out Cedric’s prior post on the topic.

Comments closed

Configuring SQL Operations Studio

Kendra Little gave a quiz on SQL Operations Studio and is now sharing the answers:

Question 2: To toggle a BLOCK comment, the built-in shortcut is…

Answer: Shift+Alt+A

  • Correct: 43 (27%)
  • Incorrect: 118 (73%)

I think a lot of folks who use SSMS regularly and don’t use VSCode may not know what I meant by the question, because SSMS doesn’t have this functionality (or if it does, I’ve never figured out the shortcut!)

Check out all of the answers and build up those SQLOps skills.

Comments closed

Betteridge’s Law And Index Hints

Bert Wagner asks a question in his title, Should You Use Index Hints?  Those familiar with Betteridge’s Law of Headlines know the general answer already:

One way to “fix” a poor performing plan is to use an index hint.  While we normally have no control over how SQL Server retrieves the data we requested, an index hint forces the  query optimizer to use the index specified in the hint to retrieve the data (hence, it’s really more of a “command” than a “hint”).

Sometimes when I feel like I’m losing control I like using an index hint to show SQL Server who’s boss.  I occasionally will also use index hints when debugging poor performing queries because it allows me to confirm whether using an alternate index would improve performance without having to overhaul my code or change any other settings.

About the only place I consistently use index hints is with filtered indexes, where the combination of parameter sniffing and inexactitude in filters will convince the optimizer that the filtered index isn’t helpful when it really is.

Comments closed

Database Source Control With SVN

Nate Johnson sets up SVN for local source control:

I almost always have trouble remembering which option is for use with a non-empty folder of “here’s a bunch of files that I want to dump into the repo to start with”, vs. “here’s an empty folder where I want to pull down the contents of an existing repo”.  Fortunately, Tortoise yells at you if you try to do the latter — which is Export — into a non-empty folder.  So we want to Import.  Assuming you have a folder where all your SQL scripts live already, right-clicky and say “Tortoise SVN .. Import.”

Check it out.  The only concern I have is that this source control is just local source control.  That’s very helpful in situations where you accidentally mess something up, but my preference is to put my code in the same source control system the developers are using.  And if the developers aren’t using source control, get that institution in place as soon as possible because that’s begging for trouble.

Comments closed

Supertype-Subtype Relationships In Tables

Deborah Melkin explains the supertype-subtype pattern in relational database architecture:

You don’t see a supertype-subtype relationship defined as such when you’re looking at the physical database. You’ll only see it explicitly in the logical data model. So what is the pattern and how do you know that you have one in your database?

This relationship exists where you have one entity that could have different attributes based on a discriminator type. One example is a person. Depending on the role of that person in relationship to the business, you will need to store different pieces of information for them. You need different information about a client than you do an employee. But you’re dealing with a person so there is shared information.

It’s a good pattern for minimizing data repetition.

Comments closed

Ingesting Multiple Data Sources With NiFi And MiniFi

Tim Spann shows how to collect data from multiple IoT devices using MiniFi and send it to a NiFi host:

So I designed my MiniFi flow in the Apache NiFi UI (pretty soon there will be a special designer for this). You then highlight everything there and hit ‘Create Template.’ You can then export it and convert it to config.yml. Again, this process will be automated and connected with the NiFi Registry very shortly to reduce the amount of clicking.

This is an example. When you connect to it in your flow you design it in Apache NiFi UI, you will connect to this port on the Remote Processor Group. If you are manually editing one (okay never do this, but sometimes I have to), you can copy that ID from this Port Details and past it in the file.

I like this as an overview of NiFi’s capabilities and a sneak peek at where they’re going.

Comments closed

debugr: Debugging In R

Joachim Zuckarelli announces a new R package, debugr:

debugr is a new package designed to support debugging in R. It mainly provides the dwatch() function which prints a debug output to the console or to a file. A debug output can consist of a static text message, the values of one or more objects (potentially transformed by applying some functions) or the value of one or multiple (more complex) R expressions.

Whether or not a debug message is displayed can be made dependent on the evaluation of a criterion phrased as an R expression. Generally, debug messages are only shown if the debug mode is activated. The debug mode is activated and deactivated with debugr_switchOn() and debugr_switchOff(), respectively, which change the logical debugr.active value in the global options. Since debug messages are only displayed in debug mode, the dwatch() function calls can even remain in the original code as they remain silent and won’t have any effect until the debug mode is switched on again.

Click through for links to additional resources.  It looks like an interesting way of tracing problems in more error-prone segments of code.  H/T R-Bloggers

Comments closed

What Is R?

Dave Mason has started a new blog and hits the heavy topic first:

For anyone that has no idea what R is, comparisons to scripting languages like PowerShell, javascript, vbscript, or even DOS batch/cmd files might be helpful. I feel there are enough commonalities, at least conceptually at a high level, for the comparison to be appropriate. We’ve already seen some differences, though. The <- assignment operator sure is weird. I recall Oracle’s PL/SQL used := as an assignment operator. Almost all other languages I remember coding with use the near-universal = (equals sign). Using <- will take some time getting used to.

Those R variables used in this post are declared without a data type. But they do have underlying types, which I’ll cover in another post. If I remember correctly, javascript doesn’t have types–everything is an object (please leave a comment if this is wrong and I’ll correct the post later). Vbscript used “var”s for everything, although you could coerce data types with functions like CInt, CBool, etc.

The way I like to describe R is as two things:  first, it is a domain-specific language dedicated to statistical analysis; and second, that it is a functional programming language (though not a pure functional language).

Comments closed

Exploratory Time Series Analysis

The authors at Knoyd have a post on exploratory data analysis of a time series data set:

From the plot above we can clearly see that time-series has strong seasonal and trend components. To estimate the trend component we can use a function from the pandas library called rolling_mean and plot the results. If we want to make the plot more fancy and reusable for another time-series it is a good idea to make a function. We can call this function plot_moving_average.

The second part of the series promises to use Box-Jenkins to forecast future values.

Comments closed