Conformity In Self-Service BI

Paul Turley has a nice post on some of the risks of self-service BI:

In some solutions with a manageable scale and a reasonable tolerance for a certain amount of data loss and inconsistency, this approach may be just fine.  There are very good reasons for inconsistencies between sets of data that come from different sources.  When 20 million fact rows are obtained from an online ordering system and .1% don’t have matching customers records that come from the system used to manage the master customer list, it may simply be a question of timing.  Or, maybe the definition of “customer” is slightly different in the two systems.  Chances are that there are also legitimate data quality issues affecting a small percentage of these records.

Whatever the case may be, a data conformity or potential data quality issue in this design scenario falls on the model designer to either fix or ignore.  Again, this may or may not be a concern but it is a decision point and might raise the question of ownership when questions about data quality are raised.

Paul then goes on to show how this gets fixed in a traditional model and where you need to watch out with SSAS Tabular.  Good essay worth reading.

Visualizing R In Power BI (Too)

Dustin Ryan is also looking at R visualization in Power BI:

Not only can we create and download custom visuals from PowerBI.com to extend the capabilities of Power BI, we can use R to create a ridiculous amount of powerful visualizations. If you can get the data into Power BI, you can use R to perform interesting statistical analysis and create some pretty cool, interactive visuals.

Dustin and Jan Mulkens are working on similar posts at the same time, so watch both of them.

Using NOEXPAND Hints

Paul White expands upon NOEXPAND:

There is another consequence of not using the NOEXPAND hint, which I mentioned in passing a couple of years ago in my article, Optimizer Limitations with Filtered Indexes:

The NOEXPAND hints are needed even in Enterprise Edition to ensure the uniqueness guarantee provided by the view indexes is used by the optimizer.

If you use indexed views in your environment, read this article.

Trace Flags Without Sysadmin

Jack Li shows how to enable a trace flag without sysadmin or changing any application code:

The initial thought is to enable the trace flag at session level.  We ran into two challenges.  First, application needs code change (which they couldn’t do) to enable it.  Secondly, dbcc traceon requires sysadmin rights.   Customer’s application used a non-sysadmin user.  These two restrictions made it seem impossible to use the trace flag.

However, we eventually came up with a way of using logon trigger coupled with wrapping the dbcc traceon command inside a stored procedure.   In doing so, we solved all problems.  We were able to isolate the trace flag just to that application without requiring sysadmin login.

This is the very edge of an edge case.  In normal practice, change the code.

Trustworthy Databases

Kenneth Fisher asks if you check TRUSTWORTHY settings on your databases:

I wasn’t surprised (although a little disappointed) that out of the 9 people the answered only one person was, and of the rest 5 didn’t even know what TRUSTWORTHY is. I even had one person ask me later. That’s somewhat scary because under the right circumstances if you give me a database with TRUSTWORTHY turned on I can take over your instance. I’m NOT going to show you how but it isn’t terribly difficult.

I’ll admit that I have been a bit non-chalant about TRUSTWORTHY in the past, but turning it on is the smart move.

SQL Server 2012 Slipstreaming

Cody Konior looks at slipstreaming SQL Server 2012:

There was a blog post by Boris Hristov which had some good images of the various places to look in the GUI to make sure that the updates have been picked up correctly. Using these and through experimentation I was able to answer those questions.

Interesting questions and good answers.

Bad Fixes

David Alcock looks at a few common “fixes” which end up causing their own problems:

I’m seeing lots of CXPACKETS waits, how do I fix these?

Bad Advice = Set the maximum degree of parallelism to 1, no more CXPACKET waits!

I’m seeing index fragmentation occur quite frequently on some of my indexes, what should I do?

Bad Advice = Set fill factor on the server to 70, no more fragmentation problems!

I’m worried about TempDB contention. What should I do?

Bad Advice = Set the number of files to the number of cores, no more contention issues!

Read the post for better advice.

Power BI And R

Kevin Feasel

2015-12-31

Power BI, R

Jan Mulkens has started a series on combining Power BI and R.

Part 1:

Fact is, R is here to stay. Even Microsoft has integrated R with SQL Server 2016 and it has made R scripting possible in it’s great Azure Machine Learning service.
So it was only a matter of time before we were going to see R integrated in Power BI.

Part 2:

From the previous point, it seems that R is just running in the background and that most of the functionality can be used.

Testing some basic functionality like importing and transforming data in the R visual worked fine.
I haven’t tried any predictive modelling yet but I assume that will just work as well.

Part 3:

So instead of printing “Hello world” to the screen, we’ll use a simple graph to say hello to the world.

First we need some data, Power BI enables us to enter some data in a familiar Excel style.
Just select “Enter Data” and start bashing out some data.

I’m looking forward to the rest of the series.

Populating Lookup Query Using A Variable

Meagan Longoria shows us how to use a variable to populate a lookup query in SSIS:

I already had my data flow populated with the lookup for MSA. I set it to full cache and entered a query in the connection to initially populate the fields that would be returned (simply my lookup query without the where clause).

Next, I opened the data flow properties, located Expressions and clicked on the ellipses to open the Property Expression  Editor. I found the SQLCommand property for my MSA lookup and set it to my package variable that contained my query.

I had issues in the past with full cached lookups and variables.  Fortunately, you can get around a lot of problems with expressions.

Data Compression

Andy Mallon looks at the costs and benefits of data compression:

The obvious benefit is that compressed data takes up less space on disk. Since you probably keep multiple copies of your database (multiple environments, DR, backups, etc), this space savings can really add up. High-performance enterprise-class storage is expensive. Compressing your data to reduce footprint can have a very real benefit to your budget. I once worked on an SAP ERP database that was 12TB uncompressed, and was reduced to just under 4TB after we implemented compression.

My experience with compression is that the benefit vastly outweighs the cost.  Do your own testing, of course.

Categories

August 2019
MTWTFSS
« Jul  
 1234
567891011
12131415161718
19202122232425
262728293031