Power BI Takeover

Devin Knight has a Q&A on Power BI:

Q: What is the difference between the Query Editor and the Data Modeler? What can and can’t you do in each?

To summarize, the Query Editor is mainly for data extraction actions: providing source information, applying rules to the incoming data, and so on. The Data Modeling areas are focused on creating relationships between tables you’ve imported and creating calculations you might need in your report. Think of this as the last step to prepare your data for reports.

Check out Devin’s webinar as well.  It’s a lot longer than a coffee break, but worth your time.

RID Lookup Or Key Lookup?

Aaron Bertrand asks which is faster, RID lookups or key lookups?

I’ve seen multiple people state that a heap can be better than a clustered index for certain scenarios. I cannot disagree with that. One of the interesting reasons I’ve seen stated, though, is that a RID Lookup is faster than a Key Lookup. I’m a big fan of clustered indexes and not a huge fan of heaps, so I felt this needed some testing.

So, let’s test it!

I thought it would be good to create a database with two tables, identical except that one had a clustered primary key, and the other had a non-clustered primary key. I would time loading some rows into the table, updating a bunch of rows in a loop, and selecting from an index (forcing either a Key or RID Lookup).

It looks like RID lookups are slightly faster than key lookups.  But check out the comments:  this is a best-case scenario.

Connecting To SQL Data Warehouse

Robert Sheldon looks at ways to connect to Azure SQL Data Warehouse:

Unlike SSMS, Microsoft does support connecting to SQL Data Warehouse from Visual Studio, via the database engine features in SSDT. When you get into the Visual Studio/SSDT environment, open SQL Server Object Explorer, which is similar to Object Explorer in SSMS. From there, click the Add SQL Server button.

When the Connect dialog box appears, provide the server name, select SQL Server Authentication, and then specify the login name and password, as shown in the following figure.

It is a bit surprising that you can’t easily connect via SSMS 2014.  Maybe that’s changed with SSMS 2016?

Custom Power BI Visuals

Rob Farley looks into custom Power BI visuals:

I hadn’t explored much in the way of custom visuals in Power BI until a while back, even though I was very much aware of the competition that was held in September. It had been on my list to explore some of what was possible. And this month, the T-SQL Tuesday topic (hosted by Wendy Pastrick, @wendy_dance) was to learn something new and to blog about it. So it seemed a good idea to learn how to make my own custom visualisation!

Now, creativity isn’t exactly my thing. I find it really hard to write songs, for example. I know how to do it – but I quickly become self-critical and get stuck. Writing is easier, because it feels less ‘creative’, and appeals more to the teacher / preacher in me (and I know that takes creativity, especially if you’ve ever seen me present, but it’s different). So sitting down and coming up with a new way of visualising data wasn’t something I was going to do.

For more, check out the posts on creating visuals and on using R to visualize data in Power BI.

DB File Space Usage

Andy Mallon looks into sys.dm_db_file_space_usage:

SQL Server 2016 is coming out next month, and it’s chock full of new features that I can learn. Row-level security is interesting. Dynamic data masking sounds fun.

If you’re a regular reader of my blog, you probably know I try to approach questions from a unique angle. Instead of blogging about something cutting edge or sexy, I decided to scroll through the list of system views until I found one I didn’t recognize.

sys.dm_db_file_space_usage

The name is pretty self-explanatory, but I never noticed this existed until now. Seems like the type of DMV that I should have known about, but I didn’t. Quick look at BOL, and I got the verbose description from Microsoft:

Andy goes on to compare the outputs from this DMV to methods he’s historically used.

Trending And Smoothing

Matt Allington looks at trending and smoothing data in Power BI:

You can download the workbook here if you want to take a look, or simply look at the embedded version I have pinned at the bottom of the post.

Notice the spikes in sales in different months in the chart above?  These spikes are very common in sales data, and in my experience they can be even more prevalent in weekly sales data.  These spikes make it difficult to analyse trends in the data.  You could put a trend line into the chart (thanks to the April update to Power BI), but a standard linear trend line is too simplistic to really see what is happening in your data, particularly if there are seasonal changes.

One good way to look at the trends in your data is to add an Average Monthly Sales Rolling Quarter trend line to the chart.  You simply take the total sales of the last 3 months and then divide by 3.  If you were doing a weekly trend, take the last 13 weeks and divide by 13.  When you overlay this Avg Monthly Sales RQ line on the original chart, it looks like this.

This is a fairly advanced topic, but it’s also the kind of thing which separates good reporting from great reporting.
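
To make the arithmetic in Matt's description concrete, here is a minimal sketch in plain Python rather than DAX. It is not Matt's measure: the function name rolling_average and the sales figures are made up for illustration, and it assumes one value per month in chronological order.

```python
def rolling_average(values, window=3):
    """Trailing average over the last `window` values.

    Positions that do not yet have a full window behind them get None,
    so the smoothed line simply starts later than the raw series.
    """
    result = []
    running_total = 0.0
    for i, value in enumerate(values):
        running_total += value
        if i >= window:
            # Drop the value that just fell out of the trailing window.
            running_total -= values[i - window]
        if i >= window - 1:
            result.append(running_total / window)
        else:
            result.append(None)
    return result


# Hypothetical monthly sales with a spike in the fourth month.
monthly_sales = [120, 90, 150, 300, 110, 100]
smoothed = rolling_average(monthly_sales, window=3)
```

For a weekly series, the same function with window=13 gives the 13-week version mentioned in the excerpt.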

SQL Server 2016 Editions For BI

John White discusses which edition of SQL Server 2016 you should use for business intelligence work:

SQL Server 2016 removes the BI Edition as an option, leaving us with a choice between only Standard and Enterprise. The biggest news in my opinion from a licensing perspective with 2016 is that Tabular Mode will now be supported in Standard Edition. This puts the tabular model within the reach of all organizations, and closes the licensing gap in the BI stack. This is fantastic news.

There are of course limitations with Standard Edition. Tabular in Standard Edition is restricted to 16 GB of RAM, which may seem like a lot, but keep in mind that tabular is an in-memory technology. It’s possible to bump into this limit fairly quickly, but it’s a limit that serves the small/medium business space rather well.

It’s surprising (in a good way) that John recommends Standard Edition, at least for small and medium businesses.

Bike Rental Demand Estimation

Kevin Feasel

2016-05-11

R

The Revolution Analytics blog has a Microsoft-driven article on estimating bike rental demand with Microsoft R Server:

In addition to the original features in the raw data, we add the number of bikes rented in each of the previous 12 hours as features to provide better predictive power. We create a computeLagFeatures() helper function to compute the 12 lag features and use it as the transformation function in rxDataStep().

Note that rxDataStep() processes data chunk by chunk and lag feature computation requires data from previous rows. In computeLagFeatures(), we use the internal function .rxSet() to save the last n rows of a chunk to a variable lagData. When processing the next chunk, we use another internal function .rxGet() to retrieve the values stored in lagData and compute the lag features.

This is a great article for anybody wanting to dig into analytics, because they show their work.
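
The chunk-by-chunk problem described in the excerpt can be sketched outside of R. What follows is a plain Python analogue, not the article's code and not the RevoScaleR API: the function add_lag_features, the carry list (standing in for what the article stores in lagData), and the rental counts are all assumptions made for illustration. The example uses 2 lags instead of 12 to keep it short.

```python
def add_lag_features(chunk, carry, n_lags=12):
    """Attach the previous `n_lags` values to each value in `chunk`.

    `carry` holds the trailing values from earlier chunks, so lags for
    the first rows of this chunk can reach back across the boundary.
    Returns the rows plus the updated carry for the next chunk.
    """
    history = list(carry)
    rows = []
    for count in chunk:
        # lags[0] is one step back, lags[1] is two steps back, and so on.
        lags = [
            history[-k] if k <= len(history) else None
            for k in range(1, n_lags + 1)
        ]
        rows.append({"count": count, "lags": lags})
        history.append(count)
    # Keep only what the next chunk can possibly need.
    return rows, history[-n_lags:]


# Hypothetical hourly rental counts arriving in two chunks.
chunks = [[5, 7, 9], [4, 6]]
carry = []
all_rows = []
for chunk in chunks:
    rows, carry = add_lag_features(chunk, carry, n_lags=2)
    all_rows.extend(rows)
```

The first row of the second chunk picks up its lags from the saved tail of the first chunk, which is the role the saved state plays in the article's approach.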
