Press "Enter" to skip to content

Curated SQL Posts

Power BI Free Is The Problem

Matt Allington shares his thoughts on the recent Power BI licensing changes:

I think the existence of the Power BI Free product has been the root of the problem here.  The fact that you could do so much for free (including some sharing) really muddied the waters and has taken the focus away from acknowledging that there needs to be a two tier pricing model for users (free is not a pricing tier). Microsoft is addressing one part of the problem by making it clear that Power BI Free is for personal (non sharing) use. However it has not addressed the second part of the problem being the need for a lower priced offering for users that just consume data in a way I would describe as “low involvement”. Microsoft has taken away the “proxy for a low priced sharing tier” without providing a genuine low priced replacement – this had just made the situation worse, not better and it has upset a lot of people.  Power BI Free has been a great product to “try before you buy” but unfortunately its existence prevented Microsoft from realising it was missing a price tier for 2 years!  Power BI Free for personal use (no sharing) is an incredibly generous offering from Microsoft.  It is a shame that it will need a backlash to fill the real gap – a lower priced tier.

Check out the comments as well.  I think Matt has a good point, and my guess is that the Power BI team will make it easier for small to medium sized businesses to use Power BI, but they first wanted to focus on the problem with big customers.

Comments closed

No More Sharing With Power BI Free

Ginger Grant explains an important ramification of the recent Power BI licensing changes:

Included in the recent list of announcements Microsoft made about Power BI Local and Power BI Premium are a series of changes to the Power BI Free version which will go into effect on June 1. The free edition of Power BI will no longer be able to share reports. Currently free users could create reports and share them with others, which will be discontinued.  Only Power BI Pro Editions will be able to share reports.  Currently Power BI Pro users can create reports which can be shared with Free versions as long as no Pro features are used.  This means that if a Power BI report is set to automatically refresh the data, that report cannot be shared as Free versions do not have the ability to create reports which have data refreshed automatically. If the report was recreated to remove the automatic updates and instead refreshed manually, then the report could be shared with Free versions.  Starting June 1, the sharing feature will be removed. No longer can Power BI Pro users share anything to Power BI Free users.  If you have a Power BI Free account, there is no way to share information in the service. The Power BI Desktop will continue to be free but since you cannot print the content within it and sharing a PBIX file means that you will always be sharing the entire data model, this is of limited value.

Read the whole thing.

Comments closed

UNION ALL Ordering

Paul White shows how UNION ALL concatenation has changed since SQL Server 2008 R2:

The concatenation of two or more data sets is most commonly expressed in T-SQL using the UNION ALL clause. Given that the SQL Server optimizer can often reorder things like joins and aggregates to improve performance, it is quite reasonable to expect that SQL Server would also consider reordering concatenation inputs, where this would provide an advantage. For example, the optimizer could consider the benefits of rewriting A UNION ALL B as B UNION ALL A.

In fact, the SQL Server optimizer does not do this. More precisely, there was some limited support for concatenation input reordering in SQL Server releases up to 2008 R2, but this was removed in SQL Server 2012, and has not resurfaced since.

It’s an interesting article about an edge case.

Comments closed

Azure Data Lake Tools For VS Code

Jenny Jiang announces that Azure Data Lake Tools for Visual Studio Code is now generally available:

ADLA Integration

The ADL Tools for VSCode integrate well with ADLA. Azure Data Lake includes the capabilities required to make it easy for developers, data scientists, and analysts to store data of any size, shape, and speed, and do all types of processing and analytics across platforms and languages. U-SQL on ADLA offers Job as a Service with the Microsoft invented U-SQL language. Customers do not have to manage deployment of clusters, but can simply submit their jobs to ADLA, an analytics platform managed by Microsoft.

Click through for the full announcement.

Comments closed

File Management In Containers

Andrew Pruski hows how to copy files into and out from Docker containers:

Last week I was having an issue with a SQL install within a container and to fix I needed to copy the setup log files out of the container onto the host so that I could review.

But how do you copy files out of a container?

Well, thankfully there’s the docker cp command. A really simple command that let’s you copy whatever files you need out of a running container into a specified directory on the host.

I’ll run through a quick demo but I won’t install SQL, I’ll use an existing SQL image and grab its Summary.txt file.

Read on for the demo.

Comments closed

Cardinality Estimation On Memory-Optimzied Table Variables

Jack Li explains that the cardinality estimator works the same for memory-optimized table variables as it does for regular table variables:

In a previous blog, I talked about memory optimized table consumes memory until end of the batch.   In this blog, I want to make you aware of cardinality estimate of memory optimized table as we have had customers who called in for clarifications.  By default memory optimized table variable behaves the same way as disk based table variable. It will have 1 row as an estimate.   In disk based table variable, you can control estimate by using option (recompile) at statement level (see this blog) or use trace flag 2453.

You can control the same behavior using the two approaches on memory optimized table variable if you use it in an ad hoc query or inside a regular TSQL stored procedure.  The behavior will be the same. This little repro will show the estimate is correct with option (recompile).

Jack also explains how this works for natively compiled stored procedures (spoilers:  it doesn’t), so read the whole thing.

Comments closed

Build Your Own Power BI App

Reza Rad shows how to build a Power BI app:

Till now there were about 6 methods of sharing content in Power BI, including:

I have written about some of these already (follow links above), and will write about the rest soon. Yesterday, Microsoft announced preview version of Power BI Apps, which is a new method of sharing. This is an enhancement version of two methods previously: Work spaces, and Content Packs together!

Read on for a step-by-step guide.

Comments closed

Picking An R Package Name

Marcelo Perlin has fun looking at package names in CRAN:

Looking at package names, one strategy that I commonly observe is to use small words, a verb or noun, and add the letter R to it. A good example is dplyr. Letter d stands for dataframe, ply is just a tool, and R is, well, you know. In a conventional sense, the name of this popular tool is informative and easy to remember. As always, the extremes are never good. A couple of bad examples of package naming are A3, AF, BB and so on. Googling the package name is definitely not helpful. On the other end, packagesamplesizelogisticcasecontrol provides a lot of information but it is plain unattractive!

Another strategy that I also find interesting is developers using names that, on first sight, are completely unrelated to the purpose of the package. But, there is a not so obvious link. One example is package sandwich. At first sight, I challenge anyone to figure out what it does. This is an econometric package that computes robust standard errors in a regression model. These robust estimates are also called sandwich estimators because the formula looks like a sandwich. But, you only know that if you studied a bit of econometric theory. This strategy works because it is easier to remember things that surprise us. Another great example is package janitor. I’m sure you already suspect that it has something do to with data cleaning. And you are right! The message of the name is effortless and it works! The author even got the privilege of using letter R in the name.

Marcelo uses word and character analysis to come up with his conclusions, making this a good way of seeing how to graph and slice data. h/t R-bloggers

Comments closed

Improving Performance For JSON In SQL Server

Bert Wagner shows how to use indexed, non-persisted columns to pre-parse JSON data in SQL Server:

This is basically a cheat code for indexing computed columns.

SQL will only compute the “Make” value on a row’s insert or update into the table (or during the initial index creation) — all future retrievals of our computed column will come from the pre-computed index page.

This is how SQL is able to parse indexed JSON properties so fast; instead of needing to do a table scan and parsing the JSON data for each row of our table, SQL Server can go look up the pre-parsed values in the index and return the correct data incredibly fast.

Personally, I think this makes JSON that much easier (and practical) to use in SQL Server 2016. Even though we are storing large JSON strings in our database, we can still index individual properties and return results incredibly fast.

It’s great that the database engine is smart enough to do this, but I’m not really a big fan of storing data in JSON and parsing it within SQL Server, as that violates first normal form.  If you know you’re going to use Make as an attribute and query it in SQL, make it a real attribute instead of holding multiple values in a single attribute.

Comments closed

Stored Procedures Are Code

Rob Farley hates hearing that stored procedures don’t need to go into source control:

Hearing this is one of those things that really bugs me.

And it’s not actually about stored procedures, it’s about the mindset that sits there.

I hear this sentiment in environments where there are multiple developers. Where they’re using source control for all their application code. Because, you know, they want to make sure they have a history of changes, and they want to make sure two developers don’t change the same piece of code, maybe they even want to automate builds, all those good things.

But checking out code and needing it to pass all those tests is a pain. So if there’s some logic that can be put in a stored procedure, then that logic can be maintained outside the annoying rigmarole of source control. I guess this is appealing because developers are supposed to be creative types, and should fight against the repression, fight against ‘the man’, fight against control.

When I come across this mindset, I worry a lot.

Read on for Rob’s set of worries, and hie thee to the source control repository.  It really doesn’t matter which source control product you use (ideally, the same one that developers use for their app code), just as long as it’s in source control.

Comments closed