Press "Enter" to skip to content

Author: Kevin Feasel

Playing Blackjack With Power BI

Phillip Seamark goes and creates a blackjack game in Power BI:

The last of the three data-tables in the model used to control Player 1 is the ‘P1 Turn’ data-table.  This is simply a 10-row, single-column table with the numbers 1 through 10.  Its purpose is to work in conjunction with a slicer that keeps track of which turn Player 1 is on.  A series of five bookmarks snapshots this slicer in five different states, each with a different value selected, which is how the game tracks progress.

A [P1 Card Filter] calculated measure tracks the value selected in that slicer and is used as a filter on the table visual that reveals cards for Player 1.

It’s more a toy solution than an actual game, but it’s interesting to see.

Comments closed

Reporting Services Templates In Visual Studio 2017

Greg Low points out an oddity in where Reporting Services templates are located in the Visual Studio 2017 tools:

Templates are just report files (i.e., RDL files), but placed in a special folder on your system.

If you’re starting with a blank page to create a report, you’re doing it wrong.

But where do these templates come from? In the Visual Studio 2015 version of the tools, they were here:

C:\Program Files (x86)\Microsoft Visual Studio 14.0\
Common7\IDE\PrivateAssemblies\ProjectItems\ReportProject

I found after installing the VS2017 version of the tools, though, that they weren’t there any more.

Greg has found them in two separate locations, so read on to learn more.
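
If you’d rather search your own machine than guess at paths, a quick filesystem scan will turn the folder up.  This is a rough Python sketch of my own, not from Greg’s post; the root paths below are assumptions for typical installs and may differ by edition.

```python
# Sketch only: hunt for SSRS template folders (RDL files live in folders named
# ReportProject). The roots below are assumed defaults; adjust for your machine.
import glob

roots = [
    r"C:\Program Files (x86)\Microsoft Visual Studio 14.0",
    r"C:\Program Files (x86)\Microsoft Visual Studio\2017",
]

for root in roots:
    # Any ReportProject folder under the install tree is a candidate location.
    for path in glob.glob(root + r"\**\ReportProject", recursive=True):
        print(path)
```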

Comments closed

Collecting Plan Metrics With Trace Flag 7412

Grant Fritchey shows a lightweight way of capturing plan metrics:

I place a lot of emphasis on capturing actual execution plans because of the runtime metrics, but with Trace Flag 7412, we don’t need the plan. This is great news, because capturing execution plans, even using extended events, is an expensive proposition. However, using either the query_thread_profile event or Trace Flag 7412, we can get the runtime metrics without the plan.

Very interesting.  But as Grant points out, this is not a cost-free operation, so use as needed but not willy-nilly.
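
If you want to poke at these metrics yourself, here is a minimal sketch (mine, not Grant’s) that turns the trace flag on globally and samples per-operator runtime metrics from sys.dm_exec_query_profiles.  It assumes pyodbc, a trusted connection, and permission to run DBCC TRACEON.

```python
# Sketch: enable lightweight profiling via Trace Flag 7412, then sample runtime
# metrics for currently executing queries. Connection details are placeholders.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;Trusted_Connection=yes",
    autocommit=True,
)
cur = conn.cursor()

# Turn the trace flag on globally (-1). Remember this is not entirely free.
cur.execute("DBCC TRACEON (7412, -1);")

# Per-operator runtime metrics for queries running right now.
cur.execute("""
    SELECT session_id, node_id, physical_operator_name,
           row_count, estimate_row_count, elapsed_time_ms, cpu_time_ms
    FROM sys.dm_exec_query_profiles
    ORDER BY session_id, node_id;
""")
for row in cur.fetchall():
    print(row)
```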

Comments closed

Understanding Binary Trees

Robert Maclean has a couple of posts on binary trees.  In the first post, he explains the basics of a binary tree:

As a binary tree has some flexibility in it, a number of classifications have come about to provide a consistent way to discuss binary trees. Common classifications are:

  • Full binary tree: Each node in a binary tree can have zero, one, or two child nodes. In a full binary tree, each node can have only zero or two child nodes.
  • Perfect binary tree: This is a full binary tree with the additional condition that all leaf nodes (i.e. nodes with no children) are at the same level/depth.
  • Complete binary tree: The complete binary tree is where each leaf node is as far left as possible.
  • Balanced binary tree: A balanced binary tree is a tree where the height of the tree is as small as possible.

Then, he looks at binary search trees:

So, why should we care about a BST? We should care because searching is really performant in it: each time you move down a level, you eliminate approximately 50% of the potential nodes.

So, for example, if we wanted to find the item in our example with the key 66, we could start at the root (50) and move right. At that point, we have eliminated 8 possible nodes immediately. The next step is to the left from the node with the value 70 (total possible nodes removed: 12). Next is to the right of the node with the value of 65, and then to 66, to the left of 67. So we found the node in 5 steps.

Going to Big O notation, this means we achieved a performance of close to O(log n). It is possible to have a worst case of O(n) when the tree is not optimal (i.e. unbalanced).

Binary search trees are an easy-to-understand, reasonably efficient model for searching for data.  Even when there are better options, this is an easy algorithm to implement and can often be good enough to solve the problem.
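
To make the search procedure concrete, here is a minimal Python sketch (my own illustration, not Robert’s code): go left when the key is smaller, right when it is larger.  The tree below is not the full example from the post, just enough nodes to reproduce the 50 → 70 → 65 → 67 → 66 path.

```python
# Minimal binary search tree node and lookup, for illustration only.
class Node:
    def __init__(self, key, left=None, right=None):
        self.key = key
        self.left = left
        self.right = right

def search(node, key):
    """Return the node holding key, or None. Roughly O(log n) when the tree is
    balanced, degrading to O(n) for an unbalanced (list-like) tree."""
    while node is not None:
        if key == node.key:
            return node
        # Smaller keys live in the left subtree, larger keys in the right.
        node = node.left if key < node.key else node.right
    return None

# 50 at the root; 66 is reachable via 70 -> 65 -> 67 -> 66.
root = Node(50,
            left=Node(30),
            right=Node(70,
                       left=Node(65, right=Node(67, left=Node(66))),
                       right=Node(80)))

print(search(root, 66).key)  # 66
```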

Comments closed

Pareto Efficiency And Mario Kart

The folks at Civis Analytics answer one of the more important questions in life:

Mario Kart was a staple of my childhood — my friends and I would spend hours after school as Mario, Luigi, and other characters from the Nintendo universe racing around cartoonish tracks and lobbing pixelated bananas at each other. One thing that always vexed our little group of would-be speedsters was the question of which character was best. Some people swore by zippy Yoshi, others argued that big, heavy Bowser was the best option. Back then there were only eight options to choose from; fast forward to the current iteration of the Mario Kart franchise and the question is even more complicated because you can select different karts and tires to go with your character. My Mario Kart reflexes aren’t what they used to be, but I am better at data science than I was as a fourth grader, so in this post I’ll use data to finally answer the question “Who is the best character in Mario Kart?”

This post also acts as a primer on Pareto Efficiency, an important concept in economics.
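
As a quick illustration of the concept (not the analysis from the post): a configuration is Pareto-efficient when no other configuration beats it on one stat without losing on another.  Here is a small Python sketch with invented speed/acceleration numbers that picks out the Pareto frontier.

```python
# Toy Pareto frontier: a configuration is dominated if some other configuration
# is at least as good on every stat and strictly better on at least one.
# The stats below are made up for illustration, not real Mario Kart data.
configs = {
    "Heavy build":  {"speed": 4.5, "acceleration": 2.0},
    "Light build":  {"speed": 2.5, "acceleration": 4.5},
    "Middleweight": {"speed": 3.5, "acceleration": 3.5},
    "Bad build":    {"speed": 2.0, "acceleration": 2.0},  # dominated by everything
}

def dominates(a, b):
    """True if config a is at least as good as b everywhere and better somewhere."""
    return all(a[k] >= b[k] for k in a) and any(a[k] > b[k] for k in a)

frontier = [
    name for name, stats in configs.items()
    if not any(dominates(other, stats)
               for other_name, other in configs.items() if other_name != name)
]
print(frontier)  # ['Heavy build', 'Light build', 'Middleweight']
```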

Comments closed

Flattening JSON Data With Databricks

Ivan Vazharov gives us a Databricks notebook to parse and flatten JSON using PySpark:

With Databricks you get:

  • An easy way to infer the JSON schema and avoid creating it manually
  • Subtle changes in the JSON schema won’t break things
  • The ability to explode nested lists into rows in a very easy way (see the Notebook below)
  • Speed!

Following is an example Databricks Notebook (Python) demonstrating the above claims. The JSON sample consists of an imaginary JSON result set, which contains a list of car models within a list of car vendors within a list of people. We want to flatten this result into a dataframe.

Click through for the notebook.
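
If you just want the general shape of the technique before opening the notebook, here is a small PySpark sketch of my own (not Ivan’s notebook): infer the schema of nested JSON and explode nested arrays into rows.  The file path and field names (name, cars, vendor, models) are invented placeholders.

```python
# Rough sketch: infer a nested JSON schema and flatten nested arrays into rows.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col, explode

spark = SparkSession.builder.appName("flatten-json").getOrCreate()

# The schema is inferred automatically; multiLine handles a pretty-printed document.
df = spark.read.option("multiLine", True).json("/path/to/sample.json")

# Each explode() turns one element of a nested array into its own row.
flat = (
    df.select(col("name"), explode(col("cars")).alias("car"))
      .select(col("name"),
              col("car.vendor").alias("vendor"),
              explode(col("car.models")).alias("model"))
)
flat.show()
```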

Comments closed

Microsoft R Open 3.5.0 Released

David Smith announces that Microsoft R Open 3.5.0 is now available:

Microsoft R Open 3.5.0 is now available for download for Windows, Mac and Linux. This update includes the open-source R 3.5.0 engine, which is a major update with many new capabilities and improvements to R. In particular, it includes a major new framework for handling data in R, with some major behind-the-scenes performance and memory-use benefits (and with further improvements expected in the future).

Microsoft R Open 3.5.0 points to a fixed CRAN snapshot taken on June 1, 2018. This provides a reproducible experience when installing CRAN packages by default, but you can always change the default CRAN repository or use the built-in checkpoint package to access snapshots of packages from an earlier or later date.

It’s nice to see Microsoft keeping pace with R changes; they look like they’re averaging about 6-8 weeks from an R point release to an MRO release.

Comments closed

Smarter Indexes Based On Column Cardinality

Eric Blinn has a function which organizes columns in the missing index DMV by cardinality:

Bryan Rebok and Brent Ozar recently opened my eyes to something I didn’t know.  When SQL Server recommends missing indexes to you, it puts the columns in the order in which they are found in the table.  That’s it.  I always thought there was more logic to it.  But there isn’t.  Upon reading this I had a terrible realization that I’ve made a lot of awful indexes in my time.  I owe the world an apology.  I hope this post can serve as that apology.

I’ve written a function that accepts the equality column list from dm_db_missing_index_details as a parameter and spits those columns back out in order of their cardinality.  This won’t necessarily be the proper order for the columns in every index, but it is far more likely to be correct than the initial result from the DMV.

I’m amazed that the missing index DMV generates column names in such a simplistic manner.
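
Eric’s function is T-SQL, but just to show the idea: the sketch below (Python with pyodbc, not Eric’s code) pulls the equality column lists from the DMV and reorders each one by the number of distinct values in the column, most selective first, which is one reasonable convention for equality keys.  It is illustrative only; ad-hoc COUNT(DISTINCT) queries can be expensive on large production tables.

```python
# Sketch: reorder missing-index equality columns by cardinality (distinct count).
# The DMV returns fully qualified table names and bracketed, comma-separated columns.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=localhost;Trusted_Connection=yes"
)
cur = conn.cursor()

cur.execute("""
    SELECT statement, equality_columns
    FROM sys.dm_db_missing_index_details
    WHERE equality_columns IS NOT NULL;
""")

for table_name, equality_columns in cur.fetchall():
    cols = [c.strip() for c in equality_columns.split(",")]
    # Count distinct values per column; this can be slow on big tables.
    cardinality = {}
    for c in cols:
        cur.execute(f"SELECT COUNT(DISTINCT {c}) FROM {table_name};")
        cardinality[c] = cur.fetchone()[0]
    ordered = sorted(cols, key=cardinality.get, reverse=True)
    print(f"{table_name}: {', '.join(ordered)}")
```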

Comments closed

Figuring Out Azure Analysis Services Costs

Chris Webb explains that Azure Analysis Services might not be quite as expensive as you’d first think:

What does this mean for the cost of Azure Analysis Services? Basically, if you’re taking advantage of these features you won’t pay one of the monthly prices quoted on the pricing page linked to at the top of this post. Instead you may do things like:

  • Scale up for one hour every day when you need to process your SSAS database, just to get the extra memory and QPUs needed, then scale down when processing has finished
  • Scale out only on certain days, or certain times of day, to handle increased numbers of users
  • Pause your instance when you are sure that no-one needs to run queries

How do you then calculate the likely cost? For my Azure Analysis Services precon at SQLBits a few months ago I built an Excel workbook that shows how to go about this.

There are some good questions in the comments section, so check those out as well.
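
The arithmetic behind the scale-up/scale-down savings is easy to sketch, too.  The hourly rates below are placeholders (check the current pricing page for real numbers), but the structure is the same kind of calculation Chris’s workbook does in more detail: pay the higher tier only for the hours you actually run it.

```python
# Back-of-the-envelope cost model with placeholder hourly rates (not real prices).
HOURS_PER_MONTH = 730

base_rate = 3.00    # hypothetical $/hour for the tier you run most of the time
peak_rate = 12.00   # hypothetical $/hour for the larger tier used during processing

# Scenario: scale up for 1 hour/day for processing, pause 8 hours/night.
processing_hours = 1 * 30
paused_hours = 8 * 30
normal_hours = HOURS_PER_MONTH - processing_hours - paused_hours

flat_cost = peak_rate * HOURS_PER_MONTH                              # always-on at the big tier
mixed_cost = base_rate * normal_hours + peak_rate * processing_hours # scale up/down + pause

print(f"Always-on large tier:  ${flat_cost:,.2f}/month")
print(f"Scale up/down + pause: ${mixed_cost:,.2f}/month")
```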

Comments closed