Press "Enter" to skip to content

Author: Kevin Feasel

New Docker Tag For SQL Server 2017

Hamish Watson notes that there are new Docker tags for SQL Server 2017:

I have been using SQL Server 2017 running on Linux for a while now (blog post pending) and use the official images from:

https://hub.docker.com/r/microsoft/mssql-server-linux/

To get the latest I used to run

docker pull microsoft/mssql-server-linux:latest

However today I noticed that the :latest tag had been removed:

Click through to see the tag you probably want to use.

Comments closed

Using XPath To Shred HTML

Shannon Lowder shows off the HTML Agility Path project to help him parse the contents of webpages:

Let’s say we wanted the table.  We could use the XPath /html/body/table to retrieve it. We can also use XPath to refer to a collection.  Let’s say we wanted all the rows. We would use the XPath /html/body/table/tr. We would get a collection of three rows.  Notice the XPath looks a lot like a Linux or windows folder path.  That’s the idea of XPath!

I would like to point out a couple of extra points.  First, XPath is case sensitive.  So if I had tried to use /html/body/table/TR, I would find no nodes.

Second, you can use “short hand” in your XPath queries.  //body/table/tr would get you to the same place /html/body/table/tr did.

This intro is part of a series Shannon has started on scraping data from websites.

Comments closed

Online Dashboard Taxonomy

Tim Bock has a lot of examples of online dashboards:

The classic dashboards are designed to report key performance indicators (KPIs). Think of the dashboard of a car or the cockpit of an airplane. The KPI dashboard is all about dials and numbers. Typically, these dashboards are live and show the latest numbers. In a business context, they typically show trend data as well.

A very simple example of a KPI Dashboard is below. Such dashboards can, of course, be huge. Huge dashboards have lots of pages crammed with numbers and charts, looking at all manner of operational and strategic data.

The single most important question I think you can ask about dashboards is, what does the intended audience need to see (and do, once they’ve seen)?  That will drive the kind of dashboard elements you want to use.  If you need people to react and perform some maintenance operation, you probably want a KPI chart.  If you want to influence readers’ opinions, infographic elements might be the trick.

Comments closed

Custom Snippets In RStudio

Mara Averick shows how to create a custom code snippet in RStudio:

The RStudio Support Code Snippets post is a great step-by-step for adding snippets of your own. The gist of it for a markdown snippet is as follows:

  1. Open RStudio Preferences

  2. Go to the Code section

  3. Click the Edit Snippets button3

  4. Select Markdown

  5. Add your snippet and Save

Click through for a demo of how to embed tweets into your RMarkdown documentation.

Comments closed

Timeline Storyteller Custom Visual

Devin Knight continues his Power BI custom visuals series:

In this module you will learn how to use the Timeline Storyteller.  The Timeline Storyteller is a great way to tell a story about your data. It gives you the ability to create multiple representations of your data and then pull them together by creating multiple scenes.

This is a flashy visual and I think Devin’s set is an excellent example of where you might want to use it.

Comments closed

Limiting Docker Container Resources

Andrew Pruski shows how to cap the resources available to a container:

What I’ve done here is use the cpus and memory switches to limit that container to a maximum of 2 CPUs and 2GB of RAM. There are other options available, more info is available here.

Simple, eh? But it does show something interesting.

I’m running Docker on my Windows 10 machine, using Linux containers. The way this works is by spinning up a Hyper-V Linux VM to run the containers (you can read more about this here).

Read on to learn more.

Comments closed

Finding Adaptive Join Inefficiencies

Joe Obbish walks us through a scenario with adaptive joins in SQL Server 2017:

The estimated costs for the two queries are very close to each other: 74.6842 and 74.6839 optimizer units. However, we saw earlier that the tipping point for an adaptive join on this query can vary between 22680 and 80388.3 rows. This inconsistency means that we can find a query that performs worse with adaptive joins enabled.

Click through to see the queries Joe is using.  Based on this, I’d guess that this is probably a knife-edge problem:  most of the time, adaptive join processing is better, but if you hit the wrong query, it’s worse.

Comments closed

Running Out Of Ints

Paul Randal explains the unlikelihood that you’d run out of bigints in a table:

So with 1 million rows per second, you’ll be generating 1 million x 3,600 (seconds in an hour) x 24 (hours in a day) = 86.4 billion rows per day, so you’ll need about 1.4 terabytes of new storage per day. If you’re using the bigint identity as a cluster key, each row needs new space, so you’ll need almost exactly 0.5 petabytes of new storage every year.

At that rate, actually running out of bigint values AND storing them would take roughly 150 thousand petabytes. This is clearly impractical – especially when you consider that storing *just* a bigint is pretty pointless – you’d be storing a bigint and some other data too – probably doubling the storage necessary, at least.

By contrast, if you have a staging table that flows 10 million rows a day (meaning 10 million leave and 10 million new ones enter), you’ll overflow an int column in less than a year.  It’s worth thinking about data sizes before deciding on the type of a surrogate key.  Bigint is the safest, and if you think you’ll need it, go with it.  But there is that storage overhead.

Comments closed

Variables In DAX

Matt Allington shows us how to use variables in DAX:

Variables in DAX is a relatively new feature and is available in

  • Power BI Desktop
  • Excel 2016
  • SSAS Tabular 2016

Variables are not available in Excel 2013 or Excel 2010.

Click through to see how to assign and use variables.  It’s interesting to see how they’re local to a measure, so at this point at least, you can’t share variables between measures.  Given what DAX is supposed to be, that’s probably the right choice.

Comments closed

What’s New In Analysis Services

Christian Wade explains what’s new in SQL Server Analysis Services 2017:

SSAS 2017 introduces the 1400 compatibility level. Here are just some highlights of the new features:

  • New infrastructure for data connectivity and ingestion into tabular models with support for TOM APIs and TMSL scripting. This enables support for a range of additional data sources, and data transformation and mashup capabilities.

  • Support for BI tools such as Microsoft Excel enables drill-down to detailed data from an aggregated report. For example, when end-users view total sales for a region and month, they can view the associated order details.

  • Object-level security to secure table and column names in addition to the data within them.

  • Enhanced support for ragged hierarchies such as organizational charts and chart of accounts.

  • Various other improvements for performance, monitoring, and consistency with the Power BI modeling experience.

There’s plenty more where that came from (unless you’re a Multidimensional fan…), so click through for the details.

Comments closed