Press "Enter" to skip to content

Curated SQL Posts

Task-Based Effectiveness of Visualizations

Adrian Colyer summarizes an interesting IEEE paper:

So far this week we’ve seen how to create all sorts of fantastic interactive visualisations, and taken a look at what data analysts actually do when they do ‘exploratory data analysis.’

To round off the week today’s choice is a recent paper on an age-old topic: what visualisation should I use?

No prizes for guessing “it depends!”

…the effectiveness of a visualization depends on several factors including task at the hand, and data attributes and datasets visualized.

Is this the paper to finally settle the age-old debate surrounding pie-charts??

The results were very interesting, though as an official Pie Chart Hater, I would point out that in none of their results was a pie chart ever better than a bar/column chart. There are cases where it works out okay, but if it’s never better and often worse than something, I’d rather use the alternative.

Comments closed

Database Restoration and the Plan Cache

Andy Mallon has some tests for us:

If you restore a database, what does that do to the plan cache? Well, let’s start by looking at the documentation for RESTORE. (Emphasis mine)

Restoring a database clears the plan cache for the instance of SQL Server. Clearing the plan cache causes a recompilation of all subsequent execution plans and can cause a sudden, temporary decrease in query performance. For each cleared cachestore in the plan cache, the SQL Server error log contains the following informational message: ” SQL Server has encountered %d occurrence(s) of cachestore flush for the ‘%s’ cachestore (part of plan cache) due to some database maintenance or reconfigure operations”. This message is logged every five minutes as long as the cache is flushed within that time interval.

Yikes. That first sentence sounds like it is going to clear the cache for the entire instance.

Read on as Andy tests this and (spoiler alert) changes the documentation.

Comments closed

Ordering in Cosmos DB Queries

Hasan Savran shows how you can order data in Cosmos DB queries:

If you need to use multiple properties in your ORDER BY then you need to define COMPOSITE INDEXES.For example when I try to run the following query and try to order the objects by CreatedOn and Score, I end up with an error because I do not have a COMPOSITE INDEX to use with this ORDER BY.

Many parts of Cosmos DB’s SQL syntax are similar to T-SQL, but some of the underlying assumptions—such as, what you need to order data—are quite different.

Comments closed

Auto-Deleting SQL Agent Jobs

Dave Bland takes us through SQL Agent job auto-deletion:

Have you ever looked at something in SQL Server and wonder why it is there?  That is what I think when I see this option in the SQL Server Agent job properties.  I can not come up with any good reason of why you would want a job to delete itself upon completion.  I even did a Google search and really didn’t find a good reason.  However, if you know of a great reason of why you would want to enable this, I would love to hear about it.

I’ve used it in the past for scheduling ephemeral work, particularly when I didn’t have the ability to control operations otherwise. For example, I need to perform a time-consuming one-time update on data, but I don’t want to tie it to a script on my machine because I wanted to go home that night. Creating a job which auto-deletes upon success lets me schedule it for when I want it to run, kick off the script, and not leave a mess behind in the SQL Agent jobs list. It’s a case where I don’t really care about history and checking the box gives me a quick indicator of success: if the job’s gone in the morning, my work here is done; if not, I need to begin troubleshooting.

3 Comments

Deploying a Container Instance in Azure

Anibal Kolker takes us through container deployment in Azure:

As derived from the title, the objective of this post is to help you deploy a container instance inside Azure.

However, we’ll extend the typical scenario and make a slightly more extensive use of networking capabilities, by placing the container group inside a private subnet.

Note: For this example, and for simplicity only, we’ll use NGINX as our container of choice. Of course, you’re welcome to try with any other image.

There are a few pieces in play, but Anibal does a good job putting it all together.

Comments closed

A New Notebook Tool: Polynote

Jeremy Smith, et al, announce a new product:

We are pleased to announce the open-source launch of Polynote: a new, polyglot notebook with first-class Scala support, Apache Spark integration, multi-language interoperability including Scala, Python, and SQL, as-you-type autocomplete, and more.

Polynote provides data scientists and machine learning researchers with a notebook environment that allows them the freedom to seamlessly integrate our JVM-based ML platform — which makes heavy use of Scala — with the Python ecosystem’s popular machine learning and visualization libraries. It has seen substantial adoption among Netflix’s personalization and recommendation teams, and it is now being integrated with the rest of our research platform.

There are some nice pieces to it, especially around language interop.

Comments closed

Head-to-Head Comparisons with Power BI

Rob Collie walks us through building a visual which provides head-to-head comparison using Power BI:

Yes, I know that NONE of the infographics above is a scientifically “good” comparison tool.  Too noisy, too flashy, not clean…  but every now and then you DO need to cater to your audience.  Engagement is the first step in the comprehension funnel, and in this particular example, yep, I’m trying to capture the eyeballs of an audience that likes this sort of thing.  The style of #4 is a decent compromise in this case.  Know your audience.

Rob takes us through an interesting journey. I don’t think I’d want to use that style too often, but to be fair, Rob talks about that in the snippet I clipped.

Comments closed