Author: Kevin Feasel

Re-Introducing rquery

John Mount has a new introduction to rquery:

rquery is a data wrangling system designed to express complex data manipulation as a series of simple data transforms. This is in the spirit of R’s base::transform(), or dplyr’s dplyr::mutate() and uses a pipe in the style popularized in R with magrittr. The operators themselves follow the selections in Codd’s relational algebra, with the addition of the traditional SQL “window functions.” More on the background and context of rquery can be found here.

The R/rquery version of this introduction is here, and the Python/data_algebra version of this introduction is here.
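To make the "series of simple data transforms" idea concrete, here is a minimal sketch in plain Python. It does not use the rquery or data_algebra API; the helper names (`extend`, `select_rows`, `select_columns`, `pipeline`) and the sample `orders` data are made up for illustration, loosely echoing relational-algebra operators.

```python
from functools import reduce

# Hypothetical helpers: each returns a small transform over a list of row dicts.
def extend(**new_columns):
    """Add columns computed from each row (like a relational 'extend')."""
    def step(rows):
        return [
            {**row, **{name: fn(row) for name, fn in new_columns.items()}}
            for row in rows
        ]
    return step

def select_rows(predicate):
    """Keep only the rows for which the predicate is true."""
    def step(rows):
        return [row for row in rows if predicate(row)]
    return step

def select_columns(*names):
    """Keep only the named columns."""
    def step(rows):
        return [{name: row[name] for name in names} for row in rows]
    return step

def pipeline(*steps):
    """Compose transforms left to right, the way a pipe operator would."""
    def run(rows):
        return reduce(lambda acc, step: step(acc), steps, rows)
    return run

# Illustrative data, not taken from the linked article.
orders = [
    {"customer": "a", "qty": 2, "price": 5.0},
    {"customer": "b", "qty": 1, "price": 20.0},
    {"customer": "a", "qty": 4, "price": 2.5},
]

ops = pipeline(
    extend(total=lambda r: r["qty"] * r["price"]),
    select_rows(lambda r: r["total"] > 10.0),
    select_columns("customer", "total"),
)
result = ops(orders)
```

Each step is trivial on its own; the complexity lives in the composition, and the input rows are never modified in place.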

Check it out.

Comments closed

What’s New with Standard Edition

Niko Neugebauer is jazzed about SQL Server 2019 Standard Edition:

The documentation on Editions & Features support in SQL Server 2019 has been released, and there is huge news that will get a good number of Standard Edition users excited.

The thing that should make every single user of SQL Server 2019 Standard Edition jump is the presence of TDE (Transparent Data Encryption), the feature that

– every single responsible company would love to be able to use to secure their data
– every single DBA, Developer & IT Professional kept asking Microsoft to include for so many years
– was already there, since Azure SQL Database has had it by default for a couple of years, even for the basic edition

I am beyond happy with this news. Grateful that Microsoft has listened to the voice of common sense and made a security feature available to paying customers, making it a little easier for them to comply with such demanding norms as GDPR.

Aside from that, Niko looks at several new features which will be available in Standard Edition, as well as a few Enterprise-only features.

Emergency Mode in SQL Server

Paul Randal answers a reader question:

I had a blog comment question a few days ago that asked why emergency-mode repair requires the database to be in EMERGENCY mode as well as SINGLE_USER mode.

All repair operations that DBCC CHECKDB (and related commands) performs require the database to be in single-user mode so there’s a guarantee that nothing can be changing while the checks and repairs are done. But that doesn’t change the behavior of what repair does – that needs emergency mode too.
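For orientation, the order of operations described above can be sketched as a small Python helper that only builds the T-SQL text. The helper names are hypothetical, the database name is a placeholder, and the final `MULTI_USER` step is my assumption about typical cleanup rather than something stated in the post. Treat this as an illustration of the sequence, not a recovery procedure to run as-is.

```python
def quote_identifier(name):
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"

def quote_literal(value):
    """Quote a value as an N'...' string literal, doubling single quotes."""
    return "N'" + value.replace("'", "''") + "'"

def emergency_repair_statements(database):
    """Return the T-SQL statements for an emergency-mode repair, in order.

    This only builds strings; nothing is sent to a server.
    """
    ident = quote_identifier(database)
    return [
        # Repair of this kind needs EMERGENCY mode...
        f"ALTER DATABASE {ident} SET EMERGENCY;",
        # ...and, like every repair, SINGLE_USER so nothing else can change data.
        f"ALTER DATABASE {ident} SET SINGLE_USER;",
        f"DBCC CHECKDB ({quote_literal(database)}, REPAIR_ALLOW_DATA_LOSS) "
        "WITH NO_INFOMSGS, ALL_ERRORMSGS;",
        # Assumed cleanup step: let other users back in afterwards.
        f"ALTER DATABASE {ident} SET MULTI_USER;",
    ]

# "DemoDb" is a placeholder name.
statements = emergency_repair_statements("DemoDb")
```

As the option name says, `REPAIR_ALLOW_DATA_LOSS` can discard data, so this is a last resort when no usable backup exists.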

Read on for an explanation of what emergency mode is and why we need it to run CHECKDB repair operations.

skip-2.0 and SQL Server Security

K. Brian Kelley has the lowdown on skip-2.0:

Problem
I’ve read recently that there’s a new piece of malware that’s been named skip-2.0 and it targets SQL Server. What exactly is it, where did it come from, and how do I protect myself against it?

Solution
This new piece of malware, skip-2.0, does target SQL Server. Specifically, it targets SQL Server versions 11 and 12, which correspond to the names SQL Server 2012 and SQL Server 2014 respectively. Therefore, if you’re only running SQL Server 2016 or higher, you’re not affected by skip-2.0 (yet another reason to upgrade).
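The version mapping above is easy to turn into a quick check. This is a minimal sketch, assuming you already have the instance's version string (for example, the value returned by `SERVERPROPERTY('ProductVersion')`); the function and constant names are made up for illustration, and the sample version strings are only examples of the format.

```python
# Major versions named in the article as skip-2.0 targets.
AFFECTED_MAJOR_VERSIONS = {
    11: "SQL Server 2012",
    12: "SQL Server 2014",
}

def affected_release(product_version):
    """Return the affected release name for a version string, or None.

    product_version is a dotted string such as "12.0.2000.8".
    """
    major = int(product_version.split(".")[0])
    return AFFECTED_MAJOR_VERSIONS.get(major)

# Example version strings; only the leading major number matters here.
checks = {v: affected_release(v) for v in ["11.0.2100.60", "12.0.2000.8", "13.0.1601.5"]}
```

A `None` result only means the major version is outside the two named in the article; it says nothing about other threats.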

Read on to learn how it works and how you can protect against it.

Speeding Up Excel Pivot Table Performance

Chris Webb shows how you can improve performance of Excel pivot tables hitting Analysis Services Multidimensional models:

Back in 2016 I wrote the following blog post about changes to the way Excel 365 generated MDX queries for PivotTables connected to Analysis Services, Power Pivot/the Excel Data Model and Power BI datasets:

https://blog.crossjoin.co.uk/2016/07/08/excel-2016-pivottable-mdx-changes-lead-to-big-query-performance-gains/

I know it sounds boring and not something you need to worry about but trust me, this is important – these changes solved the vast majority of Excel PivotTable performance problems that I encountered when I was a consultant so you should read the above post before continuing.

Unfortunately, earlier this year these changes had to be partially rolled back because in some rare cases the queries generated returned incorrect results; this means that you may find that values for subtotals and grand totals are again being returned even when they aren’t being displayed. The good news is that you should still be able to get the improved performance with a few minor tweaks.

Read on to see what those tweaks are.

Against Premature Re-Architecture

Cyndi Johnson has a good rant:

One of my biggest pet peeves in software development is the compulsion that so many developers have to rip up the foundation and completely build something over again, pretty much from scratch.

I’ve been that developer plenty of times. It’s easy to walk in, see that there are some problems, and want to raze everything. Sometimes that’s a reasonable answer, but every apparent mismatch or hack was put in to solve a particular business rule, many of which are lost to the mists of time. Burning down and starting over loses a lot of that information, so rebuilding is something you do with caution.

On Quantum Supremacy

John Cook has some thoughts on Google’s quantum supremacy announcement:

Google announced today that it has demonstrated “quantum supremacy,” i.e. that they have solved a problem on a quantum computer that could not be solved on a classical computer. Google says

Our machine performed the target computation in 200 seconds, and from measurements in our experiment we determined that it would take the world’s fastest supercomputer 10,000 years to produce a similar output.

IBM disputes this claim. They don’t dispute that Google has computed something with a quantum computer that would take a lot of conventional computing power, only that it “would take the world’s fastest supercomputer 10,000 years” to solve. IBM says it would take 2.5 days.
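To put the two estimates side by side, here is the arithmetic using only the figures quoted above. The one assumption is a 365-day year for converting Google's estimate to seconds; the variable names are mine.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted above.
quantum_seconds = 200
google_classical_seconds = 10_000 * 365 * SECONDS_PER_DAY  # assumes a 365-day year
ibm_classical_seconds = 2.5 * SECONDS_PER_DAY

# How many times longer the classical run would take than the quantum run.
google_ratio = google_classical_seconds / quantum_seconds
ibm_ratio = ibm_classical_seconds / quantum_seconds

# How far apart the two classical estimates are.
estimate_gap = google_classical_seconds / ibm_classical_seconds

print(f"Google's estimate: {google_ratio:,.0f}x slower than the quantum run")
print(f"IBM's estimate:    {ibm_ratio:,.0f}x slower than the quantum run")
print(f"The estimates differ by a factor of {estimate_gap:,.0f}")
```

Even on IBM's numbers the quantum run is about a thousand times faster; the dispute is over whether the gap is three orders of magnitude or nine.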

If you want to jump the gun but also stay on the Microsoft stack, the Q# programming language is open-source and you can run a simulator on your machine. Manning also has a Q# book in the works.

Task-Based Effectiveness of Visualizations

Adrian Colyer summarizes an interesting IEEE paper:

So far this week we’ve seen how to create all sorts of fantastic interactive visualisations, and taken a look at what data analysts actually do when they do ‘exploratory data analysis.’

To round off the week today’s choice is a recent paper on an age-old topic: what visualisation should I use?

No prizes for guessing “it depends!”

…the effectiveness of a visualization depends on several factors including task at the hand, and data attributes and datasets visualized.

Is this the paper to finally settle the age-old debate surrounding pie-charts??

The results were very interesting, though as an official Pie Chart Hater, I would point out that in none of their results was a pie chart ever better than a bar/column chart. There are cases where it works out okay, but if it’s never better and often worse than something, I’d rather use the alternative.

Database Restoration and the Plan Cache

Andy Mallon has some tests for us:

If you restore a database, what does that do to the plan cache? Well, let’s start by looking at the documentation for RESTORE. (Emphasis mine)

Restoring a database clears the plan cache for the instance of SQL Server. Clearing the plan cache causes a recompilation of all subsequent execution plans and can cause a sudden, temporary decrease in query performance. For each cleared cachestore in the plan cache, the SQL Server error log contains the following informational message: “SQL Server has encountered %d occurrence(s) of cachestore flush for the ‘%s’ cachestore (part of plan cache) due to some database maintenance or reconfigure operations.” This message is logged every five minutes as long as the cache is flushed within that time interval.

Yikes. That first sentence sounds like it is going to clear the cache for the entire instance.

Read on as Andy tests this and (spoiler alert) changes the documentation.
