Columnstore Indexes On Cloned Databases

Parikshit Savjani has a script to update columnstore index statistics before running DBCC CLONEDATABASE:

Unlike traditional Btree indexes, when a columnstore index is created, there is no index statistics created on the columns of the columnstore indexes. However, there is an empty stats object created with the same name as columnstore index and an entry is added to sys.stats at the time of index creation. The stats object is populated on the fly when a query is executed against the columnstore index or when executing DBCC SHOW_STATISTICS against the columnstore index, but the columnstore index statistics aren’t persisted in the storage. The index statistics is different from the auto created statistics on the individual columns of columnstore indexes which is generated on the fly and persisted in the statistics object. Since the index statistics is not persisted in storage, the clonedatabase will not contain those statistics leading to inaccurate stats and different query plans when same query has run against database clone as opposed to production database.

Click through for the script.
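
To make the order of operations concrete, here is a minimal sketch (not Parikshit's script) that builds a T-SQL batch: refresh statistics on each columnstore index first, then clone. The database, table, and index names are hypothetical, and the helper functions are my own invention for illustration. In practice you would look up the columnstore indexes in sys.indexes rather than list them by hand.

```python
def quote_name(identifier: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_clone_script(source_db, clone_db, columnstore_indexes):
    """Return a T-SQL batch that updates statistics on each columnstore
    index and then runs DBCC CLONEDATABASE.

    columnstore_indexes is a list of (schema, table, index) tuples;
    every name passed in here is a hypothetical placeholder.
    """
    lines = [f"USE {quote_name(source_db)};"]
    for schema, table, index in columnstore_indexes:
        # Update the statistics object that shares the index's name
        # before the clone is taken.
        lines.append(
            f"UPDATE STATISTICS {quote_name(schema)}.{quote_name(table)} "
            f"({quote_name(index)});"
        )
    # Clone only after every statistics update has been emitted.
    lines.append(
        f"DBCC CLONEDATABASE ({quote_name(source_db)}, {quote_name(clone_db)});"
    )
    return "\n".join(lines)


if __name__ == "__main__":
    print(
        build_clone_script(
            "Sales", "Sales_Clone", [("dbo", "FactOrders", "CCI_FactOrders")]
        )
    )
```

The generated text is meant to be reviewed and run by hand in Management Studio; the sketch does not connect to a server.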

Trick Co-Workers With This Extended Property

Kenneth Fisher shows how to use extended properties to hide a table from SQL Server Management Studio:

FYI I’ve tried this at the column and schema levels and it didn’t work.

Using this you can hide the object from SSMS object explorer without restricting its use in any way.

I’m curious if there are any other hidden uses of extended properties. I haven’t been able to find any documentation so if you’ve seen any please let me know!

I don’t think I’ve ever had cause to hide objects from Management Studio, but if you’re looking for next year’s April Fools prank, maybe?

ggedit 0.2.0

Jonathan Sidi announces ggedit 0.2.0:

ggedit is an R package that is used to facilitate ggplot formatting. With ggedit, R users of all experience levels can easily move from creating ggplots to refining aesthetic details, all while maintaining portability for further reproducible research and collaboration.
ggedit is run from an R console or as a reactive object in any Shiny application. The user inputs a ggplot object or a list of objects. The application populates Bootstrap modals with all of the elements found in each layer, scale, and theme of the ggplot objects. The user can then edit these elements and interact with the plot as changes occur. During editing, a comparison of the script is logged, which can be directly copied and shared. The application output is a nested list containing the edited layers, scales, and themes in both object and script form, so you can apply the edited objects independent of the original plot using regular ggplot2 grammar.

This makes modifying ggplot2 visuals a lot easier for people who aren’t familiar with the concept of aesthetics and layers—like, say, the marketing team or management.

OLAP Limitations In Tableau

Tim Cost points out areas of friction when trying to use Tableau to connect to a multi-dimensional Analysis Services cube:

I love Tableau, I do NOT however, love working with Tableau when it is connected to an OLAP cube (like Microsoft SQL Server Analysis Services).  I don’t enjoy working with cube data in Tableau because basically all the coolest parts of Tableau won’t work or won’t work in the ways you might expect.  I don’t see this as a failing of Tableau, I lay the blame on the OLAP cube.  The main issue with working against a cube in Tableau is that you talk to a cube with MDX, where we talk to almost every other data source with SQL.  MDX (or Mind Destroying Expressions as I think of them), are just a huge pain to work with.  As hard as it is for ME to write MDX, for Tableau it’s even harder. Here are some things that you should consider before committing to a Tableau project with Microsoft SQL Server Analysis Services as a data source

Click through for ten such considerations.

Replacing SQL Agent In Azure

Bob Rubocki has some Q&A on using Azure Automation for the kinds of tasks you’d normally run as SQL Agent jobs:

Q: Is there any way to handle the execution of SSIS packages stored locally?

A: Azure Automation works on Azure resources.  It cannot be used for executing local SSIS packages.

In some cases, you may still need a scheduling tool (which might be a VM with SQL Agent).

Understanding DBCC OPENTRAN

Kevin Hill goes into detail on what DBCC OPENTRAN does:

I have verified that new records I inserted have been read by the log reader, AND distributed to the subscriber(s).  This means that while you are seeing

Oldest distributed LSN : (37:157:3)

There is not an error…just info.

If you have non-distributed LSNs, there is something to troubleshoot in the replication process which is way outside the scope of this post.  A non-distributed replicated transaction/LSN CAN cause some huge Log file growth, and need to be investigated.  If this happens frequently, use the TABLERESULTS option to log to a regular table and alert on it.

Good information here.

Learning Azure

Grant Fritchey notes that web searches won’t always take you to the latest version of documentation:

If you’re learning Azure and you research things using a search engine, then I strongly recommend you use the ability to limit your searches to the last year. Otherwise, you may be getting incomplete or incorrect data. At this precise moment, I’d say you need to limit your searches to Google (although I honestly hate recommending one of these tools over the other, let’s keep the competition fierce) because I was able to easily get the correct information within a couple of mouse clicks.

Grant’s post makes sense, and so does the search engine behavior: in Grant’s case, the older cmdlet documentation links have been around longer, and older resources tend to have more relevant linkbacks and clicks. That’s also visible in SQL Server documentation, where sometimes you’ll land on the 2008 R2 or 2012 version of documentation rather than 2016 or vNext.

Meanwhile, Victoria Holt has a bunch of resources for the Azure curious:

Here are a whole set of links to kick start your learning of Microsoft Azure services.

Introduction video

Changes to computer thinking – Stephen Fry explains cloud computing

That’s a good set of starting links.

Table Variables Use TempDB Too

Derik Hammer proves that classic, non-memory-optimized table variables use disk:

Table variables use tempdb similar to how temporary tables use tempdb. Table variables are not in-memory constructs but can become them if you use memory optimized user defined table types. Often I find temporary tables to be a much better choice than table variables. The main reason for this is because table variables do not have statistics and, depending upon SQL Server version and settings, the row estimates work out to be 1 row or 100 rows. In both cases these are guesses and become detrimental pieces of misinformation in your query optimization process.

It’s worth the read.

