Press "Enter" to skip to content

Author: Kevin Feasel

Last Known Good DBCC CHECKDB In Powershell

Rob Sewell shows off a cmdlet to check DBCC DBINFO for each database to get the last known good CHECKDB run:

This time we get more information: the server name, database name, when the database was created, the last good DBCC CheckDb, how long since the database was created, how long since the last known good DBCC CheckDb, a status, and a Data Purity enabled flag. If you look at the image above, it shows that the DBA_Admin database has a status of “New database, not checked yet” even though it has a date for the last known good DBCC CheckDb. This is because it was restored after this server was upgraded from CTP 1.3 to CTP 1.4 and there has not yet been a DBCC CheckDb run. The system databases have a status of “CheckDb should be performed”. This is because the last known good DBCC CheckDb is more than 7 days ago. Let’s run a DBCC CheckDb and check again.

Do read the caveats, and also check out a previous Arun Sirpal blog post on DBCC DBINFO.
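
Under the covers, this information comes from the dbi_dbccLastKnownGood field of the undocumented DBCC DBINFO command. A minimal T-SQL equivalent for a single database (a hypothetical DBA_Admin here) looks something like this:

    -- DBCC DBINFO is undocumented; its output may change between versions.
    DBCC DBINFO ('DBA_Admin') WITH TABLERESULTS, NO_INFOMSGS;
    -- Look for the row whose Field column is dbi_dbccLastKnownGood; a value
    -- of 1900-01-01 00:00:00.000 means no clean CHECKDB has ever completed.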


Troubleshooting Cluster Creation Errors

Mark Broadbent diagnoses an error which seems misleading at first:

One such problem is when you use the New-Cluster command to add all your nodes in one go.

New-Cluster -Name magrathea -Node server5,server6,server7 -StaticAddress 192.168.1.70

Simple, right? Well, no. In this instance I ran into the following error:

New-Cluster : There was an error adding node 'server7' to the cluster
the node cannot be contacted. Ensure that the node is powered on and is connected to the network.

Read on for an example of piecemeal debugging. Mark’s advice is to keep things simple, as, in this case at least, you can’t count on the returned error messages being completely accurate.


The Central Limit Theorem

Mala Mahadevan explains the Central Limit Theorem with an example:

The central limit theorem states that the sampling distribution of the mean of any independent, random variable will be normal or nearly normal, if the sample size is large enough. How large is “large enough”? The answer depends on two factors:

  • Requirements for accuracy. The more closely the sampling distribution needs to resemble a normal distribution, the more sample points will be required.
  • The shape of the underlying population. The more closely the original population resembles a normal distribution, the fewer sample points will be required. (from stattrek.com).

The main use of the sampling distribution is to verify the accuracy of many statistics and of the population they were based upon.

Read on for an example and to see how to calculate this in T-SQL.
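
If you want a flavor of the calculation before clicking through, here is a minimal sketch (mine, not Mala’s, against a hypothetical dbo.SalesAmounts table): draw many random samples, average each one, and compare the mean of the sample means to the population mean.

    -- Build the sampling distribution of the mean: 1000 samples of 50 rows each.
    CREATE TABLE #SampleMeans (SampleMean FLOAT);

    DECLARE @i INT = 1;
    WHILE @i <= 1000
    BEGIN
        INSERT INTO #SampleMeans (SampleMean)
        SELECT AVG(Amount)
        FROM (SELECT TOP (50) Amount
              FROM dbo.SalesAmounts
              ORDER BY NEWID()) AS s;   -- a cheap (if slow) random sample
        SET @i += 1;
    END;

    -- The mean of the sample means should land close to the population mean,
    -- and the distribution of SampleMean should look roughly normal.
    SELECT AVG(SampleMean)   AS MeanOfSampleMeans,
           STDEV(SampleMean) AS StandardError
    FROM #SampleMeans;

    SELECT AVG(Amount) AS PopulationMean
    FROM dbo.SalesAmounts;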


Indexed Views On Clustered Columnstore Indexes

If you’re using SQL Server 2014 and want to create a non-clustered rowstore index on top of your clustered columnstore index, Niko Neugebauer has you covered:

Here we have a beautiful and simple execution plan, which delivers what we need – great performance with a relatively low overall cost. And in this way we enjoy the possibility to get the best out of the two worlds – Columnstore & Rowstore.
Should we need to run a similar query against the whole dataset, it will be redirected to our Columnstore Index, which will deliver great performance:

Alternatively, upgrade to SQL Server 2016 and you get this without introducing an indexed view into the mix.
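
For those stuck on 2014, the workaround amounts to wrapping the columnstore table in a schema-bound view and putting a unique clustered index on that view. A rough sketch, assuming a hypothetical dbo.FactSales table that already has a clustered columnstore index:

    CREATE VIEW dbo.vFactSales_BySale
    WITH SCHEMABINDING
    AS
    SELECT SaleID, CustomerID, Amount
    FROM dbo.FactSales;
    GO

    -- The unique clustered index materializes the view as a rowstore B-tree,
    -- giving point-lookup performance the columnstore alone cannot.
    CREATE UNIQUE CLUSTERED INDEX CIX_vFactSales_BySale
        ON dbo.vFactSales_BySale (SaleID);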


Columnstore Indexes On Cloned Databases

Parikshit Savjani has a script to update columnstore index statistics before running DBCC CLONEDATABASE:

Unlike traditional B-tree indexes, when a columnstore index is created, no index statistics are created on the columns of the columnstore index. However, an empty stats object is created with the same name as the columnstore index, and an entry is added to sys.stats at the time of index creation. The stats object is populated on the fly when a query is executed against the columnstore index or when executing DBCC SHOW_STATISTICS against it, but the columnstore index statistics aren’t persisted in storage. The index statistics are different from the auto-created statistics on the individual columns of columnstore indexes, which are generated on the fly and persisted in the statistics object. Since the index statistics are not persisted in storage, the cloned database will not contain them, leading to inaccurate stats and different query plans when the same query is run against the database clone as opposed to the production database.

Click through for the script.
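
The gist of the workaround (a simplification on my part, not Parikshit’s actual script) is to persist the columnstore statistics with UPDATE STATISTICS before running the clone:

    -- Persist stats for every columnstore index, then clone the database.
    DECLARE @sql NVARCHAR(MAX) = N'';

    SELECT @sql += N'UPDATE STATISTICS '
        + QUOTENAME(SCHEMA_NAME(o.schema_id)) + N'.' + QUOTENAME(o.name)
        + N' ' + QUOTENAME(i.name) + N' WITH FULLSCAN;' + NCHAR(10)
    FROM sys.indexes AS i
    JOIN sys.objects AS o ON o.object_id = i.object_id
    WHERE i.type IN (5, 6);  -- 5 = clustered, 6 = nonclustered columnstore

    EXEC sys.sp_executesql @sql;

    DBCC CLONEDATABASE (ProductionDb, ProductionDb_Clone);  -- hypothetical names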


Trick Co-Workers With This Extended Property

Kenneth Fisher shows how to use extended properties to hide a table from SQL Server Management Studio:

FYI I’ve tried this at the column and schema levels and it didn’t work.

Using this you can hide the object from SSMS object explorer without restricting its use in any way.

I’m curious if there are any other hidden uses of extended properties. I haven’t been able to find any documentation so if you’ve seen any please let me know!

I don’t think I’ve ever had cause to hide objects from Management Studio, but if you’re looking for next year’s April Fools prank, maybe?
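
If memory serves, the property in question is microsoft_database_tools_support, the same marker that keeps dbo.sysdiagrams out of Object Explorer; treat the name as an assumption and check Kenneth’s post for the specifics:

    -- Assumption: microsoft_database_tools_support is the hidden property.
    EXEC sys.sp_addextendedproperty
        @name = N'microsoft_database_tools_support',
        @value = 1,
        @level0type = N'SCHEMA', @level0name = N'dbo',
        @level1type = N'TABLE',  @level1name = N'PrankTarget';  -- hypothetical table

    -- Refresh Object Explorer and the table vanishes; queries against it
    -- still work. Reverse it with sys.sp_dropextendedproperty.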


ggedit 0.2.0

Jonathan Sidi announces ggedit 0.2.0:

ggedit is an R package that is used to facilitate ggplot formatting. With ggedit, R users of all experience levels can easily move from creating ggplots to refining aesthetic details, all while maintaining portability for further reproducible research and collaboration.
ggedit is run from an R console or as a reactive object in any Shiny application. The user inputs a ggplot object or a list of objects. The application populates Bootstrap modals with all of the elements found in each layer, scale, and theme of the ggplot objects. The user can then edit these elements and interact with the plot as changes occur. During editing, a comparison of the script is logged, which can be directly copied and shared. The application output is a nested list containing the edited layers, scales, and themes in both object and script form, so you can apply the edited objects independent of the original plot using regular ggplot2 grammar.

This makes modifying ggplot2 visuals a lot easier for people who aren’t familiar with the concept of aesthetics and layers—like, say, the marketing team or management.


OLAP Limitations In Tableau

Tim Cost points out areas of friction when trying to use Tableau to connect to a multi-dimensional Analysis Services cube:

I love Tableau; I do NOT, however, love working with Tableau when it is connected to an OLAP cube (like Microsoft SQL Server Analysis Services). I don’t enjoy working with cube data in Tableau because basically all the coolest parts of Tableau won’t work, or won’t work in the ways you might expect. I don’t see this as a failing of Tableau; I lay the blame on the OLAP cube. The main issue with working against a cube in Tableau is that you talk to a cube with MDX, where we talk to almost every other data source with SQL. MDX (or Mind Destroying Expressions, as I think of them) is just a huge pain to work with. As hard as it is for ME to write MDX, for Tableau it’s even harder. Here are some things that you should consider before committing to a Tableau project with Microsoft SQL Server Analysis Services as a data source.

Click through for ten such considerations.


Understanding DBCC OPENTRAN

Kevin Hill goes into detail on what DBCC OPENTRAN does:

I have verified that new records I inserted have been read by the log reader, AND distributed to the subscriber(s).  This means that while you are seeing

Oldest distributed LSN : (37:157:3)

There is not an error…just info.

If you have non-distributed LSNs, there is something to troubleshoot in the replication process, which is way outside the scope of this post. A non-distributed replicated transaction/LSN CAN cause some huge log file growth and needs to be investigated. If this happens frequently, use the TABLERESULTS option to log to a regular table and alert on it.

Good information here.
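
As a quick illustration of the TABLERESULTS option Kevin mentions (assuming a published database named PublishedDb):

    -- Report the oldest active and oldest non-distributed replicated
    -- transactions as rows rather than informational messages.
    DBCC OPENTRAN ('PublishedDb') WITH TABLERESULTS, NO_INFOMSGS;
    -- To alert on it, capture this output with INSERT ... EXEC into a
    -- logging table shaped to match, and check that table on a schedule.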
