Press "Enter" to skip to content

Author: Kevin Feasel

Data Protection Manager

Tom Roush discusses gotchas around Microsoft’s Data Protection Manager:

You’ve got DPM installed, and for the most part, configured.  It’s working, but you have transaction log drives filling up on some of your servers, and it’s not really clear why.

Wanna know why?

Here’s the answer:

It’s because the UI is very unclear, because the documentation is unclear (there was a hint of it on page 83), and because the things that would be obvious to a DBA simply aren’t mentioned.

Tom has a very detailed post on the topic, making it a must-read if you use this tool.

Hack Those P Values!

Ned Bicare provides us a sure-fire method for getting our academic papers published:

“If you torture the data long enough, it will confess.”

This aphorism, attributed to Ronald Coase, has sometimes been used in a disrespectful manner, as if it were wrong to do creative data analysis.
In fact, the art of creative data analysis has experienced despicable attacks over the last few years. A small but annoyingly persistent group of second-stringers tries to denigrate our scientific achievements. They drag psychological science through the mire.

Ned has a great tool to play around with as well, letting us Statistics our way to academic success.

Going From Trace To Extended Events

Erin Stellato shows that SQL trace events map pretty closely to Extended Events:

Every time I talk about migrating from Profiler and Trace to Extended Events (XE), I seem to add something to my presentation.  It’s not always intentional, and I don’t know if it’s because I just can’t stop tweaking my demos and contents, or something else, but here in London at IEPTO2 this week, I added a query that helps you see what event in Extended Events is comparable to the event you’re used to using in Trace/Profiler.  While most of the events in XE have a name similar to the event in Trace (e.g. sp_statement_completed for SP:StmtCompleted), some mappings aren’t so intuitive.  For example, SP:Starting in Trace is module_start in XE, and SP:Completed in Trace is module_end in XE.  That’s not so bad, but if you’re ever monitoring file growths, note that the database_file_size_change event in XE is the event for the following four events in trace: Data File Auto Grow, Data File Auto Shrink, Log File Auto Grow, and Log File Auto Shrink.

This is a helpful query to keep around until you get really familiar with Extended Events.
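
A mapping between classic Trace events and their Extended Events counterparts is exposed in the sys.trace_xe_event_map catalog view, so a minimal sketch of that kind of lookup (not necessarily the exact query from the post) looks like this:

-- Map each Trace event to its Extended Events equivalent, where one exists
SELECT t.name AS trace_event_name,
       xem.package_name,
       xem.xe_event_name
FROM sys.trace_xe_event_map AS xem
JOIN sys.trace_events AS t
    ON t.trace_event_id = xem.trace_event_id
ORDER BY t.name;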

Database Snapshots

Kenneth Fisher discusses database snapshots:

Here is where it starts getting interesting. A snapshot initially takes up little to no space. As changes are made to the source database the snapshot grows in size. In fact, the snapshot is the size of all of the pages changed in the source database since the creation of the snapshot. Basically, as a page is changed in the source database a copy of the original page is made and stored in the snapshot, but only the first time. (Note: The files used to store these pages are called sparse files.) This means that if you change the same page over and over again it will only be written to the snapshot once. It then logically follows that the largest a snapshot can get is the size of the source database at the time the snapshot was taken. Since most of the time we change a very small portion of the database at any given point in time, this means that snapshots tend to be much smaller than the source database. In fact, you could load millions of rows into the source database (assuming they are mostly/all in new pages) and it will have little to no effect on the size of the snapshot.

My favorite use of database snapshots was letting developers test their changes in QA and then revert to a pre-snapshot state.  That way, they could preserve data for future runs.
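
As a rough sketch of that workflow (the database, snapshot, and file names here are made up for illustration), creating a snapshot and later reverting to it looks something like this:

-- Create a snapshot of the QA database; the logical name must match the source's data file
CREATE DATABASE QA_PreDeploy_Snapshot
ON (NAME = QA_Data, FILENAME = 'D:\Snapshots\QA_PreDeploy.ss')
AS SNAPSHOT OF QA;

-- Run the tests, then throw the changes away by reverting to the snapshot
RESTORE DATABASE QA
FROM DATABASE_SNAPSHOT = 'QA_PreDeploy_Snapshot';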

Polybase Setup Errors

Murshed Zaman on the Azure CAT team covers a number of Polybase configuration errors:

SSMS Error:

Any Select query fails with the following error.
Msg 106000, Level 16, State 1, Line 1
Java heap space

Possible Reason:

Illegal input may cause the Java out-of-memory error.  In this particular case, the file was not in UTF8 format. DMS tries to read the whole file as one row since it cannot decode the row delimiter, and it runs into a Java heap space error.

Possible Solution:

Convert the file to UTF8 format since PolyBase currently requires UTF8 format for text delimited files.

I imagine that this page will get quite a few hits over the years, as there is currently limited information on how to solve these issues when you run into them, and some of the error messages (especially the one quoted above) have nothing to do with the root cause.
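
When one of these opaque errors comes up, the PolyBase DMVs can sometimes surface more detail than the message SSMS shows; a minimal sketch (assuming a SQL Server 2016 PolyBase installation):

-- Recent errors from the PolyBase compute nodes, newest first
SELECT TOP (20) create_time, compute_node_id, source, type, details
FROM sys.dm_exec_compute_node_errors
ORDER BY create_time DESC;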

Query Store On Read-Only Databases

Kendra Little wants to know if Query Store works on read-only databases:

At least for now, Query Store can only record query performance in databases that are read-write. Once you go read-only you can review the performance of past queries, but you can’t track the performance of anyone who queried the database after the point it went read-only.

At least for now. Query Store is such an awesome feature that perhaps this will change in the future. (I don’t have any inside info, only optimism.)

That’s a little bit of a letdown, but makes perfect sense.
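
If you want to check which mode Query Store is actually in for a given database, a quick sketch of that check is:

-- Compare the requested state to the actual state; readonly_reason is non-zero
-- when Query Store has been forced into read-only mode
SELECT desired_state_desc, actual_state_desc, readonly_reason
FROM sys.database_query_store_options;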

Checking Users And Principals

Shane O’Neill walks through a permissions issue and cautions against jumping the gun:

All the above is what I did.

Trying to fix the permission error, I granted SELECT permission.
Trying to fix the ownership chain, I transferred ownership.
Mainly in trying to fix the problem, I continually jumped the gun.
Which is why I am still a Junior DBA.

To be fair, I’d argue that if you intended to have replicated objects live in a different schema, the second action was fine.  Regardless, the advice is sound.
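
For reference, those two attempted fixes boil down to statements along these lines (the object, schema, and user names are hypothetical):

-- Fix the permission error: grant SELECT on the object
GRANT SELECT ON OBJECT::Repl.SomeTable TO SomeUser;

-- Fix the broken ownership chain: transfer the schema to the owner of the calling objects
ALTER AUTHORIZATION ON SCHEMA::Repl TO dbo;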

Polybase Statistics

I dig into a non-trivial Polybase query:

Polybase offers the ability to create statistics on tables, the same way that you would on normal tables.  There are a few rules about statistics:

  1. Stats are not auto-created.  You need to create all statistics manually.

  2. Stats are not auto-updated.  You will need to update all statistics manually, and currently, the only way you can do that is to drop and re-create the stats.

  3. When you create statistics, SQL Server pulls the data into a temp table, so if you have a billion-row table, you’d better have the tempdb space to pull that off.  To mitigate this, you can run stats on a sample of the data.

Round one did not end on a high note, so we’ll see what round two has to offer.
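
To make those rules concrete, here is a minimal sketch (the table and column names are hypothetical): create sampled statistics up front and, when they go stale, drop and re-create them.

-- Rules 1 and 3: create statistics manually, optionally on a sample to spare tempdb
CREATE STATISTICS st_FlightDate
ON dbo.Flights (FlightDate)
WITH SAMPLE 25 PERCENT;

-- Rule 2: no auto-update, so "updating" means dropping and re-creating
DROP STATISTICS dbo.Flights.st_FlightDate;
CREATE STATISTICS st_FlightDate
ON dbo.Flights (FlightDate)
WITH SAMPLE 25 PERCENT;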

Netflix Billing Architecture

The Netflix tech blog discusses changing their billing infrastructure to be entirely in the cloud (AWS in this case):

Cleaning up Code: We started chipping away at existing code, breaking it into smaller, efficient modules, and first moved some critical dependencies to run from the Cloud. We moved our tax solution to the Cloud first.

Next, we retired serving member billing history from giant tables that were part of many different code paths. We built a new application to capture billing events, migrated only necessary data into our new Cassandra data store and started serving billing history, globally, from the Cloud.

We spent a good amount of time writing a data migration tool that would transform member billing attributes spread across many tables in Oracle into a much simpler Cassandra data structure.

We worked with our DVD engineering counterparts to further simplify our integration and got rid of obsolete code.

Purging Data: We took a hard look at every single table to ensure that we were migrating only what we needed and leaving everything else behind. Historical billing data is valuable to legal and customer service teams. Our goal was to migrate only necessary data into the Cloud. So, we worked with impacted teams to find out what parts of historical data they really needed. We identified alternative data stores that could serve old data for these teams. After that, we started purging data that was obsolete and was not needed for any function.

All in all, a very interesting read on how to migrate large databases.  Even if you’re moving from one version of a product to another, some of these steps might prove very helpful in your environment.
