Press "Enter" to skip to content

Month: December 2015

Things A Junior DBA Should Know

Kendra Little has a list of three things a junior DBA should know:

Confession: I was a Junior DBA for a long time before I had a clue about this. It’s not unusual – many DBAs pick up existing databases and it’s natural to accept that the settings are correct.

Except, usually they aren’t. Usually, the last person who set them up just kinda guessed.

Guess what? You’re responsible for whatever they guessed.

Kendra’s three items are definitely junior-level, but we all start somewhere.
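If you want to see what the last person guessed, a quick pass over sys.databases is a reasonable first step. This sketch is mine, not Kendra’s checklist:

-- A minimal sketch of an inherited-settings review; all columns exist in sys.databases.
SELECT name,
       compatibility_level,
       recovery_model_desc,
       page_verify_option_desc,
       is_auto_shrink_on,
       is_auto_close_on
FROM sys.databases;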

Comments closed

Columnstore In 2016

Niko Neugebauer has two new posts up on columnstore index changes with SQL Server 2016.

First, row group merging with clustered columnstore indexes:

Row Group merging & cleanup is a long-awaited improvement that came out in SQL Server 2016. Once Microsoft announced this functionality, everyone who has worked with SQL Server 2014 & Clustered Columnstore Indexes rejoiced: one of the major problems, logical fragmentation caused by deleted data, is solved! Amazing!
Just as a reminder: logical fragmentation is the process of marking obsolete data in the Deleted Bitmap (in Columnstore Indexes there is no direct data removal from the compressed Segments with the Delete command, and the Update command uses the Deleted Bitmap as well, marking old versions of rows as deleted).
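If you want to watch the merge and cleanup happen, you can check deleted-row counts before and after a REORGANIZE. The table and index names below are hypothetical, but the DMV is real and new to 2016:

-- Hypothetical table dbo.FactSales with clustered columnstore index CCI_FactSales.
-- sys.dm_db_column_store_row_group_physical_stats is new in SQL Server 2016.
SELECT row_group_id, state_desc, total_rows, deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales');

-- In 2016, REORGANIZE can merge row groups and remove rows flagged in the Deleted Bitmap.
ALTER INDEX CCI_FactSales ON dbo.FactSales REORGANIZE;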

Second, Stretch DB with columnstore:

Stretch DB, or alternatively Stretch Database, is a way of spreading your table between SQL Server (on-premises, VM in Azure) and an Azure SQL Database. This means that the data of the table will be shared between the SQL Server and the Azure SQL Database, giving you the opportunity to lower the total cost of local storage, since Azure SQL Database storage is cheap relative to the expensive storage typically used on local SQL Server installations.
This means that the table data will be separated into Hot Data & Cold Data, where Hot Data is data that is frequently accessed and extremely important (this is typically some OLTP data), and Cold Data is data that is rarely or almost never accessed (typically archival or log data).
For the final user the experience should be the same as before – should he ask for some data that is not on the SQL Server, it will be read from the Azure SQL Database by the invocation of a remote query, joined with the local results (if any), and then presented to the user.
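Enabling Stretch touches the instance, the database, and the table. Here is a minimal sketch; the server, credential, database, and table names are placeholders, and note that the exact syntax shifted between the CTPs and RTM:

-- Instance level: allow Stretch.
EXEC sp_configure 'remote data archive', 1;
RECONFIGURE;

-- Database level: point at the Azure endpoint (placeholder names).
ALTER DATABASE MyDatabase SET REMOTE_DATA_ARCHIVE = ON
    (SERVER = N'mystretchserver.database.windows.net',
     CREDENTIAL = N'MyStretchCredential');

-- Table level: start migrating the cold data outbound.
ALTER TABLE dbo.ColdLogs
    SET (REMOTE_DATA_ARCHIVE = ON (MIGRATION_STATE = OUTBOUND));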

These two posts are must-reads if you work with columnstore indexes.

Comments closed

Collecting ETL Metrics

Andy Leonard has a long and useful post on collecting ETL metrics in SQL Server 2016:

“In an age of the SSIS Catalog, why would one ever employ this kind of metadata collection, Andy?” That’s a fair question. The SSIS Catalog is an awesome data integration execution, logging, and externalization engine. There are a handful of use cases, though, where enterprises may opt to continue to execute SSIS packages from the file system or the MSDB database. Perhaps the biggest reason to do so is that’s the way the enterprise is currently executing SSIS. When SSDT-BI converts pre-Catalog-era (2005, 2008, 2008 R2) SSIS packages to current, it imports these packages in a “Package Deployment Model” SSIS Project. This allows developers to upgrade the version of their SSIS project to SSIS 2016 (and enjoy many benefits for so doing) while continuing to execute SSIS packages in the file system. Kudos to the Microsoft SSIS Development Team for this backwards compatibility!
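If you are executing packages from the file system, you roll your own logging. Here is a bare-bones sketch of that kind of instrumentation; the table, columns, and package name are mine for illustration, not Andy’s schema:

-- Hypothetical execution log for file-system packages.
CREATE TABLE dbo.PackageExecutionLog (
    ExecutionId INT IDENTITY(1,1) PRIMARY KEY,
    PackageName NVARCHAR(260) NOT NULL,
    StartTime   DATETIME NOT NULL DEFAULT (GETDATE()),
    EndTime     DATETIME NULL,
    Status      NVARCHAR(20) NOT NULL DEFAULT (N'Running')
);

-- Run from Execute SQL Tasks at the start and end of a package.
DECLARE @ExecutionId INT;

INSERT INTO dbo.PackageExecutionLog (PackageName)
VALUES (N'LoadCustomers.dtsx');
SET @ExecutionId = SCOPE_IDENTITY();

-- ... package work happens here ...

UPDATE dbo.PackageExecutionLog
SET EndTime = GETDATE(), Status = N'Succeeded'
WHERE ExecutionId = @ExecutionId;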

Andy asks the question I wanted to ask and gives a good answer.

Comments closed

Introducing Microsoft SQL Server 2016 Preview

Denny Cherry alerts us to a free preview of a new Microsoft Press book:

The best thing about this book is that it is free; there isn’t even a soul-sucking registration to go through.  Just click and download.  It’s available as a standard 8.5×11″ PDF (standard US pages) as well as a smaller PDF for mobile.  If you are looking for EPUB and MOBI files, you’ll need to wait a few more weeks, as they are supposed to be available starting in January (don’t hold me to that; I’m just going off the MSDN post).

Denny has a chapter on SQL Server security improvements that looks particularly interesting to me.

Comments closed

Performance Monitoring For SSAS

Bill Anton has more information on performance monitoring for Analysis Services:

For query workloads, we can see important information about every single query that hits the system including details such as the total duration of the query, query text (MDX/DAX), start and end times, as well as the associated user account. We can also determine details as to how the query was executed such as the number of partitions scanned, aggregation hits/misses, cache hits/misses, other queries running at the same time, etc…all of which have an effect on the performance of any one particular query. A secondary benefit is that we’ll be able to identify the usage pattern(s) of folks using the cube. For example, is usage low/moderate throughout the week with a heavy spike on Friday mornings?
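Once that trace or xEvents output lands in a table, spotting the Friday-morning spike is a simple aggregation. The table and column names here are hypothetical:

-- Assumes QueryEnd events have been loaded into a table like dbo.SsasQueryLog.
SELECT DATENAME(WEEKDAY, StartTime) AS QueryDay,
       DATEPART(HOUR, StartTime)    AS QueryHour,
       COUNT(*)                     AS Queries,
       AVG(DurationMs)              AS AvgDurationMs
FROM dbo.SsasQueryLog
GROUP BY DATENAME(WEEKDAY, StartTime), DATEPART(HOUR, StartTime)
ORDER BY Queries DESC;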

Bonus note:  it looks like there will be an Extended Events GUI for Analysis Services in SQL Server 2016.

Comments closed

Auto-Deploying Documentation

Steph Locke has more on documentation auto-deployment (and the original Curated SQL entry):

So I went through and converted everything in my Rtraining to this and realised it messed up my slide decks – it’s been so long since I had built a pure knitr solution that I forgot that rmarkdown::render != knitr::knit. For my slidedecks, if I wanted the ioslides_presentation format, I needed to use rmarkdown::render. The problem with that has been the relative references to the CSS and the logo.

To solve this I read about the custom render formats capability and created a function that produces an ioslides_presentation but with my CSS preloaded by default. This now means that I can produce slides with better file referencing.

Steph has put up all of her R-related presentations and documentation as well, so check that out.

Comments closed

DocumentDB

Robert Sheldon walks us through DocumentDB:

DocumentDB organizes documents into collections, with each database capable of hosting one or more collections. Because DocumentDB is a cloud service, it offers quick and easy implementation, while delivering the flexibility and scalability necessary to meet the demands of today’s web and mobile applications.

DocumentDB integrates JSON and JavaScript right into the database engine. JSON, short for JavaScript Object Notation, is a widely implemented lightweight format for exchanging data between different source types, similar to how XML can be used to exchange data. JSON is based on a subset of the JavaScript programming language and is easy for computers to parse and generate, as well as being human readable.
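Queries against those collections use a SQL-like grammar over the JSON documents themselves. A quick sketch, with a hypothetical orders collection:

-- DocumentDB SQL; the orders collection and its properties are made up for illustration.
SELECT o.id, o.customer.name, o.total
FROM orders o
WHERE o.total > 100
ORDER BY o.total DESC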

Read the whole thing if you’re interested in Microsoft’s competitor to MongoDB.

Comments closed

Monitoring For Suspect Pages

John Martin shows us about dbo.suspect_pages:

dbo.suspect_pages is a table that resides in the MSDB database and is where SQL Server logs information about corrupt database pages (limited to 1,000 rows) that it encounters, not just when DBCC CHECKDB is run but during normal querying of the database. So if you have a DML operation that accesses a corrupt page, it will be logged here. This means that you have a chance of identifying corruption in your database outside of the normal DBCC CHECKDB routine.
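Querying it is straightforward; the event_type decodings below are from Books Online:

SELECT DB_NAME(database_id) AS database_name,
       file_id,
       page_id,
       CASE event_type
           WHEN 1 THEN '823 or 824 error'
           WHEN 2 THEN 'Bad checksum'
           WHEN 3 THEN 'Torn page'
           WHEN 4 THEN 'Restored'
           WHEN 5 THEN 'Repaired (DBCC)'
           WHEN 7 THEN 'Deallocated (DBCC)'
       END AS event_description,
       error_count,
       last_update_date
FROM msdb.dbo.suspect_pages;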

This is a nice tool we can use to check for corruption.

Comments closed

Fraud Detection With R And Azure

David Smith shows us an online fraud detection template:

Detecting fraudulent transactions is a key application of statistical modeling, especially in an age of online transactions. R of course has many functions and packages suited to this purpose, including binary classification techniques such as logistic regression.

If you’d like to implement a fraud-detection application, the Cortana Analytics gallery features an Online Fraud Detection Template. This is a step-by-step guide to building a web service which will score transactions by likelihood of fraud, created in five steps.
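If you would rather score close to the data instead of standing up the Azure template, SQL Server 2016’s R Services offers a different route. This is a minimal sketch, not the template itself, and dbo.Transactions and its columns are hypothetical:

-- Trains a logistic regression in-database and returns fraud probabilities.
EXEC sp_execute_external_script
    @language = N'R',
    @script = N'
        model <- glm(is_fraud ~ amount + account_age_days,
                     data = InputDataSet, family = binomial);
        OutputDataSet <- data.frame(
            fraud_probability = predict(model, InputDataSet, type = "response"));',
    @input_data_1 = N'SELECT is_fraud, amount, account_age_days FROM dbo.Transactions'
WITH RESULT SETS ((fraud_probability FLOAT));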

Read through for the five follow-up articles.  This is a fantastic series and I plan to walk through it step by step myself.

Comments closed