Press "Enter" to skip to content

Author: Kevin Feasel

Interrogating A Stack Dump

Kendra Little looks at a SQL Server stack dump:

In the video, I show an example of a stack dump caused by running DBCC PAGE with format style 3 against a table with a filtered index in SQL Server 2014.

It looks like this bug is fixed in SQL Server 2016, at least by SP1.

Sample code to reproduce this against the AdventureWorks2012 database (which I had restored to SQL Server 2014) is in my gist here.

Click through to watch the video.

Comments closed

SSMS Templates

Jana Sattainathan shows some of the value of SQL Server Management Studio templates, along with an important warning:

If you do start creating your own templates, you are responsible for backing them up. To locate the folder where they are stored

  1. Open DOS command prompt
  2. Run “echo %APPDATA%”
  3. Note the base path
  4. Navigate to %AppData%\Microsoft\Microsoft SQL Server\{SQL Server Version}\Tools\Shell\Templates\Sql\

(where %AppData% is the base path from

and {SQL Server Version} = 90 for SQL 2005, 100 for SQL 2008, 110 for SQL 2012, 120 for SQL 2014 and 130 for SQL 2016)

Templates are extremely useful for day-to-day development as well as giving a handy way of generating snippets of code, like estimating row count without having to remember to join to sys.indexes, sys.objects, and sys.dm_db_partition_stats.

Comments closed

Google Compute Engine Whitepapers

Brent Ozar Unlimited has a couple whitepapers out about working with SQL Server in Google Compute Engine.  First, Brent and Tara Kizer create an Availability Group:

In this white paper we built with Google, we’ll show you:

  • How to build your first Availability Group in Google Compute Engine

  • How to test your work with four failure simulations

  • How to tell whether your databases will work well in GCE

Erik Darling also has a whitepaper on performance tuning:

Relax. Have a drink. In this white paper we built with Google, we’ll show you:

  • How to measure your current SQL Server using data you’ve already got

  • How to size a SQL Server in Google Compute Engine to perform similarly

  • After migration to GCE, how to measure your server’s bottleneck

  • How to tweak your SQL Server based on the performance metrics you’re seeing

If you’re looking at GCE as a potential migratory spot, you’ve got some extra reading material.

Comments closed

Dr. Elephant: Where Does My Hadoop Cluster Hurt?

Carl Steinbach looks back at Dr. Elephant one year later:

What we needed to introduce to the job-tuning equation was a series of questions like those asked by a physician making a diagnosis: a step-by-step process that guides the user through the problem-solving process, while also educating them at the same time.

So we created Dr. Elephant, a system that automatically detects under-performing jobs, diagnoses the root cause, and guides the owner of the job through the treatment process. Dr. Elephant makes it easy to identify jobs that are wasting resources, as well as jobs that can achieve better performance without sacrificing efficiency. Perhaps most importantly, Dr. Elephant makes it easy to act on these insights by making job-level performance tuning accessible to users regardless of their previous skill level. In the process, Dr. Elephant has helped to ease the tension that previously existed between user productivity on one side and cluster efficiency on the other.

LinkedIn has made this project open source if you want to check it out in your environment.

Comments closed

TensorFlow With YARN

Wangda Tan and Vinod Kumar Vavilapalli show how to control TensorFlow jobs with YARN:

YARN has been used successfully to run all sorts of data applications. These applications can all coexist on a shared infrastructure managed through YARN’s centralized scheduling.

With TensorFlow, one can get started with deep learning without much knowledge about advanced math models and optimization algorithms.

If you have GPU-equipped hardware, and you want to run TensorFlow, going through the process of setting up hardware, installing the bits, and optionally also dealing with faults, scaling the app up and down etc. becomes cumbersome really fast. Instead, integrating TensorFlow to YARN allows us to seamlessly manage resources across machine learning / deep learning workloads and other YARN workloads like MapReduce, Spark, Hive, etc.

Read on for more details, including a demo video.

Comments closed

Rolling Out An Analytics Project

Christina Prevalsky shares some thoughts on considerations when implementing an analytics project:

The earlier you address data quality the better; the less time your end users spend on data wrangling, and the more they can focus on high value analytics. As your organization’s data infrastructure matures, migrating from spreadsheets to databases and data warehouses, data quality checks should be formally defined, documented, and automated. Exceptions should either be handled automatically during data intake using predefined business rules logic or require immediate user intervention to correct any errors.

Providing clean, centralized, and analytics-ready data to end users should not be a one-way process. By allowing end users to focus on high-value analytics, like data mining, network graphs, clustering, etc., they can uncover certain outliers and anomalies in the data. Effective data management should include a feedback loop to communicate these findings and, if necessary, incorporate any changes in the ETL processes, making centralized data management more dynamic and flexible.

The big question to ask is, “what problem are we trying to solve?”  That will help determine the answer to many of the questions, including how you store the data, how you expose the data, and even which data you collect and keep.

Comments closed

Air Travel Route Maps With ggplot2

Peter Prevos wants to create a pretty map of flights he’s taken:

The first step was to create a list of all the places I have flown between at least once. Paging through my travel photos and diaries, I managed to create a pretty complete list. The structure of this document is simply a list of all routes (From, To) and every flight only gets counted once. The next step finds the spatial coordinates for each airport by searching Google Maps using the geocode function from the ggmap package. In some instances, I had to add the country name to avoid confusion between places.

The end result is imperfect (as Peter mentions, ggmap isn’t wrapping around), but does fit the bill for being eye-catching.

Comments closed

Continuous Deployment In A Box

Ed Elliott has been working on a very interesting project:

What does this do?

Unblock-File *.ps1 – removes a flag that windows puts on files to stop them being run if they have been downloaded over the internet.
.\ContinuousDeploymentFTW.ps1 – runs the install script which actually:

  • Downloads chocolatey
  • Installs git
  • Installs Jenkins 2
  • Guides you how to configure Jenkins
  • Creates a local git repo
  • Creates a SSDT project which is configured with a test project and ssdt and all the references that normally cause people problems
  • Creates a local Jenkins build which monitors your local git repo for changes
  • When code is checked into the repo, the Jenkins job jumps into action and…

If you check into the default branch “master” then Jenkins:

  • Builds the SSDT project
  • Deploys the project to the unit test database
  • Runs the tSQLt unit tests
  • Generates a deployment script for the “production” database

and what you have there is continuous delivery in a box

Click through for a video where Ed shows how it all works.

Comments closed