Press "Enter" to skip to content

Month: July 2016

Job Hunting

Andy Mallon hunts the most dangerous of creatures:  man job.

Finding a new job is hard. At every turn, it’s a lot of work: finding a job, interviewing, negotiating. I don’t know anyone who likes doing it. I do know lots of people who struggle with various parts of the process.

I want to share my experience, my thoughts, and my opinions. Hopefully, you can learn from my experience (good and bad). By sharing my experience, hopefully I can make life a little bit easier on you the next time you’re job hunting.

Definitely looking forward to this series.

Comments closed

Finding Failed Backups And Jobs

Thomas Rushton builds a nasty query to answer an important question:

Assumptions

  1. Backup jobs that do full backups don’t overlap
  2. There’s nothing else doing full backups
  3. erm…
  4. that’s it

The reason I was thinking about this is that we have occasional-but-annoyingly-frequent backup job failures, wherein most of the databases back up just fine, but the odd one fails. (SharePoint box, I’m looking at you…) Rather than trawling through the error logs to find out which particular database didn’t back up successfully, I wanted a query to do the heavy lifting. Yes, I’m a lazy lone DBA…

The end result is a pivoted query showing days in which full backups fail.  There’s a lot of information in there, so that might be something I’d want to visualize in Excel or R, changing cell colors for failed jobs so they stand out better.  Nevertheless, check this out, especially if you don’t have a solution in place to monitor backups.

Comments closed

Parallel Loading Columnstore Indexes

Sunil Agarwal shows how to bulk load with parallelism into a clustered columnstore index from a staging table:

SQL Server 2016 requires following conditions to be met for parallel insert on CCI

  • Must specify TABLOCK
  • No NCI on the clustered columnstore index
  • No identity column
  • Database compatibility is set to 130

While these restrictions are enforced in SQL Server 2016 but they represent important scenarios. We are looking into relaxing these in subsequent releases. Another interesting point is that you can also load into ‘rowstore HEAP’ in parallel as well.

The restriction I’d most like to see reduced would be the “no non-clustered indexes” part.  The rest seem forgivable for most clustered columnstore setups (i.e., fact tables).

Comments closed

SSIS Firewall Rules

Slava Murygin shows how to create a firewall rule to allow SSIS connections:

Recently tried to connect to Remote SQL Server Integration Service directly from SSMS and got following error:

TITLE: Connect to Server
——————————
Cannot connect to 10.1.32.66.
——————————
ADDITIONAL INFORMATION:
Failed to retrieve data for this request. (Microsoft.SqlServer.Management.Sdk.Sfc)
For help, click: http://go.microsoft.com/fwlink?ProdName=Microsoft%20SQL%20Server&LinkId=20476
——————————
The RPC server is unavailable. (Exception from HRESULT: 0x800706BA) (Microsoft.SqlServer.DTSRuntimeWrap)
——————————
The RPC server is unavailable. (Exception from HRESULT: 0x800706BA) (Microsoft.SqlServer.DTSRuntimeWrap)
——————————
BUTTONS:
OK
——————————

Slava then shows how to work around this.

Comments closed

Missing Values In R

David Smith explains NA values in R:

Here’s a little puzzle that might shed some light on some apparently confusing behaviour by missing values (NAs) in R:

What is NA^0 in R?

You can get the answer easily by typing at the R command line:

> NA^0
[1] 1

But the interesting question that arises is: why is it 1? Most people might expect that the answer would be NA, like most expressions that include NA. But here’s the trick to understanding this outcome: think of NA not as a number, but as a placeholder for a number that exists, but whose value we don’t know.

Definitely read the comments on this one.

Comments closed

Diagnosing Memory Grant Issues

Pedro Lopes looks at Extended Events around memory grants:

Three conditions can trigger this warning to show up in showplan:

  1. Excessive Grant: when max used memory is too small compared to the granted memory. This scenario can cause blocking and less efficient usage when large grants exist and a fraction of that memory was used.

  2. Used More Than Granted: when the max used memory exceeds the granted memory. This scenario can cause OOM conditions on the server.

  3. Grant Increase: when the dynamic grant starts to increase too much, based on the ratio between the max used memory and initial request memory. This scenario can cause server instability and unpredictable workload performance.

I like that this information also shows up when you view an execution plan using SSMS 2014 SP2.

Comments closed

Deprecated SSAS Features

Chris Webb looks at Analysis Services functionality deprecated (or discontinued) as of 2016:

Some time ago I blogged about the deprecated and discontinued functionality in SSAS 2014, so I thought it would be a good idea to follow my last post on what’s new in SSAS 2016 Multidimensional with a discussion of what’s going or gone from it.

The same page that I linked to last time has been updated for 2016, and there are four more subpages with all the details. There’s nothing much interesting to say about the breaking changes (basically AMO has been rejigged) or behaviour changes (there’s no in-place upgrade for Tabular models using DirectQuery – you have to open the project and edit some settings) but the other two pages do have some news worthy of comment:

It looks like there are some potentially interesting features on the chopping block.  My Analysis Services experience is extremely limited, so I’ve never used any of them, but looks like it might be worth checking out if you have production cubes.

Comments closed

SQL Server’s Basic Installer

Derik Hammer walks through the new SQL Server Basic Installer:

When using the Basic Installer only the database engine and SQL Client Connectivity SDK are installed. I find this better than using the advanced installer’s Install with all defaults option. Typically any local user will not need Reporting Services, Integration Services, Analysis Services, or any other feature available.

I was a bit concerned about this when it was first announced—the default installer already allows you to make too many poor decisions—but I think it works within the use case Derik describes:  developers installing a local edition on their dev boxes.

Comments closed

Understanding Spark APIs

Jules Damji explains when to use RDDs, when to use DataFrames, and when to use Datasets in Spark:

Like an RDD, a DataFrame is an immutable distributed collection of data. Unlike an RDD, data is organized into named columns, like a table in a relational database. Designed to make large data sets processing even easier, DataFrame allows developers to impose a structure onto a distributed collection of data, allowing higher-level abstraction; it provides a domain specific language API to manipulate your distributed data; and makes Spark accessible to a wider audience, beyond specialized data engineers.

With Spark 2.0, the balance moves in favor of the more structured data types.  What’s old is new; what’s unstructured is structured…

Comments closed