Day: March 17, 2017

PolyBase Use Cases

James Serra talks about PolyBase use cases:

For federated queries: “N” requires all data from the source to be copied into SQL Server 2016 and then filtered.  For “Y”, the query is pushed down into the data source and only the results are returned back, which can be much faster for large amounts of data.

I mention “Maybe” for age out data in SQL DW as you can use PolyBase to access the aged-out data in blob or Azure Data Lake Storage (ADLS), but it will have to import all the data so may have slower performance (which is usually ok for accessing data that is aged-out).  For SQL Server 2016, it will have to import the data unless you use HDP/Cloudera, in which case the creation of the MapReduce job will add overhead.

The thing that I like about this chart is that the new PolyBase sources (SQL Server, Oracle, Teradata, MongoDB, and generic ODBC) do support predicate pushdown.  For large data sets, that’s huge:  it lets the database engine on the opposite end do as much filtering as possible before sending results back to your SQL Server head node.
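
To make the difference between the “N” and “Y” cases concrete, here is a minimal Python sketch. It is not PolyBase code: an in-memory SQLite database stands in for the external source, and the table name, column names, and row counts are all made up for illustration. The only point is to compare how many rows cross the wire when the filter runs locally versus at the source.

```python
import sqlite3


def make_remote_source(row_count: int) -> sqlite3.Connection:
    """Build a stand-in 'remote' source (hypothetical orders table)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, region TEXT)")
    # One row in every hundred belongs to the region we will ask for.
    rows = [(i, "EMEA" if i % 100 == 0 else "AMER") for i in range(1, row_count + 1)]
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    return conn


def query_without_pushdown(remote: sqlite3.Connection, region: str):
    """Copy every row from the source, then filter on the local side."""
    transferred = remote.execute("SELECT order_id, region FROM orders").fetchall()
    matches = [row for row in transferred if row[1] == region]
    return matches, len(transferred)


def query_with_pushdown(remote: sqlite3.Connection, region: str):
    """Send the predicate to the source so only matching rows come back."""
    transferred = remote.execute(
        "SELECT order_id, region FROM orders WHERE region = ?", (region,)
    ).fetchall()
    return transferred, len(transferred)


remote = make_remote_source(10_000)
slow_rows, slow_cost = query_without_pushdown(remote, "EMEA")
fast_rows, fast_cost = query_with_pushdown(remote, "EMEA")
print(f"rows transferred without pushdown: {slow_cost}")
print(f"rows transferred with pushdown:    {fast_cost}")
```

Both functions return the same matching rows; only the number of rows transferred differs, which is the cost that pushdown is meant to reduce.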

Graphing R Package Dependencies

Tomaz Kastrun uses the igraph package to graph package dependencies in R:

By importing the tools package, we get many useful functions for finding additional information on packages.

The function package.dependencies() parses and checks the dependencies of a package in the current environment. The function package_dependencies() (with an underscore, not a dot) will find all dependent and reverse dependent packages.

This probably tilts more toward “fun” than “practical,” but this will let you see the full set of dependencies for a package if, for example, you need to grab all of these packages for upgrading an offline instance.
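
For the offline-upgrade scenario, the useful result is the full transitive set of dependencies rather than just the direct ones. The linked post does this in R with the tools and igraph packages; what follows is only a language-neutral sketch in Python, and the dependency map and package names in it are invented for illustration.

```python
from collections import deque

# Hypothetical dependency map: package -> packages it depends on directly.
DIRECT_DEPS = {
    "reportkit": ["plotlib", "tablekit"],
    "plotlib": ["colortools", "gridbase"],
    "tablekit": ["gridbase"],
    "colortools": [],
    "gridbase": [],
}


def all_dependencies(package: str, direct_deps: dict) -> set:
    """Return every package reachable from `package` (breadth-first walk)."""
    seen = set()
    queue = deque(direct_deps.get(package, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited through another path
        seen.add(current)
        queue.extend(direct_deps.get(current, []))
    return seen


def reverse_dependencies(package: str, direct_deps: dict) -> set:
    """Return the packages that list `package` as a direct dependency."""
    return {name for name, deps in direct_deps.items() if package in deps}


print(sorted(all_dependencies("reportkit", DIRECT_DEPS)))
print(sorted(reverse_dependencies("gridbase", DIRECT_DEPS)))
```

The set returned by all_dependencies is the download list for an offline machine; reverse_dependencies answers the opposite question of what might break if a package changes.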

Tail Log Backups

Kendra Little explains the importance of tail log backups in the course of answering a reader question:

When you restore a full backup, does it restore to when you started the backup job— or when it completed?

In this episode, I give you the super-short answer. (Spoiler: a point near the end of when the backup was running.) For the full answer, complete with a detailed timeline to help you understand the nitty gritty, read “Understanding SQL Server Backups” by Paul Randal.

Click through for the video as well as a bit more information on tail log backups.

Disappearing Availability Groups

Cody Konior has a not-so-great magic trick:

I logged onto that node and the AG Dashboard looked okay at first glance. But the test was still failing when I re-ran it manually, so, I looked deeper.

I logged onto the second node and noticed the AG was completely gone. All the databases were in recovery but there was no sign of the AG at all. Nothing. Nada. Zip. (I don’t have any other words). It’s like it was never there.

At first I thought someone must have done something awful. I quickly poured a coffee while checking the default trace which usually records system-level configuration changes like dropping an entire replica but in this case nothing relevant showed up.

Read on for the answer, as well as action items to take if you’re actively using Availability Groups.

Graph Database Basics

Victoria Holt has some good resources on learning more about graph databases:

There is graph support in the next version of SQL Server. The private preview page states:

SQL Graph adds graph processing capabilities to SQL Server, which will help you link different pieces of connected data to help gather powerful insights and increase operational agility. Graphs are well suited for applications where relationships are important, such as fraud detection, risk management, social networks, recommendation engines, predictive analysis, dependence analysis, IoT suites, etc.
Initially, SQL Server will support CRUD graph operations and multi-hop graph navigation, and the following functionality will be available in the private preview:

  • Create graph objects, that is, nodes to represent entities and edges to represent relationships between any 2 given nodes. Both Nodes and Edges can have properties associated to them.
  • SQL language extensions to support join free, pattern matching queries for multi-hop navigation

Kennie Pontoppidan wrote a great blog post on where to find more information.

Click through for more links to interesting resources.

Checking Query Settings

Angela Henry had a query which worked fine in Management Studio but not in PowerShell:

After I dusted off my PowerShell 2.0 documentation, I got my script written and started testing.  I processed several folders and their files before I received the following error while running my PowerShell script:

Invoke-Sqlcmd : String or binary data would be truncated.
The statement has been terminated.
At line:127 char:36
+ … MyResults = Invoke-Sqlcmd -ServerInstance $ServerName `
+                 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : InvalidOperation: (:) [Invoke-Sqlcmd], SqlPowerShellSqlExecutionException
+ FullyQualifiedErrorId : SqlError,Microsoft.SqlServer.Management.PowerShell.GetScriptCommand

Interesting.  I added some Write-Host statements for troubleshooting and found the offending entry.  Like any good programmer, I tested my stored procedure call in SQL Server Management Studio (SSMS) to make sure it really was a SQL Server error and guess what?  It worked just fine!  No errors whatsoever.  WTH?!  This is where my tunnel vision sets in.  If it works in SSMS but not in PowerShell, then PowerShell must be the problem, right?  Well, sort of.

Read on for the solution.

Storyboarding: Uncovering The Problem

Jonathan Stewart continues his series on storyboarding:

If you have heard my “Data Visualization: How to truly tell a great story!” presentation, you will have heard me mention using a storyboard to get a better understanding of the problem. Cole Nussbaumer Knaflic does a great job of introducing this concept in her book “Storytelling with Data” which is a great read and an excellent reference tool for anyone in the data viz world.

I have adapted to using her basic storyboard as my basis for my development and we will use it today as the foundation of our series.

Jonathan ends with a set of sample questions to ask.  These are just starter questions, but they’ll help uncover important but hidden business requirements.

Inconsistencies With SQL_VARIANT

Erik Darling warns against using SQL_VARIANT data types:

I half-stumbled on the weirdness around SQL_VARIANT a while back while writing another post about implicit conversion. What I didn’t get into at the time is that it can give you incorrect results.

When I see people using SQL_VARIANT, it’s often in dynamic SQL, when they don’t know what someone will pass in, or what they’ll compare it to. One issue with that is you’ll have to enclose most things in single quotes, in case a string or date is passed in. Ever try to hand SQL those without quotes? Bad news bears. You don’t get very far.

Read on for the demo.  I have never used SQL_VARIANT in any project.  I’ve done a lot of crazy things with SQL Server (some of them intentionally) but never this.

Scripting Tables With SSMS

Tim Cost shows a few ways to script tables using SQL Server Management Studio:

Still … there is a trick here, and I don’t see a lot of people using it.  Maybe it’s just me, maybe I’m lazier than the average dev, but I often find myself using the Script Table As menu and choosing SELECT To and Clipboard.  This creates a nice select statement with all my fields wrapped in hard brackets.  I can then copy this into an INSERT query I might be working on to save myself some typing.  I can quickly copy the field list from the Query ‘Script Table As’ gives me and use it in the top of my INSERT query, then I can copy the entire SELECT query into the bottom of my INSERT query and Bob’s yer Uncle, I’ve got a simple INSERT query ready to go.  Note:  This is most useful when I’m trying to create a new table based on an existing table with only minor changes to field names.  I use this frequently when I’m establishing a reporting database based on staging tables.

That’s three ways to do it in Management Studio; the next step in the process is using SMO to script using a .NET language (C#, F#, PowerShell).
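
The INSERT trick in the quoted passage boils down to reusing one bracketed field list twice: once at the top of the INSERT and once in the SELECT underneath. Here is a minimal Python sketch of that assembly. It does not use SSMS or SMO; the function names, table names, and column names are hypothetical, and the bracket quoting only mimics what the Script Table As output looks like.

```python
def quote_name(identifier: str) -> str:
    """Wrap an identifier in square brackets, doubling any closing bracket."""
    return "[" + identifier.replace("]", "]]") + "]"


def build_insert_select(target_table: str, source_table: str, columns: list) -> str:
    """Assemble an INSERT ... SELECT that reuses a single bracketed field list."""
    column_list = ",\n    ".join(quote_name(c) for c in columns)
    return (
        f"INSERT INTO {target_table}\n(\n    {column_list}\n)\n"
        f"SELECT\n    {column_list}\nFROM {source_table};"
    )


# Hypothetical staging and reporting tables, already bracket-quoted by the caller.
statement = build_insert_select(
    "[dbo].[ReportCustomer]",
    "[stg].[Customer]",
    ["CustomerID", "First Name", "City"],
)
print(statement)
```

When the new table renames a few fields, only the top copy of the list needs editing, which is the typing the quoted tip is trying to save.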
