Press "Enter" to skip to content

Month: February 2018

Flotilla: Containerized Task Execution

Akshay Wadia, et al., introduce a new open source tool:

Data scientists are not always equipped with the requisite engineering skills to deploy robust code to a production job execution and scheduling system. Yet, forcing reliance on data platform engineers will impede the scientists’ autonomy. If only there was another way.

Today we’re excited to introduce Flotilla, our latest open source project. Flotilla is a human friendly service for task execution. It allows you to focus on the work you’re doing rather than how to do it. In other words, Flotilla takes the struggle out of defining and running containerized jobs.

It looks like an interesting service.


dbatools Making Extended Events Easier

Chrissy LeMaire solves a bunch of common problems with Extended Events:

So why do people keep using traces? We compiled a list of reasons from Erin Stellato’s Why do YOU avoid Extended Events. And this list is LONG!

  • Traces are straightforward and less complex than Extended Events
    Totally seems that way!
  • Traces provide a consistent interface for mixed environments
    Whether you use SQL Server 7 or SQL Server 2017, the interface is pretty much the same.
  • Traces are faster to set up for quick traces
    Just open up Profiler, connect to a server, click a few times and you’re set.
  • “Extended Events are more efficient for the SQL Server engine, but not more efficient for the DBA”
    Love this quote.
  • People already have a library of Profiler templates
    Including me
  • Ignorance of XML / Querying all the generated XML is outrageous
    When I first saw what it takes to query Extended Events, I bailed immediately. I am not learning XPATH, ever.
  • Templates work remotely across all instances
    This is also true for Extended Events, but the commenter did not know that.
  • XEvents are persistent and must be stopped manually
    Once you close Profiler or restart SQL Server, all non-default traces will disappear. Extended Events will persist until you delete them.
  • Ability to import PerfMon data and look at Trace and PerfMon counter data at the same time
    Most people that use Profiler don’t seem to know about this feature but those who do LOVE it. You can read more at Brad McGehee’s Correlating SQL Server Profiler with Performance Monitor.

    Microsoft reportedly has no plans to provide this functionality.

  • Consistent user experience across SSAS and Database Engine
    Gotta take their word, I don’t use SSAS.
  • It’s easy to train others to use Profiler
    Imagine – if it’s easier to learn Profiler, it’ll be far easier to teach.
  • Traces can be easily replayed
    There are a number of Microsoft tools to replay traces, but none to replay Extended Events.
  • MS Premier Support still asks for traces
    Likely because they also have tools that they want to work across all supported versions, which still includes SQL Server 2008 R2.
  • xe_file_target_read_file is a CPU hog
    This wasn’t listed on Erin’s page but was told to me while I was performing my research.

Whewf! That’s a lot of compelling reasons not to make the switch. So let’s see how we can address each of them using PowerShell. All code listed here can be found at sqlps.io/xecode.

Chrissy & co can’t solve all of them, but they solve a majority.  Click through to see how.


Temp Tables In Redshift

Derik Hammer has some notes on temporary tables in Amazon Redshift:

One difference between regular tables and temporary tables is how they are typically used. Temporary tables are session scoped, which means that adding them into a process or report will probably cause them to be created multiple times. Temporary tables might be very similar to regular tables, but most regular tables are not re-written into every time they are queried.

The disk writes involved in populating the temporary table might be more expensive than the reads would be if you were to modify your query to include the logic in one, larger query. The frequency of the report or process will be a factor in how much of a performance hit you get by using the temporary tables. If you are using temporary tables to make debugging a procedure easier or to enhance readability, make sure you understand the IO cost of performing writes and then reading that data back into a subsequent query.

Read on for more.


Query Store And CU4

Erin Stellato notes that if you ran SQL Server 2017 CU2 with Query Store, you want to upgrade and run a special script:

There is a Query Store fix in the latest Cumulative Updates of SQL Server 2017, and if you are keeping current with the latest cumulative updates, there’s a script you need to run if you’ve installed CU2 at any point.

First, if you are still on CU2, you might want to consider upgrading to CU4 (released February 17, 2018).

Second, if you previously had CU2 installed (and are now on CU3 or CU4), you want to make sure that you’ve run the script included with CU3 and CU4 release notes.

The script removes plans from Query Store that were captured when running CU2.

Read the whole thing and keep those servers patched.
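
If you are curious how much cleanup the script has in store, a hedged sketch like the one below counts the Query Store plans compiled on the CU2 engine. This is not the Microsoft-supplied script from the CU notes; the instance and database names are placeholders, and 14.0.3008.27 is the SQL Server 2017 CU2 build number as I recall it, so verify it against your own build list.

# Requires the SqlServer module: Install-Module SqlServer
Invoke-Sqlcmd -ServerInstance 'SQL2017A' -Database 'YourDatabase' -Query @"
SELECT COUNT(*) AS plans_compiled_on_cu2
FROM sys.query_store_plan
WHERE engine_version = N'14.0.3008.27';  -- SQL Server 2017 CU2 build (verify before relying on it)
"@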


Replication Update: Dynamic Parameter Loading

Sourabh Agarwal announces a new feature with SQL Server replication:

When a new agent user profile is created and assigned to an agent, relevant entries are added to the [dbo].[MSagent_profiles] and the [dbo].[MSagent_parameters] tables in the MSDB database on the Distributor server. The values in the tables are updated whenever a parameter value is updated for the profile. These values are reloaded from the tables whenever a replication agent is restarted. As part of the improvement, a new profile parameter, “ProfileReloadInterval”, with a default value of 300 seconds, was added to the profiles. This parameter dictates the interval at which the agent reloads the values from the tables. If using the default value, the agent will try to reload the values from the MSDB tables every 5 minutes. If it finds that any of the above-mentioned parameter values have changed, the new values will be used by the agents.

I don’t work with replication enough to really see the benefit, but I’m sure somebody is giddy about this.
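
If you are curious what the agents actually reload, the two msdb tables named in the post are easy to inspect on the distributor. A minimal sketch, assuming the SqlServer module's Invoke-Sqlcmd and a placeholder distributor name:

# Show every agent profile and its parameter values, including ProfileReloadInterval where set
Invoke-Sqlcmd -ServerInstance 'DIST01' -Database 'msdb' -Query @"
SELECT pr.profile_name,
       pa.parameter_name,
       pa.value
FROM dbo.MSagent_profiles AS pr
JOIN dbo.MSagent_parameters AS pa
    ON pa.profile_id = pr.profile_id
ORDER BY pr.profile_name, pa.parameter_name;
"@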


Special Commands In mssql-cli

Alan Yu announces an update to mssql-cli:

Special commands are meant to make your life easier. They are shortcuts to perform common tasks and queries. All special commands start with a backslash (\), and you can use the built-in IntelliSense to see a list of special commands you can use. You can also learn more by running the following command:

$ mssql-cli>\?

To start off, let’s say you have a common query you run often, but don’t want to type it out each time or scroll through your history. You can “save” your query by running the following special command:

$ mssql-cli>\sn address select * from "Person"."AddressType"

To execute this query, simply run:

$ mssql-cli>\n address

Click through to learn more and see it in action.


Kerberos And SQLCMD

Kenneth Fisher notes a behavior change for SQLCMD with SQL Server 2014 and later:

This error is seen when using a Windows Authenticated id and NTLM or an untrusted Kerberos connection. From what I understand, with Kerberos the client (SQL in this case) gets a ticket containing your credentials; it can then pass this ticket on to other servers that trust it to have authenticated you, to be sure you are who you say you are. Note the mention of trust. This means that even if you are using Kerberos, if the two servers aren’t in a trusted relationship, then you will still see the error. And NTLM doesn’t have the option to do this at all (again, not sure why).

At this point it’s important to note that this is not a SQL Server error. You will see this error in other places where you bounce across multiple machines and aren’t using Kerberos with a trusted relationship. I’ve seen it when trying to access files before (it was a weird situation admittedly).

Read on to learn about the ramifications of this behavioral change and how you can prevent double-hop problems when running newer versions of SQLCMD.
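
A quick way to see which scheme a given connection actually negotiated is to ask the server from that connection. A minimal sketch, with a placeholder server name; the DMV query itself is standard:

# -E uses Windows Authentication; auth_scheme comes back as NTLM or KERBEROS
sqlcmd -S "SQL2017A" -E -Q "SELECT auth_scheme FROM sys.dm_exec_connections WHERE session_id = @@SPID;"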


Using Telegraf To Display SQL Server Metrics In Grafana

Tracy Boggiano has a writeup showing how to use Telegraf + InfluxDB + Grafana to view SQL Server metrics:

We have in the middle an open source time series database called InfluxDB, which is designed for collecting timestamped data such as performance metrics. Into that, we feed data from an open source project called Telegraf, which can feed in more than just SQL Server statistics. And to be able to show us the data in nice pretty graphs that we can manipulate, drill down on, and even set up alerts on, we display it using Grafana. You will find links to all of these products as we go through the setup of the solution.

Tracy’s post is dedicated to installation and configuration more than defining metrics, but it does get you on the road to custom metrics visualization.
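
One small piece of that setup worth calling out: the Telegraf sqlserver input plugin connects with a monitoring login, and its documentation (as I recall it) suggests granting that login VIEW SERVER STATE and VIEW ANY DEFINITION. A hedged sketch, with placeholder instance, login, and password:

# Create a least-privilege login for Telegraf to poll the DMVs
Invoke-Sqlcmd -ServerInstance 'SQL2017A' -Database 'master' -Query @"
CREATE LOGIN telegraf WITH PASSWORD = N'<StrongPasswordHere>', CHECK_POLICY = ON;
GRANT VIEW SERVER STATE TO telegraf;
GRANT VIEW ANY DEFINITION TO telegraf;
"@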


Parsing CSVs With Logstash

Mike Hillwig continues his Logstash series by reading in a CSV:

As I was writing this, I thought I’d play with the autodetect_column_names setting. Unfortunately, it wasn’t an option for this particular file. Logstash threw an error :exception=>java.lang.ArrayIndexOutOfBoundsException: -1 which leads me to guess that my file is too wide for this setting. This file is staggeringly wide with 75 columns. If you have a more narrow file, this could be a really cool option. If your file format changes by someone adding or removing a column from the CSV, it’ll be a lot easier to maintain. Alas, it’s not an option in this situation.

Check out the script.


Splitting A Large Script With PowerShell

Jana Sattainathan shows how to split a file with a large number of independent operations and have PowerShell run the job in batches:

Every day is an interesting day in the life of a DBA. Today, I received a request to run in a HUGE script. This script had 125k INSERT statements, each on a separate line. Just opening the script was an issue, let alone running it into a database. You get this error in SSMS just trying to open it: “The operation could not be completed. Not enough storage is available to complete this operation.”

This is how I handled it:

  1. I split the file into manageable 1,000-line files (125 files in total)
  2. Looped through each split file and ran it in!

Yes, it was that simple. Thanks to PowerShell!

Read on to see how simple it is.
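
The general pattern is easy to reproduce. Here is a rough sketch of the approach rather than Jana's actual code; the paths, instance, and database names are placeholders, and it assumes each INSERT really is an independent, single-line statement:

# Requires the SqlServer module for Invoke-Sqlcmd: Install-Module SqlServer
$source    = 'C:\temp\huge_inserts.sql'   # the 125k-line script
$chunkSize = 1000
$i         = 0

# 1. Split the file into 1,000-line chunk files
Get-Content -Path $source -ReadCount $chunkSize | ForEach-Object {
    $i++
    $_ | Set-Content -Path ('C:\temp\chunk_{0:D3}.sql' -f $i)
}

# 2. Loop through the chunk files and run each one in order
Get-ChildItem -Path 'C:\temp\chunk_*.sql' | Sort-Object Name | ForEach-Object {
    Invoke-Sqlcmd -ServerInstance 'SQL2017A' -Database 'TargetDb' -InputFile $_.FullName
}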
