Author: Kevin Feasel

Explaining RBAR

Published 2016-08-18 by Kevin Feasel

Kenneth Fisher explains RBAR with the help of an animated GIF:

So 23 milliseconds for the batch version and 850 milliseconds for RBAR. What a difference.

Now in this case the code for the RBAR is also a lot more complicated. But that isn’t always the case. It also isn’t always the case that RBAR is slower. But it’s almost always a lot slower than batch.

So, while the code for RBAR is often easier to write, even though it might be physically longer, it’s probably going to be slower too.

Well-written, set-based solutions aren’t always guaranteed to be faster, but that’s one of the safest bets to make with T-SQL.
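
To make the contrast concrete, here is a minimal sketch of the same update done row by row and as a single set-based statement. It uses Python's built-in sqlite3 module rather than T-SQL, and the table t, its columns, and the row count are all made up for illustration; it is not Kenneth's test and says nothing about his timings.

```python
import sqlite3

def make_db(n=1000):
    # Hypothetical table: n rows where val starts out equal to id.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val INTEGER)")
    conn.executemany(
        "INSERT INTO t (id, val) VALUES (?, ?)",
        ((i, i) for i in range(n)),
    )
    return conn

def update_rbar(conn):
    # Row by agonizing row: read every key, then issue one UPDATE per row.
    ids = [row[0] for row in conn.execute("SELECT id FROM t")]
    for row_id in ids:
        conn.execute("UPDATE t SET val = val * 2 WHERE id = ?", (row_id,))
    conn.commit()

def update_set_based(conn):
    # Set-based: one statement describes the change for all rows at once.
    conn.execute("UPDATE t SET val = val * 2")
    conn.commit()

def total(conn):
    return conn.execute("SELECT SUM(val) FROM t").fetchone()[0]

rbar_db, set_db = make_db(), make_db()
update_rbar(rbar_db)
update_set_based(set_db)
```

Both databases end up in the same state; the difference is that the first version makes one round trip per row while the second hands the whole job to the engine.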


Azure SQL Database Alerts

Published 2016-08-18 by Kevin Feasel

Julie Koesmarno shows how to set up Azure SQL Database alerts:

Over the last year, I have been intentionally seeking out feedback from the community via various SQL events, particularly from those who plan to use or are currently using Azure SQL Database. A lot of questions have come up about managing Azure SQL Database better – i.e. being more proactive and more responsive in managing Azure SQL Database. One of the ways to be more proactive about your SQL Database is by setting up alerts. As an example, you can create an alert in case DTU goes above 95% – say in the last 5 minutes, so that you can either investigate why this might be or upgrade it to a higher SKU.

This article walks through how you can set up an alert on Azure SQL DB.

I really like the fact that they offer web hooks; that way, I can integrate these alerts with Slack or other messaging systems.
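
As a rough sketch of that integration, the function below turns an alert notification into the JSON body a Slack incoming webhook accepts (an object with a text field). The shape of the alert dictionary, including every field name in it, is an assumption made up for illustration and is not the actual Azure payload; the alert name and metric name are likewise hypothetical. Nothing is sent over the network here.

```python
import json

def alert_to_slack_payload(alert):
    # Field names below are hypothetical; map them to the real alert payload.
    ctx = alert["context"]
    text = (
        f"[{alert['status']}] {ctx['name']}: {ctx['metric']} above "
        f"{ctx['threshold']}% for the last {ctx['window_minutes']} minutes"
    )
    # Slack incoming webhooks accept a JSON object with a "text" field.
    return json.dumps({"text": text})

# Made-up notification mirroring the "DTU above 95% in the last 5 minutes" example.
sample_alert = {
    "status": "Activated",
    "context": {
        "name": "high-dtu",
        "metric": "dtu_consumption_percent",
        "threshold": 95,
        "window_minutes": 5,
    },
}

payload = alert_to_slack_payload(sample_alert)
```

A small relay service would receive the alert's web hook call, run something like this, and POST the result to the Slack webhook URL.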


Database Project Basics

Published 2016-08-18 by Kevin Feasel

James Anderson gives a basic overview of database projects within Visual Studio:

SSDT is a VS plugin that can script out a database into individual files so that you can use a VCS (I use Git) to version control them. Once those scripts are in my Git repo, I can use it as the single source of truth to generate my releases from. This is the basis of getting our databases into our CI process. ReadyRoll will be used to further improve this process and to add our migration/upgrade scripts to our repo. SSDT is required by ReadyRoll and can be found here.

Before we can start with ReadyRoll, we need to learn some Visual Studio basics.

I’ve used database projects for the better part of a decade. They aren’t perfect but in most environments, they’re quite helpful…if other people use them as well…


PySpark With MapR

Published 2016-08-17 by Kevin Feasel

Justin Brandenburg has a tutorial on combining Python and Spark on the MapR platform:

Looking at the first 5 records of the RDD

kddcup_data.take(5)

This output is difficult to read. This is because we are asking PySpark to show us data that is in the RDD format. PySpark has DataFrame functionality. If the Python version is 2.7 or higher, you can utilize the pandas package. However, pandas doesn’t work on Python version 2.6, so we use the Spark SQL functionality to create DataFrames for exploration.

The full example is a fairly simple k-means clustering process, which is a great introduction to PySpark.


Over-Engineering

Published 2016-08-17 by Kevin Feasel

Dave Copeland discusses over-engineering problems:

The main problem with an over-engineered solution is that it takes longer to ship than is necessary. By definition, we are doing more than is necessary, and that will take longer to ship. There’s almost never a reason to prefer longer ship-times over shorter ones, all things being equal.

The more serious problem with over-engineering is the carry cost.

A carrying cost is a cost the team bears for having to maintain software and infrastructure. Each feature requires tests, monitoring, and maintenance. Each new feature is made in the context of those that came before it. This is why a feature that might’ve taken one week when the project was new requires a month to make in a more mature project.

Read the whole thing and simplify your solutions.


Solr Lock Contention

Published 2016-08-17 by Kevin Feasel

Michael Sun shows how the Apache Solr team found and fixed a performance issue in their code:

Based on this testing, lock contention, which usually results in a performance bottleneck and underutilized resources, was our first “suspect.” We knew that using a commercial Java profiler, such as Yourkit, JProfiler and Java Flight Recorder, would help easily identify locks and determine how much time threads spend waiting on them. Meanwhile, the team had built custom infrastructure that allows one to run experiments with a profiler attached via a single command-line parameter.

In my own testing, the profiler data indeed revealed some contention, particularly related to VersionBucket and HdfsUpdateLog locks, leading to long thread wait time. Although, promisingly, this result corresponded somewhat to the description in SOLR-6820, nothing actionable resulted from the experiment.

I like these sorts of case studies because example is the school of mankind. In this particular case, I really like the methodical approach, using available information to search for a root cause. Some of the things Michael calls “false starts” I would consider to be initial steps: checking OS, filesystem, and garbage collection metrics are important even in a case like this in which they did not lead to the culprit, as they help you eliminate suspects.


Breaking Out URLs With M

Published 2016-08-17 by Kevin Feasel

Chris Webb shows the RelativePath and Query parameters of Web.Contents in M:

Generates a call that returns 20 results, rather than the default 10:

https://data.gov.uk/api/3/action/package_search?q=cows&rows=20

Obviously these options make it easier to construct urls and the code is much clearer, but there are also other benefits to using these options which I’ll cover in another blog post soon.

This makes for more maintainable, dynamic URL generation. Think about an internal product dashboard, where you might need to make API calls to pull in data by product (or maybe you want to send people to an external link for each product). This can help you parameterize your URLs quite easily.
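
The same idea of keeping the base URL, the relative path, and the query parameters separate can be sketched outside of M. The Python below uses only the standard library; the helper name build_url and the choice of where to split the base from the relative path are assumptions for illustration, not Chris's code.

```python
from urllib.parse import urlencode, urljoin

def build_url(base, relative_path, query=None):
    # Join the fixed base with the part that varies per call.
    url = urljoin(base, relative_path)
    # Encode query parameters from a dict instead of concatenating strings.
    return f"{url}?{urlencode(query)}" if query else url

search_url = build_url(
    "https://data.gov.uk/api/3/action/",   # assumed split point for the base
    "package_search",
    {"q": "cows", "rows": 20},
)
print(search_url)
```

Swapping in a different search term or row count is then a matter of changing a dictionary value, and the encoding of special characters is handled for you.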


SQLPS Is Dead; Long Live SQLPS

Published 2016-08-17 by Kevin Feasel

Mike Fal thought he had escaped his SQLPS nightmare:

The second issue is that even if you do install SSMS 2016, SQL Agent won’t recognize and give you access to the new module if you use a PowerShell job step. When you create a PowerShell job step, the script in that job step runs within a specific context. It’s hidden from you, but whenever that script runs the first thing that happens is SQL Server launches sqlps.exe.

Check out the links Mike provides to Connect items and the Trello board if you want to see the issues he brought up fixed.


Troubleshooting Parameter Sniffing

Published 2016-08-17 by Kevin Feasel

Brent Ozar has a guide on troubleshooting parameter sniffing:

Parameter sniffing fixes are based on your career progression with databases, and they go like this:

1. Reboot the server! – Junior folks panic and freak out, and just restart the server. Sure enough, that erases all cached execution plans. As soon as the box comes back up, they run rpt_Sales for China because that’s the one that was having problems. Because it’s called first, it gets a great plan for big data – and the junior admin believes they’ve fixed the problem.

2. Restart the SQL Server instance – Eventually, as these folks’ careers progress, they realize they can’t go rebooting Windows all the time, so they try this instead. It has the same effect.

If a reboot can’t fix the problem, I’m out of ideas…

By the way, I second Brent’s recommendation of Erland’s query plan article. Erland doesn’t publish frequently, but when he does it’s worth the wait.
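
To see why a restart appears to fix things, here is a deliberately simplified toy model of a plan cache in Python. It is not how SQL Server works internally: the row counts, the 10,000-row cutoff, the "seek"/"scan" labels, and the Luxembourg example are all invented for illustration. Only rpt_Sales and China come from Brent's excerpt.

```python
# Hypothetical data sizes per country.
ROW_COUNTS = {"China": 1_000_000, "Luxembourg": 50}

class ToyPlanCache:
    def __init__(self):
        self._plans = {}

    def execute(self, proc, country):
        # Compile only on the first call; the plan is shaped by whatever
        # parameter value happened to arrive first, then reused for everyone.
        if proc not in self._plans:
            rows = ROW_COUNTS[country]
            self._plans[proc] = "scan" if rows > 10_000 else "seek"
        return self._plans[proc]

    def restart(self):
        # Restarting the instance (or rebooting the box) empties the cache.
        self._plans.clear()

cache = ToyPlanCache()
first = cache.execute("rpt_Sales", "Luxembourg")    # small-data plan gets cached
sniffed = cache.execute("rpt_Sales", "China")       # big data reuses the small plan
cache.restart()
after_restart = cache.execute("rpt_Sales", "China") # recompiled for big data
```

In this model the restart only helps because China happened to be called first afterward; call the small country first again and the problem comes right back.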


Learning JSON

Published 2016-08-17 by Kevin Feasel

Jason Brimhall wants to learn a bit of JSON:

Let’s just get this out there right now – I suck at JSON. I suck at XML. The idea of querying a non-normalized document to get the data is not very endearing to me. It is for that reason that I have written utilities or scripts to help generate my XML shredding scripts – as can be seen here.

Knowing that I have this allergy to features similar to XML, I need to build up some resistance to the allergy through a little learning and a little practice. Based on that, my plan is pretty simple:

Read up on JSON
Find some tutorials on JSON
Practice using the feature
Potentially do something destructive with JSON

I’m not particularly excited about JSON support in SQL Server 2016, but the fact that it is there, combined with the fact that so many developers love JSON, means that it’s a good idea to learn how to integrate, if only to figure out when it’s a bad idea to parse JSON within your very expensive SQL Server instances.

