Curated SQL Posts

Windows Server Failover Clustering Error Code 5054

Josh Darnell walks us through an error when setting up an Availability Group:

For setting up the environment, I was following this really in-depth guide from former Data Platform MVP and current Microsoft employee Ryan J. Adams: Build a SQL Cluster Lab Part 1

The guide is generally fantastic, and provides a lot of good insight into the non-SQL Server related aspects of setting up an Availability Group. I’d highly recommend checking it out if you’re interested in that sort of thing.

Relevant to this post, he has provided a diagram of how the different networks are configured:

If you’re very experienced with networking, you may already have some idea of what the problem is going to be. Don’t spoil it for everyone else, okay?

I’ll admit I did not have an idea of what the problem was.

HammerDB CLI for Oracle Running on Azure

Kellyn Pot’vin-Gorman goes through a rough experience:

Disclaimer: I’m not a big fan of benchmark data. I find it doesn’t provide us as much value in the real world as we’d like to think it does. As Cary Millsap says, “You can’t hardware your way out of a software problem” and I find that many folks think that if they just get the fastest hardware, their software problems will go away and this just isn’t true. Sooner or later, it’s going to catch up with you, and it rarely tells you what your real database workload needs to run most efficiently or what might be running in your database that could easily be optimized to save the business time and money.

The second issue is that when comparing different workloads or even worse, different platforms or applications, using the same configuration can be detrimental to the benchmarks collected, which is what we’ll discover in this post.

That said, Kellyn dives into the problem and documents several of the issues in building out this test.

The Pain of OR Clauses

Erik Darling wants you to embrace the healing power of AND:

This is one of my least favorite query patterns, because even with appropriate indexes, performance often isn’t very good without additional interventions.

Without indexes in place, or when “indexes aren’t used”, then the query plans will often look like one of these.

Maybe not always, but they are pretty common.

It’s something that I do wish the optimizer could be smarter about. One important thing to note in Erik’s demo: the OR clause is on two different columns, so SELECT x.Col1 FROM dbo.TblX x WHERE x.ID = 8 OR x.ID = 7 works fine, but WHERE x.ID = 8 OR x.SomethingElse = 14 is liable to cause performance issues on a large enough table.
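One commonly used workaround for an OR across two different columns is to split the query into two single-column searches and combine them with UNION. The sketch below uses Python's built-in sqlite3 module purely as a stand-in to show that the two forms return the same rows. The table and column names come from the example above; the data and the index name are made up, and nothing here reflects SQL Server's plan choices or the specific fixes Erik demonstrates.

```python
import sqlite3

# In-memory SQLite database standing in for SQL Server (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE TblX (ID INTEGER PRIMARY KEY, SomethingElse INTEGER, Col1 TEXT)"
)
# Hypothetical index so each half of the rewrite has its own access path.
conn.execute("CREATE INDEX IX_TblX_SomethingElse ON TblX (SomethingElse)")
conn.executemany(
    "INSERT INTO TblX (ID, SomethingElse, Col1) VALUES (?, ?, ?)",
    [(i, i % 20, f"row{i}") for i in range(1, 101)],
)

# Original shape: OR across two different columns.
or_query = """
SELECT x.ID, x.Col1 FROM TblX x
WHERE x.ID = 8 OR x.SomethingElse = 14
"""

# Rewrite: one search per column, combined with UNION.
# UNION removes duplicates, so the key column (ID) is selected to make sure
# only a row matching both predicates is collapsed, not two distinct rows.
union_query = """
SELECT x.ID, x.Col1 FROM TblX x WHERE x.ID = 8
UNION
SELECT x.ID, x.Col1 FROM TblX x WHERE x.SomethingElse = 14
"""

or_rows = sorted(conn.execute(or_query).fetchall())
union_rows = sorted(conn.execute(union_query).fetchall())
print(or_rows == union_rows)
```

Whether the rewrite actually helps depends on the indexes and data involved, so it is worth comparing the plans for both forms on the real table.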

Restoring a TDE Database to a Different Server

Chad Callihan reminds us of the other half of backups:

You set up TDE for your database. Great! Now you need to restore that database to a different server. Maybe you’re migrating off of an older server or maybe there is testing to be completed for an upcoming release. If we try to backup/restore or attach the database on a different server we’ll get some variation of an error stating “cannot find server certificate with thumbprint…” and the process will fail. There are a few steps to get the TDE database restored and we’re going to walk through them today.

Read on for those instructions.
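As a rough outline of what this kind of restore usually involves, the Python sketch below builds the T-SQL for backing up the TDE certificate on the source server and recreating it on the destination before restoring. It is not taken from Chad's post: the certificate name, file paths, passwords, and database name are all hypothetical placeholders, and the destination only needs CREATE MASTER KEY if its master database does not already have one.

```python
def q(text):
    """Quote a value as a T-SQL Unicode string literal, doubling single quotes."""
    return "N'" + text.replace("'", "''") + "'"


def source_server_sql(cert_name, cert_file, key_file, key_password):
    """T-SQL to export the certificate and its private key from the source server."""
    return "\n".join([
        "USE master;",
        f"BACKUP CERTIFICATE [{cert_name}] TO FILE = {q(cert_file)}",
        f"    WITH PRIVATE KEY (FILE = {q(key_file)},",
        f"                      ENCRYPTION BY PASSWORD = {q(key_password)});",
    ])


def destination_server_sql(cert_name, cert_file, key_file, key_password,
                           master_key_password, db_name, backup_file):
    """T-SQL to recreate the certificate on the destination, then restore."""
    return "\n".join([
        "USE master;",
        "-- Skip this statement if master already has a database master key.",
        f"CREATE MASTER KEY ENCRYPTION BY PASSWORD = {q(master_key_password)};",
        f"CREATE CERTIFICATE [{cert_name}] FROM FILE = {q(cert_file)}",
        f"    WITH PRIVATE KEY (FILE = {q(key_file)},",
        f"                      DECRYPTION BY PASSWORD = {q(key_password)});",
        "-- A real restore may also need WITH MOVE to relocate the data files.",
        f"RESTORE DATABASE [{db_name}] FROM DISK = {q(backup_file)};",
    ])


# All values below are hypothetical placeholders.
src = source_server_sql(
    "TDECert", r"C:\Backup\TDECert.cer", r"C:\Backup\TDECert.pvk", "KeyFilePassword!1"
)
dst = destination_server_sql(
    "TDECert", r"C:\Backup\TDECert.cer", r"C:\Backup\TDECert.pvk", "KeyFilePassword!1",
    "MasterKeyPassword!1", "SalesDb", r"C:\Backup\SalesDb.bak",
)
print(src)
print(dst)
```

The ordering is the point: the certificate has to exist on the destination before the restore runs, otherwise the thumbprint error quoted above is what comes back.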

Finding Spinlock Owners in Dump Files

Forrest McDaniel leaves us with a head injury:

Now, I may not be an expert, but sqlmin!Spinlock sounds like…a spinlock. This thread has been spinning for over a minute, never returning to a waiting state, because something else is holding the spinlock resource.

Thankfully, helpful friends alerted me to a blog that revealed the value of an acquired spinlock “is the Windows thread ID of the owner.” Meaning I might be able to find the cause.

Read the whole thing.

IDENTITY Overflow in SSIS

Alex Stuart hits a weird error:

Conversion/overflow errors aren’t that unusual – normally a data flow broken by some unexpected data (“no, there’s no chance that field would ever have a character in it”), or perhaps a column hitting max size (“INT will be enough for years, like, 5 years. I’ll have left the company by then”)

But that wasn’t the case here – the package and user tables involved were checked by the dev team and there was no possible overflow. I’d checked system databases for maxed-out identity columns and found nothing. Heads were scratched.

Read on for the post-head-scratch answer.
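Checking for maxed-out identity columns comes down to comparing each column's current identity value against the ceiling for its data type. Here is a minimal Python sketch of that comparison. The limits are SQL Server's integer type maximums; the column names, values, and the 90% threshold are hypothetical, and the sketch assumes a positive increment. In practice the current values could come from a catalog view such as sys.identity_columns (its last_value column).

```python
# Upper bounds of SQL Server's integer types.
MAX_VALUES = {
    "tinyint": 255,
    "smallint": 2**15 - 1,
    "int": 2**31 - 1,
    "bigint": 2**63 - 1,
}


def identity_usage(data_type, last_value):
    """Fraction of the positive range already consumed by the identity column."""
    return last_value / MAX_VALUES[data_type]


def near_overflow(columns, threshold=0.9):
    """Return the names of columns whose usage is at or above the threshold."""
    return [
        name
        for name, data_type, last_value in columns
        if identity_usage(data_type, last_value) >= threshold
    ]


# Hypothetical sample: (column name, data type, last identity value).
sample = [
    ("dbo.Orders.OrderID", "int", 2_100_000_000),
    ("dbo.Log.LogID", "bigint", 5_000_000_000),
    ("dbo.Status.StatusID", "tinyint", 12),
]

flagged = near_overflow(sample)
print(flagged)
```

A check like this only covers user and system tables you know to look at, which is part of why Alex's case took some head-scratching.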

Automating Monitoring

Thomas Williams has a multi-part series. Part 1 is a big picture primer:

It would be nice if every problem, now and in the future, was automatically monitored by an intuitive, cheap, flexible tool that only raised the alarm when there was a real problem, at the right time, to the right people.

However, in reality, monitoring is a game of whack-a-mole, built on an ad-hoc collection of tools and technologies, requiring know-how to interpret the often puzzling situations which may be high priority – or just background noise.

Part 2 continues the big picture look:

Monitoring is here to stay. There’s never enough disk space, memory, CPU, network throughput. If they ever existed, perfect conditions don’t persist, whether because of a downstream system failing, newly-discovered security issue, change in process, extraordinary amount of load etc. And, systems grow and change to meet new requirements, so yesterday’s monitoring may not meet tomorrow’s uptime goals.

Following on from part 1, here are some further “big picture” considerations for automated monitoring for the DBA:

Part 3 gets to ideas on instrumentation:

With some of the broader ideas out of the way, if there’s no monitoring in place and I was the “accidental”/“default” DBA, there are some metrics I’d definitely want to monitor. However, there’s a ton of real-time metrics I haven’t included below like CPU & memory use, number of current connections, disk I/O as at right now. They’re probably not good candidates for “roll-your-own”-type monitoring.

I consider “alerts” in the table below to include an e-mail, SMS or dashboard item that aims to trigger a response to fix. The collection frequency especially is just a guide – closer to real-time might be desirable, but the trade-off is impact on servers and connected apps.

Check out all three posts so far in the series.

Ordering and Sorting Data in Spark

Landon Robinson shows how to sort data in Spark RDDs and DataFrames:

In the analysis section of Spark Starter Guide 4.6: How to Aggregate Data, we asked these questions: “Who is the youngest cat in the data? Who is the oldest?”

Let’s use ordering in Spark as an alternative method to answer those same questions, and achieve the same result. Specifically, let’s again find the youngest and oldest cats in the data.

Click through for plenty of examples.
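The idea behind answering "youngest and oldest" with ordering is simply to sort by age and take the first and last rows. The snippet below shows that idea in plain Python rather than Spark, so it runs without a cluster; the cat names and ages are made up and are not Landon's data. In Spark the sort itself would be done with something like DataFrame.orderBy or RDD.sortBy.

```python
# Hypothetical (name, age) records standing in for the cat data.
cats = [
    ("Fluffy", 3),
    ("Mittens", 12),
    ("Tiger", 1),
    ("Shadow", 7),
]

# Sort ascending by age; the first row is the youngest, the last is the oldest.
by_age = sorted(cats, key=lambda cat: cat[1])
youngest = by_age[0]
oldest = by_age[-1]

print("youngest:", youngest)
print("oldest:", oldest)
```

Sorting the whole dataset to read two rows costs more than a min/max aggregate, which is one reason the post treats ordering as an alternative rather than a replacement.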

Time Series Data in PostgreSQL

Michael Grogan has a few examples of working with time series data in PostgreSQL:

Tools such as Python or R are most often used to conduct deep time series analysis.

However, knowledge of how to work with time series data using SQL is essential, particularly when working with very large datasets or data that is constantly being updated.

Here are some useful commands that can be invoked in SQL to better work with time series data within the data table itself.

Click through for examples like using a window function to calculate moving averages and using time zones. H/T Mark Hutchinson.
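To give a flavor of the window function approach, here is a small sketch of a three-row moving average. It runs against SQLite through Python's built-in sqlite3 module as a stand-in for PostgreSQL (this needs SQLite 3.25 or later for window function support); the AVG(...) OVER (...) clause is written the same way in PostgreSQL. The table name, column names, and values are hypothetical and not from Michael's post.

```python
import sqlite3

# In-memory SQLite database standing in for PostgreSQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (day TEXT, value REAL)")
conn.executemany(
    "INSERT INTO readings (day, value) VALUES (?, ?)",
    [
        ("2021-01-01", 10.0),
        ("2021-01-02", 20.0),
        ("2021-01-03", 30.0),
        ("2021-01-04", 40.0),
        ("2021-01-05", 50.0),
    ],
)

# Average of the current row and the two rows before it, ordered by day.
# The first two rows average over fewer values because the frame is shorter.
rows = conn.execute(
    """
    SELECT day,
           value,
           AVG(value) OVER (
               ORDER BY day
               ROWS BETWEEN 2 PRECEDING AND CURRENT ROW
           ) AS moving_avg
    FROM readings
    ORDER BY day
    """
).fetchall()

moving_avgs = [row[2] for row in rows]
for day, value, avg in rows:
    print(day, value, avg)
```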
