
Curated SQL Posts

SOS_Mutex

Ewald Cress continues his dive into system internals, this time looking at SOS_Mutex:

Put differently, we can build a mutex from an auto-reset EventInternal by tacking on an owner attribute, making a rule that only the owner has the right to signal the event, and adding assignment of ownership as a fringe benefit of a successful wait. A nonsignalled event means an acquired mutex, and a signalled event means that the next acquisition attempt will succeed without waiting, since nobody currently owns the mutex. The end result is that our SOS_Mutex class exposes the underlying event’s Signal() method and its own take on Wait(). From the viewpoint of the mutex consumer, the result of a successful wait is that it owns the mutex, and it should act honourably by calling Signal() as soon as it is done using the resource that the mutex stands guard over.

There’s some deep detail here, so this is definitely one of those “after your first cup of coffee” posts to read.
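To make the quoted idea concrete, here is a minimal Python sketch of a mutex built from an auto-reset event plus an owner attribute. It is not SQLOS code: the class names `AutoResetEvent` and `EventMutex`, the use of thread identifiers as owners, and the `demo` helper are all assumptions made for illustration.

```python
import threading


class AutoResetEvent:
    """Hypothetical stand-in for an auto-reset event: one waiter per signal."""

    def __init__(self, signalled=True):
        self._cond = threading.Condition()
        self._signalled = signalled

    def wait(self, timeout=None):
        with self._cond:
            if not self._cond.wait_for(lambda: self._signalled, timeout):
                return False  # timed out while the event stayed nonsignalled
            self._signalled = False  # auto-reset: this waiter consumes the signal
            return True

    def signal(self):
        with self._cond:
            self._signalled = True
            self._cond.notify()  # wake at most one waiter


class EventMutex:
    """Mutex = auto-reset event + owner; only the owner may signal."""

    def __init__(self):
        self._event = AutoResetEvent(signalled=True)  # signalled means unowned
        self.owner = None

    def wait(self, timeout=None):
        acquired = self._event.wait(timeout)
        if acquired:
            # Ownership is assigned as a side effect of a successful wait.
            self.owner = threading.get_ident()
        return acquired

    def signal(self):
        if self.owner != threading.get_ident():
            raise RuntimeError("only the owner may signal the mutex")
        self.owner = None
        self._event.signal()


def demo(workers=4, rounds=500):
    """Increment a shared counter under the mutex from several threads."""
    mutex = EventMutex()
    counter = {"value": 0}

    def work():
        for _ in range(rounds):
            mutex.wait()
            try:
                counter["value"] += 1
            finally:
                mutex.signal()  # act honourably: release as soon as we are done

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter["value"]
```

The sketch is not re-entrant: a second `wait()` from the owning thread blocks (or times out) just like one from any other thread, because the event is already nonsignalled.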


Career-Limiting Moves

Randolph West has a series of career-limiting moves, which sadly I had missed until now.  I’ll make up for that by linking the whole series.

First, dropping a table:

For whatever reason, we ran the script in the Oracle SQL*Plus client, which Wikipedia generously describes as “the most basic” database client.

Cut to the part where we run the script. The DROP TABLE command was run successfully and committed, but the previous step where the data was moved had failed.

The entire table was gone.

Second, saying “no” at the wrong time:

My job was to provide technical support to a senior staff member, and I said no because I was busy on something that was, for all intents and purposes, not as important.

This was of course escalated very quickly to the managing director, who in turn shouted at my boss, who in turn shouted at me. If I recall correctly, my boss eventually helped his colleague with her important problem and only reamed me out after the fact.

Third, playing the blame game:

She explained to me that whether or not that was the case, the language was totally inappropriate and calling a vendor on the weekend for something that did not constitute an emergency was unprofessional. In any number of scenarios, I could have been fired for my behaviour.

Chastened, I took away several important lessons: it doesn’t matter whose fault something is. The job had to be done, and I was around to do it. Furthermore, it is important never to be caught bad-mouthing someone on the record, no matter how good a relationship you have with a vendor. It will always come back to bite you.


The current post in the series is Reply All, in which Randolph is looking for your stories.


Password-Limiting Moves

Brian Kelley is not happy that Office 365 limits password lengths:

I blinked when I saw the warning, “Your password can’t be longer than 16 characters.” I couldn’t believe that I had gotten that warning, so I erased what I had typed for a password and started typing 1, 2, 3, etc., to see if this warning did trip at 17 characters. It did. Why in the world is there a limitation on password length if you’re going to hash my password? And if you had to pick a limit, why 16 characters? Why not 50 or 100 or 255?

I’ll go one step further:  there is no good upper limit on password length.  For services like these, Microsoft should hold the plaintext version of the password (which again, should be a string of arbitrary length) only long enough to perform an adequate number of rounds of salted hashing with an appropriate hashing function (e.g., bcrypt).  Once the password is hashed, the hash is always the same length, meaning the length of the plaintext is irrelevant for storage.
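A quick sketch of that last point, assuming PBKDF2-HMAC-SHA256 from Python's standard library as a stand-in for bcrypt (which needs a third-party package). The function names and the iteration count are illustrative choices, not a description of how any particular service stores passwords.

```python
import hashlib
import hmac
import os


def hash_password(password, salt=None, iterations=200_000):
    """Return (salt, digest); the digest length does not depend on the password."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt per password
    digest = hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    )
    return salt, digest


def verify_password(password, salt, expected, iterations=200_000):
    """Re-derive the digest and compare it in constant time."""
    _, digest = hash_password(password, salt, iterations)
    return hmac.compare_digest(digest, expected)


if __name__ == "__main__":
    for candidate in ("short", "x" * 16, "a much longer passphrase " * 20):
        _, digest = hash_password(candidate)
        # Every digest is 32 bytes, whatever the length of the input.
        print(len(candidate), len(digest))
```

With SHA-256 as the underlying hash, each stored digest is 32 bytes, whether the password is 5 characters or 500.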


Skirting Around Dynamic Data Masking

Phil Factor gives a trivial method of subverting dynamic data masking:

Dynamic data masking is a great product and solves some niche problems that come if you need to do certain testing with live data at the application level. You should, however, beware of using it as a database-level security device.

I haven’t yet used it in testing because I don’t have the problem that it solves.

The problem that it solves is for people doing testing, especially user-acceptance testing, of an application using live data. It is good at masking data where the user is unable to make direct access to the database to execute queries.

Phil has code to get around masked credit card numbers, and I will say that he’s not the first person I’ve seen do this.  Dynamic Data Masking is not a general-purpose security solution.


DBCC Help

Kenneth Fisher describes DBCC Help:

Many years ago during an interview I was asked to “Name the top 7 DBCC commands that you use.” I think I was able to name 3. Ever since then I’ve paid particular attention to DBCC commands in preparation for the day when I hear that question again. Well, not too long ago I was watching Erin Stellato’s (b/t) Pluralsight course “SQL Server: Understanding and Using DBCC Commands”. Note: It’s a great course and I highly recommend it. In it she goes over quite a few DBCC commands, some documented, some not. I noted two in particular.

The first one I noted gives you a list of the size and free space in the logs of all databases on the instance. Highly useful. I’ve used it repeatedly recently. Unfortunately my memory is not always the best and I can never quite remember the exact name. Which brings us to the other command.

I completely agree that Erin’s Pluralsight course is worth watching.


Aggregating Clickstream Data

Ofer Habushi solves a clickstream aggregation problem using Spark:

At this point, an interesting question came up for us: How can we keep the data partitioned and sorted? 

That’s a challenge. When we sort the entire data set, we shuffle in order to get sorted RDDs and create new partitions, which are different than the partitions we got from Step 1. And what if we do the opposite?

Sort first by creation time and then partition the data? We’ll encounter the same problem. The re-partitioning will cause a shuffle and we’ll lose the sort. How can we avoid that?

Partition→sort = losing the original partitioning

Sort→partition = losing the original sort

There’s a solution for that in Spark. In order to partition and sort in Spark, you can use repartitionAndSortWithinPartitions. 

This is an interesting solution to an ever-more-common problem.
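The outcome described in the excerpt can be imitated in plain Python, without Spark, to show what "partitioned and sorted" means here. This is a single-process sketch under stated assumptions: the click tuples, the integer user ids, and the function `partition_and_sort_within` are made up for illustration. In Spark the sort is pushed into the shuffle itself, whereas this sketch does the two steps in memory.

```python
def partition_and_sort_within(records, num_partitions, partition_key, sort_key):
    """Assign each record to a partition, then sort each partition on its own.

    partition_key must return an integer (an assumption of this sketch);
    all records sharing that key land in the same partition.
    """
    partitions = [[] for _ in range(num_partitions)]
    for record in records:
        index = partition_key(record) % num_partitions
        partitions[index].append(record)
    for part in partitions:
        # Sorting inside a partition never moves records between partitions,
        # so the partitioning survives the sort.
        part.sort(key=sort_key)
    return partitions


if __name__ == "__main__":
    # Hypothetical clicks: (user_id, created_at, url)
    clicks = [
        (2, 30, "/c"),
        (1, 20, "/b"),
        (2, 10, "/a"),
        (1, 5, "/home"),
        (3, 7, "/x"),
    ]
    parts = partition_and_sort_within(
        clicks,
        num_partitions=2,
        partition_key=lambda click: click[0],          # partition by user
        sort_key=lambda click: (click[0], click[1]),   # then by user and time
    )
    for number, part in enumerate(parts):
        print(number, part)
```

Each user's clicks end up in a single partition in creation-time order, which is the layout a per-user sessionization pass needs.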


Actual Rows Read

Rob Farley goes into detail on the Actual Rows Read property in execution plans:

The warning says “Operation caused residual IO. The actual number of rows read was 2,130, but the number of rows returned was 2.” Sure enough, further up we see “Actual Rows Read” saying 2,130, and Actual Rows at 2.

Whoa! To find those rows, we had to look through 2,130?

You see, the way that the Seek runs is to start by thinking about the Seek Predicate. That’s the one that leverages the index nicely, and which actually causes the operation to be a Seek. Without a Seek Predicate, the operation becomes a Scan. Now, if this Seek Predicate is guaranteed to be at most one row (such as when it has an equality operator on a unique index), then we have a Singleton seek. Otherwise, we have a Range Scan, and this range can have a Prefix, a Start, and an End (but not necessarily both a Start and an End). This defines the rows in the table that we’re interested in for the Seek.

But ‘interested in’ doesn’t necessarily mean ‘returned’, because we might have more work to do. That work is described in the other Predicate, which is often known as the Residual Predicate.

Definitely worth a read.
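As a rough illustration of the gap between rows read and rows returned, here is a small Python model of a range seek followed by a residual predicate. It is only an analogy: the `SortedIndex` class and the synthetic data are assumptions, chosen so that the counts match the numbers in the quoted warning, and nothing here reflects how SQL Server implements a seek.

```python
from bisect import bisect_left, bisect_right


class SortedIndex:
    """Toy model of an index: (key, payload) rows kept in key order."""

    def __init__(self, rows):
        self._rows = sorted(rows)
        self._keys = [key for key, _ in self._rows]

    def range_seek(self, start, end, residual):
        """Return (rows_read, rows_returned) for start <= key <= end.

        The start/end bounds play the role of the Seek Predicate: they
        decide which rows are touched. The residual predicate is checked
        on every touched row and decides which of them are returned.
        """
        lo = bisect_left(self._keys, start)
        hi = bisect_right(self._keys, end)
        touched = self._rows[lo:hi]
        returned = [row for row in touched if residual(row)]
        return len(touched), returned


if __name__ == "__main__":
    # Synthetic data: keys 0..9999, payload is key modulo 1065.
    index = SortedIndex((key, key % 1065) for key in range(10_000))
    rows_read, rows = index.range_seek(
        start=1000,
        end=3129,
        residual=lambda row: row[1] == 0,
    )
    print(rows_read, len(rows))
```

In this toy data the range covers 2,130 keys but only two of them pass the residual check, so a tighter seek range (or an index that covers the residual column) would avoid most of the reads.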


Think About Recovery Mode

Kendra Little answers a user question about a co-worker who puts all prod databases into Simple recovery mode:

What if Your Coworker Regularly Goes Off the Ranch and Doesn’t Use Change Control?

Don’t cover for them.

Ask them about it first to make sure there wasn’t an emergency change request you’re unaware of for the change, but be honest about what happened when you’re asked.

In other words, treat them as an equal and a grown-up.

Sometimes in this situation, people sugar coat things or cover for the person who makes mistakes. You need to treat them as an adult though.

If you made mistakes, you would own up to what happened and work to not do it again, right? It’s just about respectfully allowing others to own their actions.

Kendra’s answer is a good mixture of technical explanation and not being socially inept.  It’s important to follow that two-pronged approach.


HIBPwned

Steph Locke has a new CRAN package out:

HIBPwned is a feature complete R package that allows you to use every (currently) available endpoint of the API. It’s vectorised so no need to loop through email addresses, and it requires no fiddling with authentication or keys.

You can use HIBPwned to do things like:

  1. Set up your own notification system for account breaches of myriad email addresses & user names that you have

  2. Check for compromised company email accounts from within your company Active Directory

  3. Analyse past data breaches and produce charts like Dave McCandless’ Breach chart

The regular service is extremely useful and Steph’s wrapper looks like it’s worth checking out.
