Research – Curated SQL

The Benefits of User Research

It’s easy to be a bit cynical about this and think, “That’s not the way it is in my job.” If you feel that way, I challenge you to take the initiative to change your own perspective! There is no need for database specialists to wait around for someone to give them permission to start thinking about their organization’s customers and to start being a creative collaborator with others.
Give yourself permission, and very likely it will have a positive impact on your career.

It’s easy for developers and DBAs to forget that they’re in the business of providing services. If you can’t name your consumers and their interests, that opens up concerns about whether you’re satisfying their needs.

Agorics

Published 2017-07-04 by Kevin Feasel

One of my interests about a decade ago was agorics, the study of computational markets. Mark S. Miller and K. Eric Drexler pushed this idea in the late 1990s and collected a fair portion of the work on the topic on Drexler’s website. A sample from the section on computation and economic order:

Trusting objects with decisions regarding resource tradeoffs will make sense only if they are led toward decisions that serve the general interest – there is no moral argument for ensuring the freedom, dignity, and autonomy of simple programs. Properly-functioning price mechanisms can provide the needed incentives.

The cost of consuming a resource is an opportunity cost – the cost of giving up alternative uses. In a market full of entities attempting to produce products that will sell for more than the cost of the needed inputs, economic theory indicates that prices typically reflect these costs.

Consider a producer, such as an object that produces services. The price of an input shows how greatly it is wanted by the rest of the system; high input prices (costs) will discourage low-value uses. The price of an output likewise shows how greatly it is wanted by the rest of the system; high output prices will encourage production. To increase (rather than destroy) value as ‘judged by the rest of the system as a whole’, a producer need only ensure that the price of its product exceeds the prices (costs) of the inputs consumed. This simple, local decision rule gains its power from the ability of market prices to summarize global information about relative values.

I still think it’s an interesting concept, and the rise of cloud computing has, to an extent, fulfilled this idea: AWS spot pricing is one of the best examples I know of, where resource spot prices will change depending upon load.
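
The local decision rule in the quoted passage is simple enough to sketch in a few lines of Python. This is only an illustration: the function name and the sample prices are hypothetical and do not come from Miller and Drexler's papers.

```python
def creates_value(output_price, input_prices):
    """Return True when the output sells for more than its inputs cost.

    Hypothetical helper: the name and the numbers below are illustrative only.
    """
    return output_price > sum(input_prices)


# A producer that consumes two inputs (say, CPU time and storage)
# in order to sell one unit of a service.
inputs = [0.25, 0.5]  # prices paid for each input consumed

print(creates_value(1.0, inputs))  # output priced above total input cost
print(creates_value(0.5, inputs))  # output priced below total input cost
```

The producer needs no global knowledge here. It compares only the prices it can see, and the market is assumed to have folded everyone else's valuations into those prices.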

Readings In Database Systems

Published 2016-11-24 by Kevin Feasel

Curated SQL is taking Thanksgiving off. If you are looking for some good reading, check out Readings in Database Systems, 5th Edition.

LIME

Published 2016-09-02 by Kevin Feasel

William Vorhies discusses a new technical paper on Local Interpretable Model-Agnostic Explanations:

What the model actually used for classification were these: ‘posting’, ‘host’, ‘NNTP’, ‘EDU’, ‘have’, ‘there’. These are meaningless artifacts that appear in both the training and test sets and have nothing to do with the topic except that, for example, the word “posting” (part of the email header) appears in 21.6% of the examples in the training set but only two times in the class “Christianity.”

Is this model going to generalize? Absolutely not.

An Example from Image Processing

In this example using Google’s Inception NN on arbitrary images the objective was to correctly classify “tree frogs”. The classifier was correct in about 54% of cases but also interpreted the image as a pool table (7%) and a balloon (5%).

Looks like an interesting paper. Click through for a link to the paper.

Say It With Screenshots

Published 2016-05-17 by Kevin Feasel

Brent Ozar continues his series on interviewing tactics:

After writing about “For Technical Interviews, Don’t Ask Questions, Show Screenshots”, lots of folks asked what kinds of screenshots I’d show. Here’s this week’s example.

I show each screenshot on a projector (or shared desktop) to the candidate and say:

What’s this screen from?
What does the screen mean?
If it was a server you inherited from someone else, would there be any actions you’d take?
What questions might you want to ask before you take those actions?
Would there be any drawbacks to your actions?
What would be the benefits of your actions?

I have started to use this in interviews and I’m already loving it. I don’t want people to memorize minutiae (“Name all of the policies available in Policy-Based Management”) but if I show a picture of the different policies, that should jog your memory on when you’ve used PBM to solve interesting problems.

Finding Object Counts

Published 2016-01-18 by Kevin Feasel

SQLWayne shows how to break down counts of objects by type:

And while it did the trick, I was wanting, for no particular reason, to also have the total number of objects and the percentage. Again, no particular reason. It might be able to be done with a window function, but that is also something that I have limited familiarity with, so I decided to approach it as a CTE. And it works nicely. The objs CTE gives me a count of each object type while the tots CTE gives me the count of all objects. By giving each CTE a column with the value of 1, it’s easy to join them together then calculate a percentage.

That’s one of the nicest things about SQL as a language: you access metadata the same way you access regular data, so that technique can be used to query other data sets as well.
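
The two-CTE approach described above might look roughly like the following. This is a sketch, not SQLWayne's actual query: it runs against SQLite's `sqlite_master` catalog through Python's `sqlite3` module as a stand-in for SQL Server's catalog views, and the sample tables plus the `join_key` column name are assumptions made for illustration.

```python
import sqlite3

# Assumption: SQLite's sqlite_master catalog stands in for SQL Server metadata.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
    CREATE INDEX ix_orders_customer ON orders (customer_id);
    CREATE VIEW v_customers AS SELECT name FROM customers;
""")

# objs counts objects per type, tots counts all objects; each exposes a
# constant column (join_key = 1) so the two can be joined together.
QUERY = """
WITH objs AS (
    SELECT type, COUNT(*) AS num_objects, 1 AS join_key
    FROM sqlite_master
    GROUP BY type
),
tots AS (
    SELECT COUNT(*) AS total_objects, 1 AS join_key
    FROM sqlite_master
)
SELECT o.type,
       o.num_objects,
       t.total_objects,
       ROUND(100.0 * o.num_objects / t.total_objects, 1) AS pct
FROM objs AS o
JOIN tots AS t ON t.join_key = o.join_key
ORDER BY o.type;
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)  # (type, count for that type, total count, percentage)
```

Multiplying by `100.0` rather than `100` keeps the division from truncating to an integer, a detail that also matters in T-SQL.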

Feature Spelunking

Published 2015-12-03 by Kevin Feasel

Aaron Bertrand shows us how to find hidden features in CTPs:

In honesty, I’m just meticulous about installing each new build and immediately digging into the metadata. It would be hard to take a look at sys.all_objects and identify what’s new by sight; even columns like create_date and modify_date are not as accurate as you might expect. (For example, in CTP 3.1, sp_helpindex has a create_date of 2015-11-21 18:03:15.267.)

So instead of relying on photographic memory or hoping that something new will jump out at me while scanning the new catalog, I always install the new CTP side-by-side with the previous CTP (or, in the case of the very first CTP, side-by-side with the previous version). Then I can just perform various types of anti-semi-joins across a linked server to see objects and columns that have been added, removed, or changed.

Very interesting.
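
To make the anti-semi-join idea concrete, here is a small sketch. It does not reproduce Aaron's queries: two attached in-memory SQLite databases stand in for the two linked SQL Server instances, and the `catalog` table along with every object and column name in it is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: two attached databases play the role of the old and new CTPs.
conn.execute("ATTACH DATABASE ':memory:' AS old_build")
conn.execute("ATTACH DATABASE ':memory:' AS new_build")
conn.executescript("""
    CREATE TABLE old_build.catalog (object_name TEXT, column_name TEXT);
    CREATE TABLE new_build.catalog (object_name TEXT, column_name TEXT);

    -- Hypothetical metadata, not real SQL Server objects.
    INSERT INTO old_build.catalog VALUES
        ('sys_widgets', 'widget_id'),
        ('sys_widgets', 'name'),
        ('sys_gadgets', 'gadget_id');
    INSERT INTO new_build.catalog VALUES
        ('sys_widgets', 'widget_id'),
        ('sys_widgets', 'name'),
        ('sys_widgets', 'is_hidden'),
        ('sys_gizmos', 'gizmo_id');
""")


def only_in(left, right):
    """Anti-semi-join: rows in left.catalog with no match in right.catalog."""
    sql = f"""
        SELECT l.object_name, l.column_name
        FROM {left}.catalog AS l
        WHERE NOT EXISTS (
            SELECT 1
            FROM {right}.catalog AS r
            WHERE r.object_name = l.object_name
              AND r.column_name = l.column_name
        )
        ORDER BY l.object_name, l.column_name;
    """
    return conn.execute(sql).fetchall()


added = only_in("new_build", "old_build")    # present only in the new build
removed = only_in("old_build", "new_build")  # present only in the old build
print("added:", added)
print("removed:", removed)
```

Running the same `NOT EXISTS` query in both directions gives additions and removals. Detecting changed definitions would need extra columns (data type, length, and so on) in the comparison.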
