Press "Enter" to skip to content

Author: Kevin Feasel

The Downside Of Trusted Assemblies

Solomon Rutzky does not like the Trusted Assembly solution to SQL Server 2017 CLR:

Hopefully, Microsoft removes all traces of “Trusted Assemblies” (as I have suggested here). In either case, please just use Certificates (and possibly Asymmetric Keys, depending on your preference and situation) as I have demonstrated in these past three posts (i.e. Parts 2, 3, and 4). Even better, especially for those using SSDT, would be if Microsoft implemented my suggestion to allow Asymmetric Keys to be created from a binary hex bytes string. But, even without that convenience, there is still no reason to ever, ever, use the “Trusted Assemblies” feature.

He’s given three alternatives so far, so if you’re interested in CLR security, there’s plenty of food for thought.

Comments closed

SQL Code Smells

Phil Factor has updated his code smells compendium:

Some time ago, Phil Factor wrote his booklet ‘SQL Code Smells’, collecting together a whole range of SQL Coding practices that could be considered to indicate the need for a review of the code. It was published as 119 code smells, even though there were 120 of them at the time. Phil Factor has continued to collect them and the current state of the art is reflected in this article. There are now around 150 of these smells and SQL Code Guard is committed to cover as many as possible of them.

I loved this booklet when it came out almost as much as I loved his Confessions of an IT Manager.  If you’re looking for some light reading over a long weekend, you can do a lot worse than this.

Comments closed

Searching In Windbg

Ewald Cress shows us how to search for a four-byte pattern in the Windows debugger:

Cracking open Windbg on 2016 SP1 with the s command to look for byte patterns yielded nothing of value. Maybe something has changed with conventions or indirection? Nope, no joy in 2014 either.

In the end, it took the extremely brave step of RTFM, in this case the Windbg online help, to realise where I was going wrong. I was searching for a four-byte pattern by searching for doublewords. Sounds reasonable on the face of it, but what I had missed was that this specifically required the doublewords to be doubleword-aligned, i.e. starting on an address divisible by four. My method only had a 25% chance of working, so it’s sheer luck I ever got good results with it.

Changing to a byte search for four consecutive bytes gave me the non-aligned semantics my taste buds craved, and the results came pouring in.

This is in the context of gathering information on an uncommon wait type related to columnstore indexes.

Comments closed

Azure Archive Blob Storage

James Serra talks about a new tier of blob storage:

Last year Microsoft introduced Azure Cool Blob storage, which cost customers a penny per GB per month in some Azure regions.  Now, users have another, lower-cost option in Azure Archive Blob Storage, along with new Blob-Level Tiering data lifecycle management capabilities.  So there are now three Azure blog storage tiers: Hot, Cool, and Archive.

Azure Archive Blob Storage costs 0.18 cents per GB per month when the service is delivered through its cloud data center in the East US 2 (for comparison, in the same region hot is 1.8 cents and cool is 1.0 cents per GB per month) .  Customers can expect a 99 percent availability SLA (service level agreement) when the service makes its way out of the preview stage.

This is Azure’s response to AWS Glacier.  The immediate sticker price is a bit higher, but if there aren’t any incremental costs associated with deletion, uploading, or retrieving files, then it could end up matching Glacier in TCO.

Comments closed

R Versus Python

Vincent Granville believes that Python is overtaking R in the realm of data science:

We use the app in question to compare search interest for R data Science versus Python Data Science, see above chart.  It looks like until December 2016, R dominated, but fell below Python by early 2017. The above chart displays an interest index, 100 being maximum and 0 being minimum. Click here to access this interactive chart on Google, and check the results for countries other than US, or even for specific regions such as California or New York.

Note that Python always dominated R by a long shot, because it is a general-purpose language, while R is a specialized language. But here, we compare R and Python in the niche context of data science. The map below shows interest for Python (general purpose) per region, using the same Google index in question.

It’s an interesting look at the relative shift between R and Python as a primary language for statistical analysis.

Comments closed

The SQL Has Landed

Drew Furgiuele’s SQL Server in Space project was a success:

We took quite a few readings to try and account for any variations, and when we were done we ended up with a little more lift than we calculated, but not by much. We then proceeded to tie off the balloon. This is by far the most stressful part of the entire operation. Weather balloons have what amounts to a very long, thick neck where you inflate them from. Unlike, say, a party balloon, you don’t just knot the end and call it a day. Instead, you need to tie one end of the neck off to prevent the escape of the helium, then bend the neck back. Then, you tie that off. You’re left with a “U” shaped bend at the bottom of the balloon that you then attach your payload train to: the parachute, then the reflector, and ultimately the payload. Oh, and you have to do this while, you know, trying to wrangle a 7 foot diameter balloon.

Tagged “cloud” even though it was a bright and sunny day.

Comments closed

Lambda And Kappa Architectures

Michael Verrilli has a post contrasting the Lambda and Kappa data architectures:

Any query may get a complete picture by retrieving data from both the batch views and the real-time views. The queries will get the best of both worlds. The batch views may be processed with more complex or expensive rules and may have better data quality and less skew, while the real-time views give you up to the moment access to the latest possible data. As time goes on, real-time data expires and is replaced with data in the batch views.

One additional benefit to this architecture is that you can replay the same incoming data and produce new views in case code or formula changes.

The biggest detraction to this architecture has been the need to maintain two distinct (and possibly complex) systems to generate both batch and speed layers. Luckily with Spark Streaming (abstraction layer) or Talend (Spark Batch and Streaming code generator), this has become far less of an issue… although the operational burden still exists.

I haven’t seen much on the topic of Big Data architectures this year; it seems like it was a much more popular topic last year.

Comments closed

Monitoring SQL Agent Job Failures

Mark Wilkinson shows how to set up a SQL Agent job failure monitoring solution:

Since we are storing the date the records are added to the table, this query will always return the latest set of failures. This is a simple example, but the possibilities are endless:

  • Send the results of this query via database mail

  • Join with dbo.sysjobs and dbo.syscategories, alerting on different thresholds per job category

  • Extend the TOP (1) to include multiple capture periods and alert on average failures per capture

Check it out.  This is particularly helpful if you get blasted with thousands of error messages per minute because somebody made a bunch of untested changes and broke every job in your environment and caused the mail server to throttle your account for a multi-day period.  Not that this has ever happened to me, of course…

Comments closed

Columnstore Segment Elimination Pains

Niko Neugebauer shows us two pain points with row group (segment) elimination on clustered columnstore indexes:

Still this gives us processing of just 1 Segment with 253 lob logical reads vs 6 Segments and 2669 lob logical reads in the original query.

But wait, that’s not all! It can get significantly worse!
Let’s use the inequality search with the OR condition, to see what happens – should be safe, right ?

Spoilers:  it’s not safe.  I’ve been burned on the Min + Max issue that Niko shows, where you’d think that it could eliminate all segments except for the first and last, but that’s not how the process works today.

Comments closed

SSIS Out Of Memory

Kevin Hill diagnoses an out-of-memory error in SQL Server Integration Services:

Notice the red sections above:

  • Not enough storage – was this physical memory, virtual memory or disk?

  • Buffer failed to allocate (10MB) – Why?

  • 2% memory load – yes, this box has 1 TB of RAM…plenty for what we are doing you would think.

  • Attempt to add a row – well, yeah…if SSIS uses buffers and cannot allocate one, there are either none, or they are full.  Makes logical sense to me

I won’t spoil the ending; you’ll have to read it over there.

Comments closed