Author: Kevin Feasel

Deprecated Features to Ignore

Published 2021-02-23 by Kevin Feasel

The issue in this specific case is that they spent that time contemplating a decision based on incomplete information. Are those I/Os reads or writes? If the user has an open transaction and has merely read a lot of data, there is far less impact in rolling that transaction back than if they have changed a lot of data. So, instead of sys.sysprocesses, let’s see what the more modern DMV, sys.dm_exec_sessions, can show us about this session:

On a slightly different topic, Mala Mahadevan looks at old but not-deprecated features, some of which you shouldn’t use:

I am part of a weekly talk show we run at the TriPASS user group, called ‘Shop Talk’. Shop Talk was the brainchild of Kevin Feasel, our key user group lead..we meet on a bi weekly basis and discuss random tech topics related to sql server. Some of these are questions from our audience, and some are just ideas for discussion that one of us come up with. I am constantly amazed and grateful for how much I learn by being part of this show – from my co hosts and from the very intelligent audience we are blessed with. Last week, we discussed Brent Ozar’s blog post on ‘What SQL Server Feature Do You Wish Would Go Away?’. The recording of our discussion (this topic starts around 26:00) is here.

And if you want a quick synopsis of the last link, I have the notes from each episode of Shop Talk, including the one Mala references.

Comments closed

Enabling Always Encrypted in SQL Server

Published 2021-02-23 by Kevin Feasel

Scott Klein takes us through the process of enabling Always Encrypted:

Always Encrypted provides a separation between those who own the data and can view it, and those who manage the data but should have no access. Always Encrypted does this by allowing clients to encrypt sensitive data inside client applications while never revealing the encryption key to the database engine.
One of our clients utilizes Always Encrypted to protect sensitive data. In our work for this client, we realized that a walkthrough documenting the end-to-end process of enabling Always Encrypted would be helpful for everyone.

Click through for a walkthrough of the process.

Comments closed

Tidying the Confusion Matrix in R

Published 2021-02-22 by Kevin Feasel

Gary Hutson has a new package for us:

The package aim is to make it easier to convert the outputs of the lists from caret and collapse these down into row-by-row entries, specifically designed for storing the outputs in a database or row by row data frame.
This is something that the CARET library does not have as a default and I have designed this to allow the confusion matrix outputs to be stored in a data frame or database, as many a time we want to track the ML outputs and fits over time to monitor feature slippage and changes in the underlying patterns of the data.

I like the way caret shows the confusion matrix when I’m reviewing result on my own, but I definitely appreciate efforts to make it easier to handle within code—similar to how broom reads linear regression outputs. H/T R-bloggers

Comments closed

Changing Azure Databricks Color Themes

Published 2021-02-22 by Kevin Feasel

Dave Ruijter has an interesting browser plugin:

If you’re working with multiple Azure Databricks workspaces at the same time it can be very hard to keep those workspaces apart. Personally I struggled to keep track of the dev / tst / acc / prd workspaces. Luckily, Henko had the answer: color them differently!

Click through to learn more about the browser extension and how to use it with Databricks.

Comments closed

Fixing Those Pesky Wait Stats

Published 2021-02-22 by Kevin Feasel

David Alcock keeps us from having to think:

CXPACKET
Attempt to pronounce parallellellellellism correctly then set MAXDOP to 1.
PAGELATCH
Mention TempDB and contention in the same sentence. Delete TempDB
BACKUP_
Delete any long running backup jobs. If wait persists then delete all backup jobs.

Click through for plenty of excellent nuggets of advice which definitely won’t land you on the unemployment line.

Comments closed

The Joy of Parameter Sniffing

Published 2021-02-22 by Kevin Feasel

Erik Darling points out that parameter sniffing is generally a good thing:

To some degree, I get it. You’re afraid of incurring some new performance problem.
You’ve had the same mediocre performance for years, and you don’t wanna make something worse.
The thing is, you could be making things a lot better most of the time.

We should specifically talk about parameter sniffing problems rather than parameter sniffing as a problem. These sorts of problems are closer to the exception than the rule.

Comments closed

Fun with Multi-Column Unique Constraints

Published 2021-02-22 by Kevin Feasel

Aaron Bertrand has an interesting use case:

A problem that comes up occasionally with constraints in SQL Server is when a unique constraint applies to multiple columns, but the values in those columns can be populated in any order. For example, if a table holds interactions between two users, and has columns User1 and User2, there should only be one row allowed for users 100 and 200, regardless of whether the data is entered as 100, 200 or 200, 100.

Click through for one solution. Another solution would be to normalize this down further with a dbo.ConversationParticipants table.

Comments closed

Disorderly Queries

Published 2021-02-22 by Kevin Feasel

Chad Callihan wants you to think about that ORDER BY clause:

I recently came across a scenario where an application process was not performing correctly on one database but was working fine on others. The process should have been completing in seconds but was taking minutes with no indication of activity. After some investigation, I found that the process was stuck waiting on a SELECT statement to complete. Even worse, it was holding an exclusive lock on a table which was then blocking new information from processing.
One part of the SELECT query that stood out was that it was ordering by a date field. Considering what the process was doing, there was no need to have the data ordered. Generally, it’s better to have the data sorted on the application side instead of SQL Server but in this case not even that was necessary.

There are definitely good cases where you need to use ORDER BY in a database—such as paging scenarios. But if you don’t need things in a particular order, Chad shows that you can potentially save a good deal on performance without an explicit ordering.

Comments closed

Choosing an Image File Type

Published 2021-02-19 by Kevin Feasel

The folks at Jumping Rivers continue a series on image optimization:

As the JPEG compression algorithm significantly reduces file size, JPEG files are ubiquitous across the web. If you take a photo on your camera, it’s almost certainly using a JPEG storage format. Historically the file extension was .jpg as Microsoft Windows only handled three character file extensions (also .htm vs .html). But today both extensions are used (personally I prefer .jpeg, but I’m not very consistent if I’m totally honest).
If you did a little Googling on which file format to use for images, then the answer you would come across is that JPEG’s are the default choice. But remember, figures are different from standard images!

Click through for a review of three viable image formats.

Comments closed

Research with R and Production with Python

Published 2021-02-19 by Kevin Feasel

Matt Dancho and Jarrell Chalmers lay out an argument:

The decision can be challenging because they both Python and R have clear strengths.
– R is exceptional for Research – Making visualizations, telling the story, producing reports, and making MVP apps with Shiny. From concept (idea) to execution (code), R users tend to be able to accomplish these tasks 3X to 5X faster than Python users, making them very productive for research.
– Python is exceptional for Production ML – Integrating machine learning models into production systems where your IT infrastructure relies on automation tools like Airflow or Luigi.

They make a pretty solid argument. I’ve launched success R-based projects using SQL Server Machine Learning Services, but outside of ML Services, my team’s much more likely to deploy APIs in Python, and we’re split between Dash and Shiny for visualization. H/T R-Bloggers

Comments closed

M	T	W	T	F	S	S
		1	2	3	4	5
6	7	8	9	10	11	12
13	14	15	16	17	18	19
20	21	22	23	24	25	26
27	28	29	30	31