Obfuscating Continuous Variables

Kevin Feasel

2018-07-19

T-SQL

Phil Factor continues his series on data obfuscation:

Imagine that you have a table giving invoice values. You will want your spoof data to conform with the same ups and downs of the real data over time. You may be able to get the overall distribution the same as the real data, but the resulting data would be useless for seeing the effect of last years sales promotion. The invoice values will depend on your sales promotions if your marketing people have done their job properly.

By making your data the same distribution as your production data, you don’t necessarily get the same strategy chosen by the query analyser, but you dramatically increase the chances of getting it. SQL Server uses a complex paradigm to select amongst its alternative plans for a query. It maintains distribution statistics for every column and index that is used for selecting rows. These aren’t actually histograms in the classic sense, but they perform a similar function and are used by the SQL Server engine to predict the number of rows that will be returned.

The focus is on independent variables, though there is a little bit at the end about working with dependencies.

Related Posts

Converting Binary To Hex With T-SQL

Dave Mason uses STRING_SPLIT to convert binary values to their hex equivalents: I started pondering it for a bit and began to wonder if I could use the new for SQL Server 2016 STRING_SPLIT function to convert a binary string to decimal. The thought process was to split the string into rows of CHAR(1) values, along with an in-string character […]

Read More

Faster Scalar Functions In SQL Server 2019

Brent Ozar looks at improvements the SQL Server team has made to scalar functions in 2019: My database has to be in 2019 compat mode to enable Froid, the function-inlining magic. Run the same query again, and the metrics are wildly different: Runtime: 4 seconds CPU time: 4 seconds Logical reads: 3,247,991 (which still sounds bad, […]

Read More

Categories

July 2018
MTWTFSS
« Jun Aug »
 1
2345678
9101112131415
16171819202122
23242526272829
3031