Building Random Number Ranges

Kevin Feasel

2017-06-26

T-SQL

David Fowler shows how to generate a random number for each record in a result set:

Hmmmmm…. It looks like we’ve got the same number for every person, that wasn’t what we wanted.  And that’s my issue with RAND(), it’ll give you a different random number every time it runs but if run as part of a query it’ll always return the same number for every row returned.

So what else can we do?  We’ll there is something that gives us a ‘random’ value for every row in the query and that’s our good old friend NEWID().  Let’s try the same query but this time we’ll swap RAND() with NEWID().

One major use case for this is sampling data sets for model training and testing:  if you pull from a range of 1-10, you could perhaps train against 1-5, cross-validate against 6-7, and test against 8-10.  Doing this instead of TOP X% reduces the likelihood of sampling bias.

Related Posts

Bitwise Logic To Make Values Negative

David Fowler is working with arcane magic: The short answer to the question is to simply apply a bitwise NOT to the number that you want to convert and add 1 to the result.  I wrote about bitwise logic in The Arcane Science of Bitwise Logic and SQL Server but didn’t mention the bitwise NOT. If you […]

Read More

Executing Dynamic SQL: EXEC vs sp_executesql

Tara Kizer gives a scenario where using sp_executesql can result in lower CPU utilization than EXEC: For frequently executed queries, “EXEC sp_executesql @sql”is a good choice but at the risk of encountering parameter sniffing issues. For queries that aren’t executed very often, “EXEC (@sql)” can be used, but you could end up with a bloated […]

Read More

Categories

June 2017
MTWTFSS
« May Jul »
 1234
567891011
12131415161718
19202122232425
2627282930