Getting A Random Row

Kevin Feasel

2018-03-06

T-SQL

Brent Ozar shares four methods for getting a random row from a table:

Method 1, Bad: ORDER BY NEWID()

Easy to write, but it performs like hot, hot garbage because it scans the entire clustered index, calculating NEWID() on every row:

That took 6 seconds on my machine, going parallel across multiple threads, using tens of seconds of CPU for all that computing and sorting. (And the Users table isn’t even 1GB.)

Click through for the other three methods.  The really tricky part is when you want to get a random sample from the table, as TABLESAMPLE is an awful choice for that.

Related Posts

Simulating LAG And LEAD Prior To SQL Server 2012

Izik Ben-Gan highlights a reader submission from his last post: Last month I covered a Special Islands challenge. The task was to identify periods of activity for each service ID, tolerating a gap of up to an input number of seconds (@allowedgap). The caveat was that the solution had to be pre-2012 compatible, so you couldn’t […]

Read More

New Use Hint In SQL Server 2017 CU10

Pedro Lopes shows us a new use hint introduced in SQL Server 2017 CU10: In this scenario, you only have this one query that apparently does better in SQL Server 2014 than 2017. That’s all “New CE” – there’s no CE70 vs CE 120+ at issue here. Using any known trace flag, the FORCE_LEGACY_CARDINALITY_ESTIMATION hint […]

Read More

Categories

March 2018
MTWTFSS
« Feb Apr »
 1234
567891011
12131415161718
19202122232425
262728293031