Regular Expressions With R

Kevin Feasel

2017-10-03

CLR, R

Dave Mason looks at using SQL Server R Services to execute regular expressions against a T-SQL data set:

Have you ever had the need to use Regular Expressions directly in SQL Server? I sometimes hear or see others refer to using RegEx in TSQL. But I always assume they’re talking about the TSQL LIKE operator, because RegEx isn’t natively supported. In TSQL’s defence, you can get a lot of mileage out of LIKE and some clever pattern matching strings, even though it’s not authentic RegEx. You can leverage RegEx libraries in the .NET Framework via a CLR stored procedure. You should also be able to do something similar with an old-school extended stored procedure.

I discussed all of this during a recent interview. It was a day or two afterwards (of course) when it dawned on me that there’s another way to leverage RegEx from TSQL: the R language. Prior to this mini-revelation, I had always thought of R (and Python) as strictly a means to an end for Data Science and related disciplines. Now I am thinking I’ve been looking at R and Python through too narrow of a lens and I should take a larger view.

I think I’d prefer CLR for this because there’s additional overhead to making R Services calls, but it’s a clever use of R Services.

Related Posts

Python versus R (Again)

Alex Woodie looks at whether Python is dominating R in the data science space: There is some evidence that Python’s popularity is hurting R usage. According to the TIOBE Index, Python is currently the third most popular language in the world, behind perennial heavyweights Java and C. From August 2018 to August 2019, Python usage surged […]

Read More

Local Randomness and R

Evgeni Chasnovski has a problem around generating random data: Let’s say we have a deterministic (non-random) problem for which one of the solutions involves randomness. One very common example of such problem is a function minimization on certain interval: it can be solved non-randomly (like in most methods of optim()), or randomly (the simplest approach being […]

Read More

Categories

October 2017
MTWTFSS
« Sep Nov »
 1
2345678
9101112131415
16171819202122
23242526272829
3031