Regular Expressions With R

Kevin Feasel

2017-10-03

CLR, R

Dave Mason looks at using SQL Server R Services to execute regular expressions against a T-SQL data set:

Have you ever had the need to use Regular Expressions directly in SQL Server? I sometimes hear or see others refer to using RegEx in TSQL. But I always assume they’re talking about the TSQL LIKE operator, because RegEx isn’t natively supported. In TSQL’s defence, you can get a lot of mileage out of LIKE and some clever pattern matching strings, even though it’s not authentic RegEx. You can leverage RegEx libraries in the .NET Framework via a CLR stored procedure. You should also be able to do something similar with an old-school extended stored procedure.

I discussed all of this during a recent interview. It was a day or two afterwards (of course) when it dawned on me that there’s another way to leverage RegEx from TSQL: the R language. Prior to this mini-revelation, I had always thought of R (and Python) as strictly a means to an end for Data Science and related disciplines. Now I am thinking I’ve been looking at R and Python through too narrow of a lens and I should take a larger view.

I think I’d prefer CLR for this because there’s additional overhead to making R Services calls, but it’s a clever use of R Services.

Related Posts

Timing R Function Calls

Colin Gillespie shows off an R package for benchmarking: Of course, it’s more likely that you’ll want to compare more than two things. You can compare as many function calls as you want with mark(), as we’ll demonstrate in the following example. It’s probably more likely that you’ll want to compare these function calls against more […]

Read More

Exploratory Data Analysis with inspectdf

Laura Ellis continues a dive into Exploratory Data Analysis, this time using the inspectdf package: I like this package because it’s got a lot of functionality and it’s incredibly straightforward to use. In short, it allows you to understand and visualize column types, sizes, values, value imbalance & distributions as well as correlations. Better yet, […]

Read More

Categories

October 2017
MTWTFSS
« Sep Nov »
 1
2345678
9101112131415
16171819202122
23242526272829
3031