Regular Expressions With R

Kevin Feasel

2017-10-03

CLR, R

Dave Mason looks at using SQL Server R Services to execute regular expressions against a T-SQL data set:

Have you ever had the need to use Regular Expressions directly in SQL Server? I sometimes hear or see others refer to using RegEx in TSQL. But I always assume they’re talking about the TSQL LIKE operator, because RegEx isn’t natively supported. In TSQL’s defence, you can get a lot of mileage out of LIKE and some clever pattern matching strings, even though it’s not authentic RegEx. You can leverage RegEx libraries in the .NET Framework via a CLR stored procedure. You should also be able to do something similar with an old-school extended stored procedure.

I discussed all of this during a recent interview. It was a day or two afterwards (of course) when it dawned on me that there’s another way to leverage RegEx from TSQL: the R language. Prior to this mini-revelation, I had always thought of R (and Python) as strictly a means to an end for Data Science and related disciplines. Now I am thinking I’ve been looking at R and Python through too narrow of a lens and I should take a larger view.

I think I’d prefer CLR for this because there’s additional overhead to making R Services calls, but it’s a clever use of R Services.
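To make the gap between LIKE and real regular expressions concrete, here is a minimal sketch. It is written in Python rather than R (the quote mentions both), and the sample rows and the phone-number pattern are invented for illustration; it is not code from Dave's post. A LIKE pattern such as '%[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]%' can tell you that a row contains something phone-shaped, but a regex engine can also hand back each matched substring.

```python
import re

# Hypothetical sample rows standing in for a T-SQL result set (id, free text).
rows = [
    (1, "Call 555-0100 before noon"),
    (2, "No phone number on file"),
    (3, "Office: 555-0199, fax: 555-0142"),
]

# Three digits, a hyphen, four digits. The word boundaries stop the
# pattern from matching in the middle of a longer run of digits.
PHONE = re.compile(r"\b\d{3}-\d{4}\b")


def extract_phones(rows):
    """Return one (id, phone) pair per match found in each row's text."""
    return [
        (row_id, phone)
        for row_id, text in rows
        for phone in PHONE.findall(text)
    ]


matches = extract_phones(rows)
for row_id, phone in matches:
    print(row_id, phone)
```

Row 2 produces no output and row 3 produces two pairs, which is the sort of one-to-many extraction that is awkward to express with LIKE alone.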

Trusted Assemblies And Module Signing

Solomon Rutzky continues his SQLCLR and trusted assemblies series:

Ownership chaining is quite handy as it makes it easier to not grant explicit permissions on base objects (i.e. Tables, etc) to everyone. Instead, you just grant EXECUTE / SELECT permissions to Stored Procedures, Views, etc.

However, one situation where ownership chaining does not work is when using Dynamic SQL. And, any SQL submitted by a SQLCLR object is, by its very nature, Dynamic SQL. Hence, any SQLCLR objects that a) do any data access, even just SELECT statements, and b) will be executed by a User that is neither the owner of the objects being accessed nor one that has been granted permissions to the sub-objects, needs to consider module signing in order to maintain good and proper security practices. BUT, the catch here is that in order to sign any Assembly’s T-SQL wrapper objects, that Assembly needs to have been signed with a Strong Name Key or Certificate prior to being loaded into SQL Server. Neither “Trusted Assemblies” nor even signing the Assembly with a Certificate within SQL Server suffices for this purpose, as we will see below.

Read on for more details.

More On Certificates Versus Trusted Assemblies

Solomon Rutzky compares using the new Trusted Assemblies functionality in SQL Server 2017 versus generating certificates for loading CLR objects:

Considering that “Trusted Assemblies” is entirely worthless within the context of the regular (i.e. non-Azure) SQL Server, does the same argument of “just use the existing Certificates and Module Signing functionality” also apply here?

Well, this just happens to be the one area where there might possibly be, in the worst-case scenario, some argument made for keeping this feature. The problem with applying the same Certificate / ADD SIGNATURE logic to Azure SQL Database is that you can’t create a signature-based Login (from either a Certificate or an Asymmetric Key) in that environment. Ouch! That is definitely a nail-in-the-coffin for the Certificate idea. Ok, so assuming that “Trusted Assemblies” would work in this scenario, is it an acceptable solution to the problem?

This has been a thought-provoking series so far.  If you agree with his conclusions, Solomon has a Connect item he’d like you to upvote.

The Downside Of Trusted Assemblies

Solomon Rutzky does not like the Trusted Assembly solution to SQL Server 2017 CLR:

Hopefully, Microsoft removes all traces of “Trusted Assemblies” (as I have suggested here). In either case, please just use Certificates (and possibly Asymmetric Keys, depending on your preference and situation) as I have demonstrated in these past three posts (i.e. Parts 2, 3, and 4). Even better, especially for those using SSDT, would be if Microsoft implemented my suggestion to allow Asymmetric Keys to be created from a binary hex bytes string. But, even without that convenience, there is still no reason to ever, ever, use the “Trusted Assemblies” feature.

He’s given three alternatives so far, so if you’re interested in CLR security, there’s plenty of food for thought.

One CLR Solution

Solomon Rutzky continues his SQL Server 2017 CLR security series:

This new requirement prevents the technique described towards the end of Part 1 from working. That technique uses a SAFE Assembly as an indirect means of creating the Asymmetric Key to create the Login from. That worked perfectly prior to SQL Server 2017, but now even SAFE Assemblies require that the signature-based Login be created first, which now puts us in a whole chicken-egg paradox.

Before proceeding to the solution, it should be noted that yes, Microsoft has, as of RC2 (released on 2017-08-02), provided a kinda/sorta “fix” for this that allows for creating an Assembly without having the signature-based Login. HOWEVER, that “fix” is absolutely horrible, convoluted, and unnecessary. It should not be used by anyone. Ever! In fact, it should be completely removed and forgotten about. In no uncertain terms: it is not an option! To help clarify, I am being intentionally vague about that new feature here (and in Part 1) so as not to distract from these two solutions (this post and Part 3) that do not promote bad practices; it will be covered starting in Part 4.

Solomon outlines one approach to dealing with CLR security changes, though it’s a bit lengthy.

Whitelisting SQL CLR Assemblies

Niels Berglund walks through the process of whitelisting a CLR assembly in SQL Server 2017:

What Microsoft introduces in SQL Server 2017 RC1, is something I refer to as whitelisting. It is somewhat similar to the TRUSTWORTHY setting, where you indicate that a database is to be trusted. But instead of doing it on the database level, you do it per assembly.

To whitelist in SQL Server 2017 RC1, you use the system stored procedure sys.sp_add_trusted_assembly. As the name implies the procedure adds an assembly to a list of “trusted” assemblies. By marking an assembly as trusted, SQL Server will allow it to be loaded when clr strict security is on (on by default), even if:

  • the assembly is not signed, and

  • the database where you want to deploy it to is not TRUSTWORTHY.

With the elimination of the CAS model finally hitting CLR, this is probably going to be one of the easier ways for DBAs to move forward with CLR in the future.
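As a rough illustration of what whitelisting involves: sys.sp_add_trusted_assembly identifies an assembly by a SHA-512 hash of its binary rather than by name. The sketch below, in Python, computes that hash from a byte string and builds the corresponding T-SQL call. The helper name, the placeholder bytes, and the description text are all hypothetical, and this is not taken from Niels's post; treat it as a sketch to check against the documentation before use.

```python
import hashlib


def trusted_assembly_statement(assembly_bytes: bytes, description: str) -> str:
    """Build an EXEC statement for sys.sp_add_trusted_assembly.

    Assumes the procedure takes the SHA-512 hash of the assembly binary
    as @hash and a free-text @description.
    """
    digest = hashlib.sha512(assembly_bytes).hexdigest().upper()
    # Double any single quotes so the description is a valid T-SQL literal.
    safe_description = description.replace("'", "''")
    return (
        "EXEC sys.sp_add_trusted_assembly "
        f"@hash = 0x{digest}, "
        f"@description = N'{safe_description}';"
    )


# Placeholder bytes only; in practice you would read the compiled DLL,
# for example with open(path, "rb").read().
fake_assembly = b"MZ placeholder bytes, not a real DLL"
statement = trusted_assembly_statement(fake_assembly, "Hypothetical demo assembly")
print(statement)
```

Because the hash covers the whole binary, any recompilation of the assembly would need a new entry in the trusted list.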

Making A CLR Function

Kevin Feasel

2017-07-21

CLR

Michael Bourgon walks us through creating a CLR function:

What we settled on was building a CLR that would make the web calls, feeding it our data via a FOR JSON query.  We would then log the results into a separate table to make sure everything worked as expected.  I made this as generic as possible so that others could use it.

So let’s go through the steps.

  1. Create the .Net code necessary

  2. Create a CLR script for compilation

  3. Compile the CLR

  4. CREATE the ASSEMBLY

  5. CREATE the PROCEDURE

  6. Call the procedure

  7. Run it automatically

For a more detailed look at building a CLR function, after you go through Michael’s post, check out Solomon Rutzky’s Stairway to CLR.

Whither CLR?

Joey D’Antoni is shaking his head about a CLR announcement:

With this in mind, Microsoft has made some big changes to CLR in SQL Server 2017. SQL CLR has always been an interesting area of the engine: it allows for the use of .NET code in stored procedures and user defined types. For certain tasks, it’s an extremely powerful tool; things like RegEx and geo functions can be much faster in native CLR than trying to do the equivalent operation in T-SQL. It’s always been a little bit of a security risk, since under certain configurations, CLR had access to resources outside of the context of the database engine. This was protected by boundaries defined in the CLR host policy. We had SAFE, EXTERNAL_ACCESS, and UNSAFE levels that we could set. SAFE simply limited access of the assembly to internal computation and local data access. For the purposes of this post, we will skip UNSAFE and EXTERNAL_ACCESS, but suffice it to say, these levels allow much deeper access to the rest of the server.

Code Access Security in .NET (which is used to manage these levels) has been marked obsolete. What does this mean? The boundaries that are marked SAFE may not be guaranteed to provide security. So “SAFE” CLR may be able to access external resources, call unmanaged code, and acquire sysadmin privileges. This is really bad.

It’s not the end of the world for CLR, but this is a breaking change.  Read on for more details.

SQL Server Regex

Kevin Feasel

2016-12-30

CLR, T-SQL

Dev Nambi has a new open-source project:

Databases store text, and the best way to manipulate text is to use a regular expression (‘regex’). Using regular expressions in SQL queries has been possible in many database engines for decades.

Now you can use regular expressions in SQL Server queries, too. I’ve created an open-source project, sql-server-regex, that lets you run regular expressions in T-SQL queries using scalar and table-valued functions.

This is a set of CLR functions which use the built-in .NET regular expressions functionality.  That makes it pretty easy to see how the code works.

CLR Survey

Kevin Feasel

2016-07-28

CLR

Michael J. Swart wants to know if you’re using CLR in your environment:

CREATE ASSEMBLY supports specifying a CLR assembly using bits, a bit stream that can be specified using regular T-SQL. The full method is described in Deploying CLR Database Objects. In practice, the CREATE ASSEMBLY statement looks something like:

After learning about assembly deployment, check out Michael’s one-question survey.
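The bits-based form of CREATE ASSEMBLY mentioned in the quote takes the assembly as a hexadecimal binary literal instead of a file path. The Python sketch below shows one way such a statement could be generated from raw bytes. It is not the statement from Michael's post; the helper name, the assembly name, and the four sample bytes are hypothetical.

```python
def create_assembly_statement(
    name: str, assembly_bytes: bytes, permission_set: str = "SAFE"
) -> str:
    """Build a CREATE ASSEMBLY statement that embeds the binary as a hex literal."""
    allowed = {"SAFE", "EXTERNAL_ACCESS", "UNSAFE"}
    if permission_set not in allowed:
        raise ValueError(f"permission_set must be one of {sorted(allowed)}")
    # Bracket-quote the name, doubling any closing bracket inside it.
    quoted_name = "[" + name.replace("]", "]]") + "]"
    hex_literal = "0x" + assembly_bytes.hex().upper()
    return (
        f"CREATE ASSEMBLY {quoted_name}\n"
        f"FROM {hex_literal}\n"
        f"WITH PERMISSION_SET = {permission_set};"
    )


# Four sample bytes only; a real assembly would be the full contents of the DLL.
statement = create_assembly_statement("DemoAssembly", b"MZ\x90\x00")
print(statement)
```

Generating the literal this way means the deployment script carries the assembly with it, so the SQL Server instance never needs file system access to the DLL.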
