Press "Enter" to skip to content

Author: Kevin Feasel

Using Service Broker To Queue Up External Script Calls

Arvind Shyamsundar shows how to use Service Broker to run external R or Python scripts based on new data coming into a transactional system:

Here, we will show you how you can use the asynchronous execution mechanism offered by SQL Server Service Broker to ‘queue’ up data inside SQL Server which can then be asynchronously passed to a Python script, and the results of that Python script then stored back into SQL Server.

This is effectively similar to the external message queue pattern but has some key advantages:

  • The solution is integrated within the data store, leading to fewer moving parts and lower complexity
  • Because the solution is in-database, we don’t need to make copies of the data. We just need to know what data has to be processed (effectively a ‘pointer to the data’ is what we need).

Service Broker also offers options to govern the number of readers of the queue, thereby ensuring predictable throughput without affecting core database operations.

There are several interconnected parts here, and Arvind walks through the entire scenario.

Comments closed

Updating Data In Common Table Expressions

Kenneth Fisher shows that you can directly update a table referenced in a common table expression:

CTEs are cool things. You can essentially create one or more in-line view(s) within your query. One thing that isn’t overly well known is that you can actually update the data within the CTE. No, I don’t mean using using the UPDATE statement with a CTE but actually running the update through the CTE.

This is really powerful when combined with window functions, like only updating the first record given a particular partition.  You can also delete, which makes duplicate detection and deletion fairly straightforward.

Comments closed

Using Regular Expressions In Check Constraints

Denis Gobo shows that SQL Server check constraints support limited regular expression capabilities:

While SQL server does not support a full implementation of regular expression, you can do what the person asked for without a problem in T-SQL. Here is what the regular expression looks like

[DMOPT][0-9][0-9]

A constraint like that will allow allow the following alphabetic characters (D, M, O, P or T) followed by 2 numeric characters. Enough talking let’s look at some code, first create this table

Read on to see how this constraint works and for implementation code.

Comments closed

Online Learning Algorithms

Xin Hunt describes the benefits of online learning algorithms:

A few examples of classical online learning algorithms include recursive least squares, stochastic gradient descent and multi-armed bandit algorithms like Thompson sampling. Many online algorithms (including recursive least squares and stochastic gradient descent) have offline versions. These online algorithms are usually developed after the offline version, and are designed for better scaling with large datasets and streaming data. Algorithms like Thompson sampling on the other hand, do not have offline counterparts, because the problems they solve are inherently online.

Let’s look at interactive ad recommendation systems as an example. You’ll find ads powered by these systems when you browse popular publications, weather sites and social media networks. These recommendation systems build customer preference models by tracking your shopping and browsing activities (ad clicking, wish list updates and purchases, for example). Due to the transient nature of shopping behaviors, new recommendations must reflect the most recent activities. This makes online learning a natural choice for these systems.

My favorite online learning algorithm at the moment is Online Passive-Aggressive Algorithms.  Not just because that name describes my Twitter feed.

1 Comment

Optimizing Apache Flink

Ivan Mushketyk has a few tips for speeding up programs using Apache Flink:

One more way to optimize your Flink application is to provide some information about what your user-defined functions are doing with input data. Since Flink can’t parse and understand code, you can provide crucial information that will help to build a more efficient execution plan. There are three annotations that we can use:

  1. @ForwardedFields: Specifies what fields in an input value were left unchanged and are used in an output value.

  2. @NotForwardedFields: Specifies fields that were not preserved in the same positions in the output.

  3. @ReadFields: Specifies what fields were used to compute a result value. You should only specify fields that were used in computations and not merely copied to the output.

Click through for his four tips.

Comments closed

Installing The Azure ML Workbench

Leila Etaati walks us through setting up the Azure ML workbench:

In Microsoft ignite 2017, Azure ML team announce new on-premises tools for doing machine learning. this tools much more comprehensive as it provides

1- a workspace helps data wrangling

2- Data Visualization

3-Easy to deploy

4-Support Python codes

in this post and next posts, I will share my experiment with working this tools.

Click through for the step-by-step installation guide.

Comments closed

When The Maximum Workspace Memory Isn’t The Internal Pool Maximum

Lonny Niederstadt answers the call from someone who needs the combination of Perfmon and DMV data:

When is a maximum not really the maximum?
When it’s a maximum for an explicitly or implicitly modified default.
Whether “the definitive documentation” says so or not.

Yesterday on Twitter #sqlhelp this question came up.

*****

*****

Aha! I thought to myself.  For this I am purposed! To show how Perfmon and DMV data tie out!

Read on for the simple form of the answer, followed by the complication which makes life interesting.

Comments closed

Checking If UseLargeFRS Is Enabled

Cody Konior has a quick post showing how to see if you have the UseLargeFRS switch turned on for your SQL Server disks:

PureStorage has a pretty cool post that mentions the importance of formatting SQL Server disks with a 64KB clusters and the /L flag (also known as the UseLargeFRS switch on PowerShell’s Format-Volume cmdlet).

Why UseLargeFRS? It’s to help avoid DBCC CHECKDB failures on large/busy databases. But how do you work out whether it’s enabled or not? PowerShell to the rescue!

You can’t work this out remotely but you can do it locally.

Read on for the script.

Comments closed

Tracking Replication Throughput

Phil Grayson has a couple of replication-related queries:

It’s a right laugh when replication goes wrong, if you haven’t been there and use replication heavily, you’re in for the night / early morning of your life. Like any issue, the cause of replication problems could be one of many, but checking what commands are being passed through the distributor is always worthwhile, as an inefficient query could be updating the same data over and over, putting unnecessary strain on the distributor, infrastructure or the subscribers.

Read on for those queries.

Comments closed

XESmartTarget: Event Handling With SQL Server

Gianluca Sartori introduces a new tool to help with event handling in SQL Server:

I am proud to introduce XESmartTarget: the easiest way to interact with extended events sessions without writing a single line of code.

XESmartTarget uses the Extended Events streaming API to connect to a session and execute some actions in response to each event. The actions you perform can be configured in a simple .json file, which controls the Response types and all their properties.

For instance, if you have a session to capture deadlocks, you may want to be notified whenever such an event is captured. All you need to do is configure XESmartTarget to send an email using the EmailResponse class. You can do that by creating a simple .json file with all the parameters needed

This looks quite interesting.

Comments closed