Press "Enter" to skip to content

Curated SQL Posts

Batch Requests Per Second

Tara Kizer explains the Batch Requests/sec metric:

WHAT IS BATCH REQUESTS/SEC?

Batch Requests/sec is a performance counter that tells us the number of T-SQL command batches received by the server per second. It is in the SQLServer:SQL Statistics performance object for a default instance or MSSQL$InstanceName:SQL Statistics for a named instance.

WHAT COMPRISES A BATCH?

If I execute a stored procedure that has multiple queries and calls to other stored procedures, what will Batch Requests/sec show? Let’s test it to find out.

Click through for the answer.

Comments closed

Deadlock Priority

Kenneth Fisher tells a story of a deadlock:

Why does it matter that they were system sessions? The important thing to remember here is that these sessions can not be KILLed. So because they were holding locks on the database (And somehow even though it was in single user there were multiple sessions with locks in the database. Don’t ask me how.) I wasn’t able to get that exclusive access I needed.

Interestingly when I tried to do the ALTER instead of just hanging I immediately got a deadlock error. I spent a little while trying various things and searching through forums before I went for help on twitter using the #SQLHELP hashtag.

Read on for the answer, including how deadlock priorities saved the day.

Comments closed

Twitter Sentiment Analysis Using doc2vec

Sergey Bryl uses word2vec and doc2vec to perform Twitter sentiment analysis in R:

But doc2vec is a deep learning algorithm that draws context from phrases. It’s currently one of the best ways of sentiment classification for movie reviews. You can use the following method to analyze feedbacks, reviews, comments, and so on. And you can expect better results comparing to tweets analysis because they usually include lots of misspelling.

We’ll use tweets for this example because it’s pretty easy to get them via Twitter API. We only need to create an app on https://dev.twitter.com (My apps menu) and find an API Key, API secret, Access Token and Access Token Secret on Keys and Access Tokens menu tab.

Click through for more details, including code samples.

Comments closed

Plan Comparison

Max Vernon looks at plan comparison in SQL Server Management Studio:

The table has 10,000,000 rows. I’ve create a non-clustered columnstore index on the table, which I’ll talk about in a future post. I’ve included it here because it provides a succinct difference in the two plans.

To compare the plans visually, side-by-side, you need to save the first plan by right-clicking on the plan window, clicking “Save Execution Plan As…”, and specifying a filename. Next, right-click on the plan window, and choose “Compare Showplan”:

I’ve only used this once or twice, but it is an interesting feature.

Comments closed

Hadoop In The Cloud

Peter Coates talks about pros and cons to Hadoop in the cloud:

Hadoop was developed for deployment over Linux running on bare metal. Cloud deployment implies virtual machines, and for Hadoop it’s a huge difference.

As detailed in other articles (for instance, Your Cluster Is an Appliance or Understanding Hadoop Hardware Requirements), bare-metal deployments have an inherent advantage over virtual machine deployments. The biggest of these is that they can use direct attached storage, i.e., local disks.

Not every Hadoop workload is storage I/O bound, but most are, and even when Hadoop seems to be CPU bound, much of the CPU activity is often either directly in service of I/O, i.e., marshaling, unmarshaling, compression, etc., or in service of avoiding I/O, i.e., building in-memory tables for map-side joins.

Read the whole thing.

Comments closed

String Splitting And Concatenation

Aaron Bertrand and Steve Hughes talk about string splitting, and Aaron also discusses string concatenation.  First Aaron:

That may not look like a massive simplification, but don’t forget about all the logic buried behind the table-valued function in the first example. And if you’re like several shops I know, if you look across your codebase and see all the messy uses you have for either of these methods, the benefits should be even more clear – and testing should bear that the performance savings compared to traditional, expensive methods are the sweetest part of the deal.

And Steve:

The STRING_SPLIT function will return a single column result set. The column name is “value”. The datatype will be NVARCHAR for strings that are NCHAR or NVARCHAR. VARCHAR is used for strings that are CHAR or VARCHAR types.

These two functions are small, but come in handy quite frequently.

Comments closed

Galaxy Classification With SQL Server

David Smith points out a nice Microsoft demo for classifying galaxies using SQL Server:

The SQL Server Blog has since published a step-by-step tutorial on implementing the galaxy classifier in SQL Server (and the code is also available on GitHub). This updated version of the demo uses the new MicrosoftML package in Microsoft R Server 9, and specifically the rxNeuralNet function for deep neural networks. The tutorial recommends using the Azure NC class of virtual machines, to take advantage of the GPU-accelerated capabilities of the function, and provides details on using the SQL Server interfaces to train the neural netowrk and run predictions (classifications) on the image database. For the details, follow the link below.

If you’re going to get into SQL Server R Services at any level of seriousness, I highly recommend R Tools for Visual Studio, as it will make building those external stored procedure calls much easier.

Comments closed

AT TIME ZONE And Reports

Rob Farley shows how to use AT TIME ZONE without sacrificing performance:

Because how am I supposed to know whether a particular date was before daylight saving started or after? I might know that an incident occurred at 6:30am in UTC, but is that 4:30pm in Melbourne or 5:30pm? Obviously I can consider which month it’s in, because I know that Melbourne observes daylight saving time from the first Sunday in October to the first Sunday in April, but then if there are customers in Brisbane, and Auckland, and Los Angeles, and Phoenix, and various places within Indiana, things get a lot more complicated.

To get around this, there were very few time zones in which SLAs could be defined for that company. It was just considered too hard to cater for more than that. A report could then be customised to say “Consider that on a particular date the time zone changed from X to Y”. It felt messy, but it worked. There was no need for anything to look up the Windows registry, and it basically just worked.

But these days, I would’ve done it differently.

Now, I would’ve used AT TIME ZONE.

Read on for the scenario.

Comments closed