Press "Enter" to skip to content

Curated SQL Posts

Target Server Memory Vs Total Memory

Arun Sirpal compares the Target Server Memory and Total Memory Perfmon counters:

Target Server Memory (KB) is the amount of memory that SQL Server is willing (potential) to allocate to the buffer pool under its current load. Total Server Memory (KB) is what SQL currently has allocated.

Using SQL Server 2014 developer edition (64 bit) my machine has 12GB RAM and maximum server memory is currently set to 8GB and for the purpose of this post I have set  minimum server memory to 1GB (Lock Pages in Memory has not been set).

Read on for a nice description laden with Perfmon pictures.

Comments closed

Preventing Event Storms

Kenneth Fisher has some good advice when dealing with event notifications:

One of the most common ways to get an event notification is by email. So what happens when you get 500 emails in a day and only one or two are actionable? Do you read every single email? Spending quite literally hours to find those one or two gems? Or do you just ignore the whole lot and wait for some other notification that there is a problem. Say, by a user calling you?

Next, let’s say you have a job that runs every few minutes checking if an instance is down. When that instance goes down you get an immediate email. Which is awesome! Of course then while you are trying to fix the issue you get dozens more emails about the same outage. That is at best distracting and at worst makes it take longer for you to fix the issue.

Fun story time:  at one point during my work career, there was a person (not me!) who accidentally broke every single SQL Agent job on dozens of instances and nobody noticed it for hours.  These weren’t production instances so it wasn’t the end of the world or anything…except that included in the broken jobs were a bunch which ran every minute.  And alerted every minute.  Via e-mail.  The entire database team essentially lost e-mail access for 3 days as there were so many messages coming in that it overwhelmed our provider’s ability to serve messages to us.  This type of mistake can happen, and if we had put into place some of the things Kenneth talks about, the consequences would have been less severe.

Comments closed

How SSDT Publication Works

Ed Elliott has a great post deconstructing the SQL Server Data Tools deployment process:

The first thing to note is the garbage collection, there are like 8 changes to the line so 8 sets of garbage collection that happen which is expensive for a .net app. We get much more processor usage for a more prolonged time and we really start to see some disk activity, write activity just has one peak while the files are written but reading stays quite high throughout the whole process. It could be that it is sql reading from disk (my demo was from a single ssd with a local sql instance) but we can see after the sql cpu settles down the green disk read line stays high so it is unlikely to be pure sqlserver.exe.

What does this tell us?

Well if you have large objects, multi-thousand line stored procs then you better get yourself some serious ssd’s to deploy from. If you have smaller objects then you’ll need CPU and memory – don’t scrimp here!

Check it out.

Comments closed

Cognitive Services With R

Steph Locke shows how to use the Microsoft Cognitive Services Text Analytics API within R:

We have some different languages and we need to first do language detection before we can analyse the sentiment of our phrases

# Construct a request
response<-POST(cogapi, add_headers(`Ocp-Apim-Subscription-Key`=cogapikey), body=toJSON(mydata))

Now we need to consume our response such that we can add the language code to our existing data.frame. The structure of the response JSON doesn’t play well with others so I use data.table’s nifty rbindlist. It is a **very good* candidate for purrr but I’m not up to speed on that yet.

Check out the whole post; Steph makes it look easy.

Comments closed

Building A Neural Net

Shirin Glander has a great post on using Spark + sparklyr + h2o + rsparkling to build a neural net to study arrhythmia of the heart:

The data I am using to demonstrate the building of neural nets is the arrhythmia dataset from UC Irvine’s machine learning database. It contains 279 features from ECG heart rhythm diagnostics and one output column. I am not going to rename the feature columns because they are too many and the descriptions are too complex. Also, we don’t need to know specifically which features we are looking at for building the models. For a description of each feature, see https://archive.ics.uci.edu/ml/machine-learning-databases/arrhythmia/arrhythmia.names. The output column defines 16 classes: class 1 samples are from healthy ECGs, the remaining classes belong to different types of arrhythmia, with class 16 being all remaining arrhythmia cases that didn’t fit into distinct classes.

Very interesting post.

Comments closed

Prophet

Rodrigo Agundez looks at Prophet, Facebook’s new API for store sales forecasting:

The data is of a current client, therefore I won’t be disclosing any details of it.

Our models make forecasts for different shops of this company. In particular I took 2 shops, one which contains the easiest transactions to predict from all shops, and another with a somewhat more complicated history.

The data consists of real transactions since 2014. Data is daily with the target being the number of transactions executed during a day. There are missing dates in the data when the shop closed, for example New Year’s day and Christmas.

The holidays provided to the API are the same I use in our model. They contain from school vacations or large periods, to single holidays like Christmas Eve. In total, the data contains 46 different holidays.

It looks like Prophet has some limitations but can already make some nice predictions.

Comments closed

Getting M Code From Power Query

Chris Webb shows that you can copy a query and paste into Notepad to get the M code for that query:

Two years ago I blogged about a method to export all the M code for all of your queries in Power Query using the Send A Frown button – useful if you need the code for documentation purposes. This trick doesn’t work with Power BI Desktop, unfortunately, but the good news is that there’s a better way to do this now in Power Query and Power BI Desktop using copy/paste. It’s pretty simple really: when you copy a query from the Power Query or Power BI Desktop Query Editor you can not only paste the query to another Query Editor (pasting from Power Query to Power BI and vice versa works too) but you can also paste the query to a text editor like Notepad and get the M code for the query.

Read on for more.

Comments closed

Broadcast Nested Loop Joins In Spark

Reynold Xin, et al, debug an interesting test case:

While we were pretty happy with the improvement, we noticed that one of the test cases in Databricks started failing. To simulate a hanging query, the test case performed a cross join to produce 1 trillion rows.

spark.range(1000 * 1000).crossJoin(spark.range(1000 * 1000)).count()

On a single node, we expected this query would run infinitely or “hang.” To our surprise, we started seeing this test case failing nondeterministically because sometimes it completed on our Jenkins infrastructure in less than one second, the time limit we put on this query.

You’re not going to get this performance against a real data set, but it was interesting reading their troubleshooting notes.

Comments closed