Multiple Regression

Kevin Feasel



Anastasios Markitsis is starting a series of exercises on multiple regression in R:

Exercise 1

a. Load the state datasets.
b. Convert the state.x77 dataset to a dataframe.
c. Rename the Life Exp variable to Life.Exp, and HS Grad to HS.Grad. (This avoids problems with referring to these variables when specifying a model.)

Click through for the rest of the exercises as well as the answers.

Windows Firewall: Allowing Inbound Connections

Stephen West has a post on creating Windows firewall rules to allow inbound traffic for a SQL Server instance:

For Static Port:

  • Go to Start>Run and type WF.msc and then click on OK button

  • Under the Windows Firewall with Advanced Security, right-click on Inbound Rules, and then click on New Rule

  • In the Rule Type box, select the option Port, and then click on Next button

  • In the dialog box of Port, select the option TCP. Then, select the option Specific local ports, after that type the port number 1433 for the static instance. After that click on Next button

  • Select Allow the action under the Action dialog box and then click on Next button

  • Now, Under the Profile dialog box, select any profiles which you want to connect to the SQL server, and then click on Next button

  • Type a name and description of the rule, in the Name dialog box and then click on Finish button

Read on for dynamic ports.  I feel like I need to throw out all kinds of warnings about not exposing a SQL Server instance directly to the public internet.

Understanding Bayesian Priors

Angelika Stefan and Felix Schönbrodt explain the concept of priors:

When reading about Bayesian statistics, you regularly come across terms like “objective priors“, “prior odds”, “prior distribution”, and “normal prior”. However, it may not be intuitively clear that the meaning of “prior” differs in these terms. In fact, there are two meanings of “prior” in the context of Bayesian statistics: (a) prior plausibilities of models, and (b) the quantification of uncertainty about model parameters. As this often leads to confusion for novices in Bayesian statistics, we want to explain these two meanings of priors in the next two blog posts*. The current blog post covers the the first meaning of priors.

Priors are a big differentiator between the Bayesian statistical model and the classical/frequentist statistical model.

Replication And Date Conversion

Jeffrey Verheul digs up a strange replication bug:

After a lot of different variables in the test-setup, I found out that it’s probably an old bug that wasn’t properly patched when upgrading the SQL Server engine to a newer version. Let me elaborate on that:

– The bug is reproducible on the test server, which is an upgraded engine from SQL 2012 or 2014 to SQL 2016 RTM
– The bug is reproducible on the production server, which is an upgraded engine from SQL 2014 to SQL 2016 RTM
– The bug is not reproducible on a clean install of SQL 2014
– The bug is not reproducible on a clean install of SQL 2016 RTM
– The bug is not reproducible on a clean install of SQL vNext CTP

There’s a lot of good investigative work here, so check it out.

Doomed Transactions

Michael Swart talks about doomed transactions:

So the procedure was complicated and it used explicit transactions, but I couldn’t find any TRY/CATCH blocks anywhere! What I needed was a stack trace, but for T-SQL. People don’t talk about T-SQL stack traces very often. Probably because they don’t program like this in T-SQL. We can’t get a T-SQL stack trace from the SQLException (the error given to the client), so we have to get it from the server.

Michael shows how to get stack trace information and provides some advice on the process (mostly, “don’t do what we did”).


Kenneth Fisher explains what the COPY_ONLY flag means for a backup:

So to put that in simple terms. I have a database, Test. I take a full backup, changes happen, I take a differential backup, changes happen, I take a differential backup, etc. Ignoring all of the log backups that are happening if the database is in FULL recovery of course.

Currently, both differentials contain everything that has happened between the time of the full backup and the time the differential was taken. But what happens if I take a second full backup between the first and second differential? Now, that second differential will only contain data between the second full backup and the differential.

Read on for more.

Animating Visuals In R

Tomaz Kastrun shows how to create animated charts in R using ggplot2:

In addition to R code, the ImageMagic program needs to be installed on your machine, as well. Also the speed, quality and many other parameters can be set, when creating animated gif.

Animated gif can be also included into your SSRS report, your Sharepoint site or any other site – like my blog 🙂 and it will stay interactive. In Power BI, importing animated gif as a picture, unfortunately will not work.

Be very careful with this, as not everything supports animated GIFs and you can make some really painful graphs if you try hard enough…

Session-Level Wait Stats

Arun Sirpal points out that SQL Server 2016 has a session-level wait stats DMV:

This tells me about the waits since my last reboot or since a manual reset of the stats. It’s probably why you should do at least time-based analysis or reset the wait stats before starting, that is if you are interested in something time specific or if you want to understand certain workloads at a given time.

So the other option is that you could go down the session level route. With the session based analysis I took the query and changed it slightly to query sys.dm_exec_session_waits_stats and also pull back the session_id that I am interested in.

I had no idea this was available, and it’s something that I’ve wanted for a very long time, so that’s excellent.

Setting Your Maximum Memory

Thomas Rushton provides a script to set max memory on a SQL Server instance:

The thing to do, ideally, is to configure the maximum server memory when you build the server; however, sometimes you walk into a place where there are many servers where this hasn’t been done, or are otherwise looking for a quick way to determine what the setting should be. Jonathan Kehayias of SQLSkills blogged about a sensible SQL Server Maximum memory calculation (in response to a post elsewhere about a really dodgy memory config advisor, but I’m not going to link to that…)

What I’ve done below is codify that knowledge into a nice friendly T-SQL query that you can run, below. It makes use of the sys.dm_os_sys_info DMV to get the memory physically in the server; that DMV, though, has changed form between SQL 2008R2 and SQL 2012, the new version reporting physical_memory_kb whereas the previous version had physical_memory_in_bytes. Hence a bit of dynamic SQL nastiness at the start of the query.

Click through for the script, but make sure to tweak it for your environmental peculiarities.


January 2017
« Dec Feb »