Press "Enter" to skip to content

Author: Kevin Feasel

Data Analysis Basics In R

Sibanjan Das provides some of the basics of data analysis using R:

Let’s start thinking in a logical way the steps that one should perform once we have the data imported into R.

  • The first step would be to discover what’s in the data file that was exported. To do this, we can:
    • Use head function to view few rows from the data set. By default, head shows first 5 rows. Ex: head(mtcars)
    • str to view the structure of the imported data. Ex: str(mtcars)
    • summary to view the data summary. Ex: summary(mtcars) 

There’s a lot to data analysis, but this is a good start.

Comments closed

Suse On Windows 10

Brad Sams reports that Ubuntu isn’t the only flavor of Linux available on Windows 10 anymore:

If you do go down this route, you have the option for installing either openSUSE Leap 42.2 and SUSE Linux Enterprise Server 12 SP2.

The benefits here are obvious, with Microsoft enabling the Windows subsystem for Linux, they are opening the door to more than simply running Bash inside of Windows 10. While that is a good feature and one of the most likely used instances of this subsystem, what Microsoft has actually done is opened the door for more vendors to bring their Linux tools to the Windows platform.

I’d expect Red Hat to follow suit.

Comments closed

Detail Rows Expression In SSAS Tabular

Chris Webb shows a new feature in SSAS Tabular vNext:

What drillthrough does in SSAS Multidimensional, and what the new Detail Rows Expression property in SSAS Tabular v.next does, is allow an end user to see the detail-level data (usually the rows in the fact table) that was aggregated to give the value the user clicked on in the original PivotTable.

Read through for an example as well as how it’s already an improvement over Multidimensional’s dillthrough.

Comments closed

Thinking About Real-Time Analytics

Martin Willcox offers some advice for people getting into the real-time analytics game:

  1. Clarify who will be making the decision – man, or machine? Humans have powers of discretion that machines sometimes lack, but are much slower than a silicon-based system, and only able to make decisions one-at-a-time, one-after-another.  If we chose to put a human in the loop, we are normally in “please-update-my-dashboard-faster-and-more-often” territory.

  2. It is important to be clear about decision-latency. Think about how soon after a business event you need to take a decision and then implement it. You also need to understand whether decision-latency and data-latency are the same. Sometimes a good decision can be made now on the basis of older data. But sometimes you need the latest, greatest and most up-to-date information to make the right choices.

There are some good insights here.

Comments closed

Windows Firewall: Allowing Inbound Connections

Stephen West has a post on creating Windows firewall rules to allow inbound traffic for a SQL Server instance:

For Static Port:

  • Go to Start>Run and type WF.msc and then click on OK button

  • Under the Windows Firewall with Advanced Security, right-click on Inbound Rules, and then click on New Rule

  • In the Rule Type box, select the option Port, and then click on Next button

  • In the dialog box of Port, select the option TCP. Then, select the option Specific local ports, after that type the port number 1433 for the static instance. After that click on Next button

  • Select Allow the action under the Action dialog box and then click on Next button

  • Now, Under the Profile dialog box, select any profiles which you want to connect to the SQL server, and then click on Next button

  • Type a name and description of the rule, in the Name dialog box and then click on Finish button

Read on for dynamic ports.  I feel like I need to throw out all kinds of warnings about not exposing a SQL Server instance directly to the public internet.

Comments closed

Understanding Bayesian Priors

Angelika Stefan and Felix Schönbrodt explain the concept of priors:

When reading about Bayesian statistics, you regularly come across terms like “objective priors“, “prior odds”, “prior distribution”, and “normal prior”. However, it may not be intuitively clear that the meaning of “prior” differs in these terms. In fact, there are two meanings of “prior” in the context of Bayesian statistics: (a) prior plausibilities of models, and (b) the quantification of uncertainty about model parameters. As this often leads to confusion for novices in Bayesian statistics, we want to explain these two meanings of priors in the next two blog posts*. The current blog post covers the the first meaning of priors.

Priors are a big differentiator between the Bayesian statistical model and the classical/frequentist statistical model.

Comments closed

Replication And Date Conversion

Jeffrey Verheul digs up a strange replication bug:

After a lot of different variables in the test-setup, I found out that it’s probably an old bug that wasn’t properly patched when upgrading the SQL Server engine to a newer version. Let me elaborate on that:

– The bug is reproducible on the test server, which is an upgraded engine from SQL 2012 or 2014 to SQL 2016 RTM
– The bug is reproducible on the production server, which is an upgraded engine from SQL 2014 to SQL 2016 RTM
– The bug is not reproducible on a clean install of SQL 2014
– The bug is not reproducible on a clean install of SQL 2016 RTM
– The bug is not reproducible on a clean install of SQL vNext CTP

There’s a lot of good investigative work here, so check it out.

Comments closed

Doomed Transactions

Michael Swart talks about doomed transactions:

So the procedure was complicated and it used explicit transactions, but I couldn’t find any TRY/CATCH blocks anywhere! What I needed was a stack trace, but for T-SQL. People don’t talk about T-SQL stack traces very often. Probably because they don’t program like this in T-SQL. We can’t get a T-SQL stack trace from the SQLException (the error given to the client), so we have to get it from the server.

Michael shows how to get stack trace information and provides some advice on the process (mostly, “don’t do what we did”).

Comments closed

COPY_ONLY Backups

Kenneth Fisher explains what the COPY_ONLY flag means for a backup:

So to put that in simple terms. I have a database, Test. I take a full backup, changes happen, I take a differential backup, changes happen, I take a differential backup, etc. Ignoring all of the log backups that are happening if the database is in FULL recovery of course.

Currently, both differentials contain everything that has happened between the time of the full backup and the time the differential was taken. But what happens if I take a second full backup between the first and second differential? Now, that second differential will only contain data between the second full backup and the differential.

Read on for more.

Comments closed