However, it seems that there might be two kinks in the line:
The first kink occurs somewhere between the 800m distance and the mile. It seems that the sprinting distances (the 800m is sometimes called a long sprint) have different dynamics from the events up through the marathon.
The analysis is done in R, and the code is available in the post. Check it out.
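As a hedged sketch of the general technique (this is not the post's code, and the data below is synthetic), the segmented package will estimate where the kinks in a fitted line sit:

```r
# A rough sketch (not the post's code): find breakpoints with the 'segmented'
# package, using synthetic log-pace vs. log-distance data with two kinks built in.
library(segmented)

set.seed(42)
log_distance <- seq(log(100), log(42195), length.out = 40)
b1 <- log(1000)
b2 <- log(10000)
log_pace <- 0.05 * log_distance +
  0.07 * pmax(log_distance - b1, 0) +   # slope change at the first kink
  0.05 * pmax(log_distance - b2, 0) +   # slope change at the second kink
  rnorm(40, sd = 0.01)
records <- data.frame(log_distance, log_pace)

fit <- lm(log_pace ~ log_distance, data = records)
seg_fit <- segmented(fit, seg.Z = ~log_distance,
                     psi = c(log(800), log(8000)))  # starting guesses for the kinks
seg_fit$psi  # estimated breakpoint locations
```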
But R is also part of an entire ecosystem of open tools that can be linked together. For example, Markdown, Pandoc, and knitr combine to make R an incredible tool for dynamic reporting and reproducible research. If your chosen output format is HTML, you’ve linked into yet another open ecosystem with countless further extensions.
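As a minimal sketch of that pipeline (the file name here is mine), a single rmarkdown::render() call drives knitr to run the chunks and Pandoc to produce the HTML:

```r
# Write a tiny R Markdown document and render it to HTML; rmarkdown drives
# knitr (to execute the R chunks) and Pandoc (to build the output format).
lines <- c(
  "---",
  "title: \"A dynamic report\"",
  "output: html_document",
  "---",
  "",
  "```{r}",
  "summary(cars)  # code and its results are woven into the page",
  "plot(cars)",
  "```"
)
writeLines(lines, "report.Rmd")
rmarkdown::render("report.Rmd")  # produces report.html
```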
Generating a page from R is one of those good ideas that I probably don’t want to see in a production environment.
Not only can we create and download custom visuals from PowerBI.com to extend the capabilities of Power BI, but we can also use R to create a ridiculous number of powerful visualizations. If you can get the data into Power BI, you can use R to perform interesting statistical analysis and create some pretty cool, interactive visuals.
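For a flavor of what an R visual's script looks like: Power BI hands whatever fields you drag in to the script as a data frame named dataset. The column names below are hypothetical:

```r
# Script for a Power BI R visual; Power BI supplies the 'dataset' data frame.
# 'SaleDate', 'Sales', and 'Region' are made-up column names.
library(ggplot2)
ggplot(dataset, aes(x = SaleDate, y = Sales, colour = Region)) +
  geom_line() +
  labs(title = "Sales by region") +
  theme_minimal()
```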
Dustin and Jan Mulkens are working on similar posts at the same time, so watch both of them.
Jan Mulkens has started a series on combining Power BI and R.
Fact is, R is here to stay. Even Microsoft has integrated R with SQL Server 2016, and it has made R scripting possible in its great Azure Machine Learning service.
So it was only a matter of time before we were going to see R integrated in Power BI.
From the previous point, it seems that R is just running in the background and that most of the functionality can be used.
Testing some basic functionality like importing and transforming data in the R visual worked fine.
I haven’t tried any predictive modelling yet but I assume that will just work as well.
So instead of printing “Hello world” to the screen, we’ll use a simple graph to say hello to the world.
First we need some data. Power BI enables us to enter data in a familiar, Excel-style grid.
Just select “Enter Data” and start bashing out some data.
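With those columns dragged into an R visual (Power BI exposes them as a data frame called dataset), the "hello world" graph can be a couple of lines of base R; x and y stand in for whatever you named the entered columns:

```r
# A minimal "hello world" R visual; 'x' and 'y' are hypothetical column names
# from the data entered above, surfaced to the script via 'dataset'.
plot(dataset$x, dataset$y,
     main = "Hello, world",
     pch  = 19, col = "steelblue")
```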
I’m looking forward to the rest of the series.
So I went through and converted everything in my Rtraining to this, and realised it messed up my slide decks. It’s been so long since I had built a pure knitr solution that I forgot it uses knitr::knit; for my slide decks, if I wanted the ioslides_presentation format, I needed to use rmarkdown::render. The problem with that has been the relative references to the CSS and the logo.

To solve this I read about the custom render formats capability and created a function that produces an ioslides_presentation but with my CSS preloaded by default. This now means that I can produce slides with better file referencing.
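A sketch of what such a wrapper can look like, with hypothetical package and CSS names: a custom render format is just a function that returns an output format, so it can fill in the css argument before delegating to rmarkdown.

```r
# A custom render format: ioslides_presentation with my CSS preloaded.
# 'mypkg' and 'mytheme.css' are hypothetical names.
my_ioslides <- function(...) {
  rmarkdown::ioslides_presentation(
    css = system.file("css", "mytheme.css", package = "mypkg"),
    ...
  )
}
```

A document then opts in with `output: mypkg::my_ioslides` in its YAML header, and the CSS resolves from the package rather than from a fragile relative path.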
Steph has put up all of her R-related presentations and documentation as well, so check that out.
Detecting fraudulent transactions is a key application of statistical modeling, especially in an age of online transactions. R of course has many functions and packages suited to this purpose, including binary classification techniques such as logistic regression.
If you’d like to implement a fraud-detection application, the Cortana Analytics gallery features an Online Fraud Detection Template. This is a step-by-step guide to building a web service which will score transactions by likelihood of fraud, created in five steps.
Read through for the five follow-up articles. This is a fantastic series and I plan to walk through it step by step myself.
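As a taste of the logistic-regression piece mentioned above, here's a hedged sketch on simulated data; every column name is made up:

```r
# Binary classification of transactions with logistic regression on fake data.
set.seed(1)
n <- 1000
transactions <- data.frame(
  amount        = rlnorm(n, meanlog = 4),
  is_new_device = rbinom(n, 1, 0.2)
)
transactions$is_fraud <- rbinom(
  n, 1, plogis(-4 + 0.01 * transactions$amount + 2 * transactions$is_new_device)
)

model <- glm(is_fraud ~ amount + is_new_device,
             data = transactions, family = binomial)
transactions$score <- predict(model, type = "response")  # fraud probability
head(transactions[order(-transactions$score), ])          # most suspicious first
```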
Awesome. Fixed that algorithm problem, right?
That’s because algorithms are not the problem… or at least not the only problem. The real problem is data preparation. A lot of the examples you’ll read online are very straightforward, with nice, neat data sets. That’s because they were carefully groomed and prepared. Here I am looking at the woolly, wild, real data, and I’m utterly lost as to how to properly prepare it so that it’s appropriately set up as a continuous distribution (or a distribution at all). WOOF! The reason this is so hard is that I don’t actually understand the data fundamentals of the problem I’m trying to solve in exactly the way needed to solve it. More cogitation is necessary.
Just because you can write R code doesn’t mean you are a data scientist. Grant has the right mindset, but this post is fair warning that R’s complexity isn’t so much in its being a DSL as in the domain itself.
You may have heard that R and the big-data RevoScaleR package have been integrated with SQL Server 2016 as SQL Server R Services. If you’ve been wanting to try out R with SQL Server but haven’t been sure where to start, a new MSDN tutorial will take you through all the steps of creating a predictive model: from obtaining data for analysis, to building a statistical model, to creating a stored procedure to make predictions from the model. To work through the tutorial, you’ll need a suitable Windows server on which to install the SQL Server 2016 Community Technology Preview, and make sure you have SQL Server R Services installed. You’ll also need a separate Windows machine (say, a desktop or laptop) where you’ll install Revolution R Open and Revolution R Enterprise. Most of the computations happen in SQL Server, though, so this “data science client machine” doesn’t need to be as powerful.
The tutorial is made up of five lessons, which together should take you about 90 minutes to run through. If you run into problems, each lesson includes troubleshooting tips at the end.
SQL Server R Services has the potential to be a great tool. The standard V1 warning obviously applies, but I’m excited.
When I needed to do an rmarkdown repository for making R Consortium Infrastructure Proposals, I took the opportunity to build on Jan’s code so that the ISC proposal is always web-facing. Here’s how I did it:
She’s using this to build the satRday planning site.
I have three blog posts on installing and using R in SQL Server.
First, installing SQL Server R Services:
I’m excited that CTP 3 of SQL Server 2016 is publicly available, in no small part because it is our first look at SQL Server R Services. In this post, I’m going to walk through installing Don’t-Call-It-SSRS on a machine.
Getting a Linux machine to talk to a SQL Server instance is harder than it should be. Yes, Microsoft has a Linux ODBC driver and some easy setup instructions…if you’re using Red Hat or SuSE. Hopefully this helps you get connected.
If you’re using RStudio on Windows, it’s a lot easier: create a DSN using your ODBC Data Sources.
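Either way, once the driver or DSN is in place, the connection from R is a few lines with RODBC; the DSN name, credentials, and query below are stand-ins:

```r
# Query SQL Server from R over ODBC; "MySqlServer", the login, and the table
# name are all hypothetical.
library(RODBC)
conn <- odbcConnect("MySqlServer", uid = "ruser", pwd = "hunter2")
orders <- sqlQuery(conn, "SELECT TOP (10) * FROM Sales.Orders;")
odbcClose(conn)
```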
Finally, using SQL Server R Services:
So, what’s the major use of SQL Server R Services? Early on, I see batch processing as the main driver here. The whole point of getting involved with Revolution R is to create server-quality R, so imagine a SQL Agent job which runs this procedure once a night against some raw data set. The R job could build a model, process that data, and return a result set. You take that result set and feed it into a table for reporting purposes. I’d like to see more uses, but this is probably the first one we’ll see in the wild.
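To make that concrete, here's a hedged sketch of the R body of such a job. In SQL Server R Services, a script like this gets wrapped in a stored procedure: SQL Server feeds the input query's rows in as InputDataSet and returns whatever lands in OutputDataSet as a result set. The column names are invented:

```r
# The R body of a nightly batch job under SQL Server R Services. SQL Server
# populates InputDataSet from the procedure's input query; whatever we assign
# to OutputDataSet comes back as a result set. 'units_sold' and 'week' are
# hypothetical columns.
model <- lm(units_sold ~ week, data = InputDataSet)
OutputDataSet <- data.frame(
  week      = InputDataSet$week,
  predicted = predict(model, InputDataSet)
)
```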
It’s a preview of a V1 product. Keep that in mind.
The first and third posts are for CTP 3, so beware the time-sensitive material warnings.