RTVS 1.0

Kevin Feasel

2017-03-24

R

Shahrokh Mortazavi announces that R Tools for Visual Studio 1.0 is officially out:

RTVS builds on Visual Studio, which means you get numerous features for free: from using multiple languages to world-class Editing and Debugging to over 7,000 extensions for every need:

  • A polyglot IDE – VS supports R, Python, C++, C#, Node.js, SQL, etc. projects simultaneously.

  • Editor – complete editing experience for R scripts and functions, including detachable/tabbed windows, syntax highlighting, and much more.

  • IntelliSense – (aka auto-completion) available in both the editor and the Interactive R window.

  • R Interactive Window – work with the R console directly from within Visual Studio.

  • History window – view, search, select previous commands and send to the Interactive window.

  • Variable Explorer – drill into your R data structures and examine their values.

  • Plotting – see all of your R plots in a Visual Studio tool window.

  • Debugging – breakpoints, stepping, watch windows, call stacks and more.

  • R Markdown – R Markdown/knitr support with export to Word and HTML.

  • Git – source code control via Git and GitHub.

  • Extensions – over 7,000 Extensions covering a wide spectrum from Data to Languages to Productivity.

  • Help – use ? and ?? to view R documentation within Visual Studio.

I’ve been using it for a little while and it’s pretty snazzy for integrating with SQL Server R Services. RStudio is still more feature-rich, but RTVS is definitely catching up.

Datashader

John Mount is a bit jazzed when it comes to a new package:

I recently got back from Strata West 2017 (where I ran a very well received workshop on R and Spark). One thing that really stood out for me at the exhibition hall was Bokeh plus datashader from Continuum Analytics.

I had the privilege of having Peter Wang himself demonstrate datashader for me and answer a few of my questions.

I am so excited about datashader capabilities I literally will not wait for the functionality to be exposed in R through rbokeh. I am going to leave my usual knitr/rmarkdown world and dust off Jupyter Notebook just to use datashader plotting. This is worth trying, even for diehard R users.

For the moment, it looks like datashader is only available for Python, but it’s coming to R.

Visualizing Market Basket Analyses With Power BI

Leila Etaati explains how to use Power BI and a Force-Directed Graph custom visual to display results of a market basket analysis:

By clicking on “R transformation”, a new window will show up. This window is an R editor where you can paste your code. However, there are a couple of things that you should consider.

1. There is error message handling, but it is always recommended to run your code in RStudio first and be sure it works (in our example we already tested it in Part 1).

2. All the data is held in the variable “dataset”.

3. You do not need to write “install.packages” to get packages here, but you should first install the required packages using your R editor and here just call “library(package name)”.

Leila takes this step-by-step, leading to a Power BI visual with drill-down.
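As a rough illustration of the three points above, here is a minimal sketch of a script that could be pasted into the R transformation window. The column names `TransactionId` and `Item` are hypothetical and not taken from Leila's post, and the stand-in `dataset` is there only so the script also runs outside Power BI, where the variable is not supplied.

```r
# Power BI supplies the incoming table as `dataset`. This stand-in (with
# hypothetical columns) exists only so the sketch runs outside Power BI.
if (!exists("dataset")) {
  dataset <- data.frame(
    TransactionId = c(1, 1, 2, 2, 3),
    Item = c("bread", "milk", "bread", "eggs", "milk"),
    stringsAsFactors = FALSE
  )
}

# Packages are only loaded here, never installed; install them beforehand.
library(stats)

# Drop repeated items within a transaction, then count the number of
# transactions each item appears in.
baskets <- unique(dataset[, c("TransactionId", "Item")])
item_counts <- as.data.frame(table(Item = baskets$Item), stringsAsFactors = FALSE)
names(item_counts) <- c("Item", "Transactions")
```

Testing a script like this in RStudio first, as the excerpt recommends, is easier when it can fall back to sample data.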

R On Athena

Kevin Feasel

2017-03-21

Cloud, Hadoop, R

Gopal Wunnava shows how to run R scripts using Amazon Athena as a data source:

Next, you’ll practice interactively querying Athena from R for analytics and visualization. For this purpose, you’ll use GDELT, a publicly available dataset hosted on S3.

Create a table in Athena from R using the GDELT dataset. This step can also be performed from the AWS management console as illustrated in the blog post “Amazon Athena – Interactive SQL Queries for Data in Amazon S3.”

This is an interesting use case for Athena.

RevoScaleR With Power BI

Kevin Feasel

2017-03-21

Power BI, R

Tomaz Kastrun looks at several methods for using RevoScaleR packages in a Power BI dashboard:

I was invited to deliver a session for the Belgium User Group on SQL Server and R integration. After the session – which we did online using web-based Citrix – I got an interesting question: “Is it possible to use RevoScaleR performance computational functions within Power BI?” My first answer was a sceptical yes. But I said that I haven’t used it in this manner yet and that there might be some limitations.

The idea of having the scalable environment and the parallel computational package with all the predictive analytical functions in Power BI is absolutely great. But something tells me that it will not be that straightforward.

Read on for the rest of the story.

Graphing R Package Dependencies

Kevin Feasel

2017-03-17

R

Tomaz Kastrun uses the igraph package to graph package dependencies in R:

With importing package tools, we get many useful functions to find additional information on packages.

Function package.dependencies() parses and checks the dependencies of a package in the current environment. Function package_dependencies() (with underscore and not dot) will find all dependent and reverse dependent packages.

This probably tilts more toward “fun” than “practical,” but this will let you see the full set of dependencies for a package if, for example, you need to grab all of these packages for upgrading an offline instance.
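For the offline-upgrade scenario, a minimal sketch along these lines might help. It is not Tomaz's code: it uses the locally installed packages as the dependency database (so no CRAN access is needed), picks the base `stats` package purely as an example, and builds a plain edge list rather than an igraph object.

```r
library(tools)

# Use the locally installed packages as the dependency database.
db <- installed.packages()

pkg <- "stats"  # example package; substitute the one you care about

# Every package that `pkg` needs, directly or indirectly.
deps <- package_dependencies(pkg, db = db,
                             which = c("Depends", "Imports"),
                             recursive = TRUE)[[pkg]]

# Direct dependencies of each package in that set, as from/to pairs.
direct <- package_dependencies(c(pkg, deps), db = db,
                               which = c("Depends", "Imports"))
edges <- do.call(rbind, lapply(names(direct), function(p) {
  d <- direct[[p]]
  if (length(d) == 0) return(NULL)
  data.frame(from = p, to = d, stringsAsFactors = FALSE)
}))
```

If igraph is installed, an edge list of this shape can be passed to `igraph::graph_from_data_frame()` for plotting.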

Understanding Neural Nets

David Smith links to a video which explains how neural networks do their thing:

In R, you can train a simple neural network with just a single hidden layer with the nnet package, which comes pre-installed with every R distribution. It’s a great place to start if you’re new to neural networks, but the deep learning applications call for more complex neural networks. R has several packages to check out here, including MXNet, darch, deepnet, and h2o: see this post for a comparison. The tensorflow package can also be used to implement various kinds of neural networks.

R makes it pretty easy to run one, though it then becomes important to understand regularization as a part of model tuning.
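As a minimal sketch of how little code that takes, the following fits a single-hidden-layer network on the built-in iris data. The hidden layer size and the `decay` value (nnet's weight-decay regularization) are arbitrary choices for illustration, not tuned recommendations.

```r
library(nnet)

set.seed(42)  # initial weights are random, so fix the seed

# One hidden layer with 4 units; `decay` applies weight-decay regularization.
fit <- nnet(Species ~ ., data = iris,
            size = 4, decay = 0.01, maxit = 200, trace = FALSE)

# Predicted class labels and training-set accuracy.
predicted <- predict(fit, iris, type = "class")
accuracy <- mean(predicted == iris$Species)
```

Accuracy measured on the training data is optimistic; comparing a few `decay` values on held-out data is one way to approach the tuning mentioned above.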

Analyzing The Ramones

Salvino Salvaggio uses R to analyze The Ramones:

Musical purists always reproached the Ramones for knowing a couple of chords only and making an excessive use of them. Data show that the band knew at least… 11 different chords (out of too-many-to-bother-counting possibilities) although 80% of their songs were built on no more than 6. And there is no evidence of a sophistication of the Ramones’ compositions over time.

It’s a fun analysis with all the R code attached, and it also covers n-gram analysis, sentiment analysis, and token distribution analysis.

dplyr Tricks

Kevin Feasel

2017-03-09

R

Bruno Rodrigues shares a few lesser-known tricks in R’s dplyr package:

Removing unneeded columns

Did you know that you can use - in front of a column name to remove it from a data frame?

There are a few good tricks in here.  H/T R Bloggers.
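The column-removal trick looks like this in practice. This is a minimal sketch on the built-in mtcars data, assuming dplyr is installed; the columns dropped are arbitrary.

```r
library(dplyr)

# A minus sign in front of a column name drops that column.
trimmed <- mtcars %>% select(-cyl, -gear)

names(trimmed)
```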

Air Travel Route Maps With ggplot2

Peter Prevos wants to create a pretty map of flights he’s taken:

The first step was to create a list of all the places I have flown between at least once. Paging through my travel photos and diaries, I managed to create a pretty complete list. The structure of this document is simply a list of all routes (From, To) and every flight only gets counted once. The next step finds the spatial coordinates for each airport by searching Google Maps using the geocode function from the ggmap package. In some instances, I had to add the country name to avoid confusion between places.

The end result is imperfect (as Peter mentions, ggmap isn’t wrapping around), but does fit the bill for being eye-catching.
