Press "Enter" to skip to content

Category: R

Using R To Hit Azure ML From Power BI

Leila Etaati shows how you can use R to hit an Azure ML endpoint to populate a data set in Power BI:

You need to create a model in Azure ML Studio and create a web service for it.

The traditional example in Predict a passenger on Titanic ship is going to survived or not?

we have a dataset about passengers like their age, gender, and passenger class, then we are going to predict whether they are going to survive or not

Open Azure ML Studio and follow the steps to create a model for predicting this. Navigate to Azure ML Studio.

Then download the dataset for titanic from here

Click through for the step-by-step instructions.

Comments closed

Connecting To Elasticsearch With R

Jerod Johnson has a sample of connecting to Elasticsearch with R:

You will need the following information to connect to Elasticsearch as a JDBC data source:

  • Driver Class: Set this to cdata.jdbc.elasticsearch.ElasticsearchDriver.
  • Classpath: Set this to the location of the driver JAR. By default, this is the lib subfolder of the installation folder.

The DBI functions, such as dbConnect anddbSendQuery , provide a unified interface for writing data access code in R. Use the following line to initialize a DBI driver that can make JDBC requests to the CData JDBC Driver for Elasticsearch:

Read on for the full instructions.

Comments closed

Voice Control For Shiny Apps

Over at Jumping Rivers, an example of using a Javascript library to control a page using voice commands:

I have found that performance across all devices and browsers is definitely not equal. By far the best browser I have found for viewing the apps is Google Chrome. I have also tended to find that my Ubuntu machines don’t do as well as Microsoft machines in picking up words correctly. A chat I had with someone recently suggested this might be down to drivers under Ubuntu for the microphones but that is not my area of expertise. Voice recognition was also fine on both of my Blackberry phones (one running BB OS 10, the other running Android 7).

It is worth noting that this does require an internet connection to function, in Chrome the voice to text is performed in the cloud.

The other thing I have noticed is that annyang seems relatively sensitive to background noise. This isn’t so bad for functions called using specific phrases but does sometimes have a large effect on the multi-word splats. This is because the splats are greedy and the background noise makes the recognition engine think that you are still talking long after you finished which gives the appearance of the application hanging.

The solution is by no means perfect, but it does look quite interesting.

Comments closed

Visualizing In R: 3 Packages

Kristian Larsen has a quick demo of three R visualization packages, ggplot2, dygraphs, and plotly:

Another value generating visualisation package in R is dygraphs. This package focuses on creating interactive visualisations with elegant interactive coding modules. Furthermore, the package specialises in creating visualisations for machine learning methods. The below coding generates different visualisation graphs with dygraphs:

Three useful libraries to learn.  Two more which might be useful are ggvis and rbokeh.

Comments closed

Modifying A ggplot2 Theme

Sebastian Sauer gives us an example of modifying a standard ggplot2 theme:

ggplot2 is customizeable. Frankly, one can change a heap of details – not everything probably, but a lot. Of course, one can add a theme to the ggplot call, in order to change the theme. However, a more catch-it-all approach would be to change the standard theme of ggplot itself. In this post, we’ll investigate this option.

To date, I’ve only used themes others have created, but if you need to customize a theme, there’s a lot you can do here.

Comments closed

Multi-Threaded R With Microsoft R Client

David Parr shows us how to get started with Microsoft R Client and performs some quick benchmarking:

This message will pop up, and it’s worth noting as it’s got some information in it that you might need to think about:

  • It’s worth noting that right now Microsoft r Client is lagging behind the current R version, and is based on version 3.4 of R, not 3.5. This will mean your default package libraries will not be shared between the installations if you are running R 3.5.

  • It’s using a snapshot of CRAN called MRAN to source packages by default. 90% of the time it will operate just as you expect, but because it takes a ‘snapshot’ of packages, newer features and changes that have hit CRAN may not be in the version of the package you are grabbing.

    • RevoScaleR and probably the ggplot2 and dplyr packages will likely be installed for you already as default in Microsoft R Client. The other two you will probably have to install yourself.
  • Intel MKL will have scanned your system on install and attempted to work out how many cores your processor has. Here it’s identified 2 on my old Lenovo Yoga. This is where the speed boost will come from.

I had an old two-core Lenovo Yoga too, so this article really spoke to me.

Comments closed

Image Clustering With Keras And R

Shirin Glander shows us how to use R to extract learned features from Keras and cluster those features:

For each of these images, I am running the predict() function of Keras with the VGG16 model. Because I excluded the last layers of the model, this function will not actually return any class predictions as it would normally do; instead we will get the output of the last layer: block5_pool (MaxPooling2D).

These, we can use as learned features (or abstractions) of the images. Running this part of the code takes several minutes, so I save the output to a RData file (because I samples randomly, the classes you see below might not be the same as in the sample_fruits list above).

Read the whole thing.

Comments closed

Using plm To Analyze Panel Data

Michael Grogan shows us how to use the plm package to perform linear regression against panel data:

Types of data

  • Cross-Sectional: Data collected at one particular point in time
  • Time Series: Data collected across several time periods
  • Panel Data: A mixture of both cross-sectional and time series data, i.e. collected at a particular point in time and across several time periods
  • Fixed Effects: Effects that are independent of random disturbances, e.g. observations independent of time.
  • Random Effects: Effects that include random disturbances.

Let us see how we can use the plm library in R to account for fixed and random effects. There is a video tutorial link at the end of the post.

Read on for an example.

Comments closed

Analyzing Update Dates For R Packages

Tomaz Kastrun takes a look at CRAN package update dates:

So more updates are coming in autumn times. But the results of correlation:

cor(dd_ym2010)[2,3]

is still just 0.155, making it hard to draw any concrete conclusions. Adding year 2018 will skew the picture and add several outliers, as the fact that year 2018 is still a running year (as of writing this blog post).

Read on for a descriptive analysis of this data set.

Comments closed