
Category: R

Handwriting Character Recognition

Tomaz Kastrun compares a few different libraries in terms of handwritten numeric character recognition:

Recently, I did a session at a local user group in Ljubljana, Slovenija, where I introduced the new algorithms that are available with the MicrosoftML package for Microsoft R Server 9.0.3.

For the datasets, I used two from (still currently) running sessions on Kaggle. In the last part, I did image detection and prediction on the MNIST dataset and compared the performance and accuracy of the algorithms.

The MNIST handwritten digit database is available here.

Tomaz has all of the code available as well.

Comments closed

Analyzing Flight Data With Sparklyr

Aki Ariga continues his sparklyr series with some analysis of US flight data:

In this post, we will show you a visualization and build a predictive model of US flights with sparklyr. Flight visualization code is based on this article.

This post assumes you already have the following tables:

You should make these tables available through Apache Hive or Apache Impala (incubating) with Hue.

There’s some setup work to get this going, but getting a handle on sparklyr looks to be a good idea if you’re in the analytics space.


RTVS RC1

David Smith alerts us that R Tools for Visual Studio Release Candidate 1 is available:

We’ll cover the features in detail with the general availability release of RTVS 1.0, but in summary the new features include:

  • Remote Execution: type R code in your local RTVS instance, but have the computations performed on a remote R server. You can also switch between local and remote workspaces at will.

  • SQL Server Integration: work with database connections and SQL queries, and create stored procedures with embedded R code.

  • Enhanced R Graphics Support: multiple floating and dockable plot windows, each with plot history.

I’ve been using RTVS more frequently lately and it’s definitely growing on me.


Python + knitr

Steph Locke shows how to integrate Python code into knitr:

One of the nifty things about using R is that you can use it for many different purposes and even other languages!

If you want to use Python in your knitr docs or the newish RStudio R notebook functionality, you might encounter some fiddliness getting all the moving parts running on Windows. This is a quick knitr Python Windows setup checklist to make sure you don’t miss any important steps.
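One item any such checklist is likely to include is confirming that R can actually see a Python executable. The snippet below is a minimal sketch of that single check using base R only; it is not taken from Steph's post, and the executable name `python` is an assumption (your installation may expose `python3` or `py` instead).

```r
# Look up the Python executable on the PATH as seen by the R session.
# Sys.which() returns an empty string when the program cannot be found.
# Assumption: the executable is called "python" on this machine.
python_path <- Sys.which("python")

if (nzchar(python_path)) {
  message("Python found at: ", python_path)
} else {
  message("Python was not found on the PATH visible to R.")
}
```

If the lookup comes back empty, fixing the PATH (or pointing knitr at the interpreter explicitly) is probably the first thing to sort out before debugging anything else.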

Between knitr, Zeppelin, and Jupyter, you should be able to find a cross-compatible notebook which works for you.


Global Maps In R

The folks at Sharp Sight Labs show how to create high-quality map visuals in R:

Maps are great for practicing data visualization. First of all, there’s a lot of data available on places like Wikipedia that you can map.

Moreover, creating maps typically requires several essential skills in combination. Specifically, you commonly need to be able to retrieve the data (e.g., scrape it), mold it into shape, perform a join, and visualize it. Because creating maps requires several skills from data manipulation and data visualization, creating them will be great practice for you.

And if that’s not enough, a good map just looks great. They’re visually compelling.

With that in mind, I want to walk you through the logic of building one step by step.

Read on for the step-by-step process.


Gloom Indexes

David Smith points out an interesting use of R:

Radiohead is known for having some fairly maudlin songs, but of all of their tracks, which is the most depressing? Data scientist and R enthusiast Charlie Thompson ranked all of their tracks according to a “gloom index”, and created the following chart of gloominess for each of the band’s nine studio albums. (Click for the interactive version, created with the highcharter package for R, which allows you to explore individual tracks.)

Do click through for Charlie’s explanation, including where the numbers come from.


Market Basket Analysis Basics

Leila Etaati has an introduction to market basket analysis with R:

For instance, imagine we have the transaction items below from a store over the last few hours:

Customer 1: Salt, pepper, Blue cheese

Customer 2: Blue Cheese, Pasta, Pepper, tomato sauce

Customer 3: Salt, Blue Cheese, Pepperoni, Bacon, egg

Customer 4: water, Pepper, Egg, Salt

We want to know how many times customers purchased pepper and salt together. The support is calculated as follows: out of the four transactions (4 customers), 2 included both salt and pepper, so the support is 2 divided by 4 (the total number of transactions).
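The support arithmetic above can be reproduced in a few lines of base R. This is a minimal sketch rather than Leila's code: the `support()` helper is a hypothetical name introduced here for illustration, and item names are lower-cased so that "Pepper" and "pepper" count as the same item.

```r
# The four transactions from the example, one character vector per customer.
transactions <- list(
  c("Salt", "pepper", "Blue cheese"),
  c("Blue Cheese", "Pasta", "Pepper", "tomato sauce"),
  c("Salt", "Blue Cheese", "Pepperoni", "Bacon", "egg"),
  c("water", "Pepper", "Egg", "Salt")
)

# Hypothetical helper: the fraction of transactions that contain
# every item in `items`. Matching is exact after lower-casing,
# so "pepperoni" does not count as "pepper".
support <- function(transactions, items) {
  items <- tolower(items)
  hits <- vapply(
    transactions,
    function(basket) all(items %in% tolower(basket)),
    logical(1)
  )
  mean(hits)
}

salt_pepper_support <- support(transactions, c("salt", "pepper"))
print(salt_pepper_support)  # 2 of 4 baskets, so 0.5
```

Only customers 1 and 4 bought both items, which is where the 2 in the numerator comes from. For real workloads a dedicated package would be a better fit than a hand-rolled helper like this one.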

Basket analysis is one way of building a recommendation engine:  if you’re buying butter, cream, and eggs, do you also want to buy sugar?


R 3.4.0 Performance Improvements

David Smith discusses performance improvements upcoming in R 3.4.0:

A “just-in-time” (JIT) compiler will be included. While the core R packages have been byte-compiled since 2011, and package authors also have the option of byte-compiling the R code they contain, it was tricky for ordinary R users to gain the benefits of byte-compilation for their own code. In 3.4.0, loops in your R scripts and functions you write will be byte-compiled as you use them (“just-in-time”), so you can get improved performance for your R code without taking any additional actions.
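For a sense of what byte-compilation involves, the sketch below uses the `compiler` package that ships with R to compile a loop-heavy function by hand, which is roughly the step the JIT is described as taking care of automatically. The function `sum_of_squares()` is a made-up example, and no timings are claimed here; any speed-up depends on the code and the R version.

```r
library(compiler)

# A deliberately loop-heavy function, made up for illustration.
sum_of_squares <- function(n) {
  total <- 0
  for (i in seq_len(n)) {
    total <- total + i^2
  }
  total
}

# Explicitly byte-compile the function.
sum_of_squares_compiled <- cmpfun(sum_of_squares)

# Compilation changes how the code runs, not what it returns.
stopifnot(identical(sum_of_squares(1000), sum_of_squares_compiled(1000)))

# enableJIT() sets the JIT level and returns the previous one;
# level 3 asks for closures and loops to be compiled before use.
previous_level <- enableJIT(3)
enableJIT(previous_level)  # restore whatever was set before
```

On versions where the JIT is on by default, the explicit `cmpfun()` call should be unnecessary for ordinary user code, which is the convenience the excerpt describes.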

Stay tuned for the release.


Twitter Sentiment Analysis Using doc2vec

Sergey Bryl uses word2vec and doc2vec to perform Twitter sentiment analysis in R:

But doc2vec is a deep learning algorithm that draws context from phrases. It’s currently one of the best approaches to sentiment classification for movie reviews. You can use the following method to analyze feedback, reviews, comments, and so on. And you can expect better results than with tweet analysis, because tweets usually include lots of misspellings.

We’ll use tweets for this example because it’s pretty easy to get them via the Twitter API. We only need to create an app on https://dev.twitter.com (My apps menu) and find an API Key, API Secret, Access Token and Access Token Secret on the Keys and Access Tokens menu tab.

Click through for more details, including code samples.
