# Taking Advantage Of Vectorization In R

2018-10-29

R is an interpreted programming language with vectorized data structures. This means a single R command can ask for very many arithmetic operations to be performed. This also means R computation can be fast. We will show an example of this using Conway’s Game of Life.

Conway’s Game of Life is one of the most interesting examples of cellular automata. It is traditionally simulated on a rectangular grid (like a chessboard) and each cell is considered either live or dead. The rules of evolution are simple. The next life grid is computed as follows:

• To compute the state of a cell on the next grid sum the number of live cells in the eight neighboring cells on the current grid.

• If this sum is 3 or if the current cell is live and the sum is 2 or 3, then the cell in the next grid will be live.
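As a rough illustration of how these two rules map onto whole-matrix operations, here is a minimal base R sketch. It is not the code from the linked post: the function name `life_step` is made up for this example, and it assumes that cells beyond the edge of the grid count as dead.

```r
# One generation of Conway's Game of Life on a logical matrix.
# Assumption: anything outside the grid is treated as dead.
life_step <- function(grid) {
  nr <- nrow(grid)
  nc <- ncol(grid)
  rows <- 2:(nr + 1L)
  cols <- 2:(nc + 1L)

  # Pad with a border of dead cells so every cell has eight neighbors.
  padded <- matrix(0L, nrow = nr + 2L, ncol = nc + 2L)
  padded[rows, cols] <- grid

  # Add up the eight shifted copies of the grid, one whole matrix at a time.
  neighbors <- matrix(0L, nrow = nr, ncol = nc)
  for (dr in -1:1) {
    for (dc in -1:1) {
      if (dr != 0L || dc != 0L) {
        neighbors <- neighbors + padded[rows + dr, cols + dc, drop = FALSE]
      }
    }
  }

  # Live next turn: exactly 3 neighbors, or currently live with exactly 2.
  (neighbors == 3L) | (grid & (neighbors == 2L))
}

# Usage: a horizontal "blinker" flips to vertical and back again.
blinker <- matrix(FALSE, nrow = 5, ncol = 5)
blinker[3, 2:4] <- TRUE
print(life_step(blinker) * 1L)
```

The only loop runs over the eight neighbor offsets. Every addition and comparison inside it is applied to the full grid in a single R command, which is where the speed comes from.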

Not only is the R code faster, but it’s also terser.

# Visualizing A Correlation Matrix With corrplot

2018-10-24

First we need to load the packages into the R session. For descriptive statistics of the dataset we use the `skimr` package, and for visualization of the correlation matrix we use the `corrplot` package. We will work with the wind speed dataset from the `bReeze` package:

```
# Read packages into R library
library(bReeze)
library(corrplot)
library(skimr)
```
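To give a feel for the plotting step without the wind data, here is a small sketch that swaps in the built-in `mtcars` dataset (an assumption made for this example, not what the linked post uses). The `corrplot()` call is guarded so the snippet still runs when the package is not installed.

```r
# Correlation matrix of a built-in dataset (stand-in for the wind speed data).
cor_mat <- cor(mtcars)

# Draw the matrix only if corrplot is available.
if (requireNamespace("corrplot", quietly = TRUE)) {
  corrplot::corrplot(cor_mat, method = "circle", type = "upper")
}
```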

Click through for the demo.

# Getting The Right R Version For Packages

2018-10-24

In R, there is a handy function called `available.packages()` that returns a matrix of details corresponding to packages currently available at one or more repositories. Unfortunately, the format isn’t initially amenable to manipulation. For example, consider the `readr` package:

``readr_desc = available.packages() %>% as_tibble() %>% filter(Package == "readr")``

I immediately converted the data to a tibble, as that

• changed the rownames to a proper column

• changed the matrix to a data frame/tibble, which made selecting easier
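Here is a sketch of the same idea in base R that runs without a network connection. The matrix below is a hand-made stand-in for the output of `available.packages()`, and the version strings in it are illustrative values rather than real CRAN metadata.

```r
# Hypothetical stand-in for available.packages(): a character matrix with
# package names as rownames. Values are made up for illustration.
pkgs <- matrix(
  c("readr", "1.1.1", "R (>= 3.0.2)",
    "dplyr", "0.7.7", "R (>= 3.1.2)"),
  ncol = 3, byrow = TRUE,
  dimnames = list(c("readr", "dplyr"), c("Package", "Version", "Depends"))
)

# A data frame makes row filtering and column selection straightforward.
pkgs_df <- as.data.frame(pkgs, stringsAsFactors = FALSE)
readr_desc <- pkgs_df[pkgs_df$Package == "readr", ]

# Pull the minimum R version out of the Depends field.
min_r <- sub(".*R \\(>= ([0-9.]+)\\).*", "\\1", readr_desc$Depends)

# Compare it with the running R version.
r_is_new_enough <- package_version(min_r) <= getRversion()
print(min_r)
print(r_is_new_enough)
```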

There’s a good use of R functionality to delve into package requirements, as well as a script to try it out yourself.

# Packages For Testing R Packages

2018-10-23

There’s an R package called `RUnit` for unit testing, but in the whole post we’ll mention resources around the `testthat` package since it’s the one we use in our packages, and arguably the most popular one. `testthat` is great! Don’t hesitate to read its docs again if you started using it a while ago, since the latest major release added the `setup()` and `teardown()` functions to run code before and after all tests, which is very handy.

To set up testing in an existing package, i.e. creating the test folder and adding `testthat` as a dependency, run `usethis::use_testthat()`. In our WIP `pRojects` package, we set up the tests directory for you so you don’t forget. Then, in any case, add new tests for a function using `usethis::use_test()`.
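For a sense of what goes into such a test file, here is a minimal sketch. The function `celsius_to_fahrenheit()` and the file name in the comment are hypothetical. In a real package the function would live under `R/` and the `test_that()` block would go in a file under `tests/testthat/`. The guard is only there so the snippet runs even where `testthat` is not installed.

```r
# Hypothetical function under test; in a package this would live in R/.
celsius_to_fahrenheit <- function(celsius) {
  celsius * 9 / 5 + 32
}

# What a file such as tests/testthat/test-temperature.R might contain.
if (requireNamespace("testthat", quietly = TRUE)) {
  testthat::test_that("celsius_to_fahrenheit converts known values", {
    testthat::expect_equal(celsius_to_fahrenheit(0), 32)
    testthat::expect_equal(celsius_to_fahrenheit(100), 212)
    testthat::expect_equal(celsius_to_fahrenheit(c(-40, 37)), c(-40, 98.6))
  })
}
```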

The `testthis` package might help make your testing workflow even smoother. In particular, `test_this()` “reloads the package and runs tests associated with the currently open R script file.”, and there’s also a function for opening the test file associated with the current R script.

This is an area where I know I need to get better, and Maelle gives us a plethora of tooling for tests.

# Reshaping Data Frames With tidyr

2018-10-23

As shown above, the variable `agegp` has 6 groups (e.g., 25-34, 35-44), which have different alcohol intake and smoking use combinations. I think it would be interesting to transform this dataset from long to wide, creating a column for each age group that shows the respective cases. Let’s see what the dataset will look like.

``dt %>% spread(agegp, ncases) %>% slice(1:5)``
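To see what `spread()` does on something small, here is a sketch with a toy data frame. The column names mirror the ones above, but the rows are invented for illustration and do not come from the post’s dataset. The call is guarded so the snippet runs even without `tidyr` installed.

```r
# Toy long-format data: one row per (alcohol group, age group) pair.
# Values are invented for illustration only.
long <- data.frame(
  alcgp  = c("0-39", "0-39", "40-79", "40-79"),
  agegp  = c("25-34", "35-44", "25-34", "35-44"),
  ncases = c(0L, 1L, 2L, 4L),
  stringsAsFactors = FALSE
)

# Long to wide: each distinct agegp value becomes its own column of ncases.
if (requireNamespace("tidyr", quietly = TRUE)) {
  wide <- tidyr::spread(long, key = agegp, value = ncases)
  print(wide)
}
```

Any column that is neither the key nor the value (here `alcgp`) stays as an identifier, so the result has one row per alcohol group.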

Click through for a few additional transformations.

# Using cdata To Create Faceted Plots

2018-10-22

First, load the packages and data:

```
library("ggplot2")
library("cdata")
iris <- data.frame(iris)
```

Now define the data-shaping transform, or control table. The control table is basically a picture that sketches out the final data shape that I want. I want to specify the `x` and `y` columns of the plot (call these the value columns of the data frame) and the column that I am faceting by (call this the key column of the data frame). And I also need to specify how the key and value columns relate to the existing columns of the original data frame.
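As a sketch of what such a control table can look like for `iris`, here is one built as a plain data frame. The names `flower_part`, `Length`, and `Width` are assumptions chosen for this example and may differ from the linked post. The first column is the key (the facet), and the cells of the remaining value columns name columns of the original data. The `cdata` call is guarded so the snippet runs without the package.

```r
# Control table: key column first, then value columns whose cells
# name the source columns in iris. Column names here are assumptions.
control_table <- data.frame(
  flower_part = c("Petal", "Sepal"),
  Length      = c("Petal.Length", "Sepal.Length"),
  Width       = c("Petal.Width", "Sepal.Width"),
  stringsAsFactors = FALSE
)
print(control_table)

# Reshape iris so each flower yields one row per flower part.
if (requireNamespace("cdata", quietly = TRUE)) {
  iris_long <- cdata::rowrecs_to_blocks(
    iris,
    control_table,
    columnsToCopy = "Species"
  )
  print(head(iris_long))
}
```

With the data in this shape, `Length` and `Width` can serve as the `x` and `y` aesthetics and `flower_part` as the faceting variable.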

Read on to see how you can use `cdata` to tie together different faceted plots.

# Using wrapr For A Consistent Pipe With ggplot2

2018-10-16

Now we can run a single pipeline that combines data processing steps and `ggplot` plot construction.

``data.frame(x = 1:20) %.>% mutate(., y = cos(3*x)) %.>% ggplot(., aes(x = x, y = y)) %.>% geom_point() %.>% geom_line() %.>% ggtitle("piped ggplot2")``

Check it out.

# Using R To Hit Azure ML From Power BI

2018-10-16

You need to create a model in Azure ML Studio and create a web service for it.

The traditional example is predicting whether a passenger on the Titanic is going to survive or not.

We have a dataset about passengers, with fields such as their age, gender, and passenger class, and we are going to predict whether each passenger survives.

Navigate to Azure ML Studio, open it, and follow the steps to create a model for this prediction.

Click through for the step-by-step instructions.

# Connecting To Elasticsearch With R

2018-10-15

You will need the following information to connect to Elasticsearch as a JDBC data source:

• Driver Class: Set this to `cdata.jdbc.elasticsearch.ElasticsearchDriver`.
• Classpath: Set this to the location of the driver JAR. By default, this is the lib subfolder of the installation folder.

The DBI functions, such as `dbConnect` and `dbSendQuery`, provide a unified interface for writing data access code in R. Use the following line to initialize a DBI driver that can make JDBC requests to the CData JDBC Driver for Elasticsearch:
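The exact call is in the linked post. As a rough sketch of the general shape, assuming the `RJDBC` package is used to bridge DBI and JDBC, the initialization might look like the helper below. The function name `connect_elasticsearch` and both of its parameters are hypothetical. The JAR path and the JDBC URL are left as arguments because they depend on your installation and server.

```r
# Hypothetical helper: builds a JDBC driver object and opens a DBI connection.
# Assumes RJDBC, DBI, and a Java runtime are installed.
connect_elasticsearch <- function(jar_path, jdbc_url) {
  stopifnot(file.exists(jar_path))

  # Driver class as given above; classPath points at the driver JAR.
  driver <- RJDBC::JDBC(
    driverClass = "cdata.jdbc.elasticsearch.ElasticsearchDriver",
    classPath   = jar_path
  )

  # jdbc_url is the connection string for your Elasticsearch instance.
  DBI::dbConnect(driver, jdbc_url)
}
```

The returned connection can then be passed to DBI functions such as `dbSendQuery`.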

Read on for the full instructions.

# Voice Control For Shiny Apps

2018-10-12

I have found that performance across all devices and browsers is definitely not equal. By far the best browser I have found for viewing the apps is Google Chrome. I have also tended to find that my Ubuntu machines don’t do as well as Microsoft machines in picking up words correctly. A chat I had with someone recently suggested this might be down to drivers under Ubuntu for the microphones but that is not my area of expertise. Voice recognition was also fine on both of my Blackberry phones (one running BB OS 10, the other running Android 7).

It is worth noting that this does require an internet connection to function: in Chrome, the voice-to-text conversion is performed in the cloud.

The other thing I have noticed is that annyang seems relatively sensitive to background noise. This isn’t so bad for functions called using specific phrases but does sometimes have a large effect on the multi-word splats. This is because the splats are greedy and the background noise makes the recognition engine think that you are still talking long after you have finished, which gives the appearance of the application hanging.

The solution is by no means perfect, but it does look quite interesting.
