This minor update, codenamed “Bug in Your Hair”, makes a few small fixes to the R 3.3.0 release. Bugs fixed include mostly rarely-encountered cases like generating Gamma random numbers with zero or infinite rate parameters, and correctly matching text (with the match function) that only differed in the encoding.
There are no new features in this update, and all R code and packages should work with R 3.3.1 just as they did with R 3.3.0. For a complete list of the fixes in R 3.3.1, follow the link below.
Even though this is a small update, it might be useful to check out.
Say you’ve got 30 numbers and a strong urge to estimate their standard deviation. But you’ve left your computer at home. Unless you’re really good at mentally squaring and summing, it’s pretty hard to compute a standard deviation in your head. But there’s a heuristic you can use:
Subtract the smallest number from the largest number and divide by four
Let’s call it the “range over four” heuristic. You could, and probably should, be skeptical. You could want to see how accurate the heuristic is. And you could want to see how the heuristic’s accuracy depends on the distribution of numbers you are dealing with.
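One way to act on that skepticism is a small simulation. The sketch below is in Python using only the standard library (not the original author's code); the helper names, the sample size of 30, the number of trials, and the two distributions are all assumptions chosen for illustration. It compares the heuristic to the ordinary sample standard deviation by averaging their ratio over many random samples.

```python
import random
import statistics


def range_over_four(values):
    """Estimate the standard deviation as (largest - smallest) / 4."""
    return (max(values) - min(values)) / 4


def mean_ratio(draw, n=30, trials=2000, seed=1):
    """Average of (heuristic / sample standard deviation) over many samples.

    `draw` is a function that takes a random.Random instance and returns
    one number; n, trials and seed are illustrative defaults.
    """
    rng = random.Random(seed)
    ratios = []
    for _ in range(trials):
        sample = [draw(rng) for _ in range(n)]
        ratios.append(range_over_four(sample) / statistics.stdev(sample))
    return statistics.mean(ratios)


# A ratio near 1 means the heuristic tracks the standard deviation well.
normal_ratio = mean_ratio(lambda rng: rng.gauss(0, 1))
uniform_ratio = mean_ratio(lambda rng: rng.uniform(0, 1))
print(f"normal samples:  {normal_ratio:.2f}")
print(f"uniform samples: {uniform_ratio:.2f}")
```

Swapping in other distributions or sample sizes for `draw` and `n` shows how the accuracy depends on both.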
Sometimes you just don’t have STDEV() available.
We see a different behaviour: gather() has brought messy into a long data format with a warning by treating all columns as variables, while melt() has treated trt as an “id variable”. Id columns are the columns that contain the identifier of the observation that is represented as a row in our data set. Indeed, if melt() does not receive any id.variables specification, then it will use the factor or character columns as id variables.
gather() requires the columns that need to be treated as ids; all the other columns are going to be used as key-value pairs.
Despite those last different results, we have seen that the two functions can be used to perform exactly the same operations on data frames, and only on data frames! Indeed, gather() cannot handle matrices or arrays, while melt() can.
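To make the role of id columns concrete, here is a minimal wide-to-long sketch in plain Python. It is not the reshape2 or tidyr implementation: the `melt` helper is a hypothetical stand-in, the small `messy` table uses made-up values, and rows come out grouped by observation rather than by variable. It only illustrates that id columns are repeated on every output row while every other column becomes a key-value pair.

```python
def melt(rows, id_vars, var_name="variable", value_name="value"):
    """Reshape wide records (a list of dicts) into long key-value records.

    Columns named in id_vars are copied onto every output row; each of the
    remaining columns contributes one (variable, value) row per record.
    """
    long_rows = []
    for row in rows:
        ids = {key: row[key] for key in id_vars}
        for key, value in row.items():
            if key not in id_vars:
                long_rows.append({**ids, var_name: key, value_name: value})
    return long_rows


# Hypothetical wide table: the values are invented for illustration.
messy = [
    {"id": 1, "trt": "treatment", "work.T1": 0.85, "home.T1": 0.61},
    {"id": 2, "trt": "control", "work.T1": 0.23, "home.T1": 0.93},
]

# With id columns: two measured columns x two records = four long rows.
long_with_ids = melt(messy, id_vars=["id", "trt"])

# With no id columns: every column, including id and trt, becomes a variable.
long_no_ids = melt(messy, id_vars=[])

for record in long_with_ids:
    print(record)
```

The second call mirrors the "treating all columns as variables" case: nothing identifies the observation any more, so the long table loses track of which values belong together.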
It seems that these two tools have some overlap, but each has its own point of focus: tidyr is simpler for data tidying, whereas reshape2 has functionality (like data aggregation) which tidyr does not include.
I’m pleased to announce tidyr 0.5.0. tidyr makes it easy to “tidy” your data, storing it in a consistent form so that it’s easy to manipulate, visualise and model. Tidy data has a simple convention: put variables in the columns and observations in the rows. You can learn more about it in the tidy data vignette.
Check out the latest version of tidyr; it’s one of the most useful data manipulation packages on the R platform.
If you look at the code from the interactive window, you will notice that the error occurred when trying to run rxSummary. In both cases I didn’t get the error when I changed the compute context to SQL Server from local, but when I tried to run a function which runs on the server. In both cases the R tools were installed prior to installing SQL Server 2016. The Open Source R tools install to C:\Program Files\R\R-3.3.0 (your version number may be higher). Microsoft R Open installs to C:\Program Files\Microsoft\MRO\R-3.2.5. To use the RevoScaleR libraries included in R Server, the version of Microsoft R required is Microsoft RRE, which is installed in C:\Program Files\Microsoft\MRO-for-RRE\8.0. Unfortunately, SQL Server 2016 shipped with version 8.0.3, not 8.0.0. If you are getting data and using a local compute context, you will have no problems. However, when you want to change your compute context to run on SQL Server, you will get an error.
While I received a different error on the server than on my laptop, the reason for both messages was the same. Neither computer was running version 8.0.3 of the R client tools. On the server I was able to fix the error without downloading a thing. After installing a stand-alone version of R Server from the SQL Server Installation Center, the error went away and I got results when trying to run rxSummary. Unfortunately, it was not possible for me to run R Server on my laptop, as R Server is disabled from within the Installation Center. I believe that is because I have SQL Server 2016 developer edition on a laptop, not on a server. I needed to do something else to make it work.
Click the link for the full story.
Microsoft has not one version of R, but two. The two versions are needed because they serve two different purposes. Microsoft R Open is open source and fully R compatible, and is faster than open source R because they rewrote a number of the algorithms to include multi-threaded math libraries. If you want to run R code on SQL Server, this is not the version you want to use. You want to use the non-open source version designed to run on R Server, which is included with SQL Server 2016, Microsoft RRE Open. This version will run R code not only in memory but also swap to disk, can create code which accesses SQL Server data without needing to create a file, and can run code on the server from the client. The version of RRE Open which is included in SQL Server 2016 is 8.0.3.
She follows this up with a demo program to pull data from a SQL Server table and generate a histogram. If you have zero R experience, there’s no time like the present to get started.
If you already have R installed on the same system as PowerBI, you just need to paste the R scripts in the code pane. Otherwise you need to install R on the system where you are using the PowerBI desktop like this:
This step-by-step guide features a lot of images and should be pretty easy for a new user.
HIBPwned is a feature complete R package that allows you to use every (currently) available endpoint of the API. It’s vectorised so no need to loop through email addresses, and it requires no fiddling with authentication or keys.
You can use HIBPwned to do things like:
Set up your own notification system for account breaches of myriad email addresses & user names that you have
Check for compromised company email accounts from within your company Active Directory
Analyse past data breaches and produce charts like Dave McCandless’ Breach chart
The regular service is extremely useful and Steph’s wrapper looks like it’s worth checking out.
In previous videos you’ve learned that we can demonstrate R visualization in Power BI. In this video you will learn how R visualization works interactively with other elements in a Power BI report. In fact, Power BI treats R charts as a regular visualization, and highlighting and selecting items in other elements of the report will affect them. Here is a quick video about this functionality.
Check out the five-minute video.
Dimensionality reduction is a common technique to visualize observations in a dataset by combining all features into two, which can then be used to draw each observation in a scatter plot.
One popular algorithm that implements this technique is PCA (Principal Components Analysis), which is available in R through the prcomp() function.
The algorithm was applied to observations of the dataset, and ggplot2’s geom_point() function was used to draw the results in a 2D chart.
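For readers who want to see what "combining all features into two" involves, here is a rough standard-library Python sketch of the idea. It is not prcomp() and does not reproduce its method: it swaps in power iteration with deflation on the covariance matrix to find the two leading directions. All function names and the six three-feature observations are assumptions made up for illustration.

```python
import math


def center(rows):
    """Subtract each column's mean so every feature has mean zero."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[x - m for x, m in zip(row, means)] for row in rows]


def covariance(centered):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(centered), len(centered[0])
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def top_component(cov, iterations=500):
    """Leading eigenvector and eigenvalue of cov via power iteration.

    The start vector is arbitrary; a start exactly orthogonal to the
    leading direction would not converge to it (a known limitation).
    """
    d = len(cov)
    v = [float(i + 1) for i in range(d)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * cov[i][j] * v[j] for i in range(d) for j in range(d))
    return v, eigenvalue


def pca_2d(rows):
    """Project each observation onto the two leading principal directions."""
    centered = center(rows)
    cov = covariance(centered)
    d = len(cov)
    components = []
    for _ in range(2):
        v, lam = top_component(cov)
        components.append(v)
        # Deflation: remove the direction just found before searching again.
        cov = [[cov[i][j] - lam * v[i] * v[j] for j in range(d)] for i in range(d)]
    return [[sum(x * c for x, c in zip(row, comp)) for comp in components]
            for row in centered]


# Hypothetical observations with three features each.
data = [[2, 2, 0], [-2, -2, 0], [1, -1, 0], [-1, 1, 0], [0, 0, 0.5], [0, 0, -0.5]]
coords = pca_2d(data)
for x, y in coords:
    print(f"{x: .3f} {y: .3f}")
```

Each pair in `coords` is what would be passed to a scatter plot. The sign of each axis is arbitrary, so a plot may appear mirrored relative to another implementation.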
I would want to see this done for a couple hundred thousand domains, but I do like the idea of taking advantage of statistical modeling tools to find security threats.