A stochastic problem, with application to financial theory. Some say it goes back to Warren Buffett. I relied to my colleague Norman Markgraf, who pointed it out to me.
Assume there are two coins. One is fair, one is loaded. The loaded coin has a bias of 60-40. Now, the question is: How many coin flips do you need to be “sure enough” (say, 95%) that you found the loaded coin?
Let’s simulate la chose.
It took a few more flips than I had expected but the number is not outlandish.
In standard (or value oriented evaluation) code you type in is taken to be variable names, functions, names, operators, and even numeric literal values. String values or literals need extra marks, such as quotes.
John walks us through several examples along the way. At the end, John is a major proponent of Standard Evaluation over Non-Standard Evaluation.
Back in August of 2017, we wrote two posts #9: Compating your Share Libraries and #10: Compacting your Shared Libraries, After The Buildabout “stripping” shared libraries. This involves removing auxiliary information (such as debug symbols and more) from the shared libraries which can greatly reduce the installed size (on suitable platforms – it mostly matters where I work, i.e. on Linux).
There’s a pretty good space savings in the
tidyverse package. H/T R-Bloggers.
Naturally, I would expect my model to be unbiased, at least in intention, and hence any leftovers on either side of the regression line that did not make it on the line are expected to be random, i.e. without any particular pattern.
That is, I expect my residual error distributions to follow a bland, normal distribution.
In R, you can do this elegantly with just two lines of code.
1. Plot a histogram of residuals
2. Add a Quantile-Quantile plot with a line that passes through, namely, the first and third quantiles.
There are several more techniques in here to analyze residuals, so check it out.
As someone very interested in storytelling, ggplot2 is easily my data visualization tool of choice. It is like the Swiss army knife for data visualization. One of my favorite features is the ability to pack a graph chock-full of dimensions. This ability is incredibly handy during the data exploration phases. However, sometimes I find myself wanting to look at trends without all the noise. Specifically, I often want to look at very dense scatterplots for outliers. Ggplot2 is great at this, but when we’ve isolated the points we want to understand, we can’t easily examine all possible dimensions right in the static charts.
Enter plotly. The plotly package and ggploty function do an excellent job at taking our high quality ggplot2 graphs and making them interactive.
Read on for several quality, interactive visuals.
If you want to work in the above way we suggest giving our
cdatapackage a try. We named the functions
unpivot_to_blocks. The idea was: by emphasizing the record structure one might eventually internalize what the transforms are doing. On the way to that we have a lot of documentation and tutorials.
This is your regular reminder that the Tidyverse is very useful, but it is not the entirety of R.
If your software or research depends on many complex and changing packages, you have no way to establish your work is correct. This is because to establish the correctness of your work, you would need to also establish the correctness of all of the dependencies. This is worse than having non-reproducible research, as your work may have in fact been wrong even the first time.
Low dependencies and low complexity dependencies can also be wrong, but in this case there at least exists the possibility of checking things or running down and fixing issues.
There are some insightful comments on this post as well, so check those out. This is definitely an area where there are trade-offs, so trying to reason through when to move in which direction is important.
ggplot– You can spot one from a mile away, which is great! And when you do it’s a silent fist bump. But sometimes you want more than the standard theme.
Fonts can breathe new life into your plots, helping to match the theme of your presentation, poster or report. This is always a second thought for me and
needto work out how to do it again, hence the post .
Read on to see how to use each of these packages. H/T R-bloggers
There seems to be a general (false) impression among non R-core developers that to run tests,
Rpackage developers need a test management system such as
testthat. And a further false impression that
testthatis the only
Rtest management system. This is in fact not true, as
Ritself has a capable testing facility in “
R CMD check” (a command triggering
Rchecks from outside of any given integrated development environment).
By a combination of skimming the
R-manuals ( https://cran.r-project.org/manuals.html ) and running a few experiments I came up with a description of how
R-testing actually works. And I have adapted the available tools to fit my current preferred workflow. This may not be your preferred workflow, but I have and give my reasons below.
Food for thought for any R developer.
The R Core Team announced yesterday the release of R 3.5.3, and updated binaries for Windows and Linux are now available (with Mac sure to follow soon). This update fixes three minor bugs (to the functions
stopifnot), but you might want to upgrade just to avoid the “package built under R 3.5.4” warnings you might get for new CRAN packages in the future.
Click through for more info on this release, including where the name from each R release comes from.