# Checking Functional Dependencies In R Data Frames

2018-11-01

Notice that only grouping columns and columns passed through an aggregating calculation (such as `max()`) appear in the result (the column `z` does not). Because `y` is a function of `x`, no substantial aggregation is going on. We call this situation a “pseudo aggregation”, and we have taught this before. This is also why we made the seemingly strange choice of keeping the variable name `y` (instead of picking a new name such as `max_y`): we expect the `y` values coming out to be the same as the ones coming in, just with a change in length. Pseudo aggregation (using the projection `y[[1]]`) was also used in the solutions of the column indexing problem.

Our `wrapr` package now supplies a special case pseudo-aggregator (or in a mathematical sense: projection): `psagg()`. It works as follows.
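The idea of a strict projection can be sketched in base R. This is a minimal illustration and does not call `psagg()` itself: the helper name `project_constant()` and the example data frame `d` are hypothetical, chosen only to show how a pseudo-aggregator can return the first value of a group while refusing to run when the group's values disagree.

```r
# Hypothetical stand-in for a pseudo-aggregator (not wrapr's implementation):
# return the first value, but only if every value in the group is identical.
project_constant <- function(v) {
  if (length(v) < 1) {
    stop("project_constant: empty input")
  }
  if (length(unique(v)) != 1) {
    stop("project_constant: values are not all identical")
  }
  v[[1]]
}

# Illustrative data (an assumption): y is a function of x, z is not.
d <- data.frame(
  x = c("a", "a", "b", "b", "c"),
  y = c(1, 1, 2, 2, 3),
  z = c(1, 2, 3, 4, 5)
)

# One row per value of x; y keeps its name because its values are unchanged.
res <- aggregate(y ~ x, data = d, FUN = project_constant)
print(res)

# Projecting z fails, because z varies within groups of x.
z_outcome <- tryCatch(
  {
    aggregate(z ~ x, data = d, FUN = project_constant)
    "ok"
  },
  error = function(e) "error"
)
print(z_outcome)
```

Compared with `max()` or `y[[1]]`, a strict projection of this kind documents the intent (the column is expected to be constant within each group) and fails loudly when that expectation is wrong.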

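In this post, John Mount uses the term pseudo-aggregation for grouping in the presence of a functional dependency: the values of `y` are fully determined by the values of `x` (where `x` and `y` may each stand for any number of columns), so aggregating changes the number of rows but not the values.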
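One way to test for such a dependency before aggregating is to compare counts of distinct rows. The sketch below is a base R illustration under assumptions: `is_function_of()` and the data frame `d` are hypothetical names, not part of `wrapr` or the original post.

```r
# Hypothetical helper: TRUE when the columns named in `dependent` are
# fully determined by the columns named in `keys`.
is_function_of <- function(d, keys, dependent) {
  # Distinct combinations of the key columns alone.
  n_keys <- nrow(unique(d[, keys, drop = FALSE]))
  # Distinct combinations of keys plus dependent columns.
  n_both <- nrow(unique(d[, c(keys, dependent), drop = FALSE]))
  # Equal counts mean each key combination maps to exactly one dependent value.
  n_keys == n_both
}

# Illustrative data (an assumption): y is a function of x, z is not.
d <- data.frame(
  x = c("a", "a", "b", "b", "c"),
  y = c(1, 1, 2, 2, 3),
  z = c(1, 2, 3, 4, 5)
)

y_from_x <- is_function_of(d, "x", "y")
z_from_x <- is_function_of(d, "x", "z")
y_from_xz <- is_function_of(d, c("x", "z"), "y")
print(c(y_from_x, z_from_x, y_from_xz))
```

Because both `keys` and `dependent` are character vectors, the same check covers the multi-column case mentioned above.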
