Checking Functional Dependencies In R Data Frames

John Mount shows us how to use the psagg() function in wrapr to check that functional dependencies actually hold:

Notice only grouping columns and columns passed through an aggregating calculation (such as max()) are passed through (the column z is not in the result). Now, because y is a function of x, no substantial aggregation is going on; we call this situation a “pseudo aggregation” and we have taught this before. This is also why we made the seemingly strange choice of keeping the variable name y (instead of picking a new name such as max_y): we expect the y values coming out to be the same as the ones coming in, just with changes of length. Pseudo aggregation (using the projection y[[1]]) was also used in the solutions of the column indexing problem.
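The situation described in the quote can be sketched in a few lines. This is a minimal illustration using base R's aggregate() rather than whatever grouping tools the original post used; the data frame d and its values are made up for the example.

```r
# Hypothetical example data: y is a function of x, while z varies freely.
d <- data.frame(
  x = c("a", "a", "b", "b", "b"),
  y = c(1, 1, 2, 2, 2),
  z = 1:5,
  stringsAsFactors = FALSE
)

# Aggregate y within each x group using max().
# z is neither a grouping column nor aggregated, so it is not in the result.
agg <- aggregate(y ~ x, data = d, FUN = max)
print(agg)

# Because y is constant within each x group, taking the first element
# (the projection y[[1]]) yields the same values as max().
proj <- aggregate(y ~ x, data = d, FUN = function(v) v[[1]])
print(proj)
```

Both results have one row per value of x, and the y values are unchanged apart from the reduction in length, which is why keeping the name y is reasonable here.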

Our wrapr package now supplies a special case pseudo-aggregator (or in a mathematical sense: projection): psagg(). It works as follows.
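The idea behind such a pseudo-aggregator can be sketched in base R. The helper below, first_if_constant(), is a hypothetical stand-in written for this illustration; it is not wrapr's implementation, and psagg() itself is not called here. It assumes the intended behavior is "return the first value, but fail if the group holds more than one distinct value."

```r
# Hypothetical helper (not wrapr's code): project a group onto its first
# value, but stop if the values in the group are not all the same.
first_if_constant <- function(v) {
  if (length(v) < 1) {
    stop("empty group")
  }
  if (length(unique(v)) != 1) {
    stop("values are not constant within group")
  }
  v[[1]]
}

# y is a function of x here, so the projection succeeds.
good <- data.frame(x = c("a", "a", "b"), y = c(1, 1, 2),
                   stringsAsFactors = FALSE)
res <- aggregate(y ~ x, data = good, FUN = first_if_constant)
print(res)

# Here x = "a" maps to two different y values, so the dependency is violated
# and the helper raises an error, which we capture as a message.
bad <- data.frame(x = c("a", "a", "b"), y = c(1, 3, 2),
                  stringsAsFactors = FALSE)
outcome <- tryCatch(
  aggregate(y ~ x, data = bad, FUN = first_if_constant),
  error = function(e) conditionMessage(e)
)
print(outcome)
```

Failing loudly on a violated dependency is the design point: a plain max() or y[[1]] would silently return an answer even when y is not actually determined by x.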

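In this post, John uses the term pseudo-aggregation for grouping over a functional dependency: a case where the value of y is fully determined by the value of x (where x and y can each stand for any number of columns), so aggregating y within each x group never has to combine distinct values.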
