Notice that only grouping columns and columns passed through an aggregating calculation (such as max()) appear in the result (the column z is not in the result). Now, because y is a function of x, no substantial aggregation is going on. We call this situation a “pseudo aggregation”, and we have taught this before. This is also why we made the seemingly strange choice of keeping the variable name y (instead of picking a new name such as max_y): we expect the y values coming out to be the same as the ones coming in, just with a change in length. Pseudo aggregation (using the projection y[]) was also used in the solutions of the column indexing problem.
In this post, John uses the term pseudo-aggregation for aggregating columns that are functionally dependent on the grouping columns (that is, where the value of y is determined by the value of x, for any number of columns in y or x).
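
To make the idea concrete, here is a minimal sketch in Python using pandas. This is an assumption for illustration, not necessarily the tooling used in the original post, and the data frame d with its values is made up. It first checks that y really is a function of x, then performs the pseudo aggregation with max() while keeping the column name y.

```python
import pandas as pd

# Hypothetical example data: y is determined by x, while z varies freely.
d = pd.DataFrame({
    "x": [1, 1, 2, 2, 3],
    "y": [10, 10, 20, 20, 30],
    "z": [1, 2, 3, 4, 5],
})

# Check the functional dependency x -> y:
# every group of x must contain exactly one distinct y value.
y_is_function_of_x = bool((d.groupby("x")["y"].nunique() == 1).all())

# Pseudo aggregation: max() only selects the single y value in each group,
# so we keep the name y instead of introducing a new name such as max_y.
res = d.groupby("x", as_index=False).agg(y=("y", "max"))

# Only the grouping column x and the aggregated column y survive; z is dropped.
print(y_is_function_of_x)
print(res)
```

Because each group holds a single distinct y, any order-insensitive selector (max(), min(), or taking the first value) would return the same result here. The check on y_is_function_of_x is what justifies treating the step as a pseudo aggregation rather than a real one.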