SQL-like aggregation in R

Mon 28 July 2014


The use case is similar to the one described in the "GROUP BY" section of any typical SQL quick reference guide.
First of all, I need a dataset as a single data.frame:
d <- data.frame(
  product = sample(c("fruit", "phone", "computer"), size = 20, replace = TRUE),
  vendor  = sample(c("manu", "the other guy"), size = 20, replace = TRUE),
  note    = sample(1:5, size = 20, replace = TRUE)
)
To compute the mean note of the products sold by each vendor, the SQL query looks like:
SELECT vendor, AVG(note)
FROM d
GROUP BY vendor;
In R, the equivalent is:
aggregate(note ~ vendor, data = d, FUN = mean)
Note that the columns of the data.frame must be named, and that the function can be any function you write yourself.
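To illustrate the "any function" part, here is a minimal sketch with a hand-written function. The data set d_small and the helper note_range are made up for this example (they are not part of the random data set above); the values are fixed so the result is easy to check by hand.

```r
# Small hand-made data set (hypothetical values, not the random one above)
d_small <- data.frame(
  vendor = c("manu", "manu", "manu", "the other guy", "the other guy"),
  note   = c(1, 3, 5, 2, 4)
)

# Hypothetical custom aggregation: gap between the best and the worst note
note_range <- function(x) {
  max(x) - min(x)
}

# Same call as before, with the custom function passed as FUN
res_range <- aggregate(note ~ vendor, data = d_small, FUN = note_range)
print(res_range)
```

Each vendor's notes are passed to note_range as one vector, so anything that turns a vector into a single value should fit here.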
More complicated: grouping by several columns. I don't know if there is a canonical way to do it, but I found one:
aggregate(note ~ vendor + product, data = d, FUN = mean)

Note that you can use any formula that has both vendor and product on its right-hand side.
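In SQL terms this corresponds to GROUP BY vendor, product. As a rough sketch, the same grouping can also be written without a formula, by giving aggregate an explicit list of grouping columns through its by argument. The data set d_small below is made up for the example, with fixed values so the means can be checked by hand.

```r
# Small hand-made data set (hypothetical values)
d_small <- data.frame(
  vendor  = c("manu", "manu", "manu",
              "the other guy", "the other guy", "the other guy"),
  product = c("fruit", "fruit", "phone", "fruit", "phone", "phone"),
  note    = c(2, 4, 5, 1, 3, 5)
)

# Formula interface: one row per (vendor, product) pair present in the data
by_formula <- aggregate(note ~ vendor + product, data = d_small, FUN = mean)

# Same grouping with an explicit, named list of grouping columns
by_list <- aggregate(
  d_small["note"],
  by  = list(vendor = d_small$vendor, product = d_small$product),
  FUN = mean
)

print(by_formula)
print(by_list)
```

Both calls should return four rows, one per vendor and product combination, with the mean note in the last column.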

Category: how to Tagged: Aggregate Group by R SQL programming
