MongoDB Aggregation Framework & Meteor.js
What is the MongoDB Aggregation Framework?
In a nutshell, the aggregation framework lets you find and manipulate data by running it through a pipeline of N stages.
Umm okay.. and what's a pipeline?
From all about pipelines:
A pipe is a form of redirection that is used in Linux and other Unix-like operating systems to send the output of one program to another program for further processing.
I really love rahmu's answer on this StackOverflow question about a pipeline example as well.
Ok ok, but how does a pipeline in Mongo work? We are talking about mongo over here.
Well, as you've already seen, a pipeline is a group of commands. In JavaScript, we call a group like that an array, and this is what the syntax looks like:
db.example.aggregate( [<pipeline>] )
Where <pipeline> can be built from any of these $operators (I'm just going to link to the whole list in the MongoDB docs instead of copying and pasting it).
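To make the "array of stages" idea concrete, here is a minimal in-memory sketch in plain JavaScript. It is not how MongoDB implements aggregation; `runPipeline` and the two stage functions are hypothetical names made up for illustration. Each stage takes an array of documents and returns a new array, which is handed to the next stage.

```javascript
// Hypothetical helper: feeds the output of each stage into the next one,
// the same way a shell pipe chains programs together.
function runPipeline(docs, stages) {
  return stages.reduce((current, stage) => stage(current), docs);
}

// Two made-up stages, written as plain functions over arrays.
const onlyActive = (docs) => docs.filter((doc) => doc.active);
const namesOnly = (docs) => docs.map((doc) => ({ name: doc.name }));

// Made-up sample documents.
const users = [
  { name: "Ana", active: true },
  { name: "Bob", active: false },
  { name: "Cy", active: true },
];

const activeNames = runPipeline(users, [onlyActive, namesOnly]);
console.log(activeNames);
```

The order of the array matters: swapping the two stages here would drop the `active` field before the filter could look at it.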
Now this seems pretty interesting and all, but you might be wondering whether it works with Meteor?
The answer is yes. There are some packages out there on Atmosphere (the Meteor community's open source package repository), but I have to say that the best one is [the package from Meteorhacks](https://atmospherejs.com/meteorhacks/aggregate).
So now that you know what aggregation is, what a pipeline is, and even better, how they work with Meteor, you are likely wondering, "how the he#$# can I use it?"
Example
Let's say you are working for Walmart, you have access to the whole inventory, and you have lots of customers. Say you are in charge of Walmart's invoices, and you get the following requirement:
"Customer X wants to know the lowest price he has paid for each of the Y products he is already buying, in state W"
And, thank god, you have a collection named Invoices, which stores all the invoices, so now you need to filter that collection by customer name and state code. You can then iterate over all the products inside each invoice (an invoice can hold anywhere from 0 to 1000 products), and compare them to find the lowest price of each of the Y products.
Such a long and boring query, right? Let's use aggregation for this.
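For comparison, the by-hand approach just described might look roughly like the sketch below. The sample data is made up, the field names are borrowed from the aggregation example in this post, and `lowestPricesByHand` is a hypothetical helper, so treat it as an illustration rather than real invoice code.

```javascript
// Made-up sample invoices (assumed shape: one array of items per invoice).
const invoices = [
  { customerNumber: "C1", shipToState: "TX", invoiceItems: [
    { productCode: "P1", price: 10 }, { productCode: "P2", price: 7 } ] },
  { customerNumber: "C1", shipToState: "TX", invoiceItems: [
    { productCode: "P1", price: 8 }, { productCode: "P3", price: 3 } ] },
  { customerNumber: "C2", shipToState: "TX", invoiceItems: [
    { productCode: "P1", price: 1 } ] },
];

// Hypothetical helper: nested loops that track the lowest price per product.
function lowestPricesByHand(allInvoices, customerNumber, state, productCodes) {
  const lowest = {};
  allInvoices.forEach((invoice) => {
    // Skip invoices for other customers or other states.
    if (invoice.customerNumber !== customerNumber) return;
    if (invoice.shipToState !== state) return;
    invoice.invoiceItems.forEach((item) => {
      // Skip products the customer did not ask about.
      if (!productCodes.includes(item.productCode)) return;
      const seen = lowest[item.productCode];
      if (seen === undefined || item.price < seen) {
        lowest[item.productCode] = item.price;
      }
    });
  });
  return lowest;
}

const lowestByHand = lowestPricesByHand(invoices, "C1", "TX", ["P1", "P2"]);
console.log(lowestByHand);
```

Note that all the filtering and comparing happens in application code, after every invoice has already been loaded.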
CustomerInvoices.aggregate([{
  $match: {
    customerNumber: customerNumber,
    shipToState: billToState
  }
}, {
  $unwind: "$invoiceItems"
}, {
  $match: {
    "invoiceItems.productCode": {
      $in: productCodes
    }
  }
}, {
  $group: {
    _id: "$invoiceItems.productCode",
    price: {
      $min: "$invoiceItems.price"
    }
  }
}]);
So what's going on here?
First, the $match operator, as the name implies, keeps only the documents that match the given conditions (here, the customer number and the state), so less data gets passed to the next stage of the pipeline.
Then the $unwind operator takes the invoiceItems array and outputs one document per array element, so we can $match on individual items just like we matched on invoices in the first step.
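Here is a small in-memory sketch of what $unwind does to a single invoice. The `unwind` function is a hypothetical stand-in written in plain JavaScript, not MongoDB's implementation, and the invoice is made up. Like $unwind with its default settings, it drops documents whose array field is missing or empty.

```javascript
// Hypothetical stand-in for { $unwind: "$invoiceItems" }:
// emits one copy of the document per element of the array field.
function unwind(docs, field) {
  return docs.flatMap((doc) =>
    (doc[field] || []).map((element) => ({ ...doc, [field]: element }))
  );
}

// Made-up invoice with two items.
const invoice = {
  _id: 1,
  customerNumber: "C1",
  invoiceItems: [
    { productCode: "P1", price: 10 },
    { productCode: "P2", price: 7 },
  ],
};

const unwound = unwind([invoice], "invoiceItems");
console.log(unwound);
```

Each output document keeps the invoice's other fields, but `invoiceItems` now holds a single item instead of an array, which is why the next stage can match on "invoiceItems.productCode" directly.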
Then we use $match again to keep only the items whose product code is in the given array of product codes (the ones the customer wants the lowest price for).
Finally, we use $group. This takes the output of the previous stage and groups it by an _id key (here, the product code). You can also compute values per group with accumulators such as the $min we are using here, so the result has one document per product code, holding its _id and its lowest price.
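The grouping step can be sketched in memory as well. `groupMinPrice` is a hypothetical helper that mimics this particular $group stage (group by product code, keep the minimum price) on made-up, already unwound documents. It is only meant to show the shape of the output.

```javascript
// Hypothetical stand-in for:
// { $group: { _id: "$invoiceItems.productCode", price: { $min: "$invoiceItems.price" } } }
function groupMinPrice(docs) {
  const lowestByCode = new Map();
  for (const doc of docs) {
    const { productCode, price } = doc.invoiceItems;
    const current = lowestByCode.get(productCode);
    // Keep the smallest price seen so far for this product code.
    if (current === undefined || price < current) {
      lowestByCode.set(productCode, price);
    }
  }
  // One output document per group, shaped like the $group result.
  return Array.from(lowestByCode, ([_id, price]) => ({ _id, price }));
}

// Made-up documents, as they would look after $unwind and the second $match.
const unwoundItems = [
  { invoiceItems: { productCode: "P1", price: 10 } },
  { invoiceItems: { productCode: "P2", price: 7 } },
  { invoiceItems: { productCode: "P1", price: 8 } },
];

const grouped = groupMinPrice(unwoundItems);
console.log(grouped);
```

This sketch returns groups in the order they were first seen. MongoDB makes no such promise for $group output, so add a $sort stage if the order matters.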
And that's it. Looks pretty, right? So if you have a similar use case and you are using a bunch of dirty _.each loops, go and switch to aggregation.