jq .
What separates a good developer from a great developer?
While a good developer solves the specific task at hand (e.g. creating a new React component or editing a REST endpoint), a great developer has a broad understanding and a set of tools that help in various situations. (My 2c.)
One of these tools is jq - an amazing CLI tool with 13k+ stars on GitHub. You can think of it as a Swiss Army knife for working with JSON. It's very useful when you want to quickly understand the data in a JSON file; it has saved me a lot of time over the past years. I enjoy using it because it's a command line tool and I love to do as much work as possible in the CLI. That's my personal choice - and another topic.
Let's see how jq can help you!
Setup
You'll find all the instructions you need here. All you have to do is download a binary for the OS you're using and move it to a location where the OS looks for binary files, e.g. /usr/local/bin on Linux/macOS.
Alternatively, you can install it using a package manager (specific to your OS). On macOS, you can use brew
$ brew install jq
and on Ubuntu you can use
$ sudo apt-get install jq
Thanks Teo for pointing this out in the comments section.
Useful resources: tutorial, manual and playground. I highly recommend looking up every jq filter used in this post in the manual.
Basic info
As stated in the manual, every jq program is a filter. It takes an input (the JSON data) and produces an output. There are a lot of builtin filters for extracting a particular field of an object, converting a number to a string, or various other standard tasks.
We are going to explore some of these filters today.
Enough talk, let's see some examples
You already saw one, which is very handy: jq . (the short form of jq '.'). This command prints the entire JSON and beautifies it if it's minified.
I'm using mockaroo to generate some mock data. Here you can find the JSON I'm using for tests.
Let's try jq . First, let's download the JSON file:
$ cd ~
$ curl -s https://gist.githubusercontent.com/charlietango/15a943a0630b50a9848a6872c810364a/raw/0d5e67c79ce1bad84789e5feee90d10b5d035dfd/codementorjq.json > mock.json
Now check the output of these commands:
$ cat mock.json
$ cat mock.json | jq .
As stated before, jq . will print the whole JSON, but beautified. Pretty cool, huh?
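If you don't have mock.json at hand, here's a minimal sketch of the same idea on a tiny made-up JSON string (the name and devices fields are just an illustration, not the real data). It also uses the -c flag, which does the opposite of beautifying: it prints compact, one-line output.

```shell
# Hypothetical minified input, invented for this example.
minified='{"name":"Ann","devices":[{"online":true}]}'

# jq . pretty-prints: one field per line, with indentation.
pretty=$(printf '%s' "$minified" | jq .)
echo "$pretty"

# jq -c . goes the other way and squeezes the JSON back onto one line.
compact=$(printf '%s' "$pretty" | jq -c .)
echo "$compact"
```

The data is identical in both forms; only the whitespace changes.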
Basic filtering: .[], .a
What if we want only the emails of those users? That's simpler than you think.
If you take a closer look at the output of the previous command, you'll notice the root of the JSON is an array.
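To get all the elements in that array, not the array itself, you can use .[] - you can think of this as decomposing the array. The filter is wrapped in single quotes so the shell doesn't try to treat the square brackets as a glob pattern (zsh, for one, refuses to run the unquoted version).
$ cat mock.json | jq '.[]'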
To get only the emails, just append .email, so the filter becomes:
$ cat mock.json | jq '.[].email'
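To see exactly what .[] and .[].email emit, here's a small sketch on a made-up two-user array (the addresses are invented, only the email field name comes from mock.json). The -c and -r flags are extras I'm adding for readability: -c prints each result on one line and -r drops the JSON quotes around strings.

```shell
# Hypothetical sample shaped like mock.json (not the real data).
sample='[{"email":"ann@example.com"},{"email":"bob@example.com"}]'

# .[] emits every element of the array as a separate JSON value.
elements=$(printf '%s' "$sample" | jq -c '.[]')
echo "$elements"

# .[].email emits only the email of each element, as quoted JSON strings.
emails=$(printf '%s' "$sample" | jq '.[].email')
echo "$emails"

# -r prints the raw strings without quotes, which is nicer for shell pipelines.
raw_emails=$(printf '%s' "$sample" | jq -r '.[].email')
echo "$raw_emails"
```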
Arrays: index, nested filters
To get the first element in the array, all you have to do is:
$ cat mock.json | jq '.[0]'
Of course you can append more filters, for getting the email from the first item or for getting the online status of the first device used by the first person:
$ cat mock.json | jq '.[0].email'
$ cat mock.json | jq '.[0].devices[0].online'
Note: jq .a.b.c and jq '.a | .b | .c' produce the same result. So we'd get the same result with:
$ cat mock.json | jq '.[0] | .email'
$ cat mock.json | jq '.[0] | .devices[0] | .online'
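A quick way to convince yourself that the two forms are equivalent is to run both on the same input and compare. This is a minimal sketch on a made-up user with two devices; the values are invented, only the field names come from mock.json.

```shell
# Hypothetical sample: one user with two devices.
sample='[{"email":"ann@example.com","devices":[{"online":true},{"online":false}]}]'

# Path form: everything chained into one expression.
path_form=$(printf '%s' "$sample" | jq '.[0].devices[0].online')

# Pipe form: each step feeds its output into the next filter.
pipe_form=$(printf '%s' "$sample" | jq '.[0] | .devices[0] | .online')

echo "path: $path_form, pipe: $pipe_form"
```

I tend to use the path form for plain lookups and switch to pipes once a step needs a function such as length or select.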
Length: array construction, length, pipe
What if you want to get the total number of emails? You can use wc -l, of course.
$ cat mock.json | jq '.[].email' | wc -l
But you can also use the length function in jq. Let's see it in action!
$ cat mock.json | jq '.[].email | length'
Not what you expected, right? Those numbers are the lengths of the email strings, because length gets applied to each email value (the decomposed array) separately; length works on arrays, strings and objects. So, for example, jq '.[] | length' will yield the number of fields in each object.
To get the number of emails, all we have to do is to construct an array - which is intuitive.
$ cat mock.json | jq '[.[].email] | length'
You may ask yourself what would happen if you called jq '. | length'. That gives you the total number of items in the array, because the array hasn't been decomposed.
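Since length behaves differently depending on what it receives, it may help to see all four variants side by side. This sketch uses a made-up two-user sample (invented values), so the numbers are easy to check by hand.

```shell
# Hypothetical sample: two users, each with two fields.
sample='[{"email":"ann@example.com","gender":"Female"},{"email":"robert@example.com","gender":"Male"}]'

# length on each email string: the number of characters in each address.
per_email=$(printf '%s' "$sample" | jq '.[].email | length')

# length on each object: the number of fields in each user.
per_object=$(printf '%s' "$sample" | jq '.[] | length')

# length on an array built from the emails: how many emails there are.
email_count=$(printf '%s' "$sample" | jq '[.[].email] | length')

# length on the root array: how many users there are.
total=$(printf '%s' "$sample" | jq 'length')

echo "per email:  $per_email"
echo "per object: $per_object"
echo "emails: $email_count, users: $total"
```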
Filter: select, and, contains, ==
What if you want to filter those emails? You can use grep, of course.
$ cat mock.json | jq '.[].email' | grep @google
But you can also use select.
$ cat mock.json | jq '.[].email | select(. | contains("@google"))'
Let's add one more condition to our filter to get all women with a Google email.
$ cat mock.json | jq '.[] | select((.email | contains("@google")) and .gender == "Female")'
This filter will return 2 objects (not emails) because the first filter .[] returns objects, which we then filter using select((.email | contains("@google")) and .gender == "Female"). If we take a closer look at this bit, we'll see the 2 conditions: .gender == "Female" and (.email | contains("@google")). The pipe | is used to apply both the select and the contains functions.
Note: We've used pipes inside the jq command to combine filters and to call length and select. Be mindful of the single quotes wrapped around the whole filter: they stop the shell from interpreting characters like |, ( and " itself.
If we want to get only the email, all we need to do is to pipe another filter.
$ cat mock.json | jq '.[] | select((.email | contains("@google")) and .gender == "Female") | .email'
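Here's the same select filter on a made-up three-user sample, small enough to predict the result by eye. The names and addresses are invented; only the field names match mock.json. Only Ann satisfies both conditions, Bob fails the gender check and Cat fails the email check.

```shell
# Hypothetical sample: three users, only one matches both conditions.
sample='[
  {"first_name":"Ann","gender":"Female","email":"ann@google.com"},
  {"first_name":"Bob","gender":"Male","email":"bob@google.com"},
  {"first_name":"Cat","gender":"Female","email":"cat@example.com"}
]'

# select keeps an object only when the condition inside it is true.
# -r prints the resulting email without JSON quotes.
matches=$(printf '%s' "$sample" | jq -r '.[] | select((.email | contains("@google")) and .gender == "Female") | .email')
echo "$matches"
```

Keep in mind that contains on strings is a plain substring check, so "@google" would also match an address like someone@googlemail.com.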
Length + Filter
Let's find out how many women are using Google emails. We only need to construct an array and call length.
$ cat mock.json | jq '[.[] | select((.email | contains("@google")) and .gender == "Female")] | length'
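As a side note, jq also has a map builtin, and map(f) is defined as [.[] | f], so the count can be written either way. A minimal sketch on a made-up three-user sample (invented values) showing that both forms agree:

```shell
# Hypothetical sample: only Ann is a woman with a Google email.
sample='[
  {"gender":"Female","email":"ann@google.com"},
  {"gender":"Male","email":"bob@google.com"},
  {"gender":"Female","email":"cat@example.com"}
]'

# Explicit array construction, as in the command above.
via_brackets=$(printf '%s' "$sample" | jq '[.[] | select((.email | contains("@google")) and .gender == "Female")] | length')

# The same thing with map, which applies the filter to each element
# and collects the results back into an array.
via_map=$(printf '%s' "$sample" | jq 'map(select((.email | contains("@google")) and .gender == "Female")) | length')

echo "brackets: $via_brackets, map: $via_map"
```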
Like it so far?
Object construction and string interpolation: {}, (.a)
Let's say we want to have a list of objects containing only 3 fields: first_name, last_name and email. We need to construct some new objects and, again, the syntax is intuitive.
$ cat mock.json | jq '.[] | {first_name: .first_name, last_name: .last_name, email: .email}'
What about concatenating first_name and last_name? The solution is string interpolation - similar to template literals in JS.
$ cat mock.json | jq '.[] | {name: "\(.first_name) \(.last_name)", email: .email}'
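When the new key has the same name as the source field, jq lets you drop the repetition: {email} means the same as {email: .email}. Here's a small sketch of that shorthand next to string interpolation, on a made-up single-user sample (the values are invented).

```shell
# Hypothetical sample: one user with an extra field we want to leave out.
sample='[{"first_name":"Ann","last_name":"Lee","email":"ann@example.com","gender":"Female"}]'

# Shorthand: {first_name, last_name, email} copies those three fields as-is.
short=$(printf '%s' "$sample" | jq -c '.[] | {first_name, last_name, email}')
echo "$short"

# Interpolation: \(...) inside a string is replaced by the result of the filter.
named=$(printf '%s' "$sample" | jq -c '.[] | {name: "\(.first_name) \(.last_name)", email}')
echo "$named"
```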
Grouping: group_by
Another cool thing jq can do is grouping. We can demonstrate that by grouping by gender.
$ cat mock.json | jq 'group_by(.gender)'
It worked, but the result isn't very readable - it's an array which contains 2 other arrays, which contain the grouped objects. Let's tweak it a bit. We'll decompose 2 times (since we have arrays inside an array).
$ cat mock.json | jq 'group_by(.gender) | .[] | .[]'
And we'll form some new objects, using the technique presented above.
$ cat mock.json | jq 'group_by(.gender) | .[] | .[] | {name: "\(.first_name) \(.last_name)", gender: .gender}'
Looking good, but the items are all together. We want to keep 2 separate arrays. The array construction technique is our solution.
$ cat mock.json | jq 'group_by(.gender) | .[] | [.[] | {name: "\(.first_name) \(.last_name)", gender: .gender}]'
One last thing. Let's see how many men and how many women there are in our dataset.
$ cat mock.json | jq 'group_by(.gender) | .[] | [.[] | {name: "\(.first_name) \(.last_name)", gender: .gender}] | length'
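The command above prints bare numbers, so you have to remember which group is which. One possible tweak, sketched here on a made-up three-user sample (invented data), is to label each count with the gender taken from the first item of the group. Since length only needs the group itself, the name objects can be skipped here.

```shell
# Hypothetical sample: two women and one man.
sample='[{"gender":"Male"},{"gender":"Female"},{"gender":"Female"}]'

# group_by returns one array per gender, sorted by the grouping key.
# For each group, take the gender of its first item and the size of the group.
counts=$(printf '%s' "$sample" | jq -c 'group_by(.gender) | .[] | {gender: .[0].gender, count: length}')
echo "$counts"
```

Every item in a group shares the same gender, so reading it from .[0] is safe.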
Conclusions
jq is a very powerful and lightweight tool and I think every developer should have at least a basic understanding of how it works.
I've only scratched the surface, and I highly recommend having a look at the manual to see what it's capable of.
Thanks
Thanks for reading! I hope this will help.
Code on!