Converting XML to JSON using Recursion
The other day, I was working on an app that needed to fetch data from a third-party REST API, and what happened next is the stuff of a JavaScript developer's worst nightmares.
The server sent back its response in... gasp... XML, instead of JSON like any sane REST API would.
So I came up with a way to easily convert XML into a JavaScript object.
Keep in mind that this code relies on Web APIs, so it does not work out of the box in server-side JavaScript such as Node.js. It works well in front-end applications built with React or Angular.
Here's an example of the data I was trying to read. The XML generally looks something like this:
<book>
  <title>Some title</title>
  <description>some description </description>
  <author>
    <id>1</id>
    <name>some author name</name>
  </author>
  <review>nice book</review>
  <review>this book sucks</review>
  <review>amazing work</review>
</book>
I want the output to look something like this:
{
  "book": {
    "title": "Some title",
    "description": "some description",
    "author": { "id": "1", "name": "some author name" },
    "review": ["nice book", "this book sucks", "amazing work"]
  }
}
Since XML has a lot of nested tags, this problem is a perfect practical application of recursion.
Before we begin to code, we need to understand something called the DOMParser Web API.
According to the MDN documentation,
The DOMParser interface provides the ability to parse XML or HTML source code from a string into a DOM tree.
In simple words, it converts an XML string into a DOM tree. Here's how it works.
Let's say we have some XML stored in a string, strxml. We can parse it into a DOM tree like this:
let strxml = `<book><title>Some title</title>
<description>some description </description>
<author>
<id>1</id>
<name>some author name</name>
</author>
<review>nice book</review>
<review>this book sucks</review>
<review>amazing work</review></book>
`;
const parser = new DOMParser(); // initialize dom parser
const srcDOM = parser.parseFromString(strxml, "application/xml"); // convert the XML string to a DOM tree
// Now we can call DOM methods like getElementById, etc. on srcDOM.
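One detail worth knowing: when the XML is malformed, browsers do not throw from parseFromString. They return a document that contains a <parsererror> element instead. A minimal sketch of a stricter wrapper follows. The name `parseXmlStrict` is made up for this illustration, and the optional `parser` argument exists only so the function can be tried with a stand-in object outside the browser.

```javascript
// Hypothetical helper (not part of the original code): parse an XML string
// and throw if the parser reports an error.
// `parser` defaults to a browser DOMParser, created only when none is passed in.
function parseXmlStrict(xmlString, parser = new DOMParser()) {
  const doc = parser.parseFromString(xmlString, "application/xml");
  // On malformed XML the returned document contains a <parsererror> element.
  const errorNode = doc.querySelector("parsererror");
  if (errorNode) {
    throw new Error("Invalid XML: " + errorNode.textContent);
  }
  return doc;
}

// Usage in a browser:
// const srcDOM = parseXmlStrict("<book><title>Some title</title></book>");
```

Whether to throw or to return the error document is a design choice. Throwing keeps the conversion function from silently turning an error page into an object.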
Now that we have the basics down, let's start writing the pseudocode.
Initialize variable jsonResult as an empty object.
If srcDOM has no child nodes:
    return the innerHTML of srcDOM. // This is our base case.
For each childNode in child nodes:
    Check if childNode has siblings of the same name.
    If it has no siblings of the same name:
        set the childNode name as a key whose value is the JSON of the child node. (We're calling the function recursively.)
    If it has siblings of the same name:
        set the childNode name as a key whose value is an array, and push the JSON of every child with this name into that array.
Return jsonResult.
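The sibling check in the pseudocode above can be sketched as a single counting pass over the children. The helper names `countTagNames` and `hasSameNameSibling` are made up for this illustration. The inputs only need a `nodeName` property, so plain objects stand in for DOM elements here.

```javascript
// Hypothetical helper: count how many direct children share each tag name.
// Works on any array of objects that expose `nodeName`, such as DOM elements.
function countTagNames(children) {
  const counts = new Map();
  for (const child of children) {
    counts.set(child.nodeName, (counts.get(child.nodeName) || 0) + 1);
  }
  return counts;
}

// A child belongs in an array when its tag name occurs more than once.
function hasSameNameSibling(child, counts) {
  return counts.get(child.nodeName) > 1;
}

// Stand-ins for the children of <book>, reduced to their tag names.
const sampleChildren = [
  { nodeName: "title" },
  { nodeName: "review" },
  { nodeName: "review" },
];
const sampleCounts = countTagNames(sampleChildren);
console.log(hasSameNameSibling(sampleChildren[0], sampleCounts)); // false
console.log(hasSameNameSibling(sampleChildren[1], sampleCounts)); // true
```

Counting once up front avoids rescanning the whole sibling list for every child, which only matters for elements with many children.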
Here’s the JavaScript code:
/**
 * Converts a DOM tree into a JavaScript object.
 * @param {Document|Element} srcDOM - DOM tree to be converted.
 * @returns {Object|string} Nested object, or the text of a leaf element.
 */
function xml2json(srcDOM) {
  const children = [...srcDOM.children];
  // Base case for recursion: a leaf element returns its text.
  // trim() drops surrounding whitespace, such as the trailing space in <description>.
  if (!children.length) {
    return srcDOM.innerHTML.trim();
  }
  // Initializing the object to be returned.
  const jsonResult = {};
  for (const child of children) {
    // Checking whether the child has siblings of the same name.
    const childIsArray = children.filter(eachChild => eachChild.nodeName === child.nodeName).length > 1;
    // If the child is part of an array, collect the values in an array; otherwise store a single value.
    if (childIsArray) {
      if (jsonResult[child.nodeName] === undefined) {
        jsonResult[child.nodeName] = [xml2json(child)];
      } else {
        jsonResult[child.nodeName].push(xml2json(child));
      }
    } else {
      jsonResult[child.nodeName] = xml2json(child);
    }
  }
  return jsonResult;
}
// testing the function
let xmlstr = `<book><title>Some title</title>
<description>some description </description>
<author>
<id>1</id>
<name>some author name</name>
</author>
<review>nice book</review>
<review>this book sucks</review>
<review>amazing work</review></book>
`;
// converting to DOM Tree
const parser = new DOMParser();
const srcDOM = parser.parseFromString(xmlstr, "application/xml");
// Converting DOM Tree To JSON.
console.log(xml2json(srcDOM));
/** The output will be
{
  "book": {
    "title": "Some title",
    "description": "some description",
    "author": { "id": "1", "name": "some author name" },
    "review": ["nice book", "this book sucks", "amazing work"]
  }
}
*/
This is the basic algorithm for converting an XML string into a JSON object. Since it uses recursion, it can go arbitrarily deep into the DOM tree and process every single element.
This works for most simple cases. Note that it ignores attributes and any text that sits next to child elements, so you can modify the algorithm according to your own needs or requirements.
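As an example of such a modification, here is a minimal sketch that also keeps XML attributes, which the function above does not read. The function name `xml2jsonWithAttributes` and the "@attributes" and "#text" keys are conventions assumed for this illustration, not a standard. It accepts any object that exposes `children`, `attributes`, `nodeName` and `textContent`, which DOM elements do.

```javascript
// Sketch of one possible modification: keep XML attributes as well.
// The "@attributes" and "#text" keys are illustrative conventions only.
function xml2jsonWithAttributes(node) {
  const children = Array.from(node.children || []);
  // A Document has no `attributes`, so fall back to an empty list.
  const attrs = Array.from(node.attributes || []);

  // Leaf without attributes: return its trimmed text content.
  if (!children.length && !attrs.length) {
    return (node.textContent || "").trim();
  }

  const result = {};
  if (attrs.length) {
    result["@attributes"] = {};
    for (const attr of attrs) {
      result["@attributes"][attr.name] = attr.value;
    }
  }

  // Leaf with attributes: store the text under a separate key.
  if (!children.length) {
    result["#text"] = (node.textContent || "").trim();
    return result;
  }

  for (const child of children) {
    const value = xml2jsonWithAttributes(child);
    const hasSameNameSibling =
      children.filter(c => c.nodeName === child.nodeName).length > 1;
    if (hasSameNameSibling) {
      // Repeated tag names are collected into an array.
      (result[child.nodeName] = result[child.nodeName] || []).push(value);
    } else {
      result[child.nodeName] = value;
    }
  }
  return result;
}

// Usage in a browser:
// const doc = new DOMParser().parseFromString('<book id="7"><title>T</title></book>', "application/xml");
// console.log(JSON.stringify(xml2jsonWithAttributes(doc), null, 2));
```

The trade-off is that an element's shape now depends on whether it carries attributes: a plain leaf is a string, while a leaf with attributes becomes an object with a "#text" key.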