Codementor Events

Bit Packing or How to Love AND, OR or XOR

Published Jun 07, 2019
Bit Packing or How to Love AND, OR or XOR

Summary

In this article we are going to explore the use of bit packing. It allows us to compress multiple values into a integer or binary field in MongoDB potentially allowing us to save on document size in storage and transmission over the wire. It has some real benefits but comes with trade offs that you should be clearly aware off before applying it.

Pros Cons
* Reduce space consumed by document. * Cannot use indexes for querying bit packed field entries.
* Can efficiently pack multiple values and flags into a single field. * Complex update statements required.
* Smaller documents means less data transmitted over the wire. * Requires bit twiddling knowledge.

In general you should only consider bit packing when small document size is required attribute as it introduces additional complexity to your application. However if you do need it, it can be an invaluable tool.

Overview

Lets frame our exploration of bit packing with a hypothetical data model that models a unix like file system including read, write, execute bits for the user, group and others. Lets consider the following model.

{
  name: "peter",
  isDirectory: true,
  path: "/home/user/peter",
  parentPath: "/home/user",
  group: 0,
  user: 10,
  permissions: {
    group: {
      read: true,
      write: false,
      execute: false
    },
    user: {
      read: true,
      write: true,
      execute: true
    },
    all: {
      read: true,
      write: false,
      execute: false
    }
  }
}

In this model each document represents either a directory or file. Each of the of entries belong to a user and a group. In our system we can have 256 different groups represented by 8 bits and 16384 distinct users represented by 14 bits.

Now what if we could represent all of these fields and their value by a single 32 bit integer. We would save a fair bit of space. Instead of having to store all the space for the fields isDirectory, group, user, permissions, permissions.group, permissions.group.read/write/execute, permissions.user, permissions.user.read/write/execute, permissions.all and permissions.all.read/write/execute we could create a single field that contained all the values packed into a single integer value of 32 bits called metadata.

For the metadata field we would define each grouping of bits to contain the compressed information.

1                                     // [0:0]    isDirectory
 11111111                             // [1:9]    Group id
         11111111111111               // [10:23]  User id
                       111            // [24:26]  Group r/w/x
                          111         // [27:29]  User r/w/x
                             111      // [29:31]  All r/w/x
  • The [0:0] bits defines the isDirectory value.
  • The [1:9] bits encode the 8 bits of the Group id.
  • The [10:23] bits encode the 14 bits of the User id.
  • The [24:26] bits encode the Group read/write/execute flags.
  • The [27:29] bits encode the User read/write/execute flags.
  • The [29:31] bits encode the All read/write/execute flags.

The resulting metadata field would be a 32 bit value that would represent the flags.

{
  name: "peter",
  path: "/home/user/peter",
  parentPath: "/home/user",
  metadata: 0b11111111111111111111111111111111
}

Lets look at how the change affects the size of the resulting MongoDB document. To do this we are going to use a method in the shell called Object.bsonsize that lets us pass JS object and get the BSON binary size in bytes.

Document Size in Bytes
Original 244
Bit packed 93

As we can see the bit packing reduces the size of the resulting document by quite a bit. The trade off is of course more complex code.

Modifying the metadata

So how do we modify the read permissions or change the user or group id? To do this we need to use the update operator $bit to perform and and or operations on the metadata field.

We are going to use a wrapper class called Entry to simplify the modification of the metadata field.

class Entry {
  constructor(collection, _id, name, path, parentPath, metadata) {
    this.collection = collection;
    this._id = _id;
    this.name = name;
    this.path = path;
    this.parentPath = parentPath;
    this.metadata = metadata;
  }

  setGroupId(id) {
    // Create AND and OR bitmasks and update metadata field
    updateMask(this.collection, this._id,
      NumberInt(
        parseInt(`10000000011111111111111111111111`, 2)),
      NumberInt(
        parseInt(`0${createIntegerMask(id, 8)}00000000000000000000000`, 2))
    );
  }

  setUserId(id) {
    // Create AND and OR bitmasks and update metadata field
    updateMask(this.collection, this._id,
      NumberInt(
        parseInt(`11111111100000000000000111111111`, 2)),
      NumberInt(
        parseInt(`000000000${createIntegerMask(id, 14)}000000000`, 2))
    );
  }

  setGroupBits(read, write, execute) {
    // Get generated AND and OR flags
    var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
    // Create AND and OR bitmasks and update metadata field
    updateMask(this.collection, this._id,
      NumberInt(
        parseInt(`11111111111111111111111${andBits}111111`, 2)),
      NumberInt(
        parseInt(`00000000000000000000000${orBits}000000`, 2))
    );
  }

  setUserBits(read, write, execute) {
    // Get generated AND and OR flags
    var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
    // Create AND and OR bitmasks and update metadata field
    updateMask(this.collection, this._id,
      NumberInt(
        parseInt(`11111111111111111111111111${andBits}111`, 2)),
      NumberInt(
        parseInt(`00000000000000000000000000${orBits}000`, 2))
    );
  }

  setAllBits(read, write, execute) {
    // Get generated AND and OR flags
    var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
    // Create AND and OR bitmasks and update metadata field
    updateMask(this.collection, this._id,
      NumberInt(
        parseInt(`11111111111111111111111111111${andBits}`, 2)),
      NumberInt(
        parseInt(`00000000000000000000000000000${orBits}`, 2))
    );
  }

  save() {
    this.collection.updateOne({
      _id: this._id
    }, {
      $set: {
        name: this.name, 
        path: this.path, 
        parentPath: this.parentPath, 
        metadata: this.metadata  
      }
    }, { upsert: true });
  }
}

The class contains several methods.

Method name Description
setGroupId(id) Sets the groupId part of the metadata field.
setUserId(id) Sets the userId part of the metadata field.
setGroupBits(read, write, execute) Sets the read, write and execute bits for the group permissions, passing undefined or null will skip the bit.
setUserBits(read, write, execute) Sets the read, write and execute bits for the user permissions, passing undefined or null will skip the bit.
setAllBits(read, write, execute) Sets the read, write and execute bits for the all permissions, passing undefined or null will skip the bit.
save() Inserts or updates the document

We are going to dissect two methods for our cause. The first being the setUserId(id) method and the second one being the setUsersBits(read, write, execute) method.

Let's start by looking at the setUserId(id) method.

setUserId(id) {
  // Create AND and OR bitmasks and update metadata field
  updateMask(this.collection, this._id,
    NumberInt(parseInt(`11111111100000000000000111111111`, 2)),
    NumberInt(parseInt(`000000000${createIntegerMask(id, 14)}0000000000`, 2))
  );
}

We are calling two generic methods. The first is called createIntegerMask and the second is called updateMask.

The createIntegerMask method takes the value passed in and the bits resolution of the field.

The updateMask method takes a collection instance, the _id value of the target document and two masks. The first being the AND mask and the second being the OR mask.

Lets look at each of them does, starting with the createIntegerMask(id, 14).

function createIntegerMask(id, bitResolution) {
  let bitsString = id.toString(2);
  if (bitsString.length > bitResolution) {
    throw Error(`value id must be between 0 and ${Math.pow(2, bitResolution) - 1}`);
  }

  // Ensure `bitResolution` bit string
  let missingValues = "";
  for(let i = 0; i < (bitResolution - bitsString.length); i++) {
    missingValues = missingValues + "0";
  }

  // Pad the bitString with 0s
  return missingValues + bitsString;  
}

The createIntegerMask method will generate a String of length bitResolution that is the binary representation of the value id. To ensure the returned string is exactly bitResolution long this method pads the front of the bit string with 0s until the end length is bitResolution long.

In other words say that we pass in the id value of 1 with bitResolution set to 4. This will be turned into the following String representation 0001.

Now lets go back to the calling method.

setUserId(id) {
  // Create AND and OR bitmasks and update metadata field
  updateMask(this.collection, this._id,
    NumberInt(parseInt(`11111111100000000000000111111111`, 2)),
    NumberInt(parseInt(`000000000${createIntegerMask(id, 14)}0000000000`, 2))
  );
}

Remember the bit packing schema we defined.

1                                     // [0:0]    isDirectory
 11111111                             // [1:9]    Group id
         11111111111111               // [10:23]  User id
                       111            // [24:26]  Group r/w/x
                          111         // [27:29]  User r/w/x
                             111      // [29:31]  All r/w/x
  • The [0:0] bits defines the isDirectory value.
  • The [1:9] bits encode the 8 bits of the Group id.
  • The [10:23] bits encode the 14 bits of the User id.
  • The [24:26] bits encode the Group read/write/execute flags.
  • The [27:29] bits encode the User read/write/execute flags.
  • The [29:31] bits encode the All read/write/execute flags.

We want to update the bits in the position [10:23] with a new userId passed in as the id parameter. To make it more tangible lets assume the id has the value 1.

We are creating two masks here.

Mask Operation
11111111100000000000000111111111 AND
000000000${createIntegerMask(id, 14)}0000000000 OR

The AND mask has the bits at the positions [10:23] set to 0 and the OR mask has the same [10:23] bits set to the binary representation of the id value.

So what does the updateMask actually do. Lets take a look.

function updateMask(collection, _id, andMask, orMask) {
  // Update the fields
  collection.updateOne({
    _id: _id
  }, {
    $bit: {
      metadata: {
        and: andMask, 
        or: orMask
      }
    }
  });
}

Not much as you can see. It basically updated the document with the _id field equal to the passed in _id parameter. However the magic lies in the $bit operator.

The $bit operator lets us apply the three bitwise operations and, or and xor. What does the three operations do to the metadata field? To understand this we need to understand what a bitwise and, or or xor does.

AND Operation

Lets start by looking at how the AND operation works on a single bit field. This is most easily expressed using a truth table.

A B A and B
0 0 0
0 1 0
1 0 0
1 1 1

Both A and B must be 1 for the AND operation to return 1. If any of the two sides is a 0 the result is also 0.

We can use AND operations to clear bits by providing a mask where the bits we want cleared is set to 0.

Lets take an example of an existing value and applying a mask using AND.

Value Description
10111101 Original value
11100111 AND mask
10100101 Resulting value

In the AND mask we set the bits where we want to keep the original bit setting to 1 and the ones we want to clear to 0. If you look at the table you can see that a 1 will ensure no change for the matching bit in the original value while a 0 will always clear it.

OR Operation

Next lets look at how the OR bitwise operation works. Lets look at the truth table for OR.

A B A or B
0 0 0
0 1 1
1 0 1
1 1 1

We can see from the table that if the mask bit value is set to 1 the resulting bit value will always be 1.

We can use the OR operation to to ensure specific bits are set on the final value.

Lets take an example of an existing value and applying a mask using OR.

Value Description
10100101 Original value
00010000 OR mask
10110101 Resulting value

In the OR mask we set the bits where we want to keep the original bit setting to 0 and the ones we want to set to 1. If you look at the table you can see that a 0 will ensure no change for the matching bit in the original value while a 1 will always set it.

XOR Operation

Finally lets look at the XOR Operation or Exclusive OR. You can think of XOR as a way to inverse the value of a single bit. Lets look at the truth table for XOR.

A B A xor B
0 0 0
0 1 1
1 0 1
1 1 0

We can see from the table that if the mask bit is set to 1 it will flip the value of the original bit to it's reverse. That is to say the a 0 will become a 1 and a 1 will become a 0.

We can use the XOR to flip a flag from 0 to 1 or 1 to 0.

Lets take an example of an existing value and applying a mask using OR.

Value Description
10110101 Original value
00011000 XOR mask
10101101 Resulting value

We can see that the value 10 gets it bits flipped and becomes 01.

$bit Operation in setUser(id)

Lets look at the $bit operation we are doing in the update statement.

$bit: {
  metadata: {
    and: andMask, 
    or: orMask
  }
}

What exactly does it do ? Lets use a full example for the setUserId(id) method where id is equal to 1.

Value Description
11111111100000001110000111111111 Original value
11111111100000000000000111111111 AND mask
11111111100000000000000111111111 Resulting value
11111111100000000000001111111111 OR mask
11111111100000000000001111111111 Resulting value

The first AND step, sets all the bits in the userId to 0, thus clearing the 14 bits value.

Next the OR mask sets all bits to 1 where its mask has a bit set to 1, thus applying the new userId to it's space in the metadata field.

The operation lets us apply a set of bits to a specific location in the 32 bit long integer, thus saving the groupId as a 14 bit value inside the metadata field at the positions [10:23] of the 32 bit integer value.

The setUserBits(read, write, execute) function

The setUserBits function lets us set the three flags read, write and execute for the user on the metadata field.

setUserBits(read, write, execute) {
  // Get generated AND and OR flags
  var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
  // Create AND and OR bitmasks and update metadata field
  updateMask(this.collection, this._id,
    NumberInt(parseInt(`11111111111111111111111111${andBits}111`, 2)),
    NumberInt(parseInt(`00000000000000000000000000${orBits}000`, 2))
  );
}

Optimizing Performance

One thing that is painfully obvious is that we currently have to perform an update for each time we want to modify a field in the packed metadata field. What if we could set multiple fields values at the same time?

Lets modify the Entry class to allow us to perform multiple field updates in a single operation.

class TurboEntry {
  constructor(collection, _id, name, path, parentPath, metadata) {
    this.collection = collection;
    this._id = _id;
    this.name = name;
    this.path = path;
    this.parentPath = parentPath;
    this.metadata = metadata;

    // Contains the masks we are going to apply
    this.updateMasks = {
      and: null,
      or: null
    }
  }

  applyMask(andMask, orMask) {
    if (!this.updateMasks.and) {
      this.updateMasks.and = andMask;
    } else {
      this.updateMasks.and = this.updateMasks.and & andMask;
    }

    if (!this.updateMasks.or) {
      this.updateMasks.or = orMask;
    } else {
      this.updateMasks.or = this.updateMasks.or | orMask;
    }
  }

  setGroupId(id) {
    this.applyMask(
      parseInt(`10000000011111111111111111111111`, 2),
      parseInt(`0${createIntegerMask(id, 8)}00000000000000000000000`, 2)
    )
  }

  setUserId(id) {
    // Create AND and OR bitmasks and update metadata field
    this.applyMask(
      parseInt(`11111111100000000000000111111111`, 2),
      parseInt(`000000000${createIntegerMask(id, 14)}000000000`, 2)
    );
  }

  setGroupBits(read, write, execute) {
    // Get generated AND and OR flags
    var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
    // Create AND and OR bitmasks and update metadata field
    this.applyMask(
      parseInt(`11111111111111111111111${andBits}111111`, 2),
      parseInt(`00000000000000000000000${orBits}000000`, 2)
    );
  }

  setUserBits(read, write, execute) {
    // Get generated AND and OR flags
    var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
    // Create AND and OR bitmasks and update metadata field
    this.applyMask(
      parseInt(`11111111111111111111111111${andBits}111`, 2),
      parseInt(`00000000000000000000000000${orBits}000`, 2)
    );
  }

  setAllBits(read, write, execute) {
    // Get generated AND and OR flags
    var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
    // Create AND and OR bitmasks and update metadata field
    this.applyMask(
      parseInt(`11111111111111111111111111111${andBits}`, 2),
      parseInt(`00000000000000000000000000000${orBits}`, 2)
    );
  }

  updateMetadata() {
    // If no mask applied do nothing
    if (!this.updateMasks.and || !this.updateMasks.or) {
      return;
    }

    // Update the fields
    this.collection.updateOne({
      _id: this._id
    }, {
      $bit: {
        metadata: {
          and: NumberInt(this.updateMasks.and), 
          or: NumberInt(this.updateMasks.or),
        }
      }
    });
  }

  save() {
    this.collection.updateOne({
      _id: this._id
    }, {
      $set: {
        name: this.name, 
        path: this.path, 
        parentPath: this.parentPath, 
        metadata: this.metadata  
      }
    }, { upsert: true });
  }
}

Lets look over the main changes here. First we keep the current merged and and or masks in the instance variable updateMasks.

// Contains the masks we are going to apply
this.updateMasks = {
  and: null,
  or: null
}

The updateMasks variable contains an and and or entry field that are initially set to null. Lets look at the setAllBits method of the new Entry class.

setAllBits(read, write, execute) {
  // Get generated AND and OR flags
  var [andBits, orBits] = generateFlagBitMaps(read, write, execute);
  // Create AND and OR bitmasks and update metadata field
  this.applyMask(
    parseInt(`11111111111111111111111111111${andBits}`, 2),
    parseInt(`00000000000000000000000000000${orBits}`, 2)
  );
}

We can see that it looks mostly the same. The only significant change is that we are calling the applyMask method passing in the bit masks. Lets take a look at the applyMask method.

applyMask(andMask, orMask) {
  if (!this.updateMasks.and) {
    this.updateMasks.and = andMask;
  } else {
    this.updateMasks.and = this.updateMasks.and & andMask;
  }

  if (!this.updateMasks.or) {
    this.updateMasks.or = orMask;
  } else {
    this.updateMasks.or = this.updateMasks.or | orMask;
  }
}

As we can see the applyMask method take an andMask and an orMask. If there is no existing masks in the updateMasks instance variable it sets the current passed in mask as the current masks.

If we already have masks in the updateMasks instance variable we AND the and mask in updateMasks with the passed in andMask parameter to merge them. We then OR the or mask in the updateMasks variable with the passed in orMask parameter. These two operation merge the two masks together making sure we only affect the bits of the metadata field we are planning to modify.

Finally to update the metadata field we create a new method called updateMetadata.

updateMetadata() {
  // If no mask applied do nothing
  if (!this.updateMasks.and || !this.updateMasks.or) {
    return;
  }

  // Update the fields
  this.collection.updateOne({
    _id: this._id
  }, {
    $bit: {
      metadata: {
        and: NumberInt(this.updateMasks.and), 
        or: NumberInt(this.updateMasks.or),
      }
    }
  });
}

If there are masks present in the updateMasks instance variable we apply them to the metadata field using the $bit operator as before.

Querying on the metadata

Now that we know how to modify the fields packed in the metadata field lets look at how we can query for the different fields. We are going to create a simple class (just focusing on the query aspects) to allow us to query for the fields packed in metadata. Before we do, we need to understand what the limitations are when querying on the metadata field.

  1. We can only perform equality operations. That is to say we can match on the Group Id being equal to 5 but not on Group Id larger than 5.
  2. Queries on the metadata field cannot leverage any indexes so we should ensure the metadata field is the last part of a query using other criteria to narrow the number of metadata fields that must be inspected.

So lets see what operations are available to us to perform bitwise queries.

Operator Description
$bitsAllClear Matches numeric or binary values in which a set of bit positions all have a value of 0.
$bitsAllSet Matches numeric or binary values in which a set of bit positions all have a value of 1.
$bitsAnyClear Matches numeric or binary values in which any bit from a set of bit positions has a value of 0.
$bitsAllClear Matches numeric or binary values in which any bit from a set of bit positions has a value of 1.

What does that mean in practice? Lets look at each of the operators in turn using examples. First we need a sample document with a binary value.

db.bittest.insertOne({
  metadata: 0b011000011110110
});

$bitsAllClear

Lets perform a simple query operation where we are going to match on the second group of 4 bits.

db.bittest.findOne({
  metadata: {
    $bitsAllClear: NumberInt(0b0000111100000000)
  }
});

This will match the inserted document above as the second group of bits in the metadata field are all set to 0.

$bitsAllSet

Lets perform a simple query operation where we are going to match on the third group of 4 bits.

db.bittest.findOne({
  metadata: {
    $bitsAllSet: NumberInt(0b0000000011110000)
  }
});

This will match the inserted document above as the third group of bits in the metadata field are all set to 0.

$bitsAnyClear

Lets perform a simple query operation where we are going to match on the first group of 4 bits.

db.bittest.findOne({
  metadata: {
    $bitsAnyClear: NumberInt(0b1111000000000000)
  }
});

This will match the inserted document above as the third group of bits in the metadata field contains two bits set to 0.

$bitsAnySet

Lets perform a simple query operation where we are going to match on the third group of 4 bits.

db.bittest.findOne({
  metadata: {
    $bitsAnySet: NumberInt(0b0000000000001111)
  }
});

This will match the inserted document above as the third group of bits in the metadata field contains two bits set to 1.

Locate entries with Group Id = 5 with all permission read = true

In this query we are aiming to retrieve all the entries where the metadata field contains the Group Id equal to 5 and the all read permissions is read.

db.entries.find({
  metadata: {
    $bitsAllSet:   NumberInt(0b00000010100000000000000000000100),
    $bitsAllClear: NumberInt(0b00000001000000000000000000000000)
  }
}).toArray();

The value 5 is expressed as the binary value 0b101. To correctly match we need to combine the $bitsAllSet and $bitsAllClear operators. The $bitsAllSet mask contains the Group Id value of 5 expressed as 0b101. The $bitsAllClear contains the reverse of the bit pattern 0b101 which is 0b010. This way we ensure a perfect match on the bit pattern. Only using $bitsAllSet would not check if the middle bit was set to 1 or 0. That is the reason we have to test both for the bits set to 1 as well as the bits set to 0. In the same way we set the all permissions read flag to 1, while the write and execute flags are set to 0. This means we will match on any document where the all permissions read flag is set, and we don't care about the value of the write and execute flags.

Performing a query

Lets create a simple class that will simplify the querying of the data from the metadata field.

class QueryClass {
  constructor(collection) {
    this.collection = collection;
    this.allSetBits = 0b0000000000000000;
    this.allClearBits = 0b0000000000000000;  
  }

  groupId(id) {
    this.allSetBits = this.allSetBits | parseInt(`0${createIntegerQueryMask(id, 8)}00000000000000000000000`, 2);
    this.allClearBits = this.allClearBits | parseInt(`0${createIntegerQueryMask(id, 8, true)}00000000000000000000000`, 2);
    return this;
  }

  userId(id) {
    this.allSetBits = this.allSetBits | parseInt(`000000000${createIntegerQueryMask(id, 14)}000000000`, 2);
    this.allClearBits = this.allClearBits | parseInt(`000000000${createIntegerQueryMask(id, 14, true)}000000000`, 2);
    return this;
  }

  groupPermissions(read, write, execute) {
    let [mask, reverseMask] = createPermissionsQueryMasks(read, write, execute);
    this.allSetBits = this.allSetBits | parseInt(`00000000000000000000000${mask}000000`, 2);
    this.allClearBits = this.allClearBits | parseInt(`00000000000000000000000${reverseMask}000000`, 2);
    return this;
  }

  userPermissions(read, write, execute) {
    let [mask, reverseMask] = createPermissionsQueryMasks(read, write, execute);
    this.allSetBits = this.allSetBits | parseInt(`00000000000000000000000000${mask}000`, 2);
    this.allClearBits = this.allClearBits | parseInt(`00000000000000000000000000${reverseMask}000`, 2);
    return this;
  }

  allPermissions(read, write, execute) {
    let [mask, reverseMask] = createPermissionsQueryMasks(read, write, execute);
    this.allSetBits = this.allSetBits | parseInt(`00000000000000000000000000000${mask}`, 2);
    this.allClearBits = this.allClearBits | parseInt(`00000000000000000000000000000${reverseMask}`, 2);
    return this;
  }

  find() {
    return this.collection.find({
      metadata: {
        $bitsAllSet: NumberInt(this.allSetBits),
        $bitsAllClear: NumberInt(this.allClearBits)
      }
    }).toArray();
  }
}

The QueryClass allows us to build a query that perform equality matches on the binary packed fields in the metadata field. We can chain the query like this.

new QueryClass(db.entries)
  .groupId(5)
  .allPermissions(true)
  .find();

This will match any document where the metadata packed field groupId is equal to 5 and the all permission read = true (ignoring the values of the write and execute flags).

Lets have a look at what the groupId(id) and allPermissions(read, write, execute) methods do, starting with the groupId(id) one.

groupId(id) {
  this.allSetBits = this.allSetBits | parseInt(`0${createIntegerMask(id, 8)}00000000000000000000000`, 2);
  this.allClearBits = this.allClearBits | parseInt(`0${createIntegerMask(id, 8, true)}00000000000000000000000`, 2);
  return this;
}

We know the groupId is encoded in the [1:9] bits. We need to create two masks, the $bitsAllSet mask is created calling the createIntegerQueryMask method passing in the id and the length of the groupId field which is 8 bits. The $bitAllClear mask is created calling the createIntegerQueryMask method but passing true after the id and bit length, telling the method to reverse the mask bits before returning them.

createIntegerQueryMask

Lets look at what the createIntegerQueryMask does.

function createIntegerQueryMask(id, bitResolution, reverse) {
  let bitsString = id.toString(2);
  if (bitsString.length > bitResolution) throw Error(`value id must be between 0 and ${Math.pow(2, bitResolution) - 1}`);
  
  // Reverse the bit String
  if (reverse) {
    let bitStringArray = [];
    for (let i = 0; i < bitsString.length; i++) {
      bitStringArray[i] = bitsString[i] == "0" ? "1" : "0";
    }  

    // Save the reverse string
    bitsString = bitStringArray.join('');
  }

  // Ensure `bitResolution` bit string
  let missingValues = "";
  for(let i = 0; i < (bitResolution - bitsString.length); i++) {
    missingValues = missingValues + (reverse ? "1" : "0");
  }

  // Pad the bitString with 0s
  return missingValues + bitsString;
}

The createIntegerQueryMask method will generate a String of length bitResolution that is the binary representation of the value id. To ensure the returned string is exactly bitResolution long this method pads the front of the bit string with 0s (or 1s if we have set the reverse parameter to true) until the end length is bitResolution long.

In other words say that we pass in the id value of 1 with bitResolution set to 4. This will be turned into the following String representation 0001.

If we set the parameter reverse to true the returned bit mask for the id value will be reversed (this is used to ensure the $allClearBits matches on the bits in id that are null). This means that if id is set to 5, with a bitResolution of 4 the returned value for id in the bit mask will be 1010 as the 3 last bits are reversed and we pad a 1 to the front to ensure we match only on values where the bits before the id value are set to 0 in the metadata field.

groupId(id) {
  this.allSetBits = this.allSetBits | parseInt(`0${createIntegerMask(id, 8)}00000000000000000000000`, 2);
  this.allClearBits = this.allClearBits | parseInt(`0${createIntegerMask(id, 8, true)}00000000000000000000000`, 2);
  return this;
}

After calling the createIntegerMask(id, 8) we insert the corresponding bit mask in full 32 bits bit mask at the positions [1:9]. We then parse the bit mask to an integer and OR with the this.allSetBits. As we remember OR will set any bits to 1 in the result when the bit mask on the right contains a 1 in a given bit position. This will merge the two bit masks allowing us to match on multiple fields.

We then do the same for the this.allClearBits using the reverse id bit mask.

createPermissionsQueryMasks

Now lets look at how we query for the individual flags. Lets look at the allPermissions(read, write, execute) method.

allPermissions(read, write, execute) {
  let [mask, reverseMask] = createPermissionsQueryMasks(read, write, execute);
  this.allSetBits = this.allSetBits | parseInt(`00000000000000000000000000000${mask}`, 2);
  this.allClearBits = this.allClearBits | parseInt(`00000000000000000000000000000${reverseMask}`, 2);
  return this;
}

Just as for the createIntegerQueryMask we need to create a $bitsAllSet and a $bitsAllClear mask for our query, and then OR each of the masks with the instance values this.allSetBits and this.allClearBits. Lets look at what the createPermissionsQueryMasks(read, write, execute) does.

function createPermissionsQueryMasks(read, write, execute) {
  // Create $bitsAllSet mask
  let mask = [
    read == true ? "1" : "0",
    write == true ? "1" : "0",
    execute == true ? "1" : "0"
  ];

  // Create $bitsAllClear mask
  let reverseMask = [
    read == true || read == null ? "0" : "1",
    write == true || write == null ? "0" : "1",
    execute == true || execute == null ? "0" : "1"
  ]

  return [mask.join(''), reverseMask.join('')];
}

For each of the passed in flags read, write or execute we check if the value is equal to true and add it's value to the array of bits representing the flags.

If we call the method with true, false, true we get the mask of ["1", "0", "1"].

Next we create the reverse of the mask with one small exception. If either of the read, write or execute parameters are set to null we set the corresponding mask entry to 0 as we do not want to match on that flag value. Finally we return an array of the mask and reverseMask.

The findOne method

Lets look at the findOne method real quick.

find() {
  return this.collection.find({
    metadata: {
      $bitsAllSet: NumberInt(this.allSetBits),
      $bitsAllClear: NumberInt(this.allClearBits)
    }
  }).toArray();
}

As we can see all it does it query using the collection.findOne method against the metadata field using the $bitsAllSet operator set to the merged bit mask this.allSetBits as well as the operator $bitAllClear operator set to the merged bit mask this.allClearBits.

Setting up some test documents

Lets insert some test documents we can query against.

db.entries.insertMany([{
    _id: 1,
    name: "peter",
    path: "/home/user/rowan",
    parentPath: "/home/rowan",
    metadata: NumberInt(0b00000010100000000000001111111100)
}, {
    _id: 2,
    name: "peter",
    path: "/home/user/paul",
    parentPath: "/home/paul",
    metadata: NumberInt(0b00111010100000000000001111111100)
}])

Performing the Query

Finally run the query shown before, where we match against all documents where Group Id = 5 and the the all read permission is true.

new QueryClass(db.entries)
  .groupId(5)
  .allPermissions(true)
  .find();

As expected we retrieve the document with _id = 1. The document contains the following metadata field values.

0                                     // [0:0]    isDirectory   = false
 00000101                             // [1:9]    Group id      = 5
         00000000000001               // [10:23]  User id       = 1
                       111            // [24:26]  Group r/w/x   = true/true/true
                          111         // [27:29]  User r/w/x    = true/true/true
                             100      // [29:31]  All r/w/x     = true/false/false

We can see that the Group Id = 5 and that the read flag in the all permissions is set to true as well.

Conclusion

We can see that managing the bit packing is quite complex compared to normal database values but that we can achieve considerable space saving benefits when needed. That said you should think twice about using it if space concerns are not primary issues for you.

Discover and read more posts from Christian Amor Kvalheim
get started