Bit Twiddling: Understanding Bit Operations
This post explains bit operations and terms: And, Or, Xor, Complement, Shifting & Masking, Unsigned & Signed data types, sign extension, and more!
Review of using Binary for Numeric Representation
Binary is a way of repreesenting numbers using strings of binary digits. This is how the computer interally stores things. When we look at a string of bits in binary, we can use them to represent signed or unsigned numbers.
Powers of 2
Binary is based on base 2 rather that our more familiar base 10. Just is in base 10 each digit put on the left of a number represent another position of the power of the base. So, in base 10, 567 means 5x10^2 + 6x10^1 + 7x10^0. In binary 101 means 1x2^2 + 0x2^1 + 1x2^0 or in base 10: 5.
There is a relationship about numbers in binary, bit positions, and powers of 2.
Power | Raised Value(10) | Raised value(2) | Log | # Values |
---|---|---|---|---|
0 | 2^0 = 1 | 000001 | log2(1) = 0 | 2 |
1 | 2^1 = 2 | 000010 | log2(2) = 1 | 4 |
2 | 2^2 = 4 | 000100 | log2(4) = 2 | 8 |
3 | 2^3 = 8 | 001000 | log2(8) = 3 | 16 |
4 | 2^4 = 16 | 010000 | log2(16) = 4 | 32 |
5 | 2^5 = 32 | 100000 | log2(32) = 5 | 64 |
"# Values" means the number of different values that can be represented in "Raised Value" number of bits. If we have one bit, we can store 2 different values; if we have two bits we can store 4 different values.
Signed & Unsigned
Variables hold values. We associate our variables with data types.
The data type tells us the range of possible values, and, also gives the interpretation for all the bit patterns.
The terms signed and unsigned are properties of data types. For example, a variable of type unsigned byte can hold 8 bits of data, ranging from 0 to 255 in value. A variable of type signed byte can hold 8 bits of data, ranging from -128 to 127 in value.
Positive and negative (and zero) are (mutually exclusive) properties of the values of signed data types. Unsigned data types will not admit negative numbers, so they can only be zero or positive.
Inappropriate treatment of variable & storage is a common problem in programming, especially in assembly language. In assembly language, the programmer determines how to interpret data at each and every instruction that accesses data. The hardware generally doesn't remember what data type was last used in what memory location or CPU register.
Accidentally treating some data, for example, as signed when it isn't or vice versa is a very real problem. High-level languages seek to mitigate these problems by using type systems, which detect problems of mismatch either at compile time or at runtime but before the program does something really bad.
In assembly language, however, we don't have these type systems, and sometimes these problems go undetected until values toward the edges of the range of representation are used.
Unsigned Representation
Unsigned representation allows all bits of a number, register, or field to be used for magnitude, and hence, negative numbers cannot be represented. Let's take a look at all possible 3-bit numbers, for example:
3-bit value in decimal as unsigned
000 0
001 1
010 2
011 3
100 4
101 5
110 6
111 7
Signed Representation
All modern computers use what we call "2's complement" for storing signed integers. 2's complement representation uses numbers with the high bit set (or clear) to indicate negative (non-negative) numbers. Here's the same chart for a signed 3-bit value:
3-bit value in decimal as signed (two's complement)
100 -4
101 -3
110 -2
111 -1
000 0
001 1
010 2
011 3
Note the high bit is sometimes called the most significant bit or MSB.
2's complement has the following qualities:
- Negation is performed in 2's complement by inverting all the bits, and then adding one.
- The range allows for negative numbers that are 1 past the maximum positive, so above: -4 to 3, or for 8-bit bytes: -128 to 127, for 16-bit words: -32768 to 32767.
- That means that negation has a non-zero chance of overflow (and if it happens the value stays the same!).
- The same hardware circuitry can be used for addition, subtraction, multiplication, and division regardless of whether the operands are signed or unsigned, which made it heavily favored when circuit capacity was limited (as with initial microprocessors).
- The advent of the computer chip (integrated circuit), while a huge step forward in speed and other aspects actually placed a new (but constantly growing) hard limit on the amount of logic that could be provided on chip, whereas before cost was the main limiting factor in the amount of logic provided as individual transistors & diodes. Reusing circuitry was one way to fit within limits of new hardware technology.
- Detection of overflow for these arithmetic operations is different for signed vs. unsigned. For example,
- For unsigned addition, a carry out of the most significant bit indicates overflow.
- For signed addition, if both inputs have the same sign, and the result doesn't, it is overflow. All other conditions either cannot overflow or indicate no overflow.
We can compare with 1's complement, which was used in the 50's and 60's before computer makers settled on 2's complement. 1's complement had the following qualities:
- The same range of negative numbers as positive numbers.
- Both a positive zero, and negative zero.
- Some computers automatically converted negative 0 to positive 0 after each arithmetic operation
- Negation is just the complement (no subsequent increment required).
- Different hardware is required for signed vs. unsigned for all arithmetic operations
Introduction
Bit manipulation operations operate on strings of binary bits, each bit of which is either 0's or 1's.
However, many bit operations, in particular the ones I'm showing in this article, involve 2 inputs, namely one that is a variable and another that is a constant. Of the two operands, the constant is referred to as a mask, and, performing bit manipulation operations using such a constant mask is referred to as masking. The value of the mask has to do with both the operation we are using and the desired result we want. In the following diagrams, the variable will be represented as a string of letters, one for each bit. This will help you see where the variable's bits wind up after the operation.
For example, with a variable of 3-bits whose value is "abc" — when shifted left by 1 bit — results in (here as a 3-bit result) "bc0": a zero bit is shifted into the low bit position, while the previous bits move left by one position, and the most significant bit is lost.
We'll use an 8-bit byte as an input to show how the various bit operations produce their outputs. Using this notation we will be able to see what is happening to the various bits.
A number of the bit manipulation operators are binary operators, meaning they consume two sources, and produce one result. These operators are: And, Or, Xor, along with a Left-shift, and two Right-shifts. (Some instruction sets and programming languages also include rotate instructions in addition to the shifts.)
Field Extraction
Lets turn to an example. Let's say that we have a sensor that delivers some bit values as readings, and, that these bit values are packed into a single byte memory location. For illustrative purposes, let's say there is a one 1-bit value, one 4-bit value, and one 3-bit value. So, our memory location has a byte that is packed as follows:
76543210 bit position (in decimal)
abcdefgh bit variables for illustration
where "a" represents the 1-bit field, "bcde" represents the 4-bit field, and "fgh" represents the 3-bit field. For the following examples, we'll concern ourselves with the 4-bit field: "bcde". In order to treat this 4-bit field like any other number, we need to isolate the wanted bits of the byte, and then, right justify it. Our intent is to go from "abcdefgh", a packed byte, to "0000bcde": this field, now unpacked from the other bits, and right justified in the byte. Typically, we'll first read the packed byte into a CPU register and then unpack it there. We are presuming an 8-bit wide CPU register for simplicity; however, other CPU register sizes can be extrapolated from this.
Note that for now, we're treating the 4-bit field as an unsigned field. (In a subsequent section will treat it as a signed field, which means that we're interested in an answer of "bbbbbcde", instead of "0000bcde". A signed field means that the most significant bit represents the sign rather than magnitude as the other bits do.)
Unsigned Field Extraction
There are at least 3 techniques to do this unpacking, so we'll describe those. The first uses a combination of shifts: (1) a Left-shift followed by a Right-shift. The other two use a combination of (2) Right-Shift followed by a masking And operation, or (3) vice versa.
Approach (1u) — Left-shift, then Right-shift
First, we left-justify the field in our CPU register. In our case, this means shifting it left by 1 bit, or register high bit number - field start position = 7 - 6 = 1.
76543210 bit position (in decimal)
abcdefgh our input byte, shown in binary as variables
1 shift amount in decimal
-------- << logical left shift operation
bcdefgh0 result
The field we are interested in is now left-justified in our 8-bit CPU register. This left justification has effectively removed any bits in higher positions than we are interested in. Next, we shift right to right-justify the field, which requires a shift right of 4 bits, which comes from the field's length: register size - field length = 8 - 4 = 4.
bcdefgh0 intermediate output/input
4 shift amount in decimal
-------- >> logical right shift operation
0000bcde result
Because we want the item right justified when we're done, we do the right shifting last.
Approach (2u) — Right-shift, then masking And
First, the right shift:
abcdefgh our input byte, shown in binary as variables
3 shift amount in decimal
-------- >> logical right shift operation
000abcde result
Next, the masking And. The constant mask uses 1's where we want to keep bits from other input and 0's where we want to ignore bits from the other input. In this case our mask value is 15 in decimal, but the binary rendition is more illuminating.
000abcde intermediate output/input
00001111 constant mask for And operation
-------- & And operation
0000bcde result
Approach (3u) — Masking And, then Right-shift
First, the masking And. The constant mask uses 1's where we want to keep bits from other input and 0's where we want to ignore bits from the other input. In this case our mask value is 120 in decimal, but the binary rendition is more illuminating.
abcdefgh our input byte, shown in binary as variables
01111000 constant mask for And operation
-------- & And operation
0bcde000 result
Next, the right shift:
0bcde000 intermediate output/input
3 shift amount in decimal
-------- >> logical right shift operation
0000bcde result
To compare these sequences we have to understand a bit about the processor we're using; however all of these sequences are two instructions on most processors, so roughly comparable.
Some (older) processors shift more slowly the larger the shift count, which would mean that shift-heavy solutions would be slower; however, that isn't usually the case these days. Certain sequences use shorter immediate (constant) values than others, and this can result in shorter instructions, which is a space and sometimes speed advantage. For example, Approach (3) uses a constant of 120, for the mask, whereas Approach (4) uses a constant of 15 for the mask.
Signed Field Extraction
For signed field extraction, we need to replicate the field's sign bit across the rest of the CPU register. This means that where we had 0's in the unsigned field extraction, "0000bcde", we need the sign bit replicated instead: "bbbbbcde". In some sense this places the field's sign bit into the CPU register's sign bit, though for correctness, this bit is not just placed there but replicated to all the rest of the high bits in the register.
The approach here is that we will first left-justify the value in the CPU register using a Left-shift, and then, right justify the value as in (1u) — but this time using an Arithmetic Right-shift, which keeps the sign bit as is, instead of shifting zero bits into the leftmost bit (as with Logical Right-shift). We need to position the field's sign bit in the registers sign bit, using the left shift, in order to productively use the Arithmetic Right-shift operation for sign extension. Essentially, we left justify the field in the CPU register, then right justify it using arithmetic shifting.
Approach (1s) — Left-shift, then Right-shift
First, we left-justify the field in our CPU register. In our case, this means shifting it left by 1 bit, or register high bit number - field start position = 7 - 6 = 1.
76543210 bit position (in decimal)
abcdefgh our input byte, shown in binary as variables
1 shift amount in decimal
-------- << logical left shift operation
bcdefgh0 result
The field we are interested in is now left-justified in our 8-bit CPU register. Next, we shift right to right-justify the field, which requires a shift right of 4 bits, which comes from the field's length: register size - field length = 8 - 4 = 4.
bcdefgh0 intermediate output/input
4 shift amount in decimal
-------- >> arithmetic right shift operation
bbbbbcde result
Masking Operations
Next we'll cover each bit manipulation operation. Note that we are using constant masks, and that the mask chosen is arbitrary. Masks can have consecutive 1's (as in 00111100), or mixed 1's and 0's: as in 01011010.
And Operation
The And operation using a constant mask: the result passes through the variable input to the result where the constant mask has a 1, and clears the result where the mask has a 0.
abcdefgh our input byte, shown in binary as variables
01011010 constant mask
-------- & And operation
0b0de0g0 result
This operation is useful for extracting information of interest, while ignoring information that is not of interest.
Or Operation
The Or operation using a constant mask: the result passes through the variable input to the result where the constant mask has a 0, and sets the result where the mask has a 1.
abcdefgh our input byte, shown in binary as variables
01011010 constant mask
-------- | Or operation
a1c11f1h result
This operation is useful for setting (to 1 or true) some bits while leaving other, neighboring bits unaffected.
Xor Operation
The Xor operation using a constant mask: the result passes through the variable input to the result where the constant mask has a 0, and complements the result where the mask has a 1. Here we're using the notation that A is the complement of a. that is, A = ~a, B = ~b, and so forth.
abcdefgh our input byte, shown in binary as variables
01011010 constant mask
-------- ^ Xor operation
aBcDEfGh result
This operation is useful for toggling some bits while leaving other, neighboring bits unaffected.
Complement Operation
The Complement operation reverses every bit. It is like using Xor with a constant mask of all 1's
abcdefgh our input byte, shown in binary as variables
-------- ~ Complement operation
ABCDEFGH result
This operation is useful for inverting a mask or other value, and also used in negation.
Negation Operation
Most processors have an operation to perform negation on signed numbers, however, the negation construct for 2'c complement representation of signed numbers, is to apply the complement and then add one (increment).
Shifting Operations
In these examples, we shift by only one bit. Though modern hardware can shift multiple bit positions in a single cycle, a shift result for a higher shift count can still be thought of as multiple consecutive 1-bit shifts.
Note that some hardware has anomalies when using a shift count that is equal to or larger than the CPU register size, or when using a variable shift amount that turns out to be zero. (I view these as unfortunate inconsistencies, shifting by zero, for example, should simply return the input value rather than something unexpected).
Left Shift
The Left-shift operation shifts the input by the shift count, and inserts zeros into vacated bits on the right.
abcdefgh our input byte, shown in binary as variables
1 shift count in decimal
-------- << Left shift operation
bcdefgh0 result
This shift operation is also the same as multiplication by a power of 2, where the power is the shift count. This is a more efficient way to multiple by such values, if you know in advance you're dealing with a power of 2.
Logical Right-shift
The logical Right-shift operation shifts the input by the shift count, and inserts zeros into vacated bits on the left.
abcdefgh our input byte, shown in binary as variables
1 shift count in decimal
-------- >> Logical Left shift operation
0abcdefg result
This also divides by a power of 2, where the power of 2 is the shift count.
Arithmetic Right-shift
The arithmetic Right-shift operation shifts the input by the shift count, and inserts copies of the sign bit into vacated bits on the left.
abcdefgh our input byte, shown in binary as variables
1 shift count in decimal
-------- >> Logical Left shift operation
aabcdefg result
Of particular note here is that the sign bit is kept and/or replicated depending on how we look at it. Larger shift counts will result in a string of aaa's starting with the high bit. This is the result we want in order to properly sign extend a smaller value to the size of a CPU register, for example.
This shift operation is similar to a divide by a power of 2 but has an issue with an input value of -1 as we would expect the result of -1/2 to be 0 when truncated to integer, but using arithmetic shifting it simply stays as -1 instead of going to 0.
Dynamic Mask Generation
1 << n
With 0 <= n <= register size (e.g. 32), this construct computes a value with a single bit set. Here's a table of 1<<n for n:
1<<n n 1<<n (decimal)
00000001 0 1
00000010 1 2
00000100 2 4
00001000 3 8
00010000 4 16
00100000 5 32
01000000 6 64
10000000 7 128
Notice that the 1's march across the word.
(1 << n) - 1
This construct generates a set of consecutive 1's.
(1<<n)-1 n (1<<n)-1 (decimal)
00000000 0 0
00000001 1 1
00000011 2 3
00000111 3 7
00001111 4 15
00011111 5 31
00111111 6 63
01111111 7 127
((1 << n) - 1) << m
Let's say that m is one. This generates a bit mask of consecutive ones at a position other than right justified.
((1<<n)-1)<<1 n
00000000 0
00000010 1
00000110 2
00001110 3
00011110 4
00111110 5
01111110 6
11111110 7
The complement of each of these can form a useful mask:
~ (1 << n)
~(1<<n) n
11111110 0
11111101 1
11111011 2
11110111 3
11101111 4
11011111 5
10111111 6
01111111 7
~((1 << n) - 1)
~((1<<n)-1) n
11111111 0
11111110 1
11111100 2
11111000 3
11110000 4
11100000 5
11000000 6
10000000 7
~(((1 << n) - 1) << m)
~(((1<<n)-1)<<1) n
11111111 0
11111101 1
11111001 2
11110001 3
11100001 4
11000001 5
10000001 6
00000001 7
Using dynamic mask notation for constant mask creation
In C and other languages, you can use the above dynamic mask generation with constant values for n. That can make it easy to generate masks of interest. For example, a 4-bit mask (e.g. for the low nibble) can be generated using the constant (1 << 4)-1. A mask for the upper nibble of a byte is (the former) << 4.
Other Operations
Some CPUs have direct signed and/or unsigned field extraction instructions, which can replace the above sequences.
Many CPUs also have sign extension and zero extension operations for 8-bit and 16-bit inputs to convert them to 32-bit values.
Older CPUs may have only 1-bit shift operations. In such cases, shift operations need to be used in sequence to obtain the desired shift. If a processor is absent a variable shift amount, a loop of 1-bit shifting can be used.
See Also https://graphics.stanford.edu/~seander/bithacks.html
This post first appear on Erik Eidt's "Discussions on Programming Models" at https://erikeidt.blogspot.com/2018/04/bit-twiddling-understanding-bit.html
Copyright (c) 2018, Erik L. Eidt