|
Hamming weight - Fuhz Articles Hamming weight article, this Fuhz page will hopefully provide the answers to the who, what where and why on the Hamming weight topic. At the bottom of the page we often provide links to external documents relating to Hamming weight which may also help your research. Every effort is made to ensure the content on this page is as accurate and error free as possible, however whenever researching information that requires the utmost accuracy such as a term paper it is always best to cross reference facts with numerous sources. |
| This article needs additional citations for verification. Please help improve this article by adding reliable references. Unsourced material may be challenged and removed. (January 2009) |
The Hamming weight of a string is the number of symbols that are different from the zero-symbol of the alphabet used.
It is thus equivalent to the Hamming distance from the all-zero string of the same length. For the most typical case, a string of bits, this is the number of 1's in the string. In this binary case, it is also called the population count, or popcount; it is the digit sum of the binary representation of a given number.
Contents |
| alphabet | string | Hamming weight |
| 0,1 | 11101 | 4 |
| 0,1 | 11101000 | 4 |
| 0,1 | 00000000 | 0 |
| ' ',a-z | hello world | 10 |
The Hamming weight is named after Richard Hamming. It is used in several disciplines including information theory, coding theory, and cryptography.
Examples of applications of the Hamming weight include:
The population count of a bitstring is often needed in cryptography and other applications. The Hamming distance of two words A and B can be calculated as the Hamming weight of A xor B.
The problem of how to implement it efficiently has been widely studied. Some processors have a single command to calculate it (see below), and some have parallel operations on bit vectors. For processors lacking those features, the best solutions known are based on adding counts in a tree pattern. For example, to count the number of 1 bits in the 16-bit binary number A=0110110010111010, these operations can be done:
| Expression | Binary | Decimal | Comment |
| A | 01 10 11 00 10 11 10 10 | The original number | |
| B = A & 01 01 01 01 01 01 01 01 | 01 00 01 00 00 01 00 00 | 1,0,1,0,0,1,0,0 | every other bit from A |
| C = (A >> 1) & 01 01 01 01 01 01 01 01 | 00 01 01 00 01 01 01 01 | 0,1,1,0,1,1,1,1 | the remaining bits from A |
| D = B + C | 01 01 10 00 01 10 01 01 | 1,1,2,0,1,2,1,1 | list giving # of 1s in each 2-bit piece of A |
| E = D & 0011 0011 0011 0011 | 0001 0000 0010 0001 | 1,0,2,1 | every other count from D |
| F = (D >> 2) & 0011 0011 0011 0011 | 0001 0010 0001 0001 | 1,2,1,1 | the remaining counts from D |
| G = E + F | 0010 0010 0011 0010 | 2,2,3,2 | list giving # of 1s in each 4-bit piece of A |
| H = G & 00001111 00001111 | 00000010 00000010 | 2,2 | every other count from G |
| I = (G >> 4) & 00001111 00001111 | 00000010 00000011 | 2,3 | the remaining counts from G |
| J = H + I | 00000100 00000101 | 4,5 | list giving # of 1s in each 8-bit piece of A |
| K = J & 0000000011111111 | 0000000000000101 | 5 | every other count from J |
| L = (J >> 8) & 0000000011111111 | 0000000000000100 | 4 | the remaining counts from J |
| M = K + L | 0000000000001001 | 9 | the final answer |
Here, the operations are as in C, so X >> Y means to shift X right by Y bits, X & Y means the bitwise AND of X and Y, and + is ordinary addition. The best algorithms known for this problem are based on the concept illustrated above and are given here:
//types and constants used in the functions below
typedef unsigned __int64 uint64; //assume this gives 64-bits
const uint64 m1 = 0x5555555555555555; //binary: 0101...
const uint64 m2 = 0x3333333333333333; //binary: 00110011..
const uint64 m4 = 0x0f0f0f0f0f0f0f0f; //binary: 4 zeros, 4 ones ...
const uint64 m8 = 0x00ff00ff00ff00ff; //binary: 8 zeros, 8 ones ...
const uint64 m16 = 0x0000ffff0000ffff; //binary: 16 zeros, 16 ones ...
const uint64 m32 = 0x00000000ffffffff; //binary: 32 zeros, 32 ones
const uint64 hff = 0xffffffffffffffff; //binary: all ones
const uint64 h01 = 0x0101010101010101; //the sum of 256 to the power of 0,1,2,3...
//This is a naive implementation, shown for comparison,
//and to help in understanding the better functions.
//It uses 24 arithmetic operations (shift, add, and).
int popcount_1(uint64 x) {
x = (x & m1 ) + ((x >> 1) & m1 ); //put count of each 2 bits into those 2 bits
x = (x & m2 ) + ((x >> 2) & m2 ); //put count of each 4 bits into those 4 bits
x = (x & m4 ) + ((x >> 4) & m4 ); //put count of each 8 bits into those 8 bits
x = (x & m8 ) + ((x >> 8) & m8 ); //put count of each 16 bits into those 16 bits
x = (x & m16) + ((x >> 16) & m16); //put count of each 32 bits into those 32 bits
x = (x & m32) + ((x >> 32) & m32); //put count of each 64 bits into those 64 bits
return x;
}
//This uses fewer arithmetic operations than any other known
//implementation on machines with slow multiplication.
//It uses 17 arithmetic operations.
int popcount_2(uint64 x) {
x -= (x >> 1) & m1; //put count of each 2 bits into those 2 bits
x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
x = (x + (x >> 4)) & m4; //put count of each 8 bits into those 8 bits
x += x >> 8; //put count of each 16 bits into their lowest 8 bits
x += x >> 16; //put count of each 32 bits into their lowest 8 bits
x += x >> 32; //put count of each 64 bits into their lowest 8 bits
return x & 0x7f;
}
//This uses fewer arithmetic operations than any other known
//implementation on machines with fast multiplication.
//It uses 12 arithmetic operations, one of which is a multiply.
int popcount_3(uint64 x) {
x -= (x >> 1) & m1; //put count of each 2 bits into those 2 bits
x = (x & m2) + ((x >> 2) & m2); //put count of each 4 bits into those 4 bits
x = (x + (x >> 4)) & m4; //put count of each 8 bits into those 8 bits
return (x * h01)>>56; //returns left 8 bits of x + (x<<8) + (x<<16) + (x<<24) + ...
}
The above implementations have the best worst-case behavior of any known algorithm. However, when a value is expected to have few nonzero bits, it may instead be more efficient to use algorithms that count these bits one at a time. As Wegner (1960) described,2 the lowest-order nonzero bit of a word x may be found as the exclusive or of x with x − 1: subtracting 1 changes the rightmost string of 0s to 1s, and changes the rightmost 1 to a 0. If, instead of an exclusive or, one uses an and operation, the result is to remove the lowest-order nonzero bit. If x originally had n bits that were 1, then after only n iterations of this operation, x will be reduced to zero. The following are based on this principle.
//This is better when most bits in x are 0
//It uses 3 arithmetic operations and one comparison/branch per "1" bit in x.
int popcount_4(uint64 x) {
uint64 count;
for (count=0; x; count++)
x &= x-1;
return count;
}
//This is better if most bits in x are 0.
//It uses 2 arithmetic operations and one comparison/branch per "1" bit in x.
//It is the same as the previous function, but with the loop unrolled.
#define f(y) if ((x &= x-1) == 0) return y;
int popcount_5(uint64 x) {
if (x == 0) return 0;
f( 1) f( 2) f( 3) f( 4) f( 5) f( 6) f( 7) f( 8)
f( 9) f(10) f(11) f(12) f(13) f(14) f(15) f(16)
f(17) f(18) f(19) f(20) f(21) f(22) f(23) f(24)
f(25) f(26) f(27) f(28) f(29) f(30) f(31) f(32)
f(33) f(34) f(35) f(36) f(37) f(38) f(39) f(40)
f(41) f(42) f(43) f(44) f(45) f(46) f(47) f(48)
f(49) f(50) f(51) f(52) f(53) f(54) f(55) f(56)
f(57) f(58) f(59) f(60) f(61) f(62) f(63)
return 64;
}
//Use this instead if most bits in x are 1 instead of 0
#define f(y) if ((x |= x+1) == hff) return 64-y;
If we are allowed greater memory usage, we can calculate the Hamming weight faster than the above methods. With unlimited memory, we could simply create a large lookup table of the Hamming weight of every 64 bit integer. If we can store a lookup table of the hamming function of every 16 bit integer, we can do the following to compute the Hamming weight of every 32 bit integer.
static unsigned char wordbits[65536] = { bitcounts of ints between 0 and 65535 };
static int popcount( unsigned int i )
{
return( wordbits[i&0xFFFF] + wordbits[i>>16] );
}
Some C compilers provide intrinsics that provide bit counting facilities. For example, GCC (since version 3.4 in April 2004) includes a builtin function __builtin_popcount that will use a processor instruction if available or an efficient library implementation otherwise.3. LLVM-GCC has included this function since version 1.5 in June, 2005.4
In C++ STL, the bit-array data structure bitset has a count() method that counts the number of bits that are set.
In Java, the growable bit-array data structure BitSet has a BitSet.cardinality() method that counts the number of bits that are set. In addition, there are Integer.bitCount(int) and Long.bitCount(long) functions to count bits in primitive 32-bit and 64-bit integers, respectively. Also, the BigInteger arbitrary-precision integer class also has a BigInteger.bitCount() method that counts bits.
In Common Lisp, the function logcount, given a non-negative integer, returns the number of 1 bits. (For negative integers it returns the number of 0 bits in 2's complement notation.) In either case the integer can be a BIGNUM.
Article from Wikipedia. All text is available under the terms of the GNU Free Documentation License - Our Privacy Policy - Thanks Auto Blog Commenter