Thursday, May 31, 2012

Counting the number of non-zero bits in a a 32bit unsigned integer

from http://stackoverflow.com/questions/9244153/what-is-the-most-efficient-way-of-counting-the-number-of-1s-in-an-integer: Given a 32bit unsigned integer, we want to count the number of non-zero bits in its binary representation. What is the fastest way to do that ? We want to do this N~10^10 times. note: using a large look up table is usually not a good idea because of the architecture of current cpu's . it is much faster to calculate it locally than to use a huge array that needs looking at the external memory --------------------------------------------------------------------------------------------------------------------------------- ZarakiKenpachi: There are actually several options, I presume the native way is way too slow for this. You can go with lookup table for 8-bit value and do it in parallel for all four bytes from unsigned int value, then sum the result. This one could be also quite well-paralelizable (be it multi-core, or maybe even some SSE3/4 could help). You can also go with Brian Kernighan's solution: unsigned int v; // count the number of bits set in v unsigned int c; // c accumulates the total bits set in v for (c = 0; v; c++) { v &= v - 1; // clear the least significant bit set } And the last possible way I found somewhere some time ago is (on 64-bit machines, as the modulo operation would be really fast there): unsigned int v; // count the number of bits set in v unsigned int c; // c accumulates the total bits set in v c = ((v & 0xfff) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f; c += (((v & 0xfff000) >> 12) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f; c += ((v >> 24) * 0x1001001001001ULL & 0x84210842108421ULL) % 0x1f;