McEs, A Hacker Life: Puzzle of the Week: select count(*) on bit vectors

Monday, April 11, 2005

Puzzle of the Week: select count(*) on bit vectors

Recently I mentioned that counting the number of set (a.k.a. on, 1) bits in a bit vector is fast. Well, that's definitely not true. This puzzle is about that problem. A naive approach is this C function:


int
count_bits_naive (unsigned char *buf, int len)
{
 int n = 0;
 for (; len > 0; len--, buf++)
   for (; *buf; *buf >>= 1)
     n += *buf & 1;
 return n;
}

I want to see how much one can speed up this code. Valuable responses would include code, statistics, and theoretical analysis. Assume a 32- or 64-bit machine.

I've got a quite interesting idea about that myself, not sure how it does in reality. BTW, like the sample code above, assume for simplicity that you can trash the array in place.

Submit your solutions through the comment system or via email (preferably) before Sunday, April 17, 2005.

¶ 11:03 PM

Comments:

I would precompute the number of set bits for every 8-bit value and store it in a 256-byte array:

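/* a[i] holds the number of set bits in the byte value i */
char a[256] = {0};
for (int i = 1; i < 256; ++i)
  a[i] = a[i/2] + i%2;   /* bits of i = bits of i/2, plus the lowest bit */

/* one table lookup per byte of the buffer */
int n = 0;
for (; len > 0; --len)
  n += a[*buf++];
return n;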

It is also possible to precompute more values at the expense of higher memory usage.
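A minimal sketch of what "precompute more values" could look like, assuming a 16-bit table (64 KiB instead of 256 bytes). The names `bits16`, `init_bits16` and `count_bits_table16` are made up for illustration, and whether the larger table actually wins depends on cache behaviour, which has not been measured here.

```c
#include <stddef.h>
#include <stdint.h>

/* bits16[i] holds the number of set bits in the 16-bit value i (64 KiB). */
static unsigned char bits16[65536];

/* Fill the table with the same recurrence as the 8-bit version.
   Must be called once before count_bits_table16(). */
void init_bits16(void)
{
    bits16[0] = 0;
    for (uint32_t i = 1; i < 65536; i++)
        bits16[i] = (unsigned char)(bits16[i >> 1] + (i & 1));
}

/* Count set bits two bytes at a time; an odd trailing byte is looked up
   as a 16-bit value whose high byte is zero. */
size_t count_bits_table16(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    for (; len >= 2; len -= 2, buf += 2)
        n += bits16[(buf[0] << 8) | buf[1]];
    if (len)
        n += bits16[buf[0]];
    return n;
}
```

Unlike the puzzle's naive function, this version leaves the buffer untouched.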

# posted by

Anonymous : April 12, 2005 4:03 PM

What if we want to count the number of 0s? I am new to computer science and was wondering whether the same algorithm will work for 0s. (Please mention a book about counting various numbers in data structures, if possible.)

# posted by

Anonymous : April 27, 2005 6:20 AM

Have you seen this:
http://graphics.stanford.edu/~seander/bithacks.html

# posted by

Anonymous : May 17, 2005 11:53 PM

Thanks a lot, that page made my night! Just submitted some corrections.

# posted by

behdad : May 18, 2005 5:28 AM

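typedef unsigned char BYTE;   /* BYTE was not defined in the snippet as posted */
#define TRESULT unsigned long /* unsigned, so shifting the probe bit off the top is well defined */
BYTE f(TRESULT vect)
{
    TRESULT lululu;
    BYTE sum=0;
    /* probe every bit position in turn; the loop ends once the probe bit is shifted out */
    for(lululu=1;lululu;lululu<<=1)
        if(vect&lululu)
            sum++;
    //for(;;)love(!); I don't have much memory, but my CPU is fast!
    return sum;
}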

# posted by

Anonymous : May 24, 2005 12:48 AM

Thanks to everybody who commented. Indeed, using a precomputed table is the fastest way. But Xet, you really want to use a static table, man. For a sparse bit vector, such as database search results, a word-wide check that skips all-zero words can still speed things up by a constant factor.
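A rough sketch of those two suggestions combined, assuming a 64-bit word and a table generated at compile time with the macro trick from the Bit Twiddling Hacks page. The names `bits_in_byte` and `count_bits_sparse` are invented here, and the actual gain depends on how sparse the data is.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Static table: bits_in_byte[i] is the number of set bits in byte value i.
   The macros expand to all 256 entries at compile time. */
static const unsigned char bits_in_byte[256] = {
#define B2(n) n,     n+1,     n+1,     n+2
#define B4(n) B2(n), B2(n+1), B2(n+1), B2(n+2)
#define B6(n) B4(n), B4(n+1), B4(n+1), B4(n+2)
    B6(0), B6(1), B6(1), B6(2)
};

/* Count set bits, skipping any 8-byte word that is entirely zero. */
size_t count_bits_sparse(const unsigned char *buf, size_t len)
{
    size_t n = 0;
    while (len >= sizeof(uint64_t)) {
        uint64_t w;
        memcpy(&w, buf, sizeof w);   /* avoids alignment and aliasing issues */
        if (w != 0)
            for (size_t i = 0; i < sizeof w; i++)
                n += bits_in_byte[buf[i]];
        buf += sizeof w;
        len -= sizeof w;
    }
    for (; len > 0; len--, buf++)    /* leftover bytes, fewer than one word */
        n += bits_in_byte[*buf];
    return n;
}
```

The zero check costs one comparison per word, so on dense data it is nearly free and on sparse data it replaces eight lookups with one test.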

As for the smart solutions, what I was looking for was a way that takes time proportional to the number of set bits in a word, not the number of bits in the word, and indeed Brian Kernighan's method does exactly that. Thanks to Owrdac for the BitHacks link. You can find it here.
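For readers who have not seen it, here is a short sketch of Kernighan's method on a single 64-bit word. The function name `count_bits_kernighan` is just a label chosen for this example.

```c
#include <stdint.h>

/* v & (v - 1) clears the lowest set bit of v, so the loop body runs
   once per set bit rather than once per bit position. */
unsigned count_bits_kernighan(uint64_t v)
{
    unsigned n = 0;
    for (; v != 0; n++)
        v &= v - 1;
    return n;
}
```

A word with only its top and bottom bits set takes two iterations here, where a shift-and-test loop would take 64.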

Thanks for the anonymous love letter h.

And finally, my rather dull idea was: If you have two bit vectors A and B, then C=(A|B) and D=(A&B) are two new vectors such that n(A)+n(B) == n(C)+n(D), where n(X) is the number of set bits in X. Moreover, n(C) >= max(n(A), n(B)), and n(D) <= min(n(A), n(B)). So if you have cheap operations for bitwise or, bitwise and, check for all-zero, and check for all-one on vectors, you could use this algorithm: given vector X, if X is all-zero or all-one, return with the answer; otherwise break X into two halves X1 and X2, push the set bits towards all-one and all-zero by replacing X1 and X2 with (X1|X2) and (X1&X2), and recurse on each. Apparently this is of no use on the Turing-equivalent machines we have today, but maybe on some quantum mechanics computers in the future ;).
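To make the recursion concrete, here is a toy sketch on a single 32-bit word, assuming the width is a power of two. The name `count_bits_or_and` is hypothetical, and on an ordinary CPU this is only an illustration of the identity, not a competitive method.

```c
#include <stdint.h>

/* Count the set bits in the low `width` bits of x.
   width must be a power of two between 1 and 32. */
unsigned count_bits_or_and(uint32_t x, unsigned width)
{
    uint32_t mask = (width == 32) ? UINT32_MAX
                                  : ((UINT32_C(1) << width) - 1);
    x &= mask;
    if (x == 0)
        return 0;          /* all-zero: answer known immediately */
    if (x == mask)
        return width;      /* all-one: answer known immediately */

    /* Split into halves X1 (low) and X2 (high). A 1-bit value is always
       all-zero or all-one, so width is at least 2 when we get here. */
    unsigned half = width / 2;
    uint32_t x1 = x & ((UINT32_C(1) << half) - 1);
    uint32_t x2 = x >> half;

    /* n(X1) + n(X2) == n(X1|X2) + n(X1&X2) */
    return count_bits_or_and(x1 | x2, half)
         + count_bits_or_and(x1 & x2, half);
}
```

Each level halves the width, so the recursion always terminates, and it stops early whenever the or-half saturates to all-one or the and-half drains to all-zero.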

# posted by

behdad : June 15, 2005 9:33 PM
