June 20, 2008
Binary lower bound search

Blogging from my wife's BlackBerry here... while I'm watching the kids outside. Should be fun!

I recently ran into a work problem where I had to find, in a sorted array, the largest entry smaller than my input number. You can approach this as a binary search problem, and I did at first. When Google couldn't find a good implementation in C, I wrote one of my own, even to the point of coding up test scaffoling in the style of Programming Pearls.

But then I recognized that the problem was actually called "finding the lower bound", which gives better search results. There's even an implementation in STL that I could use in my target application. More on that later.

Anyway, about a week later I've spent so much time thinking about the damn algorithm I haven't even put it in the target code yet!

Here's the code, in case anyone finds it useful. Might save someone a half hour coding and testing, in any case.


/* binary search, find the bucket in which v should belong
a is a strictly increasing integer array of n elements
bucket i is defined by k : a[i] <= k < a[i+1]
there are n-1 buckets in the array

 valid returns are 0..n-2
*/
int bsearch_lb( int a[], int n, int v )
{
	int l = 0;
	int u = n-1;
	
	/* this exit is useful if all you need is a validity test */
	if( v < a[l] || v >= a[u] )
		return -1;
	
	/* this exit is more useful if you need to know whether you're out range above or below */
	/* if( v < a[l] ) return -1; */
	/* if( v >= a[u] ) return n; */
	
	while( u - l > 1 )
	{
		int t = (l+u)/2; /* risk of overflow if n > 2**31 */
		
		if( v > a[t] )
			l = t;
		else if( v < a[t] )
			u = t;
		else /* v == a[t] */
			return t; /* by def. of bucket */
	}
	
	return l;
}

#include <stdio.h>
#include <assert.h>

int main()
{
	/* scaffolding */
	int i;
	
	/* equally spaced */
	int a[11] = { 0, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100 };
	
	/* variable spaced, including neg. numbers */
	int a2[6] = { -100, 0, 100, 300, 700, 1000 };
	int t[] = { -101, -100,   -99, -1, 
		0,   1,   99, 
					   100, 101, 299, 
					   300, 301, 699,
					   700, 701, 999,
					   1000, 1001, 5000 };
	int v[] = { -1, 0, 0, 0,
		1, 1, 1,
		2, 2, 2,
		3, 3, 3,
		4, 4, 4,
		-1, -1, -1 };
	
	for( i = 0; i < 100; ++i )
	{
		int b = bsearch_lb( a, 11, i );
		assert( b == i/10 );
	}
	
	for( i = 0; i < sizeof(t)/sizeof(int); ++i )
	{
		int b = bsearch_lb( a2, 6, t[i] );
		assert( b == v[i] );
	}	
}
Posted by Sam at 11:36 AM