I had a phone interview around mid-October. In the second half of the interview I was asked the obligatory programming question as follows,
Given an input vector, find the maximum sum of a contiguous sub-vector in that vector. Your algorithm should be in linear time.
This was taken verbatim from the Jon Bentley’s excellent book “Programming Pearls” (2nd. Ed.). I coud’nt solve the problem at the time (and I kicked myself because I owned and had read the book), but it got me thinking about the problem again.
I thought that even though the problem did’nt ask for it, the maximum sum is not very useful. What would be more useful is the actual range of the sub-vector. After all the problem itself was a simplified version of the (as yet open) problem for finding the largest sum of 2-D matrix. I’m not going to go into various details on how the problem can be solved, but I did think about how I could implement a generic version of the algorithm that would dove-tail nicely with various sequences in the STL. (If any reader is interested, I strongly suggest you buy the book). Here is my first cut,
template<typename SequenceIterator>
struct find_max_sum_range
{
std::pair<SequenceIterator, SequenceIterator>
operator () (typename SequenceIterator first, typename SequenceIterator end)
{
SequenceIterator::value_type max_so_far = 0;
SequenceIterator::value_type max_ending_here = 0;
std::pair<SequenceIterator, SequenceIterator> p;
for (;first!=end; ++first){
if (max_ending_here==0){
p.first = p.second = first;
}
max_ending_here =
std::max<typename SequenceIterator::value_type>(max_ending_here + *first, 0);
if (max_ending_here > max_so_far){
max_so_far = max_ending_here;
p.second = first;
}
};
//Have to increment STL iterator because the end always points to
//one incrment beyond
++p.second;
return p;
}
};
I was’nt very happy with this version. I thought that to keep it in the spirit of STL, I needed to use the proper STL library functions, and in this case the right algorithm to use would be the std::for_each function. The problem with this function is that there is no way to access the current iterator while the algorithm is being executed. (Actually this is a general weakness in the otherwise very powerful STL iterator concept). My second solution got around the problem by using a functor that held a reference to a local iterator that got updated whenever the functor got executed. Hence the second cut of the solution.
template<typename SequenceIterator>
struct find_max_sum_range2
{
private:
template<typename SequenceIterator>
struct find_max_sum_range_helper
{
public:
find_max_sum_range_helper
( typename SequenceIterator::value_type &max_so_far_
, typename SequenceIterator::value_type &max_ending_here_
, std::pair<SequenceIterator, SequenceIterator > &p_
, typename SequenceIterator ¤t_)
: max_so_far(max_so_far_)
, max_ending_here(max_ending_here_)
, p(p_)
, current(current_)
{ /* Do nothing else */}
void operator()(typename SequenceIterator::value_type val){
if(max_ending_here==0){
p.first = p.second = current;
}
max_ending_here =
std::max<typename SequenceIterator::value_type>(max_ending_here + val, 0);
if (max_ending_here > max_so_far){
max_so_far = max_ending_here;
p.second = current;
}
//increment the current operator;
++current;
}
private:
typename SequenceIterator::value_type &max_so_far;
typename SequenceIterator::value_type &max_ending_here;
typename std::pair<SequenceIterator, SequenceIterator> &p;
SequenceIterator ¤t;
};
public:
std::pair<SequenceIterator, SequenceIterator>
operator() (typename SequenceIterator first, typename SequenceIterator end)
{
SequenceIterator::value_type max_so_far = 0;
SequenceIterator::value_type max_ending_here = 0;
std::pair<SequenceIterator, SequenceIterator> p;
SequenceIterator current = first;
std::for_each(first, end,
find_max_sum_range_helper<SequenceIterator>(max_so_far, max_ending_here, p, current));
//increment to make sure that it points to 1 element beyond the last element.
++p.second;
return p;
}
};
Comparing the various solutions, it seems the implementation using std::for_each was much more complicated, although it abstracted away the alogrithm for every element away from the iteration of the algorithm. My next step would be use to Boost::lambda to make the code somewhat more clearer while trying to stay in the STL spirit of things.
Usage is as follows
/* TEST 1:
* Test a sequence of ints using vectors
*/
int a[]={-3, 2, 4, -7, 2, 6, 8, -3, 2};
std::vector<int> x(a, a+sizeof(a)/sizeof(int));
typedef std::pair<std::vector<int>::iterator, std::vector<int>::iterator> vector_iterator_pair;
vector_iterator_pair vec_pair = find_max_sum_range<std::vector<int>::iterator>()(x.begin(), x.end());
print_sequence(vec_pair.first, vec_pair.second);
/* Test 2:
* Test a sequence of longs using lists
*/
long b[]={-3, 2, 4, -7, 2, 6, 8, -1, 2};
std::list<long> list_of_longs(sizeof(b)/sizeof(long), 0);
std::swap_ranges(list_of_longs.begin(), list_of_longs.end(), b);
typedef std::pair<std::list<long>::iterator, std::list<long>::iterator> list_iterator_pair ;
list_iterator_pair lp = find_max_sum_range<std::list<long>::iterator>()
(list_of_longs.begin(), list_of_longs.end()); print_sequence(lp.first, lp.second);
/* Test 3:
* Test a single element sequence, using a set of flats
*/
float ca[]={2.365f};
std::set<float> set_of_floats(ca, ca + sizeof(ca)/sizeof(float));
typedef std::pair<std::set<float>::iterator, std::set<float>::iterator> float_iterator_pair;
float_iterator_pair cp = find_max_sum_range<std::set<float>::iterator>()
( set_of_floats.begin(), set_of_floats.end());
print_sequence(cp.first, cp.second);