Jump to content

Algorithm for searching within ranges

- - - - -

This topic has been archived. This means that you cannot reply to this topic.
10 replies to this topic

#1
laganojunior

laganojunior

    Newbie

  • Members
  • Pip
  • 5 posts
I have a certain problem that I would like to solve better in one of
my coding projects, and I was wondering if there is some ingenious
way of doing it as it doesn't seem to me that it would be uncommon

The problem is this: Given a set of n intervals [a_i, b_i] and a point
c, get all the intervals where c lies inside.

I want to avoid a linear time search of the intervals. I was thinking of
perhaps building a ternary tree. Each node represents an interval,
[a, b] and the subtrees represent the intervals (-infinity, a], [a, b],
and [b, infinity). And when adding another interval, it is recursively
added to the subtree that has some intersection with it. The problem
is, in the worse case this tree could grow exponentially (as in the case
of intervals that are inserted in an order such that an interval contains
all intervals inserted before it) and then searching the tree would be no better than a linear time search.

#2
John

John

    Writes binary right handed and hex left handed

  • Moderators
  • 6,321 posts
With your interval, is a_i always less than or equal to b_i?

#3
laganojunior

laganojunior

    Newbie

  • Members
  • Pip
  • 5 posts
Yes, the intervals have a_i <= b_i. That was silly of me. :)

#4
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
I don't see how you're going to get better than O(n) search unless you know something about how the intervals are presented to you.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#5
laganojunior

laganojunior

    Newbie

  • Members
  • Pip
  • 5 posts
Any pre-processing of the intervals is allowed. That is, the entire set of intervals is known and can be then processed to be in any order or structure desired. Once the data structure is constructed, then it is fixed (no other intervals are added) and is used repeatedly to test many points. I don't really care about the complexity of the preprocessing step, as in my project the vast amount of points to test will probably outweigh that step. I hope this makes it easier to find a better algorithm.

#6
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
Here's the problem, as I see it. To simply READ the ranges is O(n). So, unless you can avoid even reading some of the ranges, you're stuck with O(n). Sorting them in most any fashion will be O(n ln(n)) or worse. So, unless you plan to do a LOT of searches where having O(ln(n)) on the searches will outweigh the above costs, you aren't gaining anything.

All that said, what you could probably do is find the intersections of the ranges and associate each intersection with a list of ranges it's in. This would give you a discrete set. Now, find the "center" range, where there's an equal number of ranges to the left and right, and use that as the root of a balanced binary tree. The result is that for each point, it is either left of, in, or right of the range in the current node, and if it is in the node, you have the list of original ranges ready to be printed out.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#7
John

John

    Writes binary right handed and hex left handed

  • Moderators
  • 6,321 posts

WingedPanther said:

To simply READ the ranges is O(n).

Unless you use a PRAM, in which case reading takes unit time. :)

#8
laganojunior

laganojunior

    Newbie

  • Members
  • Pip
  • 5 posts
@WingedPanther
It's an interesting algorithm, but in the worst case, there might be an area that is the intersection of all n ranges, and searching that list is O(n) time. And the constant hidden in the O notation for this algorithm is probably larger than for the simple search. I don't think I can avoid the O(n) time for the general case.

@John
Parallelization makes everything funny. =)

#9
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts

laganojunior said:

@WingedPanther
It's an interesting algorithm, but in the worst case, there might be an area that is the intersection of all n ranges, and searching that list is O(n) time. And the constant hidden in the O notation for this algorithm is probably larger than for the simple search. I don't think I can avoid the O(n) time for the general case.

I'm not sure you get what I'm after. Consider the following ranges:
[-inf,10]
[-3,20]
[0,inf]

This would build the following set of sub-ranges:
[-inf,-3) -> [-inf,10]
[-3,0) -> [-inf,10],[-3,20]
[0,10] -> [-inf,10],[-3,20],[0,inf]
(10,20] -> [-3,20],[0,inf]
(20,inf] -> [0,inf]

Now you can arrange the subranges in a tree:

[0-10] --- [-3,0) --- [-inf,-3)
\-- (10,20] --- (20,inf]
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#10
laganojunior

laganojunior

    Newbie

  • Members
  • Pip
  • 5 posts
Oh I get it now. Thanks for the explanation.

#11
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
I'm glad I could help :)
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog