I have a certain problem that I would like to solve better in one of
my coding projects, and I was wondering if there is some ingenious
way of doing it as it doesn't seem to me that it would be uncommon
The problem is this: Given a set of n intervals [a_i, b_i] and a point
c, get all the intervals where c lies inside.
I want to avoid a linear time search of the intervals. I was thinking of
perhaps building a ternary tree. Each node represents an interval,
[a, b] and the subtrees represent the intervals (-infinity, a], [a, b],
and [b, infinity). And when adding another interval, it is recursively
added to the subtree that has some intersection with it. The problem
is, in the worse case this tree could grow exponentially (as in the case
of intervals that are inserted in an order such that an interval contains
all intervals inserted before it) and then searching the tree would be no better than a linear time search.
Algorithm for searching within ranges
Started by laganojunior, Jul 20 2009 06:11 PM
10 replies to this topic
#1
Posted 20 July 2009 - 06:11 PM
|
|
|
#2
Posted 20 July 2009 - 06:46 PM
#3
Posted 20 July 2009 - 06:50 PM
Yes, the intervals have a_i <= b_i. That was silly of me. :)
#4
Posted 21 July 2009 - 07:07 AM
I don't see how you're going to get better than O(n) search unless you know something about how the intervals are presented to you.
#5
Posted 21 July 2009 - 10:36 AM
Any pre-processing of the intervals is allowed. That is, the entire set of intervals is known and can be then processed to be in any order or structure desired. Once the data structure is constructed, then it is fixed (no other intervals are added) and is used repeatedly to test many points. I don't really care about the complexity of the preprocessing step, as in my project the vast amount of points to test will probably outweigh that step. I hope this makes it easier to find a better algorithm.
#6
Posted 21 July 2009 - 12:53 PM
Here's the problem, as I see it. To simply READ the ranges is O(n). So, unless you can avoid even reading some of the ranges, you're stuck with O(n). Sorting them in most any fashion will be O(n ln(n)) or worse. So, unless you plan to do a LOT of searches where having O(ln(n)) on the searches will outweigh the above costs, you aren't gaining anything.
All that said, what you could probably do is find the intersections of the ranges and associate each intersection with a list of ranges it's in. This would give you a discrete set. Now, find the "center" range, where there's an equal number of ranges to the left and right, and use that as the root of a balanced binary tree. The result is that for each point, it is either left of, in, or right of the range in the current node, and if it is in the node, you have the list of original ranges ready to be printed out.
All that said, what you could probably do is find the intersections of the ranges and associate each intersection with a list of ranges it's in. This would give you a discrete set. Now, find the "center" range, where there's an equal number of ranges to the left and right, and use that as the root of a balanced binary tree. The result is that for each point, it is either left of, in, or right of the range in the current node, and if it is in the node, you have the list of original ranges ready to be printed out.
#7
Posted 21 July 2009 - 06:46 PM
WingedPanther said:
To simply READ the ranges is O(n).
Unless you use a PRAM, in which case reading takes unit time. :)
#8
Posted 24 July 2009 - 06:51 PM
@WingedPanther
It's an interesting algorithm, but in the worst case, there might be an area that is the intersection of all n ranges, and searching that list is O(n) time. And the constant hidden in the O notation for this algorithm is probably larger than for the simple search. I don't think I can avoid the O(n) time for the general case.
@John
Parallelization makes everything funny. =)
It's an interesting algorithm, but in the worst case, there might be an area that is the intersection of all n ranges, and searching that list is O(n) time. And the constant hidden in the O notation for this algorithm is probably larger than for the simple search. I don't think I can avoid the O(n) time for the general case.
@John
Parallelization makes everything funny. =)
#9
Posted 25 July 2009 - 03:25 AM
laganojunior said:
@WingedPanther
It's an interesting algorithm, but in the worst case, there might be an area that is the intersection of all n ranges, and searching that list is O(n) time. And the constant hidden in the O notation for this algorithm is probably larger than for the simple search. I don't think I can avoid the O(n) time for the general case.
It's an interesting algorithm, but in the worst case, there might be an area that is the intersection of all n ranges, and searching that list is O(n) time. And the constant hidden in the O notation for this algorithm is probably larger than for the simple search. I don't think I can avoid the O(n) time for the general case.
I'm not sure you get what I'm after. Consider the following ranges:
[-inf,10]
[-3,20]
[0,inf]
This would build the following set of sub-ranges:
[-inf,-3) -> [-inf,10]
[-3,0) -> [-inf,10],[-3,20]
[0,10] -> [-inf,10],[-3,20],[0,inf]
(10,20] -> [-3,20],[0,inf]
(20,inf] -> [0,inf]
Now you can arrange the subranges in a tree:
[0-10] --- [-3,0) --- [-inf,-3)
\-- (10,20] --- (20,inf]
#10
Posted 25 July 2009 - 12:01 PM
Oh I get it now. Thanks for the explanation.
#11
Posted 25 July 2009 - 06:08 PM
I'm glad I could help :)


Sign In
Create Account

Back to top









