Saturday 12 October 2013

I was helping a co-worker with a quick hackathon project (patch-juggler, which will hopefully grow into an interactive rebase GUI), and he had a function to determine whether two ranges overlap.

His code used eight comparisons to check whether the endpoint of one of the ranges was contained within the other range. In Python it would look like this:

`def overlap(start1, end1, start2, end2):`

"""Does the range (start1, end1) overlap with (start2, end2)?"""

return (

start1 <= start2 <= end1 or

start1 <= end2 <= end1 or

start2 <= start1 <= end2 or

start2 <= end1 <= end2

)

I said you could do it in two comparisons rather than eight, but could never remember the trick.

Looking online, I found this great explanation on Stack Overflow. I’ll restate it in my own words:

Instead of thinking about how overlapping ranges might relate, consider
what conditions have to be true for the ranges *not* to overlap.
It’s much simpler: either range1 is entirely less than range2, or
range1 is entirely greater than range2.

Since the ranges are ordered (start <= end), range1 less than range2 is true if end1 < start2. Similarly range1 greater than range2 can be determined with end2 < start1.

So we can check for non-overlapping ranges with two compares. Inverting the result tells us whether they overlap:

`def overlap(start1, end1, start2, end2):`

"""Does the range (start1, end1) overlap with (start2, end2)?"""

return not (end1 < start2 or end2 < start1)

De Morgan’s laws mean that we can change it to this:

`def overlap(start1, end1, start2, end2):`

"""Does the range (start1, end1) overlap with (start2, end2)?"""

return end1 >= start2 and end2 >= start1

Just looking at this final code, it’s hard to see the logic. It’s even harder to think about it without holding out two fingers on each hand and waving them around to picture overlapping ranges! But with the reasoning about non-overlapping ranges, it makes perfect sense.

This is one of those programming nuggets that makes me happy, I can’t explain why.

## Comments

Ed Davies7:31 AM on 13 Oct 2013Yes, this is one of those little tricks that I've sort of known for years but have to work out from first principles each time leaving me not completely confident I've got it right as the result seems so distant from the original problem statement. Still, funny that after all this time there should be two posts on the subject together, the other being:

http://blog.plover.com/prog/intervals.html

Tim Golden7:44 AM on 14 Oct 2013This is such a common need in my business (SQL backend for advertising bookings) that's it's one of my interview questions. I'll accept any reasonable answer including a simple series of OR clauses, but it's nice when you occasionally get someone turn up this solution.

James2:35 PM on 15 Oct 2013Interval arithmetic is a neat little subject! I started a little Python library for doing these sorts of calculations with intervals. I still have to implement the bisection and range query operators but this trick is in the __and__ of an Interval object (ie: Interval(1, 3) & Interval(2, 5) # a non-zero value means they do, Interval(None, None) if they do not).

https://github.com/agentultra/Ambit

Trinket10:28 AM on 27 Oct 2013Wouldn't it be simpler to cast both ranges into sets and just check intersection?

Ned Batchelder12:54 PM on 27 Oct 2013@Trinket: if the ranges are over integers, they could be very large. Computing the intersection of (1,999999) and (500000,9999999999) as sets would take a great deal of memory. And if they are over floats, the set of discrete values is even huger.

Not to mention, they might not be numbers at all. We can talk about the range from "Apple" to "Mango" and whether it overlaps the range from "Jicama" to "Pear". There are an infinite number of values in each range, it's impossible to enumerate them as sets.

Trinket1:59 PM on 3 Nov 2013Thanks for the explanation Ned.

Always a pleasure reading your answers :-)

Vjeko3:55 PM on 29 Nov 2013Thanks for sharing, helped a lot!

Henrik Warne1:55 PM on 13 Feb 2014Hi Ned,

Good write-up! I used this trick today when I needed to check if two ranges overlapped. One small comment: I find that using De Morgan's law (to get the final version) makes the overlap method harder to understand – looking at it I can't see an intuitive reason why it works. For "not (end1 < start2 or end2 < start1)" on the other hand, I can picture the logic it represents, so I used that version.

## Add a comment: