### Linear Time Sort Puzzler

Interviewing for an engineering position at Google can be grueling and humbling. You get less than an hour with each interviewer, and may get one technical problem after another to solve. If you have performance anxiety you may have a hard time focusing on the problem. The best approach, if you can do it, is to think of the interview as a technically challenging but friendly brainstorming session. If you get hired, that's what your life will be like.

My friend, Jim Davis, told me about someone he interviewed earlier this week.
The candidate thought
the interview wasn't going well and hoped to impress Jim by
sharing an idea he had: a new algorithm for sorting in linear time. It
isn't a sort based exclusively on comparisons; it can be proven that any
such sort must perform O(*n* log *n*) comparisons. It isn't
a bucket sort, which is a standard algorithm for this problem (assuming
certain characteristics of the input set). But the candidate's algorithm
helped Jim understand the candidate's capabilities.

The algorithm works like this: you start by making a linear scan through
the *n* numbers to be sorted and note the largest and smallest of the
elements. Using that, you compute a linear function *f(x)* that will
map the smallest input element to zero and the largest to *n*. Then,
for each input element *x _{i}*, you fork a task that waits

*f(x*milliseconds and then outputs

_{i})*x*. After

_{i}*n*milliseconds the output contains the input elements in sorted order.

There may be problems when two tasks attempt to output their data at
nearly the same time, but let's ignore that for a moment and assume an
"ideal" system. If you have a truly concurrent computer, this
may run in O(*n*) time using O(*n*) processors, thereby
consuming a total of O(*n ^{2}*) processing time. There are
far better concurrent sorting algorithms around. Much more interesting
is the idea of running this algorithm on a sequential computer with a
non-preemptive scheduler. The non-preemptive scheduler avoids the problem
of two tasks producing data at the same time. The scheduler can also ensure
that the tasks run in their proper order, even if the running time of some
tasks causes others to start slightly "too late".

Jim explained the problem with the algorithm, concluding "so you've reduced the problem of sorting in linear time to a different linear time sorting algorithm." After Jim explained this remark, the candidate replied "well, the idea isn't really very well thought out." Here is the puzzle: what was Jim talking about?

## 16 comments:

In the non-preemptive scheduler case the scheduler needs to sort the sleeping threads based on the length of time they are sleeping for in order to wake them up 'in order'.

Aside from the fact that the candidate's algorithm is essentially bucksort (with temporal buckets) (s)he missed a trick: why not compute a non-linear function which maps the smallest element to 0, and the largest to logn (or loglogn or whatever), all other details the same. This gets you a sort in O(logn) time and O(nlogn) work on a truly parallel machine ... hurrah!

Did you already drop the keyword: "scheduler"? If n tasks ask to be rescheduled in f(x), the scheduler has to determine which one to schedule next, and the one after that, ... and thus sort the tasks which it can't do faster than O(n).

Computing f(x)?

I guess he converted the sorting problem into a scheduling problem. The scheduler has to sort the n tasks to schedule them in the proper order, which takes O(nlogn) time.

It's in the words: "sorting in linear time" is different from "linear time sorting", at least in this case.

Jim was talking about closures :)

Computing the perfect hash function.

Sweeeeeeet, I finally answered a puzzler.

The difference is that the new algorithm is O(n) where n is the difference between the smallest and largest values in the set, not the number of items in the set as sorting is typically measured by.

CORRECT answers from: matt, matthias, manu. Bonus points for harryhoppel. All the other answers are incorrect.

Can I get a job at Google now?

Winners: send me your resumes!

Take n pieces of spaghetti in your hand, hit them on the table and pick the longest. Now, isn't that sorting in O(1)?

Reply to O(1) spaghetti sorting ... that only works if you can hold n spaghetti strands in your hand at one time. My hand (yours too, I assume...) can only hold a constant number of spaghetti strands. :-)

So there is no such algorithm.

Tim Davis, Univ of Florida

You could have multiple number of hands performing the same thing, the same time. That is Linear.

I was wondering if you could sort in a different domain. All sorting algorithms, at least the ones i know, sort in time domain.

Post a Comment