Jump to content

sorting efficiency

- - - - -

This topic has been archived. This means that you cannot reply to this topic.
4 replies to this topic

#1
John

John

    Writes binary right handed and hex left handed

  • Moderators
  • 6,321 posts
Under what circumstances would say, selection sort [ O(n^2) ], be more optimal than say, merge sort [ O(n logn ) ]?

Is it reasonable to thing of these as functions, thus for values of n where O(n^2) is less than O(n logn ) it is more efficient, and where O(n logn ) is less than O(n^2) it is more efficient?

#2
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
Because O() notation ignores a constant multiplier (and usually overhead for memory allocation/function calls), when n is small, the selection sort can actually be more efficient. For example, selection sort's efficiency might be 10 n^2 and merge might be 100 n log n. For n=2, that's 40 verses 60.

There are also times when you have to look at whether merge sort is *significantly* more efficient, given the addition coding time that may be involved.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#3
John

John

    Writes binary right handed and hex left handed

  • Moderators
  • 6,321 posts
So if you were just given the O() notation (without a constant multiplier or any information regarding overhead for memory allocation), it would be impossible to determine the exact amount of data for which each algorithm is more efficient?

#4
WingedPanther

WingedPanther

    A spammer's worst nightmare

  • Moderators
  • 16,831 posts
The purpose of O() notation is to measure which is more efficient for sufficiently large input. This relates to the idea of whether something will scale well. If one has efficiency 1000000x^2, and another has efficiency x^3, the second will perform better on small trials, but will not scale well to large trials.
Programming is a branch of mathematics.
My CodeCall Blog | My Personal Blog

#5
CygnetGames

CygnetGames

    Programmer

  • Members
  • PipPipPipPip
  • 119 posts
Another problem with just looking at time complexity is that in practice you may be concerned with space complexity as well. In functional languages, merge sorts are usually better for space than insertion sort, but in popular languages like C++ etc. implementing a merge sort usually requires more space than an insertion sort.

An example of when the constant multipiler is really important is with analogue computers. Using an analogue computer made of spaghetti it is possible to sort a list of numbers in linear time! (and linear space) But the problem is that it is such a task to actually build a machine that sorts using spaghetti. The constant multiplier is too high.

Also, if you know that your list is nearly sorted then insertion sort will do less work than merge sort. In my computer science course, we were advised to use insertion sort for small arrays. For medium-sized arrays, we were advised to start with quicksort but not take it all the way. If you stop a few iterations before quicksort is finished, you have a nearly-sorted array so it will be better to use insertion sort. For large arrays, shell sort was recommended. Not sure how widely-used these ideas are.