What's the fastest algorithm for sorting a linked list?
I'm curious if O(n log n) is the best a linked list can do.
algorithm sorting linked-list complexity-theory
Just so that you know, O(n log n) is the bound for comparison-based sorts. There are non-comparison-based sorts that can give O(n) performance (e.g. counting sort), but they require additional constraints on the data. – MAK, Oct 6 '09 at 15:50
asked Oct 6 '09 at 11:51 by Dirk; edited Jun 2 '13 at 0:50 by Daniel Imms
12 Answers
It is reasonable to expect that you cannot do better than O(N log N) in running time.
The interesting part, however, is to investigate whether you can sort the list in place and stably, what its worst-case behavior is, and so on.
Simon Tatham, of PuTTY fame, explains how to sort a linked list with merge sort. He concludes with the following comments:
Like any self-respecting sort algorithm, this has running time O(N log N). Because this is Mergesort, the worst-case running time is still O(N log N); there are no pathological cases.
Auxiliary storage requirement is small and constant (i.e. a few variables within the sorting routine). Thanks to the inherently different behaviour of linked lists from arrays, this Mergesort implementation avoids the O(N) auxiliary storage cost normally associated with the algorithm.
There is also an example implementation in C that works for both singly and doubly linked lists.
As @Jørgen Fogh mentions below, big-O notation may hide constant factors that can cause one algorithm to perform better than another because of memory locality, a low number of items, etc.
answered Oct 6 '09 at 12:05 by csl; edited Feb 3 '16 at 8:52
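A merge sort along these lines for a singly linked list might look like the following. This is a minimal sketch, not Tatham's implementation; the node type and names are invented for illustration. It splits with the slow/fast pointer technique and merges stably, using only O(log N) stack and no O(N) scratch array:

```c
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Split the list into two halves using the slow/fast pointer technique. */
static node *split(node *head) {
    node *slow = head, *fast = head->next;
    while (fast && fast->next) {
        slow = slow->next;
        fast = fast->next->next;
    }
    node *second = slow->next;
    slow->next = NULL;   /* terminate the first half */
    return second;
}

/* Merge two sorted lists stably (<= keeps equal elements in input order). */
static node *merge(node *a, node *b) {
    node dummy = { 0, NULL };
    node *tail = &dummy;
    while (a && b) {
        if (a->value <= b->value) { tail->next = a; a = a->next; }
        else                      { tail->next = b; b = b->next; }
        tail = tail->next;
    }
    tail->next = a ? a : b;
    return dummy.next;
}

/* O(N log N) merge sort done by relinking nodes in place. */
node *merge_sort(node *head) {
    if (!head || !head->next) return head;
    node *second = split(head);
    return merge(merge_sort(head), merge_sort(second));
}
```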
This is not for a singly linked list. His C code is using *prev and *next. – L.E., Apr 27 '13 at 5:35
@L.E. It's actually for both. If you look at the signature for listsort, you'll see you can switch by using the parameter int is_double. – csl, Oct 21 '13 at 14:07
@L.E.: here's a Python version of the listsort C code that supports only singly-linked lists. – jfs, Nov 2 '13 at 7:26
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits / size of the objects you are sorting), it might be a bit faster. – Adam, Jul 14 '17 at 14:32
Depending on a number of factors, it may actually be faster to copy the list to an array and then use a quicksort.
The reason this might be faster is that an array has much better cache performance than a linked list. If the nodes in the list are dispersed in memory, you may be generating cache misses all over the place. Then again, if the array is large you will get cache misses anyway.
Mergesort parallelises better, so it may be a better choice if that is what you want. It is also much faster if you perform it directly on the linked list.
Since both algorithms run in O(n log n), making an informed decision would involve profiling both on the machine you would like to run them on.
EDIT:
I decided to test my hypothesis and wrote a C program which measured the time (using clock()) taken to sort a linked list of ints. I tried with a linked list where each node was allocated with malloc() and a linked list where the nodes were laid out linearly in an array, so the cache performance would be better. I compared these with the built-in qsort, which included copying everything from a fragmented list to an array and copying the result back again. Each algorithm was run on the same 10 data sets and the results were averaged.
These are the results:
N = 1000:
Fragmented list with merge sort: 0.000000 seconds
Array with qsort: 0.000000 seconds
Packed list with merge sort: 0.000000 seconds
N = 100000:
Fragmented list with merge sort: 0.039000 seconds
Array with qsort: 0.025000 seconds
Packed list with merge sort: 0.009000 seconds
N = 1000000:
Fragmented list with merge sort: 1.162000 seconds
Array with qsort: 0.420000 seconds
Packed list with merge sort: 0.112000 seconds
N = 100000000:
Fragmented list with merge sort: 364.797000 seconds
Array with qsort: 61.166000 seconds
Packed list with merge sort: 16.525000 seconds
Conclusion:
At least on my machine, copying into an array is well worth it to improve the cache performance, since you rarely have a completely packed linked list in real life. It should be noted that my machine has a 2.8GHz Phenom II, but only 0.6GHz RAM, so the cache is very important.
Good comments, but you should consider the non-constant cost of copying the data from a list to an array (you'd have to traverse the list), as well as the worst-case running time for quicksort. – csl, Oct 6 '09 at 15:24
O(n log n) is theoretically the same as O(n log n + n), which would include the cost of the copy. For any sufficiently large n, the cost of the copy really shouldn't matter; traversing a list once to the end is n time. – Dean J, Oct 6 '09 at 15:31
@DeanJ: Theoretically, yes, but remember that the original poster is putting forth the case where micro-optimizations matter. And in that case, the time spent turning a linked list into an array has to be considered. The comments are insightful, but I'm not completely convinced it would provide a performance gain in reality. It might work for a very small N, perhaps. – csl, Oct 8 '09 at 7:37
@csl: Actually, I'd expect the benefits of locality to kick in for large N. Assuming that cache misses are the dominant performance effect, then the copy-qsort-copy approach results in about 2N cache misses for the copying, plus the number of misses for the qsort, which will be a small fraction of N log(N) (since most accesses in qsort are to an element close to a recently-accessed element). The number of misses for the merge sort is a larger fraction of N log(N), since a higher proportion of comparisons cause a cache miss. So for large N, this term dominates and slows down mergesort. – Steve Jessop, Nov 14 '09 at 15:57
@Steve: You are right that qsort is not a drop-in replacement, but my point is not really about qsort vs. mergesort. I just didn't feel like writing another version of the mergesort when qsort was readily available. The standard library is way more convenient than rolling your own. – Jørgen Fogh, Nov 14 '09 at 21:40
Comparison sorts (i.e. ones based on comparing elements) cannot possibly be faster than n log n. It doesn't matter what the underlying data structure is. See Wikipedia.
Other kinds of sort that take advantage of there being lots of identical elements in the list (such as counting sort), or some expected distribution of elements in the list, are faster, though I can't think of any that work particularly well on a linked list.
As stated many times, the lower bound on comparison-based sorting for general data is going to be O(n log n). To briefly re-summarize these arguments: there are n! different ways a list can be ordered, so any comparison tree that distinguishes all n! (which is in O(n^n)) possible final orders is going to need at least log(n!) as its height; this gives you a log(n^n) lower bound, which is O(n log n).
So, for general data on a linked list, the best possible sort that will work on any data that can compare two objects is going to be O(n log n). However, if you have a more limited domain of things to work in, you can improve the time it takes (at least proportionally to n). For instance, if you are working with integers no larger than some value, you could use Counting Sort or Radix Sort, as these use the specific objects you're sorting to reduce the complexity in proportion to n. Be careful, though: these add some other terms to the complexity that you may not consider. For instance, Counting Sort and Radix Sort both add in factors that are based on the size of the numbers you're sorting - O(n + k), where k is the largest number, for Counting Sort.
Also, if you happen to have objects that have a perfect hash (or at least a hash that maps all values differently), you could try using a counting or radix sort on their hash values.
Here is a nice little paper on this topic. Its empirical conclusion is that Treesort is best, followed by Quicksort and Mergesort. Sediment sort, bubble sort and selection sort perform very badly.
A COMPARATIVE STUDY OF LINKED LIST SORTING ALGORITHMS
by Ching-Kuang Shene
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.9981
A Radix sort is particularly suited to a linked list, since it's easy to make a table of head pointers corresponding to each possible value of a digit.
Can you please explain more on this topic or give a resource link for radix sort on a linked list? – LoveToCode, Aug 19 '15 at 20:37
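The bucketing idea can be sketched as follows (illustrative only, not from the answer's author; the node type and names are invented). An LSD radix sort over 8-bit digits keeps a table of 256 head and tail pointers, distributes nodes into the bucket lists in one O(n) pass per digit, then relinks the buckets in order. Each pass is stable, so after all passes the list is sorted, in O(kn) total for k digits:

```c
#include <stddef.h>

typedef struct node {
    unsigned value;
    struct node *next;
} node;

/* LSD radix sort on 8-bit digits: 4 stable distribution passes cover 32-bit keys. */
node *radix_sort(node *list) {
    for (int shift = 0; shift < 32; shift += 8) {
        node *heads[256] = { NULL };
        node *tails[256] = { NULL };

        /* Distribute: append each node to the bucket for its current digit. */
        while (list) {
            node *next = list->next;
            unsigned d = (list->value >> shift) & 0xFF;
            list->next = NULL;
            if (heads[d]) tails[d]->next = list;
            else          heads[d] = list;
            tails[d] = list;
            list = next;
        }

        /* Collect: relink buckets 0..255 back into one list, in order. */
        node *head = NULL, *tail = NULL;
        for (int d = 0; d < 256; ++d) {
            if (!heads[d]) continue;
            if (tail) tail->next = heads[d];
            else      head = heads[d];
            tail = tails[d];
        }
        list = head;
    }
    return list;
}
```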
Merge sort doesn't require O(1) access and is O(n ln n). No known algorithms for sorting general data are better than O(n ln n).
The special-data algorithms such as radix sort (limits size of data) or histogram sort (counts discrete data) could sort a linked list with a lower growth function, as long as you use a different structure with O(1) access as temporary storage.
Another class of special data is a comparison sort of an almost-sorted list with k elements out of order. This can be sorted in O(kn) operations.
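One way to reach that bound on a singly linked list (a sketch, not from the answer's author; the node type and names are invented) is an insertion sort that keeps a tail pointer: elements that arrive in order are appended in O(1), and each of the k out-of-place elements costs at most one O(n) scan from the head, for O(n + kn) = O(kn) overall:

```c
#include <stddef.h>

typedef struct node {
    int value;
    struct node *next;
} node;

/* Insertion sort tuned for almost-sorted lists. */
node *insertion_sort(node *list) {
    node *head = NULL, *tail = NULL;
    while (list) {
        node *cur = list;
        list = list->next;
        cur->next = NULL;
        if (!head) {                           /* first element */
            head = tail = cur;
        } else if (cur->value >= tail->value) {
            tail->next = cur;                  /* common case: already in order */
            tail = cur;
        } else if (cur->value < head->value) {
            cur->next = head;                  /* new minimum: prepend */
            head = cur;
        } else {
            node *p = head;                    /* O(n) scan for insertion point */
            while (p->next && p->next->value < cur->value)
                p = p->next;
            cur->next = p->next;
            p->next = cur;
        }
    }
    return head;
}
```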
Copying the list to an array and back would be O(N), so any sorting algorithm can be used if space is not an issue.
For example, given a linked list containing uint8_t values, this code will sort it in O(N) time using a histogram sort:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct _list list_t;
struct _list {
    uint8_t value;
    list_t *next;
};

list_t *sort_list(list_t *list)
{
    list_t *heads[256] = { 0 };
    list_t *tails[256] = { 0 };

    // O(N) loop: append each node to the bucket for its value
    for (list_t *it = list; it != 0; it = it->next) {
        if (heads[it->value] == 0)
            heads[it->value] = it;
        else
            tails[it->value]->next = it;
        tails[it->value] = it;
    }

    // constant-time loop: relink the buckets from highest value to lowest
    list_t *result = 0;
    for (size_t i = 256; i-- > 0; ) {
        if (tails[i]) {
            tails[i]->next = result;
            result = heads[i];
        }
    }
    return result;
}

list_t *make_list(char *string)
{
    list_t head;
    for (list_t *it = &head; *string; it = it->next, ++string) {
        it->next = malloc(sizeof(list_t));
        it->next->value = (uint8_t)*string;
        it->next->next = 0;
    }
    return head.next;
}

void free_list(list_t *list)
{
    for (list_t *it = list; it != 0; ) {
        list_t *next = it->next;
        free(it);
        it = next;
    }
}

void print_list(list_t *list)
{
    printf("[ ");
    if (list) {
        printf("%c", list->value);
        for (list_t *it = list->next; it != 0; it = it->next)
            printf(", %c", it->value);
    }
    printf(" ]\n");
}

int main(int nargs, char **args)
{
    list_t *list = make_list(nargs > 1 ? args[1] : "wibble");
    print_list(list);
    list_t *sorted = sort_list(list);
    print_list(sorted);
    free_list(sorted);  // the sort relinked the nodes, so free from the new head
    return 0;
}
It's been proven that no comparison-based sort algorithms exist that are faster than n log n. – Artelius, Oct 6 '09 at 11:58
No, it's been proven that no comparison-based sort algorithms on general data are faster than n log n. – Pete Kirkham, Oct 6 '09 at 11:59
No, any sort algorithm faster than O(n lg n) would not be comparison-based (e.g. radix sort). By definition, a comparison sort applies to any domain that has a total order (i.e. can be compared). – bdonlan, Oct 7 '09 at 2:48
@bdonlan the point of "general data" is that there are algorithms which are faster for constrained input, rather than random input. At the limiting case, you can write a trivial O(1) algorithm which sorts a list given the input data is constrained to be already sorted. – Pete Kirkham, Oct 7 '09 at 9:11
And that would not be a comparison-based sort. The modifier "on general data" is redundant, since comparison sorts already handle general data (and the big-O notation is for the number of comparisons made). – Steve Jessop, Nov 14 '09 at 16:09
Not a direct answer to your question, but if you use a Skip List, it is already sorted and has O(log N) search time.
Expected O(lg N) search time - but not guaranteed, as skip lists rely on randomness. If you are receiving untrusted input, be sure the supplier of the input cannot predict your RNG, or they could send you data that triggers its worst-case performance. – bdonlan, Oct 7 '09 at 2:47
As far as I know, the best sorting algorithm is O(n log n), whatever the container - it's been proved that sorting in the broad sense of the word (mergesort/quicksort etc. style) can't go lower. Using a linked list will not give you a better run time.
The only algorithm which runs in O(n) is a "hack" algorithm which relies on counting values rather than actually sorting.
It's not a hack algorithm, and it doesn't run in O(n). It runs in O(cn), where c is the largest value you're sorting (well, really it's the difference between the highest and lowest values) and only works on integral values. There's a difference between O(n) and O(cn), as unless you can give a definitive upper bound for the values you're sorting (and thus bound it by a constant), you have two factors complicating the complexity. – DivineWolfwood, Oct 6 '09 at 17:58
Strictly speaking, it runs in O(n lg c). If all of your elements are unique, then c >= n, and therefore it takes longer than O(n lg n). – bdonlan, Oct 7 '09 at 2:50
Here's an implementation that traverses the list just once, collecting runs, then schedules the merges in the same way that mergesort does.
Complexity is O(n log m) where n is the number of items and m is the number of runs. Best case is O(n) (if the data is already sorted) and worst case is O(n log n) as expected.
It requires O(log m) temporary memory; the sort is done in-place on the lists.
(Updated below: a commenter makes a good point that I should describe the algorithm here.)
The gist of the algorithm is:
while list not empty
accumulate a run from the start of the list
merge the run with a stack of merges that simulate mergesort's recursion
merge all remaining items on the stack
Accumulating runs doesn't require much explanation, but it's good to take the opportunity to accumulate both ascending runs and descending runs (reversed). Here it prepends items smaller than the head of the run and appends items greater than or equal to the end of the run. (Note that prepending should use strict less-than to preserve sort stability.)
It's easiest to just paste the merging code here:
int i = 0;
for ( ; i < stack.size(); ++i) {
    if (!stack[i])
        break;
    run = merge(run, stack[i], comp);
    stack[i] = nullptr;
}
if (i < stack.size())
    stack[i] = run;
else
    stack.push_back(run);
Consider sorting the list (d a g i b e c f j h) (ignoring runs). The stack states proceed as follows:
[ ]
[ (d) ]
[ () (a d) ]
[ (g), (a d) ]
[ () () (a d g i) ]
[ (b) () (a d g i) ]
[ () (b e) (a d g i) ]
[ (c) (b e) (a d g i ) ]
[ () () () (a b c d e f g i) ]
[ (j) () () (a b c d e f g i) ]
[ () (h j) () (a b c d e f g i) ]
Then, finally, merge all these lists.
Note that the number of items (runs) at stack[i] is either zero or 2^i and the stack size is bounded by 1+log2(nruns). Each element is merged once per stack level, hence O(n log m) comparisons. There's a passing similarity to Timsort here, though Timsort maintains its stack using something like a Fibonacci sequence where this uses powers of two.
Accumulating runs takes advantage of any already sorted data so that best case complexity is O(n) for an already sorted list (one run). Since we're accumulating both ascending and descending runs, runs will always be at least length 2. (This reduces the maximum stack depth by at least one, paying for the cost of finding the runs in the first place.) Worst case complexity is O(n log n), as expected, for data that is highly randomized.
(Um... Second update.)
Or just see the Wikipedia article on bottom-up mergesort.
Having run creation perform well with "reversed input" is a nice touch. O(log m) additional memory should not be needed - just add runs to two lists alternately until one is empty. – greybeard, Dec 9 '15 at 3:46
You can copy it into an array and then sort it.
Copying into an array is O(n),
sorting is O(n lg n) (if you use a fast algorithm like merge sort),
copying back to the linked list is O(n) if necessary,
so it is going to be O(n lg n).
Note that if you do not know the number of elements in the linked list, you won't know the size of the array. If you are coding in Java you can use an ArrayList, for example.
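A minimal C sketch of this copy-sort-copy-back approach (the node type and function names are invented for illustration): count the elements, copy the values out, qsort the array, and copy the values back.

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node;

static int cmp_int(const void *a, const void *b) {
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids overflow that x - y could cause */
}

/* O(n) count + O(n) copy out + O(n lg n) qsort + O(n) copy back. */
void sort_via_array(node *head) {
    size_t n = 0;
    for (node *it = head; it; it = it->next) n++;
    if (n < 2) return;

    int *buf = malloc(n * sizeof *buf);
    if (!buf) return;                      /* allocation failed: leave list as-is */

    size_t i = 0;
    for (node *it = head; it; it = it->next) buf[i++] = it->value;
    qsort(buf, n, sizeof *buf, cmp_int);
    i = 0;
    for (node *it = head; it; it = it->next) it->value = buf[i++];
    free(buf);
}
```

Note that this sketch sorts by rewriting node values rather than relinking nodes - a simplification that only works when nothing else depends on which payload lives in which node.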
What does this add over Jørgen Fogh's answer? – greybeard, Feb 14 at 23:00
Mergesort is the best you can do here.
See Simon Tatham's chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html – Dominic Rodger, Oct 6 '09 at 11:55
It would be a better answer if you would clarify why. – csl, Oct 6 '09 at 12:26
add a comment |
Your Answer
StackExchange.ifUsing("editor", function ()
StackExchange.using("externalEditor", function ()
StackExchange.using("snippets", function ()
StackExchange.snippets.init();
);
);
, "code-snippets");
StackExchange.ready(function()
var channelOptions =
tags: "".split(" "),
id: "1"
;
initTagRenderer("".split(" "), "".split(" "), channelOptions);
StackExchange.using("externalEditor", function()
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled)
StackExchange.using("snippets", function()
createEditor();
);
else
createEditor();
);
function createEditor()
StackExchange.prepareEditor(
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: true,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: 10,
bindNavPrevention: true,
postfix: "",
imageUploader:
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
,
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
);
);
Sign up or log in
StackExchange.ready(function ()
StackExchange.helpers.onClickDraftSave('#login-link');
);
Sign up using Google
Sign up using Facebook
Sign up using Email and Password
Post as a guest
Required, but never shown
StackExchange.ready(
function ()
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fstackoverflow.com%2fquestions%2f1525117%2fwhats-the-fastest-algorithm-for-sorting-a-linked-list%23new-answer', 'question_page');
);
Post as a guest
Required, but never shown
12 Answers
12
active
oldest
votes
12 Answers
12
active
oldest
votes
active
oldest
votes
active
oldest
votes
It is reasonable to expect that you cannot do any better than O(N log N) in running time.
However, the interesting part is to investigate whether you can sort it in-place, stably, its worst-case behavior and so on.
Simon Tatham, of Putty fame, explains how to sort a linked list with merge sort. He concludes with the following comments:
Like any self-respecting sort algorithm, this has running time O(N log N). Because this is Mergesort, the worst-case running time is still O(N log N); there are no pathological cases.
Auxiliary storage requirement is small and constant (i.e. a few variables within the sorting routine). Thanks to the inherently different behaviour of linked lists from arrays, this Mergesort implementation avoids the O(N) auxiliary storage cost normally associated with the algorithm.
There is also an example implementation in C that work for both singly and doubly linked lists.
As @Jørgen Fogh mentions below, big-O notation may hide some constant factors that can cause one algorithm to perform better because of memory locality, because of a low number of items, etc.
2
This is not for single linked list. His C code is using *prev and *next.
– L.E.
Apr 27 '13 at 5:35
2
@L.E. It's actually for both. If you see the signature forlistsort
, you'll see you can switch by using the parameterint is_double
.
– csl
Oct 21 '13 at 14:07
1
@L.E.: here's a Python version of thelistsort
C code that supports only singly-linked lists
– jfs
Nov 2 '13 at 7:26
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits /size of object you are sorting), it might be a bit faster
– Adam
Jul 14 '17 at 14:32
add a comment |
It is reasonable to expect that you cannot do any better than O(N log N) in running time.
However, the interesting part is to investigate whether you can sort it in-place, stably, its worst-case behavior and so on.
Simon Tatham, of Putty fame, explains how to sort a linked list with merge sort. He concludes with the following comments:
Like any self-respecting sort algorithm, this has running time O(N log N). Because this is Mergesort, the worst-case running time is still O(N log N); there are no pathological cases.
Auxiliary storage requirement is small and constant (i.e. a few variables within the sorting routine). Thanks to the inherently different behaviour of linked lists from arrays, this Mergesort implementation avoids the O(N) auxiliary storage cost normally associated with the algorithm.
There is also an example implementation in C that work for both singly and doubly linked lists.
As @Jørgen Fogh mentions below, big-O notation may hide some constant factors that can cause one algorithm to perform better because of memory locality, because of a low number of items, etc.
2
This is not for single linked list. His C code is using *prev and *next.
– L.E.
Apr 27 '13 at 5:35
2
@L.E. It's actually for both. If you see the signature forlistsort
, you'll see you can switch by using the parameterint is_double
.
– csl
Oct 21 '13 at 14:07
1
@L.E.: here's a Python version of thelistsort
C code that supports only singly-linked lists
– jfs
Nov 2 '13 at 7:26
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits /size of object you are sorting), it might be a bit faster
– Adam
Jul 14 '17 at 14:32
add a comment |
It is reasonable to expect that you cannot do any better than O(N log N) in running time.
However, the interesting part is to investigate whether you can sort it in-place, stably, its worst-case behavior and so on.
Simon Tatham, of Putty fame, explains how to sort a linked list with merge sort. He concludes with the following comments:
Like any self-respecting sort algorithm, this has running time O(N log N). Because this is Mergesort, the worst-case running time is still O(N log N); there are no pathological cases.
Auxiliary storage requirement is small and constant (i.e. a few variables within the sorting routine). Thanks to the inherently different behaviour of linked lists from arrays, this Mergesort implementation avoids the O(N) auxiliary storage cost normally associated with the algorithm.
There is also an example implementation in C that work for both singly and doubly linked lists.
As @Jørgen Fogh mentions below, big-O notation may hide some constant factors that can cause one algorithm to perform better because of memory locality, because of a low number of items, etc.
It is reasonable to expect that you cannot do any better than O(N log N) in running time.
However, the interesting part is to investigate whether you can sort it in-place, stably, its worst-case behavior and so on.
Simon Tatham, of Putty fame, explains how to sort a linked list with merge sort. He concludes with the following comments:
Like any self-respecting sort algorithm, this has running time O(N log N). Because this is Mergesort, the worst-case running time is still O(N log N); there are no pathological cases.
Auxiliary storage requirement is small and constant (i.e. a few variables within the sorting routine). Thanks to the inherently different behaviour of linked lists from arrays, this Mergesort implementation avoids the O(N) auxiliary storage cost normally associated with the algorithm.
There is also an example implementation in C that work for both singly and doubly linked lists.
As @Jørgen Fogh mentions below, big-O notation may hide some constant factors that can cause one algorithm to perform better because of memory locality, because of a low number of items, etc.
edited Feb 3 '16 at 8:52
answered Oct 6 '09 at 12:05
cslcsl
8,84634482
8,84634482
2
This is not for single linked list. His C code is using *prev and *next.
– L.E.
Apr 27 '13 at 5:35
2
@L.E. It's actually for both. If you see the signature forlistsort
, you'll see you can switch by using the parameterint is_double
.
– csl
Oct 21 '13 at 14:07
1
@L.E.: here's a Python version of thelistsort
C code that supports only singly-linked lists
– jfs
Nov 2 '13 at 7:26
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits /size of object you are sorting), it might be a bit faster
– Adam
Jul 14 '17 at 14:32
add a comment |
2
This is not for single linked list. His C code is using *prev and *next.
– L.E.
Apr 27 '13 at 5:35
2
@L.E. It's actually for both. If you see the signature forlistsort
, you'll see you can switch by using the parameterint is_double
.
– csl
Oct 21 '13 at 14:07
1
@L.E.: here's a Python version of thelistsort
C code that supports only singly-linked lists
– jfs
Nov 2 '13 at 7:26
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits /size of object you are sorting), it might be a bit faster
– Adam
Jul 14 '17 at 14:32
2
2
This is not for single linked list. His C code is using *prev and *next.
– L.E.
Apr 27 '13 at 5:35
This is not for single linked list. His C code is using *prev and *next.
– L.E.
Apr 27 '13 at 5:35
2
2
@L.E. It's actually for both. If you see the signature for
listsort
, you'll see you can switch by using the parameter int is_double
.– csl
Oct 21 '13 at 14:07
@L.E. It's actually for both. If you see the signature for
listsort
, you'll see you can switch by using the parameter int is_double
.– csl
Oct 21 '13 at 14:07
1
1
@L.E.: here's a Python version of the
listsort
C code that supports only singly-linked lists– jfs
Nov 2 '13 at 7:26
@L.E.: here's a Python version of the
listsort
C code that supports only singly-linked lists– jfs
Nov 2 '13 at 7:26
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits /size of object you are sorting), it might be a bit faster
– Adam
Jul 14 '17 at 14:32
O(kn) is theoretically linear, and can be achieved with bucket sort. Assuming a reasonable k (number of bits /size of object you are sorting), it might be a bit faster
– Adam
Jul 14 '17 at 14:32
add a comment |
Depending on a number of factors, it may actually be faster to copy the list to an array and then use a Quicksort.
The reason this might be faster is that an array has much better
cache performance than a linked list. If the nodes in the list are dispersed in memory, you
may be generating cache misses all over the place. Then again, if the array is large you will get cache misses anyway.
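The copy-sort-copy idea can be sketched like this (a rough illustration with an assumed node layout, not the benchmark code from this answer):

```c
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *next;
} node_t;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);   /* avoids the overflow risk of returning x - y */
}

/* Copy the list values into an array, qsort the array, write the values back. */
void sort_via_array(node_t *head, size_t n)
{
    int *buf = malloc(n * sizeof *buf);
    if (!buf) return;

    size_t i = 0;
    for (node_t *it = head; it; it = it->next)
        buf[i++] = it->value;

    qsort(buf, n, sizeof *buf, cmp_int);

    i = 0;
    for (node_t *it = head; it; it = it->next)
        it->value = buf[i++];

    free(buf);
}
```

Note that this variant leaves the node order intact and only permutes the payloads; if other code holds pointers to specific nodes and expects their values to stay put, you would relink nodes instead of copying values.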
Mergesort parallelises better, so it may be a better choice if that is what you want. It is also much faster if you perform it directly on the linked list.
Since both algorithms run in O(n * log n), making an informed decision would involve profiling them both on the machine you would like to run them on.
--- EDIT
I decided to test my hypothesis and wrote a C program which measured the time (using clock()) taken to sort a linked list of ints. I tried with a linked list where each node was allocated with malloc() and a linked list where the nodes were laid out linearly in an array, so the cache performance would be better. I compared these with the built-in qsort, which included copying everything from a fragmented list to an array and copying the result back again. Each algorithm was run on the same 10 data sets and the results were averaged.
These are the results:
N = 1000:
Fragmented list with merge sort: 0.000000 seconds
Array with qsort: 0.000000 seconds
Packed list with merge sort: 0.000000 seconds
N = 100000:
Fragmented list with merge sort: 0.039000 seconds
Array with qsort: 0.025000 seconds
Packed list with merge sort: 0.009000 seconds
N = 1000000:
Fragmented list with merge sort: 1.162000 seconds
Array with qsort: 0.420000 seconds
Packed list with merge sort: 0.112000 seconds
N = 100000000:
Fragmented list with merge sort: 364.797000 seconds
Array with qsort: 61.166000 seconds
Packed list with merge sort: 16.525000 seconds
Conclusion:
At least on my machine, copying into an array is well worth it to improve the cache performance, since you rarely have a completely packed linked list in real life. It should be noted that my machine has a 2.8GHz Phenom II, but only 0.6GHz RAM, so the cache is very important.
Good comments, but you should consider the non-constant cost of copying the data from a list to an array (you'd have to traverse the list), as well as the worst case running time for quicksort.
– csl
Oct 6 '09 at 15:24
O(n * log n) is theoretically the same as O(n * log n + n), which would be including the cost of the copy. For any sufficiently large n, the cost of the copy really shouldn't matter; traversing a list once to the end should be n time.
– Dean J
Oct 6 '09 at 15:31
@DeanJ: Theoretically, yes, but remember that the original poster is putting forth the case where micro-optimizations matter. And in that case, the time spent turning a linked-list into an array has to be considered. The comments are insightful, but I'm not completely convinced it would provide performance gain in reality. It might work for a very small N, perhaps.
– csl
Oct 8 '09 at 7:37
@csl: Actually, I'd expect the benefits of locality to kick in for large N. Assuming that cache misses are the dominant performance effect, then the copy-qsort-copy approach results in about 2*N cache misses for the copying, plus the number of misses for the qsort, which will be a small fraction of Nlog(N) (since most accesses in qsort are to an element close to a recently-accessed element). The number of misses for the merge sort is a larger fraction of Nlog(N), since a higher proportion of comparisons cause a cache miss. So for large N, this term dominates and slows down mergesort.
– Steve Jessop
Nov 14 '09 at 15:57
@Steve: You are right that qsort is not a drop-in replacement, but my point is not really about qsort vs. mergesort. I just didn't feel like writing another version of the mergesort when qsort was readily available. The standard library is way more convenient than rolling your own.
– Jørgen Fogh
Nov 14 '09 at 21:40
answered Oct 6 '09 at 12:57, edited Oct 8 '09 at 13:57
– Jørgen Fogh
Comparison sorts (i.e. ones based on comparing elements) cannot possibly be faster than n log n. It doesn't matter what the underlying data structure is. See Wikipedia.
Other kinds of sort that take advantage of there being lots of identical elements in the list (such as the counting sort), or some expected distribution of elements in the list, are faster, though I can't think of any that work particularly well on a linked list.
answered Oct 6 '09 at 12:01
– Artelius
As stated many times, the lower bound on comparison-based sorting for general data is going to be O(n log n). To briefly resummarize the argument: there are n! different orders a list of n elements can be in, so any comparison decision tree must have at least n! leaves to distinguish them, and therefore a height of at least log(n!). Since log(n!) is Θ(n log n), that gives the n log n lower bound.
So, for general data on a linked list, the best possible sort that will work on any data that can compare two objects is going to be O(n log n). However, if you have a more limited domain of things to work in, you can improve the time it takes (at least proportional to n). For instance, if you are working with integers no larger than some value, you could use Counting Sort or Radix Sort, as these exploit the specific objects you're sorting to reduce the complexity in proportion to n. Be careful, though: these add some other terms to the complexity that you may not consider (for instance, Counting Sort and Radix Sort both add in factors based on the size of the numbers you're sorting, e.g. O(n + k) where k is the range of the key values, for Counting Sort).
Also, if you happen to have objects that have a perfect hash (or at least a hash that maps all values differently), you could try using a counting or radix sort on their hash functions.
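For illustration, here is a minimal array-based counting sort (a generic sketch, not code from this thread), showing where the O(n + k) terms come from:

```c
#include <stdlib.h>

/* Counting sort for values in [0, k). O(n + k) time, O(k) extra space. */
void counting_sort(int *a, size_t n, int k)
{
    size_t *count = calloc((size_t)k, sizeof *count);
    if (!count) return;

    for (size_t i = 0; i < n; i++)
        count[a[i]]++;              /* O(n): histogram of the values */

    size_t out = 0;
    for (int v = 0; v < k; v++)     /* O(n + k): emit each value count[v] times */
        for (size_t c = 0; c < count[v]; c++)
            a[out++] = v;

    free(count);
}
```

The same bucketing idea transfers to a linked list by keeping a list per bucket instead of a count, as the histogram-sort answer below this one does.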
answered Oct 6 '09 at 18:12
– DivineWolfwood
This is a nice little paper on this topic. His empirical conclusion is that Treesort is best, followed by Quicksort and Mergesort. Sediment sort, bubble sort, selection sort perform very badly.
A COMPARATIVE STUDY OF LINKED LIST SORTING ALGORITHMS
by Ching-Kuang Shene
http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.31.9981
answered Dec 29 '10 at 17:56
– Neal Richter
A Radix sort is particularly suited to a linked list, since it's easy to make a table of head pointers corresponding to each possible value of a digit.
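To illustrate the idea (the names and node layout here are mine, not from the answer): an LSD radix pass keeps a table of 256 head and tail pointers, one per possible byte value, distributes the nodes into those buckets, then splices the buckets back together. Repeating this for each byte, from least to most significant, sorts the list, because appending at the tail keeps each pass stable.

```c
#include <stddef.h>

typedef struct rnode {
    unsigned int key;
    struct rnode *next;
} rnode_t;

/* LSD radix sort on a singly linked list, one byte of the key per pass. */
rnode_t *radix_sort_list(rnode_t *list)
{
    for (size_t shift = 0; shift < sizeof(unsigned int) * 8; shift += 8) {
        rnode_t *heads[256] = {0};
        rnode_t *tails[256] = {0};

        /* Distribute nodes into buckets by the current byte (stable: append at tail). */
        for (rnode_t *it = list; it; ) {
            rnode_t *next = it->next;
            unsigned int b = (it->key >> shift) & 0xFF;
            it->next = NULL;
            if (tails[b]) tails[b]->next = it;
            else          heads[b] = it;
            tails[b] = it;
            it = next;
        }

        /* Splice the buckets back together in ascending byte order. */
        rnode_t *head = NULL, *tail = NULL;
        for (int b = 0; b < 256; b++) {
            if (!heads[b]) continue;
            if (tail) tail->next = heads[b];
            else      head = heads[b];
            tail = tails[b];
        }
        list = head;
    }
    return list;
}
```

Each pass is O(n) plus a constant 256-bucket splice, and the number of passes depends on the key width, which is where the O(kn) bound mentioned elsewhere in this thread comes from.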
Can you please explain more on this topic or give any resource link for radix sort in linked list.
– LoveToCode
Aug 19 '15 at 20:37
answered Oct 6 '09 at 18:25
– Mark Ransom
Merge sort doesn't require O(1) access and is O ( n ln n ). No known algorithms for sorting general data are better than O ( n ln n ).
The special data algorithms such as radix sort ( limits size of data ) or histogram sort ( counts discrete data ) could sort a linked list with a lower growth function, as long as you use a different structure with O(1) access as temporary storage.
Another class of special data is a comparison sort of an almost sorted list with k elements out of order. This can be sorted in O ( kn ) operations.
Copying the list to an array and back would be O(N), so any sorting algorithm can be used if space is not an issue.
For example, given a linked list containing uint8_t values, this code will sort it in O(N) time using a histogram sort:
#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct _list list_t;
struct _list
{
    uint8_t value;
    list_t *next;
};

list_t* sort_list ( list_t* list )
{
    // one head/tail pair per possible uint8_t value
    list_t* heads[256] = { 0 };
    list_t* tails[256] = { 0 };

    // O(N) loop: append each node to the bucket for its value
    for ( list_t* it = list; it != 0; it = it -> next )
    {
        if ( heads[ it -> value ] == 0 )
            heads[ it -> value ] = it;
        else
            tails[ it -> value ] -> next = it;
        tails[ it -> value ] = it;
    }

    // constant time (256 iteration) loop: concatenate the buckets in order
    list_t* result = 0;
    for ( size_t i = 256; i-- > 0; )
    {
        if ( tails[i] )
        {
            tails[i] -> next = result;
            result = heads[i];
        }
    }
    return result;
}

list_t* make_list ( char* string )
{
    list_t head;
    for ( list_t* it = &head; *string; it = it -> next, ++string )
    {
        it -> next = malloc ( sizeof ( list_t ) );
        it -> next -> value = ( uint8_t ) *string;
        it -> next -> next = 0;
    }
    return head.next;
}

void free_list ( list_t* list )
{
    for ( list_t* it = list; it != 0; )
    {
        list_t* next = it -> next;
        free ( it );
        it = next;
    }
}

void print_list ( list_t* list )
{
    printf ( "[ " );
    if ( list )
    {
        printf ( "%c", list -> value );
        for ( list_t* it = list -> next; it != 0; it = it -> next )
            printf ( ", %c", it -> value );
    }
    printf ( " ]\n" );
}

int main ( int nargs, char** args )
{
    list_t* list = make_list ( nargs > 1 ? args[1] : "wibble" );
    print_list ( list );
    list_t* sorted = sort_list ( list );
    print_list ( sorted );
    free_list ( sorted ); // the nodes were relinked, so free from the sorted head
    return 0;
}
It's been proven that no comparison-based sort algorithms exist that are faster than n log n.
– Artelius
Oct 6 '09 at 11:58
No, it's been proven that no comparison-based sort algorithms on general data are faster than n log n.
– Pete Kirkham
Oct 6 '09 at 11:59
No, any sort algorithm faster than O(n lg n) would not be comparison-based (eg, radix sort). By definition, comparison sort applies to any domain that has a total order (ie, can be compared).
– bdonlan
Oct 7 '09 at 2:48
@bdonlan the point of "general data" is that there are algorithms which are faster for constrained input, rather than random input. At the limiting case, you can write a trivial O(1) algorithm which sorts a list given the input data is constrained to be already sorted.
– Pete Kirkham
Oct 7 '09 at 9:11
And that would not be a comparison-based sort. The modifier "on general data" is redundant, since comparison sorts already handle general data (and the big-O notation is for the number of comparisons made).
– Steve Jessop
Nov 14 '09 at 16:09
edited Oct 6 '09 at 13:49
answered Oct 6 '09 at 11:56
Pete Kirkham
Not a direct answer to your question, but if you use a Skip List, it is already sorted and has O(log N) search time.
expected O(lg N) search time - but not guaranteed, as skip lists rely on randomness. If you are receiving untrusted input, be sure the supplier of the input cannot predict your RNG, or they could send you data that triggers its worst case performance.
– bdonlan
Oct 7 '09 at 2:47
answered Oct 6 '09 at 11:53
Mitch Wheat
As far as I know, the best sorting algorithm is O(n log n), whatever the container - it's been proved that sorting in the broad sense of the word (mergesort/quicksort etc style) can't go lower. Using a linked list will not give you a better run time.
The only algorithm which runs in O(n) is a "hack" algorithm which relies on counting values rather than actually sorting.
It's not a hack algorithm, and it doesn't run in O(n). It runs in O(cn), where c is the largest value you're sorting (well, really it's the difference between the highest and lowest values) and only works on integral values. There's a difference between O(n) and O(cn), as unless you can give a definitive upper bound for the values you're sorting (and thus bound it by a constant), you have two factors complicating the complexity.
– DivineWolfwood
Oct 6 '09 at 17:58
Strictly speaking, it runs in O(n lg c). If all of your elements are unique, then c >= n, and therefore it takes longer than O(n lg n).
– bdonlan
Oct 7 '09 at 2:50
edited Oct 6 '09 at 12:00
answered Oct 6 '09 at 11:54
laura
Here's an implementation that traverses the list just once, collecting runs, then schedules the merges in the same way that mergesort does.
Complexity is O(n log m) where n is the number of items and m is the number of runs. Best case is O(n) (if the data is already sorted) and worst case is O(n log n) as expected.
It requires O(log m) temporary memory; the sort is done in-place on the lists.
(Updated below: a commenter makes a good point that I should describe it here.)
The gist of the algorithm is:
while list not empty
    accumulate a run from the start of the list
    merge the run with a stack of merges that simulate mergesort's recursion
merge all remaining items on the stack
Accumulating runs doesn't require much explanation, but it's good to take the opportunity to accumulate both ascending runs and descending runs (reversed). Here it prepends items smaller than the head of the run and appends items greater than or equal to the end of the run. (Note that prepending should use strict less-than to preserve sort stability.)
It's easiest to just paste the merging code here:
int i = 0;
for ( ; i < stack.size(); ++i) {
    if (!stack[i])
        break;
    run = merge(run, stack[i], comp);
    stack[i] = nullptr;
}
if (i < stack.size()) {
    stack[i] = run;
} else {
    stack.push_back(run);
}
Consider sorting the list (d a g i b e c f j h) (ignoring runs). The stack states proceed as follows:
[ ]
[ (d) ]
[ () (a d) ]
[ (g), (a d) ]
[ () () (a d g i) ]
[ (b) () (a d g i) ]
[ () (b e) (a d g i) ]
[ (c) (b e) (a d g i ) ]
[ () () () (a b c d e f g i) ]
[ (j) () () (a b c d e f g i) ]
[ () (h j) () (a b c d e f g i) ]
Then, finally, merge all these lists.
Note that the number of items (runs) at stack[i] is either zero or 2^i and the stack size is bounded by 1+log2(nruns). Each element is merged once per stack level, hence O(n log m) comparisons. There's a passing similarity to Timsort here, though Timsort maintains its stack using something like a Fibonacci sequence where this uses powers of two.
Accumulating runs takes advantage of any already sorted data so that best case complexity is O(n) for an already sorted list (one run). Since we're accumulating both ascending and descending runs, runs will always be at least length 2. (This reduces the maximum stack depth by at least one, paying for the cost of finding the runs in the first place.) Worst case complexity is O(n log n), as expected, for data that is highly randomized.
(Um... Second update.)
Or just see wikipedia on bottom-up mergesort.
Having run creation perform well with "reversed input" is a nice touch. O(log m) additional memory should not be needed - just add runs to two lists alternately until one is empty.
– greybeard
Dec 9 '15 at 3:46
edited Dec 10 '15 at 3:02
answered Dec 8 '15 at 19:02
Stan Switzer
You can copy it into an array and then sort the array.
Copying into the array is O(n),
sorting is O(n log n) (with a fast algorithm like merge sort),
copying back to the linked list is O(n) if necessary,
so the whole thing is O(n log n).
Note that if you do not know the number of elements in the linked list, you won't know the size of the array up front. If you are coding in Java you can use an ArrayList, for example.
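The copy-out approach above can be sketched in a few lines. This is a minimal Python illustration (the Node class is hypothetical, not from the answer); it sidesteps the unknown-length problem by growing the array as it walks the list, then writes the sorted values back over the existing nodes.

```python
class Node:
    def __init__(self, val, nxt=None):
        self.val, self.next = val, nxt

def sort_via_array(head):
    # 1. Copy values into an array: O(n); no need to know the length up front.
    values = []
    node = head
    while node:
        values.append(node.val)
        node = node.next
    # 2. Sort the array: O(n log n).
    values.sort()
    # 3. Write the sorted values back over the existing nodes: O(n).
    node = head
    for v in values:
        node.val = v
        node = node.next
    return head
```

Note this rewrites node values in place rather than relinking nodes, which only matters if other code holds references to particular nodes.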
What does this add over Jørgen Fogh's answer?
– greybeard
Feb 14 at 23:00
edited Feb 14 at 22:49
hce
639317
answered Feb 14 at 21:38
Shirin
111
Mergesort is the best you can do here.
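For concreteness, here is a sketch (Python, with a hypothetical Node class; not code from this answer) of the standard top-down merge sort on a singly linked list: split with slow/fast pointers, sort each half recursively, and merge by relinking nodes.

```python
class Node:
    def __init__(self, val, nxt=None):
        self.val, self.next = val, nxt

def merge_sort(head):
    if head is None or head.next is None:
        return head
    # Split: `fast` advances two steps per one step of `slow`,
    # so `slow` stops at the end of the first half.
    slow, fast = head, head.next
    while fast and fast.next:
        slow, fast = slow.next, fast.next.next
    mid, slow.next = slow.next, None
    left, right = merge_sort(head), merge_sort(mid)
    # Merge the two sorted halves by relinking the existing nodes.
    dummy = tail = Node(None)
    while left and right:
        if left.val <= right.val:
            tail.next, left = left, left.next
        else:
            tail.next, right = right, right.next
        tail = tail.next
    tail.next = left or right
    return dummy.next
```

Unlike quicksort, this needs no random access, and the merge relinks nodes rather than copying data, which is why merge sort suits linked lists so well.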
9
See Simon Tatham's chiark.greenend.org.uk/~sgtatham/algorithms/listsort.html
– Dominic Rodger
Oct 6 '09 at 11:55
11
It would be a better answer if you would clarify why.
– csl
Oct 6 '09 at 12:26
edited Oct 7 '09 at 2:34
answered Oct 6 '09 at 11:53
ypnos
37.7k1378114
28
Just so that you know, O(n log n) is the bound for comparison-based sorts. There are non-comparison-based sorts that can give O(n) performance (e.g. counting sort), but they require additional constraints on the data.
– MAK
Oct 6 '09 at 15:50
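The point in that last comment can be sketched for a linked list too. Assuming the constraint the comment mentions, namely values are small non-negative integers below a known bound k, a counting sort over the list runs in O(n + k) (Python sketch with a hypothetical Node class):

```python
class Node:
    def __init__(self, val, nxt=None):
        self.val, self.next = val, nxt

def counting_sort_list(head, k):
    """Sort a list of ints in range [0, k) in O(n + k) time."""
    counts = [0] * k
    node = head
    while node:                      # one pass to count each value
        counts[node.val] += 1
        node = node.next
    node = head
    for v in range(k):               # rewrite node values in sorted order
        for _ in range(counts[v]):
            node.val = v
            node = node.next
    return head
```

This beats the O(n log n) comparison bound only because it never compares elements; it exploits the restricted key range instead.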