I am really sorry for going off-topic, hope you will not kill me, but
this is a vitally important problem for me which needs to be solved
within the next 12 hours...
I have to create a stable algorithm for sorting n numbers from the
interval [1, n^2] with time complexity O(n).
Can someone please give me a hint? I would be very, very thankful!
Thanks in advance!
D.
Unless I grossly miss out on something from Computer Science 101, the
lower bound for sorting is O(n * log_2 n), which makes your task
impossible unless there is something to be assumed about the
distribution of numbers in your sequence.
Who has given you that assignment - a professor? Or some friend who's
playing tricks on you?
Diez
There are n numbers from the interval [1, n^2].
I should do measurements for:
n = 500, 1000, 1500, 2000, ..., 4500, 5000
O(n) means linear complexity, so the complexity of the algorithm must
be linear, which I have to prove.
>
> Who has given you that assignment - a professor? Or some friend who's
> playing tricks on you?
It is an official assignment, by a professor from a Data Structures &
Algorithms course.
Thanks in advance!
> I have to create a stable algorithm for sorting n numbers from the
> interval [1, n^2] with time complexity O(n).
Some kind of radix sort or counting sort. These algorithms have O(n)
complexity.
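For instance, here is a minimal counting sort sketch (my own
illustration, not from the assignment), assuming non-negative integers
with a known maximum m, giving O(n + m) time:

def counting_sort(numbers, m):
    """Sort integers in [0, m] in O(n + m) time, without comparisons."""
    counts = [0] * (m + 1)
    for x in numbers:          # O(n): tally each value
        counts[x] += 1
    # O(m): read the values back out in order
    return [x for x in xrange(m + 1) for _ in xrange(counts[x])]

Note this simple form works for plain integers; to sort records stably
by an integer key you would keep per-key buckets (or prefix sums)
instead of bare counts.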
w.
This is only true for comparison-based sorts.
> There are n numbers from the interval [1, n^2].
> I should do measurements for:
> n = 500, 1000, 1500, 2000, ..., 4500, 5000
> [...]
> It is an official assignment, by a professor from a Data Structures &
> Algorithms course.
Look at things like bucket sort and radix sort. You can also do
tricky things like translating your problem domain by making a fixed
number of linear-time passes without affecting the asymptotic run
time.
(Hopefully that's helpful... don't want to give too much away on a
homework assignment, plus tricky algorithms have never been my strong
suit.)
I think you must have fallen asleep during CS101. The lower bound for
sorting where you make a two-way branch at each step is O(n * log_2 n),
but if you can choose between k possible orderings in a single
comparison you can get O(n * log_k n).
To beat n * log_2 n just use a bucket sort: for numbers with a known
maximum you can sort them digit by digit for O(n * log_k n), and if you
don't restrict yourself to decimal then k can be as large as you want.
For the problem in question I think you can set k=n and (log_n n)==1,
so you get O(n), i.e. bucket sort the numbers digit by digit in base n.
Something like (untested):
n = len(data)
buckets1 = [[] for i in range(n)]
buckets2 = [[] for i in range(n)]
for x in data:
    buckets1[x % n].append(x)
for x in (v for b in buckets1 for v in reversed(b)):
    buckets2[x // n].append(x)
for x in (v for b in buckets2 for v in b):
    print x
All loops are order n (the last two have about 2*n steps).
I lied about the untested:
>>> def f(data):
...     n = len(data)
...     buckets1 = [[] for i in range(n)]
...     buckets2 = [[] for i in range(n)]
...     for x in data:
...         buckets1[x % n].append(x)
...     for x in (v for b in buckets1 for v in reversed(b)):
...         buckets2[x // n].append(x)
...     for x in (v for b in buckets2 for v in b):
...         print x
>>> import random
>>> data = [ random.randint(1,100) for i in range(10)]
>>> f(data)
1
16
30
35
70
70
75
76
82
99
Thanks. I *totally* forgot about that.
Diez
Minor detail: with k = n, you have log_n(n^2) = 2, so you get
O(2n) = O(n). Same answer.
The generic sorting theorems also assume you can compare arbitrarily
large numbers in constant time, which isn't true. That is, for any
given machine, there exist numbers that it cannot compare in constant
time. But with a known upper bound k, you can just use a k-bit machine.
<rhetorical>So, what's the group policy on helping with
homework?</rhetorical>
> <rhetorical>So, what's the group policy on helping with
> homework?</rhetorical>
In my book, anyone who is dumb enough to ask for homework help on a
newsgroup and doesn't acknowledge that when they hand in their answer
deserves whatever they get. Oh, and of course there are 2 deliberate
errors in the 'solution' I posted, so they're not going to get full
marks if they simply reuse it. (If there are more than 2 errors then at
least one of them is not deliberate.)
That's assuming the teacher is savvy enough to keep an eye on the obvious
sources for quick answers.
I don't have an algorithm but I have an implementation :)
def sort2(numbers):
    "sort n positive integers in O(n) provided that they are all < n^2"
    N = len(numbers)
    units = [[] for i in xrange(N)]
    tens = [[] for i in xrange(N)]
    for n in numbers:
        units[n % N].append(n)
    for l in units:
        for n in l:
            tens[n / N].append(n)
    return [n for l in tens for n in l]
--
Arnaud
> I think you must have fallen asleep during CS101. The lower bound for
> sorting where you make a two-way branch at each step is O(n * log_2 n),
> but if you can choose between k possible orderings in a single
> comparison you can get O(n * log_k n).
I think you might have been sleeping through Maths 101 :-)
The difference between log_2 N and log_k N is a constant factor
(log_2 k) and so doesn't affect the big-oh complexity.
--
Steven
The LSD radix sort is stable.
Good luck.
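For reference, here is a minimal generic LSD radix sort sketch (my own
illustration; the function name and base parameter are mine), assuming
a non-empty list of non-negative integers:

def lsd_radix_sort(numbers, base=10):
    """Stable LSD radix sort: O(d * (n + base)) for d-digit numbers."""
    place = 1
    while place <= max(numbers):
        buckets = [[] for _ in xrange(base)]
        for x in numbers:
            # appending preserves incoming order, so each pass is stable
            buckets[(x // place) % base].append(x)
        numbers = [x for b in buckets for x in b]
        place *= base
    return numbers

With base = n and numbers from [1, n^2] this makes at most three passes
(n^2 itself is "100" in base n), so it runs in O(n) overall.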
It affects it if k is a function of n. In this particular example, we
can set k=n so we get O(n).
--
Arnaud
Here it comes: bucket sort with time complexity O(n), i.e. linear
complexity.
#! /usr/bin/python
def sort(numbers):
    "sort n positive integers in O(n) provided that they are all from interval [1, n^2]"
    N = len(numbers)  # get size of test numbers
    buckets_mod = [[] for i in xrange(N)]
    buckets_sorted = [[] for i in xrange(N+1)]
    # group numbers into buckets (lists of numbers) with a common modulus
    for n in numbers:
        buckets_mod[n % N].append(n)
    print "buckets_mod: %s" % buckets_mod
    # check numbers in buckets
    for l in buckets_mod:
        for n in l:
            # place number into the bucket selected by the result of division
            buckets_sorted[n / N].append(n)
    print "buckets_sorted: %s" % buckets_sorted
    # walk through the sorted buckets and return the list of sorted numbers
    return [n for l in buckets_sorted for n in l]
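A quick sanity check (a hypothetical driver of my own, not part of the
assignment; keep N small because of the debugging prints):

import random

N = 10
data = [random.randint(1, N * N) for _ in xrange(N)]
assert sort(data) == sorted(data)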
Regards,
David
> "Diez B. Roggisch" <de...@nospam.web.de> wrote:
>
>> David HlÃ¡Ä ik schrieb:
>>> Hi guys,
>>>
>>> i am really sorry for making offtopic, hope you will not kill me, but
>>> this is for me life important problem which needs to be solved within
>>> next 12 hours..
>>>
>>> I have to create stable algorithm for sorting n numbers from interval
>>> [1,n^2] with time complexity O(n) .
>>>
>>> Can someone please give me a hint. Would be very very thankful!
>>
>> Unless I grossly miss out on something in computer science 101, the
>> lower bound for sorting is O(n * log_2 n). Which makes your task
>> impossible, unless there is something to be assumed about the
>> distribution of numbers in your sequence.
>>
>> Who has given you that assignment - a professor? Or some friend who's
>> playing tricks on you?
>>
> I think you must have fallen asleep during CS101. The lower bound for
> sorting where you make a two way branch at each step is O(n * log_2 n),
> but if you can choose between k possible orderings in a single
> comparison you can get O(n * log_k n).
>
> To beat n * log_2 n just use a bucket sort: for numbers with a known
> maximum you can sort them digit by digit for O(n log_k n), and if you
> don't restrict yourself to decimal then k can be as large as you want,
> so for the problem in question I think you can set k=n and (log_n n)==1
> so you get O(n)
>
I'm sure someday there will be a student who comes to class late and
sees this on the board: "Design a comparison sorting algorithm that has
a better than O(n * log n) lower-bound complexity." The unsuspecting
student copies it, thinking it's homework. He crunches away at the
problem, going to various meditation and yoga classes, before finding a
way that works just before the deadline and handing in the work a bit
late. Six weeks later, his professor calls and says: "You know what you
just did? You've just solved a problem that was supposed to be an
example of an unsolvable problem."
It has happened before, why not again?
http://www.snopes.com/college/homework/unsolvable.asp
There's a big difference between an unsolvable problem and an
unsolved problem. In the cases you're talking about, nobody
had solved the problem before, but neither had anybody proved
there was no solution.
In the case at hand, there is a proof that such an algorithm
is impossible. Overturning that would require finding a
flaw in the proof, which for such a simple proof seems very
unlikely.
That's not to say nobody should try, but I won't be holding
my breath.
--
Greg
It's more likely you'd circumvent the assumptions. That is, find an
'outside-the-box' solution. For instance, a deeply parallel
architecture could escape the assumption that you can only compare two
numbers at a time (per step).
The proof's conclusion is still true if its assumption is.
> I'm sure someday there will be a student who comes to class late and
> sees this on the board: "Design a comparison sorting algorithm that
> has a better than O(n * log n) lower-bound complexity." [...]
> It has happened before, why not again?
> http://www.snopes.com/college/homework/unsolvable.asp
Because, as you described it, it *hasn't* happened before. There is a
world of difference between an unsolvABLE problem and one that is
unsolvED.
All the positive thinking in the world won't help you:
* make a four-sided triangle;
* write down a completely accurate rational expansion for pi or the
square-root of 2;
* split a magnet into two individual poles;
* find an integer solution to 3*n = 7;
* solve the Halting Problem;
* fit n items into n-1 pigeonholes without putting two items in the one
pigeonhole;
* create a compression algorithm that will compress random data;
* or design a comparison sort which does fewer than O(n*log n) two-way
comparisons in the worst case, or fewer than O(n) comparisons in the best
case.
Some things really don't have a solution, no matter how much power of
positive thinking you apply to them.
--
Steven
> All the positive thinking in the world won't help you:
>
> * make a four-sided triangle;
>
> * split a magnet into two individual poles;
These two are fundamentally different problems.
The first is impossible by definition. The definition of triangle is, "a
three-sided polygon". Asking for a "four-sided triangle" is akin to asking
for "a value of three which is equal to four".
The second is only "impossible" because it contradicts our understanding
(based on observation) of how the physical universe works. Our
understanding could simply be wrong. We've certainly been wrong before,
and we will undoubtedly be proven wrong again in the future. When it comes
to things like electromagnetic theory, it doesn't take too many steps to
get us to the fuzzy edge of quantum physics where we know there are huge
questions yet to be answered.
I agree. Most of his examples were tautologies. The magnet one was
the exception.
Then, to beat the O( n lg n ) limit, just break an assumption.
> > * or design a comparison sort which does fewer than O(n*log n) two-way
> > comparisons in the worst case, or fewer than O(n) comparisons in the best
> > case.
Make a three-way comparison, make a non-comparison sort, or make non-
random inputs.
>
> I agree. Most of his examples were tautologies. The magnet one was
> the exception.
A proof is nothing more than a tautology :) The fact that pi is not
rational is not trivial (but certainly has been proved for some time
now).
cheers,
David
> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>
>> All the positive thinking in the world won't help you:
>>
>> * make a four-sided triangle;
>>
>> * split a magnet into two individual poles;
>
> These two are fundamentally different problems.
>
> The first is impossible by definition. The definition of triangle is,
> "a three-sided polygon". Asking for a "four-sided triangle" is akin to
> asking for "a value of three which is equal to four".
That's right. But see below.
> The second is only "impossible" because it contradicts our understanding
> (based on observation) of how the physical universe works. Our
> understanding could simply be wrong.
And arithmetic could be inconsistent, in which case it might be possible
to prove that 3 equals 4. We don't know for sure that arithmetic is
consistent and, according to Gödel, there is no way of proving from
within arithmetic itself that it is consistent. There's no evidence that
it isn't, but then, unless the inconsistency were obvious, how would we
know?
http://www.mathpages.com/home/kmath347/kmath347.htm
> We've certainly been wrong before,
> and we will undoubtedly be proven wrong again in the future. When it
> comes to things like electromagnetic theory, it doesn't take too many
> steps to get us to the fuzzy edge of quantum physics where we know there
> are huge questions yet to be answered.
No. I worded my claim very carefully. The discovery of magnetic
monopoles, as predicted by the fuzzy end of quantum physics, would not
invalidate it. Magnets don't generate magnetic fields by the use of
monopoles, and the discovery of such wouldn't make it possible to cut an
ordinary magnet in two to get an individual north and south pole. That
would be like taking a rope with two ends (an ordinary rope, in other
words), cutting it in half, and finding that each piece has only a
single end.
Now, you could counter with a clever solution involving splicing the rope
to itself in such a way that it had one end and a loop at the other, er,
end. And such a solution might be very valuable, if we needed a way to
get a rope with a loop at one end. But it isn't solving the problem of
cutting a rope in two and getting only two ends instead of four. It's
solving a different problem.
--
Steven
> Some things really don't have a solution, no matter how much power of
> positive thinking you apply to them.
Some may, only not with the current understanding of the universe. Well,
I agree that there are a few things that are straight-out nonsense, such
as a 4-angled triangle.
And actually you've got the wrong message from the story; I hadn't
intended it to be about positive thinking (as most motivators would
exploit it). It's more about thinking outside the current popular frame
of mind. When you're not bound by everyone around you saying a task is
impossible, you might have a better chance of solving it (well, as long
as the task *is* possible).
What follows is a not-so-funny joke.
Q: Construct a 4-angled triangle.
A: Hmmm.... let's see...
What about this?
/\
/ \
/____\
Q: That's just a regular triangle
A: How about this?
_____
| |
| |
|_____|
Q: That's just a regular rectangle.
A: This?
/|\
/ | \
/__|__\
Q: Well, still no luck; it's a triangle, but it has 5 sides
A: Well... (after thinking for a moment)
How about this?
/----\
/ \
/ \
/ \
-----------------/ \-------\
----------------------------------------
Q: Perfect, just like what I wanted, but you shouldn't let the Python eat
it.
-- Le Petit Prince & aptitude & me
> On Dec 14, 8:18 pm, Roy Smith <r...@panix.com> wrote:
>> Steven D'Aprano <st...@REMOVE-THIS-cybersource.com.au> wrote:
>>> All the positive thinking in the world won't help you:
>>> * make a four-sided triangle;
>>> * split a magnet into two individual poles;
>> These two are fundamentally different problems. [...]
> I agree. Most of his examples were tautologies. The magnet one was the
> exception.
>
> Then, to beat the O( n lg n ) limit, just break an assumption.
No no no, that's solving a different problem! You're welcome to solve a
different problem if you like, and sometimes doing so is very valuable:
sometimes the original problem turns out not to be important after all.
But don't pretend that it's the same problem.
>> > * or design a comparison sort which does fewer than O(n*log n)
>> > two-way comparisons in the worst case, or fewer than O(n) comparisons
>> > in the best case.
>
> Make a three-way comparison, make a non-comparison sort, or make non-
> random inputs.
Again no. Multi-way comparisons merely change the base of the logarithm,
which in turn merely changes the constant factor. It's still
O(n*log n).
Non-comparison sorts are a useful technique, but it's changing the
problem, and they are only useful in very limited circumstances. There's
a good reason that most sort routines are based on O(n*log n) comparison
sorts instead of O(n) bucket sorts or radix sorts.
And the quip about non-random inputs just shows you don't understand the
problem you're being asked to solve. O(n*log n) is for the least number
of comparisons needed for the worst possible input. It's not for the best
possible input, or random input, or the average number of comparisons, or
any other measurement. You've misunderstood the question if you think
"give better input" is the answer.
And frankly, it's *easy* to come up with a comparison sort that has O(n)
behaviour for the best possible input. Bubble sort has O(n) behaviour for
input that is already sorted. Even the infamous bogosort has O(n)
behaviour for the best possible input:
import random

def is_sorted(alist):
    """Return True if alist is sorted, otherwise False."""
    for i in xrange(1, len(alist)):
        if alist[i-1] > alist[i]:
            return False
    return True

def bogosort(alist):
    # Shuffle alist until it happens to be sorted
    while not is_sorted(alist):
        random.shuffle(alist)
I believe bogosort has O(n!) behaviour, which is far worse than merely
exponential O(2**n) complexity. But when given already-sorted input, the
version of bogosort above is O(n). Even the archetypically Awful (not
just Bad) algorithm is amazingly fast if you happen to give it the best
possible input.
(By the way, the above code is not suitable for use in production code.
Due to limitations of the standard Python random number generator, not
all permutations of lists larger than a certain size can be generated.
The default random number generator for Python 2.5 is the Mersenne
Twister, which has a period of 2**19937-1. But the number of permutations
of n items is n!, and n! is greater than 2**19937-1 for n=2081 or more.
That means that for any list with 2081 or more items, there is a finite
probability that the above implementation of bogosort will merely cycle
repeatedly over 2**19937-1 unsorted permutations and hence never
terminate.)
--
Steven
I meant they don't care if I use a comparison or non-comparison sort
of course.
And if n is small and the data sparse (i.e., k > n), O(k*n) for radix
sort could be worse than O(n^2). You could also ask why people make such
a big deal about quicksort over mergesort, since mergesort has a
guaranteed O(n log n) time whereas quicksort can be O(n^2) on
pathological cases.
I think I remember learning in my algorithms class that for small
sorts (n < ~40) , bubblesort can actually be the fastest (or close to
the fastest) in terms of wall-clock time because it has a relatively
small constant factor in its O(n^2) complexity.
> And if n is small and sparse (ie, k > n) , O(k*n) for radix sort could
> be worse than O(n^2). You could also ask why people make such a big
> deal about quicksort over mergesort, since mergesort has a guaranteed
> O(n log n) time whereas quicksort can be O(n^2) on pathological cases.
Python's current list.sort uses mergesort because it better exploits
existing structure in a list.
> I think I remember learning in my algorithms class that for small
> sorts (n < ~40) , bubblesort can actually be the fastest (or close to
> the fastest) in terms of wall-clock time because it has a relatively
> small constant factor in its O(n^2) complexity.
It uses binary insertion sort for n < 64 (chosen empirically). That also
does O(n log n) comparisons and is only O(n**2) for data movement, which
a decent C compiler translates into fast block-move assembly
instructions.
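To illustrate the idea (a Python sketch of my own, not CPython's actual
C implementation), binary insertion sort finds each insertion point with
a binary search but still pays linear data movement per insert:

import bisect

def binary_insertion_sort(alist):
    """O(n log n) comparisons, O(n**2) element moves, and stable."""
    for i in xrange(1, len(alist)):
        x = alist[i]
        # O(log i) comparisons to find the insertion point
        pos = bisect.bisect_right(alist, x, 0, i)
        # O(i) block move to open up the slot
        alist[pos+1:i+1] = alist[pos:i]
        alist[pos] = x
    return alist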
tjr
It can be proven that you cannot sort an arbitrarily large set of
numbers, given no extra information, faster than O(n log n); it is
provable using information theory. However, if your teacher is giving
you evil problems, there's no reason you can't be evil in return:
Evil trick 1: Allocate an array of n^2 booleans, initialized to false
(this is O(n^2)). Declare this to be "before the sort". Then iterate
through the list and set each matching entry in the array to True (this
is O(n)). Now you have them "sorted". To get access to the data, you
need to iterate across the array, which is O(n^2), but that is "after
the sort".
Evil trick 2: Inserting into a set is O(1), so you could insert n items
into a set. This is O(n). Then you can argue that, since the set cares
not about order, it is as good as ordered!
Evil trick 3: Pull up your Python debugger and have your program search
for the right answer in your teacher's test library. If done smartly,
this could even be an O(1) sort O_o (cue Nukees reference: "Welcome to
my computer security course. Your grades have already been entered into
the school's grading system as Fs. You have one semester to change
that. Good luck.")
... these are fun =)
Cormen, Leiserson, and Rivest, "Introduction to Algorithms", have a
short, clear chapter on "Sorting in linear time":
"... counting sort, radix sort and bucket sort ... use operations other
than comparisons. Consequently, the Omega(n lg n) lower bound does not
apply to them."
Some of the book is in books.google.com; enjoy
But the key here is "given no extra information." Your examples of
non-comparison sorts require extra information, such as knowledge
about the possible range the numbers can take on.