Project 5 troubleshooting


Afif Khaja

May 11, 2013, 1:31:38 PM
to MCS 260 Google group
Hi Jeremy,
 
I am having a few issues with project 5. When I use any sorting method with a randomly generated list, it works fine. When I use quicksort for a sorted or reverse sorted list, I get a huge error message. When I use bubblesort, betterbubble, or isort for a sorted or reverse sorted list, it takes WAY too long.
 
Also, when I write to a text file, it first erases everything and then writes the content to the file. I tried using "w+" when opening the file, but that hasn't worked.
Do you have suggestions?
 
Thanks,
Afif Khaja
 
 

Jeremy Kun

May 11, 2013, 2:33:44 PM
to Afif Khaja, MCS 260 Google group
Look up the flag for "append" so the file isn't deleted before writing. This is something I covered long ago when we first talked about files.
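A minimal sketch of the append flag (the filename and the line written are made up for illustration):

```python
# "a" opens the file for appending: existing contents are kept and
# new writes go to the end. "w" and "w+" both truncate the file first.
with open("times.txt", "a") as f:
    f.write("quicksort, n=1000: 0.01 seconds\n")
```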

What do you mean by "it takes way too long"? Does it ever terminate? How big are your lists? Maybe it just seems to take way too long because you don't know how long it's supposed to take.

Jeremy Kun
Mathematics Graduate Student
University of Illinois at Chicago


 
 


Afif Khaja

May 11, 2013, 5:24:16 PM
to Jeremy Kun, MCS 260 Google group

What I mean by too long is that I waited about three minutes and it still hadn't stopped, whereas with random arrays the longest run took about one minute. My array sizes go from 1,000 (incremented by 500, 20 times) up to 11,000. So when I click the "run" button, it returns a list with the time it took to sort each array (20 in all).
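The timing loop presumably looks something like the sketch below (`sort_fn` stands in for whichever sorting function is being tested; `list.sort` is used here just so the sketch runs):

```python
import random
import time

def time_sorts(sort_fn, sizes):
    """Return (size, seconds) pairs, sorting one random list per size."""
    results = []
    for size in sizes:
        L = [random.randrange(size) for _ in range(size)]
        start = time.time()
        sort_fn(L)
        results.append((size, time.time() - start))
    return results

# sizes 1000, 1500, ..., 11000, as in the experiment described above
times = time_sorts(list.sort, range(1000, 11001, 500))
```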
 

Afif Khaja

May 11, 2013, 5:36:20 PM
to MCS 260 Google group
I also know that the sorting algorithms work because I have tested them using simple arrays, and I used the notes in class to write them.

Jeremy Kun

May 11, 2013, 5:40:23 PM
to Afif Khaja, MCS 260 Google group
So let's do some math. Suppose you want to sort an n = 10,000 length array and it's random. Quicksort will take about n * log n steps. Using log base 10, that's about 40,000 steps, and so if we say you just do ten of the 10,000-length sortings one after the other, that's a total of about 400,000 steps.

Now let's suppose the arrays are sorted already. This is the worst possible case for quicksort (which is part of why quicksort is lame) and it takes order n^2 steps. If n is again 10,000 then that's a total of 100,000,000 steps. That's one hundred million. Multiply that by ten iterations and you're looking at a billion steps. 

This is exactly the difference between n log n and n^2.

Now, you may have written some inefficient code, or your code may be perfect. In either case, you shouldn't expect the sorted case to take less than 2,500 times longer than the random case. In your example the random case took one minute, so the sorted case should take just under two days to finish.
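The arithmetic above (with log taken base 10) can be checked directly:

```python
import math

n = 10000
runs = 10                                 # ten sortings back to back
random_steps = runs * n * math.log10(n)   # ~n log n steps per random sort
sorted_steps = runs * n ** 2              # ~n^2 steps per already-sorted sort
ratio = sorted_steps / random_steps       # how much slower the sorted case is

print(random_steps)   # 400000.0
print(sorted_steps)   # one billion
print(ratio)          # 2500.0
```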

As you're experiencing right now, this is not a hypothetical possibility. So no, it's not taking way too long; it's taking just as long as it's supposed to. It doesn't help your case that the algorithm's runtime isn't very stable, but that's why I advocate for mergesort: it will always be n log n no matter what happens.

Jeremy Kun
Mathematics Graduate Student
University of Illinois at Chicago


Afif Khaja

May 11, 2013, 6:42:58 PM
to MCS 260 Google group
Thanks Jeremy,
The problem definitely has to do with the large number of iterations being performed. I think I have pinpointed it: my code to create sorted and reverse sorted arrays is too slow.
 
sorted:
a = [range(1,size+1) for i in range(size)]

reverse sorted:
a = [range(1,size+1) for i in range(size)]
a.reverse()
 
Can I make it faster?
Afif

Jeremy Kun

May 11, 2013, 7:06:55 PM
to Afif Khaja, MCS 260 Google group
You have a serious type error here. What is the type of each element of the list "a"?

Now that I think about it, one minute is egregiously long for sorting ten lists of 10,000 elements. It should be more or less instantaneous.
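To see the type error concretely, here is a small demonstration (the `list()` wrapper is for Python 3; in Python 2, `range` already returns a list):

```python
size = 3
# The comprehension builds `size` copies of the entire range, so each
# element of a is itself a whole list, not a number.
a = [list(range(1, size + 1)) for i in range(size)]
print(a)   # [[1, 2, 3], [1, 2, 3], [1, 2, 3]]

# Building a this way alone takes order size^2 work instead of size,
# and "sorting" it compares entire lists rather than integers.
```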

Afif Khaja

May 11, 2013, 7:14:12 PM
to MCS 260 Google group
Oh no. I think I was making a list within a list.
 
This is better:
 
sorted:
a = range(1,size+1)

reverse sorted:
a = range(1,size+1)
a.reverse()
 
Thank you. Although it seems obvious now, I didn't realize what I was doing.
 
Afif
 
 
 

Jeremy Kun

May 11, 2013, 7:15:30 PM
to Afif Khaja, MCS 260 Google group
Yup! You can also make a reversed list like this:

[i for i in xrange(size, 0, -1)]
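For what it's worth, the two constructions give the same list (a quick check; `list()` and `range` are used here so it also runs under Python 3, where `xrange` is gone and `range` is lazy):

```python
size = 5
a = list(range(size, 0, -1))   # counting down directly: [5, 4, 3, 2, 1]
b = list(range(1, size + 1))   # counting up, then reversing in place
b.reverse()
assert a == b
```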

Jeremy Kun
Mathematics Graduate Student
University of Illinois at Chicago


 

Afif Khaja

May 11, 2013, 7:32:32 PM
to MCS 260 Google group
Another improvement! Thanks

S.F. Kyale

May 12, 2013, 6:24:51 AM
to Jeremy Kun, Afif Khaja, MCS 260 Google group
I find it hard to see how one could like merge sort without liking quicksort. They're kind of inverses of each other: merge sort splits the lists and then orders the elements during recombination, while quicksort orders the elements while splitting the lists. And if the algorithm is done out of place, like most merge sorts, it's very easy to get fairly reliable n log n time in a stable way. In theory bad cases are always possible, and it could be a very bad problem if one of those cases cropped up while processing important data and a result expected in a few days had to wait a few billion years. So merge sort is wise.

Still, for all my best efforts, merge sort is taking longer (almost twice as long as lame) for sorted or reverse sorted lists. Lame is a halfway decent implementation of quicksort.

I think if I had a better idea of how to use extend to combine lists, I might be able to make a faster merge. I think Python's built-in sort is a kind of merge sort, and it's about five times faster than lame in most cases.

This is what my merge sort looks like.

def mergesort(L):
   if len(L) < 2: return

   # Pair adjacent elements into sorted two-element runs; an odd
   # leftover element becomes a run of its own.
   aList = [[L[i], L[i+1]] if L[i] <= L[i+1] else [L[i+1], L[i]]
            for i in xrange(0, len(L) - 1, 2)]
   if len(L) % 2 == 1:
      aList.append([L[-1]])

   # Merge neighboring runs in passes until one sorted run remains.
   while len(aList) > 1:
      count, cmax = 0, len(aList) - 1
      while count < cmax:
         aList[count >> 1] = assembler(aList[count], aList[count + 1])
         count += 2
      if count == cmax:   # odd run out: carry the last run forward
         aList[count >> 1] = aList[cmax]
         aList = aList[:(count >> 1) + 1]
      else:
         aList = aList[:count >> 1]

   # Copy back into L so the sort is in place.
   for i, val in enumerate(aList[0]):
      L[i] = val

# assembler: list, list -> list
# Merge two sorted lists into one sorted list.
def assembler(list1, list2):
   inList1, inList2, len1, len2 = 0, 0, len(list1), len(list2)
   outList = []
   while inList1 < len1 and inList2 < len2:
      if list1[inList1] <= list2[inList2]:
         outList.append(list1[inList1])
         inList1 += 1
      else:
         outList.append(list2[inList2])
         inList2 += 1
   # One input is exhausted; extend with whatever remains of the other.
   outList.extend(list1[inList1:])
   outList.extend(list2[inList2:])
   return outList
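One cheap way to gain confidence in a hand-written sort is to compare it against the built-in on many random lists. A sketch of such a harness (shown here exercising `list.sort` so it is self-contained; any in-place sort function can be passed in):

```python
import random

def check_sort(sort_fn, trials=200):
    """Compare an in-place sort against sorted() on random lists,
    including empty and single-element lists."""
    for _ in range(trials):
        L = [random.randrange(100) for _ in range(random.randrange(40))]
        expected = sorted(L)
        sort_fn(L)
        assert L == expected, "mismatch on %r" % L

check_sort(list.sort)   # raises AssertionError on any mismatch
```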



Jeremy Kun

May 12, 2013, 1:03:25 PM
to S.F. Kyale, Afif Khaja, MCS 260 Google group
Wait, I don't think you're understanding the theory here. Mergesort always has runtime n log n. There is no arguing with that if your algorithm is correct. Quicksort has a runtime which is variable, and worst case n^2, best case n log n. I argue for mergesort because it has a better worst case than quicksort.

Jeremy Kun
Mathematics Graduate Student
University of Illinois at Chicago


S.F. Kyale

May 12, 2013, 3:28:38 PM
to Jeremy Kun, S.F. Kyale, Afif Khaja, MCS 260 Google group
Merge sort is only n log n if the hardware is functioning properly. And unless the algorithm is sorting the wrong sort of fractals, the odds of a decent quicksort hitting its worst case should be less likely than hardware failure on very large lists. And a quicksort could be programmed to finish as a merge sort if a worst case (more than k ln n iterations) is detected.

I think it's important to point out that any REASONABLE implementation of quicksort will perform much more reliably than the version required for mp5. And only an extremely small subset of large lists will give a worst case.

It's probably also good to know the type of data that will be sorted.
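The fallback idea described above is essentially what real libraries call introsort. A toy out-of-place sketch (Python's built-in `sorted()` stands in for the heapsort fallback that real implementations use):

```python
import math

def introsort(L, depth=None):
    # Depth limit ~ 2*log2(n): if quicksort recurses past it, we are
    # in a bad case, so fall back to a guaranteed n log n sort.
    if depth is None:
        depth = 2 * int(math.log(len(L), 2)) if len(L) > 1 else 0
    if len(L) <= 1:
        return list(L)
    if depth == 0:
        return sorted(L)           # guaranteed n log n fallback
    pivot = L[len(L) // 2]
    less = [x for x in L if x < pivot]
    equal = [x for x in L if x == pivot]
    greater = [x for x in L if x > pivot]
    return introsort(less, depth - 1) + equal + introsort(greater, depth - 1)
```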

Jeremy Kun

May 12, 2013, 7:49:43 PM
to S.F. Kyale, Afif Khaja, MCS 260 Google group
Real-world quicksort implementations actually use a mix of quicksort and insertion sort.

But the point is not about "hardware failures." In the analysis of algorithms there is no such thing as failing hardware. The problem is that quicksort's runtime is unstable by nature, while an algorithm like mergesort's is stable. If you have a time-critical operation, such as finishing a financial analysis in time to make a bid before a stock price plummets, an n log n worst-case bound guarantees you will be able to sort your numbers in time. It doesn't matter how unlikely the worst case is for quicksort: you want that guarantee, and you're willing to pay an extra few milliseconds per sorting operation to get it.

In the real world hardware is cheap and easy to replace, and you can prevent hardware failure from crippling your programs by mirroring your systems. You cannot do that with algorithms, nor can you cheaply or easily replace the time (and money) you lost when a worst-case scenario happens.

Jeremy Kun
Mathematics Graduate Student
University of Illinois at Chicago

