slow sort with string secondary keys

Attila Zséder

Nov 28, 2015, 12:19:30 PM
to julia-users
Hi,

I'm new to Julia and wrote a baseline implementation of word count. It counts words and writes them to stdout sorted by count (highest first), breaking ties alphabetically.
My code is here:
https://github.com/juditacs/wordcount/blob/master/julia/wordcount.jl

About half of the running time goes to the secondary sort (the alphabetical tie-break), so that is the first bottleneck. Adding strings to the dict is also quite slow.
The Python and C++ implementations run in about 20-25 seconds, while this implementation takes 80 seconds.
Can you give me some hints on why this implementation is slow?

Thank you!

Tim Holy

Nov 28, 2015, 3:42:44 PM
to julia...@googlegroups.com
AbstractString is, as the name suggests, an abstract type. That's bad news for
performance, see the FAQ.

Try using a concrete type like ASCIIString or UTF8String.
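
To illustrate the idea, here is a minimal sketch rather than a patch to the linked wordcount.jl. It assumes Julia 1.x, where String is a concrete type; on the Julia 0.4 of this thread, ASCIIString or UTF8String would take its place. The function names count_words and sorted_counts are hypothetical. The point is that the Dict is declared with concrete key and value types, and the two-level ordering is expressed as a tuple key.

```julia
# Sketch only: assumes Julia 1.x, where String is concrete.
# count_words and sorted_counts are hypothetical names for illustration.

function count_words(io::IO)
    counts = Dict{String,Int}()        # concrete key and value types
    for line in eachline(io)
        for word in split(line)        # split on whitespace
            key = String(word)         # split yields SubString{String}; convert once
            counts[key] = get(counts, key, 0) + 1
        end
    end
    return counts
end

# Sort by count (highest first); ties are broken alphabetically.
function sorted_counts(counts::Dict{String,Int})
    entries = collect(counts)          # Vector{Pair{String,Int}}
    sort!(entries, by = p -> (-p.second, p.first))
    return entries
end

# Usage on a small in-memory input instead of stdin.
for (word, n) in sorted_counts(count_words(IOBuffer("b a b\nc a b\n")))
    println(word, "\t", n)
end
```

Passing stdin to count_words would read from standard input instead. Whether this closes the gap to the Python and C++ versions would need to be measured.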

--Tim