I found that there is no method such as sort_by() after v0.3. But I want to count word frequencies with a Dict() and sort by value to find the most frequent words. So, how can I sort a Dict efficiently?
julia> v = randn(10^7);

julia> let w = copy(v); @time sort!(w)[1:1000]; end;
elapsed time: 0.882989281 seconds (8168 bytes allocated)

julia> let w = copy(v); @time select!(w,1:1000); end;
elapsed time: 0.054981192 seconds (8192 bytes allocated)
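The same partial-sort idea can be applied to a Dict by first copying its entries into a Vector. The following is only a sketch, assuming Julia 1.x, where select! has been renamed partialsort!; the dictionary `counts` and the cutoff of 3 are made up for illustration.

```julia
# Assumption: Julia 1.x, where select! is called partialsort!.
# `counts` is a hypothetical word-frequency dictionary.
counts = Dict("the" => 12, "julia" => 7, "dict" => 5, "sort" => 3, "rare" => 1)

# A Dict has no order, so copy its key => value pairs into a Vector first.
pairs_vec = collect(counts)

# Partially sort by value in descending order, keeping only the
# three most frequent entries instead of sorting everything.
top3 = partialsort!(pairs_vec, 1:3; by = p -> p.second, rev = true)

for (word, n) in top3
    println(word, ": ", n)
end
```

When only the top few words are needed, this avoids the cost of a full sort, which is the effect the timings above show for a plain Array.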
Of course, I know Julia works fast with Arrays.
But in natural language processing or text analysis, we often count word frequencies and build a dictionary. We usually store the frequencies in some kind of Dict, and we always cut off infrequent words (those whose frequency is below a threshold) to exclude noisy words. So I want to remove the keys whose values meet some condition.
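A minimal sketch of that workflow, assuming Julia 1.x (where the predicate passed to filter! on a Dict receives a key => value Pair). The sample `text` and the `threshold` value are invented for illustration.

```julia
# Assumption: Julia 1.x. `text` and `threshold` are illustrative values.
text = "the cat and the dog and the bird"
threshold = 2

# Count word frequencies in a Dict.
freq = Dict{String,Int}()
for word in split(text)
    w = String(word)                 # split yields SubString; store plain Strings
    freq[w] = get(freq, w, 0) + 1
end

# Remove, in place, every word whose count is below the threshold.
filter!(p -> p.second >= threshold, freq)

# Sort the remaining entries by count, most frequent first.
ranked = sort(collect(freq); by = p -> p.second, rev = true)

for (word, n) in ranked
    println(word, ": ", n)
end
```

Filtering before sorting keeps the sorted Vector small, which matters when most of the vocabulary is rare words.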
Finally, I found John Myles White's implementation for creating n-grams, so I will refer to it.
https://github.com/johnmyleswhite/TextAnalysis.jl/blob/master/src/ngramizer.jl