On 21/01/14 13:11, Chris Perkins wrote:
> This part: (some #{hashed} already-seen) is doing a linear lookup in
> `already-seen`. Try (contains? already-seen hashed) instead.
+1 to that, it should be noticeably faster...
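
A quick sketch of the difference, assuming `already-seen` is a hash set (the sample values below are made up for illustration):

```clojure
;; Hypothetical data: a set of already-seen hashes.
(def already-seen #{"a1" "b2" "c3"})

;; `some` turns the set into a seq and tests element by element: linear time.
(some #{"b2"} already-seen)     ;=> "b2"
(some #{"zz"} already-seen)     ;=> nil

;; `contains?` does a direct hash lookup on the set.
(contains? already-seen "b2")   ;=> true
(contains? already-seen "zz")   ;=> false
```

This only applies if `already-seen` really is a set (or a map). On a vector, `contains?` checks indices rather than values, so it would not be a drop-in replacement there.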
I would also suggest the following, which are not really related to performance:
(drop 1 (line-seq f)) ==> (next (line-seq f))
(if seen? nil [url hashed]) ==> (when-not seen? [url hashed])
(if seen? nil hashed) ==> (when-not seen? hashed)
(if (seq (rest lines)) ... ==> (if (seq lines) ...
I actually think the last one is a bug... it seems to me that you are skipping one row in the condition. You pass (rest lines) every time you recurse, yes?
Checking for more lines should be done on *all* the current lines, not on (rest current-lines)... unless I've misunderstood something.
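
Without the full function to hand, this is only a sketch of the suspected problem. The two loops below are hypothetical stand-ins (`process-buggy` and `process-fixed` are made-up names), assuming the original recurs on `(rest lines)` and collects something per line:

```clojure
;; Hypothetical loop that tests (rest lines) before processing (first lines).
(defn process-buggy [lines]
  (loop [lines lines, acc []]
    (if (seq (rest lines))          ; true only while at least 2 lines remain
      (recur (rest lines) (conj acc (first lines)))
      acc)))                        ; the final line is never processed

;; Same loop, but testing the current lines.
(defn process-fixed [lines]
  (loop [lines lines, acc []]
    (if (seq lines)                 ; true while at least 1 line remains
      (recur (rest lines) (conj acc (first lines)))
      acc)))

(process-buggy ["a" "b" "c"])   ;=> ["a" "b"]
(process-fixed ["a" "b" "c"])   ;=> ["a" "b" "c"]
```

If the real code handles the last line somewhere else, the original condition may be fine after all.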
Jim