How to find duplicate PDF, txt, or doc files?

adam turner

Jan 16, 2017, 2:02:52 AM

to AntConc-Discussion

Hello everyone,

I would like to clean up my corpus a bit by removing duplicate files. Could anyone recommend some software that could do the job?

Handling txt files alone would be sufficient, though PDF support would also be great. I want to automatically find duplicate academic journal articles that I and others have downloaded.

I looked on the internet, but there are too many options, and I don't like downloading utility-type software from untrusted sources. Most programs also seem to be designed to find duplicate lines of text in a program.

Adam

Laurence Anthony

Jan 16, 2017, 7:10:09 AM

to ant...@googlegroups.com

Hi,

I've used the following tool in the past:

http://www.alldup.de/en_download_alldup.php

As you say though, I cannot say for certain if the software can be trusted or not. You would have to use it at your own risk.

Laurence.


JFlorian

Jan 16, 2017, 8:41:55 AM

to ant...@googlegroups.com

Perhaps I'm not considering a bigger picture, but... why not just use Windows search?
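
One caveat with searching by name is that it only catches copies that kept the same file name. As a rough alternative that avoids installing anything beyond Python itself, the sketch below groups files by a hash of their contents, so renamed copies are found too. The function names (`file_digest`, `find_duplicates`) and the extension list are made up for this illustration, and it only detects byte-for-byte identical files: two PDFs of the same article downloaded from different sites may well differ internally and would not be matched.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, extensions=(".txt", ".pdf", ".doc")):
    """Group files under root whose contents are byte-for-byte identical.

    The default extensions are an assumption taken from the question;
    adjust them to match the corpus.
    """
    # First pass: bucket by file size, which is cheap to read.
    by_size = defaultdict(list)
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix.lower() in extensions:
            by_size[path.stat().st_size].append(path)

    # Second pass: hash only files that share a size with another file.
    by_hash = defaultdict(list)
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have an identical copy
        for path in paths:
            by_hash[file_digest(path)].append(path)

    # Keep only groups that contain more than one file.
    return [sorted(group) for group in by_hash.values() if len(group) > 1]
```

Calling `find_duplicates("path/to/corpus")` returns a list of groups, each a list of paths with identical contents. The script only reports duplicates and deletes nothing, so the groups can be checked by hand before any files are removed.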
