Any freeware to compare .pdf files?

John C.

Jun 29, 2023, 8:53:51 AM
Sometimes, it's possible to download two identically-sized .pdf files,
which if you do a page-by-page comparison, are also identical in that
regard.

However, if you do a hash check to see if they're two copies of the same
file, it turns out that they are not.

If the files have a lot of pages, then doing a page-by-page compare of
the two can be pretty difficult to accomplish.

Winmerge (https://winmerge.org/) will compare .pdf files and point out
metadata differences, but not actual content differences.

Does anybody know of a freeware program that can compare two .pdf files
and list any differences in content?

TIA.

--
John C. BS206. No ad, CD, cripple, demo, nag, pay, pirated, share, spy,
time-limited, trial or web wares for me please.

I filter out anything posted:
-through Google Groups (source of most of spam posted here)
-cross-posted messages (sent to multiple newsgroups at a time)
-posts from several trolls (eg. "Bucky Breeder").

TPayne

Jun 29, 2023, 11:33:47 AM
On 6/29/2023 8:53 AM, John C. wrote:
> Sometimes, it's possible to download two identically-sized .pdf files,
> which if you do a page-by-page comparison, are also identical in that
> regard.
>
> However, if you do a hash check to see if they're two copies of the same
> file, it turns out that they are not.
>
> If the files have a lot of pages, then doing a page-by-page compare of
> the two can be pretty difficult to accomplish.
>
> Winmerge (https://winmerge.org/) will compare .pdf files and point out
> metadata differences, but not actual content differences.
>

If you're ok with an online comparer, then
https://www.pdfforge.org/online/en/compare-pdf

"With this tool you can easily compare the text of two PDF files. It
quickly shows you any discrepancies in the content and indicates the
respective line numbers. This saves you a lot of time compared to going
through both files manually."

kelown

Jun 29, 2023, 1:03:49 PM

> Does anybody know of a freeware program that can compare two .pdf files
> and list any differences in content?

The 9 Best Tools to Compare Two PDFs Side by Side
https://www.makeuseof.com/tools-compare-two-pdfs-side-by-side

Compares via Windows apps or online compare sites.

B. R. 'BeAr' Ederson

Jun 29, 2023, 1:51:06 PM
On Thu, 29 Jun 2023 05:53:42 -0700, John C. wrote:

> Does anybody know of a freeware program that can compare two .pdf files
> and list any differences in content?

Before DiffPdf went commercial at the end of 2013, it was GNU open source
freeware. The latest archived website of the freeware version is:

https://web.archive.org/web/20130805135953/http://qtrac.eu/diffpdf.html

The Windows binary always was compiled and provided by Steven Lee, who
still has the binary and the source code of the last free version on
his website:

https://soft.rubypdf.com/software/diffpdf

It is byte-identical to the one I downloaded back then.

MD5: 52e9749aa28d8f44d5aa2b6e689b2a40 *diffpdf-2.1.3-win32-static.zip
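
For anyone who wants to check a download against that MD5 without extra
tools, a minimal Python sketch follows. The helper names (md5_of_file,
matches_expected) are made up for illustration, and it assumes the zip
sits in the current directory under the file name quoted above.

```python
import hashlib
from pathlib import Path

# MD5 value quoted in the post above.
EXPECTED_MD5 = "52e9749aa28d8f44d5aa2b6e689b2a40"


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals the expected hex string (case-insensitive)."""
    return md5_of_file(path) == expected.lower()


if __name__ == "__main__":
    # Assumed location: the archive saved in the current directory.
    archive = Path("diffpdf-2.1.3-win32-static.zip")
    if archive.exists():
        print("MD5 matches" if matches_expected(archive) else "MD5 MISMATCH")
    else:
        print(f"{archive} not found in the current directory")
```

A matching MD5 only shows the file is the same one that was hashed for
this post; it says nothing about whether that file is safe to run.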

IMHO, this "old" version still does a very good job and even offers
different comparison modes (by character, by word, visually). Visual
mode even lets you detect added comments and markers, when the former
modes regard 2 Pdf-files as identical. (Which they are on internal text
level.)

BeAr
--
===========================================================================
= What do you mean with: "Perfection is always an illusion"? =
===============================================================--(Oops!)===

VanguardLH

Jun 29, 2023, 3:57:11 PM
"John C." <r9j...@yahoo.com> wrote:

> Sometimes, it's possible to download two identically-sized .pdf files,
> which if you do a page-by-page comparison, are also identical in that
> regard.
>
> However, if you do a hash check to see if they're two copies of the same
> file, it turns out that they are not.
>
> If the files have a lot of pages, then doing a page-by-page compare of
> the two can be pretty difficult to accomplish.
>
> Winmerge (https://winmerge.org/) will compare .pdf files and point out
> metadata differences, but not actual content differences.
>
> Does anybody know of a freeware program that can compare two .pdf files
> and list any differences in content?
>
> TIA.

There is more information in a PDF than just the content of the
document. There's where you were last in the document, so you can be
repositioned to the same spot when you later revisit it. There's the
creation datestamp (inside the file, not in the file system for the OS).
There is metadata that contains the author's name (different people can
generate the same PDF), keywords, copyright info, whether a password,
certificate, or security policy was applied to the PDF, and the initial
view (opening page number, zoom level, and whether bookmarks, thumbnails,
toolbar, and menu are displayed). Custom properties can be added. Images
embedded in the document could be at different resolutions, or even be
slightly different images whose differences your eye cannot perceive,
and more.

The author of a PDF can decide to include no fonts in the PDF relying on
the recipient's host to have identical or nearly identical fallback
fonts to render the document, or they can include just those fonts used
in the PDF, or they can include the complete font(s) used in the PDF.
That a document looks the same doesn't mean it is identical to another
document that looks the same.
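
To illustrate why two same-sized files can still fail a hash check,
here is a small Python sketch. The two byte strings are invented
stand-ins for PDF files (not real PDFs) that differ only in an embedded
creation datestamp of the same length.

```python
import hashlib

# Invented stand-ins for two PDF files: identical "page content",
# different embedded creation date. Both dates have the same length,
# so the two byte strings are the same size.
file_a = b"%PDF-1.4 /CreationDate (D:20230629085351) page content"
file_b = b"%PDF-1.4 /CreationDate (D:20230630062729) page content"


def sha256_hex(data):
    """Return the hex SHA-256 digest of a bytes object."""
    return hashlib.sha256(data).hexdigest()


same_size = len(file_a) == len(file_b)
same_hash = sha256_hex(file_a) == sha256_hex(file_b)

print("same size:", same_size)
print("same hash:", same_hash)
```

The sizes match, but the hashes do not, because a hash covers every
byte in the file, including metadata that is never shown on a page.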

Did you view the document properties to see if those were identical?
There is a lot more defined within a PDF than just the bare document.
You don't want to do a binary compare on the .pdf files. You want to do
a text compare on the documents within. If you're checking for
plagiarism, you only want to compare on the document's text, not on
photos. You can copyright paintings, but not photos since anyone else
could take the same photo as you.

Adobe has their own PDF compare tool, but it is trialware, not freeware
(https://www.adobe.com/acrobat/how-to/compare-two-pdf-files.html).

You could print the PDFs to files, and compare the document content
dumped into the output files. Many PDF readers let you "print" or save to
doc formats other than PDF, like .docx or .txt. You could even
print/save to .pdf format, but that could carry along all the metadata
which could differ between the PDF files. Word files (.doc[x]) have
metadata, so you'd have to check if saving a .pdf in whatever PDF viewer
you are using will migrate PDF metadata to Word metadata. Text files
have no metadata. They can have Alternate Data Streams (ADS), but you
won't be viewing those when doing a compare. Of course, text files
cannot contain images, and sometimes a PDF is nothing but one image or a
series of images, or it will contain an image. Saving (converting) to
.txt format will strip the images from both of the files you want to
compare, which also means you won't be doing a true compare, since the
images could be different (even if they don't look different).
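
Once both PDFs have been saved as .txt, the compare itself can be done
with Python's standard difflib module. This is only a rough sketch: the
function name diff_text_exports and the file names in the usage note are
hypothetical, and it assumes the text exports already exist.

```python
import difflib
from pathlib import Path


def diff_text_exports(path_a, path_b):
    """Return unified-diff lines for two text files exported from PDFs.

    An empty list means the extracted text is identical.
    """
    lines_a = Path(path_a).read_text(encoding="utf-8", errors="replace").splitlines()
    lines_b = Path(path_b).read_text(encoding="utf-8", errors="replace").splitlines()
    return list(
        difflib.unified_diff(
            lines_a,
            lines_b,
            fromfile=str(path_a),
            tofile=str(path_b),
            lineterm="",
        )
    )
```

For example, print("\n".join(diff_text_exports("first.txt", "second.txt")))
lists removed lines with a leading "-" and added lines with a leading "+".
As noted above, images are not covered by a text compare like this.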

There's PDFforge online (https://www.pdfforge.org/online/en/compare-pdf)
but I've not used that one; however, I've rarely had to compare PDF
files against each other. It's been a very long time since I used the
PDFforge tools. They have their PDFCreator that lets you convert from
PDF to other doc formats, so you could convert to a doc format that
doesn't have or carry along any metadata. Their free edition is limited
(see https://www.pdfforge.org/pdfcreator/editions), so I don't know if
it supports the conversion feature; however, it looks like the free
edition is adware per their own comparison table. While the conversion
they advertise is about printing a document using their printer emulator
(you create PDFs from other docs), I would think it would have its own
print or save function that lets you choose the doc format for output.

PDF24 also has their online tools; see:
https://tools.pdf24.org/en/compare-pdf
https://tools.pdf24.org/en/all-tools

There's https://www.diffchecker.com with their online PDF comparer. You
can get their client tool, but it is trialware (30 days) - their pricing
page shows it is actually subscriptionware. However, you didn't say if
this was a one-time need, or if you need something for repeated use for
later.

Another online tool: https://app.copyleaks.com/text-compare/compare-pdf-files
You didn't mention if online tools are a taboo to you. This one looks
to require you create an account with them.

We don't know what other document software you have, or even which PDF
viewer you are using. If you have MS Word (after whatever version added
PDF support), you can open a .pdf in Word, and then save as a .doc[x]
file. Be aware that Word files also have metadata. See:

https://support.microsoft.com/en-us/office/remove-hidden-data-and-personal-information-by-inspecting-documents-presentations-or-workbooks-356b7b5d-77af-44fe-a07f-9aa4d085966f

I suspect the local client products are mostly payware because they
have to pay Adobe for a license to use some tool they embed in their
product. You can try the trialware versions, but that's not a solution
if you intend to keep doing PDF compares well past the trial period.
Their online versions are free because you never get the software that
does the compare; it runs in the background on their end.

MikeS

Jun 29, 2023, 4:44:12 PM
On 29/06/2023 20:57, VanguardLH wrote:
>
> Did you view the document properties to see if those were identical?
> There is a lot more defined within a PDF than just the bare document.
> You don't want to do a binary compare on the .pdf files. You want to do
> a text compare on the documents within. If you're checking for
> plagiarism, you only want to compare on the document's text, not on
> photos. You can copyright paintings, but not photos since anyone else
> could take the same photo as you.
>
There are hundreds of commercial image libraries which vigorously
safeguard the copyright of their images including photos. Likewise, if
authors wish to reproduce previously published photos their publisher
will insist on seeing written permission from the copyright holders.

VanguardLH

Jun 29, 2023, 5:22:00 PM
I stand at a spot alongside a lake and take a picture. What precludes
someone else from standing in the same spot just a minute after I leave
to take the same photo? How do you copyright a lake, or any publicly
visible object? Anyone can get a copyright on just about anything, and
can use it for intimidation, but that doesn't make it legal.

John C.

Jun 30, 2023, 6:27:29 AM
Thanks for all that, VanguardLH. I'll check out your recommendations.

John C.

Jun 30, 2023, 6:29:19 AM
B. R. 'BeAr' Ederson wrote:
> On Thu, 29 Jun 2023 05:53:42 -0700, John C. wrote:
>
>> Does anybody know of a freeware program that can compare two .pdf files
>> and list any differences in content?
>
> Before DiffPdf went commercial at the end of 2013, it was GNU open source
> freeware. The latest archived website of the freeware version is:
>
> https://web.archive.org/web/20130805135953/http://qtrac.eu/diffpdf.html
>
> The Windows binary always was compiled and provided by Steven Lee, who
> still has the binary and the source code of the last free version on
> his website:
>
> https://soft.rubypdf.com/software/diffpdf
>
> It is byte-identical to the one I downloaded back then.
>
> MD5: 52e9749aa28d8f44d5aa2b6e689b2a40 *diffpdf-2.1.3-win32-static.zip
>
> IMHO, this "old" version still does a very good job and even offers
> different comparison modes (by character, by word, visually). Visual
> mode even lets you detect added comments and markers, when the former
> modes regard 2 Pdf-files as identical. (Which they are on internal text
> level.)
>
> BeAr

Thanks, BeAr. I'll give that one a shot. Sounds like exactly what I'm
looking for.

John C.

Jun 30, 2023, 6:30:51 AM
TPayne wrote:
> On 6/29/2023 8:53 AM, John C. wrote:
>> Sometimes, it's possible to download two identically-sized .pdf files,
>> which if you do a page-by-page comparison, are also identical in that
>> regard.
>>
>> However, if you do a hash check to see if they're two copies of the same
>> file, it turns out that they are not.
>>
>> If the files have a lot of pages, then doing a page-by-page compare of
>> the two can be pretty difficult to accomplish.
>>
>> Winmerge (https://winmerge.org/) will compare .pdf files and point out
>> metadata differences, but not actual content differences.
>>
>
> If you're ok with an online comparer, then
> https://www.pdfforge.org/online/en/compare-pdf

Actually, I am. That is, of course, unless the files in question contain
sensitive (to me) information. The files I'm referring to do not.

> "With this tool you can easily compare the text of two PDF files. It
> quickly shows you any discrepancies in the content and indicates the
> respective line numbers. This saves you a lot of time compared to going
> through both files manually."

Thanks, Tpayne.

John C.

Jun 30, 2023, 6:31:31 AM
Thanks! I'll read the article right now.

John C.

Jun 30, 2023, 6:34:11 AM
John C. wrote:
> Sometimes, it's possible to download two identically-sized .pdf files,
> which if you do a page-by-page comparison, are also identical in that
> regard.
>
> However, if you do a hash check to see if they're two copies of the same
> file, it turns out that they are not.
>
> If the files have a lot of pages, then doing a page-by-page compare of
> the two can be pretty difficult to accomplish.
>
> Winmerge (https://winmerge.org/) will compare .pdf files and point out
> metadata differences, but not actual content differences.
>
> Does anybody know of a freeware program that can compare two .pdf files
> and list any differences in content?
>
> TIA.

My thanks to everybody who replied. Looks like I'll settle on DiffPdf.

wasbit

Jul 1, 2023, 4:47:52 AM
On 29/06/2023 18:51, B. R. 'BeAr' Ederson wrote:
> On Thu, 29 Jun 2023 05:53:42 -0700, John C. wrote:
>
>> Does anybody know of a freeware program that can compare two .pdf files
>> and list any differences in content?
>
> Before DiffPdf went commercial at the end of 2013, it was GNU open source
> freeware. The latest archived website of the freeware version is:
>
> https://web.archive.org/web/20130805135953/http://qtrac.eu/diffpdf.html
>

Neither the link nor Archive.org will open in Pale Moon, Firefox or
Brave; each warns that the 'Secure Connection Failed'.
Presumably this is a problem at their end.


> The Windows binary always was compiled and provided by Steven Lee, who
> still has the binary and the source code of the last free version on
> his website:
>
> https://soft.rubypdf.com/software/diffpdf
>
> It is byte-identical to the one I downloaded back then.
>
> MD5: 52e9749aa28d8f44d5aa2b6e689b2a40 *diffpdf-2.1.3-win32-static.zip
>
> IMHO, this "old" version still does a very good job and even offers
> different comparison modes (by character, by word, visually). Visual
> mode even lets you detect added comments and markers, when the former
> modes regard 2 Pdf-files as identical. (Which they are on internal text
> level.)
>

--
Regards
wasbit

AllanH

Jul 1, 2023, 5:15:27 AM
On 7/1/2023 3:47 AM, wasbit wrote:
> On 29/06/2023 18:51, B. R. 'BeAr' Ederson wrote:
>> On Thu, 29 Jun 2023 05:53:42 -0700, John C. wrote:
>>
>>> Does anybody know of a freeware program that can compare two .pdf files
>>> and list any differences in content?
>>
>> Before DiffPdf went commercial at the end of 2013, it was GNU open source
>> freeware. The latest archived website of the freeware version is:
>>
>> https://web.archive.org/web/20130805135953/http://qtrac.eu/diffpdf.html
>>
>
> Neither the link nor Archive.org will open in Pale Moon, FireFox or
> Brave, warning that the 'Secure Connection Failed'.
> Presumably this is a problem at their end.
>

It sounds to me like a Web browser setting problem. Vivaldi displays the
link without any message, and the "View Site info" icon tells me that
"Connection is secure".

B. R. 'BeAr' Ederson

Jul 1, 2023, 3:07:36 PM
On Sat, 1 Jul 2023 09:47:46 +0100, wasbit wrote:

>> https://web.archive.org/web/20130805135953/http://qtrac.eu/diffpdf.html
>>
>
> Neither the link nor Archive.org will open in Pale Moon, FireFox or
> Brave, warning that the 'Secure Connection Failed'.
> Presumably this is a problem at their end.

Pale Moon (being my main Internet browser) is the program I copied the
link from in the first place. I now checked again, and the link opens
just fine, although I have fairly strict security settings in place.
Neither Firefox, nor Edge or Tor browser have any problems opening
the link, either.

Either there was a temporary glitch with archive.org, or you need to
check your browser settings. (Although it is unlikely that all 3 you
mentioned are mal-configured in the same way...)

To check out the last freeware version, accessing above link is not
necessary, though. You just get a large screenshot and a couple of
explanations about the program and its development and distribution.
The author did not host the Windows binary on his site, but linked
to the RubySoft page of Steven Lee, which is still online and has
the last freeware version for download. (Link is in my earlier post.)

AllanH

Jul 1, 2023, 3:50:54 PM
On 7/1/2023 2:07 PM, B. R. 'BeAr' Ederson wrote:
>
> To check out the last freeware version, accessing above link is not
> necessary, though. You just get a large screenshot and a couple of
> explanations about the program and its development and distribution.
> The author did not host the Windows binary on his site, but linked
> to the RubySoft page of Steven Lee, which is still online and has
> the last freeware version for download. (Link is in my earlier post.)
>
> BeAr

This is the download link that I used.
https://web.archive.org/web/20131221114900/http://www.qtrac.eu/diffpdf-2.1.3-win32-static.zip

wasbit

Jul 10, 2023, 4:42:27 AM
On 01/07/2023 09:47, wasbit wrote:
> On 29/06/2023 18:51, B. R. 'BeAr' Ederson wrote:
>> On Thu, 29 Jun 2023 05:53:42 -0700, John C. wrote:
>>
>>> Does anybody know of a freeware program that can compare two .pdf files
>>> and list any differences in content?
>>
>> Before DiffPdf went commercial at the end of 2013, it was GNU open source
>> freeware. The latest archived website of the freeware version is:
>>
>> https://web.archive.org/web/20130805135953/http://qtrac.eu/diffpdf.html
>>
>
> Neither the link nor Archive.org will open in Pale Moon, FireFox or
> Brave, warning that the 'Secure Connection Failed'.
> Presumably this is a problem at their end.
>
> snip <
>

Update
Seems the failure is down to my having recently changed ISP.

--
Regards
wasbit