Encoding issues with Word documents created on Macs


Elizabeth Franklin

Dec 15, 2023, 12:30:45 AM
to PRONOM
Hi All,

At the National Library of Australia we have recently come across Word documents that were created in a Mac environment using Word for Mac 6.0. The PRONOM signature for Word for Mac 6 (x-fmt/2) has been deprecated and superseded by Word 6.0/95 (fmt/39).

The issue we are facing is that fmt/39 files created in a Windows environment render correctly when opened in either Word 95 or modern versions of Word. However, files created in the Mac environment only render correctly (without any input from the user) in Word 95. Opening them in later versions of Word requires the user to specify that the file uses Mac encoding to ensure all characters render correctly.
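To illustrate the kind of character damage involved, here is a minimal Python sketch. It assumes (this is an assumption, not something confirmed from the files) that the Mac-authored text is stored as Mac OS Roman and that a later reader falls back to Windows-1252. The sample bytes are made up for illustration.

```python
# Hypothetical sample: "café" as it might be stored by a Mac application,
# where é is the single byte 0x8E in Mac OS Roman.
raw = b"caf\x8e"

# Reading the bytes with the encoding they were written in.
as_mac = raw.decode("mac_roman")

# Reading the same bytes as Windows-1252, where 0x8E is a different character.
as_win = raw.decode("cp1252")

print(as_mac)
print(as_win)
```

The same byte maps to a different character under each code page, which is why accented letters and typographic punctuation are the usual casualties while plain ASCII text looks fine.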

Have others experienced this issue? If so, how have you managed it? Is there value in de-deprecating x-fmt/2 to provide unique identification for files created in the Mac environment, so that we can specify the version of Word required to render them correctly?

Thanks,

Elizabeth

Tyler Thorsted

Dec 22, 2023, 5:02:39 PM
to PRONOM
Hi Elizabeth,

I think I can help with this question. I am still doing a bit of research to see how compatible a Macintosh-created Word 6 file is with different Windows versions of Office, but I have lots of samples to test with.

First of all, the original x-fmt/2 Word for Mac 6 entry never had a signature; it was a placeholder. It was deprecated because it appears to be the same format as the Windows version, and all such files currently identify as fmt/39, as you indicated.

Looking a little closer at the internals of a Mac-created Word 6 file, I can see some differences. One is a string that appears only in the Mac versions, "NB6W". I am interested to hear whether you see this string as well when you look at the file in a hex editor.
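For anyone who would rather script the check than open each file in a hex editor, a rough Python sketch follows. The function name `find_marker` and its parameters are hypothetical, and it simply scans the raw bytes rather than parsing the file structure, so treat it as a quick triage aid only.

```python
def find_marker(data: bytes, marker: bytes = b"NB6W", context: int = 8):
    """Return (offset, hex of surrounding bytes) for each occurrence of marker."""
    hits = []
    pos = data.find(marker)
    while pos != -1:
        # Capture a few bytes either side, similar to a hex editor view.
        start = max(0, pos - context)
        end = pos + len(marker) + context
        hits.append((pos, data[start:end].hex(" ")))
        pos = data.find(marker, pos + 1)
    return hits
```

To try it on a sample, read the file in binary mode (for example `data = open(path, "rb").read()`, where `path` is your own file) and print the result of `find_marker(data)`. An empty list means the string was not found in the raw bytes.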

The current signature for fmt/39 is a container signature that looks for the string "Word.Document." followed by 6 or 7. If we added the string "NB6W", it would make the Mac version unique and identify those files with the original x-fmt/2. Of course, the PRONOM folks would have to confirm that the PUID can be un-deprecated.
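The proposed rule can be sketched in Python as below. This is only an approximation under stated assumptions: a real container signature is matched against streams inside the container, whereas this sketch scans the whole file as raw bytes, and the function name `guess_puid` is made up for illustration. The leading bytes checked are the standard OLE2 compound file header.

```python
# Header bytes of an OLE2 compound file.
OLE2_MAGIC = bytes.fromhex("D0CF11E0A1B11AE1")

def guess_puid(data: bytes):
    """Rough guess at a PUID using the strings discussed in this thread."""
    if not data.startswith(OLE2_MAGIC):
        return None  # not an OLE2 container at all
    # "Word.Document." followed by 6 or 7, as in the current fmt/39 signature.
    if b"Word.Document.6" not in data and b"Word.Document.7" not in data:
        return None
    # Assumption from this thread: "NB6W" appears only in Mac-created files.
    return "x-fmt/2" if b"NB6W" in data else "fmt/39"
```

Running something like this across a mixed set of Mac and Windows samples would be one way to see whether "NB6W" separates the two groups cleanly before anything is proposed for the signature itself.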

I will continue testing to see whether this makes sense to do, and whether there are other versions that would benefit from more specific identification.

Thanks!
Tyler Thorsted
