Removing superfluous double byte characters (pet peeve)


Steven P. Venti

Jun 29, 2009, 11:27:25 PM
to Honyaku E<>J translation list
Back in the early '90s when I first started translating, it was common
practice to be careful not to mix double byte characters in a file that
contained only English, and I've always tried to be conscientious about
editing out superfluous double byte characters from English text,
especially when the Japanese author has used double byte spaces instead
of tabs to align his text.

My question to everyone is: Is this now a waste of the translator's time?

The main rationale for the previous practice was that the file would
neither display properly nor print out properly if it contained hidden
double byte characters. It seems to me that this concern has pretty much
gone the way of the horse and buggy, and I can't help thinking that I'm
wasting my time doing something that the client is completely oblivious
to.

Comments, please.

-----------------------------------------------------------------
Steven P. Venti
Mail: spv...@bhk-limited.com
Rockport Sunday
http://www.youtube.com/watch?v=bCPpd20CgXE
-----------------------------------------------------------------

Cary Strunk

Jun 29, 2009, 11:37:47 PM
to hon...@googlegroups.com
I don't know for certain, but one of the IT guys where I work tells me that it is still a concern. Being generally lousy with computers, I listen to him.

HTH.

Best,

Cary Strunk


Ginstrom IT Solutions (GITS)

Jun 30, 2009, 12:26:42 AM
to hon...@googlegroups.com
> [mailto:hon...@googlegroups.com] On Behalf Of Steven P. Venti

> especially when the Japanese author has used double byte
> spaces instead of tabs to align his text.
>
> My question to everyone is: Is this now a waste of the
> translator's time?

If you're working in an "office" program like Word/Excel/PowerPoint or the
Open Office equivalents, then the only difference is cosmetic: the
characters won't look quite right to someone not used to Japanese text.

If the translation is destined for a Web page, or might be in the future
(and this is becoming increasingly likely over time), then including
double-byte characters in an English document can cause problems. This is
because even the major Japanese websites (Toyota, Sony, ...) still use
Shift-JIS encoding instead of UTF-8. If that encoding declaration is
stripped out, then the browser will try to guess the encoding, leading to
gobbledygook where a double-byte character (a full-width parenthesis, say)
was supposed to be.
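
A minimal Python sketch of the failure mode described above, not taken from the thread. It assumes a page saved as Shift-JIS whose encoding declaration has been lost, and a reader that falls back to Latin-1. The sample string and the fallback encoding are illustrative assumptions; real browsers guess in more elaborate ways.

```python
# Illustrative sample: English text with full-width parentheses (U+FF08, U+FF09).
text = "Total\uff08approx.\uff09"

# The page is saved as Shift-JIS; each full-width parenthesis becomes two bytes.
raw = text.encode("shift_jis")

# Assumption: with no encoding declaration, the reader guesses Latin-1.
garbled = raw.decode("latin-1")

# Decoding with the correct encoding recovers the original text.
restored = raw.decode("shift_jis")

print(repr(garbled))     # the parentheses come out as stray bytes
print(restored == text)  # the round trip with the right encoding is lossless
```

The plain ASCII letters survive the wrong guess, which is why only the double-byte characters turn into garbage.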

Regards,
Ryan

--
Ryan Ginstrom
trans...@ginstrom.com
http://ginstrom.com/

Steven P. Venti

Jun 30, 2009, 1:12:42 AM
to hon...@googlegroups.com
"Ginstrom IT Solutions (GITS)" <sup...@felix-cat.com> wrote:
> If you're working in an "office" program like Word/Excel/PowerPoint or the
> Open Office equivalents, then the only difference is cosmetic
[snip]

> If the translation is destined for a Web page, or might be in the future
> (and this is becoming increasingly likely over time), then including
> double-byte characters in an English document can cause problems.

Thanks to Cary and Ryan for their comments. I guess it's time to start
defining for certain clients just exactly what 和文を上書きする does and
does not entail.

Additional comments also appreciated.

Alex Koolhof

Jun 30, 2009, 2:59:28 AM
to hon...@googlegroups.com
I am a big fan of your FindNextJ macro for Microsoft Word, Ryan. I always give my completed documents a quick run over with that macro just to be sure there are no hidden double byte characters lurking.

alex



martha mcclintock

Jul 1, 2009, 9:22:03 PM
to hon...@googlegroups.com
Alex, as a completely non-computer sort who uses MS Word constantly but without the benefit of the whizbang macros etc., how does one set up this macro for finding double-byte characters? I always check for them as I am checking texts; usually it is double-byte spaces etc. that I find in text handled by my clients, and it is critical to get them out before the small-house publishers I work with get them into press and create more problems at the proofing stage....

please share this method for us basic sorts!

thanks!
martha mcclintock

Peter Clark

Jul 1, 2009, 9:56:48 PM
to hon...@googlegroups.com
Dear Martha,
This should get you started.
http://word.mvps.org/faqs/MacrosVBA/CreateAMacro.htm
 
Do you do anything repetitively in Word, either at the start or end of every translation, or during translation? Then macros can help you.
The Wordfast Knowledgebase also has some great hints.
http://www.wordfast.net/index.php?whichpage=knowledge&Task=view&questId=24&catId=9
 
Peter Clark


Ben Bullock

Jul 1, 2009, 11:15:46 PM
to hon...@googlegroups.com
2009/7/2 martha mcclintock <mjm...@gmail.com>:

> Alex, as a completely non computer sort who uses MSword constantly but
> without the benefit of the whizbang macros etc, how does one set up this
> macro for finding doublebytes? I always check for them as I am checking
> texts, usually it is double byte spaces etc that i find in text handled by
> my clients, and it is critical to get them out before the small house
> publishers i work with get them into press, and create more problems at
> proofing stage....
> please share this method for us basic sorts!

This is built into Word. You don't need any macros. To remove all the
double-byte characters:

1. Press the "Ctrl" key and then press the "A" key with the "Ctrl" key
pressed down to select everything
--> All the text in the document is inverted into white on black
2. Go to the "Format" menu at the very top of the Word window.
3. Choose "Change case".
--> A box appears
4. Choose "Half-width" from the box.
5. Press "OK".

This will remove all double byte spaces and other "double byte"
characters from the document (side-effect: it will also convert any
katakana from the usual format into the narrow format.)
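
For anyone without the Japanese language options in Word, roughly the same clean-up can be sketched in Python on plain text. This is only an illustration, not what Word does internally. The name `to_halfwidth` is made up here, and unlike Word's "Half-width" option the sketch leaves katakana alone: it only maps the full-width ASCII variants (U+FF01 to U+FF5E) and the ideographic space (U+3000) to their ordinary ASCII counterparts.

```python
# Full-width ASCII variants sit exactly 0xFEE0 above their ASCII counterparts.
FULLWIDTH_TO_ASCII = {cp: cp - 0xFEE0 for cp in range(0xFF01, 0xFF5F)}
# The double-byte (ideographic) space becomes an ordinary space.
FULLWIDTH_TO_ASCII[0x3000] = 0x20


def to_halfwidth(text: str) -> str:
    """Replace full-width letters, digits, punctuation and spaces with ASCII."""
    return text.translate(FULLWIDTH_TO_ASCII)


if __name__ == "__main__":
    sample = "\uff37\uff4f\uff52\uff44\u3000\uff12\uff10\uff10\uff13"
    print(to_halfwidth(sample))
```

Because kanji and kana are not in the table, any Japanese that is meant to stay in the text passes through unchanged.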

To remove other annoying things like too-wide degree signs which the
above doesn't catch:

1. Press Ctrl key + A to select everything.
(Alternatively, select the offending items by holding down the left
mouse button, and dragging the pointer with this button held down to
select the text)
2. On the font menu of the formatting toolbar, choose a font such as
"Times New Roman".

This way the "double sized" degree signs will all be turned into
normal-sized ones.

Ben Bullock

Alex Koolhof

Jul 2, 2009, 12:40:05 AM
to hon...@googlegroups.com
Advice from Peter and Ben should help you out there Martha!  You can also go to Ryan's macro page here: http://ginstrom.com/software/wordmacros/
and there are instructions on how to install whichever one you choose.

Alex

martha mcclintock

Jul 2, 2009, 1:57:43 AM
to hon...@googlegroups.com
thanks peter! 

martha mcclintock

Jul 2, 2009, 1:58:57 AM
to hon...@googlegroups.com
thanks Alex, Peter, Ben, and Ryan!
i will be doing revisions to a full book manuscript, etc over the next 6 months and i think it is time i learn how macros can help with such!
thanks!

martha mcclintock

Jul 2, 2009, 2:03:33 AM
to hon...@googlegroups.com
ok... i just followed ben's instructions below, and I think we are
working in different versions of Word. I am in an English Mac OSX
version of Word 2004, with "upper case", "lower case" etc as the
"change case" options.......

Ben, what version of Word are you working with?

John Senior

Jul 2, 2009, 3:46:32 AM
to hon...@googlegroups.com
You can search for (and replace) double-byte characters with Word's
own search & replace function. I often do this as a final check when
editing documents created or "corrected" by clients.

The method may differ slightly from version to version, but if you
open the search window and go into the options, you'll find you can
search for format-related things like tabs, line breaks, italics etc.
Most people know that, but there's also an option for searching by
language. If you select Japanese and leave the search text box blank,
it will find any double-byte character for you, including spaces.
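
Outside Word, a similar check can be sketched in Python for plain-text files. This is not the macro mentioned in the thread and not Word's own search. The function name `find_wide_chars` is hypothetical, and the sketch assumes that "double-byte" here means any character the Unicode database classes as East Asian Wide ("W") or Fullwidth ("F").

```python
import unicodedata


def find_wide_chars(text):
    """Return (line, column, character, Unicode name) for each wide character."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            # "W" covers kanji and kana; "F" covers full-width forms and U+3000.
            if unicodedata.east_asian_width(ch) in ("W", "F"):
                hits.append((lineno, col, ch, unicodedata.name(ch, "UNKNOWN")))
    return hits


if __name__ == "__main__":
    sample = "abc\u3000def\nTotal\uff1a 5"
    for lineno, col, ch, name in find_wide_chars(sample):
        print(f"line {lineno}, column {col}: {name}")
```

Characters such as the degree sign are classed as "ambiguous" width rather than wide, so this check will not flag them.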

Presumably this is the process that the macro performs automatically.
I'll be installing it forthwith to save time and mousework.

I definitely think double-byte characters should be avoided in English
documents. They can mess up line/character spacing when printed and,
as noted by others, may cause mojibake if used on the web.

--
John Senior
@Tokyo, Japan

Ben Bullock

Jul 2, 2009, 5:45:50 AM
to hon...@googlegroups.com
Martha,

> Ben, what version of Word are you working with?

I am using Word 2002/2003, Japanese version, with the English user
interface pack. You may not have Japanese language support in your
version. The following links might help:

http://office.microsoft.com/en-us/word/HP052510371033.aspx
http://office.microsoft.com/en-us/word/HP030845661033.aspx

Ben Bullock


martha mcclintock

Jul 2, 2009, 7:08:09 PM
to hon...@googlegroups.com
thanks ben, yes, i thought it might be a japanese version you were
using...
will check these options out
grins
Martha