Specifically, I am using OmegaT+ at the moment. It is quite helpful in
handling the enormous repetition you find in a lot of patents, but these
documents also have very long sentences, often broken up by chemical or
mathematical formulas, etc. When the CAT app segments them, small parts
of the sentences turn up in separate segments, and since Japanese word
order has no relationship to English word order, of course, I find that
I have to rearrange everything in the word-processor file that is the
ultimate product.
Is there a way of reducing the amount of labor this requires, or is it a
problem one just has to live with?
Jon Johanning // jjoha...@igc.org
Looks very similar to the procedure I have arrived at (I use // as a
"bookmark"). What I do with the OCR'ed pdfs I often deal with is to fix
up the text file from the OCR to some extent before I feed it to the
CAT. Obviously one has to use some judgment about what jobs are worth
the trouble.
So far I haven't done any CAT jobs sent from agencies (usually they
require Trados and I can get out of these jobs that way), but there is
an agency I work for who will start giving me jobs using another CAT app
at some point, and I will see what happens with their stuff when it arrives.
Jon Johanning // jjoha...@igc.org
If you do not have a CAT tool with the above features, many of Matthew
Schlecht's suggestions will work. In fact, it is a very good idea to
pre-process all files a little before feeding them to a CAT tool.
Another thing to do with chemical patent claims is to rearrange them
into the English order (generally just bringing the ending phrase in
Japanese forward does the trick, even though the Japanese does not make
sense that way). If you then have glossary phrases, you can just pop them
in from your glossary in most CAT tools. (I am not sure if transfer from
glossary to target segment is available in OmegaT, but I think you can
copy and paste at least.)
At the very least, you can remove the unnecessary line breaks and
graphics, and put the graphics back later. Remove the breaks so that the
entire claim is segmented together.
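
A minimal sketch of the line-break cleanup described above, assuming the
source is a plain text file in which a genuine sentence end is marked by a
maru (。) and a blank line marks a paragraph boundary. The function name
join_broken_lines is made up for illustration, and removing graphics is
not covered here.

```python
def join_broken_lines(text: str) -> str:
    """Merge lines that were split mid-sentence (e.g. by OCR or PDF export).

    A line break is kept only after a line ending with a maru, or at a
    blank line, which is treated as a paragraph boundary (assumption).
    """
    out = []
    buffer = ""
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            # Blank line: flush what has accumulated and keep the gap.
            if buffer:
                out.append(buffer)
                buffer = ""
            out.append("")
            continue
        # Japanese needs no space between the joined fragments.
        buffer += stripped
        if stripped.endswith("。"):
            out.append(buffer)
            buffer = ""
    if buffer:
        out.append(buffer)
    return "\n".join(out)
```

With this, a claim that arrives broken over three lines comes out as one
line, so the CAT tool sees it as a single paragraph.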
When there are long lists of chemicals, you can rearrange and then
separate the chemicals into single units to make your TM more effective
later, but I have found that having a good glossary is even better.
(Some CAT tools have assembly features that will bring the lists
together if you have a really good glossary.)
Charles Aschmann
--
You received this message because you are subscribed to the Honyaku Mailing list.
To unsubscribe from this group, send email to honyaku+u...@googlegroups.com
For more options, visit this group at http://groups.google.com/group/honyaku?hl=en?hl=en
It feels good to read that others use similar software and
share the same problems.
I have been trying to migrate from OmegaT to Across, which is
another cheap way of using TM, and it may allow you to
change segment borders after segmentation, which would
solve the problem.
However, in my view Across is as overloaded with functions as
Trados and Transit (STAR AG), and OmegaT is much easier to
handle than Across and other "bolides".
So I have changed my attitude and no longer consider
pre- and post-editing a problem, but part of a defined translation
process.
My steps are:
1st: Copying the text from the source into the Windows editor
2nd: Make sure that all headers appear in separate lines
3rd: deleting unneeded information which does not appear in
the corresponding PDF-File
4th: Copying the text from Editor to Open Office
5th: Presegmentation (add "new line" after each "maru"),
replacing 2-byte numbers by one-byte numbers, assigning
formats (templates) and other pre-editing work when
needed.
6th: translating using OmegaT.
7th: post editing (restoration of paragraphs if they
contained more than one sentence ending with a maru; others
like adding Tabs or NewLine where needed)
8th: Spell check and proof reading
There is some work for which writing macros may be a solution,
but I am not that familiar with writing macros.
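
As an illustration of the kind of macro that could cover the 5th step, here
is a minimal Python sketch (not an Open Office macro) that puts a line
break after each maru and replaces 2-byte digits with one-byte digits. The
function name presegment is made up for this example, and the choice not
to break when the maru is directly followed by a closing bracket is an
assumption.

```python
import re

# Translation table: full-width digits (U+FF10 to U+FF19) to ASCII digits.
FULLWIDTH_DIGITS = str.maketrans("0123456789", "0123456789")

# A maru that is not already followed by a line break, a closing
# bracket, or the end of the text.
MARU_INSIDE_LINE = re.compile(r"。(?![」』)\n]|$)")


def presegment(text: str) -> str:
    """Return the text with one sentence per line and one-byte digits."""
    text = text.translate(FULLWIDTH_DIGITS)
    return MARU_INSIDE_LINE.sub("。\n", text)
```

Restoring the original paragraphs afterwards (the 7th step) is not covered
by this sketch, because it needs a record of where the extra breaks were
inserted.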
Best regards,
Uwe Hirayama
> It is quite helpful in handling the enormous repetition you find in a lot of patents, but these documents also have very long sentences, often broken up by chemical or mathematical formulas, etc. When the CAT app segments them, small parts of the sentences turn up in separate segments, and since Japanese word order has no relationship to English word order, of course, I find that I have to rearrange everything in the word-processor file that is the ultimate product.
>
> Is there a way of reducing the amount of labor this requires, or is it a problem one just has to live with?
Change the segmentation rules to fit the patterns in the documents.
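
To make the idea concrete: OmegaT's segmentation rules are pairs of regular
expressions (a pattern before the break position and a pattern after it),
each marked as either a break rule or an exception. The Python sketch
below only imitates that idea so you can try patterns on sample text; it
is not OmegaT's code, and the names Rule, segment and EXAMPLE_RULES as well
as the "first matching rule decides" order are assumptions of this example.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    is_break: bool  # True = break here, False = exception (no break)
    before: str     # regex that must match the text ending at the position
    after: str      # regex that must match the text starting at the position


def segment(text: str, rules: list[Rule]) -> list[str]:
    """Split text at every position where a break rule applies."""
    compiled = [
        (r.is_break, re.compile("(?:" + r.before + r")\Z"), re.compile(r.after))
        for r in rules
    ]
    segments = []
    start = 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            if before.search(text[:pos]) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos])
                    start = pos
                break  # the first matching rule decides this position
    segments.append(text[start:])
    return segments


# Example: do not break after a maru that is followed by a closing
# bracket; otherwise break after every maru.
EXAMPLE_RULES = [
    Rule(False, "。", "[」)]"),
    Rule(True, "。", "."),
]
```

For example, segment("彼は「はい。」と言った。次の文。", EXAMPLE_RULES) returns
["彼は「はい。」と言った。", "次の文。"]. Exceptions for formula markers or
claim numbering can be added to the rule list in the same way.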
> Specifically, I am using OmegaT+ at the moment.
Is there a specific reason why you use OmegaT+ and not OmegaT ?
Jean-Christophe Helary
----------------------------------------
fun: http://mac4translators.blogspot.com
work: http://www.doublet.jp (ja/en > fr)
tweets: http://twitter.com/brandelune
Charles Aschmann
Since I almost never encounter segments that require merging or splitting without being able to find a quick round trip I can't really say, plus I don't do patents... But I'd like to see one of those documents where there are formulas that break the structure.
Also, Anaphraseus (a Wordfast equivalent for OpenOffice.org/NeoOffice) works well too in such contexts. I don't use it very much so I don't know if it does splitting or merging, but that would sure be worth a try.
The problem is of course not because of any specific CAT tool.
Minoru Mochizuki
The West Japan Committee, Japan Translation Federation would like to let you
know about the following event.
---------------------
The 2nd JTF West Japan Seminar on July 2, 2010 (Fri.)
Theme: Translation from Japanese into English of Business Documents and
Advanced International Business
Seminar Instructor: Shintaro Tominaga, Cross-Cultural Business Consultant
http://jp.linkedin.com/in/feilong
MC: Naomi Kaminaga, President, Fan Works Co., Ltd.
http://abac.asia/business/
Date: July 2, 2010 (Fri.) Time: from 2:00 p.m. to 5:00 p.m.
Venue: The Consortium of Universities in Osaka
Campus Port Osaka , Room E, 4th Floor, Osaka Ekimae Daini Building,
1-2-2-400, Umeda, Kita-ku, Osaka City
TEL:06-6344-9560 FAX:06-6344-956
http://www.consortium-osaka.gr.jp/about/access.html
Operated by: JTF West Japan Committee
Sponsored by: Kansai Bureau of Economy, Trade and Industry, Ministry of
Economy, Trade and Industry
Summary:
Part 1: To Work on Japanese-English Translation of International Telephone
Quarterly Meeting of a Japanese company.
Part 2: To concentrate on Japanese-English Translation of a brochure of
ASAHI INTECC, a company manufacturing medical appliances.
Part 3: To address developing translation into international business.
Part 4: To make use of The Nikkei Weekly for Japanese-English Translation.
Admission fees:
JTF member:2,500 yen
Non member:3,500 yen
Application Deadline
June 29, 2010 (Tue.)
Applications will close when the fixed number of participants is reached.
Please make your application on the following site:
http://www.jtf.jp/west_seminar/index_w.do?fn=search
------------
Akiko Sato,
JTF Director
> So I have changed my attitude and no longer consider pre- and
> post-editing a problem, but part of a defined translation process.
>
> My steps are:
>
> 1st: Copying the text from the source into the Windows editor
> 2nd: Make sure that all headers appear in separate lines
> 3rd: deleting unneeded information which does not appear in the
> corresponding PDF-File
> 4th: Copying the text from Editor to Open Office
> 5th: Presegmentation (add "new line" after each "maru"), replacing
> 2-byte numbers by one-byte numbers, assigning formats (templates)
> and other pre-editing work when
> needed.
> 6th: translating using OmegaT.
> 7th: post editing (restoration of paragraphs if they contained
> more than one sentence ending with a maru; others
> like adding Tabs or NewLine where needed)
> 8th: Spell check and proof reading
I'm using much the same process, translated into Mac terms.
Jon Johanning // jjoha...@igc.org
>> Specifically, I am using OmegaT+ at the moment.
>>
> Is there a specific reason why you use OmegaT+ and not OmegaT ?
>
I just find it more comfortable to use, personally, in various respects.
But it's not much different.
Jon Johanning // jjoha...@igc.org
Is "bolide" a term of art in software jargon meaning
"a bloated monstrosity overloaded with seldom-used functions"?
A Google search on "define:bolide" yields only entries like
"an exploding or fragmenting meteor or fireball".
Curious,
Mark Spahn (West Seneca, NY)
> On 6/22/10 8:44 PM, Jean-Christophe Helary wrote:
>> Change the segmentation rules to fit the patterns in the documents.
> That would be a good idea if Japanese patent writers followed "patterns." Patterns? It would be a big help if they would even write correct Japanese.
:)
>> Is there a specific reason why you use OmegaT+ and not OmegaT ?
>
> I just find it more comfortable to use, personally, in various respects. But it's not much different.
Would you mind giving details ?
> Is "bolide" a term of art in software jargon meaning
> "a bloated monstrosity overloaded with seldom-used functions"?
> A Google search on "define:bolide" yields only entries like
> "an exploding or fragmenting meteor or fireball".
No, it's just another one of Uwe's "Germanisms" (his messages are
peppered with them). In German, "Bolide" (pronounced "bo-leed-ay")
refers to a high-powered sports car and is also sometimes used for other
powerful, robust, and/or slightly menacing pieces of machinery or
systems (high-end amplifiers weighing a ton and costing a fortune are
another example). It derives from the same root as English, namely a
fireball or meteor, and it still has that meaning in German, too, but
the other meaning could be rendered in English as "over-engineered".
Wolfgang Bechstein
Thanks for making me aware of another "Germanism".
@Wolfgang
> No, it's just another one of Uwe's "Germanisms" (his messages are
> peppered with them).
You can be glad that you did not have the chance to hear my
accent. To put it in other words: (NES, please close your eyes)
Ze mor my Japanees improofs the badder my English becoms :-)
Well, I am glad that this is a mailing list for translators who work
with Japanese but not necessarily with English. (BTW this is
the reason why I usually write JP2GER TRSL in the last line.)
>> Is "bolide" a term of art in software jargon meaning
>> "a bloated monstrosity overloaded with seldom-used functions"?
>> A Google search on "define:bolide" yields only entries like
>> "an exploding or fragmenting meteor or fireball".
This, however, proves that the image (metaphor?) works
somehow.
OmegaT may appear like the "Nano" of Tata but it
does what a translation memory should do.
I once happened to use a "satellite" version of the
product of STAR AG (Was it called Transit?) and
it did its job very well; in particular, the dictionary
functions were excellent. But for me, as a fan of
shortcuts, operating the software resulted in a kind
of "apparatus gymnastics" for the fingers (as far as
I remember).
Across may be an alternative; however, registering
vocabulary took too many steps, and I always felt
a kind of uncertainty about its cooperation with the
underlying database (Microsoft SQL Server):
when I started the program, it sometimes prompted
me to enter a password which I had never
registered before. So I always felt uncertain whether
I would be able to start it at the next session.
Kind regards,
Uwe Hirayama
hira...@t-online.de
JP2GER TRSL
One thing I recall was that, since I was mainly using OmegaT on jobs
that I received as pdfs and had to OCR to get a text file, the OCR
software did lots of strange things to the Japanese, which produced huge
numbers of completely unnecessary tags when I got it into OmegaT. I
eventually figured out how to eliminate a lot of the tags by
manipulating the text files before I fed them to OmegaT, but there were
still quite a few, which were quite a nuisance.
Of course, this was not OmegaT's fault, but I had to just ignore the
tags and bypass the tag reconciliation step at the end of the job, and
if I recall correctly the only way I found of getting an actual text
file which I could dump into a word processor and finish up with was to
run my cursor over the whole translation in OmegaT, select it, and copy
and paste into the word processor.
At any rate, when I use OmegaT+ now, I find that I don't have the
useless tag problem, and when I get through the job, I can just
"generate translation," convert that file from .odt to .rtf, and clean
up the .rtf file.
Jon Johanning // jjoha...@igc.org