TSE: About the automatic translation of Chinese character based text to English text

knud van eeden

Aug 11, 2022, 7:03:56 AM
to SemWare TSE Pro Text Editor
Hlade Law: The most difficult task should be given to the laziest person. He will surely find an easy and high-quality way to solve it. After all, a lazy person will not complicate his life and will do the work immediately well, so that later he does not have to redo it ;-) 

===

Hello,

It seems to me that this translation could probably be automated fairly easily.

The reason is that a one-to-one mapping might be possible.

Steps:

1. Given a TSE .s file

2. Parse the .s file for possible Chinese text presentation

a. Look for strings (double quotes, single quotes, ...)

b. Look for calls such as Ask(), Warn(), Format(), ...

3. Extract that Chinese text presentation

4. Convert that found Chinese text presentation character by character
   to e.g. Google translate ready text (Unicode, ...)

5. Send it e.g. to translate.google.com (e.g. using screen scraping or URLGet API or ...) for Chinese -> English translation

6. Replace the Chinese character presentation in the TSE .s file by its English text.
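
Steps 2, 3 and 6 could be sketched roughly as follows in Python. This is only a minimal sketch under several assumptions: the .s file has already been decoded to Unicode text, string literals stay on one line and contain no escaped quotes, and comments are not skipped. The names find_chinese_literals, replace_chinese_literals, demo_translate and DEMO_TABLE are made up for illustration, and the lookup table merely stands in for a real translation service (step 5). The sample macro text is invented as well.

```python
import re

# A double- or single-quoted literal on one line.
# Assumption: no escaped quotes inside the literal.
STRING_LITERAL = re.compile(r'"([^"\n]*)"|\'([^\'\n]*)\'')

# Basic CJK Unified Ideographs block (U+4E00..U+9FFF).
CJK = re.compile(r'[\u4e00-\u9fff]')


def find_chinese_literals(source):
    """Return the contents of quoted literals holding at least one CJK ideograph."""
    found = []
    for match in STRING_LITERAL.finditer(source):
        body = match.group(0)[1:-1]  # strip the surrounding quotes
        if CJK.search(body):
            found.append(body)
    return found


def replace_chinese_literals(source, translate):
    """Replace each Chinese literal by translate(literal), keeping its quotes."""
    def swap(match):
        quote = match.group(0)[0]
        body = match.group(0)[1:-1]
        if CJK.search(body):
            return quote + translate(body) + quote
        return match.group(0)  # leave non-Chinese literals untouched
    return STRING_LITERAL.sub(swap, source)


# Hypothetical lookup table standing in for a translation service.
DEMO_TABLE = {"文件未找到": "File not found", "继续吗": "Continue?"}


def demo_translate(text):
    """Return the table entry for text, or text itself if it is unknown."""
    return DEMO_TABLE.get(text, text)


# Invented sample text resembling a TSE macro.
SAMPLE = (
    'proc Main()\n'
    '    Warn("文件未找到")\n'
    '    if Ask("继续吗", answer)\n'
    '        Message("ok")\n'
    '    endif\n'
    'end\n'
)

if __name__ == "__main__":
    print(find_chinese_literals(SAMPLE))
    print(replace_chinese_literals(SAMPLE, demo_translate))
```

A real translation may contain a quote character itself, so the replacement text would probably need to be checked or escaped before it is written back into the .s file.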




knud van eeden

Aug 11, 2022, 12:00:20 PM
to SemWare TSE Pro Text Editor
I would try to solve it as simply as possible.

Start first with an example with only 1 character
then if that works 2 characters, 
then if that works 3 characters, 
and so on...

I am not sure whether each Chinese character always takes up the same number of characters (e.g. 2) in the TSE .s file.
If it does not, parsing becomes much more difficult, as one would need to scan a very large table of possibilities.
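
Whether the count is fixed depends on how the .s file is encoded, which is not known here. A small Python sketch can check this for 1, 2 and 3 characters under a few candidate encodings. The choice of gbk, big5 and utf-8 as candidates is an assumption, and the helper names bytes_per_char and report are made up for illustration.

```python
def bytes_per_char(text, encoding):
    """Return the encoded byte length of each character in text."""
    return [len(ch.encode(encoding)) for ch in text]


def report(samples, encodings=("gbk", "big5", "utf-8")):
    """Build one line per sample and encoding with the per-character byte counts."""
    lines = []
    for text in samples:
        for enc in encodings:
            lines.append(f"{text!r} {enc}: {bytes_per_char(text, enc)}")
    return lines


if __name__ == "__main__":
    # Start with 1 character, then 2, then 3.
    for line in report(["中", "中文", "中文字"]):
        print(line)
```

For these sample characters, gbk and big5 use 2 bytes each and utf-8 uses 3 bytes each, while plain ASCII characters use 1 byte in all three. So a file that mixes Chinese and ASCII text is never a fixed-width stream, but decoding the whole file with the right encoding first avoids having to scan a table by hand.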

In general, the principle of induction, the principle of divide and conquer, and Occam's razor
could be applied to resolve it.

with friendly greetings
Knud van Eeden
