To view this discussion on the web visit https://groups.google.com/d/msgid/okapi-users/008801d8185b%24cdecfa20%2469c6ee60%24%40gmail.com.
Hi Yves, hi Chase,
thank you very much for the quick answer!
The HTML filter will generate a restype based on the surrounding element, but the XML Stream filter for some reason does not. I am not sure why this is.
Probably because in HTML you can define which element maps to which value of the restype attribute (the available values are predefined in the XLIFF spec), and with XML you cannot do that.
It would therefore be good to have a possibility to set the resname accordingly.
On Wed, Feb 2, 2022 at 9:39 AM <yves.s...@gmail.com> wrote:
Actually you may (not sure) be able to do this.
The XML filter supports ITS 2 and there is a way to tell what the value of an ID will be based on an XPath.
See https://okapiframework.org/wiki/index.php/XML_Filter#idValue_and_xml:id for details.
By "ID" you mean that the ID of the trans-unit would be set to the value of the XPath, right?
Unfortunately this is not a solution, since we need (as you pointed out) unique IDs, and a lot of segments in the XML will have the same structure tags surrounding them.
The purpose of getting the external tag information is to be able to pass it to translate5, so that we can show it to the translator as context information, as many other CAT tools do.
So it is not possible right now?
If not: I might be able to convince our clients to fund such a development. Do you know someone who could implement it if funded?
best
Marc
Hi Chase,
thank you, this solution sounds very interesting.
I just tried it, but it did not work and I do not know why.
I tried it using Rainbow from Okapi 1.41, locally on my Ubuntu machine.
This is my fprm for the XML Stream filter:
global_cdata_subfilter: okf_regex@translate5-exclude-cdata
assumeWellformed: true
preserve_whitespace: false
attributes:
  xml:lang:
    ruleTypes: [ATTRIBUTE_WRITABLE]
  xml:id:
    ruleTypes: [ATTRIBUTE_ID]
  id:
    ruleTypes: [ATTRIBUTE_ID]
  xml:space:
    ruleTypes: [ATTRIBUTE_PRESERVE_WHITESPACE]
    preserve: ['xml:space', EQUALS, preserve]
    default: ['xml:space', EQUALS, default]
elements:
  LINK:
    ruleTypes: [INLINE]
  FORMAT:
    ruleTypes: [TEXTUNIT]
    elementType: FORMAT
This is the xml I'm processing
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE FIRSTspirit_XML_EXPORT SYSTEM "TranslationXml.dtd">
<FIRSTspirit_XML_EXPORT sourceLanguage="EN" version="5.2.200508.79058">
  <PAGENODE revision="33558" uid="haegglunds" uidType="PAGESTORE">
    <PAGENODE revision="33558" uid="haegglunds_products" uidType="PAGESTORE">
      <PAGE revision="52459" uid="standard_page_1" uidType="PAGESTORE">
        <BODY name="main_content">
          <SECTION id="307797" name="text_modules_1" templateid="41">
            <CMS_VALUE name="st_text">
              <FORMAT name="h2">Power for productivity</FORMAT>
            </CMS_VALUE>
          </SECTION>
        </BODY>
      </PAGE>
    </PAGENODE>
  </PAGENODE>
</FIRSTspirit_XML_EXPORT>
And this is the xliff I'm getting:
<?xml version="1.0" encoding="UTF-8"?>
<xliff version="1.2"
       xmlns="urn:oasis:names:tc:xliff:document:1.2"
       xmlns:okp="okapi-framework:xliff-extensions"
       xmlns:its="http://www.w3.org/2005/11/its"
       xmlns:itsxlf="http://www.w3.org/ns/its-xliff/"
       its:version="2.0">
  <file original="xxxxx.xml" source-language="en-US" target-language="de-DE" datatype="xml">
    <body>
      <trans-unit id="tu1">
        <source xml:lang="en-US">Power for productivity</source>
        <seg-source><mrk mid="0" mtype="seg">Power for productivity</mrk></seg-source>
        <target xml:lang="de-DE"></target>
      </trans-unit>
    </body>
  </file>
</xliff>
What am I missing?
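For anyone reproducing this, a small check like the following can confirm whether the generated XLIFF carries a restype on each trans-unit. It is only a sketch using Python's standard library; the function name `restypes` and the trimmed `SAMPLE` string are made up for illustration and are not part of Okapi.

```python
import xml.etree.ElementTree as ET

# Namespace of XLIFF 1.2, as declared in the file above.
XLIFF_NS = "urn:oasis:names:tc:xliff:document:1.2"


def restypes(xliff_text):
    """Map each trans-unit id to its restype attribute (None if absent)."""
    root = ET.fromstring(xliff_text)
    return {
        tu.get("id"): tu.get("restype")
        for tu in root.iter(f"{{{XLIFF_NS}}}trans-unit")
    }


# Trimmed copy of the XLIFF shown above (hypothetical test input).
SAMPLE = """<xliff version="1.2" xmlns="urn:oasis:names:tc:xliff:document:1.2">
  <file original="xxxxx.xml" source-language="en-US" target-language="de-DE" datatype="xml">
    <body>
      <trans-unit id="tu1">
        <source xml:lang="en-US">Power for productivity</source>
        <target xml:lang="de-DE"></target>
      </trans-unit>
    </body>
  </file>
</xliff>"""

if __name__ == "__main__":
    # A value of None means the filter did not write a restype.
    print(restypes(SAMPLE))
```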
Thank you very much in advance
Marc
--
You received this message because you are subscribed to the Google Groups "okapi-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to okapi-users...@googlegroups.com.
Thank you Jim, thank you Chase,
that works! Lowercase did the trick. It's a bit confusing that you have to write the tag in lowercase in the config, even though it appears in uppercase in the XML.
Anyway, now I know.
Thank you!
What unfortunately does not work is to do something like this in the fprm:
format:
  ruleTypes: [TEXTUNIT]
  elementType: h1
  conditions:
    - name
    - EQUALS
    - [h1]
format:
  ruleTypes: [TEXTUNIT]
  elementType: p
  conditions:
    - name
    - EQUALS
    - [p]
That is, to define a different elementType for the same tag depending on the value of an attribute.
It fails because Rainbow silently deletes the first format definition.
Probably there is no solution for this at the moment, right?
Admittedly this is a rare case, and we could solve it with pre- and post-processing: adjusting the XML before we send it to Okapi, and again after we get it back for the export.
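A minimal sketch of what such pre- and post-processing could look like, in Python with only the standard library. The idea is to give each FORMAT variant its own element name before extraction, so that every variant can get its own entry (and its own elementType) in the fprm, and to rename the elements back afterwards. The `FORMAT_` prefix and both function names are assumptions made up for this illustration, not anything Okapi defines, and the sketch assumes the `name` values are valid XML names and that no original element already starts with `FORMAT_`.

```python
import xml.etree.ElementTree as ET

# Hypothetical naming scheme for the temporary element names.
PREFIX = "FORMAT_"


def split_format_tags(xml_text):
    """Pre-processing: <FORMAT name="h1"> becomes <FORMAT_h1 name="h1">."""
    root = ET.fromstring(xml_text)
    # list() so that renaming does not interfere with the iteration.
    for el in list(root.iter("FORMAT")):
        name = el.get("name")
        if name:
            el.tag = PREFIX + name
    return ET.tostring(root, encoding="unicode")


def merge_format_tags(xml_text):
    """Post-processing: rename every <FORMAT_xyz> back to <FORMAT>."""
    root = ET.fromstring(xml_text)
    for el in list(root.iter()):
        if el.tag.startswith(PREFIX):
            el.tag = "FORMAT"
    return ET.tostring(root, encoding="unicode")


# Hypothetical input modelled on the export above.
SAMPLE = (
    '<CMS_VALUE name="st_text">'
    '<FORMAT name="h1">Heading</FORMAT>'
    '<FORMAT name="p">Paragraph</FORMAT>'
    '</CMS_VALUE>'
)

if __name__ == "__main__":
    pre = split_format_tags(SAMPLE)
    print(pre)                     # elements renamed per name attribute
    print(merge_format_tags(pre))  # original element names restored
```

Note that ElementTree drops the XML declaration, the DOCTYPE and any comments when serializing, so a real round trip would have to put those back. In the fprm the renamed elements would presumably have to be written in lowercase as well, as with `format` above.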
So now I have a solution and can talk to the client.
And yes, to have a config option in Okapi that (if active) sets the restype by default would be great. Maybe we can find a way to implement this/support its development sooner or later.
best
Marc
--
Marc Mittag
MittagQI - Quality Informatics
Service Desk for Requests: https://jira.translate5.net/servicedesk
Please request a login via mail, if you have none
MittagQI, Konrad-Lorenz-Weg 10, D-72116 Mössingen, Germany
Tel.: ++49 (0)7473/220202
Fax: ++49 (0)7473/220211
mailto: Ma...@MittagQI.com
Web: www.MittagQI.com
Optional PGP encryption: a PGP key for every MittagQI employee is stored on pool.sks-keyservers.net, which you can use to PGP-encrypt your mails to us.
Hi Jim,
I was working with Okapi Rainbow 1.41.
So maybe the behavior has changed with 1.42? I did not download that one yet.
best
Marc
Hi Chase,
thank you again for the answer below!
Do I understand correctly that there is currently no possibility to do that with the XML-ITS filter? It would have to be developed, right?
best
Marc