Additional subfiltering fix (was Issue 282)


Chase Tingley

Dec 12, 2012, 8:14:45 PM
to okapi...@googlegroups.com
I mentioned on the call today that I had found a file that was producing funny results in my ongoing subfilter tests.  This turned out to be a small bug in the original fix for Issue 282 (the textunit stack got confused when the textunit rule was added as part of an empty element).

I've checked in a testcase and simple fix to dev.  (Commit 1997fe2c40aa)

ct

Jim Hargrave

Dec 13, 2012, 9:14:15 PM
to okapi...@googlegroups.com, Chase Tingley
Thanks Chase - saw your code - was merging in a fix for Sergei about the
same time.

J

Chase Tingley

Dec 14, 2012, 1:35:46 PM
to Jim Hargrave, okapi...@googlegroups.com
Cool.

The current state of this feature is looking pretty good to me.  I've done some testing with pretty elaborate files and had good results.

The only thing that's a little weird is that I get vestigial TUs consisting only of a single placeholder every time a block of content is passed off to the subfilter (i.e., one per original TextUnit).  I haven't gone in deep enough to know whether this is an essential aspect of how the merging works with subfilters - it would be slightly nicer if I could find a way to make these go away.

Jim Hargrave

Dec 14, 2012, 1:40:35 PM
to Chase Tingley, okapi...@googlegroups.com
Is this a self-referential thing? Standalone placeholder points to the original TU?

If you have a snippet test case I can look at that too.

Jim

Chase Tingley

Dec 14, 2012, 1:57:15 PM
to okapi...@googlegroups.com
Here are the relevant XML and XLIFF snippets (also attached, with the fprm)

XML:
<xml>                                                                           
    <foo>&lt;html&gt;&lt;head&gt;&lt;title&gt;This is the title&lt;/title&gt;&lt;/head&gt;&lt;body&gt;&lt;p&gt;This is the body.&lt;/p&gt;&lt;/body&gt;&lt;/html&gt;</foo>
</xml>

XLIFF:
<body>                                                                          
  <group id="tu2_ssf1" resname="sub-filter:foo">                                  
    <trans-unit id="tu2_tu1" resname="foo_3" restype="x-title">                     
      <source xml:lang="en">This is the title</source>                                
    </trans-unit>                                                                   
    <trans-unit id="tu2_tu2" resname="foo_6" restype="x-paragraph">                 
      <source xml:lang="en">This is the body.</source>                                
    </trans-unit>                                                                   
  </group>                                                                        
  <trans-unit id="tu2" restype="x-foo">                                           
    <source xml:lang="en"><x id="1"/></source>                                      
  </trans-unit>                                                                   
  <group id="tu1_ssf2" resname="sub-filter:xml">                                  
  </group>                                                                        
  <trans-unit id="tu1" restype="x-xml">                                           
    <source xml:lang="en"><x id="1"/></source>                                      
  </trans-unit>                                                                   
</body>                                                                         

So, there are a couple of things going on here.  The subfiltered TUs appear in the tu2_ssf1 group.  This is followed by the tu2 TU, which consists only of a placeholder -- presumably representing the subfiltered content.

There's then another group+TU pair, except in this case the group is also empty.  This corresponds to subfiltering the whitespace between the <xml> and <foo> elements.
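
For anyone who wants to spot these cases in their own output, here is a rough sketch (not Okapi code, just Python's standard xml.etree module run over the XLIFF body above) that lists the trans-units whose source is nothing but a single <x/> placeholder, plus any groups with no trans-units in them. The function names placeholder_only_units and empty_groups are made up for illustration, and the definition of "vestigial" used here is an assumption based on the snippet.

```python
import xml.etree.ElementTree as ET

# The XLIFF <body> from the snippet above, trailing whitespace removed.
XLIFF_BODY = """<body>
  <group id="tu2_ssf1" resname="sub-filter:foo">
    <trans-unit id="tu2_tu1" resname="foo_3" restype="x-title">
      <source xml:lang="en">This is the title</source>
    </trans-unit>
    <trans-unit id="tu2_tu2" resname="foo_6" restype="x-paragraph">
      <source xml:lang="en">This is the body.</source>
    </trans-unit>
  </group>
  <trans-unit id="tu2" restype="x-foo">
    <source xml:lang="en"><x id="1"/></source>
  </trans-unit>
  <group id="tu1_ssf2" resname="sub-filter:xml">
  </group>
  <trans-unit id="tu1" restype="x-xml">
    <source xml:lang="en"><x id="1"/></source>
  </trans-unit>
</body>"""


def placeholder_only_units(body):
    """Return ids of trans-units whose <source> is one <x/> and no text."""
    ids = []
    for tu in body.iter("trans-unit"):
        source = tu.find("source")
        if source is None:
            continue
        children = list(source)
        # Text can sit before the first child or after any child (tail).
        has_text = bool((source.text or "").strip()) or any(
            (child.tail or "").strip() for child in children
        )
        if len(children) == 1 and children[0].tag == "x" and not has_text:
            ids.append(tu.get("id"))
    return ids


def empty_groups(body):
    """Return ids of <group> elements with no direct trans-unit children."""
    return [g.get("id") for g in body.iter("group") if g.find("trans-unit") is None]


body = ET.fromstring(XLIFF_BODY)
print(placeholder_only_units(body))  # ['tu2', 'tu1']
print(empty_groups(body))            # ['tu1_ssf2']
```

This only reports the pattern; whether those TUs can safely be dropped depends on how the merge step uses them, which is the open question here.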

okf_xmlstream@subfilter.fprm
simple.xml
simple.xml.xlf

Jim Hargrave

Dec 14, 2012, 2:20:29 PM
to okapi...@googlegroups.com, Chase Tingley
Got to love Okapi's resource model :-)  Powerful, but allows for a lot of variation.  I think what we need to do first is come up with a standard way to represent these cases and document them. If we can get a consensus on that then we can modify the code accordingly.

We have some old wiki pages that document a lot of cases (e.g., http://code.google.com/p/okapi/wiki/ResourceCase001) - maybe we can add this case to the list.

Jim