Regarding NoSke XML export format (possible feature request)

Balázs INDIG

Jan 5, 2024, 8:45:02 AM
to NoSketch Engine
Dear All,

somewhere between bonito-open-4.24.6 and bonito-open-5.55.5 the XML export behavior changed. Currently the code looks like this:

```
                left = ' '.join([n(scoll(x)) for x in l['Left']])
                kwic = ' '.join([n(s(x)) for x in l['Kwic']])
                right = ' '.join([n(scoll(x)) for x in l['Right']])
                if params.get('viewmode') == 'kwic':
                    outf.write('    <left>%s</left>' % left + nl)
                    outf.write('    <kwic>%s</kwic>' % kwic + nl)
                    outf.write('    <right>%s</right>' % right + nl)
                else:
                    outf.write('    %s %s %s' % (left, kwic, right) + nl)
```

In the old behavior the else branch did not exist: even when "sentence" display mode was selected instead of "kwic", the left, kwic and right tags were written to the output. Currently their content is joined by spaces, which makes them indistinguishable. (In the browser the kwic is still displayed in red, clearly distinguishable from the left and right context. One would expect the same in the XML output.)
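
To make the described old behavior concrete, here is a minimal standalone sketch that writes the three tags regardless of view mode. It is not the actual bonito-open code: `write_line` is a hypothetical helper, and the escaping via `xml.sax.saxutils.escape` is an assumption standing in for whatever the `n()`, `s()` and `scoll()` helpers in the excerpt above do.

```python
import io
from xml.sax.saxutils import escape


def write_line(outf, left, kwic, right, nl='\n'):
    """Hypothetical helper: always emit <left>, <kwic> and <right>,
    independent of the selected view mode (kwic or sentence)."""
    for tag, text in (('left', left), ('kwic', kwic), ('right', right)):
        # escape() replaces &, < and > so the output stays well-formed XML
        outf.write('    <%s>%s</%s>' % (tag, escape(text), tag) + nl)


buf = io.StringIO()
write_line(buf, 'He said', 'R&D', 'matters .')
print(buf.getvalue(), end='')
```

With this shape the consumer decides whether to keep the parts separate or to join them, which is the point of the request.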

I want to use the old behavior (i.e. one full sentence per concordance line, with the left, kwic and right tags kept separate and distinguishable automatically, without any external post-processing). I cannot see any use case where one could not easily merge the content of the left, kwic and right tags when needed, while the inverse operation is impossible.
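
A small sketch of that asymmetry, using only the Python standard library. The `<line>` wrapper element and the sample sentences are assumptions for illustration; only the `<left>`, `<kwic>` and `<right>` element names come from the export code quoted above.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: the <line> wrapper is assumed, not taken from bonito.
SAMPLE = """<line>
    <left>The quick brown</left>
    <kwic>fox</kwic>
    <right>jumps over the lazy dog .</right>
</line>"""


def merge_line(line_elem):
    """Join left, kwic and right into one space-separated string."""
    parts = [(line_elem.findtext(tag) or '').strip()
             for tag in ('left', 'kwic', 'right')]
    return ' '.join(p for p in parts if p)


merged = merge_line(ET.fromstring(SAMPLE))
print(merged)

# The inverse is not recoverable in general: once flattened, a hit on a
# repeated word no longer says which occurrence was the kwic.
AMBIGUOUS = "<line><left>the cat saw</left><kwic>the</kwic><right>dog</right></line>"
flat = merge_line(ET.fromstring(AMBIGUOUS))
occurrences = flat.split().count('the')
print(flat, occurrences)
```

Merging takes a few lines on the consumer side, whereas the flattened string in the second case contains the hit word twice and carries no marker for the hit position.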

1. Is there a rationale behind this new behavior?
2. Can I request that it be reverted, or that an option be added to restore the old behavior?

Thank you,

Balázs INDIG

Balázs INDIG

Mar 18, 2024, 6:41:29 PM
to NoSketch Engine, Balázs INDIG
Dear All,

while trying to work around the aforementioned limitation I realised that when exporting data in KWIC viewmode the context is limited to 100/500 characters (whereas in Sentence mode the output is limited only by the length of the sentence).

The KWIC limit is mentioned at https://www.sketchengine.eu/guide/account-limitations/ but I could not find any information there on Sentence viewmode.

This makes it impossible to have one whole sentence split by left, kwic, right tags in the XML export.
So this is a regression compared to the previous versions, and one that cannot be worked around.

Is there any rationale behind this change? Can I have some clarification on the topic?
The changelog page (if it refers to bonito-open) is outdated: https://www.sketchengine.eu/documentation/bonito-changelog/

Thank you in advance,

Balázs INDIG

Tomáš S

Mar 28, 2024, 4:21:54 AM
to NoSketch Engine, Balázs INDIG
Dear Balázs,
we have released a new Open-Bonito (5.71.15) with an updated XML export. The exported concordance now contains <left>, <kwic> and <right> elements.


Best regards!

Tomas

On Monday, March 18, 2024 at 23:41:29 UTC+1, Balázs INDIG wrote:

Balázs INDIG

Mar 28, 2024, 7:23:09 PM
to NoSketch Engine, Tomáš S, Balázs INDIG
Dear Tomas,

thank you!

It works! 

Balázs INDIG
