--
You received this message because you are subscribed to the Google Groups "NoSketch Engine" group.
To unsubscribe from this group and stop receiving emails from it, send an email to noske+unsubscribe@sketchengine.co.uk.
To post to this group, send email to no...@sketchengine.co.uk.
Visit this group at https://groups.google.com/a/sketchengine.co.uk/group/noske/.
To view this discussion on the web visit https://groups.google.com/a/sketchengine.co.uk/d/msgid/noske/ff520286-7770-42af-b09d-c5abaeb09627%40sketchengine.co.uk.
For more options, visit https://groups.google.com/a/sketchengine.co.uk/d/optout.
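Dear Elen,

the awk tool can be used to filter any text (XML) file. You can use regular expressions, as in the following command line:

    awk '/<doc .*date="201[456]"/,/<\/doc>/' corpus.xml > subcorpus.xml

This prints every line from a <doc> opening tag whose date attribute is 2014, 2015 or 2016 up to and including the next closing </doc> tag.

Best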
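For anyone who would rather not use awk, the same range filter can be sketched in Python. This is only an illustration under a few assumptions: the corpus is in the usual one-tag-per-line layout, the date attribute holds just the year (as the awk pattern implies), and the names filter_docs and write_subcorpus are made up for this example rather than part of any NoSketch Engine tooling.

```python
import re
from typing import Iterable, Iterator

# Same patterns as in the awk command: the opening <doc> tag of a wanted
# document, and the closing tag that ends it.
START = re.compile(r'<doc .*date="201[456]"')
END = re.compile(r'</doc>')


def filter_docs(lines: Iterable[str]) -> Iterator[str]:
    """Yield the lines from each matching <doc ...> tag through its </doc>."""
    inside = False
    for line in lines:
        if not inside and START.search(line):
            inside = True
        if inside:
            yield line
            # Like an awk range, the end pattern is also tested on the
            # line that opened the range.
            if END.search(line):
                inside = False


def write_subcorpus(src_path: str, dst_path: str) -> None:
    """Hypothetical helper: stream src_path through the filter into dst_path."""
    with open(src_path, encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        dst.writelines(filter_docs(src))
```

Calling write_subcorpus("corpus.xml", "subcorpus.xml") should then behave like the awk command. The file is read line by line, so large corpora do not need to fit in memory. Like the awk version, this is plain pattern matching and not real XML parsing, so it relies on the tags appearing on lines of their own.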
On Wed, Mar 29, 2017 at 11:19 AM, Elen <el...@fridu.net> wrote:
Dear all,
I have annotated my corpus with XML and as a result can create subcorpora in Sketch Engine. However, I need to be able to download these subcorpora for further analysis with a different piece of software. To my great disappointment, it turns out this is not possible in the commercial version of Sketch Engine. I was wondering whether this is something that might be possible using NoSketch Engine. Are you aware of such a function?
If not, I wonder whether any of you could point me towards a simple, preferably non-commercial, piece of software that would enable me to extract some of the XML-annotated content of my corpus automatically? I suspect it really doesn't need to be anything fancy, but I have so far failed to find something suitable and would be very grateful for any tips.
Thank you very much for your help.
Best regards,
Elen