Hi Roman,
unfortunately, this is not a valid XML file:
$ xmllint --noout nas_anotacni_slovnik.xml
nas_anotacni_slovnik.xml:3: parser error : Extra content at the end of the document
<SYNSET><ID>00004022-v</ID><POS>v</POS><SYNONYM><LITERAL>inhalovat<SENSE>1</SENS
^
and therefore I can't use any XML library, which makes the job more
complex. Do you have input files which are valid XML?
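
In case no valid version exists, one possible workaround is sketched below in Python. It rests on assumptions not confirmed anywhere in this thread: that every <SYNSET> element sits on a single line of its own (as in the quoted samples), that a ukb dictionary line has the form "lemma concept-id ..." and that a graph relation line has the form "u:source v:target". The function names parse_synsets and to_ukb are made up for illustration, and the output formats should be checked against the ukb documentation before use.

```python
import xml.etree.ElementTree as ET

# Two entries copied from the samples quoted in this thread.
SAMPLE = [
    "<SYNSET><ID>00005811-v</ID><POS>v</POS><SYNONYM>"
    "<LITERAL>mrkat<SENSE>2</SENSE></LITERAL>"
    "<LITERAL>zamrkat<SENSE>1</SENSE></LITERAL></SYNONYM>"
    "<ILR>00559482-v<TYPE>hypernym</TYPE></ILR></SYNSET>",
    "<SYNSET><ID>00014558-n</ID><POS>n</POS><SYNONYM>"
    "<LITERAL>forma<SENSE>1</SENSE></LITERAL>"
    "<LITERAL>tvar<SENSE>1</SENSE></LITERAL>"
    "<LITERAL>podoba<SENSE>1</SENSE></LITERAL></SYNONYM>"
    "<BASE>*</BASE></SYNSET>",
]


def parse_synsets(lines):
    """Parse each line as a standalone XML fragment (hypothetical helper).

    Assumes one complete <SYNSET> element per line; any other line is
    skipped. A malformed <SYNSET> line raises xml.etree.ElementTree.ParseError.
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("<SYNSET>"):
            continue  # blank lines, XML declaration, wrapper tags, etc.
        elem = ET.fromstring(line)
        synset_id = elem.findtext("ID")
        # The lemma is the text directly inside <LITERAL>, before <SENSE>.
        literals = [(lit.text or "").strip() for lit in elem.iter("LITERAL")]
        # The target id is the text directly inside <ILR>, before <TYPE>.
        relations = [
            ((ilr.text or "").strip(), ilr.findtext("TYPE"))
            for ilr in elem.iter("ILR")
        ]
        yield synset_id, literals, relations


def to_ukb(synsets):
    """Build dictionary and relation lines (formats are assumptions)."""
    dictionary = {}
    edges = []
    for synset_id, literals, relations in synsets:
        for lemma in literals:
            # Assumption: multiword lemmas are joined with underscores.
            key = lemma.replace(" ", "_")
            dictionary.setdefault(key, []).append(synset_id)
        for target, _rel_type in relations:
            edges.append(f"u:{synset_id} v:{target}")
    dict_lines = [
        f"{lemma} {' '.join(ids)}" for lemma, ids in sorted(dictionary.items())
    ]
    return dict_lines, edges


dict_lines, edges = to_ukb(parse_synsets(SAMPLE))
for out_line in dict_lines + edges:
    print(out_line)
```

For the real file, the SAMPLE list would be replaced by an open file handle, for example open("nas_anotacni_slovnik.xml", encoding="utf-8"), with the encoding also being an assumption.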
best,
aitor
On Wed, Dec 16, 2015 at 08:08:32AM -0800, Roma Sudarikov wrote:
> Hi Aitor,
>
> thank you for such a quick response.
> I've attached the file, but you can also download it from the page I
> mentioned
> (https://lindat.mff.cuni.cz/repository/xmlui/bitstream/handle/11858/00-097C-0000-0001-4880-3/Czech_WordNet_1.9_PDT.zip?sequence=1&isAllowed=y)
>
> Best regards,
> Roman Sudarikov
>
>
> On Wednesday, December 16, 2015 at 5:05:05 PM UTC+1, Aitor Soroa wrote:
> >
> > Hi Roman,
> >
> > if you provide me with a sample XML file I can try to hack a script to
> > create the graph file and dictionary as needed by ukb.
> >
> > best,
> > aitor
> >
> > On Wed, Dec 16, 2015 at 07:53:22AM -0800, pomp...@gmail.com wrote:
> > > <https://www.google.com/url?q=https%3A%2F%2Flindat.mff.cuni.cz%2Frepository%2Fxmlui%2Fhandle%2F11858%2F00-097C-0000-0001-4880-3&sa=D&sntz=1&usg=AFQjCNFMRAwKm7903ma64-8vcnLRSQYThg>),
> >
> > > but the problem is that it is stored as a single XML file with entries
> > > like:
> > >
> > > <SYNSET><ID>00005811-v</ID><POS>v</POS><SYNONYM><LITERAL>mrkat<SENSE>2</SENSE></LITERAL><LITERAL>zamrkat<SENSE>1</SENSE></LITERAL></SYNONYM><ILR>00559482-v<TYPE>hypernym</TYPE></ILR></SYNSET>
> > >
> > > or
> > >
> > > <SYNSET><ID>00014558-n</ID><POS>n</POS><SYNONYM><LITERAL>forma<SENSE>1</SENSE></LITERAL><LITERAL>tvar<SENSE>1</SENSE></LITERAL><LITERAL>podoba<SENSE>1</SENSE></LITERAL></SYNONYM><BASE>*</BASE></SYNSET>
> > >
> > > but I haven't found any script to convert it to dict/ files, which are
> > > expected by UKB scripts.
> > > Please, has anybody encountered anything similar?
> > >
> > > Best regards,
> > > Roman Sudarikov
> >
>
--
take care,
aitor