Tomaž Erjavec
May 28, 2020, 3:13:45 AM
to NoSketch Engine
Hi,
SkE URLs have the very useful "format" parameter, and I've just written
a small XSLT script, where you give it a list of CQL query / corpus name
pairs, and the XSLT constructs the query URLs with &format=xml,
retrieves the concordances via the document() function, slightly
polishes the results and returns an XML document with all the hits. So
far, so good.
However, if any query returns 0 results, the returned XML ends with a
"spam" comment which contains an HTML page with a stack trace, as if an
error occurred, as below.
The problem is that the <body> lines in the comment end with "-->",
i.e. the XML comment in fact ends at the first of them, and the rest is
taken as part of the XML, resulting in ill-formed XML, which cannot be
parsed and is therefore unusable.
It is not a big issue, but it would simplify matters for those using the
XML output if the complete spam comment at the end were removed, or at
least if all the "-->" were removed from it.
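
In the meantime, a possible client-side workaround is to cut the response
at the start of the spam comment and re-close the open elements before
parsing. The sketch below is in Python rather than XSLT, since the
response has to be repaired as text before any XML tool can read it. It
assumes that the comment always starts with the literal "<!--: spam", as
in the example, and that only <concordance> and <export> are still open
at that point. The function name strip_spam_comment is made up for
illustration, and the sample string is a shortened version of the output
shown below.

```python
import xml.etree.ElementTree as ET

# Start of the trailing comment, as it appears in the example output.
SPAM_MARKER = "<!--: spam"


def strip_spam_comment(xml_text: str) -> str:
    """Cut the response at the spam comment and re-close the open elements."""
    cut = xml_text.find(SPAM_MARKER)
    if cut == -1:
        # No spam comment found: leave the response untouched.
        return xml_text
    # Assumption: only <concordance> and <export> are open at this point.
    return xml_text[:cut] + "</concordance>\n</export>\n"


# Shortened version of a 0-hit response, including the premature "-->".
sample = """<?xml version='1.0' encoding='UTF-8' ?>
<export>
<header>
<corpus>imp</corpus>
</header>
<concordance>
<!--: spam
Content-Type: text/html
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> -->
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> --> -->
</font> </font>
"""

cleaned = strip_spam_comment(sample)
# Parse from bytes so the encoding declaration is honoured.
root = ET.fromstring(cleaned.encode("utf-8"))
```

After this, root is a well-formed <export> element with an empty
<concordance>, which can be treated as a normal 0-hit result. A response
without the spam comment passes through unchanged.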
Best,
Tomaž
Example XML output with 0 hits:
<?xml version='1.0' encoding='UTF-8' ?>
<export>
<header>
<corpus>imp</corpus>
<subcorpus>-</subcorpus>
<query>
<subquery operation="Query" size="0">[lc="tvit" |
lemma_lc="tvit"]</subquery>
</query>
</header>
<concordance>
<!--: spam
Content-Type: text/html
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> -->
<body bgcolor="#f0f0f8"><font color="#f0f0f8" size="-5"> --> -->
</font> </font> </font> </script> </object> </blockquote> </pre>
</table> </table> </table> </table> </table> </font> </font>
</font><body bgcolor="#f0f0f8">
<table width="100%" cellspacing=0 cellpadding=2 border=0 summary="heading">
<tr bgcolor="#6622aa">
<td valign=bottom> <br>
<font color="#ffffff" face="helvetica,
arial"> <br><big><big><strong><type
'exceptions.KeyError'></strong></big></big></font></td><td
align=right valign=bottom><font color="#ffffff" face="helvetica,
arial">Python 2.7.17: /usr/bin/python<br>Wed May 27 17:46:56
2020</font></td></tr></table>
<p>A problem occurred in a Python script. Here is the sequence of
function calls leading up to the error, in the order they occurred.</p>
...