    <script language="javascript">
    function ordnum(i) {
        var s = "" + i;
        if (i > 10 && i < 20) {
            s += "th";
        } else if (s.search(/1$/) >= 0) {
            s += "st";
        } else if (s.search(/2$/) >= 0) {
            s += "nd";
        } else if (s.search(/3$/) >= 0) {
            s += "rd";
        } else {
            s += "th";
        }
        document.write("The " + s);
    }
    </script>

The problem is the first "if" (i > 10 && i < 20). The parser explodes because the "<" does not start an element and is not escaped. In the good old days the content of <script> would be wrapped in <![CDATA[ ... ]]>, but it looks like in HTML5 that is not required anymore (there is a note here: http://wiki.whatwg.org/wiki/HTML_vs._XHTML#Element-specific_parsing).

Anyway, my attempts to resolve this:

- I could read the file as a string and "strip" (maybe via regexp) the content of <script> </script>, but I refuse to do that :)
- I have tried the go.net/html package, which actually does parse the file just fine, but then I don't know what to do with the result (the xpath package expects an xml.Decoder). And I really don't want to scan the tree generated by the HTML parser (though in theory I could write a modified xpath package that uses go.net/html).
- I would love to write a "bridge" that scans the HTML tree and generates XML events, but the Decoder type doesn't allow me to do that (I don't want to say it, but I miss Java and SAX :) Is there a way that I am missing to implement my own Decoder.Token() or Decoder.RawToken() and pass it to xmlpath?
- Actually, all I wanted was a way to "restart" xml.Decoder after an error. I didn't try, but I suspect that adding a Decoder.ClearError() that sets Decoder.err = nil would probably do the trick. Then, after an error like the one I am getting, I would clear it, skip to the end of the last open element (the script) and continue as if nothing had happened.

So, my questions:

- Would this be a useful feature worth considering?
- Is my current problem a "bug" of non-strict mode? (i.e. should the non-strict mode treat the content of script, and possibly the other tags described in the previous link, as CDATA, or at least ignore parsing errors inside those tags?)
- Is there any plan to extend encoding/xml (and maybe encoding/json) so that it will be possible to implement a "decoder" that generates tokens "out of thin air"? (Again, I still miss the possibilities of a SAX parser :)
- Did I miss any obvious way to fix my specific problem?

Thanks!
-- Raffaele
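
For anyone who wants to reproduce the failure described above, here is a minimal sketch. It assumes the page is read with encoding/xml in non-strict mode using the HTML helpers the package provides (Strict = false, HTMLAutoClose, HTMLEntity); the helper name decodeAll and the two sample documents are made up for illustration. It only shows that the bare "<" inside <script> stops the decoder, while the same script wrapped in a CDATA section is read through to the end. It is not a fix.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"io"
	"strings"
)

// decodeAll (hypothetical helper) reads every token from src with a
// non-strict decoder. It returns the number of tokens read and the
// first error other than io.EOF, or nil if the input was fully read.
func decodeAll(src string) (int, error) {
	d := xml.NewDecoder(strings.NewReader(src))
	d.Strict = false
	d.AutoClose = xml.HTMLAutoClose
	d.Entity = xml.HTMLEntity

	n := 0
	for {
		_, err := d.Token()
		if err == io.EOF {
			return n, nil
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

func main() {
	// Script body as it appears in an HTML5 page: "<" is not escaped.
	bare := `<html><script>if (i > 10 && i < 20) { s += "th"; }</script></html>`
	// Same script wrapped in a CDATA section, XHTML style.
	wrapped := `<html><script><![CDATA[if (i > 10 && i < 20) { s += "th"; }]]></script></html>`

	_, err := decodeAll(bare)
	fmt.Println("bare script: ", err)

	_, err = decodeAll(wrapped)
	fmt.Println("CDATA script:", err)
}
```

Once Token() has returned a syntax error, the same Decoder keeps returning that error on later calls, which is why the post asks for a way to clear it.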
--
You received this message because you are subscribed to the Google Groups "golang-nuts" group.
... To mangle Tolstoy, happy XML documents are all alike, every
unhappy XML document is unhappy in its own way.
and if it would be useful to have a parser that can recover
from errors