I have come across a situation where the CustomReport I requested
contains XML that I am unable to parse with a Java SAXParser, and tidy
fails with errors as well. The reports I am having problems with are
quite large - I'm working on narrowing down exactly what types of
reports fail, but in the meantime I'm wondering if this is a known
issue? Has anyone else run into this problem? Partial Java stack
traces and tidy error output are below.
Thanks in advance to anyone who can shed light on the problem.
jessica
org.xml.sax.SAXParseException: An invalid XML character (Unicode: 0x0) was found in the value of attribute "kwDestUrl" and element is "row".
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1438)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.scanAttributeValue(XMLScanner.java:969)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanAttribute(XMLDocumentFragmentScannerImpl.java:1033)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanStartElement(XMLDocumentFragmentScannerImpl.java:851)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(XMLDocumentFragmentScannerImpl.java:1693)
at com.sun.org.apache.xerces.internal.impl.XMLDocumentFragmentScannerImpl.scanDocument(XMLDocumentFragmentScannerImpl.java:368)
at com.sun.org.apache.xerces.internal.parsers.XML11Configuration.parse(XML11Configuration.java:834)
...
com.imi.util.exception.BadFileFormatException: Parsing error
at com.imi.util.parser.GoogleXMLParser.parse(GoogleXMLParser.java:328)
at com.imi.batches.help.AdwordsReportFilter.processTriggerAPI(AdwordsReportFilter.java:327)
at com.imi.batches.help.AdwordsReportFilter.process(AdwordsReportFilter.java:133)
at com.imi.batches.ProcessMsgThread.run(ProcessEmailBatch.java:203)
Caused by: org.xml.sax.SAXParseException: XML document structures must start and end within the same entity.
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.createSAXParseException(ErrorHandlerWrapper.java:236)
at com.sun.org.apache.xerces.internal.util.ErrorHandlerWrapper.fatalError(ErrorHandlerWrapper.java:215)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:386)
at com.sun.org.apache.xerces.internal.impl.XMLErrorReporter.reportError(XMLErrorReporter.java:316)
at com.sun.org.apache.xerces.internal.impl.XMLScanner.reportFatalError(XMLScanner.java:1438)
....
line 3 column 261785 - Warning: <row> attribute with missing trailing quote mark
line 3 column 456499 - Error: unexpected </ro> in <row>
line 3 column 476226 - Error: unexpected </row> in <rowyword>
line 3 column 487495 - Error: unexpected </row> in <rowesort->
line 3 column 689950 - Warning: <row> attribute "ca/row" lacks value
line 3 column 740127 - Warning: <row> attribute name "mmunity"campaign" (value="Boca Raton") is invalid
System.Exception: Error processing report: Root element is missing. ---> System.Xml.XmlException: Root element is missing.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.ThrowWithoutLineInfo(String res)
at System.Xml.XmlTextReaderImpl.ParseDocumentContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlReader.MoveToContent()
Happened Oct 11th 2007, 09:00h CET.
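The Xerces error earlier in the thread reports a 0x0 character inside an attribute value, which XML 1.0 never allows. As a rough diagnostic, one could scan the downloaded file for the first character outside the XML 1.0 `Char` production before handing it to a parser. The sketch below assumes the report is UTF-8 and has already been saved locally; the class name `XmlCharScan` and its methods are hypothetical helpers, not part of any AdWords client library.

```java
import java.io.IOException;
import java.io.Reader;

final class XmlCharScan {
    /** True if the code point is allowed by the XML 1.0 Char production. */
    static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Returns the char offset of the first illegal character, or -1 if none is found. */
    static long firstInvalidOffset(Reader in) throws IOException {
        long offset = 0;
        int c;
        while ((c = in.read()) != -1) {
            int cp = c;
            int width = 1;
            if (Character.isHighSurrogate((char) c)) {
                int low = in.read();
                if (low == -1 || !Character.isLowSurrogate((char) low)) {
                    return offset; // unpaired high surrogate
                }
                cp = Character.toCodePoint((char) c, (char) low);
                width = 2;
            }
            if (!isXmlChar(cp)) {
                return offset; // control character, NUL, or lone low surrogate
            }
            offset += width;
        }
        return -1;
    }
}
```

For a file of a few hundred megabytes, wrap the source in a `BufferedReader` over an `InputStreamReader` with an explicit UTF-8 charset. The offset that comes back can then be compared with the column numbers tidy prints to see whether the corruption starts at the same place on every download.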
I ran across this in the API v10 release notes: "Very large reports
that include zero impression rows may fail with error 112. If your
report fails, try breaking it down into smaller ones." I am running
very large reports with zero impressions, although I have seen no
specific mention of error 112.
I guess I will try getting the zipped report and/or breaking my
reports into smaller parts, then let you all know how it goes.
jessica
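One way to break a large report into smaller ones is to split its date range into shorter windows and schedule one report job per window. The sketch below only shows the date arithmetic; the class `DateRangeSplitter` is a hypothetical helper, and splitting by date (rather than by campaign or another dimension) is an assumption, not something the release notes prescribe.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class DateRangeSplitter {
    /** Splits [start, end] (both inclusive) into consecutive windows of at most maxDays days. */
    static List<LocalDate[]> split(LocalDate start, LocalDate end, int maxDays) {
        if (maxDays < 1) {
            throw new IllegalArgumentException("maxDays must be >= 1");
        }
        List<LocalDate[]> windows = new ArrayList<>();
        LocalDate from = start;
        while (!from.isAfter(end)) {
            LocalDate to = from.plusDays(maxDays - 1);
            if (to.isAfter(end)) {
                to = end; // clamp the last window to the requested end date
            }
            windows.add(new LocalDate[] {from, to});
            from = to.plusDays(1);
        }
        return windows;
    }
}
```

Each `{from, to}` pair would then become the start and end day of one scheduled report job.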
System.Xml.XmlException: There is an unclosed literal string. Line 3, position 82125311.
at System.Xml.XmlTextReaderImpl.Throw(Exception e)
at System.Xml.XmlTextReaderImpl.Throw(String res, String arg)
at System.Xml.XmlTextReaderImpl.ParseAttributeValueSlow(Int32 curPos, Char quoteChar, NodeData attr)
at System.Xml.XmlTextReaderImpl.ParseAttributes()
at System.Xml.XmlTextReaderImpl.ParseElement()
at System.Xml.XmlTextReaderImpl.ParseElementContent()
at System.Xml.XmlTextReaderImpl.Read()
at System.Xml.XmlReader.SkipSubtree()
at System.Xml.XmlReader.ReadToNextSibling(String name)
The report was scheduled with ScheduleReportJob() and retrieved with
getGzipReportDownloadUrl().
zdvsoftware, not sure if you are running reports larger than mine? The
largest report I ran was around 200MB unzipped.
jessica
BTW: I *think* I read somewhere that the URL returned by
getReportDownloadUrl()/getGzipReportDownloadUrl() is only valid for
five minutes, in which case poor performance of the server behind
that URL could be an explanation.
If you're getting truncated reports and you're not requesting them
gzipped, please try using compression. zdvsoftware is correct that the
reports machines can be picky about serving files > ~250MB.
If you continue to get inoperable gzip files or truncated XML, please
post the report job ID in this thread so we can look into the problem.
Thanks,
-Aaron Karp
AdWords API Team
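For anyone following the gzip suggestion from Java, here is a minimal sketch of parsing the compressed report as a stream, so the 200MB of XML never has to be unzipped to disk first. It assumes you already have an `InputStream` over the bytes behind the URL from getGzipReportDownloadUrl(). The class `GzipReportReader` is a hypothetical helper, and counting `row` elements is just a stand-in for real processing.

```java
import java.io.BufferedInputStream;
import java.io.InputStream;
import java.util.zip.GZIPInputStream;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

final class GzipReportReader {
    /** Counts row elements in a gzipped XML stream, parsing while it decompresses. */
    static long countRows(InputStream gzipped) throws Exception {
        final long[] rows = {0};
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                if ("row".equals(qName)) {
                    rows[0]++;
                }
            }
        };
        // GZIPInputStream inflates on demand, so memory use stays bounded.
        try (InputStream in = new GZIPInputStream(new BufferedInputStream(gzipped))) {
            SAXParserFactory.newInstance().newSAXParser().parse(in, handler);
        }
        return rows[0];
    }
}
```

A side benefit is that a download cut off partway should fail loudly: the parse ends in an exception (from the gzip layer or from the parser) rather than returning a short row count, which makes a truncated transfer easier to tell apart from a report that is malformed at the source.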
On Oct 12, 5:25 pm, "zdvsoftw...@hotmail.com"