I have been looking into the issue of localizing transformation and
validation errors into the original source document...
We are transforming :
1) ODF to Metalex
2) Metalex to AN
3) Validating AN
Case 1) is where it is (I believe) relatively easy to resolve to the
source document ...
Case 2) is more complex, and ideally we shouldn't get transformation
errors at that layer.
Case 3) we can have validation errors on the final document (if we are
using a customized schema for a parliament - and we want to resolve
these errors back to source).
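For case 3, a minimal sketch of how validation errors can be collected together with their line / column numbers. It uses the JDK's built-in javax.xml.validation API as a stand-in rather than Saxon's schema-aware validator, and the class name LocatedValidation and its methods are made up for illustration. Note that the positions it reports are positions in the validated (final) document, not in the ODF source.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXParseException;

/** Hypothetical helper: validates a document and records where each problem was found. */
final class LocatedValidation {

    /** Collects validation problems instead of stopping at the first one. */
    static final class Collector implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        private void record(String level, SAXParseException e) {
            messages.add(level + " at line " + e.getLineNumber()
                    + ", column " + e.getColumnNumber() + ": " + e.getMessage());
        }

        @Override public void warning(SAXParseException e) { record("warning", e); }

        @Override public void error(SAXParseException e) { record("error", e); }

        @Override public void fatalError(SAXParseException e) throws SAXParseException {
            record("fatal", e);
            throw e; // the parser cannot continue after a fatal error
        }
    }

    /** Validates the XML text against the XSD text and returns the located messages. */
    static List<String> validate(String xsd, String xml) throws Exception {
        SchemaFactory factory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(new StreamSource(new StringReader(xsd)));
        Validator validator = schema.newValidator();
        Collector collector = new Collector();
        validator.setErrorHandler(collector);
        validator.validate(new StreamSource(new StringReader(xml)));
        return collector.messages;
    }
}
```

Each returned message starts with the severity and the line / column, which is the information we would then have to map back to the source document.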
I found various mechanisms in Saxon which will help us resolve to the
source... some are new features in the most current Saxon release...
For instance, the current release 9.1.5 has a new tracing and
diagnostics mechanism :
http://www.saxonica.com/documentation/changes/intro/trace91.html
Essentially we can browse the exception frames and resolve the error
to the line / column no.
This combined with a SourceLocator can help us pinpoint the problem
in the ODF content.xml.
The Saxon XPathException object has a source code location resolver
API ... which lets you identify the source location in both the XSL
stylesheet and the XML document being transformed.
See my attached example which I hacked together ... it iterates
through the error stack frames and gives a full trace history of the
error (note this traces to the source XSL, not to the source XML... but that can
be done by recursing through the XPathContext returned by the
TransformerException... the same mechanism also applies to XML schema
validation using Saxon...). The Saxon documentation is quite massive
so finding the right API is sometimes difficult :-)
e.g. stack trace :
error was found
applied in net.sf.saxon.trace.ContextStackFrame$ApplyTemplates@5998cb
(at style.xsl: 36)
called in net.sf.saxon.trace.ContextStackFrame$FunctionCall@3e6f83
(at style.xsl: 24)
applied in net.sf.saxon.trace.ContextStackFrame$ApplyTemplates@b0c5a
(at style.xsl: 36)
called in net.sf.saxon.trace.ContextStackFrame$FunctionCall@58046e
(at style.xsl: 18)
applied in net.sf.saxon.trace.ContextStackFrame$ApplyTemplates@8ad9a0
(at style.xsl: 36)
called in net.sf.saxon.trace.ContextStackFrame$FunctionCall@d5c653
(at style.xsl: 18)
`-> /xsl:stylesheet[1]/xsl:function[1]/xsl:choose[1]/xsl:otherwise[1]
applied in net.sf.saxon.trace.ContextStackFrame$ApplyTemplates@cfb11f
(at style.xsl: 10)
`->
`-> /xsl:stylesheet[1]/xsl:function[1]/xsl:choose[1]
`->
`-> /xsl:stylesheet[1]/xsl:function[1]
`->
`-> /xsl:stylesheet[1]
What do you think ?
I think a source location resolver in the transformer will greatly
improve its usability....
I am going to implement a semantic rule module for the editor so the
semantic checks are done at source...
Ashok
package org.bungeni.xslt.errortrace;
import java.io.File;
import java.util.Iterator;
import javax.xml.transform.ErrorListener;
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import net.sf.saxon.TransformerFactoryImpl;
import net.sf.saxon.expr.StackFrame;
import net.sf.saxon.expr.XPathContext;
import net.sf.saxon.om.Axis;
import net.sf.saxon.om.AxisIterator;
import net.sf.saxon.om.Item;
import net.sf.saxon.om.NodeInfo;
import net.sf.saxon.pattern.NameTest;
import net.sf.saxon.pattern.NodeKindTest;
import net.sf.saxon.trace.ContextStackFrame;
import net.sf.saxon.trace.ContextStackFrame.ApplyTemplates;
import net.sf.saxon.trace.ContextStackFrame.CallTemplate;
import net.sf.saxon.trace.ContextStackFrame.FunctionCall;
import net.sf.saxon.trans.XPathException;
import net.sf.saxon.type.Type;
/**
*
* @author ashok
*/
public class Main {
public static void main(String[] args) throws TransformerException
{
try {
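// Instantiate Saxon directly: the static newInstance() is inherited from
// TransformerFactory and returns whichever JAXP implementation is configured.
TransformerFactory factory = new TransformerFactoryImpl();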
Source filestyle = new StreamSource(new File("src/org/bungeni/xslt/errortrace/style.xsl"));
Transformer trans = factory.newTransformer(filestyle);
trans.setErrorListener( new TrapErrorListener());
trans.transform(filestyle, new StreamResult(System.out));
System.out.println("no error");
}
catch (XPathException ex) {
System.out.println("error was found ");
XPathContext ctx = ex.getXPathContext();
Iterator ir = ctx.iterateStackFrames();
while (ir.hasNext()) {
String xpath = "";
int nfLineNo = 0;
int nfColNo = 0;
Object csf = ir.next();
ContextStackFrame mcsf = (ContextStackFrame)csf;
Item ctxItem = mcsf.getContextItem();
if (ctxItem instanceof NodeInfo) {
xpath = makePathTo((NodeInfo)ctxItem);
}
// Dispatch on the frame type; instanceof replaces the class-name comparisons
if (csf instanceof ApplyTemplates) {
processStackFrame((ApplyTemplates) csf, xpath);
} else if (csf instanceof FunctionCall) {
processStackFrame((FunctionCall) csf, xpath);
} else if (csf instanceof CallTemplate) {
processStackFrame((CallTemplate) csf, xpath);
}
}
}
}
public static String callerStart = "";
public static void processStackFrame(ApplyTemplates frame, String xpath) {
int line = frame.getLineNumber();
String callerHead = " applied ";
String pos = parseFile(frame.getSystemId()) + ": " + line ;
System.out.println(callerHead + "in " + frame + " (at " + pos + ")");
if ( xpath != null ) {
System.err.println(" `-> " + xpath);
}
}
private static String parseFile(String fileFull)
{
int idx = fileFull.lastIndexOf('/');
if ( idx < 0 ) {
return fileFull;
}
else {
return fileFull.substring(idx + 1);
}
}
public static void processStackFrame(FunctionCall frame, String xpath ) {
int line = frame.getLineNumber();
String callerHead = " called ";
String pos = parseFile(frame.getSystemId()) + ": " + line ;
System.out.println(callerHead + "in " + frame + " (at " + pos + ")");
if ( xpath != null ) {
System.err.println(" `-> " + xpath);
}
}
public static void processStackFrame(CallTemplate frame, String xpath) {
int line = frame.getLineNumber();
String callerHead = " called ";
String pos = parseFile(frame.getSystemId()) + ": " + line ;
System.out.println(callerHead + "in " + frame + " (at " + pos + ")");
if ( xpath != null ) {
System.err.println(" `-> " + xpath);
}
}
/**
* Suppresses saxon's own error messages
*/
private static class TrapErrorListener implements ErrorListener
{
public void error(TransformerException ex)
throws TransformerException {
}
public void fatalError(TransformerException ex)
throws TransformerException {
}
public void warning(TransformerException ex)
throws TransformerException {
}
}
private static String makePathTo(NodeInfo node)
{
if ( node == null ) {
return null;
}
String path = null;
switch ( node.getNodeKind() ) {
case Type.DOCUMENT: {
return "/";
}
case Type.ELEMENT: {
String name = node.getNamePool().getDisplayName(node.getNameCode());
// The positional predicate counts same-named siblings, so use the preceding-sibling axis
AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, new NameTest(node));
int pos = 1;
while ( ai.moveNext() ) {
++ pos;
}
path = name + "[" + pos + "]";
break;
}
case Type.ATTRIBUTE: {
String name = node.getNamePool().getDisplayName(node.getNameCode());
path = "@" + name;
break;
}
case Type.TEXT: {
AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, NodeKindTest.TEXT);
int pos = 1;
while ( ai.moveNext() ) {
++ pos;
}
path = "text()[" + pos + "]";
break;
}
case Type.COMMENT: {
AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, NodeKindTest.COMMENT);
int pos = 1;
while ( ai.moveNext() ) {
++ pos;
}
path = "comment()[" + pos + "]";
break;
}
case Type.PROCESSING_INSTRUCTION: {
String name = node.getNamePool().getDisplayName(node.getNameCode());
AxisIterator ai = node.iterateAxis(Axis.PRECEDING_SIBLING, new NameTest(node));
int pos = 1;
while ( ai.moveNext() ) {
++ pos;
}
path = "processing-instruction(" + name + ")[" + pos + "]";
break;
}
case Type.NAMESPACE: {
int name_code = node.getNameCode();
String name = name_code < 0 ? "" : node.getNamePool().getDisplayName(name_code);
AxisIterator ai = node.iterateAxis(Axis.PRECEDING, new NameTest(node));
int pos = 1;
while ( ai.moveNext() ) {
++ pos;
}
path = "namespace(" + name + ")[" + pos + "]";
break;
}
default: {
throw new RuntimeException("unexpected node kind: " + node.getNodeKind());
}
}
String parent = makePathTo(node.getParent());
if ( parent == null ) {
return path;
}
else if ( "/".equals(parent) ) {
return "/" + path;
}
else {
return parent + "/" + path;
}
}
}
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:bungeni="http://www.bungeni.org"
version="2.0">
<xsl:template match="/">
<html>
<head/>
<body>
<xsl:apply-templates select="*" mode="y"/>
</body>
</html>
</xsl:template>
<xsl:template match="*" name="nana" mode="y">
<b>
<xsl:value-of select="name(.)"/>
<xsl:sequence select="bungeni:fun(*)"/>
</b>
</xsl:template>
<xsl:template match="xsl:function/xsl:choose | aa//bb" mode="y">
<b>
<xsl:sequence select="bungeni:fun(*)"/>
</b>
</xsl:template>
<xsl:function name="bungeni:fun">
<xsl:param name="n" as="node()*"/>
<xsl:choose>
<xsl:when test="$n[0][name() = 'inexistent']">
<xsl:apply-templates mode="y" select="
$n[position() mod 2 eq 0]"/>
</xsl:when>
<xsl:otherwise>
<xsl:apply-templates mode="y" select="$n"/>
</xsl:otherwise>
</xsl:choose>
</xsl:function>
<xsl:template match="xsl:otherwise" mode="y">
<xsl:if test="@test">
<xsl:sequence select="' '"/>
</xsl:if>
<dummy>
<xsl:sequence select="
error(xs:QName('bungeni:ERR007'), 'My error message')"/>
</dummy>
</xsl:template>
</xsl:stylesheet>
Can you update me on the progress made with the error handler using
the Saxon API ... ?
Also can you please comment on the issues with the judgement XML markup ?
thanks
Ashok
I'm working on the error handler. Using Saxon I have finally found a way to show the line and the element name in which an error occurs. Now I have to understand how to map this line and name to the ones in the ODT document. The source locator, unfortunately, tells me only the line of the result AKN document in which a validation error occurs. And also remember that there are many steps to create the final document.
So in my opinion we have to use Saxon only to get the line number and element name in the final document. To pinpoint the position in the initial document I have to create a mechanism myself.
In my opinion I have to read the initial document and create a temp file in which I store the position of the elements. Then, step by step, I have to register the new position of each element. At the end, I can take the position (the line number and the element name) of the error and navigate up the chain of steps in order to get the initial position.
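The chain described above could be sketched roughly as follows. This is only an in-memory illustration of the idea (the proposal mentions a temp file), and the names PositionChain, Position, beginStep, record and resolve are all hypothetical. It assumes each transformation step can report, for every element it writes, which input element it came from.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

/** Hypothetical sketch: remembers, per step, where each output element came from. */
final class PositionChain {

    /** An element position: line number plus element name. */
    static final class Position {
        final int line;
        final String element;

        Position(int line, String element) {
            this.line = line;
            this.element = element;
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Position)) return false;
            Position p = (Position) o;
            return line == p.line && element.equals(p.element);
        }

        @Override public int hashCode() { return Objects.hash(line, element); }

        @Override public String toString() { return element + " (line " + line + ")"; }
    }

    // steps.get(i) maps a position in the output of step i to its position in the input of step i
    private final List<Map<Position, Position>> steps = new ArrayList<>();

    /** Starts a new transformation step and returns its index. */
    int beginStep() {
        steps.add(new HashMap<>());
        return steps.size() - 1;
    }

    /** Registers that the element at 'input' ended up at 'output' in the given step. */
    void record(int step, Position input, Position output) {
        steps.get(step).put(output, input);
    }

    /** Walks from the final document back to the initial one; null if the chain is broken. */
    Position resolve(Position finalPosition) {
        Position current = finalPosition;
        for (int i = steps.size() - 1; i >= 0 && current != null; i--) {
            current = steps.get(i).get(current);
        }
        return current;
    }
}
```

With one step per transformation (ODF to Metalex, Metalex to AN), resolve() takes the position reported for the final document and returns the matching position in the initial document, or null when some step did not register the element.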
Dear Ashok,
A question. Is it simple for you to send me a file containing the starting line of each section of the ODF document?
- In addition to the ODF file, do you want me to pass in an additional file containing the list of sections and just the first line from each section ?
- If you want me to pass in such a file ... do you want it in a particular format ?
It's not difficult for me to give you this information in an additional file if that is what you want.
if you have a structure like this :
-section1
---heading text in section 1
------section1.1
---------section1.1.1
------------heading text in section 3
---------section1.1.2
------------heading text in section 3
In the above example the 'text' for section 1.1 is actually the text
of section1.1.1, i.e. section1.1 does not have any direct text of its
own ... since it's merely a container for other sections ...
>
> No, it is not a problem. This is because the only thing that we can do in the
> error handling is to point to a section with an id.
> When the error comes in something that has no id, I have to get the first
> named ancestor and point out that the error is in that section.
> This is the only thing that we can do.
> Another problem is how to show an issue in a section that the user will
> never see ... for example the metadata section ... but I'll think about this
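The "first named ancestor" lookup from the quoted text can be sketched like this. It is a sketch under assumptions: it works on a standard W3C DOM tree, it assumes the section identifier lives in an attribute literally named "id", and the class name NamedAncestor is made up.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

/** Hypothetical helper: finds the closest enclosing element that carries an id. */
final class NamedAncestor {

    /**
     * Returns the id of the node itself or of its closest ancestor that has a
     * non-empty "id" attribute, or null when no such element exists.
     */
    static String nearestId(Node node) {
        for (Node n = node; n != null; n = n.getParentNode()) {
            if (n.getNodeType() == Node.ELEMENT_NODE) {
                // getAttribute returns "" when the attribute is absent
                String id = ((Element) n).getAttribute("id");
                if (!id.isEmpty()) {
                    return id;
                }
            }
        }
        return null;
    }

    /** Parses an XML string into a DOM document (convenience for trying the helper out). */
    static Document parse(String xml) throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
    }
}
```

When nearestId() returns null the error sits outside any identified section, which is the situation described for the non-visible metadata.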
The metadata is anyway always 'invisible' on the document ... the only
'visible' aspect is a reference mark to the metadata.
if the problem is in the reference to the metadata then the user can
be pointed to the erroneous reference mark.
if the problem is in the non-visible metadata simply returning an
error will suffice e.g. metadata 'X' is missing / invalid...
Yeah, I know that saxon-sa (commercial) is schema aware while saxon-b
(open source) is not ... however both of them provide for
XSLT stack tracing - just that saxon-sa provides an inbuilt mechanism
and APIs to interpret the XSLT stack trace back to source.
I am not sure how Xerces fits into the picture here since it can be
used only for output validation (once the transformation has already
occurred) - for use as an output validator -- fine, I agree.
What about errors that occur *during* translation ... whereby no
output document is produced ? I don't understand how Xerces fits into
this particular case since the error / XSLT error stack is produced by
Saxon ?
Okay .. if that is the case then fine -- agreed.
Ashok
Something occurred to me regarding your approach ...
Since it *always* outputs a result AN xml even if it encounters errors
-- isn't it possible that the result AN xml may appear valid when it
actually isn't ?
Okay, let me try and explain with a use case; perhaps such a scenario is
not valid anymore...
- We have a ODF document with a clause .
- According to the schema in the use case the clause is supposed to
start with a heading
- The document is passed to the translator without a marked up heading
- The translator transforms the clause into AN xml , but the clause
does not have a heading
At this point I recall from an earlier thread that you were generating
dummy headings in some cases during the translation process ...
if you were putting dummy headings then during validation the missing
heading will never be detected ...?
Alright ... that's clear now. I think we are good on using your
suggested mechanism.
Ashok