Problem with reading valid XML files with large amounts of data


nun...@itn.pt

Nov 2, 2010, 5:01:31 AM
to FoX-discuss
Hello,

I write a code called NDF, for analysis of Ion Beam Analysis data;
last year the authors of the other two main codes in the field and I
agreed on a universal XML-based interchange format. I wrote a schema
for it, http://idf.schemas.itn.pt/, and I need to implement it; one of
the other guys already did, but he writes C.

I use Intel® Visual Fortran Compiler Professional Edition 11.1
revision 11.1.067, on Windows Vista. I compiled and linked FoX (based
on a project file sent to me by Andrew Walker - thanks) together with
my own code. I was unable to link to the .lib produced by the stand-
alone FoX project, but once I included all the source files in my own
project it worked fine.

I put the examples from "Practical 5: Reading XML data into a
scientific application" as the first lines of my code, and they worked
correctly. However, when trying to read my own files I run into
trouble!

The input files I mention here are given at the bottom.

1 - FoX stops working without error message on some of my XML files
(well-formed and validated), for instance IAEA25_NDF_1.xml, which is
exactly the same as IAEA25_NDF.xml but saved with XMLSpy or Notepad.

The reason for this seems to be that I have a tag where I put one
data spectrum, e.g.
<x>1 2 3 4 ... </x>
When there are too many data entries, the parser crashes (without
error message).
I don't put all the data in one line; there is a new line after every
16 numbers.
I have now checked, and it seems that the crash occurs when the total
number of characters is close to 1024. I understand the Fortran 1024 limit
per line, but my individual lines are much below that. It seems FoX is
not recognising the end of lines?

The code stops working without an error message from FoX. When running
under the debugger, the stack is:

> ndf.exe!M_DOM_PARSE::PARSEFILE(TYPE(NODE) PARSEFILE={...}, CHARACTER(16) FILENAME='IAEA25_NDF_1.xml', TYPE(DOMCONFIGURATION) CONFIGURATION= Undefined address, INTEGER(4) IOSTAT= Undefined pointer/array, TYPE(DOMEXCEPTION) EX= Undefined pointer/array, .tmp..T3024__V$1f5d=) Line 534 Fortran
ndf.exe!NTS() Line 68 + 0x2c bytes Fortran
ndf.exe!_main() + 0x63 bytes
ndf.exe!__tmainCRTStartup() Line 266 + 0x19 bytes C
ndf.exe!mainCRTStartup() Line 182 C
kernel32.dll!7580e4a5()
[Frames below may be incorrect and/or missing, no symbols loaded for
kernel32.dll]
ntdll.dll!7714cfed()
ntdll.dll!7714d1ff()

And the locals are:

+ PARSEFILE {...} TYPE(NODE)
FILENAME 'IAEA25_NDF_1.xml' CHARACTER(16)
+ CONFIGURATION Undefined address TYPE(DOMCONFIGURATION)
IOSTAT Undefined pointer/array INTEGER(4)
+ EX Undefined pointer/array TYPE(DOMEXCEPTION)
IOSTAT_ 0 INTEGER(4)
+ EX_ {...} TYPE(DOMEXCEPTION)
+ FXML {...} TYPE(M_DOM_PARSE_mp_XML_T)
+ MAINDOC Undefined pointer/array TYPE(M_DOM_PARSE_mp_NODE)


The file IAEA25_NDF_2.xml, which is the same but with fewer lines in
the <x> and <y> tags, is read correctly.

2 - Finally, I thought full paths were accepted? I.e.

doc => parseFile("IAEA25_NDF_2.xml")
doc => parseFile("D:\NDF\Nuno\IBA_IDF\NDF\IAEA25_NDF_2.xml")

should both work? The second alternative leads to

forrtl: severe (408): fort: (2): Subscript #1 of the array F has value 1 which is greater than the upper bound of -1
Image     PC        Routine            Line     Source
ndf.exe   00CEA82A  Unknown            Unknown  Unknown
ndf.exe   00C179B0  Unknown            Unknown  Unknown
ndf.exe   00C17FD1  Unknown            Unknown  Unknown
ndf.exe   0055283A  _M_SAX_PARSER_mp_  93       m_sax_parser.F90
ndf.exe   0057C66F  _M_SAX_OPERATE_mp  43       m_sax_operate.F90
ndf.exe   0057F8DF  _M_DOM_PARSE_mp_P  534      m_dom_parse.f90

And when running under the debugger the stack is

> ndf.exe!M_SAX_PARSER::SAX_PARSER_INIT(TYPE(SAX_PARSER_T) FX={...}, TYPE(FILE_BUFFER_T) FB={...}) Line 93 + 0xb8 bytes Fortran
ndf.exe!M_SAX_OPERATE::OPEN_XML_FILE(TYPE(XML_T) XT={...}, CHARACTER(40) FILE='D:\NDF\Nuno\IBA_IDF\NDF\IAEA25_NDF_2.xml', INTEGER(4) IOSTAT=0, INTEGER(4) LUN= Undefined pointer/array, .tmp..T2159__V$18c2=) Line 43 + 0x1f bytes Fortran
ndf.exe!M_DOM_PARSE::PARSEFILE(TYPE(NODE) PARSEFILE={...}, CHARACTER(40) FILENAME='D:\NDF\Nuno\IBA_IDF\NDF\IAEA25_NDF_2.xml', TYPE(DOMCONFIGURATION) CONFIGURATION= Undefined address, INTEGER(4) IOSTAT= Undefined pointer/array, TYPE(DOMEXCEPTION) EX= Undefined pointer/array, .tmp..T3024__V$1f5d=) Line 534 + 0x29 bytes Fortran
ndf.exe!NTS() Line 68 + 0x2c bytes Fortran
ndf.exe!_main() + 0x63 bytes
ndf.exe!__tmainCRTStartup() Line 266 + 0x19 bytes C
ndf.exe!mainCRTStartup() Line 182 C
kernel32.dll!7580e4a5()
[Frames below may be incorrect and/or missing, no symbols loaded for
kernel32.dll]
ntdll.dll!7714cfed()
ntdll.dll!7714d1ff()

And the debugger generates an automatic breakpoint in m_sax_parser.F90
line 93:

! FIXME do we copy correctly from fx%nlist to fx%xds%nlist?
allocate(fx%xds)
call init_xml_doc_state(fx%xds)
deallocate(fx%xds%inputEncoding)
fx%xds%inputEncoding => vs_str_alloc("us-ascii")
! because it always is ...
> if (fb%f(1)%lun>0) then
fx%xds%documentURI => vs_vs_alloc(fb%f(1)%filename)
else
fx%xds%documentURI => vs_str_alloc("")
endif

I uploaded the files IAEA25_NDF_1.xml and IAEA25_NDF_2.xml to the
page of my code NDF, in one zipped file, IAEA_FoX_problem_01.zip.
(The file IAEA25_NDF.xml, also there, was generated with XMLCopyEditor
and has some UTF-8 characters at the beginning; it is not needed
for the problems I describe here.)

It is at the bottom of the NDF page:
http://www.itn.pt/facilities/lfi/ndf/uk_lfi_ndf.htm

Can somebody help, please? Thanks!

Nuno Barradas
Instituto Tecnológico e Nuclear
E.N. 10
2686-953 Sacavém
nun...@itn.pt
+351 219946150

Andrew Walker

Nov 3, 2010, 11:00:30 AM
to fox-d...@googlegroups.com
Hi,

Regarding the crash - do you see this whenever the DOM parseFile function is called with the larger document or only when it's used inside your code? I've just run IAEA25_NDF_1.xml and IAEA25_NDF_2.xml through dom_canonicalize.ns.yes from the examples directory and, on my system, this runs without a problem (this just calls parseFile to build a DOM then writes it out again in canonical form). I'm using the gfortran (4.4.5) compiler on a linux system so I suspect the first thing to check is if this is a compiler problem. Does running the parser in a small stand alone bit of code work for you with both documents? In the meantime I'll try and get access to a system with an up-to-date intel compiler and see if I can reproduce the crash.

It's worth noting that if this does turn out to be an Ifort 11 issue, then there has been major progress with the compiler. The last update I had was from Noam Bernstein (see http://groups.google.co.uk/group/fox-discuss/browse_thread/thread/15351ce51a3d9f12 ) in December last year indicating that the DOM module could not be compiled due to various internal compiler errors.

Cheers,

Andrew

> --
> You received this message because you are subscribed to the Google Groups "FoX-discuss" group.
> To post to this group, send email to fox-d...@googlegroups.com.
> To unsubscribe from this group, send email to fox-discuss...@googlegroups.com.
> For more options, visit this group at http://groups.google.com/group/fox-discuss?hl=en.
>

--

Andrew Walker <andrew...@bris.ac.uk>

Department of Earth Sciences,
University of Bristol,
Wills Memorial Building,
Queen’s Road,
Bristol, BS8 1RJ, UK


Noam Bernstein

Nov 3, 2010, 11:06:46 AM
to fox-d...@googlegroups.com

On Nov 3, 2010, at 11:00 AM, Andrew Walker wrote:

>
> It's worth noting that if this does turn out to be an Ifort 11 issue, then there has been major progress with the compiler. The last update I had was from Noam Bernstein (see http://groups.google.co.uk/group/fox-discuss/browse_thread/thread/15351ce51a3d9f12 ) in December last year indicating that the DOM module could not be compiled due to various internal compiler errors.

Yup - and it's at least mostly working now (we haven't noticed any problems,
but we haven't done systematic tests), as of 11.1.072/11.1.073 at least.

Noam

nun...@itn.pt

Nov 3, 2010, 11:17:17 AM
to FoX-discuss
Hi,

Thanks for the answers! However, I don't believe it is a compiler
problem. I did have the problem you mention (I couldn't compile
properly), but it went away when I switched to 067. That revision is
older than 073, and it is the only one Microsoft Portugal has, but it
has that bug corrected; it dates from the second semester of 2010.

My best guess is that it is a Windows problem.

I will try a stand alone FoX asap (i.e. tomorrow, I need to finish a
report today!) - but my code is doing nothing, the first executed
lines are those trying to read the XML file.

Nuno

nun...@itn.pt

Nov 4, 2010, 10:14:47 AM
to FoX-discuss
I now tested building a project from scratch, with FoX plus a simple
main program:

program test1

  use FoX_dom
  implicit none

  type(Node), pointer :: doc, element1
  character(len=30) :: filename

  filename = "IAEA25_NDF_1.xml"
  print *, 'parseFile ', filename

  ! Parse the whole file into a DOM tree
  doc => parseFile(filename)

  ! Inspect the root element of the document
  element1 => getDocumentElement(doc)
  print *, "Root element is: ", getNodeName(element1)
  print *, "Root element's namespace is: ", &
           getNamespaceURI(element1)
  print *, "Root element's localname is: ", getLocalName(element1)

  ! Release the memory held by the DOM tree
  call destroy(doc)

end program test1

This runs as before: correctly on hypoDD.kml and IAEA25_NDF_2.xml
(which has no long data entries), and simply stops working on
IAEA25_NDF_1.xml and IAEA25_NDF_3.xml.

If it were a compiler problem, surely it would not run on any of the
files?

Allen Barnett

Nov 4, 2010, 1:01:07 PM
to FoX-discuss
I reported a bug to Intel about IVF 11.1.067 causing a stack overflow
in m_sax_tokenizer. It occurs when you try to read a very long,
uninterrupted, text block, e.g., <tag> here is a lot of text..... </tag>.
See http://software.intel.com/en-us/forums/showthread.php?t=77847&o=a&s=lr.
Perhaps that is related to your problem. You can finesse FoX a little
by increasing the size of the stack, but basically it's a compiler
error.

Andrew Walker

Nov 5, 2010, 5:16:07 AM
to fox-d...@googlegroups.com
Thanks Allen,

This certainly sounds like the problem Nuno is seeing: the DOM parse function uses the SAX parser to read the document. The stack trace doesn't point at the tokenizer, but that's not unexpected if the stack gets corrupted.

Cheers,

Andrew

nun...@itn.pt

Nov 5, 2010, 5:35:51 AM
to FoX-discuss
Thank you very much for the answers! I have now asked our IT guy to ask
our local Intel dealer to get the new version.

Err..., is the issue with not being able to pass full path file names
also a compiler bug, or is it a FoX feature? When I try, it crashes, I
think, in those lines in my first post that say "! FIXME do we
copy correctly from fx%nlist to fx%xds%nlist?"

This Ion Beam Analysis Data Format (IDF for short) is a standard
defined as the result of an International Atomic Energy Agency "task
force", so I (and the other guys writing IBA codes) really need to
implement it, and until I saw FoX, I thought I'd either have to hard
code absolutely everything, which is a nightmare, or go again to mixed
language (there's loads of stuff for C), which I did before but is a
pain. So thank you!!!


Andrew Walker

Nov 5, 2010, 8:01:59 AM
to fox-d...@googlegroups.com
Hi Nuno,

Regarding the file path issue, we've seen something like this before. See the discussion towards the end of this thread:

https://groups.google.com/group/fox-discuss/browse_thread/thread/ef11524d62c9e2a0/7ea08305563dee59?hl=en&lnk=gst&q=file#7ea08305563dee59

What's supposed to happen is that the whole string supplied as a filename gets passed down to open_xml_file in m_sax_operate.F90 and then to open_actual_file (via open_file and open_new_file) in m_sax_reader.F90. This, in turn, calls open and checks that this works.

Looking at the details of your crash (which appears to come from reading fb%f when the file name was never written to it), my guess is that, for some reason, open() on line 156 of m_sax_reader fails. This then exposes what appears to be a bug in FoX: we should catch the fact that iostat isn't zero (and hence that we haven't set f%filename) and print out a nice error. Looking back at the old thread, my cryptic comment that:

> I also suspect that there is a need to check the error stack in
> open_xml_file before sax_parser_init gets called, but that's another
> issue.


may be relevant to this.

There are a couple of things you could do to help. Could you add print statements in open_actual_file to work out what the value of the file variable is after entering the open_actual_file subroutine and what the value of iostat is before and after the call to open(). i.e. after line 148, 155 and 157, respectively?

If the filename looks correct and iostat is nonzero after the open() statement, could you try running with the iostat argument to open() removed? This should give you a report from the underlying system stating why the file cannot be opened.
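An alternative that keeps iostat but still shows the system's explanation is sketched below. It is a stand-alone illustration, not FoX code: it assumes the compiler supports the Fortran 2003 iomsg= specifier, and the program name, unit number and file name are all made up for the example.

```fortran
program open_diagnostics
  implicit none
  ! Hypothetical path, chosen so that the open is expected to fail
  character(len=*), parameter :: filename = "no_such_dir/no_such_file.xml"
  character(len=256) :: msg
  integer :: ios, lun

  lun = 11    ! arbitrary unit number chosen for this sketch
  msg = ""

  ! status="old" makes open fail if the file does not exist;
  ! iomsg (Fortran 2003) receives the run-time system's explanation
  open(unit=lun, file=filename, status="old", action="read", &
       iostat=ios, iomsg=msg)

  if (ios /= 0) then
    print *, "open failed, iostat = ", ios
    print *, "reason: ", trim(msg)
  else
    print *, "open succeeded"
    close(lun)
  end if
end program open_diagnostics
```

The text returned in msg is compiler dependent, so it is only useful for diagnosis, not for program logic.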

I'll try and work out why FoX doesn't trap this error. Are you passing any optional arguments into the DOM parse function?

Cheers,

Andrew


Nuno P Barradas

Nov 5, 2010, 8:15:14 AM
to fox-d...@googlegroups.com
Hi Andrew,

I'm still writing that report, so I'll do this asap (probably Monday!)

thanks again for the answers!

cheers

Nuno

--
Nuno Pessoa Barradas
Principal researcher
Instituto Tecnológico e Nuclear
E.N. 10, Apartado 21
2686-953 Sacavém
Portugal
Tel: +351 219946150
Fax: +351 219941039

nun...@itn.pt

Nov 5, 2010, 12:12:22 PM
to FoX-discuss
Hi,

I put some prints here and there, in open_file there seems to be a
problem:

subroutine open_file(fb, iostat, file, lun, string, es)
  type(file_buffer_t), intent(out) :: fb
  character(len=*), intent(in), optional :: file
  integer, intent(out) :: iostat
  integer, intent(in), optional :: lun
  character(len=*), intent(in), optional :: string
  type(error_stack), intent(inout) :: es

  type(URI), pointer :: fileURI

  print *,'open_file file',file
  print *,'open_file string',string
  print *,'open_file iostat',iostat
  print *,'open_file present(file)',present(file)
  print *,'open_file present(string)',present(string)

  iostat = 0

  call setup_io()
  if (present(string)) then
    if (present(file)) then
      call FoX_error("Cannot specify both file and string input to open_xml")
    elseif (present(lun)) then
      call FoX_error("Cannot specify lun for string input to open_xml")
    endif
    fileURI => parseURI("")
    call open_new_string(fb, string, "", baseURI=fileURI)
  else
    fileURI => parseURI(file)
    if (.not.associated(fileURI)) then
      call add_error(es, "Could not open file "//file//" - not a valid URI")
      print *,"Could not open file",file
      return
    endif
    call open_new_file(fb, fileURI, iostat, lun)
  endif
  call destroyURI(fileURI)

end subroutine open_file

leads to following results:

D:\NDF\Nuno\IBA_IDF\testFoX\test1\test1\Release\IAEA25_NDF_2.xml

2
T
F
Could not open file
D:\NDF\Nuno\IBA_IDF\testFoX\test1\test1\Release\IAEA25_NDF_2.xml

So, the file name is being transmitted correctly to open_file, where it
cannot be parsed because it is "not a valid URI", and control returns to
open_xml_file. That is, it never reaches open_new_file.

In open_xml_file, it calls sax_parser_init, where it explodes on line
93

if (fb%f(1)%lun>0) then

Is this again a compiler problem? I am still using 067.

thanks, have a good weekend

Nuno


Andrew Walker

Nov 6, 2010, 11:14:03 AM
to fox-d...@googlegroups.com
Hi,

This information is most helpful. The crash at the "if (fb%f(1)%lun>0)"
line is because the array "f" hasn't been allocated: f(1) is outside the
array bounds. The reason FoX gets this far without doing the allocations
is because it does not trap the error generated by the failure to parse
the filename as a URI: the error is noted, lots of setup is skipped, but
the error never gets reported and processing does not stop. The two
attached patches should avoid the crash and report "Could not open file
D:\NDF\Nuno\IBA_IDF\testFoX\test1\test1\Release\IAEA25_NDF_2.xml - not a
valid URI"
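
The failure mode can be reproduced in isolation. What follows is only a sketch with hypothetical stand-in types and names (file_entry, file_buffer, safe_lun); it is not the content of the attached patches, and it assumes an allocatable component where FoX's own declaration may differ. It shows the kind of guard that avoids indexing f(1) when the array was never allocated.

```fortran
module buffer_guard_sketch
  implicit none
  private
  public :: file_entry, file_buffer, safe_lun

  ! Hypothetical stand-ins for FoX's internal types, for illustration only
  type file_entry
    integer :: lun = -1
  end type file_entry

  type file_buffer
    type(file_entry), allocatable :: f(:)
  end type file_buffer

contains

  ! Return the unit number of the first file, or -1 if nothing was opened
  function safe_lun(fb) result(lun)
    type(file_buffer), intent(in) :: fb
    integer :: lun
    lun = -1
    ! Nested tests: Fortran does not guarantee short-circuit evaluation
    if (allocated(fb%f)) then
      if (size(fb%f) >= 1) lun = fb%f(1)%lun
    end if
  end function safe_lun

end module buffer_guard_sketch

program demo_guard
  use buffer_guard_sketch
  implicit none
  type(file_buffer) :: fb

  ! fb%f has not been allocated, as after a failed URI parse
  print *, "lun before allocation:", safe_lun(fb)

  allocate(fb%f(1))
  fb%f(1)%lun = 11
  print *, "lun after allocation: ", safe_lun(fb)
end program demo_guard
```

The first call reports -1 instead of triggering the bounds error; the second reports 11.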

This, of course, isn't a full solution. I need to reread the URI spec to
understand this bit. Does it work if you make your path:

file://D:\NDF\Nuno\IBA_IDF\testFoX\test1\test1\Release\IAEA25_NDF_2.xml

Or use D:/NDF/Nuno... etc.

Cheers,

Andrew

0001-sax-Trap-invalid-URI-errors-from-open_file.patch
0002-dom-Report-errors-from-the-sax-s-open_xml_file.patch

nun...@itn.pt

Nov 8, 2010, 7:02:58 AM
to FoX-discuss
This is good progress, thanks!

Both

D:/NDF/Nuno/IBA_IDF/testFoX/test1/test1/Release/IAEA25_NDF_2.xml

and

file://D:/NDF/Nuno/IBA_IDF/testFoX/test1/test1/Release/IAEA25_NDF_2.xml

work; the other alternatives with \ do not.
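
For code that receives Windows-style paths, the swap can be done before the name reaches parseFile. This is a minimal sketch, assuming a helper called to_forward_slashes (a made-up name, not part of FoX). The backslash is written as achar(92) because some compilers can be set to treat a literal backslash in a string as an escape character.

```fortran
module path_slashes
  implicit none
contains

  ! Return a copy of path with every backslash replaced by a forward slash
  function to_forward_slashes(path) result(fixed)
    character(len=*), intent(in) :: path
    character(len=len(path)) :: fixed
    integer :: i
    fixed = path
    do i = 1, len(path)
      if (path(i:i) == achar(92)) fixed(i:i) = "/"   ! achar(92) is the backslash
    end do
  end function to_forward_slashes

end module path_slashes

program demo_slashes
  use path_slashes
  implicit none
  character(len=*), parameter :: bs = achar(92)

  ! Builds D:\NDF\IAEA25_NDF_2.xml and prints it with forward slashes
  print *, to_forward_slashes("D:" // bs // "NDF" // bs // "IAEA25_NDF_2.xml")
end program demo_slashes
```

In a program using FoX the result would be passed on, e.g. doc => parseFile(to_forward_slashes(filename)).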

cheers

Nuno


Andrew Walker

Nov 9, 2010, 3:08:10 PM
to fox-d...@googlegroups.com
Hi Nuno,

Thanks for the information. I think I now understand what's going on with
the DOS-style paths. However, I'm not sure what the correct
fix/work-around is. The basic issue is that the file name provided to the
sax open_xml_file subroutine (and thus the DOM parsefile function) is used
in two ways: (i) as a file name to be passed to the underlying operating
system via a Fortran open statement so that the operating system can make
arrangements for us to be able to read some text; (ii) as a string to be
converted into the default base URI for the document, so that, for
example, the DOM getBaseURI function will work in the absence of
'xml:base' attributes. The error is generated because the backslash
character ("\") isn't permitted in a URI. The forward slash works because
this character is permitted in a URI and, equally importantly, something
inside the open statement knows how to deal with paths delimited with
forward slashes as well as backslashes.

I suspect there is a similar problem (not limited to Windows) if you
attempt to open a file with a name containing embedded spaces (or any
other character that should be percent-encoded: files with embedded
percent symbols or colons also look tricky).

There are a number of ways I can think of to handle this:

(1) Make the argument to parsefile a valid URI by percent-encoding.
Replacing the backslashes with %5C should keep FoX's parseURI function
happy. The path part of the URI is un-percent encoded prior to being
passed to the open statement, so this should get backslashes and
conversion within open will need to take place.

(2) Make the argument to parsefile a valid URI by swapping backslashes
with forward slashes. The argument is then a valid URI but we rely on the
ability of the open statement to treat this as a file path.

(3) Add an additional argument to parsefile and open_xml_file so that the
user can provide the URI separately to the file path.

(4) If the filename cannot be converted to a URI, we could (should?)
choose a "default base URI" which is "necessarily application-dependent".
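
Option (1), done on the calling side, could look like the sketch below. The names count_backslashes and encode_backslashes are hypothetical and not part of FoX, only the backslash is handled (spaces, percent signs and other reserved characters are left untouched), and whether parseURI accepts the result has not been tested here.

```fortran
module percent_backslash
  implicit none
contains

  ! Count the backslashes in path; pure so it can size the result below
  pure function count_backslashes(path) result(n)
    character(len=*), intent(in) :: path
    integer :: n, i
    n = 0
    do i = 1, len(path)
      if (path(i:i) == achar(92)) n = n + 1
    end do
  end function count_backslashes

  ! Replace each backslash with the three characters %5C
  function encode_backslashes(path) result(encoded)
    character(len=*), intent(in) :: path
    character(len=len(path) + 2*count_backslashes(path)) :: encoded
    integer :: i, j
    j = 1
    do i = 1, len(path)
      if (path(i:i) == achar(92)) then
        encoded(j:j+2) = "%5C"
        j = j + 3
      else
        encoded(j:j) = path(i:i)
        j = j + 1
      end if
    end do
  end function encode_backslashes

end module percent_backslash

program demo_encode
  use percent_backslash
  implicit none
  character(len=*), parameter :: bs = achar(92)

  ! Builds D:\NDF\file.xml and prints D:%5CNDF%5Cfile.xml
  print *, encode_backslashes("D:" // bs // "NDF" // bs // "file.xml")
end program demo_encode
```

The result length is computed up front so the sketch does not rely on deferred-length character variables, which older compilers may not support.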

Options (1) and (2) could be done inside of FoX or left to the program
that uses FoX. If done inside FoX the change could be made contingent on a
failed initial attempt to parse the URI. If the expectation is to make the
code using FoX deal with this, it would be possible to provide some helper
functions in the FoX utils module. Whatever we do, the documentation will
need updating.

I see potential problems with all approaches. I don't think we can just
blindly percent encode the filename string within FoX as this will break
things starting with file://. Switching the slashes around may not be
portable and is only a limited solution. Any new argument would have to be
optional, so we would still have problems if it wasn't present.

I think, but am not convinced, that the best solution, at least for the
short term, will be to add a 'sanitise filename' function to utils. This
will apply various heuristics to turn a string argument into something
that ought to be useful as a filename and as a URI at the same time. Views
on this are welcome.

Cheers,

Andrew
