Parsing bugs/feature requests in fparser

2 views
Skip to first unread message

Kurt Smith

unread,
May 7, 2009, 1:54:50 PM5/7/09
to f2py-dev
Hi,

(Original post bounced since I wasn't a member of f2py-dev ML...)

I've been getting a feel for fparser, using some f90/f95 code I have
lying around. I'm happy to work on the following bugs/features; any
pointers or comments are welcome.

For the future (and not being familiar with launchpad): where should
bug reports go? Do you have a preferred way of managing them?

Here are the bug/feature requests:

1) Improve fparser's caching in parsefortran:FortanParser by
triggering another parse when the file content has changed (for ease
in interactive testing).

Can be done by computing a hash of the file and storing that in the
cache, or could just depend on the filesystem's timestamp.

2) Improve support for arrays in type blocks, e.g.:

************************************************
type foo
complex, dimension(:,:) :: foo_c
end type foo
************************************************

Won't currently parse. Traceback:

In [2]: tree = api.parse('testarrs.f90')
Traceback (most recent call last):
File "fparser/parsefortran.py", line 86, in analyze
self.block.analyze()
File "fparser/block_statements.py", line 214, in analyze
stmt.analyze()
File "fparser/block_statements.py", line 311, in analyze
stmt.analyze()
File "fparser/block_statements.py", line 1067, in analyze
stmt.analyze()
File "fparser/typedecl_statements.py", line 306, in analyze
var.update(self.attrspec)
File "fparser/base_classes.py", line 301, in update
assert self.dimension is None, `self.dimension,attr`
AssertionError: ([':', ':'], 'dimension(:,:)')

3)

************************************************
type foo
complex, dimension(10,10) :: foo_c
end type foo
************************************************

Won't parse either. Traceback:

In [2]: tree = api.parse('commondata.f90')
Traceback (most recent call last):
File "fparser/parsefortran.py", line 86, in analyze
self.block.analyze()
File "fparser/block_statements.py", line 214, in analyze
stmt.analyze()
File "fparser/block_statements.py", line 311, in analyze
stmt.analyze()
File "fparser/block_statements.py", line 1041, in analyze
BeginStatement.analyze(self)
File "fparser/base_classes.py", line 778, in analyze
stmt.analyze()
File "fparser/typedecl_statements.py", line 314, in analyze
var.analyze()
File "fparser/base_classes.py", line 408, in analyze
shape.append(spec[1]-spec[0])
TypeError: unsupported operand type(s) for -: 'str' and 'str'

4) It seems that the 'test_*.py' in fparser has a dependency on an
older version of numpy.testing before they switched to using nose for
their testing framework, so the tests won't work with current numpy.
It's up to you; we could either:

a) update the testing framework to use numpy & nose or
b) make the testing independent of numpy and switch the testing over to unittest
c) switch over to nose ourselves, but remove the numpy dependency
d) something else?

I have some familiarity with unittest, less with doctest, and none
with nose, although I'm happy to learn if it wouldn't be too much of
an undertaking.

Pearu Peterson

unread,
May 7, 2009, 3:31:28 PM5/7/09
to F2PY Developers
Hi!

On May 7, 8:54 pm, Kurt Smith <kwmsm...@gmail.com> wrote:
> Hi,
>
> (Original post bounced since I wasn't a member of f2py-dev ML...)
>
> I've been getting a feel for fparser, using some f90/f95 code I have
> lying around. I'm happy to work on the following bugs/features; any
> pointers or comments are welcome.

Great!

> For the future (and not being familiar with launchpad): where should
> bug reports go? Do you have a preferred way of managing them?

You can file bugs to

https://bugs.launchpad.net/f2py/+filebug

> Here are the bug/feature requests:
>
> 1) Improve fparser's caching in parsefortran:FortanParser by
> triggering another parse when the file content has changed (for ease
> in interactive testing).
>
> Can be done by computing a hash of the file and storing that in the
> cache, or could just depend on the filesystem's timestamp.

Using a hash value is not safe because theoretically different files
can have the same hash value. Could you look how this is
handled in weave, for example, that also caches codes in a file
system.
IIRC, weave computes a file name that is longer than
a hash value and hence changes for conflicts are smaller.
If caches as stored in a filesystem then they should probably
be in the current working directory, say, in .fparser.cache/.

> 2) Improve support for arrays in type blocks, e.g.:
>
> ************************************************
> type foo
> complex, dimension(:,:) :: foo_c
> end type foo
> ************************************************
>
> Won't currently parse. Traceback:
>
> In [2]: tree = api.parse('testarrs.f90')
> Traceback (most recent call last):
> File "fparser/parsefortran.py", line 86, in analyze
> self.block.analyze()
> File "fparser/block_statements.py", line 214, in analyze
> stmt.analyze()
> File "fparser/block_statements.py", line 311, in analyze
> stmt.analyze()
> File "fparser/block_statements.py", line 1067, in analyze
> stmt.analyze()
> File "fparser/typedecl_statements.py", line 306, in analyze
> var.update(self.attrspec)
> File "fparser/base_classes.py", line 301, in update
> assert self.dimension is None, `self.dimension,attr`
> AssertionError: ([':', ':'], 'dimension(:,:)')

It's either a bug in fparser or the code is conflicting with Fortran
2003 standard.

> 3)
>
> ************************************************
> type foo
> complex, dimension(10,10) :: foo_c
> end type foo
> ************************************************
>
> Won't parse either. Traceback:
>
> In [2]: tree = api.parse('commondata.f90')
> Traceback (most recent call last):
> File "fparser/parsefortran.py", line 86, in analyze
> self.block.analyze()
> File "fparser/block_statements.py", line 214, in analyze
> stmt.analyze()
> File "fparser/block_statements.py", line 311, in analyze
> stmt.analyze()
> File "fparser/block_statements.py", line 1041, in analyze
> BeginStatement.analyze(self)
> File "fparser/base_classes.py", line 778, in analyze
> stmt.analyze()
> File "fparser/typedecl_statements.py", line 314, in analyze
> var.analyze()
> File "fparser/base_classes.py", line 408, in analyze
> shape.append(spec[1]-spec[0])
> TypeError: unsupported operand type(s) for -: 'str' and 'str'

Looks like a bug.

> 4) It seems that the 'test_*.py' in fparser has a dependency on an
> older version of numpy.testing before they switched to using nose for
> their testing framework, so the tests won't work with current numpy.
> It's up to you; we could either:
>
> a) update the testing framework to use numpy & nose or
> b) make the testing independent of numpy and switch the testing over to unittest
> c) switch over to nose ourselves, but remove the numpy dependency

+1 to this one. fparser should not depend on numpy and
nose is a very good alternative to former numpy.testing.

> d) something else?
>
> I have some familiarity with unittest, less with doctest, and none
> with nose, although I'm happy to learn if it wouldn't be too much of
> an undertaking.

I suggest learning nose as it really is easier to use and IIRC runs
also faster than unittests.

Thanks,
Pearu

Dag Sverre Seljebotn

unread,
May 7, 2009, 4:18:54 PM5/7/09
to f2py...@googlegroups.com
Kurt Smith wrote:
> Hi,
>
> (Original post bounced since I wasn't a member of f2py-dev ML...)
>
> I've been getting a feel for fparser, using some f90/f95 code I have
> lying around. I'm happy to work on the following bugs/features; any
> pointers or comments are welcome.

Great! I can mostly comment on scheduling.

> 1) Improve fparser's caching in parsefortran:FortanParser by
> triggering another parse when the file content has changed (for ease
> in interactive testing).

Seems useful :-)

> 2) Improve support for arrays in type blocks, e.g.:

...


> 3)
>
> ************************************************
> type foo
> complex, dimension(10,10) :: foo_c
> end type foo
> ************************************************
>
> Won't parse either. Traceback:

Keep in mind that these two aren't needed for the midterm evaluation
goals. (That's all I'll say about it, you're in a far better situation
to judge if and/or when to do it. And perhaps as important is that
fixing them ought to give you a good feel about f2py internals.)

> 4) It seems that the 'test_*.py' in fparser has a dependency on an
> older version of numpy.testing before they switched to using nose for
> their testing framework, so the tests won't work with current numpy.
> It's up to you; we could either:
>
> a) update the testing framework to use numpy & nose or
> b) make the testing independent of numpy and switch the testing over to unittest
> c) switch over to nose ourselves, but remove the numpy dependency
> d) something else?

How big is the test suite? Are we talking a week or a day to move to a
new test framework? (Basically just curious how much work is involved
here; obviously a well functioning test suite is critical to doing the
development.)


--
Dag Sverre

Kurt Smith

unread,
May 7, 2009, 5:26:26 PM5/7/09
to f2py...@googlegroups.com
On Thu, May 7, 2009 at 2:31 PM, Pearu Peterson <pearu.p...@gmail.com> wrote:


>
>> For the future (and not being familiar with launchpad): where should
>> bug reports go?  Do you have a preferred way of managing them?
>
> You can file bugs to
>
> https://bugs.launchpad.net/f2py/+filebug

I'll put them there.

>
>> Here are the bug/feature requests:
>>
>> 1)  Improve fparser's caching in parsefortran:FortanParser by
>> triggering another parse when the file content has changed (for ease
>> in interactive testing).
>>
>> Can be done by computing a hash of the file and storing that in the
>> cache, or could just depend on the filesystem's timestamp.
>
> Using a hash value is not safe because theoretically different files
> can have the same hash value. Could you look how this is
> handled in weave, for example, that also caches codes in a file
> system.
> IIRC, weave computes a file name that is longer than
> a hash value and hence changes for conflicts are smaller.
> If caches as stored in a filesystem then they should probably
> be in the current working directory, say, in .fparser.cache/.

I'm proposing putting in a hash of the file contents along with the
filename inside the cache dictionary to mark when the file contents
change -- really just a one-line change.

Basically, in FortranFileReader:
try:
fh = open(filename)
self.id = filename + md5(fh.read()).hexdigest()
finally:
fh.close()

Instead of just storing the filename in the 'id' attribute. Having
both the filename and hash of the contents would be pretty robust to
hash collisions.

We could consider doing external storage of parsed file parse trees,
but that would be a bit more than I intended.

>
>> 2) Improve support for arrays in type blocks, e.g.:
>>
>> ************************************************
>> type foo
>>    complex, dimension(:,:) :: foo_c
>> end type foo
>> ************************************************
>>

[snip]


> It's either a bug in fparser or the code is conflicting with Fortran
> 2003 standard.

You're right -- my bad. I removed an 'allocatable' from in front.
Unfortunately it still doesn't parse:

**************************************************
type foo
complex, allocatable, dimension(:,:) :: foo_c
end type foo
**************************************************

I'll write it up on launchpad.

>
>> 3)
>>
>> ************************************************
>> type foo
>>    complex, dimension(10,10) :: foo_c
>> end type foo
>> ************************************************
>>

[snip]


>
> Looks like a bug.
>
>> 4) It seems that the 'test_*.py' in fparser has a dependency on an
>> older version of numpy.testing before they switched to using nose for
>> their testing framework, so the tests won't work with current numpy.
>> It's up to you; we could either:
>>
>> a) update the testing framework to use numpy & nose or
>> b) make the testing independent of numpy and switch the testing over to unittest
>> c) switch over to nose ourselves, but remove the numpy dependency
>
> +1 to this one. fparser should not depend on numpy and
> nose is a very good alternative to former numpy.testing.
>
>> d) something else?
>>
>> I have some familiarity with unittest, less with doctest, and none
>> with nose, although I'm happy to learn if it wouldn't be too much of
>> an undertaking.
>
> I suggest learning nose as it really is easier to use and IIRC runs
> also faster than unittests.

Since the test suite is just 2 files right now, this shouldn't be too
big of an undertaking once I understand nose -- glad to hear its
easier to use.

Also, it looks like there's a duplication of the test modules in the
fparser dir and the fparser/tests/ dir. Which one is preferred (I
assume the fparser/tests/ one)?

Kurt Smith

unread,
May 7, 2009, 5:40:39 PM5/7/09
to f2py...@googlegroups.com

That's one of 3 motives behind it:

1) Learn fparser's internals (fixing easy(?) bugs is a great way to do it)
2) Make fparser more robust
3) It's a long-term goal beyond GSoC (and I'll stay focused on the
midterm goals), but I'd like to see about fully wrapping some of my
extant fortran code at some future date. Fixing the type parsing
would be the first small step in getting that working.

I'm not intending on having derived-type support working as part of
the GSoC, though. Just the basic wrapping including intrinsic-type
array support.

>
>> 4) It seems that the 'test_*.py' in fparser has a dependency on an
>> older version of numpy.testing before they switched to using nose for
>> their testing framework, so the tests won't work with current numpy.
>> It's up to you; we could either:
>>
>> a) update the testing framework to use numpy & nose or
>> b) make the testing independent of numpy and switch the testing over to unittest
>> c) switch over to nose ourselves, but remove the numpy dependency
>> d) something else?
>
> How big is the test suite? Are we talking a week or a day to move to a
> new test framework? (Basically just curious how much work is involved
> here; obviously a well functioning test suite is critical to doing the
> development.)

The test suite is just 2 files, with about 200 test cases. But
they're pretty homogeneous and I don't think it'd be more than a day's
work at most, assuming that I can pick up nose quickly.

Kurt

Pearu Peterson

unread,
May 8, 2009, 2:43:51 AM5/8/09
to F2PY Developers


On May 8, 12:26 am, Kurt Smith <kwmsm...@gmail.com> wrote:

> Also, it looks like there's a duplication of the test modules in the
> fparser dir and the fparser/tests/ dir.  Which one is preferred (I
> assume the fparser/tests/ one)?

Yes, all tests should go to the tests/ directory. Duplications
should be removed from the fparser/ directory.

Pearu
Reply all
Reply to author
Forward
0 new messages