execution speed
From: Luca <luca.dallo...@gmail.com>
Date: Tue, 15 Mar 2011 07:26:10 -0700 (PDT)
Local: Tues, Mar 15 2011 10:26 am
Subject: execution speed
Hi,
I wrote a small Lepl grammar to parse a dump file. It works, but it
seems quite slow even for a few lines, and I am sure that the problem lies
in the grammar I wrote. Could you please give me some hints on speeding
up the grammar (it currently takes about 5 seconds)? Thank you in advance!

Luca

Here is the grammar snippet:

import datetime
from lepl import *

SOL = Drop(LineAwareSol())
EOL = Drop(LineAwareEol())
integer = Map(Token(Integer()), int)
uletter = Token(Upper())
real = Map(Token(Real()), float)

source = '''1      G      0.0            0.0            0.0            0.0            0.0            0.0
             2      G      0.0            0.0            0.0            0.0            0.0            0.0
             3      G      0.0            0.0            0.0            0.0            0.0            0.0
             4      G      0.0            0.0            0.0            0.0            0.0            0.0
             5      G      0.0            0.0            0.0            0.0            0.0            0.0
             6      G      0.0            0.0            0.0            0.0            0.0            0.0
             7      G      0.0            0.0            0.0            0.0            0.0            0.0
             8      G      0.0            0.0            0.0            0.0            0.0            0.0
             9      G      0.0            0.0           -9.856000E-05  -1.444699E-17   1.944000E-03   0.0
            10      G      0.0            0.0           -9.856000E-05  -1.427843E-17   1.944000E-03   0.0
            11      G      0.0            0.0           -1.085216E-02  -2.749537E-16   1.874400E-02   0.0
            12      G      0.0            0.0           -1.085216E-02  -2.748317E-16   1.874400E-02   0.0
            13      G      0.0            0.0           -3.600576E-02  -6.652665E-16   3.074400E-02   0.0
            14      G      0.0            0.0           -3.600576E-02  -6.717988E-16   3.074400E-02   0.0
            15      G      0.0            0.0           -7.075936E-02  -8.592844E-16   3.794400E-02   0.0
            16      G      0.0            0.0           -7.075936E-02  -8.537008E-16   3.794400E-02   0.0
            17      G      0.0            0.0           -1.103130E-01  -9.445027E-16   4.034400E-02   0.0
            18      G      0.0            0.0           -1.103130E-01  -9.538811E-16   4.034400E-02   0.0
           100      G      0.0            0.0            0.0            0.0            0.0            0.0
           200      G      0.0            0.0            0.0            0.0            0.0            0.0
'''

data_line = SOL & integer & uletter & Repeat(real, start=6, stop=6) & EOL
table = OneOrMore(data_line)
table.config.default_line_aware()
begin = datetime.datetime.now()
print table.parse(source)
print datetime.datetime.now() - begin


 
From: andrew cooke <and...@acooke.org>
Date: Tue, 15 Mar 2011 11:37:37 -0300
Local: Tues, Mar 15 2011 10:37 am
Subject: Re: [LEPL] execution speed

Hi,

Your grammar looks fine.  It's worth remembering that Lepl is written in
Python, so it is slow.  However, what I think is your main problem is that you
are not separating the work Lepl has to do to create the parser from the time
actually spent parsing.

Lepl does a lot of work "compiling" the parser.  This does take time, but is
done just once.  You can then use the parser many times.

By default the compiled parser is saved internally and reused, so you only pay
for this once, the first time you call the parser.  But it also means that if
you time the first call, as your code does, the result includes the compilation
time.

If you want to measure just the time needed to run the parser then I would
suggest changing your code to:

         table.config.default_line_aware()
         begin = datetime.datetime.now()
         parser = table.get_parse()
         print 'compile time:', datetime.datetime.now() - begin
         begin = datetime.datetime.now()
         print table.parse(source)
         print 'parse time:', datetime.datetime.now() - begin

Or alternatively:

         table.config.default_line_aware()
         for i in range(3):
             begin = datetime.datetime.now()
             table.parse(source)
             print 'parse time:', i, datetime.datetime.now() - begin

This should show that the first run takes longer than the rest (although it
will also reflect the effect of any caching, if you use that).
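
For readers without Lepl installed, the effect described above can be sketched
in plain Python 3. This is only an illustration, not Lepl's implementation:
LazyParser, its compile_count attribute, and the simulated build delay are all
made up for the example, and Python's re module stands in for the real parser.
It shows a parser that is built on the first call and reused afterwards, so the
first timing is the slow one.

```python
import re
import time


class LazyParser:
    """Toy stand-in for a matcher that builds its parser on first use."""

    def __init__(self, pattern):
        self.pattern = pattern
        self._compiled = None     # filled in by the first get_parse() call
        self.compile_count = 0    # how many times the build cost was paid

    def get_parse(self):
        # Build the parser only if it has not been built already.
        if self._compiled is None:
            time.sleep(0.05)      # simulated, deliberately exaggerated build cost
            self._compiled = re.compile(self.pattern)
            self.compile_count += 1
        return self._compiled.findall

    def parse(self, text):
        return self.get_parse()(text)


def time_parses(parser, text, repeats=3):
    """Return the wall-clock duration of each successive parse() call."""
    durations = []
    for _ in range(repeats):
        begin = time.perf_counter()
        parser.parse(text)
        durations.append(time.perf_counter() - begin)
    return durations


table = LazyParser(r"-?[0-9]+\.[0-9]+")
durations = time_parses(table, "0.0 0.0 -1.5")
for i, seconds in enumerate(durations):
    print("parse time:", i, seconds)
print("builds:", table.compile_count)
```

Only the first iteration pays the build cost; the later ones measure parsing
alone, which is the number that matters when one parser handles many inputs.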

Does that help?

Cheers,
Andrew


 
Discussion subject changed to "Fix/workaround [Was: execution speed]" by andrew cooke
From: andrew cooke <and...@acooke.org>
Date: Tue, 15 Mar 2011 21:11:24 -0300
Local: Tues, Mar 15 2011 8:11 pm
Subject: Fix/workaround [Was: execution speed]

Hi,

OK I've been looking at this in more detail after work.

While what I said is true, you have also come across a bug / issue.  My
regular expression code is taking a crazy long time to compile the regexp for
floats / reals, and that is what you are using here.

I've only known about this for a few weeks, and am still thinking about
possible fixes.  In the long term I have some better regexp code that I will
add to Lepl.  In the shorter term I may be able to switch to Python's re
library.

In your case a simple workaround is to replace Float() with a simpler regexp.
If I do timings for Token(Float()) and
Token(r'\-?[0-9]+\.[0-9]+(?:E\-?[0-9]+)?') I get:

    Timing results
    --------------

    With compilation:   1 parse(s), best of 3
    Parse only:       100 parse(s), best of 3

    Matcher            Compiling | Parse only
    -------------------------------------------
                   float   3.769 | 0.000112 (s)
        float no memoize   3.767 | 0.052816 (s)
                  regexp   0.542 | 0.000113 (s)
       regexp no memoize   0.538 | 0.052089 (s)

(this is output from a new utility in Lepl 5).  As you can see, using the
alternative regexp reduces the compilation time from almost 4s to about half a
second.  And the actual parsing, in either case, takes only about 1/20s even
without memoization.
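
As a quick sanity check, the replacement pattern can be tried against a few
values from the dump with Python's own re module (Python 3 here). This is a
sketch outside Lepl, so it says nothing about how Lepl's regexp engine handles
the pattern. The helper name is_float_literal and the list of rejected inputs
are chosen purely for illustration. Note that the pattern is narrower than a
general float matcher: it needs digits on both sides of the decimal point and
only accepts an upper-case E.

```python
import re

# The replacement pattern quoted above, compiled with Python's re module.
FLOAT_PATTERN = re.compile(r'\-?[0-9]+\.[0-9]+(?:E\-?[0-9]+)?')

# Values taken from the dump in the original post.
samples = ['0.0', '-9.856000E-05', '-1.444699E-17', '1.944000E-03', '4.034400E-02']

# Illustrative inputs the pattern does not accept as a whole.
non_matches = ['5', '+1.0', '1.0e-3', '.5', '1.']


def is_float_literal(text):
    """True if the whole string matches the simplified float pattern."""
    return FLOAT_PATTERN.fullmatch(text) is not None


for text in samples + non_matches:
    print(repr(text), is_float_literal(text))
```

If a dump ever contains plain integers or lower-case exponents in the real
columns, the pattern would need widening before it could replace the built-in
matcher.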

Hope that helps, and sorry for not spotting this earlier,

Andrew


 
From: Luca <luca.dallo...@gmail.com>
Date: Wed, 16 Mar 2011 01:46:34 -0700 (PDT)
Local: Wed, Mar 16 2011 4:46 am
Subject: Re: Fix/workaround [Was: execution speed]
Hi,

thank you very much, your solution works like a charm! :-)
I was also trying to take the compile time into account, because I
actually have two usage profiles: hundreds of small unit tests which
I execute quite often (so they need to be fast enough, and they would
not benefit from compilation) and the real executions (one single
execution over hundreds of thousands of similar lines, so compilation
is really welcome in this case). Since the compilation time is not too
high, I think I will keep it; otherwise I think I could disable it
during tests with a flag...
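
One possible way to keep many small tests fast without a flag is to build the
parser once per test process and share it. The sketch below (Python 3) uses
functools.lru_cache for that. shared_parser, build_log, and the two test
functions are hypothetical names, and a compiled re pattern stands in for the
Lepl parser. As noted earlier in the thread, Lepl already reuses the compiled
parser internally, so this would only matter where each test rebuilds the
grammar from scratch.

```python
import functools
import re

build_log = []  # records every time the expensive build really runs


@functools.lru_cache(maxsize=None)
def shared_parser():
    """Build the stand-in parser on first call; later calls return the same object."""
    build_log.append("built")
    return re.compile(r"-?[0-9]+\.[0-9]+(?:E-?[0-9]+)?")


def test_plain_zero():
    assert shared_parser().fullmatch("0.0")


def test_exponent():
    assert shared_parser().fullmatch("-9.856000E-05")


# Run the tests by hand; a real test runner would discover them instead.
for test in (test_plain_zero, test_exponent):
    test()
print("builds:", len(build_log))
```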

I suppose that in the future Lepl could possibly take advantage of
Cython compilation, once Cython supports things such as the "yield"
keyword... I have personally been able to compile most of my
code with interesting improvements ;-)

thank you again for your prompt answer and help, and for creating
lepl!

Luca



 