parseDateText() returns datetime object when no valid date and/or time is passed in

Tim Duffy

Mar 9, 2013, 11:23:23 PM
to parsedat...@googlegroups.com
Hello parsedatetime List,

I am new to using parsedatetime, and so far I am finding it really great!  I am, however, seeing behavior from the parseDateText() function that I don't think I should.  It returns a valid-looking date/time tuple with a seemingly random date and time when the text passed into it contains neither a date nor a time.

Setup:

import parsedatetime as pdt

c = pdt.Constants()
c.BirthdayEpoch = 80 # two-digit years below this value are read as 2000 + value
p = pdt.Calendar(c)

First, the parseDate() function:

>>> print p.parseDate("PRESENT: Mayor M. Connie Castaeda, Trustee William G. Andrews, Trustee Margaret B. Blackman, ")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/parsedatetime/__init__.py", line 356, in parseDate
    v1    = int(s[:index])
ValueError: invalid literal for int() with base 10: 'PRESENT: Mayor M'

Works as expected, great.

Now, the parseDateText() function:

>>> print p.parseDateText("PRESENT: Mayor M. Connie Castaeda, Trustee William G. Andrews, Trustee Margaret B. Blackman, ")
(2013, 5, 1, 23, 10, 15, 5, 68, 0)

Not sure why that text returns a date.  So I thought, maybe it was an issue (or mode of operation) that a date can be returned if there is no date present.  So I did this:

>>> print p.parseDateText("Fluffy is cute.")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/administrator/.virtualenvs/mmenv/local/lib/python2.7/site-packages/parsedatetime/__init__.py", line 427, in parseDateText
    mth = m.group('mthname')
AttributeError: 'NoneType' object has no attribute 'group'

So it's not all text without a date/time, just some of it.

>>> print p.parseDateText("I took my cat sniffles to the market with me.")
(2014, 3, 1, 23, 17, 57, 5, 68, 0)

Thoughts on what this issue is, or how I can fix it?  Am I setting something up incorrectly?

Thanks so much!

-TD



bear

Mar 10, 2013, 1:40:21 AM
to parsedat...@googlegroups.com
parseDate() and parseDateText() are helper routines designed to be called from the main entry point, parse(). They both assume (maybe incorrectly) that a well-formed date string will be passed.

Only parse() is designed to be called with freeform/natural text. It determines where the dates are in the string and then calls one of the parseDate* routines to work out the date/time.
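
A minimal sketch of that calling pattern, assuming parse() returns a (time_tuple, flag) pair, which is how the return value is described further down this thread. The helper name parse_freeform and the stub parser are hypothetical and exist only so the sketch runs without the library installed; with the real library you would pass pdt.Calendar().parse in place of the stub.

```python
def parse_freeform(text, parse):
    """Route free-form text through a parse()-style entry point.

    `parse` is any callable returning (time_tuple, flag). The helper
    routines parseDate()/parseDateText() are deliberately never called.
    Returns the time tuple, or None when the flag says nothing was parsed.
    """
    time_tuple, flag = parse(text)
    return time_tuple if flag != 0 else None


# Hypothetical stand-in for Calendar.parse: it "recognizes" only the
# literal word "tomorrow" and otherwise reports flag 0.
def stub_parse(text):
    fallback = (2013, 3, 10, 0, 0, 0, 6, 69, 0)
    if "tomorrow" in text:
        return (2013, 3, 11, 0, 0, 0, 0, 70, 0), 1
    return fallback, 0


print(parse_freeform("see you tomorrow", stub_parse))
print(parse_freeform("Fluffy is cute.", stub_parse))

# With the real library (not executed here):
#   import parsedatetime as pdt
#   parse_freeform("see you tomorrow", pdt.Calendar().parse)
```

Passing the parser in as an argument keeps the caller away from the parseDate* helpers, which expect a clean date string.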



--
You received this message because you are subscribed to the Google Groups "parsedatetime-dev" group.
To unsubscribe from this group and stop receiving emails from it, send an email to parsedatetime-...@googlegroups.com.
To post to this group, send email to parsedat...@googlegroups.com.
Visit this group at http://groups.google.com/group/parsedatetime-dev?hl=en.
For more options, visit https://groups.google.com/groups/opt_out.
 
 



--
Bear

be...@xmpp.org (email)
bea...@gmail.com (xmpp, email)
be...@code-bear.com (xmpp, email)
http://code-bear.com/bearlog (weblog)

PGP Fingerprint = 9996 719F 973D B11B E111  D770 9331 E822 40B3 CD29

Tim Duffy

Mar 10, 2013, 5:32:56 PM
to parsedat...@googlegroups.com
Bear,

Thanks so much for your reply.  It looks like I am using the library incorrectly then.  I have many lines of text, of which only one contains a valid date (in some form that parsedatetime *does* recognize).  My problem is that I need to know whether a date exists within the text before sending it to the parse() function.

Thoughts on any tricks to perform that function and return Boolean of valid or not?

I have posted the same question at StackOverflow here:

Thank you again for your timely response.  Your library is very useful, and as seen from the logs, quite mature!

-TD

bear

Mar 10, 2013, 9:20:07 PM
to parsedat...@googlegroups.com
If you're into natural language parsing, you can go down that route, but that route is one of the reasons why I wrote parsedatetime to begin with (!)

When you look at the result of the parse() call you should see a tuple that contains two items: the date/time evaluated (which will be what you pass in as a sourcetime if no date/time is found) and a result flag.

The result flag should be 0 if nothing was parsed, 1 if it was a date only, 2 if it was a time only, and 3 if it was a date/time.

So if you're getting anything other than 0 for text that doesn't contain anything date/time related, then I would say you have found a bug ;)
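
The flag check described above could be wrapped roughly like this. It is a sketch that assumes only the (time_tuple, flag) return shape and the four flag values given in this message; classify, lines_with_dates and the canned fake_parse are hypothetical names, and fake_parse stands in for pdt.Calendar().parse so the example runs on its own.

```python
# Flag meanings as stated in this thread.
FLAG_MEANINGS = {
    0: "nothing parsed",
    1: "date only",
    2: "time only",
    3: "date and time",
}


def classify(result):
    """Return (found, meaning) for a (time_tuple, flag) pair."""
    _time_tuple, flag = result
    return flag != 0, FLAG_MEANINGS.get(flag, "unknown flag")


def lines_with_dates(lines, parse):
    """Keep only the lines for which parse() reports a non-zero flag."""
    return [line for line in lines if classify(parse(line))[0]]


# Hypothetical parser with canned answers; unknown text falls back to
# the source time with flag 0, mirroring the behavior described above.
def fake_parse(text):
    source_time = (2013, 3, 10, 21, 20, 7, 6, 69, 0)
    canned = {
        "meeting on March 12": ((2013, 3, 12, 21, 20, 7, 1, 71, 0), 1),
        "call at 5pm": ((2013, 3, 10, 17, 0, 0, 6, 69, 0), 2),
    }
    return canned.get(text, (source_time, 0))


sample = ["Fluffy is cute.", "meeting on March 12", "call at 5pm"]
print(lines_with_dates(sample, fake_parse))
print(classify(fake_parse("Fluffy is cute.")))
```

The Boolean you asked about is the first element returned by classify(); the second element is only there to make log output readable.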

(I'll be pasting this into your SO question also)

Tim Duffy

Mar 10, 2013, 9:31:47 PM
to parsedat...@googlegroups.com

Thanks again for the quick response Bear.  In the SO question, I provided the example here:

print p.parse("Mary had a little lamb.")

Which returns:

((2014, 3, 1, 20, 53, 56, 6, 69, 1), 1)

The result flag is set to 1, meaning that parse() did in fact find a 'date only'.  Perhaps even odder, it also populates the time portion of the returned tuple with seemingly random values (not the current system time).
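
One way to turn this observation into a repeatable check is sketched below. It assumes only the (time_tuple, flag) return shape; find_false_positives and reported_parse are hypothetical names, and reported_parse simply replays the output quoted above for "Mary had a little lamb." rather than calling the library.

```python
def find_false_positives(samples, parse):
    """Return (text, flag, time_tuple) for every sample that is known to
    contain no date or time but still came back with a non-zero flag."""
    hits = []
    for text in samples:
        time_tuple, flag = parse(text)
        if flag != 0:
            hits.append((text, flag, time_tuple))
    return hits


# Hypothetical stand-in that replays the result reported in this thread.
# Any other text gets a placeholder tuple with flag 0.
def reported_parse(text):
    if text == "Mary had a little lamb.":
        return (2014, 3, 1, 20, 53, 56, 6, 69, 1), 1
    return (2013, 3, 10, 21, 31, 47, 6, 69, 0), 0


dateless = ["Mary had a little lamb.", "Fluffy is cute."]
for text, flag, time_tuple in find_false_positives(dateless, reported_parse):
    print("unexpected flag %d for %r: %r" % (flag, text, time_tuple))

# Against the real library (not executed here), pass
# pdt.Calendar().parse instead of reported_parse.
```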

With this information, do you think it is fair to say this is a bug?

Thanks again,

-TD


bear

Mar 10, 2013, 9:35:46 PM
to parsedat...@googlegroups.com
Hmm, yeah, I think that's a bug.  I will have to take a look at it to find out why.  It may take me a couple of days to get to this, as Sat and Sun are when I normally do my open-source coding, but yeah, bug :/