
Match.groupdict: Meaning of default argument?


Loris Bennett

Apr 29, 2022, 3:50:08 AM
Hi,

If I do

import re
pattern = re.compile(r'(?P<days>\d*)(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)')
s = '104-02:47:06'
match = pattern.search(s)
match_dict = match.groupdict('0')

I get

match_dict
{'days': '104', 'hours': '02', 'minutes': '47', 'seconds': '06'}

However, if the string has no initial part (corresponding to the number of
days), e.g.

s = '02:47:06'
match = pattern.search(s)
match_dict = match.groupdict('0')

I get

match_dict
{'days': '', 'hours': '02', 'minutes': '47', 'seconds': '06'}

I thought that 'days' would default to '0'.

What am I doing wrong?

Cheers,

Loris
--
This signature is currently under construction.

Julio Di Egidio

Apr 30, 2022, 1:55:39 AM
On Friday, 29 April 2022 at 09:50:08 UTC+2, Loris Bennett wrote:
> Hi,
>
> If I do
>
> import re
> pattern = re.compile(r'(?P<days>\d*)(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)')
> s = '104-02:47:06'
> match = pattern.search(s)
> match_dict = match.groupdict('0')
>
> I get
>
> match_dict
> {'days': '104', 'hours': '02', 'minutes': '47', 'seconds': '06'}
>
> However, if the string has no initial part (corresponding to the number of
> days), e.g.
>
> s = '02:47:06'
> match = pattern.search(s)
> match_dict = match.groupdict('0')
>
> I get
>
> match_dict
> {'days': '', 'hours': '02', 'minutes': '47', 'seconds': '06'}
>
> I thought that 'days' would default to '0'.
>
> What am I doing wrong?

You tell me, but it's quite obvious that you are (just) running a regex on a string, and captures are going to be strings: indeed, '02' is not a number either...

Julio

Loris Bennett

May 3, 2022, 7:38:22 AM
r...@zedat.fu-berlin.de (Stefan Ram) writes:

> "Loris Bennett" <loris....@fu-berlin.de> writes:
>>I thought that 'days' would default to '0'.
>
> It will get the value '0' if (?P<days>\d*) does
> /not/ participate in the match.
>
> In your case, it /does/ participate in the match,
> \d* matching the empty string.
>
> Try (?P<days>\d+)?.

Ah, thanks. I was misunderstanding the meaning of 'participate'.
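
For anyone following along, here is a minimal sketch of the distinction Stefan describes, using the two patterns from this thread. The variable names (`star`, `optional`, `m_star`, `m_opt`) are only illustrative. With `\d*` the group matches the empty string and therefore participates; with `(...)?` the group can be skipped entirely, and only then does the default passed to groupdict() apply.

```python
import re

# Original pattern: \d* may match the empty string, so the 'days' group
# always participates in a successful match.
star = re.compile(
    r'(?P<days>\d*)(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)'
)

# Suggested pattern: the group itself is optional, so it can be skipped
# and then does not participate in the match at all.
optional = re.compile(
    r'(?P<days>\d+)?(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)'
)

s = '02:47:06'
m_star = star.search(s)
m_opt = optional.search(s)

print(repr(m_star.group('days')))           # '' (participated, matched nothing)
print(repr(m_opt.group('days')))            # None (did not participate)
print(repr(m_star.groupdict('0')['days']))  # '' (default is not applied)
print(repr(m_opt.groupdict('0')['days']))   # '0' (default is applied)
```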

Loris Bennett

May 3, 2022, 7:50:08 AM
"Loris Bennett" <loris....@fu-berlin.de> writes:

> r...@zedat.fu-berlin.de (Stefan Ram) writes:
>
>> "Loris Bennett" <loris....@fu-berlin.de> writes:
>>>I thought that 'days' would default to '0'.
>>
>> It will get the value '0' if (?P<days>\d*) does
>> /not/ participate in the match.
>>
>> In your case, it /does/ participate in the match,
>> \d* matching the empty string.
>>
>> Try (?P<days>\d+)?.
>
> Ah, thanks. I was misunderstanding the meaning of 'participate'.

What I actually need is

((?P<days>\d+)(-?))?(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)

so that I can match both

99-11:22:33

and

11:22:33

and have 'days' be '0' in the latter case.

Thanks for pointing me in the right direction.
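
A quick check of that pattern against both inputs, as a sketch (the `results` dictionary is just a convenience for this example):

```python
import re

# Days and the separator are wrapped in one optional group, so 'days'
# does not participate when the input has no days part.
pattern = re.compile(
    r'((?P<days>\d+)(-?))?(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)'
)

results = {}
for s in ('99-11:22:33', '11:22:33'):
    # '0' is substituted for every named group that did not participate.
    results[s] = pattern.search(s).groupdict('0')
    print(s, '->', results[s])

# '99-11:22:33' gives days == '99'; '11:22:33' gives days == '0'.
```

Note that the unnamed groups (the outer group and the separator) do not appear in the dictionary, since groupdict() only returns named groups.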

Loris Bennett

May 3, 2022, 10:44:10 AM
I am not sure what you are trying to tell me. I wasn't expecting
anything other than strings. The problem was, as Stefan helped me to
understand, that I misunderstood what 'participating in the match'
means.

Julio Di Egidio

May 3, 2022, 1:18:30 PM
On Tuesday, 3 May 2022 at 16:44:10 UTC+2, Loris Bennett wrote:
> Julio Di Egidio <ju...@diegidio.name> writes:
> > On Friday, 29 April 2022 at 09:50:08 UTC+2, Loris Bennett wrote:
> >> Hi,
> >>
> >> If I do
> >>
> >> import re
> >> pattern = re.compile(r'(?P<days>\d*)(-?)(?P<hours>\d\d):(?P<minutes>\d\d):(?P<seconds>\d\d)')
> >> s = '104-02:47:06'
> >> match = pattern.search(s)
> >> match_dict = match.groupdict('0')
> >>
> >> I get
> >>
> >> match_dict
> >> {'days': '104', 'hours': '02', 'minutes': '47', 'seconds': '06'}
> >>
> >> However, if the string has no initial part (corresponding to the number of
> >> days), e.g.
> >>
> >> s = '02:47:06'
> >> match = pattern.search(s)
> >> match_dict = match.groupdict('0')
> >>
> >> I get
> >>
> >> match_dict
> >> {'days': '', 'hours': '02', 'minutes': '47', 'seconds': '06'}
> >>
> >> I thought that 'days' would default to '0'.
> >>
> >> What am I doing wrong?
> >
> > You tell me, but it's quite obvious that you are (just) running a regex on a string, and captures are going to be strings: indeed, '02' is not a number either...
>
> I am not sure what you are trying to tell me. I wasn't expecting
> anything other than strings. The problem was, as Stefan helped me to
> understand, that I misunderstood what 'participating in the match'
> means.

Stefan's "fix" makes it fail if there is not at least one digit in that position. Not only do I not see how that is in fact a fix, I also still cannot guess where the string '0' is supposed to originate from in case we do not have the days part in the input. Maybe I am overlooking something...?

Julio

Julio Di Egidio

May 3, 2022, 1:24:14 PM
Yes, I was missing the meaning of the parameter in the call to groupdict.
<https://docs.python.org/2.7/library/re.html#re.MatchObject.groupdict>

And then Stefan's solution makes sense, as the argument is the default for the *optional* groups that did not participate in the match...

Right? Sorry for the confusion.

Julio

Julio Di Egidio

May 3, 2022, 1:25:55 PM
P.S. I don't see Stefan's posts, and I doubt he sees mine...

Julio