After some discussions in French about this idea, I subscribed here to
suggest adding a new string literal, for regexps, inspired by the other
prefixed types: u"", r"", b"", br"", f""…
The regexp string literal could be written: re""
It would ease the use of regexps in Python by providing regexp
literals, as in Perl or JavaScript.
We may end up with an integration like:
>>> import re
>>> if re".k" in 'ok':
... print "ok"
ok
>>>
Regexps are part of the language in Perl, and the rather complicated
integration of regexps in other languages, especially in Python, is
something that comes up easily in language-comparison discussions.
I've always felt that JavaScript's integration goes only half as far as
it should, and the new string literal types in Python (like f"") looked
like a good compromise: a tight integration of regexps without asking to
make them part of the language (as I imagine that has already been
discussed years ago, and obviously rejected…).
As the XKCD comic illustrates, using a regexp may be a problem of its own,
but really, the "each language has a new and complicated approach" issue
is another difficulty, on the same level as writing the regexps, I think.
And then, once you get the trick for Python, it still feels to me like too
many characters to type, given the numerous problems one can solve using
regexps.
I know regexps are slower than string-based workflows (like .startswith),
but regexps cover both the simplest and the most complex cases, so they
are quick to come up with once you start thinking in them. As the Python
philosophy is to spare brain cycles at the expense of CPU cycles, making
regexps easy to use is a brain-cycle-saving trick.
What do you think?
--
Simon Descarpentries
+336 769 702 53
http://acoeuro.com
_______________________________________________
Python-ideas mailing list
Python...@python.org
https://mail.python.org/mailman/listinfo/python-ideas
Code of Conduct: http://python.org/psf/codeofconduct/
There are several regular expression libraries for Python. One of them
is included in the stdlib, but it is not the first regular expression
library in the stdlib and may not be the last. A particular project can
choose to use an alternative regular expression library (because it has
additional features or is faster in particular cases).
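To illustrate the point about alternative libraries, here is a minimal
sketch of how a project might pick its engine at import time. It assumes
the third-party "regex" package from PyPI as the alternative and falls
back to the stdlib "re"; the alias regex_engine is a name made up for this
example.

```python
# Prefer the third-party "regex" package when it is installed (an assumption
# for this sketch), otherwise fall back to the stdlib "re" module.
try:
    import regex as regex_engine
except ImportError:
    import re as regex_engine

# Both modules accept a simple pattern like this through search().
mo = regex_engine.search(r".k", "ok")
print(regex_engine.__name__, mo.group(0))
```

A call that goes through a module name can be redirected this way, whereas
a literal would presumably be bound to a single engine.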
On Mon, Mar 27, 2017 at 05:17:40PM +0200, Simon D. wrote:
> The regexp string litteral could be represented by : re""
>
> It would ease the use of regexps in Python, allowing to have some regexp
> litterals, like in Perl or JavaScript.
>
> We may end up with an integration like :
>
> >>> import re
> >>> if re".k" in 'ok':
> ... print "ok"
> ok
I dislike the suggested syntax re".k". It looks ugly and not different
enough from a raw string. I can easily see people accidentally writing:
    if r".k" in 'ok':
        ...
and wondering why their regex isn't working.
Yes, but if the "in" operator is used, it would still work, because
r"..." is a str, and "str" in "string" is meaningful.
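A small sketch of that pitfall with today's Python, using only the stdlib
re module (the variable names are illustrative): the raw-string version
runs without error but performs a literal substring test, while the regex
search is what actually matches.

```python
import re

text = "ok"

# A raw string is still a plain str, so `in` is a literal substring test:
# the two characters ".k" do not occur in "ok".
substring_hit = r".k" in text

# As a regex, ".k" means "any character followed by k", which matches "ok".
regex_hit = re.search(r".k", text) is not None

print(substring_hit, regex_hit)  # False True
```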
But I think a better solution will be for regex literals to be
syntax-highlighted differently. If they're a truly-supported syntactic
feature, they can be made visually different in your editor, making
the distinction blatantly obvious.
That said, though, I'm -1 on this. Currently, every prefix letter has
its own meaning, and broadly speaking, combining them combines their
meanings. An re"..." literal should be a raw "e-string", whatever that
is, so I would expect that e"..." is the same kind of thing but with
different backslash handling.
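For reference, here is a short sketch of how the existing prefixes combine
(Python 3.6 or later is assumed for the f prefix; the variable names are
illustrative):

```python
name = "k"

raw = r"\n"                 # r: the backslash is kept, so two characters
raw_bytes = rb"\n"          # r + b: the same two characters, as bytes
raw_fstring = fr"\d{name}"  # r + f: raw backslash plus interpolation

print(len(raw), len(raw_bytes), raw_fstring)
```

In each case the combined prefix behaves as the sum of its letters, which
is why re"" would read as a raw variant of some e"" string.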
> But I think a better solution will be for regex literals to be
> syntax-highlighted differently. If they're a truly-supported syntactic
> feature, they can be made visually different in your editor, making
> the distinction blatantly obvious.
>
> That said, though, I'm -1 on this. Currently, every prefix letter has
> its own meaning, and broadly speaking, combining them combines their
> meanings. An re"..." literal should be a raw "e-string", whatever that
> is, so I would expect that e"..." is the same kind of thing but with
> different backslash handling.
First, I would like to state that the "module-static" versions of the
regexp functions, which avoid the compile step, are a great idea
(e.g.: mo = re.search(r'.k', mystring)).
The str-integrated one too, though it may be confusing: which regexp lib
is used? (It must be the default one.)
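As a minimal sketch of the "module-static" form next to the explicit
compile step, using only the stdlib re module (mystring is given a sample
value here for illustration):

```python
import re

mystring = "ok"

# Module-level function: no explicit compile step in user code.
mo = re.search(r".k", mystring)

# Equivalent two-step form with an explicitly compiled pattern object.
pattern = re.compile(r".k")
mo_compiled = pattern.search(mystring)

print(mo.group(0), mo_compiled.group(0))  # ok ok
```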
Then, re"" being two letters looks like a real problem. Let's pick one
among the 22 remaining free letters of the alphabet. What about:
- g"", x"" (as in regex)?
- m"" (as shown for Perl, meaning Match?)
- q"" (for Query?)
- k"" (in memory of Stephen Cole Kleene?
https://en.wikipedia.org/wiki/Regular_expression)
- /"" (to go halfway toward the /regexp/ syntax)
- ~"" ?"" (other symbols; I avoid regexp-starting symbols, which would be
ugly in real usage)
And what about an approach with the flags first? (Or where else to put them?):
i"" (regexp with the ignorecase flag on)
AILMSX"" (regexp with all flags on)
It would consume a lot of letters, but it would use them for a good reason
:-)
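For comparison, this is a sketch of where flags go today with the stdlib
re module: as an argument, inline in the pattern, or combined with | when
compiling (the sample strings are illustrative).

```python
import re

# Flag passed as an argument to a module-level function.
by_argument = re.search(r"ok", "OK", re.IGNORECASE)

# The same flag written inline, inside the pattern itself.
inline = re.search(r"(?i)ok", "OK")

# Several flags combined with | when compiling a pattern.
pattern = re.compile(r"^ok$", re.IGNORECASE | re.MULTILINE)
combined = pattern.search("first\nOK\nlast")

print(by_argument.group(0), inline.group(0), combined.group(0))  # OK OK OK
```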
Personally, I think a JavaScript-like syntax would be great, though I
feel it is asking too much…:
- it would naturally be highlighted differently;
- it would not be the first (happy) similarity
(https://hackernoon.com/javascript-vs-python-in-2017-d31efbb641b4#.ky9it5hph);
- it's a working integration, including how flags are handled.
--
Simon Descarpentries
+336 769 702 53
My 2 cents is that regular expressions are pretty un-pythonic because of their horrible readability. I would much rather see Python adopt something like Verbal Expressions ( https://github.com/VerbalExpressions/PythonVerbalExpressions ) into the standard library than add special syntax support for normal REs.
I feel like that borders on a bit too wordy...
> The str integrated one also, but maybe confusing, which regexp lib is
> used ? (must be the default one).
>
OK, this was a mistake, based on JavaScript memories… There are no
regexp-aware functions on str, only a hint to go find your happiness in
the re module.
--
Simon Descarpentries
+336 769 702 53
A huge advantage of REs is that they are common to many
languages. You can take a regex from grep to Perl to your editor to
Python. They're not absolutely identical, of course, but the basics
are all the same. Creating a new search language means everyone has to
learn anew.
ChrisA
You think that example is more readable than the proposed translation
^(http)(s)?(\:\/\/)(www\.)?([^\ ]*)$
which is better written
^https?://(www\.)?[^ ]*$
or even
^https?://[^ ]*$
which makes it obvious that the regexp is not very useful from the
word "^"? (It matches only URLs which are the only thing, including
whitespace, on the line, probably not what was intended.)
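A quick check of the three spellings with the stdlib re module, as a
sketch (the sample strings and variable names are made up for
illustration). On these inputs all three patterns agree, and the anchors
reject a URL embedded in a longer line:

```python
import re

patterns = [
    re.compile(r"^(http)(s)?(\:\/\/)(www\.)?([^\ ]*)$"),
    re.compile(r"^https?://(www\.)?[^ ]*$"),
    re.compile(r"^https?://[^ ]*$"),
]

samples = [
    "http://example.com",
    "https://www.python.org/doc/",
    "ftp://example.com",
    "see https://www.python.org for details",
]

# For each sample, record whether each pattern matches it.
results = {s: [bool(p.search(s)) for p in patterns] for s in samples}

for sample, hits in results.items():
    print(hits, sample)
```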
Are those groups capturing in Verbal Expressions? The use of "find"
(~ "search") rather than "match" is disconcerting to the experienced
user.
What does alternation look like? How about alternation of
non-trivial regular expressions?
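For reference, this is how the same questions are answered in traditional
notation with the stdlib re module. It is a sketch with made-up sample
strings and says nothing about the Verbal Expressions API.

```python
import re

text = "see https://www.python.org"

# re.match only tries at the start of the string; re.search scans forward.
at_start = re.match(r"https?://", text)
anywhere = re.search(r"https?://", text)

# Parentheses capture by default; (?:...) groups without capturing.
capturing = re.search(r"(http)(s)?://", text)
non_capturing = re.search(r"(?:http)(?:s)?://", text)

# Alternation uses |, here between two non-trivial sub-expressions.
alternation = re.search(r"(?:https?|ftp)://", "ftp://example.com")

print(at_start)                # None
print(anywhere.group(0))       # https://
print(capturing.groups())      # ('http', 's')
print(non_capturing.groups())  # ()
print(alternation.group(0))    # ftp://
```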
As far as I can see, Verbal Expressions are basically a way of making
it so painful to write regular expressions that people will restrict
themselves to regular expressions
I don't think that this failure to respect the
developer's taste is restricted to this particular implementation,
either.
On 03/04/2017 02:22, Neil Girdhar wrote:
> Same. One day, Python will have a decent parsing library.
>
Nothing here https://wiki.python.org/moin/LanguageParsing suits your needs?
--
My fellow Pythonistas, ask not what our language can do for you, ask
what you can do for our language.
Mark Lawrence
--
---
You received this message because you are subscribed to a topic in the Google Groups "python-ideas" group.
To unsubscribe from this topic, visit https://groups.google.com/d/topic/python-ideas/FSd6xLHowg8/unsubscribe.
To unsubscribe from this group and all its topics, send an email to python-ideas...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
I've tried PyParsing. I haven't tried Grako.