Barry Margolin <bar...@alum.mit.edu> writes:
> It can't be done using a function, because literal parsing is done at
> read time. It would have to be done using a new string syntax. Perhaps
> something like:
> #/regex/
> would make sense.
For the record,
SXEmacs has Python-style raw strings. It greatly reduces "backslashitis" when writing those hairy regexps. :-)
Normal regexp: "\\(?:^\\|[^\\]\\)\\(?:\\\\\\\\\\)*\\(\\\\[@A-Za-z]+\\)"
Raw string regexp: #r"\(?:^\|[^\]\)\(?:\\\\\)*\(\\[@A-Za-z]+\)"
Emacs regex has an odd peculiarity in that it needs a lot of backslashes. More specifically, a string first needs to be properly escaped, then this is passed to the regex engine.
For example, suppose you have this text “Sin[x] + Sin[y]” and you need to capture the x or y.
In emacs i need to use “\\(\\[[a-z]\\]\\)” for the actual regex “\(\[[a-z]\]\)”.
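A minimal sketch of the example above, meant to be evaluated form by form (for instance with C-x C-e). The variable name `my-text` is made up for illustration. It suggests that the doubled backslashes exist only in the source text of the string literal, while the regex engine receives the single-backslash form:

```elisp
;; Hypothetical variable name, for illustration only.
(defvar my-text "Sin[x] + Sin[y]")

;; The literal "\\(\\[[a-z]\\]\\)" is read as the 13-character string
;; \(\[[a-z]\]\), which is what the regex engine actually receives.
(length "\\(\\[[a-z]\\]\\)")                  ; => 13

;; The first match starts at the opening bracket of "[x]".
(string-match "\\(\\[[a-z]\\]\\)" my-text)    ; => 3

;; Group 1 as written in the post includes the brackets.
(match-string 1 my-text)                      ; => "[x]"
```

To capture only the letter, the group would have to sit inside the brackets, as in "\\[\\([a-z]\\)\\]".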
Here's a somewhat typical but long regex for matching an HTML image tag
The toothpick syndrome gets crazy, making the already difficult regex syntax impossible to read and hard to code.
My question is, why does elisp's regex have this 2-step process? Is this some design decision or did it just happen that way historically?
Second question: can't elisp create something like a “regex-string” wrapper function that automatically takes care of the quoting? I can't see how this might be difficult?
"xah...@gmail.com" <xah...@gmail.com> writes:
> Emacs regex has an odd peculiarity in that it needs a lot of backslashes.
> More specifically, a string first needs to be properly escaped, then
> this is passed to the regex engine.
> For example, suppose you have this text “Sin[x] + Sin[y]” and you need
> to capture the x or y.
> In emacs i need to use
> “\\(\\[[a-z]\\]\\)”
> for the actual regex
> “\(\[[a-z]\]\)”.
> Here's a somewhat typical but long regex for matching an HTML image tag
Moral: there is absolutely NO double antislash. It's a figment of your imagination.
Another proof there is no double antislash:
(let ((test "\\a\\b"))
  (insert (format "%c %c %c %c\n"
                  (aref test 0) (aref test 1) (aref test 2) (aref test 3))))
C-x C-e
inserts: \ a \ b
and not: \ \ a \
> My question is, why does elisp's regex have this 2-step process? Is this
> some design decision or did it just happen that way historically?
The Emacs Regexp syntax involves double anti-slash, only to match an antislash:
(string-match "\\\\" "abc\\def") --> 3
Otherwise, there is only one anti-slash in each occurrence:
(string-match "\\." "abc.def") --> 3
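The same point can be checked by measuring the strings directly. This is only a small sketch using the two patterns quoted above:

```elisp
;; Each of these literals is two characters long once the reader has
;; processed the escapes.
(length "\\\\")                       ; => 2, the regexp \\ (one backslash)
(length "\\.")                        ; => 2, the regexp \. (one dot)

;; The subject string "abc\\def" holds a single backslash, at index 3.
(length "abc\\def")                   ; => 7
(string-match "\\\\" "abc\\def")      ; => 3
(string-match "\\." "abc.def")        ; => 3
```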
> Second question: can't elisp create something like a “regex-string” wrapper
> function that automatically takes care of the quoting? I can't see how
> this might be difficult?
It would not be simple, because there are no reader macros in emacs lisp.
You would either have to change the syntax of regular expressions to use some other character. For example, double-quote.
". would be the regexp to match a single dot,
"( ") would match a group, etc.
Then you would write:
"\"(a\"|\".\")" ; look ma! no double-slash! (but double-quotes...)
Or you could try another character that doesn't need escaping from string literals. Let's say ^
"^(a^|^.^)" matches a or dot anywhere.
Or, if you changed the emacs reader, you could have it use a character other than anti-slash to escape double-quote and the escape character in strings. Let's use ~
"\(~"a~"\|~\\.\)" would match "a" or \.
(insert "\(~"a~"\|\.\)") would insert: \("a"\|\.\)
You would have to write ~n to insert a newline in your string literals...
In article <0add1712-a31e-4499-9523-955c49126...@x41g2000hsb.googlegroups.com>,
"xah...@gmail.com" <xah...@gmail.com> wrote:
> My question is, why does elisp's regex have this 2-step process? Is this
> some design decision or did it just happen that way historically?
Just history. They adopted string notation from Common Lisp, which uses backslash as the escape. And they adopted the standard Unix regex notation, which also uses backslash as the escape.
> Second question: can't elisp create something like a “regex-string” wrapper
> function that automatically takes care of the quoting? I can't see how
> this might be difficult?
It can't be done using a function, because literal parsing is done at read time. It would have to be done using a new string syntax. Perhaps something like:
#/regex/
would make sense.
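A small sketch of why a plain function cannot help, assuming the standard Emacs Lisp reader behaviour that a backslash before a character with no special escape meaning is simply dropped. By the time any function receives the string, the backslashes are already gone:

```elisp
;; The reader drops a backslash that precedes a character with no
;; special escape meaning, so this literal contains no backslashes.
(equal "\(this\|that\)" "(this|that)")          ; => t

;; A wrapper function therefore has nothing left to restore: as a regexp,
;; the string it receives only matches the literal text "(this|that)".
(string-match "\(this\|that\)" "this")          ; => nil
(string-match "\\(this\\|that\\)" "this")       ; => 0
```

Characters such as `t`, `n`, `a` or `e` do have an escape meaning after a backslash, so the loss is not even uniform.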
--
Barry Margolin, bar...@alum.mit.edu
Arlington, MA
*** PLEASE post questions in newsgroups, not directly to me ***
*** PLEASE don't copy me on replies, I'll read them in the group ***
"xah...@gmail.com" <xah...@gmail.com> writes:
> Emacs regex has an odd peculiarity in that it needs a lot of backslashes.
Yes, I've made the same complaint. You haven't even mentioned what I would call the really bad problem: some backslashes need to be doubled up when you stash them in strings, but *others don't*: if you turn "\t" or "\n" into "\\t" or "\\n" you'll break the regexp.
> My question is, why does elisp's regex have this 2-step process? Is this
> some design decision or did it just happen that way historically?
I hypothesize that "(" and ")" need to be escaped because lisp hackers think in terms of meta-hacking lisp, but I don't know that that's true.
As other people have pointed out, the problem is an interaction between two different sets of design decisions, one concerning regexps, the other concerning strings.
> Second question: can't elisp create something like a “regex-string” wrapper
> function that automatically takes care of the quoting? I can't see how
> this might be difficult?
I've had the same thought (after making the same complaint):
No, I don't think it would be all that hard to write a "regexp-whack-off" function that would do the escaping for you, but getting anyone else to use it might be difficult: it's the sort of thing where you don't see the need for it until you've learned to dance around it.
"xah...@gmail.com" <xah...@gmail.com> writes:
> Second question: can't elisp create something like a “regex-string” wrapper
> function that automatically takes care of the quoting? I can't see how
> this might be difficult?
Do you mean something like regexp-quote?
regexp-quote is a built-in function in `C source code'.
(regexp-quote STRING)
Return a regexp string which matches exactly STRING and nothing else.
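For reference, a brief sketch of what `regexp-quote` does: it escapes regexp-special characters so that the result matches its argument literally.

```elisp
;; The dot is escaped, so it no longer matches an arbitrary character.
(regexp-quote "a.b")                              ; => "a\\.b"
(string-match (regexp-quote "a.b") "a.b")         ; => 0
(string-match (regexp-quote "a.b") "axb")         ; => nil

;; Without quoting, the dot matches any character.
(string-match "a.b" "axb")                        ; => 0
```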
Joseph Brenner <d...@kzsu.stanford.edu> writes:
> "xah...@gmail.com" <xah...@gmail.com> writes:
>> Emacs regex has an odd peculiarity in that it needs a lot of backslashes.
> Yes, I've made the same complaint. You haven't even mentioned what
> I would call the really bad problem: some backslashes need to be
> doubled up when you stash them in strings, but *others don't*:
> if you turn "\t" or "\n" into "\\t" or "\\n" you'll break the regexp.
That would probably be because \t and \n translate to characters in a literal string - they don't have any particular meaning in a regex.
The real issue is that a string literal has specific syntax that means you need \ as an escape char. After parsing the string _literal_ construct, the resulting _string_ is parsed as a regex, which unfortunately also uses \ as the escape character. Many languages that don't have a special regex literal construct work the same way, and have the same issues.
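The \t case can be sketched as follows. This assumes the usual Emacs regexp rule that a backslash before a character with no special regexp meaning just quotes that character:

```elisp
;; "\t" in a string literal is already a TAB character.
(length "a\tb")                       ; => 3
(string-match "a\tb" "a\tb")          ; => 0

;; Doubling it yields the regexp a\tb, where \t is a quoted ordinary t,
;; so the pattern no longer matches a TAB.
(string-match "a\\tb" "a\tb")         ; => nil
(string-match "a\\tb" "atb")          ; => 0
```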
>> My question is, why does elisp's regex have this 2-step process? Is this
>> some design decision or did it just happen that way historically?
> I hypothesize that "(" and ")" need to be escaped because lisp hackers
> think in terms of meta-hacking lisp, but I don't know that that's true.
The need to escape capturing parentheses \( and \) is traditional regex syntax, I believe.
> As other people have pointed out, the problem is an interaction
> between two different sets of design decisions, one concerning
> regexps, the other concerning strings.
Yup.
>> Second question: can't elisp create something like a “regex-string” wrapper
>> function that automatically takes care of the quoting? I can't see how
>> this might be difficult?
> I've had the same thought (after making the same complaint):
> No, I don't think it would be all that hard to write a
> "regexp-whack-off" function that would do the escaping for
> you, but getting anyone else to use it might be difficult:
> it's the sort of thing where you don't see the need for it
> until you've learned to dance around it.
It's relatively straightforward to implement some nice looking regex literal syntax in Common Lisp, but elisp lacks reader macros. The only way I can see it work in elisp, is to use another character as the escape char, and then translate it using a macro or function.
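One way the "another escape character" idea might look, as a rough sketch only. The helper name `my-tilde-regexp` and the choice of `~` are assumptions made for illustration; nothing like this is built into Emacs:

```elisp
;; Hypothetical helper: treat ~ as the regexp escape character and
;; translate it to a backslash at run time.  A literal ~ cannot be
;; expressed in this simple scheme.
(defun my-tilde-regexp (string)
  "Return a copy of STRING with every ~ replaced by a backslash."
  (subst-char-in-string ?~ ?\\ string))

;; The result is the same string the doubled-backslash literal produces.
(my-tilde-regexp "~(this~|that~)")            ; => "\\(this\\|that\\)"

(string-match (my-tilde-regexp "~(this~|that~)") "say that")   ; => 4
```

The trade-off is that the translation happens on every call unless the result is cached or the helper is turned into a macro that expands at compile time.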
David Kastrup <d...@gnu.org> writes:
> "xah...@gmail.com" <xah...@gmail.com> writes:
>> Second question: can't elisp create something like a regex-string wrapper
>> function that automatically takes care of the quoting? I can't see how
>> this might be difficult?
> Do you mean something like regexp-quote?
> regexp-quote is a built-in function in `C source code'.
> (regexp-quote STRING)
> Return a regexp string which matches exactly STRING and nothing else.
Actually, no, that isn't at all the kind of thing we're talking about.
The problem is not how to literally match on "\(this\|that\)", the problem is how to both enter and then convert a string like "\(this\|that\)", into "\\(this\\|that\\)", so that it'll match "this" or "that".
Alternately, you might go all the way and try to find a way to implement regexps in a more "modern" style (if you'll excuse the expression), where "(this|that)" would match "this" or "that".
Joseph Brenner <d...@kzsu.stanford.edu> writes:
> David Kastrup <d...@gnu.org> writes:
>> "xah...@gmail.com" <xah...@gmail.com> writes:
>>> Second question: can't elisp create something like a “regex-string” wrapper
>>> function that automatically takes care of the quoting? I can't see how
>>> this might be difficult?
>> Do you mean something like regexp-quote?
>> regexp-quote is a built-in function in `C source code'.
>> (regexp-quote STRING)
>> Return a regexp string which matches exactly STRING and nothing else.
> Actually, no, that isn't at all the kind of thing we're talking about.
> The problem is not how to literally match on "\(this\|that\)", the
> problem is how to both enter and then convert a string like
> "\(this\|that\)", into "\\(this\\|that\\)", so that it'll match
> "this" or "that".
r u, like, trying 2 b a dumbass, or u trying 2 b divisive?
i can play a round of game with u or 2.
In this thread, so far there r 2 morons. One is Pascal J Bourguignon (congradulations Pascal!). The other is you. Congrat.
* * *
sometimes i wonder why there are these fucking morons slaving in newsgroups. Are they like, having nothing to do? Yes, i think that is the reason. I too, have nothing to do. But at least i think i have some redeeming qualities. While, on the other hand, these massive number of fuckheads that slave in newsgroups, although perhaps suffering from loneliness to a lesser degree than me, but, their knowledge, moral quality, philosophical outlook, depth of humour, is at a level equivalent to skateboard toting teens. Thus, the behavior, actions, of these morons at boredom, is rather insufferable.
* * *
there are, of course, quite a few common lisp morons here. After interacting with them a while, u sometimes realize they are no different than unix morons they despise. They just wear different pants. These morons at least really believed their dickless opinions and mushy drivels when they tech geek in computing forums online. However, folks, look at this case here today. What is David doing?? Does he actually really not understood the subject of this thread? Could he have really misread? Surely he's a emacs lisp developers for years, who are, capable of grasping lisp subject matters, i think. So, is he, intentionally trying to be a fuckhead?? Like, he's having a bad day maybe? The third possibility, is that he is just one of the billions of walking joe, shits all over as they carry about thru life without care or capable of control. With this interpretation, then David probably didn't pay attention to this thread, and simply just injected his say into a thread, and that it. Its, like, who gives a shit?
Like, one plus one is three a-ok, yes no a-ok. LOLZ and wtf? who gives a shit?!? u can do it with regex-opt and regexp-quote.
In my online forum posting career, since about 1991 with CompuServe and AppleLink days, especially since about 1998 when i started to get into contact with the unix industry programers, i noticed there is this type of morons, this class. It originally began with the unix morons, perl morons, as i classified. But no, it's not just they. When i got in contact with Mac community, where u can smell fashion and elegance in their air, there are quite a lot of morons too, among them known as Mac fanatics. The typical Mac fanatics are in general dumber by the numbers, although their demeanor is holier-than-thou with their more-expensive-than-thou hardware. But then in Python community, which i made first contact in 2005, moron masses i witnessed in them too of the type. In lisp community, morons of this type you see too... (though, i have to say, oddly you dont see much of these type of morons in Java communities. Umm. Perhaps because java programers tends to be more suites) i think i dunno what i'm trying to say here am a bit dizzy atm n cant be bothered to put out thoughts in a focused way or make it to my writings. But distinctly these morons impressed me, and i wanted to make sure that it is these particular class of morons i'm currently trying to write out. Ok, back to topic... these morons. morons... Ok, i think originally what i was trying to say, was that i realized...
kk, i've been wondering, why are these so many this type morons? My final thought and solution. My final answer to this wonderment, is the last interpretation of the paragraph about David's case. The essence, that characterizes or qualifies this moron class's behavior, is just because that's what they are. They shit thru life. They careless. Happy-go-lucky. Most human animals are like this, actually. Me, myself, with intense introspection and capabilities, austerity and asceticism, magnanimity and grandiloquence of self love, realized that i'm not like most people in being a lone genius. Not that i'm a wunderkind or the greatest, but my kind is sparse.
stupid people offend me. people with low IQ offend me. Sloppiness and slouchers offend me. Low aspiration, low lifers, witless shits offends me. Well they dont rly offend me, just that i find little interest in them and look down on them. O, Jail me! O, my passion for the world and human animals. (n pussy) O. It began to OFFEND me, when these morons begin to be hateful. Bingo, that's most of these tech geeking morons who slaves in programing newsgroups, are. Dickless cocks.
What is it they want in life? well you get married and have kids and die. A inevitibility. (there's glimmer of immortability with biology on the horizon, however) But what they want? What are they thinking? What is it, they care? Sure, we dont want pain, we dont want hunger. We want to have money, and all. But the tech geeking morons, what r they getting out of their existance? What do for example David in this thread, who inject his irrelevance with irreverence in the middle, want to achieve?
Is it the chatting? The socialization? The simple fact of smashing a wine glass and have another human ilk respond, underwritten by empathy and pleasure. Like, rubbing elbows, schmooze, have a laugh. Having a good time. I, being what i am, dont see the humour. There is no knowledge; there is no art. But they can go about their tech drivel and i go about weighty fantasies in math or human outcome. But these morons impinge me, when they trust forward their male nature for a pissing fight with ignorance and hatefulness.
> Joseph Brenner <d...@kzsu.stanford.edu> writes:
> > David Kastrup <d...@gnu.org> writes:
> >> "xah...@gmail.com" <xah...@gmail.com> writes:
> >>> Second question: can't elisp create something like a “regex-string” wrapper
> >>> function that automatically takes care of the quoting? I can't see how
> >>> this might be difficult?
> >> Do you mean something like regexp-quote?
> >> regexp-quote is a built-in function in `C source code'.
> >> (regexp-quote STRING)
> >> Return a regexp string which matches exactly STRING and nothing else.
> > Actually, no, that isn't at all the kind of thing we're talking about.
> > The problem is not how to literally match on "\(this\|that\)", the
> > problem is how to both enter and then convert a string like
> > "\(this\|that\)", into "\\(this\\|that\\)", so that it'll match
> > "this" or "that".
"xah...@gmail.com" <xah...@gmail.com> writes:
> Hi David.
> r u, like, trying 2 b a dumbass, or u trying 2 b divisive?
[ ... ]
> sometimes i wonder why there are these fucking morons slaving in > newsgroups. Are they like, having nothing to do? Yes, i think that is > the reason. I too, have nothing to do. But at least i think i have > some redeeming qualities.
I thought your redeeming qualities were good enough, but this post confirms that I should have known better.
« My question is, why is elisp's regex has this 2-steps process? Is this some design decision or just happened that way historically? »
Barry Margolin wrote:
> Just history. They adopted string notation from Common Lisp, which uses
> backslash as the escape. And they adopted the standard Unix regex
> notation, which also uses backslash as the escape.
Xah wrote:
«Second question: can't elisp create some like “regex-string” wrapper function that automatically takes care of the quoting? I can't see how this migth be difficult?»
Barry wrote:
> It can't be done using a function, because literal parsing is done at
> read time. It would have to be done using a new string syntax. Perhaps
> something like:
On 12 Jul, 00:45, "xah...@gmail.com" <xah...@gmail.com> wrote:
> r u, like, trying 2 b a dumbass, or u trying 2 b divisive?
(...)
Your nasty humor is the most hilarious thing I have read in a while, really. Yes, you guessed, I'm a moron too, a Java one, by the way. I think you'd be delighted with our moronic sense of humor :D
Keep kool,
--
José A. Romero L.
joseito (at) poczta (dot) onet (dot) pl
"We who cut mere stones must always be envisioning cathedrals."
(Quarry worker's creed)
jose...@poczta.onet.pl writes:
> On 12 Jul, 00:45, "xah...@gmail.com" <xah...@gmail.com> wrote:
>> r u, like, trying 2 b a dumbass, or u trying 2 b divisive?
> (...)
> Your nasty humor is the most hilarious thing I have read in a while,
> really.
I suppose you mean "humor" as in
3. State of mind, whether habitual or temporary (as formerly supposed to depend on the character or combination of the fluids of the body); disposition; temper; mood; as, good humor; ill humor. [1913 Webster]
Examine how your humor is inclined, And which the ruling passion of your mind. --Roscommon. [1913 Webster]
A prince of a pleasant humor. --Bacon. [1913 Webster]
I like not the humor of lying. --Shak. [1913 Webster]