why emacs lisp's regex has 2-steps escapes?


Author: Xah
Date: Jul 9, 2008 03:30

Emacs regex has an odd peculiarity in that it needs a lot of backslashes.
More specifically, a string first needs to be properly escaped, and then
it is passed to the regex engine.

For example, suppose you have this text “Sin[x] + Sin[y]” and you need
to capture the x or y.

In Emacs I need to use
“\\(\\[[a-z]\\]\\)”
for the actual regex
“\(\[[a-z]\]\)”.
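
A minimal sketch of the two steps, assuming a stock Emacs: the Lisp reader first turns each doubled backslash in the string literal into a single backslash, and only then does the regex engine see the result. The variable name `my-regex` is made up for illustration.

```elisp
;; Step 1: the Lisp reader processes the string literal.
;; Each "\\" typed in the source becomes one backslash in the string.
(defvar my-regex "\\(\\[[a-z]\\]\\)")   ; hypothetical name

;; The string the regex engine receives is \(\[[a-z]\]\)
;; which is 13 characters long, not 18.
(length my-regex)
;; → 13

;; Step 2: the regex engine interprets that 13-character string.
(let ((text "Sin[x] + Sin[y]"))
  (when (string-match my-regex text)
    (match-string 1 text)))             ; group 1 includes the brackets
;; → "[x]"
```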

Here's a somewhat typical but long regex for matching an HTML image tag:

(search-forward-regexp " \" +width=\"\\([0-9]+\\)\" +height=\"\\([0-9]+\\)\" ?>" nil t)

The toothpick syndrome gets crazy, making the already difficult regex
syntax impossible to read and hard to code.

My question is, why does elisp's regex have this 2-step process? Is this
some design decision, or did it just happen that way historically?

Second question: can't elisp provide something like a “regex-string”
wrapper function that automatically takes care of the quoting? I can't
see how this might be difficult.
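
For illustration, Emacs does ship two helpers that go in this direction: the `rx` macro, which builds a regex string from a Lisp form, and `regexp-quote`, which escapes a literal string so it matches itself. Neither removes the reader's escaping step; they only avoid typing the backslashes by hand. A rough sketch, with variable names that are made up for the example:

```elisp
;; rx builds the regex string from a Lisp form, so no backslashes
;; have to be typed by hand.
(defvar my-bracket-rx                   ; hypothetical name
  (rx (group "[" (any "a-z") "]")))

;; regexp-quote escapes a literal string so that it matches itself.
(defvar my-literal-rx                   ; hypothetical name
  (regexp-quote "Sin[x]"))

(let ((text "Sin[x] + Sin[y]"))
  (list (and (string-match my-bracket-rx text) (match-string 1 text))
        (and (string-match my-literal-rx text) (match-string 0 text))))
;; → ("[x]" "Sin[x]")
```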
Re: why emacs lisp's regex has 2-steps escapes?         


Author: Kevin Rodgers
Date: Jul 10, 2008 01:17

Xah wrote:
> Emacs regex has an odd peculiarity in that it needs a lot of backslashes.
> More specifically, a string first needs to be properly escaped, and then
> it is passed to the regex engine.
>
> For example, suppose you have this text “Sin[x] + Sin[y]” and you need
> to capture the x or y.
>
> In Emacs I need to use
> “\\(\\[[a-z]\\]\\)”

If all you want to capture is the x or y (without the square brackets):

"\\[\\([a-z]\\)\\]"
> for the actual regex
> “\(\[[a-z]\]\)”.

The enclosing double quotes are misleading in this context. I would
simply write (again, capturing the letter but not the brackets):

\[\([a-z]\)\]
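
As a rough sketch of how that regex could be used from Lisp to pull out every bracketed letter (the function name `my-bracket-letters` is hypothetical, and the regex is written with doubled backslashes because it sits inside a string literal):

```elisp
(defun my-bracket-letters (text)        ; hypothetical helper name
  "Return a list of the single letters that appear in brackets in TEXT."
  (let ((start 0)
        (letters '()))
    ;; The regex \[\([a-z]\)\] captures only the letter in group 1.
    (while (string-match "\\[\\([a-z]\\)\\]" text start)
      (push (match-string 1 text) letters)
      ;; Continue searching just past the closing bracket.
      (setq start (match-end 0)))
    (nreverse letters)))

(my-bracket-letters "Sin[x] + Sin[y]")
;; → ("x" "y")
```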
Re: why emacs lisp's regex has 2-steps escapes?         


Author: Alan Mackenzie
Date: Jul 10, 2008 02:39

On Wed, Jul 09, 2008 at 03:30:27AM -0700, Xah wrote:
> Emacs regex has an odd peculiarity in that it needs a lot of backslashes.
> More specifically, a string first needs to be properly escaped, and then
> it is passed to the regex engine.

Yes. The greatest number of consecutive backslashes I've seen (in a
non-joke context) is 10.
> For example, suppose you have this text “Sin[x] + Sin[y]” and you need
> to capture the x or y.

Ironically, Xah, you are doing the same sort of thing in your post,
using crazy quote characters (if that is indeed what they are), 0x5397c
and 0x5397d (according to C-u C-x =). Over my SSH link to my SSP, your
quotes look something like "â~@~]", and are most difficult to read
without a pair of sunspecs which filters out the UTF.

Could you, perhaps, use the standard ASCII quotes 0x22 and 0x27 here,
please?
> In Emacs I need to use
> “\\(\\[[a-z]\\]\\)”
> for the actual regex
> “\(\[[a-z]\]\)”.