>(You posted to too many newsgroups, trimmed.)
>
>18TH MODEM DROPPAGE TODAY AT 15:15, SINCE FIRST LOGIN TODAY AT 10:49.
>(During one of my earlier messages I lost count by one, sorry.)
>MEAN TIME BETWEEN MODEM FAILURES <15 MINUTES.
>
>> From: "xah...@gmail.com"
>> In the past weeks I've been thinking over the practical limits of
>> regex matching power. For example, a regex often can't be used to
>> match anything of a nested nature, even the simplest nesting. It
>> can't be used to match any simple grammar expressed in BNF. Some
>> quite regular and simple languages, such as XML, or even URLs and
>> email addresses, are not specified as a regex. (There exist
>> regexes, pages long, that try to match email addresses, though.)
>
>A couple years ago I wrote a table-driven parser in Lisp that could
>deal with nested stuff to some degree. The result of the parse was
>a structure that matched the parse tree, so that further processing
>could be done on that result. My test case was Received lines in spam.
>The syntax was essentially the properly-structured version of
>something like a regex crossed with BNF.
>
>
>Looking there now ...
>
> For example, in computing, one would like to say that email address has
> the form xyz, where xyz is a perl regex. (e.g. "[A-z0-9]+@[A-z]+\.com")
>
>Per my system, that would be something more like
> (:ANYNUMBEROF1 (:CHOICE (:RANGE A Z) (:RANGE a z) (:RANGE 0 9)))
> (:EXACTLY "@")
> (:ANYNUMBEROF1 (:CHOICE (:RANGE A Z) (:RANGE a z)))
> (:EXACTLY ".com")
>I forget how I specified characters, such as in ranges, so I left
>that unspecified there. But I'm pretty sure I used strings when
>specifying exact text, so I put quote marks in there. No, I don't
>feel like looking up the details right now. My general idea above
>should be enough for your immediate interest.
>
>19TH MODEM DROPPAGE TODAY AT 15:34, SINCE FIRST LOGIN TODAY AT 10:49.
>
> It is quite desirable to have a general grammar language, designed in
> a human-readable way, and concise. With such a language, we could use
> it to verify if a text is a valid form. We could use it for human
> communication.
>
>I think my nested notation is sufficiently clear. Also, typically I
>break my pattern into multiple production rules, with a meaningful
>name for each. Thus the above might be instead:
>
> Address = (User Atsign Host)
> User = (:ANYNUMBEROF1 (:CHOICE (:RANGE A Z) (:RANGE a z) (:RANGE 0 9)))
> Atsign = (:EXACTLY "@")
> Host = (Hostname Dotcom)
> Hostname = (:ANYNUMBEROF1 (:CHOICE (:RANGE A Z) (:RANGE a z)))
> Dotcom = (:EXACTLY ".com")
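
To make the idea concrete, here is a minimal sketch in Python of how a
rule table like the one above could drive a parser that returns a tree
mirroring the rules. This is not the Lisp parser described in the post.
The tuple encoding, the operator names ("seq", "many1", and so on) and
the functions parse and matches are all assumptions made up for
illustration. It also assumes greedy repetition and first-match choice,
which the post does not confirm.

```python
# Hypothetical Python encoding of the six production rules quoted above.
# A bare string names another rule; a tuple is (operator, arguments...).
RULES = {
    "Address":  ("seq", "User", "Atsign", "Host"),
    "User":     ("many1", ("choice", ("range", "A", "Z"),
                                     ("range", "a", "z"),
                                     ("range", "0", "9"))),
    "Atsign":   ("exactly", "@"),
    "Host":     ("seq", "Hostname", "Dotcom"),
    "Hostname": ("many1", ("choice", ("range", "A", "Z"),
                                     ("range", "a", "z"))),
    "Dotcom":   ("exactly", ".com"),
}


def parse(pat, text, pos=0):
    """Return (tree, new_pos) on success, or None on failure."""
    if isinstance(pat, str):                 # reference to a named rule
        result = parse(RULES[pat], text, pos)
        if result is None:
            return None
        tree, end = result
        return (pat, tree), end              # label the subtree with the rule name
    op = pat[0]
    if op == "exactly":                      # literal text
        lit = pat[1]
        return (lit, pos + len(lit)) if text.startswith(lit, pos) else None
    if op == "range":                        # one character in an inclusive range
        lo, hi = pat[1], pat[2]
        if pos < len(text) and lo <= text[pos] <= hi:
            return text[pos], pos + 1
        return None
    if op == "choice":                       # assumption: first alternative that matches wins
        for alt in pat[1:]:
            result = parse(alt, text, pos)
            if result is not None:
                return result
        return None
    if op == "many1":                        # one or more, greedy, no backtracking
        items = []
        while True:
            result = parse(pat[1], text, pos)
            if result is None or result[1] == pos:
                break
            items.append(result[0])
            pos = result[1]
        return (items, pos) if items else None
    if op == "seq":                          # every part in order
        items = []
        for part in pat[1:]:
            result = parse(part, text, pos)
            if result is None:
                return None
            items.append(result[0])
            pos = result[1]
        return items, pos
    raise ValueError(f"unknown operator: {op!r}")


def matches(rule, text):
    """True if the named rule consumes the whole text."""
    result = parse(rule, text)
    return result is not None and result[1] == len(text)
```

With this sketch, matches("Address", "Bob42@example.com") is true and
matches("Address", "bob@example.org") is false. Because the letter
ranges are spelled out separately, "a_b@example.com" is also rejected,
whereas the regex class [A-z] would accept the underscore, since it
covers the ASCII punctuation that sits between "Z" and "a".
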
>
>20TH MODEM DROPPAGE TODAY AT 15:39, SINCE FIRST LOGIN TODAY AT 10:49.
>
>
>Looking at that now ...
>
> Unlike CFGs, PEGs are not ambiguous; if a string parses, it has
> exactly one valid parse tree.
PEGs can have arbitrary-length ambiguous prefixes, so their
complexity can approach that of a CFG. The reason for staying within
LR(1) (or the overlapping LL(k)) whenever possible is to keep
ambiguity to reasonable levels - even people start to have problems
with more complex languages.
>When using BNF or other formal specification of a grammar (syntax),
>you have to be careful not to present overlapping rules, or you
>need a precedence that one rule applies before another if both
>match. Is the claim that with PEGs there is no such problem in the
>first place??
>
> + Ordered choice: e[1] / e[2]
>
>What if both e[1] and e[2] match exactly the same sub-string?
>That would seem to me to create an ambiguity as to how to parse
>that sub-string. How does a PEG resolve this ambiguity??
>
> The choice operator e[1] / e[2] first invokes e[1], and if e[1]
> succeeds, returns its result immediately. Otherwise, if e[1] fails,
> then the choice operator backtracks to the original input position at
> which it invoked e[1], but then calls e[2] instead, returning e[2]'s
> result.
>
>OK, there's the answer, linear precedence. Basically they've
>re-invented BNF with the caveat that the production rules are in
>linear sequence and earlier rules override later rules, using the
>choice operator to syntactically condense multiple sequential
>production rules into a single mega-rule with multiple sequential
>clauses. (Actually BNF already had that condensation IIRC, it just
>didn't have the linear precedence on clauses.)
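
The "first success wins" behaviour of ordered choice described above can
be sketched with a few parser combinators in Python. The helper names
(literal, choice, seq, tag) are invented for this illustration and do
not come from any PEG library. Each parser takes (text, pos) and returns
(value, new_pos) or None.

```python
def literal(s):
    """Parser that matches the exact string s."""
    def p(text, pos):
        return (s, pos + len(s)) if text.startswith(s, pos) else None
    return p


def choice(*alts):
    """PEG ordered choice e1 / e2 / ...: try each alternative at the same
    starting position and commit to the first one that succeeds."""
    def p(text, pos):
        for alt in alts:
            result = alt(text, pos)
            if result is not None:
                return result        # later alternatives are never consulted
        return None
    return p


def seq(*parts):
    """Match every part in order; fail if any part fails."""
    def p(text, pos):
        values = []
        for part in parts:
            result = part(text, pos)
            if result is None:
                return None
            values.append(result[0])
            pos = result[1]
        return values, pos
    return p


def tag(name, parser):
    """Label a parser's result so we can see which alternative matched."""
    def p(text, pos):
        result = parser(text, pos)
        return None if result is None else ((name, result[0]), result[1])
    return p


# Both alternatives match the same substring: the first one is reported.
both = choice(tag("e1", literal("ab")), tag("e2", literal("ab")))

# Order matters: once "a" succeeds, the choice never retries with "ab".
shadowed = seq(choice(literal("a"), literal("ab")), literal("c"))
reordered = seq(choice(literal("ab"), literal("a")), literal("c"))
```

Here both("ab", 0) yields the e1 result, so there is only ever one parse
tree. The price is that shadowed("abc", 0) fails outright, because the
choice commits to "a" and the following "c" then meets "b", while
reordered("abc", 0) succeeds. So the rule order is part of the grammar's
meaning, not just a tie-breaker.
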
>
>I'd need to look up whether my own structured parse-grammar used
>precedence too. Maybe some other day if anybody asks.
>
>21ST MODEM DROPPAGE TODAY AT 16:01, SINCE FIRST LOGIN TODAY AT 10:49.
>
>22ND MODEM DROPPAGE TODAY AT 16:03.
--
for email reply remove "/" from address