RfD: Escaped Strings


Author: Peter Knaggs
Date: Jul 11, 2007 10:23

21 August 2006, Stephen Pelc

20060822 Updated solution section.
20060821 First draft.

Rationale
=========

Problem
-------
The word S" 6.1.2165 is the primary word for generating strings.
In more complex applications, it suffers from several deficiencies:
1) the S" string can only contain printable characters,
2) the S" string cannot contain the '"' character,
3) the S" string cannot be used with wide characters as discussed
in the Forth 200x internationalisation and XCHAR proposals.

Current practice
----------------
At least SwiftForth, gForth and VFX Forth support S\" with very similar
operations. S\" behaves like S", but uses the '\' character as an escape
character for the entry of characters that cannot be used with S".

This technique is widespread in languages other than Forth.
Re: RfD: Escaped Strings         


Author: Alex McDonald
Date: Jul 12, 2007 12:16

Peter Knaggs wrote:
> 21 August 2006, Stephen Pelc
>
> 20060822 Updated solution section.
> 20060821 First draft.
>
> Rationale
> =========
>
> Problem
> -------
> The word S" 6.1.2165 is the primary word for generating strings.
> In more complex applications, it suffers from several deficiencies:
> 1) the S" string can only contain printable characters,
> 2) the S" string cannot contain the '"' character,
> 3) the S" string cannot be used with wide characters as discussed
> in the Forth 200x internationalisation and XCHAR proposals.
>
> Current practice
> ---------------- ...
Re: RfD: Escaped Strings         


Author: Peter Knaggs
Date: Jul 13, 2007 02:08

Alex McDonald wrote:
>
> How would the following
>
> s\" \"
>
> be handled? Win32Forth treats incomplete strings
>
> s" incomplete
>
> as being correctly terminated at the cf/lf boundary.

The current definition of s" does not define what happens in this
circumstance. Consequently this proposal does not define this
condition either. Your solution would be just as valid for s\" as s".

I find it moderately interesting that the rather standard \<newline> is
not included. Traditionally this means ignore the line break.
>> \[0-7]+ Octal numerical character value, finishes at the first
>> non-octal character
>> \x[0-9a-f]+ Hex numerical character value, finishes at the first
>> non-hex character...
Re: RfD: Escaped Strings         


Author: Alex McDonald
Date: Jul 13, 2007 02:41

On Jul 13, 10:08 am, Peter Knaggs wrote:
> Alex McDonald wrote:
>
>> How would the following
>
>> s\" \"
>
>> be handled? Win32Forth treats incomplete strings
>
>> s" incomplete
>
>> as being correctly terminated at the cf/lf boundary.
>
> The current definition of s" does not define what happens in this
> circumstance. Consequently this proposal does not define this
> condition either. Your solution would be just as valid for s\" as s".
>
> I find it moderately interesting that the rather standard \<newline> is
> not included. Traditionally this means ignore the line break.
Re: RfD: Escaped Strings         


Author: Stephen Pelc
Date: Jul 13, 2007 02:49

On Thu, 12 Jul 2007 20:16:51 +0100, Alex McDonald wrote:
>How would the following
>
> s\" \"
>
>be handled? Win32Forth treats incomplete strings
>
> s" incomplete

It's a badly formed string, and so ambiguous. I've added this to the
ambiguous conditions list.
>I'm confused by the previous, and how to terminate an octal or hex
>string. Is \x12AB the equivalent of pchars 12 'A' and 'B', or is it 0x12AB?

This was part of the discussion, so we define \xABcdef as generating
the primitive character AB and cdef is then parsed.
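A minimal sketch of that rule, assuming a system whose S\" takes exactly two hex digits after \x, as described above. HEX-DEMO is a hypothetical name used only for illustration.

```forth
\ Sketch only: assumes S\" is available and that \x consumes exactly
\ two hex digits. HEX-DEMO is a hypothetical name.
: hex-demo ( -- )
  s\" \x41Bcdef"   \ \x41 gives one character (hex 41, 'A'); Bcdef is parsed as ordinary text
  dup . cr          \ display the string length
  type cr ;         \ display the string itself
hex-demo
```

Under the two-digit rule the string should be six characters long (the character 41 hex followed by Bcdef), rather than one wide character 41B followed by cdef.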

The octal notation is not specified in the normative part of the
proposal.

Stephen
Re: RfD: Escaped Strings         


Author: Anton Ertl
Date: Jul 13, 2007 05:08

Alex McDonald writes:
>How would the following
>
> s\" \"
>
>be handled? Win32Forth treats incomplete strings
>
> s" incomplete
>
>as being correctly terminated at the cf/lf boundary.

That's what the standard prescribes in Section 3.4.1:

|[If no delimiter character is present], the string continues up to
|and including the last character in the parse area, and the number in
|>IN is changed to the length of the input buffer, thus emptying the
|parse area.

Since the proposal uses the usual "parse ... delimited by ..." idiom,
I expect that it works the same way, modulo not interpreting the " in
\" as a delimiter. Maybe this could be made clearer in the proposal.
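The Section 3.4.1 behaviour quoted above can be observed with PARSE directly. This is only a sketch: SHOW-REST is a hypothetical helper, not part of the proposal, and it assumes the system provides PARSE from the CORE EXT word set.

```forth
\ SHOW-REST is a hypothetical helper: it parses text delimited by "
\ and displays it.
: show-rest ( "ccc<quote>" -- )
  [char] " parse   \ c-addr u: text up to the next ", or to the end of the parse area if there is none
  type ;

show-rest terminated" cr
show-rest incomplete
```

In the second use there is no closing quote, so PARSE should return everything up to the end of the line and leave the parse area empty. S" built on the same idiom would treat an unterminated string the same way.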
Re: RfD: Escaped Strings         


Author: Anton Ertl
Date: Jul 13, 2007 05:15

Alex McDonald writes:
>On Jul 13, 10:08 am, Peter Knaggs wrote:
>> I find it moderately interesting that the rather standard \<newline> is
>> not included. Traditionally this means ignore the line break.
>
>That would be a useful enhancement;

No existing practice in Forth.
> but perhaps \c might be clearer,
>as it differentiates between a silent space as in \ and \
> and permits comments.
>
>s\" abcdefg\c \ continue on a new line
> hijklmn" \ blank strip leading & catenate for
>abcdefghijklmn

In C one can construct a longer literal string by writing two adjacent
literal strings, separated only by white space and comments. E.g.:
Re: RfD: Escaped Strings         


Author: Anton Ertl
Date: Jul 13, 2007 05:24

Peter Knaggs writes:
>21 August 2006, Stephen Pelc

Pretty good. There's always room for improvement:

- Test cases should be added before the CfV.

- I guess that you want \xAB to represent a (primitive) character.
This does not come out clearly (actually, if there was no mention of
XCHARS and definition of "primitive characters" in the informative
sections, this would be clearer).

- It seems that the detailed description of an existing solution in
the "Solution" section is confusing, because it is very similar to the
proposal, but still different. Better leave it away and just mention
the issues (like fixed-length vs. variable-length \x) in a discussion
section.

- anton
Re: RfD: Escaped Strings         


Author: Stephen Pelc
Date: Jul 13, 2007 07:05

On Fri, 13 Jul 2007 12:24:39 GMT, anton@mips.complang.tuwien.ac.at
(Anton Ertl) wrote:
>- Test cases should be added before the CfV.

Volunteer? You? The gForth test suite?
>- I guess that you want \xAB to represent a (primitive) character.
>This does not come out clearly (actually, if there was no mention of
>XCHARS and definition of "primitive characters" in the informative
>sections, this would be clearer).

Given the problems with the definition of char throughout the
document, the definition of char in terms of primitive characters
*has* to be done in a different section of the document.

For example, if char=16 bits on a byte-addressed machine, there
is no way for a standard program to write a byte to a file!
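As a rough illustration, a program can at least ask the system how large a character is. This is a sketch using only the standard word CHARS; the value displayed is system dependent.

```forth
\ Display the size of one character in address units on this system.
1 chars . cr
```

On a byte-addressed machine where char=byte this should display 1. A larger value would indicate the wide-char situation described above.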

If you use a variable width character set such as UTF-8, what does
CMOVE mean?

The only practical solutions I see are
a) define char=byte
b) define char=implementation defined unit
Re: RfD: Escaped Strings         


Author: Stephen Pelc
Date: Jul 13, 2007 07:07

On Wed, 11 Jul 2007 18:23:53 +0100, Peter Knaggs wrote:
>21 August 2006, Stephen Pelc

Here's the latest version

Stephen

RfD - S\" and quoted strings with escapes
21 August 2006, Stephen Pelc

20070712 Redrafted non-normative portions.
20060822 Updated solution section.
20060821 First draft.

Rationale
=========