Document Number: J4/02-0158
August 2, 2002
Page 1 of 3
Subject:
Interpretation Request – Distinguishing separators from decimal separators
Author:
William M. Klein (wmklein@ix.netcom.com)
References:
1. ISO/IEC FDIS 1989:2002(E)
2. 02-0099, Interpretation Request – Distinguishing separators from editing symbols
QUESTION:
This is a follow-up on my earlier interpretation request concerning distinguishing separators
from editing symbols.
Consider a program with the following source code (embedded within an otherwise conforming
source program)
My question concerns whether the comma in “99, ” is part of the same text-word as the “99” or
not. Several rules seem to relate to this, but none of them (to me) adequately answers the
question.
The section “7.1 Text manipulation” in the FDIS seems to provide some guidance, but no
solution. First it says,
“The following elements and the separators required to distinguish them shall be
syntactically correct in the initial source text and library text:
— COPY statements
— compiler directives
— alphanumeric, boolean, and national literals
— fixed and floating indicators
— constant entries specifying a FROM phrase”
Then it goes on to say,
“Other indicators, language elements, and separators need not be syntactically correct
until the completion of the text manipulation stage.”
The most significant (to me) point in these two passages is that the list of literals that must
be syntactically correct includes alphanumeric, boolean, and national ones, but does not include
numeric ones.
The next relevant rule (or potentially relevant rule) in the FDIS is in “8.3.1.2.2.1 Fixed-point
numeric literals” which says,
“3) ... The decimal point is treated as an assumed decimal point, and may appear
anywhere within the literal except as the rightmost character.”
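As an illustration only, the quoted restriction can be sketched as a small check. Only the
"except as the rightmost character" condition comes from the rule quoted above; the optional
leading sign, the requirement of at least one digit, and the limit of one decimal point are
assumptions about the rest of the rule, and the function name and its decimal_point parameter
are hypothetical. The parameter lets the comma play the role of the decimal point, as it would
under a DECIMAL-POINT IS COMMA clause.

```python
import re


def is_fixed_point_literal(text: str, decimal_point: str = ".") -> bool:
    """Rough check of a fixed-point numeric literal (illustrative, not the FDIS rule set).

    Assumed shape: an optional leading sign, then either digits only, or
    optional digits, one decimal point, and at least one digit after it,
    so the decimal point can never be the rightmost character.
    """
    dp = re.escape(decimal_point)
    pattern = rf"[+-]?([0-9]+|[0-9]*{dp}[0-9]+)"
    return re.fullmatch(pattern, text) is not None


print(is_fixed_point_literal("99,", decimal_point=","))   # False: decimal point is rightmost
print(is_fixed_point_literal("99,5", decimal_point=","))  # True
print(is_fixed_point_literal("99"))                       # True
```

Under these assumptions a trailing comma or period never completes a valid literal.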
Now, it seems clear that after the text manipulation phase, “99,” would be a syntactically
incorrect numeric literal (and I think this means that the comma is a valid separator).
However, it is also clear that, because numeric literals are not included in the list of
elements that must be syntactically correct before the text manipulation phase, this is (as
the rules currently stand) “irrelevant”.
It is my guess that the comma followed by a space is treated as a separator during the text
manipulation phase rather than treating the “99,” as an (irrelevantly) syntactically incorrect
numeric literal and, therefore, a single text-word.
It seems to me that this “intent” (if it is the intent) could be clarified by one of two modifications
to the section, “7.1.1.4 Text-words”.
Option 1:
Change,
1) a separator, except for: a space; a pseudo-text delimiter; and the opening and closing
delimiters for alphanumeric, boolean, and national literals. In determining which
character sequences form text-words, the colon, the right parenthesis, and the left
parenthesis characters, in any context except within alphanumeric or national literals,
are treated as separators;
to:
1) a separator, except for: a space; a pseudo-text delimiter; and the opening and closing
delimiters for alphanumeric, boolean, and national literals. In determining which
character sequences form text-words, the colon, the right parenthesis, the left
parenthesis characters, and the consecutive two-character combinations “, ” and “. ”, in
any context except within alphanumeric or national literals, are treated as separators;
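The effect of the Option 1 wording can be sketched with a deliberately simplified splitter.
This is an illustration under stated assumptions, not the FDIS algorithm: it ignores
alphanumeric, boolean, and national literals, pseudo-text, comments, and COPY; it treats the
end of the input like a space; and it emits the comma or period of a two-character separator
as its own text-word. The function name, the flag, and the sample fragment "A 99, B" are all
hypothetical.

```python
from typing import List


def text_words(source: str, two_char_separators: bool) -> List[str]:
    """Split source into text-words under a simplified reading of rule 1."""
    words: List[str] = []
    current = ""

    def flush() -> None:
        # Close the text-word being accumulated, if there is one.
        nonlocal current
        if current:
            words.append(current)
            current = ""

    for i, ch in enumerate(source):
        # Assumption: the end of the input counts as a following space.
        followed_by_space = i + 1 == len(source) or source[i + 1] == " "
        if ch == " ":
            flush()                      # a space separates but is not a text-word
        elif ch in ":()":
            flush()
            words.append(ch)             # always a separator in the existing rule
        elif two_char_separators and ch in ",." and followed_by_space:
            flush()
            words.append(ch)             # the Option 1 addition: ", " and ". "
        else:
            current += ch
    flush()
    return words


print(text_words("A 99, B", two_char_separators=False))  # ['A', '99,', 'B']
print(text_words("A 99, B", two_char_separators=True))   # ['A', '99', ',', 'B']
print(text_words("A 99,5 B", two_char_separators=True))  # ['A', '99,5', 'B']
```

With the flag off, “99,” stays a single text-word; with it on, the comma is split off, while
a comma that is not followed by a space remains inside its text-word.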
Option 2:
Change,
“3) any other sequence of contiguous COBOL characters bounded by separators,
except for: comments and the word 'COPY'.”
to:
“3) any other sequence of contiguous COBOL characters bounded by separators,
except for: comments and the word 'COPY'. Except within alphanumeric or national
literals, all items in “8.3.2 Separators” are treated as if they were separators – even if
they could be interpreted as being in other types of literals or picture character-string,
as excluded from the normal identification of 8.3.2.”
A couple of comments on these two possible changes:
A) Both of these changes would handle (I think) many of the issues concerning potential
picture character-strings ending with periods or commas followed by a space (at least
when doing COPY and REPLACE processing).
B) If the second option were adopted, it would be possible (I believe) to remove the
parentheses (but not the colon) from the exception listed in the first rule of this section. It
would also seem to make the rules “more consistent”.
It is possible that others think that the original wording of “rule 3” already says what I propose
as the revised wording. However, if that were true, then I don’t understand why parentheses
need to be included in rule 1.
Another possible solution would be to add numeric literals to the list of items that must be
syntactically correct before text manipulation. However, it is my “sense” that the intent is to
require as little syntax checking as possible before text manipulation.