Document Number:     J4/02-0158
August 2, 2002 
 
Subject: Interpretation Request – Distinguishing separators from decimal separators
Author: William M. Klein (wmklein@ix.netcom.com)
References:
1. ISO/IEC FDIS 1989:2002(E)
2. 02-0099, Interpretation Request – Distinguishing separators from editing symbols
   http://www.cobolstandard.info/j4020099.htm
QUESTION:
This is a follow-up to my earlier interpretation request concerning distinguishing separators
from editing symbols.
Consider a program with the following source code (embedded within an otherwise conforming
source program)
My question is whether the comma in “99, ” is part of the same text-word as the “99”.
Several rules seem to bear on this, but none (to me) adequately answers the question.
The section “7.1 Text manipulation” in the FDIS seems to provide some guidance, but no
solution.  First it says,
“The following elements and the separators required to distinguish them shall be
syntactically correct in the initial source text and library text:
— COPY statements
— compiler directives
— alphanumeric, boolean, and national literals
— fixed and floating indicators
— constant entries specifying a FROM phrase”
Then it goes on to say,
“Other indicators, language elements, and separators need not be syntactically correct
until the completion of the text manipulation stage.”
Two points stand out (to me): the list of literals that must be syntactically correct
includes alphanumeric, boolean, and national literals, but it does not include numeric
literals.
The next relevant rule (or potentially relevant rule) in the FDIS is in “8.3.1.2.2.1 Fixed-point
numeric literals” which says,
“3) ... The decimal point is treated as an assumed decimal point, and may appear
anywhere within the literal except as the rightmost character.”
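
As an illustration only (this is not FDIS text), the restriction quoted above can be sketched in Python. The function name `is_fixed_point_literal` and its `decimal_point` parameter are hypothetical. The sketch assumes an optional leading sign and plain digits, and it ignores other constraints such as the maximum number of digits.

```python
import re


def is_fixed_point_literal(text: str, decimal_point: str = ".") -> bool:
    """Rough check of the quoted rule for fixed-point numeric literals.

    Assumptions (not taken from the FDIS text quoted above): an optional
    leading sign, digits 0-9 only, and at most one decimal point.
    The quoted rule itself is modeled by refusing a decimal point in the
    rightmost position.
    """
    dp = re.escape(decimal_point)
    # Either digits with an optional "decimal point + digits" tail,
    # or a leading decimal point followed by digits.
    pattern = rf"[+-]?(\d+({dp}\d+)?|{dp}\d+)"
    return re.fullmatch(pattern, text) is not None


# With the comma acting as the decimal point:
print(is_fixed_point_literal("99,5", decimal_point=","))  # decimal point inside
print(is_fixed_point_literal("99,", decimal_point=","))   # decimal point rightmost
```

Under these assumptions the first call accepts the candidate and the second rejects it, which is the situation the rest of this request is concerned with.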
Now, it seems clear that after the text manipulation phase, “99,” would be a syntactically
incorrect numeric literal (and I think this means that the comma is a valid separator).
However, it is also clear that, because numeric literals are not included in the list of
elements that must be syntactically correct before the text manipulation phase, this is
“irrelevant” as the rules currently stand.
It is my guess that the comma followed by a space is treated as a separator during the text
manipulation phase, rather than the “99,” being treated as an (irrelevantly) syntactically
incorrect numeric literal and, therefore, a single text-word.
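
The two readings can be compared with a small Python sketch. It is a deliberate simplification, not an implementation of 7.1.1.4: the function `split_text_words`, its flag, and the sample fragment are hypothetical. It ignores literals, pseudo-text, parentheses, and colons, and it assumes that the end of the text counts as being followed by a space.

```python
def split_text_words(source: str, comma_space_is_separator: bool) -> list[str]:
    """Split a source fragment into text-words under one of two readings.

    comma_space_is_separator=True:  a comma or period immediately followed
        by a space is a separator, and so a text-word of its own.
    comma_space_is_separator=False: only spaces bound text-words, so a
        trailing comma stays attached to the preceding characters.
    """
    words: list[str] = []
    current = ""
    for i, ch in enumerate(source):
        # Assumption: the end of the fragment behaves like a following space.
        nxt = source[i + 1] if i + 1 < len(source) else " "
        if ch == " ":
            if current:
                words.append(current)
                current = ""
        elif comma_space_is_separator and ch in ",." and nxt == " ":
            if current:
                words.append(current)
                current = ""
            words.append(ch)  # the separator is itself a text-word
        else:
            current += ch
    if current:
        words.append(current)
    return words


fragment = "ADD 99, 1 TO X"  # hypothetical fragment, not from this request
print(split_text_words(fragment, True))   # ['ADD', '99', ',', '1', 'TO', 'X']
print(split_text_words(fragment, False))  # ['ADD', '99,', '1', 'TO', 'X']
```

The difference matters for COPY and REPLACE processing, because matching operates on text-words: under the first reading a replacement targeting “99” would match, while under the second it would not.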
It seems to me that this “intent” (if it is the intent) could be clarified by one of two modifications
to the section, “7.1.1.4 Text-words”.
Option 1:
Change,
1) a separator, except for: a space; a pseudo-text delimiter; and the opening and closing
delimiters for alphanumeric, boolean, and national literals. In determining which
character sequences form text-words, the colon, the right parenthesis, and the left
parenthesis characters, in any context except within alphanumeric or national literals,
are treated as separators;
to:
1) a separator, except for: a space; a pseudo-text delimiter; and the opening and closing
delimiters for alphanumeric, boolean, and national literals. In determining which
character sequences form text-words, the colon, the right parenthesis, the left
parenthesis characters, and the consecutive two-character combinations “, ” and “. ”,
in any context except within alphanumeric or national literals, are treated as separators;
Option 2:
Change,
“3) any other sequence of contiguous COBOL characters bounded by separators,
except for: comments and the word 'COPY'.”
to:
“3) any other sequence of contiguous COBOL characters bounded by separators,
except for: comments and the word 'COPY'. Except within alphanumeric or national
literals, all items in “8.3.2 Separators” are treated as if they were separators – even if
they could be interpreted as being in other types of literals or picture character-string,
as excluded from the normal identification of 8.3.2.”
A couple of comments on these two possible changes:
A) Both of these changes would handle (I think) many of the issues concerning potential
picture character-strings ending with periods or commas followed by a space (at least
when doing COPY and REPLACE processing).
B) If the second option were adopted, it would be possible (I believe) to remove the
parentheses (but not the colon) from the exception listed in the first rule of this section.  It
would also seem to make the rules “more consistent”.
It is possible that others think that the original wording of “rule 3” already says what I propose
as the revised wording.  However, if that were true, then I don’t understand why parentheses
need to be included in rule 1.
Another possible solution would be to add numeric literals to the list of items that must be
syntactically correct before text manipulation.  However, it is my “sense” that the intent is to
require as little syntax checking as possible before text manipulation.